## 1. Introduction

Two distinct approaches have been used to study the “coarse grain” structure of atmospheric low-frequency variability (10 < *T* < 100 day): the episodic or intermittent and the oscillatory or periodic. Ghil and Robertson (2002) have reviewed studies of the Northern Hemisphere in these terms. The intermittency approach describes geographically fixed multiple-flow (or weather) regimes, their persistence and recurrence, and the Markov chain of transitions between them. The periodicity approach studies intraseasonal oscillations and their predictability. Plaut and Vautard (1994) described low-frequency variability in the Northern Hemisphere midlatitudes in terms of oscillatory phenomena as well as in terms of flow regimes defined by the most frequently occurring patterns. Their results reveal both oscillatory and episodic features. For example, they found that a Euro–Atlantic blocking regime is strongly favored by, although not systematically associated with, a particular phase of a 30–35-day oscillatory component. This type of relationship has potentially important practical implications due to the higher inherent predictability of oscillatory behavior.

The extratropical circulation of the Southern Hemisphere, being much more zonally symmetric than that of the Northern Hemisphere, is deceptively simpler. A closer look hints at a higher complexity since the variance spectrum of empirical orthogonal functions (EOFs) is flatter and the leading modes are characterized by higher zonal wavenumbers, thus suggesting a system with a higher number of degrees of freedom (Ghil and Mo 1991). The leading EOFs of low-frequency variability constructed over the entire Southern Hemisphere consist of a high-latitude vacillation in the strength of the polar vortex and Rossby wave trains over the South Pacific (Kidson 1988). Mo and Ghil (1987) identified a wave-train-like pattern arching poleward from the subtropical central Pacific to Argentina, and then refracting equatorward into the Atlantic. Szeredi and Karoly (1987a,b) identified a similar wave pattern, but with a 90° zonal phase lead. These wavelike “modes” have since been referred to as Pacific–South American (PSA) patterns, numbers 1 and 2, respectively. PSA 1 and 2 are reproduced well by the two leading EOFs of low-frequency variability over a domain restricted to the South Pacific extratropics (see Fig. 1). The nomenclature is motivated by analogy with the Pacific–North American pattern in the Northern Hemisphere (e.g., Wallace and Gutzler 1981).

The spatial phase-quadrature relationship between PSA 1 and 2, together with their near-degeneracy in EOF analyses, suggests a propagating wave. In an analysis of 200-hPa streamfunction, Mo and Higgins (1998) have reported evidence during austral winter from lag correlations and singular spectrum analysis (SSA) that PSA 1 and 2 represent an eastward-propagating wave with a period of about 40 days. They also argued that forcing over the western tropical Pacific can create a favorable situation for a particular phase of the PSA modes to strengthen. Low-frequency variability in the Southern Hemisphere is also episodic in nature (Mo 1986). Mo and Ghil (1987) studied quasi-stationary events in the Southern Hemisphere using daily 500-hPa geopotential maps during 1972–83 and found two types of geographically fixed persistent anomalies, both with a strong zonal wavenumber 3 component and strongly resembling the two leading EOFs in the dataset.

A non-dispersive propagating wave has no preferred phase, so it is unclear how the PSA patterns can simultaneously be described as quadrature phases of a propagating wave and as geographically fixed circulation regimes. The aim of the present paper is to explore systematically the episodic/intermittent and oscillatory/periodic characteristics of the PSA modes, and to extend the work of Mo and coworkers to all four seasons. We examine the evidence for circulation regimes as well as low-frequency oscillations over the South Pacific sector from the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis data, and investigate their seasonal dependence. We apply three analysis methods to the 700-hPa geopotential height dataset: *K*-means clustering, a Gaussian mixture model, and multichannel SSA (MSSA). We then assess any relationships between the circulation regimes and oscillatory behavior that could lead to predictability of the former. All three methods use conventional EOF analysis only as a means of data reduction, thereby circumventing the potential problem of near-degeneracy of PSA 1 and 2 when defined in terms of the leading EOFs, which may complicate the physical interpretation of either one.

The paper is organized as follows. In section 2 we describe the dataset and preprocessing steps. Section 3 contains the regime analysis for each of the four conventional 3-month meteorological seasons. In section 4 we use MSSA to isolate oscillatory behavior and discuss its relationships with the clusters in section 5. This is followed by the conclusions in section 6.

## 2. Data and preprocessing

All analyses in this paper are restricted to the South Pacific sector 20°–90°S, 150°E–60°W. We use daily 700-hPa geopotential heights from the NCEP–NCAR reanalysis dataset for 1948–99 (Kalnay et al. 1996). The lower-tropospheric geopotential is chosen so as to emphasize the midlatitude circulation where 700 hPa corresponds to an approximate “steering level” (Blackmon et al. 1984). Data are very sparse in much of the domain of study, and a near-surface variable should be more closely controlled by surface observations that predominate. The reanalysis fields are on a 2.5° latitude–longitude grid, which was subsampled by omitting every other point to yield a 5° × 5° resolution. The values were preprocessed by low-pass filtering (Blackmon and Lau 1980) at 10 days, followed by construction of the leading EOFs using the covariance matrix, unweighted by area. There is an error in the reanalysis dataset between 1979 and 1992 due to bogus Australian surface pressure data, but its impact on low-pass-filtered data is expected to be minimal (Mo and Higgins 1998).
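The grid subsampling and the EOF construction from the unweighted covariance matrix can be illustrated with a minimal sketch. It assumes the low-pass-filtered heights are already available as an array of shape (time, latitude, longitude); the filter itself is not reproduced, synthetic random numbers stand in for the reanalysis fields, and the function names `subsample_grid` and `leading_eofs` are hypothetical.

```python
import numpy as np

def subsample_grid(field):
    """Keep every other grid point in latitude and longitude (2.5 deg -> 5 deg)."""
    return field[:, ::2, ::2]

def leading_eofs(field, n_modes):
    """EOFs of the unweighted covariance matrix, via SVD of the anomaly matrix."""
    n_time = field.shape[0]
    x = field.reshape(n_time, -1)
    x = x - x.mean(axis=0)                      # anomalies about the time mean
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    variance_fraction = s**2 / np.sum(s**2)     # fraction of variance per mode
    eofs = vt[:n_modes]                         # spatial patterns, one per row
    pcs = u[:, :n_modes] * s[:n_modes]          # principal component time series
    return eofs, pcs, variance_fraction[:n_modes]

# Synthetic stand-in: 200 time steps on a 29 x 61 grid (20-90S, 150E-60W at 2.5 deg).
rng = np.random.default_rng(0)
z = rng.standard_normal((200, 29, 61))
z5 = subsample_grid(z)
eofs, pcs, frac = leading_eofs(z5, n_modes=10)
```

The rows of `eofs` are orthonormal, so projecting any other anomaly field onto them (as is done later with unfiltered 5-day means) reduces to a matrix product.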

For the regime analysis (section 3) we determine EOFs separately for the December–February (summer, DJF), March–May (fall, MAM), June–August (winter, JJA), and September–November (spring, SON) seasons during the 1948–99 interval, which provides the maximum sample size. For the oscillatory analysis (section 4), we constructed EOFs for the *entire* calendar year during the period 1958–99, but with 1) the mean seasonal cycle subtracted on a daily basis (after the low-pass filtering), and 2) the interannual variability suppressed by forming (deseasonalized) anomalies from annual means.
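The two-step anomaly construction used for the oscillatory analysis can be sketched as follows. This is an illustration only: it assumes each year has the same number of days (leap days ignored), operates on a single time series rather than a gridded field, and uses the hypothetical function name `deseasonalize`.

```python
import math

def deseasonalize(daily_by_year):
    """daily_by_year: list of years, each a list with one value per calendar day."""
    n_years = len(daily_by_year)
    n_days = len(daily_by_year[0])
    # Step 1: subtract the mean seasonal cycle, one value per calendar day.
    climatology = [sum(year[d] for year in daily_by_year) / n_years
                   for d in range(n_days)]
    anomalies = [[year[d] - climatology[d] for d in range(n_days)]
                 for year in daily_by_year]
    # Step 2: suppress interannual variability by removing each year's annual mean.
    result = []
    for year in anomalies:
        annual_mean = sum(year) / n_days
        result.append([value - annual_mean for value in year])
    return result

# Synthetic series: an annual cycle plus a year-to-year offset, both of which
# the two steps should remove.
data = [[10.0 * math.cos(2 * math.pi * d / 365) + 3.0 * y for d in range(365)]
        for y in range(4)]
anom = deseasonalize(data)
```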

The two leading EOFs from the latter year-round filtered data are shown in Fig. 1 and compare closely with the PSA patterns constructed by Mo and Higgins (1998) from 500-hPa geopotentials for June–August. They explain comparable fractions of (low-pass filtered) variance (21.7% and 19.2%), and are well separated from the higher-ranked EOFs (EOF 3 explains 12.5%). The pairing between the first two empirical modes is less clear in the individual seasons (in which interannual variability was also retained). When interannual variability is retained, PSA 1 exhibits a zonally symmetric component over the pole, especially in summer. The leading 10 EOFs account for more than 85% of the (low-pass filtered) variance in all analyses.

## 3. Regime analysis

### a. Number of regimes

To analyze low-frequency variability from the episodic point of view, we use primarily *K*-means clustering (MacQueen 1967). This is a straightforward and widely used partitioning method that classifies all days into a predefined set of *K* clusters, such as to minimize the spread within them. The number of clusters must be specified, but the sensitivity of the resulting centroids to the choice of initial seeds and the data subsample can be used as ad hoc criteria for assessing the validity of the partitioning into distinct clusters (Michelangeli et al. 1995). We performed the cluster analysis in the subspace of the leading 10 EOFs for each season in turn, as described in appendix A.
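A minimal sketch of *K*-means partitioning with several initial seeds is given below. It is not the procedure of appendix A: the data are synthetic two-dimensional points rather than 10 PCs, the number of restarts is arbitrary, and all function names are hypothetical. The spread across restarts gives a rough feel for the sensitivity to initial seeds discussed above.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points]

def kmeans(points, k, seed, n_iter=50):
    """Lloyd iterations started from k randomly chosen data points."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, assign(points, centroids)

def spread(points, centroids, labels):
    """Sum of squared distances of the points to their assigned centroid."""
    return sum(squared_distance(p, centroids[lab])
               for p, lab in zip(points, labels))

# Synthetic data: three clouds of points in a two-dimensional subspace.
rng = random.Random(0)
centers = [(-3.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
points = [[rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)]
          for cx, cy in centers for _ in range(100)]

# Repeat from different initial seeds and keep the tightest partition.
runs = [kmeans(points, k=3, seed=s) for s in range(10)]
spreads = [spread(points, c, lab) for c, lab in runs]
best_centroids, best_labels = runs[spreads.index(min(spreads))]
```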

To be most conservative, both robustness criteria are applied together. The *K*-means method then indicates multiple regimes in fall (*K* = 3 or 5), spring (*K* = 4), and marginally in winter (*K* = 3) (see Table A1). The sensitivity to initial seeds is considerably higher in summer, suggesting that the regime description is less valid then.

In order to obtain an independent measure of the extent to which the geopotential height data support the existence of multiple stationary flow regimes, we have also fitted a Gaussian mixture model to the probability density function (PDF) in the EOF subspace. This method seeks to fit the PDF with a small number of Gaussian components and enables a rigorous test for the existence of multimodality, by estimating the cross-validated likelihood of a single Gaussian versus that of a mixture of several. The methodology follows the work of Smyth et al. (1999), and the details are given in appendix B.
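The idea of comparing the cross-validated likelihood of a single Gaussian with that of a mixture can be sketched in one dimension. This is a simplified illustration and not the method of appendix B or of Smyth et al. (1999): it assumes a scalar variable, a deterministic quantile-based initialization, a fixed number of EM iterations, and interleaved folds; all function names are hypothetical.

```python
import math
import random

def normal_pdf(v, mean, variance):
    return math.exp(-0.5 * (v - mean) ** 2 / variance) / math.sqrt(2 * math.pi * variance)

def fit_mixture_1d(x, k, n_iter=100):
    """EM for a 1D Gaussian mixture; means start at evenly spaced sample quantiles."""
    xs = sorted(x)
    n = len(xs)
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    overall = sum(xs) / n
    variances = [sum((v - overall) ** 2 for v in xs) / n] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for v in x:
            dens = [w * normal_pdf(v, m, s2)
                    for w, m, s2 in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and variances (with a small floor).
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            variances[j] = max(
                sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, x)) / nj, 1e-6)
    return weights, means, variances

def log_likelihood(x, params):
    weights, means, variances = params
    return sum(math.log(sum(w * normal_pdf(v, m, s2)
                            for w, m, s2 in zip(weights, means, variances)))
               for v in x)

def cross_validated_loglik(x, k, n_folds=5):
    """Fit on all folds but one, score the held-out fold, and sum over folds."""
    total = 0.0
    for f in range(n_folds):
        train = [v for i, v in enumerate(x) if i % n_folds != f]
        held_out = [v for i, v in enumerate(x) if i % n_folds == f]
        total += log_likelihood(held_out, fit_mixture_1d(train, k))
    return total

# Synthetic bimodal sample: the two-component mixture should score higher.
rng = random.Random(1)
sample = [rng.gauss(-3.0, 1.0) for _ in range(200)] + [rng.gauss(3.0, 1.0) for _ in range(200)]
scores = {k: cross_validated_loglik(sample, k) for k in (1, 2)}
```

Because the score is evaluated on data withheld from the fit, adding components is rewarded only when it improves the description of unseen data, which is what makes the comparison a test of multimodality rather than of model flexibility.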

In fall and winter, the mixture model indicates multimodality with three to four Gaussians (Table B1), supporting the results of the *K*-means method. Figure 2 shows the data scatter in the subspace of EOFs 1 and 2 for fall, together with the locations of the three Gaussians. The positions of the three regime centroids given by the mixture model (circles) agree closely with those of the cluster analysis (squares). In summer the mixture model indicates unimodality, which is consistent with the much higher summertime sensitivity to initial seeds in the *K*-means analysis. In spring, the *K*-means analysis selects *K* = 4 while the mixture model indicates unimodality. We will return to this apparent contradiction in section 6.

Based on the above results, the regime description of low-frequency variability over the South Pacific is well justified in fall and winter, somewhat less certain in spring, and probably not applicable in summer. For parsimony, in what follows we will consider the case of *K* = 3 for fall, winter, and spring, basing our analysis on the *K*-means results. Although the spring values in appendix A (Table A1) suggest that *K* = 4 is more appropriate, three of the cluster centroids obtained in spring are almost indistinguishable from those in winter when *K* = 3 (the pattern correlation between the respective centroids exceeds 0.97), while the fourth centroid is less robust between the seasons.

### b. Spatial structures and regime transitions

Figure 3 shows hemispheric geopotential height anomaly composites for each of the three clusters for fall, winter, and spring. In all three seasons the regimes exhibit PSA-like patterns, characterized by meridionally elongated tripole Rossby wave patterns (Berbery et al. 1992). Regimes 1 and 3 resemble opposite polarities of EOF 1 although they are slightly phase-shifted, while regime 2 resembles EOF 2 (Fig. 1). The cluster analysis does not, however, simply select each polarity of the EOFs, as is clear from the off-axis positions of the centroids in Fig. 2, which represent a linear combination of EOF 1 and 2. Thus, EOF 1 can be viewed as the single spatial pattern that maximizes the variance contained in regimes 1 and 3.

For regime durations beyond a few days, the cumulative frequency distribution of residence times in each regime follows approximately a geometric distribution (not shown), similar to the findings of Dole and Gordon (1983) and Kimoto and Ghil (1993) in the Northern Hemisphere. Thus, the duration of events can be approximated by a first-order Markov chain. The composites in Fig. 3 are ordered in terms of their most frequently occurring transitions, so that the “circuit” 1 ⇒ 2 ⇒ 3 ⇒ 1 is the preferred temporal ordering. This progression of spatial patterns suggests an eastward propagation. Simple counts of the number of times this eastward-propagating circuit occurs, compared to the opposite westward-propagating one (1 ⇒ 3 ⇒ 2 ⇒ 1), are tabulated in Table 1. In each season the eastward-propagating circuit is more frequent than the reverse, but not dramatically so. The third row of Table 1 gives the degree of asymmetry (*a*) in the transition matrices, where a value of *a* = 2 means that transitions are twice as likely to have a preferred direction in time, while *a* = 1 denotes equal likelihood of direction. The tendency toward propagation is strongest in spring with a rough estimate of the period of 30 days given by summing the average regime durations. This apparent propagation motivates the analysis in section 4 where we take an oscillatory approach to analyzing low-frequency variability over the South Pacific sector.
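The bookkeeping behind residence times and transition counts can be sketched as below, starting from a daily sequence of regime labels. Under a first-order Markov chain with daily persistence probability *p*, the probability of a residence time of *n* days is *p*^(*n*−1)(1 − *p*), with mean 1/(1 − *p*). The sketch is illustrative: the label sequence is invented, the function names are hypothetical, and `direction_ratio` is a simple forward-to-backward ratio that is assumed here for illustration and is not necessarily the asymmetry measure *a* of Table 1.

```python
from itertools import groupby

def residence_times(labels):
    """(regime, duration) for each uninterrupted run of identical labels."""
    return [(regime, len(list(run))) for regime, run in groupby(labels)]

def transition_counts(labels, regimes=(1, 2, 3)):
    """Counts of transitions between distinct regimes."""
    visits = [regime for regime, _ in groupby(labels)]
    counts = {(a, b): 0 for a in regimes for b in regimes if a != b}
    for a, b in zip(visits, visits[1:]):
        counts[(a, b)] += 1
    return counts

def direction_ratio(counts):
    """Forward (1->2->3->1) over backward (1->3->2->1) transitions; an assumed measure."""
    forward = counts[(1, 2)] + counts[(2, 3)] + counts[(3, 1)]
    backward = counts[(1, 3)] + counts[(3, 2)] + counts[(2, 1)]
    return forward / backward if backward else float("inf")

def persistence_probability(durations):
    """Daily persistence p of a geometric distribution with the observed mean duration."""
    mean_duration = sum(durations) / len(durations)
    return 1.0 - 1.0 / mean_duration

# Invented daily sequence of regime labels.
labels = [1, 1, 1, 2, 2, 3, 3, 3, 3, 1, 1, 3, 2, 2, 1, 1, 2, 3]
runs = residence_times(labels)
counts = transition_counts(labels)
ratio = direction_ratio(counts)
p = persistence_probability([duration for _, duration in runs])
```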

The results in this section were checked a posteriori for any severe long-term trends and large seasonal variation in the frequency of regime occurrence. No large long-term trends were found. The largest within-season variations in regime membership (averaged across the 51 yr in the dataset) have a magnitude of about a factor of 2, despite the rather similar regime spatial structures seen in different seasons.

## 4. Low-frequency oscillations

In this section we examine the oscillatory components of the 700-hPa geopotential height over the South Pacific, considering the whole calendar year but with the seasonal cycle subtracted. We begin by computing the year-round subannual EOFs using low-pass-filtered data (Fig. 1), as described in section 2. Next, 5-day averages of *unfiltered* data, with the mean seasonal cycle subtracted, are projected onto these EOFs to give pseudo–principal component time series (referred to simply as the PCs in the following). Thus, in effect, we use the leading EOFs of low-frequency variability as “spatial filters” to focus on the low-frequency structures while retaining the full temporal spectrum. Finally, singular spectrum analysis (SSA and MSSA) is applied to these time series, using the University of California, Los Angeles (UCLA), SSA–multi taper method (MTM) Toolkit (Dettinger et al. 1995; Ghil et al. 2002).

We begin with a univariate analysis, and apply SSA to PCs 1 and 2 (i.e., the time series of the PSA 1 and 2 patterns plotted in Fig. 1) in turn. For an independent spectral estimate we also compute the MTM spectra. Figure 4 shows the SSA and MTM spectra for PC 1, with statistical significance estimated against the null hypothesis of red noise, using the tests of Allen and Smith (1996) for SSA and Mann and Lees (1996) for MTM. Both spectra have a long-term “trend” component. This is probably due to interannual variability associated with ENSO's influence on PSA 1 (Karoly 1989; Cazes-Boezio et al. 2003). The MTM spectrum exhibits subseasonal peaks at periods of about 45, 36, 30, 20, and 15 days (all significant at the 95% confidence level). The SSA spectrum suggests oscillatory pairs of eigenvalues at 45, 31, 19, and 15 days, with the 45- and 31-day components being the most statistically significant though at a somewhat lower level than those identified by MTM. The spectrum of PC 2 (Fig. 5) shows very pronounced oscillatory components at 48, 22, and 15 days in both spectral estimates. Mo and Higgins (1998) applied SSA to PSA time series derived from 200-hPa streamfunction and reported common periods in PSA 1 and PSA 2 of 36–40, 22–25, and 16–18 days.
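The core of univariate SSA, in which an oscillation appears as a pair of nearly equal eigenvalues of the lag-covariance matrix, can be sketched as follows. This is not the SSA–MTM Toolkit and omits the red-noise significance tests; the window length, the synthetic 45-day signal sampled as 5-day means, and the function names are assumptions made for illustration.

```python
import numpy as np

def ssa(series, window):
    """Eigenvalues and eigenvectors of the lag-covariance matrix (trajectory approach)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    n_rows = len(x) - window + 1
    trajectory = np.array([x[i:i + window] for i in range(n_rows)])
    covariance = trajectory.T @ trajectory / n_rows
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)  # ascending order
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order]

def dominant_period(vector, n_fft=4096):
    """Period (in samples) of the largest peak in the zero-padded spectrum."""
    spectrum = np.abs(np.fft.rfft(vector, n=n_fft))
    k = np.argmax(spectrum[1:]) + 1          # skip the zero-frequency bin
    return n_fft / k

# Synthetic series of 5-day means: a 45-day oscillation (9 samples) plus noise.
rng = np.random.default_rng(0)
t = np.arange(450)
series = np.sin(2 * np.pi * t / 9.0) + 0.1 * rng.standard_normal(t.size)

eigenvalues, eigenvectors = ssa(series, window=40)
pair_ratio = eigenvalues[1] / eigenvalues[0]              # close to 1 for an oscillatory pair
pair_fraction = eigenvalues[:2].sum() / eigenvalues.sum()
period_days = 5.0 * dominant_period(eigenvectors[:, 0])
```

In real data the pairing is far less clean than in this synthetic case, which is why the significance tests against a red-noise null hypothesis matter.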

While the spectral peaks in PCs 1 and 2 have comparable periods, the respective spectra do not exhibit strongly commensurate peaks that would indicate a propagating wave, given the approximate spatial quadrature seen in EOFs 1 and 2. Indeed, the maximum lag correlation between PCs 1 and 2 is only 0.14, and the spectra in Figs. 4 and 5 indicate a red background with rather modest oscillatory components superposed.

To avoid the potential problem of near degeneracy of PCs 1 and 2 and to examine more closely the spatiotemporal structure of the data, we next apply *multichannel* SSA to the leading six PC time series, thereby using the spatial EOF analysis only as a data reduction tool. To focus on the intraseasonal band, variability with timescales greater than 65 days was removed at the outset using the SSA “detrending” procedure employed by Robertson (1996), applied to each PC time series in turn. Similar results were obtained using only PCs 1 and 2.

The multichannel analysis identifies the 45–50-day component present in both the univariate analyses of EOFs 1 and 2 with a period closer to 42 days, while the behavior between 20 and 30 days appears to be more broad band. Figure 6 shows the eigenvalue spectrum and the power spectra of the leading eight temporal PCs identified by multichannel SSA. The leading two eigenelements form an oscillatory pair with a period of about 42 days. The next four eigenvalues are clustered together and are associated with a broadband peak in the temporal PCs with periods of 20–30 days; these are followed by an oscillatory 18-day pair.
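The multichannel extension can be sketched by stacking lagged copies of all channels into one augmented trajectory matrix. The example is an assumption-laden illustration, not the toolkit computation: it uses two synthetic channels in quadrature with a 42-day period (8.4 samples of 5-day means) rather than six PCs, an arbitrary window of 20 samples, and the hypothetical function name `mssa`.

```python
import numpy as np

def mssa(channels, window):
    """Eigen-decomposition of the covariance of the augmented trajectory matrix."""
    x = np.asarray(channels, dtype=float)        # shape (n_time, n_channels)
    x = x - x.mean(axis=0)
    n_rows = x.shape[0] - window + 1
    # Each row holds `window` consecutive values of every channel, channel by channel.
    trajectory = np.array([x[i:i + window].T.ravel() for i in range(n_rows)])
    covariance = trajectory.T @ trajectory / n_rows
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)  # ascending order
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order]

# Two synthetic channels in quadrature, standing in for PCs 1 and 2.
rng = np.random.default_rng(0)
t = np.arange(600)
phase = 2 * np.pi * t / 8.4
pcs = np.column_stack([np.cos(phase), np.sin(phase)])
pcs = pcs + 0.1 * rng.standard_normal(pcs.shape)

eigenvalues, eigenvectors = mssa(pcs, window=20)
pair_ratio = eigenvalues[1] / eigenvalues[0]
pair_fraction = eigenvalues[:2].sum() / eigenvalues.sum()
```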

Each eigenelement of the MSSA is associated with an evolving spatiotemporal structure, such that the sum of all reconstructs the original dataset. The reconstructed contribution of the 42-day pair is illustrated in Fig. 7. The time series of channels 1 and 2 (i.e., PSA 1 and 2) are plotted in Fig. 7a over the 1997–2000 interval. The PSA 1 and 2 components of the wave do appear in phase quadrature, with PSA 2 leading PSA 1, indicative of eastward phase propagation also hinted at in the regime analysis of section 3 (cf. also EOFs 1 and 2 in Fig. 1). However, the PSA 1 component has considerably higher amplitude than that of PSA 2.

Figure 7b shows the spatial structure of the oscillation through a composite half cycle formed by compositing the reconstruction of the 42-day wave in phase intervals of 1/8th period (i.e., about every 5 days), using the technique of Plaut and Vautard (1994). Here we have plotted the composites of the wave in the PC 1–2 subspace, so that the patterns are linear combinations of PSAs 1 and 2 plotted in Fig. 1. To aid interpretation of the longitudinal phase progression with time through the cycle, Fig. 7c shows meridional averages over the band 50°–60°S, together with the longitudes of the maxima of EOFs 1 and 2. The temporal evolution can be interpreted as a very gradual eastward propagation, with rapid intensification of amplitude into category 1, where the oscillation resembles EOF 1 (i.e., PSA 1). This pattern persists while drifting very slowly eastward into category 2, before decaying rapidly into an EOF-2-like pattern (category 3). Thus, these phase composites indicate a predominantly stationary oscillation, with the peak amplitude corresponding to PSA 1 and the minimum amplitude corresponding to PSA 2. The cyclic evolution of the wave plotted in Figs. 7b and 7c suggests a highly dispersive wave, with an eastward group velocity much exceeding the phase speed. When the phase composites are plotted in the PC 1–6 subspace (in which the MSSA was computed), the peak phase changes little while the quadrature phase is no longer recognizable as PSA 2.
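Compositing by phase category can be sketched as below. This is a simplified stand-in for the technique of Plaut and Vautard (1994), whose details are not reproduced here: it assumes the phase is defined as the angle of a point in the plane of two time series in quadrature, and it composites a scalar series rather than maps. The function names and the synthetic 42-day wave are hypothetical.

```python
import math

def phase_category(a, b, n_categories=8):
    """Category 1..n_categories from the phase angle of the point (a, b)."""
    angle = math.atan2(b, a) % (2 * math.pi)
    return int(angle / (2 * math.pi / n_categories)) % n_categories + 1

def composite_by_phase(a, b, values, n_categories=8):
    """Mean of `values` over all times falling in each phase category."""
    sums = {c: 0.0 for c in range(1, n_categories + 1)}
    counts = {c: 0 for c in sums}
    for ai, bi, v in zip(a, b, values):
        c = phase_category(ai, bi, n_categories)
        sums[c] += v
        counts[c] += 1
    return {c: sums[c] / counts[c] for c in sums if counts[c]}

# Synthetic daily 42-day wave; each category then spans about 5 days.
days = [t + 0.3 for t in range(420)]
a = [math.cos(2 * math.pi * d / 42.0) for d in days]
b = [math.sin(2 * math.pi * d / 42.0) for d in days]
composites = composite_by_phase(a, b, a)
```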

A slow modulation in the amplitude of the 42-day component is visible in Fig. 7a, but there is no detectable relationship with ENSO. To determine whether there is any marked seasonal variation, Fig. 8 shows the variance of the PSA 1 and 2 components of the 42-day wave for each season. PSA 1 clearly dominates the variance in all seasons, with largest values in winter/spring and a minimum in summer. Thus, low-frequency variability during summer appears less coherent, and is neither well characterized by oscillatory behavior nor regimes.

## 5. Relationships between regimes and LFOs

The oscillatory components identified in the previous section account for only a very small fraction of the variance, with the 42-day mode accounting for about 5% of the subannual variance of 5-day means over the South Pacific sector. In this section we explore whether or not this weak oscillation may nonetheless be related to the occurrence of the three circulation regimes constructed in section 3. To do this, regime occurrence was simply counted for each of the eight phase categories of the 42-day oscillation taking the fall, winter, and spring seasons in turn. Confidence limits for by-chance occurrence were computed by permuting the order of the regime occurrence time series 100 times.
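The counting and permutation steps can be sketched as follows. The sketch assumes that the regime label and the phase category are available for the same sequence of times, and that the confidence limits are taken as percentiles of the permuted counts at an assumed 95% level; the synthetic data and the function names are hypothetical.

```python
import random

def counts_by_phase(regimes, phases, regime, n_categories=8):
    """Number of occurrences of `regime` in each phase category."""
    counts = [0] * n_categories
    for r, p in zip(regimes, phases):
        if r == regime:
            counts[p - 1] += 1
    return counts

def permutation_limits(regimes, phases, regime, n_perm=100, level=0.95,
                       seed=0, n_categories=8):
    """Lower and upper by-chance limits from shuffled regime sequences."""
    rng = random.Random(seed)
    shuffled = list(regimes)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(counts_by_phase(shuffled, phases, regime, n_categories))
    k = int((1.0 - level) / 2.0 * n_perm)
    lower, upper = [], []
    for c in range(n_categories):
        column = sorted(row[c] for row in null)
        lower.append(column[k])
        upper.append(column[n_perm - 1 - k])
    return lower, upper

# Synthetic example: regime 1 occurs only in phase categories 1 and 2.
phases = [(t // 5) % 8 + 1 for t in range(800)]
regimes = [1 if p in (1, 2) else (2 if p in (3, 4, 5) else 3) for p in phases]

observed = counts_by_phase(regimes, phases, regime=1)
lower, upper = permutation_limits(regimes, phases, regime=1)
```

Shuffling individual time steps removes all serial correlation from the regime sequence, so limits obtained this way are likely to be narrower than those from a scheme that preserves persistence.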

There is a highly statistically significant relationship between the 42-day oscillation and regimes 1 and 3 during winter and spring, as shown in Fig. 9 for regime 1; the frequency of occurrence of regime 3 is generally the inverse of regime 1 (Table 2). This comes about because regimes 1 and 3 resemble opposite polarities of PSA 1, which is also the spatial pattern that dominates the oscillation (Fig. 7b). Regime 2 also shows a significant relationship during several phases of the oscillation, though the changes in its frequency of occurrence are more moderate, consistent with the weakness of the quadrature phase of the oscillation (Figs. 7b,c). There is a general regime progression of 1 → 2 → 3 through the cycle of the 42-day oscillation, which is consistent with Table 1.

The relationship between the 42-day oscillation and the frequency of occurrence of the three regimes suggests that the latter may contain some predictability. To explore this further, we compute the probability of regime occurrence conditional on the lagged phase category of the oscillation, following Plaut and Vautard (1994). Thus, assuming that the phase category of the oscillation is known at some initial time, we aim to predict the regime occurrence at later times. Figure 10 shows the probability of regime 1 occurring during winter at lead times up to 50 days, conditional on phase category of the 42-day oscillation at the initial time. The results are shown for initial phase categories 1–4, with similar results obtained for the second half cycle. The conditional probabilities clearly exceed chance up to 30 days into the future. Similar results are found for regime 3, as well as for spring and fall. Regime 2 is found to be less predictable. Similar results are found with *K* = 2 regimes specified, in which case we recover EOF 1 (i.e., PSA 1).
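The lagged conditional probability can be sketched as a simple frequency count: among all times at which the oscillation is in a given phase category, the fraction for which the regime of interest occurs a fixed number of steps later. The sketch uses an invented, perfectly periodic example with a 40-step cycle, and the function name is hypothetical.

```python
def lagged_conditional_probability(regimes, phases, regime, category, max_lag):
    """P(regime at t + lag | phase category at t) for lag = 0..max_lag."""
    n = len(regimes)
    probabilities = []
    for lag in range(max_lag + 1):
        hits = total = 0
        for t in range(n - lag):
            if phases[t] == category:
                total += 1
                hits += regimes[t + lag] == regime
        probabilities.append(hits / total if total else float("nan"))
    return probabilities

# Synthetic example: a 40-step cycle in which regime 1 occupies categories 1 and 2.
phases = [(t // 5) % 8 + 1 for t in range(800)]
regimes = [1 if p in (1, 2) else (2 if p in (3, 4, 5) else 3) for p in phases]

probs = lagged_conditional_probability(regimes, phases, regime=1, category=1, max_lag=50)
climatology = regimes.count(1) / len(regimes)   # chance level for comparison
```

In this idealized case the probability returns to 1 after a full cycle; in observed data it would be expected to relax toward the climatological frequency as the lead time increases.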

The apparent predictability seen in Fig. 10 cannot, however, be interpreted as hindcast skill because the probabilities have been computed from the same data used to determine the regimes and the oscillation.

## 6. Discussion and conclusions

We have analyzed low-frequency variability in the NCEP–NCAR reanalysis data of the lower-troposphere (700 hPa) geopotential height field over the South Pacific sector in terms of geographically fixed circulation regimes and oscillatory behavior. The regimes were identified using a *K*-means cluster analysis and a cross-validated Gaussian mixture model, while a multichannel singular spectrum analysis (MSSA) was used to search for oscillatory components.

The spatial structures identified in both types of analysis are similar to the Pacific–South American (PSA) wave trains identified in previous studies. The two leading (*T* > 10 days) EOFs are usually referred to as PSA 1 and PSA 2 (Fig. 1). In relaxing the orthogonality constraints inherent in EOF analysis, our results confirm the physical relevance of EOF 1, since a similar pattern dominates our oscillatory analysis in terms of amplitude (Figs. 7b,c). Our analysis of circulation regimes (Fig. 3) also indicates that the leading two EOFs give reasonable approximations of spatial structure, although the regimes are linear combinations of them (Fig. 2).

Both methods of regime analysis show that low-frequency variability over the South Pacific sector is well described by three or four recurrent geographically fixed circulation regimes in austral fall (MAM) and winter (JJA), and to some extent in spring. Both methods suggest that the regime description is less valid in summer. The spatial structures of the regimes (Fig. 3) are found to be very similar in fall, winter, and spring, consisting of zonal wave trains across the South Pacific. These patterns are confined to midlatitudes and differ from the PSA wave trains detected by Mo and Higgins (1998) in 200-hPa streamfunction that extend into the Tropics. In further contrast to the latter study, we found no statistically significant relationships with tropical OLR and, thus, conclude that the regimes identified here are intrinsic to the midlatitudes, similar to the interpretation of Lau et al. (1994). On the other hand, the frequency of occurrence of regime 3 is highly correlated with ENSO during austral spring (*r* = 0.60 with the Niño-3.4 index). This is consistent with the study of Cazes-Boezio et al. (2003), which proposed that interannual ENSO teleconnections over southeastern South America could be interpreted in terms of the changes in the frequency of occurrence of intraseasonal circulation regimes during October–December.

The spectral analysis of PSAs 1 and 2 reveals a predominantly red spectrum, which is consistent with episodic regime-like behavior where regime durations follow approximately a geometric distribution without a preferred duration. However, there is evidence of significant oscillatory peaks in the 40–50- and 20–30-day bands. The spectral peaks in PSAs 1 and 2 do not match closely, suggesting no simple propagating wave. A multichannel analysis in the subseasonal range identifies a dominant peak at 42.5 days, a slightly longer period than the 36–40-day peak found by Ghil and Mo (1991) and Mo and Higgins (1998). Phase composites of this component show that it is dominated by the PSA 1 spatial pattern, and is almost stationary in phase. A gradual eastward drift of this pattern accompanies its rapid attenuation, so that the quadrature phase is very weak in amplitude. The oscillation is present throughout the year but is most pronounced in austral winter and spring. No ENSO modulation was found.

Previous studies (e.g., Mo and Higgins 1998) have interpreted low-frequency variability over the South Pacific in terms of a propagating wave with a period of about 35–40 days, with an eastward progression characterized by the PSA 1 and PSA 2 patterns. Such a description would not be compatible with the geographically fixed circulation regimes. Our analysis does find strong evidence for the latter in fall and winter. We also find evidence of an oscillatory component with a period of about 42 days. These two findings are consistent because the oscillation is found to be predominantly stationary in space. Thus, we find that both the episodic and the oscillatory viewpoints are consistent with each other, with the regimes characterizing the slow part of the cycle.

While we have stressed the geographically fixed nature of the PSA patterns, we do find some evidence of eastward propagation in austral spring. This is brought out by the transitions between the three regimes identified in the *K*-means analysis (Table 1). It is also hinted at by the increased robustness of the *K* = 4 description in which the four regimes have spatial structures that are close to PSAs 1 and 2. The Gaussian mixture model indicates unimodality in spring, arguing against the regime description during spring. It is beyond the scope of this paper to stratify the oscillatory analysis by season, but it would be of interest to know if the 42-day wave has a stronger propagating component during austral spring.

The 42-day wave is weak, only explaining about 5% of low-frequency variance, so that its relevance is questionable. We investigated whether or not a weak oscillatory component could nonetheless influence regime transitions. According to Fig. 9 there is quite a strong (and highly statistically significant) relationship between the phase of the 42-day wave and regime occurrence during winter and spring. The frequency of occurrence of regime 1 (similar to PSA 1) changes by a factor of 3 between the extreme phases of the oscillation, suggesting that the oscillatory component is stronger than its variance indicates. One explanation for this apparent discrepancy would be that the oscillation is somewhat broader band than indicated by the MSSA. The finding that low-frequency variability over the South Pacific is characterized by both (i) geographically fixed PSA-like circulation regimes and (ii) oscillatory components has implications for potential predictability. We find that bias in the frequency of occurrence of each regime may be strong enough for the oscillation to be used as a predictor of the probability of regime occurrence, up to 30 days in advance in certain cases (Fig. 10), although further work is required to determine whether there is any useful skill using cross validation. The predictive nature of the oscillatory component found here is similar in extent to that reported for the North Atlantic–European sector by Plaut and Vautard (1994).

## Acknowledgments

We would like to thank H. Berbery, G. Cazes-Boezio, and K. Mo for helpful discussions, and the two reviewers whose comments led to a substantial improvement of the manuscript. Many of the computations were performed by G. Kerneis and A. Loireau, two summer student visitors from the French Ecole Polytechnique. This work was supported by NOAA Grants NA06GPO427 and NA06GPO511. The NCEP–NCAR reanalysis data were provided through the NOAA Climate Diagnostics Center (information online at http://www.cdc.noaa.gov/).

## REFERENCES

Allen, M. R., and L. A. Smith, 1996: Monte Carlo SSA: Detecting irregular oscillations in the presence of colored noise. *J. Climate*, **9**, 3373–3404.

Berbery, E. H., J. Nogués-Paegle, and J. D. Horel, 1992: Wavelike Southern Hemisphere extratropical teleconnections. *J. Atmos. Sci.*, **49**, 155–177. Corrigendum, **49**, 2347.

Blackmon, M. L., and N-C. Lau, 1980: Regional characteristics of the Northern Hemisphere wintertime circulation: A comparison of the simulation of a GFDL general circulation model with observations. *J. Atmos. Sci.*, **37**, 497–514.

Blackmon, M. L., N-C. Lau, and H-H. Hsu, 1984: Time variation of 500 mb height fluctuations with long, intermediate and short time scales as deduced from lag-correlation statistics. *J. Atmos. Sci.*, **41**, 981–991.

Cazes-Boezio, G., A. W. Robertson, and C. R. Mechoso, 2003: Seasonal dependence of ENSO teleconnections over South America and relationships with precipitation in Uruguay. *J. Climate*, **16**, 1159–1176.

Cheng, X., and J. M. Wallace, 1993: Cluster analysis of the Northern Hemisphere wintertime 500-hPa height field: Spatial patterns. *J. Atmos. Sci.*, **50**, 2674–2696.

Dettinger, M. D., C. M. Strong, W. Weibel, M. Ghil, and P. Yiou, 1995: Software for singular spectrum analysis of noisy time series. *Eos, Trans. Amer. Geophys. Union*, **76** (2), 12, 14, 21.

Dole, R. M., and N. D. Gordon, 1983: Persistent anomalies of the extratropical Northern Hemisphere wintertime circulation: Geographical distribution and regional persistence characteristics. *Mon. Wea. Rev.*, **111**, 1567–1586.

Ghil, M., and K. Mo, 1991: Intraseasonal oscillations in the global atmosphere. Part II: Southern Hemisphere. *J. Atmos. Sci.*, **48**, 780–790.

Ghil, M., and A. W. Robertson, 2002: “Waves” vs. “particles” in the atmosphere's phase space: A pathway to long-range forecasting? *Proc. Natl. Acad. Sci.*, **99** (Suppl. 1), 2493–2500.

Ghil, M., and Coauthors, 2002: Advanced spectral methods for climatic time series. *Rev. Geophys.*, **40**, 1003, doi:10.1029/2000RG000092.

Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. *Bull. Amer. Meteor. Soc.*, **77**, 437–471.

Karoly, D. J., 1989: Southern Hemisphere circulation features associated with El Niño–Southern Oscillation events. *J. Climate*, **2**, 1239–1251.

Kidson, J. W., 1988: Interannual variations in the Southern Hemisphere circulation. *J. Climate*, **1**, 1177–1198.

Kimoto, M., and M. Ghil, 1993: Multiple flow regimes in the Northern Hemisphere winter. Part I: Methodology and hemispheric regimes. *J. Atmos. Sci.*, **50**, 2625–2643.

Lau, K-M., P-J. Sheu, and I-S. Kang, 1994: Multiscale low-frequency circulation modes in the global atmosphere. *J. Atmos. Sci.*, **51**, 1169–1193.

MacQueen, J., 1967: Some methods for classification and analysis of multivariate observations. *Proc. Fifth Berkeley Symp. on Mathematical Statistics and Probability*, Berkeley, CA, University of California, Berkeley, 281–297.

Mann, M. E., and J. M. Lees, 1996: Robust estimation of background noise and signal detection in climatic time series. *Climatic Change*, **33**, 409–445.

Michelangeli, P. A., R. Vautard, and B. Legras, 1995: Weather regimes: Recurrence and quasi-stationarity. *J. Atmos. Sci.*, **52**, 1237–1256.

Mo, K. C., 1986: Quasi-stationary states in the Southern Hemisphere. *Mon. Wea. Rev.*, **114**, 808–823.

Mo, K. C., and M. Ghil, 1987: Statistics and dynamics of persistent anomalies. *J. Atmos. Sci.*, **44**, 877–901.

Mo, K. C., and R. W. Higgins, 1998: The Pacific South American modes and tropical convection during the Southern Hemisphere winter. *Mon. Wea. Rev.*, **126**, 1581–1598.

Plaut, G., and R. Vautard, 1994: Spells of low-frequency oscillations and weather regimes in the Northern Hemisphere. *J. Atmos. Sci.*, **51**, 210–236.

Robertson, A. W., 1996: Interdecadal variability in a multicentury climate integration. *Climate Dyn.*, **12**, 227–241.

Smyth, P. J., K. Ide, and M. Ghil, 1999: Multiple regimes in Northern Hemisphere height fields via mixture model clustering. *J. Atmos. Sci.*, **56**, 3704–3723.

Szeredi, I., and D. Karoly, 1987a: The horizontal structure of monthly fluctuations of the Southern Hemisphere troposphere from station data. *Aust. Meteor. Mag.*, **35**, 119–129.

Szeredi, I., and D. Karoly, 1987b: The vertical structure of monthly fluctuations of the Southern Hemisphere troposphere.

,*Aust. Meteor. Mag.***35****,**19–30.Wallace, J. M., and D. S. Gutzler, 1981: Teleconnections in the geopotential height field during the Northern Hemisphere winter.

,*Mon. Wea. Rev.***109****,**784–812.

## APPENDIX A

### Cluster Analysis

The *K*-means method is applied in the subspace of the leading 10 PCs following Michelangeli et al. (1995). An initial 10% random subset of days is used to determine the initial seeds, and the algorithm proceeds iteratively from the initial seeds, modifying the cluster centroids (i.e., the *means*) at each iteration. The clustering is then repeated 50 times to eliminate any sensitivity to initial seeds. The *reference partition* is defined from this set of 50 analyses to be the one whose cluster centroids are most similar to the remaining 49, in terms of pattern correlation. The sensitivity to the choice of initial seeds gives a measure of how classifiable the dataset is for a particular prespecified number of clusters *K.* The similarity between two partitions *P*_{i} and *P*_{j} can be quantified by the *smallest* pattern correlation between a centroid in *P*_{i} with its best analog in *P*_{j}. A *classifiability index* (*CI*) can then be defined as the average of this similarity value over all pairs of partitions (Michelangeli et al. 1995). The CI is unity for a perfect match and zero for uncorrelated patterns.

The number of clusters *K* should also maximize the reproducibility of the patterns obtained from subsets of the data (Cheng and Wallace 1993). To quantify reproducibility, random subsets containing 50% of the days in the dataset are drawn 100 times. The reference partition is computed for each 50% subset, and its similarity with that of the full dataset is calculated; averaging these similarity values over all 100 subsets defines a *reproducibility index* (*R*) for each value of *K*. Cheng and Wallace (1993) argue, on the basis of experience, that two *hemispheric* patterns bear a strong resemblance to each other if their pattern correlation is near or above 0.89. They increase this threshold for a sector of the hemisphere to account for the reduced number of spatial degrees of freedom, so as to retain a similar value of the Student's *t* statistic. For the 150° South Pacific sector considered here, the corresponding threshold would be 0.94.
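The classifiability and reproducibility indices described in this appendix can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code used for the analysis: the function names (`kmeans`, `best_analog_corr`, `similarity`, `classify`, `reproducibility`) are hypothetical, the seeds are simply *K* days drawn at random rather than derived from a 10% subset, the pattern correlation is taken as the uncentered correlation between centroid vectors in PC space, *R* is returned as one value per centroid, and the data are synthetic.

```python
import numpy as np

def kmeans(x, k, rng, n_iter=100):
    """Lloyd's K-means; seeds are k days drawn at random (a simplification)."""
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every day to its nearest centroid (Euclidean distance).
        dist = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute the means; an empty cluster keeps its previous centroid.
        new = np.array([x[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids

def best_analog_corr(p_i, p_j):
    """For each centroid in p_i, the correlation with its best analog in p_j."""
    a = p_i / np.linalg.norm(p_i, axis=1, keepdims=True)
    b = p_j / np.linalg.norm(p_j, axis=1, keepdims=True)
    return (a @ b.T).max(axis=1)

def similarity(p_i, p_j):
    """Smallest best-analog correlation between partitions p_i and p_j."""
    return best_analog_corr(p_i, p_j).min()

def classify(x, k, rng, n_runs=50):
    """Return (classifiability index, reference partition) from n_runs restarts."""
    parts = [kmeans(x, k, rng) for _ in range(n_runs)]
    s = np.array([[similarity(p, q) for q in parts] for p in parts])
    off = ~np.eye(n_runs, dtype=bool)          # exclude self-comparisons
    ci = s[off].mean()
    # Reference partition: the one most similar, on average, to the others.
    mean_sim = np.where(off, s, 0.0).sum(axis=1) / (n_runs - 1)
    return ci, parts[int(mean_sim.argmax())]

def reproducibility(x, k, rng, n_subsets=100, n_runs=50):
    """Per-centroid reproducibility from random 50% subsets of the days."""
    _, ref_full = classify(x, k, rng, n_runs)
    scores = []
    for _ in range(n_subsets):
        half = rng.choice(len(x), size=len(x) // 2, replace=False)
        _, ref_half = classify(x[half], k, rng, n_runs)
        scores.append(best_analog_corr(ref_full, ref_half))
    return np.mean(scores, axis=0)             # one value per centroid

# Synthetic demo: three artificial "regimes" in a 10-PC subspace.
rng = np.random.default_rng(0)
centers = 4.0 * np.eye(3, 10)
days = np.vstack([c + rng.standard_normal((200, 10)) for c in centers])
ci, ref = classify(days, k=3, rng=rng, n_runs=20)
r = reproducibility(days, k=3, rng=rng, n_subsets=10, n_runs=10)
print(f"CI = {ci:.2f}, R range = {r.min():.2f} to {r.max():.2f}")
```

The demo uses fewer restarts and subsets than the 50 and 100 quoted in the text purely to keep the run short; both are exposed as parameters.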

Table A1 shows the classifiability and reproducibility indices (in percent) as a function of *K* for each 3-month season. Values of *K* = 3–4 generally yield the best CI and *R* scores, although the highest values of these indices do not always point clearly to a particular value of *K*. The case of *K* = 2 yields almost exactly EOF 1 in all seasons, so that the *K*-means method adds no information beyond classical EOF analysis (Michelangeli et al. 1995).

## APPENDIX B

### The Gaussian Mixture Model

The method of cross-validated maximum likelihood is used to determine the number of component Gaussian distributions that provide the best fit to the data. The cross validation here consists of randomly selecting 25 seasons, training the model on these seasons, and then validating on the remaining 25. The procedure is repeated 20 times. This method provides a rigorous test of any multimodality in the PDF, against the null hypothesis of a single unimodal Gaussian.
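The cross-validation procedure above can be illustrated with a short NumPy sketch. It is only a sketch under stated assumptions: the EM fit (`fit_gmm`), the helper names, the random-day initialization, the small covariance regularization `reg`, and the synthetic 50-season dataset are all illustrative choices, not the implementation used in the paper. Half of the seasons are drawn at random for training, the held-out log-likelihood is evaluated on the other half, and the result is averaged over the repeated splits.

```python
import numpy as np

def log_gauss(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at each row of x."""
    diff = x - mean
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (maha + logdet + x.shape[1] * np.log(2.0 * np.pi))

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def component_logp(x, weights, means, covs):
    """(n_days, k) array of log[w_j N(x | mean_j, cov_j)]."""
    return np.stack([np.log(w) + log_gauss(x, m, c)
                     for w, m, c in zip(weights, means, covs)], axis=1)

def fit_gmm(x, k, rng, n_iter=100, reg=1e-6):
    """Fit a k-component Gaussian mixture by EM (illustrative implementation)."""
    n, d = x.shape
    means = x[rng.choice(n, size=k, replace=False)]   # random days as initial means
    covs = np.array([np.cov(x.T) + reg * np.eye(d) for _ in range(k)])
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E step: responsibility of each component for each day.
        lp = component_logp(x, weights, means, covs)
        resp = np.exp(lp - logsumexp(lp, axis=1)[:, None])
        # M step: update weights, means, and covariances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ x) / nk[:, None]
        for j in range(k):
            diff = x - means[j]
            covs[j] = (resp[:, j:j + 1] * diff).T @ diff / nk[j] + reg * np.eye(d)
    return weights, means, covs

def held_out_loglik(x, params):
    """Total log-likelihood of the validation days under a fitted mixture."""
    return logsumexp(component_logp(x, *params), axis=1).sum()

def cross_validate(x, season, k_values, rng, n_splits=20):
    """Mean held-out log-likelihood for each k over random half/half season splits."""
    seasons = np.unique(season)
    total = np.zeros(len(k_values))
    for _ in range(n_splits):
        train_ids = rng.choice(seasons, size=len(seasons) // 2, replace=False)
        train = np.isin(season, train_ids)
        for i, k in enumerate(k_values):
            total[i] += held_out_loglik(x[~train], fit_gmm(x[train], k, rng))
    return total / n_splits

# Synthetic demo: 50 "seasons" of 30 days drawn from three 2D Gaussians.
rng = np.random.default_rng(1)
centers = np.array([[-4.0, 0.0], [4.0, 0.0], [0.0, 5.0]])
season = np.repeat(np.arange(50), 30)
x = centers[rng.integers(0, 3, size=season.size)] + rng.standard_normal((season.size, 2))
k_values = [1, 2, 3, 4]
cv = cross_validate(x, season, k_values, rng, n_splits=5)
for k, ll in zip(k_values, cv):
    print(k, round(float(ll), 1))
```

Splitting by whole seasons rather than by individual days keeps serially correlated days from the same season out of both the training and validation sets at once, which is the point of the season-based split described in the text.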

The mixture model was applied to the same seasonally stratified datasets used for the *K*-means analysis, restricting the data to the subspace of the leading two PCs. The results are shown in Table B1 as a function of the number of Gaussian components, *k*. The cross-validated log-likelihoods are relative measures of likelihood, with the maximum (i.e., least negative) value being the most likely. The estimated (posterior) probabilities of each value of *k*, given the dataset [i.e., *P*(*k* | *D*)], are also tabulated (see Smyth et al. 1999). The high posterior probability of *k* = 3 in fall is consistent with the cluster analysis in which there is a unique coincidence of both CI and *R* values greater than 0.98 (Table A1) with three clusters during the fall.
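One common way to turn cross-validated log-likelihoods into posterior probabilities is to assume that *P*(*k* | *D*) is proportional to the exponential of the cross-validated log-likelihood times a prior on *k*. The sketch below makes that assumption explicit, with equal priors by default; the function name `posterior_over_k` and the input values are hypothetical and are not taken from Table B1.

```python
import numpy as np

def posterior_over_k(cv_loglik, prior=None):
    """Normalize exp(log-likelihood) * prior over the candidate values of k."""
    ll = np.asarray(cv_loglik, dtype=float)
    if prior is None:
        prior = np.full(ll.size, 1.0 / ll.size)   # assumption: equal priors
    else:
        prior = np.asarray(prior, dtype=float)
    # Subtract the maximum before exponentiating to avoid underflow.
    w = np.exp(ll - ll.max()) * prior
    return w / w.sum()

# Made-up log-likelihoods for k = 1..4, for illustration only.
p = posterior_over_k([-3.0, -1.0, -2.0, -2.5])
print(p.round(3))
```

Because only differences in log-likelihood enter the result, the posterior is unchanged if the same constant is added to every entry.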

Scatter of daily geopotential heights (Mar–May) projected onto the PC 1/PC 2 plane, with every fifth point plotted. The solid squares denote the positions of the three *K* means. The results of the Gaussian mixture model are plotted in terms of its centroids (solid circles) and covariance ellipses.

Citation: Monthly Weather Review 131, 8; 10.1175//2548.1

Regime composites of 700-hPa geopotential height obtained from the *K*-means analysis of the South Pacific sector for each season, 1948–99. Contour interval: 10 gpm. Negative anomalies are shaded.

Spectral analysis of pseudo-PC 1. (a) SSA spectrum computed with a window of *M* = 40 5-day means (an approximate spectral resolution of 0.0040 cpd). The error bars give the 95% confidence interval of a red noise process fitted to the time series. (b) MTM spectrum computed with 39 tapers, yielding a half-bandwidth spectral resolution of 0.0013 cpd. The background curves denote the 50%, 95%, and 99% thresholds of a red noise null hypothesis. Periodicities of interest are given above each spectrum (in days).

Spectral analysis of pseudo-PC 2. Details as in Fig. 4.

Multichannel SSA of detrended pseudo-PCs 1–6, using *M* = 20. (a) Eigenvalue rank spectrum, in % of total variance. (b) MEM spectra (five poles) of the leading eight temporal PCs resulting from the MSSA. The approximate periods are given above the curves. The MSSA was applied using the full covariance matrix (Plaut and Vautard 1994).

The sum of MSSA reconstructed components (RCs) 1 and 2. (a) Channel 1 (solid curve) and channel 2 (dotted curve) plotted over the 3-yr interval 1997–99. (b) Phase composites over phase intervals 0–*π*/4 (1), *π*/4–*π*/2 (2), *π*/2–3*π*/4 (3), and 7*π*/4–2*π* (8), with negative anomalies shaded. (c) Meridional averages 50°–60°S of the maps in (b). Also shown (arrows) are the longitudes of the maxima of EOFs 1 and 2 (Fig. 1). The units are arbitrary. Panels (b) and (c) show the projection of the oscillation in the 2D subspace of PCs 1 and 2 (Fig. 1).

The seasonal variation of the amplitude of the 42-day wave reconstructed components (RCs 1–2), in terms of the variance of channels (PCs) 1 (gray) and 2 (black).

Occurrence frequency of regime 1 for each phase category of the 42-day wave reconstructed components (RCs 1–2). The error bars show the 95% range of random sampling.

Conditional probability of occurrence of regime 1 during Jun–Aug, taking the phase category of the 42-day wave reconstructed components (RCs 1–2) as a predictor, as a function of lead time. Error bars give the 95% range of random sampling.

Number of transitions that belong to each temporal “circuit,” and the average asymmetry in the transition matrix, defined by the arithmetic average of the ratios of the off-diagonal elements.

The number of days that fall simultaneously into the three clusters (columns) and eight 42-day oscillation *π*/4 phase categories (rows). The clusters were determined separately for each season. Values smaller than the 2.5th percentile of random sampling are given in italics, with those that exceed the 97.5th percentile in boldface.

Table A1. Classifiability (CI) and reproducibility (*R*) indices for the *K*-means analysis as a function of *K* for each season, both in %. The reproducibility is given as a range over all the *K* centroids in the partition. Values of 94% or above are significant, according to an ad hoc threshold.

Table B1. (top) Cross-validated log-likelihood and (bottom) estimated posterior probability of the Gaussian mixture model as a function of *k*. The most likely values are highlighted in boldface.