This article investigates the prominent features of the Southern Hemisphere (south of 20°S) atmospheric circulation when extracted using EOF analysis and a k-means clustering algorithm. The focus is on the southern annular mode (SAM), the nature of its recent trend, and the zonal symmetry of associated spatial patterns. The study uses the NCEP–Department of Energy Atmospheric Model Intercomparison Project II Reanalysis (NCEP-2) (period 1979–2009) to obtain robust patterns over the recent years and the Twentieth Century Reanalysis Project (period 1871–2008) to document decadal changes. Also presented is a comparison of these signals against a station-based reconstruction of the SAM index and a gridded interpolated dataset [Hadley Centre Sea Level Pressure dataset version 2 (HadSLP2)].
Over their common period, the two reanalyses are in fair agreement, in terms of both spatial patterns and temporal variability. In particular, both datasets show weather regimes that can be interpreted as the opposite phases of the SAM. At the decadal time scale, the study shows that the trend toward the positive SAM phase (as inferred from the usual EOF-based index) is related more to an increase in the frequency of clusters corresponding to the positive phase, with little change in the frequency of the negative SAM events. Similarly, the long-term tropospheric warming trend already discussed in the literature is shown to be related more to a decrease in the number of abnormally cold days, with little change in the number of abnormally warm days. The cluster analysis therefore complements descriptions based on simple indexes or EOF decompositions, highlighting the nonlinear nature of the decadal changes in the Southern Hemisphere atmospheric circulation and temperature.
The southern annular mode (SAM; Rogers and van Loon 1982; Thompson and Wallace 2000), also called the Antarctic Oscillation (AAO), is the leading mode of atmospheric variability south of 20°S. It basically consists of an atmospheric mass transfer from the Antarctic region to the southern midlatitudes, with these two regions experiencing out-of-phase surface pressure and geopotential height anomalies. The SAM positive (negative) phase translates into a poleward (equatorward) shift and a strengthening (weakening) of the midlatitude westerly wind belt. The SAM variability can be found throughout the year, with a possible seasonal peak in December (Gong and Wang 1999). Its most obvious effects are mainly restricted to the lower and middle layers of the troposphere, although they were noted to extend to the upper levels and the stratosphere during the late austral spring season (Thompson and Wallace 2000).
The most conspicuous signal present in most descriptors or indexes of the SAM is a trend toward its positive phase in recent decades (since the 1960s). This trend, related to decreased (increased) pressure over Antarctica (the midlatitudes) and an associated poleward shift of the midlatitude westerlies, has been identified using various observational or reanalyzed datasets, albeit with different characteristics (Marshall 2003; Renwick 2004; Hines et al. 2000). It has been attributed (Thompson and Solomon 2002; Shindell and Schmidt 2004; Arblaster and Meehl 2006; Polvani et al. 2011) to a combination of forcings, dominated by ozone and greenhouse gases (GHGs). It remains uncertain whether this trend will continue in the future when the combination of increased GHGs but decreased ozone depletion is considered (Son et al. 2008, 2010; Perlwitz 2011; Arblaster et al. 2011).
While often described as a zonal mode (hence the term “annular”), the canonical spatial signature presents a certain degree of zonal asymmetry, especially in the Pacific sector, where the zonal distribution of positive–negative anomalies of, for example, geopotential height is interrupted around 100°W. The relationships between the regional and hemispheric signatures of the SAM have been investigated by Kushner and Lee (2007), who isolated regional-scale (wavenumbers 3 and 4) eastward-propagating wave structures in the SAM patterns. In an aquaplanet model simulation, Cash et al. (2002) have also shown that the modeled annular mode can be more accurately described as a zonally homogeneous distribution of zonally localized events, showing clear meridional structures, rather than a zonally symmetric mode of variability per se. These issues are of primary importance for, specifically, the oceanic response to the SAM, which can be sensitive to even weak meridional structures (Sallée et al. 2010), notably through variations in the meridional component of the surface wind. Coupled models (Sen Gupta and England 2006) usually simulate a SAM structure that is too zonally symmetric, making such tools inappropriate for studying the details of the variability patterns embedded within the SAM variability and its impact on air–sea fluxes.
In this paper, our aim is to examine the relationships between the most usual definition of the SAM, based on empirical orthogonal functions (EOFs), and the results of a decomposition into recurrent “weather regimes” obtained via cluster analysis. Emphasis is placed on the relationships between regional and hemispheric signals and the corresponding emergence of zonal asymmetries, as well as on the characterization of the trend observed in recent decades in terms of changes in the occupation statistics of daily circulation regimes.
2. Data and methods
The SAM index has been defined in several ways in the literature. Gong and Wang (1999), Marshall (2003), and Meneghini et al. (2006) used the difference of normalized zonal-mean sea level pressure (SLP) between 40° and 65°S. Rogers and van Loon (1982) retained the first eigenvector of an EOF analysis applied to the sea level pressure over a domain extending south of 20°S. Thompson and Wallace (2000) applied the same analysis on the same domain but to the 850-hPa geopotential height (Z850). Carvalho et al. (2005), Reason and Rouault (2005), L’Heureux and Thompson (2006), Hendon et al. (2007), and Pohl et al. (2010) instead used the 700-hPa geopotential height (Z700).
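For illustration only, the zonal-mean definition (which is not the index used in this paper) can be sketched as follows. The function name `zonal_sam_index`, the array layout, and the synthetic SLP field are assumptions made for the example, and the normalization is taken over the full record.

```python
import numpy as np

def zonal_sam_index(slp, lats):
    """Difference of normalized zonal-mean SLP between 40S and 65S.

    slp  : array (time, lat, lon), sea level pressure
    lats : array (lat,), latitudes in degrees north (negative in the SH)
    """
    def normalized_zonal_mean(target_lat):
        i = int(np.argmin(np.abs(lats - target_lat)))  # nearest grid latitude
        zm = slp[:, i, :].mean(axis=1)                 # zonal mean time series
        return (zm - zm.mean()) / zm.std()             # standardize over the record
    return normalized_zonal_mean(-40.0) - normalized_zonal_mean(-65.0)

# Synthetic example on a 2.5-degree grid south of 20S (random numbers, not real SLP).
rng = np.random.default_rng(0)
lats = np.arange(-90.0, -19.0, 2.5)
lons = np.arange(0.0, 360.0, 2.5)
slp = 1000.0 + rng.standard_normal((120, lats.size, lons.size))
sam = zonal_sam_index(slp, lats)
```

Because it uses only two latitude circles, this definition imposes zonal symmetry by construction, which is the limitation the EOF-based index avoids.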
In this paper, we also applied the methodology based on the EOF analysis, since it does not postulate a priori that the SAM is purely zonally symmetric. Our daily SAM index is obtained as the leading principal component (PC1) of an EOF analysis applied to the Z700 anomaly field south of 20°S, after the removal of the annual cycle. Each grid point was previously scaled by the square root of the cosine of its latitude. Here the Z700 daily fields are provided by two complementary reanalysis products.
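A minimal sketch of this index construction is given below, under stated assumptions: the annual cycle is removed by subtracting the long-term mean of each calendar day (the exact filtering is not specified here), the EOFs are obtained from a singular value decomposition, and the data are synthetic. The names `remove_annual_cycle` and `leading_eof` are hypothetical.

```python
import numpy as np

def remove_annual_cycle(field, day_of_year):
    """Subtract the long-term mean of each calendar day (assumed definition)."""
    anom = np.empty(field.shape, dtype=float)
    for d in np.unique(day_of_year):
        sel = day_of_year == d
        anom[sel] = field[sel] - field[sel].mean(axis=0)
    return anom

def leading_eof(anom, lats):
    """Return standardized PC1, the EOF1 map, and the variance fraction of EOF1."""
    nt, nlat, nlon = anom.shape
    # Scale each grid point by the square root of the cosine of its latitude.
    weights = np.sqrt(np.cos(np.deg2rad(lats)))[None, :, None]
    x = (anom * weights).reshape(nt, nlat * nlon)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    pc1 = u[:, 0] * s[0]
    explained = s[0] ** 2 / np.sum(s ** 2)
    return pc1 / pc1.std(), vt[0].reshape(nlat, nlon), explained

# Synthetic "Z700": seasonal cycle + annular seesaw + noise (not real data).
rng = np.random.default_rng(0)
lats = np.arange(-87.5, -19.0, 7.5)
lons = np.arange(0.0, 360.0, 15.0)
day_of_year = np.tile(np.arange(365), 10)               # 10 years of 365 days
true_index = rng.standard_normal(day_of_year.size)
annular = np.where(lats < -60.0, -1.0, 1.0)[None, :, None]
seasonal = 50.0 * np.cos(2 * np.pi * day_of_year / 365.0)[:, None, None]
noise = 10.0 * rng.standard_normal((day_of_year.size, lats.size, lons.size))
z700 = 3000.0 + seasonal + 30.0 * true_index[:, None, None] * annular + noise

anom = remove_annual_cycle(z700, day_of_year)
sam_index, eof1, explained = leading_eof(anom, lats)
```

The sign of an EOF is arbitrary, so in practice the index would be flipped if needed so that positive values correspond to low heights over Antarctica.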
The National Centers for Environmental Prediction–Department of Energy (NCEP–DOE) Atmospheric Model Intercomparison Project II Reanalysis (NCEP-2; Kanamitsu et al. 2002) has been available since 1979 on a 2.5° × 2.5° regular grid and assimilates satellite fields to provide the best possible estimates of the state of the atmosphere during recent years. The Z700 field is among the most reliable data in NCEP-2 (class A variable; Kalnay et al. 1996), although the amount of data assimilated over the domain is quite low. NCEP-2 will be useful in establishing the nature of the spatial patterns related to the SAM and especially the emergence of zonal asymmetries. However, its short period (30 yr) and its start in 1979, whereas several observational studies suggest that the SAM positive trend started in approximately the mid-1960s, make it less than ideal for studying long-term trends.
The Twentieth Century Reanalysis version 2 (20CR; Compo et al. 2011) has been available since 1871 on a 2° × 2° regular grid and assimilates only surface pressure data through an ensemble Kalman filter, for consistency between the presatellite and the satellite eras. The 20CR fields used here are the ensemble mean of 56 members. Although surface data have been assimilated since the first years of the reanalysis in the southern midlatitudes, the first continuous data assimilated in Antarctica date back to the early 1910s, with a 30-yr gap between the two World Wars. The station network has been denser and almost constant since the International Geophysical Year (1957; G. P. Compo 2011, personal communication). This inconstancy in the assimilated data generates uncertainties that decrease with time, with recent years being more reliable than the late nineteenth and the early twentieth centuries. The appendix presents a comparison of 20CR SAM-related signals with two station-based indexes (Marshall 2003; Visbeck 2009) as well as with the gridded, optimally interpolated Hadley Centre Sea Level Pressure dataset version 2 (HadSLP2) (Allan and Ansell 2006). While discrepancies exist, especially before the 1950s, there is very good agreement between 20CR and observation-based patterns in the last 60 yr: the advantage is that the daily resolution of the 20CR allows for interpreting trends and decadal variability in terms of changes in the frequency of recurrent daily regimes.
In this study, we chose to use both reanalyses over their longest periods (1979–2010 for NCEP-2 and 1871–2008 for 20CR), even if caution must be used in the interpretation of variability in the 20CR before the 1950s (see the appendix). Over their common period, both datasets simulate very similar mean Z700 fields (Figs. 1a and 1d). On a daily time scale, the strongest Z700 variance is not found at such high latitudes (Figs. 1b and 1e). It is found between 45° and 75°S, particularly near 60°S over the southern Pacific basin, a feature that both reanalyses succeed in reproducing. Daily Z700 variations, however, are slightly weaker in the 20CR over these latitudes and also over Antarctica (Fig. 1h). This is probably due to smoother fields, in line with the averaging of the 56 members.
The first EOF (EOF1) explains only 8.5% (NCEP-2) and 9.6% (20CR) of the original variance of the respective Z700 anomaly daily fields. Loading patterns (Figs. 1c and 1f) are very similar to those found in the literature (e.g., Carvalho et al. 2005), even if the 20CR presents more uniform loading values in the southern midlatitudes. When averaged at the monthly time scale over the post-1979 period, our two indexes show more than 98% of common variance with the monthly SAM index available on the Climate Prediction Center (CPC) website, which is based on the same methodology (available at http://www.cpc.ncep.noaa.gov/products/precip/CWlink/daily_ao_index/aao/monthly.aao.index.b79.current.ascii.table). A positive (negative) phase of the SAM index corresponds to above-average (below-average) surface pressure over the Southern Ocean and below-average (above-average) pressure over Antarctica, as well as an increase (decrease) of the average pressure gradient at these latitudes (Figs. 1a and 1d). Note, however, that the largest loadings are found over the Antarctic region rather than over the southern midlatitudes. The consequence is that any structure of geopotential anomalies with a strong signal over the Antarctic continent, even if the structure is weak or not zonally symmetric over the midlatitudes, is likely to project strongly onto this EOF and then translate into relatively high PC scores. Hence, the EOF-based definition of the SAM is adequate to depict the regional variability over Antarctica alone, but it is less suited to inferring the variability structures in the midlatitudes. EOF decomposition also assumes linearity and orthogonality of the successive components: it assumes symmetry between the negative and positive phases and makes it difficult to analyze trends in terms of changes in occupation statistics.
A complementary description of the SAM-related variability in the Southern Hemisphere is therefore needed. To that end we decompose Z700 variability structures in terms of recurrent, discrete, daily weather regimes. This approach enables us to compare information gained on possible zonal asymmetries, as well as information on the long-term trends in terms of occupation statistics of the different regimes. This methodology has been successfully applied to the variability related to the North Atlantic Oscillation in the Northern Hemisphere (e.g., Monahan et al. 2000, 2001; Cassou 2008).
We use the so-called k-means clustering algorithm (Michelangeli et al. 1995; Cassou et al. 2005). Citing Cassou (2008, p. 3), “Despite some controversial about their existence (Stephenson et al. 2004) and significance, as well as their number (Christiansen 2007), it is now widely recognized that changes in the occurrence and intrinsic properties of the weather regimes may be an important issue for medium-range (weekly to monthly) to climate change forecasts (decadal to trend: Straus et al. 2007).”
We essentially follow the methodology of Cheng and Wallace (1993) and Michelangeli et al. (1995). Let us assume that we have time series of daily observations on grid points (e.g., daily maps of Z700), and each set of daily observations corresponds to a data point in an N-dimensional phase space, where N is the number of grid points. Given an a priori fixed number of regimes, k, the aim of the k-means algorithm is to obtain a partition P of the data points into k regimes that minimizes the sum of the intra-regime variances, $W(P) = \sum_{j=1}^{k} \sum_{X \in C_j} d^2(X, Y_j)$, where $Y_j$ is the centroid of cluster $C_j$. The Euclidean distance $d(X, Y)$ is used to measure the similarity between two data points, X and Y. The overall minimum of the function W(P) then corresponds to the partition that best separates the different data points. When the classification is applied to large samples (atmospheric fields, for example), this overall minimum cannot be found in practice because of the huge number of different possibilities to explore. The algorithm defines iterative partitions, P(n), for which W[P(n)] decreases with n and eventually converges to a local minimum of the function W(P). The overall minimum of W(P) may be surrounded by many local minima that differ from the overall minimum as well as from each other by only a few data points, which are located in the phase space far away from the regime centroids. These points may move from one regime to another depending on, for example, the analyzed sample or the initially chosen regime seeds. The reproducibility of the obtained partitions should therefore be tested.
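The iterative scheme described above can be sketched with a plain Lloyd-type k-means, shown here on synthetic points rather than Z700 maps. This is an illustration under assumptions (random seeding from the data points, a simple convergence test, and a hypothetical function name `kmeans`), not the code used for the study. The list `w_history` records W[P(n)] at each iteration and is non-increasing.

```python
import numpy as np

def kmeans(x, k, rng, max_iter=200):
    """Lloyd-type k-means; returns labels, centroids, and the history of W(P)."""
    # Initial seeds: k distinct data points drawn at random.
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    w_history = []
    for _ in range(max_iter):
        # Squared Euclidean distance of every point to every centroid.
        d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # W(P): sum of squared distances of points to their own centroid.
        w_history.append(float(d2[np.arange(len(x)), labels].sum()))
        updated = centroids.copy()
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:          # keep the old centroid if a cluster empties
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break                          # local minimum of W(P) reached
        centroids = updated
    return labels, centroids, w_history

# Three synthetic, well-separated groups in a 3D phase space.
rng = np.random.default_rng(1)
centers = np.array([[4.0, 0.0, 0.0], [-4.0, 0.0, 0.0], [0.0, 4.0, 0.0]])
x = np.vstack([c + rng.standard_normal((200, 3)) for c in centers])
labels, centroids, w_history = kmeans(x, k=3, rng=rng)
```

Rerunning with a different random generator seed changes the initial draw and may lead to a different local minimum, which is why reproducibility needs to be tested.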
If the distribution of the data points is uniform, then the final partition is expected to be largely dependent on the initial randomly chosen seeds. In contrast, when the dataset is distributed into well-defined regimes, two different initial draws should theoretically lead to roughly similar final partitions. The dependence of the final result on the initial random draw may thus be used as an indicator of the degree of “classifiability” of the dataset into k regimes. Following Michelangeli et al. (1995) and Moron and Plaut (2003), we performed 100 different partitions of the Z700 anomaly patterns, each time initialized by a different random draw. The most natural way to measure the dependence of the final partition on the initial random draw, and thus the classifiability of the original dataset, consists of comparing several final partitions $P_k^{(m)}$, with m = 1, … , 100, for a given number of regimes k. We then retain the partition having the highest similarity with the 99 other ones. A “classifiability index,” c* (Cheng and Wallace 1993), is then calculated, measuring the average similarity c within the 100 sets of regimes: it is defined by $c^*(k) = \frac{1}{100 \times 99} \sum_{1 \le m \ne m' \le 100} c\left(P_k^{(m)}, P_k^{(m')}\right)$. Its value would be exactly 1 if all the partitions were identical. In addition to this test, used in most papers dealing with weather regimes, we duplicated this analysis 100 times, thereby providing 100 different values of c*. This operation allows us to estimate the reproducibility of c* and its sensitivity to the initial random draws.
Another issue is the best choice for the number of clusters k to be retained. If the Z700 anomaly patterns gather into k regimes in a natural way, then one would expect the classifiability of the actual maps to be significantly better than that of an ensemble of artificial datasets generated through a first-order Markov process and having the same covariance matrix as the true atmospheric data (Moron and Plaut 2003). The red-noise test (applied to Markov-generated red-noise data) operates as follows: 100 samples of the same length as the atmospheric dataset are generated, providing 100 values of the classifiability index, which are then ranked to find the 5% and 95% confidence limits. The value of c* for the atmospheric dataset is then compared with these limits: a value above the 95% confidence limit indicates, for the corresponding value of k, a classifiability significantly higher than that of the red-noise model. The operation is repeated for k varying from 2 to 10: in most cases the best choice for the number of regimes appears quite unambiguously (Michelangeli et al. 1995).
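A hedged sketch of the classifiability index follows. The text does not spell out the similarity c between two partitions, so the example assumes one common choice: for each centroid of one partition, take its best pattern correlation with the centroids of the other, and keep the worst of these best matches. The function names and the synthetic centroid sets are hypothetical.

```python
import numpy as np

def partition_similarity(cp, cq):
    """Assumed similarity c(P, Q) between two sets of k centroids (k, n_points)."""
    a = cp - cp.mean(axis=1, keepdims=True)
    b = cq - cq.mean(axis=1, keepdims=True)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    corr = a @ b.T                       # pattern correlations, centroid by centroid
    return float(corr.max(axis=1).min()) # worst of the best matches

def classifiability(centroid_sets):
    """Return c* (mean similarity over all ordered pairs) and the reference partition."""
    m = len(centroid_sets)
    sim = np.array([[partition_similarity(p, q) for q in centroid_sets]
                    for p in centroid_sets])
    c_star = float(sim[~np.eye(m, dtype=bool)].mean())
    mean_to_others = (sim.sum(axis=1) - np.diag(sim)) / (m - 1)
    return c_star, int(np.argmax(mean_to_others))

# Synthetic check: identical partitions (up to relabeling) versus noisy ones.
rng = np.random.default_rng(3)
base = rng.standard_normal((4, 60))
identical = [base[rng.permutation(4)] for _ in range(10)]
noisy = [base + rng.standard_normal(base.shape) for _ in range(10)]
c_same, _ = classifiability(identical)
c_noisy, reference = classifiability(noisy)
```

Here `c_same` equals 1 up to rounding, as expected when all partitions are identical, while `c_noisy` is lower. The second returned value is the index of the partition most similar on average to the others.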
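The generation of one red-noise sample can be sketched as below. The example assumes that the surrogates are built in PC space, where the covariance matrix is diagonal, so that an independent first-order autoregressive series per PC, matched in variance and lag-1 autocorrelation, preserves that matrix. The function names `lag1` and `red_noise_surrogate` are hypothetical.

```python
import numpy as np

def lag1(x):
    """Lag-1 autocorrelation of each column."""
    x = x - x.mean(axis=0)
    return (x[1:] * x[:-1]).mean(axis=0) / x.var(axis=0)

def red_noise_surrogate(pcs, rng):
    """AR(1) surrogate with the same variance and lag-1 autocorrelation per PC."""
    nt, npc = pcs.shape
    var = pcs.var(axis=0)
    r1 = lag1(pcs)
    innov_std = np.sqrt(var * (1.0 - r1 ** 2))   # keeps the stationary variance
    out = np.empty((nt, npc))
    out[0] = np.sqrt(var) * rng.standard_normal(npc)
    for t in range(1, nt):
        out[t] = r1 * out[t - 1] + innov_std * rng.standard_normal(npc)
    return out

# Synthetic "PCs" with known persistence (not reanalysis data).
rng = np.random.default_rng(2)
nt = 20000
target_r1 = np.array([0.8, 0.5])
pcs = np.zeros((nt, 2))
for t in range(1, nt):
    pcs[t] = target_r1 * pcs[t - 1] + rng.standard_normal(2)
surrogate = red_noise_surrogate(pcs, rng)
```

In a full test, the clustering and the classifiability index would be recomputed on each of the 100 surrogates, and the limits could be obtained with `np.percentile(values, [5, 95])`, where `values` is a placeholder for the 100 surrogate c* values.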
In the present study, the k-means algorithm has been applied on the subspace spanned by the first 31 (29) PCs (explaining 80% of the original variance) of the EOF analysis of Z700 daily anomalies, for the NCEP-2 (20CR) over their respective periods. This step is useful to reduce the dimensionality of the problem and ensure linear independence between the input variables (Huth 1996). Figure 2 presents c* for both reanalyses and associated 5% and 95% significance levels estimated by the red-noise test (see section 2), for a number of clusters k varying between 2 and 10. For each value of k, the box-and-whisker plots show the empirical distribution for the 100 k-means analyses. Figure 2 reveals that the best number of classes k is four (four or eight) in NCEP-2 (20CR), since all corresponding partitions systematically reach the 95% significance level. For consistency between the two reanalyses and for the sake of compactness, we successively analyze the partitioning that presents the highest c* among the 100 that were performed for k = 4 in NCEP-2 and the 20CR.
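The truncation to the leading PCs can be illustrated as follows, on random data with a decaying variance spectrum. The 80% threshold comes from the text; the function name `truncate_pcs` and the use of a singular value decomposition are assumptions.

```python
import numpy as np

def truncate_pcs(anom, frac=0.80):
    """Keep the smallest number of leading PCs whose cumulated variance reaches frac."""
    x = anom - anom.mean(axis=0)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    cum = np.cumsum(s ** 2) / np.sum(s ** 2)
    n_keep = int(np.searchsorted(cum, frac)) + 1   # first index with cum >= frac
    return u[:, :n_keep] * s[:n_keep], n_keep, cum

# Synthetic anomalies: 500 "days", 40 "grid points", unequal variances.
rng = np.random.default_rng(5)
anom = rng.standard_normal((500, 40)) * np.linspace(5.0, 0.5, 40)
pcs, n_keep, cum = truncate_pcs(anom)
gram = pcs.T @ pcs   # diagonal up to rounding: the retained PCs are uncorrelated
```

Clustering is then run on the columns of `pcs` instead of the full grid, which reduces the cost of each distance computation.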
a. NCEP-2 reanalysis
The NCEP-2 reanalysis is first considered to depict recent climate variability using a state-of-the-art reanalysis that assimilated both in situ and satellite data over a coherent period. Figure 3a shows the Z700 anomaly patterns associated with the four clusters (numbered 1–4). Similar patterns can be found continuously between 850 and 200 hPa (not shown) and are thus, in good approximation, barotropic. The clusters have similar sizes, with around 3000 days for clusters 1, 3, and 4 and slightly less (around 2400) for cluster 2, and their occurrences are equally distributed within the annual cycle (not shown).
Cluster 3 is the only one that shows a clear and unambiguous “SAM like” pattern, corresponding to the negative phase of the SAM (Fig. 3a). Positive Z700 anomalies prevail over Antarctica, and near-annular negative anomalies are found over the circumpolar Southern Ocean in the midlatitudes. There, three main centers of action (i.e., regions showing the local extrema of Z700 anomalies) are located over the southern Indian basin, the southwestern Pacific Ocean near and east of New Zealand, and the southern Atlantic basin. This pattern is in fair accordance with most usual SAM descriptions, including the spatial pattern of PC1 (Fig. 1c), except for the southern Atlantic anomalies, which slightly differ. As a consequence of these strong similarities, cluster 3 significantly projects onto the negative phase of the SAM, as defined by the negative scores of the PC1 time series (Figs. 1c and 3b). It is worth noting that all days with a score of less than minus two standard deviations of PC1 are systematically affiliated with cluster 3 (Fig. 3b). This cluster is also the most representative of the anomaly fields of its constitutive days (Fig. 3d): the spatial correlation between Z700 daily anomaly patterns and the cluster 3 mean pattern is larger than 0.6 during 23% of the corresponding days (compared to only 13% for the PC1 negative phase, defined by a score of less than minus one standard deviation). Cluster 3 is also the most persistent (Fig. 3c), which is fully consistent with Carvalho et al. (2005), who noted that the negative SAM phase is more persistent on average than the positive one.
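The representativeness measure used above (the share of days whose anomaly map correlates with the cluster mean pattern at 0.6 or more) can be sketched as follows. The example assumes unweighted, centered pattern correlations on flattened maps; any area weighting would be an additional choice not covered here. The function name and synthetic maps are hypothetical.

```python
import numpy as np

def pattern_correlations(daily, composite):
    """Spatial correlation of each daily map (n_days, n_points) with one composite map."""
    a = daily - daily.mean(axis=1, keepdims=True)
    b = composite - composite.mean()
    return (a @ b) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b))

# Synthetic days of one cluster: a common pattern of varying amplitude plus noise.
rng = np.random.default_rng(4)
pattern = np.sin(np.linspace(0.0, 4.0 * np.pi, 200))
days = pattern[None, :] * rng.uniform(0.2, 2.0, size=(500, 1)) \
       + rng.standard_normal((500, 200))
composite = days.mean(axis=0)                 # cluster mean pattern
r = pattern_correlations(days, composite)
frac = float(np.mean(r >= 0.6))               # share of well-represented days
self_r = float(pattern_correlations(composite[None, :], composite)[0])
```

The same function applied to the days of a given PC1 phase and the EOF1 loading pattern would give the comparison value quoted for the EOF-based description.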
The positive SAM phase corresponds to clusters 1 and 4 (Figs. 3a and 3b). Both clusters show spatially coherent and highly significant negative Z700 anomalies over Antarctica, which is once again consistent with the spatial pattern associated with PC1 (Fig. 1c). Because of the high loading values in the southern high latitudes, these two clusters significantly project onto the positive SAM phase (Fig. 3b). Despite such an apparent association between these two clusters and PC1, the associated Z700 anomalies fail to show any annular pattern in the midlatitudes (Fig. 3a). In detail, both clusters display a clear wavenumber 4 pattern, which forms the typical signature of the synoptic-scale Rossby wave activity embedded in the midlatitude mean circulation. This is consistent with the typical duration of the cluster sequences, their median (mean) duration being 4 (5) days (Fig. 3c). The midlatitude centers of action in clusters 1 and 4 are approximately out of phase. Note, however, that regional positive Z700 anomalies tend to be larger in amplitude than negative anomalies. Hence, when combined into one single regime, these two clusters form the picture of an annular belt of continuous positive Z700 anomalies at the southern midlatitudes, in agreement with the traditional pattern of the positive SAM phase. These results nonetheless suggest that this picture is not the most recurrent at the synoptic time scale. Cluster 1, in particular, is more strongly correlated with daily Z700 anomaly fields than the EOF1 loading pattern (14.5% of the corresponding days show spatial correlations of at least 0.6 compared to 11% for the PC1 positive scores). As Cash et al. (2002) did, we conclude that the midlatitude variability related to the (positive) SAM phase is predominantly formed by a zonally homogeneous distribution of zonally localized events. 
Like Kushner and Lee (2007), we identified wavenumber 3 and wavenumber 4 transient structures; because of the out-of-phase differences between the two clusters, the direction of wave propagation cannot be established here.
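The durations of cluster sequences discussed above can be obtained from the daily label series by run-length counting. The sketch below assumes consecutive days without gaps; the function name `spell_lengths` and the toy label series are hypothetical.

```python
import numpy as np

def spell_lengths(labels, cluster):
    """Lengths (in days) of uninterrupted sequences spent in one cluster."""
    inside = np.concatenate(([0], (labels == cluster).astype(int), [0]))
    steps = np.diff(inside)
    starts = np.flatnonzero(steps == 1)    # first day of each sequence
    ends = np.flatnonzero(steps == -1)     # first day after each sequence
    return ends - starts

labels = np.array([1, 1, 1, 3, 3, 1, 2, 2, 2, 2, 3, 1, 1])
runs_1 = spell_lengths(labels, 1)          # sequences of 3, 1 and 2 days
runs_2 = spell_lengths(labels, 2)          # one sequence of 4 days
median_1 = float(np.median(runs_1))
mean_1 = float(np.mean(runs_1))
```

Median and mean durations per cluster then follow directly from these arrays.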
The remaining cluster (cluster 2) occurs less frequently than the others and is strongly reminiscent of the Pacific–South America (PSA) mode (Mo and Higgins 1998; Mo and Paegle 2001). PSA has been described as a Rossby wave train with three alternative pressure anomaly centers in the South Pacific, in opposition to the southeast Pacific and South America (Yuan and Li 2008). Indeed, the most prominent feature of cluster 2 is an opposition between positive (negative) geopotential height anomalies off the coast of West Antarctica and Marie Byrd Land (over the southwest Pacific basin east of New Zealand on one side and the Antarctic Peninsula, western Weddell Sea, and Drake Passage on the other side). Similar patterns hold for sea level pressure and the 200-hPa streamfunction (not shown), originally used by, for example, Mo and Higgins (1998) to characterize the PSA. Because of weak and contrasted anomalies over Antarctica, this cluster does not significantly project onto the PC1 SAM index (Fig. 3b). This result supports the view that the statistical decorrelation between the SAM and the PSA is not an artifact of the orthogonality constraint in EOF analyses, the PSA often being obtained as the second EOF, orthogonal to the first mode describing the SAM. Cluster 2 is also the least persistent (Fig. 3c) and the least representative of the days ascribed to its centroid (Fig. 3d). Only 2% of the corresponding days show large spatial correlations with the cluster 2 mean pattern.
Figure 4 shows the regimes’ robustness and quantifies their reproducibility. We compare their timing and basic properties to those associated with the 99 other partitions performed for k = 4 (see section 2). The cluster size appears first as very reproducible from one partition to another (Fig. 4a). Cluster 2 (PSA) systematically remains smaller in size than the three other clusters. Performing 100 k-means analyses also allows identification of the days, or sequences of days, that are more or less classifiable, that is, those that systematically fall into the same cluster for the 100 partitions (Fig. 4b). Roughly 95% of the days (94.99%) are perfectly classifiable; that is, they are ascribed to the same cluster in the 100 independent partitions. More than 98% of the days (98.20%) fall into the same cluster in at least 95% of the partitions. The four clusters shown in Fig. 3 are thus insensitive to the initial seeds, and we suggest that they represent the optimal weather regimes that one can expect in the NCEP-2 reanalysis over this region.
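The day-by-day classifiability count can be sketched as below. The example assumes that the cluster labels of the 100 partitions have already been matched to a common numbering (for instance by pairing centroids), a step not shown here. The label matrix is synthetic and the function name is hypothetical.

```python
import numpy as np

def agreement_per_day(label_matrix, k):
    """Fraction of partitions agreeing with the most frequent label, for each day.

    label_matrix : array (n_partitions, n_days) of labels in 0..k-1, already aligned
    """
    counts = np.stack([(label_matrix == j).sum(axis=0) for j in range(k)])
    return counts.max(axis=0) / label_matrix.shape[0]

# 100 partitions of 1000 days that agree everywhere except on two days.
rng = np.random.default_rng(8)
base = rng.integers(0, 4, 1000)
mat = np.tile(base, (100, 1))
mat[:3, 0] = (base[0] + 1) % 4     # day 0: 3 partitions disagree (97% agreement)
mat[:10, 1] = (base[1] + 1) % 4    # day 1: 10 partitions disagree (90% agreement)

agreement = agreement_per_day(mat, k=4)
perfect = float(np.mean(agreement == 1.0))    # same cluster in all partitions
robust = float(np.mean(agreement >= 0.95))    # same cluster in at least 95% of them
```

Plotting `agreement` against time would also show whether the less classifiable days are isolated or grouped into sequences.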
Figure 5 analyzes their temporal behavior, that is, their recent long-term trends and their interannual covariability with the SAM index (Fig. 1c). Over the 1979–2009 period, and at the interannual time scale (Fig. 5a), cluster 3 is very strongly and negatively correlated with the SAM index (r = −0.95), confirming that it represents the negative phase of the “canonical” SAM. Clusters 1 and 4 show weaker, but still significant, positive correlations, which is consistent with the fact that they represent two possible alternatives for the positive phase. None of these results is qualitatively modified when linear trends are removed from all of the time series (not shown).
At longer time scales, the annual mean SAM index shows a very weak and nonsignificant trend over recent years in NCEP-2 (Fig. 1i), a result already found in Pohl et al. (2010). This result conceals more contrasted evolutions at the monthly time scale, whose effects cancel out in annual means (not shown). Although clear shifts toward the positive (negative) phase prevail during austral summer and fall (winter and spring), January is the only month that reaches the 95% significance threshold. This austral summer signal is typical of the delayed impact of the spring trend in ozone depletion (e.g., Son et al. 2008, 2010; Polvani et al. 2011), while the austral winter signal, though nonnegligible (see the appendix), has been less of a focus.
Figure 5b shows how such trends are depicted by the number of monthly and yearly occurrences of the weather regimes. As far as yearly trends are considered, cluster 4 alone shows a significant trend, consistent with the shift toward the positive SAM phase. Seasonally, these evolutions are restricted to September–October and January. Cluster 3 (the negative phase of the SAM) shows results that are coherent with the PC1 series, with negative (positive) trends in January (June), which are once again consistent with the summer shift toward the positive SAM phase.
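As an illustration of how a trend in regime frequency can be estimated, the sketch below counts the yearly occurrences of one cluster and fits a least squares line. The significance test used for the figures is not specified here and is not reproduced; the labels are synthetic and the function names are hypothetical.

```python
import numpy as np

def yearly_occurrences(years, labels, cluster):
    """Number of days per calendar year assigned to one cluster."""
    uniq = np.unique(years)
    counts = np.array([np.sum((years == y) & (labels == cluster)) for y in uniq])
    return uniq, counts

def linear_trend(x, y):
    """Least squares slope and intercept, with x centered for numerical stability."""
    slope, intercept = np.polyfit(x - x.mean(), y, 1)
    return float(slope), float(intercept)

# Synthetic daily labels: cluster 0 becomes gradually more frequent.
rng = np.random.default_rng(7)
years = np.repeat(np.arange(1979, 2009), 365)
prob = np.interp(years, [1979, 2008], [0.20, 0.35])
labels = np.where(rng.random(years.size) < prob, 0, 1)

uniq, counts = yearly_occurrences(years, labels, cluster=0)
slope, intercept = linear_trend(uniq, counts)
days_per_decade = 10.0 * slope
```

Restricting `years` and `labels` to one calendar month before counting gives the monthly version of the same diagnostic.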
Over the last 30 yr, the weather regimes seem promising to capture significant parts of the Southern Hemisphere interannual variability (Figs. 1i and 5a). Though useful to depict the recent evolutions using a reanalysis product assimilating coherent satellite data over a region where in situ measurements are rare, linear trends computed on the regime frequencies hardly enable us to discuss decadal variability. To complete this picture and to investigate lower-frequency variability, section 3b adopts a similar approach, applied to the 20CR.
b. Twentieth-century reanalyses
This section is devoted to the analysis of the clusters (labeled A–D) obtained with the 20CR. A k-means analysis of the same field restricted to the 1979–2008 period, used in section 3a, revealed clusters that are very similar to those obtained with NCEP-2. None of the results discussed in section 3a is qualitatively modified with the 20CR (not shown; quantitatively, corresponding clusters show weaker Z700 variability in the Southern Hemisphere, in full agreement with Fig. 1h). For this reason and to take advantage of the full period of the 20CR, we chose to present only the regimes calculated over 1871–2008, which allows us to discuss their decadal and long-term variability. Because caution must be used when working with this dataset from the late nineteenth century onward, we recall that the temporal inconstancy in the 20CR reliability, widely discussed in Compo et al. (2011), is also addressed in the appendix for the Southern Hemisphere mid- and high latitudes.
Figure 6 shows Z700 anomalies associated with our four 20CR clusters (Fig. 6a), projections onto corresponding PC1 time series (Fig. 6b), and cluster persistence (Fig. 6c) and spatial representativeness (Fig. 6d). As for the mean field and the loading pattern of the first EOF (Figs. 1d–f), Z700 fields are more coherent spatially in the southern midlatitudes than in NCEP-2. Clusters B and C depict the positive and negative SAM phases, respectively, with very zonal and out-of-phase anomalies between Antarctica and the Southern Ocean. Clusters A and D are very uniform and depict positive (negative) Z700 anomalies, which are related to positive (negative) air temperature anomalies (see section 3c). They do not significantly project onto the SAM phases (Fig. 6b) and regroup approximately 15 000–16 000 days each compared to roughly 9000–10 000 for the opposite phases of the SAM (clusters B and C). Importantly, the mean anomaly patterns of clusters B and C are noticeably more representative of the daily anomalies of their constitutive days (Fig. 6d), which suggests that SAM-like patterns can indeed be observed at the daily time scale. This is likely not so clear for generalized warm and cold days, such as those depicted by clusters A and D. Their identification as recurrent regimes is likely due to the strong warming trend recorded over the region during the twentieth century (Qu et al. 2012; section 3c), which attracted a large fraction of the overall Z700 variance.
As such, these clusters, while representing statistically stable attractors in the 20CR (i.e., they are natural solutions toward which the classification algorithm converges; Fig. 7), should not be interpreted in terms of “weather” regimes per se, in the sense of organized and recurrent synoptic circulation anomalies. They nonetheless are representative of the low-frequency variability in geopotential heights present in the 20CR, related to the long-term warming trend observed in the Southern Hemisphere and noted elsewhere in the literature (e.g., Solomon et al. 2007). An analysis of their low-frequency variability (see below), moreover, provides an insight into the nature of the warming in the SH.
Over their common 1979–2008 period, both series of clusters show significant associations (Table 1), significant at the 99% level according to the χ2 statistic. Logically, clusters 3 and C (materializing the negative SAM phase) tend to occur in phase between NCEP-2 and the 20CR; note, however, that cluster 3 occurrences almost equally match those of cluster A. As far as the positive phase is concerned, cluster B occurs in phase with clusters 1 and 4. Results are less clear for cluster D, even if 55% of its occurrences match cluster 4. These five preferential associations (in bold in Table 1) regroup 68% of the 10 950 days during the period 1979–2008: the two reanalyses are thus in agreement roughly seven out of 10 days.
Although their median persistence is very similar to that of NCEP-2 clusters, cluster sequences in the 20CR reach far longer maximum durations (up to 118 days compared to 46 for NCEP-2). As for the clusters of section 3a, the timing and the size of these clusters are strongly reproducible over the whole period (Fig. 7), and none of the regimes shows clear seasonality (not shown). During the 1871–2008 period, 98.5% of the days systematically fall into the same regime within our set of 100 partitions and more than 99% of the days fall into the same regime 95 times out of 100. It can be noted that there is no decadal trend in the classifiability of the days (i.e., the first years of the period are as classifiable as the latest ones, showing that the clusters are as robust and as efficient at depicting Z700 variability over the whole period). Of course, the reliability of the Z700 fields themselves is questionable and inconstant (see the appendix).
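The association between two daily cluster series can be summarized with a contingency table and a Pearson χ2 statistic, as sketched below on synthetic labels. The 99% critical value for 9 degrees of freedom (about 21.67) is quoted for reference. Daily labels are serially correlated, so a test that treats days as independent is optimistic; how this was handled for Table 1 is not stated here. Function names are hypothetical.

```python
import numpy as np

def contingency(a, b, ka, kb):
    """Table of joint occurrences of labels a (0..ka-1) and b (0..kb-1)."""
    table = np.zeros((ka, kb), dtype=int)
    np.add.at(table, (a, b), 1)
    return table

def chi2_statistic(table):
    """Pearson chi-square statistic and its degrees of freedom."""
    n = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = float(((table - expected) ** 2 / expected).sum())
    dof = (table.shape[0] - 1) * (table.shape[1] - 1)
    return chi2, dof

# Two synthetic 4-cluster series that agree on about 70% of the days.
rng = np.random.default_rng(6)
n = 3000
a = rng.integers(0, 4, n)
b = np.where(rng.random(n) < 0.6, a, rng.integers(0, 4, n))

table = contingency(a, b, 4, 4)
chi2, dof = chi2_statistic(table)
agreement = float(np.trace(table)) / n     # share of days on the diagonal
CHI2_99_DOF9 = 21.666                      # 99% critical value, 9 degrees of freedom
significant = chi2 > CHI2_99_DOF9
```

With real clusters, the preferential associations need not lie on the diagonal, so the agreement would be summed over the selected cells of the table.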
Figure 8 presents the interannual variability (Fig. 8a) and long-term changes (Fig. 8b) in the 20CR cluster frequencies. Table 2 shows the interannual correlations between the cluster occurrences and PC1: both raw values and detrended time series are considered. Logically, clusters B and C show the greatest correlations with PC1 (Table 2), even after removal of all linear trends from the time series. The statistically significant correlation between occurrences of cluster D and PC1 is only due to the presence of marked trends in both series (Fig. 8).
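The distinction between raw and detrended correlations can be illustrated with a short sketch. It assumes annual series sampled at regular intervals and removes a least-squares linear trend from each before correlating; the function names and the toy series are hypothetical. In the toy case, two series share a trend but have opposite year-to-year fluctuations, so the raw correlation is strongly positive while the detrended one is negative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def detrend(y):
    """Remove the least-squares linear trend from a regularly sampled series."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    slope = (sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return [v - (y_mean + slope * (t - t_mean)) for t, v in enumerate(y)]


# Toy series (hypothetical): common trend, opposite interannual fluctuations.
noise = [1, -1, 1, -1, 1, -1]
cluster_freq = [2 * t + e for t, e in enumerate(noise)]
pc1 = [2 * t - e for t, e in enumerate(noise)]

raw_r = pearson(cluster_freq, pc1)
detrended_r = pearson(detrend(cluster_freq), detrend(pc1))
print(round(raw_r, 3), round(detrended_r, 3))
```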
Only clusters B and D show significant trends over the period, consisting of a shift toward the positive SAM phase (for cluster B) and less frequent occurrences of uniformly low Z700 fields at the near-hemispheric scale (for cluster D). Qualitatively similar trends can be found for shorter periods, for example, 1920–2008 (during which the PC1 time series shows its most significant trend; Fig. 1i) and 1957–2008 (during which the amount of data assimilated in the 20CR is almost stationary over Antarctica and the Southern Ocean; see section 2 and the appendix). Quantitatively, the trends gradually strengthen when shorter periods are considered, denoting that the simulated long-term variability over these regions became more pronounced during recent decades (see the appendix). For instance, according to the 20CR, the trend toward the positive SAM phase has been significant since the 1920s and is remarkably strong between the mid-1960s and the late 1990s (Fig. 1i), an evolution that cluster B succeeds at describing (Fig. 8a).
Once again, and in agreement with NCEP-2, these results suggest that observed trends in the SAM do not involve a shift of the global distribution. They more precisely concern the positive phase (which tends to be more frequent), while the negative phase does not show any significant decadal variability. The same conclusion holds for “uniformly cold” and “uniformly warm” days, with cold days becoming less frequent, while warm days remain roughly constant (except in December and January; Fig. 8b). Unlike the usual EOF definition of the SAM, weather regimes make it possible to highlight the nonlinear, asymmetrical evolution of climate variability in the Southern Hemisphere.
c. Capability of the regimes in discriminating Z700 variability
The aim of this section is to illustrate the usefulness of the weather regimes for depicting the day-to-day variability of the Z700 field (and to compare them to the commonly used EOF definition of the SAM), as well as their capability to monitor the long-term trends in air temperature.
Figure 9 compares, for each reanalysis, the capabilities of the weather regimes and the EOF-based SAM index to represent the variability of the original datasets. The quantification of the Z700 variance explained by the partitioning into weather regimes is obtained by replacing, over the corresponding period, the daily anomaly fields by that associated with the corresponding cluster, and then computing the ratio between the variances of the reconstructed and the real anomaly fields. The PC1 time series clearly appears as the most efficient descriptor of the Z700 variability over Antarctica (Figs. 9a and 9c), but its capability in monitoring the regional climate abruptly decreases over the Southern Ocean, north of 65°S. Even if the daily Z700 variance is weaker over Antarctica than in the midlatitudes (Figs. 1b and 1e), it is spatially more coherent and thus attracts the first eigenvector of an EOF analysis (Fig. 1c). Because of their discretization into a limited number of recurrent anomaly patterns, weather regimes are in contrast less capable of monitoring climate variability at the high latitudes. However, even four clusters are much more adequate to describe the Z700 variability over the Southern Ocean, where it is largest but spatially less coherent (Figs. 3 and 6). Thus, the regimes are useful to take into account the contribution of the midlatitude transients. This is particularly true for the NCEP-2 regimes, midlatitude perturbations being strongly smoothed in the 20CR because of the averaging of the 56 members. As a consequence, NCEP-2 regimes display a stronger synoptic signature than their equivalents computed in the 20CR, and they also explain a larger part of Z700 variability along the midlatitude storm track.
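The variance ratio described above can be written out as a minimal sketch. It assumes daily anomaly fields stored as lists of grid-point values and one cluster label per day; the function names and toy numbers are hypothetical. Each day is replaced by the composite (mean) field of its cluster, and the ratio of reconstructed to original variance is computed at every grid point.

```python
def cluster_composites(fields, labels):
    """Mean anomaly field of each cluster; fields[day][gridpoint]."""
    sums, counts = {}, {}
    for field, lab in zip(fields, labels):
        acc = sums.setdefault(lab, [0.0] * len(field))
        for k, value in enumerate(field):
            acc[k] += value
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def explained_variance(fields, labels):
    """Ratio var(reconstructed) / var(original) at each grid point."""
    comps = cluster_composites(fields, labels)
    recon = [comps[lab] for lab in labels]  # each day takes its cluster mean
    ratios = []
    for k in range(len(fields[0])):
        var_real = variance([f[k] for f in fields])
        var_recon = variance([r[k] for r in recon])
        ratios.append(var_recon / var_real if var_real > 0 else float("nan"))
    return ratios


# Toy example (hypothetical): four days, two grid points, two clusters.
fields = [[1, 0], [3, 2], [-1, -2], [-3, 0]]
labels = [0, 0, 1, 1]
ratios = explained_variance(fields, labels)
print(ratios)
```

Because the reconstruction uses cluster means, the ratio lies between 0 and 1 and measures the between-cluster share of the variance at each grid point.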
Figure 10 explores to what extent the 20CR regimes are capable of depicting the long-term trends in air temperature at 700 hPa (T700) over the 1871–2008 period. To that end, we first calculated the composite T700 anomalies associated with each of the four clusters (Fig. 10a), and then we compared the trends, such as they appear directly in the 20CR and in reconstructed fields, for which each day was assigned the composite mean value of its corresponding cluster (Figs. 10b and 10c). Qualitatively similar results are found for the periods 1920–present and 1957–present (not shown).
Figure 10a shows that the positive (negative) Z700 anomalies associated with cluster A (D) are mostly attributable to positive (negative) T700 anomalies. The positive (negative) SAM phase depicted by cluster B (C) favors cold (warm) conditions over Antarctica and vice versa in the midlatitudes, consistent with Gillett et al. (2006). Given the positive (negative) trend noted for cluster B (D) in Fig. 8, one can expect that the decadal changes in the occurrences of these clusters will correspond to a generalized warming of the Southern Hemisphere. This is indeed the pattern that is simulated by the 20CR (Fig. 10b), in agreement with, for example, Gille (2002, 2008), Steig et al. (2009), and Schneider et al. (2011), among many others. This warming trend was shown to be mostly attributable to human influence (Gillett et al. 2008; Monaghan and Bromwich 2008).
Long-term changes in the frequency of the regimes succeed at regionalizing these evolutions (Fig. 10b, right), although the amplitude of the warming remains strongly underestimated (Fig. 10c). Except for localized areas of South America, where warming trends are almost null (Fig. 10b) and where the discretization into weather regimes leads to sign errors (Fig. 10c), the magnitude of the warming depicted by the reconstructed fields lies between 30% and 60% of that simulated by the 20CR. The regimes perform well in the midlatitudes (especially over Australia and the southwest Pacific basin) but are less satisfactory near and over Antarctica (15%). Thus, depending on the region, 40%–85% of the local climate changes can be explained by the intrinsic variability within each regime. That is to say, about half of the decadal variability can be explained by modifications in the weather regime occurrences, with the remaining part indicating that these regimes do not have constant intrinsic properties (such as temperature anomaly fields) over the period. Together with Fig. 9, these results show that a limited number of recurrent atmospheric configurations can capture a sizeable fraction of climate variability, from the daily to the decadal time scale. Besides quantifying the skill of our weather regimes, they also illustrate the limitations and quantify the uncertainties related to downscaling climate variability or climate change scenarios (e.g., Kageyama et al. 1999; Boé and Terray 2008).
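The partition of a trend into a part explained by changing regime frequencies and a residual part can be sketched at a single grid point. The sketch assumes a regularly sampled temperature series and one regime label per time step; the function names and toy values are hypothetical. The reconstructed series assigns each time step the composite mean of its regime, so its trend reflects frequency changes only.

```python
def slope(y):
    """Least-squares linear trend (per time step) of a regularly sampled series."""
    n = len(y)
    t_mean = (n - 1) / 2
    num = sum((t - t_mean) * v for t, v in enumerate(y))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def reconstruct(series, labels):
    """Replace each value by the composite mean of its regime."""
    totals, counts = {}, {}
    for value, lab in zip(series, labels):
        totals[lab] = totals.get(lab, 0.0) + value
        counts[lab] = counts.get(lab, 0) + 1
    comp = {lab: totals[lab] / counts[lab] for lab in totals}
    return [comp[lab] for lab in labels]


def trend_share(series, labels):
    """Fraction of the original trend retained by the reconstructed series."""
    return slope(reconstruct(series, labels)) / slope(series)


# Toy series (hypothetical): the warm regime (1) becomes more frequent,
# and a slow drift of 0.2 per step is superimposed within both regimes.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
series = [(-1.0 if lab == 0 else 1.0) + 0.2 * t for t, lab in enumerate(labels)]
share = trend_share(series, labels)
print(round(share, 5))
```

In this toy case the reconstruction retains about 84% of the trend; the remainder comes from the drift within each regime, which is the analogue of the changing intrinsic properties discussed above.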
4. Summary and discussion
This work attempts to investigate the potential asymmetries in the SAM-related variability and trends in the southern mid- and high latitudes. To that end, weather regimes were calculated using a k-means partitioning algorithm applied to daily maps of Z700 anomalies. Two complementary reanalysis products were used, namely, the NCEP-2 reanalysis over the period 1979–2009 and the 20CR over the period 1871–2008. Results may be summarized as follows.
Spatially, our reference regimes are given by the NCEP-2 reanalysis over 1979–2009, during which the assimilation of large amounts of satellite data improves the reliability of the reanalysis. Corresponding regimes are associated with the opposite phases of the SAM and with the PSA mode, illustrating the capability of the regimes to depict physically coherent modes of atmospheric variability, without the linearity assumptions or orthogonality constraints of EOF decompositions. Interestingly, all regimes (except the negative SAM phase) show clear wavenumber 4 features in the southern midlatitudes, which is the typical signature of synoptic-scale variability. The positive SAM phase is described by two complementary regimes, which show negative Z700 anomalies over Antarctica and out-of-phase wave trains over the Southern Ocean. This result suggests that the commonly used definitions of the SAM, based on the first component of an EOF analysis applied to Z700 anomalies, give the unrealistic picture of an annular variability pattern in the midlatitudes. Following Cash et al. (2002), we conclude here that the SAM (at least in its positive phase) actually consists of spatially coherent anomalies in the high latitudes on the one hand and a zonally homogeneous distribution of zonally localized events over the Southern Ocean on the other hand. Such transient events are associated with Z700 anomalies that can be 50%–100% larger in amplitude than those predominantly associated with the SAM over Antarctica but are much less coherent spatially. In this regard, the 20CR does not perform as well as the NCEP-2 reanalysis: averaging its 56 members smooths the irreproducible component of climate variability and strongly reduces the midlatitude troughs.
Temporally, the regime sequences typically persist about 5–10 days (with extreme values reaching 45 days in NCEP-2 and nearly 120 days in the 20CR), illustrating again their strong synoptic component. In agreement with Carvalho et al. (2005), the negative phase of the SAM appears more persistent than the positive phase and, even more clearly, than the PSA mode. This also confirms that a large fraction of SAM variance is at the subseasonal scale (Pohl et al. 2010). At the interannual time scale, the regimes describing SAM variability are highly correlated with the EOF-based SAM index (e.g., r ≈ 0.8 in the 20CR over the overall 1871–2008 period). Long-term trends remain weak in NCEP-2 over 1979–2009; logically, they appear much more clearly in the 20CR and are extremely robust and significant, especially since the 1920s. They are most pronounced between the early 1960s and the late 1990s. Analysis of monthly trends revealed (i) that the austral summer trend toward the positive phase of the SAM is mostly related to an increase in the number of occurrences of the positive phase, with very little change for the clusters associated with its negative phase; and (ii) that the warming trend in the southern mid- and high latitudes is mainly associated with a decrease in the number of “cold” days, without a clear increase in the number of “warm” days. Thus, the recent low-frequency variability does not seem to consist of a shift of the whole distribution but of asymmetric changes that could enhance its skewness. In this regard, the weather regimes seem appropriate to depict such nonlinearities in the low-frequency variability over the region. This can be compared to studies carried out on the Northern Hemisphere atmospheric variability by, for example, Monahan et al. (2000, 2001), in which nonlinear analyses of the atmospheric circulation variability (through nonlinear principal component analysis and mixture models) provided insights relating occupation statistics of discrete regimes to the decadal North Atlantic Oscillation (NAO) variability.
A comparison of the capabilities of the weather regimes and the EOF-based SAM indexes to represent the variability of the Z700 field in the Southern Hemisphere showed that the regimes are more adequate to describe climate variability over the midlatitudes (but explain slightly less variance than the EOF-based indexes over Antarctica). This is due to the better description of midlatitude transient wave patterns, especially in NCEP-2, where they are more realistic. Similar analyses applied to the low-frequency trends in the 20CR established that about half of the twentieth-century warming trend can be explained by modifications in the regime frequencies, with the remaining half concerning changes in the weather regime intrinsic properties. Although it cannot be generalized to regions other than the Southern Hemisphere, or to periods other than the last 140 yr, this result documents to what extent recurrent regimes can explain multidecadal variability over a period of roughly a century (e.g., in climate change scenarios).
One of our main results suggests that a large component of the trend could be related to an increase in the frequency of weather regimes associated with simultaneous negative pressure anomalies over the Antarctic continent and positive pressure anomalies over parts of the Southern Ocean midlatitudes (i.e., the positive SAM phase). If confirmed, this could lead to potential asymmetric changes in the probability distribution function of surface winds, with important implications in terms of the impact on the ocean physics (as the response of the ocean is nonlinear to the wind speed). Deviations from zonal symmetry in the SAM pattern have indeed been shown to be important in terms of the oceanic response in the Southern Ocean (Sallée et al. 2010). Meridional wind anomalies associated with these deviations, especially marked in the central Pacific Ocean and eastern Indian Ocean, cause anomalies in heat fluxes that, in turn, are responsible for considerable contrasts in the patterns of the mixed layer depth anomalies during the positive phase of the SAM. These changes impact air–sea exchange, ocean sequestration of heat and carbon, and biological productivity. Zonally asymmetric patterns embedded in the SAM variability, notably synoptic-scale perturbations in the midlatitudes, as well as the detail of their long-term changes in frequency, need to be better characterized and understood if we are to better constrain the impacts of the SAM in the upper Southern Ocean physics and biogeochemistry, and their nonlinearities.
Nicolas Fauchereau was funded by the SRP Project TA 2010 035 (EEOC005), CSIR YREF Project EEOC003, and CSIR-UCT Cooperation Project EEOC009. 20CR data were downloaded from http://dss.ucar.edu/datasets/ds131.1/. NCEP-2 reanalysis data were downloaded from http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanalysis2.html. Calculations were performed using HPC resources from DSI-CCUB (Université de Bourgogne). Constructive comments and suggestions from three anonymous reviewers greatly helped improve the manuscript. The authors would also like to thank Gil Compo, Julie Jones, Gareth Marshall, and James Renwick for their helpful discussions and comments.
Intercomparison of Usual SAM Descriptors
Despite its qualitative homogeneity in terms of data assimilation throughout its period, the 20CR dataset nonetheless assimilated an increasing number of station surface data over the globe, and more particularly in the southern mid- and high latitudes. This issue questions the reliability of the simulated trends in the Southern Hemisphere and especially that of the long-term evolution of the SAM.
To document to what extent the 20CR solution converges with observed or proxy SAM variability, we propose in this appendix to compare the SAM time series, variability, and trends as they appear in (i) the SAM index defined in this study (i.e., based on an EOF analysis of the Z700 field simulated by the 20CR); (ii) the index developed by Marshall (2003), based on various stations located near 40° and 65°S over the period 1957–2009; and (iii) the index reconstructed by Visbeck (2009), based on statistical projections of long time series onto recent SAM variability patterns, over the period 1887–2005. To document the sensitivity to the methodology used to define the SAM, we also defined (iv) a SAM index based on an EOF analysis of the optimally interpolated HadSLP2 SLP fields (Allan and Ansell 2006) over the period 1871–2008 and (v) an EOF analysis of the SLP field derived from the 20CR over the same period (i.e., the field in the 20CR that is most constrained by data assimilation).
Figure A1 shows these five alternative SAM descriptors over their longest periods averaged annually. All indexes succeed at describing the recent trend toward the positive SAM phase since the 1960s. Before 1957 and the first in situ SLP records over Antarctica, marked differences are found between the Visbeck, HadSLP2, and 20CR series. This is, for instance, the case between 1880 and 1895, with notable discrepancies between 20CR and HadSLP2, and between 1920 and 1940, during which these two datasets vary in phase but radically differ from the Visbeck time series. In the absence of continuous and reliable SLP records, it is not possible to draw conclusions about their respective reliability, and each solution is probably nearly as realistic as the others.
Figure A2 shows how the same SAM descriptors depict its long-term trends and are correlated with each other over 30-yr-long moving windows and over the annual cycle (i.e., monthly for all indexes except Visbeck, defined at the 3-month scale). Interannual correlations first confirm the very strong covariability of all indexes since the 1950s. Prior to the International Geophysical Year, it is confirmed that each index describes a different SAM interannual variability, especially during the late nineteenth century.
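A moving-window correlation of this kind can be sketched as follows. This is an illustrative outline only: the function names and toy series are hypothetical, and a window of 3 steps stands in for the 30-yr windows used in Fig. A2.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def moving_correlation(x, y, window):
    """Correlation of x and y over every run of `window` consecutive steps."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    return [pearson(x[i:i + window], y[i:i + window])
            for i in range(len(x) - window + 1)]


# Toy indexes (hypothetical): out of phase at first, in phase later.
index_a = [0, 1, 2, 3, 4, 5]
index_b = [2, 1, 0, 3, 4, 5]
corr = moving_correlation(index_a, index_b, window=3)
print([round(r, 3) for r in corr])
```

The toy output moves from anticorrelation in the first window to perfect correlation in the last, mimicking indexes that only converge in the well-observed period.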
Lower frequencies also radically differ from one index to another. HadSLP2 data describe negative trends throughout the year from the start of the period to the 1920s, followed by positive trends that reach their maxima in the 1940s and 1950s for the austral winter and between 1920 and 1940 and then since the 1960s for the austral summer. It is the latter signal that has been attributed mainly to ozone depletion in the literature (Thompson and Solomon 2002; Shindell and Schmidt 2004; Arblaster and Meehl 2006; Polvani et al. 2011). 20CR simulates weaker negative trends during the first half of the period, which are mainly concentrated during the austral winter season (April–November). Over the more recent period, simulated trends are stronger between 1920 and 1940 on the one hand, and since the 1960s on the other hand; winter trends in the 1940s and 1950s are, however, much weaker than their counterparts in HadSLP2. When the whole period is considered, 20CR simulates trends that are approximately 1.5 times larger than those found in HadSLP2 (not shown).
In contrast, the Visbeck index tends to show linear trends that are almost opposite of those previously discussed, with positive (negative) trends prevailing before the 1920s (between the 1920s and the 1960s). Like other SAM descriptors, the recent shift toward the positive phase is accurately monitored and found to be maximal in austral summer, which is not so clear in HadSLP2 and 20CR. Available only since 1957, the Marshall observational index depicts more contrasted trends: very strong positive trends (from November to January and in April–May) alternating with strong negative ones (in June and to a lesser extent in September and October) over the seasonal cycle. Over an even shorter period, NCEP-2 also simulates contrasted trends, but their seasonality also slightly differs from those discussed here (see section 3a).
This lack of consensus in terms of SAM variability and low-frequency evolution illustrates the amplitude of associated uncertainties and our partial knowledge concerning the state of the Southern Hemisphere circulation prior to the late 1950s. Although it does not perform worse than other datasets, the 20CR must therefore be used with caution during the first decades. Accordingly, in this study, we verified that all conclusions presented and discussed remain qualitatively valid during the period 1957–2008 (not shown).
Current affiliation: National Institute for Water and Atmospheric Research, Auckland, New Zealand.