## Abstract

A new variant is proposed for calculating functions empirically and orthogonally from a given space–time dataset. The method is rooted in multiple linear regression and yields solutions that are orthogonal in one direction, either space or time. In normal setup, one searches for that point in space, the base point (predictor), which, by linear regression, explains the most of the variance at all other points (predictands) combined. The first spatial pattern is the regression coefficient between the base point and all other points, and the first time series is taken to be the time series of the raw data at the base point. The original dataset is next reduced; that is, what has been accounted for by the first mode is subtracted out. The procedure is repeated exactly as before for the second, third, etc., modes. These new functions are named empirical orthogonal teleconnections (EOTs). This is to emphasize the similarity of EOT to both teleconnections and (biorthogonal) empirical orthogonal functions (EOFs). One has to choose the orthogonal direction for EOT. In the above description of the normal space–time setup, picking successive base points in space, the time series are orthogonal. One can reverse the role of time and space—in this case one picks base points in time, and the spatial maps will be orthogonal. If the dataset contains biorthogonal modes, the EOTs are the same for both setups and are equal to the EOFs. When applied to four commonly used datasets, the procedure was found to work well in terms of explained variance (EV) and in terms of extracting familiar patterns. In all examples the EV for EOTs was only slightly less than the optimum obtained by EOF. A numerical recipe was given to calculate EOF, starting from EOT as an initial guess. When subjected to cross validation the EOTs seem to fare well in terms of explained variance on independent data (as good as EOF). The EOT procedure can be implemented very easily and has, for some (but not all) applications, advantages over EOFs. These novelties, advantages, and applications include the following. 1) One can pick certain modes (or base point) first—the order of the EOTs is free, and there is a near-infinite set of EOTs. 2) EOTs are linked to specific points in space or moments in time. 3) When linked to flow at specific moments in time, the EOT modes have undeniable physical reality. 4) When linked to flow at specific moments in time, EOTs appear to be building blocks for empirical forecast methods because one can naturally access the time derivative. 5) When linked to specific points in space, one has a rational basis to define strategically chosen points such that an analysis of the whole domain would benefit maximally from observations at these locations.

## 1. Introduction

Empirical orthogonal functions (EOFs) have been in widespread use in meteorology and climatology for a few decades (Lorenz 1956; Gilman 1957) and their use still seems to be on the increase. We here present a particularly simple way to calculate functions, empirically and orthogonally, without these functions being traditional EOFs. In words, this method is as follows. Suppose we have a space–time dataset *T*(*s, t*). One can search for that point in space (sb; a “base” point) that explains the maximum of the variance of all points combined. What is explained by *T*(sb, *t*) is removed from *T*(*s, t*) by standard regression, and one can then search the reduced data for the next most important point in space, and so on. The spatial patterns are regression coefficients from sb to the other points *s,* and the time series are the *T*(sb, *t*) at the base point. We will refer to the resulting functions as empirical orthogonal teleconnections (EOTs). EOTs are orthogonal in one direction; one has a choice of space or time as the orthogonal direction. From the examples to be given below, EOTs do look very much like EOFs for some commonly studied datasets and even more like regionalized “rotated” EOFs (Horel 1981; Richman 1986; Barnston and Livezey 1987, hereinafter BL). The relationship between EOF and EOT will be laid out formally. In many respects EOTs are like one-point correlation or “teleconnectivity” maps (Wallace and Gutzler 1981, hereinafter WG), but with the added property of orthogonality and functional representation. Although EOT can hardly be a “new” method, the authors have not found any references to it in our field.

Examples will be detailed in section 3, but, to start, Fig. 1 shows the interannual standard deviation of extratropical Northern Hemisphere (NH) seasonal mean 700-mb height for winter (gpm; Fig. 1b) and contours of the fraction of variance of the whole field that can be explained by each grid point individually (%; Fig. 1a). Each grid point obviously explains itself in full (≪1% domain variance) plus a bit of the nearby area, but some grid points (or observations) tell a lot more about the rest of the hemisphere than others. In the Atlantic and Pacific basins we have several areas where single points can explain no less than 15%–20% of the variance of the entire domain. Connoisseurs will immediately see the Pacific–North American pattern (PNA) and the North Atlantic Oscillation (NAO) emerging in Fig. 1a. The first base point sb to be picked would be near Greenland and the first EOT mode (tethered to 40°W, 70°N) explains 20.5% of the variance. The areas of high standard deviation and high explained domain variance have some correspondence, but the relationship is far from being 1 to 1, notable exceptions being for instance Florida (low std dev, but very high domain EV) and the Atlantic along the line Newfoundland–Scandinavia (high std dev, but low domain EV).

## 2. The method

Suppose we have a discrete space–time (*s, t*) dataset *T*(*s, t*), 1 ⩽ *t* ⩽ nt and 1 ⩽ *s* ⩽ ns, where *T* denotes the variable (like height, pressure, streamfunction, etc.). EOT is a stepwise linear regression, where the predictands are the *T*(*s, t*), and the predictors are *T*(*s, t*) as well. One can search all *s* for that point in space (sb; a base point) that explains the most of the variance at all other points (including itself) combined. What is explained by *T*(sb, *t*) is removed from *T*(*s, t*) by standard regression, and one can then search the reduced data for the next most important (in terms of variance explaining) point in space. And so on. Eventually one obtains *T*(*s, t*) = Σ *α*_{m}(*t*)*e*_{m}(*s*), where the *α*’s are time series and the *e*’s are spatial patterns, and the summation is over mode *m* = 1, ..., ns. The description so far is for a “normal” space–time setup, detailed immediately below (section 2a), but an analogous calculation with space–time coordinates reversed follows in section 2b.

### a. Normal setup

In the normal setup the idea is to find, by brute force, that point (sb) in space that explains as much as possible of the variance of all other points. The first EOT is connected in full to this point sb, denoted as sb_{1}, in the sense that the space pattern of the first mode [*e*_{1}(*s*)] is the regression coefficient between *T* at sb_{1} and *T* at all other points *s,* and the time series [*α*_{1}(*t*)] of the first mode is the time series of the original data at point sb_{1}, that is, *T*(sb_{1}, *t*). [The field *e*_{1}(*s*) is a very close cousin to a so-called teleconnectivity map for base boint sb_{1}; see WG.] At this point we split the original data into an “explained” and a “reduced” part:

where

std dev(*s*) stands for the temporal standard deviation of *T*(*s, t*) at point *s* defined as

and the temporal correlation is defined by

Note that in the last two expressions the time mean has not been removed. Nevertheless, the usual −1 ⩽ cor(*s*1, *s*2) ⩽ +1 does hold. The *e*_{1}(*s*) is a regression coefficient as defined in “regression through the origin” (Brownslee 1965, p. 358). The time–space or domain variance to be explained is given by

The expression is valid only for datasets that are “equidistant” in time and space. If this is not the case, on a latitude–longitude grid, for instance, one has to use weights [such as cos(latitude)] multiplying *T*(*s, t*)^{2}.

At this point we loop around for the second mode, which is identically the same procedure except we start with the once-reduced *T*(*s, t*). (The successive data reduction is the equivalent of using partial correlation; see section 4.) With all modes calculated we have:

where *α*_{m}(*t*) = *T*^{reduced}(sb_{m}, *t*), sb_{m} is the *m*th base point, *T*^{reduced} is *T* after *m* − 1 reductions, and *e*_{m}(*s*) = cor(*s,* sb_{m})[std dev(*s*)/std dev(sb_{m})], with the understanding that cor and std dev are calculated from the (*m* − 1) times–reduced data. There is a 1–1 linkage between the ordered base points sb_{m} and the modes *m* = 1 to *M.* The maximum value for *m* is ns, at which point *T*^{expl}(*s, t*) = *T*(*s, t*) and *T*^{reduced}(*s, t*) = 0. When searching for sb_{m} the number of reductions applied to *T*(*s, t*) is *m* − 1. Last, the variance explained by *m* modes as a percentage of STVAR is given by

In the jargon of EOF literature, the EOTs are the result of a “covariance-based” calculation. The modes explain successive nonoverlapping portions of the variance, which implies orthogonality in the domain in which the correlation and standard deviation are calculated, that is, here: the time domain, that is, Σ_{t}*α*_{m}(*t*)*α*_{n}(*t*) = 0 for *n* ≠ *m.*

The order in which the successive points sb are selected is arbitrary. Selecting points in order of explaining the most (remaining) variance makes sense as a matter of efficiency. Otherwise the order is free for EOT. If one choses a certain point first (“forces it in”) all subsequent modes and time series will change, but in the end (1) is satisfied just as precisely. If one insists on retaining all local information at one or more specific locales, one needs to force those particular points in before truncation. Indeed, with orthogonality in only one direction (here: time), the methodological constraints are very weak, and there is a virtual infinity of possibilities to satisfy (1).

Note that *e*_{m}(*s*) is a nondimensional quantity (a regression coefficient, of order 1), while *α*_{m}(*t*) carries the units of the input dataset.

### b. Space–time reversal

Although there is no more to space–time reversal than interchanging the role of the *s* and *t* coordinate, reversing space and time does require some attention. To define what we mean, first look at some of the previously given equations with time and space “reversed”:

where

std dev(*t*) now stands for the spatial standard deviation of *T* at time *t,* that is,

and

Spatial means are not subtracted in the two above expressions. Last, one obtains

Instead of a base point in space (sb), we now have a base point in time (tb), the *e*s are a function of time, and the orthogonal *α*s are a function of space. The first time–space-reversed EOT spatial pattern is thus the field *T*(*s,* tb) that explains the most of the fields observed at other times. As far as notation is concerned, note that unless stated otherwise *α*_{m}(*s*) or *e*_{m}(*t*) will automatically imply time–space-reversed calculations, while *α*_{m}(*t*) and *e*_{m}(*s*) are used for the normal setup. The *α*_{m} are orthogonal in either setup. Just as in the normal setup the *e*_{m}(*t*) is a nondimensional quantity (regression coefficient), while *α*_{m}(*s*) carries the units of the input data. As was the case with (1), there is a virtual infinity of ways to satisfy (1a). One can chose any realization *T*(*s,* tb) first, and proceed from there.

### c. Standardization

The reversal brings the issue of standardization into some focus. Even if the input data is standardized in time, the spatial mean will not be zero at all times (especially on a small domain), nor will the spatial standard deviation be unity at all times. Given the EOT method as defined, there is no need to standardize data, neither in time or space. One may still want to do this, but a procedure based on the above will succeed regardless, and adhere to (1) or (1a). The correlation, variance, and covariance in either setup are all calculated without removal of sample means, and STVAR is identically the same in the normal setup and space–time-reversed calculation. If there is a nonzero mean component in the data (along either coordinate), it will be expressed in one or more of the EOT. Some further comments will be made in section 4.

### d. Formal connection to EOF

For regular EOF there is a unique expression of the form

where *M* is the smaller of (ns, nt), and *α*_{m}(*t*) and *e*_{m}(*s*), both orthogonal, are such that

is minimized for all *M*′ < *M.* Furthermore, the following identities hold:

The biorthogonality is much more constraining, and as a result there is only one set of EOFs, not the near-infinite freedom associated with EOTs (1) or (1a). Still, the one and only difference with respect to EOF we make in the EOT method (normal setup) is that *α*_{m}(*t*) is required to be the time series of *T* at one specific observational site (or gridpoint) sb, *T*(sb, *t*), rather than a more general linear combination over *s*: Σ *b*(*s*)*T*(*s, t*), where *b*(*s*) are weights for each grid point. Is this a big departure from EOFs? It depends on the nature of the dataset, of course. From Barnston and van den Dool (1993a) we know that the time series associated with regular EOFs (in monthly 700-mb height) correlate with *T*(*s, t*) in excess of 0.8 at certain locations; that is, *b*(*s*) is large at what amounts to a base point. Hence EOT is already somewhat close to EOF for these datasets, and the degree of similarity increases further after rotation (BL), when local correlations increase to >0.9.

In *t–s* reverse setup, the one and only difference with respect to EOF we make in the EOT method is that *α*_{m}(*s*) is required to be the spatial pattern of *T* at one specific time tb, rather than a more general linear combination over *t*: Σ *b*(*t*)*T*(*s, t*). For traditional EOFs the *t–s* reversal is only a technical issue (of computer memory)—the answers are the same either way (although standardization issues need to be addressed).

If the *e*_{m}(*s*) were known, *α*_{m}(*t*) could be calculated from (2a), and if *α*_{m}(*t*) were known the *e*_{m}(*s*) could be found from (2b). For EOT in normal setup only (2b) is true, while only (2a) holds for space time reversal [but note that in the notation of section 2b, Eq. (2a) would read: *e*_{m}(*t*) = Σ *α*_{m}(*s*)*T*(*s, t*)/Σ *α*_{m}(*s*)*α*_{m}(*s*)]. Nevertheless, (2a) and (2b) will be referred to in section 3 and the appendix in the EOT context—because it can be shown that given the EOT solutions one can iterate very rapidly toward a traditional EOF solution that satisfies both (2a) and (2b).

## 3. Examples

One can very easily write a code (just a few dozen lines, no library routines needed) to calculate EOT. We tested it out on four commonly used datasets.

### a. Data

January mean temperature in the United States, at 102 evenly distributed climate divisions (CDs) for 1932–95. The original data source is the National Climatic Data Center (NCDC), but the mapping onto 102 CDs is described in He et al. (1998). The time mean is made zero pointwise in space, and standardized units are used for this dataset.

May soil moisture in the United States, at 102 evenly distributed climate divisions for 1932–95. The time mean over 1961–90 is subtracted, and the units are millimeters. See Huang et al. (1996) as a source and references thereof as a source for dataset 1 as well.

December–February (DJF) seasonal mean 700-mb heights, for 20°–90°N, 1948/49–1997/98 based on the National Centers for Environmental Prediction–National Center for Atmospheric Research reanalysis (Kalnay et al. 1996) on a 5° lat × 10° long grid, and a square root cosine (of latitude) weighting applied to the anomalies at the 504 grid points so as to properly integrate variance on the sphere (no equal area grid needed). The 50-yr winter mean is subtracted, and the units for the anomalies are geopotential meters.

Daily 0000 UTC 500-mb height for January, other details as in dataset 3.

These four datasets cover a wide range of situations in terms of domain size or spatial degrees of freedom, the number of points in space and in time and their ratio, standardization, time and or space mean being zero or not, etc. We will discuss EV for each of these datasets, but will present selected graphics for datasets 1 and 3 only, to make the more salient points. Results for datasets 2 and 4 are for space–time-reversed mode exclusively. Direct comparison of the normal setup and space–time reversal is discussed for dataset 1 only.

### b. Explained variance (EV) for each dataset

First we discuss the example of January mean temperature during 1932–95 (64 cases) over the contiguous United States at 102 CDs. The explained variance profile is very satisfactory in this situation (see Table 1, column 1). Three modes is enough for 75% EV, and seven modes explain 90%. A total of 102 modes (=all points in space) make up the full 100%. Column 1a refers to EOT for the same dataset, but now under time–space reversal. The numbers in parentheses in column 1 refer to covariance based EOF (which were calculated with the recipe given in the appendix). Traditional EOF has only 1%–4% more EV than the greater of columns 1 or 1a. On this dataset EOT is thus quite close to maximizing EV. Column 2 is for soil moisture in May, a far more challenging field for effective representation by any function because of the spotty nature of soil moisture anomalies. Nevertheless, in time–space-reversed mode it takes only four EOT modes to explain 50% of the variance. Columns 3 and 4 are for seasonal mean 700-mb height and daily 500-mb height, respectively, both on a much larger extratropical Northern Hemispheric domain. On this large domain, EOT appear to be equally satisfactory in explaining the variance. It takes only four modes to explain 50% of the variance in seasonal mean 700-mb height—for daily 500-mb height it takes 13 modes to accomplish the same.

Columns 1 and 1a compare EV for the same dataset but for a normal setup versus for space–time reversal. It turns out that the same total space–time variance is broken up in a roughly equally efficient way. In time, the neighboring years are not especially correlated with tb, at least not in the way the grid point sb is with its neighbors, hence the first EOT is slightly more efficient in normal time–space setup. However, beyond the monopole positive neighbor connection, the teleconnections (especially negative correlation) are stronger in time, and the reverse EOT is actually more efficient for modes 2, 3, etc., up to 3% EV. Apart from these differences, the normal and space–time reverse setup are basically the same. In *t–s* reverse mode the full 100% EV is reached with 64 modes (the number of time levels), instead of 102 modes in the normal setup. Fundamentally, a distribution of the variance across 64 modes can never be identically the same on a mode-to-mode basis as a distribution across 102 modes {unless the modes are orthogonal in both time and space [in which case the modes beyond 64 (the smallest dimension) are degenerate]}.

### c. Time series and spatial patterns: January mean temperature in the United States 1932–95

Figure 2 shows *e*_{1}(*s*) and *e*_{2}(*s*), two broadscale patterns suggesting frequent occurrence of temperature standardized anomaly fields of one sign covering about 75% of the east (west) with opposing anomalies in the west (east). The *e*(*s*) are maps of a regression coefficient with the base points in west Kentucky (*e*_{1}) and east Wyoming (*e*_{2}), respectively. The *e*(*s*) are not orthogonal by construction, but in fact they very nearly are for this dataset. The alternative spatial patterns *α*_{1}(*s*) and *α*_{2}(*s*), calculated orthogonally under space–time reversal are shown in Fig. 3. While the scale differs (std dev units × 100 in Fig. 3), one may observe Figs. 2 and 3 to be rather similar. The spatial mean in Fig. 3 is very much nonzero and taking out the spatial mean (as would be required in standardization) would have made it impossible to find this pattern. In Fig. 4 we provide the time series *e*_{1}(*t*) and *e*_{2}(*t*) associated with the *α*(*s*) in Fig. 3—checking for the *t* such that *e*(*t*) equals 100 brings out that *α*_{1}(*s*) is the January 1977 field, while *α*_{2}(*s*) is the January 1937 temperature (after the Jan 1977 has been regressed out). These two years feature very large anomalies (>3 std dev units)—a high signal-to-noise ratio is a necessary but not sufficient condition for making these flow realizations suitable to explain other years. Note also that Fig. 4 displays considerable low-frequency variability. Finally, Fig. 5 compares *e*_{1}(*t*) and *α*_{1}(*t*). As was true for *e*_{1}(*s*) and *α*_{1}(*s*) (Figs. 2 and 3, top panels), the time series of the leading mode is, except for the sign and units, nearly identical in normal setup and space–time-reversed mode. Remarkably, the raw time series at one point, west Kentucky, tells us nearly the same (about the whole time–space domain) as a single field in 1977. When normal and *t–s* reverse calculation yield the same result, as they do here very nearly for the top modes, the EOT are orthogonal along both coordinates and EOFs and EOTs become the same, not by definition but because the data dictates that result.

### d. Spatial patterns: Winter mean 700-mb height over the Northern Hemisphere 1948/9–1997/8

Many of the comments made in section 3c, pertaining to U.S. monthly temperature, could have been made as well for seasonal mean 700-mb height over the extratropical Northern Hemisphere. Therefore let us discuss the results for this field in a somewhat different way emphasizing other aspects briefly. Figure 6 shows the first four EOT modes in normal setup, cor(*s,* sb) rather than *e*(*s*), to be visually consistent with the literature, BL in particular. As announced when presenting Fig. 1, one recognizes instantly the usual suspects, the NAO and PNA, and indeed, these modes, explaining 20.5% and 16.8% of the domain variance, appear as modes 1 and 2. These modes are very nearly orthogonal in both time and space—this has to be a result dictated by the data because the method requires orthogonality in one direction only. Subsequent modes explain much less variance. Although concentrated in the Atlantic European area, these higher EOTs still display sweeping patterns with negative correlation at large (tele) distances. The NAO and PNA are truly outstanding and virtually independent (in linear correlation sense) operating modes—only near the line from Florida into west central Canada do these modes overlap, but the competition for the same variance is minor. To prove this we forced the PNA to be mode 1 by picking sb = 45°N, 160°W first, but, except for the order of the upper two, the four panels in Fig. 6, representing NAO and PNA, changed very little. Although the NAO is represented here by 75°N, 40°W, one could have picked several other points near Greenland, a southern point over the Atlantic or in Central Europe (see Fig. 1), and still be quite efficient in representing the NAO.

Figures 7 and 8 are the counterpart of Fig. 1’s bottom and top, that is, the standard deviation and pointwise domain variance explained, after subsequent modes have been removed. After removal of NAO and PNA the domain variance explained (Fig. 8) has quickly become rather flat in space (compared to Fig. 1 top) with peak values of 6%–7% only, and the choice of the next mode does not suggest itself very strongly. The remaining standard deviation (Fig. 7) is not flat at all—some locations have retained most of their variance after four modes are out (and >50% of the domain variance is out), for instance, over Scandinavia, west of Alaska, and off the Washington and British Columbia coast. Even though Florida and surroundings participate in both NAO and PNA, there is some 40% variance left after these modes are removed. One can very clearly see the “holes drilled” in the std dev map (Fig. 7) at the locations of subsequent sb, where the remaining std dev is zero.

### e. Noise, sampling variability, and reproducibility

EOT and EOF are based on full sample correlations, and as such subject to sampling variability and observational error. The EOTs have an especially obvious noise component: by accepting the “raw” time series *T*(sb, *t*) or the map *T*(*s,* tb) in full, this error becomes part of the EOT mode along the orthogonal direction. For instance, in Fig. 3, we have displayed the first two *α*(*s*) modes, which are in fact the observed temperature maps for January 1937 and 1977, and one may feel that the wiggly lines in the west of the United States are due to noise or undersampling. Some noise can also be surmised in Fig. 7 where the remaining standard deviation after removal of subsequent modes leaves small holes, suggesting that the time series at sb contains gridpoint specific variance, which does not correlate very far. This problem could be reduced in ad hoc fashion by some smoothing in space around the point sb before the dataset is reduced; see appendix for more on this issue. On the other hand, the upper panel of Fig. 1 shows robustness in the domain variance explained—the base points picked represent a rather large area, and moving the base point around is neither going to change the EV nor the appearance of the mode very much. Furthermore, *e*(*s*) in Fig. 2, a regression coefficient (not raw data), seems much more smooth, in line with what we have come to expect (rightly or wrongly) of modes: round and smooth. Apparently, EOT has the noise primarily along one axis, the orthogonal direction.

The presence of noise as well as sampling variability (given finite samples) raises the issue of reproducibility on independent data, an issue addressed by North et al. (1982) for regular EOF. We study the problem for January mean temperature (dataset 1) by doing the following cross validation (CV). We left 1 year out and calculated *M* modes from the 63 remaining years. The field observed in the year left out was then projected onto these modes and the EV for that year was calculated. The projection requires ideally orthogonal modes so the CV (leaving out 1 year at a time) was done in time–space-reversed setup. The procedure was repeated for all years exhaustively, and the EV was summed through over all 64 yr. The result is shown in Table 2. We find that the EV is reduced by only 1%–1.5%, a sign of considerable robustness in the EOT method in terms of its ability to explain variance. The largest difference is about 1.6% in the range 10–20 modes, when about 95% EV is reached. The columns on the right are for regular EOFs, and we conclude that EOTs survive CV as well as, or slightly better than EOF, since the latter lose up to 2% under CV; that is, although EOTs have an obvious noise component, there is no evidence that EOTs contain more noise than EOFs—the latter just partition the noise more evenly along space and time axis so as to make it less obvious. The calculation for EOT under CV perhaps possesses a greater stability because the leading modes (fields observed in 1937, 1977, 1950, etc.) do not change one iota, unless one of these years themselves is withheld. Even if the order in which these years are picked were to change, the appearance of the modes may change completely, but a linear combination of the leading years stays the same. We checked the above conclusions on other datasets and found much the same. For instance, while the loss in EV from full sample to CV is more on the order of 10% for the soil moisture dataset 2, the EOTs fare slightly better than EOFs.

A North et al. (1982) rule of thumb should probably apply also; that is, the EOT pattern, in terms of its looks, for mode *M* from a limited sample may not faithfully represent the parent population if the uncertainty in EV for mode *M* is on the order of the difference in EV for subsequent modes. The uncertainty being 1%, modes beyond 10 can certainly not be considered representative. The issue of appearance of modes is actually quite different for EOT and EOF, because there is a near-infinite number of ways to make a set of EOTs, while EOFs are unique.

The saturation EV is obviously 100% for the full sample size (64), while it is ∼99% for the CV (for this dataset). With saturation in sight (see Table 2), the “advantage” of dependent data over CV decreases because (i) there is less left to be explained by subsequent modes and (ii) the admission of an additional orthogonal basis function (no matter how unstable) invariably increases EV. The latter point was already made by North et al. (1982); that is, even if orthogonal patterns are not reproducible on independent data they still efficiently explain the variance.

It should be pointed out that under CV we restandardized 64 times for sample size 63, and expressed the year left out in terms of the climatology and standard deviation of the reduced developmental sample. This makes the variance to be explained in columns 1 and 2 of Table 2 slightly different—1.00 for the full sample and 1.04 for the reduced sample. Failure to restandardize creates an artificial connection between the year left out and the remaining sample [of the sort described in van den Dool (1987) and Barnston and van den Dool (1993b)] and diminishes the independence of the year left out for testing.

Leaving out only 1 yr may not seem like much, but the neighboring years are not particularly correlated, so each year is indeed an almost-independent realization. Leaving out 3 or 5 yr at a time, and applying CV to the center year of the group left out, leads to nearly identical results. One can imagine CV with the normal setup as well. In this case time series at certain locations are left out, the modes are calculated from less than 102 locations, and so on.

## 4. Conclusions and discussion

We have defined and presented a simple method to calculate functions, empirically and orthogonally, and name them empirical orthogonal teleconnections (EOTs). Given a space–time dataset *T*(*s, t*), we search for that point in space (sb; a base point) that explains the maximum possible of the variance of all points combined. What is explained by *T*(sb, *t*) is removed from *T*(*s, t*) by standard regression, and one can then search the reduced data *T*(*s, t*) for the next most important point in space. Eventually one obtains *T*(*s, t*) = Σ *α*_{m}(*t*)*e*_{m}(*s*), where the *α*s are time series and the *e*s are spatial patterns, and the summation is over mode *m.* The space patterns are regression coefficients between point sb and the other points *s,* and the time series are the raw data time series at base point sb, *T*(sb, *t*). There is a 1-to-1 correspondence between modes *m* and the successively chosen points sb_{m}. By applying EOT to four commonly used datasets we have shown by example that these functions can indeed be calculated with minimal complication and with good results both in terms of efficient explanation of variance and in terms of producing familiar patterns. EOT are orthogonal in either time or in space, not both (one has the choice). If orthogonality in space is chosen, the role of points in space is taken over by points in time. Most of the language above (and below) is for the normal setup.

The present authors are not aware of any references to EOT-like methods in our field. Certainly, EOTs, except for many details, have much in common with the Gram–Schmidt orthogonalization in linear algebra (Pearson 1975). The EOT procedure has its roots in regression, so many of the common approaches in regression (forward, backward, retroactive pruning) to come up with the shortest and most significant and efficient equations could be applicable here, but have not been pursued.

EOTs are less constrained than EOFs by being orthogonal in just one direction (as opposed to two in regular EOF). In this regard EOTs are like rotated EOFs (REOFs). But note that EOTs are calculated in one step;that is, one does not need to rotate anything, nor is there a truncation point for rotation (O’Lenic and Livezey 1988).

In appearance, the leading EOTs look very much like previously described (rotated) EOFs (BL), and teleconnection patterns (WG). The formal link to EOFs is presented (section 2d) and an iteration can be used to calculate EOFs, given EOTs (see appendix). The link between EOT and traditional teleconnections (Namias 1981; WG) is obviously very strong as well. A (any) WG or Namias teleconnection map is the same as a first EOT (normal *t–s* setup) anchored to that base point. In neither WG nor Namias (1981) was there any data reduction, and all teleconnections were derived from the original full data. Hence, there was no orthogonality, no explained variance notion, and no functional representation as in (1).

It is hard to find a completely satisfactory name for what we have here called EOT. We considered names like “special” or “simple” EOF (SEOF—too vague) or empirical normal modes [ENM—name exists already for something else (Brunet 1994)]. Of course “simplicity” is in the eye of the beholder, but anyone who knows multiple linear regression will understand EOT in that framework rather easily. The route to EOFs via regression methods, as an alternative to the eigenproblem, is not commonly used in meteorology, but certainly hinted at in works like Kim and North (1998). The chosen name EOT indicates that these new functions bridge the gap between EOFs and teleconnections. While EOT is very close to teleconnections, we should mention three differences. First, teleconnections have not been considered with time and space reversed; only the search for analogs and antilogs (van den Dool 1987) comes close. Second, the WG teleconnections are picked for remote robust negative correlations, while EOT derive most of their variance-explaining capability from the nearby positive correlations. Third, while teleconnections are calculated from raw data, EOT beyond the first mode is calculated from reduced data.

The reason that EOF, EOT, and WG teleconnections all come up with similar patterns (such as NAO and PNA) is not the similarity in the basic tool used (linear correlation). Nature is helping here by providing a few modes that are in fact close to being biorthogonal. Therefore, we will find these modes whether we use a method that is constrained to be orthogonal in one (EOT) or two directions (EOF). For the same reason we will find these modes, whether we solve for them all at once (EOF) or one after the other (EOT, teleconnections), removing the variance of the previous modes (EOT) or not (WG teleconnections). Exactly why nature has a few quasi-biorthogonal modes is not clear but would be worthwhile to know.

Often, EOFs have been used to study low-frequency variations. The time series based on EOT (see Fig. 4) show various ups and downs over various timescales, including interdecadal. There could be some concern about the reality of trends in EOT time series being affected by the methodological requirement that the time series equals unity at some specific base time tb (1937 and 1977 in Fig. 4). While the correlation would be maximum at tb, the regression coefficient need not be (and regularly is not!) maximum. Also, if there is a real trend, the base time should be very late or early in the series. In Fig. 5 we compare the time series in regular and reverse setup—while only the latter suffers from the “forced” maximum, in 1977 the two time series are nearly the same.

Some smaller details regarding EOT calculation (1–5) to keep in mind are presented.

- CPU time [0.4 s (Cray90, 1 processor, all modes) for a 102 × 64 problem] is reduced by two-thirds when using partial correlation; that is, rather than recalculating the ns × ns correlation matrix explicitly from the reduced
*T*(*s, t*) dataset, one can also reduce the original correlation matrix directly by where cor(*s*1,*s*2|sb) denotes the partial correlation (Panofsky and Brier 1968) between*T*at points*s*1 and*s*2, given that the time series at sb has been accounted for. Use of partial correlation is only for CPU reduction. As is, the

*e*s and*α*s are not standardized as in being orthonormal; that is, inner products like Σ*α*_{m}(*t*)*α*_{n}(*t*)/nt for*n*=*m*are not unity. However, a postprocessor could achieve that result.EOT are covariance based. The switch to a correlation-based EOT, as defined, is made by inputting standardized data.

One has to choose time or space as the orthogonal coordinate.

EOT is linked to specific points in time or space. This statement is true in full only for the first mode. For the

*m*th mode, EOT is linked in full to the data (at some point or time) that remains after removal of*m*− 1 previous modes; that is, beyond mode 1, the linkage of modes to the raw input data may not be so obvious, unless teleconnections in different sectors of the domain are largely independent, which we found to be true for the first two modes both for U.S. January temperature and extratropical Northern Hemisphere DJF 700-mb height.

We next list some special advantages (1–4) and possible further applications (1–4) of EOT.

In terms of (physical) interpretation (always a debatable issue with EOFs), EOT may offer advantages. Some researchers may appreciate the linkage of modes to specific sites because it keeps the orthogonal function close to time-honored indices (local time series, really), such as are used for El Niño–Southern Oscillation monitoring. A more theoretical advantage is that the “effective spatial degrees of freedom in space” (Bretherton et al. 1999) has sometimes been equated to

*N*processes going on independently at*N*points in space. Just where might these evasive points be? EOT provides an answer. For monthly temperature in January, west Kentucky explains the most (∼38% EV) of monthly mean temperature around the United States, followed by east Wyoming with 31%.That EOT indeed gives more physical patterns can best be seen under time–space reversal. In this case the space patterns are actually observed fields, the physical reality of which cannot be questioned (except for observational/analysis noise). In this sense EOT has an important advantage over EOF. With time–space reversed one finds out that January 1977 explains most (37% EV) of the other years, followed by January 1937 (January 1977 regressed out) with 34%. The first EOT of instantaneous fields actually satisfies the full set of nonlinear governing equations.

One has the liberty to force certain modes (space–time points, rather) in first. If one wants to see an unadulterated version of the PNA as mode 1, one just forces sb = 45°N, 160°W in first. One can start with any base point one wants or can let the order of subsequent sb be determined by considerations other than maximizing EV. EOT gives some freedom that (R)EOF does not have.

Given the EOT method as defined, it is not necessary to make the space or time mean zero a priori. One may still have good reason to do this, but a procedure based on the above will adhere to (1) or (1a). This is a distinct advantage since simultaneous standardizing along both the time and space coordinate is impossible, and, by not requiring standardization, the same total variance is explained in the space–time reversal option. The correlation, variance, and covariance are all calculated without removal of sample means. If the sample mean is nonzero, it will be explained by one or more EOTs.

These considerations play a role in real-time forecast tools, where, following protocol, one intends to forecast anomalies with respect to the 1961–90 “normal” but also wants to use as much data as possible. For example in the case of the May soil moisture data, dataset 2, monthly data for 1932–95 minus the 1961–90 climatology in units of millimeters was offered to the EOT code. Neither the time nor the space means are zero, but (1a) is valid to the last millimeter if enough modes are admitted, and a forecast scheme automatically produces anomalies with respect to 1961–90.

We cannot claim that the standardization issue is solved to satisfaction. One feature of the procedure we have adopted is that the regression coefficient *e*_{m}, based on a “regression through the origin” (Brownslee 1965), is sensitive to the distance to the origin. Consequently, the appearance of modes will change when a constant is added (such as would happen in converting from °C to K when time series have nonzero mean—the multiplicative factor has no impact). However, this may be no worse than the change in modes one gets by taking the mean out a priori. When using regression with an intercept the mean comes out in full along with mode 1.

Possible further applications 1–4: We here list possible applications other than the obvious ones pursued in section 3. To some extent this is a list of future or ongoing parallel work.

- A promising application is the use of EOT in forecast schemes. The use of EOF as a matter of efficiency for (numerical) forecast models has long been considered, but here we believe that space–time-reversed EOT are more logical building blocks from a forecast and simulation point of view. EOFs are efficient, but there is no obvious way of linking EOFs to the essence of forecasting, that is, to the time derivative (other than by substitution into the governing equations and truncating somewhere). Specifically, by expressing an initial condition
*T*(*s, t*= 0) into regular EOFs as we have done little to find the time derivative of*T.*Since we use only orthogonality in space anyway, we can also express the inital state in terms of reversed EOT: (where time tb_{m}corresponds 1 to 1 to mode*m*), which gives immediate access to the time derivative, since the EOT are linked to realizations [*T*(*s,*tb_{m})], and each flow has a known realization following. In combination with constructed analog (van den Dool 1994), which seeks to project an initial condition onto a nonorthogonal set of historical realizations, that is,*T*(*s, t*= 0) = Σ*d*_{m}*T*(*s,*tb_{m}), the use of EOT appears to be useful in making it easier and less ambiguous to find the weights*d*_{m}. The ambiguity of finding*d*_{m}is discussed in van den Dool (1994), but given*c*_{m}it is easy to calculate*d*_{m}. More in general, the linkage to specific years is an enormous bounty, because one now points (without further statistics) to the time derivatives observed in these years, and one can make an empirical–numerical prediction or simulation model, which can run both backward and forward in time. Other variables and levels. One can easily include other levels; that is, the

*s*in*T*(*s, t*) would run both in horizontal and vertical space. Some kind of a priori data treatment like standardization to place levels on equal footing needs to be addressed. Adding more variables (*T,*wind components, surface pressure) is equally doable, as long as the relative units make sense to the researcher.Special CCA.

(a)We have presented EOT (special EOF) in section 2 as a stepwise linear regression where

*T*(*s, t*) is both the predictor and the predictand. However, if the predictor is denoted as*Z*(*s, t*) and the predictand is*T*(*s, t*), one can develop special canonical correlation analysis (SCCA); that is, find that point sb in the*Z*field such that*Z*(sb,*t*) explains the most of the variance of*T*(*s, t*) combined over all*s.*The regression of*Z*(sb,*t*) against*Z*(*s, t*) on the one hand and against*T*(*s, t*) on the other yields the first pair of canonical patterns.*Z*(sb,*t*) is the single time series. Substituting*Z*(*s, t*) =*T*(*s, t*), one can see that EOT is “SCCA onto itself,” likewise regular EOF is self-CCA.(b)Another route to CCA is to consider more than one point sb at the same time. In EOT we have one point (sb) on the predictor side and a whole field on the predictand side. One can ask which linear combination of two base points in

*Z*will explain the most of the whole field*T*(*s, t*). Next is three points and ultimately a linear combination of all points in*Z*to explain the field*T*(*s, t*). This is a route to classical CCA.Climate analysis. Knowing the

*N*points in space that are the most efficient in explaining, say, 95% of the variance of all other points would seem to provide a rationale for a global climate analysis system that requires no more than*N*optimally situated observations (without error). Here*N*may not be all that large (on the order of a few hundred at most).(a)Retroactive analysis: Assuming that

*N*is only a few hundred, one could consider a reanalysis back to 1850 or so, based on surface data (monthly pressure and temperature) alone. Here one forces those points in where surface observations have been made without interruptions from the beginning to first derive the*N*modes from modern multilevel multivariable data [say 1950–present as provided by Kalnay et al. (1996)], and then reconstruct the 3D multivariable fields in the pre-1950 era as far back as one can go with just surface data. That such an approach might work is clear from Klein and Dai (1998), who had considerable success with what they called reverse specification.(b)Climate observation system: Knowing the

*N*points in space that explain, say, 95% of the variance of all other points would seem to provide a rationale as to where we treasure the observations most, where we would like to continue to have time series, or to make a case to start time series now. A case in point is the base point of the PNA at 45°N, 160°W. Is it not peculiar that, as of 1999, we have only a few surface observations and satellite and airplane wind data (at 200 mb) but no vertical soundings (other than satellite retrievals) in an area that gives us (believing the gridded analysis) nearly the most explained variance (in midtropospheric height) for the entire Northern Hemisphere domain? Of course, the optimal sites would have to consider temperature, winds, humidity, surface pressure, and so on, combined, and may be seasonally dependent.

## Acknowledgments

We thank Dave Unger and Tom Smith for their insightful internal review. We also thank Ed Epstein, Siegfried Schubert, Tony Barnston, Gilbert Brunet, Hans von Storch, Francis Zwiers, Ming Cai, Risheng Wang, and the anonymous reviewers for their constructive comments.

## REFERENCES

### APPENDIX

#### An Iteration of EOT toward EOF

EOT places 100% faith in time series at successive base points sb, even though there must be some observational and analysis error. This concern can be addressed in various ad hoc ways, but here we mention a method of iteration following section 2d. The proposed procedure is inspired by the way teleconnection indices are (or perhaps should be) constructed. Referring to Fig. 6’s upper-right rendition of the PNA pattern, the teleconnection index according to EOT is just *T*(sb, *t*), where sb = 45°N, 160°W. However, Wallace and Gutzler (1981) define the PNA index as an average of *T*(*s, t*) at the four major centers, that is, near Hawaii, the North Pacific, Canada, and the Southeast United States, where the even points enter as −*T*(*s, t*). This definition already implies the kind of smoothing we are seeking here. Wallace and Gutzler (1981) express hesitation as to what the weight should be for the four points. In a way one would want to use the other three locations, at any given time, to specify what 45°N, 160°W should be, and average these three specifications with the observation as an amended version of *α*_{1}(*t*). In the EOT (normal setup) context the most logical field of weights would be proportional to *e*_{1}(*s*); that is, given the spatial pattern *e*_{1}(*s*) (and no more) from EOT as first guess one can calculate a new *α*_{1}(*t*) as

followed by recalculation of *e*_{1}(*s*) given the latest *α*_{1}(*t*)

We applied this to dataset 1 and found that the correlation of 1.0 in west Kentucky (Fig. 2 top) gets lowered to about 0.97 and some surrounding and far away values are adjusted a little. One can also start from a time series and enter (A1b) first. In doing so, patterns emerge more smooth than the ones shown in Fig. 3. In Fig. 7 one would no longer see holes drilled at the successive sbs because some “noise” up to 10 units is left behind. Expressions (A1a) and (A1b) are the same as (2a) and (2b), but written for mode 1 alone. If one loops around (A1a) and (A1b) for a few more iterations, starting from EOT’s first guess *e*_{1}(*s*), the EOT solution converges toward regular EOFs, the modes become orthogonal in both time and space and somewhat more variance gets explained. In the few examples studied here, one or two iterations was enough to achieve most of the change one will get when convergence is reached. In a way we invented a recipe as to how one rotates EOTs back to EOFs, after first presenting a recipe that yields EOT (close to rotated EOFs) in one single and simple calculation. One can apply the iteration for each mode in order 1, 2, etc., with the understanding that when calculating the *m*th mode *T*(*s, t*) is the dataset reduced by subtracting the *m* − 1 previous modes out. The EOFs need not come out in order of EV. As an aside, we suspect that one can get EOFs out of (A1a) and (A1b) starting from rather arbitrary fields (like the observed fields), that is, without calculating any correlation. The only question for a given initial guess: to which EOF do we converge?

## Footnotes

*Corresponding author address:* H. M. van den Dool, Climate Prediction Center/NCEP, WWB, W/NP51 Rm. 604, 5200 Auth Rd., Camps Springs, MD 20746.

Email: huug.vandendool@noaa.gov