## 1. Introduction

Regression maps have been used in many studies to find climate signals associated with a given time series, for instance, to identify temperature and precipitation signals associated with the North Atlantic, Arctic, or Antarctic Oscillations (Hurrell 1996; Thompson and Wallace 1998; Shindell et al. 1999; Reichert et al. 2001; Jones and Widmann 2003); to find geopotential height anomalies related to the leading principal components (PCs) of precipitation (Quadrelli et al. 2001); or to find the climatic response to solar forcing (Waple et al. 2002). When the link between the time series and the time-dependent field is formulated individually for each location, regression maps are a straightforward way to capture the signal of the time series in the field. A nonlocal interpretation of regression maps was given by Wallace et al. (1995), who showed for the special case of fields with a spatial mean of zero that the time expansion coefficient (TEC) of the regression map, defined by orthogonal projection of the field onto the regression map, has maximal covariance with the time series that was used to define the regression map, and that in this case the results are therefore identical to those of a singular value decomposition (SVD). TECs of regression maps between the Arctic Oscillation index (AOI) and temperature or geopotential height fields were also calculated by Thompson and Wallace (1998) using an orthogonal projection, whereas in Thompson et al. (2000) TECs of correlation maps were used. These papers did not discuss that calculating TECs of regression maps through orthogonal projection is related to SVD; the TECs were compared to the AOI, but not used to estimate the AOI or vice versa.

It is the purpose of this paper to clarify the relationship between regression maps, SVD, and canonical correlation analysis (CCA) and to discuss the various ways in which a time series can be linearly linked to a time-dependent field. All the derived properties follow directly from the basic definitions and general solutions of CCA and SVD. Some of the findings could also be easily inferred from relations listed in Bretherton et al. (1992), hereafter referred to as BSW92. However, the case in which one of the two fields is just a time series, which then must be proportional to the TEC of a canonical pattern or singular vector, was not explicitly discussed in BSW92, and thus the relationship between regression maps and CCA or SVD may have gone unnoted.

## 2. Some properties of canonical correlation analysis and singular value decomposition

In this section properties of CCA and SVD that are needed for the arguments in this paper are briefly reviewed. The nomenclature follows closely that of BSW92. Basic definitions and properties that are omitted for brevity can be found for instance in BSW92 and in von Storch and Zwiers (1999).

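Let us consider two $N_s$- and $N_z$-dimensional, real, time-dependent fields with zero temporal mean in each dimension, which are represented as column vectors $\mathbf{s}(t)$ and $\mathbf{z}(t)$. In statistical climatology $N_s$ and $N_z$ typically refer to a spatial index. Both CCA and SVD find coupled patterns in these fields, based on different optimization criteria. Let us first consider CCA and let $\mathbf{u}_k$ and $\mathbf{v}_k$ be the weight vectors or adjoint canonical patterns, and $\mathbf{p}_k$, $\mathbf{q}_k$ the canonical patterns in the $\mathbf{s}(t)$ and $\mathbf{z}(t)$ field, respectively. In CCA the general solution for the adjoint canonical patterns $\mathbf{u}_k$ is given by the eigenvector equation

$$\mathbf{C}_{ss}^{-1}\,\mathbf{C}_{sz}\,\mathbf{C}_{zz}^{-1}\,\mathbf{C}_{sz}^{T}\,\mathbf{u}_k = \lambda_k\,\mathbf{u}_k, \tag{1}$$

where $\mathbf{C}_{ss}$ and $\mathbf{C}_{zz}$ denote the covariance matrices of $\mathbf{s}$ and $\mathbf{z}$, $\mathbf{C}_{sz}$ the cross-covariance matrix between the two fields, and $\lambda_k$ the eigenvalues. A similar equation holds for the adjoint canonical patterns $\mathbf{v}_k$. Pairs of adjoint patterns are related through

$$\mathbf{v}_k = \eta\,\mathbf{C}_{zz}^{-1}\,\mathbf{C}_{sz}^{T}\,\mathbf{u}_k, \tag{2}$$

with a suitable normalization factor $\eta$ determined by the normalization convention

$$\mathbf{u}_k^{T}\,\mathbf{C}_{ss}\,\mathbf{u}_k = \mathbf{v}_k^{T}\,\mathbf{C}_{zz}\,\mathbf{v}_k = 1. \tag{3}$$

Patterns and adjoint patterns are related through

$$\mathbf{p}_k = \mathbf{C}_{ss}\,\mathbf{u}_k, \qquad \mathbf{q}_k = \mathbf{C}_{zz}\,\mathbf{v}_k. \tag{4}$$

The weight vectors are used to define the TECs $a_k(t)$ for the patterns $\mathbf{p}_k$ through

$$a_k(t) = \mathbf{u}_k^{T}\,\mathbf{s}(t), \tag{5}$$

and analogously the TECs $b_k(t)$ for $\mathbf{z}(t)$.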

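In SVD the patterns are obtained from the singular value decomposition of the cross-covariance matrix,

$$\mathbf{C}_{sz} = \mathbf{P}\,\boldsymbol{\Sigma}\,\mathbf{Q}^{T}, \tag{6}$$

where $\boldsymbol{\Sigma}$ contains the singular values, and $\mathbf{P} \in \mathbb{R}^{N_s \times N_s}$ and $\mathbf{Q} \in \mathbb{R}^{N_z \times N_z}$ are orthogonal matrices, whose columns are the patterns or singular vectors $\mathbf{p}_k$ and $\mathbf{q}_k$. Because, in contrast to CCA, the set of singular vectors $\mathbf{p}_k$ ($\mathbf{q}_k$) is orthonormal, the $k$th TEC of the pattern is obtained by orthogonal projection of the data at each time step onto the pattern, which means that the weight vectors $\mathbf{u}_k$ and $\mathbf{v}_k$ are identical to the patterns $\mathbf{p}_k$ and $\mathbf{q}_k$.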

One application of CCA and SVD is the investigation of coupled variability by means of the patterns ($\mathbf{p}_k$, $\mathbf{q}_k$) or weight vectors ($\mathbf{u}_k$, $\mathbf{v}_k$). A second application, which is used frequently, for instance for statistical downscaling or for climate reconstructions, is the estimation of one field from the other. If $\mathbf{z}$ is estimated from $\mathbf{s}$, the estimate $\hat{\mathbf{z}}$ based on the leading $n$ pairs of CCA or SVD patterns is given by

$$\hat{\mathbf{z}}(t) = \sum_{k=1}^{n} \hat{b}_k(t)\,\mathbf{q}_k, \tag{7}$$

where the estimated TECs $\hat{b}_k(t)$ are obtained by linear regression from the TECs $a_k(t)$ of the other field. Because in CCA and SVD the TEC $b_k(t)$ is only correlated with $a_k(t)$ and uncorrelated with all other TECs of the field $\mathbf{s}$, the estimate is given by

$$\hat{\mathbf{z}}(t) = \sum_{k=1}^{n} \frac{\operatorname{cov}(a_k, b_k)}{\operatorname{var}(a_k)}\,[\mathbf{u}_k^{T}\,\mathbf{s}(t)]\,\mathbf{q}_k. \tag{8}$$

For CCA-based reconstructions this can be rewritten using the canonical correlation $\rho_k$ as

$$\hat{\mathbf{z}}(t) = \sum_{k=1}^{n} \rho_k\,[\mathbf{u}_k^{T}\,\mathbf{s}(t)]\,\mathbf{q}_k. \tag{9}$$

Note that Eqs. (8) and (9) include weight vectors of the predictor field and patterns of the predictand field. Because in CCA weight vectors and patterns are not identical, the best way to present the results of a CCA used for estimating one field from another is to show the adjoint canonical patterns of the predictor field and the canonical patterns of the predictand field, since these are the terms actually used in the estimation. In SVD weight vectors and patterns are identical, and thus the question of whether to present the former or the latter does not arise.

## 3. Solutions for one-dimensional CCA and SVD

When one of the two fields $\mathbf{s}(t)$ and $\mathbf{z}(t)$ is one-dimensional, some of the matrices that define the solutions become considerably simpler, and there is only one pair of canonical vectors or singular vectors. Let, for instance, $s(t)$ be the one-dimensional time series. Then $\mathbf{C}_{ss}$ is just a scalar given by $\mathbf{C}_{ss} = \operatorname{var}[s(t)]$, while $\mathbf{C}_{sz}$ reduces to a row vector $\mathbf{C}_{sz} = \operatorname{cov}[s(t), \mathbf{z}^{T}(t)]$, whose components are the covariances between $s(t)$ and the components of $\mathbf{z}(t)$.

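For CCA the adjoint pattern $u$ is a scalar given by the normalization (3) as

$$u = \operatorname{var}(s)^{-1/2}. \tag{10}$$

As $u$ is known, Eq. (2) and the normalization constraint (3) can be used to calculate the adjoint canonical pattern $\mathbf{v}$, which is then given by

$$\mathbf{v} = \left(\mathbf{C}_{sz}\,\mathbf{C}_{zz}^{-1}\,\mathbf{C}_{sz}^{T}\right)^{-1/2}\,\mathbf{C}_{zz}^{-1}\,\mathbf{C}_{sz}^{T}. \tag{11}$$

Employing (4) yields for the canonical patterns

$$p = \operatorname{var}(s)^{1/2}, \tag{12}$$

$$\mathbf{q} = \left(\mathbf{C}_{sz}\,\mathbf{C}_{zz}^{-1}\,\mathbf{C}_{sz}^{T}\right)^{-1/2}\,\mathbf{C}_{sz}^{T}. \tag{13}$$

For SVD the singular value decomposition is straightforward because $\mathbf{C}_{sz}$ is just a vector, and yields the singular vector

$$\mathbf{q} = \left(\mathbf{C}_{sz}\,\mathbf{C}_{sz}^{T}\right)^{-1/2}\,\mathbf{C}_{sz}^{T}, \tag{14}$$

which is proportional to the CCA solution (13). (To keep the notation simple the same variable names as in the CCA section are used, although the CCA and SVD solutions differ.)

Thus the patterns $\mathbf{q}$ derived from SVD and from CCA are both proportional to the regression map

$$\mathbf{m} = \operatorname{var}(s)^{-1}\,\mathbf{C}_{sz}^{T}, \tag{15}$$

while the adjoint canonical pattern $\mathbf{v}$ is proportional to the weights

$$\mathbf{C}_{zz}^{-1}\,\mathbf{C}_{sz}^{T} \tag{16}$$

obtained from a multiple linear regression (MLR) with $\mathbf{z}$ as the predictors and $s$ as the predictand. Note that the first proportionality could be derived from Table 1 in BSW92, where it is noted that singular vectors and canonical patterns are proportional to heterogeneous covariance maps, which are defined as the covariances between the TEC $a_k(t)$ and the field $\mathbf{z}(t)$, or between $b_k(t)$ and $\mathbf{s}(t)$. When one takes into account that for one-dimensional $s$ the SVD and CCA TEC $a(t)$ is just a multiple of $s$, the proportionality between $\mathbf{q}$ and $\mathbf{m}$ follows.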
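The proportionalities stated above can be checked numerically. The following is a minimal, dependency-free Python sketch on synthetic data; the data, the three-component field, and the helper names `cov`, `solve`, and `proportional` are illustrative assumptions and not part of the paper.

```python
import random

def cov(x, y):
    """Covariance of two equally long series (1/n normalization)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def proportional(x, y, tol=1e-9):
    """True if x equals k * y for a single scalar k."""
    k = x[0] / y[0]
    return all(abs(a - k * b) <= tol for a, b in zip(x, y))

# Hypothetical synthetic data: scalar series s(t) and a 3-component field,
# stored component-wise as z[i][t]. z[0] and z[1] share noise, so C_zz is
# not diagonal and the MLR weights differ from the regression map.
rng = random.Random(0)
n_t = 500
s = [rng.gauss(0.0, 1.0) for _ in range(n_t)]
e = [rng.gauss(0.0, 1.0) for _ in range(n_t)]
z = [
    [0.8 * a + b for a, b in zip(s, e)],
    [-0.5 * a + 0.6 * b + rng.gauss(0.0, 1.0) for a, b in zip(s, e)],
    [0.2 * a + rng.gauss(0.0, 1.0) for a in s],
]

var_s = cov(s, s)
c_sz = [cov(s, zi) for zi in z]                  # row vector C_sz
c_zz = [[cov(zi, zj) for zj in z] for zi in z]   # matrix C_zz

m = [c / var_s for c in c_sz]                    # regression map, Eq. (15)
w = solve(c_zz, c_sz)                            # MLR weights C_zz^-1 C_sz^T

norm_cca = sum(c * wi for c, wi in zip(c_sz, w)) ** 0.5  # (C_sz C_zz^-1 C_sz^T)^(1/2)
norm_svd = sum(c * c for c in c_sz) ** 0.5               # (C_sz C_sz^T)^(1/2)
v_cca = [wi / norm_cca for wi in w]              # adjoint pattern, Eq. (11)
q_cca = [c / norm_cca for c in c_sz]             # canonical pattern, Eq. (13)
q_svd = [c / norm_svd for c in c_sz]             # singular vector, Eq. (14)

print(proportional(q_cca, m), proportional(q_svd, m), proportional(v_cca, w))
```

In this sketch the two patterns differ from the regression map only by a scalar factor, whereas the adjoint pattern follows the MLR weights and in general points in a different direction than the map.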

### a. Estimating a scalar from a vector

In this subsection the CCA, SVD, and MLR approaches for estimating a scalar time series *s*(*t*) from a time-dependent vector **z**(*t*) are discussed.

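The CCA estimate for $s$ is given by an equation analogous to Eq. (9) as

$$\hat{s}(t) = \rho\,[\mathbf{v}^{T}\,\mathbf{z}(t)]\,p \tag{17}$$

$$\hat{s}(t) = \mathbf{C}_{sz}\,\mathbf{C}_{zz}^{-1}\,\mathbf{z}(t), \tag{18}$$

where we have used Eqs. (11) and (12) together with $\rho = u\,\mathbf{C}_{sz}\,\mathbf{v}$, which follows from the normalization (3). The CCA estimate includes the adjoint canonical pattern $\mathbf{v}$, which, as mentioned above, is proportional to the MLR weights, and the CCA estimate is identical to the MLR estimate. The coefficient of multiple determination in MLR is identical to the squared canonical correlation $\rho^{2}$. This equivalence of MLR and one-dimensional CCA is well known in statistical climatology (e.g., Glahn 1968).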
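The equivalence of the one-dimensional CCA estimate and the MLR estimate can be illustrated with a short sketch. It assumes synthetic data with only two predictors, so that the inverse of the predictor covariance matrix can be written in closed form; all variable names are hypothetical.

```python
import random

def cov(x, y):
    """Covariance of two equally long series (1/n normalization)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def center(x):
    """Remove the temporal mean, as assumed for all fields in the text."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]

# Hypothetical synthetic data: scalar s(t) and two correlated predictors.
rng = random.Random(1)
n_t = 400
s = center([rng.gauss(0.0, 1.0) for _ in range(n_t)])
z1 = center([0.7 * a + rng.gauss(0.0, 1.0) for a in s])
z2 = center([-0.4 * a + 0.5 * x + rng.gauss(0.0, 1.0) for a, x in zip(s, z1)])

var_s = cov(s, s)
c1, c2 = cov(s, z1), cov(s, z2)                        # C_sz
c11, c12, c22 = cov(z1, z1), cov(z1, z2), cov(z2, z2)  # C_zz

# MLR weights C_zz^-1 C_sz^T from the inverse of a symmetric 2x2 matrix.
det = c11 * c22 - c12 * c12
w1 = (c22 * c1 - c12 * c2) / det
w2 = (c11 * c2 - c12 * c1) / det
s_mlr = [w1 * x + w2 * y for x, y in zip(z1, z2)]

# One-dimensional CCA: pattern p, adjoint pattern v, and TEC b(t) = v^T z(t).
norm = (c1 * w1 + c2 * w2) ** 0.5                      # (C_sz C_zz^-1 C_sz^T)^(1/2)
p = var_s ** 0.5
tec = [(w1 * x + w2 * y) / norm for x, y in zip(z1, z2)]
rho = cov(s, tec) / (var_s * cov(tec, tec)) ** 0.5     # canonical correlation
s_cca = [rho * bt * p for bt in tec]                   # rho [v^T z(t)] p

r_squared = cov(s_mlr, s_mlr) / var_s                  # coefficient of multiple determination
max_diff = max(abs(x - y) for x, y in zip(s_cca, s_mlr))
print(max_diff, r_squared, rho ** 2)
```

Up to rounding, the two estimates coincide at every time step, and the explained-variance fraction of the regression equals the squared canonical correlation.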

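Using Eq. (8) with $\mathbf{z}(t)$ and $\mathbf{s}(t)$ interchanged, as well as the fact that $s$ and the TEC $a$ are identical, one obtains for the SVD estimate

$$\hat{s}(t) = \frac{\operatorname{cov}(s, b)}{\operatorname{var}(b)}\,\mathbf{q}^{T}\,\mathbf{z}(t), \tag{19}$$

which can be rewritten as

$$\hat{s}(t) = \frac{\mathbf{C}_{sz}\,\mathbf{q}}{\mathbf{q}^{T}\,\mathbf{C}_{zz}\,\mathbf{q}}\,\mathbf{q}^{T}\,\mathbf{z}(t) \tag{20}$$

$$\hat{s}(t) = \frac{\mathbf{C}_{sz}\,\mathbf{C}_{sz}^{T}}{\mathbf{C}_{sz}\,\mathbf{C}_{zz}\,\mathbf{C}_{sz}^{T}}\,\mathbf{C}_{sz}\,\mathbf{z}(t). \tag{21}$$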

This equation includes weights for the predictor field **z**(*t*) that are proportional to the regression map (15). The SVD-based estimate is usually not used in statistical climatology, but it should be noted that the normalized TECs of regression maps in Thompson and Wallace (1998) are in line with the idea of estimating *s* from a time series obtained by orthogonal projection of the data onto the regression map and only need to be scaled properly to obtain the SVD estimate *ŝ*(*t*). For estimating a time series *s*(*t*) from a multivariate predictor, often PC-prefiltered MLR, or equivalently CCA, are used, as they maximize the explained variance. However, this optimization criterion holds only for the fitting data, and it is not a priori clear whether CCA–MLR has a better skill than SVD on independent data.
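As an illustration of the SVD-based estimate, the sketch below projects a synthetic field onto its regression map, rescales the resulting TEC by regression, and compares the outcome with the closed form in which the weights are proportional to the cross-covariance vector. Data and function names (`svd_estimate`, `center`) are assumptions made for this example.

```python
import random

def cov(x, y):
    """Covariance of two equally long series (1/n normalization)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def center(x):
    """Remove the temporal mean."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]

# Hypothetical synthetic data, stored component-wise as z[i][t].
rng = random.Random(2)
n_t = 300
s = center([rng.gauss(0.0, 1.0) for _ in range(n_t)])
coeffs = [0.8, -0.5, 0.2]   # assumed signal of s in each component of z
z = [center([c * a + rng.gauss(0.0, 1.0) for a in s]) for c in coeffs]

var_s = cov(s, s)
c_sz = [cov(s, zi) for zi in z]
m = [c / var_s for c in c_sz]   # regression map, Eq. (15)

def svd_estimate(pattern):
    """Project z(t) onto `pattern`, then regress s on the resulting TEC."""
    tec = [sum(pk * zi[t] for pk, zi in zip(pattern, z)) for t in range(n_t)]
    scale = cov(s, tec) / cov(tec, tec)
    return [scale * bt for bt in tec]

s_hat = svd_estimate(m)

# Closed form: (C_sz C_sz^T) / (C_sz C_zz C_sz^T) * C_sz z(t), cf. Eq. (21).
cc = sum(c * c for c in c_sz)
czc = sum(ci * cov(zi, zj) * cj
          for ci, zi in zip(c_sz, z) for cj, zj in zip(c_sz, z))
s_closed = [cc / czc * sum(c * zi[t] for c, zi in zip(c_sz, z))
            for t in range(n_t)]

print(max(abs(x - y) for x, y in zip(s_hat, s_closed)))
```

Because the final regression step fixes the scaling, any positive or negative multiple of the regression map passed to `svd_estimate` gives the same estimate; only the direction of the pattern matters.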

The SVD approach has the advantage that no PCs need to be calculated and no subjective decision on the number of retained PCs is required. BSW92 analyzed examples in which both fields were multivariate and obtained similar results with SVD and with prefiltered CCA, whereas CCA without prefiltering was uncompetitive due to high sampling variability. SVD was used later in several studies for investigating links between two fields (e.g., Qian et al. 2003; Loschnigg et al. 2003) or for estimating one field from another (Widmann et al. 2003). Another potential problem with CCA is that it includes the inversion of the within-field covariance matrices, which may lead to unstable results on small samples and requires using generalized inverses when the number of variables is higher than the number of time steps and, as a consequence, the results may be difficult to interpret (BSW92; Cherry 1996). However, this problem is partly accounted for by the PC prefiltering and, as pointed out by Cherry (1996), SVD can under certain circumstances also yield spurious coupled patterns. Thus in the multidimensional case no method performs generally better than the other, and therefore both the prefiltered CCA and the SVD approach may also be useful for estimating a scalar from a vector. A comparison in a practical example follows below.

### b. Estimating a vector from a scalar

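The CCA estimate for $\mathbf{z}$ from $s$ is given by

$$\hat{\mathbf{z}}(t) = \rho\,[u\,s(t)]\,\mathbf{q} \tag{22}$$

or, using Eq. (13) and then Eq. (10), by

$$\hat{\mathbf{z}}(t) = \rho\,\operatorname{var}(s)^{-1/2}\left(\mathbf{C}_{sz}\,\mathbf{C}_{zz}^{-1}\,\mathbf{C}_{sz}^{T}\right)^{-1/2}\mathbf{C}_{sz}^{T}\,s(t) \tag{23}$$

$$\hat{\mathbf{z}}(t) = \operatorname{var}(s)^{-1}\,\mathbf{C}_{sz}^{T}\,s(t), \tag{24}$$

where the last step uses the canonical correlation $\rho = \operatorname{var}(s)^{-1/2}\left(\mathbf{C}_{sz}\,\mathbf{C}_{zz}^{-1}\,\mathbf{C}_{sz}^{T}\right)^{1/2}$. This is identical to the estimate obtained from individual linear regression equations for the components of $\mathbf{z}$ or, in other words, to the product of $s$ and the regression map $\mathbf{m}$ [Eq. (15)].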

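Using Eq. (8) and the fact that $s$ and the TEC $a$ are identical, one obtains for the SVD estimate

$$\hat{\mathbf{z}}(t) = \frac{\operatorname{cov}(s, b)}{\operatorname{var}(s)}\,s(t)\,\mathbf{q}, \tag{25}$$

which can be shown to be identical to the CCA and component-wise regression estimate in Eq. (24). Thus the regression map $\operatorname{var}(s)^{-1}\,\mathbf{C}_{sz}^{T}$ can be interpreted as the signal of $s$ in $\mathbf{z}$ from the CCA, the SVD, and the component-wise regression perspective.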
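The identity of the three estimates can be verified by comparing the coefficient vectors that multiply $s(t)$. The sketch below does this for a synthetic two-component field (chosen so that the covariance matrix can be inverted in closed form); the data and all names are assumptions for illustration only.

```python
import random

def cov(x, y):
    """Covariance of two equally long series (1/n normalization)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

# Hypothetical synthetic data: scalar s(t) and a two-component field z(t).
rng = random.Random(3)
n_t = 300
s = [rng.gauss(0.0, 1.0) for _ in range(n_t)]
z1 = [0.9 * a + rng.gauss(0.0, 1.0) for a in s]
z2 = [-0.3 * a + 0.5 * x + rng.gauss(0.0, 1.0) for a, x in zip(s, z1)]

var_s = cov(s, s)
c_sz = [cov(s, z1), cov(s, z2)]
c11, c12, c22 = cov(z1, z1), cov(z1, z2), cov(z2, z2)

# (i) Component-wise regression: one slope per component, i.e. the regression map.
coef_reg = [c / var_s for c in c_sz]

# (ii) SVD: singular vector q, TEC b = q^T z, coefficient cov(s, b) / var(s) * q.
norm_svd = sum(c * c for c in c_sz) ** 0.5
q_svd = [c / norm_svd for c in c_sz]
b_svd = [q_svd[0] * x + q_svd[1] * y for x, y in zip(z1, z2)]
coef_svd = [cov(s, b_svd) / var_s * q for q in q_svd]

# (iii) CCA: coefficient rho * u * q with the canonical pattern q.
det = c11 * c22 - c12 * c12
w = [(c22 * c_sz[0] - c12 * c_sz[1]) / det,
     (c11 * c_sz[1] - c12 * c_sz[0]) / det]            # C_zz^-1 C_sz^T
norm_cca = (c_sz[0] * w[0] + c_sz[1] * w[1]) ** 0.5
u = var_s ** -0.5
q_cca = [c / norm_cca for c in c_sz]
b_cca = [(w[0] * x + w[1] * y) / norm_cca for x, y in zip(z1, z2)]
rho = cov(s, b_cca) / (var_s * cov(b_cca, b_cca)) ** 0.5
coef_cca = [rho * u * q for q in q_cca]

print(coef_reg, coef_svd, coef_cca)
```

All three coefficient vectors agree to rounding error, so multiplying any of them by $s(t)$ gives the same estimated field.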

### c. Coupling strength between a time series and a time-dependent vector

CCA and SVD allow one to express the strength of the linear coupling between a time series and a time-dependent vector through the correlation of the time series and the TEC of the canonical pattern or singular vector. When the time series *s*(*t*) is estimated from the time-dependent vector **z**(*t*), multiple regression analysis also yields a measure for the strength of the coupling through the coefficient of multiple determination while, in the case when **z**(*t*) is estimated from *s*(*t*), component-wise regression analysis does not.

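When $s(t)$ is estimated from $\mathbf{z}(t)$, CCA and MLR are equivalent, and the coefficient of multiple determination is identical to the squared canonical correlation, which is given by

$$\rho^{2} = \operatorname{cor}^{2}(a, b) = \operatorname{cor}^{2}[s(t), \mathbf{v}^{T}\,\mathbf{z}(t)] \tag{26}$$

$$\rho^{2} = \operatorname{var}(s)^{-1}\,\mathbf{C}_{sz}\,\mathbf{C}_{zz}^{-1}\,\mathbf{C}_{sz}^{T} \tag{27}$$

due to the proportionality of $s$ and the TEC $a$, and the fact that $b$ is obtained by weighting $\mathbf{z}$ with the adjoint canonical pattern $\mathbf{v}$ given in Eq. (11).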

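SVD yields a different TEC $b$ and another value for the correlation $r$, namely

$$r = \operatorname{cor}(s, b) = \operatorname{cor}[s(t), \mathbf{q}^{T}\,\mathbf{z}(t)] \tag{28}$$

$$r = \frac{\mathbf{C}_{sz}\,\mathbf{C}_{sz}^{T}}{\left[\operatorname{var}(s)\,\mathbf{C}_{sz}\,\mathbf{C}_{zz}\,\mathbf{C}_{sz}^{T}\right]^{1/2}}. \tag{29}$$

Here we have used the identity of patterns and weight vectors in SVD and Eq. (14). Given the same set of predictors, $r$ will be less than or equal to the canonical correlation $\rho$, as CCA maximizes the correlation between the TECs.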
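The two coupling measures can be compared in a small numerical sketch. It evaluates the closed-form expressions for the canonical correlation and for the SVD-based correlation on synthetic data with two predictors and checks them against correlations computed directly from the TECs; the data and names are hypothetical.

```python
import random

def cov(x, y):
    """Covariance of two equally long series (1/n normalization)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def corr(x, y):
    """Pearson correlation."""
    return cov(x, y) / (cov(x, x) * cov(y, y)) ** 0.5

# Hypothetical synthetic data with two correlated predictors.
rng = random.Random(6)
n_t = 400
s = [rng.gauss(0.0, 1.0) for _ in range(n_t)]
z1 = [0.7 * a + rng.gauss(0.0, 1.0) for a in s]
z2 = [0.3 * a + 0.8 * x + rng.gauss(0.0, 0.5) for a, x in zip(s, z1)]

var_s = cov(s, s)
c1, c2 = cov(s, z1), cov(s, z2)                        # C_sz
c11, c12, c22 = cov(z1, z1), cov(z1, z2), cov(z2, z2)  # C_zz

det = c11 * c22 - c12 * c12
w1 = (c22 * c1 - c12 * c2) / det                       # C_zz^-1 C_sz^T
w2 = (c11 * c2 - c12 * c1) / det

# Closed forms, cf. Eqs. (27) and (29).
rho = ((c1 * w1 + c2 * w2) / var_s) ** 0.5
czc = c1 * c1 * c11 + 2.0 * c1 * c2 * c12 + c2 * c2 * c22   # C_sz C_zz C_sz^T
r = (c1 * c1 + c2 * c2) / (var_s * czc) ** 0.5

# Direct computation; correlations do not depend on the positive scaling
# of the weights, so unnormalized weights are sufficient here.
b_cca = [w1 * x + w2 * y for x, y in zip(z1, z2)]      # proportional to v^T z(t)
b_svd = [c1 * x + c2 * y for x, y in zip(z1, z2)]      # proportional to q^T z(t)
rho_direct = corr(s, b_cca)
r_direct = corr(s, b_svd)

print(rho, rho_direct, r, r_direct)
```

The inequality between the two values follows from the Cauchy-Schwarz inequality; how large the gap is depends on how strongly the predictor covariance structure departs from a multiple of the identity.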

When $\mathbf{z}(t)$ is estimated from $s(t)$ by component-wise regression, the strength of the coupling between $s$ and the individual components $z_i$ of $\mathbf{z}$ can be expressed by means of local correlations, but no measure for the overall strength of the coupling is available. If one considers the entire regression map rather than the individual regression coefficients, one can calculate its TEC in the two ways described above, either according to CCA by weighting $\mathbf{z}$ with the adjoint canonical pattern or according to SVD by weighting $\mathbf{z}$ proportional to the regression map itself. Despite the fact that the CCA and SVD estimates for $\mathbf{z}$ are identical, this leads again to the two different correlations given in Eqs. (27) and (29).

Estimating *s*(*t*) from **z**(*t*) and estimating **z**(*t*) from *s*(*t*) are very different problems, and the formulation in terms of MLR or local regression analysis is indeed quite different. However, when considered as a one-dimensional case of CCA or SVD the problem becomes more symmetrical, and the strength of the linear link between *s*(*t*) and **z**(*t*) is the same regardless of whether *s*(*t*) is estimated from **z**(*t*) or vice versa.

As mentioned above, using a large number of predictors in MLR or CCA may lead to problems related to the inversion of $\mathbf{C}_{zz}$ and to overfitting. Therefore it is common practice to reduce the number of predictors by PC prefiltering. Note that a consistent prefiltered CCA approach would include a regression map derived from prefiltered data, and thus it is difficult to obtain the CCA TEC of the unfiltered regression map. The SVD-based correlation $r$ can be used without prefiltering as an alternative measure of the strength of the coupling.

## 4. Example: The relationship between the AOI and the temperature field

We now compare the SVD and PC-prefiltered CCA approaches for linking a time series to a time-dependent vector in a typical climatological application. The time series *s*(*t*) is given by the January AOI calculated as PC1 of 1948–2002 January sea level pressure (SLP) means between 20° and 85°N from the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis (Kalnay et al. 1996; Kistler et al. 2001). The vector **z**(*t*) represents the spatial field of January means of 850-hPa reanalysis temperature ($T_{850}$) given on a 2.5° × 2.5° grid between 20° and 85°N when regression maps or the SVD AOI estimate are calculated, and a PC-filtered version of this field for the PC-prefiltered CCA calculations. As mentioned in the introduction, estimating the AOI temperature signal has been part of several papers that have investigated circulation-induced temperature changes, while estimating the variability of the AOI or of other dominant circulation modes from the temperature field is, for instance, relevant in the context of proxy-based circulation reconstructions (e.g., Cook et al. 2002; D’Arrigo et al. 2003; Jones and Widmann 2003).

The AOI was estimated from the unfiltered temperature field by means of SVD according to Eq. (21), and by MLR or equivalently CCA according to Eq. (18) after PC prefiltering with a varying number of retained PCs. For cross validation the dataset was split into two parts, from 1948 to 1975 and from 1976 to 2002. The first two rows of panels in Fig. 1 refer to these two periods. The left-hand column shows regression maps, which give the temperature change for a positive change in the AOI of one standard deviation (left-hand color bar applies). As the weights for the grid cell temperatures used in the SVD-based AOI estimate are proportional to the regression map, the left-hand column can also be interpreted as these statistical weights (right-hand color bar applies). The effective weights for grid cell temperatures that result from the CCA equation are shown in the middle column for two retained PCs, which is the lowest number for which the reconstruction has good skill and in the right-hand column, as an example for a relatively large number, for 12 retained PCs. The estimate *ŝ*(*t*) does not change when **z**(*t*) is represented with respect to a different basis, and thus the effective weights can be expressed as the sum of the products of the retained EOFs with the statistical weights for the retained PCs that are obtained when the PCs are used as predictors for a CCA-based or, equivalently, MLR-based AOI estimate.
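The split-sample cross validation described above can be sketched as follows for the SVD-based estimate. This is not the code used for the paper: the data are synthetic, the helper names `fit_svd` and `apply_svd` are hypothetical, and the detrending step and the PC-prefiltered CCA variant are left out.

```python
import random

def mean(x):
    return sum(x) / len(x)

def cov(x, y):
    """Covariance of two equally long series (1/n normalization)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def corr(x, y):
    return cov(x, y) / (cov(x, x) * cov(y, y)) ** 0.5

def fit_svd(s, z):
    """Fit the SVD estimate on a calibration period.

    Returns the calibration means and weights proportional to C_sz,
    scaled as in Eq. (21)."""
    c_sz = [cov(s, zi) for zi in z]
    cc = sum(c * c for c in c_sz)
    czc = sum(ci * cov(zi, zj) * cj
              for ci, zi in zip(c_sz, z) for cj, zj in zip(c_sz, z))
    weights = [cc / czc * c for c in c_sz]
    return mean(s), [mean(zi) for zi in z], weights

def apply_svd(model, z):
    """Apply a fitted model to a field z[i][t] from another period."""
    s_mean, z_means, weights = model
    return [s_mean + sum(w * (zi[t] - zm)
                         for w, zi, zm in zip(weights, z, z_means))
            for t in range(len(z[0]))]

# Hypothetical synthetic data standing in for an index and a gridded field.
rng = random.Random(4)
n_t = 200
s = [rng.gauss(0.0, 1.0) for _ in range(n_t)]
z = [[c * a + rng.gauss(0.0, 0.5) for a in s] for c in (1.0, -0.7, 0.4)]

half = n_t // 2
early = (s[:half], [zi[:half] for zi in z])
late = (s[half:], [zi[half:] for zi in z])

# Weights fitted in one period are applied to the other period, and vice versa.
s_hat_early = apply_svd(fit_svd(*late), early[1])
s_hat_late = apply_svd(fit_svd(*early), late[1])
s_hat = s_hat_early + s_hat_late

cv_corr = corr(s, s_hat)
print(cv_corr)
```

Every value of `s_hat` is obtained from weights fitted on the other half of the record, so `cv_corr` measures skill on independent data rather than goodness of fit.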

As the grid cells do not represent equal areas, the data were area weighted. For the SVD-based estimate this was done by weighting the temperature field with the cosine of the grid cell latitudes before projecting it onto the regression map derived from the unweighted data [which is equivalent to applying Eq. (21) with **z**(*t*) as the temperature field weighted with the square root of the cosine of the latitude]. For the CCA-based estimate the area weighting was included in the PCs by weighting the temperature field with the square root of the cosine of the latitude prior to performing the principal component analysis. The weight patterns in Fig. 1 do not include this area weighting. They are correct for an equal-area grid (up to an overall scaling factor that depends on the number of grid cells). Temperature data on a non-equal-area grid have to be weighted proportional to the area size and then multiplied by the displayed weights in order to obtain the correct AOI estimate (again, up to a scaling factor that depends on the number of grid cells). Regression maps and weights were calculated from detrended data and then applied to the undetrended data to obtain the AOI estimates.
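The equivalence of the two weighting variants for the SVD-based estimate can be checked with a few lines of Python. The sketch uses four hypothetical grid-cell latitudes and synthetic data; it compares only the projected time series, since the subsequent scaling is fixed by regression in either case.

```python
import math
import random

def cov(x, y):
    """Covariance of two equally long series (1/n normalization)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def regression_map(s, field):
    """Regression coefficient of each field component on s."""
    var_s = cov(s, s)
    return [cov(s, zi) / var_s for zi in field]

lats = [20.0, 45.0, 70.0, 85.0]   # hypothetical grid-cell latitudes (degrees)
cos_lat = [math.cos(math.radians(phi)) for phi in lats]
sqrt_cos = [c ** 0.5 for c in cos_lat]

# Hypothetical synthetic data, stored component-wise as z[i][t].
rng = random.Random(5)
n_t = 120
s = [rng.gauss(0.0, 1.0) for _ in range(n_t)]
z = [[c * a + rng.gauss(0.0, 1.0) for a in s] for c in (0.2, 0.6, 0.9, 0.5)]

# Variant A: weight the field with cos(lat) and project it onto the
# regression map derived from the unweighted data.
m = regression_map(s, z)
tec_a = [sum(mi * ci * zi[t] for mi, ci, zi in zip(m, cos_lat, z))
         for t in range(n_t)]

# Variant B: weight the field with sqrt(cos(lat)), derive the regression map
# from the weighted field, and project the weighted field onto that map.
z_w = [[w * v for v in zi] for w, zi in zip(sqrt_cos, z)]
m_w = regression_map(s, z_w)
tec_b = [sum(mi * zi[t] for mi, zi in zip(m_w, z_w)) for t in range(n_t)]

print(max(abs(a - b) for a, b in zip(tec_a, tec_b)))
```

The agreement arises because weighting a component with the square root of the cosine scales its regression coefficient by the same factor, so the product of map and field carries the full cosine weight.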

The regression maps or SVD weights are similar to the CCA weights with two PCs retained, while the CCA weights with 12 retained PCs are in some areas substantially different from the two PC version. Similar differences between the two fitting periods occur in all three weight patterns. The NCEP–NCAR AOI and the various estimates are shown in the lower panel of Fig. 1, with the estimates for 1948–75 being based on the weights derived from the period 1976–2002 and vice versa. All estimates are in reasonable agreement with the true AOI. A detailed assessment of the skill in terms of correlation, rmse, and bias is given in Fig. 2. Solid lines refer to the reconstructions for independent data as presented in Fig. 1; dashed lines refer to the skill within the fitting periods. The figure shows the skill of the CCA estimates for 2 to 22 retained PCs and, as a horizontal line, the skill of the SVD estimates. The correlations for the cross-validated CCA estimate for less than seven retained PCs are slightly lower and the rmse slightly higher than for the SVD estimate, and practically identical when more PCs are retained. The number of effective degrees of freedom derived from the eigenvalue spectrum of the temperature covariance matrix (Bretherton et al. 1999) is about nine, and thus close to the number of retained PCs after which the cross-validation skill levels off. For all numbers of retained PCs the cross-validated CCA estimate has a higher bias than the SVD estimate. The overestimation of the true skill on independent data by the correlations and rmse calculated from the fitting data is for most numbers of retained PCs higher for CCA than for SVD (the bias during the fitting period is zero by definition). 
Note that the small differences between the SVD weights and the CCA weights with two retained PCs do noticeably affect the skill and that the substantial differences between the CCA weights for different numbers of retained PCs affect the skill during the fitting period more than the skill on independent data.
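For reference, the three skill measures used in this comparison can be written compactly as below. The definition of the bias as the mean of the estimate minus the mean of the true series is an assumption, as the text does not spell it out; the function names and the toy numbers are likewise illustrative.

```python
def mean(x):
    return sum(x) / len(x)

def correlation(truth, estimate):
    """Pearson correlation between the true and the estimated series."""
    mt, me = mean(truth), mean(estimate)
    cov_te = sum((t - mt) * (e - me) for t, e in zip(truth, estimate))
    var_t = sum((t - mt) ** 2 for t in truth)
    var_e = sum((e - me) ** 2 for e in estimate)
    return cov_te / (var_t * var_e) ** 0.5

def rmse(truth, estimate):
    """Root-mean-square error of the estimate."""
    return mean([(e - t) ** 2 for t, e in zip(truth, estimate)]) ** 0.5

def bias(truth, estimate):
    """Mean of the estimate minus mean of the truth (assumed definition)."""
    return mean(estimate) - mean(truth)

# Toy example: an estimate that is perfect apart from a constant offset.
truth = [1.0, 2.0, 3.0, 4.0]
estimate = [2.0, 3.0, 4.0, 5.0]
print(correlation(truth, estimate), rmse(truth, estimate), bias(truth, estimate))
```

In the toy example the constant offset leaves the correlation at 1 but produces an rmse and a bias of 1, which shows why the three measures are reported separately.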

## 5. Summary and discussion

It was shown that regression maps calculated by regressing the components of a time-dependent vector **z**(*t*) on a time series *s*(*t*) are proportional to canonical patterns and singular vectors, and that CCA, SVD, and component-wise regressions lead to identical estimates for **z**(*t*) from *s*(*t*), whereas the estimate of *s*(*t*) from **z**(*t*) depends on whether CCA (or equivalently MLR) or SVD is used. The definition of the TEC of a regression map, and as a consequence the correlation between the TEC and *s*(*t*), depends on whether the CCA or the SVD perspective is adopted. Other authors have calculated the TEC of the regression map by orthogonal projection of **z**(*t*) onto the regression map, which in this paper was shown to be equivalent to performing SVD, while the calculation of the CCA TEC involves the adjoint pattern.

Although CCA minimizes by definition the mean square difference between *s*(*t*) and its estimate from **z**(*t*), it appears difficult to decide from a theoretical standpoint whether the CCA or the SVD approach yields better estimates for *s*(*t*) when applied to independent data. In the practical example considered in this paper a very similar skill on independent data was found for CCA and SVD when a sufficient number of PCs was retained in the prefiltering for CCA. Skills calculated from the fitting data overestimated the skill on independent data more strongly in the CCA (or MLR) than in the SVD model.

Calculating the TEC of the regression map by orthogonal projection and then using it as the predictor in a linear regression for the time series *s*(*t*) is thus conceptually well defined because it is equivalent to performing SVD, and may in climatological applications yield estimates for *s*(*t*) that have similar skill to those obtained from CCA or MLR. SVD has the advantage that the signal of *s*(*t*) in **z**(*t*), which is given by the regression map, and the statistical weights for **z**(*t*) used to estimate *s*(*t*) are proportional, whereas in the CCA approach the signal is given by the regression map but the statistical weights are proportional to the adjoint pattern. Therefore the SVD perspective may be particularly useful when both directions, estimating **z**(*t*) from *s*(*t*) and vice versa, are of interest. When in this case CCA is used, the signal pattern and its adjoint are relevant, which may complicate the discussion. Moreover, one would have to address the issue that it is often natural to define signals based on unfiltered data, but CCA often requires PC prefiltering.

This research was supported by the Helmholtz Society under the KIHZ project (Klima in Historischen Zeiten, climate in historical times) and by the Federal Ministry of Education and Research under the DEKLIM program (Deutsches Klimaforschungsprogramm). The author thanks C. B. Bretherton, U. Callies, Y. Dmitriev, J. M. Jones, C. Matulla, H. von Storch, E. Zorita, and two anonymous reviewers for valuable comments.

## REFERENCES

Bretherton, C. S., C. Smith, and J. M. Wallace, 1992: An intercomparison of methods for finding coupled patterns in climate data. *J. Climate*, **5**, 541–560.

Bretherton, C. S., M. Widmann, V. Dymnikov, J. Wallace, and I. Bladé, 1999: The effective number of spatial degrees of freedom of a time-varying field. *J. Climate*, **12**, 1990–2009.

Cherry, S., 1996: Singular value decomposition analysis and canonical correlation analysis. *J. Climate*, **9**, 2003–2009.

Cook, E. R., R. D. D’Arrigo, and M. E. Mann, 2002: A well-verified, multiproxy reconstruction of the winter North Atlantic Oscillation index since A.D. 1400. *J. Climate*, **15**, 1754–1764.

D’Arrigo, R., E. R. Cook, M. E. Mann, and G. C. Jacoby, 2003: Tree-ring reconstructions of temperature and sea-level pressure variability associated with the warm-season Arctic Oscillation since AD 1650. *Geophys. Res. Lett.*, **30**, 1549, doi:10.1029/2003GL017250.

Glahn, H. R., 1968: Canonical correlation and its relationship to discriminant analysis and multiple regression. *J. Atmos. Sci.*, **25**, 23–31.

Hurrell, J. W., 1996: Influence of variations in extratropical wintertime teleconnections on Northern Hemisphere temperature. *Geophys. Res. Lett.*, **23**, 665–668.

Jones, J. M., and M. Widmann, 2003: Instrument- and tree-ring-based estimates of the Antarctic Oscillation. *J. Climate*, **16**, 3511–3524.

Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. *Bull. Amer. Meteor. Soc.*, **77**, 437–471.

Kistler, R., and Coauthors, 2001: The NCEP–NCAR 50-Year Reanalysis: Monthly means CD-ROM and documentation. *Bull. Amer. Meteor. Soc.*, **82**, 247–267.

Loschnigg, J., G. A. Meehl, P. J. Webster, J. M. Arblaster, and G. P. Compo, 2003: The Asian monsoon, the tropospheric biennial oscillation, and the Indian Ocean zonal mode in the NCAR CSM. *J. Climate*, **16**, 1617–1642.

Qian, Y. F., Y. Q. Zheng, Y. Zhang, and M. Miao, 2003: Responses of China’s summer monsoon climate to snow anomaly over the Tibetan Plateau. *Int. J. Climatol.*, **23**, 593–613.

Quadrelli, R., V. Pavan, and F. Molteni, 2001: Wintertime variability of Mediterranean precipitation and its links with large-scale circulation anomalies. *Climate Dyn.*, **17**, 457–466.

Reichert, B. K., L. Bengtsson, and J. Oerlemans, 2001: Midlatitude forcing mechanisms for glacier mass balance investigated using general circulation models. *J. Climate*, **14**, 3767–3784.

Shindell, D., D. Rind, N. Balachandran, J. Lean, and P. Lonergan, 1999: Solar cycle variability, ozone, and climate. *Science*, **284**, 305–308.

Thompson, D. W. J., and J. M. Wallace, 1998: The Arctic Oscillation signature in wintertime geopotential height and temperature. *Geophys. Res. Lett.*, **25**, 1297–1300.

Thompson, D. W. J., J. M. Wallace, and G. Hegerl, 2000: Annular modes in the extratropical circulation. Part II: Trends. *J. Climate*, **13**, 1018–1036.

von Storch, H., and F. W. Zwiers, 1999: *Statistical Analysis in Climate Research*. Cambridge University Press, 484 pp.

Wallace, J. M., Y. Zhang, and J. A. Renwick, 1995: Dynamic contribution to hemispheric mean temperature trends. *Science*, **270**, 780–783.

Waple, A. N., M. Mann, and R. S. Bradley, 2002: Long-term patterns of solar irradiance forcing in model experiments and proxy based surface temperature reconstructions. *Climate Dyn.*, **18**, 563–578.

Widmann, M., C. S. Bretherton, and E. P. Salathé Jr., 2003: Statistical precipitation downscaling over the northwestern United States using numerically simulated precipitation as a predictor. *J. Climate*, **16**, 799–816.