## 1. Introduction

It is well established that some components in the climate system are more predictable than others. For instance, large-scale structures tend to be more predictable than small-scale structures (Shukla 1981); sea surface temperatures in the tropical Pacific tend to be more predictable than those in the Atlantic (Schneider et al. 2003); rainfall in the tropical Pacific tends to be more predictable than rainfall in Europe (Palmer et al. 2004). It is natural, then, to seek the *most* predictable components. However, attempts to identify maximally predictable components based on such methods as principal component analysis and singular value decomposition generally lead to different results. The purpose of this paper is to suggest a framework in which familiar methods of predictability analysis lead to *consistent* results.

A clue as to how diverse statistical methods can produce consistent results lies in the fact that many methods depend implicitly or explicitly on a norm. For instance, principal component analysis determines components that maximize variance, as defined by some norm. Singular vector decomposition depends on *two* norms: one for measuring “response” and another for constraining “initial condition.” Without a firm basis for choosing these norms, variance analysis could generate virtually any set of vectors by a suitable choice of norm.

One approach to these problems is to define predictability precisely and then to choose norms to ensure consistency with the associated measure of predictability. Unfortunately, no universally accepted measure of predictability exists. Therefore, this approach merely replaces the problem of the choice of norm with the problem of the choice of predictability measure.

Surprisingly, there is a path that gives a reasonable way out. The first step is to restrict attention to predictability measures that are invariant to affine transformations and monotonically related to forecast uncertainty. These measures are consistent in the following sense: measuring the predictability of the same system in two different coordinate systems gives the same result. We then show that components that maximize these measures are *independent of the details of the measure*. This result, proven in section 2, explains why different measures of predictability, such as signal-to-noise ratio, anomaly correlation, predictive information, and the Mahalanobis error all have the same maximally predictable component (DelSole and Tippett 2007). We then show that this component can be obtained by applying principal component analysis to transformed forecast variables. The transformation, called the whitening transformation, can be interpreted as specifying the norm in empirical orthogonal function (EOF) analysis. This result is generalized in section 3 to specify the norms in singular vector analysis to obtain the same predictable components. The norm for measuring forecast uncertainty has not appeared in previous predictability studies, but nonetheless these norms have several attractive properties that make their use compelling. Several components of interest to predictability are illustrated with an empirical stochastic model for sea surface temperatures in section 4. This paper concludes with a summary and discussion of results.

## 2. Predictable components

In this section we define a class of predictability measures and then show that components that maximize these measures are independent of the detailed form of the measure. The first step to defining predictability is to recognize that no forecast is complete without a description of its uncertainty in the form of a probability distribution. Following DelSole and Tippett (2007), the *forecast distribution* is defined as the conditional distribution of the state given antecedent observations of the system, and the *climatological distribution* is the unconditional distribution of the state. A system is deemed unpredictable if the forecast and climatological distributions are identical. Thus, a measure of predictability should indicate predictability only if the forecast and climatological distributions differ.

To ensure consistency, we propose that if the predictability of a system is measured in two different coordinate systems, then the measure should be the same. At the very least, then, the measure should be invariant to affine transformations, that is, to translations and to nonsingular, linear transformations. This property will be called the *invariance* property.

Finally, we consider measures of predictability that increase if and only if the uncertainty in the forecast decreases. This principle requires defining uncertainty. However, for normal distributions, any reasonable measure of uncertainty is an increasing function of variance. Thus, we assume that the predictability of normal distributions increases if and only if the forecast variance decreases, holding all other parameters constant.

Many measures of predictability satisfy the above properties, including signal-to-noise ratio, anomaly correlation, predictive information, and Mahalanobis error. The Ω index of Koster et al. (2000) also satisfies the above properties. However, mean square error does not satisfy the above properties because it is not invariant to linear transformation. Nevertheless, we retain linear invariance to ensure *consistency*. Similarly, not all measures increase if and only if forecast uncertainty decreases. For instance, measures of the “distance” between distributions, such as relative entropy (Kleeman 2002) or Bhattacharyya distance (Mardia et al. 1979), will change if the mean changes, even for constant forecast uncertainty. However, components that optimize the restricted class of measures can provide useful lower bounds on other measures.
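The invariance property can be illustrated numerically. The following is a minimal sketch, assuming NumPy and synthetic covariance matrices; the helper names `random_spd` and `variance_ratios`, the matrix `A`, and all numerical values are hypothetical and are not taken from this paper. It checks that the stationary values of the ratio of forecast variance to climatological variance are unchanged by a nonsingular linear transformation, whereas the total forecast variance (used here as a stand-in for mean square error) is not.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_spd(n, rng):
    """Return a random symmetric positive-definite n x n matrix (illustrative only)."""
    a = rng.standard_normal((n, n))
    return a @ a.T + n * np.eye(n)

def variance_ratios(sigma_f, sigma_c):
    """Sorted eigenvalues of inv(sigma_c) @ sigma_f: stationary values of the variance ratio."""
    return np.sort(np.linalg.eigvals(np.linalg.solve(sigma_c, sigma_f)).real)

n = 4
sigma_c = random_spd(n, rng)            # synthetic climatological covariance
sigma_f = 0.5 * random_spd(n, rng)      # synthetic forecast covariance

# Hypothetical change of coordinates: upper triangular with diagonal 1, 2, 3, 4 (nonsingular).
A = np.triu(np.ones((n, n))) + np.diag(np.arange(n, dtype=float))

ratios_before = variance_ratios(sigma_f, sigma_c)
ratios_after = variance_ratios(A @ sigma_f @ A.T, A @ sigma_c @ A.T)

mse_before = np.trace(sigma_f)              # total forecast variance, original coordinates
mse_after = np.trace(A @ sigma_f @ A.T)     # total forecast variance, transformed coordinates

print("variance ratios invariant:", np.allclose(ratios_before, ratios_after))
print("total variance before/after:", mse_before, mse_after)
```

The variance ratios agree in both coordinate systems because the two matrices whose eigenvalues are compared are similar, while the trace depends on the choice of coordinates.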

Let $\boldsymbol{\nu}$ be the state of the system and $\mathbf{q}$ be a projection vector. We seek the projection vector $\mathbf{q}$ such that the inner product $\mathbf{q}^T\boldsymbol{\nu}$ optimizes predictability, where the superscript $T$ denotes the transpose. Let the forecast distribution have mean $\boldsymbol{\mu}_f$ and covariance matrix $\boldsymbol{\Sigma}_f$, and the climatological distribution have mean $\boldsymbol{\mu}_c$ and covariance matrix $\boldsymbol{\Sigma}_c$. Since the variables are normally distributed, any linear combination of them also is normal. Thus, the climatological and forecast distributions of the projected variable $\mathbf{q}^T\boldsymbol{\nu}$ have the following scalar means and variances:

$$
\mu_f = \mathbf{q}^T\boldsymbol{\mu}_f, \qquad \sigma_f^2 = \mathbf{q}^T\boldsymbol{\Sigma}_f\mathbf{q}, \qquad \mu_c = \mathbf{q}^T\boldsymbol{\mu}_c, \qquad \sigma_c^2 = \mathbf{q}^T\boldsymbol{\Sigma}_c\mathbf{q}. \tag{1}
$$

Here $\mu_f$ and $\sigma_f^2$ measure the signal and noise, respectively. The variable’s predictability depends only on its forecast and climatological distributions, which are described completely by the parameters $\mu_f$, $\mu_c$, $\sigma_f^2$, and $\sigma_c^2$. By the invariance property, the projected variable can be translated and rescaled without altering the value of predictability. Accordingly, we standardize the variable to have zero mean and unit variance under the climatological distribution. The mean and variance of the standardized forecast become $(\mu_f - \mu_c)/\sigma_c$ and $\sigma_f^2/\sigma_c^2$, respectively. By the property that predictability increases if and only if forecast uncertainty decreases, the parameter $(\mu_f - \mu_c)/\sigma_c$ can be dropped, because it does not affect uncertainty. Thus, maximum predictability can be found by minimizing $\sigma_f^2/\sigma_c^2$. The projection vector that minimizes $\sigma_f^2/\sigma_c^2$ will be called a *predictable component*, following the equivalent usage of Déqué (1988), Renwick and Wallace (1995), and Schneider and Griffies (1999). Without loss of generality, we hereafter assume $\boldsymbol{\mu}_c = \mathbf{0}$.

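By definition, the predictable components minimize $\sigma_f^2/\sigma_c^2$ and hence maximize the discrepancy between forecast and climatological spreads. The ratio of spreads can be written as

$$
\frac{\sigma_f^2}{\sigma_c^2} = \frac{\mathbf{q}^T\boldsymbol{\Sigma}_f\mathbf{q}}{\mathbf{q}^T\boldsymbol{\Sigma}_c\mathbf{q}} = \frac{\mathbf{u}^T\tilde{\boldsymbol{\Sigma}}_f\mathbf{u}}{\mathbf{u}^T\mathbf{u}}, \tag{2}
$$

where

$$
\mathbf{u} = \boldsymbol{\Sigma}_c^{1/2T}\mathbf{q}, \tag{3}
$$

$\tilde{\boldsymbol{\Sigma}}_f = \boldsymbol{\Sigma}_c^{-1/2}\boldsymbol{\Sigma}_f\boldsymbol{\Sigma}_c^{-1/2T}$, and $\boldsymbol{\Sigma}_c^{1/2}$ denotes the matrix square root of $\boldsymbol{\Sigma}_c$, which satisfies $\boldsymbol{\Sigma}_c = \boldsymbol{\Sigma}_c^{1/2}\boldsymbol{\Sigma}_c^{1/2T}$. It is well known that if the eigenvectors of $\tilde{\boldsymbol{\Sigma}}_f$ are ordered in ascending order of eigenvalues, then the first eigenvector minimizes the right-hand side of (2), the second minimizes the right-hand side of (2) subject to being orthogonal to the first, and so on (Noble and Daniel 1988, theorem 10.28). We call the above procedure *predictable component analysis*. This procedure is equivalent to the procedures proposed independently by Déqué (1988) and Schneider and Griffies (1999). Following Renwick and Wallace (1995), this technique will be denoted PrCA, not to be confused with principal component analysis (PCA).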
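The procedure just described can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it assumes NumPy, uses the Cholesky factor as the matrix square root of the climatological covariance (any factor whose product with its own transpose recovers that covariance would serve), and applies it to synthetic covariance matrices. The function name `predictable_components` and all data are hypothetical.

```python
import numpy as np

def predictable_components(sigma_f, sigma_c):
    """Return variance ratios (ascending) and projection vectors q (as columns)."""
    # Cholesky factor L satisfies sigma_c = L @ L.T, so it serves as a matrix square root.
    L = np.linalg.cholesky(sigma_c)
    L_inv = np.linalg.inv(L)
    # Whitened forecast covariance: L^{-1} sigma_f L^{-T}.
    sigma_f_white = L_inv @ sigma_f @ L_inv.T
    # eigh returns eigenvalues in ascending order, so column 0 is the most predictable.
    ratios, u = np.linalg.eigh(sigma_f_white)
    # Invert u = L^T q to recover projection vectors in the original coordinates.
    q = L_inv.T @ u
    return ratios, q

rng = np.random.default_rng(1)
b = rng.standard_normal((5, 5))
sigma_c = b @ b.T + 5 * np.eye(5)        # synthetic climatological covariance
sigma_f = sigma_c - 0.5 * b @ b.T        # synthetic forecast covariance with smaller spread

ratios, q = predictable_components(sigma_f, sigma_c)
q1 = q[:, 0]
ratio_q1 = (q1 @ sigma_f @ q1) / (q1 @ sigma_c @ q1)
print("smallest variance ratio:", ratios[0], "ratio attained by q1:", ratio_q1)
```

The leading projection vector attains the smallest eigenvalue of the whitened forecast covariance, and no other direction gives a smaller ratio of forecast to climatological variance.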

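Since $\tilde{\boldsymbol{\Sigma}}_f$ is a positive-definite, symmetric matrix, its eigenvectors form a complete set that satisfies the relations $\mathbf{u}_j^T\mathbf{u}_k = 0$ and $\mathbf{u}_j^T\tilde{\boldsymbol{\Sigma}}_f\mathbf{u}_k = 0$ for all $j \neq k$. These relations imply that the covariance between any two projected variables vanishes; that is, if $j \neq k$, then

$$
[\mathbf{q}_j^T\boldsymbol{\nu}'\boldsymbol{\nu}'^T\mathbf{q}_k] = \mathbf{q}_j^T\boldsymbol{\Sigma}_f\mathbf{q}_k = \mathbf{u}_j^T\tilde{\boldsymbol{\Sigma}}_f\mathbf{u}_k = 0,
\qquad
\langle\mathbf{q}_j^T\boldsymbol{\nu}\boldsymbol{\nu}^T\mathbf{q}_k\rangle = \mathbf{q}_j^T\boldsymbol{\Sigma}_c\mathbf{q}_k = \mathbf{u}_j^T\mathbf{u}_k = 0,
$$

where $\boldsymbol{\nu}' = \boldsymbol{\nu} - \boldsymbol{\mu}_f$, and the square brackets [] and angle brackets 〈〉 denote an expectation with respect to the forecast and climatological distributions, respectively. The above orthogonality properties imply that the variables are uncorrelated with respect to both the forecast and climatological distributions. Accordingly, the predictable components define an uncorrelated set of components such that the first maximizes predictability, the second maximizes predictability subject to being uncorrelated with the first, and so on.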

### a. Predictable components as EOFs

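With respect to the forecast distribution, the EOFs of the *whitened* variable

$$
\tilde{\boldsymbol{\nu}} = \boldsymbol{\Sigma}_c^{-1/2}\boldsymbol{\nu}
$$

are the eigenvectors of the covariance matrix $\tilde{\boldsymbol{\Sigma}}_f$, which in turn are the predictable components. That the predictable components are the EOFs of whitened variables also was noted by Schneider and Griffies (1999). The variable $\tilde{\boldsymbol{\nu}}$ is said to be whitened because, following Fukunaga (1990, p. 28), its climatological covariance matrix equals the identity matrix:

$$
\langle\tilde{\boldsymbol{\nu}}\tilde{\boldsymbol{\nu}}^T\rangle = \boldsymbol{\Sigma}_c^{-1/2}\boldsymbol{\Sigma}_c\boldsymbol{\Sigma}_c^{-1/2T} = \mathsf{I}.
$$

A whitening transformation can be constructed from the eigenvectors of $\boldsymbol{\Sigma}_c$, arranged columnwise, each multiplied by the inverse square root of the corresponding eigenvalue; the resulting whitened variable specifies the state in a normalized EOF space.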

It is critical to understand why the predictable components are the EOFs of the whitened forecast, and not the EOFs of the forecast itself. The leading eigenvector of $\boldsymbol{\Sigma}_f$ explains the most spread and hence the most uncertainty. However, uncertainty of a component does not measure predictability. Rather, predictability of a component depends on the amount of spread relative to its climatological spread. For instance, the component with the most uncertainty could be the most predictable, if its climatological spread were sufficiently large. The virtue of whitening is that any projection of whitened variables has unit climatological variance, and thus the forecast variance of such a projection immediately measures relative variance, or equivalently relative uncertainty.

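If the eigenvectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_K$ are normalized such that $\mathbf{u}_j^T\mathbf{u}_j = 1$ for all $j$, then $\tilde{\boldsymbol{\nu}}$ may be written as a linear combination of these eigenvectors as

$$
\tilde{\boldsymbol{\nu}} = \sum_{j=1}^{K}\mathbf{u}_j\left(\mathbf{u}_j^T\tilde{\boldsymbol{\nu}}\right). \tag{9}
$$

Each term in this sum is the projection $\mathbf{u}_j^T\tilde{\boldsymbol{\nu}}$ times a component $\mathbf{u}_j$. In addition, each term is independent of the others, and each successive term explains decreasing predictability. Note that the vectors $\mathbf{u}_j$ in (9) play the dual role of defining the projection vector and defining the component, because they are orthogonal. When the whitening transformation is inverted, the two roles are played by two distinct vectors. In particular, the above decomposition takes the form

$$
\boldsymbol{\nu} = \sum_{k=1}^{K}\mathbf{p}_k\left(\mathbf{q}_k^T\boldsymbol{\nu}\right),
$$

where $\mathbf{p}_k = \boldsymbol{\Sigma}_c^{1/2}\mathbf{u}_k$ is associated with a spatial structure, and the inner product $\mathbf{q}_k^T\boldsymbol{\nu}$, with $\mathbf{q}_k$ related to $\mathbf{u}_k$ by (3), gives the corresponding time series.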
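The distinction between spatial structures and projection vectors can be checked directly. The sketch below assumes NumPy, a Cholesky factor as the matrix square root, and synthetic covariances; the variable names are hypothetical. It builds the patterns and projection vectors from the eigenvectors of the whitened forecast covariance and verifies that they are biorthogonal, so that any state is recovered from its projected amplitudes.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
b = rng.standard_normal((n, n))
sigma_c = b @ b.T + n * np.eye(n)          # synthetic climatological covariance
sigma_f = 0.3 * b @ b.T + n * np.eye(n)    # synthetic forecast covariance

L = np.linalg.cholesky(sigma_c)            # sigma_c = L @ L.T
L_inv = np.linalg.inv(L)
_, u = np.linalg.eigh(L_inv @ sigma_f @ L_inv.T)   # eigenvectors of whitened forecast covariance

q = L_inv.T @ u        # projection vectors (columns q_k)
p = L @ u              # spatial structures (columns p_k)

nu = rng.standard_normal(n)     # an arbitrary state vector
amplitudes = q.T @ nu           # amplitudes q_k^T nu
nu_rebuilt = p @ amplitudes     # sum over k of p_k (q_k^T nu)

print("biorthogonal:", np.allclose(q.T @ p, np.eye(n)))
print("state recovered:", np.allclose(nu_rebuilt, nu))
```

Because the patterns and projection vectors differ unless the climatological covariance is the identity, plotting one in place of the other would misrepresent either the structure or the weighting.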

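The leading eigenvector of $\tilde{\boldsymbol{\Sigma}}_f$ is the vector $\mathbf{u}$ that minimizes the average $L_2$ norm difference between the whitened variable and its projection:

$$
\left[\,\left\|\tilde{\boldsymbol{\nu}}' - \mathbf{u}\mathbf{u}^T\tilde{\boldsymbol{\nu}}'\right\|^2\right] = \left[\left(\boldsymbol{\nu}' - \mathbf{p}\mathbf{q}^T\boldsymbol{\nu}'\right)^T\boldsymbol{\Sigma}_c^{-1}\left(\boldsymbol{\nu}' - \mathbf{p}\mathbf{q}^T\boldsymbol{\nu}'\right)\right],
$$

where $\tilde{\boldsymbol{\nu}}' = \boldsymbol{\Sigma}_c^{-1/2}\boldsymbol{\nu}'$, and $\mathbf{p}$ and $\mathbf{q}$ are defined as in the decomposition following (9). The latter expression implies that the predictable components minimize the difference between the state and its projection onto the predictable components, but with the norm of the difference measured with respect to the metric $\boldsymbol{\Sigma}_c$. This norm is called the *Mahalanobis norm* in data assimilation (Swinbank and Lahoz 2003). In general, transformation of a variable prior to variance-based analysis is equivalent to changing the norm in the analysis.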

### b. Average predictability

Predictable components characterize a forecast ensemble at an instant in time. We may also be interested in components that maximize the average predictability. We call the former *instantaneous* predictable components and the latter *average* predictable components.

Many measures of predictability are convex functions of $\sigma_f^2/\sigma_c^2$, including signal-to-noise ratio, anomaly correlation, predictive information, and the Mahalanobis error. By Jensen’s inequality (Cover and Thomas 1991), the average of these measures is bounded below by the measure evaluated at $\langle\sigma_f^2\rangle/\sigma_c^2$. By the property that predictability is a decreasing function of forecast uncertainty, minimizing $\langle\sigma_f^2\rangle/\sigma_c^2$ is equivalent to maximizing the lower bound. Since the component that minimizes $\sigma_f^2/\sigma_c^2$ is the trailing eigenvector of the whitened forecast covariance $\tilde{\boldsymbol{\Sigma}}_f$, we can surmise that the component that minimizes $\langle\sigma_f^2\rangle/\sigma_c^2$ is the trailing eigenvector of $\langle\tilde{\boldsymbol{\Sigma}}_f\rangle$. Thus, the trailing eigenvector of $\langle\tilde{\boldsymbol{\Sigma}}_f\rangle$ can be used to construct a lower bound on the maximum of the average predictability. If the measure is a linear function of $\sigma_f^2/\sigma_c^2$, as in the case of signal-to-noise ratio and anomaly correlation, then the lower bound is an exact equality. Orthogonality of the predictable components implies that the second predictable component maximizes the lower bound of the maximum average predictability, out of all components that are uncorrelated with the first, and so on.
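As a worked illustration of the bound, consider predictive information, which for univariate normal distributions is the difference between the climatological and forecast entropies. This is a sketch under that assumption; the symbol $P$ is introduced here for illustration only. Because the negative logarithm is convex, Jensen's inequality gives

$$
P = -\frac{1}{2}\log\frac{\sigma_f^2}{\sigma_c^2},
\qquad
\langle P\rangle = -\frac{1}{2}\left\langle\log\frac{\sigma_f^2}{\sigma_c^2}\right\rangle \;\geq\; -\frac{1}{2}\log\frac{\langle\sigma_f^2\rangle}{\sigma_c^2}.
$$

The right-hand side decreases monotonically with $\langle\sigma_f^2\rangle/\sigma_c^2$, so the projection that minimizes the average variance ratio maximizes this lower bound on the average predictive information.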

It is perhaps worth noting that if the forecast covariance $\boldsymbol{\Sigma}_f$ is constant, that is, independent of observation or initial condition, as occurs in the case of a linear, autonomous, stochastic model with stationary noise, then the instantaneous and average predictable components coincide.

### c. Signal EOFs, noise EOFs, and signal-to-noise EOFs

We now discuss the relation between predictable components and other components that can be derived from EOF methods. Let us call the EOFs of the forecast ensemble the *noise EOFs*, since they describe the variability of the forecast ensemble about the ensemble mean (recent examples include Yang et al. 1998 and Straus and Shukla 2002). An alternative approach is to compute the EOFs of the *ensemble mean* over time, which we call the *signal EOFs* and can be interpreted as describing the predictable part of a forecast (recent examples include Sutton et al. 2000; Straus and Shukla 2002; Peng and Kumar 2005). The relevance of these components to predictability defined here is not immediately evident because these components optimize absolute measures of spread or signal, whereas predictability depends on relative measures of spread or signal.

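The climatological covariance can be written as the sum of the signal covariance and the average forecast covariance,

$$
\boldsymbol{\Sigma}_c = \boldsymbol{\Sigma}_s + \langle\boldsymbol{\Sigma}_f\rangle,
$$

where $\boldsymbol{\Sigma}_s = \langle\boldsymbol{\mu}_f\boldsymbol{\mu}_f^T\rangle$. We call this relation the *signal–noise decomposition*, after DelSole and Tippett (2007). This equation implies that the average whitened forecast and signal covariances are related by

$$
\langle\tilde{\boldsymbol{\Sigma}}_f\rangle + \tilde{\boldsymbol{\Sigma}}_s = \mathsf{I},
$$

where $\tilde{\boldsymbol{\Sigma}}_s = \boldsymbol{\Sigma}_c^{-1/2}\boldsymbol{\Sigma}_s\boldsymbol{\Sigma}_c^{-1/2T}$. Consequently, the signal EOFs of the whitened variables (the eigenvectors of $\tilde{\boldsymbol{\Sigma}}_s$) are identical to the average noise EOFs, *but with reversed ordering*. Thus, the average predictable components may be obtained either as the signal EOFs of whitened variables or as the average noise EOFs of whitened variables.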

The equivalence between the average noise EOFs and signal EOFs demonstrated above holds only for whitened variables. In general, the noise and signal EOFs of the original forecast variables differ; that is, the component that explains the most signal differs from the component that explains the least noise. This discrepancy is problematic when one attempts to identify one of these components as “most predictable,” since there is no compelling reason for choosing one over the other. The whitening transformation removes this discrepancy. Furthermore, as shown in section 2a, the whitening transformation can be interpreted as changing the norm used to measure “variance” in the EOF calculation. These considerations imply that consistency of signal analysis and error analysis constrains the norm to be the Mahalanobis norm.
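The equivalence for whitened variables can be verified numerically. The following is a minimal sketch, assuming NumPy, synthetic signal and average noise covariances, and a climatological covariance equal to their sum; a Cholesky factor is used for whitening, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
s = rng.standard_normal((n, n))
f = rng.standard_normal((n, n))
sigma_s = s @ s.T                       # synthetic signal covariance
sigma_f_avg = f @ f.T + np.eye(n)       # synthetic average noise (forecast) covariance
sigma_c = sigma_s + sigma_f_avg         # assumed: climatology = signal + average noise

# Whitening operator W with W @ sigma_c @ W.T = I.
W = np.linalg.inv(np.linalg.cholesky(sigma_c))
signal_white = W @ sigma_s @ W.T
noise_white = W @ sigma_f_avg @ W.T

signal_vals, signal_eofs = np.linalg.eigh(signal_white)   # ascending eigenvalues
noise_vals, noise_eofs = np.linalg.eigh(noise_white)      # ascending eigenvalues

# Compare each signal EOF with the noise EOF at the mirrored position (sign is arbitrary).
overlap = np.abs(np.sum(signal_eofs * noise_eofs[:, ::-1], axis=0))
print("whitened covariances sum to identity:", np.allclose(signal_white + noise_white, np.eye(n)))
print("overlap of mirrored EOF pairs:", overlap)
```

In the original coordinates the same comparison generally fails, which is the discrepancy that the whitening transformation removes.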

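For a single forecast ensemble, the standardized signal of a component is $\mu_f/\sigma_c$, which can be optimized by fingerprint methods (DelSole and Tippett 2007). For an average over all forecast ensembles, the signal variance of the projection vector $\mathbf{q}$ is $\mathbf{q}^T\boldsymbol{\Sigma}_s\mathbf{q}$, and the average noise variance is $\mathbf{q}^T\langle\boldsymbol{\Sigma}_f\rangle\mathbf{q}$, in which case the signal-to-noise ratio is

$$
s = \frac{\mathbf{q}^T\boldsymbol{\Sigma}_s\mathbf{q}}{\mathbf{q}^T\langle\boldsymbol{\Sigma}_f\rangle\mathbf{q}} = \frac{\sigma_c^2 - \langle\sigma_f^2\rangle}{\langle\sigma_f^2\rangle},
$$

where the second equality follows from the signal–noise decomposition $\boldsymbol{\Sigma}_c = \boldsymbol{\Sigma}_s + \langle\boldsymbol{\Sigma}_f\rangle$. Since $s$ is a monotonic function of $\langle\sigma_f^2\rangle/\sigma_c^2$, optimizing signal-to-noise ratio is equivalent to optimizing the ratio of variances, hence signal-to-noise EOFs are identical to the predictable components that optimize (or bound) the average predictability. Thus, the Mahalanobis norm renders the signal EOFs and the average noise EOFs identical to each other, identical to the signal-to-noise EOFs, and identical to the average predictable components.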

## 3. Singular vectors

The singular value decomposition (SVD) of a matrix $\mathsf{G}$ is $\mathsf{G} = \mathsf{U}\mathsf{S}\mathsf{V}^T$, where the columns of the orthogonal matrices $\mathsf{U}$ and $\mathsf{V}$ are the *left and right singular vectors*, respectively, and the diagonal elements of $\mathsf{S}$ are the *singular values*. By convention, the singular values are ordered in descending order. A standard fact is that for mappings of the form $\mathbf{u} = \mathsf{G}\mathbf{w}$, where $\mathbf{w}$ is “initial condition” and $\mathbf{u}$ is “response,” the leading right singular vector gives the initial condition that maximizes $\mathbf{u}^T\mathbf{u}$ out of all vectors that satisfy $\mathbf{w}^T\mathbf{w} = 1$, the second right singular vector gives the initial condition that maximizes $\mathbf{u}^T\mathbf{u}$ *subject to being orthogonal to the first vector*, and so on.
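This property of the leading right singular vector is easy to confirm numerically. The sketch below assumes NumPy and a randomly generated matrix standing in for the mapping; the names `G`, `w_lead`, and `w_rand` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
G = rng.standard_normal((3, 3))          # hypothetical linear map u = G w
U, svals, Vt = np.linalg.svd(G)          # singular values are returned in descending order

w_lead = Vt[0]                           # leading right singular vector (unit norm)
response_lead = np.sum((G @ w_lead) ** 2)

# Compare against many random unit-norm initial conditions.
w_rand = rng.standard_normal((1000, 3))
w_rand /= np.linalg.norm(w_rand, axis=1, keepdims=True)
response_rand = np.sum((w_rand @ G.T) ** 2, axis=1)

print("response of leading singular vector:", response_lead)
print("largest response among random unit vectors:", response_rand.max())
```

The response to the leading right singular vector equals the square of the leading singular value, and none of the randomly drawn unit vectors exceeds it.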

In the discussion below, the state of the system at the initial time will be denoted by $\mathbf{i}$. The initial state is assumed to be estimated from observations by a data assimilation system. In practice, the assimilation assumes a normal distribution for the initial state $\mathbf{i}$. For ease of interpretation, we write the initial condition as $\mathbf{i} = \mathbf{a} + \mathbf{e}$, where $\mathbf{a}$ is the mean of the initial condition, often called the *analysis*, and $\mathbf{e}$ is the *analysis error* with zero mean and covariance matrix $\boldsymbol{\Sigma}_e$. The covariance of $\mathbf{a}$ over all initial conditions will be denoted $\boldsymbol{\Sigma}_a$.

### a. Deterministic systems

Lorenz considered a *tangent linear model* of the form $\boldsymbol{\nu}' = \mathsf{G}\mathbf{e}$, where $\mathsf{G}$ is a square matrix called the *propagator*, which depends on time, and $\mathbf{e}$ and $\boldsymbol{\nu}'$ denote initial and final “errors” (perturbations about a solution of the full dynamical system). Lorenz then effectively computed the noise EOFs for this system, which are the eigenvectors of the forecast covariance matrix $\mathsf{G}\mathsf{G}^T$ and hence the left singular vectors of $\mathsf{G}$. By analogy, the left singular vectors of $\mathsf{G}\boldsymbol{\Sigma}_e^{1/2}$ are the eigenvectors of the forecast covariance matrix $\mathsf{G}\boldsymbol{\Sigma}_e\mathsf{G}^T$, as previously noted by Ehrendorfer and Tribbia (1997). Therefore, the leading left singular vector of $\mathsf{G}\boldsymbol{\Sigma}_e^{1/2}$ is the leading noise EOF, and the corresponding right singular vector gives the initial condition that excites this noise EOF. To the extent that the noise EOFs optimally describe forecast errors, the singular vector method determines the smallest number of ensemble members with which to approximate the forecast spread (Palmer 1995; Ehrendorfer and Tribbia 1997).

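Note that the noise EOFs are the left singular vectors of $\mathsf{G}\boldsymbol{\Sigma}_e^{1/2}$, not the singular vectors of $\mathsf{G}$ itself. To interpret this result, let us define the new variables:

$$
\hat{\mathbf{e}} = \boldsymbol{\Sigma}_e^{-1/2}\mathbf{e}, \qquad \mathsf{G}^* = \mathsf{G}\boldsymbol{\Sigma}_e^{1/2}, \tag{20}
$$

so that $\boldsymbol{\nu}' = \mathsf{G}^*\hat{\mathbf{e}}$. By analogy with the usual interpretation, the singular vectors of $\mathsf{G}^*$ maximize the forecast error variance $\boldsymbol{\nu}'^T\boldsymbol{\nu}'$ subject to the constraint $\hat{\mathbf{e}}^T\hat{\mathbf{e}} = 1$. Thus, noise EOFs maximize the forecast error variance $\boldsymbol{\nu}'^T\boldsymbol{\nu}'$ subject to the constraint

$$
\mathbf{e}^T\boldsymbol{\Sigma}_e^{-1}\mathbf{e} = 1. \tag{21}
$$

In other words, the singular vectors of $\mathsf{G}\boldsymbol{\Sigma}_e^{1/2}$ are the singular vectors of $\mathsf{G}$ but with the initial norm (21). This result can be generalized to show that any linear transformation of the propagator is equivalent to changing the norm for measuring the initial or final vectors. The fact that constraint (21) compels the singular vectors to be the EOFs of forecast error was noted previously by Palmer (1995) and Ehrendorfer and Tribbia (1997).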

A standard result in probability states that if $\mathbf{e}$ is normally distributed with zero mean and covariance matrix $\boldsymbol{\Sigma}_e$, then $\hat{\mathbf{e}}$ defined in (20) also is normal with zero mean and covariance matrix $\mathsf{I}$. The distribution of $\hat{\mathbf{e}}$ thus depends only on the distance from the origin $\hat{\mathbf{e}}^T\hat{\mathbf{e}}$ and hence is isotropic. Measuring initial error using a norm based on initial error covariance $\boldsymbol{\Sigma}_e$ is thus consistent with Lorenz’s consideration of an ensemble of initial errors such that “no direction in . . . [state] space is preferred over any other direction.” Constraint (21) also is consistent with the constraint typically used to compute ensemble-based estimates of forecast error with singular vectors (Houtekamer 1995; Molteni et al. 1996). Finally, since the distribution of $\hat{\mathbf{e}}$ is isotropic, all states satisfying (21) have equal probability density. Accordingly, we call (21) the *equal likelihood constraint*. Thus, the leading EOF of forecast error can be interpreted as the forecast with maximum error out of all forecasts with equally likely initial errors. This constraint immediately solves the problem of ensuring that singular vectors are “realistic,” since any vector that satisfies (21) is just as likely as any other vector to be drawn from the initial analyses.
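The link between noise EOFs and singular vectors under the equal likelihood constraint can be sketched as follows. This illustration assumes NumPy, a random matrix standing in for the propagator, a synthetic analysis error covariance, and a Cholesky factor as the matrix square root; none of these choices comes from the paper.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
G = rng.standard_normal((n, n))                  # hypothetical propagator
r = rng.standard_normal((n, n))
sigma_e = r @ r.T + 0.1 * np.eye(n)              # synthetic analysis error covariance

sqrt_e = np.linalg.cholesky(sigma_e)             # sigma_e = sqrt_e @ sqrt_e.T
G_star = G @ sqrt_e                              # propagator acting on normalized errors
U, svals, Vt = np.linalg.svd(G_star)             # descending singular values

forecast_cov = G @ sigma_e @ G.T
eof_vals, eofs = np.linalg.eigh(forecast_cov)    # noise EOFs, ascending eigenvalues

# Initial error (original coordinates) that excites the leading noise EOF.
e_lead = sqrt_e @ Vt[0]
constraint = e_lead @ np.linalg.solve(sigma_e, e_lead)

print("squared singular values match noise EOF variances:",
      np.allclose(svals ** 2, eof_vals[::-1]))
print("value of e^T inv(sigma_e) e for the leading initial error:", constraint)
```

The leading left singular vector coincides, up to sign, with the leading noise EOF, and the associated initial error satisfies the constraint exactly.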

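The predictable components (the eigenvectors of $\tilde{\boldsymbol{\Sigma}}_f$) are the left singular vectors of

$$
\tilde{\mathsf{G}} = \boldsymbol{\Sigma}_c^{-1/2}\mathsf{G}\boldsymbol{\Sigma}_e^{1/2},
$$

which we call the *whitened propagator*.

The transformation from $\mathsf{G}$ to $\mathsf{G}^*$ implied a change in initial norm. We now show that transformation from $\mathsf{G}$ to $\tilde{\mathsf{G}}$ implies a change in both the initial and final norms. The model $\boldsymbol{\nu}' = \mathsf{G}\mathbf{e}$ can be transformed into $\tilde{\boldsymbol{\nu}}' = \tilde{\mathsf{G}}\hat{\mathbf{e}}$, where $\tilde{\boldsymbol{\nu}}' = \boldsymbol{\Sigma}_c^{-1/2}\boldsymbol{\nu}'$. The singular vectors of $\tilde{\mathsf{G}}$ maximize $\tilde{\boldsymbol{\nu}}'^T\tilde{\boldsymbol{\nu}}'$ subject to $\hat{\mathbf{e}}^T\hat{\mathbf{e}} = 1$; that is, they maximize

$$
\boldsymbol{\nu}'^T\boldsymbol{\Sigma}_c^{-1}\boldsymbol{\nu}' \quad\text{subject to}\quad \mathbf{e}^T\boldsymbol{\Sigma}_e^{-1}\mathbf{e} = 1.
$$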

It is perhaps worth emphasizing that a predictable component characterizes an ensemble of forecasts, whereas a singular vector pertains to a single forecast. These two components coincide only in linear models, and only for proper choice of norms.

### b. Linear stochastic models

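Consider a linear stochastic model of the form

$$
\boldsymbol{\nu} = \mathsf{G}\mathbf{i} + \boldsymbol{\xi}, \tag{28}
$$

where $\boldsymbol{\xi}$ is a Gaussian white noise variable with zero mean and positive-definite covariance matrix $\boldsymbol{\Sigma}_\xi$, and $\mathbf{i}$ is the initial condition. Recall that the initial condition is $\mathbf{i} = \mathbf{a} + \mathbf{e}$. Substituting this relation into the state space model (28) gives

$$
\boldsymbol{\nu} = \mathsf{G}\mathbf{a} + \mathsf{G}\mathbf{e} + \boldsymbol{\xi}.
$$

The forecast distribution thus has mean $\mathsf{G}\mathbf{a}$ and covariance $\boldsymbol{\Sigma}_f = \mathsf{G}\boldsymbol{\Sigma}_e\mathsf{G}^T + \boldsymbol{\Sigma}_\xi$, where $\mathsf{G}\boldsymbol{\Sigma}_e\mathsf{G}^T$ measures the forecast spread due to initial condition error, and $\boldsymbol{\Sigma}_\xi$ measures the forecast spread due to model noise. The predictable components of this model are the eigenvectors of the whitened covariance matrix

$$
\tilde{\boldsymbol{\Sigma}}_f = \boldsymbol{\Sigma}_c^{-1/2}\left(\mathsf{G}\boldsymbol{\Sigma}_e\mathsf{G}^T + \boldsymbol{\Sigma}_\xi\right)\boldsymbol{\Sigma}_c^{-1/2T}.
$$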

The fact that two different whitened propagators,

In the absence of stochastic forcing, the predictable components are the trailing singular vectors of

$\tilde{\boldsymbol{\Sigma}}_f$ is positive definite and thus has positive eigenvalues, relation (35) proves that the singular values of

$\tilde{\boldsymbol{\Sigma}}_f$ are less than one. Thus, there is no “growth” due to

### c. Maximum covariance analysis

*maximum covariance analysis* (MCA), and von Storch and Zwiers (1999, p. 321) show that this procedure is equivalent to computing the SVD of the cross-covariance matrix:

$\tilde{\boldsymbol{\Sigma}}_{va}\tilde{\boldsymbol{\Sigma}}_{va}^T$ and thus satisfy the eigenvalue problem

$\boldsymbol{\Sigma}_c^{-1/2}$ and substituting the relation between $\mathbf{q}$ and $\mathbf{u}$ (3) gives

*canonical correlation analysis* [CCA; see von Storch and Zwiers 1999, Eq. (14.10)]. CCA is a procedure that finds components in two fields that are maximally correlated. Numerous studies use CCA to highlight the evolution of predictable patterns (Barnett and Preisendorfer 1987; Barnston and Ropelewski 1992; Barnston and Smith 1996). The above analysis demonstrates that CCA is equivalent to SVD of the whitened time-lagged covariance matrix, a fact that has been noted in previous studies (Bretherton et al. 1992; DelSole and Chang 2003).
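The stated equivalence between CCA and the SVD of a whitened cross-covariance matrix can be checked on synthetic data. The sketch below assumes NumPy, two randomly generated fields `x` and `y` (hypothetical stand-ins for the two fields being related), and whitening by the inverse Cholesky factor of each sample covariance.

```python
import numpy as np

rng = np.random.default_rng(6)
n_samples = 500
x = rng.standard_normal((n_samples, 3))
y = x @ rng.standard_normal((3, 2)) + rng.standard_normal((n_samples, 2))
x -= x.mean(axis=0)
y -= y.mean(axis=0)

sxx = x.T @ x / (n_samples - 1)
syy = y.T @ y / (n_samples - 1)
syx = y.T @ x / (n_samples - 1)          # cross-covariance between y and x

def whitener(sigma):
    """Return W with W @ sigma @ W.T = I (inverse of the Cholesky factor)."""
    return np.linalg.inv(np.linalg.cholesky(sigma))

wx, wy = whitener(sxx), whitener(syy)
U, svals, Vt = np.linalg.svd(wy @ syx @ wx.T)    # SVD of the whitened cross-covariance

# Leading canonical variates: project each field with the back-transformed singular vectors.
cv_y = y @ (wy.T @ U[:, 0])
cv_x = x @ (wx.T @ Vt[0])
leading_corr = np.corrcoef(cv_x, cv_y)[0, 1]

# Classical CCA eigenproblem for the squared canonical correlations.
cca_eigs = np.sort(
    np.linalg.eigvals(np.linalg.solve(syy, syx) @ np.linalg.solve(sxx, syx.T)).real
)[::-1]

print("leading singular value:", svals[0], "correlation of leading variates:", leading_corr)
```

The singular values of the whitened cross-covariance reproduce the canonical correlations obtained from the classical eigenvalue formulation.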

$\mathbf{a}^T$ and taking expectations, noting that $\langle\boldsymbol{\xi}\mathbf{a}^T\rangle = \mathbf{0}$ by causality. The above relation also is the least squares operator for predicting $\boldsymbol{\nu}$ given $\mathbf{a}$. The corresponding whitened propagator is thus

## 4. An example based on sea surface temperatures

In this section, we illustrate various components of interest to predictability. We adopt the linear inverse model of Penland and Sardeshmukh (1995) for tropical sea surface temperature (SST). The main difference between our model and others of this type is that we include estimates of analysis errors when computing the climatological covariance matrix, which affects the EOF basis set used to represent the state vector. We use the 2° × 2° extended reconstruction of sea surface temperature analysis by Smith and Reynolds (2003), denoted ERSSTv2. We utilize all months in the 56-yr period 1950–2005 in the tropical Indo-Pacific ocean basin bounded by 30°S–30°N, 30°E–60°W. There are 3424 grid boxes in this domain, excluding land points, and a total of 672 time points. Monthly anomalies were computed by subtracting the mean of each calendar month from the corresponding monthly mean value at each grid point.
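The anomaly calculation described above can be sketched as follows. This is a minimal illustration assuming NumPy and a data array of shape (months, grid points) that starts in January and spans whole years; the function name `monthly_anomalies` is hypothetical, and the input is synthetic rather than ERSSTv2 data.

```python
import numpy as np

def monthly_anomalies(sst):
    """Subtract each calendar month's mean from data shaped (n_months, n_points).

    Assumes the record starts in January and spans a whole number of years.
    """
    n_months, n_points = sst.shape
    by_year = sst.reshape(n_months // 12, 12, n_points)
    climatology = by_year.mean(axis=0)             # (12, n_points) mean annual cycle
    return (by_year - climatology).reshape(n_months, n_points)

# Synthetic stand-in for a 672-month record; only 5 grid points for brevity.
rng = np.random.default_rng(7)
months = np.arange(672)
sst = 25 + 2 * np.sin(2 * np.pi * months / 12)[:, None] + rng.standard_normal((672, 5))
anom = monthly_anomalies(sst)

print("anomaly array shape:", anom.shape)
```

After this step the mean of each calendar month is zero at every grid point, so the annual cycle does not contribute to the covariance matrices.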

The climatological covariance matrix can be decomposed into three terms that measure signal variance, model noise, and initial condition error. To gain an idea of the relative contribution of the three terms, we plot in Fig. 3 the trace of the whitened covariance matrices. We see immediately that the initial condition error is negligible. The result probably is a consequence of the fact that the analysis error covariance matrix is diagonal in physical space, and so has only weak projections on the eight EOFs. Presumably, a more realistic, nondiagonal error covariance would lead to a larger contribution by initial condition error.

## 5. Summary and discussion

This paper showed that if a measure of predictability is invariant to affine transformation and monotonically related to forecast uncertainty, then the component that maximizes the measure for normal distributions is a universal function of the distributions, independent of the details of the measure. This result explains why different measures of predictability, such as signal-to-noise ratio, anomaly correlation, predictive information, and the Mahalanobis error all have the same maximally predictable component (DelSole and Tippett 2007). It also implies that the Ω index of Koster et al. (2000) is maximized by the same components. These components can be obtained by applying EOF analysis to whitened forecast variables, a procedure called *predictable component analysis*. The resulting vectors, called *predictable components*, define a complete set that can be ordered such that the first maximizes predictability, the second maximizes predictability subject to being uncorrelated with the first, and so on.

Predictable components also can be obtained by applying singular value decomposition to the whitened propagator of linear models. The whitening transformation is tantamount to changing the initial and final norms in the singular vector calculation. In the tangent linear case, the initial norm is based on the analysis error covariance, consistent with previous studies, while the final norm is based on the Mahalanobis norm. The Mahalanobis norm has several attractive properties that make its use compelling. Specifically, the Mahalanobis norm is invariant to linear transformation and has unit climatological variance, and thus constitutes a consistent measure of predictability. Also, the Mahalanobis norm renders the signal EOFs identical to noise EOFs, but with reversed ordering, where signal and noise identify forecast mean and spread, respectively. Furthermore, these components are identical to the signal-to-noise EOFs. This equivalence does not hold for other norms. Finally, maximum covariance analysis between two whitened variables is equivalent to CCA of the two variables, which in turn is equivalent to determining the predictable components of an associated least squares model.

In essence, the whitening transformation converts variance analysis to predictability analysis. The components identified with conventional variance analysis, such as structures with large signal variance, are of interest to predictability but do not necessarily play a distinguished role in predictability. For instance, the structure with maximum signal variance may not be the most predictable, since the corresponding climatological variance could be very large by comparison. It is remarkable that a large class of predictability measures has the same predictable components, and these components can be obtained from variance analysis merely by transforming variables, or equivalently by using the Mahalanobis norm to measure size.

Just as singular vectors of propagators optimally represent error variance, singular vectors of whitened propagators optimally represent predictability. Therefore, if only a few of the singular values indicate significant predictability, then an ensemble based on just the corresponding singular vectors should give a reasonable estimate of the total predictability. The singular values of whitened propagators measure the strength of predictability, in contrast with the usual interpretation of singular values as a measure of variance growth. It is also worth noting that this paper appears to give for the first time the generalization of singular vector methods to models that contain both stochastic forcing and initial condition error.

Some predictability measures are not additive, for example, signal-to-noise ratio and anomaly correlation. In contrast, information theory measures are additive for independent events. For the class of measures and distributions considered in this paper, any nonadditive measure can be converted into any additive measure because these measures are monotonically related to forecast uncertainty, and hence monotonically related to each other. This transformation may prove useful for computing total predictability or the fractional contribution of each component to predictability.

Distance-related measures of predictability, such as relative entropy (Kleeman 2002) and Bhattacharyya distance (Mardia et al. 1979), can increase even if the forecast uncertainty is constant, for example, by a change in mean, and thus do not satisfy all properties assumed in this paper. However, these measures tend to be convex functions of forecast uncertainty, so predictable components provide lower bounds on distance-related predictability measures.

Components of interest to predictability were illustrated with a linear inverse model for SST. The signal EOFs, the maximum covariance components, and the leading singular vector of the propagator were all dominated by the leading EOF of SST variance. In contrast, the leading predictable component exhibited a linear trend over 50 yr. A linear trend may be identified sensibly as highly predictable. The forecast spread in this model was dominated completely by the stochastic forcing; that is, the analysis errors were negligible. These conclusions pertain to our particular empirical model and may not carry over to the real system.

Attempts to generalize the above framework meet with significant difficulties. For instance, relaxing the assumption of normal distributions is difficult because the measure would then depend on higher-order moments and thus depend on higher-order nonlinearities of the projection vector. Relaxing the linear model assumption loses contact with singular vector methods. Relaxing the perfect model scenario requires accounting for model error in the forecast distribution, which is a largely unsolved problem. The above framework also involves significant practical difficulties. For instance, the framework assumed all covariances were known, whereas in practice they must be estimated from relatively small samples. Also, the Mahalanobis norm is very sensitive to estimation errors in the variance of the trailing EOFs. However, an interesting by-product of the framework discussed in this paper is clarification of the fact that seemingly different statistical methods are fundamentally connected. For instance, predictable component analysis has been related to EOF analysis, SVD analysis, CCA, and linear regression. These connections imply that estimation techniques which have proven to be effective in one statistical method can be applied directly to predictable component analysis.

## Acknowledgments

Comments from Ben Kirtman and two anonymous reviewers led to significant clarifications in this paper. The first author’s research was supported by the National Science Foundation (ATM0332910), National Aeronautics and Space Administration (NNG04GG46G), and the National Oceanic and Atmospheric Administration (NA04OAR4310034). The second author’s research was supported by a Grant/Cooperative Agreement from the National Oceanic and Atmospheric Administration, NA05OAR4311004. The views expressed herein are those of the authors and do not necessarily reflect the views of NOAA or any of its subagencies.

## REFERENCES

Barnett, T. P., and R. Preisendorfer, 1987: Origins and levels of monthly and seasonal forecast skill for United States surface air temperatures determined by canonical correlation analysis. *Mon. Wea. Rev.*, **115**, 1825–1850.

Barnston, A. G., and C. F. Ropelewski, 1992: Prediction of ENSO episodes using canonical correlation analysis. *J. Climate*, **5**, 1316–1345.

Barnston, A. G., and T. M. Smith, 1996: Specification and prediction of global surface temperature and precipitation from global SST using CCA. *J. Climate*, **9**, 2660–2697.

Bretherton, C. S., C. Smith, and J. M. Wallace, 1992: An intercomparison of methods for finding coupled patterns in climate data. *J. Climate*, **5**, 541–560.

Cover, T. M., and J. A. Thomas, 1991: *Elements of Information Theory*. Wiley, 576 pp.

DelSole, T., and B. F. Farrell, 1995: A stochastically excited linear system as a model for quasigeostrophic turbulence: Analytic results for one- and two-layer fluids. *J. Atmos. Sci.*, **52**, 2531–2547.

DelSole, T., and P. Chang, 2003: Predictable component analysis, canonical correlation analysis, and autoregressive models. *J. Atmos. Sci.*, **60**, 409–416.

DelSole, T., and M. K. Tippett, 2007: Predictability: Recent insights from information theory. *Rev. Geophys.*, **45**, RG4002, doi:10.1029/2006RG000202.

Déqué, M., 1988: 10-day predictability of the Northern Hemisphere winter 500-mb height by the ECMWF operational model. *Tellus*, **40A**, 26–36.

Ehrendorfer, M., and J. Tribbia, 1997: Optimal prediction of forecast error covariances through singular vectors. *J. Atmos. Sci.*, **54**, 286–313.

Farrell, B. F., and P. J. Ioannou, 1993: Stochastic dynamics of baroclinic waves. *J. Atmos. Sci.*, **50**, 4044–4057.

Fukunaga, K., 1990: *An Introduction to Statistical Pattern Recognition*. 2nd ed. Academic Press, 591 pp.

Hasselmann, K., 1976: Stochastic climate models. Part I: Theory. *Tellus*, **28**, 473–485.

Houtekamer, P., 1995: The construction of optimal perturbations. *Mon. Wea. Rev.*, **123**, 2888–2898.

Hu, Z-Z., and B. Huang, 2007: The predictive skill and the most predictable pattern in the tropical Atlantic: The effect of ENSO. *Mon. Wea. Rev.*, **135**, 1786–1806.

Kirtman, B. P., 2003: The COLA anomaly coupled model: Ensemble ENSO prediction. *Mon. Wea. Rev.*, **131**, 2324–2341.

Kleeman, R., 2002: Measuring dynamical prediction utility using relative entropy. *J. Atmos. Sci.*, **59**, 2057–2072.

Kleeman, R., and A. M. Moore, 1997: A theory for the limitation of ENSO predictability due to stochastic atmospheric transients. *J. Atmos. Sci.*, **54**, 753–767.

Kleeman, R., Y. Tang, and A. M. Moore, 2003: The calculation of climatically relevant singular vectors in the presence of weather noise as applied to the ENSO problem. *J. Atmos. Sci.*, **60**, 2856–2868.

Koster, R. D., M. J. Suarez, and M. Heiser, 2000: Variance and predictability of precipitation at seasonal-to-interannual timescales. *J. Hydrometeor.*, **1**, 26–46.

Lorenz, E. N., 1965: A study of the predictability of a 28-variable atmospheric model. *Tellus*, **17**, 321–333.

Mardia, K. V., J. T. Kent, and J. M. Bibby, 1979: *Multivariate Analysis*. Academic Press, 521 pp.

Molteni, F., R. Buizza, T. N. Palmer, and T. Petroliagis, 1996: The ECMWF ensemble prediction system: Methodology and validation. *Quart. J. Roy. Meteor. Soc.*, **122**, 73–119.

Noble, B., and J. W. Daniel, 1988: *Applied Linear Algebra*. 3rd ed. Prentice-Hall, 521 pp.

Palmer, T. N., 1995: Predictability of the atmosphere and oceans: From days to decades. *Decadal Climate Variability: Dynamics and Predictability*, D. T. A. Anderson and J. Willebrand, Eds., NATO ASI Series, Vol. 44, Springer, 83–155.

Palmer, T. N., and Coauthors, 2004: Development of a European Multimodel ensemble system for seasonal-to-interannual prediction (DEMETER). *Bull. Amer. Meteor. Soc.*, **85**, 853–872.

Peng, P., and A. Kumar, 2005: A large ensemble analysis of the influence of tropical SSTs on seasonal atmospheric variability. *J. Climate*, **18**, 1068–1085.

Penland, C., and P. D. Sardeshmukh, 1995: The optimal growth of tropical sea surface temperature anomalies. *J. Climate*, **8**, 1999–2024.

Renwick, J. A., and J. M. Wallace, 1995: Predictable anomaly patterns and the forecast skill of Northern Hemisphere wintertime 500-mb height fields. *Mon. Wea. Rev.*, **123**, 2114–2131.

Schneider, E. K., D. G. DeWitt, A. Rosati, B. P. Kirtman, L. Ji, and J. J. Tribbia, 2003: Retrospective ENSO forecasts: Sensitivity to atmospheric model and ocean resolution. *Mon. Wea. Rev.*, **131**, 3038–3060.

Schneider, T., and S. M. Griffies, 1999: A conceptual framework for predictability studies. *J. Climate*, **12**, 3133–3155.

Shukla, J., 1981: Predictability of the tropical atmosphere. NASA Tech. Memo. 83829, 51 pp.

Smith, T. M., and R. W. Reynolds, 2003: Extended reconstruction of global sea surface temperatures based on COADS data (1854–1997). *J. Climate*, **16**, 1495–1510.

Straus, D. M., and J. Shukla, 2002: Does ENSO force the PNA? *J. Climate*, **15**, 2340–2358.

Sutton, R. T., S. P. Jewson, and D. P. Rowell, 2000: The elements of climate variability in the tropical Atlantic region. *J. Climate*, **13**, 3261–3284.

Swinbank, R., V. Shutyaev, and W. A. Lahoz, 2003: *Data Assimilation for the Earth System*. Springer, 388 pp.

Syu, H-H., J. D. Neelin, and D. Gutzler, 1995: Seasonal and interannual variability in a hybrid coupled GCM. *J. Climate*, **8**, 2121–2143.

Thompson, C. J., and D. S. Battisti, 2000: A linear stochastic dynamical model of ENSO. Part I: Model development. *J. Climate*, **13**, 2818–2832.

Tippett, M. K., and A. Giannini, 2006: Potentially predictable components of African summer rainfall in an SST-forced GCM simulation. *J. Climate*, **19**, 3133–3144.

Venzke, S., M. R. Allen, R. T. Sutton, and D. P. Rowell, 1999: The atmospheric response over the North Atlantic to decadal changes in sea surface temperature. *J. Climate*, **12**, 2562–2584.

Von Storch, H., and F. Zwiers, 1999: *Statistical Analysis in Climate Research*. Cambridge University Press, 494 pp.

Waliser, D. E., C. Jones, J-K. E. Schemm, and N. E. Graham, 1999: A statistical extended-range tropical forecast model based on the slow evolution of the Madden–Julian oscillation. *J. Climate*, **12**, 1918–1939.

Yang, X-Q., J. L. Anderson, and W. F. Stern, 1998: Reproducible forced modes in AGCM ensemble integrations and potential predictability of atmospheric seasonal variations in the extratropics. *J. Climate*, **11**, 2942–2959.