1. Introduction
It is well established that some components in the climate system are more predictable than others. For instance, large-scale structures tend to be more predictable than small-scale structures (Shukla 1981); sea surface temperatures in the tropical Pacific tend to be more predictable than those in the Atlantic (Schneider et al. 2003); rainfall in the tropical Pacific tends to be more predictable than rainfall in Europe (Palmer et al. 2004). It is natural, then, to seek the most predictable components. However, attempts to identify maximally predictable components based on such methods as principal component analysis and singular value decomposition generally lead to different results. The purpose of this paper is to suggest a framework in which familiar methods of predictability analysis lead to consistent results.
A clue as to how diverse statistical methods can produce consistent results lies in the fact that many methods depend implicitly or explicitly on a norm. For instance, principal component analysis determines components that maximize variance, as defined by some norm. Singular vector decomposition depends on two norms: one for measuring “response” and another for constraining “initial condition.” Without a firm basis for choosing these norms, variance analysis could generate virtually any set of vectors by a suitable choice of norm.
One approach to these problems is to define predictability precisely and then to choose norms to ensure consistency with the associated measure of predictability. Unfortunately, no universally accepted measure of predictability exists. Therefore, this approach merely replaces the problem of the choice of norm with the problem of the choice of predictability measure.
Surprisingly, there is a path that gives a reasonable way out. The first step is to restrict attention to predictability measures that are invariant to affine transformations and monotonically related to forecast uncertainty. These measures are consistent in the following sense: measuring the predictability of the same system in two different coordinate systems gives the same result. We then show that components that maximize these measures are independent of the details of the measure. This result, proven in section 2, explains why different measures of predictability, such as signal-to-noise ratio, anomaly correlation, predictive information, and the Mahalanobis error, all have the same maximally predictable component (DelSole and Tippett 2007). We then show that this component can be obtained by applying principal component analysis to transformed forecast variables. The transformation, called the whitening transformation, can be interpreted as specifying the norm in empirical orthogonal function (EOF) analysis. This result is generalized in section 3 to specify the norms in singular vector analysis so that the same predictable components are obtained. The norm for measuring forecast uncertainty has not appeared in previous predictability studies, but it nonetheless has several attractive properties that make its use compelling. Several components of interest to predictability are illustrated with an empirical stochastic model for sea surface temperatures in section 4. The paper concludes with a summary and discussion of results.
2. Predictable components
In this section we define a class of predictability measures and then show that components that maximize these measures are independent of the detailed form of the measure. The first step to defining predictability is to recognize that no forecast is complete without a description of its uncertainty in the form of a probability distribution. Following DelSole and Tippett (2007), the forecast distribution is defined as the conditional distribution of the state given antecedent observations of the system, and the climatological distribution is the unconditional distribution of the state. A system is deemed unpredictable if the forecast and climatological distributions are identical. Thus, a measure of predictability should indicate predictability only if the forecast and climatological distributions differ.
To ensure consistency, we propose that if the predictability of a system is measured in two different coordinate systems, then the measure should be the same. At the very least, then, the measure should be invariant to affine transformations, that is, to translations and to nonsingular, linear transformations. This property will be called the invariance property.
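The invariance property can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes NumPy, uses hypothetical random covariance matrices `sigma_f` and `sigma_c`, and compares the variances of the forecast relative to climatology before and after a nonsingular linear transformation `t`. Translations leave covariances unchanged and are therefore omitted.

```python
import numpy as np

def relative_variances(sigma_f, sigma_c):
    """Eigenvalues of the forecast covariance whitened by the climatological covariance."""
    evals, evecs = np.linalg.eigh(sigma_c)
    w = evecs @ np.diag(evals ** -0.5) @ evecs.T   # symmetric inverse square root of sigma_c
    return np.linalg.eigvalsh(w @ sigma_f @ w)     # returned in ascending order

rng = np.random.default_rng(0)
n = 4
a = rng.standard_normal((n, n))
b = rng.standard_normal((n, n))
sigma_f = a @ a.T + 0.1 * np.eye(n)   # hypothetical forecast covariance
sigma_c = sigma_f + b @ b.T           # hypothetical climatological covariance, never smaller than sigma_f

# Hypothetical change of coordinates: upper triangular with nonzero diagonal, hence nonsingular.
t = np.triu(np.ones((n, n))) + np.eye(n)

lam_original = relative_variances(sigma_f, sigma_c)
lam_transformed = relative_variances(t @ sigma_f @ t.T, t @ sigma_c @ t.T)
print(lam_original)
print(lam_transformed)
```

The two printed arrays should agree to rounding error, because the eigenvalues depend only on the forecast covariance relative to the climatological covariance and not on the coordinates in which both are expressed.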
Finally, we consider measures of predictability that increase if and only if the uncertainty in the forecast decreases. This principle requires defining uncertainty. However, for normal distributions, any reasonable measure of uncertainty is an increasing function of variance. Thus, we assume that the predictability of normal distributions increases if and only if the forecast variance decreases, holding all other parameters constant.
Many measures of predictability satisfy the above properties, including signal-to-noise ratio, anomaly correlation, predictive information, and Mahalanobis error. The Ω index of Koster et al. (2000) also satisfies the above properties. However, mean square error does not satisfy the above properties because it is not invariant to linear transformation. Nevertheless, we retain linear invariance to ensure consistency. Similarly, not all measures increase if and only if forecast uncertainty decreases. For instance, measures of the “distance” between distributions, such as relative entropy (Kleeman 2002) or Bhattacharyya distance (Mardia et al. 1979), will change if the mean changes, even for constant forecast uncertainty. However, components that optimize the restricted class of measures can provide useful lower bounds on other measures.
a. Predictable components as EOFs
It is critical to understand why the predictable components are the EOFs of the whitened forecast, and not the EOFs of the forecast itself. The leading eigenvector of Σf explains the most spread and hence the most uncertainty. However, uncertainty of a component does not measure predictability. Rather, predictability of a component depends on the amount of spread relative to its climatological spread. For instance, the component with the most uncertainty could be the most predictable, if its climatological spread were sufficiently large. The virtue of whitening is that any projection of whitened variables has unit climatological variance, and thus the forecast variance of such a projection immediately measures relative variance, or equivalently relative uncertainty.
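A two-variable sketch makes the distinction concrete. It assumes NumPy and hypothetical diagonal covariances, and it orders components by ascending relative variance, since under the properties assumed above a smaller forecast variance relative to climatology means greater predictability. The function name `predictable_components` is introduced here for illustration only.

```python
import numpy as np

def predictable_components(sigma_f, sigma_c):
    """Return relative variances (ascending) and projection vectors q, one per column.

    Each q satisfies q^T sigma_c q = 1, so q^T sigma_f q is the forecast
    variance of the component relative to its climatological variance.
    """
    evals, evecs = np.linalg.eigh(sigma_c)
    w = evecs @ np.diag(evals ** -0.5) @ evecs.T   # whitening matrix sigma_c^{-1/2}
    lam, u = np.linalg.eigh(w @ sigma_f @ w)       # EOFs of the whitened forecast
    q = w @ u                                      # projection vectors in the original variables
    return lam, q

# Hypothetical example: variable 0 has the larger forecast spread but a much
# larger climatological spread; variable 1 has small spread of both kinds.
sigma_f = np.diag([4.0, 0.9])
sigma_c = np.diag([100.0, 1.0])

lam, q = predictable_components(sigma_f, sigma_c)
leading_noise_eof = np.linalg.eigh(sigma_f)[1][:, -1]   # direction of largest forecast spread

print(lam)                 # relative variances of the two components
print(q[:, 0])             # projection vector of the most predictable component
print(leading_noise_eof)   # leading EOF of the unwhitened forecast
```

In this example the leading EOF of the forecast covariance and the most predictable component both point along variable 0: it has the most spread in absolute terms, yet only 4% of its climatological variance remains as forecast variance.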
b. Average predictability
Predictable components characterize a forecast ensemble at an instant in time. We may also be interested in components that maximize the average predictability. We call the former instantaneous predictable components and the latter average predictable components.
It is perhaps worth noting that if the forecast covariance Σf is constant, that is, independent of observation or initial condition, as occurs for linear, autonomous, stochastic models with stationary noise, then the instantaneous and average predictable components coincide.
c. Signal EOFs, noise EOFs, and signal-to-noise EOFs
We now discuss the relation between predictable components and other components that can be derived from EOF methods. Let us call the EOFs of the forecast ensemble the noise EOFs, since they describe the variability of the forecast ensemble about the ensemble mean (recent examples include Yang et al. 1998 and Straus and Shukla 2002). An alternative approach is to compute the EOFs of the ensemble mean over time, which we call the signal EOFs and which can be interpreted as describing the predictable part of a forecast (recent examples include Sutton et al. 2000; Straus and Shukla 2002; Peng and Kumar 2005). The relevance of these components to predictability as defined here is not immediately evident, because these components optimize absolute measures of spread or signal, whereas predictability depends on relative measures of spread or signal.
The equivalence between the average noise EOFs and signal EOFs demonstrated above holds only for whitened variables. In general, the noise and signal EOFs of the original forecast variables differ; that is, the component that explains the most signal differs from the component that explains the least noise. This discrepancy is problematic when one attempts to identify one of these components as “most predictable,” since there is no compelling reason for choosing one over the other. The whitening transformation removes this discrepancy. Furthermore, as shown in section 2a, the whitening transformation can be interpreted as changing the norm used to measure “variance” in the EOF calculation. These considerations imply that consistency of signal analysis and error analysis constrains the norm to be the Mahalanobis norm.
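The equivalence for whitened variables can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the calculation used in this paper: it supposes that the climatological covariance is the sum of a signal covariance and a noise covariance, and it builds both from hypothetical random matrices using NumPy.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
a = rng.standard_normal((n, n))
b = rng.standard_normal((n, n))
sigma_signal = a @ a.T                  # hypothetical covariance of the ensemble mean
sigma_noise = b @ b.T                   # hypothetical average forecast covariance
sigma_c = sigma_signal + sigma_noise    # assumed decomposition of climatological covariance

evals, evecs = np.linalg.eigh(sigma_c)
w = evecs @ np.diag(evals ** -0.5) @ evecs.T   # whitening matrix sigma_c^{-1/2}

# EOF analysis of the whitened signal and of the whitened noise (eigenvalues ascending).
lam_signal, u_signal = np.linalg.eigh(w @ sigma_signal @ w)
lam_noise, u_noise = np.linalg.eigh(w @ sigma_noise @ w)

# Pair the k-th signal EOF with the EOF of opposite rank in the noise ordering.
overlap = np.abs(np.sum(u_signal * u_noise[:, ::-1], axis=0))
snr = lam_signal / lam_noise[::-1]      # signal-to-noise ratio of each shared component

print(lam_signal + lam_noise[::-1])     # each sum is the whitened climatological variance
print(overlap)                          # |cosine| between paired signal and noise EOFs
print(snr)
```

Because the whitened signal and noise covariances sum to the identity, they share eigenvectors and their eigenvalues are complementary, so the component with the most whitened signal is the one with the least whitened noise and hence the largest signal-to-noise ratio.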
3. Singular vectors
In the discussion below, the state of the system at the initial time will be denoted by i. The initial state is assumed to be estimated from observations by a data assimilation system. In practice, the assimilation assumes a normal distribution for the initial state i. For ease of interpretation, we write the initial condition as i = a + e, where a is the mean of the initial condition, often called the analysis, and e is the analysis error with zero mean and covariance matrix Σe. The covariance of a over all initial conditions will be denoted Σa.
a. Deterministic systems
A standard result in probability states that if e is normally distributed with zero mean and covariance matrix Σe, then ê defined in (20) also is normal with zero mean and covariance matrix 𝗜. The distribution of ê thus depends only on the distance from the origin êTê and hence is isotropic. Measuring initial error using a norm based on initial error covariance Σe is thus consistent with Lorenz’s consideration of an ensemble of initial errors such that “no direction in . . . [state] space is preferred over any other direction.” Constraint (21) also is consistent with the constraint typically used to compute ensemble-based estimates of forecast error with singular vectors (Houtekamer 1995; Molteni et al. 1996). Finally, since the distribution of ê is isotropic, all states satisfying (21) have equal probability density. Accordingly, we call (21) the equal likelihood constraint. Thus, the leading EOF of forecast error can be interpreted as the forecast with maximum error out of all forecasts with equally likely initial errors. This constraint immediately solves the problem of ensuring that singular vectors are “realistic,” since any vector that satisfies (21) is just as likely as any other vector to be drawn from the initial analyses.
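The equal likelihood property can be sketched with a small example. It assumes that the whitening in (20) takes the usual form, ê = Σe^(-1/2) e, and it uses a hypothetical 2 × 2 analysis error covariance; neither is taken from the text. The helper names are illustrative.

```python
import numpy as np

def inverse_sqrt(sigma):
    """Symmetric inverse square root of a positive definite matrix."""
    evals, evecs = np.linalg.eigh(sigma)
    return evecs @ np.diag(evals ** -0.5) @ evecs.T

def log_density(e, sigma_e):
    """Log of the zero-mean normal density, up to an additive constant."""
    return -0.5 * e @ np.linalg.solve(sigma_e, e)

sigma_e = np.array([[2.0, 0.6],
                    [0.6, 0.5]])           # hypothetical analysis error covariance
w = inverse_sqrt(sigma_e)
whitened_cov = w @ sigma_e @ w.T           # covariance of the whitened error

# Two whitened errors of unit length pointing in different directions.
e_hat_1 = np.array([1.0, 0.0])
e_hat_2 = np.array([0.6, 0.8])

# Map back to the original variables with the square root of sigma_e.
sqrt_sigma_e = np.linalg.inv(w)
e_1 = sqrt_sigma_e @ e_hat_1
e_2 = sqrt_sigma_e @ e_hat_2

log_p_1 = log_density(e_1, sigma_e)
log_p_2 = log_density(e_2, sigma_e)
print(whitened_cov)
print(log_p_1, log_p_2)
```

The whitened covariance is the identity, and the two initial errors, although different in the original variables, have the same probability density because they have the same whitened length.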




It is perhaps worth emphasizing that a predictable component characterizes an ensemble of forecasts, whereas a singular vector pertains to a single forecast. These two components coincide only in linear models, and only for proper choice of norms.
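For a linear model without stochastic forcing, this coincidence can be sketched as follows. The example assumes a forecast f = L i with a hypothetical propagator L, so that the forecast covariance is L Σe Lᵀ, and it takes the whitened propagator to be Σc^(-1/2) L Σe^(1/2). That form is an assumed reading of the norms described above, not a formula quoted from the text, and all matrices are random placeholders.

```python
import numpy as np

def sym_power(sigma, p):
    """Symmetric matrix power of a positive definite matrix."""
    evals, evecs = np.linalg.eigh(sigma)
    return evecs @ np.diag(evals ** p) @ evecs.T

rng = np.random.default_rng(2)
n = 4
propagator = rng.standard_normal((n, n))            # hypothetical linear propagator L
g = rng.standard_normal((n, n))
sigma_e = 0.1 * (g @ g.T) + 0.01 * np.eye(n)        # hypothetical analysis error covariance
sigma_f = propagator @ sigma_e @ propagator.T       # forecast covariance, no stochastic forcing
h = rng.standard_normal((n, n))
sigma_c = sigma_f + h @ h.T                         # hypothetical climatological covariance

w_c = sym_power(sigma_c, -0.5)                      # final norm: whitening by climatology
whitened_propagator = w_c @ propagator @ sym_power(sigma_e, 0.5)
u, s, vt = np.linalg.svd(whitened_propagator)       # singular values in descending order

# Predictable component analysis: EOFs of the whitened forecast covariance.
lam = np.linalg.eigvalsh(w_c @ sigma_f @ w_c)       # ascending order

print(s[::-1] ** 2)   # squared singular values, smallest first
print(lam)            # relative forecast variances, smallest first
```

The squared singular values equal the relative forecast variances, so in this setting the trailing singular vectors, which have the least relative error growth, correspond to the most predictable components.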
b. Linear stochastic models



The fact that two different whitened propagators,
In the absence of stochastic forcing, the predictable components are the trailing singular vectors of




c. Maximum covariance analysis
4. An example based on sea surface temperatures
In this section, we illustrate various components of interest to predictability. We adopt the linear inverse model of Penland and Sardeshmukh (1995) for tropical sea surface temperature (SST). The main difference between our model and others of this type is that we include estimates of analysis errors when computing the climatological covariance matrix, which affects the EOF basis set used to represent the state vector. We use the 2° × 2° extended reconstruction of sea surface temperature analysis by Smith and Reynolds (2003), denoted ERSSTv2. We utilize all months in the 56-yr period 1950–2005 in the tropical Indo-Pacific ocean basin bounded by 30°S–30°N, 30°E–60°W. There are 3424 grid boxes in this domain, excluding land points, and a total of 672 time points. Monthly anomalies were computed by subtracting the mean of each calendar month from the corresponding monthly mean value at each grid point.
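The anomaly calculation described above can be sketched as follows. The sketch assumes NumPy, a record that starts in January and spans whole years (672 = 56 × 12 months), and synthetic data in place of ERSSTv2; the function name `monthly_anomalies` and the reduced number of grid points are illustrative choices.

```python
import numpy as np

def monthly_anomalies(sst):
    """Subtract the mean of each calendar month at each grid point.

    sst has shape (n_months, n_points); the record is assumed to start in
    January and to contain a whole number of years.
    """
    n_months, n_points = sst.shape
    if n_months % 12 != 0:
        raise ValueError("record must contain a whole number of years")
    by_year = sst.reshape(n_months // 12, 12, n_points)
    climatology = by_year.mean(axis=0)               # shape (12, n_points)
    return (by_year - climatology).reshape(n_months, n_points)

# Synthetic stand-in for the data: 56 years, 10 grid points instead of 3424.
rng = np.random.default_rng(4)
n_years, n_points = 56, 10
seasonal_cycle = np.sin(2 * np.pi * np.arange(12) / 12)[:, None]
sst = np.tile(seasonal_cycle, (n_years, n_points))
sst = sst + 0.3 * rng.standard_normal(sst.shape)

anomalies = monthly_anomalies(sst)
print(anomalies.shape)
```

By construction, the anomalies for each calendar month average to zero at every grid point, so the seasonal cycle does not contribute to the covariance matrices computed from them.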
The climatological covariance matrix can be decomposed into three terms that measure signal variance, model noise, and initial condition error. To gain an idea of the relative contribution of the three terms, we plot in Fig. 3 the trace of the whitened covariance matrices. We see immediately that the initial condition error is negligible. This result is probably a consequence of the fact that the analysis error covariance matrix is diagonal in physical space and so has only weak projections on the eight EOFs used to represent the state. Presumably, a more realistic, nondiagonal error covariance would lead to a larger contribution by initial condition error.
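A sketch of how such whitened traces might be computed is given below. It assumes that each trace is divided by the state dimension, which is inferred from the statement in the Fig. 3 caption that the three curves sum to one, and it uses hypothetical 8 × 8 covariance terms rather than those of the SST model.

```python
import numpy as np

def whitened_trace_fractions(terms):
    """Fraction of whitened climatological variance carried by each covariance term."""
    sigma_c = sum(terms.values())                    # climatological covariance as the sum of terms
    evals, evecs = np.linalg.eigh(sigma_c)
    w = evecs @ np.diag(evals ** -0.5) @ evecs.T     # whitening matrix sigma_c^{-1/2}
    k = sigma_c.shape[0]
    return {name: np.trace(w @ sigma @ w) / k for name, sigma in terms.items()}

rng = np.random.default_rng(5)
k = 8
a = rng.standard_normal((k, k))
b = rng.standard_normal((k, k))
terms = {
    "signal": a @ a.T,                               # hypothetical signal covariance
    "model noise": 0.5 * (b @ b.T),                  # hypothetical noise covariance
    "initial condition error": 1e-3 * np.eye(k),     # hypothetical, deliberately small
}

fractions = whitened_trace_fractions(terms)
for name, value in fractions.items():
    print(name, value)
```

The fractions sum to one because the whitened climatological covariance is the identity, whose trace equals the state dimension.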
5. Summary and discussion
This paper showed that if a measure of predictability is invariant to affine transformation and monotonically related to forecast uncertainty, then the component that maximizes the measure for normal distributions is a universal function of the distributions, independent of the details of the measure. This result explains why different measures of predictability, such as signal-to-noise ratio, anomaly correlation, predictive information, and the Mahalanobis error all have the same maximally predictable component (DelSole and Tippett 2007). It also implies that the Ω index of Koster et al. (2000) is maximized by the same components. These components can be obtained by applying EOF analysis to whitened forecast variables, a procedure called predictable component analysis. The resulting vectors, called predictable components, define a complete set that can be ordered such that the first maximizes predictability, the second maximizes predictability subject to being uncorrelated with the first, and so on.
Predictable components also can be obtained by applying singular value decomposition to the whitened propagator of linear models. The whitening transformation is tantamount to changing the initial and final norms in the singular vector calculation. In the tangent linear case, the initial norm is based on the analysis error covariance, consistent with previous studies, while the final norm is based on the Mahalanobis norm. The Mahalanobis norm has several attractive properties that make its use compelling. Specifically, the Mahalanobis norm is invariant to linear transformation and has unit climatological variance, and thus constitutes a consistent measure of predictability. Also, the Mahalanobis norm renders the signal EOFs identical to noise EOFs, but with reversed ordering, where signal and noise identify forecast mean and spread, respectively. Furthermore, these components are identical to the signal-to-noise EOFs. This equivalence does not hold for other norms. Finally, maximum covariance analysis between two whitened variables is equivalent to CCA of the two variables, which in turn is equivalent to determining the predictable components of an associated least squares model.
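The stated equivalence between maximum covariance analysis of whitened variables and CCA can be checked on synthetic data. The sketch below assumes NumPy and hypothetical random samples `x` and `y`; it is an illustration of the linear algebra, not a reproduction of the SST calculation.

```python
import numpy as np

def inverse_sqrt(sigma):
    """Symmetric inverse square root of a positive definite matrix."""
    evals, evecs = np.linalg.eigh(sigma)
    return evecs @ np.diag(evals ** -0.5) @ evecs.T

rng = np.random.default_rng(3)
n_samples = 500
x = rng.standard_normal((n_samples, 3))                      # hypothetical predictor
y = x @ rng.standard_normal((3, 2)) + rng.standard_normal((n_samples, 2))
x = x - x.mean(axis=0)
y = y - y.mean(axis=0)

sigma_x = x.T @ x / n_samples
sigma_y = y.T @ y / n_samples
sigma_xy = x.T @ y / n_samples

# Maximum covariance analysis of the whitened variables: SVD of their cross covariance.
w_x = inverse_sqrt(sigma_x)
w_y = inverse_sqrt(sigma_y)
u, s, vt = np.linalg.svd(w_x @ sigma_xy @ w_y, full_matrices=False)

# Time series of the paired components, expressed through the original variables.
variates_x = x @ w_x @ u
variates_y = y @ w_y @ vt.T
corr = np.array([np.corrcoef(variates_x[:, k], variates_y[:, k])[0, 1]
                 for k in range(len(s))])

print(s)      # singular values of the whitened cross covariance
print(corr)   # sample correlations of the paired components
```

The singular values coincide with the correlations of the paired components, which is the defining property of canonical correlations.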
In essence, the whitening transformation converts variance analysis to predictability analysis. The components identified with conventional variance analysis, such as structures with large signal variance, are of interest to predictability but do not necessarily play a distinguished role in predictability. For instance, the structure with maximum signal variance may not be the most predictable, since the corresponding climatological variance could be very large by comparison. It is remarkable that a large class of predictability measures has the same predictable components, and these components can be obtained from variance analysis merely by transforming variables, or equivalently by using the Mahalanobis norm to measure size.
Just as singular vectors of propagators optimally represent error variance, singular vectors of whitened propagators optimally represent predictability. Therefore, if only a few of the singular values indicate significant predictability, then an ensemble based on just the corresponding singular vectors should give a reasonable estimate of the total predictability. The singular values of whitened propagators measure the strength of predictability, in contrast with the usual interpretation of singular values as a measure of variance growth. It is also worth noting that this paper appears to give for the first time the generalization of singular vector methods to models that contain both stochastic forcing and initial condition error.
Some predictability measures are not additive, for example, signal-to-noise ratio and anomaly correlation. In contrast, information theory measures are additive for independent events. For the class of measures and distributions considered in this paper, any nonadditive measure can be converted into any additive measure because these measures are monotonically related to forecast uncertainty, and hence monotonically related to each other. This transformation may prove useful for computing total predictability or the fractional contribution of each component to predictability.
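As an illustration of such a conversion, the sketch below maps signal-to-noise ratios to predictive information for normally distributed, uncorrelated components. It assumes that climatological variance is the sum of signal and forecast (noise) variance, so that the ratio of forecast to climatological variance is 1/(1 + SNR); the relative variances used are hypothetical.

```python
import numpy as np

def snr_from_relative_variance(r):
    """Signal-to-noise ratio when climatological variance = signal + noise."""
    return (1.0 - r) / r

def predictive_information_from_snr(snr):
    """Predictive information of a normal component, in nats."""
    return 0.5 * np.log1p(snr)

def predictive_information(r):
    """Predictive information from the forecast-to-climatological variance ratio."""
    return -0.5 * np.log(r)

relative_variance = np.array([0.04, 0.5, 0.9])    # hypothetical whitened forecast variances
snr = snr_from_relative_variance(relative_variance)
info = predictive_information_from_snr(snr)

total_information = info.sum()                    # additive over uncorrelated normal components
fractional_contribution = info / total_information

print(snr)
print(info)
print(fractional_contribution)
```

The signal-to-noise ratios themselves cannot be summed meaningfully across components, whereas the converted values can, which is what permits a fractional contribution to be assigned to each component.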
Distance-related measures of predictability, such as relative entropy (Kleeman 2002) and Bhattacharyya distance (Mardia et al. 1979), can increase even if the forecast uncertainty is constant, for example, by a change in mean, and thus do not satisfy all properties assumed in this paper. However, these measures tend to be convex functions of forecast uncertainty, so predictable components provide lower bounds on distance-related predictability measures.
Components of interest to predictability were illustrated with a linear inverse model for SST. The signal EOFs, the maximum covariance components, and the leading singular vector of the propagator were all dominated by the leading EOF of SST variance. In contrast, the leading predictable component exhibited a linear trend over 50 yr. A linear trend may be identified sensibly as highly predictable. The forecast spread in this model was dominated completely by the stochastic forcing; that is, the analysis errors were negligible. These conclusions pertain to our particular empirical model and may not carry over to the real system.
Attempts to generalize the above framework meet with significant difficulties. For instance, relaxing the assumption of normal distributions is difficult because the measure would then depend on higher-order moments and thus depend on higher-order nonlinearities of the projection vector. Relaxing the linear model assumption loses contact with singular vector methods. Relaxing the perfect model scenario requires accounting for model error in the forecast distribution, which is a largely unsolved problem. The above framework also involves significant practical difficulties. For instance, the framework assumed all covariances were known, whereas in practice they must be estimated from relatively small samples. Also, the Mahalanobis norm is very sensitive to estimation errors in the variance of the trailing EOFs. However, an interesting by-product of the framework discussed in this paper is clarification of the fact that seemingly different statistical methods are fundamentally connected. For instance, predictable component analysis has been related to EOF analysis, SVD analysis, CCA, and linear regression. These connections imply that estimation techniques which have proven to be effective in one statistical method can be applied directly to predictable component analysis.
Acknowledgments
Comments from Ben Kirtman and two anonymous reviewers led to significant clarifications in this paper. The first author’s research was supported by the National Science Foundation (ATM0332910), National Aeronautics and Space Administration (NNG04GG46G), and the National Oceanic and Atmospheric Administration (NA04OAR4310034). The second author’s research was supported by a Grant/Cooperative Agreement from the National Oceanic and Atmospheric Administration, NA05OAR4311004. The views expressed herein are those of the authors and do not necessarily reflect the views of NOAA or any of its subagencies.
REFERENCES
Barnett, T. P., and R. Preisendorfer, 1987: Origins and levels of monthly and seasonal forecast skill for United States surface air temperatures determined by canonical correlation analysis. Mon. Wea. Rev., 115, 1825–1850.
Barnston, A. G., and C. F. Ropelewski, 1992: Prediction of ENSO episodes using canonical correlation analysis. J. Climate, 5, 1316–1345.
Barnston, A. G., and T. M. Smith, 1996: Specification and prediction of global surface temperature and precipitation from global SST using CCA. J. Climate, 9, 2660–2697.
Bretherton, C. S., C. Smith, and J. M. Wallace, 1992: An intercomparison of methods for finding coupled patterns in climate data. J. Climate, 5, 541–560.
Cover, T. M., and J. A. Thomas, 1991: Elements of Information Theory. Wiley, 576 pp.
DelSole, T., and B. F. Farrell, 1995: A stochastically excited linear system as a model for quasigeostrophic turbulence: Analytic results for one- and two-layer fluids. J. Atmos. Sci., 52, 2531–2547.
DelSole, T., and P. Chang, 2003: Predictable component analysis, canonical correlation analysis, and autoregressive models. J. Atmos. Sci., 60, 409–416.
DelSole, T., and M. K. Tippett, 2007: Predictability: Recent insights from information theory. Rev. Geophys., 45, RG4002, doi:10.1029/2006RG000202.
Déqué, M., 1988: 10-day predictability of the Northern Hemisphere winter 500-mb height by the ECMWF operational model. Tellus, 40A, 26–36.
Ehrendorfer, M., and J. Tribbia, 1997: Optimal prediction of forecast error covariances through singular vectors. J. Atmos. Sci., 54, 286–313.
Farrell, B. F., and P. J. Ioannou, 1993: Stochastic dynamics of baroclinic waves. J. Atmos. Sci., 50, 4044–4057.
Fukunaga, K., 1990: An Introduction to Statistical Pattern Recognition. 2nd ed. Academic Press, 591 pp.
Hasselmann, K., 1976: Stochastic climate models. Part I: Theory. Tellus, 28, 473–485.
Houtekamer, P., 1995: The construction of optimal perturbations. Mon. Wea. Rev., 123, 2888–2898.
Hu, Z-Z., and B. Huang, 2007: The predictive skill and the most predictable pattern in the tropical Atlantic: The effect of ENSO. Mon. Wea. Rev., 135, 1786–1806.
Kirtman, B. P., 2003: The COLA anomaly coupled model: Ensemble ENSO prediction. Mon. Wea. Rev., 131, 2324–2341.
Kleeman, R., 2002: Measuring dynamical prediction utility using relative entropy. J. Atmos. Sci., 59, 2057–2072.
Kleeman, R., and A. M. Moore, 1997: A theory for the limitation of ENSO predictability due to stochastic atmospheric transients. J. Atmos. Sci., 54, 753–767.
Kleeman, R., Y. Tang, and A. M. Moore, 2003: The calculation of climatically relevant singular vectors in the presence of weather noise as applied to the ENSO problem. J. Atmos. Sci., 60, 2856–2868.
Koster, R. D., M. J. Suarez, and M. Heiser, 2000: Variance and predictability of precipitation at seasonal-to-interannual timescales. J. Hydrometeor., 1, 26–46.
Lorenz, E. N., 1965: A study of the predictability of a 28-variable atmospheric model. Tellus, 17, 321–333.
Mardia, K. V., J. T. Kent, and J. M. Bibby, 1979: Multivariate Analysis. Academic Press, 521 pp.
Molteni, F., R. Buizza, T. N. Palmer, and T. Petroliagis, 1996: The ECMWF ensemble prediction system: Methodology and validation. Quart. J. Roy. Meteor. Soc., 122, 73–119.
Noble, B., and J. W. Daniel, 1988: Applied Linear Algebra. 3rd ed. Prentice-Hall, 521 pp.
Palmer, T. N., 1995: Predictability of the atmosphere and oceans: From days to decades. Decadal Climate Variability: Dynamics and Predictability, D. T. A. Anderson and J. Willebrand, Eds., NATO ASI Series, Vol. 44, Springer, 83–155.
Palmer, T. N., and Coauthors, 2004: Development of a European Multimodel ensemble system for seasonal-to-interannual prediction (DEMETER). Bull. Amer. Meteor. Soc., 85, 853–872.
Peng, P., and A. Kumar, 2005: A large ensemble analysis of the influence of tropical SSTs on seasonal atmospheric variability. J. Climate, 18, 1068–1085.
Penland, C., and P. D. Sardeshmukh, 1995: The optimal growth of tropical sea surface temperature anomalies. J. Climate, 8, 1999–2024.
Renwick, J. A., and J. M. Wallace, 1995: Predictable anomaly patterns and the forecast skill of Northern Hemisphere wintertime 500-mb height fields. Mon. Wea. Rev., 123, 2114–2131.
Schneider, E. K., D. G. DeWitt, A. Rosati, B. P. Kirtman, L. Ji, and J. J. Tribbia, 2003: Retrospective ENSO forecasts: Sensitivity to atmospheric model and ocean resolution. Mon. Wea. Rev., 131, 3038–3060.
Schneider, T., and S. M. Griffies, 1999: A conceptual framework for predictability studies. J. Climate, 12, 3133–3155.
Shukla, J., 1981: Predictability of the tropical atmosphere. NASA Tech. Memo. 83829, 51 pp.
Smith, T. M., and R. W. Reynolds, 2003: Extended reconstruction of global sea surface temperatures based on COADS data (1854–1997). J. Climate, 16, 1495–1510.
Straus, D. M., and J. Shukla, 2002: Does ENSO force the PNA? J. Climate, 15, 2340–2358.
Sutton, R. T., S. P. Jewson, and D. P. Rowell, 2000: The elements of climate variability in the tropical Atlantic region. J. Climate, 13, 3261–3284.
Swinbank, R., V. Shutyaev, and W. A. Lahoz, 2003: Data Assimilation for the Earth System. Springer, 388 pp.
Syu, H-H., J. D. Neelin, and D. Gutzler, 1995: Seasonal and interannual variability in a hybrid coupled GCM. J. Climate, 8, 2121–2143.
Thompson, C. J., and D. S. Battisti, 2000: A linear stochastic dynamical model of ENSO. Part I: Model development. J. Climate, 13, 2818–2832.
Tippett, M. K., and A. Giannini, 2006: Potentially predictable components of African summer rainfall in an SST-forced GCM simulation. J. Climate, 19, 3133–3144.
Venzke, S., M. R. Allen, R. T. Sutton, and D. P. Rowell, 1999: The atmospheric response over the North Atlantic to decadal changes in sea surface temperature. J. Climate, 12, 2562–2584.
Von Storch, H., and F. Zwiers, 1999: Statistical Analysis in Climate Research. Cambridge University Press, 494 pp.
Waliser, D. E., C. Jones, J-K. E. Schemm, and N. E. Graham, 1999: A statistical extended-range tropical forecast model based on the slow evolution of the Madden–Julian oscillation. J. Climate, 12, 1918–1939.
Yang, X-Q., J. L. Anderson, and W. F. Stern, 1998: Reproducible forced modes in AGCM ensemble integrations and potential predictability of atmospheric seasonal variations in the extratropics. J. Climate, 11, 2942–2959.
Leading 7-month singular vector of an 8-EOF regression model for SST. (left) Results for singular vectors of the SST propagator; (right) results for singular vectors derived for the whitened propagator. (top) The right singular vectors, (middle) the left singular vectors, and (bottom) the time series of the left singular vectors. The singular values and explained variances are indicated in the figure.
Citation: Journal of the Atmospheric Sciences 65, 5; 10.1175/2007JAS2401.1
(left) Leading 7-month singular vector for the signal propagator and (right) of the time-lagged covariance matrix for SST. (top) The right singular vectors, (middle) the left singular vectors, and (bottom) the time series of the left singular vectors. The singular values and explained variances are indicated in the figure.
Trace of the whitened covariance matrix for the signal (solid), model noise (dash), and initial condition error (line with circles) as a function of lead time for an 8-EOF regression model for SST. The regression model is refitted at each lead time. The sum of all three curves is one.