# Search Results

## You are looking at 1 - 10 of 81 items for

- Author or Editor: Timothy DelSole

## Abstract

A basic question in turbulence theory is whether Markov models produce statistics that differ systematically from dynamical systems. The conventional wisdom is that Markov models are problematic at short time intervals, but precisely what these problems are and when these problems manifest themselves do not seem to be generally recognized. A barrier to understanding this issue is the lack of a closure theory for the statistics of nonlinear dynamical systems. Without such theory, one has difficulty stating precisely how dynamical systems differ from Markov models. It turns out, nevertheless, that certain fundamental differences between Markov models and dynamical systems can be understood from their differential properties. It is shown that any stationary, ergodic system governed by a finite number of ordinary differential equations will produce time-lagged covariances with negative curvature over short lags and produce power spectra that decay faster than any power of frequency. In contrast, Markov models (which necessarily include white noise terms) produce covariances with positive curvature over short lags, and produce power spectra that decay only with some integer power of frequency. Problems that arise from these differences in the context of statistical prediction and turbulence modeling are discussed.
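The contrast in curvature can be illustrated with a scalar sketch. The symbols $\lambda$, $\sigma$, and $W$, and the normalization of the spectrum, are illustrative assumptions rather than notation taken from the paper. For the simplest Markov model, a white-noise-driven Ornstein-Uhlenbeck process $dx = -\lambda x\,dt + \sigma\,dW$ with $\lambda > 0$,

$$
C(\tau) = \frac{\sigma^2}{2\lambda}\, e^{-\lambda|\tau|}, \qquad
C''(\tau) = \lambda^2 C(\tau) > 0 \quad (\tau \neq 0), \qquad
S(\omega) \propto \frac{\sigma^2}{\lambda^2 + \omega^2},
$$

so the covariance has positive curvature at short lags and the spectrum decays only as $\omega^{-2}$. For a stationary process whose trajectories are differentiable, as for solutions of ordinary differential equations, stationarity gives

$$
C(\tau) = \langle x(t)\, x(t+\tau) \rangle, \qquad
C''(0) = -\langle \dot{x}^2 \rangle \le 0,
$$

which is the negative short-lag curvature described above.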

## Abstract

This paper presents a framework for quantifying predictability based on the behavior of imperfect forecasts. The critical quantity in this framework is not the forecast distribution, as used in many other predictability studies, but the conditional distribution of the state given the forecasts, called the regression forecast distribution. The average predictability of the regression forecast distribution is given by a quantity called the mutual information. Standard inequalities in information theory show that this quantity is bounded above by the average predictability of the true system and by the average predictability of the forecast system. These bounds clarify the role of potential predictability, about which many incorrect statements can be found in the literature. Mutual information has further attractive properties: it is invariant with respect to nonlinear transformations of the data, cannot be improved by manipulating the forecast, and reduces to familiar measures of correlation skill when the forecast and verification are jointly normally distributed. The concept of potential predictable components is shown to define a lower-dimensional space that captures the full predictability of the regression forecast without loss of generality. The predictability of stationary, Gaussian, Markov systems is examined in detail. Some simple numerical examples suggest that imperfect forecasts are not always useful for jointly normally distributed systems since greater predictability often can be obtained directly from observations. Rather, the usefulness of imperfect forecasts appears to lie in the fact that they can identify potential predictable components and capture nonstationary and/or nonlinear behavior, which are difficult to capture by low-dimensional, empirical models estimated from short historical records.
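The link between mutual information and correlation skill in the jointly normal case can be sketched in a few lines. This uses the standard result that two jointly normal scalars with correlation `rho` have mutual information `-0.5 * ln(1 - rho**2)`. The function names are hypothetical, and the scalar setting is a simplifying assumption, not the paper's multivariate framework.

```python
import numpy as np


def gaussian_mutual_information(rho: float) -> float:
    """Mutual information (in nats) of two jointly normal scalars with correlation rho."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return float(-0.5 * np.log(1.0 - rho**2))


def correlation_from_mutual_information(mi: float) -> float:
    """Invert the relation: the |rho| implied by a mutual information value in nats."""
    return float(np.sqrt(1.0 - np.exp(-2.0 * mi)))


# Mutual information grows monotonically with |rho| and vanishes at rho = 0.
for rho in (0.0, 0.3, 0.6, 0.9):
    print(rho, gaussian_mutual_information(rho))
```

Because the map between `rho` and mutual information is one-to-one in this setting, ranking forecasts by either quantity gives the same order.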

## Abstract

This paper tests the hypothesis that optimal perturbations in quasigeostrophic turbulence are excited sufficiently strongly and frequently to account for the energy-containing eddies. Optimal perturbations are defined here as singular vectors of the propagator, for the energy norm, corresponding to the equations of motion linearized about the time-mean flow. The initial conditions are drawn from a numerical solution of the nonlinear equations associated with the linear propagator. Experiments confirm that energy is concentrated in the leading evolved singular vectors, and that the average energy in the initial singular vectors is within an order of magnitude of that required to explain the average energy in the evolved singular vectors. Furthermore, only a small number of evolved singular vectors (4 out of 4000) are needed to explain the dominant eddy structure when total energy exceeds a predefined threshold. The initial singular vectors explain only 10% of such events, but this discrepancy was similar to that of the full propagator, suggesting that it arises primarily due to errors in the propagator. In the limit of short lead times, energy conservation can be expressed in terms of suitable singular vectors to constrain the energy distribution of the singular vectors in statistically steady equilibrium. This and other connections between linear optimals and nonlinear dynamics suggest that the positive results found here should carry over to other systems, provided the propagator and initial states are chosen consistently with respect to the nonlinear system.
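A toy sketch of initial and evolved singular vectors may help fix the terminology. It assumes a hypothetical 2×2 stable but non-normal operator `A`, and it takes the energy norm to be the ordinary Euclidean norm; neither comes from the paper. The propagator is `exp(A * tau)`, and its singular value decomposition gives the initial singular vectors, their amplification factors, and the evolved singular vectors.

```python
import numpy as np


def propagator(A: np.ndarray, tau: float) -> np.ndarray:
    """Compute exp(A * tau) by eigendecomposition (assumes A is diagonalizable)."""
    w, V = np.linalg.eig(A)
    G = V @ np.diag(np.exp(w * tau)) @ np.linalg.inv(V)
    return G.real  # imaginary parts are round-off when A is real


def singular_vectors(A: np.ndarray, tau: float):
    """Return (initial vectors, amplification factors, evolved vectors) as columns."""
    G = propagator(A, tau)
    U, s, Vt = np.linalg.svd(G)
    return Vt.T, s, U


# Hypothetical operator: both eigenvalues are negative (asymptotically stable),
# yet the off-diagonal coupling allows transient growth.
A = np.array([[-1.0, 10.0],
              [0.0, -2.0]])
initial, growth, evolved = singular_vectors(A, tau=0.5)
print("amplification factors:", growth)
print("energy growth of leading optimal:", growth[0] ** 2)
```

In this example the leading amplification factor exceeds one even though every eigenmode decays, which is the sense in which such perturbations are called optimal.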

## Abstract

Recent studies reveal that randomly forced linear models can produce realistic statistics for inhomogeneous turbulence. The random forcing and linear dissipation in these models parameterize the effect of nonlinear interactions. Due to lack of a reasonable theory to do otherwise, many studies assume that the random forcing is homogeneous. In this paper, the homogeneous assumption is shown to fail in systems with sufficiently localized jets. An alternative theory is proposed whereby the rate of variance production by the random forcing and dissipation are assumed to be proportional to the variance of the response at every point in space. In this way, the stochastic forcing produces a response that drives itself. Different theories can be formulated according to different metrics for measuring “variance.” This paper gives a methodology for obtaining the solution to such theories and the conditions that guarantee that the solution is unique. An explicit hypothesis for large-scale, rotating flows is put forward based on local potential enstrophy as a measure of eddy variance. This theory, together with conservation of energy, determines all the parameters of the stochastic model, except one, namely, the multiplicative constant specifying the overall magnitude of the eddies. Comparisons of this and more general theories with both nonlinear simulations and assimilated datasets are found to be encouraging.

## Abstract

A stochastic model for shear-flow turbulence is constructed under the constraint that the parameterized nonlinear eddy–eddy interactions conserve energy but dissipate potential enstrophy. This parameterization is appropriate for truncated models of quasigeostrophic turbulence that cascade potential enstrophy to subgrid scales. The parameterization is not closed but constitutes a rigorous starting point for more thorough parameterizations. A major simplification arises from the fact that independently forced spatial structures produce covariances that can be superposed linearly. The constrained stochastic model cannot sustain turbulence when dissipation is strong or when the mean shear is weak because the prescribed forcing structures extract potential enstrophy from the mean flow at a rate too slow to sustain a transfer to subgrid scales. The constraint therefore defines a transition shear separating states in which turbulence is possible from those in which it is impossible. The transition shear, which depends on forcing structure, achieves an absolute minimum value when the forcing structures are optimal, in the sense of maximizing enstrophy production minus dissipation by large-scale eddies.

The results are illustrated with a quasigeostrophic model with eddy dissipation parameterized by spatially uniform potential vorticity damping. The transition shear associated with spatially localized random forcing and with reasonable eddy dissipation is close to the correct turbulence transition point determined by numerical simulation of the fully nonlinear system. In contrast, the transition shear corresponding to the optimal forcing functions is unrealistically small, suggesting that at weak shears these structures are weakly excited by nonlinear interactions. Nevertheless, the true forcing structures must project on the optimal forcing structures to sustain a turbulent cascade. Because of this property and their small number, the leading optimal forcing functions may be an attractive basis set for reducing the dimensionality of the parameterization problem.

## Abstract

… $\mathbf{C}_\tau$, and are substituted into the fluctuation-dissipation relation for a first-order Markov model with white noise forcing, $\mathbf{C}_\tau \mathbf{C}_0^{-1} = \exp(\mathbf{A}\tau)$, which defines the dynamic operator $\mathbf{A}$. The dynamic operator obtained by inverting the relation was found to depend on time lag. In particular, for small time lags (τ < 1 day), the eigenvectors and imaginary eigenvalues were independent of time lag, while the damping rates increased linearly with time lag. It is shown analytically that precisely this discrepancy occurs when the relation is applied to data generated by a red noise Markov model using a time lag that is small compared to the decorrelation time of the noise. Although a fourth-order Markov model with white noise can more accurately reproduce the covariances, the result of inverting the fluctuation-dissipation relation for such a model implies that the spectrum of the noise involves a superposition of stochastic processes of different spectral characteristics, in which case the effective dissipation and stochastic excitation cannot be completely solved by inverting such generalized fluctuation-dissipation relations. Projecting the data onto the dominant EOFs can distort the dynamic operator and introduce discrepancies even when the underlying data rigorously satisfies the fluctuation-dissipation relation. Despite this confounding factor, the consistency of the results at each order suggests that the effective dissipation is composed of low-order cross-stream gradients of streamfunction and that the excitation is correlated in the cross-stream direction within only a few Rossby radii.
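Inverting the first-order relation can be sketched as follows, assuming it takes the standard form in which the lagged covariance equals `exp(A * tau)` times the zero-lag covariance. The function names and the 2×2 test matrices are hypothetical. The matrix logarithm is computed by eigendecomposition, which assumes a diagonalizable matrix and eigenvalue phases within the principal branch.

```python
import numpy as np


def matrix_function(M: np.ndarray, f) -> np.ndarray:
    """Apply a scalar function f to a diagonalizable matrix via its eigendecomposition."""
    w, V = np.linalg.eig(M)
    return V @ np.diag(f(w.astype(complex))) @ np.linalg.inv(V)


def dynamic_operator(C_tau: np.ndarray, C_0: np.ndarray, tau: float) -> np.ndarray:
    """Estimate A from C_tau C_0^{-1} = exp(A tau) using the principal matrix logarithm."""
    G = C_tau @ np.linalg.inv(C_0)
    return (matrix_function(G, np.log) / tau).real


# Synthetic check with a hypothetical damped, oscillatory operator.
A_true = np.array([[-0.5, 1.0],
                   [-1.0, -0.3]])
C_0 = np.array([[2.0, 0.3],
                [0.3, 1.0]])
for tau in (0.25, 0.5, 1.0):
    C_tau = matrix_function(A_true * tau, np.exp).real @ C_0
    print(tau, dynamic_operator(C_tau, C_0, tau))
```

When the covariances come from an exact first-order Markov model with white noise, the estimate is the same at every lag, as in this synthetic check. An estimate that changes with lag is the kind of discrepancy the abstract describes.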

## Abstract

This paper documents the low-frequency (i.e., decadal) variations of surface temperature for the period 1899–1998 in observations, and in simulations conducted as part of the Intergovernmental Panel on Climate Change (IPCC) Fourth Assessment Report (AR4). The space–time structure of low-frequency variations is extracted using optimal persistence analysis, which is a technique that linearly decomposes a vector time series into a set of uncorrelated components, ordered such that the first component maximizes the decorrelation time, the second maximizes the decorrelation time subject to being uncorrelated with the first, and so on. The results suggest that only the first two optimal persistence patterns (OPPs) in the observation-based record are statistically distinguishable from white noise. These two components can reproduce the spatial structure of local linear trends over various multidecadal periods, indicating that they give an efficient representation of the observed change in surface temperature. In contrast, most simulations suggest the existence of a single physically significant OPP, all with qualitatively similar time series but each with somewhat different spatial structure. The leading OPP computed from the full model grid is surprisingly consistent with the leading OPP computed from the observation-based grid with missing data masked out, suggesting that the observation-based grid does not pose a serious barrier to extracting the dominant low-frequency variations in the global climate system. The regions in which the leading optimal persistence patterns agree in their predictions of warming coincide with the regions in which warming has in fact been observed to occur.

## Abstract

A two-layer quasigeostrophic model is used to investigate whether dissipation can induce absolute instability in otherwise convectively unstable or stable background states. It is shown that dissipation of either temperature or lower-layer potential vorticity can cause absolute instabilities over a wide range of parameter values and over a wide range of positive lower-layer velocities (for positive vertical shear). It is further shown that these induced absolute instabilities can be manifested as local instabilities with similar properties. Compared to the previously known absolute instabilities, the induced absolute instabilities are characterized by larger scales, weaker absolute growth rates, and substantially weaker vertical phase tilt (typical values for subtropical states are zonal wavenumber 1–3, absolute growth rate 80–100 days, and period 7–10 days).

The analysis of absolute instabilities, including the case of multiple absolute instabilities, is reviewed in an appendix. Because the dispersion relation of the two-layer model can be written as a polynomial in both wavenumber and frequency, all possible saddle points and poles of the dispersion relation can be determined directly. An unusual feature of induced absolute instabilities is that the absolute growth rate can change discontinuously for small changes in the basic-state parameters. The occurrence of a discontinuity in the secondary instability is not limited to the two-layer model but is a general possibility in any system involving multiple absolute instabilities. Depending on the location of the discontinuity relative to the packet peak, a purely local analysis, as used in many numerical techniques, would extrapolate the secondary absolute instability to incorrect regions of parameter space or fail to detect the secondary absolute instability altogether. An efficient procedure for identifying absolute instabilities that accounts for these issues is developed and applied to the two-layer model.

## Abstract

This paper presents a framework based on Bayesian regression and constrained least squares methods for incorporating prior beliefs in a linear regression problem. Prior beliefs are essential in regression theory when the number of predictors is not a small fraction of the sample size, a situation that leads to overfitting—that is, to fitting variability due to sampling errors. Under suitable assumptions, both the Bayesian estimate and the constrained least squares solution reduce to standard ridge regression. New generalizations of ridge regression based on priors relevant to multimodel combinations also are presented. In all cases, the strength of the prior is measured by a parameter called the ridge parameter. A “two-deep” cross-validation procedure is used to select the optimal ridge parameter and estimate the prediction error.

The proposed regression estimates are tested on the Development of a European Multimodel Ensemble System for Seasonal to Interannual Prediction (DEMETER) hindcasts of seasonal mean 2-m temperature over land. Surprisingly, none of the regression models proposed here can consistently beat the skill of a simple multimodel mean, despite the fact that one of the regression models recovers the multimodel mean in a suitable limit. This discrepancy arises from the fact that methods employed to select the ridge parameter are themselves sensitive to sampling errors. It is plausible that incorporating the prior belief that regression parameters are “large scale” can reduce overfitting and result in improved performance relative to the multimodel mean. Despite this, results from the multimodel mean demonstrate that seasonal mean 2-m temperature is predictable for at least three months in several regions.
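The way a ridge estimate can recover an equal-weight multimodel mean in a limit can be illustrated with a minimal sketch. It assumes a penalty that shrinks the coefficients toward a prior mean; this is a textbook generalization of ridge regression, not necessarily the formulation used in the paper. The function name and the synthetic data are hypothetical. Setting `prior_mean` to zero gives standard ridge regression.

```python
import numpy as np


def ridge_toward_prior(X, y, ridge, prior_mean):
    """Minimize ||y - X b||^2 + ridge * ||b - prior_mean||^2 in closed form."""
    n_pred = X.shape[1]
    lhs = X.T @ X + ridge * np.eye(n_pred)
    rhs = X.T @ y + ridge * prior_mean
    return np.linalg.solve(lhs, rhs)


# Synthetic stand-in for four model hindcasts (columns) and a verification series.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 4))
y = X.mean(axis=1) + 0.5 * rng.standard_normal(30)

equal_weights = np.full(4, 0.25)
b_ols = ridge_toward_prior(X, y, 0.0, equal_weights)   # no prior: ordinary least squares
b_mid = ridge_toward_prior(X, y, 10.0, equal_weights)  # partial shrinkage
b_big = ridge_toward_prior(X, y, 1e8, equal_weights)   # strong prior: equal weights
print(b_ols, b_mid, b_big, sep="\n")
```

In practice the ridge parameter would be chosen by cross-validation, and the abstract notes that this selection step is itself sensitive to sampling errors.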

## Abstract

… $\tau$ is the time lag and $\rho_\tau$ is the correlation function of the time series. These integrals arise naturally in sampling theory and power spectra analysis. Moreover, these integrals define the maximum lead time beyond which linear prediction models lose all forecast skill. Thus, an optimally persistent pattern is interesting because it optimizes a quantity that is of fundamental and practical importance. An orthogonal set of time series that optimize these integrals can be obtained from the lagged covariance matrix of the dataset. The corresponding patterns, called optimal persistence patterns (OPPs), may provide a useful basis set for statistical prediction models, because they may remain correlated for much longer periods than individual empirical orthogonal functions (EOFs). The main shortcoming of OPPs is that they are sensitive to sampling errors. To reduce the sensitivity, the upper limit of integration and the basis set used to define the pattern need to be as small as possible, yet large enough to resolve the space–time structure of the pattern.

Examples of OPPs are presented for the Lorenz model and the daily anomaly 500-hPa geopotential height fields. In the case of the Lorenz model, the technique is shown to be far superior to other techniques at capturing persistent, oscillatory signals. As for geopotential height, the technique reveals that the absolute longest decorrelation time, in the space spanned by the first few dozen EOFs, is 12–15 days. It is perhaps noteworthy that this time is virtually identical to the theoretical limit of atmospheric predictability determined in previous studies. This result suggests that the monthly anomaly in this state space, which is often used to study long-term climate variability, arises not from a perturbation that lasts for a month, but rather from a few “episodes” often lasting less than 2 weeks. Depending on the number of EOFs and on which measure of decorrelation time is considered, the leading OPP resembles the Arctic oscillation. The second OPP is associated with an apparent discontinuity around March 1977. The OPP that minimizes decorrelation time (the “trailing OPP”) is associated with synoptic eddies along storm tracks.

The technique not only finds persistent signals in stationary data, but also finds trends, discontinuities, and other low-frequency signals in nonstationary data. Indeed, for datasets containing both a random component and a nonstationary component, maximizing decorrelation time is shown to be equivalent to maximizing the signal-to-noise ratio of low-frequency variations. The technique is especially attractive in this regard because it is very efficient and requires no preconceived notion about the form of the nonstationary signal.
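The role of decorrelation time in this abstract can be made concrete with a small Python sketch. The exact integrals did not survive in this listing, so the code assumes, purely for illustration, two measures built from the time lag and the correlation function: the sum of the sample autocorrelation over lags and the sum of its square, both truncated at a finite upper lag. The function names are hypothetical, and the sketch treats a single time series only. It does not compute OPPs from a lagged covariance matrix.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation rho_tau for tau = 0..max_lag (biased estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    c0 = np.dot(x, x) / n
    return np.array([np.dot(x[:n - k], x[k:]) / n / c0
                     for k in range(max_lag + 1)])

def decorrelation_times(x, max_lag, dt=1.0):
    """Discrete sums standing in for integrals of rho and rho**2 (assumed forms)."""
    rho = autocorrelation(x, max_lag)
    t1 = dt * np.sum(rho)        # assumed measure: integral of rho over lag
    t2 = dt * np.sum(rho ** 2)   # assumed measure: integral of rho squared
    return t1, t2

def ar1(phi, n, rng):
    """First-order autoregressive series x[t] = phi * x[t-1] + white noise."""
    x = np.zeros(n)
    noise = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return x

rng = np.random.default_rng(0)
persistent = ar1(0.9, 5000, rng)      # strongly autocorrelated series
white = rng.standard_normal(5000)     # no memory beyond lag zero
print("persistent series (T1, T2):", decorrelation_times(persistent, max_lag=50))
print("white noise       (T1, T2):", decorrelation_times(white, max_lag=50))
```

The truncation lag (`max_lag`) plays the part of the upper limit of integration: a larger value admits more noisy autocorrelation estimates into the sum, which mirrors the abstract's remark that this limit should be kept as small as possible.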
