1. Introduction
In climate research it is common practice to fit a statistical or stochastic model to time series of observed variables. A familiar example is the projection of data on a regression model (e.g., von Storch and Zwiers 1999). The master equation (e.g., Gardiner 1983, 8–11) is a prognostic equation for the probability density function (PDF) among discrete states of a system. A discrete time approximation of the master equation is used in this paper in discretized phase spaces spanned by climate variables. The coefficients of a discrete time master equation are probabilities for cell transitions. These probabilities can be estimated from a time series of the variables (e.g., Egger 2001), hence the attribute empirical. The empirical master equation (EME) is described in section 2. The PDF forecasts given by an EME can be used for making probabilistic predictions. Another use of the EME is for studying the processes underlying the variable set. For example, Egger (2001) studied inter alia the evolution of the mean position, that is, a trajectory in phase space, from various initial conditions. This approach is applied in the second part of this paper (Dall’Amico and Egger 2007, hereafter Part II) for studying the relationship between the variables. Another example is offered by Pasmanter and Timmermann (2002), where the entropy production is derived from the coefficients of EMEs for assessing predictability time scales.
The EME is not new in the atmospheric sciences. For instance, Spekat et al. (1983) analyzed the zonal, mixed, and meridional weather regimes from a centennial time series on the basis of an empirical first order Markov model; the latter is closely related to the EME used in this paper (see section 2). Fraedrich (1988) applied inter alia a Markov chain model to the problem of estimating predictability time scales from annual time series of ENSO. Egger (2001) derived master equations from time series of the equatorial components of the global angular momentum of the atmosphere and related torques in order to analyze dynamics in the phase plane of two variables at any one time. Pasmanter and Timmermann (2002) applied the theory of cyclic Markov chains to the ENSO predictability problem; Crommelin (2004) studied the issue of atmospheric circulation regimes in Northern Hemisphere winter using a similar Markov model.
EMEs are numerical structures whose numerical properties are not yet well known. Some of the factors influencing the quality of an EME depend on choices made by the user such as the number and the choice of the variables, and the type and degree of phase space partition. Only rules of thumb are available to assess an adequate level of partitioning. Other factors are predetermined, such as the accessible climate time series and computer resources. The length and resolution of the available time series greatly influence the quality of an EME. The number of variables must be small because long data records are needed to correctly estimate the coefficients of an EME, and any addition to the number of variables involved dramatically increases the amount of data required (see also Crommelin 2004, section 2). This paper presents for the first time three-dimensional EMEs, yet in many applications the studied system can be expected to have more dimensions. It is the purpose of this part of the paper to address these problems by taking output from the Lorenz (1963) model with additional white noise forcing as a data basis (section 3). Systematic variations of grid size, time series length, sampling interval, and the number of variables are conducted in section 4 in order to study their effect on how well the EME reproduces the dynamics of the studied system. The conclusions are outlined in section 5. The results of this part of the paper provide guidelines for the application of this methodology to any problem in (and beyond) the atmospheric sciences. Real data of limited length will be considered in Part II, where these guidelines are applied to EMEs for the quasi-biennial oscillation of equatorial stratospheric wind (see the review by Baldwin et al. 2001), the 11-yr solar cycle (e.g., Labitzke and van Loon 1999), and the northern annular mode (Thompson and Wallace 2000).
This two-part paper is based on work presented in Dall’Amico’s doctoral thesis (Dall’Amico 2005).
2. EMEs
Depending on the system, transition coefficients may depend on time. Seasonal dependence was introduced for instance by Pasmanter and Timmermann (2002), who estimated transition matrices for each month of the year on the basis of a 640-yr-long ENSO model run. However, the amount of data available in observational records is in most cases insufficient to introduce such a time dependence.
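The estimation of time-constant transition coefficients from a time series can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that the coefficient for a transition from cell j to cell i is the observed fraction of visits to j that are followed by i, and that a PDF forecast is obtained by repeated multiplication with the resulting matrix. The function names, the clamping of outliers to edge cells, and the toy cell sequence are hypothetical.

```python
import numpy as np

def cell_indices(series, lower, dq, shape):
    """Map each state vector to the flat index of the grid cell that contains it."""
    idx = np.floor((np.asarray(series) - np.asarray(lower)) / dq).astype(int)
    # Hypothetical choice: states outside the domain are clamped to the edge cells.
    idx = np.clip(idx, 0, np.array(shape) - 1)
    return np.ravel_multi_index(tuple(idx.T), shape)

def estimate_transition_matrix(cells, n_cells):
    """Return W with W[i, j] = (transitions j -> i) / (visits to j), and the visits."""
    counts = np.zeros((n_cells, n_cells))
    # Count each observed pair (from cell at step k, to cell at step k + 1).
    np.add.at(counts, (cells[1:], cells[:-1]), 1.0)
    visits = counts.sum(axis=0)
    # Columns of cells that were never left stay zero instead of dividing by zero.
    W = np.divide(counts, visits, out=np.zeros_like(counts), where=visits > 0)
    return W, visits

# Toy one-dimensional example with three cells.
cells = np.array([0, 1, 1, 2, 0, 1])
W, visits = estimate_transition_matrix(cells, 3)

# PDF forecast over two time steps, starting with all probability in cell 0.
forecast = W @ (W @ np.array([1.0, 0.0, 0.0]))
```

Each column of `W` belonging to a visited cell sums to one, so the total probability is conserved by the forecast.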

3. The Lorenz model with additional white noise forcing
The Lorenz model displays chaotic dynamics for a suitable choice of the parameters. We choose the standard parameter set Pr = 10, r = 28, and b = 8/3 (e.g., Lorenz 1963; Palmer 1993; Kaplan and Glass 1995). The state vector q = (q1, q2, q3) = (x, y, z) evolves around the famous Lorenz attractor with its two butterfly-wing-shaped lobes. A trajectory far from the Lorenz attractor rapidly approaches the attractor, whereas trajectories near the attractor show sensitive dependence on initial conditions.
The Lorenz model has often been taken as a paradigm of large-scale atmospheric circulation (e.g., Palmer 1993). Though only three-dimensional, it reflects many of the properties of the full climate system (Thuburn 2005). It is a shortcoming of the model in comparison to the atmosphere that its diffusivity in phase space is fairly low. White noise is inserted partly for this reason. Moreover, the divergence of the Lorenz model without stochastic forcing is, with ∇ · (ẋ, ẏ, ż) = −Pr − 1 − b, negative and constant. This means that the phase space occupied by the trajectories is shrinking continuously onto the Lorenz attractor. The additional white noise forcing acts against this frictional contraction, leading to a stationary stochastic system (see von Storch and Zwiers 1999, 1–2), as needed for EMEs with time-constant transition coefficients. A noise amplitude α = 2.5 is chosen, such that a modest diffusion of the states is obtained (see Figs. 5a,b), which does not drastically alter the character and shape of the trajectories (see Fig. 2).
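As a quick check of the divergence quoted above, the following worked equation assumes that the deterministic part of the model is the standard Lorenz (1963) system written with the parameter names of this section, and that the stochastic terms are additive and therefore do not contribute to the divergence:

```latex
\begin{aligned}
\dot{x} &= \mathrm{Pr}\,(y - x), \qquad
\dot{y} = x\,(r - z) - y, \qquad
\dot{z} = x y - b z, \\
\nabla \cdot (\dot{x}, \dot{y}, \dot{z})
 &= \frac{\partial \dot{x}}{\partial x}
  + \frac{\partial \dot{y}}{\partial y}
  + \frac{\partial \dot{z}}{\partial z}
  = -\mathrm{Pr} - 1 - b
  = -10 - 1 - \tfrac{8}{3}
  = -\tfrac{41}{3} \approx -13.7 .
\end{aligned}
```

Under these assumptions a phase space volume V of the unforced system contracts as V(t) = V(0) exp(−41t/3).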
Three time scales are associated with the model. The one describing the evolution of a trajectory about the (weakly) unstable fixed point at the center of each attractor wing is t_win ≈ 0.7. The residence time, t_res, in a wing varies approximately between 1 and 10 time units. These time scales can be seen in Fig. 2 (solid line), where the evolution of the x component from an arbitrary initial state is shown as a function of time. The dependence on initial conditions is revealed by the evolution of the dashed line, where the white noise forcing is identical but the initial conditions are slightly different from those of the solid line. There exists also a diffusive time scale, t_dif ≈ L/α, due to the stochastic forcing (L ≈ 20 is the diameter of a wing), with t_win < t_dif.
Both the choice of the Lorenz model and that of a rather weak white noise forcing are quite challenging. The attractor has a complex shape and a relatively fine grid size is required in order to resolve its wings. The demands on the quality of the EME would have been less stringent if a system with a more trivial attractor had been chosen or, as often seen in literature, if a stronger noise had been used [as in e.g., Gradišek et al. (2000) and Thuburn (2005), where Fokker–Planck equations are considered]. With weak white noise, a cloud of points smears quite slowly. In contrast, as α grows, the diffusion of a cloud of points due to the noise term becomes comparable to the numerical diffusion acting on the PDF forecast by an EME derived from a long time series (see section 4), leading to better predictive skill when comparing the two.
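A time series of the kind used in this section could be generated along the following lines. This is a sketch under stated assumptions, not the integration scheme of the paper: it assumes that white noise of amplitude α is added to all three components, and it uses a simple Euler–Maruyama step with an integration step of 0.001 chosen only for illustration. The function name and the initial state are hypothetical.

```python
import numpy as np

def noisy_lorenz(q0, n_steps, dt=0.001, alpha=2.5, pr=10.0, r=28.0, b=8.0 / 3.0, seed=0):
    """Euler-Maruyama integration of the Lorenz (1963) model with additive white noise.

    Assumption: independent Gaussian noise of amplitude alpha acts on x, y, and z.
    """
    rng = np.random.default_rng(seed)
    q = np.empty((n_steps + 1, 3))
    q[0] = q0
    sqrt_dt = np.sqrt(dt)
    for k in range(n_steps):
        x, y, z = q[k]
        drift = np.array([pr * (y - x), x * (r - z) - y, x * y - b * z])
        # Deterministic step plus a noise increment scaling with sqrt(dt).
        q[k + 1] = q[k] + drift * dt + alpha * sqrt_dt * rng.standard_normal(3)
    return q

series = noisy_lorenz(q0=(1.0, 1.0, 20.0), n_steps=5000)
sampled = series[::20]  # keep every 20th state: sampling interval Dt = 0.020
```

Subsampling the integrated series, as in the last line, is one way to obtain time series that differ only in their sampling interval.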
4. Results
The impact of the choice of the main numerical parameters on the quality of the EME is discussed in the following subsections. The transition coefficients of each EME are estimated from a single time series whose length is indicated as Δt.
a. Grid size
The choice of the grid size depends on the problem at hand. At best, a few guidelines can be formulated. The data of a “long” time series beginning near the attractor are included in a parallelepiped with a volume of about 50 · 60 · 50. This volume is the domain of solution. The grid size is set initially depending on the features that have to be resolved. Figure 3 shows how the observed state density for a time series of length Δt = 51 200 varies depending on the grid size chosen to partition the phase space. With a grid size Dq = 5.00, the above parallelepiped is partitioned into 1200 cells (see Fig. 3a, where −30 ≤ y < 30). Almost 250 of these cells intersect the attractor. A relatively fine grid size is needed because of the complicated structure of the attractor. As may be seen in Fig. 3a, the choice of Dq = 5.00 does not resolve the “holes” in each of the butterfly wings. They are resolved with Dq = 2.50 (Fig. 3b) and the overall picture contains more details when Dq = 1.25 (Fig. 3c). However, the required computer cost grows dramatically as the grid size is reduced. With Dq = 2.50, approximately 1000 cells intersect the attractor, and with Dq = 1.25 almost 4750 cells. Moreover, the number of cells representing the domain of the solution grows as Dq decreases and, since transitions to and from all cells have to be considered, the size of the array representing the transition matrix, 𝗪, grows with the square of the number of cells. The time needed for computing correlation functions increases with approximately the eighth power of the inverse of the grid size for this three-dimensional case. Therefore the grid size is set at Dq = 2.50.
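The growth in cost can be made concrete with a little arithmetic. The sketch below counts the cells of the full 50 · 60 · 50 domain for the three grid sizes and the corresponding number of entries of a dense transition matrix; it reproduces the 1200 cells quoted for Dq = 5.00. The function name is hypothetical, and the counts refer to the whole parallelepiped rather than only to the cells intersecting the attractor.

```python
def n_cells(extent, dq):
    """Number of cubic cells of edge dq needed to cover a box with the given extent."""
    count = 1
    for length in extent:
        count *= round(length / dq)
    return count

extent = (50.0, 60.0, 50.0)  # approximate size of the domain of solution
table = {dq: n_cells(extent, dq) for dq in (5.0, 2.5, 1.25)}

for dq, cells in table.items():
    # A dense transition matrix has one entry per ordered pair of cells.
    print(f"Dq = {dq:4.2f}: {cells:6d} cells, up to {cells**2:.2e} entries in W")
```

Halving the grid size multiplies the number of cells by 8 and the number of matrix entries by 64 in three dimensions, which is why storing only the cells actually visited matters in practice.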
Figure 4 reports the ratios Rw for grid sizes Dq = 2.50 and Dq = 1.25 as a function of the length of the time series used to estimate 𝗪 for an arbitrary time resolution; the time series must be about eight times longer for the Rw values for Dq = 1.25 to be as low as those for Dq = 2.50. While deriving an EME from an observational time series, the grid size might have to be adjusted according to the ratio Rw and to the correlation functions delivered by the EME. Correlation functions as delivered by EMEs confirm the choice Dq = 2.50 in terms of quality and of needed computer resources also with respect to other grid sizes (not shown).
b. Time series length
The length of the time series affects the value of the ratio Rw, as may be seen in Fig. 4. The values of Rw in Fig. 4 are quite high for short time series even for the chosen grid size, Dq = 2.50. For Δt < 200, for instance, Rw > 0.5, which means that the statistical significance of the transition coefficients’ estimates is low.
By affecting the estimate of the transition coefficients, the time series length can also have an impact on the PDF forecasts. As a case study, PDF forecasts from a particular initial condition, which are given by EMEs derived from time series of different lengths, are compared to the evolution of an ensemble of points. The latter is obtained by integration of Eqs. (7) and is shown for t = 0, 0.2, 0.4, 0.6 in Fig. 5a and for t = 0.8 in Fig. 5b. These points are initially located in a cell of the phase space discretized with Dq = 2.50. This cell includes part of the attractor. Palmer (1993) pointed out that there are portions of the Lorenz attractor where trajectories depart fairly slowly. In other cases, an ensemble of points reaches the splitting region of the attractor and adjacent trajectories diverge toward the two different wings, so that there are regions of the attractor that are relatively more sensitive to initial conditions. The cloud in Fig. 5a moves partly through the splitting region between t = 0.3 and t = 0.4, and only very few points are located on the left-hand side of the attractor at t = 0.8. Losses in prediction skill are expected if the PDF evolves occupying the splitting region imprecisely. Initial conditions as in Fig. 4 of Palmer (1993), which either evolve far away or directly through the splitting region, represent an easier task for the EME. The one in Fig. 5a is challenging and shows how the estimation of 𝗪 from a very short time series may lead to very poor forecasts. The forecasts are considered at time t = 0.8. This time is close to the minimum time generally needed for nearby initial conditions to diverge toward the two wings of the attractor (see Palmer 1993). The reference density of ensemble members, μ, that is, the relative frequency of points per cell integrated along the y axis, is shown for t = 0.8 in Fig. 5c. To compare the PDF with μ, the latter is normalized accordingly: Σᵢ μᵢ Dq³ = 1. In Fig. 5c there is a well-marked head on the right wing of the attractor and a long thinly populated tail. The PDF forecasts in Fig. 6 are compared to Fig. 5c. These forecasts are given by EMEs derived from time series of different lengths. The time series are obtained by extending the integration shown by the solid line in Fig. 2. No forecast is possible if the time series is shorter than 50 time units since no observation falls into the starting cell. The forecast shown in Fig. 6a is of poor quality. The forecasts shown in Figs. 6d,e,f are of better quality and almost identical. In these forecasts, the PDF is higher on the right wing of the attractor, just as the density of ensemble members in Fig. 5c; the PDF on the left wing is a result of the numerical diffusion (see below), which leads to nonzero PDF for time t = 0.4 (see Fig. 5a) on a wider region, a part of which evolves to the left attractor wing.
c. Time resolution
The transition coefficients, and consequently their estimate after Eq. (5), depend on the time step. On the one hand, a fine time resolution is desirable in order to improve the statistical significance of the transition coefficients. The ratio Rw generally decreases with increasing Σᵢ Nᵢ, as is the case for a long time series and a fine time resolution. On the other hand, the computing time increases as the time step is reduced and, since the time scales of the investigated phenomena are usually known, Dt should not be unnecessarily fine. The time resolution also sets a limit to the highest frequency of the system that an EME will be able to reproduce.
Figure 9 shows PDF forecasts obtained with different time steps. Figure 9b shows for reference the forecast obtained with a time series of length Δt = 3200 and time resolution Dt = 0.020 (same as Fig. 6d), that is, longer than the threshold length discussed in section 4b. Figure 9a shows the PDF forecast delivered by the EME when the sampling frequency of the same time series as for Fig. 9b is increased by a factor of 5, that is, Dt = 0.004. In contrast, for Fig. 9c, the time resolution is coarser than in Fig. 9b by a factor of 5, Dt = 0.100. The best forecast of the three is Fig. 9c, where the PDF, integrated along the y axis, exceeds a value of 0.0100 on the right wing of the attractor. Surprisingly, this forecast is the one obtained with the coarsest time resolution, Dt = 0.100. The skill scores S computed for the forecasts in Fig. 9 confirm this. For the finest time resolution the skill is the worst at 0.83 (Fig. 9a), and improves to 0.75 with a time resolution Dt = 0.020 (Fig. 9b); the skill is best at 0.48 for the coarsest time resolution (Fig. 9c).
The two drawings in Fig. 10 represent a simple example, which helps one understand the surprisingly rapid smearing of the PDF observed in connection with a finer time resolution. In Fig. 10b, the time step is 3 times larger than in Fig. 10a. The PDF values written in Fig. 10 are predicted by this illustrative EME on the basis of the few observations available. It is easy to follow the evolution of the PDF and to realize that the PDF smears much faster in the case with shorter Dt (Fig. 10a). Given good estimates of the transition coefficients, the EME makes its best forecasts, whatever the initial condition, over one single time step no matter how large Dt is. (The PDF forecast is almost perfect if the cloud of states initially coincides with a cell.) For the previous case study (Fig. 9), Dt = 0.8 is the best time resolution since only one time step is needed for prediction. After any time step, the PDF is spread evenly over each cell. This spreading is unavoidable and implies diffusion. This is a specific example of numerical diffusion. While Fig. 10 is illustrative, exact statements about the numerical spreading of PDFs can be made for simple systems; in the appendix, an explicit solution to the problem of numerical spreading is given for an advective case.
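The dependence of the smearing on the time step can be reproduced with a one-dimensional toy problem. The sketch below is an illustration under simple assumptions and is not the example of Fig. 10 or the appendix: a state moves at constant speed through equal cells, and because the PDF is spread evenly over a cell after each step, a fraction c = speed · Dt / Dq of each cell's probability moves to the next cell per step. All names and parameter values are hypothetical.

```python
import numpy as np

def advect_pdf(total_time, dt, speed=1.0, dq=1.0, n_cells=64):
    """Cell probabilities after advecting a PDF that starts entirely in cell 0."""
    c = speed * dt / dq  # fraction of a cell's probability that moves on per step
    if not 0.0 < c <= 1.0:
        raise ValueError("this sketch only handles moves of at most one cell per step")
    p = np.zeros(n_cells)
    p[0] = 1.0
    for _ in range(round(total_time / dt)):
        # A fraction (1 - c) stays in place, a fraction c enters the next cell.
        p = (1.0 - c) * p + c * np.roll(p, 1)
    return p

def spread(p, dq=1.0):
    """Standard deviation of the cell-center positions weighted by p."""
    pos = (np.arange(p.size) + 0.5) * dq
    mean = np.sum(p * pos)
    return np.sqrt(np.sum(p * (pos - mean) ** 2))

# Same forecast time, three different time steps.
spreads = {dt: spread(advect_pdf(8.0, dt)) for dt in (0.1, 0.5, 1.0)}
```

In this toy case the variance after n steps is that of a binomial distribution, n c (1 − c) Dq², so for a fixed forecast time the spread shrinks as the time step grows and vanishes when a step carries the state exactly one cell.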
Figure 11 shows some correlation functions estimated directly from the time series (solid lines) and delivered by the EMEs that gave the forecasts in Fig. 9. The transition coefficients of these EMEs are estimated from time series differing only in their sampling interval whereas the starting point, the evolution, and the length are the same. Figure 11 shows that all EMEs approximate the decay of the correlation functions extremely well. The EME based on the time series with the coarsest time resolution does the best job in reproducing the sample correlations, confirming the results seen in the previous case study. This finding contrasts with standard results where the numerical diffusion is reduced for smaller time steps (Mesinger and Arakawa 1976). Given the low diffusion in the time series due to the model’s chaotic nature and due to white noise forcing, numerical diffusion prevails unless a much finer grid size is used at which the diffusion in the data dominates the advective transport (see Durran 1998, p. 139). As the time step grows, the ratio between the scale of diffusion in the data and the grid size increases, thus weakening the impact of numerical diffusion. As can be seen in Fig. 11, the decorrelation time for the time series in question is much longer than the time resolutions considered (Dt = 0.004, 0.020, and 0.100). This makes numerical diffusion the sole explanation for the result that the PDF forecasts delivered by the EME improve as the time step increases. A reduction of numerical diffusion can also be sought by reducing the grid size. However, besides implying a growth of the ratio Rw, a grid size reduction also causes a rapid increase of the requirements for the computing resources.
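One way a correlation function can be obtained from an EME is sketched below. This is an assumption-laden illustration rather than the definition used for Fig. 11: it takes the matrix convention W[i, j] = probability of moving from cell j to cell i, represents each cell by a single value of x (for example its center), and uses a given stationary cell probability p. The function name and the two-cell test chain are hypothetical.

```python
import numpy as np

def eme_autocorrelation(W, p, x, n_lags):
    """Autocorrelation of x at lags 0..n_lags (in time steps) implied by W and p."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    anomalies = np.asarray(x, dtype=float) - np.sum(p * x)
    variance = np.sum(p * anomalies ** 2)
    r = np.empty(n_lags + 1)
    # weighted[j] starts as p_j * anomaly_j and is propagated by W at every lag.
    weighted = p * anomalies
    for n in range(n_lags + 1):
        r[n] = np.sum(anomalies * weighted) / variance
        weighted = W @ weighted
    return r

# Two-cell chain with a 10% chance of switching cells at each step.
W = np.array([[0.9, 0.1],
              [0.1, 0.9]])
r = eme_autocorrelation(W, p=[0.5, 0.5], x=np.array([-1.0, 1.0]), n_lags=3)
```

For this symmetric two-cell chain the correlation decays geometrically with ratio 0.8 per step, which gives a simple check of the routine.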
The encouraging results in Figs. 9c and 11 (dashed–dotted lines) have important practical implications. With Dq = 2.50, about ζ = 10³ cells intersect the region of the phase space occupied by the observations and up to ζ² = 10⁶ transitions are conceivable. Yet the EME was derived from a time series of length Δt = 3200 and time resolution Dt = 0.100, that is, of only 32 × 10³ data points. For the EME in question, Rw = 0.29. Transitions occur mainly toward the cells along the main direction of movement (not backward or in any transverse direction), and in this case merely 3.2 × 10³ transitions can actually take place. For an atmospheric application where the region occupied by the observations has a much simpler structure than the attractor of Eqs. (7), the grid size can be initially set to a value that gives a few hundred cells. The grid size can then be adjusted depending on the ratio Rw for the considered time step. The adherence of the correlation functions may also suggest a change of the numerical parameters. This approach is adopted in Part II with positive results.
The EME used here is discrete in time and phase space. The Fokker–Planck equation, on the other hand, is a partial differential equation that can also be used to predict the PDF at least under favorable circumstances (Zwanzig 2001). In practice, the Fokker–Planck equation must be solved numerically, whereby it is transformed into a finite difference equation (e.g., Thuburn 2005). The drift and diffusion terms of a Fokker–Planck equation may also be estimated from data (Siegert et al. 1998). This technique was applied, for example, by Egger and Jònsson (2002) to meteorological observations in the Icelandic region. An empirical discrete Fokker–Planck equation can be seen as a particular case of an EME. Master equations are much more general than Fokker–Planck equations (Zwanzig 2001). In a discretized version of the phase space, the coefficients within the parentheses in Eq. (8) can also be estimated from a time series of the variables. Gradišek et al. (2000) applied this technique to time series of various origins, among these some stemming from the Lorenz (1963) model with additional stochastic forcings. They found that the maximum acceptable time step needed for the estimates of the drift and diffusion coefficients to converge is shorter than the time step required for the integration of the corresponding differential equations. Moreover, the integration of the Fokker–Planck equation is constrained by the Courant–Friedrichs–Lewy criterion and its numerical implementation normally involves only neighboring cells. Large time steps can be used only with the EME, which considers transitions to any cell in phase space. Within a time Dt = 0.100, for instance, the state vector may jump to a cell separated from the previous one by up to 12 cells of grid size Dq = 2.50.
d. Dimension of the EME
In a more realistic situation, the EME does not contain as many variables as the investigated system. In practice, EMEs with, say, three variables are applied to systems with many variables. To assess the effect of considering a reduced set of variables, a two-dimensional EME is derived from a projection on the (x, z) plane of the same time series that led to the prediction in Fig. 9c. In this case, variable y does not appear in the EME. The left panel of Fig. 12 shows the forecast delivered by such an EME starting from the same initial condition as in Fig. 9c. The partition of the x and z axes is unchanged. The value of Rw is now 0.19, whereas Rw was 0.29 in the three-dimensional case. Such a reduction is not surprising since the same number of data points as in the three-dimensional case is now used to estimate a much lower number of transitions. The forecast in the left panel of Fig. 12, where the PDF is about equally distributed on either attractor wing, is clearly worse than the corresponding three-dimensional one (Fig. 9c). The evolution of the autocorrelation function of the x component, rxx, is shown in the right panel of Fig. 12 and beyond about 0.3 time units it is not as good as the one delivered by the three-dimensional EME. In the case of an observational time series, the introduction of another variable should be considered. While deriving EMEs from atmospheric datasets, however, it is practically impossible to consider a complete set of variables. In the example of Fig. 12, the low value of Rw might have induced optimism, yet this EME does not do a very good job in reproducing the dynamics of the studied system. An Rw value below, say, 0.4, suggests that the amount of data available is adequate for the number of transitions to be estimated. For Rw values between 0.4 and 0.6, much caution in the study of the EME is recommended if the user decides not to increase the grid size. Values above 0.6 should lead to the use of a coarser grid size.
Whatever the Rw value, caution is recommended in evaluating results obtained from an EME if its correlation functions suggest that the behavior of the system is not adequately reproduced. An atmospheric application requiring a five-dimensional rather than a three-dimensional EME is discussed in section 3b of Part II.
5. Conclusions
EMEs are constructed directly from data and provide a model of the phase space dynamics of a system. They may capture nonlinear behavior. A study of the numerical properties of EMEs has been the object of this part of the paper. This is the first time that grid size, time series length, time step, and phase space dimensionality have been studied systematically. The time series needed for the numerical study have been generated by numerical integration of the equations of the Lorenz model with additional white noise forcing. Thus, time series with the desired characteristics could be generated easily. The choice of the dynamical model has been quite challenging. The attractor of this model has a fairly complicated structure so that a fine grid size is needed in order for the EME to replicate the motion in phase space. Moreover, the model’s diffusivity due to its chaotic nature and due to the white noise forcing is fairly low, making numerical diffusion a dominant factor. The significance of the estimates of the transition coefficients of the EME has been assessed in terms of a weighted averaged ratio, Rw, between the half-width of the confidence intervals of the transition coefficient estimates and the estimates themselves. The adherence of correlation functions as delivered by the EME to those estimated from the data has been used as a test of how well the EME reproduces the dynamics of the system.
We find that:
The grid size choice is a compromise between desired resolution in phase space on one side, and available data and computer resources on the other. A moderately fine grid size has delivered very encouraging results.
The estimate of the transition coefficients improves with growing time series length. Moreover, we find that, for a given grid size and time resolution, there is a threshold time series length beyond which the forecast skill does not improve; this threshold length appears to be moderate. For the Lorenz model with additional white noise forcing, we find a threshold of order 10³ time units. An approach considering the transition coefficient matrix 𝗪 as a multidimensional vector shows that 𝗪 converges with growing time series length.
Surprisingly, EMEs derived from time series with coarser time resolutions show better forecast skills and deliver better correlation functions. This is due to a decrease in the numerical diffusion acting on the PDF forecasts as the time step grows. Best forecasts are obtained with a single time step.
All in all, encouraging results are obtained considering partitions of the attractor into about 1000 cells with a time series consisting of only 3.2 × 10⁴ data points. This result supports the applicability of the method to atmospheric time series. For a three-dimensional atmospheric application, we recommend beginning with partitions of the “data cloud” into several hundred cells. The numerical parameters can be adjusted according to the corresponding value of the ratio Rw and, finally, to the quality of the correlation functions delivered by the EME.
The example chosen is unusual in that the EME has the same number of variables as the Lorenz model. An EME derived for time series of only the x and z components leads, despite a low Rw value, to poor PDF predictions. The adherence of the correlation functions deteriorates with respect to the three-dimensional case. This result demonstrates the importance of the number of variables and calls for caution in cases when the correlation functions delivered by the EME strongly disagree with the ones obtained from the data. This disagreement can occur particularly when high-dimensional systems are studied; in that case the dimension of the EME should be increased if feasible.
In Part II, EMEs are derived from time series obtained from the ERA-40 re-analysis (Uppala et al. 2005) and observations. These time series are limited in time and hence the existence of a threshold length may not be tested. However, the remaining results of this numerical study, including the surprising role of time resolution, are confirmed.
Acknowledgments
Financial support was provided by the German Ministry of Education and Research and the German Aerospace Center within KLIMESTO, a project of the German Climate Research Program, Contract 01LD0033. See Part II for more thorough acknowledgments.
REFERENCES
Anderson, T. W., and L. A. Goodman, 1957: Statistical Inference about Markov Chains. Ann. Math. Stat., 28, 89–110.
Baldwin, M. P., and Coauthors, 2001: The quasi-biennial oscillation. Rev. Geophys., 39, 179–229.
Cencini, M., G. Lacorata, A. Vulpiani, and E. Zambianchi, 1999: Mixing in a meandering jet: A Markovian approximation. J. Phys. Oceanogr., 29, 2578–2594.
Crommelin, D. T., 2004: Observed nondiffusive dynamics in large-scale atmospheric flow. J. Atmos. Sci., 61, 2384–2396.
Dall’Amico, M., 2005: Data-based master equations for the stratosphere. Ph.D. thesis, Ludwig-Maximilians-Universität of Munich, Germany, 71 pp. [Available online at http://edoc.ub.uni-muenchen.de/archive/00003890/.]
Dall’Amico, M., and J. Egger, 2007: Empirical master equations. Part II: Application to stratospheric QBO, solar cycle, and northern annular mode. J. Atmos. Sci., 64, 2996–3015.
Durran, D. R., 1998: Numerical Methods for Wave Equations in Geophysical Fluid Dynamics. Springer, 465 pp.
Egger, J., 2001: Master equations for climatic parameter sets. Climate Dyn., 17, 169–177.
Egger, J., 2002: Master equations for Himalayan valley winds. Stochastic Dyn., 2, 381–394.
Egger, J., and T. Jònsson, 2002: Dynamic models for Icelandic meteorological data sets. Tellus, 54A, 1–13.
Egger, J., and M. Dall’Amico, 2007: Empirical master equations: Numerics. Meteor. Z., 16, 139–147.
Fraedrich, K., 1988: El Niño/Southern Oscillation predictability. Mon. Wea. Rev., 116, 1001–1012.
Gardiner, C. W., 1983: Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences. Springer, 442 pp.
Gradišek, J., S. Siegert, R. Friedrich, and I. Grabec, 2000: Analysis of time series from stochastic processes. Phys. Rev. E, 62, 3146–3155.
Kaplan, D., and L. Glass, 1995: Understanding Nonlinear Dynamics. Springer, 420 pp.
Kloeden, P. E., E. Platen, and H. Schurz, 1997: Numerical Solution of SDE through Computer Experiments. 2d ed. Springer-Verlag, 292 pp.
Labitzke, K. G., and H. van Loon, 1999: The Stratosphere, Phenomena, History, and Relevance. Springer, 179 pp.
Levy, P., 1948: Processus stochastiques et mouvement brownien. Gauthier-Villars, 365 pp.
Lorenz, E. N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20, 130–141.
Mesinger, F., and A. Arakawa, 1976: Numerical methods used in atmospheric models. Global Atmospheric Research Programme (GARP) Publication Series 17, World Meteorological Organization, 64 pp.
Nicolis, C., 1990: Chaotic dynamics, Markov processes and climate predictability. Tellus, 42A, 401–412.
Nicolis, C., W. Ebeling, and C. Baraldi, 1997: Markov processes, dynamic entropies and the statistical prediction of mesoscale weather regimes. Tellus, 49A, 108–118.
Palmer, T. N., 1993: Extended range atmospheric prediction and the Lorenz model. Bull. Amer. Meteor. Soc., 74, 49–65.
Pasmanter, R. A., and A. Timmermann, 2002: Cyclic Markov chains with an application to an intermediate ENSO model. Nonlinear Proc. Geophys., 9, 1–14.
Press, W. H., S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, 1999: Numerical Recipes in Fortran 77. The Art of Scientific Computing. Volume 1 of Fortran Numerical Recipes. 2d ed. Cambridge University Press, 933 pp.
Siegert, S., R. Friedrich, and J. Peinke, 1998: Analysis of data sets of stochastic systems. Phys. Lett. A, 243, 275–280.
Spekat, A., B. Heller-Schulze, and M. Lutz, 1983: Über Großwetter und Markov-Ketten (“Großwetter” circulation analysed by means of Markov chains). Meteor. Rundsch., 36, 243–248.
Thompson, D. W. J., and J. M. Wallace, 2000: Annular modes in the extratropical circulation. Part I: Month-to-month variability. J. Climate, 13, 1000–1016.
Thuburn, J., 2005: Climate sensitivities via a Fokker–Planck adjoint approach. Quart. J. Roy. Meteor. Soc., 131, 73–92.
Uppala, S. M., and Coauthors, 2005: The ERA-40 re-analysis. Quart. J. Roy. Meteor. Soc., 131, 2961–3012.
Vautard, R., K. C. Mo, and M. Ghil, 1990: Statistical significance test for transition matrices of atmospheric Markov chains. J. Atmos. Sci., 47, 1926–1931.
von Storch, H., and F. W. Zwiers, 1999: Statistical Analysis in Climate Research. Cambridge University Press, 484 pp.
Zwanzig, R., 2001: Nonequilibrium Statistical Mechanics. Oxford University Press, 222 pp.
APPENDIX
Numerical Diffusion in an Advecting System





The analytic solution, Eq. (A4), does not show any spread, whereas the solution of the master equation is spread out over n + 1 grid intervals at time
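
The spreading described in this appendix can be illustrated with a minimal sketch. It is not the authors' derivation: the one-dimensional periodic grid, the function name `advect_master_equation`, and the parameter `c` (the fraction of probability that moves one cell downstream per step) are assumptions made for illustration only. The matrix follows the convention that `W[i, j]` is the probability of moving from cell `j` to cell `i`.

```python
import numpy as np

def advect_master_equation(n_cells, n_steps, c):
    """Step a 1D discrete-time master equation f(t+1) = W f(t).

    Assumed (hypothetical) setup: pure advection in one direction. At each
    step a fraction c of the probability in a cell moves to the next cell
    and the fraction 1 - c stays where it is.
    """
    W = np.zeros((n_cells, n_cells))
    for j in range(n_cells):
        W[j, j] = 1.0 - c              # probability of staying in cell j
        W[(j + 1) % n_cells, j] = c    # probability of moving from j to j + 1 (periodic)
    f = np.zeros(n_cells)
    f[0] = 1.0                         # sharp initial condition in cell 0
    for _ in range(n_steps):
        f = W @ f
    return f

n_steps = 5
f = advect_master_equation(n_cells=20, n_steps=n_steps, c=0.5)
occupied = np.flatnonzero(f > 0)
print(len(occupied))  # number of cells carrying probability after n_steps
```

With `0 < c < 1`, an initially sharp distribution occupies `n_steps + 1` cells after `n_steps` steps, although pure advection by itself would only shift it. This is the numerical diffusion referred to in the appendix title.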
The probability density function f(tn) at time tn. The transition coefficient Wi2 gives the probability that the variable q leaves grid interval 2 to enter i at the next time step.
Citation: Journal of the Atmospheric Sciences 64, 9; 10.1175/JAS3992.1
Time series of the x component of the Lorenz model with white Gaussian noise, Eqs. (7). The dashed time series is obtained by decreasing the initial x value by only 10⁻³ (the white noise forcing is identical).
Observed state density, defined with Eq. (4) and integrated along the y axis, for a time series of length Δt = 51 200 and sampling interval Dt = 0.020, obtained from Eqs. (7) by extending the integration that led to the solid time series in Fig. 2. The starting point of the time series is marked by a circled dot. The isoline contour interval is 0.5 × 10⁻³, and every third isoline is bold. (a) The grid size used to partition the phase space, Dq = 5.00, is too coarse to resolve the “holes” in the attractor wings; these may be seen in (b), where the grid size is Dq = 2.50; in (c) the grid size is Dq = 1.25.
The ratio Rw defined with Eq. (6) is shown as a function of the time series length, Δt, in nondimensional time units; the time resolution Dt = 0.020. Points mark the results obtained with a grid size Dq = 2.50; circles correspond to Dq = 1.25. The time series are generated by extending the numerical integration of Eqs. (7) that led to the solid time series in Fig. 2. There is no stationary estimate of the transition coefficients for Dq = 1.25 and Δt = 1.00.
(a) The position at t = 0, t = 0.2, t = 0.4, and t = 0.6 of a cloud of 8000 points according to Eqs. (7); the position at t = 0.8 is shown in (b). (c) The normalized density of ensemble members (points) for t = 0.8, obtained from (b) after discretizing the phase space with Dq = 2.50 and integrating along the y axis; the isoline contour interval is 10⁻³.
PDF forecasts for time t = 0.8 as delivered by EMEs derived from time series of varying lengths; the grid size Dq = 2.50; the time resolution Dt = 0.020. The star marks the sharp initial condition (see also Fig. 5a). The PDF has been integrated along the y axis. The isoline contour interval is 10⁻³, and every third isoline is bold. The reference density of ensemble members is shown in Fig. 5c.
Skill S for the EME forecasts shown in Fig. 6 (and others), shown as a function of the time series length; the grid size Dq = 2.50; the time resolution Dt = 0.020. The reference density of ensemble members is shown in Fig. 5c.
The convergence coefficient Cw is shown as a function of the length of the longer time series, which is twice as long as the shorter one; the time resolution Dt = 0.020. Points mark the results obtained with a grid size Dq = 2.50; circles correspond to Dq = 1.25.
PDF forecasts for time t = 0.8 as delivered by EMEs derived from time series of varying time resolutions; the grid size Dq = 2.50; the time series length Δt = 3200. The star marks the sharp initial condition (see also Fig. 5a). The PDF has been integrated along the y axis. (a) Dt = 0.004; (b) Dt = 0.020 (same as Fig. 6d); (c) Dt = 0.100. The isoline contour interval is 10⁻³, and every third isoline is bold. The reference density of ensemble members is shown in Fig. 5c.
Influence of the time step, Dt, on an EME. The phase plane is partitioned into squares of unit grid size. The dash-dotted lines indicate parts of a single illustrative trajectory, the arrows show the direction of motion, and bold points indicate sampled observations. The dashed closed curves describe the domain of a cloud of states at two different times, t 0 (initial time) and
Correlation functions rxx, rzz, and ryx as observed (solid), and as reproduced by EMEs derived from time series of varying time resolutions: dotted for Dt = 0.100, dash-dotted for Dt = 0.020, and dashed for Dt = 0.004; the grid size Dq = 2.50; the time series length Δt = 3200.
(left) PDF forecasts for time t = 0.8 as delivered by an EME derived from a time series of the x and z components of the Lorenz model with stochastic forcing; the grid size Dq = 2.50; the time series length Δt = 3200; the time resolution Dt = 0.100. The star marks the sharp initial condition. The isoline contour interval is 10⁻³, and every third isoline is bold. The reference density of ensemble members is shown in Fig. 5c. The corresponding forecast, obtained with a three-dimensional EME, is shown in Fig. 9c. (right) The autocorrelation function of the first component, rxx, as observed (solid) and as reproduced by the EME (dotted).
With the evolution of computing power, the implementation of four- or five-dimensional EMEs will eventually become feasible, provided that appropriately long time series are available.
Nicolis (1990) used the concept of Markov partitions to discretize the third component of the Lorenz (1963) model into two grid intervals. Pasmanter and Timmermann (2002) made use of equipartitions (i.e., each cell contains the same number of observations) to discretize one- and two-dimensional phase planes. Our approach of considering equal-sized grid intervals has been used by Egger (2001); similarly, Thuburn (2005) partitioned the phase space spanned by the variables of the Lorenz (1963) model into equal-sized cubical cells.
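
The difference between an equipartition and a partition into equal-sized grid intervals can be sketched as follows. This is an illustration only, not code from any of the cited papers: the helper names, the synthetic Gaussian series, and the chosen grid size and cell count are assumptions.

```python
import numpy as np

def equal_width_edges(x, dq):
    """Edges of equal-sized grid intervals of width dq that cover the data."""
    lo = np.floor(x.min() / dq) * dq
    hi = np.ceil(x.max() / dq) * dq
    n = max(int(round((hi - lo) / dq)), 1)
    return lo + dq * np.arange(n + 1)

def equipartition_edges(x, n_cells):
    """Edges placed at quantiles, so each cell holds about the same number of observations."""
    return np.quantile(x, np.linspace(0.0, 1.0, n_cells + 1))

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)  # synthetic stand-in for an observed time series

width_edges = equal_width_edges(x, dq=1.0)
counts_equal_width, _ = np.histogram(x, bins=width_edges)
counts_equipartition, _ = np.histogram(x, bins=equipartition_edges(x, n_cells=8))

print(counts_equal_width)    # occupation varies strongly from cell to cell
print(counts_equipartition)  # occupation is nearly uniform
```

Equal-sized intervals keep the geometry of the phase space intact but leave some cells sparsely populated, whereas an equipartition balances the sample size per cell at the cost of cells of unequal extent.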
See also the discussion on stochastic matrices and the Perron–Frobenius theorem in Pasmanter and Timmermann (2002).
Crommelin (2004) introduced an approach to statistical significance applicable to equipartitions into, say, up to 10 cells, whenever one is interested in a few meaningful state transitions. Vautard et al. (1990) introduced a significance test based on Monte Carlo simulations while examining a few atmospheric circulation pattern transitions.
A second-order Markov chain involves a number of coefficients of order $(i_1^{\max} \cdot i_2^{\max} \cdots i_\Lambda^{\max})^3$, where $i_\lambda^{\max}$ is the total number of grid intervals along the $\lambda$th axis and $\Lambda$ is the number of variables. Such a hypothesis test is impractical with the computing resources we have. (In the papers cited in section 1, such hypothesis tests have been carried out by only a few authors, in situations where the total number of cells was up to 5.) Even where the test is computationally feasible, the limited amount of data generally available in atmospheric applications would not allow the rejection of the hypothesis because of the high number of cells we plan to work with. And even if the hypothesis is rejected, it remains an open question whether a first-order Markov chain description may still bring some insight into the underlying processes, and whether a higher-order Markov chain description represents a feasible way to deal with the available data. For these reasons, we introduce another type of test of the quality of the EME.
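
To give a feeling for the numbers involved, the following sketch counts transition coefficients for a first- and a second-order chain. The grid of 20 intervals along each of three axes is a hypothetical choice for illustration and is not taken from the paper; the function name is likewise an assumption.

```python
from math import prod

def n_coefficients(cells_per_axis, order):
    """Number of transition coefficients of a Markov chain of the given order.

    A chain of order k conditions on k past cells and predicts one cell,
    so the count is (total number of cells) ** (k + 1).
    """
    n_cells = prod(cells_per_axis)
    return n_cells ** (order + 1)

grid = (20, 20, 20)  # hypothetical: three variables, 20 grid intervals each

first_order = n_coefficients(grid, order=1)
second_order = n_coefficients(grid, order=2)
print(first_order)   # 64000000
print(second_order)  # 512000000000
```

Going from first to second order multiplies the number of coefficients by the total number of cells, which is why the amount of data needed to estimate them grows so quickly.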
If nonadditive noise terms were present, care should be taken in the choice of the numerical scheme (e.g., Kloeden et al. 1997).
Correlation functions were not introduced in section 4b because both the ones directly estimated from a time series and the ones delivered by an EME derived from the same time series improve with increasing time series length. Correlation functions estimated from a time series of length Δt = 3200 improve only negligibly if a substantially longer time series is used. A time series of length Δt = 3200 densely and uniformly occupies the portion of the phase space where the attractor is located [not shown; compare Fig. 3b with Fig. 3.8 in Dall’Amico (2005)]. Correlation functions are derived from this whole portion.
Kaplan and Glass (1995, 308–311) mention how time-lag embedding enables the reconstruction of the geometry of a chaotic system from a time series even if only one of the variables is measured.
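
A minimal sketch of time-lag embedding is given below. The function name `delay_embed` and its parameters `dim` (embedding dimension) and `lag` (delay in samples) are illustrative assumptions and are not taken from Kaplan and Glass (1995).

```python
import numpy as np

def delay_embed(x, dim, lag):
    """Return the delay vectors [x(t), x(t - lag), ..., x(t - (dim - 1) * lag)].

    Each row is one reconstructed state; the first row corresponds to the
    earliest time t for which all lagged values are available.
    """
    x = np.asarray(x)
    n = len(x) - (dim - 1) * lag
    if n <= 0:
        raise ValueError("time series too short for this dim and lag")
    columns = []
    for k in range(dim):
        start = (dim - 1 - k) * lag   # offset of the value lagged by k * lag
        columns.append(x[start:start + n])
    return np.column_stack(columns)

series = np.arange(10)                # stand-in for a measured scalar variable
embedded = delay_embed(series, dim=3, lag=2)
print(embedded.shape)                 # (6, 3)
print(embedded[0])                    # [4 2 0]
```

The rows of `embedded` could then be discretized in the same way as a multivariate time series, so that a single measured variable supplies the coordinates of a higher-dimensional phase space.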