## 1. Introduction

The Dansgaard–Oeschger (DO) events, rapid climate shifts in the glacial climate observed in the Greenland ice cores, are still not well understood. It is debated to what extent the phenomenon is regional or more global in nature (Wunsch 2006), but it seems to be agreed that there are two quasi-stationary climate states: the glacial or stadial state and the DO or interstadial state. These represent reorganizations of the climate system related to changes in the North Atlantic oceanic circulation (Weaver et al. 1991), the extent of the ice sheets (MacAyeal 1993), sea ice (Gildor and Tziperman 2003), or even changes in the mean state of the atmospheric flow (Wunsch 2006). State-of-the-art general circulation models are not capable of simulating the events; simulating them should be of high priority in order to test whether the models resolve the true dynamical range of possible climate changes. The triggering of a DO event could be external, possibly driven by solar variations (Braun et al. 2005), or internal, related either to nonlinear free oscillations (Haarsma et al. 2001) or to random noise-driven jumps between quasi-stationary climate states (Cessi 1994; Ditlevsen 1999).

To approach this enigma from an observational point of view, the waiting time statistics could be used both to distinguish between the different possibilities and to serve as a benchmark for modeling the events. From the spectral content of the signal it has been proposed that the triggering might be periodic (Schulz 2002). However, it was demonstrated in Monte Carlo simulations that the apparent periodicity is not significantly different from what would be found in a random signal (Ditlevsen et al. 2007). Likewise, it was demonstrated that in a stochastic resonance model (Alley et al. 2001) the strength of the periodic forcing would be too small to be detected above the noise level for the DO events (Ditlevsen et al. 2005).

The waiting time statistics relies on accurate dating of the events observed in the ice core records. The dating of the North Greenland Ice Core Project (NGRIP) ice core has been completed by annual layer counts going back to 60 kyr BP with unprecedented accuracy (Svensson et al. 2008). The quoted dating uncertainty (one sigma) accumulates to 1300 yr at 60 kyr BP. Thus, an approximate uncertainty of (1300/√60)√*n* ≈ 168√*n* yr can be assigned to a time distance of *n* kyr. Herein this measuring uncertainty is not included in the analysis of the data read from the NGRIP isotope record.

The isotope record from NGRIP is shown in Fig. 1. The numbers refer to the original numbering of DO events (Dansgaard et al. 1993). The red circles indicate the initiations of DO events. A discussion of criteria for determining the initiations can be found in Ditlevsen et al. (2005). The green circles indicate the terminations of the DO events. As seen in the blowup in the lower panel, the terminations are much less well determined than the initiations.

## 2. Distribution of the time between consecutive DO events

The fact that the time points of initiation of the DO events are much more precisely identified than the points of termination motivates us to first study the statistical properties of the stream of initiation points by themselves.

In appendix B, we investigate whether it is compatible with the observed sequence of consecutive time distances between the DO events to assume that these chronologically ordered distances are mutually independent. This is done by transforming the observed time distances into approximately normally distributed numbers, followed by calculating the “autocorrelation” coefficients of the transformed sequence. The deviations from zero of these “number lag” correlation coefficients are found to be insignificant. The property of zero correlation is not a sufficient condition for independence in the case of normal marginal distributions. However, the simplest possible consistent assumption is that the bivariate distribution is normal, implying that the two variables become independent. The dataset is far too small for rejection of the bivariate normal hypothesis with any reasonable confidence.

The simplest suggestion of a model of a random point stream with statistically independent consecutive time distances is that these have a common probability distribution characterized by a lack of memory in the following sense: The probability distribution of the remaining time distance to the first point after any chosen instant is the same independent of how far back in time from the instant the previous point occurred. This assumption implies that the exponential distribution is the only possible distribution for the time distance between two consecutive points in the stream (Johnson and Kotz 1970). The point stream is then a so-called homogeneous Poisson process.
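The lack-of-memory property that singles out the exponential distribution can be checked numerically. The following sketch (illustrative only; the 2388-yr scale is the sample average quoted below) draws exponential waiting times and verifies that the wait remaining after a fixed instant has the same mean as a fresh wait:

```python
import random

random.seed(1)

tau = 2388.0  # mean waiting time (yr), the sample average quoted in the text
waits = [random.expovariate(1.0 / tau) for _ in range(100_000)]

# Lack of memory: among waits that already exceed s, the excess wait w - s
# has the same distribution (in particular the same mean) as a fresh wait.
s = 1000.0
excess = [w - s for w in waits if w > s]
mean_all = sum(waits) / len(waits)
mean_excess = sum(excess) / len(excess)
print(round(mean_all), round(mean_excess))  # both close to 2388
```

No other distribution of the waiting time has this property, which is what makes the homogeneous Poisson process the natural null model.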

The observed sample consists of 20 distances and has the empirical distribution function plotted as circles in Fig. 2 {plotting position after ordering the sample as [*t*_{i}, 1/40 + (*i* − 1)/20], where *t*_{1} ≤ *t*_{2} ≤ … ≤ *t*_{i} ≤ … ≤ *t*_{20}}. Moreover, the exponential distribution function with the mean equal to the average 2388 yr of the observed sample is shown. The systematic misfit that appears between the empirical distribution and the exponential distribution indicates that the assumption of lack of memory may not be fully convincing.
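The comparison underlying Fig. 2 can be reproduced in outline as follows. The 20 waiting times below are illustrative placeholders, not the NGRIP values; only the plotting positions and the fitting of the exponential mean follow the text:

```python
import math

# Hypothetical 20-point sample of waiting times (yr); stands in for the data.
sample = sorted([300, 500, 700, 900, 1100, 1300, 1500, 1700, 1900, 2100,
                 2300, 2500, 2800, 3100, 3400, 3800, 4200, 4700, 5300, 6000])
n = len(sample)
mean = sum(sample) / n

# Plotting positions 1/(2n) + (i - 1)/n for the ordered sample, as in the text.
positions = [1 / (2 * n) + i / n for i in range(n)]

# Fitted exponential distribution function evaluated at the ordered sample.
F = [1 - math.exp(-t / mean) for t in sample]

# Largest vertical distance between empirical and fitted values.
print(max(abs(p - f) for p, f in zip(positions, F)))
```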

A natural next step in choosing a simple and physically reasonable model that includes memory is to assume that the system can be in two different states, and that the only memory element is the current state of the system.

This is tantamount to the assumption that the glacial climate varied in time as a realization of a stationary on–off jump process [where the interstadials (D–Os) are the “on” state and the stadials are the “off” state] in which the jumps between the two states occur completely at random and at any time statistically independent of the past. It is emphasized that such an on–off model does not imply that there must be statistically stationary behavior (stability) within each state. At this level of modeling the within-state variations are of no concern except for the lack of memory. Physically the triggering of the jumps could come from the fast time-scale chaotic variations (noise), which are temporally uncorrelated on the observed climatic time scales. The assumed within-state lack of memory implies that the durations of the interstadials and the durations of the stadials become exponentially distributed, but not necessarily with the same mean values.
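A minimal sketch of such an on–off process follows; the mean durations are illustrative round numbers (the actual estimates appear in section 3), and the point is that the onset-to-onset waiting time is the sum of one "on" and one "off" exponential duration:

```python
import random

random.seed(2)

# Illustrative mean durations (yr), not the fitted values.
tau_on, tau_off = 800.0, 1600.0

def onset_to_onset():
    # one full cycle: interstadial ("on") duration plus stadial ("off") duration
    return random.expovariate(1 / tau_on) + random.expovariate(1 / tau_off)

waits = [onset_to_onset() for _ in range(50_000)]
print(round(sum(waits) / len(waits)))  # close to tau_on + tau_off = 2400
```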

Write *T* = *T*_{1} + *T*_{2}, where *T*_{1} and *T*_{2} are independent exponentially distributed random variables of means *τ*_{1} and *τ*_{2}, respectively. The sum of the two has the probability density

*f*(*t*) = [exp(−*t*/*τ*_{1}) − exp(−*t*/*τ*_{2})]/(*τ*_{1} − *τ*_{2}), *t* > 0.  (1)

For *τ*_{1} ≠ *τ*_{2}, the probability density at the point (*t*_{1}, …, *t*_{n}) in the *n*-dimensional space of the *n*-sample of *T* is

*f*(*t*_{1}) *f*(*t*_{2}) ⋯ *f*(*t*_{n})  (2)

according to the multiplication rule for statistically independent random variables. This density calculated for the actual sample *t*_{1}, …, *t*_{n} is called the likelihood of the parameters (*τ*_{1}, *τ*_{2}), and as a function of these parameters, *L*(*τ*_{1}, *τ*_{2}; *t*_{1}, …, *t*_{n}) is called the likelihood function. It is obviously a reasonable principle to estimate the true values of the parameters by the point at which their likelihood is maximal.

It is well known that the maximum likelihood estimate of *τ*_{1} from a sample *t*_{11}, …, *t*_{1n} of the exponentially distributed variable *T*_{1} is the average of the sample, and similarly for *τ*_{2}. From this fact it does not follow that the maximum likelihood estimate of *τ*_{1} + *τ*_{2} is the sum of the two averages, that is, the average of the sample *t*_{11} + *t*_{21}, …, *t*_{1n} + *t*_{2n}, except if *τ*_{1} = *τ*_{2}. However, the sample average estimator of *τ*_{1} + *τ*_{2} is consistent with the fact that *E*(*T*) = *τ*_{1} + *τ*_{2}. This means that the maximum likelihood estimate of *τ*_{1} + *τ*_{2} is biased while the average is not. Thus, we remove this bias by maximizing *L* under the restriction

*τ*_{1} + *τ*_{2} = *a*,  (3)

where *a* denotes the average of the observed sample. The sample size is *n* = 20 and the average is *a* = 2388 yr. The value of *τ*_{1} for which *L* is maximal is *τ*_{1} = 493 yr. Figure 2 (right panel) shows the corresponding distribution function, together with the observed 20-point sample distribution function. The improvement of the fit from the left panel to the right panel is striking. However, it should be noted that there is only a very small sensitivity of the distribution function in the right panel to variations of the parameter values *τ*_{1} and *τ*_{2} given the sum *τ*_{1} + *τ*_{2}. This is clear from inspecting the bunch of distribution function graphs in the left panel of Fig. 3. These correspond to the distribution functions of *T* for different values of the ratio *τ*_{1}/(*τ*_{1} + *τ*_{2}) ranging from 0 (exponential distribution) over 0.1, 0.2, 0.3, 0.4, to 0.5 (gamma distribution). For the obtained maximum likelihood estimate the ratio is 493/2388 ≈ 0.21. The right panel of Fig. 3 shows the conditional distribution functions (each obtained from 10 000 simulated 20-point samples) of the ratio of the maximum likelihood estimator of *τ*_{1} and the average estimator of *τ*_{1} + *τ*_{2}, given the ratio *r* = *τ*_{1}/(*τ*_{1} + *τ*_{2}) for *r* = 0.0, 0.1, …, 0.5. It is remarkable that there is a probability of about 0.36 (the jump at *x* = 0.5) that the maximum likelihood estimates of *τ*_{1} and *τ*_{2} become equal even if the true ratio *τ*_{1}/(*τ*_{1} + *τ*_{2}) is as small as 0.2, close to the obtained estimate 0.21. Thus, the gamma distribution is wrongly selected with a probability of 0.36 if a sample of size 20 is drawn and the maximum likelihood estimate of *τ*_{1} is calculated from that sample.
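The Monte Carlo experiment behind the right panel of Fig. 3 can be sketched as follows. The code draws 20-point samples of *T* = *T*_{1} + *T*_{2} with true ratio *r* = 0.2, maximizes the likelihood under the restriction that *τ*_{1} + *τ*_{2} equals the sample mean using a simple grid search, and records how often the estimate lands on the boundary *τ*_{1} = *τ*_{2}. Grid resolution and trial count are arbitrary choices, not those of the authors:

```python
import math
import random

random.seed(3)

def logf(t, tau1, tau2):
    # log density of T1 + T2 for independent exponentials with means tau1, tau2
    if abs(tau1 - tau2) < 1e-6 * tau1:
        # limiting gamma case tau1 = tau2: f(t) = t * exp(-t/tau1) / tau1**2
        return math.log(t) - t / tau1 - 2 * math.log(tau1)
    return math.log((math.exp(-t / tau1) - math.exp(-t / tau2)) / (tau1 - tau2))

def ml_ratio(data):
    # maximize the likelihood under the restriction tau1 + tau2 = sample mean
    a = sum(data) / len(data)
    grid = [a * x / 200 for x in range(5, 101)]  # tau1 from 0.025a to 0.5a
    best = max(grid, key=lambda t1: sum(logf(t, t1, a - t1) for t in data))
    return best / a

true_r, scale, trials = 0.2, 2388.0, 400
hits = 0
for _ in range(trials):
    data = [random.expovariate(1 / (true_r * scale))
            + random.expovariate(1 / ((1 - true_r) * scale)) for _ in range(20)]
    if ml_ratio(data) >= 0.499:  # estimate on the boundary tau1 = tau2
        hits += 1
fraction = hits / trials
print(fraction)  # the text reports about 0.36 for this experiment
```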

It is also seen that if the exponential distribution is assumed to be the true distribution of *T*, then there is only about a 10% probability to get an estimate larger than 0.2 of *τ*_{1}/(*τ*_{1} + *τ*_{2}). Thus, it is not unreasonable to look for a better model than the exponential. This is further supported by the physically reasonable hypothesis of the existence of two states.

It is an obvious task to investigate whether it is possible to construct an equally well fitting distribution function under some deterministic DO event triggering mechanism. For example, this mechanism might be some hidden periodic forcing. Such a model is considered in appendix A. One might defend its more complicated nature if the detected periodicity could be given some geophysical or astronomical cause. Otherwise, the simple pure randomness model is just as good. It seems difficult to go further in the direction of determinism when considering the possibility of the existence of two different states of random duration.

The indication of the existence of two states of independent exponentially distributed durations by the quite good fit obtained in the right panel of Fig. 2 invites further inspection of the record to see if the states are visible and measurable with acceptable accuracy.

## 3. Estimation of the exponential distributions

The estimation of the parameters *τ*_{1} and *τ*_{2} will obviously be improved if the 20 realizations of the (*T*_{1}, *T*_{2}) pair are included in the estimation. However, in the *δ*^{18}*O* isotope record the interstadials have the characteristic saw-tooth shape, which is not present in the records of dust concentrations. By inspecting the *δ*^{18}*O* record displayed in Fig. 1 it is recognized that because of the saw-tooth shape it is considerably more difficult to accurately read the time points of transition from interstadial to stadial states than from the stadial to the interstadial states analyzed in the previous section. In addition to the duration variables *T*_{1} and *T*_{2} we need, therefore, to introduce an error-of-identification variable *R* assigned with suitable distributional properties.

Write *T* = *X*_{1} + *X*_{2}, where *X*_{1} and *X*_{2} are random variables that represent the identified interstadial and stadial durations, respectively. Next write *X*_{1} = *T*_{1} + *R*, *X*_{2} = *T*_{2} − *R*, where, for given (*T*_{1}, *T*_{2}) = (*t*_{1}, *t*_{2}), *R* is a random variable with values that by necessity are limited to the interval −*t*_{1} < *R* < *t*_{2} because *X*_{1} and *X*_{2} are both positive. This means that the usual normal distribution model for an error distribution is not applicable here. Instead, an applicable, sufficiently extensive, and mathematically convenient distribution family is the beta-distribution family, for which the standard form of the probability density is *f*_{β}(*x*; *p*, *q*) = *x*^{p−1}(1 − *x*)^{q−1}/*B*(*p*, *q*), 0 < *x* < 1, where *p* and *q* are positive parameters, and *B*(*p*, *q*) = ∫_{0}^{1} *x*^{p−1}(1 − *x*)^{q−1} *dx* (the beta function). Thus, for given (*T*_{1}, *T*_{2}) = (*t*_{1}, *t*_{2}), the random variable *W* = (*T*_{1} + *R*)/(*T*_{1} + *T*_{2}) is assumed to have the conditional probability density

*f*_{β}(*w*; *p*, *q*), 0 < *w* < 1.  (4)

From the relations *B*(*p*, *q*) = Γ(*p*)Γ(*q*)/Γ(*p* + *q*) and Γ(*p* + 1) = *p*Γ(*p*) it is directly seen that ∫_{0}^{1} *x*^{p}(1 − *x*)^{q−1} *dx* = *B*(*p*, *q*)*p*/(*p* + *q*), so that we get the conditional expectation

*E*(*W* | *T*_{1} = *t*_{1}, *T*_{2} = *t*_{2}) = *p*/(*p* + *q*).  (5)

We will model the error correction *R* so that it has the conditional mean value *E*(*R* | *T*_{1} = *t*_{1}, *T*_{2} = *t*_{2}) = *μt*_{1}*t*_{2}, where *μ* is some constant. This model reflects that the error must vanish in the two limits *t*_{1} → 0 and *t*_{2} → 0. It follows from (5) that this property is obtained if *q* is defined as

*q* = *p*[(*t*_{1} + *t*_{2})/(*t*_{1} + *μt*_{1}*t*_{2}) − 1].  (6)

To obtain the joint distribution of *X*_{1}, *X*_{2} we note that *x*_{1} + *x*_{2} = *t*_{1} + *t*_{2}. Thus, conditionally on (*T*_{1}, *T*_{2}) = (*t*_{1}, *t*_{2}), the variable *X*_{1} = *W*(*t*_{1} + *t*_{2}) has the density

*f*(*x*_{1} | *t*_{1}, *t*_{2}) = *f*_{β}[*x*_{1}/(*t*_{1} + *t*_{2}); *p*, *q*]/(*t*_{1} + *t*_{2}).  (7)

Removing the conditioning by use of the total probability theorem, (7) gives the joint probability density

*f*(*x*_{1}, *x*_{2}) = ∫_{0}^{*x*_{1}+*x*_{2}} *f*(*x*_{1} | *t*_{1}, *x*_{1} + *x*_{2} − *t*_{1}) exp[−*t*_{1}/*τ*_{1} − (*x*_{1} + *x*_{2} − *t*_{1})/*τ*_{2}]/(*τ*_{1}*τ*_{2}) *dt*_{1},

which, with the restriction *τ*_{2} = *a* − *τ*_{1} [see (3)], *a* = 2388 yr, gives the likelihood function

*L*(*τ*_{1}, *p*, *μ*) = ∏_{i=1}^{20} *f*(*x*_{1i}, *x*_{2i}).

The maximum is obtained for *τ*_{1} = 802 yr, *τ*_{2} = 1586 yr, *p* = 698, and *μ* = 39.1 × 10^{−6} yr^{−1}. The average of the *X*_{1} sample is 846 yr, that is, 44 yr larger than *τ*_{1}. We note that *μτ*_{1}*τ*_{2} ≈ 50 yr.
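The role of the constant *μ* can be checked by simulation: for fixed (*t*_{1}, *t*_{2}), drawing *W* from the beta density with *q* given by (6) and setting *R* = *W*(*t*_{1} + *t*_{2}) − *t*_{1} should reproduce the conditional mean *μt*_{1}*t*_{2}. A sketch with the quoted parameter estimates:

```python
import random

random.seed(4)

p, mu = 698.0, 39.1e-6   # maximum likelihood estimates quoted in the text
t1, t2 = 802.0, 1586.0   # example conditioning values (yr)

# q chosen according to (6) so that E(R | t1, t2) = mu*t1*t2
q = p * ((t1 + t2) / (t1 + mu * t1 * t2) - 1)

# R is defined through the beta-distributed ratio W = (t1 + R)/(t1 + t2)
draws = [random.betavariate(p, q) * (t1 + t2) - t1 for _ in range(50_000)]
mean_R = sum(draws) / len(draws)
print(round(mean_R), round(mu * t1 * t2))  # both close to 50 yr
```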

Figure 4 shows the estimated conditional distributions of the error term *R* for a selected set of values of *T*_{1}. Even though the conditional standard deviations of *R* are not insignificant for values of *T*_{1} in the range somewhat larger than from about *τ*_{1} to about *τ*_{2}, the error practically averages out in its influence on the estimated distribution functions shown in Fig. 5. Only a small influence comes from the bias away from zero of the mean of *R*.

The obtained ratio *τ*_{1}/(*τ*_{1} + *τ*_{2}) = 802/2388 ≈ 0.34 is not in conflict with the ratio 0.21 estimated from the on–on data in the previous section. In fact, if we assume that 0.34 is the true value of the ratio, then it follows from the corresponding conditional distribution function in the right panel of Fig. 3 that the probability of obtaining an estimate of the ratio less than 0.21 is about 25% when drawing a random 20-point sample from the assumed distribution.

A simple evaluation of the statistical uncertainty of the maximum likelihood estimators of *τ*_{1} and *τ*_{2}, interpreted as the mean values of two independent exponentially distributed random variables, is given by the standard deviations *τ*_{1}/√*n* ≈ 179 yr and *τ*_{2}/√*n* ≈ 355 yr of the corresponding sample averages. The standard deviation of the estimate of the sum *a* = *τ*_{1} + *τ*_{2} is then √(179^{2} + 355^{2}) yr ≈ 397 yr.
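These uncertainty figures follow from the standard error *τ*/√*n* of an exponential sample mean and the addition of independent variances in quadrature:

```python
import math

n = 20
tau1, tau2 = 802.0, 1586.0   # estimates from the text (yr)

# Standard deviation of the mean of n exponential variables is tau/sqrt(n).
sd1 = tau1 / math.sqrt(n)    # about 179 yr
sd2 = tau2 / math.sqrt(n)    # about 355 yr

# Independent errors add in quadrature for the sum tau1 + tau2.
sd_sum = math.sqrt(sd1 ** 2 + sd2 ** 2)
print(round(sd1), round(sd2), round(sd_sum))  # 179 355 397
```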

The estimated distribution functions fit quite reasonably to the observed 20-sample distribution function values shown as circles in Fig. 5. The statistically inexperienced reader may express doubt about this. However, it is important for the judgment of the quality of the fit to be aware that for a sample size of no more than 20, deviations of a size as seen in Fig. 5 must be expected with high probability. To convince the reader, a demonstration of the effect of this kind of statistical uncertainty is given in the next section.

The obtained distribution results, based on the assumption that *T*_{1} and *T*_{2} are statistically independent, to some degree also support the independence hypothesis. For further support, the scatterplot of the 20 pairs of observations of (*X*_{1}, *X*_{2}) (transformed to standard normal distribution) is shown in Fig. C1. The empirical correlation coefficient is calculated to be as small as about 0.01. It should be noted that zero correlation between *T*_{1} and *T*_{2} implies that there is some correlation between *X*_{1} = *T*_{1} + *R* and *X*_{2} = *T*_{2} − *R*. For completeness, the correlation coefficient between *X*_{1} and *X*_{2} is calculated in appendix C solely by use of the model and the obtained parameter estimates. The theoretical correlation coefficient is found to be 0.02, that is, negligibly small. In the same calculation the standard deviation of *R* is found to be about 90 yr.

The findings in this section increase the confidence that the hypothesis of an exponential distribution of the consecutive distances between the DO events should be rejected. This is because the hypothesis that *T*_{1} + *T*_{2} has an exponential distribution is inconsistent with the finding that *T*_{1} and *T*_{2} are uncorrelated and exponentially distributed (or just close to being exponentially distributed). The exponential probability density *a* exp(−*at*) for the sum would be generated if *T*_{1} and *T*_{2} were independent with gamma probability densities proportional to *t*^{r−1} exp(−*at*) and *t*^{s−1} exp(−*at*), respectively, where *r* + *s* = 1, *r* > 0, *s* > 0. These gamma densities are infinite for *t* = 0. However, the data do not support an assumption of infinite probability densities at *t* = 0.

## 4. Demonstration of the actual statistical uncertainty

Figure 6 shows six independent examples of empirical distributions, each obtained by generating a 20-point sample from the standard exponential distribution with density function *f*(*x*) = *e*^{−x}, *x* > 0. The corresponding distribution function *F*(*x*) = 1 − *e*^{−x} is shown in each panel. Thus, large deviations occur without inconsistency with the theoretical probability distribution. Visual comparison with the deviations seen in Fig. 5 shows that there is no reason to expect a fit of better quality than that obtained there.
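A demonstration of the same kind can be generated in a few lines; the seed and the six-panel count are arbitrary, and the statistic reported is the largest vertical distance between the empirical distribution function and *F*(*x*) = 1 − *e*^{−x}:

```python
import math
import random

random.seed(6)

def max_deviation(sample):
    # largest vertical distance between the empirical distribution function
    # and F(x) = 1 - exp(-x), evaluated at the jump points
    xs = sorted(sample)
    n = len(xs)
    dev = 0.0
    for i, x in enumerate(xs):
        F = 1 - math.exp(-x)
        dev = max(dev, abs(F - i / n), abs(F - (i + 1) / n))
    return dev

devs = [max_deviation([random.expovariate(1.0) for _ in range(20)])
        for _ in range(6)]
print([round(d, 2) for d in devs])
```

Deviations of a few tenths are routine for 20-point samples, which is the point the figure makes.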

Returning to Fig. 2, it is seen in the perspective of Fig. 6 that the exponential distribution for the on–on time distances cannot be excluded as the valid distribution. In fact, earlier we read from the right panel in Fig. 3 that there is a probability of about 10% that the estimate of the ratio *τ*_{1}/(*τ*_{1} + *τ*_{2}) becomes larger than 0.2 under the assumption that the ratio is zero. Thus, the path of observation point deviations seen in the left panel of Fig. 2 does not occur so rarely for the exponential model that one is entitled to reject the exponential model as the valid model. To reject the exponential model with confidence it is essential that supplementary information is considered, for example, such as the information that two independent states are detectable in the isotope record.

## 5. Interpretations and concluding remarks

The waiting time distribution observed in the ice core record is well fitted by a random on–off process. The mean waiting times in the two states are different, indicating that the waiting time depends on the climate state. In the warm state the mean waiting time (≈800 yr) is markedly shorter than the mean waiting time in the cold state (≈1600 yr). The random nature of the waiting times indicates that the jumping is triggered by internal climatic fluctuations. A possible exception is the transition into the present warm Holocene period, which has been stable for 12 000 yr. This could indicate either that the Holocene climate is a distinctly different climate from the interstadials or that the internal climate fluctuations become so small in the Holocene climate that the triggering of jumps into the cold state is exponentially suppressed. Both possibilities could be argued to be caused by the disappearance of the Laurentide and Fennoscandian ice sheets, which makes the Holocene distinctly different from the interstadials.

From the random no-memory nature of the climatic jumps it should be expected that there are no correlations between consecutive climate periods, so that a short interstadial period is not preferentially followed by either a long or a short stadial period, and vice versa. This is independently checked and is indeed the case, as shown in detail in appendixes B and C. In principle this observation could constrain the possibilities for how glacier buildup or Atlantic seesaw connections influence the Dansgaard–Oeschger climatic shifts.

In the glacial period the internal climate fluctuations are observed to be larger in the cold stadial state than in the warm interstadial state (Ditlevsen et al. 2002). Naively, one should expect from purely noise-induced transitions that the waiting time in the cold state would be shorter than in the warm state, in contrast to the observations. Thus, it can be concluded from the observations that the cold state is more stable than the warm state and that the barrier for noise-induced transitions is higher from the cold to the warm state than the reverse. This is an important benchmark for high-resolution climate models, if these can be constructed to simulate the two climate states and the transitions between them. At present the two climate states have been simulated by so-called water hosing experiments in intermediate complexity models, and are thus not internally noise generated in these models (Schmittner et al. 2002).

As seen from appendix A, other statistical models, including hidden periodic triggers, can explain the data record as well. This is an unfortunate consequence of the limited size of the record. From a philosophical point of view one should, in our opinion, refer to the principle of Occam’s razor, favoring the simplest model with fewest assumptions to explain the data, unless independent evidence selects a different model. In that sense the purely random model seems at present to be the first choice.

As a final remark it should be mentioned that the observation of a simple statistical structure of the climate shifts does not imply a simple underlying climate dynamics. Jump processes and threshold-crossing dynamics are highly nonlinear phenomena.

## REFERENCES

Alley, R. B., S. Anandakrishnan, and P. Jung, 2001: Stochastic resonance in the North Atlantic. *Paleoceanography*, **16**, 190–198.

Braun, H., M. Christl, S. Rahmstorf, A. Ganopolski, A. Mangini, C. Kubatzki, K. Roth, and B. Kromer, 2005: Possible solar origin of the 1,470-year glacial climate cycle demonstrated in a coupled model. *Nature*, **438**, 208–211.

Cessi, P., 1994: A simple box model of stochastically forced thermohaline flow. *J. Phys. Oceanogr.*, **24**, 1911–1920.

Dansgaard, W., and Coauthors, 1993: Evidence for general instability of past climate from a 250-kyr ice-core record. *Nature*, **364**, 218–220.

Ditlevsen, P. D., 1999: Observation of alpha-stable noise and a bistable climate potential in an ice-core record. *Geophys. Res. Lett.*, **26**, 1441–1444.

Ditlevsen, P. D., S. Ditlevsen, and K. K. Andersen, 2002: The fast climate fluctuations during the stadial and interstadial climate states. *J. Glaciol.*, **35**, 457–462.

Ditlevsen, P. D., M. S. Kristensen, and K. K. Andersen, 2005: The recurrence time of Dansgaard–Oeschger events and limits on the possible periodic component. *J. Climate*, **18**, 2594–2603.

Ditlevsen, P. D., K. K. Andersen, and A. Svensson, 2007: The DO-climate events are probably noise induced: Statistical investigation of the claimed 1470 years cycle. *Clim. Past*, **3**, 129–134.

Gildor, H., and E. Tziperman, 2003: Sea-ice switches and abrupt climate change. *Philos. Trans. Roy. Soc. London*, **361A**, 1935–1944.

Haarsma, R. J., J. D. Opsteegh, F. M. Selten, and X. Wang, 2001: Rapid transitions and ultra-low frequency behaviour in a 40 kyr integration with a coupled climate model of intermediate complexity. *Climate Dyn.*, **17**, 559–570.

Johnson, N. L., and S. Kotz, 1970: *Distributions in Statistics: Continuous Univariate Distributions I*. John Wiley and Sons, 300 pp.

MacAyeal, D. R., 1993: Binge/purge oscillations of the Laurentide ice sheet as a cause of the North Atlantic's Heinrich events. *Paleoceanography*, **8**, 775–783.

Schmittner, A., M. Yoshimori, and A. J. Weaver, 2002: Instability of glacial climate in a model of the ocean–atmosphere–cryosphere system. *Science*, **295**, 1489–1493.

Schulz, M., 2002: On the 1470-year pacing of Dansgaard–Oeschger warm events. *Paleoceanography*, **17**, 1014, doi:10.1029/2000PA000571.

Svensson, A., and Coauthors, 2008: A 60 000 year Greenland stratigraphic ice core chronology. *Clim. Past*, **4**, 1–11.

Weaver, A. J., E. S. Sarachik, and J. Marotzke, 1991: Freshwater forcing of decadal and interdecadal oceanic variability. *Nature*, **353**, 836–838.

Wunsch, C., 2006: Abrupt climate change: An alternative view. *Quat. Res.*, **65**, 191–203.

## APPENDIX A

### Model with Hidden Periodic Trigger

Could the DO events be triggered by some hidden mechanism of period *τ*? To investigate this problem we adopt the probabilistic model

*T*_{i} = *t*_{0} + *N*_{i}*τ* + *S*_{i}, *i* = 1, 2, …,  (A1)

for the times *T*_{1}, *T*_{2}, … of occurrences of the consecutive DO events. Here *t*_{0} is some unknown time origin, *N*_{1}, *N*_{2}, … is an increasing random sequence of integers, and *S*_{1}, *S*_{2}, … are independent random slack times (i.e., delay times that can be positive as well as negative) of identical probability density, which is zero outside a bounded interval of length at most equal to the period *τ*. The model is formulated so that not all the mechanism periods necessarily trigger a DO event. As suggested by P. Hedegård (2007, personal communication), the successful periods with respect to DO events are reasonably chosen at random with some probability *α*.

By considering the differences between consecutive occurrence times, the unknown origin *t*_{0} is eliminated. The variables *X*_{i} = *N*_{i+1} − *N*_{i} are defined as mutually independent random variables with the common frequency function

*P*(*X* = *x*) = *α*(1 − *α*)^{x−1}, *x* = 1, 2, …  (A2)

(the geometric distribution). For convenience we write *T*_{i+1} − *T*_{i} as *T*_{i}, with observations *t*_{1}, …, *t*_{n}. As probability density type for the mutually independent random variables *S*_{1}, …, *S*_{n+1} we will adopt the beta distribution density (see Fig. A1)

*f*_{S}(*s*) = [1/4 − (*s*/*τ*)^{2}]^{p−1}/[*τB*(*p*, *p*)], −*τ*/2 < *s* < *τ*/2,  (A3)

with mean *E*(*S*) = 0 and variance

*σ*^{2} = *τ*^{2}/[4(2*p* + 1)].  (A4)

The likelihood function of the unknown parameters *α*, *τ*, and *p* becomes too complicated even for numerical calculations to be completed within reasonable time. Instead, the much simpler method of moments is applied. Noting that, with *X* distributed according to (A2), the time differences are of the form *T* = *Xτ* + *S*_{2} − *S*_{1}, where *E*(*X*) = 1/*α* and *E*(*X*^{2}) = (2 − *α*)/*α*^{2} {easily obtained by repeated differentiation of the probability generating function *ψ*(*u*) = *E*(*u*^{X}) = *αu*/[1 − (1 − *α*)*u*] and setting *u* = 1}, and *E*(*S*_{2} − *S*_{1}) = 0, *E*[(*S*_{2} − *S*_{1})^{2}] = 2*σ*^{2}, it follows that

*E*(*T*) = *τ*/*α*, *E*(*T*^{2}) = (2 − *α*)*τ*^{2}/*α*^{2} + 2*σ*^{2}.  (A5)

Elimination of *τ* and substitution of *σ*^{2} gives the equation

(2 − *α*) + *α*^{2}/[2(2*p* + 1)] = *E*(*T*^{2})/*E*(*T*)^{2}  (A6)

in *α* for each given value of *p*. With the given data of *n* = 20 sample values and the sample means *E*(*T*^{j}) ≈ *n*^{−1}∑_{i=1}^{n} *t*_{i}^{j}, *j* = 1, 2, we get the results displayed in Fig. A2. By visual judgment it seems most reasonable to adopt the uniform distribution for *S*, which corresponds to *p* = 1. The corresponding parameter estimates are *τ* = 854 yr and *α* = 0.357. It is difficult to distinguish the goodness of fit from that of the right panel in Fig. 2.
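The moment equation relating *α* and *p* is easily solved numerically. In the sketch below the second-moment ratio is back-computed from the quoted estimates (*α* = 0.357 at *p* = 1) rather than taken from the raw record, so it only illustrates the solution procedure:

```python
# Method-of-moments sketch for the hidden-period model.
m1 = 2388.0      # sample mean E(T), yr
ratio = 1.664    # illustrative value of E(T^2)/E(T)^2, not from the raw data
p = 1            # uniform slack-time distribution

def g(alpha):
    # moment equation: (2 - alpha) + alpha^2/[2(2p+1)] = E(T^2)/E(T)^2
    return (2 - alpha) + alpha ** 2 / (2 * (2 * p + 1)) - ratio

# g is decreasing in alpha on (0, 1), so a simple bisection finds the root.
lo, hi = 1e-6, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if g(mid) > 0:
        lo = mid
    else:
        hi = mid
alpha = (lo + hi) / 2
tau = alpha * m1   # from E(T) = tau/alpha
print(round(alpha, 3), round(tau))  # 0.357 853, matching the quoted values
```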

It is remarkable that with this simple model the most probable period is considerably shorter than the period of 1470 yr proposed in the literature (Schulz 2002). Furthermore, the obtained point process model is quite peculiar. The time distance from a point to the next point is generated almost correctly by tossing a die repeatedly until either one dot or two dots are obtained (probability 1/3 per toss, approximating *α*). Next, the period *τ* is multiplied by the number of tosses used. Finally, the obtained time is corrected by adding two numbers chosen independently and completely at random from the interval (−*τ*/2, *τ*/2). The authors consider this model an artificial mathematical approximation construct rather than an indication of a hidden period. Moreover, the model contains no room for the existence of two states. With a sample size as small as 20 it is possible to construct many different mathematical models that cannot be rejected by a statistical test based on the sample. We have shown in this paper that whatever sophisticated model is constructed for generating the point process, it might as well be generated by a simple two-state process with complete randomness from one small time interval to the next (i.e., a so-called two-state Markov process).

## APPENDIX B

### Test of Lack of Correlation between Consecutive DO Point Distances

The estimation of the distribution function for the time distances between the DO events described in section 2 is not dependent on whether or not there is some dependence structure in the sequence of consecutive time distances. Any reordering of the sequence leads to the same distribution function, of course. To see whether the sequence taken in chronological order as measured shows signs of a dependence structure that is essentially different from that of the same sequence of time distances obtained by randomizing the order, the first 15 “number lag” correlation coefficients are shown in Fig. B1 for the measured sequence and three randomized versions of the measured sequence. To be more specific, the measured sequence of time distances is first transformed one to one into a sequence *y*_{1}, *y*_{2}, …, *y*_{20} with standard normal distribution of the elements. Next, the sequence of 20 − *i* pairs (*y*_{1}, *y*_{i+1}), (*y*_{2}, *y*_{i+2}), …, (*y*_{20−i}, *y*_{20}) is constructed for each *i* = 1, 2, …, 15, where *i* is called the number lag, and the empirical correlation coefficient is calculated for each *i*. The results are shown as circle marks in the upper-left panel of Fig. B1. The other three panels in Fig. B1 show the empirical correlation coefficients as small square marks for the three randomized versions. The dashed curves are distribution function quantiles of the correlation coefficient estimator under the assumption that the correlation coefficient is zero, an assumption that is true for the randomized versions of the sequence, of course. Comparison of the upper-left panel with the three other panels gives no reason to reject the hypothesis that the measured distances comply with a model where the number lag correlation coefficients are all zero.

## APPENDIX C

### Test of Lack of Correlation between Interstadial Duration and the Next Following Stadial Duration

The scatterplot in Fig. C1 shows that the assumption of independence between *T*_{1} and *T*_{2} is in no way contradicted by the observed data. However, to be careful, the size of the correlation induced between *X*_{1} and *X*_{2} by the correcting term *R* must be evaluated. The following derivations address this.

Because *T*_{1} and *T*_{2} are independent, we have the expectation *E*(*T*_{1}*T*_{2}) = *E*(*T*_{1})*E*(*T*_{2}) and the covariance Cov(*T*_{1}, *T*_{2}) = *E*(*T*_{1}*T*_{2}) − *E*(*T*_{1})*E*(*T*_{2}) = 0. Moreover, by assumption we have *E*(*R* | *T*_{1}, *T*_{2}) = *μT*_{1}*T*_{2}. Finally, *E*(*T*_{1}) = *τ*_{1}, *E*(*T*_{2}) = *τ*_{2}, Var(*T*_{1}) = *τ*_{1}^{2}, Var(*T*_{2}) = *τ*_{2}^{2} because of the exponential distribution. By the rules of calculation with expectations and covariances it then follows that

Cov(*T*_{1}, *R*) = *μτ*_{1}^{2}*τ*_{2}, Cov(*T*_{2}, *R*) = *μτ*_{1}*τ*_{2}^{2},

and

Cov(*X*_{1}, *X*_{2}) = Cov(*T*_{1} + *R*, *T*_{2} − *R*) = Cov(*T*_{2}, *R*) − Cov(*T*_{1}, *R*) − Var(*R*),

where

Var(*R*) = Var(*μT*_{1}*T*_{2}) + *E*[Var(*R* | *T*_{1}, *T*_{2})],

and the last term is obtained from the variance Var(*W*) = *pq*/[(*p* + *q*)^{2}(*p* + *q* + 1)] of the standard beta distribution combined with (6) and the linear transformation *W* = (*R* + *t*_{1})/(*t*_{1} + *t*_{2}) that defines the probability density (4). This complicated expectation term must be calculated by numerical integration or simple Monte Carlo simulation. To obtain the correlation coefficient we further need

Var(*X*_{1}) = Var(*T*_{1}) + 2 Cov(*T*_{1}, *R*) + Var(*R*), Var(*X*_{2}) = Var(*T*_{2}) − 2 Cov(*T*_{2}, *R*) + Var(*R*).

By substitution of the estimated parameters *τ*_{1} = 802 yr, *τ*_{2} = 1586 yr, *p* = 698, and *μ* = 39.1 × 10^{−6} yr^{−1}, we find that the standard deviation of *R* is estimated to about 90 yr, and the correlation coefficient between *X*_{1} and *X*_{2} becomes Cov(*T*_{1} + *R*, *T*_{2} − *R*)/√[Var(*T*_{1} + *R*) Var(*T*_{2} − *R*)] ≈ 0.02. This value is negligibly small, which justifies judging the correlation properties of *T*_{1} and *T*_{2} by use of the observation pairs (*x*_{1}, *x*_{2}).
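The Monte Carlo route mentioned above can be carried out directly by simulating the full model with the quoted parameter estimates; the simulated correlation between *X*_{1} and *X*_{2} should then come out near the quoted 0.02:

```python
import random

random.seed(8)

tau1, tau2 = 802.0, 1586.0   # estimated mean durations (yr)
p, mu = 698.0, 39.1e-6       # estimated beta-model parameters

X1, X2 = [], []
for _ in range(50_000):
    t1 = random.expovariate(1 / tau1)
    t2 = random.expovariate(1 / tau2)
    q = p * ((t1 + t2) / (t1 + mu * t1 * t2) - 1)   # Eq. (6)
    r = random.betavariate(p, q) * (t1 + t2) - t1   # misidentification error R
    X1.append(t1 + r)
    X2.append(t2 - r)

def corr(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((x - mb) ** 2 for x in b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (va * vb) ** 0.5

print(round(corr(X1, X2), 3))  # small; the text quotes about 0.02
```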