## 1. Introduction

Atmospheric analogs were introduced by Lorenz (1969) in a study of atmospheric predictability. The faster a target state *z* and its closest analog *a*_{1} diverge from one another, the harder it is to predict the evolution of *z*. In Lorenz's study, the state *z* was characterized by height values of the 200-, 500-, and 850-hPa isobaric surfaces at a grid of ≈1000 points over the Northern Hemisphere. The database of available analogs, called the catalog, contained five years of twice-daily values. In his abstract, Lorenz states that there are "numerous mediocre analogues but no truly good ones."

Since Lorenz’s work, analogs have been used in many applications such as weather generators (Yiou 2014), data assimilation (Hamilton et al. 2016; Lguensat et al. 2017), kernel forecasting (Alexander et al. 2017), downscaling (Wetterhall et al. 2005), nonlinear bias correction (Hamill et al. 2015), climate reconstruction (Schenk and Zorita 2012; Fettweis et al. 2013; Yiou et al. 2013), and extreme event attribution (Cattiaux et al. 2010; Jézéquel et al. 2018).

The reason why Lorenz could not find any good analog was made clear later on by Van Den Dool (1994), who showed that for high-dimensional systems, the mean return time of a good analog (used as a proxy for a minimum catalog size) grows exponentially with dimension. This result is a variant, for analogs, of the "curse of dimensionality" well known in data science. With three pressure levels over the whole Northern Hemisphere, the dimension of Lorenz's study was very high, and 5 years of twice-daily data were not enough to hope to find a good analog.

Nicolis (1998) added a dynamical-systems perspective to Van Den Dool's analysis. She showed that studying mean return times was not enough, as the relative standard deviation of these return times could be very high. Furthermore, it was shown that return-time statistics exhibit strong local variations in phase space, so that certain target states may need a larger catalog to find good analogs.

Accounting for Van Den Dool's findings, it is now usual to reduce the feature-space dimension as much as possible before searching for analogs. Also, the last decades have witnessed a proliferation of data from in situ and satellite observations, as well as outputs from numerical physics-based models. Such conditions allow one to find good analogs in many situations, and it has become standard to use not just one but many analogs (usually a few tens). From a statistical perspective, using many analogs instead of one can increase estimation bias, but it reduces estimation variance, so that the estimation is less sensitive to noise. Using many analogs also allows us to perform local regression techniques on the analogs, such as local linear regression (Lguensat et al. 2017). This technique has proven efficient in analog forecasting applications (Ayet and Tandeo 2018), and it was shown that local linear regression allows analog forecasting to capture the local Jacobian of the dynamics of the real system (Platzer et al. 2021).

This new context suggests focusing not only on the closest analog *a*_{1}, but also on the *k*th closest analog, for *k* up to ~40. The number of analogs used is usually the result of a trade-off between the number of available good analogs and the minimum number of analogs required to perform a given task [for instance, Yiou and Déandréis (2019) search for 20 analogs at each step to perform ensemble analog forecasts]. Also, one can now reasonably hope to find good analogs using dimension reduction and a large amount of data. Thus, one is less interested in return times than in analog distances. That is, for a given length of available data, how far will the closest analogs be? Performances of analog-based methods are usually conditioned by analog-to-target distances [see, for instance, the relationship between analog distances and forecast performance in Farmer and Sidorowich (1988) and Platzer et al. (2021)]. In this work, we propose to evaluate the probability distribution of these distances. Our analytical probability distributions make the link between analog-to-target distances, catalog size, and local dimension. This brings new insight into the impact of dimensionality on analog methods.

Section 2 outlines the theoretical framework and findings. Section 3 discusses implications of the findings and compares the present analysis with past studies. Section 4 shows results from numerical experiments of the three-variable Lorenz (1963) system, the variable-dimension Lorenz (1996) system, and from 10-m wind reanalysis data from the regional climate model AROME, further referred to as "the AROME reanalysis data." Detailed derivations of the results of section 2 can be found in appendixes B and C.

## 2. Theory

### a. Analogs in dynamical systems and local dimensions

We assume a dynamical system with an invariant measure *μ*, and consider a target state *z* on the attractor of the system.

The trajectory from which the analogs are taken is called the "catalog." For the analogs of *z* to be considered independent, it is enough that the maximum distance between any two analogs of *z* be smaller than the minimum distance between any analog of *z* and its neighbors in time (i.e., its successor and predecessor).

The invariant measure of the system, *μ*, conditions the structure of the catalog and the ability to find analogs. In particular, Van Den Dool (1994) and Nicolis (1998) studied the role of the attractor's dimension, which we now introduce. Let *B*_{z,r} be the ball centered on *z* with radius *r*; then

*d*_{z,r} = log*μ*(*B*_{z,r}) / log*r*,   (1)

defines the finite-resolution (or *r*-resolution) local dimension at point *z*. As mentioned later in the text, this definition depends on the unit used to measure the distance *r*, although only lightly if *r* is small [see Eq. (13)]. For instance, relative temperature differences are higher if measured in degrees Fahrenheit rather than degrees Celsius, resulting in a smaller value of *d*_{z,r} at fixed *r*. However, in practice we estimate here *d*_{z,r} based on Eq. (2), which considers ratios of distances and is therefore unit independent. There are many other ways to estimate dimension, including ones that do not depend on the choice of unit [see, for instance, the more global estimates of Wang and Shen (1999)]; however, Eq. (1) is the most suited to our purpose and derivations, as appears clearly in appendix B.

Note that for ergodic measures, *μ*(*B*_{z,r}) can be approximated by counting the number of times a given trajectory enters *B*_{z,r} [this is the consequence of the ergodic theorem of Birkhoff (1931)]. In the following, we assume that *μ* is ergodic and stationary. This does not apply when nonstationary processes, such as climate change, break the stationarity of *μ*. Also, in practice, periodic forcings such as seasonality make the structure of the attractor of a system such as the atmosphere vary between winter and summer. Therefore, analogs must be searched within a given time window around the calendar date of the target *z*, so that the subsampling allows us to recover an invariant measure (see Lorenz 1969; Yiou and Déandréis 2019). For a discussion on the modification of the invariant measure due to seasonality and nonperiodic forcing, see Robin et al. (2017).

If one further assumes that *μ* is ergodic and that the limit of *d*_{z,r} as *r* → 0 exists for *μ*-almost every *z*, then *μ* is said to be exact dimensional and the limit is independent of *z* (Young 1982). This typical value of the local dimension is the order-one Renyi dimension, also called information dimension, or attractor dimension, and is here noted *D*_{1}. It is a typical value in the sense that for every *z* and for small enough *r*, *d*_{z,r} is close to *D*_{1}. Also, *D*_{1} can be estimated by taking the average of the estimates of local dimension (see next section):

*D*_{1} ≈ ⟨*d*_{z,r}⟩_{z},   (3)

where ⟨⋅⟩_{z} denotes an average over target points *z* sampled from *μ*.

The finite-resolution local dimension *d*_{z,r}, however, can deviate from the typical value *D*_{1}. More precisely, *d*_{z,r} exhibits large deviations from its limit value. The amplitude of these deviations depends on (−log*r*)^{−1/2} and on the spectrum of fractal dimensions (for more details, see Caby et al. 2019).

These definitions of dimension correspond to the notion of attractor dimension, which comes from the field of dynamical systems. There are strong connections with other mathematical objects used to estimate dimensionality in computer science and machine learning. These include the doubling dimension (Gupta et al. 2003) and expansion dimension (Karger and Ruhl 2002) which are related to ratios of volume occupied by data, and the intrinsic dimension (Houle 2013), which is related to the minimum number of variables needed to correctly represent a dataset. The local intrinsic dimension as defined by Houle (2017) is closely related to the local attractor dimension *d*_{z,r} which is used in the present study.

The definitions of *B*_{z,r}, *d*_{z,r}, and *D*_{1} depend on the metric that is used to evaluate distances. However, we show in appendix A that the limit value *D*_{1} is independent of the choice of metric; therefore, *d*_{z,r} is also expected to depend only lightly on the metric that is used. The theoretical results expressed in this paper in the limit of small distance *r* → 0 (or, equivalently, of large catalog *L* → +∞) are valid whatever the metric used. Note that this does not apply to measures of similarity such as correlation or statistical divergence, that are not actual metrics (of which we recall the definition in appendix A).

All these definitions are valid in the limit of small distance *r*, which can be hard to achieve in high dimension due to the concentration of norms or “curse of dimensionality” (Verleysen and François 2005). The effect of the curse of dimensionality on the estimation of dimensions following Eq. (1) was studied analytically and numerically by Pons et al. (2020), with effects starting to be nonnegligible in dimension ≈40. In the numerical experiments presented here, we have checked empirically that the concentration of norms was small enough.

The distance from the *k*th analog *a*_{k}(*z*) to the target *z* is noted *r*_{k}(*z*). In the following, we often leave the *z* dependency implicit, writing simply *a*_{k} and *r*_{k} rather than *a*_{k}(*z*) and *r*_{k}(*z*). Analog-to-target distances always depend on a target *z*, and the only way to remove this dependency would be through averaging, which is done only in section 4e. Distances are sorted so that *r*_{1}(*z*) < *r*_{2}(*z*) < ⋅⋅⋅ < *r*_{K}(*z*), and *K* is the total number of analogs considered. Empirical methods usually set *K* to a fixed value, reaching for a bias-variance trade-off. A small value of *K* typically increases the variance of the analog method, for instance, in the case of observation noise. Raising the value of *K* allows us to average out this variability. However, a too large value of *K* would include analogs that are too far from the target and not relevant, therefore raising bias. For an example of this bias-variance trade-off, see Platzer et al. (2021). An alternative is to set a threshold *R* on the analog-to-target distances and keep all analogs within *R* of the target, so that *r*_{K}(*z*) < *R* < *r*_{K+1}(*z*); this amounts to looking at a lower quantile of the distances. In this case, *K*(*z*) depends on *z*. This is referred to as the epsilon nearest-neighbor search. However, in the numerical experiments of this paper we always set *K* to a fixed value.

### b. Simple scaling of analog-to-target distance with local dimension

The finite-resolution local dimension *d*_{z,r} can be estimated using the empirical cumulative distribution function (CDF) of points inside a ball of exponentially decreasing radius:

*μ*(*B*_{z,r_K e^{−s}}) / *μ*(*B*_{z,r_K}) ≈ e^{−ds},   (2)

where *s* takes values according to the available data, that is, for the *k*th analog of *z*, *s*_{k} = −log(*r*_{k}/*r*_{K}), and *r*_{K} is the distance to the farthest of the *K* closest available analogs. It follows from Caby et al. (2019) that, for regular enough measures, the *s*_{k} are approximately distributed according to an exponential law with rate parameter *d*, so that *d* can be estimated by fitting an exponential law to the empirical CDF of the *s*_{k}.

In the following and unless otherwise noted, "the local dimension" or *d* both refer to the finite-resolution local dimension estimated from the *K* closest analogs through Eq. (2); depending on context, *d* might also refer to *d*_{z,r} as defined in Eq. (1).

A practical application of Eq. (2) to the system of Lorenz (1963) (see appendix D for a formal definition of this system) is given in Fig. 1. Another way to estimate *d* is through maximum likelihood, as the inverse *σ*^{−1} of the empirical mean *σ* of the *s*_{k}. As can be seen in the example of Fig. 1, both methods give similar results.
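As an illustration, the estimation step described above can be sketched in a few lines. This is a minimal sketch, not the authors' code: the uniform-square catalog, the target point, and all variable names are our own choices, made so that the true local dimension is known (here *d* = 2).

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal sketch (not the authors' code): catalog of L points drawn
# uniformly in the unit square, so the true local dimension is 2.
L, K = 100_000, 150
catalog = rng.random((L, 2))
z = np.array([0.5, 0.5])  # target state (arbitrary choice)

# distances to the K closest analogs, sorted so that r_1 < ... < r_K
r = np.sort(np.linalg.norm(catalog - z, axis=1))[:K]

# s_k = -log(r_k / r_K) is approximately exponential with rate d,
# so the maximum-likelihood estimate of d is the inverse mean of s_k
s = -np.log(r[:-1] / r[-1])
d_hat = 1.0 / s.mean()
print(d_hat)  # close to 2 for this catalog
```

The relative standard deviation of this estimator is roughly *K*^{−1/2}, which is why a moderately large *K* is needed for a stable estimate.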

One can also estimate *D*_{1} from the average of realizations of the local-dimension estimate over many target points *z*, each computed from the analogs [*a*_{k}(*z*)]_{k=1,…,K}. Caby et al. (2019) use this approximation to estimate the attractor dimension of the system of Lorenz (1963, hereafter noted L63) as 2.06, which is in agreement with values found in the literature.

The exponential law of Eq. (2) also implies a scaling of *r*_{k}(*z*) with *k*:

*r*_{k}(*z*) ≈ *r*_{K} (*k*/*K*)^{1/d},   (4)

where the multiplicative constant is set by *r*_{K}.

Equation (4) already reveals an important point of our analysis, which is the scaling of *r*_{k} with *k*, and is approximately given by a power-law with exponent 1/*d*. However, this formula comes from a work on local dimensions, not analog-to-target distances. It is therefore not surprising that some of the elements required for our study are missing. In particular, this scaling does not give the constant in front of *k*^{1/d}, in which resides the relation to the catalog size, a crucial point for analog applications. Also, it only gives a mean or typical value of *r*_{k}(*z*), while our objective is to evaluate the probability distribution of *r*_{k}(*z*) at fixed *z* and *L*, or at least the probability of departures from this mean scaling.

The next section gives the full probability distribution of *r*_{k}(*z*) for a fixed target *z* as a function of the local dimension, the catalog size, and the analog number *k*.

### c. Full probability distribution of analog-to-target distance

Under the hypothesis that *μ*(*B*_{z,r}) ≈ *r*^{d} for *r* below *r*_{K} (distances being normalized accordingly), we show in appendix B that, for fixed catalog size *L*, analog number *k*, and target *z*, the *k*th analog-to-target distance *r*_{k}(*z*) follows the following probability density function:

*p*_{k}(*r*) = *d* (*L* *r*^{d})^{k} / [(*k* − 1)! *r*] exp(−*L* *r*^{d}),   (5)

where *p*_{k}(*r*) is defined through P(*r*_{k} ∈ [*r*, *r* + d*r*]) = *p*_{k}(*r*) d*r*, and both *r*_{k} and *d* depend on *z*. Equation (5) was obtained neglecting the variations of *d*_{z,r} with *r*, and therefore, in practice we assume that *d*_{z,r} is constant for *r* below *r*_{K}. The value of *d* is thus assumed to be independent of *k*. This is consistent with practical applications where the limited available data do not allow us to evaluate fine variations of *d*_{z,r} with *r*, but where clear variations of *d* with *z* are witnessed and reveal different dynamical situations (Faranda et al. 2017). Therefore, in this section *d* always refers to this fixed local value. An alternative derivation of Eq. (5), based on the joint distribution of the *K* largest-order statistics from extreme value theory, is given in appendix C.

From Eq. (5), one can compute the mean and standard deviation of *r*_{k} for fixed *k*, *z*, *L*, and *d*:

⟨*r*_{k}⟩ = *L*^{−1/d} Γ(*k* + 1/*d*)/Γ(*k*),   (6a)

std(*r*_{k})/⟨*r*_{k}⟩ = [Γ(*k* + 2/*d*) Γ(*k*)/Γ(*k* + 1/*d*)^{2} − 1]^{1/2},   (6b)

where Γ is Euler's Gamma function. The Gamma functions can be approximated for large argument *x*, using Laplace's method up to second order to evaluate the integral (the first order gives Stirling's formula). This gives the following expressions for the mean and relative standard deviation:

⟨*r*_{k}⟩ ≈ (*k*/*L*)^{1/d},   (7a)

std(*r*_{k})/⟨*r*_{k}⟩ ≈ 1/(*d* *k*^{1/2}),   (7b)

which is consistent with the scaling *r*_{k} ~ *k*^{1/d} of Eq. (4). These approximations are the result of Taylor expansions for large *k* from Eqs. (6a) and (6b), and will therefore be increasingly valid as *k* grows. However, even for *k* = 1, Eqs. (7a) and (7b) give a satisfactory numerical approximation of Eqs. (6a) and (6b).
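These approximations can be probed numerically. The sketch below assumes the representation *L* *r*_{k}^{d} ~ Gamma(*k*, 1), which is one way to realize the density of Eq. (5); the parameter values are our own, chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed representation: L * r_k**d ~ Gamma(k, 1) under Eq. (5),
# so samples of r_k can be drawn directly from a Gamma distribution.
L, k, d = 10_000, 10, 3.0
r_k = (rng.gamma(k, 1.0, size=200_000) / L) ** (1.0 / d)

mean_mc = r_k.mean()
rel_std_mc = r_k.std() / mean_mc

print(mean_mc, (k / L) ** (1 / d))       # Eq. (7a): both close to 0.1
print(rel_std_mc, 1 / (d * np.sqrt(k)))  # Eq. (7b): both close to 0.105
```

With *k* = 10 and *d* = 3, the large-*k* approximations are already within about 1% of the Monte Carlo moments.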

When *kd* > 1, one can also compute the value of *r* for which *p*_{k} reaches a maximum:

*r*_{mode} = [(*k* − 1/*d*)/*L*]^{1/d},   (8)

while for *kd* ≤ 1, *p*_{k} is strictly decreasing, with *p*_{k}(0) = +∞. Note that the three quantities ⟨*r*_{k}⟩, (*k*/*L*)^{1/d}, and the mode of Eq. (8) are equivalent as *k* → +∞.

It is natural to rescale *r*_{k} by the factor *L*^{1/d}, later on referred to as the catalog density. The probability distribution of *u*_{k} = *L*^{1/d} *r*_{k} can be computed by imposing the corresponding change of variables in Eq. (5): after rescaling by *L*^{1/d}, the probability density is independent of *L*.

Figure 2 shows plots of the probability density of *u*_{k} for varying values of *d* and *k*. As a consequence of the scaling *u*_{k} ~ *k*^{1/d}, we observe large variations of ⟨*u*_{k}⟩ with *k* for small dimensions *d*, and very small variations of ⟨*u*_{k}⟩ with *k* for large dimensions *d*. Note that, in the limiting case *d* → ∞, the random variables *r*_{k} are degenerate and all equal *L*^{−1/d} almost surely. This can be witnessed through the different scales of the horizontal axis of the plots. This result is a consequence of the concentration of norms in high dimension, which can cause the search for analogs to be meaningless. In particular, Beyer et al. (1999) showed that, under reasonable conditions, the ratio between the distance from a target state *z* to its nearest neighbor *r*_{1} and the distance to the farthest point in a dataset *r*_{L} tends to 1 for infinitely high-dimensional systems. Finally, it might seem counterintuitive that large values of the horizontal axis are observed in low dimension and not in high dimension, but this is only because the *L*^{−1/d} factor was removed by rescaling. Figure 2 is still consistent with Eq. (7a), which shows that ⟨*r*_{k}⟩ is, at fixed *L*, a growing function of the local dimension *d*.
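The concentration phenomenon invoked above is easy to reproduce. In the sketch below (our own toy setup: Gaussian catalogs and a target at the origin, not the systems studied in this paper), the ratio between nearest and farthest distances approaches 1 as the dimension grows, in line with Beyer et al. (1999).

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy illustration of the concentration of norms: as the dimension
# grows, the nearest and farthest catalog points are at almost the
# same distance from the target (here taken at the origin).
L = 10_000
ratios = {}
for dim in (2, 10, 100, 1000):
    catalog = rng.standard_normal((L, dim))
    dists = np.linalg.norm(catalog, axis=1)  # distances to the origin
    ratios[dim] = dists.min() / dists.max()
print(ratios)  # the ratio r_1 / r_L increases toward 1 with dim
```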

Also, as a consequence of Eqs. (7), we have that the standard deviation of *r*_{k} is a growing function of *k* for *d* < 2, while it is constant for *d* = 2 and decreasing for *d* > 2. However, the relative standard deviation of *r*_{k} is always a decreasing function of *k* and *d* according to Eq. (7b).

### d. Normalization and convergence to the standard normal distribution

We now study in more detail the distribution of *u*_{k}, and propose a normalization of the variable *r*_{k} (at fixed *z*) that depends on the local dimension *d*, on the analog number *k* < *K*, and on the catalog size *L*. Equations (7a) and (7b) suggest the change of variables from *r*_{k} to *υ*_{k} as

*υ*_{k} = *d* *k*^{1/2} [(*L*/*k*)^{1/d} *r*_{k} − 1].

We note *h*_{k}(*υ*) the probability density function of the normalized *k*th analog-to-target distance, so that *υ* = *d* *k*^{1/2} [(*L*/*k*)^{1/d} *r* − 1] and *h*_{k}(*υ*) d*υ* = *p*_{k}(*r*) d*r*. This gives

*h*_{k}(*υ*) = [*k*^{k−1/2}/(*k* − 1)!] [1 + *υ*/(*d* *k*^{1/2})]^{kd−1} exp{−*k* [1 + *υ*/(*d* *k*^{1/2})]^{d}},   (9)

and it can be shown that the distribution of *υ*_{k} approaches the standard normal distribution for large *k*. Note, however, that this limit cannot be fully observed in practice, as the distribution of Eq. (5) is valid only in the limit of large catalog size and with *k* ≪ *L*. In practice, as the convergence is in *k*^{−1/2}, the relative difference between *h*_{k} and the standard normal distribution is of ≈15% for *k* = 40.
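This convergence can be checked numerically. The sketch below again assumes the Gamma representation *L* *r*_{k}^{d} ~ Gamma(*k*, 1) to sample *r*_{k}; the values of *L*, *d*, and *k* are arbitrary choices of ours.

```python
import numpy as np

rng = np.random.default_rng(3)

# Normalized variable of section 2d:
#   v_k = d * sqrt(k) * ((L / k)**(1/d) * r_k - 1),
# sampled through the assumed representation L * r_k**d ~ Gamma(k, 1).
L, d = 10_000, 2.0
for k in (5, 40):
    r_k = (rng.gamma(k, 1.0, size=500_000) / L) ** (1.0 / d)
    v = d * np.sqrt(k) * ((L / k) ** (1.0 / d) * r_k - 1.0)
    print(k, v.mean(), v.std())  # approaches N(0, 1) as k grows
```

The residual bias of the mean shrinks roughly like *k*^{−1/2}, consistent with the convergence rate stated above.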

### e. Distances in observation space

In many applications, one does not have access to the full system state *z*, but rather to an observable *y* = *f*(*z*) defined through a vector-valued function *f*. In this case, analogs are sought in observation space, and the relevant measure of a ball *B*_{y,r} is *μ*(*f*^{−1}(*B*_{y,r})), where *f*^{−1} is the inverse image of *f* that acts on sets (not vectors), and can therefore be defined even when *f* is not invertible. This allows us to define an observation-based dimension:

*d*^{f}_{y,r} = log*μ*(*f*^{−1}(*B*_{y,r})) / log*r*.

The limit of *d*^{f}_{y,r} as *r* → 0 is a function both of *D*_{1} and of properties of *f*. For instance, if *f* is differentiable and its Jacobian matrix at *z* is of rank *m* > 0, then *f* locally behaves as a linear map and one typically obtains an observed dimension of min(*d*, *m*). If, on the contrary, *f* is quadratic and its Jacobian matrix at *z* is zero, then distances in observation space scale as squared distances in state space, and therefore the observed dimension is *d*/2.

However, if we keep the hypothesis that *μ* is ergodic and *z* is a nonperiodic point, we can conduct the same analysis as in appendix B, replacing *μ*(*B*_{z,r}) by *μ*(*f*^{−1}(*B*_{y,r})), *z* by *y* = *f*(*z*), and *d*_{z,r} by *d*^{f}_{y,r}. Assuming that *f* is *C*^{∞} and taking the limit *r* → 0, we recover Eq. (5), only replacing *d* by *d*^{f}.

Therefore, the statistics of analog-to-target distances in observation space also follow Eq. (5), this time with a dimension that depends not only on the dynamical system, but also on properties of the observable.

## 3. Consequences for applications of analogs

### a. Comparison with previous studies

Van Den Dool (1994) evaluated the catalog size needed to find, with 95% probability, at least one analog at distance less than a tolerance *ε*. With our notations, this condition can be written

1 − (1 − *α*)^{L} = 0.95,   (10)

where *α* is the probability that the distance between two arbitrarily chosen states is less than *ε* and can be expressed as the integral of a Gaussian probability density function. For small *ε*, Eq. (10) gives *L* ≈ 3/*α*.

Similar results can be found from Eq. (5). Indeed, one has P(*r*_{1} < *ε*) = 1 − exp(−*L* *ε*^{d}), so that Eq. (10) is recovered with *α* ≈ *ε*^{d}. Here, *D*_{1} is replaced by the local finite-resolution dimension *d*_{z,r}.

Nicolis (1998) extended the work of Van Den Dool (1994). Interpreting Eq. (10) in terms of mean return times and using the formula from Kac (1959), she found an expression of mean return times equivalent to the identity *L* ≈ 3/*α*. Such a formula, however, says nothing of the variability of return times around their mean. Nicolis (1998) therefore performed empirical estimates of such variations of the return time, shedding light on the pitfalls of an analysis limited to mean return times.

In the present paper, the point of view switches from statistics of return times to statistics of analog-to-target distance, and is extended to the *K* closest analogs rather than just the first one. The full probability distribution of Eq. (5) gives a detailed view of the variability of the process of searching for analogs.

Note that our work has many connections to the one of Houle (2017), who also studied probability distributions of distance functions. However, we are not aware of any published work giving probability distributions of analog distances such as in Eq. (5).

### b. Searching for analogs: Consequences

The full probability distribution of Eq. (5) has many consequences for the practical search of analogs.

For very low-dimensional systems (*D*_{1} < 2), the first analog-to-target distance has a lower variability than the next ones, so that a given value of *r*_{1} will be more representative of the next values of *r*_{1} than a given value of *r*_{10} would be of the next values of *r*_{10}. The inverse phenomenon happens for higher dimensional systems (*D*_{1} > 2). This can be taken into account to evaluate the expected performances of analog methods.

Also, the scaling *r*_{k} ~ *k*^{1/d} implies that the growth with *k* of the mean analog-to-target distance is much faster for low-dimensional systems: in low dimension, the *k*th analog is substantially farther from the target *z* than the first one. Again, this is consistent with the work of Beyer et al. (1999) on the concentration of norms in high dimensions. In low dimension, the sensitivity of analog-to-target distances to the choice of *K* (i.e., the number of analogs used) is thus higher than in high dimension. In practice, in the case of a sparse catalog (i.e., if the density of points *L*^{1/d} is not large enough to ensure finding very close analogs), a low value of *K* might be preferred in order to avoid using analogs too far away from the target. Conversely, in high dimension and with a similar density *L*^{1/d}, using a small or a large number of analogs should not play an important role on analog-to-target distances. However, note that the most important factor driving analog-to-target distances remains the catalog density *L*^{1/d}, which is higher in low dimension if the catalog size *L* is fixed. Therefore, our analysis is still consistent with the fact that, for a given size of dataset, better analogs will be found in low attractor dimension than in high attractor dimension. The higher sensitivity of analog-to-target distances to *K* in low dimension is only true if *L*^{1/d} is fixed, which means that we are comparing the case of a low dimension *d* and a small catalog size *L* to the case of a high dimension and a large catalog size.

For instance, Lguensat et al. (2017) use analogs to produce forecasts of several well-known dynamical systems, setting *K* = 40; the use of Gaussian kernels with a variable bandwidth equal to *λ*_{z} = median_{k}*r*_{k} allows us to give a very low weight to analogs at distance *r*_{k} > *λ*_{z}. One might think that this filtering out of analogs with *r*_{k} > *λ*_{z} makes the forecast procedure relatively insensitive to the choice of *K*. However, assuming that *λ*_{z} ≈ ⟨*r*_{[K/2]}⟩, where [*K*/2] is the integer part of *K*/2, we have that *λ*_{z} grows with *K* as *λ*_{z} ~ *K*^{1/d}. Thus, for low-dimensional systems such as the one of L63 for which *D*_{1} ≈ 2.06, our results suggest that in the case of a low sampling density, high values of *K* might have detrimental effects on the efficiency of analog methods. This statement is tested in section 4b.

However, note that here we focus on analog-to-target distances assuming that they are an important driving factor of the efficiency of analog methods, but in practice many other parameters come into play, such as the choice of the proper metric, or the choice of the feature space. The tuning of analog methods does not reduce to the objective of minimizing analog-to-target distances. Nevertheless, our results can be used, with caution, to indicate tendencies and general behaviors of analog methods.

In particular, the scaling ⟨*r*_{k}⟩ ~ (*k*/*L*)^{1/d} can be used in the context of dimension reduction. Assume that one wants to perform a statistical task that necessitates *K* analogs (for instance, an ensemble forecast), and that the dimension must be reduced in order to have ⟨*r*_{K}⟩ < *ε*. From the scaling ⟨*r*_{k}⟩ ~ (*k*/*L*)^{1/d}, we find that the dimension must be reduced to at least *d*_{max,K} = {1 − [log(*K*)]/[log(*L*)]}*d*_{max,1}, where *d*_{max,1} is the largest dimension for which ⟨*r*_{1}⟩ < *ε*. Detailed arguments and a practical example are given in section 4e. Thus, for instance, if the criterion ⟨*r*_{1}⟩ < *ε* is met for *d*_{max,1} = 10 and if *L* = 10^{4}, then the criterion ⟨*r*_{25}⟩ < *ε* will be met only for *d*_{max,25} ≈ 6.5. This shows that any dimension reduction performed with the objective of decreasing analog distances strongly depends on how many analogs are required.
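The dimension budget above is a one-liner; the sketch below (function name ours) reproduces the numerical example with the formula stated in the text.

```python
import math

# Dimension budget from the scaling <r_k> ~ (k / L)**(1/d): if
# <r_1> < eps is achievable up to dimension d_max_1, then <r_K> < eps
# requires the reduced dimension returned below.
def d_max(K, L, d_max_1):
    return (1.0 - math.log(K) / math.log(L)) * d_max_1

print(d_max(25, 10_000, 10.0))  # ≈ 6.5
```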

Finally, the joint distribution of analog-to-target distances from appendix C theoretically allows us to express the probability distributions of any random variable of the form Σ_{k} *ω*_{k} *r*_{k}^{p}, where (*ω*_{k})_{k} are weights and *p* is a positive integer. Such quantities can give error bounds for analog methods [see Platzer et al. (2021) for the case of analog forecasting]. However, a closed form for the distribution of such variables is yet to be derived.

## 4. Numerical experiments

### a. Three-variable Lorenz system

The local dimension *d* is first estimated at each target point *z* using the *K* closest analogs in the system of L63. This procedure is illustrated in Fig. 1. Then the scaling of Eq. (7a) is used to make a least squares fit from the data:

*r*_{k}(*z*) ≈^{LS} *C*(*z*) *k*^{1/d},   (11)

where *r*_{k}(*z*) is the observed *k*th analog-to-target distance and ≈^{LS} means that the constant *C*(*z*) is evaluated with least squares from Eq. (11). Figure 3 shows an application of this procedure for a given *z* of the L63, plotting the real values of *r*_{k}(*z*) and using *C*(*z*)*k*^{1/d} as an approximation for ⟨*r*_{k}(*z*)⟩; dotted lines show the standard deviation around the mean from the approximate relative standard deviation given in Eq. (7b).

From Eq. (7a), one would expect

*C*(*z*) ≈ *L*^{−1/d}.   (12)

However, as *L* takes large values (from 10^{5} to 10^{7} or more), a small estimation error for *d* results in a large estimation error for *L*^{−1/d}. Another way to look at this estimation issue is that *d* is relatively insensitive to a rescaling of distances or a change of unit. Let

*d*^{(ρ)}_{z,r} = log*μ*_{z,r} / log(*r*/*ρ*),

where *ρ* is a scalar value, *r*/*ρ* is a rescaled version of *r*, or equivalently *r* expressed in a different unit system, and *μ*_{z,r} = *μ*(*B*_{z,r}). Note that we use *μ*_{z,r} and not *μ*_{z,r/ρ}, as we only changed the unit of *r*, not the actual distance it represents. Then

*d*^{(ρ)}_{z,r} = *d*_{z,r} (1 − log*ρ*/log*r*)^{−1} ≈ *d*_{z,r} (1 + log*ρ*/log*r*),   (13)

valid when |log*ρ*| ≪ |log*r*|. In particular, the method of Caby et al. (2019) is insensitive to a change of unit, as it involves only ratios of distances [see Eq. (2)]. Thus, Eq. (12) does not necessarily hold when *C* and *d* are determined as explained above. This is why *C*(*z*) is rather evaluated through Eq. (11), which allows one to find the scaling factor *ρ*(*z*) defined through

*C*(*z*) = *ρ*(*z*) *L*^{−1/d}.   (14)

Note that similar issues are raised by Faranda et al. (2011) regarding the continuity of *μ*_{z,r} with respect to *r* and its limiting behavior for small *r*, which motivates Lucarini et al. (2014) to postulate that *μ*_{z,r} is the product of a slowly varying function of *r* and a power law in *r*, which is in some sense equivalent to our hypothesis that *C*(*z*) has to be rescaled with *ρ*(*z*) when the local dimension is estimated from the method of Caby et al. (2019).

The fact that *ρ*(*z*) varies with *z* (and is thus not exactly a change of unit) can be explained by the possibility for two points *z*_{1} and *z*_{2} to have the same local dimension but different local densities of the measure: for a random variable *Z* distributed according to *μ*, there may exist values *z*_{1} and *z*_{2} such that the probability for *Z* to lie in the vicinity of *z*_{1} is higher than in the vicinity of *z*_{2}, and yet *d*_{z_1,r} = *d*_{z_2,r}.

Equations (11) and (14) are tested in numerical experiments using the system of L63, with results reported in Fig. 4. Analogs of a fixed target point *z* are sought for in 3 × 600 independent catalogs, with three different catalog sizes. Each catalog is built from a random draw without replacement of *L* points inside a (common) trajectory of 10^{9} points, generated using a Runge–Kutta numerical scheme with a time step of 0.01 in usual nondimensional notations. The dimension is calculated using *K* = 150 points, where this number is justified by a bias-variance trade-off: using this number and testing the procedure on 100 points picked from the measure *μ*, one finds a mean dimension *D*_{1} from Eq. (3) between 2.03 and 2.04, which is coherent with values reported by Caby et al. (2019), and a standard deviation of ~0.26. Using a lower value of *K* results in a higher variance, and using higher values results in biases that are dependent on the value of *L* used in this study. For more details on the distribution of local dimensions in the system of L63 the reader is referred to Faranda et al. (2017).

The consistency of empirical densities of *ρ* across varying values of *L* validates the scaling of *C* with *L* and *d*. Empirical probability densities of rescaled analog-to-target distances, also consistent across varying catalog sizes, are coherent with the theoretical probability densities from Eq. (5). The values of the rescaling parameter *ρ* are not surprising, as typical values of distances between points in the attractor are ~16 and maximum distances are ~28. Note that Nicolis (1998) uses a rescaling in studying analog return times with Lorenz’s three-variable system, dividing all distances by the maximum distance between two points on the attractor. The fact that *ρ*(*z*) exhibits seemingly large values is only the result of the choice of variables in the system of L63. For instance, it is possible to make a change of variables that would result in a system having the same chaotic properties, the same dimension, defined by almost the same dynamical equations, but with variables spanning smaller ranges, which would give numerical values of *ρ*(*z*) close to 1 (see appendix D).

Repeating this experiment for different target points *z* gives similar results. Values of *ρ* are of the same order of magnitude as the ones reported in Fig. 4. The consistency across varying values of *L* is almost always recovered, except for some points that have slightly higher dimensions. This points to a limitation of using a fixed number of analogs *K* when estimating the dimension and the rescaling factor: the choice of *K* = 150 is relevant for most points, but should be adapted to the local dimension. Moreover, the use of other metrics (Manhattan, order-8 Minkowski, Chebyshev) has a very small influence on the results presented in Fig. 4.

Finally, we have conducted the same experiments but using observations of the first coordinate of the Lorenz system. The results are shown in Fig. 5. Again, the numerical data fit the theory, with an observed dimension close to 1 as expected. These last numerical experiments confirm the fact that our theory can be applied to observables of dynamical systems.

### b. N-variable system of Lorenz (1996)

In sections 2c and 3b, we state that for a fixed catalog local density *L*^{1/d}, the sensitivity of analog-to-target distances with *k* is stronger in low dimension. We also make the link between this sensitivity and the choice of *K* to be made for the efficiency of analog methods. Here we propose a simple illustration with analog forecasting on the system of Lorenz (1996) (see appendix D for a description of the system).

We use a time step of 0.05 (nondimensional units) to generate catalogs. We perform one forecast experiment with *N* = 12 variables and another with *N* = 20 variables. Dimensions *D*_{1} were estimated from Eq. (3) on an independent trajectory of 10^{5} points, and with full, perfect observation catalogs of size 10^{5} for each value of *N* (these were not the catalogs used to perform forecasts). This gives values of *D*_{1} ≈ 8 when *N* = 12 and *D*_{1} ≈ 12 when *N* = 20.

Catalog sizes were chosen so that the catalog density is similar in both experiments: *L* = 10^{3} time units for the *N* = 12-variable system, and *L* = 10^{5} time units for the *N* = 20-variable system. We used catalogs of noisy observations, adding independent and identically distributed (i.i.d.), zero-mean Gaussian white noise to a trajectory of full observations. The standard deviation of the noise was set to 1% of the root-mean-square distance (RMSD) between two states picked randomly in the attractor.

The analog forecast was simply done with a weighted mean of the successors of the *K* closest analogs, with weights defined by Gaussian kernels of bandwidth *λ*(*z*), where *λ*(*z*) is defined as the median over *k* of the values *r*_{k}(*z*), as explained in section 3b and used in Lguensat et al. (2017) and Platzer et al. (2021). Values of *K* = 5, 15, 25, 50, and 75 were tested for the total number of analogs. Distances were evaluated using the Euclidean metric. The analog forecast error was computed as the Euclidean distance between the analog forecast and the true future state, divided by the RMSD.
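The forecast step just described can be sketched as follows. This is our own minimal reimplementation under stated assumptions (Gaussian-kernel weights exp[−(*r*_{k}/*λ*)^{2}] with *λ* the median analog distance); the toy catalog and near-identity dynamics are ours, not the Lorenz (1996) setup of this section.

```python
import numpy as np

rng = np.random.default_rng(4)

def analog_forecast(z, catalog, successors, K=25):
    """Weighted mean of the successors of the K closest analogs,
    with Gaussian-kernel weights and median-distance bandwidth."""
    dists = np.linalg.norm(catalog - z, axis=1)
    idx = np.argsort(dists)[:K]   # the K closest analogs
    r = dists[idx]
    lam = np.median(r)            # bandwidth lambda(z)
    w = np.exp(-((r / lam) ** 2))
    w /= w.sum()
    return w @ successors[idx]

# Toy usage: near-identity "dynamics" with small additive noise
catalog = rng.random((2000, 3))
successors = catalog + 0.01 * rng.standard_normal((2000, 3))
z = np.array([0.4, 0.5, 0.6])
forecast = analog_forecast(z, catalog, successors)
print(forecast)  # close to z for this toy system
```

The median bandwidth strongly downweights analogs beyond *λ*(*z*), which is the filtering effect discussed in section 3b.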

Figure 6 shows medians of analog forecast errors from this numerical experiment as a function of forecast horizon. First, it can be seen that the errors from the two experiments are very similar in magnitude, confirming that analog forecast errors strongly depend on analog-to-target distances (Platzer et al. 2021), which are largely determined by catalog density as we have seen. These errors are between 15% and 40% of the RMSD, the latter being the mean error of a climatological forecast that estimates the future state as a constant equal to the average over all states in the catalog. The analog forecast errors from Fig. 6 are therefore relatively high, which was expected since the catalog density is quite low.

In the higher dimension *D*_{1} ≈ 12 and for small forecast horizons (≤0.15), using five analogs results in the highest forecast error, because for this system averaging over a large number of analogs helps the forecast and reduces observational noise (Platzer et al. 2021). Still for small forecast horizons (≤0.15) and attractor dimension *D*_{1} ≈ 12, using 15, 25, 50, or 75 analogs does not make a significant difference. This is consistent with the fact that analog-to-target distances grow slowly with *k* in high dimension. For the same catalog density but in the lower dimension *D*_{1} ≈ 8, the worst forecast is still obtained with a low number of analogs *K* = 5, but values of *K* above 25 (i.e., *K* = 50, 75) also increase the forecast error, since analog-to-target distances grow faster with *k* in lower dimension. For larger forecast horizons (≥0.15), the error is increased by the chaotic dynamics of the system, and this growth is stronger for large values of *K*, which correspond to larger analog-to-target distances. For these larger time horizons and in dimension *D*_{1} ≈ 8, using *K* = 5, 15, or 25 analogs results in lower forecast errors than using *K* = 50 or 75 analogs.

This example illustrates the higher sensitivity of analog methods to the choice of *K* in low dimension. It also suggests that, at fixed catalog size *L*, reducing the dimension (through any dimension-reduction technique) allows one to find more analogs close to the target.

### c. AROME reanalysis data: Dimensionality

To further appreciate the applicability of our results to high-dimensional, real geophysical systems, the theoretical developments of section 2 are tested on five years (2015–19) of hourly 10-m wind output from the physical model AROME (Ducrocq et al. 2005), coupled with satellite, radar, and in situ observations through a variational data assimilation scheme (similar to the one of Fischer et al. 2005). The spatial domain is an evenly spaced grid over Brittany, with latitudes ranging from 47.075° to 49.3°, longitudes from −5.7° to −2.575°, and a spacing of 0.025°. To focus on wind at sea, land points are removed from the data, resulting in a domain of 8190 grid points.

Note that this dataset does not consist of state vectors, but of partial observations (10-m wind, over a finite-width, evenly spaced grid) of the state of the atmosphere. Projections of the state *z* would classically be denoted *y* = *f*(*z*). However, we keep the notations *z*, *r*_{k}, *d*, *D*_{1} when referring to quantities computed directly from the 10-m wind data. As stated in section 2e, our analytical derivations remain valid for observational data, although the dimension *d* obtained when searching analogs of observables can differ from the dimension obtained when searching analogs of the system state.

From these data, one can compute local dimensions with the method of Caby et al. (2019). As the data are limited (~3 × 10^{4} time points), *K* is set to 40. Note also that, as elements of the catalog are only one hour apart, they cannot be assumed to be independent. Therefore, if several analogs are neighbors in time, only one analog is retained, selected randomly from the set of time-neighboring analogs. Also, analogs that are less than one-and-a-half days away from the target *z* are discarded. Usually, analogs are searched for in a time window of fixed length around the calendar date of the target *z*. In this example, however, searching for analogs with or without calendar-date restriction gave similar estimates of dimension and analog-to-target distances, indicating that the closest analogs naturally lay in seasons similar to those of their targets *z*.

Histograms of local dimensions are shown in Fig. 7a; the average dimension, which estimates *D*_{1} here, is 16. Our local dimension histogram is similar in shape to the one of Faranda et al. (2017), who also focused on North Atlantic circulation (in their study, the local dimension is called “instantaneous dimension”). However, our histogram shows slightly higher average dimensions and a higher variability. Note that we focus on two components of horizontal wind velocity on a dense grid of ~10^{4} grid points, while Faranda et al. (2017) focus on sea level pressure (SLP) at ~10^{3} grid points. Therefore, it is not surprising that we find higher average values of the local dimension. The higher variability we observe in the local dimension could be due to an intrinsically higher variability of this dynamical indicator, but also to a higher variability in the process of estimating *d* caused by a lack of data. Indeed, we have slightly less data than Faranda et al. (2017), for a system of slightly higher dimension, so that we can find fewer good analogs to estimate *d*. Faranda et al. (2017) use *L* ~ 2 × 10^{4} days of historical data. We use ~4 × 10^{4} hours of data, which must be divided by the typical correlation time scale in hours. If we assume that the latter is between 12 and 24 h, we find that our *L* is between 1.5 × 10^{3} and 3 × 10^{3}.
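Local dimensions of this kind can be estimated from the nearest-analog distances themselves. Below is a minimal Hill-type sketch in the spirit of EVT-based estimators; the exact estimator of Caby et al. (2019) may differ in its details, and the function name and sanity-check data are ours.

```python
import numpy as np

def local_dimension(catalog, z, K=40):
    """Hill-type estimate of the local dimension d at target z.

    For small radii, P(r <= s | r <= r_K) ~ (s / r_K)**d, so the log-ratios
    log(r_K / r_k) are approximately exponential with mean 1/d; their
    empirical mean inverted gives an estimate of d.
    """
    r = np.sort(np.linalg.norm(catalog - z, axis=1))
    r = r[r > 0][:K + 1]                  # drop exact matches, keep K+1 closest
    logratio = np.log(r[-1] / r[:-1])     # log(r_K / r_k) for the K closest
    return 1.0 / logratio.mean()

# Sanity check on uniform samples in a 3D cube: the estimate is close to 3.
rng = np.random.default_rng(1)
cat = rng.uniform(size=(200_000, 3))
d_hat = local_dimension(cat, 0.5 * np.ones(3), K=40)
```

With only K = 40 exceedances the relative error is of order 1/√K, i.e., roughly 15%, which is one reason dimension estimates on short catalogs are noisy.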

Faranda et al. (2017) found a seasonality in the local dimension of SLP fields, with higher dimensions and a higher variability in winter. In our case, no seasonal trend in the mean or median dimension is observed, but the weekly variability of local dimensions is higher in winter, as witnessed in Fig. 7b. Also, a diurnal cycle can be seen in Fig. 7c, with the dimension increasing in daytime and decreasing at nighttime. As diurnal variability is mixed with other sources of variability, it cannot always be identified by eye (see the first three days of Fig. 7c). Histograms of dimension restricted to daytime are similar to histograms restricted to nighttime, so that the diurnal cycle does not appear to be the main driver of dimension variability.

We repeated the experiments leading to the histograms of Fig. 7a, but using different metrics (the Manhattan distance, the order-8 Minkowski metric, and the Chebyshev distance). This did not result in significant changes, except that the dimension estimates were slightly larger when using the order-8 Minkowski and Chebyshev metrics (not shown). This further demonstrates the robustness of our results to a change of metric.

### d. AROME reanalysis data: Analog distances

An example of target state and analogs is shown in Fig. 8. The chosen target state is a classical winter situation in Brittany, with strong eastward wind coming from the sea. Thus, good analogs are found in the catalog. It is hard to discriminate which analog is closest: for such a high-dimensional system, the first analog-to-target distances are very similar.

Note that for this moderately high dimensional system, the concentration of norms might make the search for analogs meaningless, as pointed out by Beyer et al. (1999). For very high-dimensional systems, the ratio between the distance to the nearest analog, *r*_{1}, and the distance to the furthest point in the catalog, *r*_{L}, is close to one, making the search for analogs irrelevant. Moreover, Hinneburg et al. (2000) showed that for order-*p* Minkowski metrics the difference between the distance to the furthest point and to the nearest neighbor scales as *d*^{(1/p) − (1/2)}, indicating that the concentration of norms may behave differently for different types of distances. To ensure that this concentration of norms was not an issue, we computed *r*_{1}/*r*_{L} for every point in the catalog (again, omitting neighbors in time to compute *r*_{1}), and for Minkowski metrics of order 1 (the Manhattan distance), 2 (the Euclidean distance), 8, and infinity (the Chebyshev distance, or infinity norm). This allowed us to compute histograms of *r*_{1}/*r*_{L} (not shown), which showed a very low probability for *r*_{1}/*r*_{L} to exceed 0.3 whatever the distance used. This shows that the curse of dimensionality is not a severe issue for our example of 10-m wind reanalysis, and that looking for analogs is still meaningful.
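This concentration-of-norms diagnostic is easy to reproduce on synthetic data. The sketch below (our own setup, with i.i.d. Gaussian points standing in for a catalog) computes the ratio *r*_{1}/*r*_{L} for an order-*p* Minkowski metric and shows it drifting toward 1 as dimension grows.

```python
import numpy as np

def nearest_to_farthest_ratio(catalog, z, p=2.0):
    """Ratio r_1 / r_L between nearest-analog and farthest-point distances
    to z, for an order-p Minkowski metric (p = np.inf gives the Chebyshev
    distance). Values close to 1 signal concentration of norms
    (Beyer et al. 1999)."""
    diffs = catalog - z
    if np.isinf(p):
        d = np.abs(diffs).max(axis=1)
    else:
        d = (np.abs(diffs) ** p).sum(axis=1) ** (1.0 / p)
    d = d[d > 0]                   # omit the target itself if present
    return d.min() / d.max()

# In low dimension the ratio stays small; in very high dimension it
# approaches 1 for i.i.d. data, so ranking analogs loses meaning.
rng = np.random.default_rng(2)
ratio_low = nearest_to_farthest_ratio(rng.normal(size=(5000, 3)), np.zeros(3))
ratio_high = nearest_to_farthest_ratio(rng.normal(size=(5000, 1000)), np.zeros(1000))
```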

Knowing the local dimension and the constant *C*(*z*) (through the least squares approximation introduced in the previous section), it is possible to approximate the rescaled theoretical variable *υ*_{k} (introduced in section 2d) from the observed analog-to-target distances, although this approximation becomes less reliable for large values of *k*. For this reason, and due to the small catalog size, only probability densities up to *k* = 8 will be studied; otherwise, the expressions obtained theoretically in the limit *L* → +∞ are likely not to hold.

To obtain these distributions, analogs of each hourly target *z* are searched for in the catalog, *C*(*z*) is computed from Eq. (11), and the distances are rescaled following Eq. (15) and then stored. Finally, the stored values are used to build empirical histograms of the rescaled distances. The same procedure is applied to the three-variable Lorenz system, using a catalog of *L* = 10^{6} points and testing the procedure on 10^{5} target points taken from a trajectory independent of the catalog. Also, the theoretical density functions of *υ*_{k} from Eq. (9) are shown for similar (fixed) dimensions.

Figure 9 shows a relatively good agreement between theoretical and empirical distributions, especially for the Lorenz data. Indeed, the curves of Figs. 9b and 9d are similar in shape, especially the asymmetry for *k* = 1. As *k* grows, the variance of the empirical data (Fig. 9b) becomes smaller than expected in theory (Fig. 9d). This can be explained by the fact that the assumption *L* → +∞ (or, equivalently, *r*_{k} → 0) is better satisfied for low values of *k*: high values of *r*_{k} are associated with a low variability. This also explains the lower variance of the empirical curves (Fig. 9a) compared to the theoretical curves (Fig. 9c) for the wind data. Again, the asymmetry in the shape of the curves for *k* = 1 is reproduced, and the estimation of the mean fits our theory.

This experiment shows that the present theory, which was derived assuming a large catalog density, remains partially applicable to limited catalog densities. Although the full distributions of *r*_{k} at fixed *k*, *z*, and *L* depart from theory when the density is low, our estimates of the mean ⟨*r*_{k}(*z*)⟩ are satisfying even for low catalog densities. Therefore, most of our theory seems applicable to partial observations of real, moderately high-dimensional systems with limited catalog size (here, only 5 years of data, for a system of observed dimension *D*_{1} ≈ 16). The fact that our theory could eventually break down for even lower values of the catalog density is not worrying, as it would mean that analog-to-target distances would probably be too large for analogs to be used anyway.

### e. AROME reanalysis data: Objective-based dimension reduction

In this section, we examine how much dimension reduction is needed for analogs to satisfy a criterion of the form ⟨*r*_{k}(*z*)⟩ ≤ *ε* RMSD, where *r*_{k}(*z*) is the *k*th analog-to-target distance, RMSD is the root-mean-squared distance between two points randomly taken from the dataset, and *ε* is a user-defined threshold.

This criterion involves ⟨*r*_{k}(*z*)⟩, which is the mean over all possible realizations of the *k*th analog-to-target distance at fixed target *z* and catalog size *L*. In this section this mean is further averaged over targets *z*, while in the rest of this document *r*_{k}(*z*) depends on *z*, and so does ⟨*r*_{k}(*z*)⟩.

We reduce dimension using empirical orthogonal functions (EOFs), projecting the data onto the *N*^{eof} first EOFs, which allows us to reduce analog-to-target distances; *N*^{eof} denotes the number of retained EOFs.

According to the theoretical study of Caby et al. (2020), we expect the estimated dimension to depend on both *N*^{eof} and the attractor dimension of the dynamical system under study. For large enough *N*^{eof}, we should recover the dimension of the original dataset (*D*_{1} is the dimension found using the original dataset). For small *N*^{eof}, in principle, the estimated dimension should be close to *N*^{eof}, and the transition between the two regimes occurs when *N*^{eof} is close to *D*_{1}.

We assume that, for a given target *z*, *ρ*(*z*) is on the order of one. When using the method described in the previous sections to compute *C*(*z*), we find that *ρ*(*z*) is typically between 0.4 and 0.7. Then we make an ergodicity hypothesis, and averaging over *z*, we finally find an approximate scaling in which *D*_{max,k} is a linear function of log(*k*). This can be rearranged to give Eq. (17).

This last expression shows how *D*_{max,k} strongly depends on *k*. As a practical example, assume that *D*_{max,1} ≈ 10 and that *L* = 10^{4}; then *D*_{max,25} ≈ 6. In this reasoning, we assume that the number of required analogs is fixed. Whether reducing dimension actually decreases analog distances thus strongly depends on how many analogs are needed for the analog method. For instance, if an ensemble of analogs is used to estimate the full probability density function of a one-dimensional variable (say, the day after tomorrow’s accumulated rainfall over the city of Paris), then one might need at least 100 analogs, yet 10 analogs might be enough to simply estimate the mean of the distribution. As another example, if one wants to estimate the covariance associated with the forecast error of 5 independent variables, one needs at the very least 5 analogs, but 50 analogs might be necessary, especially in the presence of observational noise. Also, the complexity of the system under study might vary with phase-space location, so that the number of required analogs could depend on the state *z*. In practice, the number of required analogs is a complex function of the quantity to be estimated, the quality of the data, the method that is used, and the properties of the system at stake.
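One reading of this scaling, consistent with the numbers quoted above, follows from *r*_{k} ~ (*k*/*L*)^{1/d} (in RMSD units, with order-one constants absorbed into *ε*; that absorption is our simplifying assumption). Requiring *r*_{k} ≤ *ε* then gives *D*_{max,k} ≈ log(*L*/*k*)/log(1/*ε*), linear in log *k*:

```python
import math

def d_max(k, L, eps):
    """Maximum dimension such that the expected k-th analog-to-target
    distance stays below eps (in RMSD units), under the sketch scaling
    r_k ~ (k / L)**(1/d). Order-one constants are absorbed into eps
    (our assumption), so this reproduces the form of Eq. (17) only."""
    return math.log(L / k) / math.log(1.0 / eps)

# Reproducing the paper's practical example: with L = 1e4 and eps chosen
# so that D_max,1 = 10, the 25th analog gives D_max,25 close to 6.
L = 1e4
eps = 10 ** (-math.log10(L) / 10)   # calibrated so that d_max(1, L, eps) = 10
d1 = d_max(1, L, eps)               # 10
d25 = d_max(25, L, eps)             # ~6.5
```

The calibrated *ε* ≈ 0.4 is, reassuringly, in the range of the *ρ*(*z*) values quoted above.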

Figure 10 shows a comparison of this scaling with numerical experiments performed on the AROME reanalysis data. Upper and lower bounds for *D*_{max,k} were derived from the estimated values of *ρ*(*z*), which lie between 0.4 and 0.7. For numbers of EOFs *N*^{eof} below these bounds we find that the criterion on analog-to-target distances is satisfied, while for larger *N*^{eof} we find that it is not.

The way to use these equations for *D*_{max,k} in practice depends on the particular application. For instance, if one wants to perform a statistical task such as downscaling, one might impose a fixed minimum number *K* of samples to correctly represent a statistical distribution. One might, at the same time, ask that the analog-to-target distance not exceed a given threshold, to ensure a good quality of analogs (assuming that this “quality” is correctly estimated by the chosen distance). Then our formulas can be used to estimate how much dimension reduction is needed to fulfill these criteria, by choosing a number of EOFs close to the theoretical value of *D*_{max,k}.

Another possibility is that the required number of samples *K* varies with *D*_{1}. This is the case in ensemble forecasting, where one wants to use the successors of analogs to estimate the covariance matrix of the future state. If the local dimension is *d*, we can assume that the data have been projected on some ⌈*d*⌉-dimensional space, where ⌈*d*⌉ is the ceiling of *d* (i.e., the smallest integer *i* such that *i* ≥ *d*). In this case, the covariance matrix of the future state is of size ⌈*d*⌉ × ⌈*d*⌉, and the forecast state is a ⌈*d*⌉-dimensional vector. Therefore, one needs at least *K* ≥ ⌈*d*⌉ analogs for the estimated covariance matrix not to be degenerate. Identifying *d* with *D*_{1}, this last inequality sets a minimum value of *K* that grows with *D*_{1}. Since *D*_{max,K} from Eq. (17) is a decreasing function of *K*, the intersection of these two constraints gives a maximum value of *D*_{max,K} and *D*_{1} that is independent of *K*. This maximum value is fixed by the threshold *ε* and the required relationship between the minimum value of *K* and the dimension *D*_{1}.

However, note that our formulas do not reveal how much information is lost when reducing the dimension. For instance, in the case of forecasting, the maximum dimension *D*_{max,K} might be too low to represent accurately the dynamics of the system. In such a case, one is bound either to increase the value of *L* (which can rarely be done) or to increase the value of *ε* (which might decrease the efficiency of the analog method).

## 5. Conclusions

We combined extreme value theory and dynamical systems theory to derive analytical joint probability distributions of analog-to-target distances in the limit of large catalog density. Those distributions shed new light on the influence of dimension in the practical use of analogs. In particular, we found that, at fixed catalog density, analog-to-target distances are more sensitive to the number of analogs used in low dimension than in high dimension. Contrary to previous works on the probability of finding good analogs, this study focuses on distances rather than return times, gives whole probability distributions rather than first moments, and looks at the *K* closest analogs rather than only the closest one. Numerical simulations of the three-variable Lorenz system confirm the theoretical findings. An example of a practical consequence of our theory, namely the sensitivity of analog forecasts to the number of analogs used depending on dimension, is given using the system of Lorenz (1996). The 10-m wind reanalysis data from the AROME physical model show that our analysis is also relevant for observations of real systems. Our investigation indicates that the studied wind fields lie on an attractor of moderately high dimension ~16. In this situation of moderate dimensionality, the analog-to-target distances of the first analogs are all very similar and have a low variability. Our theoretical derivations can be used to find the optimal dimension reduction for the purpose of decreasing analog distances, which we demonstrate on an example using the AROME reanalysis data. These examples reveal the applicability of the derived probability distributions even to relatively low catalog densities.

## Acknowledgments

The work was financially supported by ERC Grant 338965-A2C2 and ANR Grant 10-IEED-0006-26 (CARAVELE project). This piece of work took its origins in discussion with Théophile Caby, to whom we express our gratitude. In particular, appendix A is an adaptation of a derivation by Théophile Caby. The theoretical derivations of the probability density functions shown in this paper are the result of several exchanges with Benoît Saussol, who we must thank here. We are indebted to Fabrice Collard, Bertrand Chapron, and Caio Stringari, for fruitful insights and discussions about the exploration and interpretation of the AROME reanalysis data. Finally, the last version of this manuscript owes much to the meticulous work of three anonymous reviewers who we thank again here.

## APPENDIX A

### Proof that *D*_{1} is Independent of Metric Choice in Finite Dimension

A metric dist(·, ·) must be positive for any two distinct vectors **z**_{1} ≠ **z**_{2}, and must verify dist(**z**_{1}, **z**_{1}) = 0, dist(**z**_{1}, **z**_{2}) = dist(**z**_{2}, **z**_{1}), and, for any third vector **z**_{3}, dist(**z**_{1}, **z**_{3}) ≤ dist(**z**_{1}, **z**_{2}) + dist(**z**_{2}, **z**_{3}).

Let *B*_{z,r} be the ball of radius *r* around the point *z*, defined with the distance dist. In finite dimension, any two metrics dist and dist′ are equivalent: there exist two positive constants *q* and *Q* such that for all points *z* and *a*:

*q* dist′(*z*, *a*) ≤ dist(*z*, *a*) ≤ *Q* dist′(*z*, *a*).

This double inequality implies a double inclusion of balls, *B*′_{z,r/Q} ⊂ *B*_{z,r} ⊂ *B*′_{z,r/q}, valid for all *z* and all positive real numbers *r*, where *B*′ denotes balls defined with dist′. Taking the logarithm of the measure of this double inclusion, and dividing by log *r*, we find that the finite-resolution local dimensions computed with the two metrics differ only by terms of order 1/log *r*, so that taking the limit *r* → 0 gives the same value of *D*_{1} for both metrics.

This means that for small values of *r*, the estimates *d*_{z,r} obtained with the two metrics approach the same limit, and are therefore close to each other. However, this proof does not give the convergence rate. In particular, it is possible to find a metric for which the convergence of *d*_{z,r} to *D*_{1} is arbitrarily slow. Therefore, for a given dataset, it is always possible to find a specific metric such that dimension estimates are far from the real limit value *D*_{1}. Nevertheless, this peculiar behavior is not expected for usual metrics, such as the order-*p* Minkowski metrics used in the numerical examples of the present paper.

## APPENDIX B

### Direct Proof for *p*_{k}(*r*)

In this appendix, we give the proof of Eq. (5) by evaluating directly the probability that analogs lie between the sphere of radius *r* and the sphere of radius *r* + *δr*.

#### a. Poisson distribution of the number of analogs in a ball

For a given target *z*, the number of visits *V*(*z*, *r*) of a trajectory of size *L* into the ball *B*_{z,r} follows a Poisson distribution with mean *Lμ*(*B*_{z,r}):

P[*V*(*z*, *r*) = *k*] = [*Lμ*(*B*_{z,r})]^{k} exp[−*Lμ*(*B*_{z,r})] / *k*!,   (B1)

where *k*! is *k* factorial. In the context of analogs, this is the probability of finding *k* analogs with distances to *z* below the radius *r*. In machine learning, this is called the epsilon nearest neighbor search. In the following we write *μ*_{z,r} = *μ*(*B*_{z,r}).

#### b. Distribution of analogs close to the sphere

Now we use *μ* to evaluate the probability that the *k*th analog-to-target distance lies between *r* and *r* + *δr*, for fixed *k* and *z*, and where *δr* is small compared to *r*.

The event “*r*_{k} ∈ [*r*, *r* + *δr*)” is the intersection of the event “there are *k* − 1 analogs in the ball *B*_{z,r}” and the event “there is one analog in the shell between *B*_{z,r} and *B*_{z,r+δr}”. The latter event has a probability of the same form as Eq. (B1), but replacing *k* by 1 and *μ*_{z,r} by *δμ*_{z,r} = *μ*_{z,r+δr} − *μ*_{z,r}. Assuming that *μ* is regular enough so that *δμ*_{z,r} → 0 as *δr* → 0, the event that *r*_{k} lies between *r* and *r* + *δr* has a probability of

P[*r*_{k} ∈ [*r*, *r* + *δr*)] ≈ *Lδμ*_{z,r} (*Lμ*_{z,r})^{k−1} exp(−*Lμ*_{z,r}) / (*k* − 1)!.

This last equation is a more general form of our main result which is given in the next section. Here, the probability is expressed in terms of the invariant measure, which is usually not known analytically. The next section expresses the same probability in terms of the analog-to-target distance *r*.

#### c. Distribution of analog-to-target distances

The link between *μ*_{z,r} and *r* is given by the definition of the finite-resolution local dimension in Eq. (1): *μ*_{z,r} = *r*^{d}, with *d* = *d*_{z,r}. In this section, we first acknowledge the variations of *d*_{z,r} with *r*, to better justify why they are neglected in the rest of the paper. Therefore, in this section *d* = *d*_{z,r} for varying values of *z* and *r*, while in the rest of the paper *d* usually refers to a value at the fixed distance *r*_{K}.

The relation between *δμ*_{z,r} and *δr* involves variations of the local dimension with *r*. Let Δ = *d*_{z,r+δr} − *d*_{z,r}; we have *μ*_{z,r+δr} = (*r* + *δr*)^{d+Δ} = *μ*_{z,r} *r*^{Δ}(1 + *δr*/*r*)^{d+Δ}, which gives

log(*μ*_{z,r+δr}/*μ*_{z,r}) = Δ log *r* + (*d* + Δ) log(1 + *δr*/*r*).

Assuming Δ ≪ *d*, and keeping only lower-order terms, we find

*δμ*_{z,r}/*μ*_{z,r} ≈ *d*(*δr*/*r*) + Δ log *r*.

The term *d*(*δr*/*r*) represents an almost steady increase in *μ*_{z,r} when *r* grows. The term Δ log *r* represents fluctuations in this increase, given by the fluctuations in *d*_{z,r}. In practice, the method described in section 2b to evaluate *d* smooths out the variations of *d*_{z,r} with *r* at scales smaller than *r*_{K}. Thus, the approximation *δμ*_{z,r} ≈ *dμ*_{z,r}*δr*/*r* is used in the following. Dividing the probability of the previous section by *δr*, one can then define *p*_{k}, the probability density function of *r*_{k}, through the identity *p*_{k}(*r*)*δr* = P[*r*_{k} ∈ [*r*, *r* + *δr*)].
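The Poisson-process picture behind this derivation can be checked numerically. Under Eq. (B1), the rescaled variable *Lμ*(*B*_{z,r_k}) follows a Gamma(*k*, 1) law with mean *k*; the sketch below (our own setup, uniform measure on the unit square so that *μ*(*B*_{z,r}) = π*r*²) verifies this for the *k*th analog distance.

```python
import numpy as np

# Check that L * mu(B_{z, r_k}) = L * pi * r_k**2 has mean close to k,
# as implied by the Poisson law (B1) for a uniform measure in 2D.
rng = np.random.default_rng(4)
L, k, trials = 5000, 3, 300
z = np.array([0.5, 0.5])
v = []
for _ in range(trials):
    r = np.sort(np.linalg.norm(rng.uniform(size=(L, 2)) - z, axis=1))
    v.append(L * np.pi * r[k - 1] ** 2)   # r_k is the k-th smallest distance
v = np.asarray(v)                          # ~ Gamma(k, 1) samples, mean ~ k
```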

## APPENDIX C

### Alternative Proof for *p*_{k}(*r*) and Joint Probability Distribution Using *K* Largest-Order Statistics

Lucarini et al. (2016) give a detailed analysis of the map from a point of the attractor to its distance to the target *z*. This map is lower bounded by 0, and its minimum over the catalog is attained at the closest analog *a*_{1}; the minimum is thus *r*_{1}. EVT tells (see Coles et al. 2001) that, in the limit of a large catalog, the minimum of this lower-bounded distance map on a finite sample of the attractor (a catalog of size *L*) follows a Weibull distribution, after rescaling. The Poisson law from Eq. (B1) with *k* = 1 actually gives the scaling and the exact form of the Weibull distribution:

P(*r*_{1} ≤ *r*) = 1 − exp(−*Lμ*_{z,r})

for small enough *r*; otherwise, the probability is 1.

The *K* largest-order statistics of (the negative of) this distance function then correspond to the *K* analogs of the point *z*. Again, in the limit of a large catalog and for small enough *K*, EVT provides the limit law (see Coles et al. 2001) for the *k*th minimum of this distance function when *L* → ∞. Multiplying the conditional probabilities of successive order statistics gives the joint distribution *p*_{1:K} in the limit *L* → ∞:

*p*_{1:K}(*r*_{1}, …, *r*_{K}) = [∏_{k=1}^{K} *L* (d*μ*_{z,r}/d*r*)|_{r=r_k}] exp(−*Lμ*_{z,r_K}),

with *r*_{1} < *r*_{2} < ⋅⋅⋅ < *r*_{K}. For notational convenience, and only in this formula, the random variables *r*_{k} are noted identically to the values they can possibly take.

## APPENDIX D

### The Lorenz (1963) and Lorenz (1996) Systems

The three-variable system of Lorenz (1963) is defined by the equations

d*X*_{1}/d*t* = *β*_{1}(*X*_{2} − *X*_{1}),
d*X*_{2}/d*t* = *X*_{1}(*β*_{2} − *X*_{3}) − *X*_{2},
d*X*_{3}/d*t* = *X*_{1}*X*_{2} − *β*_{3}*X*_{3},

with *β*_{1} = 10, *β*_{2} = 28, and *β*_{3} = 8/3. In this case, the variables *X*_{1}, *X*_{2}, and *X*_{3} span values between approximately [−20, 20], [−20, 20], and [0, 40], respectively. If we now make the change of variables *X*_{i} → *X*_{i}/30, the variables *X*_{1}, *X*_{2}, and *X*_{3} span values between approximately [−2/3, 2/3], [−2/3, 2/3], and [0, 4/3], respectively. For this new system, the values of *ρ*(*z*) calculated as in section 4a of the present paper would be close to 1 and not to 28.
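The ranges quoted above are easy to reproduce by integrating the system; a minimal sketch with a standard RK4 scheme (integration step, initial condition, and function names are ours):

```python
import numpy as np

# RK4 integration of the three-variable Lorenz (1963) system with the
# classical parameters beta1 = 10, beta2 = 28, beta3 = 8/3.
def lorenz63(x, b1=10.0, b2=28.0, b3=8.0 / 3.0):
    x1, x2, x3 = x
    return np.array([b1 * (x2 - x1), x1 * (b2 - x3) - x2, x1 * x2 - b3 * x3])

def rk4_step(f, x, dt):
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

x = np.array([1.0, 1.0, 1.0])
traj = []
for i in range(20000):            # dt = 0.01; a short transient is discarded
    x = rk4_step(lorenz63, x, 0.01)
    if i > 1000:
        traj.append(x.copy())
traj = np.asarray(traj)           # X1, X2 stay within ~[-20, 20], X3 within ~[0, 48]
```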

The *N*-variable system of Lorenz (1996) is defined by the equations

d*x*_{i}/d*t* = (*x*_{i+1} − *x*_{i−2})*x*_{i−1} − *x*_{i} + *θ*,   *i* = 1, …, *N*,

where *θ* is the forcing parameter. In our numerical experiments we use the value *θ* = 8 and two different values *N* = 12 and *N* = 20, with periodic boundary conditions *x*_{i+N} = *x*_{i}.
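The periodic coupling makes this system convenient to integrate with vectorized index shifts; a minimal sketch (step size as in section 4b, initial perturbation and names are ours):

```python
import numpy as np

# The N-variable Lorenz (1996) system with periodic boundary conditions:
# dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + theta.
def lorenz96(x, theta=8.0):
    # np.roll implements the periodic indices: roll(x, -1)[i] = x[i+1], etc.
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + theta

def rk4_step(f, x, dt):
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

N = 12
x = 8.0 * np.ones(N)
x[0] += 0.01                      # small perturbation to leave the fixed point
traj = []
for i in range(4000):             # dt = 0.05, as in section 4b
    x = rk4_step(lorenz96, x, 0.05)
    if i >= 1000:                 # discard a transient
        traj.append(x.copy())
traj = np.asarray(traj)           # bounded, chaotic trajectory
```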

## REFERENCES

Alexander, R., Z. Zhao, E. Székely, and D. Giannakis, 2017: Kernel analog forecasting of tropical intraseasonal oscillations. *J. Atmos. Sci.*, **74**, 1321–1342, https://doi.org/10.1175/JAS-D-16-0147.1.

Ayet, A., and P. Tandeo, 2018: Nowcasting solar irradiance using an analog method and geostationary satellite images. *Sol. Energy*, **164**, 301–315, https://doi.org/10.1016/j.solener.2018.02.068.

Beyer, K., J. Goldstein, R. Ramakrishnan, and U. Shaft, 1999: When is “nearest neighbor” meaningful? *Int. Conf. on Database Theory*, Jerusalem, Israel, ICDT, 217–235.

Birkhoff, G. D., 1931: Proof of the ergodic theorem. *Proc. Natl. Acad. Sci. USA*, **17**, 656–660, https://doi.org/10.1073/pnas.17.2.656.

Caby, T., D. Faranda, G. Mantica, S. Vaienti, and P. Yiou, 2019: Generalized dimensions, large deviations and the distribution of rare events. *Physica D*, **400**, 132143, https://doi.org/10.1016/j.physd.2019.06.009.

Caby, T., D. Faranda, S. Vaienti, and P. Yiou, 2020: Extreme value distributions of observation recurrences. *Nonlinearity*, **34**, 118–163, https://doi.org/10.1088/1361-6544/abaff1.

Cattiaux, J., R. Vautard, C. Cassou, P. Yiou, V. Masson-Delmotte, and F. Codron, 2010: Winter 2010 in Europe: A cold extreme in a warming climate. *Geophys. Res. Lett.*, **37**, L20704, https://doi.org/10.1029/2010GL044613.

Coles, S., J. Bawa, L. Trenner, and P. Dorazio, 2001: *An Introduction to Statistical Modeling of Extreme Values*. Vol. 208. Springer, 208 pp.

Daley, D. J., and D. Vere-Jones, 2003: *Elementary Theory and Methods*. Vol. I, *An Introduction to the Theory of Point Processes*, Springer, 471 pp.

Ducrocq, V., F. Bouttier, S. Malardel, T. Montmerle, and Y. Seity, 2005: Le projet AROME. *Houille Blanche*, **91**, 39–43, https://doi.org/10.1051/lhb:200502004.

Faranda, D., V. Lucarini, G. Turchetti, and S. Vaienti, 2011: Extreme value distribution for singular measures. arXiv, https://arxiv.org/abs/1106.2299.

Faranda, D., G. Messori, and P. Yiou, 2017: Dynamical proxies of North Atlantic predictability and extremes. *Sci. Rep.*, **7**, 41278, https://doi.org/10.1038/srep41278.

Farmer, J. D., and J. J. Sidorowich, 1988: Exploiting chaos to predict the future and reduce noise. *Evolution, Learning and Cognition*, Y. C. Lee, Ed., World Scientific, 277–330.

Fettweis, X., E. Hanna, C. Lang, A. Belleflamme, M. Erpicum, and H. Gallée, 2013: Important role of the mid-tropospheric atmospheric circulation in the recent surface melt increase over the Greenland ice sheet. *Cryosphere*, **7**, 241–248, https://doi.org/10.5194/tc-7-241-2013.

Fischer, C., T. Montmerle, L. Berre, L. Auger, and S. E. Ştefănescu, 2005: An overview of the variational assimilation in the ALADIN/France numerical weather-prediction system. *Quart. J. Roy. Meteor. Soc.*, **131**, 3477–3492, https://doi.org/10.1256/qj.05.115.

Gupta, A., R. Krauthgamer, and J. R. Lee, 2003: Bounded geometries, fractals, and low-distortion embeddings. *44th Annual IEEE Symp. on Foundations of Computer Science*, Cambridge, MA, IEEE, 534–543, https://doi.org/10.1109/SFCS.2003.1238226.

Hamill, T. M., M. Scheuerer, and G. T. Bates, 2015: Analog probabilistic precipitation forecasts using GEFS reforecasts and climatology-calibrated precipitation analyses. *Mon. Wea. Rev.*, **143**, 3300–3309, https://doi.org/10.1175/MWR-D-15-0004.1.

Hamilton, F., T. Berry, and T. Sauer, 2016: Ensemble Kalman filtering without a model. *Phys. Rev. X*, **6**, 011021, https://doi.org/10.1103/PhysRevX.6.011021.

Haydn, N., and S. Vaienti, 2019: Limiting entry times distribution for arbitrary null sets. arXiv, https://arxiv.org/abs/1904.08733.

Hinneburg, A., C. C. Aggarwal, and D. A. Keim, 2000: What is the nearest neighbor in high dimensional spaces? *26th Int. Conf. on Very Large Databases*, Cairo, Egypt, VLDB, 506–515, https://www.vldb.org/dblp/db/conf/vldb/HinneburgAK00.html.

Houle, M. E., 2013: Dimensionality, discriminability, density and distance distributions. *2013 IEEE 13th Int. Conf. on Data Mining Workshops*, Dallas, TX, IEEE, 468–473, https://doi.org/10.1109/ICDMW.2013.139.

Houle, M. E., 2017: Local intrinsic dimensionality I: An extreme-value-theoretic foundation for similarity applications. *Int. Conf. on Similarity Search and Applications*, Munich, Germany, SISAP, 64–79.

Jézéquel, A., P. Yiou, and S. Radanovics, 2018: Role of circulation in European heatwaves using flow analogues. *Climate Dyn.*, **50**, 1145–1159, https://doi.org/10.1007/s00382-017-3667-0.

Kac, M., 1959: *Probability and Related Topics in Physical Sciences*. Vol. 1. Interscience Publishers, 266 pp.

Karger, D. R., and M. Ruhl, 2002: Finding nearest neighbors in growth-restricted metrics. *Proc. 34th Annual ACM Symp. on Theory of Computing*, Montreal, QC, Canada, ACM, 741–750, https://doi.org/10.1145/509907.510013.

Lamperti, J., 1964: On extreme order statistics. *Ann. Math. Stat.*, **35**, 1726–1737, https://doi.org/10.1214/aoms/1177700395.

Lguensat, R., P. Tandeo, P. Ailliot, M. Pulido, and R. Fablet, 2017: The analog data assimilation. *Mon. Wea. Rev.*, **145**, 4093–4107, https://doi.org/10.1175/MWR-D-16-0441.1.

Lorenz, E. N., 1963: Deterministic nonperiodic flow. *J. Atmos. Sci.*, **20**, 130–141, https://doi.org/10.1175/1520-0469(1963)020<0130:DNF>2.0.CO;2.

Lorenz, E. N., 1969: Atmospheric predictability as revealed by naturally occurring analogues. *J. Atmos. Sci.*, **26**, 636–646, https://doi.org/10.1175/1520-0469(1969)26<636:APARBN>2.0.CO;2.

Lorenz, E. N., 1996: Predictability: A problem partly solved. *Proc. Seminar on Predictability*, Reading, United Kingdom, ECMWF, https://www.ecmwf.int/en/elibrary/10829-predictability-problem-partly-solved.

Lucarini, V., D. Faranda, J. Wouters, and T. Kuna, 2014: Towards a general theory of extremes for observables of chaotic dynamical systems. *J. Stat. Phys.*, **154**, 723–750, https://doi.org/10.1007/s10955-013-0914-6.

Lucarini, V., and Coauthors, 2016: *Extremes and Recurrence in Dynamical Systems*. John Wiley and Sons, 312 pp.

Milnor, J., 1985: On the concept of attractor. *The Theory of Chaotic Attractors*, B. R. Hunt et al., Eds., Springer, 243–264.

Nicolis, C., 1998: Atmospheric analogs and recurrence time statistics: Toward a dynamical formulation. *J. Atmos. Sci.*, **55**, 465–475, https://doi.org/10.1175/1520-0469(1998)055<0465:AAARTS>2.0.CO;2.

Platzer, P., P. Yiou, P. Naveau, P. Tandeo, Y. Zhen, P. Ailliot, and J.-F. Filipot, 2021: Using local dynamics to explain analog forecasting of chaotic systems. *J. Atmos. Sci.*, **78**, 2117–2133, https://doi.org/10.1175/JAS-D-20-0204.1.

Poincaré, H., 1890: Sur le problème des trois corps et les équations de la dynamique. *Acta Math.*, **13**, A3–A270, https://doi.org/10.1007/BF02392506.

Pons, F. M. E., G. Messori, M. C. Alvarez-Castro, and D. Faranda, 2020: Sampling hyperspheres via extreme value theory: Implications for measuring attractor dimensions. *J. Stat. Phys.*, **179**, 1698–1717, https://doi.org/10.1007/s10955-020-02573-5.

Robin, Y., P. Yiou, and P. Naveau, 2017: Detecting changes in forced climate attractors with Wasserstein distance. *Nonlinear Processes Geophys.*, **24**, 393–405, https://doi.org/10.5194/npg-24-393-2017.

Schenk, F., and E. Zorita, 2012: Reconstruction of high resolution atmospheric fields for northern Europe using analog-upscaling. *Climate Past*, **8**, 1681–1703, https://doi.org/10.5194/cp-8-1681-2012.

Van Den Dool, H. M., 1994: Searching for analogues, how long must we wait? *Tellus*, **46A**, 314–324, https://doi.org/10.3402/tellusa.v46i3.15481.

Verleysen, M., and D. François, 2005: The curse of dimensionality in data mining and time series prediction. *Int. Work-Conf. on Artificial Neural Networks*, Warsaw, Poland, ICANN, 758–770.

Wang, X., and S. S. Shen, 1999: Estimation of spatial degrees of freedom of a climate field. *J. Climate*, **12**, 1280–1291, https://doi.org/10.1175/1520-0442(1999)012<1280:EOSDOF>2.0.CO;2.

Wetterhall, F., S. Halldin, and C.-Y. Xu, 2005: Statistical precipitation downscaling in central Sweden with the analogue method. *J. Hydrol.*, **306**, 174–190, https://doi.org/10.1016/j.jhydrol.2004.09.008.

Yiou, P., 2014: AnaWEGE: A weather generator based on analogues of atmospheric circulation. *Geosci. Model Dev.*, **7**, 531–543, https://doi.org/10.5194/gmd-7-531-2014.

Yiou, P., and C. Déandréis, 2019: Stochastic ensemble climate forecast with an analogue model. *Geosci. Model Dev.*, **12**, 723–734, https://doi.org/10.5194/gmd-12-723-2019.

Yiou, P., T. Salameh, P. Drobinski, L. Menut, R. Vautard, and M. Vrac, 2013: Ensemble reconstruction of the atmospheric column from surface pressure using analogues. *Climate Dyn.*, **41**, 1333–1344, https://doi.org/10.1007/s00382-012-1626-3.

Young, L.-S., 1982: Dimension, entropy and Lyapunov exponents. *Ergodic Theory Dyn. Syst.*, **2**, 109–124, https://doi.org/10.1017/S0143385700009615.