1. Introduction
Atmospheric analogs were introduced by Lorenz (1969) in a study on atmospheric predictability. The faster a target state z and its closest analog a1 diverge from one another, the harder it is to predict the evolution of z. In Lorenz’s study, the state z was characterized by height values of the 200-, 500-, and 850-hPa isobaric surfaces at a grid of ≈1000 points over the Northern Hemisphere. The database of available analogs, called the catalog, contained five years of twice-daily values. In his abstract, Lorenz states that there are “numerous mediocre analogues but no truly good ones.”
Since Lorenz’s work, analogs have been used in many applications such as weather generators (Yiou 2014), data assimilation (Hamilton et al. 2016; Lguensat et al. 2017), kernel forecasting (Alexander et al. 2017), downscaling (Wetterhall et al. 2005), nonlinear bias correction (Hamill et al. 2015), climate reconstruction (Schenk and Zorita 2012; Fettweis et al. 2013; Yiou et al. 2013), and extreme event attribution (Cattiaux et al. 2010; Jézéquel et al. 2018).
The reason why Lorenz could not find any good analog was made clear later on by Van Den Dool (1994). It was shown that for high-dimensional systems, the mean return time of a good analog (used as a proxy for a minimum catalog size) grows exponentially with dimension. This result is a variant for analogs of the “curse of dimensionality,” well known in data science. With three pressure levels over the whole Northern Hemisphere, the dimension of Lorenz’s study was very high, and 5 years of twice-daily data were not enough to hope to find a good analog.
Nicolis (1998) added a dynamical systems’ perspective to Van Den Dool’s analysis. She showed that studying mean return times was not enough, as the relative standard deviation of this return time could be very high. Furthermore, it was shown that return-time statistics exhibit strong local variations in phase-space, so that certain target states may need a larger catalog size to find good analogs.
Accounting for Van Den Dool’s findings, it is now usual to reduce the feature-space dimension as much as possible before searching for analogs. Also, the last decades have witnessed a proliferation of data from in situ and satellite observations, as well as outputs from numerical physics-based models. Such conditions allow one to find good analogs in many situations, and it has become standard to use not just one, but many analogs (usually a few tens). From a statistical perspective, using many analogs instead of one can increase estimation bias, but it reduces estimation variance, so that the estimation is less sensitive to noise. Using many analogs also allows us to perform local regression techniques on the analogs, such as local linear regression (Lguensat et al. 2017). This technique has proven efficient in analog forecasting applications (Ayet and Tandeo 2018), and it was shown that local linear regression allows analog forecasting to capture the local Jacobian of the dynamics of the real system (Platzer et al. 2021).
This new context suggests focusing not only on the closest analog a1, but also on the kth closest analog, for k up to ~40. The number of analogs used is usually the result of a trade-off between the number of available good analogs and the minimum number of analogs required to perform a given task (for instance, Yiou and Déandréis 2019 search for 20 analogs at each step to perform ensemble analog forecasts). Also, one can now reasonably hope to find good analogs using dimension reduction and a large amount of data. Thus, one is less interested in return times than in analog distances. That is, for a given length of available data, how far will the closest analogs be? The performance of analog-based methods is usually conditioned by analog-to-target distances [see, for instance, the relationship between analog distances and forecast performance in Farmer and Sidorowich (1988) and Platzer et al. (2021)]. In this work, we propose to evaluate the probability distribution of these distances. Our analytical probability distributions link analog-to-target distances, catalog size, and local dimension. This brings new insight into the impact of dimensionality on analog methods.
Section 2 outlines the theoretical framework and findings. Section 3 shows implications of the findings and compares the present analysis with past studies. Section 4 shows results from numerical experiments with the three-variable Lorenz (1963) system, the variable-dimension Lorenz (1996) system, and 10-m wind reanalysis data from the regional climate model AROME, further referred to as “the AROME reanalysis data.” Detailed derivations of the results of section 2 can be found in appendixes B and C.
2. Theory
a. Analogs in dynamical systems and local dimensions
We assume a dynamical system with an attractor set
The trajectory from which the analogs are taken is called the “catalog”
Note that for ergodic measures, μ(Bz,r) can be approximated by counting the number of times a given trajectory enters Bz,r [this is the consequence of the ergodic theorem of Birkhoff (1931)]. In the following, we assume that μ is ergodic and stationary. This does not apply when nonstationary processes, such as climate change, break the stationarity of μ. Also, in practice, periodic forcings such as seasonality make the structure of the attractor of a system such as the atmosphere vary between winter and summer. Therefore, analogs must be searched within a given time window around the calendar date of the target z, so that the subsampling allows us to recover an invariant measure (see Lorenz 1969; Yiou and Déandréis 2019). For a discussion on the modification of the invariant measure due to seasonality and nonperiodic forcing, see Robin et al. (2017).
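The counting approximation of μ(Bz,r) mentioned above can be illustrated on a toy example. The sketch below is not taken from the present study: it assumes an irrational rotation of the unit circle, used here only as a stand-in for an ergodic system whose invariant measure (the uniform measure) is known exactly. The names `visit_frequency`, `alpha`, and `n_steps` are illustrative.

```python
import numpy as np

def visit_frequency(trajectory, z, r):
    """Fraction of trajectory points lying within distance r of z on the unit circle."""
    diff = np.abs(trajectory - z)
    dist = np.minimum(diff, 1.0 - diff)  # shortest arc between the two points
    return np.mean(dist < r)

# Toy ergodic system (an assumption of this sketch): x -> x + alpha (mod 1),
# with alpha irrational, which preserves the uniform measure on [0, 1).
alpha = (np.sqrt(5.0) - 1.0) / 2.0
n_steps = 100_000
trajectory = (0.1 + alpha * np.arange(n_steps)) % 1.0

z, r = 0.3, 0.05
estimate = visit_frequency(trajectory, z, r)  # time average along one trajectory
exact = 2.0 * r                               # uniform measure of the ball B(z, r)
print(estimate, exact)
```

For a real system, the exact value is unknown and only the time average is available, which is why ergodicity and stationarity are assumed in what follows.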
The finite-resolution local dimension dz,r, however, can deviate from the typical value D1. More precisely, dz,r exhibits large deviations from its limit value. The amplitude of these deviations depends on (−log r)^{−1/2} and on the spectrum of fractal dimensions (for more details, see Caby et al. 2019).
These definitions of dimension correspond to the notion of attractor dimension, which comes from the field of dynamical systems. There are strong connections with other mathematical objects used to estimate dimensionality in computer science and machine learning. These include the doubling dimension (Gupta et al. 2003) and expansion dimension (Karger and Ruhl 2002) which are related to ratios of volume occupied by data, and the intrinsic dimension (Houle 2013), which is related to the minimum number of variables needed to correctly represent a dataset. The local intrinsic dimension as defined by Houle (2017) is closely related to the local attractor dimension dz,r which is used in the present study.
The definitions of Bz,r, dz,r, and D1 depend on the metric that is used to evaluate distances. However, we show in appendix A that the limit value D1 is independent of the choice of metric; therefore, dz,r is also expected to depend only lightly on the metric that is used. The theoretical results expressed in this paper in the limit of small distance r → 0 (or, equivalently, of large catalog L → +∞) are valid whatever the metric used. Note that this does not apply to measures of similarity such as correlation or statistical divergence, that are not actual metrics (of which we recall the definition in appendix A).
All these definitions are valid in the limit of small distance r, which can be hard to achieve in high dimension due to the concentration of norms or “curse of dimensionality” (Verleysen and François 2005). The effect of the curse of dimensionality on the estimation of dimensions following Eq. (1) was studied analytically and numerically by Pons et al. (2020), with effects starting to be nonnegligible in dimension ≈40. In the numerical experiments presented here, we have checked empirically that the concentration of norms was small enough.
The distance from the kth analog
b. Simple scaling of analog-to-target distance with local dimension
In the following and unless otherwise noted, “the local dimension,” or d, both refer to
A practical application of Eq. (2) with the system of Lorenz (1963) (see appendix D for a formal definition of this system) is given in Fig. 1. Another way to estimate
Equation (4) already reveals an important point of our analysis, which is that the scaling of rk with k is approximately given by a power law with exponent 1/d. However, this formula comes from a work on local dimensions, not analog-to-target distances. It is therefore not surprising that some of the elements required for our study are missing. In particular, this scaling does not give the constant in front of k^{1/d}, in which resides the relation to the catalog size, a crucial point for analog applications. Also, it only gives a mean or typical value of rk(z), while our objective is to evaluate the probability distribution of rk(z) at fixed z and L, or at least the probability of departures from this mean scaling.
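As an illustration of the power-law scaling only, the following sketch uses a hypothetical setting that is not part of the present study: catalogs of independent points drawn uniformly in a d-dimensional cube, for which the local dimension at the center equals d. The slope of log rk against log k, averaged over catalogs, should then be close to 1/d. All names and parameter values are assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
d, L, K, n_catalogs = 3, 20_000, 100, 50   # illustrative values
z = np.zeros(d)                            # target at the center of the cube

log_r = np.zeros((n_catalogs, K))
for i in range(n_catalogs):
    # Independent uniform catalog (an assumption: real catalogs are trajectories).
    catalog = rng.uniform(-1.0, 1.0, size=(L, d))
    dist = np.linalg.norm(catalog - z, axis=1)
    r = np.sort(np.partition(dist, K)[:K])  # K smallest distances, ascending
    log_r[i] = np.log(r)

# Fit log r_k = slope * log k + intercept on k = 10..K, using catalog-averaged log r_k.
k = np.arange(10, K + 1)
mean_log_r = log_r.mean(axis=0)[k - 1]
slope, intercept = np.polyfit(np.log(k), mean_log_r, 1)
d_est = 1.0 / slope
print(f"fitted slope {slope:.3f}, implied dimension {d_est:.2f} (true d = {d})")
```

The intercept of the fit carries the dependence on catalog size, which is precisely the information that the mean scaling alone does not make explicit.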
The next section gives the full probability distribution of rk(z) for a fixed target z as a function of the local dimension, the catalog size, and the analog number k.
c. Full probability distribution of analog-to-target distance
Figure 2 shows plots of
Also, as a consequence of Eqs. (7), we have that the standard deviation of rk is a growing function of k for d < 2, while it is constant for d = 2 and decreasing for d > 2. However, the relative standard deviation of rk is always a decreasing function of k and d according to Eq. (7b).
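These trends in k can be checked numerically under a simplifying assumption that we state explicitly, since the equations themselves are not reproduced in the sketch: we assume that the number of catalog points within distance r of the target is Poisson distributed with mean C r^d, so that C rk^d follows a gamma distribution with shape k and the moments of rk are ratios of gamma functions. The function name `rk_moments` and the constant `C` are illustrative.

```python
import math

def rk_moments(k, d, C=1.0):
    """Mean and standard deviation of r_k, assuming C * r_k**d ~ Gamma(shape=k, scale=1)."""
    # E[r_k**m] = C**(-m/d) * Gamma(k + m/d) / Gamma(k); log-gamma avoids overflow.
    m1 = math.exp(math.lgamma(k + 1.0 / d) - math.lgamma(k)) * C ** (-1.0 / d)
    m2 = math.exp(math.lgamma(k + 2.0 / d) - math.lgamma(k)) * C ** (-2.0 / d)
    std = math.sqrt(max(m2 - m1 ** 2, 0.0))
    return m1, std

for d in (1.5, 2.0, 8.0):
    for k in (5, 40):
        mean, std = rk_moments(k, d)
        print(f"d={d:>4}  k={k:>2}  mean={mean:.4f}  std={std:.4f}  rel. std={std / mean:.4f}")
```

Under this assumption, the standard deviation grows with k for d = 1.5, is nearly flat for d = 2, and decreases for d = 8, while the relative standard deviation decreases with k in all three cases.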
d. Normalization and convergence to the standard normal distribution
e. Distances in observation space
The limit
However, if we keep the hypothesis that μ is ergodic and z is a nonperiodic point, we can conduct the same analysis as in appendix B but replacing μ by
Therefore, the statistics of analog-to-target distances in observation space also follow Eq. (5), this time with a dimension that depends not only on the dynamical system, but also on properties of the observable.
3. Consequences for applications of analogs
a. Comparison with previous studies
Similar results can be found from Eq. (5). Indeed, one has
Nicolis (1998) extended the work of Van Den Dool (1994). Interpreting Eq. (10) in terms of mean return times and using the formula from Kac (1959), she found an expression of mean return times using the identity
In the present paper, the point of view switches from statistics of return times to statistics of analog-to-target distance, and is extended to the K closest analogs rather than just the first one. The full probability distribution of Eq. (5) gives a detailed view of the variability of the process of searching for analogs.
Note that our work has many connections to that of Houle (2017), who also studied probability distributions of distance functions. However, we are not aware of any published work giving probability distributions of analog distances such as in Eq. (5).
b. Searching for analogs: Consequences
The full probability distribution of Eq. (5) has many consequences for the practical search of analogs.
For very low-dimensional systems (D1 < 2), the first analog-to-target distance has a lower variability than the next ones, so that a given value of r1 will be more representative of the next values of r1 than a given value of r10 would be of the next values of r10. The inverse phenomenon happens for higher dimensional systems (D1 > 2). This can be taken into account to evaluate the expected performances of analog methods.
Also, the scaling rk ~ k^{1/d} implies that the growth with k of the mean analog-to-target distance is much faster for low-dimensional systems (
For instance, Lguensat et al. (2017) use analogs to produce forecasts of several well-known dynamical systems, setting K = 40, while the use of Gaussian kernels with a variable bandwidth equal to λz = median_k(rk) allows us to give a very low weight to analogs at distance rk > λz. One might think that the filtering out of analogs with rk > λz makes the forecast procedure relatively insensitive to the choice of K. However, assuming that λz ≈ ⟨r[K/2]⟩, where [K/2] is the integer part of K/2, we have that λz grows with K as λz ~ K^{1/d}. Thus, for low-dimensional systems such as the one of L63 for which D1 ≈ 2.06, our results suggest that in the case of a low sampling density, high values of K might have detrimental effects on the efficiency of analog methods. This claim is tested in section 4b.
However, note that here we focus on analog-to-target distances assuming that they are an important driving factor of the efficiency of analog methods, but in practice many other parameters come into play, such as the choice of the proper metric, or the choice of the feature space. The tuning of analog methods does not reduce to the objective of minimizing analog-to-target distances. Nevertheless, our results can be used, with caution, to indicate tendencies and general behaviors of analog methods.
In particular, the scaling ⟨rk⟩ ~ (k/L)^{1/d} can be used in the context of dimension reduction. Assume that one wants to perform a statistical task that necessitates K analogs (for instance, an ensemble forecast). Then assume that one wants to reduce the dimension in order to have ⟨rK⟩ < ε. From the scaling ⟨rk⟩ ~ (k/L)^{1/d}, we find that the dimension must be reduced to dmax,K = {1 − [log(K)]/[log(L)]}dmax,1 or lower. Detailed arguments and a practical example are given in section 4e. Thus, for instance, if the criterion ⟨r1⟩ < ε is met for dmax,1 = 10 and if L = 10^4, then the criterion ⟨r25⟩ < ε will be met only for dmax,25 ≈ 6. This shows that any dimension reduction performed with the objective of decreasing analog distances strongly depends on how many analogs are required.
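The step from the scaling to the expression of dmax,K can be written out as follows. This is a sketch under two assumptions that we add for illustration: the proportionality constant in the scaling is taken equal to 1, and ε < 1 and K < L so that both logarithms are negative.

```latex
\begin{align*}
\langle r_K \rangle \simeq \left(\frac{K}{L}\right)^{1/d} < \varepsilon
  &\iff \frac{\log K - \log L}{d} < \log \varepsilon \\
  &\iff d < \frac{\log L - \log K}{-\log \varepsilon} =: d_{\max,K}, \\[4pt]
\frac{d_{\max,K}}{d_{\max,1}}
  &= \frac{\log L - \log K}{\log L}
   = 1 - \frac{\log K}{\log L}, \\[4pt]
L = 10^{4},\ K = 25,\ d_{\max,1} = 10
  &\;\Longrightarrow\;
  d_{\max,25} = \left(1 - \frac{\log 25}{\log 10^{4}}\right) \times 10 \approx 6.5 .
\end{align*}
```

Rounding 6.5 down to a whole number of dimensions gives the value of about 6 quoted in the example.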
Finally, the joint distribution of analog-to-target distances from appendix C theoretically allows us to express the probability distributions of any random variable of the form
4. Numerical experiments
a. Three-variable Lorenz system
Note that similar issues are raised by Faranda et al. (2011) regarding the continuity of μz,r with respect to r and its limiting behavior for small r, which motivates Lucarini et al. (2014) to postulate that μz,r is the product of
The fact that ρ(z) varies with z (and is thus not exactly a change of unit) can be explained by the possibility for two points z1 and z2 to have the same local dimension
Equations (11) and (14) are tested in numerical experiments using the system of L63, with results reported in Fig. 4. Analogs of a fixed target point z are sought in 3 × 600 independent catalogs, with three different catalog sizes. Each catalog is built from a random draw without replacement of L points inside a (common) trajectory of 10^9 points, generated using a Runge–Kutta numerical scheme with a time step of 0.01 in usual nondimensional notations. The dimension is calculated using K = 150 points, where this number is justified by a bias-variance trade-off: using this number and testing the procedure on 100 points picked from the measure μ, one finds a mean dimension D1 from Eq. (3) between 2.03 and 2.04, which is coherent with values reported by Caby et al. (2019), and a standard deviation of ~0.26. Using a lower value of K results in a higher variance, and using higher values results in biases that are dependent on the value of L used in this study. For more details on the distribution of local dimensions in the system of L63 the reader is referred to Faranda et al. (2017).
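A scaled-down sketch of this kind of experiment is given below. It is not the code used in the study. It assumes the classical fourth-order Runge–Kutta scheme (the text only specifies "a Runge–Kutta scheme"), the standard parameter values σ = 10, ρ = 28, β = 8/3 (the text defers the definition to appendix D), a trajectory of 3 × 10^4 points instead of 10^9, and illustrative names and values for `L`, `K`, and the spin-up length.

```python
import numpy as np

def l63_rhs(s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the three-variable Lorenz (1963) system (standard parameters assumed)."""
    x, y, z = s
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(s, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = l63_rhs(s)
    k2 = l63_rhs(s + 0.5 * dt * k1)
    k3 = l63_rhs(s + 0.5 * dt * k2)
    k4 = l63_rhs(s + dt * k3)
    return s + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def integrate(s0, n_steps, dt=0.01, spinup=1000):
    """Trajectory of n_steps points, after discarding a spin-up transient."""
    s = np.asarray(s0, dtype=float)
    for _ in range(spinup):
        s = rk4_step(s, dt)
    out = np.empty((n_steps, 3))
    for i in range(n_steps):
        out[i] = s
        s = rk4_step(s, dt)
    return out

rng = np.random.default_rng(0)
trajectory = integrate([1.0, 1.0, 1.0], n_steps=30_000)

# Target: last point of the trajectory; its recent past is kept out of the pool.
target = trajectory[-1]
pool = trajectory[:-500]

# One catalog: L points drawn without replacement from the common trajectory.
L, K = 5_000, 20
catalog = pool[rng.choice(len(pool), size=L, replace=False)]

# Distances from the target to its K closest analogs, in ascending order.
dist = np.linalg.norm(catalog - target, axis=1)
r = np.sort(np.partition(dist, K)[:K])
print(r[:5])
```

Repeating the catalog draw many times for the same target gives empirical distributions of rk that can be compared across catalog sizes.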
The consistency of empirical densities of ρ across varying values of L validates the scaling of C with L and d. Empirical probability densities of rescaled analog-to-target distances, also consistent across varying catalog sizes, are coherent with the theoretical probability densities from Eq. (5). The values of the rescaling parameter ρ are not surprising, as typical values of distances between points in the attractor are ~16 and maximum distances are ~28. Note that Nicolis (1998) uses a rescaling in studying analog return times with Lorenz’s three-variable system, dividing all distances by the maximum distance between two points on the attractor. The fact that ρ(z) exhibits seemingly large values is only the result of the choice of variables in the system of L63. For instance, it is possible to make a change of variables that would result in a system having the same chaotic properties, the same dimension, defined by almost the same dynamical equations, but with variables spanning smaller ranges, which would give numerical values of ρ(z) close to 1 (see appendix D).
Repeating this experiment for different target points z gives similar results. Values of ρ are on the same order of magnitude as the ones reported in Fig. 4. The consistency across varying values of L is almost always recovered, except for some points that have slightly higher dimensions
Finally, we have conducted the same experiments but using observations of the first coordinate of the Lorenz system. The results are shown in Fig. 5. Again, the numerical data fit the theory, with an observed dimension close to 1 as expected. These last numerical experiments confirm the fact that our theory can be applied to observables of dynamical systems.
b. N-variable system of Lorenz (1996)
In sections 2c and 3b, we state that for a fixed catalog local density L^{1/d}, the sensitivity of analog-to-target distances to k is stronger in low dimension. We also make the link between this sensitivity and the choice of K to be made for the efficiency of analog methods. Here we propose a simple illustration with analog forecasting on the system of Lorenz (1996) (see appendix D for a description of the system).
We use a time step of 0.05 (nondimensional units) to generate catalogs. We perform one forecast experiment with N = 12 variables and another with N = 20 variables. Dimensions D1 were estimated from Eq. (3) on an independent trajectory of 10^5 points, and with full, perfect observation catalogs of size 10^5 for each value of N (these were not the catalogs used to perform forecasts). This gives values of D1 ≈ 8 when N = 12 and D1 ≈ 12 when N = 20.
The analog forecast was simply done with a weighted mean of the successors of the K closest analogs, and weights defined by Gaussian kernels
Figure 6 shows medians of analog forecast errors from this numerical experiment as a function of forecast horizon. First, it can be seen that the errors are very similar in magnitude, confirming that analog forecast errors strongly depend on analog-to-target distances (Platzer et al. 2021), which are largely determined by catalog density as we have seen. These errors are between 15% and 40% of the RMSD, which is the mean error of a climatological forecast that estimates the future state as a constant equal to the average over all states in the catalog. Therefore, the analog forecast errors from Fig. 6 appear to be relatively high, which was expected since the catalog density is quite low.
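The forecast procedure used in this experiment (a weighted mean of the successors of the K closest analogs, with Gaussian kernel weights) can be sketched as follows. The exact kernel expression is not reproduced here, so the sketch assumes weights proportional to exp[−(rk/λz)²] with the bandwidth λz taken as the median of the K distances, as in the discussion of section 3b. The toy system (a rotation of the plane) and all names are illustrative, and are not the Lorenz (1996) setup of the study.

```python
import numpy as np

def analog_forecast(catalog, successors, z, K):
    """Weighted mean of the successors of the K closest analogs of z.

    Assumed kernel: w_k proportional to exp(-(r_k / lam)**2), lam = median of the r_k.
    """
    dist = np.linalg.norm(catalog - z, axis=1)
    idx = np.argsort(dist)[:K]      # indices of the K closest analogs
    r = dist[idx]
    lam = np.median(r)              # variable bandwidth (assumed nonzero)
    w = np.exp(-(r / lam) ** 2)
    w /= w.sum()
    return w @ successors[idx]

# Toy catalog: points on the unit circle, each rotated by a fixed angle per step.
alpha = (np.sqrt(5.0) - 1.0) / 2.0
step = 2.0 * np.pi * alpha
theta = (step * np.arange(5001)) % (2.0 * np.pi)
states = np.column_stack([np.cos(theta), np.sin(theta)])
catalog, successors = states[:-1], states[1:]

z = np.array([np.cos(1.0), np.sin(1.0)])                     # target state
truth = np.array([np.cos(1.0 + step), np.sin(1.0 + step)])   # its true successor
forecast = analog_forecast(catalog, successors, z, K=10)
error = np.linalg.norm(forecast - truth)
print(error)
```

Because the bandwidth follows the median distance, increasing K widens the kernel, which is the mechanism by which the choice of K can affect forecasts in low dimension.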
In higher dimension D1 ≈ 12 and for small forecast horizon (≤0.15), using five analogs results in the highest forecast error, because for this system averaging through a large number of analogs helps the forecast and reduces observational noise (Platzer et al. 2021). Then, still for small forecast horizon (≤0.15) and attractor dimension D1 ≈ 12, using 15, 25, 50, or 75 analogs does not make a significant difference. This is consistent with the fact that analog-to-target distances grow slowly with k in high dimension. Now, for the same system, the same catalog density
This example illustrates the higher sensitivity of analog methods to the choice of K in low dimension, at fixed catalog density
c. AROME reanalysis data: Dimensionality
To further appreciate the applicability of our results to high-dimensional, real geophysical systems, the theoretical developments from section 2 are tested on five years (2015–19) of hourly 10-m wind output from the physical model AROME (Ducrocq et al. 2005) coupled with satellite, radar, and in situ observations through a variational data assimilation scheme (similar to the one of Fischer et al. 2005). The spatial domain is an evenly spaced grid above Brittany, with latitudes ranging from 47.075° to 49.3° and longitudes from −5.7° to −2.575°, and a spacing of 0.025°. To focus on wind at sea, land points are removed from the data resulting in a domain of 8190 grid points.
Note that this dataset does not consist of state vectors, but of partial observations (10-m wind, over a finite-width, evenly spaced grid) of the state of the atmosphere. Projections of the state z would classically be denoted y = f(z). However, we keep the notations z, rk, d, D1, when referring to quantities computed directly from the 10-m wind data. As stated in section 2e, our analytical derivations are still valid for observational data, except that the dimension d obtained when searching analogs of observables can be different from the dimension obtained when searching analogs of the system state.
From these data, one can compute local dimensions with the method of Caby et al. (2019). As the data are limited (~3 × 10^4 time points), K is set to 40. Note also that, as elements of the catalog are only one hour away from each other, they cannot be assumed to be independent. Therefore, if several analogs are neighbors in time, only one analog is retained, and it is selected randomly in the set of time-neighboring analogs. Also, analogs that are less than one-and-a-half days away from the target z are discarded. Usually, analogs are searched for in a time window of fixed length around the calendar date of the target z. However, in this example, searching for analogs with or without calendar-date restriction gave similar results for estimates of dimension and analog-to-target distances, indicating that the closest analogs naturally lay in seasons similar to those of their targets z.
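The selection rules described above (discarding analogs close in time to the target, and keeping a single randomly chosen analog among time neighbors) can be sketched as follows. This is not the code used in the study: the criterion for two analogs to be "neighbors in time" is not specified in the text, so the sketch assumes a hypothetical threshold `min_gap` (in hours), and a hypothetical candidate pool of the `pool` closest points. Time indices are assumed to be hourly.

```python
import numpy as np

def decorrelated_analogs(dist, t_target, K, exclude=36, min_gap=24, pool=None, rng=None):
    """Indices of up to K analogs, at most one per group of time-neighboring candidates.

    dist     : distances from the target to every catalog element (hourly time index)
    exclude  : candidates less than this many hours from the target are discarded
    min_gap  : candidates separated by at most this many hours count as neighbors (assumed)
    pool     : number of closest candidates examined (assumed, defaults to 5 * K)
    """
    rng = np.random.default_rng() if rng is None else rng
    times = np.arange(dist.size)
    cand = times[np.abs(times - t_target) >= exclude]
    pool = 5 * K if pool is None else pool
    cand = cand[np.argsort(dist[cand])[:pool]]    # closest candidates
    cand = np.sort(cand)                          # order them in time
    # Split wherever two consecutive candidates are more than min_gap hours apart.
    breaks = np.where(np.diff(cand) > min_gap)[0] + 1
    groups = np.split(cand, breaks)
    reps = np.array([rng.choice(g) for g in groups])  # one random member per group
    reps = reps[np.argsort(dist[reps])]               # closest representatives first
    return reps[:K]

# Synthetic distances standing in for real analog-to-target distances.
rng = np.random.default_rng(0)
dist = rng.random(5000)
analogs = decorrelated_analogs(dist, t_target=2500, K=10, rng=rng)
print(analogs)
```

With real, temporally correlated data, the closest candidates tend to cluster in time, so the number of groups can be much smaller than the pool size and `pool` may need to be increased.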
Histograms of local dimensions
Faranda et al. (2017) found a seasonality in the local dimension of SLP fields, with higher dimensions and a higher variability in winter. In our case, no seasonal trend for the mean or median dimension is observed, but the weekly variability of local dimensions is higher in winter, as witnessed in Fig. 7b. Also, a diurnal cycle can be seen in Fig. 7c, with dimension increasing in daytime and decreasing in nighttime. As diurnal variability is mixed with other sources of variability, it cannot always be identified by eye (see the first three days of Fig. 7c). Histograms of dimension restricted to daytime are similar to histograms restricted to nighttime, so that the diurnal cycle does not appear to be the main driver of dimension variability.
We repeated the experiments leading to the histograms of Fig. 7a, but using different metrics (the Manhattan distance, order-8 Minkowski metric, and Chebyshev distance). This did not result in significant change, except that the dimension estimates were slightly larger when using the order-8 Minkowski and Chebyshev metrics (not shown). This further supports the robustness of our results to a change of metric.
d. AROME reanalysis data: Analog distances
An example of target state and analogs is shown in Fig. 8. The chosen target state is a classical winter situation in Brittany, with strong eastward wind coming from the sea. Thus, good analogs are found in the catalog. It is hard to discriminate which analog is closest: for such a high-dimensional system, the first analog-to-target distances are very similar.
Note that for this moderately high-dimensional system, the concentration of norms might make the search for analogs meaningless as pointed out by Beyer et al. (1999). For very high-dimensional systems, the ratio between the distance to the nearest analog, r1, and the distance to the furthest point in the catalog, rL, is close to one, making the search for analogs irrelevant. Moreover, Hinneburg et al. (2000) showed that for order-p Minkowski metrics the difference between the distance to the furthest point and to the nearest neighbor scales as d^{(1/p) − (1/2)}, indicating that for different types of distances the concentration of norms might behave differently. To ensure that this concentration of norms was not an issue, we computed r1/rL for every point in the catalog (again, omitting neighbors in time to compute r1), and for Minkowski metrics of order 1 (also called Manhattan distance), 2 (also called Euclidean distance), 8, and infinity (also called Chebyshev distance or infinity norm). This allowed us to compute histograms of r1/rL (not shown), which showed a very low probability for r1/rL to exceed 0.3 whatever the distance used. This shows that the curse of dimensionality is not a severe issue for our example of 10-m wind reanalysis, and that looking for analogs is still meaningful.
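The diagnostic r1/rL can be computed along the following lines. The sketch uses synthetic independent Gaussian data as a placeholder for the wind fields, so it does not omit neighbors in time as is done in the study, and the numerical values it prints say nothing about the AROME data. The function name and data sizes are illustrative.

```python
import numpy as np

def nearest_to_farthest_ratio(data, p):
    """Ratio r_1 / r_L for every point, using the order-p Minkowski metric."""
    ratios = np.empty(len(data))
    for i, x in enumerate(data):
        dist = np.linalg.norm(data - x, ord=p, axis=1)
        dist = np.delete(dist, i)          # remove the zero distance to itself
        ratios[i] = dist.min() / dist.max()
    return ratios

rng = np.random.default_rng(0)
data = rng.standard_normal((300, 20))      # placeholder: 300 points in dimension 20

results = {}
for p in (1, 2, 8, np.inf):                # Manhattan, Euclidean, order 8, Chebyshev
    results[p] = nearest_to_farthest_ratio(data, p)
    print(f"p={p}: median r1/rL = {np.median(results[p]):.3f}")
```

Values of the ratio close to one for most points would indicate that nearest and farthest neighbors are nearly indistinguishable, in which case the analog search loses its meaning.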
To obtain these distributions, analogs of each hourly
Figure 9 shows a relatively good agreement between theoretical and empirical distributions, especially for the Lorenz data. Indeed, the curves of Figs. 9b and 9d are similar in shape, especially the asymmetry for k = 1. As k grows, the variance of the empirical data (Fig. 9b) becomes smaller than expected in theory (Fig. 9d). This can be explained by the fact that the assumption L → +∞ (or equivalently rk→0) is better satisfied for low values of k. High values of rk are associated with a low variability. This also explains the lower variance of the empirical curves (Fig. 9a) compared to the theoretical curves (Fig. 9c), using the wind data. Again, the asymmetry in the shape of the curves for k = 1 is respected, and the estimation of the mean fits our theory.
This experiment shows that the present theory, which was derived assuming a large catalog density, is also partially applicable to limited catalog densities (here
e. AROME reanalysis data: Objective-based dimension reduction
We reduce dimension using EOFs, which allows us to reduce
We use the notation
According to the theoretical study of Caby et al. (2020), we expect
This last expression shows how Dmax,k strongly depends on k. As a practical example, assume that Dmax,1 ≈ 10 and that L = 10^4, then Dmax,25 ≈ 6. In this experiment, we assume that the number of required analogs is fixed. Reducing dimension in order to decrease analog distances thus strongly depends on how many analogs are needed for the analog method. For instance, if an ensemble of analogs is used to estimate the full probability density function of a one-dimensional variable (say, the day after tomorrow’s accumulated rainfall over the city of Paris), then one might need at least 100 analogs. Yet 10 analogs might be enough to simply estimate the mean of the distribution. As another example, if one wants to estimate the covariance associated with the forecast error of 5 independent variables, one needs at the very least 5 analogs, but 50 analogs might be necessary, especially in the presence of observational noise. Also, the complexity of the system under study might vary according to phase space location, so that the number of required analogs could depend on the state z. In practice, the number of required analogs is a complex function of the quantity to be estimated, the quality of the data, the method that is used, and properties of the system at stake.
Figure 10 shows comparison of this scaling with numerical experiments performed on the AROME reanalysis data. Upper and lower bounds for Dmax,k were derived from estimations of
The way to use these equations for Dmax,k in practice depends on the particular application. For instance, if one wants to perform a statistical task such as downscaling, one might impose a fixed minimum number of K samples to correctly represent a statistical distribution. One might, at the same time, ask that the analog-to-target distance does not exceed a given threshold to ensure a good quality of analogs (assuming that this “quality” is correctly estimated by the chosen distance). Then our formulas can be used to estimate how much dimension reduction is needed to fulfil these criteria by choosing a number of EOFs close to the theoretical value of Dmax,k.
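One possible way to chain these two steps is sketched below: the scaling of Dmax,k with k and L gives a target dimension, which is then used as the number of EOFs retained. This is a minimal illustration, not the procedure of the study: the EOFs are computed by a plain singular value decomposition of the anomalies, the data are random placeholders, rounding Dmax,k down to an integer is our choice, and the names `max_dimension` and `eof_project` are hypothetical.

```python
import math
import numpy as np

def max_dimension(d_max_1, K, L):
    """D_max,K = (1 - log K / log L) * D_max,1, from the scaling <r_k> ~ (k/L)**(1/d)."""
    return (1.0 - math.log(K) / math.log(L)) * d_max_1

def eof_project(data, n_eofs):
    """Project time-by-space data onto its n_eofs leading EOFs (SVD of the anomalies)."""
    anomalies = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(anomalies, full_matrices=False)
    return anomalies @ vt[:n_eofs].T          # principal components, shape (time, n_eofs)

# Values of the practical example in the text: D_max,1 = 10, L = 10^4, K = 25.
d_max_25 = max_dimension(10.0, 25, 10**4)
n_eofs = math.floor(d_max_25)                 # rounding down is a choice of this sketch

rng = np.random.default_rng(0)
fields = rng.standard_normal((200, 50))       # placeholder for (time, grid point) data
pcs = eof_project(fields, n_eofs)
print(d_max_25, n_eofs, pcs.shape)
```

Analogs would then be searched in the space of the retained principal components rather than in the full grid-point space.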
Another possibility is that the required number of samples K varies with D1. This is the case in ensemble forecast where one wants to use successors to estimate the covariance matrix of the future state. If the local dimension is d, we can assume that the data have been projected on some ⌈d⌉-dimensional space, where ⌈d⌉ is the ceiling function of d (i.e., the smallest integer i such that i ≥ d). In this case, the covariance matrix of the future state is of size
However, note that our formulas do not reveal how much information is left behind when reducing the dimension. For instance, in the case of forecast, the maximum dimension Dmax,K might be too low to represent accurately the dynamics of the system. In such a case, one must either raise the value of L (which can rarely be done) or increase the value of ε (which might decrease the efficiency of the analog method).
5. Conclusions
We combined extreme value theory and dynamical systems theory to derive analytical joint probability distributions of analog-to-target distances in the limit of large catalog density. Those distributions shed new light on the influence of dimension in the practical use of analogs. In particular, we found that analog-to-target distances are more sensitive to the number of analogs used in low dimension than in high dimension, at fixed catalog density. Contrary to previous works on the probability of finding good analogs, this study focuses on distances rather than return times, gives whole probability distributions rather than first moments, and looks at the K closest analogs rather than only the closest one. Numerical simulations of the three-variable Lorenz system confirm the theoretical findings. An example of a practical consequence of our theory on the sensitivity of analog forecasts to the number of analogs used, depending on dimension, is given using the system of Lorenz (1996). The 10-m wind reanalysis data from the AROME physical model show that our analysis is also relevant for observations of real systems. Our investigation indicates that the studied wind fields lie in an attractor of moderately high dimension ~16. In this situation of moderate dimensionality, the analog-to-target distances of the first analogs are all very similar and have a low variability. Our theoretical derivations can be used to find optimal dimension reduction for the purpose of decreasing analog distances, which we demonstrate on an example using the AROME reanalysis data. These examples reveal the applicability of the derived probability distributions even to relatively low catalog densities.
Acknowledgments
The work was financially supported by ERC Grant 338965-A2C2 and ANR Grant 10-IEED-0006-26 (CARAVELE project). This piece of work took its origins in discussions with Théophile Caby, to whom we express our gratitude. In particular, appendix A is an adaptation of a derivation by Théophile Caby. The theoretical derivations of the probability density functions shown in this paper are the result of several exchanges with Benoît Saussol, whom we must thank here. We are indebted to Fabrice Collard, Bertrand Chapron, and Caio Stringari, for fruitful insights and discussions about the exploration and interpretation of the AROME reanalysis data. Finally, the last version of this manuscript owes much to the meticulous work of three anonymous reviewers whom we thank again here.
APPENDIX A
Proof that D1 is Independent of Metric Choice in Finite Dimension
A metric
This means that for small values of r,
APPENDIX B
Direct Proof for pk(r)
In this appendix, we give the proof of Eq. (5) by evaluating directly the probability that analogs lie between the sphere of radius r and the sphere of radius r + δr.
a. Poisson distribution of the number of analogs in a ball
b. Distribution of analogs close to the sphere
Now we will use μ to evaluate
This last equation is a more general form of our main result, which is given in the next section. Here, the probability is expressed in terms of the invariant measure, which is usually not known analytically. The next section expresses the same probability in terms of the analog-to-target distance r.
c. Distribution of analog-to-target distances
APPENDIX C
Alternative Proof for pk(r) and Joint Probability Distribution Using K Largest-Order Statistics
Lucarini et al. (2016) give a detailed analysis of the map from
APPENDIX D
Three-Variable Lorenz System
REFERENCES
Alexander, R., Z. Zhao, E. Székely, and D. Giannakis, 2017: Kernel analog forecasting of tropical intraseasonal oscillations. J. Atmos. Sci., 74, 1321–1342, https://doi.org/10.1175/JAS-D-16-0147.1.
Ayet, A., and P. Tandeo, 2018: Nowcasting solar irradiance using an analog method and geostationary satellite images. Sol. Energy, 164, 301–315, https://doi.org/10.1016/j.solener.2018.02.068.
Beyer, K., J. Goldstein, R. Ramakrishnan, and U. Shaft, 1999: When is “nearest neighbor” meaningful? Int. Conf. on Database Theory, Jerusalem, Israel, ICDT, 217–235.
Birkhoff, G. D., 1931: Proof of the ergodic theorem. Proc. Natl. Acad. Sci. USA, 17, 656–660, https://doi.org/10.1073/pnas.17.2.656.
Caby, T., D. Faranda, G. Mantica, S. Vaienti, and P. Yiou, 2019: Generalized dimensions, large deviations and the distribution of rare events. Physica D, 400, 132143, https://doi.org/10.1016/j.physd.2019.06.009.
Caby, T., D. Faranda, S. Vaienti, and P. Yiou, 2020: Extreme value distributions of observation recurrences. Nonlinearity, 34, 118–163, https://doi.org/10.1088/1361-6544/abaff1.
Cattiaux, J., R. Vautard, C. Cassou, P. Yiou, V. Masson-Delmotte, and F. Codron, 2010: Winter 2010 in Europe: A cold extreme in a warming climate. Geophys. Res. Lett., 37, L20704, https://doi.org/10.1029/2010GL044613.
Coles, S., J. Bawa, L. Trenner, and P. Dorazio, 2001: An Introduction to Statistical Modeling of Extreme Values. Vol. 208. Springer, 208 pp.
Daley, D. J., and D. Vere-Jones, 2003: Elementary Theory and Methods. Vol. I, An Introduction to the Theory of Point Processes, Springer, 471 pp.
Ducrocq, V., F. Bouttier, S. Malardel, T. Montmerle, and Y. Seity, 2005: Le projet AROME. Houille Blanche, 91, 39–43, https://doi.org/10.1051/lhb:200502004.
Faranda, D., V. Lucarini, G. Turchetti, and S. Vaienti, 2011: Extreme value distribution for singular measures. arXiv, https://arxiv.org/abs/1106.2299.
Faranda, D., G. Messori, and P. Yiou, 2017: Dynamical proxies of North Atlantic predictability and extremes. Sci. Rep., 7, 41278, https://doi.org/10.1038/srep41278.
Farmer, J. D., and J. J. Sidorowich, 1988: Exploiting chaos to predict the future and reduce noise. Evolution, Learning and Cognition, Y. C. Lee, Ed., World Scientific, 277–330.
Fettweis, X., E. Hanna, C. Lang, A. Belleflamme, M. Erpicum, and H. Gallée, 2013: Important role of the mid-tropospheric atmospheric circulation in the recent surface melt increase over the Greenland ice sheet. Cryosphere, 7, 241–248, https://doi.org/10.5194/tc-7-241-2013.
Fischer, C., T. Montmerle, L. Berre, L. Auger, and S. E. Ştefănescu, 2005: An overview of the variational assimilation in the ALADIN/France numerical weather-prediction system. Quart. J. Roy. Meteor. Soc., 131, 3477–3492, https://doi.org/10.1256/qj.05.115.
Gupta, A., R. Krauthgamer, and J. R. Lee, 2003: Bounded geometries, fractals, and low-distortion embeddings. 44th Annual IEEE Symp. on Foundations of Computer Science, Cambridge, MA, IEEE, 534–543, https://doi.org/10.1109/SFCS.2003.1238226.
Hamill, T. M., M. Scheuerer, and G. T. Bates, 2015: Analog probabilistic precipitation forecasts using GEFS reforecasts and climatology-calibrated precipitation analyses. Mon. Wea. Rev., 143, 3300–3309, https://doi.org/10.1175/MWR-D-15-0004.1.
Hamilton, F., T. Berry, and T. Sauer, 2016: Ensemble Kalman filtering without a model. Phys. Rev. X, 6, 011021, https://doi.org/10.1103/PhysRevX.6.011021.
Haydn, N., and S. Vaienti, 2019: Limiting entry times distribution for arbitrary null sets. arXiv, https://arxiv.org/abs/1904.08733.
Hinneburg, A., C. C. Aggarwal, and D. A. Keim, 2000: What is the nearest neighbor in high dimensional spaces? 26th Int. Conf. on Very Large Databases, Cairo, Egypt, VLDB, 506–515, https://www.vldb.org/dblp/db/conf/vldb/HinneburgAK00.html.
Houle, M. E., 2013: Dimensionality, discriminability, density and distance distributions. 2013 IEEE 13th Int. Conf. on Data Mining Workshops, Dallas, TX, IEEE, 468–473, https://doi.org/10.1109/ICDMW.2013.139.
Houle, M. E., 2017: Local intrinsic dimensionality I: An extreme-value-theoretic foundation for similarity applications. Int. Conf. on Similarity Search and Applications, Munich, Germany, SISAP, 64–79.
Jézéquel, A., P. Yiou, and S. Radanovics, 2018: Role of circulation in European heatwaves using flow analogues. Climate Dyn., 50, 1145–1159, https://doi.org/10.1007/s00382-017-3667-0.
Kac, M., 1959: Probability and Related Topics in Physical Sciences. Vol. 1. Interscience Publishers, 266 pp.
Karger, D. R., and M. Ruhl, 2002: Finding nearest neighbors in growth-restricted metrics. Proc. 34th Annual ACM Symp. on Theory of Computing, Montreal, QC, Canada, ACM, 741–750, https://doi.org/10.1145/509907.510013.
Lamperti, J., 1964: On extreme order statistics. Ann. Math. Stat., 35, 1726–1737, https://doi.org/10.1214/aoms/1177700395.
Lguensat, R., P. Tandeo, P. Ailliot, M. Pulido, and R. Fablet, 2017: The analog data assimilation. Mon. Wea. Rev., 145, 4093–4107, https://doi.org/10.1175/MWR-D-16-0441.1.
Lorenz, E. N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20, 130–141, https://doi.org/10.1175/1520-0469(1963)020<0130:DNF>2.0.CO;2.
Lorenz, E. N., 1969: Atmospheric predictability as revealed by naturally occurring analogues. J. Atmos. Sci., 26, 636–646, https://doi.org/10.1175/1520-0469(1969)26<636:APARBN>2.0.CO;2.
Lorenz, E. N., 1996: Predictability: A problem partly solved. Proc. Seminar on Predictability, Reading, United Kingdom, ECMWF, https://www.ecmwf.int/en/elibrary/10829-predictability-problem-partly-solved.
Lucarini, V., D. Faranda, J. Wouters, and T. Kuna, 2014: Towards a general theory of extremes for observables of chaotic dynamical systems. J. Stat. Phys., 154, 723–750, https://doi.org/10.1007/s10955-013-0914-6.
Lucarini, V., and Coauthors, 2016: Extremes and Recurrence in Dynamical Systems. John Wiley and Sons, 312 pp.
Milnor, J., 1985: On the concept of attractor. The Theory of Chaotic Attractors, B. R. Hunt et al., Eds., Springer, 243–264.
Nicolis, C., 1998: Atmospheric analogs and recurrence time statistics: Toward a dynamical formulation. J. Atmos. Sci., 55, 465–475, https://doi.org/10.1175/1520-0469(1998)055<0465:AAARTS>2.0.CO;2.
Platzer, P., P. Yiou, P. Naveau, P. Tandeo, Y. Zhen, P. Ailliot, and J.-F. Filipot, 2021: Using local dynamics to explain analog forecasting of chaotic systems. J. Atmos. Sci., 78, 2117–2133, https://doi.org/10.1175/JAS-D-20-0204.1.
Poincaré, H., 1890: Sur le problème des trois corps et les équations de la dynamique. Acta Math., 13, A3–A270, https://doi.org/10.1007/BF02392506.
Pons, F. M. E., G. Messori, M. C. Alvarez-Castro, and D. Faranda, 2020: Sampling hyperspheres via extreme value theory: Implications for measuring attractor dimensions. J. Stat. Phys., 179, 1698–1717, https://doi.org/10.1007/s10955-020-02573-5.
Robin, Y., P. Yiou, and P. Naveau, 2017: Detecting changes in forced climate attractors with Wasserstein distance. Nonlinear Processes Geophys., 24, 393–405, https://doi.org/10.5194/npg-24-393-2017.
Schenk, F., and E. Zorita, 2012: Reconstruction of high resolution atmospheric fields for northern Europe using analog-upscaling. Climate Past, 8, 1681–1703, https://doi.org/10.5194/cp-8-1681-2012.
Van Den Dool, H. M., 1994: Searching for analogues, how long must we wait? Tellus, 46A, 314–324, https://doi.org/10.3402/tellusa.v46i3.15481.
Verleysen, M., and D. François, 2005: The curse of dimensionality in data mining and time series prediction. Int. Work-Conf. on Artificial Neural Networks, Warsaw, Poland, ICANN, 758–770.
Wang, X., and S. S. Shen, 1999: Estimation of spatial degrees of freedom of a climate field. J. Climate, 12, 1280–1291, https://doi.org/10.1175/1520-0442(1999)012<1280:EOSDOF>2.0.CO;2.
Wetterhall, F., S. Halldin, and C.-Y. Xu, 2005: Statistical precipitation downscaling in central Sweden with the analogue method. J. Hydrol., 306, 174–190, https://doi.org/10.1016/j.jhydrol.2004.09.008.
Yiou, P., 2014: AnaWEGE: A weather generator based on analogues of atmospheric circulation. Geosci. Model Dev., 7, 531–543, https://doi.org/10.5194/gmd-7-531-2014.
Yiou, P., and C. Déandréis, 2019: Stochastic ensemble climate forecast with an analogue model. Geosci. Model Dev., 12, 723–734, https://doi.org/10.5194/gmd-12-723-2019.
Yiou, P., T. Salameh, P. Drobinski, L. Menut, R. Vautard, and M. Vrac, 2013: Ensemble reconstruction of the atmospheric column from surface pressure using analogues. Climate Dyn., 41, 1333–1344, https://doi.org/10.1007/s00382-012-1626-3.
Young, L.-S., 1982: Dimension, entropy and Lyapunov exponents. Ergodic Theory Dyn. Syst., 2, 109–124, https://doi.org/10.1017/S0143385700009615.