## Abstract

Both climate and statistical models play an essential role in the process of demonstrating that the distribution of some atmospheric variable has changed over time and in establishing the most likely causes for the detected change. One statistical difficulty in the research field of detection and attribution resides in defining events that can be easily compared and accurately inferred from reasonable sample sizes. As many impacts studies focus on extreme events, the inference of small probabilities and the computation of their associated uncertainties quickly become challenging. In the particular context of event attribution, the authors address the question of how to compare records between the counterfactual “world as it might have been” without anthropogenic forcings and the factual “world that is.” Records are often the most important events in terms of impact and get much media attention. The authors will show how to efficiently estimate the ratio of two small probabilities of records. The inferential gain is particularly substantial when a simple hypothesis-testing procedure is implemented. The theoretical justification of such a proposed scheme can be found in extreme value theory. To illustrate this study’s approach, classical indicators in event attribution studies, like the risk ratio or the fraction of attributable risk, are modified and tailored to handle records. The authors illustrate the advantages of their method through theoretical results, simulation studies, temperature records in Paris, and outputs from a numerical climate model.

## 1. Introduction

Civil engineers, flood planners, and other risk managers have to take into account the nonstationarity of our climate system when computing return levels^{1} to assess, for example, the quality of existing dams or heights of new dikes and other structures. It is natural to wonder how a 100-yr return level for a dike built 50 years ago, under the assumption of stationarity, could be different from today’s value. As emphasized by Rootzén and Katz (2013), a delicate issue is the interpretation of return periods and return levels within a nonstationary world. A 100-yr return level computed 50 years ago, today, or in 50 years may differ and become a relative quantity. Another challenge is to discriminate between a set of potential causes in order to explain detected changes (see, e.g., Zwiers et al. 2013; Hegerl and Zwiers 2011; IPCC 2013).

To deal with relative probability changes and causality questions, researchers in the event attribution community (see, e.g., Uhe et al. 2016) frequently work with the risk ratio RR = *p*_{1}/*p*_{0} of two probabilities of the same event, *p*_{0} and *p*_{1}, under two different situations: a “world that might have been” (a counterfactual world without anthropogenic forcings) and the “world that is” (a factual world). The archetypical example of a common event is that the random variable of interest, say annual maximum temperature, exceeds a high threshold. The so-called fraction of attributable risk FAR = 1 − 1/RR has also been used (see, e.g., Stott et al. 2004, 2016), and it can be viewed as the probability of necessary causation within the mathematical paradigm of causality theory (see Hannart et al. 2016).

Instead of focusing on return period changes, which is a frequent approach in event attribution studies, our goal in this paper is to determine how record event frequencies differ between the factual and counterfactual worlds. Many reasons can be invoked to motivate this change of viewpoint regarding the event of interest. Practically, records often have large economic and human impacts. In terms of media communications, the news anchor typically asks, whenever a temperature or rainfall record is broken, whether such a record is due to climate change. It is important to have the appropriate probability tool to precisely address this question. Conceptually, record events (see, e.g., King 2017; Meehl et al. 2009; Resnick 1987) are different from maxima and excesses above a high threshold. Reformulating extreme event attribution in terms of records rather than threshold excesses simplifies the statistical issues because only one probability has to be estimated instead of two. Under some additional assumptions based on extreme value theory (EVT; see, e.g., Embrechts et al. 1997; Coles 2001), we will show that the computation of FAR for records can be simplified dramatically and leads to some simple formulas when evaluated through Pearl’s causality theory (see Hannart et al. 2016). The results yield interesting conclusions when applied to two datasets of annual maxima of daily temperature maxima: observations recorded in Paris and outputs from the Météo-France CNRM-CM5 climate model. Still, our objective is not to replace the classical FAR and RR, but to offer a complementary instrument for climatologists, hydrologists, and risk managers. Although strongly connected, the interpretation of our new FAR and RR for records will differ from conventional approaches.

### a. Main assumptions and notations

The probabilistic framework in event attribution studies often starts with the simple and reasonable hypothesis that two independent samples are available, say **X** = (*X*_{1}, …, *X*_{n}) with a cumulative distribution function^{2} (CDF) of *G*, and **Z** = (*Z*_{1}, …, *Z*_{n}) with a CDF of *F*. The random vectors **X** and **Z** represent ensemble members from numerical models run under two different scenarios. In a causality investigation, the vector **X** is said to provide the relevant time series in the so-called counterfactual world, while its counterpart **Z** describes the same information but in the factual world.

As relative probability ratios of the type *p*_{1}/*p*_{0} or (*p*_{1} − *p*_{0})/*p*_{1} are often the main objects of interest in event attribution, we will always assume that such ratios are well defined and finite. Such a hypothesis is classical within the event attribution community and simplifies mathematical expressions and interpretations. Assuming a finite FAR implies that the right-hand end point of the factual CDF *F* is at least as large as that of the counterfactual CDF *G*.

### b. Records and return periods

Breaking a record simply means that the current observation exceeds all past measurements (see Resnick 1987). This type of event is often reported in the media and, in such instances, climatologists are often asked if the frequency of record breaking has increased. To define a record mathematically, suppose that a sample of annual^{3} data, say *Y*_{1}, …, *Y*_{r}, is given. The year *r* is a record year if

*Y*_{r} > max(*Y*_{1}, …, *Y*_{r−1}).
To relate this mathematical inequality with the media question about the likelihood of this event, one can compute the probability of observing such an event:

P[*Y*_{r} > max(*Y*_{1}, …, *Y*_{r−1})] = 1/*r*.    (1)
This equality holds if the random variables *Y*_{i} are independently and identically distributed (IID). Basically, Eq. (1) emphasizes two points. First, records are expected to occur less frequently as the period length *r* increases. Second, given a fixed time period *r*, the chance of the record occurring in a given year is uniform, 1/*r*. Equation (1) will be the backbone of our conceptual and inferential strategies. In particular, it does not depend on the marginal distribution of the *Y*, that is, on their common CDF.
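Equation (1) is easy to check by simulation. The short Python sketch below (the Monte Carlo size, period length, and normal distribution are illustrative choices, not taken from the paper) draws IID series and counts how often the last year is a record:

```python
import numpy as np

rng = np.random.default_rng(42)
n_sim, r = 100_000, 10  # illustrative Monte Carlo size and period length

# Draw IID series of length r; year r is a record year when the last
# value exceeds all previous ones, an event of probability 1/r.
y = rng.normal(size=(n_sim, r))
record = y[:, -1] > y[:, :-1].max(axis=1)

print(record.mean())  # close to 1/r = 0.1
```

Replacing `rng.normal` by any other continuous distribution leaves the frequency essentially unchanged, which illustrates the distribution-free nature of Eq. (1).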

### c. Interpreting records in an event attribution context

Given any return period *r*, different types of records can be defined by comparing data from our different worlds. For example, the event {*Z*_{r} > max(*X*_{1}, …, *X*_{r−1})} characterizes the fact that *Z*_{r} is a record, relative to the counterfactual world.

Equation (1) can be applied to the counterfactual sample^{4} **X**:

*p*_{0,r} = P[*X*_{r} > max(*X*_{1}, …, *X*_{r−1})] = 1/*r*.
To assess the probability that a factual observation like *Z*_{r} could be a record in the counterfactual world, it is convenient to introduce

*p*_{1,r} = P[*Z*_{r} > max(*X*_{1}, …, *X*_{r−1})].
Comparing *p*_{0,r} and *p*_{1,r} answers the question of whether the frequency of today’s records was different in the past. The mirrored option would be to compute the chance that a counterfactual realization like *X*_{r} becomes a record in the factual world, that is, P[*X*_{r} > max(*Z*_{1}, …, *Z*_{r−1})]. Although possible, we do not pursue this here because the “world that might have been” provides a more plausibly stationary frame of reference than the “world that is.”

To make the link with past event attribution studies, we use the notation *p*_{0,r} and *p*_{1,r}; they differ in their interpretation and inference properties from the classical survival functions *p*_{0}(*u*) = P(*X* > *u*) and *p*_{1}(*u*) = P(*Z* > *u*), where *u* is any number on the real line such that *p*_{1}(*u*) > 0.

Given the return period *r*, we have to infer the pair (*p*_{0,r}, *p*_{1,r}). By construction, *p*_{0,r} is always equal to 1/*r*, and consequently, there is only one quantity to estimate: *p*_{1,r}. The consequence of fixing *p*_{0,r} = 1/*r* is particularly relevant when dealing with intercomparison studies like CMIP5 or the upcoming CMIP6 (see, e.g., Kharin et al. 2013; Taylor et al. 2012). To compare the values of *p*_{1} from different climate models, it would be best to fix *p*_{0} at the same value for all climate models. Otherwise, the interpretation of the ratio *p*_{1}/*p*_{0} will only be relative, an important shortcoming for assessments like CMIP5. For example, the ratio for the French IPSL climate model is not directly comparable to that from, say, the American NCAR model because the two estimates of *p*_{0} differ due to model error. This intermodel discrepancy can be large. This issue disappears in our framework because *p*_{0,r} = 1/*r* for all *r* ≥ 2. Having the same reference frequency, it is now possible to compare different relative ratios, that is, to perform intermodel comparisons with respect to the same yardstick.

The remaining task is to estimate *p*_{1,r}. To implement it, the rest of the paper will be organized as follows. Section 2 presents our statistical model and its corresponding inferential strategy. Examples will be given in section 3. In particular, the analysis of yearly temperature maxima from a climate model will illustrate the applicability of our approach. Section 4 will conclude by highlighting potential improvements and further directions.

## 2. Statistical model for p_{1,r}

A simple approach for deriving explicit formulas for *p*_{1,r} is to make the connection^{5} with moments and moment generating functions (MGFs) by writing

*p*_{1,r} = E[*G*^{r−1}(*Z*)] = E[exp{−(*r* − 1)*W*}],    (2)

where the new random variable

*W* = −log*G*(*Z*)
takes its values in the interval [0, ∞) since *G*(*Z*) ∈ (0, 1]. Equation (2) will play a fundamental role because it will enable us to compute the probability that a factual observation like *Z*_{r} could be a record in the counterfactual world. In particular, we do not need to observe a record at year *r* to infer *p*_{1,r}. Instead, we simply need to compute the expectation E[exp{−(*r* − 1)*W*}]. Hence, *p*_{1,r} can also be viewed and interpreted as an MGF with respect to the random variable *W*.

In the remainder of this work, we will always assume that the random variable *W* follows an exponential distribution with mean *θ*. The MGF for an exponential is well known (see, e.g., Henze and Meintanis 2005) and gives a simple form for *p*_{1,r}:

*p*_{1,r} = 1/[1 + (*r* − 1)*θ*],  for any *r* = 2, 3, ….    (3)
A natural question is to wonder which cases obey this exponential constraint for *W*.

Table 1 provides three examples from extreme value theory, a well-developed probability theory for distributions of maxima and related upper-tail behaviors (see, e.g., Embrechts et al. 1997; Coles 2001). From this table, one can see that a shifted Gumbel random variable, that is, *Z* = *X* + *μ*, produces a random variable *W* = −log*G*(*Z*) that is exponentially distributed. The same result can be obtained from an inflated Fréchet, that is, *Z* = *σX* with *σ* > 0, or from a Weibull–EVT type,^{6} that is, *Z* = *μ* + *σX*. This hints that block maxima, a block being a season or a year, can produce factual and counterfactual records that comply with an exponentially distributed *W*. In practice, we do not have to fit a generalized extreme value (GEV) distribution to the factual and counterfactual data; we only need to assume that *W* is exponentially distributed.
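The shifted Gumbel example can be checked by simulation. In the sketch below (the shift value and sample size are arbitrary illustrations), the counterfactual *X* is standard Gumbel with CDF G(x) = exp[−exp(−x)], the factual *Z* is *X* shifted upward, and the resulting *W* = −log*G*(*Z*) is exponential with mean θ = exp(−shift); the exponential MGF identity E[exp{−*tW*}] = 1/(1 + θ*t*) then gives *p*_{1,r}:

```python
import numpy as np

rng = np.random.default_rng(0)
shift, n, r = 0.5, 200_000, 10  # illustrative values

# Counterfactual world: standard Gumbel, CDF G(x) = exp(-exp(-x)).
x = rng.gumbel(size=n)
z = x + shift                    # factual world: shifted Gumbel

w = np.exp(-z)                   # W = -log G(Z) = exp(-Z)
theta = np.exp(-shift)           # theoretical exponential mean of W

print(w.mean())                          # close to theta = 0.607
print(np.exp(-(r - 1) * w).mean())       # close to 1/(1 + (r-1)*theta) = 0.155
```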

### a. Testing the null hypothesis, H_{0}: W ~ exp(θ)

The random variable *W* = −log*G*(*Z*) can be approximated by its empirical version

*Ŵ*_{i} = −log*G̃*_{n}(*Z*_{i}),
where *i* = 1, …, *n*, and *G̃*_{n} represents a slightly modified^{7} version of the empirical CDF of the sample **X**. To assess the goodness of fit with respect to the null hypothesis *H*_{0}: *W* ~ exp(*θ*), with an unknown *θ*, against general alternatives, we refer to Henze and Meintanis (2005) and Ascher (1990). Both studies indicate that the test of Cox and Oakes (1984) performs well against a wide range of alternatives. Its statistic, a standardized functional of the scaled sample *Ŵ*_{i}/*W̄*_{n}, is asymptotically standard normal under *H*_{0} (see, e.g., Henze and Meintanis 2005).

### b. Inference of θ under H_{0}: W ~ exp(θ)

The constant *θ*, which plays an essential role, can be interpreted in different ways. Since *H*_{0} is equivalent to the relationship *G* = *F*^{θ}, the constant *θ* can be viewed as a ratio that measures a type of upper-tail fatness between *X* and *Z*. If *X* and *Z* have the same distribution, then *θ* equals one. If *θ* is smaller than one, then *Z* is more likely than *X* to take large values, and vice versa. The constant *θ* is also the limit of the ratio *p*_{0,r}/*p*_{1,r} as *r* goes to infinity.

Under *H*_{0}: *W* ~ exp(*θ*), we need to estimate the single parameter *θ*. As *p*_{1,2} = 1/(1 + *θ*) with *p*_{1,2} = P(*Z* > *X*) = E[*G*(*Z*)], we define

*θ̂* = 1/*p̂*_{1,2} − 1,  where *p̂*_{1,2} = (1/*n*) Σ_{i=1}^{n} *Ĝ*_{n}(*Z*_{i}),
where *Ĝ*_{n} corresponds to the empirical CDF associated with the sample **X**. The asymptotic properties^{8} of *θ̂* can be deduced from the asymptotic normality of *p̂*_{1,2}, the special case *r* = 2 of the general nonparametric result proved in the appendix. The delta method (see, e.g., Proposition 6.1.5 in Brockwell and Davis 1991) then transfers this asymptotic normality to *θ̂*, which leads to the (1 − *α*) × 100% asymptotic confidence interval for *θ*, whose width is governed by the standard normal quantile *z*_{1−α/2}.
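A minimal plug-in implementation of this estimation step is sketched below (the function name and Gumbel test case are our own illustration): *p̂*_{1,2} averages the empirical CDF of the counterfactual sample evaluated at the factual observations, and *θ̂* = 1/*p̂*_{1,2} − 1:

```python
import numpy as np

def theta_hat(x, z):
    """Plug-in estimate of theta from p_hat_{1,2} = (1/n) sum_i G_n(z_i),
    where G_n is the empirical CDF of the counterfactual sample x."""
    x_sorted = np.sort(np.asarray(x))
    g_n = np.searchsorted(x_sorted, z, side="right") / len(x_sorted)
    p12 = g_n.mean()               # estimates P(Z > X) = 1/(1 + theta)
    return 1.0 / p12 - 1.0

rng = np.random.default_rng(1)
x = rng.gumbel(size=5000)          # counterfactual draws
z = rng.gumbel(size=5000) + 0.5    # factual draws; true theta = exp(-0.5)
print(theta_hat(x, z))             # close to 0.61
```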

### c. Inference of far(r) and the risk ratio under H_{0}: W ~ exp(θ)

From the expression for *p*_{1,r} in Eq. (3), the FAR for records, defined by far(*r*) = 1 − *p*_{0,r}/*p*_{1,r}, can be expressed as

far(*r*) = (1 − *θ*)(1 − 1/*r*).    (6)
Table 2 provides far(*r*) and FAR(*u*) expressions for our three GEV examples defined in Table 1. Figure 1 compares far(*r*) with FAR(*x*_{r}) for the same three distributions, where *x*_{r} = *G*^{−1}(1 − 1/*r*) is the classical return level. The limits of far(*r*) and FAR(*x*_{r}) as *r* becomes large are equal (horizontal dashed lines; see also the “Limit” rows in Table 2). In other words, far(*r*) and FAR(*u*) are asymptotically equivalent for extremes.

Equation (6) naturally suggests the estimator

far̂(*r*) = (1 − *θ̂*)(1 − 1/*r*).
The delta method applied to the asymptotic normality of *θ̂* gives the asymptotic confidence interval for far(*r*). The relative error of far̂(*r*), defined as its standard deviation renormalized by far(*r*), does not depend on *r*. For *θ* ≠ 1, it is equal to

sd(*θ̂*)/|1 − *θ*|.
Also note that this ratio is infinite when *θ* = 1 (no difference between the factual and counterfactual worlds). Hence, it is counterproductive to compute far(*r*) in this special case.

The record risk ratio rr(*r*) = *p*_{1,r}/*p*_{0,r} under model (6) does not have this issue. For the estimator

rr̂(*r*) = *r*/[1 + (*r* − 1)*θ̂*],

the delta method gives the asymptotic normality of rr̂(*r*), and consequently, the relative error of the risk ratio satisfies

(*r* − 1) sd(*θ̂*)/[1 + (*r* − 1)*θ*] ≤ sd(*θ̂*)/*θ*,

and it is well defined for *θ* = 1.
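Under the exponential model for *W*, both indicators reduce to one-line functions of *θ*. The sketch below (parameter values are illustrative) uses *p*_{0,r} = 1/*r* and the exponential-MGF form of *p*_{1,r}:

```python
def p1_record(r, theta):
    # p_{1,r} = 1/(1 + (r-1)*theta) under W ~ exp(theta)
    return 1.0 / (1.0 + (r - 1.0) * theta)

def far_record(r, theta):
    # far(r) = 1 - p_{0,r}/p_{1,r} with p_{0,r} = 1/r
    return 1.0 - (1.0 / r) / p1_record(r, theta)

def rr_record(r, theta):
    # rr(r) = p_{1,r}/p_{0,r} = r/(1 + (r-1)*theta)
    return r * p1_record(r, theta)

theta = 0.5
print(far_record(100, theta))  # 0.495, close to the limit 1 - theta = 0.5
print(rr_record(100, theta))   # close to the limit 1/theta = 2
```

For large *r*, far(*r*) and rr(*r*) approach 1 − *θ* and 1/*θ*, the limits discussed above.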

## 3. Examples

### a. Annual temperature maxima in Paris

To illustrate how our approach can be applied to climatological time series, we study one of the longest records of annual maxima of daily temperature maxima in Paris (weather station of Montsouris). To represent the factual and counterfactual worlds, this Paris time series is divided into two parts: 1900–30 and 1990–2015. Although the influence of anthropogenic forcing may not be completely absent in the first part of the record, Fig. 2 still shows that the hypothesis of exponentially distributed *W*_{i} seems reasonable. The histogram and the q-q plot do not contradict the exponential hypothesis. The top-right panel compares a decreasing convex dotted line (our fitted exponential model) with the observational histogram that is also decreasing and convex. The lower-left panel shows the ranked observations versus the ones expected from our model. Being close to the diagonal (the solid line) indicates a good fit.

To complement these visual devices, our exponential fit is compared, via the Akaike information criterion (AIC) and Bayesian information criterion (BIC), to a Weibull fit, a natural but more complex alternative. The lower scores for the exponential pdf (AIC = 11.83 and BIC = 13.05 vs AIC = 13.2 and BIC = 15.7) confirm our preference for the simpler model. Concerning the test of Cox and Oakes (1984), the inferred *p* value is 0.51, indicating that there is no reason to reject the exponential model for our Parisian data. The upper-left panel of Fig. 2 displays far(*r*) with its associated 90% confidence interval. Even from a single location like Paris, one can deduce that far(*r*) is not equal to zero, but rather around 0.5.
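This kind of AIC/BIC model comparison takes only a few lines. The scipy-based sketch below is a generic illustration on simulated data (the simulated *W*_{i} merely stand in for the estimated ones; the Paris series itself is not reproduced here), fitting the exponential and Weibull densities by maximum likelihood:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
w = rng.exponential(scale=0.7, size=30)  # stand-in for the n = 30 estimated W_i

def aic_bic(logpdf_values, k, n):
    # AIC = 2k - 2 log-likelihood; BIC = k log(n) - 2 log-likelihood
    ll = logpdf_values.sum()
    return 2 * k - 2 * ll, k * np.log(n) - 2 * ll

# Exponential fit (one parameter) vs Weibull fit (two parameters, location fixed at 0).
loc_e, scale_e = stats.expon.fit(w, floc=0)
c_w, loc_w, scale_w = stats.weibull_min.fit(w, floc=0)

aic_e, bic_e = aic_bic(stats.expon.logpdf(w, loc_e, scale_e), 1, len(w))
aic_w, bic_w = aic_bic(stats.weibull_min.logpdf(w, c_w, loc_w, scale_w), 2, len(w))

# For exponential data the simpler model typically (though not always) wins.
print(aic_e, bic_e, aic_w, bic_w)
```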

### b. Analysis of yearly temperature maxima from CNRM-CM5

To illustrate our method with numerical climate model outputs, we study yearly maxima of daily maxima of near-surface air temperature generated by the Météo-France CNRM-CM5 climate model. We focus on a single climate simulation with “all forcings”^{9} for the period 1975–2005. This should mimic the current climatology, that is, the factual world, at least according to this model. During this 30-yr climatological period, we assume that the climate system is approximately stationary with respect to annual maxima. Concerning the “world that might have been” without anthropogenic forcings, we consider a second run from the same numerical model, but with only natural forcings. For this counterfactual world, we assume stationarity over the full length of the simulation, that is, for the period 1850–2012.

Our first inquiry is to see if our null hypothesis on *W* is rejected or not. The white areas in Fig. 3 correspond to the locations where the exponential hypothesis has been rejected by the Cox and Oakes test at the 5% level. For the other points, the estimated value of *θ* inferred from Eq. (5) ranges from 0.2 to 1.2. A *θ* near one (orange) indicates identical factual and counterfactual worlds. A *θ* close to zero (blue) implies that the factual and counterfactual worlds strongly differ, with an increase in temperature records. Note that the estimated *θ* can be larger than one, meaning that *p*_{1,r} < *p*_{0,r}, that is, fewer records in the factual world. Consequently, the risk ratio should be used instead of the far in such cases.

Figure 4 displays the largest values of *p*_{1,r} − *p*_{0,r} over all values of *r* ≥ 2. In causality theory, this difference is called the probability of sufficient and necessary causation, and under *H*_{0}, its maximum is equal to (1 − √*θ*)/(1 + √*θ*), reached at *r* = 1 + 1/√*θ*. These probabilities summarize evidence of both sufficient and necessary causation (see, e.g., Hannart et al. 2016). Only the locations for which the Cox and Oakes *p* values are greater than 5% are displayed in Fig. 4. For example, the estimated value for Iceland is 35%.
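The quantity mapped in Fig. 4 can be reproduced for any *θ* by maximizing *p*_{1,r} − *p*_{0,r} over *r* under the exponential model. The grid search below (the *θ* value is an arbitrary illustration) agrees with the continuous-*r* optimum obtained by differentiating in *r*:

```python
import numpy as np

def pns(r, theta):
    # p_{1,r} - p_{0,r} = 1/(1 + (r-1)*theta) - 1/r under the exponential model
    return 1.0 / (1.0 + (r - 1.0) * theta) - 1.0 / r

theta = 0.25
r_grid = np.arange(2, 10_000)
r_best = r_grid[np.argmax(pns(r_grid, theta))]

print(r_best, pns(r_best, theta))  # r = 3 and value 1/3 for theta = 0.25
# Treating r as continuous, the optimum is r* = 1 + 1/sqrt(theta) = 3,
# with maximal value (1 - sqrt(theta))/(1 + sqrt(theta)) = 1/3.
```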

Interpreting the two maps jointly highlights the African continent, Brazil, and western Australia, regions with low *θ* and large *p* values, that is, with a strong contrast between the all-forcings and natural-forcings-only runs. Northern Brazil, most of Africa, and western Australia appear to show evidence of such causations most strongly. Variability in surface air temperature is low in tropical regions and over many ocean areas, and it is strongly reduced in areas of sea ice loss, so we expect large relative differences between the counterfactual and factual worlds in these cases. Simulated variability is larger in other places, relative to the simulated change in mean climate; as a consequence, both PN and PS remain small locally in these regions. This includes the Mediterranean basin, which is sometimes referred to as a “hot spot.” Note that, given the period of time considered for the “factual world,” non-GHG forcings like anthropogenic aerosols can contribute to this result (offsetting part of the GHG-induced warming).

## 4. Discussions and future work

### a. Event definition

Our work is closely related to the fundamental and sometimes overlooked question of how to define the event itself. Ideally, the event of interest should be easily interpretable, its probability should be simple to compute in both worlds, and it should belong to an invariant reference framework in order to compare different studies. These desiderata were fulfilled by slightly moving away from the idea of defining the event as an excess over a fixed threshold and instead setting the event as a record.

A potential worry could have been that treating record events instead of excesses above the threshold *u* would drastically move us away from FAR(*u*). Figure 1 showed that this concern is unfounded: the practitioner interested in assessing the likelihood of extreme events relative to a counterfactual scenario can use either far(*r*) or FAR(*x*_{r}) for large *r* or, equivalently, rr(*r*) or RR(*x*_{r}).

### b. Null hypothesis

Statistically, inferences of the probabilities of interest are simple, fast, and efficient under *H*_{0}: *W* ~ exp(*θ*). In particular, such an assumption is satisfied for simple GEV setups (see Table 1). Theoretically, it is rooted in the variable *W* = −log*G*(*Z*). This transformed variable shows that the interest lies neither in *Z* alone nor in *X* alone (through the CDF *G*) but in how they combine. In other words, our main message is that it is counterproductive to minutely study the two samples **X** and **Z** independently. As the FAR and risk ratios are relative quantities, working with the relative statistic *W* greatly simplifies their inference.

Concerning the applications, our analysis of a long temperature record in Paris and of a climate model indicates that the exponential assumption for *W* appears to be reasonable for temperature maxima over land areas. Still, a comprehensive study with a large number of both weather stations and climate models is needed to confirm this hypothesis. Future work should also focus on precipitation maxima. In this situation, we conjecture that the Weibull MGF, with its extra parameter *k*, could provide a better fit than the exponential MGF because extreme precipitation has stronger variability. The Weibull parameter *k* should capture the changes between the factual and counterfactual tails. Another option when *H*_{0} is rejected is to estimate *p*_{1,r} nonparametrically [see Eq. (A1)], which provides the necessary information to compute confidence intervals in this case. The main advantage of this approach is that goodness-of-fit tests are not required, but its main drawback is that the sample size *n* should always be much greater than the return period of interest so that *r*/*n* goes to zero.

To summarize our findings about *H*_{0}, it is important to emphasize that this hypothesis allows us to work with small sample sizes; see our Paris example with *n* = 30. This contrasts sharply with nonparametric methods that need thousands of factual and counterfactual simulations. Hence, our methodology is fairly general. One possible application could be conditional attributions, when the event itself can be linked to another covariate, like the atmospheric circulation. Further work on far(*r*) could estimate this quantity conditionally on atmospheric circulation properties, in order to separate the roles of changes in circulation and temperature increases (see, e.g., Yiou et al. 2017).

Finally, our goal in this paper is not to discard the classical FAR(*u*) and RR(*u*) in favor of far(*r*) and rr(*r*). The two have different interpretations, and they complement each other. Here, the new *p*_{0,r} and *p*_{1,r} should be viewed as an additional risk management tool for practitioners interested in causality, especially those who base their decision-making process on records. Conceptually, our approach has the advantage of providing a fixed reference framework, which is a necessity for intercomparison studies and could help future intercomparison studies like CMIP6.

### APPENDIX

#### Mathematical Computation

##### a. Proof of Eq. (2)

By conditioning on *Z*_{r} and using the independence of the *X*_{i}, whose common CDF is *G*, we have

*p*_{1,r} = P[*Z*_{r} > max(*X*_{1}, …, *X*_{r−1})] = E[*G*^{r−1}(*Z*_{r})] = E[exp{−(*r* − 1)*W*}],

with *W* = −log*G*(*Z*).
##### b. Proof of Eq. (4)

In this section, we prove a more general result than Eq. (4); the latter can be deduced by taking *r* = 2 in the following proposition.

Let

*p̂*_{1,r} = (1/*n*) Σ_{i=1}^{n} *Ĝ*_{n}^{r−1}(*Z*_{i}),

where *Ĝ*_{n} denotes the empirical CDF of the counterfactual sample **X**. Then, we have that √*n*(*p̂*_{1,r} − *p*_{1,r}) is asymptotically a zero-mean Gaussian variable,    (A1)

with an asymptotic variance derived at the end of this section.
Note that the estimator *p̂*_{1,r} and its asymptotic properties do not depend on the null hypothesis *H*_{0}: *W* ~ exp(*θ*). In other words, *p̂*_{1,r} is a very general nonparametric estimator of *p*_{1,r} for any integer *r* ≥ 2.
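A direct implementation of this nonparametric estimator is straightforward. In the sketch below (sample sizes and the Gumbel test distribution are illustrative), the two worlds are identical, so *p̂*_{1,r} should be close to 1/*r*:

```python
import numpy as np

def p1r_nonparam(x, z, r):
    """Nonparametric estimate of p_{1,r}: the empirical version of
    E[G^{r-1}(Z)], with G replaced by the empirical CDF of x."""
    x_sorted = np.sort(np.asarray(x))
    g_n = np.searchsorted(x_sorted, z, side="right") / len(x_sorted)
    return np.mean(g_n ** (r - 1))

rng = np.random.default_rng(2)
x = rng.gumbel(size=20_000)
z = rng.gumbel(size=20_000)      # identical worlds: p_{1,r} = 1/r
print(p1r_nonparam(x, z, 10))    # close to 0.1
```

As noted in the text, the variance of this estimator grows with *r*, so *r* must stay small relative to *n*.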

The key point for the proof of Eq. (A1) is the following result about the empirical process (see Mason and Van Zwet 1987):

√*n*[*Ĝ*_{n}(*z*) − *G*(*z*)] converges weakly to *B*[*G*(*z*)],

where *B*(*u*) represents a classical Brownian bridge on [0, 1] with E[*B*(*u*)] = 0 and covariance cov[*B*(*u*), *B*(*v*)] = min(*u*, *v*) − *uv*.

It follows that, for all *i* = 1, …, *n*, we can write uniformly for any *z*_{i}

*Ĝ*_{n}(*z*_{i}) = *G*(*z*_{i}) + *B*[*G*(*z*_{i})]/√*n* + o_{P}(1/√*n*).
Letting *h*(*u*) = *u*^{r−1}, we can write that

√*n*(*p̂*_{1,r} − *p*_{1,r}) = *N*_{1} + *N*_{2},
with obvious notation for *N*_{1} and *N*_{2}. Concerning the latter, the central limit theorem tells us that *N*_{2} converges to a zero-mean Gaussian variable with variance var[*G*^{r−1}(*Z*)]. Concerning *N*_{1}, the function *h*(*u*) = *u*^{r−1} being a differentiable function on [0, 1], Proposition 6.1.5 in Brockwell and Davis (1991) allows us to focus on the approximation of *N*_{1} (the so-called delta method):

As , we have

Using , we have

It follows that the covariance of is equal to

For large *n*, this leads to

In other words, the random variable *N*_{1} asymptotically follows a zero-mean Gaussian distribution. In addition, the covariance between *N*_{1} and *N*_{2} is equal to zero because the Brownian bridge *B* is centered and built from the sample **X**, which is independent of **Z**. In summary, the sum *N*_{1} + *N*_{2} can be viewed (asymptotically in a distributional sense) as the sum of two uncorrelated zero-mean Gaussian random variables, with total variance

For the special case defined by Eq. (3), we can write that

As E[*G*^{r−1}(*Z*)] = 1/[1 + (*r* − 1)*θ*] and E[*G*^{2(r−1)}(*Z*)] = 1/[1 + 2(*r* − 1)*θ*] under *H*_{0}, we can write

This indicates that this nonparametric technique should be used with caution for large *r*, as the relative error grows at the rate √(*r*/*n*). The sample size *n* should therefore always be much greater than the return period of interest so that *r*/*n* goes to zero.
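This caution is easy to visualize by Monte Carlo. In the sketch below (all sizes are illustrative), the factual and counterfactual samples are drawn from the same distribution, so the true *p*_{1,r} equals 1/*r*, and the relative error of the plug-in estimate grows as *r* approaches *n*:

```python
import numpy as np

rng = np.random.default_rng(4)

def p1r_hat(x, z, r):
    # Nonparametric plug-in estimate of p_{1,r} (empirical CDF of x at each z_i)
    g_n = np.searchsorted(np.sort(x), z, side="right") / len(x)
    return np.mean(g_n ** (r - 1))

n, n_rep = 500, 400
rels = []
for r in (5, 50, 500):
    est = [p1r_hat(rng.gumbel(size=n), rng.gumbel(size=n), r) for _ in range(n_rep)]
    rels.append(np.std(est) * r)   # relative error: true p_{1,r} = 1/r here
    print(r, rels[-1])             # grows with r
```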

##### c. Computation of p_{1,r} in the GEV case

For *ξ*_{i} ≠ 0, we denote

*T*_{i}(*x*) = [1 + *ξ*_{i}(*x* − *μ*_{i})/*σ*_{i}]^{1/ξ_{i}}

for *i* ∈ {0, 1}. For any GEV distributions in our tables, we have *F*(*x*) = exp[−1/*T*_{1}(*x*)] and *G*(*x*) = exp[−1/*T*_{0}(*x*)]. In this context, we can write

whenever the supports of *F* and *G* coincide (same support condition in the Weibull and Fréchet cases).

For *ξ*_{0} = *ξ*_{1} = 0, we set *T*_{i}(*x*) = exp[(*x* − *μ*_{i})/*σ*_{i}] for *i* ∈ {0, 1}. Then, we have

As *T*_{1}(⋅) is a nondecreasing function, we can write that

To continue, we need to derive the pdf of *T*_{1}(*Z*_{r}). If *Z* follows a GEV(*μ*_{1}, *σ*_{1}, *ξ*_{1}), then

P[*T*_{1}(*Z*) ≤ *t*] = exp(−1/*t*),  for *t* > 0.
Hence, *T*_{1}(*Z*_{r}) is unit-Fréchet. Going back to the computation of *p*_{1,r}, we can use the independence assumption within the sample **X** to write that

We need to compute

To finish the computation, we study separately the Gumbel and Fréchet cases. If *ξ*_{0} = *ξ*_{1} = 0, then 1/*T*_{0}(*Z*_{r}) = exp[−(*Z*_{r} − *μ*_{0})/*σ*_{0}], and the expectation above becomes the MGF of a Weibull-type variable. When *σ*_{0} = *σ*_{1}, this becomes *p*_{1,r} = 1/(1 + *c*_{r}), with *c*_{r} = (*r* − 1)exp[−(*μ*_{1} − *μ*_{0})/*σ*_{0}].

If the shape parameters are nonnull and of the same sign, then we have

whenever the supports of *F* and *G* coincide (same support condition in the Weibull and Fréchet cases). It follows that

with *c*_{r} a constant depending on the GEV parameters. In other words, we have

For the special case when *ξ*_{0} = *ξ*_{1}, then this simplifies to *p*_{1,r} = 1/(1 + *c*_{r}).

## Acknowledgments

This work is supported by ERC Grant 338965-A2C2 and P. Naveau’s work was partially financed by the EUPHEME project (ERA4CS ERA-NET).

## REFERENCES

Brockwell, P. J., and R. A. Davis, 1991: *Time Series: Theory and Methods*. 2nd ed. Springer Series in Statistics, Springer, 577 pp.

Coles, S., 2001: *An Introduction to Statistical Modeling of Extreme Values*. Springer Series in Statistics, Springer, 208 pp.

Cox, D. R., and D. Oakes, 1984: *Analysis of Survival Data*. Chapman & Hall/CRC Monogr. on Statistics and Applied Probability, No. 21, Chapman and Hall, 212 pp.

Embrechts, P., C. Klüppelberg, and T. Mikosch, 1997: *Modelling Extremal Events for Insurance and Finance*. Stochastic Modelling and Applied Probability Series, Vol. 33, Springer, 648 pp.

Hannart, A., J. Pearl, F. E. L. Otto, P. Naveau, and M. Ghil, 2016: Causal counterfactual theory for the attribution of weather and climate-related events. *Bull. Amer. Meteor. Soc.*, **97**, 99–110.

IPCC, 2013: *Climate Change 2013: The Physical Science Basis*. Cambridge University Press, 1535 pp., https://doi.org/10.1017/CBO9781107415324.

Resnick, S. I., 1987: *Extreme Values, Regular Variation, and Point Processes*. Springer Series in Operations Research and Financial Engineering, Springer, 320 pp.

Zwiers, F. W., and Coauthors, 2013: Climate extremes: Challenges in estimating and understanding recent changes in the frequency and intensity of extreme climate and weather events. *Climate Science for Serving Society*, G. R. Asrar and J. W. Hurrell, Eds., Springer, 339–389.

## Footnotes

^{a}

Current affiliation: CNRM, Météo France/CNRS, Toulouse, France.

^{b}

Current affiliation: Pacific Climate Impacts Consortium, University of Victoria, Victoria, British Columbia, Canada.

^{c}

Current affiliation: Ouranos, Montreal, Québec, Canada.

^{d}

Current affiliation: École Polytechnique, Palaiseau, France.

© 2018 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

^{1}

Given a fixed number of years *r*, the return level, say *x*_{r}, is the value that should be crossed on average once during the period *r* under the assumption of stationarity. The number *r* is called the return period, and, hence, it is associated with an event of probability 1/*r*. The return level *x*_{r} associated with period *r* can also be defined as the value that is exceeded with probability 1/*r* in any given year.

^{2}

All CDFs in this article are assumed to be continuous.

^{3}

For ease of interpretation, the temporal increment will always be a year in this work, but this is arbitrary and other increments could be used instead.

^{4}

The IID assumption appears reasonable for yearly maxima in the counterfactual world, the topic of interest in our examples.

^{6}

With the constraint that *X* and *Z* share the same support, that is, *x* < 1/(−*ξ*) with *σ* > 0 and *μ* = (1 − *σ*)/(−*ξ*).

^{7}

As *Ĝ*_{n}(*Z*_{i}), with *Ĝ*_{n} representing the empirical CDF, could be equal to zero for any *i* such that *Z*_{i} < min(*X*_{1}, …, *X*_{n}), the expression −log*Ĝ*_{n}(*Z*_{i}) could produce −log 0 = ∞. This explains why, to compute *Ŵ*_{i}, we use a kernel estimate *G̃*_{n} with infinite support, and not *Ĝ*_{n}, to approximate the CDF *G* (see the np package of Racine and Hayfield in R). This approximation is only used for the test of *H*_{0}, but not to estimate *p*_{1,r} and, consequently, *θ*. See Eq. (4).

^{9}

That is, the model is driven by specifying the history of greenhouse gas concentrations, aerosols, land use, volcanic aerosol, and solar output over the period of simulation from preindustrial times up to 2005.