• Arcones, M. A., and E. Giné, 1989: The bootstrap of the mean with arbitrary bootstrap sample size. Ann. Inst. Henri Poincaré, 25, 457–481.

  • Arcones, M. A., and E. Giné, 1991: Additions and corrections to “The bootstrap of the mean with arbitrary bootstrap sample size.” Ann. Inst. Henri Poincaré, 27, 583–595.

  • Athreya, K. B., 1987a: Bootstrap of the mean in the infinite variance case. Proc. First World Congress of the Bernoulli Society, Utrecht, Netherlands, Bernoulli Society, 95–98.

  • Athreya, K. B., 1987b: Bootstrap of the mean in the infinite variance case. Ann. Stat., 15, 724–731, https://doi.org/10.1214/aos/1176350371.

  • Bickel, P. J., and D. A. Freedman, 1981: Some asymptotic theory for the bootstrap. Ann. Stat., 9, 1196–1217, https://doi.org/10.1214/aos/1176345637.

  • Bickel, P. J., F. Götze, and W. R. van Zwet, 1997: Resampling fewer than n observations: Gains, losses, and remedies for losses. Stat. Sin., 7, 1–31.

  • Bücher, A., and J. Segers, 2017: On the maximum likelihood estimator for the generalized extreme-value distribution. Extremes, 20, 839–872, https://doi.org/10.1007/s10687-017-0292-6.

  • Canty, A., and B. Ripley, 2017: boot: Bootstrap R (S-Plus) Functions, version 1.3-20. R package, http://statwww.epfl.ch/davison/BMA/.

  • Cooley, D., 2013: Return periods and return levels under climate change. Extremes in a Changing Climate: Detection, Analysis and Uncertainty, Springer, 97–114.

  • Davison, A., and D. Hinkley, 1997: Bootstrap Methods and Their Application. Cambridge University Press, 582 pp.

  • Deheuvels, P., D. M. Mason, and G. R. Shorack, 1993: Some results on the influence of extremes on the bootstrap. Ann. Inst. Henri Poincaré Probab. Stat., 29, 83–103.

  • Fawcett, L., and D. Walshaw, 2012: Estimating return levels from serially dependent extremes. Environmetrics, 23, 272–283, https://doi.org/10.1002/env.2133.

  • Feigin, P., and S. I. Resnick, 1997: Linear programming estimators and bootstrapping for heavy-tailed phenomena. Adv. Appl. Probab., 29, 759–805, https://doi.org/10.2307/1428085.

  • Fukuchi, J.-I., 1994: Bootstrapping extremes of random variables. Ph.D. thesis, Iowa State University, 101 pp.

  • Gilleland, E., 2017: distillery: Method Functions for Confidence Intervals and to Distill Information from an Object, version 1.0-4. R package, https://www.ral.ucar.edu/staff/ericg.

  • Gilleland, E., 2020: Bootstrap methods for statistical inference. Part I: Comparative forecast verification for continuous variables. J. Atmos. Oceanic Technol., 37, 2117–2134, https://doi.org/10.1175/JTECH-D-20-0069.1.

  • Gilleland, E., and R. W. Katz, 2016: extRemes 2.0: An extreme value analysis package in R. J. Stat. Software, 72, 1–39, https://doi.org/10.18637/jss.v072.i08.

  • Gilleland, E., R. W. Katz, and P. Naveau, 2017: Quantifying the risk of extreme events under climate change. Chance, 30, 30–36, https://doi.org/10.1080/09332480.2017.1406757.

  • Giné, E., and J. Zinn, 1989: Necessary conditions for the bootstrap of the mean. Ann. Stat., 17, 684–691, https://doi.org/10.1214/aos/1176347134.

  • Hall, P., 1990: Asymptotic properties of the bootstrap for heavy-tailed distributions. Ann. Probab., 18, 1342–1360, https://doi.org/10.1214/aop/1176990748.

  • Heffernan, J. E., and J. A. Tawn, 2004: A conditional approach for multivariate extreme values (with discussion). J. Roy. Stat. Soc., 66B, 497–546, https://doi.org/10.1111/j.1467-9868.2004.02050.x.

  • Katz, R. W., 2013: Statistical methods for non-stationary extremes. Extremes in a Changing Climate: Detection, Analysis and Uncertainty, Springer, 15–37.

  • Katz, R. W., M. B. Parlange, and P. Naveau, 2002: Statistics of extremes in hydrology. Adv. Water Resour., 25, 1287–1304, https://doi.org/10.1016/S0309-1708(02)00056-8.

  • Kinateder, J. G., 1992: An invariance principle applicable to the bootstrap. Exploring the Limits of Bootstrap, Wiley Series in Probability and Mathematical Statistics, Wiley, 157–181.

  • Knight, K., 1989: On the bootstrap of the sample mean in the infinite variance case. Ann. Stat., 17, 1168–1175, https://doi.org/10.1214/aos/1176347262.

  • Kyselý, J., 2002: Comparison of extremes in GCM-simulated, downscaled and observed central-European temperature series. Climate Res., 20, 211–222, https://doi.org/10.3354/cr020211.

  • Lee, S., 1999: On a class of m out of n bootstrap confidence intervals. J. Roy. Stat. Soc., 61B, 901–911, https://doi.org/10.1111/1467-9868.00209.

  • LePage, R., 1992: Bootstrapping signs. Exploring the Limits of Bootstrap, Wiley Series in Probability and Mathematical Statistics, Wiley, 215–224.

  • R Core Team, 2017: R: A language and environment for statistical computing. R Foundation for Statistical Computing, https://www.R-project.org/.

  • Resnick, S. I., 2007: Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer Series in Operations Research and Financial Engineering, Springer, 404 pp.

  • Schendel, T., and R. Thongwichian, 2015: Flood frequency analysis: Confidence interval estimation by test inversion bootstrapping. Adv. Water Resour., 83, 1–9, https://doi.org/10.1016/j.advwatres.2015.05.004.

  • Schendel, T., and R. Thongwichian, 2017: Confidence intervals for return levels for the peaks-over-threshold approach. Adv. Water Resour., 99, 53–59, https://doi.org/10.1016/j.advwatres.2016.11.011.

  • Shao, J., and D. Tu, 1995: The Jackknife and Bootstrap. Springer, 516 pp.

  • Smith, R. L., 1985: Maximum likelihood estimation in a class of nonregular cases. Biometrika, 72, 67–90, https://doi.org/10.1093/biomet/72.1.67.

Fig. 3. Annual maximum of summer daily minimum temperatures (°F) at Phoenix Sky Harbor airport.

Fig. 4. Histograms of simulated heavy-tail samples from GEV distribution functions with location parameters equal to zero, scale parameters equal to unity, and shape parameters of 0.1, 0.5, and 1.5, respectively. The theoretical mean for each simulation is approximately 0.69, 1.54, and undefined, respectively. The means for these samples are approximately 0.64, 1.45, and 16.24.


Bootstrap Methods for Statistical Inference. Part II: Extreme-Value Analysis

Eric Gilleland National Center for Atmospheric Research, Boulder, Colorado


Abstract

This paper is the sequel to a companion paper on bootstrap resampling that reviews bootstrap methodology for making statistical inferences for atmospheric science applications where the necessary assumptions are often not met for the most commonly used resampling procedures. In particular, this sequel addresses extreme-value analysis applications with discussion on the challenges for finding accurate bootstrap methods in this context. New bootstrap code from the R packages “distillery” and “extRemes” is introduced. It is further found that one approach for accurate confidence intervals in this setting is not well suited to the case when the random sample’s distribution is not stationary.

Supplemental information related to this paper is available at the Journals Online website: https://doi.org/10.1175/JTECH-D-20-0070.s1.

© 2020 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Eric Gilleland, ericg@ucar.edu

This article has a companion article which can be found at http://journals.ametsoc.org/doi/abs/10.1175/JTECH-D-20-0069.1.


1. Introduction

This paper is the second part in a two-part series on bootstrap methods for atmospheric science applications (Gilleland 2020, henceforth PI). A common question in atmospheric science concerns extreme values of weather phenomena and how those rare events might be changing in a future climate. This situation, however, presents hidden challenges that are often overlooked. Even when it is recognized that standard parametric statistical tests might not be appropriate, bootstrap methods are often seen as a fix for any situation. Yet bootstrap methods still require assumptions, and the most commonly used type, the independent and identically distributed (iid) bootstrap, fails to produce accurate results when those assumptions are not met.

Assumptions for the bootstrap procedure are often violated when interest is in the extreme values of a process. First, simply resampling from the data does not allow for sampling values that might occur but have not been observed in the data record. Second, asymptotic arguments for the appropriateness of the resampling procedure do not hold for the usual resampling paradigm when the underlying distribution function is heavy tailed (Resnick 2007, chapter 6).

Fawcett and Walshaw (2012) employ block bootstrap resampling along with a bivariate extreme-value model to make inferences when modeling threshold exceedances, and Heffernan and Tawn (2004) employ a semiparametric bootstrap procedure in order to make inferences for their bivariate conditional extreme-value model. Kyselý (2002) found that a parametric bootstrap procedure performed fairly well, but had a tendency to yield narrower confidence intervals (CI’s) than desired. Schendel and Thongwichian (2015, 2017) advocate for the use of the test-inversion bootstrap (TIB; see PI) procedure, but this method can be difficult to implement, especially in the case of nonstationary data.

The main objective of this paper is to describe new R software (R Core Team 2017), available in the “extRemes” package (Gilleland and Katz 2016) and making use of code from the “distillery” package (Gilleland 2017), for obtaining accurate CI’s for extreme values.

2. Extreme value analysis

Weather situations that have a high impact on human life, infrastructure, and the environment, such as extreme precipitation, severe winds, tornadoes, and hurricanes, are often the main thrust of atmospheric studies. For the rarest events, such as those that occur on average only once every 100 years, it is important to utilize the correct statistical analyses in order to accurately portray the risks of these types of events, as well as their uncertainty. In what follows, it is helpful to denote a random sample of variables X1, …, Xn to represent a physical phenomenon of interest, such as 24-h accumulated rainfall, daily maximum temperature, or streamflow.

Theoretical, asymptotic arguments give justification for modeling maxima taken over very long blocks with the generalized extreme value (GEV) distribution function, the frequency of occurrence of rare events by the Poisson distribution function and subsequently the time between events by the exponential distribution function, and for excesses over a very high threshold by the generalized Pareto (GP) distribution function. The frequency and occurrence of extreme events can be modeled jointly by way of a Poisson point process (PP) framework that can be recharacterized in terms of the GEV distribution function. This same PP framework can be expressed as a marked Poisson process where the marks follow a GP distribution function. Approximately, the GP distribution function informs about the tail of the GEV distribution function so that all three approaches to modeling extreme values are essentially the same. This approximate equivalence can be easily intuited by understanding that, for a high threshold u, Pr{max(X1, …, Xn) ≤ u} = Pr{N = 0}, where N is a random variable that represents the number of times that Xi > u, i = 1, …, n. That is, letting Mn = max(X1, …, Xn), N ~ Poisson(λ) and Mn ~ GEV(μ, σ, ξ).

More precisely, the binomial distribution function is the probability distribution function that results for the total number of events, Nn, that occur in a sequence of n independent trials. For p the probability of a success, i.e., p = Pr{Xi > u}, on a given trial with 0 < p < 1, the expected value for Nn is np. If the expected number of events, λ = np, stays constant as the number of trials, n, increases, then p decreases with n; write pn to emphasize this relationship. Under this scenario,
Pr{Nn = 0} = (1 − pn)^n = (1 − npn/n)^n = (1 − λ/n)^n ≈ e^(−λ)
for large n. That is, the probability distribution function of Nn is approximately Poisson with rate parameter λ. More generally,
Pr{Nn = k} = λ^k e^(−λ)/k!,  k = 0, 1, 2, …, and λ > 0,   (1)
for large n and constant rate of occurrence as n increases. The Poisson distribution function has the unusual property that its mean is equal to its variance, which is also the rate parameter, λ.
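The Poisson limit above is easy to verify numerically. The following sketch (written in Python rather than the paper's R, purely as an arithmetic check) compares (1 − λ/n)^n with e^(−λ) for increasing n, and recovers the mean–variance property directly from the Poisson probabilities; the rate λ = 0.5 is an arbitrary illustrative choice:

```python
import math

lam = 0.5  # an illustrative event rate

# (1 - lam/n)^n approaches exp(-lam) as the number of trials n grows
for n in (10, 100, 10_000):
    print(n, (1 - lam / n) ** n, math.exp(-lam))

# The Poisson mean and variance are both equal to lam: computing
# E[N] and E[N^2] - E[N]^2 from the pmf recovers lam twice.
pmf = [lam**k * math.exp(-lam) / math.factorial(k) for k in range(50)]
mean = sum(k * p for k, p in enumerate(pmf))
var = sum(k**2 * p for k, p in enumerate(pmf)) - mean**2
print(mean, var)  # both approximately 0.5
```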
The GEV distribution function is given by
Pr{max(X1, …, Xn) ≤ z} = Pr{Z ≤ z} = G(z) = exp[−{1 + ξ(z − μ)/σ}+^(−1/ξ)],   (2)
where {⋅}+ indicates that the value inside {⋅} is set to zero if it is less than zero. The GEV distribution function has three parameters: −∞ < μ < ∞, σ > 0, and −∞ < ξ < ∞. The parameter μ is a location parameter that linearly adjusts where the overall mass of the GEV distribution function falls, but it is not equivalent to the mean of the GEV distribution function, which is only defined for ξ < 1 and is given by μ − σ[1 − Γ(1 − ξ)]/ξ, where Γ(⋅) is the gamma function defined for x > 0 by Γ(x) = ∫₀^∞ t^(x−1) e^(−t) dt, reducing to (x − 1)! when x is a positive integer. The scale parameter, σ, relates to the dispersion of the distribution function but is not the same as the standard deviation, which is given by √(σ²[Γ(1 − 2ξ) − Γ²(1 − ξ)]/ξ²) and is only defined when ξ < 1/2. Finally, ξ is the shape parameter: ξ < 0 gives rise to the reverse Weibull distribution function, which has a finite upper bound, and ξ > 0 gives the Fréchet distribution function, which has a heavy tail so that the probability of observing increasingly higher values of Z decreases at a polynomial rate. Defined by continuity, the case ξ = 0 yields the light-tailed Gumbel distribution function; namely, G(z) = exp{−exp[−(z − μ)/σ]}.
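The moment formulas above can be checked against a library implementation. The sketch below (Python/SciPy rather than the paper's R; note that SciPy's genextreme uses the shape convention c = −ξ) compares the closed-form mean and standard deviation with the library values for μ = 0, σ = 1, ξ = 0.2:

```python
from math import gamma, sqrt
from scipy.stats import genextreme

mu, sigma, xi = 0.0, 1.0, 0.2

# Closed-form GEV mean (defined for xi < 1) and standard deviation (xi < 1/2)
mean = mu - sigma * (1 - gamma(1 - xi)) / xi
sd = sqrt(sigma**2 * (gamma(1 - 2 * xi) - gamma(1 - xi) ** 2) / xi**2)

# SciPy parameterizes the GEV with shape c = -xi
dist = genextreme(c=-xi, loc=mu, scale=sigma)
print(mean, dist.mean())  # the two means should agree
print(sd, dist.std())     # and likewise the standard deviations
```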
The GP distribution function, which is usually considered in the survival format Pr{Y > y} = F̄(y) instead of the more customary Pr{Y ≤ y} = F(y) = 1 − F̄(y) orientation, is given for Y = X, conditioned on X > u, with u a high threshold, by
F̄(y) = [1 + ξ(y − u)/σu]+^(−1/ξ),   (3)
where the + subscript has the same meaning as for Eq. (2). The u subscript on the scale parameter emphasizes that σu > 0 depends on the threshold. As mentioned previously, the GP distribution function approximates the tail of the GEV distribution function, and the scale parameter of the GP distribution function is related to the GEV distribution by σu = σ + ξ(u − μ), where σ and μ are the parameters from the GEV distribution function associated with block maxima over large blocks from the same underlying random variable that gives rise to the GP distribution function.1 The shape parameter is the same for both. Clearly, the location parameter μ is not involved in the GP distribution function, which has only two parameters. The threshold u can be thought of as a surrogate for the location parameter as it effectively takes on the same role.
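The relation σu = σ + ξ(u − μ) can be illustrated numerically. In the hedged sketch below (Python/SciPy rather than R; SciPy's genextreme takes c = −ξ and genpareto takes c = ξ; the threshold u = 80 and excess y = 20 are arbitrary choices for illustration), the conditional exceedance probability of a GEV is compared with the GP survival function whose scale is implied by the GEV parameters:

```python
from scipy.stats import genextreme, genpareto

mu, sigma, xi = 30.0, 10.0, 0.2
u = 80.0                         # a high threshold, chosen for illustration
sigma_u = sigma + xi * (u - mu)  # GP scale implied by the GEV parameters

gev = genextreme(c=-xi, loc=mu, scale=sigma)  # SciPy convention: c = -xi
gp = genpareto(c=xi, loc=0.0, scale=sigma_u)  # SciPy convention: c = xi

y = 20.0  # an excess above the threshold
cond_gev = gev.sf(u + y) / gev.sf(u)  # Pr{X > u + y | X > u} under the GEV
print(cond_gev, gp.sf(y))             # GP closely approximates the GEV tail
```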

The GP distribution function has mean σu/(1 − ξ) and variance σu²/[(1 − ξ)²(1 − 2ξ)]. Naturally, the shape parameter, ξ, once again determines the tail behavior for the GP distribution function. The upper-bounded beta distribution function arises when ξ < 0 and the heavy-tailed Pareto distribution function results when ξ > 0. The light-tailed exponential distribution function occurs when ξ = 0, defined again by continuity.
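As with the GEV moments, these GP formulas can be checked against a library (again a Python/SciPy sketch, not the paper's R code):

```python
from scipy.stats import genpareto

sigma_u, xi = 1.0, 0.2

# Closed-form GP mean (defined for xi < 1) and variance (xi < 1/2)
mean = sigma_u / (1 - xi)
var = sigma_u**2 / ((1 - xi) ** 2 * (1 - 2 * xi))

dist = genpareto(c=xi, scale=sigma_u)  # SciPy's shape c is xi here
print(mean, dist.mean())
print(var, dist.var())
```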

Comparison of the distribution functions defined in (1), (2), and (3) helps to see how the three are related. The GEV distribution function has the same form as the Poisson distribution function, but where λ varies according to the three parameters; that is, the GEV distribution function is a nonhomogeneous Poisson distribution function. The GP distribution function has a similar form as the exponent portion of the GEV distribution function where the threshold replaces the location parameter.

At the beginning of this section, it was suggested that information about a 100-yr event might be of interest. More generally, suppose interest is in a T-yr event; that is, one that is exceeded, on average, once every T years or, equivalently, with probability 1/T in a given year. It should be noted that the probability of occurrence of such an event over a number of years, M, is possibly higher than one might think (cf. Gilleland et al. 2017). For example, suppose a home is for sale by a river. A potential buyer might believe that they would live in the home for 25 years. Suppose further that the potential buyer is risk averse to a 100-yr flood event. The probability of such an event over this span of time, assuming independence and stationarity in time, is given by 1 − (1 − p)^M, which for this example is 1 − (1 − 1/100)^25 ≈ 22.22%. That is, the probability of having a 100-yr flood event at least once during the 25-yr time frame is more than 20%. In general, it is not known what the T-yr event actually is. That is, the potential buyer is risk averse to the 100-yr flood event, but the assumption is that this buyer knows what the level is for the 100-yr event. The extreme-value distribution functions (EVD’s) given by (1), (2), and (3) are particularly well suited for answering this question. Not only are they the only distribution functions with theoretical support for modeling rare events of this magnitude, but information about such events is easily obtained by inverting these equations. Of course, much uncertainty is involved in the estimated return levels when the return period approaches or even exceeds the length of the data record, which is often the case.
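The homebuyer example is a one-line computation; the following sketch (in Python rather than R, for arithmetic only) encodes it under the same independence and stationarity assumptions:

```python
# Probability of at least one T-yr event in M years, assuming independence
# and stationarity in time: 1 - (1 - 1/T)^M
def risk(T, M):
    return 1 - (1 - 1 / T) ** M

print(round(100 * risk(100, 25), 2))   # the homebuyer example: ~22.22 (%)
print(round(100 * risk(100, 100), 2))  # even over a full 100 years: ~63.4 (%)
```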

In particular, suppose the GEV distribution function is the valid distribution function for Z1, …, Zn, where Zi, i = 1, …, n represent yearly maximum water levels. In this case, the quantiles of this GEV distribution function correspond directly to the return levels, zp, which are simply the solutions to the equation G(zp) = 1 − p for G(⋅) as defined in (2), which yields
zp = μ − (σ/ξ)[1 − {−ln(1 − p)}^(−ξ)]   for ξ ≠ 0,
zp = μ − σ ln{−ln(1 − p)}   for ξ = 0.   (4)
For the PP model, which can be characterized in terms of the GEV distribution function and even rectified so that it corresponds to an annual basis, return levels can be found directly using Eq. (4). The GP distribution function is slightly more complicated because it is a model for the excesses over a high threshold; the complication involves having to estimate the rate at which the high threshold u is exceeded. Provided the rate ζu = Pr{X > u} can be estimated, the value xm > u that is exceeded, on average, once every m observations (e.g., years) for the GP distribution function is given by
xm = u + (σu/ξ)[(mζu)^ξ − 1]   for ξ ≠ 0,
xm = u + σu ln(mζu)   for ξ = 0.   (5)
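As a numerical check of Eq. (4) (sketched in Python rather than R), the 1000-yr return level for the GEV with μ = 30, σ = 10, and ξ = 0.2 used later in section 3 works out to roughly 179, consistent with the "true" value quoted there:

```python
import math

def gev_return_level(mu, sigma, xi, p):
    """Return level z_p solving G(z_p) = 1 - p for the GEV, as in Eq. (4)."""
    if xi != 0:
        return mu - (sigma / xi) * (1 - (-math.log(1 - p)) ** (-xi))
    return mu - sigma * math.log(-math.log(1 - p))

# 1000-yr return level (annual exceedance probability p = 1/1000) for the
# GEV(mu = 30, sigma = 10, xi = 0.2) simulation of section 3
print(gev_return_level(30, 10, 0.2, 1 / 1000))  # approximately 179
```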

To utilize these models, however, the parameters must be estimated from data. Because it is the rare events that are of interest, most of the data are irrelevant and are not used in fitting the distribution function. In order for the asymptotic results for the GEV distribution function to hold for block maxima, the blocks over which the maxima are taken must be very long. In practice, year-long blocks are generally sufficient and also lend themselves nicely to interpretation when considering return levels from Eq. (4). For the peaks-over-threshold (POT) models (i.e., the GP and PP models), a very high threshold must be chosen. In practice, a trade-off is made between employing a lower threshold that allows more data to be used in the fit, yielding a lower variance for the estimates, and employing a high enough threshold that the assumptions are reasonable, yielding lower bias.

Generally, the POT models make better use of the data as more data points are used in fitting the EVD’s to them. However, dependence in the values above the threshold must be considered in order to achieve accurate estimates of the standard errors for the parameter and return level estimators. When choosing the block maxima (BM) approach over the POT approach, it is possible that some blocks might not have any truly extreme values, which leads to utilizing nonextreme data in the fitting procedure. Conversely, it is also possible to have an extreme value at one time during the block and another more extreme value later in the same block. In such a case, one of the extreme values is discarded and not used in the fitting procedure. These issues are not present in the POT approaches. On the other hand, the BM approach is less likely to have issues with temporal dependence than the POT methods.

The main methods, and those available with “extRemes,” for estimating the EVD parameters include maximum likelihood (ML), generalized (or penalized) maximum likelihood (GML), L-moments, and Bayesian estimation. Some other moment and fast estimates are also utilized in the literature. This paper focuses solely on ML estimation.

Maximum-likelihood estimation

Assuming Z1, …, Zn are independent variables that each follow the same GEV distribution function, the log likelihood for the GEV distribution function parameters is given by
ℓ(μ, σ, ξ) = −n log σ
  − {(1 + 1/ξ) Σ_{i=1}^n log[1 + ξ(zi − μ)/σ] I(−∞, μ−σ/ξ](zi) + Σ_{i=1}^n [1 + ξ(zi − μ)/σ]^(−1/ξ) I(−∞, μ−σ/ξ](zi)} × I(−∞, 0)(ξ)   (6)
  − {Σ_{i=1}^n (zi − μ)/σ + Σ_{i=1}^n exp[−(zi − μ)/σ]} × I{0}(ξ)   (7)
  − {(1 + 1/ξ) Σ_{i=1}^n log[1 + ξ(zi − μ)/σ] I[μ−σ/ξ, ∞)(zi) + Σ_{i=1}^n [1 + ξ(zi − μ)/σ]^(−1/ξ) I[μ−σ/ξ, ∞)(zi)} × I(0, ∞)(ξ),   (8)
where the characteristic function I_A(x) = 1 if x ∈ A and zero otherwise. Equations (6) and (8) are identical apart from the characteristic functions. The ML estimator (MLE) is the combination of parameter values that maximizes the above log-likelihood function.
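To make the estimator concrete, the following minimal sketch (Python/SciPy, not the paper's extRemes code) maximizes the ξ ≠ 0 branch of the log likelihood by minimizing its negative, with the support constraint handled by an infinite penalty outside the allowed region; SciPy's genextreme uses the shape convention c = −ξ:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genextreme

# Simulated GEV sample; SciPy's shape convention is c = -xi
z = genextreme.rvs(c=-0.2, loc=30, scale=10, size=5000, random_state=0)

def gev_nllh(theta, z):
    """Negative GEV log likelihood (the xi != 0 branches of Eqs. (6)/(8))."""
    mu, sigma, xi = theta
    if sigma <= 0 or abs(xi) < 1e-8:
        return np.inf
    t = 1 + xi * (z - mu) / sigma
    if np.any(t <= 0):  # outside the support: likelihood is zero
        return np.inf
    return (len(z) * np.log(sigma)
            + (1 + 1 / xi) * np.sum(np.log(t))
            + np.sum(t ** (-1 / xi)))

res = minimize(gev_nllh, x0=[np.mean(z), np.std(z), 0.1],
               args=(z,), method="Nelder-Mead")
mu_hat, sigma_hat, xi_hat = res.x
print(mu_hat, sigma_hat, xi_hat)  # should land near 30, 10, 0.2
```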

Usually, this likelihood is written more simply with just Eq. (6) without the characteristic functions but with the caveat that ξ ≠ 0 and 1 + ξ(zi − μ)/σ > 0 for i = 1, …, n; the Gumbel case is then given separately. However, the more convoluted form given in Eqs. (6)–(8) emphasizes the fact that the GEV log likelihood involves characteristic functions that depend on the parameters. Subsequently, the regularity conditions that assure that the MLE is asymptotically normally distributed, and thus allow for construction of a fairly simple parametric CI, are not met. Bücher and Segers (2017) show that the asymptotic normality of the MLE holds for parametric families that are differentiable in quadratic mean whose supports depend on the parameter; they also show that the GEV family is not differentiable in quadratic mean unless ξ > −1/2. Smith (1985) had already shown that if ξ > −1/2, the regularity conditions for the MLE to be asymptotically normally distributed will hold. Similar results hold, of course, for the GP likelihood and PP characterization and so are omitted here.

While the MLE is perfectly valid as an estimator for the EVD parameters, clearly it is beneficial to have an alternative strategy for constructing CI’s. Even if ξ > −1/2, which is often the case, when interest is in return levels that exceed the temporal range of the data (e.g., estimating a 100-yr return level with only 20 years of data), the actual distribution functions for such return levels tend to be asymmetric. So, the assumption of approximate normality will not hold. The profile likelihood method is useful for finding CI’s in this context, but it is a difficult procedure to automate (cf. Gilleland and Katz 2016). Therefore, bootstrap methods are appealing for constructing CI’s for EVD parameters and return levels.

3. Bootstrap inference for extreme-value distribution functions

PI provides a thorough review of statistical inference as conducted via bootstrap methods, including the various CI’s calculated in what follows. Bootstrapping for extreme values is challenging for a couple of reasons. The most obvious is that if resampling is carried out using only the observed data, then more extreme values than those observed will not be included in any of the fitting procedures. A more subtle reason will be discussed in section 3c.
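The first difficulty is easy to demonstrate: when resampling n observations with replacement, the probability that the sample maximum appears in a given bootstrap resample is 1 − (1 − 1/n)^n ≈ 1 − e^(−1) ≈ 0.632, so the bootstrap distribution of the maximum piles up on a handful of the largest observed values. A quick sketch (Python, a generic illustration rather than the paper's R code, with an arbitrary heavy-tailed sample):

```python
import random

random.seed(1)
n, B = 100, 2000
x = [random.paretovariate(2.0) for _ in range(n)]  # a heavy-tailed sample
x_max = max(x)

# Fraction of iid bootstrap resamples whose maximum is the sample maximum;
# theoretically 1 - (1 - 1/n)**n, about 0.634 for n = 100
hits = sum(max(random.choices(x, k=n)) == x_max for _ in range(B))
print(hits / B, 1 - (1 - 1 / n) ** n)
```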

It is useful to use simulated data with known distributional forms in order to demonstrate the software. The following code shows how to draw a sample of size 100 from a GEV distribution function with parameters μ = 30, σ = 10, and ξ = 0.2, and the result is assigned to an object called zGEV:

set.seed(2112)
zGEV <- revd(100, loc = 30, scale = 10, shape = 0.2)

Similarly, to draw a random sample of threshold excesses from a GP distribution function:

set.seed(2112)
zGP <- revd(100, scale = 1, shape = 0.2, type = "GP")

Drawing from a point process is slightly more complicated because it is necessary to draw from the nonextreme part of the distribution function in addition to the extreme part. The following code is one way to obtain such a sample:

set.seed(2112)
zPP <- rnorm(100, mean = 2, sd = 1)
set.seed(2112)
tt <- rpois(100, lambda = 0.15)
zPP[tt > 0] <- revd(sum(tt > 0), shape = 0.2, type = "GP", threshold = 5)

a. TIB

For the stationary GEV distribution function, Schendel and Thongwichian (2015) performed a simulation test to demonstrate the utility of the TIB approach in comparison to another recommended, nonbootstrap, technique known as the profile-likelihood method. For this special case, they introduced a fast method for employing the TIB, thereby allowing them to perform such a test of the method. To have a flexible method that works even for nonstationary models, the functions available in “extRemes” do not make use of this fast algorithm, and therefore such a test is not possible without resorting to parallel computing, which would still require an excessive amount of resources to implement. The TIB approach using the interpolation method is performed as below. Note that a GEV must first be fit to the data using fevd from “extRemes”:

fit <- fevd(zGEV)
fit
plot(fit)
tibOut <- xtibber(fit, which.one = 1000,
  test.pars = seq(-0.01, 1.5, 0.005), B = 250, verbose = TRUE)
tibOut
plot(tibOut)

The above example finds an estimated 1000-yr return level (specified by which.one = 1000) of about 130.57 with 95% TIB CI (99.39, 219.19). The “true” value for the simulated series is about 179.03, which is inferred from the 1 − 1/1000 quantile of the GEV distribution function from which the data were sampled. This example represents the heavy-tail case of the GEV, which is the case that causes difficulty for finding accurate bootstrap CI’s. The test.pars argument specifies the sequence of values over which the nuisance parameter is to vary for the interpolation method. The default nuisance parameter is the shape parameter, so the sequence varies across all three types of GEV distribution functions. Because the 1000-yr return level is far out in the tail, this sequence needs to be relatively long. For shorter return periods, the sequence can likely be much shorter, thereby speeding up the algorithm.

The result of the plot command on the last line above is shown in Fig. 1. In the figure, an accurate interpolation method TIB interval should have a black circle near where both the leftmost vertical blue line crosses the top horizontal blue line (for the lower bound estimate) and where the rightmost vertical blue line crosses the bottom horizontal blue line. For this example, the interval appears to be reasonable and the “true” 1000-yr return level is within the 95% TIB CI.

Fig. 1.

Bootstrap estimated p values against estimated 1000-yr return levels for varying values of the nuisance parameter, in this case the GEV shape parameter ξ for the simulation zGEV from section 3 with “true” parameter values μ = 30, σ = 10, and ξ = 0.2. The “true” 1000-yr return level is approximately 179.03. Blue horizontal lines go through the desired α confidence levels (i.e., α/2 = 0.025 and 1 − α/2 = 0.975). The outer vertical blue lines are the estimated (1 − α/2) × 100% TIB CI estimates, which for this example are approximately (84.13, 219.51), and the center vertical blue line is the estimated 1000-yr return level (in this case, about 130.57). If the interpolation method is accurate, then there should be a black circle near where the outer two vertical blue lines intersect the horizontal blue lines. The leftmost (rightmost) vertical line should have a circle near the top (bottom) horizontal line.

Citation: Journal of Atmospheric and Oceanic Technology 37, 11; 10.1175/JTECH-D-20-0070.1

The interpolation method for the TIB CI is not the recommended approach. A better approach is to use the Robbins–Monro algorithm. The following code performs this algorithm on the same simulation as above:

tibRMout <- xtibber(fit, which.one = 1000, tib.method = "rm",
    test.pars = c(0.15, 1.25), B = 250, tol = 0.01, step.size = 0.005,
    verbose = TRUE)
tibRMout
plot(tibRMout)

This time, tib.method is specified to be "rm", and the test.pars argument now specifies the starting values for the lower and upper bounds, respectively. Additional arguments used here include tol, which states how close to the exact values of α/2 and 1 − α/2 the estimated p value must be before the algorithm stops, and step.size, which instructs how much to change the value of the nuisance parameter at each iteration of the algorithm. For this run of the code, the achieved estimated level for the lower bound is 0.028, which is very close to the desired 0.025; similarly, the upper bound achieves 0.972 instead of the desired 0.975. The estimated ≈95% CI is about (89.39, 210.01), which also contains the "true" 1000-yr return level of 179.03.

Figure 2 shows the result of the plot command for this example. It makes a similar plot to Fig. 1, but now the upper and lower bounds are estimated separately. Apart from generally being more accurate than the interpolation method, the Robbins–Monro algorithm reports the achieved levels, making it possible to quantitatively assess the accuracy of the resulting intervals, at least in terms of how close the algorithm came to achieving the desired α. With the interpolation method, only an inspection of the resulting plot gives any indication of the accuracy of the resulting interval.
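The roles of tol and step.size can be illustrated with a simplified, deterministic analogue of the search: step the nuisance parameter toward the target p value, shrinking the step on overshoot, and stop once the achieved p value is within tol of the target. The real xtibber search applies stochastic Robbins–Monro updates against noisy bootstrap p values; the p_hat function below is a hypothetical smooth stand-in:

```python
import math

def p_hat(theta):
    # Stand-in for the bootstrap p value obtained after refitting the model
    # with nuisance parameter theta.  Hypothetical: smooth, monotone, and
    # noise-free here, whereas the real quantity is a noisy bootstrap estimate.
    return 1.0 / (1.0 + math.exp(-10.0 * theta))

def search_bound(target, theta0, step_size=0.05, tol=0.01, max_iter=10000):
    """Step theta until |p_hat(theta) - target| < tol, halving the step
    whenever the search overshoots the target."""
    theta, step, prev_sign = theta0, step_size, 0
    for _ in range(max_iter):
        err = target - p_hat(theta)
        if abs(err) < tol:
            return theta
        sign = 1 if err > 0 else -1
        if prev_sign and sign != prev_sign:
            step /= 2.0  # overshot the target: shrink the step
        theta += sign * step
        prev_sign = sign
    return theta

lower = search_bound(0.025, theta0=-1.0)  # analogue of the lower-bound search
upper = search_bound(0.975, theta0=1.0)   # analogue of the upper-bound search
```

A tol that is too loose stops the search far from the nominal level, while a step.size that is too large relative to the sensitivity of the p value makes the search oscillate, which is one reason the method requires some trial and error.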

Fig. 2.

As in Fig. 1, but using the Robbins–Monro algorithm instead of the interpolation method. In this case, the black circles are replaced with "l" and "u" symbols denoting the lower and upper bounds, respectively. Accurate bounds should have a "u" symbol near where the rightmost vertical blue line crosses the bottom horizontal blue line and an "l" symbol near where the leftmost vertical blue line crosses the top horizontal blue line.


It is also possible to apply the TIB method in order to obtain CI’s for the parameters of the GEV. The code below shows how to do so for the shape parameter, which is difficult to estimate but arguably the most important one to pin down. This time, it is necessary to specify a different nuisance parameter because the default is the shape parameter. The which.one argument now specifies the number of the parameter in the order provided by strip below. In this case, the shape parameter is the third parameter:

strip(fit)
testShape <- xtibber(fit, type = "parameter", which.one = 3,
    nuisance = "scale", B = 250, test.pars = seq(5, 10, 0.1),
    verbose = TRUE)
testShape
plot(testShape)

For the example above, the 95% TIB CI is approximately (−0.09, 0.35), but note that results may vary because of the random draws involved. The interval includes zero but is weighted toward positive values, and it contains the "true" parameter value of 0.2. To change the confidence level, say to 99%, the alpha argument is used. The code below performs the same analysis as above, but for 99% CI's:

testShape99 <- xtibber(fit, type = "parameter", which.one = 3,
    nuisance = "scale", B = 250, test.pars = seq(5, 10, 0.1),
    alpha = 0.01, verbose = TRUE)
testShape99
plot(testShape99)

For this particular example (results not shown), the resulting 99% TIB CI is about (−0.18, 0.29), which also includes zero and extends considerably farther below zero. The above intervals use the interpolation method. To use the Robbins–Monro algorithm instead, the following code produces a 95% TIB CI:

testShapeRM <- xtibber(fit, type = "parameter", which.one = 3,
    tib.method = "rm", nuisance = "scale", B = 250,
    test.pars = c(7.9, 8.1), tol = 0.01, step.size = 0.005,
    verbose = TRUE)
testShapeRM
plot(testShapeRM)

For one instance of the above code, the estimated achieved confidence levels are close to the desired ones at 0.024 for the lower bound and 0.98 for the upper. The 95% TIB CI is estimated to be about (0.04, 0.33). Again, individual results will vary. Unlike the interpolation-method interval, this 95% TIB CI does not include zero, and the Robbins–Monro method should be considered the more accurate of the two. Indeed, the "true" shape parameter is 0.2, so ideally zero would not be in the interval.

For the GP distribution function, the same type of procedure can be carried out to obtain TIB CI's. A GP distribution function must first be fit to the data, and then a similar analysis is carried out. For this example, however, a less ambitious interval for the 100-yr return level is sought. Because of the way the GP data were simulated (i.e., all values are above the threshold of 0), there is effectively only one data point per year, at least in the way fevd handles time. Therefore, it is necessary to use the time.units argument in order to obtain reasonable empirical return level estimates for the return-level plot; the fitted model is not affected. The function fevd uses this argument to put all return periods on an annual scale. It takes a character string with a number to the left of a slash and the word day, month, or year to the right of it. The default assumes one data point per day, so if there were, say, eight data points per day, then there would be about 8 × 365.25 = 2922 points per year, and time.units = "2922/year", or more simply time.units = "8/day", would be used:

fitGP <- fevd(zGP, threshold = 0, type = "GP", time.units = "1/year")
fitGP
plot(fitGP)
tibGP <- xtibber(fitGP, which.one = 100, tib.method = "rm", B = 250,
    test.pars = c(0.15, 0.21), step.size = 0.005, tol = 0.01,
    verbose = TRUE)
tibGP
plot(tibGP)

For this example (output not shown), the estimated 95% TIB CI using the Robbins–Monro algorithm gives an estimated achieved confidence level close to the desired one with an interval of about (8.65, 72.06). The “true” value of the return level is about 11.27, which falls inside the interval.
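The time.units bookkeeping described above is simple arithmetic. The following Python sketch mirrors the "number/period" convention to show the conversion to data points per year (a hypothetical helper for illustration, not the extRemes parser):

```python
# Hypothetical helper mirroring the "number/period" convention of the
# time.units argument: convert a string like "8/day" into the implied
# number of data points per year (365.25 days per year, as in the text).
PERIODS_PER_YEAR = {"day": 365.25, "month": 12.0, "year": 1.0}

def points_per_year(time_units):
    count, period = time_units.split("/")
    return float(count) * PERIODS_PER_YEAR[period]
```

Under this convention "8/day" and "2922/year" describe the same sampling rate, which is why either form works in the example above.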

b. Nonstationary analysis

Many atmospheric applications involve nonstationarities. It is possible to account for nonstationarity by modeling one or more parameters of the EVD as functions of the covariates (cf. Katz et al. 2002; Katz 2013). The following example takes the annual maximum of summer daily minimum temperature (°F) from Phoenix Sky Harbor Airport (Fig. 3; cf. Gilleland and Katz 2016) and fits a stationary GEV distribution function to the data, and then fits a nonstationary GEV distribution function with a linear trend in the location parameter given by μ(year) = μ0 + μ1 × year:2

Fig. 3.

Annual maximum of summer daily minimum temperatures (°F) at Phoenix Sky Harbor airport.


data("Tphap")
phx <- blockmaxxer(Tphap, blocks = Tphap$Year, which = "MinT")
plot(1900 + phx$Year, phx$MinT, xlab = "Year",
    ylab = "Temperature (deg. F)", type = "h")
fitPhx0 <- fevd(MinT, data = phx)
fitPhx0
plot(fitPhx0)
fitPhx1 <- fevd(MinT, data = phx, location.fun = ~Year)
fitPhx1
plot(fitPhx1)
lr.test(fitPhx0, fitPhx1)
v1 <- make.qcov(fitPhx1, vals = list(mu1 = c(91, 120)))
return.level(fitPhx1, return.period = 100, qcov = v1)

The function lr.test conducts a likelihood-ratio test for the inclusion, in this case, of the linear trend in the location parameter. Here, the likelihood-ratio statistic is about 53, which is much larger than the 0.95 critical value of the χ2 distribution with one degree of freedom (about 3.84; p value ≈ 2.19 × 10−13), suggesting that the trend is important. The quantile–quantile (QQ) plots for each fit (not shown) suggest that the assumptions for the model with the linear trend are reasonable, where they might not be for the stationary model.
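The reported p value can be checked directly from the χ2 tail: with one degree of freedom, the survival function is P(X > x) = erfc(√(x/2)). A quick Python check using the statistic reported above:

```python
import math

def chisq1_sf(x):
    # Survival function of a chi-squared random variable with 1 degree of
    # freedom: P(X > x) = erfc(sqrt(x/2)).
    return math.erfc(math.sqrt(x / 2.0))

# The likelihood-ratio statistic reported in the text is about 53, far
# beyond the 0.95 critical value of roughly 3.84.
p_value = chisq1_sf(53.0)
```

Any statistic this far beyond the critical value yields a p value many orders of magnitude below 0.05, so the trend term is retained.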

The penultimate line of code above sets up a special matrix that allows for finding "effective" return levels. The value of 91 corresponds to 1991 (1900 + 91), one year after the end of the data record in 1990, and 120 corresponds to 2020 (1900 + 120).

Next, it is desired to find the 95% TIB CI for the nonstationary model for the “effective” 100-yr return level for the year 1991. They can be found for 2020 using an analogous approach, but for xtibber, only one can be found at a time. As always, the TIB method requires a certain amount of trial and error. Sometimes a good solution cannot be found easily, as is the situation here. In fact, for the nonstationary EVD, it is very difficult to obtain TIB CI’s:

# For the year 1991 (Year covariate equal to 91).
v1 <- make.qcov(fitPhx1, vals = list(mu1 = 91))
tibNSgev <- xtibber(fitPhx1, which.one = 100, B = 250,
    test.pars = seq(-0.21, 0.21, 0.005), qcov = v1, verbose = TRUE)
tibNSgev
plot(tibNSgev)

While the above code finds a reasonable lower bound for the "effective" 100-yr return level for 1991 of about 93°F, it does not find an upper bound. The following code shows how to obtain the parametric CI that assumes normality for the MLE of the 100-yr "effective" return level, which is not a reasonable assumption here. Nevertheless, the lower bound agrees closely with the lower bound found by the TIB method at about 93.5°F. The upper bound estimate is nearly 96°F:

ci(fitPhx1, qcov = v1)

Given the difficulties with the TIB method, a viable alternative is to perform a regular bootstrap procedure, but using parametric resampling from the fitted GEV distribution function. To perform this type of resampling, the pbooter function can be used, which requires two functions to be defined: one to calculate the statistics of interest and another to simulate data from the fitted nonstationary distribution function. For the latter, rextRemes provides a useful shortcut. The first function must have a minimum of two arguments: data and '…'. The second must have the arguments size and '…':

fitter <- function(data, ..., Fit) {
    data <- data.frame(MinT = data, Fit$cov.data)
    fit <- fevd(MinT, data = data, location.fun = ~Year)
    v <- make.qcov(fit, vals = list(mu1 = c(91, 120)))
    out <- c(return.level(fit, return.period = 100, qcov = v))
    return(out)
} # end of 'fitter' function.

simphx <- function(size, ..., Fit) {
    out <- c(rextRemes(Fit, size))
    return(out)
} # end of 'simphx' function.

pbootedPhx1 <- pbooter(x = phx$MinT, statistic = fitter, B = 250,
    rmodel = simphx, Fit = fitPhx1, verbose = TRUE)
pbootedPhx1
ci(pbootedPhx1, type = "perc")

The result from one implementation of the above code gives a 95% percentile bootstrap CI of about (93.24°, 96.31°F) for the 100-yr "effective" return level for the year 1991 and about (97.88°, 102.28°F) for the year 2020. For this example, the CI for the year 1991 "effective" return level is very close to the one found by the normal approximation (classical) interval, which suggests that the distribution function for this 100-yr return level is at least approximately normal.
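The pbooter pattern, a fitting function plus a model simulator, is generic, and the same loop can be sketched outside of R. The Python sketch below uses a toy normal model in place of the GEV machinery (the fit, simulate, and statistic stand-ins are hypothetical; fevd and rextRemes play these roles in the R code above):

```python
import random
import statistics

def parametric_bootstrap(data, fit, simulate, statistic, B=500, alpha=0.05):
    """Generic pbooter-style loop: fit the model once, then refit B samples
    simulated from the fitted model and take percentile bounds."""
    theta_hat = fit(data)
    reps = sorted(
        statistic(fit(simulate(theta_hat, len(data)))) for _ in range(B)
    )
    return reps[int(alpha / 2 * B)], reps[int((1 - alpha / 2) * B) - 1]

# Toy stand-in for the GEV machinery: a normal model whose "statistic" is
# the fitted mean.
fit = lambda x: (statistics.fmean(x), statistics.stdev(x))
simulate = lambda th, n: [random.gauss(th[0], th[1]) for _ in range(n)]
statistic = lambda th: th[0]

random.seed(1)
data = [random.gauss(10.0, 2.0) for _ in range(100)]
lo, hi = parametric_bootstrap(data, fit, simulate, statistic, B=400)
```

Because resamples come from the fitted model rather than the observed values, the procedure can produce values more extreme than any observation, which is precisely what makes parametric resampling attractive for extremes.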

Other methods for communicating risk for nonstationary extreme values are nicely summarized in Cooley (2013). These methods are beyond the scope of the present text but may be very useful. It is hoped that the above information about how to apply the parametric bootstrap will still be useful if other, more advanced techniques are preferred.

c. The m < n bootstrap

It is well known that the asymptotic results that support bootstrap sampling as a valid method for hypothesis testing and CI construction fail in the case of heavy-tail data. As mentioned in PI, the main assumption for the bootstrap method to be valid is that the relationship between θ̂* and θ̂ is the same as that between θ̂ and θ. Importantly, their scaled differences D*n = a*n(θ̂* − θ̂) and Dn = an(θ̂ − θ) share, in a particular sense, the same limiting distribution function. More precisely, P[limn→∞ FD*n(x) = G(x)] = 1 and P[limn→∞ FDn(x) = G(x)] = 1, where FD*n and FDn are the distribution functions for D*n and Dn, respectively, and G is their common limiting distribution function. Extreme values, in particular, can be problematic for bootstrap resampling because of the heavy-tail case, whose asymptotics require not only that the bootstrap sample size m → ∞ but also that m/n → 0 as n → ∞ (e.g., m = √n; cf. Bickel and Freedman 1981; Arcones and Giné 1989, 1991; Athreya 1987a,b; Giné and Zinn 1989; Knight 1989; Hall 1990; Kinateder 1992; LePage 1992; Deheuvels et al. 1993; Fukuchi 1994; Bickel et al. 1997; Feigin and Resnick 1997; Lee 1999; Shao and Dongsheng 1995; Resnick 2007). If the bootstrap sample is not reduced to a vanishing proportion of the original sample size, then the limit is random (S. I. Resnick 2020, personal communication).

For this section, three random samples are drawn from a heavy-tail distribution function, namely, the GEV distribution function with μ = 0, σ = 1, and ξ = 0.1, 0.5, and 1.5, respectively. The following R code produces the simulations:

set.seed(142)
zheavy0.1 <- revd(100, shape = 0.1)
set.seed(849)
zheavy0.5 <- revd(100, shape = 0.5)
set.seed(908)
zheavy1.5 <- revd(100, shape = 1.5)

Figure 4 displays the histograms for each simulation. The first simulation has a heavy tail, but it is not as "heavy" as in the other two cases. That is, the GEV distribution function has a heavy tail when the shape parameter is greater than zero, and the tail of the distribution function decays more slowly as the value of this parameter increases. The "true" mean can be derived for each of these distribution functions: it is given by [Γ(1 − 0.1) − 1]/0.1 ≈ 0.69 and [Γ(1 − 0.5) − 1]/0.5 ≈ 1.54 for the first two cases, and it is undefined whenever ξ ≥ 1.
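These values follow from the GEV mean formula μ + σ[Γ(1 − ξ) − 1]/ξ, which is finite only for shape parameter ξ < 1 (here μ = 0 and σ = 1). A quick Python check:

```python
import math

def gev_mean(mu, sigma, xi):
    # Mean of the GEV distribution; finite only for shape parameter xi < 1.
    if xi >= 1.0:
        raise ValueError("mean undefined for xi >= 1")
    return mu + sigma * (math.gamma(1.0 - xi) - 1.0) / xi

m1 = gev_mean(0.0, 1.0, 0.1)  # about 0.69
m2 = gev_mean(0.0, 1.0, 0.5)  # about 1.54
```

For the third simulation (ξ = 1.5) the formula has no finite value, matching the undefined-mean case discussed in the text.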

Fig. 4.

Histograms of simulated heavy-tail samples from GEV distribution functions with location parameters equal to zero, scale parameters equal to unity, and shape parameters of 0.1, 0.5, and 1.5, respectively. The theoretical means for the first two simulations are approximately 0.69 and 1.54; for the third, the mean is undefined. The sample means are approximately 0.64, 1.45, and 16.24.


To demonstrate how to perform an m < n bootstrap, suppose interest is in finding 95% CI's for the population mean of each of these samples. The sample estimates of the mean for the first two cases, about 0.64 and 1.45, are fairly close to the population means given above. While the population mean does not exist for the third sample, the sample mean can always be calculated from a sample of real values, and it is about 16.24 here. First, a function is needed to calculate the statistic of interest, which must take the arguments data and '…'. Here, the statistic of interest is the mean:

bootmean <- function(data, ...) {
    return(mean(data, ...))
} # end of 'bootmean' function.

Next, the bootstrap samples are made with the following commands. For each simulation, both the m = n and m = √n samples are found for comparison:

booted1 <- booter(zheavy0.1, bootmean, B = 500)
summary(booted1)
booted1m <- booter(zheavy0.1, bootmean, B = 500, rsize = 10)
summary(booted1m)
booted2 <- booter(zheavy0.5, bootmean, B = 500)
summary(booted2)
booted2m <- booter(zheavy0.5, bootmean, B = 500, rsize = 10)
summary(booted2m)
booted3 <- booter(zheavy1.5, bootmean, B = 500)
summary(booted3)
booted3m <- booter(zheavy1.5, bootmean, B = 500, rsize = 10)
summary(booted3m)

Note that it is the rsize argument in the call to booter that allows for changing the resample size. Once the bootstrap samples are found, it is a simple matter to find the bootstrap CI’s. Here, only the percentile method is used for brevity:

ci(booted1, type = "perc")
ci(booted1m, type = "perc")
ci(booted2, type = "perc")
ci(booted2m, type = "perc")
ci(booted3, type = "perc")
ci(booted3m, type = "perc")

Table 1 displays the results of the above commands. Although the seeds are set so that the reader can obtain the same original samples, no seed is set for these bootstrap results, so results may vary from what is displayed in the table. It is immediately clear that the m < n bootstrap with m = √n yields much wider intervals than the m = n counterpart, as should be expected because of the smaller resample size. The estimated bias is also larger in each case.
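The rsize mechanism amounts to drawing m rather than n values with replacement, and the resulting widening of the percentile interval can be sketched outside of R. The Python sketch below uses a toy Pareto sample in place of the GEV simulations, and its simple percentile indexing is an illustration rather than the booter/ci implementation:

```python
import math
import random
import statistics

def percentile_ci(data, stat, B=500, m=None, alpha=0.05):
    """m-out-of-n percentile bootstrap CI; m defaults to the full sample size."""
    m = m if m is not None else len(data)
    reps = sorted(stat(random.choices(data, k=m)) for _ in range(B))
    return reps[int(alpha / 2 * B)], reps[int((1 - alpha / 2) * B) - 1]

random.seed(42)
# Toy heavy-tailed sample: Pareto with tail index 1.5 (finite mean but
# infinite variance), standing in for the GEV simulations in the text.
z = [random.paretovariate(1.5) for _ in range(100)]

ci_full = percentile_ci(z, statistics.fmean)                             # m = n
ci_small = percentile_ci(z, statistics.fmean, m=int(math.sqrt(len(z))))  # m = 10
```

With m = √n, the bootstrap distribution of the mean is far more dispersed, so the m < n interval comes out much wider than its m = n counterpart, mirroring the behavior in Table 1.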

Table 1.

The 95% percentile bootstrap CI's for the three heavy-tail simulations from the GEV distribution function with μ = 0 and σ = 1.

While there is no remedy for the fact that the undefined-mean situation cannot be discerned, the CI's are so wide that they at least hint that something might be awry. If the GEV distribution function is suspected, then it can be fit to the data, and inferences about its shape parameter would reveal this possibility.3

4. Discussion and conclusions

This paper demonstrates how to use new bootstrap functions available in the R (R Core Team 2017) package "extRemes" (versions ≥ 2.0-12) for extreme-value analysis; the functions in "extRemes" are wrapper functions to bootstrap code from the "distillery" package. Bootstrap methods have been shown to be highly accurate in situations where usual assumptions for more standard intervals may not apply. However, the accuracy depends on utilizing the correct bootstrap methodology for the random sample to which it is applied.

While the “boot” package (Davison and Hinkley 1997; Canty and Ripley 2017) in R provides excellent utility for performing bootstrap resampling and estimating CI’s, the functions in “distillery” make certain operations easier; some of which are not possible with “boot.” For example, PI and this paper demonstrate how to perform a test-inversion bootstrap (TIB), which is currently not available in “boot,” and an m < n bootstrap, which is less straightforward to do with “boot.”

TIB interval results agree with previous work regarding their utility for analyzing extreme-value distribution functions, but it is found here that these methods may not be stable when fitting more complex (e.g., nonstationary) distribution functions. They are also fairly difficult to automate, as they often need to be rerun in order to find function arguments that allow the procedure to converge on an appropriately sized CI. Nevertheless, they represent a theoretically appealing approach, so their availability as a general tool in the "distillery" package might be useful to some research efforts.

Parametric bootstrap intervals are a good alternative for extreme-value applications, in part because it is possible to justifiably simulate values that are more extreme than those observed in the data. Here, it is shown how to apply this approach for the case of nonstationary peak-over-threshold data, which was previously unavailable in "extRemes."

Acknowledgments

Support for this manuscript was provided by the National Science Foundation (NSF) through the Regional Climate Uncertainty Program (RCUP) at the National Center for Atmospheric Research (NCAR). NCAR is sponsored by NSF and managed by the University Corporation for Atmospheric Research.

REFERENCES

  • Arcones, M. A., and E. Giné, 1989: The bootstrap of the mean with arbitrary bootstrap sample. Ann. Inst. Henri Poincaré, 25, 457–481.

  • Arcones, M. A., and E. Giné, 1991: Additions and corrections to "The bootstrap of the mean with arbitrary bootstrap sample." Ann. Inst. Henri Poincaré, 27, 583–595.

  • Athreya, K. B., 1987a: Bootstrap of the mean in the infinite variance case. Proc. First World Congress of the Bernoulli Society, Utrecht, Netherlands, Bernoulli Society, 95–98.

  • Athreya, K. B., 1987b: Bootstrap of the mean in the infinite variance case. Ann. Stat., 15, 724–731, https://doi.org/10.1214/aos/1176350371.

  • Bickel, P. J., and D. A. Freedman, 1981: Some asymptotic theory for the bootstrap. Ann. Stat., 9, 1196–1217, https://doi.org/10.1214/aos/1176345637.

  • Bickel, P. J., F. Götze, and W. R. van Zwet, 1997: Resampling fewer than n observations: Gains, losses, and remedies for losses. Stat. Sin., 7, 1–31.

  • Büecher, A., and J. Segers, 2017: On the maximum likelihood estimator for the generalized extreme-value distribution. Extremes, 20, 839–872, https://doi.org/10.1007/s10687-017-0292-6.

  • Canty, A., and B. Ripley, 2017: boot: Bootstrap R (S-Plus) Functions, version 1.3-20. R package, http://statwww.epfl.ch/davison/BMA/.

  • Cooley, D., 2013: Return periods and return levels under climate change. Extremes in a Changing Climate: Detection, Analysis and Uncertainty, Springer, 97–114.

  • Davison, A., and D. Hinkley, 1997: Bootstrap Methods and Their Application. Cambridge University Press, 582 pp.

  • Deheuvels, P., D. M. Mason, and G. R. Shorack, 1993: Some results on the influence of extremes on the bootstrap. Ann. Inst. Henri Poincaré Probab. Stat., 29, 83–103.

  • Fawcett, L., and D. Walshaw, 2012: Estimating return levels from serially dependent extremes. Environmetrics, 23, 272–283, https://doi.org/10.1002/env.2133.

  • Feigin, P., and S. I. Resnick, 1997: Linear programming estimators and bootstrapping for heavy-tailed phenomena. Adv. Appl. Probab., 29, 759–805, https://doi.org/10.2307/1428085.

  • Fukuchi, J.-I., 1994: Bootstrapping extremes of random variables. Ph.D. thesis, Iowa State University, 101 pp.

  • Gilleland, E., 2017: distillery: Method Functions for Confidence Intervals and to Distill Information from an Object, version 1.0-4. R package, https://www.ral.ucar.edu/staff/ericg.

  • Gilleland, E., 2020: Bootstrap methods for statistical inference. Part I: Comparative forecast verification for continuous variables. J. Atmos. Oceanic Technol., 36, 2117–2134, https://doi.org/10.1175/JTECH-D-20-0069.1.

  • Gilleland, E., and R. W. Katz, 2016: extRemes 2.0: An extreme value analysis package in R. J. Stat. Software, 72, 1–39, https://doi.org/10.18637/jss.v072.i08.

  • Gilleland, E., R. W. Katz, and P. Naveau, 2017: Quantifying the risk of extreme events under climate change. Chance, 30, 30–36, https://doi.org/10.1080/09332480.2017.1406757.

  • Giné, E., and J. Zinn, 1989: Necessary conditions for the bootstrap of the mean. Ann. Stat., 17, 684–691, https://doi.org/10.1214/aos/1176347134.

  • Hall, P., 1990: Asymptotic properties of the bootstrap for heavy-tailed distributions. Ann. Probab., 18, 1342–1360, https://doi.org/10.1214/aop/1176990748.

  • Heffernan, J. E., and J. A. Tawn, 2004: A conditional approach for multivariate extreme values (with discussion). J. Roy. Stat. Soc., 66B, 497–546, https://doi.org/10.1111/j.1467-9868.2004.02050.x.

  • Katz, R. W., 2013: Statistical methods for non-stationary extremes. Extremes in a Changing Climate: Detection, Analysis and Uncertainty, Springer, 15–37.

  • Katz, R. W., M. B. Parlange, and P. Naveau, 2002: Statistics of extremes in hydrology. Adv. Water Resour., 25, 1287–1304, https://doi.org/10.1016/S0309-1708(02)00056-8.

  • Kinateder, J. G., 1992: An invariance principle applicable to the bootstrap. Exploring the Limits of Bootstrap, Wiley Series in Probability and Mathematical Statistics, Wiley, 157–181.

  • Knight, K., 1989: On the bootstrap of the sample mean in the infinite variance case. Ann. Stat., 17, 1168–1175, https://doi.org/10.1214/aos/1176347262.

  • Kyselý, J., 2002: Comparison of extremes in GCM-simulated, downscaled and observed central-European temperature series. Climate Res., 20, 211–222, https://doi.org/10.3354/cr020211.

  • Lee, S., 1999: On a class of m out of n bootstrap confidence intervals. J. Roy. Stat. Soc., 61B, 901–911, https://doi.org/10.1111/1467-9868.00209.

  • LePage, R., 1992: Bootstrapping signs. Exploring the Limits of Bootstrap, Wiley Series in Probability and Mathematical Statistics, Wiley, 215–224.

  • R Core Team, 2017: R: A language and environment for statistical computing. R Foundation for Statistical Computing, https://www.R-project.org/.

  • Resnick, S. I., 2007: Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer Series in Operations Research and Financial Engineering, Springer, 404 pp.

  • Schendel, T., and R. Thongwichian, 2015: Flood frequency analysis: Confidence interval estimation by test inversion bootstrapping. Adv. Water Resour., 83, 1–9, https://doi.org/10.1016/j.advwatres.2015.05.004.

  • Schendel, T., and R. Thongwichian, 2017: Confidence intervals for return levels for the peaks-over-threshold approach. Adv. Water Resour., 99, 53–59, https://doi.org/10.1016/j.advwatres.2016.11.011.

  • Shao, J., and T. Dongsheng, 1995: The Jackknife and the Bootstrap. Springer, 123 pp.

  • Smith, R. L., 1985: Maximum likelihood estimation in a class of nonregular cases. Biometrika, 72, 67–90, https://doi.org/10.1093/biomet/72.1.67.
1

The GP distribution is a conditional distribution with the condition that X > u. Therefore, it does not have a location parameter.

2

When incorporating covariate information into the parameters of the GEV distribution function, it is important to first allow the location parameter to vary. If the inclusion of the covariate term is found to be significant, then the inclusion of covariates in the scale parameter can be tested. Generally, it is desirable not to include covariates in the shape parameter, but if there is reason to do so, then they should be included only after including them in the scale parameter. This issue arises for any location-scale family of distributions, not just extreme-value distribution functions. Consider, for example, a normal distribution, where the mean is a location parameter and the standard deviation is a scale parameter. Because the standard deviation involves deviations about the mean, incorrect specification of the mean (e.g., ignoring a trend in the mean) will be problematic for estimating the standard deviation. This issue is related to one that arises in polynomial regression, where it is well known that fitting a second-order polynomial without any linear term is problematic.

3

Because of the three types of tail behavior for the extreme-value distributions, one of which is the heavy-tail case, it is problematic to perform resampling from the data without accounting for the uncertainty in the type of tail. The parametric bootstrap avoids this issue.
