Probable maximum precipitation (PMP) is the key parameter used to estimate the probable maximum flood (PMF), both of which are important for dam safety and civil engineering purposes. The usual operational procedure for obtaining PMP values, which is based on a moisture maximization approach, produces a single PMP value without an estimate of its uncertainty. We therefore propose a probabilistic framework based on a bivariate extreme value distribution to aid in the interpretation of these PMP values. This 1) allows us to evaluate estimates from the operational procedure relative to an estimate of a plausible distribution of PMP values, 2) enables an evaluation of the uncertainty of these values, and 3) provides clarification of the impact of the assumption that a PMP event occurs under conditions of maximum moisture availability. Results based on a 50-yr Canadian Centre for Climate Modelling and Analysis Regional Climate Model (CanRCM4) simulation over North America reveal that operational PMP estimates are highly uncertain and suggest that the assumption that PMP events have maximum moisture availability may not be valid. Specifically, in the climate simulated by CanRCM4, the operational approach applied to 50-yr data records produces a value that is similar to the value that is obtained in our approach when assuming complete dependence between extreme precipitation efficiency and extreme precipitable water. In contrast, our results suggest weaker than complete dependence. Estimates from the operational approach are 15% larger on average over North America than those obtained when accounting for the dependence between precipitation efficiency and precipitable water extremes realistically. A difference of this magnitude may have serious implications in engineering design.
While we have developed an impressive ability to describe climate and hydrologic systems from both dynamic and thermodynamic perspectives, for practical purposes, we do not yet have the ability to analyze and describe the upper bounds of the intensity of many types of extremes based on physical reasoning. In the case of extreme precipitation, current knowledge of storm mechanisms remains insufficient to allow a precise evaluation of limiting values and very rare extreme precipitation. Nevertheless, estimates of such rare extremes are needed for engineering practice, for example, in dam spillway design. Probabilistic approaches using statistical frequency analysis offer a plausible alternative for estimating extremes for a given return period. These approaches involve the fitting of probability distribution models to recorded storm precipitation amounts and extrapolating the tails of these models to very low exceedance probabilities. These approaches have been criticized (Klemeš 1986, 1987, 2000), for example, on the basis that very long return periods (e.g., for 1000 or 10 000 years) can only be estimated from available 50- or 100-yr observational records with very high uncertainty, which may make these estimates unsuitable for engineering applications. Engineers therefore seek additional information to overcome the limitations of deterministic physical reasoning and probabilistic approaches. Hence, a rational concept called probable maximum precipitation (PMP) is commonly used to estimate a possible magnitude of an extreme having a very high return period that is judged to have a negligible risk of exceedance.
The World Meteorological Organization (WMO; WMO 1986), defines PMP as “the greatest depth of precipitation for a given duration meteorologically possible for a design watershed or a given storm area at a particular location at a particular time of year, with no allowance made for long-term climatic trends.” PMP is commonly used for estimating the probable maximum flood (PMF), which is defined as the largest flood that could occur at a given hydrological basin and is a key parameter used for the design of a given project with high safety requirements. Based on our hydrologic and meteorological knowledge, extremes are the results of many components interacting in a complex way. Typically, PMP is computed as a combination of the maximum of component values, where the rationale is that this combination is unlikely to be exceeded.
Theoretically, PMP is an unknown upper limit for precipitation, which could be very high relative to the largest extreme that might be experienced over a fixed period of time, or might even be unbounded, whereas operationally, PMP is a rational engineering solution (meaning not purely based on scientific knowledge) to provide a possible magnitude of extreme precipitation values that can be used by engineers as a practical upper limit where scientific knowledge does not provide the desired guidance. Hence, whether the theoretical upper limit exists or not, an operational PMP can be obtained by engineers to provide guidance for design decisions.
The operational PMP estimate must be clearly recognized for what it is and not be confused with what it is not, namely, a physical upper limit. Furthermore, WMO (2009) hinted that one must distinguish between the “theoretical PMP” and the “operational PMP” (Salas et al. 2014). In fact, when the operational and theoretical PMP concepts are confused, and when a rational concept is recognized as science, inconsistencies and methodological gaps arise, reducing its credibility and usefulness (Klemeš 1993). Unfortunately, this confusion has led several statistical hydrologists to consider the operational PMP concept to be one of the biggest failures in hydrology (Yevjevich 1968; Papalexiou and Koutsoyiannis 2006), despite its continued heavy use. In reality, this is not a failure; when current scientific knowledge does not allow engineers to make a decision, a rational approach coupled with careful judgement must be used and can be appropriate, but should be recognized as such and not as science (Klemeš 1993). Hence, one must distinguish between the theoretical and rational operational PMP concepts and recognize that the latter has an origin that is reasonable, but not necessarily scientific. Hereafter, to avoid any confusion, we will use the term PMP to refer to the “operational PMP.”
WMO (2009) describes a variety of methods to derive PMP estimates depending on the basin characteristics (size, location, and topographies), the amount and the quality of available data, and storm types producing extreme precipitation. Most of these methods involve comprehensive meteorological analysis, and only one is based on statistics, as was proposed by Hershfield (1961). As mentioned in WMO (2009), PMP calculations depend on data and should always be presented as approximations, since the value depends fundamentally on the amount and quality of the data available and the depth of analysis. Given the considerable uncertainties that may influence PMP estimates, providing a range of PMP values by evaluating the uncertainty of values, that is, the range of values that might be possible under equivalent conditions, rather than relying only on single point estimates is necessary and more suitable, but is often ignored in the literature. Only a few studies have dealt with this subject. For instance, Salas et al. (2014) provides an uncertainty estimate using a statistical method, where the expected value and variance of the calculated PMP value are obtained by assuming that the annual maximum of 24-h precipitation is Gumbel distributed. The expected value and variance estimates are used in combination with Chebyshev’s inequality to obtain probability bounds and risks. Salas and Salas (2016) extended this approach assuming a log-Gumbel rather than Gumbel distribution. Micovic et al. (2015) used an alternative approach based on an uncertainty analysis in which judgment is used to integrate the uncertainty of the different factors involved in computing PMP and showed that PMP should always be presented as a range of values to characterize the impacts of the significant uncertainties that are involved in the calculation.
While a variety of PMP calculation methods are described by WMO (2009), the so-called moisture maximization method remains the most representative and is commonly used by engineers (Rakhecha and Clark 1999). A PMP value obtained with the moisture maximization method may be considered a worst-case scenario estimate in which extreme precipitation efficiency (PE) and extreme precipitable water (PW) occur simultaneously. PW is defined as the depth of water that would be produced at a given location if all the water in the atmospheric column above that location was precipitated as rain, whereas PE is defined as the ratio of actual precipitation amount to the actual PW. Because extreme precipitation almost always involves moisture transport from elsewhere, PE is usually larger than one, with the result that PMP is usually larger than PW. While this is a rational operational approach for obtaining a PMP value, the PMP value that is obtained should not be interpreted as the physical upper limit. This is because the calculation involves two maximization operations, once for PW and again for PE, with a PMP value being obtained as the product of the two maxima. A physical interpretation would require an assumption that PMP events simultaneously have maximum moisture availability and maximum precipitation efficiency (Chen and Bradley 2006). This strong assumption may lead to overestimation of a likely upper bound if the PW and PE extremes are not fully dependent, even if maximization provides plausible, physically reasonable limits for PW and PE individually.
The fact that PMP calculation via moisture maximization involves the maxima of observed time series of PE and PW suggests the use of the extreme value theory, a well-developed statistical discipline (Coles et al. 2001). The probabilistic description of extremes through extreme value theory is generally developed through either the block maxima approach or the peaks-over-threshold approach. The former leads to the use of the generalized extreme value (GEV) distribution to describe the probability distribution of the intensity of block maxima, whereas the latter leads to the use of the generalized Pareto distribution to model excesses over a high threshold. Further details about the extreme value theory can be found in Coles et al. (2001). We consider the block maximum approach in this study since the maximum values of PE and PW over a defined period record can be considered to be the maxima of a series of annual maxima.
Accounting for the dependence structure of exceptional PE and PW events is of practical importance, as noted, and can be accomplished via bivariate extreme value analysis. In particular, a copula function can be used to extend univariate extreme value analysis to the bivariate case. Copula functions provide a way to describe the dependence structure independently of the marginal distributions and thus can use different marginal distributions at the same time without any transformations. The application of copulas in hydrology and climatology has grown rapidly during the past decade. Introductions to copula theory are provided in Joe (1997) and Nelsen (2007a), and a detailed review of the development and applications of copulas in hydrology, including frequency analysis, simulation, and geostatistical interpolation, can be found in Salvadori et al. (2007). In recent years, copula functions have been widely used to describe the dependence structure of climate variables and extremes (e.g., AghaKouchak et al. 2010a; Ben Alaya et al. 2014, 2016; Guerfi et al. 2015; Mao et al. 2015). In this study, an extreme value copula (Salvadori et al. 2007) is used to extend the univariate block maxima approach to the bivariate case.
The aim of this study is to propose a probabilistic framework for PMP estimation using the moisture maximization approach. The proposed approach takes advantage of probabilistic bivariate extreme value analysis to address the limitations of operational PMP estimates obtained via moisture maximization by 1) enabling assessment of the sensitivity of the PMP value to one particular observation, the maximum of the entire sample; 2) allowing an evaluation of the uncertainty and thus providing a range of PMP values; and 3) providing clarification of the impact of the assumption that a PMP event occurs under conditions of maximum moisture availability. The proposed approach is illustrated by using output from the Canadian Centre for Climate Modelling and Analysis Regional Climate Model (CanRCM4) to estimate PMP over North America. The remainder of the paper is organized as follows: the datasets and the proposed method are introduced in section 2. Results and discussions are presented in section 3, and conclusions are given in section 4.
2. Materials and methods
Physically based numerical atmospheric models developed over the past three decades play an important role in climate research. In the case of PMP estimation, numerical models are useful since they are able to simulate three-dimensional data representative of long periods. Several previous studies have used numerical climate models to study some aspects of PMP. Abbs (1999) employed a numerical model to evaluate some of the assumptions used in PMP estimation. Ohara et al. (2011) used a numerical model to estimate PMP for the American River watershed in California. Beauchamp et al. (2013) evaluated a warm season PMP estimate based on moisture maximization under recent past climate conditions and then applied it under a future projected climate using output from the Canadian Regional Climate Model (CRCM). In the same way, Rousseau et al. (2014) used CRCM output to develop a methodology to estimate PMP based on moisture maximization accounting for changing climate conditions for the southern region of the province of Quebec, Canada.
Output from the CanRCM4 regional climate model is used in this study over the period 1951–2000 and covering the North American region with 0.44° spatial horizontal resolution (155 × 130 grid points). A more detailed description of CanRCM4 is provided in Scinocca et al. (2016), von Salzen et al. (2013), and Diaconescu et al. (2015). CanRCM4 is a participant in the Coordinated Regional Climate Downscaling Experiment (CORDEX) framework (Giorgi et al. 2009) and is developed by the Canadian Centre for Climate Modelling and Analysis (CCCma) to make quantitative projections of future long-term climate change. The CanRCM4 simulation used in this study is driven by the Second Generation Canadian Earth System Model (CanESM2) and accounts for historical changes in anthropogenic and natural external forcing. From the numerous variables, available in the CCCma archives, we used total precipitation and precipitable water (vertically integrated water vapor through the atmospheric column), both at a 6-hourly temporal resolution.
For this study, we assume that the properties of the PW and PE extremes do not vary substantially over time; that is, stationarity is assumed. Nevertheless, it is recognized that climate change will alter climatic extremes and that the stationarity assumption will be increasingly difficult to justify as the climate continues to warm. A subsequent paper will therefore describe an extension of the proposed methodology to nonstationary situations and will apply the extension to projections of future nonstationary climates. The application of the method in this paper will consider a period, 1950–2000, in which there is still only relatively weak evidence of nonstationarity in the behavior of precipitation extremes, and thus for simplicity, this first application will continue to make the assumption of stationarity. In contrast to the method that will be introduced below, traditional PMP estimation methods, which interpret maxima of finite records as upper bounds, are not suitable for circumstances in which those bounds might change.
1) Operational PMP calculation using moisture maximization
Moisture maximization increases atmospheric moisture to an estimated possible upper limit for the time and location of the precipitation event. Maximized precipitation is determined for each precipitation event using the following equation:
where is the amount of PW in the atmospheric column at the time of the event, is the corresponding PE, and is the maximized PW over season of the actual event. Parameter is generally estimated as the maximum of historical values of over the current season from a sample that is at least 50 years in length, or as the value corresponding to a 100-yr return period for samples smaller than 50 years (WMO 1986). In this study, for the CanRCM4 1951–2000 climate is estimated at each grid box as the maximum simulated value of for the historical period 1951–2000 for the given season . The resulting operational PMP value corresponds to the greatest value of the maximized precipitation series over a chosen period of time (in this study over 50 years from 1951 to 2000) and corresponds to the maximum of , where is the maximum observed precipitation efficiency in season . Furthermore, practitioners often use storm transposition approaches (Foufoula‐Georgiou 1989) as a means of incorporating additional information about precipitation events from nearby locations. We exploit the gridded nature of climate model output to incorporate a simplified transposition step pooling the block maxima of precipitation efficiency using moving windows of grid boxes. The maximum of the nine values within a given grid box region is retained to compute PMP at the central grid box of this region. PMP can be calculated separately for each precipitation duration; in this study, a 6-h duration is considered based on accumulations archived at 0000, 0600, 12000, and 1800 UTC of each simulated day.
The operational moisture maximization approach can be considered a rational approach, even if the precise likelihood of exceedance of the calculated PMP value is not known, because it would be rational to consider it unlikely that and determined from 50-yr or longer records would occur simultaneously. Thus, the product could be viewed as overestimating the largest precipitation that might occur on such a time scale. Scientifically, however, it would be desirable to attempt to quantify the likelihood of exceedance and to understand the impact that dependence between PE and PW may have on PMP.
2) Probabilistic characterization of PMP
A suitable probabilistic framework for the estimation of probable maximum precipitation can be constructed to answer these questions and 1) enable an assessment of the sensitivity of the PMP value to one particular observation, the maximum of the entire sample; 2) allow an evaluation of the uncertainty and thus provide a range of PMP values; and 3) provide clarification of the impact of the assumption that a PMP event occurs under conditions of maximum moisture availability. In this section, we describe how a bivariate extreme value analysis can be used to achieve these ends.
The first objective is addressed by applying a block maximum approach separately to PE and PW using the GEV distribution. The series maxima that are used in the traditional approach can both be considered as the mth-order statistics of annual maxima, where is the length of the data record in years. To handle the second drawback related to the dependence between PE and PW, an extreme value copula function is employed to build the bivariate extreme value distribution by merging the two GEV models. Finally, the resulting bivariate GEV model is used to produce PMP estimates and quantify their uncertainty. Depending on the choice of copula, estimates will either reproduce the operational moisture maximization calculation or differ somewhat from that calculation. We will show that the former requires the assumption of an unrealistic degree of dependence between the annual extremes of PE and PW. Nevertheless, the proposed approach will provide uncertainty estimates in both cases. As a side note, it is worth noting that the proposed bivariate approach does not a priori assume that an upper bound of precipitation exists. In contrast, the approach will make it possible to evaluate whether the operational moisture maximization PMP calculation leads to an upper limit, and if not, to estimate a likelihood of exceedance.
The approach is based on treating the occurrence of extreme precipitation as a compound event, where the distribution of its extremes can be synthesized using the bivariate extreme value distribution of PW and PE. The block maximum approach, which is presented in the online supplemental material (Coles et al. 2001; Embrechts et al. 2013), is used to derive the individual distributions of PW and PE, whereas the extreme value copula function, which is presented in section 2b(2)(i), is used to describe their joint probabilistic behavior. These tools are subsequently applied as described in section 2b(2)(ii) to obtain a range of PMP values that correspond to a specified likelihood. All parameters of the bivariate model are estimated using the maximum likelihood method.
(i) Extreme value copula
While copula functions can be used to build multivariate distributions (see supplemental material regarding copula functions; Salvadori and De Michele 2004; Nelsen 2007b), the extension of univariate extreme value analysis to the bivariate case requires a particular family of copulas called extreme value copulas. Most of the literature available on extreme value copulas concentrates on the bivariate case. Indeed, higher dimensional copulas (of dimension ) are often modeled by pair-copula constructions based on bivariate copulas (Aas et al. 2009). A bivariate copula C is an extreme value copula if and only if (Genest and Segers 2009)
where is convex and satisfies . The function is known as the Pickands dependence function. The upper bound corresponds to the total independence copula , while the lower bound corresponds to the total dependence, or comonotone copula, . An important property of extreme value copulas is that, if are independent and identically distributed (iid) random pairs from an extreme value copula and and , the copula associated with the random pair is also . This property is called max stability. Conversely, max-stable copulas are extreme value copulas. Salvadori et al. (2007) provide details about extreme value copulas. Note that the component-wise annual maxima are unlikely to occur simultaneously within a series of 6-hourly paired observations, and hence, the max-stability assumption of the dependence structure is required to allow making inferences about the copula C using component-wise maxima.
There is no finite dimensional parametric family for the Pickands dependence function (Tawn 1988) to guide the choice of dependence function. Nevertheless, the existing literature presents various parametric and nonparametric dependence function estimators [see section 9.3 of Beirlant et al. (2006) for a review]. A classical nonparametric estimator of is that of Pickands (1981) (see supplemental material).
A potential difficulty with the Pickands estimator is that it may not be convex. Although a number of approaches have been proposed to ensure convexity, such methods often result in the joint distribution being singular and nondifferentiable. Since, in most environmental applications, singular distributions do not occur, parametric differentiable models are more suitable. In addition, for simulation purposes, most general sampling algorithms require the first and second derivatives of the function to be known (Ghoudi et al. 1998). Considering a parametric differentiable form of is therefore more practical since it avoids numerical nonparametric smoothing and leads to increased speed and accuracy of sampling algorithms. Nevertheless, nonparametric estimators are useful in illustrating the suitability of the parametric models. So, here we will use a parametric model, but for comparison purposes we will also discuss the nonparametric Pickands estimator and another nonparametric estimator that is described in the supplemental material (Capéraà et al. 1997; Falk and Reiss 2003, 2005).
We use the Gumbel copula as a parametric model, primarily because of its ease of implementation and the ease of simulation. In addition, the Gumbel copula is of particular interest since it is the only extreme value copula that belongs to the class of Archimedean copulas, which has a wide range of applications in practice (see supplemental material). The dependence function of the Gumbel copula is given by
where the parameter is estimated by maximum likelihood. Further, several studies (Yue 2001; Shiau 2003; Requena et al. 2013) have shown that the Gumbel copula performs well for modeling multivariate hydrological extreme events.
Dependence measures for extremes have received much attention in the literature (AghaKouchak et al. 2010b). The joint occurrence of extreme events can be measured by the so-called upper tail dependence (UTD) coefficient , which can be formulated in terms of Pickands dependence function using the following formula (Salvadori et al. 2007):
In the current work, we consider the estimator of the UTD coefficient derived from the Pickands estimator , which can be compared to the Gumbel copula estimator based on Eq. (3) and the maximum likelihood estimator of . For completeness, we also consider the estimator derived from the nonparametric estimator of Capéraà et al. (1997), which is described in the supplemental material (Frahm 2006; Serinaldi 2008; Requena et al. 2013).
(ii) PMP characterization using bivariate extreme values analysis
First, similar to the traditional operational PMP estimation described above, annual PE maxima are pooled using moving windows grid boxes as a simplified form of storm transposition. Then, for each grid box, we separately fit GEV distributions to the maxima of the annual maxima of PE from the surrounding gridbox region and to the annual maxima PW at the grid box. The Kolmogorov–Smirnov goodness-of-fit test (Stephens 1970; Kharin et al. 2007) is used to assess whether the GEV approximates the behavior of each of these two types of block maxima adequately. This is necessary because the GEV distribution is an asymptotic distribution that is obtained for blocks that increase in length without bound. In our case study, we have 6-hourly data and thus a block length for each season, which in most cases would be considered to be adequate. Nevertheless, the quality of the approximation depends on the rate of convergence to the GEV with increasing block size for the PDF from which individual 6-hourly or daily values are drawn, which may be slow for some distribution types (such as the Gaussian distribution; Leadbetter et al. 2012). In addition, serial correlation and the presence of an annual cycle may reduce the effective block size (Kharin and Zwiers 2005) in many hydrometeorological applications.
Once the univariate GEV distributions for PE and PW have been separately fitted and evaluated for each season, the extreme value copula can be used to connect the two GEV models to obtain an estimated joint distribution for seasonal extremes of PE and PW. The fitted bivariate distribution can then be used to estimate quantiles and return periods in the bivariate setting, including for levels beyond those that have been observed.
Our final objective is to derive information about the product of PE and PW, which implies obtaining the distribution of the product extremes of PE and PW for each season. Our approach estimates PMP by considering the greatest value of these products over a chosen period of time ( years). We therefore use a resampling approach, which is implemented as follows:
Draw four samples of m pairs of PE and PW seasonal maxima (one sample of m pairs for each season) from available data.
Fit the four bivariate GEV models to each of the four samples drawn in step 1, and hence one bivariate GEV model is available for each season.
Draw a sample of m independent pairs of PE and PW extremes from each of the four bivariate distributions fitted in step 2.
Compute the products of the simulated pairs of extreme PE and PW values drawn in step 3.
Keep the largest of the products computed in step 4; this value represents one realization of PMP estimated via moisture maximization based on an m-year sample. The use of four bivariate GEV models corresponding to each season ensures that the simulated PMP value is obtained using PW and PE values occurring within the same season, as it is commonly recommended in practice.
Repeat steps 1–5 to simulate the expected variation in PMP estimates that would occur in repeated analyses of independent m-year periods under the same hydroclimatic conditions.
The large sample of simulated PMP values obtained in this way (we generate samples of 1000 in our application) can then be used to estimate the properties of the PMP distribution (e.g., mean, mode, median, and quantiles). Note that for samples drawn from the bivariate distribution built with the comonotone copula, the maximization of products of annual PE and PW maxima is equivalent to the product of the series maxima for PE and PW and thus represents the traditional approach. Finally, recall that the proposed algorithm assumes stationarity. Nonstationary extensions will be discussed in a subsequent paper.
In addition to providing a probabilistic setting for the interpretation of PMP, the proposed approach offers more flexibility to designers than the operational moisture maximization calculation. Indeed, the algorithm above can be easily adapted to obtain the probability distribution of any order statistic, even those for periods longer than that represented by the available observed record. The mth-order statistics are only used here for comparison purposes with the operational approach.
3. Results and discussion
a. Bivariate extreme value distribution of PE and PW
Univariate GEV models are first fitted separately to the annual maxima of PE and PW at each CanRCM4 grid box and for each season. The Kolmogorov–Smirnov goodness-of-fit test is used at each grid box to assess the differences between the empirical and the fitted GEV cumulative distributions. These tests, which were performed at the 5% significance level, indicate that a GEV distribution reasonably approximates the distribution of annual extremes for both PE and PW over all CanRCM4 grid boxes and for each season.
The shape parameter of the GEV distribution governs the tail behavior of the distribution. The subfamilies defined by, , and correspond to the Gumbel, Fréchet, and Weibull families, respectively. The Gumbel distribution is unbounded, the Fréchet distribution has a lower bound, and the Weibull distribution has an upper bound. Figure 1 shows the estimated GEV parameters for each CanRCM4 grid box during the winter, for both PE and PW. During the winter, the PW distribution is dominantly Weibull, with ≈ −0.15 on average and at about 91% of locations. In contrast, annual maximum PE is most frequently Gumbel or Fréchet; the mean value of is approximately 0.03, and at about 40% of locations. Similar results regarding the shape parameter are observed for the other seasons (Figs. S1–S3 in the supplemental material show the estimated GEV parameters during the spring, summer, and autumn, respectively). In conclusion, empirical evidence through the extreme value theory suggests that PW is bounded whereas PE is unbounded in the upper tail. The finding that PW is bounded is consistent with the fact that PW considers a well-delineated part of the atmosphere, the atmospheric column above a grid box, at a fixed time. The finding that PE may not be bounded in the upper tail also seems reasonable, since this reflects the effect not just of precipitation removal from the column directly above the grid box, but also that of moisture convergence from across a potentially much larger region into the grid box.
Extreme precipitation will be bounded if both of its two components, PW and PE, are bounded. An estimate of the bound above for a given component in these circumstances is , where the shape parameter is negative. It follows that the theoretical upper limit (the theoretical PMP) of precipitation for a given season should not exceed the product of the PE and PW bounds when both shape parameters are negative. In our study, only 13% of grid boxes show simultaneously negative shape parameter estimates for both PE and PW during the winter, and for these cases the estimated bounds have an order of magnitude around 104 mm for a 6-h accumulation. Such very high magnitudes are impractical from an engineering perspective. They are also not rational from a meteorological perspective in the sense that values near such an upper bound, while within the support of the fitted distribution, would be exceptionally unlikely to occur, even over the lifetimes of very long-lived infrastructure.
We next proceed to estimating the extreme value copula for each CanRCM4 grid box using the procedure described in section 2. Traditional PMP estimates via moisture maximization are based on the assumption that extreme PE and PW occur simultaneously. In terms of extreme value copulas, this assumption would suggest that the comonotone copula should be the appropriate extreme value copula. Figure 2 shows the Pickands dependence function estimated using the two nonparametric estimators as well as the Gumbel copula for all grid points over North America during the winter (similar figures for each of the other seasons are presented in the supplemental material). Thin gray lines indicate dependence function estimates at an individual grid point. For each estimator, the red curves represent the means and selected quantiles of all gridbox dependence functions over North America, while the solid black lines correspond to the comonotone copula. The profile of the Pickands function appears to be symmetric and very similar for the three estimators. As we can see, the assumption that a PMP event has maximum moisture availability does not appear to be satisfied in practice in the climate simulated by CanRCM4. Indeed, examination of the Pickands dependence functions over North America demonstrates a significant departure of the dependence structure from that of the total dependence (comonotone) copula.
The dependence (UTD) coefficient is calculated for each CanRCM4 grid box and the corresponding maps obtained using the three estimators , , and , during the winter, are presented in Fig. 3 (similar figures for each of the other seasons are presented in the supplemental material). In all cases, UTD coefficient values over North America are generally very low and confirm the nonvalidity of the assumption of the simultaneous occurrence of PE and PW extremes. In addition, the three maps are very similar and do not exhibit strong spatial variation. Figure 4 compares Gumbel gridbox UTD coefficient estimates with the two other estimators. The estimates exhibit very strong similarity to the estimates (correlation r = 0.97) while the similarity with is not quite as strong (r = 0.8). It should be noted that because is derived from the , which is uniformly strongly consistent and asymptotically unbiased (Capéraà et al. 1997), it is preferred to comparing with , than comparing it with . In addition, has received much attention in the literature. For example, Frahm et al. (2005) carried out extensive simulations to compare three other estimators for and conclude that is preferred. Nevertheless, for simplicity, we will consider the extreme value Gumbel copula in this study.
b. Probable maximum precipitation results
It is desirable to quantify how taking into account the dependence structure between extreme PE and PW could affect PMP estimates. Thus, the algorithm proposed in section 2 was used to sample 1000 values in the PMP distribution at each CanRCM4 grid point using the extreme value Gumbel copula and the comonotone copula. Figure 5 shows maps of the mean PMP values for the CanRCM4 climate obtained using the two copulas and that obtained using the traditional single-value PMP estimate. As expected, our estimates based on the assumption that extreme PW occurs simultaneously with extreme PE will typically lead to larger PMP estimates (Figs. 5a,b) while taking a realistic dependence structure into account leads to somewhat smaller PMP estimates (Fig. 5c). Comparing between Figs. 5a and 5b and Fig. 5c suggests that the assumption of total dependence overestimates the mean PMP value for CanRCM4-simulated 6-hourly precipitation by an average of about 15% over North America. Since PMP should not be underestimated for safety reasons nor overestimated for economic reasons, the proposed approach takes advantage of a more realistic representation of the dependence structure during the moisture maximization step that should give a more reliable maximization of precipitation events. Moreover, it should also be noted that the PMP estimates based on the bivariate extreme value analysis exhibit substantially less spatial noise than the corresponding traditional estimates, indicating greater reliability and consistency between locations.
It is important to reiterate that PMP estimates, whether traditional or probabilistic, should not be interpreted as absolute upper precipitation bounds. These are numbers that are calculated on the basis of a finite record and are therefore subject to considerable sampling uncertainty. The probabilistic approach has the benefit of providing a quantification of sampling uncertainty in the PMP estimate within a defined statistical framework and thus an opportunity to quantify the likely range of PMP estimates that could arise due to sampling variability. Through further research, this framework could also be further developed to enable a quantification of the likelihood that an observed precipitation event might exceed an estimated PMP value over a defined period, taking into account both the uncertainty in the estimated PMP value and the stochastic nature of extreme precipitation. Figure 6 shows maps of the range of plausible PMP values corresponding to the 10th, 50th, and 90th percentiles using both the estimated Gumbel copula and the comonotone copula. As expected, PMP estimates based on 50 years of records have large uncertainties as indicated by the 80% confidence intervals shown in Fig. 6. To check whether bias in percentiles at a given level has a specific spatial pattern, Fig. 7 shows the maps of the ratios between PMP estimates via the Gumbel copula and the estimates via the comonotone copula at the 10%, 50%, and 90% percentile levels. There is some evidence to suggest that ratios are uniformly less than one for the smaller percentiles, and possibly less than one for all percentiles at higher elevations over western North America.
Next, we compare the empirical PMP distributions estimated from the 1000 sampled values with the traditional single-value PMP estimates. This can be achieved using the probability integral transform (PIT; Hamill 2001; Gneiting et al. 2007; Diebold and Mariano 1995):
where is the traditional single-value PMP estimate, and is the empirical cumulative distribution function. If the single-value PMP estimate is a random number with distribution , then will have a uniform distribution.
In this study, values are obtained for each CanRCM4 grid box by transforming the single-value PMP estimate at that grid box with the corresponding empirical distribution function for that location. Uniformity is usually assessed in an exploratory sense, and one way of doing this is by plotting the empirical CDF of the p values and comparing it with the CDF of the uniform distribution. Figure 8 illustrates that the impact of the use of different models for the dependence between PE and PW extremes could influence the resultant PMP distribution relative to that of the traditional estimates. The histogram of p values for the comonotone copula distribution shows that this PMP distribution is consistent with traditional single-value PMP estimates since the resultant p value histogram is close to being uniformly distributed. This demonstrates that the probabilistic model that assumes the simultaneous occurrence of extreme PE and PW is able to describe the uncertainty inherent in traditional single-value PMP estimates. On the other hand, there is a significant departure from the uniform distribution when the strong assumption of simultaneous occurrence of extreme PE and PW is relaxed. Indeed, the histogram clearly indicates that traditional single-value estimates tend to be larger than the median Gumbel copula-based estimate. This result is expected and confirms that the assumption of simultaneous occurrence leads to PMP overestimation.
Accounting for seasonality in the maximization step is common practice when computing PMP. To this end, the proposed bivariate GEV model was fitted separately for each season. Nevertheless, results of the 10%, 50%, and 90% of PMP percentiles using a single bivariate GEV model fitted to annual maxima of PE and PW do not differ greatly from results obtained by including seasonality (see Fig. S10). Moreover, the use of a single bivariate model fitted to the annual component-wise maxima can be helpful to simplify the analysis and makes the proposed approach more practical and possibly facilitates an extension of the proposed methodology to future nonstationary climate.
We have illustrated the proposed probabilistic approach to PMP estimation using historical change simulations produced with the CanRCM4 regional climate model. The application of PMP estimates derived from these simulations, and the use of projections of future PMP values based on CanRCM4 simulations under future forcing conditions, would require careful evaluation of the model and the derived PMP estimates that is beyond the scope of this paper but will be the subject of future research. Nevertheless, we briefly compare PMP estimates based on CanRCM4 and ERA-Interim reanalysis products. Figure 9 shows maps of the mean PMP estimates via the proposed probabilistic approach using CanRCM4 and the ERA-Interim reanalysis (at 0.44° horizontal spatial resolution interpolated from its 0.75° native resolution) calculated for the period (1979–2005). Although the CanRCM4 is able to well represent the spatial pattern of the PMP, it exhibits a positive bias (~16% on average) relative to ERA-Interim over North America. While the comparison with ERA-Interim is far from perfect given that precipitation is considered to be a type-C reanalysis variable (i.e., only weakly constrained by observations; Kalnay et al. 1996), the results strongly suggest that the further study of the probabilistic PMP estimates is warranted and that RCMs may provide a path toward projecting future change in PMP.
Our use of component-wise seasonal PE and PW maxima can be considered as a stepping stone toward a stochastic approach that restricts the analysis to high PE values that coincide with high absolution precipitation amounts and is therefore closer to an approach that is often used by practitioners. A challenge, however, is that asymptotic distribution of extreme high PE values within a block that is restricted to high absolute rainfall accumulation events may not be necessarily be GEV, and thus the bivariate GEV model is not applicable. Heffernan and Tawn (2004) proposed a conditional multivariate extreme value model that may be suitable in such instances, but its application to PMP is not straightforward, and the interpretation of findings, particularly in a nonstationary climate, may be challenging. Implementation would require 1) constructing an asymptotic PE model conditional on extreme precipitation and 2) combining this conditional PE distribution with a univariate extreme value model of PW. Despite the potential interpretation challenges, we plan to attempt such an approach as the next step in the development of a probabilistic framework for PMP estimation. In the interim, the approach considered here, which is based on seasonal elementwise maximization, serves as a stepping stone toward a more comprehensive probabilistic framework for PMP estimation.
Storm transposition concepts have been applied in a number of different ways in traditional single-value PMP estimation approaches (Foufoula‐Georgiou 1989). Generally, these methods access additional information from nearby locations within homogeneous areas, leading to an increase in the effective sample size for a given record period of years, which should reduce sampling errors (Alexander 1963). In the current analysis, the PE maxima over moving windows are used as a simplified approach. As expected, this marginally increases the dependence between PE and PW in the bivariate extreme value analysis relative to that using both PE and PW from the same location (see Fig. S11). Nevertheless, the dependence remains far weaker than the total dependence assumption that is implicit in the usual operational PMP calculation. Given the size of the domain considered in this study, a less simplistic approach to storm transposition would involve the implementation of automated statistical procedures for identifying homogeneous regions. This would induce an additional, complex layer of statistical uncertainty reflective of bias-variance trade-offs that are often encountered when trying to select between statistical models of lesser or greater complexity.
The proposed probabilistic method for PMP estimation, as well as providing for the explicit representation of the dependence between PE and PW extremes and allowing evaluation of the uncertainty inherent in PMP estimates, also permits the estimation of PMP for periods longer than the data record period, which is not possible using traditional approaches (see Fig. S12, where PMP estimates are provided over a period of 100 years based on our 50-yr data record). The probabilistic approach also naturally allows the determination of an exceedance probability that, in effect, quantifies what is meant by “probable.” To finish, even if we have the knowledge required to describe key facets of climate and hydrologic systems from first principles, for practical purposes, we do not have the ability to analyze and describe the upper bounds of the intensity of many types of extremes based on physical reasoning. In the case of extreme precipitation, current knowledge of storm mechanisms remains insufficient to allow a precise evaluation of limiting values. In a practical sense, PMP is not deterministically predictable, and thus probabilistic approaches are useful to evaluate the uncertainties. Despite all the considerable uncertainties that may influence PMP estimates using moisture maximization, PMP estimates are still presented as single values. While this is rational from a practical perspective, it is useful to be able to quantify both the uncertainty of these estimates in terms of their expected variation in repeated calculation under statistically equivalent conditions and to be able to estimate the likelihood of future exceedance of the estimated value. This is why this study attempts to take advantage of recent development in probabilistic extreme value analysis to explore the uncertainty and to provide ranges of PMP values with a known likelihood of coverage. This should ultimately lead to more reliable information for design purposes.
We gratefully acknowledge Dr. Kharin Slava from Environment and Climate Change Canada for providing the model output used in this work, which is available from the Canadian Centre for Climate Modelling and Analysis (CCCma) upon request.
Supplemental information related to this paper is available at the Journals Online website: https://doi.org/10.1175/JHM-D-17-0110.s1.