Trends in the extremes of environments associated with severe US thunderstorms

Severe thunderstorms can have devastating impacts. Concurrently high values of convective available potential energy (CAPE) and storm relative helicity (SRH) are known to be conducive to severe weather, so high values of PROD=$\sqrt{\mathrm{CAPE}} \times$SRH have been used to indicate high risk of severe thunderstorms. We consider the extreme values of these three variables for a large area of the contiguous US over the period 1979-2015, and use extreme-value theory and a multiple testing procedure to show that there is a significant time trend in the extremes for PROD maxima in April, May and August, for CAPE maxima in April, May and June, and for maxima of SRH in April and May. These observed increases in CAPE are also relevant for rainfall extremes and are expected in a warmer climate, but have not previously been reported. Moreover, we show that the El Ni\~no-Southern Oscillation explains variation in the extremes of PROD and SRH in February. Our results suggest that the risk from severe thunderstorms in April and May is increasing in parts of the US where it was already high, and that the risk from storms in February tends to be higher over the main part of the region during La Ni\~na years. Our results differ from those obtained in earlier studies using extreme-value techniques to analyze a quantity similar to PROD.


Introduction
Annual losses from severe thunderstorms in the US have exceeded $10 billion in recent years. 1 In addition to economic losses, 2011 was marked by 552 deaths caused by torna-does. These economic and human impacts are a strong motivation to study how and why US thunderstorm activity varies from year to year and region to region. Two important aspects are trends potentially related to climate change or multi-decadal variability, and modulation by the El Niño-Southern Oscillation (ENSO). However, inadequacies in the length and quality of the thunderstorm data record present substantial challenges to addressing these questions directly (Verbout et al., 2006;Edwards et al., 2018).
In the US, a severe thunderstorm is defined to be one that produces a tornado, hail greater than one inch in diameter, or wind gusts in excess of 50 kts. Supercell storms are responsible for a large fraction of severe thunderstorm reports (e.g., 79% of tornadoes according to Trapp et al. (2005)), even though only about 10% of thunderstorms are supercells (Doswell III, 2015), and a key element in forecasting severe thunderstorms is the prediction of where and when supercells will occur (Corfidi, 2017). A supercell is a thunderstorm with a deep, long-lived rotating updraft (mesocyclone). The presence of buoyancy, i.e., convective available potential energy (CAPE), and deep-layer vertical wind shear are important determinants for supercell development. In addition to the magnitude of the vertical shear, the angle between surface and upper-level winds is important for mesocyclone development and persistence. A key quantity is atmospheric helicity, which is computed relative to storm motion and is proportional to vertical wind shear and the amount of wind direction turning from the surface to upper levels (often 0-3 km).
Several recent studies of US tornado reports have concluded that annual numbers of reliably observed tornadoes, i.e., those rated E/EF1 and greater, show slight but statistically insignificant trends downward over time (Brooks et al., 2014;Elsner et al., 2015), whereas measures of tornado outbreaks or clusters show upward trends (Brooks et al., 2014;Elsner et al., 2015;Tippett et al., 2016). Changes in regional tornado activity have also been reported (Agee et al., 2016;Gensini and Brooks, 2018), but there is less evidence for changes in hail and damaging straight-line wind, perhaps due to the poorer quality of the relevant databases.
In view of the limitations of the historical storm record, a valuable alternative is the analysis of meteorological environments associated with severe thunderstorms. As mentioned above, severe thunderstorms, especially supercell storms, are more likely in the presence of high values of CAPE and of certain measures of vertical wind shear (see, e.g., Brooks et al., 2003;Brooks, 2013) such as storm relative helicity (SRH). Weather forecasters have routinely used such quantities for two decades to interpret observations and the output of numerical weather prediction models (Johns et al., 1993;Rasmussen and Blanchard, 1998;Doswell III et al., 1996), and they are also useful in climatological studies, especially in areas outside the US without extensive historical reports (Brooks et al., 2003). The environmental approach can also provide an indication of expected severe thunderstorm activity in a warmer climate based on climate projections that do not resolve thunderstorms explicitly (Trapp et al., 2009;Diffenbaugh et al., 2013). On time-scales between weather forecasts and climate projections, this approach has provided a clearer picture of how ENSO modulates US hail and tornado activity Lepore et al., 2017).
However, there are notable gaps in previous statistical studies of environments associated with severe thunderstorms. For instance, relationships with ENSO were diagnosed based on monthly averages, which are at best indirect proxies for behaviour on the timescale of weather. Similarly Gensini and Brooks (2018) computed monthly accumulations of daily maxima of a significant tornado parameter. Tippett et al. (2016) used submonthly environmental data but aggregated the results on an annual and US-wide basis. These gaps motivate the present work, which focuses on extremes of the environmental values rather than on monthly averages, and presents results that are spatially and temporally resolved. The framework that we use is statistical extreme-value theory. Gilleland et al. (2013) apply the conditional extreme-value framework of Heffernan and Tawn (2004) to the product WS×W max , where WS is a measure of wind shear and W max = √ 2 × CAPE, by conditioning on the 75th percentile of that variable computed across the spatial domain. This approach has the advantage of allowing the study of real spatial patterns under severe conditions, as opposed to approaches looking at pointwise maxima. They show some temporal variations in the mean simulated values from their model. Mannshardt and Gilleland (2013) perform an unconditional univariate analysis in which they fit the generalized extreme-value (GEV) distribution to the annual maxima of WS×W max and establish the existence of a time trend in the GEV location parameter. Heaton et al. (2011) consider three Bayesian hierarchical extreme-value models based on exceedances over a high threshold for WS×W max , their third model being based on a Poisson point process with a yearly time trend. Neither paper clarifies whether this trend is attributable to both CAPE and WS or only to one of them. Moreover, both articles consider trends in annual quantities and thus cannot detect month-specific features, and they do not account for multiple testing, though this issue is briefly addressed in Gilleland et al. (2008). Finally, they consider only time as a covariate.
We propose to address some of the gaps left by the papers mentioned in the previous paragraph. Our study covers a large part of the contiguous US for individual months from 1979 to 2015. We separately consider CAPE, SRH (0-3 km) and the combined variable PROD= √ CAPE×SRH. To motivate our use of PROD, we consider the discriminant line defined in Brooks et al. (2003, Equation (1)), which is one of the first thresholds used to distinguish low and high likelihoods of severe thunderstorm occurrence using a function of CAPE and vertical shear. This equation can be rewritten as S6 × CAPE 0.62 = 18.60, where S6 is the 0-6 km shear. Replacing S6 with 0-3 km SRH and approximating the power 0.62 by 0.5 leads to a discriminant line of the form SRH × √ CAPE = c, i.e., PROD = c, where c is a real constant, and shows that values of PROD can be expected to be indicative of high risk of severe thunderstorms. PROD has already been used as a proxy for severe thunderstorms in several studies (e.g., Tippett et al., 2016) and the plot of Figure 1 in Brooks et al. (2003) is little changed by replacing S6 with 0-3 km SRH (not shown). More generally, the product of CAPE and two shear-related variables (different or not), or equivalently its square root, is commonly used as an indicator of the likelihood of severe thunderstorm occurrence. For instance, the significant tornado parameter (STP) and the supercell composite parameter (SCP) involve the product of CAPE, S6 and 0-1 km SRH, and the product of CAPE, S6 and 0-3 km SRH, respectively (e.g., Thompson et al., 2003).
To ensure the soundness of our results we carefully check the suitability of the GEV and the use of time and ENSO as explanatory variables in its location parameter, and we account for multiple testing by implementing the false discovery rate procedure of Benjamini and Hochberg (1995). As stated in Gilleland et al. (2013, Section 1), in addition to studying PROD, it is insightful to consider its components separately. Furthermore, accounting for multiple testing is essential when performing many simultaneous tests, as highlighted by Gilleland et al. (2013, Section 4), though they do not apply an adjustment for it.
We find a significant time trend in the location parameter of the GEV for PROD maxima in April, May and August (and to a lesser extent in June and December), in CAPE maxima in April, May and June (and to a lesser extent in August, November and January), and in SRH maxima in May (and to a lesser extent in April). The trends in CAPE maxima are striking, because CAPE is expected to increase in a warming climate (Del Genio et al., 2007;Van Klooster and Roebber, 2009) and are relevant to rainfall extremes (Lepore et al., 2015), but have not previously been observed over the US. April and May are important months for PROD, as severe thunderstorms are frequent at this period. The corresponding time slope is positive in regions of the US where severe thunderstorms are already common, which may have implications for risk assessment and management. Our study also reveals that ENSO can explain variation in the location parameter of the GEV for PROD and SRH maxima in February. The corresponding slope is negative over most of the region we consider, possibly suggesting an increased risk of high storm impacts in February during La Niña years. Our results differ from those of Heaton et al. (2011), Mannshardt and Gilleland (2013) and Gilleland et al. (2013), but are fairly consistent with those obtained by Gensini and Brooks (2018), who inter alia consider the numbers of tornado reports.
The remainder of the paper is organized as follows. Section 2 presents the data and a brief exploratory analysis. We describe our statistical approach and demonstrate its relevance in Section 3. Section 4 details our main results, and Section 5 summarises our findings and discusses them.

Data and exploratory analysis
The data we investigate consist of 3-hourly time-series of 0-180 hPa convective potential energy (CAPE, Jkg −1 ) and 0-3 km storm relative helicity (SRH, m 2 s −2 ) from 1 January 1979 at 00:00 to 31 December 2015 at 21:00. The region covered is a rectangle over the contiguous US from −110 • to −80 • longitude and 30 • to 50 • latitude and the resolution is 1 • longitude and 1 • latitude. These data constitute a coarse version of reanalysis data from the North American Regional Reanalysis (NARR); the original resolution is 32 km longitude and 32 km latitude (see, e.g., Mesinger et al., 2006). The region contains 651 grid points, with no data available for 32 grid points over the sea or lakes. Using these time series, we build 3-hourly time series of PROD= √ CAPE×SRH, measured in m 3 s −3 . As a physical covariate we use monthly values of the NINO 3.4 index ( • C) from 1979 to 2015, taken from the ERSSTv5 data set available on the NOAA Climate Prediction Center website. Figure 1 shows the empirical pointwise probabilities that CAPE and SRH exceed thresholds corresponding to roughly the 90 th percentile of each variable across the entire region. There is a clear North-South gradient for CAPE probabilities, while the regional spatial pattern for SRH suggests that the high values cluster towards the centre of the region. Figure 2 shows an increase in the exceedance probabilities for PROD at many grid points over the decades; a similar result is visible for SRH, but less so for CAPE. This increase is of interest for risk assessment, especially in regions with a high risk of severe thunderstorms. Figure 2 strongly suggests the presence of a temporal trend in the maxima, but there seems to be no geographical shift, notwithstanding the results of Gilleland et al. (2013). The top left panel of Figure 3 shows a positive correlation between PROD April maxima and time for many grid points, and the middle panels show a positive linear time trend for April maxima of PROD, CAPE and SRH in the subregion indicated. The top right panel shows strong negative correlation between PROD February maxima and ENSO at many grid points, while the scatter-plots in the bottom panels show a roughly linear negative trend for all variables. These analyses underscore the need to incorporate ENSO into our statistical modelling of maxima.

Modelling of maxima
Risk assessment entails the estimation of return levels associated with very high return periods and of the probabilities of observing events so extreme that they have never occurred before. Extreme-value theory provides a solid framework for the extrapolation needed to perform these tasks for the maxima of PROD, CAPE and SRH. Here we present the statistical background to the results in Section 4; for further explanation and references see Coles (2001) or Davison and Huser (2015).
Let M n denote the maximum of the independent and identically distributed random variables X 1 , . . . , X n . The extremal types theorem states that if there exist sequences {a n } > 0 and {b n } ∈ R such that (M n −b n )/a n has a non-degenerate limiting distribution as n → ∞, then this must be a generalized extreme-value (GEV) distribution, where ξ and η are real-valued, τ > 0 and, for any real a, a + = max{a, 0}. This implies that if n is large enough, we may approximate the distribution of M n by for suitably chosen η, τ and ξ, which are location, scale and shape parameters. The latter defines the type of the distribution: ξ > 0, ξ < 0 and ξ = 0 correspond to the Fréchet, Weibull and Gumbel types and allow quite different statistical behaviours, with the first giving a heavy upper tail with polynomial decay, the second modelling bounded variables, and the third an intermediate case, unbounded with an exponentially-decaying upper tail. The GEV approximation for maxima remains valid if the variables are dependent, provided that distant extremes are "nearly independent" (more formally, Leadbetter's D(u n ) condition is satisfied). We shall see below that this appears to be the case for our time series, so it is plausible that (1) applies.
The results above provide a natural model for maxima of stationary sequences. To apply this model we split the data into blocks of equal lengths and compute the maximum of each block. Assume that we have T blocks of length n and let M (1) n , . . . , M (T ) n denote the corresponding maxima. If n is large enough, the distribution of the M (t) n is approximately (1), upon which inference can be based; this is the so-called block maximum method. As noted in Section 2, PROD, CAPE and SRH maxima exhibit a time trend and/or a relation with ENSO for some months, and we can allow the GEV parameters to depend upon these variables. Figure 4 and results in Section 4 show that the temporal or ENSO effects only appear for certain months. For instance, time trends for PROD, CAPE and SRH are mainly present in April and May, April to June and April and May, respectively. We therefore choose our blocks to be the months and study each month separately, fitting the models and M (t) n ∼ GEV ηen(t),τen,ξen , η en (t) = η 0,en + η 1,en ENSO t , t = 1, . . . , T, where η 0,ti , η 1,ti , η 0,en , η 1,en , ξ ti and ξ en are real-valued, τ ti and τ en are positive, ENSO t is The top panels show the correlation map with time (in years from 1 to 37) for PROD April maxima (left) and the correlation map with ENSO for PROD February maxima (right). The middle and bottom panels display PROD (left), CAPE (centre) and SRH (right) analyses on a subregion indicated by the black rectangle drawn on the correlation maps. The middle panels show the region-averaged monthly maxima time series across all 444 months in light grey, the region-averaged April maxima time series in black and its 95% confidence interval bounds indicated by the red shaded region. Every point in the time series is the averaged maxima across all grid points in the subregion indicated before, for a particular month and a particular year. The bottom panels show scatter-plots of the region-averaged February maxima with ENSO, along with the 95% confidence interval bounds at each point indicated by the whiskers. The black line represents the best fitted local regression trend estimate, with its 95% confidence interval bounds indicated by the shaded blue region.  1979-1987, 1988-1996, 1997-2005 and 2006-2015, respectively. the value of ENSO in that month for year t, and n equals 224, 232, 240 or 248, depending on the number of days in the month, as we have eight observations per day. Figure 3 suggests that effects of time and ENSO on maxima are roughly linear and impact the location parameter η only, so we consider constant scale and shape parameters; it is generally inappropriate to allow the shape parameter depend on a covariate owing to the large uncertainty of its estimate. The time trend induces non-stationarity between the blocks (i.e., across years) but does not violate the within-block stationarity assumption; see below. Figure 4 suggests that the time trend does not stem from a shift of seasonality.
We compute the monthly maximum for each month and a given grid point and thereby obtain the maxima M for January, say. We then fit the models (2) and (3) by numerical maximum likelihood estimation for each month and grid point.
Recall that, provided the block size n is large enough, within-block stationarity and the D(u n ) condition ensure the validity of (1) and hence allow us to consider the models (2) and (3). To check the plausibility of these two properties, we considered the 3-hourly time series of PROD, CAPE and SRH at 50 representative grid points. For each block (associated with a triplet grid point-month-year), we fitted several autoregressive-moving average (ARMA) processes to the corresponding time series, chose the fit that minimized the Akaike information criterion (AIC), and used a Box-Pierce procedure to assess the independence of the corresponding residuals; we found no systematic departure from independence or stationarity. Often the residual distribution appeared to lie in the Fréchet or Gumbel maximum-domains of attraction, and Embrechts et al. (1997, Section 5.5) show that in such cases convergence of the maxima to the GEV occurs even for ARMA processes. Hence the time series of data within the months seem to satisfy both stationarity and the D(u n ) condition. Choosing the months as blocks thus appears reasonable, as is confirmed by our analysis in the following section. On the other hand, choosing the seasons or years as blocks would mask many interesting features, and the sample size associated with day-or week-long blocks is too low for the GEV approximation (1) to be reasonable.

Assessment of GEV fit
At each grid point i and month j, we fit the GEV to the monthly maxima, as described in Section 3.1, resulting in location, scale and shape parameter estimatesη i,j ,τ i,j andξ i,j . We use the Kolmogorov-Smirnov test to assess the distributional proximity between this GEV and the empirical distribution of the 37 observed monthly maxima. For PROD, CAPE and SRH, in most months, the fit appears acceptable at the 5% level at all grid points. These good in-sample fits of the GEV for all variables are confirmed by the quantile-quantile (QQ) plots, which are displayed for one grid point in Figure 5. However, these results do not take into account the fitting of the GEV to the data, which systematically decreases the values of the Kolmogorov-Smirnov statistic. In order to make an informal allowance for this, we perform the following procedure for each grid point i and month j: 1. fit the GEV using the pooled observations from the eight grid points nearest to i to obtainη po i ,j ,τ po i ,j andξ po i ,j ; 2. use a Kolmogorov-Smirnov test to check the agreement between the "out-sample" GEV with parametersη po i ,j ,τ po i ,j andξ po i ,j , and the empirical distribution of the 37 observed monthly maxima at grid point i.
Then, we implement the same procedure 100 times with data simulated from independent GEV fitted at each grid point and compute the 5% and 95% quantiles of the empirical distribution of the number of rejections. Table 1 shows that, for all variables, the observed numbers of rejections are low compared to the number of grid points (619), especially as we did not account for multiple testing. Moreover, they are not tremendously different from those obtained in the simulation study, although often slightly above the 95% quantile in the case of CAPE and slightly below the 5% quantile for SRH and PROD. These discrepancies may be explained by the substantial spatial dependence in our data, not accounted for in the simulation study. This procedure supports the use of the GEV at grid points at which no data are available and thus goes beyond the initial goal of assessment of the GEV fit. We conclude that the GEV provides a suitable model for the monthly maxima of our three variables.

General
In Section 4, we assess whether time and ENSO affect the location parameter of the fitted GEV for the three variables PROD, CAPE and SRH. However, as this is assessed at 619 grid points, we must make some allowance for multiple hypothesis testing. We first discuss the statistic used to test the significance of time and ENSO, respectively, in (2) and (3). In the first case, we have to test the null and alternative hypotheses H 0 : η 1,ti = 0 versus H A : η 1,ti = 0, by comparing the fits of the models M 0 : η ti (t) = η 0,ti , M 1 : η ti (t) = η 0,ti + η 1,ti t, t = 1, . . . , 37, and similarly for ENSO. We let 0 (M 0 ) and 1 (M 1 ) denote the maximized log-likelihoods for the models M 0 and M 1 and compute the signed likelihood ratio statisticT = sgn(η 1,ti )[2{ 1 (M 1 ) − 0 (M 0 )}] 1/2 , where sgn(η 1,ti ) is the sign of the estimated trend under model M 1 ;T has an approximate standard Gaussian distribution under H 0 , and the corresponding p-value is p = 2Φ(−|t|), wheret is the observed value ofT and Φ denotes the standard Gaussian distribution function. Computing p for the m grid points yields m ordered p-values p (1) ≤ p (2) ≤ · · · ≤ p (m) . The underlying p-values are likely to be positively correlated, since dependence on time or ENSO will have a spatial component, and we now discuss how to adjust for this.

Multiple testing
A popular approach for multiple testing in climatology is the field significance test of Livezey and Chen (1983), but unfortunately this gives little insight about where the results are significant, which is of high interest to us, and the regression approach of Del-Sole and Yang (2011) has the same drawback. Among methods to identify grid points where the results are significant are those, such as the Bonferroni method, that strongly control the so-called family-wise error rate, i.e., the probability that the number of falsely rejected null hypotheses is equal or larger than unity. However, when the number of hypotheses to test is high, such methods are so stringent that the power of the test is very low. Benjamini and Hochberg (1995) introduce the false discovery rate (FDR), namely the expected proportion of false rejections of the null hypothesis H 0 out of all rejections of it, and propose a procedure to ensure that the FDR is below a given level q when performing multiple testing. Their approach, which we call the BH procedure, would reject H 0 at all grid points i such that p i ≤ p (k) , where In fact this ensures that the FDR is less than qm 0 /m, where m 0 denotes the unknown number of grid points at which H 0 is true. We then say that the procedure controls the false discovery rate at level qm 0 /m. For a chosen q, let S q be the number of grid points at which a particular covariate is declared significant by the BH procedure. Then we expect the true number of grid points where the relation is significant, m A , to satisfy As the BH procedure ensures that the false discovery rate is not more than qm 0 /m, we may argue a posteriori that we have controlled the FDR at level which entails that m A ≥ (1 − q (1) )S q . Iterating this argument by defining q (n+1) = q m − 1 − q (n) S q m , n = 1, 2, . . . , the effective level at which we have controlled the FDR is therefore q lim = lim n→∞ q (n) . This limit is generally obtained after a few iterations. Finally, we may write that The BH procedure was originally shown to be valid for independent test statistics, but Benjamini and Yekutieli (2001, Theorem 2.1) prove that it controls the FDR at level qm 0 /m if the statistics have a certain form of positive dependence. Ventura et al. (2004) apply the BH procedure to simulations representative of climatological data and covering the range of correlation scales likely to be encountered in practice, and find that it controls the FDR at level qm 0 /m. Yekutieli and Benjamini (1999) and Benjamini and Yekutieli (2001) propose modifications to account for more general dependence between the test statistics. The first is complicated and its gain over the BH procedure is limited, while the second is applicable whatever the dependence structure but has greatly reduced power, so Ventura et al. (2004) recommend the use of the BH procedure.
The independence assumption underlying the BH procedure is clearly false for our data, but they resemble those considered in Ventura et al. (2004), so applying the BH procedure at level q should control the FDR at level qm 0 /m. A more rigorous argument would use the asymptotic normality of our test statisticT and the multivariate central limit theorem to show thatT 1 , . . . ,T m is asymptotically jointly Gaussian and that the results of Benjamini and Yekutieli (2001) can be applied, but this is outside the scope of the present paper.

Results
In this section we quantify the effects of time and ENSO in the location parameter of the GEV and study their significance, using q = 0.05 and q = 0.2, corresponding to control of the false discovery rate at the nominal levels 5% and 20%. In each case we first discuss PROD, which is the main variable of interest for severe thunderstorm risk, and then consider CAPE and SRH.
We begin with the effect of time. Table 2 shows that many of the 619 grid points exhibit a significant time trend for PROD in April, May and August (and to a lesser extent in June and December). In April, this number equals 313 at the 20% level, so (4) implies that at least 250 of these grid points indeed have a trend; with (5), this number rises to 278. Figure 6 indicates that, in April, the North-East, a very wide South-East corner and the South-West, show significant time trends. In the first two regions,η 1,ti is positive, corresponding to an increasing risk of severe thunderstorms impacts, particularly in already risky regions. Similar conclusions may be drawn from Figure 7 in the case of May, though the South-East is less prominent. The highest slope value corresponds to an annual increase of PROD maxima of about 3% of the corresponding PROD maximum recorded in 1979. Mannshardt and Gilleland (2013) and Heaton et al. (2011) do not find such a significantly positive time trend over the whole region which is the most at risk, sometimes called tornado alley, and they do not obtain significantly positive trends in the North-East of our region, whereas they find a significant positive trend towards the West. These differences are likely to be explained by the following facts: they consider a less recent period , their product variable is slightly different than ours, and they study annual instead of monthly maxima. The discrepancies with Heaton et al. (2011) can also be explicated by the methodological dissimilarities; as already mentioned, they use a Bayesian hierarchical approach. The evolution obtained by Gilleland et al. (2013) between the second (1979)(1980)(1981)(1982)(1983)(1984)(1985)(1986)(1987)(1988)(1989)(1990)(1991)(1992) and the third (1993)(1994)(1995)(1996)(1997)(1998)(1999) period is quite consistent with our trends in Spring; for the other seasons, however, the results differ significantly. There are also many dissimilarities when considering the evolution between the first  and the second (1979)(1980)(1981)(1982)(1983)(1984)(1985)(1986)(1987)(1988)(1989)(1990)(1991)(1992) periods; this comparison is less relevant since the first period does not belong to the time range we consider. Gilleland et al. (2013) consider the mean simulated values conditional on the total amount of energy being large, and then not all grid point values need be extreme; on the other hand, we look at maxima at each grid point. Moreover, the trends we find account for the year-to-year variation, whereas in Gilleland et al. (2013), changes can only be assessed by comparing values for three successive periods of about 15 to 20 years. The positive time trends we obtain in Spring appear quite consistent with the results of Gensini and Brooks (2018), who use much more recent data than the previously described papers. The remaining differences, especially for Texas, may arise for the following reasons. First, as PROD is only an indicator of severe weather, there are necessarily discrepancies with results based on effective tornado reports. Second, PROD slightly differs from STP, so the corresponding results may differ somewhat. Furthermore, the findings of Gensini and Brooks (2018) about reports concern the total number of tornadoes per year, and those about STP are not based on the maxima of that variable.
Regarding CAPE, April, May and June (and to a lesser extent, August, November and January) show many grid points with a significant time trend. For April and May, Figures 6 and 7 show significantly negativeη 1,ti in the West, contrasting with a significantly positive trend in the center and the East. As pointed out by Trapp et al. (2009) and Diffenbaugh et al. (2013), a positive time trend for CAPE is expected in a context of climate change. However, to the best of our knowledge, an observed trend has not been previously reported in the literature.
For SRH, May and to a lesser extent April have many significantly positive grid points spread approximately uniformly except in a large South-West corner in April and a large South-East corner in May. The significance for PROD in April and May comes from both CAPE and SRH. Figures 6 and 7 suggest that the significant positive time trend in the riskiest part of the US stems mainly from CAPE in April and from SRH in May. Overall, no seasonal pattern appears: CAPE seems to drive PROD in January, April, August, November and December, whereas SRH seems to drive it in February, May, June and September. For March, July and October, there is no clear driver. Anyway, trying to relate the behaviour of PROD maxima with that of CAPE and SRH maxima has its limitations. Indeed, the maximum of PROD may not equal the product of the square root of CAPE maximum and the maximum of SRH, as their maxima may not coincide.
We now comment on the effect of ENSO. For PROD, Table 2 reveals that many grid points exhibit a significant relation in February. Figure 8 indicates thatη 1,en is negative at those and that the main regions concerned are the North-East, the South-Center and the North-West; we expect higher PROD maxima during La Niña years in these regions. The highest slope absolute value corresponds to a decrease of PROD maxima per unit of ENSO of about 10% of the corresponding basic level of PROD maximum.
There is no strikingly significant result for CAPE, although  found ENSO signals in CAPE seasonal averages for winter and spring, not accounting for multiple testing.
For SRH, a very large number of grid points exhibit significance in February. Figure 8 shows that almost all grid points are concerned except for a strip in the North and a tiny diagonal strip in the South-East corner of the region. The estimateη 1,en is highly negative in most of the region but very positive in the extreme South-East, with a very rapid change in sign, presumably due to proximity with the Gulf of Mexico. There is a significant negative relation in regions at risk of thunderstorms or large-scale storms, for which SRH plays an essential role. The risk of large impacts may increase during La Niña years. A relationship between seasonal averages of SRH and ENSO in winter was noticed by . Finally, Figure 8 suggests that CAPE contributes more than SRH to PROD in terms of significance, though the relation with ENSO is more pronounced for SRH than for CAPE.
We also considered the residuals of PROD, CAPE and SRH maxima after accounting for ENSO or temporal effects. For instance, if we observe a time trend, the idea of considering the residuals after accounting for ENSO is to determine whether the time trend is explained by ENSO. This allows us to determine whether the time and ENSO effects are "independent".    Table 2: Number of grid points whereη 1,ti andη 1,en are significant for PROD, CAPE and SRH maxima for each month (top); number of grid points whereη 1,ti is significant for PROD, CAPE and SRH maxima residuals after accounting for the relation with ENSO (middle); number of grid points whereη 1,en is significant for PROD, CAPE and SRH maxima residuals after accounting for the relation with time (bottom). We have accounted for multiple testing using the BH procedure with the values of q displayed. Large and small circles indicate significance (after accounting for multiple testing using the BH procedure) at any level not lower than 5% and 20%, respectively. The units ofη 1,en are m 3 s −3• C −1 , Jkg −1• C −1 and m 2 s −2• C −1 for PROD, CAPE and SRH, respectively.
In the case of PROD, Table 2 shows that removing ENSO does not much decrease the number of grid points exhibiting a significant time trend; there is a slight decrease for April but a small increase for some other months. Accounting for the time trend, on the other hand, can slightly increase the number of grid points showing a significant relation with ENSO. For CAPE, removing ENSO decreases the number of grid points exhibiting a significant time trend for March, but there is a slight increase for other months, whereas accounting for time slightly decreases the number of grid points showing a significant relation with ENSO in January and March only, with a slight increase in other months. Regarding SRH, removing ENSO decreases the number of grid points exhibiting a significant time trend in February but there is little impact for other months. The conclusions are similar when accounting for the time trend and studying the ENSO effect. The maps of the residuals (not shown) indicate that when removing a covariate has little impact on the number of grid points at which the relation with the other covariate is significant, it has almost no impact on their positions either. To summarize, the effects of time and ENSO appear "independent", except for CAPE in January and March and SRH in February.

Conclusion
This article quantifies the effects of time and ENSO on the distribution of monthly maxima of PROD, CAPE and SRH, which are highly relevant to the risk of severe thunderstorms. The use of the GEV appears justified in our setting. After allowance for multiple testing we detect a significant time trend in the location parameter of the GEV for PROD maxima in April, May and August, CAPE maxima in April, May and June and SRH maxima in April and May. The observed upward time trend for CAPE, although expected in a warming climate, has not been reported before. April and May are prominent for PROD, as severe thunderstorms are common at this period, and the corresponding trend is positive in parts of the US where the risk is already high, which may have important consequences. We also found ENSO to be a good covariate in the location parameter of the GEV for PROD and SRH maxima in February. The corresponding relationship is negative over most of the region we consider, suggesting that the risk of storm impacts in February increases during La Niña years. Our results differ from those of Heaton et al. (2011), Mannshardt and and Gilleland et al. (2013), but are quite consistent with those obtained by Gensini and Brooks (2018), perhaps in part because these authors consider a period similar to ours, more recent than in the earlier studies.
We investigate the effects of time and ENSO on the marginal (at each grid point) extremal behaviour of PROD, CAPE and SRH. Quantifying the potential impacts of these covariates on the local spatial extremal dependence of these variables would also be useful for risk assessment. Modelling the extremal dependence between CAPE and SRH might also be informative.
An interesting question is the implication of an increase of PROD (or SRH) maxima. As PROD can be seen as a proxy for the probability of severe thunderstorm occurrence, it is natural to think that PROD maxima may be good indicators for the maxima of the variable "number of severe thunderstorms per day". This would somehow imply that the days where PROD maxima occur generally correspond to those days with the largest severe thunderstorms impacts. Providing clear insight about whether this is indeed the case would be valuable.