1. Introduction
There is great interest in how global warming might be affecting the statistics of extreme weather (Meehl et al. 2000; IPCC 2014). Indeed the occurrence of any extreme event not previously observed “within living memory” or “since records began” (in both cases, about 100 years) immediately becomes a candidate for attribution to global warming. Both observational and modeling approaches are taken to the problem. A combined approach is often deemed necessary, given the limitations imposed on observational approaches by the shortness and inaccuracies of historical records, and on modeling approaches by known and unknown model errors.
It is not hard to see how detecting changes in extreme value probabilities using limited records can be problematic. Consider an event with an N-yr return period, whose probability of occurring in any single year is 1/N if the annual occurrences are independent, and of not occurring is 1 − 1/N. The probability of at least one occurrence in M years is then 1 − (1 − 1/N)M, and of exactly one occurrence is [M/(N − 1)] (1 − 1/N)M. These simple formulas highlight the difficulty in pinning down the probability (or equivalently the return period N) of rare events from short records. For example, a single occurrence of an event in an M ~100-yr record does not imply that N ~100 yr. Actually N could be as long as 2000 years, or as short as 22 years, because (using the binomial distribution) there is a 5% chance for a 2000-yr event as well as a 22-yr event to occur just once in 100 years. Similarly, the complete absence of an event in a 100-yr record does not imply that N > 100 yr, as is often assumed. It could be as short as 44 years, because there is a 10% chance for a 44-yr event to occur not even once in 100 years. Thus, although it might be tempting to conclude from the absence of an event in a 100-yr interval and its single occurrence in the subsequent 100-yr interval that its return period N has decreased from “infinity” to 100 years and the event has become “more likely,” the large uncertainties in N would render such a conclusion suspect. Indeed there is an appreciable probability [M/(N − 1)] (1 − 1/N)2M for such a sequence of events to occur with no change in N at all, which is greater than 10% for 80 < N < 800 yr. Similar statements can be made regarding events that occur a few times in one century, and a few more (or less) times in the next.
Given the rarity of extreme events, it is not surprising that detecting significant changes in their probabilities through direct counts or raw (or cosmetically smoothed) histograms is prone to sampling errors. There are three ways around this difficulty. Approach 1 is to fit a probability distribution, usually of a universal form based on generalized extreme value (GEV) theory, to the sampled extreme values. Although more robust than using raw histograms, this approach is also prone to sampling errors, and the links between the parameters of the GEV distribution and underlying physical processes are often unclear. Approach 2 is to fit a probability distribution to all and not just the extreme values, whose more physically transparent parameters related to the mean and variance can also be more robustly estimated from limited records, and then to deduce changes in tail probabilities from the changes in those parameters. This approach is appealing from both sampling and physical perspectives, but can be compromised by using an inappropriate probability distribution, such as a Gaussian in cases where the distribution is clearly not a Gaussian. Approach 3 is to use a numerical climate model to estimate the tail probabilities from raw histograms derived from large ensembles of model integrations. This approach can be made as statistically reliable as desired depending upon computing resources, but may be compromised by known and unknown model errors. All three approaches, individually or in combination, are currently being followed to detect changes in extreme weather and climate risks and to attribute them to natural and/or anthropogenic influences.
Our first goal here is to highlight the need for such detection and attribution efforts to account for the distinctively skewed and heavy-tailed aspects of the probability distributions of daily weather anomalies. As discussed below, departures of several standard deviations from the mean, although rare, are far more common in such a distinctively non-Gaussian world than they are in a Gaussian world. In such a world a mean climate shift is also generally associated with changes in the width and shape of the probability distribution. Consequently, even the sign of the changes in tail probabilities cannot be inferred unambiguously from the mean shift alone. These realities further complicate the problems of detecting significant changes in tail probabilities from historical records of limited length and accuracy. In effect they make it harder to reject a null hypothesis of no change in extreme risks. They also raise the bar for the detection and attribution of changes in extreme risks using climate models, by requiring the models to accurately represent the salient non-Gaussian aspects of the observed distributions.
Our second goal is to suggest a simple way to account for the observed non-Gaussianity by using, as a null hypothesis, a general class of non-Gaussian distributions called stochastically generated skewed (SGS) distributions introduced to the meteorological literature by Sardeshmukh and Sura (2009, hereafter SS09). Sardeshmukh and Penland (2015) further discuss the broad physical foundations of such distributions in the climate system. Here we provide extensive justification for the particular relevance of SGS distributions in extreme value analysis. In addition to capturing the distinctive non-Gaussianity of the observed distributions, SGS distributions have two other notable properties: 1) they include Gaussian distributions as special cases, and 2) they remain SGS distributions following a mean climate shift. We describe how to estimate the parameters of an SGS distribution using the first four statistical moments of the data (mean, variance, skewness, and kurtosis) and, importantly, how to generate a statistically equivalent Markov process with the same parameters, whose physical interpretation is more transparent. The Markov model can be used to test for statistical significance through Monte Carlo simulations, and also to investigate changes in the temporal persistence statistics of extreme anomalies. Our approach may be viewed as an attempt to improve the full distribution–based approach 2 to the analysis of extreme anomalies. It also has implications for the extreme value distribution–based approach 1, and the climate model distribution–based approach 3, as will become clear below.
Figure 1 illustrates how non-Gaussianity can strongly impact tail probabilities, and how ignoring it can compromise the detection, quantification, and attribution of their changes even when they are associated with only a mean shift. The figure compares the probabilities of exceeding 2 and 4 standard deviations (sigma) before and after a positive mean “climate shift” of 1 sigma, for Gaussian and SGS distributions with the same mean and sigma, and a skewness of 1 and excess kurtosis of 5 for the SGS distribution. Before the shift, the probabilities of exceeding 2 sigma are 2.3% and 3.4% for the Gaussian and SGS distributions, respectively. After the shift, they increase to 15.8% and 13.3%, by different factors of 7 and 4. For exceedances of 4 sigma, the changes are more dramatic: from 0.003% (0.34%) before to 0.13% (1.0%) after the shift, and by very different factors (43 and 3 for Gaussian and SGS distributions, respectively). The detection and attribution of such changes would be strongly affected by what one assumes to be the true distribution. If the distribution is actually SGS but is assumed to be Gaussian (i.e., if the non-Gaussianity is ignored), then the occurrence of a 4-sigma event would be interpreted as highly significant (since its preshift risk of 0.003% is very small) and highly consistent with the 43 times higher risk of such events after the mean shift. One would thus, with high confidence, both reject a null hypothesis of no change in the risk of such events and accept the alternative hypothesis of their much higher risk in the shifted climate. If, on the other hand, the distribution is correctly recognized to be SGS, then the occurrence of the 4-sigma event would be interpreted as relatively less significant and less clearly attributable to the climate shift, given its already substantial risk of 0.34% before the shift (which is indeed higher than the 0.13% risk in the Gaussian case even after the shift) and its relatively small threefold increase of risk after the shift. Note that no such disparity of interpretation would occur if the distribution is actually Gaussian but is assumed to be SGS, since a Gaussian is a special case of an SGS distribution with zero skewness and excess kurtosis, as already mentioned.
An illustration of how changes in the probabilities of extreme values associated with a mean climate shift depend strongly on the shape of the PDF. Results are shown for (a) a Gaussian PDF and (b) a non-Gaussian SGS PDF (see text) with skewness S = 1 and excess kurtosis K = 5. In both panels the PDFs for the current climate (black curves) have zero mean and unit standard deviation, and are shifted by one standard deviation to the right (red curves) to illustrate a future climate scenario. This identical mean shift in the two cases implies very different changes in the probability P(x > 2) of exceeding 2 standard deviations, from 2.3% to 15.8% in (a) and from 3.4% to 13.3% in (b). The changes in the probability P(x > 4) of exceeding 4 standard deviations are even more striking, increasing by a large factor of 43 from 0.003% to 0.135% in (a) and by a relatively modest factor of 3 from 0.34% to 1.01% in (b).
Citation: Journal of Climate 28, 23; 10.1175/JCLI-D-15-0020.1
Despite this wider applicability of ψ, one can imagine situations in which knowing even the mean and standard deviation would be insufficient to deduce the sign (let alone the magnitude) of changes in tail probabilities, if the distribution is not reducible to a shape-preserving form through a transformation of variable. Situations involving changes of strongly skewed and kurtotic SGS distributions belong to this category. In such situations the full distribution–based approach 2 and the climate model distribution–based approach 3 to determining tail probabilities are further compromised by ignoring or misrepresenting the SGS character of the observed distributions, eroding their advantages over the extreme value distribution–based approach 1.
The paper is organized as follows. Section 2 highlights the distinctive non-Gaussianity of the observed distributions of daily large-scale weather anomalies. Section 3 shows how this distinctive non-Gaussianity is captured by SGS distributions, and describes procedures for fitting SGS distributions to data and generating a statistically equivalent Markov process with the same parameters. Section 4 discusses the strong links between the distributions of midtropospheric vertical velocity (which are well approximated as SGS distributions) and precipitation anomalies. Section 5 shows how some striking observed asymmetries in the temporal persistence of positive and negative circulation anomalies are consistent with the SGS nature of the anomaly distributions. Section 6 shows how fitting SGS distributions to all available values in limited climate records yields extreme value distributions of block maxima with smaller sampling uncertainties than GEV distributions fitted to only the block maxima. This section also shows how the tails of SGS distributions can be directly linked to the corresponding asymptotic GEV distributions of block maxima and generalized Pareto (GP) distributions of threshold exceedances based on extreme value theory. Section 7 describes how a mean climate shift alters both the width and shape of an SGS distribution, and illustrates their combined impact on tail probabilities to contrast with the impacts of the mean shift alone shown in Fig. 1. Finally, section 8 presents an illustrative application of our SGS distribution–based approach to assess changes in the observed distributions of daily indices of large-scale wintertime atmospheric circulation variability over the North Atlantic and North Pacific Oceans, the North Atlantic Oscillation index and the North Pacific index, over the period 1872–2011. Section 9 follows with a discussion and concluding remarks.
2. The distinctive non-Gaussianity of observed daily weather anomalies


(left) Skewness S and (right) excess kurtosis K of daily anomalies in winter (December–February) of (top) relative vorticity at the upper tropospheric 250-hPa level, (middle) temperature at the lower tropospheric 850-hPa level, and (bottom) vertical velocity at the midtropospheric 500-hPa level, estimated using the 20CRv2 dataset for the 1872–2011 period. Positive values are indicated by red and negative by blue shading. For a Gaussian variable, these maps would be blank.
Citation: Journal of Climate 28, 23; 10.1175/JCLI-D-15-0020.1
The distinctive non-Gaussianity of the probability distributions of daily anomalies in winter. (left) Joint histograms of the S and K values from Fig. 2. Note the approximately parabolic relationship between K and S. In each panel the lower thick black curve is a theoretical parabolic lower bound on K of the form K = 1.5S2 − Δ, where Δ = 1.1, 1.1, and 0.5 in the top, middle, and bottom panels. A weaker universal lower bound on K, if K exists, is the parabola K = S2 − 2, the lowest gray curve. The upper thick black curve in each panel is a theoretical upper bound on K if the fifth moment 〈x5〉 also exists. (right) Hemispherically averaged PDFs of transformed standardized variables y = (x/σ) × S/|S| estimated as histograms. The local factors S/|S| (equal to +1 or −1 at each grid point) ensure that p(y) at each grid point is positively skewed even if p(x) is not. The hemispherically averaged values of p(y > 0) and p(y < 0) are shown as blue and red curves, respectively, with the p(y < 0) curve flipped over to the positive side. Note that the asymmetry between p(y > 0) and p(y < 0) is substantial even for small magnitudes of y, and changes sign at a crossover point somewhere between |y| = 1.41 and 1.73, beyond which the tail of p(y > 0) is heavier than that of p(y < 0).
Citation: Journal of Climate 28, 23; 10.1175/JCLI-D-15-0020.1
The right panels of Fig. 3 highlight another distinctive feature of the observed non-Gaussianity. They show PDFs of the transformed standardized variables y = (x/σ)S/|S| from Fig. 2, estimated as histograms at each grid point and averaged over the hemisphere. The local factors S/|S| (equal to +1 or −1 at each grid point) ensure that p(y) is positively skewed at each grid point regardless of whether p(x) is positively or negatively skewed. The values of p(y > 0) and p(y < 0) are shown as blue and red curves, respectively, with the p(y < 0) curve flipped over to the positive side for easier comparison with the p(y > 0) curve. The two curves would coincide if the PDFs were symmetric. Note that the asymmetry is substantial even for small magnitudes of y (i.e., the PDFs do not have a Gaussian core) and changes sign at a “crossover” point between |y| = 1.41 and 1.73, beyond which the tail of p(y > 0) becomes heavier than that of p(y < 0). It is interesting that this heaviness is already evident at moderate anomaly magnitudes of 2 sigma that are often used as thresholds for defining extreme anomalies and assessing their statistical significance. Such heavy tails clearly need to be accounted for in extreme value analysis.
3. SGS distributions and corresponding Markov processes


It is clear from (5a) that both E and g must be nonzero for S to be nonzero. Note that E is nondimensional, and g and b have the same dimension as x (and σ). Also, without loss of generality, E and b are positive, whereas g can be positive or negative. Thus the sign of g determines the sign of S.












Both g and E are important in determining the skewness and heavy tails of SGS distributions. This is illustrated in Fig. 4, which contrasts the tails of two SGS distributions with the same mean, variance, and E, but one (the same as in Fig. 1) with g ≠ 0 and the other with g = 0. The positive tail is much heavier, and the negative tail much lighter, in the g ≠ 0 case. Note, however, that even the negative tail is much heavier than that of a Gaussian distribution with the same mean and variance, also shown in the figure, beyond about 4σ. Interestingly, a pure power-law decay
Log–log plot of the positive (blue) and negative (red) segments of the asymmetric SGS PDF in Fig. 1 with S = 1 and K = 5, contrasted with a symmetric SGS PDF with S = 0 and the same power-law tail exponent, and also with a standard Gaussian PDF (thin black curve). The PDFs have zero mean and unit variance in all cases. The wiggly gray curves show the estimated PDF in the asymmetric SGS case from a long simulation of the Markov process model [(10)] with the same parameters as the SGS distribution.
Citation: Journal of Climate 28, 23; 10.1175/JCLI-D-15-0020.1


One can fit an SGS distribution to any sample data series using this simple robust method. It may be a poor fit if the sample distribution has more than one peak, or if the sample E2 estimated from (8a) strongly violates the constraints on E2 for an SGS process. The degree to which the fitted SGS distribution differs significantly from the sample distribution may be assessed through Monte Carlo integrations of the SGS process model described in (10) below. It is possible that other methods of SGS parameter estimation and their significance, such as maximum likelihood estimation (MLE) and Markov chain Monte Carlo (MCMC) methods might be superior to our method, especially given the explicit forms of (3) and (10). Although we have not considered it here, this issue is certainly worthy of further exploration.





The terms including
This statistically equivalent Markov process model (10) is useful for quantifying the sampling uncertainties of SGS distributions, with a proper accounting of the “serial dependence” (i.e., the temporal correlation) of x. Long integrations of (10) also provide information on the persistence statistics of extreme values of x that is difficult to obtain in any other way.
Figure 5 illustrates the dramatically different behavior of extremes in the Gaussian and SGS (S = 1, K = 5) cases of Fig. 4 from a time series perspective. It compares the daily time series of x(t) in the two cases in long integrations of the Markov model (10) using a Heun method. The length of the integrations is 108 correlation time units (ctu) of standardized time t′ = λt = t/τc, with a specified τc of 1 ctu and time step of 1/24 ctu. For concreteness, if τc is 1 day, one may think of these as 108-day integrations, with the first 105 days displayed in Fig. 5 as 1000-winter series of 100-day winters. Consistent with the heavier tails in Fig. 4, excursions beyond ±3 standard deviations are more frequent in the SGS case. Note that the two series have the same mean and standard deviation, with little sampling variability in the 1000-winter series of their values estimated from each winter. The figure also shows the 100-decade series of the largest and fifth-largest daily values in each 10-winter decade. These are dramatically different in the SGS and Gaussian cases, with excursions beyond +4 standard deviations far more common in the SGS case. Note also the substantial long-term trends in the series of decadal maxima in the SGS case, lasting a century or even longer, that are associated purely with sampling fluctuations. Consistent with the discussion of Fig. 1, such long-term trends of “record-breaking” values in an SGS world would be mistakenly attributed to “climate change” by adopting an inappropriate null hypothesis that they are drawn from a Gaussian distribution.
The dramatically different behavior of extremes in the SGS (S = 1, K = 5) and Gaussian cases of Fig. 1 from a time series perspective, obtained from long integrations of the Markov model [(10)]. The length of the integrations is 108 days, performed using a specified correlation time scale τc of 1 day and time step of 1 h. Results are shown for the first 105 days, displayed as 1000-winter series of 100-day winters. Consistent with the heavier tail in Fig. 1, excursions beyond ±3 standard deviations are more frequent in the SGS case. The two series have the same mean and standard deviation, with little sampling variability in the 1000-winter series of their values estimated from each winter. Also shown are 100-decade series of the largest (blue) and fifth-largest (orange) daily values in each 10-winter decade. These are dramatically different in the SGS and Gaussian cases, with excursions beyond +4 standard deviations far more common in the SGS case. Note the substantial trends in the series of the decadal maxima in the SGS case, lasting a century or even longer, that are associated purely with sampling fluctuations.
Citation: Journal of Climate 28, 23; 10.1175/JCLI-D-15-0020.1
4. The link between vertical velocity and precipitation distributions


Figure 6 illustrates how the probabilities of w in scenarios of
An illustration of how the probabilities of standardized midtropospheric vertical velocities w in scenarios of
Citation: Journal of Climate 28, 23; 10.1175/JCLI-D-15-0020.1
The correspondence of the PDFs of daily precipitation and positive w through (11) also has implications for the PDFs of seasonal-mean precipitation. Focusing only on the PDFs of positive w in Fig. 6, it is clear that they are positively skewed in both the SGS and Gaussian cases, with the skewness relatively larger in the SGS case, increasing from the mean ascent to the mean descent scenarios. One would expect this skewness to be reduced for seasonal averages in accord with the central limit theorem, but to remain strongly positive in regions of strong mean descent. This is indeed observationally evident in regions of mean descent in Fig. 7. The right panels of Fig. 7 demonstrate the robustness of this relationship in five different observational and reanalysis datasets. One would thus expect the SGS versus Gaussian character of the daily w PDFs to have a strong impact on the extremes of even seasonal-mean precipitation in arid regions of mean descent.
(left) Skewness S of seasonal-mean precipitation in northern (top) winter and (bottom) summer estimated using the GPCP precipitation dataset (Adler et al. 2003). The stippled and white areas indicate areas of mean midtropospheric ascent (
Citation: Journal of Climate 28, 23; 10.1175/JCLI-D-15-0020.1
5. Implications for the persistence of extreme anomalies
An important consideration in extreme value analysis is not only the magnitude but also the duration of extreme anomaly events. There is substantial observational evidence for the asymmetry of both with respect to the sign of circulation anomalies (e.g., Dole and Gordon 1983; Masato et al. 2009; Woollings et al. 2010). Figure 8a shows an early example from Dole and Gordon (1983) of the asymmetric persistence of positive versus negative 500-hPa geopotential height anomalies in the North Pacific sector. It shows the counts PM(T) of T-day persistent anomaly events of amplitude M, defined as events in which the anomaly extends beyond and then persists beyond a threshold M for at least T days. Results are shown for M = ±0, ±0.5, and ±1 standard deviations. Large positive anomalies evidently last longer than large negative anomalies, and also longer than large anomalies of a Gaussian red noise process with the same variance and lag correlation. Such asymmetries are usually attributed to a greater role of nonlinearity in the dynamics of large-amplitude anomalies. Note, however, that the figure shows an asymmetry in the persistence of all positive and negative anomalies, not just large-amplitude anomalies, and the positive anomalies overall are actually less persistent than negative anomalies. There is also a striking exponential decay of PM(T) with T in all cases. No simple explanations of such features currently exist.
Persistent anomaly statistics of positive and negative anomalies beyond specific amplitude thresholds contrasted with similar statistics for a univariate Gaussian red noise process (light gray shaded bands). (a) Counts PM(T) of T-day persistent 500-hPa geopotential height anomaly events of amplitude M, defined as events in which the anomaly extends beyond and then persists beyond a threshold M for at least T days [adapted from Dole and Gordon (1983)]. Results are shown for positive (solid) and negative (dashed) thresholds of M = ±0, ±0.5, and ±1 standard deviations. (b) as in (a), but for PM(T) of x(t) from 108-day simulations of Markov processes with 1-day correlation time scales described in the text associated with SGS (S = 1, K = 5) and Gaussian distributions. Results are shown for positive (dark gray hatching) and negative (red hatching) thresholds of M = ±0, ±1, and ±2 standard deviations, with 95% confidence intervals estimated from the sampling uncertainty among 1000 100-winter blocks of 100-day winters. Note the striking overall similarity between (a) and (b).
Citation: Journal of Climate 28, 23; 10.1175/JCLI-D-15-0020.1
The general character of such persistence probability curves is remarkably well captured, without invoking nonlinearity, by the CAM noise–driven linear Markov model (10) associated with SGS distributions. Figure 8b shows the PM(T) of x(t) from the 108-day model integrations presented earlier for the SGS (S = 1, K = 5) and Gaussian cases, in an identical format to that of Fig. 8a. Note again that time here is actually measured in correlation time units t/τc, and is referred to as “days” using τc = 1 day purely for illustrative purposes. Results are shown for thresholds of M = ±0, ±1, and ±2 standard deviations, with 95% confidence intervals estimated from the sampling uncertainty among 1000 100-winter blocks (“centuries”) of 100-day “winters.” The similarities with Fig. 8a are striking.
The exponentially decaying curves for both the SGS and Gaussian cases in Fig. 8b are associated with the same expected exponential decay
The ability of the linear Markov process (10) to represent such temporal persistence statistics—correctly discriminating among anomaly persistence beyond positive and negative, as well as large and small, amplitude thresholds—makes it useful for investigating the statistical significance of a wide variety of changes in weather statistics associated with climate change. It also enables any changes that are established as significant in a variable to be attributed, as a first step, to changes in the deterministic dynamics (encapsulated in λ = 1/τc) versus the stochastic forcing (E, g, and b) of the variable.
6. Extreme value analysis using SGS versus GEV or generalized Pareto distributions
In this section we show that fitting SGS distributions to all available values in limited records yields extreme value distributions of block maxima with smaller sampling uncertainties than generalized extreme value distributions fitted to only the block maxima. We also show how the tails of SGS distributions are linked to the corresponding GEV distributions of block maxima and generalized Pareto (GP) distributions of exceedances over specified thresholds.








Figure 9 illustrates the large sampling uncertainties of GEV fits when only 25 blocks (“winters”) of data are used. The thick black curves show the “true” PDFs of 100-day block maxima
A demonstration that for limited-length records, the PDF of winter (100-day) maxima of daily values (extreme value PDFs) can be more accurately estimated by fitting SGS distributions to all daily values than by fitting GEV distributions to just the maximum values from each winter. The thick black curve in each panel shows the “true” extreme value PDF of the winter maxima in 108-day integrations of the Markov process (10) associated with the (a) Gaussian and (b) SGS parent distributions of Fig. 1. Shading indicates 95% confidence intervals for the extreme value PDFs estimated using 25-winter (2500-day) segments of the integrations, using raw histograms of the 25 winter maxima (light gray shading), fitting a GEV distribution to the 25 winter maxima (cross-hatching), and using the SGS distribution fit to the 2500 daily winter values (dark gray shading).
Citation: Journal of Climate 28, 23; 10.1175/JCLI-D-15-0020.1
The results in Fig. 9 are consistent with the intuition that if the parent distribution F(x) is known, the PDF of





Insofar as SGS distributions are relevant, the correspondence between GEV and SGS distribution parameters made explicit by (13) enables the GEV parameters to be interpreted physically through the Markov model (10), and provides insight into their variation with block size N. The role of E, the amplitude of the multiplicative noise in (10), is key, since E completely determines ξ and the dependence of

As in the case of the GEV distribution, the correspondence between the GP and SGS distribution parameters made explicit by (15) enables the GP parameters to be interpreted physically through the Markov model (10). The role of E is again key, since E completely determines ξ. In the limit of E → 0 associated with a Gaussian parent, both ξ → 0 and
This analysis highlights the contrast between the extreme value distribution–based approach 1 and the full distribution–based approach 2 to estimating tail probabilities discussed in section 1. The GEV and GP approaches are clearly valuable when the parent distribution F(x) is unknown, but are unnecessary when F(x) is known, especially if F(x) is associated with a process model such as (10) that can be used to quantify sampling uncertainties and provide physical insight. The issue, of course, is to what extent F(x) is known. We have argued for the relevance of SGS distributions as the simplest physically based prototypes for non-Gaussian distributions that include Gaussian distributions as special cases. Even if F(x) is not strictly SGS but is only approximately so, our analysis suggests a way to improve the estimation of GEV and GP parameters by constraining, for example, the shape parameter ξ to be consistent with (13a), with E estimated from (8a) using the full record and not just extreme values. This utility of the SGS approach would clearly be compromised if F(x) deviates significantly and substantially from an SGS distribution, for instance if p(x) has more than one maximum. Note however that even in such instances the GP distribution may not necessarily provide a more accurate model of tail probabilities.
7. Form invariance with respect to climate changes
The SGS distribution is an anomaly distribution associated with a one-dimensional linear Markov model (10). This model is a highly idealized model of the dynamics of individual system components of the multicomponent nonlinear climate system. A rationale for all the multicomponent linear and nonlinear interactions to be approximately represented by parameters E, g, b, and λ in (10) was given in SS09, and the validity of (10) as a process model for the stationary statistics and some temporal persistence statistics has been provided here. No additional assumptions need be made to retain (10) as an equally valid model of anomaly statistics in an altered climate. To that extent, the anomaly distribution of a variable in the new climate should remain an SGS distribution, with possibly different parameters. This rationale is the same as that for expecting anomaly distributions to remain Gaussian in an altered climate if there is a physical basis for them to be Gaussian in the current climate.








The change in g implies changes in the variance, skewness, and kurtosis of the SGS distribution through (4b) and (5), and hence changes in the width and shape of the distribution. It also implies changes in the types of persistence statistics shown in Fig. 8b, even without a change in λ.
Figure 10 shows how accounting for the changes in σ2, S, and K modifies the tail probabilities of the shifted SGS and Gaussian distributions in Fig. 1. (For a cleaner comparison we specify the same change of σ2 in the Gaussian case.) The probabilities of exceeding 2 sigma, which were previously stated as increasing to 15.8% (13.3%) in the Gaussian (SGS) cases due to the mean shift alone, are modified to 21.2% (16.7%) after also accounting for the changes in width and shape, and the probabilities of exceeding 4 sigma are modified even more strongly, from 0.13% (1.0%) to 0.92% (2.65%). Correctly accounting for the changes in width and shape in addition to the mean shift is clearly important in quantifying changes in tail probabilities.
As in Fig. 1, but now also with changes in the width and shape of the SGS distribution associated with the mean shift taken into account. For a cleaner comparison the shifted Gaussian PDF is also shown with the same width as the SGS distribution. See text for a fuller explanation.
Citation: Journal of Climate 28, 23; 10.1175/JCLI-D-15-0020.1
We stress that for even small climate shifts, the change in the SGS distribution is not shape preserving. However, the change in the associated GEV distribution is shape preserving, because the constancy of E implies the constancy of the shape parameter ξ in (13a), as has indeed been found (e.g., Kharin and Zwiers 2005), and sometimes even assumed (as in Zwiers et al. 2011). This allows the sign of the change in the GEV tail (i.e., in the “extreme extremes”) to be deduced from the sign of the extremal probability shift index (PSI) in (1), by specifying the changes in the estimated location and scale parameters.
For larger climate shifts, it is possible for not just g but the other parameters in (10) to change as well. We have not yet found a simple way to characterize the combined effect of the changes in E, g, and b on the tail probabilities, both of extremes and extreme extremes. It appears that in this more general scenario there would be no simpler alternative to deducing the changes in tail probabilities from SGS distributions fitted to data before and after the mean shift.
8. An application to assessment of twentieth-century circulation changes
Finally, we present an illustrative application of our SGS distribution–based approach to assess the changes in two sea level pressure (SLP) based indices of large-scale wintertime atmospheric circulation variability in the Northern Hemisphere, the North Atlantic Oscillation (NAO) index and the North Pacific (NP) index, over the period 1872–2011. To this end we use the 56-member ensemble 20CRv2 dataset (Compo et al. 2011), in which the time series of each ensemble member represents an equally likely estimate of global atmospheric evolution over the 140-yr period within observational error bounds.
To provide context, Fig. 11 shows the ensemble-averaged 7-yr running mean anomalies of the NAO and NP indices. Corresponding results obtained using the NCEP–NCAR reanalysis and ERA-40 datasets for shorter periods are also shown for comparison. There are large multidecadal variations in both indices, including an apparent upward trend in the NAO index and a downward trend in the NP index during much of the second half the twentieth century. However, the trends over the full 140-yr period are much weaker or even nonexistent. This has important implications for understanding the atmospheric circulation response to global warming, and casts doubt on inferences about this response drawn in studies that focus only on the second half, or other subsets, of the full record.
7-yr running means of anomalous sea level pressure (SLP)-based North Pacific (NP) and North Atlantic Oscillation (NAO) indices for the 1872–2011 period. Results are shown for ensemble means of the 56-member ensemble 20CRv2 (Compo et al. 2011; black curves). Corresponding results obtained using the NCEP–NCAR reanalysis (NNR; orange, 1948–2010) and ERA-40 (cyan, 1958–2001) datasets are also shown for comparison with the 20CRv2 results. The NAO index is defined as the daily area-averaged anomaly in a 5° cell centered on 38°N, 26°W minus that centered on 64°N, 22°W, using all available data in each dataset. The NP index (Trenberth and Hurrell 1994) is the area-average anomaly in the region 65°–30°N, 160°E–140°W. Anomalies are defined as departures from the 1981–2000 December through February mean.
Citation: Journal of Climate 28, 23; 10.1175/JCLI-D-15-0020.1
Despite the absence of a long-term mean trend in these indices, one might still think it possible for the probabilities of their extreme values to have changed in association with possible changes in the width and shape of their PDFs. We consider such a possibility, even though the argument given in the previous section precludes it for small (in these cases, nearly zero) mean shifts. Since our interest is in daily weather extremes and the large-scale circulation contexts in which they occur, we focus on changes in the PDFs of daily NAO and NP indices. These have autocorrelation time scales τc of approximately 5.2 and 7 days, respectively.
For each one of the 56 members of the 20CRv2 ensemble, we estimated the mean, variance, skewness, and kurtosis of the daily index anomalies separately from the first and second halves (1872–1941 and 1942–2011) of the 1872–2011 record, and used (8) to fit SGS distributions in each half. As Fig. 12 shows, the ensemble spread of the 56 fitted PDFs, associated with observational uncertainty, is much smaller than that of the raw histograms. The ensemble-mean fitted PDFs do show some changes from the first half to the second half of the record. This still leaves open the question, however, as to how statistically significant those changes are given that we used only 70 years of data to estimate the distributions in each half. We addressed this question through 106 winter simulations of the Markov process model (10), with parameters specified from the ensemble-mean fitted SGS distributions for the second half. We identified the sampling uncertainty of the fitted observational SGS distributions with the sampling variability of SGS distributions fitted to 70-yr segments of the Markov model simulations. This sampling uncertainty is indicated as the gray shaded bands in Fig. 12.
PDFs of standardized daily NP and NAO indices in northern winter. The standardization is based on the full variance of daily data from 1872 to 2011. The standardized daily NAO index is defined as the standardized daily anomaly near Ponte Delgada, Azores (38°N, 26°W), minus that near Reykjavik, Iceland (64°N, 22°W). Anomalies are with respect to the 1871–2011 mean plus 3 harmonics of the annual cycle. The PDFs are estimated for two 70-yr periods, 1872 to 1941 (red) and 1942 to 2011 (blue), both as raw histograms (rectangles) and as fitted SGS PDFs [curves; see (3)]. Results for December through February are shown for each one of the 56 members of the observational 20CRv2 (Compo et al. 2011). There are thus 56 red and 56 blue curves in each panel. Upper and lower segments of the red and blue rectangles show the range of raw counts for each anomaly size bin. The gray swaths indicate 95% confidence intervals on the PDFs associated with using limited 70-yr records, estimated from long Markov model simulations [see (10)]. The internal spread among the red and blue curves is a measure of observational uncertainty, whereas the gray swath is a measure of sampling uncertainty. The shifts of the mean are indicated by the red and blue circles near the center of each panel.
Citation: Journal of Climate 28, 23; 10.1175/JCLI-D-15-0020.1
Figure 12 suggests that there was not only no change in the mean, but also no statistically significant change in the entire PDFs of the NAO and NP indices from the first to the second half of the period 1872–2011. This is consistent with the theoretical expectation from the previous section that if there is no mean shift, there should be no change in the entire PDF if the PDF is an SGS distribution.
9. Discussion and concluding remarks
The difficulties of detecting significant long-term changes in the atmospheric circulation using historical records of limited length and accuracy are well appreciated in the climate research community. They provide a major impetus for ongoing efforts to improve atmospheric reanalyses and extend them as far back in time as feasible. Indeed the very different trends in Fig. 11 over the last 70 years versus the last 140 years in the first such long-term 20CRv2 reanalysis underscore the danger of drawing conclusions about long-term changes from relatively short records. Given the strong link between daily weather statistics (“storm tracks”) and the ambient flow (Whitaker and Sardeshmukh 1998; Compo and Sardeshmukh 2004), there are evident pitfalls in establishing significant long-term changes in weather statistics as well. Changes in extreme weather statistics are even harder to pin down using short records. Since large-scale weather events typically last a day or longer, and there are only about 9000 days of data for each 3-month season in a 100-yr record, a discussion of changes in extreme weather events that occur only a few times in a century is compromised by large uncertainties in their small probabilities on the order of 0.01%. Discussions of such small (and sometimes even smaller) probabilities indeed force one to confront the Bayesian versus frequentist meanings of probability itself based on “degree of belief” versus counts of outcomes. The issue, at its core, is the meaningfulness of assigning quantitative probabilities to events that have never occurred or have occurred only a few times in a finite record. A claim that the probability of an extreme weather event has “doubled” from 0.01% to 0.02% is of a very different character in this regard, given the likelihood of unrealized outcomes in a 100-yr record, than a claim that it has doubled from, say, 10% to 20% or even from 1% to 2%.
The basic note of caution sounded in this paper regarding quantifiable tail probabilities is that they are strongly affected by the non-Gaussianity of the observed distributions, and that their changes cannot generally be deduced from the mean shifts alone. The skewed and heavy tailed character of the distributions warrants accurate representation not only of the mean changes but also of changes of variance, skewness, and kurtosis for deducing the changes in tail probabilities. This has large implications for the detection, attribution, and projection of changes in extreme weather statistics using limited and inaccurate observations and imperfect climate models.
From a physical perspective, understanding the probabilities of extreme “unusual” values of an atmospheric variable requires an understanding of how such values are generated, whether through amplified mechanisms for generating the “usual” values or through concatenation of “a series of unfortunate events.” In either case the probabilities of the unusual and usual values are physically related. The tails of the probability distribution are thus related to the properties of the entire distribution, including its mean, width (variance), and shape (skewness and kurtosis). To get tail probabilities right, one has to get all of these properties right.
We have presented evidence that the distinctively skewed and heavy tailed character of the observed distributions of daily weather anomalies is basically due to their evolution being perturbed by stochastic noise whose amplitude increases linearly with anomaly amplitude, but asymmetrically for positive and negative anomalies. This statistically accurate description is associated with a precise model of the anomaly probability distribution as a stochastically generated skewed (SGS) distribution generated by correlated additive and multiplicative (CAM) noise forcing. The relevance of SGS distributions is not inconsistent with the notion that additional feedbacks play an important role in the evolution and persistence of some extreme anomaly events. Rather, it suggests that such additional feedbacks are not systematic, and can effectively be treated as a CAM noise forcing in modeling anomaly statistics.
We have described how to fit SGS distributions to data using the first four moments (mean, variance, skewness, and kurtosis), and how to generate a statistically equivalent CAM noise–driven Markov process with the same parameters, which is useful for estimating sampling uncertainties and persistence statistics. A virtue of this approach is that it makes use of all values in a time series to estimate tail probabilities, whose sampling uncertainties are therefore smaller than those associated with GEV and GP approaches that use just the extreme values. We stress that our approach is applicable not only to SGS variables but also to other variables that are monotonically related to SGS variables, such as precipitation via its strong link to midtropospheric vertical velocity. Overall, this approach provides a sharper tool for investigating the statistical significance of observed changes in extremes over the twentieth century and of projected changes over the twenty-first century.
Current statements concerning changing extreme weather risks associated with global warming mainly involve quantities related to surface temperature (IPCC 2014) and essentially amount to simple shifts of existing PDFs. Such statements downplay the fact that there is more to regional climate change than surface warming, and that assessing the changing risks of extreme storminess, droughts, floods, and heat waves requires accurate model representations of multidecadal and longer-term changes in the large-scale modes of natural atmospheric circulation variability and the complex nonlinear climate–weather interactions associated with them. The detection of changes in such modes from the limited observational record is much less clear cut than for surface temperature. Figure 12 demonstrated this for indices of large-scale modes of wintertime atmospheric circulation variability in the North Atlantic sector (the NAO index) and the North Pacific sector (the NP index). We found no significant changes either in the mean or in the entire PDFs of these indices over the last 140 years. This is of course not to imply that there have been no changes in other modes of circulation variability in other parts of the globe, just that any such changes need to be carefully evaluated on a case-by-case basis after due consideration of the non-Gaussian aspects of the probability distributions.
To detect, attribute, and make credible projections of changing extreme weather risks in a changing climate using climate models, the models need to represent not only the shifting means but also changes in the width and shape of the distributions of daily weather anomalies. Model misrepresentations of these changes can cast doubt on such efforts. Indeed current models have difficulty in representing even the mean changes, such as their inability to adequately capture the post-1998 “hiatus” in global warming (Fyfe et al. 2013). The mean errors are relatively large on regional scales, in substantial part because of model difficulties in capturing the pattern of tropical ocean temperature changes and their global impacts (Shin and Sardeshmukh 2011) that are associated with misrepresentations of tropical feedback processes (Shin et al. 2010). To estimate changes in tail probabilities using such imperfect models, it is important to account for biases in the probability distributions. However, it is clearly inappropriate to do so through a simple a posteriori “bias correction,” given the links between the mean, variance, and shapes of the predicted probability distributions. There is limited understanding of such links at present, mainly due to the limited understanding of climate–weather interactions on regional scales.
Given the importance of non-Gaussianity highlighted in this paper, it would be interesting to use our SGS–Markov process approach to analyze the tail probabilities of other weather-related variables such as near-surface temperature and precipitation in observational datasets and climate model simulations and projections. A multivariate Markov process model, of which (10) is a univariate approximation as discussed by SS09, could also be used as a more accurate “probability model” for this purpose. These are topics of current research.
Acknowledgments
We thank C. Smith for generating Fig. 7 and C. McColl for generating Fig. 11. We also thank the reviewers for comments and suggestions that improved the clarity of the paper. This work was supported by NOAA’s Climate Program Office (CPO) and by DOE’s Office for Science (BER). The Twentieth Century Reanalysis Project used resources of the National Energy Research Scientific Computing Center managed by Lawrence Berkeley National Laboratory, and of the Oak Ridge Leadership Computing Facility at Oak Ridge National Laboratory, which are supported by DOE’s Office of Science under Contracts DE-AC02-05CH11231 and DE-AC05-00OR22725, respectively.
APPENDIX A
An Extremal Probability Shift Index for Changes in Extremes



APPENDIX B
Convergence of GEV Fits to Extreme Value Distributions
Figure B1 shows the goodness of the GEV fit to the “true” extreme value PDFs of Nd-day block maxima in 108-day simulations of the Gaussian and SGS (S = 1, K = 5) Markov processes with a 1-day correlation time scale discussed in the text. The results are shown as cyan and black curves, respectively, for the Gaussian and SGS cases. The goodness of fit is measured in terms of the normalized Kolmogorov–Smirnov distance Dks between the fitted and true distributions, where the normalization is the square root of the number of sample maximum values for each block size. The calculations were performed for specified block sizes increasing from 10 to 90 days in steps of 10 days, from 100 to 900 days in steps of 100 days, etc. Note that the time unit on the abscissa is actually the correlation time scale τc of the Markov process, which is specified as 1 day in the simulations purely to provide meteorological context.
The goodness of fit of the GEV distribution, as a function of sampling block size, to the “true” extreme value distribution of the Markov process [(10)] associated with the Gaussian (cyan) and SGS (black) parent distributions in Fig. 1. See text for further explanation.
Citation: Journal of Climate 28, 23; 10.1175/JCLI-D-15-0020.1
According to extreme value theory, the GEV fit should become more accurate for larger block sizes, and Fig. B1 clearly bears this out. However, the rate of convergence is much slower in the case of the Gaussian parent than the SGS parent, and the GEV fit is much less accurate in the Gaussian parent case for Nd less than 200 “days” (i.e., correlation time units).
APPENDIX C
Generalized Extreme Value and Generalized Pareto Distributions Associated with Parent SGS Distributions

















REFERENCES
Adler, R. F., and Coauthors, 2003: The Version-2 Global Precipitation Climatology Project (GPCP) monthly precipitation analysis (1979–present). J. Hydrometeor., 4, 1147–1167, doi:10.1175/1525-7541(2003)004<1147:TVGPCP>2.0.CO;2.
Compo, G. P., and P. D. Sardeshmukh, 2004: Storm track predictability on seasonal and decadal scales. J. Climate, 17, 3701–3720, doi:10.1175/1520-0442(2004)017<3701:STPOSA>2.0.CO;2.
Compo, G. P., and Coauthors, 2011: The Twentieth Century Reanalysis Project. Quart. J. Roy. Meteor. Soc., 137, 1–28, doi:10.1002/qj.776.
Davison, A. C., and R. L. Smith, 1990: Models for exceedances over high thresholds (with discussion). J. Roy. Stat. Soc., 52B, 393–442.
Dole, R. M., and N. D. Gordon, 1983: Persistent anomalies of the extratropical Northern Hemisphere wintertime circulation: Geographical distribution and regional persistence characteristics. Mon. Wea. Rev., 111, 1567–1586, doi:10.1175/1520-0493(1983)111<1567:PAOTEN>2.0.CO;2.
Fyfe, J. C., N. P. Gillett, and F. W. Zwiers, 2013: Overestimated global warming over the past 20 years. Nat. Climate Change, 3, 767–769, doi:10.1038/nclimate1972.
IPCC, 2014: Climate Change 2014: Synthesis Report. R. K. Pachauri and L. A. Meyer, Eds., IPCC, 151 pp. [Available online at http://www.ipcc.ch/report/ar5/syr/.]
Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. Bull. Amer. Meteor. Soc., 77, 437–471, doi:10.1175/1520-0477(1996)077<0437:TNYRP>2.0.CO;2.
Kanamitsu, M., and Coauthors, 2002: NCEP–DOE AMIP-II Reanalysis (R-2). Bull. Amer. Meteor. Soc., 83, 1631–1643, doi:10.1175/BAMS-83-11-1631.
Katz, R. W., and B. G. Brown, 1992: Extreme events in a changing climate: Variability is more important than averages. Climatic Change, 21, 289–302, doi:10.1007/BF00139728.
Kharin, V. V., and F. W. Zwiers, 2005: Estimating extremes in transient climate change simulations. J. Climate, 18, 1156–1173, doi:10.1175/JCLI3320.1.
Kistler, R., and Coauthors, 2001: The NCEP–NCAR 50-Year Reanalysis: Monthly means CD-ROM and documentation. Bull. Amer. Meteor. Soc., 82, 247–267, doi:10.1175/1520-0477(2001)082<0247:TNNYRM>2.3.CO;2.
Masato, G., B. J. Hoskins, and T. Woollings, 2009: Can the frequency of blocking be described by a red noise process? J. Atmos. Sci., 66, 2143–2149, doi:10.1175/2008JAS2907.1.
Meehl, G. A., and Coauthors, 2000: An introduction to trends in extreme weather and climate events: Observations, socioeconomic impacts, terrestrial ecological impacts, and model projections. Bull. Amer. Meteor. Soc., 81, 413–416, doi:10.1175/1520-0477(2000)081<0413:AITTIE>2.3.CO;2.
Penland, C., and P. D. Sardeshmukh, 2012: Alternative interpretations of power-law distributions found in nature. Chaos, 22, 023119, doi:10.1063/1.4706504.
Perron, M., and P. Sura, 2013: Climatology of non-Gaussian atmospheric statistics. J. Climate, 26, 1063–1083, doi:10.1175/JCLI-D-11-00504.1.
Pickands, J., 1975: Statistical inference using extreme order statistics. Ann. Stat., 3, 119–131, doi:10.1214/aos/1176343003.
Sardeshmukh, P. D., and P. Sura, 2009: Reconciling non-Gaussian climate statistics with linear dynamics. J. Climate, 22, 1193–1207, doi:10.1175/2008JCLI2358.1.
Sardeshmukh, P. D., and C. Penland, 2015: Understanding the distinctively skewed and heavy tailed character of atmospheric and oceanic probability distributions. Chaos, 25, 036410, doi:10.1063/1.4914169.
Shin, S. I., and P. D. Sardeshmukh, 2011: Critical influence of the pattern of tropical ocean warming on remote climate trends. Climate Dyn., 36, 1577–1591, doi:10.1007/s00382-009-0732-3.
Shin, S. I., P. D. Sardeshmukh, and K. Pegion, 2010: Realism of local and remote feedbacks on tropical sea surface temperatures in climate models. J. Geophys. Res., 115, D21110, doi:10.1029/2010JD013927.
Smith, R. L., 1989: Extreme value analysis of environmental time series: An application to trend detection in ground-level ozone. Stat. Sci., 4, 367–377, doi:10.1214/ss/1177012400.
Sura, P., and P. D. Sardeshmukh, 2008: A global view of non-Gaussian SST variability. J. Phys. Oceanogr., 38, 639–647, doi:10.1175/2007JPO3761.1.
Tancredi, A., C. Anderson, and A. O’Hagan, 2006: Accounting for threshold uncertainty in extreme value estimation. Extremes, 9, 87–106, doi:10.1007/s10687-006-0009-8.
Trenberth, K. E., and J. W. Hurrell, 1994: Decadal atmosphere–ocean variations in the Pacific. Climate Dyn., 9, 303–319, doi:10.1007/BF00204745.
Uppala, S. M., and Coauthors, 2005: The ERA-40 Reanalysis. Quart. J. Roy. Meteor. Soc., 131, 2961–3012, doi:10.1256/qj.04.176.
Whitaker, J. S., and P. D. Sardeshmukh, 1998: A linear theory of extratropical synoptic eddy statistics. J. Atmos. Sci., 55, 237–258, doi:10.1175/1520-0469(1998)055<0237:ALTOES>2.0.CO;2.
Woollings, T., A. Hannachi, B. Hoskins, and A. Turner, 2010: A regime view of the North Atlantic Oscillation and its response to anthropogenic forcing. J. Climate, 23, 1291–1307, doi:10.1175/2009JCLI3087.1.
Xie, P., and P. A. Arkin, 1997: Global precipitation: A 17-year monthly analysis based on gauge observations, satellite estimates, and numerical model outputs. Bull. Amer. Meteor. Soc., 78, 2539–2558, doi:10.1175/1520-0477(1997)078<2539:GPAYMA>2.0.CO;2.
Zwiers, F. W., X. Zhang, and Y. Feng, 2011: Anthropogenic influence on long return period daily temperature extremes at regional scales. J. Climate, 24, 881–892, doi:10.1175/2010JCLI3908.1.