## 1. Introduction

The study of extreme events is the analysis of the tails of probability distributions. As the tails of the distributions of most natural phenomena are sparsely populated, such a statistical analysis benefits greatly from very large datasets. However, the climate system has only been closely recorded for periods greater than a few decades in a limited number of locations around the globe. As a result, the comparison of climate model output to the observational record of extreme climate events is a difficult task. Validation of the simulation of extreme climate events has not approached the level of sophistication that has been developed for other aspects of the climate system. Nonetheless, extreme climate events can have tremendous ramifications for human and ecological systems, and it is vital to develop climate models to the point where skillful forecasts of changes in extreme events due to changes in atmospheric composition can be made.

[…] CO_{2} experiment, annual maxima and minima were extracted from a parent dataset of daily values. Their technique to estimate return values of such annual extrema is based on the determination of their probability distribution; L moments (Hosking and Wallis 1997) provide a convenient methodology to fit the parameters of a postulated distribution to sample data. Other traditional methods for fitting distribution functions, such as the method of moments or maximum likelihood methods (von Storch and Zwiers 1999; Coles 2001), could also be used. Under suitable regularity conditions, a theorem (e.g., Coles 2001) states that the maximum over a regular period of a large sample is distributed according to the generalized extreme value (GEV) distribution. The GEV distribution, $F(x)$, written here in the sign convention of Hosking and Wallis (1997), is determined by three parameters, $\xi$, $\alpha$, and $k$:

$$
F(x) =
\begin{cases}
\exp\left\{-\left[1 - k(x - \xi)/\alpha\right]^{1/k}\right\}, & k \neq 0,\\
\exp\left\{-\exp\left[-(x - \xi)/\alpha\right]\right\}, & k = 0,
\end{cases}
$$

where $\xi$, $\alpha$, and $k$ are the location, scale, and shape factors. The range of the random variable $x$ is dependent on the value of the shape factor $k$:

$$
\begin{aligned}
\xi + \alpha/k \le x < \infty, &\quad k < 0,\\
-\infty < x < \infty, &\quad k = 0,\\
-\infty < x \le \xi + \alpha/k, &\quad k > 0.
\end{aligned}
$$

$F(x)$ is the limiting cumulative distribution function of the maxima of a sample of independently and identically distributed random variables such as the annual extrema of a sample of daily averaged fields (Leadbetter et al. 1983). For the probability distributions commonly found in natural systems, this result is a consequence of the asymptotic nature of their tails (Castillo 1988). There are no other classes of distributions for this kind of extrema. However, there are classes of parent distributions for which no asymptotic extreme value distribution exists. The Gumbel distribution is a special case where the shape parameter, $k$, is zero. This distribution is the limiting distribution for maxima drawn from many of the common parent distributions including normal, lognormal, and exponential distributions (Leadbetter et al. 1983). However, when Kharin and Zwiers (2000) performed Kolmogorov–Smirnov goodness-of-fit tests on results from the CCCma model, they found that the GEV distribution, not the Gumbel distribution, better describes the annual maxima of surface air temperature, precipitation, and surface wind speed.
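The L-moment fit mentioned above can be sketched in a few lines. The following is a minimal illustration, not the code used in this study: it assumes the unbiased sample estimators of the first three L moments and the standard approximation for the GEV shape factor given by Hosking and Wallis (1997), with the shape factor positive for a bounded upper tail. The function names `sample_l_moments` and `fit_gev_lmoments` are hypothetical.

```python
import math

def sample_l_moments(data):
    """Return (l1, l2, t3): first two sample L moments and the L skewness."""
    x = sorted(data)
    n = len(x)
    # Unbiased probability-weighted moments; j is the 0-based rank.
    b0 = sum(x) / n
    b1 = sum(j * x[j] for j in range(n)) / (n * (n - 1))
    b2 = sum(j * (j - 1) * x[j] for j in range(n)) / (n * (n - 1) * (n - 2))
    l1 = b0
    l2 = 2.0 * b1 - b0
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    return l1, l2, l3 / l2

def fit_gev_lmoments(data):
    """Estimate GEV (location xi, scale alpha, shape k) from L moments."""
    l1, l2, t3 = sample_l_moments(data)
    # Approximate inversion of the L-skewness relation for the shape factor.
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c
    if abs(k) < 1e-6:
        # Gumbel limit (k = 0); 0.5772... is Euler's constant.
        alpha = l2 / math.log(2.0)
        xi = l1 - 0.5772156649 * alpha
    else:
        g = math.gamma(1.0 + k)
        alpha = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
        xi = l1 - alpha * (1.0 - g) / k
    return xi, alpha, k
```

Calling `fit_gev_lmoments` on a list of annual maxima at one grid point returns the three parameters from which return values follow.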

The return value $X_T$ is that value which is exceeded, on average, once in a period of time $T$. For example, when considering annual maxima of daily averaged variables, there is a $1/T$ chance of any daily average exceeding $X_T$ in a given year (where $T$ is in years). Formally, this is straightforwardly defined as

$$
F(X_T) = 1 - \frac{1}{T}.
$$

Solving for $X_T$ using the above definition of the GEV distribution yields

$$
X_T =
\begin{cases}
\xi + \alpha\left\{1 - \left[-\ln\left(1 - 1/T\right)\right]^{k}\right\}/k, & k \neq 0,\\
\xi - \alpha \ln\left[-\ln\left(1 - 1/T\right)\right], & k = 0.
\end{cases}
$$

Positive values of the shape factor $k$ skew the distribution function such that return values are lower than for the special case of the Gumbel distribution ($k = 0$) with the same location and scale factors. Similarly, negative values of the shape factor cause the return values to be greater than the Gumbel case.
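The effect of the shape factor on the return value can be checked numerically. The sketch below assumes the GEV form of Hosking and Wallis (1997), in which a positive shape factor gives a bounded upper tail; the function names and the parameter values in the usage note are illustrative only and are not taken from the PCM results.

```python
import math

def gev_cdf(x, xi, alpha, k):
    """GEV cumulative distribution F(x) with location xi, scale alpha, shape k."""
    if abs(k) < 1e-9:
        return math.exp(-math.exp(-(x - xi) / alpha))  # Gumbel case
    t = 1.0 - k * (x - xi) / alpha
    if t <= 0.0:
        # x lies beyond the bounded end of the range.
        return 1.0 if k > 0 else 0.0
    return math.exp(-t ** (1.0 / k))

def gev_return_value(T, xi, alpha, k):
    """T-year return value X_T, the solution of F(X_T) = 1 - 1/T."""
    y = -math.log(1.0 - 1.0 / T)
    if abs(k) < 1e-9:
        return xi - alpha * math.log(y)
    return xi + alpha * (1.0 - y ** k) / k

if __name__ == "__main__":
    # Hypothetical location and scale, in mm per day.
    for k in (0.2, 0.0, -0.2):
        print(k, round(gev_return_value(20, 50.0, 10.0, k), 1))
```

For a fixed location and scale, the printed 20-yr return value increases as the shape factor goes from positive through zero to negative, which is the ordering described in the text.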

As the return value is a property of the tail of the distribution, the parent sample size must be large in order to accurately estimate it. Samples of the annual extrema of daily averaged variables are by definition constructed from the endpoints of parent distributions of *all* daily averaged variables for a given year. To construct a sample of extrema large enough to accurately estimate the parameters of the GEV distribution from transiently forced climate model simulations presents a challenge because the statistical properties of such a simulation are not stationary. However, it is reasonable to assume stationarity over short periods of the simulation. By exploiting the statistical independence of climate model simulations initialized by slightly different initial conditions, sample size may be substantially increased, resulting in a more robust estimate of the GEV parameters. Kharin and Zwiers (2000) chose to extract annual extrema over 20-yr simulation periods from an ensemble of three integrations resulting in samples of extrema containing 60 elements. In these numerical experiments, where the forcings of trace atmospheric constituents are varied relatively slowly over a 20-yr period, such an approximation to stationarity is likely to be adequate. However, for more realistic scenarios, such as those incorporating the roughly 3–5-yr forcing effects of large volcanic eruptions, this approximation would not be valid. It is also important to point out that for annual extrema, the parent sample size of daily values cannot be considered to be truly a further 365 times larger than this for most climate variables because of the seasonal nature of the climate system.
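The pooling of annual extrema across ensemble members can be sketched as follows. This is a minimal illustration that assumes each member is a flat list of daily values on a 365-day calendar; the function names are hypothetical and the calendar is an assumption, not a statement about the model output format.

```python
def annual_maxima(daily, days_per_year=365):
    """Largest daily value in each complete year of a daily series."""
    return [
        max(daily[start:start + days_per_year])
        for start in range(0, len(daily) - days_per_year + 1, days_per_year)
    ]

def pooled_annual_maxima(ensemble, days_per_year=365):
    """Concatenate the annual maxima of all ensemble members into one sample.

    Treating the members as statistically independent, an ensemble of
    m members of n years each yields a sample of m * n extrema.
    """
    pooled = []
    for member in ensemble:
        pooled.extend(annual_maxima(member, days_per_year))
    return pooled
```

With three members of 20 years each, as in Kharin and Zwiers (2000), the pooled sample has 60 elements; with the two members used in this paper it has 40.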

In this paper, we examine this seasonality issue, albeit somewhat indirectly, by calculating return values of seasonal maxima and comparing them to those obtained from analysis of the annual maxima. We also examine the relationship, if any, between predicted changes in precipitation extremes and predicted changes in mean precipitation. The parent datasets are constructed from the daily total precipitation combined from two independent transiently forced simulations of the period from 1860 to 2100 from the Parallel Climate Model (PCM) developed at the National Center for Atmospheric Research (NCAR) (Washington et al. 2000). The forcing scenario during the twentieth century is defined by realistic atmospheric concentrations of carbon dioxide and sulfate aerosol (direct effect) and tropospheric and stratospheric ozone concentrations (Dai et al. 2001). The forcing scenario of the twenty-first century assumes the Intergovernmental Panel on Climate Change (IPCC) “business as usual” evolution of these same trace constituents (Dai et al. 2001). Model output from these and many other PCM integrations is available to download (Wehner 2003).

## 2. Simulated present-day climate

Comparison between models and observations of quantities derived from extreme value statistics is hampered by several factors. For precipitation there are two major obstacles to a direct comparison. The first is simply the unavailability of long global time series of observed daily total precipitation. The second is a fundamental issue related to the local character of storms. The horizontal grid spacing of climate models is much larger than the scale of most precipitating cloud systems. This is especially true of the highly convective storms that often produce extreme values of precipitation. Reliable observations of daily precipitation come in the form of rain gauge data from individual stations. Techniques to aggregate these pointlike station data to larger scales are well established for highly averaged quantities such as monthly means. However, the quantities derived by extreme value statistics are largely determined by a limited number of events at the tail of the distribution. Clearly, the extreme events in the pointlike station data will reflect the localized nature of such individual intense storms, whereas the extreme events in a climate model do not because of horizontal resolution constraints. However, it may be possible to construct time series suitable for extreme event analysis at climate model resolutions from the station data by exploiting the correlation of stations located within a model grid cell (Hosking and Wallis 1997). In fact, Osborn and Hulme (1997) examined the closely related issue of daily precipitation variance at several locations in just this manner.

A practical solution to the lack of daily global observations would be to compare the extreme statistics of the model with a reanalysis. Because reanalysis *is* a model (albeit highly constrained by observations), both of these problems might be solved. However, reanalysis precipitation products are not analyzed fields like the wind or humidity fields but rather prognostic results from parameterizations. Hence, reanalyses suffer the same precipitation deficiencies as models. In particular, the high-frequency variance of precipitation in the National Centers for Environmental Prediction (NCEP)–NCAR reanalysis is much lower than that observed in those few regions where such data have been collected (Kistler et al. 2001). If lower-order measures of variability are not well reproduced, it is unlikely that higher-order measures of variability such as extreme values can be well simulated. For these reasons, evaluation of climate model extreme values against the real world is deferred in this paper.^{1}

### a. Annual versus seasonal precipitation maxima

Figure 1 shows the 20-yr return value of annual maxima of daily precipitation extracted from the years 1979–98 of the PCM simulation. Figure 2 shows the 20-yr return value of seasonal maxima calculated from the same data. Generally, the seasonal return values are less than the corresponding annual quantities. Only isolated areas exhibit one seasonal return value that is nearly identical to the annual return value. Such agreement tells us that in these areas, the largest values occur mainly in the same season. For instance, the September–October–November (SON) return values near Japan are close to the annual return values, suggesting that most of the annual maxima occur in the autumn season. On the other hand, in areas such as the southeastern United States, all of the seasonal return values are quite a bit less than the annual return values. This suggests that the annual maxima are distributed among the seasons. Another interesting area is east of the Andes where both the December–January–February (DJF) and SON return values are near in magnitude to the annual return value. This suggests that very large daily precipitation totals could occur in either of these seasons. In any given year, the smaller of the two seasonal maxima is then excluded from the annual maximum dataset despite the possibility of it being one of the largest values attained. Examination of the data reveals that indeed some large seasonal events are not included in the annual maximum dataset. This illustrates a subtle point concerning the definition of extreme values. The methodology as implemented here defines the “return value of the annual extreme event” rather than the “return value of all extreme events.” The exclusion of large events by utilizing only the largest value of the parent dataset during a given period leads to calculated return values that are slightly lower for the former definition than for the latter. Thus, the frequency of extreme events might be underestimated. Other methodologies, such as choosing values that exceed a threshold (von Storch and Zwiers 1999), are likely to be better suited in cases where the entire parent dataset should be considered.
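The exclusion of a large seasonal event from the annual maximum dataset is easy to see in a toy example. The sketch below assumes daily records given as `(month, value)` pairs that already span one December-to-November year; the names and the precipitation values are made up for illustration.

```python
# Map calendar month (1-12) to the season labels used in the text.
SEASON_OF_MONTH = {
    12: "DJF", 1: "DJF", 2: "DJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}

def seasonal_maxima(records):
    """Return the largest value in each season from (month, value) pairs."""
    maxima = {}
    for month, value in records:
        season = SEASON_OF_MONTH[month]
        if season not in maxima or value > maxima[season]:
            maxima[season] = value
    return maxima

# Hypothetical daily totals (mm per day) for one year at one grid point.
records = [(1, 80.0), (1, 12.0), (4, 20.0), (7, 15.0), (10, 95.0), (11, 30.0)]
by_season = seasonal_maxima(records)
annual_max = max(by_season.values())
# The second-largest seasonal maximum never enters the annual maximum sample.
excluded = sorted(by_season.values(), reverse=True)[1]
```

Here the annual maximum is the SON event, and the DJF event, although nearly as large, is absent from the annual sample.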

## 3. Predicted changes in return value

### a. Annual maxima

The PCM exhibits a 2.1°C increase in the annual global mean surface air temperature from 1979 to 2079 under a “business as usual” forcing scenario (Dai et al. 2001). Over that same period the annual global mean daily total precipitation changes from 3.1 to 3.2 mm day^{−1}. Of course, for both fields the mean value changes on smaller scales can be much larger and/or of opposite sign.

Similar to that found by Kharin and Zwiers (2000) in the Canadian Global Climate Model (CGCM1), the PCM exhibits striking changes in the 20-yr return value of annual maxima of daily precipitation from the period 1979–98 to the period 2079–98. Not surprisingly, given the two models' significant hydrological differences in both the control climatology and in their response to greenhouse gas changes (Covey et al. 2000), both the return values and associated changes in the PCM simulation are quite different than those seen in CGCM1. For instance, in CGCM1 the largest return values, exceeding 200 mm day^{−1}, are seen over the western Pacific. The change in return value exceeds 70 mm day^{−1} in parts of this region in the CGCM1 simulations. In the PCM, this strong feature does not appear. Rather, increases are mixed with decreases over this region and do not exceed 40 mm day^{−1}. For both models, biases in mean precipitation can manifest themselves in the return-value spatial patterns. For instance, in the PCM simulation of the Pacific Ocean the annual mean precipitation is much too low at the equator itself compared to observations (Xie and Arkin 1996, 1997). The PCM return values exhibit low values in this same region. This equatorial structure does not exist in maps of the CGCM1 or ECMWF return values (Kharin and Zwiers 2000).

In Fig. 3 (top), the fractional change in the 20-yr return value of annual maxima of daily precipitation from the period 1979–98 to the period 2079–98 relative to the 1979–98 values is shown. Fractional rather than total change is shown in this figure to illustrate that changes in drier regions may be large in a relative sense compared to changes in wetter regions. This is important to note, as the consequences of extreme events may be higher in arid regions than in moist regions even if the magnitude of the events is many times smaller. Conversely, in the driest desert regions the computed precipitation can be so low that any change is insignificant from a practical viewpoint.

### b. Seasonal maxima

Figure 4 shows the fractional changes in the 20-yr return value of seasonal maxima of daily precipitation from the period beginning in December 1978 to that beginning in December 2078. It is evident that these seasonal changes generally differ significantly from each other and the annual changes shown in Fig. 3 (top). A few of the dominant features in the annual changes do appear in some seasons. Most notably, large-scale increases in return values over the equatorial Atlantic, Indian, and western Pacific Oceans are found in both the annual and SON results. However, most of the features in the seasonal changes are completely absent from the annual changes. This is particularly true in regions where the seasonal return value decreases, such as the DJF tropical oceans. Certain features are potentially important. For instance, extreme winter storms over midcontinental North America and China show rather dramatic increases. Given the potential for large amounts of snow in these events, their impact could be quite costly.

Quantifying uncertainty of return value estimates is an important point that is difficult to fully address in the current study due to the limited ensemble size. A straightforward Monte Carlo algorithm (Hosking and Wallis 1997) can be used to assess the statistical accuracy of the properties of a postulated GEV distribution. The method is to first estimate the L moments of the actual sample data. This is followed by generation of random samples distributed according to the GEV distribution associated with these L moments. Next, L moments and, thus, return values for each of the random samples are determined.

For this study, 200 random samples of 40 elements each were generated at every grid point using the annual and seasonal values of the computed L moments. From these randomly generated samples, an associated distribution of return values was calculated. At each grid point, confidence intervals on the return values were constructed by considering the percentiles of this distribution. Comparison of the projected changes in return values with these confidence intervals provides an estimate of their degree of statistical significance. The fractional return value changes in Figs. 3 and 4 exhibit a range of large and small values. The statistical significance of these changes depends strongly on their magnitudes and varies with the time interval considered. In general, the regions of larger change exhibit a high degree of spatially coherent statistical significance. In the first column of Table 1, the fraction of the computational cells that exhibits a change in return value greater than 10% and statistically significant at the 95% level is shown. In the second column of Table 1, results are shown for the cells exhibiting a change in return value greater than 25%. For all seasons, these numbers reveal that roughly one-third of the area of the globe that exhibits a change greater than 25% also exhibits a high degree of statistical certainty. These estimates of statistical significance were also repeated using only 100 random samples. Very similar results were obtained, indicating that these estimates are robust.
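The Monte Carlo procedure described above can be sketched at a single grid point as follows. This is a hedged illustration rather than the code used for this study: it assumes the L-moment GEV estimators of Hosking and Wallis (1997), draws synthetic samples by inverting the fitted distribution, and takes simple percentiles of the resulting return values. All function names, the seed, and the percentile indexing are assumptions.

```python
import math
import random

def fit_gev(data):
    """L-moment estimates of GEV (xi, alpha, k), Hosking-Wallis convention."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(j * x[j] for j in range(n)) / (n * (n - 1))
    b2 = sum(j * (j - 1) * x[j] for j in range(n)) / (n * (n - 1) * (n - 2))
    l1, l2 = b0, 2.0 * b1 - b0
    t3 = (6.0 * b2 - 6.0 * b1 + b0) / l2
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c
    if abs(k) < 1e-6:
        alpha = l2 / math.log(2.0)
        return l1 - 0.5772156649 * alpha, alpha, 0.0
    g = math.gamma(1.0 + k)
    alpha = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    return l1 - alpha * (1.0 - g) / k, alpha, k

def gev_quantile(p, xi, alpha, k):
    """Inverse of the GEV distribution function at probability p."""
    y = -math.log(p)
    if k == 0.0:
        return xi - alpha * math.log(y)
    return xi + alpha * (1.0 - y ** k) / k

def return_value_interval(sample, T=20, n_samples=200, level=0.95, seed=0):
    """Return (estimate, lower, upper) for the T-year return value."""
    rng = random.Random(seed)
    params = fit_gev(sample)
    p_T = 1.0 - 1.0 / T
    simulated = []
    for _ in range(n_samples):
        # Synthetic sample of the same size, drawn from the fitted GEV.
        draws = [min(max(rng.random(), 1e-12), 1.0 - 1e-12) for _ in sample]
        synthetic = [gev_quantile(p, *params) for p in draws]
        simulated.append(gev_quantile(p_T, *fit_gev(synthetic)))
    simulated.sort()
    lower = simulated[int((1.0 - level) / 2.0 * n_samples)]
    upper = simulated[min(n_samples - 1, int((1.0 + level) / 2.0 * n_samples))]
    return gev_quantile(p_T, *params), lower, upper
```

A projected change at a grid point could then be called significant when the future return value falls outside the interval obtained for the present-day sample, which is one plausible reading of the comparison described above.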

Note that this uncertainty analysis does not directly address the issue of sample size. As the return values are determined by a small fraction of the parent sample, it is entirely possible that different realizations of the climate model integration could produce different return value estimates. While the methodology presented here can formally quantify the uncertainty of the return value properties of the given parent sample, it does not reveal any information about the stability or repeatability of this parent sample. This issue could be better addressed by examining larger parent samples obtained from larger ensemble integrations or long control runs.

## 4. Relationship between changes in extremes and changes in lower-order statistics

It is natural to expect that there is some sort of relationship between changes in the mean climate and changes in extreme events. However, for the PCM, the weighted centered pattern correlation is globally about 0.26 between the fractional changes in the 20-yr return value of annual maxima of daily precipitation (Fig. 3, top) and the fractional changes in the annual mean daily precipitation (Fig. 3, bottom). Correlations between the seasonal return value fractional changes (Fig. 4) and seasonal mean value fractional changes (Fig. 5) are significantly lower than that found for the annual results in all seasons and are shown in the first column of Table 2a.
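For reference, a weighted centered pattern correlation can be computed as below. The sketch assumes that the weights are grid-cell areas, approximated by the cosine of latitude, and that the two fields have been flattened to lists of equal length; the source does not spell out the weighting, so this choice and the function name are assumptions.

```python
import math

def weighted_centered_correlation(a, b, weights):
    """Correlation of two fields after removing their weighted means."""
    wsum = sum(weights)
    mean_a = sum(w * x for w, x in zip(weights, a)) / wsum
    mean_b = sum(w * y for w, y in zip(weights, b)) / wsum
    cov = sum(w * (x - mean_a) * (y - mean_b) for w, x, y in zip(weights, a, b))
    var_a = sum(w * (x - mean_a) ** 2 for w, x in zip(weights, a))
    var_b = sum(w * (y - mean_b) ** 2 for w, y in zip(weights, b))
    return cov / math.sqrt(var_a * var_b)

# Area weights for cells centered at the given latitudes (degrees).
latitudes = [-60.0, 0.0, 60.0]
area_weights = [math.cos(math.radians(lat)) for lat in latitudes]
```

"Centered" refers to the removal of the weighted global mean from each field before the correlation is formed, so that a uniform offset between the two maps does not affect the result.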

These low values of the global linear correlation factor between the fractional changes in mean and extreme precipitation of Figs. 3, 4, and 5 are obtained despite the existence of regions where large-scale similarities between the two changes would appear to suggest a relationship. A number of properties of these fields cause this linear correlation to be lower than expected. First, consider the spatial noise in Figs. 3, 4, and 5. A simple regridding of these maps to resolutions corresponding to T21 and T10 yields values of weighted centered pattern correlation in columns 2 and 3 of Table 2a. Significant correlation increases are seen for the DJF and June–July–August (JJA) changes but not for the other seasonal changes or for the annual change. Next, consider that there are regions of nonmonotonicity where positive changes in mean coincide with negative changes in return value or vice versa. Some of the largest values where this occurs are in arid regions. Considering absolute rather than fractional changes in a correlation analysis tends to discount these regions due to the lower values of both mean and return value. Table 2b shows the global linear correlation factor between the absolute changes in mean and extreme precipitation at the original T42 resolution as well as those obtained by regridding to T21 and T10. Correlation values are considerably more uniform across the seasons at a fixed resolution. Reduction of the spatial noise by the regridding procedure also increases the correlation values more uniformly across the seasons. Finally, although nearly all of the larger changes in return value are statistically significant as shown in Table 1, only about one-third of all changes are so. In Table 2c, the global linear correlation factor between only the absolute changes in mean and extreme precipitation that are significant at the 90% level is shown at the three resolutions considered. These numbers are nearly identical to those in Table 2b, indicating that the statistically insignificant changes are so small as to not contribute much to the globalized result.
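One simple way to reduce spatial noise before correlating two maps is an area-weighted block average. The sketch below is an assumption about what a "simple regridding" might look like, not the procedure used here: it halves the resolution of a latitude-longitude field by averaging 2 × 2 blocks, which only roughly corresponds to going from a T42 to a T21 grid. The function name `coarsen` is hypothetical.

```python
import math

def coarsen(field, lats, factor=2):
    """Area-weighted block average of a lat-lon field stored as a list of rows.

    field[i][j] is the value at latitude lats[i] and longitude index j.
    Returns the coarsened field and the weighted latitude of each new row.
    """
    nlat, nlon = len(field), len(field[0])
    out, out_lats = [], []
    for i0 in range(0, nlat - factor + 1, factor):
        # Cosine-of-latitude weights for the rows in this block.
        w = [math.cos(math.radians(lats[i0 + r])) for r in range(factor)]
        wsum = sum(w)
        row = []
        for j0 in range(0, nlon - factor + 1, factor):
            total = sum(
                w[r] * field[i0 + r][j0 + c]
                for r in range(factor)
                for c in range(factor)
            )
            row.append(total / (wsum * factor))
        out.append(row)
        out_lats.append(sum(w[r] * lats[i0 + r] for r in range(factor)) / wsum)
    return out, out_lats
```

Applying `coarsen` to both the mean-change and return-value-change maps before computing the pattern correlation mimics the comparison across resolutions reported in Table 2.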

Allen and Ingram (2002) suggest that future changes in the mean precipitation and future changes in extreme precipitation events will be driven by differing mechanisms. They maintain that, while the mean precipitation is governed by the energy budget, extreme rainfall events occur when the entire moisture content in a volume of air is precipitated out. Changes in these latter events would be controlled by the Clausius–Clapeyron relationship. Centered pattern correlation is a measure of the linear relationship between two fields and would not necessarily reveal nonlinear relationships such as that between changes in the energy budget and saturation moisture content. On the other hand, one might suspect that the limited areas where the predicted changes in precipitation mean and return value are of an opposite sign cause the values of global linear correlation factor in Table 2c to be low, masking an otherwise linear relationship. These cells represent between 20% and 30% of the statistically significant changes depending on the season and the amount of regridding (Table 3). However, even if these cells are arbitrarily removed from the correlation calculation, the values obtained are only slightly increased over those shown in Table 2c. Hence, whatever the relationship is between local changes in precipitation mean values and return values, it is not likely to be linear and perhaps not even monotonic.
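As a rough worked illustration of the Clausius–Clapeyron constraint, and not a result of this study, the fractional sensitivity of the saturation vapor pressure $e_s$ to temperature can be evaluated with textbook values of the latent heat of vaporization $L_v \approx 2.5 \times 10^{6}$ J kg$^{-1}$ and the gas constant for water vapor $R_v \approx 461.5$ J kg$^{-1}$ K$^{-1}$ at an assumed temperature of $T = 288$ K:

$$
\frac{d \ln e_s}{dT} = \frac{L_v}{R_v T^{2}} \approx \frac{2.5 \times 10^{6}}{461.5 \times 288^{2}} \approx 0.065\ \mathrm{K^{-1}}.
$$

If, purely for scale, the 2.1°C global mean warming quoted earlier is inserted, the saturation vapor pressure rises by a factor of about $\exp(0.065 \times 2.1) \approx 1.15$, that is, by roughly 15%. Local temperature changes differ from the global mean, so this number indicates only the order of magnitude of the thermodynamic effect.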

Since extreme values are large, yet infrequent, excursions from the mean, they are a sort of higher-order measure of the variability of a random variable. To explore the relationship, if any, between lower-order measures of variability and extremes it is helpful to consider the nature of the distribution function. On daily time scales at any given location, the intermittent nature of precipitation mechanisms causes the parent distribution of precipitation to be far from Gaussian. Besides the obvious constraint of nonnegativity, at such short durations, many points in the distribution will be near or at zero. Only at much longer time scales, for example, annual or greater, is the normal distribution a good approximation (Lettenmaier 1995). Hence the daily distribution of precipitation is very skewed with a long, broad tail to its largest values. The standard deviation of this distribution is largely determined by this tail as the other side is highly peaked. Although not shown here, maps of the changes in the standard deviation of daily precipitation from the period spanning 1979–98 to the period spanning 2079–98 do indeed exhibit reasonably high pattern correlation with the changes in return value over this same period as shown in Table 4. Each of the seasonal correlations is higher than the annual result, illustrating that daily variability across the seasons dilutes this relationship.

## 5. Conclusions

Consideration of future climate change must include analysis of changes in extreme events due to the potential for significant damage to human and natural systems. The definition of an extreme event is somewhat subjective because climate differs from one location to another. However, it is clear that such rare events are best described as lying in the tails of a larger parent distribution. Following previous authors (Kharin and Zwiers 2000; Zwiers and Kharin 1998a), maximum daily precipitation rates achieved over annual and seasonal periods are considered in this paper as a population of extreme events. Because the generalized extreme value distribution describes this type of data well, return values are readily calculated.

The ability of current climate models to simulate extreme daily events is yet to be examined^{2} (Zwiers and Kharin 1998b). For extreme precipitation events, direct comparison of models with observations is complicated by the local nature of intense storms. A sparsity of lengthy global observations further restricts such assessments.

The seasonal nature of precipitation mechanisms is reflected in the 20-yr return values. Many of the features exhibited by the seasonal maxima are absent when considering the annual maximum. Changes in the seasonal return values are also shown to be considerably different than changes in the annual return value. Seasonal changes in extreme events may have far more impact on human and natural systems than do annual changes. For instance, in the model considered here, PCM, midlatitude wintertime return values are significantly increased. The consequences of larger snowstorms could be severe. Other important seasonal disasters, for example, floods and cropland erosion, are also better understood from the study of seasonal return values.

The relationship between changes in mean precipitation and changes in 20-yr daily precipitation return values is complex as simulated by PCM. A weighted centered pattern linear correlation between the two fields is uniformly low across seasons or the annual period. Removal of statistically insignificant changes, reduction of spatial noise by a regridding procedure, and even an arbitrary removal of cells with changes of opposite signs do not increase the linear correlation. The relationship is concluded to be certainly nonlinear and possibly nonmonotonic, thereby limiting the ability to accurately predict changes in extreme events from predicted changes in mean precipitation rates. However, linear correlations between changes in precipitation daily standard deviation and changes in return values are relatively high, implying that changes in the shape of the distribution of daily precipitation are more important than changes in the average of the distribution when trying to understand changes in extreme events.

Daily output data from lengthy climate model integrations is cumbersome and often not saved. The addition of a simple monthly diagnostic, the maximum daily precipitation rate achieved in the month, allows for the calculation of the extreme value statistics described in this paper. Although other forms of extreme event analysis are not enabled by this diagnostic (Frich et al. 2002), it is recommended that the coupled climate model community add this type of quantity to their lists of monthly output if higher-frequency output is not routinely saved.
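The recommended monthly diagnostic is sufficient because the maximum over a year or season is simply the maximum of the monthly maxima it contains. A minimal sketch, assuming a flat list of monthly maximum daily precipitation rates with January of the first year at index 0 (the function names are hypothetical):

```python
def annual_max_from_monthly(monthly_max):
    """Annual maxima from monthly maxima, 12 consecutive entries per year."""
    return [
        max(monthly_max[start:start + 12])
        for start in range(0, len(monthly_max) - 11, 12)
    ]

def seasonal_max_from_monthly(monthly_max, first_month_index):
    """Maxima over 3-month seasons, one value per complete season.

    first_month_index is the 0-based position of the first month of the
    season in the first year: 11 for DJF (starting in December), 2 for MAM,
    5 for JJA, 8 for SON.
    """
    return [
        max(monthly_max[start:start + 3])
        for start in range(first_month_index, len(monthly_max) - 2, 12)
    ]
```

The resulting annual or seasonal maxima can then be fitted with the GEV distribution exactly as if they had been extracted from the daily output.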

## Acknowledgments

The author would like to thank Ben Santer and Tom Wigley for comments on this paper.

## REFERENCES

Allen, M. R., and W. J. Ingram, 2002: Constraints on future changes in climate and the hydrologic cycle. *Nature*, **419**, 224–232.

Castillo, E., 1988: *Extreme Value Theory in Engineering*. Academic Press, 408 pp.

Coles, S., 2001: *An Introduction to Statistical Modeling of Extreme Values*. Springer-Verlag, 208 pp.

Covey, C., K. M. AchutaRao, S. J. Lambert, and K. T. Taylor, 2000: Intercomparison of present and future climates simulated by coupled ocean–atmosphere GCMs. PCMDI Rep. 66, UCRL-ID-140325, Lawrence Livermore National Laboratory, 20 pp.

Dai, A., T. M. L. Wigley, B. A. Boville, J. T. Kiehl, and L. E. Buja, 2001: Climates of the twentieth and twenty-first centuries simulated by the NCAR Climate System Model. *J. Climate*, **14**, 485–519.

Frich, P., L. V. Alexander, P. Della-Marta, B. Gleason, M. Haylock, A. M. G. Klein Tank, and T. Peterson, 2002: Observed coherent changes in climatic extremes during the second half of the twentieth century. *Climate Res.*, **19**, 193–212.

Hosking, J. R. M., and J. R. Wallis, 1997: *Regional Frequency Analysis: An Approach Based on L-Moments*. Cambridge University Press, 224 pp.

Kharin, V. V., and F. W. Zwiers, 2000: Changes in the extremes in an ensemble of transient climate simulation with a coupled atmosphere–ocean GCM. *J. Climate*, **13**, 3760–3788.

Kistler, R., and Coauthors, 2001: The NCEP–NCAR 50-year reanalysis: Monthly means CD-ROM and documentation. *Bull. Amer. Meteor. Soc.*, **82**, 247–267.

Leadbetter, M. R., G. Lindgren, and H. Rootzen, 1983: *Extremes and Related Properties of Random Sequences and Processes*. Springer-Verlag, 336 pp.

Lettenmaier, D., 1995: Stochastic modeling of precipitation with applications to climate model downscaling. *Analysis of Climate Variability: Applications of Statistical Techniques*, H. von Storch and A. Navarra, Eds., Springer-Verlag, 197–212.

Osborn, T. J., and M. Hulme, 1997: Development of a relationship between station data and grid-box rainday frequencies for climate model evaluation. *J. Climate*, **10**, 1885–1908.

von Storch, H., and F. W. Zwiers, 1999: *Statistical Analysis in Climate Research*. Cambridge University Press, 484 pp.

Washington, W. M., and Coauthors, 2000: Parallel climate model (PCM) control and transient simulations. *Climate Dyn.*, **16**, 755–774.

Wehner, M. F., cited 2003: US DOE Coupled Climate Model data archives. [Available online at http://www.nersc.gov/projects/gcm_data.]

Xie, P., and P. A. Arkin, 1996: Analyses of global monthly precipitation using gauge observations, satellite estimates, and numerical model predictions. *J. Climate*, **9**, 840–858.

Xie, P., and P. A. Arkin, 1997: Global precipitation: A 17-year monthly analysis based on gauge observations, satellite estimates, and numerical model outputs. *Bull. Amer. Meteor. Soc.*, **78**, 2539–2558.

Zwiers, F. W., and V. V. Kharin, 1998a: Changes in the extremes of the climate simulated by CCC GCM2 under CO2 doubling. *J. Climate*, **11**, 2200–2222.

Zwiers, F. W., and V. V. Kharin, cited 1998b: AMIP II Diagnostic Subproject 18: Intercomparison of surface climate extremes. [Available online at http://www-pcmdi.llnl.gov/amip/DIAGSUBS/sp18.html.]

Table 1. Percentage of cells with a calculated change in return value greater than 10% or 25% that exhibit statistical significance at the 95% level.

Table 2. The weighted centered pattern correlation relating changes in return values and changes in the mean for the various parts of the year.

Table 3. Fraction of cells with statistically significant (at the 90% level) changes in return value of the same sign as changes in mean precipitation.

Table 4. The weighted centered pattern correlation relating changes in return values and changes in the standard deviation at the original T42 resolution for the various parts of the year.

^{1} Note that Kharin and Zwiers (2000) compare annual mean wet and dry periods obtained from the ECMWF reanalysis favorably with Canadian station data.

^{2} Intercomparison of surface climate extremes (F. W. Zwiers and V. V. Kharin, coordinators).