Robustness of Future Changes in Local Precipitation Extremes

Elizabeth J. Kendon Met Office Hadley Centre, Exeter, United Kingdom

David P. Rowell Met Office Hadley Centre, Exeter, United Kingdom

Richard G. Jones Met Office Hadley Centre (Reading Unit), University of Reading, Reading, United Kingdom

Erasmo Buonomo Met Office Hadley Centre, Exeter, United Kingdom


Abstract

Reliable projections of future changes in local precipitation extremes are essential for informing policy decisions regarding mitigation and adaptation to climate change. In this paper, the extent to which the natural variability of the climate affects one’s ability to project the anthropogenically forced component of change in daily precipitation extremes across Europe is examined. A three-member ensemble of the Hadley Centre Regional Climate Model (HadRM3H) is used and a statistical framework is applied to estimate the uncertainty due to the full spectrum of climate variability. In particular, the results and understanding presented here suggest that annual to multidecadal natural variability may contribute significant uncertainty. For this ensemble projection, extreme precipitation changes at the grid-box level are found to be discernible above climate noise over much of northern and central Europe in winter, and parts of northern and southern Europe in summer. The ability to quantify the change to a reasonable level of accuracy is largely limited to regions in northern Europe. In general, where climate noise has a significant component varying on decadal time scales, single 30-yr climate change projections are insufficient to infer changes in the extreme tail of the underlying precipitation distribution. In this context, the need for ensembles of integrations is demonstrated and the relative effectiveness of spatial pooling and averaging for generating robust signals of extreme precipitation change is also explored. The key conclusions are expected to apply more generally to other models and forcing scenarios.

Corresponding author address: Elizabeth Kendon (née Kennett), Met Office Hadley Centre, Fitzroy Rd., Exeter, EX1 3PB, United Kingdom. Email: elizabeth.kendon@metoffice.gov.uk

1. Introduction

The Intergovernmental Panel on Climate Change (IPCC), which provides a comprehensive assessment of the current status of scientific research on climate change, suggests that there may be more intense precipitation events in the future over many areas of Europe (Christensen et al. 2007). Changes in the frequency and intensity of extreme precipitation may have considerable impacts on society, including impacts on agriculture, industry, the built-environment, and natural ecosystems (Ekström et al. 2005; McCarthy et al. 2001). Thus, accurate predictions of future changes in local precipitation extremes are essential for informing policy decisions regarding mitigation and adaptation to climate change.

Model projections of future climate change encompass a wide range of uncertainties. The natural variability of the climate limits the accuracy of future climate predictions, while additional uncertainty is associated with modeling deficiencies and future emissions (Baede et al. 2001). If we are to understand the anthropogenically forced component of climate change, it is essential to characterize the uncertainty due to natural variability. However, despite its fundamental significance, there are few studies to date assessing the importance of natural climate variability in the context of predicting changes in climate extremes.

Natural variability occurs on all time scales from subdaily to multidecadal. On daily time scales or less, it is associated with individual weather events, whereas on longer time scales (seasonal and beyond) it derives from the global interaction of weather and climate events. Full knowledge of the natural variability of a climate variable is required to define its extreme behavior. As a consequence, natural variability results in uncertainty in the estimate of an extreme metric obtained from a finite number of years of data. The more years of data available, the more extreme behavior we are able to capture and the greater certainty we have in the sampling of less extreme metrics. For the idealized case of an infinite number of years of data from a stationary long-term climate, a given extreme metric of the population would be known precisely. However, for 30 yr of data, which is the typical length of a model integration and time period over which stationarity may be assumed for present-day precipitation (Mitchell 2003), uncertainty in the extreme estimate arises from both the finite sampling of short time-scale variability and the fact that this represents only one realization of multidecadal variability. Ultimately, we are less interested in one possible 30-yr realization of the future climate than in the underlying probability of a given extreme event in any future period.

Natural variability has been shown to contribute significant uncertainty to estimated future changes in seasonal and annual mean precipitation at large spatial scales resolved in general circulation models (GCMs) (Räisänen 2001; Murphy et al. 2004; Sorteberg and Kvamsto 2006). We expect that this is likely to be even greater at the finer spatial and temporal scales associated with local precipitation extremes (Mitchell et al. 1999; Räisänen 2001). Regional climate models (RCMs) have been widely used to provide projections of future changes in extreme precipitation, showing good skill in estimating the observed statistics of daily precipitation in the current climate (Durman et al. 2001; Jones and Reid 2001; Räisänen and Joelsson 2001; Fowler et al. 2005; Frei et al. 2006; Buonomo et al. 2007). However, few studies have examined the uncertainty in these future changes due to natural variability, with exceptions including Räisänen and Joelsson (2001), Huntingford et al. (2003), Ekström et al. (2005), Frei et al. (2006), and Buonomo et al. (2007).

Räisänen and Joelsson (2001) examined the significance of local changes in annual maximum daily precipitation within the context of interannual variability over single 10-yr control and future integrations. Huntingford et al. (2003) applied a bootstrap resampling technique to generalized extreme value (GEV) curves fitted to single 30-yr integrations to assess the significance of changes in local precipitation extremes. A similar parametric bootstrap procedure was also used by Zwiers and Kharin (1998) to examine changes in precipitation extremes at the GCM scale. A further approach using profile likelihood to assess uncertainty in a GEV fit to extreme data was applied in Buonomo et al. (2007). These approaches account for uncertainty due to the sampling of short time-scale natural variability, but do not address the additional issue of climate variability on decadal time scales and longer. We know, however, that there is a significant “red noise” component (i.e., significant spectral power at longer time scales) in sea surface temperatures (SSTs) (e.g., Hasselmann 1976; Rowell and Zwiers 1999) and so it is likely that at least some of this will be reflected in long-term natural variations of continental climate. For example, multidecadal variability of the Atlantic is known to contribute to seasonal mean anomalies over Europe (Hurrell 1995; Knight et al. 2006), and so may also be expected to contribute to uncertainty in extreme precipitation. In particular, changes in extreme precipitation are found to be largely determined by changes in seasonal mean precipitation over European regions in winter (Frei et al. 2006). Scaife et al. (2008) also provide direct evidence that multidecadal variability in large-scale circulation patterns is responsible for much of the observed change in extreme wintertime precipitation over Europe in recent decades.

An ensemble of model simulations using perturbed initial conditions can be used to provide multiple realizations of a future climate under a given forcing scenario. This can then be used to provide an estimate of uncertainty in a given anthropogenic climate change projection due to the full spectrum of internal variability. Ekström et al. (2005) applied a nonparametric bootstrap resampling technique to a three-member RCM ensemble to estimate the uncertainty bounds on future changes in annual extremes of daily precipitation across the United Kingdom. Frei et al. (2006) applied a similar technique to 90 yr of seasonal maxima data from three-member RCM ensembles to assess the significance of changes in extreme precipitation across Europe. They found that using three ensemble members led to a consistently different assessment of the changes in extreme precipitation over central Europe compared to the results when using a single ensemble member. The work presented here uses a three-member ensemble of the Hadley Centre RCM (HadRM3H) to carry out an in-depth assessment of the robustness of extreme daily precipitation change across Europe, on a seasonal basis. In particular, we address the possibility of a wider spread in extreme precipitation estimates from untried ensemble members sampling different phases of multidecadal variability (neglected in the above bootstrap resampling from within a given period of years). Specifically, we use the three ensemble members to provide an estimate of the spread of the underlying population of responses we would have seen with an infinite ensemble, taking account of the large uncertainty due to the small ensemble size. In this context, we examine the following questions:

  • Over what areas of Europe are changes in extreme precipitation discernible above natural variability, for the particular model and scenario used here?

  • Are single 30-yr integrations for the control and future climate sufficient to infer changes in the extreme tail of the underlying precipitation distribution at a given location?

  • How effective is spatial pooling in reducing the noise due to internal variability, and to what degree does the robustness of extreme precipitation change increase with spatial scale?

Note that throughout this paper we use the term “extreme precipitation” to refer specifically to extreme heavy precipitation. This is defined as the average of the heaviest 5% of all precipitation days in a given season.

It should be emphasized that in this study we are considering the robustness of extreme precipitation changes within the context of a given climate projection, that is, for a specific model and forcing scenario. This should not be confused with a robust signal of general climate change, which would include an assessment of uncertainty due to model deficiencies and future emissions, as well as natural variability. However, we anticipate that the results gained here for the HadRM3H climate projection will provide an understanding of the importance of natural variability for predicting extreme precipitation change that will be more generally applicable to other models and forcing scenarios.

The structure of this paper is as follows. Section 2 describes the experimental setup and statistical methodology, including a discussion of the implications of spectral redness in extreme precipitation data. Support for the chosen approach is also provided by a comparison of statistical methods contained in the appendix. Section 3 gives the motivation for this study, highlighting the intraensemble differences in projected future changes in extreme precipitation due to internal variability. Signal-to-noise ratios for these changes in extreme precipitation are analyzed in section 4, indicating where there is a discernible anthropogenic change across Europe and additionally where the change may be quantified within reasonable bounds. Section 5 discusses the extent to which increasing the ensemble size would lead to the emergence of the signal from climate noise, while the effectiveness of spatial processing is assessed in section 6. Section 7 examines the robustness of the finescale contribution to local changes provided by the RCM, and finally, the main conclusions are discussed in section 8.

2. Methodology

a. Model data

A three-member ensemble of the Hadley Centre RCM, HadRM3H, is used to estimate uncertainty in extreme precipitation change due to natural internal variability. The simulated control and future periods are 1960–90 and 2070–2100, respectively, with atmospheric constituents following the IPCC Special Report on Emissions Scenarios (SRES) A2 scenario. The model domain spans Europe with a horizontal resolution of approximately 50 km × 50 km, and the model formulation has been described by Buonomo et al. (2007). Lateral boundary forcing, land surface initialization, and initial atmospheric conditions are all derived from a corresponding three-member ensemble of the global atmospheric model HadAM3H. Thus, while having the same anthropogenic forcings, the ensemble members of the RCM differ in their realizations of large-scale internal climate variability, since this is inherited through the lateral boundaries from large-scale differences within the global model ensemble. The marine surface boundary conditions are identical to those of HadAM3H and are described below.

The experimental design of the driving HadAM3H ensemble is identical to that described by Rowell (2005) for the similar HadAM3P model and is summarized as follows. All members of the HadAM3H control ensemble are forced by observed SSTs, using the Hadley Centre Sea Ice and Sea Surface Temperature dataset (HadISST1; Rayner et al. 2003). Thus, the control ensemble members, for both HadAM3H and HadRM3H, differ only in their internal atmospheric and land surface variabilities (since their marine forcings are identical). For the future period, the HadAM3H ensemble is forced by SSTs that are formed from observed SSTs with the addition of mean changes and trends calculated from an ensemble of global coupled model projections. These coupled integrations, using HadCM3, were initiated from three different times within a multicentennial control integration, and hence the ensemble spread of the resulting HadAM3H and HadRM3H integrations simulates multidecadal internal variations of the coupled ocean–atmosphere system (since the added mean SST changes and trends have evolved differently in each HadCM3 ensemble member).

The HadRM3H integrations represent the current “state of the art” in climate modeling, and have been shown to give good skill in estimating the observed extreme precipitation statistics across Europe (Fowler et al. 2005; Frei et al. 2006). We note that data from the first year of the integrations are discarded to remove the effects of model spinup. Furthermore, data close to the lateral boundaries are relaxed to the GCM driving data over a four-point rim, and the RCM topography follows that of the GCM over an additional four-point inner rim, following the methodology of Jones et al. (1995). Thus, this entire zone (an eight-point rim) is excluded from the analysis to remove any boundary artifacts.

b. Analysis methods

In this study, extreme precipitation is defined as the mean daily precipitation exceeding the 95th percentile of wet days, for December–February (DJF or winter) or June–August (JJA or summer), using a wet-day threshold of 0.1 mm day$^{-1}$. As such, 30 yr of data for each simulation correspond to about 60 extreme events in a given season (40%–50% wet days × 30 yr × 90 days × 5%).
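The definition above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `extreme_precip`, the synthetic gamma-distributed rainfall, and the choice to count a day as wet when precipitation is greater than or equal to the threshold are all assumptions made here.

```python
import numpy as np

WET_DAY_THRESHOLD = 0.1  # mm/day, the wet-day threshold quoted in the text


def extreme_precip(daily_precip, percentile=95.0, wet_threshold=WET_DAY_THRESHOLD):
    """Mean of wet-day precipitation exceeding the given wet-day percentile."""
    daily = np.asarray(daily_precip, dtype=float)
    wet = daily[daily >= wet_threshold]        # ASSUMPTION: wet means >= threshold
    if wet.size == 0:
        return float("nan")
    cutoff = np.percentile(wet, percentile)    # e.g. 95th percentile of wet days
    extremes = wet[wet > cutoff]               # days exceeding that percentile
    return float(extremes.mean()) if extremes.size else float("nan")


# Hypothetical synthetic season: 30 yr x 90 days with roughly 45% wet days.
rng = np.random.default_rng(0)
n_days = 30 * 90
is_wet = rng.random(n_days) < 0.45
series = np.where(is_wet, rng.gamma(shape=0.8, scale=6.0, size=n_days), 0.0)

metric = extreme_precip(series)
wet_days = series[series >= WET_DAY_THRESHOLD]
n_events = int(np.sum(wet_days > np.percentile(wet_days, 95.0)))

# Back-of-envelope event count from the text: wet fraction x years x days x 5%.
expected_events = 0.45 * 30 * 90 * 0.05
print(f"metric = {metric:.1f} mm/day from {n_events} events "
      f"(rule of thumb: about {expected_events:.0f})")
```

With only a few tens of events per 30-yr season, the metric is an average over a small sample, which is why the sampling noise discussed in the following sections matters.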

If $x_i$ is the extreme precipitation, for a certain grid box, for the control simulations, then
$$x_i = \mu + \varepsilon_{x_i}, \qquad (1)$$
where $i = 1, 2, 3$ is the ensemble member, $\mu$ is the forced component of $x_i$ (due to time-varying SSTs and atmospheric composition common to each ensemble member), and $\varepsilon_{x_i}$ is the component of $x_i$ due to internal atmospheric variability (following Rowell 2005). Similarly, extreme precipitation for the future scenario $y_i$ can be expressed as
$$y_i = \mu + \mu_f + \varepsilon_{y_i}, \qquad (2)$$
where $\mu_f$ is the forced component of $y_i$ due to changes in atmospheric composition and ensemble mean SST in the future minus the control, and $\varepsilon_{y_i}$ is the component due to the internal variability, in this case including both internal atmospheric variability and that due to anomalous SST changes between each ensemble member.
To assess the robustness of the changes in extreme precipitation within the context of natural variability, we examine the signal-to-noise ratio (SNR) defined as
$$\mathrm{SNR} = \frac{\bar{y} - \bar{x}}{\sqrt{(\sigma_x^2 + \sigma_y^2)/2}}, \qquad (3)$$
where an overbar indicates averaging over the ensemble members, and $\sigma_x^2$ and $\sigma_y^2$ are the variances across the ensemble in the control and future scenarios, respectively. Within this context the signal is the anthropogenic change in extreme precipitation, and to test whether this has significantly emerged from climate noise, the appropriate null hypothesis is $H_0$: $\mu_f = 0$. This can be assessed using a standard t test under the assumption that all realizations of $\varepsilon_x$ and $\varepsilon_y$ are independent and sampled from Gaussian distributions (von Storch and Zwiers 1999). The appropriate t statistic is
$$t = \frac{\bar{y} - \bar{x}}{\sqrt{(\sigma_x^2 + \sigma_y^2)/n}} = \mathrm{SNR}\,\sqrt{n/2}, \qquad (4)$$
where $n$ (= 3) is the ensemble size. Following von Storch and Zwiers (1999), this is tested against critical values of the t distribution with, for the case of unequal variances, $\nu$ degrees of freedom given by
$$\nu = \frac{(n - 1)\,(\sigma_x^2 + \sigma_y^2)^2}{\sigma_x^4 + \sigma_y^4}, \qquad (5)$$
where $\nu$ varies between $(n - 1)$ (if $\sigma_x^2 \gg \sigma_y^2$ or vice versa) and $2(n - 1)$ (if $\sigma_x^2 = \sigma_y^2$). Thus, for a given value of the variance ratio and a given significance level, there is a unique relationship between the minimum value of SNR corresponding to a discernible signal and the ensemble size.
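As a rough illustration of the quantities just described, the sketch below computes the SNR, the unequal-variance t statistic, and the degrees of freedom for one grid box. It assumes that the SNR noise term is the square root of the mean of the two ensemble variances, an assumption adopted here because it reproduces the thresholds quoted in the text. The function name `snr_and_t` and the example numbers are hypothetical.

```python
import math
import numpy as np


def snr_and_t(x, y):
    """Return (SNR, t, dof) for control estimates x and future estimates y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    if y.size != n or n < 2:
        raise ValueError("x and y need the same ensemble size (>= 2)")
    var_x = x.var(ddof=1)   # variance across the control ensemble
    var_y = y.var(ddof=1)   # variance across the future ensemble
    if var_x + var_y == 0.0:
        raise ValueError("zero ensemble variance: SNR is undefined")
    signal = y.mean() - x.mean()
    # ASSUMPTION: noise is the root of the mean of the two variances.
    snr = signal / math.sqrt((var_x + var_y) / 2.0)
    # Unequal-variance t statistic for equal ensemble sizes.
    t = signal / math.sqrt((var_x + var_y) / n)
    # Degrees of freedom, which lie between n - 1 and 2(n - 1).
    dof = (n - 1) * (var_x + var_y) ** 2 / (var_x ** 2 + var_y ** 2)
    return snr, t, dof


# Hypothetical extreme-precipitation estimates (mm/day) for three members.
snr, t, dof = snr_and_t([10.0, 11.0, 12.0], [14.0, 15.0, 16.0])
print(f"SNR = {snr:.2f}, t = {t:.2f}, dof = {dof:.1f}")
```

Under this normalization the t statistic is simply the SNR scaled by the square root of n/2, which is what ties a given SNR threshold to the ensemble size.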

The validity of the Gaussian assumption for the distribution of estimates $x_i$ and $y_i$ inherent in the t test above is dependent on the level of the extreme of interest. In general, estimates of an extreme metric generated by randomly sampling a population are expected to show a skewed distribution, with the departure from normality increasing with the extremity of the metric (Folland and Anderson 2002). The Mann–Whitney test (von Storch and Zwiers 1999) is an alternative nonparametric test that does not assume normality, and this has been used here to test the sensitivity of the results to the Gaussian assumption. For an ensemble size of 3, the equivalent Mann–Whitney test of $H_0$ at the 10% significance level equates to rejecting $H_0$ when $x_i < y_j$ for all $\{i = 1, \ldots, 3;\ j = 1, \ldots, 3\}$ or $x_i > y_j$ for all $\{i, j\}$. In general, it is found that for the extreme index examined here the two methods produce very similar results, suggesting the assessment of significance is relatively insensitive to the distribution assumption, and if anything, the assumption of normality renders the t test slightly conservative (i.e., lower rejection rate of $H_0$). The t test is used preferentially here as it allows a direct interpretation of SNR in terms of significance.
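The 10% level quoted for the three-member Mann–Whitney rule follows from a simple counting argument, sketched below. Under the null hypothesis every assignment of the six values to "control" and "future" is equally likely, and only two of the 20 possible assignments separate the groups completely. The function names are hypothetical.

```python
from itertools import combinations


def complete_separation_rate(n=3):
    """Fraction of equally likely label splits with no overlap between groups."""
    ranks = range(2 * n)
    splits = list(combinations(ranks, n))    # ranks assigned to the control
    separated = 0
    for control in splits:
        future = [r for r in ranks if r not in control]
        if max(control) < min(future) or min(control) > max(future):
            separated += 1
    return separated / len(splits)


def rejects_h0(x, y):
    """Rejection rule described in the text: all x below all y, or all above."""
    return max(x) < min(y) or min(x) > max(y)


print(complete_separation_rate(3))                               # 2 of 20 splits
print(rejects_h0([10.0, 11.0, 12.0], [12.5, 14.0, 15.0]))        # fully separated
print(rejects_h0([10.0, 11.0, 13.0], [12.5, 14.0, 15.0]))        # overlapping
```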

Under the above assumptions, confidence intervals for the anthropogenic change in extreme precipitation can be derived, such that the estimated uncertainty $\pm E$ in the signal $\bar{y} - \bar{x}$ is given by
$$E = t_{n-1}\,\frac{\sigma_{(y-x)}}{\sqrt{n}}, \qquad (6)$$
where $\sigma_{(y-x)}^2$ is the variance of $(y_i - x_i)$, and $t_{n-1}$ is the $(0.5 + p/2)$th quantile of the t distribution with $(n - 1)$ degrees of freedom, for the $p \times 100\%$ confidence interval (von Storch and Zwiers 1999). Assuming $\sigma_{(y-x)}^2 \approx \sigma_x^2 + \sigma_y^2$, the relative error in the signal can be expressed in terms of SNR:
$$\frac{E}{|\bar{y} - \bar{x}|} = \sqrt{\frac{2}{n}}\,\frac{t_{n-1}}{|\mathrm{SNR}|}. \qquad (7)$$
Thus, for an ensemble size of 3 (for which $t_{n-1} = 2.92$ for $p = 0.9$), absolute values of SNR greater than 5 or 10 correspond to a relative error in the signal of less than 0.5 or 0.25, respectively, at the 90% confidence level. In this case, a change of 4 mm day$^{-1}$ would be known to within ±2 mm day$^{-1}$ (SNR greater than 5) or ±1 mm day$^{-1}$ (SNR greater than 10). Thus, |SNR| = 5 will provide a reasonably precise estimate of the signal, with |SNR| = 10 representing a threshold for the magnitude of changes in extreme precipitation to be robust (for $n = 3$).
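The arithmetic behind these thresholds can be checked with a short sketch. It assumes, as a working hypothesis rather than a statement of the authors' exact formula, that the SNR is normalized by the square root of the mean of the control and future variances, so that the relative error is the t quantile times the square root of 2/n, divided by |SNR|. The function name `relative_error` is hypothetical; the quantile 2.92 is the value quoted in the text.

```python
import math

T_QUANTILE_N3 = 2.92   # t quantile for n = 3 and p = 0.9, as quoted in the text


def relative_error(snr, n=3, t_quantile=T_QUANTILE_N3):
    """Half-width of the confidence interval as a fraction of the signal."""
    # ASSUMPTION: SNR is normalized by sqrt((var_x + var_y) / 2).
    return t_quantile * math.sqrt(2.0 / n) / abs(snr)


for snr in (5, 10):
    rel = relative_error(snr)
    print(f"|SNR| = {snr:2d}: relative error about {rel:.2f}; "
          f"a 4 mm/day change is known to about +/-{4 * rel:.1f} mm/day")
```

Under this assumption the relative errors fall just below 0.5 and 0.25, consistent with the rounded bounds of ±2 and ±1 mm day$^{-1}$ given above.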

c. Implications of climate noise “redness”

This section provides statistical evidence of spectral redness in the extreme precipitation data and, then, discusses the implications of this for assessing the uncertainty in extreme precipitation changes.

A bootstrap resampling technique was employed to examine the extent to which intraensemble variability could simply be explained by short time-scale natural variability. Assuming that seasonal data are independent from one year to the next, multiple sets of 30 winters (or summers) were randomly sampled from one ensemble member using replacement (Ferro et al. 2005). The resamples represent equally valid realizations of short-time-scale natural variability, from which multiple extreme precipitation estimates, $x_1^*$, for the control (or $y_1^*$ for the future) can be obtained. This resampling was repeated for a second ensemble member in order to generate multiple estimates, $x_2^*$ (or $y_2^*$), and hence a confidence interval for the difference: $d = x_2 - x_1$. It was found that the ensemble member differences were significant (i.e., the null hypothesis $H_0$: $d = 0$ was rejected) at the 10% significance level for 20% (15%) of the grid boxes over Europe for the control and 14% (20%) for the future in DJF (JJA). Even when accounting for the nonindependence of adjacent grid boxes, we are unable to explain a rejection rate of 20% by chance (for 100 independent grid boxes, which is likely to be a considerable underestimate in this case, the probability of a ≥20% rejection rate by chance is <0.2%). This suggests that intraensemble variability is not just the result of the finite sampling of short-time-scale variability. To test further whether the above rejection rates imply redness in extreme precipitation data, the full 90 yr of data from the three ensemble members were redistributed randomly into three 30-yr blocks in order to destroy any serial correlation. The resampling procedure was repeated for the redistributed three-member set, and hence the rejection rate for white data calculated. On repeating the redistribution and subsequent resampling many times, a range for the expected rejection rate for white data was derived.
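The chance-rejection figure quoted above can be checked with a binomial tail sum, assuming 100 independent grid boxes each rejected with probability 0.1 under the null hypothesis. The helper name `binomial_tail` is hypothetical.

```python
from math import comb


def binomial_tail(n_trials, k_min, p):
    """P(X >= k_min) for X ~ Binomial(n_trials, p)."""
    return sum(comb(n_trials, k) * p ** k * (1 - p) ** (n_trials - k)
               for k in range(k_min, n_trials + 1))


# Probability that 20 or more of 100 independent tests reject at the 10% level.
p_chance = binomial_tail(100, 20, 0.10)
print(f"P(>= 20 rejections out of 100) = {p_chance:.4%}")
```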
It was found that a rejection rate of 20% of the grid boxes, obtained for the actual ensemble members above, is outside the 90% confidence interval for white data. This suggests that there is evidence of spectral redness in the extreme precipitation data, with a significant noise component varying on at least multiannual time scales. It is noted that in the control the ensemble members are forced by common SSTs (section 2a); however, a possible additional source of long-term memory may be in the land surface. In particular, preliminary investigations have shown significant ensemble member differences in 30-yr seasonal mean soil moisture and temperature over parts of Europe. In the future, there is additionally a representation of the unforced multidecadal variability introduced through anomalous SST changes in each ensemble member (section 2a). The experimental design used here, specifically the use of time-slice experiments with the control ensemble members forced by common SSTs, limits the extent to which the redness is fully represented; nevertheless, we have established that redness is important.
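The resampling test described in this section can be sketched as follows for a single grid box. This is a simplified illustration under stated assumptions: whole years are resampled with replacement, the interval is a plain percentile interval (the text does not say how the interval was constructed), and the function names and synthetic gamma rainfall are hypothetical.

```python
import numpy as np


def seasonal_extreme(daily, wet_threshold=0.1, percentile=95.0):
    """Mean of wet-day precipitation above the wet-day percentile."""
    wet = daily[daily >= wet_threshold]
    cutoff = np.percentile(wet, percentile)
    return wet[wet > cutoff].mean()


def bootstrap_difference_ci(member1, member2, n_boot=1000, level=0.90, seed=0):
    """Percentile interval for d = x2 - x1, resampling years with replacement.

    member1, member2: arrays of shape (n_years, n_days) of daily precipitation
    for one season, with the same number of years in each.
    """
    rng = np.random.default_rng(seed)
    n_years = member1.shape[0]
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        years1 = rng.integers(0, n_years, size=n_years)   # resampled year indices
        years2 = rng.integers(0, n_years, size=n_years)
        x1 = seasonal_extreme(member1[years1].ravel())
        x2 = seasonal_extreme(member2[years2].ravel())
        diffs[b] = x2 - x1
    alpha = 1.0 - level
    lo, hi = np.quantile(diffs, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(lo), float(hi)


# Hypothetical members drawn from the same synthetic climate (30 yr x 90 days).
rng = np.random.default_rng(42)


def synthetic_member(scale=6.0):
    wet = rng.random((30, 90)) < 0.45
    return np.where(wet, rng.gamma(0.8, scale, size=(30, 90)), 0.0)


lo, hi = bootstrap_difference_ci(synthetic_member(), synthetic_member(), n_boot=500)
significant = not (lo <= 0.0 <= hi)   # reject H0: d = 0 at the 10% level
print(f"90% interval for d: [{lo:.2f}, {hi:.2f}], significant: {significant}")
```

Because years are resampled independently, this procedure only captures variability on time scales shorter than a year, which is exactly the limitation the comparison with the shuffled "white" data is designed to expose.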

The fact that there is evidence of redness in the extreme data, on at least multiannual time scales (above) and potentially up to multidecadal time scales (Scaife et al. 2008), has implications for assessing the significance of any future anthropogenic changes. Resampling techniques need to account for the temporal dependence in the data in order to be accurate (Ferro et al. 2005). Thus, previous studies that have resampled from within a period of years to construct a synthetic sample representative of a longer period may give an inaccurate estimate of uncertainty in the extreme values (section 1). This potential difficulty has been eliminated here, since the t-test approach exploits the fact that the three ensemble simulations provide independent realizations of the climate. The three realizations provide some information on the spread of the underlying population including untried ensemble members, and this is modeled within the t test using a Gaussian approximation. Thus, this approach explicitly allows for uncertainty in extreme precipitation changes within the context of decadal to multidecadal climate variability, as well as the finite sampling of shorter-time-scale natural variability and, thus, attempts to address the shortcomings identified in some previous studies (section 1).

The t test tells us if changes in the extreme precipitation metric (mean of events greater than the 95th percentile) estimated from 30-yr climate change experiments are significant. This raises a general issue, common to all approaches, of the extent to which an extreme metric estimated from a finite sample is appropriate for the underlying population. The performance of an estimator depends on the level of extreme being measured and the sample size. For example, Folland and Anderson (2002) show that estimates of a high percentile (e.g., 95th) from samples of size 30, taken from independent and identically distributed (iid) random variables may be biased with respect to the underlying population metric. For a climate variable, the effective iid sample size and thus the quality of 30-yr estimates as an estimator of the equivalent population metric will depend on the time scales of the climate noise in the underlying population. In particular, if the climate noise is dominated by short (less than decadal) time scales, then each 30-yr estimate will have the same power of estimation (i.e., be representative of the full range of climate variability), and thus an ensemble of these will converge faster with increasing ensemble size than if there is significant multidecadal variability. In the limit that one mode of the multidecadal climate variability is responsible for much of the tail of the underlying population distribution, then the ensemble estimate may be significantly biased with respect to the population metric.
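The effect of noise time scale on how representative a single 30-yr estimate is can be illustrated with a toy experiment. The sketch below compares the spread of 30-yr means for white noise and for a first-order autoregressive (AR(1)) process with the same year-to-year variance. The AR(1) model and the coefficient 0.6 are illustrative assumptions, not properties diagnosed from the HadRM3H data, and the example uses a simple mean rather than the extreme metric.

```python
import numpy as np


def block_mean_spread(phi, n_years=30, n_members=2000, seed=0):
    """Std. dev. across members of the n_years mean of a unit-variance AR(1)."""
    rng = np.random.default_rng(seed)
    innov_sd = np.sqrt(1.0 - phi ** 2)      # keeps the marginal variance at 1
    series = np.empty((n_members, n_years))
    series[:, 0] = rng.standard_normal(n_members)
    for t in range(1, n_years):
        series[:, t] = (phi * series[:, t - 1]
                        + innov_sd * rng.standard_normal(n_members))
    return float(series.mean(axis=1).std(ddof=1))


white = block_mean_spread(phi=0.0)   # no memory from year to year
red = block_mean_spread(phi=0.6)     # hypothetical "red" noise
print(f"spread of 30-yr means: white = {white:.3f}, red = {red:.3f}")
```

With persistent noise, each 30-yr block contains fewer effectively independent samples, so single-block estimates scatter more widely and an ensemble of them converges more slowly.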

In the appendix, the performance of the t test for assessing the significance of changes in extreme precipitation is compared to that of profile likelihood, which is a widely used approach for assessing uncertainty in a generalized extreme value (GEV) fit to the extreme data (Coles 2001). In general, this supports the t test as providing a conservative assessment of the robustness of the changes in the extreme precipitation. The t test has the advantage of being a simple technique that is applicable to any index of the distribution. In addition, as noted above, the t test is applicable to ensembles sampling noise with a high degree of redness. As such, this approach could also be extended to multimodel ensembles, where ensemble members are sampling from quite different populations.

3. Internal variability of precipitation extremes

The projected changes in extreme precipitation across Europe for the three-member ensemble of HadRM3H, for DJF and JJA, are shown in Fig. 1. In DJF, it can be seen that in all three ensemble members there are increases in extreme precipitation of up to about 40% over much of Europe, with the exception of decreases over the northwest Scandinavian coast and parts of the Mediterranean. In JJA, all three show increases in extreme precipitation over northern Scandinavia and decreases over much of southern Europe and the Mediterranean, while over central Europe extreme precipitation is projected to increase or decrease depending on the local region and ensemble member. Thus, on large scales there appears to be a high degree of similarity in the projected changes in extreme precipitation between the three realizations. However, locally or at an individual grid box, there are considerable discrepancies. For example, over Spain in DJF and over the United Kingdom in JJA, the geographical pattern of increasing and decreasing extreme precipitation is very different in the three projections. These discrepancies arise from natural climate variability that the three-member ensemble samples, with each realization being equally likely. This raises the question of how confident we can be, for a given model and forcing scenario, in the anthropogenically forced component of future changes in local precipitation extremes.

4. Signal-to-noise ratio

Signal-to-noise ratios (SNRs) for changes in extreme and very extreme precipitation are shown in Fig. 2, along with the seasonal mean wet-day precipitation for comparison. Given that we only have an ensemble size of 3, there is relatively low confidence in these estimates of SNR. However, application of the t test indicates that we are able to discern significant anthropogenic changes in the precipitation indices at many grid boxes within the region. In general, there is a tendency for SNRs to decrease on moving to increasingly extreme metrics. This is because although the signal generally increases in absolute terms for more extreme metrics (not shown), there is a greater increase in noise due to internal variability. This in turn is likely to at least partly reflect the number of events contributing to a given ensemble member estimate of each metric.

For extreme precipitation (defined here as exceedance of the 95th percentile of wet days), approximately 60% of grid boxes across Europe show significant changes in DJF, while this drops to 36% in JJA. We note that if the Mann–Whitney test is used instead to assess the significance the corresponding figures are 62% in DJF and 40% in JJA (section 2b). The lower values of the SNR during JJA partly reflect the relatively low signals in some regions during this season; however, noise due to internal variability is also found to be higher in summer particularly over northern and central Europe. In addition, there is also evidence of higher levels of noise over southern parts of Europe during both seasons, which may be at least partly attributable to there being relatively few wet days in these regions.

For changes in extreme precipitation, high absolute values of SNR (>5) are mainly limited to regions in northern Europe during DJF and a few grid boxes over Scandinavia and southern Europe in JJA. In these regions, we are able to quantify the signal to within reasonable bounds (i.e., with the bounds of the 90% confidence interval being a half of the signal; see section 2b). Elsewhere, significant changes indicate that we are confident that there will be an increase or decrease in extreme precipitation, but the magnitude of the change is relatively uncertain. It is noted that because of the small ensemble size, and thus the large uncertainty in the estimated SNR, it is not possible to make more detailed inferences regarding the variation in the actual SNR. Additionally, we note that the high values of SNR over the Baltic in JJA, corresponding to significant increases in extreme precipitation, are linked to very large local SST anomalies that are thought to be an unrealistic artifact of the coupled model from which they were derived (Kjellström et al. 2005).

In the following sections, we look at how increasing the ensemble size would increase our confidence in the change associated with a given value of SNR, and how SNR depends on strategies for spatial pooling and averaging.

5. Dependence on ensemble size

The theoretical relationship between the minimum value of SNR for a statistically significant anthropogenic change (SNRc) and the ensemble size [Eq. (4)] is shown in Fig. 3. In the variance ratio limit of $\sigma_x^2 \gg \sigma_y^2$ or vice versa, we obtain a conservative estimate for SNRc. It can be seen that with an ensemble size of 3, |SNR| ≳ 2 corresponds to a significant anthropogenic change (also evident from Fig. 2). SNRc decreases rapidly with increasing ensemble size, reflecting the increasing confidence in the ensemble estimate of the signal. In particular, when increasing the ensemble size from 3 to 6, SNRc decreases by at least a factor of 1.7 (depending on the variance ratio). Similarly, the SNR threshold for the 90% confidence interval bounds being a quarter of the signal is reduced from about 10 to 5 when doubling the ensemble size. Thus, increasing the ensemble size is an effective method for generating robust signals of extreme precipitation change (within the context of a given climate projection, i.e., a specific model and forcing scenario; see section 1).

It is noted that given knowledge of the SNR for the underlying population, it would be possible to invert the above relationship and determine the minimum ensemble size required in order to discern the anthropogenic change signal at any given location. Of course, in practice we only have our ensemble estimate of SNR, and for n = 3 we have low confidence in this. Hence, for those grid boxes where the ensemble estimate of |SNR| ≲ 2, we can only say that more than three ensemble members would be required to discern an anthropogenic change and cannot be more precise than that.
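The dependence of the significance threshold on ensemble size, and its inversion, can be sketched as follows. This is a hedged illustration and not a reproduction of Eq. (4): it assumes that SNR is the ensemble-mean change divided by the root-mean of the control and future intermember variances, that significance is assessed with a two-sided t test at the 10% level, and that the degrees of freedom range from n − 1 (one variance dominating, the conservative limit) to 2(n − 1) (equal variances). The function names are hypothetical, and the t quantiles are taken from standard tables.

```python
import math

# One-sided 95% Student-t quantiles (two-sided 10% level) for dof = 1..20,
# copied from standard statistical tables.
T95 = {
    1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015,
    6: 1.943, 7: 1.895, 8: 1.860, 9: 1.833, 10: 1.812,
    11: 1.796, 12: 1.782, 13: 1.771, 14: 1.761, 15: 1.753,
    16: 1.746, 17: 1.740, 18: 1.734, 19: 1.729, 20: 1.725,
}


def critical_snr(n, conservative=True):
    """Assumed threshold |SNR| for significance with n members per period."""
    dof = n - 1 if conservative else 2 * (n - 1)
    if dof not in T95:
        raise ValueError("ensemble size outside the tabulated range")
    return T95[dof] * math.sqrt(2.0 / n)


def min_ensemble_size(population_snr, max_n=21):
    """Smallest n whose conservative threshold does not exceed |SNR|."""
    for n in range(2, max_n + 1):
        if abs(population_snr) >= critical_snr(n):
            return n
    return None  # not reachable within the tabulated range


for n in (3, 6):
    print(n, round(critical_snr(n), 2), round(critical_snr(n, False), 2))
print(min_ensemble_size(1.0))
```

Under these assumptions the threshold for n = 3 lies between about 1.7 and 2.4, and doubling the ensemble size to 6 lowers it by a factor of roughly 1.7 to 2.0, broadly in line with the values quoted above. The inversion treats the supplied SNR as the true population value, which in practice is not known.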

6. Strategies for spatial processing

In this section, we examine the relative effectiveness of spatial pooling and spatial averaging for generating robust signals of extreme precipitation change. It is noted that here, as throughout this paper, we refer to robustness within the context of a given climate projection; that is, we are only considering one source of uncertainty, namely that due to natural variability (section 1).

The impact of spatial pooling on SNR is shown in Fig. 4. Here, daily precipitation data from neighboring grid cells are pooled to give one long time series, from which the extreme precipitation index is then calculated. If neighboring grid cells can be considered to be effectively sampling from the same precipitation population, then by pooling the data we would expect to get a less noisy extreme precipitation estimate and thus a higher value of SNR for the same ensemble size. This hypothesis is borne out by the results, which show that spatial pooling over 3 × 3 grid cells (i.e., nine nearest neighbors) increases the fractional area of the significant changes from 60% to 63% in DJF and from 36% to 42% in JJA. Further increasing the number of grid cells pooled leads to further increases in field significance and also the emergence of many more regions with robust quantitative changes (|SNR| > 10), but at the cost of the loss of regional detail. Räisänen and Joelsson (2001) attempt to weigh the benefit of spatial averaging (their “spatial averaging” is more akin to our “spatial pooling”) in reducing noise versus the error in the local signal due to smoothing. Their results suggest that low degrees of spatial pooling are likely to be beneficial, with the reduction in noise outweighing the loss of regional detail due to smoothing. Clearly, however, the optimal degree of spatial pooling, for which the error in the estimated local signal is minimized, will depend on the extreme index of interest and also the region.

In the case of spatial averaging, the daily precipitation data from neighboring grid cells are first spatially averaged to give a daily time series of large-scale precipitation, and this is then used to calculate the extreme precipitation index. It is noted that while spatial pooling attempts to provide improved statistics of local precipitation extremes, spatial averaging (as defined here) leads to the sampling of different weather types (synoptic rather than extreme local events). These larger-scale averages are expected to be less strongly affected by internal variability than truly local events (Räisänen 2001). When applying spatial averaging, the resulting maps of SNR (only shown for 7 × 7 averaging; Fig. 6a) are very similar to those in Fig. 4, although the fractional area of significance is about 1% less in each case. At the GCM scale (≡ 7 × 7 averaging), extreme precipitation changes are seen to be significant over almost all European land regions during DJF, while the location of the boundary between increasing and decreasing precipitation extremes remains uncertain for JJA. In general, at this scale, we are still largely unable to quantify the magnitude of the changes with reasonable accuracy across much of Europe.
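The distinction between spatial pooling and spatial averaging is essentially one of the order of operations, which the following minimal Python sketch tries to make explicit. It is a simplified illustration rather than the processing used here: the extreme index is taken to be a percentile of wet-day amounts, the 1 mm wet-day threshold is an assumed value, the function names are hypothetical, and the two short "grid cell" series are invented.

```python
def wet_day_percentile(series, q=95.0, wet_threshold=1.0):
    """Percentile of wet-day amounts, linearly interpolated."""
    wet = sorted(v for v in series if v >= wet_threshold)
    if not wet:
        raise ValueError("no wet days in series")
    pos = (len(wet) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(wet) - 1)
    return wet[lo] + (wet[hi] - wet[lo]) * (pos - lo)


def pooled_index(cells, q=95.0, wet_threshold=1.0):
    """Pooling: concatenate the daily series of all cells, then take the index."""
    pooled = [v for cell in cells for v in cell]
    return wet_day_percentile(pooled, q, wet_threshold)


def averaged_index(cells, q=95.0, wet_threshold=1.0):
    """Averaging: form the daily area mean first, then take the index."""
    n_days = len(cells[0])
    area_mean = [sum(cell[d] for cell in cells) / len(cells) for d in range(n_days)]
    return wet_day_percentile(area_mean, q, wet_threshold)


# Two hypothetical neighboring cells, three days each (mm/day).
cells = [[0.0, 2.0, 10.0], [0.0, 6.0, 0.0]]
print(pooled_index(cells, q=100.0))    # largest local daily amount
print(averaged_index(cells, q=100.0))  # largest area-mean daily amount
```

In this toy case the pooled series retains the 10 mm local event, whereas the area mean reduces it to 5 mm, which illustrates why the averaged index describes larger-scale rather than truly local extremes.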

The relative performance of spatial pooling and spatial averaging for generating robust changes in extreme precipitation over European land is shown in Fig. 5. Here, the area mean SNR is calculated by averaging absolute SNR values across all land grid boxes in the European model domain. It can be seen that with no spatial pooling and three ensemble members, there is a significant change over European land on average during DJF, but not during JJA (left-hand end of the black lines). On applying spatial pooling over 3 × 3 grid cells, there is a noticeable increase in SNR for both seasons, suggesting that this is relatively effective at reducing grid-box noise. Further spatial pooling increases SNR but at a slower rate, with 5 × 5 pooling leading to the emergence of the climate change signal over Europe during JJA. Spatial averaging is also seen to increase SNR, but is generally less effective than spatial pooling. It is also noted that changes in these larger-scale extremes, produced by spatial averaging, may not provide a good estimate of the changes in local-scale extremes (section 7).

Figure 5 also shows the result for area mean SNR over the Alpine region, defined as 44°–48°N, 5°–15°E as in the Prediction of Regional Scenarios and Uncertainties for Defining European Climate Change Risks and Effects (PRUDENCE) project (Dequé et al. 2007). Over the Alps, 3 × 3 pooling increases SNR, leading to the emergence of the anthropogenic change signal from climate noise during JJA. However, further spatial pooling leads to a decrease in SNR in summer. This is likely due to data from high-elevation regions being combined with data from valleys, corresponding to sampling two very different populations. Large-scale extremes produced by spatial averaging are more robust in this region in summer. Other European subdivisions (typically 20° by 15°) exhibit the same behavior as the entire European land area, with monotonic increases in SNR (not shown).

It is noted that although Fig. 5 indicates the degree of spatial pooling–averaging required for significant anthropogenic changes in extreme precipitation across European land regions on average (for three ensemble members), there are individual grid boxes within the region with no discernible change. In particular, in the transition zones between increasing and decreasing precipitation, signals are not significant even with high degrees of spatial pooling (Fig. 4). In the transition zones, the local anthropogenic change may indeed be small, and thus an insignificant signal (i.e., low SNR) may not necessarily correspond to low precision (i.e., high noise level). This may at least partly apply to the low SNR values over parts of central and eastern Europe in JJA. In these regions high degrees of spatial pooling are unlikely to be beneficial and may give misleading results, with increases in SNR reflecting the decay of the transition zone rather than any real increase in robustness locally.

Finally, in this section we examine the relative merits of spatial pooling versus increasing the ensemble size. Doubling the ensemble size from three to six leads to the SNR threshold for significance decreasing by at least a factor of 1.7 (section 5), while increasing the number of grid cells pooled from one to nine leads to an increase in SNR, averaged over Europe, by about a factor of 1.2. Thus, doubling the ensemble size achieves a greater reduction in the signal error than does pooling over 3 × 3 grid cells. This can be understood from the fact that precipitation at a neighboring grid box cannot be considered to be an independent sample of precipitation at the grid box of interest. Precipitation events at neighboring grid cells are not independent, with noise due to natural variability being coherent over several grid lengths. In particular, on longer time scales climate variability tends to occur in preferred large-scale spatial patterns (Baede et al. 2001). In addition, the extent to which spatial pooling is a valid approach to increasing sample size depends on the extent to which the precipitation distributions are similar at adjacent grid boxes. For example, in regions of steep orography (e.g., the Alps) or coastal margins, adjacent grid boxes may show quite different distributions and these regions of “robust spatial heterogeneity” are unlikely to be amenable to spatial aggregation techniques. In general, although 3 × 3 spatial pooling is largely beneficial in reducing the local signal error, relatively greater increases in the robustness of local changes are achieved by increasing the ensemble size. The latter, of course, is accompanied by increased computational cost.

7. Robustness of RCM finescale component

We have shown that large-scale precipitation extremes are generally more robust. In this section we examine the extent to which there is an additional robust finescale contribution to local changes. Finescale information is provided by the RCM, while large-scale behavior is largely inherited from the driving GCM. Fowler et al. (2005) show that the finescale signal from the RCM is skillful at least for the present day, suggesting that this represents “added value” with respect to the GCM. In this analysis, we identify whether there is a robust contribution to the local change in extreme precipitation from the RCM, at scales unresolved by the GCM. We do not consider the added value that the RCM may also provide at larger scales.

The GCM-scale field is calculated by spatially averaging daily precipitation data over 7 × 7 RCM grid cells, and we wish to examine whether there is a significant difference in the relative (%) changes in extreme precipitation between the RCM and GCM scales. Following the approach of Räisänen and Joelsson (2001), we address this by testing the null hypothesis $H_0\colon \overline{y_{FS}} - \overline{x_{FS}} = 0$, where the finescale components in the control and future, $x_{FS}$ and $y_{FS}$, are defined as
$$x_{FS,i} = x_{RCM,i} - A\,x_{GCM,i}, \qquad (8)$$
$$y_{FS,i} = y_{RCM,i} - A\,y_{GCM,i}, \qquad (9)$$
where $x_{RCM}$ and $x_{GCM}$ are the extreme precipitation at the RCM scale and GCM scale, respectively, for the control, and similarly $y_{RCM}$ and $y_{GCM}$ for the future; the coefficient $A = (\overline{x_{RCM}} + \overline{y_{RCM}})/(\overline{x_{GCM}} + \overline{y_{GCM}})$, and an overbar indicates averaging over ensemble members $i$. The scaling factor A ensures that there is no significant change in the finescale component when the relative changes at the RCM and GCM scales are the same. Thus, testing H0 specifically examines whether there is a robust finescale component to the local change above a simple scaling. It is noted that since the spatially averaged RCM field is also likely to give a better representation of reality on the large-scale than the GCM field (Durman et al. 2001), this approach gives a lower bound on RCM added value.
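The role of the scaling factor A can be checked numerically. The sketch below assumes that the finescale component for each member is the RCM-scale value minus A times the GCM-scale value, with A formed from ensemble means as described above; this form is inferred from the surrounding definitions, and the function name and all input values are hypothetical.

```python
import statistics


def finescale_change(x_rcm, y_rcm, x_gcm, y_gcm):
    """Ensemble-mean change in the assumed finescale component."""
    a = (statistics.mean(x_rcm) + statistics.mean(y_rcm)) / (
        statistics.mean(x_gcm) + statistics.mean(y_gcm)
    )
    x_fs = [xr - a * xg for xr, xg in zip(x_rcm, x_gcm)]  # control members
    y_fs = [yr - a * yg for yr, yg in zip(y_rcm, y_gcm)]  # future members
    return statistics.mean(y_fs) - statistics.mean(x_fs)


# Hypothetical GCM-scale extremes: a 20% increase in the ensemble mean.
x_gcm = [10.0, 11.0, 12.0]
y_gcm = [12.0, 13.2, 14.4]

# Case 1: RCM-scale ensemble mean also increases by 20% (22.0 -> 26.4).
same = finescale_change([20.0, 23.0, 23.0], [26.0, 27.0, 26.2], x_gcm, y_gcm)

# Case 2: RCM-scale ensemble mean increases by only 10% (22.0 -> 24.2).
less = finescale_change([20.0, 23.0, 23.0], [24.2, 24.2, 24.2], x_gcm, y_gcm)

print(round(same, 6), round(less, 6))
```

When the relative changes at the two scales agree, the mean finescale change vanishes; when the RCM-scale increase is smaller, it is negative. In practice the significance of this difference would still need to be tested against the intermember spread.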

The results, shown in Fig. 6, have been examined for RCM-scale information represented both as the individual grid-box values and as the corresponding values after 3 × 3 spatial pooling. With no spatial processing, estimates of the finescale component will contain finescale numerical noise, which may in some cases be systematic across the ensemble (e.g., where it is linked to the orography). Thus, by applying 3 × 3 spatial pooling, which has been shown to be effective at reducing grid-box noise, we may achieve a more accurate measure of the local signal and hence its finescale component.

From Fig. 6, we are able to identify many regions where the RCM provides a robust finescale contribution to the relative change in extreme precipitation locally. In particular, over mountainous regions (e.g., the Scandinavian mountains and the Alps) there are significant robust finescale contributions to the change, even though the GCM-scale change itself may not be significant. This is consistent with these being regions of “robust spatial heterogeneity.” Over the remainder of Scandinavia, to the south and east of the mountains, extreme precipitation is seen to consistently increase less at the finescale than the large scale in relative terms during DJF. This suggests that here local extreme events do not coincide with large-scale extremes, and that the processes controlling changes in these two different weather types may be quite different. The result is consistent with smaller fractional increases in extreme events in the RCM compared to the driving GCM, as reported in Jones et al. (1997). Over the Mediterranean in JJA, extreme precipitation decreases less at the finescale than at the large scale in relative terms. This similarly may point to different processes controlling the changes in local- and large-scale extremes; in particular, there is a tendency for intense precipitation events (which are generally localized) to increase more than the average precipitation in a warmer climate (Allen and Ingram 2002). It is noted that there are consistently more dry days and hence fewer extreme events (top 5% of wet days) at the finescale; however, this generic property of aggregation does not explain the smaller fractional changes in extreme precipitation when compared to the large scale, as found above.

In general, pooling over 3 × 3 grid cells is seen to be effective at reducing finescale noise, leading to a more robust measure of the finescale contribution to the local change. However, an exception to this is seen over mountainous regions, where the RCM is seen to provide robust physically meaningful information down to the 50-km grid scale.

8. Conclusions

Climate noise significantly affects our ability to measure robust signals of extreme precipitation change across Europe. We have addressed this by applying a statistical framework to estimate the uncertainty in extreme precipitation changes due to the full spectrum of climate noise. In particular, the approach used here allows the possibility of a wider spread in extreme precipitation from untried ensemble members sampling different phases of multidecadal variability. We have provided evidence of spectral “redness” in extreme precipitation data on at least multiannual time scales, and other studies (e.g., Scaife et al. 2008) suggest that this is likely to extend to multidecadal time scales. It is noted that the extent to which we are able to reliably account for multidecadal variability is limited by the ensemble size and the degree to which the model accurately represents low-frequency variability. Nevertheless, the results here establish the principle of the need for good sampling of climate variability.

In this paper, we have specifically examined the importance of natural internal variability for projected changes in local precipitation extremes within the context of the three questions outlined in section 1, which are addressed in turn below.

a. Over what areas of Europe are changes in extreme precipitation discernible above natural variability, for the particular model and scenario used here?

For the three-member ensemble representation of current and projected future climate examined here, extreme precipitation changes at the grid-box level are found to be discernible above climate noise over much of northern and central Europe in DJF, but over less than half of Europe in JJA. In particular, there is no significant anthropogenic change over large parts of the Mediterranean in DJF or in much of central and eastern Europe in JJA. In addition, the ability to quantify the change to a reasonable accuracy (i.e., with the bounds of the 90% confidence interval being a half of the signal) is found to be largely limited to regions in northern Europe during DJF and a few grid boxes over Scandinavia and southern Europe in JJA.

Much of the seasonality and geographic dependence of the robustness of extreme precipitation changes across Europe can be explained by the presence and location of a transition zone between increasing and decreasing extreme precipitation. In JJA there is a transition zone across central and eastern Europe, whereas in DJF this is displaced farther south over the Mediterranean and northern Africa. In this transition zone, the underlying anthropogenic change may indeed be small and therefore inherently difficult to discern even above low levels of noise. In addition to variation in the signal, however, there is also evidence of increased internal variability during JJA and over southern Europe in both seasons.

b. Are single 30-yr integrations for the control and future climate sufficient to infer changes in the extreme tail of the underlying precipitation distribution at a given location?

No. Where there is a significant influence of long-time-scale climate variability on the extreme tail of the underlying precipitation distribution, single 30-yr experiments are insufficient to infer changes. Resampling techniques can be used to estimate uncertainty in an extreme precipitation value obtained from a single 30-yr integration accounting for the finite sampling of short-time-scale variability represented within the given 30-yr period. However, the results and understanding presented here suggest that multiannual to multidecadal variability may contribute additional uncertainty. On estimating uncertainty from the full spectrum of climate noise, it is found that, for the specific climate change projection assessed here, more than three 30-yr ensemble integrations would be needed to demonstrate significant local changes over large parts of the Mediterranean in DJF and much of central and eastern Europe in JJA. Elsewhere, although we may be able to discern an increase or decrease in the extreme precipitation with three ensemble members, more are generally needed to be confident of the magnitude of the change. Of course, even with exact knowledge of the underlying precipitation distribution, achieved using a very large ensemble of a “perfect” model, natural variability will lead to uncertainty in the actual realization of the future climate in the real world.

The above results suggest that the investment of resources in large ensembles sampling natural variability will lead to benefits in terms of our ability to accurately predict future changes in local precipitation extremes. This approach would not reduce other sources of uncertainty due to model deficiencies and future emissions, although as shown in Frei et al. (2006) natural variability is an important source of uncertainty for local precipitation extremes.
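For reference, the kind of resampling mentioned under question b can be sketched as a bootstrap over whole years of a single 30-yr integration. This is a generic illustration under stated assumptions, not the resampling test of section 2c: the extreme index is taken to be a wet-day percentile, the 1 mm wet-day threshold and the synthetic exponential rainfall are invented for the example, and the function names are hypothetical.

```python
import random


def wet_day_quantile(values, q=95.0, wet_threshold=1.0):
    """Order-statistic estimate of the q-th percentile of wet-day amounts."""
    wet = sorted(v for v in values if v >= wet_threshold)
    if not wet:
        raise ValueError("no wet days in sample")
    return wet[min(len(wet) - 1, int(q / 100.0 * len(wet)))]


def bootstrap_interval(years, q=95.0, n_boot=500, level=0.90, seed=0):
    """Percentile bootstrap interval, resampling whole years with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = [rng.choice(years) for _ in years]  # one resampled "30-yr" record
        pooled = [v for year in sample for v in year]
        estimates.append(wet_day_quantile(pooled, q))
    estimates.sort()
    lo = estimates[int((1.0 - level) / 2.0 * (n_boot - 1))]
    hi = estimates[int((1.0 + level) / 2.0 * (n_boot - 1))]
    return lo, hi


# Synthetic seasonal data: 30 years of 90 days, about half of them wet.
gen = random.Random(42)
demo_years = [
    [gen.expovariate(0.2) if gen.random() < 0.5 else 0.0 for _ in range(90)]
    for _ in range(30)
]
print(bootstrap_interval(demo_years, n_boot=200))
```

An interval obtained in this way reflects only the finite sampling of variability within the given 30-yr period. It cannot represent multiannual to multidecadal variability that the single integration has not sampled, which is the reason an ensemble is needed.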

c. How effective is spatial pooling in reducing the noise due to internal variability, and to what degree does the robustness of extreme precipitation change increase with spatial scale?

It is found that spatial pooling over neighboring grid cells can be beneficial in increasing the robustness of extreme precipitation changes, both in terms of our ability to discern a significant change in some regions and to quantify it in others. In particular, 3 × 3 pooling is effective at reducing grid-box noise. However, when pooling over increasingly large areas, the relative effectiveness of this technique in reducing noise decreases, with high degrees of spatial pooling associated with considerable loss of regional detail. In addition, in regions of “robust spatial heterogeneity,” such as steep orography, simple pooling of data over the nearest neighbors may not be appropriate and more sophisticated spatial pooling may be needed. In general, although 3 × 3 pooling is largely beneficial in reducing noise, considerably greater increases in the robustness of local changes are achieved by doubling the ensemble size.

The robustness of extreme precipitation changes is found in general to increase with increasing scales of spatial averaging. In particular, in DJF for three ensemble members, significant changes in extreme precipitation are seen over almost all European land regions on approximately the GCM scale, although uncertainty remains with respect to their magnitude. However, in the transition zone in JJA, there is no evidence of increasing robustness for extreme precipitation changes even on large spatial scales. Although larger-scale precipitation extremes are generally more robust, there is also evidence of robust finescale contributions to the local change. In particular, over the Scandinavian mountains and the Alps, the RCM is seen to provide robust information down to the 50-km grid scale. In addition, we also find evidence that the RCM leads to smaller fractional increases in extreme precipitation over eastern Scandinavia in DJF and smaller fractional decreases over the Mediterranean in JJA relative to the broader scale. In these regions of robust finescale offsets, particular care is needed in interpreting GCM results.

In this study, the emergence of an anthropogenically driven change above noise due to natural variability has been used as a measure of robustness. Therefore, a small change known with high precision may not be assessed as robust. This is perhaps most pertinent in transition zones. Here, an anthropogenic change may not be discernible above natural variability, although it may be known to within a small absolute uncertainty range. The former is likely to be most relevant as it implies that the change is insignificant relative to the range in extreme behavior currently experienced locally. The signal-to-noise ratio assessed here provides a relative measure of robustness that specifically identifies where anthropogenic forcing leads to a change in extreme behavior that is outside the bounds of natural variability and thus is of particular interest to society.

The above results provide a comprehensive assessment of the robustness of extreme precipitation changes for a single climate change projection (i.e., the HadRM3H projection for the SRES A2 scenario) for Europe derived from a three-member ensemble experiment. This demonstrates the need for ensembles of integrations, the benefits of spatial pooling, and the dependence on spatial scale, within the context of uncertainty due to natural internal variability; key conclusions that we expect will apply qualitatively to other models and forcing scenarios. Clearly, this does not equate to assessing a robust signal of general climate change, which would require an assessment of uncertainty due to model deficiencies and future emissions, as well as natural variability. In particular, if we are to evaluate the overall reliability of projected changes in extreme precipitation, it is important to understand the underlying mechanisms of change and this will be the subject of a future paper.

Acknowledgments

We thank Simon Brown, James Murphy, and David Stephenson for constructive discussions and an anonymous reviewer for very useful comments, in particular for suggesting the resampling test used in section 2c. Financial support was provided by the U.K. Department for Environment, Food and Rural Affairs under Contract PECD 7/12/37 and by the European Commission’s Sixth Framework Programme under Contract GOCE-CT-2003-505539 (ENSEMBLES).

REFERENCES

  • Allen, M. R., and W. J. Ingram, 2002: Constraints on future changes in climate and the hydrological cycle. Nature, 419, 224–232.

  • Baede, A. P. M., E. Ahlonsou, Y. Ding, and D. Schimel, 2001: The climate system: An overview. Climate Change 2001: The Scientific Basis, J. T. Houghton et al., Eds., Cambridge University Press, 85–98.

  • Buonomo, E., R. Jones, C. Huntingford, and J. Hannaford, 2007: On the robustness of changes in extreme precipitation over Europe from two high resolution climate change simulations. Quart. J. Roy. Meteor. Soc., 133, 65–81.

  • Christensen, J. H., and Coauthors, 2007: Regional climate projections. Climate Change 2007: The Physical Science Basis, S. Solomon et al., Eds., Cambridge University Press, 847–940.

  • Coles, S., 2001: An Introduction to Statistical Modeling of Extreme Values. Springer, 208 pp.

  • Dequé, M., and Coauthors, 2007: An intercomparison of regional climate simulations for Europe: Assessing uncertainties in model projections. Climatic Change, 81 (S1), 53–70, doi:10.1007/s10584-006-9228-x.

  • Durman, C. F., J. M. Gregory, D. C. Hassell, R. G. Jones, and J. M. Murphy, 2001: A comparison of extreme European daily precipitation simulated by a global and a regional climate model for present and future climates. Quart. J. Roy. Meteor. Soc., 127, 1005–1015.

  • Ekström, M., H. J. Fowler, C. G. Kilsby, and P. D. Jones, 2005: New estimates of future changes in extreme rainfall across the UK using regional climate model integrations. 2. Future estimates and use in impact studies. J. Hydrol., 300, 234–251.

  • Ferro, C. A. T., A. Hannachi, and D. B. Stephenson, 2005: Simple nonparametric techniques for exploring changing probability distributions of weather. J. Climate, 18, 4344–4354.

  • Folland, C. K., and C. W. Anderson, 2002: Estimating changing extremes using empirical ranking methods. J. Climate, 15, 2954–2960.

  • Fowler, H. J., M. Ekström, C. G. Kilsby, and P. D. Jones, 2005: New estimates of future changes in extreme rainfall across the UK using regional climate model integrations. 1. Assessment of control climate. J. Hydrol., 300, 212–233.

  • Frei, C., R. Scholl, S. Fukutome, J. Schmidli, and P. L. Vidale, 2006: Future change in precipitation extremes in Europe: Intercomparison of scenarios from regional climate models. J. Geophys. Res., 111, D06105, doi:10.1029/2005JD005965.

  • Hasselmann, K., 1976: Stochastic climate models. Part I: Theory. Tellus, 28, 473–485.

  • Huntingford, C., R. G. Jones, C. Prudhomme, R. Lamb, and J. H. C. Gash, 2003: Regional climate model predictions of extreme rainfall for a changing climate. Quart. J. Roy. Meteor. Soc., 129, 1607–1621.

  • Hurrell, J. W., 1995: Decadal trends in the North Atlantic Oscillation: Regional temperatures and precipitation. Science, 269, 676–679.

  • Jones, P. D., and P. A. Reid, 2001: Assessing future climate change in extreme precipitation over Britain using regional climate model integrations. Int. J. Climatol., 21, 1337–1356.

  • Jones, R. G., J. M. Murphy, and M. Noguer, 1995: Simulation of climate change over Europe using a nested regional–climate model. I: Assessment of control climate, including sensitivity to location of lateral boundaries. Quart. J. Roy. Meteor. Soc., 121, 1413–1449.

  • Jones, R. G., J. M. Murphy, M. Noguer, and A. B. Keen, 1997: Simulation of climate change over Europe using a nested regional–climate model. II: Comparison of driving and regional model responses to a doubling of carbon dioxide concentration. Quart. J. Roy. Meteor. Soc., 123, 265–292.

  • Kharin, V. V., and F. W. Zwiers, 2000: Changes in the extremes in an ensemble of transient climate simulations with a coupled atmosphere–ocean GCM. J. Climate, 13, 3760–3788.

  • Kharin, V. V., and F. W. Zwiers, 2005: Estimating extremes in transient climate change simulations. J. Climate, 18, 1156–1173.

  • Kjellström, E., R. Döscher, and H. E. M. Meier, 2005: Atmospheric response to different sea surface temperatures in the Baltic Sea: Coupled versus uncoupled regional climate model experiments. Nord. Hydrol., 36, 397–409.

  • Knight, J. R., C. K. Folland, and A. A. Scaife, 2006: Climate impacts of the Atlantic multidecadal oscillation. Geophys. Res. Lett., 33, L17706, doi:10.1029/2006GL026242.

  • McCarthy, J. J., O. F. Canziani, N. A. Leary, D. J. Dokken, and K. S. White, 2001: Climate Change 2001: Impacts, Adaptation and Vulnerability. Cambridge University Press, 1032 pp.

  • Mitchell, J. F. B., T. C. Johns, M. Eagles, W. J. Ingram, and R. A. Davis, 1999: Towards the construction of climate change scenarios. Climatic Change, 41, 547–581.

  • Mitchell, T. D., 2003: Pattern scaling: An examination of the accuracy of the technique for describing future climates. Climatic Change, 60, 217–242.

  • Murphy, J. M., D. M. H. Sexton, D. N. Barnett, G. S. Jones, M. J. Webb, M. Collins, and D. A. Stainforth, 2004: Quantification of modelling uncertainties in a large ensemble of climate change simulations. Nature, 430, 768–772.

  • Räisänen, J., 2001: CO2-induced climate change in CMIP2 experiments: Quantification of agreement and role of internal variability. J. Climate, 14, 2088–2104.

    • Search Google Scholar
    • Export Citation
  • Räisänen, J., and R. Joelsson, 2001: Changes in average and extreme precipitation in two regional climate model experiments. Tellus, 53A , 547566.

    • Search Google Scholar
    • Export Citation
  • Rayner, N. A., D. E. Parker, E. B. Horton, C. K. Folland, L. V. Alexander, D. P. Rowell, E. C. Kent, and A. Kaplan, 2003: Global analyses of SST, sea ice and night marine air temperature since the late nineteenth century. J. Geophys. Res., 108 .4407, doi:10.1029/2002JD002670.

    • Search Google Scholar
    • Export Citation
  • Rowell, D. P., 2005: A scenario of European climate change for the late 21st century: Seasonal means and interannual variability. Climate Dyn., 25 , 837849.

  • Rowell, D. P., and F. W. Zwiers, 1999: The global distribution of sources of atmospheric decadal variability and mechanisms over the tropical Pacific and southern North America. Climate Dyn., 15, 751–772.

  • Scaife, A. A., C. K. Folland, L. V. Alexander, A. Moberg, and J. R. Knight, 2008: European climate extremes and the North Atlantic Oscillation. J. Climate, 21, 72–83.

  • Sorteberg, A., and N. G. Kvamsto, 2006: The effect of internal variability on anthropogenic climate projections. Tellus, 58A, 565–574.

  • von Storch, H., and F. W. Zwiers, 1999: Statistical Analysis in Climate Research. Cambridge University Press, 484 pp.

  • Zwiers, F. W., and V. V. Kharin, 1998: Changes in the extremes of the climate simulated by CCC GCM2 under CO2 doubling. J. Climate, 11, 2200–2222.


APPENDIX

Comparison of a t Test and Profile Likelihood for Assessing the Significance of Extreme Precipitation Change

Profile likelihood is generally found to be the most accurate approach for assessing uncertainty in a generalized extreme value (GEV) fit to a given series of block maxima (e.g., seasonal or annual maxima) (Coles 2001). Here, we compare the performance of the t-test approach used in this paper with that of profile likelihood for assessing the significance of changes in 2-yr return levels for seasonal maximum precipitation, for the HadRM3H ensemble. We note that the 2-yr return level for seasonal maxima approximately corresponds to the 99th percentile of wet-day precipitation, for a wet-day fraction of ∼50%.
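The correspondence quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes a 90-day season and statistically independent wet days (both simplifications introduced here, not stated in the text), and the function name is hypothetical. It asks which wet-day percentile the seasonal maximum exceeds in one season out of two.

```python
def wet_day_percentile(return_period_years, season_days=90, wet_fraction=0.5):
    """Wet-day percentile matching a given return level of the seasonal maximum.

    Assumes independent wet days (an illustrative simplification).
    """
    wet_days = season_days * wet_fraction
    # Probability that the seasonal maximum stays below the return level.
    p_season = 1.0 - 1.0 / return_period_years
    # The maximum stays below the level only if every wet day does.
    p_day = p_season ** (1.0 / wet_days)
    return 100.0 * p_day


pct = wet_day_percentile(2)
print(f"2-yr return level ~ {pct:.1f}th percentile of wet-day precipitation")
```

Under these assumptions the result is about the 98.5th percentile, which is consistent with the approximate 99th-percentile correspondence noted above.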

In this analysis, profile likelihood has been applied to assess the significance of changes in 2-yr return levels, calculated by GEV fitting to seasonal maxima from all 90 yr of concatenated data for each of the control and future periods. A goodness-of-fit test [the standard Kolmogorov–Smirnov test; Kharin and Zwiers (2000)] suggests that the GEV provides a good fit to the full 90 yr of data for both the control and future periods: the null hypothesis that the seasonal maxima are drawn from the fitted GEV distribution is rejected at the 10% significance level for ∼11% of the grid boxes for both seasons and time periods, close to the 10% rejection rate expected by chance if the null hypothesis held everywhere. In the case of the t test, 2-yr return levels are calculated separately for each ensemble member, with the GEV fitted to the seasonal maxima from the 30 yr of data for each of the control and future integrations. The t test is then used to assess whether there is a significant change in the ensemble mean of the return levels, following the method outlined in section 2b.
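As a rough illustration of the t-test step, the sketch below compares three control and three future return levels at a single grid box. It assumes a Welch-type two-sample statistic; the exact form used in section 2b is not reproduced here, and the return-level values and all names are hypothetical. With three members per period the Welch degrees of freedom lie between 2 and 4, so the statistic is checked against the standard two-sided 10% critical values for both limits.

```python
import math
from statistics import mean, variance


def welch_t(control, future):
    """Welch t statistic and degrees of freedom for the change in ensemble mean."""
    n_c, n_f = len(control), len(future)
    a = variance(control) / n_c  # squared standard error, control
    b = variance(future) / n_f   # squared standard error, future
    t = (mean(future) - mean(control)) / math.sqrt(a + b)
    dof = (a + b) ** 2 / (a ** 2 / (n_c - 1) + b ** 2 / (n_f - 1))
    return t, dof


# Hypothetical 2-yr return levels (mm/day), one value per ensemble member.
control_levels = [20.0, 21.0, 22.0]
future_levels = [24.0, 25.5, 25.0]

t_stat, dof = welch_t(control_levels, future_levels)

T_CRIT_DF2 = 2.920  # two-sided 10% critical value, 2 degrees of freedom
T_CRIT_DF4 = 2.132  # two-sided 10% critical value, 4 degrees of freedom

# Conservative decision: require the statistic to exceed the larger threshold.
significant = abs(t_stat) > T_CRIT_DF2
print(f"t = {t_stat:.2f}, dof = {dof:.2f}, significant = {significant}")
```

Using the larger critical value is a deliberately cautious choice for this sketch; interpolating a critical value at the computed degrees of freedom would need a t-distribution quantile function.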

Figure A1 shows that the percentage changes in the 2-yr return levels are in good agreement between the two approaches, with the return-level value obtained by pooling all 90 yr of data being approximately equivalent to the mean of the return levels for the individual 30-yr simulations. In terms of assessing the significance of these changes, the t-test and profile likelihood approaches also show good agreement with similar patterns of significance, although 10% fewer grid boxes are identified as being significant for the t test. In general, the results suggest that the t test has comparable power to profile likelihood for discerning significant changes, or, if anything, it is slightly more conservative.

One possible explanation for the difference in power between the two tests relates to the extent to which the two methods are sensitive to autocorrelation within the data. Profile likelihood applied to a GEV fit assumes independence of the seasonal maxima from one year to the next, and thus autocorrelation within the data on multiannual time scales may lead to this approach giving an inaccurate estimate of the uncertainty in return-level values. The t test makes use of the independence of the three ensemble runs, and although the quality of each 30-yr estimate used in the t test will depend on the autocorrelation within the data, this should be at least partly reflected in the intraensemble variance. Profile likelihood essentially evaluates uncertainty in the GEV fit due to the finite sample size, and does not account for any “redness” in the data. By contrast, the t test explicitly estimates the uncertainty in the return-level value due to climate variability on multiannual time scales and longer, as well as the finite sampling of short-time-scale variability. Note that autocorrelation will lead to return-level estimates being biased in both cases (Coles 2001), which will be reflected in the values presented in Fig. A1.
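To give a feel for how redness erodes the information in a series, the sketch below estimates the lag-1 autocorrelation and applies the standard first-order autoregressive approximation for the effective number of independent values, n(1 − r1)/(1 + r1). This approximation strictly applies to the uncertainty of a sample mean, so it is only indicative for GEV return levels; it is an illustration added here rather than part of the analysis, and the function names are hypothetical.

```python
from statistics import mean


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a sequence of seasonal maxima."""
    m = mean(series)
    num = sum((a - m) * (b - m) for a, b in zip(series[:-1], series[1:]))
    den = sum((x - m) ** 2 for x in series)
    return num / den


def effective_sample_size(n, r1):
    """Effective number of independent values for an AR(1)-like series."""
    return n * (1.0 - r1) / (1.0 + r1)


# Illustrative values of r1 only; 90 is the pooled record length in years.
for r1 in (0.0, 0.1, 0.3):
    print(r1, round(effective_sample_size(90, r1), 1))
```

A method that treats all 90 seasonal maxima as independent implicitly uses n = 90, so any positive autocorrelation means the uncertainty it reports is likely to be too small.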

It is noted that the assumption of normality in the t test is unlikely to be a major source of discrepancy between the two approaches, as this has been shown to have a relatively small impact on the assessed significance of changes for the extreme index defined previously (section 2b). It may be argued that bootstrap resampling techniques could be used to artificially enlarge the sample size for use within the t test, thereby increasing our confidence in the change. However, as we have seen in this paper, the extent to which this technique could be applied depends on the degree of redness within the data, since any temporal dependence must be reproduced in the resamples for the bootstrap approximation to be accurate (Ferro et al. 2005).
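One common way to retain short-range temporal dependence in bootstrap resamples is a moving-block bootstrap, sketched below. It is named here purely as an illustration and is not claimed to be the technique of Ferro et al. (2005) or of this paper; the function name, block length, and input series are hypothetical.

```python
import random


def moving_block_bootstrap(series, block_length, rng):
    """Resample a series by concatenating randomly chosen contiguous blocks.

    Dependence at lags shorter than block_length is preserved within blocks;
    dependence at longer lags is not reproduced.
    """
    n = len(series)
    if not 1 <= block_length <= n:
        raise ValueError("block_length must be between 1 and len(series)")
    n_blocks = -(-n // block_length)  # ceiling division
    resample = []
    for _ in range(n_blocks):
        start = rng.randrange(n - block_length + 1)
        resample.extend(series[start:start + block_length])
    return resample[:n]  # trim to the original length


rng = random.Random(0)  # fixed seed so the sketch is reproducible
seasonal_maxima = [float(v) for v in range(30)]  # placeholder 30-yr series
resampled = moving_block_bootstrap(seasonal_maxima, block_length=5, rng=rng)
print(len(resampled))
```

The limitation discussed above shows up directly in the design: with only 30 values per run, blocks long enough to capture decadal variability leave very few distinct blocks to resample from.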

In summary, the above comparison supports the t test as providing a valid assessment of the significance of changes in extreme precipitation. It has the benefit of being applicable to ensembles with some redness in the spectrum of climate noise. It also has the advantage of being a simple technique that is applicable to any extreme index and not just those derived from GEV fitting.

Fig. 1. Percentage changes in extreme precipitation in the three different ensemble members of HadRM3H, for (left) DJF and (right) JJA: (top) Run 1, (middle) Run 2, and (bottom) Run 3. Extreme precipitation is defined as the average precipitation exceeding the 95th percentile for wet days at each grid box, for each season. The changes correspond to the differences in extreme precipitation between the 2071–2100 and 1961–90 periods, assuming the SRES A2 forcing scenario.

Citation: Journal of Climate 21, 17; 10.1175/2008JCLI2082.1

Fig. 2. SNRs for the changes in (top) mean, (middle) extreme (exceeding 95th percentile), and (bottom) very extreme (exceeding 99th percentile) wet-day precipitation, for (left) DJF and (right) JJA. The changes correspond to the differences between the 2071–2100 and 1961–90 periods, assuming the SRES A2 forcing scenario. Values are masked in white where the signal is not significantly different from 0 at the 10% significance level. The percentage of grid boxes across the domain for which there is a significant change is indicated.

Fig. 3. Theoretical SNR required for a significant signal at the 10% significance level as a function of ensemble size, for variance ratios ($\sigma_x^2/\sigma_y^2$) of 1 (solid line) and 0 (dotted line).

Fig. 4. SNRs for changes in extreme precipitation, for spatial pooling over (top) 3 × 3, (middle) 7 × 7, and (bottom) 15 × 15 grid cells, for (left) DJF and (right) JJA. The changes correspond to the differences between the 2071–2100 and 1961–90 periods, assuming the SRES A2 forcing scenario. Values are masked in white where the signal is not significantly different from 0 at the 10% significance level. The percentage of grid boxes across the domain for which there is a significant change is indicated.

Fig. 5. Absolute SNRs for changes in extreme precipitation, averaged over all European land, as a function of spatial pooling (black solid lines) or spatial averaging (black dashed lines) over 1 × 1, 3 × 3, 5 × 5, 7 × 7, and 15 × 15 grid cells, for DJF (thin lines) and JJA (thick lines). The red lines show the same result but for averaging over the Alpine region (44°–48°N, 5°–15°E). The shaded region corresponds to the minimum SNR required for significance at the 10% level for three ensemble members, with the range due to the dependence on the internal variance ratio.

Fig. 6. SNRs for changes in extreme precipitation for (a) GCM-scale changes (≡ 7 × 7 spatial averaging), (b) the RCM finescale component for no spatial pooling, and (c) the RCM finescale component for 3 × 3 spatial pooling, for (left) DJF and (right) JJA. Significant changes in the RCM finescale component correspond to a significant difference between the RCM- and GCM-scale changes in relative units (see text for details). Values are masked in white where the signal is not significantly different from 0 at the 10% significance level. The percentage of grid boxes across the domain for which there is a significant change is indicated.

Fig. A1. Percentage changes in 2-yr return levels and an assessment of their significance using (a) profile likelihood and (b) the t test, for (left) DJF and (right) JJA. Percentage changes are only plotted in (a) where the 80% confidence intervals for the control and future return-level values, determined by profile likelihood, do not overlap [corresponding approximately to a 10% statistical significance level; Kharin and Zwiers (2005)], and in (b) where the changes are significantly different from 0 at the 10% significance level as assessed by the t test. The hatched area indicates those grid boxes for which more than 50% of the seasonal maxima are <5 mm day⁻¹ and, hence, extreme value analysis cannot be applied reliably (Frei et al. 2006). In each case, the percentage of (nonhatched) grid boxes across the domain for which there is a significant change is indicated.

Save
  • Allen, M. R., and W. J. Ingram, 2002: Constraints on future changes in climate and the hydrological cycle. Nature, 419 , 224232.

  • Baede, A. P. M., E. Ahlonsou, Y. Ding, and D. Schimel, 2001: The climate system: An overview. Climate Change 2001: The Scientific Basis, J. T. Houghton et al., Eds., Cambridge University Press, 85–98.

    • Search Google Scholar
    • Export Citation
  • Buonomo, E., R. Jones, C. Huntingford, and J. Hannaford, 2007: On the robustness of changes in extreme precipitation over Europe from two high resolution climate change simulations. Quart. J. Roy. Meteor. Soc., 133 , 6581.

    • Search Google Scholar
    • Export Citation
  • Christensen, J. H., and Coauthors, 2007: Regional climate projections. Climate Change 2007: The Physical Science Basis, S. Solomon et al., Eds., Cambridge University Press, 847–940.

    • Search Google Scholar
    • Export Citation
  • Coles, S., 2001: An Introduction to Statistical Modeling of Extreme Values. Springer, 208 pp.

  • Dequé, M., and Coauthors, 2007: An intercomparison of regional climate simulations for Europe: Assessing uncertainties in model projections. Climatic Change, 81 , S1. 5370. doi:10.1007/s10584-006-9228-x.

    • Search Google Scholar
    • Export Citation
  • Durman, C. F., J. M. Gregory, D. C. Hassell, R. G. Jones, and J. M. Murphy, 2001: A comparison of extreme European daily precipitation simulated by a global and a regional climate model for present and future climates. Quart. J. Roy. Meteor. Soc., 127 , 10051015.

    • Search Google Scholar
    • Export Citation
  • Ekström, M., H. J. Fowler, C. G. Kilsby, and P. D. Jones, 2005: New estimates of future changes in extreme rainfall across the UK using regional climate model integrations. 2. Future estimates and use in impact studies. J. Hydrol., 300 , 234251.

    • Search Google Scholar
    • Export Citation
  • Ferro, C. A. T., A. Hannachi, and D. B. Stephenson, 2005: Simple nonparametric techniques for exploring changing probability distributions of weather. J. Climate, 18 , 43444354.

    • Search Google Scholar
    • Export Citation
  • Folland, C. K., and C. W. Anderson, 2002: Estimating changing extremes using empirical ranking methods. J. Climate, 15 , 29542960.

  • Fowler, H. J., M. Ekström, C. G. Kilsby, and P. D. Jones, 2005: New estimates of future changes in extreme rainfall across the UK using regional climate model integrations. 1. Assessment of control climate. J. Hydrol., 300 , 212233.

    • Search Google Scholar
    • Export Citation
  • Frei, C., R. Scholl, S. Fukutome, J. Schmidli, and P. L. Vidale, 2006: Future change in precipitation extremes in Europe: Intercomparison of scenarios from regional climate models. J. Geophys. Res., 111 .D06105, doi:10.1029/2005JD005965.

    • Search Google Scholar
    • Export Citation
  • Hasselmann, K., 1976: Stochastic climate models. Part I: Theory. Tellus, 28 , 473485.

  • Huntingford, C., R. G. Jones, C. Prudhomme, R. Lamb, and J. H. C. Gash, 2003: Regional climate model predictions of extreme rainfall for a changing climate. Quart. J. Roy. Meteor. Soc., 129 , 16071621.

    • Search Google Scholar
    • Export Citation
  • Hurrell, J. W., 1995: Decadal trends in the North Atlantic Oscillation: Regional temperatures and precipitation. Science, 269 , 676679.

    • Search Google Scholar
    • Export Citation
  • Jones, P. D., and P. A. Reid, 2001: Assessing future climate change in extreme precipitation over Britain using regional climate model integrations. Int. J. Climatol., 21 , 13371356.

    • Search Google Scholar
    • Export Citation
  • Jones, R. G., J. M. Murphy, and M. Noguer, 1995: Simulation of climate change over Europe using a nested regional–climate model. I: Assessment of control climate, including sensitivity to location of lateral boundaries. Quart. J. Roy. Meteor. Soc., 121 , 14131449.

    • Search Google Scholar
    • Export Citation
  • Jones, R. G., J. M. Murphy, M. Noguer, and A. B. Keen, 1997: Simulation of climate change over Europe using a nested regional–climate model. II: Comparison of driving and regional model responses to a doubling of carbon dioxide concentration. Quart. J. Roy. Meteor. Soc., 123 , 265292.

    • Search Google Scholar
    • Export Citation
  • Kharin, V. V., and F. W. Zwiers, 2000: Changes in the extremes in an ensemble of transient climate simulations with a coupled atmosphere–ocean GCM. J. Climate, 13 , 37603788.

    • Search Google Scholar
    • Export Citation
  • Kharin, V. V., and F. W. Zwiers, 2005: Estimating extremes in transient climate change simulations. J. Climate, 18 , 11561173.

  • Kjellström, E., R. Döscher, and H. E. M. Meier, 2005: Atmospheric response to different sea surface temperatures in the Baltic Sea: Coupled versus uncoupled regional climate model experiments. Nord. Hydrol., 36 , 397409.

    • Search Google Scholar
    • Export Citation