Abstract

Climate impact studies constitute the basis for the formulation of adaptation strategies. Usually such assessments apply statistically postprocessed output of climate model projections to force impact models. Increasingly, time series with daily resolution are used, which require high consistency, for instance with respect to transition probabilities (TPs) between wet and dry days and spell durations. However, both climate models and commonly applied statistical tools have considerable uncertainties and drawbacks. This paper compares the ability of 1) raw regional climate model (RCM) output, 2) bias-corrected RCM output, and 3) a conventional weather generator (WG) that has been calibrated to match observed TPs to simulate the sequence of dry, wet, and very wet days at a set of long-term weather stations across Switzerland. The study finds systematic biases in TPs and spell lengths for raw RCM output, but a substantial improvement after bias correction using the deterministic quantile mapping technique. For the region considered, bias-corrected climate model output agrees well with observations in terms of TPs as well as dry and wet spell durations. For the majority of cases (models and stations) bias-corrected climate model output is similar in skill to a simple Markov chain stochastic weather generator. There is strong evidence that bias-corrected climate model simulations capture the atmospheric event sequence more realistically than a simple WG.

1. Introduction

Several regional and global climate model (RCM and GCM) ensembles have been made available over the past decades via public archives [e.g., PRUDENCE (Christensen and Christensen 2007), phase 3 of CMIP (CMIP3), NARCCAP (Mearns et al. 2009), ENSEMBLES (van der Linden and Mitchell 2009), phase 5 of CMIP (CMIP5; Taylor et al. 2012), and the Coordinated Regional Downscaling Experiment (CORDEX; Jacob et al. 2014; Kotlarski et al. 2014)]. They form the basis of state-of-the-art assessments of projected climatic conditions at global [e.g., IPCC Fourth (IPCC 2007) and Fifth Assessment Reports (IPCC 2013)] and regional scales [e.g., United Kingdom Climate Impacts Programme (UKCIP; Jenkins et al. 2008) and Swiss Climate Change Scenarios (CH2011 2011; KNMI 2014)]. Furthermore, climate model ensembles are frequently used by end users in climate impact studies that usually apply statistically postprocessed (downscaled and/or bias corrected) model output to force impact models in order to assess the consequences of climatic changes (e.g., CH2014-Impacts 2014). The majority of these techniques can be summarized as empirical–statistical downscaling (ESD) approaches (Fowler et al. 2007; Maraun et al. 2010). An ESD approach aims to remove systematic climate model biases and to downscale model output from the resolved grid scale to local conditions.

The suitability of individual ESD methods depends on the end-user application and hence requires an appropriate communication of the strengths and weaknesses related to the targeted application. Several recent studies compared the performance of different ESD approaches in cross-validation frameworks to match observed conditions and concluded that quantile mapping (QM) outperforms a number of other ESD techniques (Themeßl et al. 2011; Gudmundsson et al. 2012; Teutschbein and Seibert 2012; Räty et al. 2014; Sachindra et al. 2014). Focusing on QM, further studies addressed the representation of multiday characteristics (Addor and Seibert 2014; Wilcke et al. 2013). Uncertainties exist regarding the stationarity of the underlying transfer functions and biases (Christensen et al. 2008; Boberg and Christensen 2012; Buser et al. 2009; Maraun 2012; Bellprat et al. 2013; Kerkhoff et al. 2014), and the spatial and temporal coherence structure between different point-scale estimates (von Storch 1999; Maraun 2013). By design, QM corrects for biases in the distributional behavior of a parameter (e.g., daily precipitation amounts) but does not explicitly correct for errors in the temporal sequence (e.g., transition probabilities and spell lengths).

Besides ESD approaches, weather generators (WGs) are frequently applied to produce time series for impacts research (e.g., Richardson 1981). The main principle of WGs is to stochastically simulate precipitation series based on Markov chain (MC) model simulations. They are driven by transition probabilities (TPs) between dry and wet states, while precipitation intensity and other variables are thereafter conditioned on the precipitation occurrence (e.g., Richardson 1981; Ines et al. 2011; Keller et al. 2014). Even though WGs are constrained by observed TPs, they may underestimate the frequency of persistent weather situations (i.e., dry and wet spells) as they only account for a limited number of previous states (e.g., Semenov et al. 1998). In contrast, an adequate physical model (i.e., a GCM–RCM chain) may capture extended dry and wet spells more precisely, provided the respective memory effect is appropriately captured.
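The effect of limited memory can be made concrete with a short worked example. It rests on a simplifying assumption that is not made in this study, namely that a first-order chain has a constant probability p that a dry day is followed by another dry day. The length L of a dry spell is then geometrically distributed:

\[ \Pr(L = k) = p^{k-1}(1 - p), \qquad \Pr(L \ge k) = p^{k-1}, \qquad \mathrm{E}[L] = \frac{1}{1 - p}. \]

For an illustrative value of p = 0.8, the mean dry-spell length is 5 days and the probability that a spell lasts at least 10 days is 0.8^9 ≈ 0.13. The tail decays exponentially with k, so such a chain reproduces long spells only to the extent that observed spell lengths follow this geometric form.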

Given this background, the present study validates and compares the performance of an ensemble of raw and bias-corrected (QM) climate model simulations and of an ensemble of WG realizations in reproducing observed transition probabilities and spell-length statistics at a set of stations across Switzerland.

2. Data

a. Observations

Daily precipitation observations in the period 1961–2000 at 61 operational MeteoSwiss weather stations across Switzerland are used for bias correction, evaluation, and estimation of WG parameters. Figure 1 visualizes the locations of the stations with the actual topography (Fig. 1a), a typical RCM topography (Fig. 1b), and a GCM topography (Fig. 1c). Note that many topographical features remain unresolved in the climate models. Figure 1 additionally provides the observed mean precipitation amounts (Fig. 1d) and wet day (>1 mm day−1) frequencies (Fig. 1e) in gridded and station-based observations for the period 1961–2000.

Fig. 1.

(a) The topography of Switzerland with the location of the 61 analyzed MeteoSwiss stations. (b) The topography as represented by the ETHZ–HadCM3Q0 (see Table 1) at a resolution of 0.22° (~25 km) with the considered subset of 21 stations. (c) The topography as represented by the GCM HadCM3Q0 at a resolution of 3.75° × 2.5° with the considered subset of 21 stations. (d) Wet day (>1 mm day−1) frequency in gridded (2 km) observations and at 61 MeteoSwiss stations in the period 1961–2000 (black dots depict the 21-station subset). (e) Mean annual precipitation in gridded (2 km) observations and at 61 MeteoSwiss stations in the period 1961–2000 (black dots depict the 21-station subset). The gridded observational dataset is only used for illustrative purposes. It has a horizontal resolution of 2 km and is available at daily resolution (MeteoSchweiz 2013).

To make the presentation of the results more concise, the focus is put on a subset of 21 stations (Figs. 1b,c). To ensure that the subset of stations samples the whole range of precipitation variability, the whole set of 61 stations was sorted according to the observed wet day frequencies and every third station was selected.

b. Regional climate models

A set of GCM-driven RCMs from the European Union (EU) ENSEMBLES project (van der Linden and Mitchell 2009) for the historical period 1961–2000 is considered (Table 1). The ensemble consists of 14 experiments, combining five GCMs with nine RCMs. All GCM–RCM chains provide data at a horizontal resolution of about 25 km and are forced by historical greenhouse gas concentrations. Basic results are presented for all GCM–RCM chains, but in some of the analyses emphasis is put on the individual realization run by ETH Zurich [employing the RCM Consortium for Small-Scale Modeling Model in Climate Mode (COSMO-CLM; hereafter CLM)] and its driving GCM (HadCM3Q0 at a horizontal resolution of 3.75° × 2.5°; ensemble member Q0 from the HadCM3-based perturbed physics ensemble). In addition, a CLM simulation driven by reanalysis data (ERA-40; Uppala et al. 2005) is used.

Table 1.

RCMs and GCMs used. (Most acronyms are available online at http://www.ametsoc.org/PubsAcronymList. MOHC is Met Office Hadley Centre, SMHI is Swedish Meteorological and Hydrological Institute, C4I is Community Climate Change Consortium for Ireland, DMI is Danish Meteorological Institute, and ICTP is International Centre for Theoretical Physics. REMO is the MPI Regional Model).

3. Methods

The ability of raw and bias-corrected climate model output to match observed TPs in a historical period is evaluated. For this purpose, each day of a given precipitation series (observed or modeled) is categorized into three different states: dry, wet, or very wet. The motivation to distinguish between wet and very wet days is based on the desire to more accurately describe the character of precipitation. This is in particular required for hydrological applications such as the analysis of high-impact flooding events that affect large-scale catchments and often originate from consecutive days with intense precipitation (e.g., the Alpine flood in 2005; MeteoSchweiz 2006).

Climate models are bias-corrected using QM and the results are systematically compared against a large ensemble of WG simulations.

a. Empirical–statistical bias correction: Quantile mapping

A nonparametric empirical implementation of QM is applied (Gudmundsson et al. 2012). It considers a time-dependent correction function, calibrated for each day of the year (DOY) based on a 91-day window centered over the respective day (Themeßl et al. 2011, 2012).

The principle of QM is to correct the daily precipitation amount from a climate simulation (mod), say X at time t for a grid box located over a target station, so that the corrected distribution of Y(X) matches the observed (obs) cumulative distribution function F, according to

 
\[ Y(X) = F_{\mathrm{obs}}^{-1}\left[ F_{\mathrm{mod}}(X) \right]. \tag{1} \]

In a climate scenario framework, transfer functions as defined by Y(X) in a calibration period [Eq. (1)] are assumed to be stationary and thus also valid under future conditions. Values that lie outside the range of calibrated values are typically considered by an extrapolation of the correction function [see also Themeßl et al. (2012)]. In this study, the correction of the 99th percentile is used if the calibrated range of values is exceeded.
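The mapping and the 91-day window can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names are hypothetical, the empirical distributions are linearly interpolated, and the treatment of values above the calibrated range (adding the difference between the observed and modeled 99th percentiles) is only one possible reading of the extrapolation rule.

```python
from bisect import bisect_right


def empirical_cdf(sorted_sample, x):
    """Linearly interpolated empirical CDF, scaled to [0, 1]."""
    n = len(sorted_sample)
    if x <= sorted_sample[0]:
        return 0.0
    if x >= sorted_sample[-1]:
        return 1.0
    hi = bisect_right(sorted_sample, x)  # first index with a value > x
    lo = hi - 1
    frac = (x - sorted_sample[lo]) / (sorted_sample[hi] - sorted_sample[lo])
    return (lo + frac) / (n - 1)


def empirical_quantile(sorted_sample, p):
    """Inverse of empirical_cdf for a probability p in [0, 1]."""
    pos = p * (len(sorted_sample) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_sample) - 1)
    return sorted_sample[lo] + (pos - lo) * (sorted_sample[hi] - sorted_sample[lo])


def quantile_map(x, mod_cal, obs_cal):
    """Map a modeled value x to Y = F_obs^{-1}(F_mod(x))."""
    mod_sorted, obs_sorted = sorted(mod_cal), sorted(obs_cal)
    if x > mod_sorted[-1]:
        # Assumption: outside the calibrated range, apply the additive
        # correction found at the 99th percentile.
        shift = empirical_quantile(obs_sorted, 0.99) - empirical_quantile(mod_sorted, 0.99)
        return x + shift
    return empirical_quantile(obs_sorted, empirical_cdf(mod_sorted, x))


def in_window(doy, center, half_width=45, year_len=365):
    """True if doy lies in the circular 91-day window centered on `center`."""
    diff = abs(doy - center) % year_len
    return min(diff, year_len - diff) <= half_width


def quantile_map_doy(x, doy, mod_cal, obs_cal):
    """mod_cal and obs_cal are lists of (doy, value) pairs from the calibration period."""
    mod_win = [v for d, v in mod_cal if in_window(d, doy)]
    obs_win = [v for d, v in obs_cal if in_window(d, doy)]
    return quantile_map(x, mod_win, obs_win)
```

With a toy calibration sample in which the model is exactly twice as wet as the observations, `quantile_map(3.0, [0, 2, 4, 6, 8], [0, 1, 2, 3, 4])` returns 1.5, that is, the modeled amount is halved.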

b. Weather generator: Transition probabilities and Markov chain model

A precipitation time series X(t) at daily resolution t is discriminated into three states: dry [D; X(t) < 1 mm day−1], wet [W; 1 mm day−1 ≤ X(t) < obs.q50DOY], and very wet [V; X(t) ≥ obs.q50DOY], where obs.q50DOY is the 50th percentile of observed wet-day precipitation (>1 mm day−1) considered for each DOY separately. This is done as precipitation parameters are usually subject to seasonal variations. As local-scale conditions are targeted, the observed threshold for very wet days is considered. Applying the modeled 50th percentile for each modeling strategy separately can yield slightly different results but does not change the overall outcome of the analysis. Further, using a two-state (dry and wet) instead of a three-state MC model yields qualitatively similar results (not shown).
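The three-state discrimination can be sketched as follows. The function names are hypothetical, and the sketch assumes that a day with exactly 1 mm counts as wet, which follows from the definition of the dry state as X(t) < 1 mm day−1. In practice the threshold would be computed per DOY from observed values within the seasonal window; here it is computed from whatever sample is passed in.

```python
from statistics import median


def wet_day_median(observed, wet_threshold=1.0):
    """Median of observed wet-day amounts (the obs.q50 threshold)."""
    wet_amounts = [x for x in observed if x >= wet_threshold]
    return median(wet_amounts)


def classify_day(precip, q50_threshold, wet_threshold=1.0):
    """Return 'D' (dry), 'W' (wet) or 'V' (very wet) for one daily amount."""
    if precip < wet_threshold:
        return "D"
    if precip < q50_threshold:
        return "W"
    return "V"


def classify_series(series, q50_threshold):
    """Classify every day of a precipitation series (mm per day)."""
    return [classify_day(x, q50_threshold) for x in series]
```

For example, with observed amounts `[0.0, 0.2, 1.0, 3.0, 8.0]` the threshold is 3.0 mm, and the series `[0.5, 2.0, 3.0]` is classified as `['D', 'W', 'V']`.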

The approach applied here can be described as a three-state first-order MC model (Wilks 2011), where the TPs are

 
\[ P_{IJ} = \Pr\{ X(t) \in J \mid X(t-1) \in I \}, \qquad I, J \in \{D, W, V\}, \tag{2} \]

with J denoting the present and I the previous day’s state (for D, W, or V) and PIJ the corresponding TP. The setup is completely defined by a combination of nine TPs.

Based on these principles, a state is assigned to each X(t) in observations and the individual model realizations. The PIJ values are calculated for each DOY based on a 91-day window centered over the respective DOY. Finally, all individual PIJ values from the different years are averaged over the entire period. Time series describing the sequence of the three states can be simulated using the estimated PIJ values as MC forcing parameters. For the analysis of TPs and spell lengths, a second model for the wet-day precipitation amounts is not required.
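The estimation and simulation steps can be sketched as below. This is a simplified illustration with hypothetical function names: it estimates a single set of nine TPs from the whole sequence, whereas the study estimates them per DOY in a 91-day window and averages over years. The fallback to a uniform row for a state without any observed successor is an assumption made only to keep the sketch robust.

```python
import random
from collections import Counter

STATES = ("D", "W", "V")


def estimate_tps(sequence):
    """Return {(I, J): P_IJ} from counts of consecutive state pairs."""
    pair_counts = Counter(zip(sequence[:-1], sequence[1:]))
    tps = {}
    for i in STATES:
        total = sum(pair_counts[(i, j)] for j in STATES)
        for j in STATES:
            if total:
                tps[(i, j)] = pair_counts[(i, j)] / total
            else:
                # Assumption: no information for this state, use a uniform row.
                tps[(i, j)] = 1.0 / len(STATES)
    return tps


def simulate_chain(tps, n_days, start="D", rng=None):
    """Simulate a first-order Markov chain of states driven by the TPs."""
    rng = rng or random.Random()
    sequence = [start]
    for _ in range(n_days - 1):
        weights = [tps[(sequence[-1], j)] for j in STATES]
        sequence.append(rng.choices(STATES, weights=weights)[0])
    return sequence
```

An ensemble of realizations, such as the 100 WG simulations used here, would correspond to repeated calls of `simulate_chain` with differently seeded random number generators.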

c. Evaluation strategy

A split-sample approach with independent calibration (1961–80) and validation (1981–2000) periods is applied. Validation is carried out on a monthly basis, with daily values being averaged across each month. To summarize results an annual-mean skill score is defined and applied in certain analyses:

 
\[ S = \frac{1}{m} \sum_{i=1}^{m} \frac{P_{\mathrm{mod},i}}{P_{\mathrm{obs},i}}, \tag{3} \]

where index i denotes months (m = 12) and Pmod (Pobs) denotes the modeled (observed) TPs averaged over each month i. The score S gives an estimate of the average magnitude of the fractional bias in monthly TPs, with an ideal score of 1. Note that large over- and underestimations in different parts of the year may compensate each other.
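A minimal sketch of the score follows, assuming that S is the mean over months of the ratio of modeled to observed monthly TPs. This reading is consistent with an ideal score of 1 and with the compensation effect noted above, but the function name and the example values are purely illustrative.

```python
def skill_score(p_mod, p_obs):
    """Annual-mean ratio of modeled to observed monthly TPs (ideal value: 1)."""
    if len(p_mod) != len(p_obs):
        raise ValueError("p_mod and p_obs must have the same length")
    ratios = [mod / obs for mod, obs in zip(p_mod, p_obs)]
    return sum(ratios) / len(ratios)


# Illustrative values only: a 50% overestimation in six months and a
# 50% underestimation in the other six months cancel in the annual mean.
p_obs = [0.5] * 12
p_mod = [0.75] * 6 + [0.25] * 6
score = skill_score(p_mod, p_obs)
```

In this example `score` equals 1 although every single month is biased, which illustrates why the score should be read together with the monthly TPs.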

4. Results

Here, the skill of QM in correcting RCM-simulated precipitation time series and of the WG in simulating sequences of dry, wet, and very wet days is presented. Afterward, detailed results regarding TPs at the exemplary site of Chur are shown. Chur is selected because of 1) its location in a rather dry inner-alpine valley not resolved by the RCMs and 2) the good observational data quality at this site. The paper then continues with summarized results from the analysis of a 21-station subset, restricting the discussion to four prominent TPs.

Figure 2 shows results for a raw (red) versus bias-corrected (blue) model ensemble of 14 RCMs (Table 1) and 100 WG simulations (gray). The raw ensemble obviously deviates from observations. The majority of raw models overestimate wet day frequencies (Fig. 2a).

Fig. 2.

Skill of a raw (red) and bias-corrected (QM; blue) ensemble of 14 GCM–RCM realizations at 21 Swiss stations (see Fig. 1b) in simulating precipitation characteristics. The stations are ordered from wet (top) to dry (bottom). Shown are (a) wet-day (>1 mm day−1) frequency, (b) mean precipitation, (c) wet-day intensity, (d) the 99th percentile of 5-day accumulated precipitation, the frequency of spell lengths with (e) ≥5 and (f) ≥7 consecutive dry days, the frequency of spell lengths with (g) ≥3 and (h) ≥5 consecutive wet days, and (i) the frequency of spell lengths with ≥3 consecutive very wet days (>obs.q50DOY; mm day−1). Results for an ensemble of 100 weather generator (WG) simulations are also presented (gray). The skill (bias) is defined as the ratio between simulation and observation. Bars indicate the bias range across the respective model ensemble. Calibration of QM and estimation of the WG parameters is carried out using observations in the period 1961–80. Validation is based on the period 1981–2000.

Mean precipitation (Fig. 2b) is primarily overestimated, and biases in wet-day intensity (Fig. 2c) can be pronounced, although no systematic direction is apparent. The 99th percentile of 5-day accumulated precipitation (Fig. 2d) is overestimated in most cases. Raw models tend to clearly and systematically underestimate the frequency of dry spells (Figs. 2e,f) and to overestimate the frequency of wet spells (Figs. 2g,h). The raw model biases with respect to very wet spells are diverse but in many cases substantial, primarily toward too large values (Fig. 2i). Overall, biases are more pronounced for long spell lengths (Figs. 2f,h). The application of QM systematically and massively improves various precipitation diagnostics, independently of the respective raw climate model’s skill. The bias-corrected model ensemble shows skill similar to that of the WG ensemble and partly outperforms it (Fig. 2i). The WG tends to underestimate the frequency of long dry spells (Fig. 2f; majority of sites), of long wet spells (Fig. 2h; at dry sites), and, in a systematic manner, of very wet spells (Fig. 2i) (see also Semenov et al. 1998).

For all modeling strategies, Fig. 3 presents probabilities of different wet (Fig. 3a) and dry (Fig. 3b) spell durations at the stations Sion (SIO) and Chur (CHU), both located in dry valleys not resolved by the RCMs (see Fig. 1). As additional information, the bottom of each panel depicts mean spell durations. At both sites observations show a pronouncedly larger fraction of dry days and accordingly longer average dry than wet spells. While an average wet period lasts for about 2 days, a typical dry spell lasts 5–6 days. Raw model simulations strongly overestimate the probabilities of wet-spell durations and in turn mean wet-spell lengths, and underestimate the length of dry spells and their average duration.

Fig. 3.

Probabilities of (a),(c) wet-spell and (b),(d) dry-spell durations at the stations (top) Sion and (bottom) Chur in the period 1981–2000. The panels show probabilities (x axis) as a function of spell duration (y axis). Mean spell durations are given at the bottom of each panel. The figure shows observations (black line with white circles), 14 raw (red lines) and respective bias-corrected (QM; blue lines) RCM integrations, and 100 Markov chain simulations (WG; gray lines). Results for the ETHZ–HadCM3Q0 are highlighted (thick lines and symbols in corresponding colors). Calibration of QM and WG is undertaken using observations in the period 1961–80. Sion and Chur are chosen as they are situated within dry inner-alpine valleys and are subject to considerable RCM model biases.

Quantile-mapped simulations massively improve the representation of spell-length durations. In a similar manner, the ensemble of 100 weather generator simulations shows a very reasonable agreement with observations. However—and particularly in contrast to quantile-mapped climate model output—the WG systematically underestimates the probability of long dry spells and thus dry-spell durations.
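Spell statistics of the kind discussed above can be derived from a sequence of daily states with a few lines of code. The following sketch uses hypothetical function names and counts spells that reach a minimum length; how spell frequencies are normalized in the figures is not specified here, so the count is an illustrative choice.

```python
from itertools import groupby


def spell_lengths(sequence, states):
    """Lengths of maximal runs of consecutive days whose state is in `states`."""
    lengths = []
    for in_spell, run in groupby(sequence, key=lambda s: s in states):
        if in_spell:
            lengths.append(sum(1 for _ in run))
    return lengths


def spell_stats(sequence, states, min_length):
    """Return (mean spell length, number of spells with >= min_length days)."""
    lengths = spell_lengths(sequence, states)
    mean_length = sum(lengths) / len(lengths) if lengths else 0.0
    n_long = sum(1 for n in lengths if n >= min_length)
    return mean_length, n_long


# Dry spells use {"D"}, wet spells {"W", "V"}, very wet spells {"V"}.
example = list("DDDWVDDWDDDDD")
dry = spell_lengths(example, {"D"})
wet = spell_lengths(example, {"W", "V"})
```

For the example sequence, the dry spells have lengths 3, 2, and 5 days and the wet spells 2 and 1 days. The same functions can be applied to observed, raw, bias-corrected, and WG-simulated state sequences alike.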

Figure 4 presents the annual cycle of the nine TPs that describe a three-state first-order MC in CHU. Observed TPs show a pronounced seasonal cycle. As the wet-day frequency is relatively small (0.29), TPs that describe the transition to a dry state are high, particularly DD. In January, probabilities for two very wet days to follow each other (VV; all other combinations are similarly denoted) are twice as large as in early fall. In warm months DV and VD are considerably larger than in winter.

Fig. 4.

Monthly transition probabilities describing input parameters for a three-state first-order Markov chain with the daily precipitation sequence discriminated into D, W, and V states at the station Chur (see Fig. 1) in the period 1981–2000. The figure shows observations (black line with white circles), 14 raw (red lines) and respective bias-corrected (QM; blue lines) RCM integrations, and 100 Markov chain simulations (WG; gray lines). Results for the ETHZ–HadCM3Q0 are highlighted (thick lines and symbols in corresponding colors). Calibration of QM and WG is undertaken using observations in the period 1961–80.

The set of raw simulations shows a prominent underestimation of DD, WD, and VD. All other TPs are overestimated, most distinctly DW and DV in winter. This is probably due to the common tendency of climate models to simulate too many wet days (Rajczak et al. 2013).

Bias-corrected climate model realizations (QM) show a substantial overall improvement of TPs across the whole year. The QM ensemble is in line with the range of the 100 WG simulations that by and large also match observations well. Results for DW and WD even indicate a better skill for QM compared to the WG, suggesting that memory effects are better captured in QM especially in summer. In this particular example (at CHU), the QM-adjusted RCM simulations tend to underestimate VD, which is also reflected by an overestimation of VW and VV. Nonetheless, observations still lie within the simulated range. This pattern is also seen in some of the other stations considered, in particular at dry sites (not shown). Note that results are qualitatively similar when validating second-order transition probabilities (not shown).

Figure 5 presents results for the full 21-station subset (from Figs. 1b,c) and for four prominent TPs: DD, DW, WW, and DV. Additionally, the ERA-40-driven realization of the RCM CLM run at ETH Zurich and the driving GCM HadCM3Q0 itself (see Table 1) are included by specific symbols in Fig. 5.

Fig. 5.

Summary results for 21 Swiss stations (see Figs. 1b,c) for annual-mean transition probabilities. The stations are ordered from dry (left) to wet (right), showing (a) DD, (b) DW, (c) WW, and (d) DV transition probabilities. Results show the range of skill scores [Eq. (3)] as simulated by 14 raw (red bar) and respective bias-corrected (QM; blue bar) GCM–RCM chains (Table 1), and the range for 100 Markov chain simulations (gray, WG). Results are also shown for the raw (square) and bias-corrected (triangle) versions of the ETHZ–HadCM3Q0 (within ensemble bars), the ETHZ–ERA-40 simulations (green), and the GCM HadCM3Q0 run (violet). Note that the y axis is logarithmic.

Overall, the results confirm those for the individual site of Chur (Fig. 4). There are distinct and systematic biases in raw climate model output, and a clear reduction of biases after bias correction for a large majority of cases. Especially at dry sites, raw RCM and GCM output obviously suffers from systematic biases in TPs. For instance, DW and WW probabilities are overestimated, and DD probabilities are underestimated. Intermodel and interstation spread is large for DV, with a tendency for distinct overestimations at dry sites and underestimations at wet sites.

The raw GCM output shows characteristics qualitatively similar to its dynamically downscaled counterpart (ETHZ–HadCM3Q0) but with larger biases, especially a pronounced overestimation of DW and WW. The reanalysis-driven simulation ETHZ–ERA-40 shows a better skill in most cases. The bias-corrected versions of the three model experiments are typically very close to the observations and no systematic difference in their respective performance is apparent. This suggests that the raw model’s skill does not necessarily determine the skill of its bias-corrected version regarding the representation of TPs.

The most obvious biases are found for DV, where some raw RCMs overestimate the TP by a factor of about 5 at the driest site (SIO). In contrast, the considered GCM underestimates DV at the wettest site, Säntis (SAE), by a factor of 10. In a majority of cases, QM leads to a massive improvement in the representation of TPs.

In general, WG estimates have similar skill to the bias-corrected climate models. A notable feature is an increasing spread of the WG results in the estimates for WW at dry sites (leftmost stations in Fig. 5). This may be related to sampling issues, as two consecutive wet days occur less frequently than the other transitions.

5. Discussion and conclusions

Applying a cross-validation framework, the present study demonstrates that the well-established quantile mapping (QM) technique is able to correct raw RCM precipitation time series in order to represent observed transition probabilities (TPs) and spell-length durations at weather stations across Switzerland. A significant improvement in skill is found, despite the fact that TPs are not explicitly corrected for by the QM methodology. QM merely adjusts the frequency of wet days by correcting the simulated precipitation intensities such that they match observations. Analysis shows that this yields the systematic improvement of TPs and spell lengths in quantile-mapped climate model output.

The main findings of the present study are the following:

  • Raw climate model simulations possess substantial and systematic biases in the representation of (local scale) TPs, spell durations, and multiday precipitation diagnostics. For instance, climate model simulations have a tendency to overestimate the frequency of wet days.

  • Bias correction (QM) of climate model simulations leads to a substantial improvement of the representation of TPs and multiday diagnostics. The improvement of TPs concerns both their magnitude and the representation of their annual cycle. Spell durations agree surprisingly well with observed values after bias correction.

  • The applied WG captures observed TPs and spell lengths well, but is outperformed in some seasons by bias-corrected RCM integrations, in particular for long dry spells. This implies that climate model simulations capture the statistics of atmospheric event sequences more realistically than a simple first-order WG. It is likely that this is due to the short memory of the considered WG. However, this conclusion may require revision with more sophisticated WGs.

Overall, the presented study finds that the application of QM provides obvious added value for the representation of TPs and multiday precipitation characteristics in climate model data at the local scale. This is remarkable, as temporal characteristics are not explicitly corrected for by the deterministic correction of distributional biases by QM.

The general applicability of QM, however, is limited by the demands of the end user. Fields of applicability are single-site impact assessments or multisite studies that consider temporally integrated statistics. For many end-user applications the findings of the presented work indicate a promising applicability of QM. For example, agricultural impact assessments often depend on an accurate representation of local TPs. However, the straightforward application of QM may be erroneous for multisite assessments that rely on the spatiotemporal coherence of the meteorological forcing (Maraun 2013).

Given the large climatic variations among the considered stations (see Figs. 1d,e) and the large set of models used, the findings may be transferable to other settings and regions. Also, Switzerland is a particularly challenging environment in terms of regional climate modeling. This is primarily due to its complex topography and the associated influence of subgrid-scale processes, which are both not fully resolved by current climate model ensembles.

Whether our conclusions are valid under changing climatic conditions remains a question for future research. Besides methodological uncertainties in QM that particularly concern the temporal stability of biases between models and observations (see the introduction), uncertainties regarding the proper interpretation of climate model projections themselves remain (i.e., internal variability and model and scenario uncertainty).

Acknowledgments

We acknowledge the RCM datasets from the EU-FP6 project ENSEMBLES (http://ensembles-eu.metoffice.com). We also acknowledge MeteoSwiss, in particular Christoph Frei, for providing observational data and plotting routines in R. This research was partly funded by the Swiss National Science Foundation through the SNSF Sinergia Project CRSII2_136279, “The Evolution of Mountain Permafrost in Switzerland” (TEMPS), and by the Swiss Federal Office of the Environment in the framework of the project Gefahrengrundlagen für Extremhochwasser an Aare und Rhein (EXAR).

REFERENCES

REFERENCES
Addor
,
N.
, and
J.
Seibert
,
2014
:
Bias correction for hydrological impact studies—Beyond the daily perspective
.
Hydrol. Processes
,
28
,
4823
4828
, doi:.
Bellprat, O., S. Kotlarski, D. Lüthi, and C. Schär, 2013: Physical constraints for temperature biases in climate models. Geophys. Res. Lett., 40, 4042–4047.
Boberg, F., and J. H. Christensen, 2012: Overestimation of Mediterranean summer temperature projections due to model deficiencies. Nat. Climate Change, 2, 433–436.
Buser, C., H. R. Künsch, D. Lüthi, M. Wild, and C. Schär, 2009: Bayesian multi-model projection of climate: Bias assumptions and interannual variability. Climate Dyn., 33, 849–868.
CH2011, 2011: Swiss climate change scenarios CH2011. MeteoSwiss Rep., 88 pp. [Available online at http://ch2011.ch/.]
CH2014-Impacts, 2014: Toward quantitative scenarios of climate change impacts in Switzerland. MeteoSwiss Rep., 136 pp. [Available online at http://www.ch2014-impacts.ch/.]
Christensen, J. H., and O. Christensen, 2007: A summary of the PRUDENCE model projections of changes in European climate by the end of this century. Climatic Change, 81, 7–30.
Christensen, J. H., F. Boberg, O. B. Christensen, and P. Lucas-Picher, 2008: On the need for bias correction of regional climate change projections of temperature and precipitation. Geophys. Res. Lett., 35, L20709.
Fowler, H. J., S. Blenkinsop, and C. Tebaldi, 2007: Linking climate change modelling to impacts studies: Recent advances in downscaling techniques for hydrological modelling. Int. J. Climatol., 27, 1547–1578.
Gudmundsson, L., J. B. Bremnes, J. E. Haugen, and T. Engen-Skaugen, 2012: Downscaling RCM precipitation to the station scale using statistical transformations—A comparison of methods. Hydrol. Earth Syst. Sci., 16, 3383–3390.
Ines, A. V. M., J. W. Hansen, and A. W. Robertson, 2011: Enhancing the utility of daily GCM rainfall for crop yield prediction. Int. J. Climatol., 31, 2168–2182.
IPCC, 2007: Climate Change 2007: The Physical Science Basis. Cambridge University Press, 996 pp.
IPCC, 2013: Climate Change 2013: The Physical Science Basis. Cambridge University Press, 1535 pp.
Jacob, D., and Coauthors, 2014: EURO-CORDEX: New high-resolution climate change projections for European impact research. Reg. Environ. Change, 14, 563–578.
Jenkins, G. J., M. C. Perry, and M. J. Prior, 2008: The climate of the United Kingdom and recent trends. Met Office Hadley Centre Rep., 122 pp.
Keller, D. E., A. M. Fischer, C. Frei, M. A. Liniger, C. Appenzeller, and R. Knutti, 2014: Stochastic modelling of spatially and temporally consistent daily precipitation time-series over complex topography. Hydrol. Earth Syst. Sci. Discuss., 11, 8737–8777.
Kerkhoff, C., H. R. Künsch, and C. Schär, 2014: Assessment of bias assumptions for climate models. J. Climate, 27, 6799–6818.
KNMI, 2014: KNMI 2014 climate scenarios for the Netherlands: A guide for professionals in climate adaptation. KNMI Rep., 34 pp.
Kotlarski, S., and Coauthors, 2014: Regional climate modeling on European scales: A joint standard evaluation of the EURO-CORDEX RCM ensemble. Geosci. Model Dev., 7, 1297–1333.
Maraun, D., 2012: Nonstationarities of regional climate model biases in European seasonal mean temperature and precipitation sums. Geophys. Res. Lett., 39, L06706.
Maraun, D., 2013: Bias correction, quantile mapping, and downscaling: Revisiting the inflation issue. J. Climate, 26, 2137–2143.
Maraun, D., and Coauthors, 2010: Precipitation downscaling under climate change: Recent developments to bridge the gap between dynamical models and the end user. Rev. Geophys., 48, RG3003.
Mearns, L. O., W. Gutowski, R. Jones, R. Leung, S. McGinnis, A. Nunes, and Y. Qian, 2009: A regional climate change assessment program for North America. Eos, Trans. Amer. Geophys. Union, 90, 311.
MeteoSchweiz, 2006: Starkniederschläge August 2005. Arbeitsberichte der MeteoSchweiz 211, 63 pp. [Available online at http://www.meteoschweiz.admin.ch/content/dam/meteoswiss/de/Ungebundene-Seiten/Publikationen/Fachberichte/doc/ab211.pdf.]
MeteoSchweiz, 2013: RhiresD—Gridded data of daily precipitation in Switzerland. Accessed 3 February 2016. [Available online at http://www.meteoswiss.admin.ch/content/dam/meteoswiss/de/service-und-publikationen/produkt/raeumliche-daten-niederschlag/doc/ProdDoc_RhiresD.pdf.]
Rajczak, J., P. Pall, and C. Schär, 2013: Projections of extreme precipitation events in regional climate simulations for Europe and the Alpine region. J. Geophys. Res. Atmos., 118, 3610–3626.
Räty, O., J. Räisänen, and J. Ylhäisi, 2014: Evaluation of delta change and bias correction methods for future daily precipitation: Intermodel cross-validation using ENSEMBLES simulations. Climate Dyn., 42, 2287–2303.
Richardson, C. W., 1981: Stochastic simulation of daily precipitation, temperature, and solar radiation. Water Resour. Res., 17, 182–190.
Sachindra, D. A., F. Huang, A. Barton, and B. J. C. Perera, 2014: Statistical downscaling of general circulation model outputs to precipitation—Part 2: Bias-correction and future projections. Int. J. Climatol., 34, 3282–3303.
Semenov, M. A., R. J. Brooks, E. M. Barrow, and C. W. Richardson, 1998: Comparison of the WGEN and LARS-WG stochastic weather generators for diverse climates. Climate Res., 10, 95–107.
Taylor, K. E., R. J. Stouffer, and G. A. Meehl, 2012: An overview of CMIP5 and the experiment design. Bull. Amer. Meteor. Soc., 93, 485–498.
Teutschbein, C., and J. Seibert, 2012: Bias correction of regional climate model simulations for hydrological climate-change impact studies: Review and evaluation of different methods. J. Hydrol., 456–457, 12–29.
Themeßl, M., A. Gobiet, and A. Leuprecht, 2011: Empirical-statistical downscaling and error correction of daily precipitation from regional climate models. Int. J. Climatol., 31, 1530–1544.
Themeßl, M., A. Gobiet, and G. Heinrich, 2012: Empirical-statistical downscaling and error correction of regional climate models and its impact on the climate change signal. Climatic Change, 112, 449–468.
Uppala, S. M., and Coauthors, 2005: The ERA-40 Re-Analysis. Quart. J. Roy. Meteor. Soc., 131, 2961–3012.
van der Linden, P., and J. Mitchell, Eds., 2009: ENSEMBLES: Climate change and its Impacts: Summary of research and results from the ENSEMBLES project. Met Office Hadley Centre Rep., 160 pp.
von Storch, H., 1999: On the use of “inflation” in statistical downscaling. J. Climate, 12, 3505–3506.
Wilcke, R., T. Mendlik, and A. Gobiet, 2013: Multi-variable error correction of regional climate models. Climatic Change, 120, 871–887.
Wilks, D. S., 2011: Statistical Methods in the Atmospheric Sciences. 3rd ed. Academic Press, 704 pp.