The application of an ensemble reduction technique to the European branch of the World Climate Research Program Coordinated Regional Downscaling Experiment (EURO-CORDEX) ensemble at resolution “EUR-11” (~12.5 km) under the RCP8.5 scenario is presented. The technique is based on monthly mean changes between a reference and two future time periods, calculated for eight regions in Germany, of the parameters near-surface air temperature (tas), precipitation totals (pr), contribution of precipitation from very wet days to precipitation totals (R95pTOT), near-surface specific humidity (huss), and surface downwelling shortwave radiation (rsds). The sensitivity of the reduction procedure with respect to a number of tuning parameters is investigated. When the optimal combination of tuning parameters is applied, the technique allows the reduction from 15 to 7 ensemble members, while the reduced ensemble reproduces about 94% of the spread of the full ensemble. Keeping in mind that climate projection ensembles are expected to grow substantially in the near future, this ensemble reduction technique can be useful to limit the computational efforts necessary for further processing and applications such as impact modeling.
Climate change is one of the biggest challenges facing humankind. Thus, the problem of global climate change forced by anthropogenic emissions of greenhouse gases has to be addressed throughout the whole society, including mitigation and adaptation measures. Several governments have established national adaptation plans, where the planning and implementation of adaptation measures have steadily gained importance over the course of recent years. Several so-called action fields—for example, agriculture, water management, and transport infrastructure—are addressed explicitly. (For the German adaptation plan issued by the Federal Ministry for the Environment, Nature Conservation, Building and Nuclear Safety, see http://www.bmub.bund.de/en/topics/climate-energy/climate/details-climate/artikel/german-strategy-for-adaptation-to-climate-change-summary/?tx_ttnews%5BbackPid%5D=216.)
Climate projections are subject to uncertainties that are caused by 1) the chaotic character of the climate system, 2) the underdetermined state of the climate system, 3) the simplified model description of complex physical processes, and 4) the assumptions on the future development of global societies and economy. To consider these uncertainties in adaptation planning, an ensemble of climate projections intended to cover the whole range of uncertainties is used, taking advantage of different ensemble generation strategies (initial and boundary conditions perturbation, multimodel ensembles, use of different scenarios; Meinshausen et al. 2011).
In recent years the number of available climate projections has increased rapidly. The compilation of all climate projections leads to a so-called ensemble of opportunity since it is not constructed in a systematic way. For that reason the analysis of a larger ensemble is not always beneficial (Knutti et al. 2010). Furthermore, there can be computational restrictions that, for some applications, impede the use of the full ensemble, suggesting a reduction of the ensemble size. To still make optimal use of the information inherent in the full multimodel ensemble, a careful selection of projections to be further processed in a study is crucial. Therefore, it has to be assured that a reduced ensemble still represents the desired characteristics of the full ensemble. Here, the question arises of what are the desired characteristics of an ensemble.
The widespread notion of “robustness” based on model agreement leads to the conclusion that results shared by many models are more likely to be “true” than results of so-called outliers. Consequently, the main characteristics of the ensemble are considered to be the ensemble mean or the regions where model results form clusters. But this statistical interpretation relies on the assumption of independence between model results, which is not fulfilled for present-day multimodel ensembles used for climate projection. The violation of the independence assumption, the sources of interdependence, and their consequences for the validity of the robustness hypothesis have been discussed rigorously in Pirtle et al. (2010) and Parker (2013).
Interdependence between climate models, caused by common assumptions on the climate system and shared observational data for model tuning, as well as replication of code and shared model components across seemingly different models developed by independent institutions, results in similarities between the outputs (and correlation between the errors) of different models (e.g., Knutti et al. 2010). However, the limited gain of new information when adding an interdependent model to an ensemble is not the severest consequence of interdependence. Only independent errors tend to cancel out, such that ensemble statistics in interdependent ensembles can be biased and uncertainty measures systematically underestimate existing uncertainty (Knutti et al. 2010; Tebaldi et al. 2011; Knutti and Sedláček 2013; Evans et al. 2013; Caldwell et al. 2014; and many others).
In contrast to numerical weather prediction ensembles, it is not possible to calibrate the ensemble spread, that is, the dispersion of the outcomes of individual ensemble members in the space of results, to the true uncertainty of climate change, because this uncertainty is impossible to measure. Therefore we have no evidence to decide whether an ensemble is over- or underdispersive with respect to the change signals.
Once the adverse effects of interdependence on the utility of ensembles are recognized, it becomes necessary to assess the contribution of the various projections to the ensemble information. In this assessment, the criterion of low interdependence can sometimes conflict with good model performance (a highly independent, i.e., very dissimilar, projection can be doubtful because of poor model performance). It is therefore a vital objective for ensemble construction to balance interdependence and model performance.
First attempts to reduce interdependence in an ensemble were based on model genealogy (Knutti 2010) but did not prove rigorous in identifying interdependence. Alternatively, quantitative techniques focus on distances and correlations between the output of different models (Masson and Knutti 2011). A selection procedure based on quantitative model distances was developed by Cannon (2015), who applied a multivariate cluster initialization algorithm to multivariate model output. The algorithm selects the most dissimilar, that is, the least interdependent, models out of a given ensemble. Mendlik and Gobiet (2016) propose a cluster analysis in the space of multivariate model output [after dimension reduction by principal component analysis (PCA); Hannachi et al. 2007], and choose one model out of each cluster. The reduced ensemble that results from cluster analysis does not necessarily maximize independence because the cluster analysis is constructed to select representative projections from highly populated regions, that is, regions with many similar members, but tends to disregard more isolated projections.
The methods for reducing ensemble interdependence mentioned so far do not include measures of model performance. Model performance can be incorporated into the selection only by a two-stage process that rejects badly performing projections in a first step and then selects the most independent projections out of the remaining ones. The stepwise procedure does not enable a trade-off between similarity and model performance.
Sanderson et al. (2015a) also apply a PCA prior to defining a measure of similarity between each pair of models. From this measure, an independence score is calculated for the whole ensemble that is then maximized by excluding highly dependent models. The reduced ensembles resulting from Cannon’s and from Sanderson’s techniques are similar. But as an additional criterion, Sanderson et al.’s (2015a) method can incorporate model performance with respect to observational data (i.e., model bias) such that a combined independence–performance score (called independent ensemble quality score) is optimized. Knutti et al. (2017) recently applied Sanderson’s independent ensemble quality score to construct a climate projection weighting scheme for the study of Arctic sea ice and temperature evolution.
Here, we present a reduction of the number of ensemble members of the regional climate projection ensemble for Europe, reducing interdependencies in the ensemble with the objective of preserving relevant information on potential future climate states. For this, we apply the procedure proposed by Sanderson et al. (2015a), which consists of two main steps: 1) a dimension reduction realized by a PCA and 2) the reduction of the ensemble with the objective of maximizing the independent ensemble quality score.
The paper is structured as follows: In section 2 we briefly present the underlying model data, and the method is described in sections 3a–3c. Section 3d shows the functioning of the method in detail. Section 4 includes the results of the presented reduction scheme along with an evaluation approach and a sensitivity analysis with respect to tuning parameters. In section 5 a summary is given and conclusions are drawn.
The results presented in this paper depend on the initial ensemble that is used for the reduction process described in section 3. The starting point for the creation of the initial ensemble is the 17-member European branch of the World Climate Research Program Coordinated Regional Downscaling Experiment (EURO-CORDEX) ensemble at resolution “EUR-11” (~12.5 km; Jacob et al. 2014) available on 1 February 2017. For this ensemble, several global climate models (GCMs) were downscaled by several regional climate models (RCMs). The corners of the EURO-CORDEX EUR-11 domain are located at 60.21°N, 315.86°E (top left); 66.65°N, 64.4°E (top right); 22.20°N, 350.01°E (lower left); and 25.36°N, 36.30°E (lower right). The grid spacing of the simulations is 0.11° (12.5 km). The complete description of the domain and the grid can be found on the CORDEX Internet site (http://cordex.org). The performance of the regional models was evaluated on the basis of experiments driven by reanalysis (ERA-Interim) by Kotlarski et al. (2014). This ensemble is used as the starting point because the data are relevant for applications like impact modeling, and the results derived from the analysis of the ensemble are used for policy consulting. Furthermore, the ensemble is large enough to make a reduction necessary. This problem will intensify for the EURO-CORDEX EUR-11 ensemble in the near future because the ensemble will grow by about 20 members, which are currently being produced in the German project Regional Climate Projections Ensemble for Germany [Regionale Klimaprojektionen Ensemble für Deutschland (ReKliEs-De); http://reklies.hlnug.de/startseite.html].
Even if the EURO-CORDEX EUR-11 ensemble is large relative to the ensembles available for other CORDEX domains, it is still an ensemble of opportunity. For ensembles of regional climate model experiments this holds in several ways. First, the available global climate experiments are already an ensemble of opportunity, determined by the number of modeling centers around the world that run an Earth system model and are able and willing to contribute to the Coupled Model Intercomparison Project (CMIP). The same is true for the regional models that are used to downscale the global models in the framework of CORDEX. Second, only a subset of the available global experiments is downscaled for the regional ensemble. Third, and maybe most important, the number of experiments produced with each RCM depends on funding, available computational resources at the regional modeling centers, and other nonscientific reasons and is thus arbitrary to a certain degree. The consequence is that the GCMs and RCMs are not equally distributed in the ensemble. This strongly affects the characteristics of the ensemble. Furthermore, the ensemble grows gradually, and therefore the point in time at which the analysis is performed also affects the results. Despite all that, the available ensemble of opportunity is still the best database to analyze likely future climate change in Europe.
For the analysis presented here, we used all the EURO-CORDEX EUR-11 experiments for which both an historical and an RCP8.5 experiment were available on 1 February 2017. Three experiments are excluded [Regional Model of MPI Climate Service Centre (REMO2009) with MPI-M-MPI-ESM-LR (r1 and r2) forcing and ALADIN53 with CNRM-CERFACS-CNRM-CM5 forcing], because they are not available on the common EURO-CORDEX EUR-11 grid. Two experiments with the regional climate model COSMO-CLM (forced by MIROC5 and CanESM2), conducted in the framework of ReKliEs-De, were added to the ensemble. The members of the resulting ensemble are listed in Table 1; individual climate projections (CPs) are addressed by their numbers as CP1–CP15.
In the following, this ensemble is referred to as the full ensemble. For each of the experiments listed in Table 1, long-term monthly mean values for the periods 1971–2000, 2021–50, and 2071–2100 have been used for the analysis. The variables are near-surface air temperature (tas), precipitation total (pr), near-surface specific humidity (huss), and surface downwelling shortwave radiation (rsds). For pr, daily values are also used to calculate the percentage of monthly precipitation from very wet days relative to the monthly precipitation total (R95pTOT; note that very wet days are days with higher-than-95th-percentile precipitation w.r.t. the actual period).
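The R95pTOT index described above can be illustrated with a minimal sketch. It assumes that the 95th percentile is taken over wet days of the same period and that a wet day has at least 1 mm of precipitation; both choices, as well as the function name `r95ptot` and the parameter `wet_day_threshold`, are illustrative assumptions and are not specified in the text.

```python
import numpy as np

def r95ptot(daily_pr, wet_day_threshold=1.0):
    """Percentage of total precipitation that falls on very wet days.

    Assumptions (not taken from the paper): the 95th percentile is
    computed over wet days (>= wet_day_threshold, in mm) of the same
    period that is being analyzed.
    """
    daily_pr = np.asarray(daily_pr, dtype=float)
    total = daily_pr.sum()
    wet = daily_pr[daily_pr >= wet_day_threshold]
    if wet.size == 0 or total == 0.0:
        return 0.0
    p95 = np.percentile(wet, 95)          # threshold for "very wet"
    very_wet = wet[wet > p95]
    return 100.0 * very_wet.sum() / total

# Ten dry days, nineteen days with 2 mm, one day with 40 mm.
example = [0.0] * 10 + [2.0] * 19 + [40.0]
share = r95ptot(example)
```

In this toy series only the 40-mm day exceeds the percentile threshold, so the index equals that day's share of the 78-mm total.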
Before applying the method for reducing the regional climate projection ensemble for Europe, a reduction of the climate data dimension is necessary (section 3a). In section 3b we introduce the variables we will investigate with the methodology described in section 3c. The results of applying the method are presented in section 3d.
a. Dimension reduction
Given the high dimensionality of climate model output, dimension reduction is an indispensable step in model data processing. The common practice of restricting consideration to temperature and/or precipitation using broad spatial and temporal averages is a kind of dimension reduction. Because of its simplicity, this reduction is easy to implement and gives a well-arranged picture, but it might not capture the complexity of climate.
More sophisticated statistical dimension reduction procedures such as PCA, factor analysis, projection pursuit, or multidimensional scaling have been developed to concentrate a maximum of information inherent in a dataset into a minimum of parameters. It is well known that PCA is the best linear dimension reduction technique in the mean-square-error sense, and it is used in many fields of meteorology and climatology under the designation “empirical orthogonal functions” (Hannachi et al. 2007). In our study, PCA is applied to reduce the complexity of climate change at regional and monthly scales to a parsimonious representation without significant loss of information, which is then possible to process in the main step of the procedure (see section 3c).
Our interest is focused on the area of Germany, which is subdivided into eight climatic regions to account for regional climate characteristics [the regions developed within the framework of the test reference years (TRY regions); see Krähenmann et al. 2017]. Intra-annual variation is allowed for by taking monthly means. For simplicity, the 30-yr periods are considered stationary, because our focus is on the differences between periods. In total, we investigate five parameters (abbreviations agree with CORDEX standardized variable naming). As mentioned in section 2, in addition to tas and pr, we consider further parameters that are important for climate change assessment and impact modeling: R95pTOT (an extreme index of precipitation; Leander et al. 2014), huss, and rsds. In the following analysis, the parameters pr and R95pTOT are divided by 2 to avoid overstating the importance of precipitation in the composition of the reduced ensemble, because both variables describe aspects of precipitation.
In general, it is possible to incorporate any direct or derived meteorological variable in the analysis including statistical parameters of their distribution like standard deviation and extremal indices, and parameters that describe spatial and temporal dependence.
To represent the reference and future climate of each of the selected regions, 30-yr monthly means (1971–2000) and climate change signals between the reference period and the future 30-yr monthly means (2021–50 and 2071–2100) are calculated from all RCM climate projections in our ensemble of opportunity. To characterize the recent climate, the 30-yr monthly means for 1971–2000 are arranged into a matrix ref of dimension m = 15 (number of RCM climate projections) times n = 480 (5 parameters × 8 regions × 12 months). Climate change is described by two corresponding matrices: ch1 referring to the change between the periods of 1971–2000 and 2021–50, and ch2 to 1971–2000 and 2071–2100.
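As an illustration of how the three matrices could be assembled, the following sketch starts from arrays of 30-yr monthly means with shape (projections, parameters, regions, months). The array layout, the function name `build_matrices`, and the use of absolute differences as change signals are assumptions made for this example only.

```python
import numpy as np

def build_matrices(clim_ref, clim_near, clim_far):
    """Flatten climatologies into the matrices ref, ch1, and ch2.

    Each input has shape (m, parameters, regions, months). Change
    signals are taken as absolute differences (an assumption).
    """
    m = clim_ref.shape[0]
    ref = clim_ref.reshape(m, -1)                  # 1971-2000
    ch1 = (clim_near - clim_ref).reshape(m, -1)    # 2021-50 minus reference
    ch2 = (clim_far - clim_ref).reshape(m, -1)     # 2071-2100 minus reference
    return ref, ch1, ch2

# Synthetic stand-in data: 15 projections, 5 parameters, 8 regions, 12 months.
rng = np.random.default_rng(0)
shape = (15, 5, 8, 12)
clim_ref = rng.normal(size=shape)
clim_near = clim_ref + rng.normal(scale=0.5, size=shape)
clim_far = clim_ref + rng.normal(scale=1.0, size=shape)
ref, ch1, ch2 = build_matrices(clim_ref, clim_near, clim_far)
```

With this layout every row is one climate projection and the 480 columns are ordered parameter by parameter, which keeps the columns of one parameter contiguous.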
To assess the effect of the aggregation within the eight climatic regions, we will repeat the analysis with analogous matrices, which contain the same variables as ref, ch1, and ch2 without regional aggregation. Instead of climatic regions, the variables will be aggregated over 3 × 3 grid cells of the original model grid, leading to matrices of dimension 15 × 18 420. We follow the approach of Sanderson et al. (2015a) to analyze climate projection similarity directly on the change matrix ch2 and systematically test various ways to take the temporal evolution of climate change into account, including ch1. Additionally, Sanderson et al. (2015a) propose the option of a quality weighting, using some climate projection quality measure based on ref, such that the selection of climate projections with smaller error is gradually preferred in the final ensemble.
1) Uniqueness weighting
For technical reasons, it is necessary to standardize the matrix (ch1, ch2, or ref depending on the calculation) to have column means 0 (ensemble mean of the respective variable at a given region and month subtracted from each column element) and variance 1 within each parameter (to normalize different measuring units).
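The standardization step can be sketched as follows. The example assumes that the columns of one parameter are contiguous and that "variance 1 within each parameter" means a single pooled scaling factor per parameter; the function name `standardize` and both of these choices are assumptions, not details confirmed by the text.

```python
import numpy as np

def standardize(X, n_params):
    """Center each column and scale each parameter block to unit variance.

    X has shape (m, n) with the n columns grouped into n_params
    contiguous blocks of equal size (assumed layout).
    """
    X = np.asarray(X, dtype=float)
    m, n = X.shape
    centered = X - X.mean(axis=0, keepdims=True)   # subtract ensemble mean
    blocks = centered.reshape(m, n_params, n // n_params)
    scale = blocks.std(axis=(0, 2), keepdims=True)  # one factor per parameter
    return (blocks / scale).reshape(m, n)

rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=5.0, size=(15, 480))
Z = standardize(X, n_params=5)
```

A pooled factor per parameter removes the measuring units while preserving the relative size of the signals across regions and months within one parameter.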
The standardized matrix $X$ is decomposed by singular value decomposition:
$$X = U \Lambda V^{\mathsf{T}}, \qquad (1)$$
where $U$ is the m × n matrix of the scores that describe to what extent a principal component is present in each climate projection. The matrix $\Lambda$ is the n × n matrix of singular values (square roots of eigenvalues, in descending order) that indicate the share of total variance represented by the respective principal component. The first m columns of the n × n matrix $V$ are the singular vectors or components, where the elements of each vector are the loadings of the original variables on the component. By truncating $U$ and $V$ to t < m principal components, excessive noise is filtered from the representation so as to retain only essential information about the climate variables. As the choice of t is subjective, we compare the results for different values of t, starting with a representation of more than 95% of total variance and gradually reducing it to less than 80%.
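The decomposition and truncation can be sketched with NumPy's singular value decomposition. The function name `truncate_pca` and the rule of choosing t as the smallest number of components that reaches a given share of variance are assumptions for illustration; the study itself compares several values of t.

```python
import numpy as np

def truncate_pca(X, var_share):
    """Return scores, singular values, and components truncated to t PCs.

    t is the smallest number of components whose cumulative explained
    variance reaches var_share (illustrative selection rule).
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    t = int(np.searchsorted(np.cumsum(explained), var_share)) + 1
    t = min(t, s.size)
    return U[:, :t], s[:t], Vt[:t], explained

# Synthetic matrix with known singular values 3, 2, 1.
rng = np.random.default_rng(1)
Q1, _ = np.linalg.qr(rng.normal(size=(6, 3)))
Q2, _ = np.linalg.qr(rng.normal(size=(10, 3)))
X = Q1 @ np.diag([3.0, 2.0, 1.0]) @ Q2.T
U_t, s_t, Vt_t, explained = truncate_pca(X, var_share=0.9)
```

In the synthetic case the first two components carry 13/14 of the variance, so a 90% target keeps t = 2 components.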
The row vectors of $U$ (scores) represent the climate projections in the vector space spanned by the principal components $V$. The Euclidean norm of the difference between row vector $U_i^t$ and row vector $U_j^t$ (t for truncated) is the distance between climate projection i and climate projection j in the space of principal components:
$$\delta_{ij} = \lVert U_i^t - U_j^t \rVert. \qquad (2)$$
The distance between two climate projections in the space of principal components is a measure of interdependence. Sanderson et al. (2015a) proposed the following similarity function S on the basis of climate projection distances:
$$S(\delta_{ij}) = \exp\left[-\left(\frac{\delta_{ij}}{D_u}\right)^{2}\right]. \qquad (3)$$
This similarity function takes the value 1 if the distance is 0 (maximum similarity, climate projections are equal) and approaches 0 for increasing distances (no similarity, climate projections are independent). The parameter Du, called “radius of similarity,” can be used to adjust the (nonlinear) decrease of the exponential function to the desired range of distances. We choose the mean distance between two climate projections as a default as also suggested by Sanderson et al. (2015a).
The effective repetition $R_u$ of climate projection i is derived from the similarities of climate projection i to all other climate projections in the ensemble:
$$R_u(i) = 1 + \sum_{j \neq i} S(\delta_{ij}), \qquad (4)$$
and the uniqueness weighting of climate projection i in the ensemble is its reciprocal:
$$w_u(i) = \frac{1}{R_u(i)}. \qquad (5)$$
The formula for $w_u$ results in a weighting close to 1 for nearly independent climate projections, but in smaller weights for more strongly dependent climate projections.
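The chain from distances to uniqueness weights can be sketched as below, assuming Euclidean distances between truncated score vectors and a Gaussian-shaped similarity as in Sanderson et al. (2015a). The function name `uniqueness_weights` and the argument names are hypothetical.

```python
import numpy as np

def uniqueness_weights(scores, radius=None):
    """Uniqueness weight of each projection from its truncated scores.

    scores has shape (m, t). If radius is None, the mean pairwise
    distance is used as the radius of similarity.
    """
    scores = np.asarray(scores, dtype=float)
    diff = scores[:, None, :] - scores[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))          # pairwise distances
    off_diag = ~np.eye(dist.shape[0], dtype=bool)
    if radius is None:
        radius = dist[off_diag].mean()
    sim = np.exp(-(dist / radius) ** 2)             # 1 if equal, -> 0 if far
    repetition = 1.0 + (sim * off_diag).sum(axis=1)
    return 1.0 / repetition

# Two identical projections and one distant projection.
scores = np.array([[0.0, 0.0], [0.0, 0.0], [100.0, 0.0]])
w_u = uniqueness_weights(scores, radius=1.0)
```

The two identical projections share their weight (0.5 each), while the isolated projection keeps a weight of 1.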
This procedure is performed on each of the matrices ref, ch1, and ch2, yielding different uniqueness weightings depending on which matrix was used. With the objective to maximize the spread of climate change (maximum change − minimum change of the considered climatological parameters) in both future time periods simultaneously, we compare various uniqueness weightings based on ch2 alone, on ch1 and ch2 concatenated (columns of ch2 appended to columns of ch1), and on the weights for ch1 and ch2 added [$w_u(i) = w_u^{\mathrm{ch1}}(i) + w_u^{\mathrm{ch2}}(i)$].
2) Quality weighting
A second climate projection weighting is defined in Sanderson et al. (2015a): the quality weighting $w_q$ dependent on the error of the climate projection. The calculation of $w_q$ is based on the matrix ref, which contains the model output of all climate projections in the reference period aggregated to the monthly means on the TRY regions described in section 3b. A vector $X_o$ is generated from the observational equivalents to the variables contained in ref to calculate the similarity of the climate projections to the observations. To this end, the vector $X_o$ is projected onto the principal components of ref. Analogous to Eq. (1), $X_o$ is constructed as follows:
$$X_o = U_o \Lambda V^{\mathsf{T}}. \qquad (6)$$
The representation is again truncated to t principal components and the quality weighting is calculated between the quantities $U_i^t$ and $U_o^t$:
$$w_q(i) = \exp\left[-\left(\frac{\lVert U_i^t - U_o^t \rVert}{D_q}\right)^{2}\right]. \qquad (7)$$
This weighting is meant to give less weight to climate projections with high errors, and consequently these climate projections are excluded from the ensemble first. The quantity Dq is an analog to Du [Eq. (3)] determining how strongly the error of a climate projection is penalized. The default value is again the mean distance between climate projections and the observation. As the question of penalization is a critical one, we also test reduction schemes with a very high value of Dq such that the quality weighting is near unity for all climate projections.
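A minimal sketch of the quality weighting follows. It assumes that the observation vector has been standardized with the same column means and scaling factors as ref, and that the projection onto the principal components is done by multiplying with the truncated components and dividing by the singular values. The function name `quality_weights` and its arguments are hypothetical.

```python
import numpy as np

def quality_weights(ref_std, obs_std, t, radius=None):
    """Quality weight of each projection from its distance to observations.

    ref_std: standardized reference matrix, shape (m, n).
    obs_std: observation vector of length n, standardized like ref_std.
    radius:  D_q; defaults to the mean projection-observation distance.
    """
    U, s, Vt = np.linalg.svd(ref_std, full_matrices=False)
    U_t = U[:, :t]
    u_obs = (obs_std @ Vt[:t].T) / s[:t]      # observation in score space
    dist = np.linalg.norm(U_t - u_obs, axis=1)
    if radius is None:
        radius = dist.mean()
    return np.exp(-(dist / radius) ** 2)

# Sanity check: if the "observation" equals projection 0, that projection
# has zero distance and therefore the maximum weight of 1.
rng = np.random.default_rng(2)
ref_std = rng.normal(size=(6, 20))
w_q = quality_weights(ref_std, ref_std[0], t=3)
```

Passing a very large `radius` reproduces the case mentioned above in which the quality weighting is near unity for all climate projections.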
3) Combined uniqueness and quality weighting
Sanderson et al. (2015a) combine the uniqueness and quality weightings into a product and sum up the weights to form the independent ensemble quality score (IEQS) such that climate projection uniqueness and climate projection performance both contribute to the score $E_m$:
$$E_m = \sum_{i=1}^{m} w_u(i)\, w_q(i). \qquad (8)$$
Climate projections are now successively excluded from the ensemble: The IEQS is calculated for the ensemble of opportunity and for every possible subensemble of cardinality m − 1. Typically, the IEQS increases when a climate projection with low uniqueness weight is removed. The subensemble with the highest IEQS is retained. From the (m − 1) ensemble again every possible (m − 2) subensemble is formed and evaluated, and the subensemble with highest gain in IEQS is chosen. This procedure is repeated until an appropriate stopping criterion is satisfied. This could be the maximum of the IEQS over all subensembles, the lowest IEQS still above the IEQS of the ensemble of opportunity, or, from a practical viewpoint, a desirable number of climate projections.
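The stepwise exclusion can be sketched as a greedy backward elimination. The sketch assumes that the pairwise distances and the radius of similarity are computed once on the full ensemble and kept fixed while members are removed, and that the IEQS is the sum of the products of uniqueness and quality weights; whether the original procedure recomputes the PCA for each subensemble is not stated here. All function and variable names are hypothetical.

```python
import numpy as np

def ieqs(dist, members, radius, w_q=None):
    """IEQS of a subensemble given a full pairwise distance matrix."""
    idx = np.asarray(members)
    sim = np.exp(-(dist[np.ix_(idx, idx)] / radius) ** 2)
    w_u = 1.0 / sim.sum(axis=1)   # the diagonal supplies the leading 1
    q = np.ones(idx.size) if w_q is None else np.asarray(w_q)[idx]
    return float((w_u * q).sum())

def backward_elimination(dist, radius, w_q=None):
    """Remove one member at a time, keeping the subensemble with highest IEQS."""
    members = list(range(dist.shape[0]))
    history = [(tuple(members), ieqs(dist, members, radius, w_q))]
    removed = []
    while len(members) > 1:
        candidates = [[k for k in members if k != j] for j in members]
        scores = [ieqs(dist, c, radius, w_q) for c in candidates]
        best = int(np.argmax(scores))
        removed.append(members[best])
        members = candidates[best]
        history.append((tuple(members), scores[best]))
    return removed, history

# Toy case: member 0 is similar (S = 0.9) to both others,
# which are dissimilar to each other (S = 0.1).
d = lambda s: np.sqrt(-np.log(s))
dist = np.array([[0.0, d(0.9), d(0.9)],
                 [d(0.9), 0.0, d(0.1)],
                 [d(0.9), d(0.1), 0.0]])
removed, history = backward_elimination(dist, radius=1.0)
```

In the toy case the redundant member 0 is removed first and the score rises, which mirrors the typical behavior described above. A stopping criterion can then be applied to `history`, for example the subensemble with the maximum score or a fixed target size.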
The first step toward the reduction of the ensemble consists of a PCA. The PCA identifies the dominant orthogonal patterns, called principal components, which the matrix is composed of. (The contribution of each principal component to the total variance is summarized in Fig. 6, described in more detail below.) The first five (four) principal components (PCs) reproduce more than 87% (83%) of the variance in ch2.
The principal components represent important features of multivariate climate change and are therefore a useful tool to get an impression of the resulting intervariable correlations in climate change. In Fig. 1 (left panel) the loadings of the first PC (with the highest explained variance) of the matrix ch2 (referring to climate change between the periods of 1971–2000 and 2071–2100) are shown. The plot is dimensionless; the white color indicates climate change signals equal to the ensemble mean change of the full ensemble. Although the principal components indicate a direction only and their sign is not determined a priori, once the sign has been fixed the decomposition is unique. Here, the signs have been chosen so as to indicate stronger increase/weaker decrease w.r.t. mean change by positive loadings (green color) and weaker increase/stronger decrease by negative loadings (pink color). For temperature, green means stronger warming; for precipitation, green implies wetter conditions and more frequent extreme precipitation, respectively; furthermore, green means more humidity and more radiation than under ensemble mean change. In contrast, pink implies a weaker temperature increase, more drying, less extreme precipitation, and less humidity and radiation (Fig. 1). The first PC shows that a stronger-than-mean warming from May to October is associated with an only slightly stronger-than-mean precipitation decrease, an increased extreme precipitation index, and higher-than-mean radiation from July to September, whereas specific humidity shows a stronger-than-mean increase across all seasons (this is the pattern for a positive score; if the score is negative, the pattern is inverted). The second PC of ch2, shown in Fig. 1 (right panel), presents a very different change pattern, with lower-than-mean temperature increase in the cooler seasons and higher-than-mean increase only in July and August. The summer drying is more pronounced than in the first PC. The change in specific humidity is much lower than the mean.
The PCA is constructed to represent the climate projections as a linear combination of the PCs weighted by the scores. The scores of all climate projections in the space of climate change signals between the periods of 1971–2000 and 2071–2100 with respect to the first and second PC are shown in Fig. 2 (left panel). The scores of CP14 suggest a very strong temperature increase (compared to the ensemble mean) with extremely dry summers because the PCs are added. This suggestion is confirmed by the changes in CP14 (Fig. 3). On the other hand, CP7 has negative scores, which results in a rather cool and wet climate compared to the ensemble mean (Fig. 3). CP10 is special in that it presents the first PC in almost pure form, nearly coinciding with the x axis. Its position is reflected by warm summers but average precipitation change (Fig. 3).
In contrast, the scores of all climate projections in the space of modeling error in the period 1971–2000 with respect to the first and second PC along with the scores of the observation are shown in Fig. 2 (right panel). Noticeably, the positions of the individual projections in the two spaces of principal components are different, underlining the complementarity of information on change signals and errors. The translation of the observations (blue cross) away from the origin (ensemble mean) indicates that the ensemble as a whole is biased.
The effect of the stepwise exclusion of climate projections from the ensemble of opportunity is illustrated in Fig. 4. The independent ensemble quality score increases while the ensemble is reduced down to eight members. Excluding one additional climate projection, the score remains nearly constant. When the exclusion of climate projections is continued further, the IEQS starts to decrease rapidly.
In the score plot, Fig. 2 (left panel), the climate projections belonging to the optimal seven-member ensemble are marked by red crosses. The selected models are mainly situated in the periphery of the cloud, a fact that corresponds well to our objective to select the most dissimilar climate projections. The optimal five-member ensemble contains the same climate projections, except for CP2 and CP10, which seems plausible given their scores on the first two PCs. The inclusion of CP5 appears doubtful in this plot, but is justified by its extraordinarily high score on the second PC of the change between the periods 1971–2000 and 2021–50 (not shown).
a. Reduction scheme used based on tas, pr, R95pTOT, huss, and rsds
As described in section 3, the reduction is based on the variables tas, pr, R95pTOT, huss, and rsds. Here, we discuss the results of a reduction scheme (sets of tuning parameters needed to determine the weights exactly) using all five variables on the eight climatic TRY regions, added uniqueness weights for the two future time periods, five PCs, and no quality weighting. This scheme will further serve as our benchmark in section 4b.
The decision to omit quality weighting depending on model error as proposed by Sanderson et al. (2015a) was motivated by the fact that all projections are going to be bias corrected before further processing or distributing them to users. The bias correction will apply the multivariate quantile-mapping procedure found in Cannon (2016) that corrects for marginal distribution as well as intervariable covariance structure. We assume that after bias correction, the remaining errors—in terms of the quantities considered here—will be comparable. For comparison, the result of an identical reduction scheme, where the quality weighting is activated, will be presented in section 4b.
On the other hand, the spirit of quality weighting as originally conceived reaches much further than conventional bias evaluation. Quality weighting seeks to evaluate the plausibility of the projected climate change. Unfortunately, this plausibility is not straightforward to deduce from the projection’s performance in the reference period, as nonstationarity of the bias w.r.t. long-term variability, large-scale circulation, and state of the climate system greatly interferes with corresponding conclusions (Maraun et al. 2017). At the time of writing this manuscript, the scientific literature does not contain evaluation methods capable of assessing the physical plausibility of changes projected by RCMs. Existing attempts to evaluate physical plausibility in GCMs, such as emergent-constraint approaches (Collins et al. 2012), are not easy to transfer to RCMs because of the limited domain. The task is further complicated when the focus is shifted from variables like pressure, geopotential, and sea surface temperature to precipitation, wind speed, and humidity. We are nevertheless attentively following new developments and are ready to incorporate criteria that are appropriate to measure physical plausibility of RCM output.
When the quality weighting is dropped, the Sanderson et al. (2015a) method will minimize interdependence within the ensemble without constraints, that is, it will select the most dissimilar projections, which span a large spread between them. To monitor the performance of the procedure we will compare the spread produced by a reduced ensemble with the spread of the full ensemble of opportunity.
Because of the high dimensionality of the model output, this comparison is rather complex. Maps of the ensemble spread per variable and month, dependent on the number of climate projections in the ensemble subset, are a suitable way to visualize the effect of the ensemble reduction. To get a more complete picture of the performance of the reduction scheme, the spread is not presented for the eight regions used to form the matrices but rather on a 3 × 3 gridbox aggregation. For a more compact representation, we decided to aggregate months to seasons.
An example of such a plot is shown in Fig. 5 (spread in tas at 3 × 3 aggregated grid points, four seasons, two future periods for the full ensemble of 15, and reduced ensembles of 10, 7, 6, 5, and 4 climate projections) for the reduction scheme using the added uniqueness weights from ch1 and ch2 for the variables tas, pr, R95pTOT, huss, and rsds on eight climatic regions with five principal components (see Fig. 6: 87.4% of total variance) without quality weighting. The order of exclusion is CP13, CP3, CP9, CP8, CP4, CP15, CP1, CP6, CP11, CP12, CP14, CP2, CP10, CP5, and CP7, with the maximum IEQS at 8 climate projections. The plot shows clearly that, except for winter 2071–2100, a reduced ensemble with 7 members reproduces almost the entire spread of the full ensemble, whereas a further reduction to 6 members reduces the spread considerably in spring and winter. This performance agrees with the declining IEQS in Fig. 4 as the number of ensemble members is reduced from 7 to 4. Figure 7 shows the corresponding reduction in spread for the precipitation pr. Already for a reduction to 7 climate projections, the loss of spread is more pronounced for pr than for tas because the change patterns are less uniform within the ensemble.
The illustrative plots in Figs. 5 and 7 are useful to get an overview of the spread in different subensembles. To gain deeper insight into the changes in spread induced by the ongoing reduction, we list the overall remaining spread in exact numbers in Table 2. Table 2 shows the decrease of the remaining spread in percent per variable and future period, aggregated over the whole area and all seasons, as a function of the cardinality of the subensemble for the same reduction scheme as in Figs. 5 and 7.
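A remaining-spread percentage of the kind tabulated here could be computed along the following lines. This is a minimal sketch under stated assumptions, not the procedure used for Table 2: the spread at each grid box and season is taken to be the range (maximum minus minimum) across ensemble members, and the aggregation over area and seasons is taken to be the ratio of summed spreads. The member labels and numbers are toy values, and the function names are hypothetical.

```python
def spread(values):
    """Ensemble spread at one cell. Assumption: range (max - min)."""
    return max(values) - min(values)


def remaining_spread_percent(changes, subset):
    """Spread of a subensemble relative to the full ensemble, in percent.

    changes: dict mapping member label -> list of change signals, one value
             per cell (a cell being one grid box in one season).
    subset:  labels of the members kept in the reduced ensemble.
    Assumption: cells are aggregated by summing the spreads before dividing.
    """
    members = list(changes)
    n_cells = len(changes[members[0]])
    full_total = 0.0
    subset_total = 0.0
    for k in range(n_cells):
        full_total += spread([changes[m][k] for m in members])
        subset_total += spread([changes[m][k] for m in subset])
    return 100.0 * subset_total / full_total


# Toy example with four hypothetical members and two cells.
toy_changes = {"A": [1.0, 2.0], "B": [2.0, 2.5],
               "C": [3.0, 4.0], "D": [5.0, 3.0]}
toy_percent = remaining_spread_percent(toy_changes, ["A", "D"])
```

Averaging the per-cell ratios instead of dividing the sums would give more weight to cells with small full-ensemble spread; which of the two is appropriate is a design choice.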
b. Sensitivity of reduction scheme
In the following, we investigate the sensitivity of the reduction to the choice of reduction scheme and recalculate the loss of spread while varying the selection of variables included in the matrices, the spatial aggregation, the number of principal components, the method of including the future period 2021–50, and the activation of quality weighting. The effect of quality weighting is shown by comparing the ensemble mean-square errors between the full and the reduced ensembles. For comparison, we fix the desired number of climate projections in the reduced ensemble: on the one hand, five, in accordance with user requests, and on the other hand, seven, for more complexity. (Supporting figures for all variations of the reduction scheme are reported in Tables 3 and 4 and in Figs. 8 and 9.)
The exclusion of variables from the matrices can reduce the spread relative to that of the full ensemble, and not only in the excluded variable. This effect is most likely caused by correlations between the variables. As it is not directly predictable, the effect has to be tested (Table 3 and Fig. 8a).
The method of including the period 2021–50 in the analysis influences the spread in 2021–50 insofar as the additive inclusion gives a slightly better spread than the exclusion compared to the spread of the full ensemble, whereas the inclusion by concatenation performs worse (Table 3 and Fig. 8b).
Although the spread is always calculated at the 3 × 3 gridbox aggregation, the aggregation of the variables on the eight climatic TRY regions achieves a slightly higher spread compared to the spread of the full ensemble than the use of the 3 × 3 gridbox aggregation already in the calculation of the weights (Table 3 and Fig. 8c). One explanation for this unexpected fact might be that the eight climatic TRY regions describe a physically meaningful reduction of noise. With the limited number of climate projections available, the singular value decomposition is not able to identify this spatial structure, which is surely not uniformly developed across the climate projections. On the other hand, the superiority of the TRY regions over the 3 × 3 gridbox aggregation could be an artifact of the linearity forced upon the data by PCA. This hypothesis is surely worth testing by nonlinear generalizations of PCA (Gorban et al. 2007).
The fact that a spatial aggregation, which is based on additional physical knowledge not contained in the matrices, is able to improve the result of the reduction leads to the question of whether the temporal aggregation could also be refined. Instead of monthly mean values, circulation-regime-dependent means could make the temporal aggregation more physically meaningful.
The number of principal components is essential when maximizing the spread for a given cardinality. The relation between spread and number of principal components is not linear. The inclusion of more principal components increases the complexity in the ensemble, but it also increases the number of climate projections necessary for its representation. Therefore, if the number of climate projections is fixed for practical reasons, then the influence of the number of principal components on the spread has to be tested in order to find the optimal reduction scheme (Table 3 and Fig. 8d).
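The trade-off between the number of principal components and the variance they capture can be inspected with a singular value decomposition. The sketch below uses NumPy and makes assumptions not confirmed by the text: rows of the input matrix are climate projections, columns are the stacked change signals, and the matrix is centered on the ensemble mean before the decomposition. The function names are hypothetical, and random input will not reproduce the 87.4% quoted for five components.

```python
import numpy as np


def cumulative_explained_variance(matrix):
    """Cumulative fraction of variance captured by the leading PCs.

    matrix: array of shape (n_projections, n_features).
    Assumption: columns are centered on the ensemble mean before the SVD.
    """
    data = np.asarray(matrix, dtype=float)
    centered = data - data.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variance = singular_values ** 2
    return np.cumsum(variance) / variance.sum()


def leading_pc_scores(matrix, n_pcs):
    """Coordinates of each projection in the space of the first n_pcs PCs."""
    data = np.asarray(matrix, dtype=float)
    centered = data - data.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_pcs] * s[:n_pcs]
```

Distances between projections would then be computed from the truncated scores, so the choice of `n_pcs` directly changes which members appear similar.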
The quality weighting (here with Dq near the mean distance) between climate projections and observation has by far the strongest effect on the ensemble spread among the variations of schemes investigated in this paper. This effect is most pronounced for pr, as the trends in precipitation are much less consistent from one climate projection to another than for temperature. In the prevalent case that strong errors and a unique climate signal co-occur, the exclusion of a climate projection with bad performance may improve the mean-square error (MSE) of the ensemble mean, but at the same time a fairly unique trend is excluded from the spread, thus reducing it considerably (see Table 4 and Fig. 9 for spread and MSE).
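The trade-off described in this paragraph can be made concrete with a toy calculation. The sketch below computes the MSE of the ensemble mean against a reference, and the ensemble range, for a subensemble relative to the full ensemble. All numbers and labels are invented for illustration, the range is an assumed spread measure, and the function names are hypothetical.

```python
def ensemble_mean(fields, members):
    """Cell-wise mean over the given members."""
    n_cells = len(fields[members[0]])
    return [sum(fields[m][k] for m in members) / len(members)
            for k in range(n_cells)]


def mse(a, b):
    """Mean-square difference between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def relative_mse_percent(fields, reference, subset):
    """MSE of the subensemble mean as a percentage of the full-ensemble MSE."""
    full = mse(ensemble_mean(fields, list(fields)), reference)
    reduced = mse(ensemble_mean(fields, subset), reference)
    return 100.0 * reduced / full


def relative_spread_percent(fields, subset):
    """Summed range of the subensemble as a percentage of the full range."""
    members = list(fields)
    n_cells = len(fields[members[0]])
    full = sum(max(fields[m][k] for m in members)
               - min(fields[m][k] for m in members) for k in range(n_cells))
    reduced = sum(max(fields[m][k] for m in subset)
                  - min(fields[m][k] for m in subset) for k in range(n_cells))
    return 100.0 * reduced / full


# Toy case: member "C" is both strongly biased and unique.
toy_fields = {"A": [1.0, 1.0], "B": [1.0, 1.0], "C": [4.0, 4.0]}
toy_reference = [0.0, 0.0]
```

In this toy case, dropping the outlying member `"C"` lowers the MSE of the ensemble mean but removes the entire spread, which is the qualitative behavior the paragraph describes.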
In our case, the reduction scheme with all variables, aggregation on eight climatic regions, added uniqueness weights for both future periods, and five principal components has proven best for a seven-member subensemble without quality weighting (CP2, CP5, CP7, CP10, CP11, CP12, and CP14). As can be seen from Table 2, the reduced ensemble retains on average ≈97% of the spread (per variable) compared to the full ensemble in the period 2071–2100 and ≈91% in 2021–50.
We would like to point out that the difference in spread representation between the first and the second future period is not caused by a lower weighting of 2021–50 in the reduction scheme or by an insufficient representation of decadal variability. Tests show that even when reducing the ensemble on the information of the first period exclusively, it is not possible to reproduce a higher-than-91% spread using a seven-member ensemble. The reason for this behavior might be the lack of consistency in the evolution of climate change, which is still barely detectable in 2021–50.
Similarly, the best five-member subensemble without quality weighting is derived from the reduction scheme with all variables, aggregation on eight climatic regions, and added uniqueness weights for both future periods, but this time with four principal components (CP5, CP7, CP11, CP12, and CP14). The best five-member ensemble is a subensemble of the best seven-member ensemble, but it does not follow the order of exclusion calculated from the reduction scheme with five PCs optimal for the seven-member ensemble. The best five-member ensemble still represents on average ≈94% and ≈80% of the original spread, respectively.
If quality weighting is activated, the reduction scheme with all other parameters fixed as above results in a very similar subensemble (CP2, CP5, CP6, CP10, CP11, CP12, and CP14), in which CP7 is replaced by CP6. Table 5 shows that, although CP7 has the highest uniqueness weight, it also has by far the lowest quality weight, such that the combined weight is low and the model is excluded. In the variables considered here, CP7 has, in fact, extremely large errors relative to the other models in the ensemble.
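How a member can have the highest uniqueness weight and still end up with the lowest combined weight can be sketched as follows. The Gaussian-kernel forms used here are the ones commonly associated with Sanderson et al. (2015a) and are an assumption of this sketch, as is the combination of the two weights by multiplication. The radii `d_u` and `d_q`, the function names, and the toy distances are all hypothetical.

```python
import math


def uniqueness_weights(dist, d_u):
    """Uniqueness weight per member from a symmetric distance matrix.

    Assumed form: w_u(i) = 1 / (1 + sum_{j != i} exp(-(dist[i][j] / d_u)**2)).
    """
    n = len(dist)
    weights = []
    for i in range(n):
        similarity = sum(math.exp(-(dist[i][j] / d_u) ** 2)
                         for j in range(n) if j != i)
        weights.append(1.0 / (1.0 + similarity))
    return weights


def quality_weights(dist_to_obs, d_q):
    """Assumed form: w_q(i) = exp(-(dist_to_obs[i] / d_q)**2)."""
    return [math.exp(-(d / d_q) ** 2) for d in dist_to_obs]


def combined_weights(dist, dist_to_obs, d_u, d_q):
    """Assumption: the combined weight is the product of both weights."""
    w_u = uniqueness_weights(dist, d_u)
    w_q = quality_weights(dist_to_obs, d_q)
    return [u * q for u, q in zip(w_u, w_q)]


# Toy case: member 2 is far from the others and far from the observations.
toy_dist = [[0.0, 0.5, 3.0],
            [0.5, 0.0, 3.0],
            [3.0, 3.0, 0.0]]
toy_dist_to_obs = [1.0, 1.0, 3.0]
toy_w_u = uniqueness_weights(toy_dist, d_u=1.0)
toy_w = combined_weights(toy_dist, toy_dist_to_obs, d_u=1.0, d_q=1.5)
```

In the toy case, member 2 has the largest uniqueness weight but the smallest combined weight, which mirrors the behavior reported for CP7.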
The effect of replacing CP7 is striking: the ensemble spread decreases to only 87% and 85% compared to the full ensemble (whereas the reduced ensemble without quality weighting had 97% and 91%). This difference is due to the climate change signal produced by CP7, which deviates considerably from the rest of the ensemble. At the same time, the MSE of the reduced ensemble is likewise reduced from 95.8% to 76.8% for the ensembles containing CP7 and CP6, respectively, compared to the MSE of the full ensemble.
The decision in favor of or against the application of quality weighting is definitely a crucial one. It corresponds to the question of whether an extraordinary change signal can describe a plausible future evolution, despite having large errors. This decision can be approached in either a philosophical or a pragmatic way. But the user has to bear in mind that the exclusion of a strongly biased climate projection is usually not possible without a cost in ensemble spread.
For comparison, we implemented the completely different reduction procedure proposed by Cannon (2015): the Katsavounidis–Kuo–Zhang (KKZ) algorithm for initializing the centroids in k-means clustering. When applied to the same matrices ch1 and ch2 as used in Sanderson’s procedure (without quality weighting), we find exactly the same optimal seven-member ensemble. In contrast, the five-member ensemble found by Cannon’s procedure shows a slightly lower performance (CP2, CP5, CP7, CP10, and CP14 with 86% and 81% of spread).
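A KKZ-style selection can be sketched in a few lines of Python. The sketch assumes the variant in which the first member is the one closest to the ensemble centroid and each subsequent member is the one whose minimum Euclidean distance to the already selected members is largest; the exact preprocessing and distance measure used in the comparison above are not specified here. The function name and the toy points are hypothetical.

```python
import math


def kkz_select(points, n_select):
    """Select n_select member indices with a KKZ-style farthest-point rule.

    points: list of equal-length coordinate tuples, one per ensemble member.
    Assumption: the first pick is the member closest to the centroid.
    """
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[k] for p in points) / n for k in range(dim)]
    first = min(range(n), key=lambda i: math.dist(points[i], centroid))
    selected = [first]
    while len(selected) < n_select:
        remaining = [i for i in range(n) if i not in selected]
        # Pick the member farthest from its nearest already selected member.
        best = max(remaining,
                   key=lambda i: min(math.dist(points[i], points[s])
                                     for s in selected))
        selected.append(best)
    return selected


# Toy example in two dimensions.
toy_points = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (0.0, 6.0), (-5.0, -5.0)]
toy_selection = kkz_select(toy_points, 3)
```

Because the selection is incremental, every smaller subensemble produced this way is contained in the larger ones, which is consistent with the five-member KKZ ensemble above being a subset of the seven-member one.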
5. Summary and conclusions
In this article, we presented a methodology for reducing the high-resolution climate projection ensemble provided by the EURO-CORDEX activity based on the method of Sanderson et al. (2015a). This methodology allows a reduction of the number of ensemble members considered for subsequent impact studies while controlling the loss of ensemble spread. The method is able to balance a performance criterion (modeling error in a reference period) against the ensemble spread at any desired level. We found that a reduction of the 15 climate projections available in EURO-CORDEX to only 8 ensemble members even increases the IEQS. This IEQS is kept nearly constant for 7 ensemble members. Only a stronger reduction of the ensemble induces a noticeable decrease of IEQS. The reduction to 7 ensemble members leads to an average remaining spread of more than 91% for the climate variables considered (tas, pr, R95pTOT, huss, and rsds) for the climate change signal of the near future (2021–50 with respect to 1971–2000) and even more than 97% for the climate change signal of the far future (2071–2100 with respect to 1971–2000), where the loss of spread for the precipitation variables (pr and R95pTOT) is higher than for the other variables.
The EURO-CORDEX ensemble is about to be extended, with around 20 new members expected to become available in the near future. Our method of ensemble reduction will therefore gain immediate importance for keeping the ensemble manageable for applications, impact modeling, and climate assessment.
Of course, the resulting reduced ensemble is dependent on the full ensemble. As a consequence, the methodology needs to be applied regularly when additional climate projections become available. Furthermore, the results of the reduction depend on the variables of interest, the spatial and temporal resolution analyzed, and the loss of ensemble spread still accepted. These decisions and the question of considering a quality weighting have to be discussed with the users of the reduced ensemble.
We also want to point out that this methodology is not a quality measure filtering out “bad climate projections.” Rather, it filters out similar climate projections in favor of the most dissimilar ones, thereby retaining a large part of the spread given by the full ensemble.
It is worth noting that in the reduced ensemble almost all GCMs from the full ensemble are represented. Only one GCM is excluded by this methodology. This provides an indication of the fact that the ensemble spread, at least with respect to temperature, is mainly defined by the GCMs. In contrast, the effect of different RCMs is more apparent for the precipitation variables.
For the future, it would be interesting to also apply this methodology to climate projections that are temporally aggregated with respect to different circulation regimes instead of monthly means, because a circulation regime classification could bring out additional physical information more clearly. Furthermore, the linear dimension reduction on which the procedure is based (PCA) could be generalized to nonlinear PCA (principal curves or manifolds) or to kernel PCA.
Here, we applied the methodology of Sanderson et al. (2015a) to a multimodel ensemble. However, the proposed method could as well be applied to compare different ensemble generation strategies. A large number of similar climate projections in an ensemble would point to an inefficient strategy of ensemble generation, a question that is also important in numerical weather prediction and seasonal and decadal prediction. In this way, the methods could contribute to the search for an optimum perturbation strategy for the different purposes addressable by present climate projection ensembles. An application to climate forecast ensembles might be tested as well. Furthermore, as discussed in Sanderson et al. (2015b), the IEQS is appropriate not only for ensemble reduction but also for ensemble weighting, to find a balance between spread and bias.