1. Introduction
Climate change is one of the greatest challenges facing humankind. The problem of global climate change forced by anthropogenic greenhouse gas emissions therefore has to be addressed across all parts of society, through both mitigation and adaptation measures. Several governments have established national adaptation plans, in which the planning and implementation of adaptation measures have steadily gained importance over recent years. Several so-called action fields, for example agriculture, water management, and transport infrastructure, are addressed explicitly. (For the German adaptation plan issued by the Federal Ministry for the Environment, Nature Conservation, Building and Nuclear Safety, see http://www.bmub.bund.de/en/topics/climate-energy/climate/details-climate/artikel/german-strategy-for-adaptation-to-climate-change-summary/?tx_ttnews%5BbackPid%5D=216.)
Climate projections are subject to uncertainties that are caused by 1) the chaotic character of the climate system, 2) the underdetermined state of the climate system, 3) the simplified model description of complex physical processes, and 4) the assumptions about the future development of global society and the economy. To account for these uncertainties in adaptation planning, an ensemble of climate projections intended to cover the whole range of uncertainties is used, taking advantage of different ensemble generation strategies (perturbation of initial and boundary conditions, multimodel ensembles, use of different scenarios; Meinshausen et al. 2011).
In recent years the number of available climate projections has increased rapidly. The compilation of all climate projections leads to a so-called ensemble of opportunity, since it is not constructed in a systematic way. For that reason the analysis of a larger ensemble is not always beneficial (Knutti et al. 2010). Furthermore, there can be computational restrictions that, for some applications, impede the use of the full ensemble, suggesting a reduction of the ensemble size. To still make optimal use of the information inherent in the full multimodel ensemble, a careful selection of the projections to be further processed in a study is crucial. It therefore has to be ensured that a reduced ensemble still represents the desired characteristics of the full ensemble, which raises the question of what those desired characteristics are.
The widespread notion of “robustness” based on model agreement leads to the conclusion that results shared by many models are more likely to be “true” than results of so-called outliers. Consequently, the ensemble mean, or the regions where model results form clusters, are taken to be the main characteristics of the ensemble. This statistical interpretation, however, relies on the assumption of independence between model results, which is not fulfilled for the multimodel ensembles currently used for climate projection. The violation of the independence assumption, the sources of interdependence, and their consequences for the validity of the robustness hypothesis have been discussed rigorously by Pirtle et al. (2010) and Parker (2013).
Interdependence between climate models, caused by common assumptions about the climate system, shared observational data used for model tuning, and replication of code and shared model components across seemingly different models developed by independent institutions, results in similarities between the outputs (and correlations between the errors) of different models (e.g., Knutti et al. 2010). However, the limited gain of new information when adding an interdependent model to an ensemble is not the most severe consequence of interdependence. Only independent errors tend to cancel out, so that ensemble statistics of interdependent ensembles can be biased and uncertainty measures can systematically underestimate the existing uncertainty (Knutti et al. 2010; Tebaldi et al. 2011; Knutti and Sedláček 2013; Evans et al. 2013; Caldwell et al. 2014; and many others).
In contrast to numerical weather prediction ensembles, the ensemble spread, that is, the dispersion of the outcomes of the individual ensemble members in the space of results, cannot be calibrated to the true uncertainty of climate change, because that uncertainty cannot be measured. Therefore we have no evidence to decide whether an ensemble is over- or underdispersive with respect to the change signals.
Once the adverse effects of interdependence on the utility of ensembles are recognized, the need arises to assess the contribution of the various projections to the ensemble information. In this assessment, the criterion of low interdependence can conflict with the criterion of good model performance (a highly independent, i.e., very dissimilar, projection may be doubtful because of poor model performance). A vital objective of ensemble construction is therefore to balance interdependence and model performance.
First attempts to reduce interdependence in an ensemble were based on model genealogy (Knutti 2010) but did not prove rigorous in identifying interdependence. Alternatively, quantitative techniques focus on distances and correlations between the output of different models (Masson and Knutti 2011). A selection procedure based on quantitative model distances was developed by Cannon (2015), who applied a multivariate cluster initialization algorithm to multivariate model output. The algorithm selects the most dissimilar, that is, the least interdependent, models out of a given ensemble. Mendlik and Gobiet (2016) propose a cluster analysis in the space of multivariate model output [after dimension reduction by principal component analysis (PCA); Hannachi et al. 2007] and choose one model out of each cluster. The reduced ensemble that results from cluster analysis does not necessarily maximize independence, because the cluster analysis is constructed to select representative projections from highly populated regions, that is, regions with many similar members, but tends to disregard more isolated projections.
The methods for reducing ensemble interdependence mentioned so far do not include measures of model performance. Model performance can be incorporated into the selection only through a two-stage process that rejects poorly performing projections in a first step and then selects the most independent projections from the remaining ones. Such a stepwise procedure does not allow a trade-off between similarity and model performance.
Sanderson et al. (2015a) also apply a PCA before defining a measure of similarity between each pair of models. From this measure, an independence score is calculated for the whole ensemble, which is then maximized by excluding highly dependent models. The reduced ensembles resulting from Cannon’s and Sanderson’s techniques are similar. As an additional criterion, however, Sanderson et al.’s (2015a) method can incorporate model performance with respect to observational data (i.e., model bias) such that a combined independence–performance score (called the independent ensemble quality score) is optimized. Knutti et al. (2017) recently applied Sanderson’s independent ensemble quality score to construct a climate projection weighting scheme for the study of Arctic sea ice and temperature evolution.
Here, we present a reduction of the number of members of the regional climate projection ensemble for Europe that diminishes interdependencies within the ensemble while preserving the relevant information on potential future climate states. For this, we apply the procedure proposed by Sanderson et al. (2015a), which consists of two main steps: 1) a dimension reduction realized by a PCA and 2) the reduction of the ensemble with the objective of maximizing the independent ensemble quality score.
The paper is structured as follows: In section 2 we briefly present the underlying model data, and the method is described in sections 3a–3c. Section 3d shows the functioning of the method in detail. Section 4 includes the results of the presented reduction scheme along with an evaluation approach and a sensitivity analysis with respect to tuning parameters. In section 5 a summary is given and conclusions are drawn.
2. Data
The results presented in this paper depend on the initial ensemble that is used for the reduction process described in section 3. The starting point for the creation of the initial ensemble is the 17-member European branch of the World Climate Research Program Coordinated Regional Downscaling Experiment (EURO-CORDEX) ensemble at resolution “EUR-11” (~12.5 km; Jacob et al. 2014) available on 1 February 2017. For this ensemble, several global climate models (GCMs) were downscaled by several regional climate models (RCMs). The corners of the EURO-CORDEX EUR-11 domain are located at 60.21°N, 315.86°E (top left); 66.65°N, 64.4°E (top right); 22.20°N, 350.01°E (lower left); and 25.36°N, 36.30°E (lower right). The grid spacing of the simulations is 0.11° (12.5 km). The complete description of the domain and the grid can be found on the CORDEX Internet site (http://cordex.org). The performance of the regional models was evaluated by Kotlarski et al. (2014) on the basis of experiments driven by reanalysis (ERA-Interim). This ensemble is used as the starting point because the data are relevant for applications like impact modeling, and the results derived from the analysis of the ensemble are used for policy consulting. Furthermore, the ensemble is large enough to make a reduction necessary. This problem will intensify for EURO-CORDEX EUR-11 in the near future because the ensemble will grow by about 20 members, which are currently being produced in the German project Regional Climate Projections Ensemble for Germany [Regionale Klimaprojektionen Ensemble für Deutschland (ReKliEs-De); http://reklies.hlnug.de/startseite.html].
Even though the EURO-CORDEX EUR-11 ensemble is large relative to the ensembles available for other CORDEX domains, it is still an ensemble of opportunity. For ensembles of regional climate model experiments this holds in several ways. First, the available global climate experiments already form an ensemble of opportunity, determined by the number of modeling centers around the world that run an Earth system model and are able and willing to contribute to the Coupled Model Intercomparison Project (CMIP). The same is true for the regional models that are used to downscale the global models in the framework of CORDEX. Second, only a subset of the available global experiments is downscaled for the regional ensemble. Third, and maybe most important, the number of experiments produced with each RCM depends on funding, available computational resources at the regional modeling centers, and other nonscientific reasons and is thus arbitrary to a certain degree. The consequence is that the GCMs and RCMs are not equally represented in the ensemble, which strongly affects the characteristics of the ensemble. Furthermore, the ensemble grows gradually, and therefore the point in time at which the analysis is performed also affects the results. Despite all that, the available ensemble of opportunity is still the best database for analyzing likely future climate change in Europe.
For the analysis presented here, we used all the EURO-CORDEX EUR-11 experiments for which both a historical and an RCP8.5 experiment were available on 1 February 2017. Three experiments are excluded [Regional Model of MPI Climate Service Centre (REMO2009) with MPI-M-MPI-ESM-LR (r1 and r2) forcing and ALADIN53 with CNRM-CERFACS-CNRM-CM5 forcing] because they are not available on the common EURO-CORDEX EUR-11 grid. Two experiments with the regional climate model COSMO-CLM (forced by MIROC5 and CanESM2), conducted in the framework of ReKliEs-De, were added to the ensemble. The members of the resulting ensemble are listed in Table 1; individual climate projections (CPs) are addressed by their numbers as CP1–CP15.
Table 1. Members of the ensemble used for the reduction process (full ensemble). The experiments are named by the data reference syntax elements GCMModelName and RCMModelName. ICHEC-EC is the Irish Centre for High-End Computing Europewide Consortium, MOHC is the Met Office Hadley Centre, and CCLM is the Climate Limited-Area Modelling Community; other expansions are available online at https://www.ametsoc.org/PubsAcronymList.


In the following, this ensemble is referred to as the full ensemble. For each of the experiments listed in Table 1, long-term monthly mean values of near-surface air temperature (tas), precipitation total (pr), near-surface specific humidity (huss), and surface downwelling shortwave radiation (rsds) have been used for the analysis; for pr, daily values are additionally used to calculate the percentage of monthly precipitation from very wet days relative to the monthly precipitation total (R95pTOT; very wet days are days with precipitation above the 95th percentile with respect to the actual period). The periods considered are 1971–2000, 2021–50, and 2071–2100.
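As a minimal illustration of this index (not the implementation used to produce the data), the following Python sketch computes R95pTOT for one region; the array layout and the 1-mm wet-day threshold used for the percentile are assumptions for illustration only.

import numpy as np

def r95ptot(daily_pr, month_of_day):
    """Percentage of monthly precipitation falling on very wet days.

    daily_pr     : 1-D array of daily precipitation totals (mm) for one
                   region over the whole period (illustrative layout)
    month_of_day : 1-D integer array (0..11) assigning each day to a month
    """
    # 95th percentile of daily precipitation w.r.t. the actual period;
    # restricting to wet days (> 1 mm) is an assumption of this sketch
    wet_days = daily_pr[daily_pr > 1.0]
    p95 = np.percentile(wet_days, 95)

    result = np.full(12, np.nan)
    for m in range(12):
        pr_month = daily_pr[month_of_day == m]
        total = pr_month.sum()
        if total > 0:
            # share of the monthly total contributed by very wet days
            result[m] = 100.0 * pr_month[pr_month > p95].sum() / total
    return result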
3. Method
Before applying the method for reducing the regional climate projection ensemble for Europe, a reduction of the climate data dimension is necessary (section 3a). In section 3b we introduce the variables we will investigate with the methodology described in section 3c. The results of applying the method are presented in section 3d.
a. Dimension reduction
Given the high dimensionality of climate model output, dimension reduction is an indispensable step in model data processing. The common practice of restricting consideration to temperature and/or precipitation and of using broad spatial and temporal averages is itself a kind of dimension reduction. Owing to its simplicity, this reduction is easy to implement and gives a well-arranged picture, but it might not capture the complexity of climate.
More sophisticated statistical dimension reduction procedures such as PCA, factor analysis, projection pursuit, or multidimensional scaling have been developed to concentrate a maximum of the information inherent in a dataset into a minimum of parameters. It is well known that PCA is the best linear dimension reduction technique in the mean-square-error sense, and it is used in many fields of meteorology and climatology under the designation “empirical orthogonal functions” (Hannachi et al. 2007). In our study, PCA is applied to reduce the complexity of climate change at regional and monthly scales to a parsimonious representation without significant loss of information, which can then be processed in the main step of the procedure (see section 3c).
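As a minimal illustration of this step (a sketch, not the exact implementation used here), a PCA of a projections-by-features matrix can be computed via singular value decomposition:

import numpy as np

def pca(X, n_components):
    """PCA of a (projections x features) data matrix X.

    Rows are ensemble members; columns are, e.g., all (region, month,
    variable) combinations. Returns member scores, the principal
    components (loadings), and the explained-variance fractions.
    """
    Xc = X - X.mean(axis=0)                          # center each column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)                  # variance fractions
    scores = U[:, :n_components] * s[:n_components]  # member coordinates
    components = Vt[:n_components]                   # orthogonal patterns (EOFs)
    return scores, components, explained[:n_components]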
b. Variables
Our interest is focused on the area of Germany, which is subdivided into eight climatic regions to account for regional climate characteristics [the regions developed within the framework of the test reference years (TRY regions); see Krähenmann et al. 2017]. Intra-annual variation is accounted for by using monthly means. For simplicity, the 30-yr periods are considered stationary, because our focus is on the differences between periods. In total, we investigate five parameters (abbreviations agree with CORDEX standardized variable naming). As mentioned in section 2, in addition to tas and pr, we consider further parameters that are important for climate change assessment and impact modeling: R95pTOT (an extreme index of precipitation; Leander et al. 2014), huss, and rsds. In the following analysis, the parameters pr and R95pTOT are divided by 2 to avoid overweighting precipitation in the composition of the reduced ensemble, because both variables describe aspects of precipitation.
In general, it is possible to incorporate any direct or derived meteorological variable into the analysis, including statistical parameters of its distribution, such as the standard deviation and extreme-value indices, and parameters that describe spatial and temporal dependence.
To represent the reference and future climate of each of the selected regions, 30-yr monthly means (1971–2000) and climate change signals between the reference period and the future 30-yr monthly means (2021–50 and 2071–2100) are calculated from all RCM climate projections in our ensemble of opportunity. To characterize the recent climate, the 30-yr monthly means for 1971–2000 are arranged into a data matrix spanning all climate projections, regions, months, and variables; the climate change signals for the two future periods are arranged into analogous matrices.
To assess the effect of the aggregation within the eight climatic regions, we will repeat the analysis with analogous matrices that contain the same variables but are aggregated on 3 × 3 grid boxes instead of the climatic regions (cf. section 4b).
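As an illustration of how such matrices may be assembled, the sketch below flattens the region-by-month means of the five variables into one row per climate projection; the container format and array shapes are hypothetical, and the halving of the precipitation-related columns follows the scaling described in section 3b.

import numpy as np

N_CP, N_REG, N_MON = 15, 8, 12           # projections, TRY regions, months
VARIABLES = ["tas", "pr", "R95pTOT", "huss", "rsds"]

def build_matrix(means):
    """Arrange 30-yr monthly means into a (projections x features) matrix.

    means : dict mapping variable name -> array of shape (N_CP, N_REG, N_MON);
            this container is a hypothetical stand-in for the actual data handling
    """
    blocks = []
    for var in VARIABLES:
        block = means[var].reshape(N_CP, N_REG * N_MON)
        if var in ("pr", "R95pTOT"):
            block = block / 2.0              # downweight the two precipitation variables
        blocks.append(block)
    return np.hstack(blocks)                 # shape (N_CP, N_REG * N_MON * 5)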
c. Procedure
1) Uniqueness weighting
For technical reasons, it is necessary to standardize the matrices before further processing.



















This procedure is performed on each of the matrices introduced in section 3b.
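As an illustrative sketch of a uniqueness weighting in the spirit of Sanderson et al. (2015a), pairwise distances between projections in the truncated PC space can enter a Gaussian similarity kernel with a radius parameter Du, and each projection is weighted by the inverse of its effective number of near duplicates; the kernel form and the role of Du are assumptions of this sketch rather than a reproduction of the exact expressions.

import numpy as np

def uniqueness_weights(scores, D_u):
    """Uniqueness weight per projection from pairwise distances in PC space.

    scores : (n_members x n_pcs) array of PCA scores
    D_u    : radius parameter controlling how quickly similarity decays
    """
    # pairwise Euclidean distances between members in the truncated PC space
    d = np.linalg.norm(scores[:, None, :] - scores[None, :, :], axis=-1)
    S = np.exp(-(d / D_u) ** 2)   # similarity: 1 on the diagonal, near 0 when far apart
    repetition = S.sum(axis=1)    # effective number of copies of each member
    return 1.0 / repetition       # unique projections receive weights close to 1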
2) Quality weighting
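By analogy, a quality weight can be sketched as a Gaussian function of each projection's distance to the observations in the reference period, with a radius parameter Dq (cf. section 4b); again, the exact formulation of the weighting is not reproduced here.

import numpy as np

def quality_weights(scores_ref, scores_obs, D_q):
    """Quality weight per projection from its distance to the observations.

    scores_ref : (n_members x n_pcs) scores of the reference-period climatologies
    scores_obs : (n_pcs,) scores of the observations in the same PC space
    D_q        : radius parameter (section 4b uses D_q near the mean distance)
    """
    d_obs = np.linalg.norm(scores_ref - scores_obs, axis=1)
    return np.exp(-(d_obs / D_q) ** 2)   # near 1 for small errors, near 0 for large ones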











3) Combined uniqueness and quality weighting
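A simple way to combine the two weightings, shown below as a sketch, is a normalized product of uniqueness and quality weights; the product form is an illustrative assumption and is not necessarily identical to the combination underlying the independent ensemble quality score. When the quality weighting is switched off (as in the benchmark scheme of section 4a), the combined weight reduces to the uniqueness weight.

import numpy as np

def combined_weights(w_u, w_q=None):
    """Combine uniqueness and (optionally) quality weights per projection."""
    w = np.asarray(w_u, dtype=float)
    if w_q is not None:
        w = w * np.asarray(w_q, dtype=float)   # penalize similar and poorly performing members
    return w / w.sum()                         # normalize to a weighting scheme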


d. Application
The first step toward the reduction of the ensemble consists of a PCA. The PCA identifies the dominant orthogonal patterns, called principal components (PCs), into which the data matrix is decomposed.
The principal components represent important features of multivariate climate change and are therefore a useful tool to get an impression of the resulting intervariable correlations in climate change. In Fig. 1 the loadings of the first PC (left panel; the one with the highest explained variance) and of the second PC (right panel) of the matrix of changes between the periods 1971–2000 and 2071–2100 are shown.

Fig. 1. (left) First and (right) second PCs of the matrix of changes in tas, pr, R95pTOT, huss, and rsds (columns) per month (rows) between the periods 1971–2000 and 2071–2100, representing 43.8% and 22.3% of total variance. Green (pink) colors indicate changes above (below) ensemble mean.
The PCA is constructed to represent the climate projections as linear combinations of the PCs weighted by the scores. The scores of all climate projections in the space of climate change signals between the periods 1971–2000 and 2071–2100 with respect to the first and second PC are shown in Fig. 2 (left panel). The scores of CP14 suggest a very strong temperature increase (compared to the ensemble mean) together with extremely dry summers, because the contributions of the two PCs add up. This suggestion is confirmed by the changes in CP14 (Fig. 3). On the other hand, CP7 has negative scores, which correspond to a rather cool and wet climate compared to the ensemble mean (Fig. 3). CP10 is special in that it presents the first PC in almost pure form, lying nearly on the x axis. Its position is reflected in warm summers but near-average precipitation change (Fig. 3).
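To make the linear-combination statement concrete, the following sketch reconstructs a projection's data vector from its scores and the retained PCs (using the quantities returned by the PCA sketch in section 3a); a member with large positive scores on both leading PCs, such as CP14, therefore adds both patterns to the ensemble mean, consistent with its strong warming and dry summers.

import numpy as np

def reconstruct(scores, components, column_means):
    """Approximate the original rows from the retained PCs.

    scores       : (n_members x k) member coordinates
    components   : (k x n_features) principal components (loadings)
    column_means : (n_features,) means removed before the PCA
    """
    # each row: ensemble mean plus the sum of score_k * PC_k
    return column_means + scores @ components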

Fig. 2. (left) Scores on the first and second PC w.r.t. the climate change signal between 1971–2000 and 2071–2100, representing 43.8% and 22.3% of total variance, and (right) scores on the first and second PC w.r.t. the modeling error in the reference period 1971–2000, representing 31.3% and 24.2% of total variance, of all climate projections belonging to the full ensemble. Red crosses denote members of the reduced ensemble; the blue cross denotes the observations.

Fig. 3. Scatterplot of seasonal changes in summer (JJA) per climate projection averaged over Germany: tas vs pr.
In contrast, the scores of all climate projections in the space of modeling error in the period 1971–2000 with respect to the first and second PC, along with the scores of the observations, are shown in Fig. 2 (right panel). Noticeably, the positions of the individual projections in the two spaces of principal components are different, underlining the complementarity of the information on change signals and errors. The displacement of the observations (blue cross) away from the origin (the ensemble mean) indicates that the ensemble as a whole is biased.
The effect of the stepwise exclusion of climate projections from the ensemble of opportunity is illustrated in Fig. 4. The independent ensemble quality score (IEQS) increases while the ensemble is reduced down to eight members. When one further climate projection is excluded, the score remains nearly constant. When the exclusion of climate projections is continued beyond that, the IEQS starts to decrease rapidly.
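The stepwise exclusion can be sketched as a greedy backward elimination in which, at each step, the member whose removal leaves the highest-scoring ensemble is dropped; whether the actual optimization proceeds exactly this way is not detailed here, and ensemble_score is a hypothetical stand-in for the IEQS.

def greedy_reduction(members, ensemble_score, n_min=4):
    """Stepwise exclusion of members, tracking an ensemble score (e.g., the IEQS).

    members        : list of member identifiers (e.g., CP1..CP15)
    ensemble_score : callable mapping a list of members to a scalar score
    Returns the sequence of (remaining members, score) after each exclusion.
    """
    current = list(members)
    history = [(list(current), ensemble_score(current))]
    while len(current) > n_min:
        # try removing each member in turn; keep the removal that scores best
        best = max(range(len(current)),
                   key=lambda i: ensemble_score(current[:i] + current[i + 1:]))
        current.pop(best)
        history.append((list(current), ensemble_score(current)))
    return history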

Fig. 4. IEQS for decreasing number of ensemble members.
In the score plot (Fig. 2, left panel), the climate projections belonging to the optimal seven-member ensemble are marked by red crosses. The selected models are mainly situated at the periphery of the cloud, which corresponds well to our objective of selecting the most dissimilar climate projections. The optimal five-member ensemble contains the same climate projections except for CP2 and CP10, which seems plausible given their scores on the first two PCs. The inclusion of CP5 appears doubtful in this plot, but it is justified by its extraordinarily high score on the second PC of the change between the periods 1971–2000 and 2021–50 (not shown).
4. Results
a. Reduction scheme based on tas, pr, R95pTOT, huss, and rsds
As described in section 3, the reduction is based on the variables tas, pr, R95pTOT, huss, and rsds. Here, we discuss the results of a reduction scheme (a set of tuning parameters that exactly determines the weights) using all five variables on the eight climatic TRY regions, added uniqueness weights for the two future time periods, five PCs, and no quality weighting. This scheme will further serve as our benchmark in section 4b.
The decision to omit the quality weighting depending on model error proposed by Sanderson et al. (2015a) was motivated by the fact that all projections are going to be bias corrected before they are further processed or distributed to users. The bias correction will apply the multivariate quantile-mapping procedure of Cannon (2016), which corrects the marginal distributions as well as the intervariable covariance structure. We assume that after bias correction the remaining errors, in terms of the quantities considered here, will be comparable. For comparison, the result of an identical reduction scheme with the quality weighting activated will be presented in section 4b.
On the other hand, the spirit of quality weighting as originally conceived reaches much further than conventional bias evaluation. Quality weighting seeks to evaluate the plausibility of the projected climate change. Unfortunately, this plausibility is not straightforward to deduce from a projection’s performance in the reference period, because nonstationarity of the bias with respect to long-term variability, large-scale circulation, and the state of the climate system greatly interferes with such conclusions (Maraun et al. 2017). At the time of writing, the scientific literature does not contain evaluation methods capable of assessing the physical plausibility of changes projected by RCMs. Existing attempts to evaluate physical plausibility in GCMs, such as emergent constraint approaches (Collins et al. 2012), are not easy to transfer to RCMs because of the limited domain. The task is further complicated when the focus is shifted from variables like pressure, geopotential, and sea surface temperature to precipitation, wind speed, and humidity. We are nevertheless attentively following new developments and are ready to incorporate criteria that are appropriate for measuring the physical plausibility of RCM output.
When the quality weighting is dropped, the Sanderson et al. (2015a) method minimizes interdependence within the ensemble without constraints; that is, it selects the most dissimilar projections, spanning a large spread between them. To monitor the performance of the procedure we compare the spread produced by a reduced ensemble with the spread of the full ensemble of opportunity.
Because of the high dimensionality of the model output, this comparison is rather complex. Maps of the ensemble spread per variable and month, depending on the number of climate projections in the ensemble subset, are a suitable way to visualize the effect of the ensemble reduction. To get a more complete picture of the performance of the reduction scheme, the spread is not presented for the eight regions used to form the input matrices but at a 3 × 3 gridbox aggregation.
An example of such a plot is shown in Fig. 5 (spread in tas at 3 × 3 aggregated grid points, four seasons, and two future periods for the full ensemble of 15 and reduced ensembles of 10, 7, 6, 5, and 4 climate projections) for the reduction scheme using the added uniqueness weights for the two future periods.

Fig. 5. Ensemble spread of tas climate change signal per season (rows) depending on the number of climate projections (columns) for two future periods: (top) 2021–50; (bottom) 2071–2100.

Fig. 6. Aggregated explained variance of the PCs of the matrix of changes between the periods 1971–2000 and 2071–2100.

Fig. 7. As in Fig. 5, but for pr.
The illustrative plots in Figs. 5 and 7 are useful to get an overview of the spread in different subensembles. To gain deeper insight into the changes in spread induced by the progressive reduction, we list the overall remaining spread in exact numbers in Table 2. Table 2 shows the remaining spread in percent per variable and future period, aggregated over the whole area and all seasons, as a function of the cardinality of the subensemble for the same reduction scheme as in Figs. 5 and 7.
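For illustration, remaining-spread percentages of this kind can be computed along the following lines, assuming the spread is measured as the range (maximum minus minimum) of the change signal across members and aggregated by summation over grid cells and seasons; the exact aggregation used for Table 2 may differ.

import numpy as np

def remaining_spread(change, subset):
    """Remaining spread (%) of a reduced ensemble relative to the full one.

    change : array of shape (n_members, ...) with change signals per member
             (e.g., grid cells x seasons); spread taken as max minus min
    subset : indices of the members kept in the reduced ensemble
    """
    full = change.max(axis=0) - change.min(axis=0)
    reduced = change[subset].max(axis=0) - change[subset].min(axis=0)
    return 100.0 * reduced.sum() / full.sum()   # aggregate over all cells and seasons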
b. Sensitivity of reduction scheme
In the following, we investigate the sensitivity of the reduction to different reduction schemes and recalculate the loss of spread, varying the selection of variables included in the input matrices, the spatial aggregation, the way the period 2021–50 is included, and the number of principal components (Table 3 and Fig. 8).
Table 3. Remaining spread (%) for the change between the periods 1971–2000 and 2071–2100 per variable, depending on the specification of the reduction scheme w.r.t. input variables, spatial aggregation, inclusion type of the period 2021–50, and number of principal components, with the ensemble size fixed at 7 and no quality weighting. Spread is aggregated over the whole area and all seasons. Boldface font indicates the benchmark reduction scheme as in Figs. 5 and 7. Italics indicate changes relative to the reduction scheme in the first row. The smaller numbers in parentheses indicate the remaining spread for the change between 1971–2000 and 2021–50, to highlight the effect of the inclusion type.


Table 4. Remaining spread (%) for the change between the periods 1971–2000 and 2071–2100 (top number in each field) and MSE in the period 1971–2000 against observations (bottom number in each field) per variable, compared to the full ensemble. The benchmark scheme of Table 3 (boldface font) is varied by enabling the quality weighting (italics).



Fig. 8. Comparison of seven-member ensemble spread in climate change signals per variable depending on the specifications of the reduction scheme relative to the benchmark scheme from section 4a (always shown in purple). (a) Spread for the period 2071–2100 under varying input variables: purple (tas, pr, R95pTOT, huss, and rsds); dark blue (tas, pr, R95pTOT, and huss); light blue (tas, pr, R95pTOT, and rsds); and green (tas, pr, and R95pTOT). (b) Spread for the period 2071–2100 (solid bars) and 2021–50 (dashed bars) under varying inclusion type of period 2021–50: purple (additive), dark blue (concatenative), and light blue (none). (c) Spread for the period 2071–2100 under varying spatial aggregation: purple (8 TRY regions) and dark blue (3 × 3 gridcell aggregation). (d) Spread for the period 2071–2100 under a varying number of PCs: purple (5 PCs), dark blue (6 PCs), and light blue (4 PCs).

Fig. 9. Comparison of seven-member ensemble spread in climate change signals for the period 2071–2100 (solid bars) and MSE of ensemble mean w.r.t. observations (dashed bars) depending on the activation of quality weighting: purple (no quality weighting; benchmark scheme of section 4a) and dark blue (quality weighting activated).
The effect of excluding variables from the input matrices on the remaining spread of the individual variables is shown in Table 3 and Fig. 8a.
The way the period 2021–50 is included in the analysis influences the spread in 2021–50: relative to the spread of the full ensemble, the additive inclusion retains slightly more spread than excluding the period, whereas inclusion by concatenation performs worse (Table 3 and Fig. 8b).
Although the spread is always calculated at the 3 × 3 gridbox aggregation, aggregating the variables on the eight climatic TRY regions when calculating the weights achieves a slightly higher spread, relative to the spread of the full ensemble, than using the 3 × 3 gridbox aggregation already at that stage (Table 3 and Fig. 8c). One explanation for this unexpected fact might be that the eight climatic TRY regions describe a physically meaningful reduction of noise. With the limited number of climate projections available, the singular value decomposition is not able to identify this spatial structure, which is surely not uniformly developed across the climate projections. On the other hand, the superiority of the TRY regions over the 3 × 3 gridbox aggregation could be an artifact of the linearity forced upon the data by the PCA. This hypothesis is surely worth testing with nonlinear generalizations of PCA (Gorban et al. 2007).
The fact that a spatial aggregation based on additional physical knowledge not contained in the input matrices themselves yields a better representation of the spread underlines the value of incorporating such knowledge into the reduction scheme.
The number of principal components is essential when maximizing the spread for a given cardinality. The relation between spread and number of principal components is not linear. The inclusion of more principal components increases the complexity in the ensemble, but it also increases the number of climate projections necessary for its representation. Therefore, if the number of climate projections is fixed for practical reasons, then the influence of the number of principal components on the spread has to be tested in order to find the optimal reduction scheme (Table 3 and Fig. 8d).
The quality weighting (here with Dq near the mean distance between climate projections and observations) has by far the strongest effect on the ensemble spread among the variations of schemes investigated in this paper. This effect is most pronounced for pr, as the trends in precipitation are much less consistent from one climate projection to another than those for temperature. In the frequently occurring case that large errors and a unique climate change signal co-occur, the exclusion of a climate projection with bad performance may improve the mean-squared error (MSE) of the ensemble mean, but at the same time a fairly unique trend is excluded, thus reducing the spread considerably (see Table 4 and Fig. 9 for spread and MSE).
In our case, the reduction scheme with all variables, aggregation on eight climatic regions, added uniqueness weights for both future periods, and five principal components has proven best for a seven-member subensemble without quality weighting (CP2, CP5, CP7, CP10, CP11, CP12, and CP14). As can be seen from Table 2, the reduced ensemble retains on average ≈97% of the spread (per variable) compared to the full ensemble in the period 2071–2100 and ≈91% in 2021–50.
We would like to point out that the difference in spread representation between the first and the second future period is not caused by a lower weighting of 2021–50 in the reduction scheme or by an insufficient representation of decadal variability. Tests show that even when the ensemble is reduced on the basis of the information from the first period exclusively, it is not possible to retain more than 91% of the spread with a seven-member ensemble. The reason for this behavior might be the lack of consistency in the evolution of climate change, which is still barely detectable in 2021–50.
Similarly, the best five-member subensemble without quality weighting is derived from the reduction scheme with all variables, aggregation on the eight climatic regions, and added uniqueness weights for both future periods, but this time with four principal components (CP5, CP7, CP11, CP12, and CP14). The best five-member ensemble is a subensemble of the best seven-member ensemble, but it does not follow the order of exclusion obtained from the five-PC reduction scheme that is optimal for the seven-member ensemble. The best five-member ensemble still represents on average ≈94% and ≈80% of the original spread for 2071–2100 and 2021–50, respectively.
If quality weighting is activated, the reduction scheme with all other parameters fixed as above results in a very similar subensemble (CP2, CP5, CP6, CP10, CP11, CP12, and CP14), in which CP7 is replaced by CP6. Table 5 shows that, although CP7 has the highest uniqueness weight, it has at the same time by far the lowest quality weight, such that the combined weight is low and the model is excluded. In the variables considered here, CP7 does in fact have extremely large errors relative to the other models in the ensemble.
Table 5. Uniqueness and quality weight per climate projection. The CP7 weights highlighted in boldface font are discussed in more detail in the text.


The effect of replacing CP7 is striking: the ensemble spread decreases to only 87% and 85% of the full-ensemble spread for 2071–2100 and 2021–50, respectively (whereas the reduced ensemble without quality weighting retained 97% and 91%). This difference is due to the climate change signal produced by CP7, which deviates considerably from the rest of the ensemble. At the same time, the MSE of the ensemble mean with respect to observations is reduced from 95.8% to 76.8% of the full-ensemble MSE for the ensembles containing CP7 and CP6, respectively.
The decision for or against the application of quality weighting is definitely a crucial one. It corresponds to the question of whether an extraordinary change signal can describe a plausible future evolution despite large errors in the reference period. This decision can be approached in either a philosophical or a pragmatic way, but the user has to bear in mind that the exclusion of a strongly biased climate projection is usually not possible without a cost in ensemble spread.
For comparison, we implemented the completely different reduction procedure proposed by Cannon (2015), the Katsavounidis–Kuo–Zhang (KKZ) algorithm for initializing the centroids in k-means clustering, applied to the same input matrices.
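For reference, a compact sketch of a KKZ-style selection of maximally dissimilar members is given below; starting from the member closest to the ensemble mean is an assumption of this sketch and may differ in detail from the implementation of Cannon (2015).

import numpy as np

def kkz_select(X, n_select):
    """KKZ-style selection of maximally dissimilar members.

    X : (n_members x n_features) matrix, e.g., standardized change signals
    """
    mean = X.mean(axis=0)
    # start with the member closest to the ensemble mean (assumption of this sketch)
    selected = [int(np.argmin(np.linalg.norm(X - mean, axis=1)))]
    while len(selected) < n_select:
        # distance of every member to its nearest already selected member
        d = np.linalg.norm(X[:, None, :] - X[selected][None, :, :], axis=-1)
        nearest = d.min(axis=1)
        nearest[selected] = -np.inf      # never re-pick a selected member
        selected.append(int(np.argmax(nearest)))
    return selected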
5. Summary and conclusions
In this article, we presented a methodology for reducing the high-resolution climate projection ensemble provided by the EURO-CORDEX activity, based on the method of Sanderson et al. (2015a). This methodology allows a reduction of the number of ensemble members considered for subsequent impact studies while controlling the reduction of the ensemble spread. The method is able to balance a performance criterion (modeling error in a reference period) against the ensemble spread at any desired level. We found that a reduction of the 15 climate projections available in EURO-CORDEX to only 8 ensemble members even increases the IEQS, and the IEQS remains nearly constant for 7 ensemble members. Only a stronger reduction of the ensemble induces a noticeable decrease of the IEQS. The reduction to 7 ensemble members leads to an average remaining spread of more than 91% for the climate variables considered (tas, pr, R95pTOT, huss, and rsds) for the climate change signal of the near future (2021–50 with respect to 1971–2000) and even more than 97% for the climate change signal of the far future (2071–2100 with respect to 1971–2000), where the loss of spread for the precipitation variables (pr and R95pTOT) is higher than for the other variables.
In view of the upcoming extension of the EURO-CORDEX ensemble, with around 20 new members becoming available in the near future, our method of ensemble reduction will be of immediate importance for keeping the ensemble manageable for applications, impact modeling, and climate assessment.
Of course, the resulting reduced ensemble depends on the full ensemble. As a consequence, the methodology needs to be applied again whenever additional climate projections become available. Furthermore, the results of the reduction depend on the variables of interest, the spatial and temporal resolution analyzed, and the acceptable loss of ensemble spread. These decisions, as well as the question of whether to apply quality weighting, have to be discussed with the users of the reduced ensemble.
We also want to point out that this methodology is not a quality measure filtering out “bad” climate projections. Rather, it filters out similar climate projections in favor of the most dissimilar ones, retaining a large part of the spread given by the full ensemble.
It is worth noting that almost all GCMs from the full ensemble are represented in the reduced ensemble; only one GCM is excluded by this methodology. This indicates that the ensemble spread, at least with respect to temperature, is mainly determined by the GCMs. In contrast, the effect of the different RCMs is more apparent for the precipitation variables.
For the future, it would be interesting to apply this methodology also to climate projections aggregated temporally with respect to different circulation regimes instead of monthly means, since circulation-regime classification captures additional physical information. Furthermore, the linear dimension reduction on which the procedure is based (PCA) could be generalized to nonlinear PCA (principal curves or manifolds) or to kernel PCA.
Here, we applied the methodology of Sanderson et al. (2015a) to a multimodel ensemble. However, the proposed method could just as well be used to compare different ensemble generation strategies: a large number of similar climate projections in an ensemble would point to an inefficient strategy of ensemble generation, a question that is also important in numerical weather prediction and in seasonal and decadal prediction. In this way, the method could contribute to the search for an optimal perturbation strategy for the different purposes addressable by present climate projection ensembles. An application to climate forecast ensembles might be tested as well. Furthermore, as discussed in Sanderson et al. (2015b), the IEQS is appropriate not only for ensemble reduction but also for ensemble weighting, to find a balance between spread and bias.
REFERENCES
Caldwell, P., C. Bretherton, M. Zelinka, S. Klein, B. Santer, and B. Sanderson, 2014: Statistical significance of climate sensitivity predictors obtained by data mining. Geophys. Res. Lett., 41, 1803–1808, https://doi.org/10.1002/2014GL059205.
Cannon, A. J., 2015: Selecting GCM scenarios that span the range of changes in a multimodel ensemble: Application to CMIP5 climate extremes indices. J. Climate, 28, 1260–1267, https://doi.org/10.1175/JCLI-D-14-00636.1.
Cannon, A. J., 2016: Multivariate bias correction of climate model outputs: Matching marginal distributions and intervariable dependence structure. J. Climate, 29, 7045–7064, https://doi.org/10.1175/JCLI-D-15-0679.1.
Collins, M., R. Chandler, P. Cox, J. Huthnance, J. Rougier, and D. Stephenson, 2012: Quantifying future climate change. Nat. Climate Change, 2, 403–409, https://doi.org/10.1038/nclimate1414.
Evans, P., F. Ji, G. Abramowitz, and M. Ekström, 2013: Optimally choosing small ensemble members to produce robust climate simulations. Environ. Res. Lett., 8, 044050, https://doi.org/10.1088/1748-9326/8/4/044050.
Gorban, A., B. Kégl, D. Wunsch, and A. Zinovyev, Eds., 2007: Principal Manifolds for Data Visualization and Dimension Reduction. Lecture Notes in Computational Science and Engineering, Vol. 58, Springer, 340 pp.
Hannachi, A., I. Jolliffe, and D. Stephenson, 2007: Empirical orthogonal functions and related techniques in atmospheric science: A review. Int. J. Climatol., 27, 1119–1152, https://doi.org/10.1002/joc.1499.
Jacob, D., and Coauthors, 2014: EURO-CORDEX: New high-resolution climate change projections for European impact research. Reg. Environ. Change, 14, 563–578, https://doi.org/10.1007/s10113-013-0499-2.
Knutti, R., 2010: The end of model democracy? Climatic Change, 102, 395–404, https://doi.org/10.1007/s10584-010-9800-2.
Knutti, R., and J. Sedláček, 2013: Robustness and uncertainties in the new CMIP5 climate model projections. Nat. Climate Change, 3, 369–373, https://doi.org/10.1038/nclimate1716.
Knutti, R., R. Furrer, C. Tebaldi, J. Cermak, and G. Meehl, 2010: Challenges in combining projections from multiple climate models. J. Climate, 23, 2739–2758, https://doi.org/10.1175/2009JCLI3361.1.
Knutti, R., J. Sedláček, B. Sanderson, R. Lorenz, E. Fischer, and V. Eyring, 2017: A climate model projection weighting scheme accounting for performance and interdependence. Geophys. Res. Lett., 44, 1909–1918, https://doi.org/10.1002/2016GL072012.
Kotlarski, S., and Coauthors, 2014: Regional climate modeling on European scales: A joint standard evaluation of the EURO-CORDEX RCM ensemble. Geosci. Model Dev., 7, 1297–1333, https://doi.org/10.5194/gmd-7-1297-2014.
Krähenmann, S., A. Walter, S. Brienen, F. Imbery, and A. Matzarakis, 2017: High-resolution grids of hourly meteorological variables for Germany. Theor. Appl. Climatol., 131, 899–926, https://doi.org/10.1007/s00704-016-2003-7.
Leander, R., T. Buishand, and A. Tank, 2014: An alternative index for the contribution of precipitation on very wet days to the total precipitation. J. Climate, 27, 1365–1378, https://doi.org/10.1175/JCLI-D-13-00144.1.
Maraun, D., and Coauthors, 2017: Towards process-informed bias correction of climate change simulations. Nat. Climate Change, 7, 764–773, https://doi.org/10.1038/nclimate3418.
Masson, D., and R. Knutti, 2011: Climate model genealogy. Geophys. Res. Lett., 38, L08703, https://doi.org/10.1029/2011GL046864.
Meinshausen, M., and Coauthors, 2011: The RCP greenhouse gas concentrations and their extensions from 1765 to 2300. Climatic Change, 109, 213–241, https://doi.org/10.1007/s10584-011-0156-z.
Mendlik, T., and A. Gobiet, 2016: Selecting climate simulations for impact studies based on multivariate patterns of climate change. Climatic Change, 135, 381–393, https://doi.org/10.1007/s10584-015-1582-0.
Parker, W., 2013: Ensemble modelling, uncertainty and robust predictions. Wiley Interdiscip. Rev. Climate Change, 4, 213–223, https://doi.org/10.1002/wcc.220.
Pirtle, Z., R. Meyer, and A. Hamilton, 2010: What does it mean when climate models agree? Environ. Sci. Policy, 13, 351–361, https://doi.org/10.1016/j.envsci.2010.04.004.
Sanderson, B., R. Knutti, and P. Caldwell, 2015a: A representative democracy to reduce interdependency in a multimodel ensemble. J. Climate, 28, 5171–5194, https://doi.org/10.1175/JCLI-D-14-00362.1.
Sanderson, B., R. Knutti, and P. Caldwell, 2015b: Addressing independency in a multimodel ensemble by interpolation of model properties. J. Climate, 28, 5150–5170, https://doi.org/10.1175/JCLI-D-14-00361.1.
Tebaldi, C., J. Arblaster, and R. Knutti, 2011: Mapping model agreement on future climate projections. Geophys. Res. Lett., 38, L23701, https://doi.org/10.1029/2011GL049863.