## 1. Introduction

Statistical downscaling has become a well-established tool in regional and local impact assessments in recent years. Precipitation downscaling is particularly important for impact studies because it corrects the spatial and temporal structure of precipitation from coarse models. To resolve the scale discrepancy between global circulation models (GCMs) and local-scale weather, a vast number of algorithms and methods have been proposed and extensively tested. A comprehensive review of different precipitation downscaling approaches and their advantages and drawbacks is given by Maraun et al. (2010). Despite the extensive research on downscaling methods, there is still little consensus on the choice of useful atmospheric predictor variables (Wilby and Wigley 2000). Besides the general decision for a proper statistical downscaling model, the selection of an *informative predictor* set is crucial for the accuracy and stability of the resulting downscaled time series (Fealy and Sweeney 2007; Huth 2004; Fowler et al. 2007; Maraun et al. 2010; Bárdossy et al. 2002).

This applies both to the atmospheric variables and to the predictor domains in terms of geographical location and spatial extent; the latter generally receive little attention, as also noted by Goodess and Palutikof (1998) and Wilby and Wigley (2000). Ideally, suitable predictors for perfect prognosis (PP) approaches should fulfill the following conditions: (i) the relationship between predictors and predictands is stationary, (ii) the relationship remains valid under future climate conditions, (iii) the predictors have a high predictive power (explained variance), and (iv) the predictors are reasonably well simulated by the driving dynamical model (Maraun et al. 2010). For the first two conditions it is important that the statistical relationships found can be understood in physical terms.

Some circulation-related predictors, such as sea level pressure, fulfill most requirements and are therefore often employed as predictors in PP downscaling studies. However, on small scales, the sole use of circulation predictors often turns out to be insufficient to capture the complete variability. Therefore, some studies additionally consider thermodynamical variables and the amount of available water in the atmosphere (e.g., Wilby and Wigley 1997; Murphy 2000; Beckmann and Buishand 2002). Cavazos and Hewitson (2005) assessed the relative skill and error of 29 potential atmospheric predictors from the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis and their relevance for precipitation at 15 globally distributed locations. An artificial neural network (ANN) downscaling model was applied, and the most relevant predictors of daily precipitation were then determined by a rotated principal component analysis. Midtropospheric geopotential heights and midtropospheric specific humidity were found to be the dominant predictors at all locations. Most other predictors, however, show a strong regional and seasonal dependence.

Only a limited number of papers address the predictive capability of the domain size or shape and the question to what extent the variability of neighboring grid points influences local-scale events. Wilby and Wigley (2000) examined the spatial relationships between observed daily precipitation and both mean sea level pressure and atmospheric moisture for selected regions in the United States. They found that in many cases the maximum correlation between precipitation and mean sea level pressure is not located directly above the target grid cell, whereas correlations with specific humidity were largest when the predictors were right above the predictand position. Recently, D’onofrio et al. (2010) used a weather pattern classification to optimize the spatial domain of predictors using a set of 17 atmospheric variables. They found that the optimum spatial domain is smaller than the synoptic scale and therefore suggested considering different domains for dynamical and nondynamical predictors.

The approach of this study to derive natural three-dimensional predictor domains was developed for a new weather generator, which is best driven by a number of airmass classes with unique local precipitation distributions. Rather than using a predefined set of potential predictors and fixed regular domains for airmass classification, these factors are optimized by a modified nonlinear optimization algorithm. The algorithm combines a clustering method using nonlinear unsupervised self-organizing maps (SOMs) and a generic simulated annealing (SA) algorithm (see Fig. 1). This concept respects the idea that atmospheric subspaces with irregular spatial boundaries may be more informative for finding linkages between atmospheric characteristics and local phenomena than common rigid domains.

Most of the optimized domains can be interpreted in terms of well-known atmospheric processes. Consequently, one may expect that, if these domains contain relevant information for precipitation generation, their grid points should also be suitable inputs for nonlinear downscaling methods. To establish a nonlinear empirical relationship between the local-scale predictand and the previously optimized predictors, we utilize static artificial neural networks (ANNs), as in many other downscaling studies (e.g., Cavazos and Hewitson 2005; Hewitson and Crane 1996; Schoof and Pryor 2001; Wilby et al. 1998; Coulibaly et al. 2005). However, using a set of optimized predictors still poses the fundamental question whether all members increase the explained variance of the model’s output, either individually or through interaction with other predictors. So far, no downscaling study has paid much attention to either the interactions of individual variables or grid points or to the different sources of uncertainty in the model input. To achieve a full understanding of the model’s sensitivity pattern, it is necessary to recover the complete variance of the predictand, especially for nonlinear nonadditive models (Saltelli et al. 2006). Hence, a model-free sensitivity measure is required that is valid independently of the degree of linearity. An attractive model-free approach uses variance-based methods, which, unlike methods based on normalized derivatives, can also account for nonlinear nonadditive models. Once a robust ANN model is trained, the first- and higher-order effects are estimated by Monte Carlo (MC)-based global sensitivity analysis (GSA). Taking advantage of the MC runs, we also analyze the model output within specific bounds of interest (*factor mapping*).

The present study intends to show that physically interpretable predictor subspaces within the potential domain can be identified by conditional airmass classification and thereby help to understand linkages between atmospheric characteristics and local phenomena. Also, in view of the difficulties of nonlinear predictor screening, the paper addresses the question whether such predictor sets can improve the accuracy of nonlinear downscaling methods. In doing so, the influence of each predictor is estimated with the objective of evaluating the quality of the predictor set. The proposed approach has been applied to and evaluated on daily precipitation data in the Rhineland region of Germany.

## 2. Predictor selection

The conditional airmass classification combines a nonlinear unsupervised SOM clustering method and a generic probabilistic SA algorithm. The SOM clustering aims to find a predefined number of characteristic classes, such that each day can be assigned unambiguously to one of these classes. Rather than searching for characteristic circulation patterns, the classification is constrained by the local precipitation distributions. The predictor selection, performed by the SA algorithm, aims at making the precipitation distributions of the classes clearly different from one another. Both algorithms are specified in the next two sections.

### a. Self-organizing maps

The self-organizing map is one of a multiplicity of unsupervised clustering algorithms and has become increasingly popular, also in the field of atmospheric science, during recent decades (Crane and Hewitson 2003; Hewitson and Crane 2002; Lynch et al. 2006; Kohonen 1982). Rather than identifying characteristic regions in data space by maximizing the within-group similarity and the difference between groups, SOMs aim to find a predefined number of representative multidimensional prototypes (position vectors) in data space. These prototypes are mapped onto a lower-dimensional map space, called the Kohonen map, which intends to preserve the topological properties of the input space. This low-dimensional representation is particularly useful for describing and visualizing high-dimensional data. At the beginning of the learning phase the prototypes are initialized randomly in the data space and are adjusted iteratively during the learning process.

Let *M* be a set of *μ*_{M} data points, each specified by an *n*-dimensional predictor vector **x**_{i} (e.g., air temperature, specific humidity, etc.), and let *N* be a set of prototypes **n**_{i}, each composed of a randomly initialized vector **w**_{i} within the same input data space and a node vector **k**_{i} located on a discrete two-dimensional topological map; *μ*_{N} is the desired number of classes. At each iteration step *t* we randomly select one of the predictor vectors **x** ∈ *M*. The prototype **n**_{w} with the least distance to **x** is called the “best-matching unit” (BMU). Subsequently, the prototypes whose node vectors lie within a distance *δ*^{t} from the winner node **k**_{w} on the topological map are adjusted. The degree of adaptation is determined by the learning rate *ε*^{t}, the weight function *h*^{t}, and the distance on the map.

Once no more changes in location of the prototypes are observed or the maximum number of iterations is reached, the algorithm terminates. Under perfect conditions prototypes are arranged such that they span the entire data space, with more prototypes in regions of high data density (Hewitson and Crane 2002; Kohonen 1982). The properties of SOMs allow one to approximate a nonlinear distribution in predictor space despite the linear distance measures. A disadvantage of SOMs is that the user must define a large number of parameters, such as the size and shape of the update kernel, the learning rates, and the number of classes (prototypes). After training, each data point is assigned to the closest node.
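The update rule above can be sketched in a few lines of NumPy. The code below is an illustrative toy implementation, not the study’s software: the linear decay schedules for *ε*^{t} and *δ*^{t} and the Gaussian weight function are assumptions for the sketch.

```python
import numpy as np

def train_som(X, map_shape=(2, 3), n_iter=5000, eps0=0.5, delta0=2.0, seed=0):
    """Minimal SOM: prototypes w live in data space, nodes k on a 2D map."""
    rng = np.random.default_rng(seed)
    n_nodes = map_shape[0] * map_shape[1]
    # node coordinates on the discrete topological (Kohonen) map
    k = np.array([(i, j) for i in range(map_shape[0])
                  for j in range(map_shape[1])], dtype=float)
    # prototype vectors initialized from random data points
    w = X[rng.integers(0, len(X), n_nodes)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        eps = eps0 * (1.0 - frac)             # decaying learning rate eps^t
        delta = delta0 * (1.0 - frac) + 0.5   # shrinking neighborhood radius delta^t
        x = X[rng.integers(len(X))]           # randomly drawn predictor vector
        bmu = np.argmin(((w - x) ** 2).sum(axis=1))   # best-matching unit
        d_map = np.linalg.norm(k - k[bmu], axis=1)    # distance on the map
        h = np.exp(-(d_map / delta) ** 2)             # Gaussian weight function h^t
        w += eps * h[:, None] * (x - w)               # adjust prototypes toward x
    # after training, assign each data point to its closest prototype
    labels = np.argmin(((X[:, None, :] - w[None]) ** 2).sum(-1), axis=1)
    return w, labels
```

Because neighboring map nodes are pulled toward the same data points, the prototypes end up topologically ordered on the Kohonen map, as described above.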

### b. Simulated annealing

The problem of finding an optimal predictor subset on a discrete configuration space is an example of combinatorial minimization. Solving this problem with exhaustive search algorithms requires exponential computing time; the problem thus belongs to the complexity class “NP-complete.” For a large number of grid points in the configuration space, it is impossible to explore all possible combinations; in the case study presented in this paper, the number of possible solutions is 2^{11730}. In many cases such combinatorial problems can be solved, or at least well approximated, by the heuristic probabilistic simulated annealing (SA) algorithm (Kirkpatrick et al. 1983). The concept of simulated annealing was first introduced in numerical calculations by Metropolis et al. (1953) and has since become a widely applicable heuristic optimization technique.

At each iteration step, the difference in energy Δ*E* (cost function, see below) between the old and the new configuration is computed. Based on Δ*E*, the system changes its configuration from energy *E*_{1} to energy *E*_{2} with probability

Pr(Δ*E*) = exp(−Δ*E*/*T*),

where *T* is a control parameter (the analog of temperature). If Δ*E* ≤ 0, that is, the solution becomes better, the rearrangement is always accepted [Pr(Δ*E*) = 1]. In case Δ*E* > 0, that is, the solution becomes worse, the solution is accepted with this finite probability, Pr(Δ*E*), to avoid getting trapped in a local minimum of the cost function. During optimization, *T* is gradually lowered (annealing schedule). The SA algorithm was implemented with the modules detailed below. See Table 1 for an overview of the settings.
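The acceptance rule can be written compactly; the sketch below assumes the standard Metropolis form Pr(Δ*E*) = exp(−Δ*E*/*T*), matching the description above.

```python
import math
import random

def accept(delta_E, T, rng=random.random):
    """Metropolis acceptance rule: always accept improvements; accept
    worse solutions with probability exp(-delta_E / T)."""
    if delta_E <= 0:
        return True  # Pr(delta_E) = 1 for improvements
    return rng() < math.exp(-delta_E / T)
```

At high *T* almost any rearrangement is accepted, which lets the search escape local minima; as *T* is lowered, worse moves become increasingly unlikely.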

The simulated annealing (SA) parameter set.

#### 1) Rearrangements

At the beginning of each iteration step the algorithm determines whether a new grid point is added or a seeded grid point is removed. The ratio of additions to removals varies in time, oscillating harmonically with a frequency of 0.001 between ratios of 0.7 and 0.3. This strategy reduces the chance that the algorithm gets stuck in a local optimum. If grid points are added, the chance of selecting one of the directly surrounding grid points (0.9) is higher than that of selecting farther grid points. In case no improvement is observed after 400 consecutive iterations, the configuration is reset to the best configuration found so far.
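As an illustration, the oscillating add/remove ratio might look as follows; interpreting the frequency of 0.001 as cycles per iteration is an assumption of this sketch.

```python
import math
import random

def add_probability(t, freq=0.001, lo=0.3, hi=0.7):
    """Harmonically oscillating probability of adding (vs. removing)
    a grid point, bounded between `lo` and `hi`."""
    mid, amp = (hi + lo) / 2.0, (hi - lo) / 2.0
    return mid + amp * math.sin(2.0 * math.pi * freq * t)

def propose_move(t, rng=random.random):
    """Decide whether iteration t proposes an addition or a removal."""
    return "add" if rng() < add_probability(t) else "remove"
```

Alternating between addition-heavy and removal-heavy phases lets the subdomains grow and shrink repeatedly instead of freezing into one shape.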

#### 2) Cost function (energy)

The classification is conditioned on the local-scale precipitation distributions measured at eight weather stations. For every possible pair of classes, the *χ*^{2} test is used to determine whether the two distributions differ significantly at the 95% level. The optimization algorithm maximizes the total number of significantly different distributions. Additionally, the *χ*^{2} values are summed to obtain a measure of how different the distributions are. In case two model runs have the same number of different distributions, the final decision is determined by the summed *χ*^{2} value.
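A sketch of such a cost function is given below. The exact *χ*^{2} variant used in the study is not specified here, so this sketch assumes the standard two-sample *χ*^{2} statistic on a common binning, with the usual sample-size correction factors; the binning itself is an illustrative choice.

```python
import numpy as np
from scipy.stats import chi2

def chi2_two_sample(a, b, bins):
    """Two-sample chi-square statistic for samples a and b on common bins."""
    fa, _ = np.histogram(a, bins=bins)
    fb, _ = np.histogram(b, bins=bins)
    keep = (fa + fb) > 0                  # drop empty bins
    fa, fb = fa[keep], fb[keep]
    ka = np.sqrt(len(b) / len(a))         # sample-size correction factors
    kb = np.sqrt(len(a) / len(b))
    stat = np.sum((ka * fa - kb * fb) ** 2 / (fa + fb))
    return stat, len(fa) - 1              # statistic and degrees of freedom

def cost(classes, bins, alpha=0.05):
    """Primary criterion: number of significantly different class pairs.
    Tie breaker: summed chi-square values over all pairs."""
    n_diff, chi_sum = 0, 0.0
    for i in range(len(classes)):
        for j in range(i + 1, len(classes)):
            stat, dof = chi2_two_sample(classes[i], classes[j], bins)
            chi_sum += stat
            if dof > 0 and stat > chi2.ppf(1.0 - alpha, dof):
                n_diff += 1
    return n_diff, chi_sum
```

The optimizer would then prefer configurations with a larger `n_diff`, falling back to the larger `chi_sum` when two configurations tie.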

#### 3) Annealing schedule

The annealing schedule was determined empirically. The temperature is held constant for 2000 reconfigurations, or for 100 successful reconfigurations, so as to approximately reach thermal equilibrium. The temperature is subsequently lowered by a constant cooling rate of 10%. Owing to the large number of grid points, this rate does not guarantee finding the global minimum.
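Putting the schedule together with the Metropolis acceptance rule gives the following skeleton. It is a generic sketch, not the study’s implementation; the number of cooling stages and the toy energy and proposal functions in the test are illustrative assumptions.

```python
import math
import random

def simulated_annealing(energy, propose, x0, T0=1.0, cooling=0.9,
                        max_moves=2000, max_success=100, n_stages=20, seed=0):
    """Generic SA skeleton: T is held fixed for up to `max_moves` proposed
    reconfigurations or `max_success` accepted ones, then lowered by a
    constant cooling rate (cooling=0.9 corresponds to a 10% rate)."""
    rng = random.Random(seed)
    x, E, T = x0, energy(x0), T0
    best_x, best_E = x, E
    for _ in range(n_stages):
        successes = 0
        for _ in range(max_moves):
            x_new = propose(x, rng)
            dE = energy(x_new) - E
            # Metropolis rule: accept improvements, and worse moves with exp(-dE/T)
            if dE <= 0 or rng.random() < math.exp(-dE / T):
                x, E = x_new, E + dE
                successes += 1
                if E < best_E:
                    best_x, best_E = x, E
            if successes >= max_success:
                break   # approximate thermal equilibrium reached
        T *= cooling    # lower the temperature
    return best_x, best_E
```

With a slow enough schedule the best configuration found approaches the global minimum, but, as noted above, for very large configuration spaces this is not guaranteed.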

## 3. Global sensitivity analysis

Consider a model *Y* = *f*(**X**) (where *f* is replaced by the ANN later in this study) with unconditional output variance *V*(*Y*). If one predictor *X*_{i} is fixed at a particular value, the resulting conditional variance of *Y* is *V*(*Y* | *X*_{i}). That this sensitivity measure depends on the chosen value of *X*_{i} is resolved by averaging over all possible values of *X*_{i}, giving the variance contribution *V*[E(*Y* | *X*_{i})] of the individual effect of *X*_{i} on *Y*. The corresponding first-order sensitivity index of *X*_{i} is given by

*S*_{i} = *V*[E(*Y* | *X*_{i})] / *V*(*Y*).

The total-order index *S*_{Ti} of a factor *X*_{i} is made up of the first-order term and all higher-order terms in which the predictor *X*_{i} participates, consequently giving information on the nonadditive character of the model:

*S*_{Ti} = E[*V*(*Y* | **X**_{~i})] / *V*(*Y*).

The subscript **X**_{~i} indicates that all predictors but *X*_{i} have been fixed and only *X*_{i} varies over its uncertainty range. This approach permits one, even for nonadditive models, to recover the complete variance of *Y*. The indices can be efficiently computed by Monte Carlo–based numerical procedures (Saltelli et al. 2006, 1999).
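The Monte Carlo procedure can be sketched as follows. This is a generic implementation of the commonly used Saltelli first-order and Jansen total-order estimators with independent uniform inputs on [0, 1]; the paper’s exact numerical procedure may differ.

```python
import numpy as np

def sobol_indices(f, d, n=20000, seed=0):
    """MC estimates of first-order (S_i) and total-order (S_Ti) indices
    for a model f acting row-wise on an (n, d) input matrix."""
    rng = np.random.default_rng(seed)
    A = rng.random((n, d))          # two independent sample matrices
    B = rng.random((n, d))
    fA, fB = f(A), f(B)
    V = np.var(np.concatenate([fA, fB]))   # unconditional output variance
    S, ST = np.empty(d), np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                # A with column i taken from B
        fABi = f(ABi)
        S[i] = np.mean(fB * (fABi - fA)) / V          # Saltelli first-order
        ST[i] = 0.5 * np.mean((fA - fABi) ** 2) / V   # Jansen total-order
    return S, ST
```

Subtracting `S` from `ST` then quantifies how strongly each factor participates in interactions, exactly the diagnostic used in section 4e.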

## 4. Case study: The Rhineland

### a. Data and setup

Daily precipitation measurements for 1971–2000 have been obtained from the German Weather Service (DWD) for eight weather stations in the Rhineland (Germany), providing a set of 8 × 10 592 samples. For this period complete datasets are available for all stations. The large-scale atmospheric data are derived from the 40-yr European Centre for Medium-Range Weather Forecasts Re-Analysis (ERA-40), provided on 2.5° × 2.5° regular geographical coordinates. The potential predictor domain is centered on central Europe, ranging from 35° to 75°N and 12.5° to 42.5°W. Six variables have been used for classification: air temperature (*T*), specific humidity (*H*_{S}), vertical velocity (*w*), divergence (div), potential vorticity (PV), and geopotential height (*z*), each on five pressure levels (1000, 850, 700, 500, and 300 hPa). Inherently, some of the predictor variables used show significant correlations, for example, *H*_{S} and *T*, or div and *w*.

The algorithm was initialized by setting the seed for each predictor to the grid point closest to the precipitation observations (50°N, 5°E) in the potential predictor domain. On the high-performance clusters at the University of Aachen, several SOM clustering computations have been performed with the number of classes ranging from 3 to 10. First, all models that produced significantly different precipitation distributions for all classes and stations were selected (see cost function in section 2b). From these models, the one with the greatest number of classes was chosen so as to obtain as many independent classes as possible. According to this criterion the best SOM architecture consists of six classes. An example of the cumulative precipitation distributions is given for one of the weather stations, Aachen (see Fig. 2). To ensure that the algorithm had converged, we restarted the optimization from the optimal configuration and compared the two runs. The second run converged to the same subdomains, indicating that the algorithm found a robust optimum.

Example of a cumulative density function of the daily precipitation sums for each of the six classes at the weather station near Aachen.

Citation: Journal of Climate 24, 23; 10.1175/2011JCLI4155.1


### b. Predictor domains

Figure 3 shows the optimized subdomains used for the final classification. Note that the algorithm identifies both continuous and disjointed subdomains (e.g., *T*, PV). The shape of the predictor domains depends on the physical processes involved, as the algorithm seeks the domain that includes the most relevant information for precipitation generation for all air masses. In the following discussion, the predictor domains are analyzed with regard to their spatial structure as well as their inherent dynamic and thermodynamic characteristics.

Optimized predictor domains for (a) specific humidity *H*_{S}, (b) vertical velocity *w*, (c) divergence (div), (d) potential vorticity (PV), (e) temperature *T*, and (f) geopotential height *z*. For each parameter a three-dimensional perspective of the domain is shown; for presentation the lattices are filtered by a 3 × 3 × 3 Gaussian filter and displayed on an isosurface. The corresponding map is shown to the right. Note that the potential predictor domain extends farther to the east. The black dot indicates the location of the study area.

Citation: Journal of Climate 24, 23; 10.1175/2011JCLI4155.1


The *H*_{S} domain (Fig. 3a) is situated directly above the study area, and its spatial extent is much smaller than those of the other subdomains. Wilby and Wigley (2000) found similar spatial correlation patterns between precipitation and *H*_{S} for six regions in the United States, tending to overlap with the target. Together with the *w* pattern (Fig. 3b), extending upward to a height of several kilometers (300 hPa) and sloping backward toward the west, this formation captures air masses gliding aloft along a frontal surface. This pattern is well known from rising warm-air advection in advance of developing surface lows and the condensation of water vapor in the upper atmosphere.

The div dipole (Fig. 3c) is situated between the upper (300 hPa) and ground-level (1000–700 hPa) atmosphere, with the lower region directly connected to the lower edge of the *w* domain. This is essentially due to the close linkage of the two predictors by the continuity equation. Depending on whether air masses are ascending or descending (in the *w* domain), the two div regions behave either as sinks or sources (see Table 2).

Characteristics of the airmass classes. Shown are the mean values of *H*_{S} and *w* in the corresponding subdomains. Also given are the mean difference in surface temperature Δ*T* (Iceland minus Spain) and the standard deviation (in brackets) of the *T* dipole. To gain a detailed understanding, the local mean rain rate (RR) and the fraction of rainy days (RD) are also presented.

Useful information on diabatic processes can be obtained from changes in the PV (Davis and Emanuel 1991; Rossa et al. 2000). As PV is the product of absolute vorticity on an isentropic surface and static stability, a release of latent heat is concentrated within the domain of enhanced PV values. Therefore, strong baroclinic zones, where much diabatic heat release takes place, usually form low-level PV anomalies (near the ground, see also Fig. 3d). Besides the derivation of diabatic processes, the vorticity equation admits, by means of its invertibility, inferences on other meteorological fields such as geopotential height, air temperature, wind, and static stability. In the North Atlantic region, the highest cyclone track densities are found around 60°N (Ulbrich et al. 2008, 2009; Wang et al. 2006), varying slightly with the seasons. Consequently, significant changes of PV take place particularly in this zone. This is in good agreement with the obtained PV subdomain, which ranges from the northern part of Germany along the west coast of Norway up to 62.5°N. The structure also nicely reflects the often observed comma-shaped pattern forming in the proximity of low pressure systems.

Unlike the subdomains of the circulation and moisture predictors, two disjointed *T* domains (Fig. 3e) are located over Spain and Iceland. This is not surprising since the effect of temperature change due to vertical movement is usually lower than that due to advection. There is no evidence that wetter or drier events could be explained by westerly advection of warmer or colder air masses. However, the seasonal occurrence of air masses coincides with the corresponding temperature signal. Typical summer airmass patterns, such as AM1 and AM2, are characterized by warmer air temperatures in both *T* domains.

Finally, the *z* domain is shifted northeast to the Rhineland and stretches vertically from the upper to the middle troposphere with small horizontal extent (Fig. 3f). The size of the domain strongly contrasts that of the linear correlation map, which shows strong correlations of precipitation and *z* over a great distance. Accordingly, it is advised to be careful about using correlation as a measure of predictive power of predictors for downscaling.

### c. Airmass classes

To create a greater understanding of the six airmass classes and thus of their influence on local phenomena, it is key to understand the physical processes involved. These processes are closely related to the spatial domains discussed in the previous section. Some mean characteristics of the predictors and the predictand for each class are given in Table 2. The resulting composite mean sea level pressure and specific humidity maps are presented in Fig. 4. These classes can be roughly divided into three low pressure (AM1, AM3, AM5) and three high pressure classes (AM2, AM4, AM6).

Fields of mean sea level pressure (contour lines) and specific humidity at the 500-hPa level (filled contour) for each of the six air masses (AM1–AM6); the black dot indicates the location of the study area.

Citation: Journal of Climate 24, 23; 10.1175/2011JCLI4155.1


Frequent rainfall events in the study area are associated with advection of humid air masses from the west and southwest. These are typical conditions for AM1 and AM3 (see Fig. 4 and Table 2). In both cases the advection is driven by a low pressure system north of the British Isles. Within the complete *w* subdomain vertical winds are negative, indicating a large-scale ascent of air masses. Once the frontal system passes the study area and the low pressure system is located above Scandinavia, uplift rates reduce and the flow changes to the northwest. This situation is typical for AM5.

During summer and autumn the air contains more moisture owing to the increased saturation vapor pressure and evaporation (see Fig. 4 and Table 2). This can be observed especially for AM1 and AM2, as indicated by their higher frequency in these seasons (Fig. 5). Obviously, such quantitative fluctuations in moisture content are mainly attributable to seasonal temperature differences. Nevertheless, an increased moisture content does not imply higher mean rain rates (see Table 2). A good example is given by AM2 and AM3: although both air masses have similar moisture contents in the *H*_{S} domain, which is located directly above the study area, their mean rain rates differ greatly. Other physical quantities, such as the vapor pressure deficit, may be more useful predictors for this purpose.

Frequencies of the six air masses (SOM classes) by month.

Citation: Journal of Climate 24, 23; 10.1175/2011JCLI4155.1


The much drier classes, AM2 and AM6, show pronounced high pressure systems over central Europe accompanied by dry air masses in the midtroposphere. As a consequence of the large-scale subsidence, the divergence takes positive values in the full div subdomain. The associated adiabatic warming increases the vapor pressure deficit and hence reduces the relative humidity. This is evident above all in the reduced precipitation probabilities.

### d. ANN downscaling

In the next step, an empirical relationship between local-scale precipitation and the previously optimized predictors (47 predictors in total) is derived by a nonparametric static feed-forward ANN (M1). It should be mentioned that the predictor screening was performed on the same dataset used for downscaling. However, since the predictors were not explicitly optimized to improve the accuracy of the downscaling, independence is still given. Next to the standard statistical model M1, two additional ANN models are trained for comparison using different predictor settings. The statistical model M2 utilizes the same six variables as the standard model M1, but the domain is limited to the Rhineland itself. The domain is selected by computing standard product-moment correlation coefficients for a 3 × 3 gridpoint domain above the target location; the pressure level with the strongest correlations is selected. Accordingly, the following group of predictors (54 predictors in total) was selected: *T*_{1000}, *H*_{s700}, *w*_{700}, div_{300}, PV_{500}, and *z*_{500}. The choice of predictors for model M3 (18 predictors in total) goes back to the work of Cavazos and Hewitson (2005), who found that *z*_{500} and *H*_{s700} are the most relevant controls of daily precipitation in the midlatitudes; this matches the heights used in M2 for these two variables.

By randomly selecting individual data elements, the dataset is divided into three sets: training (70%), validation (15%), and testing (15%). Each ANN model is trained using 60 different sets of random initial weights so as to avoid local optima. The number of neurons in the two hidden layers was determined by trial and error: using more than three and two neurons in the first and second hidden layers, respectively, does not lead to a significant improvement in accuracy. Since the ratio of neurons to the amount of data is very low and the unbiased generalization error is continually estimated on the test set during training, the results are unlikely to suffer from overfitting. As a consequence, the accuracy on the training set is only slightly better than on the validation and test sets.

For the evaluation the mean performance over all stations is estimated from the five best ANNs. Results of all three ANN models based on the validation dataset are shown in Tables 3 and 4. The accuracy is quantified by the explained variance (*R*^{2}), the rms error (RMSE), the skill ratio between simulated and observed standard deviation (Skill), the mean precipitation differences (DP), the ratio of the average number of dry spells (DS) with more than three days, and the ratio of the average length of these dry spells. These ratios are the number or length found in the simulated downscaled data divided by the value of these indices in the measured data. Measures are calculated separately for each season, so fluctuations in model accuracy can be studied.
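The evaluation measures can be computed as sketched below. The wet-day threshold of 0.1 mm and the exact index definitions are illustrative assumptions, not taken from the study.

```python
import numpy as np

def spell_lengths(dry):
    """Lengths of consecutive runs of True values (dry days)."""
    lengths, run = [], 0
    for d in dry:
        if d:
            run += 1
        else:
            if run:
                lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths

def evaluate(obs, sim, wet_threshold=0.1, min_len=4):
    """R^2, RMSE, skill (std ratio), DP (%), and dry-spell ratios
    (spells of more than three days) for one station and season."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    ss_res = np.sum((obs - sim) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean((obs - sim) ** 2))
    skill = sim.std() / obs.std()                       # simulated/observed std
    dp = 100.0 * (sim.mean() - obs.mean()) / obs.mean() # mean precip difference (%)
    ds_obs = [l for l in spell_lengths(obs < wet_threshold) if l >= min_len]
    ds_sim = [l for l in spell_lengths(sim < wet_threshold) if l >= min_len]
    ds_ratio = len(ds_sim) / max(len(ds_obs), 1)        # number of dry spells
    len_ratio = (np.mean(ds_sim) / np.mean(ds_obs)) if ds_obs and ds_sim else float("nan")
    return dict(R2=r2, RMSE=rmse, Skill=skill, DP=dp, DS=ds_ratio, DSlen=len_ratio)
```

A perfect simulation yields R^{2} = 1, RMSE = 0, and ratios of exactly 1, which makes the function easy to sanity check.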

Explained variance (*R*^{2}), RMSE, and skill calculated for the validation period for all seasons (DJF: December–February, MAM: March–May, JJA: June–August, SON: September–November). Results for the models M1 (optimized predictors with SOM), M2 (six levels), and M3 (two levels) are shown.

The differences of mean precipitation (DP) (%), ratio of the average number of dry spells (DS) (more than 3 days), and the ratio of the average length of dry spells calculated for the validation period for all seasons. Results are shown for the models M1 (optimized predictors with SOM), M2 (6 levels) and M3 (2 levels).

In general, all models perform better during winter and autumn owing to less convection in these periods. Other studies (e.g., Tolika et al. 2007; Cavazos and Hewitson 2005; Coulibaly et al. 2005) reported similar annual fluctuations in model performance. While there is not much improvement using the SOM-optimized predictor set in spring, significant improvements are achieved in summer and autumn. Apart from this, it is noticeable that the mean precipitation amount is underestimated by almost all models and seasons. The same holds true for the number and length of consecutive dry days, indicating difficulties in modeling days without precipitation. Overall, the downscaling with the optimized domains (M1) seems better at reproducing the variability of precipitation (see the Skill columns). The numbers of predictors for the models M1 (47 predictors) and M2 (54 predictors) are very similar; thus the differences cannot be explained by the number of predictors alone.

Based on the results, we assume that the ANN models predict processes with sufficient accuracy so that they may be used for prognostic modeling and the global sensitivity analysis in the next section.

### e. Global sensitivity analysis

To understand the importance of the predictors, the model’s complete sensitivity pattern is required (factor prioritization). In the following section, different sources of uncertainty in the ANN are estimated using the variance-based GSA method introduced in section 3. To take into account the spread of ANN network weights, first- and higher-order indices are calculated and averaged over the five best ANNs (smallest RMSE) at each station. Then, for reasons of clarity indices are averaged over all stations as well.
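As a minimal illustration of the variance-based approach (not the implementation used in this study), first- and total-order Sobol indices can be estimated with a Saltelli-style Monte Carlo scheme; for simplicity, the inputs are assumed independent and uniform on [0, 1]:

```python
import numpy as np

def sobol_indices(model, n_inputs, n_samples=40000, rng=None):
    """Estimate first-order (S_i) and total-order (S_Ti) Sobol indices
    with a Saltelli-style Monte Carlo scheme; inputs ~ U(0, 1)."""
    rng = np.random.default_rng(rng)
    A = rng.random((n_samples, n_inputs))   # two independent sample matrices
    B = rng.random((n_samples, n_inputs))
    fA, fB = model(A), model(B)
    var = np.var(np.concatenate([fA, fB]))  # total output variance
    S, ST = np.empty(n_inputs), np.empty(n_inputs)
    for i in range(n_inputs):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                 # A with column i taken from B
        fABi = model(ABi)
        S[i] = np.mean(fB * (fABi - fA)) / var         # Saltelli first-order estimator
        ST[i] = 0.5 * np.mean((fA - fABi) ** 2) / var  # Jansen total-effect estimator
    return S, ST
```

For the additive test function *f*(x) = x₀ + 2x₁ the analytic values are S = (0.2, 0.8) with S_{Ti} = S_{i}, since no interaction terms exist; for a trained ANN, `model` would wrap the network's forward pass.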

Figure 6 shows the relative importance of the individual predictors as given by the total- and first-order indices. First-order indices *S_{i}* measure individual predictor contributions to the ANN output variance, while the total-order indices *S_{Ti}* also include all interaction effects. Whereas the first-order sensitivity is often (close to) zero, in all cases the *S_{Ti}* values differ significantly from zero, indicating that all variables contribute to the output variance either singly or in combination with other variables. By subtracting *S_{i}* from *S_{Ti}*, one obtains a measure of how strongly the variables are involved in interactions.

Fig. 6. Plot of the total-order and first-order sensitivity derived from the trained ANN. Every individual grid point of the optimized predictor set (*H_{S}*, *T*, *w*, PV, div, *z*) is depicted on the abscissa. Corresponding pressure levels are represented by the horizontal grayscale bar at the bottom. (Citation: *Journal of Climate* **24**, 23; 10.1175/2011JCLI4155.1)

The individual variables account for about 73% (sum of first-order indices) of the output variance; the remaining 27% is due to interactions (nonadditive effects). The ANN trained for precipitation downscaling therefore belongs to the class of nonadditive models. Apparently some inputs contribute mainly through their interactions with other variables, for example, div in June–August. Nevertheless, these factors make an important contribution to the model's output variance by interaction. According to the results, *z*, *w*, *T*, and *H_{S}* may be designated as sensitive variables, since the model output is mainly influenced by the uncertainty of these factors. There is clear evidence that these factors are subject to seasonal fluctuation and, thus, that not all physics is taken into account by these models. For example, the geopotential height *z* is more important in the lower troposphere around the 500-hPa level during summer and spring, whereas the upper troposphere (300-hPa level) gains importance in winter. Conversely, at the 850- and 500-hPa levels *w* and *T* consistently contribute to the model output variance. Note that specific humidity *H_{S}* at the 500- and 700-hPa levels shows high values in summer and a minimum in winter.

Downscaling studies often address the question of which predictors are most responsible for producing strong precipitation events (factor mapping). To do so, model output realizations are mapped back onto the input space (Saltelli et al. 2006), reusing the Monte Carlo realizations from the GSA. In a first step, the output is filtered into two subsets: one contains the upper 5% tail of rainy days, and the other contains the remaining outcomes. Most likely the two subsamples come from different (unknown) probability density functions of the predictors. The cumulative distributions are compared by applying a two-sided Kolmogorov–Smirnov test. Figure 7 shows, for all predictors, the average Kolmogorov–Smirnov test statistic over all stations between the groups with strong and weak precipitation: the larger the value, the more important the predictor in producing strong precipitation. In general, highly influential predictors (total-order indices) also trigger stronger events. However, apparently less important parameters, such as PV and div, can also play a decisive role in the tails of the prediction.

Fig. 7. Two-sample Kolmogorov–Smirnov test statistics between the distribution of the upper 5% tail of the output realizations and that of the remaining 95%. Corresponding pressure levels are represented by the horizontal grayscale bar at the bottom. (Citation: *Journal of Climate* **24**, 23; 10.1175/2011JCLI4155.1)
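The factor-mapping step can be sketched as follows. This is a numpy-only illustration, assuming the GSA input samples are available as an array with one column per predictor; `scipy.stats.ks_2samp` would give the same statistic plus a p value:

```python
import numpy as np

def ks_statistic(x, y):
    """Two-sided two-sample Kolmogorov–Smirnov statistic (max ECDF distance)."""
    x, y = np.sort(x), np.sort(y)
    grid = np.concatenate([x, y])
    cdf_x = np.searchsorted(x, grid, side="right") / len(x)
    cdf_y = np.searchsorted(y, grid, side="right") / len(y)
    return float(np.max(np.abs(cdf_x - cdf_y)))

def factor_mapping(inputs, output, tail=0.05):
    """Score each predictor by the KS distance between its values on
    strong-precipitation days (upper tail of the output) and all other days."""
    thresh = np.quantile(output, 1.0 - tail)
    strong = output >= thresh
    return np.array([ks_statistic(inputs[strong, j], inputs[~strong, j])
                     for j in range(inputs.shape[1])])
```

A predictor whose score is near one separates the tail almost perfectly, while a score near zero means its distribution is unchanged on strong-precipitation days.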

## 5. Conclusions and discussion

This study presents a new nonlinear approach for optimizing predictors and their corresponding three-dimensional domains for conditional airmass classification. Unlike common approaches, each predictor domain can take irregular spatial boundaries to obtain a wide range of precipitation distributions at each weather station. As shown in section 4a, allowing for nonlinearities in the screening process exposes clearly defined and physically interpretable predictor domains that are compact and smooth, yet far from rectangular or symmetric. Most of the domains can be assigned to specific stages of a low pressure trough. Assuming that the optimized predictors are also suitable inputs for nonlinear downscaling, the optimized predictor set has been used for an ANN downscaling in the German Rhineland. The results have been compared to those of similar models using symmetrically arranged input predictors on several pressure levels.

Despite the predictors not having been explicitly optimized to improve the explained variance of the downscaling, an improvement could be achieved in summer and autumn. Particular emphasis was placed on insight into predictor interactions, with the intention of identifying the sensitivities and corresponding sources of uncertainty in the predictor set. Such information may be further used to develop more parsimonious models with only small losses in accuracy. The applied GSA turned out to be an efficient way to provide the necessary understanding of the model's sensitivity pattern. According to the results, about one-fourth of the ANN model output variance is due to interaction effects.

Neglecting such interactions in the screening process will lead to a loss of information. In this context, linear screening methods are insufficient, as they account neither for interactions nor for the nonadditivity exhibited by many nonlinear prediction algorithms. The same holds for the widely used sigma-normalized derivative sensitivity analysis, which can lead to wrong conclusions because it misses interaction effects.

Based on the GSA results, the following conclusions can be drawn for central Europe:

- The role of humidity (*H_{S}*) appears to be more relevant for summer precipitation, when subscale processes are dominant. This is also the most likely explanation for the modest horizontal extent of the domain. The influence of *H_{S}* shows a pronounced vertical differentiation, decreasing from the 700-hPa (8%–18%) to the 300-hPa (2%–3%) level (see Fig. 6).
- Vertical velocity (*w*) at the 500-hPa level explains almost 30% of the model output variance (including interactions; see Fig. 6) in the transitional seasons. Together with the contributions at the 850-hPa and 500-hPa levels, *w* seems to be a key element for downscaling precipitation. Unlike the other predictors, its domain slopes upwind across the entire troposphere, covering a great part of central Europe (see Fig. 3d).
- Potential vorticity (PV) and divergence (div) may be classified as rather insignificant by themselves, particularly in summer and spring. During that time of the year these predictors contribute almost entirely through interaction effects (see Fig. 6). For parsimonious models these predictors may be removed with little loss of accuracy.
- Temperature (*T*) fundamentally differs from the other predictors in its dipole domain (see Fig. 3a) and its distance from the investigation area, with the main center of action over Spain. Nevertheless, the mid- to high-tropospheric *T* values contribute between 10% and 18% (including interactions; see Fig. 6) to the total model output variance.
- Geopotential height (*z*) was found to be an influential predictor in all seasons. During autumn and winter the high-tropospheric region (300 hPa) seems to be more important, while there is a pronounced downward shift to the 500-hPa level in summer and spring. Interestingly, the horizontal extent of the domain is relatively small. Outside of this domain, pressure also correlates with precipitation in the Rhineland; this illustrates that strong correlations do not always equate to good predictive power.

In general, the decomposition of the complete model output variance by a GSA offers a convenient way to estimate the relative importance of the predictors.

Although the downscaling results are convincing, better results may be obtained by additionally including appropriate time lags and scales. Altogether, predictor optimization by means of conditional airmass classification provides a good way to find characteristic atmospheric domains. However, the proposed algorithm is rather time consuming, owing to both the simulated annealing algorithm and the self-organizing maps clustering algorithm. One way to speed up the optimization is to use an alternative clustering algorithm (e.g., *k* means). Furthermore, a greedier and more efficient optimization algorithm may be sufficient to find the optimum domains. It is planned to verify this approach in other climatic regions.
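For instance, the SOM clustering step could be swapped for plain *k* means, sketched below with Lloyd's algorithm. This is a hypothetical speed-up, not the clustering used in the study:

```python
import numpy as np

def kmeans(X, k, n_iter=100, rng=0):
    """Plain k-means clustering (Lloyd's algorithm)."""
    rng = np.random.default_rng(rng)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # assign each sample to its nearest center (squared Euclidean distance)
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # move each center to the mean of its members; keep empty clusters put
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

Unlike a SOM, *k* means imposes no topology on the cluster centers, so the resulting airmass classes lose the map-like neighborhood structure but are cheaper to compute.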

## Acknowledgments

We gratefully acknowledge financial support by the Deutsche Forschungsgemeinschaft (DFG), Grant VE366/3-1.

## REFERENCES

Bárdossy, A., J. Stehlík, and H. Caspary, 2002: Automated objective classification of daily circulation patterns for precipitation and temperature downscaling based on optimized fuzzy rules. *Climate Res.*, **23**, 11–22.

Beckmann, B.-R., and T. A. Buishand, 2002: Statistical downscaling relationships for precipitation in the Netherlands and north Germany. *Int. J. Climatol.*, **22**, 15–32, doi:10.1002/joc.718.

Cavazos, T., and B. C. Hewitson, 2005: Performance of NCEP–NCAR reanalysis variables in statistical downscaling of daily precipitation. *Climate Res.*, **28**, 95–107.

Coulibaly, P., Y. B. Dibike, and F. Anctil, 2005: Downscaling precipitation and temperature with temporal neural networks. *J. Hydrometeor.*, **6**, 483–496.

Crane, R. G., and B. C. Hewitson, 2003: Clustering and upscaling of station precipitation records to regional patterns using self-organizing maps (SOMs). *Climate Res.*, **25**, 95–107.

Davis, C., and K. Emanuel, 1991: Potential vorticity diagnostics of cyclogenesis. *Mon. Wea. Rev.*, **119**, 1929–1953.

D'Onofrio, A., J.-P. Boulanger, and E. Segura, 2010: CHAC: A weather pattern classification system for regional climate downscaling of daily precipitation. *Climatic Change*, **98**, 405–427, doi:10.1007/s10584-009-9738-4.

Fealy, R., and J. Sweeney, 2007: Statistical downscaling of precipitation for a selection of sites in Ireland employing a generalised linear modelling approach. *Int. J. Climatol.*, **27**, 2083–2094.

Fowler, H. J., S. Blenkinsop, and C. Tebaldi, 2007: Linking climate change modelling to impacts studies: Recent advances in downscaling techniques for hydrological modelling. *Int. J. Climatol.*, **27**, 1547–1578.

Goodess, C. M., and J. P. Palutikof, 1998: Development of daily rainfall scenarios for southeast Spain using a circulation-type approach to downscaling. *Int. J. Climatol.*, **18**, 1051–1083.

Hewitson, B. C., and R. G. Crane, 1996: Climate downscaling: Techniques and application. *Climate Res.*, **7**, 85–95.

Hewitson, B. C., and R. G. Crane, 2002: Self-organizing maps: Applications to synoptic climatology. *Climate Res.*, **22**, 13–26.

Huth, R., 2004: Sensitivity of local daily temperature change estimates to the selection of downscaling models and predictors. *J. Climate*, **17**, 640–652.

Kirkpatrick, S., C. D. Gelatt, and M. P. Vecchi, 1983: Optimization by simulated annealing. *Science*, **220**, 671–680.

Kohonen, T., 1982: Self-organized formation of topologically correct feature maps. *Biol. Cybern.*, **43**, 59–69, doi:10.1007/BF00337288.

Lynch, A., P. Uotila, and J. J. Cassano, 2006: Changes in synoptic weather patterns in the polar regions in the twentieth and twenty-first centuries, part 2: Antarctic. *Int. J. Climatol.*, **26**, 1181–1199.

Maraun, D., and Coauthors, 2010: Precipitation downscaling under climate change: Recent developments to bridge the gap between dynamical models and the end user. *Rev. Geophys.*, **48**, RG3003, doi:10.1029/2009RG000314.

Metropolis, N., A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, 1953: Equation of state calculations by fast computing machines. *J. Chem. Phys.*, **21**, 1087, doi:10.1063/1.1699114.

Murphy, J., 2000: Predictions of climate change over Europe using statistical and dynamical downscaling techniques. *Int. J. Climatol.*, **20**, 489–501.

Rossa, A. M., H. Wernli, and H. C. Davies, 2000: Growth and decay of an extra-tropical cyclone's PV-tower. *Meteor. Atmos. Phys.*, **73**, 139–156, doi:10.1007/s007030050070.

Saltelli, A., S. Tarantola, and K. P.-S. Chan, 1999: A quantitative model-independent method for global sensitivity analysis of model output. *Technometrics*, **41**, 39–56.

Saltelli, A., M. Ratto, S. Tarantola, and F. Campolongo, 2006: Sensitivity analysis practices: Strategies for model-based inference. *Reliab. Eng. Syst. Saf.*, **91**, 1109–1125, doi:10.1016/j.ress.2005.11.014.

Schoof, J. T., and S. C. Pryor, 2001: Downscaling temperature and precipitation: A comparison of regression-based methods and artificial neural networks. *Int. J. Climatol.*, **21**, 773–790.

Tolika, K., P. Maheras, M. Vafiadis, H. A. Flocas, and A. Arseni-Papadimitriou, 2007: Simulation of seasonal precipitation and raindays over Greece: A statistical downscaling technique based on artificial neural networks (ANNs). *Int. J. Climatol.*, **27**, 861–881.

Ulbrich, U., J. G. Pinto, H. Kupfer, G. C. Leckebusch, T. Spangehl, and M. Reyers, 2008: Changing Northern Hemisphere storm tracks in an ensemble of IPCC climate change simulations. *J. Climate*, **21**, 1669–1679.

Ulbrich, U., G. C. Leckebusch, and J. G. Pinto, 2009: Extra-tropical cyclones in the present and future climate: A review. *Theor. Appl. Climatol.*, **96**, 117–131, doi:10.1007/s00704-008-0083-8.

Wang, X. L., V. R. Swail, and F. W. Zwiers, 2006: Climatology and changes of extratropical cyclone activity: Comparison of ERA-40 with NCEP–NCAR reanalysis for 1958–2001. *J. Climate*, **19**, 3145–3166.

Wilby, R. L., and T. M. L. Wigley, 1997: Downscaling general circulation model output: A review of methods and limitations. *Prog. Phys. Geogr.*, **21**, 530–548.

Wilby, R. L., and T. M. L. Wigley, 2000: Precipitation predictors for downscaling: Observed and general circulation model relationships. *Int. J. Climatol.*, **20**, 641–661.

Wilby, R. L., T. M. L. Wigley, D. Conway, P. D. Jones, B. C. Hewitson, J. Main, and D. S. Wilks, 1998: Statistical downscaling of general circulation model output: A comparison of methods. *Water Resour. Res.*, **34**, 2995–3008, doi:10.1029/98WR02577.