## Abstract

High-resolution weather scenarios generated for climate change impact studies from the output of climate models must be spatially consistent. Analog models (AMs) offer a high potential for the generation of such scenarios. For each prediction day, the scenario they provide is the weather observed for days in a historical archive that are analogous according to different predictors. When the same “analog date” is chosen for a prediction at several sites, spatial consistency is automatically satisfied. The optimal predictors and consequently the optimal analog dates, however, are expected to depend on the location for which the prediction is to be made.

In the present work, the predictor (1000- and 500-hPa geopotential heights) domain of a benchmark AM is optimized for the probabilistic daily prediction of 8981 local precipitation “stations” over France. The corresponding 8981 locally domain-optimized AMs are used to explore the spatial transferability and similarity of the optimal analog dates obtained for different locations. Whereas the similarity is very low even when the locations are close, the spatial transferability of the optimal analog dates for a given location is high. When they are used for the prediction at all other locations, the loss of prediction performance is therefore very low over large spatial domains (up to 500 km). Spatial transferability is lower in the presence of high mountains. It also depends on the parameters of the AM (e.g., its archive length, predictors, and number of analog dates used for the prediction). In the present case, AMs with higher prediction skill exhibit lower transferability.

## 1. Introduction

High-resolution scenarios of various surface meteorological variables are classically required for impact studies at regional scales. Climate models usually exhibit severe limitations for simulating such scenarios. Their coarse spatial resolution leads, for instance, to a crude representation of precipitation-related processes (e.g., Giorgi and Mearns 1991; Frei et al. 2003). Precipitation outputs are therefore classically biased and often exhibit unrealistic spatial patterns (Salathé 2003) and variability (Bürger and Chen 2005). Moreover, they are often available at spatial and temporal scales that are much too coarse with respect to the scales required for operational purposes or impact studies. Similar limitations are known for other surface meteorological variables such as temperature, wind, and radiation. Statistical downscaling models (SDMs) are often used to overcome these limitations and produce the required meteorological scenarios from climate model outputs. They are based on the following twofold rationale: 1) local meteorological variables are strongly influenced by the state of the atmosphere and its circulation at the synoptic scale and 2) synoptic-scale atmospheric variables are better simulated by climate models. SDMs are thus based on empirical relationships established, for recent decades and generally for a daily time step, between a selection of large-scale atmospheric variables (called the predictors) and the required local meteorological variables (called the predictands).

A large number of SDMs have been proposed over the last two decades [see Maraun et al. (2010) for a review]. They are widely used to generate weather scenarios for past or future climates from outputs of climate models (e.g., Wilby et al. 1999; Hanssen-Bauer et al. 2005; Boé et al. 2007; Lafaysse et al. 2014). They can also be used to reconstruct weather scenarios from atmospheric reanalysis data [specific events as in Auffray et al. (2011) or time sequences covering 50–100 past years as in Mezghani and Hingray (2009), Kuentz et al. (2013), and Wilby and Quinn (2013)]. Another application is weather forecasting on the basis of the outputs of regional- or synoptic-scale numerical weather prediction models (e.g., Obled et al. 2002; Gangopadhyay et al. 2005; Marty et al. 2012, 2013). Perfect prognosis (also referred to as perfect prog) approaches (Maraun et al. 2010), mainly based on transfer functions or analog resampling methods, are SDMs of particular interest because they use the time variations of large-scale variables (e.g., pseudo-observations of the recent climate) to reproduce the time variations that would have been obtained for local-scale surface meteorological variables (e.g., observations for the same recent climate period).

Most SDMs have been used to generate precipitation and temperature time series as main meteorological variables for hydrological applications. Over recent years, SDMs have focused even more on other meteorological variables used in agro- and biometeorology and also climate-related energy sources (relative humidity, wind speed, potential evaporation, solar radiation, etc.). Key scientific challenges for future decades include the generation of relevant scenarios for multiple weather variables at multiple sites within a region of interest (Wilks 2012). A number of critical issues arise here. They mainly involve the difficulty of generating, from both a statistical and physical viewpoint, relevant weather scenarios with respect to 1) space–time fluctuations of each weather variable and 2) space–time cofluctuations between weather variables.

For instance, transfer functions (e.g., based on nonlinear regressions, artificial neural networks, or principal component analysis) have been widely used to generate different weather variables at a single site or areal averages. They can be extended to create multivariate and multisite (or even true spatial) generators, that is, adapted and used for multiple locations and multiple variables simultaneously (e.g., Roessler et al. 2012). The use of common large-scale predictors induces some spatial correlation between the generated predictands. Some temporal correlation (autocorrelation) is also obtained as a result of the significant persistence of driving atmospheric indices chosen as predictors (Buishand et al. 2004). However, the level of correlation obtained either in space or time may not be sufficient unless the generation at individual sites is forced by spatially and temporally correlated random numbers (e.g., Wilks and Wilby 1999; Mezghani and Hingray 2009). Moreover, reproducing the observed correlation structures from a statistical point of view does not guarantee that the generated space–time patterns are relevant from a physical point of view, especially for infrequent events. This may be, for instance, a critical limitation for the generation of major precipitation events presenting major spatial heterogeneities such as those frequently observed in regions with complex and marked topography (e.g., Mezghani and Hingray 2009).

A good alternative consists of nonparametric SDMs based on the *k*-nearest neighbor (kNN) resampling approach. These approaches have been widely used in recent years for the generation of daily weather variables at multiple sites (e.g., Buishand and Brandsma 2001; Gangopadhyay et al. 2005). A number of recent works have extended their application to the generation of a number of covariates [e.g., precipitation, temperature, relative humidity, longwave and shortwave radiation, and wind speed in Boé et al. (2007), Lee et al. (2012), and Lafaysse et al. (2014)]. Analog dates of the current generation day are searched for in the historical database on the basis of a similarity criterion. A daily state vector characterizing the daily atmospheric circulation and state is used to identify the days that are the most similar to the current day. The required surface weather variables observed for one or for a selection of the kNNs are then used as a weather scenario for the current day. A number of variants of the kNN approach have been presented over the last decade. Differences are related to 1) the vector of large-scale predictors [e.g., given fields of synoptic variables (Obled et al. 2002) or the vector of synthetic indices extracted from these fields via principal component analysis (PCA) (Zorita and von Storch 1999)], 2) the distance criterion used to identify the kNNs (e.g., Euclidean, Mahalanobis, and Teweles–Wobus), and 3) the method used to estimate the predictand from these kNNs. The analog method classically refers to the configuration where the nearest neighbor is selected as the scenario for each generation day (e.g., Zorita and von Storch 1999). A probabilistic estimation of predictands is also often achieved when all kNNs are retained as scenarios (Gangopadhyay et al. 2005; Marty et al. 2012; Lafaysse et al. 2014). 
The major advantage of kNN resampling approaches is that they do not require restrictive assumptions concerning the joint distribution of the different predictands. Therefore, they can be easily applied to the generation of nonnormally distributed data. As surface weather variables are sampled simultaneously from historical records for a given analog day, generated fields are physically realistic and consistent (because already observed) within each day. Generated weather variables are consequently expected to reproduce not only the observed distributions but also cross-correlations between variables and sites much better than parametric models (e.g., Mehrotra and Sharma 2007; Boé et al. 2007; Lee et al. 2012). This applies to the generation time step (usually daily) and to subgeneration time steps (e.g., hourly if such a resolution is available in the archive; Mezghani and Hingray 2009). The only limitation concerning the number of covariates generated and the space–time resolution of generated time series is related to the available data in the archive of observations.

These kNN approaches therefore offer a powerful means to generate physically relevant high-resolution space–time scenarios for impact studies. The large-scale predictors used to identify analog dates are, however, classically optimized for the site and the variable for which a prediction is needed. As a result, the optimal predictor set is expected to strongly depend on the considered region. This is illustrated by the varying predictive power of different types of individual predictors for different sites around the world (e.g., Cavazos and Hewitson 2005; Timbal et al. 2008) and also for sites located within small geographical domains (e.g., Reichert et al. 1999). When the same predictors can be retained, the optimal large-scale domain over which the similarity between daily predictors must be evaluated is also expected to depend on the site (Horton 2012; Radanovics et al. 2013). The analog dates obtained for a model optimized for a given site can of course be used for prediction at other sites. The resulting predictions, however, are likely to be suboptimal.

As precipitation is of major interest in hydrological impact studies, the present work is only focused on this meteorological variable. The methodology could however be easily applied to any other predictand. This paper explores the spatial transferability of analog dates identified from a locally optimized reference analog model (AM) for the probabilistic daily prediction of precipitation at neighboring sites throughout France. The experiment is based on precipitation estimates obtained on a grid from the Système d’Analyse Fournissant des Renseignements Atmosphériques à la Neige (SAFRAN) precipitation reanalysis (Quintana-Segui et al. 2008; Vidal et al. 2010). The large-scale analogy domain of the reference AM is optimized for each grid cell leading to a set of locally optimal analog models that can be in turn applied to precipitation prediction elsewhere. The spatial transferability of a model can be assessed by comparing its performance at each site with the performance obtained with the locally domain-optimized AM for this site.

The structure of this paper is as follows: Section 2 describes the data, the reference analog model, and the evaluation criteria used to assess the local performance of the models and the spatial similarity and transferability of analog dates. Results from the transferability assessment experiment are presented in section 3 and discussed in section 4. Finally, section 5 draws a number of conclusions.

## 2. Data, model, and evaluation

### a. Data

The large-scale predictors used to identify the analog dates are geopotential heights taken from the 40-yr European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-40) (Uppala et al. 2005), with a spatial resolution of 1.125° × 1.125°.

The predictand corresponds to the daily total precipitation taken from the SAFRAN near-surface reanalysis (Quintana-Segui et al. 2008; Vidal et al. 2010) with a resolution of 8 × 8 km^{2}. This local reanalysis covers France and includes a set of 8981 grid points. Precipitation values estimated for these grid points are considered as “pseudo-observations.” In the following, the analog SDM is applied to predict the precipitation on each SAFRAN grid point.

### b. The analog SDM

The analog model applied in this study is based on previous developments for probabilistic quantitative precipitation forecasts (Bontron and Obled 2005; Marty et al. 2012) for several catchments in France. The model involves four steps:

1) A seasonal filter is first applied in order to keep only candidates that belong to the same period of the year: calendar days are selected as candidates with a moving window of ±*j* calendar days centered on the target day. Note that the target day and neighboring days in a temporal window of ±5 days were excluded from the candidates.

2) A criterion for similarity between the target day and each candidate day is then computed. The chosen criterion is the Teweles–Wobus score (TWS; Teweles and Wobus 1954) that compares two geopotential gradients providing information on the origin of the air masses. Note that an analogy based on the TWS performs better than a classical Euclidean distance applied to the principal components previously derived for the predictor fields (here the geopotential heights) retained for the analogy (e.g., Guilbaud and Obled 1998; Wetterhall et al. 2005).

3) The candidate days are then sorted according to the TWS and the *N*_{d} nearest analog dates are kept.

4) The predictive precipitation for the target day is obtained from the empirical distribution of the precipitation values of the *N*_{d} nearest analogs.
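The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, the array layout, the use of a single geopotential field per day (the model described here combines several fields), and the 365-day calendar are all assumptions made for the example. The TWS is written in its usual form, 100 times the sum of absolute gradient differences divided by the sum of the larger absolute gradients, over both grid directions.

```python
import numpy as np

def teweles_wobus(target, candidate):
    """Teweles-Wobus score between two 2-D fields (lat x lon).

    0 means identical gradients, 200 means exactly opposite gradients.
    """
    num = 0.0
    den = 0.0
    for axis in (0, 1):  # gradients along both grid directions
        g_t = np.diff(target, axis=axis)
        g_c = np.diff(candidate, axis=axis)
        num += np.abs(g_t - g_c).sum()
        den += np.maximum(np.abs(g_t), np.abs(g_c)).sum()
    return 100.0 * num / den if den > 0 else 0.0

def select_analogs(fields, day_of_year, target, n_analogs=25,
                   season_window=30, exclusion=5):
    """Return (indices, scores) of the n_analogs best analog days.

    fields      : (n_days, ny, nx) array, one field per archive day
                  (hypothetical layout; contiguous daily archive assumed)
    day_of_year : (n_days,) array with values in 1..365 (leap days ignored)
    target      : index of the target day in the archive
    """
    day_of_year = np.asarray(day_of_year)
    idx = np.arange(len(fields))
    # Step 1: seasonal filter (circular calendar distance) and exclusion
    # of the target day and its close temporal neighbors.
    gap = np.abs(day_of_year - day_of_year[target])
    gap = np.minimum(gap, 365 - gap)
    keep = (gap <= season_window) & (np.abs(idx - target) > exclusion)
    candidates = idx[keep]
    # Step 2: similarity criterion for every candidate.
    scores = np.array([teweles_wobus(fields[target], fields[c])
                       for c in candidates])
    # Step 3: sort and keep the nearest analog dates.
    order = np.argsort(scores, kind="stable")[:n_analogs]
    return candidates[order], scores[order]

def predictive_sample(precip, analog_idx):
    """Step 4: the empirical predictive distribution is simply the
    precipitation observed on the selected analog dates."""
    return np.sort(np.asarray(precip)[analog_idx])
```

The probabilistic prediction is carried entirely by the returned sample: quantiles or exceedance probabilities for the target day can be read directly from it.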

The analog model requires optimization of several parameters, including the type or level of predictors, the domain used to compute the similarity criterion, and the similarity criterion itself. These parameters are expected to be dependent on the predictand (i.e., on the predictand target grid point in our case). Ideally these parameters should be optimized together as proposed by Horton (2012). For our domain, geopotential height fields of 1000 and 500 hPa at +12 h and +24 h were found to be the most informative predictors by Bontron (2004). The seasonal filter was set to one month before and after the target day. A preliminary analysis showed that *N*_{d} = 25 was a good compromise to obtain on average the best performance (the performance is expressed by the skill score as described later in section 2c) over France. For every grid point *p* in [1, …, 8981], the performance loss obtained when comparing an AM with a number *N*_{d} of analog dates equal to 25 to an AM with an optimal number of analog dates—varying between 10 and 40 for the SAFRAN grid points—is lower than 0.01 (results not shown).

The predictor domain was conversely optimized for each target grid point by maximizing the performance of the prediction. The domain optimization results from a growing rectangular analogy domain algorithm, as explained by Bontron (2004). The algorithm steps are the following:

1) The elementary ERA-40 grid cell centered on a given target SAFRAN grid point is chosen as an initial predictor domain.

2) Four alternative domains, each obtained by extending the initial predictor domain by one ERA-40 grid cell in one of the four cardinal directions, are tested as predictor domains. The domain obtaining the best score is kept.

3) Step 2 is repeated, taking the predictor domain kept in step 2 as the new initial predictor domain, until none of the four extended predictor domains yields a better performance. The final predictor domain obtained from this iterative procedure corresponds to the analogy domain with the optimal score. No tolerance criterion is used here.

This algorithm leads to what we will call “the locally domain-optimized analog model for the target predictand *k*,” denoted hereafter as AM_{k}.
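The growing-domain procedure is a greedy search, which the following Python sketch illustrates. It is a sketch under stated assumptions rather than the algorithm of Bontron (2004) itself: the domain is encoded as a hypothetical tuple of grid-cell indices `(west, east, south, north)`, `score` stands for any user-supplied function returning the prediction skill of a domain (higher is better), and `bounds` is an assumed outer limit that keeps the search inside the available predictor grid.

```python
def grow_domain(score, start, bounds):
    """Greedy rectangular domain growth.

    score  : callable, domain -> skill (higher is better); hypothetical
    start  : initial domain (west, east, south, north) in grid-cell indices
    bounds : outermost allowed domain, same encoding
    Returns the final domain and its score.
    """
    best, best_score = start, score(start)
    while True:
        w, e, s, n = best
        # One-cell extension toward each cardinal direction.
        trials = [(w - 1, e, s, n), (w, e + 1, s, n),
                  (w, e, s - 1, n), (w, e, s, n + 1)]
        trials = [d for d in trials
                  if d[0] >= bounds[0] and d[1] <= bounds[1]
                  and d[2] >= bounds[2] and d[3] <= bounds[3]]
        if not trials:
            return best, best_score
        scored = [(score(d), d) for d in trials]
        cand_score, cand = max(scored, key=lambda t: t[0])
        # No tolerance criterion: stop as soon as no extension improves.
        if cand_score <= best_score:
            return best, best_score
        best, best_score = cand, cand_score
```

Because the search only ever accepts strict improvements, it can stop at a local optimum of the score surface; the result is optimal only in the sense of this stepwise procedure.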

### c. Evaluation scores

Each analog model AM_{k} leads to a set of *N*_{d} analog dates for each issued prediction. These dates can be used for the prediction of precipitation for any target grid point *p*. As each issued prediction is based on an ensemble of *N*_{d} analog dates, the criterion used for the evaluation of the AM should be a probabilistic score. Inspired by common practice in Ensemble Prediction System (EPS) evaluation, we use a skill score based on the expected continuous ranked probability score ($\overline{\mathrm{CRPS}}$) introduced by Brown (1974) and Matheson and Winkler (1976). The $\overline{\mathrm{CRPS}}$ value for an AM_{k} used for prediction at a given grid point *p* is defined as

$$\overline{\mathrm{CRPS}}_{k}(p) = \frac{1}{M}\sum_{i=1}^{M}\int_{-\infty}^{+\infty}\left[F_{k,i}(x_{p}) - H_{p,i}(x_{p})\right]^{2}\,dx_{p}, \tag{1}$$

where $H_{p,i}$ and *F*_{k,i} respectively denote the cumulative distribution function (CDF) of the observations *o*_{p,i} and the CDF derived from AM_{k} for the issued prediction *i* and where *M* is the number of issued predictions. Also, *x*_{p} denotes the precipitation quantiles of the CDFs on grid point *p*. Note that $H_{p,i}$ corresponds to the Heaviside function where $H_{p,i}(x_{p}) = 1$ if *x*_{p} ≥ *o*_{p,i} and $H_{p,i}(x_{p}) = 0$ otherwise. For a prediction *i*, the integral in Eq. (1) is equal to the area of the squared differences between the predicted and the observed CDFs, as illustrated in Fig. 1 and detailed in the appendix. The $\overline{\mathrm{CRPS}}$ has several advantages:

- it evaluates the entire probabilistic prediction,

- it is a proper score [for the definition, see Eq. (1) in Gneiting and Raftery (2007), and Bröcker and Smith (2007)],

- it can be interpreted as an integral over all possible Brier scores (Brier 1950), and

- it is equal to the mean absolute error (MAE) for a deterministic prediction.
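As a minimal illustration of these properties, the CRPS of a single ensemble prediction can be computed in Python in two equivalent ways: by integrating the squared difference between the empirical predictive CDF and the Heaviside step at the observation, or with the well-known identity CRPS = E|X − o| − ½ E|X − X′| (Gneiting and Raftery 2007). The function names and the toy values below are assumptions made for the example; they are not taken from the study.

```python
import numpy as np

def crps_integral(members, obs):
    """CRPS as the exact integral of (F - H)^2 for an empirical CDF F."""
    x = np.sort(np.asarray(members, dtype=float))
    # F and H are piecewise constant between consecutive knots.
    knots = np.unique(np.append(x, obs))
    left = knots[:-1]
    F = np.searchsorted(x, left, side="right") / x.size
    H = (left >= obs).astype(float)
    return float(np.sum((F - H) ** 2 * np.diff(knots)))

def crps_energy(members, obs):
    """CRPS via E|X - o| - 0.5 * E|X - X'|."""
    x = np.asarray(members, dtype=float)
    term_obs = np.abs(x - obs).mean()
    term_spread = np.abs(x[:, None] - x[None, :]).mean()
    return float(term_obs - 0.5 * term_spread)

# Toy ensemble of four analog precipitation values (mm) and one observation.
ensemble = [0.0, 0.0, 2.0, 5.0]
observation = 1.0
print(crps_integral(ensemble, observation))  # 0.6875
print(crps_energy(ensemble, observation))    # 0.6875
# A one-member (deterministic) prediction reduces to the absolute error.
print(crps_energy([3.0], 1.0))               # 2.0
```

Averaging such values over the *M* issued predictions gives the expected score used in the text.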

The continuous ranked probability skill score (CRPSS) is used to compare the model performance to the performance of a reference prediction model. In the present work, the reference prediction model is simply a calendar climatology defined for each prediction day by the precipitation distribution of the days belonging to a seasonal window (±30 days) centered on the corresponding calendar day. The skill score of AM_{k} when applied to grid point *p* is defined as

$$\mathrm{CRPSS}_{k}(p) = 1 - \frac{\overline{\mathrm{CRPS}}_{k}(p)}{\overline{\mathrm{CRPS}}_{c}(p)}, \tag{2}$$

where $\overline{\mathrm{CRPS}}_{c}(p)$ corresponds to the $\overline{\mathrm{CRPS}}$ obtained by the climatological model applied to grid point *p*. CRPSS is a positively oriented score: a CRPSS equal to 1 corresponds to a perfect AM whereas a negative CRPSS indicates that the performance for the considered AM is worse than that of the climatological model. As $\overline{\mathrm{CRPS}}_{k}(p)$ and $\overline{\mathrm{CRPS}}_{c}(p)$ are expressed in the same units as the predictand, CRPSS is a dimensionless performance score, which makes it possible to compare the performance obtained between two grid points.

In addition to the CRPS-based skill score, the relative bias *b*_{k}(*p*) of the averaged scenario predicted by AM_{k} when applied to grid point *p* can be computed to evaluate the averaged predicted precipitation quantity as follows:

$$b_{k}(p) = \frac{\dfrac{1}{M}\sum_{i=1}^{M}\bar{x}_{k,i}(p) - \bar{o}(p)}{\bar{o}(p)}, \tag{3}$$

where $\bar{x}_{k,i}(p)$ is the mean precipitation scenario at grid point *p* predicted by AM_{k} for a given issued prediction *i* and $\bar{o}(p)$ is the observed interannual mean precipitation.

#### 1) Evaluation of spatial transferability

For a given grid point *p*, the predictions obtained with AM_{k} are expected to have lower performance than those obtained with AM_{p}. We further estimate the spatial transferability for a given AM_{k} using the following differences:

$$\Delta\mathrm{CRPSS}_{k}(p) = \mathrm{CRPSS}_{k}(p) - \mathrm{CRPSS}_{p}(p), \tag{4}$$

$$\Delta b_{k}(p) = b_{k}(p) - b_{p}(p), \tag{5}$$

where CRPSS_{k}(*p*) and CRPSS_{p}(*p*) correspond to CRPSS obtained when respectively applying AM_{k} and AM_{p} to grid point *p*, and *b*_{k}(*p*) and *b*_{p}(*p*) are the corresponding relative biases. For Δ*b*_{k}(*p*), a positive value means that AM_{k} predicts an increase of the averaged predicted precipitation compared to that estimated by AM_{p} at grid point *p*.
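The transferability diagnostics reduce to simple arithmetic once the mean CRPS values are available. The sketch below assumes the standard skill-score form CRPSS = 1 − CRPS_model / CRPS_reference, which is consistent with the properties stated above (1 for a perfect model, negative when worse than climatology), and takes the transferability measures as the transferred-minus-local differences. All function names and numerical values are hypothetical and serve only as an example.

```python
import numpy as np

def crpss(crps_model, crps_clim):
    """Skill score relative to the climatological reference."""
    return 1.0 - crps_model / crps_clim

def relative_bias(mean_scenarios, obs_mean):
    """Relative bias of the mean predicted scenario.

    mean_scenarios : mean of the analog precipitation values for each
                     issued prediction (length M)
    obs_mean       : observed interannual mean precipitation
    """
    return (np.mean(mean_scenarios) - obs_mean) / obs_mean

def crpss_loss(crps_k_at_p, crps_p_at_p, crps_clim_at_p):
    """CRPSS of the transferred model minus CRPSS of the local model,
    both evaluated at grid point p (negative values are a loss)."""
    return (crpss(crps_k_at_p, crps_clim_at_p)
            - crpss(crps_p_at_p, crps_clim_at_p))

# Illustrative numbers only (mm/day), not results from the study.
print(crpss(1.4, 2.0))            # about 0.30 for the local model
print(crpss_loss(1.5, 1.4, 2.0))  # about -0.05 when transferring
print(relative_bias([2.0, 4.0], 2.5))  # about 0.2, i.e., a 20% wet bias
```

Because both skill scores share the same climatological reference at grid point *p*, the loss is simply the difference of the two mean CRPS values scaled by the climatological CRPS.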

#### 2) Evaluation of spatial similarity of analog dates

We additionally estimate the similarity of analog dates obtained from two analog models AM_{x} and AM_{y}. Each issued prediction *i* ∈ [1, …, *M*] derived from these models is composed of *N*_{d} analog dates as described in Fig. 2. A daily similarity score *s*_{i} corresponding to the number of similar analog dates between the two predictions issued respectively by AM_{x} and AM_{y} (expressed hereafter as a percentage of *N*_{d}) can be computed for each prediction day *i*. To evaluate a mean similarity level between analog dates produced by AM_{x} and AM_{y}, we determine the proportion of issued predictions for which the daily similarity score *s*_{i} reaches or exceeds a given similarity level *s*_{0}:

$$O_{s_{0}} = \frac{1}{M}\sum_{i=1}^{M}\mathbb{1}\{s_{i} \ge s_{0}\}. \tag{6}$$

For a given prediction day *i*, $\mathbb{1}\{s_{i} \ge s_{0}\}$ denotes the indicator function equal to 1 if *s*_{i} ≥ *s*_{0} and 0 otherwise. A value of 60% for $O_{s_{0}}$, for instance, means that for 60% of the issued predictions, the number of similar analog dates between both models is at least *s*_{0}. For a given *s*_{0} threshold, the larger the value for $O_{s_{0}}$, the more similar the dates from the two models are. Conversely, for a given pair of models, the lower the level *s*_{0}, the greater the percentage of issued predictions $O_{s_{0}}$. Perfect equality between analog dates is obtained when *O*_{100%} is equal to 1.
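The similarity measure can be illustrated with a short Python sketch. It assumes that the analog dates of each model are stored as an (*M*, *N*_{d}) array of integer date identifiers and that the daily similarity is expressed as the fraction of dates common to both models; the function names and the toy arrays are hypothetical.

```python
import numpy as np

def daily_similarity(dates_x, dates_y):
    """Fraction of analog dates shared by two models, for each
    issued prediction (rows of the two (M, N_d) arrays)."""
    return np.array([
        len(set(a.tolist()) & set(b.tolist())) / len(a)
        for a, b in zip(np.asarray(dates_x), np.asarray(dates_y))
    ])

def mean_similarity(dates_x, dates_y, s0=0.8):
    """Proportion of issued predictions whose daily similarity
    is at least s0."""
    return float(np.mean(daily_similarity(dates_x, dates_y) >= s0))

# Toy example with M = 3 predictions and N_d = 5 analog dates.
x = np.array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])
y = np.array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 9], [6, 7, 8, 9, 10]])
print(daily_similarity(x, y))          # [1.  0.8 0. ]
print(mean_similarity(x, y, s0=0.8))   # 0.666...
print(mean_similarity(x, y, s0=1.0))   # 0.333...
```

Note that the order of the dates within a prediction is ignored: only membership in the set of retained analogs matters.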

## 3. Results

In the following sections, the optimization of the set of locally domain-optimized analog models AM_{p} is performed on the corresponding grid points *p* for the 20-yr period covering from 1 August 1982 to 31 July 2001. The transferability of analog models is evaluated over the same period. Note that the evaluation period could have been different from the calibration period. However, it would then have been difficult to determine to what degree any modification in model performance was due to temporal and/or spatial transferability.

### a. Evaluation of the locally domain-optimized analog model

In this subsection, we evaluate the performance when applying each model AM_{p} to the corresponding grid point *p* for the prediction of precipitation. The spatial distribution of the skill score CRPSS_{p}(*p*) over France is illustrated in Fig. 3. Its high spatial variability depends on the topography, with a high performance of 0.4 in the western part of the Massif Central and in the northern Alps. Lower prediction skill varying around 0.28 is observed in the plains, reaching 0.35 along the Atlantic coast. The important variability of precipitation in the Mediterranean region—characterized by a coefficient of variation ranging between 2.25 and 4.5 whereas this coefficient is lower than 2.25 for the other SAFRAN grid points (results not shown)—could explain the lowest skill observed in this region. The predictive power is lower for regions with more frequent convective precipitation where a weaker link with large-scale circulation can be observed.

The spatial distribution of the relative bias *b*_{p}(*p*) obtained from each AM_{p} is illustrated in Fig. 4. For a majority of grid points, the absolute value of relative bias of the mean scenario is less than 5%. However, two areas stand out from this trend:

- the southwestern area of France, where precipitation is underestimated with a relative bias of around −12%, and

- the northern Alps, with a bias of up to 10%.

In the following, the performance obtained with the model AM_{p} applied to grid point *p* is used as a benchmark for assessing the transferability of analog models to other grid points. The spatial transferability and similarity of analog dates have been evaluated for 12 analog models locally optimized for 12 grid points uniformly distributed over France. Results will be described for only two of these models, which are representative of the results obtained with the other transferred AMs.

### b. Evaluation of spatial transferability

Figures 5 and 6 illustrate the spatial transferability of two analog models locally optimized for two grid points: one located in the northwest of France (hereafter called the NW grid point) and one located in the southeast of France (hereafter called the SE grid point).

The analogy domain of AM_{NW} is illustrated in Fig. 7. It extends to the Atlantic Ocean and to the southwestern part of France. On the other hand, the analogy domain of AM_{SE} covers the southeastern part of France, extending over the Mediterranean Sea (see Fig. 7). The Atlantic Ocean and the Mediterranean Sea constitute the major sources of humidity for the generation of precipitation over France. As shown by Boé and Terray (2008) and Garavaglia et al. (2010), westerly Atlantic and southerly Mediterranean circulation is mainly responsible for the precipitation. In the following, AM_{NW} and AM_{SE} are applied to predict precipitation at each grid point *p* ∈ [1, …, 8981].

Figure 5a illustrates Δ*b*_{NW}(*p*) when applying AM_{NW} to all SAFRAN grid points. We observe that the relative bias increases by 5% points for grid points situated 100 km north of the NW grid point. For grid points with a *y* coordinate ranging from 2200 to 2400 km, Δ*b*_{NW}(*p*) remains between −5% and 5%. For *y* coordinates less than 2200 km, the bias is affected by the Massif Central where a decrease of 5% is observed in the western area whereas it is greater in the eastern part. The relative bias decreases by more than 10% points in the southeastern part of France.

A similar behavior is observed when transferring AM_{SE}. A bias increase of more than 5% is observed north of the SE grid point and west of the Massif Central whereas a bias decrease ranging between 5% and 10% can be seen on the island of Corsica. In the southwest and east of France, Δ*b*_{SE}(*p*) varies from −5% to 5%.

With respect to the CRPSS, the spatial transferability of AM_{NW} and AM_{SE} is very high. Figure 6a illustrates the CRPSS loss when transferring AM_{NW}. The absolute value of the loss is less than 0.05 for a large majority of grid points. It even becomes less than 0.01 for grid points in a roughly 500-km-wide region around the NW point, which extends to the Massif Central. A high loss equal to −0.13 is only obtained for the far southeastern region including Corsica.

The spatial transferability of AM_{SE} is smaller as illustrated in Fig. 6b. The region showing a −0.01 loss is 300 km wide and 500 km long. The CRPSS absolute loss remains smaller than 0.01 for more than 400 km when transferring AM_{SE} to the north. Conversely, ΔCRPSS_{SE} shows a large decrease from −0.01 to −0.05 when crossing the Massif Central from southeast to northwest, reaching a value equal to −0.15 in Brittany. The mountains of the Massif Central thus play an important climatological barrier role in the region. The same is expected for the Alps and also for the Pyrenees as illustrated by Bontron (2004).

The very small loss observed for large areas in Figs. 6a and 6b shows that the performance obtained when transferring AM_{k} to a broad region containing *k* is regionally similar to the performance of the locally domain-optimized AM.

### c. Similarity of analog dates

A reason for the small CRPSS loss discussed in section 3b could be that the analog dates obtained with two AMs respectively optimized for two close grid points do not differ greatly. The mean similarity level between analog dates produced by two different AMs is expected to depend on the number of analog days retained for the prediction. This is discussed later in section 4b. In the present work, we present the mean similarity level when 25 analog dates are retained.

To compare the 25 analog dates obtained from AM_{NW} and each locally domain-optimized AM_{p}, we estimated for each grid point *p* the percentage of issued predictions *O*_{80%} for which the daily similarity level exceeds the 80% daily similarity threshold.

The spatial distribution of *O*_{80%} is plotted in Fig. 8a. The percentage of issued predictions for the 80% daily similarity threshold is surprisingly low or very low, even where a very good spatial transferability of AM_{NW} is obtained. Around the −0.01 CRPSS loss contour, the percentage of issued predictions exceeding the 80% daily similarity threshold is around 20%. It vanishes to 0% where the CRPSS loss is around −0.05. The largest percentage (up to 100%) is obtained for a very small number of grid points located in the close neighborhood of the NW grid point.

A similar behavior is observed for AM_{SE} in Fig. 8b. Grid points near the −0.01 CRPSS loss contour obtain once again an *O*_{80%} value of around 20% and only a few points located near the SE grid point have a high value of *O*_{80%}. Nevertheless, similarity strongly depends on the relief because its pattern is stretched from south to north and delimited on either side by the Massif Central and the Alps.

Spatial transferability of an AM therefore requires neither the locally optimal analog dates nor a high level of similarity with them: even if the transferred analog dates differ from the optimal ones, they lead to similar predictions in terms of precipitation for grid points located quite far away. It can also be concluded that a similarity of the domain boundaries of the predictors used to identify the analog dates is not necessary for spatial transferability.

## 4. Discussion

### a. Extraction of rainy and dry days

This section aims to explain the increasing bias pattern obtained in this study when a transferred AM is used instead of the locally domain-optimized one. Note first that for the daily precipitation prediction for the 1982–2001 period, 25 analog dates are extracted for each day from the same period. If each date had been extracted exactly 25 times, the mean precipitation amount from the probabilistic prediction over the whole period would have been exactly equal to the climatology and thus to the observed mean. A wet bias occurs because wetter days are on average more frequently extracted than drier days (e.g., wet days versus dry days or rainy days with large precipitation amounts versus rainy days with small precipitation amounts). Thus an increase in the wet bias in the north using AM_{NW} suggests a larger extraction frequency of wetter days in this region. This is illustrated in Figs. 9a–d. In each figure, a point corresponds to one day of the period 1982–2001. The coordinates are presented on the second and third principal components (PCs) of a principal component analysis carried out using the geopotential heights for 1000 hPa (*Z*_{1000}) at +12 h and 500 hPa (*Z*_{500}) at +24 h over a larger domain (delimited by the coordinates 38.25°–50.625°N, 12.375°E–6.75°W and represented by thick black lines in Fig. 7). The first PC is not used here as it roughly represents the geopotential mean level, which is not accounted for by the Teweles–Wobus distance criterion used to identify the archive days. The first three PCs explain 69.2%, 14.1%, and 6.6% of the total variance, respectively. Figure 9a highlights the varying density of days according to their synoptic situation in this two-dimensional space. For each day of the archive period represented by its second and third PCs (PC2 and PC3) in Fig. 9a, we can compute the number of times that this day is chosen as one of the 25 analog dates by an AM for the *M* issued predictions.
If a day is extracted more than 25 times, this day is overextracted. If the number of extractions is lower than 25, the day is underextracted. Figures 9b and 9d illustrate the extraction frequency of each day obtained with respectively AM_{NW} and AM_{SE}. Figure 10a represents in the same way the local precipitation anomaly—defined as the ratio of the precipitation observed for each day over the daily average precipitation—for the NW grid point. The extraction bias from Fig. 9b combined with the precipitation anomalies from Fig. 10a explains the bias obtained for the reproduction of the mean interannual precipitation over the period. A similar interpretation is possible for Fig. 9d representing the day extraction applied by AM_{SE} to local precipitation anomalies of the SE grid point represented in Fig. 10b.
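The link between extraction frequency and bias can be made concrete with a small Python sketch. It assumes that analog dates are stored as archive day indices in an (*M*, *N*_{d}) array and uses a four-day toy archive with invented precipitation values; the function names are hypothetical. When every archive day is extracted the same number of times, the mean of all extracted precipitation values equals the archive mean; overextracting wet days produces a wet bias.

```python
import numpy as np

def extraction_counts(analog_dates, n_archive_days):
    """Number of times each archive day is selected as an analog."""
    return np.bincount(np.asarray(analog_dates).ravel(),
                       minlength=n_archive_days)

def mean_extracted_precip(analog_dates, precip):
    """Mean precipitation over all extracted analog dates, i.e., the
    extraction-frequency-weighted mean of the archive."""
    precip = np.asarray(precip, dtype=float)
    counts = extraction_counts(analog_dates, precip.size)
    return float((counts * precip).sum() / counts.sum())

# Toy archive of 4 days (mm); M = 4 predictions with N_d = 2 analogs each.
precip = np.array([0.0, 0.0, 10.0, 2.0])
uniform = np.array([[0, 1], [2, 3], [0, 1], [2, 3]])  # each day twice
wet_biased = np.array([[2, 3], [2, 3], [2, 0], [2, 1]])

print(extraction_counts(uniform, 4))               # [2 2 2 2]
print(mean_extracted_precip(uniform, precip))      # 3.0 (archive mean)
print(extraction_counts(wet_biased, 4))            # [1 1 4 2]
print(mean_extracted_precip(wet_biased, precip))   # 5.5 (wet bias)
```

Applying the extraction counts of one model to the precipitation series of another grid point reproduces, in this simplified setting, the mechanism by which a transferred AM changes the bias.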

Transferring AM_{SE} to the NW grid point amounts to applying the extraction scheme of AM_{SE} (Fig. 9d) to the local observed precipitation of the NW grid point (Fig. 10a). Figure 9c, which represents the difference in extraction frequency between AM_{SE} and AM_{NW}, shows that AM_{SE} tends to extract days with negative PC2 coordinates more often than days with positive ones. As the days having a negative PC2 coordinate correspond mostly to rainy days for the NW grid point, AM_{SE} generates more precipitation for the NW grid point than AM_{NW}. The increased positive bias difference observed in Fig. 5b when transferring AM_{SE} to the north of France can be explained in the same way. On the other hand, days with positive PC2 coordinates are more frequently extracted by AM_{NW}. These days are dry days for the SE grid point, as illustrated in Fig. 10b. Applying AM_{NW} to the SE grid point therefore increases the negative precipitation bias observed in Fig. 5b. This could be generalized to the south of France.

### b. Sensitivity of similarity and transferability to the number of analog dates

A study on the sensitivity of similarity and transferability to the number of analog dates *N*_{d} retained for each prediction was carried out. Results are presented for predictions deduced from AM_{NW} and AM_{SE}. Similar results are obtained for predictions issued from an AM optimized for other locations (not shown).

Increasing the number of analog dates is expected to increase the mean daily similarity level of analog dates obtained from two models AM_{x} and AM_{y}. Figures 11a and 11b illustrate the mean similarity level *O*_{80%} for AM_{NW} and AM_{SE} with *N*_{d} = 250. The region delimited by the *O*_{80%} = 0.7 contour (blue line in Figs. 11a and 11b) is 400–500 km wide, compared with only 100–200 km for 25 analog dates (see Figs. 8a,b).

Figures 12a and 12b show the CRPSS loss when applying either AM_{NW} or AM_{SE} with 250 analogs compared to the optimal configuration for 250 analogs. In both locations, the spatial pattern of ΔCRPSS is roughly the same as that for 25 analogs (see Figs. 6a and 6b, respectively). The spatial transferability of AM_{NW} and AM_{SE} tends, however, to be higher with 250 analogs, as demonstrated, for instance, by the smaller CRPSS loss observed toward the extreme southeast of France when applying AM_{NW} or toward the extreme west when applying AM_{SE}. The larger sample of analogs smooths the differences between the optimal predictions at different locations.
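
For readers who wish to reproduce this kind of comparison, the sketch below computes the CRPS of an analog ensemble from its empirical CDF, the corresponding skill score, and the loss incurred by a transferred model. It assumes that the CRPSS is defined relative to a reference prediction such as climatology and that the loss is the locally optimized CRPSS minus the transferred one. The function names are illustrative and are not those of the study.

```python
def crps_ensemble(members, obs):
    """CRPS of the empirical CDF of `members` against a scalar observation.

    Uses the identity CRPS = E|X - o| - 0.5 * E|X - X'|, which equals the
    integrated squared difference between the empirical CDF and the step CDF
    of the observation.
    """
    n = len(members)
    err = sum(abs(x - obs) for x in members) / n
    spread = sum(abs(x - y) for x in members for y in members) / (2.0 * n * n)
    return err - spread


def mean_crps(ensembles, observations):
    """Average CRPS over a series of predictions."""
    scores = [crps_ensemble(m, o) for m, o in zip(ensembles, observations)]
    return sum(scores) / len(scores)


def crpss(crps_model, crps_reference):
    """Skill score: 1 is perfect, 0 matches the reference (assumed climatology)."""
    return 1.0 - crps_model / crps_reference


def crpss_loss(crpss_local, crpss_transferred):
    """Skill lost when a transferred AM replaces the locally optimized one."""
    return crpss_local - crpss_transferred


# Toy check: two analog values bracketing the observation.
print(crps_ensemble([0.0, 2.0], 1.0))  # 0.5
```

The pairwise term costs O(*N*_{d}^2) operations per prediction, which remains affordable for the sample sizes considered here (25 to 500 analogs).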

Similar results were obtained when predictions were based on other sizes of analog sample. Figure 13a represents the proportion of grid points for which the mean similarity level *O*_{80%} is larger than a given value (cumulative probability plot of *O*_{80%}) for *N*_{d} = 25, 50, 100, 150, 250, and 500. For both NW and SE cases and in line with what was pointed out earlier, the percentage of grid points that exceed a given value of *O*_{80%} (e.g., 0.7) increases with *N*_{d}. The spatial transferability of both models tends to increase with *N*_{d}, very slightly for NW, especially for small *N*_{d} values, but much more significantly for SE (Fig. 13b).
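
The exceedance curves described above can be sketched in a few lines. The similarity values below are invented for illustration, and `exceedance_fraction` is a hypothetical helper rather than part of the study's code.

```python
def exceedance_fraction(values, threshold):
    """Proportion of grid points whose value is larger than `threshold`."""
    return sum(v > threshold for v in values) / len(values)


# Hypothetical similarity levels at four grid points.
similarity = [0.2, 0.5, 0.75, 0.9]
curve = [(t, exceedance_fraction(similarity, t)) for t in (0.3, 0.7)]
print(curve)  # [(0.3, 0.75), (0.7, 0.5)]
```

Evaluating the function over a range of thresholds, once per value of *N*_{d}, gives one curve per analog sample size.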

Note also that the skill of the locally domain-optimized AM significantly decreases as *N*_{d} increases. This can be seen in Fig. 13c representing the CDF of the optimal CRPSS for different analog sample sizes. The median value of the optimal CRPSS decreases from 0.3 to 0.15 as *N*_{d} increases from 25 to 500. As increasing the size of the analog sample leads to only a slight gain in spatial transferability, the performance of the transferred AM also decreases significantly. Whereas the skill of the method can be optimized through an appropriate selection of the number of analog dates, its spatial transferability is much less dependent on this value and seems to be more an intrinsic characteristic of the region, related to the climatological barriers induced by significant mountain ranges.

### c. Sensitivity of spatial transferability to the selection of the analog days

The spatial transferability is expected to depend on the predictors chosen to identify the analog dates. In the present section, we present the same analyses as in the previous sections with a two-level AM (hereafter referred to as AM2) that additionally includes atmospheric humidity as a predictor (e.g., Horton 2012; Marty et al. 2012). The 25 analog dates retained for the prediction are also chosen in the 1982–2001 historical period of the ERA-40 reanalysis but they are, for each target day, selected within a subset of 60 preselected candidate dates that correspond to the nearest spatial patterns in terms of atmospheric circulation. As in the one-level AM case considered in the previous sections (hereafter referred to as AM1), the preselection of the 60 candidate dates is based on the Teweles–Wobus distance for the 1000- and 500-hPa geopotential height fields. Following Marty et al. (2012), the second-level selection of the 25 dates is based on the Euclidean distance for the humidity variable defined as the product of precipitable water and relative humidity at 850 hPa. As previously, the spatial domain used for the first analogy level was optimized for the prediction at each SAFRAN grid point. The humidity predictor also depends on the SAFRAN grid point: it is an inverse-distance weighted average computed over the four neighboring ERA-40 grid points.
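
The two-level selection and the interpolation of the humidity predictor can be sketched as follows. This is a minimal illustration under stated assumptions: the circulation distances are taken as precomputed inputs (e.g., Teweles–Wobus scores), the humidity predictor is a tuple of one or more values per day, distances between grid points are computed in planar coordinates, and the weighting exponent `power` is a free choice. None of the names come from the study.

```python
import math


def idw_average(target, points, values, power=1.0):
    """Inverse-distance weighted average of `values` given at `points`."""
    weights = []
    for (x, y), v in zip(points, values):
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return v  # the target coincides with a grid point
        weights.append(1.0 / d ** power)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def two_level_analogs(circ_dist, humidity, target_humidity, n_pre=60, n_final=25):
    """Select analog days in two steps.

    circ_dist[d]: circulation distance between archive day d and the target day.
    humidity[d]: humidity predictor of archive day d (tuple of floats).
    """
    # Level 1: keep the n_pre archive days closest in atmospheric circulation.
    candidates = sorted(range(len(circ_dist)), key=circ_dist.__getitem__)[:n_pre]
    # Level 2: among them, keep the n_final days closest in humidity.
    candidates.sort(key=lambda d: math.dist(humidity[d], target_humidity))
    return candidates[:n_final]


# Toy example with 5 archive days, 3 preselected candidates, 2 final analogs.
circ = [0.5, 0.1, 0.9, 0.2, 0.3]
hum = [(1.0,), (5.0,), (0.0,), (2.0,), (2.5,)]
print(two_level_analogs(circ, hum, (2.2,), n_pre=3, n_final=2))  # [3, 4]
```

In the toy example, day 1 is the best circulation analog but is discarded at the second level because its humidity differs most from that of the target day. In practice the target day itself would have to be excluded from the archive before the selection.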

The mean daily similarity level of analog dates obtained with two AM2 models when locally domain optimized for different grid points is very similar to what was found before (for the locally domain-optimized AM1 models), that is, low to very low (not shown). The spatial transferability of analog dates obtained with AM2 models is, on the other hand, rather different from that obtained with AM1 models. Figures 14a and 14b illustrate the CRPSS loss obtained for each SAFRAN grid point when the AM2 models optimized for the NW and SE grid points are applied instead of the locally optimized ones. For each AM2 model (AM2_{NW} or AM2_{SE}), the CRPSS loss increases much faster than the one obtained from the transfer of the corresponding AM1 model (respectively AM1_{NW} or AM1_{SE}). When the CRPSS loss is 0.01 with the AM1 model, it becomes higher than 0.05 with AM2. The area for which a 0.01 CRPSS loss is obtained covers a roughly 200-km-wide region with AM2 compared to the 500 km previously found with AM1.

As already shown by Bontron (2004) for a number of rain gauges in southeastern France, the integration of the humidity predictor improves the prediction skill of the locally domain-optimized AMs for every location in France. The CRPSS gain between AM1 and AM2 is roughly 0.03 for a large part of France (Fig. 15). However, because AM2 is less transferable, the locally better-performing AM2 model performs worse than the locally weaker AM1 model when both are transferred to distant locations. This is highlighted in Fig. 16, where the performance of the transferred AM2_{NW} model is better than the performance of the transferred AM1_{NW} model only for the grid points that are within a 150–500-km-wide area around the NW grid point. The same conclusion can be drawn for the SE grid point. Farther away, the transferred AM1 model presents the higher skill. The increased performance afforded by the introduction of humidity therefore makes the model much more region-specific and thus much less transferable.

It is interesting to note that these results are in line with those presented in section 4b. We found that increasing the number of analog days used for the prediction 1) slightly increases the spatial transferability of AM1 but 2) drastically reduces its overall skill. A similar conclusion is also obtained when the length of the archive period is modified. Timbal et al. (2003) showed that the larger the archive from which the analogs are selected, the higher the skill of the prediction. The *k*-nearest neighbors of any day are actually expected to be better analogs of this day when they are identified in a longer archive. We also tested the transferability of the 25 nearest analog dates when identified in a longer archive (1959–2001 instead of 1982–2001). As expected, the skill of the locally optimized models increases (by roughly 0.05 CRPSS points) but again the transferability of the AM decreases (not shown).

## 5. Conclusions and perspectives

The spatial transferability of analog dates for the probabilistic prediction of local precipitation is a key requirement for the generation of spatially and physically consistent weather scenarios for impact studies. It was first explored in the present study over France with a basic domain-optimized analog model based on the large-scale circulation (i.e., *Z*_{1000} at +12 h and *Z*_{500} at +24 h).

The mean similarity level between analog dates obtained for AMs optimized for two different locations is low to very low. Despite this, the spatial transferability of analog dates, with respect to the CRPSS of the prediction, is high to very high. The loss of performance obtained using a transferred model instead of the locally domain-optimized model remains very low over large spatial domains, which can sometimes be 500 km wide. The spatial transferability of a given model, however, is not isotropic and especially depends on the topography of the studied region. The spatial range of transferability is reduced in the presence of major mountainous areas. The Massif Central, for instance, constitutes a clear meteorological barrier in France.

Using more analog dates for the prediction increases the similarity of analog dates. It leads, however, to only a very slight increase in their spatial transferability and to a large decrease in the absolute prediction skill of the model. For the studied area, the prediction skill is therefore much more sensitive to this parameter than to the large-scale domain used to identify the analogs, provided that the different sites for which a prediction is required are not too far from each other from a meteorological (and topographical) point of view.

The spatial transferability of the analog dates was found to depend on the parameters of the AM used for their identification. In France, it is lower when the archive period is longer and when humidity is included as a second-level predictor.

Results are of course expected to be dependent on the specific geographical and meteorological context of the studied area. In recent years, the climate and weather forecast research communities in particular have made major efforts to develop high-resolution gridded reanalysis of precipitation for large domains at the country or continental scales from dense networks of daily rain gauge data [e.g., in French mountainous regions (Gottardi et al. 2012) and in Europe (Haylock et al. 2008), North America (Maurer et al. 2002), South America (Liebmann and Allured 2005), or Asia (Xie et al. 2007)]. These datasets provide a powerful opportunity to better explore the spatial transferability of analog dates across larger domains and the possibility to generate spatially consistent scenarios for regional scale impact studies.

Results obtained in this paper highlight some practical solutions for the generation of weather scenarios suited for regional-scale impact studies:

A high level of similarity in optimal analog dates is not mandatory for the spatial transfer of an AM. Even when the level of similarity is low, optimal analog dates for a given site have potentially a quasi-optimal predictive power for remote locations. As a consequence, a strict similarity of the optimal large-scale analogy domains identified for the prediction at different locations is also not mandatory for the transferability of the optimal analog dates from one location to the other.

The generation of probabilistic precipitation prediction with the studied AM was found to produce a nonnegligible bias in the mean interannual daily precipitation. This results from the fact that historical days are not extracted with the same frequency over the simulation period. The transfer of analog dates to remote sites is expected to modify the bias depending on both the location of the target and of the grid points used for the optimization of the AM but the overall bias remains lower than 20%. As several sets of analog dates can lead to the same performance, it can be supposed that a set of analog dates exists that reduces the observed absolute bias when transferring an AM for locations situated far away. Optimizing the analogy criterion by minimizing the resulting absolute bias could lead to analog dates providing similar performance and lower biases.

When the analog dates are identified from geopotential height fields, the spatial transferability of locally domain-optimized analog dates is high and a quasi-optimal prediction can be obtained within large areas with analog dates identified from a unique analogy domain.

The refinement of an analog prediction model is potentially detrimental to its spatial transferability. In the present study, when humidity is included as a second-level predictor, the local skill of a locally domain-optimized model is improved but its spatial transferability is found to be much lower. This does not mean that the overall performance of the refined model becomes lower than the performance of the rough model. In the present case, the skill improvement due to the introduction of humidity is larger than the skill loss resulting from its transfer to locations that are up to 400 km from the grid point used for the optimization. Depending on the extent of the domain for which scenarios are required, a lower spatial transferability can be limiting, and a compromise may be necessary between the performance of the AM and its spatial transferability. If spatially consistent scenarios have to be produced for very large domains, an AM with lower skill may be preferred because it is more transferable in space.

The analog predictions could be easily used to generate probabilistic scenarios for every other gauge with available observation data and located in the neighborhood of those over which the analogy domain was optimized, provided that mountainous regions are not located in between them. The analog approach therefore also has the potential of generating spatially consistent precipitation scenarios at the scale of regional impact studies covering tens of thousands of square kilometers or even more. As analog dates can also be used to extract nonprecipitation variables, spatial and physical consistency is also expected for multivariate and multisite scenarios at these scales. However, the spatial transferability of analog dates may be quite different for other predictands. Further work should be carried out to explore this.

The large spatial transferability highlighted here was obtained with the analog dates derived for a locally domain-optimized AM. In the present work, the predictand is one of the 8981 (8 × 8 km^{2}) grid precipitation estimates obtained via optimal interpolation from a country-wide network of 3000–4500 stations (depending on the year). This strongly suggests that a single gauge would be enough to identify the locally optimal large-scale analogy domain and that the corresponding analog dates would also show such a high spatial transferability. However, a robust identification of the optimal large-scale analogy domain for a given region could not necessarily be achieved from a single-site optimization, especially in areas where precipitation triggering processes are significantly different from one place to another (e.g., in areas with marked relief). In such a case, a more robust strategy would be to carry out a multistation optimization or an optimization for the prediction of the daily area-average precipitation for the region. This is expected to decrease the prediction skill at individual locations but possibly increase the size of the transferability domain.

The spatial transferability of the AM was studied here considering all days of the 1982–2001 period. It is however expected to vary according to the season and/or the weather regime of the day. For anticyclonic conditions over France, for instance, a higher transferability is expected as precipitation is generally zero over large areas. The spatial transferability for wet days is also expected to be larger when precipitation is most often stratiform with wide coverage than when it is convective and therefore local. A similar analysis conditioned by weather regimes could therefore provide a better characterization of analog date spatial transferability.

## Acknowledgments

This work is part of a Ph.D. thesis funded by the French Ministère de l’Enseignement Supérieur et de la Recherche (MESR). We would especially like to thank the three anonymous reviewers for their relevant comments and suggestions that allowed us to improve this study and broaden its scope.

### APPENDIX

#### Graphical Interpretation of the CRPS for a Given Prediction

Determining the CRPS_{k}(*i*, *p*) applied to grid point *p* for a given prediction *i* consists in comparing the CDF *F*_{k,i} predicted by AM_{k} with that of the observation *o*_{p,i}. By definition, the CDF of the observation equals 0 when *x*_{p} ∈ ]−∞, *o*_{p,i}[ and 1 otherwise. The CRPS_{k}(*i*, *p*) can thus be divided in two subintegrals as proposed in Eq. (A1):
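
Assuming the standard definition of the CRPS as the integral of the squared difference between the predicted CDF and the step-function CDF of the observation, the split into two subintegrals can be sketched as follows. The symbol *H* for the CDF of the observation is introduced here for illustration only.

```latex
\begin{align}
H_{o_{p,i}}(x_p) &=
  \begin{cases}
    0, & x_p < o_{p,i},\\
    1, & x_p \ge o_{p,i},
  \end{cases}\\
\mathrm{CRPS}_k(i,p)
  &= \int_{-\infty}^{+\infty}
     \left[F_{k,i}(x_p) - H_{o_{p,i}}(x_p)\right]^2 \mathrm{d}x_p\\
  &= \int_{-\infty}^{o_{p,i}} F_{k,i}(x_p)^2\,\mathrm{d}x_p
   + \int_{o_{p,i}}^{+\infty} \left[F_{k,i}(x_p) - 1\right]^2 \mathrm{d}x_p .
\end{align}
```

The first integral measures the probability mass that the prediction places below the observation, and the second measures the mass placed above it.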

By adding and removing 1 in the second right integral of Eq. (A1), we obtain

## REFERENCES

*C. R. Acad. Sci.,* **327**(3), 181–188.

*Cold and Mountain Region Hydrological Systems under Climate Change: Towards Improved Projections,* A. Gelfan et al., Eds., IAHS Publ. 360, 19–25.

*Water Resour. Res.,*

*Rev. Geophys.,* **48,** RG3003.

*Quart. J. Roy. Meteor. Soc.,* **131,** 2961–3012.

*J. Climate,* **12,** 2474–2489.