## 1. Introduction

Knowledge of daily time series of variables such as temperature, wind, precipitation, and solar radiation is of great value to industry. For example, the activities of energy companies are greatly affected by temperature and wind conditions. Heating and cooling degree-days are frequently used as a measure of energy demand (Johnson et al. 1996). Energy companies also use information on surface winds for wind energy resource assessment and for the deployment of wind turbines to generate electricity.

Several approaches have been proposed for the generation of daily time series (Grondona et al. 2000). Stochastic weather models are developed to provide weather data series long enough to be used in the assessment of risk in hydrological or agricultural applications. These models can simulate many realizations and thus can provide a wide range of feasible weather situations (Semenov et al. 1998). However, according to Katz (1996), an issue of concern is that stochastic models fitted to time series of daily weather variables tend to underestimate the observed variance of monthly mean temperature (Hansen and Driscoll 1977; Young 1994) or monthly total precipitation (Gregory et al. 1993; Katz and Parlange 1993). This situation in which the observed variance exceeds that for the fitted model is termed overdispersion (Cox 1983).

Dreveton and Guillou (2004) developed a technique for time series generation based on principal component analysis (PCA). Their approach consists of performing PCA to create independent variables, the values of which are then generated separately with a random process. This method produces a better match to the observed (interannual) variance than previous methods and represents a better approach to reduce the impact of the overdispersion.

For some applications, it is important to generate information on the regional weather and its temporal evolution. Large-scale datasets do not account for mesoscale phenomena that may be of importance. Several techniques incorporate regional or mesoscale phenomena into the analysis of large-scale data; they are known as downscaling or regionalization techniques. In principle, these methods are divided into three categories: 1) statistical–empirical methods, 2) dynamical or nesting methods, and 3) statistical–dynamical methods. The statistical–empirical methods use regional and global observations to derive statistical relations between large-scale and regional-scale anomalies (Heyen et al. 1996; Benestad 2001; Huth 2002). Such relations, however, cannot be found and applied in all regions, especially if the link between the large-scale weather and the local phenomenon is weak. The dynamical or nesting method uses a mesoscale model to downscale the global data for a certain region. Rife et al. (2010) used a mesoscale model to study the global distribution of diurnally varying low-level jets, and Hahmann et al. (2010) developed a reanalysis system for the generation of mesoscale climatologies. The use of mesoscale models allows the estimation of the wind resource to take into account mesoscale phenomena, such as the channeling of wind by wide valleys, if the large-scale climatological forcing is correctly specified. The mesoscale model is driven by the global data and the boundary conditions. Simulation periods on the order of several tens of years are necessary to reveal representative classes of weather situations or weather patterns, which is a computationally expensive task. The statistical–dynamical approach (Frey-Buness et al. 1995; Frank and Landberg 1997; Yu et al. 2006) links global data with mesoscale model simulations using statistics about large-scale weather patterns.
It requires fewer computer resources without being restricted to short periods of data, and its use is not limited to regions where sufficient data are available. The mesoscale model is driven by classified weather patterns on a selected subdomain without being permanently forced by the global data. The basic assumption behind the statistical–dynamical downscaling procedure is that regional climate is associated with a specific frequency distribution of basic large-scale weather patterns. In principle, there are several possibilities to define suitable large-scale weather patterns. The selection of appropriate weather patterns, however, must balance representativeness against computational economy: the more weather patterns are used, the better the representativeness, but the higher the computational cost (Frey-Buness et al. 1995).

Wippermann and Gross (1981) and Heimann (1986) were probably the first to show the potential of statistical–dynamical downscaling methods with respect to the frequency distributions of surface wind directions. Frey-Buness et al. (1995) extended the approach to regional climate analysis and regional climate change studies. The classification scheme in Frey-Buness et al. (1995) uses a minimum of 12 classes of large-scale wind directions, two classes of seasons (summer or winter), and two classes of static stability (fair weather or poor weather), for a total of 48 classes. Following the concepts described in Frank and Landberg (1997) and Frank et al. (2001) and the statistical–dynamical procedure of Frey-Buness et al. (1995), Yu et al. (2006) developed the Wind Energy Simulation Toolkit (WEST) to estimate the wind resource and applied it to generate a wind atlas of the Gaspé Peninsula, which is located along the south shore of the Saint Lawrence River in Québec, Canada, and extends into the Gulf of Saint Lawrence. WEST was built as a stand-alone and complete toolkit that includes mesoscale and microscale modeling systems and a statistical module. The classification scheme used in Yu et al. (2006) includes several key meteorological parameters to define the “classes,” such as the direction and speed of the geostrophic wind at 0 m above mean sea level (MSL) and the vertical geostrophic wind shear (wind speed difference between 1500 and 0 m MSL). The geostrophic wind direction is classified into 16 sectors, and each sector has 14 wind speed classes. Each combination of sector and speed class is then divided into two classes (positive or negative) according to the sign of the geostrophic wind shear. Cutler et al. (2006) considered clustering techniques as an alternative to optimize the previously existing classification schemes and applied them to the generation of numerical wind atlases for Ireland and Egypt. The clustering technique used by Cutler et al. (2006) can include wind speed, wind direction, and thermal stability from different heights in the classification scheme. It was shown that the clustering method is able to produce results at least as accurate as the previously existing methods but with less computational effort.
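The class counts quoted above follow from multiplying the number of categories along each classification axis. The short sketch below reproduces that arithmetic; the helper name `n_classes` is hypothetical. The product for the Yu et al. (2006) scheme is only an upper bound on the number of combinations, and this paper quotes approximately 330 classes actually used in that work; reading the difference as unpopulated combinations is an assumption, not something stated in the text.

```python
def n_classes(*factors: int) -> int:
    """Multiply the number of categories along each classification axis."""
    total = 1
    for f in factors:
        total *= f
    return total

# Frey-Buness et al. (1995): 12 wind directions x 2 seasons x 2 stability classes.
frey_buness = n_classes(12, 2, 2)

# Yu et al. (2006): 16 direction sectors x 14 speed classes x 2 shear signs.
# This is the number of possible combinations, not the number of classes used.
yu_upper_bound = n_classes(16, 14, 2)

print(frey_buness)     # 48
print(yu_upper_bound)  # 448
```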

In general, all these statistical–dynamical downscaling methods are strongly constrained by the use of the geostrophic wind and hydrostatic approximations to determine the weather patterns, and the total computational cost can be very high because the number of weather patterns determined can be very large. For example, Yu et al. (2006) used a total of approximately 330 classes to describe the large-scale climate of the Gaspé region represented by a tile of 875 km × 875 km, which implies running the mesoscale model 330 times. An individual tile cannot be much bigger if the homogeneous geostrophic wind and hydrostatic approximations are to hold. Therefore, applying the method to bigger domains requires several tiles to cover the entire region of study, which implies running the mesoscale model separately for each tile and increases the computational cost even more. Another downside of these methods is that no information on the temporal variation (time series) of the regional weather can be obtained; only information on the spatial variations is obtained. Furthermore, the determined weather patterns do not necessarily relate to the real features of the local flow, and they may not be physically relevant. There are, however, some commonly used techniques for compressing the statistical information (variance) in multivariate datasets that reveal statistically independent patterns of flow that can often be associated with different physical processes. These techniques make it possible to examine the temporal and spatial variations of the flow patterns separately. One of these techniques was introduced into the atmospheric sciences by Lorenz (1956) and was termed the empirical orthogonal function (EOF) method.
The EOF technique has many desirable properties: the revealed patterns are based on the characteristics of the data themselves, they are linearly independent of one another, and a measure of their relative importance is provided.

The main goal of this research is to present a statistical–dynamical downscaling procedure with a statistical module based on EOF analysis to generate large-scale atmospheric patterns that are then adapted to high-resolution terrain and surface roughness and used for regional time series construction and numerical wind atlas generation. As in the previous works, the newly developed method also assumes that the regional weather can be determined from the large-scale weather. The assumption in this work, however, is not based on the criterion of the frequency of occurrence associated with large-scale weather patterns; instead, a new criterion based on the statistical variance explained by the large-scale atmospheric component patterns (expressed through EOFs) is established. Regional time series of temperature and wind speed are constructed from the downscaling procedure. The technique is applied to generate a numerical wind atlas of the Gaspé Peninsula, and the results are validated using observations at different observation masts located in the region. The generated wind atlas is then compared with that obtained in Yu et al. (2006) for the same region.

In section 2 the new method is presented. The EOF technique for vector datasets is briefly reviewed, the dataset and the formula for the construction of the atmospheric component patterns are presented, the new statistical–dynamical downscaling scheme is described, the settings and main assumptions for the mesoscale integrations are discussed, and the formulation for the construction of regional time series is introduced. The results from the application of the EOF technique, the mesoscale numerical simulations, and the validations are presented in section 3. A summary and conclusions can be found in section 4.

## 2. Methodology

In this section the new statistical–dynamical downscaling scheme is presented. The EOF method is briefly revisited. The dataset used for the EOF analysis, the model initial conditions, and the model settings are discussed. The formulation for the construction of the regional time series is discussed at the end of this section.

### a. Statistical–dynamical downscaling

The new statistical–dynamical procedure can be summarized in the following main steps: 1) The long-term time mean (large-scale basic state) and the anomalies of wind, temperature, geopotential height, and specific humidity are computed from the global dataset [National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis data]. 2) The EOF technique is applied to extract the dominant spatial patterns (EOFs) and the associated principal components (PCs) from the anomalies. 3) The *n*th EOF multiplied by the standard deviation of the *n*th PC is added to the large-scale basic state to form what we call the *n*th large-scale atmospheric component pattern. 4) The large-scale basic state and the *n*th large-scale atmospheric component pattern are dynamically downscaled using a mesoscale model to produce the regional basic state and the *n*th regional atmospheric component pattern, respectively. 5) The regional counterpart of the *n*th large-scale EOF is produced by subtracting the regional basic state from the *n*th regional atmospheric component pattern and then dividing the result by the standard deviation of the *n*th PC. 6) Regional time series of temperature and individual (zonal and meridional) winds are constructed and a wind atlas is generated.

For validation purposes, the obtained wind atlas is compared with that created in Yu et al. (2006). Furthermore, the constructed regional time series are compared with observations at different masts located in the Gaspé Peninsula.

### b. EOF analysis method

Application of the EOF analysis in meteorological research is not a new practice. EOF techniques were used by Lorenz (1956) in numerical prediction. North (1984) studied the connection between EOFs and normal modes and provided insight into the EOF interpretation of data fields by the study of fields generated by linear stochastic models. In our study, temperature, geopotential height, specific humidity, and wind vector are to be represented for the purpose of data analysis and then are to be used as initial conditions for numerical integrations. For this type of application, a variation of the standard EOF technique that merges the datasets of several meteorological fields (wind, temperature, humidity, etc.) into one input data matrix is the most suitable. The first EOF analysis of this type was performed by Kutzbach (1967) to examine surface pressure, temperature, and precipitation over North America. There are two variations of the EOF technique to deal with vector datasets (e.g., wind datasets). The individual winds (e.g., zonal and meridional components) can be represented by complex numbers (Hardy and Walton 1978) or the real components (Ludwig et al. 2004). Here, the real-components approach is used. More information on vector EOF methods can be found in Kaihatu et al. (1998).

Consider the vector $\mathbf{r}_i = (x_i, y_i, z_i)$ that determines the position of the $i$th grid point; the column vector $\boldsymbol{\Psi}(\mathbf{r}_i, t) = (Z, T, H, U, V)^{\mathrm{T}}$ formed by the combination of the geopotential height $Z(\mathbf{r}_i, t)$, temperature $T(\mathbf{r}_i, t)$, specific humidity $H(\mathbf{r}_i, t)$, zonal wind $U(\mathbf{r}_i, t)$, and meridional wind $V(\mathbf{r}_i, t)$ datasets, where $t$ denotes time and T denotes the matrix transposition operation; and the basic state $\boldsymbol{\Psi}_0(\mathbf{r}_i)$ determined as the long-term time mean of $\boldsymbol{\Psi}(\mathbf{r}_i, t)$ so that $\boldsymbol{\Psi}_0(\mathbf{r}_i) = (Z_0, T_0, H_0, U_0, V_0)^{\mathrm{T}}$, where the subindexes 0 refer to basic-state fields. Then, $\boldsymbol{\Psi}(\mathbf{r}_i, t)$ can be determined following the flow decomposition:

$$
\boldsymbol{\Psi}(\mathbf{r}_i, t) = \boldsymbol{\Psi}_0(\mathbf{r}_i) + \sum_{n=1}^{N} a_n(t)\,\boldsymbol{\psi}_n(\mathbf{r}_i). \tag{1}
$$

The second term on the rhs is the sum of $N$ products of two terms: 1) the scalar terms $a_n(t)$, which are usually called PCs and are functions of time, and 2) the column vectors $\boldsymbol{\psi}_n(\mathbf{r}_i) = (Z_n, T_n, H_n, U_n, V_n)^{\mathrm{T}}$ that denote the $n$th EOF and are functions of space. For optimal performance during the EOF calculations, this work uses anomalies weighted by the geographical area (geographical weighting) and by the variable (using the standard deviation of individual variables).

The EOFs $\boldsymbol{\psi}_n$ have the following properties: 1) they are linearly independent; 2) they are unit vectors obtained by normalizing the eigenvectors of the covariance matrix $\overline{\boldsymbol{\Psi}'\boldsymbol{\Psi}'^{\mathrm{T}}}$, where $\boldsymbol{\Psi}'$ represents the anomaly fields ($\boldsymbol{\Psi}' = \boldsymbol{\Psi} - \boldsymbol{\Psi}_0$) and the overbar denotes time average; 3) they are arranged in descending order of explained variance; and 4) at any time $t$, the coefficients $a_n(t)$ are given as $a_n(t) = \langle \boldsymbol{\Psi}'\boldsymbol{\psi}_n \rangle$, which represents the projection of the disturbance vector onto the EOFs. The angle brackets denote domain integration.

Because the terms on the rhs of (1) are arranged in descending order of importance (where importance is defined as the amount of variance explained by each EOF), a few leading terms generally provide good estimates of the atmospheric fields. The agreement between observations and the estimates provided by (1) will improve as terms are added, but the effect of each additional term will usually be smaller. In most common large-scale weather applications, 15 modes are usually enough to explain about 80% of the total variance.
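The decomposition and the EOF properties described in this subsection can be illustrated on a toy dataset. The following is a minimal sketch, not the authors' implementation: it extracts only the leading mode, uses power iteration in place of a full eigen-decomposition, replaces the domain integration by a plain sum over grid points, and omits the area and variable weighting. The function name `eof_leading_mode` and the synthetic numbers are hypothetical.

```python
import math

def eof_leading_mode(samples, n_iter=200):
    """Return (basic_state, eof, pcs, explained_fraction) for the leading mode.

    samples: list of time samples, each a list of values at the grid points.
    Assumes the anomalies are not identically zero.
    """
    n_t, n_k = len(samples), len(samples[0])
    # Basic state: long-term time mean at every grid point.
    basic = [sum(s[k] for s in samples) / n_t for k in range(n_k)]
    # Anomalies: departures from the basic state.
    anom = [[s[k] - basic[k] for k in range(n_k)] for s in samples]
    # Covariance matrix: time average of the outer product of the anomalies.
    cov = [[sum(a[i] * a[j] for a in anom) / n_t for j in range(n_k)]
           for i in range(n_k)]
    # Power iteration converges to the unit eigenvector of largest eigenvalue.
    v = [1.0] * n_k
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(n_k)) for i in range(n_k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cov_v = [sum(cov[i][j] * v[j] for j in range(n_k)) for i in range(n_k)]
    eigenvalue = sum(v[i] * cov_v[i] for i in range(n_k))
    total_variance = sum(cov[k][k] for k in range(n_k))
    # PC: projection of each anomaly vector onto the EOF.
    pcs = [sum(a[k] * v[k] for k in range(n_k)) for a in anom]
    return basic, v, pcs, eigenvalue / total_variance

# Synthetic data: one unit-length spatial pattern modulated in time.
pattern = [1 / 3, 2 / 3, 2 / 3]
amplitudes = [3.0, -3.0, 1.0, -1.0]
mean_state = [10.0, 20.0, 30.0]
samples = [[m + a * p for m, p in zip(mean_state, pattern)] for a in amplitudes]

basic, eof, pcs, fraction = eof_leading_mode(samples)
```

Because the synthetic anomalies contain a single pattern, the leading EOF recovers `pattern`, the PC recovers `amplitudes`, and the explained fraction is 1 up to rounding. With real data the fraction explained by the leading mode would be smaller, and further modes would be needed.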

### c. Dataset and atmospheric component patterns

The dataset used for the EOF analysis is the global NCEP–NCAR reanalysis (Kalnay et al. 1996). The reanalysis used in this study covers a period of 46 years (1958–2004) with a time sampling of 6 h. The dataset is on a latitude–longitude grid with 2.5° × 2.5° grid spacing at 17 pressure levels in the vertical (from 1000 to 10 hPa). To reduce the computational cost during the EOF calculations, only 9 levels (1000, 925, 850, 700, 500, 300, 200, 100, and 50 hPa) are considered in the analysis. Figure 1 depicts the domain used for the EOF calculations, which has 11 × 9 grid points in the horizontal and covers the entire province of Québec, the Maritimes (the Canadian provinces of New Brunswick, Nova Scotia, and Prince Edward Island), and part of the northeastern United States. To validate the new procedure, 6-hourly observations of wind at 40-m height and temperature at 30-m height measured at 29 observation masts located in the Gaspé Peninsula and provided by the Ministère des Ressources du Québec are used. The masts are located as shown in Fig. 1 (dots), mainly along the south shoreline of the St. Lawrence River. The observations were collected during a period of 5 years from 1 January 1998 to 1 January 2003. The observation period for individual masts, however, ranges from 0.8 to 2.4 yr. Corrupted data found within the 5 years of observations were rejected from the study.

The PCs are obtained by projecting the anomalies onto the EOFs, $a_n(t) = \langle \boldsymbol{\Psi}'\boldsymbol{\psi}_n \rangle$. Finally, the $n$th large-scale atmospheric component pattern is determined by adding to the basic state the standard deviation of the $n$th PC multiplied by the $n$th EOF space function:

$$
\boldsymbol{\Phi}_n(\mathbf{r}_i) = \boldsymbol{\Psi}_0(\mathbf{r}_i) + \sigma_n\,\boldsymbol{\psi}_n(\mathbf{r}_i), \tag{2}
$$

where $\sigma_n$ denotes the standard deviation of the $n$th PC and $\boldsymbol{\Phi}_n(\mathbf{r}_i)$ is the column vector formed by the $n$th atmospheric component pattern. The second term in (2) is constructed following the flow decomposition (1), and it provides an estimate of the amplitude of the disturbance associated with the $n$th EOF. A plot of $\sigma_n$ versus $n$ (not shown) reveals that the amplitude of the disturbance decreases monotonically with $n$. Therefore, low-order EOFs produce larger perturbations to the basic state than high-order EOFs. The authors have performed sensitivity experiments using values other than $\sigma_n$ for the coefficients multiplying the EOF function in (2), and the results indicated that as long as these coefficients are selected inside the range of variation of the PCs $a_n(t)$ the generated wind atlas remains unchanged.
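As a small numerical illustration of the construction just described (basic state plus the EOF scaled by the standard deviation of its PC), the sketch below builds one component pattern from toy values. The function name `component_pattern` and all numbers are hypothetical, and the use of the population standard deviation is an assumed convention that the text does not specify.

```python
import statistics

def component_pattern(basic_state, eof, pc_series):
    """Return (pattern, sigma): basic state plus sigma times the EOF."""
    # Population standard deviation of the PC (assumed convention).
    sigma = statistics.pstdev(pc_series)
    pattern = [b + sigma * e for b, e in zip(basic_state, eof)]
    return pattern, sigma

basic_state = [10.0, 20.0, 30.0]   # toy basic state at three grid points
eof_1 = [1 / 3, 2 / 3, 2 / 3]      # toy unit-length EOF
pc_1 = [3.0, -3.0, 1.0, -1.0]      # toy PC time series with zero mean

phi_1, sigma_1 = component_pattern(basic_state, eof_1, pc_1)
```

Here `sigma_1` equals the square root of 5, so the perturbation added to the basic state has the typical amplitude of the mode rather than an extreme value.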

### d. Numerical integrations

Similar to the work of Yu et al. (2006), this study uses a mesoscale model for the dynamical downscaling module. Here, the compressible nonhydrostatic Global Environmental Multiscale–Limited Area Model (GEM-LAM) is set up following Yu et al. (2006) with a grid spacing of 5 km covering the Gaspé region. A total of 195 × 195 grid points and 28 unevenly distributed model levels from the surface to 20 km above sea level (with 10 levels within the first 1.5 km) are used in the simulation. Multiple model integrations are performed. In each integration, the GEM-LAM is initialized separately with the basic state or with one of the individual atmospheric component patterns determined by (2), which are adjusted to high-resolution terrain and surface roughness via the mesoscale simulation. The surface roughness length is determined solely from land use or vegetation cover, as opposed to the “total roughness,” which includes the roughness due to the vegetation and that due to subgrid variation of terrain. The simulation time is chosen to be long enough for a steady state to be reached. Boundary conditions are kept constant with time during the model simulations. At the initial time the ground temperature is set equal to that of the air in immediate contact with it and is then kept constant during the integration.

### e. Construction of the regional time series

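Assuming that the regional basic state $\mathbf{X}_0(\mathbf{r}_j)$, where $j = 1, \ldots, J$ and $j$ denotes the grid points in the regional domain of dimension $J$, is the steady state obtained during the dynamical downscaling of $\boldsymbol{\Psi}_0(\mathbf{r}_i)$, and assuming also that the $n$th regional atmospheric component pattern $\boldsymbol{\Theta}_n(\mathbf{r}_j)$ is the steady state obtained from the dynamical downscaling of the $n$th large-scale atmospheric component pattern $\boldsymbol{\Phi}_n(\mathbf{r}_i)$, then the regional time series can be constructed using (1) and (2):

$$
\mathbf{X}_{\mathrm{rec}}(\mathbf{r}_j, t) = \mathbf{X}_0(\mathbf{r}_j) + \sum_{n=1}^{N} a_n(t)\,\boldsymbol{\chi}_n(\mathbf{r}_j), \tag{3}
$$

where $\mathbf{X}_{\mathrm{rec}}$ is the column vector formed by the constructed regional time series of geopotential height, temperature, humidity, and zonal and meridional winds. The PCs used in (3) are assumed to be the same as those obtained from the EOF analysis of the large-scale data. The term $\boldsymbol{\chi}_n(\mathbf{r}_j)$ represents the “regional” counterpart of $\boldsymbol{\psi}_n$. The $\boldsymbol{\chi}_n(\mathbf{r}_j)$ can be determined assuming that it satisfies (2):

$$
\boldsymbol{\Theta}_n(\mathbf{r}_j) = \mathbf{X}_0(\mathbf{r}_j) + \sigma_n\,\boldsymbol{\chi}_n(\mathbf{r}_j)
\quad\Longrightarrow\quad
\boldsymbol{\chi}_n(\mathbf{r}_j) = \frac{\boldsymbol{\Theta}_n(\mathbf{r}_j) - \mathbf{X}_0(\mathbf{r}_j)}{\sigma_n}. \tag{4}
$$

Substituting (4) into (3) and retaining the first $M$ modes ($M \le N$) gives

$$
\mathbf{X}_{\mathrm{rec}}(\mathbf{r}_j, t) = \mathbf{X}_0(\mathbf{r}_j) + \sum_{n=1}^{M} \frac{a_n(t)}{\sigma_n}\left[\boldsymbol{\Theta}_n(\mathbf{r}_j) - \mathbf{X}_0(\mathbf{r}_j)\right]. \tag{5}
$$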
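The construction of the regional time series can be sketched numerically as follows. The regional counterpart of each EOF is obtained as described in step 5 of section 2a (regional component pattern minus regional basic state, divided by the standard deviation of the PC), and the time series is the regional basic state plus the PC-weighted sum of these regional counterparts. The function names and toy values are hypothetical, and the sketch handles a single scalar field on two grid points rather than the full state vector.

```python
def regional_eof(theta_n, x0, sigma_n):
    """Regional counterpart of the nth EOF: (Theta_n - X_0) / sigma_n."""
    return [(t - b) / sigma_n for t, b in zip(theta_n, x0)]

def reconstruct(x0, regional_eofs, pcs, t_index, n_modes):
    """Regional field at one time index using the first n_modes modes."""
    field = list(x0)
    for chi, pc in list(zip(regional_eofs, pcs))[:n_modes]:
        for j in range(len(field)):
            field[j] += pc[t_index] * chi[j]
    return field

# Toy downscaled steady states on two regional grid points.
x0 = [5.0, 6.0]                        # regional basic state
theta = [[7.0, 6.0], [5.0, 7.0]]       # regional component patterns 1 and 2
sigma = [2.0, 1.0]                     # standard deviations of PC 1 and PC 2
pcs = [[2.0, -2.0], [1.0, -1.0]]       # large-scale PCs at two times

chis = [regional_eof(th, x0, s) for th, s in zip(theta, sigma)]

both_modes_t0 = reconstruct(x0, chis, pcs, t_index=0, n_modes=2)
one_mode_t1 = reconstruct(x0, chis, pcs, t_index=1, n_modes=1)

print(both_modes_t0)  # [7.0, 7.0]
print(one_mode_t1)    # [3.0, 6.0]
```

Only the steady-state fields need to be produced by the mesoscale model; the time dependence comes entirely from the large-scale PCs, which is what keeps the number of model runs small.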

## 3. Results

### a. Basic states

Basic states are computed as the 46-yr time mean of geopotential height, specific humidity, temperature, and wind. The basic states of the geopotential height and wind vectors, specific humidity, temperature, and wind speed at 850 hPa are shown in Figs. 2a–d, respectively. The flow at this level is approximately geostrophic (Fig. 2a). A ridge in the specific humidity is also observed over Québec, with its axis extending from southwest to north. Figure 3 is as Fig. 2 but at 500 hPa. According to Fig. 3a the flow at 500 hPa is mainly westerly and geostrophic, with maximum wind speed over the Maritimes (Fig. 3d).

### b. Spectrum of variance

The EOF modes are sorted according to the eigenvalues of the covariance matrix, in descending order of explained variance.

### c. Characteristics of the PCs and the EOFs

Figure 5 depicts the PCs corresponding to the first four modes, which together explain almost 63% of the total statistical variance. For better visualization, the PCs are plotted for the period from 1 July 1998 to 1 July 1999. The amplitude (frequency) of oscillation of these modes decreases (increases) with increasing mode number. To show some of the main spectral characteristics of the dominant modes, Fig. 6 depicts the power spectra for the first four EOF modes. The power spectrum for individual modes exhibits multiple peaks. In such cases, the oscillation period of the mode is given by the maximum peak. The dominant oscillation period (in days) is computed as 0.25/frequency. The oscillation periods for modes 1–4 are 365, 8.5, 8.1, and 1 days, respectively. The power spectrum of mode 1 reveals two spikes: a large dominant peak at 6 × 10^{−4} (corresponding to the annual cycle) and a smaller peak at 0.25 (corresponding to the diurnal cycle). EOF modes 2 and 3 have similar oscillation periods. Furthermore, a lag correlation computation (not shown) between PC 2 and PC 3 revealed that these modes are in quadrature. Because modes 2 and 3 explain a similar variance (see analysis in section 3b) and are in quadrature with similar periods of oscillation, they may interact to form a propagating synoptic pattern.
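The conversion from spectral frequency to oscillation period quoted above can be written out explicitly. The sketch below assumes that the frequency axis of the power spectra is in cycles per 6-h sample, which is what makes the factor 0.25 (days per sample) appear; the function and constant names are hypothetical.

```python
SAMPLING_INTERVAL_DAYS = 0.25  # 6-h sampling of the reanalysis

def period_days(frequency_cycles_per_sample: float) -> float:
    """Oscillation period in days for a spectral peak at the given frequency."""
    return SAMPLING_INTERVAL_DAYS / frequency_cycles_per_sample

diurnal = period_days(0.25)        # peak at 0.25 cycles per sample
annual = period_days(0.25 / 365)   # frequency that gives exactly 365 days
rounded = period_days(6e-4)        # the rounded peak value quoted in the text

print(diurnal)  # 1.0
```

A frequency of 6 × 10^{−4} gives a period of roughly 417 days, so the quoted peak value is best read as a rounded figure for the annual cycle; a 365-day period corresponds to a frequency of about 6.8 × 10^{−4}.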

Figure 7 shows the EOF-1 disturbances of geopotential height, wind, temperature, and specific humidity at 850 hPa. Without loss of generality, the following discussion assumes positive values of PC 1. According to Fig. 5a and the power spectra in Fig. 6, EOF-1 variability follows the annual cycle, with positive values of PC 1 occurring mainly during northern winter. Figure 7a indicates that, mainly during winter, EOF-1 disturbances exhibit characteristics of a flow in gradient wind balance described by a low pressure disturbance centered over Newfoundland and the associated wind blowing counterclockwise following the isobars. The advection of a continental dry and cold air anomaly from the northwest of the domain creates a drier environment over the entire domain (Fig. 7b) and a cold-pool anomaly (Fig. 7c) centered over central Québec. According to Fig. 7d, a ridge of large wind speed is observed over central Québec. When negative values of PC 1 are assumed, mainly during periods other than winter, Fig. 7a is represented by a high pressure disturbance with associated wind blowing clockwise, bringing moist and warm air from the Atlantic Ocean at lower latitudes and creating a moist environment over the entire domain and a warm-pool anomaly with its maximum centered over central Québec. Figure 8 is the same as Fig. 7 but for EOF mode 2. The following analysis assumes positive values of PC 2. According to Fig. 5b and the power spectra in Fig. 6, EOF-2 disturbances represent higher-frequency variability related to transitional synoptic patterns. A distinctive feature in Fig. 8a, when compared with Fig. 7a, is the presence of a high pressure anomaly centered over New Brunswick. Corresponding to this high pressure anomaly, a dipole structure of positive and negative anomalies of humidity (Fig. 8b) and temperature (Fig. 8c) is observed.
The existence of the positive anomalies in specific humidity and temperature, both centered in the southwest of Québec, may be associated with the advection of moist and warm air from the Atlantic Ocean. On the other hand, the existence of the negative anomalies in the specific humidity and temperature, both centered in the northeast over the Maritimes, may be associated with the advection of continental dry and cold air from higher latitudes. The minimum in wind speed over New Brunswick depicted in Fig. 8d is collocated with the center of the high pressure system in Fig. 8a. To summarize the results on the spatial patterns for the rest of the modes, the panels in Fig. 9 depict spatial patterns of wind speed for the EOF modes 3–7. The choice of plotting only the first seven modes is motivated by the discussion that will follow in the next section. According to these figures, higher mode numbers tend to be characterized by smaller spatial scales.

### d. Validations

The validation procedure is established as follows: 1) construction of the regional time series of temperature and zonal *u* and meridional *υ* winds using (5), the wind speed time series, and the mean wind speed. The sensitivity of the results to the number of modes *M* (*M* ≤ *N*) retained during the construction of the time series is also evaluated.

Figure 10 shows the time evolution of the domain-averaged wind speed (solid lines) at 40 m AGL and temperature (dashed lines) at 30 m AGL during the dynamical downscaling of the atmospheric component pattern associated with EOF 1. A similar evolution is observed during the dynamical downscaling of the basic state and the remaining 50 atmospheric component patterns considered for this study. A steady state is reached after 4 h. For this reason, 4 h is chosen as the integration time.

Figures 11 and 12 show the 850-hPa regional counterparts of EOFs 1 and 2, respectively, and Fig. 13 shows the regional counterparts of EOFs 3–7. There are strong similarities between the spatial patterns of the large-scale EOFs and those of their respective regional counterparts. For example, the minimum of geopotential height of EOF mode 1 in Fig. 7a is also captured in Fig. 11a (represented by the blue shading). The ridge in the wind speed of EOF mode 1 located over the St. Lawrence River (Fig. 7d) is also captured in Fig. 11d. The minimum in wind speed for EOF mode 2 (Fig. 8d) located in the Gaspé Peninsula is also captured in Fig. 12d. The match between large-scale EOFs and their regional counterparts is better at higher vertical levels (not shown) because the impact of topography becomes less important. Because of the importance of wind resource assessment for the wind energy industry, the next analysis focuses on the constructed time series of wind speed. Figure 14 depicts the global mean absolute error (GMAE) for wind speed, in meters per second, as a function of the number of modes *M* retained in (5). The GMAE proved to be sensitive to the number of modes retained in (5). According to Fig. 14, GMAE improves (decreases) systematically until it reaches the “optimal” (minimum) value of 0.82 m s^{−1} for *M* = 7. Beyond *M* = 7 GMAE increases slightly and then remains almost constant with *M*. Thus, adding more modes to the time series in (5) does not necessarily produce better estimates. The best estimates in this particular application are produced when only seven modes are retained. Similar behavior is observed in the analysis of the time series of temperature.
We argue that this kind of behavior may be due to the fact that we have used the PCs derived at low resolution to construct the regional time series and also to the fact that unrealistic features may be revealed in the structure of the higher-order atmospheric component patterns, which when downscaled may lead to a reduction of the signal-to-noise ratio and a degradation of the results. Statistical issues related to the relatively small number of observation masts used for the comparison may also contribute to this type of behavior in GMAE. This phenomenon may be less important when the large-scale data are given at a grid size comparable to that used for the mesoscale simulations. Further testing of the method is required, however, to explore these ideas.

Table 1 shows the location (latitude and longitude), the mean absolute error (MAE) in meters per second, and the correlation coefficient (CC) in percent (for *M* = 7) for the 29 observation masts. The largest values of MAE were obtained in regions with very complex terrain (masts 23, 24, and 26) characterized by narrow valleys and ridges of about 1-km width, corresponding to the terrain marked with letter B in Fig. 15 (bottom panel), which are completely smoothed out at the 5-km grid spacing used for the model simulations. This result suggests that using a grid spacing smaller than 1 km for the model simulations may help to reduce the MAE. On the other hand, very small values of the MAE are obtained in regions with much less complex terrain (masts 3, 17, 18, and 20) similar to the terrain marked by letter A in Fig. 15 (bottom). Some of these observation masts are very close to each other and report similar values for the MAE and CC. Intermediate values of MAE and CC are reported at masts located in terrain with an intermediate level of complexity similar to the terrain marked with letter B in Fig. 15 (bottom).

The table shows number, location (lat, lon), CC (%), and MAE (m s^{−1}) for the 29 masts.
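The per-mast statistics reported in Table 1 can be computed along the following lines. This is a sketch under stated assumptions: the formulas are the standard definitions of the mean absolute error and the Pearson correlation coefficient, which this section does not spell out, and the wind speed is taken as the magnitude of the horizontal wind vector built from the zonal and meridional components. All function names and sample values are hypothetical.

```python
import math

def wind_speed(u, v):
    """Wind speed series from zonal and meridional component series."""
    return [math.hypot(a, b) for a, b in zip(u, v)]

def mean_absolute_error(obs, est):
    """Mean of the absolute differences between observed and estimated values."""
    return sum(abs(o - e) for o, e in zip(obs, est)) / len(obs)

def correlation_coefficient(obs, est):
    """Pearson correlation between observed and estimated series."""
    n = len(obs)
    mean_o, mean_e = sum(obs) / n, sum(est) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    spread_e = math.sqrt(sum((e - mean_e) ** 2 for e in est))
    return cov / (spread_o * spread_e)

# Toy constructed wind components (m/s) and observed speeds at one mast.
u_rec = [3.0, 0.0, 6.0]
v_rec = [4.0, 5.0, 8.0]
observed = [6.0, 4.0, 10.0]

estimated = wind_speed(u_rec, v_rec)           # [5.0, 5.0, 10.0]
mae = mean_absolute_error(observed, estimated)
cc = correlation_coefficient(observed, estimated)
```

For these toy values the MAE is 2/3 m s^{−1} and the CC is about 0.94, which would be reported as roughly 94% in the convention of Table 1.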

Figure 16 shows panels of the observed (black lines) and constructed time series (using the first seven modes) every 6 h for wind speed (green lines) at 40 m AGL and temperature (red lines) at 30 m AGL at the observation masts in Montagne-Seche (Figs. 16a,b), Luceville Saint-Donat (Figs. 16c,d), Canton Power (Figs. 16e,f), and Murdochville (Figs. 16g,h). These masts correspond to masts 26, 1, 28, and 10, respectively, in Table 1. Without loss of generality, we chose to show the plots at these masts because they are located in regions with different terrain characteristics (Fig. 15, bottom). For better visualization only a sample of the time series is plotted. The values of CC for the individual observation masts are included in the top-right corner of the panels. The results indicate that high-frequency events (e.g., Fig. 16c) are better estimated as more terms are added in (5). The observed time series at some masts (e.g., Fig. 16g) show low-frequency variability; in those cases, using only the first few modes provides good estimates.

The wind atlas computed in step 3 of the validation procedure is of great value for the wind energy industry in determining sites with wind potential. In this work, the generated wind atlas is compared with the observed mean wind speed at the 29 observation masts. Figure 15 (top panel) shows the “optimum” (GMAE = 0.82 m s^{−1}) wind atlas generated with seven modes. The general pattern of the wind atlas compares well with the wind atlas of the same region generated in Yu et al. (2006). The value of GMAE in this work, however, is modestly better (lower) than the GMAE = 0.87 m s^{−1} value reported in Yu et al. (2006). According to Fig. 15 the strongest winds are over water away from land, over coastal land, or at the top of the hills. To evaluate the performance of the method in regions with extreme values of observed wind speed, we focus on the mast in Murdochville (mast 10 in Table 1), which recorded the largest observed wind speed of 9.1 m s^{−1} and is located at the highest altitude MSL relative to the other masts. The method estimated this observation reasonably accurately, with a value of 8.30 m s^{−1}, which was also the largest estimated value among all the masts. It is worth noting that the estimated value at this mast is better than the value of 5.9 m s^{−1} reported by WEST for the same mast.

The work of Yu et al. (2006) has been used as a benchmark for comparison with this study because both works use the same region and the same observation masts. A thorough comparison with other wind mapping techniques is difficult because none of them has been applied to Québec. In addition, the lack of a common choice of statistical parameters for the validations (mean absolute error, mean relative error, root-mean-square error, bias, etc.) adds another level of complexity when comparing different methods. For example, Frank and Landberg (1997) performed a wind downscaling experiment with goals similar to ours to generate a numerical wind atlas for Ireland. However, they did not compare the simulated wind speeds with the observed ones. Only the simulated wind power was compared with that analyzed from observed winds [see Table III in Frank and Landberg (1997)]. One also needs to keep in mind that the region analyzed in that study is characterized by terrain much less complex than that of the present study. Another example is the work of Mortensen et al. (2005), who applied a statistical–dynamical downscaling procedure to generate a wind atlas for Egypt. Their dynamical downscaling module, however, included both a microscale and a mesoscale model. Because of the size of Egypt and the limitations intrinsic to the procedure, the map was subdivided into two larger tiles and four smaller tiles. The horizontal grid spacing was 7.5 km for the larger tiles and 5 km for the smaller tiles. The results were presented in terms of the mean relative error (difference between observation and estimation divided by their mean value and expressed in percent), which corresponded to 10% for the larger tiles and 5% for the smaller tiles.
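To make the mean relative error quoted above concrete, the following minimal sketch implements the definition as worded in the text (difference between observation and estimate, divided by their mean, in percent). The function name is hypothetical, and whether Mortensen et al. (2005) use the signed or the absolute difference is not stated here, so the signed form is an assumption. Applying it to the Murdochville values of the present study is purely illustrative and is not a result reported in either paper.

```python
def relative_error_percent(observed, estimated):
    """Signed relative error: (obs - est) divided by their mean, in percent."""
    mean_value = 0.5 * (observed + estimated)
    return 100.0 * (observed - estimated) / mean_value

def mean_relative_error_percent(observed_values, estimated_values):
    """Average of the per-station relative errors (assumed aggregation)."""
    errors = [
        relative_error_percent(o, e)
        for o, e in zip(observed_values, estimated_values)
    ]
    return sum(errors) / len(errors)

# Illustration with the Murdochville mean wind speeds quoted in the text:
# observed 9.1 m/s, estimated 8.30 m/s by the present method, 5.9 m/s by WEST.
murdochville_new = relative_error_percent(9.1, 8.30)
murdochville_west = relative_error_percent(9.1, 5.9)
print(f"present method: {murdochville_new:.1f}%  WEST: {murdochville_west:.1f}%")
```

Because the denominator is the mean of the two values rather than the observation alone, this measure differs from the more common percentage error, which is one reason why error statistics from different studies are not directly comparable.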

The quality of the results obtained with the proposed method is reflected in the high correlation coefficients between the observed and estimated time series, the small global mean absolute error, and the consistency with the results reported in Yu et al. (2006). Together, these suggest that the proposed method can be used as a powerful tool for wind resource estimation with much less computational effort than previous methods. The dynamical downscaling module in the new method is computationally less expensive because fewer weather patterns are determined in the statistical module. For example, 330 classes of large-scale weather are derived in Yu et al. (2006), versus only seven atmospheric component patterns in the new method, which reduces the number of simulations by a factor of about 40. For very large domains the overhead of the EOF calculations increases, but the total wall-clock time remains much smaller than for the method of Yu et al. (2006). Another important advantage of the new method over existing methods is its capability to construct regional time series.

It is worth noting that the results presented in this section, and most importantly the determination of the “optimum” wind atlas, rely on the availability of observations. The indicators of accuracy (CC, MAE, and GMAE) are computed using the observed time series of wind speed. However, one may be interested in using this technique to generate a wind atlas for a region where no observations are available. For such cases, we propose a different strategy to find the optimum wind atlas. As explained earlier, the agreement between observations and the estimates provided by (5) improves as more modes are added in (5). Consequently, (5) provides its poorest estimate when only the basic state plus the first EOF mode is retained. Thus the value of *M* that gives the maximum departure between the mean wind speed SP estimated using *M* modes and the mean wind speed estimated using only the first EOF (*M* = 1) in (5) can be used to generate the “optimum” wind atlas. In summary, the optimum wind atlas corresponds to the wind map generated using the value of *M* that maximizes this departure evaluated over all grid points *j* = 1, …, *J*, where the estimated mean wind speed at grid point *j* is computed using *M* modes in the reconstruction and *J* is the total number of grid points in the regional domain (section 2e, first paragraph).
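The observation-free selection of *M* can be sketched as follows. The sketch assumes that the departure is measured as the mean absolute difference over the *J* grid points between the *M*-mode atlas and the one-mode atlas; this particular norm, the function names, and the four-point synthetic atlases are all assumptions made for illustration.

```python
def mean_departure(atlas_m, atlas_1):
    """Mean absolute difference over grid points (assumed departure measure)."""
    total = sum(abs(a - b) for a, b in zip(atlas_m, atlas_1))
    return total / len(atlas_1)

def select_modes_without_observations(atlases):
    """Return the M whose atlas departs most from the M = 1 atlas.

    atlases maps the number of modes M to a list of mean wind speeds,
    one value per grid point j = 1, ..., J.
    """
    reference = atlases[1]
    departures = {
        m: mean_departure(speeds, reference)
        for m, speeds in atlases.items()
        if m > 1
    }
    best_m = max(departures, key=departures.get)
    return best_m, departures

# Hypothetical mean wind speeds (m/s) on a four-point regional grid.
atlases = {
    1: [6.0, 7.0, 5.0, 8.0],
    2: [6.2, 7.1, 5.1, 8.3],
    3: [6.5, 7.4, 5.3, 8.6],
    4: [6.4, 7.3, 5.2, 8.5],
}

best_m, departures = select_modes_without_observations(atlases)
print("selected M:", best_m)
```

With these synthetic maps the departure peaks at three modes and then decreases, so three modes would be selected. A squared or relative difference could be substituted in `mean_departure` without changing the structure of the search.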

## 4. Summary and conclusions

Wind resource assessment remains a very difficult task. In this research a new statistical–dynamical downscaling method for regional time series generation is introduced and applied to wind resource assessment. The new technique uses a statistical module based on EOF analysis. The plausibility of using EOF analysis to generate large-scale atmospheric component patterns rests on its ability to compress the statistical information onto a few dominant EOF patterns ordered by the variance explained. The large-scale atmospheric component patterns are generated following (2). These patterns are then adapted to high-resolution terrain and surface roughness by using mesoscale simulations to generate regional atmospheric component patterns. Previous techniques define a larger number of large-scale weather patterns, which makes the dynamical downscaling module computationally more expensive. Furthermore, the newly developed technique is not limited in terms of domain size because, unlike the existing methods, it is not constrained by the use of the geostrophic wind and hydrostatic approximations. The fact that this technique allows the construction of time series at the regional level is very important for industries that rely on daily time series of atmospheric fields such as wind and temperature. In this work the wind resource is estimated for the Gaspé region in Québec, and a high-resolution wind atlas is generated using a long-term time mean of the constructed time series of wind speed.

CC values as high as 91.5% for the wind speed (see Table 1) and 98.5% for the temperature are obtained. The numerically generated wind atlas compares well to observations of the mean wind speed collected at different masts in the Gaspé region. The structure of the mean wind speed pattern compares well to the structure of the pattern reported in the wind atlas in Yu et al. (2006). The results are shown to be sensitive to the number of modes retained during the construction of the time series and the wind atlas. The GMAE decreases systematically as more modes are retained to construct the time series until it reaches a minimum value, after which it starts increasing slightly. The optimum wind atlas is associated with the minimum value of GMAE, in this case GMAE = 0.82 m s^{−1}, which was obtained by retaining seven modes. The authors argue that this behavior is due to the difference between the grid size at which the EOF analysis is performed (2.5° × 2.5°) and the grid size used for the time series reconstruction at high resolution (5 km), together with the assumption of using the low-resolution PCs for the regional time series reconstruction. This may lead to a reduction of the signal-to-noise ratio during the time series construction at regional scales. To further explore this idea, the authors propose using a dataset with a smaller grid size, for example the North American Regional Reanalysis. In future work, new atlases for other regions in Canada will also be produced, together with a wind atlas for the whole country.
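The selection of the optimum atlas from the GMAE curve can be sketched as below. The sketch assumes that GMAE is the average over masts of the absolute difference between the observed mean wind speed and the atlas value at the grid point nearest to each mast. The function names and all numerical values are hypothetical, chosen only to reproduce the decrease-then-increase behavior described above.

```python
def gmae(observed_means, atlas_values):
    """Global mean absolute error over masts (assumed definition), in m/s."""
    errors = [abs(o - e) for o, e in zip(observed_means, atlas_values)]
    return sum(errors) / len(errors)

def select_modes_with_observations(observed_means, atlas_by_modes):
    """Return the number of modes with the smallest GMAE and the full curve."""
    curve = {
        m: gmae(observed_means, values)
        for m, values in atlas_by_modes.items()
    }
    optimum_m = min(curve, key=curve.get)
    return optimum_m, curve

# Hypothetical observed mean wind speeds (m/s) at three masts.
observed_means = [8.0, 6.5, 7.2]

# Hypothetical atlas values at the nearest grid points for 1 to 4 modes.
atlas_by_modes = {
    1: [6.0, 5.5, 6.0],
    2: [7.0, 6.0, 6.8],
    3: [7.6, 6.4, 7.0],
    4: [7.3, 6.9, 7.6],
}

optimum_m, gmae_curve = select_modes_with_observations(
    observed_means, atlas_by_modes
)
for m in sorted(gmae_curve):
    print(f"modes={m}  GMAE={gmae_curve[m]:.2f} m/s")
print("optimum number of modes:", optimum_m)
```

In this toy example the error falls from one to three modes and rises again at four, so three modes would be retained. In the study itself the minimum occurs at seven modes.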

The errors in our calculations come from the following main sources: 1) limitations intrinsic to the EOF method (the method is linear, so it cannot reveal the nonlinear structures of the flow); 2) errors in the mesoscale simulations; 3) errors in the vertical interpolation from pressure to sigma levels; 4) the assumption of using the low-resolution PCs to construct the regional time series; 5) assumptions made in the validation strategy (using the grid point nearest to the observation mast may also lead to substantial errors in the validations); and 6) measurement and reading errors in the observations.

Despite the negative impact from the above issues, the results suggest that the newly developed technique can be a powerful tool for the generation of reliable regional time series and wind atlases with less computational effort than previous techniques.

## Acknowledgments

This research was supported by the Canadian government ecoEnergy Technology Initiative (ecoETI) funding.

## REFERENCES

Benestad, R. E., 2001: A comparison between two empirical downscaling strategies. *Int. J. Climatol.*, **21**, 1645–1668.

Cox, D. R., 1983: Some remarks on overdispersion. *Biometrika*, **70**, 269–274.

Cutler, N. J., B. H. Jorgensen, B. K. Ersboll, and J. Badger, 2006: Class generation for numerical wind atlases. *Wind Eng.*, **30**, 401–415.

Dreveton, C., and Y. Guillou, 2004: Use of principal components analysis for the generation of daily time series. *J. Appl. Meteor.*, **43**, 984–996.

Frank, H. P., and L. Landberg, 1997: Modelling the wind climate of Ireland. *Bound.-Layer Meteor.*, **85**, 359–378.

Frank, H. P., O. Rathmann, N. G. Mortensen, and L. Landberg, 2001: The numerical wind atlas—The KAMM/WAsP method. Risø National Laboratory Rep. Risø-R-1252(EN), 60 pp. [Available online at http://130.226.56.153/rispubl/VEA/veapdf/ris-r-1252.pdf.]

Frey-Buness, A., D. Heimann, and R. Sausen, 1995: A statistical–dynamical downscaling procedure for global climate simulation. *Theor. Appl. Climatol.*, **50**, 117–131.

Gregory, J. M., T. M. L. Wigley, and P. D. Jones, 1993: Application of Markov models to area average daily precipitation series and inter-annual variability in seasonal totals. *Climate Dyn.*, **8**, 299–310.

Grondona, M. O., G. P. Podesta, M. Bidegain, M. Marino, and H. Hordij, 2000: A stochastic precipitation generator conditioned on ENSO phase: A case study in southeastern South America. *J. Climate*, **13**, 2973–2986.

Hahmann, A. N., D. Rostkier-Edelstein, T. T. Warner, F. Vandenberghe, Y. Liu, R. Babarsky, and S. P. Swerdlin, 2010: A reanalysis system for the generation of mesoscale climatographies. *J. Appl. Meteor. Climatol.*, **49**, 954–972.

Hansen, J. E., and D. M. Driscoll, 1977: A mathematical model for the generation of hourly temperatures. *J. Appl. Meteor.*, **16**, 935–948.

Hardy, D. M., and J. J. Walton, 1978: Principal component analysis of vector wind measurements. *J. Appl. Meteor.*, **17**, 1153–1161.

Heimann, D., 1986: Estimation of regional surface-layer wind-field characteristics using a three-layer mesoscale model. *Beitr. Phys. Atmos.*, **59**, 518–537.

Heyen, H., E. Zorita, and H. von Storch, 1996: Statistical downscaling of monthly mean North Atlantic air pressure to sea level anomalies in the Baltic Sea. *Tellus*, **48A**, 312–323.

Huth, R., 2002: Statistical downscaling of daily precipitation in central Europe. *J. Climate*, **15**, 1731–1742.

Johnson, G. L., C. L. Hanson, S. P. Hardegree, and E. B. Ballard, 1996: Stochastic weather simulation: Overview and analysis of two commonly used models. *J. Appl. Meteor.*, **35**, 1878–1896.

Kaihatu, J. M., R. A. Handler, G. O. Marmorino, and L. K. Shay, 1998: Empirical orthogonal function analysis of ocean surface currents using complex and real-vector methods. *J. Atmos. Oceanic Technol.*, **15**, 927–941.

Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. *Bull. Amer. Meteor. Soc.*, **77**, 437–471.

Katz, R. W., 1996: Use of conditional stochastic models to generate climate change scenarios. *Climatic Change*, **32**, 237–255.

Katz, R. W., and M. B. Parlange, 1993: Effects of an index of atmospheric circulation on stochastic properties of precipitation. *Water Resour. Res.*, **29**, 2335–2344.

Kutzbach, J. E., 1967: Empirical eigenvectors of sea-level pressure, surface temperature and precipitation complexes over North America. *J. Appl. Meteor.*, **6**, 791–802.

Lorenz, E. N., 1956: Empirical orthogonal functions and statistical weather prediction. Massachusetts Institute of Technology Statistical Forecasting Project Rep. 1, 49 pp.

Ludwig, F. L., J. Horel, and C. D. Whiteman, 2004: Using EOF analysis to identify important surface wind patterns in mountain valleys. *J. Appl. Meteor.*, **43**, 969–983.

Mortensen, N. G., and Coauthors, 2005: Wind atlas for Egypt, measurements and modelling 1991–2005. New and Renewable Energy Authority, Egyptian Meteorological Authority, and Risø National Laboratory Rep. 258 pp. [ISBN 87-550-3493-4.]

North, G. R., 1984: Empirical orthogonal functions and normal modes. *J. Atmos. Sci.*, **41**, 879–887.

Rife, D. L., J. O. Pinto, A. J. Monaghan, C. A. Davis, and J. R. Hannan, 2010: Global distribution and characteristics of diurnally varying low-level jets. *J. Climate*, **23**, 5041–5064.

Semenov, M. A., R. J. Brooks, E. M. Barrow, and C. W. Richardson, 1998: Comparison of the WGEN and LARS-WG stochastic weather generators for diverse climates. *Climate Res.*, **10**, 95–107.

Wippermann, F., and G. Grob, 1981: On the construction of orographically influenced wind roses for given distributions of the large-scale wind. *Beitr. Phys. Atmos.*, **54**, 492–501.

Young, K. C., 1994: A multivariate chain model for simulating climatic parameters from daily data. *J. Appl. Meteor.*, **33**, 661–671.

Yu, W., R. Benoit, C. Girard, A. Glazer, D. Lemarquis, J. R. Salmon, and J.-P. Sausen, 2006: Wind Energy Simulation Toolkit (WEST): A wind mapping system for use by the wind-energy industry. *Wind Eng.*, **30**, 15–33.