## Abstract

Multivariate linear regression is used to downscale reanalysis-based midtropospheric predictors (wind components and speed, temperature, and geopotential height) to historical wind observations at 44 surface weather stations during the four calendar seasons. The model performance is assessed as a function of statistical feature of the wind, averaging time scale of the wind statistics, and wind regime (as defined by how variable the vector wind is relative to its mean amplitude).

Despite large differences in predictability characteristics between sites, several systematic results are observed: consistent with recent studies, a strong anisotropy of predictability for vector quantities is often observed, although no obvious relation is found between large-scale topographic features and the anisotropy orientation or magnitude. The predictability of time-averaged quantities increases with decreasing averaging time scale. In general, the predictability of mean vector wind components is superior to that of mean wind speeds or subaveraging time scale vector wind variability.

These results are interpreted through empirically and theoretically based analyses of the sensitivity of mean wind speed to changes in the vector wind statistics. On longer averaging time scales, the statistical features of the wind speed are found to be highly sensitive to subaveraging time-scale vector wind variability, which is poorly predicted. On shorter averaging time scales, the mean wind speed is found to be highly sensitive to the magnitude of the mean vector wind, a quantity whose predictability can be much lower than the individual mean vector wind components. These results demonstrate limitations to the statistical downscaling of wind speed and suggest that deterministic models that resolve the short-time-scale variability may be necessary for successful predictions.

## 1. Introduction

In response to growing concern regarding the widespread effects of anthropogenic carbon dioxide on the climate system, numerous jurisdictions have introduced measures to stimulate the decarbonization of their energy supply. Over the last two decades, a large industry has grown in Europe, and more recently in China and North America, whose product is the conversion of wind energy into electricity. When they are available, estimates of the power in the wind resource are calculated from probability distributions of local wind observations. At places and times for which direct observations are not available (such as in a future climate), accurate methods of predicting this resource (from the regional wind resource) are necessary. In addition, predicting the variability of the wind resource on a range of time scales offers a significant advantage for managing the operations and maintenance of a wind farm (Orosa et al. 2010).

Weather and climate predictions are ultimately produced by dynamical models of the global-scale atmospheric circulation. These models are particularly well suited for accurately describing the flow above the planetary boundary layer (PBL) where dynamical spatial scales are relatively large, but their performance within the PBL will be limited by the approximations made necessary by their discretization (Giorgi and Mearns 1991). Dynamic downscaling is a means of circumventing the errors introduced by parameterizations in large-scale climate models by using output from these to drive a limited domain model with finer spatial and temporal resolution. These models do demonstrate improved accuracy within the PBL, but finely resolved local area models are computationally expensive and may be impractical to apply to a large number of sites.

In this regard, statistical downscaling is a natural complement to dynamic modeling. Via statistical methods, local surface variables may be related to the free tropospheric circulation that is more accurately resolved by the dynamic models. Surface wind variability is influenced by both the large-scale circulation and local influences, such as topography and land–water contrast. Statistical downscaling (SD) seeks to identify the information in the large-scale flow aloft that is relevant to variability of the local surface quantities. With the relevant information in the circulation aloft identified (the predictors), SD then models the relationship between these and the statistics of the local variable (the predictand).

The techniques used for SD are diverse. A first classification distinguishes between linear and nonlinear techniques. Within each of these categories are statistical methods of varying complexity. Both classes of techniques have been applied to the downscaling of surface winds. Some studies have suggested that these two techniques compare quite well, with no real added advantage from the increased complexity of nonlinear models (Zorita and von Storch 1999). Davy et al. (2010) found that a random forest algorithm (a nonlinear SD method that formed predictions from an ensemble of regression trees) outperformed multivariate linear regression when predicting wintertime and summertime surface wind variability at an Australian site. As well, nonlinear regression was found to outperform linear regression in the prediction of temporal variability of daily-averaged wind components in France (Najac et al. 2009; Salameh et al. 2009). Taken together, these studies do not provide a clear picture regarding the relative merits of linear versus nonlinear techniques, or of simple versus complex models.

With any of these SD techniques, an increase in model complexity (and the number of statistical model parameters) allows for more complex relationships to be modeled. With increased complexity, statistical models require data in larger amounts and of higher quality to robustly estimate their parameters. Evidence of this is seen in the Curry et al. (2012) critique of the probabilistic downscaling approach taken by Pryor et al. (2005).

The results of past applications of SD to surface winds over southern France, western Canada, and the subarctic Pacific Ocean have provided first insights into the general relative predictability of surface wind features (e.g., Salameh et al. 2009; van der Kamp et al. 2012; Monahan 2012a). A finding common across these three previous studies is that mean vector wind components (i.e., the zonal, meridional, etc. component) have better predictability than mean wind speeds. Furthermore, the work of Salameh et al. found that meridional wind components were better predicted than the zonal projections. This anisotropy was interpreted as resulting from topographic influences constraining the dominant wind direction. A similar result was seen in van der Kamp et al. (2012), where maximum prediction skill of the vector projections at 18 sites in topographically complex regions of British Columbia was typically aligned with topographic features such as ocean straits and mountain ranges. In contrast, the current literature provides little consensus on which averaging time scales will have better predictability. Past work predicting surface winds in topographically complex regions of France found weekly time scales to be better predicted than daily or hourly time scales (Salameh et al. 2009), while a study of the sea surface winds at eight buoys off the coast of British Columbia found predictability to be greater at daily time scales relative to monthly time scales (Monahan 2012a). No previous study has systematically investigated the statistical predictability of surface winds across averaging time scale and wind statistic across a broad range of surface types.

This study will make use of the simple linear regression-based SD technique developed and employed in Monahan (2012a) and van der Kamp et al. (2012) to assess the predictability of surface winds at 44 observational sites located across four central Canadian provinces during four calendar seasons—December–February (DJF), March–May (MAM), June–August (JJA), and September–November (SON)—and spanning the years 1953–2006. By extending the scope of the previous research to 1) sites that span diverse locales, 2) multiple averaging time scales (daily, monthly, and seasonal), and 3) a large sample size and duration of observations, the study's aims are

1) to quantify the general linear statistical predictability of land surface winds using free-tropospheric predictors;

2) to assess the sensitivity of the prediction skill to

   (i) vector versus scalar predictands (e.g., wind components versus wind speed) and

   (ii) averaging time scale; and

3) to provide an interpretation of the difference in predictability between wind components and wind speed using idealized models of the surface wind probability distribution introduced in Monahan (2012a, 2013).

Predictions are made of historical winds only; no downscaling of winds in future climates is considered. Accordingly, the predictability results and discussion that follow represent an exploration of the linear relationship between the midtropospheric means and the statistics of the surface winds, and hence the utility of this tool in future prediction studies. The focus of this study is not the particular details of predictive skill at individual sites or seasons, but instead common features of predictability that hold across all sites and seasons.

St. George and Wolfe (2009) provided a first quantitative estimate of the potential predictability of wind speeds in the southern Canadian prairies (SCP). This earlier study found a significant and anticorrelated linear relationship between the SCP regional winter wind speeds and variability of the Niño-3.4 index, a measure of El Niño–Southern Oscillation (ENSO) based on sea surface temperature anomalies in the eastern half of the equatorial Pacific Ocean. As ENSO is known to have important effects on tropospheric circulations over North America (Allan et al. 1996), which presumably mediate the connection between Niño-3.4 anomalies and wind anomalies in the Canadian Prairies, at least a modicum of success should be expected if SD using free-tropospheric predictors is applied to the surface winds over this region.

Section 2 details the methodology and data used in this study, specifically the climate data used, the selection of the predictor variables, and the construction of the predictors. Section 3 presents the prediction skill realized with the statistical model. The predictability of the surface winds is discussed in section 4. Conclusions and a discussion of the results in the context of the SD literature are presented in section 5.

Throughout the study, the symbols *u* and *υ* generally denote orthogonal wind components with arbitrary orientation. When explicitly noted, these will specifically denote either (i) the zonal and meridional wind or (ii) the along- and cross-mean winds. The wind vector is denoted as **u**, while **e** denotes the direction of an arbitrary projection of the wind vector. The notation $\overline{y}$ is used to denote the mean of a quantity *y* on a particular time scale, and *σ*_{y} is used to denote the associated standard deviation of *y* on subaveraging time scales. That is, if $\overline{y}$ represents a monthly mean of *y*, *σ*_{y} represents the standard deviation of *y* within that month. The notations $\mathrm{mean}(\overline{y})$ and $\mathrm{std}(\overline{y})$ are used to denote the long-term (climatological) mean and standard deviation of $\overline{y}$ [note that $\mathrm{mean}(\overline{y})$ is insensitive to the averaging time scale of $\overline{y}$].

## 2. Methodology and model development

### a. Data

This study considers statistical relationships between surface wind data from anemometers on surface meteorological stations (the predictands) and the large-scale flow aloft (the predictors) from global reanalyses. These datasets will now be described in more detail.

#### 1) Predictands

Observations of surface winds from 44 stations were considered: 10 from Alberta (AB), 8 from Saskatchewan (SK), 7 from Manitoba (MN), and 19 from Ontario (ON) (Fig. 1 and Table 1). All stations are located at airports and have at least 30 years of hourly observations (with some missing data). From west to east, the locales of these sites vary from bordering the Rocky Mountain range to a large expanse of farmland/plains through eastern Alberta, Saskatchewan, and Manitoba to forested lands and coastal areas (along the Great Lakes). Available surface observations dating from 1953 to 2006 were used in this analysis.

The hourly data used to form the predictands were taken from two sources. To minimize the effect of discontinuities resulting from anemometer height relocations, the magnitude of the surface wind was taken from the adjusted historical hourly surface wind speeds (Wan et al. 2010). These were retrieved directly from the Environment Canada Adjusted and Homogenized Canadian Climate Data service (at http://ec.gc.ca/dccha-ahccd/). Unadjusted hourly data from the Meteorological Service of Canada in situ station records were used to provide the wind directions, as adjusted hourly data for direction were not available. These observations were retrieved directly from the Environment Canada Weather Office Climate Data Online service (at climate.weatheroffice.gc.ca/climateData/canada_e.html; Environment Canada 2011). All hourly observations contain the average of the direction and speed of the wind during the 2-min period ending at the hour of observation (Environment Canada 1977).

#### 2) Predictors

The statistical model predictors were constructed from large-scale fields from the National Centers for Environmental Prediction (NCEP)–National Center for Atmospheric Research (NCAR) Reanalysis I (R1) (Kalnay et al. 1996) product. The reanalysis product was provided by the National Oceanic and Atmospheric Administration Office of Oceanic and Atmospheric Research Earth System Research Laboratory Physical Sciences Division (NOAA/OAR/ESRL PSD), Boulder, Colorado (from ftp://ftp.cdc.noaa.gov/Datasets/). The R1 product is a global dataset, with a 2.5° × 2.5° resolution and a 6-h time step, spanning 1948 to the present. The NCEP North American Regional Reanalysis (NARR) was also considered in the current analyses; the prediction results were found to be insensitive to the finer spatial and temporal resolution of the NARR reanalysis product.

### b. Methods

#### 1) The statistical model

This study employs a multivariate linear regression model to capture the relationship between the information carried in the flow aloft and the local surface winds. Given the simplicity of the model used, the prediction skills found in this study may be interpreted as a floor for potential prediction skill, such that higher values could potentially be achieved with more complex statistical downscaling techniques.

Since linear regression requires a relatively small number of model parameters, the risks of overfitting the multivariate linear regression model are lower than with a more complex but more flexible model. To further minimize the risk of overfitting, a cross-validation scheme was used. Since year-to-year correlations are expected to be small for daily, monthly, and seasonally averaged surface wind quantities, each year was predicted individually using parameters trained with data from all other years. The variability of the regression coefficients between data subsets is small (generally on the order of 5%).
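The leave-one-year-out scheme described above can be sketched as follows. This is a minimal illustration on synthetic data, not the study's code: the function name, the array layout, and the use of the squared correlation between held-out predictions and observations as the skill measure are all assumptions made for the example.

```python
import numpy as np

def leave_one_year_out_r2(X, y, years):
    """Cross-validated r^2 of a multivariate linear regression.

    X: (n_samples, n_predictors) predictor matrix (hypothetical layout)
    y: (n_samples,) predictand
    years: (n_samples,) calendar year of each sample
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    years = np.asarray(years)
    y_hat = np.empty_like(y)
    for yr in np.unique(years):
        test = years == yr
        train = ~test
        # Fit intercept and slopes using data from all other years.
        A_train = np.column_stack([np.ones(train.sum()), X[train]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        # Predict the held-out year with those coefficients.
        A_test = np.column_stack([np.ones(test.sum()), X[test]])
        y_hat[test] = A_test @ coef
    # Skill (assumed here): squared correlation of held-out predictions and observations.
    return np.corrcoef(y_hat, y)[0, 1] ** 2

# Synthetic demonstration only (not the study's data).
rng = np.random.default_rng(0)
years = np.repeat(np.arange(1953, 1983), 3)   # 30 years x 3 months per season
X = rng.standard_normal((years.size, 4))
y = X @ np.array([1.0, -0.5, 0.0, 0.25]) + 0.5 * rng.standard_normal(years.size)
skill = leave_one_year_out_r2(X, y, years)
```

Because every prediction is made with coefficients that never saw the held-out year, the resulting skill is not inflated by in-sample fitting.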

#### 2) Constructing the predictands

Following the approach of van der Kamp et al. (2012), both wind speed and a full 360° array of vector wind projections around the compass were predicted. Projections of the wind vector were made at 10° increments spanning 170° counterclockwise from east to west, yielding 18 wind vector projections per site (by construction, projections in directions 180° apart differ only in sign). For each of daily, monthly, and seasonal time scales, both the time-scale mean wind quantities and the subaveraging time-scale standard deviations were calculated. The wind roses in Fig. 2 provide an illustration of how the wind vectors change with the averaging time scale at four representative locations.
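As an illustration of how such projection statistics might be formed from hourly speed and direction records, the sketch below computes the mean and standard deviation of the 18 projections for one averaging period. The function names and the synthetic data are hypothetical, and the conversion assumes the meteorological convention in which direction is measured clockwise from north and gives the direction the wind blows from.

```python
import numpy as np

def wind_components(speed, direction_deg):
    """Convert speed and meteorological direction (degrees clockwise from north,
    the direction the wind blows FROM) to eastward (u) and northward (v) components."""
    theta = np.deg2rad(direction_deg)
    return -speed * np.sin(theta), -speed * np.cos(theta)

def projection_stats(u, v, angles_deg=np.arange(0, 180, 10)):
    """Mean and standard deviation of u.e over one averaging period, for unit
    vectors e at each angle (degrees counterclockwise from east)."""
    phi = np.deg2rad(angles_deg)
    # Rows: observations; columns: the 18 projection directions.
    proj = np.outer(u, np.cos(phi)) + np.outer(v, np.sin(phi))
    return proj.mean(axis=0), proj.std(axis=0)

# Synthetic hourly data for one 30-day month (illustration only).
rng = np.random.default_rng(1)
speed = rng.uniform(0.0, 10.0, size=720)
direction = rng.uniform(0.0, 360.0, size=720)
u, v = wind_components(speed, direction)
mean_proj, std_proj = projection_stats(u, v)
mean_speed, std_speed = speed.mean(), speed.std()
```

The 0° and 90° columns reduce to the zonal and meridional components, so the conventional components are a special case of the projection array.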

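The vector wind predictands are the mean and the subaveraging time-scale standard deviation of the vector wind projections, $\overline{\mathbf{u}\cdot\mathbf{e}}$ and $\sigma_{\mathbf{u}\cdot\mathbf{e}}$.

Note that this approach will make no distinction between, for example, southeasterly and northwesterly winds as only their scalar product, thus absolute value, with the unit vector **e** is considered. The scalar wind predictands are the mean and the subaveraging time-scale standard deviation of the wind speed, $\overline{w}$ and $\sigma_{w}$.

Two additional vector wind predictands, associated with the idealized model used to relate the predictability of vector and scalar wind statistics (section 4), were calculated. They are the magnitude of the mean vector wind, *μ*, and the isotropic standard deviation of the vector wind components, *σ*.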

To avoid complications arising from potential seasonal nonstationarities in the relationship between the statistics of surface and free troposphere flows, each calendar season (DJF, MAM, JJA, and SON) was independently predicted.

#### 3) Constructing the predictors

We refer to Monahan (2012a) for a detailed description of the method employed to construct the set of predictors. In summary, the correlation of the local observations and the predictor variables at pressure levels spanning the troposphere was mapped to (i) assess the suitability of predictor fields and (ii) identify the vertical extent of the predictive information in the free-tropospheric flow. Combined empirical orthogonal function (EOF) decomposition was utilized to efficiently organize the variance contained in the predictors (as individual fields are spatially autocorrelated, and dynamical balances result in correlations between fields). Predictions of surface winds at representative sites were made using a range of numbers of EOF predictors at each pressure level to (i) determine a suitable number of predictors and (ii) determine the vertical extent of predictive information aloft.
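A combined EOF decomposition of this kind can be sketched with a singular value decomposition of the concatenated anomaly fields. The sketch below is illustrative only: the function name is hypothetical, the scaling of each field to unit overall variance before concatenation is an assumption, and area weighting is omitted, so it should not be read as the procedure of Monahan (2012a).

```python
import numpy as np

def combined_eof_predictors(fields, n_modes):
    """fields: list of (n_times, n_gridpoints) arrays, one per predictor variable.
    Returns the leading principal component time series, shape (n_times, n_modes)."""
    blocks = []
    for f in fields:
        anom = f - f.mean(axis=0)            # remove the time mean at each grid point
        blocks.append(anom / anom.std())     # assumed scaling: unit overall variance per field
    combined = np.hstack(blocks)
    # SVD of the combined anomaly matrix; columns of U * S are the PC time series.
    U, S, Vt = np.linalg.svd(combined, full_matrices=False)
    return U[:, :n_modes] * S[:n_modes]

# Synthetic fields standing in for five predictor variables (illustration only).
rng = np.random.default_rng(3)
n_times, n_grid = 60, 20
fields = [rng.standard_normal((n_times, n_grid)) for _ in range(5)]
pcs = combined_eof_predictors(fields, n_modes=6)
```

The returned time series are mutually uncorrelated and ordered by explained variance, which is what makes a small number of them usable as regression predictors.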

The reanalysis products offer a suite of potential predictors at pressure levels throughout the troposphere. The following five predictor variables, which have consistently produced meaningful prediction skill across previous studies (e.g., Salameh et al. 2009; Najac et al. 2009; Davy et al. 2010; van der Kamp et al. 2012), are 1) wind speed *W*, 2) zonal flow *U*, 3) meridional flow *V*, 4) geopotential height *Z*_{g}, and 5) temperature *T*. There are both empirical and theoretical reasons to expect these variables to have a strong relationship with surface winds, and hence to use them to drive the statistical model. On synoptic and longer time scales, the primary dynamic processes contributing to surface variability have structure throughout the troposphere, so winds at the surface are related to winds aloft. Furthermore, large-scale balances aloft couple atmospheric mass distribution, thermodynamic structure, and flow.

For a predictor to be useful in statistical downscaling, its correlation structure with the predictand must (i) indicate a strong linear relationship between the local surface flow and the flow aloft and (ii) be on horizontal scales that can be well resolved by general circulation models. An illustration of the horizontal extent and location of predictive information in the predictor variables at 600 hPa relevant to DJF monthly mean zonal wind variability at four locations is shown in Fig. 3. The white circle in each plot identifies the location of the surface station (Edmonton, AB; Estevan, SK; Kenora, ON; and Ottawa, ON). Note that the correlation patterns (Fig. 3) suggest that surface wind variability is driven by a combination of the Pacific–North America (PNA) pattern and the North Atlantic Oscillation (NAO), with the former contributing more on the western side of the continent and the latter more on the eastern side.

At all four of these stations, variability in monthly mean surface zonal wind is strongly correlated with the mean of these predictor variables on hemispheric scales. In contrast, the subaveraging time scale standard deviations of these free-tropospheric fields had weak and spatially disorganized statistical relationships to surface wind statistics (not shown). In consequence, we omitted the standard deviations of the flow aloft and utilized the means of the five previously listed predictor variables in the subsequent analyses.

As we seek to use predictive information from the free-tropospheric flow, the correlation maps were used to identify the vertical structure of predictive information. Consistent with Monahan (2012a) and van der Kamp et al. (2012), we find that predictive information aloft is relatively constant between 800 and 400 hPa.

The combined EOF decomposition was calculated over the spatial domain illustrated in Fig. 3; for simplicity, we chose to utilize the same domain for all averaging time scales and sites. Consequently, it was necessary to choose an EOF area that encompassed the main areas of relevant information on longer averaging time scales. From the results of the predictions made with increasing numbers of model predictors, suitable numbers of model predictors were determined to be 125, 10, and 6 for daily, monthly, and seasonal predictions, respectively. Prediction skills were insensitive to reasonable changes around these predictor numbers. Finally, consistent with the vertical structure of the correlation fields, prediction skill is found to be largely insensitive to predictor pressure level between 800 and 400 hPa. In the subsequent analysis, predictor variables from the 600-hPa reanalysis fields were used to drive the predictions.

## 3. Prediction results

Having created predictors for each of the 3-month calendar seasons (DJF, MAM, JJA, and SON) and identified that suitable predictive information was carried on the 600-hPa pressure level, predictions of historical wind observations were made on each of the daily, monthly, and seasonal time scales.

The cross-validated *r*^{2} prediction skills of monthly vector and scalar predictands at two representative sites are shown in the top row of Fig. 4. These polar plots show prediction skills of the two vector component predictands (mean and standard deviation, red curves) and the two wind speed predictands (mean and standard deviation, blue curves). Such polar plots were computed for each station, season, and averaging time scale. An atlas of these plots for all stations considered is presented in Culver (2012).

These plots clearly demonstrate an anisotropy of prediction skill for vector quantities. At Lethbridge, there are clear directions of maximum and minimum predictability of the vector wind predictands. For both means and standard deviations, the southwesterly (and northeasterly) vector winds are best predicted, whereas predictability of the vector winds is markedly reduced for northwesterly (and southeasterly) projections. The anisotropy at Red Deer is weaker, but still present. Such anisotropy was first noted by van der Kamp et al. (2012) and is found at most of the sites and seasons considered in the current study.

The primary differences in predictability among sites and seasons fall into three classes.

1) *There are quantitative differences in the predictability of the four predictands.* At each site there is some range of mean vector wind projections whose predictability approaches or exceeds *r*^{2} = 0.5 (on monthly time scales). In contrast, the predictability of the standard deviations and of the mean wind speed $\overline{w}$ varies considerably from site to site. There are sites where both the vector component predictands and the mean wind speed are well predicted (e.g., Lethbridge), while at other sites with nearly equivalent predictability of the vector predictands, the predictability of the mean wind speed is considerably smaller (e.g., Red Deer).

2) *There is variability in the orientation of the anisotropy of vector wind predictability.* There is, for instance, approximately 30° between the orientation of the best predicted monthly mean vector wind component at Lethbridge and that at Red Deer.

3) *There are differences in the magnitude of the anisotropy of vector wind predictability.* In the case of Lethbridge, the predictability of the mean vector component has a distinct minimum at north-northwestern (and south-southeastern) projections. On the other hand, the predictability of the mean components at sites such as Red Deer is much less sensitive to the direction of the vector projection.

A representative spatial map of the predictability of the DJF monthly mean wind speed [$r^2(\overline{w})$, bottom plot] and that of the monthly mean vector wind [$r^2(\overline{\mathbf{u}\cdot\mathbf{e}})$, top plot] provided in Fig. 1 further illustrates these differences. To reduce the amount of information to be considered, we focus our attention on the best predicted vector components. First, we see that, in general, a projection (direction **e**) can be chosen at each site such that $r^2(\overline{\mathbf{u}\cdot\mathbf{e}})$ is close to or greater than 0.5. On the other hand, there is considerable spatial variability in the predictability of $\overline{w}$, and the speeds are consistently more poorly predicted than the best-predicted vector wind component on monthly time scales. Second, we find that anisotropy is generally weak and oriented orthogonal to the neighboring large-scale topography at western Alberta sites, while in Manitoba, anisotropy is generally large and sometimes (but not always) aligned along large bodies of water. At many lakefront Ontario sites, $r^2(\overline{\mathbf{u}\cdot\mathbf{e}})$ is weakly anisotropic and generally not aligned with the lake orientations. In general, no clear picture of the influence of large-scale surface inhomogeneities (e.g., strong topographic relief, large bodies of water) on the degree or orientation of vector wind prediction anisotropy is seen.

Despite the differences in anisotropy across sites, there are some general features of surface wind predictability that hold across sites and seasons. As we consider 44 sites and the four seasons, we obtain 176 prediction skill results for each predictand. The probability densities in Fig. 5 represent the range and relative frequency of the predictability of the statistical features as a function of their averaging time scale (i.e., seasonal, monthly, or daily). Prediction results from all sites and the four calendar seasons (i.e., across site and season) are included in the estimate of the densities. We see that the predictability of means generally increases as the averaging time scale decreases, for both vector and scalar winds. As well, predictions of the best-predicted mean vector wind component are quite skillful at many sites, particularly for daily means (in all seasons). In contrast, the SD model is quite poor at predicting $\overline{w}$ and the standard deviation of both scalar and vector winds. Finally, while the predictions of $\overline{w}$ are typically poor, there are instances on all averaging time scales where $r^2(\overline{w})$ approaches that of the best predicted $\overline{\mathbf{u}\cdot\mathbf{e}}$. It follows that $\overline{w}$ is strongly related to the mean vector wind at some sites, while at others it is not. In the following section we will provide an idealized model to understand these differences in predictability.

## 4. Interpreting predictability

The difference in predictability of scalar and vector statistics is robust, holding across season, site, and averaging time scale. This result may initially appear puzzling: after all, the wind speed is simply the magnitude of the vector winds:

$$w = \sqrt{u^2 + \upsilon^2}. \tag{1}$$

The following discussion will present an empirical and a theoretical approach to explain the disparity in predictability.

First, we note that the equality in Eq. (1) is only valid at a particular instant in time; the equation does not hold for time-averaged components and wind speed:

$$\overline{w} \neq \sqrt{\overline{u}^2 + \overline{\upsilon}^2}. \tag{2}$$

Consider a hypothetical site located along the coast of a large body of water, where the direction of the surface temperature gradient reverses from day to night. As the land temperature rises in the daytime, an onshore flow is generated. At night, when the land cools and its temperature drops below that of the body of water, an offshore wind results. Averaged over a daily time scale, if the daytime onshore winds are as strong as the offshore nighttime winds, the mean wind speed will be nonzero while the mean vector wind will be zero. The mean wind speed is therefore not a function of the mean vector wind components alone.
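The coastal example can be made concrete with a few lines of arithmetic. The 4 m/s wind strength and the 12-hour split between onshore and offshore flow are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical coastal site: 12 h of 4 m/s onshore flow, then 12 h of 4 m/s offshore flow.
u_hourly = [4.0] * 12 + [-4.0] * 12   # cross-shore component (m/s)
v_hourly = [0.0] * 24                 # along-shore component (m/s)

mean_u = sum(u_hourly) / len(u_hourly)
mean_v = sum(v_hourly) / len(v_hourly)

# Daily mean of the hourly speeds versus the magnitude of the daily mean vector wind.
mean_speed = sum(math.hypot(u, v) for u, v in zip(u_hourly, v_hourly)) / len(u_hourly)
speed_of_mean = math.hypot(mean_u, mean_v)

print(mean_speed, speed_of_mean)  # 4.0 0.0
```

The daily mean speed is 4 m/s even though the daily mean vector wind vanishes, because all of the speed is carried by the within-day variability.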

### a. Empirical relationship between $\overline{w}$ and the vector wind statistics

The linear dependence of $\overline{w}$ on the vector wind statistics can be quantified by calculating the correlation of $\overline{w}$ with means and with standard deviations of the vector wind projections along various axes. In the lower row of Fig. 4, polar plots display the calculated correlation (*ρ*) between DJF monthly $\overline{w}$ and the statistics of vector wind components at the two Alberta sites considered in the previous section. It is evident that the fluctuations of the mean zonal wind component exercise a strong influence over monthly $\overline{w}$ at Lethbridge. In contrast, the monthly $\overline{w}$ at Red Deer is predominantly influenced by the fluctuations of *σ*_{u·e}: the month-to-month variations in the mean vector winds are only weakly correlated with changes in $\overline{w}$.
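A minimal sketch of this correlation calculation is given below, using a synthetic site with a strong and variable mean eastward flow. The function name, array layout, and all numerical values are hypothetical and serve only to show the structure of the computation.

```python
import numpy as np

def speed_correlations(u, v, angles_deg=np.arange(0, 180, 10)):
    """u, v: (n_periods, n_obs) arrays of wind components, one row per averaging period.
    Returns the correlation of the period-mean wind speed with (a) the period mean and
    (b) the within-period standard deviation of u.e, for each projection angle."""
    phi = np.deg2rad(angles_deg)
    w_bar = np.hypot(u, v).mean(axis=1)
    # proj has shape (n_periods, n_obs, n_angles).
    proj = u[..., None] * np.cos(phi) + v[..., None] * np.sin(phi)
    proj_mean, proj_std = proj.mean(axis=1), proj.std(axis=1)
    rho_mean = np.array([np.corrcoef(w_bar, proj_mean[:, k])[0, 1] for k in range(phi.size)])
    rho_std = np.array([np.corrcoef(w_bar, proj_std[:, k])[0, 1] for k in range(phi.size)])
    return rho_mean, rho_std

# Synthetic site: monthly mean eastward wind varies from month to month (illustration only).
rng = np.random.default_rng(2)
n_months, n_hours = 120, 720
monthly_mean_u = rng.normal(8.0, 2.0, size=(n_months, 1))
u = monthly_mean_u + 2.0 * rng.standard_normal((n_months, n_hours))
v = 2.0 * rng.standard_normal((n_months, n_hours))
rho_mean, rho_std = speed_correlations(u, v)
```

For this synthetic site the mean speed tracks the mean eastward component closely and is nearly uncorrelated with the mean northward component, which is the kind of directional contrast the polar correlation plots display.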

Comparing the polar correlation plots to the polar prediction plots above demonstrates that the predictability of $\overline{w}$ varies with the alignment of the mean vector wind component that is best predicted relative to that to which $\overline{w}$ is most sensitive. For example, at Lethbridge $\overline{w}$ is strongly correlated to the mean and (to a somewhat lesser extent) the standard deviation of the zonal winds. These strong sensitivities suggest that the predictability of $\overline{w}$ should be about as strong as that of the mean zonal wind, which is in fact what is observed. In contrast, there is very little correlation between $\overline{w}$ and $\overline{\mathbf{u}\cdot\mathbf{e}}$ at Red Deer, but a substantial correlation between $\overline{w}$ and the standard deviation of the northwest (southeast) vector projections. The predictability of $\overline{w}$ is seen to be about as strong as $r^2(\sigma_{\mathbf{u}\cdot\mathbf{e}})$ in this direction. As the predictability of the subaveraging time scale standard deviation is weak, $r^2(\overline{w})$ is correspondingly poor at this location.

We also note that, while the dominant wind vector direction at Lethbridge (Fig. 2, second row) is well aligned with that of the maximum correlation between mean wind speed and mean vector wind component (Fig. 4, bottom left), we found that in general the dominant wind direction does not align with the direction of the strongest correlation or that of the best-predicted mean vector component.

These polar correlation plots represent a general means of interpreting the predictability of $\overline{w}$ in terms of the predictability of the statistical features of the vector wind (when $\overline{w}$ is linearly related to these vector wind statistics). In particular, they demonstrate that good predictions of $\overline{w}$ do not generally result from good predictions of the vector wind components alone; alignment between the anisotropy of vector wind predictability and that of the sensitivity of $\overline{w}$ is also important. These correlation plots do not explain why $\overline{w}$ should be sensitive to mean vector winds in some cases (and from certain directions), and to the standard deviation of vector winds in other cases. We will now consider an idealized probability model (IPM) to address these different sensitivities.

### b. The idealized probability distribution model

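In Monahan (2012a), an idealized model of the wind speed probability distribution function was introduced to study the mean wind speed as a function of the vector wind statistics. Assuming that the vector wind fluctuations are Gaussian, isotropic, and uncorrelated, it can be shown that

$$\overline{w} = \sigma\sqrt{\frac{\pi}{2}}\,\exp\!\left(-\frac{\mu^2}{4\sigma^2}\right)\left[\left(1+\frac{\mu^2}{2\sigma^2}\right)I_0\!\left(\frac{\mu^2}{4\sigma^2}\right)+\frac{\mu^2}{2\sigma^2}\,I_1\!\left(\frac{\mu^2}{4\sigma^2}\right)\right], \tag{3}$$

where $I_k(z)$ is the modified Bessel function of the first kind of order *k* (Rice 1945),

$$\mu = \sqrt{\overline{u}^2 + \overline{\upsilon}^2}$$

is the magnitude of the mean vector wind, and *σ* is the isotropic standard deviation of the vector wind components.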

The IPM was first introduced for the analysis of sea surface winds over the sub-Arctic Pacific Ocean (Monahan 2012a), and the assumptions on which it is based may be less appropriate over land. To quantify the error associated with these approximations, the monthly mean wind speeds (for each site and season) were calculated from monthly mean vector wind statistics using Eq. (3) and compared to monthly mean wind speeds computed directly from observations . The isotropic variance *σ*^{2} was estimated as the mean of the variances of the vector wind components:

The observed and modeled monthly mean wind speeds are compared in Fig. 6, which displays estimates of the distribution of the correlation between modeled and observed values. Despite the model's approximations, the uniformly large values of these correlations demonstrate that the IPM provides an excellent characterization of variability in the mean wind speed.
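The vector wind statistics that enter such a comparison can be sketched as follows. The example estimates *μ*, *σ*, and the directly computed mean speed from paired component samples, and checks two distribution-free facts: the mean speed is never smaller than *μ* and never larger than the root-mean-square speed √(*μ*² + 2*σ*²). The function name `vector_wind_stats`, the use of population (1/*n*) variances, and the synthetic data are assumptions made for illustration.

```python
import math
import random

def vector_wind_stats(u, v):
    """Return (mu, sigma, mean_speed) from paired samples of the wind components."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    var_u = sum((x - u_bar) ** 2 for x in u) / n  # population variance (assumed)
    var_v = sum((y - v_bar) ** 2 for y in v) / n
    mu = math.hypot(u_bar, v_bar)                 # magnitude of the mean vector wind
    sigma = math.sqrt(0.5 * (var_u + var_v))      # isotropic standard deviation
    mean_speed = sum(math.hypot(x, y) for x, y in zip(u, v)) / n
    return mu, sigma, mean_speed

# Synthetic "month" of hourly winds (illustrative values, not station data).
rng = random.Random(0)
u = [4.0 + rng.gauss(0.0, 3.0) for _ in range(720)]
v = [1.0 + rng.gauss(0.0, 3.0) for _ in range(720)]

mu, sigma, mean_speed = vector_wind_stats(u, v)
upper = math.sqrt(mu ** 2 + 2.0 * sigma ** 2)     # root-mean-square speed
print(f"mu={mu:.2f} sigma={sigma:.2f} mean speed={mean_speed:.2f} rms speed={upper:.2f}")
```

The gap between *μ* and the root-mean-square speed widens as *σ* grows relative to *μ*, which gives a quick indication of how much room the subaveraging variability has to influence the mean speed.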

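One particularly useful aspect of the IPM is that the sensitivities of the statistical features of the wind speed, namely the mean wind speed $\overline{w}$ and the wind speed standard deviation *σ _{w}*, to *μ* and *σ* ($\partial_{\mu}\overline{w}$, $\partial_{\sigma}\overline{w}$, $\partial_{\mu}\sigma_{w}$, and $\partial_{\sigma}\sigma_{w}$) are functions of the dimensionless ratio *μ*/*σ* alone (Monahan 2012a). It is therefore convenient to introduce the bounded variable

$$
\theta = \tan^{-1}\!\left(\frac{\mu}{\sigma}\right).
$$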

The sensitivities of the mean wind speed $\overline{w}$ and of the wind speed standard deviation *σ _{w}* to *μ* and *σ* ($\partial_{\mu}\overline{w}$, $\partial_{\sigma}\overline{w}$, $\partial_{\mu}\sigma_{w}$, and $\partial_{\sigma}\sigma_{w}$) as functions of *θ* are displayed in Fig. 7. We see that the end members of the *θ* range represent two distinct regimes of sensitivity. In a low *θ* regime, the variability of mean wind speed is dominated by the variability of *σ*, and essentially all the information relevant to the variability of mean wind speeds is held in *σ*. At the other *θ* extreme, the variability of mean wind speed is dominated by that of *μ*. In contrast, *μ* has very little influence over *σ _{w}*; the variability in *σ _{w}* is determined primarily by the vector wind standard deviation, *σ* (Fig. 7b).
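The two regimes can be made concrete with the standard limiting forms of the Rice distribution, which is the speed distribution implied by Gaussian, isotropic, uncorrelated vector wind fluctuations. The expressions below are a sketch derived from those assumptions rather than results quoted from the study; *μ* ≪ *σ* corresponds to the low *θ* end and *μ* ≫ *σ* to the high *θ* end.

```latex
% Exact second moment, and hence the wind speed variance:
\overline{w^{2}} = \mu^{2} + 2\sigma^{2}, \qquad
\sigma_{w}^{2} = \mu^{2} + 2\sigma^{2} - \overline{w}^{2}

% Limit \mu \ll \sigma:
\overline{w} \approx \sigma\sqrt{\tfrac{\pi}{2}}\left(1 + \frac{\mu^{2}}{4\sigma^{2}}\right), \qquad
\partial_{\sigma}\overline{w} \to \sqrt{\tfrac{\pi}{2}}, \quad
\partial_{\mu}\overline{w} \to 0, \quad
\sigma_{w} \approx \sqrt{2 - \tfrac{\pi}{2}}\,\sigma \approx 0.66\,\sigma

% Limit \mu \gg \sigma:
\overline{w} \approx \mu + \frac{\sigma^{2}}{2\mu}, \qquad
\partial_{\mu}\overline{w} \to 1, \quad
\partial_{\sigma}\overline{w} \approx \frac{\sigma}{\mu} \to 0, \quad
\sigma_{w} \approx \sigma
```

In both limits *σ _{w}* is proportional to *σ* and independent of *μ* to leading order, which is consistent with the weak influence of *μ* on *σ _{w}* described above.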

We will use the IPM characterizations of the sensitivities of mean wind speed and *σ _{w}* to the statistical features of the vector winds to interpret the observed differences in predictability between scalar and vector wind statistics.

### c. The predictability of mean wind speed relative to μ and σ

Histograms of the time-mean *θ* values (on seasonal, monthly, and daily time scales) observed at the sites under consideration are displayed in Fig. 8. It is clear that longer averaging time scales are generally associated with lower *θ* values, for which mean wind speed is most sensitive to *σ*. This trend is a result of the fact that *σ* increases with averaging time scale. On daily time scales, the observations are found to be in an intermediate *θ* regime in which mean wind speed is nearly equally sensitive to *μ* and *σ*.

To evaluate the utility of the IPM in interpreting the predictability of mean wind speed, cross-validated linear regression predictions of *μ* and *σ* were made from the free-tropospheric predictors and compared to corresponding predictions of mean wind speed. Note that predicting *μ* directly is not in general the same thing as predicting the two mean vector wind components separately and then combining these into a prediction of *μ*, as these quantities are nonlinearly related (Monahan 2013). Scatterplots of the cross-validated *r*^{2} prediction skill of *μ* and *σ* compared to that of mean wind speed (for all seasons and sites) are displayed in Fig. 9 for each of the averaging time scales considered. For each averaging time scale, predictions were made at 44 sites and four seasons; hence each scatterplot has 44 × 4 = 176 points, each representing the prediction skill of *μ* (top row) or *σ* (bottom row) plotted against that of mean wind speed at a particular site and season. There is a high degree of correspondence between the predictability of mean wind speed and of *σ* on both seasonal and monthly averaging time scales. This is in stark contrast to the evident absence of a strong relationship between the *r*^{2} values of mean wind speed and of *μ* on these time scales. These relationships between predictabilities are consistent with the expectations from the IPM for low *θ* settings.
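A minimal sketch of how a cross-validated *r*^{2} skill score of this kind could be computed is given below. It assumes a single predictor, ordinary least squares, and leave-one-out cross-validation; the study's actual predictor set and cross-validation scheme are not specified in this section, and the function names and data values are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x for a single predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b

def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)

def loo_cv_r2(x, y):
    """Squared correlation between y and its leave-one-out predictions."""
    predictions = []
    for i in range(len(x)):
        # Fit on every sample except i, then predict the held-out sample.
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        predictions.append(a + b * x[i])
    r = correlation(predictions, y)
    # Choice made in this sketch: anticorrelated predictions count as zero skill.
    return max(r, 0.0) ** 2

# Illustrative predictor/predictand pairs (made-up numbers, not station data).
x = [0.5, 1.1, 1.9, 3.2, 4.0, 4.8, 6.1, 7.0]
y = [1.2, 1.9, 2.4, 4.1, 4.6, 5.9, 6.8, 8.1]
print(f"cross-validated r^2 = {loo_cv_r2(x, y):.2f}")
```

Because each prediction is made from a fit that excludes the target sample, the resulting score is not inflated by in-sample fitting in the way an uncross-validated *r*^{2} can be.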

On daily time scales, there are linear relationships between the predictability of mean wind speed and that of both *μ* and *σ*. Since the IPM predicts an approximately equal sensitivity of mean wind speed to both variables, the contrasting strengths of these correlations are somewhat surprising at first. This result is a consequence of the fact that, although mean wind speed is equally sensitive to variations in *μ* and *σ*, day-to-day variations in *μ* are generally observed to be a factor of 2–3 times larger than those of *σ*.
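The effect of the larger day-to-day variations in *μ* can be illustrated with a linearized variance budget. The sketch below assumes equal sensitivities and uncorrelated fluctuations in *μ* and *σ*; the second assumption is made here for simplicity and is not stated in the study. The symbol *k* denotes the ratio of the standard deviation of *μ* to that of *σ*.

```latex
% Linearized response of the mean wind speed:
\delta\overline{w} \approx (\partial_{\mu}\overline{w})\,\delta\mu + (\partial_{\sigma}\overline{w})\,\delta\sigma

% Fraction of the variance of \overline{w} contributed by \mu when
% \partial_{\mu}\overline{w} = \partial_{\sigma}\overline{w} and \mathrm{std}(\mu) = k\,\mathrm{std}(\sigma):
\frac{k^{2}}{1 + k^{2}} = 0.80 \ (k = 2), \qquad 0.90 \ (k = 3)
```

Under these assumptions, fluctuations in *μ* account for roughly 80% to 90% of the daily mean wind speed variance even though the two sensitivities are equal.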

### d. The predictability of μ relative to the mean vector wind components

The predictability range of *μ* and *σ* may be assessed by revisiting the scatterplots displayed in Fig. 9. While *μ* and *σ* display a wide range of predictive skills, in general these skills, being comparable to those of mean wind speed, are not particularly large. A further illustration of this fact is seen in the *r*^{2} probability distributions (across season and site) of seasonal, monthly, and daily quantities of *μ* and *σ* displayed in Fig. 10 (top row). For the locations and time scales under consideration, there are no substantial differences in the predictability of *μ* and *σ*.

The poor predictability of *μ* is seemingly at odds with the relatively good predictability of the mean vector wind components, particularly on daily time scales (Fig. 10, bottom row). This discrepancy results from the fact that *μ* is a nonlinear transformation of the mean vector winds; it will now be demonstrated that there will be instances where both of the wind components may be quite well predicted while *μ* will have no linear predictability.

A discussion of the linear predictability of vector wind magnitude in terms of the predictability of the two vector wind components is presented in Monahan (2013). This earlier study takes advantage of the fact that to an excellent approximation, for any predictor *x*,

$$
\mathrm{corr}(x, \mu) \simeq \mathrm{corr}(x, \mu^{2}),
$$

where corr(*x*, *μ*) is the correlation of *x* and *μ* [as is demonstrated in Monahan (2012b)].

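With this approximation, simplified expressions for the predictability of *μ* in terms of the predictability of the two mean vector wind components, $\overline{u}$ and $\overline{v}$, can be obtained if it is assumed that fluctuations in $\overline{u}$ and $\overline{v}$ are uncorrelated. The linear predictability of *μ*, corr(*x*, *μ*) = *ρ _{μ}*, can then be expressed in terms of the respective linear predictabilities of $\overline{u}$ and $\overline{v}$ (Monahan 2013).

We can align the coordinate system such that $\overline{u}$ and $\overline{v}$ are the wind components along and across the climatological mean wind. In this case the climatological mean of $\overline{v}$ is zero, and the resulting expression [Eq. (9)] relates *ρ _{μ}* to the predictability of $\overline{u}$ alone through a factor that depends on a parameter *γ*. It follows that the predictability of *μ* is bounded above by that of $\overline{u}$, which itself is by definition less than or equal to the predictability of the best-predicted mean vector wind component. At sites where the prevailing winds dominate (*γ* ≫ 1), the predictability of *μ* approaches that of $\overline{u}$. However, where the prevailing winds are weak, the predictability of *μ* is much lower than that of $\overline{u}$.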
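One way to obtain expressions of this type is sketched below. It adds the assumption that the predictor *x* and the two mean vector wind components are jointly Gaussian, and it uses the approximation that the correlation of *x* with *μ* is close to its correlation with *μ*². The notation ($m_u$, $m_v$, $s_u$, $s_v$, $\rho_u$, $\rho_v$) is introduced here for illustration, and the result is not necessarily identical in form to the equations of the original study.

```latex
% m_u, m_v: climatological means of the mean vector wind components \overline{u}, \overline{v}
% s_u, s_v: their standard deviations;  \rho_u, \rho_v: their correlations with x
\mathrm{cov}(x, \overline{u}^{2}) = 2\,m_u\,\mathrm{cov}(x, \overline{u}), \qquad
\mathrm{var}(\overline{u}^{2}) = 4 m_u^{2} s_u^{2} + 2 s_u^{4}

% With \overline{u} and \overline{v} uncorrelated:
\rho_{\mu} \simeq \mathrm{corr}(x, \overline{u}^{2} + \overline{v}^{2})
= \frac{m_u s_u \rho_u + m_v s_v \rho_v}
       {\sqrt{m_u^{2} s_u^{2} + m_v^{2} s_v^{2} + \tfrac{1}{2}\left(s_u^{4} + s_v^{4}\right)}}

% Axes aligned with the climatological mean wind (m_v = 0, m_u > 0):
\rho_{\mu} \simeq \frac{\rho_u}{\sqrt{1 + \dfrac{s_u^{4} + s_v^{4}}{2\,m_u^{2} s_u^{2}}}},
\qquad |\rho_{\mu}| \le |\rho_u|
```

In this sketch the bound is approached when $m_u \gg s_u, s_v$ (strong prevailing wind), and the ratio $\rho_{\mu}/\rho_u$ becomes small when the variability of the components is large compared with $m_u$.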

Scatterplots (across site and season) of the predictability of *μ* on seasonal, monthly, and daily averaging time scales relative to that of the mean vector wind component along the climatological mean wind are shown in Fig. 11. The points are color coded by the theoretical relative predictability of *μ* as calculated with Eq. (9) at each individual site and season. We find that the relative predictability of *μ* is well characterized by Eq. (9) and is strongly related to the variability of the mean vector wind. The relative predictability of *μ* generally decreases from being (i) equal to that of the along-wind component on a seasonal averaging time scale, to (ii) broadly distributed on a monthly averaging time scale, down to (iii) generally lower than that of the along-wind component on daily time scales. The sharp contrast in the relative predictability of *μ* on a seasonal versus a daily averaging time scale is a result of the increasing proportion of surface wind variability resolved on the averaging time scale (and hence the increasing variability of the mean vector wind) as the averaging time scale decreases.
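The dependence of the relative predictability of *μ* on the strength of the prevailing wind can be explored with a small Monte Carlo experiment. The sketch below simulates a Gaussian predictor correlated with the along-wind component only, and compares the sampled correlation with *μ* against a closed-form approximation derived under a joint Gaussian assumption. Both the closed form in `gaussian_rho_mu` and all function names are assumptions of this sketch and may differ from the expression used in the study.

```python
import math
import random

def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)

def gaussian_rho_mu(rho_u, mean_u, std_u, std_v):
    """Approximate corr(x, mu) when the mean cross-wind component is zero."""
    ratio = (std_u ** 4 + std_v ** 4) / (2.0 * mean_u ** 2 * std_u ** 2)
    return rho_u / math.sqrt(1.0 + ratio)

def simulate_rho_mu(rho_u, mean_u, std_u, std_v, n=20000, seed=0):
    """Sampled corr(x, mu) for Gaussian components and a Gaussian predictor."""
    rng = random.Random(seed)
    x, mu = [], []
    for _ in range(n):
        z1, z2, z3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        u = mean_u + std_u * z1   # along-wind component
        v = std_v * z2            # cross-wind component, zero mean
        x.append(rho_u * z1 + math.sqrt(1.0 - rho_u ** 2) * z3)  # corr(x, u) = rho_u
        mu.append(math.hypot(u, v))
    return corr(x, mu)

# Strong versus weak prevailing wind (illustrative parameter values).
for mean_u in (5.0, 0.5):
    approx = gaussian_rho_mu(0.8, mean_u, 1.0, 1.0)
    sampled = simulate_rho_mu(0.8, mean_u, 1.0, 1.0)
    print(f"mean_u={mean_u}: approx={approx:.2f} sampled={sampled:.2f}")
```

With a strong prevailing wind the correlation with *μ* stays close to that with the along-wind component, whereas with a weak prevailing wind it drops to well under half of it, mirroring the contrast between time scales described above.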

That the mean vector wind components can be well predicted while *μ* is poorly predicted implies that high values of *θ* are not necessarily an a priori indication of good predictability of mean wind speed. Good predictions of mean wind speed by SD require both high values of *θ* and a high predictability of *μ* relative to that of the mean vector wind components. While the first of these quantities increases with decreasing averaging time scale, the second decreases. It follows that at any location there should be some optimal time scale on which mean wind speed is best predicted, as determined by the distribution of vector wind variance across time scales.

## 5. Discussion and conclusions

This study's purpose has been twofold. First, the study sought to quantify the predictability of vector and scalar surface winds at Canadian sites with simple and robust statistical downscaling (SD) methods. Second, the study sought to investigate and interpret the relative predictability of scalar and vector wind quantities.

### a. Summary of results

The predictive information contained in the flow aloft relevant to the surface winds at sites in Alberta, Saskatchewan, Manitoba, and Ontario was found to occur on large scales throughout the midtroposphere. Statistical features of the historical wind observations as computed on daily, monthly, and seasonal averaging time scales were predicted for each of the three-month calendar seasons DJF, MAM, JJA, and SON.

The results of this study demonstrate a high degree of variability in the predictability of wind statistics across different sites and seasons. Some general patterns do emerge from this variability:

- the vector component predictands are generally better predicted than the speed predictands;
- vector wind predictability is generally highly anisotropic;
- the means are generally better predicted than the standard deviations; and
- the prediction skill of mean quantities generally increases with decreases in the averaging time scale (from seasonal to monthly to daily).

In particular, it was found that on all time scales the predictability of wind speed is lower than that of the best-predicted vector wind component. Empirically and theoretically based relationships between the predictability of wind speed statistics and those of the vector wind statistics were also considered.

Polar plots of the correlation of mean wind speed with statistics of the vector components demonstrated that fluctuations in mean wind speed can be predicted with high skill if they are dominated by fluctuations in mean vector wind components, which in turn can be predicted with high skill.

The idealized probability model (IPM) introduced in the sea surface winds SD study of Monahan (2012a) was found to provide an excellent characterization of land surface mean wind speeds at all sites considered (in spite of its many assumptions).

Analysis of SD results confirmed the existence of two end-member regimes of sensitivities, as determined by the ratio of the vector wind standard deviation, *σ*, to the magnitude of the mean vector wind, *μ*. In a low *θ* regime, the predictability of mean wind speed is explained solely by that of *σ*. In a high *θ* regime, the predictability of mean wind speed is dependent on that of *μ*. The predictability of the wind speed standard deviation, *σ _{w}*, is primarily dependent on the predictability of *σ* regardless of the *θ* regime.

The predictability of *μ* and that of *σ* were found to be low, similar to that of mean wind speed. While the predictability of *σ* was found to be low with the methods considered, under certain conditions the predictability of *μ* was shown to approach that of the mean vector winds. In wind climates where the prevailing winds dominate, such as at Lethbridge, Alberta (Fig. 2), the potential predictability of *μ* will approach that of the mean vector wind component. When the prevailing winds are weak (e.g., Churchill, Manitoba, and North Bay, Ontario; Fig. 2), the upper limit on the predictability of *μ* is much lower than that of the vector winds.

A schematic illustration of the relationship between the predictability of mean wind speed and the mean vector wind is given in Fig. 12. We find that a high value of *θ* and, hence, strong sensitivity of mean wind speed to *μ* is a necessary, but not a sufficient, condition for high mean wind speed predictability. High mean wind speed predictability (relative to the vector wind components) requires a wind climate with relatively little directional variability (e.g., Lethbridge, AB; Fig. 2). Wind speeds in regions of strong directional variability (e.g., North Bay, ON; Fig. 2) will generally be poorly predicted.

### b. Discussion of results

The key conclusion that is drawn from the present analysis is the demonstration that the predictability of wind speed relative to the best predicted vector wind component is a function of the local wind climate, determined by the degree of variability in the wind (on time scales shorter or longer than the averaging time scale) relative to the mean vector wind. This systematic analysis, across sites, seasons, and time scales, is new to the present study.

The predictability of the surface winds at two southern Alberta and four southern Saskatchewan sites was previously explored by St. George and Wolfe (2009), who found a prediction skill (*r*^{2}) of 0.24 when directly correlating regional southern Canadian prairies (SCP) wintertime seasonal wind speed with the Niño-3.4 index. However, as St. George and Wolfe did not perform a cross-validated prediction, this *r*^{2} value is best interpreted as an upper bound on the linear predictive skill of the Niño-3.4 index. Furthermore, the regional wind speed time series in St. George and Wolfe (2009) was constructed by averaging the wintertime seasonal means of six widely separated sites. In the present study, predictions were cross-validated and sites were predicted individually (thus including local effects that may have been lost through aggregation). Nevertheless, the predictability of mean winds at the SCP sites was found to be greater than that indicated by direct correlations with the Niño-3.4 index. Across the SCP sites, the mean wintertime wind speed prediction *r*^{2} was 0.31, in contrast to 0.24 from Niño-3.4 directly. While this value represents an improvement over the estimate of prediction skill reported by St. George and Wolfe, this cross-validated prediction skill is still relatively small.

The results in the SCP and the other sites considered show that in many circumstances mean wind speed is heavily influenced by subaveraging time scale variability, which has a relationship with the averaged flow aloft that is either more complex than the capabilities of linear statistical models or very weak. This is a crucial result, as it represents a potential inherent limitation of any statistical downscaling approach for surface wind speeds.

Two earlier studies, van der Kamp et al. (2012) and Salameh et al. (2009), have considered differences in predictability of different vector wind components. Both studies attribute the anisotropy to the influence of topography because of the observed generally strong alignment of predictability along topographic features in their study domains. In the van der Kamp et al. (2012) study, there were also a small number of sites whose vector wind predictability did not align with local or mesoscale features. The current analyses found that sites of similar distances from the Rocky Mountains exhibit contrasting orientations and magnitudes of anisotropy in mean vector wind predictability. Furthermore, sites in regions that are devoid of large topographical features, such as eastern Saskatchewan, Manitoba, and western Ontario, generally exhibit stronger anisotropy than the Ontario and western Alberta sites with strong local and large-scale topographic features. These results suggest that, while topography may be an important factor, it is not the sole influence on the anisotropy of predictability. An investigation of the physical controls on the anisotropy of vector wind predictability is an area of future study.

The dependence of mean vector wind predictability on averaging time scales was considered in Salameh et al. (2009) and Monahan (2012a). Salameh et al. made predictions of 6-hourly, daily, and weekly mean surface flows at six French cities. At these sites, the prediction skill of vector components was found to increase with the averaging time scale. On the other hand, both Monahan (2012a) and the current study have found predictability to be better on short averaging time scales (daily) relative to longer time scales (monthly and seasonal). Direct comparisons of the results presented in Salameh et al. (2009), Monahan (2012a), and the current study are complicated by the facts that (i) the daily averaging time scale is the only averaging time scale held in common by all three studies, (ii) the studies considered regions with dramatically different topographic variability, and (iii) the earlier two studies considered only the zonal and meridional components. The direction of the best predicted component may vary by up to 30° from one calendar season to another and is not in general aligned zonally or meridionally (van der Kamp et al. 2012). As well, the direction of the best predicted component also varies with the averaging time scale. These uncertainties suggest caution in directly comparing the observed influence of the averaging time scale on predictability across these studies.

The averaging time scale may influence predictability by controlling the proportion of variability resolved by the means of surface winds. For land surface winds, most of the wind variability is typically on synoptic time scales, which are resolved on daily averages, but are part of the subaveraging time scale variability for monthly and seasonal averages. The results of these studies indicate that, when the dominant variability is resolved (by both predictors and predictands), prediction skill is at its best. Further evaluations of wind speed predictability over a wider range of wind climates (e.g., over the oceans) would allow an assessment of the generality of this finding. Developing a deeper understanding of the controls on the relative predictability of winds across different averaging time scales is an important direction for future research, as is an evaluation of averaging time scales that balance good predictability of mean wind speed relative to *μ*, and of *μ* relative to the mean vector wind components.

### c. Broader implications of results

The statistical methods considered in the present study are found to yield only weak predictability of wind speed on daily, monthly, and seasonal averaging time scales. The results suggest that improvements to predictions on longer time scales would follow from better predictions of subaveraging time scale vector wind variability. The improvements in predictability could come from finding more appropriate predictors, or from using more sophisticated downscaling models. For example, predictors that contain more accurate subaveraging variability information are expected to follow from including monthly or seasonal statistics computed from daily time scale predictions. As well, there are theoretical arguments indicating that prediction skill will be improved if quantities that are functions of the mean vector wind, such as *μ* and, by extension, mean wind speed as defined by the idealized probability model, are computed from predictions of the mean vector wind rather than predicted directly (Monahan 2013). These extensions of the current study methods are useful directions of future study.

While we have sought to characterize the predictability of mean wind speed conditional on the *θ* range of the surface wind under consideration, we have not investigated the predictability of *μ*, mean wind speed, or other surface wind quantities conditioned on thresholds in the wind's speed. A direction of future study would be to evaluate the predictability of these surface wind quantities conditioned on the speeds being within wind turbine operability thresholds. In addition, a site by site investigation of the specific local influences on predictability is an excellent direction of future study, but is beyond the scope of the present study's focus on the general predictability of the vector and scalar wind, and the relationship between the two.

Unfortunately, with the small number of statistical degrees of freedom in current observational records, consideration of more complex models and a broader range of predictors comes with the risk of overfitting the model. The generally low statistical predictability of the scalar wind quantities leads us to affirm the suggestion of van der Kamp et al. (2012) that deterministic models, which resolve the short-time-scale variability, may be a more useful tool for successful predictions of wind quantities that are highly influenced by subaveraging time scale variability.

The present study has demonstrated empirical and theoretical relationships between mean wind speed predictability and the predictability of vector winds. Seemingly random differences between the predictability of these different aspects of the wind have been shown to be a result of a systematic variation in the dependence of the wind speed statistics on those of the vector winds. The results suggest that predictions of scalar wind standard deviations, as well as of mean wind speed in low *θ* conditions, made with SD models are unlikely to be skillful. For applications that require predictions of mean wind speed on the averaging time scales considered (such as wind power forecasting and civil engineering and design considerations), these results emphasize the potential importance of dynamic downscaling for surface winds. In contrast, as the methods considered were relatively skillful in predicting the mean vector winds, the statistical downscaling approach employed may be of greater utility in applications that make use of knowledge of the mean vector winds, such as pollution transport and dispersion modeling.

## Acknowledgments

The authors wish to thank Alex Cannon, Andrew J. Weaver, and John C. Fyfe for their time and comments, as well as Derek van der Kamp for his guidance in presenting and assessing vector wind predictability. We are also thankful for the time and consideration given by two anonymous reviewers, whose comments greatly improved the present study. This research was funded by the Natural Sciences and Engineering Research Council of Canada's Collaborative Research and Training Experience Program in Interdisciplinary Climate Science.

## REFERENCES

*El Niño, Southern Oscillation, and Climatic Variability*. 1st ed. CSIRO Publishing, 405 pp.

*Climate Dyn.,* **38,** 1281–1299.

*Bound.-Layer Meteor.,* **135,** 161–175.

*Manual of Surface Weather Observations*. 7th ed. Environment Canada, 488 pp.

*Rev. Geophys.,* **29,** 191–216.

*Bull. Amer. Meteor. Soc.,* **77,** 437–471.

*J. Climate,* **25,** 1511–1528.

*J. Climate,* **25,** 6684–6700.

*J. Climate,* **26,** 5563–5577.

*Climate Dyn.,* **32,** 615–634.

*J. Geophys. Res.,* **110,** D19109.

*Meteor. Atmos. Phys.,* **103,** 253–265.

**37,** L06801, doi:10.1029/2010GL042940.

*Climate Dyn.,* **38,** 1301–1311.

*J. Climate,* **23,** 1209–1225.

*J. Climate,* **12,** 2474–2489.