Climate predictions tailored to the wind energy sector represent an innovation in the use of climate information to better manage the future variability of wind energy resources. Wind energy users have traditionally employed a simple approach that is based on an estimate of retrospective climatological information. Instead, climate predictions can better support the balance between energy demand and supply, as well as decisions relative to the scheduling of maintenance work. One limitation for the use of the climate predictions is the bias, which has until now prevented their incorporation in wind energy models because they require variables with statistical properties that are similar to those observed. To overcome this problem, two techniques of probabilistic climate forecast bias adjustment are considered here: a simple bias correction and a calibration method. Both approaches assume that the seasonal distributions are Gaussian. These methods are linear and robust and neither requires parameter estimation—essential features for the small sample sizes of current climate forecast systems. This paper is the first to explore the impact of the necessary bias adjustment on the forecast quality of an operational seasonal forecast system, using the European Centre for Medium-Range Weather Forecasts seasonal predictions of near-surface wind speed to produce useful information for wind energy users. The results reveal to what extent the bias adjustment techniques, in particular the calibration method, are indispensable to produce statistically consistent and reliable predictions. The forecast-quality assessment shows that calibration is a fundamental requirement for high-quality climate service.
The demand for renewable sources of energy as an alternative to fossil-fuel sources has increased for reasons such as the need to mitigate the climate change resulting from anthropogenic greenhouse gas emissions, the interest in the creation of new economic opportunities, and the provision of energy access to people living in areas without access to other sources of energy (Renewable Energy Policy Network for the 21st Century 2015; IPCC 2007). Furthermore, the 21st Conference of the Parties for the United Nations Framework Convention on Climate Change (COP21) agreement has recently proposed several policies to promote energy efficiency and replace fossil fuels with renewable sources of energy (Lane 2016). Wind energy is the cheapest option for new sources of power-generating capacity and is the second-leading source of renewable energy worldwide, only exceeded by hydropower in terms of installed capacity (Pryor and Barthelmie 2010; Santos et al. 2015). In recent years, installed capacity for wind power has experienced rapid growth, with a total of 370 GW installed worldwide in 2014. As a consequence, wind energy has become a key element of the electricity supply in many parts of the world (World Wind Energy Association 2015).
Operational and economic issues related to wind energy, such as the need to match supply with demand at all times under the intermittent nature of wind, require the modeling and forecasting of generation processes for wind power at a range of temporal and spatial scales (Pinson 2013). Prediction of the variability of wind energy resources, which has been identified as a challenge to the grid integration of wind energy systems (Najafi et al. 2016; Füss et al. 2015), is a key piece of the decision-making processes because it allows end users to take informed, precautionary action with potential cost savings to their operations. Hence, more efficient energy management strongly depends on having accurate resource forecasts. Wind energy forecasting options have been traditionally limited to short time scales (from hours to a few days) because near-surface winds and thus wind energy production strongly depend on mesoscale and synoptic-scale variability (Graff et al. 2014; Pryor and Barthelmie 2010). At longer time scales, the assessment of the economic feasibility of future wind farms is a function of, among other things, the expected energy yield and the maintenance requirements over their life span of periods from a month to several decades. This information is not readily available to the relevant users, however, who therefore have to rely on past observation-based information, which is often available only as short time series. The need for climate information that is representative of the next few decades has raised the interest of the wind industry in climate projections, which are increasingly being used in long-term resource evaluation (Hueging et al. 2013; Reyers et al. 2015; Vautard et al. 2014).
With a focus on time scales from a month to a decade into the future, current energy practices use an approach that is based on the future climate being a repetition of an estimate of the current climatological behavior (Garcia-Morales and Dubus 2007). Advances in the science of climate prediction that cover the gap in climate information between weather forecasting and projections of climate change can be considered as an alternative to the state of the art by providing predictive information that helps users to make more informed decisions and move beyond using only climatological information.
It recently has been shown that climate predictions are capable of providing additional value for wind energy applications; information at these time scales can be especially beneficial for the management of power production plants (Clark et al. 2017; García-Bustamante et al. 2009; Lynch et al. 2014; Troccoli 2010). For instance, climate predictions could allow electricity-grid system operators to estimate the future production generated by wind farms and use it as input for load-balance models. Should this potential of climate prediction materialize, the matching of supply and demand could be optimized and significant cost savings be realized, with a better anticipation of market changes. This framework will favor greater penetration of renewable electricity into markets.
The scenario described is of great interest to the renewable energy community, but little progress had been made in practice. In recent years the skill of climate predictions has significantly improved (Doblas-Reyes et al. 2013), however. For instance, seasonal forecast systems (i.e., those that provide information for periods ranging from a month to slightly longer than a year into the future) are now providing skillful forecasts for extratropical regions where no substantial skill was found before (Clark et al. 2017; Dunstone et al. 2016; Scaife et al. 2014). This will promote their application in wind energy decision-making, as illustrated for different energy sources (De Felice et al. 2015; Garcia-Morales and Dubus 2007). At this time, however, there are very few instances of the application of seasonal predictions in the wind energy industry. Improved climate information that includes seasonal forecasts may change this situation, for example by allowing innovative wind energy insurance and helping to cover high-risk periods associated with persistent lower-than-expected wind resource.
Seasonal predictions will be beneficial if they are skillful enough, but also they must be tailored to the potential users in a decision-making context. In particular, seasonal predictions have systematic errors that make them unusable unless they are postprocessed to have statistical features that are similar to those of the observational reference employed. This problem has been recognized by the climate science community as one of the main challenges for moving to a better use of climate predictions (Buontempo et al. 2014; Coelho and Costa 2010). Two recent European projects related to climate services that are sponsored by the European Commission Seventh Framework Programme (FP7) [European Provision of Regional Impacts Assessments on Seasonal and Decadal Timescales (EUPORIAS; see online at http://www.euporias.eu/) and Seasonal-to-Decadal Climate Prediction for the Improvement of European Climate Services (SPECS; see online at http://www.specs-fp7.eu/)] have tried to address these challenges and support the development of sectorial climate services in Europe through the involvement of stakeholders in defining effective ways to develop climate information.
This paper discusses the limitations associated with current seasonal forecast systems when used in wind energy applications. It focuses on the description of appropriate bias adjustment techniques to overcome some of these limitations and to promote the use of climate forecast information on those occasions in which it can provide greater accuracy than current approaches. The method that is described recognizes that end users must be provided with information about the prediction uncertainty (Alessandrini et al. 2013), and therefore a probabilistic approach is adopted because it is more valuable in user-specific loss functions (Pinson and Tastu 2013).
An overview of the necessary steps to provide climate predictions to the wind energy sector is provided in Fig. 1, which summarizes the main challenges addressed in this paper. Section 2 of the paper introduces the datasets and describes one of the most widely used seasonal forecast systems and its limitations. Section 3 describes appropriate bias adjustment techniques. It also introduces measures to assess forecast quality and explains their relevance in a user context. Section 4 presents the impact of the bias adjustments on the wind speed seasonal forecasts, including an analysis of the changes in the statistical properties of the postprocessed predictions. Section 5 reports the concluding remarks and provides a wider context for future work in the dissemination of climate predictions in user-relevant formats.
In this study, we use 10-m wind speed forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF) System 4 (hereinafter System 4) operational seasonal forecast system (Molteni et al. 2011), which is based on a global climate model with coupled atmospheric and oceanic components. System 4 comprises an ECMWF atmospheric model—the Integrated Forecast System (IFS) CY36R4 with a T255 spectral truncation (horizontal resolution of approximately 80 km) and 91 vertical levels reaching up to 0.1 hPa—coupled to the Nucleus for European Modelling of the Ocean (NEMO), version 3.0, ocean model. The ocean model uses a grid with horizontal resolution of ~1° in the extratropics with equatorial refinement and 42 levels in the vertical direction. The atmosphere and ocean are coupled using a version of the Ocean Atmosphere Sea Ice Soil (OASIS3) coupler developed at the Centre Européen de Recherche et de Formation Avancée en Calcul Scientifique (CERFACS).
System 4 is run in ensemble-prediction mode. Ensemble predictions are a way to account for uncertainties in the climate system, in particular those associated with the imperfections of the initial conditions and the model formulation (Slingo and Palmer 2011). For this reason, the operational System 4 forecasts are produced at the beginning of each month with 51-member ensembles. Each member of the ensemble uses slightly different initial conditions and different realizations of stochastic representations of subgrid physical processes in the atmosphere. This approach allows prediction of the forecast uncertainty (measured by the ensemble dispersion) along with the prediction itself. The simulations are performed for up to 7 months into the future.
Seasonal forecast systems traditionally do not produce operational forecasts of wind speeds at turbine height levels. Instead, wind speeds are made available at 10 m or at different pressure levels. It is difficult to interpolate directly to hub height because the physical height of pressure levels is not constant over time. For that reason, 10-m wind speeds have been selected for this analysis. It might be possible for the forecast systems to deliver wind speed at hub height, should the renewable-energy community show an interest in seasonal forecast systems.
The analysis in this paper focuses on boreal winter because the winter season has larger wind speed variability in the Northern Hemisphere (Archer and Jacobson 2013). In addition, the analysis of seasonal predictions of wind speed in winter can be relevant because of the higher variability in the supply of wind power in that particular season (Bett and Thornton 2016). This illustrates the potential of seasonal predictions for end users, because these predictions potentially have more impact where the interannual variability is the largest, although other seasons have also been analyzed (see Fig. S1 of the online supplemental material), and the conclusions apply equally. The predictions considered here are those issued on 1 November, for which 3-month statistics for the December–February period (DJF; also known as a 1-month-lead seasonal forecast) are made. Predictions over the period 1981–2013 have been used in the study. The prediction for DJF in 2013 has been used as an operational forecast, and the predictions over 1981–2012 have been used as the retrospective predictions (hindcasts) to be used in the validation process. This consideration aims to emulate true operational prediction conditions for which no observed information about the future is available.
To evaluate the System 4 prediction quality, we compare the predicted 10-m wind speed with the corresponding variable of the ERA-Interim reanalysis (Dee et al. 2011). This reanalysis uses the ECMWF IFS atmospheric model to assimilate observational data of many types, including in situ observations and satellite retrievals, to produce a spatially and temporally complete “best guess” gridded observational dataset. ERA-Interim has the same resolution as System 4. This resolution is fairly coarse, but this product offers uniform global coverage in exchange. Given the sparsity of global wind observations, reanalyses have demonstrated their potential usefulness for large-scale wind energy applications (Cannon et al. 2015). The problems related to the lack of a long-enough historical data record have also promoted the use of reanalyses by the wind industry (Rose and Apt 2015).
For this reason, and being aware that reanalysis estimates can often be far from point observed values, the reanalysis has been used as the best available estimate of wind speed. The choice of reanalysis dataset is arbitrary, and the conclusions are equally valid when using other reanalyses, either global or regional. Further work is needed to assess the seasonal predictions for the locations of specific wind farms.
a. Data processing
Wind speed forecasts are affected by biases resulting from the inability to numerically reproduce all of the relevant processes that are responsible for climate variability (Doblas-Reyes et al. 2013). Apart from biases in the mean and other characteristics of the distribution of the simulated variables, for probabilistic forecasts there are additional difficulties, such as the lack of forecast reliability (Pinson 2012), which is a measure that quantifies the agreement between the predicted probabilities and observed relative frequencies of a particular event. This is important from a wind energy point of view since reliable probabilities are expected to be included in decision-making processes. Hence, climate predictions require a bias adjustment stage to statistically resemble the observational reference, minimize forecast errors, and formulate reliable probabilities. Bias adjustment of the wind speed has been identified as a requirement by the wind energy sector to fulfill acceptable reliability for use in its decision-making processes (Alessandrini et al. 2013).
This paper illustrates the relative merits of different techniques for statistical bias adjustment of ensemble forecasts to address different aspects of the forecast error. Two approaches, a simple bias correction and a calibration method, have been selected.
1) Simple bias correction
Simple bias correction is based on the assumption that both the reference and predicted distributions of seasonal wind speed are approximated well by a Gaussian (normal) distribution. The adjustment creates predictions that have the same mean and standard deviation as the reference dataset. This is a zero-order approach for correction of the systematic mean error that has been previously applied to correct temperature and precipitation (Leung et al. 1999). The Gaussian assumption is a limitation of the approach because the monthly and seasonal wind speed distribution can be, at times, slightly non-Gaussian.
The bias correction scheme can be summarized in this way:

$$y_{ij} = \left(x_{ij} - \bar{x}\right)\frac{\sigma_{\mathrm{ref}}}{\sigma_{e}} + \bar{o}.$$

Seasonal mean anomalies are calculated by subtracting the ensemble mean of the seasonal averages (x̄) from the seasonal average xij of each forecast for each year i and member j. A new seasonal mean yij is calculated by multiplying the seasonal mean anomaly by the ratio of the standard deviation of the reference dataset σref to the interannual standard deviation of the ensemble members σe and adding the climatological mean of the reference dataset (ō). This is done for each grid cell separately, resulting in a new wind speed forecast ensemble that has the same ensemble mean and standard deviation as the reference.
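As an illustration, the scheme described above can be sketched for a single grid cell in Python. This is a minimal sketch rather than the code used in this study. The function name `bias_correct`, the nested-list layout (`hindcast[i][j]` for year i and member j), and the choice to estimate the hindcast mean and σe by pooling all years and members with a population standard deviation are assumptions made for the example.

```python
from statistics import mean, pstdev


def bias_correct(hindcast, reference):
    """Rescale seasonal means so that the pooled hindcast matches the
    mean and standard deviation of the reference (one grid cell).

    hindcast[i][j]: seasonal mean for year i, member j (hypothetical layout)
    reference[i]:   reference seasonal mean for year i
    """
    pooled = [x for year in hindcast for x in year]
    x_bar = mean(pooled)              # hindcast climatological mean
    sigma_e = pstdev(pooled)          # assumption: pooled over years and members
    o_bar = mean(reference)           # climatological mean of the reference
    sigma_ref = pstdev(reference)     # standard deviation of the reference
    scale = sigma_ref / sigma_e
    # anomaly times the ratio of standard deviations, plus reference mean
    return [[(x - x_bar) * scale + o_bar for x in year] for year in hindcast]
```

Because the transformation is linear, the pooled mean and standard deviation of the adjusted ensemble equal those of the reference by construction when the coefficients are estimated on the same sample.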
2) Calibration method
The calibration can be considered as a way of obtaining predictions with an interannual variance that is equivalent to that of a reference dataset, in a similar way to the simple bias correction method, while at the same time ensuring an increased reliability of the probability predictions. Here we apply the technique of variance inflation (von Storch and Zwiers 2001). This calibration strategy has been selected because an inflation of the ensemble spread is required to obtain reliable probabilities, and it is applied as in Doblas-Reyes et al. (2005).
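If xi is the ensemble-mean prediction for any grid point at year i and zij is the difference of ensemble member j from the ensemble mean, then the calibrated estimate of the ensemble member j can be expressed as

$$y_{ij} = \alpha\, x_{i} + \beta\, z_{ij}.$$

The coefficients α and β are defined as follows:

$$\alpha = |\rho|\,\frac{\sigma_{\mathrm{ref}}}{\sigma_{em}}, \qquad \beta = \sqrt{1-\rho^{2}}\,\frac{\sigma_{\mathrm{ref}}}{\sigma_{e}}.$$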
The σem is the standard deviation of the ensemble mean (the time series of xi), σe is the standard deviation of the ensemble, σref is the standard deviation of the reference, and ρ is the correlation between the ensemble mean of the retrospective forecasts and the reference dataset. The α and β coefficients are found under two constraints: The first is that the standard deviation of the inflated prediction is the same as that for the reference, and the second is that the predictable signal after the inflation is made equal to the correlation of the ensemble mean with the reference dataset.
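The variance inflation can be sketched for a single grid cell as follows. This is a hedged illustration, not the implementation used in the study. It assumes that xi is handled as an anomaly with respect to the hindcast climatology, that σe is the pooled spread of the members about their ensemble mean, that the reference climatological mean is added back at the end, and that α = |ρ| σref/σem and β = (1 − ρ²)^(1/2) σref/σe, which is the form consistent with the two constraints stated above. The names `calibrate` and `pearson` are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def calibrate(hindcast, reference):
    """Variance inflation for one grid cell (illustrative sketch)."""
    clim = mean([x for year in hindcast for x in year])
    # x_i: ensemble-mean anomaly for year i (assumption: anomalies are used)
    x = [mean(year) - clim for year in hindcast]
    # z_ij: departure of member j from the ensemble mean of year i
    z = [[m - mean(year) for m in year] for year in hindcast]
    sigma_em = pstdev(x)                                # std of ensemble mean
    sigma_e = pstdev([d for year in z for d in year])   # ensemble spread
    sigma_ref = pstdev(reference)
    rho = pearson(x, reference)
    alpha = abs(rho) * sigma_ref / sigma_em
    beta = sqrt(1.0 - rho ** 2) * sigma_ref / sigma_e
    o_bar = mean(reference)  # assumption: reference mean added back
    return [[alpha * xi + beta * d + o_bar for d in zi] for xi, zi in zip(x, z)]
```

With these choices the pooled variance of the calibrated members is ρ²σref² + (1 − ρ²)σref² = σref², so the first constraint holds exactly in sample.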
b. Assessment of forecast quality
Seasonal forecast systems, as in any other forecasting process, have to be systematically compared with a reference, preferably observations, to assess their overall quality in a multifaceted process known as forecast-quality assessment (Mason and Baddour 2008). This is a fundamental step in the prediction problem because a prediction has no value without an estimate of its quality that is based on past performance (Doblas-Reyes et al. 2013). Moreover, quantification of the uncertainty is one of the most crucial aspects for successful development of the wind industry and minimization of the financial risk.
Three sources of uncertainty in common scoring metrics of probabilistic forecasts should be considered: improper estimates of probabilities from small-sized ensembles, an insufficient number of forecast cases, and imperfect reference values as a result of observation errors. A way to alleviate these problems is to use several scoring measures to offer a comprehensive picture of the forecast quality of the system (Jolliffe and Stephenson 2012) and to apply statistical inference as often as required. Note that these sources of uncertainty are independent of the uncertainty of the individual forecasts: the user should consider and be provided with both types of uncertainty when making decisions when this information is included.
Several scoring measures are used in this paper, including skill and reliability measures such as the reliability diagram and the rank histogram. Forecast quality has been used to evaluate the performance of the seasonal forecast system as well as the impact of the two bias adjustment techniques on forecast quality. The goal is to offer the most general and, a priori, relevant information for a user in the wind energy sector instead of the traditional view offered by climate scientists in which the information provided to the users is mainly based on correlation, which is very useful but gives only a small part of the information that a user requires.
1) Skill scores
Skill estimates that are based on the performance of the system in the past may guide users about the expected performance of future forecasts (Weisheimer and Palmer 2014), always with the caveat that the predictability of the climate system may change over time. Skill scores are a tool for end users that can be used to develop alternative strategies to their baseline information to minimize the risk and to perform optimal management (Pinson et al. 2009). Skill scores for both deterministic (ensemble mean) and probabilistic predictions are considered.
The Pearson correlation coefficient between the ensemble mean and the reference dataset has been used as a measure of the linear correspondence between the forecasts and the reference. This deterministic skill measure is invariant to changes in scale; hence the bias correction and calibration of the forecasts do not change the correlation of the ensemble mean with the observations. The bias adjustment techniques that are illustrated in this paper (defined in section 3a) have been applied in leave-one-out cross validation to mimic as closely as possible an operational context in which new coefficients might be estimated to predict each year. In cross-validation mode, the prediction to be adjusted is removed from the sample used to estimate the coefficients. As a result, the correlation of the postprocessed forecast changes relative to the correlation computed directly with the uncorrected forecasts.
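The leave-one-out procedure can be sketched as follows, using the simple bias correction as the adjustment. This is a minimal illustration under assumed conventions (nested lists indexed by year and member, pooled population statistics); the function names are hypothetical and the code is not taken from the study.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / sqrt(sum((x - ma) ** 2 for x in a) *
                      sum((y - mb) ** 2 for y in b))


def loo_bias_correct(hindcast, reference):
    """Adjust each year with coefficients estimated from all other years."""
    adjusted = []
    for k, target in enumerate(hindcast):
        # training sample: every year except the one being adjusted
        train_h = [x for i, year in enumerate(hindcast) if i != k for x in year]
        train_r = [r for i, r in enumerate(reference) if i != k]
        x_bar, o_bar = mean(train_h), mean(train_r)
        scale = pstdev(train_r) / pstdev(train_h)
        adjusted.append([(x - x_bar) * scale + o_bar for x in target])
    return adjusted


def ensemble_mean_correlation(forecasts, reference):
    """Correlation of the ensemble mean with the reference."""
    return pearson([mean(year) for year in forecasts], reference)
```

Because each year receives its own offset and scale, the adjusted ensemble means are no longer a single linear function of the raw ones, which is why the correlation is not preserved in cross-validation mode.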
A comprehensive measure of the predictive skill for probabilistic seasonal predictions of categorical events is the ranked-probability skill score (RPSS) (Epstein 1969; Wilks 2011). This is a squared distance between the cumulative probabilities of the categorical forecast and reference vectors relative to a naive forecast strategy, which in our case has been taken as the climatological statistics (hereinafter referred to as “climatology”; derived from all of the possible events recorded in the past) because this is the preferred current choice of the users targeted by this analysis. The RPSS is based on the rank probability score, which is a measure of the squared distance between the forecast and the reference cumulative probabilities. In the case that is presented here, the RPSS has been computed on the basis of categorical forecasts for terciles. Three equiprobable events are associated with the two terciles of the climatological distribution of the reference: wind speed exceeding the upper tercile (above-normal category), wind speed not exceeding the lower tercile (below-normal category), and values occurring between the two terciles (normal category). The probabilities have been computed as the fraction of ensemble members in the corresponding category. This is only one example; other categories could be defined if they better represent the decisions involved in precautionary climate action. The individual values of the reference dataset in the verification time series can fall in any of the three categories, with probability determined by the probability density function for the target season.
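The tercile-based RPSS described above can be sketched as follows. This is an illustrative sketch only. The linear-interpolation quantile rule, the in-sample estimate of the terciles from the reference series, the treatment of values equal to a tercile, and all function names are assumptions made for the example.

```python
from statistics import mean


def tercile_edges(values):
    """Lower and upper terciles by linear interpolation (assumed rule)."""
    s = sorted(values)
    n = len(s)

    def quantile(p):
        pos = p * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        return s[lo] + (pos - lo) * (s[hi] - s[lo])

    return quantile(1 / 3), quantile(2 / 3)


def category(value, lower, upper):
    """0 = below normal, 1 = normal, 2 = above normal."""
    return 0 if value <= lower else (2 if value > upper else 1)


def category_probs(members, lower, upper):
    """Probabilities as the fraction of members in each category."""
    cats = [category(m, lower, upper) for m in members]
    return [cats.count(k) / len(members) for k in range(3)]


def rps(probs, obs_cat):
    """Squared distance between cumulative forecast and observed vectors."""
    score, cum_f, cum_o = 0.0, 0.0, 0.0
    for k, p in enumerate(probs):
        cum_f += p
        cum_o += 1.0 if k == obs_cat else 0.0
        score += (cum_f - cum_o) ** 2
    return score


def rpss(forecasts, reference):
    """RPSS relative to an equiprobable climatological forecast."""
    lower, upper = tercile_edges(reference)
    obs = [category(o, lower, upper) for o in reference]
    rps_f = mean([rps(category_probs(m, lower, upper), c)
                  for m, c in zip(forecasts, obs)])
    rps_c = mean([rps([1 / 3, 1 / 3, 1 / 3], c) for c in obs])
    return 1.0 - rps_f / rps_c
```

In an operational setting the terciles would normally be estimated without the year being verified, in line with the cross-validation described earlier; the in-sample version is kept here only for brevity.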
The continuous ranked probability skill score (CRPSS) is a probabilistic skill score (Jolliffe and Stephenson 2012) that has commonly been used to evaluate the predictive skill of the full probability distribution. It is based on the continuous ranked probability score (CRPS), which is a score that reduces to the mean absolute error if a deterministic forecast is used. The CRPS measures the difference between the predicted and observed cumulative distributions, and it can be converted into a skill score that measures the performance of a forecast relative to the climatology.
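For an ensemble forecast, the CRPS is commonly estimated from the members directly, and the skill score follows by comparison with a climatological forecast. The sketch below uses the standard ensemble estimator; the choice of the full reference series as the climatological "ensemble" and the function names are assumptions made for illustration.

```python
from statistics import mean


def crps_ensemble(members, obs):
    """Ensemble estimate of the CRPS for one forecast case."""
    m = len(members)
    # mean absolute error of the members with respect to the observation
    error = sum(abs(x - obs) for x in members) / m
    # half the mean absolute difference between all pairs of members
    spread = sum(abs(a - b) for a in members for b in members) / (2 * m * m)
    return error - spread


def crpss(forecasts, reference):
    """CRPSS relative to climatology (assumption: the reference series
    itself serves as the climatological ensemble)."""
    crps_f = mean([crps_ensemble(m, o) for m, o in zip(forecasts, reference)])
    crps_c = mean([crps_ensemble(reference, o) for o in reference])
    return 1.0 - crps_f / crps_c
```

With a single member the spread term vanishes and the score reduces to the absolute error, consistent with the deterministic limit mentioned above.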
The RPSS and CRPSS range from −∞ to 1. Scores below 0 indicate forecasts that perform worse than the climatology forecast, a score of 0 indicates performance equal to that of the climatology forecast, and scores above 0 indicate an improvement upon climatology, up to a value of 1, which indicates a “perfect” forecast.
Fair scores for ensemble forecasts have been recently introduced (Fricker et al. 2013; Ferro 2014). A skill score is fair when it favors predictions with ensemble members that perform as if they have been sampled from the same distribution as the reference dataset. The fair versions of the RPSS and CRPSS have been used to give an estimate of what the skill would be if an infinite ensemble size were used (a measure of potential skill). The differences between the results of the fair scores and the basic scores are small, as has been shown for the RPSS in Fig. S2 of the supplemental material.
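To illustrate the idea, the fair version of the ensemble CRPS is usually written with the pairwise member term normalized by M(M − 1) instead of M², where M is the ensemble size, which removes the dependence of the expected score on M. The sketch below follows that commonly quoted form; it is an assumption that this matches the exact formulation used in the study, and the function name is hypothetical.

```python
def fair_crps(members, obs):
    """Fair ensemble CRPS for one forecast case (illustrative sketch)."""
    m = len(members)
    if m < 2:
        raise ValueError("the fair CRPS needs at least two members")
    error = sum(abs(x - obs) for x in members) / m
    # pairwise term normalized by m * (m - 1) rather than m * m
    spread = sum(abs(a - b) for a in members for b in members) / (2 * m * (m - 1))
    return error - spread
```

The correction is largest for small ensembles and vanishes as M grows, which is consistent with the small differences between fair and basic scores reported for a 51-member system.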
Reliability analysis of prediction systems remains a prime concern for the wind energy sector, as it is for any user of probability predictions, because of the risks and uncertainties involved in the forecasting of wind resources (Chaudhry and Hughes 2012). Rank histograms are a simple tool to evaluate the reliability of ensemble forecasting systems (Elmore 2005). They are generated by dividing the observations among a limited number of bins, thereby defining a set of exhaustive and mutually exclusive events. Then the observed frequencies for these bins are compared with the corresponding forecast probabilities. If the forecast is reliable, the rank histogram is expected to be flat, although some deviations from uniformity can appear even for reliable forecasts because of randomness. The rank histograms have been displayed on probability paper (Bröcker 2008). On the y axis, rank histograms display cumulative probabilities (instead of the traditional observed frequency) that indicate how probable that observed frequency would be if the prediction were reliable. This information is useful to identify whether the deviations from reliable behavior are systematic or merely random. In addition, the readability of the rank histogram is further improved by scaling the ordinate by a logit transformation that has the effect of displaying both small and large probabilities equidistantly. On the right, the 90%, 95%, and 99% simultaneous confidence intervals have been represented.
Rank histograms illustrate whether the ensemble members and the verifying observation come from the same probability distribution, in which case the forecasts are statistically consistent and no calibration of the ensemble is needed. This happens when the rank histogram is flat (as if coming from a uniform distribution). Because of sampling variations, the histograms are almost never flat, however. To assess whether the deviations from flatness are attributed to chance or to deficiencies in the forecasts, goodness-of-fit test statistics are computed: the Pearson χ2, the Jolliffe–Primo test statistic for slope (JP slope), and the Jolliffe–Primo test statistic for convexity (JP convex) (Jolliffe and Primo 2008). The Jolliffe–Primo statistics are obtained from the decomposition of the Pearson χ2 in components that allow the identification of bias (slope) or under-/overdispersion (convexity) in the forecast ensemble. The detailed mathematical definition of this goodness-of-fit test can be found in the appendix of Jolliffe and Primo (2008).
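The construction of a rank histogram and the Pearson χ2 statistic for flatness can be sketched as follows. This is a simplified illustration: ties between the observation and the members are not randomized, the Jolliffe–Primo slope and convexity components are not computed, and the function names are hypothetical.

```python
def rank_histogram(forecasts, reference):
    """Count the rank of each observation within its ensemble.

    With M members there are M + 1 possible ranks (bins).
    """
    n_members = len(forecasts[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(forecasts, reference):
        rank = sum(x < obs for x in members)  # members below the observation
        counts[rank] += 1
    return counts


def pearson_chi2(counts):
    """Pearson chi-square statistic against a flat (uniform) histogram."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)
```

Under the hypothesis of a flat histogram, the statistic is usually compared with a chi-square distribution with as many degrees of freedom as the number of bins minus one, although that approximation is coarse for the small number of forecast cases typical of seasonal hindcasts.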
Reliability diagrams are a common diagnostic of probabilistic predictions that assess both reliability and skill. They consist of a plot of the observed relative frequency against the predicted probability of a dichotomous event, providing a quick visual assessment of the impact of tuning probabilistic forecast systems. A perfectly reliable system should result in a line that is as close as possible to the diagonal, within a certain measure of uncertainty.
The information provided by the reliability diagram should be interpreted with care because even a perfectly reliable forecast system is not expected to have an exactly diagonal reliability diagram because of the limited samples that are typical of seasonal forecast systems (Jolliffe and Stephenson 2012). To address this problem, we have included consistency bars (Bröcker and Smith 2007) in these diagrams. They indicate how likely the observed relative frequencies are, under the assumption that predicted probabilities are accurate.
To draw a reliability diagram, discretization and grouping into probability bins (10 in this paper) of the probability forecasts have to be done. A reliability diagram also includes the frequency of the forecast probabilities included in each bin, which is known as a sharpness diagram. Sharpness gives an indication of the variation in forecast probabilities issued by the prediction system, independent of the observations.
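The binning step can be sketched as follows for a dichotomous event, for example wind speed in the above-normal category. This is a minimal sketch under assumed conventions: equal-width bins, each point plotted at the mean forecast probability of its bin, and hypothetical names; consistency bars are not computed.

```python
def reliability_curve(probs, outcomes, n_bins=10):
    """Group probability forecasts into bins and compute, per bin, the mean
    forecast probability and the observed relative frequency.

    probs:    forecast probabilities of the event, one per forecast case
    outcomes: 1 if the event occurred, 0 otherwise
    Returns the (forecast probability, observed frequency) points for the
    non-empty bins and the number of forecasts per bin (sharpness diagram).
    """
    prob_sum = [0.0] * n_bins
    hits = [0] * n_bins
    counts = [0] * n_bins
    for p, o in zip(probs, outcomes):
        b = min(int(p * n_bins), n_bins - 1)  # p = 1 goes into the last bin
        counts[b] += 1
        prob_sum[b] += p
        hits[b] += o
    points = [(prob_sum[b] / counts[b], hits[b] / counts[b])
              for b in range(n_bins) if counts[b] > 0]
    return points, counts
```

Points close to the diagonal indicate reliable probabilities, and the list of counts shows how often each probability range is issued, independent of the observations.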
The rank histogram and the reliability diagram are complementary tools to assess the reliability of the system. The former assesses the full forecast ensemble and does not require the formulation of forecast probabilities—an aspect that is necessary in the case of the reliability diagram, for which one assesses the features of both the forecast system and the statistical model that transforms the ensemble into probabilities.
Total installed wind power indicates the power capacity available on each wind farm. It has been represented in Fig. 2 to identify which are the most important locations from the point of view of a wind energy user. To illustrate the performance of seasonal predictions, we selected two regions that are key for the wind energy sector because of the concentration of wind farms at those locations. For the selection of the regions we have also taken into account the potential skill available in such regions (supplemental Fig. S1). The first region is in Canada (longitude from 112.5° to 113.2°W and latitude from 50.3° to 51.0°N). Canada is an important player in terms of energy resources (Vaillancourt et al. 2014) and is a global leader in the sustainable development of wind energy. This region had an exceptional year in 2014 for wind energy development, ranking seventh globally in terms of newly installed capacity (Canadian Wind Energy Association: http://canwea.ca/wind-energy/installed-capacity/) for that year. The region in the North Sea (longitude from 9.8° to 10.6°E and latitude from 58.0° to 58.7°N) is the second region that was considered. It is the most important region for offshore energy activities in Europe because of the large and consistent wind resource, the relatively shallow water that minimizes the cost of the wind farms, and the proximity to developed electricity markets (Schillings et al. 2012).
Figure 3 displays the predictions for the uncorrected, bias-corrected, and calibrated sets for these two regions. The effect of the bias adjustment on the predictions is that, when the corrections are applied, the hindcasts (gray dots) and the reference dataset (black dots) show similar mean and variance. After the bias adjustment, the probabilities in each category differ as a result of the changes in the ensemble distribution. The skill changes accordingly with the bias adjustment, showing a decrease in the correlation and an increase in the probabilistic skill scores. The decrease of the correlation is due to the cross validation, which leads to an implicit leakage of information and a degeneracy in this measure of potential skill (Barnston and van den Dool 1993; Barnston et al. 2012). The improvement of the fair RPSS and CRPSS is associated with the reduction of the systematic errors. Contrary to the correlation, the RPSS and the CRPSS are both sensitive to the systematic differences in the statistical properties (mean and variance) of the predicted variables with respect to those in the observations and to the inadequacy of the ensemble dispersion to act as a prediction of the forecast error (the lack of reliability). This is a useful example of the importance of using more than one measure of forecast quality, in particular when dealing with user-relevant variables.
The information provided by global forecast systems is relatively coarse. In a global context, the sizes of the two selected regions are small. Besides, for a small region the skill is expected to be noisier and less robust than for a larger one. To explore how the size of the region affects the forecast quality we have estimated the forecast quality for larger regions (supplemental Fig. S2). The comparison shows that the skill differences are small when a larger region is considered. Future work will focus on the formulation of predictions for specific sites. This is a nontrivial task because the bias adjustment techniques necessary in seasonal forecasting require sufficiently long observational references that are not readily available.
The fair RPSS maps for the uncorrected, bias-corrected, and calibrated wind speed are shown in Fig. 4. The uncorrected predictions (Fig. 4a) display very low scores all around the world. The highest values are found in tropical regions, in particular in some regions of northeastern South America and northwestern Africa. This maximum can be explained because the largest predictability at seasonal time scales is attributed to anomalies in the tropical sea surface temperatures resulting from coupled ocean–atmosphere phenomena, in particular those related to El Niño–Southern Oscillation events (Kirtman and Pirani 2009) that affect mainly the regions mentioned above.
Figures 4b and 4c show that the fair RPSS increases globally when bias adjustment is applied. This kind of assessment is widely available for variables such as temperature and precipitation but is not available for wind speed. The skill improvement has been quantified in Figs. 4d and 4e, which indicate that the skill scores for the bias-adjusted predictions increase by more than 1 relative to the uncorrected ones. The fair RPSS maps (Figs. 4b,c) for the postprocessed predictions have their maximum values in the tropics. Although the skill is relatively low at extratropical latitudes, some positive skill is found in those regions. For instance, some regions in Europe such as the North Sea or Scandinavia display positive values. Wind speed predictions show the highest skill in northern Europe; in southern Europe negative RPSS values are found. This result is in agreement with previous work (e.g., Weisheimer et al. 2011) that indicated that seasonal dynamical predictions have limited forecast quality over Europe.
The skill improvement is also present in southeastern Asia, the central United States, and northeastern South America, where positive values appear when bias-correction and calibration techniques are applied. The bias adjustment allows the skill in those regions associated with ENSO teleconnections (Hamlington et al. 2015; Quan et al. 2006), as well as that associated with other sources of seasonal to interannual predictability such as the persistence of the North Pacific decadal oscillation (Gershunov and Cayan 2003), to emerge. Wind speed with positive skill in North American regions has important implications for the wind energy sector in this economically active region.
The differences between the correlation and CRPSS before and after the bias adjustment of the wind speed forecasts have been included in online supplemental Figs. S3 and S4. The correlation of the uncorrected forecasts is always higher because of the cross-validation leakage mentioned above. It is noticeable that the correlation spatial distribution in the calibrated hindcasts is noisier than in the two other types of forecasts considered. This noise is due to the coefficients estimated in the calibration having a smaller spatial decorrelation length and being less robust than the mean and variance used in the simple bias correction.
For the uncorrected predictions (Figs. 5a,b), the overpopulated lower ranks and the negative slope in the rank histogram illustrate that a positive unconditional bias is present in the data. These biases appear for the predictions of both regions, although the effect of this deficiency seems more important in Canada (Fig. 5a) where all the observations are exceeded by the majority of the ensemble members, leaving the highest rank categories almost empty. The simple bias-corrected and calibrated forecasts show more homogeneously populated ranks, indicating that the reliability of the ensemble improves when the bias adjustment is applied. The deviations from flatness of these rank histograms could be the result of some forecast deficiencies still remaining after the bias adjustment. For instance, for the calibrated forecasts in Canada (Fig. 5e), the rank 50 shows a very large value that might indicate that the ensemble overestimates the true uncertainty range.
To assess whether the deviations from flatness of the rank histograms are attributable to chance or to deficiencies in the forecasts, goodness-of-fit test statistics are computed under the null hypothesis that the rank histogram is uniform (Table 1). The three statistical tests (Pearson χ², JP slope, and JP convex) allow us to identify whether the forecasts are biased or whether the ensemble has over- or underdispersion.
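A minimal sketch of how a rank histogram and its Pearson χ² statistic could be computed is given here for illustration. The function names are hypothetical, ties between members and observations are ignored, and the JP slope and JP convex components are not reproduced.

```python
def rank_histogram(ensembles, observations):
    """Count how often the observation takes each rank within the ensemble.

    ensembles: one list of member values per forecast (all of equal size).
    observations: one verifying value per forecast.
    Rank 0 means that every member exceeds the observation.
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)  # number of members below the observation
        counts[rank] += 1
    return counts


def pearson_chi2(counts):
    """Pearson chi-square statistic against a uniform (flat) histogram."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)
```

With this convention a positive bias piles the counts into the lowest ranks, and a flat histogram gives a statistic of zero. Under the null hypothesis of uniformity the statistic is compared with a χ² distribution whose degrees of freedom equal the number of ranks minus one.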
Table 1 shows that departures from flatness exist for the uncorrected forecasts, especially in Canada, where the tests take very high values, showing that the ensembles are underdispersive, as evidenced by the high JP convex values. The high values of the JP slope show that the forecasts are also affected by biases. The uncorrected forecasts in the North Sea also have biases and are underdispersive, although the statistical tests have smaller values than those in Canada. The results are statistically significant, with the significance p values being virtually 0.
The tests applied to the simple bias-corrected forecasts and the calibrated forecasts indicate that the deviation from flatness is minimized when the bias adjustment is applied. The Pearson χ² for the calibrated data in the Canada region has higher values than the simple bias-corrected ones (p value of 0.01), but the JP tests provide no evidence of departures from flatness, with p values higher than 0.01. This result shows that the biases and the underdispersion in the raw ensemble are corrected and that the deviations from uniformity are independent of these specific problems. Making sure that the ensemble is well calibrated is a critical aspect of the forecast for the user because it suggests that the ensemble predictions represent the forecast error, within statistical sampling, and can be trusted in specific applications that have been developed using meteorological observational references.
To further analyze the impact of the bias adjustment on reliability, reliability diagrams (Fig. 6) are used to compare the observed frequencies with the forecast probabilities (obtained from the ensemble forecasts) for binary events. The events are defined by the thresholds of the lower and upper terciles, as for the RPSS but in a dichotomous way. If the prediction system is reliable, then good agreement should exist between forecast probabilities and observed relative frequencies and the graph should be close to the diagonal.
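The points of such a diagram can be obtained with a short sketch like the following. The function name, the equal-width binning, and the default of five bins are assumptions made for illustration and are not necessarily the settings used for Fig. 6.

```python
def reliability_points(probs, outcomes, n_bins=5):
    """Return (mean forecast probability, observed frequency, count) per non-empty bin.

    probs: forecast probabilities of a binary event, each in [0, 1].
    outcomes: 1 if the event was observed and 0 otherwise.
    """
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p = 1.0 goes into the last bin
        bins[index].append((p, o))
    points = []
    for pairs in bins:
        if pairs:
            n = len(pairs)
            mean_prob = sum(p for p, _ in pairs) / n
            obs_freq = sum(o for _, o in pairs) / n
            points.append((mean_prob, obs_freq, n))
    return points
```

Plotting the observed frequency against the mean forecast probability gives the reliability curve, and the counts per bin give the sharpness diagram.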
The slope of the reliability diagrams is positive. This shows that, as the forecast probability of the event occurring increases, so does the verified chance of observing the event and that therefore the forecasts have some reliability. The reliability curves for the three events have a shallower slope than the diagonal in both regions, suggesting that the probability forecasts are overconfident. For the uncorrected forecasts in Canada (Fig. 6a), the curve for the below-normal category (blue line) flattens when the forecast probability is above 0.45. This means that, when the forecast probability is higher than 0.45, there is no relationship between the forecast probabilities and the frequency of the observed below-normal wind speeds. The reliability diagram for the uncorrected predictions in the North Sea (Fig. 6b) shows only a narrow set of probabilities issued, with values ranging from 0.1 to 0.5 for the above-normal (red line) and below-normal (blue line) categories and from 0.4 to 0.7 for the normal category (orange line). In addition, the above-normal category is so steep that it falls outside the consistency bars. This result illustrates the poor reliability for that event in the North Sea when the predictions are uncorrected.
The reliability curves of the bias-corrected predictions (Figs. 6c,d) show features that are similar to those of the uncorrected ones. One should bear in mind that, apart from correcting the mean and standard deviation of the forecast distribution, the simple bias correction does not have any additional impact on the predictions and, hence, no substantial changes beyond the effect of the cross validation should be expected in the reliability diagram.
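As an illustration, a simple bias correction of this kind can be sketched as a linear rescaling of every ensemble member. The leave-one-out cross validation, the pooling of all members to estimate the forecast standard deviation, and the function name are assumptions of this sketch rather than a description of the exact implementation used here.

```python
from statistics import mean, pstdev


def simple_bias_correction(hindcast, reference):
    """Adjust the mean and standard deviation of each hindcast year.

    hindcast: one list of ensemble members per year.
    reference: one reference (e.g., reanalysis) value per year.
    The statistics for year i are estimated without year i (leave-one-out).
    """
    corrected = []
    for i, members in enumerate(hindcast):
        train_fc = [m for j, year in enumerate(hindcast) if j != i for m in year]
        train_ref = [r for j, r in enumerate(reference) if j != i]
        fc_mean, fc_sd = mean(train_fc), pstdev(train_fc)
        ref_mean, ref_sd = mean(train_ref), pstdev(train_ref)
        scale = ref_sd / fc_sd  # assumes a nonzero forecast spread
        corrected.append([(m - fc_mean) * scale + ref_mean for m in members])
    return corrected
```

Because the same shift and scale are applied to every member of a given year, the ordering of the members and the relative position of the ensemble mean are preserved, which is why little change is expected in the reliability diagram.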
The calibrated predictions for the above-normal and below-normal events (Figs. 6e,f) have reliability diagrams whose points lie closer to the diagonal than is found for the uncorrected and bias-corrected predictions. This result corresponds to a better agreement between the forecast probabilities and the probability of the observed event than in the other two cases, suggesting that the overconfidence has been corrected. For the North Sea (Fig. 6f), the slope of the curve for the normal category (orange line) becomes horizontal, suggesting that the system cannot discriminate between predictable and unpredictable normal wind speeds in this region. This is not surprising because normal events might not have strong signals, which are the kinds of signals that are associated with the predictability of the system.
In addition, for the predictions of below-normal and above-normal wind speeds after calibration the sharpness diagrams (Figs. 6e,f) show more homogeneously populated bins for both regions. This means that the forecast system is able to predict those events with a larger range of forecast probability values. Conversely, the uncorrected and simple bias-corrected predictions display their frequency peaks near the climatological frequency so that they often predict the event with a climatological probability. These results show the improvement in the reliability of the predictions obtained when calibration is applied—improvements that are particularly relevant to the users.
Seasonal predictions have not yet been widely taken into account by the wind energy sector. Some applications of this type of forecast in the energy sector have recently been identified, however. They illustrate that predictions at seasonal time scales can be used by the industry as input to decision-making processes, replacing the current naive climatological information. In this paper we illustrate a strategy for the use of wind speed seasonal predictions by the wind energy sector.
This paper first describes one of the most popular operational seasonal forecast systems, ECMWF’s System 4, and its forecast-quality characteristics, and then describes two bias adjustment techniques that correct the typical deficiencies of the predictions from global forecast systems. It is shown that bias adjustment is indispensable for the predictions to be usable. The System 4 predictions have skill in predicting wind speed at seasonal time scales, especially in the tropics, but also in extratropical regions of relevance to the wind energy sector. This is an encouraging result that has not been documented elsewhere. Dynamical seasonal predictions do suffer from a number of important systematic errors that also affect wind speed predictions. Bias adjustment methods are required for the predictions to have the same statistical properties as the observational reference and hence to be applicable by users. With regard to the bias adjustment, the simple bias correction and the calibration method produce predictions with statistical properties that allow their actual application. The most important gain in forecast quality for seasonal predictions comes through the increase in their skill and reliability, the latter being a critical aspect of the forecasts from the user perspective. These gains in forecast quality cannot be evidenced using correlation, which suggests that more than one measure of forecast quality is needed, even in a user context.
The predictions and the impact of the bias adjustment are illustrated on two skillful regions that are crucial for the wind energy sector, the North Sea and central Canada. A further analysis of the predictions reveals that both the bias correction and the calibration method produce an improvement in the consistency of the ensemble. In addition, the reliability diagrams demonstrate that the calibration method, which also corrects the deficiencies in the ensemble spread, provides more reliable predictions than does the simple bias correction technique. Improvements in reliability are fundamental from a user perspective because reliability guarantees the trustworthiness of the predictions.
Our work demonstrates that calibration is necessary because it produces an improvement in both skill and reliability, making this technique essential for seasonal predictions to be usable. The development of these strategies is part of a recent initiative undertaken by the climate community in which climate services are developed to provide more relevant, reliable, and action-oriented climate information (Buontempo et al. 2014). This paper illustrates the fact that seasonal predictions of near-surface wind speed have skill in several regions where there is substantial installed power and that after bias adjustment the predictions are reliable for use.
Future improvements include the combination of seasonal predictions from different sources, based on both dynamical and empirical–statistical forecast systems. The global and illustrative character of this paper requires the use of a reanalysis as reference data. Verification against other reanalyses and regional observed wind speed data might offer slightly different results because of the observational uncertainty, which is an additional factor that will be taken into account in future analyses, but the need for a bias adjustment process will be unavoidable. There are simple ways to convert the wind speed into energy density that will also be explored from the point of view of seasonal prediction, and the use of empirical downscaling could offer additional benefits when considering seasonal predictions for specific power plants.
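One such conversion, given here only as an illustrative sketch, is the standard wind power density relation, in which the power per unit of swept area equals one-half of the air density times the cube of the wind speed. The function name and the default air density of 1.225 kg m^-3 (a typical sea level value) are assumptions of this sketch.

```python
def wind_power_density(speed_ms, air_density=1.225):
    """Wind power density (W m^-2) for a wind speed in m s^-1.

    air_density is in kg m^-3; the default is an assumed sea level value.
    """
    return 0.5 * air_density * speed_ms ** 3
```

For a wind speed of 10 m s^-1 this gives about 612.5 W m^-2. Because the relation is cubic, applying it to a seasonal-mean wind speed generally underestimates the seasonal-mean power density, so the conversion would preferably be applied to higher-frequency values before averaging.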
The work described here opens the field to the next step in the development of a climate service: the creation of tailored products that facilitate the widespread use of climate predictions by the wind energy sector (step 4 in Fig. 1). The release of climate services can range from knowledge transfer (informing, documenting, and providing training in the best bias adjustment techniques) to the creation of operational online interactive interfaces that allow users in the wind industry to easily explore probabilistic predictions. An example of a prototype of an interactive platform that incorporates bias-adjusted predictions can be found at the Project Ukko (http://www.project-ukko.net) online interface designed in the framework of the EUPORIAS project. In addition, the New European Wind Atlas (NEWA; see online at http://euwindatlas.eu/), which is currently in development, will provide access to skill evaluations of climate predictions. Further interactions between the climate science community and the renewable energy community are also indispensable to quantify the actual economic value of climate predictions and to evaluate how the predictions have performed in the past. This is a necessary step to demonstrate to energy stakeholders the saliency of climate forecast outcomes.
The authors acknowledge funding support from the RESILIENCE (CGL2013-41055-R) project, funded by the Spanish Ministerio de Economía y Competitividad (MINECO), and the FP7 EUPORIAS (GA 308291) and SPECS (GA 308378) projects. We also acknowledge funding from the COPERNICUS action CLIM4ENERGY-Climate for Energy (C3S 441 Lot 2) and the New European Wind Atlas (NEWA) project funded from ERA-NET Plus, Topic FP7-ENERGY.2013.10.1.2. Special thanks are given to Nube González-Reviriego and Albert Soret for helpful comments and discussion. We acknowledge use of the s2dverification (http://cran.r-project.org/web/packages/s2dverification) and SpecsVerification (http://cran.r-project.org/web/packages/SpecsVerification) R-language-based software packages. We also thank Pierre-Antoine Bretonnière, Oriol Mula, and Nicolau Manubens for their technical support at different stages of this project.
Supplemental information related to this paper is available at the Journals Online website: http://dx.doi.org/10.1175/JAMC-D-16-0204.s1.