## 1. Introduction

Wind data at time scales from 10 min to 1 h are an important input for modeling the performance of wind farms and the impact of wind farms on many countries’ national electricity systems. In New Zealand, however, because of commercial sensitivity, relatively few 10-min wind data at actual or proposed wind-farm sites are publicly available. The National Institute of Water and Atmospheric Research (NIWA) and Meteorological Service of New Zealand (MetService) do collect 10-min wind data at about 60 sites, and these are publicly available from NIWA’s National Climate Database (CLIDB). These climate stations tend to be low-elevation sites, however, and their data have been generally available only within the past few years. Also, the NIWA and MetService data are typically from 10-m masts, which are well below the hub height for turbines and lower than the masts used to collect data at wind-farm sites.

In the absence of actual observed data, simulated or “synthetic” data can be used by electricity system modelers, provided that the data are well calibrated against real data, to produce realistic results for applications such as 1) creating generation scenarios during the passage of frontal systems, 2) calculating the capacity contribution of wind, 3) estimating the contribution of wind during different seasons, and 4) other studies that look at integrating wind energy into the national grid.

The Electricity Commission, a regulatory authority of the electricity industry in New Zealand with market oversight and advisory responsibilities to central government, had a need for such a synthetic wind dataset (SWD). The required properties of the SWD time series were that it 1) be valid for turbine altitudes (85 m), 2) include multiple years, 3) cover 15 wind-farm or potential wind-farm sites (Fig. 1 shows the regions within which these sites were located), 4) be at 10-min intervals with records that are time stamped to preserve realistic meteorological interarea correlations so as to allow modeling of the impact of specific meteorological events, 5) reproduce well seasonal and synoptic variations in wind speed and preserve intersite correlations, 6) reproduce well the higher-frequency fluctuations (from hour to hour and from 10-min period to 10-min period) in wind speed and preserve intersite correlations, 7) reproduce reasonably (given the constraints of “disguised” data provided by generator companies) the climatological characteristics of wind speed for each site (i.e., the approximate wind resource), 8) reproduce reasonably the frequency of speeds below and above the turbine operating thresholds, and 9) have no missing periods. There was no interest in accurately reproducing the relative wind resources among sites; that is, this study was not about identifying rich wind-resource areas within New Zealand. There was also no requirement for or interest in reproducing exactly the actual 10-min wind variations observed at each site.

In this paper, the data are described in section 2, and the novel method used to create the SWD is described in section 3. The use of a method developed by Haslett and Raftery (1989) was considered but was found to be unsuitable (see the appendix). In section 4, the SWD is compared with “observed” mast data from wind farms to demonstrate that the requirements for electricity system modeling are met. Conclusions are discussed in section 5.

## 2. Data

The philosophy behind the method was to exploit several years of archived 12-km gridded hourly winds available over the whole of New Zealand from regional numerical weather prediction (NWP) models. From experience and analysis (e.g., Reid and Turner, 2001) it was known that these NWP winds [available from the fifth-generation Pennsylvania State University–National Center for Atmospheric Research Mesoscale Model (MM5) and a local configuration of the United Kingdom’s Met Office Unified Model that is called the New Zealand Limited Area Model (NZLAM)] have intersite correlations and temporal variability (on a synoptic time scale) that match the observations reasonably well. The development of robust relationships between the NWP winds and the hourly speeds observed at wind farms followed by imposing a high-frequency (10 min) residual appropriate for the site would produce a synthetic wind dataset with the desired properties. The high-frequency residuals would be uncorrelated among sites (as is observed, other than for sites that are less than a few kilometers apart) but have a distribution of time-step-to-time-step fluctuations that matched 10-min wind-farm mast observations. In this section the data used and the NWP outputs are described.

### a. Observed mast data from wind-farm sites

The observed mast data provided by generators are commercially sensitive, and therefore for most sites they were disguised by either 1) normalizing or 2) not revealing the mast height and/or exact location. This latter point precluded the use of high-resolution modeling techniques. The wind speeds as provided were therefore multiplied by a factor chosen to ensure that the average of the observed data was a reasonable estimate for that site; these estimates were rounded to the nearest 0.5 m s^{−1} to respect further the commercial sensitivity of the data. The choice of this factor was based on 1) our own knowledge of likely local terrain impacts on regional wind average speeds from nearby lowland climate stations, 2) an existing coarse-resolution annual national wind speed map (available from NIWA) and/or the NWP regional average, and 3) the need to round the disguised average to the nearest 0.5 m s^{−1}. A summary of the wind data provided by generators for the 15 sites is given in Table 1. Note that one site (CKS3) was a public site and is included for full verification purposes and that another site (NTH3) had no mast data at all; for the latter site a composite from two nearby windy sites with similar exposures was used.

Table 1. Facts about the observed data provided for each site. The codes for methods of disguise are M = mast height not provided, N = normalized data, Z = zero observations (composite used), and X = power-law extrapolation from 10-m observations.

### b. Publicly available observed 10-m mast data

Wind speeds *w* from nearby lowland climate stations were extracted from CLIDB and were extrapolated to 85 m using the power-law relationship

$$
w_{85} = w_{10} \left( \frac{85}{10} \right)^{\alpha}. \qquad (1)
$$

The *α* parameter used in this extrapolation and in the extrapolation of the NZLAM 10-m winds to 85 m is based on representative land use/vegetation type surrounding the sites, with values ranging between 0.096 and 0.147. For site CKS1 there was no period for which available observations overlapped with NZLAM; here climate-station data from Baring Head and Cape Campbell were used to provide a composite dummy station with hourly records. In a similar way, data from Mokohinau and Musick Point were composited to construct a dummy observed time series for site NTH3 because no observations from that possible wind-farm site exist. Climate-station data from Palmerston North and Dannevirke were also used in assessing the possible use of a method described by Haslett and Raftery (1989) (see appendix).
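The height extrapolation described above can be sketched in a few lines. This is a minimal illustration that assumes the standard power-law profile, w(z) = w(z_ref) (z / z_ref)^α; the function name `extrapolate_power_law` and its argument names are hypothetical and do not come from the paper.

```python
def extrapolate_power_law(speed_ref, alpha, z_ref=10.0, z_target=85.0):
    """Scale a wind speed from height z_ref to z_target with a power-law profile.

    Assumes w(z_target) = w(z_ref) * (z_target / z_ref) ** alpha.
    """
    if z_ref <= 0 or z_target <= 0:
        raise ValueError("heights must be positive")
    return speed_ref * (z_target / z_ref) ** alpha


# Multiplicative factor from 10 m to 85 m at the two ends of the
# alpha range quoted in the text (0.096 and 0.147).
for alpha in (0.096, 0.147):
    factor = extrapolate_power_law(1.0, alpha)
    print(f"alpha = {alpha:.3f}: 10 m -> 85 m factor = {factor:.3f}")
```

Under this assumed form, the quoted range of *α* corresponds to multiplying the 10-m speeds by factors of about 1.23 to 1.37.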

### c. MM5 output

The New Zealand MetService has been running MM5 operationally at 12-km resolution since 2003, and the archive of boundary layer wind forecasts from September 2003 through August 2008 was made available to this project. The configuration used for the wind archive was a nested configuration with the outer nest set to 36-km horizontal resolution (110 × 110 grid points) and the inner nest set to 12 km (109 × 141 grid points). Every third grid point of the inner nest was coincident with a grid point of the outer nest. The inner nest was not centered within the outer nest. The horizontal grid had an Arakawa–Lamb B staggering of the velocity variables, with scalar quantities defined in the center of each grid square. The rest of the configuration is given in Table 2. The vertical structure was a terrain-following sigma coordinate in which *σ* was 1 at the surface and 0 at the model top (which was set to 100 hPa) and there were 25 vertical levels.

Table 2. A list of important parameter settings for the MM5 forecasts used in this study.

The MM5 configuration was initialized and had the lateral boundary of the outer nest set by the National Centers for Environmental Prediction Global Forecast System (NCEP GFS) global model at 1° horizontal resolution. Where there were gaps in the archived data, these were filled by initializing the same MM5 configuration with the NCEP reanalysis (Kalnay et al. 1996) dataset at 2.5° horizontal resolution. MM5 was simply run in a “cold start” fashion without any data assimilation. The model was run 4 times per day at initialization times of 0000, 0600, 1200, and 1800 UTC. Other important MM5 settings are given in Table 2.

Hourly winds at 10, 15, 30, 45, 60, 75, 100, 120, 150, and 200 m were extracted from the forecast model archives using a “log*p*” interpolation from model sigma levels. These output levels are regularly used for MetService forecaster analyses and were a default operational postprocessing setting; an additional log*p* interpolation from these output levels was done to obtain the 85-m values. The prognosis times were *T* + 6, *T* + 7, *T* + 8, *T* + 9, *T* + 10, and *T* + 11, in hours, where *T* is the time that the run started. These prognosis times were chosen so as to be far enough into the model run to avoid problems with model “spinup.” No horizontal interpolation was carried out, and the four surrounding grid points for each of the sites were provided. The choice of which grid point to then use was based on which had the highest correlation with observed data. A horizontally interpolated value could have been used, but tests done for site CKS1 showed little impact in the synthetic wind speed time series ultimately derived.
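The grid-point choice described above (pick, among the four surrounding grid points, the one whose series correlates best with the observations) might look like the following sketch. The function name `best_grid_point`, the array layout, and the synthetic demonstration data are assumptions made for illustration only.

```python
import numpy as np


def best_grid_point(observed, candidates):
    """Return (index, correlations) for the candidate most correlated with observations.

    observed:   1-D array of observed speeds.
    candidates: 2-D array, one row per surrounding grid point, same length as observed.
    Time steps with a missing (NaN) observation are ignored.
    """
    observed = np.asarray(observed, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    valid = ~np.isnan(observed)
    corrs = np.array(
        [np.corrcoef(observed[valid], row[valid])[0, 1] for row in candidates]
    )
    return int(np.argmax(corrs)), corrs


# Synthetic demonstration: four noisy copies of one series; the third is the least noisy.
rng = np.random.default_rng(7)
truth = rng.gamma(shape=2.0, scale=4.0, size=1000)
noise_sd = [4.0, 8.0, 1.0, 6.0]
grid = np.array([truth + rng.normal(0.0, sd, truth.size) for sd in noise_sd])
index, corrs = best_grid_point(truth, grid)
print(index, np.round(corrs, 2))
```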

Spectral filtering of the extracted hourly time series was necessary to remove unphysical spikes that were artifacts of the discontinuity introduced by using the winds from *T* + 6 to *T* + 11 h from consecutive forecasts; that is, the winds at *T* + 12 of one forecast are not always equal to the *T* + 6 winds from the next. Also noise that was perhaps introduced through the assimilation scheme within the GFS forecast cycles would contribute to spikes at sub-6-hourly harmonics. The spectral filtering procedure involved adjusting values of the spectral coefficients for the small number of affected frequencies (mainly 12, 9, 6, 3, and 1 h) to match the values of the observed power spectra.
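One way to picture the adjustment of spectral coefficients is the sketch below. It is not the authors' procedure, whose details are not given here; it assumes that the modeled and observed series are gap free, equally long, and on the same time step, and that only the magnitude of the modeled Fourier coefficient nearest each affected period is rescaled to the observed magnitude (the phase is kept). Periods shorter than twice the sampling interval cannot be represented and are rejected. The name `match_spectral_peaks` is hypothetical.

```python
import numpy as np


def match_spectral_peaks(model, observed, periods_h, dt_h=1.0):
    """Rescale model Fourier amplitudes at selected periods to the observed amplitudes."""
    model = np.asarray(model, dtype=float)
    observed = np.asarray(observed, dtype=float)
    if model.shape != observed.shape:
        raise ValueError("series must have the same length")
    n = model.size
    freqs = np.fft.rfftfreq(n, d=dt_h)      # cycles per hour
    f_model = np.fft.rfft(model)
    f_obs = np.fft.rfft(observed)
    for period in periods_h:
        if period < 2 * dt_h:
            raise ValueError("period is shorter than twice the sampling interval")
        k = int(np.argmin(np.abs(freqs - 1.0 / period)))  # nearest frequency bin
        amp = np.abs(f_model[k])
        if k > 0 and amp > 0:
            f_model[k] *= np.abs(f_obs[k]) / amp          # keep phase, change magnitude
    return np.fft.irfft(f_model, n=n)


# Synthetic demonstration: the "model" has a spurious 6-h oscillation.
rng = np.random.default_rng(0)
n = 720                                     # 30 days of hourly values
t = np.arange(n)
daily = 8 + 2 * np.sin(2 * np.pi * t / 24)
observed = daily + rng.normal(0.0, 0.5, n)
model = daily + 1.5 * np.sin(2 * np.pi * t / 6) + rng.normal(0.0, 0.5, n)
filtered = match_spectral_peaks(model, observed, periods_h=(12, 6, 3))
```

Because the zero-frequency coefficient is never touched, the mean of the series is unchanged by this kind of adjustment.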

### d. NZLAM output

NIWA’s NWP model, NZLAM, was run on a rotated latitude–longitude map projection, with the equator centered on the geographical domain of interest, leading to a nearly conformal 12-km grid mapping for its 324 × 324 × 38 (level) grid. The Met Office Unified Model, or UM, (of which NZLAM is a local implementation) is a nonhydrostatic, fully compressible, deep-atmosphere formulation using a terrain-following height-based coordinate. It uses a semi-Lagrangian advection scheme for prognostic variables and an Eulerian treatment of the continuity equation for mass conservation (Davies et al. 2005). It was run on a 6-h analysis/forecast cycle, and two forecasts (at 0600 and 1800 UTC) each day were run to 48 h (the other two ran to the next analysis time window, i.e., 6 h).

NZLAM’s data assimilation system is three-dimensional variational data assimilation with the first guess at appropriate time (3DVAR + FGAT; Lorenc et al. 2000), and the increments are estimated at the time of observation using a 3-h data cutoff and then are added to the background using an incremental-analysis update technique (Bloom et al. 1996), beginning 3 h before the nominal analysis time and extending over 6 h (i.e., to 3 h after nominal analysis time). Data assimilated include standard meteorological observations from land and ocean stations (ships and buoys), rawinsondes, and aircraft, as well as satellite observations of atmospheric radiances [High Resolution Infrared Radiation Sounder (HIRS), Advanced Microwave Sounding Unit A (AMSU-A), and AMSU-B], ocean surface wind speed and total column moisture [Special Sensor Microwave Imager (SSM/I)], ocean surface wind speed and direction [Advanced Scatterometer (ASCAT) and Quick Scatterometer (QuikSCAT)], and atmospheric motion vectors from geostationary meteorological satellites [Multifunctional Transport Satellites (MTSAT)]. Because this forecast system assimilates local data, it can be “warm cycled.” It therefore did not use global model data within the forecast domain and preserved mesoscale structure between forecast cycles, with the result that NZLAM forecasts did not suffer from spinup problems like those that occur in non-data-assimilating models.

A comprehensive description of the UM can be found in Staniforth et al. (2004), and applications of the UM to simulating strong wind events in New Zealand are reported in Webster et al. (2008). Some of the important model configurations and parameter settings of NZLAM are given in Table 3.

Table 3. A list of important parameter settings for the NZLAM forecasts used in this study.

For this study, surface wind speed and direction output were extracted from the archives for forecast hours from *T* + 0 to *T* + 12 for each forecast made between 24 October 2006 and 15 December 2008. Surface speeds were used because other model levels from within the PBL were not archived for the whole of the required period. The surface speeds were interpolated to the approximate locations of the wind-farm sites (the coordinates of the mast locations were typically not divulged) and then extrapolated to 85 m using Eq. (1). Within this 2-yr period there were four short (either 12 or 24 h) periods in April and May of 2007 for which no NZLAM output was available. For these periods, nearby surface hourly observations extrapolated from 10 to 85 m were used to fill the gaps. Some spectral filtering of the extracted NZLAM hourly time series was also necessary to remove unphysical spikes. In hindsight, given the fact that forecasts from *T* + 0 to *T* + 3 are within the assimilation window, using forecasts from *T* + 3 to *T* + 15 h may have resulted in a smoother power spectrum that would have been in less need of filtering.

## 3. The statistical NWP-based method

The statistical steps of our eventual method, developed after some trial and error, are described in this section.

### a. Modeling 10-min wind speeds

The 10-min wind speed at each site was modeled as

$$
w = a + b x + \sum_{i=-3}^{1} c_i y_i + d_1 \cos(z) + d_2 \cos(z + 45^\circ) + \varepsilon, \qquad (2)
$$

where *w* is observed 10-min wind speed, *x* is the 24-h centered running mean of modeled NWP hourly wind speed, *y* is the departure of modeled hourly wind speed from *x*, *y*_{i} is the *i*-h lagged *y* (*y*_{0} = *y*), and *z* is the wind direction in degrees (*z* = 0 is for northerly). The cos(*z*) and cos(*z* + 45°) terms reflect the contribution of the modeled wind direction to the wind speed data; cos(*z*) represents north–south-oriented winds, and cos(*z* + 45°) represents northwest–southeast orientation. There is no need for terms for east–west or southwest–northeast orientation because these are orthogonal to cos(*z*) and cos(*z* + 45°). The reason for these terms is that in various places in New Zealand the wind can be channeled through gaps in mountain ranges or can have speedup or slowdown effects depending on the direction. This channeling is not necessarily properly taken into account even with a model grid resolution of around 12 km, and these terms allow the equation to include such effects on the local wind speed.

All NWP wind components are at hourly time steps, but these were linearly interpolated to 10-min time steps. Here *a*, *b*, *c*_{i} (*i* = −3, −2, −1, 0, 1), *d*_{1}, and *d*_{2} are the combining coefficients to form the NWP-predicted component (Table 4); *ε* is the residual.

In the ARMA model for the residual [Eq. (3)], *t* is in 10-min time steps, *ε*_{t} is the residual of the daily component or 10-min component, and *η*_{t} is the innovation, which is normally distributed and statistically independent with respect to time *t*. The *α* coefficients are given in Table 5.
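The NWP-predicted component of Eq. (2) is a linear regression on a running mean, lagged departures, and two direction terms, so it can be sketched with ordinary least squares. The sketch below makes several assumptions that are not taken from the paper: the series are already on a common time step, the 24-h centered mean is approximated by a 25-point window on hourly data, and the *i*-h lag is read as *y*_{i}(*t*) = *y*(*t* − *i*). All function names are hypothetical.

```python
import numpy as np


def centered_running_mean(values, window=25):
    """Centered moving average; positions where the window does not fit are NaN."""
    values = np.asarray(values, dtype=float)
    half = window // 2
    out = np.full(values.shape, np.nan)
    smooth = np.convolve(values, np.ones(window) / window, mode="valid")
    out[half:half + smooth.size] = smooth
    return out


def lagged(values, lag):
    """Return s with s[t] = values[t - lag]; positions without data are NaN."""
    values = np.asarray(values, dtype=float)
    out = np.full(values.shape, np.nan)
    if lag > 0:
        out[lag:] = values[:-lag]
    elif lag < 0:
        out[:lag] = values[-lag:]
    else:
        out[:] = values
    return out


def design_matrix(speed, direction_deg, lags=(-3, -2, -1, 0, 1)):
    """Columns: intercept, x, y_i for each lag, cos(z), cos(z + 45 deg)."""
    x = centered_running_mean(speed)
    y = np.asarray(speed, dtype=float) - x
    z = np.deg2rad(np.asarray(direction_deg, dtype=float))
    columns = [np.ones_like(x), x]
    columns += [lagged(y, i) for i in lags]
    columns += [np.cos(z), np.cos(z + np.pi / 4)]
    return np.column_stack(columns)


def fit_eq2(observed, speed, direction_deg):
    """Least-squares estimate of (a, b, c_-3..c_1, d_1, d_2) and the residual series."""
    X = design_matrix(speed, direction_deg)
    observed = np.asarray(observed, dtype=float)
    ok = ~np.isnan(X).any(axis=1) & ~np.isnan(observed)
    coef, *_ = np.linalg.lstsq(X[ok], observed[ok], rcond=None)
    residual = np.full(observed.shape, np.nan)
    residual[ok] = observed[ok] - X[ok] @ coef
    return coef, residual


# Synthetic check: build "observations" from known coefficients and recover them.
rng = np.random.default_rng(42)
n = 2000
hours = np.arange(n)
nwp_speed = 8 + 3 * np.sin(2 * np.pi * hours / 120) + rng.normal(0.0, 1.0, n)
nwp_dir = rng.uniform(0.0, 360.0, n)
true_coef = np.array([1.0, 1.2, 0.0, 0.0, 0.0, 0.8, 0.0, 0.5, 0.0])
site_speed = design_matrix(nwp_speed, nwp_dir) @ true_coef  # NaN near both ends
coef, resid = fit_eq2(site_speed, nwp_speed, nwp_dir)
print(np.round(coef, 3))
```

The residual series returned here plays the role of *ε* in Eq. (2); in the paper it is further split into daily and 10-min components before being modeled.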

To ensure that the variance of the simulated daily residual component was close to that observed, the innovation variance for the daily residual component was scaled by the ratio of the variance of the observed daily residuals to the variance of the modeled daily residuals using the initially estimated parameters. For some stations, the innovation standard deviation of the 10-min residual was fitted to a linear function of the NWP wind speed. This linear function was estimated by maximum likelihood estimation since the probability distribution of the innovation *η*_{t} was known. This means that in the ARMA process as defined here the innovation error variance can depend on the simulated wind trend, and, because the simulated trends may have periodic or seasonal characteristics, so too does the residual series.

The function “arima.sim” in the statistical package R (Dalgaard 2008) was used for simulating the daily and 10-min residual time series. To use arima.sim, autoregression and moving-average coefficients and innovation time series were supplied. The coefficients used for simulation were estimated using Eq. (3). The innovation series for the 10-min residuals was rescaled by the linear function of NWP-predicted component [estimated using Eq. (2)]. The simulated daily error and 10-min error were added onto the NWP-predicted component to form the simulated wind speeds.
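The simulation step can be illustrated outside R with a small recursion. The sketch below is not a port of `arima.sim` (it has no burn-in period and starts from zero), and every coefficient value in it is hypothetical because Tables 4 and 5 are not reproduced here. It shows the two ingredients named in the text: user-supplied innovations, and an innovation standard deviation that is a linear function of the NWP-predicted component.

```python
import numpy as np


def simulate_arma(ar, ma, innovations):
    """Simulate e[t] = sum_i ar[i] * e[t-1-i] + eta[t] + sum_j ma[j] * eta[t-1-j]."""
    eta = np.asarray(innovations, dtype=float)
    e = np.zeros_like(eta)
    for t in range(eta.size):
        value = eta[t]
        for i, phi in enumerate(ar):
            if t - 1 - i >= 0:
                value += phi * e[t - 1 - i]
        for j, theta in enumerate(ma):
            if t - 1 - j >= 0:
                value += theta * eta[t - 1 - j]
        e[t] = value
    return e


rng = np.random.default_rng(1)
n = 5000                                         # 10-min time steps
steps = np.arange(n)
predicted = 8 + 3 * np.sin(2 * np.pi * steps / 144)   # stand-in NWP-predicted component

# Hypothetical linear relation between innovation spread and predicted speed.
innovation_sd = np.maximum(0.2 + 0.05 * predicted, 0.0)
eta = rng.standard_normal(n) * innovation_sd

residual_10min = simulate_arma(ar=[0.9], ma=[], innovations=eta)
simulated = predicted + residual_10min           # may contain negative speeds
print(round(float(simulated.min()), 2), round(float(simulated.max()), 2))
```

Because the innovations are Gaussian, a series built this way can go negative at low predicted speeds, which is the situation the redistribution of negative wind speeds is meant to handle.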

Some negative wind speeds were simulated as a result of the Gaussian error assumption in the ARMA models. The redistribution of the negative winds within the light-wind classes (<5 m s^{−1} for most sites; <9.5 m s^{−1} for the three windiest sites) was done so as to match the observed distribution of light winds.

### b. Treating spatial dependencies

Once 10-min SWD for all 15 sites had been created using the method documented in section 3a, intersite comparisons were then done to check whether spatial correlations of the original NWP hourly wind speeds had been maintained. In general, these intersite correlations were lower than observed, and therefore an additional adjustment had to be made to correct for this bias.

A reason for the lower simulated correlation is that only the spatial dependence of NWP-predicted components was modeled, whereas the spatial dependence of residuals was set to be statistically independent. Because the observed 10-min component of residuals was almost spatially independent, only the spatial dependence of the daily components of residual needed to be modeled. The spatial dependence was then modeled by introducing a correlated Gaussian innovation series of the daily component of residual. We chose a spatial correlation between the two innovation series of daily residual time series such that the correlation between the two simulated 10-min wind speed series was close to that observed. To remove some artificial coarseness in the low wind speeds (introduced when the redistribution of the negative wind speed tail of the site histograms was done), the wind speeds within each bin were simply randomly redistributed within that bin.
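Correlated Gaussian innovation series of the kind described above can be generated from independent draws with a Cholesky factor of the target correlation matrix. This is a generic sketch, not the authors' code; the function name `correlated_innovations` and the value 0.6 are illustrative assumptions.

```python
import numpy as np


def correlated_innovations(n_steps, corr, rng):
    """Draw standard-normal series (one column per site) with correlation matrix corr."""
    corr = np.asarray(corr, dtype=float)
    chol = np.linalg.cholesky(corr)               # corr = chol @ chol.T
    white = rng.standard_normal((n_steps, corr.shape[0]))
    return white @ chol.T


rng = np.random.default_rng(3)
target = np.array([[1.0, 0.6],
                   [0.6, 1.0]])
eta = correlated_innovations(20000, target, rng)
sample_corr = np.corrcoef(eta[:, 0], eta[:, 1])[0, 1]
print(round(float(sample_corr), 3))
```

As the text indicates, the innovation correlation is a tuning quantity: it is chosen so that the correlation between the two final simulated 10-min series, not between the innovations themselves, is close to that observed.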

### c. Matching the diurnal cycle

The provisional SWD was then adjusted so that its average diurnal cycle matched that observed. The adjustment Δ_{I} was obtained from the original NWP filtered hourly time series using *λ*, where *O* is the observed average diurnal cycle, *M* is the average diurnal cycle of the provisional SWD, and the *t* are the 144 ten-minute time intervals within a day. An upper limit on Δ_{I} was imposed to avoid “large” changes to the provisional SWD wind speed values.

### d. Other notes

The use of model hourly data from all four surrounding NWP grid points as predictors was tested for site CKS3 against just using an interpolated (weighted by inverse distance) average or the grid point that correlated best with observations. A lower Bayesian information criterion (BIC) resulted from using the latter.

Trials with logarithmic and square root transforms of *w* were also done; the transformations oversimulated the tail of the wind speed distribution (i.e., the incidence of high wind speeds was far too high), however, especially for the log transformation. Experiments with a generalized linear model with Poisson family led to a similar conclusion. Trials using a generalized additive model to fit a linear predictor were also done. The data seem to fit a linear model better, however, which matches our physical understanding of the data and NWP models.

The step that decomposed the residual *ε* in Eq. (2) into low-frequency (daily) and high-frequency (10 min) components resulted in power spectra with slightly less power at periods between about 4 and 16 h. The spectral coefficients in this range were modified (consistently among sites) in such a way as to match better the behavior of the observed spectra.

## 4. Results

In this section, a sample of results is presented whereby various properties of the observed site winds are compared with the MM5-based 5-yr SWD and the NZLAM-based 2-yr SWD. Characteristics of the time series that were checked were histograms of 10-min and hourly wind speeds, histograms of 10-min-to-10-min and hourly-to-hourly fluctuations in wind speeds, time series of monthly average wind speeds, averaged diurnal cycles, 10-min power spectra, intersite correlations of wind speeds and wind speed fluctuations, and some “synoptic” segments of the time series. In this section plots are not presented for all characteristics or for all 15 sites, and those plots included here are intended only to indicate to the reader the general veracity of the method, that is, whether observations are matched reasonably by the SWD. Plots for the entire periods of the SWD are shown as a sanity check on the entire SWD, and plots comparing just the period of overlap are shown to highlight differences with the observed data and provide an indication of whether a particular period was unusual. For each site, separate plots were generated for all characteristics and for both the entire period and the overlap period with observations, and these plots are all publicly available.

Ten-minute and hourly histograms of both wind speeds and fluctuations are presented for two “well simulated” [CKS3 (Fig. 2a) and STH1 (Fig. 3b)] and two “poorly simulated” [STH1 (Fig. 2b) and NTH2 (Fig. 3a)] sites. The use of the words well and poorly is relative and is subjectively based on visual inspection of the tails, mean, and overall shape of the histograms. Overall, the histograms of wind speeds and high-frequency fluctuations were reproduced well for 11 of the 15 sites and poorly (mainly because of deficiencies in reproducing the histogram in the important midrange wind classes of 9–14 m s^{−1}) for the other 4 sites (CNI2, MWT1, CTY1, and STH1). Some of the problems in reproducing the histograms are likely due to the coarse 12-km grid spacing of the NWP models and their failure to capture fully the local topographic speedup effects, slope flows, coastal jets, and so on.

Fig. 2. Histograms of observed, MM5-based SWD, and NZLAM-based SWD wind speeds at (a) site CKS3 (10 min) for the entire periods for which each dataset is available and (b) site STH1 (hourly) for the period during which the SWD overlaps with the 1-yr period of observations.

Citation: Journal of Applied Meteorology and Climatology 50, 12; 10.1175/2011JAMC2668.1

Fig. 3. Histograms of observed, MM5-based SWD, and NZLAM-based SWD wind speed deltas at (a) site NTH2 (10 min) for the entire periods for which each dataset is available and (b) site STH1 (hourly) for the period during which the SWD overlaps with the 1-yr period of observations.

The histograms of 10-min deltas (Fig. 3a) do show a tendency for an underprediction of the frequency of the delta class from −0.25 to 0.25 m s^{−1}. For eight of the sites (CTY1, CKS2, CKS3, MWT1, MWT3, CNI1, NTH1, and NTH2), these are underestimated by at least 5%; this means that for these sites the situations in which the 10-min winds are relatively steady are undersimulated. This behavior was not as apparent for the hourly deltas and likely resulted from the need to inflate the high-speed tail of the NWP histograms.

Figure 4 shows that monthly averages were generally captured well in the SWD, indicating that the method will probably provide good seasonal and interannual power-generation scenarios for the period of the SWD. This result suggests that if extended periods of high-resolution NWP reanalysis output were available this method could provide very good multidecadal historical time series of wind resource in New Zealand. Furthermore, if this were combined with existing long-term hydroelectric-power dam inflow records it would be an important and useful dataset in assessing the risk to New Zealand’s long-term security of renewable energy supply.

Fig. 4. Observed, MM5-based SWD, and NZLAM-based SWD monthly wind speeds for the period during which the SWD overlaps with the 1-yr period of observations at sites (a) STH2 and (b) MWT3.

Observed intersite correlations were calculated for each station pair where possible. There were typically at least 30 days of overlap for the site pairs, with about one-half of the sites having more than 180 days of overlap. There were two exceptions, however: CKS1, which had no overlap with any other site, and NTH3, for which such a calculation would not be valid because the “observations” there were a composite of two exposed “nearby” (within 100 km) climate stations. Figure 5 shows the intersite correlations as a function of the distance between sites for both the original NWP hourly series and the provisional SWDs. The “minimum” in correlations for distances of around 700 km is a “Cook Strait” effect and is an artifact of the particular locations of the 15 sites in this project. The correlation in speeds between Cook Strait sites and winds at other sites drops off more quickly with distance than for most other locations in New Zealand. A comparison of Figs. 5b and 6b shows the positive impact of the treatment of spatial dependencies. Overall, the observed relationships (Fig. 6a) were reasonably preserved for the final versions of both the MM5- and NZLAM-based SWD, although there was a tendency for the intersite SWD correlations to be slightly higher than was observed (Fig. 7).

Fig. 5. Correlations of wind speeds between each of the 15 sites as a function of distance for (a) the filtered hourly NZLAM time series (i.e., prior to any statistical treatment) and (b) the version of the NZLAM-based SWD just prior to the treatment of spatial dependencies. The thick solid curves approximate the lower and upper bounds for the observed correlations.

Fig. 6. Correlations of wind speeds between each of the 15 sites as a function of distance for (a) observations (when possible) and (b) the NZLAM-based SWD. The thick solid curves approximate the lower and upper bounds for the observed correlations.

Fig. 7. The distribution of “errors” (i.e., difference between SWD and observed intersite correlations) for all wind-farm sites that had more than 3 months of overlapping observations; 67 such comparisons were made.

Figure 8 shows a segment of observed and SWD 10-min time series for a 17-day period in June of 2007 for the MWT1 and MWT2 sites. It shows not only how the relationships between the sites have been preserved in both SWD but also how the large synoptically driven regional speedup on 4 June and the decrease on 9 June have been captured by the SWD.

Time series of 10-min wind speeds at MWT1 and MWT2 for the period from 1200 New Zealand standard time (NZST) 4 Jun to 1200 NZST 21 Jun 2007 (a) as observed, (b) from the MM5-based SWD, and (c) from the NZLAM-based SWD.

## 5. Discussion and conclusions

A 5-yr (from 2 September 2003 to 30 August 2008) MM5-based 10-min SWD and a 2-yr (from 24 October 2006 to 15 December 2008) NZLAM-based 10-min SWD at 85-m altitude have been created for use in modeling the impacts of 15 actual or proposed wind farms on New Zealand’s electricity system. The method for creating the SWD has also been described; it should be applicable in many other countries and is general enough that different hub heights could be specified for different sites. A variation on this method would have to be devised if two wind farms were sited so close to each other that their 10-min fluctuations in wind speed were not independent.

Summary statistics showing how well the two SWD correlate with observations for each site are given in Table 6. Both SWD appear to do a useful job of explaining the variance for all sites. Lower correlations at some of the sites (e.g., CKS1, CTY1, and NTH1) did not necessarily correspond to poorer representations of the histograms of wind speeds and short-term speed fluctuations at those sites. In fact, two sites with very good correlations (STH1 and MWT1) had wind speed histograms that matched the observed histograms relatively poorly in the speed range from 9 to 14 m s^{−1}, tending to underpredict the frequency of winds within this band. It is reassuring that the SWD for these two sites were still able to reproduce reasonably well the power-generation curve as calculated from the observed wind data: see Fig. 9 (black solid and dashed lines), which shows power-generation curves for MWT1 for a period in June of 2007.

Bias and explained variance of observed wind speeds by original filtered MM5 and NZLAM hourly winds for each site as well as the explained variance of the final SWD. For CKS1, different sets of observations were used. Biases afterward were close to 1.

The actual power output for MWT1 (gray dots; left-hand axis) for the period 4–21 Jun 2007, and the difference between the actual power output and that estimated using the observed mast wind speeds (solid curve; right-hand axis) or that estimated using the MM5-based SWD (dashed curve; right-hand axis). The actual power output and the power output using the observed mast speed were provided through the courtesy of the generator and the New Zealand Electricity Commission.

Differences between the MM5-based and NZLAM-based SWD are likely due to 1) NZLAM employing a 3DVAR data assimilation scheme whereas MM5 does not and so has a longer spinup period, 2) the MM5 SWD being based on selecting output from a single suitable nearby grid point whereas for NZLAM interpolated values to the general location of the site were used, and 3) for the MM5 SWD, wind speeds from the filtered hourly series being interpolated between the 75- and 100-m levels to 85 m whereas for the NZLAM SWD the power-law extrapolation of 10-m winds to 85 m was used.
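The two height adjustments named in item 3 can be sketched as follows. This is only an illustration under stated assumptions: the interpolation between model levels is taken to be linear in height, and the power-law exponent `alpha` defaults to the commonly quoted value of 1/7, which is a placeholder and not necessarily the value used for the NZLAM SWD. The function names are hypothetical.

```python
def interpolate_between_levels(speed_lower, speed_upper,
                               z_lower=75.0, z_upper=100.0, z_target=85.0):
    """Interpolate wind speed to z_target between two model levels.

    Assumes linear variation with height between the levels
    (an assumption; the source does not state the interpolation form).
    """
    weight = (z_target - z_lower) / (z_upper - z_lower)
    return speed_lower + weight * (speed_upper - speed_lower)


def power_law_extrapolate(speed_ref, z_ref=10.0, z_target=85.0, alpha=1.0 / 7.0):
    """Extrapolate wind speed with a power-law profile.

    u(z_target) = u(z_ref) * (z_target / z_ref) ** alpha
    The default alpha = 1/7 is a placeholder assumption.
    """
    return speed_ref * (z_target / z_ref) ** alpha


if __name__ == "__main__":
    # MM5-style: speeds at 75 m and 100 m interpolated to 85 m.
    print(interpolate_between_levels(8.0, 10.0))
    # NZLAM-style: a 10-m speed extrapolated to 85 m.
    print(power_law_extrapolate(6.0))
```

Because the power-law result depends entirely on a 10-m wind and the chosen exponent, whereas the interpolation draws on modeled winds that bracket the target height, the two approaches can plausibly yield different 85-m speeds from the same meteorological situation.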

Overall, the demands placed on the SWD by the electricity system modelers have been met, and the realistic preservation of the important properties of the available short-term 10-min mast data, albeit disguised, has been demonstrated. The complete synthetic wind dataset can be obtained free of charge from the New Zealand Electricity Commission’s website (online at http://www.electricitycommission.govt.nz/opdev/modelling/synthetic-wind) or by contacting the corresponding author of this paper.

## Acknowledgments

We thank Trevor Carey-Smith, Andrew Tait, and Marina Sterk from NIWA for their assistance in the project. We thank Brian Bull (Electricity Commission) and Devin Kilminster (MetService) for input into statistical methods during the course of the project. We are also very grateful to the following companies (liaisons in parentheses) for providing the observed 10-min mast data used in the study: Meridian (Paul Botha), Contact Energy (David Rohan), TrustPower (Clayton Delmarter), Unison (Daniel Stettner), and MainPower (Andrew Hurley). Author Zheng is partially supported by National Natural Science Foundation of China General Program 40975062. We also thank the New Zealand Foundation for Research, Science and Technology for partial support of this work through Contracts CO1X0701 and CO1X0812.

## APPENDIX

### Survey of a “Haslett and Raftery” Approach

The common element of both the method reported here and that of Haslett and Raftery (1989) is the generation of long-period wind series at sites for which observations are available only for a short period. The basic analysis steps of Haslett and Raftery (1989) were 1) to estimate the signal, 2) to estimate the residual, and 3) to estimate the spatial dependence. These same basic steps were also followed in our analysis. Because of the differences between the datasets being analyzed, however, the details of the three steps are significantly different, and these differences were necessary to improve the analysis. A comparison of the major features of Haslett and Raftery (1989) and the method reported upon in this paper is presented in Table A1.

Data and analysis method for Haslett and Raftery (1989) and the current Electricity Commission SWD project.

To investigate the performance of the Haslett and Raftery (1989) method, we applied it to produce a 10-min time series for the MWT1 site. Two nearby public climate stations (Dannevirke and Palmerston North Aero) with long 10-min records were used. The explained variance obtained for the Haslett and Raftery method at MWT1 was 0.520, whereas for our approach it was 0.596 for the MM5-based SWD and 0.688 for the NZLAM-based SWD. This result shows that the current approach was at least comparable to, and likely better overall than, Haslett and Raftery’s method. This conclusion is reinforced by the fact that none of the other 14 sites had the ideal situation of two nearby stations. Therefore, we chose not to employ the Haslett and Raftery method. Another factor in our choice was the need to fill in synoptically consistent winds for periods of missing data in the long-term public climate-station records.
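A comparison of this kind can be sketched as below, assuming that "explained variance" means the squared Pearson correlation between the observed and synthetic 10-min series over the periods in which both are available. That interpretation, the function name, and the use of `None` to mark missing records are assumptions made for illustration only.

```python
def explained_variance(observed, synthetic):
    """Squared Pearson correlation between two equally spaced series.

    Records where either value is None (missing) are skipped, so only
    overlapping periods contribute. Treating explained variance as r**2
    is an assumption for this sketch.
    """
    pairs = [(o, s) for o, s in zip(observed, synthetic)
             if o is not None and s is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two overlapping records")

    mean_o = sum(o for o, _ in pairs) / n
    mean_s = sum(s for _, s in pairs) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in pairs)
    var_o = sum((o - mean_o) ** 2 for o, _ in pairs)
    var_s = sum((s - mean_s) ** 2 for _, s in pairs)
    if var_o == 0.0 or var_s == 0.0:
        raise ValueError("a constant series has no variance to explain")

    return cov * cov / (var_o * var_s)


if __name__ == "__main__":
    # Made-up speeds (m/s), purely to show the call; not data from the study.
    obs = [5.1, 6.3, None, 7.8, 9.0, 8.2]
    syn = [4.8, 6.9, 7.1, 7.5, 9.4, None]
    print(round(explained_variance(obs, syn), 3))
```

Applying the same function to each candidate series against the same observed record gives directly comparable scores, which is the form of comparison reported above for MWT1.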

## REFERENCES

Bloom, S. C., L. L. Takaks, A. M. Da Silva, and D. Ledvina, 1996: Data assimilation using incremental analysis updates. *Mon. Wea. Rev.*, **124**, 1256–1271.

Box, G., G. M. Jenkins, and G. C. Reinsel, 1994: *Time Series Analysis: Forecasting and Control*. 3rd ed. Prentice Hall, 592 pp.

Dalgaard, P., 2008: *Introductory Statistics with R*. Springer, 363 pp.

Davies, T., M. J. P. Cullen, A. J. Malcolm, M. H. Mawson, A. Staniforth, A. A. White, and N. Wood, 2005: A new dynamical core for the Met Office’s global and regional modelling of the atmosphere. *Quart. J. Roy. Meteor. Soc.*, **131**, 1759–1782.

Edwards, J. M., and A. Slingo, 1996: Studies with a flexible new radiation code. I: Choosing a configuration for a large-scale model. *Quart. J. Roy. Meteor. Soc.*, **122**, 689–719.

Gregory, D., and P. R. Rowntree, 1990: A mass flux convection scheme with representation of cloud ensemble characteristics and stability dependent closure. *Mon. Wea. Rev.*, **118**, 1483–1506.

Grell, G. A., J. Dudhia, and D. R. Stauffer, 1995: A description of the Fifth Generation Penn State/NCAR Mesoscale Model (MM5). NCAR Tech. Note TN-398+IA, NCAR 117 pp. [Available online at http://www.mmm.ucar.edu/mm5/documents/mm5-desc-doc.html.]

Haslett, J., and A. E. Raftery, 1989: Space-time modelling with long-memory dependence: Assessing Ireland’s wind power resource. *Appl. Stat.*, **38**, 1–50.

Hong, S.-Y., and H. L. Pan, 1996: Nonlocal boundary layer vertical diffusion in a medium-range forecast model. *Mon. Wea. Rev.*, **124**, 2322–2339.

Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. *Bull. Amer. Meteor. Soc.*, **77**, 437–471.

Lorenc, A. C., and Coauthors, 2000: The Met. Office global three-dimensional variational data assimilation scheme. *Quart. J. Roy. Meteor. Soc.*, **126**, 2991–3012.

Reid, S., and R. Turner, 2001: Correlation of real and model wind speeds in complex terrain. *Wea. Forecasting*, **16**, 620–627.

Reisner, J., R. M. Rasmussen, and R. T. Bruintjes, 1998: Explicit forecasting of supercooled liquid water in winter storms using the MM5 mesoscale model. *Quart. J. Roy. Meteor. Soc.*, **124**, 1071–1107.

Schwarz, G. E., 1978: Estimating the dimension of a model. *Ann. Stat.*, **6**, 461–464.

Staniforth, A., A. White, N. Wood, J. Thuburn, M. Zerroukat, and E. Croderro, 2004: Joy of U.M. 6.0. Model formulation. Unified Model Documentation Paper 15, version 6.0. Met Office Tech. Note, 534 pp. [Available online at http://research.metoffice.gov.uk/research/nwp/publications/papers/unified_model/umdp15_v6.0.pdf.]

Webster, S., M. Uddstrom, H. Oliver, and S. Vosper, 2008: A high-resolution modeling case study of a severe weather event over New Zealand. *Atmos. Sci. Lett.*, **9**, 119–128.