Wind rose summaries, which provide a basis for understanding and evaluating the climatological behavior of local wind, have a directional bias if a conventional method is used in their generation. Three techniques used to remove this bias are described and are compared for theoretical and observed wind distributions. All three techniques successfully remove the bias, with the simplest of the three performing as well as the more-complex techniques.
Wind roses are histograms that depict the joint relative frequency of wind speed and direction. These diagrams are used in runway orientation, air-quality modeling, and wind-energy planning, to name only a few uses.
A well-known methodological bias exists in the creation of wind roses from directional wind data. In brief, wind direction is commonly reported at discrete 10° increments, and the reports are then analyzed in a 360° range divided into 16 equally spaced, 22.5°-wide bins. Because the 36 reportable directions cannot be divided evenly among 16 bins (36/16 = 2.25), four bins are each represented by three directions and the remaining 12 are each represented by only two. This bias is often easy to identify visually, because undue prominence is seen in the frequencies of the four primary directions (north, south, east, and west). For a uniform wind distribution, this practice introduces a positive bias of 33% at these directions and a negative bias of 11% at the other 12, as illustrated in Fig. 1. A similar problem arises when the 360° range is divided into eight bins, with a bias of ±11% in alternate bins. This bias is distinct from the observational bias (Ratner 1950) in which observers give preference to cardinal directions over those between the cardinal points.
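The bin occupancy behind these percentages can be checked with a few lines of code. The following is a minimal sketch in Python, assuming 16 bins with the first bin centered on north; the helper name `bin_index` is hypothetical and not taken from the cited works.

```python
from collections import Counter

N_BINS = 16

def bin_index(direction_deg, n_bins=N_BINS):
    """Index of the sector containing a direction; bin 0 is centered on north."""
    width = 360.0 / n_bins
    return int(((direction_deg + width / 2.0) % 360.0) // width)

# Reportable directions: 10, 20, ..., 360 degrees.
reported = list(range(10, 361, 10))
per_bin = Counter(bin_index(d) for d in reported)

# Average number of reportable directions per bin: 36 / 16 = 2.25.
mean_per_bin = len(reported) / N_BINS

# Relative bias of each bin for a uniform wind distribution.
bias = {k: per_bin[k] / mean_per_bin - 1.0 for k in range(N_BINS)}

for k in range(N_BINS):
    print(k, per_bin[k], round(100.0 * bias[k], 1))
```

Bins 0, 4, 8, and 12 (north, east, south, and west) each collect three reportable directions, and every other bin collects two.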
Three correction schemes to the popular 16-bin model are described here: a global rescaling, bin boundary adjustments, and proportionate binning applied to individual wind values. All address the bias and yield nearly equivalent results. For simplicity, in the work presented here the winds are not stratified by wind speed; the analysis would be the same if they were.
2. Bias-correction techniques
The simplest correction is to globally rescale the count in each wind direction bin by the average number of reported directions per bin (36/16 = 2.25) relative to the number of directions that the bin represents (Droppo and Napier 2008; Lea and Helvey 1971). Thus, the counts in the four primary directions are each multiplied by 2.25/3 = 0.75 and those in the remaining 12 directions are multiplied by 2.25/2 = 1.125. As a result of this rescaling, the sum of all bins will, in most cases, be different from the original total. To keep the overall number of events constant, all bins are further scaled by the ratio of the old total to the new total.
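The two steps of the global rescaling can be sketched as follows. This is an illustrative Python version under the assumptions stated in the text (16 bins, first bin centered on north); the names `global_rescale` and `DIRS_PER_BIN` are hypothetical.

```python
def global_rescale(counts, directions_per_bin):
    """Rescale each bin by (mean directions per bin) / (directions in that bin),
    then renormalize so that the total number of events is unchanged."""
    mean_dirs = sum(directions_per_bin) / len(directions_per_bin)
    scaled = [c * mean_dirs / d for c, d in zip(counts, directions_per_bin)]
    total_old, total_new = sum(counts), sum(scaled)
    if total_new == 0:
        return scaled
    return [s * total_old / total_new for s in scaled]

# 16-bin pattern starting at north: primary bins hold 3 directions, others 2.
DIRS_PER_BIN = [3 if k % 4 == 0 else 2 for k in range(16)]

# Uniform test case: 100 reports at each of the 36 reportable directions.
biased = [100 * d for d in DIRS_PER_BIN]
corrected = global_rescale(biased, DIRS_PER_BIN)
print(corrected)
```

For the uniform case the renormalization step changes nothing, because the rescaled total already equals the original total; for skewed distributions it generally does not.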
A second correction proposed by Lea and Helvey (1971) involves an adjustment to the count in each bin that is based on where bin boundaries fall in relation to the winds represented by that bin. For example, winds counted in the east bin, defined as being from 78.75° to 101.25°, came from directions 80°, 90°, and 100°. If one assumes that winds from those three directions are evenly distributed between 75° and 105°, the number of winds counted in this bin should be reduced by the excesses of 3.75° occurring at each end, which actually fall in the ranges of neighboring bins. The count in this bin is therefore multiplied by (30° − 2 × 3.75°)/30° = 0.75. The neighboring east-southeast bin, defined as being from 101.25° to 123.75°, originally counts winds from 110° and 120°. This bin receives 3.75°/30° = 0.125 of the values in the east bin and, again assuming an even distribution of winds from 105° to 125°, must itself give up the excess fraction of 1.25°/20° = 0.0625, which is passed on to the southeast bin. Adjustments are made to the remaining 14 directions using similar logic.
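The same logic can be applied to every bin by computing how much of each bin's assumed span overlaps its own and its neighbors' boundaries. The sketch below is one possible Python reading of the description above, not the implementation of Lea and Helvey (1971); the function names are hypothetical, and bins are assumed to be centered on north with 10° reporting increments.

```python
import math

N_BINS = 16
WIDTH = 360.0 / N_BINS   # 22.5 degrees
STEP = 10.0              # reporting increment in degrees

def transfer_weights(k):
    """Fractions of bin k's count assigned to each bin, as {bin index: fraction}."""
    lo_edge = k * WIDTH - WIDTH / 2.0
    hi_edge = k * WIDTH + WIDTH / 2.0
    first = math.ceil(lo_edge / STEP) * STEP   # lowest reported direction in bin k
    last = math.floor(hi_edge / STEP) * STEP   # highest reported direction in bin k
    # Winds in bin k are assumed evenly spread over this span.
    span_lo, span_hi = first - STEP / 2.0, last + STEP / 2.0
    weights = {}
    for j in (k - 1, k, k + 1):
        overlap = (min(span_hi, j * WIDTH + WIDTH / 2.0)
                   - max(span_lo, j * WIDTH - WIDTH / 2.0))
        if overlap > 0:
            weights[j % N_BINS] = overlap / (span_hi - span_lo)
    return weights

def boundary_adjust(counts):
    """Redistribute an existing 16-bin histogram using the overlap fractions."""
    adjusted = [0.0] * N_BINS
    for k, c in enumerate(counts):
        for j, w in transfer_weights(k).items():
            adjusted[j] += c * w
    return adjusted

print(transfer_weights(4))   # east bin
print(transfer_weights(5))   # east-southeast bin
```

For the east bin (index 4) the weights are 0.75 for the bin itself and 0.125 for each neighbor; for the east-southeast bin (index 5) they are 0.9375 for the bin itself and 0.0625 passed to the southeast bin, matching the worked example.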
The previous two methods are applicable to wind direction histograms that have already been constructed from the observations. If individual observations are available, a third technique proposed by Droppo and Napier (2008) may be applied. In this method each wind event is directly binned by a proportionate amount for the bin or bins into which it falls. For example, a wind direction of 100° is assumed to represent a uniform distribution of winds between 95° and 105°. Given the boundary between the east and east-southeast bins at 101.25°, these bins are incremented by 0.625 and 0.375, respectively. A wind direction of 180°, representing 175° to 185°, falls entirely in the south bin and simply increments its count by 1. Because it handles each wind observation individually and assigns weights proportionately, this technique avoids the methodological bias as the histogram is built, under the assumption that each report represents winds spread uniformly over its 10° interval.
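A minimal Python sketch of proportionate binning follows, assuming bins centered on north and each report standing for a uniform ±5° interval as described above. The function name `proportionate_bin` is hypothetical and the code is an illustration, not the implementation of Droppo and Napier (2008).

```python
import math

def proportionate_bin(directions, n_bins=16, step=10.0):
    """Build a direction histogram in which each report is split among the
    bins overlapped by the interval [d - step/2, d + step/2]."""
    width = 360.0 / n_bins
    counts = [0.0] * n_bins
    for d in directions:
        lo, hi = d - step / 2.0, d + step / 2.0
        # Unwrapped indices of the first and last bins the interval can touch.
        first = math.floor((lo + width / 2.0) / width)
        last = math.floor((hi + width / 2.0) / width)
        for j in range(first, last + 1):
            overlap = (min(hi, j * width + width / 2.0)
                       - max(lo, j * width - width / 2.0))
            if overlap > 0:
                counts[j % n_bins] += overlap / step
    return counts

east_example = proportionate_bin([100])
south_example = proportionate_bin([180])
print(east_example[4], east_example[5], south_example[8])
```

A report of 100° adds 0.625 to the east bin (index 4) and 0.375 to the east-southeast bin (index 5), and a report of 180° adds 1 to the south bin (index 8), as in the text.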
As expected, all three methods result in perfect correction of the uniformly distributed wind case. Next we examine the performance of all three techniques on two artificial cases of increasing complexity and on actual observations.
3. Application of bias correction
In both artificial cases a pseudorandom-number generator was used to create 3600 uniformly distributed wind directions between 0° and 360°. These were then recorded as values to the nearest 10°, from 10° to 360° inclusive. In the first case the distribution was skewed as follows: 30% of the values were replaced with normally distributed wind directions with a mean of 225° and a standard deviation of 20°, and 10% of the values were replaced with normally distributed wind directions with a mean of 45° and a standard deviation of 20°. All replacement values were rounded to the nearest 10°. As shown in Fig. 2, all three methods yield nearly identical results and remove the obvious bias of the original. In the second case, normally distributed values with means of 110° and 340° and a standard deviation of 25° replaced 40% and 20%, respectively, of the uniform distribution. Figure 2 again shows strong agreement among the bias corrections.
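Test data of this kind can be generated along the following lines. This Python sketch rests on several assumptions that the text does not specify: the seed, the random choice of which values are replaced, and the wrapping of out-of-range normal deviates onto the circle. The function name `synthetic_directions` is hypothetical.

```python
import random

def synthetic_directions(modes, n=3600, seed=0):
    """Uniform directions with some fractions replaced by normal modes.

    modes is a list of (fraction, mean_deg, sd_deg) tuples.
    Returns directions rounded to the nearest 10 degrees, in 10..360.
    """
    rng = random.Random(seed)             # seed is an arbitrary assumption
    values = [rng.uniform(0.0, 360.0) for _ in range(n)]
    indices = list(range(n))
    rng.shuffle(indices)                  # assumed: replaced values chosen at random
    start = 0
    for fraction, mean, sd in modes:
        count = round(fraction * n)
        for i in indices[start:start + count]:
            values[i] = rng.gauss(mean, sd)
        start += count
    reported = []
    for v in values:
        r = int(round(v / 10.0) * 10) % 360   # wrap onto the circle
        reported.append(360 if r == 0 else r)  # north is reported as 360
    return reported

case1 = synthetic_directions([(0.30, 225.0, 20.0), (0.10, 45.0, 20.0)])
case2 = synthetic_directions([(0.40, 110.0, 25.0), (0.20, 340.0, 25.0)])
print(len(case1), min(case1), max(case1))
```

Because the draws depend on the generator and seed, the resulting histograms will resemble but not reproduce those in Fig. 2.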
In considering the differences among the histograms of the original and bias-corrected distributions, one must first recognize that they are computed over the entire population. Any differences are due to method and not to random-sampling considerations. Inferences, however, can be made regarding the similarity of two histograms from the χ² test between two binned datasets described by Press et al. (1992).
In this test, one accepts or rejects the null hypothesis that two samples of binned data, such as the values plotted in Fig. 2, were drawn from the same underlying population. The χ² statistic is greatest when the relative differences between values in a bin are most pronounced:

$$\chi^2 = \sum_{i} \frac{(R_i - S_i)^2}{R_i + S_i}$$

In this formula, the S_i and R_i terms are the number of occurrences of an event in each bin i of the two distributions, R and S, being compared. In the current example with 16 bins (15 degrees of freedom, because the totals of the two distributions are constrained to be equal), when χ² > 25 one would reject, with 95% confidence, the null hypothesis.
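A minimal Python sketch of the statistic follows, assuming the equal-totals form for two binned datasets from Press et al. (1992), in which each bin contributes (R_i − S_i)²/(R_i + S_i). The function name is hypothetical, and bins that are empty in both distributions are skipped.

```python
def chi_square_binned(r, s):
    """Chi-square statistic between two binned datasets with equal totals."""
    if len(r) != len(s):
        raise ValueError("histograms must have the same number of bins")
    return sum((ri - si) ** 2 / (ri + si)
               for ri, si in zip(r, s) if ri + si > 0)

# Uniform example: biased 16-bin histogram versus its corrected counterpart.
biased = [300.0 if k % 4 == 0 else 200.0 for k in range(16)]
corrected = [225.0] * 16
chi2 = chi_square_binned(biased, corrected)
print(chi2, chi2 > 25.0)
```

For this idealized case the statistic comes to about 60.5, above the threshold of 25 used in the text, so the biased and corrected histograms would be judged different.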
Given the histogram values computed here, one can ask, “Would one conclude that these describe samples taken from the same population?” When χ² > 25, one would reject this hypothesis. Computing χ² among the four distributions then suggests whether the methods used are in agreement with one another. To put the χ² values in perspective, in the left panel of Fig. 2, χ² = 50 between the biased distribution and that with the global rescaling correction. In the same figure, χ² = 4.2 between the global rescaling and the bin boundary adjustment.
Because these histograms are generated from a circular distribution of wind directions, one must also consider whether the arbitrary locations of the origin and bin boundaries have an impact on the value of the test statistic (Jammalamadaka and SenGupta 2001). To address this, the boundary definitions for the 16 sectors were each shifted by 2.5° clockwise, resulting in a different bias pattern requiring recomputation of the global rescaling weights, bin boundary adjustments, and proportionate binning values as described above. This shift was performed eight times to cover all unique combinations of how the directional bins could be analyzed.
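The effect of each shift on the bias pattern can be seen by recounting the reportable directions per bin. The sketch below is in Python and assumes that a clockwise shift moves every boundary toward larger angles; the function name `directions_per_bin` is hypothetical.

```python
def directions_per_bin(shift_deg=0.0, n_bins=16, step=10):
    """Number of reportable directions (10, 20, ..., 360) falling in each bin
    when all bin boundaries are shifted clockwise by shift_deg."""
    width = 360.0 / n_bins
    counts = [0] * n_bins
    for d in range(step, 361, step):
        k = int(((d + width / 2.0 - shift_deg) % 360.0) // width)
        counts[k] += 1
    return counts

# Original configuration plus eight shifts of 2.5 degrees each.
patterns = [directions_per_bin(2.5 * m) for m in range(9)]
for m, p in enumerate(patterns):
    print(2.5 * m, p)
```

Every configuration still has four bins holding three directions and 12 bins holding two, but the overrepresented bins move, which is why the correction weights must be recomputed for each shift. A ninth shift would total 22.5°, one full bin width, and return the original boundaries.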
To assess in a practical manner whether the choice of bias-correction scheme makes a difference, the techniques described above were applied to a selection of 30 years (1981–2010) of wind records from the Integrated Surface Database (Smith et al. 2011). On the first and 16th day of each month, samples were gathered containing wind directions at 3-hourly intervals for 15 days centered on that date. Wind reports with speeds of less than 3 mi h⁻¹ (1.34 m s⁻¹) were excluded. Thus each sample had a maximum of 30 × 15 × 8 = 3600 observations. These samples were taken 2 times per month at 262 locations in the United States and its territories, yielding a total of 262 × 24 = 6288 samples. Values of χ² were computed among the original biased wind distributions and the three bias-correction methods using the original and eight shifted boundary configurations. The distribution of the χ² values is shown in Table 1.
Several key points are seen in these comparisons:
From the first three rows in Table 1, in over 99% of the cases, the bias-corrected versions would be deemed different from the original distribution. An example of one that was not is shown in Fig. 3. Here, for data collected at Kahului, Hawaii, in mid-July, the winds have such a dominant northeasterly mode that any bias correction makes little difference.
In fewer than 0.4% of the cases did the comparison between the global rescaling and proportionate binning techniques yield χ² > 25. This indicates that a similar result is expected from either method. Among the original bin definitions, the worst-case comparison between the two (χ² = 20.9; right tail area = 0.14) is shown in Fig. 3. At Cold Bay, Alaska, coincidentally again for mid-July, we see a relatively large difference between these two methods in the west and south-southeast sectors. Here the global rescaling resulted in a decrease in the west sector that was largely compensated by increases in the south-southeast and southeast bins.
The bin boundary adjustment technique yielded significantly different distributions from the global rescaling and proportionate binning methods 2.6% and 0.5% of the time, respectively. An example in which this method shows a difference appears in Fig. 3 for Asheville, North Carolina, for dates within a week of 1 January. The bin boundary adjustment moves such a large number of values from the north bin to the north-northeast bin that the contribution to the χ² sum from this bin alone is enough to conclude that the methods yield different results.
The χ² test for two binned datasets can only indicate when it is safe to reject the null hypothesis that two samples were drawn from the same population. The high occurrence of low χ² values at the left end of Table 1 indicates, however, that in the vast majority of cases the bias-corrected distributions are very similar among the three methods. In other words, in most cases the differences in the distributions are nowhere near the threshold required to reject the null hypothesis. By extension, the bias corrections yield results that should be similar enough for practical purposes.
This process has been repeated using eight bins with analogous results. Because the bias correction is not as large in the eight-bin case, there were more examples in which the original and bias-corrected distributions could not be distinguished. Nonetheless, the basic pattern of agreement among the three correction methods remained the same.
The methodological bias in creating wind roses should be eliminated using one of the three methods described here. If individual wind reports are available, the proportionate binning method has the appeal of representing each wind report in an unbiased manner. For wind direction histograms that have already been computed, the global rescaling and bin boundary adjustment methods offer nearly identical bias correction, with the first being easier to implement and demonstrating more consistent agreement with the proportionate binning.