## Abstract

Yearly percentiles of geostrophic wind speeds serve as a widely used proxy for assessing past storm activity. Here, daily geostrophic wind speeds are derived from a geographical triangle of surface air pressure measurements and are used to build yearly frequency distributions. It is commonly believed, although unproven, that the variation of the statistics of strong geostrophic wind speeds describes the variation of the statistics of ground-level wind speeds. This study evaluates this approach by examining the correlation between specific annual (seasonal) percentiles of geostrophic and of area-maximum surface wind speeds to determine whether the two distributions are linearly linked in general.

The analyses rely on bootstrap and binomial hypothesis testing as well as on analysis of variance. Such investigations require long, homogeneous, and physically consistent data. Because such data barely exist, regional climate model–generated wind and surface air pressure fields at a fine spatial and temporal resolution are used. The chosen regional climate model is the spectrally nudged and NCEP-driven regional model (REMO) that covers Europe and the North Atlantic. The required distributions are determined from diagnostic 10-m wind speed and from geostrophic wind speed, which is calculated from model air pressure at sea level.

Obtained results show that the variation of strong geostrophic wind speed statistics describes the variation of ground-level wind speed statistics. Annual and seasonal quantiles of geostrophic wind speed and ground-level wind speed are positively linearly related. The influence of low-pass filtering is also considered and found to decrease the quality of the linear link. Moreover, several factors are examined that affect the description of storminess through geostrophic wind speed statistics. Geostrophic wind from sea triangles reflects storm activity better than geostrophic wind from land triangles. Smaller triangles lead to a better description of storminess than bigger triangles.

## 1. Introduction

Assessing past storm activity is one of the more difficult tasks in climate science. Wind time series are either too short because observations are lacking or are inhomogeneous. Inhomogeneities are caused by observational routines and analyses, the type and accuracy of the instruments used, the surroundings, and station relocations (Trenberth et al. 2007). The wind time series of Hamburg is an example of such inhomogeneities (Weisse and von Storch 2009). It exhibits a decreasing number of days per decade with wind speeds over 7 Beaufort because the weather station was relocated from the harbor to the airport of Hamburg. Inhomogeneities are also caused by improvements in the observational framework: increased monitoring of the atmosphere through satellite-based measurements, buoys, and stations brought an increased detection rate of storm events, which probably leads to false inferences about long-term changes in storminess (Shepherd and Knutson 2007).

Making use of air pressure–based proxies for storm activity, which are based on usually homogeneous pressure readings, is a possible solution to counteract these problems. Several proxies exist, such as the frequency of 24-hourly local pressure changes of 16 hPa or the frequency of pressure readings less than 980 hPa (Bärring and von Storch 2004). Schmidt and von Storch (1993), however, followed a different approach. They investigated geostrophic wind speeds in the German Bight (North Sea). Here, pressure observations from three stations, which form a triangle, were used to calculate geostrophic wind speeds and associated annual frequency distributions. The authors assumed that any variation in atmospheric wind statistics would be reflected in the geostrophic wind statistics. Schmidt and von Storch (1993) found no increase in geostrophic storminess, concluding that storm activity remained almost constant for the examined period of over 100 yr.
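The triangle calculation described above can be sketched as follows. This is a minimal illustration, not the code of the cited studies: it fits a plane through the three station pressures and applies the geostrophic balance to the resulting pressure gradient. The function name `geostrophic_speed`, the constant air density, the use of local Cartesian coordinates in metres, and the evaluation of the Coriolis parameter at a single mean latitude are all assumptions made for this example.

```python
import math
import numpy as np

OMEGA = 7.292e-5  # Earth's angular velocity in rad/s
RHO = 1.25        # assumed constant air density in kg/m^3

def geostrophic_speed(xy_m, p_pa, lat_deg):
    """Geostrophic wind speed (m/s) from three station pressures.

    xy_m:    three (x, y) station positions in metres on a local plane (assumption)
    p_pa:    three sea level pressures in Pa
    lat_deg: mean latitude of the triangle in degrees
    """
    xy = np.asarray(xy_m, dtype=float)
    # Solve p = a*x + b*y + c for the plane through the three stations.
    A = np.column_stack([xy[:, 0], xy[:, 1], np.ones(3)])
    a, b, _ = np.linalg.solve(A, np.asarray(p_pa, dtype=float))
    f = 2.0 * OMEGA * math.sin(math.radians(lat_deg))  # Coriolis parameter
    u_g = -b / (RHO * f)  # zonal component, from dp/dy
    v_g = a / (RHO * f)   # meridional component, from dp/dx
    return math.hypot(u_g, v_g)
```

Applying such a function to every pressure reading of a year yields the sample from which the annual frequency distribution and its upper percentiles are taken.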

In the following years, several studies adopted the method to analyze storminess over different areas in the midlatitudes. A brief technical description of the method can be found in Schmith (1995) and Wang et al. (2009). Alexandersson et al. (1998, 2000) used pressure readings from 21 stations in northwestern Europe and the North Atlantic to form several triangles. They examined the annual 95th and 99th percentiles and found that northwestern Europe storm activity shows interdecadal variability. They also examined the linkage of these percentiles to the North Atlantic Oscillation (NAO) and found that large-scale atmospheric features only moderately explain the long-term behavior of this proxy, mostly because of the assessment of annual distributions. They noted that the correlation with the NAO is higher for winter seasonal percentiles and lower for the percentile time series of the other seasons. Matulla et al. (2008) updated one of the pressure triangles of Alexandersson et al. (2000) and added further stations over central Europe. They concluded that storminess over central Europe features the same characteristics as storminess over northern Europe. Furthermore, Matulla et al. (2008) stated that the NAO index is not useful in explaining central Europe storm activity. Wang et al. (2009) extended the previous studies using the triangle proxy as they explored seasonal and regional differences in the temporal evolution of northeastern Atlantic storminess. They concluded that storminess in the North Sea region differs from storminess in other regions and that summer and winter storm activity differ. They also found a moderate relationship between winter storminess and the NAO.

The studies that use the triangle proxy commonly assume that the variation of the statistics of strong geostrophic wind speeds describes the variation of the statistics of ground-level wind speeds. Although there might be evidence that this assumption is valid (WASA Group 1998; Wang et al. 2009), it is still unproven. The aim of the present study is to close this gap with a systematic evaluation of the triangle pressure proxy. Such an investigation requires long and homogeneous data. Therefore, we use diagnostic 10-m wind and surface air pressure fields from the spectrally nudged and National Centers for Environmental Prediction (NCEP)-driven regional model REMO (Feser et al. 2001; Weisse et al. 2009) for the period 1959–2005. These fields belong to the coastDat dataset (available online at http://www.coastdat.de from the Helmholtz-Zentrum Geesthacht). We use hourly ground-level wind speed and surface air pressure fields over Europe and the North Atlantic with 0.5° × 0.5° resolution (around 50–60 km). Weisse et al. (2005) show that surface wind fields and their statistics are homogeneous and reasonably well simulated over the sea in coastDat. We assume that the wind fields and their statistics are also reasonably well simulated over land.

The following sections address the evaluation of the triangle pressure proxy for annual and seasonal percentiles. Furthermore, the influence of low-pass filtering, size, and surface properties of underlying triangles on the proxy quality is examined and discussed.

## 2. Are annual and seasonal percentiles of geostrophic wind speed and of ground-level wind speed positively linearly related?

The assumption that the variation of the statistics of strong geostrophic wind speeds describes the variation of the statistics of ground-level wind speeds implies that percentile time series of geostrophic and of atmospheric wind speed are positively linearly related. To evaluate this assumption, the correlations between specific quantiles of geostrophic and of atmospheric wind speed time series, namely, the median, the 90th, the 95th, and the 99th percentile time series, are investigated. For this purpose we determine annual and seasonal frequency distributions from hourly geostrophic and near-surface wind speeds over various triangles in the dataset region. The triangles are randomly chosen to vary their size and location. In this approach, the length of triangle sides ranges from about 50 to 1800 km (Fig. 1). Over these triangles, geostrophic wind speeds are expected to represent area-averaged wind conditions. For our evaluation, however, we use statistics of area-maximum (instead of area-average) surface wind speeds as a measure of storm activity, which is characterized by strong surface wind speeds. With this choice we set a higher standard for determining a positive link between the statistics of geostrophic wind speeds and storm activity. Note that the usage of statistics of area-averaged wind speeds would result in higher correlations.
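The construction of percentile time series and their correlation can be illustrated with a short sketch. It assumes that the hourly geostrophic and area-maximum surface wind speeds of one triangle are available as flat arrays together with the calendar year of every sample; the function names and the use of NumPy's default (linear interpolation) percentile definition are assumptions of this example, since the text does not specify how the percentiles are estimated.

```python
import numpy as np

PERCENTILES = (50, 90, 95, 99)  # median, 90th, 95th and 99th percentile

def annual_percentiles(years, speeds, q=PERCENTILES):
    """Return the sorted years and an (n_years, len(q)) array of yearly percentiles."""
    years = np.asarray(years)
    speeds = np.asarray(speeds, dtype=float)
    uniq = np.unique(years)
    table = np.array([np.percentile(speeds[years == y], q) for y in uniq])
    return uniq, table

def percentile_correlations(years, geo, sfc_max, q=PERCENTILES):
    """Pearson correlation between the yearly percentile series, one value per percentile."""
    _, g = annual_percentiles(years, geo, q)
    _, s = annual_percentiles(years, sfc_max, q)
    return {p: float(np.corrcoef(g[:, i], s[:, i])[0, 1]) for i, p in enumerate(q)}
```

Seasonal series follow from the same functions when the `years` array is replaced by a label that identifies each season of each year.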

Here, 1221 triangles have been examined to assess the correlation between annual time series. Figure 2 displays histograms of the ensemble of correlations between the median, 90th, 95th, and 99th percentile time series of geostrophic and of modeled ground-level wind speeds. Table 1 shows the applicable 0.05 quantiles and the median correlation. The 0.05 quantiles of the four ensembles of correlations are greater than 0. The differences between the median correlations of the median, 90th, and 95th percentile time series are small, as the values range from 0.692 to 0.718; only the 99th percentile time series have a smaller median correlation of 0.573. From Fig. 2 and the median values of the ensemble of correlations we infer that the median geostrophic wind speed best reflects the variations of annual ground-level wind speed statistics and that the correlations decrease for upper-quantile time series.

After having derived the percentile time series and respective correlations, the mentioned research question is dealt with in two steps. First, every single correlation is tested locally for a positive linear dependency at the 0.01 significance level via bootstrap hypothesis testing. The proportion *h* of accepted local null hypotheses given in Table 1 increases for upper-percentile wind time series from about 3.85% (median wind speeds) to 16.95% (99th percentile wind speeds).
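A local test of this kind might look like the sketch below. The text does not state the resampling scheme, so the scheme here is an assumption: both series are resampled independently with replacement, which breaks their pairing and simulates the null hypothesis of zero correlation, and the one-sided p value is the share of simulated correlations at least as large as the observed one. The function name and the number of resamples are likewise illustrative.

```python
import numpy as np

def bootstrap_corr_test(x, y, n_boot=5000, alpha=0.01, seed=0):
    """One-sided bootstrap test of H0: zero correlation against a positive correlation.

    Returns the observed correlation, the p value, and True if H0 is rejected.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n = x.size
    r_obs = np.corrcoef(x, y)[0, 1]
    r_null = np.empty(n_boot)
    for b in range(n_boot):
        # Independent resampling of both series destroys any pairing (assumed scheme).
        xb = x[rng.integers(0, n, n)]
        yb = y[rng.integers(0, n, n)]
        r_null[b] = np.corrcoef(xb, yb)[0, 1]
    # Adding one to numerator and denominator keeps the p value above zero.
    p = (np.sum(r_null >= r_obs) + 1) / (n_boot + 1)
    return float(r_obs), float(p), bool(p < alpha)
```

The proportion *h* is then the share of triangles for which the third return value is `False`.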

Second, these proportions are used to determine a general answer to our question. If quantiles of geostrophic wind speed and of area-maximum wind speed were independent (that is, if they did not covary linearly), one would expect on average *r* = 99% of all the sample correlations not to be significant at the 0.01 level, and only 1% to be inconsistent with the null hypothesis of a zero correlation. Under this claim, which serves as a global null hypothesis, the likelihood of the obtained proportions can be deduced from the binomial distribution (e.g., Livezey and Chen 1983), after the following problem has been addressed.

The results of the first step are not directly applicable to the claim as the percentile time series of different triangles probably depend on each other. Thus, the number of spatially independent time series is likely to be small compared to the number of examined triangles. Different methods suggested in Van den Dool (2007, chap. 6) reveal that the number of spatial degrees of freedom is somewhere between 9 and 25 in our case. For the present study, *N* = 20 spatial degrees of freedom are assumed.

Now, the likelihood of the obtained *h* under the null hypothesis of *r* = 99% of all the correlations being 0 in general can be calculated as the cumulative probability of the binomial distribution *P* (*h* · *N*, *N*, *r*). Note that the product *h* · *N* is rounded as the binomial distribution requires *h* · *N* to be an integer number. Our analysis reveals that it would be highly unlikely to achieve the proportions of accepted local null hypotheses if the global null hypothesis were true. The probabilities are in the range of *P* ~ 10^{−15}. Even if *h* were 80%, the probability would be insignificant. These results are in agreement with Fig. 3 in Livezey and Chen (1983). Thus, the global null hypothesis is rejected. The probability that the statement of all the correlations being 0 is valid is extremely low. We conclude that annual percentiles of geostrophic wind speed and of area-maximum wind speed are positively linearly related in general.
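The cumulative binomial probability used in this global test can be written in a few lines. The sketch reads *P* (*h* · *N*, *N*, *r*) as the probability of at most round(*h* · *N*) accepted local null hypotheses among *N* independent tests, each accepted with probability *r*; this reading and the function names are assumptions. The sketch is not meant to reproduce the reported order of magnitude, which depends on the chosen *N* and on how *h* · *N* is rounded.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))

def global_test_probability(h, n_dof=20, r=0.99):
    """Probability of at most round(h * n_dof) accepted local nulls under the global null."""
    k = round(h * n_dof)  # the binomial distribution needs an integer count
    return binom_cdf(k, n_dof, r)
```

For example, `global_test_probability(0.80)` evaluates the case in which 80% of the local null hypotheses would have been accepted.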

For seasonal quantiles of geostrophic wind speed and of ground-level wind speed, the same analysis has been carried out for every season. The results are presented in Table 2. The same conclusions can be drawn as for the annual percentile time series. There is a linear link between seasonal percentiles of geostrophic wind speed and of area-maximum surface wind speed. Furthermore, the median correlations are between 0.525 and 0.781. They are highest for the winter and lowest for the summer season owing to the seasonal variability of the westerlies. The median correlations decrease for upper percentiles within each season. The differences from the annual median correlations are small. The proportions of accepted local null hypotheses are smaller than those of the annual results. Consequently, the positive linear relationship also exists on the seasonal scale.

In the literature, storm activity on the interannual-to-interdecadal scale is commonly assessed through low-pass-filtered time series to remove higher-frequency variability. Low-pass filtering, however, certainly affects the linear link between percentile time series; to what extent will be addressed as follows. Now, the analysis has been repeated with a Gaussian filter, whose weights depend on the standard deviation *σ* (see von Storch and Zwiers 2002, chap. 2, 17). The filter has been applied to the annual geostrophic percentile time series with *σ* = 2 prior to calculating the correlations. Low-pass filtering of geostrophic wind quantiles decreases the quality of the linear link, which can be seen in Fig. 3 and Table 3. While unfiltered percentiles mostly show moderate to strong positive linear relationships, low-pass filtering results in weak to moderate linear relationships. The 0.05 quantile of the correlations for the 99th percentiles is just above 0 with a value of 0.053. The proportions of accepted local hypothesis tests are higher (up to 44.96%) than those of unfiltered time series. However, it can be concluded that low-pass filtering does not destroy the positive linear relationship between any of the percentile wind speed time series, although it decreases the informative value.
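A Gaussian low-pass filter of this kind can be sketched as a normalized weighted moving average. The weights follow exp(−*k*²/(2*σ*²)) for integer lags *k*; the truncation of the weights at three standard deviations and the reflection of the series at its ends are assumptions of this example, since the text does not describe how the filter is truncated or how the series ends are treated.

```python
import numpy as np

def gaussian_lowpass(series, sigma=2.0, half_width=None):
    """Smooth a yearly series with normalized Gaussian weights; output has the input length."""
    x = np.asarray(series, dtype=float)
    if half_width is None:
        half_width = int(np.ceil(3 * sigma))  # assumed truncation of the weights
    lags = np.arange(-half_width, half_width + 1)
    weights = np.exp(-0.5 * (lags / sigma) ** 2)
    weights /= weights.sum()
    # Reflect the series at both ends so that every year gets a full window (assumption).
    padded = np.pad(x, half_width, mode="reflect")
    return np.convolve(padded, weights, mode="valid")
```

The filtered geostrophic percentile series then replaces the unfiltered one when the correlation with the surface wind percentile series is computed.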

We have obtained all the results through simulated winds in the virtual world of the regional model REMO. As the statistics of atmospheric wind speeds are reasonably well simulated over sea (Weisse et al. 2005), we expect that the positive linear relationship between variations of the statistics of geostrophic and of ground-level wind speeds also exists in the real atmosphere; to what extent cannot be estimated owing to a lack of observations.

## 3. How do size and surface conditions influence the description of storm activity?

Wang et al. (2009) noted that the configuration of triangles seems to be important as spatial gradients and differences might be masked out over long distances. To examine whether the configuration of the triangles plays an important role, a two-way analysis of variance (ANOVA; e.g., von Storch and Zwiers 2002, chap. 9) has been carried out. We use ANOVA to evaluate the effects of different levels of size and surface conditions on the annual correlation. Furthermore, a potential interaction between size and surface conditions is assessed, which could emerge for smaller triangles with mixed surface conditions; that is, the two factors may act together on the annual correlation in a different way than they would separately. For that reason, we classify the transformed correlations by different levels of size and surface conditions. Equal group sizes are achieved by collapsing the annual correlations into groups of 116 randomly chosen values.

The response variable is the Fisher *z*–transformed annual correlation between quantile time series of geostrophic and of area-maximum surface wind speed. The Fisher *z* transformation is used to obtain a more normally distributed variable to analyze (e.g., von Storch and Zwiers 2002, chap. 8). Explanatory variables are the average length of triangle sides, here referred to as size, and the surface condition, that is, the land fraction, of underlying triangles. The surface condition is classified as land for a land fraction of greater than 0.5 and as sea for a land fraction of equal to or smaller than 0.5.
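The Fisher *z* transformation is the inverse hyperbolic tangent of the correlation, *z* = artanh(*r*) = ½ ln[(1 + *r*)/(1 − *r*)], and its inverse is the hyperbolic tangent. The short sketch below shows the transformation, the back-transformed mean correlation of a group, and a back-transformed difference between two group means. The helper names are hypothetical, and computing the difference as tanh of the difference of the mean *z* values is one plausible reading rather than a documented step of the study.

```python
import numpy as np

def fisher_z(r):
    """Fisher z transformation of a correlation (or an array of correlations)."""
    return np.arctanh(r)

def inverse_fisher_z(z):
    """Map a Fisher z value back to the correlation scale."""
    return np.tanh(z)

def mean_correlation(r_values):
    """Average correlations in z space and transform the mean back."""
    z = fisher_z(np.asarray(r_values, dtype=float))
    return float(inverse_fisher_z(z.mean()))

def inverse_transformed_difference(r_group_a, r_group_b):
    """Difference of the mean z values of two groups, mapped back to the correlation scale."""
    z_a = fisher_z(np.asarray(r_group_a, dtype=float)).mean()
    z_b = fisher_z(np.asarray(r_group_b, dtype=float)).mean()
    return float(inverse_fisher_z(z_a - z_b))
```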

The size is divided into three groups: smaller than 300 km (small), equal to or greater than 300 km and smaller than 800 km (medium), and equal to or greater than 800 km (large). These classes are chosen for the following two reasons. The characteristic horizontal range of cold fronts stretches from 80 to 300 km (Carlson 1991). Cold fronts that bring a transition from warmer to colder air masses are often accompanied by strong winds. Whether the proxy is capable of detecting such circumstances will be seen by high correlations between the annual quantile time series. On the other hand, 800 km as the lower boundary for larger-sized triangles marks the transition from mesoscale to synoptic-scale atmospheric motions, a characteristic dimension of extratropical cyclones.
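The two classifications translate directly into code. In this sketch the thresholds are those given in the text, whereas the function names and the string labels are illustrative choices.

```python
def size_class(mean_side_km):
    """Classify a triangle by the average length of its sides in km."""
    if mean_side_km < 300:
        return "small"
    if mean_side_km < 800:
        return "medium"
    return "large"

def surface_class(land_fraction):
    """Classify a triangle as land (land fraction > 0.5) or sea (land fraction <= 0.5)."""
    return "land" if land_fraction > 0.5 else "sea"
```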

The ANOVA, conducted at the 0.01 significance level, reveals that the effects of size and surface conditions on mean Fisher *z*–transformed correlations are independent of each other. Furthermore, there is a significant difference between the mean Fisher *z*–transformed correlations because of the surface conditions and size of underlying triangles. For further details on the ANOVA, see the appendix.

Table 4 reveals the inverse-transformed differences in the mean Fisher *z*–transformed annual correlation for different percentile time series and effects. Figure 4 illustrates the findings for the percentile time series. All the differences are significant at the 0.01 significance level in a Fisher *z t* test. We have found that geostrophic wind from sea triangles reflects storm activity better than geostrophic wind from land triangles. Moreover, smaller triangles lead to a better description of storminess than bigger triangles. The differences in the mean correlations due to size are most distinct with values greater than 0.30 for comparing small and large triangles. The differences, on the contrary, become small between small and medium-sized triangles with values from 0.07 to 0.24. The mean correlations between medium and large triangles differ from 0.23 to 0.27. The effects of surface properties result in differences of about 0.17–0.21. In general the differences are more distinct for the median wind time series and become smaller for upper-percentile wind time series (Table 4).

The higher mean correlation of sea triangles is understandable with regard to turbulent impacts over land that affect surface winds in the planetary boundary layer. The geostrophic wind approximation is less accurate in this layer over land where ageostrophic dynamics play an important role. Over sea the frictional influence from the surface diminishes resulting in a better description of wind speeds through geostrophic wind speeds. Note that these effects strongly depend on the parameterization in the REMO model. The near-surface winds in the model are affected by atmospheric stability and frictional effects of vegetation cover and topography (Jacob and Podzun 1997). The influence of these parameters on the wind is restricted by the spatial resolution in the model, such that turbulence is not described on the subgrid scale in itself. Instead such effects are parameterized. We can only speculate whether a more advanced parameterization would make the differences in the mean correlations due to surface conditions more distinct.

While the differences in the mean correlations between land and sea triangles are in the range of 0.17–0.21, the differences due to the size are greater and in the range of 0.07–0.48. For all the percentile time series the correlation is highest for small triangles. In contrast to large triangles that mask out pressure gradients, smaller triangles detect small-scale variations. Sharp pressure gradients associated with smaller low pressure systems can be named as examples. The detection of small-scale variations leads to a better description of wind and storm activity. The correlation appears to be also affected by topographical versatility within the triangles, which can be seen in Fig. 5. Figure 5 shows the spatial distribution of the correlation between the annual 95th percentile time series of geostrophic and area-maximum surface wind speed for each triangle size. Whereas the correlations of small triangles only decrease over smaller topographically versatile areas such as the Alps (with values of about 0.2–0.4), the correlations of medium and large triangles are lower than those of small triangles over land in general. Furthermore, the high correlations of smaller triangles are likely to be an effect of the hourly temporal resolution. Small and fast moving low pressure systems are noticed because of the high sampling frequency. Otherwise, these pressure systems would have rushed through the triangles without being recognized. Note that the high correlations of small triangles could also be caused by the regional model REMO that produced the initial data. Its spatial resolution is around 50 km, which is in the range of the smallest triangle size. It could be argued that ageostrophic components of the wind are homogeneously simulated on this spatial scale because of the parameterization, thus making the small-scale wind agree more with the geostrophic wind. 
A slight indication for this is shown in Table 4, where the differences in the mean correlation between small- and medium-sized triangles are smaller than the differences between other groups of sizes.

## 4. Concluding remarks

This study aims at a systematic evaluation of the triangle pressure proxy that has been and will be used to assess past and recent storm activity in the midlatitudes. Results obtained from examining the correlation between specific percentile time series of geostrophic wind speed and of area-maximum surface wind speed over various triangles show that the variation of strong geostrophic wind speed statistics describes the variation of ground-level wind speed statistics. Even though we used area-maximum (instead of area-averaged) surface wind speeds, we could show that annual and seasonal quantiles of geostrophic wind speed and of ground-level wind speed are positively linearly related. We verified the linear link by using simulated air pressure and ground-level wind speed in a regional model. We expect that the linear relationship exists in the real atmosphere as well as in the simulation. We also considered the influence of low-pass filtering, which decreases the quality of the linear link. Furthermore, we examined several factors that affect the description of storminess through geostrophic wind speed statistics. Geostrophic wind from sea triangles reflects storm activity better than geostrophic wind from land triangles. Smaller triangles lead to a better description of storminess than bigger triangles.

## Acknowledgments

We thank Frauke Feser, Matthias Zahn, Michael Hofstaetter, and Peter Hoffmann for constructive discussions and helpful comments. We also appreciate the thoughtful comments by the two anonymous reviewers.

### APPENDIX

#### Application of a Two-Way ANOVA

The general idea of analysis of variance (ANOVA) is to decompose the variability in the response variable among different factors (e.g., von Storch and Zwiers 2002, chap. 9). If the factors produce a significant amount of variation in the response variable, they will result in different mean values (in the categorized response variable). In our study we make use of a two-way ANOVA that also allows us to assess combined effects of the factors. In that case, the influence on the response variable is not independent among the involved factors. If two factors act independently of each other, the contribution made by any one of them is through the values of its individual levels, regardless of the level of the other factor.

The two-way analysis of variance helps us to determine whether the variation of the response variable, in our case the Fisher *z*–transformed annual correlation, is due to known causes, which are the factors size and surface conditions of underlying triangles, or whether it is due to random, unexplained causes.

The factorial model used in the ANOVA reads

$$Y_{ijk} = \mu + \alpha_{i} + \beta_{j} + (\alpha\beta)_{ij} + \varepsilon_{ijk},$$

where *Y*_{ijk} denotes the *k*th Fisher *z*–transformed annual correlation (with *k* = 1, … , 116) in the (*ij*)th combination of size and surface conditions, where *i* = 1, 2, 3 (respectively small, medium, or large) and *j* = 1, 2 (land or sea). Here, *μ* is the overall mean Fisher *z*–transformed annual correlation, *α*_{i} is the effect of the size, and *β*_{j} is the effect of surface conditions on the transformed correlation. Also, (*αβ*)_{ij} is the effect on the correlation when different levels *i* and *j* of size and surface conditions are combined, which indicates the effect of interaction between size and surface conditions, and *ε*_{ijk} represents the random effect on the (*ijk*)th transformed correlation and is assumed to be a zero-mean, normally distributed variable with constant variance.

The ANOVA is carried out with three null hypotheses: two hypotheses for the direct (main) effects and one for the combined effect on the correlation. The first (second) main-effect null hypothesis *H*_{0} states that there is no difference between the mean Fisher *z*–transformed correlations due to surface conditions (due to size of triangles), and the alternative hypothesis *H*_{1} states that there is such a difference. The interaction null hypothesis *H*_{0} declares that there is no interaction between the size of triangles and the surface conditions; that is, the effects of size and surface conditions are independent. Its alternative hypothesis *H*_{1} denotes the existence of an interaction between the size and the surface conditions. The effects of size, in that case, depend on the surface conditions and vice versa.

The ANOVA rests on several assumptions that need to be checked. Every Fisher *z*–transformed annual correlation in the (*ij*)th combination of size and surface conditions is assumed to be normally distributed with equal variance. The validity of the first assumption has been tested with a Kolmogorov–Smirnov test, that of the second with a *χ*^{2} test. Both tests have been performed at the 0.05 significance level. Further, the ANOVA requires the transformed correlations in the (*ij*)th combination of factors to be independent, which we have taken into account by selecting the sample randomly.

Under *H*_{0} the test statistic VR, the ratio between explained and unexplained variance in the sample, follows a central *F* distribution with two different degrees of freedom *ϑ*. VR, which is estimated from the sample for each of the two factors and their combination, is used to calculate the probability value *P*. Here, *P* is the probability of finding a variance ratio that is at least as extreme as the calculated variance ratio VR; *H*_{0} is thus accepted (rejected) when *P* is greater (smaller) than the used significance level.
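For a balanced design such as the one used here (3 size classes, 2 surface classes, 116 values per cell), the variance ratios can be computed from the usual sums of squares. The following sketch assumes the standard fixed-effects decomposition for a balanced two-way layout; the function name, the array layout, and the dictionary keys are illustrative and do not come from the study.

```python
import numpy as np

def two_way_anova(y):
    """Variance ratios of a balanced two-way ANOVA.

    y has shape (a, b, n): a size levels, b surface levels, n replicates per cell.
    Returns the degrees of freedom and the variance ratio VR of each effect.
    """
    y = np.asarray(y, dtype=float)
    a, b, n = y.shape
    grand = y.mean()
    mean_a = y.mean(axis=(1, 2))   # means per size level
    mean_b = y.mean(axis=(0, 2))   # means per surface level
    mean_ab = y.mean(axis=2)       # cell means
    ss_a = b * n * np.sum((mean_a - grand) ** 2)
    ss_b = a * n * np.sum((mean_b - grand) ** 2)
    ss_ab = n * np.sum((mean_ab - mean_a[:, None] - mean_b[None, :] + grand) ** 2)
    ss_err = np.sum((y - mean_ab[:, :, None]) ** 2)
    df = {"size": a - 1, "surface": b - 1,
          "interaction": (a - 1) * (b - 1), "error": a * b * (n - 1)}
    ms_err = ss_err / df["error"]  # unexplained variance
    vr = {"size": float(ss_a / df["size"] / ms_err),
          "surface": float(ss_b / df["surface"] / ms_err),
          "interaction": float(ss_ab / df["interaction"] / ms_err)}
    return df, vr
```

The probability value of each effect is the upper-tail probability of the *F* distribution with the effect and error degrees of freedom, evaluated at the corresponding variance ratio; with SciPy it could be obtained as `scipy.stats.f.sf(vr, df_effect, df_error)`.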

The degrees of freedom and the test statistics are presented in Table A1 for the Fisher *z*–transformed correlations between the annual 95th percentile wind speed time series. For the other wind speed time series the values of the test statistics differ but the same conclusions can be drawn.

The ANOVA accepts the interaction null hypothesis at the 0.01 significance level. The effects of size and surface conditions on mean Fisher *z*–transformed correlations are independent. Furthermore, the other two null hypotheses are rejected at the 0.01 significance level. Thus, there is a significant difference between the mean Fisher *z*–transformed correlations owing to the surface conditions and size of underlying triangles.