1. Introduction
Global atmospheric reanalyses have become a common tool for climate model validations and diagnostic studies such as assessing climate variability and long-term trends. In operational weather analyses, state-of-the-art numerical weather prediction models in combination with modern data assimilation schemes are used to project the state of the atmosphere as described by a finite set of imperfect, irregularly distributed observations onto a regular grid (Glickman 2000). These analyses are useful products for numerical weather forecasts, but their use in climate change research remains limited because changes in the analysis system (the model or the data assimilation scheme) or changes in the observational network used may introduce inhomogeneities, which may cause spurious trends. To reduce inhomogeneities, a number of global reanalysis efforts (e.g., Uppala et al. 2005; Kalnay et al. 1996; Onogi et al. 2007) have been developed, all using frozen state-of-the-art data assimilation systems and numerical models. In this way, inhomogeneities in global reanalyses are greatly reduced, although changes in the (assimilated) observational network data may still have substantial impacts. For example, Bengtsson et al. (2004) showed that a remarkable jump in the annually averaged total kinetic energy occurred in the 40-yr European Centre for Medium-Range Weather Forecasts Re-Analysis (ERA-40) (Uppala et al. 2005) at the time when satellite data were introduced into the reanalysis, which led to a significant upward trend in the total kinetic energy. This trend was largely reduced in a sensitivity experiment that simulated the situation before the advent of satellite data. Kistler et al. (2001) computed annually averaged anomaly correlations between 5-day forecasts of 500-hPa heights, which were initiated from the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis and the reanalysis itself at the time of the forecast.
For the Northern Hemisphere they found that forecast skill increased steadily for the first 10 years or so of the reanalysis. These results indicate that during that time the quality (or the degree of realism) of the reanalysis steadily improved owing to more and better observations and that the first 10 yr should be discarded when assessing long-term changes. Moreover, they showed that the reanalysis is much better over the Northern than over the Southern Hemisphere, where far fewer observations are available, a result that has also been found and confirmed for reanalysis products (e.g., Bromwich et al. 2007).
So far, most reanalyses available cover periods of up to several decades mostly for the second half of the twentieth century. While the datasets in recent decades might be less affected by inhomogeneities, the records are too short to fully assess natural climate variability and long-term changes. Therefore, the Twentieth Century Reanalysis project has been set up to produce a comprehensive global atmosphere dataset covering the period from 1871 onward (Compo et al. 2011). By assimilating only surface pressure observations with sea ice and sea surface temperature anomalies as boundary conditions, it was anticipated that inhomogeneities are largely reduced and, furthermore, that the dataset will become a valuable resource for both climate model validations and diagnostic studies (Compo et al. 2011). Some surprising results, such as noticeable differences of long-term trends in zonally averaged precipitation minus evaporation derived from the Twentieth Century Reanalysis (20CR) and from climate model simulations of the twentieth century, are already noted by Compo et al. (2011). Ferguson and Villarini (2012) recently found inhomogeneities in 20CR air temperature and precipitation that led to their suggestion to restrict climate trend applications over the central United States to the second half-century of the 20CR records.
More recently some papers have been published concentrating on assessing long-term trends in storm activity over Europe using 20CR. Brönnimann et al. (2012) used 20CR to assess trends in storm activity from 1871 onward by using modeled wind speeds at every grid point of 20CR in the Northern Hemisphere. In different case studies they find consistency with observations (e.g., the storm Kyrill). They also find good agreement with long-term storminess at Zurich (observed and modeled) where long observations of wind speed are available.
Donat et al. (2011) used 20CR to provide an analysis of storminess throughout the period 1871–2008. Through the assessment of a gale index derived from air pressure differences and upper percentiles of daily maximum wind speeds, they concluded that 20CR suggests a long upward trend in European storminess since 1871. They mention the possibility that 20CR is likely to suffer from inhomogeneities due to changing station density and quality of early observations. However, they conclude that the observational density over Europe is relatively high throughout the investigated period and suggest that identified trends may (at least partially) be a consequence of increasing greenhouse gas concentrations during the past. Their result is in sharp contrast to a large number of studies focusing on long-term storminess trends for western Europe and the North Atlantic (e.g., Alexandersson et al. 2000; Bärring and von Storch 2004; Matulla et al. 2008; Wang et al. 2009), which found decreasing storminess until the 1960s, an increase until the mid-1990s, and a decline afterward.
In this paper we focus on the extent to which long-term trends in storm activity over Europe and the northeast Atlantic may be derived from 20CR. Instead of relying on wind speed measurements themselves, which frequently suffer from inhomogeneities such as changes in measurement techniques, relocation of stations, or changes in the surroundings of stations (e.g., Wan et al. 2010; Lindenberg et al. 2012), we use a well-established proxy for storm activity based on geostrophic wind speeds derived from surface pressure data. The index was originally proposed by Schmidt and von Storch (1993) and later on extensively used by other authors (e.g., Alexandersson et al. 2000; Bärring and von Storch 2004; Matulla et al. 2008; Wang et al. 2009). Krueger and von Storch (2011) showed that the informational content of such proxies is high enough to describe past storminess. Updates of such indices are provided in the Intergovernmental Panel on Climate Change (IPCC) Fourth Assessment Report to describe long-term changes and variability of storm activity (Fig. 3.41 in Trenberth et al. 2007). Moreover, marine surface pressure measurements are less likely to be affected by inhomogeneities as marine surface pressure represents (compared to near-surface wind speeds) a relatively large-scale variable that is less affected by changes in instrumentation,¹ small relocations of stations, or changes in the surroundings of stations. We also concentrate on an area known to have a relatively high station density throughout the period for which 20CR was performed (Donat et al. 2011) in order to provide a conservative estimate.
The remainder of this paper is structured as follows. In the next section, we concentrate on the comparison of storminess trends in 20CR and observations. We first introduce the data and method needed in our analysis and present the results afterward. In the third section, we assess changes in the number of stations assimilated into 20CR, followed by the last section in which we discuss our results and conclude.
2. Comparison of storminess trends in 20CR and observations
a. Data and methods
The mean sea level pressure (MSLP) observations used in our study are available from Cappelen et al. (2010). The pressure observations are also available from the International Surface Pressure Databank (at http://reanalyses.org/observations/international-surface-pressure-databank), whose data have been assimilated into 20CR (Compo et al. 2011). Presumably, storm activity based on observations and on 20CR should be very similar.
We derive the standardized time series of annual 95th and 99th percentiles of geostrophic wind speeds over 10 triangles of mean sea level pressure from observations and 20CR in the North Atlantic from 1881 onward. The time series are standardized by subtracting their mean values and dividing by their standard deviations as in Alexandersson et al. (1998, 2000). The standardization ensures that each time series is in the same range. We consider only annual percentiles to avoid possible aliasing artifacts in the time series (see Madden and Jones 2001). Afterward, these 10 time series are averaged to obtain a robust estimate of storminess on a large scale. The coordinates of the triangle corners are given by Alexandersson et al. (1998, 2000) and are illustrated in Fig. 1. In 20CR, we use the grid boxes nearest to the station coordinates (see Fig. 1 and Table 1). Note that, in the case of the Danish stations, the two stations lie within one 20CR grid box. Although the MSLP values from 20CR grid points are not identical to station measurements, resulting differences are systematic throughout the gradient calculation. Therefore, the statistics of the geostrophic wind speeds will not be affected greatly by this issue. Further, the MSLP is a relatively large-scale variable. By carrying out our analysis mainly over sea surfaces, we minimize the influence of land surfaces and avoid land-use change (or changes in surface roughness). These aspects would disturb the geostrophic wind approximation and thus its representativeness of surface storminess (Krueger and von Storch 2011). However, the geostrophic wind itself and its statistics are independent of such aspects.
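The triangle-based geostrophic wind calculation can be illustrated with a short sketch. This is not the code used in the study: it simply fits a plane to three station pressures on a local tangent plane and converts the pressure gradient into a geostrophic wind speed. The function name, the constant air density of 1.25 kg m−3, and the tangent-plane approximation are assumptions made for this example.

```python
import numpy as np

EARTH_RADIUS = 6.371e6   # m
OMEGA = 7.292e-5         # Earth's rotation rate, rad/s
RHO_AIR = 1.25           # kg/m^3, assumed constant (illustrative choice)


def geostrophic_speed(lats_deg, lons_deg, pressures_hpa):
    """Geostrophic wind speed (m/s) from pressures at three triangle corners.

    pressures_hpa may have shape (3,) for one time step or (3, n_times).
    """
    lats = np.radians(np.asarray(lats_deg, dtype=float))
    lons = np.radians(np.asarray(lons_deg, dtype=float))
    p = np.asarray(pressures_hpa, dtype=float) * 100.0  # hPa -> Pa

    # Local Cartesian coordinates centred on the triangle (tangent plane).
    lat0 = lats.mean()
    x = EARTH_RADIUS * np.cos(lat0) * (lons - lons.mean())
    y = EARTH_RADIUS * (lats - lat0)

    # Fit the plane p = a*x + b*y + c through the three corners.
    A = np.column_stack([x, y, np.ones(3)])
    a, b, _ = np.linalg.solve(A, p)

    # Geostrophic balance: u = -(1/(rho f)) dp/dy, v = (1/(rho f)) dp/dx.
    f = 2.0 * OMEGA * np.sin(lat0)
    u = -b / (RHO_AIR * f)
    v = a / (RHO_AIR * f)
    return np.hypot(u, v)
```

Applied to every observation time of a triangle, the function yields the geostrophic wind speed series from which the annual percentiles are taken.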
Table 1. WMO number (or for Denmark a national climate number), country, name, and coordinates of the stations used. The numbers in parentheses denote the coordinates of the nearest 20CR grid box.
We repeat the calculations for each of the 56 ensemble members of 20CR and derive an ensemble mean of the storminess time series in 20CR as suggested by Compo et al. (2011). In the following, we will concentrate on the standardized annual 95th percentiles of geostrophic wind speeds only as both standardized time series derived from annual 95th and 99th percentiles agree almost completely with each other.
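The chain of annual percentiles, standardization, triangle averaging, and ensemble averaging described above might be sketched as follows. The function names and the assumed array layout (members, triangles, years) are illustrative and do not come from the study.

```python
import numpy as np


def annual_percentile(speeds, years, q=95.0):
    """Return the unique years and the q-th percentile of speeds in each year."""
    speeds = np.asarray(speeds, dtype=float)
    years = np.asarray(years)
    uniq = np.unique(years)
    pct = np.array([np.nanpercentile(speeds[years == y], q) for y in uniq])
    return uniq, pct


def standardize(series):
    """Subtract the mean and divide by the standard deviation (NaN-aware)."""
    series = np.asarray(series, dtype=float)
    return (series - np.nanmean(series)) / np.nanstd(series)


def storm_index(percentiles):
    """Ensemble mean and spread of the triangle-averaged standardized series.

    percentiles: array of shape (n_members, n_triangles, n_years) holding
    annual percentiles of geostrophic wind speed (assumed layout).
    """
    z = np.apply_along_axis(standardize, -1, percentiles)
    per_member = np.nanmean(z, axis=1)          # average over the triangles
    return per_member.mean(axis=0), per_member.std(axis=0)
```

For the observations, the same functions apply with a single "member", in which case the returned spread is zero.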
We focus our discussion on Gaussian-filtered time series (with σ = 3), which retain the long-term trends in the time series without the year-to-year variability. We provide the Gaussian-filtered ensemble mean of the 56 percentile time series, the associated (Gaussian-filtered ensemble) spread (black line and gray shades in Fig. 2), and the Gaussian-filtered percentile time series derived from observations (blue line in Fig. 2). Along with averaging the time series of the 10 triangles and only regarding annual percentiles, the Gaussian filter will help to overcome potential problems in comparing the time series that may arise from the different temporal resolution of 20CR (6-hourly) and observations (3-hourly, and later 1-hourly). Note that we have taken the missing years in the observations (see Table 2) into account by setting these years to missing values at the respective locations in the 20CR data. Our analysis of storminess therefore starts in the year 1881. Even so, the same observations have very likely been assimilated into 20CR. Considering all these measures taken, our employed analysis provides a robust estimate of storm activity on a large scale.
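A Gaussian low-pass filter of this kind can be sketched in a few lines. The treatment of series ends and of missing years is not specified in the text; the normalized convolution used here, the kernel truncation at four standard deviations, and the function name are assumptions made for the example.

```python
import numpy as np


def gaussian_lowpass(series, sigma=3.0, truncate=4.0):
    """Gaussian-weighted running mean that skips NaNs and renormalizes at edges."""
    series = np.asarray(series, dtype=float)
    radius = int(truncate * sigma + 0.5)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    if series.size < kernel.size:
        raise ValueError("series must be at least as long as the kernel")

    valid = np.isfinite(series)
    filled = np.where(valid, series, 0.0)
    # Weighted sum of valid values divided by the sum of the weights used.
    num = np.convolve(filled, kernel, mode="same")
    den = np.convolve(valid.astype(float), kernel, mode="same")
    return num / den
```

Because the weights are renormalized at every position, a constant series is returned unchanged, and isolated missing years do not leave gaps in the filtered curve.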
Table 2. Triangles and time periods used to construct mean values within the northeast Atlantic.
b. Comparison of storminess and statistical significance
Storminess derived from 20CR through geostrophic wind speeds over the northeast Atlantic resembles the time series shown in Donat et al. (2011); in particular, an upward trend over the whole period is inferred. The time series increases until the 1990s and then decreases. Moreover, some decadal variability is superimposed on the time series, but it appears to be weak.
However, when compared with the 95th percentiles of geostrophic wind from observations after Alexandersson et al. (2000) (blue line in Fig. 2), we obtain completely different results. Except for a decline in the 1880s, a trend over the entire analysis period derived from observations is not visible. Decadal-scale variability dominates the observation-based time series. There is only one similarity: The time series seem to be in phase after 1940 and share a correlation of 0.95. Before 1940 the correlation is 0.11. Both time series share the upward trend after the 1960s and the following decreasing trend starting in the early 1990s. During the same period the differences almost vanish. In contrast to our 20CR-based time series, the upward trend in storminess from observations after 1960 is rather small (relative to the whole time series itself).
Formally, our findings are confirmed by bootstrap hypothesis testing of differences in low-pass filtered mean values, which allows us to also consider uncertainties in the observations. First, we derive an ensemble of similar observation-based time series of mean sea level pressure through sampling measurement errors. In this first step, we assume normally distributed measurement errors in the pressure observations with a mean of 0 hPa and a standard deviation of 1 hPa. Note that this value is rather conservative and high as pressure observations are usually provided with 0.1-hPa accuracy. Such a high value of 1 hPa, nevertheless, ensures that larger uncertainties in measurements in the early years are well accounted for. These random errors are repeatedly added to the observed mean sea level pressure, from which annual 95th percentiles of geostrophic wind speeds are then calculated (as written above). The created ensemble, in our case, consists of 7400 storminess time series, whose ensemble mean is low-pass filtered (see above) and compared to the low-pass filtered ensemble mean of 20CR storminess in the next step. Second, under the null hypothesis of no differences in low-pass filtered mean values, we bootstrapped a null distribution to derive upper and lower critical values of differences. At the 0.01 significance level, these values are about ±0.14. Last, we calculated the differences in low-pass filtered mean values (i.e., between the black and blue line) at each time step of the overlapping time period 1881–2004. There are only two periods when differences are in between the calculated critical values and thus fail to reject the null hypothesis. The first period, 1928–39, is marked by the intersection of the time series (due to the steady upward trend of the 20CR-based curve). The second period, 1986–2004, is the period when the time series almost completely agree with each other.
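The two ingredients of this test, perturbing the pressure observations and bootstrapping critical values under the null hypothesis, can be illustrated as follows. The resampling scheme is not spelled out in the text; the sketch below uses a generic two-sample bootstrap in which both samples are centered and pooled to enforce the null hypothesis, which is an assumption and not necessarily the procedure used in the study. Function names and default values are likewise illustrative.

```python
import numpy as np


def perturbed_pressures(pressure_hpa, n_members, sd_hpa=1.0, seed=0):
    """Add independent N(0, sd_hpa) errors to the observations, once per member."""
    rng = np.random.default_rng(seed)
    p = np.asarray(pressure_hpa, dtype=float)
    noise = rng.normal(0.0, sd_hpa, size=(n_members,) + p.shape)
    return p + noise


def bootstrap_critical_values(sample_a, sample_b, alpha=0.01, n_boot=2000, seed=0):
    """Lower and upper critical values for mean(a) - mean(b) under equal means."""
    rng = np.random.default_rng(seed)
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    # Centre both samples so that the pooled data satisfy the null hypothesis.
    pooled = np.concatenate([a - a.mean(), b - b.mean()])
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        resampled_a = rng.choice(pooled, size=a.size, replace=True)
        resampled_b = rng.choice(pooled, size=b.size, replace=True)
        diffs[i] = resampled_a.mean() - resampled_b.mean()
    lower, upper = np.quantile(diffs, [alpha / 2.0, 1.0 - alpha / 2.0])
    return lower, upper
```

An observed difference that falls outside the interval between the two returned values rejects the null hypothesis at the chosen significance level.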
3. Changes in the station density and storminess
Inhomogeneities caused by changes in the station density and quality of observations represent a likely reason for explaining the described discrepancies. The 20CR ensemble spread (regarded for the surface pressure fields) represents the uncertainty in pressure measurements. It further reflects, to a certain degree, the number (or lack) of assimilated pressure observations over land and sea.
From 1871 onward we calculated the yearly mean of the area average of the ensemble standard deviation of the surface pressure over all grid points in the examined area, which roughly spans from 51.9° to 71°N, 22.7°W to 14.5°E (Fig. 3a). Further, to illustrate the number of assimilated stations, we analyzed the metadata provided by Compo et al. (2011) (available at
The standard deviation steeply decreases until 1880. Afterward, which marks the relevant period in our analyses, the standard deviation slowly decreases until 1938. During the World War II era, we see a steep increase and decrease thereafter (around 1940). After the 1950s the time series decreases further slowly and remains almost unchanged after 1965.
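The spread diagnostic shown in Fig. 3a (ensemble standard deviation of surface pressure, averaged over the area and then over each year) could be computed along the following lines. The array layout, the cosine-of-latitude area weighting, and the function name are assumptions for this sketch; the text does not state whether the area average was weighted.

```python
import numpy as np


def yearly_area_mean_spread(pressure, lats_deg, years):
    """Yearly mean of the area-averaged ensemble standard deviation.

    pressure: array (n_members, n_times, n_lat, n_lon) on a regular grid
    lats_deg: latitudes of the grid rows, shape (n_lat,)
    years:    calendar year of each time step, shape (n_times,)
    """
    spread = np.std(pressure, axis=0)                    # (n_times, n_lat, n_lon)
    weights = np.cos(np.radians(np.asarray(lats_deg, dtype=float)))
    zonal_mean = spread.mean(axis=2)                     # (n_times, n_lat)
    area_mean = (zonal_mean * weights).sum(axis=1) / weights.sum()

    years = np.asarray(years)
    uniq = np.unique(years)
    yearly = np.array([area_mean[years == y].mean() for y in uniq])
    return uniq, yearly
```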
The number of assimilated stations slowly increases until 1927. Afterward, during the World War II era, we see a steep increase followed by a decrease. The time series increases again slowly until the 1960s when it soars to a higher level. In the beginning of the 1970s there is a sudden decline, which is followed again by an increase. It may be possible that there are some gaps in the metadata in this instance. After the mid-1970s, the numbers of assimilated stations are on a high level and increase even further.
Our 20CR-based and observation-based storminess time series agree in their phase characteristics, as written above, from the 1940s onward. From the 1960s onward, even the differences between the time series become smaller and almost vanish. These agreements coincide with the strong reduction of uncertainty in the 20CR ensemble (as in Fig. 3a), also due to the strong increase in the number of assimilated station readings to a high level (Fig. 3b and Compo et al. 2011).
Further, the upward trend in 20CR storminess until the 1950s occurs at the same time when the standard deviation of the 20CR ensemble is steadily decreasing and when the number of assimilated stations is steadily increasing. Over the period 1881–1950, for instance, the low-pass filtered ensemble mean of extreme geostrophic wind speed percentiles and the ensemble standard deviation share a correlation of about −0.60 (about 36% explained variability) due to the opposite trends of the geostrophic wind speed percentiles and the standard deviation.
4. Discussion and conclusions
We have compared long-term time series of storminess over Northern Europe and the northeast Atlantic derived from observations and 20CR. We have assessed the temporal evolution of storminess through a well-established proxy of storm activity. This proxy is based on upper percentiles of geostrophic wind speeds, which we have derived from surface pressure triangles. While both time series share a common behavior roughly during the second part of the twentieth century, they are inconsistent during the earlier years. The storm index derived from observations shows pronounced decadal variability but no clear long-term trend, whereas the storm index derived from 20CR suggests a steady upward trend throughout the twentieth century.
We argue that the long-term behavior of storm activity in 20CR is implausible for several reasons. A number of studies that examined storminess in that area and used different sources of information support our results, which are based on observed pressure data. Von Storch and Reichardt (1997) and Weisse and von Storch (2009) analyzed extreme sea levels derived from tide gauge data in the German Bight in terms of storminess. When changes in the mean sea level were taken into account, this proxy showed pronounced decadal variability with a maximum occurring around 1995, but no clear long-term trend over the last century. Woodworth and Blackman (2002) and Menéndez and Woodworth (2010) examined a quasi-global tide gauge dataset and used similar methods. They were similarly unable to derive significant long-term trends in storm-induced water level variations along European coasts. Further, Bärring and von Storch (2004), who used several proxies based on homogenized air pressure readings from individual stations, described pronounced variability but also found no evident long-term trend.
These results suggest that the long-term trend identified from analyzing 20CR needs to be regarded with care and probably reflects inhomogeneities in the reanalysis itself, most likely as a consequence of a changing station density. A similar argument is made in Compo et al. (2011), who noted that storm tracks estimated from the ensemble mean of 20CR appear to be noticeably weaker for the earlier period 1887–1947 compared to the more recent period 1948–2008. They emphasize that “such a result should not be taken as indicative of an actual climate change. Rather, as the observational density gets lower, less synoptic variability is present in the ensemble mean analyses as fewer observations are available.”
Our results from analyzing a storm proxy based on large-scale atmospheric pressure data point to inconsistencies in the long-term trends and variability of storminess derived from observations and 20CR. The inconsistencies are largest during the first half of 20CR, when fewer stations are assimilated and storm activity is surprisingly low. The inconsistencies are already large over a supposedly well-monitored region. Our findings suggest that similar problems may arise, in particular over more data-sparse regions. While changes in the number of assimilated stations appear to be the most likely reason to explain the discrepancies, a 20CR dataset whose station density remains constant over time is required to fully address this problem (e.g., Thorne and Vose 2010). Unfortunately such a dataset is not available so far.
Acknowledgments
The authors thank Beate Gardeike for helping us prepare the figures. We are grateful to the Deutscher Wetterdienst (DWD), the Danmarks Meteorologiske Institut (DMI), and the Norwegian Meteorologisk Institutt for assisting us in retrieving pressure observations.
Support for the Twentieth Century Reanalysis project dataset is provided by the U.S. Department of Energy Office of Science Innovative and Novel Computational Impact on Theory and Experiment (DOE INCITE) program and Office of Biological and Environmental Research (BER) and by the National Oceanic and Atmospheric Administration Climate Program Office.
REFERENCES
Alexandersson, H., T. Schmith, K. Iden, and H. Tuomenvirta, 1998: Long-term variations of the storm climate over NW Europe. Global Atmos. Ocean Syst., 6, 97–120.
Alexandersson, H., H. Tuomenvirta, T. Schmith, and K. Iden, 2000: Trends of storms in NW Europe derived from an updated pressure data set. Climate Res., 14, 71–73.
Bärring, L., and H. von Storch, 2004: Scandinavian storminess since about 1800. Geophys. Res. Lett., 31, L20202, doi:10.1029/2004GL020441.
Bengtsson, L., S. Hagemann, and K. I. Hodges, 2004: Can climate trends be calculated from reanalysis data? J. Geophys. Res., 109, D11111, doi:10.1029/2004JD004536.
Bromwich, D. H., R. L. Fogt, K. I. Hodges, and J. E. Walsh, 2007: A tropospheric assessment of the ERA-40, NCEP, and JRA-25 global reanalyses in the polar regions. J. Geophys. Res., 112, D10111, doi:10.1029/2006JD007859.
Brönnimann, S., O. Martius, H. von Waldow, C. Welker, J. Luterbacher, G. P. Compo, P. D. Sardeshmukh, and T. Usbeck, 2012: Extreme winds at northern mid-latitudes since 1871. Meteor. Z., 21, 13–27.
Cappelen, J., E. Laursen, and C. Kern-Hansen, 2010: DMI daily climate data collection 1873–2010, Denmark, the Faroe Islands and Greenland. Danish Meteorological Institute Tech. Rep. 11-06, 41 pp. [Available online at http://www.dmi.dk/dmi/tr11-06.pdf.]
Compo, G. P., and Coauthors, 2011: The Twentieth Century Reanalysis project. Quart. J. Roy. Meteor. Soc., 137, 1–28.
Donat, M. G., D. Renggli, S. Wild, L. V. Alexander, G. C. Leckebusch, and U. Ulbrich, 2011: Reanalysis suggests long-term upward trends in European storminess since 1871. Geophys. Res. Lett., 38, L14703, doi:10.1029/2011GL047995.
Ferguson, C. R., and G. Villarini, 2012: Detecting inhomogeneities in the Twentieth Century Reanalysis over the central United States. J. Geophys. Res., 117, D05123, doi:10.1029/2011JD016988.
Glickman, T., Ed., 2000: Glossary of Meteorology. 2nd ed. Amer. Meteor. Soc., 855 pp.
Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. Bull. Amer. Meteor. Soc., 77, 437–471.
Kistler, R., and Coauthors, 2001: The NCEP–NCAR 50-Year Reanalysis: Monthly means CD-ROM and documentation. Bull. Amer. Meteor. Soc., 82, 247–267.
Krueger, O., and H. von Storch, 2011: Evaluation of an air pressure–based proxy for storm activity. J. Climate, 24, 2612–2619.
Lindenberg, J., H. Mengelkamp, and G. Rosenhagen, 2012: Representativity of near surface wind measurements from coastal stations at the German Bight. Meteor. Z., 21, 99–106.
Madden, R. A., and R. H. Jones, 2001: A quantitative estimate of the effect of aliasing in climatological time series. J. Climate, 14, 3987–3993.
Matulla, C., W. Schöner, H. Alexandersson, H. von Storch, and X. Wang, 2008: European storminess: Late nineteenth century to present. Climate Dyn., 31, 125–130.
Menéndez, M., and P. Woodworth, 2010: Changes in extreme high water levels based on a quasi-global tide-gauge data set. J. Geophys. Res., 115, C10011, doi:10.1029/2009JC005997.
Onogi, K., and Coauthors, 2007: The JRA-25 Reanalysis. J. Meteor. Soc. Japan, 85, 369–432.
Schmidt, H., and H. von Storch, 1993: German Bight storms analysed. Nature, 365, 791.
Schmith, T., 1995: Occurrence of severe winds in Denmark during the past 100 years. Proc. Sixth Int. Meeting on Statistical Climatology, Galway, Ireland, All-Ireland Statistics Committee and Cosponsors, 83–86.
Thorne, P. W., and R. S. Vose, 2010: Reanalyses suitable for characterizing long-term trends. Bull. Amer. Meteor. Soc., 91, 353–361.
Trenberth, K. E., and Coauthors, 2007: Observations: Surface and atmospheric climate change. Climate Change 2007: The Physical Science Basis, S. Solomon et al., Eds., Cambridge University Press, 235–336.
Uppala, S. M., and Coauthors, 2005: The ERA-40 Re-Analysis. Quart. J. Roy. Meteor. Soc., 131, 2961–3012.
Von Storch, H., and H. Reichardt, 1997: A scenario of storm surge statistics for the German Bight at the expected time of doubled atmospheric carbon dioxide concentration. J. Climate, 10, 2653–2662.
Wan, H., X. L. Wang, and V. R. Swail, 2010: Homogenization and trend analysis of Canadian near-surface wind speeds. J. Climate, 23, 1209–1225.
Wang, X., F. Zwiers, V. Swail, and Y. Feng, 2009: Trends and variability of storminess in the northeast Atlantic region, 1874–2007. Climate Dyn., 33, 1179–1195.
Weisse, R., and H. von Storch, 2009: Marine Climate and Climate Change: Storms, Wind Waves and Storm Surges. Springer Praxis, 219 pp.
Woodworth, P., and D. Blackman, 2002: Changes in extreme high waters at Liverpool since 1768. Int. J. Climatol., 22, 697–714.
¹ Pressure has been measured for centuries using mercury barometers.