Climatological reference data serve as validation data for regional climate models, as boundary conditions for model runs, and as input for the assimilation systems used by reanalyses. Within the framework of the interdisciplinary research program Climate Water Navigation (KLIWAS): Impacts of Climate Change on Waterways and Navigation of the German Federal Ministry of Transport and Digital Infrastructure, a new climatology of the North Sea and adjacent regions was developed in a joint effort by the Federal Maritime and Hydrographic Agency, the German Weather Service [Deutscher Wetterdienst (DWD)], and the Integrated Climate Data Center (ICDC) of the University of Hamburg. Long-term records of monthly and annual mean 2-m air temperature, dewpoint temperature, and sea level pressure data from 1950 to 2010 were calculated on a horizontal 1° × 1° grid. All products were based on quality-controlled data from DWD’s Marine Data Centre. Correction methods were implemented for each parameter to reduce the sampling error resulting from the sparse coverage of observations in certain regions. Comparisons between sampling error estimates based on ERA-40 and the climatology products show that the sampling error was reduced effectively. The climatologies are available for download on the ICDC’s website and will be updated regularly with new observations and additional parameters. An extension to the Baltic Sea is in progress.
The Climate Water Navigation (KLIWAS): Impacts of Climate Change on Waterways and Navigation (Kofalk et al. 2010) program of the German Federal Ministry of Transport and Digital Infrastructure investigated the potential consequences of climate change for navigation on coastal waterways and for coastal protection infrastructure by means of regional climate models. One goal of this program was to evaluate the quality of hindcast runs of these regional models, and of the global reanalyses providing their boundary conditions, by means of long-term reference data over the North Sea. To achieve this goal, a gridded climatological reference dataset is required against which the model results can be compared.
Unfortunately, climatologies for the North Sea region are rare and until now existed only in the form of a compiled atlas consisting of printed maps for a limited climate period from 1981 to 1990 (Michaelsen 1998). Moreover, the underlying input data required to produce gridded fields are no longer available. Atmospheric reanalyses, such as the NCEP-1 (Kalnay et al. 1996) and NCEP-2 (Kanamitsu et al. 2002), ERA-Interim (Berrisford et al. 2009; Dee et al. 2011), ERA-40 (Uppala et al. 2005), and the Twentieth Century Reanalysis, version 2 (Compo et al. 2011), that could be used to evaluate regional climate models are model outputs as well and cannot be considered observations, even though they assimilate observational data. In addition, they are not produced at high resolution for any specific region. In essence, atmospheric reanalyses cannot serve as a reference dataset for the North Sea region without limitations and further assessment.
The few gridded reference datasets that exist for regional analyses over land are primarily based on in situ observations, such as the European daily high-resolution gridded dataset (E-OBS; Haylock et al. 2008; van den Besselaar et al. 2011). There are no such reference datasets for marginal seas such as the North Sea. Observations from voluntary observing ships (VOS; see, e.g., WMO 1994), buoys, and research, naval, and light vessels are the available in situ measurements in this area. They form a long-term data source and provide information about air temperature, humidity, and pressure that cannot be reliably measured from satellites (Kent and Berry 2005). These observations are archived, among others, in the widely known International Comprehensive Ocean–Atmosphere Data Set (ICOADS; Worley et al. 2005; Woodruff et al. 2011) and DWD’s Marine Data Centre.
Simple gridded ICOADS products of monthly mean data at 1° × 1° and 2° × 2° horizontal resolution for different input data and periods are provided by the Physical Science Division of NOAA’s Earth System Research Laboratory (http://icoads.noaa.gov/data.icoads.html). The following key limitations of this product are described in detail by Deser and NCAR (2015): no corrections are applied to the observations beyond basic quality control, the data coverage is sparse, and creating comprehensible maps of a given climate variable requires further processing. These products simply represent the actual measured values, which may have been observed at different times of the month, and do not take into account observational uncertainties (Gulev et al. 2003). The observations are also not homogeneously distributed in space; discrete shipping routes lead to “contamination” of certain grid cells (Schade et al. 2013).
From the abovementioned findings, we have to conclude that a dataset sufficient in quality and detail so that it can be used to test a regional model of the North Sea region does not exist. The Federal Maritime and Hydrographic Agency, the German Weather Service [Deutscher Wetterdienst (DWD)], and the Integrated Climate Data Center (ICDC) of the University of Hamburg therefore decided to create such a reference dataset, using all available observations of the North Sea and adjacent regions. This paper will present this new climatology, which hereafter will be referred to as the KLIWAS North Sea climatology (KNSC), consisting of the variables 2-m air temperature, sea level pressure, and dewpoint temperature. The goal of the paper is to introduce the KNSC by describing the details of the processing of the climatology and the dataset used. At the same time, it will provide a quality test of KNSC by comparing it against independent observations of coastal land stations and research platforms.
Section 2 will focus on the input data and the preprocessing procedures, followed by a detailed description of the methods that were used to create the climatology in section 3. The results for each variable are presented in section 4, and the summary and a brief outlook follow in section 5. Schade et al. (2017, hereafter Part II) will go further into detail on the assessment of five global reanalyses against the final climatology.
2. Data input
a. DWD’s Marine Data Centre
The input data for the KNSC are marine in situ observations originating from the Marine Data Centre of the DWD, which maintains an extensive climatological archive of national and international marine observations. Apart from recent data, the archive contains a large amount of historic data reaching back to the mid-nineteenth century. The DWD Marine Data Centre combines real-time marine observations and delayed-mode data from VOS (http://sot.jcommops.org/vos/vos.html) into a common archive with regular additions of new data. Real-time data from ships, buoys, and other marine measurement platforms are automatically retrieved from the Global Telecommunications System (GTS), archived in an interim database, and consolidated in near–real time for long-term archival. Additionally, VOS data are collected, thoroughly checked, and redistributed in delayed mode by the Global Collecting Centres (GCC) within the WMO Marine Climatological Summaries Scheme (MCSS). The DWD Marine Data Centre operates one of the two GCCs worldwide (http://www.dwd.de/EN/ourservices/gcc/gcc.html), and the data are also included in the DWD archive. Many VOS send data in real time over the GTS as well as in delayed mode through the GCCs. If data from both sources are available for a VOS, then the GCC data supersede the GTS data stream in the DWD archive because of their higher quality standards. GCC data are also an important input for ICOADS. Thus, the contents of the DWD marine data archive are comparable to ICOADS. However, particularly in the early part of the record from 1950 to 1980, the two archives may contain data from different sources, as international data exchange was not as well developed as it is today. To fill possible gaps in the DWD archive, the in situ data were complemented by observations from the ICOADS dataset (Woodruff et al. 2011).
To ensure the maximum degree of reliability, all observations undergo a flagging procedure based on several quality checking standards. The first step is a minimum quality control (MQC) procedure (WMO 2012) that is applied to all VOS data prior to the international redistribution. In the next step, all data are checked using the DWD high quality control (HQC) procedure. These procedures check not only the individual observation but also sequences of observations from a specific observation platform in order to identify location errors and inconsistencies in the time series.
The individual HQC checks comprise:
Detection of duplicates.
Formal check for numeric/nonnumeric values, allowed values of parameters.
Verification of the geographical position and, if given, the direction and distance of travel along the route. Observations implying distances that exceed individually defined tolerance levels are discarded from further analysis.
Climatological threshold checks to identify unphysical values. These thresholds were defined on the basis of the ERA-Interim dataset (Berrisford et al. 2009; Dee et al. 2011). In some regions small-scale or short time events are not properly represented in the reanalysis (e.g., cold air outbreaks with extremely low air temperatures or local wind speed peaks). The climatological limits were manually adapted for such regions to allow the respective values in the observations.
Identification of temporal outliers and repetitive values in data sequences.
Consistency checks, which also include the identification of unphysical relations between different parameters.
Spatial checks to reject values that exceed a maximum deviation (individually defined for each parameter) from neighboring observations.
All observations are sequentially flagged according to the results of the abovementioned checks, assigning a new indicator for each check (the appendix). However, not all tests are applicable to every observation; for example, there may not be enough observations for a time sequence test. If an observation passes none of the checks, it is rejected.
Striving for cooperation between DWD and ICDC, this dataset was chosen as input for a gridded climatology for the North Sea and adjacent parts of the North Atlantic and the Baltic Sea. In total, approximately 19.4 million quality-controlled sets of meteorological in situ observations in the region 47°–65°N, 15°W–15°E from 1950 to 2010 were considered. The full input dataset for the KNSC with marine surface observations is available for download (ftp://ftp-icdc.cen.uni-hamburg.de/knsc/).
The variables 2-m air temperature (AT), sea level pressure (SLP), and 2-m dewpoint temperature (DPT) were analyzed. Only data that passed at least the climatology check (section 2a) were accepted—more specifically, observations with quality flags C–H (the appendix). Moreover, only observations at standard observation times (0000, 0600, 1200, and 1800 UTC) were used in order to reduce possible sampling biases (e.g., Gulev et al. 2007) caused by the high frequency of observations by automated measurements. Overall, approximately 7 million AT, 7 million SLP, and 4.6 million DPT data remained after the filtering process.
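The filtering step described above can be sketched in a few lines. This is a minimal illustration, not the authors' processing code: the record layout (a dict with `time` and `flag` keys), the single-letter flag encoding, and the requirement that the minute be exactly zero are all assumptions made for the example.

```python
from datetime import datetime

STANDARD_HOURS = {0, 6, 12, 18}   # 0000, 0600, 1200, 1800 UTC
ACCEPTED_FLAGS = set("CDEFGH")    # quality flags C-H, as stated in the text

def keep_observation(obs):
    """Return True if a record passes the flag and observation-time filters.

    `obs` is a hypothetical dict with keys 'time' (datetime in UTC)
    and 'flag' (single-letter quality indicator).
    """
    t = obs["time"]
    # Assumption: an observation counts as "standard time" only on the full hour.
    on_standard_time = t.hour in STANDARD_HOURS and t.minute == 0
    return on_standard_time and obs["flag"] in ACCEPTED_FLAGS

# Illustrative records (made-up values, not taken from the KNSC input data)
records = [
    {"time": datetime(2001, 12, 1, 6, 0), "flag": "D", "at": 4.2},
    {"time": datetime(2001, 12, 1, 7, 0), "flag": "D", "at": 4.4},   # off-hour
    {"time": datetime(2001, 12, 1, 12, 0), "flag": "A", "at": 9.9},  # flag not in C-H
]
filtered = [r for r in records if keep_observation(r)]
```

Only the first record survives both filters in this toy example.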
Figure 1 shows the number of observations per year of the parameters AT, SLP, and DPT available over the study area. The figure illustrates that generally there are more observations for AT and SLP than for DPT, since the first two are easier to observe by the ship’s personnel (Gulev et al. 2007). A poor observational density can be seen for all variables in the 1950s, followed by a general data increase—especially in the 1960s and the 1980s. The latter can be explained by the rising amount of automation of atmospheric observations on merchant ships (Gulev et al. 2007).
To get an overview of the spatial and temporal distribution of the observations, Fig. 2 shows the average number of observations per month and grid cell of all variables for the period from 1950 to 2010. Values equal to or greater than 20 are shown in dark red and can reach an average number of about 300 observations. Generally, a suboptimal coverage (1–10) is found outside the North Sea: even a low threshold of 20 observations per month is not achieved in many grid cells. Since the North Sea area shows better coverage, a default threshold of 20 observations will be used in this assessment.
Figure 3 shows the distribution of observations in 6-day windows (days 1–6, 7–12, 13–18, 19–24, 25–end of the month) per month and cell for AT, SLP, and DPT to examine the temporal coverage within a month. These windows are the result of an optimization process to ensure a uniform distribution of observations throughout the month. For example, a single low pressure system at the beginning of the month would result in underestimating the monthly mean SLP value if the other windows were not covered. Only a few cells contain observations in every 6-day window; single highly frequented shipping routes can be identified for AT and SLP observations, for example, in the English Channel and across the North Sea around the northern coast of Scotland (see, e.g., Schade et al. 2013). Therefore, an estimate of four covered windows per month will be used in this assessment to approximate a uniform distribution.
c. Simple gridded product
As a first attempt, the input data were averaged over 1° × 1° boxes; at least 20 measurements had to be present per box to obtain a valid mean value. As an example, Figs. 4a and 4b display the resulting monthly mean AT and SLP data from the simple gridded product (SGP) for the month of December 2001. A high spatial variability is evident in the whole area and for both parameters. Differences between neighboring cells can be distinct: the grid cell centered at 57.5°N, 3.5°E, for example, shows a mean AT value of about 1.3°C, surrounded by cells with values of up to 8°C (Fig. 4a). Differences between neighboring cells are especially pronounced for mean SLP in the northern parts of the area (Fig. 4b), reaching up to 40–45 hPa. The corresponding gridded ICOADS 1° × 1° product looks very similar (not shown). To show that December 2001 was not merely an outlier, the standard deviation was calculated for each box over all December means (1950–2010). Figures 4c and 4d show these standard deviations as a measure of the temporal variability of AT and SLP. The variability is especially high in coastal regions and in the northern and northwestern regions, and differs in these regions from box to box. At least for the northern region for AT and the entire northwestern region for SLP, the low observational density is responsible for the high variability (see Fig. 2).
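The box averaging behind the simple gridded product can be sketched as follows. This is an illustrative reading of the description above; the function names and the `(lat, lon, value)` tuple layout are assumptions, and the input is taken to contain the observations of a single month.

```python
import math
from collections import defaultdict

MIN_OBS = 20  # minimum number of measurements per box, as stated in the text

def cell_center(lat, lon):
    """Center of the 1-degree box containing the point, e.g. (57.5, 3.5)."""
    return (math.floor(lat) + 0.5, math.floor(lon) + 0.5)

def simple_gridded_mean(observations, min_obs=MIN_OBS):
    """Average one month of (lat, lon, value) tuples per 1-degree box.

    Boxes with fewer than `min_obs` measurements get no value.
    """
    boxes = defaultdict(list)
    for lat, lon, value in observations:
        boxes[cell_center(lat, lon)].append(value)
    return {c: sum(v) / len(v) for c, v in boxes.items() if len(v) >= min_obs}
```

Because every box is averaged independently, a box sampled only during a cold spell can sit next to one sampled only during a mild spell, which is one way the box-to-box contrasts described above can arise.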
d. Sampling error estimates
As shown in the previous section, the observations show irregular temporal (and spatial) sampling, which has an impact on the final data product. To quantify this effect, a subsampling approach was chosen as proposed by Gulev et al. (2007): the individual observations were matched with the 6-hourly ERA-40 output, and only those ERA-40 values for which observations existed were taken into account. From this subsampled ERA-40 dataset, monthly mean values were calculated per cell. The respective sampling error estimate was computed as the difference between the monthly means calculated from the complete ERA-40 product and the subsampled monthly means. The period 1979–2001 was chosen for two reasons: most of the boxes of the North Sea show values in every month, and it is the common period for all reanalyses that are investigated in detail in Part II. To reduce the resulting sampling error, a combination of correction methods was applied to create climatologies for each variable. These methods are described in detail in the following section.
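A minimal sketch of the subsampling estimate for one cell and month is given below. The data layout is hypothetical, and one detail is an assumption not settled by the text: each observation contributes one matched reanalysis value, so a time step with several observations is counted several times.

```python
def sampling_error(full_series, observed_steps):
    """Sampling error estimate for one grid cell and month.

    full_series:    dict mapping each 6-hourly time step of the month to the
                    reanalysis value in this cell (complete product).
    observed_steps: time steps at which in situ observations exist; repeated
                    entries are kept (assumed weighting, see lead-in).
    Returns complete monthly mean minus subsampled monthly mean,
    or None if the cell has no observations in this month.
    """
    matched = [full_series[s] for s in observed_steps if s in full_series]
    if not matched:
        return None
    full_mean = sum(full_series.values()) / len(full_series)
    sub_mean = sum(matched) / len(matched)
    return full_mean - sub_mean
```

For example, if all observations fall into a low-pressure episode at the start of the month, the subsampled mean is too low and the estimate is positive.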
To calculate fields of monthly mean values, all observations were binned in 1° × 1° grid cells. All further processing methods are described in the following paragraphs.
a. Correction of diurnal variation
If the investigated variable showed a diurnal cycle, the diurnal variations were removed from the data to allow daily averaging over unevenly distributed observations. First, the 6-hourly mean values ht (with t = 0000, 0600, 1200, 1800) were calculated by averaging all n observations at time t of a month m and grid cell c, including the eight neighboring grid cells (cΔ), for each year, and then by averaging these results over the entire period, 1950–2010 [Eq. (1)]. In the denominator of Eq. (1), only the years showing observations were counted. A correction term “ct” was computed for each t, m, and c as the difference between the long-term daily mean and ht [Eq. (2)]. Figure 5a shows the four correction terms for AT for an example cell (Fig. 5b) for the month of December. All observations oi of a certain c, m, and t were corrected by their corresponding ct [Eq. (3)].
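The display equations referenced in this paragraph are not reproduced in this version of the text. One formulation consistent with the verbal description is sketched below; it should be read as an illustration, not as the original Eqs. (1)–(3). The symbol Y for the set of years with observations, the hat on the corrected observation, and the definition of the long-term daily mean as the average of the four 6-hourly means are assumptions. The sign in the last line is chosen so that adding the correction term moves an observation toward the daily mean level.

```latex
% Illustrative reconstruction under the assumptions stated above.
% Y_{t,m,c}: years with at least one observation at time t, month m, cell c (incl. c_\Delta)
% n_y: number of such observations in year y
\begin{align}
  h_t(m,c) &= \frac{1}{\lvert Y_{t,m,c} \rvert}
              \sum_{y \in Y_{t,m,c}} \frac{1}{n_y}
              \sum_{i=1}^{n_y} o_i(t,m,y,c \cup c_\Delta), \\
  c_t(m,c) &= \frac{1}{4} \sum_{t'} h_{t'}(m,c) \;-\; h_t(m,c),
              \qquad t' \in \{0000, 0600, 1200, 1800\}, \\
  \hat{o}_i &= o_i + c_t(m,c).
\end{align}
```

With this convention the four correction terms of a cell and month sum to zero, so the correction redistributes the diurnal cycle without changing the long-term daily mean.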
b. Correction of monthly variation
If the parameter showed an annual cycle, the intramonthly variations were removed from the data. The annual cycle, however, remained in the 12 monthly mean values. This processing allows the monthly averaging of temporally unevenly distributed observations. First, daily means dmean were calculated for the considered c and its eight neighboring cells (cΔ) over the period 1950–2010 [Eq. (4)]. As in Eq. (1), only the years showing observations on this day were counted. If the parameter shows a diurnal cycle, the corrected observations (section 3a) were used as oi; otherwise, the original observations were used. Figure 6a shows the mean annual cycle that results from the calculated daily means. This cycle was divided into 12 months and approximated per month by a second-degree polynomial. The five adjacent days of the previous and the following month were included to approximate the transition between the months. In each cell, a ct per day (Fig. 6b) was calculated as the difference between the polynomial fit and the long-term monthly mean [Eq. (5)]. The parameter “dom” in the long-term mean part contains the number of days per month. All observations per cell were corrected by the corresponding correction term [Eq. (6)].
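The per-month polynomial step can be sketched as below. This is a pure-Python illustration under several assumptions that the text does not settle: the long-term monthly mean is taken as the average of the fitted curve over the days of the month, the correction term is defined as monthly mean minus fit (so that it is added to an observation), and the function names are hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit y = a + b*x + c*x**2 via the 3x3 normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]              # sums of x^k
    r = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [r[i]] for i in range(3)]
    for col in range(3):                                         # elimination
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, 3):
            f = m[row][col] / m[col][col]
            for k in range(col, 4):
                m[row][k] -= f * m[col][k]
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):                                          # back-substitution
        acc = m[i][3] - sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = acc / m[i][i]
    return coef                                                  # [a, b, c]

def daily_correction_terms(daily_means, days_in_month, pad=5):
    """Correction term per day of month from long-term daily means.

    daily_means covers `pad` days of the previous month, the month itself,
    and `pad` days of the following month, in chronological order.
    """
    xs = list(range(1 - pad, days_in_month + pad + 1))
    if len(daily_means) != len(xs):
        raise ValueError("daily_means must have days_in_month + 2*pad entries")
    a, b, c = fit_quadratic(xs, daily_means)
    fit = [a + b * d + c * d * d for d in range(1, days_in_month + 1)]
    monthly_mean = sum(fit) / days_in_month   # assumed definition, see lead-in
    return [monthly_mean - f for f in fit]    # add this to an observation of day d
```

In a month with steadily rising temperatures, observations early in the month receive a positive correction and those late in the month a negative one, so a cell sampled only in the first week no longer biases the monthly mean low.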
c. Including neighboring grid cells
Neighboring grid cells were included in the calculation of the monthly mean if the number of measurements per cell dropped below a certain threshold. A default threshold of 20 observations per month was chosen because it allows reasonably reliable statistics while still promising reasonable coverage (Fig. 2). In the first step, the eight neighboring cells were included; in the second step, 16 more cells were considered if the number of observations still remained below the threshold. Hence, the actual cell mean value was calculated from up to 25 cells. Only in the case of no observations was the grid cell not considered at all for the KNSC product.
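The two-step neighborhood expansion can be sketched as follows. The integer `(row, col)` cell index and the function name are hypothetical, and "no observations" is interpreted here as no observations anywhere in the 25-cell neighborhood, which is an assumption.

```python
def cell_mean_with_neighbors(obs_by_cell, cell, threshold=20):
    """Monthly mean for `cell`, widening the neighborhood when data are sparse.

    obs_by_cell: dict mapping (row, col) to the list of (corrected)
                 observations of one month.
    Returns (mean, number_of_cells_used), or (None, 0) if no data were found.
    """
    i, j = cell
    for radius in (0, 1, 2):              # 1 cell, then 9, then 25 cells
        values = []
        for di in range(-radius, radius + 1):
            for dj in range(-radius, radius + 1):
                values.extend(obs_by_cell.get((i + di, j + dj), []))
        if len(values) >= threshold:
            break
    if not values:
        return None, 0
    side = 2 * radius + 1
    # Plain arithmetic mean, with no spatial weighting or interpolation.
    return sum(values) / len(values), side * side
```

The second return value corresponds to the number of neighboring cells that the text says is stored alongside each monthly value.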
These observations were simply averaged arithmetically, since the goal of this project was to produce climatological monthly means and these variables show a low spatial variability on a monthly scale; small-scale variations on a daily scale were not supposed to be represented in this climatological product. In contrast to the temporal correction procedures (see sections 3a, 3b, and 3d), no spatial correction, like spatial interpolation methods, was applied to the observations to calculate the gridcell value. Even after temporal correction, extreme values measured on different days of the month can occur in the same location, so spatial interpolation could lead to artifacts. Additional artificial gradients could be introduced by spatial interpolation, since the measurements are in any case unevenly distributed in a grid cell, due to ship routes and fixed buoys.
The calculated mean values were assigned to the centers of the respective cells: 47.5°–64.5°N and 14.5°W–14.5°E. In addition to the sampling error (calculated as described in section 2d), the number of observations, the standard deviations, and the number of neighboring cells were stored for each cell and month.
There was no special treatment for regional boundary grid cells where the observation density is low (see Fig. 2) and on average more neighboring cells were included. To estimate the possible bias introduced by an asymmetric field of neighboring cells at the region border, the error was calculated for each water boundary cell per month from 1950 to 2010. In place of each boundary grid cell, a proxy cell with a complete neighborhood was identified according to the number of cells used for the averaging process—that is, 9 or 25—by shifting its location one or two steps toward the center of the region. The difference between the mean values of the 9 (25) cells and the asymmetric 6 (15) cells was determined for these proxy cells. Finally, the mean error (ME) and the mean absolute error (MAE) and their standard deviations were calculated over all differences of all proxy cells for each variable as a temporal and spatial average and are shown in the results.
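The summary statistics of the boundary test can be sketched as below. The input is the pooled list of differences (full-neighborhood mean minus asymmetric-neighborhood mean) over all proxy cells and months; the direction of the difference and the use of the population standard deviation are assumptions.

```python
from statistics import mean, pstdev

def boundary_error_summary(differences):
    """ME, MAE, and their standard deviations over all proxy-cell differences."""
    abs_diffs = [abs(d) for d in differences]
    return {
        "ME": mean(differences),
        "ME_std": pstdev(differences),   # assumed: population standard deviation
        "MAE": mean(abs_diffs),
        "MAE_std": pstdev(abs_diffs),
    }
```

An ME close to zero with a larger MAE, as reported in the results, indicates that the asymmetric neighborhoods add scatter but little systematic bias.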
d. 6-day windows
If a distinct mean annual cycle over the overall period from 1950 to 2010 does not exist for a variable, so that the procedure discussed in section 3b would not be reasonable, the month was divided into five 6-day windows: w1–w5 (1–6, 7–12, 13–18, 19–24, 25–end of the month, respectively). As already mentioned in section 2b, this procedure is intended to avoid underweighting periods with few observations. First, the average within each 6-day window was calculated using oi. As in section 3c, neighboring cells were included in two steps if the number of observations in that window was below the threshold. These mean values were averaged arithmetically to obtain a monthly mean value [Eq. (7)]. If there were fewer than four windows with observations, then the calculated mean value mmean was discarded.
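A minimal sketch of the window averaging for one cell and month follows. The neighbor-inclusion step is omitted for brevity, the function names are hypothetical, and averaging over the covered windows only (rather than always dividing by five) is an assumption about Eq. (7).

```python
def window_index(day):
    """Map day of month (1-31) to window 0..4; window 4 is day 25 to month end."""
    return min((day - 1) // 6, 4)

def monthly_mean_from_windows(obs, min_windows=4):
    """Monthly mean as the average of 6-day window means.

    obs: iterable of (day_of_month, value) for one cell and month.
    Returns None if fewer than `min_windows` windows contain observations.
    """
    windows = [[] for _ in range(5)]
    for day, value in obs:
        windows[window_index(day)].append(value)
    means = [sum(w) / len(w) for w in windows if w]
    if len(means) < min_windows:
        return None
    return sum(means) / len(means)
```

With ten reports of 1000 hPa on day 1 and single reports of 1010 hPa in each of the other four windows, this gives 1008 hPa, whereas a plain mean over all 14 reports would give about 1002.9 hPa.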
e. Annual and 30-yr mean fields
Based on the monthly mean products, additional 30-yr climatologies were calculated for the standard periods 1951–80, 1961–90, 1971–2000, and 1981–2010 for each month, as well as annual means for the entire period. As a prerequisite for calculating the 30-yr mean of each cell, at least 25 of the 30 monthly mean values had to be present for each period. To get a valid annual mean per grid cell, all 12 months had to be available.
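The completeness rules for the long-term and annual means can be sketched as follows. The data layout is hypothetical, and the annual mean is taken as the unweighted average of the 12 monthly means, which is an assumption (the text does not say whether months are weighted by their length).

```python
def thirty_year_mean(monthly_by_year, start, end, min_years=25):
    """Climatological mean of one calendar month for one cell.

    monthly_by_year: dict mapping year to the monthly mean of that calendar
                     month (missing years absent or None).
    Requires at least `min_years` valid values within start..end (inclusive).
    """
    values = [monthly_by_year[y] for y in range(start, end + 1)
              if monthly_by_year.get(y) is not None]
    if len(values) < min_years:
        return None
    return sum(values) / len(values)

def annual_mean(monthly_values):
    """Annual mean for one cell and year; all 12 months must be available."""
    if len(monthly_values) != 12 or any(v is None for v in monthly_values):
        return None
    return sum(monthly_values) / 12
```

A call such as `thirty_year_mean(december_means, 1981, 2010)` would return `None` for a cell with only 24 valid Decembers in that period.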
In the first step, the 30-yr climatologies were qualitatively compared with the printed maps of the climatology by Michaelsen (1998) to exclude major processing errors. In the second step, the product was checked against the simple gridded product (section 2c). It was investigated whether the differences between these products correspond to the sampling error (section 2d) and whether the correction methods were actually reducing this error. Therefore, the RMS sampling error from 1979 to 2001 was compared to the RMS differences between the simple gridded monthly means and the KNSC product over the same period.
Further, a comparison between the KNSC parameters AT and SLP and independent point measurements in the same area was performed to illustrate the quality of KNSC. For this purpose, air pressure and air temperature measurements from the North Sea research platforms Forschungsplattformen in Nord- und Ostsee Nr. 1 (FINO1; 54.0149°N, 6.5876°E) and FINO3 (55.195°N, 7.1583°E) were used (see www.fino-offshore.de, data: http://fino.bsh.de/). After the FINO measurements were quality checked, the air pressure observations of FINO1 and FINO3 were reduced to sea level using the barometric formula. For the air temperature, observations from FINO3 are available only for a limited time within the KNSC period; therefore, the 40-m observations from FINO1 were considered. These have a longer time overlap with KNSC than the 30-m observations and are the next closest to the surface. They were corrected to 2 m by assuming an adiabatic temperature gradient of −1 K per 100-m height. Monthly means of these time series were calculated and compared with the respective KNSC boxes corresponding to the positions of the FINO platforms. Apart from the FINO data, the KNSC product was also checked against independent coastal land station time series data from DWD, although KNSC includes only observations over water. One has to keep in mind that these are point measurements compared to gridcell averages, so differences have to be expected. The results of all the evaluation checks are shown in the next section. Further in-depth investigations concerning reanalysis data are addressed in Part II.
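The two height corrections can be illustrated as follows. The temperature adjustment follows the stated gradient of −1 K per 100 m. The text does not specify which form of the barometric formula was used, so the isothermal form below, the constants, and the sensor height argument are assumptions for illustration only.

```python
import math

G = 9.80665        # gravitational acceleration, m s^-2
R_DRY = 287.05     # specific gas constant of dry air, J kg^-1 K^-1

def temperature_at_2m(temp_c, sensor_height_m, gradient_k_per_m=-0.01):
    """Adjust air temperature from sensor height to 2 m.

    With a gradient of -1 K per 100 m, air at 2 m is warmer than at 40 m.
    """
    return temp_c - gradient_k_per_m * (sensor_height_m - 2.0)

def pressure_at_sea_level(pressure_hpa, sensor_height_m, temp_c):
    """Reduce pressure to sea level with an isothermal barometric formula.

    Assumed form: p0 = p * exp(g * h / (R_d * T)), with T in kelvin.
    """
    temp_k = temp_c + 273.15
    return pressure_hpa * math.exp(G * sensor_height_m / (R_DRY * temp_k))
```

For a reading of 10.0°C at 40 m, the temperature adjustment yields about 10.38°C at 2 m; the correction is small compared with the differences discussed in the results.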
Processing and evaluation results are presented for all variables separately. Different periods were investigated: the overall period (1950–2010); the years covered by all reanalyses (1979–2001) to match the investigations in Part II; and the last climate period (1981–2010), since the observational density is best during this period (see Fig. 1).
a. 2-m air temperature
The 2-m AT observations were corrected by the diurnal and monthly variation methods. Where fewer than 20 observations were available per cell and month, the neighboring cells were included in two steps.
The 30-yr mean fields were analyzed, and the fields for June and December for the climate period from 1981 to 2010 are shown as examples in Figs. 7a and 7b, respectively: In June, a northwest–southeast gradient from about 9°C north of Scotland to 14°C in coastal waters of mainland Europe can be observed. In December, a gradient from northeast to southwest with values from about 5° to 11°C is evident. Both gradients result from the warming/cooling effect of the adjacent landmasses in contrast to the sea during the respective months and the inflow of warm Atlantic waters through the English Channel. Regions north of the North Sea (above 62°N) and northwest of Ireland (west of 7°W) are not well covered by observations, probably due to the lack of shipping routes in these areas (Fig. 3).
Comparisons between KNSC and the simple gridded product (section 2c) for each month show that the often rather inhomogeneous input was effectively smoothed, showing realistic patterns. Analogous to Fig. 4, Fig. 8a shows the AT for December 2001 and Fig. 8c shows the standard deviations calculated for each box over all December means (1950–2010). The pronounced differences between neighboring boxes and the temporal variability were reduced. Additionally, the standard deviation of all boxes was calculated per month from 1950 to 2010. The resulting monthly time series in Fig. 9a shows that the spatial variability (red line) was reduced compared to the simple gridded product (black line).
The sampling error estimates (section 2d) were investigated to show that the reduction of variability is justified. The RMS sampling error from 1979 to 2001 shown in Fig. 10a varies from 0 to 1.05 K. The RMS differences between the simple gridded monthly means (section 2c) and KNSC over the same period differ on average by up to 1.05 K (Fig. 10b). They are also comparable, both in magnitude and in location, to the differences between the ICOADS 1° × 1° product and KNSC over the same period (not shown), which is not surprising, since ICOADS and the DWD marine database for the most part contain identical observations (section 2a). The absolute values of the differences between both fields show that the correction methods effectively reduced the sampling error in the entire area (Fig. 10c).
Because of the low observational density and asymmetric fields of neighboring cells at the region boundary, an error estimate especially for these boundary cells was calculated (section 3c) for the overall period, 1950–2010: it results in an ME of −0.06 ± 0.31 K and an MAE of 0.15 ± 0.27 K.
Compared to FINO1 data, KNSC AT values show a qualitatively good agreement over time with a correlation coefficient of 0.99 (Fig. 11a): The mean standard deviation is about 1.9°C for KNSC and FINO1 (no suitable AT data were available for FINO3). Figure 11b shows a mean difference of about −0.9 K, and, especially in the first years, the KNSC temperatures tend to be lower. The maximum difference of −2.8 K can be partly explained by the high number of missing values in FINO measurements, so the range is about ±1.5 K around the mean difference, which is in the range of the standard deviation of the data. As already mentioned in section 3f, point data measured at a platform at 40 m are compared to gridbox means of measurements on ships and buoys, so there are many potential causes that could contribute to these deviations, and these will be further investigated.
A comparison between KNSC and coastal land station data additionally provided by DWD from 1979 to 2001 shows that the station ATs are colder by up to 2 K nearly all around mainland Europe, in some cells around the British Isles, and off the Norwegian coast (Fig. 12a). These large differences exceed the random measurement errors found by Kent and Berry (2005). They mainly result from colder temperatures over land than over sea in the winter half-year, when the differences are especially pronounced (not shown). This land effect results from the fact that only observations over sea were considered in KNSC. It has been observed by, for example, Schade et al. (2013) for a selected grid cell off the Danish coast and by Mooney et al. (2011) for stations around Ireland, and is addressed in more detail in Part II.
b. Sea level pressure
Since the SLP observations do not show a pronounced mean annual cycle for the entire period (1950–2010) in many cells, especially in the northern region, and no diurnal cycle, they were not corrected by the monthly or diurnal variation method. For this kind of parameter, the procedure described in section 3d was introduced; that is, the mean fields were calculated according to the 6-day window method. In contrast to the other variables, the 24 neighboring cells were included in the averaging process at almost every time step to represent the large-scale character of SLP. In practice, the eight neighboring boxes were always included, and the threshold for using the next 16 neighboring boxes was set to 500 observations per cell and 6-day window. In addition, there had to be at least one observation in four of the 6-day windows for each cell and month for the mean value to be considered.
As for AT, the 30-yr SLP mean fields were investigated first. The fields for June and December from 1981 to 2010 are shown in Figs. 7c and 7d, respectively: In June a weak gradient from the northeast to the southwest with values from about 1013 to 1018 hPa can be seen. The December months are characterized by an enhanced northwest-to-southeast gradient from about 1006 to 1015 hPa, which can be related to a deeper Icelandic low and a stronger high in the region of the Azores in winter.
In comparison to the simple gridded product (Fig. 4b), KNSC SLP shows more realistic patterns, although a different correction method than for AT was used. Corresponding to Fig. 4, Fig. 8b shows the mean SLP for December 2001 and Fig. 8d shows the standard deviations calculated for each box over all December means. Again, differences between neighboring boxes and the temporal variability were greatly reduced. As for AT, the standard deviations of all boxes were calculated per month from 1950 to 2010. The monthly time series in Fig. 9b shows that the spatial variability (red line) was reduced compared to the simple gridded product (black line), and that the reduction was even more pronounced than for AT (Fig. 9a).
The RMS sampling error from 1979 to 2001 in Fig. 10d shows values up to 6.5 hPa. The RMS differences between the simple gridded monthly means and KNSC differ on average by up to 6.5 hPa over the same period (Fig. 10e), as do the differences between ICOADS 1° × 1° and KNSC over the same period (not shown). Figure 10f shows that the correction methods reduced the sampling error.
The error estimate for the boundary water cells for the overall period (1950–2010) results in an ME of −0.07 ± 1.34 hPa and an MAE of 0.64 ± 1.19 hPa.
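The mean error (ME) and mean absolute error (MAE) quoted above follow the usual definitions, sketched below. The sketch assumes that the ± values are standard deviations of the differences and of the absolute differences, respectively, which is not stated explicitly here; the function and argument names are hypothetical.

```python
import numpy as np

def me_mae(estimate, reference):
    """Return (ME, spread) and (MAE, spread) for paired values.

    Pairs with a missing value (NaN) are dropped. The spread is the
    standard deviation (assumed meaning of the quoted +/- values).
    """
    diff = np.asarray(estimate, dtype=float) - np.asarray(reference, dtype=float)
    diff = diff[~np.isnan(diff)]
    abs_diff = np.abs(diff)
    return (diff.mean(), diff.std()), (abs_diff.mean(), abs_diff.std())
```

Because positive and negative differences cancel in the ME but not in the MAE, an ME near zero together with a larger MAE, as reported here, indicates little systematic bias but nonzero scatter.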
Compared to FINO data, the air pressure (Fig. 11c) also showed a qualitatively good agreement over time (correlation coefficient of 0.98). The standard deviation is about 6 hPa for KNSC and about 9 hPa for FINO. Even though there are some months with significant differences, a closer inspection shows that these can be explained by the high number of missing values in the FINO measurements (Fig. 11d).
SLP differences between coastal land stations and the corresponding KNSC cells over the period 1979–2001 show minor deviations up to ±1.5 hPa (Fig. 12b), always within the range of random measurement errors (Kent and Berry 2005). These coastal differences in mean SLP are also found between reanalyses and KNSC. Furthermore, they seem to be connected to enhanced pressure gradients in the winter months during NAO+ years over the period 1979–2001 (see Part II).
c. 2-m dewpoint temperature
The 2-m DPT observations have been corrected by the monthly variation method only (as described in section 3b); a diurnal cycle could not be detected. As for AT, the neighboring cells were included in two steps if fewer than 20 observations were found.
Figures 7e and 7f show the 30-yr means for June and December, respectively, from 1981 to 2010. The patterns are comparable to the AT means, but the regions north of 60°N and northwest of Ireland are not well covered in either winter or summer.
The RMS sampling error from 1979 to 2001 in Fig. 10g varies from 0 to 1.05 K. The RMS differences between the simple gridded means and KNSC reach up to 1.05 K over the same period (Fig. 10h). The difference between the two fields (Fig. 10i) shows that the correction methods effectively reduced the sampling error.
The error estimate for the boundary water cells for the overall period of 1950–2010 results in an ME of −0.07 ± 0.51 K and an MAE of 0.23 ± 0.46 K.
5. Summary and conclusions
We presented here a new climatology of the North Sea and adjacent regions (KNSC), created within the KLIWAS project in such a way that it can easily be extended in time whenever more recent or historic observations become available. The first version of the KNSC presented here contains the variables 2-m air temperature (AT), sea level pressure (SLP), and 2-m dewpoint temperature (DPT) on a regular 1° × 1° grid from 1950 to 2010. The variables wind speed and relative humidity were not included (although they might become available in a later version) because of unresolved problems regarding uncertainties in acquisition and quality control, which have to be investigated in detail.
Intercomparisons between KNSC and the simple gridded input data (section 2c) show that the applied correction methods reduced the sampling error for the respective variables.
One goal of the KNSC project, among others, was to highlight that the low observational density complicates regional research. To remedy this situation, further sustained observations need to be collected and made available for research. This is true for almost the entire ocean. Ironically, it holds especially in coastal waters, where variability is high and regions are therefore even more undersampled. In contrast to models, which can be rerun, updated, and improved, measurements that were not taken cannot be recovered. Therefore, more emphasis should be placed on data acquisition and storage to ensure the best possible quality of the reference database. These problems were recognized earlier at the World Climate Research Programme International Conference on Reanalyses in Silver Spring, Maryland, in May 2012 (see WCRP 2012).
KNSC attempts a step toward remedying the problem through improved data processing and quality control. Interpolation and homogenization of KNSC products (for example, via optimum interpolation or kriging) are planned for the future to address the needs of the model community for ideal input parameters. Also, an extension to the Baltic Sea is in progress. Further effort is needed in the acquisition of data and additional quality checks, especially for wind speed and other parameters to be included in future versions of KNSC. Comparisons with coupled reanalysis products such as the global NCEP Climate Forecast System Reanalysis (CFSR; Saha et al. 2010) and the COSMO 6-km reanalysis (COSMO-REA6; Bollmeyer et al. 2015) should also contribute to the understanding of climate variability.
There is also a hydrographic part of the KNSC (Bersch et al. 2013) that includes the parameters salinity and water temperature from 1890 to 2011. The complete KNSC datasets are provided by the Integrated Climate Data Center through its website (http://icdc.cen.uni-hamburg.de/1/projekte/knsc.html), where interactive visualizations, access via Open-Source Project for a Network Data Access Protocol (OPeNDAP), and calculations of climatologies for user-requested periods are also available. All KNSC datasets are referenced by a digital object identifier (Sadikni et al. 2013; Bersch et al. 2013).
The work presented here was conducted as part of the KLIWAS project of the German Federal Ministry of Transport and Digital Infrastructure (BMVI) and the Excellence Initiative CLISAP at the Universität Hamburg, funded through the German Science Foundation (Grant EXC 177/2). ICOADS data are provided by the NOAA/OAR/ESRL/PSD, Boulder, Colorado, from its website (http://www.esrl.noaa.gov/psd). The FINO project is sponsored by the BMWI (German Federal Ministry for Economic Affairs and Energy) and Projektträger Jülich (https://www.ptj.de/).
Quality Flags of DWD’s Marine Data Centre
The quality flags shown in Table A1 are provided by the quality procedures of the DWD for every observation of the Marine Data Centre (section 2a). Observations with the quality flags C, D, E, F, G, and H were included in the KNSC.
This article has a companion article which can be found at http://journals.ametsoc.org/doi/abs/10.1175/JTECH-D-17-0045.1