OBSERVATIONS FOR REANALYSES

Global dynamical reanalyses of the atmosphere and ocean fundamentally rely on observations, not just for the assimilation (i.e., for the definition of the state of the Earth system components) but also in many other steps along the production chain. Observations are used to constrain the model boundary conditions, for the calibration or uncertainty determination of other observations, and for the evaluation of data products. This requires major efforts, including data rescue (for historical observations), data management (including metadatabases), compilation and quality control, and error estimation. The work on observations ideally occurs one cycle ahead of the generation cycle of reanalyses, allowing the reanalyses to make full use of it. In this paper we describe the activities within ERA-CLIM2, which range from surface, upper-air, and Southern Ocean data rescue to satellite data recalibration and from the generation of snow-cover products to the development of a global station data metadatabase. The project has not produced new data collections. Rather, the data generated has fed into global repositories and will serve future reanalysis projects. The continuation of this effort is first contingent upon the organization of data rescue and also upon a series of targeted research activities to address newly identified in situ and satellite records.


Reanalysis efforts fundamentally depend on observations of the atmosphere, the ocean, the cryosphere, and the land surface. First and foremost, observations provide information on the state of the atmosphere and ocean that can be assimilated into a numerical weather prediction model in order to produce the reanalysis. However, observations are also used in several other steps along the processing chain. They are used to constrain the boundary conditions of the numerical weather prediction model, to calibrate statistical relations used in the processing (e.g., for geophysical parameter estimation from satellite data), to determine and correct the error of other observations, and to evaluate the final reanalysis product. A reanalysis is therefore only as good as the underlying observational data, which makes rescuing, compiling, and managing additional observations critical.
Collecting historical observations is a tremendous effort, which can only be undertaken through close collaboration between weather and climate data centers, satellite agencies, reanalysis producers, and individual scientists. Reanalysis producers share their observation databases and thus build on each other's efforts. However, each reanalysis project also has its specific objectives and technical restrictions or opportunities. Particularly when going back in time, not all observations are easily accessible. A large fraction of historical meteorological observations has never been digitized because the data have either not been considered valuable thus far or have simply not been known about. However, even in the rather recent past, the availability of satellite data records (and the computer codes needed to read and process them) is an issue that needs to be addressed (Poli et al. 2017). Furthermore, metadata need to be compiled and shared in a systematic way in order to render the data useful.
Major efforts were therefore undertaken in the European Reanalysis of the Global Climate System (ERA-CLIM2) to collect and make available observations for all reanalyses covering the period since 1900. Such an undertaking requires a much broader vision than the production of a specific reanalysis. Historical observations are also a legacy, and generating reanalyses (or other data products) must be seen as a continuous effort. The observations collected may be fully used only in the next cycle of reanalysis generation. However, by simulating the observations already in the current generation of reanalyses, feedback on their quality can be obtained and many instrument issues have been detected this way (Lawrence et al. 2017).
In this paper we report on the efforts within ERA-CLIM2 to rescue, compile, assess, and make available observations as well as metadata and expertise. The data generated feed into global repositories and serve future reanalysis projects.

CLIMATE REANALYSES.
Reanalyses are generated by combining first-guess estimates of the state of the atmosphere, ocean, cryosphere, land, and so forth, usually defined by model forecasts, with data from a range of observing platforms (surface, upper air, satellites) using objective methods known as data assimilation. The availability and quality of observations affect the confidence that can be placed upon reanalysis datasets (e.g., Compo et al. 2011; Hersbach et al. 2017; Uppala et al. 2005).
Some reanalyses provide multidecadal or longer time series of gridded variables. However, they can only be used to monitor the evolution of the Earth system climate if they are not sensitive to changes of the observing system. Given that the observing system has been changing dramatically over the past decades, especially since the 1970s with the start of the satellite era, it is necessary to try to rescue and recover as many past observations as possible, so that the number of observations does not change too strongly throughout the years spanned by a climate reanalysis and so that different observations can be used together to characterize their biases or other uncertainties.
The work presented here was performed within the framework of ERA-CLIM2, a 4-yr research project funded by the European Union Seventh Framework Program (FP7). It aims to produce coupled reanalyses, which are physically consistent datasets describing the evolution of the global atmosphere, ocean, land surface, cryosphere, and carbon cycle [see Buizza et al. (2018) for an overview]. The main contribution of the ERA-CLIM2 project to climate science has been to improve the capacity for producing state-of-the-art climate reanalyses that extend back to the early twentieth century. Observational data rescue was an important part of the project, including data rescue of historic in situ weather observations around the world and reprocessing of satellite climate data records. Specifically, the work encompassed 1) data rescue for in situ observations, their quality control, and provision of metadata; 2) satellite data rescue, reprocessing, and intercalibration; and 3) provision of boundary constraints and external forcing.

SURFACE METEOROLOGICAL DATA.
Observations of pressure and wind and, sometimes, temperature at the surface are assimilated into reanalyses. Other surface observations, such as those of precipitation, have so far mostly been used in reanalysis to evaluate the effectiveness of the forecast of those variables. It is important to apply rigorous quality control (QC) to these observations. The Hadley Centre Integrated Surface Database (HadISD; version 2.0.0; Dunn et al. 2016; available from www.metoffice.gov.uk/hadobs/hadisd/) has been further developed under ERA-CLIM2 to improve and extend the QC of a subset of subdaily observations back to 1931 obtained from the National Oceanic and Atmospheric Administration/National Centers for Environmental Information (NOAA/NCEI) Integrated Surface Database (Smith et al. 2011), and to provide an assessment of station record homogeneity.
However, existing holdings of such observations are inadequate, and ERA-CLIM2 contributed to the collection of further historical surface meteorological data by cataloging, prioritizing, imaging, digitizing, applying quality-control tools, and/or formatting subdaily data from various sources (see Table 1). In addition to data rescue work carried out within the project, ERA-CLIM2 supported the coordination of global data rescue work via the Atmospheric Circulation Reconstructions over the Earth (ACRE) initiative (Allan et al. 2011). In total, 4.7 million station days' worth of meteorological data (this does not include the snow data discussed later) were cataloged. Most of the data (4.6 million station days) were digitized (Table 1). Work was also performed on the QC and homogenization of the data (e.g., Bližňák et al. 2015; Hunziker et al. 2017). The data were submitted to the corresponding data centers, such as the International Surface Pressure Databank (ISPD; Cram et al. 2015) for pressure and the International Surface Temperature Initiative (ISTI; Rennie et al. 2014) for temperature; additional download links for individual data collections are given below. Additionally, metadata for these stations are included in the ERA-CLIM registry (see below). Table 1 provides a brief listing of important sources. These are discussed in more detail below.
Climatológicos, 1937-74. These publications presented data for Luanda from 1937 until 1974 (with a gap in 1969) and for other Angolan stations from 1953. The entire collection was imaged and subsequently recovered by optical character recognition (OCR) techniques; ERA-CLIM2 complemented the work started in ERA-CLIM (Stickler et al. 2014). The subdaily data chosen for digitization included surface or mean sea level (MSL) pressure, air temperatures, relative humidity, wind speed and direction, cloud cover, precipitation, and evaporation. In addition to Luanda, data from nine stations in Angola were digitized for ERA-CLIM2: Mossâmedes (now Namibe), Cabinda, Dundo, Malange, Vila Lusa (now Luena), Lobito, Nova Lisboa (now Huambo), Sá da Bandeira (now Lubango), and Mavinga. Subsequent to the OCR process, QC tools were applied to the digitized data. The data can be accessed online (http://eraclim2.rd.ciencias.ulisboa.pt/).
Rodrigues-Mozambique, 1909-60. Two sets of publications were found for Mozambique at the Instituto Dom Luiz library. The first collection, the one with the oldest data, contains the observations for Maputo (then Lourenço Marques, Campos Rodrigues Observatory) for 1909-60 (bihourly from 1910). The data were imaged, and the 1910-14 period and the 1947-60 period (in addition to the 1915-46 period in Anais das Colonias recovered previously in ERA-CLIM) were digitized. This is a very complete set of data containing the same variables as the Angola set mentioned above, as well as soil temperatures (with depth reaching 1.5 m below ground level) and grass temperatures. Priority was given to surface/MSL pressure, and for the 1947-60 period all data (recorded every 2 h) have been digitized and sent to ISPD.
Moçambique-Boletim Mensal, 1934-56. The other Mozambique dataset was found to contain subdaily data for Maputo only, for 1934-44, and for several stations in the 1951-56 period. There is a gap between 1945 and 1950 in the Instituto Dom Luiz collection of these publications. Besides imaging the whole set, the 1951-56 period was chosen for digitization for the following stations: Beira, Inhambane, Quelimane, Mossuril, Murrebué, Tete, Vila Cabral, and Lumbo. The same set of observed variables was recovered as for the Angola set mentioned above.

Província de
Registo de Todas las Observaciones del Tiempo de Superficie (Fuerza Aerea de Chile), 1950-58. In South America ERA-CLIM2 was given access to hourly surface observations performed by Fuerza Aerea de Chile at 41 stations for 1950-58. All data were handwritten and imaged and contained the following variables, among others: dewpoint temperature, cloud cover, wind direction and speed, surface pressure, dry-bulb temperature, accumulated daily

UPPER-AIR DATA.
A particular focus of the project was on historical upper-air data (Stickler et al. 2014; Ramello Pralungo et al. 2014). ERA-CLIM2 continued to digitize and QC the rich sources discovered during ERA-CLIM. These efforts led to an update of the Comprehensive Historical Upper-Air Network (CHUAN; Stickler et al. 2010).
In total (ERA-CLIM and ERA-CLIM2), 1.3 million station days' worth of upper-air profiles were cataloged (Table 2), most of which were digitized and quality controlled. Since stations typically have two soundings per day, with several variables at many levels, 1.3 million station days correspond to tens of millions of individual data points. A map of all stations, with an indication of the number of station days, is given in Fig. 1. Noteworthy is the digitization of the data from the so-called international days from the 1920s. This was an effort coordinated by the International Aerological Commission. On specific days (typically once per month), balloons were launched from many stations worldwide. Although assimilating such data into a reanalysis may not help much, they offer a superb opportunity for comparison.
A further effort concerned upper-air data from ships, again continuing from ERA-CLIM (Stickler et al. 2015). In the following we give a brief overview of the most important sources.
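The order of magnitude quoted above can be checked with a short calculation. The following is a minimal sketch: the two soundings per day come from the text, whereas the numbers of variables and levels per sounding are purely illustrative assumptions (the text only says "several variables at many levels"), and the function name is hypothetical.

```python
def estimate_data_points(station_days, soundings_per_day=2, variables=3, levels=5):
    """Rough count of individual values contained in a set of station days.

    soundings_per_day follows the text; variables and levels are
    illustrative assumptions, not figures from the project.
    """
    return station_days * soundings_per_day * variables * levels


station_days = 1_300_000  # cataloged upper-air station days (ERA-CLIM and ERA-CLIM2)

# Sparse early profiles (assumed: 2 variables at 3 levels)
low = estimate_data_points(station_days, variables=2, levels=3)
# Denser profiles (assumed: 4 variables at 8 levels)
high = estimate_data_points(station_days, variables=4, levels=8)

print(f"between {low:,} and {high:,} data points")
```

Under these assumed profile densities both bounds fall in the tens of millions, consistent with the statement in the text. Because not all cataloged station days were digitized, the figures are upper estimates.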
Russian upper-air data prior to 1960. Operational radiosonde data from 41 stations from Russia (Fig. 1) prior to 1960 were digitized, which is a backward extension of existing data holdings (such as the Integrated Global Radiosonde Archive; see Durre et al. 2006) to the 1940s. The work comprised three parts:
• complementing data already existing in electronic format but which were incomplete (missing stations, missing levels, etc.);
• digitizing new sources from handwritten tables; and
• integrating computerized "views" of old punch cards in exotic formats (the handwritten tables were manually keyed by punching machines to paper punch cards in the 1970s and 1980s, and in the late 1980s the paper punch cards were copied to nine-track magnetic tapes "as is").
The effort in ERA-CLIM2 related to this source of data included decoding information from obsolete nine-track tape media and compiling a digital dataset.
The final dataset was compiled from these three sources. The data were formatted, quality controlled, and delivered to CHUAN, version 2.1. They were also used for various analyses (Lavrov and Sterin 2017; Sterin and Lavrov 2017).
Central and northern European data, 1900s-30s and polar expeditions. The digitization of several large collections of upper-air data begun in ERA-CLIM was continued in ERA-CLIM2. An important source for early data was the German "daily weather report," which contains data from many stations in the Atlantic-European sector as well as Russia. Data from the Netherlands and Finland were also digitized (Fig. 1). These three data sources provide rich information on the early period of aerological measurements (i.e., from the 1900s to the 1930s); in total, 200,000 station days were digitized. Also noteworthy are upper-air data from early polar expeditions, such as the Greenland expeditions in 1912/13 (Switzerland), 1926-31 (University of Michigan), and 1930/31 (Germany), and British, American, and Australian Antarctic expeditions.
Asian and African sources, 1920s-50s. A large amount of data was digitized from the reports of the India Meteorological Department (1928-42) and the Pakistan daily weather report. A further important source was upper-air data from Egypt (1920-53). These sources clearly leave a large imprint on the station map (Fig. 1) reflecting a colonial world at that time (Brönnimann and Wintzer 2018). As an example of a further Asian source, Fig. 2 2 and Fig. 1). Pilot balloon data for the period 1919-57 were digitized from handwritten reports. Major efforts have been undertaken to locate records of upper-air data from France and French ex-colonies in 82 French archives, to image all the located records (one million images generated within ERA-CLIM and ERA-CLIM2), to create new long-term series of upper-air wind, and to compile metadata on these historical observations. Seven long-term series of upper-level wind for the French mainland and one series in Corsica longer than 25 years were generated: Ajaccio (1920-48), Antibes/Nice (1923-48), Bordeaux (1920-21, 1923-57), Le Bourget (1920-21, 1923-48), Lyon (1920-21, 1923-47), Perpignan (1920-21, 1923-57), Poitiers (1921, 1923-48), and Toulouse (1923-47). In addition, early radiosonde data for Trappes (Fig. 3) and Bordeaux for the period 1937-39 have been located in an old internal publication and have been digitized to extend the series provided within ERA-CLIM.
OCEANIC DATA.
Historical Southern Ocean data. Our knowledge of climate variability and change in the Southern Ocean is currently limited by a lack of historical observations from that region. This limits the skill of reanalyses there because not only are there relatively few observations to assimilate but the boundary conditions are also relatively poorly understood. The Hadley Centre Sea Ice and Sea Surface Temperature dataset (HadISST) analysis (Rayner et al. 2003) was further developed under ERA-CLIM and ERA-CLIM2, revising both the SST and sea ice components to version 2. Titchner and Rayner (2014) document the new sea ice analysis (which is available at www.metoffice.gov.uk/hadobs/hadisst2/). The representation of Antarctic sea ice remains fairly crude in this analysis prior to 1973, so a search was undertaken for more observations. This search was continued under ERA-CLIM2 and has yielded information not only on sea ice but also (primarily) on surface meteorological variables for the Southern Ocean for the late nineteenth and early twentieth centuries.
These data, mostly from ships' logbooks in the Southern Ocean, have been imaged from the original documents but not yet keyed. These documents include dedicated meteorological logbooks as well as common ships' logbooks, meteorological forms, whale-catch books, day reports, ice reports, ice charts, and other relevant items. Types and frequency of observations vary from one document to another, but typically there is barometric pressure and air temperature, wind direction and force, weather, cloud cover, SST, sea state and swell, sometimes salinity and biological observations, sea ice and icebergs, and other phenomena such as aurora. The logbooks and other documents imaged are mostly from the twentieth century up to 1950, with a few from the nineteenth century such as the Ross Antarctic expedition of 1839-43.
Observations have been imaged from naval vessels (British and Chilean), whaling vessels (British and Norwegian), vessels of exploration, research vessels, and merchant shipping (British and Finnish). In all about 131,000 images have been captured (58,000 from the United Kingdom, 30,500 from Norway, 20,300 from Finland, and 22,000 from Chile). The images are held at the Met Office. This is expected to provide more than one million observations, once the data are keyed. Much of the British and Chilean data are for the Weddell Sea area and around the Antarctic Peninsula. The Norwegian whaling data cover the Weddell Sea and Indian Ocean sectors of the Antarctic. The data from merchant shipping include high-latitude transits of the South Pacific from Australia and New Zealand toward Cape Horn or the west coast of South America by British and Finnish vessels. The collection of 131,000 images does not represent the entirety of historic data available for the Southern Ocean. There are many more known sources of ships' logbooks in Norway and Finland. Archives in Sweden, Denmark, Germany, Argentina, Australia, New Zealand, Japan, and the United States have yet to be explored and documented.
4), the total number of observations being around 1.2 million. The SWE values were calculated from snow depth and density samples. Depth measurements were performed at up to 100 locations obtained at a constant interval through the snow course. The snowpack bulk density is sampled more sparsely, typically every tenth depth sample. In addition to SWE, the dataset has information on snow course center coordinates, mean snow depth, mean density, and the day of an individual observation. The typical frequency of observation at a single site varies from once a week to once a month, and the observations were carried out throughout the period of seasonal snow cover. Figure 4 shows the locations of included snow courses.
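The calculation of SWE from a snow course can be illustrated with a minimal sketch. It assumes that the course value is the product of the mean depth and the mean bulk density, which is a common convention but is not stated explicitly for this dataset; the function name and the sample values are hypothetical.

```python
from statistics import mean


def snow_course_swe(depths_cm, densities_kg_m3):
    """Snow water equivalent (mm) from snow-course samples.

    Assumption: SWE = mean depth x mean bulk density. With depth in metres
    and density in kg m-3 the product is in kg m-2, which is numerically
    equal to millimetres of water.
    """
    if not depths_cm or not densities_kg_m3:
        raise ValueError("need at least one depth and one density sample")
    mean_depth_m = mean(depths_cm) / 100.0
    return mean_depth_m * mean(densities_kg_m3)


# Hypothetical course: 100 depth readings, density taken at every tenth point
depths_cm = [50.0 + (i % 5) for i in range(100)]   # values 50..54 cm, mean 52 cm
densities_kg_m3 = [250.0] * 10                     # 10 bulk density samples

swe_mm = snow_course_swe(depths_cm, densities_kg_m3)
print(f"SWE = {swe_mm:.1f} mm")
```

The sparse density sampling matters little in this formulation because bulk density usually varies much less along a course than depth does, which is presumably why it is sampled only at every tenth point.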

Development of
In addition to SWE, historical snow depth data from Russia were digitized and processed. Again, the digitization mainly extended the list of stations in the snow parameter dataset. Furthermore, 20 additional stations were digitized that were not available electronically. Owing to numerous metadata records on historical changes in snow observational practices in the former Soviet Union and on the changes in QC processing and quality flag codes, it became possible to merge the data into a consistent database. (The data are available from www.meteo.ru.) Figure 5 shows climatologies of snow depth for the months of April, October, and November (Wegmann et al. 2017). The bottom row shows corresponding maps from the ECMWF twentieth-century reanalysis (ERA-20C; Poli et al. 2016), which does not assimilate snow depth data. The figure shows that ERA-20C realistically reproduces the broad features of climatological average snow depth.
HISTORICAL SATELLITE DATA.
Satellite data are used in almost all global reanalyses covering the last 50-60 years. However, they only became available in the mid-1960s, and the satellites and instruments were mostly built for the purpose of weather monitoring. Their use in a reanalysis requires a quantification and correction of long-term trends due to changes in the characteristics of satellites and performance of sensors during their operational lifetime in space, as well as systematic differences between instruments of the same kind (e.g., in their spectral response). A reprocessing of the data that applies corrections for such effects is fundamental to serve the generation of physically consistent data records of geophysical variables by reanalysis. Consequently, ERA-CLIM2 contributed to further improvement of reanalyses of the satellite era by producing reprocessed and recalibrated satellite data records and by including more observations from early satellite instruments. An assessment of satellite sounder radiances suitable for use in reanalyses, mainly from the 1970s, was made for ERA-CLIM2 and is documented in Table 3.
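To make the idea of correcting systematic differences between instruments concrete, the sketch below fits recalibration coefficients from collocated measurements of a target and a reference instrument. It is only an illustration under strong assumptions: the relation is taken to be linear, the collocations and the spectral matching are assumed to have been done already, and all names and numbers are hypothetical rather than the procedure used in the project.

```python
def fit_recalibration(target, reference):
    """Least-squares slope and offset so that reference ~ slope * target + offset."""
    n = len(target)
    if n != len(reference) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_t = sum(target) / n
    mean_r = sum(reference) / n
    s_tt = sum((t - mean_t) ** 2 for t in target)
    if s_tt == 0.0:
        raise ValueError("target values are constant; slope is undefined")
    s_tr = sum((t - mean_t) * (r - mean_r) for t, r in zip(target, reference))
    slope = s_tr / s_tt
    offset = mean_r - slope * mean_t
    return slope, offset


def recalibrate(values, slope, offset):
    """Apply the fitted coefficients to target-instrument values."""
    return [slope * v + offset for v in values]


# Synthetic collocations (brightness temperatures in K); the target instrument
# is constructed so that reference = 1.02 * target + 1.5 holds exactly.
reference_tb = [220.0, 240.0, 260.0, 280.0, 300.0]
target_tb = [(r - 1.5) / 1.02 for r in reference_tb]

slope, offset = fit_recalibration(target_tb, reference_tb)
corrected_tb = recalibrate(target_tb, slope, offset)
print(f"slope={slope:.3f}, offset={offset:.3f} K")
```

In practice such coefficients would be estimated per time window so that drifts over the instrument lifetime can be followed; a single fit over the whole record, as here, only removes a constant systematic difference.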
Specific efforts addressed four different satellite time series: radiances and atmospheric motion vectors (Borde et al. 2014) for the period 1982-2014. The recalibration of geostationary radiances is a complex process involving multiple steps: 1) collection of geostationary and reference instrument datasets; 2) selection of reference instruments and bringing the reference data to match spectrally with the geostationary measurements, which involves rigorous data analysis and research; 3) finding collocations between the geostationary and reference data (which is computationally expensive); and 4) use of the collocated measurements to determine the recalibration coefficients. The scope of effort needed for satellite data rescue and reprocessing is no less daunting than that for in situ data rescue (Poli et al. 2017). As part of satellite data rescue activities, advice on the use of historical satellite datasets was provided. Even more important, the capability to simulate the measurements of historical satellite instruments to allow monitoring and assimilation of radiances from such instruments has been enhanced. The Radiative Transfer for the Television and Infrared Observation Satellite (TIROS) Operational Vertical Sounder (TOVS) model (RTTOV; Saunders et al. 1999) has been updated since ERA-Interim was produced to include more of the historical sensors. In particular, biases of individual instruments were explored, and the work has shown that some spectral responses previously assumed in the radiative transfer model needed updating. The new RTTOV-12 model reduces some of the biases due to instrumental effects and also allows more accurate calculations with an increased number of pressure levels through the atmosphere. For example, the cell pressure of the Stratospheric Sounding Unit (SSU) reduced after launch and new coefficients have been developed to account for this (Nash and Saunders 2015). Coefficients for early satellite instruments have also been updated assuming the atmospheric CO2 concentration for an earlier period, rather than using current CO2 concentrations.
other projects as it contains metadata on thousands of surface and upper-air stations as well as ship logs.
In addition to the station names, various station numbers, coordinates, and time period of measurements, the database contains information on the archive, variables covered, the state of the process, or data level (imaged, digitized, formatted, quality controlled, etc.) and offers the possibility to directly link to the data.This registry will allow others to make full use of the information compiled within ERA-CLIM and ERA-CLIM2 and to share their own data rescue information with the community.
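The kind of record described above can be sketched as a small data structure. The field names, the identifier scheme, and the example values below are hypothetical and chosen only for illustration; they do not reproduce the actual schema of the ERA-CLIM registry.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class DataLevel(Enum):
    """State of the rescue process, following the levels named in the text."""
    IMAGED = "imaged"
    DIGITIZED = "digitized"
    FORMATTED = "formatted"
    QUALITY_CONTROLLED = "quality controlled"


@dataclass
class RegistryEntry:
    station_name: str
    station_ids: Dict[str, str]      # numbering scheme -> identifier (hypothetical)
    latitude: float
    longitude: float
    start_year: int
    end_year: int
    archive: str
    variables: List[str]
    data_level: DataLevel
    data_url: Optional[str] = None   # direct link to the data, if available

    def covers(self, year: int) -> bool:
        return self.start_year <= year <= self.end_year


def find_entries(entries, variable, year):
    """Return entries that hold a given variable for a given year."""
    return [e for e in entries if variable in e.variables and e.covers(year)]


# Illustrative entry; identifier, archive name, and coordinates are placeholders.
example = RegistryEntry(
    station_name="Luanda",
    station_ids={"example-scheme": "EXAMPLE-001"},
    latitude=-8.85,
    longitude=13.23,
    start_year=1937,
    end_year=1974,
    archive="(archive name, hypothetical)",
    variables=["pressure", "air temperature", "relative humidity"],
    data_level=DataLevel.QUALITY_CONTROLLED,
)

matches = find_entries([example], "pressure", 1950)
print([e.station_name for e in matches])
```

Keeping the data level as an explicit field is what allows such a registry to answer the practical question of what still needs to be done for a given source, for example which imaged records have not yet been digitized.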
In addition to updating existing repositories (see next section), observations rescued and processed in ERA-CLIM and ERA-CLIM2 were also directly used for the generation of reanalyses. Surface pressure observations were assimilated into both ERA-20C and the Coupled ECMWF Reanalysis of the Twentieth Century (CERA-20C; Laloyaux et al. 2018), and some of the processed satellite data were assimilated into the fifth major global reanalysis produced by ECMWF (ERA5; Hersbach and Dee 2016). The satellite data are, or will become, available from EUMETSAT (under http://navigator.eumetsat.int).
As an example, Fig. 7 shows the case of a typhoon on 18 September 1906 that hit Hong Kong. ERA-20C, which only included one station within the region displayed (Hong Kong), did not reproduce the typhoon. Within ERA-CLIM, data from the South China Sea were digitized (large dots), and they were assimilated into CERA-20C. The CERA-20C ensemble mean produces a low pressure system, albeit too weak. However, ensemble member 9 of CERA-20C shows a stronger cyclone. In ERA-CLIM2, the South China Sea dataset has now been extended substantially, so that future reanalyses will be able to better simulate typhoons over South Asia in the early years.
Most of the observations digitized and processed within ERA-CLIM and ERA-CLIM2 were not assimilated into ERA-20C or CERA-20C, but they were used for their evaluation (e.g., Stickler et al. 2015). In addition, a test reanalysis was performed [European Reanalysis of the presatellite era (ERA-PreSAT); Hersbach et al. 2017], covering the period 1939-67 and assimilating the CHUAN, version-1, upper-air data (Stickler et al. 2010), demonstrating the potential of using historical upper-air data for reanalyses. Ongoing and future reanalysis efforts will be able to make use of the millions of profiles digitized as well as of the experience in generating ERA-PreSAT. Likewise, snow observations are not directly assimilated but were used to evaluate the reanalyses and their respective land surface products (Fig. 5).

CONTRIBUTIONS TO PUBLIC REPOSITORIES.
The ERA-CLIM2 observations feed into existing repositories and are already used in other projects. This section gives a brief overview (Table 4 lists the contributions to existing repositories). By submitting data to existing repositories (from where they are retrieved for reanalysis projects), they become part of their update cycles. This guarantees that version numbers are unique and that older versions remain available. When reanalyses approach real time, observations will be taken from the real-time data stream and not from these repositories, but the data will still be available to the user via the OFA.
The digitized surface data were submitted to ISPD (pressure) and ISTI (temperature). The ERA-CLIM2 upper-air data have been integrated into the latest version 2.1 of the CHUAN database [for version 2 see Stickler et al. (2014)], which is ASCII formatted (and can be downloaded from ftp://giub-torrent.unibe.ch/eraclim2/). Those data have been reformatted into the ECMWF Observation Database (ODB2) format and have been stored in the Meteorological Archival and Retrieval System (MARS) archive. In this format the data can be immediately read by the Integrated Forecasting System.
CHUAN2.1 will be used in the back extension of the Copernicus Climate Change Service (C3S) ERA5 (http://apps.ecmwf.int/datasets/) from 1950, where it will provide an essential source of upper-air information. After assimilation the upper-air data (2017, unpublished manuscript). These procedures largely accomplish the requirements stated in a workshop that was held in Reading, United Kingdom (29 June-1 July 2015; www.ecmwf.int/sites/default/files/COP-CO-WS-Summary.pdf), on climate observation requirements. This includes the so-called observation feedback, which contains the original observations along with data assimilation quality flags, potential bias corrections, and departures from the reanalysis record. This represents a mine of information of interest to observation experts and reanalysis producers. Observation feedback is available from a number of reanalyses at present [Twentieth Century Reanalysis (20CR) and ERA-20C] and ensures that the full observation input is available.
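The content of an observation feedback record, as listed above, can be pictured with a small sketch. The field names and the sign convention (departure = bias-corrected observation minus model equivalent) are assumptions made for illustration and are not taken from any specific archive format; the numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class FeedbackRecord:
    observed: float          # original observation, e.g., surface pressure in hPa
    bias_correction: float   # correction subtracted from the observation (assumed sign)
    background: float        # model first-guess equivalent of the observation
    analysis: float          # reanalysis equivalent of the observation
    qc_flag: str             # data assimilation quality flag (hypothetical coding)

    @property
    def background_departure(self) -> float:
        """Bias-corrected observation minus first guess."""
        return (self.observed - self.bias_correction) - self.background

    @property
    def analysis_departure(self) -> float:
        """Bias-corrected observation minus analysis."""
        return (self.observed - self.bias_correction) - self.analysis


record = FeedbackRecord(
    observed=1012.4,
    bias_correction=0.4,
    background=1010.0,
    analysis=1011.5,
    qc_flag="used",
)

print(f"obs - background: {record.background_departure:.1f} hPa")
print(f"obs - analysis:   {record.analysis_departure:.1f} hPa")
```

In this invented case the analysis departure is smaller than the background departure, which is the expected behavior when an observation is actually used; persistently large departures for one station or instrument are the kind of signal that points observation experts to data problems.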
MAKING THE EFFORT SUSTAINABLE.An important aspect of the work on observations is its sustainability.Future reanalysis projects require not only observations but also metadatabases and data holdings (Thorne et al. 2017), data rescue services (Brönnimann et al. 2018, manuscript submitted to Geosci. Data J.), and an active climate data community.The C3S contributes to make this undertaking sustainable via investment in services on data rescue and global database production.
C3S is one of several dedicated information services recently established by the European Commission as part of its Copernicus Earth Observation Programme.The purpose of C3S is to provide open and free access to data, tools, and information needed to support climate adaptation and mitigation in Europe.Improving access to climate observations and reanalyses is a key objective.C3S is being implemented as an operational service similar to weather forecasting, which means that all data and products will be fully supported, quality assured, and reliably delivered on a well-defined schedule.This provides a unique opportunity to transfer some of the essential work on observations and reanalysis undertaken in ERA-CLIM and ERA-CLIM2 and related projects to a sustainable operational environment.
Various technical aspects of the work on observations described in this article are now included in C3S.For example, a C3S data rescue service is being developed that will build on the work started in ERA-CLIM on the collection and sharing of metadata on data rescue projects (Brönnimann et al. 2018, manuscript submitted to Geosci. Data J.).This service will also develop new tools for data rescue and promote best practices and in some cases provide direct support to data rescue communities (e.g., in the form of capacity-building workshops).C3S is also taking steps to improve access to observations in existing climate data archives, by supporting the development and maintenance of a new global, quality-controlled database of historic surface observations, including land and marine data, building on existing holdings at NOAA/NCEI and many other sources (Thorne et al. 2017).Similar activities will be initiated to ensure the maintenance and further development of a merged database of available upper-air meteorological observations, including all those collected in the ERA-CLIM and ERA-CLIM2 projects.EUMETSAT is continuing its recalibration and reprocessing activities within a dedicated C3S project in support of the next reanalysis of the satellite era.
Production of global and regional model-based reanalyses, which until recently has relied almost exclusively on research funding, is now conducted as a fully supported operational activity under Copernicus.This includes ERA5, the latest ECMWF reanalysis, which will be completed in 2018, and incorporates various reprocessed satellite data records developed in ERA-CLIM.The next generation, ERA6, will likely be a coupled atmosphere-ocean reanalysis using data and methods developed in ERA-CLIM2.
Clearly, establishing reanalysis as an operational activity will bring many benefits to users and, as a result, will generate more support for work on data recovery and reprocessing. Nevertheless, a strong need remains for underpinning research on data and methodology to ensure that the state of the art continues to evolve.

CONCLUSIONS.
Global dynamical reanalyses of the atmosphere and ocean rely on observations in various steps along the production chain. In addition to being assimilated into a reanalysis system, observations are used to constrain the model boundary conditions, for uncertainty determination of other observations, and for the evaluation of data products. In ERA-CLIM2, major efforts were devoted to observations, namely, data rescue, processing and management, compilation and quality control, and error estimation:
• New sea surface and sea ice datasets were produced that serve as boundary conditions for reanalyses.
• Hundreds of thousands of images were taken within ERA-CLIM and ERA-CLIM2 and 6.4 million station days were digitized, amounting to tens of millions of data points.
• Several snow products were generated, encompassing SWE and snow depth.
• Satellite data were recalibrated and reprocessed to provide consistent satellite data records from the 1980s until today.
• The HadIOD and HadISD databases of quality-controlled surface and ocean measurements were extended and improved.
A large fraction of the data have not yet been assimilated into a reanalysis but were fed into global repositories on which future reanalysis efforts will be based. The work on observations is therefore one cycle ahead of reanalysis production. The continuation of this effort is vital for the success of future reanalysis projects.
ACKNOWLEDGMENTS. ERA-CLIM2 was funded by EU-FP7 (Grant Agreement 607029). The project collaborated closely with the Horizon2020 project EUSTACE (640171), as well as the Swiss National Science Foundation projects DECADE (147320) and CHIMES (169676).

Fig. 1. (top) Wind at 7900 m above mean sea level obtained with pilot balloons at Las Cañadas del Teide on Tenerife, Canary Islands, in 1912. (bottom) Map of digitized upper-air data from ERA-CLIM and ERA-CLIM2. The circle size indicates the number of station days per station. In total, 1400 stations are shown.

Fig. 2. Kite ascent from Beijing, 12 Apr 1934: (left) raw recording and (top right) tabulated data (from the Bulletin of the Upper Air Current Observations, Vol. I-VI, Academia Sinica, Nanking, China). (bottom right) The map shows the series from China and Korea from the 1930s and early 1940s digitized within ERA-CLIM and ERA-CLIM2.
HadIOD. The Met Office Hadley Centre Integrated Ocean Database (HadIOD; Atkinson et al. 2014) was first developed under ERA-CLIM and further improved under ERA-CLIM2 to produce version 1.2.0.0. HadIOD is a database of global ocean temperature and salinity observations, combining data from surface-only and profiling instruments, with quality flags, bias adjustments, and uncertainty estimates for each observation (where possible). Chief data sources are the International Comprehensive Ocean-Atmosphere Data Set (ICOADS; Woodruff et al. 2011) for SST observations and the Met Office Hadley Centre EN series dataset (Good et al. 2013) for temperature and salinity profile observations. The ultimate aim of HadIOD is to provide input for coupled data assimilation. HadIOD.1.2.0.0 has undergone the following recent developments: (i) greater time coverage (now starting in 1850); (ii) new data sources, for example, high-temporal-resolution global tropical moored buoy array data; (iii) improvements to bias corrections; (iv) a new output format for assimilation; and (v) improved quality control of data from ships and drifting buoys over their measuring lifetimes.

SNOW DATA. An effort to establish a global dataset of distributed snow observations resulted in a new collection of snow water equivalent (SWE) values measured at snow courses of the Northern Hemisphere. SWE recordings from distributed snow courses were obtained covering the regions of Russia (and the former Soviet Union), Canada,
Figure 6 shows the cross-calibrated Meteosat infrared (IR) window channel (10.5-12.5 µm) radiances for a portion of the time series from 1991 to 2006. The time series shows an almost perfect alignment of the resultant radiances from different Meteosat satellites, and the difference plot demonstrates the successful elimination of some artifacts in the time series and the removal of an artificial trend from the data.

Fig. 6. (top) Recalibrated Meteosat IR channel radiances (mW m⁻² sr⁻¹ cm⁻¹) over Payerne, Switzerland, and (bottom) the difference between corrected radiances and those computed using operational calibration parameters. The black line shows Meteosat first-generation MVIRI radiances, and the red line shows Meteosat second-generation SEVIRI radiances. The cross-calibrated radiances have been homogenized using the Meteosat-5 MVIRI spectral response function.

Fig. 7. Sea level pressure and 10-m wind during a typhoon at 1800 UTC 18 Sep 1906 from (left) ERA-20C and (from left center to right) CERA-20C ensemble mean, member 9, and map of the 1005-hPa contour for all members. The dots indicate pressure observations from the South China Sea dataset digitized by ERA-CLIM, which did not feed into ERA-20C but were assimilated into CERA-20C.

Table 1. Surface observations (No. of station days) digitized within ERA-CLIM and ERA-CLIM2. Columns: Source; Cataloged; Digitized; Quality controlled.
The work comprised the assessment of existing data sources, on computer media in any format and in hardcopy, to select a subset of stations suitable for period extension; the reformatting of older, nonstandard-format data to match the common format used for later data; the filling of gaps in the period before 1966 by digitizing hardcopy material and transforming the digitized data to the common format; the adjustment of time and date; and QC of the data.
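The reformatting, time adjustment, and QC steps listed above can be illustrated with a minimal Python sketch. The legacy record layout, the fixed UTC offset, the `Obs` structure, and the plausibility limits are all assumptions made for illustration; they are not the formats or thresholds used in the project, and gap filling from hardcopy is not covered.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical common format for one sea level pressure observation.
@dataclass
class Obs:
    station_id: str
    time_utc: datetime
    pressure_hpa: float
    qc_pass: bool

# Illustrative gross-error limits (assumed, not the project's thresholds).
P_MIN_HPA, P_MAX_HPA = 850.0, 1090.0

def parse_legacy(line: str, utc_offset_hours: float) -> Obs:
    """Convert a hypothetical legacy record 'ID;DD.MM.YYYY;HH:MM;P', given in
    local time, to the common format with the time stamp shifted to UTC."""
    station_id, date_s, time_s, p_s = line.strip().split(";")
    local = datetime.strptime(f"{date_s} {time_s}", "%d.%m.%Y %H:%M")
    # Shifting to UTC may also change the calendar date.
    utc = (local - timedelta(hours=utc_offset_hours)).replace(tzinfo=timezone.utc)
    pressure = float(p_s)
    # Gross-error check: flag values outside the assumed plausible range.
    return Obs(station_id, utc, pressure, P_MIN_HPA <= pressure <= P_MAX_HPA)

# Two made-up records from a station assumed to be 3 h ahead of UTC;
# the second contains a keying error in the pressure value.
records = ["S001;01.01.1960;02:00;1012.4", "S001;01.01.1960;08:00;112.4"]
obs = [parse_legacy(r, utc_offset_hours=3.0) for r in records]
```

In this sketch the first record moves to the previous calendar day once expressed in UTC, which is why time and date have to be adjusted together, and the second record is flagged rather than discarded so that later QC stages can still inspect it.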

Table 4. ERA-CLIM and ERA-CLIM2 contributions to public repositories.
Bias corrections, at least for upper-air temperature back to 1950, have also been developed and are described in detail in L. Haimberger et al.