The ECMWF twentieth century reanalysis (ERA-20C; 1900–2010) assimilates surface pressure and marine wind observations. The reanalysis is single-member, and the background errors are spatiotemporally varying, derived from an ensemble. The atmospheric general circulation model uses the same configuration as the control member of the ERA-20CM ensemble, forced by observationally based analyses of sea surface temperature, sea ice cover, atmospheric composition changes, and solar forcing. The resulting climate trend estimates resemble ERA-20CM for temperature and the water cycle. The ERA-20C water cycle features stable precipitation minus evaporation global averages and no spurious jumps or trends. The assimilation of observations adds realism on synoptic time scales as compared to ERA-20CM in regions that are sufficiently well observed. Compared with nighttime ship observations, ERA-20C air temperatures are about 1 K colder. Generally, the synoptic quality of the product and the agreement in terms of climate indices with other products improve with the availability of observations. The MJO mean amplitude in ERA-20C is larger than in 20CR version 2c throughout the century, and in agreement with other reanalyses such as JRA-55. A novelty in ERA-20C is the availability of observation feedback information. As shown, this information can help assess the product’s quality on selected time scales and regions.
1. Introduction
Reanalysis of past observations with a model-based data assimilation system as used for weather forecasting has enabled many thousands of users to develop weather- and climate-sensitive applications in a wide range of fields (Gregow et al. 2015). Recent global atmospheric reanalysis products include NASA’s Modern-Era Retrospective Analysis for Research and Applications (MERRA; Rienecker et al. 2011), NCEP’s Climate Forecast System Reanalysis (CFSR; Saha et al. 2010), ERA-Interim from ECMWF (Dee et al. 2011), and the Japanese 55-year Reanalysis (JRA-55) from JMA (Kobayashi et al. 2015). Each provides time series, typically starting in 1979 (but 1958 for JRA-55) and extending to the present, of a comprehensive set of atmospheric variables. These datasets estimate the evolution of the state of the atmosphere during the modern observing period, based on available surface and upper-air observations from a variety of satellite instruments and in situ observing systems.
Many climate studies would benefit from even longer datasets that extend back to the early instrumental record. However, consistent estimation of climate variables from early observations brings special challenges beyond those usually encountered in reanalysis. Prior to the 1940s, nearly all weather observations were made from Earth’s surface and only in well-populated land areas or along shipping routes at sea. Information about the nature and quality of early instrumentation is often incomplete (Kennedy 2014). Furthermore, locating and gaining access to early weather observations requires dedicated efforts in data rescue and digitization, especially in parts of the world that are most affected by climate change and variability (Allan et al. 2011).
In spite of such challenges, Compo et al. (2006) demonstrated the feasibility of producing a useful global atmospheric reanalysis spanning a century or more with modern data assimilation methods. The Twentieth Century Reanalysis (20CR) dataset extending back to 1871 was produced (Compo et al. 2011), together with model-generated information about flow-dependent uncertainties in the estimates. Data assimilation for 20CR was achieved with an ensemble Kalman filter based on an atmospheric model constrained by precomputed global estimates of sea surface temperature and sea ice concentration. Only surface pressure observations were assimilated; all other variables are estimated implicitly from the model equations. Compo et al. (2011) were able to show, among other things, that 20CR estimates of tropospheric temperatures compare reasonably well with (independent) radiosonde observations.
The success of the 20CR project has inspired an ambitious long-term research effort at ECMWF, aimed at developing the capability to produce century-long climate reanalyses that take maximum advantage of the available instrumental record (Dee et al. 2014). Two consecutive research collaborations, the European Reanalysis of Global Climate Observations Project (ERA-CLIM) and the follow-on ERA-CLIM2 project, have been funded by the European Commission to support this goal. Both projects are making important contributions to data rescue and digitization, particularly for historic upper-air observations (Stickler et al. 2014) as well as for early satellite data records (Poli et al. 2015b). These and many other data sources are to be assimilated in a series of progressively more accurate reanalyses of the climate system, including atmosphere, land surface, ocean, and sea ice components.
This article describes the ECMWF twentieth century reanalysis (ERA-20C), the first reanalysis product of the ERA-CLIM project, which provides global atmospheric data for the period 1900–2010. ERA-20C relies on a recent version of ECMWF’s Integrated Forecast System (IFS). The IFS, in its standard configuration used for medium-range forecasting applications, includes an atmospheric general circulation model (AGCM) and a variational analysis scheme. Both components have been suitably modified for the purpose at hand.
The AGCM configuration, including specifications of boundary conditions and atmospheric composition and solar radiation, has been described in a separate article (Hersbach et al. 2015a). Prior to data assimilation, the model’s ability to simulate observed changes in the twentieth-century climate was assessed by computing an ensemble of 10 model simulations for the period 1900–2010. Results of the assessment are discussed in detail by Hersbach et al. (2015a), who conclude that the IFS AGCM, when suitably constrained with boundary conditions and radiative forcing data, is well able to represent low-frequency variability of known large-scale features of the twentieth-century climate. Part of the model output (monthly averages for all variables and 3-hourly values for a few selected parameters) has been archived and is accessible via ECMWF’s public data server at http://apps.ecmwf.int. This dataset is named ERA-20CM, where the letter M stands for “model only.”
Drawing on lessons learned from the ERA-40 project (Simmons et al. 2004), this preparatory step of first running the model for the entire reanalysis period, but without data assimilation, fulfilled the intended goal of helping to understand the role of the model and forcings in ERA-20C. In particular it allows a precise assessment of the impact of the assimilated observations, including the effects of changes in data coverage and instrumentation. The main purpose of assimilating observations in a reanalysis is to add realistic information about weather events; a key question is whether it is possible to do so without deteriorating the model representation of low-frequency variability and change.
The outline of this paper is as follows. Section 2 introduces the ERA-20C reanalysis system. Section 3 presents the production. Section 4 shows assimilation statistics and other performance indicators. Section 5 presents some examples of low-frequency variability assessment. Whenever possible the ERA-20C results are compared with the model simulation (ERA-20CM) but also with independent observational products and other reanalyses. Conclusions follow in section 6.
2. Data assimilation system
The modeling and data assimilation system used to create ERA-20C is based on the IFS cy38r1 (ECMWF 2013). The system steps forward in time in 24-h cycles. In each cycle the analysis (i.e., best estimate) is produced by combining observations with a background (i.e., prior estimate) obtained from a short model forecast initialized from the previous analysis.
a. Model and forcings
The model configuration is identical to ERA-20CM (Hersbach et al. 2015a), except that the model time step was reduced from 1 h to 30 min to improve the representation of atmospheric tides. Briefly, as discussed by Hersbach et al. (2015a), the following prescribed forcing data vary over the course of the century: sea ice concentration, sea surface temperature (SST), solar radiation, tropospheric and stratospheric aerosols, ozone, and greenhouse gases. We used the Hadley Centre Sea Ice and Sea Surface Temperature dataset (HadISST) version 2.1.0.0 (Titchner and Rayner 2014; J. Kennedy et al. 2015, unpublished manuscript) for sea ice and SST, and all other data sources as specified for CMIP5 (Taylor et al. 2012). The model has 91 levels between the surface and 0.01 hPa or about 80-km altitude. The horizontal resolution is spectral triangular truncation T159 or approximately 125 km.
b. Observation handling and quality controls
The observation input comprises atmospheric surface pressure observations from the International Surface Pressure Databank (ISPD; Cram et al. 2015) version 3.2.6 and the International Comprehensive Ocean–Atmosphere Data Set (ICOADS; Woodruff et al. 2011) version 2.5.1, as well as marine wind reports from ICOADS. In the case of duplicate surface pressure reports, preference is given to ICOADS, and reports of surface pressure at station level are preferred over reports of mean-sea level pressure. Observations reported at exactly 0° latitude and 0° longitude, which are likely erroneous, are rejected, except for one PIRATA network buoy moored at that location. Observations of wind above a model land point, near the coastlines, or in mountainous closed seas are also rejected because at coarse horizontal resolution the model would not be representative. Observations of surface pressure contained in ocean-profiling instrument reports in ICOADS are also rejected because of suspect quality. Station and ship pressure observations are also rejected if all other observations reported within a 5-day window are constant, provided there are at least three observations. For example, this constant time series check identifies a station reporting pressures of exactly 1000.10 hPa during the entire period 1992–94.
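The constant time series check can be illustrated with a minimal sketch. The function name, the centering of the 5-day window on each report, and the inclusion of the report itself in the count are assumptions made for illustration; they are not taken from the ERA-20C implementation.

```python
from datetime import datetime, timedelta


def constant_series_flags(times, pressures, window_days=5.0, min_count=3):
    """Return one boolean per report; True means 'reject as constant'.

    times and pressures hold the reports of a single station or ship.
    Assumption: the window is centered on each report and the report
    itself counts toward the minimum number of observations.
    """
    half = timedelta(days=window_days / 2.0)
    flags = []
    for t in times:
        # Values of all reports from this platform inside the window around t.
        nearby = [p for s, p in zip(times, pressures) if abs(s - t) <= half]
        # Reject only if there are enough reports and they are all identical.
        flags.append(len(nearby) >= min_count and len(set(nearby)) == 1)
    return flags
```

With daily reports that all read 1000.10 hPa, every report is flagged; with only two reports in the window, none is flagged because the minimum count is not reached.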
The background check rejects any observation whose departure from the background value is more than 7 times larger than expected, based on estimated background and observation error standard deviations. The maximum allowed observed pressure departure is capped at 120 hPa to account for the fact that background errors vary over time and space, and their estimates are not considered reliable in early years and in unobserved regions of the globe. Finally, a variational quality control check involving neighboring observations is applied as described by Tavolato and Isaksen (2015).
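A minimal sketch of such a background check follows. It assumes that the expected departure is the square root of the sum of the background and observation error variances (the usual result for uncorrelated errors) and that the 120-hPa cap acts as an upper bound on the rejection limit; both are illustrative assumptions, as is the function name.

```python
import math


def passes_background_check(obs, background, sigma_b, sigma_o,
                            factor=7.0, cap_hpa=120.0):
    """Return True if a pressure observation (hPa) passes the check.

    sigma_b and sigma_o are the estimated background and observation
    error standard deviations (hPa). Assumption: their variances add.
    """
    expected = math.sqrt(sigma_b ** 2 + sigma_o ** 2)
    # Assumption: the cap limits how large the allowed departure can get
    # when the error estimates themselves are very large.
    limit = min(factor * expected, cap_hpa)
    return abs(obs - background) <= limit
```

For error standard deviations of 1 hPa each, the limit is about 9.9 hPa, so a 5-hPa departure passes and an 11-hPa departure is rejected.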
c. Variational analysis
The assimilation system employs an incremental four-dimensional variational (4D-Var) analysis (Rabier et al. 2000) every 24 h. The 4D-Var generates an adjustment to the model state at 0900 UTC such that the subsequent 24-h model trajectory provides the best fit to observations. Extensive experimentation was conducted with the IFS to demonstrate that the use of a 24-h (rather than 12-h) interval gives slightly superior results when assimilating only surface observations, especially where data coverage is sparse. To control the growth of gravity waves during the longer window, a digital filter is applied on temperature and vorticity increments.
The variational analysis, especially when used for an extended climate reanalysis, must be supplied with information about spatially and temporally varying background errors. This information should reflect the dependence of errors on atmospheric dynamics and also account for the substantial changes in data coverage that have taken place during the twentieth century. To this end, the output of a previously produced 10-member ensemble of preliminary reanalysis experiments (Poli et al. 2013) was used to derive estimates of background error standard deviations. These error estimates were then used as input to the ERA-20C reanalysis (see Poli et al. 2015a). Note that this approach is suboptimal because the background errors of the prior ensemble spread necessarily differ from the background errors of the single-member production, since several factors were changed between the two (so as to resolve issues noted in the ensemble production).
Poli et al. (2013) provide a detailed description of the ensemble experiments used for error estimation, which employed a variant of ECMWF’s ensemble of data assimilation (EDA) technique (Isaksen et al. 2010). The ensemble was designed to represent key sources of uncertainty in a century-long climate reanalysis. Each member used a different plausible evolution of SST and sea ice (as given by the HadISST product), included simulated model errors (using a stochastic physics scheme), and accounted for uncertainties in the assimilated observations (by adding pseudorandom errors).
The variational analysis scheme used in ERA-20C produces slowly varying bias adjustments for all surface pressure observations (see Poli et al. 2015a, their section 2.4). Briefly, the bias estimates are generated separately for each station, each ship, and each reporting practice (station level, mean-sea level).
To complete production within a reasonable amount of time, the reanalysis was divided into overlapping 6-yr segments, all computed simultaneously. The first segment starts on 1 January 1899 and subsequent segments start in years ending in 4 or 9, with the final segment starting in 2004 and extending to 2010. The output was then consolidated after discarding the first year of each segment. The final product consists of 3-hourly fields and monthly averages of all variables for the period 1900–2010.
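The segmentation described above can be enumerated with a short sketch. The function and key names are hypothetical; the start years, segment length, and discarded first year follow the description in the text.

```python
def production_streams(first_start=1899, last_start=2004, step=5,
                       length=6, final_end=2010):
    """List the production segments and the years kept from each.

    Each segment runs for `length` years, except the final one, which
    extends to `final_end`. The first year of each segment is spinup
    and is discarded.
    """
    streams = []
    for start in range(first_start, last_start + 1, step):
        end = final_end if start == last_start else start + length - 1
        streams.append({
            "start": start,
            "end": end,
            "kept": list(range(start + 1, end + 1)),  # drop the spinup year
        })
    return streams
```

Under these settings the sketch yields 22 segments, and the kept years tile 1900–2010 exactly once, with each segment's spinup year coinciding with the last year of the preceding segment.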
The observation inputs chosen for ERA-20C follow a novel data policy for ECMWF reanalyses, namely to consider only observations that can be redistributed publicly. This is to ensure traceability of the observations and their impact on the final product, and to allow third-party benefit and feedback. To facilitate this, ERA-20C included the production of an observation feedback archive (OFA) designed from the outset for user investigations. A description of the ERA-20C OFA is given by Hersbach et al. (2015b). This archive is organized by observation report type and source. It contains the observations assimilated (surface pressure and 10-m wind) and other observations found in the source that can be handled by the IFS data assimilation. It also contains observation data that were not assimilated, as long as they were input to the data assimilation system. Such observations can assist in later exploitation of the observation feedback (e.g., by providing contextual information about the observation scene) or in investigations of the reanalysis quality. This includes ICOADS visual observations (present and past weather, visibility, cloud-base height, amounts, and types) and observations that can be exploited quantitatively (surface pressure tendency, air temperature and humidity, and seawater temperature). The observation feedback includes in particular the innovation (observation minus background) and the residual (observation minus analysis) for each assimilated observation. For some of the nonassimilated variables, such as air temperature, estimates of the reanalysis background and analysis are also available in observation space.
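As an illustration of the two departures stored in the feedback, the following sketch computes the innovation and residual for one observation, together with the mean and root-mean-square (RMS) of a list of departures. The function names are hypothetical and do not correspond to any OFA access tool.

```python
import math


def departures(observation, background, analysis):
    """Return (innovation, residual) for one observation."""
    innovation = observation - background  # observation minus background
    residual = observation - analysis      # observation minus analysis
    return innovation, residual


def mean_and_rms(values):
    """Return the mean and the root-mean-square of a list of departures."""
    n = len(values)
    mean = sum(values) / n
    rms = math.sqrt(sum(v * v for v in values) / n)
    return mean, rms
```

A mean close to zero with a nonzero RMS indicates departures that are unbiased on average but still scattered around the background or analysis.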
4. Assimilation performance
Between 1900 and 2010, the number of surface pressure observations assimilated per month in ERA-20C increases from about 30 000 to 3.6 million. These numbers correspond to observations that passed all quality control steps. Several of these steps are based on the background state (see section 2b above). Figure 1a shows that this growth is nearly continuous, regardless of the seams between the production streams. Note that a separate color indicates the spinup portion of each stream. This makes it possible to conclude, for example, that the drop in the number of observations in 1965 is not an artificial discontinuity between two production streams, but is continuous with the spinup, so the cause should lie in the observation data source.
The mean innovations in Fig. 1b all oscillate around zero, as expected and thanks to the bias correction. This bias correction is produced by the variational analysis, using prior bias estimates from the last time similar observations were assimilated. This can potentially result in discontinuities in the multistream production if the constraints are insufficient for the analysis to determine a unique solution, or if the iterative scheme retains a memory of initial conditions after the year of spinup (all bias estimates start at zero). The effectiveness of the 1-yr spinup can be gauged from this figure: the duration is sufficiently long to enhance the interstream continuity in terms of mean observation bias correction. Figure 1b shows that the mean bias correction is a smoothed estimator for the uncorrected innovation, as intended when designing the bias correction scheme. A few jumps, however, are visible between streams, such as in 1920, where a longer spinup would have allowed for a smoother continuity. The RMS of uncorrected innovations (the highest curve in Fig. 1c) features seasonal variations, resulting from seasonal variability and corresponding background error variations. As expected, these are not removed by the bias correction.
However, the fairly smooth decrease over time of the global RMS hides regional variations that are much greater. The map in Fig. 2a shows that the 1900 innovations are largest at the margin of the observed areas, namely in the Southern Oceans and the tip of South America, sometimes exceeding 10 hPa. In the North Atlantic storm track, the innovations decrease from an excess of 5 hPa RMS in 1900 to 2–4 hPa RMS in 1955 (Fig. 2b) and 1–2 hPa RMS in 2010 (Fig. 2c). Likewise, the RMS of wind component innovations also varies substantially from region to region, from upward of 5 m s⁻¹ in the North Atlantic and Pacific in 1900 to under 3 m s⁻¹ in 2010.
Synoptic evaluation with respect to independent observations
The decrease in RMS of innovations over time results from a great change in the observing system (Fig. 1a). As more observations of pressure and wind are assimilated, the quality of the analyses and subsequent backgrounds improves. However, innovation statistics are of little help to users looking for a quantified metric of reanalysis uncertainty for their application. To this end, we illustrate how the OFA can assist, quantifying as an example the reliability of the ERA-20C temperature record, using the independent air temperatures from ships and corresponding ERA-20C estimates in the OFA. By collocation, estimates from the control member of ERA-20CM and version 2c of 20CR (20CRv2c) can also be made for each observation. Figure 3a shows the proportion of variance (at each location and within the year 1900) in the observations explained by ERA-20CM. This variance explained (expressed as percentage) is mostly below 50%. A small improvement is visible in 2010 over the northern Pacific and Atlantic (Fig. 3d), indicating that the improved SST forcing is sufficient to explain some of the synoptic variability there.
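The text does not give a formula for the variance explained. One common choice, assumed here purely for illustration, is the squared Pearson correlation between the observations and the collocated estimates, expressed as a percentage. Note that this measure is insensitive to a constant bias between the two series.

```python
def variance_explained_percent(observed, estimated):
    """Squared Pearson correlation between two series, in percent.

    Assumption: 'variance explained' is taken as 100 * r**2; the paper
    may use a different definition.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    if var_o == 0.0 or var_e == 0.0:
        return 0.0  # a constant series explains no variance
    return 100.0 * cov * cov / (var_o * var_e)
```

In this sketch, the function would be applied separately at each location to the ship air temperatures of one year and the collocated values from ERA-20CM, ERA-20C, or 20CRv2c.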
The ERA-20C reanalysis, by assimilating surface pressure and wind observations, is expected to feature higher synoptic realism than ERA-20CM. Figure 3b shows that variance explained by ERA-20C exceeds variance explained by ERA-20CM in 1900 in most areas, especially in the Atlantic and Pacific midlatitudes. In the Southern Hemisphere, except for the much-traveled southern Atlantic and western Indian Oceans, variance explained by ERA-20C remains below 50%. In 2010, it reaches over 80% in most of the Northern Hemisphere extratropics and upward of 70% in some of the southern midlatitudes. The variance explained by collocated 20CRv2c (Figs. 3c,f) is comparable, although higher (lower) than ERA-20C in the Southern (Northern) Hemisphere in 1900 (2010). Slight differences can be noted between 20CR and ERA-20C in their abilities to reproduce the evolution of air temperatures from ships. We believe that these small differences are not fortuitous; in poorly observed regions, they reflect the superiority of an ensemble analysis system over a deterministic variational analysis system to produce adapted and optimal background errors. However, the next twentieth-century reanalysis of ECMWF will use an ensemble method, as initially tried in the preliminary ensemble production (Poli et al. 2013). It allows for self-adaptive background error correlations as the observing system improves, but will use an evolved method for the training dataset used in the correlation assessment, as developed by Bonavita et al. (2014).
In the tropics, possibly because of a lack of geostrophic balance, variances explained by 20CR and ERA-20C are as low as for ERA-20CM, indicating that 20CR and ERA-20C add little intra-annual realism to a model simulation in this region for the years shown, according to the ship observations. This poor performance can be explained by the tight interaction between the tropical atmosphere and tropical ocean temperatures, and by the fact that some of the products use SST forcing determined at monthly time scales. In future reanalyses, improvements are expected with daily SST forcing and atmosphere–ocean coupling.
5. Climate relevance
a. Air temperatures at the surface
Over land, Hersbach et al. (2015a) found by comparison with CRU Temperature Data, version 4 (CRUTEM4), that ERA-20C improves over ERA-20CM on a monthly time scale, but not on time scales of a year or longer. We present here a comparison over oceans, to air temperatures observed by ships, using the OFA. In spite of wide temporal variations in observation density (Fig. 4a), accentuated seasonally by the limitation to nighttime only (to avoid the known daytime bias of ship observations), ERA-20C/M values are steadily biased cold by about 1 K with respect to such observations (Fig. 4b). This bias is more stable over time than that of 20CRv2c, which also features large seasonal variations. Additionally, the smaller oscillations in ERA-20C departures as compared to ERA-20CM ones are indicative of synoptic variability in ERA-20CM that does not match observations (these oscillations are not artifacts of the method: ERA-20C collocation yields results that match the OFA contents).
All panels in Fig. 4b indicate a warm anomaly in the air temperature measurements from ships around World War II. This feature also affects the comparison with the model simulation, clearing the surface data assimilation of responsibility, but pointing to a discrepancy between seawater and air temperatures. Kennedy (2014) mentions that most SST observations at the time are assumed to be engine-room intake measurements. In parallel, Kent et al. (2013) suggest corrections to the air measurements from ships, reducing somewhat this anomaly (see their Fig. 9). However, their comparison to CRUTEM4 (see their Fig. 16) displays a remaining, unexplained, warm anomaly in ship air temperatures, whereas Hersbach et al. (2015a) do not find any comparable disagreement between ERA-20C/M and CRUTEM4 in 1940–45. The complexity of this unresolved problem is a reminder of the need for integrating findings from the observation community into comprehensive databases such as the ICOADS value-added database (Smith et al. 2011), to enable progress in improving our understanding of the climate record through comparisons with climate models or reanalyses via the OFA or other mechanisms such as Observations for Model Intercomparisons Project (Obs4MIPs) (Teixeira et al. 2014).
b. Water cycle
The atmospheric model in ERA-20C simulates several variables related to the water cycle. We consider first the rainfall, because it can be compared to the monthly analysis of rain gauge measurements from the Global Precipitation Climatology Centre (GPCC; Becker et al. 2013). The map in Fig. 5 shows that only a few regions have sufficient amounts of rain gauge data throughout the years 1900–2010. The 12-month running mean time series over Europe and Japan show a fair ability of ERA-20C to represent the interannual fluctuations of precipitation throughout the whole time period, with a noticeable improvement after 1945–50. However, over North America, a similar improvement comes later, around 1960. Before 1925, ERA-20C presents little or no realism in terms of precipitation anomalies in that region as well as in Japan and Australia.
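A 12-month running mean of a monthly series can be sketched as follows. The function name is hypothetical, and the alignment of each average (assigned here to the last month of its window rather than the center) is an assumption, since the text does not specify it.

```python
def running_mean(values, window=12):
    """Running mean of a monthly series.

    Returns len(values) - window + 1 averages; each one covers `window`
    consecutive months and is listed in order of its last month.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    total = 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]  # drop the month leaving the window
        if i >= window - 1:
            out.append(total / window)
    return out
```

Applied to monthly precipitation, the 12-month window removes the annual cycle so that interannual fluctuations stand out.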
Another component of the water cycle for which validating observations are available for long time periods is the total column water vapor. Figure 6a shows that ERA-20C/M and 20CRv2c feature dry biases with respect to observational products from Remote Sensing Systems (RSS; Wentz 2013) and Hamburg Ocean–Atmosphere Parameters and Fluxes from Satellite Data (HOAPS; Fennig et al. 2012). However, the anomalies relative to 1988–2008 in the latter products are better reproduced by ERA-20C/M and 20CRv2c than by JRA-55 or ERA-Interim (Fig. 6b). In addition, Fig. 6c indicates that ERA-20CM and ERA-20C present global averages in precipitation minus evaporation that are more stable than JRA-55, ERA-Interim, and 20CRv2c, although the latter is close to zero on average.
c. Climate indices
To gain a basic insight into the climate fidelity of ERA-20C we investigate a selection of common climate indices. First, we consider four indices calculated from monthly mean data. The Niño-3.4 index measures the equatorial Pacific SST anomalies (5°N–5°S, 190°–240°E) and, with regard to the El Niño–Southern Oscillation (ENSO) phenomenon, provides an indication of the likely influence of the ocean on the atmosphere. This knowledge is useful when interpreting the Southern Oscillation index. The anomalies in this region are associated with El Niño and La Niña. Figure 7a shows that the Niño-3.4 indices from reanalyses (ERA-20C, 20CRv2c, JRA-55, and ERA-Interim) and the model simulation (control member of ERA-20CM) all have similar variations. All the products considered here use prescribed SST, but the SST inputs differ, although the ERA-20CM control member and ERA-20C used identical SST inputs. Sometimes the magnitudes of the extrema are different, particularly earlier in the twentieth century. Clearly visible are the strong El Niños of 1982/83 and 1997/98 and the strong La Niñas of 1916–18, 1973–76, and 1988/89.
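A box average over the Niño-3.4 region can be sketched as below. The sketch assumes a regular latitude–longitude grid, an input field that already contains SST anomalies, and cosine-of-latitude area weights; none of these details is specified in the text, and the function name is hypothetical.

```python
import math


def box_mean(field, lats, lons, lat_min=-5.0, lat_max=5.0,
             lon_min=190.0, lon_max=240.0):
    """Area-weighted mean of field[j][i] over a latitude-longitude box.

    lats[j] and lons[i] are in degrees; longitudes may be given in
    either the -180..180 or the 0..360 convention.
    """
    num = 0.0
    den = 0.0
    for j, lat in enumerate(lats):
        if not (lat_min <= lat <= lat_max):
            continue
        weight = math.cos(math.radians(lat))  # grid-cell area shrinks with latitude
        for i, lon in enumerate(lons):
            if lon_min <= (lon % 360.0) <= lon_max:
                num += weight * field[j][i]
                den += weight
    if den == 0.0:
        raise ValueError("no grid points inside the box")
    return num / den
```

Calling the function once per month on the SST anomaly field gives a monthly index time series for the default box.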
The Southern Oscillation index (SOI) shown in Fig. 7b is a measure of the local atmospheric component of ENSO. It is calculated from the normalized anomaly of surface pressure difference between Tahiti and Darwin (where the values are represented by the nearest grid point values), with positive (negative) values generally indicating La Niña (El Niño). The various reanalyses are in reasonable agreement on the state of the SOI, particularly from the early 1980s onward. This agreement occurs because of the relatively high degree of anticorrelation with the Niño-3.4 SST index, the specification of which does not differ a great deal between the various reanalyses. ERA-20C and 20CRv2c overemphasize minima in 1972 and 1979/80 respectively, by more than half a standard deviation. ERA-20C does not properly capture the modest maximum in 1967/68, retaining negative values. Prior to 1940, ERA-20C and 20CRv2c differ more than after 1940, with ERA-20C tending to have higher values. This difference is most marked in the first decade of the twentieth century, when the SOI was mostly negative. Although there are no observational constraints on surface pressure in the model simulation, the SOI of the latter is not too dissimilar to those of the reanalyses, implying that the SOI is influenced by the specified SST.
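The SOI calculation can be sketched as follows. Several variants of the index exist; this sketch assumes that anomalies are taken relative to a calendar-month climatology computed over the whole series and that normalization uses the standard deviation of all anomalies. These choices, and the function names, are illustrative assumptions.

```python
import math


def normalized_anomaly(series, period=12):
    """Remove the mean of each calendar month, then divide by the
    standard deviation of the resulting anomalies."""
    if len(series) < period:
        raise ValueError("series must cover at least one full period")
    climatology = []
    for m in range(period):
        month_values = series[m::period]
        climatology.append(sum(month_values) / len(month_values))
    anomalies = [v - climatology[k % period] for k, v in enumerate(series)]
    std = math.sqrt(sum(a * a for a in anomalies) / len(anomalies))
    if std == 0.0:
        raise ValueError("series has no variability")
    return [a / std for a in anomalies]


def soi(p_tahiti, p_darwin, period=12):
    """Normalized anomaly of the Tahiti-minus-Darwin pressure difference."""
    difference = [t - d for t, d in zip(p_tahiti, p_darwin)]
    return normalized_anomaly(difference, period)
```

With this sign convention, positive values correspond to higher-than-usual pressure at Tahiti relative to Darwin, which is consistent with positive values generally indicating La Niña.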
Figure 7c shows the North Atlantic Oscillation (NAO) index. This index measures the relative variation in atmospheric mass in the North Atlantic between the subtropical and high latitudes. It is calculated from the difference between the normalized surface pressure anomalies at the nearest grid point values to 37.5°N, 25.8°W and 65.1°N, 22.7°W. Here, average values are shown for the extended boreal winter (December–March). Consistency between the reanalyses is good in the second half of the century. All the reanalyses investigated show winter 2009/10 having the largest magnitude minimum in the period 1900–2010, with values more than three standard deviations below average. In the period before the 1950s, there are larger discrepancies between ERA-20C and 20CRv2c, which are most obvious at times of maxima, when the surface pressure is relatively low at high latitudes, and vary up to about half a standard deviation, with ERA-20C tending to have lower values. The NAO index of the model simulation has periods when the variability is similar to that of the reanalyses, in the mid to late 1980s for example. However, often the variability is different, implying that the observational constraints (which arise from the specification of the various forcings such as SST, carbon dioxide, aerosols, etc. in the model) are not usually strong enough to yield a realistic NAO index, which is influenced by many factors.
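The two steps of this calculation, selecting the nearest grid points and differencing the normalized anomalies, can be sketched as below. The sketch assumes a regular latitude–longitude grid, input series that are already December–March means (one value per winter), and a south-minus-north sign convention; all three are assumptions, as are the function names.

```python
import math


def nearest_grid_point(lats, lons, lat, lon):
    """Indices (j, i) of the grid point closest to (lat, lon) on a
    regular grid, treating longitude as periodic."""
    def lon_distance(a, b):
        d = abs((a - b) % 360.0)
        return min(d, 360.0 - d)

    j = min(range(len(lats)), key=lambda k: abs(lats[k] - lat))
    i = min(range(len(lons)), key=lambda k: lon_distance(lons[k], lon))
    return j, i


def standardize(values):
    """Subtract the mean and divide by the standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        raise ValueError("series has no variability")
    return [(v - mean) / std for v in values]


def nao_index(p_south, p_north):
    """Difference of normalized winter-mean pressure anomalies."""
    return [s - n for s, n in zip(standardize(p_south), standardize(p_north))]
```

In this sketch, `p_south` and `p_north` would be the winter-mean surface pressure series extracted at the grid points nearest to 37.5°N, 25.8°W and 65.1°N, 22.7°W, respectively.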
The Pacific–North American (PNA) index shown in Fig. 7d measures one of the major modes of low-frequency variability in the Northern Hemisphere and is influenced by various factors including ENSO. It is calculated from a linear combination of normalized geopotential height anomalies at 500 hPa at four different locations (Wallace and Gutzler 1981). The PNA index is positive when geopotential heights are relatively high over Hawaii, low over the North Pacific, high over Alberta, and low over the southeastern United States. From the 1980s onward the PNA indices from the reanalyses are mostly consistent, apart from occasional differences at extrema. During the 1960s and 1970s both ERA-20C and 20CRv2c tend to have higher values than JRA-55, with smaller-magnitude minima in particular. Between about 1908 and 1938 the variations are similar but in general ERA-20C has higher values than 20CRv2c. The PNA index of the model simulation has periods when the variability is similar to that of the reanalyses. However, often the variability is different, implying that the observational constraints are not usually strong enough to yield a realistic PNA index.
Using daily data, the Madden–Julian oscillation (MJO) is diagnosed using the Wheeler and Hendon (WH) index (Wheeler and Hendon 2004). It consists of projecting the analysis fields into precomputed combined empirical orthogonal functions of zonal wind at 200 and 850 hPa and outgoing longwave radiation. Because the ERA-20CM archive does not include the daily fields required, this diagnostic could not be computed from that model simulation. Showing only decades fully covered by each reanalysis, Fig. 8a suggests that ERA-20C, JRA-55, and ERA-Interim display similar MJO amplitudes over the last three decades. However, the mean amplitude of the MJO in 20CRv2c is consistently lower than that of other reanalyses throughout the century. The correlation between the interdecadal variability of MJO amplitude in ERA-20C (red curve in Fig. 8a) and 20CRv2c (black dotted curve in Fig. 8a) is about 0.85. At this point it is unclear whether the greater MJO amplitude in ERA-20C results from the assimilation of surface wind observations, or whether it is an inherent quality of the ECMWF AGCM. To help address that question, the ability to diagnose the MJO from daily fields will be considered when configuring the archive in future model simulations.
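The projection step can be illustrated with a minimal sketch. It assumes that the combined daily field and the two leading precomputed EOFs are supplied as flat vectors of equal length, and that the amplitude is the Euclidean norm of the two principal components, as is usual for the WH index. The preprocessing and normalization steps of the full method are omitted, and the function names are hypothetical.

```python
import math


def project(field, eof):
    """Dot product of a combined field vector with one EOF pattern."""
    return sum(f * e for f, e in zip(field, eof))


def mjo_amplitude(field, eof1, eof2):
    """Return (pc1, pc2, amplitude) for one day.

    Assumption: `field` has already been preprocessed (anomalies formed
    and each variable normalized) in the same way as the data from which
    the EOFs were computed.
    """
    pc1 = project(field, eof1)
    pc2 = project(field, eof2)
    return pc1, pc2, math.hypot(pc1, pc2)
```

Averaging the daily amplitude over a decade gives a quantity comparable to a decadal mean MJO amplitude.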
The resulting amplitudes of the first two principal components (PC1 and PC2) of the MJO are also compared between ERA-20C and the other reanalyses 20CRv2c, JRA-55, and ERA-Interim. Figure 8b (Fig. 8c) shows that the linear correlation between the MJO PC1 (PC2) time series of ERA-20C and 20CRv2c increases during the century, from around 0.6 in the 1900s to about 0.9 in recent decades.
In conclusion, the four monthly climate indices considered, as well as the MJO WH index, show excellent agreement between ERA-20C and other products for the most recent period (especially after 1980) but larger discrepancies for times and regions (e.g., the Pacific, before 1940) where observation coverage is sparser.
ERA-20C is ECMWF’s first atmospheric reanalysis specifically designed for climate applications. The reanalysis covers the period 1900–2010, and, unlike previous ERA projects, assimilates observations that, for the most part, have not previously been used for numerical weather prediction. ERA-20C uses a recent version of ECMWF’s operational forecasting system, but substantial modifications were made to the specification of model boundary conditions and forcing data. Various adjustments to the data assimilation methodology were introduced in order to address the special challenges associated with quality control, bias correction, and statistical analysis of sparsely distributed near-surface observations.
Several steps were taken to ensure that users as well as producers of reanalysis products have access to the information needed to assess the information content and uncertainties in the reanalysis. A dedicated set of model simulations, ERA-20CM, was separately produced to help separate the role of the model and the impact of observations in ERA-20C. Monthly means from the model simulations, as well as 3-hourly fields for a few selected variables, have been archived and are available to users. In addition, an observation feedback archive (OFA) has been created that contains all (roughly 2 × 10⁹) observations assimilated in ERA-20C, each supplemented with information about source, instrumentation, geolocation, quality control, bias adjustment, and departure from background estimates and reanalyzed equivalent. Tools for accessing the data from ERA-20C, the OFA, and ERA-20CM are available at ECMWF’s WebAPI at http://apps.ecmwf.int.
In this article we have shown a few selected examples of the use of ERA-20C for climate and weather applications, as well as several promising results (e.g., in the representation of the MJO, and a larger fraction of 2-m temperature observation variance explained by ERA-20C than by ERA-20CM in most extratropical areas). Some shortcomings have surfaced, such as the slight negative impact of the data assimilation on trends and low-frequency variability. The quality control of observations has led to inadvertent exclusion of many best-track reports of tropical cyclones. More details on this and other issues are described in a technical report (Poli et al. 2015a, their section 7). In addition, we draw the attention of users to the fact that even analyzed fields such as surface pressure may suffer from spurious low-frequency signals when the observational coverage changes significantly. For example, over latitudes between 60° and 90°S, the mean sea level pressure decreases by more than 5 hPa during the course of the twentieth century, but this unexplained trend is not found in ERA-20CM (Poli et al. 2015a, their section 8.3). We expect that users of the data will report many other strengths and weaknesses of the dataset.
When attempting to answer the frequently asked question “is ERA-20C suitable for my application?” several points need to be considered, as for other reanalyses (e.g., Dee et al. 2016). The decisive factors are the geophysical variable and the temporal domain of interest (i.e., climatic features or case studies). Given these two elements, two further questions are relevant: the extent of the observational constraints and the realism of the AGCM and its forcings. Figure 9 proposes a path diagram indicating these relationships in ERA-20C for a selection of essential climate variables (ECVs). The diagram is to be read from left to right. For each ECV presented, the corresponding observational constraints are listed. From there, an assessment is proposed for the likely realism of ERA-20C for case study and climate trend analysis. Last, the diagram suggests practical methods to check this realism. Overall, the figure indicates how the observation feedback archive and the model simulation (ERA-20CM) can assist users in their exploitation of ERA-20C.
The ERA-20C reanalysis is a stepping stone toward the longer-term objective of improving extended climate reanalyses by taking maximum advantage of the available instrumental record (Dee et al. 2014). This requires collaborative work on data rescue and data reprocessing, new research in coupled data assimilation (e.g., Laloyaux et al. 2016), and improved access to observations for use in climate science and applications. Current ECMWF efforts in this area, framed by the ERA-CLIM2 project, are focused on using newly available twentieth-century upper-air observations in atmospheric reanalysis, and on development of a first fully coupled atmosphere–ocean reanalysis of the twentieth century.
ERA-20C benefits from support of the European Reanalysis of Global Climate Observations Project (ERA-CLIM) and follow-on ERA-CLIM2, respectively funded by European Union (EU) research FP7 Grant Agreements 265229 and 607029. ERA-20C benefits also from significant in-kind contribution from ECMWF, for high-performance computing, archive, data services, and supplementary staff support. We thank NOAA and NCAR for providing the International Comprehensive Ocean–Atmosphere Data Set (ICOADS) v2.5.1 and International Surface Pressure Databank (ISPD) v3.2.6 observational datasets. We thank Gil Compo and Chesley McColl for useful advice and provision of ISPD v3.2.6. We are grateful to Rob Allan and his collaborators, including ERA-CLIM project partners, for their efforts in the area of historical observation data rescue, in particular the Atmospheric Circulation Reconstructions over the Earth (ACRE) initiative, which facilitates and undertakes data rescue activities essential for the continuous improvement of ISPD and ICOADS. The HadISST2.1.0.0 dataset was provided by the Met Office Hadley Centre with partial funding from ERA-CLIM. The 20CRv2c data were acquired from doi:10.5065/D6N877TW. JRA-55 data were acquired from JMA and from doi:10.5065/D60G3H5B. GPCC data were acquired from doi:10.5676/DWD_GPCC/FD_M_V7_100. RSS data were acquired from http://remss.com. HOAPS data were acquired from doi:10.5676/EUM_SAF_CM/HOAPS/V001. We thank the ERA-CLIM advisory board members for fruitful comments and suggestions during the preparation stages of ERA-20C: Phil Jones, Michele Rienecker, Mark Serreze, Sakari Uppala, and Robert Vautard. We also thank David Bromwich and two anonymous reviewers, whose comments helped improve this manuscript. ERA-20C and ERA-20CM data shown in this manuscript (gridded fields and observation feedback archive) are available from http://www.ecmwf.int.
Current affiliation: Météo-France, Centre de Météorologie Marine, Brest, France.