The homogeneity of the ECMWF 40-yr Re-Analysis (ERA-40) is assessed. This is done by comparing ERA-40 data with results from the NCEP–NCAR reanalysis and also by investigating a known relationship between a modeled (latent heat flux) and an external (SST) quantity. The direct comparison between the two reanalyses reveals numerous inhomogeneities, which occur mainly in the Southern Hemisphere and before 1980. While observational density was sufficient to effectively constrain the models in the Northern Hemisphere, it was not in the Southern Hemisphere. From the investigation of the relationship between latent heat flux and SST it is found that, because of an increasing amount of data, the reanalysis results become more reliable toward the end of the reanalysis period (approximately after 1980). When using the reanalysis data to investigate climate change issues, care has to be taken not to confuse the inhomogeneities with real changes.
Atmospheric reanalysis projects are performed to get a description of the atmosphere that is free of inhomogeneities caused by a change in the analysis model. The results of a reanalysis are, however, prone to inhomogeneities caused by changes in the observing system (e.g., Uppala 1997; Kistler et al. 2001). Owing to the introduction of satellites, the number of observations to be assimilated has increased tremendously during the last 50 yr (see Fig. 1 of Kistler et al. 2001). As most readily available observations are used in the reanalysis, few independent data are available to quantify the degree of inhomogeneity introduced by the increasing number of observations.
To overcome this problem, two approaches are used in this paper. In the first, the results from two different reanalysis projects are compared; in the second, a known relationship between a model variable (latent heat flux, Qlat) and a nonmodel quantity [sea surface temperature (SST)] is investigated.
A reanalysis is effectively a model run constrained by observations. If only few observations are available, the constraint is weak and the model essentially produces its own intrinsic variability. With an increasing amount of observations available, the model is more and more forced to follow the observed variability rather than its own intrinsic one. Assuming that different models have a different intrinsic variability makes it possible to determine those parts of the spatiotemporal domain in which at least one of the reanalyses does not display the correct variability. This is done by comparing results from the two reanalyses. If they agree, the observational constraint is large enough to force the models to follow the “real” variability of the atmosphere. If they do not agree, the constraint is too weak and at least one of the two reanalyses does not reproduce the real variability. It could be that one of the models is much better than the other so that the observational constraint is strong enough for this model, but not for the other. Even then, regions of large differences between the models are likely to be regions where the better of the reanalyses is less accurate than elsewhere. Paraphrasing, we can say that disagreement between the two reanalyses means insufficient observational coverage.
In the extratropics, the time rate of change of SST, ∂tTs, and Qlat are known to be highly correlated (Cayan 1992). SST is not a model variable, but prescribed from observations, and in the extratropics the atmosphere is known to be only marginally influenced by local SST (Kushnir et al. 2002). Therefore, a high correlation between (observed) ∂tTs and (modeled) Qlat can only be attained if the atmosphere is sufficiently constrained. Investigating the change in time of the correlation between the two quantities therefore gives insight into when and where the atmosphere is sufficiently constrained by observations.
The two reanalyses considered are the well-known one that has been conducted at the National Centers for Environmental Prediction (NCEP) in collaboration with the National Center for Atmospheric Research (NCAR) (Kalnay et al. 1996), and the ERA-40 reanalysis, which has recently been finished at the European Centre for Medium-Range Weather Forecasts (ECMWF) (Simmons and Gibson 2000). The NCEP–NCAR reanalysis covers the period from 1948 until now and is continually updated, while ERA-40 presently covers the period September 1957 to August 2002. A near real-time continuation, following the NCEP–NCAR example, is planned for the near future.
In this paper monthly means and monthly anomalies from both reanalyses are used. The anomalies are created in the usual way by subtracting a monthly climatology from each individual monthly mean. A possible linear trend is also removed.
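The anomaly construction described above can be sketched as follows. This is a minimal illustration, not the code used for the paper: the function name monthly_anomalies, the requirement of a whole number of years, and the choice to remove the least-squares linear trend after (rather than before) the climatology are assumptions of the sketch.

```python
import numpy as np

def monthly_anomalies(monthly_means, detrend=True):
    """Return monthly anomalies of an array shaped (n_months, ...).

    Assumptions of this sketch: the series covers a whole number of
    years and starts with the first month of the annual cycle.
    """
    x = np.asarray(monthly_means, dtype=float)
    n = x.shape[0]
    if n % 12 != 0:
        raise ValueError("expected a whole number of years")
    # Monthly climatology: mean over all years for each calendar month.
    by_year = x.reshape(n // 12, 12, *x.shape[1:])
    climatology = by_year.mean(axis=0)
    anom = (by_year - climatology).reshape(x.shape)
    if detrend:
        # Remove a least-squares linear trend at every grid point.
        # The anomalies already have zero mean, so only a slope is fitted.
        t = np.arange(n, dtype=float)
        t -= t.mean()
        flat = anom.reshape(n, -1)
        slope = (t @ flat) / (t @ t)
        anom = (flat - np.outer(t, slope)).reshape(x.shape)
    return anom
```

Applied to an array of shape (time, latitude, longitude), the function returns an array of the same shape whose time mean and linear trend vanish at every grid point.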
a. Atmospheric dynamics
The basic dynamical variable is pressure. If the pressure fields of two models differ, most other fields will differ, too. Figure 1 shows three time–latitude sections of the normalized differences of sea level pressure (SLP) between the two reanalyses, while Fig. 2 shows the same information for the 500-hPa height (z500). Normalization is done by dividing the differences by the local standard deviation as determined from the ERA-40 data. The sections are taken along 120°W, 20°W, and 160°E. These longitudes were chosen so as to lie as much as possible over the ocean, where data coverage is worst. Of the three sections chosen, only that at 120°W has a significant part over land (north of 35°N).
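The normalization can be illustrated with a short sketch. The function name normalized_difference is hypothetical, and two details are assumptions: the sign convention (ERA-40 minus NCEP–NCAR) and the use of the sample standard deviation of the full ERA-40 monthly series at each point. The text does not specify either.

```python
import numpy as np

def normalized_difference(era40, ncep):
    """Difference of two fields scaled by the local ERA-40 std.

    Both inputs are arrays shaped (n_months, ...) on a common grid.
    Sign convention and the exact definition of the standard deviation
    are assumptions of this sketch.
    """
    era40 = np.asarray(era40, dtype=float)
    ncep = np.asarray(ncep, dtype=float)
    # Local standard deviation along the time axis, from ERA-40 only.
    sigma = era40.std(axis=0, ddof=1)
    # Avoid division by zero where the ERA-40 series does not vary.
    sigma = np.where(sigma > 0, sigma, np.nan)
    return (era40 - ncep) / sigma
```

A time–latitude section such as those in Figs. 1 and 2 would then correspond to selecting one longitude from the result; with a (time, latitude, longitude) layout and a hypothetical longitude index j, that is result[:, :, j].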
At all three sections the differences between the two reanalyses are systematic; that is, their sign usually does not change with time. Also, an annual cycle is clearly visible nearly everywhere. At the surface (SLP) the sign of the differences changes with latitude and longitude, while in the middle atmosphere (z500) it only depends on latitude. These systematic differences are present throughout the whole period and therefore are not due to differences in data used. Rather, they represent differences in the models. Such differences may occur in the physical parameterizations, the representation of the dynamics, or the assimilation system. For SLP different formulas for extrapolation to sea level may also be a reason.
While the sign of the differences generally does not change over time, significant changes in their magnitude are clearly visible. They occur both gradually and suddenly. The gradual changes are mainly recognized in the Southern Hemisphere (SH) and probably indicate an increasing amount of conventional data available. The increase may come from more ships operating in the SH or from an increasing number of meteorological stations. Sudden changes may have two causes. First, the data assimilated in the two reanalyses are different. ERA-40 was conducted later than the NCEP–NCAR reanalysis, and new datasets have been used (Simmons and Gibson 2000). If these datasets have a relatively homogeneous data density over a certain period of time, the beginning and end of the period should show up as a sudden inhomogeneity in the difference plots. The second cause of a sudden change is a large dataset (satellite) becoming available in both reanalyses. Both models will then be more constrained, resulting in a reduction of the differences. One such obvious change occurs in 1979, when a large amount of new data became available (e.g., Kistler et al. 2001, their Fig. 1; Sturaro 2003), most of it from satellites. Other sudden changes are visible in 1967, 1969, 1973 (all around 40°N), 1976 (around 40°N and poleward of 60°S), and 1993 (near the South Pole). Most of these changes can be recognized at all longitudes and both at the surface (SLP) and in the middle atmosphere (z500).
The differences between the two reanalyses seem to be largest in the Tropics. However, as variability is low in these regions, the absolute (unscaled) differences are much smaller. Apart from the tropical belt, the largest differences occur in the SH, especially in the region south of 40°S. Their magnitude decreases from high values in the earlier years to the low level found in the other areas toward the end of the period considered. This result tells us that in the Northern Hemisphere (NH) the conventional observation coverage was large enough throughout the reanalysis period to sufficiently constrain the model, while in large parts of the SH only the introduction of satellites achieved the necessary data density. An exception to this general picture is the region around Australia (sections at 160°E) where differences are significantly lower than at the other sections, reflecting the availability of measurements on that continent.
To get an impression of the spatial distribution of insufficient data coverage, Fig. 3 shows correlations between the monthly SLP anomalies from the two reanalyses for three different periods. In the earlier period (1958–67, Fig. 3a) the correlations are small in large parts of the Southern Hemisphere, especially over the ocean, indicating insufficient data coverage in that area. In the central period (1973–81, Fig. 3b) correlations in the SH have increased dramatically, and in the most recent period (1989–99, Fig. 3c) correlations are high everywhere except for some mountainous regions like Tibet. Additionally, correlations are notably low over Africa, indicating that data coverage is still too low over that continent. Satellite-derived estimates of near-surface dynamical quantities (most notably wind) are only available over the sea. Corresponding plots for z500 (not shown) show essentially the same patterns as does SLP, but correlations are generally higher than for SLP.
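Correlation maps of this kind can be computed along the following lines. The sketch assumes both anomaly fields are available on a common grid as arrays shaped (time, ...), with a companion array giving the calendar year of each time step. The function names are hypothetical.

```python
import numpy as np

def pointwise_correlation(a, b):
    """Pearson correlation along axis 0 (time) at every grid point."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    cov = (a * b).sum(axis=0)
    norm = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))
    # Points without any variability are returned as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(norm > 0, cov / norm, np.nan)

def correlation_by_period(a, b, years, periods):
    """Correlation maps for several inclusive (start_year, end_year) periods."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    years = np.asarray(years)
    maps = {}
    for start, end in periods:
        in_period = (years >= start) & (years <= end)
        maps[(start, end)] = pointwise_correlation(a[in_period], b[in_period])
    return maps
```

For the three periods of Fig. 3, the call would be correlation_by_period(slp_era40, slp_ncep, years, [(1958, 1967), (1973, 1981), (1989, 1999)]), where slp_era40 and slp_ncep are placeholder names for the two anomaly arrays.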
It thus appears that in the Southern Hemisphere the two reanalysis models were free to develop their own variability in the early part of the period considered, rather than to follow the real variability. Therefore, care has to be taken when studying variability in that region. Characteristics of the displayed variability may differ from reality in that period, and changes over time may reflect observational-based inhomogeneities rather than natural changes.
b. Atmosphere–ocean coupling
The inhomogeneities in the reanalyses also influence the inferred air–sea coupling. This is shown here for the relationship between anomalies of latent heat flux (Qlat) and anomalous changes of SST (∂tTs, calculated from the monthly mean SST anomalies). As Cayan (1992) has shown, both are highly correlated in the extratropics with Qlat driving SST changes. Beforehand, it is not clear whether such a relation should also hold in a reanalysis, where SST is prescribed and cannot react to Qlat. Figure 4 shows correlations between the two quantities for the same three different periods as before. The correlations are high in all periods in the Northern Hemisphere, where maxima exceed 0.6. These values are only slightly lower than those found by Cayan (1992) for the winter months (maxima reaching 0.8), while the present analysis includes all months. In the SH the correlations are low in the earlier periods but reach a level comparable to that in the NH in the most recent period. This development confirms the findings from the comparison between the two reanalyses in the foregoing section.
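As an illustration of how the two quantities might be related at a single grid point, the sketch below estimates ∂tTs from monthly SST anomalies and correlates it with Qlat anomalies. The centered difference over neighboring months is an assumption, since the text only states that ∂tTs is calculated from the monthly mean SST anomalies. The function names are hypothetical, and the sign of the resulting correlation depends on the sign convention of Qlat, which is not fixed here.

```python
import numpy as np

def sst_tendency(sst_anom):
    """Centered-difference tendency (per month) of a monthly series.

    The result is defined for the interior months 1 .. n-2 only.
    Using a centered difference is an assumption of this sketch.
    """
    sst = np.asarray(sst_anom, dtype=float)
    return 0.5 * (sst[2:] - sst[:-2])

def tendency_flux_correlation(sst_anom, qlat_anom):
    """Correlation between the SST tendency and Qlat at one grid point."""
    dts = sst_tendency(sst_anom)
    # Drop the first and last month so both series cover the same months.
    q = np.asarray(qlat_anom, dtype=float)[1:-1]
    return np.corrcoef(dts, q)[0, 1]
```

Evaluating this at every grid point and separately for each period would give maps comparable in spirit to Fig. 4.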
It is interesting to note that, while a similar comparison for the NCEP–NCAR reanalysis essentially gives the same pattern as shown in Fig. 4, the correlations themselves are lower in the earlier periods, especially in the NH (see also Sterl 2001).
It is of course very unlikely that the relation between Qlat and ∂tTs really changed during the course of the ERA-40 period. What changed, however, is the amount and quality of the data that was fed into the reanalysis. This has two effects on the correlation between Qlat and ∂tTs. First, as shown in the foregoing section, the phase of the atmospheric variability is not properly determined when data coverage is too low. Latent heat flux is largely determined by atmospheric fields (wind and humidity). Therefore, any correlation between Qlat and SST would be lost, even if the SST were perfect. Second, however, due to the few SST observations available in the early period in the Southern Hemisphere, the variability of the reconstructed SST field that is used as the lower boundary condition in ERA-40 is much too low. One cannot expect large correlations with a quantity that hardly varies.
The low variability of SST is illustrated in Fig. 5, showing the variances of SST and of latent heat flux for two periods. While the variances of Qlat in both periods are comparable, that of SST increased considerably in the Southern Hemisphere and along the equator in the Pacific Ocean. The latter may be real, as the period 1989–99 contains two major and some minor El Niños. To get an impression of the temporal evolution of the variability of SST, Fig. 6 shows a time series of anomalies of ∂tTs at a point (50°S, 120°W) in the southeastern Pacific, the region with the lowest data coverage. The variability of this time series clearly increases after 1981. In that year the SST product used changed from one purely based on in situ measurements to one incorporating satellite retrievals (Reynolds and Smith 1994), thus greatly increasing the amount of data.
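A comparison of variances between periods can be sketched as follows. The function name variance_by_period and the array layout (time first, with a companion array of calendar years) are assumptions of the sketch, and the period boundaries passed in are chosen by the caller.

```python
import numpy as np

def variance_by_period(anomalies, years, periods):
    """Sample variance along time for inclusive (start_year, end_year) periods."""
    x = np.asarray(anomalies, dtype=float)
    years = np.asarray(years)
    variances = {}
    for start, end in periods:
        in_period = (years >= start) & (years <= end)
        # ddof=1 gives the unbiased sample variance within the period.
        variances[(start, end)] = x[in_period].var(axis=0, ddof=1)
    return variances
```

The ratio of the variances from a late and an early period then gives a simple measure of how strongly the variability of a field such as SST changed, either as a map or for a single time series like that in Fig. 6.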
So while the amplitude of the variability of latent heat flux, which is mainly model generated (depending on wind and humidity), remains more or less the same during the ERA-40 period, that of the imposed SST forcing changed dramatically. Evidently, the amplitude of the model's variability depends neither on the amount of atmospheric observations (only its phase does) nor on the variability of the underlying SST field. The latter result implies that extratropical SST does not drive atmospheric variability. This conclusion is in line with other studies on the forcing of atmospheric variability by extratropical SSTs (see Kushnir et al. 2002 and references therein).
4. Summary and conclusions
Some aspects of the homogeneity of atmospheric reanalyses have been investigated. While the emphasis was on the recently finished ERA-40 reanalysis, the results are equally valid for the NCEP–NCAR reanalysis.
Two approaches to assess the homogeneity of the reanalysis data have been followed. In the first approach, pressure data (SLP and z500) from the two reanalyses have been compared. Differences were found to be large in the early part of the reanalyses in the Southern Hemisphere, but decreasing toward the end, while in the Northern Hemisphere they had a constant low level throughout the period. This is an indication that observation density in the NH was high enough throughout the reanalysis period to fully constrain the models, while due to the lack of data at least one of the models was free to follow its own intrinsic variability in the SH during the early years of the reanalysis.
In the second approach, it was investigated whether the known correlation between the change of SST and latent heat flux is correctly reproduced by the reanalysis data. While the correlation is high in the NH throughout the whole period, it is absent in the SH during the early years, only reaching a level comparable to that in the NH after 1981, when the SST data significantly improved due to the incorporation of satellite-derived SSTs.
Both approaches lead to the same conclusions regarding the homogeneity of the ERA-40 data. In the NH data coverage was large enough during the whole period to effectively constrain the model and avoid large inhomogeneities. In contrast, data coverage in the SH was much too low before about 1980, and the model was essentially free to develop its own variability. Only through the availability of satellite data from 1979 onward was a data coverage reached that was sufficient to effectively constrain the model. Therefore, care has to be taken when the ERA-40 data are to be used to study variability. Short-term variability in the SH may be nearly unrelated to the real variability that occurred, while long-term variability might reflect the increasing amount of data rather than true changes. Homogeneity can only be assumed after 1980. In the NH these problems are much less severe.
The above conclusions are likewise true for the NCEP–NCAR reanalysis. However, the comparison between the two reanalyses also revealed some systematic differences between them. Their origin must lie in differences in model formulation (dynamics, physics, assimilation). An investigation into these differences is beyond the scope of this note.
The NCEP–NCAR reanalysis data were obtained from http://ingrid.ldeo.columbia.edu/SOURCES maintained at the International Research Institute for Climate Prediction (IRI). The ERA-40 data were obtained from ECMWF at http://data.ecmwf.int/data/d/era40. I thank the ERA-40 team at ECMWF for their continuous support. The plotting was done with the free Ferret software developed by NOAA/PMEL/TMAP. I thank Camiel Severijns for software support. The work was funded by EU under the ERA-40 project (Grant EVK2-CT-1999-00027).
Corresponding author address: Dr. Andreas Sterl, Royal Netherlands Meteorological Institute (KNMI), P.O. Box 201, NL-3730 AE De Bilt, Netherlands. Email: email@example.com