In this study, surface and radiosonde data from staffed Antarctic observation stations are compared to output from five reanalyses [Climate Forecast System Reanalysis (CFSR), 40-yr ECMWF Re-Analysis (ERA-40), ECMWF Interim Re-Analysis (ERA-Interim), Japanese 25-year Reanalysis (JRA-25), and Modern Era Retrospective-Analysis for Research and Applications (MERRA)] over three decades spanning 1979–2008. Bias and year-to-year correlation between the reanalyses and observations are assessed for four variables: mean sea level pressure (MSLP), near-surface air temperature (Ts), 500-hPa geopotential height (H500), and 500-hPa temperature (T500).
It was found that CFSR and MERRA are of a sufficiently high resolution for the height of the orography to be accurately reproduced at coastal observation stations. Progressively larger negative Ts biases at these coastal stations are apparent for reanalyses in order of decreasing resolution. However, orography height bias cannot explain large winter warm biases in CFSR, JRA-25, and MERRA (11.1°, 10.2°, and 7.9°C, respectively) at Amundsen–Scott and Vostok, which have been linked to problems with representing the surface energy balance.
Linear trends in the annual-mean T500 and H500 averaged over Antarctica as a whole were found to be most reliable in CFSR, ERA-Interim, and MERRA, none of which show significant trends over the period 1979–2008. In contrast JRA-25 shows significant negative trends over 1979–2008 and ERA-40 gives significant positive trends during the 1980s (evident in both T500 and H500). Comparison to observations indicates that the positive trend in ERA-40 is spurious. At the smaller spatial scale of individual stations all five reanalyses have some spurious trends. However, ERA-Interim was found to be the most reliable for MSLP and H500 trends at station locations.
Reanalysis datasets are a very important source of atmospheric data over Antarctica due to the sparse network of observation stations. A number of new global reanalysis datasets have been released in recent years: the National Centers for Environmental Prediction (NCEP) Climate Forecast System Reanalysis (CFSR), the European Centre for Medium-Range Weather Forecasts (ECMWF) Interim Re-Analysis (ERA-Interim or ERAINT), the Japanese 25-year Reanalysis (JRA-25), and the Modern Era Retrospective-Analysis for Research and Applications (MERRA) (see Table 1 for details). An assessment of the skill of the reanalyses is important since they are used widely both to study the atmosphere and as a source of data for other fields of research.
To date two reanalyses, the 40-yr ECMWF Re-Analysis (ERA-40) and the NCEP–National Center for Atmospheric Research Global Reanalysis 1 (NCEP-1), have been most frequently used for Antarctic studies. ERA-40 has generally been found to be the more reliable reanalysis (Bromwich et al. 2007). In terms of near-surface variables, both show a large reduction in bias over Antarctica in 1978 due to the introduction of widespread satellite data. In ERA-40 the biases in temperature and pressure are consistently small from 1979 onward (Bromwich and Fogt 2004). However, NCEP-1 shows relatively large biases in mean sea level pressure (MSLP) over East Antarctica continuing after 1979 to approximately 1993 when more in situ observations became available (Hines et al. 2000; Marshall 2002). Biases in orography height in both NCEP-1 and ERA-40 can account for post-1979 cold biases in Ts over coastal stations and a warm bias at the Amundsen–Scott station (Bromwich and Fogt 2004).
The JRA-25 dataset has been available since 2006, but its performance in Antarctic near-surface temperature and MSLP has not to our knowledge been assessed. Above the near surface, reanalysis skill is less strongly related to biases in orography and surface processes. Bromwich et al. (2007) found that at 500 hPa interannual variability in geopotential height is captured well by JRA-25, ERA-40, and NCEP-1. In terms of climatological annual-mean geopotential height, differences between ERA-40 and JRA-25 are small, but NCEP-1 shows a distinct negative bias over East Antarctica.
More recently three global reanalysis datasets have been released: CFSR, ERA-Interim, and MERRA. They include the following broad improvements: (i) increased spatial resolution, (ii) more complete observational datasets, (iii) more realistic representation of stratospheric dynamics, (iv) improved assimilation and variational bias correction of satellite radiances, and (v) improved representation of the hydrological cycle. Recent assessments show that these improvements have had a positive impact on reanalysis skill. In an assessment of surface mass balance (SMB) trends, Bromwich et al. (2011) concluded that ERA-Interim is probably the most realistic compared to CFSR, JRA-25, MERRA, and the NCEP–Department of Energy (NCEP–DOE) Second Atmospheric Model Intercomparison Project (AMIP-II) reanalysis. Bracegirdle (2012) also found ERA-Interim to be the most accurate in a comparison with independent (not assimilated) sea level pressure observations over the Bellingshausen Sea in spring 2001. However, despite various improvements in the design of contemporary reanalyses, it is apparent from the range of estimated SMB trends found by Bromwich et al. (2011) and Nicolas and Bromwich (2011) that the issue of unreliable trends continues to be a problem. In particular, Bromwich et al. (2011) found that interreanalysis differences in SMB trends may be linked to coincident differences in 500-hPa geopotential height trends.
Here we compare the skill of contemporary reanalyses to that of ERA-40, which is generally considered to be the best performing reanalysis of the previous generation (Bromwich et al. 2007). Reanalysis skill was assessed by comparison to observations of MSLP, near-surface temperature (Ts), 500-hPa geopotential height (H500), and 500-hPa temperature (T500) collected at staffed Antarctic observation stations spanning the period 1979–2008.
Two outstanding questions are addressed: (i) has the higher resolution used for CFSR, ERA-Interim, and MERRA reduced the temperature biases associated with smoothing of steep orography and (ii) do the latest reanalyses show improved skill and a reduction of spurious trends in the midtroposphere?
In this study five reanalysis datasets are compared to in situ observations over Antarctica. Data from the latest four global reanalyses from the main reanalysis centers were included, namely, CFSR, ERA-Interim (ERAINT), JRA-25, and MERRA (see Table 1 for details of the reanalysis datasets used here). ERA-40 is also included as a reference point for the previous generation of reanalyses. It should be noted that both the input data and analysis system used for MERRA and CFSR are nearly the same (Saha et al. 2010).
Monthly-mean observational data from staffed Antarctic stations are retrieved from the Scientific Committee on Antarctic Research (SCAR) Reference Antarctic Data for Environmental Research (READER) project (hosted at http://www.antarctica.ac.uk/met/READER/). The locations and names of stations included in this study are shown in Fig. 1. The READER dataset consists of monthly mean meteorological parameters derived from 6-hourly synoptic data that have been rigorously and systematically checked for errors and missing values. Monthly mean values are flagged as missing data if the percentage of daily observations is too low to calculate an accurate mean (<90% for surface and <30% for upper air).
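As an illustration of the completeness screening described above, the following Python sketch returns a monthly mean only when a sufficient fraction of daily values is present. The function name `screened_monthly_mean`, the constant names, and the use of NaN to mark missing days are assumptions made for this example; the actual READER processing may differ in detail.

```python
import numpy as np

# Minimum fraction of daily observations required for a valid monthly mean,
# taken from the thresholds quoted in the text.
SURFACE_MIN_FRACTION = 0.90
UPPER_AIR_MIN_FRACTION = 0.30


def screened_monthly_mean(daily_values, min_fraction):
    """Return the mean of one month of daily values, or NaN if coverage is too low.

    daily_values: one entry per calendar day, with NaN marking a missing day
    (a convention assumed for this sketch).
    """
    v = np.asarray(daily_values, dtype=float)
    if v.size == 0:
        return np.nan
    present = ~np.isnan(v)
    if present.sum() / v.size < min_fraction:
        return np.nan  # flagged as missing
    return v[present].mean()
```

With 26 of 30 days present, for example, the month would be flagged as missing under the surface threshold but retained under the upper-air threshold.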
The comparison between observational and reanalysis data is conducted as follows. The three decades spanning 1979–2008 were assessed separately. This allows for changes in the coverage of observation stations and changes in reanalysis performance over time as new data types become available or old ones are discontinued. A bilinear interpolation of the gridded reanalysis datasets was used to estimate parameter values at the locations of observation stations to the nearest 0.1° in latitude and longitude. Observation stations were included in the comparison to reanalysis datasets if less than or equal to 30% of the monthly mean values are flagged as missing in the READER dataset. The interpolated reanalysis data at that location were masked with the same missing months. For upper-air data, a further step in comparing the reanalysis and observational data was required since most radiosonde ascents over Antarctica are conducted at 0000 UTC, with relatively few at 1200 UTC and very few at 0600 and 1800 UTC. Therefore, in the case of comparison to upper air data, monthly means of reanalysis data were calculated from 0000 UTC data only.
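The interpolation, station-selection, and masking steps described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function names are hypothetical, the grid is assumed to be regular with ascending latitude and longitude, the station is assumed to lie inside the grid, and longitude wrap-around is not handled.

```python
import numpy as np


def bilinear_at_station(field, lats, lons, lat0, lon0):
    """Bilinearly interpolate a 2D field (lat x lon) to a station location."""
    # Index of the grid cell whose lower-left corner is at or below the station.
    i = int(np.clip(np.searchsorted(lats, lat0) - 1, 0, len(lats) - 2))
    j = int(np.clip(np.searchsorted(lons, lon0) - 1, 0, len(lons) - 2))
    # Fractional position of the station inside that cell.
    wy = (lat0 - lats[i]) / (lats[i + 1] - lats[i])
    wx = (lon0 - lons[j]) / (lons[j + 1] - lons[j])
    return ((1 - wy) * (1 - wx) * field[i, j]
            + (1 - wy) * wx * field[i, j + 1]
            + wy * (1 - wx) * field[i + 1, j]
            + wy * wx * field[i + 1, j + 1])


def station_included(obs_monthly, max_missing_fraction=0.30):
    """Keep a station if at most 30% of its monthly means are missing (NaN)."""
    obs = np.asarray(obs_monthly, dtype=float)
    return np.isnan(obs).mean() <= max_missing_fraction


def mask_like_obs(rean_monthly, obs_monthly):
    """Set reanalysis months to NaN wherever the observed month is missing."""
    rean = np.asarray(rean_monthly, dtype=float)
    obs = np.asarray(obs_monthly, dtype=float)
    return np.where(np.isnan(obs), np.nan, rean)
```

Masking the interpolated reanalysis series with the observed gaps ensures that both series are averaged over exactly the same months, so differences in temporal sampling do not contribute to the estimated bias.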
To investigate observational uncertainty in reanalysis bias of decadal-mean T500, a comparison with the inclusion of five adjusted datasets of radiosonde temperature observations was conducted and is discussed in section 4. Adjusted datasets include adjustments to the original radiosonde data in an effort to account for instrumental bias. Five such datasets were included here: Met Office Hadley Centre Atmospheric Temperatures, version 2 (HadAT2) (Thorne et al. 2005); Iterative Universal Kriging (IUK) (Sherwood et al. 2008); Radiosonde Observation Correction using Reanalysis (RAOBCORE) (Haimberger 2007); and Radiosonde Innovation Composite Homogenization (RICH-obs and RICH-tau) (Haimberger et al. 2008), which are all summarized in Thorne et al. (2011). Only four Antarctic stations were found to be included across all the datasets for the period 1979–2008 (Casey, Halley, McMurdo, and Syowa). An average of these four was used for interdataset comparisons.
a. Near-surface temperature
Figure 2 shows the performance of the reanalyses in Ts. Antarctica can be split into three distinct climatological regions (Turner and Marshall 2011): coastal East Antarctica (all coastal stations in the Eastern Hemisphere), the high interior of Antarctica (Amundsen–Scott and Vostok), and the Antarctic Peninsula and Weddell Sea coastline (APW—all near-coastal stations in the Western Hemisphere). West Antarctica is not included owing to the lack of staffed observation stations. No clear decadal changes in bias and/or correlation were evident in Ts; therefore, Fig. 2 shows the first two decades combined (1979–98), with the third decade omitted due to the lack of ERA-40 after 2001.
For stations located along coastal East Antarctica many of the reanalyses show a clear cold bias in annual-mean Ts (Fig. 2). The highest resolution models, CFSR and MERRA, show the smallest biases (−2.8° and −1.6°C, respectively, averaged across the East Antarctic stations) compared to the lowest resolution models, ERA-40 and JRA-25 (−3.7° and −4.9°C, respectively). This is consistent with the findings of Bromwich and Fogt (2004), who showed that orography height errors associated with low resolution are a plausible explanation for a large part of the cold bias over coastal East Antarctica.
Figure 3 shows the orography height bias at the observation stations; biases over coastal East Antarctica are clearly largest in the low-resolution reanalyses. Height-adjusted annual-mean Ts biases (assuming a dry adiabatic lapse rate of 9.8°C km−1) are much smaller around coastal East Antarctica (bottom panel of Fig. 3), which indicates that most of the Ts bias in this region can be explained by orography height bias. However, factors other than orography bias, such as surface inversion strength and radiative flux biases, discussed below, may also contribute to the temperature bias. The year-to-year correlations for Ts at East Antarctic stations are generally high across the reanalyses (Fig. 2). ERA-Interim (ERAINT) shows the largest correlations, with 0.91, 0.86, and 0.95 for annual, summer, and winter, respectively. One exception is JRA-25, which, averaged over East Antarctic stations, shows a relatively small correlation of 0.59 for summer Ts.
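The height adjustment can be written as a one-line calculation. The sketch below assumes a sign convention that the text does not state explicitly: all biases are reanalysis minus observation, so a positive orography bias means the reanalysis surface lies above the true station height and the reanalysis is expected to be colder by the lapse rate times the height difference. The function name and the example values in the usage note are hypothetical.

```python
DRY_ADIABATIC_LAPSE_RATE = 9.8  # °C per km, as quoted in the text


def height_adjusted_bias(ts_bias_c, orog_bias_m,
                         lapse_rate_c_per_km=DRY_ADIABATIC_LAPSE_RATE):
    """Remove the part of a Ts bias attributable to orography height bias.

    ts_bias_c: Ts bias in °C (reanalysis minus observation, assumed convention).
    orog_bias_m: orography height bias in m (reanalysis minus station height).
    """
    return ts_bias_c + lapse_rate_c_per_km * orog_bias_m / 1000.0
```

Under these assumptions, an illustrative cold bias of −4.9°C at a station where the reanalysis orography is 400 m too high would reduce to about −1.0°C after adjustment.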
The picture is different over the interior stations (Amundsen–Scott and Vostok). All of the reanalyses show a positive bias in annual mean Ts ranging from 4.6° to 10.0°C. ERA-40, ERA-Interim, and MERRA are at the lower end of this range, and JRA-25 shows the largest positive biases. Some reanalyses show a large contrast in bias between summer and winter. In particular, CFSR and MERRA both show large positive biases in winter of 11.1° and 7.9°C, respectively, but almost no bias in summer. In contrast, ERA-40 shows a stronger positive bias in summer of 6.5°C and a smaller bias of 3.0°C in winter. Orography height cannot explain these large Ts biases since all reanalyses show small height biases at Amundsen–Scott and Vostok (Fig. 3). An alternative explanation is biases in surface radiative fluxes affecting the surface temperature inversion, which can reach 25°C in winter over parts of the high plateau of East Antarctica (Phillpot and Zillman 1970). Cullather and Bosilovich (2012) found particularly large net radiation flux biases at the South Pole in MERRA and CFSR that are related to the warm Ts biases in winter. In summer compensating shortwave radiation biases lead to smaller Ts biases. Along coastal Antarctica the winter inversion is weaker (~5°C). In terms of correlations at interior stations, both ERA-40 and JRA-25 show relatively weak relationships, with year-to-year correlations in annual mean Ts of 0.73 and 0.58, respectively. It is notable that ERA-Interim shows a significant improvement over ERA-40, with a correlation in annual mean Ts of 0.88. Despite its large winter bias, CFSR shows the highest correlation values in annual mean (0.90), summer (0.90), and winter (0.95).
For annual mean and seasonal Ts over the northern tip of the Antarctic Peninsula, biases are relatively small and correlations are large for all five reanalyses. However, farther south along the peninsula all reanalyses apart from ERA-40 show an annual-mean cold bias at Faraday/Vernadsky and Rothera (cf. Fig. 1) of −2° to −3°C.
b. Sea level pressure
By its definition, mean sea level pressure (MSLP) should be less dominated by biases in orography height than Ts. As a result, temporal and decadal variations in reanalysis skill are more apparent (Fig. 4). Since the reanalyses were masked with the same missing months as the observations in the decadal comparisons shown, changes are not caused by variations in observational coverage. Bromwich et al. (2011) reported a large range in the MSLP trends produced by different reanalyses and tentatively concluded that ERA-Interim was the most reliable. Consistent with this, ERA-Interim shows the most stable decadal mean bias at long-term stations (i.e., those that have good temporal data coverage over the three decades from 1979 and therefore appear in all rows in Fig. 4). The stability of decadal mean bias was quantified by taking the standard deviation of decadal mean bias at each long-term station and averaging across these stations. The standard deviation for ERA-Interim is 0.31 hPa, significantly smaller than those for CFSR, JRA-25, and MERRA, which are 0.55, 0.41, and 0.49 hPa, respectively (ERA-40 is omitted owing to a lack of data after 2002). A caveat is that temporal changes in observational bias may affect these results, but the importance of this is difficult to assess. At the time of the study by Bromwich et al. (2011), ERA-Interim was only available back to 1989, but our results indicate that their conclusions relating to MSLP are robust to the inclusion of data back to 1979. The relatively large standard deviation of decadal bias found for CFSR is in part due to marked decadal shifts in bias at two stations, Casey and Novolazarevskaya, where large positive biases occur in 1979–88 and 1999–2008, respectively (Fig. 4). An important feature of these shifts in bias in CFSR is that they do not occur at neighboring stations, which suggests that broad-scale changes in assimilated data are not the cause.
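The stability metric described above (the standard deviation of decadal mean bias at each long-term station, averaged across stations) can be sketched as follows. The function name is hypothetical, and the choice of the sample standard deviation (`ddof=1`) is an assumption, since the text does not specify which estimator was used.

```python
import numpy as np


def decadal_bias_stability(decadal_bias, ddof=1):
    """Average across stations of the per-station spread in decadal mean bias.

    decadal_bias: array of shape (n_stations, n_decades) holding the decadal
    mean bias (e.g., in hPa) at each long-term station.
    ddof: delta degrees of freedom for the standard deviation (assumed to be 1).
    """
    bias = np.asarray(decadal_bias, dtype=float)
    per_station_sd = np.std(bias, axis=1, ddof=ddof)
    return per_station_sd.mean()
```

A smaller value indicates that a reanalysis has a more temporally consistent bias, and is therefore less likely to introduce spurious trends at the station locations.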
Year-to-year correlations in annual mean MSLP between observations and reanalyses are generally very high (r > 0.98) for all locations and models. One exception is a relatively small correlation for JRA-25 at Novolazarevskaya during 1989–98 (r = 0.79).
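For reference, the two skill measures used throughout (mean bias and year-to-year correlation) can be computed from paired annual-mean series as in the sketch below. The function name is hypothetical, and restricting the calculation to years in which both series are present is an assumption that is consistent with the masking described in the methods.

```python
import numpy as np


def bias_and_correlation(rean_annual, obs_annual):
    """Mean bias (reanalysis minus observation) and Pearson correlation.

    Only years in which both series are present (not NaN) are used.
    """
    rean = np.asarray(rean_annual, dtype=float)
    obs = np.asarray(obs_annual, dtype=float)
    ok = ~np.isnan(rean) & ~np.isnan(obs)
    bias = np.mean(rean[ok] - obs[ok])
    r = np.corrcoef(rean[ok], obs[ok])[0, 1]
    return bias, r
```

Note that a constant offset between the two series changes the bias but leaves the correlation unaffected, which is why a reanalysis can combine a large bias with a high year-to-year correlation.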
c. Midtropospheric temperature and geopotential height
The 500-hPa level lies above the entire surface of Antarctica, and therefore the representation of the boundary layer and surface processes becomes less important in determining temperature. Despite their relatively accurate climatologies, all of the reanalyses show clear decadal variations in T500 bias at individual stations (Fig. 5). Decadal changes in bias of this magnitude can lead to spurious trends of similar magnitude to those observed (Turner et al. 2006). Between 1979–88 and 1989–98, changes in bias occur at many station locations and across the reanalyses. However, between 1989–98 and 1999–2008, the greatest changes in bias occur at Amundsen–Scott station. These changes cause significant differences in 30-yr linear trends at Amundsen–Scott estimated by the reanalyses (Fig. 6). CFSR gives the trend of smallest magnitude of −0.09°C decade−1. The only reanalysis with a statistically significant trend (<5% level) is MERRA with a slope of −0.28°C decade−1. Due to missing years in the radiosonde record it is difficult to assess the strength or significance of the observed trend. However, it is clear that the trends in T500 at Amundsen–Scott are significantly dependent on the choice of reanalysis.
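A linear trend expressed per decade can be estimated from an annual-mean series with gaps as in the following sketch. It uses an ordinary least-squares fit via `numpy.polyfit`, which is an assumption about the fitting method; the function name is hypothetical, and the significance test applied in the study is not reproduced here.

```python
import numpy as np


def trend_per_decade(years, annual_values):
    """Least-squares linear trend in units per decade, skipping missing years.

    Missing years are marked with NaN (a convention assumed for this sketch).
    """
    years = np.asarray(years, dtype=float)
    values = np.asarray(annual_values, dtype=float)
    ok = ~np.isnan(values)
    # Center the time axis to keep the fit numerically well conditioned.
    t = years[ok] - years[ok].mean()
    slope_per_year, _intercept = np.polyfit(t, values[ok], 1)
    return 10.0 * slope_per_year
```

Because missing years are simply dropped from the fit, the estimated slope for a record with long gaps, such as the radiosonde series discussed above, can be sensitive to which years happen to be available.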
A notable feature of Fig. 6 is the convergence of all five reanalyses and radiosonde data from 2005 through 2008. However, this is not apparent at other stations (not shown) and may have occurred by chance since there are no coincident major changes in assimilated data. Even assuming a constant observational dataset, there will be some year-to-year variability in accuracy due to differences in skill at simulating different conditions (e.g., Manney et al. 2003). One would expect such flow-dependent bias to be more important at smaller spatial scales. Consistent with this, the agreement between reanalyses is stronger for Antarctica as a whole (Fig. 7). In particular, CFSR, ERA-Interim, and MERRA show small differences and no significant trend over the period 1979–2008. However, JRA-25 gives a trend of −0.35°C decade−1 (significant at <1% level). A further notable feature of Fig. 7 is that between 1979 and the early 1990s ERA-40 shows a strong positive trend from a relatively large cold bias in 1979 (Fig. 5).
Interannual correlations between reanalysis and observational values of annual mean H500 are about as high as those for T500, with no clear systematic differences between the reanalyses (Fig. 8). For each reanalysis the largest positive and negative biases generally occur in the decade 1979–88. In the subsequent two decades changes in bias at some stations again demonstrate the strong local dependence of trends on the choice of reanalysis. As was seen for MSLP, ERA-Interim shows comparatively small decadal changes in H500 bias. For CFSR the variations in bias at Casey and Novolazarevskaya are less dramatic than the variations in MSLP bias (Fig. 4) but are of the same sign and much larger than those seen in ERA-Interim. For Antarctica as a whole (Fig. 9), the results are qualitatively similar to those seen for T500. CFSR, ERA-Interim, and MERRA agree closely and show no significant trend. JRA-25 shows a negative linear trend of −8 m decade−1 (significant at <5% level).
4. Conclusions and discussion
In this paper a climatological assessment of the latest generation of global reanalysis datasets has been presented. Decadal-mean biases and year-to-year correlations of near-surface temperature (Ts), mean sea level pressure (MSLP), 500-hPa temperature (T500), and 500-hPa geopotential height (H500) were assessed over the period 1979 through 2008. Five reanalyses (CFSR, ERA-40, ERA-Interim, JRA-25, and MERRA; see Table 1) were compared to surface station and radiosonde data from staffed observation stations across Antarctica retrieved from the Scientific Committee on Antarctic Research (SCAR) READER dataset.
The results clearly show that smaller biases in orography height in the higher-resolution reanalyses (CFSR and MERRA) are associated with dramatic reductions in Ts bias compared to lower-resolution reanalyses (ERA-40 and JRA-25). This is most evident over coastal East Antarctica due to the region's steep orography and is consistent with previous findings (Bromwich and Fogt 2004; Connolley and Harangozo 2001). However, large biases in Ts also occur over the interior of East Antarctica, which cannot be explained by orography height biases. In particular, CFSR, JRA-25, and MERRA show large winter warm biases of 11.1°, 10.2°, and 7.9°C, respectively. Cullather and Bosilovich (2012) showed that these warm biases are most likely related to net radiation flux biases at the South Pole in MERRA and CFSR. In summer compensating shortwave radiation biases lead to smaller Ts biases. Such large biases may have implications for the representation of related phenomena, such as the katabatic winds.
The biases and decadal variations in MSLP found here are consistent with the analysis of Bromwich et al. (2011), who suggested that ERA-Interim is the most reliable at reproducing MSLP trends. This conclusion is robust when including an additional 10 years (1979–88) that have recently been added to the ERA-Interim dataset. An important caveat of comparisons to staffed observation stations is that they may not be representative of regions less well constrained by in situ observations, such as West Antarctica. In addition, since the observations are assimilated into the reanalyses, the two are not independent. One indication of the broader reliability of ERA-Interim is that it has also been found to be the most accurate at reproducing independent nonassimilated MSLP measurements taken from buoys over the Bellingshausen Sea (Bracegirdle 2012).
In terms of T500 and H500, all five reanalyses exhibit contrasting decadal variability in bias across the observation stations. However, a clearer pattern emerges for Antarctic-wide averages. In both variables there is strong agreement between CFSR, ERA-Interim, and MERRA, none of which show significant linear trends over the period 1979–2008. In contrast, JRA-25 and ERA-40 show significant negative and positive trends, respectively, which appear spurious when compared to raw Television and Infrared Observation Satellite Operational Vertical Sounder (TOVS) tendencies (Sakamoto and Christy 2009). The positive ERA-40 T500 trend also appears spurious when taking into account observational uncertainty in the radiosonde data. We investigated the observational uncertainty of the above results by assessing five additional adjusted radiosonde temperature datasets (HadAT2, IUK, RAOBCORE, RICH-obs, RICH-tau), which show that the positive ERA-40 T500 trend (subsampled to available station data) is clearly outside the range of the different datasets (not shown). The negative JRA-25 trend is not clearly outside the range of the radiosonde datasets but should be treated with caution since the change from TOVS to the Advanced TOVS (ATOVS) in 1998 along with a coincident change in the method of assimilating radiances caused a discontinuity in stratospheric temperatures (Onogi et al. 2007; Sakamoto and Christy 2009). This is partly a consequence of the relatively large stratospheric and upper-tropospheric bias in the JRA-25 forecast model (Onogi et al. 2007). The positive trend in ERA-40 between the 1980s and 1990s is most likely of a different origin. Sakamoto and Christy (2009) suggested that overly positive trends in ERA-40 are related to transitions in TOVS and in the assimilation streams (the transition from stream 3 to stream 1 occurs during the late 1980s).
Another factor that might affect coastal stations is a temperature bias over sea ice in ERA-40 that caused spurious positive lower-tropospheric temperature trends across a discontinuity in 1997, which has been identified as an important issue over the Arctic (Screen and Simmonds 2011). However, there is not strong evidence for this effect in observations from terrestrial Antarctica, since the large positive trend in ERA-40 occurs mainly during the 1980s.
In summary, improvements such as adaptive bias correction of radiances and more realistic stratospheric dynamics in CFSR, ERA-Interim, and MERRA appear to have resulted in closer agreement in tropospheric trends for Antarctic-wide averages. This alone is not proof of improved performance, since the reanalyses could contain common errors. However, comparisons against observational datasets also indicate improved performance. For Ts the relatively high resolution of CFSR and MERRA has almost eliminated the biases associated with orography height. At present an update to JRA-25—JRA-55—is being developed and implemented (Ebita et al. 2011), and it is anticipated that this will show similar improvements. However, our results show that challenges remain in the representation of near-surface temperature in the strong Antarctic inversion and in capturing regional trends across the continent.
We acknowledge the good suggestions of two anonymous reviewers, which helped to improve the manuscript. This study is part of the British Antarctic Survey Polar Science for Planet Earth Programme. It was funded by the U.K. Natural Environment Research Council. The European Centre for Medium-Range Weather Forecasts is thanked for providing the ERA-40 and ERA-Interim datasets. The JRA-25 dataset used for this study was provided from the cooperative research project of the JRA-25 long-term reanalysis by the Japan Meteorological Agency (JMA) and the Central Research Institute of Electric Power Industry (CRIEPI). The Global Modeling and Assimilation Office (GMAO) and the GES DISC are acknowledged for the dissemination of the MERRA dataset. The CFSR data were retrieved from the Research Data Archive, which is managed by the Data Support Section of the Computational and Information Systems Laboratory at the National Center for Atmospheric Research in Boulder, Colorado.