Surface air temperatures have recently increased more rapidly in the Arctic than elsewhere in the world, but large uncertainty remains in the time series and trend. Over the data-sparse sea ice zone, the retrospective assimilation of observations in numerical reanalyses has been thought to offer a possible, but challenging, avenue for adequately reproducing the historical time series. Focusing on the central Arctic Ocean, output is analyzed from 12 reanalyses with a specific consideration of two widely used products: the Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2), and the European Centre for Medium-Range Weather Forecasts interim reanalysis (ERA-Interim, hereafter ERA-I). Among the reanalyses considered, a trend of 0.9 K decade−1 is indicated but with an uncertainty of 6%, and a large spread in mean values. There is a partitioning among those reanalyses that use fractional sea ice cover and those that employ a threshold, which are colder in winter by an average of 2 K but agree more closely with in situ observations. For reanalyses using fractional sea ice cover, discrepancies in the ice fraction in autumn and winter explain most of the differences in air temperature values. A set of experiments using the MERRA-2 background model using MERRA-2 and ERA-I sea ice and sea surface temperature indicates significant effects of boundary condition differences on air temperatures, and a preferential warm bias inherent in the MERRA-2 model sea ice representation. Differences between experiments and reanalyses suggest the available observations apply a significant constraint on reanalysis mean temperatures.
Northern Hemisphere high-latitude warming trends are the subject of significant research efforts. Modeling studies have suggested that the Arctic will warm at a faster rate than the planet as a whole in future climate scenarios (Kirtman et al. 2013; Barnes and Polvani 2015), and that this is attributable in part to projected changes in the cryospheric surface cover, notably in the reduction of Arctic sea ice (Serreze and Barry 2011). The disproportionate warming of the Arctic—referred to as “Arctic amplification”—has potential relevance to regional conditions and to the large-scale general circulation in the future (Holland and Bitz 2003; Deser et al. 2010; Peings and Magnusdottir 2014; Barnes and Screen 2015; Francis et al. 2018). Nevertheless, observational records indicate that significant trends in the cryosphere have already occurred (Bekryaev et al. 2010; Screen and Simmonds 2010; Taylor et al. 2017). A quantitative assessment of recently experienced historical warming trends is important for examining and understanding the physical processes involved, for model validation, and for prognostic studies. An accurate reconstruction of the time series poses a considerable challenge. Infrared satellite measurements may be confounded by pervasive, near-surface cloudiness and profile inversions. Remote polar conditions and the presence of an ocean with a quasi-permanent sea ice cover also provide significant obstacles to obtaining routine spatially representative in situ temperature measurements over the Arctic Ocean. As a result of these challenges, there is a significant sparsity in the observing network over the central Arctic Ocean, which may affect estimates of regional and global surface temperature trends (Morice et al. 2012; Cowtan and Way 2014; Curry 2014).
Atmospheric numerical reanalyses are seen as a promising means for addressing the north polar cap issue through the assimilation of available observational data and the use of a background numerical weather prediction model [NWP (note that acronyms are also defined in the appendix); Simmons and Poli 2015; National Academies of Sciences, Engineering, and Medicine 2018]; however, this notion contrasts with concerns in the ability of reanalyses to discern long-term trends (Bengtsson et al. 2004; Chaudhuri and Ponte 2015). Arctic air temperature trends are known to vary among reanalyses. For example, Simmons et al. (2017) showed significant differences in normalized reanalysis time series averaged over the north polar cap. To address these concerns, the evaluation and intercomparison of reanalyses are useful activities with the goal of understanding discrepancies through careful examination. Previous assessments of reanalyses in the Arctic have also made use of unique or novel data sources for validation (Jakobson et al. 2012; Wesslén et al. 2014; Graham et al. 2019), although these sources are typically of limited duration.
Reanalyses differ from single-variable observation-derived datasets by providing a physically consistent, multiple-variable depiction of conditions. The resulting surface air temperature can rely on a representation of the overlying atmospheric profile and on the treatment of the surface boundary condition. Surface temperature issues have been documented in reanalysis products (Chaudhuri and Ponte 2015; Simmons et al. 2017; Gross et al. 2018; Graham et al. 2019). Furthermore, erroneous jumps in the surface air temperature time series for the Arctic have been found in multiple reanalyses (Screen and Simmonds 2011; Alexeev et al. 2012; Simmons et al. 2017). Discrepancies in the atmospheric profile have also been previously examined (Thorne 2008; Bitz and Fu 2008; Alexeev et al. 2012; Jakobson et al. 2012) and uncertainties exist in the vertical profile of Arctic warming (Döscher et al. 2014).
Here, we focus more closely on the specification of sea ice and sea surface temperatures (SSTs), as well as the model representation of the atmospheric interaction with these surfaces. It is hypothesized that the surface boundary condition is a dominant control on the discrepancies in reanalysis surface air temperatures. The presence or absence of sea ice radically alters the heat and moisture exchange between the atmosphere and underlying ocean, thus affecting the adjacent air temperatures. Two associated issues are 1) the sea ice dataset selection and 2) the representation of physical processes associated with the surface–atmosphere interface. Discrepancies among various sea ice concentration datasets can be large (Kattsov et al. 2010). Discontinuities may also arise from changes in the satellite observing system for sea ice concentration, although considerable efforts have been made to homogenize the time series (Cavalieri et al. 1999, 2012; Comiso et al. 2017). Global reanalyses are constrained in the selection of sea ice cover datasets by necessary criteria: global, long duration datasets that have consistency with similarly gridded SSTs. In the case of a reanalysis product that is updated in near–real time, there is an added constraint for the timely availability of the boundary conditions. These constraints frequently lead to the merging of two or more SST and sea ice datasets, which may adversely impact time series.
For this study, a large number of historical and contemporary reanalyses are used to assess potential systematic characteristics associated with various boundary condition treatments, and to understand evolutionary changes in reanalyses of similar construction. Surface-based reanalyses are not considered here. This paper also expands on the work presented by Simmons et al. (2017) and Gross et al. (2018) in relation to the NASA Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2) by examining the role of data assimilation, temperature tendency terms within the atmospheric model, and SST and sea ice boundary forcing on near-surface temperature times series in the Arctic. Section 2 provides a brief overview of the various reanalyses and datasets used in this study. Section 3 presents an overview of differences among the reanalyses. In section 4, we further examine specific differences between two established reanalyses: the European Centre for Medium-Range Weather Forecasts (ECMWF) interim reanalysis (ERA-Interim, herein ERA-I; Dee et al. 2011) and MERRA-2 (Gelaro et al. 2017). A numerical experiment is conducted with the MERRA-2 atmospheric general circulation model using boundary conditions from MERRA-2 and from ERA-I, and is compared with the reanalyses. The experiment is described in more detail below. A summary is provided in section 5.
2. Datasets and method
a. Reanalysis temperature and sea ice fields
Atmospheric reanalyses are retrospective, gridded depictions that are generated through a statistical adjustment of a prior, short-term numerical forecast to available observations during the period of the forecast (Kalnay et al. 1996). The correction of background forecasts due to the assimilation of observations is variously known as the “innovation” or the analysis increment (Cullather and Bosilovich 2012) and is indicative of the background model bias during the short-term forecast period. Table 1 provides a list of 11 global reanalyses and one regional reanalysis examined for this study, as well as references that describe the various systems. These products are surveyed over the recent satellite-observing era, 1979–2017, beginning with the first full year of microwave satellite observations. The available time periods covered by the reanalyses vary considerably, but there is significant overlap for the period 1980–93. Five of the most recent global reanalyses cover the 30-yr period 1980–2009: ERA-I, ERA5, JRA-55 [the Japanese 55-yr Re-Analysis from the Japan Meteorological Agency (JMA)], MERRA-2, and CFSR (the NOAA Climate Forecast System Reanalysis). The evolution of reanalyses is marked by a gradual increase in spatial resolution, an increase in assimilated data, and a more enhanced representation of physical processes (Dee et al. 2011; Molod et al. 2015).
Surface air temperatures have historically been recognized as a key climate variable (Hansen et al. 1981) and are routinely measured at globally dispersed weather stations. This contrasts with the skin temperature at the surface–atmosphere interface, which is available from remote sensing methods but has not historically been a standard in situ observation. Reanalyses produce a 2-m temperature field that corresponds to a screen-level height. The JMA reanalyses and later ECMWF reanalyses employ optimal interpolation to produce a screen-level temperature analysis (Simmons et al. 2004; Dee et al. 2011; Kobayashi et al. 2015); however this is only carried over land, and not ocean, in ERA5. For other reanalyses, surface air temperature is not assimilated; the field is produced from an interpolation between analyzed fields at the lowest model level and the skin temperature.
For open-water conditions, the skin temperature is largely prescribed by the SST field; over sea ice, the skin temperature is prognostic. The sea ice surface is typically modeled as an ice slab of fixed thickness, with a prescribed temperature underneath the column (usually the water freezing point) and no snow cover. The thickness is typically 1.5–2.0 m. For the Arctic System Reanalysis version 2 (ASRv2) over limited time periods (2000–02, 2012; Bromwich et al. 2018), the ice thickness is a prescribed, spatially and temporally varying field produced from satellite-derived estimates of sea ice age (Maslanik et al. 2007). The CFSR employs a weakly coupled atmosphere–ice–ocean model for the analysis background field, which includes a five-category dynamical sea ice model with thermodynamics based on Winton (2000).
Notably, many reanalyses use a threshold for sea ice cover—in which the entirety of a model grid box is considered either ice-covered or ice free at any given point in time—while other reanalyses incorporate a fractional sea ice cover, as indicated in Table 1. The latter means that the skin temperature for a particular grid point is an amalgamation of a prescribed SST and the modeled prognostic value as weighted by the sea ice fraction. The sea ice cover and the surrounding SST are prescribed from boundary condition datasets. Figure 1 shows examples of how multiple datasets are merged together to provide boundary conditions that span a full reanalysis time period. In general, the SST time series requires a product spanning early years prior to the introduction of the Advanced Very High Resolution Radiometer (AVHRR) in late 1981, an advanced dataset covering the main satellite observing period, and a near–real time product for advancing the reanalysis forward. Satellite-borne passive microwave sensors have provided a near-continuous record of sea ice concentration from late 1978 through to the present time (Parkinson 2014), but the pairing of sea ice cover to SST datasets has resulted in a similar fragmentation of the time series. Particular SST datasets may also subscribe to multiple sea ice products (Reynolds et al. 2007), resulting in the time series as shown. The figure does not include differences associated with changes in the satellite observing system or other operational matters.
b. ERA-I and MERRA-2
Two reanalyses examined more closely here are the ERA-I and MERRA-2. MERRA-2, a global reanalysis product developed by the NASA Global Modeling and Assimilation Office, provides output at a resolution of ½° latitude by ⅝° longitude and 72 hybrid-sigma vertical levels for the period 1980 to the present (Gelaro et al. 2017). It is produced with version 5.12.4 of the Goddard Earth Observing System (GEOS) data assimilation system. Similarly, the ERA-I is a global atmospheric reanalysis covering the period 1979 to the present, with Gaussian grid output fields of approximately 0.7° × 0.7° grid spacing and 60 hybrid-sigma levels (Dee et al. 2011). As noted previously, ERA-I uses a separate 2-m temperature assimilation system, which affects the subsequent background NWP forecast through modification of the model land surface, although not where there is snow cover (Simmons and Poli 2015). Thus, its effectiveness has been thought to be limited in the data-sparse Arctic. The decision to focus on ERA-I as opposed to ERA5 stems from the fact that ERA-I has been used in numerous studies that focus on the Arctic, as well as the fact that the dataset and documentation of ERA5 had not yet been released at the onset of this study.
For the sea ice surface, ERA-I uses an ice slab representation with a uniform 1.5-m thickness with an ocean temperature at the ice interface of 271.46 K. In MERRA-2, as well as for MERRA, the ice surface is not represented as an ice slab; rather, the model ice surface consists of a 7-cm skin layer for purposes of heat capacity. The layer is used in the surface energy budget computation and provides a prognostic surface temperature. A Newtonian relaxation to 273.15 K on a 24-h time scale is then applied as a representation of the upwelling ocean heat flux.
Both reanalyses use fractional sea ice cover. It is noted that several reanalyses, including MERRA-2, output ice concentration as a fraction with respect to total area as opposed to the ocean area, and this has been accounted for. As seen in Fig. 1, the SST and sea ice boundary conditions for both reanalyses are derived from varying datasets over time, with both reanalyses using OSTIA and OSI SAF after April 2009. Errata are associated with both reanalyses in their use of these boundary conditions. For the sea ice fields accompanying the Reynolds OI Daily v2 product used in MERRA-2 for the period from 1982 to March 2006, the land–sea masking was incorrect, resulting in erroneous coastal polynya that are particularly notable along the coast of the eastern Canadian Arctic Archipelago. In ERA-I, between 1 January 1989 and 31 January 2009, SIC was inappropriately set to 1 north of 82.5°N and SIC less than 0.2 is set to zero (Simmons and Poli 2015).
c. Satellite and in situ datasets
Figure 2 indicates a mask used for this study that comports to standard definitions of the Arctic Ocean, but without the Canadian Arctic Archipelago or Baffin Bay. The purpose of the mask is to eliminate land surfaces that are subject to discrepancies in topography, and to more narrowly focus on the Arctic Ocean. For example, the Canadian Arctic Archipelago is a resolution-dependent feature and has thus been excluded. Rather than interpolate, the native resolution and land–sea definitions are used to derive masked areas for each reanalysis. The masked area is then approximately 9.82 × 106 km2 but varies slightly between reanalyses due to resolution and discrepancies in the land–sea reference. For spatial differencing, an interpolation to the MERRA-2 non-Gaussian grid is performed using spherical harmonics (Adams and Swarztrauber 1997).
Also shown in Fig. 2 are drift trajectories for two studies examined. The USSR continuously operated manned drifting ice stations in the central Arctic for the period examined, up to 1991 (NSIDC 1996; denoted as “NP” stations). At least one and occasionally two camps operated at a given time and the stations were mostly located poleward of 75°N. The Surface Heat and Energy Budget of the Arctic (SHEBA) was a year-long study of Arctic conditions beginning in September 1997 from the support of a free-drifting icebreaker in the western Beaufort Sea (Uttal et al. 2002). These observations have previously been used to examine differing sets of reanalyses (Bromwich et al. 2000; Beesley et al. 2000). Buoy and other marine data have also previously been used to examine reanalyses (Lüpkes et al. 2010; Simmons and Poli 2015) but are not shown here, although their assimilation in reanalyses is considered. For surface air temperature comparisons, buoy data in particular are known to require careful treatment, as their orientation and overall condition may be affected by the converging ice pack; snow cover insulation and solar heating are additional issues (Serreze et al. 1997).
Two widely used satellite-derived temperature datasets are also examined for comparison. The instrument suite of the Atmospheric Infrared Sounder and the Advanced Microwave Sounding Unit (AIRS-AMSU) has been carried on the NASA Aqua spacecraft since 2002; the AMSU instrument failed in 2016. A stochastic cloud clearing/neural network provides a first guess for version 6 of the AIRS-AMSU Level 3 physical retrieval product (Susskind et al. 2014), which includes the surface air temperature. The neural network is trained using ERA-I (Milstein and Blackwell 2016). Monthly data are provided on a 1° × 1° grid. The International Satellite Cloud Climatology Project (ISCCP; Rossow and Schiffer 1999) high-resolution global monthly (HGM) product also provides an air temperature that is produced through the extrapolation of the TIROS operational vertical sounder (TOVS) profile down to the surface pressure (Rossow and Zhang 1995). Monthly data are provided on a 1° × 1° grid for the period 1983–2012.
d. Numerical simulations
Two sets of experiments shown below use the background atmospheric general circulation model (AGCM) of MERRA-2. “M2AMIP” refers to a project producing a 10-member ensemble of simulations from the MERRA-2 model for the period 1980–2016 (Collow et al. 2017), in the style of the Atmospheric Model Intercomparison Project (AMIP; Gates 1992). The simulations are free-running and do not assimilate observations, but other aspects of the model configuration are identical to MERRA-2 including spatial resolution, SST and sea ice boundary conditions, greenhouse gas forcing, and aerosol emissions. To assess the impact of different boundary conditions on the temperature time series and atmospheric circulation, an additional ensemble of five members was produced using the SST and sea ice concentration of ERA-I and are referred to as M2AMIPERA. For computational requirements, the boundary conditions were used at a resolution of 1°. As seen below, the ensemble size of M2AMIPERA was suggested by the limited variation of the ensemble members in comparison to M2AMIP and differences therein. The experiments parallel Simmons et al. (2004) in comparing a reanalysis with a free-running integration using the identical background model with the same sea ice and SST boundary conditions. The ensemble outputs are used to examine the extent to which variability and trends are attributable to the specified SST and sea ice conditions, or to the influence of assimilated data.
3. General characteristics of reanalyses Arctic temperatures
Spatial characteristics of the mean air temperature field over the Arctic Ocean are described in various sources (Serreze et al. 1997; Rigor et al. 2000; Serreze and Barry 2005). Briefly, a strong temperature gradient is present from the pole toward the North Atlantic in autumn, winter, and spring, with coldest temperatures associated with the highest latitudes of the central Arctic Ocean. In summer, surface melt restricts the field to within close proximity of 273 K, with reduced spatial gradients throughout the Arctic Ocean.
An overview of differences among all reanalyses is first provided. Figure 3 shows the annual time series for the Arctic domain as defined in Fig. 2. The figure illustrates the unmistakable recent warming trend, large interannual variability, and the large uncertainty as indicated by the spread of the reanalyses, even if older systems are discarded. The time periods covered by each reanalysis differ, but a general picture may be obtained with a regression of the combined set of all reanalyses. A warming trend of 3.3 K over the 39-yr period is found with an estimated standard deviation of the regression coefficient of 0.2 K (6%), or 0.9 ± 0.1 K decade−1. Analysis of each decade of the time period indicates the rate of warming has accelerated with time. The trend is not significant over the first and second decades but reaches 2.1 ± 0.3 K decade−1 for the period of 1999–2008 and 2.0 ± 0.5 K decade−1 for 2009–17. It may be seen from Fig. 3 that the total range among all the reanalyses decreases with time, from greater than 3.1 K in the 1980s to 2.3 K after 2000. This is due in part to the limited time span covered by some of the products. There is a suggestion of a bifurcation in more recent years after 2005, with ERA-I, ERA5, MERRA, and MERRA-2 averaging 1.5 K warmer than other reanalyses.
For the overlapping period 1980–2009, five recently produced renalyses—denoted with bold lines—suggest a trend of 0.6 ± 0.1 K decade−1. There is a spread in the magnitude of the trend even when only recent reanalyses are considered. Individually, the trends range from 0.4 K decade−1 (MERRA-2) to 0.7 K decade−1 (JRA-55). These differences in trend are apparent in Fig. 3, where MERRA-2 and ERA5 are more than 1 K warmer than other reanalyses in the first decade, and the difference becomes reduced with time. With the linear trends removed, the 30-yr time series of annual values from each of the five reanalyses are generally well correlated with each other. The correlation coefficient for any two of these reanalyses ranges from (r =) 0.91 to 0.96. For each reanalysis, the standard deviation of the detrended 1980–2009 time series is between 0.59 and 0.69 K.
Also shown in Fig. 3 are the time series for the AIRS-AMSU and ISCCP satellite-derived time series. In general, the AIRS-AMSU time series agrees closely in magnitude, variability, and trend with the ERA-I used for its neural network training, as well as the ERA5, MERRA, and MERRA-2 reanalyses. The ISCCP time series agrees more closely with a subset of the reanalyses after 1988 that includes ERA-15, JRA-25, JRA-55, CFSR, and the ASRv2. Variations in the ISCCP time series for the period 1984–88 do not agree well with any of the reanalyses, with particular disagreement for the year 1987. The early period of disagreement occurs just prior to the introduction of NOAA-11 into the ISCCP data stream. Spatial comparisons indicate large differences between ISCCP and the reanalyses at the highest latitudes, suggesting a data coverage issue.
Spatially, the pattern of the regressed trend in the reanalyses is shown in Fig. 4a and is found to be focused on the Eurasian side of the Arctic Ocean. The largest temperature trends exceed 7 K over the 1979–2017 period in the northeastern Barents Sea, also pointed out by Simmons and Poli (2015), and are collocated with regions transitioning from seasonally ice-covered to perennially ice-free. The patterns may be compared with satellite-based studies which used the AVHRR skin temperature dataset and found warming patterns over the central Arctic similar to those shown in Fig. 4 (Comiso and Hall 2014). The differences in temperatures and temperature trends among the reanalyses are associated with the general pattern contrasting values over the Eurasian marginal seas with those along the northern coast of the Canadian Arctic Archipelago. As seen in Fig. 4b, the largest uncertainty in the combined regression trend is over the Barents Sea, which approaches 0.4 K. Also shown in Fig. 4c are trends by month for the domain average, which are large only in the nonsummer months, indicating a restricted temperature variability in the presence of surface melt. The largest trends are found in October and November and are greater than 6 K over the period.
Comparisons with in situ observations from NP ice drift stations and SHEBA were performed by using 6-h observations and comparing with the nearest grid point location for that time. A focus is on the September–November (SON) period encompassing the season of the largest temperature trend and large reanalysis uncertainty. The SHEBA campaign had limited observations in September and October during deployment and closedown; results for November are shown in Fig. 5. There are considerable differences in the bias, but generally there is consistent ordering among the reanalyses for each year. The bias values are reduced for SHEBA. The largest biases are with MERRA-2 and ERA5, which average 4.4 and 5.1 K, respectively; in contrast, NCEP–NCAR averages −1.0 K, and is the only reanalysis with a consistently negative bias. Also shown in Fig. 5 are temporal correlation coefficients with the station data. These range from (r =) 0.53 for ERA-15 up to 0.89 for CFSR. In general, contemporary reanalyses correlate more closely with observations than earlier products. There is a decrease in correlation and an increase in bias for each of the reanalyses for November 1982.
Discrepancies among the reanalyses shown in Figs. 3 and 5 are intriguing. In particular, Fig. 5 suggests that more recently produced contemporary reanalyses are associated with larger biases in comparison to in situ observations. Several issues related to boundary conditions are explored. Figure 6 shows a comparison of Arctic-domain mean temperatures as related to the use of threshold sea ice cover during the winter months for the overlapping years of 1980–93. The relation is most apparent in the winter months but is also present in the late autumn months (not shown). In general, reanalyses using fractional ice cover are warmer by an average of more than 2 K, although the difference is small for specific reanalyses. For the period, reanalyses using threshold sea ice cover generally have a similar average temperature of about 248 K, while those using fractional ice cover produce a range of values that vary by 3 K. In situ stations are necessarily located on sea ice that has a local fractional coverage of 100%. In contrast, reanalyses using fractional sea ice cover are representing a larger scale, where the grid box–averaged ice fraction may be much lower. In seasons other than summer, a lower ice fraction would produce a larger oceanic heat flux and result in warmer reanalysis air temperatures as compared to the local station values. We speculate that some, but not all, of the temperature differences with these reanalyses may be explained for these reasons. A preliminary assessment using MERRA-2 with the NP ice drift stations for November found that the collocated reanalysis ice fraction may be as low as 0.81, with a monthly average of 0.96. A regression analysis suggests that the monthly ice cover discrepancy may account for 0.82 ± 0.06 K, although the results appear to be station dependent. Nonlocal discrepancies in the fractional ice cover likely also play a role.
Focusing on the autumn period associated with large temperature trends and reanalysis differences, Fig. 7a indicates a large spread in the prescribed sea ice concentrations in the reanalyses that use fractional sea ice cover. Averaged over the overlapping 1980–2009 period, the mean central Arctic ice concentrations range from 0.62 in MERRA-2 to 0.66 in ERA-I, although some of this difference stems from the inappropriate use of SIC equal to 1 north of 82.5 °N in ERA-I. The ice concentration values converge in more recent years. Prior to 2005, the difference between ERA-I and MERRA-2 averages 0.04. After 2009, when both reanalyses use the same ice concentration dataset, the differences are negligible. For these two reanalyses, Fig. 7b shows a time series of differences in the SST over the Arctic domain. In the presence of fractional sea ice cover, ERA-I prescribes an SST of 271.46 K, while MERRA-2 uses its boundary dataset values. The differences are largest in midsummer, up to 1.4 K, when air temperature differences between the reanalyses are small. Interannual variability in the SST differences is also apparent, with largest differences in the period of 2002–09.
4. ERA-I and MERRA-2
a. Comparison of reanalyses
We focus on differences between established reanalyses ERA-I and MERRA-2, which have the largest sea ice concentration differences. As seen in Fig. 3, MERRA-2 annually averaged Arctic Ocean 2-m temperatures are substantially warmer than ERA-I by greater than 1 K for the period of 1980–2004. Relative to other reanalyses, ERA-I warms more abruptly in 2005, and becomes more comparable to MERRA-2 in the most recent years. Shown in Figs. 8a and 8b are the spatial patterns of these differences for the autumn months (SON) for the periods of 1980–2004 and 2009–17 to coincide with changes in boundary forcing. Among the features highlighted are differences of greater than 2 K over the central Arctic Ocean in earlier period, and large differences over the Canadian Arctic Archipelago and northern coastal Greenland. These features become more muted in the later time period. These oceanic differences appear consistent with errors in the implementation of sea ice boundary conditions described earlier. This would suggest that the inappropriate use of a constant ice fraction of 1 in ERA-I at highest latitudes results in air temperatures that are cool relative to MERRA-2, while the land–sea mask issues of MERRA-2 result in erroneously warm values over coastal regions of the Canadian Arctic Archipelago and Franz Josef Land. Simmons and Poli (2015) also indicate additional issues for northern Ellesmere Island and Greenland, which may aggravate coastal differences with MERRA-2. These differences persist into the more recent period. There are substantial differences between the reanalyses over the Greenland Ice Sheet and surrounding land surfaces, which consistently differ in sign with the ocean regions. The contrast between land and ocean supports an approach of considering the two surfaces separately. Removal of the use of 100% sea ice coverage north of 82.5°N, finer spatial resolution, and changes to the screen level analysis, among other differences, have contributed toward a decrease in the differences between MERRA-2 and the newer ERA5.
The apparent direct relation between reanalysis sea ice cover discrepancies and temperature differences is further illustrated in Fig. 8b, which shows a scatter of reanalysis temperature differences in relation to sea ice concentration differences. Ice cover differences account for 68% of the variance in the temperature differences for the month of September, 67% for October, and 64% in November. The regression slope increases into the late autumn. A difference of 0.04 in the average ice concentration between ERA-I and MERRA-2 is associated with an air temperature difference of 0.48 ± 0.03 K in September and a difference of 1.40 ± 0.03 K in November.
b. Influence of assimilated observations
As previously noted, the Arctic region is data sparse for reanalyses (Cullather et al. 2016). Simmons and Poli (2015) previously examined the impact of observations in ERA-I. They found that the observational coverage varies substantially over the reanalysis period, and that the ERA-I trends are largely derived from the background forecast. Analysis increments in the Arctic have previously been examined for MERRA, the predecessor reanalysis of MERRA-2 (Cullather and Bosilovich 2012; Laliberté and Kushner 2014). Notably, Laliberté and Kushner (2014) found a trend in time toward larger analysis increment values in MERRA.
In MERRA-2, a trend in lower-tropospheric temperature analysis increment is still present in selected seasons. As seen in Fig. 9a, there is an approximately linear increase in the increments in winter (DJF) and spring (MAM); there is no trend in summer (JJA). In autumn (SON), the increments experience a considerable jump in 2006, altering the annual cycle of the increments (Fig. 9b). In the sign convention, negative increments as seen in winter and spring denote a cooling by incoming observations relative to the background forecast in the analysis, with the reverse condition in summer. This is consistent with an inherent warm bias over sea ice.
The discontinuity in the analysis increment time series for autumn is concerning and warrants further investigation. MERRA-2 transitioned its boundary condition datasets in 2006 (Fig. 1), although other dataset transitions occurred within the time series resulting in more significant changes in the sea ice concentration without large jumps in the analysis increment. In the MERRA-2 sea ice time series, the largest temporal discontinuity in sea ice occurs in 2005, when the Reynolds daily SST dataset changes its accompanying sea ice product from Cavalieri et al. (1999) to Grumbine (1996). For the temperature analysis increment, the 2006 discontinuity is more likely to be related to changes in the observing system used in MERRA-2. MERRA-2 assimilates observations of buoy pressure and temperature (as converted to the model variable virtual temperature), over ice and open water. Gelaro et al. (2010) demonstrated that per observation, ships and buoys have the largest impact on the GEOS model; this is likely the case here as well. For the first 12 years of the MERRA-2 time series, observations from ships and buoys over the Arctic are extremely limited. Counts of assimilated surface pressure and virtual temperature observations from buoys in MERRA-2 gradually increase from 1992 onward, but abruptly spike in 2006 when buoys were deployed for the International Polar Year (Fig. 10a; Inoue et al. 2009).
To further investigate the role of buoys, spatial maps of the counts of assimilated observations and the mean observation-minus-forecast (OMF) values for virtual temperature during the last decade of MERRA-2 for autumn and winter are shown in Figs. 10b–e, using the Gridded Interpolated Observations dataset for MERRA-2 (Bosilovich et al. 2015). The difference in the total number of assimilated virtual temperature observations from buoys in MERRA-2 between the first and last decade of MERRA-2 is nearly two orders of magnitude. Spatially, the differences are mostly clustered north of the Canadian Arctic Archipelago (Figs. 10b,c). There are very few assimilated buoy observations during the 1980s; the period 2007–16 is shown here. A seasonal cycle in the count of assimilated buoy observations is present, possibly due to communication issues. Often deployed in summer, the limited lifespan of buoys also results in more observations in SON than in DJF (Figs. 10b,c). The mean OMF per observation for 2007–16 skews negative during DJF, indicating the model is warmer than the observations (Fig. 10d). But despite the increase in observation counts, this only represents about half of the Arctic Ocean (Figs. 10a,b,d). The largest negative values of OMF occur in the western Beaufort Sea, while there are positive values just to the west. The location of the trend in the temperature increment is actually located where the observations are clustered, and not where the OMF is the largest. In autumn, the OMF tends to be more positive, although negative values are present in select locations of the central Arctic Ocean (Fig. 10e). As seen in winter months as well, there is a general tendency for positive OMF values over open water and negative values over sea ice. This gives an indication that model biases differ between the ocean and ice-covered surfaces.
Other observation types including radiosondes, ships, and satellite radiances were also investigated. But these were determined to be not as influential as buoys. In the Arctic, radiance channels with weighting functions for temperature that peak near the surface are not assimilated in MERRA-2. While vertical mixing could propagate the impact of radiances from aloft, the properties of the increment at 1000 hPa are present in the bottom most vertical layer. A spatial map of the linear trend in the increment indicates that the majority of the trend in the Arctic is present north of the Canadian Archipelago while the ship and radiosonde observations tend to be in the Greenland, Barents, and Kara Seas.
c. Numerical experiments
Annual mean 2-m temperatures for the period of 1980–2016 in MERRA-2, as well as the differences from ERA-I, M2AMIP, and M2AMIPERA, are shown in Fig. 11. In agreement with Fig. 3, MERRA-2 is warmer than ERA-I over most of the Arctic, with the exception of the Greenland Sea. Both M2AMIP and M2AMIPERA are warmer than the reanalyses regardless of the source of the prescribed SST and ice concentration. M2AMIP is noticeably the warmest of the four time series, while the ERA-I reanalysis is the coldest. Despite differences in the magnitude of the temperature, the spatial structure is generally the same.
The largest discrepancies in temperatures among reanalyses occur in autumn and winter; here we focus on the winter season when the largest differences are present. Differences in the spatial patterns of 2-m temperature and other variables of interest from the M2AMIP and M2AMIPERA AMIP simulations are presented in Fig. 12. While the 2-m temperature in M2AMIP is warmer than M2AMIPERA across nearly the entire Arctic Ocean domain, the largest differences are generally present north of about 82°N and are on the order of about 2 K (Fig. 12a). As previously noted, the 2-m temperature fields are produced as an interpolation of the skin temperature and the lowest model level. As may be seen, the 2-m temperature differences between the two ensembles are reflective of differences in the skin temperature as well as the lowest model level (Figs. 12b,c). The spatial pattern for the difference in the skin temperature is very similar to that for 2-m temperatures but is slightly larger in magnitude (Fig. 12b). Using standard output from the MERRA-2 atmospheric model, skin temperature can be evaluated separately over open water and over sea ice, as seen in Figs. 12d and 12e. The skin temperature over open water tends to be colder in the M2AMIP, but this is generally less than 0.5 K (Fig. 12d) and is actually on the order of 0.2 K for most of the Arctic. The largest contribution to the difference in skin temperature and 2-m temperature stems from the skin temperature over ice (Fig. 12e). This is not surprising given that differences in the prescribed SSTs are small (Fig. 7b).
A key factor in the difference in 2-m temperature between the two AMIP simulations is the inappropriate use of a sea ice concentration of 1.0 north of 82.5°N in ERA-I for roughly half of the time series clearly seen in Fig. 12f. Aside from the Barents and Kara Seas, there is a general preference for a larger ice concentration in the Arctic in MERRA-2 and M2AMIP. There is some discrepancy along the sea ice boundary that likely stems from a combination of the difference in spatial resolution between the boundary forcing datasets and the fact that ERA-I sets the ice concentration to zero when it is analyzed to be below 0.2. The influence of these sea ice discrepancies has impacts in the Arctic climate beyond surface air temperatures. Latent heat is released over the exposed ocean in M2AMIP, forming low level clouds and resulting in a larger cloud fraction, cloud optical depth, and downwelling longwave radiative flux (Figs. 12g,h,i). When only the 2010–16 period is considered, many of these differences are muted as a result of using the same boundary forcing datasets. During that time period, the largest differences over the Arctic Ocean in most of the variables previously discussed occur along the sea ice edge and over open water in the Barents and Chukchi Seas.
Discrepancies in the MERRA-2 ice concentration are also evident. Notable differences, particularly in the sensible heat flux, between the two AMIP simulations are found along the coastline (Figs. 12a,b,f,g,h,i,j). This is likely associated with the known issue in MERRA-2 in which the land–sea mask is not properly applied to sea ice from the Reynolds daily dataset, resulting in erroneous coastal polynyas.
Additional differences between the AMIP experiments highlight discrepancies in the model computation of turbulent fluxes. The GEOS model is semi-implicitly coupled with the surface. As an iterative process to update the skin temperature over ice between time steps, the model calls the turbulence parameterization, followed by a balancing of the surface energy budget through radiation. Both of these processes may be assessed through temperature tendency terms at 1000 hPa, as defined in the atmospheric model due to turbulence, radiation, dynamics, moisture, friction, and gravity wave drag. By design, these terms cancel each other out. It may be seen in Fig. 12k that differences in the temperature tendency due to turbulence bears similarity to 2-m temperature and skin temperature over ice, in the sense that the temperature is being warmed over the North Pole. The difference in the temperature tendency due to turbulence between M2AMIP and M2AMIPERA is balanced by the temperature tendency due to dynamics (Fig. 12l). There is very little difference in the temperature tendency due to radiation, despite the difference in the downwelling longwave (LW) flux. The larger cloud optical depth in M2AMIP has a minimal impact on downwelling shortwave (SW) radiation due to the near-zero solar insolation in the Arctic during DJF (not shown).
A time series of area-averaged 2-m temperature for the period of 1980–2016 during winter in MERRA-2, ERA-I, M2AMIP, and M2AMIPERA over the Arctic Ocean domain is shown in Fig. 13, while figures for the other seasons may be found in the online supplemental material. Linear trends are given in parentheses. The year-to-year variability is similar among the datasets, with noticeable spikes in temperature during years with reduced sea ice extent in 2007 and 2012 in both the reanalyses and the AMIP-style simulations. Both reanalyses agree that a pronounced minimum temperature within the time series occurs in 1987, but the AMIP-style simulations do not agree. In fact, one member of the M2AMIPERA ensemble shows the minimum a year later. The variability among members tends to be larger in years with enhanced sea ice extent, as the underlying model has a larger influence on the 2-m temperature than the prescribed temperature over open water.
Despite agreement in year-to-year variability, the entire time series of both AMIP-style simulations is warmer than either reanalysis, with M2AMIP being ~1.5–2 K warmer than MERRA-2. This is especially true in the early 1980s, as well as the period extending from 1996 through 2004. It is also evident that the predominant warming trend occurs prior to the mid-2000s in M2AMIP. Much more agreement can be seen between the two reanalyses and the two AMIP ensembles after 2009 when all use similar SST and sea ice datasets. It is at this point that ensemble members of the two AMIP-style simulations become nearly indistinguishable.
In agreement with other reanalyses and observational datasets, Arctic amplification is present in MERRA-2 and ERA-I, with the Arctic warming nearly 3 times as fast as the midlatitudes, although the linear trend is larger over the Arctic Ocean in ERA-I. The vast majority of the warming in ERA-I occurs between the late 1980s and the early 2000s, as the Arctic temperature in ERA-I becomes closer to MERRA-2. Alternatively, no M2AMIP ensemble member indicates Arctic amplification, as the temperature over the Arctic is quite warm in the 1980s, and the ensemble mean depicts a substantially larger midlatitude trend (Table 2). The problem is twofold from an Arctic amplification perspective. Not only is the Arctic not warming at the rate seen in MERRA-2 or ERA-I, but the midlatitudes are warming more in M2AMIP than in MERRA-2. Using the SST and ice concentration from ERA-I for M2AMIPERA, the MERRA-2 AGCM produces Arctic amplification, at a faster rate than suggested by MERRA-2 but slower than for ERA-I. Although Arctic amplification is produced in M2AMIPERA, the time series of near-surface air temperature in the Arctic is not necessarily any more realistic than M2AMIP.
This evaluation provides several conclusions. Model air temperatures respond to sea ice and SST boundary condition differences and come to close agreement in periods when the experiments use similar boundary values. This supports the contention of the importance of the boundary conditions to reanalysis temperatures. The above analysis indicates a strong dependence between differences in reanalysis Arctic temperatures and the sea ice and boundary conditions used. Second, the AMIP ensemble temperatures are considerably warmer than the corresponding reanalyses. This is consistent with the GEOS ice surface representation being inherently too warm. Finally, this difference—as well as the magnitude and variability of the MERRA-2 analysis increments—indicates that observations have an important impact on reanalysis temperatures and trends. It is evident that more in situ and remote sensing observations as well as a better use of existing satellite observations are needed in the Arctic to represent the characteristics of the entire region in both reanalyses and numerical models. Independent grid box–scale observations, such as those being collected by the MOSAiC field campaign, are also needed to properly evaluate reanalyses with fractional sea ice coverage.
An intriguing result is the generally better agreement of reanalyses using a threshold ice cover to the in situ observations, as compared to those using fractional ice cover. Ice drifting stations are necessarily located on robust ice floes that are consistent with a local ice concentration near 100%; it is speculated that a contrast with the larger-scale, satellite-derived ice fraction is at issue, but this requires further investigation. Additionally, the possibility exists that the sea ice boundary forcing used in reanalyses with fractional sea ice cover is biased low, as uncertainties exist in the observations of sea ice concentration used as input (Gerland et al. 2019).
The relation of surface temperatures to the melting point of sea ice, in combination with the freezing point of more salty seawater, is an important consideration for the Arctic, and this has discouraged the use of time series normalization or temperature anomalies here as compared to other studies that compared to long-term observational datasets (Simmons et al. 2017; Gross et al. 2018). The presented analysis is also in consideration of the character of the discrepancies over ocean, land, and land-ice surfaces, which represent differing physics and different treatments in the models. Among these three surfaces, the treatment of sea ice in numerical weather prediction models and reanalyses remains relatively simplistic. The use of a constant skin layer and Newtonian relaxation in MERRA-2 throughout the entire time series might not be appropriate; however, this is a limitation of an uncoupled representation of sea ice processes. While coupled atmosphere–ocean assimilation may yield improvements, it is apparent that some reduction in reanalysis uncertainty could be achieved in their current configuration with a more careful treatment of model boundary conditions and improved parameterizations over sea ice. A more advanced representation of the ice surface may improve results (e.g., Batrak and Müller 2019). Nevertheless, the uncertainty in sea ice concentration is likely to be a persistent issue.
Current reanalyses do not generally consider a wide variety of surface processes and characteristics of sea ice that affect air temperatures. For example, temporal changes in ice thickness affecting the conductivity of the ocean heat flux are not represented. Cai and Kalnay (2005) previously showed analytically that a long reanalysis can detect a trend present in observations assimilated by the reanalysis, even if the physical processes responsible for the trend are not represented in the model. This places additional weight on obtaining and correctly utilizing available observations in the data-sparse Arctic. An independent, definitive, and spatially representative time series of near-surface Arctic air temperatures for model and reanalysis evaluation is not available. In the absence of other suitable methods, reanalyses remain a promising avenue for constructing the historical Arctic temperature record.
ECMWF, JRA, and ASRv2 reanalysis fields were obtained from the Research Data Archive at NCAR. NCEP–NCAR and NCEP DOE II reanalyses were obtained from the NOAA ESRL. CFSR fields were obtained from the NOAA National Operational Model Archive and Distribution System. MERRA, MERRA-2, and AIRS-AMSU were obtained from the NASA GESDISC. ISCCP and AVHRR data were obtained from the NOAA National Centers for Environmental Information. TOVS Polar Pathfinder and NP ice drifting station data were obtained from the NSIDC at the University of Colorado at Boulder. SHEBA data were obtained from the Department of Atmospheric Sciences, University of Washington, Seattle. This work was funded by the NASA Earth Science Research Program. We thank Adrian Simmons and two anonymous reviewers for their constructive feedback on the manuscript, as well as Walt Meier for use discussions.
List of Acronyms
Atmospheric Infrared Sounder
Atmospheric Model Intercomparison Project
Advanced Microwave Sounding Unit
Arctic System Reanalysis, version 2
Advanced Very High Resolution Radiometer
Climate Forecast System Reanalysis
Department of Energy
European Centre for Medium-Range Weather Forecasts
ECMWF interim reanalysis
Goddard Earth Observing System model
global sea ice and SST dataset
Goddard Space Flight Center
Hadley Centre sea ice and SST dataset version 1
High-resolution global monthly
International Satellite Cloud Climatology Project
Japan Meteorological Agency
AMIP simulation using the MERRA-2 AGCM
AMIP simulation using the MERRA-2 AGCM and ERA-I forcing
Multidisciplinary Drifting Observatory for the Study of Arctic Climate
Modern-Era Retrospective Analysis for Research and Applications
National Aeronautics and Space Administration
National Center for Atmospheric Research
National Centers for Environmental Prediction
USSR Arctic ice drift stations
- OSI SAF
Ocean and Sea Ice Satellite Application Facility
- OSI SAFr
OSI SAF reprocessed sea ice concentration product
Operational Sea Surface Temperature and Sea Ice Analysis
Surface Heat Budget of the Arctic
Sea surface temperature
Television Infrared Observation Satellite
TIROS Operational Vertical Sounder
Union of Soviet Socialist Republics
This article is included in the Modern-Era Retrospective analysis for Research and Applications version 2 (MERRA-2) special collection.