1. Introduction
The Arctic is one of the regions most affected by ongoing climate change (IPCC 2019), warming 2–3 times as fast as the global average (IPCC 2021) and consequently losing its sea ice cover. Since the beginning of the satellite record, the sea ice extent has been reduced by 9% in winter and 48% in summer (Docquier and Koenigk 2021), while the sea ice thickness has been reduced by 66% (Kwok 2018). The multiyear ice area has halved (Kwok 2018), and as a result the shelf regions have become seasonally ice free (Onarheim et al. 2018). These changes directly impact the upper Arctic Ocean, notably its freshwater content (Solomon et al. 2021, and references therein). Sea ice changes also seem to be caused by and to enhance changes in the deeper layers (Årthun and Eldevik 2016), in particular that of the Atlantic Water (AW), via a process known as the “Atlantification” of the Arctic Ocean (Polyakov et al. 2017): the inflowing Atlantic Water is warmer and remains warm further into the Arctic, which reduces the sea ice cover, either by directly melting the ice or inhibiting sea ice growth, which in turn allows the atmosphere to modify water properties at greater depths (e.g., Decuypère et al. 2022). Climate models, however, fail to reproduce the sea ice evolution (Notz et al. 2020), and their upper Arctic Ocean representation strongly varies among models (Ilicak et al. 2016; Lique and Thomas 2018; Zanowski et al. 2021). We here investigate their representation of the deeper Arctic Ocean layers, from the Atlantic Water to the seafloor.
The Arctic Ocean consists of four deep basins (Fig. 1): the Nansen and Amundsen Basins on the Eurasian side, and the Makarov and Canada Basins on the Amerasian side, separated by the Lomonosov Ridge. The Eurasian Basin contains two water masses below 1000 m (Smethie et al. 1988): the Eurasian Basin Deep Water (EBDW; down to 2500 m depth) and Eurasian Basin Bottom Water (EBBW; from 2500 m to the seafloor). The denser deep and bottom waters are primarily the result of sea ice formation on the Siberian shelf (Nansen 1906): when sea ice forms, brine is rejected, and the resulting dense water cascades off the shelf through troughs and canyons (Aagaard 1981; Rudels et al. 1999). This cascading is often referred to as “overflow,” the term we use in this manuscript. The only deep connection between the Arctic Ocean and the global oceanic circulation is via Fram Strait (∼2500 m deep), through which the comparatively warm and salty Atlantic Water enters from the Nordic seas. After entering through Fram Strait, the Atlantic Water circulates cyclonically around the entire Arctic Ocean, its upper limit gradually deepening from the surface to ∼200-m depth, its lower limit never exceeding 1000 m (Rudels et al. 1999; Aksenov et al. 2011). However, its properties impact the whole water column as it can be entrained by the overflows (Smethie et al. 1988; Frank et al. 1998; Valk et al. 2020). At the bottom of Fram Strait, the EBDW flows out. Part of it mixes with fresh Greenland Sea deep waters and flows back into the Arctic through Fram Strait (Frank et al. 1998; Langehaug and Falck 2012; von Appen et al. 2015), below the Atlantic Water (Fig. 1). As Fram Strait is the only gateway for deep waters in and out of the Arctic, its representation in climate models is crucial for the models’ ability to represent the Arctic hydrography. In the Amerasian Basin, the deep water mass is the Canada Basin Deep Water (CBDW), the saltiest and warmest of the Arctic deep waters (Aagaard et al. 1985), suspected to be modified EBDW that intruded through the Lomonosov Ridge. There is no agreement as to whether this intrusion happens continuously (Timmermans and Garrett 2006), in pulses (Timmermans et al. 2005), or whether it happened and stopped centuries ago (Schlosser et al. 1997). The higher salinity and temperature of this CBDW compared to its Eurasian source is most likely caused by shelf overflows in the Amerasian Basin (Rudels 1986; Ivanov et al. 2004). Eventually, CBDW intrudes back into the Eurasian Basin through canyons in the Lomonosov Ridge (orange arrows in Fig. 1), as a very salty deep water mass (Björk et al. 2018).
To properly represent the deep Arctic circulation and water mass properties, models need to accurately simulate
- the interactions with sea ice and upper Arctic Ocean processes, especially ventilation and shelf processes;
- the large-scale circulation within the Arctic, including bathymetry and mixing; and
- Fram Strait and upstream ocean properties.
Earlier studies suggest that accurately simulating all these processes was challenging in the previous generation of climate models (Shu et al. 2019) and will continue to be challenging for the models that participated in the latest Coupled Model Intercomparison Project, phase 6 (CMIP6; Eyring et al. 2016): the modeled Arctic sea ice (Notz et al. 2020), Arctic solid and liquid freshwater storage and fluxes (Zanowski et al. 2021; Rosenblum et al. 2021), and properties and processes upstream in the Nordic seas (Heuzé 2021) are inaccurate, and the models have a large range of behaviors. The vast majority of these models also fail to reproduce overflows in other parts of the world (Adcroft et al. 2019; Heuzé 2021). Although not directly resolved in climate models, turbulent mixing (induced by, e.g., tides, the breaking of internal waves, and eddies) is known to influence the hydrographic structure of the Arctic Ocean (Rippeth and Fine 2022). Despite the levels of turbulent kinetic energy generally being much lower in the Arctic Ocean than elsewhere across the global ocean (Pinkel 2005; von Appen et al. 2022), mixing contributes to stirring up heat from intermediate depths (Polyakov et al. 2020) and stirring down freshwater (Manucharyan and Spall 2016), hence influencing the stratification. Generally, wind-induced mixing is limited in the Arctic, partly due to the decoupling of the ocean from the atmosphere by sea ice (Morison et al. 1985), although this may be changing as the sea ice disappears (Armitage et al. 2020). Mixing due to tides, in contrast, has been shown to be an important process, especially on the shelves and near the shelf break, where tides generate internal waves (Rippeth et al. 2015; Fer et al. 2020). Eddies can intensify vertical mixing (Rippeth and Fine 2022, and references therein), but also play an important role in the transport of water masses between the shelves and the deep basins (Spall et al. 2008). The parameterizations of such processes are thus likely to be of importance for the representation of the deep water masses of the Arctic Ocean.
Khosravi et al. (2022) recently published an overview of biases in the Atlantic Water in CMIP6 models; we here expand on their results by assessing not only the Atlantic Water but also the deep and bottom waters, and by explaining the causes for all these biases, focusing on the models’ mean historical state only. We start by describing the 14 CMIP6 models and methods that we use (section 2) before quantifying the biases in Arctic deep waters in all basins (section 3a). We then assess the representation of overflows (section 3b) and circulation of the deep water masses within the Arctic (section 3c). We finally evaluate the representation of Fram Strait, both in properties and fluxes in and out of the Arctic, and its relation to the biases in the Arctic (section 3d). We finish with a discussion, notably on possible directions for CMIP7 (section 4).
2. Data and methods
a. The CMIP6 models
We use the output from 14 fully coupled models that participated in CMIP6 (Eyring et al. 2016), listed in Table 1. These models were selected, following a preliminary study on the 35 CMIP6 models used in Heuzé (2021), as representative of their family, for diversity in vertical grid types, for comparison with those used in a companion paper (Muilwijk et al. 2023), and after eliminating the ones with the poorest bathymetry (i.e., absence of Lomonosov Ridge and/or unrealistically narrow Fram Strait). Most of the models we selected have a resolution of ∼50 km in the Arctic (9 km for the highest resolution) and 50 levels or more in the vertical. No more than two models share the same ocean component with the same version. These 14 models have been initialized using six different ocean climatologies (Table 1).
Table 1. Characteristics of the 14 CMIP6 models used in this study: horizontal grid type, which outputs are missing (if any), horizontal resolution in the Arctic, type of vertical grid and number of vertical levels, ocean model component, vertical mixing scheme(s), ocean climatology used to initialize the model, and reference. The horizontal resolution in the Arctic (fourth column) was calculated as the square root of the total area north of 70°N divided by the number of points the model has north of 70°N. For the vertical grids, ρ means isopycnic, σ is terrain-following, and several symbols indicate hybrid. For the vertical mixing schemes, we use a similar nomenclature to that of Huang et al. (2014) for CMIP5: KPP = K-profile parameterization scheme (Large et al. 1994); TM = tidal mixing parameterization; ePBL = energetics-based planetary boundary layer (Reichl and Hallberg 2018); NK = Noh and Jin Kim (1999); PP = Pacanowski and Philander (1981); DL = Decloedt and Luther (2010); TC = turbulent closure scheme; and K90 = Kraus (1990).
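The resolution estimate described in the caption reduces to a single formula. The following is a minimal sketch, assuming the areas of all grid cells north of 70°N are available as a flat list in m²; the function name and the sample values are hypothetical and not taken from the study.

```python
import math

def arctic_resolution_km(cell_areas_m2):
    """Effective horizontal resolution (km): square root of the total area
    divided by the number of grid cells, i.e. the root of the mean cell area."""
    if not cell_areas_m2:
        raise ValueError("no grid cells north of 70N")
    mean_area = sum(cell_areas_m2) / len(cell_areas_m2)
    return math.sqrt(mean_area) / 1000.0

# Hypothetical example: four cells of 50 km x 50 km each.
example_areas = [2.5e9] * 4
print(arctic_resolution_km(example_areas))
```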
The 14 models include a wide variety of mixing schemes, from a simple linear increase of vertical diffusivity with depth to more complex kinetic energy closure schemes. It is worth noting that the information presented in Table 1 is probably incomplete as the published model descriptions are inconsistent in the level of detail, and the fields on the ESGF website are often left empty. Interestingly, several of the most extensive references list modifications relevant for the Arctic and/or for overflows (quoted from the cited papers):
- EC-Earth3 has a “diffusive bottom boundary layer scheme with implicit bottom friction to mix dense water down a slope” (Döscher et al. 2022);
- GFDL-CM4 includes an overflow parameterization in the Nordic seas, but not yet in the Arctic (Adcroft et al. 2019);
- in MIROC6, “the turbulent mixing process in the surface mixed layer is changed so that there is no surface wave breaking and no resultant near-surface mixing in regions covered by sea ice,” which the authors argue “contributes to better representations of the surface stratification in the Arctic Ocean” (Tatebe et al. 2019);
- in NorESM2-LM, “selective damping of external inertia–gravity waves in shallow regions is enabled to mitigate an issue with unphysical oceanic variability in high-latitude shelf regions that had caused excessive sea ice formation in CMIP5” (Seland et al. 2020);
- finally in UKESM1-0-LL, the albedo of snow on sea ice is decreased as “compensation for deficient transport of warm Atlantic water into the Arctic in ORCA1” (Sellar et al. 2019).
We evaluate the last 30 years of the historical run, i.e., January 1985–December 2014. We use only one ensemble member for each model, labeled “r1i1p1f1,” except for UKESM1-0-LL for which we use “r1i1p1f2” as r1i1p1f1 was not available. The outputs we use are the monthly seawater salinity “so,” potential temperature “thetao,” eastward velocity “uo,” and northward velocity “vo,” except for GFDL-ESM4 for which uo and vo were not archived. We also use the sea ice concentration “siconc” and sea ice thickness “sivol” (in fact, sea ice volume divided by gridcell area; available for a majority of models), except for CanESM5, for which we use “sithick” (actual floe thickness; available for few models) as “sivol” was not available. For 8 models, we also use the seawater age since surface contact “agessc,” which we will hereafter refer to as the age of water. For the mixed layer depth, we use the “mlotst” output when available, and otherwise compute it as per the CMIP6 protocol by first computing the potential density σθ from the monthly salinity and temperature, and then using a threshold of 0.125 kg m−3 referenced to 10-m depth. The “mlotst” and computed values are not the same due to the nonlinearity of the equation of state, but as shown in Heuzé (2021), the difference is not significant for shallow mixed layers. With the exception of the mixed layer and all flux computations, we use the density referenced to 2000-m depth (σ2) as a compromise considering the wide range of depths covered. The diagnostics based on σ2 differences were also done using σ0 and σ4 (not shown), but no significant differences in any of our results were found. All densities were computed using the TEOS-10 equation of state as implemented in the Gibbs Seawater (GSW) Oceanographic Toolbox (McDougall and Barker 2011).
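The threshold-based mixed layer depth described above can be illustrated with a short sketch. It assumes that the potential density profile has already been computed (for instance with the GSW toolbox) and is passed in as plain lists. The linear interpolation between levels, the function names, and the sample profile are illustrative assumptions rather than the procedure used in this study.

```python
def interp(x, xs, ys):
    """Linear interpolation of ys(xs) at x; xs must be increasing."""
    if x <= xs[0]:
        return ys[0]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            w = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + w * (ys[i] - ys[i - 1])
    return ys[-1]

def mixed_layer_depth(depths, sigma, threshold=0.125, ref_depth=10.0):
    """Depth (m) at which density first exceeds its value at ref_depth
    by `threshold` (kg m-3). Returns the deepest level if the threshold
    is never reached (column mixed to the bottom)."""
    target = interp(ref_depth, depths, sigma) + threshold
    for i in range(1, len(depths)):
        if depths[i] > ref_depth and sigma[i] >= target:
            z0, z1 = depths[i - 1], depths[i]
            s0, s1 = sigma[i - 1], sigma[i]
            if s1 == s0:
                return z1
            # Interpolate the crossing depth, clamped to the bracketing levels.
            w = min(1.0, max(0.0, (target - s0) / (s1 - s0)))
            return z0 + w * (z1 - z0)
    return depths[-1]

# Hypothetical profile: well mixed down to about 25 m, stratified below.
depths = [5.0, 15.0, 25.0, 50.0, 100.0]
sigma = [27.00, 27.00, 27.05, 27.30, 27.50]
print(mixed_layer_depth(depths, sigma))
```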
All computations were performed on the models’ native grid (see the appendix, Fig. A1) with these two exceptions:
- The GISS-E2-1-H and NorESM2-LM native vertical grids were particularly challenging to work with, so we instead show their regularized grid output. We nevertheless verified that our key results still hold on the native grid.
- The comparisons to the climatology in sections 3a and 3d were performed after interpolating all the model temperature and salinity values onto the grid of the World Ocean Atlas 2018 (WOA18) climatology (see next section).
b. Observational data
To quantify biases in the CMIP6 models, we first compare them to the Unified Database for Arctic and Subarctic Hydrography (UDASH; Behrendt et al. 2018) by generating basin 30-yr-average temperature and salinity profiles in the four deep basins of the Arctic Ocean (as defined in Fig. 1). As the UDASH profiles are scattered, rather than interpolating them ourselves we use the WOA18 (Locarnini et al. 2019; Zweng et al. 2019) objectively analyzed annual fields at a 0.25° resolution (∼25 km), which includes the same profiles as in UDASH, for all computations where the model and observations had to be collocated.
Most models use an earlier version of the World Ocean Atlas as initialization (Table 1), with 7 out of 14 models using the version that was the latest when the models were run, i.e., WOA13. Two models use an even earlier version, from 2009 or 2001. The main difference between the versions is the amount of data ingested and the time period of the data; the reader will find more information about the versions’ differences in the WOA18 publications (Locarnini et al. 2019; Zweng et al. 2019). The second most common climatology is the Polar science center Hydrographic Climatology (PHC; Steele et al. 2001), which includes the WOA98 data and the Arctic Ocean Atlas (AOA; Environmental Working Group 1997, 1998), a gridded compilation of previously classified U.S. and Russian hydrographic data collected during the Cold War. The one disadvantage of PHC is that the latest version, PHC3, was last updated in 2005. Finally, one model uses the Met Office Hadley Centre climatology EN4 (Good et al. 2013), which merges the World Ocean Database 2009 with autonomous data [see Good et al. (2013) for more information]; in the ice-covered regions, these are to date limited to the upper 1000 m (see review in Rabe et al. 2022). For robustness, we computed the model biases relative to all these datasets as well and found no significant difference in our results, most likely because the number of profiles in the deep Arctic remains extremely low to this day; therefore, differences between the observational datasets are negligible in the deep Arctic compared to the model biases. We therefore keep the higher spatial resolution WOA18 as our reference, as it also is the most directly comparable to UDASH.
c. Methods
The primary objective of this paper is to quantify biases in the properties of the deep water masses of the Arctic Ocean: the AW, the EBDW, its counterpart the CBDW, and the EBBW. Traditionally, for observational datasets, the definition of these water masses is based on temperature, salinity, or density thresholds (e.g., Smethie et al. 1988; Rudels 2009; Korhonen et al. 2013). As we expect these properties to be biased in the models, we instead chose these three definitions:
- The Atlantic Water core is the depth of the temperature maximum, detected between 150- and 2000-m depth. This temperature-based definition is similar to that used for observations, but without imposing a constraint on the value of the temperature maximum, and adjusted for the wider depth range in models. Note that using a threshold of 100 m instead of 150 m does not change the results, probably because as found by Lavoie et al. (2022), Pacific Water tends to be missing from CMIP models, a result not surprising given their biases in the representation of the Bering Strait inflow (Zanowski et al. 2021).
- Deep water properties are those at 2000 m. In observations, EBDW sits between approximately 1000- and 2500-m depth in the Eurasian Basin, and CBDW extends from approximately 1000 m all the way to the seafloor. Using the value at 1500 m or that at 2500 m did not change our results significantly.
- Bottom water properties are those of the deepest grid cell with a value.
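The three definitions above can be sketched for a single model column as follows. The sketch assumes profiles stored as lists ordered from the surface down, with None marking cells below the seafloor. Picking the level nearest to 2000 m (rather than interpolating to it) is an assumption made here for brevity, and all names and sample values are hypothetical.

```python
def water_mass_levels(depths, temp, core_range=(150.0, 2000.0), deep_level=2000.0):
    """Return the indices of the Atlantic Water core, the deep water level,
    and the bottom cell for one column. `temp` holds None below the seafloor."""
    valid = [i for i, t in enumerate(temp) if t is not None]
    if not valid:
        raise ValueError("column has no valid data")
    # AW core: temperature maximum within the search range, with no
    # constraint on the value of that maximum.
    in_range = [i for i in valid if core_range[0] <= depths[i] <= core_range[1]]
    core = max(in_range, key=lambda i: temp[i]) if in_range else None
    # Deep water: valid level closest to 2000 m (assumption: nearest level,
    # meaningful only in columns deeper than 2000 m).
    deep = min(valid, key=lambda i: abs(depths[i] - deep_level))
    # Bottom water: deepest cell with a value.
    bottom = valid[-1]
    return {"aw_core": core, "deep": deep, "bottom": bottom}

# Hypothetical column with a seafloor between 3000 and 4000 m.
depths = [50.0, 200.0, 400.0, 1000.0, 2000.0, 3000.0, 4000.0]
temp = [-1.5, 0.8, 1.2, 0.0, -0.7, -0.9, None]
levels = water_mass_levels(depths, temp)
print({name: depths[i] for name, i in levels.items()})
```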
The upper ocean is not the topic of this paper. We nevertheless investigate whether biases in the upper ocean and in the deep layers are related, and therefore computed the mean temperature, salinity, and density in the top 100 m as a proxy for upper ocean properties. A detailed study of stratification is provided by Muilwijk et al. (2023); we here only provide a simplified definition of stratification, taken as the difference between the upper 100 m mean density and that of the AW core.
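A minimal sketch of this simplified stratification measure is given below. The thickness weighting of the upper 100 m mean and the sign convention (AW core density minus upper ocean density, so that stable columns give positive values) are assumptions, as the text does not specify them; the names and numbers are hypothetical.

```python
def upper_ocean_mean(layer_tops, layer_bottoms, values, limit=100.0):
    """Thickness-weighted mean of `values` over the part of each layer
    that lies above `limit` (m). Layers with None are skipped."""
    total = 0.0
    weight = 0.0
    for top, bottom, value in zip(layer_tops, layer_bottoms, values):
        thickness = min(bottom, limit) - top
        if thickness > 0 and value is not None:
            total += value * thickness
            weight += thickness
    if weight == 0:
        raise ValueError("no valid data above the depth limit")
    return total / weight

def stratification(sigma_upper_mean, sigma_aw_core):
    """Density difference between the AW core and the upper 100 m
    (assumed sign convention: positive when the column is stable)."""
    return sigma_aw_core - sigma_upper_mean

# Hypothetical sigma2 values (kg m-3) on four layers.
tops = [0.0, 10.0, 50.0, 100.0]
bottoms = [10.0, 50.0, 100.0, 200.0]
sigma2 = [35.0, 35.5, 36.0, 36.8]
upper = upper_ocean_mean(tops, bottoms, sigma2)
print(upper, stratification(upper, 37.0))
```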
We compare the properties of the different water masses in the four deep basins of the Arctic north of 70°N (Fig. 1a), where “deep basin” is defined as having a seafloor depth greater than 2000 m. The shelf is defined as regions shallower than 1000 m (Rudels 2009). Note that the 1000- and 2000-m isobaths coincide at most locations in most models, as the shelf break is very steep. Throughout this manuscript, we use the short name “Siberian shelf” to refer to the shelf along the Eurasian Basin, i.e., from Fram Strait to 160°W. As we will show, no deep water formation occurs on the shelf along the Canada Basin in CMIP6 models, so we do not focus on this region. Finally, to investigate the deep outflows from the Arctic, we determine the biases on the Greenland shelf, i.e., around Greenland but north of 70°N.
In the Arctic, dense waters cascading from the shelf to the deep basin, commonly referred to as overflows, strongly modify the properties of all water masses (e.g., Aagaard 1981; Luneva et al. 2020). As summarized in Luneva et al. (2020), these overflows are bottom-trapped gravity currents characterized by a comparatively high density, but also by a young age, as dense shelf waters sink off the shelf within the same year that they sank from the surface to the shelf seafloor. Therefore, we detect their presence in models by studying
- the minimum age at the bottom grid cell, for the eight models that provided the age of water output, and
- the maximum bottom density, for the other six models.
For both groups of models, we look for a continuity in this diagnostic on and off the shelf, in maps of the bottom properties, and in sections along and across the troughs where we expect their presence.
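The continuity criterion can be illustrated along a single section of bottom cells ordered from the shelf toward the basin. The sketch below flags a section when an unbroken run of young bottom water spans the 1000-m isobath. It is an automated stand-in for what is, in this study, a visual inspection of maps and sections; the 1-yr age limit, the function name, and the sample values are assumptions made for illustration.

```python
def young_water_crosses_isobath(seafloor_depths, bottom_ages,
                                max_age=1.0, isobath=1000.0):
    """True if a contiguous run of cells with bottom age <= max_age (yr)
    includes cells both shallower and deeper than `isobath` (m)."""
    on_shelf = False
    off_shelf = False
    for depth, age in zip(seafloor_depths, bottom_ages):
        if age is not None and age <= max_age:
            if depth < isobath:
                on_shelf = True
            else:
                off_shelf = True
            if on_shelf and off_shelf:
                return True
        else:
            # The run of young water is broken: start over.
            on_shelf = False
            off_shelf = False
    return False

# Hypothetical section across a trough, shelf first.
section_depths = [200.0, 400.0, 800.0, 1200.0, 1800.0, 2500.0]
print(young_water_crosses_isobath(section_depths, [0.2, 0.4, 0.6, 0.9, 5.0, 40.0]))
print(young_water_crosses_isobath(section_depths, [0.2, 0.4, 3.0, 0.9, 5.0, 40.0]))
```

The same logic would apply to the density-based diagnostic by replacing the age test with a test on bottom density exceeding a chosen value.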
As in Zanowski et al. (2021), the boundaries for Fram Strait were chosen by hand for each model and span 20°W–12°E, 78°–80°N. The results are not sensitive to the choice of exact boundaries within that range, but should ideally be as close to 79°N as possible for comparison with observations (e.g., Beszczynska-Möller et al. 2012). For the rotated and tripolar grids, the northward velocity “vo” does not correspond to velocities toward the true north 90°N but rather toward the model’s location of the North Pole. Therefore, for all models,
3. Results
In this section, we first quantify the biases in the properties of the Atlantic Water, deep, and bottom water masses, and the horizontal and vertical relationships between these biases. We then evaluate the representation of the processes that set these properties, within the Arctic Ocean (sections 3b and 3c) and at Fram Strait (section 3d).
a. Biases in water mass properties
We start by quantifying biases in the mean temperature and salinity and their evolution with depth in the four deep basins (Fig. 2 and individual values in Tables A1–A3 in the appendix). In observations, as the Nansen Basin lies closest to its inflow, the Atlantic Water there is warm (black line, Fig. 2a), salty (Fig. 2b), and constrained to a thin and shallow depth range, around 200-m depth. In the models, in contrast, the Atlantic Water lies deeper (multimodel average of 395 m, ranging from 76 to 1321 m) and occupies a thicker layer, which is in agreement with the findings of Khosravi et al. (2022) in CMIP6, and Ilicak et al. (2016) for CORE-II. In fact, had we used the standard definitions that the Atlantic Water is anything warmer than 0°C (e.g., Korhonen et al. 2013) or lighter than 27.97 kg m−3 (e.g., Rudels 2009) (black dotted lines in Fig. 2c), we would have found Atlantic Water all the way to the seafloor in half of the models. Therefore, although on average the models are biased cold in the Atlantic Water core (multimodel mean, MMM, of −0.44°C), they are biased warm at 2000-m depth (MMM of 1.14°C) and at the bottom of the Nansen Basin (1.25°C). The salinity profile is also inaccurate: whereas in observations the salinity is maximum in the AW core, in 10 of 14 models the salinity continues to increase with depth. Consequently, the T–S diagram in the Nansen Basin (Fig. 2c) is unrealistic for the majority of the models. Most models have a shape somewhat resembling that of the observations (black), but with peaks at the wrong temperature and/or salinity and of a largely inaccurate magnitude (see, e.g., CanESM5, plain blue line). The least inaccurate is GFDL-CM4 (plain green line), despite an AW core lying on average 400 m too deep and the whole AW layer extending to 2000-m depth. One of the most inaccurate is NorESM2-LM, which has many discontinuities in its hydrographic profiles. This is because on its native isopycnic grid (not shown), as the model is comparatively unstratified, some density classes occupy hundreds of meters. On average, the models are less stratified than observations: they have a dense bias in the AW and a light bias in the deeper layers; this result will be important in section 3b when investigating the ventilation.
Table A1. Area-weighted mean bias model minus WOA18 climatology in potential temperature (first line, left), salinity (first line, right; unit: psu), depth (second line, left), and density σ2 (second and third lines, right; unit: kg m−3) of the Atlantic Water core for each model and the multimodel mean “MMM” in the four deep basins and on the two shelf regions of interest.
Table A2. Area-weighted mean bias model minus WOA18 climatology in potential temperature (first line, left), salinity (first line, right; unit: psu) and density σ2 (second line) of the Arctic deep water, defined here as properties at 2000-m depth, for each model and the multimodel mean “MMM” in the four deep basins.
Table A3. Area-weighted mean bias model minus WOA18 climatology in potential temperature (first line, left), salinity (first line, right; unit: psu), and density σ2 (second line) of the bottom water, defined as the deepest grid cell with values, for each model and the multimodel mean “MMM” in the four deep basins and on the two shelf regions of interest.
All four deep basins exhibit the same biases: the Atlantic layer is too deep, too thick, and in some cases occupies the entire depth of the basin (Fig. 2). This suggests that the biases throughout the water column are linked (Fig. 3). To verify this link, we compute the across-model correlation, i.e., each model is represented by its 30-yr average, basin-average value, and the correlation between models is tested. In all basins, the across-model relationship between any two properties of the different water masses in that basin is split in two distinct depth levels:
- The biases in the upper 100 m are strongly correlated to each other: cold biases are associated with salty biases, which are associated with dense biases, and in turn with a weak stratification. These upper ocean biases are further investigated in Muilwijk et al. (2023) and beyond the scope of this paper. What is relevant for this study is that the biases in the upper 100 m are not correlated to those of the deeper water masses (empty squares in the top four lines, Fig. 3).
- From the Atlantic layer down, the biases in all properties and water masses are positively correlated to each other (albeit at 90% significance only between AWCT and AWCS due to the cold but salty bias of NorESM2-LM). As our definitions artificially split the Canada Basin Deep Water in two different water masses (2000-m depth and bottom), we expect a strong correlation between these two depth levels in the Makarov and Canada Basins. However, the correlations are larger than 0.9 across all basins and depth levels (diagonal of deep red values, Fig. 3), and the actual values nearly align along the unit line when plotted against each other (not shown). As suspected from Fig. 2, most models in our study do not have distinct deep water masses, but rather fill the deep basins with a similar water from the Atlantic Water level to the seafloor.
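The across-model correlation used here treats each model as one data point, represented by its 30-yr, basin-mean value for each property. A minimal sketch of the Pearson correlation between two such sets of values follows; the significance test is omitted, and the function name and the bias values are hypothetical.

```python
import math

def across_model_correlation(x, y):
    """Pearson correlation between two properties, one value per model."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need the same number (>= 3) of models in x and y")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a property is identical across models")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical basin-mean temperature biases (deg C) for five models.
bias_2000m = [0.4, 1.1, 1.6, 0.9, 2.0]
bias_bottom = [0.5, 1.2, 1.8, 1.0, 2.1]
print(across_model_correlation(bias_2000m, bias_bottom))
```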
Note that Fig. 3 was created using the area-weighted means, but the same results were found if using the area-weighted root-mean-squared error (RMSE) or the actual properties. Finally, the reader may have noticed that the Atlantic Water core depth (AWCD) is not correlated to any other property—we will come back to this finding later in the manuscript.
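For reference, the two area-weighted statistics mentioned here can be sketched as follows, assuming the model field has already been interpolated onto the climatology grid and that cell areas are known. Names and sample values are hypothetical.

```python
import math

def area_weighted_bias_and_rmse(model, reference, areas):
    """Area-weighted mean bias (model minus reference) and RMSE over the
    cells where both fields have data (None marks missing values)."""
    sum_bias = 0.0
    sum_square = 0.0
    sum_area = 0.0
    for m, r, a in zip(model, reference, areas):
        if m is None or r is None:
            continue
        diff = m - r
        sum_bias += a * diff
        sum_square += a * diff * diff
        sum_area += a
    if sum_area == 0:
        raise ValueError("no overlapping data")
    return sum_bias / sum_area, math.sqrt(sum_square / sum_area)

# Hypothetical three-cell basin; the third cell is twice as large.
bias, rmse = area_weighted_bias_and_rmse([1.0, 2.0, 3.0], [0.0, 2.0, 1.0], [1.0, 1.0, 2.0])
print(bias, rmse)
```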
In observations, the properties of each water mass evolve not only with depth but also horizontally. Most visibly, the Atlantic Water becomes colder, fresher, deeper, and thicker, and consequently results in a less pronounced peak on the T–S diagram, as it travels from the Nansen Basin to the Canada Basin (black lines, Fig. 2). We do not observe this in models. AW density and temperature show little change across the Arctic. As a result, the biases (Tables A1–A3) change primarily because the value in the reference climatology changes rather than the values in the models. This is most visible when the properties are mapped (Fig. 4 and Figs. A2 and A3): the AW appears most strongly biased dense and cold in the Nansen Basin, as it is the basin where the density is lowest and temperature highest in the climatology. The maps reveal that no basin is better represented than the others; rather, the difference is largest when comparing the different layers (RMSE value in Fig. 4), and when comparing the deep basins to the shelves. No model clearly outperforms the others; the model that can be qualified as “most accurate” depends on the depth and property considered (Fig. 4 and Figs. A2 and A3, second row).
As for the evolution of biases with depth, we verify that the biases of each layer are consistent throughout the Arctic, as suggested by Fig. 4, by computing the across-model correlations between the basins (Fig. 5). For the four deep basins, the temperature and the salinity, and the three layers, the correlations often exceed 0.9 (dark red in Fig. 5). There are two exceptions:
- On the Siberian shelf, there are no correlations with the deep basins. This suggests that the majority of models do not accurately represent the connection between the Siberian shelf and the deep basin via dense water overflows. We investigate this further in the next subsection.
- On the Greenland shelf, there are no significant correlations in salinity but strong correlations in temperature, especially with the AW in the deep basins. This suggests that the flow of Atlantic Water from the deep basins southward and onto the Greenland shelf, notably through Fram Strait, may be accurately represented. We investigate this further in the next subsections.
In summary, across CMIP6 models, the Atlantic layer is biased cold, fresh, and dense when compared to observations, while the deep and bottom waters are biased warm, fresh, and light. The biases of the different water masses are strongly correlated to each other; this, coupled with the fact that the AW occupies nearly the entire water column in most models, suggests that the different water masses are not significantly different from each other. The biases are also consistent throughout the Arctic. In the next two subsections, we investigate whether this lack of variation with depth and with distance is caused by inaccurate ventilation and circulation of these waters within the Arctic.
b. Ventilation of deep water masses within the Arctic
We just showed that there is no across-model correlation between the Atlantic Water and deeper ocean biases and those in the upper ocean. This means that the deep biases may come from an inaccurate representation of the processes that normally form or modify those deep waters: ventilation within the Arctic, circulation within the Arctic, or exchanges through Fram Strait. We start with the processes that take place within the Arctic, and in particular with dense water overflows.
Of the 8 of 14 models that provided the age of water in their output, only 2 appear to simulate overflows to some extent at the Arctic shelf break (Figs. 6a,d, regions highlighted with green boxes): NorESM2-LM, through Franz-Victoria Trough and Saint Anna Trough, two locations where overflows have been observed (see Luneva et al. 2020, and references therein); and MIROC6, through Saint Anna Trough only. For both these models, the overflow is visible as a continuous 0–1-yr age on either side of the 1000-m isobath. We attempt to track these overflows as they travel off the shelf break, but both in animations (not shown) and in sections across (Figs. 6b,e) and along (Figs. 6c,f) the shelf break, unsurprisingly, we can only detect the occasional grid cell with a low age and not a clear flow. These suggest that NorESM2-LM may ventilate down to 3000-m depth occasionally, and MIROC6 to 2000 m. These two models also have the least biased deep and bottom waters for the entire Arctic (see previous section). One of the reasons for these models’ relatively good performance may be their use of isopycnic and terrain-following coordinates, respectively, which should be particularly well-suited to represent a density-driven flow along a slope (Legg et al. 2006), in contrast to the other models which use geopotential coordinates.
For the remaining 6 of 14 models, we use bottom density as a proxy for ventilation. Only GFDL-ESM4 may have a dense water overflow, in Saint Anna Trough (Fig. 6g), but tracking its progression down the shelf (Figs. 6h,i) is not trivial. Referencing the density to different depth levels did not make the result clearer. As GFDL-ESM4 is the model with the least biased 2000 m salinity and density, and run with hybrid vertical coordinates, it is possible that it has intermittent overflows. Besides, GFDL-ESM4 and NorESM2-LM are able to simulate overflows on the Antarctic shelf break (Heuzé 2021), which suggests the potential for them to do the same in the Arctic. Either way, previous studies have shown that overflows occur at several other locations, including at the Canadian shelf break (Luneva et al. 2020). Of the 14 models we study here, however, only 3 models show indications of simulating overflows, all in the same troughs.
Why are these three models the only ones with overflows, and why in Saint Anna Trough (SAT) only? Starting with the models’ bathymetry (gray shading in Fig. 7), BCC-CSM-MR and CAMS-CSM1-0 do not even have a trough there; their bathymetry is shallower than 500 m on the entire continental shelf. For all the other models, SAT is the only trough represented. One possibility therefore is that the models form dense water elsewhere on the shelf, but cannot export it. In observations, dense water formation is caused by sea ice processes, in particular through brine rejection in coastal polynyas, with those in the Laptev Sea being most intense (Tamura and Ohshima 2011). Several models have a perennial sea ice cover over both SAT and the Laptev Sea (magenta contours in Fig. 7), a result previously explained notably by their cold air bias (Davy and Outten 2020), making it unlikely that they can open polynyas there. We nonetheless computed the polynya probability at each grid cell over the 30 winters of our study period (Fig. A4). Note that polynya statistics and variability in the Arctic in CMIP6 models would deserve a study of their own, and that what we present here is but a brief analysis. We also acknowledge that most models are probably too coarse to realistically represent Arctic polynyas, and that their study would require daily sea ice parameters, ideally daily thickness, which are not always available. Most models have polynyas nearly every year in the Barents Sea (∼10°–60°E), albeit in the southern part, not by the shelf break. In the Kara Sea/SAT sector (∼60°–100°E), only CESM2, CanESM5, EC-Earth3, IPSL-CM6A-LR, MPI-ESM1-2-HR, and MRI-ESM2-0 have polynyas more than 20 out of 30 years, often by the coast. Interestingly, of the overflowing models, only MIROC6 has polynyas over our study period, for fewer than 10 years. 
For most models, the polynya frequency is further reduced in the Laptev Sea (∼100°–145°E), but it remains nonzero for 7 of 14 models, especially when considering the daily sea ice (Fig. A4, bottom panels). In summary, most models represent SAT in their bathymetry and have polynyas in the vicinity; they tend to not have any trough on the Laptev Sea, but have polynyas close to the shelf break. We would therefore expect them to have overflows, at both locations.
One possibility is that the polynyas do not result in cold saline (dense enough) water on the shelf. The two GFDL models are the only ones with extremely dense water on the shelf (Table 2), with a maximum density more than 1 kg m−3 higher than the multimodel average and than in all other models, both by SAT and the Laptev Sea. In SAT, the other overflowing models, MIROC6 and NorESM2-LM, are also above the MMM but not strongly (37.56 and 37.81 kg m−3, compared to 37.53 kg m−3). The reason for their strong densities also varies between models: the GFDL models are both salty and cold, while MIROC6 is fresh and cold, and NorESM2-LM salty and warm. In fact, 6 of 14 models have a minimum temperature at the bottom of the shelf above freezing by SAT, and this number increases to 8 of 14 in the Laptev Sea. The spatial and seasonal variability of the properties shows no consistency with overflow presence (Table 2). The apparent disconnect between shelf properties and sea ice behavior may be the result of the sea ice models. Their detailed analysis is beyond the scope of this paper, but it is worth noting that the four families of sea ice models (CICE, COCO, LIM, and SIS) all simulate virtual salt fluxes instead of actual brine rejection (see references listed in Table 1). In conclusion, the presence of overflows seems unrelated to the bathymetry and presence of polynyas, and rather related to vertical grid type and density on the shelf: the z-level models’ dense water is most likely diluted by mixing before or shortly after it has reached the shelf break. The GFDL models, which are z* down to 1000-m depth, have such extremely high densities that these high values survive the mixing. Meanwhile, NorESM2-LM and MIROC6, thanks to their isopycnic and terrain-following grids, respectively, have overflows despite a barely above-average shelf density. These models most likely have no overflows in the Laptev Sea because they are warmer and/or fresher there.
Table 2. For each model and the multimodel mean (MMM): extremum, seasonal cycle, and spatial differences in bottom density, bottom salinity, and bottom temperature over the Saint Anna Trough (SAT) and Laptev Sea regions (shown in Fig. 7), which can impact the formation of overflows. Density and salinity: “max” refers to the temporal maximum of the spatial maximum, “seas.” to the difference between the temporal maximum and minimum of the spatial maximum, and “spatial” to the temporal maximum of the spatial standard deviation. Temperature: “min.” refers to the temporal minimum of the spatial minimum, “seas.” to the difference between the temporal maximum and minimum of the spatial minimum, and “spatial” to the temporal maximum of the spatial standard deviation.
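As an illustration of the statistics defined in the caption of Table 2, a minimal Python sketch is given below. It is not the code used for this study: the function names, the input layout (one list of grid-cell values per time step), and the use of the population standard deviation are assumptions made for the example.

```python
from statistics import pstdev

def density_or_salinity_stats(field):
    """Table 2 style statistics for bottom density or salinity.

    field: list of time steps; each time step is a list of values for the
    grid cells of one shelf region (hypothetical layout).
    """
    spatial_max = [max(step) for step in field]      # one value per time step
    spatial_std = [pstdev(step) for step in field]   # population std: an assumption
    return {
        "max": max(spatial_max),                        # temporal max of spatial max
        "seas.": max(spatial_max) - min(spatial_max),   # range of spatial max in time
        "spatial": max(spatial_std),                    # temporal max of spatial std
    }

def temperature_stats(field):
    """Same as above, but based on the spatial minimum, as for temperature."""
    spatial_min = [min(step) for step in field]
    spatial_std = [pstdev(step) for step in field]
    return {
        "min.": min(spatial_min),
        "seas.": max(spatial_min) - min(spatial_min),
        "spatial": max(spatial_std),
    }
```

For example, calling `density_or_salinity_stats` on two time steps of two cells each returns the largest value found at any time under "max", and the difference between the two per-step maxima under "seas.".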
Another process that can ventilate the deep ocean is open-ocean deep convection. The Arctic Ocean is too stratified for open ocean deep convection to occur (Rudels and Quadfasel 1991). However, using the high resolution climate model HiGEM and a 4 times increase in CO2 scenario, Lique and Thomas (2018) found that open ocean deep convection can start in the central Arctic. Considering that the models in this study are less stratified than observations (section 3a), we verify whether they ventilate the deep Arctic via open ocean deep convection by studying their maximum mixed layer depth reached over the entire 1985–2014 period. The only model with deep mixed layers in this study is GFDL-CM4, which reaches a maximum of 1815 m in the Nansen Basin (Fig. 8—note the logarithmic color scale). The second deepest is EC-Earth3, with a maximum of 536 m. All the other models have mixed layers shallower than 100 m on average over the deep Arctic basins, never exceeding 250 m. Considering that we found a deep bias in the Atlantic layer, this means that GFDL-CM4 and EC-Earth3 are the only two models whose mixed layers can reach below the halocline. As previously discussed, GFDL-CM4’s Atlantic layer extends deeper than 2000 m, so its comparatively deep mixed layer still cannot ventilate the deep and bottom waters.
In summary, we found three models that show indications of dense water overflows in Saint Anna Trough that may penetrate below the Atlantic Water, and two models that may ventilate the Atlantic layer via open ocean deep convection.
c. Circulation of deep water masses within the Arctic
We now investigate the representation of the ocean circulation in the Arctic, first for the subset of models that provided the age of water output. Tanhua et al. (2009) estimated the age of water in the Arctic Ocean from transient tracer measurements (Fig. 9a). The age of water in the models depends strongly on whether the models followed the OMIP protocol (Griffies et al. 2016), which recommended that the model age be reset to 0 at the beginning of the historical run. In CESM2, CanESM5, IPSL-CM6A-LR, and NorESM2-LM instead, the age was set to 0 before the spinup began and not reset since (2022 personal communication with the individual modelers listed in the acknowledgments). Note that for the study we conduct here, the latter method is most desirable. To compare models that followed different protocols, we therefore use the models’ normalized ages, as was done by Dufour et al. (2017), i.e., the age in the Arctic divided by the maximum age, globally, by the end of the historical run (given in the panel titles of Fig. 9). We present the absolute ages, with all the caveats caused by the different protocols, in Fig. A5.
In the upper ocean (top panels of Fig. 9), most models seem to “spill over,” i.e., below 100-m depth, the age gradually increases from the shallow levels of the Nansen Basin by the Kara Sea (to the right, true age is 0) toward the deep parts of the Canada Basin by Alaska (to the left, true age is larger than 100 years, Fig. A5). Notable exceptions are CanESM5 (Fig. 9c) and NorESM2-LM (Fig. 9h), which have waters that are much older than the observations between 200 and 1000 m depth throughout most of the deep Arctic (up to 500 years older for CanESM5), albeit with a mild doming of young waters deeper over the Mendeleev Ridge, opposite to the observations. In the deep ocean (bottom panels of Fig. 9), all models reproduce the contrast between the Eurasian Basin (right) and the Canada Basin (left): in the deep Eurasian Basin, waters are younger to a deeper level than in the Canada Basin. All models also show a latitudinal gradient in age at any depth, with the exception of IPSL-CM6A-LR, whose age primarily increases with depth. Finally, the overflows of MIROC6 and NorESM2-LM are once again visible, as a flow of water of age 0 on the shelf in the upper panels, then a bulge of young water in the upper right corner of their bottom panels. In fact, the age sections even suggest that MPI-ESM1-2-HR (Fig. 9f) and MRI-ESM2-0 (Fig. 9g) might have occasional overflows.
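The normalization described above amounts to a single division. The short Python sketch below illustrates it; the function name and the nested-list layout of the age section are hypothetical, and the numbers in the usage note are invented for illustration, not model values.

```python
def normalized_age(arctic_age, global_max_age):
    """Divide each age in an Arctic section by the global maximum age.

    arctic_age: nested list [depth][station] of ages in years (hypothetical layout).
    global_max_age: maximum age found anywhere in the global ocean at the end
    of the historical run, in years.
    """
    if global_max_age <= 0:
        raise ValueError("global_max_age must be positive")
    return [[age / global_max_age for age in row] for row in arctic_age]
```

With a made-up global maximum of 400 years, an Arctic age of 100 years becomes 0.25, so that sections from models with very different spinup lengths can be shown on a common 0 to 1 scale.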
Both mixing and large-scale circulation could be responsible for this age distribution. The increase of age with depth only in IPSL-CM6A-LR in particular could be caused by its comparatively simple turbulent closure schemes (Madec et al. 2008), yet UKESM1-0-LL and its even simpler linear diffusivity scheme have a somewhat accurate age distribution. Another option is that, as found by Muilwijk et al. (2019), who used passive tracers in a coordinated study of 9 ocean models, the Atlantic Water flow pattern in the Arctic Ocean is highly inaccurate. Here, the strong significant across-model correlation between the age of the Atlantic Water on the Greenland shelf and its temperature (−0.71, i.e., older water is colder) also suggests that the circulation may be inaccurate in CMIP6 models. In observations, the journey of the Atlantic Water across the Arctic can be retraced based on its properties once it reaches the Greenland shelf: the shorter route across the Lomonosov Ridge involves less modification than the long route around the Canada Basin, so this younger water is also warmer (e.g., Rudels 2012). Therefore, in the models with the older and colder water, the flow may be slower than in the models with younger and warmer waters, or the flow may be taking different routes. We therefore now investigate the velocity fields of the models.
Comparing velocity fields close to the North Pole for all models would require nontrivial regridding for the rotated grids. Therefore, we chose two models, MIROC6 and CanESM5, whose horizontal grids are not significantly rotated compared to the Cartesian reference (see Fig. A1). That is, the velocity components “uo” and “vo” are meaningful on the models’ grids. Besides, MIROC6 is one of the “young” models, and CanESM5 one of the “old” models; their velocity vectors are shown in Fig. 10. The norm of the velocity is shown for all other models in Figs. A6 and A7. Note that there are no observational datasets of velocity in the deep Arctic, but the reader can find a detailed explanation of the path of each water mass in Rudels (2012). As expected, these two models differ significantly both in the magnitude of their ocean velocity and in its direction. In MIROC6 (Fig. 10a), the Atlantic Water flows in an orderly loop around the Eurasian Basin at 2 cm s−1 or faster, i.e., the same order of magnitude as measured by the Eastern Eurasian Basin moorings of Woodgate et al. (2001) and Pnyushkov et al. (2015). The flow in CanESM5 (Fig. 10b) is 4 times slower and less orderly, with a lot of recirculation within the Eurasian Basin. The AW also recirculates more in the Makarov Basin in CanESM5 than in MIROC6, but in the Canada Basin, they look somewhat similar, although again MIROC6 is twice as fast. At 2000 m, the circulation in the Eurasian Basin is very similar to that of the AW for both models (Figs. 10c,d), probably because as discussed previously, the same water mass is found at the depth of the AW core as at 2000 m in most models. In MIROC6 it is no issue for the water to flow from the Makarov Basin toward the Canadian shelf, but in CanESM5 the water loops around a shallow feature, most likely the model’s interpretation of the Alpha Ridge. Aside from that loop, MIROC6 shows again velocities twice as high as CanESM5. 
The absolute velocity does not seem to be the key element for ventilation though; for example, CESM2 and UKESM1-0-LL (Figs. A6c,l) have similar velocities in each basin, yet very different ages, even taking UKESM1-0-LL’s age reset into account. IPSL-CM6A-LR and NorESM2-LM in contrast have similar ages but very different velocities both in the Atlantic layer and at 2000 m depth (Figs. A6 and A7h,k), with NorESM2-LM being up to 100 times faster than IPSL-CM6A-LR locally. In summary, the age difference in Fig. 9 likely is the result of a more coherent flow rather than flow speed only, both in the Atlantic layer and deeper.
What causes these differences in circulation? We find significant, negative across-model correlations between the depth of the Atlantic Water core and its velocity in each basin (−0.47 in the Nansen Basin; −0.62 Amundsen; −0.46 Makarov; −0.42 Canada). That is, the slower the core, the deeper. It is unclear however what the causality is, i.e., whether the flow is slower because it is deeper or deeper because it is slower. Another thing we notice is the impact of horizontal resolution, notably when comparing the higher resolution GFDL-CM4 (9 km) to the others (40–50 km): at this resolution, the meanders and recirculations can be clearly represented (Fig. A6e). The effect of resolution on Arctic circulation was also investigated by previous studies: for example, Docquier et al. (2019) and Docquier et al. (2020) show that higher ocean resolution intensifies the Atlantic Water currents and allows the different oceanic pathways into the Arctic to be better resolved. Docquier et al. (2020) further note that eddy-permitting ocean resolution results in improved circulation in comparison to observations, as we see with GFDL-CM4. Roberts et al. (2016) also found that a higher ocean resolution leads to stronger boundary currents. Furthermore, differences in model diffusivity may result in different flow speeds: for example, despite having similar overall volume transports, models with higher diffusivity can have low biases in velocity as the currents are less confined to the coastal boundaries (as was found for the North Atlantic by Talandier et al. 2014) and vice versa for models with low diffusivity. Atmospheric biases are another likely explanation for differences in Atlantic Water flow speeds and patterns, as recently demonstrated by Hinrichs et al. (2021), whose realistic Atlantic Water circulation worsened after coupling to a biased atmospheric model. Finally, Karcher et al. (2007) showed that for early versions of Arctic Ocean models, the balance of potential vorticity is also important and closely linked to the intensity and the pattern of Atlantic Water flow. Steep topographic features such as the Lomonosov Ridge can create a potential vorticity barrier; thus differences in the momentum advection schemes and momentum closure schemes, and obviously, in the bathymetry representation (Fig. 7), might also lead to differences among the models.
In summary, in this subsection we have shown that differences in model ages (even accounting for their different protocols) seem linked to a more coherent flow: the more coherent, the younger the waters/the faster they travel throughout the Arctic. Such flow efficiently transports the water from the Nansen to the Canada Basins, suggesting that the properties in the Nansen Basin, i.e., what enters the Arctic through Fram Strait, set the properties in the whole deep Arctic. In the following subsection, we therefore investigate these flows through Fram Strait.
d. Exchanges through Fram Strait
The representation of Fram Strait in our selection of CMIP6 models is quite biased, be it in properties or in fluxes. When compared to WOA18 (Fig. 11), most models are biased cold in the upper ocean where WOA18 is warm, and biased warm in the deeper layer where WOA18 is cold. In other words, their temperature contrast between the upper and deeper ocean is too small. We observe the same pattern in salinity to some extent (Fig. A8), with strong saline biases in the upper ocean toward Greenland (left of the panels) where WOA18 is freshest, but in the rest of the strait there is no across-model consistent bias. The biases in Fram Strait have a strong and significant across-model correlation to the property biases in the Nansen Basin described previously: 0.84 between the Fram Strait inflow and the Nansen Basin Atlantic Water core for the salinity and 0.74 for the temperature, reduced to 0.78 and 0.56, respectively, when comparing the Fram Strait inflow to the Nansen Basin bottom properties. The Nansen Basin biases are also strongly correlated to the bottom property biases in the Nordic seas (Heuzé 2021), the largest correlation being 0.81 (0.83) between the Nordic seas bottom salinity (temperature) and that in the Nansen Basin at 2000 m depth, suggesting that the biases are advected from the south (upstream of Fram Strait) and into the Arctic. We verify this hypothesis below.
The location of the inflows and outflows is also inconsistent across models (black contours, Fig. 11). Using the moorings deployed across Fram Strait, Beszczynska-Möller et al. (2012) showed the presence of a strong outflow, i.e., flow out of the Arctic, to the west, a strong inflow to the east, and several recirculations in the center of the strait (schematically represented in Fig. 11a). Although both in- and outflows are in fact each composed of several water masses (von Appen et al. 2015), the longitudinal patterns are nonetheless quite consistent through depth. The models show instead a large range of behaviors:
- BCC-CSM2-MR and CAMS-CSM1-0 do not simulate a separation by longitude but by depth, where the upper ocean is an outflow, intermediate depths (the majority of the water column) are an inflow, and anything below 2000 m is again an outflow.
- CanESM5, EC-Earth3, IPSL-CM6A-LR, MPI-ESM1-2-HR, and UKESM1-0-LL simulate an inflow that is limited to a strong core along the east coast, extending no deeper than 1000 m.
- GFDL-CM4, GISS-E2-1-H, and MRI-ESM2-0 simulate an outflow to the west and inflow to the east, which is correct. They however lack the observed recirculations (i.e., alternation of in- and outflows) to be deemed accurate.
Fram Strait below 500 m is biased warm and the location and extent of the in- and outflows are inaccurate in all models, at least when compared to the mooring data of Beszczynska-Möller et al. (2012). It is therefore not surprising that the heat and volume fluxes through Fram Strait are inaccurate as well. Note that as the salt fluxes strongly resemble the volume fluxes and uncertain observational values were only mentioned in Marnela et al. (2016), we limit our discussion to the heat and volume fluxes. Besides, in contrast to observational data, the models do not have distinct east/west and upper/deeper fluxes. We therefore discuss here the full-depth net fluxes into and out of the Arctic, i.e., the sum of the positive and negative fluxes, respectively. For the heat flux (Fig. 12a), most models are within the observational range, except for GFDL-CM4, MIROC6, and MRI-ESM2-0, which overestimate both the inflow and outflow. For example, with a 30-yr mean value of 61.6 ± 7.1 TW, the inflow in MIROC6 is nearly twice as large as that computed by Schauer et al. (2004) over 1997/98 (31.8 TW). All models correctly simulate that the transport of heat into the Arctic is larger than the transport out (difference in height between the bars), but this difference ranges from 1.4 TW for EC-Earth3 to 37.0 TW for MIROC6. One caveat is that whereas observational values are computed relative to different reference temperatures, we here computed them all relative to 0°C in order to better compare the models to each other. We argue that as all the models of this study are biased warm in Fram Strait (Fig. 11), and as the across-model correlation between heat flux and temperature bias is only 0.49, i.e., explains only 24% of the variance, choosing a common reference temperature is not the leading reason for the differences between models and observations.
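The step from a correlation of 0.49 to 24% of explained variance is the square of the Pearson coefficient (0.49² ≈ 0.24). A minimal Python sketch of an across-model correlation and of this conversion follows; the function names are hypothetical and the sketch does not include the significance testing used in this study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists, one value per model."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)

def variance_explained(r):
    """Fraction of variance explained by a linear relationship with correlation r."""
    return r ** 2
```

Here `variance_explained(0.49)` returns approximately 0.24, the 24% quoted above.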
Unlike the heat flux, the volume flux is underestimated in the majority of our models (Fig. 12b). The volume flux is the integral of the velocity through Fram Strait [see Eq. (1)], while the heat flux is the integral of that velocity multiplied by the temperature through the same section [see Eq. (3)]. So first, the volume flux underestimation means that the strong warm bias in Fram Strait dominates the heat flux values. Regarding the volume, only the inflows of GFDL-CM4 and GISS-E2-1-H are within the observational range (averaged from Beszczynska-Möller et al. 2012; Marnela et al. 2016; de Steur et al. 2014; Schauer et al. 2004), and no model reaches the outflow observational range (11 ± 2 Sv, same references; 1 Sv ≡ 10⁶ m³ s⁻¹). Although all models except GISS-E2-1-H correctly have larger outflow than inflow, this difference is nearly twice the observational average (∼2 Sv) in CanESM5, IPSL-CM6A-LR, and MIROC6 (3.5, 4, and 3.5 Sv on average, respectively), and less than half in BCC-CSM2-MR, CAMS-CSM1-0, and CESM2 (<1 Sv). Zanowski et al. (2021) computed the upper ocean liquid and solid freshwater fluxes, where solid means freshwater content of the sea ice, in and out of all the Arctic gateways for seven CMIP6 models. We use their results to determine whether the inaccurate differences between deep inflow and deep outflow through Fram Strait that we found are compensated by the flows through the other straits and/or the solid fluxes. With only five models in common, statistics are meaningless, but this small comparison suggests that the more total solid freshwater flux out of the Arctic, the smaller our heat and volume outflows; and the more total liquid freshwater flux out of the Arctic, the stronger our volume inflow. That is, the more sea ice out, the less heat and volume out, but the more water out, the more deep water flows in. 
Although these results would be logical, they should be investigated in a larger group of models; doing this here is however beyond the scope of this paper.
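To make the flux definitions above concrete, the following Python sketch integrates a velocity section into volume and heat fluxes, split into inflow and outflow. It is a simplified illustration rather than the code behind Fig. 12: the reference density and heat capacity are assumed constants, velocity is assumed positive into the Arctic, and in- and outflows are separated by the sign of the velocity in each cell, which is one possible reading of "positive and negative fluxes".

```python
RHO0 = 1027.0  # reference seawater density in kg m-3 (assumed value)
CP = 3985.0    # specific heat capacity in J kg-1 K-1 (assumed value)

def section_fluxes(velocity, temperature, dx, dz, t_ref=0.0):
    """Volume (Sv) and heat (TW) fluxes through a vertical section.

    velocity[k][i]: velocity normal to the section in m s-1, assumed positive
                    into the Arctic; None marks land cells (hypothetical layout).
    temperature[k][i]: temperature in degrees C.
    dx[i]: cell width along the section in m; dz[k]: layer thickness in m.
    t_ref: reference temperature in degrees C (0 degrees C in this study).
    """
    vol_in = vol_out = heat_in = heat_out = 0.0
    for k, thickness in enumerate(dz):
        for i, width in enumerate(dx):
            v = velocity[k][i]
            if v is None:
                continue  # skip land
            transport = v * width * thickness                              # m3 s-1
            heat = RHO0 * CP * transport * (temperature[k][i] - t_ref)     # W
            if v > 0:
                vol_in += transport
                heat_in += heat
            else:
                vol_out += transport
                heat_out += heat
    return {
        "volume_in_sv": vol_in / 1e6,
        "volume_out_sv": vol_out / 1e6,
        "heat_in_tw": heat_in / 1e12,
        "heat_out_tw": heat_out / 1e12,
    }
```

Because the heat flux multiplies each transport by the local temperature, a section that is too slow but too warm can still return a heat flux within the observational range, which is the compensation discussed in the text.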
Could the biases in heat or volume fluxes through Fram Strait explain the biases that we found in the deep water masses of the Arctic? At first glance, no: there is no across-model relationship between any of the biases described in section 3a and the net in- or outflows. We instead investigate the models individually and compare their fluxes to the Atlantic Water core temperature, in the Nansen Basin only, as we previously showed that all property biases in all water masses and all deep basins were strongly correlated with each other. We find for all models strong positive correlations between the fluxes and time series of the properties (see two exemplary models in Fig. 13), but no across-model consistency. That is, some models have their strongest correlation with the heat flux, while others have it with the volume flux (not shown). But more importantly, in no model is the whole inflow consistently correlated to the Nansen Basin properties: for some, a specific longitude has most of the positive correlation (Fig. 13a); others have distinct patches, similar to what is expected from observations (Fig. 13b, note the upper and lower patches, separated at approximately 1500-m depth).
In summary, for all models, we do find strong positive correlations between at least part of the inflow and the biases in properties in the deep Arctic. The volume fluxes are biased low in most models, which, coupled with the fact that Fram Strait is biased warm, results in seemingly accurate heat fluxes through Fram Strait. Nevertheless, it would be desirable to understand why the volume fluxes are inaccurate. In observations, heat and volume fluxes have their largest values in winter, typically February/March, and lowest values in spring/summer, typically June (Schauer et al. 2004; Beszczynska-Möller et al. 2012; de Steur et al. 2014). In our models, the majority follow this pattern of maximum in winter and minimum in summer, although the maximum can be found in any winter month. The exceptions are GISS-E2-1-H and NorESM2-LM, which have their lowest values in winter for both heat (Fig. 12c) and volume (Fig. 12d). The yearly range can be large in some models (up to 32.4 TW for the heat inflow in MIROC6, and 2.3 Sv for the volume inflow in GFDL-CM4), but so can it be in observations (10–50 TW and 4–6 Sv; Schauer et al. 2004; Beszczynska-Möller et al. 2012; de Steur et al. 2014).
The reason why the fluxes through Fram Strait are highest in winter can be found in the processes that cause them. In models (Årthun and Eldevik 2016; Muilwijk et al. 2019) as in observations (Wang et al. 2020), the heat and volume fluxes through Fram Strait are driven at least in part by the gyre and/or winter convective activity in the Nordic seas (Smedsrud et al. 2022), regardless of the depth level considered (von Appen et al. 2015; Chatterjee et al. 2018). The convective activity values in CMIP6 models were recently published by Heuzé (2021): they showed that all the models that we consider here largely overestimate it. In particular, all models but CAMS-CSM1-0 had mixed layers deeper than 1000 m every year over 1985–2014 over an extensive region, which is visible in Fig. 8; CAMS-CSM1-0 did so only 24 out of 30 years. Comparing our fluxes with their mean deep mixed volume, i.e., sum of the cell area multiplied by the mixed layer depth (MLD) for all cells where that MLD is deeper than 1000 m, we find significant across-model correlations (at 90%) with the heat inflow through Fram Strait (0.48) and the volume outflow (0.42). That is, as in observations (e.g., Wang et al. 2020), a stronger convective activity in the Nordic seas is associated with a stronger heat inflow into the Arctic, but also with a stronger volume outflow from the Arctic. These results do not prove causality but suggest a possible chain of biases:
- The Nordic seas have biased temperature and salinity and a biased representation of convective activity (Heuzé 2021).
- The stronger the convective activity, the stronger the volume transport northward, through Fram Strait and into the Arctic.
- That volume transport advects the biases in properties from the Nordic seas to Fram Strait, so that the stronger the volume transport, the more Fram Strait is biased warm. Another possibility is that the convective activity directly sets the properties of the advected water, as has been found in observations before (Langehaug and Falck 2012).
- The stronger the warm bias at Fram Strait, the stronger the heat flux into the Arctic.
This would explain why the “worst” models for the heat fluxes are the “least bad” for the volume fluxes: the higher volume fluxes in (and out) of the Arctic are more efficient at advecting the warm bias from the Nordic seas into the Arctic.
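The deep mixed volume used in the correlations above is defined as the sum of cell area multiplied by mixed layer depth over all cells where that depth exceeds 1000 m. A minimal Python sketch for a single mixed layer depth field follows; the function name and flat-list layout are hypothetical, and the averaging over the 30 years of the study period is left out.

```python
def deep_mixed_volume(mld, cell_area, threshold=1000.0):
    """Sum of cell area times mixed layer depth where the MLD exceeds a threshold.

    mld: mixed layer depth per grid cell in m; None marks land cells.
    cell_area: area per grid cell in m2, in the same order as mld.
    threshold: depth in m above which a mixed layer counts as deep.
    Returns a volume in m3.
    """
    if len(mld) != len(cell_area):
        raise ValueError("mld and cell_area must have the same length")
    total = 0.0
    for depth, area in zip(mld, cell_area):
        if depth is not None and depth > threshold:
            total += area * depth
    return total
```

Cells with mixed layers shallower than the threshold contribute nothing, so the metric grows both with the depth of the convection and with the area over which it occurs.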
4. Discussion and conclusions
In this study, we first quantified biases in the Atlantic Water in all deep basins of the Arctic. In agreement with Khosravi et al. (2022), we find that its core is too cold by 0.4°C on average, too deep by 400 m, and in half of the models the Atlantic layer extends all the way to the seafloor, i.e., the properties do not evolve with depth as they do in the real ocean. Besides, in most models the properties do not change from basin to basin. We attribute these inaccurate properties and behavior to a lack of shelf overflows in most models, a result previously found in ocean-only simulations (Ilicak et al. 2016), and inaccurate heat and volume fluxes through Fram Strait. To the best of our knowledge, no study was performed on CMIP5 models to quantify biases in deep and bottom water properties in the Arctic; we here determine that CMIP6 models are too warm by more than 1°C as multimodel average. Our findings reveal a strong decoupling between the upper layer and the rest of the deep Arctic (below 200 m), which is quite homogeneous in depth and between the basins. These biases matter for the rest of the Arctic system: we find a significant correlation between pan-Arctic sea ice volumes and Atlantic Water temperature (−0.43 at 90%), while Muilwijk et al. (2023) find not only strong biases in the representation of stratification, but also that we cannot accurately predict future stratification changes as individual models return diverging results depending on their AW biases.
We linked these biases to processes both within and outside the Arctic. Within the Arctic, the main issue is the absence of ventilation: only three models appear to have dense water overflows, and these are taking place at only two locations (cf., e.g., to the list in Luneva et al. 2020), and do not seem to ventilate the deepest layers. Our results are limited by the fact that too few models provide the age of water output, that they followed different protocols to compute it, and that a monthly resolution may be too coarse to effectively track overflows as they cascade off the shelf. Nevertheless, this finding comes as no surprise considering that the models suffer from the same overflow issue in the rest of the world (Adcroft et al. 2019; Heuzé 2021), but this issue is particularly acute in the Arctic where no other process can replace overflows (Peralta-Ferriz and Woodgate 2015), and where open ocean deep mixing is rather indicative of inaccurate stratification (Lique and Thomas 2018). The higher resolution of CMIP6 models compared to CMIP5 was not enough to improve the overflows; in fact, it seems unlikely that such processes can ever become explicitly represented in global climate models (Fox-Kemper et al. 2019). Instead, one can notice that the three models that seem to have overflows also have isopycnal, terrain-following, or hybrid grids (Table 1). Another solution could be the widespread implementation of overflow parameterizations (e.g., Danabasoglu et al. 2010).
The biases are also related to the circulation: within the Arctic, the age of the oldest waters in the CMIP6 models studied here ranges from 122 to 1946 years (Fig. 9). Despite the models following different protocols for the age calculation, we could attribute the age difference primarily to the coherency of the flows (e.g., fewer local, most likely recirculations). The highest resolution model had the most coherent and detailed flow, probably thanks to its eddy-permitting resolution and accurate representation of bathymetry, as discussed above. While we could speculate on the reasons for these different flow speeds and paths across the models, such study is (still) impeded by model inconsistencies and lack of crucial metadata. Notably, we would like to see
- the ocean velocities be archived for all models;
- the necessary information to reproject the velocities onto the Cartesian grid be included in the output files, e.g., via an angle parameter that for each grid cell gives its rotation compared to the true north;
- the age of water be archived for all models;
- the age of water have the same definition for all models; in particular, resetting the age of water to 0 at the beginning of the historical run seriously impacts any study of the deep ocean;
- the spinup time be routinely provided, e.g., in the model description.
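To illustrate why a per-cell angle parameter would suffice for the reprojection requested above, the Python sketch below rotates grid-relative velocity components into eastward and northward components. The function name is hypothetical and the sign convention (angle measured counterclockwise from geographic east to the grid x axis) is an assumption; conventions differ between models, which is part of the metadata problem described in the list.

```python
from math import cos, sin, radians

def rotate_to_geographic(u_grid, v_grid, angle_deg):
    """Rotate grid-relative velocity components to eastward/northward components.

    u_grid, v_grid: velocity along the grid x and y axes (e.g., "uo" and "vo").
    angle_deg: assumed counterclockwise angle, in degrees, from geographic
               east to the grid x axis at this cell.
    Returns (u_east, v_north) in the same units as the input.
    """
    a = radians(angle_deg)
    u_east = u_grid * cos(a) - v_grid * sin(a)
    v_north = u_grid * sin(a) + v_grid * cos(a)
    return u_east, v_north
```

The rotation preserves the speed, so the norm of the velocity can be compared across models without this information, while the direction of the flow cannot.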
It is unlikely that continuously increasing the horizontal resolution of the entire climate model will ultimately result in an accurate circulation, given the very high resolutions required to resolve the narrow canyons through which some flow exchanges take place and the meso- and submesoscale processes. Promising possible solutions are model nesting (T. Martin 2022, personal communication) or adaptive mesh (Wang et al. 2018), which can increase the resolution at crucial locations such as canyons or the shelf break without making computations unnecessarily heavy.
At Fram Strait, we found that all models underestimate the volume fluxes in and out of the Arctic, i.e., all models are biased slow. The heat flux however appears accurate or even biased high, as the low volume fluxes are compensated by warm temperature biases at Fram Strait. We found across-model relationships between Fram Strait biases and fluxes, and inaccurate properties and deep convective activity in the Nordic seas. This suggests that as in observations (e.g., Langehaug and Falck 2012), modeled deep convection is enhanced by the deep outflow from the Arctic and enhances the deep inflow, but also modifies the properties of the water advected through Fram Strait. The inaccurate Nordic seas convective activity was previously blamed on inaccurate representations of sea ice extent and seasonal cycle (Heuzé 2021) and atmospheric modes of variability and wind patterns (Heuzé 2017), suggesting that improving the individual components, after identifying their biases for example via SIMIP (Notz et al. 2016) or AMIP (Eyring et al. 2016), may be a necessary first step toward accurately modeling the coupled Arctic system. Correcting biases in the deep Arctic Ocean could even have widespread impact on the entire modeled global climate: we found significant across-model relationships between biases in the properties of the Atlantic Water in the Canada Basin and those of the subpolar gyre reported by Heuzé (2021) (0.45 at 90% between the temperatures; 0.66 at 99% between the salinities), and even between the volume fluxes out of the Arctic and the Atlantic meridional overturning circulation [AMOC; values from Heuzé (2021), correlation of −0.43 at 90%].
Higher resolution, parameterizations, and dedicated MIPs can, however, only go so far when there are virtually no observations to constrain the models. In the database UDASH (Behrendt et al. 2018), there are fewer than 700 full-depth hydrographic profiles in the entire Arctic north of 82°N, and only 40 of them are in winter. Consequently, in their recent review, Solomon et al. (2021) did not even try to investigate the deep Arctic Ocean as there were too few observations; even for the upper ocean, they could not close the freshwater budget as Arctic river discharge time series were few, incomplete, and uncertain. There is an urgent need for more multidisciplinary and multiscale (both in time and space) observation campaigns, similar to the recently completed Multidisciplinary drifting Observatory for the Study of Arctic Climate (MOSAiC) expedition (Rabe et al. 2022), across the entire Arctic, or at least for more coordination and cooperation between different expeditions to properly investigate processes and their interaction, instead of the traditional local component-specific studies.
Acknowledgments.
This work was funded via Vetenskapsrådet Grant 2018-03859 awarded to Céline Heuzé. Morven Muilwijk received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement 101003826 via project CRiceS. We acknowledge the World Climate Research Programme, which, through its Working Group on Coupled Modelling, coordinated and promoted CMIP6. We thank the climate modeling groups for producing and making available their model output, the Earth System Grid Federation (ESGF) for archiving the data and providing access, and the multiple funding agencies who support CMIP6 and ESGF. We are grateful to Jianglong Li (BCC-CSM2-MR), Xinyao Rong (CAMS-CSM1-0), Gary Strand (CESM2), Andrew Shao and Neil Swart (CanESM5), Thomas Reerink (EC-Earth3), the GFDL Climate Model Info Team (GFDL-CM4 and GFDL-ESM4), Gavin Schmidt (GISS-E2-1-H), Olivier Boucher (IPSL-CM6A-LR), Hiroaki Tatebe and Yoshiki Komuro (MIROC6), Johann Jungclaus (MPI-ESM1-2-HR), Shogo Urakawa (MRI-ESM2-0), Øyvind Seland and Mats Bentsen (NorESM2-LM), and Andrew Yool and Colin Jones (UK-ESM1-0-LL), for their prompt replies to our questions regarding their respective models, indicated in parentheses. We are extremely grateful to Chuncheng Guo for sending us the NorESM2-LM output when we urgently needed them but their server was down. Finally, we thank the three reviewers Hailong Liu, Carolina Dufour, and Jonathan Rheinlænder, as well as the editor Yi Deng, for their many comments that greatly improved this manuscript.
Data availability statement.
All CMIP6 data are freely available via the Earth System Grid Federation. For this paper, we used the German Climate Computing Centre (DKRZ) node: https://esgf-data.dkrz.de/search/cmip6-dkrz/ and the Geophysical Fluid Dynamics Laboratory (GFDL) node: https://esgdata.gfdl.noaa.gov/search/cmip6-gfdl/.
The Unified Database for Arctic and Subarctic Hydrography is freely available via https://doi.pangaea.de/10.1594/PANGAEA.872931. All versions of the World Ocean Atlas climatology are freely available via https://www.ncei.noaa.gov/products/world-ocean-atlas. All versions of the Polar Science Center Hydrographic Climatology are freely available via http://psc.apl.washington.edu/nonwp_projects/PHC/Climatology.html. The EN4 climatology is freely available via https://www.metoffice.gov.uk/hadobs/en4/. The gridded bathymetry GEBCO is freely available via https://www.gebco.net/data_and_products/gridded_bathymetry_data/. The sea ice concentration product HadISST1 is freely available via https://www.metoffice.gov.uk/hadobs/hadisst/data/download.html.
The volume, heat, and salt flux time series are freely available on PANGAEA via https://doi.org/10.1594/PANGAEA.949182 and can be cited as Zanowski et al. (2022). The routines to compute them from the CMIP6 output are freely available on Zenodo (https://doi.org/10.5281/zenodo.4606856).
APPENDIX
A More Detailed Look at the Model Biases
This appendix presents
- the models’ native grid (Fig. A1);
- the pan-Arctic biases in each water mass in temperature (Fig. A2) and salinity (Fig. A3);
- the models’ polynya activity (Fig. A4);
- the absolute age of the water (Fig. A5);
- the models’ pan-Arctic velocity of the AW core (Fig. A6) and at 2000-m depth (Fig. A7); and
- the salinity across Fram Strait (Fig. A8).
It also presents the area-weighted mean biases in all basins (Tables A1–A3).
REFERENCES
Aagaard, K., 1981: On the deep circulation in the Arctic Ocean. Deep-Sea Res., 28A, 251–268, https://doi.org/10.1016/0198-0149(81)90066-2.
Aagaard, K., J. H. Swift, and E. C. Carmack, 1985: Thermohaline circulation in the Arctic Mediterranean seas. J. Geophys. Res., 90, 4833–4846, https://doi.org/10.1029/JC090iC03p04833.
Adcroft, A., and Coauthors, 2019: The GFDL global ocean and sea ice model OM4.0: Model description and simulation features. J. Adv. Model. Earth Syst., 11, 3167–3211, https://doi.org/10.1029/2019MS001726.
Aksenov, Y., V. V. Ivanov, A. J. G. Nurser, S. Bacon, I. V. Polyakov, A. C. Coward, A. C. Naveira-Garabato, and A. Beszczynska-Möller, 2011: The Arctic circumpolar boundary current. J. Geophys. Res., 116, C09017, https://doi.org/10.1029/2010JC006637.
Armitage, T. W. K., G. E. Manucharyan, A. A. Petty, R. Kwok, and A. F. Thompson, 2020: Enhanced eddy activity in the Beaufort Gyre in response to sea ice loss. Nat. Commun., 11, 761, https://doi.org/10.1038/s41467-020-14449-z.
Årthun, M., and T. Eldevik, 2016: On anomalous ocean heat transport toward the Arctic and associated climate predictability. J. Climate, 29, 689–704, https://doi.org/10.1175/JCLI-D-15-0448.1.
Behrendt, A., H. Sumata, B. Rabe, and U. Schauer, 2018: UDASH—Unified Database for Arctic and Subarctic Hydrography. Earth Syst. Sci. Data, 10, 1119–1138, https://doi.org/10.5194/essd-10-1119-2018.
Beszczynska-Möller, A., E. Fahrbach, U. Schauer, and E. Hansen, 2012: Variability in Atlantic Water temperature and transport at the entrance to the Arctic Ocean, 1997–2010. ICES J. Mar. Sci., 69, 852–863, https://doi.org/10.1093/icesjms/fss056.
Björk, G., M. Jakobsson, K. Assmann, L. G. Andersson, J. Nilsson, C. Stranne, and L. Mayer, 2018: Bathymetry and oceanic flow structure at two deep passages crossing the Lomonosov Ridge. Ocean Sci., 14, 1–13, https://doi.org/10.5194/os-14-1-2018.
Boucher, O., and Coauthors, 2020: Presentation and evaluation of the IPSL-CM6A-LR climate model. J. Adv. Model. Earth Syst., 12, e2019MS002010, https://doi.org/10.1029/2019MS002010.
Chatterjee, S., R. P. Raj, L. Bertino, Ø. Skagseth, M. Ravichandran, and O. M. Johannessen, 2018: Role of Greenland Sea Gyre circulation on Atlantic Water temperature variability in the Fram Strait. Geophys. Res. Lett., 45, 8399–8406, https://doi.org/10.1029/2018GL079174.
Danabasoglu, G., W. G. Large, and B. P. Briegleb, 2010: Climate impacts of parameterized Nordic sea overflows. J. Geophys. Res., 115, C11005, https://doi.org/10.1029/2010JC006243.
Danabasoglu, G., and Coauthors, 2020: The Community Earth System Model version 2 (CESM2). J. Adv. Model. Earth Syst., 12, e2019MS001916, https://doi.org/10.1029/2019MS001916.
Davy, R., and S. Outten, 2020: The Arctic surface climate in CMIP6: Status and developments since CMIP5. J. Climate, 33, 8047–8068, https://doi.org/10.1175/JCLI-D-19-0990.1.
Decloedt, T., and D. S. Luther, 2010: On a simple empirical parameterization of topography-catalyzed diapycnal mixing in the abyssal ocean. J. Phys. Oceanogr., 40, 487–508, https://doi.org/10.1175/2009JPO4275.1.
Decuypère, M., L. B. Tremblay, and C. O. Dufour, 2022: Impact of ocean heat transport on Arctic sea ice variability in the GFDL CM2-O model suite. J. Geophys. Res. Oceans, 127, e2021JC017762, https://doi.org/10.1029/2021JC017762.
de Steur, L., E. Hansen, C. Mauritzen, A. Beszczynska-Möller, and E. Fahrbach, 2014: Impact of recirculation on the East Greenland Current in Fram Strait: Results from moored current meter measurements between 1997 and 2009. Deep-Sea Res. I, 92, 26–40, https://doi.org/10.1016/j.dsr.2014.05.018.
Docquier, D., and T. Koenigk, 2021: A review of interactions between ocean heat transport and Arctic sea ice. Environ. Res. Lett., 16, 123002, https://doi.org/10.1088/1748-9326/ac30be.
Docquier, D., and Coauthors, 2019: Impact of model resolution on Arctic sea ice and North Atlantic Ocean heat transport. Climate Dyn., 53, 4989–5017, https://doi.org/10.1007/s00382-019-04840-y.
Docquier, D., R. Fuentes-Franco, T. Koenigk, and T. Fichefet, 2020: Sea ice–ocean interactions in the Barents Sea modeled at different resolutions. Front. Earth Sci., 8, 172, https://doi.org/10.3389/feart.2020.00172.
Döscher, R., and Coauthors, 2022: The EC-Earth3 Earth system model for the Coupled Model Intercomparison Project 6. Geosci. Model Dev., 15, 2973–3020, https://doi.org/10.5194/gmd-15-2973-2022.