Different strategies have been proposed in previous studies for monitoring the Atlantic meridional overturning circulation (AMOC). As well as arrays to directly monitor the AMOC strength, various fingerprints have been suggested to represent an aspect of the AMOC based on properties such as temperature and density. The additional value of fingerprints potentially includes the ability to detect a change earlier than a change in the AMOC itself, the ability to extend a time series back into the past, and the ability to detect crossing a threshold. In this study we select metrics that have been proposed as fingerprints in previous studies and evaluate their ability to detect AMOC changes in a number of scenarios (internal variability, weakening from increased greenhouse gases, weakening from hosing and hysteresis) in the eddy-permitting coupled climate model HadGEM3-GC2. We find that the metrics that perform best are the temperature metrics based on large-scale differences, the large-scale meridional density gradient, and the vertical density difference in the Labrador Sea. The best metric for monitoring the AMOC depends somewhat on the processes driving the change. Hence the best strategy would be to consider multiple fingerprints to provide early detection of all likely AMOC changes.
The Atlantic meridional overturning circulation (AMOC) transports large amounts of heat in the Atlantic, significantly influencing the climate. Much evidence has been seen of the potential impacts of changes in the AMOC on climate, whether it is from interannual to decadal variability (Knight et al. 2006; Hermanson et al. 2014), a gradual weakening of the AMOC from increased greenhouse gases (Woollings et al. 2012; Drijfhout et al. 2012; Haarsma et al. 2015), or a more rapid collapse of the AMOC (Jackson et al. 2015). Hence there is a strong need to monitor the AMOC and, if possible, to provide early warning of any changes.
Currently there are observational programs continuously monitoring the AMOC in the subtropical North Atlantic [Rapid Climate Change–Meridional Overturning Circulation and Heatflux Array (RAPID–MOCHA); McCarthy et al. 2015b] and the subpolar North Atlantic [Overturning in the Subpolar North Atlantic Program (OSNAP); Lozier et al. 2019], with other programs providing limited observations at other latitudes in the Atlantic (McCarthy et al. 2020; Frajka-Williams et al. 2019). Results so far show large variability on many time scales, meaning that a much longer time series is required to detect an AMOC trend (Baehr et al. 2007; Roberts and Palmer 2012).
An alternative way to monitor the AMOC is via metrics that can be used as an AMOC fingerprint (defined here to be a metric that well represents an aspect of the AMOC). Many studies have suggested metrics that can be used as a fingerprint of the AMOC. These can enable historical reconstructions, such as in the studies of Rahmstorf et al. (2015), Caesar et al. (2018), and Thornalley et al. (2018). Some have been suggested as ways to improve near-term predictions of the AMOC or its impacts (Hermanson et al. 2014; Sévellec et al. 2018) or to reconstruct multidecadal variability (Zhang 2008; Msadek et al. 2010; Roberts et al. 2013a). Others have developed fingerprints to detect a weakening of the AMOC (Baehr et al. 2007; Roberts et al. 2013a; Vellinga and Wood 2004). Some of these may provide faster detection of a change, either through the fingerprint having less high-frequency variability than the AMOC (greater signal-to-noise ratio; Baehr et al. 2007; Brennan et al. 2008; Roberts and Palmer 2012) or by being a precursor to an AMOC change, such as changes in deep convection regions found to lead AMOC changes (Ba et al. 2014; Danabasoglu et al. 2016). A useful fingerprint should satisfy three criteria: first, it should be a good representation of an aspect of the AMOC that we want to monitor; second, the relationship with the AMOC should be understood physically; and third, it must be potentially observable. For early warning, the fingerprint must also be able to detect a change in the AMOC before it can be detected in the AMOC time series itself. Most metrics that have been proposed are based on ocean temperatures (Latif et al. 2004; Zhang 2008; Msadek et al. 2010; Roberts et al. 2013a; Zhang and Zhang 2015; Caesar et al. 2018), since changes in the heat transported by the AMOC are likely to affect temperatures in the North Atlantic, or densities (Baehr et al. 2007; Roberts et al. 2013a; Hermanson et al. 2014; Butler et al. 2016; Robson et al. 2016; Haskins et al. 2019), since there are arguments associating the AMOC with density or pressure gradients (Butler et al. 2016).
Metrics of mixed layer depth (MLD) that indicate convection were proposed by Jackson and Wood (2018a), and gradients in sea level that can indicate volume transport in the Gulf Stream have also been proposed (Bingham and Hughes 2009; McCarthy et al. 2015a). Transports of freshwater in the southern (de Vries and Weber 2005) and subtropical Atlantic (Mecking et al. 2016; Jackson and Wood 2018a) have been suggested as potential indicators of an AMOC collapse. Some studies have looked for multivariate fingerprints based on water mass properties (Vellinga and Wood 2004; Roberts and Palmer 2012; Klus et al. 2019), although Roberts and Palmer (2012) warned against simply using statistical techniques to look for patterns, without investigating the physical relationships. Various studies have highlighted the added value of considering a number of different metrics (Vellinga and Wood 2004; Roberts and Palmer 2012; Klus et al. 2019), including different locations of the AMOC to assess spatial coherence (Bingham et al. 2007; Feng et al. 2014).
In this study we will test various metrics that have previously been proposed as AMOC fingerprints. We will use an eddy-permitting coupled climate model to test candidate metrics for detecting AMOC change in different scenarios: internal variability, changes from an AMOC weakening (forced by an increase in greenhouse gases or addition of freshwater), and the recovery after a weakening.
There are good arguments that what we should be monitoring is not the AMOC itself, but the total Atlantic Ocean heat transport (AOHT), which is dependent on changes in the horizontal circulation as well as changes in ocean temperature. It is changes in the heat transport that have impacts on the ocean temperatures and hence on the climate. However, studies proposing fingerprints for monitoring have developed these fingerprints for the AMOC, so we focus this study on testing the metrics against the AMOC itself. We note that when changes in the AMOC are large, there are also large changes in heat transport and hence the metrics are both fingerprints of the AMOC and AOHT. For smaller changes, in particular for studying variability, we examine the potential of the metrics to be fingerprints of both the AMOC and AOHT.
Although some of the experiments used here have been found to have a threshold beyond which the AMOC does not recover when freshwater is added (Jackson and Wood 2018a), we will not be covering the use of the time series statistics to indicate the approach of a threshold. Various papers (Lenton 2011; Lenton et al. 2012; Boulton et al. 2014; Nikolaou et al. 2015; Klus et al. 2019) have shown that this is potentially a useful indicator of approaching a threshold. However, although current techniques have been shown to indicate approaching a threshold (through increased variance or autocorrelation), they do not indicate how far away the threshold is. The usefulness of time series statistics in the real world is also constrained by the need to have a reliable fingerprint with a long (hundreds of years) time series. For the experiments considered in this paper where changes are occurring quickly and the lengths of integrations are short, it is difficult to explore the statistics.
This paper is structured as follows. Section 2 describes the model, the experiments used, and the AMOC response, and section 3 introduces the different metrics. We then examine the potential of the metrics as AMOC fingerprints: for variability in section 4, for AMOC weakening in section 5, for distinguishing an AMOC recovery in section 6, and for detecting a threshold in section 7. Finally, in section 8, we present our conclusions.
2. Models and experiments
a. Model description
In this analysis we use the eddy-permitting, coupled climate model HadGEM3-GC2, which is the forerunner to the HadGEM3-GC3.1 model submitted to phase 6 of the Coupled Model Intercomparison Project (CMIP6). The GCM consists of ocean, atmosphere, land, and sea ice components and is described in detail in Williams et al. (2015). In particular, the ocean component uses the GO5.0 configuration (Megann et al. 2014) of the NEMO model (Madec 2008), with a nominal resolution of 0.25° and 75 vertical levels.
We make use of a long control run (200 years) with preindustrial forcings (CON) and a 140-yr-long run spun off from it in which the CO2 concentrations are increased at 1% per year up to 4 times preindustrial concentrations (1PC). We also use a suite of experiments where an idealized additional flux of freshwater (hosing) is added to the North Atlantic north of 50°N; see Jackson and Wood (2018b) for details. These experiments are referred to as hos01, hos02, hos03, hos05, and hos10, where the numbers indicate the amount of hosing added (0.1–1.0 Sv; 1 Sv ≡ 10⁶ m³ s⁻¹). Finally, experiments are spun off from these where the hosing is stopped after a length of time T. In some of these the AMOC recovers back to its control strength, but in others the AMOC stays weak. These experiments are described in Jackson and Wood (2018a) and are named in the form offH_T for those where the hosing H is applied for T years (i.e., off05_20 has had a hosing of 0.5 Sv for 20 years).
The AMOC streamfunction in CON shows a coherent overturning cell (Fig. 1a). M26 and M45 are the strengths at 26.5° and 45°N (measured as the maximum over depth), which have means of 14.5 and 11.6 Sv, respectively, and annual standard deviations of 0.9 and 1.0 Sv (Fig. 2). The AOHT (maximum over latitude of the Atlantic Ocean heat transport) has a mean value of 0.99 PW with an annual standard deviation of 0.07 PW (Fig. 1d). Time series show coherent multidecadal variability (Fig. 1e), with AOHT varying in phase with M26 and with M45 leading both by a couple of years (Fig. 1f). A lag between M26 and M45 is consistent with other studies showing AMOC signals propagating from high to low latitudes in the North Atlantic (Xu et al. 2019).
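The overturning strengths quoted above are defined as the maximum over depth of the meridional streamfunction at a fixed latitude. A minimal sketch of that definition, using a synthetic single-cell streamfunction rather than HadGEM3-GC2 output:

```python
import numpy as np

def amoc_strength(psi, lat, target_lat):
    """Overturning strength at a latitude: the maximum over depth of
    the meridional streamfunction psi(depth, lat), as for M26 and M45."""
    j = np.abs(lat - target_lat).argmin()  # nearest model latitude
    return psi[:, j].max()                 # maximum over the depth axis

# Synthetic overturning cell peaking at mid-depth, amplitude ~15 Sv.
lat = np.linspace(-30.0, 70.0, 101)
depth = np.linspace(0.0, 5000.0, 50)
psi = (15.0 * np.sin(np.pi * depth / 5000.0)[:, None]
       * np.cos(np.pi * (lat - 20.0) / 100.0)[None, :])
m26 = amoc_strength(psi, lat, 26.5)  # close to the 15 Sv cell maximum
```

The same function applied at 45°N would give M45; with real model output, `psi` would be the zonally integrated streamfunction in depth coordinates.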
In the 1PC experiment there is a substantial weakening of AMOC, including at 26.5° and 45°N, and this weakens the AOHT (Figs. 1c,d). In the hosing experiments, the freshening of the North Atlantic also weakens the AMOC and AOHT (Figs. 1b,d). When hosing stops, the AMOC recovers in some experiments and not others (Fig. 2). There are significant correlations between M26 and M45 and AOHT for all experiments since when the AMOC change is large, it is coherent across the basin and has a large impact on the heat transport. Hence results for many sections of this study are at least qualitatively similar when using M45 or AOHT rather than M26 (not shown). However, there are some differences when using fingerprints to detect variability (section 4). This is because internal variability can differ between the subtropical and subpolar AMOC and because we might expect variability in temperature or the horizontal circulation to influence AOHT as well as the AMOC. Hence, we include some discussion of the potential of the metrics to be fingerprints of M45 and AOHT in section 4.
3. AMOC metrics

We start with a description of different metrics (listed in Table 1 and shown in Figs. 3 and 4), and a comparison with the AMOC. Later sections investigate the potential of these metrics as fingerprints for different scenarios.
Time series for a few of the metrics are shown in Figs. 5–8, and correlations of decadal-mean time series of the metrics with M26 for the different experiments are shown in Table 2. Many metrics show strong correlations for the 1PC experiment and the hosing experiments; however, in these experiments many quantities have long-term trends, and a strong correlation may simply mean that the metric also has a trend.
a. Existing monitoring arrays
We extract the AMOC in a similar way to two existing observational arrays. For the RAPID section we calculate the AMOC at 26.5°N using the observational method of geostrophic approximations to velocity, rather than model velocities themselves (Roberts et al. 2013b). Hence there are slight differences with M26 although, as expected, there are very high correlations. The RAPID observations have three components: the Florida Straits transport (FC), the upper mid-ocean component (UMO), and the wind-driven Ekman component (McCarthy et al. 2015b). The latter of these is very strongly related to the wind forcing and has little relationship to the AMOC and so is excluded. Much of the AMOC weakening in the hosing and 1PC runs comes from the FC component rather than the UMO component; however, we note that individually both components show little simultaneous relationship to M26 during decadal variability in the control (Table 2).
The OSNAP section (Lozier et al. 2019) is calculated from model velocities in density coordinates and is extracted along the observational path using the method in Zou et al. (2020). The OSNAP overturning and its east and west components show strong correlations with M26 (Table 2); the east component has the strongest mean strength and greatest weakening in the 1% and hosing experiments (not shown).
b. Temperature metrics
Many metrics have been calculated based on sea surface temperature (SST) because SST can be strongly impacted by the heat transport of the AMOC. Temperatures are also readily measured, through satellite data at the surface, through subsurface ARGO floats, and even indirectly through paleo evidence (Mann et al. 2008). The metrics shown differ in the locations used for the metric (see Table 1 and Fig. 3). Various studies have shown links in models between the multidecadal variability of Atlantic surface temperatures and multidecadal variability of the AMOC, leading Latif et al. (2004), Msadek et al. (2010), Roberts et al. (2013a), and Rahmstorf et al. (2015) to propose indices (amv1, amv2, SST_dipole, SST_spg) based on the SST in the North Atlantic. To remove the influence of temperature changes from increasing greenhouse gases, these indices also subtract the global mean, the hemispheric mean, or South Atlantic mean temperature. Caesar et al. (2018) instead used a dipole between the Gulf Stream and subpolar gyre and only winter and spring SSTs.
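The SST indices above share a common recipe: an area-weighted mean over some North Atlantic region minus a global (or hemispheric, or South Atlantic) mean to remove the externally forced warming. A minimal sketch of that recipe, with a synthetic zonally symmetric SST field and a crude latitude band standing in for the North Atlantic regions of Table 1:

```python
import numpy as np

def area_weighted_mean(field, lat):
    """Mean of a (lat, lon) field weighted by cos(latitude)."""
    w = np.cos(np.radians(lat))[:, None] * np.ones_like(field)
    return np.nansum(field * w) / np.nansum(w)

# Synthetic SST: warm tropics, cold poles, zonally uniform (illustrative).
lat = np.linspace(-89.0, 89.0, 90)
sst = 25.0 - 20.0 * np.abs(np.sin(np.radians(lat)))[:, None] * np.ones((90, 180))
na = (lat >= 0.0) & (lat <= 60.0)  # stand-in for a North Atlantic box
# AMV-style index: regional mean minus global mean
index = area_weighted_mean(sst[na], lat[na]) - area_weighted_mean(sst, lat)
```

With real data the subtraction would use an annual- or decadal-mean global SST series, and the regional mask would follow the specific boxes each index defines.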
Of the SST metrics, amv2 (Fig. 5) has the strongest relationship with M26, although amv1 and SST_spg also generally have strong correlations. Some [SST_dipole (see Fig. 6) and SST_caesar] show little relationship with M26 in the control; however, if temperature changes are caused by the AMOC changing heat transport, which changes the heat content, then a lagged relationship would be expected instead. This will be investigated further in section 4. Most temperature metrics show strong correlations when there are large ongoing trends (1% and hosing experiments) but are less good at explaining the recovery experiments. SST_dipole shows a negative trend for 1PC because, although the subpolar gyre warms less than other regions, as expected, the Nordic seas farther north warm more (Fig. 3).
Similar to the SST fingerprints, Zhang and Zhang (2015) suggested using empirical orthogonal functions (EOFs) of detrended heat content over the top 700 m (uohc) or temperatures at 400 m (Tsub). We calculate the first EOF of CON and define regions for a dipole based on this pattern (shown in Fig. 3a for Tsub; see Table 1). We use these fixed regions to calculate time series for each experiment, since the EOF assumes that the whole time series is already known. However, we have also calculated the first principal component (PC1) time series associated with each experiment’s own first EOF (not shown). In experiments where there is a strong trend (1PC and hosing experiments), the first EOF would show a trend; however, the EOFs of detrended time series do not resemble the AMOC. In experiments where there is little or no overall trend (CON and experiments where the AMOC recovers after hosing), PC1 resembles the time series calculated from the fixed regions. We choose to use fixed regions to calculate time series since, in practice, the EOF analysis could only be done on years prior to the year in consideration, and this restricts the time periods we can examine.
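The EOF construction described above amounts to taking the leading singular vector of the (time, space) anomaly field, with PC1 the associated time series. A sketch with a synthetic field (a fixed dipole modulated by a slow oscillation plus noise, standing in for the Tsub or uohc anomalies):

```python
import numpy as np

def first_eof(field):
    """Leading EOF and PC of a (time, space) anomaly field via SVD."""
    anom = field - field.mean(axis=0)  # remove the time mean
    u, s, vt = np.linalg.svd(anom, full_matrices=False)
    return vt[0], u[:, 0] * s[0]       # spatial pattern, time series

rng = np.random.default_rng(0)
t = np.arange(200)
pattern = np.concatenate([np.ones(10), -np.ones(10)])  # a fixed dipole
signal = np.sin(2.0 * np.pi * t / 60.0)                # ~60-yr variability
field = signal[:, None] * pattern[None, :] + 0.1 * rng.standard_normal((200, 20))
eof1, pc1 = first_eof(field)
# PC1 recovers the imposed oscillation (the sign of an EOF is arbitrary)
corr = np.corrcoef(pc1, signal)[0, 1]
```

In practice the field would be detrended first, as in the text; the fixed-region dipole index then replaces PC1 once the EOF pattern has been used to choose the regions.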
We do not see strong instantaneous relationships of Tsub and uohc with the AMOC, other than in the 1PC run (Table 2). The lack of a relationship in the control may again be because a lagged relationship is more appropriate, but it is possible that changes in heat advection through gyre advection also play a significant role in the relationship between temperature and AMOC (Roberts et al. 2013a). The AMOC variability in this model is also weaker than observed, which could lead to an underestimation of the relationship with Atlantic Ocean temperature (Yan et al. 2018). We note that a dipole between the two regions used for Tsub and uohc does develop in the 1PC run leading to positive correlations of this index with the AMOC, but in the hosing runs the correlations are weak (Table 2).
c. Density metrics
The second most common class of metrics is those based on density gradients. This originates from the knowledge that the AMOC can be related to zonal pressure difference, and studies have found that this results in a relationship between the AMOC and both meridional density gradients (Thorpe et al. 2001; de Boer et al. 2010; Robson et al. 2013) and zonal density gradients (Baehr et al. 2007). Roberts et al. (2013a) proposed a meridional difference (m_dipole) and Baehr et al. (2007) and Robson et al. (2016) proposed zonal density differences at different latitudes (z_dipole, 26N_dipole). Since the density signal is mostly found in the western subpolar gyre, Roberts et al. (2013a) and Hermanson et al. (2014) also suggested densities in the Labrador Sea (LS_mid, LS_dipole). Previous modeling studies have shown that densities in the western subpolar gyre can be modified by surface fluxes over decadal time scales, with density anomalies then propagating southward along the western boundary (Ba et al. 2014; Buckley and Marshall 2016; Ortega et al. 2017). Hence density changes in the subpolar gyre could be precursors to AMOC change.
Of the density metrics, the one that works best is m_dipole (which we note was designed for this ensemble; Jackson and Wood 2018b). The time series for m_dipole is shown in Fig. 7. Examination of the other density metrics (not shown) shows that the differences during the hosing experiments were mainly due to the lack (for LS_mid) or the location (for LS_dipole, z_dipole) of a second region used to create a density gradient. For these experiments the changes in the deep ocean can be very large, so accounting for them correctly is important. For the density gradient at 26.5°N (26N_dipole) there is little relationship with M26: the hosing experiments mostly show no change or a weakening, while the 1PC experiment shows an increase. This may be because the regions used to calculate the boundary densities are relatively wide and may not represent values on the actual boundaries.
Butler et al. (2016) pointed out that, instead of density gradients, it would be more accurate to use gradients in pressure itself. The meridional gradient in pressure (pintg) has a good relationship with M26 across many experiments.
d. Mixed layer depth
The mixed layer depth (MLD) in March can be used as a proxy for deep convection which affects the overturning circulation. Modeling studies have shown that changes in mixed layer depths can occur before changes in the AMOC (Ba et al. 2014; Buckley and Marshall 2016; Ortega et al. 2017), so MLD may be useful for providing early warning of AMOC changes.
Jackson and Wood (2018a) noted that MLD may be a useful fingerprint for the AMOC in these experiments. SPG_MLD (shown in Fig. 8) and LS_MLD show largely similar behavior to the AMOC time series, with strong correlations (other than in the control, although there may be a lagged relationship; see section 4). The MLD time series shows larger interannual variability than the AMOC. It also seems to respond faster, both in the recovery of the MLD after hosing and in the temporary recovery and subsequent weakening seen in some of the nonrecovering experiments after hosing. In the 1PC experiment it weakens faster and then levels off, unlike the more linear decrease of the AMOC itself. These all point to the potential of the MLD to act as an early warning of AMOC change. Since SPG_MLD shows similar, but marginally better, agreement with the AMOC than LS_MLD, we drop LS_MLD from further analysis. We also note that the MLD in HadGEM3-GC2 is too deep in the Labrador Sea, and hence using a wider area is less likely to be model dependent.
e. Sea level
One impact of a changing AMOC is changes in sea surface heights, particularly across the Gulf Stream, leading to differences in sea level along the eastern North American coast. This has led to proposals of using the sea level difference between coastal locations as fingerprints (Bingham and Hughes 2009; McCarthy et al. 2015a). Of the two sea level metrics, the more southerly dipole (sl1) performs better than the more northerly dipole (sl0), although less well than other metrics. It is possible that model deficiencies in the path of the Gulf Stream, particularly in its separation from the coast, could have adversely affected these metrics.
f. Freshwater transports
Although freshwater transports are less easy to monitor than other options, there have been suggestions that the potential of the AMOC for collapse and the ability of the AMOC to recover may be related to the transport of freshwater (de Vries and Weber 2005). As well as the more established metrics related to the overturning and total freshwater transport at the southern boundary of the Atlantic (fov34S, ftot34S), Jackson and Wood (2018a) found that the transports at 30°N (fov30N, ftot30N) play an important role in the recovery of the hosing experiments used here.
Although we found good correlations of freshwater transport with the AMOC (not shown), this is not surprising since we would expect a relationship with at least the overturning component of the freshwater transport. We found that the freshwater transports gave no extra information, so given that they are more difficult to observe, we only include them in the analysis of thresholds (section 7).
4. Fingerprint for variability
There is evidence that the AMOC experiences decadal to multidecadal variability from model simulations (Zhang and Wang 2013; Ba et al. 2014) and observations of SSTs (Knight et al. 2006). These changes have impacts on temperatures, precipitation, and storms around the North Atlantic, as well as wider impacts on monsoons. One potential use for fingerprints is in detecting these changes, and potentially improving the detection time. In this model the AMOC has multidecadal variability, and the autocorrelation shows periods of around 40–60 years (Figs. 1e,f). To examine how well the metrics capture the decadal to multidecadal variability of the AMOC we use running decadal means (mean over the following decade starting from each year).
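The running decadal mean used here (the mean over the following decade starting from each year) is a forward-looking moving average; a minimal sketch:

```python
import numpy as np

def running_decadal_mean(x, window=10):
    """Forward-looking running mean: element i is the mean of
    x[i:i+window], keeping only windows that fit inside the series."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="valid")

x = np.arange(20.0)           # an idealized annual-mean series
dm = running_decadal_mean(x)  # 11 values; dm[0] is the mean of years 0-9
```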
We show lagged correlations of our metrics to M26, M45, and AOHT for the control run in Fig. 9. Correlations that are judged significant in Fig. 9 are shown with solid lines, where significance is tested using a moving block bootstrap with blocks of 20 years to account for autocorrelation (Wilks 1997). Using longer blocks (30 years) makes little difference, but using a single year was found to increase the number of values found to be significant since autocorrelation is not correctly accounted for. Those metrics with no significant signal are not shown. Some metrics where the results are very similar to others are also not shown, although the similarity is noted in the text. The AMOC in this model has internal variability of around 60 years, leading to peaks in negative correlations 30 years before or after (Fig. 1f).
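The moving block bootstrap test can be sketched as follows: a null distribution of correlations is built by resampling one series in contiguous 20-yr blocks, which preserves autocorrelation within blocks (the series, block length, and threshold below are illustrative, not the paper's data):

```python
import numpy as np

def block_bootstrap_corr(x, y, block=20, nboot=2000, seed=1):
    """Null correlations from a moving block bootstrap: y is rebuilt
    from randomly chosen contiguous blocks, preserving autocorrelation
    within each block (cf. Wilks 1997)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    nblocks = -(-n // block)  # ceiling division: blocks needed to cover n
    null = np.empty(nboot)
    for i in range(nboot):
        starts = rng.integers(0, n - block + 1, nblocks)
        yb = np.concatenate([y[s:s + block] for s in starts])[:n]
        null[i] = np.corrcoef(x, yb)[0, 1]
    return null

rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y = x + 0.5 * rng.standard_normal(200)      # correlated by construction
r = np.corrcoef(x, y)[0, 1]
null = block_bootstrap_corr(x, y)
significant = r > np.quantile(null, 0.975)  # upper tail of a 5% test
```

With strongly autocorrelated series (as for decadal-mean AMOC metrics), using single-year resampling instead of blocks would narrow the null distribution and inflate the number of significant correlations, as noted in the text.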
RAPID and OSNAP both show strong correlations near lag zero with M26, M45, and AOHT. There is also a significant correlation of the upper mid-ocean component of RAPID. The maximum correlation of OSNAP occurs 1–2 years before the maximum correlation in RAPID in each case, with RAPID coinciding with M26 but lagging M45 by a year. These lags are consistent with a signal propagating southward; however, the lag is more consistent with a fast wave signal rather than a slower advective signal (Buckley and Marshall 2016). Negative correlations at ±20–30 years are indications of multidecadal variability of the AMOC.
Of the temperature-related metrics, amv1 (and, similarly but less significantly, amv2; not shown) shows good correlations with M26, M45, and AOHT at lag zero, with strong lagged correlations also from SST_spg. A lagged signal might be expected from an influence of the AMOC on heat transport and hence on the temperatures. Reasons for signals to precede an AMOC change, such as for SST_dipole, are less clear. A negative correlation preceding the AMOC could be from a cold and dense anomaly that drives an AMOC change. Spatial correlations of SST (Fig. 3b) show the typical horseshoe pattern of warm SSTs in the subpolar gyre, the eastern subtropics, and across the tropics (Ba et al. 2014; Wills et al. 2019). This is captured by those metrics that use a large region covering the subpolar gyre or much of the North Atlantic. The metrics focusing on a dipole between the subpolar gyre and the Gulf Stream (Tsub, SST_caesar) perform less well since there is no opposing temperature change in the Gulf Stream such as seen in Zhang (2008) and Zhang and Zhang (2015).
The density metrics m_dipole, LS_dipole, and pintg all have significant correlations with M26, M45, and AOHT. The signal in LS_dipole occurs the earliest (about 5 years before M26 and AOHT, and 3 years before M45). The different lags and relationships suggest that the nature of the relationship between the AMOC and the Labrador Sea density depends on the depth and exact region used for the density. SPG_MLD (and LS_MLD; not shown) shows significant correlations near zero lag with M45 but not M26. There are also significant correlations with both sea level metrics; however, these are mostly lagging the changes in AMOC and AOHT. We note opposing correlation signs between sl0 and sl1, suggesting that the region affected is around 40°N, which is the northerly region for sl0 and the southerly region for sl1.
There are several density and temperature metrics that are useful for monitoring AMOC variability, such as SST_dipole, amv1, amv2, SST_spg, LS_dipole, m_dipole, and pintg. Density metrics such as LS_dipole can give warning of AMOC changes up to 5 years ahead. Temperature changes can also give early warning: several temperature metrics show a negative correlation (cold SSTs before an AMOC increase) preceding M45. However, we note that this is not true for M26 and that the negative correlation is likely related to the multidecadal variability of the AMOC. Since the timing and mechanisms of multidecadal variability vary a lot across models (Ba et al. 2014), these negative correlations are likely to be model dependent. One advantage of SST metrics is the potential of extending the historical record back in time to understand past variability (Rahmstorf et al. 2015; Caesar et al. 2018).
5. Detecting a weakening
Most metrics shown in Table 2 are able to detect a weakening in a 1% or hosing scenario; however, a metric that is good for detection and early warning will detect a change earlier. This may be because the change actually happens earlier, or because there is a greater signal-to-noise ratio, meaning that the signal emerges from the noise earlier. Using the hosing experiments and the 1PC experiment, we use annual means to calculate the earliest year Y where the trend (from the start of the experiment to year Y) of the forced experiment lies outside the 95% range of trends of length Y years from the control experiment (Baehr et al. 2007; Brennan et al. 2008; Roberts and Palmer 2012). We also calculate a signal-to-noise ratio for each metric (s2n), based on the ratio of the standard deviation across all weakening experiments to the standard deviation in the control.
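The detection-time calculation can be sketched as follows, with synthetic series standing in for the forced and control runs (the trend magnitude and noise level are illustrative, and overlapping control segments are used as a simple approximation of the control trend distribution):

```python
import numpy as np

def trend(x):
    """Least-squares linear trend (units per year)."""
    return np.polyfit(np.arange(len(x)), x, 1)[0]

def detection_year(forced, control, min_len=5):
    """Earliest year Y at which the trend of the forced run over years
    0..Y falls outside the 95% range of length-Y trends drawn from the
    control (overlapping control segments approximate that range)."""
    for Y in range(min_len, len(forced) + 1):
        ctrl = [trend(control[s:s + Y]) for s in range(len(control) - Y + 1)]
        lo, hi = np.quantile(ctrl, [0.025, 0.975])
        if not lo <= trend(forced[:Y]) <= hi:
            return Y
    return None

rng = np.random.default_rng(2)
control = rng.standard_normal(200)                       # stationary noise
forced = -0.1 * np.arange(60) + rng.standard_normal(60)  # steady weakening
Y = detection_year(forced, control)                      # first detection year
```

A metric with a higher signal-to-noise ratio narrows the control trend range relative to the forced trend and therefore tends to give a smaller Y.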
Results are shown in Table 3 for M26, M45, and AOHT, the observational arrays (RAPID, OSNAP), and those metrics where the detection time is shorter in most experiments than for M26. There are some differences between metrics, but it is difficult to assess whether those differences are significant or simply due to internal variability without having ensembles for each forcing scenario. There is mostly a faster detection time for M26 than either M45 or AOHT. This may be because of a greater signal-to-noise ratio. We note that OSNAP has a number of experiments where the detection time is faster than M26, despite having a similar signal-to-noise ratio. This may be because changes in the AMOC are initiated at high latitudes, and can take a few years to reach the subtropics (Buckley and Marshall 2016).
A few metrics show mostly faster detection times (although not in every experiment) than M26. In particular, m_dipole has detection times that are at least 30% faster than the AMOC itself for all the hosing experiments, although detection is slower in the 1PC run. This may be because temperature increases from the increased greenhouse gases also affect the densities used for the fingerprint. LS_dipole also has earlier or equal detection times for all experiments. We note that these metrics both have higher signal-to-noise ratios than M26, which could also improve the detection time. The SPG_MLD metric also has mostly earlier detection times despite a similar signal-to-noise ratio. The good detection times of these metrics (m_dipole, LS_dipole, and SPG_MLD) may also be because they are monitoring processes occurring at higher latitudes where the signal can precede M26.
No single metric shows a consistent improvement over M26 itself, but several show detection rates that are similar or better in most experiments. These suggest that monitoring of the subpolar gyre region, in particular the densities, deep convection, and overturning, could provide early warning for changes.
6. Detecting recovery
The last section discussed the time of detection for an ongoing weakening; however, there are many experiments where hosing is removed and the AMOC either recovers or stays weak. A simple time of detection of a trend from the start of the experiment does not distinguish between a metric that starts changing but takes some time to emerge from the noise and a metric where the change is delayed. One good example of this is the experiment where the AMOC recovers after 150 years of 0.1 Sv hosing (Fig. 2, purple), where the recovery does not actually start until 100 years after the removal of hosing.
An alternative method for assessing the time of detection is presented here. Assuming a window of length n years, we calculate trends of length n for the experiments where the AMOC recovers after hosing and compare them to a bootstrapped probability density function (pdf) of trends of length n years from the control. Times at which a trend falls below the 5th percentile of the pdf count as detection of a negative trend, and times above the 95th percentile as detection of a positive trend. We choose n = 20 years as a balance between a period that is too short (where an ongoing signal is difficult to detect because of variability) and one that is too long (where detection of changes at the start of the run is delayed). We divide the recovery experiments into those where the AMOC recovers and those where it does not and assess the times of the first detected positive and negative trends in the two groups.
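The windowed-trend test can be sketched as follows: each 20-yr trend of the experiment is compared with a bootstrapped pdf of 20-yr trends from the control, and flagged below the 5th percentile as a negative detection or above the 95th as a positive one (the series below are synthetic):

```python
import numpy as np

def windowed_trend_flags(series, control, n=20, nboot=2000, seed=3):
    """Flag each n-yr trend of `series` against a bootstrapped pdf of
    n-yr trends from the control: -1 below the 5th percentile,
    +1 above the 95th, 0 otherwise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)

    def slope(x):
        return np.polyfit(t, x, 1)[0]

    starts = rng.integers(0, len(control) - n + 1, nboot)
    pdf = np.array([slope(control[s:s + n]) for s in starts])
    lo, hi = np.quantile(pdf, [0.05, 0.95])
    trends = np.array([slope(series[s:s + n])
                       for s in range(len(series) - n + 1)])
    return np.where(trends > hi, 1, np.where(trends < lo, -1, 0))

rng = np.random.default_rng(0)
control = rng.standard_normal(300)
# Flat for 50 years, then a steady recovery, plus interannual noise
series = np.concatenate([np.zeros(50), 0.2 * np.arange(50.0)])
series = series + rng.standard_normal(100)
flags = windowed_trend_flags(series, control)
```

Unlike a trend measured from the start of the experiment, this flags the recovery only once the rising part of the series enters the window, which is the behavior needed for the delayed-recovery cases discussed above.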
Since there is only one recovery experiment (off10_10) in which a negative AMOC change is detected (a nonrecovering AMOC is normally associated with a relatively flat time series), we focus on the positive trends and ask whether any metrics can detect the increase before the AMOC time series itself. Figure 10 shows the difference between the detected time of increase for each metric and the detected time of increase for M26. Only metrics that perform similarly to or better than M26 are shown. We note that off01_150 in particular shows a wide range of detection times. From examination of the time series (Fig. 2) we can see that the AMOC recovery is initially slow, with a weak signal compared to the noise (interannual variability). This long period of slow change increases the likelihood of falsely detecting a recovery, and the large noise compared to the signal results in a large spread of detection times. We note that using an initial-condition ensemble would reduce the uncertainty in the detection times.
For the experiments where the AMOC recovers (left side of the plot), m_dipole and LS_dipole detect the recovery as fast as or faster than the AMOC time series itself in all experiments. SPG_MLD also performs well, with a similar or earlier detection time than M26 in most experiments.
One aspect of the AMOC behavior once hosing has been removed in these experiments is that the AMOC can temporarily strengthen before weakening again and continuing in a weak state (Fig. 2). Because of this behavior, positive trends can be detected in M26 in those experiments where the AMOC does not recover (Fig. 10, right-hand side), giving a false indication that the AMOC will continue recovering.
We can assess whether these metrics indicate whether the AMOC is going to recover or remain in a quasi-stable weak state. A good indicator of recovery after hosing should have a high detection rate of recovery and a low false detection rate, and the time of detection should be as early as possible. Table 4 shows detection rates of recovery trends (see also Fig. 10). Most metrics have a high detection rate of recovery, and m_dipole has the fastest detection. However, it also has a high false detection rate, since all the runs where the AMOC does not recover show an initial increase in m_dipole. Other metrics also have high false detection rates, including the M26 time series itself. Three metrics have low false detection rates: LSmid, pintg, and SPG_MLD; of these, SPG_MLD has the fastest detection of recovery. Hence SPG_MLD is able to quickly determine whether or not the AMOC is recovering.
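The two scores used above can be made explicit with a small sketch. The experiment outcomes here are hypothetical, not the values in Table 4: the detection rate is the fraction of recovering experiments in which a positive trend is detected, and the false detection rate is the fraction of nonrecovering experiments in which one is detected anyway.

```python
def indicator_scores(detected, recovered):
    """detected, recovered: equal-length lists of booleans, one entry per
    experiment (was a positive trend detected? did the AMOC recover?).
    Returns (detection rate, false detection rate)."""
    hits = sum(d and r for d, r in zip(detected, recovered))
    false_alarms = sum(d and not r for d, r in zip(detected, recovered))
    n_recover = sum(recovered)
    n_nonrecover = len(recovered) - n_recover
    return hits / n_recover, false_alarms / n_nonrecover

# Hypothetical example: 4 recovering and 3 nonrecovering experiments.
rate, false_rate = indicator_scores(
    detected=[True, True, True, False, True, False, False],
    recovered=[True, True, True, True, False, False, False])
# rate = 0.75, false_rate = 1/3
```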
Examination of the SPG_MLD time series (Fig. 8) shows that there is some temporary increase, indicating increased convection, but it is short-lived and mostly within the large variability exhibited in the control. Hence the temporary strengthening is not detected.
Hence we find that SPG_MLD is quick to detect recovery of the AMOC and has a low false detection rate. However, detecting the lack of recovery is difficult because it requires detecting the absence of a trend.
7. Detecting a threshold
Jackson and Wood (2018a) showed that the AMOC in these experiments exhibits hysteresis: when the hosing stops, in some experiments the AMOC does not recover (see also Fig. 2). They showed that the duration of hosing is important and that the AMOC has temporary resilience: with a hosing strength of 0.3 Sv (hos03), the AMOC can recover after 20 years of hosing but not after 50 years. Hence the threshold of temporary resilience is crossed between 20 and 50 years in this case, giving us a window for the threshold. They found that the AMOC strength was a good indicator of this threshold: experiments where the AMOC did not weaken below 8 Sv showed a subsequent recovery once hosing stopped, but those where the AMOC weakened below 8 Sv stayed in a weak state. They also found that the salt or freshwater transport by the AMOC at 30°N and the mixed layer depth in the Labrador Sea were potential indicators. However, that analysis used centered decadal means to remove noise, which means that after, for example, 20 years of hosing one needs to know the AMOC averaged over years 15–25. For a more practical indicator we need to consider the preceding decade (i.e., years 10–20).
Figure 11 uses decadal means from the preceding decade for different metrics. Metrics not shown are RAPID and OSNAP, which show similar results to M26, and metrics with large overlaps between the bars. The blue bars show the range of values for the periods before and including the lower bound of the threshold (i.e., where the AMOC can recover when hosing stops), and the red bars show the range of values for the periods after and including the upper bound of the threshold (i.e., where the AMOC does not recover when hosing stops). The majority of metrics show a separation between the two ranges when a centered decadal mean is used (not shown); however, when using the preceding decade, only SST_dipole and LS_dipole show a separation. This means that the values of these indicators can be used (in these experiments) to determine whether the threshold has been crossed.
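The distinction between a centered and a preceding decadal mean, and the separation test applied to the blue and red bars, can be sketched as follows. The indicator values are hypothetical, chosen only to illustrate a usable (nonoverlapping) indicator.

```python
import numpy as np

def preceding_decade_mean(series, year):
    """Mean over the decade preceding model year `year` (years year-10
    to year-1): available in real time, unlike a centered decadal mean,
    which would need years year-5 to year+5."""
    return float(np.mean(series[year - 10:year]))

def ranges_separate(blue, red):
    """True if the 'AMOC can still recover' (blue) and 'AMOC will not
    recover' (red) ranges of an indicator do not overlap."""
    return max(blue) < min(red) or max(red) < min(blue)

# Hypothetical indicator values before/after the resilience threshold.
can_recover = [1.2, 1.5, 1.8]   # blue bars
no_recovery = [2.4, 2.9]        # red bars
usable = ranges_separate(can_recover, no_recovery)  # True: a usable indicator
```

An indicator for which `ranges_separate` is False (overlapping bars) cannot distinguish the two regimes, which is why most metrics fail this test when only the preceding decade is used.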
One interesting observation is that fov34S is not a useful indicator of crossing this threshold. The freshwater transport by the AMOC into the Atlantic has been proposed by many studies as an indicator of whether an AMOC off state exists, and hence whether hysteresis can occur (de Vries and Weber 2005; Huisman et al. 2010; Drijfhout et al. 2011). These theories are based on the idea that if fov34S < 0 then a feedback occurs: a weakening AMOC results in less export of freshwater from the Atlantic, giving a fresher, more buoyant Atlantic and further weakening of the AMOC. However, Jackson and Wood (2018a) showed that this feedback did not occur in these experiments, since the Atlantic as a whole did not necessarily get fresher. Instead, the freshwater transport at 30°N (fov30N) was found to be more relevant, with the subpolar North Atlantic getting fresher. The metric fov34S gives no indication of crossing the threshold, and fov30N is not as clear an indicator as SST_dipole and LS_dipole (Fig. 11).
SST_dipole and LS_dipole give the clearest indications of crossing the threshold, but a key caveat is that we do not know what the values of these thresholds would be in the real ocean, or even whether the same hysteresis behavior is seen.
8. Conclusions
We have used a suite of experiments to examine different methods proposed in previous studies for monitoring the AMOC, including model representations of the current AMOC monitoring arrays for the RAPID section (at 26.5°N) and the OSNAP section (50°–60°N). These arrays show strong relationships with the true model AMOC at 26.5° and 45°N and with the ocean heat transports. We have also examined various metrics based on quantities such as ocean temperature, density, sea level, and mixed layer depth that have been proposed as potential fingerprints (each representing an important aspect of the AMOC).
Temperature metrics based on large-scale temperature patterns are found to be good fingerprints for AMOC variability, but less so for long-term AMOC weakening, where the patterns can differ between scenarios and changes take longer to detect than with other metrics. However, a dipole in SST between the subpolar gyre and the South Atlantic (SST_dipole) is good at giving early warning of passing a threshold in this model beyond which the AMOC does not recover. Temperature metrics based on local patterns are not found to be useful fingerprints in any scenario.
Density metrics prove to be useful fingerprints for a range of scenarios. In particular, the large-scale meridional density gradient and the vertical gradient of density in the Labrador Sea are good fingerprints both for assessing AMOC variability (with the Labrador Sea density giving a 5-yr early warning of AMOC change) and for detecting AMOC changes. The Labrador Sea density can also give early warning of passing a threshold beyond which the AMOC does not recover.
In some hosing experiments the AMOC experiences a temporary recovery before reducing again to a weak state. A number of the metrics replicate (or exaggerate) this temporary recovery; however, the mixed layer depth is a useful metric because it quickly distinguishes those experiments where the AMOC recovers from those where the AMOC stays in a weak state after a temporary recovery.
Of course, this analysis has all been conducted with one climate model, and there are likely to be differences in other models (Roberts et al. 2013a). In particular we note that there are differences in the mechanisms and in the frequency of AMOC variability across models (Ba et al. 2014; Menary et al. 2015), which are likely to impact the patterns associated with variability. It is also possible that modifications to the metrics here (which have mostly been developed using a single model) could make them more successful. However, understanding how these proposed metrics perform in one model across a number of scenarios is a first step to developing a robust fingerprint. We also need to improve our understanding of the processes involved and hence which metrics are most likely to be robust.
Current arrays monitoring the AMOC directly (RAPID and OSNAP) are shown to perform well, although fingerprints can provide added value. Results here show that some metrics are useful fingerprints; however, the best metric may depend on the question being asked. The results do point to continued monitoring of densities, surface temperatures, and mixed layer depths, particularly in the subpolar gyre. A successful monitoring strategy for climate-relevant AMOC changes is likely to involve using multiple fingerprints and physical understanding.
The authors were supported by the Met Office Hadley Centre Climate Programme funded by BEIS and Defra (GA01101). This project is TiPES contribution number 13: this project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement 820970. The authors declare no conflicts of interest.