## 1. Introduction

The ocean stores most of the Earth's energy uptake associated with historical forcings (Levitus et al. 2012; Rhein et al. 2013; von Schuckmann et al. 2016). This energy is primarily stored in the form of heat and warms the various layers of the ocean. Because this heat is mixed and transported into the ocean depths, it delays surface warming. As such, ocean heat uptake plays an important role in the global climate system’s temperature response to radiative forcing (Collins et al. 2013).

Ocean heat content (OHC) changes are driven by a combination of internal climate variability and external forcings. External forcings include contributions from natural origins (e.g., changes in solar irradiance, volcanic emissions of aerosols) and from anthropogenic origins (e.g., changes in greenhouse gases (GHGs), ozone and aerosol concentrations, changes in land use). The internal variability is the variability spontaneously generated by the climate system in the absence of changes in the external forcing. Detection and attribution (D&A; Bindoff et al. 2013) studies aim to unravel the response of the global OHC to these different forcings. Several past D&A studies demonstrate a human footprint and assess the role of natural and anthropogenic factors in observed OHC changes at the global scale (Barnett et al. 2001, 2005; Pierce et al. 2006, 2012; Gleckler et al. 2012, 2016; Tokarska et al. 2019). These studies clearly identified the anthropogenic forcing as the main contributor to the global variations of OHC over the last decades. However, within the anthropogenic forcing influence, they have not tried to separate and quantify the role played by GHG emissions specifically, as opposed to other anthropogenic forcings (including the role of aerosols). A few other studies have developed similar D&A approaches to analyze the influence of anthropogenic forcing on the global mean thermosteric sea level, which is a proxy of the global OHC changes (Domingues et al. 2008; Marcos and Amores 2014). They confirmed that, as for OHC, the anthropogenic forcing is the main contributor to the observed global mean thermosteric sea level changes over the past decades. Two recent studies, by Slangen et al. (2014) and Bilbao et al. (2019), went one step further and attempted to separate the effect of GHG emissions from the effect of other anthropogenic forcings on thermosteric sea level and ocean temperature, respectively. Both studies found it difficult to separate the effect of GHGs from the effect of other anthropogenic forcings because of a degeneracy between the global signals induced by GHG emissions and anthropogenic aerosols. They found that aerosols and GHGs have opposite and anticorrelated effects on OHC that mirror each other in the global mean [aerosols cool the system whereas GHGs warm it; see, e.g., Fig. 3c in Slangen et al. (2014)]. They concluded that extra spatiotemporal information is necessary to remove the degeneracy effectively and to disentangle the role of GHG emissions from the role of aerosols with regard to global thermosteric sea level change and OHC change.

Estimating the OHC change associated with GHG emissions is essential to improve our understanding of the global climate system’s temperature response to the anthropogenic GHG radiative forcing (Collins et al. 2013). In particular, it can help in refining our estimates of key characteristics of this response such as the equilibrium climate sensitivity (which determines the long-term equilibrium warming response to stable atmospheric composition) or the transient climate response (which is a measure of the magnitude of transient warming while the climate system, particularly the deep ocean, is not in equilibrium). Ultimately, this would help both to understand the present-day climate change and to constrain predictions of future climate change (e.g., Irving et al. 2019).

The objective of this study is to unravel the response of the global OHC to these different forcings and ultimately determine the global OHC changes associated with GHG emissions. To tackle this problem, we adopt a D&A approach as in previous studies. We compare observations of OHC to a range of single-forcing experiments simulated with a set of 39 climate models from the archive of phase 5 of the Coupled Model Intercomparison Project (CMIP5; Taylor et al. 2012). We complement previous D&A studies on the OHC by using a larger set of attribution experiments driven by various external forcings: historical, natural only, anthropogenic only, GHG only, and aerosols only (see section 2). We use the control runs to quantify the internal climate variability (see section 2). Compared to previous studies, our approach is original for three reasons. First, we use a new D&A method that takes into account, for the first time, the uncertainty in the regional pattern of OHC changes (see section 2). Second, we follow Pierce et al. (2006) and ensure a rigorous comparison between model simulations and observations by using a time-varying sampling mask to restrict the comparison to where and when in situ ocean observations are available. The Pierce et al. (2006) approach of using a time-varying sampling mask has not been followed in recent D&A studies (Domingues et al. 2008; Marcos and Amores 2014; Slangen et al. 2014; Gleckler et al. 2016), but it proves to be important for an accurate comparison between observations and model simulations (see section 3). Third, we use for the first time a bivariate version of the D&A method to introduce regional information in order to disentangle the role of GHG emissions from the role of aerosols (see section 3). The results of our analyses are presented in section 3 and discussed in section 4.

## 2. Data and methods

D&A analysis provides a framework to compare the OHC anomaly trends calculated from both model outputs and in situ ocean observations. Dealing with different sources of data implies using different spatial grids, masks, corrections, and averages. To get consistent input data for the D&A method, we pay particular attention to applying the same treatments to simulations and in situ observations.

Sections 2a and 2b describe the in situ ocean observations and climate simulations used in this study, respectively. Section 2c details the calculation of the OHC anomaly trends and the construction of the observational mask, which is applied to both observations and climate simulations. The D&A method is explained in section 2d.

### a. Observed data

The Ishii et al. (2017) and Levitus et al. (2012) datasets are used in this study. Both were constructed using temperature and salinity observations from the World Ocean Database. The main differences between the two datasets lie in the interpolation techniques and in the instrumental corrections for biases in expendable and mechanical bathythermograph data. The observed OHC anomalies for the upper 700 m used in this study are the ensemble mean of the two reconstructions.

Reconstructed fields are provided globally and over the full ocean depth, although in situ observations are sparse and unevenly distributed with a bias toward the Northern Hemisphere and the upper ocean (the vast majority of observations are above 700-m depth; Rhein et al. 2013; Abraham et al. 2013). In particular, there are very few observations in the Southern Ocean before the year 2000 (Durack et al. 2014). Different techniques are used to fill areas presenting few or no observations. Ishii et al. (2017) and Levitus et al. (2012) use objective interpolation techniques to carry out the infilling, which biases the temperature anomaly toward zero in data-sparse regions (Gregory et al. 2004; Gleckler et al. 2012). This affects the depiction of OHC changes and variability at the global scale and in regions with sparse data (Gregory et al. 2004; AchutaRao et al. 2007).

To remove this bias and ensure the accuracy of the D&A analysis, we apply an observational mask that restricts the data to areas and periods with a sufficient number of available observations. We build the observational mask using the data distribution field (dd) available in the *World Ocean Atlas*. For a given grid cell and a given year, the entire 0–700-m water column is considered as observed when at least 10 observations are present along the water column. To avoid seasonal biases, an additional criterion retains a given grid cell and a given year only if observations cover at least two trimesters. We also choose to focus on the period 1971–2005 because the data coverage becomes more global from 1971 onward (Rhein et al. 2013; Abraham et al. 2013).
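The two masking criteria can be sketched as follows. This is a minimal illustration and not the code used in the study: the function names `is_observed` and `build_mask` and the input layout (for each year and grid cell, a list of observation counts per trimester, already summed over the 0–700-m column) are assumptions made for the example.

```python
def is_observed(counts_per_trimester, min_obs=10, min_trimesters=2):
    """Return True if a grid cell counts as observed for a given year.

    counts_per_trimester: observation counts in the 0-700 m column, one
    value per trimester of the year (hypothetical input layout).
    """
    total = sum(counts_per_trimester)
    covered = sum(1 for c in counts_per_trimester if c > 0)
    # Criterion 1: at least `min_obs` observations along the water column.
    # Criterion 2: observations spread over at least `min_trimesters`.
    return total >= min_obs and covered >= min_trimesters


def build_mask(counts):
    """Build a time-varying mask from counts[year][cell] -> trimester counts."""
    return {
        year: {cell: is_observed(c) for cell, c in cells.items()}
        for year, cells in counts.items()
    }
```

With these thresholds, a cell with counts `[6, 0, 5, 0]` is retained, whereas `[12, 0, 0, 0]` (one trimester only) and `[3, 3, 0, 0]` (fewer than 10 observations) are masked out.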

Figure 1 shows the total and subsampled yearly mean OHC anomalies. As expected, the subsampling induces a larger variability in particular at the times of the sparsest sampling [i.e., before 1971 and between 1996 and 2003 when the coverage of the global ocean was below 40%; Abraham et al. (2013)]. The subsampled dataset also exhibits a larger trend over the period 1971–2005. This is not surprising as the infilling of nonobserved areas with null trends in Ishii et al. (2017) and Levitus et al. (2012) datasets tends to reduce the global positive trend observed over 1971–2005.

### b. Climate simulations

In this study, we use various types of CMIP5 numerical experiments: preindustrial control (PIC), historical (HIST; includes all forcings), natural forcing only (NAT), and greenhouse gas forcing only (GHG). The last two sets of simulations are specifically designed for D&A studies: only a subset of the external forcings varies in time.

PIC runs simulate the climate system under constant forcings, equal to those observed in 1850 or 1860. In particular, greenhouse gas atmospheric concentrations and aerosol rates are held constant at the levels observed in 1850 or 1860. The length of PIC runs ranges from 200 to 1156 years (see second column of Table 1), totaling more than 20 000 years of simulation. Those unforced simulations are used to correct a potential model drift and to estimate the internal variability (Sen Gupta et al. 2013). CMIP5 PIC runs start after the completion of the model spinup. However, the length of this spinup is usually too short to reach the required quasi-equilibrium in the deep ocean, and most PIC and other simulations still exhibit a drift in OHC (in particular in the deep ocean). For each model, and for each grid cell, the drift is computed as the linear trend of the OHC over the full length of the PIC run and is removed from the PIC and other simulations following Melet and Meyssignac (2015). Internal variability is estimated from the PIC runs divided into 35-yr segments, taken at 20-yr intervals, matching the length of the period of interest in this study: 1971–2005.
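The drift correction and the segmentation of the control runs can be sketched as below for a single yearly time series. This is an illustrative sketch under simplifying assumptions: it works on one series rather than on each grid cell, and the helper names `linear_trend`, `remove_drift`, and `segment_trends` are hypothetical.

```python
def linear_trend(values):
    """Least-squares slope of `values` against their index (per time step)."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def remove_drift(series, drift_per_year):
    """Subtract a linear drift whose slope was estimated on the full PIC run."""
    return [v - drift_per_year * t for t, v in enumerate(series)]


def segment_trends(pic_series, length=35, step=20):
    """Trends of overlapping segments taken along a drift-corrected PIC run."""
    drift = linear_trend(pic_series)            # drift over the full run
    corrected = remove_drift(pic_series, drift)
    starts = range(0, len(corrected) - length + 1, step)
    return [linear_trend(corrected[s:s + length]) for s in starts]
```

For a 200-yr run, this yields nine 35-yr segments starting at years 0, 20, ..., 160; consecutive segments overlap by 15 years, so they are not fully independent.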

Table 1. List of models used for this study, length of preindustrial control runs (PIC; in years), and number of realizations available for historical (HIST), natural only (NAT), and greenhouse gases only (GHG) experiments. (Expansions of acronyms are available online at http://www.ametsoc.org/PubsAcronymList.)

HIST runs simulate the climate over the period 1850–2005 in response to all known external forcings, including anthropogenic forcings (from GHGs, aerosols, ozone, and land use when available) and natural forcing (solar and volcanic forcings). NAT runs cover the same period but only the natural forcing varies with time. GHG runs also cover this period, with varying GHG concentrations while other forcings are kept constant. The inclusion of other attribution experiments (e.g., anthropogenic forcings only, aerosol forcing only) was prevented by the limited number of runs available in CMIP5. However, by assuming that the response to a combination of forcings is equal to the sum of the responses to individual forcings, we can infer the response of OHC to other combinations of forcings. The additivity assumption has been verified for the dynamical sea level by Slangen et al. (2015). It holds for OHC as dynamical sea level and OHC are linearly related at global scale. We infer the response of OHC to all anthropogenic forcings (including GHGs and aerosol forcing, referred to as ANT forcing hereafter) by calculating the difference between the response of OHC in HIST versus NAT simulations. We also infer the response of OHC to aerosols (and land use when available, referred to as OA forcing hereafter) as the OHC response to HIST forcing minus NAT forcing minus GHG forcing.
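Written out, the two inferred responses take the following form. The symbol ΔOHC with a forcing subscript is notation introduced here for illustration only; it denotes the OHC response in the corresponding set of simulations.

```latex
\begin{align*}
\Delta\mathrm{OHC}_{\mathrm{ANT}} &= \Delta\mathrm{OHC}_{\mathrm{HIST}} - \Delta\mathrm{OHC}_{\mathrm{NAT}},\\
\Delta\mathrm{OHC}_{\mathrm{OA}}  &= \Delta\mathrm{OHC}_{\mathrm{HIST}} - \Delta\mathrm{OHC}_{\mathrm{NAT}} - \Delta\mathrm{OHC}_{\mathrm{GHG}}
                                   = \Delta\mathrm{OHC}_{\mathrm{ANT}} - \Delta\mathrm{OHC}_{\mathrm{GHG}}.
\end{align*}
```

Both identities hold only to the extent that the additivity assumption is valid.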

The selection of CMIP5 models was based on the availability of temperature and salinity fields during the 1971–2005 period, resulting in a total of 39 models for the HIST experiment and 18 models for both NAT and GHG experiments (see Table 1). The HIST, NAT, and GHG columns of Table 1 detail the number of realizations used for each model.

### c. Calculation of the ocean heat content trends

To compute OHC anomalies, monthly 3D temperature and salinity fields from 0- to 700-m depth are averaged annually and masked over marginal seas and lakes (such as the Mediterranean Sea, Red Sea, Black Sea, Caspian Sea, Baltic Sea, Persian Gulf, Hudson Bay, and Great Lakes) for each dataset. OHC anomalies are computed from the annual mean temperature fields and a climatology of the salinity fields, following the 1980 UNESCO International Equation of State (IES80). The OHC is calculated on each layer and each cell of the reconstruction/model grid and vertically integrated. Anomalies are obtained by subtracting the 1971–2000 climatology for each grid cell. For observations, temperature and salinity fields were first converted into potential temperature fields [using the methodology of Bryden (1973)] before the computation of OHC anomalies.
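One common way to write the vertical integration is given below. It is an assumed illustrative form, not necessarily the exact expression used in the study: ρ and c_p stand for the seawater density and specific heat capacity evaluated from the equation of state, and θ for the temperature field.

```latex
\mathrm{OHC}(x, y, t) = \int_{-700\,\mathrm{m}}^{0} \rho\, c_p\, \theta(x, y, z, t)\, \mathrm{d}z ,
\qquad
\mathrm{OHC}'(x, y, t) = \mathrm{OHC}(x, y, t) - \overline{\mathrm{OHC}}^{\,1971\text{--}2000}(x, y)
```

The first expression has units of J m^{−2}; the second is the anomaly relative to the 1971–2000 climatology at each grid cell.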

In a second step, we compute the global (and regional, when necessary) subsampled averages and trends. In this case, a common observational mask, varying in time and space, is applied to all observational datasets and simulations. As a consequence, the spatial coverage might be different from one year to another. The spatial average is computed from subsampled OHC anomalies over the globe (and over different ocean basins when necessary). Finally, linear trends are computed over the period of interest: 1971–2005.
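A minimal sketch of this second step is given below. It assumes an area-weighted mean over the observed cells, which is a plausible reading of the text but not confirmed by it, and the function names and flat list layout are hypothetical.

```python
def masked_mean(anomalies, areas, mask):
    """Area-weighted mean of OHC anomalies over observed cells only."""
    num = sum(a * w for a, w, m in zip(anomalies, areas, mask) if m)
    den = sum(w for w, m in zip(areas, mask) if m)
    return num / den if den > 0 else float("nan")


def linear_trend(years, values):
    """Least-squares slope of `values` against `years`."""
    n = len(years)
    y_mean = sum(years) / n
    v_mean = sum(values) / n
    num = sum((y - y_mean) * (v - v_mean) for y, v in zip(years, values))
    den = sum((y - y_mean) ** 2 for y in years)
    return num / den


def subsampled_trend(years, anomalies_by_year, areas, mask_by_year):
    """Trend of the yearly subsampled means; the mask may change every year."""
    series = [masked_mean(a, areas, m)
              for a, m in zip(anomalies_by_year, mask_by_year)]
    return linear_trend(years, series)
```

Because the same `mask_by_year` is passed for observations and for every simulation, all trends are computed over identical space–time coverage.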

The D&A method is applied to the subsampled OHC anomaly trends. As a consequence, the observational constraint obtained applies to the subsampled OHC trend, which is different from the global OHC trend. This result can be difficult to interpret and to exploit. Therefore, we translate the results obtained on subsampled OHC (sub-OHC) into a result applicable to global OHC (tot-OHC) using a linear regression model between sub-OHC and tot-OHC. For each type of forcing, a linear regression model *y* = *ax* + *b* is fitted to an ensemble of sub-OHC (*x*) and tot-OHC (*y*) values from the available model simulations. An observational constraint on tot-OHC trends, as well as the associated uncertainty, is then derived from this linear regression (see Fig. 2). Uncertainties on the regression line are included in the tot-OHC uncertainties, using a Monte Carlo technique to sample uncertainty on *a* and *b*. The regression models assume that, for a given type of experiment, changes over a spatiotemporal subset (as defined by the observational mask) are linearly related to the global changes, and that this ratio is correctly simulated by GCMs. Figure 2 shows the global OHC against the subsampled OHC for the period 1971–2005 for all GCM simulations and for observations [we use here the infilled estimate from Ishii et al. (2017) and Levitus et al. (2012) as a global OHC estimate]. This figure suggests that, for the HIST, NAT, and GHG simulations, the relationship between tot-OHC and sub-OHC is close to linear. The root-mean-square error (RMSE), estimated as the root-mean-square (RMS) of the residual, is below 16% of the RMS of the total signal for the NAT simulations, below 9% for the HIST simulations, and below 7% for the GHG simulations. The relationship between tot-OHC and sub-OHC is also primarily linear for OA and ANT simulations, as expected, since these simulations are linear combinations of the HIST, NAT, and GHG simulations. For all simulations that include the GHG forcing (i.e., HIST, GHG, and ANT), the estimated regression coefficients *a* are similar because the OHC signal is dominated by the response to GHGs (e.g., Gleckler et al. 2012). For the NAT simulations, the internal variability dominates and generates a different ratio between sub-OHC and tot-OHC. For the OA simulations, the ratio between sub-OHC and tot-OHC is significantly different from that found in GHG experiments, because the aerosol forcing is highly variable across space (unlike the GHG forcing). Note that observations lie below the regression line of the HIST simulations. This is because the global observed OHC is biased low by the infilling technique in Ishii et al. (2017) and Levitus et al. (2012), which tends to bias the temperature anomalies toward 0°C in data-sparse regions such as the Southern Ocean (Durack et al. 2014).
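The translation from a sub-OHC constraint to a tot-OHC constraint can be sketched as follows. The text does not specify how the uncertainty on *a* and *b* is sampled, so this sketch swaps in a bootstrap over models as a stand-in for the Monte Carlo technique; the function names and the Gaussian form of the sub-OHC constraint are also assumptions.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least-squares fit of y = a * x + b; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def translate_constraint(sub, tot, sub_mean, sub_sd, n_draws=5000, seed=0):
    """Propagate a Gaussian sub-OHC constraint to tot-OHC.

    sub, tot: per-model sub-OHC and tot-OHC trends for one type of forcing.
    Each draw resamples the models (bootstrap, an assumed stand-in for the
    paper's Monte Carlo step), refits (a, b), and maps one sampled sub-OHC
    value through the refitted line.
    """
    rng = random.Random(seed)
    n = len(sub)
    draws = []
    while len(draws) < n_draws:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [sub[i] for i in idx]
        if max(xs) == min(xs):      # degenerate resample, slope undefined
            continue
        a, b = fit_line(xs, [tot[i] for i in idx])
        draws.append(a * rng.gauss(sub_mean, sub_sd) + b)
    return statistics.fmean(draws), statistics.stdev(draws)
```

The returned standard deviation combines the uncertainty of the sub-OHC constraint with that of the regression line, which is the behavior described in the text.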

Fig. 2. Linear relation between the subsampled ocean heat content 35-yr trends and the global ocean heat content 35-yr trends for different climate model simulations and for observations. For observations, the 35-yr trend is computed over the period 1971–2005 and indicated with a black dot. HIST refers to historical simulations that include all forcing. NAT refers to simulations with only the natural forcing. ANT refers to simulations with only anthropogenic forcing (including GHG emissions, aerosol forcing, and land use change when available). It is calculated by taking the difference between HIST and NAT simulations. GHG refers to simulations with only the forcing from anthropogenic GHG emissions. OA refers to simulations with only the forcing from anthropogenic activity other than GHG emissions (it includes the forcing from aerosol emissions and land use change when available). It is calculated as HIST simulations minus NAT simulations minus GHG simulations.

Citation: Journal of Climate 33, 24; 10.1175/JCLI-D-19-0091.1

### d. Detection and attribution method

Most D&A methods rely on linear regression models where the observations *Y* are regressed onto the expected response to forcings *i*, as simulated by climate models *X*_{i}. These methods assume that the spatiotemporal pattern of the response to each forcing given by climate models is correct but that its amplitude is wrong (Hegerl and Zwiers 2011; Bindoff et al. 2013). Their objective is then to find the right amplitude by comparison with observations. The assumption underlying these methods presents two important limitations. First, models do provide information on the amplitude of the response to each forcing. This information is useful and should not be discarded a priori because it can help to constrain the estimate of the real response to each forcing. Second, the spatiotemporal distribution of the response to each external forcing simulated by climate models is not perfectly correct. A careful analysis suggests that there are significant differences among climate models in the spatial response patterns, as shown by the large spread among model patterns (see, e.g., Slangen et al. 2014). So, there are probably some errors in the models’ patterns. This information should be considered through a modeling of the uncertainty in the spatial response pattern.

Recently Ribes et al. (2017) developed a new D&A method to deal with these two limitations. We use this new D&A method, which provides a framework to treat the amplitude and the spatiotemporal pattern of the response to different forcing consistently, without any a priori assumptions on their reliability.

The Ribes et al. (2017) statistical method is based on the additivity assumption: the true and unknown response of the climate system (in terms of OHC here) to all external forcing **Y*** is assumed to be equal to the sum of the true and unknown responses to each external forcing *i* taken separately,

$$\mathbf{Y}^{*} = \sum_{i=1}^{n_f} \mathbf{X}_i^{*}, \tag{1}$$

where *n*_{f} is the total number of external forcing considered. It is also assumed that the observations of the climate response **Y** and the multimodel mean response to each forcing *i*, **X**_{i} (i.e., the best available estimate of the expected response), are related to the exact responses **Y*** and **X**_{i}* as follows:

$$\mathbf{Y} = \mathbf{Y}^{*} + \boldsymbol{\varepsilon}_{Y}, \qquad \mathbf{X}_i = \mathbf{X}_i^{*} + \boldsymbol{\varepsilon}_{X_i}, \tag{2}$$

where *ε*_{Y} represents the noise in observations **Y** (including internal variability and observation errors) and *ε*_{X_{i}} represents the noise in the simulated response **X**_{i} (including internal variability and climate modeling errors). The noise terms *ε*_{Y} and *ε*_{X_{i}} are assumed to be Gaussian with zero mean and covariance matrices **Σ**_{Y} and **Σ**_{X_{i}}, respectively.

In (1) and (2), **Y**, **X**_{i}, and **ε** are all vectors made of *n* variables. The way these vectors are constructed deserves some discussion. Usually in D&A methods, these vectors are spatiotemporal vectors, in which each coordinate corresponds to the mean over a given region and a given period. In this study, we limit ourselves to the univariate (i.e., with *n* = 1, using only global averages) or the bivariate (i.e., with two regions and *n* = 2) cases. We can hardly go further because the number of independent climate model simulations available in CMIP5 is too small to estimate properly each covariance matrix **Σ**_{X_{i}} for *n* > 2 (see below for more details). In the univariate case, we only consider the global 1971–2005 OHC trend. In the bivariate case, an extra variable (e.g., the OHC trend over another region and/or period) is considered in addition to the global OHC 1971–2005 trend variable. The idea behind using two variables at the same time is to account for some spatial information in the D&A. With this approach we expect to improve the attribution of the forced signal in the OHC response and to separate the GHG response from the aerosol response. We run a set of 83 bivariate D&A analyses with a range of 83 different extra variables added to the global OHC 1971–2005 trend. The objective is to test which variable leads to the best separation of the GHG response from the aerosol response.

The main parameters of interest in (1) and (2) are the true responses **X**_{i}* and **Y***. For Gaussian noise terms with covariance matrices **Σ**_{Y} and **Σ**_{X_{i}}, the maximum likelihood estimators of these parameters are (Ribes et al. 2017)

$$\hat{\mathbf{X}}_i^{*} = \mathbf{X}_i + \frac{\boldsymbol{\Sigma}_{X_i}}{\boldsymbol{\Sigma}_{Y} + \boldsymbol{\Sigma}_{X}}\,(\mathbf{Y} - \mathbf{X}), \qquad \hat{\mathbf{Y}}^{*} = \mathbf{Y} + \frac{\boldsymbol{\Sigma}_{Y}}{\boldsymbol{\Sigma}_{Y} + \boldsymbol{\Sigma}_{X}}\,(\mathbf{X} - \mathbf{Y}),$$

where *A*/*B* denotes the product **A** ⋅**B**^{−1}, $\mathbf{X} = \sum_{i=1}^{n_f} \mathbf{X}_i$, and $\boldsymbol{\Sigma}_{X} = \sum_{i=1}^{n_f} \boldsymbol{\Sigma}_{X_i}$.

To estimate **Σ**_{Y}, we consider both the uncertainty due to internal variability, **Σ**_{υ}, and the measurement uncertainty. **Σ**_{υ} is estimated from unforced PIC experiments. OHC trends are calculated for 35-yr-long segments, following the same processing as for observations. Segments are taken at 20-yr intervals along the PIC runs (segments can overlap and are thus not totally independent) and **Σ**_{υ} is computed as the sample estimate over this ensemble of segments. It leads to an estimate of the uncertainty induced by the internal variability of ±0.48 × 10^{7} J m^{−2} yr^{−1} (see Fig. 3a). The measurement uncertainty is dominated by instrument bias corrections and by the definition of a baseline climatology to estimate the OHC anomalies (see, e.g., Boyer et al. 2016). Boyer et al. (2016) estimated that the uncertainty due to instrument bias corrections and the baseline climatology at global scale for the period 1970–2008 ranges between ±0.10 × 10^{7} and ±0.15 × 10^{7} J m^{−2} yr^{−1} (depending on the spatial interpolation scheme used to calculate the global OHC estimate; see their Table 4). Here we consider the period 1971–2005, which is close to the reference period 1970–2008 of Boyer et al. (2016), and we do not interpolate the data to get a global OHC estimate. Because we do not interpolate the data, we do not introduce any uncertainty due to the interpolation scheme, so the measurement uncertainty in our case is smaller than the estimate from Boyer et al. (2016). In the worst case, Boyer et al. (2016) estimate that the uncertainty due to instrument bias corrections and the baseline climatology is ±0.15 × 10^{7} J m^{−2} yr^{−1}. This is about one-third of the uncertainty due to the internal variability estimated above. For this reason, we neglect here the uncertainty in measurements and we approximate the uncertainty in the observations by the uncertainty due to the internal variability only (**Σ**_{υ}).
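For the univariate case (*n* = 1), the maximum likelihood estimates implied by the Gaussian model of (1) and (2) reduce to scalar formulas, sketched below. The function name `univariate_da` is hypothetical, and the closed-form expressions in the comments are derived here from the Gaussian assumptions rather than copied from the study. The numerical inputs in the usage part are the global values reported in section 3 (observed trend 1.38, internal variability ±0.58, HIST response 1.43 ± 0.72, all in 10^{7} J m^{−2} yr^{−1} with 90% half-widths).

```python
import math


def univariate_da(y, var_y, x, var_x):
    """Scalar maximum-likelihood estimates under Y = sum(X_i*) + noise.

    y, var_y: observed trend and its variance.
    x, var_x: multimodel-mean responses and their variances, one per forcing.
    Returns (estimates, variances, y_hat, var_y_hat).
    """
    total = var_y + sum(var_x)          # Sigma_Y + Sigma_X
    innovation = y - sum(x)             # Y - X
    # X_i_hat = X_i + Sigma_Xi / (Sigma_Y + Sigma_X) * (Y - X)
    est = [xi + vi / total * innovation for xi, vi in zip(x, var_x)]
    # Var(X_i_hat) = Sigma_Xi - Sigma_Xi^2 / (Sigma_Y + Sigma_X)
    var_est = [vi - vi * vi / total for vi in var_x]
    # Y_hat = Y + Sigma_Y / (Sigma_Y + Sigma_X) * (X - Y)
    y_hat = y - var_y / total * innovation
    var_y_hat = var_y * sum(var_x) / total
    return est, var_est, y_hat, var_y_hat


# Usage with a single forcing (HIST); 1.65 converts 90% half-widths to sigma.
var_obs = (0.58 / 1.65) ** 2
var_hist = (0.72 / 1.65) ** 2
est, var_est, y_hat, var_y_hat = univariate_da(1.38, var_obs, [1.43], [var_hist])
reduction = 1 - math.sqrt(var_est[0] / var_hist)
```

With these inputs the constrained estimate stays close to the multimodel mean while its standard deviation shrinks by roughly 37%. In the bivariate case the same expressions apply with vectors and covariance matrices, the division being replaced by a matrix inverse.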

Fig. 3. Global 1971–2005 sub-OHC trend values and associated probability distribution functions for (a) PIC simulations (green curve), (b) HIST simulations (purple curve), (c) the combination of NAT (blue curve) and ANT (orange curve) simulations, and (d) the combination of NAT (blue curve), GHG (red curve), and OA (yellow curve) simulations. The trend value obtained by one GCM (averaged over available realizations) is represented by a vertical bar. The observed sub-OHC trend is represented by a black arrow and the sum of the combined forcings by a pink arrow. The gray error bars and their center represent the 90% confidence interval and the multimodel mean computed from the ensemble of simulations. The colored error bars represent the reduced 90% confidence interval and the new estimate computed from the Ribes et al. (2017) univariate detection and attribution model.

The uncertainties in the simulations of the forced response, *X*_{i}, are related to both the uncertainty due to internal variability, **Σ**_{υ}, and the uncertainty due to climate modeling errors, **Σ**_{m}. The procedure used to estimate each **Σ**_{m} relies on the paradigm that *models are statistically indistinguishable from the truth* (Annan and Hargreaves 2010). The resulting **Σ**_{m} estimate depends on the spread of responses among models, within the HIST, NAT, and GHG multimodel ensembles.

## 3. Detection and attribution of the 1971–2005 global ocean heat content increase

This section describes the D&A analysis results for the trends in sub-OHC and the extension of those results to tot-OHC. The contributions, calculated from the multimodel ensemble and from the D&A model, are given with their 90% confidence interval (i.e., 1.65*σ* assuming a Gaussian distribution).

### a. Detection of a signal inconsistent with internal variability alone

The internal variability in OHC is estimated from more than 20 000 years of unforced detrended PIC runs. Global sub-OHC anomaly trends, calculated on 35-yr segments, are shown as vertical bars in Fig. 3a. The data distribution is very similar to a Gaussian distribution of 0 ± 0.58 × 10^{7} J m^{−2} yr^{−1} (90% confidence interval), with the model unforced responses ranging from −1.04 to 1.26 × 10^{7} J m^{−2} yr^{−1}. The consistency of the observed trend (black arrows in Fig. 3) with internal variability [see Eq. (18) in Ribes et al. 2017] can be tested statistically, and is rejected with a *p* value smaller than 10^{−4}.

As expected, the observed global trend (1.38 × 10^{7} J m^{−2} yr^{−1}) cannot be explained by the internal variability alone. This finding is consistent with previous studies (Barnett et al. 2001; Pierce et al. 2006; Gleckler et al. 2012, 2016).

### b. Role of external forcings

We now compare the observations with historical simulations including all forcings and historical simulations including only a subset of forcings.

First, observations are compared with HIST simulations (see Fig. 3b). The Gaussian distribution describing uncertainty in observations is deduced from the internal variability and centered on the observed global OHC trend. Ensemble means obtained from HIST runs for each model are represented by vertical bars. They range from 0.16 to 2.21 × 10^{7} J m^{−2} yr^{−1}, corresponding to *X*_{HIST} = 1.43 ± 0.72 × 10^{7} J m^{−2} yr^{−1} assuming a Gaussian distribution. The observed trend is consistent with the simulated response to all forcing, with a *p* value of 0.94. This shows that models perform well in reproducing the observed trend. By combining the simulated and observed trends, a new estimate of the OHC response to all forcings is computed, using a univariate D&A model (Ribes et al. 2017). This new estimate of the real response (purple error bar) is similar to the multimodel mean (the gray and the purple error bars are both centered on the same value), but with a substantial decrease of the corresponding uncertainty (i.e., the size of the error bars) compared with the multimodel estimate: the error bars are reduced by 37%. The new estimate of the OHC changes in response to all forcings over 1971–2005 is

We now compare the observed trend with the response to different combinations of forcings. Figure 3c shows the response of sub-OHC trends to NAT and ANT forcing. The uncertainty in simulated NAT trends is limited, with a model estimate of *X*_{NAT} = −0.18 ± 0.36 × 10^{7} J m^{−2} yr^{−1}. The outlier at −1.22 × 10^{7} J m^{−2} yr^{−1} comes from the GFDL-ESM2M simulation, which has only one realization. Some individual realizations from other models show similar values, but they do not appear on this graph because they are part of a multimember ensemble and are thus smoothed in the ensemble mean (Fig. 3 only shows, for each model, the ensemble mean across all realizations available). The response to ANT forcing exhibits a larger variance among models (individual model trends ranging from 0.60 to 2.65 × 10^{7} J m^{−2} yr^{−1}) than the response to NAT forcing. It shows a large multimodel mean of *X*_{ANT} = 1.57 ± 0.94 × 10^{7} J m^{−2} yr^{−1}, which dominates the response to all forcings. The observed trend (indicated by a black arrow that is masked by the pink arrow in Fig. 3c) cannot be explained by NAT forcing only. However, it is consistent with the response to ANT forcing only, with a *p* value of 0.78. It is also consistent with the sum of the NAT and the ANT responses. Indeed, the simulated OHC response to NAT + ANT forcing is almost equal to the observed OHC response (the pink arrow points almost to the same value as the black arrow in Fig. 3c; for this reason the two arrows are separated vertically). We now apply the univariate D&A model to reduce uncertainties and compute a new estimate of the OHC response to each forcing. The D&A analysis yields estimates similar to the multimodel mean in terms of mean OHC response to the NAT forcing and to the ANT forcing. However, it reduces the uncertainty around these mean responses. 
The standard deviation of the uncertainty in the OHC response to the NAT forcing is only improved by 5% (the new estimate is

We now consider the three-forcing decomposition of the response (NAT + GHG + OA), shown in Fig. 3d. In this analysis, the ANT forcing is decomposed into two distinct forcings: the GHG forcing and the OA forcing (which is largely dominated by the aerosol forcing). As expected, the results for the NAT forcing in this analysis are unchanged compared to the results obtained before. This is because splitting ANT into GHG + OA has no impact on the NAT calculations. Under GHG forcing only, the sub-OHC anomaly trends estimated from the multimodel mean reach the value of *X*_{GHG} = 2.41 ± 0.64 × 10^{7} J m^{−2} yr^{−1} (with extreme values at 1.83 and 3.16 × 10^{7} J m^{−2} yr^{−1}). Under the OA forcing, the sub-OHC anomaly trends are mainly negative (values ranging from −2.11 to +0.66 × 10^{7} J m^{−2} yr^{−1}) with a multimodel mean value of *X*_{OA} = −0.84 ± 0.90 × 10^{7} J m^{−2} yr^{−1}. This negative response of OHC is due to the cooling effect of aerosols on the ocean (Bindoff et al. 2013). It partly compensates for the large positive trend induced by the GHG forcing. On an individual basis, none of the various forcings generates a response in OHC that is consistent with observations. But the sum of the responses to all forcings is nearly equal to the observed 1971–2005 trend in OHC (in Fig. 3d the black arrow representing the observed OHC trend is masked below the pink arrow, which represents the combination of the response to all forcings). Among all individual OHC responses, the OHC response to the OA forcing only is the most uncertain. This is because the OA forcing is very different from one model to another (e.g., there is a large spread in the aerosol–cloud interactions across climate models; Boucher et al. 2013). The response of the climate system to the OA forcing is also very different from one model to another. 
When we apply the D&A procedure, we find no significant shift in the estimates of the OHC response to each forcing compared to the multimodel mean estimate. This means that the multimodel mean estimate agrees with observations within error bars. However, the D&A yields significant improvements on the estimate of the uncertainties. The standard deviations of the uncertainty in OHC response to the NAT, GHG, and OA forcings are reduced respectively by 4%, 13%, and 28%, with new estimates of

Figure 4a shows the contributions of each individual forcing when considering the model ensemble (gray error bars) and after the application of the univariate D&A model (black continuous error bars). The use of the Ribes et al. (2017) D&A model on subsampled OHC trends results in a better determination (i.e., lower uncertainty) of the contributions of individual forcings, especially the most uncertain ones, such as HIST, ANT, and OA.
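To illustrate why combining models and observations narrows the error bars, the following sketch conditions Gaussian prior estimates of each forcing response on a noisy observation of their sum. This is a simplified scalar model and not the Ribes et al. (2017) implementation: it assumes independent Gaussian model errors, uses the 90% half-widths quoted in the text as inputs, and the function names are ours.

```python
import math

def posterior_half_widths(prior_half_widths, obs_half_width):
    """Condition independent Gaussian priors on a noisy observation of their sum.

    All half-widths must share the same confidence level (here 90%);
    the common scale factor cancels in the variance algebra.
    """
    total_var = sum(h ** 2 for h in prior_half_widths.values()) + obs_half_width ** 2
    return {
        name: math.sqrt(h ** 2 - h ** 4 / total_var)
        for name, h in prior_half_widths.items()
    }

def reductions(prior_half_widths, obs_half_width):
    """Fractional reduction of each half-width after conditioning."""
    post = posterior_half_widths(prior_half_widths, obs_half_width)
    return {name: 1.0 - post[name] / h for name, h in prior_half_widths.items()}

OBS = 0.58  # internal-variability half-width from the text (1e7 J m^-2 yr^-1)

one_forcing = reductions({"HIST": 0.72}, OBS)
two_forcings = reductions({"NAT": 0.36, "ANT": 0.94}, OBS)
three_forcings = reductions({"NAT": 0.36, "GHG": 0.64, "OA": 0.90}, OBS)

for label, result in [("HIST", one_forcing),
                      ("NAT+ANT", two_forcings),
                      ("NAT+GHG+OA", three_forcings)]:
    print(label, {name: f"{100 * value:.0f}%" for name, value in result.items()})
```

Under these assumptions the sketch gives reductions of about 37% for HIST, 5% for NAT in the two-forcing case, and 4%, 13%, and 28% for NAT, GHG, and OA in the three-forcing case, close to the values quoted in the text. It also shows why the most uncertain responses benefit most: the observation constrains only the sum, so the correction is shared in proportion to each prior variance.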

Fig. 4. Contribution of each forcing to the global 1971–2005 OHC anomaly trends. (a) For the subsampled OHC, uncertainties are computed from the model ensemble (gray error bars) and from the univariate (black continuous error bars) and the bivariate (black dotted error bars) detection and attribution models. (b) The total OHC is inferred from the subsampled bivariate results, using a linear relationship between simulated subsampled and total anomaly trends. The uncertainty is inferred from the subsampled bivariate uncertainty and from the subsampled-to-total transformation uncertainty (black dotted error bars).

Citation: Journal of Climate 33, 24; 10.1175/JCLI-D-19-0091.1


### c. Exploration of the bivariate case and potential to further reduce uncertainties

In this section, we include an extra variable in the D&A analysis (in addition to the 1971–2005 global mean OHC trend). The objective is to consider more information on the OHC variability to further reduce the uncertainties in the OHC response to the GHG forcing and the OA forcing.

The effects of the GHG and OA forcings on OHC show distinct spatiotemporal patterns (Slangen et al. 2015). The OA effect is local and its amplitude decreases in the late twentieth century, while the GHG forcing effect is globally quasi-uniform and increases until present (Slangen et al. 2015). Despite these differences, the effects of the GHG and OA forcings on the OHC are similar (with opposite sign) when they are averaged over the ocean and over the period 1971–2005 (Slangen et al. 2014). These similar effects make it difficult to unravel the role of each forcing on OHC when looking only at global OHC. To remove this degeneracy, we use a second variable in the D&A analysis. As the second variable, we take a regional average of the OHC trend. We test different regions for the averaging of the OHC trend and also different periods over which the OHC trend is computed. In total, 14 different regions (ranging from global to oceanic basins; see Table S1 in the online supplemental material) and 6 different periods (the periods start after 1957 and end before 2005; see Table S1) were tested, leading to a total of 83 different second variables. (That is, 14 regions by 6 periods lead to 14 × 6 = 84 different combinations for the second variable. Among these 84 candidates, we do not consider the combination that corresponds to global OHC over the period 1971–2005 because this variable is already used as the primary variable, which leaves 83 different second variables.) Note that when global OHC over a given period is used as the primary variable and regional OHC over the same period is used as the secondary variable, the information on the regional OHC is used twice: once in the primary variable, as part of the total information on global OHC, and once in the secondary variable.
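The count of candidate second variables can be reproduced by enumerating all region and period pairs and dropping the primary variable. In this sketch the region and period labels are placeholders, because the actual entries are listed in Table S1 and not in the text; only "global" and "1971-2005" are taken from the description above.

```python
from itertools import product

# Placeholder labels: the actual regions and periods are listed in Table S1.
regions = ["global"] + [f"region_{i:02d}" for i in range(1, 14)]  # 14 regions
periods = ["1971-2005"] + [f"period_{i}" for i in range(1, 6)]    # 6 periods

# The primary variable is excluded from the list of second variables.
primary = ("global", "1971-2005")
second_variables = [pair for pair in product(regions, periods) if pair != primary]

print(len(second_variables))  # 83
```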

Figure 5 summarizes the 83 estimates of the global 1971–2005 sub-OHC trends in response to each forcing and their associated uncertainty. In all tests, the inclusion of the second variable in the D&A analysis does not substantially change the best estimate of the global sub-OHC trend over 1971–2005 (see the *x* axis in Fig. 5). The best estimates of the global sub-OHC trend over 1971–2005 range between 1.2 and 1.6 × 10^{7} J m^{−2} yr^{−1} for the HIST simulations, between −0.23 and −0.14 × 10^{7} J m^{−2} yr^{−1} for the NAT simulations (excluding the outlier at −0.27 × 10^{7} J m^{−2} yr^{−1}), between 2.28 and 2.55 × 10^{7} J m^{−2} yr^{−1} for the GHG simulations, and between −1.05 and −0.9 × 10^{7} J m^{−2} yr^{−1} for the OA simulations. This is consistent with the results of the univariate case. The inclusion of the second variable does make a difference to the uncertainty associated with the best estimate of the global sub-OHC trend over 1971–2005 (see the *y* axis in Fig. 5). For all tests, the inclusion of the second variable reduces the uncertainty of the sub-OHC response to all forcings compared to the univariate case (see Fig. 5). However, the reduction is not the same for each forcing and for each second variable that is considered. The variables that yield the smallest uncertainty in the sub-OHC response to OA are not the same as the ones that yield the smallest uncertainty in the sub-OHC response to GHG. This suggests that each variable has its own sensitivity to the different forcings.

Fig. 5. The *y* axis shows the uncertainties of the subsampled global 1971–2005 OHC trends using the bivariate detection analysis (estimated with the standard deviation of the subsampled global 1971–2005 OHC anomaly trends) for (a) HIST, (b) NAT, (c) GHG, and (d) OA. Different regions and periods of trends were used in the second variable of the bivariate case, resulting in different estimators and associated uncertainties in the subsampled global 1971–2005 OHC trends (black dots). Colored symbols indicate particularly interesting cases discussed in the main text: blue, 1957–2005; red, 1971–2005; green, 1957–80.


We plot in Fig. 6 the detailed results for the cases that lead to the smallest uncertainty in sub-OHC response to GHG (blue plus sign in Fig. 5c, corresponding to the case with the South Pacific sub-OHC trend over 1957–2005 as second variable) and to OA (green square in Fig. 5d, corresponding to the case with the high-latitude sub-OHC trend over 1957–80 as second variable). The pair of variables that leads to the smallest uncertainty in the sub-OHC response to OA (green square in Fig. 5) does not lead to the smallest uncertainty in the sub-OHC response to GHG. Likewise, the pair of variables that leads to the smallest uncertainty in the sub-OHC response to GHG (blue plus sign in Fig. 5) does not lead to the smallest uncertainty in the sub-OHC response to OA. The reason is that for these pairs of variables the regional responses of the sub-OHC to GHG and OA are still largely anticorrelated, such that we are still close to degeneracy (even with two variables).

Indeed, in Figs. 6b and 6d the semi-major axis of the GHG ellipse is aligned with the semi-major axis of the OA ellipse, which means that the pair of variables have the same response under OA or GHG forcing and thus they do not help in removing the degeneracy in this case.

Fig. 6. Global 1971–2005 sub-OHC trend values vs (a) South Pacific 1971–2005, (b) South Pacific 1957–2005, (c) global 1957–80, and (d) high-latitude 1957–80 trend values for the combination of NAT, GHG, and OA simulations. Trend values obtained by independent GCMs (averaged over available realizations) are represented by a colored cross, the observed value by a black cross, and the sum of the combined forcings by a pink star. The gray error bars and their center represent the 90% confidence interval and the multimodel mean computed from the ensemble of simulations, respectively. The colored error bars represent the reduced 90% confidence interval and the new estimation computed from the Ribes et al. (2017) univariate detection and attribution model.


To cope with this issue, and to find cases where the degeneracy is removed, we considered the three cases that lead to the largest uncertainty reduction in sub-OHC responses to GHG and OA simultaneously (the uncertainty of the sub-OHC responses to both GHG and OA is computed as the quadratic sum, i.e., the square root of the sum of squares, of the uncertainty in the sub-OHC response to GHG and the uncertainty in the sub-OHC response to OA). These cases are indicated by a red cross (case with the South Pacific sub-OHC trend over 1971–2005 as second variable), a blue cross (case with the South Pacific sub-OHC trend over 1957–2005 as second variable), and a green circle (case with the global sub-OHC trend over 1957–80 as second variable) in Fig. 5.
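The selection criterion can be written in a few lines. This sketch assumes that "quadratic sum" means the root sum of squares; the function name `combined_uncertainty` is ours, and the two input values are the GHG and OA uncertainties reported later in this section for the case with the global 1957–80 trend as second variable.

```python
import math

def combined_uncertainty(sigma_ghg, sigma_oa):
    """Quadratic sum (root sum of squares) of the GHG and OA uncertainties."""
    return math.hypot(sigma_ghg, sigma_oa)

# Uncertainties quoted in the text for the global 1957-80 case,
# in units of 1e7 J m^-2 yr^-1.
score = combined_uncertainty(0.34, 0.37)
print(f"{score:.2f}")  # 0.50
```

Ranking the 83 candidates by this score favors second variables that reduce both uncertainties together, whereas a variable that helps only one forcing keeps a large score.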

Among these three cases, Fig. 6 indicates that two cases (cases with red and blue crosses in Fig. 5) are still very close to degeneracy (see Figs. 6a,b). Indeed, in Figs. 6a and 6b, the semi-major axis of the GHG ellipse is almost aligned with the semi-major axis of the OA ellipse. There is only one case where the semi-major axis of the GHG ellipse is less aligned with the semi-major axis of the OA ellipse. It is the case with the global sub-OHC trend over 1957–80 as second variable (case with the green circle on Fig. 5). For this single case the degeneracy is reduced (see Fig. 6c). The semi-major axes of the GHG and OA ellipses are no longer aligned (although they are still close). This indicates that the ratio of the global sub-OHC 1957–80 trend over the global sub-OHC 1971–2005 trend does not show the same response to GHG forcing and to OA forcing. Thus, the OHC response to GHG and OA cannot compensate anymore with this pair of variables. It means that the combined analysis of the global sub-OHC over the two periods 1957–80 and 1971–2005 enables us to decorrelate (partly) the response to GHG from the response to OA. With the decorrelation, the responses of the sub-OHC to both GHG and OA are independently observable at the same time. It should improve the associated estimate of the global sub-OHC trend response to OA and GHG over 1957–80 and 1971–2005. It should also reduce significantly the associated uncertainty. The resulting estimate of the global sub-OHC trend over 1971–2005 is 1.42 ± 0.18 × 10^{7} J m^{−2} yr^{−1}. This trend breaks down into −0.17 ± 0.26 × 10^{7} J m^{−2} yr^{−1} in response to the natural forcing, 2.37 ± 0.34 × 10^{7} J m^{−2} yr^{−1} in response to the GHG forcing, and −0.78 ± 0.37 × 10^{7} J m^{−2} yr^{−1} in response to the OA forcing (Fig. 4). These numbers show that the use of the bivariate case has not changed the best estimate of the OHC response to the different forcing (see Fig. 4). 
It has reduced the uncertainties (by a few percent), but the improvement is marginal compared to the univariate case (see Fig. 4). This is because the decorrelation of the OHC response to OA and GHG is only partial, even with our best pair of variables. Of all the cases analyzed in this study, this bivariate case provides our best estimate of the OHC response to OA and GHG. When we translate this into a tot-OHC trend, we find that the global OHC trend over 1971–2005 is 1.22 ± 0.14 × 10^{7} J m^{−2} yr^{−1}. This trend breaks down into −0.13 ± 0.09 × 10^{7} J m^{−2} yr^{−1} in response to the natural forcing, 2.12 ± 0.21 × 10^{7} J m^{−2} yr^{−1} in response to the GHG forcing, and −0.84 ± 0.18 × 10^{7} J m^{−2} yr^{−1} in response to the OA forcing (Fig. 4).
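The notion of degeneracy used above can be illustrated geometrically. In the plane formed by the two variables, each forcing response defines a direction; if the GHG and OA directions coincide, the two responses can compensate each other along a single axis and cannot be separated. In the sketch below the global trends are the multimodel means quoted in the text, while the ratios between the second variable and the global trend are hypothetical values chosen only to show the geometry, and the function names are ours.

```python
import math

def direction_angle_deg(u, v):
    """Angle between two axes in the (variable 1, variable 2) plane, in [0, 90] degrees."""
    dot = u[0] * v[0] + u[1] * v[1]
    norm = math.hypot(u[0], u[1]) * math.hypot(v[0], v[1])
    # abs() folds opposite directions onto the same axis; min() guards rounding.
    return math.degrees(math.acos(min(1.0, abs(dot) / norm)))

def response(global_trend, ratio):
    """(global 1971-2005 trend, second-variable trend) for a given ratio."""
    return (global_trend, ratio * global_trend)

# Multimodel mean global trends from the text (1e7 J m^-2 yr^-1).
ghg, oa = 2.41, -0.84

# Hypothetical ratios: identical ratios give a degenerate (aligned) case.
aligned = direction_angle_deg(response(ghg, 0.8), response(oa, 0.8))
separated = direction_angle_deg(response(ghg, 0.8), response(oa, 0.5))
print(f"aligned: {aligned:.2f} deg, separated: {separated:.2f} deg")
```

A small angle, as in Figs. 6a and 6b, leaves the problem close to degenerate; the partial decorrelation reported for the 1957–80 case corresponds to an angle that is nonzero but still modest.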

## 4. Discussion

In this study, we estimated the global OHC trend over 1971–2005 and applied a new D&A method. It enables us to estimate the contribution of each forcing, namely the NAT forcing, the GHG forcing, and the OA forcing to the global OHC trend. We find that the global OHC has increased significantly since 1971 in response to GHG emissions by 2.12 ± 0.21 × 10^{7} J m^{−2} yr^{−1}. Of this increase, 40% has been compensated by other anthropogenic influences (mainly aerosol emissions), which induced an OHC decrease since 1971 (−0.84 ± 0.18 × 10^{7} J m^{−2} yr^{−1}). The natural forcing has also induced a slight global OHC decrease since 1971, compensating an extra 6% of the OHC increase due to GHG. In total we find that the global OHC increase since 1971 amounts to 1.22 ± 0.14 × 10^{7} J m^{−2} yr^{−1}.

These results confirm earlier studies (Domingues et al. 2008; Gleckler et al. 2012, 2016; Marcos and Amores 2014; Slangen et al. 2014; Tokarska et al. 2019; Bilbao et al. 2019) in showing that the response to anthropogenic forcing (GHG plus OA response) explains most of the global OHC change since 1971 while the natural forcing explains only a small part of it (decrease of about 12%).

Compared to previous studies, we have been able to separate the effect of the anthropogenic forcing into the effect of the GHG forcing and the effect of the OA forcing (essentially aerosols). This has been possible by using a new D&A method and simultaneously analyzing the global OHC trends over 1957–80 and over 1971–2005. This bivariate method takes advantage of the different time variation of the GHG forcing and the OA forcing since 1957 to separate both effects. It shows that the OA forcing has offset 33% of the increase in subsampled OHC due to the GHG effect (about 40% for the total OHC, as noted above).
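The percentages quoted in this section can be recomputed from the bivariate estimates given at the end of section 3c. The pairing of each percentage with the total or the subsampled OHC below is our reading of the numbers, not something the text states explicitly.

```python
def percent(part, whole):
    """Magnitude of `part` relative to `whole`, in percent."""
    return 100.0 * abs(part) / abs(whole)

# Bivariate estimates quoted in the text, in units of 1e7 J m^-2 yr^-1.
total_ohc = {"ALL": 1.22, "NAT": -0.13, "GHG": 2.12, "OA": -0.84}
sub_ohc = {"ALL": 1.42, "NAT": -0.17, "GHG": 2.37, "OA": -0.78}

oa_offset_total = percent(total_ohc["OA"], total_ohc["GHG"])    # about 40
nat_offset_total = percent(total_ohc["NAT"], total_ohc["GHG"])  # about 6
oa_offset_sub = percent(sub_ohc["OA"], sub_ohc["GHG"])          # about 33
nat_share_sub = percent(sub_ohc["NAT"], sub_ohc["ALL"])         # about 12

print(round(oa_offset_total), round(nat_offset_total),
      round(oa_offset_sub), round(nat_share_sub))
```

Read this way, the 40% and 6% offsets follow from the total OHC estimates, while the 33% offset and the 12% natural contribution match the subsampled estimates.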

Several elements give confidence in this result. First, the new D&A method used here accounts for the modeling uncertainty, and treats uncertainties on the shape and amplitude of the OHC signals more consistently than previous studies.

Second, in this study, areas where the observations were sparse were masked out in both the observations and climate model simulations [as in previous studies such as Pierce et al. (2006)]. This approach allows one to remove from observations the large biases and errors due to the interpolation of the data and ensures a better comparison with model simulations. A consequence is that we find a very good agreement between CMIP5 historical simulations and observations (see Fig. 3). This agreement is substantially better than previously reported (Slangen et al. 2014; Gleckler et al. 2012, 2016). Another consequence of using an observational mask is that we get smaller uncertainty in global OHC estimates. This is because the trends tend to be higher in regions where we have observations than in other regions (see the regression lines on Fig. 2 between global and subsampled trends, which have slopes <1). As a result, the standard deviations of global trends are smaller than the standard deviations of subsampled trends and we get smaller uncertainty in the global OHC trends than in the subsampled OHC trends.
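The agreement between simulations and observations can be illustrated with the numbers from section 3b. The sketch below is a simplified two-sided Gaussian test, not the test of Ribes et al. (2017): it assumes independent Gaussian errors, treats the quoted ± values as 90% half-widths, and combines the observational and model half-widths in quadrature. The function name `consistency_p` is ours.

```python
import math
from statistics import NormalDist

# Half-width of a 90% interval expressed in standard deviations (about 1.645).
Z90 = NormalDist().inv_cdf(0.95)

def consistency_p(obs, obs_hw90, model_mean, model_hw90):
    """Two-sided p value for obs - model_mean, with the two 90% half-widths
    combined in quadrature (independent Gaussian errors are assumed)."""
    sigma = math.hypot(obs_hw90, model_hw90) / Z90
    z = abs(obs - model_mean) / sigma
    return math.erfc(z / math.sqrt(2.0))

# Values quoted in the text, in units of 1e7 J m^-2 yr^-1.
p_hist = consistency_p(1.38, 0.58, 1.43, 0.72)
p_ant = consistency_p(1.38, 0.58, 1.57, 0.94)
print(f"HIST: {p_hist:.2f}, ANT: {p_ant:.2f}")
```

Under these assumptions the sketch gives about 0.93 for HIST and 0.78 for ANT, close to the reported values of 0.94 and 0.78; the small difference for HIST may come from the rounding of the quoted inputs.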

Third, we used a bivariate method to analyze the anticorrelation between the GHG and the OA effect on the global OHC. By testing 83 different pairs of variables, we have been able to find pairs that enable us to partly decorrelate the GHG and OA effects on global OHC, reducing the degeneracy of the problem (e.g., Allen et al. 2006; Bilbao et al. 2019, and many others). This reduces the uncertainty and increases confidence in the estimates of the GHG and OA effects on OHC.

However, the results from the bivariate method are disappointing. We expected a better decorrelation by using several variables, but the improvement has turned out to be marginal compared to the univariate method. A reason for this is that at global and basin scale the OHC response is fairly similar regardless of which regional basin is considered. Potentially, looking at higher resolution, with smaller regions and also looking at different depth layers [as in Bilbao et al. (2019)], may increase the signal-to-noise ratio and improve the decorrelation. So, a way to improve our approach would be to increase the number of variables and to use a multivariate approach (>2 variables). Given the first results here with a partial decorrelation with basin-scale variables, the multivariate approach appears promising. But developing such a multivariate approach requires a very large number of independent model simulations to be able to estimate the variance–covariance matrix **Σ**_{X}. In the future, very large perturbed physics ensembles could provide such a large ensemble of independent simulations and help in this work.

Another important limitation of our study, which is common to all D&A studies, is that the internal variability is estimated from PIC simulations that are only about 1000 years long. Because 1000 years is not enough to reach an equilibrium in the simulated climate, these simulations are drifting and thus the internal variability is not estimated over a period with stationary statistics. To cope with this issue, we detrended the PIC simulations, but it is not clear to what extent this detrending removes all the drift in the PIC simulations. In the future, multi-thousand-year PIC simulations should help in addressing this issue.

Note that the partial reduction of the degeneracy has been possible because the GHG and OA forcings have different effects over different regions and periods. The trace of these changes is more easily found in the ocean because the ocean variability is slow and has a long memory of past changes (at least over decades to centuries). Because the memory in the atmosphere and land is much shorter, it is not clear whether such an approach would also work with atmospheric variables. This remains to be tested.

## Acknowledgments

We acknowledge the World Climate Research Programme’s Working Group on Coupled Modelling, which is responsible for CMIP, and we thank the climate modeling groups for producing and making available their model output. For CMIP the U.S. Department of Energy’s Program for Climate Model Diagnosis and Intercomparison provided coordinating support and led development of software infrastructure in partnership with the Global Organization for Earth System Science Portals. We also acknowledge the CHAVANA project and the RTRA STAE Toulouse Fundation for funding and supporting this research. Finally, the authors would like to thank Chris E. Forest and three anonymous reviewers for their constructive comments that enabled us to improve the manuscript.

## REFERENCES

Abraham, J. P., and Coauthors, 2013: A review of global ocean temperature observations: Implications for ocean heat content estimates and climate change. *Rev. Geophys.*, 51, 450–483, https://doi.org/10.1002/rog.20022.

AchutaRao, K. M., and Coauthors, 2007: Simulated and observed variability in ocean temperature and heat content. *Proc. Natl. Acad. Sci. USA*, 104, 10 768–10 773, https://doi.org/10.1073/pnas.0611375104.

Allen, M. R., and Coauthors, 2006: Quantifying anthropogenic influence on recent near-surface temperature change. *Surv. Geophys.*, 27, 491–544, https://doi.org/10.1007/s10712-006-9011-6.

Annan, J., and J. Hargreaves, 2010: Reliability of the CMIP3 ensemble. *Geophys. Res. Lett.*, 37, L02703, https://doi.org/10.1029/2009GL041994.

Barnett, T. P., D. W. Pierce, and R. Schnur, 2001: Detection of anthropogenic climate change in the world’s oceans. *Science*, 292, 270–274, https://doi.org/10.1126/science.1058304.

Barnett, T. P., D. W. Pierce, K. M. AchutaRao, P. J. Gleckler, B. D. Santer, J. M. Gregory, and W. M. Washington, 2005: Penetration of human-induced warming into the world’s oceans. *Science*, 309, 284–287, https://doi.org/10.1126/science.1112418.

Bilbao, R. A. F., J. M. Gregory, N. Bouttes, M. D. Palmer, and P. Stott, 2019: Attribution of ocean temperature change to anthropogenic and natural forcings using the temporal, vertical and geographical structure. *Climate Dyn.*, 53, 5389–5413, https://doi.org/10.1007/s00382-019-04910-1.

Bindoff, N., and Coauthors, 2013: Detection and attribution of climate change: From global to regional. *Climate Change 2013: The Physical Science Basis*, T. F. Stocker et al., Eds., Cambridge University Press, 867–952.

Boucher, O., and Coauthors, 2013: Clouds and aerosols. *Climate Change 2013: The Physical Science Basis*, T. F. Stocker et al., Eds., Cambridge University Press, 571–657.

Boyer, T., and Coauthors, 2016: Sensitivity of global upper-ocean heat content estimates to mapping methods, XBT bias corrections, and baseline climatologies. *J. Climate*, 29, 4817–4842, https://doi.org/10.1175/JCLI-D-15-0801.1.

Bryden, H. L., 1973: New polynomials for thermal expansion, adiabatic temperature gradient and potential temperature of sea water. *Deep-Sea Res. Oceanogr. Abst.*, 20, 401–408, https://doi.org/10.1016/0011-7471(73)90063-6.

Collins, M., and Coauthors, 2013: Long-term climate change: Projections, commitments and irreversibility. *Climate Change 2013: The Physical Science Basis*, T. F. Stocker et al., Eds., Cambridge University Press, 1029–1136.

Domingues, C. M., J. A. Church, N. J. White, P. J. Gleckler, S. E. Wijffels, P. M. Barker, and J. R. Dunn, 2008: Improved estimates of upper-ocean warming and multi-decadal sea-level rise. *Nature*, 453, 1090–1093, https://doi.org/10.1038/nature07080.

Durack, P. J., P. J. Gleckler, F. W. Landerer, and K. E. Taylor, 2014: Quantifying underestimates of long-term upper-ocean warming. *Nat. Climate Change*, 4, 999–1005, https://doi.org/10.1038/nclimate2389.

Gleckler, P. J., and Coauthors, 2012: Human-induced global ocean warming on multidecadal timescales. *Nat. Climate Change*, 2, 524–529, https://doi.org/10.1038/nclimate1553.

Gleckler, P. J., P. J. Durack, R. J. Stouffer, G. C. Johnson, and C. E. Forest, 2016: Industrial-era global ocean heat uptake doubles in recent decades. *Nat. Climate Change*, 6, 394–398, https://doi.org/10.1038/nclimate2915.

Gregory, J. M., H. T. Banks, P. A. Stott, J. A. Lowe, and M. D. Palmer, 2004: Simulated and observed decadal variability in ocean heat content. *Geophys. Res. Lett.*, 31, L15312, https://doi.org/10.1029/2004GL020258.

Hegerl, G., and F. Zwiers, 2011: Use of models in detection and attribution of climate change. *Wiley Interdiscip. Rev.: Climate Change*, 2, 570–591, https://doi.org/10.1002/WCC.121.

Irving, D. B., S. Wijffels, and J. A. Church, 2019: Anthropogenic aerosols, greenhouse gases, and the uptake, transport, and storage of excess heat in the climate system. *Geophys. Res. Lett.*, 46, 4894–4903, https://doi.org/10.1029/2019GL082015.

Ishii, M., Y. Fukuda, S. Hirahara, S. Yasui, T. Suzuki, and K. Sato, 2017: Accuracy of global upper ocean heat content estimation expected from present observational data sets. *SOLA*, 13, 163–167, https://doi.org/10.2151/SOLA.2017-030.

Levitus, S., and Coauthors, 2012: World Ocean heat content and thermosteric sea level change (0–2000 m), 1955–2010. *Geophys. Res. Lett.*, 39, L10603, https://doi.org/10.1029/2012GL051106.

Marcos, M., and A. Amores, 2014: Quantifying anthropogenic and natural contributions to thermosteric sea level rise. *Geophys. Res. Lett.*, 41, 2502–2507, https://doi.org/10.1002/2014GL059766.

Melet, A., and B. Meyssignac, 2015: Explaining the spread in global mean thermosteric sea level rise in CMIP5 climate models. *J. Climate*, 28, 9918–9940, https://doi.org/10.1175/JCLI-D-15-0200.1.

Pierce, D. W., T. P. Barnett, K. M. AchutaRao, P. J. Gleckler, J. M. Gregory, and W. M. Washington, 2006: Anthropogenic warming of the oceans: Observations and model results. *J. Climate*, 19, 1873–1900, https://doi.org/10.1175/JCLI3723.1.

Pierce, D. W., P. J. Gleckler, T. P. Barnett, B. D. Santer, and P. J. Durack, 2012: The fingerprint of human-induced changes in the ocean’s salinity and temperature fields. *Geophys. Res. Lett.*, 39, L21704, https://doi.org/10.1029/2012GL053389.

Rhein, M., and Coauthors, 2013: Observations: Ocean. *Climate Change 2013: The Physical Science Basis*, T. F. Stocker et al., Eds., Cambridge University Press, 255–315.

Ribes, A., F. W. Zwiers, J.-M. Azaïs, and P. Naveau, 2017: A new statistical approach to climate change detection and attribution. *Climate Dyn.*, 48, 367–386, https://doi.org/10.1007/s00382-016-3079-6.

Sen Gupta, A., N. C. Jourdain, J. N. Brown, and D. Monselesan, 2013: Climate drift in the CMIP5 models. *J. Climate*, 26, 8597–8615, https://doi.org/10.1175/JCLI-D-12-00521.1.

Slangen, A., J. A. Church, X. Zhang, and D. Monselesan, 2014: Detection and attribution of global mean thermosteric sea level change. *Geophys. Res. Lett.*, 41, 5951–5959, https://doi.org/10.1002/2014GL061356.

Slangen, A., J. A. Church, X. Zhang, and D. Monselesan, 2015: The sea level response to external forcings in historical simulations of CMIP5 climate models. *J. Climate*, 28, 8521–8539, https://doi.org/10.1175/JCLI-D-15-0376.1.

Taylor, K. E., R. J. Stouffer, and G. A. Meehl, 2012: An overview of CMIP5 and the experiment design. *Bull. Amer. Meteor. Soc.*, 93, 485–498, https://doi.org/10.1175/BAMS-D-11-00094.1.

Tokarska, K. B., G. C. Hegerl, A. P. Schurer, A. Ribes, and J. T. Fasullo, 2019: Quantifying human contributions to past and future ocean warming and thermosteric sea level rise. *Environ. Res. Lett.*, 14, 074020, https://doi.org/10.1088/1748-9326/ab23c1.

von Schuckmann, K., and Coauthors, 2016: An imperative to monitor Earth’s energy imbalance. *Nat. Climate Change*, 6, 138–144, https://doi.org/10.1038/nclimate2876.