A complete map of the ocean subsurface temperature is essential for monitoring aspects of climate change such as the ocean heat content (OHC) and sea level changes and for understanding the dynamics of ocean and climate variability. However, global observations have not been available in the past, so a mapping strategy is required to fill the data gaps. In this study, an advanced mapping method is proposed to reconstruct the historical ocean subsurface (0–700 m) temperature field from 1940 to 2014 by using an ensemble optimal interpolation with a dynamic ensemble (EnOI-DE) approach and a multimodel ensemble of phase 5 of the Coupled Model Intercomparison Project (CMIP5) historical and representative concentration pathway 4.5 simulations. The reconstructed field is a combination of two parts: a first guess provided by the ensemble mean of the CMIP5 models and an adjustment obtained by minimizing the analysis error with the assistance of the error covariance determined by the CMIP5 models. The uncertainty of the field can also be assessed. This new approach was evaluated using a series of tests, including subsample tests using data from the Argo period, idealized tests specifying a truth field from the models, and withdrawn-data tests removing 20% of the observations for validation. In addition, the authors showed that the ocean mean state, long-term trends, and interannual and decadal variability are all well represented. Furthermore, the most significant benefit of this method is to provide an improved estimate of the long-term historical OHC changes since 1940, which have important implications for Earth’s energy budget.
The ocean is the key component in tracking Earth’s heat budget (Abraham et al. 2013; Allan et al. 2014; Church and White 2011; Church et al. 2011; Trenberth et al. 2014a; Trenberth and Fasullo 2012); therefore, the change in the ocean heat content (OHC) is a critical metric for monitoring ongoing global warming (von Schuckmann et al. 2016). Estimates of historical OHC change rely on in situ temperature profile observations. However, temperature observations have been sparse in both space and time, leaving numerous data gaps. For example, in 1941 data were available only near the western coasts of the Pacific and Atlantic Oceans (Fig. 1). The data coverage was extended to the midlatitudes of the northern Pacific and Atlantic Oceans by 1971, and then to most regions of the Northern Hemisphere by 1991. Not until 2005, when the Argo system was inaugurated, were observations scattered over most of the global ocean (with the exception of the Southern Ocean and coastal areas) (Freeland et al. 2010; Roemmich et al. 2012; von Schuckmann et al. 2014; Roemmich et al. 2015; Riser et al. 2016). The lack of in situ observations has impeded the accurate assessment of the rate of global warming (Lyman and Johnson 2008; Abraham et al. 2013; Cheng and Zhu 2014a) and the development of an in-depth understanding of ocean and climate dynamics (Gille 2008).
A temperature field with complete spatial and temporal coverage throughout the subsurface is always preferred in oceanographic studies. A large group of methods makes use of data assimilation techniques by taking into account the ocean dynamics represented by ocean models. These efforts have resulted in a number of reanalysis datasets (Balmaseda et al. 2013; Carton and Giese 2008; Chang et al. 2013; Xue et al. 2011). However, OHC estimates in various reanalysis datasets have revealed substantial differences, as indicated by reanalysis intercomparison studies (Carton and Santorelli 2008; Xue et al. 2012; Palmer et al. 2015).
Another group of strategies has been adopted by the community to statistically fill in the data gaps (Domingues et al. 2008; Good et al. 2013; Ishii et al. 2003; Levitus et al. 2012; Palmer and Haines 2009; Roemmich and Gilson 2009; Smith and Murphy 2007; Willis et al. 2004). Among these methods, Lyman et al. (2010) and Palmer and Haines (2009) used the average temperature anomaly of the sampled areas to represent the global mean, which assumes that ocean change in data-rich regions is the same as that in regions with data gaps. Domingues et al. (2008) applied an EOF analysis based on the available data and used the resulting principal components to represent the relationship between data-rich and data-sparse regions. Willis et al. (2004) projected the sea level change onto the OHC field based on their correlations.
In addition to the methods described above, other methods have been used, which are often termed “objective interpolation methods.” These techniques require an error covariance field to propagate information from observed regions to regions with data gaps, combined with a first-guess field to provide prior information. The first guess is usually based on climatology (i.e., with zero anomaly) for simplicity (e.g., Levitus et al. 2012; Smith and Murphy 2007; Ishii et al. 2003; Good et al. 2013). This choice might cause the final field to revert to the first-guess field, especially in areas with large data gaps, such as in the Southern Hemisphere (Fig. 1; 1941, 1971, and 1991). This problem is suspected to result in an underestimated long-term OHC trend (Cheng and Zhu 2014a; Durack et al. 2014a). Therefore, a more appropriate strategy is to improve the first guess; such an improvement is the motivation for this study.
Meanwhile, the error covariance defines the correlations among different ocean grids and determines how the information propagates from data-rich to data-sparse regions. The existing methods use parameterized covariance based on some basic assumptions, for example, typical correlation length scales (Good et al. 2013; Ishii et al. 2003; Levitus et al. 2012). The error covariance fields in these studies are empirically constructed and are mainly determined by the spatial distance between the location analyzed and the location observed. A better choice is to use a covariance defined by the output of a high-resolution climate model simulation, which allows a high-resolution representation of error covariance (Smith and Murphy 2007). However, this proposed method relies on the accuracy of the applied ocean model, which always contains a systematic bias to some extent; therefore, a potential improvement is the use of an ensemble of models, which is another motivation of this study.
Here, an ensemble of model outputs from the historical runs and representative concentration pathway 4.5 (RCP4.5) runs of phase 5 of the Coupled Model Intercomparison Project (CMIP5) is used to provide a more reasonable choice for the first-guess field and error covariance statistics. The conclusion of the Fifth Assessment Report (AR5) of the Intergovernmental Panel on Climate Change (IPCC) states that “there is high confidence that many CMIP5 models reproduce the observed increase in ocean heat content since 1960,” and many other studies agree (AchutaRao et al. 2007; Gleckler et al. 2012; Flato et al. 2013; Cheng et al. 2015b; Smith et al. 2015). Therefore, an ensemble of models could be a reasonable choice to generate the first-guess field, which is better than using a climatology that has a zero temperature anomaly over time. Moreover, by using the ensemble strategy, the impact of model bias can be minimized and a more accurate value for the covariance could potentially be achieved.
This manuscript is structured as follows: the data used for this study are introduced in section 2. The methods and the related choices of parameterizations are introduced in section 3. To determine one of the key parameters and perform a first validation of the method, the data for a recent decade are used in section 4, because the Argo network helps to maintain a dataset with near-global coverage. The idealized experiments based on the models are outlined in section 5, which illustrate the improvement due to the proposed strategy when using either a perfect or a slightly biased first-guess field; in these idealized tests, however, the error covariance is exactly known. In section 6, we further examine the method by using in situ observations, for which the error covariance is unknown. We withdrew a portion of the in situ observations as an independent dataset for validation. In section 7, the reconstructed temperature field is compared with other datasets. A brief description of the analysis field is presented in section 8, in which we highlight the historical upper ocean heat content estimate from 1940 to 2014 based on the analyzed field. A discussion of possible future applications is provided in section 9.
In situ ocean temperature observations are sourced from the World Ocean Database 2013 (WOD13) from 1940 to 2014, with more than 10 000 000 temperature profiles (Boyer et al. 2013). The quality flags provided by WOD13 are used to remove erroneous profiles and measurements. A standard deviation check is also applied to remove spurious temperature measurements, whereby measurements more than five standard deviations away from the mean are removed. XBT bias is corrected by using the process of Cheng et al. (2014), and mechanical bathythermograph (MBT) bias is corrected by using the process of Ishii and Kimoto (2009).
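For illustration, the five-standard-deviation check described above can be sketched as follows. This is a minimal sketch, not the operational quality-control code; the function name and interface are ours, and in practice the WOD13 quality flags would be applied first.

```python
import numpy as np

def std_check(values, n_std=5.0):
    """Flag measurements more than n_std standard deviations from the mean.

    Returns a boolean mask that is True for values to keep.
    """
    values = np.asarray(values, dtype=float)
    mean, std = values.mean(), values.std()
    if std == 0.0:
        return np.ones(values.shape, dtype=bool)  # no spread, keep everything
    return np.abs(values - mean) <= n_std * std
```

Note that with only a handful of measurements a single outlier cannot exceed five standard deviations of the sample itself, so a check of this kind is effective only in reasonably well-populated samples.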
Twelve monthly climatologies were constructed by averaging all of the temperature profiles from 1999 to 2005 into 1° × 1° ocean grid boxes and 27 standard depths, from the sea surface to 700 m. The standard depths were as follows: 1 m, 5 m, 10–100 m in 10-m intervals, 120–200 m in 20-m intervals, and 250–700 m in 50-m intervals. Then, a nine-point median filter (around each specific horizontal grid) was applied to the resulting blended fields. Only seven years (within 1999–2005) of data were used for the following reasons:
Cheng and Zhu (2015) indicated that a long-term climatology will bias the long-term calculation of OHC, so a climatology constructed using data from a short period was preferred for this study.
The CMIP5 historical simulations ended in 2005, and the RCP4.5 simulations began in 2006; however, the successive RCP simulations were not available for nine of the models.
The global OHC time series obtained using the 1999–2005 climatology is very similar to that obtained by using the 2008–12 climatology when using the method of Cheng et al. (2015b).
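For illustration, the nine-point median filter applied to each horizontal level of the blended climatology (as described above) can be sketched as follows. This simplified version, with an interface of our own choosing, treats land and missing grids as NaN, truncates the 3 × 3 neighborhood at the grid edges, and ignores longitude wraparound.

```python
import numpy as np

def nine_point_median(field):
    """Apply a nine-point (3x3) median filter to a 2-D climatology field.

    NaN values (land or missing grids) are ignored within each
    neighborhood and are preserved in the output.
    """
    ny, nx = field.shape
    out = np.full_like(field, np.nan)
    for j in range(ny):
        for i in range(nx):
            if np.isnan(field[j, i]):
                continue  # keep land/missing points as NaN
            win = field[max(j - 1, 0):j + 2, max(i - 1, 0):i + 2]
            out[j, i] = np.nanmedian(win)
    return out
```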
The climatology is subtracted from each temperature profile to obtain a temperature anomaly profile. Then, the anomaly profiles are averaged in 1° × 1° grid boxes for each month and 27 standard depths to obtain the grid-averaged, monthly temperature anomaly field. Given the slow temporal variation of the ocean, for each month we use data collected over a 3-month window to calculate the monthly mean, which helps to increase the data coverage without losing the ENSO variation. In this study, the proposed mapping method was applied to this grid-averaged, monthly temperature anomaly field.
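The gridding step described above can be sketched as follows for a single depth level. The function name and interface are ours, and details such as longitude wraparound and exact latitude edge handling are simplified.

```python
import numpy as np

def monthly_gridded_anomaly(lon, lat, month_index, anom, nlon=360, nlat=180):
    """Grid temperature anomaly profiles into 1° x 1° monthly boxes.

    For a target month m, data within the 3-month window [m-1, m+1] are
    averaged together, as described in the text. `month_index` counts
    months from the start of the record; all inputs are equal-length 1-D
    arrays for a single depth level.
    """
    fields = {}
    for m in np.unique(month_index):
        in_window = np.abs(month_index - m) <= 1          # 3-month window
        ix = np.floor(lon[in_window] % 360).astype(int)
        iy = np.clip(np.floor(lat[in_window] + 90).astype(int), 0, nlat - 1)
        total = np.zeros((nlat, nlon))
        count = np.zeros((nlat, nlon))
        np.add.at(total, (iy, ix), anom[in_window])       # accumulate sums
        np.add.at(count, (iy, ix), 1)                     # and sample counts
        fields[m] = np.where(count > 0, total / np.maximum(count, 1), np.nan)
    return fields
```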
The CMIP5 model output of historical runs from 1940 to 2005 and of an RCP4.5 scenario from 2006 to 2014 was assembled for this study (Taylor et al. 2012). The historical runs included 40 separate runs and the RCP4.5 projections included 31 separate runs, which are listed in Table 1. We note here that a so-called “climate drift” exists for CMIP5 climate models (Sen Gupta et al. 2013), which is a spurious long-term variation that cannot be explained by internal variability or external forcing. According to Durack et al. (2014a), the upper 0–700 m of the OHC is only weakly affected by climate drift; therefore, we did not apply a correction for climate drift in this study.
The data preparation for each CMIP5 model is similar to that for the observations. In brief, for each model, the temperature field was first interpolated to a 1° × 1° grid and 27 standard depths, and then 12 monthly climatologies were constructed using the 1999–2005 data. The temperature anomaly field was then obtained by subtracting the climatology from the interpolated field. This process was repeated separately for each of the 40 models (31 after 2006), yielding 40 (31) grid-averaged, monthly temperature anomaly fields, which served as the input for our proposed mapping method.
The use of the temperature anomaly field rather than temperature field in the mapping is emphasized because the spatial and seasonal variation of the ocean (on the order of 1°C from the equator to the polar region and 0.1°C from summer to winter) is much larger than its interannual (on the order of 0.01°C yr−1 for global average signals) and multidecadal variation (on the order of 0.001°C yr−1 for global average signals). The climate models have substantial differences in simulating the mean ocean state, including spatial and seasonal variation, but they are not our primary focus. Instead, the anomaly field is used to remove the impact of the ocean spatial and seasonal variation.
A framework combining in situ observations and CMIP5 models is required to obtain the final analysis field. In this section, we briefly describe the method used in this study, the ensemble optimal interpolation method with a dynamic ensemble (EnOI-DE), which is based on the Kalman filter (KF; Kalman 1960; Evensen 1994, 2003, 2004). The analysis field Xa is a linear combination of a prior guess field Xb (or background field) and the in situ observations (denoted by the matrix Y):

Xa = Xb + K(Y − HXb),

where H is the transfer matrix from the analysis space to the observation space, and the Kalman gain K is obtained by the maximum likelihood method by minimizing the analysis error. The Kalman gain is calculated by

K = Pb H^T (H Pb H^T + R)^(−1).

The superscript T denotes the transposition operation; R is the error covariance of the observations, which is always predefined (a description of the method will be provided later); and Pb is the error covariance of the background field. This covariance is essential for propagating the signals from the data-rich to the data-sparse regions. Ideally, Pb would be calculated explicitly given the model propagation matrix. However, because ocean dynamics are highly nonlinear, it is impossible to calculate Pb explicitly. Some methods use an empirically parameterized Pb as an approximation (Good et al. 2013; Ishii et al. 2003; Levitus et al. 2012), but the accuracy of such an approximation is unclear. Improved strategies include the ensemble Kalman filter (EnKF) method (Evensen 1994, 2004); the ensemble optimal interpolation (EnOI; Evensen 2003); and the ensemble square root filter (EnSRF; Bishop et al. 2001; Sakov and Oke 2008a,b), which use an ensemble of models to represent the background error covariance Pb. In this way, N ensemble members of the model outputs are recorded in the matrix A, which is defined as A = (Xb1, Xb2, …, XbN). The ensemble mean of the models is

X̄b = (1/N) A1,

where 1 is the unity vector (an N-element column vector of ones). With the ensemble anomaly matrix A′ = A − X̄b 1^T, the model covariance (or background covariance) Pb is now calculated as

Pb = A′(A′)^T / (N − 1).

For the traditional EnKF or EnSRF method, the error covariance is propagated and updated according to the following equation: Pa = (I − KH)Pb. Thus, the analysis at the next time step uses the information from the current time step. However, in this study, the CMIP5 models were not constrained by observations, so they do not carry the covariance information from the previous cycles forward in time. EnKF-like systems that use static or seasonal ensembles are often classified as the EnOI method. Therefore, the method used in this study is characterized as EnOI-DE, with the dynamic ensemble provided by the unconstrained simulations of the CMIP5 models. The implementation of the method follows Sakov and Oke (2008a,b).
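As a minimal sketch of a single EnOI analysis step of this kind, the following assumes that H simply selects the observed grid cells and omits localization for brevity; the function name and interface are ours, not the study's operational code.

```python
import numpy as np

def enoi_analysis(ensemble, y, obs_idx, obs_var):
    """One EnOI analysis step with a dynamic ensemble (illustrative sketch).

    ensemble : (n_state, N) matrix A of model anomaly fields for one month
    y        : (n_obs,) gridded observed anomalies Y
    obs_idx  : (n_obs,) state indices of the observed cells (rows of H,
               which here just selects observed grid cells)
    obs_var  : (n_obs,) diagonal of the observation error covariance R
    Returns the analysis field Xa.
    """
    n_state, N = ensemble.shape
    xb = ensemble.mean(axis=1)                 # first guess: ensemble mean
    A_prime = ensemble - xb[:, None]           # ensemble anomalies A'
    # Pb = A'A'^T/(N-1); form only the pieces Pb H^T and H Pb H^T we need.
    HA = A_prime[obs_idx, :]                   # H A'
    PbHT = A_prime @ HA.T / (N - 1)            # Pb H^T
    HPbHT = HA @ HA.T / (N - 1)                # H Pb H^T
    K = PbHT @ np.linalg.inv(HPbHT + np.diag(obs_var))
    return xb + K @ (y - xb[obs_idx])          # Xa = Xb + K(Y - H Xb)
```

Observing one cell updates correlated unobserved cells through Pb, which is the mechanism by which information propagates from data-rich to data-sparse regions.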
The general framework of the mapping method has now been developed. The reconstructed field (Xa) is the monthly temperature anomaly in each 1° × 1° ocean grid at each standard depth. In other words, the mapping strategy is applied to the grid-averaged temperature anomalies on the two-dimensional latitude–longitude plane. The background field (Xb) is the ensemble mean of the 40 pre-2005 CMIP5 models (N = 40) and of the 31 models from 2006 to 2014 (N = 31), and the observations (Y) are the grid-averaged, monthly in situ temperature anomalies.
It is still necessary to define some key parameters for the application of this method. For clarity, the parameters are defined as follows.
The observation error covariance R represents the error of the observations and must be defined prior to the analysis. It consists of both the instrumental error, due to instrument inaccuracy, and the representativeness error, due to the need to represent the spatial (1° × 1° at each standard depth) and temporal (1 month) averages from a limited number of observations. Both the instrumental and the representativeness errors are assumed to be uncorrelated between any two different grid cells; therefore, R is a diagonal matrix. This assumption may not hold if the temperature observations in two different grids are made by the same instruments or on the same cruise, which might lead to correlated errors in the two grids. For instance, most of the XBT data before 1985 were obtained by analog systems, which have a larger bias, and most after 1990 by digital systems, with a smaller bias. However, we decided to ignore this correlation because it is difficult to quantify.

According to the discussion above, the observational error variance in each spatial (1° × 1° and standard depth) and temporal (1 month) grid cell is the sum of two terms. The instrumental term is set to the mean of the typical precision of the different instruments used to obtain the data in the cell, where Ei is the precision of the instrument for each individual observation (i.e., the variance of the instrumental error), set to 0.1°C2 for XBT, 0.001°C2 for CTD, 0.002°C2 for Argo, 0.3°C2 for MBT, and 0.01°C2 for the other instruments according to Abraham et al. (2013). The representativeness term defines the accuracy of the monthly mean value in each cell and is approximated by the variance of the monthly mean value, σ2/n, where σ2 is the variance of the various temperature measurements about the monthly mean value and n is the number of measurements in the cell.
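Under these definitions, the diagonal entry of R for one grid cell can be sketched as follows. This is an illustrative reading of the construction described above, assuming the cell variance is the mean instrumental error variance plus the representativeness error σ2/n (the error variance of a mean of n samples); the instrument table follows the values quoted from Abraham et al. (2013).

```python
import numpy as np

# Typical instrumental error variances (deg C^2), per Abraham et al. (2013).
INSTRUMENT_ERROR = {"XBT": 0.1, "CTD": 0.001, "Argo": 0.002,
                    "MBT": 0.3, "OTHER": 0.01}

def obs_error_variance(instruments, sigma2):
    """Observation error variance for one 1° x 1° monthly grid cell.

    `instruments` lists the instrument type of each measurement in the
    cell; `sigma2` is the variance of those measurements about the
    monthly mean value.
    """
    n = len(instruments)
    e_instr = np.mean([INSTRUMENT_ERROR.get(k, INSTRUMENT_ERROR["OTHER"])
                       for k in instruments])  # mean instrumental error
    e_repr = sigma2 / n                         # representativeness error
    return e_instr + e_repr
```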
The data from 2005 to 2013 are used to calculate the value of σ2 in each grid because of the approximately full global coverage of the data; the resulting fields at depths of 1, 300, and 700 m are shown in Fig. 2. The representativeness error is expected to show flow dependency; that is, the error is expected to be higher in areas with strong gradients in the flow speed. The error is larger in regions of higher variability, where more observations are required to represent the mean value. Figure 2 shows a larger representativeness error in the boundary-current regions and near the Antarctic Circumpolar Current (ACC) regions.
Next, the need for a localization strategy should be determined. This strategy assumes that each observation can impact only a limited region around its location. The application of a localization strategy helps to minimize the impact of an imperfect error covariance, which could propagate information incorrectly over a much longer distance. In general, an error in the model covariance will increase with distance from the analysis point, but it is almost impossible to specify an exact box (its size is determined by the influencing radius) within which the observations could benefit the analysis and outside of which they could not, because the true correlation length is unknown. Previous studies either specified a constant length scale over the global ocean [i.e., the three-pass analyses of 880, 660, and 440 km in Levitus et al. (2012) or ~300 and ~900 km in Willis et al. (2004)] according to the large-scale ocean wavenumber spectra in Zang and Wunsch (2001) or an empirically parameterized scale [i.e., a length scale that decreases with latitude, as in Ishii et al. (2003), or estimates according to the Hadley Centre Sea Ice and Sea Surface Temperature (HadISST) data in Smith and Murphy (2007)]. However, a localization strategy interferes with the potential advantage of an improved covariance. Therefore, we included a test in the next section to justify the usage of a localization strategy and empirically determine the influencing radius, which defines a specific area; data outside that area are not used for the analysis.
4. Evaluation using data from a recent decade
During the last decade, the Argo network achieved near-global coverage (Abraham et al. 2013; Roemmich et al. 2015; Riser et al. 2016). Therefore, in this section, we describe a subsample test conducted by sampling the data from the past eight years (2007–14) at historical observation locations. Similar tests were also performed in a recent study (Smith et al. 2015). This test was used first to examine the impact of a localization strategy and to determine the maximum influencing radius, as discussed in the previous section, and then to preliminarily evaluate the proposed mapping method.
In the subsample test, we used the temperature anomaly at 10 m for each selected month (January and August from 2007 to 2014) as the truth value (for a total of 16 truth fields). To achieve better spatial coverage, the data within 3 months around the selected month were averaged together. Each truth field was then subsampled according to the history of the observation locations (in January of every fifth year from 1941 to 2014). Thus, 16 subsampled fields were generated for each truth field. To examine the performance of the new method, all of the subsampled fields were mapped using the proposed method and then compared with the truth fields. This test was run multiple times using different influencing radii from 4° to 36° in 4° increments, and another run without a localization strategy was also carried out.
The global mean error between the mapped fields and the truth fields is shown by dots in Fig. 3a as a function of the influencing radius. It appears that a radius between 16° and 24° minimizes the error of the reconstructed field. An influencing radius larger than 24° allows more remote correlations, but it increases the error, suggesting the inaccuracy of the remote covariance represented by the models. An influencing radius of less than 16° also increases the analysis error because of the ineffective propagation of information from data-rich to data-sparse regions. Therefore, we used 20° as the influencing radius for the localization strategy. Note that the localization strategy requires a taper function, which defines how much an observation can impact the analyzed grid. In this study, we adopted a Gaussian function, assuming that the impact of an observation decreases exponentially with distance from the analyzed grid. Tests to find the best selection of the taper function, for example, the functions proposed by Gaspari and Cohn (1999), will be a first priority for future improvement of the proposed method.
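A Gaussian taper of this kind can be sketched as follows. The e-folding scale of half the influencing radius is an assumption for illustration only, not a tuned value from this study.

```python
import numpy as np

def gaussian_taper(distance_deg, radius_deg=20.0):
    """Localization weight for an observation at a given distance (degrees).

    The weight decays as a Gaussian with distance and is cut to zero
    beyond the influencing radius, so observations outside that radius
    do not affect the analyzed grid.
    """
    d = np.asarray(distance_deg, dtype=float)
    L = radius_deg / 2.0                      # assumed e-folding scale
    w = np.exp(-0.5 * (d / L) ** 2)
    return np.where(d <= radius_deg, w, 0.0)
```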
Figure 3a shows that the global mean temperature errors (defined as analysis minus observation) are always positive, indicating an error in the reconstructed field. To further investigate this error, we present the temperature error as a function of the sampling year from 1941 to 2014 in Fig. 3b, with the influencing radius set to 20°. A positive error was found for the pre-1965 ocean sampling. By contrast, for the post-1965 ocean samplings, the proposed mapping method could reconstruct the truth field in an unbiased manner, leading to errors around zero. This indicates that the new method can reduce the impact of the change of the observation system from the traditional ship-based system to the Argo system at the beginning of this century, as documented in Cheng and Zhu (2014a). Next, we explore why some pre-1965 errors might be present and how well the information was propagated from the data-rich to the data-sparse areas.
The geographical distribution of temperature anomalies at 10 m in January 2014 is shown in Fig. 4, along with the reconstructed field. The subsampled fields for the observation locations in January 1941, 1961, 1981, and 2001 as well as the corresponding mapped fields are also shown. In 2014, the ocean exhibited a moderate La Niña pattern, with cooling signals in the eastern and southeastern Pacific together with strong warm anomalies in the northwestern Pacific. The Indian Ocean experienced cooling in the north and warming in the south, and the Atlantic Ocean showed warming in the midlatitudes and cooling in the other latitudes.
In January 1941, only some grids were observed, in the coastal regions near Japan and western Australia and in the northwestern Atlantic Ocean, so only the temperature anomaly pattern in the midlatitudes of the Northern Hemisphere and near Australia could be correctly reconstructed. However, in the western Indian, southeastern Pacific, and South Atlantic Oceans, the reconstructed field reverts to the first-guess field, which shows different geographical patterns than the true patterns. This is because the localization strategy prevents the propagation of information from the observed regions to data-sparse regions at spatial distances larger than 20°. This illustrates a key assumption of the proposed method: the final analyzed field will revert to the first guess when very few observations are available (e.g., pre-1960). Under such circumstances, the current method trusts the first-guess field provided by the models, which leads to the pre-1960 error in Fig. 3b. The bias of the long-term trend due to this assumption was quantified, and the results are presented in the following sections.
In 1961, the data coverage increased to greater than 10%, although the data were still distributed along the major cruise routes. The geographical pattern of the reconstructed field agrees well with the truth field for 2014 (Fig. 4), and the global averaged error is lower compared with 1941 (Fig. 3b). This suggests that the unconstrained ensemble of model runs contains the correct type of variability to reconstruct the temperature field. Similar results can be found for the ocean sampling in 1981 and 2001.
In summary, we used the data from the last decade to determine the influencing radius and to perform a preliminary evaluation of our proposed method. We showed that the new mapping method is able to reconstruct the truth field post-1960 in an unbiased manner, but the reconstructed field shows a slight reversion to the first-guess field pre-1960. We note here that the principal shortcoming of this test is that the data for this decade still do not have global coverage, especially in coastal areas and in the Southern Ocean.
5. Idealized tests using CMIP5 models
To evaluate the new method and examine the potential improvement of the present analysis for historical ocean heat content change, a series of idealized tests was constructed by explicitly specifying a truth field. Pseudo observations were constructed by sampling the truth field according to the locations of historical observations from 1940 to 2014. Observational errors were added to the pseudo observations by generating random noise with zero mean and a standard deviation calculated by Eq. (5). Then, the pseudo observations were analyzed by using a mapping method (discussed below). Finally, the mapped field was compared with the truth field to evaluate the performance of the reconstruction.
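The construction of the pseudo observations can be sketched as follows. The interface is ours for illustration; `obs_std` stands for the per-grid observational error standard deviations described in the previous section.

```python
import numpy as np

def make_pseudo_obs(truth_field, sampled_mask, obs_std, rng=None):
    """Subsample a truth field at historical observation locations and add
    zero-mean random observational noise (illustrative sketch).

    truth_field, obs_std : 2-D arrays on the analysis grid
    sampled_mask         : boolean 2-D array, True where history has data
    Returns the pseudo observations and their grid indices.
    """
    rng = np.random.default_rng(rng)
    iy, ix = np.nonzero(sampled_mask)                 # sampled locations
    noise = rng.normal(0.0, obs_std[iy, ix])          # observational error
    return truth_field[iy, ix] + noise, (iy, ix)
```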
Three different mapping methods were applied separately for comparison (note that not all available mapping methods were compared):
The Gmean method uses an OHC calculated from the available data to represent the global OHC, implicitly assuming that the OHC in data gaps is equal to the mean OHC in the available-data region. This method has been widely used in previous studies such as Lyman et al. (2010) and Palmer et al. (2007).
The parameterized (PARAM) method is a group of methods that use a parameterized error covariance (based on the spatial distance between the analyzed grid and the observations) and climatology (zero anomaly) as the first-guess field. Several major data centers have employed similar methods (e.g., Good et al. 2013; Ishii et al. 2003; Levitus et al. 2012). In this study, we adopted the method presented in Levitus et al. (2012).
The proposed method in this study (referred to as EnOI-DE/CMIP5 hereafter).
To test the performance of the three methods outlined above, the following two experiments were constructed by setting two different truth fields:
In the “truth ensemble” experiment, the ensemble mean of the CMIP5 model outputs was used as the truth field. Because the first-guess field is the CMIP5 ensemble mean, this experiment showed the performance of the mapping methods when the first-guess field is exactly known.
In the “truth select” experiment, one of the model outputs was randomly selected as the truth field. This experiment indicated the impact of a slightly biased first-guess field, because the selected model has a different mean state than the ensemble mean of the CMIP5 models. We used the Hadley Centre Coupled Model, version 3, as an example for better illustration, but the use of the other models would have led to the same conclusion.
The monthly and global means of the 10-m (representative of SST) and 0–700-m-averaged temperature anomalies (representative of OHC) from 1940 to 2014 when the first-guess field was exact (the truth ensemble experiment) are presented in Fig. 5. Both the SST and OHC reconstructed by the EnOI-DE/CMIP5 mapping exactly match the truth field, and the root-mean-square error (RMSE) is close to zero (Fig. 6), indicating that the new method is capable of reconstructing the historical ocean change when the first-guess and error covariance fields are well represented by the CMIP5 models. However, the historical SST and OHC time series from the PARAM mapping strongly revert to zero at earlier times, especially prior to 1970, mainly because large data gaps are present in the early era, which have been mostly filled by the first-guess field (zero anomaly). In addition, the Gmean approach results in a larger increase in the SST and OHC because the observations prior to 2000 were mainly from the Northern Hemisphere, which had experienced stronger warming than the Southern Hemisphere (Cheng and Zhu 2014a).
When the first-guess field was slightly biased (i.e., the truth select experiment), the proposed mapping method also reconstructed the truth of the SST and 0–700-m-averaged temperature anomaly well, as shown in Fig. 5, especially after 1955. A significant improvement was found when using the new method over both the Gmean-mapping and PARAM-mapping methods because the new method resulted in a smaller mean error and RMSE, as shown in Fig. 6. However, pre-1955, the analyzed field slightly shifted toward the CMIP5 ensemble mean. This confirms the analyses in the previous section, which indicate that the new method confers some trust on the first-guess field when the data coverage is extremely sparse. Therefore, the performance of the new method pre-1960 might depend partly on how accurately the CMIP5 model ensemble represents the long-term temperature change. This error could either under- or overestimate the long-term trends of the historical ocean subsurface temperature change.
We included an additional experiment to quantify the impact of the first-guess field on the estimate of the long-term trend of the SST and the 0–700-m temperature change based on the truth select framework. We randomly selected 10 models as truth fields and then computed the long-term trends of the first guess (CMIP5 ensemble mean), the truth, and the reconstructed field. Figure 7 shows the percent error of the long-term trend for both the first-guess and the reconstructed field compared with the truth. Generally, the use of the proposed method could reduce the trend error to less than 20% from 1940 to 2014, 10% from 1970 to 2014, and 5% from 1990 to 2014, even when the trend error of the first-guess fields ranged from −60% to 80%. These results indicate that the new technique could capture long-term trends that were significantly different from the first guess. A worse first-guess field (with a larger error) could lead to larger trend errors for the reconstructed field. However, according to Cheng et al. (2015b), the OHC trend of the CMIP5 ensemble mean over the 1970–2005 period was only 12% weaker than their observation-based estimate, suggesting that the multimodel ensemble mean is a reasonable choice for the first-guess field (the trend error will be less than 10%). Furthermore, a slightly smaller error was found for the trend at 10 m than for the trend at 0–700 m, especially before 1960, because of better data coverage in the upper ocean than in the deeper ocean.
6. Assessment against withheld observations
The subsample tests and idealized tests indicated the potential of the EnOI-DE/CMIP5 method to improve the reconstruction of the temperature fields from pseudo-observations. In this section, we further evaluated the proposed approach by applying the mapping method to a portion of the observations (80%) and using the remaining data (20%) for validation. To perform the subdivision, we randomly divided the observation dataset in each month into five subsets, each containing 20% of the data. Five different tests were performed, each using one subset for validation and the remaining four for reconstruction. In this way, across the five tests, every observation could be used for validation.
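The five-subset withholding scheme can be sketched in a few lines; the following Python snippet is illustrative only (the function name and interface are our own, not part of the study's code):

```python
import numpy as np

def make_withholding_folds(n_obs, n_folds=5, seed=0):
    """Randomly partition observation indices into roughly equal folds.

    Mirrors the withdrawn-data test: each month's observations are split
    into five subsets of ~20%; each of the five tests withholds one subset
    for validation and reconstructs the field from the remaining four.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_obs)
    return np.array_split(shuffled, n_folds)

# Example: 1000 profiles in one month -> five disjoint validation subsets
folds = make_withholding_folds(1000)
```

Running the reconstruction five times, once per withheld fold, lets every observation serve as validation data exactly once.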
Figure 8 presents the monthly time series of the mean error and RMSE of the analyzed fields with respect to the withdrawn data (an average over the five tests). For the global average, the new method again resulted in a smaller error than the other two methods, reducing the RMSE by 0.24°C (0.38°C) compared with Gmean mapping and by 0.16°C (0.14°C) compared with the PARAM method at 0–700 m (at 10-m depth).
More detailed information was obtained by evaluating the geographical distribution of the mean error and the RMSE between the analyzed field and the withdrawn data when comparing the following three mapping methods: EnOI-DE/CMIP5 (Figs. 9a,b), Gmean (Figs. 9c,d), and PARAM (Figs. 9e,f). Generally, all of the methods led to smaller mean errors and RMSE in low latitudes within 30°S–30°N than in higher latitudes and larger errors near the boundary current system because larger spatial and temporal mesoscale variabilities exist in the boundary current system, which are difficult to reconstruct. However, EnOI-DE/CMIP5 mapping resulted in much smaller errors than the other two methods over the global ocean. Compared with the PARAM method, better reconstruction of the temperature field was found for the Kuroshio and Gulf Stream with the proposed method. Those regions have been well sampled by historical ship-based observation systems, so this improvement indicates the effect of the covariance field. A globally consistent covariance (as in PARAM), which assumes the same spatial correlation over the global ocean, might not be accurate enough to represent the anisotropy of the covariance in those regions. By contrast, the models have some capability to simulate the general ocean circulation and could provide a better representation of the covariance for the boundary current system. In the same way, the proposed method also showed a smaller error than PARAM in the ACC regions. Furthermore, the Gmean method resulted in much larger errors in middle and high latitudes, indicating a key limitation of the traditional representative method. Because different temporal variations are present at different locations, filling the data gaps with the mean OHC in well-sampled regions led to the largest errors.
The results of the data withdrawal test again confirmed that the proposed method can reconstruct the historical temperature change. However, this test is also limited because the withdrawn data come mostly from the well-sampled regions, so it mainly reveals the performance of the mapping methods in the data-rich regions.
7. Assessment with individual datasets
After the three different evaluations shown in the previous sections, the improvement of the proposed mapping method for reconstructing subsurface temperature fields is evident. Therefore, we applied the approach to the entire dataset and obtained the analyzed field of the ocean subsurface temperature. In this section, additional assessments were made by focusing on several different key metrics of the climate system because it is also important to investigate whether the new dataset is capable of representing the mean state and historical ocean variations on various time scales, such as a long-term trend, interannual variability, and decadal variability. We note that the assessments are mainly made for the sea surface temperature field because no independent subsurface dataset exists. However, these assessments also have important implications for the performance of the reconstruction of the ocean subsurface.
a. Reconstruction of the ocean climatology
Although the ultimate goal of this study is to provide a complete subsurface temperature anomaly field for climate change studies, it is also important to examine the full temperature field in a first evaluation. The 0–700-m average temperature during 2005–12, based on the fields reconstructed in this study, is presented in Fig. 10 and compared with the climatologies in World Ocean Atlas 2013 (WOA13; Locarnini et al. 2013). WOA13 is a mature and widely used ocean climatology based on an objective analysis; its most important recent development was increased spatial resolution, with two resolutions presented (1° × 1° and ¼° × ¼°). Our 0–700-m average temperature climatology shows a nearly identical pattern to WOA13 at 1° × 1° resolution, indicating a warm pool at the western boundary within 10°–30°N and 10°–30°S in each ocean basin. Although the 0–700-m average temperatures show small seasonal variability because of the large heat capacity of the ocean, ocean warming in the Northern (Southern) Hemisphere is still detectable in summer (winter). Furthermore, the WOA13 climatology at ¼° × ¼° resolution reveals a pattern similar to the two climatologies discussed above but with more mesoscale signals, implying that this study could potentially be improved in the future by increasing the resolution to better represent mesoscale eddies.
b. Reconstruction of SST change
The sea surface temperature (SST) is an important variable associated with climate change and has been carefully analyzed at various data centers (Kennedy et al. 2011; Rayner et al. 2003; Cowtan and Way 2014). SST variability at different time scales is well documented in the literature because it plays a crucial role in air–sea interaction and dominates climate feedbacks related to global warming (i.e., the water cycle). Furthermore, SST is measured by a different observation system (ship hulls, buoys, and insulated buckets) rather than by Argo or XBT devices; therefore, the SST data can be considered an independent dataset for the validation of our new dataset. In this section, we compared our analyzed field at a depth of 10 m with an independent and fully evaluated dataset, the Hadley Centre SST, version 3 (HadSST3), dataset from the Met Office (Kennedy et al. 2011). Figure 11 shows the monthly time series of the global SST change from 1940 to 2014 in the reconstructed field compared with HadSST3. The interannual variability of the SST is nearly identical in both datasets, suggesting that the SST field is well represented by the new approach. Regarding long-term changes, the EnOI-DE/CMIP5 method results in a linear trend of approximately 0.0069°C yr−1 from 1940 to 2014 and approximately 0.0127°C yr−1 from 1970 to 2014, a slightly stronger trend than HadSST3 from 1940 to 2014 (~0.0057°C yr−1) and a near-identical trend from 1970 to 2014 (~0.0128°C yr−1). These differences can be attributed to 1) the difference in the land mask used by the two groups; 2) the difference in the reconstruction methods, in particular that our results might revert slightly toward the first-guess field pre-1960; and 3) uncertainties in both the sea surface temperature dataset (HadSST3) and the near-surface dataset (this study).
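For reference, trends such as the ~0.0069°C yr−1 value above are ordinarily obtained from a least-squares fit to the monthly series; the following is a minimal sketch on synthetic data, not the actual reconstruction:

```python
import numpy as np

def linear_trend_per_year(monthly_series):
    """Least-squares linear trend of a monthly time series, in units per year.

    Illustrative of the standard trend calculation; not the paper's code.
    """
    t_years = np.arange(len(monthly_series)) / 12.0
    slope, intercept = np.polyfit(t_years, monthly_series, 1)
    return slope

# Synthetic 75-yr series warming at 0.01 degC/yr with a seasonal cycle
t = np.arange(75 * 12)
series = 0.01 * t / 12.0 + 0.1 * np.sin(2 * np.pi * t / 12)
trend = linear_trend_per_year(series)
```

With complete years of data, the seasonal cycle contributes almost nothing to the fitted slope, so the recovered trend is close to the imposed 0.01°C yr−1.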
The correlation of SST change in each 1° × 1° grid cell between the reconstructed dataset and the HadISST dataset (Rayner et al. 2003) is shown in Fig. 12, where the long-term linear trend and the seasonal cycle in each grid cell have been removed. The results show a significant positive correlation within 30°S–60°N: R > 0.5 in general from 1940 to 2014 and R > 0.7 from 1970 to 2014. However, a small or even negative correlation is apparent in the Southern Ocean (60°–30°S), owing to the lack of observations for both the subsurface and sea surface temperatures (i.e., HadISST).
c. Reconstruction of interannual variability (ENSO)
ENSO is the most important interannual signal in the climate system; it can affect weather conditions over the entire globe through teleconnection mechanisms. As indicated by Palmer and McNeall (2014) and Taylor et al. (2012), the timing of unforced variability such as ENSO in the models matches observations only by coincidence. Thus, it remains to be seen whether the temperature field obtained with the EnOI-DE/CMIP5 method, which draws on CMIP5 data, represents the correct timing of unforced variability such as ENSO. Here, ENSO is indicated by the Niño-3.4 index.
The global SST time series presented in section 7b already showed that the interannual variability in the reconstructed SST field agrees with the observations. Figure 11b presents a further calculation of the Niño-3.4 index from the reconstructed field compared with the oceanic Niño index (ONI) provided by NOAA. The two indexes are almost identical (R > 0.8), indicating that the timing of ENSO variability is well represented by the reconstructed field. However, we note that a reconstruction based on a temperature anomaly field, obtained after subtracting 12 monthly climatologies, may not capture ENSO without bias, because ENSO is phase-locked to the seasonal cycle (typically peaking in winter) and asymmetric between its positive and negative phases. This leaves residual ENSO signals in the monthly climatology, which can potentially bias the representation of ENSO in the anomaly field. How well the reconstructed field represents ENSO compared with existing methods requires a more comprehensive study, which is beyond the scope of this work.
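The Niño-3.4 index is, in essence, an area-weighted SST anomaly average over 5°S–5°N, 170°–120°W; a minimal sketch follows (grid layout and function name are our own assumptions, and the ONI additionally applies a 3-month running mean, omitted here):

```python
import numpy as np

def nino34_index(sst_anom, lats, lons):
    """Area-weighted mean SST anomaly over the Nino-3.4 box (5S-5N, 170W-120W).

    sst_anom: 2D array (lat, lon) of anomalies for one month;
    lats in degrees north, lons in degrees east (0-360 convention).
    """
    lat_mask = (lats >= -5) & (lats <= 5)
    lon_mask = (lons >= 190) & (lons <= 240)  # 170W-120W in 0-360 longitudes
    box = sst_anom[np.ix_(lat_mask, lon_mask)]
    # Weight each row by cos(latitude) to account for grid-cell area
    weights = np.cos(np.deg2rad(lats[lat_mask]))[:, None] * np.ones(lon_mask.sum())
    return np.average(box, weights=weights)

# A uniform +1.5 degC anomaly field should yield an index of exactly 1.5
lats = np.arange(-10.0, 11.0, 5.0)
lons = np.arange(180.0, 251.0, 10.0)
index = nino34_index(np.full((lats.size, lons.size), 1.5), lats, lons)
```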
d. Reconstruction of decadal variability (PDO)
Decadal variability is another important feature of the climate system. The interdecadal Pacific oscillation (IPO) is one of the predominant signals of climate variability on decadal scales (Zhang et al. 1997) and is responsible for the recent “climate hiatus” documented in many recent studies (Meehl et al. 2011; Kosaka and Xie 2013; Meehl et al. 2013; Trenberth and Fasullo 2013; England et al. 2014; Trenberth et al. 2014b; Cheng et al. 2015a; Trenberth 2015). Therefore, we calculated the first empirical orthogonal function (EOF) mode of the Pacific SST field within 30°S–45°N. The spatial pattern of the first EOF mode, shown in Fig. 13a, displays a horseshoe shape with a maximum positive signal in the low-latitude eastern Pacific Ocean and two negative centers at middle latitudes in both hemispheres. This pattern is well documented in the literature and reveals the low-frequency air–sea interaction mode in the Pacific Ocean. The time series of this mode, presented in Fig. 13b, shows a negative phase during the 1950–75 and 1998–2014 periods and a positive phase during the 1976–97 period. For comparison, the IPO index provided by the Met Office Hadley Centre (Folland et al. 2002) and the Pacific decadal oscillation (PDO) index provided by Zhang et al. (1997) are also shown in Fig. 13b. The three time series are consistent with each other, indicating that the IPO (or PDO) signal is reconstructed well by the new approach.
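An EOF calculation of this kind is commonly performed via a singular value decomposition of the space–time anomaly matrix; the following generic sketch (our own illustration, with detrending and latitude weighting omitted for brevity) shows the idea:

```python
import numpy as np

def first_eof(field_anom):
    """First EOF of a (time, space) anomaly matrix via SVD.

    Returns the leading spatial pattern and its principal-component
    time series. Generic sketch; sign of the mode is arbitrary.
    """
    # Remove the time mean at each grid point before decomposing
    X = field_anom - field_anom.mean(axis=0)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    pattern = Vt[0]           # leading spatial mode
    pc = U[:, 0] * s[0]       # associated time series
    return pattern, pc
```

For a field dominated by one low-frequency mode, the recovered principal component tracks that mode's time series up to an arbitrary sign.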
8. The reconstructed field and its uncertainties
Based on the evaluations outlined in the previous sections, we found that the EnOI-DE/CMIP5 mapping method shows robust improvement in the reconstruction of the ocean surface and subsurface temperature fields at different time scales. As indicated in section 3, the uncertainties of the reconstructed field can also be assessed, as represented by the spread of the updated ensemble members.
For example, Fig. 11 displays the SST time series of the ensemble members with their standard deviation in pink. Larger errors are apparent at earlier times: ±0.2°C in 1960 compared with ±0.05°C in 2014. The reduction of these errors might be tied to both the increase in data coverage and improvements in the instrument quality of the observations. Moreover, the uncertainty was smaller during 1999–2005 than during the other periods because the 1999–2005 climatology was used: the globally averaged anomalies for the different models were all approximately zero during this period, which led to a smaller analysis error for globally averaged metrics such as the global SST.
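Summarizing the updated ensemble by its mean and standard deviation, as in the pink uncertainty bands of Fig. 11, can be sketched as follows (an illustrative helper, not the authors' code):

```python
import numpy as np

def ensemble_mean_and_spread(members):
    """Ensemble mean and 1-sigma spread across members.

    members: array of shape (n_members, n_times); the spread serves
    as the uncertainty estimate of the analyzed time series.
    """
    members = np.asarray(members, dtype=float)
    return members.mean(axis=0), members.std(axis=0, ddof=1)

# Three toy members at two time steps
mean, spread = ensemble_mean_and_spread([[1, 2], [3, 4], [5, 6]])
```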
Similarly, the 0–700-m-averaged temperature change (representative of the OHC) was calculated from 1940 to 2014 using the proposed method together with the 40 (31 after 2006) ensemble members (gray curves) and the standard deviation (pink curves) as shown in Fig. 14. This estimate can be regarded as another independent OHC estimate compared with the existing estimates presented in Ishii et al. (2003), Willis et al. (2004), Smith and Murphy (2007), Domingues et al. (2008), Palmer and Haines (2009), Roemmich and Gilson (2009), Levitus et al. (2012), Good et al. (2013), and Cheng et al. (2015b). The new EnOI-DE/CMIP5 method resulted in a long-term linear trend of 0.0034 ± 0.0010°C yr−1 (0.20 ± 0.07 W m−2 averaged over the global surface) from 1940 to 2014 and 0.0052 ± 0.0009°C yr−1 (0.31 ± 0.06 W m−2) from 1970 to 2014. The three other independent estimates (Cheng et al. 2015b; Levitus et al. 2012; Balmaseda et al. 2013) are also shown in Fig. 14. The new estimate from this study showed a slightly weaker OHC trend compared with Cheng et al. (2015b; ~0.35 ± 0.12 W m−2) from 1970 to 2014 and Balmaseda et al. (2013; ~0.30–0.34 W m−2) from 1970 to 2009. However, it showed a much stronger trend from 1970 to 2014 than Levitus et al. (2012; ~0.24 W m−2). As indicated by the tests in this study, the traditional parameterized methods (i.e., Ishii et al. 2003; Smith and Murphy 2007; Levitus et al. 2012) most likely underestimate the long-term trend because of the choice of the first-guess field.
Moreover, the estimates by Cheng et al. (2015b), NOAA/NODC, and this study show very similar interannual oscillations, although their magnitudes are not consistent. The interannual variation of the upper OHC is linked to ENSO, as investigated in Roemmich and Gilson (2011) using Argo data since 2004. Moreover, the strong OHC shift from 2001 to 2003 in Levitus et al. (2012) was shown to be partly (by nearly half) spurious, caused by the change in the observation system from ship-based instruments to Argo (Cheng and Zhu 2014a). As indicated in Fig. 14, the net temperature change within 2001–03 is smaller by about half in both Cheng et al. (2015b) and this study than in Levitus et al. (2012). Furthermore, the OHC changes forced by three major volcanic eruptions are apparent in the reconstructed field: Mount Agung in 1963, El Chichón in March–April 1982, and Mount Pinatubo in June 1991.
The calculation of the geographical distribution of the long-term change in the ocean heat content is also valuable. Figure 15 shows the linear trend of the temperature change at 10 m from 1970 to 2014 compared with that based on HadISST, and the zonal average temperature change at each depth from 1 to 700 m. The sea surface temperature field shows large-scale warming over the global ocean, indicating signals of anthropogenic forcing. More rapid warming of the poleward flowing western boundary currents such as the East Australian Current and the Kuroshio can be detected (Wu et al. 2012).
The zonal trend of temperature change shown in Fig. 15c reveals the modification of the ocean circulation and the downward transport of heat. The large-scale warming from 10° to 40°N and from 20° to 50°S indicates the downward transport of warming by the subtropical cells (Levitus et al. 2009; Rhein et al. 2013) and is consistent with the poleward displacement of the mean temperature field. The stronger warming within 50°–40°S might be linked to the poleward displacement of the ACC and the southern limb of the subtropical gyres, as illustrated by Gille (2008) and Alory et al. (2007). The weak warming (or slight cooling) at depth between 20°S and 10°N has also been shown in previous studies (i.e., Rhein et al. 2013) and is consistent with a southward shift of cooler water near the equator, probably because of the strengthening of the shallow subtropical cell driven by intensifying tropical trade winds since the 1990s (England et al. 2014). The cooling of the Southern Ocean is apparent in the upper 100 m, consistent with the recent increase in sea ice coverage.
9. Concluding remarks
In the latest IPCC report (Rhein et al. 2013), the available upper-ocean heat content estimates from 1970 to 2010 show large differences, ranging from 74 TW (Smith and Murphy 2007) to 137 TW (Domingues et al. 2008), implying a large uncertainty in the calculation. The community has made a great effort to reduce the errors in the OHC calculation, as documented in the literature. Among these efforts are the removal of XBT biases (Abraham et al. 2014; Cowley et al. 2013; Levitus et al. 2009; Schwalbach et al. 2014; Shepard et al. 2014; Cheng et al. 2016), the selection of an appropriate climatology (Lyman and Johnson 2014; Cheng and Zhu 2015), the removal of errors due to insufficient vertical resolution of the data (Cheng and Zhu 2014b), and the proposal of new mapping methods (Lyman et al. 2010; Levitus et al. 2012). In this study, we have focused on a new mapping strategy motivated by two deficiencies in some of the available methods: 1) the choice of the climatology (zero anomaly) as the first guess leads to an overreversion of the OHC estimate toward zero, and 2) the parameterized error covariance is empirical and might not work well in some regions, such as the boundary current regions. Accordingly, our new strategy uses an ensemble of CMIP5 models to provide the first-guess field and error covariance and shows better performance in reconstructing the historical ocean subsurface temperature field.
However, we do not claim that this method is the best because it is still unclear whether it can perfectly fill the data gaps in the historical dataset. The difficulty is that no independent data for validation exist, especially for the ocean subsurface. A potential strategy to fully evaluate this method is the use of several different synthetic datasets, such as high-resolution models, sea level data, etc., which requires further effort. Furthermore, comprehensive and international collaborations are required in the future to fully understand the performance of different mapping methods.
It is worth noting that a potential problem with the new method is that the temperature field (and global OHC) from the earlier eras (i.e., prior to 1960) based on the EnOI-DE/CMIP5 method might revert slightly toward the ensemble mean of the CMIP5 model outputs. Therefore, the accuracy of this method might depend on the performance of the CMIP5 models, and future improvements of the models will feed back positively into the reconstruction of the historical subsurface temperature field.
It is also noteworthy that further applications of the proposed strategy include the reconstruction of the deeper ocean temperature field below 700 m because increasingly more evidence suggests the importance of the deeper ocean in Earth’s energy budget (Balmaseda et al. 2013; Palmer et al. 2011; Purkey and Johnson 2010; Trenberth et al. 2014a). However, the difficulty is that deeper ocean measurements are even sparser than those for the upper ocean because most of the XBT instruments have a maximum depth of 700–750 m. Another possible application is the reconstruction of the ocean subsurface salinity field, which plays an important role in monitoring the sea level rise (Church and White 2011; Church et al. 2011; Durack et al. 2014b) and the water cycle (Durack et al. 2012).
This work is supported by the Natural Science Foundation of China (41476016 and 41276027), “Structures, Variability and Climatic Impacts of Ocean Circulation and Warm Pool in the Tropical Pacific Ocean” of National Basic Research Program of China (2012CB417404), and Chinese Academy of Sciences’ project “Western Pacific Ocean System: Structure, Dynamics and Consequences” (XDA11010405). The authors acknowledge NOAA/NODC for decades of effort in collecting and quality controlling the historical subsurface data, which was the data source used in our study. The reconstructed dataset is available at http://220.127.116.11/cheng/. We also acknowledge the Program for Climate Model Diagnosis and Intercomparison (PCMDI) for archiving the CMIP data.