In agreement with observations, Earth system models participating in phase 5 of the Coupled Model Intercomparison Project (CMIP5) simulate a decline in September Arctic sea ice extent (SIE) over the past decades. However, the spread in their twenty-first-century SIE projections is large and the timing of the first ice-free Arctic summer ranges from 2020 to beyond 2100. The uncertainties arise from three sources (internal variability, model uncertainty, and scenario uncertainty), which are quantified in this study for projections of SIE. The goal is to narrow uncertainties by applying multiple diagnostic ensemble regression (MDER). MDER links future projections of sea ice extent to processes relevant to its simulation under present-day conditions using data covering the past 40 years. With this method, we can reduce model uncertainty in projections of SIE for the period 2020–44 by 30%–50% (0.8–1.3 million km²). Compared to the unweighted multimodel mean, the MDER-weighted mean projects an SIE that is about 20% smaller and a near-disappearance of Arctic sea ice that occurs more than a decade earlier for a high–greenhouse gas scenario. We also show that two different methods of estimating internal variability in SIE differ by 1 million km². Regardless, the total uncertainties in the SIE projections remain large (up to 3.5 million km², with irreducible internal variability contributing 30%), so that a precise estimate of the timing of an ice-free Arctic proves impossible. We conclude that unweighted CMIP5 multimodel-mean projections of Arctic SIE are too optimistic and that mitigation strategies to reduce Arctic warming need to be intensified.
Observations show that the ongoing warming of Earth caused the September Arctic sea ice extent (SIE) to shrink by almost 50% since the 1970s (Stroeve et al. 2012a). But not only has the ice area decreased, the sea ice has also become thinner and younger (i.e., the amount of multiyear ice has decreased rapidly; Fowler et al. 2004; Maslanik et al. 2011); about 70% of the winter sea ice is now seasonal ice (i.e., first-year ice) (Kwok 2018). Thinner ice melts out more easily in summer, opening more ice-free areas and thus accelerating the Arctic warming (Holland and Bitz 2003; Stroeve et al. 2012a). Earth system models (ESMs) participating in phase 5 of the Coupled Model Intercomparison Project (CMIP5; Taylor et al. 2012) simulate a further decrease in sea ice throughout the twenty-first century in all future scenarios that keep atmospheric greenhouse gas concentrations at current or higher levels. In the RCP8.5 scenario, where atmospheric CO2 concentrations more than double by the end of the twenty-first century compared to 2000 (Riahi et al. 2011), almost all models project the Arctic to become ice-free in summer before 2100.
The year of near-disappearance of summer Arctic sea ice (YOD) is defined as the first year of a series of five consecutive years in which the minimum SIE drops below 1 million km2 (Wang and Overland 2009). The large spread in the projections stems from internal variability, different model structures (model uncertainty) and the uncertainty in future greenhouse gas scenarios (Kay et al. 2011; Stroeve et al. 2012b; Liu et al. 2013; Swart et al. 2015). Different methods have been applied to reduce uncertainties in sea ice projections, like large ensemble simulations, model selection, weighting, or constraining projections with observations (e.g., Massonnet et al. 2012; Liu et al. 2013; Melia et al. 2015; Jahn et al. 2016; Knutti et al. 2017; Screen and Williamson 2017; Sigmond et al. 2018). Recent studies highlight the positive effects on Arctic sea ice of restricting the global warming to 1.5°C (Screen and Williamson 2017; Jahn 2018; Screen 2018; Sigmond et al. 2018). For example, Jahn (2018) and Sigmond et al. (2018) agree that the probability of an ice-free Arctic summer under 1.5°C stabilized warming is around 2%, as opposed to 20%–35% at 2.0°C warming.
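The YOD definition given at the start of this paragraph amounts to a scan for the first run of five consecutive years below the threshold. The following Python sketch is illustrative only: the function name, its arguments, and any input series are hypothetical and are not taken from the study's code.

```python
def year_of_near_disappearance(years, sie, threshold=1.0, run_length=5):
    """Return the first year of the first run of `run_length` consecutive
    years in which SIE (million km2) is below `threshold`, or None."""
    run = 0  # length of the current run of below-threshold years
    for idx, value in enumerate(sie):
        run = run + 1 if value < threshold else 0
        if run == run_length:
            # The run started run_length - 1 entries before the current one.
            return years[idx - run_length + 1]
    return None
```

Requiring a run of consecutive years rather than a single year below the threshold makes the estimate less sensitive to one exceptionally low September.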
Following, for example, Melia et al. (2015), Lique et al. (2016), and Hodson et al. (2012), we use the method of Hawkins and Sutton (2009, 2011) to separate total multimodel projection uncertainty into its three components: internal variability, model uncertainty, and scenario uncertainty. The goal of this study is to reduce model uncertainty in projections of Arctic SIE and to give a more precise estimate of YOD. For this we use a set of 29 ESMs from 17 different institutes participating in CMIP5. Additionally, we compare the internal variability component from the Hawkins and Sutton (2009) method to an estimate that is based on the spread among a single-model large ensemble, similar to Jahn et al. (2016).
To narrow model uncertainty, we use the multiple diagnostic ensemble regression (MDER) method developed by Karpechko et al. (2013). This method uses statistical relationships between the projected future target variable (here September Arctic SIE) and historical model performance in terms of different process-oriented diagnostics. The results obtained are cross-validated to test for spurious relationships using a pseudoreality approach. MDER has proven its potential to reduce CMIP5 multimodel projection uncertainties in two previous studies: the spread in the projected return dates of Antarctic total column ozone could be reduced by over a decade (Karpechko et al. 2013) and the near-future position of the summer austral jet stream could be bias-corrected by 1.5° southward compared to the unweighted multimodel mean (Wenzel et al. 2016).
This paper is structured in the following way: section 2 describes the MDER method and how the three types of uncertainty can be quantified. Additionally, the diagnostics used in this study are introduced and the model simulations and observations are described. In section 3, we apply MDER to constrain SIE projections and narrow model uncertainty. We also give our estimate of YOD for a scenario with a high greenhouse gas concentration. Section 4 closes with a summary and discussion.
a. Multiple Diagnostic Ensemble Regression
MDER was developed by Karpechko et al. (2013) and implemented into the Earth System Model Evaluation Tool, version 1.0 (ESMValTool; Eyring et al. 2016), by Wenzel et al. (2016). The ESMValTool is a community-developed open source software package aiming to facilitate the complex evaluation of ESMs. For this study, the ESMValTool was extended by additional diagnostics related to sea ice and MDER was adapted for applications to Arctic sea ice.
MDER is based on the correlation between selected process-oriented diagnostics applied to historical or present-day periods for which observations are available, and a future target variable. An iterative stepwise regression algorithm based on von Storch and Zwiers (1999) takes a set of preselected diagnostics as input and selects a subset of these diagnostics to build a regression model that best predicts the future variable. The algorithm iteratively adds and removes diagnostics to and from the regression model until the regression sum of squares no longer increases significantly, based on an F test with a significance level p = 0.1. The final regression model is then the linear combination of the selected process-oriented diagnostics that best predicts the future variable and is of the form

$$y = \beta_0 + X\beta$$
where y is the estimated climate response (SIE), β0 and β are the multiple regression parameters with β being a column vector of the size of the number of the selected diagnostics, and X is the matrix of diagnostic values of the selected diagnostics. Using observational data for the selected diagnostics with the regression model yields a multidiagnostic constraint. MDER then calculates model weights based on this constraint by extending the formula derived by Bracegirdle and Stephenson (2012) for a case with a single diagnostic to multiple diagnostics:
where $N \equiv (\mathbf{1}^{T}\mathbf{1})^{-1}\mathbf{1}$ is a vector of a size equal to the number of models $n$, with all elements equal to $n^{-1}$, $X_0$ is the vector of observed diagnostics, and $\bar{X}$ is the vector of the multimodel-mean diagnostics. The weights are used to calculate a weighted multimodel mean from the CMIP5 model ensemble with the aim of reducing model uncertainty in the multimodel projection.
The diagnostic selection is one of the key strengths of MDER: compared to simply using all subjectively chosen diagnostics (e.g., Snape and Forster 2014; Knutti et al. 2017; Sanderson et al. 2017), the stepwise regression algorithm reduces redundancy among the selected diagnostics, that is, the danger of including multiple diagnostics that effectively describe the same process. Another potential danger in using all diagnostics instead of only the MDER-selected diagnostics lies in overfitting caused by spurious relationships between historical diagnostics and the future variable (Bracegirdle and Stephenson 2012). To test for spurious relationships, the MDER results are cross-validated in a pseudoreality approach: since observations of future variables are naturally unavailable, we select one model at a time as reference (= pseudoreality) and benchmark the other models against this reference model by calculating the root-mean-square error (RMSE) as a measure of prediction uncertainty. Since there is no preferred reference model, each model is taken as pseudoreality once and all of the remaining models are tested against it. The difference in RMSE between the unweighted multimodel mean (uMMM) and the MDER results reveals the potential of MDER to reduce uncertainty in the SIE projections and is a measure of the uncertainty of the MDER result.
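The pseudoreality cross-validation described above can be sketched as a leave-one-model-out loop. The Python sketch below is a simplification under stated assumptions: it uses a single historical diagnostic and an ordinary least-squares line instead of the full stepwise MDER regression, and all function names and inputs are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def rmse(squared_errors):
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def pseudoreality_rmse(hist, future):
    """Treat each model in turn as pseudoreality and predict its future value
    from the remaining models, once with their unweighted mean (uMMM) and
    once with a regression on the historical diagnostic (simplified stand-in
    for MDER). Returns (rmse_ummm, rmse_regression)."""
    err_ummm, err_reg = [], []
    for k in range(len(hist)):
        x = hist[:k] + hist[k + 1:]      # historical diagnostic, other models
        y = future[:k] + future[k + 1:]  # future target, other models
        ummm = sum(y) / len(y)
        b0, b1 = fit_line(x, y)
        # The pseudoreality's own historical value plays the role of the observation.
        prediction = b0 + b1 * hist[k]
        err_ummm.append((ummm - future[k]) ** 2)
        err_reg.append((prediction - future[k]) ** 2)
    return rmse(err_ummm), rmse(err_reg)
```

A regression RMSE that is clearly smaller than the uMMM RMSE indicates predictive skill that does not rely on the reference model having been part of the fit.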
MDER is based on the following assumptions that are similar to other regression-based approaches (Bracegirdle and Stephenson 2012):
• There is a linear relationship between the present-day mean state and trends and future change, which is similar in climate models and observations.
• The residuals from the regression fit are independently distributed.
• Climate model results and observations are interchangeable.
• The effects due to measurement errors are assumed to be negligible compared to other sources of uncertainty.
The target variable in this study is the 2020–44 September-mean Arctic sea ice extent. SIE is derived from the gridded variable sea ice concentration (SIC, also known as “sea ice area fraction”), which describes the area fraction of each ocean surface grid cell that is covered with sea ice. Sea ice extent is defined as the total area of all grid cells in which SIC ≥ 15%. Here, we define SIE as the September sea ice extent in the Arctic (60°–90°N). The annual minimum Arctic sea ice extent typically occurs in September and thus September mean sea ice quantities are commonly used in literature, for example in analyses of the timing of an ice-free Arctic (e.g., Massonnet et al. 2012; Jahn et al. 2016; Screen 2018; Sigmond et al. 2018).
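The SIE definition above (total area of all grid cells with SIC ≥ 15% within 60°–90°N) can be sketched in a few lines of Python. This is a minimal illustration on flattened per-cell lists: the function and argument names are hypothetical, and the handling of land or missing cells is left out.

```python
def sea_ice_extent(sic_percent, cell_area_km2, lat_deg,
                   threshold=15.0, lat_min=60.0):
    """Sum the area of all grid cells north of `lat_min` whose sea ice
    concentration is at least `threshold` percent.
    Returns the extent in million km2."""
    total_km2 = 0.0
    for sic, area, lat in zip(sic_percent, cell_area_km2, lat_deg):
        if lat >= lat_min and sic >= threshold:
            # The full cell area counts, regardless of how far SIC exceeds 15%.
            total_km2 += area
    return total_km2 / 1.0e6
```

Because each qualifying cell contributes its full area, extent is larger than sea ice area, which would weight each cell by its concentration.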
The set of process-oriented diagnostics preselected by the authors as input for the MDER method is listed in Table 1 and is based on published literature on processes that are known to influence sea ice concentration. The selection is not meant to be an exclusive list, but considers many of the variables that previous work suggests are important for sea ice evolution. All diagnostics are applied to (present-day) data for the same time period (1979–2012); see also section 2d.
The diagnostics are based on five variables. These include two different sea ice variables, historical SIE and sea ice thickness (SIT), which affect sea ice processes and projections (e.g., Laxon et al. 2003; Massonnet et al. 2018). To account for freezing and melting processes, two temperature variables are included to represent the thermal influence on the ice from above and below (e.g., Zhang et al. 2000; Weeks 2010): near-surface air temperature (TAS) and sea surface temperature (SST). Atmospheric surface pressure (PSL) is used as a proxy for the influence on ice drift due to atmospheric winds near the surface (e.g., Thorndike and Colony 1982; Spreen et al. 2011). For each of these variables, three different metrics are calculated over the whole historical time period (1979–2012): the climatological mean (indicated by _c), the trend (_t), and the interannual (“year-to-year”) variability (_i). The calculated diagnostic results from each model and observation/reanalysis are shown in the supplemental information (see Figs. S1–S5 in the online supplemental material).
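The three metrics computed per variable (_c, _t, _i) can be illustrated as follows. This Python sketch makes two assumptions that the text does not specify: the trend is the least-squares slope per year, and interannual variability is the standard deviation of the linearly detrended series. The function name and the returned keys are hypothetical.

```python
from statistics import fmean, pstdev

def diagnostics(years, values):
    """Return climatological mean ("c"), linear trend per year ("t"), and
    interannual variability ("i") of an annual time series.
    ASSUMPTION: "i" is the standard deviation of the detrended series."""
    mean_year = fmean(years)
    mean_value = fmean(values)
    sxx = sum((t - mean_year) ** 2 for t in years)
    sxy = sum((t - mean_year) * (v - mean_value) for t, v in zip(years, values))
    slope = sxy / sxx
    # Residuals around the fitted straight line.
    residuals = [v - (mean_value + slope * (t - mean_year))
                 for t, v in zip(years, values)]
    return {"c": mean_value, "t": slope, "i": pstdev(residuals)}
```

Applied to the 34 September values of 1979–2012 for one variable, this yields one value for each of that variable's three diagnostics.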
c. Uncertainty estimation
In addition to the MDER method, we use the method of Hawkins and Sutton (2009, 2011) to quantify different sources of uncertainty in the CMIP5 multimodel projections for the twenty-first-century SIE (2006–2100): internal variability, model uncertainty, and scenario uncertainty. Numerous studies have applied this method to Arctic variables such as CMIP3 summer Arctic sea ice extent (Lique et al. 2016), CMIP3 Arctic temperature and precipitation (Hodson et al. 2012), and CMIP5 September Arctic sea ice thickness (Melia et al. 2015). In the following, we describe the sources of uncertainty and how they can be isolated, following Hawkins and Sutton (2009, 2011). It is important to note that no ensemble averages are taken in the calculation of the uncertainty estimates (as opposed to the diagnostic calculations; see section 2d for details).
1) Internal variability
To quantify the contribution of internal variability to the overall uncertainty of the model ensemble, each SIE projection from all models is individually fit with a fourth-order polynomial over the time period 2006–2100, using the least squares method and creating a smooth fit. We can write the predictions X from each model m, for scenario s and year t, as

$$X_{m,s,t} = x_{m,s,t} + i_{m,s} + \varepsilon_{m,s,t}$$
where i is the reference value (year 2000 SIE), x the smooth fit, and ε the residual. The internal variability component V is computed from the multimodel mean of the variances of the residuals:

$$V = \frac{1}{N_m} \sum_{m} \sigma_{s,t}\left(\varepsilon_{m,s,t}\right)$$
where $\sigma_{s,t}$ is the variance across all scenarios and across time and $N_m$ is the number of models. Note that here, because the variance is computed over all time values, V is constant in time by design. This is a limitation of this method, especially since interannual variability has been shown to increase as the ice thins. The internal variability obtained with this method is compared in section 3b to a more recent method by Jahn et al. (2016), who used a large ensemble of a single ESM to estimate internal variability.
2) Model uncertainty
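Model uncertainty is estimated from the variance in the fits $x_{m,s,t}$ of each scenario. The multiscenario mean model uncertainty M(t) is calculated as follows:

$$M(t) = \frac{1}{N_s} \sum_{s} \sigma_{m}\left(x_{m,s,t}\right)$$

where $\sigma_{m}$ is the variance across models and $N_s$ is the number of scenarios.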
3) Scenario uncertainty
The CMIP5 projections are divided into sets of scenarios, each of which assumes a different amount of external forcing due to the emission of climate-relevant substances. More specifically, the scenarios represent four different representative concentration pathways (RCPs; van Vuuren et al. 2011) that assume a different radiative forcing (W m⁻²) by the end of 2100: RCP2.6, RCP4.5, RCP6.0, and RCP8.5. The scenario uncertainty S(t) is calculated from the variance $\sigma_{s}$ of the multimodel means over the RCP scenarios:

$$S(t) = \sigma_{s}\left(\frac{1}{N_m} \sum_{m} x_{m,s,t}\right)$$
For the estimate of scenario uncertainty, we only consider the scenarios RCP4.5 and RCP8.5, because the variable SIC is not available from all 29 models for RCP2.6 and 6.0, which were considered optional in CMIP5. This means that this study considers a medium and a high (“business-as-usual”) RCP scenario for the uncertainty estimates. However, we repeated the analysis with the 19 models for which output for RCP2.6 was available. The results of this additional analysis are described in section 3b.
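The three uncertainty components described in this section can be computed directly once the smooth fits and residuals are available. The Python sketch below assumes that the polynomial fitting has already been done and that `fit[m][s][t]` and `resid[m][s][t]` are nested lists indexed by model, scenario, and year. The function name, the data layout, and the use of the population variance are assumptions made for illustration, not details confirmed by the text.

```python
from statistics import fmean, pvariance

def decompose_uncertainty(fit, resid):
    """Return (V, M, S): internal variability V (constant in time), and
    model uncertainty M[t] and scenario uncertainty S[t] per year.
    ASSUMPTION: population variance; inputs are precomputed fits/residuals."""
    n_m, n_s, n_t = len(fit), len(fit[0]), len(fit[0][0])

    # V: multimodel mean of the residual variance pooled over scenarios and years.
    v = fmean([pvariance([e for scen in resid[m] for e in scen])
               for m in range(n_m)])

    m_unc, s_unc = [], []
    for t in range(n_t):
        # M(t): variance across models for each scenario, averaged over scenarios.
        m_unc.append(fmean([pvariance([fit[m][s][t] for m in range(n_m)])
                            for s in range(n_s)]))
        # S(t): variance across scenarios of the multimodel-mean fit.
        scen_means = [fmean([fit[m][s][t] for m in range(n_m)])
                      for s in range(n_s)]
        s_unc.append(pvariance(scen_means))
    return v, m_unc, s_unc
```

With only two scenarios, as in the main analysis, S(t) rests on the difference between two multimodel means, so it should be read as a rough estimate.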
d. Models, observations, and reanalyses
For this study, an ensemble of 29 CMIP5 models from 17 different institutes or modeling centers (Table 2) is used. We could not use all (~40) CMIP5 models because not all models provided the required output for all five variables and the two scenarios used in this study. Even though not all CMIP5 models are strictly speaking Earth system models, we will refer to them as such for simplicity.
The historical time period used with the diagnostics (1979–2012) is determined by two constraints. The earliest start is the year 1979 because reliable observations or reanalyses of the required variables are not available for earlier times. The final year is a compromise between extending the CMIP5 historical model experiments ending in 2005 as long as possible while keeping the impact of the RCP scenario chosen to extend the historical experiments to a minimum. Here, we extended the time period of the historical simulations up to the year 2012 using results from the corresponding RCP4.5 simulations. We did not use RCP8.5 results for extension of the historical runs since these were not available from all models and for all variables used. To minimize the influence of the actual scenario used for extension, the simulations were only extended until 2012 where differences between RCP4.5 and RCP8.5 are still small.
Our target period for future SIE is 2020–44. This 25-yr period is selected to 1) start sufficiently far from the end date of the historical experiments to increase the signal in SIE decline and 2) end before SIE from any model approaches zero, since correlations between historical diagnostics and future SIE may become spurious otherwise. Although we consider this target period as the most suitable period for the purpose of this study, we also tested the sensitivity of the results to the selected period. We found that using a longer period (2016–64) than the 25-yr period 2020–44 gives similar results, suggesting that our conclusions are not very sensitive to the exact target period. The target SIE projections used with the MDER regression algorithm are from RCP8.5 simulations.
For each model experiment, some models run different realizations (ensemble members) with slightly different initial conditions to sample internal variability. We selected only those ensemble members from each model that provide simultaneously all scenarios and variables that are used in this study. To account for the different ensemble sizes in the diagnostic calculations, we calculated for each model an ensemble average of the diagnostic values, which is then used in the further analyses. Thus, the multimodel metrics are not biased toward models with many ensemble members (see also Massonnet et al. 2012). Note that the ensemble average is taken after the diagnostic calculations of the 1979–2012 climatological mean, trends, and interannual variability (section 2b) since calculating the diagnostics on ensemble means would give incorrect estimates of, especially, interannual variability.
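The order of operations noted above (diagnostics first, ensemble average second) matters most for variability. The following Python sketch contrasts the two orders for a toy ensemble; the function names are hypothetical, and a plain standard deviation without detrending stands in for the interannual variability diagnostic.

```python
from statistics import fmean, pstdev

def variability_then_average(members):
    """Compute the variability of each ensemble member, then average
    (the order described in the text)."""
    return fmean([pstdev(series) for series in members])

def average_then_variability(members):
    """Average the members year by year first, then compute variability.
    Averaging cancels uncorrelated anomalies and damps the result."""
    ensemble_mean = [fmean(vals) for vals in zip(*members)]
    return pstdev(ensemble_mean)
```

For two members with exactly opposite anomalies, the first function returns the variability of a single member, whereas the second returns zero.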
The observations and reanalyses for each diagnostic are listed in Table 1 and are provided as monthly means. In the following, we briefly describe the five datasets used in this study.
We use satellite observations of SIC from the National Snow and Ice Data Center (NSIDC; Walsh et al. 2015). The product is available from 1978 to present and was processed with the NASA-Team retrieval algorithm (NT; Cavalieri et al. 1996) from data of Nimbus-7 SMMR and DMSP SSM/I and SSMIS passive microwave sensors. The spatial resolution is 25 km × 25 km.
Sea ice thickness reanalyses are taken from the Pan-Arctic Ice Ocean Modeling and Assimilation System (PIOMAS; Zhang and Rothrock 2003), a coupled ice–ocean model forced with National Centers for Environmental Prediction (NCEP) reanalyses and assimilating observations of sea ice concentration and sea surface temperature. Satellite observations of SIT are spatially and temporally limited (e.g., Kwok et al. 2009; Tilling et al. 2015). Since we need complete temporal and Arctic-wide coverage in this study, we use the PIOMAS reanalyses. PIOMAS was compared to SIT satellite observations and found to be a good estimate of the observed SIT in numerous studies (e.g., Lindsay and Zhang 2006; Schweiger et al. 2011; Laxon et al. 2013; Stroeve et al. 2014). PIOMAS reanalyses are commonly used in studies analyzing sea ice thickness (e.g., Melia et al. 2015; Dirkson et al. 2017; Labe et al. 2018).
The Hadley Centre Sea Ice and Sea Surface Temperature dataset (HadISST; Rayner et al. 2003) provides SST data. It is a global reanalysis product combining data from the Met Office Marine Data Bank (MDB), the Global Telecommunications System (GTS), and the Comprehensive Ocean–Atmosphere Dataset (COADS) and has a spatial resolution of 1° × 1°.
For TAS and PSL, we use the European Centre for Medium-Range Weather Forecasts interim reanalysis (ERA-Interim; Dee et al. 2011) data. The data assimilation system used is based on the Integrated Forecast System, cycle 31r2 (IFS-Cy31r2), and is available from 1979 with monthly updates and an approximate horizontal resolution of 80 km.
3. Application of MDER to Arctic sea ice projections
MDER calculates a regression model from the historical diagnostics that best predicts future SIE (section 3a). The model weights obtained with this regression model are then applied to twenty-first-century SIE projections to narrow model uncertainty (section 3b) and improve the predictions of YOD (section 3c).
a. Diagnostic selection and regression model
Figure 1 shows the absolute correlation coefficients of all CMIP5 historical diagnostics (1979–2012; see Table 1) with future SIE (2020–44). The climatological mean September Arctic sea ice extent (SIE_c) is the diagnostic with by far the highest correlation coefficient (r = 0.91) and smallest uncertainty range. This means that biases in the historical mean sea ice extent dominate the projections of near-term future sea ice extent: models that simulate a small (large) SIE in the past simulate a small (large) SIE in the near future. Nine out of the 15 diagnostics show a statistically significant correlation with future SIE. After SIE_c, the climatological mean sea ice thickness (SIT_c) shows the largest correlation to future SIE with r = 0.69. Thinner ice breaks up and melts out more easily in summer (e.g., Bitz and Roe 2004; Kwok 2018; Petty et al. 2018) and thus results in a smaller future summer ice extent, but this relationship has a significantly larger uncertainty than SIE_c.
The other diagnostics with statistically significant correlations with future SIE include historical ice trends and interannual variability, historical SST trends and variability, and climatological air temperature. This means that projections of SIE are in particular influenced by biases in past sea ice conditions and Arctic surface temperatures, but not so much by biases in Arctic wind patterns as estimated by the proxy surface pressure. The values for each diagnostic from models and observations are given in Figs. S1–S5, and the scatterplots in Figs. S6 and S7 show the correlations between all diagnostics and future SIE.
The linear combination of multiple diagnostics (parsimonious regression model) that best predicts future SIE is calculated from the pool of diagnostics listed in Table 1 by the iterative stepwise regression algorithm. MDER selected two diagnostics for the regression model, namely SIE_c and TAS_t, with the resulting regression model equation −2.99 + 1.16 × SIE_c − 2.97 × TAS_t (see Fig. 2). This means that future SIE is constrained by a linear combination of a bias correction in the historical simulations of SIE and the trend in Arctic surface temperature. The coefficient of determination of the regression is R² = 0.82.
Figure S1 in the supplemental material reveals that the unconstrained values of SIE_c from most models show a positive bias compared to the NSIDC-NT observations (negative constant in the regression model equation). The positive sign of the SIE_c term reflects the positive correlation between historical and future SIE, and the negative sign of the TAS_t term shows that models that simulate a strong positive temperature trend in the Arctic in the historical simulations tend to simulate a smaller SIE in the future. Most models show a negative bias in TAS_t, suggesting that the reduction in the constrained SIE is partly due to the TAS_t diagnostic. This is not surprising since Arctic temperatures have risen about twice as much as the global average (Bellucci et al. 2015), which can be partly explained by the loss of sea ice (Screen et al. 2013).
The selected diagnostics also show that the linear combination of diagnostics that best predicts the target variable does not necessarily contain the diagnostics with the highest correlation to the target variable, since the correlation coefficient between TAS_t and future SIE is not statistically significant (see Fig. 1). MDER has selected TAS_t as a predictor despite the weak relationship between TAS_t and future SIE because MDER is a stepwise approach, and a variable is added at a given step if it explains a significant fraction of residual variance from the previous step. In other words, the variance explained by a combination of SIE_c and TAS_t is significantly larger in terms of the F test than that explained by SIE_c only. Therefore, it is not necessary that there is a strong relation between TAS_t and future SIE because such a relation may be masked, for example, by a stronger relation between SIE_c and future SIE. In fact, a sensitivity test without TAS diagnostics resulted in a regression model that only included SIE_c and the strength of the constraint was roughly halved. This suggests that the two diagnostics have a roughly similar contribution to the constraint (not shown).
Applying the regression model equation to the observed SIE_c and TAS_t (blue dashed lines in Fig. 2) we can constrain the CMIP5 projections of SIE. This constraint reduces the projected multimodel climatological mean SIE (red dashed line) by 0.9 million km2 (i.e., from 4.25 to 3.35 million km2). This means that applying MDER to SIE results in an over 20% smaller Arctic sea ice extent between 2020 and 2044 compared to the unconstrained CMIP5 multimodel mean.
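The regression equation from this section and the quoted reduction can be checked with a short Python sketch. The coefficients are those stated in the text; the function names are hypothetical, and because the units of TAS_t and the observed diagnostic values are not given here, no observed inputs are assumed.

```python
def constrained_sie(sie_c, tas_t, b0=-2.99, b_sie=1.16, b_tas=-2.97):
    """Evaluate the regression model -2.99 + 1.16 * SIE_c - 2.97 * TAS_t.
    Inputs must use the same units as the diagnostics in the study
    (ASSUMPTION: SIE_c in million km2; the TAS_t units are not stated here)."""
    return b0 + b_sie * sie_c + b_tas * tas_t

def relative_reduction(unconstrained, constrained):
    """Fractional reduction of the constrained value relative to the
    unconstrained multimodel mean."""
    return (unconstrained - constrained) / unconstrained

# Values quoted in the text: 4.25 (unconstrained) and 3.35 (constrained) million km2.
reduction = relative_reduction(4.25, 3.35)
print(f"relative reduction: {reduction:.1%}")
```

The reduction of 0.9 million km² relative to 4.25 million km² is slightly more than one fifth, consistent with the "over 20%" stated above.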
To estimate the impact of internal variability on the 1979–2012 historical Arctic SIE and TAS trends used in our analysis (Kay et al. 2011; Swart and Fyfe 2013; Swart et al. 2015), Fig. 3 shows trend distributions that were calculated over the whole 34-yr time period from the results of the 29 CMIP5 models and from a large initial-condition ensemble obtained with the Community Earth System Model (CESM LE; Kay et al. 2015; see also section 3b; Jahn et al. 2016). The assumption is that the spread in large initial-condition ensembles (round-off level perturbation) represents the internal variability of the climate system within the context of a particular climate model. Comparing the standard deviation of the CMIP5 trends to the one obtained from the CESM LE gives an estimate of the contribution of internal variability to the spread in the CMIP5 SIE trends. For SIE, the standard deviation of the CESM LE trends (0.21 million km² decade⁻¹) is slightly smaller than that of the CMIP5 trends (0.27 million km² decade⁻¹), suggesting that internal variability is an important but not the only factor determining the spread in the CMIP5 SIE trends. The standard deviations in the TAS trends show a qualitatively similar behavior (Fig. 3, bottom).
Unfortunately, the method often used in literature to account for contribution of internal variability (Fyfe et al. 2013; Swart and Fyfe 2013; Swart et al. 2015) could not be applied here, since it requires models with multiple ensemble members. Here, we can use only one to very few ensemble members per model since we need the exact same ensemble members to be available for all diagnostics. For the five variables required here only one ensemble member is available for most of the models (see number of ensemble members from each model in Table 2).
b. Uncertainty estimation and the potential to narrow model uncertainty
To test if the regression model is overfitted and to investigate whether the MDER method actually gives a more precise estimate of future SIE than the unweighted multimodel mean, we cross-validate the results using a pseudoreality approach. This approach selects each model in turn as a reference by which to benchmark the other models (see section 2a). Figure 4 shows the RMSE for all pseudorealities considered, both for the unweighted multimodel-mean (uMMM) and the MDER results. The results show that $\mathrm{RMSE}_{\mathrm{MDER}}$ (0.93 million km²) is about 62% smaller than $\mathrm{RMSE}_{\mathrm{uMMM}}$ (2.48 million km²). Similarly, the 25th–75th percentiles of the error ensembles are more than halved from 1.8 million km² (uMMM) to 0.8 million km² (MDER). The RMSE of the uMMM prediction basically reflects the intermodel spread in the projections of the 2020–44 mean SIE and is largely influenced by cases where the pseudoreality is an outlier model. In contrast, with MDER we use the information of the historical SIE (1979–2012) from the pseudoreality (“reference”) model and use the other models to estimate the change in mean SIE between 1979–2012 and 2020–44. The reduction of RMSE in MDER in comparison to that in uMMM indicates a predictive skill of MDER and suggests that overfitting is not an issue in our calculations.
To assess the full potential of MDER for reducing the prediction uncertainty, we apply the model weights calculated by MDER to obtain a weighted multimodel mean. We hereby combine the MDER approach with the method introduced by Hawkins and Sutton (2009) that separates total prediction uncertainty into the three components internal variability, model uncertainty, and scenario uncertainty (see section 2c). Figure 5a shows time series of the three sources of uncertainty in unweighted SIE projections. The dominant source of uncertainty throughout the whole time period is model uncertainty, averaging around ±2.5 million km2, which is due to the large intermodel spread in SIE projections between different CMIP5 models (Kay et al. 2011; Stroeve et al. 2012b; Liu et al. 2013; Swart et al. 2015). Differences in SIE due to different RCP scenarios start to emerge after 2020 and become increasingly larger in the second half of the twenty-first century. Thus, the scenario uncertainty increases with time and becomes more important than internal variability after 2050. Scenario uncertainty, however, remains smaller than model uncertainty until at least 2100. It is important to note that only simulations of the scenarios RCP4.5 and RCP8.5 were used here, since not enough models have run the RCP2.6 and RCP6.0 scenarios (see section 2c). As a sensitivity test, the analyses have been repeated with 19 CMIP5 models for which the three scenarios RCP2.6, RCP4.5, and RCP8.5 were available (not shown), and model uncertainty still remained the dominant source over the whole time period in this method.
The internal variability component is assumed constant in time by Hawkins and Sutton (2009), which is a limiting factor (see also section 2c). With this method, it has a value of ±0.6 million km2 and is the least important source of uncertainty after 2050. Other studies find a larger contribution of internal variability in other variables (Melia et al. 2015; Jahn et al. 2016; Lique et al. 2016; see also section 3b). This raises the question whether the method of Hawkins and Sutton (2009) really captures the internal variability to its full extent. Jahn et al. (2016) estimated internal variability in Arctic SIE projections from a large ensemble of simulations with the Community Earth System Model, as did Swart and Fyfe (2013) for Antarctic sea ice area trends.
To compare our results to the Jahn et al. (2016) method, we repeated their approach here. Figure 6 shows the results for the 38-member large ensemble (CESM LE; gray), which was forced with RCP8.5, and the 15-member medium ensemble (CESM ME; blue), which was forced with RCP4.5. The ensemble spreads were calculated as the standard deviation at each time step across all ensemble members. The two ensemble spreads are similar to each other (around 1.5 million km2) until more and more ensemble members of CESM LE reach an SIE of 0, which leads the CESM LE ensemble spread to approach 0 as well. If we assume that the ensemble spread represents internal variability, the Jahn et al. (2016) estimate is more than twice the internal variability from the Hawkins and Sutton (2009) method. The internal variability estimated by Jahn et al. (2016) quantifies variability as produced by a particular model for a particular scenario and could therefore be different for other models. In contrast, the estimate from Hawkins and Sutton (2009) is obtained by performing a statistical fit to the simulated time series and quantifying internal variability as the residual from the smooth fit across a number of models and scenarios. These are obviously different metrics and the large difference in estimated values indicates that a true estimate of SIE internal variability that accounts for multiple models is not yet possible. Regardless, it appears that the Hawkins and Sutton method may underestimate internal variability in the case of September SIE.
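The ensemble-spread estimate described above (the standard deviation across all members at each time step) can be sketched as follows. The function name and the input layout are hypothetical, and the choice of the population rather than the sample standard deviation is an assumption, since the text does not specify it.

```python
from statistics import pstdev

def ensemble_spread(members):
    """Standard deviation across ensemble members at each time step.
    `members` is a list of equally long SIE time series, one per member.
    ASSUMPTION: population standard deviation."""
    return [pstdev(values_at_t) for values_at_t in zip(*members)]
```

Once every member has reached an SIE of 0 at a given time step, the spread at that step is 0 as well, which is the behavior noted for the CESM LE late in the century.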
Model weighting has the potential to narrow uncertainties in climate model projections (Hawkins and Sutton 2009, 2011; Melia et al. 2015; Knutti et al. 2017; Eyring et al. 2019). MDER can be used to produce model weights by calculating the regression between historical diagnostics and future SIE (see sections 2a and 3a). Note that these MDER weights are different from the classical performance-based model weights, since they are not directly proportional to model biases and can be negative (Bracegirdle and Stephenson 2012). Table S1 lists all models and their weights. We recalculate the three types of uncertainty, weighting the models with the MDER weights. The weights are calculated from the target period 2020–44, and we apply them to the whole projection period. This requires the assumption that the processes selected by MDER are similarly important for SIE projections during the other periods. We cannot apply MDER to a longer time period because spurious correlations occur when more and more models reach a sea ice extent close or equal to zero (see section 2a). The results of the weighting on the types of uncertainty are shown in Fig. 5b. Compared to the unweighted case (Fig. 5a), the weighted model uncertainty is considerably smaller, while internal variability is (as expected) not affected by weighting.
The bar charts in Fig. 5c show a more quantitative estimation of this uncertainty reduction. The relative reduction of model uncertainty becomes larger with time: from 30% in 2010 to 50% in 2090. This large uncertainty reduction is mainly due to a bias correction: the MDER weights are calculated using the two diagnostics SIE_c and TAS_t (section 3a). For SIE_c, the models with the largest biases have the smallest weights, reducing the model spread. Comparing the diagnostic values for SIE_c (Fig. S1) and TAS_t (Fig. S3) reveals that the model with the largest positive SIE_c bias (CSIRO-Mk3.6.0) also has an unrealistically small temperature trend (TAS_t), resulting in a small SIE_t, and thus gets the smallest weight (Table S1). This example illustrates how model weighting can reduce model uncertainty. In the weighted results, the contribution of model uncertainty to the total uncertainty becomes smaller with increasing projection time (from over 80% in 2010 to less than 40% in 2100), but since scenario uncertainty is also slightly reduced by weighting after 2065, model uncertainty remains the most important source of uncertainty in SIE projections throughout the twenty-first century.
To account for a possible underestimation by the Hawkins and Sutton method (as discussed above), Figs. 5d and 5e show a rough estimate in which the internal variability component is doubled, as the Jahn et al. method suggests. In this case, internal variability is more important than scenario uncertainty until 2100, accounting for 20%–30% of the total uncertainty, and becomes almost as important as the weighted model uncertainty toward the end of the twenty-first century. However, even with a doubled internal variability component, model uncertainty remains the dominant source of uncertainty throughout this century, despite weighting.
c. Weighting SIE projections and estimating YOD
We now apply the results of MDER to reduce the uncertainty in the multimodel projection of future SIE by weighting the models with the MDER weights (Fig. 7). For both RCP scenarios, the weighted multimodel-mean SIE (wMMM) is about 1 million km2 smaller than uMMM until YOD is reached in the wMMM projection of SIE. The smaller SIE from the wMMM suggests a more pessimistic future for Arctic sea ice than that from uMMM, implying an earlier disappearance of the ice. We calculate two thresholds to quantify the differences between uMMM and wMMM: the first year in which the multimodel-mean SIE drops below 2 million km2 (YO2) and the multimodel-mean YOD. For RCP4.5, YOD comes after the year 2100 in both cases, but YO2 happens much earlier in wMMM than in uMMM: YO2 is reached in the year 2064 in wMMM but only after 2100 in uMMM, and thus at least 37 years earlier. In RCP8.5, both YO2 and YOD are also earlier in the wMMM: 2047 instead of 2060 for YO2 and 2062 instead of 2076 for YOD. Thus, when MDER is applied to reduce the model uncertainty, nearly ice-free conditions in the summertime Arctic are reached more than a decade earlier in the RCP8.5 scenario than in the unweighted CMIP5 multimodel mean.
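A threshold year such as YO2 can be read off a multimodel-mean time series as sketched below. This is a minimal illustration under stated assumptions: the function name `first_year_below` and the toy series are hypothetical, the 1 million km2 threshold used for YOD is assumed here rather than taken from this section, and the sketch returns the first crossing without requiring the SIE to stay below the threshold afterward.

```python
def first_year_below(years, sie, threshold):
    """Return the first year in which `sie` drops below `threshold`.

    Returns None if the threshold is never crossed within the series,
    which corresponds to "after the last simulated year" in the text.
    """
    for year, value in zip(years, sie):
        if value < threshold:
            return year
    return None


# Hypothetical multimodel-mean September SIE in million km2 (toy values).
toy_years = list(range(2040, 2046))
toy_sie = [2.6, 2.3, 2.1, 1.9, 2.0, 1.4]

yo2 = first_year_below(toy_years, toy_sie, threshold=2.0)
yod = first_year_below(toy_years, toy_sie, threshold=1.0)  # assumed threshold

# Lower bound on the RCP4.5 lead of wMMM over uMMM quoted in the text:
# YO2 in uMMM occurs after 2100, so no earlier than 2101.
min_lead_years = (2100 + 1) - 2064
```

Returning `None` rather than a sentinel year keeps the case "not reached before the end of the simulation" explicit, as for YOD in RCP4.5.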
Weighting the multimodel mean has similar effects to weighting model uncertainty (section 3b): outlier models are down-weighted. Because more models on average show a positive SIE_c bias than a negative one, and because those models also project a larger future SIE than the others, MDER’s bias correction predicts a smaller future multimodel-mean SIE. Similarly, the weighted standard deviation across the CMIP5 projections is about 0.5 million km2 smaller than the unweighted standard deviation until wMMM reaches YO2. Note that here the multimodel standard deviation is a different measure than model uncertainty [Eq. (5)] and that the standard deviation is estimated individually for each scenario, which is why the reduction in model uncertainty by weighting (section 3b) is larger than the reduction in the multimodel standard deviation. Here, the weighted standard deviation is still large, amounting to up to 5 million km2.
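The effect of weighting on the multimodel mean and spread can be illustrated with a simple weighted estimator. This is a sketch under stated assumptions, not the estimator used for Fig. 7: the function name `weighted_mean_and_std`, the toy SIE values, and the toy weights are hypothetical, and the variance is the plain (biased) weighted variance with weights normalized to sum to 1. Because MDER weights can be negative, the sketch checks that the resulting variance is not negative.

```python
from math import sqrt


def weighted_mean_and_std(values, weights):
    """Weighted mean and (biased) weighted standard deviation.

    Weights are normalized to sum to 1. Negative weights are allowed,
    but the result is rejected if they produce a negative variance.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    norm = [w / total for w in weights]
    mean = sum(w * x for w, x in zip(norm, values))
    variance = sum(w * (x - mean) ** 2 for w, x in zip(norm, values))
    if variance < 0:
        raise ValueError("negative weighted variance; check the weights")
    return mean, sqrt(variance)


# Hypothetical projected SIE of three models in million km2 (toy values).
toy_sie = [6.0, 4.0, 2.0]
# Hypothetical weights: the model with the largest SIE is down-weighted.
toy_weights = [0.1, 0.4, 0.5]

u_mean, u_std = weighted_mean_and_std(toy_sie, [1.0, 1.0, 1.0])
w_mean, w_std = weighted_mean_and_std(toy_sie, toy_weights)
```

In this toy case, down-weighting the model with the largest SIE lowers both the mean and the standard deviation relative to equal weights, which is the qualitative behavior described above.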
Since YOD is clearly scenario-dependent, it provides a measure of the anthropogenically forced response of the climate system. As seen in section 3b, even without model uncertainty the contribution of internal variability is too large to precisely predict YOD. However, the results obtained here strongly suggest an earlier near-disappearance of Arctic sea ice than estimated from an unweighted multimodel mean, confirming similar findings by Massonnet et al. (2012).
4. Summary and discussion
We applied the multiple diagnostic ensemble regression method (e.g., Karpechko et al. 2013) to projections of September Arctic sea ice extent (SIE) from 29 CMIP5 models (Table 2). From a set of 15 process-oriented diagnostics (Table 1) known to influence Arctic sea ice, the stepwise regression algorithm within MDER selected a linear combination (parsimonious regression model) of two of these diagnostics that best predict the projected future SIE: the climatological mean SIE (SIE_c) and the trend in Arctic near-surface temperature (TAS_t). Applying this regression model to observations of these two quantities, MDER is used to constrain the SIE projections. Compared to unweighted multimodel-mean projections, MDER results in smaller ice extents by about 1 million km2 and an earlier year of near-disappearance of Arctic sea ice (YOD) by more than a decade in the RCP8.5 scenario. By cross-validating the MDER results with a pseudoreality approach we have confirmed that our regression model has not been overfitted. We caution that the predictions of future SIE and YOD provided in this study are affected by unavoidably subjective preselection of diagnostics, and they can be refined once additional emergent constraints become available. Finding emergent constraints (e.g., Bracegirdle and Stephenson 2012; Borodina et al. 2017) for Arctic sea ice is an ongoing scientific topic. The study showed that MDER has the potential to increase the understanding of which variables mainly control the model uncertainty in the projected Arctic sea ice evolution, and we would like to encourage further process-oriented studies.
Using the method introduced by Hawkins and Sutton (2009, 2011) to separate the total uncertainty in projections of SIE into its three components (internal variability, model uncertainty, and scenario uncertainty), we have identified model uncertainty as the largest source of uncertainty in SIE projections throughout the twenty-first century. We have shown that weighting the models based on MDER’s regression model greatly reduces model uncertainty in projections of SIE by 30%–50%. This is possible because the weighting in MDER uses information from two diagnostics, allowing the projections to be constrained with observations. Hodson et al. (2012) found that uncertainties in the twentieth-century mean state of the Arctic climate are a major source of uncertainties in Arctic climate projections and that observational constraints can greatly improve the precision of Arctic climate projections, which was also demonstrated here.
However, the other two sources of uncertainty (internal variability and scenario uncertainty) cannot be significantly reduced by applying the MDER method. By analyzing large ensembles of the CESM model, Jahn et al. (2016) found that these two types of uncertainty account for a combined prediction uncertainty in YOD of about 25 years. We also used this approach here and obtained values for internal variability that were about twice as high as those obtained with the method of Hawkins and Sutton (2009). While the CESM analysis only quantifies internal variability within the context of a single model, the discrepancy with the curve-fitting method of Hawkins and Sutton (2009) suggests that the latter method might underestimate the actual internal variability in SIE. Assuming an internal variability component that is twice as large as the one suggested by Hawkins and Sutton (2009), internal variability becomes as large as the weighted model uncertainty (1.3 million km2). In any case, and despite the lack of a convincing estimate of internal variability, the internal variability is too large for a precise prediction of YOD. However, MDER strongly indicates an earlier disappearance of Arctic sea ice compared to the unconstrained CMIP5 multimodel mean.
For other sea ice variables, two studies find slightly different contributions of the different sources of uncertainty to the total uncertainty. Melia et al. (2015) find that model uncertainty in September sea ice thickness projections only becomes the dominant source of uncertainty after 2022 (before that, the dominant source is internal variability) and that it accounts for at most 70% of the total uncertainty in sea ice thickness projections, whereas for sea ice extent we find that model uncertainty is the dominant source throughout the twenty-first century, regardless of the method used to estimate internal variability. For the change in CMIP3 summer (July–September) Arctic sea ice extent, Lique et al. (2016) found internal variability to be more important than model uncertainty until 2020, with a model uncertainty of between 1 and 3 million km2. This is similar to the model uncertainty that we find (around 2.5 million km2). However, our estimate of the internal variability component in SIE projections (0.6 million km2) is smaller than that of Lique et al. (2016), which is around 1 million km2. This could be due to the Hawkins and Sutton method not capturing internal variability to its full extent, as discussed above, or due to the use of different model results (CMIP3 vs CMIP5). To further analyze and quantify the contribution of internal variability to the diagnostics used in MDER, especially the trend diagnostics, multiple ensemble members from each model are required. With larger ensembles available in CMIP6, this study can be repeated and internal variability contributions estimated.
This study demonstrates the potential of MDER to reduce model uncertainty in multimodel projections of Arctic SIE and constrain the prediction of YOD. Its strength is partly based on the strong relationship between past and future climatological mean SIE, with a correlation value of r = 0.91 (Fig. 1 and Fig. S6), which is a well-known emergent constraint (e.g., Boé et al. 2009; Bracegirdle et al. 2015; Borodina et al. 2017). Other studies (Massonnet et al. 2012; Liu et al. 2013; Melia et al. 2015; Knutti et al. 2017; Screen and Williamson 2017) also use this relationship to constrain projections of Arctic sea ice using different methods such as model selection (Massonnet et al. 2012) or bias correction (Melia et al. 2015). All studies estimate a YOD in the 2040s or 2050s, similar to our results. Liu et al. (2013) combine model selection based on model performance with constraining the model biases with observations. They estimate a YOD between 2054 and 2058. In view of our results, with still large uncertainties despite weighting, it is surprising that Liu et al. (2013) find such a narrow estimate for YOD. It may stem from confining their analyses to 5-yr sliding windows.
Knutti et al. (2017) introduce a weighting scheme that accounts for both model performance and model interdependence using sea ice and surface temperature diagnostics, resulting in a faster expected decline of SIE than in the unweighted case. They raise the concern of selecting the right diagnostics, but argue that picking unsuitable diagnostics will assign random weights to the models and therefore will not influence the results, as long as the model ensemble is large enough. This is one of the key strengths of our approach: the stepwise regression algorithm in MDER filters spurious relationships and only retains the most suitable diagnostics. The strength of the MDER method to constrain the projections can be further improved by finding additional emergent constraints that can be used as diagnostics.
In addition to historical Arctic SIE, the MDER results are also based on historical Arctic temperature trends. Using CMIP3 results, Mahlstein and Knutti (2012) found a nearly linear relationship between Arctic sea ice area and both global and Arctic mean temperature. Using this relationship, they predict the future evolution of Arctic sea ice extent under different stabilized warming scenarios, based on observations and considering internal variability from the models. Their findings suggest that a permanently ice-free Arctic could likely be prevented under 2°C or less stabilized warming.
The results of the different approaches mentioned above are in general agreement with our findings. However, despite all the great community efforts to reduce uncertainties in the projections of Arctic climate, model uncertainty remains too large to give a precise estimate of the timing of the first near-disappearance of Arctic sea ice. In fact, considering the large contribution of irreducible internal variability, it is likely that these exact predictions are impossible. Yet, all of these studies, including this study, hint at a more pessimistic outlook for Arctic sea ice. Recent studies found that reducing Arctic warming to 1.5°C instead of 2°C by the end of the twenty-first century can greatly reduce the number of occurrences of an ice-free Arctic (Jahn 2018; Screen 2018; Sigmond et al. 2018). It is therefore imperative to further pursue and enhance global mitigation strategies to limit climate change.
This work was funded by the European Union’s Horizon 2020 Framework Programme for Research and Innovation “Coordinated Research in Earth Systems and Climate: Experiments, kNowledge, Dissemination and Outreach (CRESCENDO)” project under Grant Agreement 641816. Additional funding was received by the Advanced Earth System Model Evaluation for CMIP (EVal4CMIP) project funded by the Helmholtz Association of German Research Centers. A.K. is supported by the Academy of Finland (Grants 286298 and 319397). The authors acknowledge the World Climate Research Program’s (WCRP’s) Working Group on Coupled Modeling (WGCM), which is responsible for CMIP, and thank the modeling groups (Table 2) for providing their model output. We thank Veronika Eyring (DLR) and Marika Holland (National Center for Atmospheric Research) for their contributions to the study, François Massonnet (Université Catholique de Louvain) for fruitful discussions on an earlier version of the manuscript, and Mattia Righi, Manuel Schlund, and Sabrina Zechlau (DLR) for technical support with the ESMValTool. The authors also thank the three anonymous reviewers and editor James Screen for their helpful comments.