Resolution in climate models is thought to be an important factor for advancing seasonal prediction capability. To test this hypothesis, seasonal ensemble reforecasts are conducted over 1993–2009 with the European community model EC-Earth in three configurations: standard resolution (~1° and ~60 km in the ocean and atmosphere models, respectively), intermediate resolution (~0.25° and ~60 km), and high resolution (~0.25° and ~39 km), the two latter configurations being used without any specific tuning. The model systematic biases of 2-m temperature, sea surface temperature (SST), and wind speed are generally reduced. Notably, the tropical Pacific cold tongue bias is significantly reduced, the Somali upwelling is better represented, and excessive precipitation over the Indian Ocean and over the Maritime Continent is decreased. In terms of skill, tropical SSTs and precipitation are better reforecasted in the Pacific and the Indian Oceans at higher resolutions. In particular, the Indian monsoon is better predicted. Improvements are more difficult to detect at middle and high latitudes. Still, a slight improvement is found in the prediction of the winter North Atlantic Oscillation (NAO) along with a more realistic representation of atmospheric blocking. The sea ice extent bias is unchanged, but the skill of the reforecasts increases in some cases, such as in summer for the pan-Arctic sea ice. All these results emphasize the idea that the resolution increase is an essential feature for forecast system development. At the same time, resolution alone cannot tackle all the forecast system deficiencies and will have to be implemented alongside new physical improvements to significantly push the boundaries of seasonal prediction.
Climate forecasting at a subseasonal to interannual time range is now done routinely and operationally by an increasing number of research centers and institutions. Although forecasting systems using numerical general circulation models (GCMs) have made substantial progress in the last decades (Doblas-Reyes et al. 2013), systematic errors and the misrepresentations of key processes still hinder forecast quality and limit the value of dynamical prediction in certain areas of the globe (Lin 2007; Guemas et al. 2012; Vannière et al. 2013; Voldoire et al. 2014). The desire to better capture physical processes in the ocean and atmosphere, alongside a continued development of the computational efficiency of high-performance clusters used to run climate models, has motivated an increasing number of studies using higher-resolution components of the climate system for historical simulations and climate change projections (Gent et al. 2010; Delworth et al. 2012; Sakamoto et al. 2012; Hourdin et al. 2013), as well as for seasonal forecasting (MacLachlan et al. 2015; Jia et al. 2015).
Positive impacts of increased resolution have been reported in many studies using fully coupled or atmosphere-only GCMs (CGCMs and AGCMs, respectively), as documented in Table 1. From this table, it appears that increasing the resolution of atmosphere and/or ocean has beneficial impacts on the representation of the Asian monsoon independently of the baseline resolution in both AGCMs (Sperber et al. 1994; Lal et al. 1997; Branković and Gregory 2001; Mizielinski et al. 2014; Johnson et al. 2016) and CGCMs (Gent et al. 2010; Delworth et al. 2012). Despite those improvements, many studies acknowledge that the main biased features of the monsoon are not corrected by increasing resolution (Martin 1999; Johnson et al. 2016). For other phenomena, such as atmospheric blocking, the impact of increased resolution exhibits a dependence on the baseline resolution: improvements are only noted in studies considering atmospheric resolution finer than 100 km (AGCM experiments: Berckmans et al. 2013; Matsueda et al. 2009; Jung et al. 2012; Dawson and Palmer 2014). The well-known wide spread among representations of the CGCM cold tongue bias (Vannière et al. 2013) has been shown to be reduced in three different studies using a high-resolution ocean component (less than 35 km; Shaffrey et al. 2009; Roberts et al. 2009; Sakamoto et al. 2012), thanks to a better representation of tropical instability waves (Roberts et al. 2009). The improvement of the representation of orographic winds and their effects on the ocean at high resolution (atmosphere resolution finer than 50 km) improves the representation of the Somali jet (AGCM study; Johnson et al. 2016) and Pacific upwelling (CGCM studies; Gent et al. 2010; Sakamoto et al. 2012). The characteristics of El Niño–Southern Oscillation (ENSO) have been shown to be sensitive to atmospheric resolution (Guilyardi et al. 2004; seasonal forecast: Jia et al. 2015), oceanic resolution (Kirtman et al. 2012), and both oceanic and atmospheric resolutions (Shaffrey et al. 2009; Sakamoto et al. 2012; Delworth et al. 2012). However, in seasonal forecast experiments using resolution and grid spacing ratios similar to these studies, MacLachlan et al. (2015) conclude that the ENSO skill was not affected by an increased resolution. Similarly, Zhu et al. (2015) do not show any change in the ENSO skill when increasing atmospheric resolution to 15 km coupled with an oceanic resolution of 100 km. Several other improvements have also been reported as listed in Table 1.
Most of the previously cited studies concerned preindustrial, historical, and climate projection experiments. The impact of increasing resolution in the atmosphere on seasonal forecast quality has been comparatively much less documented. GFDL operates a high-resolution atmosphere (0.5°) and standard resolution ocean (1°) versions of their CM2.5 coupled climate model for seasonal forecasts (Jia et al. 2015; Yang et al. 2015; see Table 1 for more details) as a contribution to the North American Multimodel Ensemble (NMME) coordinated experiment. The UK Met Office GloSea5 system now uses high resolution in the ocean (0.25° and 75 vertical levels) and in the atmosphere (0.83° longitude × 0.55° latitude, ~50 km at midlatitudes), and exhibits encouraging improvements in extratropical forecasting skill, including surface North Atlantic Oscillation (NAO), winter storminess, and near-surface temperature and wind speed over Europe and North America (Scaife et al. 2014; MacLachlan et al. 2015).
The aim of the present study is to answer the following two questions: 1) What is the impact of increasing oceanic and atmospheric resolutions on seasonal forecast quality? 2) Could an increase of resolution result in an improvement of key phenomena at the seasonal time scale, such as the NAO, ENSO, blocking, and Indian monsoon? In particular, how model-dependent are the results of Scaife et al. (2014) and MacLachlan et al. (2015), especially concerning the NAO? To answer those questions, we performed seasonal reforecasts over a 17-yr period with the EC-Earth coupled model (ocean, atmosphere, land, and sea ice components) at three different resolutions: standard (SRes), increased in the ocean (IRes), and increased in both the ocean and the atmosphere (HRes). The standard horizontal resolution settings for EC-Earth are T255 for the atmosphere (approximately 0.7° in latitude and longitude and 91 vertical levels) and a 1° oceanic grid (with refinements around the equator and at the poles) counting 46 vertical layers. At the intermediate resolution the oceanic resolution is increased to a 0.25° horizontal grid with 75 vertical layers and in the high resolution both oceanic and atmospheric resolutions are enhanced, with 0.25° and 75 vertical layers in the ocean and T511 in the atmosphere (~0.35° in latitude and longitude). These resolutions have been chosen because they are comparable with the resolutions of GloSea5 (slightly higher in the atmosphere) and thus we could expect to see similar improvements especially for the NAO skill. Following Table 1, we might also expect improvement of the Asian monsoon, midlatitude blocking frequency, Pacific cold tongue bias, oceanic upwelling, and the representation of ENSO. Analyzing IRes and HRes separately will help in understanding the importance of enhanced resolution in the ocean rather than in atmosphere. The separate effects of oceanic and atmospheric resolutions have been poorly documented up to now (Kirtman et al. 2012).
The paper is organized as follows. Section 2 describes the EC-Earth coupled model in detail, the experimental setup, and the methods used to assess the changes due to the resolution. Global results in terms of forecast quality are presented in section 3. Then, the results of the resolution changes will be further described in section 4, in three different subsections focused first on the tropics, then on the midlatitudes and more specifically Europe, and finally on high latitudes. Discussion and conclusions can be found in section 5.
2. Experiments and methods
a. The EC-Earth 3.0.1 coupled model
The EC-Earth Earth system model (ESM), used to perform the reforecasts of this study, is developed by the EC-Earth consortium, which counts close to 20 European institutions. EC-Earth consists of the coupling of different models representing the components of the Earth system: atmosphere, land, ocean, sea ice, vegetation, glaciers, and atmospheric chemistry. In the present study, version 3.0.1 of the coupled model, including the ocean, atmosphere, land, and sea ice components, is used to reforecast seasonal climate over a 17-yr period. An earlier version of the EC-Earth ESM is described in detail in Hazeleger et al. (2012). The main differences between these two versions are an improved radiation scheme (Morcrette et al. 2008) and a new cloud microphysics scheme (Forbes et al. 2011).
The atmosphere model is the Integrated Forecasting System (IFS) model cy36r4. In line with the objectives of this study, we use two different horizontal atmospheric resolutions: T255 (linear triangular truncation at wavenumber 255, corresponding to approximately 0.7° in latitude and longitude) and T511 (linear triangular truncation at wavenumber 511, corresponding to approximately 0.35° in latitude and longitude). At both resolutions, the number of vertical layers is identical (91, up to 0.01 hPa). The time steps for IFS are 2700 and 900 s for the T255 and T511 configurations, respectively.
EC-Earth embeds the NEMO version 3.3.1 ocean model (Madec 2008). Again, in order to address the scientific question raised in the introduction, we use two different horizontal resolutions: the ORCA-1 (~1°) and ORCA-025 (~0.25°) grids with, respectively, 46 and 75 layers. The grid has higher horizontal resolution (of about one-third of a degree in the ORCA-1 configuration and one-tenth of a degree in the ORCA-025 configuration) near the equator to resolve equatorial planetary waves. The grid has three poles, one on the South Pole and two near the North Pole (one in Canada and one in Siberia), but at different longitudes and latitudes. As a result, the horizontal resolution is increased in the vicinity of the North Pole. The sea ice component used in this study is the second version of the Louvain-la-Neuve (LIM2) sea ice model (Fichefet and Morales Maqueda 1997). The time steps are 3600 and 1200 s for ORCA-1 and ORCA-025, respectively, and the sea ice model is called every ocean time step at both resolutions.
EC-Earth uses the H-TESSEL (TESSEL for Tiled ECMWF Scheme for Surface Exchanges over Land) scheme for the land surface (van den Hurk et al. 2000), which includes an improved representation of hydrology over the TESSEL scheme, in agreement with more recent IFS cycles (Balsamo et al. 2009).
The atmosphere and ocean/ice components are coupled with the Ocean Atmosphere Sea Ice Soil version 3 (OASIS3; Valcke 2013) coupler. The coupling frequency between the ocean and atmosphere is every 3 h.
b. Experimental setup and initialization strategy
Three sets of 4-month 10-member seasonal reforecasts were carried out with different configurations of the atmosphere and ocean components (Table 2). Forecasts start on 1 May and 1 November every year from 1993 to 2009. Three experiments are considered: SRes (standard resolution: T255-ORCA1L46), IRes (intermediate resolution: T255-ORCA025L75), and HRes (high resolution: T511-ORCA025L75). For all configurations the default version of EC-Earth 3.0.1 as released by the EC-Earth consortium has been used. The simulations have been run using the autosubmit workflow manager (Manubens-Gil et al. 2016).
It is worth noting, and remembering for the remainder of the paper, that only the standard-resolution version has undergone extensive tuning. However, a number of parameters have been modified in the high-resolution configuration. The high-resolution ocean model uses increased albedo parameters for sea surface temperatures and sea ice, increased nonlinear bottom drag, modified eddy diffusivities and viscosities, and increased surface input of the turbulent kinetic energy. Furthermore, the Langmuir parameterization is switched off and the advection scheme for tracers has been changed for individual members because of numerical instabilities. In the atmospheric model, the parameters related to the momentum flux of gravity waves and a limiter for wind tendencies in the upper atmosphere are changed. All parameters and their respective values are summarized in the supplementary information except for the numerical parameters associated with the solver, which change according to the model resolution. The main characteristics of these simulations are summarized in Table 2.
The atmospheric initial conditions are generated from ERA-Interim. As the number of vertical levels is different in EC-Earth and ERA-Interim, a vertical interpolation of the model-level variables is performed. The 10 atmospheric initial conditions used to create the ensemble are generated using atmospheric singular vectors (Du et al. 2012). Ocean and sea ice initial conditions provided by the Global Ocean Reanalysis and Simulations (GLORYS2v1), produced at the ORCA-025 resolution (Ferry et al. 2010), have been used for IRes and HRes. For the SRes experiment, these initial conditions have been interpolated to the ORCA1 resolution and smoothed to avoid initial shocks.
1) Bias and bias correction
In our study, we analyze reforecast skill and bias for the May and boreal summer [June–August (JJA)] predictions (for May initialization), and for the November and boreal winter [December–February (DJF)] predictions (for November initialization). When initialized from an estimate of the observed climate state, reforecasts typically drift toward their attractor (i.e., toward the model's own stationary climate). The drift can be understood as a systematic error that depends on the forecast time. Because model bias is a long-standing issue in climate science, we first address the impact of enhanced resolution on the development of this bias in our system. We define the forecast climatology for a specific forecast time as the forecast values averaged over all the members and all the start dates. The bias is then defined as the difference between the forecast and the observed climatologies over the same period. We apply a simple per-pair bias correction (García-Serrano et al. 2013) to our seasonal reforecasts before estimating the forecast skill and reliability. This method consists of subtracting the forecast climatology from each member and start date. The same procedure is applied to the observations. In this way the mean bias of the model is removed and the reforecasts can be compared with the observations. The method is applied in cross-validation mode, so that the climatology is calculated excluding the year being forecast.
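The cross-validated anomaly computation described above can be sketched as follows. This is a minimal illustration, not the s2dverification implementation: the function name, the array layout (years × members at a single forecast time), and the absence of missing data are all assumptions made for the example.

```python
import numpy as np

def cross_validated_anomalies(fcst, obs):
    """Remove the mean bias in leave-one-year-out mode.

    fcst: array (n_years, n_members) at one forecast time (assumed layout).
    obs:  array (n_years,) of reference values for the same dates.
    """
    fcst = np.asarray(fcst, dtype=float)
    obs = np.asarray(obs, dtype=float)
    n_years = fcst.shape[0]
    fcst_anom = np.empty_like(fcst)
    obs_anom = np.empty_like(obs)
    for i in range(n_years):
        keep = np.arange(n_years) != i  # exclude the year being forecast
        # Climatology: mean over all remaining start dates (and members).
        fcst_anom[i] = fcst[i] - fcst[keep].mean()
        obs_anom[i] = obs[i] - obs[keep].mean()
    return fcst_anom, obs_anom
```

With a forecast that differs from the observations only by a constant offset, the ensemble-mean anomalies returned by this sketch coincide with the observed anomalies, which is the intended effect of the correction.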
2) Skill assessment
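The skill is assessed using the anomaly correlation coefficient (ACC), the root-mean-square error (RMSE), and the Brier score (BS; Brier 1950) as defined below:

$$\mathrm{ACC}(f) = \frac{\sum_{i=1}^{N} \left[x_i(f) - \bar{x}(f)\right]\left[o_i(f) - \bar{o}(f)\right]}{\sqrt{\sum_{i=1}^{N} \left[x_i(f) - \bar{x}(f)\right]^2 \sum_{i=1}^{N} \left[o_i(f) - \bar{o}(f)\right]^2}}, \qquad \mathrm{RMSE}(f) = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left[x_i(f) - o_i(f)\right]^2},$$

where $f$ is the forecast time, $x_i(f)$ the bias-corrected model ensemble mean forecast for year $i$, $\bar{x}(f)$ the forecast averaged over the reforecast period, $o_i(f)$ and $\bar{o}(f)$ the corresponding reference data at forecast time $f$, and $N$ the number of years in the reforecast period. Also,

$$\mathrm{BS}(f) = \frac{1}{N}\sum_{i=1}^{N} \left[p_i(f) - y_i(f)\right]^2,$$

where $p_i(f)$ is the probabilistic forecast of a given event (e.g., temperature exceeds the second tercile of the reference data in year $i$ at forecast time $f$) based on the fraction of ensemble members predicting this event; $y_i(f)$ is the corresponding “observation” in the reference data and has a value of 1 if the event happens, 0 otherwise. The Brier score is a distance in probability space and should be as small as possible.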
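As an illustration of the three scores described above, the following sketch computes them for a single forecast time. The function names and input layouts are hypothetical, and the inputs are assumed to be already bias corrected.

```python
import numpy as np

def acc(ens_mean, ref):
    """Anomaly correlation between ensemble-mean forecasts and reference data."""
    x = np.asarray(ens_mean, float)
    o = np.asarray(ref, float)
    x = x - x.mean()
    o = o - o.mean()
    return float((x * o).sum() / np.sqrt((x ** 2).sum() * (o ** 2).sum()))

def rmse(ens_mean, ref):
    """Root-mean-square error over the reforecast years."""
    diff = np.asarray(ens_mean, float) - np.asarray(ref, float)
    return float(np.sqrt(np.mean(diff ** 2)))

def brier_score(member_events, obs_events):
    """member_events: (n_years, n_members) of 0/1; obs_events: (n_years,) of 0/1."""
    # Forecast probability = fraction of members predicting the event.
    p = np.asarray(member_events, float).mean(axis=1)
    return float(np.mean((p - np.asarray(obs_events, float)) ** 2))
```

The event itself (for example, exceeding a tercile of the reference data) has to be converted to 0/1 values before calling `brier_score`; that thresholding step is left out of the sketch.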
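We use the standard reliability-resolution-uncertainty decomposition of the Brier score as in Toth et al. (2003), by binning forecast–observation pairs according to $p_k = k/10$, $k = 0, \ldots, 10$, since our ensemble size is 10 members. If $n_k$ is the number of forecasts worth $p_k$, $\bar{y}_k$ the observed frequency of the event among those forecasts, and $\bar{y}$ its frequency over all $N$ forecasts, the decomposition is written as follows:

$$\mathrm{BS} = \underbrace{\frac{1}{N}\sum_{k=0}^{10} n_k \left(p_k - \bar{y}_k\right)^2}_{\mathrm{Rel}} - \underbrace{\frac{1}{N}\sum_{k=0}^{10} n_k \left(\bar{y}_k - \bar{y}\right)^2}_{\mathrm{Res}} + \underbrace{\bar{y}\left(1 - \bar{y}\right)}_{\mathrm{Unc}}.$$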
Reliability (Rel) is an estimate of the conditional bias of the probabilistic forecast, whereas resolution (Res) evaluates how well the model separates probabilistic events. Uncertainty (Unc) is independent from the model and depends only on the probabilistic event we choose to evaluate the model on.
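A minimal sketch of this decomposition is given below, assuming that forecast probabilities are exact multiples of 1/10 (as they are for a 10-member ensemble) and that events are coded as 0/1. The function name and argument layout are illustrative only.

```python
import numpy as np

def brier_decomposition(p, y, n_members=10):
    """Return (reliability, resolution, uncertainty) for binary events.

    p: forecast probabilities, assumed to be multiples of 1/n_members.
    y: observed outcomes (1 if the event happened, 0 otherwise).
    """
    p = np.asarray(p, float)
    y = np.asarray(y, float)
    n = p.size
    y_bar = y.mean()                       # climatological event frequency
    bins = np.rint(p * n_members).astype(int)
    rel = 0.0
    res = 0.0
    for b in np.unique(bins):
        in_bin = bins == b
        n_k = in_bin.sum()                 # number of forecasts in this bin
        p_k = b / n_members                # forecast probability of the bin
        y_k = y[in_bin].mean()             # observed frequency in the bin
        rel += n_k * (p_k - y_k) ** 2
        res += n_k * (y_k - y_bar) ** 2
    return rel / n, res / n, y_bar * (1.0 - y_bar)
```

Because each bin holds a single probability value, reliability minus resolution plus uncertainty reproduces the Brier score exactly in this setting; a lower reliability term and a higher resolution term both reduce the score.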
We provide uncertainty estimates and confidence intervals with all the skill assessments. We use the Student's t distribution with N degrees of freedom to estimate the significance level of correlation, N being the effective number of independent data calculated following the method of von Storch and Zwiers (2001). The significance of the difference between two correlations is estimated using the methodology of Siegert et al. (2016), which takes into account the dependence that arises from sharing the same observations in both correlation coefficients. This method also takes into account the effective number of independent data, which is necessary given the serial correlation typical of the time series considered.
Different observational and reanalysis datasets are used in the verification process to assess the robustness of the results. The ERA-Interim reanalysis (Dee et al. 2011) is used for the 2-m air temperature (T2M), sea level pressure (SLP), winds, and daily precipitation. For sea surface temperature (SST), the ERSST v3b (Smith et al. 2008) and European Space Agency Climate Change Initiative (ESA CCI) datasets are used. For precipitation, in addition to the daily precipitation from ERA-Interim, the Global Precipitation Climatology Project (GPCP) v2.2 (Adler et al. 2003) and the Global Precipitation Climatology Centre (GPCC) data product version 4 (Schneider et al. 2008) are used. For sea ice, the Bootstrap sea ice concentrations version 2 (Comiso 2000) were used as a reference, although the evaluation was also conducted on alternative products: ESA CCI (Ivanova et al. 2015), Ocean and Sea Ice Satellite Application Facility (OSI-SAF; Eastwood et al. 2014), HadISST (Rayner et al. 2003), and Centennial In Situ Observation-Based Estimates of the Variability of SST and Marine Meteorological Variables, version 2 (COBE-2; Hirahara et al. 2014) to gauge the sensitivity of results to observational error. The reference datasets will be referred to as observations in the following. All the verification, as well as part of the plotting, has been done using version 2.1.1 of the R-based s2dverification package (http://cran.r-project.org/web/packages/s2dverification/index.html). All the area-averaged indices have been computed directly on the native grids. For maps, data have been interpolated onto a 1° × 1° grid. Tests of the sensitivity of the results to this interpolation show that it does not affect the scores (not shown).
The monsoon onset date has been estimated from Wang and LinHo (2002) criteria. The onset date is defined as the first day for which precipitation from ERA-Interim averaged over India exceeds 5 mm day−1. To estimate this date, we first compute the quantile corresponding to the 5 mm day−1 threshold, in cross-validation mode, from observed daily data for the months of May and June. Then, the forecasted onset date is estimated as the first day exceeding this quantile in the model.
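The quantile-based onset estimate can be sketched as follows. The helper names are hypothetical, and the pooling of May and June daily values from the training years (excluding the forecast year, for cross-validation) is assumed to have been done by the caller.

```python
import numpy as np

def model_threshold(obs_daily, model_daily, obs_threshold=5.0):
    """Map the observed threshold (mm/day) onto the model distribution.

    obs_daily, model_daily: pooled May-June daily precipitation from the
    training years (assumption: pooled by the caller, forecast year excluded).
    """
    # Quantile of the observed distribution that the threshold corresponds to.
    q = np.mean(np.asarray(obs_daily, float) <= obs_threshold)
    # Model precipitation value at that same quantile.
    return float(np.quantile(np.asarray(model_daily, float), q))

def onset_day(daily_precip, threshold):
    """Index of the first day exceeding the threshold, or None if never."""
    above = np.flatnonzero(np.asarray(daily_precip, float) > threshold)
    return int(above[0]) if above.size else None
```

Working with a quantile rather than the raw 5 mm day−1 value means that a model that is systematically too wet or too dry is compared with the observations on the basis of its own precipitation distribution.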
Several methods can be used to compute the NAO index, such as EOF analysis of sea level pressure or 500-hPa geopotential height fields (e.g., Doblas-Reyes et al. 2003), sea level pressure point value differences between Reykjavik and Gibraltar (e.g., Maidens et al. 2013), or area-averaged sea level pressure differences between subpolar regions and midlatitudes (e.g., Stephenson et al. 2006). In this study, we choose to compute the NAO with two different indices based on an EOF analysis leading to an NAO index for the model (Pmod) and the observations (Pobs). The Pobs (Pmod) NAO index computation method consists in computing the NAO pattern as the leading EOF of sea level pressure in the reference dataset (the reforecast data) in cross-validation mode and projecting the model sea level pressure anomalies for the given month/season onto the pattern to compute the index. These results have also been compared with the method of Stephenson et al. (2006).
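A rough sketch of the EOF-projection approach is shown below, using a singular value decomposition on fields flattened to (years × grid points). The function names, the area weights, and the sign convention (a reference grid point forced to be positive so that the pattern does not flip between cross-validation folds) are assumptions for illustration and are not taken from the study.

```python
import numpy as np

def leading_eof(anom, weights, ref_point):
    """Leading EOF of weighted anomalies, shape (n_years, n_points)."""
    _, _, vt = np.linalg.svd(anom * weights, full_matrices=False)
    eof = vt[0]
    # EOF sign is arbitrary: fix it with a reference point (assumption).
    return -eof if eof[ref_point] < 0 else eof

def nao_index_cv(pattern_source, projected, weights, ref_point):
    """Cross-validated index: pattern from `pattern_source`, data from `projected`.

    Pobs-like index: pattern_source = reference SLP, projected = model SLP.
    Pmod-like index: pattern_source = projected = model SLP.
    """
    pattern_source = np.asarray(pattern_source, float)
    projected = np.asarray(projected, float)
    n_years = pattern_source.shape[0]
    index = np.empty(n_years)
    for i in range(n_years):
        keep = np.arange(n_years) != i  # leave out the target year
        train = pattern_source[keep] - pattern_source[keep].mean(axis=0)
        eof = leading_eof(train, weights, ref_point)
        anom_i = projected[i] - projected[keep].mean(axis=0)
        index[i] = (anom_i * weights) @ eof
    return index
```

On a regular latitude-longitude grid, a common choice for `weights` is the square root of the cosine of latitude at each grid point, although other conventions exist.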
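The instantaneous blocking index computed as in Davini et al. (2012) is used. This index is a 2D extension of the Tibaldi and Molteni (1990) index and provides a measure of the Rossby wave breaking activity in the midlatitudes (between 30° and 75°N). It is based on the reversal of the daily 500-hPa geopotential height gradient: data are interpolated on a common 2.5° × 2.5° grid before the index is computed. The GHGS and GHGN gradients are thus computed as follows:

$$\mathrm{GHGS}(\lambda_0, \Phi_0) = \frac{Z500(\lambda_0, \Phi_0) - Z500(\lambda_0, \Phi_S)}{\Phi_0 - \Phi_S}, \qquad \mathrm{GHGN}(\lambda_0, \Phi_0) = \frac{Z500(\lambda_0, \Phi_N) - Z500(\lambda_0, \Phi_0)}{\Phi_N - \Phi_0},$$

where $\Phi_0$ ranges from 30° to 75°N and $\lambda_0$ ranges from 0° to 360°; $\Phi_S = \Phi_0 - 15^\circ$ and $\Phi_N = \Phi_0 + 15^\circ$. Instantaneous blocking (IB) is identified when

$$\mathrm{GHGS}(\lambda_0, \Phi_0) > 0 \quad \text{and} \quad \mathrm{GHGN}(\lambda_0, \Phi_0) < -10\ \text{m per degree of latitude}.$$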
ERA-Interim reanalysis data are used to assess the monsoon onset, NAO, and blocking results.
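The gradient-reversal test behind the blocking index can be sketched for a single daily field as follows. The 15° latitude offset and the −10 m per degree threshold are the values commonly associated with the Davini et al. (2012) index and are treated here as assumptions; the function name and array layout are illustrative.

```python
import numpy as np

def instantaneous_blocking(z500, lats, delta=15.0, ghgn_max=-10.0):
    """Boolean blocking mask for one daily 500-hPa height field.

    z500: (n_lat, n_lon) geopotential height in metres.
    lats: regularly spaced latitudes in degrees north, ascending.
    """
    z500 = np.asarray(z500, float)
    lats = np.asarray(lats, float)
    step = lats[1] - lats[0]
    shift = int(round(delta / step))        # grid points spanning `delta` degrees
    blocked = np.zeros(z500.shape, dtype=bool)
    for j, lat0 in enumerate(lats):
        if not (30.0 <= lat0 <= 75.0):
            continue                         # index defined between 30 and 75N
        if j - shift < 0 or j + shift >= lats.size:
            continue                         # neighbours fall outside the grid
        ghgs = (z500[j] - z500[j - shift]) / delta   # southern gradient
        ghgn = (z500[j + shift] - z500[j]) / delta   # northern gradient
        blocked[j] = (ghgs > 0.0) & (ghgn < ghgn_max)
    return blocked
```

Averaging the mask over all days of a season gives a blocking frequency at each grid point, which is the kind of quantity that can then be compared between the reforecasts and the reanalysis.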
a. Bias reduction
The impact of increasing resolution on the model mean climate is first assessed by considering how the bias in the SRes reforecasts changes in the IRes and HRes experiments. The winter (DJF) and summer (JJA) biases are computed for forecast months 2 to 4 of the forecast initialized on 1 November and 1 May respectively following the methodology described in section 2c. These are the fast growing systematic errors of the coupled model, as some biases take several years to develop. The biases described here do not necessarily correspond to the stationary biases of the coupled model (Toniazzo and Woolnough 2014). Figures 1 and 2 show results for JJA and DJF reforecasts, respectively, for SST, precipitation, and 850-hPa winds. Figures 1a and 2a show the classical cold tongue bias in the equatorial Pacific (Vannière et al. 2013) of up to 3°C in the SRes. A similar bias is also visible in boreal summer in the tropical Atlantic (Wahl et al. 2011; Liu et al. 2012). SRes simulates too-warm SST in the western boundary currents, Kuroshio, and Gulf Stream (as was the case in EC-Earth 2.2; Hazeleger et al. 2012; Gent et al. 2010) and in the Antarctic circumpolar region. All the previously mentioned biases are present in both seasons but are stronger in DJF.
In summer, in the Indian Ocean, SRes exhibits a warm bias in the Somali upwelling that is common to other coupled models (Prodhomme et al. 2014). In the summer hemisphere (JJA in the Northern Hemisphere and DJF in the Southern Hemisphere) SRes exhibits a large cold bias in the midlatitudes.
Precipitation biases are mainly visible in the tropics, in part because precipitation is stronger there. Precipitation is excessive in the Indian Ocean in both winter and summer, especially over the Maritime Continent [consistent with Neale and Slingo (2003)] and for the monsoon precipitation [as in Prodhomme et al. (2014)]. In the Pacific, the double ITCZ bias is also found for precipitation in winter, although in summer the ITCZ is shifted northward (Lin 2007; Bellucci et al. 2010).
Figures 1b, 1c, 2b, and 2c show the bias differences between the SRes and the IRes and HRes experiments, respectively, for the SST. Increasing the resolution is in general beneficial for the SST bias. IRes and HRes show some important and similar improvements with respect to SRes: 1) the cold bias in summer in the North Pacific and North Atlantic basins, 2) the warm bias in the Somali upwelling, and 3) the cold tongue bias are all reduced, especially in boreal winter. This last improvement could be related to a better representation of tropical instability waves (Roberts et al. 2009). The higher-resolution atmosphere in HRes further reduces the cold tongue bias and the warm bias in the Somali upwelling, due to the better representation of the air–sea coupling and Ekman pumping (Chowdary et al. 2016).
For wind and precipitation the main changes occur when both oceanic and atmospheric resolutions are increased. In the Indian Ocean, in agreement with Prodhomme et al. (2014), the SST cooling, associated with the increase of oceanic resolution, leads to a decrease of the excessive oceanic precipitation (Figs. 1c,f). Probably because of the improved orography, the bias over the Maritime Continent is also reduced, which is consistent with Love et al. (2011) and Schiemann et al. (2014).
To quantify the change in terms of mean state when the resolution is increased, Table 3 shows the percentage of Earth’s surface where the bias is reduced in HRes, with respect to SRes for SST, T2M, and precipitation. This table shows that for all variables the bias is reduced for more than half of Earth’s surface, so according to this metric the mean state is improved when the resolution is increased, which confirms the results of Figs. 1 and 2 described above. To conclude, the increase of oceanic resolution leads to an improvement of the SST mean state but does not affect the precipitation, winds, and temperature. The combination of both oceanic and atmospheric resolution increases improves the representation of the mean state of all the considered variables. This encouraging result is consistent with the literature (Jung et al. 2012; Sakamoto et al. 2012) and needs to be further investigated from a physical point of view in order to clearly attribute processes that could be responsible for these improvements. While this will be the topic of future studies, we want here to stress that these improvements have been obtained with minimal tuning of the IRes and HRes configurations, which opens promising perspectives for a more thorough calibration of high-resolution GCMs. The next section will investigate how the prediction skill of the system is affected by the resolution changes.
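A metric of this kind could be computed along the following lines. The sketch assumes bias maps on a regular latitude-longitude grid and weights each cell by the cosine of its latitude; whether the study used exactly this weighting is not stated, so it should be read as an assumption.

```python
import numpy as np

def percent_area_improved(bias_ref, bias_new, lats):
    """Percentage of area where |bias_new| < |bias_ref|.

    bias_ref, bias_new: (n_lat, n_lon) bias maps on the same regular grid.
    lats: latitudes in degrees for each row of the maps.
    """
    bias_ref = np.asarray(bias_ref, float)
    bias_new = np.asarray(bias_new, float)
    # Cell area on a regular lat-lon grid is proportional to cos(latitude).
    w = np.cos(np.deg2rad(np.asarray(lats, float)))[:, None] * np.ones(bias_ref.shape)
    improved = np.abs(bias_new) < np.abs(bias_ref)
    return float(100.0 * (w * improved).sum() / w.sum())
```

For a variable defined only over the ocean, such as SST, land points would need to be masked out of both the numerator and the denominator; that step is omitted here.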
b. Impact on the forecast quality
The left columns of Figs. 3 and 4 show the ACC time correlation for the SRes experiment for SST, T2M, and precipitation with respect to the observation for JJA and DJF reforecasts, respectively. The correlation is computed for each grid point of the reference observed dataset so as to give a fair comparison of the skill for all the resolutions after applying a conservative interpolation of the reforecasts. EC-Earth3 exhibits standard correlation levels, comparable to other state-of-the-art seasonal forecasting systems (Doblas-Reyes et al. 2013). For all variables, scores are the highest over the tropical area, especially in the tropical Pacific basin, related to the El Niño–Southern Oscillation (ENSO) predictability (Figs. S1 and S2; Phelps et al. 2004; Landman and Beraki 2012; Doblas-Reyes et al. 2013). The skill of precipitation is lower than for SST, a common feature of dynamical seasonal forecast systems (Figs. 3 and 4; Doblas-Reyes et al. 2013). In summer, the correlation is generally lower than for DJF, associated with a weaker ENSO predictability (Figs. 4 and 5), due to the well-known spring predictability barrier (Chen et al. 2004; Duan and Wei 2013).
The right columns of Figs. 3 and 4 show for the two seasons the difference in ACC time correlation between HRes and SRes. To have an objective metric of the changes, Table 4 shows the percentage of the globe for which the skill is increased in HRes with respect to SRes and the percentage of the globe with correlation significant at the 95% confidence level in SRes and HRes. In general, the changes are patchy: for approximately half of Earth’s surface, the skill is higher in HRes than in SRes, and the area with significant skill remains equivalent in the two reforecasts. For both seasons, we find few areas of significant change in correlation by increasing the resolution (Figs. 3b,d,f and 4b,d,f). The ACC decreases in the Southern Ocean, especially in boreal winter, as well as in the Atlantic Ocean and in the polar regions in the summer hemisphere (Figs. 3b,d,f and 4b,d,f). In boreal summer, the main areas of improvement common to the three considered variables are the equatorial Indian Ocean, Maritime Continent, and western equatorial Pacific (Figs. 3b,d,f), which is consistent with the observed bias reduction in the region (see section 3a). This suggests that the improved resolution in the Indo-Pacific leads to a better representation of physical processes in the region, for example the upwelling (Akuetevi et al. 2016) and gravity waves in the Maritime Continent (Love et al. 2011). Large correlation increases are also noticed for T2M, for both JJA and DJF, in the north of Russia, Europe, and Alaska in the high-resolution experiment (Figs. 3b,d,f and 4b,d,f), where the biases also decrease in HRes compared to SRes (Fig. 1). It is worth noting that, despite a significant correlation increase, the ACC itself might remain insignificant in some regions. This correlation increase could be associated with a better representation of the main variability modes in these regions, such as blocking or the NAO. This will be investigated in further detail in section 4b.
To conclude, it appears that increasing the resolution does not substantially affect skill at the gridpoint level. This conclusion holds with a different skill metric such as RMSE (see supplementary Figs. S3 and S4). In some areas, including high-impact areas such as the Indian Ocean and the Indian monsoon region, some skill increase is visible. However, it is worth taking into account that the comparison between standard and high-resolution experiments is not completely fair in the sense that the standard resolution has undergone substantial tuning, while the same settings with no further tuning are used for the high-resolution configurations.
To better understand the skill changes that occur when the resolution is increased and investigate these changes at a subseasonal scale, the next section will provide a deeper assessment of skill and forecast quality for selected regions of interest.
4. Regional skill and forecast quality assessment
a. Tropical variability
1) ENSO and the Atlantic Niño
The main source of skill at seasonal time scales is the ENSO phenomenon. From the maps presented in the previous section, the tropical Pacific does not stand out as a region where skill is improved. Still, dots showing significant differences are visible in the equatorial Pacific (Figs. 3 and 4). To evaluate the changes occurring in this highly relevant region, we look more specifically at the skill of the Niño-3.4 index (SST averaged over 5°S–5°N, 190°–240°E), which is generally used to assess the skill of a system to forecast ENSO. Figure 5 shows that the resolution increase does improve the ENSO skill. The skill increase occurs for both summer and winter seasons and during the whole forecast length, although the increase is stronger in summer. Increasing only the oceanic resolution already improves the ENSO skill and the benefit is reinforced when the resolution of both components is increased, which is consistent with the strong coupled nature of the ENSO phenomenon. This improvement is robust when considering both deterministic scores such as correlation with different observational products (Figs. 5a,e), RMSE (Figs. 5b,f), and probabilistic skill scores such as the Brier score (Figs. 5c,d,g,h). The Brier score is smaller (therefore better) in all cases considered in HRes and/or IRes with respect to SRes, due to improvements in both resolution and reliability components, except for the DJF Niño-3.4 index reaching above the second tercile. In this case (Fig. 5h) the reliability is better and the reliability curve is closer to the perfect reliability diagonal, but the resolution score decreases. As shown in Table 1, the increase of atmospheric and/or oceanic resolution is not always associated with the improved representation of ENSO (MacLachlan et al. 2015; Jia et al. 2015; Zhu et al. 2015). 
From this small number of studies using different atmospheric and oceanic resolutions and grid ratios (Table 1), it is hard to extract one common argument explaining why the skill increases in some models and not in others. These improvements might be linked to the mean state improvement in the tropical Pacific (Figs. 1 and 2) and also to the better representation of high-frequency and small-scale coupled processes associated with increased horizontal and vertical resolution (Masson et al. 2012). To better understand why ENSO is affected by resolution changes, other sensitivity experiments would be needed, including seasonal forecasts and preindustrial control simulations with different horizontal and vertical resolutions in the ocean and atmosphere, and in particular an experiment with resolution increased only in the atmosphere component. The robustness of these results must also be assessed with larger ensembles and more start dates, since 17 start dates and 10 members are relatively limited.
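The reliability and resolution components of the Brier score mentioned above can be illustrated with a minimal sketch. It assumes that the forecast probability of the upper-tercile event is the fraction of ensemble members exceeding the model's own upper-tercile threshold, and it uses synthetic data; the function name, the thresholding choice, and the data are illustrative assumptions, not the procedure used for Fig. 5.

```python
import numpy as np

def brier_decomposition(prob, occurred):
    """Return (brier, reliability, resolution, uncertainty).

    prob     : forecast probabilities taking a finite set of values (e.g. k/10)
    occurred : 1.0 where the event was observed, 0.0 otherwise
    Binning by each distinct probability makes the identity
    brier = reliability - resolution + uncertainty exact.
    """
    prob = np.asarray(prob, dtype=float)
    occurred = np.asarray(occurred, dtype=float)
    n = prob.size
    base_rate = occurred.mean()
    reliability = 0.0
    resolution = 0.0
    for p in np.unique(prob):
        in_bin = prob == p
        n_k = in_bin.sum()
        obs_freq = occurred[in_bin].mean()
        reliability += n_k * (p - obs_freq) ** 2          # smaller is better
        resolution += n_k * (obs_freq - base_rate) ** 2   # larger is better
    reliability /= n
    resolution /= n
    uncertainty = base_rate * (1.0 - base_rate)
    brier = np.mean((prob - occurred) ** 2)
    return brier, reliability, resolution, uncertainty

# Synthetic example (assumption): 17 start dates, 10 members, noisy ensemble.
rng = np.random.default_rng(0)
obs_index = rng.normal(size=17)
ens_index = obs_index[:, None] + rng.normal(scale=0.8, size=(17, 10))

# Event: index above the upper tercile, with separate model and observed thresholds.
prob_event = (ens_index > np.quantile(ens_index, 2.0 / 3.0)).mean(axis=1)
obs_event = (obs_index > np.quantile(obs_index, 2.0 / 3.0)).astype(float)

bs, rel, res, unc = brier_decomposition(prob_event, obs_event)
```

Because the Brier score equals reliability minus resolution plus uncertainty, a case such as Fig. 5h, where reliability improves while the resolution term decreases, can leave the total score unchanged or worse.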
The skill of the Atlantic Niño and the West African monsoon does not increase with model resolution (Fig. S5). This result might be expected for several reasons. First, in the literature the tropical Atlantic and surrounding precipitation have not been highlighted as affected by resolution changes (see Table 1). Second, the strongly biased tropical Atlantic mean state (Fig. 1) is not affected by the resolution increase, which suggests that no change in the representation of this region occurs when the resolution increases. Exarchou et al. (2016, manuscript submitted to Climate Dyn.) investigate in detail the mechanisms leading to the formation of the biases in the EC-Earth 3 model in this region and conclude that biases are associated with the misrepresentation of cloud cover in the Angola–Benguela region and around the equator due to an erroneous subtropical overturning cell. Those two processes are not expected to be strongly impacted by resolution changes. As expected in the absence of improvement in the tropical Atlantic, the skill of the West African monsoon average precipitation is not clearly improved when the resolution is increased (Fig. S5).
2) Indian monsoon and Indian Ocean
Several studies reported an improved representation of the Indian monsoon with an increase of resolution (Table 1; Sperber et al. 1994; Lal et al. 1997; Gent et al. 2010; Delworth et al. 2012; Johnson et al. 2016). In addition, the monsoon and the monsoon onset date are known to be strongly influenced by ENSO (Boschat et al. 2011; Prodhomme et al. 2015); therefore improvement of ENSO representation might lead to improvement in the Indian monsoon. Figures 6a–c show three different monsoon indices: the Indian summer monsoon dynamical index (IMDI; zonal wind at 850 hPa averaged in the region 5°–15°N, 40°–80°E minus the zonal wind averaged in the region 20°–30°N, 70°–90°E; Wang et al. 2001), the extended Indian monsoon rainfall (5°–30°N, 70°–95°E), and the Indian summer monsoon rainfall (continental precipitation over the region 5°–30°N, 70°–95°E). All these indices show an improvement of skill with the increase of resolution at the beginning of the monsoon season. The improvement of early monsoon skill could be associated with an improvement of the forecast of the monsoon onset date, one of the most relevant variables of the monsoon. To confirm this hypothesis we have estimated the monsoon onset date based on the criteria of Wang and LinHo (2002), following the methodology described in section 2c. We obtain the results described in Table 5, which shows the estimated mean threshold, the mean, the standard deviation, and the ACC of the onset date in the different simulations. The three simulations show a relatively high correlation for this precipitation-based monsoon onset date, close to other studies of monsoon onset predictability (Alessandri et al. 2014). The skill of the onset date is increased in IRes and HRes compared to SRes, and the highest skill is obtained in the IRes reforecast; however, the changes are not significant at the 95% confidence level.
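As an illustration of how a box-difference index such as the IMDI defined above can be computed, the following minimal sketch averages the 850-hPa zonal wind over the two regions with cosine-latitude weights and takes the difference. The function names, the regular 1° grid, and the synthetic wind field are assumptions made for the example; the weighting actually used in the study is not specified in this section.

```python
import numpy as np

def box_mean(field, lat, lon, lat_bounds, lon_bounds):
    """Cosine-latitude-weighted mean of field[..., lat, lon] over a lat-lon box."""
    lat_sel = (lat >= lat_bounds[0]) & (lat <= lat_bounds[1])
    lon_sel = (lon >= lon_bounds[0]) & (lon <= lon_bounds[1])
    sub = field[..., lat_sel, :][..., lon_sel]
    # One weight per grid cell in the box, proportional to cos(latitude).
    weights = np.cos(np.deg2rad(lat[lat_sel]))[:, None] * np.ones(lon_sel.sum())
    return (sub * weights).sum(axis=(-2, -1)) / weights.sum()

def imdi(u850, lat, lon):
    """Zonal wind at 850 hPa over 5-15N, 40-80E minus that over 20-30N, 70-90E."""
    southern_box = box_mean(u850, lat, lon, (5.0, 15.0), (40.0, 80.0))
    northern_box = box_mean(u850, lat, lon, (20.0, 30.0), (70.0, 90.0))
    return southern_box - northern_box

# Synthetic field (assumption): westerlies of 8 m/s south of 17.5N, 3 m/s north of it.
lat = np.arange(-89.5, 90.0, 1.0)
lon = np.arange(0.5, 360.0, 1.0)
u850 = np.where((lat < 17.5)[:, None], 8.0, 3.0) * np.ones(lon.size)

index_value = imdi(u850, lat, lon)
```

The same helper applies unchanged to fields with leading time or member dimensions, since the average is taken over the last two axes only.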
To explore possible sources of the increase of monsoon skill, Figs. 6d–f show the skill of three indices known to influence the monsoon and especially the monsoon onset: the SST in the Arabian Sea (Levine and Turner 2012), in the western Indian Ocean (WIO; Prodhomme et al. 2014), and in the Indian Ocean basin (IOB; Boschat et al. 2011; Prodhomme et al. 2015). The skill of these three indices is slightly increased in IRes and HRes; as for the monsoon onset date, the skill is the highest in IRes. This last point requires deeper investigation. This increase of skill in the Indian Ocean and the increase of ENSO skill could explain the improvement of the monsoon predictability in IRes and HRes.
The results suggest that higher resolution in the ocean and/or atmosphere can improve the simulation of the Indo-Pacific SST, which seems to have a positive impact on the Indian monsoon predictability. To better understand the sources of skill coming from the resolution increase and disentangle the effects of large-scale and local processes, an assessment of the large-scale dynamics of the monsoon, vertical and horizontal wind shear in the Indian Ocean, north–south tropospheric temperature gradient, dynamical onset date (Xavier et al. 2007), and the local processes in the Indian Ocean, such as the intraseasonal oscillation (Goswami 2005), would be needed. The robustness of the small improvements of the onset date skill should also be tested by comparing different indices (Fasullo and Webster 2003; Xavier et al. 2007; Wang et al. 2009; Bombardi and Carvalho 2009).
b. Atmospheric circulation at midlatitudes
1) North Atlantic Oscillation
Impacts of increasing ocean and atmosphere resolution are assessed over the Northern Hemisphere midlatitudes both in terms of variability (Northern Hemisphere blocking index) and skill in forecasting interannual variations of the main climate indices (e.g., NAO).
The NAO accounts for a substantial part of the large-scale atmospheric circulation over the North Atlantic region at seasonal-to-interannual time scales. Several studies have evaluated the (generally limited) forecast skill of GCMs of the seasonal NAO index (e.g., Doblas-Reyes et al. 2003; Arribas et al. 2011; Kim et al. 2012). Recently, higher-resolution GCMs were found to have significant skill in forecasting the NAO (Scaife et al. 2014). Butler et al. (2016) showed by studying the Climate-System Historical Forecast Project database that models with a well-resolved stratosphere tend to better represent responses to ENSO and the quasi-biennial oscillation (QBO) in upper levels of the atmosphere, but that this does not necessarily translate into boreal winter NAO skill improvements. The NAO skill assessment is highly uncertain when using limited ensemble sizes and short reforecast periods (e.g., Shi et al. 2015). We therefore chose to evaluate the NAO reforecast skill using several NAO indices (pressure difference, Pmod and Pobs; see section 2c for more detail), and focused on both seasonal (months 2–4 average) skill and monthly skill up to four months lead.
NAO index correlation coefficients with respect to ERA-Interim data for boreal winter and summer in the three experiments studied here are shown in Fig. 7 for Pobs and Pmod NAO index calculations. Results from Fig. 7 can be summarized as follows. The NAO prediction skill in EC-Earth is quite poor with standard resolution over the 1993–2009 reforecast period for both boreal winter and summer. Increasing resolution leads to some improvements in NAO correlation, especially for boreal winter (Fig. 7b). For boreal summer (Fig. 7a) results are more contrasted, since for higher forecast times (those exceeding month 3) the NAO correlation drops when increasing resolution.
Results are similar for both methods shown in the figure, and are confirmed in the case of boreal winter by computing area-averaged sea level pressure differences as in Stephenson et al. (2006) (not shown). However, it is crucial to note that uncertainty intervals for these scores are very large and differences are often not significant.
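To give a sense of how wide such uncertainty intervals are for a 17-yr reforecast period, the following sketch computes an approximate 95% confidence interval for a correlation coefficient using the Fisher z transformation. It assumes independent years and approximately bivariate normal data; the function name and the example correlation of 0.5 are illustrative, and this is not necessarily the method used to derive the intervals in Fig. 7.

```python
import math

def correlation_confidence_interval(r, n, z_crit=1.959964):
    """Approximate confidence interval for a correlation r from n independent pairs.

    Uses the Fisher z transformation: atanh(r) is approximately normal
    with standard error 1 / sqrt(n - 3). z_crit = 1.959964 gives 95%.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Illustrative values (assumption): a correlation of 0.5 over 17 winters.
low, high = correlation_confidence_interval(0.5, 17)
```

Under these assumptions the interval for a correlation of 0.5 runs from just above zero to nearly 0.8, so differences of a few tenths between experiments are difficult to distinguish from sampling noise.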
2) Atmospheric blocking
Atmospheric blocking simulation is a common issue in state-of-the-art GCMs. The most recent generation of GCMs still exhibits large biases, especially over the European region (Anstey et al. 2013). To evaluate the winter atmospheric blocking variability, the instantaneous blocking index computed as in Davini et al. (2012) is hereafter used.
Figure 8 shows the average bias for the full winter season (November–February) for the SRes, IRes, and HRes experiments. SRes shows limited bias over both North Atlantic and Pacific; however, the classic negative bias over Europe is clearly present. Increasing the oceanic resolution does not lead to any evident improvement; conversely, increasing the atmospheric resolution leads to a reduction of the bias over Europe and the North Pacific. This is in agreement with experiments showing a notable improvement of blocking over the Euro-Atlantic region following an increase in horizontal resolution (e.g., Matsueda et al. 2009; Jung et al. 2012; Dawson et al. 2012). Indeed, there is mounting evidence that reduced Euro-Atlantic blocking bias is associated with better-resolved transient eddy activity, which can sustain blocking persistence (Berckmans et al. 2013), and with a higher orography variance (especially relevant over the Rocky Mountains), which affects the mean tilt of the Atlantic eddy-driven jet, favoring higher geopotential height values over Europe (Jung et al. 2012; Berckmans et al. 2013).
On the other hand, the lack of improvements following an increase in oceanic resolution is likely associated with the persistent SST biases in the North Atlantic region, which can be seen in both SRes and IRes (Fig. 2). Although minor improvements can be observed, there are only small changes in the SST midlatitude frontal zone (and in general over the North Atlantic) between the low and high oceanic resolution experiments. This partially contrasts with the work of Scaife et al. (2011), where increased oceanic resolution leads to improved blocking. However, we must keep in mind that the November initializations and the short duration of the current simulations certainly play a significant role. They keep both models far from their own attractor, potentially reducing the benefits of a better resolved oceanic circulation.
Regarding the predictive skill, an extremely weak signal is found (not shown). Greenland blocking shows weak skill in agreement with results for the NAO, but nothing significant emerges over Europe. Considering the improvement in the mean state for the HRes experiments, it is likely that a larger number of ensemble members will be needed to achieve forecast skill for blocking over Europe.
c. Polar regions
Seasonal sea ice prediction is a growing field of research (Chevallier et al. 2013; Wang et al. 2013). In the Arctic, forecasting the sea ice conditions a few months in advance is of great strategic relevance (Emerson and Lahn 2012; U.S. Navy 2012). As an example of response to these challenges, the Sea Ice Prediction Network (SIPN; www.arcus.org/sipn) has collected, each year since 2008, submissions from various groups using a range of methods in order to predict the upcoming summer sea ice conditions in the Arctic. Predicting the seasonal development of Antarctic sea ice is arguably less interesting from an end-user perspective, especially over the winter season, but is still challenging from a scientific point of view given the recent chain of high records of sea ice extent in 2012 (Turner et al. 2013), 2013 (Reid et al. 2015), and 2014 (Massonnet et al. 2015b). The present section attempts to assess systematically how resolution can impact the skill and biases of sea ice seasonal prediction.
We focus here on the fourth month of prediction (August and February), as the sea ice cover exhibits strong seasonality, reaching its maximum and minimum extents in late local winter and summer, respectively. (Note that September is conventionally preferred to August, but our simulations do not extend that far.) We define sea ice extent as the cumulative area of ocean grid cells where sea ice concentration exceeds 15%, and compute this diagnostic from the monthly-mean sea ice concentration field, for various sectors of the planet.
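The sea ice extent definition above can be sketched as follows. The function name, the argument layout, and the optional sector mask are assumptions made for illustration; concentration is taken as a fraction between 0 and 1, and the result has the units of the supplied grid cell areas.

```python
import numpy as np

def sea_ice_extent(concentration, cell_area, threshold=0.15, sector_mask=None):
    """Total area of grid cells whose sea ice concentration exceeds the threshold.

    concentration : array (..., ny, nx), monthly-mean fraction in [0, 1]
                    (NaN over land compares as False and is therefore excluded)
    cell_area     : array (ny, nx), area of each grid cell
    sector_mask   : optional boolean array (ny, nx) selecting a region
    """
    covered = concentration > threshold
    if sector_mask is not None:
        covered = covered & sector_mask
    return np.where(covered, cell_area, 0.0).sum(axis=(-2, -1))

# Tiny example (assumption): four cells of unit area.
conc = np.array([[0.10, 0.20],
                 [0.90, 0.15]])
area = np.ones((2, 2))
extent = sea_ice_extent(conc, area)  # only the 0.20 and 0.90 cells exceed 15%
```

Note that extent counts the full area of every cell above the threshold, unlike sea ice area, which would weight each cell by its concentration.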
We note first that changing the resolution does not have an obvious impact (positive or negative) on model bias (Fig. S6). This bias is systematically larger between any experiment and the observations than the difference between any two experiments, for all sectors and start dates considered. The results should thus be viewed as “conservative” in the sense that high-resolution does not degrade existing model biases; it remains to be established whether fine tuning can eventually reduce them. In either case, the message is straightforward: resolution alone does not spontaneously improve the mean state of simulated Arctic and Antarctic sea ice, at least at the seasonal time scale. Improving the estimation of ocean–sea ice initial conditions would likely play a more important role to reduce basinwide and regional biases, as suggested by recent studies (Guemas et al. 2016; Massonnet et al. 2015a).
We then consider the ability of the model to predict interannual variations of August sea ice extent. Because of secular negative (positive) trends in sea ice extent in the Arctic (Antarctic), we focus on detrended time series to avoid overinterpretation of model skill. Figure 9 reports correlation of detrended time series of sea ice extent in August in several sectors of the Arctic and Southern Oceans for the three experiments SRes, IRes, and HRes, and using five sea ice concentration products as observational references. This makes a total of 40 triplets of correlations (120 correlations). We find the HRes experiment to rank highest 19 times. If all tests were independent from each other (which is questionable since observations and sea ice extent in sectors are well correlated with one another), a result as extreme as this one could happen 4.5% of the time just by chance. An inspection of actual time series of sea ice extent (Fig. S6) reveals that correlation changes are the consequence of hardly noticeable changes in the time series. Using alternative observational products alters the estimated correlations, by amounts sometimes as large as differences across experiments themselves, but preserves rankings anyway (Fig. 9). If not spectacular, the overall improvement of the scores when resolution increases, consistent among observational products, is a reassuring result. However, the differences in correlation between HRes and SRes are not significant at the 95% confidence level in most cases.
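The effect of detrending on the correlation can be illustrated with a minimal sketch. It assumes that a least-squares linear trend is removed from each series before correlating; the detrending method used for Fig. 9 is not specified in this section, and the function names and synthetic series are illustrative.

```python
import numpy as np

def detrend(series):
    """Remove the least-squares linear trend from a 1D time series."""
    series = np.asarray(series, dtype=float)
    t = np.arange(series.size)
    slope, intercept = np.polyfit(t, series, 1)
    return series - (slope * t + intercept)

def detrended_correlation(forecast, observed):
    """Pearson correlation of the two series after linear detrending."""
    return np.corrcoef(detrend(forecast), detrend(observed))[0, 1]

# Synthetic example (assumption): two series sharing a downward trend
# but with independent interannual variability.
rng = np.random.default_rng(1)
years = np.arange(17)
forecast = -0.5 * years + rng.normal(size=17)
observed = -0.5 * years + rng.normal(size=17)

raw_corr = np.corrcoef(forecast, observed)[0, 1]
detrended_corr = detrended_correlation(forecast, observed)
```

In this synthetic case the raw correlation is dominated by the common trend, whereas the detrended correlation reflects only the year-to-year variations, which is the quantity of interest when assessing interannual skill.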
A similar analysis conducted for the target month of February (not shown here) reveals a very different picture. The SRes run ranks highest in 20 cases out of 40, which would have happened 2.1% of the time by chance. By a symmetrical argument, we conclude that using a higher-resolution but untuned configuration degrades the ability of the model to forecast sea ice extent variations. Sea ice extent in the Arctic in winter is largely controlled by oceanic heat convergence (Bitz et al. 2005), and we hypothesize that a simple change in resolution is not sufficient to transport the adequate amounts of heat northward, at least when the same parameterizations as the ones for the SRes configuration are used.
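The chance probabilities quoted for the rankings can be approximated with a simple binomial model. The sketch below assumes that, in each of the 40 comparisons, each of the three experiments is equally likely to rank highest and that the comparisons are independent; this is an assumption about how such a figure may be obtained, not a description of the calculation performed in the study.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of at least k successes in n independent trials of probability p."""
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(k, n + 1))

# One experiment out of three ranking highest by chance: p = 1/3 per comparison.
p_hres_august = prob_at_least(19, 40, 1.0 / 3.0)    # HRes highest at least 19 times
p_sres_february = prob_at_least(20, 40, 1.0 / 3.0)  # SRes highest at least 20 times
```

Both tail probabilities are of the order of a few percent and can be compared with the 4.5% and 2.1% quoted in the text. Since the comparisons are in practice correlated, as noted above, the true chance probabilities are likely larger.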
When considering both the mean state and variability, it is apparent that our three simulations resemble each other more than they resemble observations. This indicates that resolution alone cannot fully address longstanding biases, and that other ongoing developments (modeling, new parameterizations, generation of better initial conditions) have to come into play.
In the present study, we have compared three different simulations run with the same version of the EC-Earth coupled model. The simulations differ only in their oceanic and atmospheric resolutions. This unique set of simulations offers the possibility to investigate systematically the benefits of resolution for seasonal climate prediction. It is not possible to extract a common conclusion for all regions, seasons, variables, and lead times considered in this paper, as results vary highly depending on the process investigated. We first summarize cases where resolution is clearly beneficial, then examples where it leaves baseline prediction capabilities unchanged (or even deteriorates them), and finally cases for which the assessment is uncertain.
The increase of oceanic and/or atmospheric resolution in EC-Earth 3 generally leads to an improved representation of the mean state of the reforecast. The SST is more sensitive to resolution increase in the ocean model, whereas atmospheric variables (precipitation, near-surface wind, T2M over land) are improved only if atmospheric resolution is also increased. Some improvement of the mean state, especially in the tropical Indo-Pacific, is associated with an improvement of skill in this region for both SST (Niño-3.4, IOB, western Indian Ocean, and Arabian Sea) and monsoon simulation (onset date, dynamical index, and rainfall). Some minor improvements are also visible in the sea ice and NAO predictability, as well as the representation of the blocking indices. All those improvements, summarized in Table 6, suggest that, despite the absence of specific tuning, increasing the resolution is a way forward to improve both the simulation of the mean climate and the predictability of high-impact phenomena. Other phenomena have been shown to be improved with resolution increase and have not been assessed in this study, such as tropical cyclones (Caron et al. 2011; Manganello et al. 2016) or polar jet (Guemas and Codron 2011; Hourdin et al. 2013).
However, despite these improvements, the SST skill is decreased in the high-resolution simulation over more than half of the ocean surface. This is especially true in the tropical Atlantic, where both mean state and skill are deteriorated in the high-resolution runs compared to the standard resolution. Most coupled models are strongly biased in the Atlantic (Toniazzo and Woolnough 2014); the PREFACE (Enhancing Prediction of Tropical Atlantic climate and its Impacts) EU FP7 project, which partly funded this study, aims to better understand the reasons underlying the formation of those biases. Our results illustrate the limitations to increasing resolution as a means of improving seasonal forecasts, particularly in the case of the tropical Atlantic. In the scope of the project, different studies are in preparation tackling these biases with different approaches, such as wind stress and flux corrections, assessment of the daily evolution of the model drift in seasonal reforecasts, and CMIP5 multimodel analysis. It is essential to study together seasonal forecast and long-term simulations to better understand how both systematic error and variability of the long run could be related to seasonal forecast drift and skill.
Some hints of improvement of the representation of the North Atlantic Oscillation and blocking indices have been found when the resolution is increased; however, the limited ensemble size and number of start dates considered here are not sufficient to assess robustly the changes in the forecasting skill of these phenomena. These considerations open a common debate in the seasonal forecasting community: whether resources should be used to increase resolution or ensemble size. Running the HRes experiment is approximately 19 times more expensive than running the SRes experiment. In addition, other strategies, such as introducing stochastically perturbed parameterization tendencies (SPPT) in the atmosphere, could lead to an increase of ENSO skill comparable to the one obtained in HRes (Batté and Doblas-Reyes 2015). It is legitimate to ask if such an expensive experiment is justified given the amplitude of the observed improvements. To answer this question it is worth investigating whether a tuning equivalent to the one performed for the low-resolution configuration would help the HRes version significantly improve its performance. This would require a substantial effort in tuning the high-resolution version of EC-Earth, ideally following an objective framework applied to both model resolutions (Bellprat et al. 2012).
As the horizontal resolution increases in both the ocean and the atmosphere, GCMs move closer to the so-called “grey zone” of spatial scales for which classic physical parameterizations may no longer be suitable. As shown by Jung et al. (2012), an upper limit to improvements due to increasing resolution in current-generation GCMs is to be expected, and research is now underway to design scale-aware parameterizations that could help push back limitations in climate forecasting, provided that appropriate computational resources are available (e.g., Stan and Xu 2014; Campin et al. 2011). At the spatial resolutions considered in this study (from ~1° to ~0.25°), model biases and forecast skill are in general only moderately affected by changes in resolution alone, indicating that further process-based tuning will be necessary for the newly developed high-resolution versions of the model. Yet, in some particular cases (ENSO prediction, Indian Ocean), these simple changes in resolution are sufficient to significantly advance the forecast system capabilities. Developing and tuning global forecast systems at higher resolutions is thus of utmost importance for pushing the boundaries of seasonal prediction.
The research leading to these results has received funding from the EU Seventh Framework Programme FP7 (2007–2013) under Grant Agreements 308378 (SPECS), 603521 (PREFACE), and 607085 (EUCLEIA), the Horizon 2020 EU program under Grant Agreements 641727 (PRIMAVERA) and 641811 (IMPREX), and the ESA Climate Change Initiative (CCI) Living Planet Fellowship VERITAS-CCI. We acknowledge PRACE for awarding access to Marenostrum3 based in Spain at the Barcelona Supercomputing Center through the HiResClim project. We acknowledge the work of the developers of the s2dverification R-based package (http://cran.r-project.org/web/packages/s2dverification/index.html) and autosubmit workflow manager (https://pypi.python.org/pypi/autosubmit/3.5.0). Paolo Davini acknowledges the funding from the European Union’s Horizon 2020 research and innovation programme COGNAC under the European Union Marie Sklodowska-Curie Grant Agreement 654942.
Supplemental information related to this paper is available at the Journals Online website: http://dx.doi.org/10.1175/JCLI-D-16-0117.s1.