While fully coupled atmosphere–ocean models have been used to study the seasonal predictability of sea ice variations within the context of models’ own variability, their capability in predicting the observed sea ice at seasonal time scales is not well assessed. In this study, sea ice predictions from the recently developed NCEP Climate Forecast System, version 2 (CFSv2), a fully coupled atmosphere–ocean model including an interactive dynamical sea ice component, are analyzed. The focus of the analysis is the performance of CFSv2 in reproducing the observed Northern Hemisphere sea ice extent (SIE). The SIE climatology, long-term trend, interannual variability, and predictability are assessed. CFSv2 contains systematic biases that depend more on the forecast target month than on the initial month, with a positive SIE bias for forecasts for January–September and a negative SIE bias for forecasts for October–December. A large source of seasonal prediction skill is the long-term trend, which is underestimated in CFSv2. Prediction skill for interannual SIE anomalies is found to be confined primarily to the first three target months and is largest in the summer and early fall. The performance of the prediction of sea ice interannual variations varies from year to year and is found to be related to initial sea ice thickness. Potential predictability based on the forecast ensemble, its dependence on model deficiencies, and implications of the results of this study for improvements in seasonal sea ice prediction are discussed.
Sea ice plays an important role in the global climate system through its impacts on the high-latitude energy and water balance. Variability and recent trends of sea ice have been shown to reflect two-way interactions between the atmosphere and ocean. On one hand, the long-term trend in sea ice during the past decades has been attributed to external forcing associated with increasing greenhouse gases (Stroeve et al. 2007; Min et al. 2008; Kay et al. 2011), and interannual variations of sea ice are found to be related to atmospheric climate variability (Ogi and Wallace 2007; Kay et al. 2008; L’Heureux et al. 2008; Graversen et al. 2011; Zhang et al. 2008a). On the other hand, atmospheric anomalies have been shown to be associated with sea ice variability at interannual (Francis et al. 2009; Balmaseda et al. 2010; Kumar et al. 2010) and decadal (Alexander et al. 2004; Deser et al. 2007) time scales. More recently, high-latitude amplification of temperature trends has been attributed to the recent decline in sea ice (Serreze et al. 2009; Screen and Simmonds 2010). In addition, the Pacific Ocean inflow is suggested to have contributed to the sea ice loss (Shimada et al. 2006). On longer time scales, the freshwater flux associated with sea ice melt has been linked to variability in high-latitude ocean circulation [e.g., the Atlantic meridional overturning circulation (AMOC); Mahajan et al. 2011]. These results indicate that sea ice variability arises from interactions between the atmosphere and ocean, and therefore its simulation, prediction, and projection require the use of fully coupled atmosphere–ocean systems.
With advances in coupled models, and the recent decline in the sea ice over the Arctic leading to the possibility of open oceans during boreal summer, prediction and predictability of sea ice have received increasing attention during the past several years. In this context, fully coupled atmosphere–ocean general circulation models (GCMs) have been used to analyze the sea ice predictability in several studies and the source of the predictability is attributed to change in the external radiative forcing, sea ice thickness, and high-latitude ocean temperatures (Holland et al. 2011, Holland and Stroeve 2011; Blanchard-Wrigglesworth et al. 2011a,b).
For predictions of the observed sea ice on seasonal and interannual time scales the analysis is mostly based on statistical prediction methods (Drobot and Maslanik 2002; Drobot et al. 2006; Tivy et al. 2007; Lindsay et al. 2008). Such methods, however, cannot take into account the coupled interactions and feedbacks between sea ice and the atmosphere. An alternate method of predicting the future state of sea ice on seasonal time scales is the use of an ocean-only model forced with atmospheric fields from previous years (Zhang et al. 2008b), but this method also ignores the interactions between the atmosphere and ocean and between the atmosphere and sea ice.
In this study, we go beyond the use of empirical prediction or ocean-only models and utilize a viable alternative for sea ice prediction: a fully coupled GCM. We assess predictions of Arctic sea ice extent (SIE) from the recently developed National Centers for Environmental Prediction (NCEP) coupled Climate Forecast System, version 2 (CFSv2). The analysis is based on a 10-month retrospective forecast (or hindcast) ensemble of 16 members initialized for each month from December 1981 to November 2007. The following aspects related to sea ice prediction and predictability are analyzed: (i) overall skill in forecasting sea ice extent, (ii) the capability of the forecast system in maintaining the observed long-term trend and capturing the interannual variability, (iii) the seasonal dependence of the forecast skill, and (iv) the potential predictability of the sea ice extent within the forecast system and its relationship with the representation of the internal variability. In addition, the mean bias of the forecast system and its impact on the skill are also discussed.
2. The forecast model and data
The CFSv2 model consists of fully coupled components of the ocean, atmosphere, and land (Saha et al. 2012, manuscript submitted to J. Climate). The atmospheric component is the NCEP Global Forecast System (GFS) at a horizontal resolution of T126 (~100 km) with 64 vertical levels extending from the surface to 0.26 hPa. The oceanic component is the Geophysical Fluid Dynamics Laboratory (GFDL) Modular Ocean Model, version 4 (MOM4), which uses 40 levels in the vertical, a zonal resolution of 0.5°, and a meridional resolution of 0.25° between 10°S and 10°N, gradually increasing through the tropics until becoming fixed at 0.5° poleward of 30°S and 30°N. The ocean component includes a thermodynamic and dynamic sea ice model from the GFDL Sea Ice Simulator (Griffies et al. 2004). The sea ice model has two layers for sea ice and one layer for snow. In each ice grid cell there are five possible sea ice thickness categories: 0–0.1, 0.1–0.3, 0.3–0.7, 0.7–1.1, and >1.1 m. The sea ice dynamics is based on Hunke and Dukowicz (1997), using the elastic–viscous–plastic technique to calculate ice internal stress, while the sea ice thermodynamics is based on Winton (2000).
The CFSv2 hindcasts were initialized from the Climate Forecast System Reanalysis (CFSR; Saha et al. 2010). The CFSR is the latest version of the NCEP climate reanalysis with the first guess from a coupled atmosphere–ocean model that is similar to CFSv2 model with a higher horizontal resolution of T382 (~38 km). The CFSR assimilates available in situ and satellite observations. For sea ice, only the sea ice concentration from satellite observations is assimilated. Sea ice thickness and its horizontal displacement are determined by the thermodynamic and dynamic balance between the sea ice and the overlying atmosphere and underlying ocean water. The input satellite sea ice concentration observations are from different sea ice analyses. From January 1979 to December 1996, observed sea ice data from Goddard Space Flight Center (GSFC) sea ice analysis (Cavalieri 1994; Cavalieri et al. 1996) was assimilated. From January 1997 to October 2007, an offline NCEP sea ice analysis was assimilated. After October 2007, the NCEP operational sea ice analysis was used. The use of different satellite retrievals in CFSR has caused an upward SIE jump in 1997 (Wang et al. 2011), and, as to be shown, resulted in a weaker downward SIE trend in the CFSv2 hindcasts initialized from the CFSR. More details on the assimilation of sea ice concentration can be found in Saha et al. (2010).
For each year in the analysis period, CFSv2 forecasts were produced every five days, on 1, 6, and 11 January and so on through 27 December, with 29 February excluded. Further, four forecast runs, from 0000, 0600, 1200, and 1800 UTC, were available for each initial day. In this study, the seasonal forecasts initialized from December 1981 to November 2007 are analyzed. For each initial month, an ensemble of 16 runs taken from the last four initial dates of the month is used. For example, forecasts from 16, 21, 26, and 31 January, with four runs from each day, form the January ensemble for the subsequent 10 target months. Following this example, forecasts from January initial conditions for target months of February, March, April, etc., are referred to as forecasts at lead times of zero months, one month, two months, and so on. Forecasts used for the analysis are monthly means of the 16-member ensemble.
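The lead-time bookkeeping described above can be sketched as a small helper function. This is a hypothetical illustration of the convention, not part of CFSv2 itself; months are assumed to be numbered 1–12:

```python
def lead_time(init_month, target_month):
    """Lead time (in months) of a forecast initialized in init_month
    (1-12) for a given target_month (1-12), under the convention that
    the month immediately following the initial month is lead 0."""
    return (target_month - init_month - 1) % 12

# January (1) initial conditions: February is lead 0, March is lead 1,
# and the last of the 10 target months, November, is lead 9.
```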
The focus of our analysis is the prediction and predictability of sea ice extent in the Northern Hemisphere. SIE is defined as the total area of grid boxes where monthly mean sea ice concentration is greater than 15%. For the assessment of the CFSv2 performance, GSFC analysis of sea ice concentration (Cavalieri 1994; Cavalieri et al. 1996) archived by the National Snow and Ice Data Center (NSIDC) is used as the observed truth. The original NSIDC data on a 25 km × 25 km grid were interpolated to the 0.5° × 0.5° CFSv2 ocean grid in the mid- to high latitudes. A land–sea 0.5° × 0.5° mask that gives a common ocean coverage between the NSIDC data and CFSv2 ocean component is used for computing SIE.
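The SIE definition above can be written compactly. The sketch below is a minimal illustration assuming masked concentration and grid-cell area arrays on a common grid (variable names are illustrative):

```python
import numpy as np

def sea_ice_extent(concentration, cell_area, threshold=0.15):
    """Sum the areas of grid cells whose monthly mean sea ice
    concentration exceeds the 15% threshold."""
    return float(cell_area[concentration > threshold].sum())

# Toy 2x2 grid: only cells with concentration above 15% count toward SIE
conc = np.array([[0.90, 0.10],
                 [0.50, 0.16]])
area = np.full((2, 2), 625.0)  # e.g., 25 km x 25 km cells, in km^2
```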
We first compare the SIE climatology between the observations and forecasts. The prediction and predictability of SIE are then analyzed, followed by an analysis of spatial distributions of long-term trend and interannual anomalies for two pairs of adjacent years. SIE climatology is defined as the 26-yr average and anomalies are departures from the climatology.
a. SIE climatology
Mean SIE biases of the CFSv2 forecast are shown in Fig. 1a (Fig. 1b) as a function of initial (target) month and lead time. Overall, forecast errors depend on target months more strongly than initial months beyond the lead time of 2–3 months. The amplitude of the forecast bias varies with lead time, with a maximum at the lead time of 0–2 months for January–March, and 3–8 months for April–June (Fig. 1b). Seasonally, the largest errors are found at 3–6-month lead for the forecast for April–June and October–December. As shown in Fig. 1c, the CFSv2 forecast at 5-month lead has a positive bias in the prediction for January–September and a negative bias for October–December.
Spatial distributions of average sea ice concentration for the 5-month-lead forecast and for 4 target months (March, June, September, and December) are shown in Fig. 2 with shadings, while the observed sea ice concentration of 15% is shown with green curves. The CFSv2 captures the general features of the seasonal sea ice retreat in summer and fall and expansion in winter and spring. However, the forecast contains substantial systematic errors. There is a negative bias around the Bering Strait in December and March, which persists from November to April (not shown). In December, as well as in November (not shown), there is also a negative bias over the Hudson Bay and Davis Strait. In March and June, there is a positive bias in the Labrador Sea and Greenland Sea. The positive bias in the Labrador Sea exists during February–June, and the positive bias in the Greenland Sea can be seen throughout the year (not shown). While these biases reflect the mean differences between the forecast and observation, they may also affect the model’s ability to predict interannual variability. For example, the missing sea ice near the Bering Strait in the December forecast at 5-month lead time (Fig. 2d) indicates that the forecast system will be unable to reproduce the observed interannual variations there.
b. SIE prediction
Variations in the observed sea ice anomalies include two components: the long-term trend and the interannual variability. To assess the model’s performance in predicting each component, we define a linear trend for the observation and prediction anomalies, respectively. Detrended anomalies are then computed as departures from the linear trend. In addition, a time series of year to year (Y2Y) changes is also analyzed. The Y2Y change for year n is taken as SIE of year n + 1 minus SIE of year n. The calculation of the Y2Y change effectively removes long-term trend even if it is not linear in time.
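The two detrending approaches above can be sketched as follows; this is a minimal illustration operating on an annual SIE time series (function names are illustrative):

```python
import numpy as np

def detrended_anomalies(sie):
    """Departures of an annual SIE series from its fitted linear trend."""
    years = np.arange(len(sie))
    slope, intercept = np.polyfit(years, sie, 1)
    return sie - (slope * years + intercept)

def y2y_change(sie):
    """Y2Y change for year n: SIE(n+1) - SIE(n); this removes the
    long-term trend even when it is not linear in time."""
    return np.diff(sie)
```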
1) Predicted SIE anomalies
Observed March SIE anomalies and the corresponding forecasts at 0-, 2-, and 5-month lead times are shown in Fig. 3 for total anomalies (Fig. 3a), detrended anomalies (Fig. 3b), and Y2Y change (Fig. 3c). The anomalies for September are shown in Fig. 4. The CFSv2 produced downward trends as in the observation. However, the forecasted trends from CFSv2 are weaker than the observed for both March (Fig. 3) and September (Fig. 4). For March (September), in units of 10⁶ km² decade⁻¹, the observed linear trend is −0.52 (−0.82), compared to CFSv2 forecast trends of −0.31 (−0.46), −0.33 (−0.48), and −0.33 (−0.37) at 0-, 2-, and 5-month lead times. That the weaker forecast trends are seen even at the shortest lead time (L0) suggests that they are probably related to initialization errors in CFSR, which contains an unrealistic upward SIE jump in 1997 due to the change in satellite retrieval (Wang et al. 2011). Consistent with the weaker forecast trends, the total anomalies are also generally smaller than observed before 1997 and larger than observed after 1997.
This systematic change in the observation–forecast contrast is likely due to the use of different sea ice analyses in CFSR before and after 1997 (Saha et al. 2010), making it difficult to effectively remove the trend in the forecast. Consequently, the detrended forecast anomalies (Figs. 3b and 4b) may still include the component that depends on the change in CFSR initial conditions in 1997. Additional discussions on the impacts of the use of different sea ice analyses in CFSR can be found in Wang et al. (2011).
In terms of characterizing interannual variations, the Y2Y change (Figs. 3c and 4c) is more representative than the detrended anomalies. Generally, CFSv2 is unable to forecast interannual anomalies (detrended anomalies and Y2Y changes) beyond a 2-month lead. Forecast accuracy also varies with time. For example, the Y2Y changes of September SIE are quite realistic even at L5 for 1989–93, but the forecast of the 2006/07 Y2Y change did not capture the sharp drop even at L2.
2) SIE prediction skill
The SIE prediction skill of CFSv2 is assessed based on mean anomaly correlation coefficient (ACC) and root-mean-square error (RMSE). Figure 5 shows the ACC and RMSE as a function of target month (x axis) and lead time (y axis). The level of significance is estimated based on the Monte Carlo approach whereby ACC and RMSE are computed after randomizing the forecasts. This procedure is repeated 1000 times, and the significance is defined as the fraction of times the actual forecast ACC (RMSE) is greater (less) than ACC (RMSE) achieved with the randomized set.
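The Monte Carlo test can be sketched as below. Here "randomizing the forecasts" is assumed to mean shuffling the forecast series in time, which is one common implementation; the function names are illustrative:

```python
import numpy as np

def acc(forecast, obs):
    """Anomaly correlation coefficient between two time series."""
    f = forecast - forecast.mean()
    o = obs - obs.mean()
    return float((f * o).sum() / np.sqrt((f ** 2).sum() * (o ** 2).sum()))

def mc_significance(forecast, obs, n_trials=1000, seed=0):
    """Fraction of shuffled-forecast trials whose ACC falls below the
    actual forecast ACC; a fraction near 1 indicates significant skill."""
    rng = np.random.default_rng(seed)
    actual = acc(forecast, obs)
    hits = sum(acc(rng.permutation(forecast), obs) < actual
               for _ in range(n_trials))
    return hits / n_trials
```

The RMSE version is analogous, counting trials in which the shuffled RMSE exceeds the actual RMSE.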
There is a strong seasonal dependence of the forecast skill of total anomalies (Fig. 5a). The RMSE is larger for warm seasons (June–October) than for cold seasons (November–May). This is probably related to the seasonal variation of interannual variability, which has larger amplitude in the warm season, as will be discussed later. At longer lead times (L > 2 months), the ACC for December–March is higher than that for June–October. At short lead times (L < 2), the ACC for April–October is higher than that for December–February. This seasonal variation of forecast skill suggests that different factors control prediction skill in different seasons. It is also noticed that for January and February forecasts, the ACC (RMSE) at a 0-month lead is even lower (larger) than that at longer leads (Figs. 5a,e). This is probably due to errors in the initialization with CFSR; it takes a month or so for the model to reach a balance among the interacting components.
The ACC of detrended anomalies and Y2Y changes is much smaller than that of total anomalies, indicating that the skill of seasonal forecast is strongly affected by the trend, consistent with the result of Lindsay et al. (2008) who showed the dominance of the trend over the detrended interannual variations based on statistical methods for September SIE prediction. The impact of long-term trend related to climate forcing is also discussed in Blanchard-Wrigglesworth et al. (2011b). The seasonality of the forecast skill is similar between detrended anomalies and Y2Y changes. The skill of detrended forecast is slightly higher than skill for the Y2Y changes, possibly because, while the Y2Y purely represents the year-to-year variability, the linearly detrended anomalies contain the year-to-year variability as well as the nonlinear part of the trend that is not removed by the linear detrending. If the nonlinear part of the trend is captured by the forecast system to some extent, the forecast skill of the detrended anomalies will be higher than the skill for Y2Y change. Both detrended and Y2Y changes show relatively larger ACC for August–October at lead time of 1–2 months.
For comparison, prediction skill of damped persistence forecasts is given in Fig. 6. The damped persistence forecast for each target month is taken as the observed monthly mean anomaly of the initial month scaled by lagged correlation between the initial and target months. For example, the damped persistence forecast for September at a 2-month lead is the observed anomaly of June multiplied by the lagged anomaly correlation coefficient between June and September. In this study, damped persistent forecasts using NSIDC data from December 1981 to November 2007 as the initial observations are used.
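The damped persistence forecast described above can be sketched as follows (variable names are illustrative; the lagged correlation is computed over the hindcast years):

```python
import numpy as np

def damped_persistence(init_anom, init_series, target_series):
    """Forecast the target-month anomaly as the initial-month anomaly
    scaled (damped) by the lagged correlation between the initial- and
    target-month anomaly series over the hindcast years."""
    r = np.corrcoef(init_series, target_series)[0, 1]
    return r * init_anom

# Example: a September forecast at 2-month lead uses the June anomaly
# scaled by the June-September lagged correlation.
```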
Compared to the CFSv2, the skill of damped persistence shows similar seasonal variations and a similar contrast among total anomalies, detrended anomalies, and Y2Y changes. For ACC, the skill of the damped persistence forecast is generally lower than the CFSv2 forecast skill. It should be pointed out that the ACC of damped persistence is the same as that of persistence without damping, that is, the lagged anomaly correlation coefficient. The SIE persistence memory of detrended anomalies and Y2Y changes varies with season, with stronger persistence for August–October and March–May at lead times of 0–2 months. Stronger persistence of July and August into September is also found in Blanchard-Wrigglesworth et al. (2011a). The larger ACC of detrended anomalies and Y2Y changes in CFSv2 at short lead times (0–3 months) indicates that there are additional skill sources from other predictors and from dynamic and thermodynamic processes.
Comparison of RMSE between CFSv2 and the damped persistence varies seasonally (Figs. 5d–f and 6d–f). Overall, the advantage of the CFSv2 dynamical forecast over the damped persistence forecast is modest. As will be discussed, the prediction skill and predictability strongly depend on the initial condition and on the representation of the mean state and variability of the system, and their improvement is highly desirable for better prediction skill. For detrended anomalies and Y2Y changes, the CFSv2 RMSE is smaller than the damped persistence RMSE for the warm season (June–September) at lead months 0–2, but is comparable to or larger than the damped persistence RMSE for other seasons and lead times (Figs. 5e,f and 6e,f). However, the RMSE of total anomalies in CFSv2 is smaller than that of damped persistence for most seasons, implying that the damped persistence does not capture the signal associated with the long-term trend. The RMSE of detrended anomalies and Y2Y changes in CFSv2 for June–October at lead times of 3 months and longer is larger than that of damped persistence. This is probably because the damped persistence RMSE is essentially the standard deviation of the observed anomalies, owing to the damping effect of the lagged anomaly correlation coefficient, which is near zero at these leads (Figs. 6b,c).
One interesting aspect in the assessment of a forecast system is to what extent the potential predictability is realized. To assess the SIE predictability, the perfect-model ACC is computed by treating one of the forecast ensemble members as “observation” and the ensemble average of the other 15 members as the forecast. In addition, the prognostic potential predictability (PPP; Pohlmann et al. 2004; Holland et al. 2011) is also analyzed. The PPP is defined as PPP = 1 − σ²_f/σ²_c, where σ²_f is the variance with respect to the ensemble average, representing the unpredictable noise of the forecast, and σ²_c is the variance with respect to the monthly climatology. Both σ²_f and σ²_c are computed across the ensemble members and the entire hindcast period for each initial month and target month. The difference between σ²_c and σ²_f, σ²_c − σ²_f, is the variance of the ensemble mean across the hindcast period with respect to the monthly climatology, representing a signal that is assumed to be predictable. Accordingly, PPP is the ratio of the variance of the signal to the total variance.
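The PPP computation can be sketched numerically as follows. This is a minimal illustration assuming a hindcast array of shape (years, members) for one initial-month/target-month combination (names are illustrative):

```python
import numpy as np

def ppp(hindcasts):
    """Prognostic potential predictability: one minus the ratio of the
    noise variance (spread about each year's ensemble mean) to the total
    variance (spread about the climatology of all members and years)."""
    ens_mean = hindcasts.mean(axis=1, keepdims=True)
    var_noise = ((hindcasts - ens_mean) ** 2).mean()
    var_total = ((hindcasts - hindcasts.mean()) ** 2).mean()
    return 1.0 - var_noise / var_total
```

When all members of each year agree, the noise variance vanishes and PPP = 1; when the ensemble spread equals the total spread, PPP = 0.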
Similar variations with season and lead time are seen between ACC and PPP, with relatively higher skill for August–October and lower skill for March–May. The similar seasonality between ACC and PPP arises because both are determined by the signal-to-noise ratio, with a larger signal-to-noise ratio corresponding to higher predictability (Kumar and Hoerling 2000). For all seasons the perfect model shows much higher ACC than the actual forecast (Figs. 5a–c and 7a–c). In particular, the perfect-model ACC for the detrended anomalies and Y2Y changes during the warm season (August–October) is higher than during other seasons for all lead times. Estimating the predictability limit as the time when the signal becomes zero, that is, PPP = 0, the predictability of the detrended anomalies and Y2Y changes is longer than the maximum lead time (10 months) of CFSv2 (Figs. 7e,f).
These results on predictability are comparable with previous studies. The relatively lower predictability during springtime is consistent with the result of Holland et al. (2011) based on experiments for the present-day control initialized from January. The return of predictability from January initial conditions after spring in Holland et al. (2011) can also be seen in Fig. 7. For example, the PPP of the Y2Y change (detrended anomalies) from January is about 0.4 (0.45) for May and 0.55 (0.7) for September (Figs. 7e,f). The high perfect-model predictability estimate from CFSv2 is also in accordance with the analysis of Blanchard-Wrigglesworth et al. (2011b), who showed that the initial-value predictability of pan-Arctic sea ice area is as high as 1–2 yr. While this might indicate considerable room for further improvement of seasonal sea ice prediction, it should be emphasized that the actual prediction skill strongly depends on the accuracy of initial conditions, and the perturbations among ensemble members in the perfect-model experiments may not represent observed errors. Further, the predictability estimate in the perfect model may be affected by the forecast model’s capability of reproducing the observed interannual variability and correctly capturing the predictable signal, which are discussed further below.
3) Understanding of the prediction skill and predictability estimate
The expected forecast skill depends on the ratio of predictable signal to unpredictable noise (Kumar and Hoerling 2000; Sardeshmukh et al. 2000). For a system to produce reliable and skillful predictions, and to provide a reasonable estimate of predictability, one requirement is that it reproduce the observed variability. Figure 8 compares the interannual standard deviation of monthly mean SIE between the observation and forecasts for total anomalies (top), detrended anomalies (middle), and Y2Y changes (bottom). Both the observation and the forecasts show relatively larger total standard deviation during the warm season (July–October) than in other seasons. The forecasted total standard deviation in December–March is comparable with the observed (Fig. 8a) and likely represents a balance between the overly strong interannual variability, as represented by the standard deviation of the detrended and Y2Y anomalies (Figs. 8b,c), and the overly weak long-term trend (e.g., Fig. 3a). During the warm season (August–October), both the interannual variability beyond a 0-month lead (Figs. 8b,c) and the trend (e.g., Fig. 4a) in the CFSv2 are underestimated, resulting in weaker total standard deviation than the observed (Fig. 8a).
Next we examine the predictability based on the CFSv2 forecasts, focusing on the interannual variability as represented by Y2Y changes. We approximate the predictable signal as the variance of the ensemble-mean anomalies of the 16 members and the unpredictable noise as the variance of the departures of individual ensemble members from the ensemble mean. Figure 9 shows the signal variance (blue curves) and noise variance (red curves) in the prediction, together with the variance of the total anomalies of individual forecast members (solid black curves) and the variance of the observed anomalies (dashed black curves), for March (top panel) and September (bottom panel).
As one would expect for an initial-value prediction problem, the noise increases and the signal decreases with forecast lead time. Beyond the 3-month lead, both the noise and the signal change more gradually. For March, although the predicted total Y2Y variance of individual members decreases with lead time, it is greater than the observed value for all target months. The noise variance is also greater than the observed variance, starting at a 1-month lead time. This suggests that the variability in CFSv2 for this month is overestimated, which results in a lower estimate of the potential predictability. Also, the large differences in variance between individual members and the observation indicate that large differences may exist between the CFSR initial conditions and the NSIDC data. For September, the forecast Y2Y variance is only about half of the observed beyond the 2-month lead. The noise variance increases much more slowly after the 5-month lead, and its amplitude is substantially smaller than the observed variance. Although the potential predictability depends on both signal and noise, it is possible that it was overestimated in CFSv2 because the forecast variance is too small.
c. Spatial SIE variability
Next we analyze the prediction of spatial variations. The prediction of long-term change is first examined. Interannual variations are then analyzed based on two examples of year-to-year changes. The long-term change and interannual variations may result from different processes: the former has been attributed to radiative forcing from increasing greenhouse gases, while the latter is dominated more by dynamic processes in the ocean and atmosphere.
1) Long-term change
To assess the long-term change in the seasonal forecasts, we compare the average of sea ice coverage at the same forecast lead time between the first 10 yr (1983–92) and the last 10 yr (1998–2007). The observed average of March sea ice coverage and the prediction at 0-, 2-, and 5-month lead times are shown in Fig. 10. Observed sea ice coverage decreases are seen in the Davis Strait, Labrador Sea, Greenland and Norwegian Seas, and Bering Sea. The CFSv2 captured the decreases in Labrador Sea and Bering Sea but with smaller amplitude compared to the observed. The forecast fails to reproduce the decrease in sea ice coverage over Greenland and Norwegian Seas. In both 10-yr periods, the predicted sea ice in this region tends to quickly move toward its mean state, which has a positive bias (Fig. 2a), indicating a stronger control of sea ice by other processes than the radiative forcing related to the increasing CO2 concentration.
Comparisons of September sea ice coverage for the first and last 10-yr periods are shown in Fig. 11. The observed sea ice edge retreats over most of the Arctic, with the largest decrease in sea ice coverage in the Chukchi and Beaufort Seas and smaller decreases east of Greenland and in the Barents, East Siberian, and Laptev Seas. CFSv2 reproduced the overall observed sea ice retreat. The largest error in the CFSv2 prediction is the underestimate of the decrease in sea ice coverage over the Chukchi and Beaufort Seas.
2) Year-to-year changes
To assess the spatial characteristics in the prediction of interannual variations, we compare September sea ice coverage between 1991 and 1992, and between 2006 and 2007. These two pairs of consecutive years are among years that experienced large year-to-year differences in sea ice extent (Fig. 4c). Observed and forecasted September sea ice concentration contours of 15% are shown in Fig. 12 for 1991 and 1992, and in Fig. 13 for 2006 and 2007. For 1991/92, the largest changes are the sea ice expansion around the East Siberian and Laptev Seas. The CFSv2 reproduced this expansion at 0–2-month lead times. The model also captured the observed sea ice expansion to the east of Greenland from 1991 to 1992 at a 0-month lead, but failed to reproduce this expansion at a 1- and 2-month lead time. For sea ice coverage in September 2006 and 2007, the largest change is the significant sea ice retreat in 2007 in the Pacific sector. The model reproduced this retreat from 2006 to 2007 at lead time of 0 and 1 month, but forecasted almost the same sea ice extent at a 2-month lead for the two years.
Evolution of sea ice depends on various factors, including subsequent atmospheric variability and initial sea ice thickness. If the evolution is controlled by initial sea ice thickness, a reasonable prediction can be expected for the subsequent seasons given an accurate specification of the initial state of sea ice. However, if the evolution of the observed sea ice is dominated by atmospheric anomalies, it is unlikely to be predictable by a forecast system at lead times beyond the predictability of the atmospheric circulation in the high latitudes. Chen et al. (2010) and Kumar et al. (2011) showed that the predictability of high-latitude atmospheric circulation drops significantly beyond one month. To understand the differences in the prediction skill of sea ice coverage change between 1991/92 and 2006/07 (Figs. 12 and 13), we analyze the prediction as a function of lead time and consider the possible impacts of initial sea ice thickness. The observed and forecasted Y2Y changes of September SIE are shown in Fig. 14a for 1992 minus 1991 and Fig. 14b for 2007 minus 2006. Solid curves are the forecasts and dotted lines are the observations. The model forecasted the positive SIE change from September 1991 to September 1992 to some extent as early as from March initial conditions. However, for the drastic negative SIE change from September 2006 to September 2007, the CFSv2 did not produce a significant SIE decrease until forecasts initialized in July.
To explore possible impacts of the sea ice initial state on the prediction, Fig. 15a shows differences between 1991 and 1992 in CFSR sea ice concentration (solid curve) and thickness (dotted curve) averaged over 72°–80°N, 118°–160°E, where the difference in the observed September sea ice coverage between the two years is large. While the sea ice concentration difference between 1991 and 1992 is near zero until July, relatively large differences in ice thickness occur as early as March. For 2006 and 2007, the CFSR sea ice differences over 72°–84°N, 120°E–140°W, where the greatest difference in observed September sea ice coverage between the two years was observed, show a more in-phase evolution between sea ice concentration and thickness, with little Y2Y change in either variable until July (Fig. 15b). Comparison between the CFSv2 prediction (Fig. 14) and the sea ice conditions in CFSR (Fig. 15), which was used to initialize CFSv2, indicates that the prediction skill is strongly dependent on initial sea ice thickness. In particular, the CFSv2 was unable to predict the dramatic 2007 plunge in the September sea ice minimum from initial conditions before July 2007. One explanation for this failure is that the atmospheric circulation and the associated wind/radiation anomalies during the 2007 summer, which contributed to the dramatic decrease of September 2007 sea ice (Ogi and Wallace 2007; Kay et al. 2008; L’Heureux et al. 2008; Graversen et al. 2011; Zhang et al. 2008a), were unpredictable at longer lead times. The CFSv2 was unable to predict reasonable negative September Y2Y values for 2006/07 until it was initialized in July and August, when the impact of the atmosphere on the unusual thinning of the sea ice in 2007 was established in the sea ice initial state. The importance of initial sea ice thickness as a source of prediction skill has been discussed in Lindsay et al. (2008), Holland et al. (2011), Blanchard-Wrigglesworth et al. 
(2011b), and Chevallier and Salas-Mélia (2012). In particular, Chevallier and Salas-Mélia (2012) found that the critical sea ice thickness varies with target month and lead time.
4. Summary and discussion
Seasonal prediction of the observed sea ice has mostly been studied with statistical methods based on selected predictors or with ocean-only dynamical models without atmospheric feedbacks. In this paper, we analyze seasonal sea ice prediction and predictability based on the recently developed NCEP Climate Forecast System, version 2 (CFSv2), which includes a fully interactive dynamical sea ice component. The analysis focuses on the performance of the CFSv2 in reproducing observed Northern Hemisphere sea ice extent (SIE) variations from 10-month forecasts with an ensemble of 16 members initialized in each month during 1982–2007. The prediction of the SIE climatology, the long-term trend, and the interannual variability is examined.
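As context for the verification that follows, SIE is conventionally defined as the total area of grid cells whose ice concentration is at or above 15%, the standard NSIDC threshold. A minimal illustrative sketch (the function and variable names are our own, not from the CFSv2 code):

```python
import numpy as np

def sea_ice_extent(concentration, cell_area, threshold=0.15):
    """Sea ice extent: total area of cells at or above the concentration threshold.

    concentration: 2-D array of ice concentration (fraction, 0-1)
    cell_area:     2-D array of grid-cell areas (e.g., km^2), same shape
    """
    return float(np.sum(cell_area[concentration >= threshold]))
```

By contrast, sea ice *area* weights each cell by its concentration; extent is less sensitive to concentration errors within ice-covered cells, which is one reason it is a common verification target.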
Comparison of the climatology with the observations shows that the CFSv2 contains systematic biases in SIE that depend more on the target month than on the initial month. Overall, there is a positive SIE bias for the January–September forecasts and a negative SIE bias for the October–December forecasts. Spatially, the CFSv2 shows biases over various regions in different seasons, with a negative bias in sea ice coverage around the Bering Strait in the November–April forecasts, a negative bias over Hudson Bay and Davis Strait in November and December, a positive bias in the Labrador Sea from February to June, and a positive bias in the Greenland Sea throughout the year.
The CFSv2 underestimates the observed long-term trend of SIE. This may be related to the use of different sea ice analyses before and after 1997 in the initialization system, the CFSR, which assimilates the GSFC sea ice analysis (Cavalieri 1994; Cavalieri et al. 1996) before 1997 and the NCEP sea ice analysis starting in 1997 (Saha et al. 2010). As was shown in Wang et al. (2011), there is an inconsistency in long-term sea ice extent variations over the Arctic regions between the CFSR and the observational estimate, with the CFSR showing smaller sea ice extent before 1997 and larger extent starting in 1997. This inconsistency is likely a cause of the overall negative (positive) SIE anomaly errors before (after) 1997 in the CFSv2 prediction. Another possible factor for the weaker trend in the CFSv2 forecast is the quick convergence of the predicted sea ice coverage to its climatology over some regions (e.g., the eastern Greenland Sea), suggesting that errors in the model physics may overwhelm the maintenance of the initially specified trend.
Prediction skill of the SIE measured by ACC and RMSE shows a strong dependence on season and lead time. The CFSv2 has high prediction skill (ACC > 0.7) for the total SIE from late fall to spring at all lead times, except for slightly lower skill at lead times of 0 and 1 month for January and February, possibly because of the model’s adjustment to initial SIE errors. For the summer and early fall, the higher ACC (lower RMSE) of the total SIE is confined to the first five target months. The ACC of year-to-year (Y2Y) changes is substantially smaller than that of the total anomalies, indicating that a large source of prediction skill is the long-term trend. Useful prediction skill (ACC > 0.5) for Y2Y changes is found primarily within the first three target months, and mostly in the summer and early fall.
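The skill metrics used here can be sketched as follows. Differencing consecutive years largely removes the contribution of a linear trend, which is why the Y2Y ACC isolates skill for interannual variations; this is an illustrative sketch, not the operational verification code.

```python
import numpy as np

def y2y_changes(x):
    """Year-to-year changes: x[i+1] - x[i] for consecutive years."""
    return np.diff(x)

def acc(fcst, obs):
    """Anomaly correlation coefficient (centered Pearson correlation)."""
    f = fcst - fcst.mean()
    o = obs - obs.mean()
    return float(np.sum(f * o) / np.sqrt(np.sum(f ** 2) * np.sum(o ** 2)))

def rmse(fcst, obs):
    """Root-mean-square error of the forecast against observations."""
    return float(np.sqrt(np.mean((fcst - obs) ** 2)))
```

Because a shared linear trend alone yields a high ACC of the total anomalies, comparing the ACC of totals against the ACC of Y2Y changes, as done above, separates trend-based skill from skill for interannual variations.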
The performance of the CFSv2 in the prediction of sea ice interannual variations may differ from year to year. The differences are analyzed by comparing the predictions for September in two selected pairs of adjacent years, 1991/92 and 2006/07, which are among the years of largest SIE interannual variations. It is shown that while the 1991/92 Y2Y SIE change in September was reasonably predicted as early as from March, the CFSv2 failed to produce a reasonable Y2Y SIE change for 2006/07 until initialized in July. It is argued that this difference in prediction skill of interannual SIE variation may be related to initial sea ice thickness. For both of the selected pairs of years, significant Y2Y differences in observed sea ice extent occurred after July. However, the Y2Y changes in sea ice thickness preceding September, over the regions of large September sea ice changes, were quite different between 1991/92 and 2006/07. For 2006/07, the sea ice thickness change was small before July, while for 1991/92, substantial sea ice thickness differences appeared as early as March. The short lead time for a reasonable prediction of the 2006/07 Y2Y change is consistent with previous studies, which showed that the dramatic drop in sea ice extent in 2007 was related to the atmospheric circulation anomalies in the 2007 summer (Ogi and Wallace 2007; Kay et al. 2008; L’Heureux et al. 2008; Graversen et al. 2011; Zhang et al. 2008a). These results suggest that if the subsequent seasonal sea ice evolution is driven by atmospheric circulation and the associated cloud–radiation anomalies, it is largely unpredictable until the atmospheric impacts have been integrated into the ocean state.
A model-based predictability estimate is obtained with the perfect-model approach. The potential predictability is found to be much higher than the skill realized by the CFSv2. The predictability based on the CFSv2 is comparable to results from previous studies. In particular, predictability from winter-season initial conditions is low in springtime and recovers during the warm seasons, consistent with the results of Holland et al. (2011) based on experiments initialized from January. Blanchard-Wrigglesworth et al. (2011a,b) attributed this return of predictability in the warm seasons to the memory of long-lived thickness anomalies in the central Arctic. The predictability in terms of PPP extends beyond the maximum lead time of the CFSv2 (10 months), in accordance with the result of Blanchard-Wrigglesworth et al. (2011b) that the initial-value predictability of pan-Arctic sea ice area is as high as 1–2 years. While this suggests that there may be room for improvement in the prediction of sea ice, it should be emphasized that the perturbations to initial conditions among ensemble members in the perfect-model experiments may not represent observed errors. Further, the estimate of predictability may depend on the model’s deficiencies. For the March forecast, the month of the seasonal SIE maximum, the predicted variance of Y2Y changes is overestimated, which may have resulted in an underestimate of the potential predictability. On the other hand, for September, the month of the seasonal SIE minimum, the forecast Y2Y variance is only about half of the observed beyond the 2-month lead. The ensemble spread variance increases very slowly after the 5-month lead, and its amplitude is substantially smaller than the observed interannual variance. Although the potential predictability depends on both signal and noise, it is possible that the CFSv2 overestimates the September predictability because of its smaller overall variance.
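Prognostic potential predictability (PPP) in such perfect-model studies is commonly defined as one minus the ratio of the ensemble spread variance to the climatological variance, so that PPP near 1 indicates a spread far below saturation (high predictability) and PPP near 0 indicates no predictability. A minimal sketch under that definition (the array layout is an assumption for illustration, not taken from this study):

```python
import numpy as np

def ppp(forecasts, clim_var):
    """Prognostic potential predictability as a function of lead time.

    forecasts: array (n_start_years, n_members, n_leads) from a
               perfect-model ensemble experiment
    clim_var:  climatological variance of the predicted quantity

    PPP = 1 - <ensemble variance> / climatological variance, where <>
    denotes the across-member variance averaged over start years.
    """
    spread_var = forecasts.var(axis=1, ddof=1).mean(axis=0)
    return 1.0 - spread_var / clim_var
```

The slow growth of the September ensemble spread noted above would appear here as PPP remaining well above zero at all 10 leads; if the model's total variance is itself too small, such a high PPP can overstate real-world predictability.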
Results from this study have implications for the improvements of the forecast system. First, the consistency of sea ice analysis used in the initialization is important for the prediction system to maintain a reasonable long-term trend, which is a large source of seasonal prediction skill of sea ice. Second, validation of sea ice thickness in the initialization system for a realistic initial sea ice state is of critical importance for an accurate prediction of sea ice. Given the sparseness of available sea ice thickness observations, which limits its direct assimilation, more practical approaches for realistic sea ice thickness initialization may be needed. Third, it is important to improve the forecast model’s mean state and variability of the SIE. Because of the nonnegative nature of sea ice, correctly reproducing the observed seasonal retreat and expansion is critical for the system to capture reasonable variability. In addition, since the predictability depends on the signal-to-noise ratio, the improvement of the model’s capability in reproducing observed variability with reasonable amplitude is also important.
Acknowledgments. The authors greatly appreciate the valuable comments by Amy Butler and Peitao Peng. Three anonymous reviewers are gratefully acknowledged for their helpful comments and suggestions.