This paper presents a statistical ice event forecast model for the Arctic based on Fourier transforms and a mathematical filter. The results indicate that this model compares very well with both a multiple regression model and a human-made forecast. There seems to be a direct link between the period associated with the dominant spectral peak of the Fourier transform and the ease with which the date of events, such as fractures, bergy water, or open water, can be forecast. While useful for the normal timing of events, at this time, none of the current forecast models can predict events that occur before or beyond the usual or historical dates, which poses a forecast problem in the Arctic.
In Canadian waters, forecasting events such as the onset of ice breakup or a marine area being considered open water or bergy water is important to the shipping trade and the Coast Guard for ship routing and icebreaking plans. Bergy water is an area of freely navigable water in which ice of glacier origin is present. Other ice types may be present, although the total concentration of all other ice is less than 1/10. Open water is an area of freely navigable water in which ice is present in concentrations less than 1/10 and no glacial ice is present (Environment Canada 2005). Currently, the Canadian Ice Service uses a multiple linear regression model as guidance to forecast these events 1–6 months in advance of the dates on which these events will occur. Next, an experienced forecaster, using knowledge of the Arctic meteorology and oceanography, prepares a seasonal outlook at the start of the Arctic summer for the following 3 months. The forecast gives the dates for significant ice events for areas in which there will be active shipping. A 30-day outlook is then issued and reassessed every 15 days throughout the Arctic summer.
The new statistical events forecasting model (SEF) presented herein uses the dates on which these various events have occurred each year since the Canadian Ice Service first began detecting them by aerial or satellite reconnaissance almost 46 years ago. These dates form a continuous series for each arctic summer year. The SEF uses the series ending the previous year to forecast event dates for the current year. It uses both fast Fourier transforms (FFTs) and an optimal filtering-based model (OFBM). As a result, this model can serve as a tool for estimating event dates for the upcoming year before the multiple regression model becomes useful (1–6 months in advance) and before the specialized forecaster is able to use atmospheric and oceanic data for the next Arctic summer. The specialized forecaster (long-range forecaster) uses the “analog year technique.” First, he or she finds years in which ice conditions and freezing degree days were similar to, or as close as possible to, those currently reported. From among the years found above, the forecaster next finds year(s) in which the reported temperatures were similar to, or as close as possible to, those forecast for the current outlook period. With the year(s) found above as a guide, the forecaster attempts to determine how the ice is going to evolve during the forecast period, and ice drift from the wind stress is used as well. If no analog year is found, the long-range forecaster has to rely on his or her experience and climatology atlases (Environment Canada 2011).
The major challenge in predicting the ice a year or more in advance is that only 46 yr of sea ice observations are available. A longer time series would be better for revealing patterns. Compounding the problem of the short record of ice-melt dates is that the sea ice over the last 46 yr has been greatly impacted by climate change. There has been significant variability in ice and weather conditions from one year to the next.
One of the major difficulties with predicting ice conditions in general is that the ice is very much tied to local variations in weather. A different track for a single low pressure system can have an impact on the extent of the ice within a short time frame. This can mean the difference between a record lack of ice or an average amount of ice. Oceanographic conditions also play a role, but the impact is less immediate. The limited success among the different methods used now (multiple regression and analog methods) has pushed scientists to explore new ways of forecasting. One of the new methods, SEF, is presented here.
The SEF model used herein consists of the following steps. First, the fast Fourier transforms are performed (MathWorks 2000) on the series of observation dates for the various events (e.g., fracture) in each of the areas, for example, Pond Inlet, Nunavut, Canada. We thus obtain a periodogram that gives the frequency and power of this FFT. The frequency is inverted to obtain the period and also to establish a periodogram in years per cycle. The peak power gives an indication of the period in which the most likely outcomes occur, that is, the years in which the event has occurred on the same dates. A well-known example is that of sunspot activity over the past 300 yr, which is cyclical, and its peak period is obtained using an FFT. (Incidentally, the period for sunspots is 11 yr.)
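The periodogram step described above can be sketched as follows. This is a minimal illustration, not the operational SEF code: it uses a direct discrete Fourier transform from the Python standard library (which gives the same spectral values as an FFT, only more slowly), assumes one observation per year, removes the mean before transforming, and runs on a synthetic series with a built-in 5-yr cycle. The function names `periodogram` and `peak_period` are hypothetical.

```python
import cmath
import math


def periodogram(series):
    """Return (periods_in_years, power) for a yearly series with the mean removed.

    A direct DFT is used here; an FFT would give the same values faster.
    """
    n = len(series)
    mean = sum(series) / n
    x = [v - mean for v in series]
    periods, power = [], []
    for k in range(1, n // 2 + 1):  # skip k = 0, which only carries the mean
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        periods.append(n / k)  # invert frequency: years per cycle
        power.append(abs(coeff) ** 2)
    return periods, power


def peak_period(series):
    """Period (in years) at which the periodogram power is largest."""
    periods, power = periodogram(series)
    best = max(range(len(power)), key=power.__getitem__)
    return periods[best]


# Synthetic event dates (day of year) with a 5-yr cycle over 40 years.
days = [200 + 10 * math.sin(2 * math.pi * t / 5) for t in range(40)]
print(peak_period(days))
```

With only about 46 yearly values, the available periods are coarse (46, 23, 15.3, 11.5 yr, and so on), which is worth keeping in mind when reading peak periods off such a periodogram.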
Then, a moving average of observation dates is taken over an interval determined by this peak period. Finally, an OFBM filter is applied. This tool has been tested for making seasonal forecasts of ice coverage in Canada and seemed to perform well. The main advantage of using the SEF model is in determining whether the ice in an area has demonstrated enough of a pattern from one year to the next that the ice conditions can be forecast well in advance. Indeed, there is a relationship between the peak period of the sea ice melt time series and the success in forecasting for certain Arctic areas. A peak period of 5 yr was associated with strong success of prediction (e.g., Hudson Strait area), whereas success of prediction was low when the peak period was 15 yr (e.g., Canadian High Arctic). With this method, forecasters have guidance in selecting the best technique for forecasting the ice conditions a year or more in advance. This new tool has the ability to improve forecasts.
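The moving-average step can be illustrated with a short sketch. The text does not specify the exact form of the average, so the trailing (backward-looking) window used here, with its length set to the rounded peak period, is an assumption, as are the function name `moving_average` and the sample dates.

```python
def moving_average(series, window):
    """Trailing moving average; returns len(series) - window + 1 values."""
    if not 1 <= window <= len(series):
        raise ValueError("window must be between 1 and len(series)")
    running = sum(series[:window])
    out = [running / window]
    for i in range(window, len(series)):
        # Slide the window: add the newest value, drop the oldest.
        running += series[i] - series[i - window]
        out.append(running / window)
    return out


# Hypothetical event dates (day of year) and a 3-yr window for illustration.
days = [200, 204, 198, 195, 201, 207, 203]
smoothed = moving_average(days, window=3)
```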
The fast Fourier transform was adopted because this method is a simple, fast, and accurate method for obtaining a periodogram, and the peak periods can easily be plotted over a geographical map for a spatial analysis. OFBM methods have the advantage of not requiring any external information since the forecast is based solely on a temporal series of observations of the events. This method was tested for forecasting melting lake ice in the Great Lakes area, and the results were promising (T. Wohlleben 2015, personal communication).
Multiple regression models need multiple predictors (e.g., sea surface temperature, air temperature, geopotential height, etc.) and require more-powerful computational resources compared with the OFBM. In many situations, the OFBM extrapolation turns out to be vastly more powerful than any kind of simple polynomial extrapolation (Press et al. 1992, section 13.6). The OFBM is regarded as being a good first choice to use in the SEF model. One of the advantages of using filter techniques is that they are not required to be tuned to the specific interaction between the selected input and a chosen data-driven model (Taormina and Chau 2015). Thus, the SEF can be used in conjunction with a variety of other forecast methods.
This classical prediction method (OFBM) is used in the case where the data points $y_\beta$ (dates of ice-related events) are equally spaced along a line, $y_i$, $i = 1, 2, \ldots, N$, and we want to use $M$ consecutive values of $y_i$ to predict the $(M + 1)$st. Stationarity is assumed. That is, the autocorrelation $\langle y_j y_k \rangle$ is assumed to depend only on the difference $|j - k|$, and not on $j$ and $k$ individually, so that the autocorrelation $\phi$ has only a single index:
$$\phi_j \equiv \langle y_i y_{i+j} \rangle \approx \frac{1}{N - j} \sum_{i=1}^{N-j} y_i y_{i+j}. \tag{1}$$
Here, the approximate equality shows one way to use the actual dataset values to estimate the autocorrelation components. In the situation described, the estimation equation is
$$y_n = \sum_{j=1}^{M} d_j y_{n-j} + x_n. \tag{2}$$
The $M$ unknowns $d_j$, called the linear prediction coefficients, satisfy the following set of $M$ equations:
$$\sum_{j=1}^{M} \phi_{|j-k|} d_j = \phi_k \quad (k = 1, \ldots, M). \tag{3}$$
The mean square discrepancy $\langle x_n^2 \rangle$ is estimated by
$$\langle x_n^2 \rangle = \phi_0 - \phi_1 d_1 - \phi_2 d_2 - \cdots - \phi_M d_M. \tag{4}$$
To use linear prediction, we first compute the $d_j$'s, using (1) and (3). We then calculate (4) and apply (2) to the known record to get an idea of how large the discrepancies $x_i$ are. If the discrepancies are small, then we can continue applying (2) right on into the forecast, imagining the unknown forecast discrepancies $x_i$ to be zero. In this application, (2) is a kind of extrapolation formula.
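The linear prediction procedure just described can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation used for the SEF: the autocorrelation is estimated by the simple lagged average, the M equations for the coefficients are solved by plain Gaussian elimination rather than a specialized Toeplitz solver, and the input is assumed to be a series of anomalies (mean removed). All function names and the synthetic series are hypothetical.

```python
import math


def autocorr(y, max_lag):
    """phi_j ~ (1 / (N - j)) * sum_i y_i * y_(i+j), for j = 0..max_lag."""
    n = len(y)
    return [sum(y[i] * y[i + j] for i in range(n - j)) / (n - j)
            for j in range(max_lag + 1)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, m):
            f = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * m
    for r in range(m - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, m))
        x[r] = (aug[r][m] - tail) / aug[r][r]
    return x


def lp_coefficients(y, m):
    """Return (d, mse): coefficients d_1..d_M and the mean square discrepancy."""
    phi = autocorr(y, m)
    a = [[phi[abs(j - k)] for j in range(1, m + 1)] for k in range(1, m + 1)]
    d = solve(a, phi[1:])
    mse = phi[0] - sum(p * dj for p, dj in zip(phi[1:], d))
    return d, mse


def predict_next(y, d):
    """One-step forecast: sum_j d_j * y_(n-j), with the discrepancy set to zero."""
    return sum(dj * y[-j] for j, dj in enumerate(d, start=1))


# Synthetic anomalies (days relative to the mean date) with a 5-yr cycle.
anoms = [10 * math.cos(2 * math.pi * t / 5) for t in range(40)]
d, mse = lp_coefficients(anoms, m=2)
forecast = predict_next(anoms, d)
```

Appending each forecast to the series and calling `predict_next` again would extend the extrapolation further ahead, at the cost of growing uncertainty.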
The forecast results of the SEF model were compared with the multiple linear regression model (MLR). The MLR is a statistical model based on a linear regression of observations that uses outside predictors to produce 1–6-month forecasts. Ice conditions are then correlated with the different variables for various time lags and, as a result, we could create more than 1000 predictors. A regression is then performed with a regression/elimination technique to avoid the overdetermination of predictors. MLR has performed well in forecasting the ice conditions on Hudson Bay, for example (Tivy et al. 2007).
The results of the SEF model (12-month forecast) compared against the rates of success of the forecaster (3-month forecast) and the MLR statistical model (3-month forecast) for the years 2011–14 are shown in Table 1. The results of the SEF model for various ice events are shown in Table 2. Historical records of sea ice concentration using CIS analysis are available back to 1968. Climatological records are created using 30-yr datasets. The most recent set covers the periods from 1981 to 2010. The following years, 2011–14, have been used for this study since they are not included in the climatology but are of interest because of the variability that has been observed in these recent years. In addition, there was readily available information from the multiple regression model, the SEF, and human-produced long-range forecasts for these periods that could be used for comparison.
A forecast is considered correct when the forecast date is within 7 days of the date of event observation (Gauthier and Falkingham 2002). Note that the SEF statistical model performs as well as the other models for shorter-term forecasts. It is also worth noting that the MLR statistical model is not able to forecast as many events since it sometimes is unable to find enough significant predictors to make a forecast. None of the models can forecast ice events that occur on extreme dates, that is, much earlier or much later than normal or historical dates. Multiple regression models and human-made forecasts are adopted as the benchmark for comparison because these have traditionally been the only methods of forecasting the concentration of sea ice over the last 10 yr at the Canadian Ice Service (CIS). A different approach from SEF has been used in the field of rainfall-runoff modeling. There, time series have been studied with autoregressive integrated moving average models coupled with ensemble empirical mode decomposition (Wang et al. 2015). This method will be evaluated in more detail at CIS as possible additional guidance for improving ice forecasts.
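The verification rule stated above (a forecast counts as correct when it falls within 7 days of the observed date) can be written as a small helper. This is a sketch only: treating exactly 7 days as a hit is an assumption, dates are represented as day-of-year integers for simplicity, and the sample pairs are invented for illustration rather than taken from Table 1 or Table 2.

```python
def is_hit(forecast_day, observed_day, tolerance_days=7):
    """True when the forecast is within the tolerance of the observed date."""
    return abs(forecast_day - observed_day) <= tolerance_days


def success_rate(pairs, tolerance_days=7):
    """Fraction of (forecast, observed) pairs that count as hits."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no forecasts to verify")
    hits = sum(is_hit(f, o, tolerance_days) for f, o in pairs)
    return hits / len(pairs)


# Hypothetical (forecast, observed) day-of-year pairs.
sample = [(200, 205), (190, 198), (210, 203), (220, 200)]
rate = success_rate(sample)
```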
Peak periods of events for the various forecast areas are shown in Figs. 2–5. There appears to be a direct relationship between the peak period and the ease with which we are able to forecast the dates of events. We note that, when an event had a short peak period, it was easier to make a correct forecast. Indeed, when the period is short, the event date occurs more frequently and is easier to forecast. In the case of the central Arctic (Fig. 2), the Norwegian Bay area is a region where no correct forecast was made for those 4 yr, and it has a corresponding peak period of 20 yr. The same is true for Baffin Bay (Fig. 3) and Foxe basin area near Hudson Bay (Fig. 4). By contrast, the Hudson Strait has a short peak period and a corresponding high rate of forecast success. The southern Beaufort Sea (Fig. 5) shows areas of both long and short peak periods of events.
Forecast dates for ice fracture in the central Arctic were specifically investigated. The fracture dates are the dates of complete fracture of the ice. Since we have the observed fracture dates and the fracture dates forecast by the SEF statistical model, it is possible to determine the difference between them. Figures 6 and 7 show the difference in the number of days between forecast and observed dates. A difference of −15, for example, indicates that the forecast date is 15 days ahead of the observed date. A difference of +10 indicates that a forecast date is 10 days later than the date observed. Mainly, what we can see from these two figures is that a shift in the forecast results can be observed from 2011 to 2014. In 2011 and 2012, the forecast dates were later than the observation dates. In 2013 and 2014, the forecast dates were earlier than the actual fracture observation.
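The sign convention for these differences (forecast minus observed, so that negative values mean the forecast was too early) can be made explicit with a short sketch. The function name and the dates below are hypothetical and serve only to reproduce the −15 and +10 examples given in the text.

```python
from datetime import date


def forecast_error_days(forecast, observed):
    """Forecast minus observed, in days; negative means the forecast was early."""
    return (forecast - observed).days


# Hypothetical dates chosen to match the two examples in the text.
early = forecast_error_days(date(2013, 7, 10), date(2013, 7, 25))
late = forecast_error_days(date(2011, 7, 30), date(2011, 7, 20))
```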
The differences between the forecast fracture dates and the observed dates of fracture between 2011 and 2014 were examined using reanalysis data for the months of July during these years [National Centers for Environmental Prediction–Department of Energy (NCEP-DOE) AMIP-II Reanalysis; Kanamitsu et al. (2002)]. July is the month when fracture normally occurs for this part of the Arctic. The advantage of the reanalysis data is in the ability to retrace meteorological conditions during these periods with confidence. The reanalysis data provide a relatively consistent source for atmospheric data because the reanalyses are based on a static data assimilation scheme, the data analysis is global in extent, and more observations are used (Kalnay et al. 1996).
To do this, we use average monthly air temperature at 2 m and the kinetic energy variable. Two significant meteorological factors in ice growth and decay are air temperature and surface wind. High temperatures and strong winds will tend to melt ice. Low temperatures and light winds will contribute to ice growth. Surface air temperature fields are readily available for analysis; however, surface winds are more challenging. The kinetic energy, which is a combination of wind near the surface and sea level pressure, can be interpreted as a dynamic variable, potentially causing ice fracture. The kinetic energy has the advantage of being a more complete variable than separate wind and pressure variables, particularly where high sea level pressure conditions (e.g., an anticyclone) are correlated with nonnegligible drift of sea ice by the wind. Sea surface temperature (SST) was considered, but reanalyses show no significant variation over the areas during the month of July, with SST constant near the freezing mark. Kinetic energy is calculated in accordance with Lorenz (1957) and Wiin-Nelson and Chen (1993), and is equal to the normal wind speed at 10 m multiplied by surface pressure multiplied by 1 over gravitational acceleration. We can see in Figs. 8 and 9 that air temperatures in July 2012 were above freezing and that kinetic energy was close to 10 kJ m−2. In Figs. 10 and 11, we also see that in 2014 air temperatures were freezing with lower kinetic energy levels of about 5 kJ m−2. If we now look at these data in relation to the discrepancies between the forecast and observed fracture dates, we see that in 2012 the earlier observed fracture was aided by warmer temperatures and greater kinetic energy (i.e., higher wind speeds) causing larger waves and a more fragile ice coverage, which helped to break up the ice, whereas in 2014 the observed fracture was later, with lower temperatures and lower kinetic energy levels resulting in ice coverage that is harder to break. The SEF model is heavily dependent on historical and normal conditions, but varying climatic conditions will result in observation dates that are either earlier or later than the forecast dates.
5. Bergy water
Forecasting when the sea ice melts completely, leaving only icebergs or bergy water, was investigated in Baffin Bay. The ice-melt events in Baffin Bay were quite different from the case of fracturing in the Canadian Archipelago or central Arctic. The average contour for the peak period of bergy water in Baffin Bay has a maximum near Clyde River (central Baffin Island coast) and was an area where the rate of forecast success was low between 2011 and 2014 (see Fig. 12). Although air temperatures for July and August for these years are relatively consistent near the coast of Baffin Island (about 4°C), an area of low kinetic energy (about 5 kJ m−2) is sustained throughout these years (Fig. 13). Indeed, there appears to be, at least for these years, an area of low wind speeds and normal atmospheric pressure at that location, which tends to favor a stable area where sea ice lingers longer before yielding to bergy water. This situation is also easily verifiable with satellite imagery, for example. The SEF model indicated a forecast later than normal for the sea ice melt. In reality the sea ice did melt later than normal, but it was sooner than forecast, resulting in a missed forecast. Pack ice does not just melt in place; melt and deterioration of the ice depends on winds and currents. Thus, a high average peak period associated with a stable atmospheric situation results in little forecasting success near Clyde River on the Baffin Island coast.
6. Open or bergy water
The Hudson Strait area is considered to be open water or bergy water depending on the position of the iceberg limit. Both the MLR model and the SEF model (Fig. 14) achieve very high rates of success in forecasting when sea ice will completely melt for this area. A short average peak period associated with a gradual increase in air temperature from July to August favors complete ice melt at almost identical dates from one year to another with a gradual trend to earlier breakup and later freezing. In addition, we see about 10 kJ m−2 of kinetic energy along the east–west axis of the strait favoring continual breakup of the ice coverage at this location.
7. Open water
For the western Arctic, the average peak period obtained with the SEF model is 10 yr in Amundsen Gulf and Mackenzie Bay. Given that almost all marine traffic occurs along the coast to avoid the pack ice of the Beaufort Sea, there are only historical records of the onset of open water conditions in this area. The rate of forecast success increases with proximity to the coast where air temperatures are also highest. In Amundsen Gulf in particular, air temperatures are close to 10°C (Fig. 15) for the months of July and August between 2011 and 2014, and kinetic energy is low. Kinetic energy increases from east to west as wind speeds increase (Fig. 16). In the case of the Amundsen Gulf, air temperature appears to be the meteorological variable facilitating the presence or formation of open water.
A statistical model consisting of fast Fourier transforms (FFTs), a moving average, and a mathematical filter [optimal filtering based model (OFBM)] was used as guidance to forecast dates of different ranges of ice concentration during the melt period. Melt dates are the information requested by mariners so that they can plan their activities around when an area will be navigable. The fast Fourier transform was adopted because this method is simple, fast, and accurate when a periodogram and the peak periods need to be obtained. OFBM methods have the advantage of not requiring any outside information since the forecast is based solely on a temporal series. This method was tested experimentally for the case of melting lake ice in the Great Lakes area, and the results were found to be promising. This last case shows promise for improving forecasts of ice concentration across the Great Lakes. When we compare the results from this SEF statistical model (12-month forecasts) with those of a multiple linear regression (MLR) model (3-month forecasts), as well as the predictions of a forecaster who specializes in ice conditions (3-month forecasts), we find that the SEF model performs as well as the other forecast models. These methods yield very similar results insofar as none of these types of forecasting can predict the dates of extreme event occurrence (i.e., far ahead of or behind usual or historical dates). Extreme dates of complete melting can occur in certain areas without warning as a result of very short-term meteorological and ocean conditions. With Fourier transforms, we are able to obtain a periodogram indicating the peak period in which the most likely outcomes occur (i.e., the years in which the event has occurred on the same dates). There appears to be a direct relationship between the peak period and the ease with which we are able to forecast the dates of events.
In studying the case of fractures in the central Arctic, we notice that observed fracture dates are earlier when air temperatures and kinetic energy levels are high. High kinetic energy (which takes into account both wind speed and atmospheric sea level pressure) appears to be a variable favoring the continual breakup of ice. This is seen in the Hudson Strait. Indeed, the model achieves a very high rate of forecast success for this area. In the specific case of fracture in the central Arctic, high air temperature together with high kinetic energy helps to break up the ice through larger waves and more fragile ice coverage. This was reflected in an earlier observed date of fracture. In the particular case of open water for the western Arctic, where historical records of open water occur mainly along the coast, air temperature appears to be the most significant meteorological variable facilitating the presence or formation of open water. Finally, the SEF model provides a long-term outlook for certain events in the Arctic that can be used for planning. Other types of forecasting are used for shorter-term forecasts for more tactical or operational decisions. An added benefit is that the statistical model does not require significant computational resources.
To extend this study, the novel approach developed in the hydrological sciences of using an autoregressive integrated moving average model coupled with the ensemble empirical mode decomposition will be tested to determine the feasibility of mimicking time series of dates of ice-related events in the Arctic. This method will be studied in more detail at the CIS and will probably be a component of other alternatives. A longer period of comparison between human-made forecasts, MLR, SEF, and the hydrology approach will be useful. It is anticipated that in the future long-range forecasting will be in the realm of fully dynamical models; however, at present statistical models are more accurate, and empirical relationships uncovered during their development may help advance physical modeling efforts.
I would like to thank Darlene Langlois for proofreading this paper and two reviewers for providing suggestions.