A set of ensemble seasonal reforecasts for 1958–2014 is conducted using the National Centers for Environmental Prediction (NCEP) Climate Forecast System, version 2. In comparison with other current reforecasts, this dataset extends the seasonal reforecasts to the 1960s–70s. Direct comparison of the predictability of the ENSO events occurring during the 1960s–70s with the more widely studied ENSO events since then demonstrates the seasonal forecast system’s capability in different phases of multidecadal variability and degrees of global climate change. A major concern for a long reforecast is whether the seasonal reforecasts before 1979 provide useful skill when observations, particularly of the ocean, were sparser. This study demonstrates that, although the reforecasts have lower skill in predicting SST anomalies in the North Pacific and North Atlantic before 1979, the prediction skill of the onset and development of ENSO events in 1958–78 is comparable to that for 1979–2014. In particular, the ENSO predictions initialized in April during 1958–78 show higher skill in the summer. However, the skill of the earlier predictions declines faster in the ENSO decaying phase, because the reforecasts initialized after boreal summer persistently predict lingering wind and SST anomalies over the eastern equatorial Pacific during such events. Reforecasts initialized in boreal fall overestimate the peak SST anomalies of strong El Niño events since the 1980s. Both phenomena imply that the model’s air–sea feedback is overly active in the eastern Pacific before ENSO event termination. Whether these differences are due to changes in the observing system or are associated with flow-dependent predictability remains an open question.
El Niño–Southern Oscillation (ENSO) generates the strongest interannual variability of Earth’s climate (e.g., McPhaden et al. 2006). Developing every few years in the tropical Pacific, the warm (El Niño) and cold (La Niña) ENSO events not only cause substantial anomalies in the tropics (e.g., Rasmusson and Carpenter 1982; Wallace et al. 1998) but also impact weather and climate worldwide (e.g., Trenberth et al. 1998; National Research Council 2010). As the most important source of global climate predictability (e.g., Shukla and Wallace 1983; Kumar et al. 2014), a successful forecast of an upcoming ENSO event forms the centerpiece of skillful predictions on seasonal–interannual time scales in many regions, such as the seasonal anomalies in U.S. precipitation (e.g., Kumar and Hoerling 1998) and the Asian monsoon (e.g., Lau and Nath 2003; Zhang et al. 2016). It also provides critical background information for predicting subseasonal anomalies and weather extremes.
Cane et al. (1986) famously predicted the 1986 El Niño event in near-real time using an intermediate coupled model with a simple ocean initialization based on surface wind forcing. Using an intermediate coupled model, Chen et al. (2004) have also conducted retrospective forecasts of the interannual climate fluctuations in the tropical Pacific Ocean for 148 yr (1857–2003). Since the pioneering work of Cane et al. (1986), seasonal ENSO forecasting has progressed to fully coupled GCM predictions with sophisticated initialization of all component models (e.g., Ji et al. 1994; Stockdale et al. 1998, 2011; Schneider et al. 1999; Saha et al. 2006, 2014; Jin et al. 2008; Zhu et al. 2012; among many others). In particular, assimilated oceanic initial states based on in situ surface and subsurface observations, as well as remote sensing measurements, provide the most important source of ENSO predictability (e.g., Balmaseda et al. 2008, 2013; Saha et al. 2010). Since the 1990s, successive seasonal forecast systems at major meteorological centers have demonstrated steadily increasing skill in predicting the tropical Pacific SST anomalies in retrospective forecasts of the past 35 years (e.g., Ji et al. 1994; Saha et al. 2006, 2014; Stockdale et al. 1998, 2011; Molteni et al. 2011). Operationally, many forecast models predicted the onset of the strong 1997/98 El Niño 1–2 seasons in advance (Barnston et al. 1999). A similar success can also be claimed for the most recent strong El Niño event in 2015/16 (e.g., see http://iri.columbia.edu/our-expertise/climate/forecasts/enso/2015-April-quick-look/). As shown in Barnston et al. (2012, their Fig. 4), major La Niña events, such as those in 2007/08 and 2010/11, were also well predicted seasonally.
Although current models have demonstrable skill, there are noticeable missed events and false alarms in retrospective ENSO predictions (e.g., Zhu et al. 2012). Similar unsuccessful predictions also have occurred operationally. A recent example is the series of forecasts for boreal fall 2014 and winter 2014/15, issued from the spring to early summer of 2014, when most models predicted a strong El Niño event. However, although the observed ocean–atmosphere state in the tropical Pacific during that early spring was reminiscent of the situation in early 1997, a major warm event did not develop in 2014 (McPhaden 2015). Instead, the widely anticipated strong warm event came one year later in 2015/16. Several studies proposed possible reasons preventing this warm event from happening in 2014, including an early termination of the spring westerly wind bursts (Menkes et al. 2014), unusually strong basinwide easterly winds in June (Hu and Fedorov 2016), and persistent cold SST anomalies in the off-equatorial southeastern Pacific (Zhu et al. 2016). In retrospect, although the accumulation of equatorial oceanic heat content seemed ripe for the onset of a warm event by early 2014, the atmospheric and sea surface conditions in late spring and early summer played a decisive role in causing the 1-yr “delay.” Whether the apparent failure of model predictions was a forecast bust due to system errors, or was associated with an inherently unpredictable event remains an open question. Admittedly, when considering the whole ensemble rather than the mean, some models, such as the European Centre for Medium-Range Weather Forecasts (ECMWF), predicted a nonnegligible probability for a neutral or weak El Niño. Another apparent false alarm for the occurrence of a strong El Niño event was also issued by many models in 2012 (see http://iri.columbia.edu/our-expertise/climate/forecasts/enso/archive/201208/QuickLook.html), as also will be shown in this paper.
In general, the ENSO prediction skill has declined during the 2000s compared with the 1980s and 1990s (e.g., Barnston et al. 2012; Zhao et al. 2016), in spite of increasing ocean observations during this period (Kumar et al. 2015). This decline coincided with the more frequent occurrences of the warm-pool El Niño events (e.g., Kao and Yu 2009; Kug et al. 2009) and prolonged La Niña events (e.g., Hu et al. 2014). The warm water volume (WWV) in the equatorial Pacific, a critical precursor of El Niño events, also leads the ENSO SST index by only one season in this period, instead of 2–3 seasons as in the 1980s and 1990s (McPhaden 2012). These changes in ENSO characteristics were likely associated with mean-state changes in the tropical Pacific Ocean. Over the past two decades, the Pacific trade winds have strengthened, causing colder SST in the eastern equatorial ocean (e.g., England et al. 2014). Although the declining skill during the 2000s may be a sign of reduced ENSO predictability in response to the change of the mean state, the inability to predict warm-pool-type El Niño events may also reflect model inadequacy in adapting to the long-term climate variations.
To further improve seasonal ENSO prediction, it is necessary to better represent critical atmospheric and surface processes that trigger or hinder ENSO development, such as those occurring in the spring of 2014. The ENSO triggers on subseasonal–seasonal time scales may include westerly wind bursts (WWBs) (e.g., Kessler et al. 1995), the tropical meridional modes (e.g., Chiang and Vimont 2004; Zhu et al. 2016), surface salinity anomalies (e.g., Zhu et al. 2014), and the footprint of the extratropical atmospheric perturbations from both Northern (e.g., Alexander et al. 2010) and Southern (e.g., Terray 2011) Hemispheres, among many other factors, mostly occurring from boreal winter to early summer prior to the ENSO development. We should also make our forecast systems more adaptive to background changes and to different ENSO “flavors” (Johnson 2013). To make progress on these fronts, a process-oriented improvement of forecast systems should be made. This requires a critical assessment of current predictions/reforecasts to see how realistically they reproduced the key developments of historical El Niño events and what could be their weakness in a case-by-case analysis. For instance, we may ask whether the retrospective predictions, such as the one for 2014, could be improved with targeted modifications to model physics and/or initialization through more adequate sampling of the intraseasonal SST variations and better representation of the atmospheric boundary layer and convective processes. We should also explore critical factors for predicting ENSO intensity and structure. Currently, substantial effort is devoted to the correlation skill of predicting ENSO indices while the model capability in predicting intensity and spatial structure of individual ENSO events has not been evaluated as extensively.
The case-study approach requires predictions for a large number of historical ENSO events with different background states. Since most current reforecasts only cover the period since 1979, it would be very useful to extend them to an earlier period. The mean climatology in the Indo-Pacific basin in the twentieth century prior to 1979 was apparently quite different from that in the later period owing to climate change and multidecadal variability (e.g., Zhang et al. 1998; Fedorov and Philander 2000). Moreover, earlier El Niño events also showed unique features distinct from those in either the 1980s and 1990s or the 2000s. For instance, in the earlier period, warm SST anomalies usually appeared first near the coast of South America then later in the central equatorial Pacific while the order reversed in the cold tongue El Niño events of the 1980s and 1990s (e.g., Wang 1995). On the other hand, the WWV lead time to the ENSO SST index was typically 2–3 seasons in the 1960s and 1970s, similar to that of the 1980s and 1990s but different from those in the 2000s (McPhaden et al. 2015). It is important to examine how well forecast systems represent these interdecadal changes in ENSO characteristics.
Recently, we have conducted a set of ensemble seasonal reforecasts for the period 1958–2014 using the Climate Forecast System, version 2 (CFSv2), the operational climate prediction system at the National Centers for Environmental Prediction (NCEP; Saha et al. 2014). This set of reforecasts is initialized with the new long-term global ocean reanalysis from ECMWF (Balmaseda et al. 2013), together with the best available observation-based analyses for the land and atmosphere. In particular, this set of reforecasts provides a long record of the seasonal ENSO predictions using a state-of-the-art coupled climate system and modern initializations for an extended period. Some ENSO prediction experiments extending to the earlier period used intermediate coupled models with simple ocean initialization schemes (e.g., Chen et al. 2004), probably due to the lack of high-quality ocean reanalysis at the time. Similarly, the seasonal predictions in the DEMETER project were initialized with oceanic states from forced ocean model runs without assimilating subsurface observations (Palmer et al. 2004).
On the other hand, as part of the ENSEMBLES project (http://ensembles-eu.metoffice.com/), a set of 46-yr (1960–2005) seasonal ensemble reforecasts has been conducted using five global climate models, four of which are initialized by an oceanic analysis with the assimilation of subsurface measurements (Weisheimer et al. 2009; Alessandri et al. 2011). Complementary to ENSEMBLES, our reforecasts provide a longer record that extends to the most recent decade using a current operational seasonal forecast model with a new set of ocean analysis. Our reforecasts also complement the recent McPhaden et al. (2015) case study in predicting the tropical Pacific SST anomalies in 1975 using the current ECMWF operational seasonal forecast system.
The purpose of this study is to examine a long and continuous high-quality dataset of seasonal reforecasts for targeted seasonal ENSO forecast evaluation. It substantially enlarges the sample size of the ENSO cases for prediction and predictability research and is useful to the research community in many other aspects. It is also a rigorous test of the robustness of the state-of-the-art climate forecast systems at different stages of global climate change and multidecadal variability. We realize that a major concern is whether the seasonal reforecasts for the presatellite era (1960s and 1970s) provide useful skill, when the in situ measurements were also much fewer, although the hydrographic observations have been increasing steadily since the first International Geophysical Year in 1958. In this paper, we demonstrate that the quality of this extended set of reforecasts, especially during the earlier period, is adequate for the purposes mentioned above.
The paper is structured as follows. The 57-yr (1958–2014) reforecast experiment is described in section 2. Section 3 provides a general evaluation of ENSO prediction skill. As examples, section 4 analyzes the reforecasts of the 1963/64 and 1982/83 El Niño events. The main results of this study are summarized and discussed in section 5.
2. A 57-yr reforecast (1958–2014)
CFSv2 is a coupled climate forecasting system composed of interacting atmospheric, oceanic, sea ice, and land components. Its atmospheric component is a reduced-resolution version of the Global Forecast System (GFS), used for U.S. operational global numerical weather prediction and atmospheric reanalysis (e.g., Kalnay et al. 1996). This version of GFS has a spectral horizontal resolution of T126 (105-km grid spacing) and 64 vertical levels in a hybrid sigma–pressure coordinate. It is also directly coupled to the Noah land surface model (LSM) at the same horizontal resolution (Chen et al. 1996; Ek et al. 2003). The oceanic component is the Geophysical Fluid Dynamics Laboratory (GFDL) Modular Ocean Model (MOM), version 4 (MOM4; Griffies et al. 2004), configured for the global ocean with a horizontal grid of 0.5° × 0.5° poleward of 30° latitude and meridional resolution increasing gradually to 0.25° between 10°S and 10°N (nominally referred to as 0.5° resolution). It has 40 vertical levels in a z coordinate, with 27 levels within the upper 400 m and the maximum depth at approximately 4.5 km. The sea ice component is a three-layer global interactive dynamical sea ice model with predicted fractional ice cover and thickness (Winton 2000). The atmospheric, oceanic, and sea ice components exchange surface momentum, heat, and freshwater fluxes, as well as SST and surface information on ice every 30 min.
The reforecasts are initialized from observationally based ocean, atmosphere, and land initial conditions. For the whole period of 1958–2014, the ocean initial states are from the instantaneous restart files of the ECMWF Ocean Reanalysis System 4 (ORAS4) (Balmaseda et al. 2013). The ORAS4 analysis is produced by the NEMO variational (NEMOVAR) ocean data assimilation (ODA) system (Mogensen et al. 2012), which assimilates quality-controlled temperature and salinity profiles from the EN3 database (Ingleby and Huddleston 2007) and along-track satellite sea level measurements with a 10-day window. A set of five ensemble-member assimilation runs is driven by daily surface momentum, heat, and freshwater fluxes successively from ERA-40 from September 1957 to December 1988 (Uppala et al. 2005), ERA-Interim from January 1989 to December 2009 (Dee et al. 2011), and the ECMWF operational analysis from January 2010 onward, subject to a strong relaxation to the gridded SST analyses. In our analysis, we did not find any significant effect on the ocean analysis by these changes of the surface forcing. The five members of the ensemble assimilation runs are generated as follows. They start from different initial conditions at the start of the reanalysis (for details, see Balmaseda et al. 2013). In addition, four of the five ensemble members are forced with perturbations added to the observed surface momentum fluxes, commensurate with the estimated uncertainty in the wind stress analysis (Balmaseda et al. 2008). For these perturbed members, a small fraction of temperature and salinity measurements is also randomly rejected to represent the uncertainty of observational coverage and quality control decision algorithm (Balmaseda et al. 2013). The ORAS4 restart files from these ensemble runs are directly converted to the CFSv2 initial states by linear interpolation to the MOM4 grid.
We have previously used another NEMOVAR-based ocean analysis produced by ECMWF to initialize CFSv2 seasonal forecasts for 1979–2008 and found that it achieves the highest ENSO prediction skill among a group of state-of-the-art ocean reanalysis datasets (Zhu et al. 2012).
Unlike the ocean reanalysis, there are no continuous reanalyses for land, atmosphere, and sea ice throughout the whole period. Therefore, the initial conditions for these component models were assembled from several different data sources before and after 1979. Starting in 1979, the atmosphere, land, and sea ice initial states were taken from the restart files of the Climate Forecast System Reanalysis (CFSR; Saha et al. 2010). For 1958–78, the atmospheric initial states were interpolated from the ERA-40 model level atmospheric reanalysis (Uppala et al. 2005). The land initial states were adapted from the reprocessed 3-hourly Global Land Data Assimilation System, version 2.0 (GLDAS-2.0), analysis on a 1° × 1° grid (Rodell et al. 2004; Rui and Beaudoing 2015), produced by the National Aeronautics and Space Administration (NASA) Goddard Space Flight Center (GSFC). The GLDAS-2.0 analyses are derived from the Noah LSM (Ek et al. 2003) forced with the Princeton global meteorological forcing data (Sheffield et al. 2006). The state variables used for the CFSv2 initialization are soil moisture (both total and liquid part) and soil temperature at the standard Noah model layers, snow depth and liquid water equivalent, skin temperature, and canopy water storage, which are linearly interpolated to the grid of CFSv2. Since there is no suitable sea ice analysis available before 1979, we used a fixed annual cycle of sea ice states to initialize the reforecasts of 1958–78. For each initial month, we selected the CFSR sea ice initial state whose Arctic sea ice area is closest to the 1958–78 ERA-40 mean value for that month. The selected state was at or near the maximum area found in CFSR and always came from the first five years of CFSR, after which the sea ice area was always lower than the 1958–78 ERA-40 mean value.
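The sea ice selection rule for 1958–78 amounts to a nearest-value lookup for each initial month. The following is a minimal sketch of that rule only; the function name and the area values are hypothetical placeholders, not CFSR or ERA-40 numbers.

```python
def pick_closest_state(cfsr_area_by_year, target_area):
    """Return the CFSR year whose Arctic sea ice area is closest to the target."""
    return min(
        cfsr_area_by_year,
        key=lambda year: abs(cfsr_area_by_year[year] - target_area),
    )

# Placeholder areas (10^6 km^2) for one calendar month.
# These are made-up numbers for illustration, not real CFSR or ERA-40 values.
cfsr_april_area = {1979: 14.9, 1980: 14.7, 1981: 14.5, 1982: 14.8, 1983: 14.6}
era40_april_mean = 15.0  # stands in for the 1958-78 ERA-40 mean for that month

chosen = pick_closest_state(cfsr_april_area, era40_april_mean)
print(chosen)
```

Repeating the lookup for each of the four initial months would give the fixed annual cycle of sea ice states described above.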
We produced four sets of ensemble reforecasts with 12-month duration initialized at the beginning of January, April, July, and October, respectively, for 1958–2014. Starting from the baseline of five ORAS4 initial states at 0000 UTC of the first day for an initial month, the ensemble is constructed by pairing each of the oceanic initial states with four atmospheric and land initial states at 0000 UTC of the first four days of that month. Therefore, an ensemble of 20 members is generated for each reforecast. The sea ice initial state is fixed at 0000 UTC of the first day of the month for all 20 ensemble members. Table 1 summarizes the data sources and the ensemble generation for the reforecasts.
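The pairing of ocean states with atmosphere and land states can be sketched as a Cartesian product. This is an illustrative sketch only: the label strings are hypothetical and do not correspond to actual ORAS4, CFSR, or ERA-40 file names.

```python
from itertools import product

# Five ORAS4 ocean states, all valid at 0000 UTC on day 1 of the initial month.
OCEAN_MEMBERS = range(5)
# Atmosphere/land states at 0000 UTC on days 1-4 of the initial month.
ATMOS_LAND_DAYS = range(1, 5)

def build_ensemble(year, month):
    """Pair every ocean state with every atmosphere/land state (5 x 4 = 20)."""
    return [
        {
            "ocean": f"ocean_m{o}_{year}{month:02d}01",          # hypothetical label
            "atmos_land": f"atmland_{year}{month:02d}{d:02d}",   # hypothetical label
            "sea_ice": f"ice_{year}{month:02d}01",               # same for all members
        }
        for o, d in product(OCEAN_MEMBERS, ATMOS_LAND_DAYS)
    ]

members = build_ensemble(1963, 4)
n_years = len(range(1958, 2015))          # 57 years
n_runs = n_years * 4 * len(members)       # four start months per year
print(len(members), n_runs)
```

With 57 years, four start months, and 20 members, the full experiment comprises 4560 individual 12-month integrations.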
The observed SST used for verification is the global monthly Extended Reconstructed SST, version 3 (ERSST.v3; Smith et al. 2008), for 1958–2014 on a 2° latitude × 2° longitude grid. We have also used the SST and upper-ocean temperatures from ORAS4 for the verification of the reforecasts. The zonal surface wind stress data are from the ERA-40 and ERA-Interim atmospheric reanalyses for 1958–78 and 1979–2014, respectively.
3. An evaluation of the ENSO reforecast skill
In this section, we present an evaluation of the SST reforecast skill for 1958–2014 with a focus on the comparison before and after 1979. This choice is based on two considerations. First, many of the comparable reforecasts start from 1979 when the remote sensing observations became available. Moreover, the in situ subsurface observations have started to grow rapidly in the tropical Pacific since the 1980s. Therefore, compared to the later period (1979–2014), the earlier period (1958–78) is characterized by a much smaller observation base. Second, these two periods are separated by a major decadal climate shift occurring during 1976–77 in the North Pacific (e.g., Trenberth and Hurrell 1994), which may have led to a change of the ENSO characteristics (e.g., Wang 1995; An and Wang 2000).
Because the global mean SST has been increasing since the 1950s as a result of climate change, the SST anomalies derived by subtracting a fixed monthly climatology for the whole period mix this long-term trend with the interannual variability. To focus on the interannual signals, we have used two separate monthly climatologies for 1958–78 and 1979–2014 to derive the SST anomalies for these two periods respectively. Accordingly, the bias corrections to the reforecasts are estimated and applied separately for these two periods, which reduces the effects of the potential mean-state differences due to the different amounts and qualities of the observations used for the model initializations in these two periods. The Niño-3.4 index (i.e., the averaged SST anomalies in 5°S–5°N, 120°–170°W) calculated in this way is generally consistent with the historical index provided by the Climate Prediction Center (CPC), which is also derived from multiple climatologies with overlapping 30-yr base periods successively updated at a 5-yr interval (Lindsey 2013). For brevity, the two periods of 1958–78 and 1979–2014 are referred to as P5878 and P7914, respectively, hereafter.
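The period-specific anomaly calculation can be sketched as follows. This is a simplified illustration in plain Python, not the processing code used for the reforecasts; the function names and the synthetic input are assumptions. Applying the same function to reforecast values for a fixed start month and lead month is one common way to estimate and remove the model bias separately for each period, which we assume here for illustration.

```python
from collections import defaultdict

PERIODS = {"P5878": (1958, 1978), "P7914": (1979, 2014)}

def period_of(year):
    """Return the name of the period that contains the given year."""
    for name, (start, end) in PERIODS.items():
        if start <= year <= end:
            return name
    raise ValueError(f"{year} is outside 1958-2014")

def anomalies(values):
    """values: {(year, month): SST}. Subtract the monthly climatology of the
    period (P5878 or P7914) to which each year belongs."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (year, month), v in values.items():
        key = (period_of(year), month)
        sums[key] += v
        counts[key] += 1
    clim = {key: sums[key] / counts[key] for key in sums}
    return {
        (year, month): v - clim[(period_of(year), month)]
        for (year, month), v in values.items()
    }

# Synthetic January SST: a 0.5 degC step at 1979 plus two 1 degC "events".
sst = {
    (y, 1): 26.0 + (0.5 if y >= 1979 else 0.0) + (1.0 if y in (1972, 1997) else 0.0)
    for y in range(1958, 2015)
}
anom = anomalies(sst)
print(round(anom[(1972, 1)], 3), round(anom[(1997, 1)], 3))
```

In this synthetic case the step between the two periods does not appear in the anomalies, so the two events have nearly the same amplitude relative to their own period's climatology.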
Figure 1 shows the spatial distribution of the SST correlation skill at selected lead months from the reforecast runs initialized in April for P5878 (Fig. 1, left) and P7914 (Fig. 1, right). At lead month 0 (April), the correlation skill is generally high (>0.6) in most regions north of 30°S for both (Fig. 1a), with higher skill (>0.8) in the tropical Pacific, a large part of the North Pacific, the northern tropical Atlantic, and the Arabian Sea for both P5878 and P7914. However, high skill (>0.8) appears only in P7914 in the subtropical South Pacific, the southern tropical Atlantic, and the northern North Atlantic (Fig. 1a, right). In the Indian Ocean, skill is generally higher in P7914 than in P5878, although the latter is better near Madagascar. The higher degree of correlation at lead month 0 in the later period can be due to better-observed SST (hence higher consistency between verification and initialization), larger trends, and/or stronger persistence, possibly associated with deeper ocean mixed layers.
At lead month 2 (June), there is a general skill reduction, and high correlations (>0.6) are mostly confined to the tropics for both P5878 and P7914 (Fig. 1b). The decline is faster for P5878 in the North Pacific and Atlantic Oceans, with its skill practically lost in these two regions by lead months 5 and 8 (Figs. 1c,d, left). On the other hand, the P5878 skill is quite comparable to that in P7914 in the tropics. In fact, the skill in P5878 is better than in P7914 in the off-equatorial central Pacific and the northern tropical Atlantic Oceans at lead month 2 (Fig. 1b). By lead month 5 (September), the P5878 skill in the off-equatorial Pacific remains above 0.7 and higher than its P7914 counterpart (Fig. 1c). At this lead month, P5878 also shows higher skill in the western Indian Ocean. By lead month 8 (December), the highest skill for both periods is located in the central equatorial Pacific between 120°W and the date line (Fig. 1d). For P7914, there is a skill recovery from lead month 5 but this skill recovery is not apparent for P5878.
The reforecast runs initialized in July (Fig. 2) generally show slower reduction of the correlation skill with increasing lead month in the tropical Pacific than those initialized in April (Fig. 1), as expected from the well-documented spring barrier effect. The skill of the July runs is also comparable between P5878 (Fig. 2, left) and P7914 (Fig. 2, right) up to lead month 5 (Figs. 2a–c), although P7914 shows better skill than P5878 in the tropical Pacific by lead month 8 (March; Fig. 2d). Similarly, the reforecast runs initialized in October show comparable skill in the tropics between P5878 and P7914 at lead months 0 (October; Fig. 3a) and 2 (December; Fig. 3b). In fact, P5878 outperforms P7914 in the central equatorial Indian Ocean at lead month 2. However, as in the July runs, by March (lead month 5), the P5878 skill (Fig. 3c, left) in predicting ENSO, as well as in the tropical Indian Ocean, is lower than that of P7914 (Fig. 3c, right), suggesting a faster decay of skill. The P5878 skill in these regions is largely lost by lead month 8 (June) (Fig. 3d, left), although the P7914 skill is quite robust (>0.6) from the central to eastern equatorial Pacific and in the northern Indian Ocean (Fig. 3d, right). This phenomenon is more clearly shown in the January runs (Fig. 4) when the correlation skill in P5878 generally decreases faster than in P7914. The correlations are mostly below 0.5 by June (lead month 5) in the central equatorial Pacific in P5878 (Fig. 4c, left), while they are mostly above 0.6 there in P7914 (Fig. 4c, right). This is possibly related to the fact that the El Niño events were shorter-lived in P5878 (e.g., Balmaseda et al. 2013, their Fig. C2). Outside the tropics, P7914 generally performs better than P5878. Like the April reforecasts, the skill of the P5878 reforecasts initialized in all other months again declines faster and is lost quickly in the North Pacific and Atlantic Oceans.
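Because the discussion switches between lead months and calendar months, the mapping between them is worth stating explicitly: lead month 0 is the initial month itself. A small helper (the name `target_month` is ours, introduced only for illustration) makes the convention concrete.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def target_month(start_month, lead):
    """Calendar month (1-12) verified at a given lead; lead 0 is the start month."""
    return (start_month - 1 + lead) % 12 + 1

# Lead months shown in the skill maps for each of the four start months.
for start in (1, 4, 7, 10):
    targets = [MONTHS[target_month(start, lead) - 1] for lead in (0, 2, 5, 8)]
    print(MONTHS[start - 1], targets)
```

For example, lead month 8 corresponds to December for the April runs, March for the July runs, and June for the October runs.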
These ENSO reforecasts are further evaluated by comparing the Niño-3.4 index from the ensemble reforecasts with the observations. The correlations of the ensemble mean forecasts with observations show that the Niño-3.4 predictive skill is better for P5878 (red curves) than for P7914 (blue curves) in the April initialization runs (Fig. 5a, left) during June–October when the latter shows a minimum in August. This difference seems to imply a change in the characteristics of the boreal spring predictability barrier between the later (P7914) and the earlier (P5878) periods (e.g., Balmaseda et al. 1995; Barnston et al. 2012). Initialized in July, the correlation skill is comparable between P5878 and P7914 until February when the P5878 skill decays more quickly (Fig. 5b, left). Faster decay starting in March is also shown in the P5878 skill for the October and January runs (Figs. 5c,d, left). On the other hand, the root-mean-square error (RMSE) (Fig. 5, right) is generally smaller for P5878 (red curves) than for P7914 (blue curves). In particular, the RMSE increases quickly for P7914 in the beginning months of the July and October runs that peak in December (Figs. 5b,c, right). Although this may reflect an increased variance of the SST anomalies in the second period because stronger El Niño events occurred, this may not fully explain why the RMSE is particularly high in the summer and fall seasons.
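The two scores used here measure different things, which is why correlation and RMSE can rank the two periods differently. The sketch below uses the standard definitions of the anomaly correlation and RMSE of the ensemble mean, with made-up index values for illustration; it is not the evaluation code used for Fig. 5.

```python
import math

def ensemble_mean(members):
    """members: list of equal-length forecast series, one per ensemble member."""
    n = len(members)
    return [sum(vals) / n for vals in zip(*members)]

def correlation(f, o):
    """Pearson correlation between forecast and observed series."""
    mf, mo = sum(f) / len(f), sum(o) / len(o)
    cov = sum((a - mf) * (b - mo) for a, b in zip(f, o))
    var_f = sum((a - mf) ** 2 for a in f)
    var_o = sum((b - mo) ** 2 for b in o)
    return cov / math.sqrt(var_f * var_o)

def rmse(f, o):
    """Root-mean-square error between forecast and observed series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, o)) / len(f))

# Made-up Nino-3.4 anomalies (degC) for four verification years at one lead.
obs = [-1.0, 0.0, 2.0, -0.5]
members = [
    [-1.4, 0.1, 2.9, -0.8],
    [-1.6, -0.1, 3.1, -0.7],
]
mean = ensemble_mean(members)  # about 1.5 times the observed anomalies
print(correlation(mean, obs), rmse(mean, obs))
```

In this toy case the ensemble mean has the right phase but overshoots the amplitude by 50%, so the correlation is essentially 1 while the RMSE is about 0.57°C. A forecast that overestimates event peaks can therefore retain high correlation skill and still have a large RMSE.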
Comparing the predicted and observed Niño-3.4 time series, these statistical features can be linked with the characteristics of the model-predicted ENSO events during these two periods. Figure 6 shows that the P5878 ensemble mean reforecasts initiated in April (thick black curves) predicted the observed (thick red curve) 1963/64, 1965/66, 1972/73, and 1976/77 El Niño events skillfully, although the 1968/69 warm event was missed. The model also predicts the major 1964/65, 1970/71, and 1973/74 La Niña events, as well as the extended cold conditions in 1971 and the reemergent La Niña in 1975. Compared with the reforecasts of the 1975 case reported by McPhaden et al. (2015), our reforecasts seem more realistic in picking up the growth of the initial cold SST anomalies in April 1975 and the subsequent peak in early 1976 (thick black and red curves, Fig. 6). These are in contrast to the reforecasts of McPhaden et al. (2015) initialized in May 1975, which generally predicted a decay of the initial cold SST anomalies throughout the year (see their Fig. 7). In comparison, the somewhat lower correlation and higher RMSE for P7914 (Fig. 5a) likely reflect the fact that there are more missed events and false alarms during 1993–95 and since the 2000s than in other episodes. In particular, the coupled system mispredicts or overestimates the warm SST anomalies during 2012–14. The model also underestimates the strong 1982/83 El Niño event (Fig. 6). As we have pointed out in the introduction, understanding the specific reasons for each of these events requires a case-by-case analysis.
In comparison to the April runs, the reforecasts initialized in July show significantly improved skill. In particular, the peak time and amplitude of most major ENSO events are correctly predicted by the ensemble means (Fig. 7). For example, the 1982/83 event is predicted quite accurately. In comparison with the April forecasts, the SST fluctuations during 2012–14 are also better predicted. The time series of the predicted Niño-3.4 indices from the October initial states (Fig. 8) also show that the model generally simulates the timing of the transition of phases from growth to decay realistically for major ENSO events throughout the whole period of 1958–2014. However, during P5878, the model predicts slower decay rate of the SST anomalies after the peaks of several major warm and cold events, such as those in 1964 and 1972, although the model handles the decaying phase of El Niño events in 1966 and 1973 reasonably well. On the other hand, the larger RMSE in P7914 during shorter lead months (Fig. 5c, right) seems mainly related to the overshoot in the peak amplitude of several major El Niño events in 1982/83, 1986/87, 1997/98, and 2002/03. The magnitude and the decay rate of the major ENSO events are better simulated by the predictions initialized in January (Fig. 9), although, similar to the April runs, the model tends to underestimate the growth rate of the ENSO development (e.g., the warm events initiated in 1972, 1982, 1986, and 1997).
Overall, the P7914 reforecasts are better than the P5878 ones in the North Pacific and North Atlantic Oceans with all starting dates. This indicates a decadal change in seasonal prediction skill outside the tropical Pacific. It is interesting that the skill patterns in the North Pacific and Atlantic Oceans bear certain resemblance to the patterns of lower-frequency variations in these regions, such as the Pacific decadal oscillation (PDO) and the Atlantic multidecadal variability (AMV). Further examinations will be conducted to determine whether the increased skill in P7914 in these regions is simply due to the aliasing of the lower-frequency variations or is related to the changed physical processes, such as the reemergence of the SST anomalies (e.g., Alexander et al. 1999) enhanced by a potentially deeper extratropical oceanic mixed layer in winter.
On the other hand, the tropical SST correlation skill is comparable in these two periods from the ENSO initiation to mature phases. In fact, when initialized in spring, the P5878 reforecasts outperform those in P7914 in the central equatorial Pacific in the ENSO development phase (June–September). Initiated after summer, the P5878 reforecasts lose skill more quickly than those in P7914 during the next spring and early summer, associated with slower-than-observed decay of the model ENSO signals during this period. The larger RMSE scores at short lead months for the P7914 reforecasts from the July and October initial states are mostly due to the overestimates of the peak SST anomalies of the major El Niño events in late fall and early winter.
In addition to SST, we have also examined the skill in predicting the upper-ocean heat content (HC), defined as the mean temperature of the upper 300 m. The verification data are from the ORAS4 monthly mean temperature analysis. Figure 10 shows the spatial distribution of the HC correlation skill at selected lead months from the reforecast runs initialized in April for P5878 (Fig. 10, left) and P7914 (Fig. 10, right). The correlation skill is high globally at lead month 0 (Fig. 10a), demonstrating the high persistence of the initialized HC anomalies. High correlations also appear in lead month 2 over many regions, although faster decreases in skill occur in the equatorial Atlantic and Indian Oceans (Fig. 10b). By lead months 5 (Fig. 10c) and 8 (Fig. 10d), high correlations are concentrated in the tropical Pacific and the northern part of the North Pacific and North Atlantic. There are no significant differences in the spatial distributions of the correlation skills for P5878 and P7914.
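The HC definition above can be illustrated with a minimal sketch. The text specifies only that HC is the mean temperature of the upper 300 m; the thickness weighting, the layer interfaces, and the temperature values below are assumptions made for illustration, not the ORAS4 or CFSv2 vertical grid.

```python
def upper_ocean_heat_content(depth_bounds, temps, max_depth=300.0):
    """Thickness-weighted mean temperature (degC) from the surface to max_depth.

    depth_bounds: layer interfaces in metres, increasing, length len(temps) + 1
    temps: layer-mean temperatures in degC
    """
    total = 0.0
    thickness_sum = 0.0
    for top, bottom, temp in zip(depth_bounds[:-1], depth_bounds[1:], temps):
        # Only the part of each layer lying above max_depth contributes.
        thickness = max(0.0, min(bottom, max_depth) - top)
        total += temp * thickness
        thickness_sum += thickness
    if thickness_sum == 0.0:
        raise ValueError("no layers above max_depth")
    return total / thickness_sum


# Hypothetical profile: four layers with interfaces at 0, 50, 150, 300, and 500 m.
bounds = [0.0, 50.0, 150.0, 300.0, 500.0]
profile = [28.0, 24.0, 16.0, 10.0]
hc = upper_ocean_heat_content(bounds, profile)
```

In this toy profile the layer below 300 m is excluded entirely, so the result depends only on the three upper layers.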
We have further examined the prediction skill of two HC indices corresponding to the WWV and “tilt” modes of the thermocline variability in the equatorial Pacific, respectively (e.g., Meinen and McPhaden 2000; Bunge and Clarke 2014). Following Meinen and McPhaden (2000), we use the average of the HC anomalies within the region 5°S–5°N, 120°E–80°W to approximate WWV. We also use the average of the HC anomalies in the Niño-3.4 region to characterize the tilt mode, which is largely in phase with other ENSO indices (e.g., Bunge and Clarke 2014). For simplicity, we refer to these two indices as the WWV and tilt indices in the following discussion.
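As a rough illustration of how the two indices could be formed, the sketch below averages a gridded HC anomaly field over a latitude-longitude box with cos(latitude) weights. The weighting scheme, the function name, and the toy grid are assumptions. The WWV box is taken as 5°S–5°N, 120°E–80°W, the usual Meinen and McPhaden (2000) region, and the tilt box as the standard Niño-3.4 region (5°S–5°N, 170°–120°W), both expressed in degrees east.

```python
import math


def box_mean(field, lats, lons, lat_min, lat_max, lon_min, lon_max):
    """cos(latitude)-weighted mean of field[j][i] over a box (lons in 0-360 E)."""
    num = 0.0
    den = 0.0
    for j, lat in enumerate(lats):
        if not (lat_min <= lat <= lat_max):
            continue
        weight = math.cos(math.radians(lat))
        for i, lon in enumerate(lons):
            if lon_min <= lon <= lon_max:
                num += weight * field[j][i]
                den += weight
    if den == 0.0:
        raise ValueError("box contains no grid points")
    return num / den


# Index regions as (lat_min, lat_max, lon_min, lon_max), longitudes in degrees east.
WWV_BOX = (-5.0, 5.0, 120.0, 280.0)   # 120E-80W
TILT_BOX = (-5.0, 5.0, 190.0, 240.0)  # Nino-3.4: 170W-120W

# Hypothetical coarse grid and HC anomaly field: 1.0 degC at 200E, zero elsewhere.
lats = [-10.0, -5.0, 0.0, 5.0, 10.0]
lons = [100.0, 150.0, 200.0, 250.0, 300.0]
hc_anom = [[1.0 if lon == 200.0 else 0.0 for lon in lons] for _ in lats]

wwv_index = box_mean(hc_anom, lats, lons, *WWV_BOX)
tilt_index = box_mean(hc_anom, lats, lons, *TILT_BOX)
```

With this toy field the anomaly sits entirely inside the Niño-3.4 box, so the tilt index equals the local anomaly while the basin-wide WWV index is diluted by the surrounding zero values.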
With the reforecasts initialized in April, the predicted WWV index has a correlation skill above 0.8 up to a lead time of 8 months and above 0.5 by the end of the 1-yr reforecast (Fig. 11a). Correspondingly, the RMSE increases gradually from below 0.1°C to around 0.3°C throughout the reforecasts (Fig. 11b). Moreover, both measures of skill are comparable between P5878 (red curve) and P7914 (blue curve). This confirms that the WWV anomalies are adequately initialized in our reforecasts and provide useful skill in our ENSO predictions. On the other hand, the correlation (Fig. 11c) and RMSE (Fig. 11d) of the tilt index are very consistent with those of the Niño-3.4 SST index (Fig. 5a), showing better skill for P5878 (red curves) than for P7914 (blue curves), especially during June–September.
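The two skill measures quoted above can be sketched as follows, assuming that each lead month is verified separately by pairing the ensemble mean index with the analysis across the reforecast years. The function names and the toy numbers are hypothetical, and the precise verification procedure used for Fig. 11 may differ in detail.

```python
import math


def correlation(pred, obs):
    """Pearson correlation between predicted and observed anomaly series."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)


def rmse(pred, obs):
    """Root-mean-square error between predicted and observed series."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))


def skill_by_lead(pred_by_lead, obs_by_lead):
    """Return one (correlation, rmse) pair per lead month."""
    return [(correlation(p, o), rmse(p, o))
            for p, o in zip(pred_by_lead, obs_by_lead)]


# Hypothetical index values (degC) for four start years at two lead months.
pred_by_lead = [[0.5, -0.5, 1.0, -1.0], [0.5, -0.5, 1.0, -1.0]]
obs_by_lead = [[0.5, -0.5, 1.0, -1.0], [1.0, -1.0, 2.0, -2.0]]
skill = skill_by_lead(pred_by_lead, obs_by_lead)
```

The second lead month in this toy case shows why both measures are reported: a prediction with the correct phase but half the observed amplitude retains a perfect correlation while its RMSE grows.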
4. Reforecasts of the 1963/64 and 1982/83 El Niño events
In this section, we examine the reforecasts of the 1963/64 and 1982/83 El Niño and the subsequent La Niña events in the P5878 and P7914 periods, respectively, which demonstrate the main characteristics typical of the major ENSO events for the two separate periods described in section 3. Figure 12 shows the observed and predicted Niño-3.4 indices for these two episodes. For observational verification, we have used the SST analyses from both ERSST.v3 (black curve) and ORAS4 (gray curve). It can be seen that, for the 1963/64 event, positive Niño-3.4 was first observed in February–March 1963 (black and gray curves in Fig. 12a) and grew throughout the year to peak at over 1°C in December. The index then declined quickly and became negative after May 1964. The cold SST anomalies continued to grow into a La Niña, with cold SST anomalies peaking at over −1°C by the end of the year. The cold anomalies then gradually weakened in 1965. The two analyses are quite consistent with each other during this episode, except in the early summer of 1963, when a temporary dip occurred during May–June in ERSST.v3 (black curve) but not in ORAS4 (gray curve).
For the 1982/83 event, the two analyses are more noticeably different. The Niño-3.4 index from ERSST.v3 (black curve in Fig. 12b) increased gradually from February to June in 1982 and, after a short decline between June and July, increased rapidly from 0.5°C in July to about 2°C in October. The dip in June and July, however, was weaker in ORAS4 (gray curve in Fig. 12b). After October, the Niño-3.4 index increased further to its peak in January 1983. During this period, the ORAS4 data showed a faster rate of increase than ERSST.v3, with the former peaking at 3.0°C while the latter peaked at 2.5°C. The warm SST anomalies started to decay in February and cold SST anomalies appeared in July. The cold anomalies then developed into a prolonged La Niña event. It first peaked in November–December 1983 at −1°C, and, after a quick weakening in the spring of 1984, the cold SST anomalies reemerged in the subsequent summer. The two analyses are more consistent during the decaying phase of the 1982/83 El Niño event.
For the model reforecasts, we focus on the cases initialized in April and October because they best characterize the ENSO initiation/development and mature/decay phases, respectively. Initialized in April 1963, the ensemble mean reforecast (the red curve in Fig. 12a) predicted the growth of the warm SST anomalies quite accurately with a spread of the ensemble members (thin dashed red curves) that envelops the analyses (black and gray curves) well from the growth to peak stages. The decay starting in January, however, is slightly delayed and subsequently takes place at a somewhat slower rate. On the other hand, the ensemble mean reforecast initialized in April 1982 (thick red curve in Fig. 12b) underestimated the growth of the Niño-3.4 index throughout the duration of the reforecast. The spread of ensemble members (thin dashed red curves) is notably larger than that of the corresponding 1963/64 reforecasts. It is interesting to note that the observed index, as represented by both thick black and gray curves in Fig. 12b, is near the upper limit of the spread, and a fraction of the ensemble members seems to catch the faster observed growth rate from July to October.
When forecasts are initialized in October, the mean Niño-3.4 index from the 1963 reforecasts (thick green curve in Fig. 12a) first increased slightly from October to December, then decayed slowly throughout the rest of the reforecast period. As a result, the ensemble mean SST anomalies are still above 0.5°C by the end of this reforecast, which is in stark contrast to the rapid decay of the observed Niño-3.4 index (black and gray curves). On the other hand, the mean Niño-3.4 index from the 1982 October reforecasts is already about 0.5°C warmer than its observed counterparts at lead month 0. This is possibly due to either the errors of the initial SST anomalies or a faster growth rate of the model SST anomalies. As we have pointed out before, the SST anomalies from the ORAS4 increased faster than those from ERSST.v3 during November 1982–January 1983. Therefore, it is possible that the stronger model SST anomalies are in part due to the differences between these two analyses.
On the other hand, it also seems likely that a faster initial increase occurred in the model predictions. The model Niño-3.4 index kept growing faster than both analyses in the next few months and peaked at over 3°C in January 1983, 1°C warmer than the ERSST.v3 and 0.5°C warmer than the ORAS4. Although having a higher growth rate, the model accurately predicts the timing of the transition from increase to decrease. It also reproduces the decay of the warm event and the initiation of the cold event realistically. When initialized in April of 1964 and 1983, both sets of reforecasts are quite successful in predicting the onsets of the La Niña events following the warm events (blue curves in Figs. 12a,b).
The process of air–sea feedback is shown in Hovmöller (time–longitude) diagrams along the equatorial Pacific during these two episodes (Figs. 13 and 14). In these diagrams, the observed SST anomalies are from ERSST.v3. In the spring of 1963, observed warm SST anomalies appeared in the equatorial eastern Pacific near 120°W (shading in Fig. 13a), associated with the westerly wind anomalies along the equator over the central Pacific (150°W–date line). The Bjerknes-type air–sea feedback was responsible for the growth of the SST anomalies in the central and eastern equatorial Pacific and their westward expansion from June to August, together with enhanced westerly wind anomalies peaking at 0.02 N m−2 around the western side of the growing SST anomalies (contours in Fig. 13a). After a sustained period, the air–sea feedback was enhanced again from October to December, leading to the peak SST anomalies in 120°–150°W in December. Afterward, the westerly wind stress anomalies east of the date line decreased quickly and switched to easterly wind anomalies by March and April of 1964. This led to a quick demise of the warm SST anomalies during January–April.
The reforecast (bottom half of Fig. 13b) reproduces the gross pattern of the Bjerknes feedback, although the coupling strength seems stronger, leading to larger SST anomalies that also show a stronger tendency to propagate westward. This stronger model coupling seems associated with the gradual eastward expansion of the warm surface water beginning in August. This can be seen in the eastward tilt of the 28°C isotherm in the reforecast (green contour in Fig. 13b), whereas the observed isotherm was largely stationary near 150°W throughout the warm event (green contour in Fig. 13a). In the reforecast, the SST and westerly wind anomalies are also more persistent after the peak SST anomalies in January 1964. Interestingly, the reforecast initialized in October (Fig. 13c) also shows prolonged warm SST and westerly wind anomalies throughout the reforecast duration. This is possibly related to the fact that the model SST was generally warmer during most of the period. For instance, although the observed 28°C isotherm was located near 150°W during the winter season, surface water warmer than 28°C occupied the eastern Pacific in the reforecast during the spring of 1964 (green contour in Fig. 13c). On the other hand, the observed relationship between the cold SST and easterly wind anomalies during 1964/65 (Fig. 13a) is better simulated by the reforecast initialized in April 1964 (top half of Fig. 13b).
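One simple way to track the position of the 28°C isotherm discussed above is to scan an equatorial SST section eastward from the warm pool and interpolate to the threshold. The sketch below is only an illustration: the function name, the linear interpolation, and the sample SST values are assumptions, and the figures may have been produced differently.

```python
def warm_pool_east_edge(lons, sst, threshold=28.0):
    """Longitude (degrees east) where SST first drops below threshold going east.

    lons must increase eastward and start inside the warm pool. Returns None
    if the westernmost point is already colder than threshold, and the last
    longitude if the section never drops below it.
    """
    if sst[0] < threshold:
        return None
    for i in range(1, len(sst)):
        if sst[i] < threshold:
            # Linear interpolation between grid points i-1 and i.
            frac = (sst[i - 1] - threshold) / (sst[i - 1] - sst[i])
            return lons[i - 1] + frac * (lons[i] - lons[i - 1])
    return lons[-1]


# Hypothetical equatorial SST section (degC) from 150E to 90W.
lons = [150.0, 180.0, 210.0, 240.0, 270.0]
sst = [29.5, 29.0, 27.0, 26.0, 25.0]
edge = warm_pool_east_edge(lons, sst)
```

Applying such a function month by month to the analysis and to a reforecast would give two time series of the isotherm position that can be compared directly.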
The onset of the observed 1982/83 El Niño was different from the 1963/64 event in that warm SST anomalies first appeared in the western Pacific associated with mild westerly wind anomalies near the date line and propagated eastward in April and May (Fig. 14a). In June and July, westerly wind intensified in the west. Starting from August, the westerly wind anomalies extended gradually eastward, generating significant warming in the central and eastern equatorial Pacific. In October, the westerly wind anomalies further intensified in the western and central Pacific, peaking at 0.06 N m−2 near 150°E–170°W in November. Correspondingly, the maximum warm SST anomalies reached 3.5°C in December in the eastern Pacific near 120°W. The 1982/83 El Niño was also different from the 1963/64 event in its prolonged termination process. In fact, after the peak SST anomalies in the eastern equatorial Pacific, large westerly wind anomalies (>0.03 N m−2) persisted in the central Pacific until April 1983. Anomalous westerly winds also propagated further into the eastern Pacific during the spring of 1983 and were maintained into late summer. Associated with these persistent wind anomalies, the eastern equatorial Pacific warm SST anomalies also persisted well into the spring and early summer of 1983. Lengaigne and Vecchi (2010) have identified this prolonged termination process as a feature of extreme El Niño events (i.e., 1982/83 and 1997/98 cases) that distinguishes them from the rest of the recorded historical events since 1906. This intensive air–sea feedback was associated with a major eastward migration of the warm surface water. In fact, beginning in August 1982, the 28°C isotherm extended eastward. From December 1982 to June 1983, the warm surface water occupied the whole equatorial Pacific.
Although generating a warm event, the reforecast initialized in April 1982 seriously underestimates the magnitude of the observed El Niño event (Fig. 14b, bottom half). For instance, although the westerly wind anomalies start to grow in the western Pacific in June, the maximum wind stress anomaly only reaches 0.02–0.025 N m−2 by December 1982 and January 1983. The peak SST anomalies are also quite mild (~1°C) in the central equatorial Pacific. Partly as a result of the mild growth rate, the development of the model event did not lead to a major eastward migration of the 28°C water.
As we have pointed out before, a fraction of the ensemble members catch the observed increase of the Niño-3.4 index from July to December reasonably well (Fig. 12b). The composite of the five ensemble members with the best Niño-3.4 predictions (green curves in Fig. 15a) further demonstrates that these members better capture the evolution of this El Niño event in terms of the growth of the equatorial wind and SST anomalies and their eastward propagation from August 1982 to March 1983 in the equatorial Pacific (Fig. 15b). On the other hand, the composite of the five members with the worst predictions of the Niño-3.4 index for this period (blue curves in Fig. 15a) shows a near-normal condition in the equatorial Pacific, with weak cold SST and easterly wind anomalies developing in the eastern part of the ocean (Fig. 15c). These results imply that the observed extreme warming at the end of 1982 may be an outcome with low probability. This is consistent with the results of Takahashi and Dewitte (2016) that random westerly wind stress anomalies in summer played a key role in the development of the 1982/83 event into a strong El Niño. Lopez and Kirtman (2014) also examined the effect of westerly wind bursts (WWBs) on ENSO predictability.
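The member selection behind such composites can be sketched as below. The text does not state how the "best" and "worst" predictions were ranked, so the RMSE criterion, the function names, and the toy ensemble (six members, composites of two) are assumptions for illustration only.

```python
import math


def member_rmse(member, obs):
    """RMSE of one ensemble member's index series against the observed series."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(member, obs)) / len(obs))


def rank_members(members, obs):
    """Member indices ordered from smallest to largest RMSE."""
    return sorted(range(len(members)), key=lambda k: member_rmse(members[k], obs))


def composite(members, indices):
    """Mean over the selected members at each time step."""
    return [sum(members[k][t] for k in indices) / len(indices)
            for t in range(len(members[0]))]


# Hypothetical observed index (degC) and six members, each offset from it.
obs = [0.5, 1.0, 1.5, 2.0]
offsets = [0.0, -0.4, -0.8, -1.2, -1.6, -2.0]
members = [[o + d for o in obs] for d in offsets]

order = rank_members(members, obs)
n_pick = 2  # the composites in Fig. 15 use five members each
best = composite(members, order[:n_pick])
worst = composite(members, order[-n_pick:])
```

The same index lists can then be reused to composite any other field, such as wind stress or SST along the equator, from the selected members.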
On the other hand, the large magnitude of the warm SST anomalies for the 1982/83 warm event is much better predicted when the model is initialized in July (not shown here but can be discerned in the prediction of Niño-3.4 in Fig. 7). Furthermore, it is interesting that the reforecast initialized in October 1982 (Fig. 14c), which starts near the peak of the observed El Niño event, apparently overestimates the air–sea feedback, with the maximum westerly wind anomaly at 0.07 N m−2 in December and the maximum SST anomalies of 4.2°C in January. This very strong air–sea feedback in the forecast system seems to be associated with the migration of the warm surface water (>28°C) into the eastern equatorial Pacific and its long residence there. As a result, the October reforecasts reproduce the prolonged termination process of this event quite well, including a clear eastward propagation of the westerly wind anomalies, together with the strong peak SST anomalies in the boreal winter season and slow decay well into the subsequent summer. The reforecast initialized in April 1983 reproduces the transition from the warm to cold event quite realistically (top half of Fig. 14b).
Overall, these two examples of typical events illustrate the general patterns we have described in section 3. For instance, the predicted 1972/73 El Niño from October initialization also showed a somewhat prolonged ending, although to a lesser extent, while the prediction of the strong 1997/98 warm event greatly overshot the peak warm SST anomalies (Fig. 8). On the other hand, the prediction of the 1965/66 El Niño showed large uncertainty during its decaying phase, with a fraction of ensemble members showing a reemergence of the warming tendency.
5. Summary and discussion
We have examined a set of ensemble seasonal reforecasts for 1958–2014, using CFSv2 initialized with a global ocean reanalysis (ECMWF ORAS4), as well as the observationally based land and atmosphere reanalyses. The purpose is to provide a long and continuous dataset of seasonal reforecasts extending to the ENSO events in the 1960s and 1970s and to examine the performance of a modern climate forecast system in different phases of global climate change and multidecadal variability.
A general concern about the extended reforecasts is whether the ocean reanalysis for the 1960s and 1970s provides adequate initialization in the earlier period. Our preliminary examination shows that the ensemble mean ENSO prediction skill for 1958–78 is overall comparable to that for 1979–2014. In fact, the former outperforms the latter slightly for the ENSO development phase, possibly due to the multidecadal changes in ENSO characteristics. This demonstrates that modern forecast systems can be used to examine the seasonal prediction skill and predictability of ENSO events occurring in the 1960s and 1970s. Our results are qualitatively consistent with those of ENSEMBLES (Weisheimer et al. 2009). In particular, a visual examination of the time series of the observed DJF Niño-3.4 index and the ENSEMBLES predictions at lead times of 2–4 and 5–7 months (see Weisheimer et al. 2009, Fig. S4 in their supplemental information) does not reveal a significant change of skill between the early and later parts of the reforecasts.
On the other hand, we notice that the skill of the earlier predictions declines faster in the ENSO decay phase. This faster decline is mostly associated with the fact that, after the mature phase of major ENSO events, the reforecast Niño-3.4 anomalies are usually more persistent than the observed ones for the ENSO events in 1958–78. Although this seems to be a general model flaw, it affects the model skill more seriously in the earlier period when the observed ENSO events terminated early. For the later period, we also find that reforecasts initialized after summer tend to produce stronger SST anomalies, thus overestimating the magnitude of the major ENSO events.
As examples of the characteristics of the earlier and later reforecasts, we further analyze the reforecasts of the 1963/64 and 1982/83 El Niño and the subsequent La Niña events. Initialized in the spring season, the forecast system predicted the onset of both events, although the model underestimates the peak magnitudes of the SST and zonal wind stress anomalies of the 1982/83 El Niño. The larger spread of the ensemble members in the latter case also suggests that the 1982/83 El Niño is less predictable than the 1963/64 event. Initialized in October 1963, however, the model predicts a prolonged lingering of the wind and SST anomalies in the eastern equatorial Pacific, although the observed equatorial SST and wind stress anomalies decayed quickly in the early spring of 1964. On the other hand, the reforecast initialized in October 1982 significantly overestimates the peak magnitude of the wind and SST anomalies in the 1982/83 El Niño.
Although behaving differently, both prolonged decay and overshooting suggest an overly strong model air–sea feedback in the mature and decay phases of El Niño events. We argue that this is a consequence of the warmer SST in the central and eastern equatorial Pacific Ocean in CFSv2 during boreal winter and spring seasons. It has been known observationally that, during the two strongest El Niño events (1982/83 and 1997/98), large westerly wind and warm SST anomalies persisted in the eastern equatorial Pacific until the early summer of the next year although the local thermocline had been shoaling since late winter (Harrison and Vecchi 2001; Zelle et al. 2004). This was because the warm surface waters (>28°C) that accumulated in the eastern equatorial Pacific generated an equatorially centered intertropical convergence zone and substantially reduced the easterly winds in these strong El Niño events (e.g., Fig. 14a) (see also Lengaigne and Vecchi 2010), whereas the warm pool water remained in the western Pacific Ocean during more moderate events. Although such strong air–sea feedback over the eastern Pacific was only observed during strong El Niño events (i.e., 1982/83 and 1997/98), it seems to be more easily triggered in the coupled predictions (Figs. 13c and 14c) as a result of the coupled model warm SST bias in boreal winter and spring. Therefore, we argue that the systematic model bias is the main factor that affects the prediction skill in the ENSO mature and decay phases. This effect should be most significant in the warm events. The main task for improvement is to reduce the model systematic bias in the tropical Pacific, not only the equatorial cold bias that is strongest in boreal summer and fall but also the warm bias in the tropics in the winter and spring seasons. Previously, we have conducted sensitivity experiments with empirical flux corrections to reduce model bias (e.g., Manganello and Huang 2009; Pan et al. 2011).
Further sensitivity experiments with flux correction are also being conducted with CFSv2 in the framework of prediction. More solid progress, however, depends on a better understanding of the physical processes controlling the annual cycle of the tropical Pacific Ocean.
In summary, we have demonstrated the feasibility of extending seasonal reforecasts to the 1960s and 1970s with current climate prediction systems. This enables us to conduct a closer examination of the reforecasts of historical ENSO events, which provides new insight into the potential flaws of current forecast systems and may help to find ways to improve them. Equally useful is the examination of the unsuccessful ENSO predictions, either the false alarms or missed events, to identify the origin of the errors within the model and initialization. A more rigorous test of the forecast system on a larger set of cases and a wide range of background states will be a useful approach for model forecast improvement.
The GMU/COLA scientists are supported by grants from NSF (AGS-1338427), NOAA (NA14OAR4310160), and NASA (NNX14AM19G) and a grant from the Indian Institute of Tropical Meteorology and the Ministry of Earth Sciences, Government of India (MM/SERP/COLA-GMU_USA/2013/INT-2/002). We acknowledge the Extreme Science and Engineering Discovery Environment (XSEDE) for providing the computational resources for the reforecast project. We also thank ECMWF for providing the ORAS4 and NASA’s Earth Science Division for providing the GLDAS-2.0 data archived and distributed by the Goddard Earth Sciences (GES) Data and Information Services Center (DISC). Finally, we thank the editor and the anonymous reviewers for their constructive comments and suggestions.