Enhanced ENSO Prediction via Augmentation of Multimodel Ensembles with Initial Thermocline Perturbations

Terence J. O’Kane, Dougal T. Squire, Paul A. Sandery, Vassili Kitsios, Richard J. Matear, Thomas S. Moore, James S. Risbey, and Ian G. Watterson

CSIRO Oceans and Atmosphere, Hobart, Tasmania, Australia

Abstract

Recent studies have shown that regardless of model configuration, skill in predicting El Niño–Southern Oscillation (ENSO), in terms of target month and forecast lead time, remains largely dependent on the temporal characteristics of the boreal spring predictability barrier. Continuing the 2019 study by O’Kane et al., we compare multiyear ensemble ENSO forecasts from the Climate Analysis Forecast Ensemble (CAFE) to ensemble forecasts from state-of-the-art dynamical coupled models in the North American Multimodel Ensemble (NMME) project. The CAFE initial perturbations are targeted such that they are specific to tropical Pacific thermocline variability. With respect to individual NMME forecasts and multimodel ensemble averages, the CAFE forecasts reveal improvements in skill when predicting ENSO at lead times greater than 6 months, in particular when predictability is most strongly limited by the boreal spring barrier. Initial forecast perturbations generated exclusively as disturbances in the equatorial Pacific thermocline are shown to improve the forecast skill at longer lead times in terms of anomaly correlation and the random walk sign test. Our results indicate that augmenting current initialization methods with initial perturbations targeting instabilities specific to the tropical Pacific thermocline may improve long-range ENSO prediction.

© 2020 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Terence J. O’Kane, terence.okane@csiro.au


1. Introduction

Despite having advanced data assimilation systems for initialization, fully coupled ocean–atmosphere–sea ice dynamical models solving the physical equations of the climate system are generally known to be only slightly more skillful than statistical models in forecasting El Niño–Southern Oscillation (ENSO) phase and intensity (Balmaseda and Anderson 2009). In particular, general circulation models (GCMs) have problems in predicting boreal winter tropical Pacific sea surface temperature (SST) for forecasts starting in boreal spring (February–May); that is, they exhibit the so-called boreal spring predictability barrier (Flügel and Chang 1998; Jin et al. 2008; Philander et al. 1984). During the boreal spring, the intertropical convergence zone (ITCZ) is typically situated close to the equator with the climatological (seasonal) SST at a maximum (Lai et al. 2018). The combined effect is that ENSO SST anomalies (SSTA) and associated sea level pressure anomalies, measured by the Southern Oscillation index (SOI), are weakest during boreal spring, thereby reducing the signal-to-noise ratio and making forecasts more sensitive to random variability. During boreal spring, ENSO events are typically in a decaying phase with relatively weak zonal SST gradients, and thus small perturbations in SST can be amplified over a substantial region of the equatorial Pacific, making associated SSTA difficult to detect, let alone to forecast accurately (Jin et al. 2008).

Since 2002, routine ENSO forecasts, largely in terms of indices associated with the variability of SST anomalies in the equatorial Pacific, have been collated by the International Research Institute for Climate and Society (IRI) and published on their web page (http://iri.columbia.edu/climate/ENSO/currentinfo/SST_table.html). Both Barnston et al. (2012) and Tippett et al. (2012) reviewed the performance of the constituent models comprising the IRI dataset over the 2002–11 period. They found the highest skill for those forecasts initiated after the boreal spring predictability barrier for target seasons prior to the subsequent boreal spring; that is, forecasts were found to verify systematically better against observations at lead times earlier than the intended forecast targets. The IRI dataset has more recently been augmented by hindcasts (1982–2010) and real-time (2011–15) predictions generated as part of the North American Multimodel Ensemble (NMME) project (Kirtman et al. 2014). Barnston et al. (2017) assessed skill in the NMME in terms of the mean square error skill score and anomaly correlation. In a companion study, Tippett et al. (2019) assessed probabilistic forecasts of ENSO phase and amplitude in current NMME prediction systems, finding that regardless of model, forecast format, and skill metric, the boreal spring predictability barrier still explains much of the dependence of skill on target month and forecast lead.

O’Kane et al. (2019) described the development of variants of strongly coupled data assimilation (DA) systems based on ensemble optimal interpolation (EnOI) and ensemble transform Kalman filter (ETKF) methods. The assimilation system was first tested on a small paradigm model of the coupled tropical–extratropical climate system, then implemented for a coupled GCM. They assessed the impact of assimilating ocean observations on the atmospheric state analysis update via the cross-domain error covariances from the coupled-model background ensemble. Using the CSIRO Climate Analysis Forecast Ensemble (CAFE), they also conducted multiyear ENSO prediction experiments with a particular focus on the atmospheric response to tropical ocean perturbations examining the relationship between ensemble spread, analysis increments, and forecast skill over 2-yr lead times.

Specifically, they employed initial forecast perturbations generated from bred vectors (BVs) (Toth and Kalnay 1997) projecting onto disturbances at and below the thermocline with similar structures. They found that the error growth of these dynamical vectors leads ENSO SST phasing by 6 months. Once expressed at the surface, the dominant mechanism communicating tropical ocean variability to the extratropical atmosphere was found to be via tropical convection modulating the Hadley circulation. They concluded that BVs specific to tropical Pacific thermocline variability were the most effective choices for ensemble initialization and ENSO forecasting. They further assessed forecast skill by comparison of receiver operating characteristic (ROC) curves calculated from a large hindcast dataset (a total of 3696 forecast years), finding that the reduced spread observed at long lead times (out to 465 days), for forecasts initiated from perturbations restricted to the equatorial thermocline, was an indicator of increased skill.

In this paper, we extend the study of O’Kane et al. (2019) examining the utility of the CAFE forecasts in direct comparison to state-of-the-art climate forecast systems comprising the NMME. In the CAFE forecasts, initial perturbations specific to the thermocline of the equatorial oceans were applied to a common analyzed atmosphere–ocean state, so it is reasonable to assume that ensemble spread in ENSO forecasts is largely due to these disturbances. The NMME models are initialized with a variety of methods, but all include global perturbations to both ocean and atmospheric initial states. Thus, comparison of the CAFE and NMME models sheds light on the efficacy of targeting initial conditions to regions where the dynamics relevant to the appropriate spatiotemporal variability resides—in this case ENSO variability on seasonal time scales.

In section 2, we describe the NMME and CAFE data used. Section 3 briefly describes the CAFE configuration used to generate the initial forecast perturbations. Section 4 compares the skill of the CAFE forecasts to the NMME in terms of phase (section 4a) and using the random walk sign test (section 4b). Discussion and conclusions are in section 5.

2. Data

Throughout, we characterize ENSO state (amplitude, phase, and duration) by the Niño-4 index, that is, SST averaged over the equatorial Pacific region: 5°S–5°N, 160°E–150°W (Barnston et al. 1997). Monthly averages of the observed Niño-4 index for the period January 1982–August 2016 are computed based on HadISST data (Rayner et al. 2003). Forecast monthly averages of the Niño-4 index come from CAFE and the NMME.
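For illustration, the Niño-4 index can be computed from a gridded monthly SST product along the following lines. This is a minimal sketch, assuming an xarray DataArray with `lat` and `lon` coordinates and longitudes in [−180°, 180°] (so the Niño-4 box wraps the dateline); it is not the exact HadISST processing pipeline used here.

```python
import numpy as np
import xarray as xr

def nino4_index(sst: xr.DataArray) -> xr.DataArray:
    """Area-weighted SST average over the Nino-4 box (5S-5N, 160E-150W).

    Assumes monthly SST with 'lat'/'lon' coordinates and longitudes in
    [-180, 180]; the box therefore wraps across the dateline.
    """
    box = sst.where(
        (sst.lat >= -5) & (sst.lat <= 5)
        & ((sst.lon >= 160) | (sst.lon <= -150)),
        drop=True,
    )
    # Weight by cos(latitude) to account for converging meridians.
    weights = np.cos(np.deg2rad(box.lat))
    return box.weighted(weights).mean(dim=("lat", "lon"))
```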

The NMME consists of integrations with start dates from a hindcast period (1982–2010) and a real-time period (2011–15), although here we do not discriminate between hindcasts and forecasts, referring to both as forecasts but cognizant of the fact that forecast skill is typically lower than that of hindcasts. The NMME data (Kirtman et al. 2014) have been used in several recent studies of ENSO predictability (DelSole and Tippett 2014, 2016; Tippett et al. 2019; Barnston et al. 2017) and are available from the IRI Data Library (http://iridl.ldeo.columbia.edu/SOURCES/.Models/.NMME). Specifically, the NMME is based on real-time intraseasonal to seasonal to interannual prediction systems [see Table 1 of Kirtman et al. (2014) for the references describing each operational system]. However, apart from setting a minimum lead time (9 months) and ensemble size (11 members), model configurations (i.e., resolution, version, physical parameterizations, initialization strategies, and ensemble generation strategies) are left open to the forecast providers, who use a wide variety of data assimilation and ensemble perturbation strategies. Monthly mean data are provided for the NMME on global grids of SST, 2-m temperature (T2m), and precipitation.

Given the limited mutual span of these datasets (February 2002 to December 2015), the most feasible approach for calculating anomalies that includes a bias correction is to compute model anomalies over the common period relative to the (lead-time-dependent) ensemble-mean model climatology over the same period, using cross-validation.

To facilitate a reasonable comparison to the CAFE forecasts, we only consider the subset of NMME models for which forecast lead times extending to 12 months are provided. These models (see Table 1) are the Canadian models (CanCM3 and CanCM4), the GFDL models (FLORA, FLORB, and AER04), and the Center for Ocean–Land–Atmosphere Studies (COLA) model. The initialization method, forecast length, and number of ensemble members vary by model; however, all models are initialized near the start of each month. In the results that follow we label the monthly averages of up to 12-month integrations as having lead times of 0, 1, … , 11 months, so that the 0-month lead of a forecast with nominal start date in January is the January average, and so on.
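Because the 0-based lead convention is an easy source of off-by-one errors when aligning forecasts with verifying observations, the following hypothetical helper (illustrative only, not part of any of the forecast systems) encodes the convention just described.

```python
def target_month(start_month: int, lead: int) -> int:
    """Calendar month (1-12) verified by a forecast with the given start
    month under the 0-based lead convention: a forecast with nominal
    start date in January verifies against the January mean at 0-month lead.
    """
    return (start_month - 1 + lead) % 12 + 1

assert target_month(1, 0) == 1   # January start, 0-month lead -> January
assert target_month(11, 3) == 2  # November start, 3-month lead -> February
```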

Table 1. Attributes of the CAFE and NMME models. The common period considered is 2002–15.

We apply our analysis methods to the SST for each model and the observations computed relative to their respective climatologies calculated over the base period of the data being considered (i.e., 2002–15). All model climatologies are computed as a function of month and lead time, and leave-one-year-out cross-validation is used when computing anomalies. Specifically, we consider I member ensemble forecasts over a total of Y years, from each of the respective CAFE and NMME models, initialized each month (m ∈ January, February, … , December) over a given period of years (y ∈ 2002, 2003, …, 2015) such that N = I × Y is the total number of forecasts initialized from a given month m and lead time τ. The SST Niño-4 bias for a given month m at a given lead time (τ ∈ 1, 2, … , 12 months) is estimated as
$$\mathrm{NINO4_{bias}}(\tau, m) = \frac{1}{I \times Y} \sum_{i=1}^{I} \sum_{y=1}^{Y} \left[ \mathrm{NINO4}_{y}^{\,i}(\tau, m) - \mathrm{NINO4}_{y}^{\mathrm{HadISST}}(\tau, m) \right].$$
This is a common approach as discussed by Saha et al. (2006) and Kirtman and Min (2009). Similarly, observed anomalies are computed relative to the observed climatology over the same period using cross-validation. This approach makes for consistent comparison between the different models relative to observations.
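A minimal sketch of these two steps, the lead- and month-dependent bias of the equation above and leave-one-year-out anomalies, is given below; array names and shapes are illustrative.

```python
import numpy as np

def nino4_bias(fcst: np.ndarray, obs: np.ndarray) -> float:
    """Nino-4 bias for one start month m and one lead tau, per the
    equation above. fcst: (Y, I) forecast values for Y years and I
    ensemble members; obs: (Y,) verifying HadISST values.
    """
    return float(np.mean(fcst - obs[:, None]))

def loo_anomalies(fcst: np.ndarray) -> np.ndarray:
    """Leave-one-year-out anomalies: each year's forecasts are referenced
    to the ensemble-mean climatology built from all *other* years.
    """
    anom = np.empty_like(fcst)
    for y in range(fcst.shape[0]):
        clim = np.delete(fcst, y, axis=0).mean()  # climatology excluding year y
        anom[y] = fcst[y] - clim
    return anom
```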

3. CAFE data assimilation and initial forecast perturbations

Ensemble forecasts from CAFE were generated with lead times up to 24 months spanning the period 2002–15. The CAFE system, including the data assimilation, ensemble initialization, and ENSO skill in terms of ROC curves, as well as the growth of errors and skill with respect to observations, has been described in detail by O’Kane et al. (2019). Here, we extend that study to examine the value of targeting thermocline disturbances as initial conditions for long-range ENSO prediction relative to state-of-the-art forecast systems. In the interests of clarity and as background, we now give a brief overview of ensemble initialization in CAFE, describing only the details pertinent to the current discussion, and refer the reader to O’Kane et al. (2019) for additional information. Note that the CAFE forecast data used here correspond to the F1 forecast data from O’Kane et al. (2019).

a. Data assimilation

The CAFE forecasts are initialized as perturbations about a single analyzed state estimated using ensemble optimal interpolation (EnOI) (Evensen 2003). In this study the analyzed states are from EnOI, where a time-invariant covariance matrix, $\mathbf{P}^f = \mathbf{Z}^f (\mathbf{Z}^f)^{\mathrm{T}}$, is formed from a predetermined ensemble of $k$ forecast anomalies,
$$\mathbf{Z}^f = \frac{1}{\sqrt{k-1}} \left[ \mathbf{z}_1^f, \mathbf{z}_2^f, \ldots, \mathbf{z}_k^f \right],$$
with $\mathbf{z}_i^f$ being $n$-dimensional in model space, where $i = 1, 2, \ldots, k$ runs over the entire ensemble. In CAFE, the $\mathbf{Z}^f$ are formed using the tendency of the free-running model calculated over the same interval as the assimilation window. This approach was found to give reduced root-mean-square errors relative to constructing $\mathbf{P}^f$ from anomalies with respect to the climatological seasonal cycle calculated from a long control simulation of the forecast model. The CAFE EnOI assimilation includes atmospheric increments due to cross covariances between the ocean and atmosphere, scaled to be no larger than the tendency of the free-running atmospheric model over a time interval determined by the assimilation window (i.e., 28 days). CAFE assimilates a comprehensive range of surface and subsurface ocean observations as detailed in Table 2 of O’Kane et al. (2019), broadly consisting of remotely sensed infrared and microwave satellite observations, namely SST, sea surface salinity (SSS), and sea surface height anomalies (SSHA), together with in situ temperature (T) and salinity (S) data. SSHA is derived from the Radar Altimeter Database System (RADS) altimetry (http://rads.tudelft.nl/rads/rads.shtml), and the in situ T and S observations come from Argo, expendable bathythermograph (XBT), and conductivity–temperature–depth (CTD) data, as well as the TAO/TRITON (Pacific), PIRATA (Atlantic), and RAMA (Indian) ocean moorings from the Global Tropical Moored Buoy Array and from the World Meteorological Organization Global Telecommunication System (WMO GTS) (see http://www.wmo.int/pages/prog/www/TEM/GTS/index_en.html).
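To make the update concrete, the sketch below implements a generic EnOI analysis step in the textbook form $x^a = x^f + \mathbf{P}^f \mathbf{H}^{\mathrm{T}} (\mathbf{H}\mathbf{P}^f\mathbf{H}^{\mathrm{T}} + \mathbf{R})^{-1} (y - \mathbf{H}x^f)$, carrying $\mathbf{P}^f$ implicitly through the anomaly matrix $\mathbf{Z}^f$. It is a dense-matrix illustration under simplifying assumptions (linear observation operator, no localization, no cross-domain scaling), not the CAFE implementation.

```python
import numpy as np

def enoi_update(xf, Z, H, y, R):
    """Generic EnOI analysis step: xa = xf + K (y - H xf), with
    K = Pf H^T (H Pf H^T + R)^{-1} and Pf = Z Z^T held fixed in time.

    xf: (n,) background state; Z: (n, k) ensemble anomalies;
    H: (p, n) observation operator; y: (p,) observations;
    R: (p, p) observation-error covariance.
    """
    HZ = H @ Z                  # anomalies mapped to observation space (p, k)
    S = HZ @ HZ.T + R           # innovation covariance (p, p)
    innov = y - H @ xf          # innovation vector (p,)
    w = np.linalg.solve(S, innov)
    # K innov = Z (HZ)^T S^{-1} innov, applied without ever forming Pf.
    return xf + Z @ (HZ.T @ w)
```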

b. Initial forecast perturbations

Following O’Kane et al. (2011), initial conditions for the ensemble forecasts are generated as BVs, that is, finite perturbations generated (or bred) by evolving the perturbed system ω′(t) = ω(t) + Δω(t), where ω(t) is the control state under the full nonlinear governing equations and Δω is the perturbation of the model state from the unperturbed control. The perturbations themselves are rescaled to a given size ε periodically at a time interval ΔT as follows. The difference between the control and perturbed trajectories, Δω(t + Δt) = ω′(t + Δt) − ω(t + Δt), is computed at times Δt = nΔT for n ∈ 1, …, N, whereupon the perturbation is rescaled and the perturbed system redefined as
$$\omega'(t + \Delta t) = \omega(t + \Delta t) + \varepsilon \, \frac{\Delta\omega(t + \Delta t)}{\|\Delta\omega(t + \Delta t)\|}.$$
The perturbations evolve freely until the next rescaling is scheduled at time (n + 1)ΔT. The BV corresponds to the (finite) perturbation Δω(t) constructed at time t via a straightforward rescaling of the forecast perturbation by a uniform factor, $\mathbf{z}_i^a = c\,\mathbf{z}_i^f$. The local growth rate of the BVs is given by
$$g(t) = \frac{1}{\Delta T} \log\!\left[ \frac{\|\Delta\omega(t + \Delta t)\|}{\|\Delta\omega(t)\|} \right],$$
where ΔT is the rescaling interval and Δω(t) is the BV at time t. More generally, we may define the relative amplification factor in terms of the vector of gridpoint values of the BV of any climate variable field, ω(Δt, t), initiated at time t and evolved to time Δt + t. Taking the L2 norm ‖ω(Δt, t)‖ as the root mean square of this vector, we define the amplification factor $A(\Delta t, t) = \|\omega(\Delta t, t)\| / \|\omega(0, t)\|$ and the local total growth rate $\tilde{g}(t) = (1/\Delta t) \log A(\Delta t, t)$.
As described in O’Kane et al. (2019), the BVs were rescaled to the standard deviation for the ocean temperature in the upper 500 m calculated from anomalies with respect to a seasonal climatology based on 500 years of a control simulation. The control simulation consists of running the coupled model to steady state with repeat year radiative forcing and constructing the climatology from the last 500 years of output. We then considered the fraction of the total variance at each spatial location partitioned into various temporal bands (in-band variance) calculated using Welch’s method (Cooley et al. 1969) applied to the temperature time series at each grid point in the ocean. This variance, with the loci of the largest-amplitude perturbations on 1–2-month time scales, was thresholded at values greater than 0.5°C such that the only values retained are within an isosurface located around the tropical thermocline, with no expression at the surface. These values were used to determine rescaling amplitudes (here, 1% of the standard deviation within the isosurface) in order to generate the BV ensemble forecast perturbations. Specifically, we rescale the initial BV amplitudes S based on the L2-norm for temperature at each level within the isosurface as
$$S_i = \sqrt{ \frac{ \iint \left| T_i^{\,p} - T_i^{\,c} \right|^2 \, dx \, dy }{ \sigma_i^2 } },$$
where the subscript i refers to the model level, $T_i^{\,p}$ and $T_i^{\,c}$ are the perturbed and control temperatures at a given level within the isosurface, and $\sigma_i$ is the standard deviation of temperature at level i calculated from the control simulation as described above. This norm is applied to all ocean prognostic state variables and is a simplification of the multivariate approach of Cai et al. (2003).
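A minimal breeding-cycle sketch under stated assumptions follows: `step_model` is a hypothetical function advancing the full nonlinear model by one rescaling interval, and a simple global L2 rescaling stands in for the level-wise isosurface norm above.

```python
import numpy as np

def breed(control0, eps, n_cycles, step_model, seed=0):
    """Minimal bred-vector cycle. `step_model(state)` advances the model
    by one rescaling interval Delta-T (hypothetical here). Returns the
    final bred vector and per-interval growth rates (divide by Delta-T
    in physical time units to recover g(t) of the equation above).
    """
    rng = np.random.default_rng(seed)
    control = np.asarray(control0, dtype=float).copy()
    dw = eps * rng.standard_normal(control.shape)   # seed perturbation
    growth = []
    for _ in range(n_cycles):
        pert_next = step_model(control + dw)        # perturbed trajectory
        ctrl_next = step_model(control)             # control trajectory
        dw_next = pert_next - ctrl_next             # evolved difference
        growth.append(np.log(np.linalg.norm(dw_next) / np.linalg.norm(dw)))
        dw = eps * dw_next / np.linalg.norm(dw_next)  # rescale to size eps
        control = ctrl_next
    return dw, growth
```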

The BVs are generated each month as the renormalized differences between the unperturbed (control) and the perturbed forecasts at 1-month lead time. The new ensemble of 10 BVs is then added to the analyzed ocean state. The 11-member ensemble forecasts are initialized each month from the 10 BVs and the analyzed (control) states. The atmosphere is initialized to a common state that has been constrained via cross-covariances with the ocean observations. As stated earlier, the rescaling amplitudes are based on the 1–2-month in-band variance isosurface specific to the equatorial thermocline. BVs within the isosurface are the only perturbations added to the analysis to generate the ensemble, and hence all disturbances originate from the tropical thermocline.
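The in-band variance fraction used to construct the isosurface can be estimated along the following lines; this is a sketch assuming a temperature time series at a single grid point, here taken as daily output so that the 1–2-month band corresponds to 30–60-day periods, and using Welch's method as implemented in SciPy.

```python
import numpy as np
from scipy.signal import welch

def inband_variance_fraction(ts, fs=1.0, period_band=(30.0, 60.0)):
    """Fraction of a series' variance within a period band, from the
    Welch power spectral density. `fs` is the sampling frequency
    (1/day for assumed daily data); the default band covers 1-2-month
    (30-60-day) periods.
    """
    f, pxx = welch(ts, fs=fs, nperseg=min(len(ts), 1024))
    total = np.trapz(pxx, f)                  # total variance (PSD integral)
    lo, hi = 1.0 / period_band[1], 1.0 / period_band[0]
    mask = (f >= lo) & (f <= hi)
    return np.trapz(pxx[mask], f[mask]) / total
```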

To briefly illustrate how BV perturbations initially at the thermocline evolve and ultimately contribute to the improved ocean surface forecast, we calculate the ensemble-average monthly mean BV perturbations, that is, ensemble averages of 10 monthly mean ocean temperature forecasts differenced with respect to the control (unperturbed) forecast between ±15° latitude over the equatorial Pacific. In Fig. 1, we show the evolution of an initial disturbance located at about 95 m and track the growth of that disturbance through the water column (depths of 95, 85, 55, 25, and 5 m) over lead times of 1, 3, 6, 7, 9, and 10 months from an initial start date of March 2015. Comparison with the surface disturbances shown in Fig. 1 reveals no coherent response until lead-time month 7, when the bred vectors express at the surface. Thus, a well-chosen initial disturbance can add information specific to the thermocline that expresses 6–7 months later, potentially adding important information about growing subsurface instabilities to initial conditions for forecasts initiated during the boreal spring, here demonstrated during the leadup to the 2016 El Niño.

Fig. 1. The growth and evolution of surface (SST) and subsurface monthly mean ensemble-averaged (10 members) temperature bred vectors initiated in March 2015 over a 10-month period. For each case, the range within the domain (15°S–15°N, 100°E–60°W) is given. Note the smaller scale for the 1-month-lead cases and the decreasing depth with lead time for the subsurface cases.

The CAFE forecasts were developed to better understand the role of the Pacific thermocline in ENSO predictability and the extratropical atmospheric response to equatorial ocean disturbances. As we are generally interested in ENSO prediction at lead times from seasonal to interannual, we assume that predictability resides in the ocean and that atmospheric initial conditions are subdominant. O’Kane et al. (2019, their Fig. 8) showed that the maximum lag correlation between the Multivariate ENSO Index (MEI) (Wolter and Timlin 2011) and the growth rate of subsurface temperature disturbances in the equatorial Pacific occurred at approximately 150-m depth with a 6-month lag. Thus, thermocline disturbances lead SST by 6 months, indicating a dominant role for the subsurface dynamics at lead times beyond a season. Forecast ensemble plumes of the raw CAFE multiyear ENSO forecasts initialized from January 2007 (O’Kane et al. 2019, their Fig. 14) showed that the member forecasts overshot the observed 2008 La Niña as a result of the model equatorial Pacific cold tongue bias reestablishing coincident with the onset of the spring predictability barrier. However, relative to other types of initial perturbations, CAFE forecasts initialized with perturbations specific to the growing disturbances local to the tropical Pacific thermocline evolved more coherently, with reduced error growth, reduced ensemble spread, and improved ENSO predictability. More generally, we propose that an ensemble of states differing only by perturbations specifically tuned to tropical coupled instabilities, and hence relevant to ENSO, would be effective as forecast initial perturbations, particularly where ensemble sizes are limited. Similar approaches have been considered by Yang et al. (2006) and Frederiksen et al. (2010).

4. Comparison to NMME

In the subsequent calculations we examine the forecast skill of ENSO, in terms of the Niño-4 index, at given lead times from 0 to 12 months. As a reference, in Fig. 2 we show the ensemble-average forecast Niño-4 index initialized each month for the CAFE and respective NMME models at lead times of 0, 3, 6, and 11 months. Apart from the systematic increase in spread as lead time increases, perhaps the most noticeable point of interest is that each model's prediction of both the maximum and phase of the 2009/10 El Niño and the minimum and phase of the subsequent 2010/11 La Niña is quite good even at lead times of 6 months. O’Kane et al. (2019, their Fig. 15) showed receiver operating characteristic (ROC) curves calculated for the Niño-4 index comparing 11-member BV ensemble forecasts started each month at lead times out to 2 years over the period 2003 through June 2017. The ROC curve (i.e., both hit rate and false alarm rate) is calculated for prediction of the occurrence (i.e., yes or no) of events where the Niño-4 anomaly exceeds 1°C. These results provide a statistical assessment of the accuracy of the CAFE forecasts with respect to the observed Niño-4 index, with the CAFE forecasts better than random at lead times up to 465 days.
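Hit and false-alarm rates of the kind entering those ROC curves can be computed from ensemble forecasts of the binary event as sketched below; the array names and the sweep over probability thresholds are illustrative, not the verification code behind the published curves.

```python
import numpy as np

def roc_points(ens_fcst, obs, threshold=1.0):
    """ROC points for the event `anomaly > threshold`, sweeping decision
    levels over the forecast ensemble fraction.

    ens_fcst: (N, I) ensemble Nino-4 anomalies; obs: (N,) verifying values.
    Returns (false-alarm rates, hit rates).
    """
    event = obs > threshold                     # observed occurrences
    prob = (ens_fcst > threshold).mean(axis=1)  # forecast probability
    hits, fars = [], []
    for p in np.linspace(0.0, 1.0, 11):         # decision thresholds
        warn = prob >= p
        hits.append((warn & event).sum() / max(event.sum(), 1))
        fars.append((warn & ~event).sum() / max((~event).sum(), 1))
    return np.array(fars), np.array(hits)
```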

Fig. 2. Niño-4 indices of ensemble-averaged forecasts initialized each month for CAFE (blue) and NMME (magenta) at leads of 0, 3, 6, and 11 months as compared to the observed Niño-4 index calculated from HadISST (black).

a. Anomaly correlation coefficient

Here we use the anomaly correlation coefficient (ACC) to verify ENSO indices based on spatially averaged SST fields (Pearson 1895; Jolliffe and Stephenson 2011). Specifically, we apply ACC to SST in the Niño-4 region considering the correlation of anomalies of forecasts with verifying reference values from the HadISST dataset.

1) Metric definition

The ACC is defined as
$$\mathrm{ACC} = \frac{ \sum_{i=1}^{N} w_i (f_i - \bar{f})(o_i - \bar{o}) }{ \sqrt{ \sum_{i=1}^{N} w_i (f_i - \bar{f})^2 } \, \sqrt{ \sum_{i=1}^{N} w_i (o_i - \bar{o})^2 } }, \qquad -1 \le \mathrm{ACC} \le 1,$$
where N is the sample size and $\bar{f}$ and $\bar{o}$ are the weighted means of $f_i$ and $o_i$, respectively. More generally, $w_i$ is the weighting coefficient (here equal to 1) such that
$$f_i = F_i - C_i, \qquad \bar{f} = \left( \sum_{i=1}^{N} w_i f_i \right) \Big/ \sum_{i=1}^{N} w_i,$$
$$o_i = O_i - C_i, \qquad \bar{o} = \left( \sum_{i=1}^{N} w_i o_i \right) \Big/ \sum_{i=1}^{N} w_i,$$
where $F_i$, $O_i$, and $C_i = C$ represent the individual forecast, the verifying observation, and the reference (climatological) value, respectively. When the variation pattern of the forecast anomalies coincides exactly with that of the verifying data, the ACC equals 1; when the pattern is completely reversed, it equals −1. The ACC measures the correspondence, or phase difference, between forecast and observations, subtracting out the climatological mean at each point $C_i$; sample mean values are subtracted for the centered ACC. The anomaly correlation is frequently used to verify output from numerical weather prediction (NWP) models. Importantly, the ACC is not sensitive to forecast bias, so a good anomaly correlation does not guarantee accurate forecasts; hence the common practice of employing a large hindcast dataset to debias forecast data. Here, we focus on skill in predicting ENSO phase verified by the ACC, recognizing that verification will be subject to the usual limitations of finite sample sizes, even over the period studied here, due to the relatively few actual El Niño and La Niña events.
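A direct transcription of the centered ACC above (with unit weights, as used in the text) might look like the following sketch.

```python
import numpy as np

def acc(f, o, w=None):
    """Centered anomaly correlation coefficient per the equation above.
    f, o: forecast and observed anomalies (same shape); w: optional
    weights, taken as 1 when omitted, as in the text.
    """
    f, o = np.asarray(f, float), np.asarray(o, float)
    w = np.ones_like(f) if w is None else np.asarray(w, float)
    fa = f - np.average(f, weights=w)
    oa = o - np.average(o, weights=w)
    num = np.sum(w * fa * oa)
    return num / np.sqrt(np.sum(w * fa**2) * np.sum(w * oa**2))
```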

Statistical significance levels are calculated to test the significance of positive correlations and differences. Here we use the fraction of negative values in the bootstrap distribution as a p value and compare this to a chosen significance level, here the 95th percentile or the 5% level. Nonparametric bootstrap distributions are constructed as a function of lead time and initial month from randomly sampled (without replacement) initial years and ensemble members used to build the mean. Here we use 100 bootstrap resamplings. A similar approach is described in detail in Goddard et al. (2013).
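A simplified version of this bootstrap test is sketched below, reusing the `acc` helper above; the resampling protocol of Goddard et al. (2013) is more involved, and the shapes and sampling choices here are illustrative only.

```python
import numpy as np

def bootstrap_p_value(f, o, n_boot=100, seed=0):
    """Bootstrap test that the ensemble-mean ACC is positive: resample
    initial years without replacement, rebuild the ACC, and report the
    fraction of negative values as the p value (compared against the
    5% level in the text).

    f: (Y, I) forecast anomalies; o: (Y,) observed anomalies.
    """
    rng = np.random.default_rng(seed)
    Y = len(o)
    samples = []
    for _ in range(n_boot):
        idx = rng.permutation(Y)[: Y - 1]  # drop one year, shuffle the rest
        samples.append(acc(f[idx].mean(axis=1), o[idx]))
    return float(np.mean(np.array(samples) < 0.0))
```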

2) Results

We now examine predictability specifically in terms of ENSO phase, as determined by the ACC of CAFE relative to the NMME forecasts, applying all caveats regarding model biases. We stress that the CAFE ensemble forecasts focus nearly exclusively on the predictability arising from subsurface equatorial ocean dynamics and that only the large scales of the atmosphere have been weakly constrained to the current climate through cross-domain correlations with ocean observations. As the CAFE forecasts were not initialized with the observed synoptic features present at the time of the forecast, one does not expect them to be more skillful (relative to NMME) over the first few forecast months, when knowledge of the particular state of the synoptic and faster-time-scale atmospheric processes determines a significant fraction of the potential predictability. Rather, as shown by O’Kane et al. (2019), it is when the predictability associated with the thermocline expresses at the surface (at lead times of 6 months and longer) that a signal (i.e., enhanced predictability) emerges.

We first calculate the anomaly correlations of the CAFE and NMME forecasts in relation to the observed Niño-4 index (Figs. 3a,d,f,h,j,l,n). Here the black dots indicate significant correlations at the 95th-percentile level. The black lines indicate ACC values for forecasts initiated in March (1-month lead) through December (10-month lead). The boreal spring predictability barrier is clearly evident in the HadISST lagged autocorrelation (Fig. 3b). This period of reduced potential predictability is in contrast to the boreal autumn, where significant positive values (r > 0.9) in the autocorrelation extend out to 12-month lead (positive lag), indicating that the potential predictability is at a maximum. Unsurprisingly, all NMME models exhibit the lowest ACC values at lead times greater than 6 months for those forecasts with target months between June and December, corresponding to the months and lags where the HadISST autocorrelation is low. The CAFE ACC indicates the best skill for forecasts initialized during the boreal spring, summer, and autumn and, in particular, beyond 12 months (not shown) for forecasts initialized in March–April.

Fig. 3. ACC (Pearson correlation coefficient) for (a) CAFE and (d),(f),(h),(j),(l),(n) individual NMME Niño-4 forecasts. (b) Lagged HadISST autocorrelation in the Niño-4 region. (c),(e),(g),(i),(k),(m) Differences between CAFE and NMME ACC. Black dots indicate the 95th percentile (i.e., 5% statistically significantly better than a climatological forecast). Diagonal lines indicate target dates for forecasts initiated in March.

The CAFE model configuration displays a general “cold tongue” bias in SST whereby the major region of variability in the equatorial Pacific is displaced to the west of the observed maximum. This bias impacts the modeled ENSO phase locking and is the major source of error evident in CAFE forecasts initiated in December at lead times of 5–8 months (i.e., corresponding to target months June–August) (Fig. 3a). The CanCM3, FLORA, FLORB, and COLA models are most impacted by the boreal spring barrier at lead times longer than 6 months. At lead times less than 6 months, CanCM4 is the best performing model in terms of ACC.

To quantify the differences in skill between the CAFE and NMME forecasts, we subtract the respective NMME ACC values from those of the CAFE forecasts; see Figs. 3c,e,g,i,k,m, where red (blue) indicates larger (smaller) ACC values for CAFE relative to NMME. The CAFE forecasts are less skillful than the NMME forecasts for lead times shorter than 6 months and for forecasts initiated in the boreal winter. Again, the solid black lines indicate the skill of forecasts initiated in March out to 10-month lead (i.e., December). Forecast skill for the target months (read in the horizontal direction) of May, June, July, and August generally degrades at lead times beyond 6 months, as reflected in the anomaly correlation with HadISST (Fig. 3a), and arises due to the aforementioned model bias [also discussed in O’Kane et al. (2019)]. That said, relative to the CanCM3, FLORA, FLORB, and COLA (Figs. 3c,i,k,m) forecasts, the CAFE forecasts are generally more skillful for target months beyond 6–8-month lead. As CAFE employs a variant of the GFDL CM2.1 coupled general circulation model, the differences relative to the GFDL AER04 configuration (Fig. 3g) are presumably less pronounced than those relative to the FLORA and FLORB configurations (Figs. 3i,k). Nevertheless, CAFE exhibits improvements in skill over the GFDL configurations at longer lead times, in common with comparisons to the other NMME models with the exception of CanCM4. Interestingly, we found a reasonable correspondence between increased CAFE skill relative to the individual NMME models and regions of reduced HadISST autocorrelation values.

The general increase in ACC with respect to the individual models is reflected in the difference between the multimodel ensemble-mean ACC with and without inclusion of the CAFE forecasts (Fig. 4). Here we calculate the ACC for five NMME models plus the CAFE forecasts and take the difference from the ensemble-mean ACC calculated from the complete (six member) NMME. This is repeated excluding each individual NMME model in turn (Figs. 4a–f). Last, we take the average over all six difference calculations to produce an average difference plot (Fig. 4g). This process estimates the impact on ACC when the CAFE forecasts are included, without prejudice toward any particular model. Here we note that even for lead times shorter than 6 months, with the exception of forecasts initiated in December, the combined multimodel ensemble is positively impacted by inclusion of the CAFE forecasts, again with the improvements occurring in target months and lags where the corresponding HadISST autocorrelation values are low. These results are robust and consistent across all model combinations.

Fig. 4. (top three rows) The difference between the ACC (Pearson correlation coefficient) calculated from an ensemble average in which the indicated NMME model has been replaced by the CAFE forecasts, and the ACC calculated from the ensemble-average NMME. The period considered spans 2003–15. (bottom) The average of all six difference calculations.

A prominent feature emerges from this analysis: the NMME models are in general most skillful at 3–6-month lead time for target months between October and December, whereas the CAFE forecasts, relative to the NMME (1982–2015), are generally more skillful at lead times longer than 6–8 months, in particular for target months corresponding to the boreal spring barrier. These results indicate the importance of atmospheric initial conditions for ENSO predictability at shorter lead times of a few months and the more important role of the emergent signal from thermocline disturbances at longer lead times, beyond two seasons into the future.

b. Sign test

We further apply a procedure proposed by DelSole and Tippett (2016) based on random walks and formally equivalent to the sign test, here applied to the Niño-4 index. The method is independent of distributional assumptions about the forecast errors and, while assuming that individual forecasts are independent, can provide useful information even where serial correlations in time are present. Additionally, the method requires only relatively few years of data to detect significant differences in the skill of models where biases are known a priori.

1) Metric definition

Following DelSole and Tippett (2016), consider N forecasts A and B, where K denotes the number of times A is more skillful than B. Assume each time step is an independent Bernoulli (random) trial (i.e., there is no correlation between consecutive events). The underlying null hypothesis is that forecast A is equally likely to be more or less skillful than B. In this case K should follow a binomial distribution with p = ½; thus,
$$p_b(K) = \frac{1}{2^N} \, \frac{N!}{K! \, (N - K)!},$$
where the p value is given by
$$p_{\mathrm{value}} = 2p_b(0) + 2p_b(1) + \cdots + 2p_b\!\left( \min[K_0, \, N - K_0] \right),$$
that is, the probability of obtaining a count at least as extreme as the observed value $K_0$, with the factor of 2 accounting for the test being two-tailed (i.e., we do not know a priori which forecast is best). The null hypothesis is rejected if $p_{\mathrm{value}}$ falls below a prescribed significance level α.
We determine a critical value $K_\alpha^{\mathrm{binom}}$ as the smallest value of $K_0$ such that $p_{\mathrm{value}} \le \alpha$ for α = 5%. If $K_0 < K_\alpha^{\mathrm{binom}}$ or $K_0 > N - K_\alpha^{\mathrm{binom}}$, then the null hypothesis is rejected. Counts are expressed as a random walk such that whenever A is more skillful than B, a positive step is taken; if B is more skillful than A, a negative step is taken. The case A = B is assumed never to occur and, if it does, no step is taken. Thus, there are K steps in the positive direction and N − K in the negative direction, and
$$d_N = K - (N - K) = 2K - N$$
is the total distance traveled by the random walk.
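The test reduces to a few lines of code. The sketch below accumulates the walk from the squared errors of two competing forecasts and evaluates the two-sided binomial p value with SciPy's exact binomial test, under the text's assumption that exact ties do not occur.

```python
import numpy as np
from scipy.stats import binomtest

def sign_test_walk(err_a, err_b, alpha=0.05):
    """Random-walk sign test of DelSole and Tippett (2016).

    err_a, err_b: squared errors of forecasts A and B at common
    verification times. Returns the walk (cumulative d), the two-sided
    binomial p value, and whether the null (equal skill) is rejected.
    """
    wins_a = np.asarray(err_a) < np.asarray(err_b)  # times A beats B
    K, N = int(wins_a.sum()), wins_a.size
    walk = np.cumsum(np.where(wins_a, 1, -1))       # d_N = 2K - N at the end
    p = binomtest(K, N, p=0.5, alternative="two-sided").pvalue
    return walk, p, p < alpha
```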

2) Results

We now apply the random walk sign test as a general method for comparing the skill of two predictions; here the evaluation of mean square error also partially serves to evaluate the amplitude of ENSO. After applying bias correction, as described in section 2, the sign test random walk is accumulated from February 2002 through December 2015. We consider skill in monthly values of the Niño-4 index; that is, the count increases by 1 when the squared error of the CAFE (model B) Niño-4 index is larger than that of any individual NMME model $A_i \in \{\mathrm{NMME}(1), \mathrm{NMME}(2), \ldots, \mathrm{NMME}(6)\}$ ($A_i$ is defined in Table 1) and decreases by 1 otherwise. Throughout, the assumption is $A_i \ne B$ for all i. The count is accumulated forward in time for each model separately, over all initial months in a given season and over the years between 2002 and 2015 (for a fixed lead time), thereby tracing out a random walk. In Fig. 5, the white region indicates the range of counts that would be obtained 95% of the time under independent random (Bernoulli) trials for p = 0.5 (i.e., models in this range are statistically indistinguishable). A random walk extending into the blue (tan) shaded area indicates that the CAFE forecasts are significantly less (more) skillful more often than expected for independent random trials; that is, CAFE is farther from (closer to) the observations. Changes in the average slope of the random walk can indicate a systematic change in skill. Cognizant of the seasonal dependence of ENSO forecasts, we partition the count by season.

Fig. 5. Random walk sign test comparison of seasonal-mean Niño-4 for the CAFE forecasts with respect to the various NMME model forecasts at (left) 3-, (center) 6-, and (right) 11-month leads. The systematic error (bias) relative to observations over the period 2002–15 has been removed from each model. Each row refers to forecasts initiated in a given season (e.g., DJF refers to forecasts initiated during December–February).

In Fig. 5 and the text that follows, we refer to the respective seasons using the first letter of each month (i.e., DJF refers to the boreal winter months of December–February and similarly for the other seasons). Despite being explicitly constructed to project onto ENSO predictability at lead times beyond 6 months, at 3-month lead (Fig. 5, left column) the CAFE forecasts are statistically indistinguishable from the majority of NMME forecasts initialized in JJA and SON; less skillful than COLA, CanCM3, and CanCM4 for DJF; and only less skillful than COLA for MAM forecasts. At 6-month lead (Fig. 5, center column) for DJF forecasts, CAFE is statistically less skillful than all NMME models with the exception of CanCM4 and AER04, a result consistent with the poor phase locking evident in Fig. 3a for CAFE forecasts initialized in December at 6-month lead time. In the remaining seasons (MAM, JJA, and SON), there is little to distinguish CAFE and NMME forecasts at 6-month lead. At 11-month lead (Fig. 5, right column) for DJF, there is evidence that CAFE is slightly more skillful than COLA and CanCM3; however, skill, as measured by the sign test, is comparable across models.

5. Discussion and conclusions

We have shown that, during the boreal spring, the difficulties associated with detecting the relevant and often weak SST anomalies associated with specific ENSO events may be mitigated to some extent by targeting disturbances about the tropical Pacific thermocline. In O’Kane et al. (2019), it was demonstrated that the variability associated with thermocline disturbances typically leads the SST variability by about 6 months, thereby providing a mechanism for extended predictability. They focused on the methods and mechanisms by which ensemble forecasts may be initiated using nonlinearly modified dynamical vectors to span the local low-dimensional subspace where variance, partitioned into a specific spatiotemporal band relevant to ENSO, resides.

In this follow-up study, we have compared multiyear ensemble ENSO forecasts from the CAFE system to ensemble forecasts from state-of-the-art dynamical coupled models in the NMME project. Our analysis of the ACC as a standard metric of forecast skill in terms of ENSO phase largely negates the biases specific to each of the respective models. We find that, relative to the respective NMME forecasts, the CAFE forecasts display increased ACC values at lead times greater than 6 months, with the increases largest for target months when predictability is most strongly limited by the boreal spring barrier. Comparison of ensemble-mean ACC values with and without the CAFE forecasts clearly shows the utility of augmenting current initialization methods with initial perturbations that target instabilities specific to the tropical Pacific thermocline for ENSO prediction beyond a season. These results show that the inclusion of CAFE in the NMME increases the ACC for most lead times and target months and that CAFE brings rather independent information to the NMME.

Next, the random walk sign test was used as a measure of forecast skill. Although less skillful at 3-month lead than a seasonally dependent subset of NMME models, the relative skill of the CAFE forecasts progressively increases against all NMME models with lead time, such that at 11-month lead CAFE is at least as skillful as any particular NMME model. This is an important result given that the CAFE ensemble forecasts were initialized with only a highly targeted initial perturbation at the equatorial Pacific thermocline. Previous calculations of receiver operating characteristic (ROC) curves for Niño-4 forecasts, when calculated over all start dates and lead times out to 24 months (Fig. 15 of O’Kane et al. 2019), revealed skill out beyond 400 days. In those calculations, the ROC curve (i.e., both the hit rate and the false alarm rate) was calculated for prediction of the occurrence (i.e., yes or no) of Niño-4 events where the Niño-4 anomaly exceeds 1°C.

Taken as a whole, our results suggest that augmenting current initialization methods with initial perturbations targeting instabilities specific to the tropical Pacific thermocline generally improves ENSO prediction beyond 6-month lead time. More generally, it is reasonable to assume that the predictability of specific climate teleconnections may be targeted by judicious projection of initial ensemble perturbations onto the relevant instabilities responsible for determining variability at the spatiotemporal scales of interest.

Acknowledgments

The authors were supported by the Australian Commonwealth Scientific and Industrial Research Organisation (CSIRO) Decadal Climate Forecasting Project (https://research.csiro.au/dfp).

REFERENCES

  • Balmaseda, M. A., and D. Anderson, 2009: Impact of initialization strategies and observations on seasonal forecast skill. Geophys. Res. Lett., 36, L01701, https://doi.org/10.1029/2008GL035561.
  • Barnston, A. G., M. Chelliah, and S. B. Goldenberg, 1997: Documentation of a highly ENSO-related SST region in the equatorial Pacific: Research note. Atmos.–Ocean, 35, 367–383, https://doi.org/10.1080/07055900.1997.9649597.
  • Barnston, A. G., M. K. Tippett, M. L. L’Heureux, S. Li, and D. G. DeWitt, 2012: Skill of real-time seasonal ENSO model predictions during 2001–11. Is our capacity increasing? Bull. Amer. Meteor. Soc., 93, 631–651, https://doi.org/10.1175/BAMS-D-11-00111.1.
  • Barnston, A. G., M. K. Tippett, M. Ranganathan, and M. L. L’Heureux, 2017: Deterministic skill of ENSO predictions from the North American Multimodel Ensemble. Climate Dyn., 53, 7215–7234, https://doi.org/10.1007/s00382-017-3603-3.
  • Cai, M., E. Kalnay, and Z. Toth, 2003: Bred vectors of the Zebiak–Cane model and their potential application to ENSO prediction. J. Climate, 16, 40–56, https://doi.org/10.1175/1520-0442(2003)016<0040:BVOTZC>2.0.CO;2.
  • Cooley, J. W., P. A. W. Lewis, and P. D. Welch, 1969: The fast Fourier transform and its applications. IEEE Trans. Educ., 12, 27–34, https://doi.org/10.1109/TE.1969.4320436.
  • DelSole, T., and M. K. Tippett, 2014: Comparing forecast skill. Mon. Wea. Rev., 142, 4658–4678, https://doi.org/10.1175/MWR-D-14-00045.1.
  • DelSole, T., and M. K. Tippett, 2016: Forecast comparison based on random walks. Mon. Wea. Rev., 144, 615–626, https://doi.org/10.1175/MWR-D-15-0218.1.
  • Evensen, G., 2003: The ensemble Kalman filter: Theoretical formulation and practical implementation. Ocean Dyn., 53, 343–367, https://doi.org/10.1007/s10236-003-0036-9.
  • Flügel, M., and P. Chang, 1998: Does the predictability of ENSO depend on the seasonal cycle? J. Atmos. Sci., 55, 3230–3243, https://doi.org/10.1175/1520-0469(1998)055<3230:DTPOED>2.0.CO;2.
  • Frederiksen, J. S., C. S. Frederiksen, and S. L. Osbrough, 2010: Seasonal ensemble prediction with a coupled ocean–atmosphere model. Aust. Meteor. Ocean J., 59, 53–66, https://doi.org/10.22499/2.5901.007.
  • Goddard, L., and Coauthors, 2013: A verification framework for interannual-to-decadal predictions experiments. Climate Dyn., 40, 245–272, https://doi.org/10.1007/s00382-012-1481-2.
  • Jin, E. K., and Coauthors, 2008: Current status of ENSO prediction skill in coupled ocean–atmosphere models. Climate Dyn., 31, 647–664, https://doi.org/10.1007/s00382-008-0397-3.
  • Jolliffe, I. T., and D. B. Stephenson, 2011: Forecast Verification: A Practitioner’s Guide in Atmospheric Science. 2nd ed. Wiley, 292 pp.
  • Kirtman, B. P., and D. Min, 2009: Multimodel ensemble ENSO prediction with CCSM and CFS. Mon. Wea. Rev., 137, 2908–2930, https://doi.org/10.1175/2009MWR2672.1.
  • Kirtman, B. P., and Coauthors, 2014: The North American Multi-Model Ensemble (NMME): Phase-1 seasonal to interannual prediction; Phase-2 toward developing intra-seasonal prediction. Bull. Amer. Meteor. Soc., 95, 585–601, https://doi.org/10.1175/BAMS-D-12-00050.1.
  • Lai, A. W.-C., M. Herzog, and H.-F. Graf, 2018: ENSO forecasts near the spring predictability barrier and possible reasons for the recently reduced predictability. J. Climate, 31, 815–838, https://doi.org/10.1175/JCLI-D-17-0180.1.
  • O’Kane, T. J., P. R. Oke, and P. A. Sandery, 2011: Predicting the East Australian Current. Ocean Modell., 38, 251–266, https://doi.org/10.1016/j.ocemod.2011.04.003.
  • O’Kane, T. J., and Coauthors, 2019: Coupled data assimilation and ensemble initialization with application to multiyear ENSO prediction. J. Climate, 32, 997–1024, https://doi.org/10.1175/JCLI-D-18-0189.1.
  • Pearson, K., 1895: Notes on regression and inheritance in the case of two parents. Proc. Roy. Soc. London, 58, 240–242, https://doi.org/10.1098/rspl.1895.0041.
  • Philander, S. G. H., T. Yamagata, and R. C. Pacanowski, 1984: Unstable air–sea interactions in the tropics. J. Atmos. Sci., 41, 604–613, https://doi.org/10.1175/1520-0469(1984)041<0604:UASIIT>2.0.CO;2.
  • Rayner, N. A., D. E. Parker, E. B. Horton, C. K. Folland, L. V. Alexander, D. P. Rowell, E. C. Kent, and A. Kaplan, 2003: Global analyses of sea surface temperature, sea ice, and night marine air temperature since the late nineteenth century. J. Geophys. Res., 108, 4407, https://doi.org/10.1029/2002JD002670.
  • Saha, S., and Coauthors, 2006: The NCEP Climate Forecast System. J. Climate, 19, 3483–3517, https://doi.org/10.1175/JCLI3812.1.
  • Tippett, M. K., A. G. Barnston, and S. Li, 2012: Performance of recent multimodel ENSO forecasts. J. Appl. Meteor. Climatol., 51, 637–654, https://doi.org/10.1175/JAMC-D-11-093.1.
  • Tippett, M. K., M. Ranganathan, M. L. L’Heureux, A. G. Barnston, and T. DelSole, 2019: Assessing probabilistic predictions of ENSO phase and intensity from the North American Multimodel Ensemble. Climate Dyn., 53, 7497–7518, https://doi.org/10.1007/s00382-017-3721-y.
  • Toth, Z., and E. Kalnay, 1997: Ensemble forecasting at NCEP and the breeding method. Mon. Wea. Rev., 125, 3297–3319, https://doi.org/10.1175/1520-0493(1997)125<3297:EFANAT>2.0.CO;2.
  • Wolter, K., and M. Timlin, 2011: El Niño/Southern Oscillation behaviour since 1871 as diagnosed in an extended multivariate ENSO index (MEI.ext). Int. J. Climatol., 31, 1074–1087, https://doi.org/10.1002/joc.2336.
  • Yang, S. C., M. Cai, E. Kalnay, M. Rienecker, G. Yuan, and Z. Toth, 2006: ENSO bred vectors in coupled ocean–atmosphere general circulation models. J. Climate, 19, 1422–1436, https://doi.org/10.1175/JCLI3696.1.