1. Introduction
In recent years, seasonal-to-interannual climate forecasts based on ensemble techniques have been shown to possess somewhat enhanced skill. Both the European Centre for Medium-Range Weather Forecasts (ECMWF) and the U.S. National Weather Service (NWS) routinely provide coupled model forecasts using ensembles, where ensemble members differ in the perturbations introduced to the initial conditions. The value of the ensemble method has been advanced through hindcasting exercises and comparisons with a variety of climate markers. In this paper, sea surface temperature (SST) anomaly prediction by ensemble techniques, particularly the so-called superensemble (SupEns) technique, is presented. Motivation for the general area of SST prediction comes from the desire to accurately forecast El Niño events more than a few seasons in advance. Accurate SST forecasts would also assist the so-called Tier-1 family of climate models. These are atmospheric general circulation models (AGCMs) that utilize prescribed SST.
a. Background
A number of past studies in seasonal SST forecasting have shown considerable promise. Amongst the commonly used techniques are canonical correlation analysis (CCA), hybrid coupled modeling (HCM), linear inverse modeling (LIM) and Markov modeling (MM). In CCA, physical intuition guides the selection of a set of predictors and predictands, whose correlation is subsequently maximized using CCA methods. Predictor measurements are used to issue forecasts (e.g., Barnston and Ropelewski 1992; Barnston and Smith 1996; Mo 2003). HCM assumes that detailed atmospheric structure is unimportant to seasonal SST forecasts; thus, fully nonlinear ocean general circulation models (OGCMs) are coupled to simpler, often statistical, atmospheric models (see Neelin 1990; Latif and Villwock 1990; Barnett et al. 1993; Syu et al. 1995). The statistical LIM technique relates forecasts to predictand history. Principal oscillation pattern (POP) analysis (Hasselmann 1988) is used to maximize predictor/predictand correlations. Penland and collaborators (Penland and Magorian 1993; Penland 1996; Penland et al. 1998) in particular have demonstrated skill enhancement in SST prediction over all the three major ocean basins using LIM. Markov models seek to exploit SST persistence, as compared to atmospheric variables, through the use of linear stochastic models. Examples are found in Xue et al. (2000), Johnson et al. (2000), and Berliner et al. (2000).
The above methods apply, in varying degrees, statistical and dynamical concepts to the forecasting of SST. We continue this approach in the present paper although the mix of dynamical and statistical tools differs considerably. Specifically, we will exploit the results of 13 state-of-the-art dynamical coupled atmosphere–ocean models by means of statistical ensemble analyses for the purpose of SST anomaly forecasting. Our particular emphasis is on the recently developed superensemble method, and we will present a series of deterministic and probabilistic skill tests comparing this method with other, more classic ensemble predictions. In addition, comparisons of the superensemble forecasts will be made with forecasts derived from some of the above methods. Special emphasis is given to SST anomaly forecasts during El Niño/La Niña events and for the different phases of the Indian Ocean dipole (IOD; Saji et al. 1999).
Section 2 describes the superensemble methodology. Section 3 provides a brief overview of the member models, their datasets, and the use of these datasets in the superensemble algorithm. Results for the global oceans and major ocean basins, and during El Niño/La Niña and IOD events, are presented in detail in section 4. This section also compares the results of the superensemble with some of the nonmember model forecasts. Section 5 summarizes the major results of this study with a note toward future development for seasonal SST forecasts.
2. The superensemble methodology
The superensemble methodology (Krishnamurti et al. 1999) issues a consensus forecast from a set of dynamical model forecasts by applying a collective bias correction in a manner unlike classical bias corrections. The latter weight every model equally both in bias calculation and forecast construction. In contrast, the superensemble technique considers the past performance of each model in assigning its relative forecast weight. This is carried out for each of the models, for each of the variables, and at every grid location. The technique has been shown to yield superior forecasts when applied to weather prediction and seasonal climate (Krishnamurti et al. 2000a, b, 2001, 2002, 2003, 2006).
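The contrast between equal weighting and performance-based weighting can be illustrated with a small sketch for a single grid point and variable. It assumes the weights are obtained by a least squares fit of model anomalies to observed anomalies over a training period, which is the form usually given in the cited superensemble papers; the function names and toy numbers below are hypothetical and are not taken from this study.

```python
import numpy as np

def train_superensemble(train_forecasts, train_observed):
    """Fit per-model weights at one grid point (assumed least squares form).

    train_forecasts: array-like, shape (n_models, n_times)
    train_observed:  array-like, shape (n_times,)
    """
    f = np.asarray(train_forecasts, dtype=float)
    o = np.asarray(train_observed, dtype=float)
    f_mean = f.mean(axis=1)          # each model's own training-period mean
    o_mean = o.mean()                # observed training-period mean
    # Regress observed anomalies on model anomalies; lstsq also copes with
    # a rank-deficient (singular) system by returning a minimum-norm solution.
    weights, *_ = np.linalg.lstsq((f - f_mean[:, None]).T, o - o_mean, rcond=None)
    return weights, f_mean, o_mean

def superensemble_forecast(new_forecasts, weights, f_mean, o_mean):
    """Weighted sum of model anomalies added to the observed mean."""
    anomalies = np.asarray(new_forecasts, dtype=float) - f_mean
    return o_mean + float(np.dot(weights, anomalies))

def bias_removed_ensemble_mean(new_forecasts, f_mean, o_mean):
    """Classical correction: every model anomaly receives the same weight."""
    anomalies = np.asarray(new_forecasts, dtype=float) - f_mean
    return o_mean + float(anomalies.mean())

# Toy training data: two models, four training times (hypothetical values).
train_f = np.array([[1.0, 2.0, 3.0, 4.0],
                    [2.0, 1.0, 4.0, 3.0]])
train_o = 10.0 + 0.7 * (train_f[0] - 2.5) + 0.3 * (train_f[1] - 2.5)

w, f_bar, o_bar = train_superensemble(train_f, train_o)
se = superensemble_forecast([4.5, 2.5], w, f_bar, o_bar)
brem = bias_removed_ensemble_mean([4.5, 2.5], f_bar, o_bar)
```

In this toy case the fit recovers the weights used to build the synthetic observations, so the better-performing model dominates the superensemble value while the bias-removed mean treats both models alike.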
Our primary method for the construction of a multimodel superensemble for climate forecasts is somewhat different. It follows a method described in Yun et al. (2005), Krishnamurti et al. (2006), and Chakraborty and Krishnamurti (2006). This study utilizes a so-called synthetic superensemble that generates a proxy data stream for each of the model forecasts. This entails an expansion of the forecasts and the analysis (observed) fields using principal components (PCs) in time and empirical orthogonal functions (EOFs) in space. Some 50 PCs are used in these expansions to assure that the final results are not dependent on this number. The spatial structure of the predicted fields was replaced by the analysis-based counterparts. It contains the following steps.
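The sketch below illustrates only the PC/EOF expansion itself, not the full synthetic superensemble procedure. It assumes the expansion is computed by a singular value decomposition of the time-by-space anomaly matrix, which is one standard way to obtain EOFs; the function names and the random toy field are hypothetical.

```python
import numpy as np

def eof_expand(field, n_modes):
    """Expand a (n_times, n_points) field into PCs (time) and EOFs (space).

    Returns pcs with shape (n_times, n_modes) and eofs with shape
    (n_modes, n_points), ordered by decreasing explained variance.
    """
    anomalies = field - field.mean(axis=0)      # remove the time mean
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    pcs = u[:, :n_modes] * s[:n_modes]          # time series of each mode
    eofs = vt[:n_modes]                         # orthonormal spatial patterns
    return pcs, eofs

def reconstruct(pcs, eofs):
    """Rebuild the anomaly field from the retained modes."""
    return pcs @ eofs

# Toy field: 12 times, 6 grid points (hypothetical random values).
rng = np.random.default_rng(0)
toy_field = rng.standard_normal((12, 6))

pcs_all, eofs_all = eof_expand(toy_field, n_modes=6)   # keep every mode
pcs_2, eofs_2 = eof_expand(toy_field, n_modes=2)       # truncated expansion
```

Keeping all modes reproduces the anomaly field exactly, whereas a truncated set retains only the leading patterns. Replacing the spatial patterns of a forecast with those of the analysis, as described above, would operate on arrays of this kind.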
3. Models/datasets and the construction of the superensemble
The datasets for this study were derived from 13 coupled ocean–atmosphere models. Table 1 provides an outline of these models, which include 7 European models [extracted from the Development of a European Multimodel Ensemble System for Seasonal-to-Interannual Prediction (DEMETER; Palmer et al. 2004) database], the National Center for Atmospheric Research (NCAR) Community Climate Model version 3 (CCM3), the Predictive Ocean Atmosphere Model for Australia (POAMA-1), and four Florida State University (FSU) coupled atmosphere–ocean models. As may be seen, these are diverse models in their physical parameterizations, horizontal and vertical resolutions, and dynamics. A common resolution, that is, 2.5° in latitude by 2.5° in longitude and 14 vertical levels, was selected for the construction of the superensemble, and all model results were interpolated to this grid.
The focus of our study is the prediction of SST 1 month in advance. The month immediately following the forecast start date was not included in either the training or the forecast phase. A time series of seasonal mean (month 2 to month 4) forecasts was thus created for each model to construct a string of 1-month lead-time seasonal mean forecasts. All the models had been applied in this hindcast mode to the interval from 1989 to 2001. The European models each provided nine separate forecasts differing in their initial conditions (adjacent days were substituted for the starting date). Therefore, there were in total 23 400 months of forecasts available for this study.
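The lead-time convention can be made concrete with a minimal sketch. It assumes each forecast is stored as a list of monthly means in which the first entry is the month immediately following the start date; the function name and numbers are hypothetical.

```python
def seasonal_mean_one_month_lead(monthly_forecast):
    """Average forecast months 2 to 4, skipping the first month.

    monthly_forecast: monthly means, index 0 = first month after the start date.
    """
    if len(monthly_forecast) < 4:
        raise ValueError("need at least four forecast months")
    season = monthly_forecast[1:4]      # months 2, 3, and 4
    return sum(season) / len(season)

# Month 1 (10.0) and month 5 (99.0) do not enter the seasonal mean.
example = seasonal_mean_one_month_lead([10.0, 1.0, 2.0, 3.0, 99.0])
```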
The 13 yr (equivalent to 52 seasons) of model data were somewhat short of what experience has shown to be needed to provide stable superensemble results [72 seasons, according to the Atmospheric Model Intercomparison Project (AMIP) study of Krishnamurti et al. (2000a)]. This problem was addressed by employing a cross-validation procedure in the spirit of Déqué (1997). During the training phase of the superensemble, the year (4 seasons) being forecasted was excluded from the training dataset, and superensemble coefficients were calculated based on the forecasts for the remaining years (48 seasons). Kharin and Zwiers (2002) quite correctly pointed out that a multimodel ensemble can lead to overfitting of parameters to datasets. We recognize this problem. In the FSU superensemble such a problem would be reflected if the coefficient matrix became singular. We did not encounter this problem in these climate runs.
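The bookkeeping of this leave-one-year-out procedure can be sketched as follows. The function name and the (year, season) tuple representation are assumptions made for illustration.

```python
def leave_one_year_out(years, seasons_per_year=4):
    """Yield (held_out_year, training_seasons, forecast_seasons) splits.

    Each season is represented as a (year, season_index) tuple.
    """
    years = list(years)
    for held_out in years:
        train = [(y, s) for y in years if y != held_out
                 for s in range(seasons_per_year)]
        test = [(held_out, s) for s in range(seasons_per_year)]
        yield held_out, train, test

# 1989 to 2001 inclusive: 13 years, 52 seasons in total.
splits = list(leave_one_year_out(range(1989, 2002)))
```

Each of the 13 splits trains on 48 seasons and forecasts the 4 seasons of the withheld year, so no season is ever used to train the weights that forecast it.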
The training phase and the forecast validation require the analysis counterparts based on observed datasets. For all atmospheric variables except precipitation the benchmark analysis that this study uses is the 40-yr ECMWF Re-Analysis (ERA-40) datasets. For precipitation the Climate Prediction Center (CPC) Merged Analysis of Precipitation (CMAP; Xie and Arkin 1997) datasets were used. For SST the Reynolds (Reynolds et al. 2002) datasets were taken as the observed counterpart. These datasets are also horizontally and vertically interpolated to the reference grid of the superensemble.
4. Performance of the superensemble
a. Global oceans climatologies
The first analysis performed on the model data examined the seasonal SST climatologies from each member model. It was found that all the models exhibited somewhat comparable skills for their SST prediction. Here, the boreal spring climatology for the months of March–May is illustrated as an example (Fig. 1). The observed climatology based on the satellite-derived SSTs (Reynolds et al. 2002) averaged over the study interval (1989–2001) is shown in the first panel. The models exhibit reasonable simulation of the SST climatology: their RMS errors range between roughly 1° and 2°C, for example, U.K.Met: 0.99°C, ECMWF: 1.07°C, and POAMA-1: 2.18°C. The RMS error of the ensemble mean shown in Fig. 1o is 0.82°C, whereas that of the superensemble is 0.03°C. This significant improvement is due largely to a correct geographical distribution of SST. As shown between Figs. 1a and 1p, the superensemble could place each SST isotherm almost on top of the observed SST isotherms. This feature appears to significantly enhance SST anomaly forecasts. In comparison, the Max Planck Institute (MPI) model has a reasonable climatology over much of the equatorial belt, but the warm SSTs (>28°C) south of India in the near equatorial belt do not extend over the Bay of Bengal and Arabian Sea as in the observations. Discrepancies of this nature tend to increase the RMS errors to >0.5°C. We have not presented the model climatology for all other seasons, but they were noted to be equally skillful for the superensemble.
b. Seasonal forecast skill
Here we compare SST seasonal forecast skills according to the so-called false alarm ratio (FAR; see Table 2 for a definition). This measure indicates the reliability of the model in computing extreme events, with small values of FAR indicating high reliability. The results for seasonal forecasts covering the months March–May for the years 1989 to 2001 are illustrated in Fig. 2. The thin vertical barbs show the FAR of the 13 member models, of the bias-removed ensemble mean (BREM), and of the multimodel superensemble. Overall, the superensemble outperforms all the models and the BREM for nearly all values of the anomaly thresholds. The exception occurs for SST anomalies very close to zero (i.e., <0 and >0), where the consequences of issuing a false alarm forecast are minimal. For the important cases of large anomalies >|0.3|°C, the superensemble generally issues a more reliable product.
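For readers without Table 2 at hand, the sketch below uses the standard contingency-table definition of FAR, that is, false alarms divided by the total number of forecast events. That this matches the Table 2 definition exactly is an assumption, as is the convention that a negative threshold denotes anomalies colder than the threshold; all names and numbers are hypothetical.

```python
import math

def is_event(anomaly, threshold):
    """Event = anomaly beyond the threshold, on the threshold's own side."""
    return anomaly > threshold if threshold >= 0 else anomaly < threshold

def contingency_counts(forecast, observed, threshold):
    """Return (hits, misses, false_alarms, correct_negatives)."""
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast, observed):
        f_event, o_event = is_event(f, threshold), is_event(o, threshold)
        if f_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif f_event:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives

def false_alarm_ratio(hits, false_alarms):
    """Fraction of forecast events that did not occur (lower is better)."""
    forecast_events = hits + false_alarms
    return false_alarms / forecast_events if forecast_events else math.nan

# Toy SST anomalies (degrees C) at five grid points, threshold 0.3.
fcst = [0.5, 0.4, 0.1, 0.6, -0.2]
obs = [0.6, 0.1, 0.5, 0.7, 0.0]
counts = contingency_counts(fcst, obs, 0.3)
far = false_alarm_ratio(counts[0], counts[2])
```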
Equitable threat scores (ETSs) for global SST spring (March–May) forecasts at various thresholds are shown in Fig. 3 (higher scores are better). For all thresholds greater than |0.3|°C the superensemble scores higher than any of the member models and the BREM. For thresholds less than |0.3|°C, the superensemble ETS performance was somewhat less relative to the member models, but remains high enough to be useful. In other words, the skill of the superensemble is high enough to signal small anomalies, even though the forecasted value of the small anomaly is not highly accurate. This overall skill of the superensemble in predicting SST anomalies greater than 0.3°C (negative or positive) is obtained for all seasons over the global oceans. The positive feature of these computations is clearly that a better product is generally obtained from the construction of the superensemble. For most of the thresholds of SST anomaly >|0.3|°C the ETS achieved by the multimodel superensemble is around 0.2 or less. At the present stage of coupled modeling this appears to be the best possible score for SST anomalies. It should be noted that this is a probabilistic skill score and is very stringent in its demands. In fact, these same years of seemingly low ETS do carry very large values of anomaly correlations (ACs) in a deterministic sense. They show that these results are in fact far better than are implied by the low scores for ETS.
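The stringency of the ETS follows from its construction, which discounts hits expected by chance. The sketch below uses the standard definition of the score; the paper does not spell out its formula in this section, so treating it as the one used here is an assumption, and the counts are hypothetical.

```python
import math

def equitable_threat_score(hits, misses, false_alarms, correct_negatives):
    """ETS = (hits - hits_random) / (hits + misses + false_alarms - hits_random).

    hits_random is the number of hits expected if forecast and observed
    events were statistically independent.
    """
    total = hits + misses + false_alarms + correct_negatives
    if total == 0:
        return math.nan
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denominator = hits + misses + false_alarms - hits_random
    return (hits - hits_random) / denominator if denominator else math.nan

# Hypothetical counts: 2 hits, 1 miss, 1 false alarm, 1 correct negative.
ets_example = equitable_threat_score(2, 1, 1, 1)

# A forecast with no misses and no false alarms scores exactly 1.
ets_perfect = equitable_threat_score(3, 0, 0, 2)
```

Because chance hits are subtracted from both numerator and denominator, a forecast that captures most events can still score well below 1, which is consistent with modest ETS values coexisting with high anomaly correlations.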
c. Major ocean basins
The results over the tropical Pacific Ocean were very impressive. Figure 5 shows the RMS errors of the SST and the anomaly correlation (for the SST anomalies). The RMS scores from 1989 through 2001 consistently show a very marked reduction of the RMS error for the superensemble (∼0.3°C) compared to the ensemble mean and all of the individual member models (∼1°C). The anomaly correlations over the tropical Pacific Ocean were above 0.6 for most of the years. It is also interesting to see the very high skills during El Niño years 1991–92, 1997, 1998, and 1999. An exceptional year was 1995 when the anomaly correlations for the member models were very low. Even during this year the superensemble carried a value close to 0.4. Overall these results from the superensemble suggest that both SST and SST anomalies in the tropical Pacific are computed much more accurately relative to any of the single coupled ocean–atmosphere models.
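The two deterministic measures used throughout this section can be sketched as follows. The sketch assumes an unweighted average over grid points and a centered (Pearson) pattern correlation; whether the paper applies area weighting or an uncentered form is not stated here, so both choices are assumptions, and the names and values are hypothetical.

```python
import math

def rms_error(forecast, observed):
    """Root-mean-square difference over all grid points (unweighted)."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))

def anomaly_correlation(forecast_anom, observed_anom):
    """Centered spatial correlation between forecast and observed anomalies."""
    n = len(forecast_anom)
    f_mean = sum(forecast_anom) / n
    o_mean = sum(observed_anom) / n
    cov = sum((f - f_mean) * (o - o_mean)
              for f, o in zip(forecast_anom, observed_anom))
    f_var = sum((f - f_mean) ** 2 for f in forecast_anom)
    o_var = sum((o - o_mean) ** 2 for o in observed_anom)
    if f_var == 0 or o_var == 0:
        return math.nan              # undefined for a spatially constant field
    return cov / math.sqrt(f_var * o_var)

# A forecast with the right pattern but half the observed amplitude.
fcst_anom = [1.0, 2.0, 3.0]
obs_anom = [2.0, 4.0, 6.0]
rmse = rms_error(fcst_anom, obs_anom)
ac = anomaly_correlation(fcst_anom, obs_anom)
```

The toy case shows why both measures are reported: the pattern correlation is perfect even though the amplitude error leaves a sizable RMS error.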
In this period, El Niño events occurred during 1991–92 and 1997–98. It appears that the model skills of RMS errors (Fig. 5a) were not adversely affected by the occurrence of El Niño events. However, we did note that the anomaly correlations (Fig. 5b) showed slightly lower skill for the multimodel superensemble compared to the best model during the El Niño years of 1992 and 1997–98. Much higher values for these anomaly correlations during these El Niño events were also noted.
Tropical Atlantic SST forecasts are important for assessing potential hurricane activity. These are presented in Fig. 6. Here again the RMS seasonal SST forecast errors of the superensemble are consistently smaller, at levels near 0.3° to 0.4°C, than those of the member models (RMS errors ≥ 1°C). SST anomaly correlations (Fig. 6b) show that there were only two years, 1989 and 2000, when the superensemble performed poorly. Both were years of very weak SST anomaly. During most of the other years the superensemble provided either the best or close to the best results compared to the member models. This is valuable information because when forecasts are first made, which model’s forecast will be optimal from a collection of forecasts is unknown. It is in that sense that reliance on the superensemble is useful since it is consistently a better product. In the El Niño year of 1992 the SST anomalies from the superensemble carried an anomaly correlation close to 0.9 for the tropical Atlantic Ocean. Results during the years 1991, 1997, and 1998 were also equally impressive. The La Niña years in this sequence include 1989, 1995–96, and 1998–2001 (based on the Oceanic Niño Index).
It is of interest to compare SST anomaly spatial structure from the various forecast products as compared to observations; we focus here on periods when the anomaly correlations are high. We show here examples from the Pacific and Atlantic Oceans from 1999, when the anomaly correlations were quite high (∼0.85). Figure 7a shows the seasonal mean June–August 1999 forecast from the best model in the suite (blue contours) against observations (green contours). The SST anomaly during the same season from the superensemble (red contour) is compared with observations in Fig. 7b. Clearly, the spatial patterns of the SST anomaly over the entire tropical Pacific Ocean are well captured by the superensemble. In comparison, the best model (correlation skill of 0.6) yields spatial structures that in places depart notably from the data. In Figs. 7c,d, comparisons from the Atlantic Ocean summer of 1992 are shown. Again, the spatial SST anomaly patterns are well represented in the superensemble forecast, especially in comparison to that of the best ensemble member model. These of course are specific selected examples of the highest forecast skills for the SST anomalies over the Pacific and Atlantic Oceans.
It appears that the skill of the superensemble during La Niña years is around 0.3 or higher. The lowest values of the anomaly correlation clearly were during years when the observed SST anomalies were very small. This knowledge is important because as soon as such forecasts are completed one has the knowledge as to whether the superensemble carries very small SST anomalies in its forecasts. If so, it is plausible, based on those forecasts, to anticipate weak SST anomalies.
d. Performance over the Niño regions
The seasonal forecasts for the Niño-3 region, over the equatorial Pacific Ocean, are shown in Fig. 8. The top panel shows the RMS errors for the 13 yr of seasonal forecasts for the months March–May. In these illustrations, we show the skills for each of the member models (thin lines), the bias-removed ensemble mean (black line), and the superensemble (heavy black line). The strength of the superensemble stands out in both of these illustrations. The RMS errors are the least for the superensemble, being of the order of 0.3°C, and the mean anomaly correlation (for 13 yr) is of the order of 0.7. It is higher than 0.9 over several years. The SST anomalies over the Niño-3 region were predicted to a very high degree of accuracy over several years, the exception being the March–May seasons during 1992, 1994, and 1997 when the anomaly correlations for the superensemble were low. During these years the performances of the member models were also quite low.
The seasonal prediction errors for the coupled models, the ensemble mean, and the superensemble are illustrated in Fig. 9. These are the values of the seasonal forecast SST minus the observed over the Niño-3.4 region of the Pacific Ocean. Many models exhibit protracted intervals of cold biases (as large as −3° to −4°C). As a result, the ensemble mean exhibits a negative anomaly with mean value of −1°C (not shown). The superensemble has an error value very close to zero and clearly provides the best forecasts for the Niño-3.4 SST anomalies. The amplitude of the errors of the superensemble is around 0.5°C generally.
Figure 10 shows comparable superensemble skill in predicting the SST anomalies of the Niño-3.4 region, in comparison to the member models. Here we show a plot of predicted seasonal anomaly minus the observed anomaly for the 13 yr. The spread of these differences with respect to the observed anomalies is indeed quite large for the member models. The errors of the superensemble based SST anomalies are around ±1°C over the entire period.
These figures (Figs. 9 and 10) show quite clearly that both the SST and SST anomaly forecasts are much improved for the Niño-3.4 region when formed by a multimodel superensemble.
e. Performance during positive and negative phases of Indian Ocean dipole
Most studies of the relationship between the dipole mode index (DMI) and the monsoon were carried out at zero lag (in terms of statistics). Within the framework of seasonal forecasts it is worth asking whether the coupled models can predict the DMI across several decades. Given such forecast datasets, we can also explore possible nonzero monsoon–DMI lag connections.
Figure 11 shows forecasts of the DMI from the superensemble and the best of the ensemble member models as compared to the DMI observations. Here performance was defined in terms of the RMS error of the forecast relative to the observations for 1-month lead-time forecasts. This clearly shows that the DMIs are predicted extremely well by the multimodel superensemble. The superensemble carries similar seasonal forecasts for the total rains over the Indian monsoon region (Krishnamurti et al. 2006). That is another component of our study from the same suite of model outputs. Overall it was seen that the model implicitly is handling the monsoon rainfall/ENSO/IOD indices quite well in these seasonal forecasts. The skill of the best model for forecasting Niño-3.4 SST anomalies, DMI, and the Indian rainfall (Krishnamurti et al. 2006) was clearly somewhat lower compared to that of the multimodel superensemble for seasonal forecasts.
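As an illustration of how a dipole index of this kind is formed, the sketch below takes the difference between SST anomalies averaged over a western and a southeastern equatorial Indian Ocean box. The box limits are those commonly attributed to Saji et al. (1999) and are an assumption here, since this text does not list them; the unweighted box average, function names, and grid values are likewise hypothetical simplifications.

```python
# Assumed box limits: ((lat_min, lat_max), (lon_min, lon_max)) in degrees.
WEST_BOX = ((-10.0, 10.0), (50.0, 70.0))
EAST_BOX = ((-10.0, 0.0), (90.0, 110.0))

def box_mean(sst_anom, lat_range, lon_range):
    """Unweighted mean of anomalies at grid points inside a lat/lon box.

    sst_anom: dict mapping (lat, lon) in degrees to an SST anomaly (deg C).
    """
    values = [value for (lat, lon), value in sst_anom.items()
              if lat_range[0] <= lat <= lat_range[1]
              and lon_range[0] <= lon <= lon_range[1]]
    if not values:
        raise ValueError("no grid points fall inside the box")
    return sum(values) / len(values)

def dipole_mode_index(sst_anom):
    """West-minus-east anomaly difference; positive when the west is warmer."""
    return box_mean(sst_anom, *WEST_BOX) - box_mean(sst_anom, *EAST_BOX)

# Toy grid: two western points, one eastern point, one point outside both boxes.
toy_grid = {
    (0.0, 60.0): 0.5,
    (5.0, 65.0): 0.3,
    (-5.0, 100.0): -0.4,
    (20.0, 60.0): 9.0,
}
dmi = dipole_mode_index(toy_grid)
```

Applying the same function to the forecast and observed anomaly fields for each season yields the two time series whose RMS difference serves as the performance measure.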
f. Case studies over different ocean basins
Covering the summer monsoon months, June, July, and August for the year 1998, we illustrate the seasonal SST anomalies in Fig. 12 for the Indian Ocean. Here the observed SST anomalies are compared with those of the multimodel superensemble and for the models with the highest and lowest skills. Clearly the highest anomaly correlations and the RMS skills are for the multimodel superensemble (0.78 and 0.46°C, respectively). Those for the best and worst models are 0.67 (0.41) and 1.64°C (1.26°C), respectively. The summer of 1998 was an above-normal rainfall year for the summer monsoon. That year was characterized by a positive IOD index (i.e., somewhat warmer SST anomalies over the western equatorial Indian Ocean compared to the eastern side). That feature was best predicted during 1998 by the multimodel superensemble. The best model carried an inverse IOD index with the warmer SST anomalies residing near the Sumatra Coast. The model with the poorest skill carried very large errors over the entire Indian Ocean. Even the model with the best overall skill had a very large RMS error reflecting errors in its ability to predict the seasonal mean state of SSTs. Over most years the geographical distributions of the predicted SST anomalies were quite striking.
Another example, one for the 1998 northern summer rainfall season over the Pacific Ocean, is illustrated in Fig. 13. This was a year of below-normal SST for the tropical Pacific Ocean. The multimodel superensemble reasonably predicted the observed negative SST anomalies. The spread of warm SST anomalies over the entire southern Pacific Ocean in the best model and the spread of cold SST anomalies in the model with the lowest skill were noted. Overall the anomaly correlations for the superensemble, the model with the highest and the lowest skills, were 0.86, 0.68, and 0.40 respectively. The corresponding numbers for the RMS errors were 0.47°, 0.91°, and 2.09°C, respectively. The strength of the superensemble is seen clearly in these very large values for the anomaly correlations that are of the order of 0.86.
Figure 14 provides a comparison of the SST anomalies over the tropical Atlantic for the summer season (June–August) of 1998. Here we show the observed SST anomalies and those from the superensemble, the best member model (in terms of RMS skills), and those of the model with the lowest skill. Also indicated in each panel are the anomaly correlations (top of each panel) and the RMS errors (bottom of each panel). This shows clearly that it is possible to raise the anomaly correlations for the superensemble above those of the best models; and the RMS error is also drastically reduced by the superensemble.
g. Comparison with some other current forecast products
A number of research groups periodically report seasonal forecasts of SST and SST anomalies. Several of these are not well suited for inclusion in the atmosphere–ocean coupled modeling superensemble for various reasons. For example, some are ocean-only models driven by prescribed atmosphere forecasting and some do not have an unbroken forecast data record from which training and forecast phases can be constructed. Some are purely statistical formulations. It was noted that several of these are in fact quite promising for the SST forecasts over different ocean basins.
A comparison of the performance of the FSU superensemble with the Markov model over the tropical Pacific Ocean appears in Fig. 15. Here results for three successive boreal summers are compared (years 1999, 2000, and 2001). The top panels in this illustration show the observed SST anomalies, the middle panel shows the superensemble from 13 coupled models, and the bottom panel shows the results from the Markov model. The Markov model’s predicted anomalies lie mostly within 5° of the equator. Anomalies elsewhere were mostly much less than 0.5°C. The larger observed spread of anomalies in the belt 20° latitudes off of the equator was captured by the superensemble for all of these years, though not by the Markov model. There are several specific features where the results from the superensemble appear quite encouraging in these comparisons. In the years 1999 and 2000, south of 10°S and east of 120°W, robust warm anomalies were noted in the observed fields that were predicted extremely well by the superensemble. The cold equatorial SST anomalies of −1.2°C were reasonably predicted by the superensemble, whereas the Markov model predictions were closer to ∼−1°C. Near 80°W, the pattern and amplitudes of the superensemble-based forecasts matched those of the observations quite well.
The observed and predicted anomalies during the summer of 2001 were small (<0.3°C) in magnitude (see Fig. 15, right), and the anomaly correlations for the superensemble were not very large. However, the superensemble still provided results that were somewhat closer to observations than did the Markov model, which carried cold equatorial SST anomalies <−0.3°C in its forecasts. Overall it appears that the superensemble provides somewhat better SST forecasts than the Markov model.
Another well-known experimental model for SST forecasts is one by Barnett et al. (1998, 1999, 2000) that provides HCM forecasts for SST anomalies. Here again we took an arbitrary sequence of three years of their seasonal forecasts to compare with the superensemble. Figure 16 presents observed SST anomaly fields for the fall months September–November for the years 1998, 1999, and 2000. The equatorial SST anomalies were cooler by 1°C or larger during 1998 and 1999, whereas in 2000 they were slightly below normal. The superensemble seasonal forecasts for these seasons are shown in the middle panel. They show a remarkable success in predicting these equatorial SST anomalies. Even warm anomalies of the Southern Hemisphere near 90°W in 1998, near the date line during 1999 and near 170°E during 2000 are reasonably predicted by the superensemble. The HCM (bottom panels) carries somewhat too strong anomalies during 1998, although they are located quite accurately within the basin. This model fails to predict any anomalies during the year 1999. During the year 2000, a few isolated weak pockets of cold SST anomalies near the equator are predicted well by this model. Overall the multimodel superensemble appears to carry a more robust and realistic distribution of SST anomalies for its seasonal forecasts.
The results from the performance of the LIM of Penland et al. (1998, 1999, 2000) were also examined. Figure 17 shows the observed, the superensemble, and the LIM-based seasonal forecasts for the summer months of June–August for the years 1998, 1999, and 2000. During 1998 the Indian Ocean experienced weak warm SST anomalies reasonably predicted by the superensemble. The LIM carried somewhat warmer anomalies. Over the Pacific Ocean, the superensemble described the cold equatorial water and its north–south extent between 120°W and 150°E very well. The superensemble captured the warm lobes around the cold (upwelled) water near the equator reasonably well. These warm cores were somewhat amplified (15°S and 120°W) by the LIM. This model also extended a cold-water anomaly toward the coast of Mexico and the western Gulf of Mexico, features not present in the observations or in the superensemble. Offshore from Peru, there was a warm SST anomaly in the east Pacific observed field that was predicted reasonably by the superensemble but was absent in the LIM forecasts. During 1999 the colder SST anomalies of the Indian Ocean were somewhat reasonably predicted by the superensemble. The LIM carried warmer anomalies on the order of 0.3°C. The pattern of the SST anomaly over the entire tropical Pacific Ocean was very reasonably predicted by the superensemble. The LIM predicted warm anomalies that were somewhat too strong over the western Pacific; however, it accurately captured the cold anomalies of the eastern Pacific Ocean. For the year 2000, the forecasts for June–August were again handled very well by the superensemble. The LIM carried somewhat too warm SST anomalies for its seasonal forecasts over the Indian Ocean and the western Pacific Ocean. The equatorial cold-water anomaly over the eastern half of the Pacific Ocean was handled well by the superensemble, while its cold anomaly was amplified in the LIM.
It is of interest to examine the performance of the proposed methodology during specific El Niño years. Note that in our data length 1989–2001 there were only a few El Niño events. Thus the overall statistics from the training phase of the superensemble carry a mix of El Niño, La Niña, and neutral years. In spite of that, we found that the proposed methodology does show a superior performance for the superensemble. In Figs. 18 and 19 we show seasonal precipitation forecasts for December–February of the 1997/98 season and for March–May of the year 1992. These were two of the major El Niño years in our data record. In these illustrations we show the observed SST anomalies and those predicted from the superensemble, from the model with the highest skill, and from the model that carried the lowest skill. The RMS averages for the predicted SST anomalies are also shown at the top of each panel. The model with the lowest skill overemphasizes the cold envelope around the warm equatorial Pacific SST anomaly for both El Niños. Both of these (best and the lowest skill) models predict too warm an SST anomaly off the coasts of South America. These features are properly corrected by the superensemble. Overall, the superensemble carries a lower RMS error for the SST anomaly forecasts compared to all member models. The forecasts for the 1997/98 season El Niño by the superensemble were exceptional. The warm equatorial waters of the El Niño of 1992 were somewhat underestimated by the superensemble, but the overall spread of errors of the other models still makes the superensemble a superior product.
Overall it appears from these comparisons (with these nonmember models of the superensemble) that the superensemble has a better horizontal distribution for seasonal forecasts over the global tropical oceans than these other current forecast products.
5. Concluding remarks and future work
This study has examined the quality of SST forecasts issued by a superensemble constructed from 13 state-of-the-art coupled atmosphere–ocean models. It was possible to collect forecast datasets from as many as 23 400 seasonal forecast experiments. Both deterministic and probabilistic measures were used to evaluate the skills of seasonal forecasts of SST and SST anomalies. The main result of our analysis is that the superensemble is a powerful approach for the reduction of SST forecast errors. Model biases are sufficiently systematic and large that it was possible to reduce, on the average, the RMS errors from the superensemble for the SST by 41%, and the SST anomaly correlations were increased by as much as 46%. The best results were obtained over the tropical Pacific Ocean where, regardless of El Niño or La Niña years, a marked improvement was noted (Fig. 5) from the construction of the superensemble. The reduction of errors for the Atlantic and the Indian Ocean was also quite large, but the improvements in the anomaly correlations were not as high as those for the Pacific Ocean.
The lack of higher frequencies (time scales less than 10 days) in the oceanic SST observations may have been an important factor for the improved forecasts for the superensemble. The superensemble reduced the systematic errors of the member models collectively. The low-frequency SST changes of the member models carried signatures of robust systematic errors that were somewhat easier to correct as compared to atmospheric superensemble modeling. There were many years when the anomaly correlations (for the SST anomalies) from the superensemble had values of the order of 0.8 or larger. In these seasons the SST spatial structure over the entire tropical Pacific Ocean matched the observations very closely. That, however, is not the case for the best member model for that season (Fig. 7).
This paper also compares the performance of the superensemble with that of some well-known research models that were not part of our member model suite. These included a Markov model, a hybrid coupled model, and a linear inverse model. In each of these cases, an examination of several successive years of performance against observations and against the superensemble showed that the superensemble generally provides somewhat superior seasonal forecasts.
When a seasonal forecast is completed in real time with multiple models, it is not clear a priori which of the models can be relied upon for the best forecast. In that regard, the consistently higher skill of the multimodel superensemble appears easier to accept.
It is possible to carry out such an exercise in a true forecast mode, that is, for the future. The European suite of models may not be available for this purpose. However, it may be possible to include four versions of the FSU coupled models (described in this paper) along with the Australian POAMA model, the NCAR CCM3, and the U.S. National Weather Service’s Climate Forecast System (CFS) model to form a suite of seven models. In the coming year, these will be used to explore the performance of the superensemble for SST forecasts in real time.
Acknowledgments
We gratefully acknowledge the ECMWF for providing observed analysis and seven DEMETER coupled model datasets and BMRC, Australia, for providing the POAMA-1 dataset. The research reported here was supported by NSF Grant ATM-0108741, NOAA Grant NA06GPO512, FSURF Grant 1338-831-45, and NASA Grant NAG5-13563.
REFERENCES
Barnett, T., N. Graham, S. Pazan, W. White, M. Latif, and M. Flügel, 1993: ENSO and ENSO-related predictability. Part I: Prediction of equatorial Pacific sea surface temperature with a hybrid coupled ocean–atmosphere model. J. Climate, 6 , 1545–1566.
Barnett, T., D. Pierce, N. Graham, and M. Latif, 1998: Dynamically based forecasts for tropical Pacific SST through mid 1998 using a hybrid coupled ocean-atmospheric model. Experimental Long-Lead Forecast Bulletin, Vol. 7, No. 2, COLA.
Barnett, T., D. Pierce, N. Graham, and M. Latif, 1999: Dynamically based forecasts for tropical Pacific SST through mid 2000 using a hybrid coupled ocean-atmospheric model. Experimental Long-Lead Forecast Bulletin, Vol. 8, No. 2, COLA.
Barnett, T., D. Pierce, N. Graham, and M. Latif, 2000: Dynamically based forecasts for tropical Pacific SST through mid 1998 using a hybrid coupled ocean-atmospheric model. Experimental Long-Lead Forecast Bulletin, Vol. 9, No. 1, COLA, 19–21.
Barnston, A. G., and C. F. Ropelewski, 1992: Prediction of ENSO episodes using canonical correlation analysis. J. Climate, 5 , 1316–1345.
Barnston, A. G., and T. M. Smith, 1996: Specification and prediction of global surface temperature and precipitation from global SST using CCA. J. Climate, 9 , 2660–2697.
Berliner, L. M., C. K. Wikle, and N. Cressie, 2000: Long-lead prediction of Pacific SSTs via Bayesian dynamic modeling. J. Climate, 13 , 3953–3968.
Chakraborty, A., and T. N. Krishnamurti, 2006: Improved seasonal climate forecasts of the South Asian summer monsoon using a suite of 13 coupled ocean–atmosphere models. Mon. Wea. Rev., 134 , 1697–1721.
Déqué, M., 1997: Ensemble size for numerical seasonal forecasts. Tellus, 49A , 74–78.
Gadgil, S., P. N. Vinayachandran, P. A. Francis, and S. Gadgil, 2004: Extremes of the Indian summer monsoon rainfall, ENSO and equatorial Indian Ocean oscillation. Geophys. Res. Lett., 31 , L12213, doi:10.1029/2004GL019733.
Hasselmann, K., 1988: PIPs and POPs: The reduction of complex dynamical systems using principal interaction and oscillation patterns. J. Geophys. Res., 93 , 11015–11021.
Johnson, S. D., D. S. Battisti, and E. S. Sarachik, 2000: Empirically derived Markov models and prediction of tropical Pacific sea surface temperature anomalies. J. Climate, 13 , 3–17.
Kharin, V. V., and F. W. Zwiers, 2002: Climate predictions with multimodel ensembles. J. Climate, 15 , 793–799.
Krishnamurti, T. N., C. M. Kishtawal, T. E. LaRow, D. R. Bachiochi, Z. Zhang, C. E. Williford, S. Gadgil, and S. Surendran, 1999: Improved weather and seasonal climate forecasts from multimodel superensemble. Science, 285 , 1548–1550.
Krishnamurti, T. N., C. M. Kishtawal, D. W. Shin, and C. E. Williford, 2000a: Multimodel superensemble forecasts for weather and seasonal climate. J. Climate, 13 , 4196–4216.
Krishnamurti, T. N., C. M. Kishtawal, Z. Zhang, T. E. LaRow, D. R. Bachiochi, C. E. Williford, S. Gadgil, and S. Surendran, 2000b: Improving tropical precipitation forecasts from a multianalysis superensemble. J. Climate, 13 , 4217–4227.
Krishnamurti, T. N., and Coauthors, 2001: Real-time multianalysis–multimodel superensemble forecasts of precipitation using TRMM and SSM/I products. Mon. Wea. Rev., 129 , 2861–2883.
Krishnamurti, T. N., L. Stefanova, A. Chakraborty, T. S. V. V. Kumar, S. Cocke, D. Bachiochi, and B. Mackey, 2002: Seasonal forecasts of precipitation anomalies for North American and Asian monsoons. J. Meteor. Soc. Japan, 80 , 1415–1426.
Krishnamurti, T. N., and Coauthors, 2003: Improved skill for the anomaly correlation of geopotential heights at 500 hPa. Mon. Wea. Rev., 131 , 1082–1102.
Krishnamurti, T. N., A. K. Mitra, W-T. Yun, and T. S. V. V. Kumar, 2006: Seasonal climate forecasts of the Asian monsoon using multiple coupled models. Tellus, 58A , 487–507.
Latif, M., and A. Villwock, 1990: Interannual variability as simulated in coupled ocean–atmosphere models. J. Mar. Syst., 1 , 51–60.
Mo, K. C., 2003: Ensemble canonical correlation prediction of surface temperature over the United States. J. Climate, 16 , 1665–1683.
Neelin, J. D., 1990: A hybrid coupled general circulation model for El Niño studies. J. Atmos. Sci., 47 , 674–693.
Palmer, T. N., and Coauthors, 2004: Development of a European Multimodel Ensemble System for Seasonal to Interannual Prediction (DEMETER). Bull. Amer. Meteor. Soc., 85 , 853–872.
Penland, C., 1996: A stochastic model of IndoPacific sea surface temperature anomalies. Physica D, 98 , 534–558.
Penland, C., and T. Magorian, 1993: Prediction of Niño 3 sea surface temperatures using linear inverse modeling. J. Climate, 6 , 1067–1076.
Penland, C., K. Weickmann, C. Smith, and L. Matrosova, 1998: Forecast of tropical SSTs using linear inverse modeling (LIM). Experimental Long-Lead Forecast Bulletin, Vol. 7, No. 2, COLA.
Penland, C., K. Weickmann, C. Smith, and L. Matrosova, 1999: Forecast of tropical SSTs using linear inverse modeling (LIM). Experimental Long-Lead Forecast Bulletin, Vol. 8, No. 2, COLA.
Penland, C., K. Weickmann, C. Smith, and L. Matrosova, 2000: Forecast of tropical SSTs using linear inverse modeling (LIM). Experimental Long-Lead Forecast Bulletin, Vol. 9, No. 2, COLA.
Reynolds, R. W., N. A. Rayner, T. M. Smith, D. C. Stokes, and W. Wang, 2002: An improved in situ and satellite SST analysis for climate. J. Climate, 15 , 1609–1625.
Saji, N. H., B. N. Goswami, P. N. Vinayachandran, and T. Yamagata, 1999: A dipole mode in the tropical Indian Ocean. Nature, 401 , 360–363.
Syu, H-H., J. D. Neelin, and D. Gutzler, 1995: Seasonal and interannual variability in a hybrid coupled GCM. J. Climate, 8 , 2121–2143.
Xie, P., and P. A. Arkin, 1997: A 17-year monthly analysis based on gauge observations, satellite estimates, and numerical model outputs. Bull. Amer. Meteor. Soc., 78 , 2539–2558.
Xue, Y., and M. Ji, 1999: Forecast of tropical Pacific SST using a Markov model. Experimental Long-Lead Forecast Bulletin, Vol. 8, No. 2, COLA.
Xue, Y., and M. Ji, 2000: Forecast of tropical Pacific SST using a Markov model. Experimental Long-Lead Forecast Bulletin, Vol. 9, No. 2, COLA.
Xue, Y., and M. Ji, 2001: Forecast of tropical Pacific SST using a Markov model. Experimental Long-Lead Forecast Bulletin, Vol. 10, No. 2, COLA.
Xue, Y., A. Leetmaa, and M. Ji, 2000: ENSO prediction with Markov models: The impact of sea level. J. Climate, 13 , 849–871.
Yun, W-T., L. Stefanova, A. K. Mitra, T. S. V. V. Kumar, W. Dewar, and T. N. Krishnamurti, 2005: Multimodel synthetic superensemble algorithm for seasonal climate prediction using DEMETER forecasts. Tellus, 57A , 280–289.
Characteristics of the 13 models used in this study.
The definition of ETS and FAR: probabilistic skill scores for categorical forecasts.