## Abstract

Interannual precipitation variability in central-southwest (CSW) Asia has been associated with East Asian jet stream variability and western Pacific tropical convection. However, atmospheric general circulation models (AGCMs) forced by observed sea surface temperature (SST) poorly simulate the region’s interannual precipitation variability. The statistical–dynamical approach uses statistical methods to correct systematic deficiencies in the response of AGCMs to SST forcing. Statistical correction methods linking model-simulated Indo–west Pacific precipitation and observed CSW Asia precipitation result in modest, but statistically significant, cross-validated simulation skill in the northeast part of the domain for the period from 1951 to 1998. The statistical–dynamical method is also applied to recent (winter 1998/99 to 2002/03) multimodel, two-tier December–March precipitation forecasts initiated in October. This period includes 4 yr (winter of 1998/99 to 2001/02) of severe drought. Tercile probability forecasts are produced using ensemble-mean forecasts and forecast error estimates. The statistical–dynamical forecasts show enhanced probability of below-normal precipitation for the four drought years and capture the return to normal conditions in part of the region during the winter of 2002/03.

*May Kabul be without gold, but not without snow.*

—Traditional Afghan proverb

## 1. Introduction

Prediction of climate anomalies on seasonal-to-interannual time scales is practical in regions and seasons where predictable boundary conditions [e.g., land surface properties and sea surface temperature (SST)] lead to predictable changes in seasonal weather statistics (Goddard et al. 2001). Both dynamical and statistical descriptions of the effect of SST anomalies on the climate system have their particular shortcomings. Dynamical models, in particular atmospheric general circulation models (AGCMs), though based on physical laws, are unable to resolve all spatial and temporal scales, and inaccurate parameterizations of unresolved processes such as convection lead to errors in predicting the climate response to SST anomalies. Statistical methods such as regression predict the climate response to SST anomalies based on the historical record. However, the shortness and quality of the climate record limit accuracy. Stationarity of the climate system is a further complicating issue.

Recently dynamical and statistical methods have been combined to compute the climate response to SST forcing (Smith and Livezey 1999; Feddersen et al. 1999; Mo and Straus 2002; Tippett et al. 2003; Widmann et al. 2003). The statistical–dynamical approach is in the spirit of model output statistics (MOS) where systematic errors of the dynamical model are identified and corrected (Glahn and Lowry 1972). Multivariate MOS correction identifies model patterns related to observed patterns and then replaces model patterns with observed ones. The MOS correction may effectively only make small shifts or rotations of model output when model deficiencies are minor. In these cases, *local* model information is sufficient to perform the MOS correction. In other cases AGCM precipitation simulation deficiencies require using other model variables, for instance, geopotential height, in the MOS correction (Landman and Goddard 2002). More severe AGCM errors may result in a complete failure to reproduce particular components of large-scale teleconnection patterns seen in observations. If the AGCM reproduces some part of the large-scale SST responses, MOS corrections that complete missing features may be feasible. In these cases, use of spatially remote model variables and model variables other than the target variable may be required. An example is the use of spatially remote model-simulated precipitation to estimate winter precipitation in the central-southwest (CSW) Asian region (Tippett et al. 2003). The multimodel application of the statistical–dynamical method to wintertime precipitation over the CSW Asian region, with emphasis on forecast performance during the past five winters (1999–2003), is the subject of this paper.

Much of CSW Asia, including parts of Iran, Afghanistan, Turkmenistan, Uzbekistan, Tajikistan, and Pakistan, has a semiarid climate. The region lies beyond the usual reach of the Indian monsoon and receives most of its annual precipitation during winter and early spring in the form of snow along the high elevations of the region (Martyn 1992). This precipitation, associated with eastward-propagating midlatitude cyclones, displays considerable interannual variability. Below-normal precipitation during four consecutive winter seasons (winter of 1998/99 to 2001/02) resulted in the worst drought in 50 yr and had a severe impact on agricultural production and livestock populations (Agrawala et al. 2001; Barlow et al. 2002). An indication of the role of SST forcing in this recent drought is found in the Hoerling and Kumar (2003) modeling study where several AGCMs forced by observed SST reproduced features of the drought in CSW Asia. However, AGCM simulations of the period prior to the recent drought show little skill in simulating CSW Asian seasonal precipitation anomalies, and we must rely heavily upon the observational record to elucidate connections between CSW Asian precipitation and SST.

The classical ENSO response does not include the CSW Asian region (Ropelewski and Halpert 1987, 1989). However, there are some modest indications that ENSO has a positive relation with December–January–February–March (DJFM) precipitation in the northeastern part of the domain and a negative relation in the southwestern part; the correlation of DJFM precipitation with ENSO SST indices has slightly negative values (∼−0.2) over central Iran and positive values (∼0.3) in the northeastern quadrant of the domain. Mason and Goddard (2001) found enhancement of the frequency of above-normal DJF precipitation in southwest (SW) Iran during the eight strongest La Niña events from 1951–52 to 1995–96. A similar analysis (data available online at http://iridl.ldeo.columbia.edu/SOURCES/.IRI/.Analyses/.ENSO-RP/.dataset_documentation.html) of all three-month seasons using the 10 strongest warm and cold events of the same period shows enhanced frequency of above-normal DJF and JFM precipitation in the region around the border of Afghanistan and China during warm events and enhanced frequency of below-normal JFM precipitation in northern Afghanistan during cold events (S. J. Mason 2004, personal communication). Barlow et al. (2002) linked the La Niña episode of 1998–2002 with the severe drought in CSW Asia and found a stronger relation between CSW Asian precipitation and ENSO when ENSO events were stratified according to the strength of their western Pacific anomalies; ENSO events having stronger western Pacific SST anomalies were associated with precipitation patterns similar to those observed during the recent drought period. This suggests that the CSW Asia region is affected not simply by the ENSO phase but by details of the basinwide SST anomaly pattern. This is not unexpected since modeling studies have found the atmospheric circulation to be sensitive to the location of tropical heating (Sardeshmukh and Hoskins 1988; Ting and Sardeshmukh 1993; Hoerling and Kumar 2002; Barsugli and Sardeshmukh 2002).

An observational study by Lau and Boyle (1987) noted different circulation responses to western Pacific/Maritime Continent (95°–135°E) and central Pacific (175°E–140°W) OLR anomalies, finding that Maritime Continent OLR anomalies were more strongly associated with the circulation over Asia than were OLR anomalies in the central Pacific. A dominant feature of the wintertime circulation over Asia is the upper-tropospheric westerly jet stream over subtropical east Asia and the western Pacific, referred to as the East Asian jet stream (EAJS). Using composite analysis, Lau and Boyle (1987) found that enhanced EAJS strength was associated with enhanced Maritime Continent convection. Maritime Continent convection influences the EAJS through the local Hadley circulation (Chang and Lau 1982; Chang and Lum 1985; Lau and Boyle 1987). Enhanced Maritime Continent convection leads to upper-level divergence and southerly flow into the subtropical Northern Hemisphere. The resulting westerly flow near the EAJS exit region, due to the Coriolis effect, intensifies the EAJS. Strength of the EAJS correlates positively with precipitation anomalies in the Maritime Continent and western Pacific regions but appears uncorrelated with ENSO (Yang et al. 2002). CSW Asian precipitation is negatively correlated with the strength of the EAJS (Tippett et al. 2003). A possible explanation for the association between EAJS strength and CSW Asian precipitation is that the dominant mode of variability of observed (reanalysis) upper-level winds indicates that EAJS strengthening is accompanied by a southward shift of the jet maximum and northeasterly flow anomalies over the CSW Asian region (Tippett et al. 2003). The negative correlation between EAJS strength and CSW Asian precipitation reflects the association between anomalous southwesterly flow over the region and enhanced upslope precipitation.

Tippett et al. (2003) found in the ECHAM4.5 AGCM that poor simulation of EAJS variability precluded using upper-level AGCM winds as a predictor for CSW Asian precipitation. This deficiency may be a factor in the generally poor AGCM simulation of CSW Asian precipitation. However, statistical corrections using ECHAM4.5 precipitation in the western Pacific/Maritime Continent region did give statistically significant simulation skill (Tippett et al. 2003). In the present work, we apply this method to the ECHAM4.5 AGCM and four additional AGCMs and make retrospective statistical–dynamical forecasts based on operational two-tier International Research Institute for Climate Prediction (IRI) AGCM forecasts of DJFM precipitation anomalies for the 5 yr (1999–2003); the AGCM forecasts use SST predicted the preceding October. Historical simulation skill is used to estimate forecast uncertainty and produce tercile probability forecasts.

## 2. Data and methods

### a. Observations

DJFM CSW Asian climatological precipitation and its variability, shown in Fig. 1, are closely related to the elevation of the region. These data are taken from the extended New et al. (2000) gridded dataset of monthly precipitation for the period from 1950 to 1998, and a version of these data interpolated to a T42 grid is used to compute skill and statistical corrections. Climatological precipitation follows the principal mountain ranges of the region: the Zagros, Himalaya, Karakorum, and Hindu Kush. Two geographical regions with large climatological precipitation and variability are seen in Fig. 1. One accompanies the Zagros mountain range along the southwest border of Iran with Iraq and the Persian Gulf. Another region of precipitation variability is found where the borders of Afghanistan, Pakistan, and Tajikistan meet in the Hindu Kush mountain range. The correlation between box averages over the southwest (SW: 26.5°–35°N, 45°–56.25°E) and northeast (NE: 35°–45.5°N, 67.5°–73°E) regions is 0.34, suggesting only a weak statistical relation between the precipitation variability of the two regions over the entire period (Tippett et al. 2003); both regions did experience drought during 1970–71 and 1999–2002. The correlation of regional precipitation with ENSO indices has marginal statistical significance with slightly negative values in the SW region and positive values in the NE region.

The Climate Anomaly Monitoring System (CAMS) and OLR Precipitation Index (OPI) precipitation dataset, which includes satellite observations, shows qualitative features of precipitation during the period 1999–2003 in Fig. 2 (Janowiak and Xie 1999). Below-normal precipitation began in DJFM 1999 and continued through 2001. The drought weakened in some northern areas in 2002, and there was a return to normal conditions in northern areas in 2003. However, there are relatively few reporting stations in the region during this period, and the precipitation estimate relies heavily on satellite data, limiting forecast verification to qualitative aspects. News reports and humanitarian aid information support these general features, including the enhanced wet conditions in the northern part of the region during DJFM 2003, where flooding occurred. Station data available during the drought period and with sufficiently long records to compute 30-yr (1961–90) climatologies are shown in Fig. 3. The station data show above-normal precipitation in DJFM 1998 followed by 3 yr of below-normal precipitation. Station precipitation amounts were close to normal in DJFM 2002 and above normal in DJFM 2003.

### b. Model simulations and forecasts

We now examine simulation skill of the AGCMs used to make seasonal forecasts at the IRI: National Centers for Environmental Prediction (NCEP)/Medium-Range Forecast model (MRF9), ECHAM4.5, Center for Ocean–Land–Atmosphere Studies model (COLA), Community Climate Model, version 3.2 (CCM3.2), and National Aeronautics and Space Administration (NASA) Seasonal-to-Interannual Prediction Project (NSIPP)-1 (Livezey et al. 1996; Roeckner et al. 1996; Kinter et al. 1997; Hack et al. 1998; Bacmeister et al. 2000, respectively). Simulation skill is estimated from long ensemble integrations forced by observed SST. Spatial resolution, simulation period, and ensemble size for each model are shown in Table 1. Spatial maps of temporal anomaly correlation of ensemble-mean model simulation and observation (not shown) indicate little simulation skill in the CSW Asian region with few correlations exceeding 0.3. These correlation values are substantially less than the perfect model skill (the expected correlation between an ensemble member and the ensemble mean), which is almost everywhere greater than 0.3 and has domain-averaged values ranging from 0.38 to 0.51 for the various models. The anomaly correlation skill (Fig. 4a) of a multimodel average is mostly larger than that of the individual models. Table 2 shows the number of grid points whose correlation exceeds 0.3 and their average correlation for the individual models.
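As a rough illustration of the perfect model skill quoted above, the following sketch correlates each ensemble member with the mean of the remaining members at a single grid point and averages the result. Excluding the tested member from the mean is a choice made here and is not specified in the text; the function name and the array layout are likewise hypothetical.

```python
import numpy as np


def perfect_model_skill(ensemble):
    """Average correlation of each member with the mean of the others.

    ensemble : array of shape (n_members, n_years) holding seasonal
               anomalies at one grid point (hypothetical layout).
    """
    n_members = ensemble.shape[0]
    total = ensemble.sum(axis=0)
    correlations = []
    for i in range(n_members):
        # Assumption of this sketch: leave member i out of the mean so the
        # correlation is not inflated by the member's own noise.
        others_mean = (total - ensemble[i]) / (n_members - 1)
        correlations.append(np.corrcoef(ensemble[i], others_mean)[0, 1])
    return float(np.mean(correlations))
```

A value well above the correlation of the ensemble mean with observations, as reported here, suggests that the model response to SST is more reproducible than it is realistic.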

In the IRI two-tier real-time seasonal forecasts and net assessment forecasts, SST conditions for the forecast period are first predicted, and then the predicted SST conditions are used as boundary conditions for a set of AGCM integrations (Mason et al. 1999; Goddard et al. 2003). SST predictions are made using a dynamical prediction for the tropical Pacific and a statistical prediction for the tropical Atlantic and Indian Oceans, with damped persistence in the midlatitudes (Mason et al. 1999). For the forecasts considered here, the AGCMs are forced with observed SST until the end of September and with forecast SST for the period October–March. Forecast DJFM seasonal anomalies are computed with respect to the time mean of the given AGCM’s simulations over the period 1969–98. The AGCM ensemble sizes are the same as those listed in Table 1. The availability of the AGCMs in forecast mode varies during the period with only the NCEP, CCM3 and ECHAM4.5 models being available for the entire period; the ECHAM4.5 model forecasts for 1999–2001 were not available in real time. The NSIPP-1 and COLA forecasts were available for DJFM 2002 and 2003.

### c. Correction method

Statistical correction methods have been used to correct model-simulated precipitation anomalies (Smith and Livezey 1999; Feddersen et al. 1999) and seasonal forecasts (Mo and Straus 2002). The fundamental idea of these methods is a multivariate (pattern) regression between model fields and observed anomaly fields. Prior to employing such a multivariate regression, separate principal component analyses (PCAs) of model fields and observations (“prefiltering”) are applied to reduce the number of degrees of freedom and decrease the effects of sampling error. Canonical correlation analysis (CCA) is the multivariate regression method used to identify model fields most highly correlated with observed precipitation anomaly patterns (Barnett and Preisendorfer 1987). The set of CCA correspondences between model and observation patterns is then used to predict observed precipitation anomalies from model outputs.
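The prefiltered CCA described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the code used for the results in this paper: the function names, the dictionary layout, and the use of singular value decompositions to obtain the EOFs are all choices made for the example, and no area weighting or cross validation is included.

```python
import numpy as np


def fit_cca(model, obs, n_model_eofs, n_obs_eofs, n_cca):
    """Fit a PCA-prefiltered CCA. Rows are years, columns are grid points."""
    x_mean, y_mean = model.mean(axis=0), obs.mean(axis=0)
    # PCA prefiltering: SVD of each anomaly matrix, then truncation.
    ux, sx, vxt = np.linalg.svd(model - x_mean, full_matrices=False)
    uy, sy, vyt = np.linalg.svd(obs - y_mean, full_matrices=False)
    ux, sx, vxt = ux[:, :n_model_eofs], sx[:n_model_eofs], vxt[:n_model_eofs]
    uy, sy, vyt = uy[:, :n_obs_eofs], sy[:n_obs_eofs], vyt[:n_obs_eofs]
    # CCA: SVD of the cross-correlation matrix of the normalized PCs.
    a, rho, bt = np.linalg.svd(ux.T @ uy, full_matrices=False)
    a, rho, bt = a[:, :n_cca], rho[:n_cca], bt[:n_cca]
    return {
        "x_mean": x_mean, "y_mean": y_mean,
        "sx": sx, "vxt": vxt, "sy": sy, "vyt": vyt,
        "rho": rho,                  # canonical correlations
        "coupling": (a * rho) @ bt,  # maps model PCs to observation PCs
    }


def predict_cca(fit, model_new):
    """Predict observed fields from new model fields."""
    # Project onto the model EOFs and normalize as in the training PCs.
    px = (model_new - fit["x_mean"]) @ fit["vxt"].T / fit["sx"]
    # Regress onto the observation PCs through the retained CCA modes.
    py = px @ fit["coupling"]
    # Back to physical space; the full field adds the climatology.
    return fit["y_mean"] + (py * fit["sy"]) @ fit["vyt"]
```

Here `model` would hold ensemble-mean predictor fields and `obs` the observed precipitation; the three truncation arguments play the role of the EOF and CCA truncations chosen for each AGCM.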

Previous work showed a relation between observed (reanalysis) variations of the EAJS and observed CSW Asian precipitation with the observed 200-mb wind field being a good predictor of simultaneous observed CSW Asian precipitation (Tippett et al. 2003). However, examination of the ECHAM4.5- and NSIPP-1-simulated wind fields shows different interannual variability than that of observed winds and little relation with observed CSW Asian precipitation; the AGCM-simulated winds are more highly correlated with ENSO than are observed (reanalysis) winds. Wind fields from the other AGCMs were not available. Since western Pacific upper-atmospheric heating is related to EAJS variability, it is reasonable that it might be directly related to CSW Asian precipitation. Therefore the statistical correction is made using ensemble-mean model precipitation in the region 20°S–20°N, 100°E–130°W as the predictor; no effort was made to optimize the predictor domain, and the same domain was used for all AGCMs. Figure 5 shows the observation and AGCM patterns and time series of the first CCA mode for the NSIPP-1 AGCM; the patterns and time series are calculated without cross validation. Positive precipitation anomalies in CSW Asia are associated with an AGCM precipitation pattern showing an eastward shift of precipitation along the equator. The correlations of the observation and model pattern time series with the Niño-3.4 index are 0.4 and 0.68, respectively; the correlation of the AGCM and observation time series is 0.67. Patterns and time series for the other models are similar.

To estimate the cross-validated skill of the CCA model, three consecutive years are selected and omitted from the calculation of the climatology, anomalies, and CCA prediction model (Michaelsen 1987). The CCA model is then used to predict the observed anomaly for the middle withheld year, and the full field is formed by adding the climatology. A Monte Carlo estimate of the statistical significance of the correlations is made by ranking the correlations of 50 000 random permutations in time of the observed data fields with correctly ordered observation fields (Livezey and Chen 1983). EOF and CCA truncations (Table 3) were chosen to maximize the sum of the cross-validated correlations exceeding 0.3 in the simulation skill estimates. The relatively low dimension of the statistical correction lessens the risk of the CCA overfitting the data. The cross-validated anomaly correlation of the multimodel average of corrected simulations (Fig. 4b) is very similar in spatial extent and magnitude to that of the individual models. Only correlations above the 95% significance level are plotted; also not plotted are negative correlations resulting from the negative bias associated with cross validation in areas without skill (Barnston and Van den Dool 1993). Correction skill is limited to the northern part of the region from Turkmenistan west through Uzbekistan, northern Afghanistan and Pakistan, Tajikistan, and Kyrgyzstan. The number of grid points whose correlation exceeds 0.3 and their average correlation are given in Table 2.
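The leave-three-out procedure and the permutation test can be sketched as below. To keep the example short, a univariate linear regression stands in for the CCA model, which is a substitution made here and not the method of this paper. The handling of the first and last years (a truncated withheld window) and the one-sided form of the test are also assumptions, as are all function names.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def leave_three_out_predictions(predictor, observed):
    """Predict each year with it and its two neighbors withheld."""
    n = len(observed)
    predictions = []
    for t in range(n):
        withheld = {t - 1, t, t + 1}  # truncated at the ends (assumption)
        train = [i for i in range(n) if i not in withheld]
        x = [predictor[i] for i in train]
        y = [observed[i] for i in train]
        # Climatology and regression use the training years only.
        x_mean, y_mean = sum(x) / len(x), sum(y) / len(y)
        slope = (sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
                 / sum((a - x_mean) ** 2 for a in x))
        # Predict the middle withheld year and add back the climatology.
        predictions.append(y_mean + slope * (predictor[t] - x_mean))
    return predictions


def permutation_p_value(predictions, observed, n_perm=2000, seed=0):
    """Fraction of time-shuffled observations that match the actual skill."""
    rng = random.Random(seed)
    actual = pearson(predictions, observed)
    shuffled = list(observed)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(predictions, shuffled) >= actual:
            count += 1
    return actual, count / n_perm
```

A correlation would be plotted only if the returned fraction is below 0.05, corresponding to the 95% significance level.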

### d. Estimation of tercile probabilities

Probabilistic seasonal forecasts provide a means of quantifying and communicating forecast uncertainty and ideally should consist of the probability distribution function (PDF) of future climate conditions given the present climate state. Then the probability of a particular event, for instance, the probability of precipitation exceeding a given amount, can be computed. Here our goal is a tercile probability forecast, that is, the probabilities of the observed precipitation falling into the equally likely below-normal, normal, and above-normal categories.

Ensembles of AGCM forecasts can be used directly to make a simple nonparametric estimate of tercile probabilities by calculating the fraction of ensemble members falling into each of the tercile categories. However, model error and small ensemble size limit the direct use of the ensemble distribution (Rajagopalan et al. 2002; Kharin and Zwiers 2003). Here we use the historical record directly to estimate forecast uncertainty and assume normally distributed forecast error. Details are shown in the appendix. A deficiency of our implementation is that forecast uncertainty is estimated from AGCM simulations forced with observed SST and does not include the contribution from SST forecast error to the precipitation forecast uncertainty. The effect of SST forecast error could be quantified from analysis of hindcasts of ensembles of AGCMs forced with imperfect SST comparable to that used in the two-tier system. In Tippett et al. (2003) the statistical correction was developed from a set of hindcasts using the ECHAM4.5 AGCM forced by persisted SST anomalies. However, such hindcasts are computationally expensive and unavailable for most of the models here. Therefore we expect some underestimation of forecast uncertainty.
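For comparison with the parametric approach adopted here, the simple nonparametric estimate mentioned at the start of this paragraph amounts to counting ensemble members in each category. The sketch below assumes the tercile boundaries `lower` and `upper` have already been computed from a climatological record; the function and argument names are hypothetical.

```python
def ensemble_tercile_fractions(members, lower, upper):
    """Fraction of ensemble members in each tercile category.

    members : forecast anomalies, one per ensemble member
    lower   : boundary between below-normal and normal (assumed given)
    upper   : boundary between normal and above-normal (assumed given)
    """
    n = len(members)
    below = sum(1 for m in members if m < lower)
    above = sum(1 for m in members if m > upper)
    normal = n - below - above
    return (below / n, normal / n, above / n)
```

With only a handful of members the resulting probabilities can take only a few discrete values, which is one reason a small ensemble limits this estimate.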

Tercile probabilities are computed for the ensemble-mean simulations and their corrections, and domain-averaged ranked probability skill scores (RPSSs) are shown in Table 4. The RPSS is a skill metric for forecasts of categorical probabilities (Epstein 1969). An RPSS of 100% is only obtained by consistently forecasting the observed category with probability 100%; a climatological forecast with equally likely tercile categories receives a score of 0%. The spatial distribution (not shown) of RPSS is similar to that of anomaly correlation. The RPSS of the parametric probabilities based on uncorrected ensemble-mean simulations is mostly zero; the RPSS is mostly negative when the fraction of ensemble members in each category is used to compute tercile probabilities. The domain-averaged RPSS of the parametric probabilities based on corrected ensemble-mean simulations is higher, with values ranging from 2% to 10% in skillful areas.
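The RPSS can be illustrated with a short sketch using the standard definition: the ranked probability score sums squared differences between cumulative forecast and cumulative observed probabilities, and the skill score compares it with that of a climatological forecast. The function names and the encoding of categories as 0 (below), 1 (normal), and 2 (above) are assumptions of the example.

```python
def rps(probs, obs_category):
    """Ranked probability score of one categorical forecast."""
    cum_forecast = 0.0
    cum_observed = 0.0
    score = 0.0
    for k, p in enumerate(probs):
        cum_forecast += p
        cum_observed += 1.0 if k == obs_category else 0.0
        score += (cum_forecast - cum_observed) ** 2
    return score


def rpss(forecasts, observations):
    """RPSS in percent relative to equal-odds tercile climatology."""
    climatology = (1 / 3, 1 / 3, 1 / 3)
    rps_forecast = sum(rps(p, o) for p, o in zip(forecasts, observations))
    rps_climate = sum(rps(climatology, o) for o in observations)
    return 100.0 * (1.0 - rps_forecast / rps_climate)
```

Forecasting the observed category with certainty every time gives 100%, climatological forecasts give 0%, and confident forecasts of the wrong category give negative values.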

## 3. Forecast results

We now examine retrospective probability forecasts for DJFM 1999 through 2003, comparing probability forecasts obtained using the AGCM forecast precipitation over CSW Asia and those obtained using statistical–dynamical forecasts based on ensemble-mean model precipitation over the western Pacific. The forecast period is independent of the period used to compute the model corrections. The AGCMs are forced with SST that is forecast in the October preceding the DJFM season as described in section 2b.

During the drought years, La Niña conditions prevailed. These cool conditions were also to some extent present but with weaker amplitude in the forecast SST. In DJFM 1999 and DJFM 2000, the forecast SST was too cool near the South American coast and did not capture the observed westward extension of cool conditions; however, warm SST in the Maritime Continent region was correctly forecast. The worst error existed in the SST forecast for DJFM 2001, which failed to capture either the cool conditions in the central Pacific or the warm conditions in the Maritime Continent region. Tercile probabilities from the uncorrected AGCM output (Figs. 6a,c,e) show enhanced likelihood of above-normal precipitation along the southwest border of Iran and in the region northeast of Afghanistan for each of these three drought years. In contrast, the corrected tercile forecasts (Figs. 6b,d,f) show enhanced likelihood of below-normal precipitation in the region where there is skill, though in DJFM 2001 the shift toward below normal is weaker than in the previous two forecasts, perhaps due to the weakness of the SST forcing.

Warm SST anomalies in the central-western Pacific, Maritime Continent, and Indian Ocean were observed in DJFM 2002. However, only very modest warm anomalies in the central and western Pacific were forecast. All AGCMs indicated wet anomalies in CSW Asia except the COLA model, which showed negative precipitation anomalies in the northern part of the domain. The uncorrected AGCM tercile probabilities indicate enhanced likelihood of above-normal precipitation (Fig. 6g). However, the patterns and even sign of the precipitation anomalies in the individual corrected AGCMs varied considerably, perhaps as a result of the weakness of the SST forcing. The resulting tercile forecast reflects the lack of consensus and shows a slight shift to dry conditions in most of the region with a slight shift toward wet in the northeast of Afghanistan and Tajikistan (Fig. 6h). While drought continued in many regions, drought conditions began to ease in March and April in the northeast, consistent with the station data shown in Fig. 3.

Warm SST anomalies were observed in DJFM 2003 across the central Pacific (a weak-to-moderate El Niño was beginning to decay), through the Maritime Continent, and into the Indian Ocean. Forecast SST captured only the warm Pacific SST. AGCM forecast anomalies and tercile probabilities indicated wet conditions, much as they did during the drought (Fig. 6i). The statistical–dynamical forecasts are uniformly wet across models, and the tercile probabilities are shifted to the above-normal category (Fig. 6j). Above-normal precipitation was observed in the northern half of the region.

## 4. Summary and conclusions

Statistical–dynamical seasonal forecasts use statistical methods to correct systematic deficiencies in the response of atmospheric general circulation models (AGCMs) to predicted sea surface temperature (SST). The statistical correction is constructed here using canonical correlation analysis (CCA) between AGCM simulations forced by observed SST and the corresponding observations. This CCA is then applied to AGCM forecasts forced by predicted SST in a two-tier prediction system, outside of the training period. Simulation performance provides an estimate of forecast uncertainty that can be used to construct a parametric forecast probability density function and compute categorical probabilities.

We applied this method to winter [December–January–February–March (DJFM)] precipitation in central-southwest (CSW) Asia. Observational studies relate the region’s precipitation to tropical SST forcing with some success. Cold events are associated with modest enhanced frequencies of below-normal precipitation in the northeastern part of the region and above-normal precipitation in the southwestern part. There are some indications that details of the Pacific SST beyond the ENSO phase have a role in the climate of the region (Barlow et al. 2002). However, AGCMs forced with observed sea surface temperatures simulate poorly the region’s interannual variability. Observational evidence suggests that the CSW Asian response pattern is part of a large-scale pattern that includes the East Asian jet stream and ocean–atmosphere processes around the Maritime Continent. AGCMs do simulate precipitation variability in the Maritime Continent region reasonably well, and we base the statistical correction on ensemble-mean model precipitation over the Maritime Continent region. This approach was previously used for the ECHAM4.5 AGCM using observed and persisted SST (Tippett et al. 2003). Here we have applied the method to ECHAM4.5 and four additional AGCMs presently used at IRI for seasonal forecasting. We find that the correction of simulations forced by observed SST results in significant cross-validated skill in the northeastern part of the domain.

We also applied the statistical–dynamical method to two-tier AGCM forecasts of the DJFM season made during the period 1999–2003; the AGCMs are forced with SST forecasts made the previous October. This period is independent of that used for developing the correction statistics and includes a severe multiyear (1999–2002) drought. Tercile probability forecasts were constructed using simulation skill to estimate forecast uncertainty. While the SST forecasts had errors, they usually had the correct anomaly sign in much of the critical region in the western tropical Pacific. In spite of these SST errors, the statistical–dynamical forecasts capture some of the general features of the 1999–2003 period. The statistical–dynamical forecast probabilities show enhanced likelihood of below-normal precipitation during the drought years and enhanced likelihood of above-normal precipitation for DJFM 2003 when the northern part of the region experienced normal and above-normal precipitation, including flooding.

The statistical nature of this approach leads to the question of whether there is a benefit to using AGCMs or whether a purely statistical forecast using only the forecast SST would perform as well. We believe that statistical–dynamical approaches are potentially superior to purely statistical ones since AGCMs have the potential to produce nonlinear responses to SST forcing. However, here nonlinear effects seem small since the estimated skill (Fig. 7; Table 2) of a purely statistical CCA scheme using simultaneous (DJFM) observed SST as a predictor is only slightly less than that of the corrected AGCMs. Despite a small skill difference, however, we can conclude that the statistical–dynamical method permits AGCMs to achieve skill levels comparable with, if not better than, purely statistical methods. Additionally, since the detailed characteristics of each AGCM are different, using a multimodel approach improves the robustness of the forecast.

## Acknowledgments

We thank Xiaofeng Gong (IRI) for making retrospective forecasts with the ECHAM4.5 model and all the individuals and institutions who make their model data available to the IRI. Mark Lindeman and Maria Anulacion of the Production Estimates and Crop Assessment Division, USDA, kindly provided station data and precipitation estimates. We thank Benno Blumenthal for the IRI Data Library. Comments, suggestions, and corrections from Simon Mason, Marty Hoerling, and two anonymous reviewers improved the quality of this paper. IRI is supported by its sponsors and NOAA Office of Global Programs Grant NA07GP0213.


### APPENDIX

#### Estimating Tercile Probabilities

In this appendix we describe a method to convert single deterministic forecasts into forecasts of below-normal, normal, and above-normal tercile probabilities. Here the single deterministic forecasts are either ensemble-mean forecasts or MOS-corrected ensemble-mean forecasts. Pan and Van den Dool (1998) used contingency tables based on historical forecast verification data to forecast tercile probabilities. In their method a deterministic forecast for a single category is replaced with the conditional (given the category forecast) frequency of the various categories over the historical record. However, this method does not distinguish between cases in the same category and tends to be unreliable without large samples. Mason and Mimmack (2002) converted multilinear regression forecasts into category probabilities using prediction intervals. The approach followed here is similar and is equivalent to estimating forecast uncertainty by the standard error of a regression fit, neglecting the effect of sample size.

When the correlation *r* between an observed anomaly *O* and a forecast anomaly *F* is positive, *r* can be used to define a positive linear regression coefficient *α* = *r*(〈*O*^{2}〉/〈*F*^{2}〉)^{1/2}, where the angle brackets denote a time average, so that

$$O = \alpha (F + E), \tag{A1}$$

where the forecast error *E* is assumed independent of the forecast *F* and is stationary. Using (A1), the correlation *r* can be expressed as

$$r = \frac{\langle OF \rangle}{\sqrt{\langle O^{2} \rangle \langle F^{2} \rangle}} = \frac{\sigma_F}{\sqrt{\sigma_F^{2} + \sigma_E^{2}}}, \tag{A2}$$

where *σ*^{2}_{F} ≡ 〈*F*^{2}〉 is the forecast variance, *σ*^{2}_{E} ≡ 〈*E*^{2}〉 is the forecast error variance, and the assumption that the forecast and forecast error are independent implies 〈*FE*〉 = 0. The correlation *r* is determined by the relative sizes of the forecast and forecast error variance. Conversely, given the correlation and forecast variance, the forecast error variance is found from (A2) to be

$$\sigma_E^{2} = \sigma_F^{2} \left( \frac{1}{r^{2}} - 1 \right). \tag{A3}$$

If we further assume that the forecast error *E* is Gaussian, then *O* is Gaussian with mean *αF* and variance *α*^{2}*σ*^{2}_{E}, and the climatological pdf is Gaussian with mean zero and variance *α*^{2}(*σ*^{2}_{F} + *σ*^{2}_{E}). The tercile probabilities associated with a forecast anomaly *F* are found by integrating the observation pdf between the climatological terciles. A direct calculation gives

$$\begin{aligned}
P(B|F) &= \frac{1}{2}\left[1 - \operatorname{erf}\left(\frac{x_b + F}{\sqrt{2}\,\sigma_E}\right)\right], \\
P(N|F) &= \frac{1}{2}\left[\operatorname{erf}\left(\frac{x_b + F}{\sqrt{2}\,\sigma_E}\right) + \operatorname{erf}\left(\frac{x_b - F}{\sqrt{2}\,\sigma_E}\right)\right], \\
P(A|F) &= \frac{1}{2}\left[1 - \operatorname{erf}\left(\frac{x_b - F}{\sqrt{2}\,\sigma_E}\right)\right],
\end{aligned} \tag{A4}$$

where *P*(*B*|*F*), *P*(*N*|*F*), and *P*(*A*|*F*) are, respectively, the probabilities of the below-normal, normal, and above-normal categories given the forecast, erf(·) is the error function, and the terciles of the climatological pdf lie at ±*αx*_{b}, with *x*_{b} ≈ 0.43(*σ*^{2}_{F} + *σ*^{2}_{E})^{1/2}. The category probabilities are independent of the regression coefficient *α*. In summary, (A4), along with the specification of the forecast *F*, the historical forecast variance *σ*^{2}_{F}, and the correlation *r*, completely determines the forecast tercile probabilities where there is a positive correlation between forecasts and observations. Forecasts with nonpositive correlations are effectively taken to have error variance *σ*^{2}_{E} that is unbounded [consistent with *r* → 0 in (A3)], leading to forecast tercile probabilities with climatological values. Equally likely probabilities of the categories are always forecast where there is no forecast skill.

The correlation *r* and forecast variance *σ*^{2}_{F} are computed at each grid point for uncorrected simulations and for the leave-three-out cross-validated statistical–dynamical simulations. The error variance *σ*^{2}_{E} at each grid point is then obtained from (A3) and the tercile probabilities from (A4). Probabilities from different AGCMs are averaged. This averaging treats the probability forecasts as being equally likely and has the undesirable effect of diluting the impact of apparently more skillful AGCMs.
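The calculation at a single grid point can be sketched as follows. This is an illustration under the assumptions stated in this appendix (independent, Gaussian forecast error), not the code used for the forecasts shown here. It takes the error variance as *σ*^{2}_{E} = *σ*^{2}_{F}(1/*r*^{2} − 1), which follows from the independence assumption, and it returns climatological probabilities for nonpositive correlations. The function name and the rejection of *r* ≥ 1 are choices made for the example.

```python
import math
from statistics import NormalDist


def tercile_probabilities(forecast, r, sigma_f):
    """Return (P_below, P_normal, P_above) given a forecast anomaly.

    forecast : forecast anomaly F
    r        : historical correlation of forecasts with observations
    sigma_f  : historical standard deviation of the forecasts
    """
    # No skill: unbounded error variance, so forecast climatology.
    if r <= 0.0 or sigma_f <= 0.0:
        return (1 / 3, 1 / 3, 1 / 3)
    if r >= 1.0:
        raise ValueError("r must be below 1 for a finite error model")
    # Error standard deviation implied by the correlation.
    sigma_e = sigma_f * math.sqrt(1.0 / r**2 - 1.0)
    # Upper tercile of a standard normal is about 0.43.
    z_tercile = NormalDist().inv_cdf(2 / 3)
    x_b = z_tercile * math.sqrt(sigma_f**2 + sigma_e**2)
    scale = math.sqrt(2.0) * sigma_e
    p_below = 0.5 * (1.0 - math.erf((x_b + forecast) / scale))
    p_above = 0.5 * (1.0 - math.erf((x_b - forecast) / scale))
    return (p_below, 1.0 - p_below - p_above, p_above)
```

Averaging the tuples returned for each AGCM at a grid point would then give the multimodel probabilities described above.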

## Footnotes

*Corresponding author address:* Dr. Michael K. Tippett, IRI, Earth Institute at Columbia University, 61 Route 9W, Palisades, NY 10964-8000. Email: tippet@iri.columbia.edu