Drought and flooding are recurrent and serious problems in the U.S. Affiliated Pacific Islands (USAPI). Given the agricultural and water-dependent characteristics of the USAPI economies, accurate forecasts of seasonal to interseasonal rainfall variations have the potential to provide important information for decision makers involved in resource management issues and response strategies related to drought and flood events.
The climatology of the rainfall and outgoing longwave radiation (OLR) cycles in the USAPI and the response of OLR to the El Niño–Southern Oscillation (ENSO) are addressed. Boxplot and harmonic analyses indicate that the annual cycles in rainfall and OLR are generally strong in the USAPI except at stations close to the equator. The northern USAPI have positive (negative) OLR anomalies during El Niño (La Niña) winters.
Two statistical models, canonical correlation analysis (CCA) and a relatively new method called multivariate Principal Component Regression (PCR), are employed to forecast rainfall variations at 10 USAPI stations. Sea surface temperatures (SSTs) in the Pacific Ocean are used as predictors for both models. The results of this study indicate that both models are potentially useful in predicting seasonal rainfall variations in the USAPI region, especially in winter (DJF) and spring (MAM). CCA cross validation shows that at one- and two-season leads JFM is the most accurately forecast period at the northern USAPI stations, with average skills of 0.53 and 0.41, respectively. However, the authors’ analysis indicates lower predictive skill in summer (JJA) and fall (SON). One reason might be the so-called spring barrier in predictive skill in the tropical ocean–atmosphere system. Another reason might be the tropical cyclone activity during these seasons. Predictions using the PCR model yield similar predictive skill. Though simpler than He and Barnston’s model in terms of the number of predictor variables used, the authors’ CCA and PCR provide comparable skills.
The U.S. Affiliated Pacific Islands (USAPI) comprise former territories administered under a United Nations trusteeship [now the Federated States of Micronesia (FSM), Republic of Palau, the Commonwealth of Northern Mariana Islands, and the Republic of the Marshall Islands] as well as older U.S. territories (the Territory of Guam, Wake Island, and American Samoa). All but American Samoa lie in the Northern Hemisphere (Fig. 1).
Drought and flooding are recurrent and serious problems in the Pacific islands. Deficient rainfall can lead to water resource management problems, such as strict water rationing and expensive importation of potable water, as well as deleterious impacts on rain-fed agriculture and increased risk of wildfire hazard. On the other hand, flooding can damage civil infrastructure such as bridges, roads, and housing, as well as destroy agricultural crops and fields. Accurate forecasts of seasonal rainfall variation with sufficient lead times have the potential to provide important information for decision makers involved in resource management issues and response strategies related to drought and flood events in the Pacific islands.
A Pacific ENSO Applications Center (PEAC) was established in early 1994 as a partnership among the University of Guam, the University of Hawaii, NOAA’s National Weather Service and Office of Global Programs, and the U.S. affiliated island governments of the Pacific region, through the Pacific Basin Development Council (PBDC). The main purpose of PEAC is to develop and provide tools to assist government officials and other parties interested in changes in local climate and impacts within the Pacific region arising from the El Niño–Southern Oscillation (ENSO) cycle. Tools developed by the center support decision-making processes undertaken in the Pacific region for water resource management, fisheries management, agriculture, natural disaster mitigation strategies, power utilities, coastal zone management, and other climate-sensitive sectors. We determined that understanding the nature of rainfall and its predictability is essential to serving our clients.
Studies over the past 30 yr have demonstrated that ENSO has a significant impact on the climate variability in the Pacific islands. Bjerknes’s (1966, 1969) pioneering studies indicated that tropical climate was strongly influenced by ENSO episodes. Later studies (Meisner 1976; Wright 1979; Ropelewski and Halpert 1987; Chu 1995) have supported the results of Bjerknes. Lau’s (1985) global climate model experiments indicate that much of the atmospheric response to ENSO is associated with the changes in SSTs in the Pacific. Pacific SSTs can thus be used to forecast regional climate fluctuations, especially in the tropical Pacific area.
We decided to use two statistical techniques for our application. The first technique is canonical correlation analysis (CCA), a multivariate statistical model that computes linear combinations of a set of predictors that maximize relationships, in a least squares error sense, to similarly calculated linear combinations of a set of predictands. In other words, CCA is used to find linear combinations of two datasets that are most highly correlated. Given this characteristic of CCA and the high correlations between climate variations and ENSO (or SSTs), CCA modeling using Pacific SSTs as predictors certainly provides good potential in predicting seasonal to perhaps interannual climate variations (e.g., rainfall fluctuations) for the USAPI. The second technique that we use is multivariate Principal Component Regression (PCR). It is mathematically simpler than CCA. One of our interests is in comparing performance of the two techniques.
Recently, He and Barnston (1996) used the CCA method to forecast rainfall in the tropical Pacific islands at various lead times. He and Barnston (hereafter referred to as HB) noted a moderate skill for certain islands in the Northwestern Pacific, particularly during late northern winter. Our current study complements the HB analysis by further investigating the seasonality of rainfall predictability in the USAPI. Climatology of the rainfall cycle in the USAPI and the response of rainfall and outgoing longwave radiation (OLR) to ENSO, which are not addressed by HB, will be examined. Furthermore, whereas HB used quasi-global SSTs, Northern Hemisphere 700-mb height, and the past history of rainfall itself in forecasting island rainfall, we have focused on SSTs as the sole predictors. We base this approach on the close proximity of the USAPI to the equatorial center of atmospheric response to ENSO (Ropelewski and Halpert 1987).
Section 2 describes the data, and section 3 discusses climatology and rainfall/OLR variations related to ENSO. The analysis methods and CCA and PCR models are described in section 4. Section 5 discusses the results of cross-validation skills, along with an examination of seasonality of rainfall predictability. Finally, section 6 summarizes the study.
Monthly rainfall data for 10 major USAPI stations (Koror, Yap, Guam WSO, Andersen Air Force Base in Guam, Chuuk, Pohnpei, Wake, Kwajalein, Majuro, and Pago Pago) in the Pacific islands are selected as predictands. Locations of selected long-term USAPI rainfall stations are shown in Fig. 2 and a table of geographical details for selected stations is given in Table 1. The rainfall data are for a period of 38 yr (1957–94). Data were obtained from the Western Regional Climate Center (WRCC) in Reno, Nevada.
The SST data used in this analysis are a combination of two datasets. The Global Ocean and Global Atmosphere (GOGA) data (4.5° lat × 7.5° long), derived from the Comprehensive Ocean–Atmosphere Data Set (COADS; Slutz et al. 1985), are used in the first part from 1956 to 1988. For the later part of SST data (1989–94), we use the Climate Prediction Center (CPC) blended SST data, which were derived from a blend of in situ data, AVHRR satellite data, and ice data (Reynolds 1988). The SST data cover most of the Pacific Ocean from 100°E to 72°W, and 55°S to 55°N. The SST data are then averaged into 10° lat × 24° long boxes, as in Joseph et al. (1991). There is a total of 80 boxes, or predictor elements, for the forecasts because 8 boxes do not have data.
We also used OLR, which is a commonly used proxy for rainfall in the oceanic Tropics where stations are scarce and the rainfall is primarily convective (Lau and Chan 1983a,b; Weickmann 1983; Murakami and Nakazawa 1985; Yoo and Carton 1990; Lyons 1991). Although Morrissey (1986) and Lyons (1991) demonstrated that OLR can be misleading in certain climate regimes, the USAPI is generally free of that problem. An inverse relation between OLR and rainfall in the Tropics generally holds. This can be further seen in the high negative correlation coefficient between the monthly rainfall series and the OLR series (at the nearest corresponding box) at each USAPI station (Table 2). As indicated in Table 2, a correlation analysis of the 19-yr USAPI rainfall and OLR monthly series shows strong correlation coefficients ranging from −0.60 to −0.78. All correlation coefficients are significant at the 99% level when the serial correlation is taken into account.
The Northern Hemisphere USAPI lie within the monsoon region as defined by Ramage (1971). The rainy seasons coincide with the annual northward march and then southward retreat of the surface monsoon trough (Sadler et al. 1987). The cloudiness and precipitation maximum associated with the westerlies equatorward of the trough constitute the intertropical convergence zone (ITCZ) often described in the literature (Sadler 1975). An additional warm season feature is the Tropical Upper-Tropospheric Trough (TUTT), which serves as a primary source of warm season disturbances and some typhoons near Wake Island (Sadler 1978). Wake Island lies well northeast of the maximum northward position of the monsoon trough and experiences no ITCZ rains. During the winter monsoon, dry northeasterly winds blow at all Northern Hemisphere USAPI; cold fronts reach Wake Island.
American Samoa lies eastward of the Southern Hemisphere monsoon region as defined by Ramage (1971) and Webster (1987). Climatologically, the monsoon trough during austral summer is located near 12°S, just to the north of Australia, extending eastward from about 150°E to 170°E (Sadler et al. 1987). However, eastward advances of the monsoon trough can sometimes envelop Samoa, a feature evident during the 1982–83 ENSO episode (Sadler 1983). This trough, together with the South Pacific convergence zone (SPCZ), which is aligned northwest to southeast from about 170°E to 150°W (Vincent 1994), is the source of the cloudiness. In the austral cool season a cloudiness maximum remains near Samoa. This maximum forms in the convergence between two high pressure cells, one in the extreme southeast Pacific and a second immediately east of Australia.
Figure 3 shows boxplots of monthly rainfall data in selected USAPI stations. The annual cycle appears strong in the Northern Hemisphere USAPI stations with maximum median rainfall in ASO (August–October) and minimum in FMA (February–April). The annual cycles for individual stations vary slightly and maximum rainfalls (medians) occur at times ranging from early June to October, depending on the locations of individual stations relative to the seasonal, meridional movement of the monsoon trough. Thus, rainfall maxima occur in July/August in low latitude stations such as Koror, Yap, Pohnpei, Chuuk, and Majuro while they occur in September/October in higher latitude stations such as Guam WSO and Kwajalein. In Wake, however, maximum rainfall is usually associated with the TUTT as discussed earlier. Low rainfall values in spring are associated with the dry east Asian winter monsoon. In Pago Pago, maximum rainfall occurs in January when the SPCZ is very strong, and the minimum is in September/October when the SPCZ is weak.
The OLR boxplots at the corresponding locations (Fig. 4) show similar characteristics. Most Northern Hemisphere stations appear to have minimum OLR (or maximum convection) in ASO and maximum OLR (or minimum convection) in FMA.
To evaluate the importance of the annual cycle, harmonic analysis is performed on the long-term monthly mean rainfall and OLR series. The first harmonic, which represents the annual cycle, explains a substantial variance of the rainfall/OLR variability in the monsoonal regions (Fig. 5 and Table 3). Maximum rainfalls (or minimum OLRs) occur from August to October for northern USAPI stations (Table 3). The first harmonic explains most of the variance (at least 53% of total variance for rainfall and 73% for OLR), especially for Guam (92%, 90%), Pago Pago (90%, 99%), and Yap (85%, 87%). The annual cycle is relatively weak (though still explaining over 53% of the variance for rainfall) at stations near the equator such as Koror, Chuuk, and Pohnpei. These stations receive monsoon trough rains twice a year as the trough crosses on its northward advance and southward retreat. A second harmonic (Table 4), which represents the semiannual cycle, adds significantly to the variance explained at Koror, Pohnpei, and Wake.
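As a minimal sketch of the harmonic analysis described above (not the authors' code), the following NumPy function estimates the fraction of variance in a 12-month mean cycle that a given harmonic explains, along with the position of that harmonic's maximum. The function name `harmonic_fraction` and the synthetic input cycle are illustrative assumptions; no station data are used.

```python
import numpy as np

def harmonic_fraction(monthly_means, k=1):
    """Return (variance fraction, peak position) for harmonic k of a mean annual cycle.

    monthly_means: 12 long-term monthly means (hypothetical input, index 0 = January).
    """
    y = np.asarray(monthly_means, dtype=float)
    n = y.size
    t = np.arange(n)
    anomalies = y - y.mean()
    angle = 2.0 * np.pi * k * t / n
    # Fourier coefficients of the k-th harmonic (valid for 0 < k < n/2).
    a = 2.0 / n * np.sum(anomalies * np.cos(angle))
    b = 2.0 / n * np.sum(anomalies * np.sin(angle))
    # A sinusoid of amplitude C has variance C**2 / 2.
    fraction = 0.5 * (a**2 + b**2) / np.mean(anomalies**2)
    # Position (in months from index 0) of the first maximum of this harmonic.
    peak = (np.arctan2(b, a) * n / (2.0 * np.pi * k)) % (n / k)
    return fraction, peak

# Synthetic cycle (not station data): an annual wave peaking at index 8
# plus a weaker semiannual wave.
months = np.arange(12)
synthetic = (300
             + 100 * np.cos(2 * np.pi * (months - 8) / 12)
             + 50 * np.cos(4 * np.pi * months / 12))
first_fraction, first_peak = harmonic_fraction(synthetic, k=1)
second_fraction, _ = harmonic_fraction(synthetic, k=2)
```

In this synthetic case the first and second harmonics together account for all of the variance, which mirrors how a strong semiannual component lowers the share explained by the annual cycle at near-equatorial stations.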
The OLR data are utilized to demonstrate ENSO effects in the Tropical Pacific region, as in Chu (1995). Due to the relatively short record of OLR data, we selected five winters (DJF) of warm ENSO events (1976/77, 1982/83, 1986/87, 1991/92, 1992/93) and two winters of cold events (1975/76, 1988/89) for composite analyses. In a normal Northern Hemisphere winter, the northern USAPI are in the high OLR (low rainfall) zone (Fig. 6a). The SPCZ, as defined by a persistent OLR minimum area with values less than 230 W m−2, extends southeastward from New Guinea to American Samoa and beyond.
The OLR composites are shown in Figs. 6b and 6c. For El Niño winters (Fig. 6b), strong negative OLR anomalies (associated with stronger convection) are located in the equatorial central Pacific while strong positive anomalies (associated with weaker convection) are found in the equatorial western Pacific. Two relatively weak positive anomalies are found in the subtropical areas of the central Pacific. Pago Pago is in the boundary of the negative anomaly center while most northern Pacific stations are located in the positive anomaly area. Thus, Pago Pago receives more than normal rainfall while most northern Pacific stations become drier during El Niño. The pattern of OLR anomalies during cold events (Fig. 6c) is approximately opposite to warm events: relatively large, negative OLR anomalies are found in the northern tropical Pacific while positive anomalies are located in the equatorial central Pacific.
4. Analysis methods and CCA and PCR model building
SSTs in the Pacific are used as predictors in the CCA model. Studies by Hastenrath and Heller (1977), Moura and Shukla (1981), and Shukla and Misra (1977) indicated that SSTs play an important role in regulating precipitation on regional or global scales. Lau (1985) also found that much of the atmospheric response to ENSO was associated with the changes in sea surface temperatures in the Pacific. SSTs have long-term records and thus provide a good sample size for statistical models. Indeed, Barnston and He (1996) used a CCA model to predict rainfall in Hawaii and Alaska and their results suggested that SSTs contribute the most to forecast accuracy.
b. Data prefiltering by empirical orthogonal function (EOF) analysis
It is highly advisable to apply EOF analysis prior to the CCA. This can be accomplished by projecting the data onto EOFs and then retaining only a limited number of principal components in the analysis. There are several reasons for doing this. First of all, the large number of spatial points can cause difficulty in inverting the matrices and in the eigenvalue problem. The preprocessing procedure transforms the large spatial dimension into a smaller number of retained EOF modes and this value is quite modest even for large fields. The computation of canonical modes is thus simplified after preprocessing. In addition, the small-scale noise is filtered out after EOF analysis since the use of EOFs allows the analysis to focus on the dominant modes of variability within each input dataset. The procedure for computing eigenvectors from a matrix of data has been described extensively in the literature and will not be discussed here.
The EOF analysis is performed on the box-averaged seasonal mean SSTs for the period 1956 to 1994 and on the seasonal rainfall totals for the period 1957 to 1994. Because the data have not been detrended or standardized, the EOF analysis is performed using a covariance matrix. The advantage of using the covariance matrix is that the strongest variations can be identified or isolated in a dataset.
There is no universally agreed upon procedure for determining how many EOF modes should be retained. Morrison (1976) suggests that the retained components should explain perhaps 75% or more of the variances. Therefore, the leading eight modes of SSTs and the four leading modes of rainfall are selected since they account for approximately 75% or higher of the total variance, respectively (Tables 5 and 6). Scree graphs (Wilks 1995) are also used and results are consistent with the above selection rule (not shown).
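The truncation rule above can be sketched as follows. This is an illustrative NumPy implementation under stated assumptions, not the authors' procedure: the function name `truncate_eofs`, the (space, time) array layout, and the use of an SVD of the anomalies in place of an explicit covariance-matrix eigenproblem are all choices made here for brevity.

```python
import numpy as np

def truncate_eofs(field, min_fraction=0.75):
    """Keep the fewest leading EOF modes whose cumulative variance reaches min_fraction.

    field: array of shape (n_space, n_time), e.g. box-averaged SSTs by season.
    Returns (modes, coeffs, explained, k).
    """
    # Remove the time mean at each spatial point; no standardization is applied,
    # so the decomposition corresponds to the covariance (not correlation) matrix.
    anomalies = field - field.mean(axis=1, keepdims=True)
    # The left singular vectors of the anomalies are the covariance eigenvectors.
    u, s, _ = np.linalg.svd(anomalies, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    cumulative = np.cumsum(explained)
    k = min(int(np.searchsorted(cumulative, min_fraction)) + 1, explained.size)
    modes = u[:, :k]               # spatial patterns, shape (n_space, k)
    coeffs = modes.T @ anomalies   # time coefficients, shape (k, n_time)
    return modes, coeffs, explained, k
```

A scree plot of `explained` against mode number gives the visual check mentioned in the text.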
c. CCA forecast model
The CCA technique was introduced by Hotelling (1935), and a useful discussion of the method, with examples of its application to meteorological fields, was presented by Glahn (1968). Since the introduction of CCA, prediction of climate variations using the method has received wide attention and has had some success; for example, CCA was employed to forecast short-term climate fluctuations (Barnett and Preisendorfer 1987; Graham et al. 1987a,b; Barnett et al. 1988) and seasonal rainfall fluctuations (Barnston 1994; Chu and He 1994; Yu 1994; Barnston and He 1996; Barnston and Smith 1996).
Assuming that Xs,t denotes the predictor matrix and Yr,t denotes the predictand matrix, where the subscripts s and r represent space, and subscript t represents time, namely,
Both Xs,t and Yr,t are monthly mean (or seasonal mean) removed matrices. Performing EOF analysis on Xs,t and Yr,t leads to
where Es,s and Er,r represent EOF spatial modes of the predictor and predictand matrices, respectively, and Ts,t and Tr,t are their attendant time coefficients.
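The display equation for this step does not appear in the text as reproduced here. Read from the definitions of E and T just given, the decomposition presumably takes the standard EOF form below; this is a reconstruction from the surrounding notation rather than the authors' typeset equation.

```latex
X_{s,t} = E_{s,s}\, T_{s,t}, \qquad Y_{r,t} = E_{r,r}\, T_{r,t}
```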
Assuming that we take the first i EOF modes of the predictor time series (Ti,t, i < m) and the first j EOF modes of the predictand series (Tj,t, j < n) as inputs to the CCA, we can determine canonical vectors (u, υ) and linear combinations of Z = u′ Ti,t and W = υ′Tj,t. The following matrices are defined:
where a1, . . . , aq are “canonical correlations” between Z and W, and a1 ≥ a2 ≥ . . . ≥ aq and q is equal to i or j, whichever is smaller.
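One way to compute the canonical vectors and correlations from the truncated EOF time coefficients is sketched below. It is a hedged illustration, not the authors' formulation: the function name `canonical_correlations` is hypothetical, and the whitening-plus-SVD route is a common numerical approach chosen here, which may differ from the matrices defined in the original equations.

```python
import numpy as np

def canonical_correlations(tx, ty):
    """CCA on truncated EOF time coefficients.

    tx: (i, n_time) predictor coefficients; ty: (j, n_time) predictand coefficients.
    Returns (u, v, corr): canonical vectors as columns and the canonical
    correlations in descending order, with q = min(i, j) entries.
    """
    n = tx.shape[1]
    tx = tx - tx.mean(axis=1, keepdims=True)
    ty = ty - ty.mean(axis=1, keepdims=True)
    cxx = tx @ tx.T / (n - 1)
    cyy = ty @ ty.T / (n - 1)
    cxy = tx @ ty.T / (n - 1)
    # Whiten each set with its Cholesky factor, then take the SVD of the
    # whitened cross-covariance; the singular values are the canonical correlations.
    lx = np.linalg.cholesky(cxx)
    ly = np.linalg.cholesky(cyy)
    m = np.linalg.solve(lx, cxy) @ np.linalg.inv(ly).T
    a, corr, bt = np.linalg.svd(m, full_matrices=False)
    u = np.linalg.solve(lx.T, a)      # Z = u.T @ tx has unit variance
    v = np.linalg.solve(ly.T, bt.T)   # W = v.T @ ty has unit variance
    return u, v, corr
```

With eight predictor modes and four predictand modes, as used in this study, the function returns four canonical correlations.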
d. Principal Component Regression (PCR) model
In the CCA model, we assumed that the predictor field is a matrix of known constants. If that matrix is random, then the analysis is carried out conditional on it so that it is still treated as if it were fixed. The initial EOF truncations also cause a problem. In CCA, it is implicitly assumed that the first four modes of the predictand field are the ones that are most highly correlated with the first eight modes of the predictor fields. However, there is no guarantee that one of the higher modes of variations in the predictor set (e.g., higher than the eighth mode) will not be strongly associated with the predictand set.
To overcome these problems, we use a relatively new approach, principal component regression (Draper and Smith 1981). Using the same notation as in section 4c, we can write the Principal Component Regression (PCR) model as follows:
where Ti,t is a matrix of the truncated EOF time coefficients and Br,i is a matrix of regression coefficients (obtained from least squares estimates). The error matrix ϒr,t has rows that are assumed to be multivariate normal with means zero and some appropriate covariance structure. Note that there is no EOF truncation on the predictand dataset so that the full information contained in the rainfall dataset is used. By doing so, there is no need to assume that the leading truncated modes in the predictor dataset are those that are most highly related to the leading modes in the predictand dataset. Once fitted equations are obtained in terms of the selected modes, they can be transformed back into a function of the original predictor data using Eq. 7.
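The regression described above, with untruncated predictand anomalies regressed on the leading EOF time coefficients of the predictors, can be sketched as follows. The function names `fit_pcr` and `predict_pcr`, the dictionary layout, and the (space, time) array convention are assumptions made for illustration; the coefficient matrix plays the role of Br,i and is obtained by ordinary least squares.

```python
import numpy as np

def fit_pcr(x, y, n_modes=8):
    """Fit a principal component regression.

    x: (n_space, n_time) predictor field; y: (n_stations, n_time) predictands.
    """
    x_mean = x.mean(axis=1, keepdims=True)
    y_mean = y.mean(axis=1, keepdims=True)
    xa = x - x_mean
    # Leading EOFs of the predictor anomalies (covariance-based).
    u, _, _ = np.linalg.svd(xa, full_matrices=False)
    eofs = u[:, :n_modes]
    t = eofs.T @ xa                                   # (n_modes, n_time)
    # Least-squares solution of (y - y_mean) ~ coef @ t, one row per station.
    coef_t, *_ = np.linalg.lstsq(t.T, (y - y_mean).T, rcond=None)
    return {"x_mean": x_mean, "y_mean": y_mean, "eofs": eofs, "coef": coef_t.T}

def predict_pcr(model, x_new):
    """Predict station values from new predictor fields of shape (n_space, m)."""
    t_new = model["eofs"].T @ (x_new - model["x_mean"])
    return model["coef"] @ t_new + model["y_mean"]
```

Because the prediction is the product of `coef`, the transposed EOFs, and the predictor anomalies, the fitted relation can be rewritten directly in terms of the original predictor data, as noted in the text.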
We still use the first eight modes of SSTs (which explain more than 75% of the total variance as shown in Table 6) as predictors to construct a PCR model. To ensure that we do not miss modes with strong correlation to predictands, the correlations between the time coefficients of the eigenvectors associated with other nonzero eigenvalues and the predictand values are examined, and no particularly strong correlation is found. The selection of the first eight principal components in PCR also makes it easier to compare the skills between PCR and CCA model results directly.
5. Model results
a. CCA maps
The loading patterns of mode one for winter (DJF) SSTs and the following spring (MAM) rainfall are shown in Figs. 7a and 7b. Strong negative (positive) SST anomalies in the equatorial central/eastern Pacific are associated with positive (negative) rainfall anomalies throughout the northern USAPI region. The rainfall series in the USAPI is highly correlated with the Pacific SST series for this mode, which accounts for 39% of the total variance. In Fig. 7c, the two canonical component time series have a high correlation coefficient (0.80). Higher than normal winter SSTs in the equatorial central Pacific lead to lower than normal spring rainfall in the northern USAPI and higher than normal spring rainfall in Pago Pago. Specifically, a maximum anomaly of +0.8°C in the equatorial central Pacific leads to −136 mm and +34 mm of spring rainfall anomalies at Andersen AFB and Pago Pago, respectively. Given the importance of the first mode, the above results suggest that the interannual rainfall variations in the tropical Pacific islands are strongly influenced by the SST field and further support the use of the Pacific SSTs as predictors in the model.
The pattern of the observed SST anomalies at the height of the 1982–83 ENSO (Fig. 7d) is similar to the SST loading pattern (Fig. 7a) although the sign is reversed. By combining the canonical component time series of SSTs (solid line in Fig. 7c) and the SST loading pattern (Fig. 7a), we independently obtain a positive anomaly in the equatorial central–eastern Pacific during DJF of 1982/83. Therefore, the first mode is an ENSO-related mode, implying that ENSO plays an important role in the interannual rainfall variation.
b. CCA cross-validation results and seasonality of rainfall predictability
Cross validation is a generalization of the common technique of repeatedly omitting a few observations from the data, reconstructing the model, and then making forecasts for the omitted cases (Stone 1974; Chu and He 1994). Cross validation is conducted to evaluate the overall forecasting skill of the CCA model. The cross validation is nonparametric and provides an unbiased estimate of forecast skill. The approach goes as follows (Chu 1989): The predictor and predictand data of N time points are divided into L segments. A model is then developed using the data of L − 1 segments. This model is then used to predict the variable in the remaining segment. This process is successively repeated by changing the segment that has been excluded from the model development. By doing this, we obtain N predictions. These predicted values can be correlated with N observations and the overall forecast skill can be determined.
In this study, we remove only one observation at a time for each case. This is justified as interannual autocorrelations in the data are small (e.g., the average absolute autocorrelations of the rainfalls and SSTs are 0.10 and 0.17, respectively). Therefore we use all data available except for the season for which we want to make a prediction. For example, to forecast the summer (JJA) rainfall of 1980 with one-season lead time, we use a 37-yr summer rainfall time series (1957–79, 1981–94) and a 37-yr spring SST series (1957–79, 1981–94) to build a CCA model (everything is recomputed each time, including the pre-EOFs). The resulting CCA model is then used to forecast rainfall values in summer 1980 using SST values in spring 1980. We use the moving average season of three consecutive months in order to identify the season with best predictability, yielding 12 target seasons (DJF, JFM,..., OND, NDJ).
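The leave-one-out procedure can be sketched generically as below. This is an illustration under stated assumptions: `leave_one_out_skill` is a hypothetical helper, and the `fit_linear`/`predict_linear` pair is a plain least-squares stand-in used only so the example runs on its own. In the study itself the refitted model would be the full CCA (or PCR) chain, including the pre-EOF step.

```python
import numpy as np

def leave_one_out_skill(x, y, fit, predict):
    """Leave-one-out cross validation with correlation skill.

    x: (n_predictors, n_years); y: (n_stations, n_years).
    fit(x_train, y_train) -> model; predict(model, x_new) -> (n_stations, m).
    Returns (skill per station, predictions for every year).
    """
    n_years = x.shape[1]
    preds = np.empty_like(y, dtype=float)
    for k in range(n_years):
        keep = np.arange(n_years) != k
        # Rebuild the whole model without year k, then forecast year k.
        model = fit(x[:, keep], y[:, keep])
        preds[:, k] = predict(model, x[:, [k]])[:, 0]
    skill = np.array([np.corrcoef(y[i], preds[i])[0, 1] for i in range(y.shape[0])])
    return skill, preds

def fit_linear(x, y):
    """Stand-in model: multiple linear regression on the raw predictors."""
    x_mean = x.mean(axis=1, keepdims=True)
    y_mean = y.mean(axis=1, keepdims=True)
    coef_t, *_ = np.linalg.lstsq((x - x_mean).T, (y - y_mean).T, rcond=None)
    return x_mean, y_mean, coef_t.T

def predict_linear(model, x_new):
    x_mean, y_mean, coef = model
    return coef @ (x_new - x_mean) + y_mean
```

Recomputing means, EOFs, and coefficients inside the loop is what keeps the omitted year from leaking into the model that forecasts it.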
The cross-validation skills for JFM, AMJ, JAS, and OND are shown in Table 7. As indicated, different islands show different levels of predictive skill. Overall, rainfall at Yap, Guam WSO, Andersen, Chuuk, and Pohnpei is relatively well predicted, with a mean skill of 0.33 or higher as shown in the bottom line of Table 7. Rainfall forecasts are most skillful at Guam WSO (0.39) and Andersen (0.36). The CCA model generates very poor rainfall prediction for Pago Pago. One possible reason may be that Pago Pago is not located near any of the other stations in the study, so the variance of this station may be lost in the pre-EOF truncation because it does not have any “partners” to align with.
The cross validation also indicates a seasonality of predictive skill. In general, forecasts for JFM and AMJ are better than the corresponding ones for JAS and OND. When all stations are considered, the model yields the best forecast skill (0.49) for one-season lead when the target season is JFM. At some stations (Yap and Pohnpei), one-season lead forecasts have skills of 0.70. For some stations (Yap, Andersen, and Wake for JFM; and Yap, Guam WSO, and Andersen for AMJ), the model can even demonstrate moderate predictive skill (≥ 0.30) at four-season lead time when the target seasons are JFM and AMJ.
Forecast skills for JAS rainfall are generally not as good as those for JFM and AMJ. However, moderate skills are found at Guam WSO and Pohnpei. OND is the season for which island rainfall is most difficult to predict accurately; only one station (Chuuk) shows moderate forecast skill. One reason why JFM and AMJ have better predictability is probably that ENSO responses are most pronounced during boreal winter, as the Pacific SSTs and Southern Oscillation index anomalies reach their peaks during boreal winter/spring. Table 8 gives more detailed cross-validation skills for the JFM target season at varying lead times.
Figure 8 provides plots of cross-validation skills for 12 moving seasons (from DJF to NDJ). We can see that at one-season lead JFM is the most skillful forecast period in the northern USAPI, with average skill of 0.53. JFM is also the most skillful forecast period at two seasons lead, with average skill of 0.41. The maximum cross validation skills in JFM and FMA and the minimum skills in ASO and SON yield a seasonality of rainfall predictive skill.
A target season-lead time cross-section plot is a useful way to present the CCA cross-validation skills. Such a plot is given in Fig. 9. Cross-validation skill varies strongly as a function of target season. Skills are high during boreal winter and spring when the contour line of 0.30 extends to 8 months lead. Low skills are found during boreal fall (SON). Therefore, a strong seasonality in empirical rainfall predictive skill is seen. Figure 9 also indicates that cross-validation skill tends to decrease as lead time increases.
As mentioned earlier, one of our objectives is to determine how much predictability can be achieved when only the Pacific SSTs are used as predictors. Compared to HB’s Fig. 2a and Fig. 3a, Figs. 8 and 9 imply that our model provides comparable skills (note that HB uses 14 stations, including Johnston and 4 Hawaiian stations, and that the lead time in their study is defined to be 3 months less than in our study such that OND-to-JFM is called a zero lead time). This is intriguing since our model is simpler than HB’s and suggests that the Pacific SSTs alone may be enough to make a moderately skillful rainfall prediction for the USAPI. Given the well-known impact of ENSO on USAPI rainfall, the Pacific SSTs may thus have advantages in predicting Pacific island rainfall, and there is no need to include SSTs from other ocean basins and 700-mb height in the prediction scheme.
Since SSTs are the only predictors in the CCA model, the predictability of rainfall variations will depend largely on the characteristics of the ocean surface. Latif and Graham (1991) noticed that there appeared to exist a “predictability barrier” around the time of boreal spring. They found that the correlations between observations and predictions of the SSTs of the coupled system of the Pacific Ocean decrease rapidly between April and June. Similar decreases across the spring period appear in the results of other coupled models (e.g., Cane 1991). Focusing on the rapid decline in forecast skills during boreal spring, Webster and Yang (1992) indicated that the summer monsoon circulation develops fastest from April to May, the time when the Walker circulation is the weakest and most susceptible to external noise. Thus, the typical ocean–atmosphere interaction may be least robust during boreal spring and thus subject to larger error growth. This idea was supported by Xue et al. (1994) who found that the transient initial error grows fastest starting from spring and slowest starting from late summer; they attributed the rapid decline in forecast skill in boreal spring (the “spring barrier”) to the smallness of the signal to be forecast.
In our CCA model, no useful skill can be produced for summer and fall rainfall prediction if variations in SSTs are uncertain in the antecedent boreal spring. As shown in Fig. 10, rainfall forecasts using AMJ SSTs as predictors generate the worst skill for 4–6 months ahead forecasts and, as a result, the average skill (0.17) is the lowest among the four seasons. JAS also has very low skill from 1 to 4 months ahead. Forecasts initiated in OND provide the best overall skill (0.37). The model’s low rainfall forecast skills in summer and fall (using AMJ SSTs as predictors) and the good rainfall forecast skills in winter and spring (using OND SSTs as predictors) suggest that a spring barrier occurs in the CCA when SSTs are used as predictors. These results are consistent with previous studies (e.g., Cane et al. 1986; Webster and Yang 1992; Xue et al. 1994).
Another possible explanation of the low rainfall forecast skills in summer and fall relates to tropical cyclone activity in the corresponding period. It is known that the western North Pacific is the favored breeding ground for tropical cyclones, which generally form in late summer and early fall. These intense convection systems can produce substantial rainfall over limited areas and can cause significant local variability in seasonal rainfall at an individual island.
We calculated tropical cyclone frequency in the area of the North Pacific islands. A box covering a domain of 0°–20°N and 130°E–180° is selected. A tropical cyclone is counted only when it is located within the selected box and has a maximum sustained wind speed of 17 m s−1 or more. The relationship between cross-validation skills and tropical cyclone frequency is plotted in Fig. 11. One-season and two-season lead cross-validation skills are shown in this figure. The predictive skill of rainfall is inversely related to the tropical cyclone frequency in the northwest Pacific. The minimum tropical cyclone frequency in February is associated with the highest cross-validation skill in the JFM season. After February, the cross-validation skills decline rapidly as the spring barrier is penetrated, and then reach minimum values in September and October, when the largest number of tropical cyclones form in this region.
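The counting rule can be illustrated with a short sketch. The input format (a list of position fixes with a storm identifier) and the function name `monthly_cyclone_counts` are hypothetical, and counting each storm once per calendar month is an assumption made here; the text does not specify whether storms or individual positions were tallied.

```python
from collections import defaultdict

LAT_RANGE = (0.0, 20.0)      # degrees north
LON_RANGE = (130.0, 180.0)   # degrees east
MIN_WIND = 17.0              # maximum sustained wind, m/s

def monthly_cyclone_counts(fixes):
    """Count qualifying storms per calendar month, summed over all years.

    fixes: iterable of (storm_id, year, month, lat, lon_east, wind_ms) tuples
    (a hypothetical track format, not a specific dataset's layout).
    """
    storms_by_month = defaultdict(set)
    for storm_id, year, month, lat, lon, wind in fixes:
        inside = (LAT_RANGE[0] <= lat <= LAT_RANGE[1]
                  and LON_RANGE[0] <= lon <= LON_RANGE[1])
        if inside and wind >= MIN_WIND:
            # A set ensures each storm is counted once per month and year.
            storms_by_month[month].add((year, storm_id))
    return {m: len(storms_by_month.get(m, ())) for m in range(1, 13)}
```

Dividing each monthly total by the number of years in the record would give a mean monthly frequency that can be compared with the skill curves.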
c. PCR results
We employ the same cross-validation scheme (described in section 5b) to assess the validity and adequacy of our PCR model. In addition to cross-validation skill (correlation), we compare the results of the PCR and CCA models by computing the PRESS statistic (the sum of the squared differences between the actual observations and the predictions from the leave-one-out cross-validation results). The PRESS selection procedure was proposed by Allen (1971). It is a combination of all possible regressions, residual analysis, and validation techniques. The PRESS statistic (Raymond 1990) can be expressed as
$$\mathrm{PRESS} = \sum_{t} \left( Y_t - \hat{Y}_{t,-t} \right)^{2},$$
where $Y_t$ denotes the $t$th observation and $\hat{Y}_{t,-t}$ is the leave-one-out prediction for the $t$th value. In general, the model with the smallest PRESS is preferred.
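As a minimal sketch of how a PRESS value is obtained from leave-one-out cross validation, the Python example below refits a model with each observation withheld in turn and accumulates the squared prediction errors. A single-predictor least-squares line stands in for the PCR and CCA models purely for illustration; the function names `fit_line` and `press` are hypothetical and are not part of the original study.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def press(x: Sequence[float], y: Sequence[float]) -> float:
    """Sum of squared leave-one-out prediction errors.

    Assumption: the stand-in model is a one-predictor linear
    regression, not the PCR or CCA model of the paper.
    """
    x, y = list(x), list(y)
    total = 0.0
    for t in range(len(x)):
        # Refit the model with observation t withheld.
        slope, intercept = fit_line(x[:t] + x[t + 1:], y[:t] + y[t + 1:])
        # Predict the withheld value and accumulate the squared error.
        prediction = slope * x[t] + intercept
        total += (y[t] - prediction) ** 2
    return total
```

Because every prediction is made for a value the fit has not seen, PRESS penalizes overfitting in a way that the ordinary residual sum of squares does not, which is why it is suited to comparing two competing models.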
Figure 12 shows the lead time–target season cross section of PCR cross-validation skill for northern USAPI rainfall. Its pattern is almost identical to that of Fig. 9. The PCR model shows high prediction skill during boreal winter and spring, and JFM is again the most skillful forecast period. A further comparison is made by computing the PRESS statistic for both the PCR and CCA models (shown in Figs. 13 and 14, respectively). The PCR results are generally as good as the CCA results. In both models, small PRESS values (useful predictive skill) are found during boreal winter and spring, and large PRESS values (weak skill) are found in summer and fall.
Following an overview of the climatology of the USAPI stations and the effects of ENSO, CCA and PCR are employed to forecast rainfall and study rainfall predictability for 10 stations in the USAPI. Box-averaged SSTs in the Pacific Ocean are used as predictors. To filter the noise and reduce sample size, EOF analysis is conducted as a preprocessing method. The following conclusions are drawn:
Boxplots and harmonic analysis of the rainfall series indicate that the annual cycle in rainfall is generally strong for most USAPI stations, with maximum rainfall in August/September and minimum rainfall in February/March. However, the annual cycle is relatively weak at stations close to the equator (e.g., Koror, Pohnpei), which encounter monsoon trough rains twice each year as the trough crosses them on its northward advance and southward retreat. These results are supported by a similar analysis of OLR.
ENSO has a strong impact on the climate of USAPI. For example, most stations have positive OLR anomalies (less convection) during El Niño winters and negative anomalies (more convection) during La Niña winters. The ENSO impact in Pago Pago in the Southern Hemisphere is generally opposite.
The CCA model provides useful skill in predicting rainfall in the Pacific Islands. Rainfall at most USAPI stations (e.g., Yap, Guam WSO, Andersen, Chuuk, and Pohnpei) shows moderate overall forecast skill, with a mean cross-validated correlation skill of 0.33 or higher. This result is intriguing in that our model, with a single predictor variable (SST), is simpler than that of He and Barnston (1996), who used three predictor variables. It further suggests that Pacific SSTs alone might be enough to make a moderate-skill rainfall prediction for the USAPI. This might be expected in view of the important role of SSTs in regulating precipitation and the strong ENSO impact on USAPI rainfall.
The CCA cross-validated predictive skills show a strong annual cycle (seasonality). JFM is the most accurately forecast period in the northern USAPI at one and two seasons lead time with average correlation skill of 0.53 and 0.41, respectively. Generally speaking, high predictability in rainfall is found in boreal winter and spring while relatively low predictability is found in summer and fall.
The poor forecast skills resulting from AMJ SSTs as predictors suggest that the so-called spring-barrier effect may exist for SST-based predictions in linear statistical models. It is very difficult to generate accurate forecasts using spring SSTs as predictors if variations in SSTs are uncertain during boreal spring. The moderately useful skill using OND SSTs to predict forthcoming seasonal rainfall further supports this claim. The spring-barrier effect may contribute, at least partially, to the seasonality of rainfall predictions.
Our study also indicates that there might be a relationship between the predictability of seasonal rainfall in the USAPI region and tropical cyclone activity in the area. The low CCA skills in boreal summer and fall may be attributed in part to the strong tropical cyclone activity during these periods. On the other hand, the high forecast skills in winter and early spring are associated with the minimum in tropical cyclone activity in the same period.
The relatively high predictability in boreal winter and early spring might be attributed to the weakness of the SST predictability barrier in fall, the low tropical cyclone activity, and the pronounced ENSO responses in winter and spring.
A relatively new multivariate Principal Component Regression model is also employed to predict USAPI rainfall. Comparison of the PCR and CCA model results indicates that PCR provides skill comparable to that of CCA in predicting USAPI rainfall variations. Because of its simplicity, PCR offers potential as a new tool for statistical climate prediction research.
Deep appreciation is extended to the Office of Global Programs (OGP) of the National Oceanic and Atmospheric Administration (NOAA) for supporting this research under Research Grants NA37RJ0199 and NA67RJ0154. The views expressed herein are those of the authors and do not necessarily reflect the views of NOAA or any of its subagencies. We are grateful to Alan Hilton LT/NOAA, who made some helpful contributions to this research, and to Dr. Fei-Fei Jin for his comments. We are also thankful to Tony Barnston for his careful reading and reviewing of the paper. The addition of principal component regression was recommended by an anonymous reviewer, and it improved the manuscript.
Corresponding author address: Mr. Zhi-Ping Yu, Jr. Researcher, Pacific ENSO Applications Center, University of Hawaii at Manoa, 2525 Correa Rd., HIG 331, Honolulu, HI 96822.
* SOEST Contribution Number 4504.