The International Research Institute for Climate Prediction (IRI) and Ceará Foundation for Meteorology and Water Resources (FUNCEME) in Brazil have developed a dynamical downscaling prediction system for Northeast Brazil (the Nordeste) and have been issuing seasonal rainfall forecasts since December 2001. To the authors’ knowledge, this is the first operational climate dynamical downscaling prediction system. The ECHAM4.5 AGCM and the NCEP Regional Spectral Model (RSM) are the core of this prediction system. This is a two-tiered prediction system. SST forecasts are produced first, which then serve as the lower boundary condition forcing for the ECHAM4.5 AGCM–NCEP RSM nested system. Hindcasts for January–June 1971–2000 with the nested model, using observed SSTs, provided estimates of model potential predictability and characteristics of the model climatology. During 2002–04, the overall rainfall forecast skill, measured by the ranked probability skill score (RPSS), is positive over a majority of the Nordeste. Higher skill is found for the March–May (MAM) and April–June (AMJ) seasons with forecast lead times up to 3 months. The skill of the downscaled forecasts is generally higher than that of the driving global model forecasts.
The Brazilian northeast (the Nordeste) comprises nine states. From north to south, these states are: Maranhão, Piauí, Ceará, Rio Grande do Norte, Paraíba, Pernambuco, Alagoas, Sergipe, and Bahia (Fig. 1). The interannual climate variability in the Nordeste is primarily attributed to the sea surface temperature (SST) anomaly forcing, particularly over the tropical Pacific and the Atlantic Oceans (Hastenrath and Heller 1977; Mechoso et al. 1990; Moron et al. 1998; Moura and Shukla 1981; Nobre and Shukla 1996; Pezzi and Cavalcanti 2001; Ropelewski and Halpert 1996; Sun et al. 2005b; Uvo et al. 1998; Ward and Folland 1991). Slowly evolving SST anomalies, particularly in the tropical oceans, can be predicted with some degree of skill at lead times of several months (Mason et al. 1999; Repelli and Nobre 2004; Zebiak and Cane 1987). It is therefore reasonable to expect a high potential for seasonal climate prediction in the Nordeste (Folland et al. 2001; Moura and Hastenrath 2004). Recent reviews on climate predictions have shown relatively high skill of rainfall forecasts in the Nordeste (Carson 1998; Goddard et al. 2001).
The International Research Institute for Climate Prediction (IRI) has been issuing seasonal forecasts of global precipitation and temperature since late 1997 (Barnston et al. 2003; Mason et al. 1999). The IRI’s global prediction system is based on the premise that the atmospheric response to SST predictions provides the potential to produce forecasts of seasonal climate anomalies for many regions of the world. It is a two-tiered process in which a prediction is first made for the SSTs in the global oceans, and then the SST prediction is used as a driver of a forecast for the atmospheric climate. The climate forecasts are issued as probabilities of each of three equiprobable categories (i.e., below, near, and above normal with respect to the 1969–98 period). Evaluation of the IRI’s forecasts for 1997–2001 indicates that positive skill exists over a majority of the tropical land areas for both temperature and precipitation forecasts (Goddard et al. 2003).
In addressing actual applications of climate forecasts, one immediately is confronted with a need for regional- or even local-scale information. Global climate models typically are run at a resolution of approximately 2.5° × 2.5°, which is too coarse to provide useful information at these scales. One remedy for this is climate downscaling. In recent years, increasing attention has been given to dynamical downscaling (Chen et al. 1999; Hong and Leetmaa 1999; Fennessy and Shukla 2000; Giorgi and Mearns 1999; Misra et al. 2003; Nobre et al. 2001; Roads et al. 2003; Sun et al. 1999; Takle et al. 1999). The simulation of both the finer spatial scale details and the overall monthly or seasonal mean precipitation is often improved using the regional models. Examination of other atmospheric variables confirms that the regional model simulations are generally as good as or better than those from AGCMs alone. Sun et al. (2005b) used the National Centers for Environmental Prediction (NCEP) Regional Spectral Model (RSM), with horizontal resolution of 60 km, to downscale the ECHAM4.5 AGCM (T42) simulations for the Nordeste during the period of 1971–2000 and found that the AGCM forced by observed SSTs was able to capture the large-scale waves, and the RSM was able to produce regional- or local-scale rainfall at both seasonal and intraseasonal time scales. This work demonstrates the success of dynamical downscaling in the Nordeste and provides the scientific foundation for dynamically downscaled forecasts.
The IRI and Ceará Foundation for Meteorology and Water Resources (FUNCEME) in Brazil have developed a dynamical downscaling prediction system for the Nordeste and have been issuing seasonal climate forecasts since December 2001. At this time, rainfall is the only forecast variable. Forecast users have little interest in temperature forecasts because the interannual variability of temperature in the Nordeste is small. This is, to our knowledge, the first such prediction system to be used operationally. The principal purpose of this paper is to describe this downscaling prediction system and validate its real-time performance during 2002–04. This introduction is followed by a summary of the construction of a high-resolution observed rainfall dataset in section 2. The prediction system is described in section 3, forecast validation is discussed in section 4, and a summary is given in section 5.
2. Observed rainfall data
Production of meaningful climate forecasts requires extensive evaluation of model outputs based on comparisons with observational data (Mason et al. 1999). The global monthly precipitation dataset with a resolution of 0.5° from the Climate Research Unit at the University of East Anglia (hereafter CRU05 dataset) (New et al. 1999, 2000) covers the period 1901–98. An anomaly approach was used to construct the CRU05 dataset. In this technique, grids of monthly climatology were first derived. Using as many stations as possible together with an explicit treatment of elevation dependency maximized the representation of spatial variability in mean climate. Grids of monthly anomalies were generated using an angular-distance-weighted (ADW) interpolation scheme. The anomaly grids were then combined with the monthly climatology to arrive at the estimated monthly precipitation total. We found that the CRU05 monthly precipitation anomaly field over the Nordeste exhibited strong spatial variability in the 1970s and weak spatial variability in the mid-1990s. This may be due to the difference in station density between the 1970s and 1990s. The CRU05 precipitation data over the Nordeste were constructed using about 1800 stations in the 1970s and about 50 stations in the mid-1990s. Because the IRI now has more stations available for the 1990s than were used in the CRU05 data, and because we also need high-resolution observed data to verify the forecasts for 2002–04 in this study, we constructed a new monthly rainfall dataset at 60-km spacing on a Mercator conformal projection for the period 1952–2004 over the Nordeste. The newly constructed data are on the native grid of our prediction system. Note that the grid spacing on the earth's surface is not exactly 60 km because of the Mercator conformal projection. This gridded dataset will be referred to as the IRI Nordeste rainfall dataset.
To construct the IRI Nordeste rainfall dataset, we use the same anomaly approach as in the CRU05 dataset. The monthly climatology of the IRI Nordeste rainfall was first derived from the CRU05 monthly climatology (i.e., the 1961–90 mean) using spatial bilinear interpolation from a 0.5° grid to the grid of the prediction system (∼60 km spacing). The station dataset compiled by the IRI was used to generate the monthly anomalies. The station coverage varies with time. The number of stations increases from the 1950s to early 1960s, stays at a maximum of ∼1900 from the early 1960s to the early 1990s, decreases significantly to ∼700 during the mid-1990s, increases again after 1998, and reaches ∼1800 in 2002–04. For instance, there are 1897 stations available in March 1975 and 634 stations available in March 1995 (Figs. 1c,d). The observed daily data have not been corrected for gauge biases, since the correction of individual records requires detailed local meteorological and station information, which is not readily available. Monthly rainfall was obtained only for stations with no missing records during the month.
Prior to interpolation, each station time series was converted to anomalies relative to the 1961–90 mean. Series with fewer than 20 years of data during 1961–90 were excluded from the analysis. The station precipitation anomaly was transformed into a percentage of the station 1961–90 mean. The variance of precipitation is closely related to the mean, and the interpolation in percentage units preserves this relationship better than interpolation in absolute units. Anomaly interpolation from stations to 60 km × 60 km grids follows New et al. (2000), namely, the ADW interpolation scheme. We use the eight nearest stations, regardless of direction or distance, in estimating each grid point, which results in a radius of influence that varies with station density. A station is weighted by an angular-distance weight:
where x0 is the correlation decay distance [(CDD): 450 km in this study], x is the distance of the station from the grid point of interest, and cos θj is the cosine of the angular separation of stations k and l, with the vertex of the angle defined at grid point j.
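For illustration only, the weighting can be sketched in Python. The sketch assumes the functional form we understand New et al. (2000) to use: a distance weight w_k = [exp(−x_k/x0)]^m combined with an angular term a_k = Σ_l w_l[1 − cos θj(k,l)] / Σ_l w_l (l ≠ k), giving W_k = w_k(1 + a_k). The exponent m = 4, the use of planar coordinates in kilometers, and the function names are assumptions made for this example and are not taken from this paper.

```python
import math

def adw_weights(grid_xy, stations_xy, cdd_km=450.0, m=4):
    """Angular-distance weights for the stations around one grid point.

    Coordinates are planar (km). The exponent m=4 is an assumption.
    """
    gx, gy = grid_xy
    # Distance weight: decays with distance relative to the CDD.
    dist = [math.hypot(sx - gx, sy - gy) for sx, sy in stations_xy]
    w = [math.exp(-d / cdd_km) ** m for d in dist]
    # Bearing of each station as seen from the grid point (the angle vertex).
    bearing = [math.atan2(sy - gy, sx - gx) for sx, sy in stations_xy]
    weights = []
    for k in range(len(w)):
        others = [l for l in range(len(w)) if l != k]
        denom = sum(w[l] for l in others)
        if denom == 0.0:
            weights.append(w[k])  # no other station contributes an angular term
            continue
        # Angular term: a station isolated in direction gains weight.
        a_k = sum(w[l] * (1.0 - math.cos(bearing[l] - bearing[k]))
                  for l in others) / denom
        weights.append(w[k] * (1.0 + a_k))
    return weights

def adw_interpolate(grid_xy, stations_xy, anomalies, n_nearest=8, cdd_km=450.0):
    """Weighted mean of the anomalies at the n_nearest stations."""
    gx, gy = grid_xy
    order = sorted(
        range(len(stations_xy)),
        key=lambda i: math.hypot(stations_xy[i][0] - gx, stations_xy[i][1] - gy),
    )[:n_nearest]
    pts = [stations_xy[i] for i in order]
    vals = [anomalies[i] for i in order]
    big_w = adw_weights(grid_xy, pts, cdd_km)
    return sum(wi * v for wi, v in zip(big_w, vals)) / sum(big_w)
```

With two equidistant stations on opposite sides of a grid point, both receive the same weight and the interpolated anomaly is their simple average; clustered stations share weight, which is the purpose of the angular term.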
We combined the anomaly field for each month from 1952 to 2004 with the 1961–90 mean monthly climatology to arrive at monthly rainfall totals on 60 km × 60 km grids. Extensive evaluation of the IRI Nordeste rainfall data is beyond the scope of this paper. Major sources of error in gridded datasets are instrumental errors (isolated errors, systematic errors, and inhomogeneities), inadequate station coverage, and interpolation errors (Groisman et al. 1991). Inadequate station coverage is the largest source of error (New et al. 2000). A limited comparison between the IRI Nordeste rainfall data and the CRU05 data was done to highlight the differences that can arise due to differing station networks. Figure 2 shows rainfall anomalies in March 1975 and March 1995. For March 1975, there were 1816 stations and 1897 stations available for the construction of the CRU05 rainfall over the Nordeste and the IRI Nordeste rainfall, respectively. The two datasets present similar spatial patterns of rainfall anomalies. For March 1995, there were 28 stations and 634 stations available for the construction of the CRU05 rainfall and the IRI Nordeste rainfall, respectively. The IRI Nordeste rainfall data show higher spatial variability. The IRI Nordeste rainfall dataset is used to evaluate the downscaling prediction system.
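The two ends of the anomaly approach described in this section, converting a station total to a percentage anomaly and recombining a gridded percentage anomaly with the gridded climatology, reduce to simple arithmetic. The following Python sketch shows that arithmetic for single values; the function names are hypothetical, and the handling of stations with a zero long-term mean is left out.

```python
def percent_anomaly(monthly_total, station_mean):
    """Anomaly expressed as a percentage of the station's long-term mean."""
    return 100.0 * (monthly_total - station_mean) / station_mean

def rainfall_total(grid_climatology, grid_percent_anomaly):
    """Recombine a gridded percentage anomaly with the gridded climatology."""
    return grid_climatology * (1.0 + grid_percent_anomaly / 100.0)
```

For example, a station that records 150 mm against a mean of 100 mm has an anomaly of +50%, and a grid point with a 200-mm climatology and an interpolated anomaly of +50% receives a total of 300 mm.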
3. Downscaling prediction system
As shown in Fig. 3, a two-tiered prediction system is used. We begin with SST predictions, followed by ensemble runs of the ECHAM4.5 AGCM with T42 resolution. This portion is processed at the IRI, and then the AGCM forecast outputs are sent to FUNCEME to drive the NCEP RSM ensemble runs at the resolution of 60 km to produce downscaled forecasts for the Nordeste. Since December 2001, seasonal rainfall forecasts have been issued monthly. Since there is essentially no rainfall from July to December, we issue forecasts only for four overlapping seasons, that is, January–March (JFM), February–April (FMA), MAM, and AMJ.
a. SST forecasts
Two SST scenarios are predicted. The first SST scenario is a persistence of the observed SST anomalies from the most recently completed calendar month, added to the monthly climatology of the target season. Persisted SST anomaly (SSTA) forecasts are generally skillful for the first season (Mason et al. 1999). Dynamical predictions using persisted SSTA are run only one season into the future.
The second SST scenario is the predicted SST anomalies for the upcoming six months. The predicted SST anomalies contain forecast SST anomalies for the tropical oceans and damped-persisted latest observed SST anomalies with 3 months’ e-folding time for the extratropical oceans. At this time, the SST predictions are made separately for each of the tropical ocean basins. Over the tropical Pacific Ocean, SST anomalies were predicted with the NCEP coupled ocean–atmosphere model (Ji et al. 1998) before June 2003. Since June 2003, the SSTA forecast has been the average of three forecasts: those from the Lamont-Doherty Earth Observatory (LDEO) coupled ocean–atmosphere model (Chen et al. 2000), the NCEP constructed analog model (van den Dool 1994; van den Dool and Barnston 1995), and the NCEP coupled ocean–atmosphere model. Over the Indian Ocean, SST anomalies were predicted using a canonical correlation analysis (CCA) model developed at the IRI. The predictors were the recently observed SST anomalies in the tropical Pacific and Indian Oceans and also the NCEP forecasts for the tropical Pacific Ocean (Goddard and Graham 1999). Over the tropical Atlantic Ocean, SST anomalies were predicted at Centro de Previsão de Tempo e Estudos Climáticos (CPTEC) using a CCA model. The predictors were the recently observed SST anomalies in the tropical Pacific and Atlantic Oceans (Repelli and Nobre 2004). These SST predictions were smoothly blended at their geographical interfaces.
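The persisted scenario and the damped persistence used for the extratropical oceans can be written compactly. The Python sketch below is a minimal illustration for a single grid point, assuming that "3 months' e-folding time" means the anomaly decays as exp(−lead/3) with lead measured in months; the function names are hypothetical.

```python
import math

def persisted_sst(latest_anomaly, target_climatology):
    """Scenario 1: add the latest observed anomaly to the target climatology."""
    return target_climatology + latest_anomaly

def damped_persisted_anomaly(latest_anomaly, lead_months, e_folding_months=3.0):
    """Extratropical part of scenario 2: the anomaly decays by 1/e per e-folding time."""
    return latest_anomaly * math.exp(-lead_months / e_folding_months)
```

Under this assumption, an anomaly of 1.0°C is reduced to about 0.37°C at a lead of 3 months and to about 0.14°C at 6 months.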
b. Atmospheric models
The ECHAM4.5 AGCM and NCEP RSM version 97 form the core of this prediction system. A brief description for the two models follows.
The ECHAM4.5 AGCM was developed at the Max Planck Institute for Meteorology (MPI) in Germany (Roeckner et al. 1996). The model is configured at triangular 42 (T42) spectral truncation, giving a spatial resolution of about 2.8° latitude–longitude, with 19 vertical layers extending from the surface to 10 hPa. The mass flux scheme of Tiedtke (1989) for deep, shallow, and midlevel convection is used with the modified closure schemes for penetrative convection and the formation of organized entrainment and detrainment (Nordeng 1995). The radiation scheme is a modified version of the European Centre for Medium-Range Weather Forecasts (ECMWF) scheme. Four- and six-band intervals are used in the solar and terrestrial part of the spectrum, respectively. The vertical diffusion is computed with a high-order closure scheme depending on the turbulent kinetic energy. Gravity wave drag associated with orographic gravity waves is simulated after Miller et al. (1989). The surface fluxes are based on the Monin–Obukhov similarity theory. The land surface scheme is a modified bucket model with improved parameterization of rainfall runoff (Dumenil and Todini 1992).
The NCEP RSM version 97 was developed by Juang and Kanamitsu (1994) and Juang et al. (1997). The RSM uses the terrain following sigma coordinates in the vertical with 19 layers. A simplified Arakawa–Schubert scheme is used for deep convection (Pan and Wu 1995). Shallow convection following Tiedtke (1984) is invoked only in the absence of deep convection. The solar and terrestrial radiation follows Chou (1992) and Harshvardhan et al. (1987), respectively. An orographic statistics-based wave breaking mechanism (Kim and Arakawa 1995) is applied to the gravity wave drag scheme. Boundary layer physics employs a nonlocal diffusion scheme developed by Hong and Pan (1996). The fluxes in the surface layer are based on Monin–Obukhov similarity theory. The model also includes the two-layer soil model of Pan and Mahrt (1987) and Pan (1990).
A Mercator projection was used for the projection of the regional grid. The RSM resolution is 60 km, and the domain (109 × 72 grid points) is defined in Fig. 4. This configuration was adopted by Sun et al. (2005b). The Nordeste is 10 or more grids away from the lateral boundaries. This prevents possible noise generated at the lateral boundaries from excessively contaminating the solution over the Nordeste (Seth and Giorgi 1998). It is necessary to have the entire tropical Atlantic Ocean in the domain in order to resolve well the Atlantic intertropical convergence zone (ITCZ) by the RSM (Sun et al. 2005b). With the resolution of 60 km, the main topographical features are resolved by the RSM. The São Francisco valley runs roughly in a south–north direction, bounded on the west by the ranges of Serra Geral de Goiás and on the east by the ranges of Serra do Espinhaço and Chapada Diamantina. The smaller ranges of Serra Ibiapaba and Borborema in northeast Nordeste can also be identified in Fig. 4. These topographic features cannot be resolved in ECHAM4.5 AGCM at T42 resolution. Instead, the land rises smoothly from the coast to inland with the peak at (18°S, 46°W) (not shown).
The one-way nesting of the NCEP RSM into the ECHAM4.5 AGCM is done in a way that is different from conventional methods, which use global model results along the lateral boundary zone only. The perturbation nesting method used here allows the global model outputs to be used over the entire regional domain, not just in the lateral boundary zone. The dependent variables in the regional domain are defined as a summation of perturbation and base. The base is a time-dependent prediction from the AGCM. All other variability that cannot be predicted by the AGCM but can be resolved and predicted by the RSM in the regional domain is defined as perturbation. Nesting is done in such a way that the perturbation is nonzero inside the domain but zero outside the domain. Five prognostic variables from the global model outputs are used as the base fields in the RSM. They are zonal and meridional winds, temperature, humidity, and surface pressure. Perturbations are often concentrated in the smaller spatial scales. In some cases, the perturbations can be strong at larger spatial scales as well (e.g., Ward and Sun 2002). All diagnostic variables (e.g., precipitation) are generated by the regional model itself.
It is essential that an ensemble approach be taken. Ten AGCM ensemble members were generated by perturbing atmospheric initial conditions at the start date of January 1948. The AGCM ensemble multidecadal simulations were produced using the observed SSTs. The initial conditions for the AGCM predictions were taken from these continually updated AGCM simulations. The AGCM ensemble predictions were then produced using the predicted (persisted) SSTs. The NCEP RSM runs, driven by the ECHAM4.5 AGCM predictions, produced the downscaled forecasts.
An ensemble of ten nested RSM runs was computed for each SST scenario. Each ECHAM4.5 AGCM ensemble run was used to drive one RSM ensemble run. The nested RSM predictions using persisted SST anomalies were confined to the first forecast season due to the rapid loss of skill at longer leads, and predictions using forecast SST anomalies were generated for the upcoming four overlapping seasons (i.e., 6 months). A comparison between forecasts with different SST scenarios provided information on the sensitivity of the climate system to the evolving SST forcing and thus facilitated the interpretation of the model predictions.
c. Verification measure
The ranked probability skill score (RPSS) is widely used for probability forecast validation. The RPSS gives credit for forecasting the observed category with high probability and penalizes forecasting the wrong category with high probability. See the appendix for a derivation of the RPSS. According to its definition, the RPSS maximum value is 100%, which can only be obtained by consistently forecasting the observed category with a 100% probability. A score of zero implies no skill in the forecasts, which is the same score one would get by consistently issuing a forecast of climatology. For the three-category forecast, a forecast of climatology implies no information beyond the historically expected 33.3%–33.3%–33.3% probabilities. A negative score suggests that the forecasts are underperforming climatology. The skill for seasonal precipitation forecasts is generally modest. For example, IRI seasonal forecasts with zero-month lead for the period 1997–2000 scored 1.8% and 4.8%, using the RPSS, for the global and tropical (30°S–30°N) land areas, respectively (Wilks and Godfrey 2002).
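The properties listed above (100% for a perfect forecast, zero for climatology, negative values for forecasts worse than climatology) follow from the standard definition of the score. The Python sketch below uses that standard definition: the ranked probability score (RPS) is the sum of squared differences between cumulative forecast and cumulative observed probabilities, and RPSS = 1 − RPS_forecast/RPS_climatology. Aggregating by summing the RPS over forecasts before taking the ratio is an assumption of this sketch; the appendix definition takes precedence. Function names are hypothetical.

```python
def rps(probs, observed_category):
    """Ranked probability score for one forecast (0 is perfect)."""
    cum_forecast = 0.0
    cum_observed = 0.0
    score = 0.0
    for k, p in enumerate(probs):
        cum_forecast += p
        cum_observed += 1.0 if k == observed_category else 0.0
        score += (cum_forecast - cum_observed) ** 2
    return score

def rpss_percent(forecasts, observations, reference=(1 / 3, 1 / 3, 1 / 3)):
    """RPSS (%) of a set of forecasts relative to a climatological reference.

    Categories are indexed 0 (below), 1 (near), 2 (above normal).
    """
    rps_forecast = sum(rps(p, o) for p, o in zip(forecasts, observations))
    rps_reference = sum(rps(reference, o) for o in observations)
    return 100.0 * (1.0 - rps_forecast / rps_reference)
```

Because the score works on cumulative probabilities, a confident forecast that misses by two categories is penalized more heavily than one that misses by a single category.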
d. Model validation
An ensemble of 10 runs with the nested model system, using observed SSTs, was done for the period of January–June 1971–2000. Each ECHAM4.5 AGCM ensemble run was used to drive one RSM ensemble run. These long historical runs provided an estimate of potential predictability and characteristics of model climatology that are essential to interpreting the seasonal predictions from the models (Mason et al. 1999).
The RSM simulations captured the geographic distributions of observed rainfall. An example is shown in Fig. 5. Observations exhibit wetness in the northwest Nordeste. Rainfall generally decreases from northwest to southeast. Relatively high rainfall along the coast and in southern Ceará and western Paraíba is also observed. The RSM reproduces these characteristics. For instance, the RSM is able to generate relatively wet conditions in northern and southern Ceará and relatively dry conditions in central Ceará. The main weakness in the RSM simulations is that the rainfall maximum in northwest Nordeste is shifted southward compared with observations.
Temporal anomaly correlations were calculated between the observed and the RSM ensemble mean rainfall. Correlation scores higher than 0.3 (the 90% confidence level) are found over most of the Nordeste in the four overlapping seasons (Fig. 6). Low correlation scores are found mainly in the state of Bahia. Correlation scores are generally higher during the MAM and AMJ seasons than during the JFM and FMA seasons. The model generally captured the observed rainfall variability in the Nordeste. This may suggest that much of the interannual variability of Nordeste rainfall is associated with the global SST variations.
Probabilistic rainfall forecasts can be generated using ensemble mean contingency tables. An ensemble mean contingency table for a model grid for a season can be built by comparing the historical performance of the model to the observations according to tercile classifications in this study. An example is provided in Table 1. At this particular location and season, it showed that, for 60% of the years when the RSM indicated below-normal rainfall, the location was observed to receive below-normal rainfall. This means that 40% of the years when the RSM indicated below-normal rainfall, it was not observed to be below-normal rainfall. It indicated that 30% of those years the observed rainfall was near normal, and in 10% of the years the observed rainfall was actually above normal. Conditional errors can also be revealed by contingency tables. For instance, in Table 1, when the RSM produced near-normal rainfall, the location was observed to receive above-normal rainfall more frequently than near-normal rainfall.
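The way a contingency table turns a categorical model forecast into tercile probabilities can be sketched in a few lines of Python. The sketch assumes categories are coded 0 (below), 1 (near), and 2 (above normal), and it falls back to climatological probabilities when a forecast category never occurred in the historical record; that fallback and the function names are assumptions made for this example.

```python
def contingency_table(forecast_cats, observed_cats, n_cat=3):
    """counts[f][o]: number of years with model category f and observed category o."""
    counts = [[0] * n_cat for _ in range(n_cat)]
    for f, o in zip(forecast_cats, observed_cats):
        counts[f][o] += 1
    return counts

def tercile_probabilities(counts, forecast_cat, n_cat=3):
    """Observed-category frequencies, given the model's forecast category."""
    row = counts[forecast_cat]
    total = sum(row)
    if total == 0:
        return [1.0 / n_cat] * n_cat  # assumption: revert to climatology
    return [c / total for c in row]
```

If the model indicated below-normal rainfall in 10 years, of which 6 were observed below, 3 near, and 1 above normal, the probabilities issued for a new below-normal model forecast are 60%, 30%, and 10%.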
Probabilistic rainfall hindcasts by the RSM for 1971–2000 provided an estimate of predictability that could be achieved given perfect SST forecasts. Hindcasts were expressed in probabilities in three equiprobable categories (i.e., above, near, and below normal). Probabilistic rainfall hindcasts were generated using ensemble mean contingency tables. A cross-validation approach was applied to generate contingency tables. A block of 3 yr was held out as independent data, and the tercile boundaries were redefined for each set of 3 yr removed from the tercile definition. For the 27 yr remaining, the tercile was based on 9–9–9 instead of 10–10–10, and a contingency table was built using these 27 yr. Then we generated the probabilistic hindcasts for the 3 yr left out. Then the next 3 yr would be left out and new tercile boundaries would be formed, new contingency tables would be generated, and probabilistic hindcasts for those 3 yr left out were generated. Cross-validation ensured that observations from the forecast period did not directly influence forecasts, while allowing us to make efficient use of limited data (Hansen and Indeje 2004).
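The cross-validation procedure can be sketched as follows in Python for a single grid point and season. The sketch assumes that tercile boundaries are placed midway between the adjacent sorted values (so that 27 distinct training years split 9–9–9) and that the same training years define the boundaries for both the model and the observations; these choices and all names are assumptions, not details given in the text.

```python
def tercile_bounds(values):
    """Lower and upper tercile boundaries (midpoints between sorted neighbors)."""
    s = sorted(values)
    n = len(s) // 3
    return 0.5 * (s[n - 1] + s[n]), 0.5 * (s[-n - 1] + s[-n])

def categorize(value, bounds):
    """0 = below, 1 = near, 2 = above normal."""
    lower, upper = bounds
    if value < lower:
        return 0
    if value > upper:
        return 2
    return 1

def cross_validated_hindcasts(model, obs, block=3):
    """Tercile probabilities for each year, built without that year's block."""
    n_years = len(model)
    hindcasts = [None] * n_years
    for start in range(0, n_years, block):
        held_out = range(start, min(start + block, n_years))
        train = [i for i in range(n_years) if i not in held_out]
        # Redefine the terciles from the remaining years only.
        m_bounds = tercile_bounds([model[i] for i in train])
        o_bounds = tercile_bounds([obs[i] for i in train])
        counts = [[0, 0, 0] for _ in range(3)]
        for i in train:
            counts[categorize(model[i], m_bounds)][categorize(obs[i], o_bounds)] += 1
        # Issue probabilistic hindcasts for the years left out.
        for i in held_out:
            row = counts[categorize(model[i], m_bounds)]
            total = sum(row)
            hindcasts[i] = [c / total for c in row] if total else [1 / 3] * 3
    return hindcasts
```

With 30 years and blocks of 3 yr, the loop runs 10 times, and each contingency table is built from 27 yr.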
The RPSS for the RSM 30-yr hindcasts was calculated. As shown in Fig. 7, positive skill was found over many locations in the Nordeste in all four seasons. The skill was moderate during the JFM season. Good skill (>10%) was found mainly in Piauí and western Ceará during the FMA season, and in northern Nordeste during the MAM and AMJ seasons. Further analysis indicated that the RPSS was often negative for the hindcasts favoring the near-normal category, implying that the models underperformed climatology in forecasts favoring the near-normal category.
The influence of ENSO on the variability of the Nordeste rainfall has been widely studied (Moura and Hastenrath 2004; Ropelewski and Halpert 1996; Ward and Folland 1991). The RPSS was averaged over five strong El Niño years (i.e., 1973, 1983, 1987, 1992, and 1998) and five strong La Niña years (i.e., 1974, 1985, 1989, 1999, and 2000) during 1971–2000 (Fig. 8). We used the Niño-3.4 index to choose these ENSO years. During the ENSO years, areas with high skill (RPSS >20%) increased markedly, implying that RSM skill was enhanced in these areas. At the same time, areas with negative skill also increased. Whether the RSM skill increases or decreases during ENSO years therefore depends on the location within the Nordeste, and it is difficult to draw a conclusion about the influence of ENSO on RSM skill over the Nordeste as a whole.
e. Statistical postprocessing
The raw output of the nested model system may have systematic or conditional errors. Statistical analysis needs to be performed on model outputs to correct model biases. Several approaches used for the IRI global prediction system (Mason et al. 1999) are applied to this downscaling prediction system.
1) Ensemble means
The simplest form of presentation of ensemble forecasts is ensemble mean rainfall anomalies. Maps of the ensemble mean rainfall forecasts expressed in terms of terciles together with information based on the historical performance of the models are also produced. We mask out the forecasts where the RSM has no skill, as measured by the Heidke score (≤0). The Heidke score is a commonly used categorical verification score, measuring categorical matches between forecasts and observations (Barnston 1992; Heidke 1926). Since the Heidke score varies with forecast tercile and season, the mask is also a function of the forecast category and season. An example is provided in Fig. 9. The spatial distribution of rainfall anomalies and rainfall categories indicates that the downscaling prediction system is able to produce localized rainfall variability.
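The masking rule can be illustrated with the commonly used form of the Heidke score, (H − E)/(T − E), where H is the number of categorical hits, T the number of forecasts, and E the number of hits expected by chance (T/3 for terciles). The Python sketch below assumes this form, applied to a single grid point, category, and season; it is not necessarily the exact variant used operationally, and the function names are hypothetical.

```python
def heidke_score(hits, total, n_categories=3):
    """(H - E) / (T - E), with E the number of hits expected by chance."""
    expected = total / n_categories
    return (hits - expected) / (total - expected)

def is_masked(heidke):
    """True where the forecast is masked out because the model shows no skill."""
    return heidke <= 0
```

A score of 1 corresponds to a hit every time, and a score of 0 to the hit rate expected by chance, so the mask removes grid points where the model does no better than chance.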
To generate tercile probabilistic forecasts, we first classified predicted rainfall anomalies into below-normal, near-normal, or above-normal categories and then used the ensemble mean contingency tables to generate tercile probabilities. Note that the ensemble mean contingency tables are based on the period 1971–2000 and not updated after every season. As discussed in section 3d, ensemble mean contingency tables represented the historical performance of the model to the observations according to tercile classifications. An example is provided in Fig. 10, showing the spatial probabilities associated with below-normal, near-normal, and above-normal probability predictions for FMA 2004 using the persisted SSTA scenario.
2) Ensemble spreads
Examination of the ensemble spreads of the RSM hindcasts shows that more RSM ensemble members usually fall into the same category as observations than would be expected by chance, particularly for the two outer categories. This suggests that the RSM response to the SST forcing is generally robust, and useful information exists in the ensemble spreads. Tercile probabilistic forecasts can also be generated by calculating the percentages of RSM ensemble members that fall within the below-, near-, and above-normal terciles with respect to the model’s own historical distribution. An example is provided in Fig. 11, showing rainfall probability prediction for FMA 2004 from January 2004, using the persisted SSTA scenario.
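Counting ensemble members per tercile can be sketched as follows in Python. The tercile boundaries passed in are assumed to come from the model's own historical distribution at the grid point, and members falling exactly on a boundary are counted as near normal; both the names and the boundary convention are assumptions of this example.

```python
def ensemble_tercile_probabilities(members, lower, upper):
    """Fractions of ensemble members below, within, and above the model terciles."""
    n = len(members)
    below = sum(1 for m in members if m < lower)
    above = sum(1 for m in members if m > upper)
    return [below / n, (n - below - above) / n, above / n]
```

With a 10-member ensemble, the resulting probabilities are multiples of 10%, which is one reason small ensembles give only a coarse estimate of the forecast distribution.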
Forecast probabilities based on ensemble spread information could be corrected based on the forecasts versus observed distributions. The method of correcting systematic errors differs from the contingency table method in that characteristics of the forecast ensemble distribution are used, as opposed to just the ensemble mean. For example, suppose all the cases in which the forecast had just one out of 10 ensemble members in the above-normal tercile were collected and examined, and suppose the observations showed that the above normal tercile occurred 25% of the time. Then the implied above-normal tercile forecast probability of 10% from forecasts of this type would be calibrated upward accordingly. This is a more refined approach, but it has a weakness of being based on too few cases having each of the many possible probability types. The contingency table method is less refined but is more assured of getting sufficient sample sizes for each of the three forecast terciles.
At this time, no adjustment to the forecast probabilities based on ensemble spreads is made because of the small sample size. The uncorrected forecast probabilities may not be reliable, and thus the probabilities derived from ensemble spread information are not used in the final forecast product.
3) Forecast product
The forecast product is expressed in terms of probabilities of the respective season’s rainfall being in the driest third, in the wettest third of years, and in the third centered upon the climatological median. The ensemble mean contingency tables are used to generate tercile probabilities. At this time, the tercile probabilities using ensemble spread information are not used.
For one-month lead forecasts, rainfall forecasts with the persisted SST scenario and rainfall forecasts with the predicted SST scenario are both available. We usually give equal weight to both forecasts. For two-month or longer lead forecasts, only forecasts with the predicted SST scenario are available. The following corrections are applied to the RSM probabilistic rainfall forecasts to generate the final forecast product: i) probabilities for the near-normal category cannot be higher than 40% because of the poor forecast skill for this tercile and ii) a climatological forecast is used where the RSM has no skill, as measured by the Heidke score.
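The combination and correction rules can be sketched as follows in Python for one grid point. The text does not say how probability removed from the near-normal category is redistributed; splitting the excess equally between the two outer categories is purely an assumption of this sketch, as are the function and argument names and the use of a single Heidke value.

```python
CLIMATOLOGY = [1 / 3, 1 / 3, 1 / 3]

def final_forecast(persisted, predicted, heidke, near_cap=0.40):
    """Final tercile probabilities [below, near, above] at one grid point.

    persisted: probabilities from the persisted-SST run, or None at longer leads.
    predicted: probabilities from the predicted-SST run.
    heidke: historical Heidke score for this point, category, and season.
    """
    if heidke <= 0:
        return list(CLIMATOLOGY)  # rule ii: no skill, issue climatology
    if persisted is None:
        probs = list(predicted)
    else:
        # Equal weight to the two SST scenarios.
        probs = [0.5 * (a + b) for a, b in zip(persisted, predicted)]
    excess = probs[1] - near_cap
    if excess > 0:
        # Rule i: cap near normal; equal redistribution is an assumption.
        probs = [probs[0] + excess / 2, near_cap, probs[2] + excess / 2]
    return probs
```

For a one-month lead, both scenario forecasts are passed in; for longer leads, `persisted` is `None` and the predicted-SST forecast is used alone.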
An example is provided in Fig. 12, showing the downscaled forecasts for FMA 2004. The downscaled forecasts often give a different probability distribution from that of the driving global model forecasts. For example, enhanced probabilities for below-normal rainfall along the north coast and for above-normal rainfall in southern and western Nordeste were forecast by the downscaling prediction system (Fig. 12). The global model forecasts, used to drive the downscaling forecasts, indicated enhanced probabilities for above-normal rainfall in the whole Nordeste (see the Web site online at http://iri.columbia.edu/forecast/climate/index.html). The difference is partially attributed to the differing abilities of the two models to resolve local rainfall variability. Observations indicate that rainfall variance at sub-GCM scales can account for about 50% of the total rainfall variance, and the regional model is able to resolve some sub-GCM scale rainfall variance (Sun et al. 2005b).
4. Downscaled forecast validation
For each of the three years (2002–04), downscaled forecasts were first issued in December for the period of January–June, forecasts were updated monthly, and the last downscaled forecasts were issued in March for the period of April–June. Therefore, 12 downscaled forecasts were made during 2002–04. The downscaled forecasts were verified with respect to the high-resolution observations. [The observed rainfall categories were posted alongside the forecasts on the forecast pages at the Web site: http://www.funceme.br/DEMET/Index.htm (click on “Projecto Downscaling”).] Figure 12 is an example of such a display.
Quantitative verification of the probabilistic forecasts was made using the RPSS. Figure 13 presents the average RPSS for one-month lead forecasts. One may choose to assess only those locations and times for which a deliberate forecast was made (Fig. 13b) or one may choose to assess all points at all times, including the climatological forecasts (Fig. 13a). Positive RPSS is shown over a majority of the land areas. Comparison of Figs. 13a and 13b shows that on average higher RPSS is obtained when the climatological forecasts are excluded, implying that for regions containing nonclimatological forecasts the category having the highest forecast probability was observed more often than one-third of the time. Note that in Fig. 13b, local scores may be based on a reduced sample of forecasts, and thus may be subject to high sampling variability.
Forecast reliability is summarized in Table 2. Perfect reliability would appear as diagonal elements, from upper left to lower right, equal to the stated confidence level. Table 2 suggests that the forecast probabilities were approximately reliable at the 40% and 50% confidence levels, and overconfident at the 60% or higher confidence level. Most of the diagonal elements in Tables 2a–c are in the range of 45%–50%. This may suggest that we should issue forecasts at one confidence level in the future. Table 2 also indicates that the observations were more frequently below normal than above normal for the period of 2002–04 in the Nordeste.
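A reliability table of this kind can be sketched as a row-normalized contingency table. The following Python fragment is an illustration only: it assumes that, for one confidence level, each forecast has been reduced to the category given the highest probability and paired with the observed category. The function name and data layout are hypothetical and are not taken from the construction of Table 2.

```python
from collections import Counter

CATEGORIES = ("below", "near", "above")


def reliability_table(pairs):
    """Return observed frequencies (%) for each forecast category.

    pairs: iterable of (forecast_category, observed_category) tuples, all
    issued at the same confidence level (assumed input layout).
    Rows are forecast categories, columns are observed categories; a row
    with no forecasts is filled with None.
    """
    counts = Counter(pairs)
    table = {}
    for fcst in CATEGORIES:
        row_total = sum(counts[(fcst, obs)] for obs in CATEGORIES)
        table[fcst] = {
            obs: (100.0 * counts[(fcst, obs)] / row_total if row_total else None)
            for obs in CATEGORIES
        }
    return table
```

With perfectly reliable forecasts issued at, say, the 50% confidence level, the diagonal entries `table[c][c]` would each be close to 50.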
Forecast skill varied with season. Examination of the skill among the four seasons indicated that the forecast skill was generally higher for the MAM and AMJ seasons than for the JFM and FMA seasons (Fig. 14). High positive scores were found mainly over the central Nordeste during the JFM and FMA seasons and over many locations in the Nordeste during the MAM and AMJ seasons. Positive scores were also retained for the MAM and AMJ seasons as the lead time increased. For instance, the average score over the Nordeste was 18.3% for one-month lead forecasts and 13.7% for three-month lead forecasts.
Since the SST was the only external forcing for this prediction system, the predicted SSTA was also verified against the observations. The predicted SSTA patterns were generally in agreement with the observations in the MAM and AMJ seasons, and relatively large biases were found over the tropical Atlantic Ocean in some JFM and FMA seasons. Table 3 lists the predicted and observed values of the Atlantic SSTA dipole. To estimate the impact of SST biases, hindcasts using observed SSTs for the period 2002–04 were performed. Both forecast scores and hindcast scores were aggregated for the whole Nordeste in Table 4. The forecasts were almost as good as the hindcasts for the MAM and AMJ seasons. The forecast scores were much lower than the hindcast scores in FMA 2002 and 2003. A strong connection between the Nordeste rainfall and the tropical Atlantic SSTA dipole during the rainy season was confirmed in previous studies (e.g., Servain 1991). We speculate that the large biases of the predicted Atlantic SSTA dipole may have resulted in the lower forecast scores for FMA 2002 and 2003. The JFM hindcasts scored negative, implying that the SST impact on JFM rainfall was weak during 2002–04.
To examine the added value of the RSM forecasts, a skill comparison between the downscaled forecasts and the driving global model forecasts was performed. The ECHAM4.5 AGCM probability forecasts were generated using the same methods as the RSM forecasts. We linearly interpolated the ECHAM4.5 AGCM probabilistic forecasts onto the RSM grid in order to calculate the RPSS using the same high-resolution observations. The AGCM forecast scores were aggregated for the whole Nordeste in Table 5. The scores of the RSM forecasts were higher than those of the driving global model forecasts for most seasons, implying that the smaller-spatial-scale rainfall generated by the RSM is skillful.
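The interpolation step can be sketched as follows. The text states only that the interpolation was linear; the bilinear form in latitude and longitude used here, the function name, and the grid values in the usage note are assumptions made for illustration.

```python
from bisect import bisect_right


def bilinear(lats, lons, field, lat, lon):
    """Bilinearly interpolate a coarse-grid field to one fine-grid point.

    lats, lons: ascending coarse-grid coordinates (at least two each).
    field[i][j]: value at (lats[i], lons[j]), e.g. one tercile probability.
    Points outside the coarse grid are linearly extrapolated from the
    nearest grid cell.
    """
    # Index of the lower-left corner of the enclosing cell, clamped to the grid.
    i = min(max(bisect_right(lats, lat) - 1, 0), len(lats) - 2)
    j = min(max(bisect_right(lons, lon) - 1, 0), len(lons) - 2)

    # Fractional position of the target point inside the cell.
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])

    # Interpolate along longitude on both latitude rows, then along latitude.
    row_lo = field[i][j] * (1 - tx) + field[i][j + 1] * tx
    row_hi = field[i + 1][j] * (1 - tx) + field[i + 1][j + 1] * tx
    return row_lo * (1 - ty) + row_hi * ty
```

Applied separately to the below-, near-, and above-normal probability fields, this kind of interpolation keeps the three probabilities summing to one at each fine-grid point, because each interpolated value is a weighted average with the same weights. For instance, with hypothetical coordinates `lats = [-10.0, -7.0]`, `lons = [-42.0, -39.0]` and `field = [[0.2, 0.4], [0.6, 0.8]]`, the value at the cell centre (-8.5, -40.5) is approximately 0.5.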
The IRI and FUNCEME have developed a dynamical downscaling prediction system for the Nordeste, Brazil, and have been issuing real-time seasonal forecasts since December 2001. This is probably the first wholly operational climate dynamical downscaling prediction system. The NCEP RSM with a resolution of 60 km and ECHAM4.5 AGCM (T42) are the core of this prediction system. This is a two-tiered prediction system: SST forecasts are produced first, which then serve as the lower boundary condition forcing for the ECHAM4.5 AGCM–NCEP RSM nested system.
A new observed rainfall dataset was constructed for the Nordeste using an anomaly approach. The monthly climatology was first derived from the CRU05 precipitation climatology using spatial bilinear interpolation. The monthly anomalies were then generated using the station data. Anomaly interpolation from stations to 60 km × 60 km grids on a Mercator projection was done using the ADW interpolation scheme. This new dataset represents the observed spatial and temporal variability of rainfall in the Nordeste well.
Hindcast validation has been done for January–June 1971–2000 using ensemble runs forced by observed SSTs. Rainfall in the RSM hindcasts is significantly correlated to the observations. The skill of the RSM hindcasts, measured by the ranked probability skill score, is generally moderate for JFM and FMA seasons and good for MAM and AMJ seasons.
Two SST scenarios are used. The first SST scenario persists the observed SST anomalies from the most recently completed calendar month and adds them to the observed seasonal cycle. Dynamical predictions using persisted SST anomalies are run only one season into the future. The second SST scenario consists of the predicted SST anomalies for the upcoming six months. A mix of dynamical and statistical models has been used to construct the SST predictions, varying by tropical ocean basin, and damped persistence with a 3-month e-folding time has been used for the extratropical oceans.
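The persisted scenario and the extratropical damped persistence can be sketched in a few lines of Python for a single grid point. The function names and the convention that lead time is counted in months from the last observed anomaly are assumptions; the exponential form follows from the usual meaning of an e-folding time.

```python
import math


def persisted_sst(last_anomaly, seasonal_cycle):
    """Scenario 1: add the latest observed monthly anomaly to the
    climatological SST of each forecast month (values in degrees C)."""
    return [clim + last_anomaly for clim in seasonal_cycle]


def damped_persistence(last_anomaly, lead_months, efold_months=3.0):
    """Anomaly decayed exponentially toward zero; after `efold_months`
    it is reduced to 1/e of its initial value."""
    return last_anomaly * math.exp(-lead_months / efold_months)
```

Under these assumptions, an initial anomaly of 1.0 °C decays to about 0.37 °C at a lead of 3 months and to about 0.14 °C at a lead of 6 months.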
Dynamically downscaled forecasts during 2002–04 have been validated using the ranked probability skill score. The overall rainfall forecast skill is positive over a majority of the Nordeste. The highest skill is found for the MAM and AMJ seasons, and this skill hardly decreases with lead time up to 3 months. Low scores for the FMA season may be attributed to the biases in the SST predictions over the tropical Atlantic Ocean. The downscaled forecasts show higher skill than the driving global model forecasts, particularly for the MAM and AMJ seasons. Skill scores are based on 12 forecasts, with overlap among those of the same year. Thus, the results here may be subject to high sampling variability. More reliable skill estimates should be obtained with larger forecast samples.
The dynamical downscaling prediction system is continuously evolving. A new forecast product, a crop weather index, will be issued in the future. The crop weather index uses the daily rainfall time series to measure the severity of drought and flooding conditions. Sun et al. (2005a, manuscript submitted to J. Appl. Meteor. Climatol.) have demonstrated that crop yields in the rain-fed agriculture region of Ceará are highly related to the crop weather index, and that the downscaling prediction system is skillful in predicting this index. The CPTEC AGCM and the Regional Atmospheric Modeling System (RAMS) developed at Colorado State University (Liston and Pielke 2000) will also be added to this downscaling prediction system, and multimodel ensembling methods will be implemented to consolidate the downscaling forecasts in the near future.
Thanks are due to X. Gong for running the AGCM forecasts and to W. L. B. Melciades for performing the RSM integrations. The authors acknowledge the enlightening discussion with A. G. Barnston, A. Robertson, and L. Goddard. The authors also appreciate the thoughtful comments of four anonymous reviewers.
The Ranked Probability Skill Score
The ranked probability skill score (RPSS) measures the cumulative squared error between the categorical forecast probabilities and the observed category relative to some reference forecast (Epstein 1969; Murphy 1971; Wilks 1995). The most widely used reference strategy is that of climatology. The RPSS is defined as

$$\mathrm{RPSS} = 1 - \frac{\mathrm{RPS}_{\mathrm{fcst}}}{\mathrm{RPS}_{\mathrm{ref}}},$$

with

$$\mathrm{RPS}_{\mathrm{fcst}} = \sum_{i=1}^{N}\left(\sum_{j=1}^{i} f_j - \sum_{j=1}^{i} o_j\right)^{2}, \qquad \mathrm{RPS}_{\mathrm{ref}} = \sum_{i=1}^{N}\left(\sum_{j=1}^{i} r_j - \sum_{j=1}^{i} o_j\right)^{2},$$

where $N = 3$ for tercile forecasts and $f_j$, $r_j$, and $o_j$ are the forecast probability, reference forecast probability, and observed probability for category $j$, respectively. The probability distribution of the observation is 100% for the category that was observed and is 0 for the other two categories. The reference forecast of climatology is assigned 33.3% for each of the tercile categories. $\mathrm{RPS}_{\mathrm{fcst}}$ ($\mathrm{RPS}_{\mathrm{ref}}$) is the sum of squared differences between the cumulative probability distributions of the forecast (reference forecast) and the observation.
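The definition can be made concrete with a short Python sketch. The function names are illustrative, and summing the RPS over all forecasts before taking the ratio is an assumption about how scores are aggregated, since the text does not specify the averaging procedure.

```python
from itertools import accumulate


def rps(probs, observed_index):
    """Ranked probability score for one forecast.

    probs: category probabilities (below, near, above) summing to 1.
    observed_index: index of the category that was observed.
    """
    obs = [1.0 if j == observed_index else 0.0 for j in range(len(probs))]
    cum_fcst = accumulate(probs)  # cumulative forecast distribution
    cum_obs = accumulate(obs)     # cumulative observed distribution
    return sum((f - o) ** 2 for f, o in zip(cum_fcst, cum_obs))


def rpss(forecasts, observed, reference=(1 / 3, 1 / 3, 1 / 3)):
    """Skill relative to a reference forecast (climatology by default).

    Assumption: RPS values are summed over all forecasts before the ratio
    is taken. At least one forecast is required.
    """
    rps_fcst = sum(rps(f, o) for f, o in zip(forecasts, observed))
    rps_ref = sum(rps(reference, o) for o in observed)
    return 1.0 - rps_fcst / rps_ref
```

A forecast that assigns 100% to the observed category scores 1, a climatological forecast scores 0, and forecasts worse than climatology score below 0. For example, a single forecast of (0.5, 0.3, 0.2) verified against an observed below-normal season gives an RPS of 0.29 against a climatological RPS of 5/9, so the RPSS is about 0.48.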
Corresponding author address: Dr. Liqiang Sun, International Research Institute for Climate Prediction, Columbia University, Palisades, NY 10964. Email: email@example.com