  • Atger, F., 2003: Spatial and interannual variability of the reliability of ensemble-based probabilistic forecasts: Consequences for calibration. Mon. Wea. Rev., 131, 1509–1523.
  • Bosart, L. F., 2003: Whither the weather analysis and forecasting process? Wea. Forecasting, 18, 520–529.
  • Bourke, W., R. Buizza, and M. Naughton, 2004: Performance of the ECMWF and the BoM ensemble prediction systems in the Southern Hemisphere. Mon. Wea. Rev., 132, 2338–2357.
  • Brier, G. W., 1950: Verification of forecasts expressed in terms of probability. Mon. Wea. Rev., 78, 1–3.
  • Buizza, R., P. L. Houtekamer, Z. Toth, G. Pellerin, M. Wei, and Y. Zhu, 2005: A comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems. Mon. Wea. Rev., 133, 1076–1097.
  • Chessa, P. A., and F. Lalaurette, 2001: Verification of the ECMWF ensemble prediction system forecasts: A study of large-scale patterns. Wea. Forecasting, 16, 611–619.
  • Cui, B., Z. Toth, Y. Zhu, D. Hou, and S. Beauregard, 2005: Statistical post-processing of operational and CDC hindcast ensembles. Preprints, 21st Conf. on Weather Analysis and Forecasting/17th Conf. on Numerical Weather Prediction, Washington, DC, Amer. Meteor. Soc., CD-ROM, 12B.2.
  • Ebert, E. E., 2001: Ability of a poor man’s ensemble to predict the probability and distribution of precipitation. Mon. Wea. Rev., 129, 2461–2480.
  • Ebert, E. E., U. Damrath, W. Wergen, and M. E. Baldwin, 2003: The WGNE assessment of short-term quantitative precipitation forecasts. Bull. Amer. Meteor. Soc., 84, 481–492.
  • Eckel, F. A., and C. F. Mass, 2005: Aspects of effective mesoscale, short-range ensemble forecasting. Wea. Forecasting, 20, 328–350.
  • Gneiting, T., A. E. Raftery, A. H. Westveld III, and T. Goldman, 2005: Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Mon. Wea. Rev., 133, 1098–1118.
  • Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Wea. Rev., 129, 550–560.
  • Hamill, T. M., and S. J. Colucci, 1998: Evaluation of Eta–RSM ensemble probabilistic precipitation forecasts. Mon. Wea. Rev., 126, 711–724.
  • Hamill, T. M., J. S. Whitaker, and X. Wei, 2004: Ensemble reforecasting: Improving medium-range forecast skill using retrospective forecasts. Mon. Wea. Rev., 132, 1434–1447.
  • Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. Bull. Amer. Meteor. Soc., 77, 437–471.
  • Kharin, V. V., and F. W. Zwiers, 2003: On the ROC score of probability forecasts. J. Climate, 16, 4145–4150.
  • Legg, T. P., and K. R. Mylne, 2004: Early warnings of severe weather from ensemble forecast information. Wea. Forecasting, 19, 891–906.
  • Leith, C. E., 1974: Theoretical skill of Monte Carlo forecasts. Mon. Wea. Rev., 102, 409–418.
  • Mason, I., 1982: A model for assessment of weather forecasts. Aust. Meteor. Mag., 30, 291–303.
  • Mullen, S. L., and R. Buizza, 2002: The impact of horizontal resolution and ensemble size on probabilistic forecasts of precipitation by the ECMWF Ensemble Prediction System. Wea. Forecasting, 17, 173–191.
  • Murphy, A. H., 1973: A new vector partition of the probability score. J. Appl. Meteor., 12, 595–600.
  • Richardson, D. S., 2001: Measures of skill and value of ensemble prediction systems, their interrelationship and the effect of ensemble size. Quart. J. Roy. Meteor. Soc., 127, 2473–2489.
  • Roebber, P. J., D. M. Schultz, B. A. Colle, and D. J. Stensrud, 2004: Toward improved prediction: High-resolution and ensemble modeling systems in operations. Wea. Forecasting, 19, 936–949.
  • Ryan, R. T., 2003: Digital forecasts—Communication, public understanding, and decision making. Bull. Amer. Meteor. Soc., 84, 1001–1003.
  • Scherrer, S. C., C. Appenzeller, P. Eckert, and D. Cattani, 2004: Analysis of the spread–skill relations using the ECMWF Ensemble Prediction System over Europe. Wea. Forecasting, 19, 552–565.
  • Stratton, R. A., 1999: A high resolution AMIP integration using the Hadley Centre model HadAM2b. Climate Dyn., 15, 9–28.
  • Szunyogh, I., and Z. Toth, 2002: The effect of increased horizontal resolution on the NCEP global ensemble mean forecasts. Mon. Wea. Rev., 130, 1125–1143.
  • Talagrand, O., R. Vautard, and B. Strauss, 1997: Evaluation of probabilistic prediction systems. Proc. Workshop on Predictability, Shinfield Park, Reading, United Kingdom, ECMWF, 1–26.
  • Tennant, W. J., 2003: An assessment of intraseasonal variability from 13-yr GCM simulations. Mon. Wea. Rev., 131, 1975–1991.
  • Toth, Z., and E. Kalnay, 1993: Ensemble forecasting at NMC: The generation of perturbations. Bull. Amer. Meteor. Soc., 74, 2317–2330.
  • Toth, Z., and E. Kalnay, 1997: Ensemble forecasting at NCEP and the breeding method. Mon. Wea. Rev., 125, 3297–3319.
  • Toth, Z., Y. Zhu, and T. Marchok, 2001: The use of ensembles to identify forecasts with small and large uncertainty. Wea. Forecasting, 16, 436–477.
  • Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences: An Introduction. Academic Press, 467 pp.

  • Fig. 1. Location of temperature (large dots) and rainfall (small dots) stations in South Africa, in relation to the NCEP EFS model grid boxes.
  • Fig. 2. Examples of week-2 temperature and wind probabilistic forecasts generated directly from the NCEP EFS data. Forecasts of warmer than average maximum temperatures and normal minimum temperatures for Johannesburg, and a higher incidence of easterly winds with overall lower wind speeds relative to climate for East London, are indicated in these products.
  • Fig. 3. RMSE of 0000 UTC 500-hPa geopotential height forecasts against forecast lead time averaged over the period Jan 2001–Dec 2004 for the domain 0°–60°S and 30°W–60°E. Graphs for high-resolution control, low-resolution control, best ensemble member for each forecast case, ensemble mean, climatology, and the ensemble spread are defined in the legend.
  • Fig. 4. ACC of 500-hPa geopotential height forecasts against forecast lead time averaged over the period Jan 2001–Dec 2004 for the (left) 60°S, 30°W–10°S, 60°E and (right) 37.5°S, 10°E–17.5°S, 40°E domains. Graphs for high-resolution control, ensemble average, and best ensemble member for each forecast case are defined in the legend. Bias-corrected scores are indicated by BC.
  • Fig. 5. Standard deviation of 0000 UTC 500-hPa geopotential height forecast fields with the time mean removed for the period Jan 2001–Dec 2004, showing high-resolution control, low-resolution control, ensemble average, and mean of the perturbed ensemble members indicated in the legend. Bias-corrected scores are indicated by BC.
  • Fig. 6. Average bias of 0000 UTC 500-hPa geopotential height forecast fields for the period Jan 2001–Dec 2004, showing high-resolution control, low-resolution control, and ensemble average before and after bias correction (indicated by BC in the legend).
  • Fig. 7. Spatial difference of (left) the average of the perturbed ensemble members minus the low-resolution control analysis and (right) 24-h forecast for (top) 500-hPa geopotential height and (bottom) sea level pressure. Negative contours are stippled.
  • Fig. 8. Spatial bias of 500-hPa height forecasts (left) before and (right) after bias correction for (top) week 1 (forecast days 1–7) and (bottom) week 2 (forecast days 8–14) lead times. Negative contours are stippled. Bias (un)corrected maps have a contour interval of (2) 0.5 gpm.
  • Fig. 9. Annual average rainfall for the period Jul 2000–Jun 2005 for (top left) observed gridded rainfall, (top right) ensemble average 5-day forecast rainfall as a percentage of observed, (bottom left) high-resolution 5-day control forecast, and (bottom right) low-resolution 5-day control forecast.
  • Fig. 10. Summer rainfall area (27.5°S, 30°E) (top) bias score and (bottom) equitable threat score as a function of rain threshold for forecast lead times of (left) 5, (middle) 10, and (right) 15 days for high- and low-resolution controls, ensemble probability (>50% taken as a categorical yes forecast), and frequency-adjusted high-resolution control and ensemble probabilities.
  • Fig. 11. As in Fig. 10 but for the winter rainfall area (35°S, 17.5°E).
  • Fig. 12. Talagrand diagram for NCEP 5-day forecasts (all ensemble members) for the (left) summer and (right) winter rainfall areas of South Africa for the period Jul 2000–Jun 2005. The gray line shows a perfect distribution.
  • Fig. 13. BSS decomposition and ROC of forecasts of the probability of the 850–500-hPa-thickness field dropping below 4200 gpm at grid point 35°S, 17.5°E for the May–Sep months of 2004–05. Solid lines represent standard output, dashed line the calibrated 850- and 500-hPa height fields, and dotted line the frequency-adjusted probabilities.
  • Fig. 14. As in Fig. 13 but for 4100-gpm thickness.
  • Fig. 15. The 7-day 500-hPa geopotential height forecast at 1200 UTC 6 Nov 2005 issued by (top left) ECMWF high-resolution control, (top right) NCEP GFS high-resolution control, (bottom left) spaghetti diagram of 5700-gpm contour of NCEP 23-member ensemble suite initialized at 0000 and 1200 UTC 31 Oct 2005, and (bottom right) NCEP GFS analysis for 1200 UTC 6 Nov 2005. Solid (dashed) boldface line denotes the 0000 UTC (1200 UTC) high-resolution control forecasts on 31 Oct 2005.

Application of the NCEP Ensemble Prediction System to Medium-Range Forecasting in South Africa: New Products, Benefits, and Challenges

  • 1 South African Weather Service, Pretoria, South Africa
  • 2 National Centers for Environmental Prediction, Camp Springs, Maryland
  • 3 South African Weather Service, Pretoria, South Africa

Abstract

The National Centers for Environmental Prediction (NCEP) Ensemble Forecasting System (EFS) is used operationally in South Africa for medium-range forecasts up to 14 days ahead. The use of model-generated probability forecasts has a clear benefit in the skill of the 1–7-day forecasts. This is seen in the forecast probability distribution being more successful in spanning the observed space than a single deterministic forecast and, thus, substantially reducing the instances of missed events in the forecast. In addition, the probability forecasts generated using the EFS are particularly useful in estimating confidence in forecasts. During the second week of the forecast the EFS is used as a heads-up for possible synoptic-scale events and also for predicting average weather conditions and probability density distributions of some elements such as maximum temperature and wind. This paper assesses the medium-range forecast process and the application of the NCEP EFS at the South African Weather Service. It includes a description of the various medium-range products, adaptive bias-correction methods applied to the forecasts, verification of the forecast products, and a discussion on the various challenges that face researchers and forecasters alike.

Corresponding author address: Warren J. Tennant, South African Weather Service, Private Bag X097, Pretoria 0001, South Africa. Email: warren.tennant@weathersa.co.za

1. Introduction

Weather forecasts have potential use at a variety of space and time scales. As a public weather forecast service, the South African Weather Service (SAWS) is tasked to provide a comprehensive forecast service from a few hours ahead through all scales up to several seasons ahead. The medium range (3–14 days) is particularly popular across a number of sectors, and thus considerable effort has been invested in improving forecasts for this time scale. To this end the National Centers for Environmental Prediction (NCEP) Ensemble Forecasting System (EFS; Toth and Kalnay 1997; Toth et al. 2001; Buizza et al. 2005) is used operationally in South Africa for medium-range forecasts up to 14 days ahead.

Ensemble methods (Leith 1974) are considered to be an effective way to estimate the probability density function of future states of the atmosphere by addressing uncertainties present in initial conditions and in model approximations. Notwithstanding, biases remain in these forecast distributions, especially in user-orientated fields such as rainfall (e.g., Hamill and Colucci 1998) and surface temperature (e.g., Hamill et al. 2004). Various bias correction methods and verification statistics are described in the literature. However, it is important to get a grasp of the practical implementation of a forecast guidance system in a regional setting and to establish the strengths and weaknesses of the EFS in these regions.

The objective of this paper is to introduce some novel medium-range forecast products that have been generated from the NCEP EFS, including bias correction, and assess the success of these in an operational environment at the SAWS.

2. Ensemble forecast system and verification methodology

a. NCEP Ensemble Forecast System

The SAWS has been downloading subsets of the NCEP EFS on a daily basis since 2000. The operational configuration at NCEP used for this study has been in effect since May 2000 and consists of 23 ensemble members per day out to 16 days ahead. At 0000 UTC the suite consists of a high-resolution (T170L42, T254L64 since April 2003) control run up to 7 days, truncated to T62L28 for the remaining 9 days; a low-resolution (T62L28) control run; plus five pairs of perturbed integrations derived from the breeding cycle (Toth and Kalnay 1993, 1997). At 1200 UTC the high-resolution control run extends only to 3.5 days, and there are another five pairs of independently bred perturbations. Together, the 12 members at 0000 UTC and the 11 members at 1200 UTC make up the 23 members per day. Since March 2004 the ensembles have been available four times daily at a T126L28 resolution (more detail on the NCEP EFS is available online at http://wwwt.emc.ncep.noaa.gov/gmb/ens/index.html). Because of bandwidth constraints the SAWS has, up to the writing of this paper, only downloaded the coarser 2.5°-resolution datasets. Although the model output assessed here is at this lower resolution, the model performance has been continuously improving through upgraded model physics, resolution, and data assimilation, and these effects are automatically manifest in the coarser-output sets.

b. Verification methods

Atmospheric variables such as pressure level geopotential heights and sea level pressure are compared against the ensemble control run analysis. These continuous data are verified using root-mean-square error and anomaly correlation coefficient scores. To calculate the anomalies used for the correlation calculations, monthly climatological fields, derived from NCEP–National Center for Atmospheric Research reanalysis data (Kalnay et al. 1996), are converted to daily values by performing a linear interpolation between the two nearest monthly values, and are subtracted from the forecast fields.
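
The scores described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function names, the mid-month day numbers used as interpolation nodes, and the unweighted (no latitude weighting), uncentered form of the anomaly correlation are all assumptions made here for the example.

```python
import numpy as np

# Assumed mid-month day numbers for a 365-day year (not given in the paper).
MID_MONTH_DAYS = np.array([16.0, 45.5, 75.0, 105.5, 136.0, 166.5,
                           197.0, 228.0, 258.5, 289.0, 319.5, 350.0])

def daily_climatology(monthly_clim, day_of_year):
    """Linearly interpolate 12 monthly-mean fields, shape (12, ...), to one day."""
    monthly_clim = np.asarray(monthly_clim, dtype=float)
    # Pad with December before January and January after December so that
    # the interpolation wraps around the turn of the year.
    days = np.concatenate(([MID_MONTH_DAYS[-1] - 365.0],
                           MID_MONTH_DAYS,
                           [MID_MONTH_DAYS[0] + 365.0]))
    fields = np.concatenate((monthly_clim[-1:], monthly_clim, monthly_clim[:1]), axis=0)
    upper = np.searchsorted(days, day_of_year, side="right")
    lower = upper - 1
    weight = (day_of_year - days[lower]) / (days[upper] - days[lower])
    return (1.0 - weight) * fields[lower] + weight * fields[upper]

def rmse(forecast, analysis):
    """Root-mean-square error over all grid points (no area weighting)."""
    return float(np.sqrt(np.mean((forecast - analysis) ** 2)))

def anomaly_correlation(forecast, analysis, climatology):
    """Uncentered correlation between forecast and analysis anomalies."""
    f_anom = forecast - climatology
    a_anom = analysis - climatology
    return float(np.sum(f_anom * a_anom)
                 / np.sqrt(np.sum(f_anom ** 2) * np.sum(a_anom ** 2)))
```

In practice the climatology passed to `anomaly_correlation` would be the output of `daily_climatology` for the valid date of the forecast.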

Surface variables such as rainfall and maximum–minimum temperatures are verified against station data from the SAWS database. There are 96 temperature stations across South Africa and between 1500 and 2000 rainfall stations (Fig. 1). At the 2.5° EFS forecast resolution, roughly 30–200 rainfall stations fall into each model box, with the lower values corresponding to the sparsely populated arid areas in the western interior of South Africa. This verification is done using two approaches. In the first, model forecast values of temperature and rainfall probabilities are interpolated to the 96 forecast station locations; in the second (used mainly in rainfall verification), a value is constructed for each model box by averaging the stations within the box. The EFS is designed for box-average probabilistic quantitative precipitation forecasts (PQPFs) and not point values, and is therefore not expected to perform well at individual points. A further caveat to note here is that convective-type rainfall in the summer rainfall areas of South Africa has a particularly high spatial variability and station measurements are sometimes not representative of the area (in this case the 2.5° model grid box). This also highlights the reverse problem of deriving a point forecast of rainfall from model grid boxes and will be addressed in future research. In a study over Australia (which has similar conditions to South Africa), Ebert et al. (2003) indicate that the differences between forecast and observed rainfall fields are much greater than errors in verification data (from representativeness), which suggests that we may proceed to verify PQPFs with caution. This verification is done using the equitable threat score (ETS; Ebert 2001), which measures the fraction of observed and/or forecast events that were correctly predicted while adjusting the score for hits associated with random chance.
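
As an illustration of the categorical scores mentioned above, the following sketch computes the ETS and the frequency bias score from paired forecast and observed rainfall amounts using the standard 2 × 2 contingency-table definitions. The function name `contingency_scores` and the use of a "greater than or equal to" threshold test are assumptions of this example, not details taken from the study.

```python
import numpy as np

def contingency_scores(forecast, observed, threshold):
    """Return (ETS, bias score) for the event 'amount >= threshold'."""
    fc = np.asarray(forecast, dtype=float) >= threshold
    ob = np.asarray(observed, dtype=float) >= threshold
    hits = np.sum(fc & ob)
    false_alarms = np.sum(fc & ~ob)
    misses = np.sum(~fc & ob)
    total = fc.size
    # Hits expected by chance, given the forecast and observed event counts.
    hits_random = (hits + false_alarms) * (hits + misses) / total
    denom = hits + false_alarms + misses - hits_random
    ets = (hits - hits_random) / denom if denom > 0 else float("nan")
    n_obs = hits + misses
    bias = (hits + false_alarms) / n_obs if n_obs > 0 else float("nan")
    return float(ets), float(bias)
```

An ETS of 1 indicates a perfect forecast and 0 indicates no skill beyond chance; a bias score above (below) 1 indicates that the event is forecast more (less) often than it is observed.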

The skill of probabilistic forecasts is verified using the Brier skill score (BSS; Brier 1950; Wilks 1995). This skill score measures the squared probability error relative to climatology. Murphy (1973) decomposed the Brier score into reliability (agreement between forecast probability and observed frequency), resolution (ability of the forecast probabilities to distinguish between events and nonevents), and uncertainty (which depends only on climatology and has a maximum value of 0.25 when the observed frequency is 50%). A further measure of the ability of forecasts to distinguish between events and nonevents is the relative operating characteristic (ROC; Mason 1982). The BSS assesses performance stratified by the forecast probabilities, whereas the ROC assesses performance stratified by the observations. The ability of the spread of the EFS to represent the variability of the real atmosphere is measured using the rank histogram (Talagrand et al. 1997; Hamill 2001). This measure is particularly useful in determining whether the EFS has errors in its mean and spread. Again, as pointed out by Hamill (2001), errors in observations (mostly through representativeness) may introduce false signals in this histogram. More on this will follow in the discussion.
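
The Murphy (1973) partition can be made concrete with a short sketch. It assumes that forecast probabilities take a limited set of discrete values (as they do when derived from a fixed number of ensemble members), so that cases can be grouped by identical probability and the identity Brier score = reliability − resolution + uncertainty holds exactly. The function name and the returned dictionary keys are hypothetical.

```python
import numpy as np

def brier_decomposition(prob, outcome):
    """Brier score, its Murphy (1973) components, and the BSS vs. climatology.

    prob: forecast probabilities in [0, 1]; outcome: 1 if the event occurred, else 0.
    """
    prob = np.asarray(prob, dtype=float).ravel()
    outcome = np.asarray(outcome, dtype=float).ravel()
    n = prob.size
    base_rate = outcome.mean()
    # Group cases that share the same forecast probability.
    values, inverse = np.unique(prob, return_inverse=True)
    reliability = 0.0
    resolution = 0.0
    for k, p_k in enumerate(values):
        in_group = inverse == k
        n_k = in_group.sum()
        o_k = outcome[in_group].mean()  # observed frequency for this probability
        reliability += n_k * (p_k - o_k) ** 2
        resolution += n_k * (o_k - base_rate) ** 2
    reliability /= n
    resolution /= n
    uncertainty = base_rate * (1.0 - base_rate)
    brier = float(np.mean((prob - outcome) ** 2))
    bss = 1.0 - brier / uncertainty if uncertainty > 0 else float("nan")
    return {"brier": brier, "reliability": float(reliability),
            "resolution": float(resolution), "uncertainty": float(uncertainty),
            "bss": float(bss)}
```

Because the reference forecast here is the sample climatology, the BSS equals (resolution − reliability) / uncertainty, which makes explicit that skill is gained through resolution and lost through poor reliability.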

c. Forecast products

The products from the EFS can be divided into two groups. The first is the set of products provided to the forecasters that is used as guidance in compiling their medium-range forecasts. The second group consists of computer-generated products that are disseminated to the public and/or specialized users. These are each described separately below.

1) Forecaster guidance

Part of the success in introducing the EFS to the forecast offices in the SAWS has been the development of a user-friendly forecast display system. This is based on HyperText Markup Language (HTML) Web pages and consists of a home page in the form of a table with the vertical axis representing the forecast days from 1 to 14. The horizontal axis contains different forecast parameters. Each panel consists of a thumbnail-size spatial map that provides a quick preview of the expected weather in the medium range. The maps include a date with forecast day, PQPF for 1 and 8.5 mm over the 24-h period, probability of 24-h maximum and minimum temperature change (from the previous day) exceeding +2 and −2°C, contours of pressure level geopotential heights of the high-resolution control run overlying a shaded field that indicates the expected forecast uncertainty (based on ensemble spread), and the probability of the 850–500-hPa thickness falling below 4200 and 4100 gpm. Temperature changes over 24 h are preferred to actual values because surface air temperature varies significantly from one location to another (often within a model grid box) based on altitude, terrain, and proximity to bodies of water. Each thumbnail image in the Web page table can be expanded to full size through a mouse click and navigation around the expanded images in the table is done using arrow links. The expanded pages also provide links to additional detail such as additional thresholds, bias-corrected fields, and spaghetti diagrams. The Web page includes a built-in automatic update instruction to ensure that forecasters are always viewing the latest forecast.
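
The probability fields listed above (PQPF thresholds, 24-h temperature change, and 850–500-hPa thickness) can all be derived by counting ensemble members. The sketch below assumes that every member is weighted equally, which the paper does not state explicitly; the function name `exceedance_probability` is hypothetical.

```python
import numpy as np

def exceedance_probability(members, threshold, below=False):
    """Fraction of ensemble members beyond a threshold at each grid point.

    members: array of shape (n_members, ...) holding one field per member.
    below=False gives P(value > threshold); below=True gives P(value < threshold).
    Equal weighting of members is an assumption of this sketch.
    """
    members = np.asarray(members, dtype=float)
    event = members < threshold if below else members > threshold
    return event.mean(axis=0)

# Example: 24-h rainfall (mm) from four members at two grid points.
rain = np.array([[0.0, 10.0], [2.0, 9.0], [0.5, 0.0], [3.0, 12.0]])
pqpf_1mm = exceedance_probability(rain, 1.0)

# Example: 850-500-hPa thickness (gpm) from four members at one grid point.
z500 = np.array([[5650.0], [5700.0], [5660.0], [5760.0]])
z850 = np.array([[1500.0], [1450.0], [1470.0], [1460.0]])
p_thickness = exceedance_probability(z500 - z850, 4200.0, below=True)
```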

2) Public forecast products

While forecasters have a good understanding of the use of numerical weather prediction models, the public can easily misinterpret such computer-generated information. Therefore, the products issued to the public need to be carefully designed, adequately documented, and limited to a few understandable parameters (Ryan 2003). The SAWS issues deterministic station-specific forecasts for the first 7 days, as has been the practice for many years before the introduction of the EFS, although these do include a probabilistic forecast of precipitation (available online at http://www.weathersa.co.za). For the second week, computer-generated products of the probabilities of maximum and minimum temperature categories and wind roses (Fig. 2) are issued for the 96 temperature-forecast stations shown in Fig. 1. These graphics include climatological distributions of the forecast parameter to indicate the expected departure from the mean. The advantage of these products is that they make full use of all of the ensemble information. Furthermore, the products span a week, so the impacts of errors in the individual ensemble member forecasts, regarding the timing of specific events during the second week, are reduced.

3. Verification of medium-range forecast products

a. Continuous variables

This section focuses predominantly on the 0000 UTC 500-hPa height forecasts for the period January 2001–December 2004 and over a domain covering the South Atlantic Ocean, southern Africa, and the southwestern Indian Ocean. Results for sea level pressure concur on the whole with the 500-hPa heights and are thus not repeated in the results. The ensemble average (10 perturbation members) has a lower RMSE than the low-resolution (high resolution) control run after 5 (6) days (Fig. 3). However, the ensemble-average RMSE exceeds the RMSE obtained by using climatology (the limit of skill) after day 8, only 1 day after the control runs reach this threshold. This shows that the benefit of the ensemble technique over the control run to make skillful forecasts of the instantaneous 500-hPa fields is realized on average for only 1 day in this region. Notwithstanding, when taking the best ensemble member at each forecast case (identified a posteriori), the RMSE is smaller than that of climatology up to day 12. Bourke et al. (2004) also note the advantage of using the Bureau of Meteorology EFS, with a 33% relative gain of the best ensemble member to the high-resolution control over Australia at day 5. A tally of the ensemble members shows that the high-resolution control run is the most accurate more often than any of the other ensemble members for the first 3 days of the forecast. The low-resolution control is most accurate about half as often as the high-resolution control, illustrating the level of improvement in skill that can be realized by increasing the model resolution. From forecast day 4, one of the perturbed ensemble members is usually the most accurate. By the end of the forecast period (day 16), any ensemble member has an equal chance of being the best. In essence, the ensemble technique is successful in producing at least one forecast member that can be used to make a skillful forecast for this region up to 12 days ahead. This suggests that the breeding method is able to capture the uncertainty in the initial conditions such that the probability distribution of the ensemble forecast envelope does cover the observed events out to this lead time.
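
The a posteriori best-member comparison and the tally of most-accurate members can be illustrated as follows. This is a sketch under stated assumptions: the array layout, the function name `best_member_summary`, and the choice to average per-case RMSE values (rather than pool squared errors over all cases) are not taken from the paper.

```python
import numpy as np

def best_member_summary(forecasts, analyses):
    """Compare the a posteriori best member with the ensemble mean.

    forecasts: (n_cases, n_members, n_points) fields at one lead time.
    analyses:  (n_cases, n_points) verifying analyses.
    Returns (mean RMSE of best member, mean RMSE of ensemble mean, tally),
    where tally[m] counts how often member m had the lowest RMSE.
    """
    forecasts = np.asarray(forecasts, dtype=float)
    analyses = np.asarray(analyses, dtype=float)
    errors = forecasts - analyses[:, None, :]
    member_rmse = np.sqrt(np.mean(errors ** 2, axis=2))  # (n_cases, n_members)
    best = np.argmin(member_rmse, axis=1)                # best member per case
    best_rmse = member_rmse[np.arange(best.size), best].mean()
    ens_mean = forecasts.mean(axis=1)
    mean_rmse = np.sqrt(np.mean((ens_mean - analyses) ** 2, axis=1)).mean()
    tally = np.bincount(best, minlength=forecasts.shape[1])
    return float(best_rmse), float(mean_rmse), tally
```

Because the best member is selected with knowledge of the verifying analysis, its score is an upper bound on what the ensemble envelope contains rather than a forecast that could be issued in real time.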

As already mentioned, upgrades to the Global Forecast System (GFS) model at NCEP have led to improvements in the model performance. This is evident in the ensemble-average, best-member, and control runs of the EFS. Generally, the skill of the best member has improved by almost 2 days to 14 days and the bias has been reduced by 60% when comparing the scores for 2001 against those of 2004 (not shown). It is also worth noting that the ensemble mean is slightly worse than the control run of the same resolution, for the first few days (also noted in Szunyogh and Toth 2002). This is because the ensemble suite is generated using the breeding method (Toth and Kalnay 1993, 1997), which adds perturbations to the control analysis, in the form of stochastic noise, with the aim of reducing model systematic error in the ensuing forecasts. The uncertainty estimate mask that determines the size of these perturbations has only recently been upgraded at NCEP and was known to produce inflated perturbations for the Southern Hemisphere. This would partially explain the problem with the ensemble mean. Additionally, the perturbed ensemble-member forecasts would be initially disadvantaged relative to the control forecasts if verified against the control analysis. However, the ensemble-generating technique is designed to capture the uncertainty in the initial conditions and hopefully provide a more useful forecast probability distribution. The benefit of this becomes clearer later in this paper, particularly with the 850–500-hPa-thickness probability forecasts.

The anomaly correlation coefficient (ACC) scores are similar to the RMSE scores (Fig. 4). One notable difference is that over the smaller domain (37.5°–17.5°S and 10°–40°E) the ensemble-average ACC scores remain poorer than the high-resolution control throughout the forecast period, whereas the ensemble average beats the high-resolution control from day 6 in the large domain. Furthermore, the score of the best ensemble member at each case is higher over the small domain than over the large domain. These scores indicate that locally the correct type of weather system is being simulated by (at least) some ensemble members, and the poorer score for the ensemble average could be caused by the smoothing effect of combining forecasts of the same weather system, but located at different positions.

A useful forecast system must be able to capture as much of the range of natural variability as possible. One way to measure this is to plot the standard deviation of the ensemble-member fields from their own time-averaged value for the full verification period at each forecast lead time (Fig. 5). During the first week of the forecast the variance decreases, but then begins to increase during the second week. The control runs have a lower variance than that observed, with the low-resolution control considerably lower than the high-resolution control. This is consistent with the constrained atmospheric variability, in this region in particular, caused by the finite resolution of the models (Stratton 1999; Tennant 2003). The perturbation breeding method introduces additional variability and results in the first 3 days of the perturbed ensemble members having a higher variance than observed. The ensemble average obviously underestimates the variance significantly, making this field unsuitable for event forecasting in this region despite its apparent advantage over individual ensemble members in terms of RMSE and ACC between forecast days 5 and 8.
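
A small sketch shows why averaging members suppresses variability. It assumes the measure is the root-mean-square departure of each grid point from its own time mean, pooled over time and space; the exact averaging used for Fig. 5 is not specified in the text, and the function name `temporal_std` is hypothetical.

```python
import numpy as np

def temporal_std(fields):
    """Standard deviation about the time mean, pooled over time and grid points.

    fields: (n_times, n_points) forecasts at one lead time for one member
    (or for the ensemble average).
    """
    fields = np.asarray(fields, dtype=float)
    anomalies = fields - fields.mean(axis=0)  # remove the time mean per point
    return float(np.sqrt(np.mean(anomalies ** 2)))

# Two members that place the same anomaly at opposite times: each has unit
# variability, but their average has none.
member_a = np.array([[1.0], [-1.0]])
member_b = np.array([[-1.0], [1.0]])
std_a = temporal_std(member_a)
std_mean = temporal_std((member_a + member_b) / 2.0)
```

The toy example is deliberately extreme, but the same cancellation occurs whenever members position a weather system differently, which is consistent with the low variance of the ensemble average noted above.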

b. Bias correction

The most basic way to correct model systematic biases is by subtracting the long-term mean error of the forecasts (Richardson 2001). This is usually done independently for each grid point and forecast lead time. However, this omits two other rather important dependencies, namely those related to the season and to the circulation regime. To address this issue, Atger (2003) proposed a spatially and temporally dependent bias correction. To introduce this sort of bias correction in an operational environment, Eckel and Mass (2005) adopted a 14-day running-mean bias calculation. Cui et al. (2005) also discuss these methods. This study follows such a bias-correction technique with the following additions. The running mean was tested for 30, 14, and 7 days, and the 14-day running mean was found to be optimal. Bias correction was done independently for each control ensemble member but as a group for the perturbed ensemble members. This was necessitated by the different resolutions (and hence biases) of the control runs (Szunyogh and Toth 2002) and possible differences introduced by the initial perturbations on the perturbed members. There is no reason to expect the bias of any particular perturbed ensemble member to be different from the others, so the same average bias of all the perturbed members was subtracted from each member.

The bias-correction procedure was performed as follows. Starting on 1 January 2001, a bias for each forecast lead time for forecasts valid for the 14-day window period (ending 31 December 2000) was calculated. This was done by calculating the mean difference between the forecasts and the observed fields multiplied by a factor alpha (estimated empirically at 0.33; i.e., 33% of the forecast error could be attributed to the forecast bias). All the forecasts valid over the window period were corrected by subtracting the bias factor. The process was then repeated for 2 January 2001 (with the 14-day window extending from 17 December 2000 to 1 January 2001) and each day in turn until 1 January 2005. During this process each particular forecast case was bias corrected 14 times as the window moved forward in time. This iterative process provided more stability to the bias-correction process. The latest forecast (simulating an operational environment) was bias corrected using the mean difference between the standard forecasts and the latest set of iteratively bias-corrected forecasts over the 14-day window. In this way the most up-to-date bias information was used to correct the current forecast. These forecasts (with only one bias-correction step) were saved separately and used in the verification process. The main advantage of this bias-correction method is that the bias can be calculated from a relatively short period and thus be implemented easily in an operational environment.
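
The core of the running-window correction can be sketched as below. This is a simplified, single-pass illustration and not the iterative scheme described above, in which each forecast case is corrected 14 times as the window advances. The function names, the array layouts, and the pooling of all perturbed members into one group bias (following the description in the preceding paragraph) are assumptions of this example.

```python
import numpy as np

def window_bias(past_forecasts, past_analyses, alpha=0.33):
    """Scaled mean error over the window for one run at one lead time.

    past_forecasts, past_analyses: (n_days, n_points), already-verified pairs
    from the most recent window (14 days in the paper).
    """
    error = np.asarray(past_forecasts, dtype=float) - np.asarray(past_analyses, dtype=float)
    return alpha * error.mean(axis=0)

def pooled_member_bias(past_member_forecasts, past_analyses, alpha=0.33):
    """One shared bias for all perturbed members, averaged over days and members.

    past_member_forecasts: (n_days, n_members, n_points).
    """
    members = np.asarray(past_member_forecasts, dtype=float)
    analyses = np.asarray(past_analyses, dtype=float)
    error = members - analyses[:, None, :]
    return alpha * error.mean(axis=(0, 1))

def apply_bias(forecast, bias):
    """Subtract the estimated bias from the latest forecast field(s)."""
    return np.asarray(forecast, dtype=float) - bias
```

With `alpha=0.33`, a single pass removes only about a third of the window-mean error; the repeated passes of the iterative procedure are what bring the correction closer to the full systematic error.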

Over the southern Africa domain, the GFS model exhibits an increasing area-average negative bias with forecast lead time (Fig. 6). The magnitude of the bias is dependent on model resolution, with a larger bias associated with the lower resolution. As expected from results of other studies (Atger 2003; Cui et al. 2005), the bias-correction method is successful in reducing the magnitude of the bias considerably throughout the forecast period. Talagrand diagrams also confirm this bias reduction with a more even distribution (not shown). The forecast bias cannot be totally removed using these sorts of correction methods in a real-time forecasting sense, as it is not possible to fully anticipate the systematic-error component of future forecast errors based on past errors. Furthermore, efficient bias correction does not necessarily lead to an improvement in forecast skill, but the method discussed here does also improve skill somewhat (Fig. 4). This improvement is also evident in the increase in variability after bias correction (Fig. 5). This probably occurs when the adaptive bias correction shifts the model forecast away from the model climate (with constrained variability) toward reality with more variability.

It is interesting that the ensemble perturbation members have a smaller bias than the low-resolution control run at forecast days 2 and 3 (shown by the ensemble-average curve in Fig. 6). The only differences between these runs are the perturbations added to the initial conditions, suggesting that this bias is influenced by the initial conditions. Buizza et al. (2005) state that a successful EFS should capture the effect of both initial condition and model uncertainties on forecast errors. To investigate this, the spatial variation of the difference between the perturbed ensemble members and the control run is shown in Fig. 7. Here, we see that the ensemble-breeding method has an overall spatial pattern where the perturbed ensemble members (on average at time zero) tend to have higher values over the land areas and South Atlantic anticyclone, and lower values in the midlatitude westerlies, for both the 500-hPa height and sea level pressure fields relative to the control analysis. These are clearly very small values but certainly spatially coherent. Although the cause of this is not clear, some further investigation revealed that these patterns seem to be related to the interpolation of the high-resolution analysis (from NCEP’s Global Data Assimilation System) to create the analysis for the lower model resolution of the ensemble set. This lower-resolution analysis is used for the breeding of the ensemble perturbations. After 24 h, these same patterns amplify to amounts that neatly offset the traditional model bias in the region of a negative bias in the subtropics and a positive bias in the midlatitudes of the region. Such a pattern where the Southern Hemisphere jet stream is displaced toward the equator by many general circulation models is familiar in the region (Tennant 2003). Furthermore, the bias-correction method, although successful in reducing the bias spatially, does leave some of the spatial pattern of the bias behind (Fig. 8). These results highlight the need for regime-dependent bias correction, since model performance can be linked to correctly simulating circulation regimes (Chessa and Lalaurette 2001).

c. Probability of precipitation and temperature change forecasts

A clear bias in the EFS is evident over southern Africa in terms of inflated quantitative precipitation probabilities. The root of this problem appears to be that the NCEP GFS model overestimates rainfall amounts, climatologically speaking, by up to 300% over the summer rainfall areas of South Africa, especially along the eastern escarpment at 30°E (Fig. 9). Surprisingly, there is little improvement in the high-resolution forecast over the low-resolution forecast, as usually such biases tend to be related to model resolution. Of further note is that this bias becomes greater for rainfall amounts greater than 5 mm, especially for the longer forecast lead times (Fig. 10). The bias score used here, calculated as the number of forecast cases divided by the number of observed cases, is thus very large (up to 50), especially given the small number of observed cases of >20 mm over the 2.5° × 2.5° model grid box. This would partially explain why the high-resolution model has such a strong bias, because large rainfall amounts tend to be more easily generated by higher-resolution models (Mullen and Buizza 2002). Furthermore, forecast lead time appears to have little bearing on the magnitude of the bias (except for high-resolution control at 5 days), suggesting a problem with the model physics (possibly the precipitation parameterization schemes) over this region. In contrast to the summer rainfall areas, rainfall is underestimated in the winter rainfall region of the Western Cape Province (Figs. 9 and 11). In this region we have large-scale rain-bearing frontal systems that are enhanced by local topography, a feature probably not adequately captured by the current resolution of the NCEP global model.

Talagrand diagrams (Hamill 2001) confirm the findings above (Fig. 12). The summer rainfall region has a clear bias of forecasting too much rain too often (left-skewed diagram), where the observation is less than the driest ensemble member a third of the time. The winter rainfall region on the other hand has a U-shaped diagram indicating a lack of variability in this region, although the diagram does exhibit a slight left-sided bias. Again, these patterns do not change significantly with forecast lead time.
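For readers unfamiliar with the construction of a Talagrand diagram (rank histogram), the following minimal sketch counts the rank of each observation within its sorted ensemble. The function name and the tie-handling rule are illustrative assumptions; for rainfall, where many members and the observation may all be zero, randomized tie-breaking is commonly used instead.

```python
def rank_histogram(ensembles, observations):
    """Count how often the observation falls in each rank of the ensemble.

    Bin 0 means the observation lies below every member (e.g., drier than the
    driest member); the last bin means it lies above every member. Ties are
    resolved by ranking the observation below the tied member, which is a
    simplification.
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)  # members strictly below obs
        counts[rank] += 1
    return counts
```

With this convention, a tendency to forecast too much rain shows up as an excess of cases in the lowest bin, whereas a lack of ensemble variability populates both end bins and gives the U shape.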

Quantitative precipitation forecast (QPF) fields were calibrated by adjusting the event threshold (i.e., the precipitation forecast amount defining a “yes” or “no” forecast event), so that the forecast frequency matched the observed frequency for the verification period (July 2000–June 2005). For this study, a cross-validation technique was used where each year was withheld from the calculations while the other 4 yr were used to calculate the forecast frequencies. Roebber et al. (2004) suggest that an increase in model resolution, postprocessing of model data, and combining high-resolution with ensemble techniques are practical ways to improve rainfall prediction. These approaches concur with the findings in this study in the following way.

Over the summer rainfall area, ETSs of the calibrated high-resolution control forecasts are improved for light rainfall amounts (Fig. 10). Over the winter rainfall area, the calibration of the high-resolution control was not as successful, except for 5-day forecasts of larger rainfall amounts over the Western Cape Province (Fig. 11). This is probably because the model does not capture the orographic augmentation of rainfall in this region adequately, resulting in a largely systematic bias that can easily be corrected. It is noteworthy that the ETS for highly simplified ensemble probability forecasts (assuming a deterministic yes forecast when the ensemble probability exceeds 50%) generally beats the control run scores for rainfall amounts less than 20 mm, and even more so at the longer lead times (Fig. 10). This demonstrates the utility of the EFS to perform QPFs for light rainfall events. Unfortunately, as found by Legg and Mylne (2004), the calibration of the ensemble QPF, particularly in the summer rainfall region, causes a significant deterioration of the ETS for heavier rainfall events by reducing the probabilities too far, as seen from the negative bias in Fig. 10. Perhaps this requires a different interpretation of the probabilities for the more extreme events, as these would hardly ever exceed 50% in practice. The negative bias here is dependent on a deterministic decision of what probability should be considered to be a yes forecast.
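The scores discussed above can be made concrete with a short sketch. It converts an ensemble into a deterministic yes/no forecast using the 50% rule mentioned in the text and then computes the bias score (forecast cases divided by observed cases) and the ETS from the resulting contingency table. The ETS expression is the standard definition with a random-hits correction; the function names and the equal weighting of ensemble members are assumptions for illustration.

```python
def ensemble_yes(members_mm, threshold_mm, min_probability=0.5):
    """Yes forecast when the fraction of members above the threshold exceeds 50%."""
    probability = sum(1 for m in members_mm if m > threshold_mm) / len(members_mm)
    return probability > min_probability


def bias_and_ets(forecast_yes, observed_yes):
    """Return (bias score, equitable threat score) for paired yes/no events."""
    n = len(forecast_yes)
    pairs = list(zip(forecast_yes, observed_yes))
    hits = sum(1 for f, o in pairs if f and o)
    false_alarms = sum(1 for f, o in pairs if f and not o)
    misses = sum(1 for f, o in pairs if not f and o)
    forecast_cases = hits + false_alarms
    observed_cases = hits + misses
    bias = forecast_cases / observed_cases if observed_cases else float("nan")
    hits_random = forecast_cases * observed_cases / n  # hits expected by chance
    denominator = hits + misses + false_alarms - hits_random
    ets = (hits - hits_random) / denominator if denominator else float("nan")
    return bias, ets
```

Because the yes/no decision depends on the chosen probability cutoff, lowering `min_probability` for heavy-rain thresholds would raise the number of yes forecasts and hence the bias score, which is the sensitivity noted at the end of the paragraph above.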

Equivalent results were obtained from frequency adjustment of the ensemble PQPF values. An increase in the BSS and a marginal increase in the ROC were evident for the first few days of light-rain (<2 mm) forecasts over the summer rainfall area, but for heavier events (>10 mm) the ROC score deteriorated. Over the winter rainfall region, fairly good improvements were made to the BSS and ROC score during the first week. This is attributed to the successful correction of the systematic bias in model-simulated rainfall in this region as mentioned above.
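The threshold-adjustment calibration with cross validation described earlier in this section might be sketched as follows. For each withheld year, the forecast threshold is chosen from the remaining years so that the forecast exceedance frequency matches the observed frequency of the event. This is only one plausible reading of the procedure; the function names, the dictionary-by-year layout, and the use of `>=` to define an event are illustrative assumptions.

```python
def matched_threshold(train_forecasts, train_observations, event_threshold):
    """Forecast amount whose exceedance frequency matches the observed event frequency."""
    n_events = sum(1 for o in train_observations if o >= event_threshold)
    if n_events == 0:
        return float("inf")  # event never observed in training: never forecast it
    ranked = sorted(train_forecasts, reverse=True)
    # Number of yes forecasts needed to reproduce the observed frequency.
    n_yes = round(n_events * len(ranked) / len(train_observations))
    n_yes = max(1, min(n_yes, len(ranked)))
    return ranked[n_yes - 1]


def cross_validated_thresholds(forecasts_by_year, observations_by_year, event_threshold):
    """Leave-one-year-out thresholds: each year is calibrated from the other years."""
    thresholds = {}
    for held_out in forecasts_by_year:
        train_f = [f for year, values in forecasts_by_year.items()
                   if year != held_out for f in values]
        train_o = [o for year, values in observations_by_year.items()
                   if year != held_out for o in values]
        thresholds[held_out] = matched_threshold(train_f, train_o, event_threshold)
    return thresholds
```

Where the model overestimates rainfall, the matched forecast threshold is larger than the observed event threshold, so fewer yes forecasts are issued after calibration.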

Probability forecasts of temperature changes of 2° and 5°C are skillful for week 1 over most of South Africa and for week 2 over parts of the interior (not shown). Coastal temperatures are probably not resolved sufficiently by the model resolution and are less skillful than those over the interior. Maximum temperatures are more skillful than minimum temperatures, pointing again to the lack of resolution of subgrid-scale processes at night.

d. Forecaster guidance—Probability of events

A useful EFS product for event forecasting is the probability of the 850–500-hPa-thickness field falling below 4200 and 4100 gpm. These synoptic situations are good indicators of extreme cold weather and possible snowfall in South Africa, both of which are considered high-impact events in the region. Over the northern parts of the country the thickness fields almost never fall below 4100 gpm, but do fall below 4200 gpm around 10 times per winter season. Given the high altitude of the escarpment (2000–3000 m) and interior plateau (1500 m), surface temperatures are usually below freezing overnight over large areas. In the southern parts of the country, thickness fields below 4200 gpm occur almost half the time (corresponding to a maximum uncertainty value in the Brier score decomposition), but occurrences of values less than 4100 gpm match those of 4200 gpm over the northern parts. The results discussed below for the 4100-gpm events in the Western Cape Province correspond roughly to the 4200-gpm events in the northern regions of South Africa.
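As an illustration of how such an event probability can be derived at a grid point, the sketch below takes the 850–500-hPa thickness of each ensemble member as the difference between its 500- and 850-hPa geopotential heights and reports the fraction of members below the threshold. Equal weighting of the members and the function name are assumptions; the text does not specify how the operational probabilities are computed.

```python
def thickness_probability(z500_members, z850_members, threshold_gpm):
    """Fraction of members whose 850-500-hPa thickness is below the threshold."""
    thickness = [z500 - z850 for z500, z850 in zip(z500_members, z850_members)]
    return sum(1 for t in thickness if t < threshold_gpm) / len(thickness)
```

For example, four members with thicknesses of 4150, 4210, 4240, and 4090 gpm give a probability of 0.5 for the 4200-gpm event and 0.25 for the 4100-gpm event.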

The EFS is able to capture these events in South Africa adequately for the first 7 days of the forecast, as shown by positive BSSs and ROC scores in excess of 0.5 (Figs. 13 and 14). Although reliability and resolution deteriorate during the second week, the forecasts still retain reasonable resolution throughout the forecast period. This suggests that calibration may be able to improve the skill of these forecasts.
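The Brier skill score used here can be sketched in a few lines. The example assumes that the reference forecast is the sample climatological frequency of the event, whose Brier score equals the uncertainty term of the Brier score decomposition; the reference actually used in this study may differ, and the function names are illustrative.

```python
from statistics import fmean


def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    return fmean([(p - o) ** 2 for p, o in zip(probabilities, outcomes)])


def brier_skill_score(probabilities, outcomes):
    """Skill relative to a constant forecast of the sample event frequency."""
    base_rate = fmean(outcomes)
    uncertainty = base_rate * (1.0 - base_rate)  # largest when base_rate is 0.5
    if uncertainty == 0:
        return float("nan")  # event always or never occurred in the sample
    return 1.0 - brier_score(probabilities, outcomes) / uncertainty
```

A positive value indicates that the probability forecasts beat the climatological reference. The uncertainty term is largest for events that occur about half the time, such as thickness values below 4200 gpm over the southern parts of the country.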

The first calibration method tested here is the bias correction described above, where the 850–500-hPa-thickness fields were adjusted at each grid point and new thickness probability fields calculated. This method was successful in improving the BSS, resolution, and reliability considerably during the second week (Fig. 13). Rare events show a significant improvement in the BSSs, but this is not reflected in the ROC scores, which are related to forecast resolution (Fig. 14).

The second calibration method, where the event threshold is adjusted to match the forecast probability with the observed frequency, is more successful in improving both the BSS and ROC score for the 4200-gpm events (Fig. 13). The reliability is much improved (as the calibration is intended to do) and there is also a small increase in resolution during the second week. For the rarer 4100-gpm event, the BSS is again improved but the ROC score is worse than the raw output (Fig. 14). Overall, the first calibration method is more successful as it addresses the bias in the physical patterns more directly. However, neither calibration method does much for the resolution of the more extreme (rarer) events, which is consistent with the findings of Legg and Mylne (2004), where the skill in predicting extreme events often deteriorates after bias correction.

Forecast uncertainty is another useful forecaster guidance tool. This is indicated by shading the ensemble spread (categorized into a strong, medium, or weak signal and no predictability) under the control prognostic 500-hPa height and sea level pressure fields. It is determined using a basic definition of calculating the ensemble standard deviation around the ensemble mean and dividing by the observed field standard deviation for that time of year at each grid point (Scherrer et al. 2004). Values in the ranges 0%–33%, 33%–66%, 66%–100%, and >100% correspond to strong, medium, weak, and no signal, respectively. Periods of strong atmospheric instability may also lead to a large ensemble spread but not necessarily increased forecast uncertainty. Toth et al. (2001) proposed a relative measure of predictability based on the position of the forecast value in terms of climatological distribution to provide a quantitative probability of forecast uncertainty. The intention at the SAWS for now, however, is that forecasters use these uncertainty fields as a qualitative rather than a quantitative measure of forecast confidence. Notwithstanding, further development and refinement of this process is under way at the SAWS.
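The categorization of the normalized spread described above can be sketched for a single grid point as follows. The percentage ranges are those given in the text; the treatment of values falling exactly on a boundary, the use of the population standard deviation about the ensemble mean, and the function name are assumptions.

```python
from statistics import pstdev


def signal_category(members, climatological_std):
    """Classify ensemble spread relative to the observed climatological variability."""
    spread = pstdev(members)  # standard deviation about the ensemble mean
    ratio = 100.0 * spread / climatological_std
    if ratio < 33.0:
        return "strong"
    if ratio < 66.0:
        return "medium"
    if ratio <= 100.0:
        return "weak"
    return "no signal"
```

For instance, 500-hPa heights of 5690, 5700, 5700, and 5710 gpm against a climatological standard deviation of 50 gpm give a ratio of about 14%, that is, a strong signal.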

4. Discussion

The EFS was introduced to the SAWS National Forecast Center (NFC) at the beginning of 2004. Although this forecast guidance system was received apprehensively at first, consistent benefits of using it have been experienced since then, and the EFS is now used widely in the NFC and regional forecast offices. This section relates the scientific aspects covered in section 3 to local forecaster experiences.

The EFS 1–14-day thumbnail Web page display system is particularly useful to a forecaster, as one quickly gains an overview of each of the different parameters, extending from 1 through to 14 days. With a glance, one can rapidly assess the broad trend in one or more parameters: for example, is rain generally increasing (decreasing) over time and is the rain indicated to be “mostly in the west” at the beginning of the sequence—possibly migrating eastward as one advances through the forecast period? Speed of assessment (of a weather pattern, whether the pattern at hand pertains to one playing out in the next few hours or in the next few days) within the forecast office environment is a theme we cannot escape from. Effective time management is critical in an operational forecast setting; so any visualization scheme (such as the EFS HTML pages) that can cut time and allow a forecaster to reach an informed decision is a very welcome operational tool. The product is easily accessible and the autoupdate function ensures one always views the latest data.

Numerical weather prediction models have always been relatively good at anticipating sudden temperature falls (rises) from one day to the next, especially frontally induced cooling of 10°C or more (per 24 h). The ensemble approach sustains this ability but at lead times generally surpassing those of deterministic NWP products. Notwithstanding poor predictability at extreme lead times, a forecaster is at least able to give the public a “heads-up” (heightened state of alertness), with a fair degree of confidence, with respect to possible extreme or inclement weather, at lead times commonly 3–10 days hence (and even 11–14 days in some cases). Such outlooks or weather-related guidance do not pertain only to the curiosity of the general public but also serve commercial and public-safety needs; thus, this type of guidance can be crucial. Among local examples are fire protection agencies, who need to make tactical decisions to hire helicopters, fixed-wing aircraft, and pilots on short-term contracts (days or weeks), ahead of expected breakouts of hot dry berg winds (air cascading from the interior of South Africa down the escarpment toward the coast). A second example is that of small stock farmers, who need to know, at least 2–4 days in advance, of impending cold, wet, and windy conditions, which may or may not include snow concurrently. The ensemble approach greatly assists the forecaster in assessing the likelihood of such events, which have the potential to affect life and/or property, at time scales often exceeding those of regular NWP methods (as alluded to earlier). In addition, the shaded uncertainty fields give the forecaster a quick visual overview of the relative uncertainty of the control run’s predisposition toward a particular feature (cutoff low, etc.) or event (such as a ridging surface high), both in space and time. The spaghetti diagram link can then assist the forecaster by providing more information on the individual ensemble members’ handling of the weather system in question.

The probabilities of the 850–500-hPa-thickness fields dropping below 4200 and 4100 gpm are an additional useful tool in assessing the relative threat from impending snow events. Situational awareness has improved markedly within the NFC environment—nurturing active discussion and debate well ahead of such events—overcoming negative aspects such as “forecaster inertia” where (perhaps because of excessive non-weather-related workload, or other factors) a forecaster may be unaware of an unfolding severe weather pattern and only build up an awareness too late. It is essential (especially for high-impact weather) to raise awareness (among forecasters and farmers alike) with a sufficiently long lead time as to allow appropriate mitigative measures to be taken (such as bringing young, vulnerable animals down, out of the hills, a day or two ahead of the onset of such an extreme change). Naturally, one needs a suitable balance between early warnings versus an inflated alarm rate. Generally, the ensemble products are utilized to identify possible/probable extreme events at lead times approaching (or often exceeding) a week hence. Deterministic high-resolution (regional model) NWP guidance is then utilized, closer to the time, to closely monitor (in space and time) the unfolding weather event and refine the official forecast scenario ahead of the perceived extreme weather event.

Usually, forecasters would only be interested in a single best forecast tool (e.g., the high-resolution control run). However, outlier values and more specifically groups or clusters of outliers are also of interest and value to a forecaster, particularly during the developmental phase of assembling a prognosis. These may well be indicative of an alternative outcome/scenario, from a weather perspective. Experience has shown that the most popular scenario is not necessarily the outcome that is finally realized. A secondary (or even tertiary) clustering of members away from the control run might well be indicative of an alternate, and possibly significant, outcome. Such alternative scenarios could be incorporated into a forecast, or sometimes it might be more appropriate to adhere to the prognosis implied by the control run but to remain cognizant of the herald of a possible deviation from the expected pattern; this would thus imply that the forecaster should closely monitor real-time observations/developments in the area of interest to be aware of a possible deviation or shift toward an alternative scenario and amend/update forecast products accordingly.

A good example of such a case in South Africa that illustrates this point is the 7-day forecast for 1200 UTC 6 November 2005. High-resolution control runs from the European Centre for Medium-Range Weather Forecasts (ECMWF) and NCEP differed drastically as to the position and intensity of a cutoff low event over the Western Cape Province (Fig. 15). Synoptically, these two scenarios are very different and would have a huge impact on QPFs and warnings/advisories. The spaghetti diagrams interestingly showed a similar dichotomy in the prognosis, one cluster similar to that of the ECMWF model and one similar to the NCEP control run. Although the previous NCEP control run (0000 UTC; bold dashed line in Fig. 15) resembled the ECMWF control at 1200 UTC, the perturbed ensemble members around the 0000 and 1200 UTC analyses were spread fairly uniformly across both scenarios, showing that this situation was particularly sensitive to perturbations in the initial conditions at the start of this 7-day forecast. Furthermore, uncertainty fields warned of especially low levels of forecast confidence in this particular event. Consequently, the forecasts were done by adopting a more cautious approach until greater certainty was evident. It turns out that the scenario with the system weaker and displaced to the northeast was closer to the outcome from a forecaster’s point of view. However, the system was indeed intense, with late spring snow (up to 17 cm) over the mountains of the Eastern Cape Province and heavy rainfall along the coast (201 mm in East London over the weekend). Several severe storms with medium-sized hail and damaging winds were reported from many parts of the northeast interior of South Africa as well. The suggestion of an intense system, such as forecast by the 1200 UTC ECMWF run on 30 October and confirmed by a significant number of ensemble members, was sufficient to place forecasters on alert to monitor changes very closely up until the event played out. Without the ensemble support, the ECMWF forecast scenario might well have been ignored, as the following update was vastly different, or premature warnings could have been issued for the wrong areas.

The EFS approach to forecasting in the medium term thus frees the forecaster, to a certain degree, from being excessively constrained by a single official NWP prognosis. The forecaster is thus empowered to explore (to a limited degree) a spectrum of possible outcomes and juxtapositions of systems in a spatiotemporal context, allowing more creativity and flexibility on the part of the forecaster/analyst, but at the same time, retaining the strength and support of solid NWP guidance principles and products. Until the advent of the NCEP EFS at SAWS, the forecaster really had to “go out on a limb” to attempt to visualize possible weather scenarios at lead times of beyond a few days; this in itself was difficult enough. Furthermore, even to attempt variations on those scenarios was nearly impossible. The medium-term ensemble NWP goes a long way to fulfilling this need.

Bias-correction methods discussed in this paper have proved quite successful in improving the EFS forecast skill statistics. There still remain some additional avenues to explore for South Africa, including regime-dependent correction and perhaps ensemble model output statistics (Gneiting et al. 2005). However, forecasters, when armed with EFS guidance, even if not calibrated, can make useful forecasts by using their analytical powers to sort through the model forecasts (Bosart 2003). Still one area of concern is the tendency for calibration to adversely affect forecasts of extreme events (Legg and Mylne 2004; Kharin and Zwiers 2003). Thus, it is advisable to provide forecasters with both the calibrated and noncalibrated EFS guidance. Accurately forecasting high-impact weather is one of the primary responsibilities of the forecasters, and so they need to be able to fully utilize the EFS guidance to fulfill this objective. The success of this lies clearly in training and experience.

5. Summary and conclusions

The NCEP EFS has been successfully implemented as an integral part of the SAWS forecasting service. Consistent benefits of using the EFS have been noted in the forecast offices and this has strengthened the position of this forecasting tool in the operational environment. Foremost among these benefits is an improved hit rate of forecast high-impact weather events up to and beyond a week ahead. False alarms in these forecasts have also been kept in check by the ability of the EFS to provide useful information regarding the uncertainty in the forecast scenarios.

The EFS is particularly useful for generating objective forecaster guidance products and public forecast products. Probability distribution functions can be calculated objectively using EFS data. These can be used directly as automated end-user products or as tailored forecaster guidance to suit local conditions.

Bias correction of atmospheric fields (e.g., 500-hPa geopotential heights) and probability forecasts (e.g., quantitative precipitation forecasts) has proved quite successful at improving forecast reliability. However, the bias-correction methods tested in this paper do not lead to much improvement in the anomaly correlation coefficient scores of 500-hPa heights (spatial pattern skill). Similarly, for the probability forecasts, bias correction corrects only part of the problem and does not address the extreme events. Therefore, in order to capture extreme events, more sophisticated postprocessing methods are needed.

Another issue currently not properly handled by the EFS is station-scale QPFs in the summer rainfall areas of South Africa. Although the EFS does verify better at grid scale than at station scale, the skill of these forecasts is still relatively poor and is generally only useful for the first few days of the forecast. Part of the problem with summer convective rainfall is the poor correlation between station rainfall and the synoptic situation. The suggested approach to this problem is through a combination of high-resolution modeling and MOS-type postprocessing.

The experience of the introduction of an EFS into the operational forecasting environment in South Africa is that forecasters are willing to use EFS provided that the benefit of using lower-resolution EFS instead of a single high-resolution control is properly demonstrated and that there is adequate training to assist in understanding and using this forecast tool. Finally, it is imperative that dynamic interaction between forecasters and researchers takes place in order to facilitate the timely implementation of desired products.

Acknowledgments

The authors thank two anonymous reviewers for their helpful and thorough comments that have resulted in significant improvements to this paper.

REFERENCES

  • Atger, F., 2003: Spatial and interannual variability of the reliability of ensemble-based probabilistic forecasts: Consequences for calibration. Mon. Wea. Rev., 131, 1509–1523.

  • Bosart, L. F., 2003: Whither the weather analysis and forecasting process? Wea. Forecasting, 18, 520–529.

  • Bourke, W., R. Buizza, and M. Naughton, 2004: Performance of the ECMWF and the BoM ensemble prediction systems in the Southern Hemisphere. Mon. Wea. Rev., 132, 2338–2357.

  • Brier, G. W., 1950: Verification of forecasts expressed in terms of probability. Mon. Wea. Rev., 78, 1–3.

  • Buizza, R., P. L. Houtekamer, Z. Toth, G. Pellerin, M. Wei, and Y. Zhu, 2005: A comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems. Mon. Wea. Rev., 133, 1076–1097.

  • Chessa, P. A., and F. Lalaurette, 2001: Verification of the ECMWF ensemble prediction system forecasts: A study of large-scale patterns. Wea. Forecasting, 16, 611–619.

  • Cui, B., Z. Toth, Y. Zhu, D. Hou, and S. Beauregard, 2005: Statistical post-processing of operational and CDC hindcast ensembles. Preprints, 21st Conf. on Weather Analysis and Forecasting/17th Conf. on Numerical Weather Prediction, Washington, DC, Amer. Meteor. Soc., CD-ROM, 12B.2.

  • Ebert, E. E., 2001: Ability of a poor man’s ensemble to predict the probability and distribution of precipitation. Mon. Wea. Rev., 129, 2461–2480.

  • Ebert, E. E., U. Damrath, W. Wergen, and M. E. Baldwin, 2003: The WGNE assessment of short-term quantitative precipitation forecasts. Bull. Amer. Meteor. Soc., 84, 481–492.

  • Eckel, F. A., and C. F. Mass, 2005: Aspects of effective mesoscale, short-range ensemble forecasting. Wea. Forecasting, 20, 328–350.

  • Gneiting, T., A. E. Raftery, A. H. Westveld III, and T. Goldman, 2005: Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Mon. Wea. Rev., 133, 1098–1118.

  • Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Wea. Rev., 129, 550–560.

  • Hamill, T. M., and S. J. Colucci, 1998: Evaluation of Eta–RSM ensemble probabilistic precipitation forecasts. Mon. Wea. Rev., 126, 711–724.

  • Hamill, T. M., J. S. Whitaker, and X. Wei, 2004: Ensemble reforecasting: Improving medium-range forecast skill using retrospective forecasts. Mon. Wea. Rev., 132, 1434–1447.

  • Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. Bull. Amer. Meteor. Soc., 77, 437–471.

  • Kharin, V. V., and F. W. Zwiers, 2003: On the ROC score of probability forecasts. J. Climate, 16, 4145–4150.

  • Legg, T. P., and K. R. Mylne, 2004: Early warnings of severe weather from ensemble forecast information. Wea. Forecasting, 19, 891–906.

  • Leith, C. E., 1974: Theoretical skill of Monte Carlo forecasts. Mon. Wea. Rev., 102, 409–418.

  • Mason, I., 1982: A model for assessment of weather forecasts. Aust. Meteor. Mag., 30, 291–303.

  • Mullen, S. L., and R. Buizza, 2002: The impact of horizontal resolution and ensemble size on probabilistic forecasts of precipitation by the ECMWF Ensemble Prediction System. Wea. Forecasting, 17, 173–191.

  • Murphy, A. H., 1973: A new vector partition of the probability score. J. Appl. Meteor., 12, 595–600.

  • Richardson, D. S., 2001: Measures of skill and value of ensemble prediction systems, their interrelationship and the effect of ensemble size. Quart. J. Roy. Meteor. Soc., 127, 2473–2489.

  • Roebber, P. J., D. M. Schultz, B. A. Colle, and D. J. Stensrud, 2004: Toward improved prediction: High-resolution and ensemble modeling systems in operations. Wea. Forecasting, 19, 936–949.

  • Ryan, R. T., 2003: Digital forecasts—Communication, public understanding, and decision making. Bull. Amer. Meteor. Soc., 84, 1001–1003.

  • Scherrer, S. C., C. Appenzeller, P. Eckert, and D. Cattani, 2004: Analysis of the spread–skill relations using the ECMWF Ensemble Prediction System over Europe. Wea. Forecasting, 19, 552–565.

  • Stratton, R. A., 1999: A high resolution AMIP integration using the Hadley Centre model HadAM2b. Climate Dyn., 15, 9–28.

  • Szunyogh, I., and Z. Toth, 2002: The effect of increased horizontal resolution on the NCEP global ensemble mean forecasts. Mon. Wea. Rev., 130, 1125–1143.

  • Talagrand, O., R. Vautard, and B. Strauss, 1997: Evaluation of probabilistic prediction systems. Proc. Workshop on Predictability, Shinfield Park, Reading, United Kingdom, ECMWF, 1–26.

  • Tennant, W. J., 2003: An assessment of intraseasonal variability from 13-yr GCM simulations. Mon. Wea. Rev., 131, 1975–1991.

  • Toth, Z., and E. Kalnay, 1993: Ensemble forecasting at NMC: The generation of perturbations. Bull. Amer. Meteor. Soc., 74, 2317–2330.

  • Toth, Z., and E. Kalnay, 1997: Ensemble forecasting at NCEP and the breeding method. Mon. Wea. Rev., 125, 3297–3319.

  • Toth, Z., Y. Zhu, and T. Marchok, 2001: The use of ensembles to identify forecasts with small and large uncertainty. Wea. Forecasting, 16, 436–477.

  • Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences: An Introduction. Academic Press, 467 pp.

Fig. 1.

Location of temperature (large dots) and rainfall (small dots) stations in South Africa, in relation to the NCEP EFS model grid boxes.

Citation: Weather and Forecasting 22, 1; 10.1175/WAF979.1

Fig. 2.

Examples of week-2 temperature and wind probabilistic forecasts generated directly from the NCEP EFS data. Forecasts of warmer than average maximum temperatures and normal minimum temperatures for Johannesburg, and a higher incidence of easterly winds with overall lower wind speeds relative to climate for East London, are indicated in these products.

Fig. 3.

RMSE of 0000 UTC 500-hPa geopotential height forecasts against forecast lead time averaged over the period Jan 2001–Dec 2004 for the domain 0°–60°S and 30°W–60°E. Graphs for high-resolution control, low-resolution control, best ensemble member for each forecast case, ensemble mean, climatology, and the ensemble spread are defined in the legend.

Fig. 4.

ACC of 500-hPa geopotential height forecasts against forecast lead time averaged over the period Jan 2001–Dec 2004 for the (left) 60°S, 30°W–10°S, 60°E and (right) 37.5°S, 10°E–17.5°S, 40°E domains. Graphs for high-resolution control, ensemble average, and best ensemble member for each forecast case are defined in the legend. Bias-corrected scores are indicated by BC.

Fig. 5.

Standard deviation of 0000 UTC 500-hPa geopotential height forecast fields with the time mean removed for the period Jan 2001–Dec 2004, showing high-resolution control, low-resolution control, ensemble average, and mean of the perturbed ensemble members indicated in the legend. Bias-corrected scores are indicated by BC.

Fig. 6.

Average bias of 0000 UTC 500-hPa geopotential height forecast fields for the period Jan 2001–Dec 2004, showing high-resolution control, low-resolution control, and ensemble average before and after bias correction (indicated by BC in the legend).

Fig. 7.

Spatial difference of (left) the average of the perturbed ensemble members minus the low-resolution control analysis and (right) 24-h forecast for (top) 500-hPa geopotential height and (bottom) sea level pressure. Negative contours are stippled.

Fig. 8.

Spatial bias of 500-hPa height forecasts (left) before and (right) after bias correction for (top) week 1 (forecast days 1–7) and (bottom) week 2 (forecast days 8–14) lead times. Negative contours are stippled. The contour interval is 2 gpm for the uncorrected maps and 0.5 gpm for the bias-corrected maps.

Fig. 9.

Annual average rainfall for the period Jul 2000–Jun 2005 for (top left) observed gridded rainfall, (top right) ensemble average 5-day forecast rainfall as a percentage of observed, (bottom left) high-resolution 5-day control forecast, and (bottom right) low-resolution 5-day control forecast.

Fig. 10.

Summer rainfall area (27.5°S, 30°E) (top) bias score and (bottom) equitable threat score as a function of rain threshold for forecast lead times of (left) 5, (middle) 10, and (right) 15 days for high- and low-resolution controls, ensemble probability (>50% taken as a categorical yes forecast), and frequency-adjusted high-resolution control and ensemble probabilities.

Fig. 11.

As in Fig. 10 but for the winter rainfall area (35°S, 17.5°E).

Fig. 12.

Talagrand diagram for NCEP 5-day forecasts (all ensemble members) for the (left) summer and (right) winter rainfall areas of South Africa for the period Jul 2000–Jun 2005. The gray line shows a perfect distribution.

Fig. 13.

BSS decomposition and ROC of forecasts of the probability of the 850–500-hPa-thickness field dropping below 4200 gpm at grid point 35°S, 17.5°E for the May–Sep months of 2004–05. Solid lines represent standard output, dashed line the calibrated 850- and 500-hPa height fields, and dotted line the frequency-adjusted probabilities.

Fig. 14.

As in Fig. 13 but for 4100-gpm thickness.

Fig. 15.

The 7-day 500-hPa geopotential height forecast at 1200 UTC 6 Nov 2005 issued by (top left) ECMWF high-resolution control, (top right) NCEP GFS high-resolution control, (bottom left) spaghetti diagram of 5700-gpm contour of NCEP 23-member ensemble suite initialized at 0000 and 1200 UTC 31 Oct 2005, and (bottom right) NCEP GFS analysis for 1200 UTC 6 Nov 2005. Solid (dashed) boldface line denotes the 0000 UTC (1200 UTC) high-resolution control forecasts on 31 Oct 2005.
