Reply to Comments by Wilson and by Juras

R. Buizza, A. Hollingsworth, F. Lalaurette, and A. Ghelli

European Centre for Medium-Range Weather Forecasts, Reading, United Kingdom

Corresponding author address: Dr. Roberto Buizza, European Centre for Medium-Range Weather Forecasts, Shinfield Park, Reading RG2 9AX, United Kingdom.

1. Answers to comments by Wilson

Wilson’s (2000) comments concerned the way the relative operating characteristic (ROC) curve was computed and the interpretation of the area under a ROC curve in terms of forecast skill.

a. On the computation of the ROC

Wilson’s comments on how the ROC curve should be computed are most welcome, and will lead to a revision of our computation methodology.

b. On the impact of the EPS system upgrade

We agree that the impact of the Ensemble Prediction System (EPS) upgrade of December 1996 on the area under the ROC curve is less evident if Wilson’s methodology is followed. Table 1 is the equivalent of Table 3 of Buizza et al. (1999b), but it is based on Brier skill scores instead of ROC areas. The results support our general conclusion that the system upgrade of December 1996 improved the EPS performance.
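As an illustration of the quantity summarized in Table 1, the following minimal sketch computes a Brier skill score and finds the first forecast day at which it is no longer positive. It is not the authors' verification code: the use of the sample climatology as the reference forecast, the function names, and the day-by-day (non-interpolated) search are all assumptions made for this example.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes):
    """BSS = 1 - BS / BS_ref, with the sample climatology as reference (assumption).

    Undefined (division by zero) if the event always or never occurs in the sample.
    """
    base_rate = sum(outcomes) / len(outcomes)
    bs_ref = brier_score([base_rate] * len(outcomes), outcomes)
    return 1.0 - brier_score(probs, outcomes) / bs_ref


def first_nonpositive_day(bss_by_day):
    """Return the first forecast day whose BSS is <= 0, or None if there is none."""
    for day in sorted(bss_by_day):
        if bss_by_day[day] <= 0.0:
            return day
    return None
```

For example, `first_nonpositive_day({1: 0.4, 2: 0.2, 3: -0.1})` returns `3`. A forecast that always issues the sample climatological frequency has a Brier skill score of zero under this definition.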

c. On the impact of atmospheric variability between different seasons

The fact that atmospheric variability may have affected the results is clearly stated in the text where results referring to different seasons are compared (sections 5 and 6), and in the conclusions. The reader is referred to Buizza et al. (1998) for a “cleaner” comparison of different ensemble systems on the same set of 14 cases (unfortunately, with large, high-resolution ensemble systems, there is a practical limit on the number of cases that can be run in different ensemble configurations).

d. On the 0.7 value for the ROC area as a limit of a useful prediction

We understand Wilson’s criticism. Generally speaking, we agree that there is a lot of arbitrariness in the choice of a threshold value. Theory only justifies the use of a 0.5 value as the limit for a skillful system. But we feel that the use of a threshold value could ease the comparison of the performance of two different systems for a certain time period, or of one system during different periods, especially when there is a need to condense results, for example, in a few figures or tables. This was the reason why the 0.7 threshold was introduced.
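To make the 0.5 limit concrete, the sketch below computes the area under a ROC curve with the trapezoidal rule from hit rates and false alarm rates at a set of probability thresholds. This is a generic illustration, not Wilson's methodology or the computation used in Buizza et al. (1999b); the function names and the choice of thresholds are assumptions.

```python
def roc_points(probs, outcomes, thresholds):
    """Return sorted (false alarm rate, hit rate) pairs, one per threshold.

    The sample must contain at least one event and one non-event.
    """
    n_events = sum(outcomes)
    n_nonevents = len(outcomes) - n_events
    points = [(0.0, 0.0), (1.0, 1.0)]  # end points shared by every ROC curve
    for t in thresholds:
        hits = sum(1 for p, o in zip(probs, outcomes) if p >= t and o == 1)
        false_alarms = sum(1 for p, o in zip(probs, outcomes) if p >= t and o == 0)
        points.append((false_alarms / n_nonevents, hits / n_events))
    return sorted(points)


def roc_area(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

A forecast that issues the same probability on every occasion cannot separate events from non-events, so its ROC curve lies on the diagonal and the area is 0.5; a forecast that separates them perfectly gives an area of 1.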

e. On atmospheric predictability

We do not agree with the final criticism of Wilson: “If it is atmospheric predictability that is referred to, how can this be judged when no atmospheric data are used in the verification?” We have discussed rather extensively in an appendix (Buizza et al. 1999b) the problem of precipitation verification, where we have clearly stated the limits of our approach, and our belief that 0–24-h forecasts can be used as a practical way of verifying the model on a scale that is consistent with its formulation. The reader is referred to Mullen and Buizza (1999, manuscript submitted to Mon. Wea. Rev.) for an assessment of the performance of the European Centre for Medium-Range Weather Forecasts’ Ensemble Prediction System over the United States for a 2-yr period against a verifying field defined using observations.

2. Answer to comments by Juras

Juras (2000) raised three main points: the first on the choice of 0.70 as a threshold value for the area under a ROC curve of a useful prediction, the second on the influence of climatic variability on skill measures, and the third on the “no resolution” line of the reliability diagrams (Figs. 1c,d and 18 of Buizza et al. 1999b).

a. On the 0.7 value for the ROC area as a limit of a useful prediction

We accept the criticism. We did not intend to suggest that a ROC area of 0.7 should be accepted as a universal limit for verification, but rather we think that the identification of a threshold value could ease the comparison of the performance of two different systems for a certain time period, or of one system during different periods, especially when there is a need to condense results, for example, in a few figures or tables.

b. On the climatic variability of precipitation

Juras correctly points out that areas with different climatological frequencies of precipitation should not be combined. One way to overcome this problem would be to compute, for each grid point and for each month (or season), the climatological distribution of observed precipitation, and then to consider probabilistic predictions of events defined relative to that distribution, such as “precipitation beyond the last decile (or quartile),” instead of events with fixed (i.e., geographically constant) thresholds, such as “precipitation amount greater than 2 mm day⁻¹.”
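The idea of a locally defined event can be sketched as follows. The example assumes that a climatological sample of daily precipitation is available for each grid point and month, which is a hypothetical input here; the nearest-rank quantile and the function names are likewise choices made only for illustration.

```python
import math


def climatological_threshold(sample, quantile=0.9):
    """Nearest-rank quantile of a climatological sample for one grid point and month."""
    ordered = sorted(sample)
    rank = max(1, math.ceil(quantile * len(ordered)))
    return ordered[rank - 1]


def exceedance_probability(ensemble_members, threshold):
    """Fraction of ensemble members forecasting more than the local threshold."""
    return sum(1 for m in ensemble_members if m > threshold) / len(ensemble_members)


# Hypothetical daily precipitation samples (mm) for a wet and a dry grid point.
wet_site = [0, 0, 1, 2, 3, 5, 8, 12, 20, 30]
dry_site = [0, 0, 0, 0, 0, 0, 1, 1, 2, 4]
ensemble = [1, 3, 5, 25]  # hypothetical ensemble forecast (mm)

p_wet = exceedance_probability(ensemble, climatological_threshold(wet_site))
p_dry = exceedance_probability(ensemble, climatological_threshold(dry_site))
```

With these made-up numbers, the same ensemble forecast gives a lower probability of the "last decile" event at the wet grid point than at the dry one, because the event threshold now follows the local climate instead of being geographically constant.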

Unfortunately, to the authors’ knowledge, such gridded climatological distributions of precipitation are not available over Europe. By contrast, it is easy to compute such a field for upper-level variables such as the 500-hPa geopotential height, using so-called analysis fields (Buizza et al. 1999a).

Juras is therefore correct to say that the statement “the predictive skill is lower for small regions” reported in Buizza et al. (1999b) is inadequate. It should be kept in mind, however, that the smaller the area, the longer the sample needed to achieve significant results. This is one of the reasons for considering areas as large as Europe, where a larger number of independent events can be found.

c. No resolution line on reliability diagrams

Juras is correct. We have identified a bug in the software used to draw the reliability diagrams, which gave an incorrect value for the no-resolution line. The enclosed Figs. 1 and 2 show the correct diagrams.
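For reference, the two lines mentioned here can be sketched using their standard textbook definitions: the no-resolution line is horizontal at the sample climatological frequency of the event, and the no-skill line lies halfway between it and the diagonal. The code below is an illustrative sketch under those definitions, not the corrected plotting software; the binning scheme and function names are assumptions.

```python
def reliability_points(probs, outcomes, n_bins=10):
    """Return (mean forecast probability, observed frequency, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, o))
    points = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            obs_freq = sum(o for _, o in members) / len(members)
            points.append((mean_p, obs_freq, len(members)))
    return points


def no_resolution_level(outcomes):
    """Height of the horizontal no-resolution line: the sample climatological frequency."""
    return sum(outcomes) / len(outcomes)


def no_skill_line(prob, base_rate):
    """Value of the no-skill line at a given forecast probability."""
    return 0.5 * (prob + base_rate)
```

Under these definitions the no-resolution line, the no-skill line, and the diagonal all intersect where the forecast probability equals the sample climatological frequency, which gives a quick check on a plotted diagram.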

REFERENCES

  • Buizza, R., T. Petroliagis, T. N. Palmer, J. Barkmeijer, M. Hamrud, A. Hollingsworth, A. Simmons, and N. Wedi, 1998: Impact of model resolution and ensemble size on the performance of an ensemble prediction system. Quart. J. Roy. Meteor. Soc., 124, 1935–1960.
  • Buizza, R., J. Barkmeijer, T. N. Palmer, and D. S. Richardson, 1999a: Current status and future developments of the ECMWF Ensemble Prediction System. Meteor. Appl., 6, 1–14.
  • Buizza, R., A. Hollingsworth, F. Lalaurette, and A. Ghelli, 1999b: Probabilistic predictions of precipitation using the ECMWF Ensemble Prediction System. Wea. Forecasting, 14, 168–189.
  • Juras, J., 2000: Comments on “Probabilistic predictions of precipitation using the ECMWF Ensemble Prediction System.” Wea. Forecasting, 15, 365–366.
  • Wilson, L. J., 2000: Comments on “Probabilistic predictions of precipitation using the ECMWF Ensemble Prediction System.” Wea. Forecasting, 15, 361–364.

Fig. 1.

Reliability diagram for summer 1997 for the probabilistic prediction of (a) 1 mm and (b) 10 mm, at forecast day 3 for Europe. The horizontal dashed line is the line of no resolution, and the dotted line is the line of no skill (corrected from Figs. 1c,d of Buizza et al. 1999b).

Citation: Weather and Forecasting 15, 3; 10.1175/1520-0434(2000)015<0367:RTCBWA>2.0.CO;2

Fig. 2.

(a) Winter 1996/97 reliability diagram for 5 mm at forecast day 4 over Europe. (b) As in (a) but for 10 mm. (c) and (d) As in (a) and (b) but at forecast day 3 for summer 1997. The horizontal dashed line is the line of no resolution, and the dotted line is the line of no skill (corrected Figs. 18a–d of Buizza et al. 1999b).

Table 1.

Forecast time (day) at which the Brier skill score first crosses the zero line.
