Abstract
This paper investigates the performance of several skill measures [e.g., linear error in probability space (LEPS), relative operating characteristics score (ROCS), Brier scores, and proportion correct rates] commonly used in the validation and verification of seasonal climate forecasts, within the context of some simple theoretical forecast models. The models considered include linear regression and linear discriminant analysis types, in which the forecasts are presented as above/below-median probabilities and tercile probabilities. Above- and below-median categories are also explored within the context of stratified climatology models, while tail categories are explored within the context of the linear regression type. The skill scores for the models are calculated in each case as functions of a parameter that expresses the strength of the relationship between the predictor and the predictand. The skill scores investigated are found to exhibit different dependencies on the model parameter, implying that a given skill score value (say, 0.1) can correspond to a range of strengths in the predictor–predictand relationship, depending on which skill score is being considered. On the other hand, interrelationships between pairs of skill scores are found to be similar across the different types of models, provided model reliability is preserved. The two-category and three-category LEPS skill scores are found to lie on approximately the same scale for the linear regression–type model, thereby enabling a direct comparison.
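Two of the verification measures named above, the Brier score and the proportion correct rate, can be illustrated with a minimal sketch for a two-category (above/below-median) probability forecast. The function names and sample data here are hypothetical illustrations, not taken from the paper or its models:

```python
# Hypothetical illustration of two skill measures for binary-category
# probability forecasts (above/below the median). Data are invented.

def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and
    binary outcomes (1 = category occurred, 0 = it did not).
    Lower is better; a perfect forecast scores 0."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def proportion_correct(probs, outcomes, threshold=0.5):
    """Fraction of forecasts whose implied category
    (probability >= threshold) matches the observed outcome."""
    hits = sum((p >= threshold) == bool(o) for p, o in zip(probs, outcomes))
    return hits / len(probs)

# Forecast probabilities of the "above median" category and the
# corresponding observed occurrences (illustrative values only).
probs = [0.7, 0.4, 0.6, 0.2, 0.8]
outcomes = [1, 0, 1, 0, 1]
print(brier_score(probs, outcomes))        # -> 0.098
print(proportion_correct(probs, outcomes)) # -> 1.0
```

In the paper's framework, such scores would be evaluated not on sample data but analytically, as functions of the parameter linking predictor and predictand.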
Corresponding author address: Robert Fawcett, National Climate Centre, Australian Bureau of Meteorology, GPO Box 1289, Melbourne, VIC 3001, Australia. Email: r.fawcett@bom.gov.au