Search Results

Showing 1–3 of 3 items for:

  • Author or Editor: Lisa Goddard
  • Monthly Weather Review

Andrew W. Robertson, Upmanu Lall, Stephen E. Zebiak, and Lisa Goddard

Abstract

An improved Bayesian optimal weighting scheme is developed and used to combine six atmospheric general circulation model (GCM) seasonal hindcast ensembles. The approach is based on the prior belief that the forecast probabilities of tercile-category precipitation and near-surface temperature are equal to the climatological ones. The six GCMs are integrated over the 1950–97 period with observed monthly SST prescribed at the lower boundary, with 9–24 ensemble members. The weights of the individual models are determined by maximizing the log likelihood of the combination by season over the integration period. A key ingredient of the scheme is the climatological equal-odds forecast, which is included as one of the “models” in the multimodel combination. Simulation skill is quantified in terms of the cross-validated ranked probability skill score (RPSS) for the three-category probabilistic hindcasts. The individual GCM ensembles, simple poolings of three and six models, and the optimally combined multimodel ensemble are compared.
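The weighting idea in the abstract can be illustrated with a small sketch. It assumes the combination is a plain weighted average of each model's tercile probabilities, with the climatological equal-odds forecast (1/3 per category) entered as one more "model". The weights are chosen to maximize the log likelihood of the observed categories using a standard EM-style update for mixture weights. This is a simplification substituted for illustration, not the authors' Bayesian optimization, and the function names and hindcast numbers are hypothetical.

```python
import math

def log_likelihood(weights, observed_probs):
    """Sum over hindcasts of log(combined probability of the observed category).

    observed_probs[t][k] is the probability that model k assigned to the
    tercile category that was actually observed in hindcast t.
    """
    return sum(
        math.log(sum(w * p for w, p in zip(weights, row)))
        for row in observed_probs
    )

def fit_weights(observed_probs, iterations=200):
    """Maximize the mixture log likelihood with EM-style weight updates.

    Weights start equal, stay non-negative, and sum to one at every step.
    """
    n_models = len(observed_probs[0])
    n_cases = len(observed_probs)
    weights = [1.0 / n_models] * n_models
    for _ in range(iterations):
        totals = [0.0] * n_models
        for row in observed_probs:
            mix = sum(w * p for w, p in zip(weights, row))
            for k in range(n_models):
                # Share of the combined probability contributed by model k.
                totals[k] += weights[k] * row[k] / mix
        weights = [t / n_cases for t in totals]
    return weights

# Hypothetical hindcasts: columns are model A, model B, climatology (1/3).
clim = 1.0 / 3.0
hindcasts = [(0.6, 0.2, clim)] * 6 + [(0.1, 0.3, clim)] * 2

weights = fit_weights(hindcasts)
print("weights (A, B, climatology):", [round(w, 3) for w in weights])
print("log likelihood:", round(log_likelihood(weights, hindcasts), 3))
```

In this toy case model B always gives the observed category less probability than climatology does, so its weight shrinks toward zero, while weight lost by an unskillful model moves to the climatological "model" rather than being forced onto the other GCMs.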

The Bayesian optimal weighting scheme outperforms the pooled ensemble, which in turn outperforms the individual models. In the extratropics, its main benefit is to bring much of the large area of negative RPSS values for precipitation up to near zero. The skill of the optimal combination is almost always increased (in the large spatial averages considered) when the number of models in the combination is increased from three to six, regardless of which models are included in the three-model combination.
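To make the RPSS values discussed above concrete, the following sketch computes the ranked probability skill score for three-category forecasts against the climatological equal-odds reference. A score of 1 is perfect, 0 matches climatology, and negative values are worse than climatology. It uses the common unnormalized form of the ranked probability score and omits the cross-validation used in the paper; the function names and example forecasts are hypothetical.

```python
def rps(probs, obs_category):
    """Ranked probability score for one forecast over ordered categories.

    probs: forecast probabilities for each category (summing to 1).
    obs_category: index of the category that was observed.
    """
    cum_forecast = 0.0
    cum_observed = 0.0
    score = 0.0
    for k, p in enumerate(probs):
        cum_forecast += p
        cum_observed += 1.0 if k == obs_category else 0.0
        score += (cum_forecast - cum_observed) ** 2
    return score

def rpss(forecasts, observations, reference=(1 / 3, 1 / 3, 1 / 3)):
    """Skill relative to a fixed reference forecast (default: equal odds)."""
    rps_forecast = sum(rps(p, o) for p, o in zip(forecasts, observations))
    rps_reference = sum(rps(reference, o) for o in observations)
    return 1.0 - rps_forecast / rps_reference

# Hypothetical hindcasts: tercile probabilities and observed category index.
forecasts = [(0.5, 0.3, 0.2), (0.2, 0.3, 0.5), (0.1, 0.3, 0.6)]
observations = [0, 2, 0]
print("RPSS:", round(rpss(forecasts, observations), 3))
```

A confident forecast that puts all probability on the wrong outer category scores well below zero, which is why pulling overconfident model output back toward climatology can raise the RPSS to near zero.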

Improvements are made to the original Bayesian scheme of Rajagopalan et al. by reducing the dimensionality of the numerical optimization, averaging across data subsamples, and including spatial smoothing of the likelihood function. These modifications are shown to yield increases in cross-validated RPSS skills. The revised scheme appears to be better suited to combining larger sets of models, and, in the future, it should be possible to include statistical models into the weighted ensemble without fundamental difficulty.

Simon J. Mason, Jacqueline S. Galpin, Lisa Goddard, Nicholas E. Graham, and Balakanapathy Rajaratnam

Abstract

Probabilistic forecasts of variables measured on a categorical or ordinal scale, such as precipitation occurrence or temperatures exceeding a threshold, are typically verified by comparing the relative frequency with which the target event occurs given different levels of forecast confidence. The degree to which this conditional (on the forecast probability) relative frequency of an event corresponds with the actual forecast probabilities is known as reliability, or calibration. Forecast reliability for binary variables can be measured using the Murphy decomposition of the (half) Brier score, and can be presented graphically using reliability and attributes diagrams. For forecasts of variables on continuous scales, however, an alternative measure of reliability is required. The binned probability histogram and the reliability component of the continuous ranked probability score have been proposed as appropriate verification procedures in this context, but are subject to some limitations. A procedure is proposed that is applicable in the context of forecast ensembles and is an extension of the binned probability histogram. Individual ensemble members are treated as estimates of quantiles of the forecast distribution, and the conditional probability that the observed precipitation, for example, exceeds the amount forecast [the conditional exceedance probability (CEP)] is calculated. Generalized linear regression is used to estimate these conditional probabilities. A diagram showing the CEPs for ranked ensemble members is suggested as a useful method for indicating reliability when forecasts are on a continuous scale, and various statistical tests are suggested for quantifying the reliability.
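The conditional exceedance probability can be sketched for a single ranked ensemble member as follows. The example assumes logistic regression as the generalized linear model (the abstract does not specify the link function), fitted by Newton-Raphson, with the member's forecast value as the only predictor. All names and data are hypothetical.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Fit P(y = 1 | x) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient with respect to b0
            g1 += (yi - p) * xi     # gradient with respect to b1
            h00 += w                # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break                   # information matrix is near singular
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def cep(b0, b1, forecast_value):
    """Estimated probability that the observation exceeds the forecast value."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * forecast_value)))

# Hypothetical values of one ranked member and the matching observations.
member_forecast = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
observed = [0.5, 0.8, 1.2, -0.3, 1.5, 0.2, 0.4, 0.9]
exceeded = [1 if o > f else 0 for f, o in zip(member_forecast, observed)]

b0, b1 = fit_logistic(member_forecast, exceeded)
print("slope:", round(b1, 3))
print("CEP at forecast 0.0:", round(cep(b0, b1, 0.0), 3))
print("CEP at forecast 1.0:", round(cep(b0, b1, 1.0), 3))
```

Repeating the fit for each ranked member and plotting the resulting curves against forecast value gives a diagram in the spirit of the one the abstract proposes. A fitted slope far from zero indicates that the chance of exceeding the member depends on how large the forecast is.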

Simon J. Mason, Michael K. Tippett, Andreas P. Weigel, Lisa Goddard, and Balakanapathy Rajaratnam