Search Results

Showing 1–10 of 10 items for Author or Editor: Ian Jolliffe
Ian T. Jolliffe

Abstract

A recent paper used a simple artificial example to illustrate the shortcomings of EOFs and rotated EOFs in identifying underlying physical modes. The example raises at least as many questions as it answers. This note addresses some of these questions and clarifies some of the behavior of EOFs and related techniques.

Full access
Ian T. Jolliffe

Abstract

When a forecast is assessed, a single value for a verification measure is often quoted. This is of limited use, as it needs to be complemented by some idea of the uncertainty associated with the value. If this uncertainty can be quantified, it is then possible to make statistical inferences based on the value observed. There are two main types of inference: confidence intervals can be constructed for an underlying “population” value of the measure, or hypotheses can be tested regarding the underlying value. This paper will review the main ideas of confidence intervals and hypothesis tests, together with the less well known “prediction intervals,” concentrating on aspects that are often poorly understood. Comparisons will be made between different methods of constructing confidence intervals—exact, asymptotic, bootstrap, and Bayesian—and the difference between prediction intervals and confidence intervals will be explained. For hypothesis testing, multiple testing will be briefly discussed, together with connections between hypothesis testing, prediction intervals, and confidence intervals.
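
The bootstrap approach mentioned above can be sketched in a few lines. The following is an illustrative percentile-bootstrap confidence interval for a generic verification measure, not the paper's own implementation; the choice of mean squared error as the example measure, and all function names, are assumptions for illustration.

```python
import numpy as np

def bootstrap_ci(measure, obs, fcst, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a verification
    measure computed on paired observations and forecasts (a sketch;
    the paper also compares exact, asymptotic, and Bayesian intervals)."""
    rng = np.random.default_rng(seed)
    n = len(obs)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)       # resample cases with replacement
        stats.append(measure(obs[idx], fcst[idx]))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lo), float(hi)

# Example: interval for mean squared error of synthetic forecasts.
rng = np.random.default_rng(3)
obs = rng.standard_normal(100)
fcst = obs + 0.5 * rng.standard_normal(100)
mse = lambda o, f: float(np.mean((o - f) ** 2))
lo, hi = bootstrap_ci(mse, obs, fcst, n_boot=500)
```

The resulting interval quantifies the sampling uncertainty of the single quoted score; a prediction interval, by contrast, would describe where a *future* value of the measure is expected to fall.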

Full access
Ian T. Jolliffe and Cristina Primo

Abstract

Rank histograms are often plotted to evaluate the forecasts produced by an ensemble forecasting system—an ideal rank histogram is “flat” or uniform. It has been noted previously that the obvious test of “flatness,” the well-known χ² goodness-of-fit test, spreads its power thinly and hence is not good at detecting specific alternatives to flatness, such as bias or over- or underdispersion. Members of the Cramér–von Mises family of tests do much better in this respect. An alternative to using the Cramér–von Mises family is to decompose the χ² test statistic into components that correspond to specific alternatives. This approach is described in the present paper. It is arguably easier to use and more flexible than the Cramér–von Mises family of tests, and performs at least as well in detecting alternatives corresponding to bias and over- or underdispersion.
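
The decomposition idea can be sketched as follows: project the standardized bin deviations onto orthonormal contrast vectors, so that each squared projection is a 1-degree-of-freedom component of the overall χ² statistic. The particular linear and quadratic contrasts below are illustrative assumptions (a linear contrast picks up bias, a U-shaped one picks up over- or underdispersion), not necessarily the paper's exact choices.

```python
import numpy as np

def chi2_components(counts):
    """Decompose the chi-square flatness statistic for a rank histogram
    into orthogonal 1-df components (illustrative sketch)."""
    counts = np.asarray(counts, dtype=float)
    k = counts.size
    n = counts.sum()
    e = n / k                       # expected bin count under flatness
    d = (counts - e) / np.sqrt(e)   # standardized deviations

    # Orthonormal contrasts, both orthogonal to the constant vector:
    t = np.arange(k)
    lin = t - t.mean()              # linear contrast -> bias
    lin = lin / np.linalg.norm(lin)
    quad = (t - t.mean()) ** 2      # quadratic contrast -> dispersion
    quad = quad - quad.mean()
    quad = quad - (quad @ lin) * lin    # orthogonalize against linear
    quad = quad / np.linalg.norm(quad)

    total = float(d @ d)                # ordinary chi-square statistic
    bias = float((lin @ d) ** 2)        # component sensitive to bias
    dispersion = float((quad @ d) ** 2) # component sensitive to dispersion
    return total, bias, dispersion
```

For example, a sloped histogram such as `[5, 8, 10, 12, 15]` puts nearly all of its χ² into the bias component, while a U-shaped one such as `[15, 8, 4, 8, 15]` puts it into the dispersion component.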

Full access
Ian T. Jolliffe and David B. Stephenson
Full access
Ian T. Jolliffe and David B. Stephenson

Abstract

Verification is an important part of any forecasting system. It is usually achieved by computing the value of some measure or score that indicates how good the forecasts are. Many possible verification measures have been proposed, and to choose between them a number of desirable properties have been defined. For probability forecasts of a binary event, two of the best known of these properties are propriety and equitability. A proof that the two properties are incompatible for a wide class of verification measures is given in this paper, after briefly reviewing the two properties and some recent attempts to improve properties for the well-known Brier skill score.

Full access
Mark P. Baldwin, David B. Stephenson, and Ian T. Jolliffe

Abstract

Often there is a need to consider spatial weighting in methods for finding spatial patterns in climate data. The focus of this paper is on techniques that maximize variance, such as empirical orthogonal functions (EOFs). A weighting matrix is introduced into a generalized framework for dealing with spatial weighting. One basic principle in the design of the weighting matrix is that the resulting spatial patterns should be independent of the grid used to represent the data. A weighting matrix can also be used for other purposes, such as to compensate for the neglect of unrepresented subgrid-scale variance or, in the form of a prewhitening filter, to maximize the signal-to-noise ratio of EOFs. The new methodology is applicable to other types of climate pattern analysis, such as extended EOF analysis and maximum covariance analysis. The increasing availability of large datasets of three-dimensional gridded variables (e.g., reanalysis products and model output) raises special issues for data-reduction methods such as EOFs. Fast, memory-efficient methods are required in order to extract leading EOFs from such large datasets. This study proposes one such approach based on a simple iteration of successive projections of the data onto time series and spatial maps. It is also demonstrated that spatial weighting can be combined with the iterative methods. Throughout the paper, multivariate statistics notation is used, simplifying implementation as matrix commands in high-level computing languages.
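
The "successive projections" iteration described above amounts to a power iteration that alternates between projecting the data onto a spatial pattern (giving a time series) and onto that time series (giving a map), never forming the full covariance matrix. The sketch below combines this with a simple per-gridpoint weighting; the function name, the weight vector, and the convergence details are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def leading_eof(X, weights=None, n_iter=500, tol=1e-10):
    """Leading EOF of a (time x space) data matrix X by alternating
    projections (power-iteration sketch of the memory-efficient idea).
    `weights` is an optional per-gridpoint vector, e.g. square-root
    area weights (an assumption for illustration)."""
    X = X - X.mean(axis=0)                 # remove the time mean
    if weights is not None:
        X = X * weights                    # apply spatial weighting
    rng = np.random.default_rng(0)
    p = rng.standard_normal(X.shape[1])    # initial spatial pattern
    p /= np.linalg.norm(p)
    for _ in range(n_iter):
        a = X @ p                          # project onto pattern -> time series
        p_new = X.T @ a                    # project onto time series -> map
        p_new /= np.linalg.norm(p_new)
        if np.linalg.norm(p_new - p) < tol:
            p = p_new
            break
        p = p_new
    a = X @ p
    return p, a                            # EOF pattern and PC time series
```

Only matrix–vector products with `X` are needed per iteration, which is what makes the approach attractive for datasets too large to decompose directly.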

Full access
Cristina Primo, Christopher A. T. Ferro, Ian T. Jolliffe, and David B. Stephenson

Abstract

Probabilistic forecasts of atmospheric variables are often given as relative frequencies obtained from ensembles of deterministic forecasts. The detrimental effects of imperfect models and initial conditions on the quality of such forecasts can be mitigated by calibration. This paper shows that Bayesian methods currently used to incorporate prior information can be written as special cases of a beta-binomial model and correspond to a linear calibration of the relative frequencies. These methods are compared with a nonlinear calibration technique (i.e., logistic regression) using real precipitation forecasts. Calibration is found to be advantageous in all cases considered, and logistic regression is preferable to linear methods.
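
The contrast between linear and nonlinear calibration can be illustrated with a minimal sketch: a least-squares linear map of the relative frequencies stands in for the linear (beta-binomial-equivalent) family, and a logistic regression fitted by plain gradient ascent stands in for the nonlinear alternative. All names and fitting details here are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def linear_calibration(f, y):
    """Linear calibration of ensemble relative frequencies f against
    binary outcomes y: least-squares fit of y on f, clipped to [0, 1]."""
    A = np.column_stack([np.ones_like(f), f])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda fnew: np.clip(coef[0] + coef[1] * fnew, 0.0, 1.0)

def logistic_calibration(f, y, lr=0.5, n_iter=5000):
    """Logistic-regression calibration fitted by gradient ascent on the
    Bernoulli log-likelihood (illustrative, dependency-free fit)."""
    A = np.column_stack([np.ones_like(f), f])
    w = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ w))
        w += lr * A.T @ (y - p) / len(y)   # average log-likelihood gradient
    return lambda fnew: 1.0 / (1.0 + np.exp(-(w[0] + w[1] * fnew)))
```

Unlike the linear map, the logistic curve is guaranteed to produce probabilities in (0, 1) without clipping, one practical reason a nonlinear calibration can be preferable.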

Full access
Robin J. Hogan, Christopher A. T. Ferro, Ian T. Jolliffe, and David B. Stephenson

Abstract

In the forecasting of binary events, verification measures that are “equitable” were defined by Gandin and Murphy to satisfy two requirements: 1) they award all random forecasting systems, including those that always issue the same forecast, the same expected score (typically zero), and 2) they are expressible as the linear weighted sum of the elements of the contingency table, where the weights are independent of the entries in the table, apart from the base rate. The authors demonstrate that the widely used “equitable threat score” (ETS), as well as numerous others, satisfies neither of these requirements and only satisfies the first requirement in the limit of an infinite sample size. Such measures are referred to as “asymptotically equitable.” In the case of ETS, the expected score of a random forecasting system is always positive and only falls below 0.01 when the number of samples is greater than around 30. Two other asymptotically equitable measures are the odds ratio skill score and the symmetric extreme dependency score, which are more strongly inequitable than ETS, particularly for rare events; for example, when the base rate is 2% and the sample size is 1000, random but unbiased forecasting systems yield an expected score of around −0.5, reducing in magnitude to −0.01 or smaller only for sample sizes exceeding 25 000. This presents a problem since these nonlinear measures have other desirable properties, in particular being reliable indicators of skill for rare events (provided that the sample size is large enough). A potential way to reconcile these properties with equitability is to recognize that Gandin and Murphy’s two requirements are independent, and the second can be safely discarded without losing the key advantages of equitability that are embodied in the first. 
This enables inequitable and asymptotically equitable measures to be scaled to make them equitable, while retaining their nonlinearity and other properties such as being reliable indicators of skill for rare events. It also opens up the possibility of designing new equitable verification measures.
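
The claim that a random forecasting system has a positive expected ETS at small sample sizes is easy to check by simulation. The sketch below defines ETS from a 2×2 contingency table and estimates its expectation for an unbiased random forecaster by Monte Carlo; the function names and simulation setup are assumptions for illustration.

```python
import numpy as np

def ets(hits, misses, false_alarms, correct_negatives):
    """Equitable threat score (Gilbert skill score) from a 2x2 table."""
    n = hits + misses + false_alarms + correct_negatives
    hits_random = (hits + misses) * (hits + false_alarms) / n
    denom = hits + misses + false_alarms - hits_random
    return (hits - hits_random) / denom if denom != 0 else 0.0

def expected_ets_random(n, base_rate, trials=20000, seed=0):
    """Monte Carlo estimate of E[ETS] for an unbiased random forecaster
    (forecast and observation both Bernoulli with the base rate)."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(trials):
        obs = rng.random(n) < base_rate
        fc = rng.random(n) < base_rate
        a = int(np.sum(fc & obs))     # hits
        b = int(np.sum(fc & ~obs))    # false alarms
        c = int(np.sum(~fc & obs))    # misses
        d = int(np.sum(~fc & ~obs))   # correct negatives
        scores.append(ets(a, c, b, d))
    return float(np.mean(scores))
```

Running this for a small sample (say n = 10) yields a clearly positive mean score, consistent with ETS being only asymptotically equitable, while for large n the mean shrinks toward zero.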

Full access
Peter W. Thorne, Kate M. Willett, Rob J. Allan, Stephan Bojinski, John R. Christy, Nigel Fox, Simon Gilbert, Ian Jolliffe, John J. Kennedy, Elizabeth Kent, Albert Klein Tank, Jay Lawrimore, David E. Parker, Nick Rayner, Adrian Simmons, Lianchun Song, Peter A. Stott, and Blair Trewin

No abstract available.

Full access