Search Results

Showing items 51–60 of 87 for Author or Editor: Allan H. Murphy
Martin Ehrendorfer and Allan H. Murphy

Abstract

The sufficiency relation, originally developed in the context of the comparison of statistical experiments, provides a sound basis for the comparative evaluation of forecasting systems. The importance of this relation resides in the fact that if forecasting system A can be shown to be sufficient for forecasting system B, then all users will find A's forecasts of greater value than B's forecasts regardless of their individual payoff structures.
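
For forecasting systems that use finite sets of probability values, the sufficiency condition can be checked as a linear-programming feasibility problem: A is sufficient for B if B's likelihoods can be obtained from A's by a stochastic transformation. The sketch below assumes NumPy and SciPy; the function name is_sufficient and the toy likelihood matrices are illustrative, not taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def is_sufficient(pA, pB):
    """Check whether forecasting system A is sufficient for system B.

    pA : (nF, nX) array, pA[f, x] = likelihood of A issuing forecast value f
         given that observation state x occurs.
    pB : (nG, nX) array, analogous likelihoods for system B.

    A is sufficient for B if there is a stochastic kernel h(g|f) >= 0 with
    sum_g h(g|f) = 1 such that pB[g, x] = sum_f h(g|f) * pA[f, x] for all x.
    The existence check is posed as a linear-programming feasibility problem.
    """
    nF, nX = pA.shape
    nG, _ = pB.shape
    nvar = nG * nF                       # unknowns h[g, f], flattened row-major

    A_eq, b_eq = [], []
    # Matching constraints: sum_f h[g, f] * pA[f, x] = pB[g, x]
    for g in range(nG):
        for x in range(nX):
            row = np.zeros(nvar)
            for f in range(nF):
                row[g * nF + f] = pA[f, x]
            A_eq.append(row)
            b_eq.append(pB[g, x])
    # Stochasticity constraints: sum_g h[g, f] = 1 for every f
    for f in range(nF):
        row = np.zeros(nvar)
        row[f::nF] = 1.0
        A_eq.append(row)
        b_eq.append(1.0)

    res = linprog(c=np.zeros(nvar), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=[(0, None)] * nvar, method="highs")
    return res.success

# Toy binary-predictand example: A uses probabilities {0.1, 0.5, 0.9}, B uses
# {0.3, 0.7}; columns are the observation states (no event, event) and each
# likelihood column sums to one.
pA = np.array([[0.6, 0.1],    # P(A forecasts 0.1 | x)
               [0.3, 0.3],    # P(A forecasts 0.5 | x)
               [0.1, 0.6]])   # P(A forecasts 0.9 | x)
pB = np.array([[0.55, 0.25],  # P(B forecasts 0.3 | x)
               [0.45, 0.75]]) # P(B forecasts 0.7 | x)
print(is_sufficient(pA, pB))  # True: a stochastic transformation exists
```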

In this paper the sufficiency relation is applied to the problem of comparative evaluation of prototypical climate forecasting systems. The primary objectives here are to assess the basic applicability of the sufficiency relation in this context and to investigate the implications of this approach for the relationships among the performance characteristics of such forecasting systems.

The results confirm that forecasting system A is sufficient for forecasting system B when the former uses more extreme probabilities more frequently than the latter. Further, for the relatively simple forecasting systems considered here, it is found that system A may be sufficient for system B even if the former uses extreme forecasts less frequently, provided that A's forecasts are—to a certain degree—more extreme than B's forecasts. Conversely, system A cannot be shown to be sufficient for system B if the former uses less extreme forecasts more frequently than the latter. The advantages of the sufficiency relation over traditional performance measures in this context are also demonstrated.

Several issues related to the general applicability of the sufficiency relation to the comparative evaluation of climate forecasts are discussed. Possible extensions of this work, as well as some implications of the results for verification procedures and practices in this context, are briefly described.

Full access
Robert L. Winkler and Allan H. Murphy

Abstract

Proper scoring rules, such as the probability score, are based (in part) upon the assumption that the assessor's utility function is linearly related to the score. The effects of two nonlinear utility functions, one representing a “risk-taker” and one representing a “risk-avoider,” on an assessor's probability forecasts are considered. The results indicate that factors other than the expected score, e.g., the variance of the score, may be relevant for probability assessment. In general, a “risk-taker” “hedges” toward a categorical forecast, while a “risk-avoider” “hedges” away from a categorical forecast. The implications of these results for the process of probability assessment are briefly discussed.
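
The hedging effect can be illustrated numerically. In the sketch below the probability score is taken as the reward 1 - (r - o)^2 for report r and binary outcome o, and the exponential utility functions, the belief p = 0.7, and the grid search are illustrative assumptions rather than choices made in the paper.

```python
import numpy as np

def optimal_report(p, utility, grid=np.linspace(0.0, 1.0, 1001)):
    """Report r that maximizes expected utility of the score 1 - (r - o)^2
    when the assessor's true probability of the event is p."""
    score_if_occur = 1.0 - (grid - 1.0) ** 2     # event occurs (o = 1)
    score_if_not = 1.0 - grid ** 2               # event does not occur (o = 0)
    expected_u = p * utility(score_if_occur) + (1 - p) * utility(score_if_not)
    return grid[np.argmax(expected_u)]

p = 0.7                                      # assessor's true belief (illustrative)
linear       = lambda s: s                   # utility linear in the score
risk_taker   = lambda s: np.exp(3 * s)       # convex utility of the score
risk_avoider = lambda s: 1 - np.exp(-3 * s)  # concave utility of the score

print("linear utility ->", optimal_report(p, linear))        # ~0.70: no hedging
print("risk-taker     ->", optimal_report(p, risk_taker))    # > 0.70: toward a categorical forecast
print("risk-avoider   ->", optimal_report(p, risk_avoider))  # < 0.70: away from a categorical forecast
```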

Full access
Allan H. Murphy and Edward S. Epstein

Abstract

The evaluation process is considered in some detail with particular reference to probabilistic predictions. The process consists of several ordered steps at each of which elements (of the process) are identified. Consideration of the purposes of evaluation leads to the identification of two distinct forms of evaluation: operational evaluation, concerned with the value of predictions to the user, and empirical evaluation, or verification, concerned with the perfection of predictions, i.e., the association between predictions and observations. Attributes, i.e., desirable properties, of predictions are defined with reference to these purposes, and a number of measures of the attributes for empirical evaluation are considered. An artificial example of comparative verification in which different measures appear to yield contradictory results is used to demonstrate the importance of, and need for, a careful analysis of the evaluation process.
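
The artificial example used in the paper is not reproduced here, but the phenomenon it illustrates can be sketched with hypothetical numbers: two sets of probability forecasts of a binary event may be ranked differently by two common measures. The data, and the choice of the mean squared (Brier) error and the mean absolute error as the two measures, are assumptions made for illustration.

```python
import numpy as np

# Hypothetical forecast sets (not the example from the paper): probability
# forecasts of a binary event, verified against the same observations.
obs = np.array([1, 1, 0, 0])
forecasts_A = np.array([0.9, 0.9, 0.1, 0.1])
forecasts_B = np.array([1.0, 1.0, 0.3, 0.0])

def brier(f, o):           # mean squared probability error (lower is better)
    return np.mean((f - o) ** 2)

def mean_abs_error(f, o):  # mean absolute probability error (lower is better)
    return np.mean(np.abs(f - o))

print("Brier:  A =", brier(forecasts_A, obs), " B =", brier(forecasts_B, obs))
print("MAE:    A =", mean_abs_error(forecasts_A, obs),
      " B =", mean_abs_error(forecasts_B, obs))
# The Brier score ranks A ahead of B (0.010 vs 0.0225), while the mean absolute
# error ranks B ahead of A (0.075 vs 0.100): the two measures disagree.
```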

Full access
Allan H. Murphy and Edward S. Epstein

Abstract

The consideration of a maxim and a statement, both of which are concerned with “hedging” on the part of meteorologists who prepare probability forecasts, leads to the identification of a property which all proper scoring systems for such forecasts should possess. A scoring system, to be proper, should encourage the meteorologist to make his probabilities correspond to his true beliefs. The conditions which a proper scoring system must satisfy are formulated in mathematical terms. Several existing scoring systems are examined to ascertain whether or not the systems are proper.
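
In present-day notation (the symbols below are ours, not necessarily the paper's), the propriety condition may be written as follows, for a scoring rule S(r, j) interpreted as the reward received when the probability vector r is issued and the jth of n mutually exclusive events occurs:

```latex
% Propriety condition: for every belief vector p and every report r,
\sum_{j=1}^{n} p_j \, S(\mathbf{p}, j)
  \;\ge\; \sum_{j=1}^{n} p_j \, S(\mathbf{r}, j)
  \qquad \text{for every probability vector } \mathbf{r}.
```

A scoring rule is strictly proper when equality holds only for r = p; the probability (Brier) score, with its sign chosen so that larger values are better, is a well-known example of a strictly proper rule.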

Full access
Robert L. Winkler and Allan H. Murphy

Abstract

Since a meteorologist's predictions are subjective, a framework for the evaluation of meteorological probability assessors must be consistent with the theory of subjective probability. Such a framework is described in this paper. First, two standards of “goodness,” one based upon normative considerations and one based upon substantive considerations, are proposed. Specific properties which a meteorologist's assessments should possess are identified for each standard. Then, several measures of “goodness,” or scoring rules, which indicate the extent to which such assessments possess certain properties, are described. Finally, several important uses of these scoring rules are considered.
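
One property that scoring-rule-based evaluations commonly examine is reliability (calibration): forecasts issued with a given probability should verify with roughly that relative frequency. The sketch below, with made-up forecasts and observations, shows how such a check might be carried out; it is illustrative only and not a procedure taken from the paper.

```python
import numpy as np

# Illustrative probability forecasts of a binary event and the observations.
forecasts = np.array([0.2, 0.2, 0.2, 0.5, 0.5, 0.8, 0.8, 0.8, 0.8, 0.8])
observed  = np.array([0,   1,   0,   1,   0,   1,   1,   0,   1,   1  ])

# For each distinct issued probability, compare it with the observed
# relative frequency of the event (a simple empirical reliability check).
for p in np.unique(forecasts):
    use = forecasts == p
    print(f"forecast {p:.1f}: issued {use.sum()} times, "
          f"event observed with relative frequency {observed[use].mean():.2f}")
```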

Full access
Allan H. Murphy and Barbara G. Brown

This paper reports some results of a study in which two groups of individuals—undergraduate students and professional meteorologists at Oregon State University—completed a short questionnaire concerning their interpretations of terminology commonly used in public weather forecasts. The questions related to terms and phrases associated with three elements: 1) cloudiness—fraction of sky cover; 2) precipitation—spatial and/or temporal variations; and 3) temperature—specification of intervals.

The students' responses indicate that cloudiness terms are subject to wide and overlapping ranges of interpretation, although the interpretations of these terms correspond quite well to National Weather Service definitions. Their responses to the precipitation and temperature questions reveal that some confusion exists concerning the meaning of spatial and temporal modifiers in precipitation forecasts and that some individuals interpret temperature ranges in terms of asymmetric intervals. When compared to the students' responses, the meteorologists' responses exhibit narrower ranges of interpretation of the cloudiness terms and less confusion about the meaning of spatial/temporal precipitation modifiers.

The study was not intended to be a definitive analysis of public understanding of forecast terminology. Instead, it should be viewed as a primitive form of the type of forecast-terminology study that must be undertaken in the future. Some implications of this investigation for future work in the area are discussed briefly.

Full access
Allan H. Murphy and Barbara G. Brown

Worded forecasts, which generally consist of both verbal and numerical expressions, play an important role in the communication of weather information to the general public. However, relatively few studies of the composition and interpretation of such forecasts have been conducted. Moreover, the studies that have been undertaken to date indicate that many expressions currently used in public forecasts are subject to wide ranges of interpretation (and to misinterpretation) and that the ability of individuals to recall the content of worded forecasts is quite limited. This paper focuses on forecast terminology and the understanding of such terminology in the context of short-range public weather forecasts.

The results of previous studies of forecast terminology (and related issues) are summarized with respect to six basic aspects or facets of worded forecasts. These facets include: 1) events (the values of the meteorological variables); 2) terminology (the words used to describe the events); 3) words versus numbers (the use of verbal and/or numerical expressions); 4) uncertainty (the mode of expression of uncertainty); 5) amount of information (the number of items of information); and 6) content and format (the selection of items of information and their placement). In addition, some related topics are treated briefly, including the impact of verification systems, the role of computer-worded forecasts, the implications of new modes of communication, and the use of weather forecasts.

Some conclusions and inferences that can be drawn from this review of previous work are discussed briefly, and a set of recommendations are presented regarding steps that should be taken to raise the level of understanding and enhance the usefulness of worded forecasts. These recommendations are organized under four headings: 1) studies of public understanding, interpretation, and use; 2) management practices; 3) forecaster training and education; and 4) public education.

Full access
Paul R. Julian and Allan H. Murphy

Recent developments in six areas of statistical meteorology are described and the importance of interdisciplinary research in these areas is indicated. The areas are stochastic-dynamic prediction, assimilation of observed data, time-series analysis, statistical weather forecasting, probability forecasting, and precipitation modification statistics. Several problems are identified within each area and the need for further interdisciplinary research to solve these problems is emphasized.

Full access
Allan H. Murphy and Daniel S. Wilks

Abstract

The traditional approach to forecast verification consists of computing one, or at most very few, quantities from a set of forecasts and verifying observations. However, this approach necessarily discards a large portion of the information regarding forecast quality that is contained in a set of forecasts and observations. Theoretically sound alternative verification approaches exist, but these often involve computation and examination of many quantities in order to obtain a complete description of forecast quality and, thus, pose difficulties in interpretation. This paper proposes and illustrates an intermediate approach to forecast verification, in which the multifaceted nature of forecast quality is recognized but the description of forecast quality is encapsulated in a much smaller number of parameters. These parameters are derived from statistical models fit to verification datasets. Forecasting performance as characterized by the statistical models can then be assessed in a relatively complete manner. In addition, the fitted statistical models provide a mechanism for smoothing sampling variations in particular finite samples of forecasts and observations.

This approach to forecast verification is illustrated by evaluating and comparing selected samples of probability of precipitation (PoP) forecasts and the matching binary observations. A linear regression model is fit to the conditional distributions of the observations given the forecasts and a beta distribution is fit to the frequencies of use of the allowable probabilities. Taken together, these two models describe the joint distribution of forecasts and observations, and reduce a 21-dimensional verification problem to 4 dimensions (two parameters each for the regression and beta models). Performance of the selected PoP forecasts is evaluated and compared across forecast type, location, and lead time in terms of these four parameters (and simple functions of the parameters), and selected graphical displays are explored as a means of obtaining relatively transparent views of forecasting performance within this approach to verification.
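
A rough sketch of the modeling step described above is given below, using synthetic PoP forecasts and observations rather than the datasets analyzed in the paper; the method-of-moments beta fit, the set of allowable probabilities, and the variable names are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

# Synthetic verification data: PoP forecasts on a discrete probability scale
# and matching binary observations (generated here to be reliable).
rng = np.random.default_rng(0)
forecasts = rng.choice(np.arange(0.0, 1.05, 0.1), size=500)
observations = (rng.random(500) < forecasts).astype(float)

# Model 1: linear regression of the observations on the forecasts, summarizing
# the conditional distributions of the observations given the forecasts
# (two parameters: intercept and slope).
slope, intercept, *_ = stats.linregress(forecasts, observations)

# Model 2: beta distribution summarizing the frequencies of use of the
# allowable probabilities (two parameters: a and b), fitted here by the
# method of moments to avoid boundary problems with forecasts of exactly 0 or 1.
m, v = forecasts.mean(), forecasts.var()
common = m * (1 - m) / v - 1
a, b = m * common, (1 - m) * common

print(f"regression: intercept = {intercept:.2f}, slope = {slope:.2f}")
print(f"beta fit:   a = {a:.2f}, b = {b:.2f}  (mean forecast = {m:.2f})")
```

Together the two fitted models summarize the joint distribution of forecasts and observations with four parameters, in the spirit of the dimension reduction described above.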

Full access
Allan H. Murphy and Thomas E. Sabin

Abstract

This paper describes the results of a study of trends in the quality of National Weather Service (NWS) forecasts from 1967 to 1985. Primary attention is focused on forecasts of precipitation probabilities, maximum temperatures, and minimum temperatures. A skill score based on the Brier score is used to verify the precipitation probability forecasts, whereas the temperature forecasts are evaluated using the mean absolute error and percentage of errors greater than 10°F. For each element, trends are examined for objective forecasts produced by numerical-statistical models and for subjective forecasts formulated by NWS forecasters. In addition to weather element, type of forecast, and verification measure, results are stratified by season (cool and warm), lead time (three or four periods), and NWS region (four regions and all regions combined).
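
For reference, the verification measures named above can be computed as in the sketch below; the numbers are made up, and sample climatology is assumed as the reference forecast for the skill score.

```python
import numpy as np

# Illustrative precipitation probability forecasts and occurrences.
pop_fcst = np.array([0.1, 0.3, 0.7, 0.9, 0.2, 0.6])
pop_obs  = np.array([0,   0,   1,   1,   0,   1  ])

bs      = np.mean((pop_fcst - pop_obs) ** 2)        # Brier score of the forecasts
bs_clim = np.mean((pop_obs.mean() - pop_obs) ** 2)  # Brier score of sample climatology
skill   = 1.0 - bs / bs_clim                        # skill score based on the Brier score

# Illustrative maximum temperature forecasts and observations (°F).
tmax_fcst = np.array([72, 65, 80, 58, 91])
tmax_obs  = np.array([70, 69, 95, 57, 90])
abs_err   = np.abs(tmax_fcst - tmax_obs)
mae       = abs_err.mean()                          # mean absolute error
pct_gt_10 = 100.0 * np.mean(abs_err > 10)           # percentage of errors > 10°F

print(f"Brier skill score: {skill:.2f}")
print(f"MAE: {mae:.1f} °F, errors > 10°F: {pct_gt_10:.0f}%")
```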

At the national level, the forecasts for these three weather elements exhibit positive and highly significant trends in quality for almost all of the various stratifications. Exceptions to this general result are associated solely with the minimum temperature forecasts, primarily for the 60 h lead time. These national trends are generally stronger for the objective forecasts than for the subjective forecasts and for the cool season than for the warm season. Regionally, the trends in quality are almost always positive and are statistically significant in a majority of the cases. However, nonsignificant trends occur more frequently at the regional level than at the national level. As a result of the positive trends in performance, current levels of forecast quality for these weather elements are markedly higher than the levels that existed 15–20 years ago.

Full access