# Search Results

## You are looking at 1 - 10 of 16 items for

- Author or Editor: Robert L. Winkler

## Abstract

A general framework for forecast verification based on the joint distribution of forecasts and observations is described. For further elaboration of the framework, two factorizations of the joint distribution are investigated: 1) the calibration-refinement factorization, which involves the conditional distributions of observations given forecasts and the marginal distribution of forecasts, and 2) the likelihood-base rate factorization, which involves the conditional distributions of forecasts given observations and the marginal distribution of observations. The names given to the factorizations reflect the fact that they relate to different attributes of the forecasts and/or observations. Several examples are used to illustrate the interpretation of these factorizations in the context of verification and to describe the relationship between the respective factorizations.

Some insight into the potential utility of the framework is provided by demonstrating that basic elements and summary measures of the joint, conditional, and marginal distributions play key roles in current verification methods. The need for further investigation of the implications of this framework for verification theory and practice is emphasized, and some possible directions for future research in this area are identified.
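As a rough illustration of the two factorizations described in this abstract, the sketch below starts from a small joint distribution p(f, x) of forecasts f and observations x and derives the marginal and conditional distributions that each factorization uses. The function name `factorize` and the numbers in `joint` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def factorize(joint):
    """Split a joint distribution p(f, x) into both factorizations.

    `joint` maps (forecast, observation) pairs to probabilities.
    Every forecast and observation value is assumed to have
    positive marginal probability.
    """
    p_f = defaultdict(float)  # marginal distribution of forecasts
    p_x = defaultdict(float)  # marginal distribution of observations
    for (f, x), p in joint.items():
        p_f[f] += p
        p_x[x] += p

    # First factorization: p(f, x) = p(x | f) * p(f)
    p_x_given_f = {(f, x): p / p_f[f] for (f, x), p in joint.items()}
    # Second factorization: p(f, x) = p(f | x) * p(x)
    p_f_given_x = {(f, x): p / p_x[x] for (f, x), p in joint.items()}
    return dict(p_f), dict(p_x), p_x_given_f, p_f_given_x


# Hypothetical joint distribution: two forecast values (0.2 and 0.8)
# and a binary observation (0 = no event, 1 = event).
joint = {
    (0.2, 0): 0.40, (0.2, 1): 0.10,
    (0.8, 0): 0.10, (0.8, 1): 0.40,
}
p_f, p_x, p_x_given_f, p_f_given_x = factorize(joint)

# In this made-up example p(x = 1 | f) equals f for both forecast values,
# which is what a perfectly calibrated set of forecasts would look like.
for f in sorted(p_f):
    print(f, round(p_x_given_f[(f, 1)], 3))
```

Multiplying either conditional distribution by its matching marginal recovers the same joint distribution, which is the sense in which the two factorizations carry the same information arranged differently.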

## Abstract

An experiment was conducted at the National Severe Storms Forecast Center during 1976 and 1977 in which National Weather Service forecasters formulated probabilistic forecasts of several tornado events in conjunction with both severe weather outlooks and severe thunderstorm and tornado watches. The results indicate that the probabilistic forecasts associated with the outlooks were quite reliable and exhibited positive skill, relative to forecasts based on sample climatological probabilities. The probabilistic forecasts associated with the watches, however, were less reliable and skillful. In view of the lack of prior experience at making probabilistic tornado forecasts, as well as the absence of feedback, comparable objective probabilistic guidance, and even appropriate past data on which to base climatological probabilities, the results of the experiment are quite encouraging. Some suggestions for further work in probabilistic tornado forecasting are provided.
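The abstract does not say which measure of skill was used. As a minimal sketch, assuming the Brier score as the accuracy measure and the sample relative frequency as the climatological reference forecast, skill relative to climatology could be computed as follows. The function names and the data are hypothetical.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes):
    """Skill relative to a constant forecast of the sample base rate.

    Positive values mean the forecasts beat sample climatology,
    zero means no improvement, negative values mean they were worse.
    """
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    if reference == 0:
        raise ValueError("all outcomes are identical; skill is undefined")
    return 1.0 - brier_score(probs, outcomes) / reference


# Hypothetical forecasts and outcomes (1 = event occurred).
probs = [0.1, 0.1, 0.6, 0.1, 0.7, 0.2, 0.1, 0.1]
outcomes = [0, 0, 1, 0, 1, 0, 0, 0]
print(brier_skill_score(probs, outcomes))
```

A reference built from the sample itself is a demanding baseline, because the forecasters did not know the sample frequency in advance.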

## Abstract

An experiment was conducted at the National Weather Service Forecast Office in St. Louis, Mo., to investigate the ability of forecasters to differentiate among different points in a forecast area with regard to the likelihood of the occurrence of measurable precipitation and the relative ability of forecasters to make point and area precipitation probability forecasts. On each forecasting occasion in the experimental period (November 1972–March 1973), the forecasters made an average point probability forecast for the St. Louis metropolitan area, point probability forecasts for five specific points in the area, an area probability forecast, and an expected areal coverage forecast.

The results indicate that the forecasters did not differentiate among the five points very often, but that this absence of differences among the point probabilities was justified by the lack of variability exhibited by the observations of precipitation occurrence at these points during the experimental period. Evaluations of the average point probability forecasts, individual point probability forecasts, and expected areal coverage forecasts reveal that these forecasts were quite reliable and accurate and that they were also internally consistent. The area probability forecasts, however, tended not to be consistent with the other forecasts, and the average area probability forecast was considerably lower than the relative frequency of occurrence of precipitation “somewhere in the area.”

The implications of these results for precipitation probability forecasting in meteorology are briefly discussed.
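One way to read "internally consistent" is sketched below under two stated assumptions. First, precipitation at any single point implies precipitation somewhere in the area, so the area probability cannot be smaller than any point probability. Second, if the expected areal coverage is interpreted as conditional on precipitation occurring somewhere in the area (an assumption here, since the abstract does not define it), the average point probability should be close to the area probability multiplied by the expected coverage. The function name, the tolerance, and the numbers are hypothetical.

```python
def check_consistency(point_probs, area_prob, expected_coverage, tol=0.05):
    """Compare one occasion's point, area, and coverage forecasts.

    point_probs:       probabilities of precipitation at individual points
    area_prob:         probability of precipitation somewhere in the area
    expected_coverage: expected fraction of the area receiving precipitation,
                       assumed conditional on precipitation occurring somewhere
    """
    avg_point = sum(point_probs) / len(point_probs)
    implied_avg_point = area_prob * expected_coverage
    return {
        "area_ge_max_point": area_prob >= max(point_probs),
        "avg_point": avg_point,
        "implied_avg_point": implied_avg_point,
        "product_rule_ok": abs(avg_point - implied_avg_point) <= tol,
    }


# Hypothetical occasion on which the forecasts agree with each other.
consistent = check_consistency([0.4, 0.4, 0.4, 0.4, 0.4], 0.5, 0.8)
# Hypothetical occasion on which the area probability is too low.
inconsistent = check_consistency([0.4, 0.4, 0.4, 0.4, 0.4], 0.3, 0.8)
print(consistent["area_ge_max_point"], inconsistent["area_ge_max_point"])
```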

## Abstract

This paper describes the results of an experiment involving credible interval temperature forecasts. A credible interval is an interval of values of the variable of concern, in this case maximum or minimum temperature, accompanied by a probability which expresses a forecaster's “degree of belief” that the temperature will fall in the given interval. The experiment was designed to investigate the ability of forecasters to express the uncertainty inherent in their temperature forecasts in probabilistic terms and to compare two approaches (variable-width and fixed-width intervals) to credible interval temperature forecasting.

Four experienced weather forecasters participated in the experiment, which was conducted at the National Weather Service Forecast Office in Denver, Colorado. Two forecasters made variable-width, fixed-probability forecasts using 50% and 75% intervals, while the other two forecasters made fixed-width, variable-probability forecasts using 5°F and 9°F intervals. On each occasion the forecasters first determined a median, and the variable-width and fixed-width intervals were then centered at the median in terms of probability and width, respectively.

The results indicate that, overall, the medians determined by the forecasters were good point forecasts of maximum and minimum temperatures. Further, a comparison of the average errors for the forecasters’ medians with the average errors for the medians derived from climatology reveals that the forecasters were able to improve greatly upon climatology. The variable-width credible intervals were very reliable in the sense that the observed relative frequencies corresponded very closely to the forecast probabilities. Moreover, the variable-width intervals were more reliable and much more precise than the corresponding climatological forecasts. The fixed-width intervals, on the other hand, were assigned probabilities that were, on the average, larger than the corresponding relative frequencies.

In summary, the results indicate that weather forecasters can use credible intervals to describe the uncertainty contained in their temperature forecasts. The implications of these experimental results for probability forecasting in general and temperature forecasting in particular are discussed.
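The notion of reliability used in this abstract, agreement between stated probabilities and observed relative frequencies, can be sketched as follows. The helper names and all temperatures, intervals, and probabilities are hypothetical and are not data from the experiment.

```python
def hit_frequency(intervals, observations):
    """Fraction of occasions on which the observation fell inside the interval."""
    hits = sum(lo <= obs <= hi for (lo, hi), obs in zip(intervals, observations))
    return hits / len(observations)


def mean_width(intervals):
    """Average interval width, a simple measure of precision."""
    return sum(hi - lo for lo, hi in intervals) / len(intervals)


def fixed_width_intervals(medians, width):
    """Intervals of a given width centered on each median."""
    return [(m - width / 2, m + width / 2) for m in medians]


# Variable-width case: hypothetical 50% intervals in degrees Fahrenheit.
var_intervals = [(60, 66), (55, 60), (70, 75), (48, 54),
                 (62, 67), (58, 62), (65, 71), (50, 55)]
var_obs = [63, 61, 72, 47, 64, 63, 68, 56]
var_hits = hit_frequency(var_intervals, var_obs)  # compare with the stated 0.50

# Fixed-width case: hypothetical 5 degree intervals with stated probabilities.
fix_intervals = fixed_width_intervals([62, 58, 70, 51], width=5)
fix_obs = [63, 62, 71, 55]
stated = [0.6, 0.7, 0.5, 0.8]
# A positive gap means the stated probabilities exceed the hit frequency.
gap = sum(stated) / len(stated) - hit_frequency(fix_intervals, fix_obs)
print(var_hits, mean_width(var_intervals), gap)
```

For variable-width intervals, reliability means the hit frequency is close to the fixed probability. For fixed-width intervals, it means the average stated probability is close to the hit frequency, so a positive `gap` corresponds to the overstatement reported for the fixed-width forecasts.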

## Abstract

Proper scoring rules, such as the probability score, are based (in part) upon the assumption that the *assessor's* utility function is linearly related to the score. The effects of two nonlinear utility functions, one representing a “risk-taker” and one representing a “risk-avoider,” on an assessor's probability forecasts are considered. The results indicate that factors other than the expected score, e.g., the variance of the score, may be relevant for probability assessment. In general, a “risk-taker” “hedges” toward a categorical forecast, while a “risk-avoider” “hedges” away from a categorical forecast. The implications of these results for the process of probability assessment are briefly discussed.
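The hedging effect described in this abstract can be reproduced numerically with a small sketch. It assumes a positively oriented quadratic score for a binary event and uses two illustrative utility functions, a convex one for the "risk-taker" and a concave one for the "risk-avoider". These particular functions, the grid search, and all names are assumptions made for illustration and are not necessarily the ones studied in the paper.

```python
import math


def quadratic_score(report, outcome):
    """Positively oriented quadratic score in [0, 1] for a binary event."""
    return 1.0 - (outcome - report) ** 2


def best_report(belief, utility, steps=1000):
    """Reported probability that maximizes expected utility of the score.

    `belief` is the assessor's actual probability for the event;
    the search runs over an evenly spaced grid on [0, 1].
    """
    def expected_utility(report):
        return (belief * utility(quadratic_score(report, 1))
                + (1 - belief) * utility(quadratic_score(report, 0)))

    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=expected_utility)


def linear(score):
    return score  # utility proportional to the score


def risk_taker(score):
    return score ** 2  # convex utility (illustrative choice)


def risk_avoider(score):
    return math.sqrt(score)  # concave utility (illustrative choice)


belief = 0.6
for name, u in [("linear", linear), ("risk-taker", risk_taker),
                ("risk-avoider", risk_avoider)]:
    print(name, best_report(belief, u))
```

With a belief of 0.6, the linear utility leads to an honest report of 0.6, the convex utility pushes the report toward 1 (about 0.8 for this choice of function), and the concave utility pulls it back toward 0.5, which matches the direction of hedging stated in the abstract.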

## Abstract

Since a meteorologist's predictions are subjective, a framework for the evaluation of meteorological probability assessors must be consistent with the theory of subjective probability. Such a framework is described in this paper. First, two standards of “goodness,” one based upon normative considerations and one based upon substantive considerations, are proposed. Specific properties which a meteorologist's assessments should possess are identified for each standard. Then, several measures of “goodness,” or scoring rules, which indicate the extent to which such assessments possess certain properties, are described. Finally, several important uses of these scoring rules are considered.

## Abstract

This paper describes the preliminary results of three experiments in subjective probability forecasting which were recently conducted in four Weather Service Forecast Offices (WSFOs) of the National Weather Service. The first experiment, which was conducted at the St. Louis WSFO, was designed to investigate both the ability of forecasters to differentiate among points in a forecast area with regard to the likelihood of occurrence of measurable precipitation and their relative ability to make point and area (including areal coverage) precipitation probability forecasts. The second experiment, which was conducted at the Denver WSFO, was designed to investigate the ability of forecasters to use credible intervals to express the uncertainty inherent in their temperature forecasts and to compare two approaches (variable-width intervals and fixed-width intervals) to credible interval temperature forecasting. The third experiment, which was conducted at both the Great Falls and Seattle WSFOs, was designed to investigate the effects of guidance (i.e., PEATMOS) forecasts upon the forecasters' precipitation probability forecasts.

For each experiment, some background material is presented; the design of the experiment is discussed; some preliminary results of the experiment are presented; and some implications of the experiment and the results for probability forecasting in meteorology and probability forecasting in general are discussed. The results of each of these experiments will be described individually and in much greater detail in a series of forthcoming papers.

## Abstract

Some results of a nationwide survey of National Weather Service forecasters with regard to probability forecasting in general and precipitation probability forecasting in particular are summarized. Specifically, the questionnaire which was used in the survey, the participants in the survey (i.e., the forecasters), and the nature of the results are briefly described, and some recommendations based upon these results are presented.

## Abstract

This paper summarizes the responses of the forecasters of the Travelers Weather Service to a questionnaire concerning probability forecasting. The questionnaire was designed to elicit information from the forecasters relative to the process of precipitation probability forecasting (e.g., the information sources examined, their relative importance and order of examination), the relationship between judgments and forecasts, the effect of the definition of precipitation on the forecasts, the meaning of the forecasts, the effects of feedback and experience on the forecasts, and related matters. The responses to the questionnaire and subsequent discussions with Travelers Weather Service and National Weather Service forecasters suggest the presence of a number of “problems” related to probability forecasting. Several of these “problems” are considered in some detail in a separate paper (Murphy and Winkler, 1971).
