Search Results

You are looking at 1 - 10 of 34 items for

  • Author or Editor: David B. Stephenson
  • All content
David B. Stephenson

Abstract

This study investigates ways of quantifying the skill in forecasts of dichotomous weather events. The odds ratio, widely used in medical studies, can provide a powerful way of testing the association between categorical forecasts and observations. A skill score can be constructed from the odds ratio that is less sensitive to hedging than previously used scores. Furthermore, significance tests can easily be performed on the logarithm of the odds ratio to test whether the skill is purely due to chance sampling. Functions of the odds ratio and the Peirce skill score define a general class of skill scores that are symmetric with respect to taking the complement of the event. The study illustrates the ideas using Finley’s classic set of tornado forecasts.
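
As a rough illustration of the quantities described above, the sketch below computes the odds ratio, its logarithm with the usual asymptotic standard error, and the odds-ratio-based skill score (Yule's Q) for a 2 x 2 contingency table, using the counts for Finley's tornado forecasts as they are commonly quoted in the verification literature. The function name and layout are illustrative, not code from the paper.

    import math

    def odds_ratio_skill(a, b, c, d):
        # a = hits, b = false alarms, c = misses, d = correct negatives
        theta = (a * d) / (b * c)              # odds ratio
        log_theta = math.log(theta)            # log odds ratio (approximately normal for large samples)
        se = math.sqrt(1/a + 1/b + 1/c + 1/d)  # asymptotic standard error of the log odds ratio
        q = (theta - 1) / (theta + 1)          # odds-ratio-based skill score (Yule's Q)
        return theta, log_theta, se, q

    # Finley's tornado forecast counts as commonly quoted in the verification literature
    theta, log_theta, se, q = odds_ratio_skill(a=28, b=72, c=23, d=2680)
    print(f"odds ratio = {theta:.1f}, odds ratio skill score = {q:.3f}")
    print(f"z = {log_theta / se:.1f}  (values well above 2 argue against skill arising purely by chance)")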

Full access
Christopher A. T. Ferro and David B. Stephenson

Abstract

Verifying forecasts of rare events is challenging, in part because traditional performance measures degenerate to trivial values as events become rarer. The extreme dependency score was proposed recently as a nondegenerating measure for the quality of deterministic forecasts of rare binary events. This measure has some undesirable properties, including being both easy to hedge and dependent on the base rate. A symmetric extreme dependency score was also proposed recently, but this too is dependent on the base rate. These two scores and their properties are reviewed, and the meanings of several properties that have caused confusion, such as base-rate dependence and complement symmetry, are clarified. Two modified versions of the extreme dependency score, the extremal dependence index and the symmetric extremal dependence index, are then proposed and are shown to overcome all of these shortcomings. The new measures are nondegenerating, base-rate independent, asymptotically equitable, harder to hedge, and have regular isopleths that correspond to symmetric and asymmetric relative operating characteristic curves.
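
A short sketch of the two new measures, written with the formulas as they are usually quoted in the verification literature (hit rate H and false alarm rate F from a 2 x 2 table); the function below is illustrative and is not code from the paper.

    import math

    def extremal_dependence_indices(a, b, c, d):
        # a = hits, b = false alarms, c = misses, d = correct negatives
        H = a / (a + c)          # hit rate
        F = b / (b + d)          # false alarm rate
        # Extremal dependence index (EDI) and symmetric extremal dependence index (SEDI),
        # as the formulas are commonly quoted; both require 0 < H < 1 and 0 < F < 1.
        edi = (math.log(F) - math.log(H)) / (math.log(F) + math.log(H))
        sedi = (math.log(F) - math.log(H) - math.log(1 - F) + math.log(1 - H)) / \
               (math.log(F) + math.log(H) + math.log(1 - F) + math.log(1 - H))
        return edi, sedi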

Full access
Marion P. Mittermaier and David B. Stephenson

Abstract

Synoptic observations are often treated as error-free representations of the true state of the real world. For example, when observations are used to verify numerical weather prediction (NWP) forecasts, forecast–observation differences (the total error) are often entirely attributed to forecast inaccuracy. Such simplification is no longer justifiable for short-lead forecasts made with increasingly accurate higher-resolution models. For example, at least 25% of t + 6 h individual Met Office site-specific (postprocessed) temperature forecasts now typically have total errors of less than 0.2 K, which are comparable to typical instrument measurement errors of around 0.1 K. In addition to instrument errors, uncertainty is introduced by measurements not being taken concurrently with the forecasts. For example, synoptic temperature observations in the United Kingdom are typically taken 10 min before the hour, whereas forecasts are generally extracted as instantaneous values on the hour. This study develops a simple yet robust statistical modeling procedure for assessing how serially correlated subhourly variations limit the forecast accuracy that can be achieved. The methodology is demonstrated by application to synoptic temperature observations sampled every minute at several locations around the United Kingdom. Results show that subhourly variations lead to sizeable forecast errors of 0.16–0.44 K for observations taken 10 min before the forecast issue time. The magnitude of this error depends on spatial location and the annual cycle, with the greater errors occurring in the warmer seasons and at inland sites. This important source of uncertainty consists of a bias due to the diurnal cycle, plus irreducible uncertainty due to unpredictable subhourly variations that fundamentally limit forecast accuracy.
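
The abstract does not spell out the modeling procedure, so the following is only an illustrative decomposition under the setup it describes: given minute-resolution observations, the difference between the on-the-hour value and the value taken 10 min earlier is split into an hour-of-day mean (the diurnal-cycle bias) and a residual spread (the irreducible subhourly uncertainty). The use of pandas, the function name, and the grouping choice are assumptions, not the authors' method.

    import pandas as pd

    def subhourly_error_components(temps, offset_minutes=10):
        # temps: pandas Series of 1-min temperature observations with a DatetimeIndex.
        # Compare each on-the-hour value with the observation taken offset_minutes earlier.
        on_hour = temps[temps.index.minute == 0]
        shifted = temps.shift(freq=pd.Timedelta(minutes=offset_minutes))  # obs taken 10 min before, re-stamped on the hour
        diff = (on_hour - shifted.reindex(on_hour.index)).dropna()
        bias = diff.groupby(diff.index.hour).mean()    # systematic part: diurnal-cycle bias
        spread = diff.groupby(diff.index.hour).std()   # unpredictable part: irreducible subhourly uncertainty
        return bias, spread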

Full access
Ian T. Jolliffe and David B. Stephenson
Full access
Ian T. Jolliffe and David B. Stephenson

Abstract

Verification is an important part of any forecasting system. It is usually achieved by computing the value of some measure or score that indicates how good the forecasts are. Many possible verification measures have been proposed, and to choose between them, a number of desirable properties have been defined. For probability forecasts of a binary event, two of the best known of these properties are propriety and equitability. A proof that the two properties are incompatible for a wide class of verification measures is given in this paper, after a brief review of the two properties and of some recent attempts to improve the properties of the well-known Brier skill score.
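
For reference, with forecast probabilities p_i, binary outcomes o_i in {0, 1}, and n forecasts, the Brier score and Brier skill score mentioned above are, in the usual notation,

    \mathrm{BS} = \frac{1}{n}\sum_{i=1}^{n}(p_i - o_i)^2, \qquad
    \mathrm{BSS} = 1 - \frac{\mathrm{BS}}{\mathrm{BS}_{\mathrm{ref}}},

where BS_ref is the Brier score of a reference (typically climatological) forecast. In the usual terminology, a score is proper if its expected value is optimized by issuing one's true probability, and equitable if all constant or random forecasts receive the same expected score.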

Full access
David B. Stephenson and Isaac M. Held

Abstract

The response of the Geophysical Fluid Dynamics Laboratory (GFDL) coupled ocean–atmosphere R15, nine-level GCM to gradually increasing CO2 amounts is analyzed with emphasis on the changes in the stationary waves and storm tracks in the Northern Hemisphere wintertime troposphere. A large part of the change is described by an equivalent-barotropic stationary wave with a high over eastern Canada and a low over southern Alaska. Consistent with this, the Atlantic jet weakens near the North American coast.

Perpetual winter runs of an R15, nine-level atmospheric GCM with sea surface temperature, sea ice thickness, and soil moisture values prescribed from the coupled GCM results are able to reproduce the coupled model's response qualitatively. Consistent with the weakened baroclinicity associated with the stationary wave change, the Atlantic storm track weakens with increasing CO2 concentrations while the Pacific storm track does not change in strength substantially.

An R15, nine-level atmospheric model linearized about the zonal time-mean state is used to analyze the contributions to the stationary wave response. With mountains, diabatic heating, and transient forcings, the linear model gives a stationary wave change in qualitative agreement with the change seen in the coupled and perpetual models. Transients and diabatic heating appear to be the major forcing terms, while changes in the zonal-mean basic state and topographic forcing play only a small role. A substantial part of the diabatic response is due to changes in tropical latent heating.

Full access
Thomas J. Bracegirdle and David B. Stephenson

Abstract

Statistical relationships between future and historical model runs in multimodel ensembles (MMEs) are increasingly exploited to make more constrained projections of climate change. However, such emergent constraints may be spurious and can arise because of shared (common) errors in a particular MME or because of overly influential models. This study assesses the robustness of emergent constraints used for Arctic warming by comparison of such constraints in ensembles generated by the two most recent Coupled Model Intercomparison Project (CMIP) experiments: CMIP3 and CMIP5. An ensemble regression approach is used to estimate emergent constraints in Arctic wintertime surface air temperature change over the twenty-first century under the Special Report on Emission Scenarios (SRES) A1B scenario in CMIP3 and the Representative Concentration Pathway (RCP) 4.5 scenario in CMIP5. To take account of different scenarios, this study focuses on polar amplification by using temperature responses at each grid point that are scaled by the global mean temperature response for each climate model. In most locations, the estimated emergent constraints are reassuringly similar in CMIP3 and CMIP5 and differences could have easily arisen from sampling variation. However, there is some indication that the emergent constraint and polar amplification are substantially larger in CMIP5 over the Sea of Okhotsk and the Bering Sea. Residual diagnostics identify one climate model in CMIP5 that has a notable influence on estimated emergent constraints over the Bering Sea and one in CMIP3 that has a notable influence more widely along the sea ice edge and into midlatitudes over the western North Atlantic.
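
A bare-bones sketch of the ensemble-regression idea and the residual diagnostics, assuming the constraint is an ordinary least squares fit, across models, of the scaled future response on a present-day quantity, with Cook's distance used to flag overly influential models. The predictor, function name, and choice of diagnostic are illustrative and not taken from the paper.

    import numpy as np

    def ensemble_regression(x_hist, y_future):
        # x_hist: present-day (historical) quantity for each model at a grid point
        # y_future: scaled future response (e.g., polar amplification) for each model
        x_hist, y_future = np.asarray(x_hist, float), np.asarray(y_future, float)
        X = np.column_stack([np.ones_like(x_hist), x_hist])
        beta, *_ = np.linalg.lstsq(X, y_future, rcond=None)      # OLS fit across the ensemble
        resid = y_future - X @ beta
        n, p = X.shape
        h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)            # leverage of each model
        s2 = resid @ resid / (n - p)
        cooks_d = resid**2 * h / (p * s2 * (1 - h)**2)           # influence diagnostic per model
        return beta, resid, cooks_d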

Full access
Edward C. D. Pope, David B. Stephenson, and David R. Jackson

Abstract

Categorical probabilistic prediction is widely used for terrestrial and space weather forecasting as well as for other environmental forecasts. One example is a warning system for geomagnetic disturbances caused by space weather, which are often classified on a 10-level scale. The simplest approach assumes that the transition probabilities are stationary in time—the homogeneous Markov chain (HMC). We extend this approach by developing a flexible nonhomogeneous Markov chain (NHMC) model using Bayesian nonparametric estimation to describe the time-varying transition probabilities. The transition probabilities are updated using a modified Bayes’s rule that gradually forgets transitions in the distant past, with a tunable memory parameter. The approaches were tested by making daily geomagnetic state forecasts at lead times of 1–4 days and were verified over the period 2000–19 using the rank probability score (RPS). Both HMC and NHMC models were found to be skillful at all lead times when compared with climatological forecasts. The NHMC forecasts with an optimal memory parameter of ~100 days were found to be substantially more skillful than the HMC forecasts, with an RPS skill for the NHMC of 10.5% and 5.6% for lead times of 1 and 4 days ahead, respectively. The NHMC is thus a viable alternative approach for forecasting geomagnetic disturbances and could provide a new benchmark for producing operational forecasts. The approach is generic and is applicable to other forecasts that include discrete weather regimes or hydrological conditions (e.g., wet and dry days).
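
The exact Bayesian nonparametric scheme is not given in the abstract, so the following is only a plausible sketch of the idea: transition counts are discounted before each update so that transitions in the distant past fade with a tunable memory, forecast probabilities come from the discounted counts with a simple Dirichlet-style prior, and forecasts are scored with the rank probability score. The exponential-forgetting form, the prior, and all names are assumptions, not the authors' algorithm.

    import numpy as np

    def update_counts(counts, prev_state, new_state, memory=100.0):
        # counts: k x k matrix of discounted transition counts; memory plays the
        # role of the tunable memory parameter (in days) of the forgetting rule.
        lam = np.exp(-1.0 / memory)          # discount factor: older transitions fade away
        counts *= lam
        counts[prev_state, new_state] += 1.0
        return counts

    def transition_probs(counts, prior=1.0):
        # Posterior-mean transition probabilities with a uniform Dirichlet-style prior.
        c = counts + prior
        return c / c.sum(axis=1, keepdims=True)

    def rank_probability_score(probs, observed):
        # RPS for one categorical forecast: squared differences of the cumulative
        # forecast and observed distributions (some definitions divide by k - 1).
        k = len(probs)
        obs = np.zeros(k)
        obs[observed] = 1.0
        return np.sum((np.cumsum(probs) - np.cumsum(obs))**2)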

Restricted access
Maarten H. P. Ambaum, Brian J. Hoskins, and David B. Stephenson

Abstract

The definition and interpretation of the Arctic oscillation (AO) are examined and compared with those of the North Atlantic oscillation (NAO). It is shown that the NAO reflects the correlations between the surface pressure variability at its centers of action, whereas this is not the case for the AO. The NAO pattern can be identified in a physically consistent way in principal component analysis applied to various fields in the Euro-Atlantic region. A similar identification is found in the Pacific region for the Pacific–North American (PNA) pattern, but no such identification is found here for the AO. The AO does reflect the tendency for the zonal winds at 35° and 55°N to anticorrelate in both the Atlantic and Pacific regions associated with the NAO and PNA. Because climatological features in the two ocean basins are at different latitudes, the zonally symmetric nature of the AO does not mean that it represents a simple modulation of the circumpolar flow. An increase in the AO or NAO implies strong, separated tropospheric jets in the Atlantic but a weakened Pacific jet. The PNA has strong related variability in the Pacific jet exit, but elsewhere the zonal wind is similar to that related to the NAO. The NAO-related zonal winds link strongly through to the stratosphere in the Atlantic sector. The PNA-related winds do so in the Pacific, but to a lesser extent. The results suggest that the NAO paradigm may be more physically relevant and robust for Northern Hemisphere variability than is the AO paradigm. However, this does not disqualify many of the physical mechanisms associated with annular modes for explaining the existence of the NAO.

Full access