Search Results

You are looking at 1–10 of 12 items for:

  • Model performance/evaluation
  • Spatial Forecast Verification Methods Inter-Comparison Project (ICP)
  • All content
Valliappa Lakshmanan and John S. Kain

different verification methods provide information on location, intensity, and structure errors, as well as on model performance at different scales. To enable reasonable comparison, the verification methods were carried out on synthetic and real fields with known errors. The methods were also applied to a common dataset used in a subjective model evaluation experiment. The results of the GMM approach on the different datasets that were created by the intercomparison project are presented below.
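The GMM approach referred to here summarizes each precipitation field with a small number of Gaussian components, which can then be compared between forecast and observation to diagnose location and structure errors. The snippet below is a minimal sketch of that general idea using scikit-learn; the intensity-weighted sampling, the number of components, and the comparison of component means are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_field_gmm(field, n_components=3, n_samples=5000, rng=None):
    """Fit a GMM to a 2-D precipitation field by sampling grid-point
    locations with probability proportional to intensity (an assumption).
    Assumes the field contains some nonzero precipitation."""
    rng = np.random.default_rng(rng)
    ny, nx = field.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    weights = field.ravel().clip(min=0).astype(float)
    weights /= weights.sum()
    idx = rng.choice(field.size, size=n_samples, p=weights)
    samples = np.column_stack([xx.ravel()[idx], yy.ravel()[idx]])
    return GaussianMixture(n_components=n_components,
                           covariance_type="full", random_state=0).fit(samples)

# usage: compare fitted components between the two fields
# gmm_f, gmm_o = fit_field_gmm(forecast), fit_field_gmm(observation)
# displacement = gmm_f.means_ - gmm_o.means_   # after matching components
```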

Christopher A. Davis, Barbara G. Brown, Randy Bullock, and John Halley-Gotway

for purposes of this study, they are considered independent models. These are described further in section 3. Results from the MODE-based evaluation will be compared with subjective impressions of differences in forecast quality. Following a description of the object identification, matching, and merging procedures that MODE comprises, we will present results from idealized cases where objects have simple geometric properties and differences between synthetic forecast and observed features are
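MODE's object identification rests on convolving (smoothing) the raw field and thresholding it; connected regions above the threshold become objects, inside which the raw values are restored. The sketch below shows only that identification step, with an illustrative filter size and threshold; MODE's fuzzy-logic matching and merging of objects is not reproduced here.

```python
import numpy as np
from scipy.ndimage import uniform_filter, label

def identify_objects(field, filter_size=5, threshold=5.0):
    """Convolution-threshold object identification in the spirit of MODE.
    Returns labeled objects and the raw field masked to those objects."""
    smoothed = uniform_filter(field.astype(float), size=filter_size)
    mask = smoothed >= threshold
    labels, n_objects = label(mask)           # connected components
    masked_raw = np.where(mask, field, 0.0)   # restore raw values inside objects
    return labels, n_objects, masked_raw
```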

Keith F. Brill and Fedor Mesinger

1. Introduction This note extends the analysis of Brill (2009) to include new performance measures, bias-adjusted threat and equitable threat scores, derived by Mesinger (2008). The work of Mesinger (2008) was motivated by heuristic evidence of frequently misleading bias sensitivities for the threat score (TS) and equitable threat score (ETS), such as shown by Baldwin and Kain (2006) using a geometrical model. Specifically, this sensitivity was demonstrated to undermine the presumably
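For reference, the unadjusted scores discussed here follow from the 2x2 contingency table (hits a, false alarms b, misses c, correct negatives d). The snippet below computes TS and ETS; the bias-adjusted variants of Mesinger (2008) replace the hit count with an estimate of the hits the forecast would produce at unit bias and are not reproduced here.

```python
def threat_scores(a, b, c, d):
    """Threat score and equitable threat score from contingency-table counts:
    a = hits, b = false alarms, c = misses, d = correct negatives."""
    n = a + b + c + d
    ts = a / (a + b + c)
    a_random = (a + b) * (a + c) / n              # hits expected by chance
    ets = (a - a_random) / (a + b + c - a_random)
    return ts, ets
```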

David Ahijevych, Eric Gilleland, Barbara G. Brown, and Elizabeth E. Ebert

excellent. For fairness, the models were ordered randomly and not labeled. b. Traditional scores and subjective evaluation The panel’s subjective scores are alternative viewpoints, not definitive assessments of forecast performance. The evaluators were not asked to consider the usefulness of the forecasts from the standpoint of any particular user (e.g., water manager, farmer, SPC forecaster) or to focus on a particular region, but to subjectively evaluate the forecast as a whole. Afterward, several of

Elizabeth E. Ebert

-based rainfall analysis (Lin and Mitchell 2005), which provided the observational data. The overall performance for each of the three models was computed by aggregating its neighborhood verification results over the nine 2005 Spring Program cases. The aggregation was done by summing the contingency table elements (upscaling, MECT, and PP methods) or squared errors (FSS) for all of the cases and computing the scores from the summed components. As in any systematic verification, aggregation over many cases
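The aggregation described here operates on score components rather than on the scores themselves: contingency-table counts, or the numerator and reference denominator of the fractions skill score, are summed over all cases and the score is computed once from the totals. A minimal sketch, assuming the standard FSS definition of Roberts and Lean (2008):

```python
import numpy as np

def aggregate_fss(cases):
    """Aggregate the fractions skill score over many cases by summing the
    squared-error numerator and the reference denominator before dividing.
    `cases` is an iterable of (forecast_fraction, observed_fraction) grids."""
    num = 0.0   # sum of squared differences of neighborhood fractions
    den = 0.0   # reference: sum of squared fractions
    for pf, po in cases:
        num += np.sum((pf - po) ** 2)
        den += np.sum(pf ** 2) + np.sum(po ** 2)
    return 1.0 - num / den
```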

Eric Gilleland, Johan Lindström, and Finn Lindgren

deformations. Subsequently, as will be seen in this section, the warp function utilizes a conditional likelihood of the verification field given the forecast field that is consistent with the general framework for forecast verification proposed by Murphy and Winkler (1987). The stochastic model is described in section 2a, and an estimation procedure for the model parameters is described in section 2b, followed by specific considerations for obtaining useful information about forecast performance in
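The structure of the estimation problem sketched in this excerpt is a likelihood of the verification field given a warped forecast, combined with a penalty that discourages unrealistic deformations. The toy example below uses a small affine warp, a Gaussian likelihood, and a quadratic penalty; these are stand-ins for the authors' spline-based formulation and are meant only to show the shape of the objective.

```python
import numpy as np
from scipy.ndimage import map_coordinates
from scipy.optimize import minimize

def warp_forecast(fcst, params):
    """Apply a small affine warp (hypothetical parameterization) to the forecast."""
    a, b, c, d, tx, ty = params
    ny, nx = fcst.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    # map each output pixel back to a source location in the forecast
    src_y = (1.0 + a) * yy + b * xx + ty
    src_x = c * yy + (1.0 + d) * xx + tx
    return map_coordinates(fcst, [src_y, src_x], order=1, mode="nearest")

def objective(params, fcst, obs, sigma=1.0, lam=100.0):
    """Negative Gaussian log-likelihood of obs given the warped forecast,
    plus a quadratic penalty keeping the warp close to the identity."""
    warped = warp_forecast(fcst, params)
    nll = 0.5 * np.sum((obs - warped) ** 2) / sigma ** 2
    penalty = lam * np.sum(np.asarray(params) ** 2)
    return nll + penalty

# usage: fcst and obs are 2-D precipitation grids of equal shape
# res = minimize(objective, x0=np.zeros(6), args=(fcst, obs), method="Nelder-Mead")
```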

Elizabeth E. Ebert and William A. Gallus Jr.

method. The next three sections investigate the ability of the CRA method to diagnose errors in three sets of continental-scale rainfall forecasts of increasing complexity: idealized geometric forecasts, perturbed “forecasts” to which known errors were applied, and NWP forecasts from three configurations of the Weather Research and Forecasting (WRF) model. The paper concludes with recommendations on the best use of the CRA method. 2. CRA verification method The CRA method was developed to evaluate
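A CRA verification shifts the forecast entity to the position that best matches the observations and then splits the original total error into displacement, volume, and pattern parts. The sketch below applies that decomposition to whole grids using integer, wrap-around shifts; in the method itself the search is restricted to a contiguous rain area and wrap-around is avoided, so treat this as an illustration of the decomposition only.

```python
import numpy as np

def cra_decomposition(fcst, obs, max_shift=10):
    """CRA-style error decomposition (simplified: integer shifts of the whole
    grid with wrap-around via np.roll)."""
    mse_total = np.mean((fcst - obs) ** 2)
    best, best_mse = (0, 0), mse_total
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            mse = np.mean((np.roll(fcst, (dy, dx), axis=(0, 1)) - obs) ** 2)
            if mse < best_mse:
                best_mse, best = mse, (dy, dx)
    shifted = np.roll(fcst, best, axis=(0, 1))
    mse_shifted = np.mean((shifted - obs) ** 2)
    mse_displacement = mse_total - mse_shifted          # removed by the best shift
    mse_volume = (shifted.mean() - obs.mean()) ** 2     # mean (volume) difference
    mse_pattern = mse_shifted - mse_volume              # remaining pattern error
    return {"displacement": mse_displacement, "volume": mse_volume,
            "pattern": mse_pattern, "shift": best}
```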

Eric Gilleland, David Ahijevych, Barbara G. Brown, Barbara Casati, and Elizabeth E. Ebert

evaluating the capabilities of new verification approaches, the issue of hedging (forecasting other than one’s true belief in order to improve the verification score) is generally also of concern. However, given that most of the new methods considered here are advanced diagnostic techniques for investigating high-resolution spatial forecasts, it is unlikely (but not impossible) that one would tune a model to obtain the best performance by hedging. For example, for a traditional verification metric such

Jason E. Nachamkin

calculations are very effective at revealing these errors. 4. Model intercomparison The final experiment consisted of a series of 1-h precipitation forecasts from three configurations of the Weather Research and Forecasting (WRF) model (WRF2CAPS, WRF4NCAR, and WRF4NCEP). This experiment is somewhat different from the idealized experiments because each WRF configuration was evaluated from a series of forecasts as opposed to individual cases. The native grid spacing for WRF4NCAR and WRF4NCEP was 4 km, while it

Heini Wernli, Christiane Hofmann, and Matthias Zimmer

of these models was run without a parameterization of deep convection. For a discussion of further details of these model versions, the reader is referred to Kain et al. (2008) and AGBE. Hourly accumulated QPFs from these models and radar-derived observational datasets have been available for 9 days in April–June 2005. These small samples do not allow for a systematic (statistical) evaluation of the models’ levels of SAL performance. Therefore, in this study the aim is not to identify the
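For context, the amplitude and first location components of SAL can be written down compactly: A is the normalized difference of the domain-mean precipitation (bounded by ±2), and L1 is the distance between the fields' centers of mass scaled by the largest distance across the domain. A minimal sketch of those two pieces, assuming nonzero precipitation in both fields (the object-based structure component S is omitted):

```python
import numpy as np

def sal_amplitude_location(fcst, obs):
    """Amplitude component A and first location component L1 of SAL.
    The object-based structure component S is not included here."""
    d_f, d_o = fcst.mean(), obs.mean()
    amplitude = (d_f - d_o) / (0.5 * (d_f + d_o))   # bounded by [-2, 2]

    def center_of_mass(field):
        ny, nx = field.shape
        yy, xx = np.mgrid[0:ny, 0:nx]
        total = field.sum()
        return np.array([(yy * field).sum() / total, (xx * field).sum() / total])

    ny, nx = obs.shape
    d_max = np.hypot(ny - 1, nx - 1)                # largest distance in the domain
    l1 = np.linalg.norm(center_of_mass(fcst) - center_of_mass(obs)) / d_max
    return amplitude, l1
```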
