Search Results
You are looking at 1 - 10 of 10 items for
- Author or Editor: John Halley Gotway
- Refine by Access: All Content
Abstract
The authors use a procedure called the method for object-based diagnostic evaluation, commonly referred to as MODE, to compare forecasts made from two models representing separate cores of the Weather Research and Forecasting (WRF) model during the 2005 National Severe Storms Laboratory and Storm Prediction Center Spring Program. Both models, the Advanced Research WRF (ARW) and the Nonhydrostatic Mesoscale Model (NMM), were run without a traditional cumulus parameterization scheme on horizontal grid lengths of 4 km (ARW) and 4.5 km (NMM). MODE was used to evaluate 1-h rainfall accumulation from 24-h forecasts valid at 0000 UTC on 32 days between 24 April and 4 June 2005. The primary variable used for evaluation was a “total interest” derived from a fuzzy-logic algorithm that compared several attributes of forecast and observed rain features such as separation distance and spatial orientation. The maximum value of the total interest obtained by comparing an object in one field with all objects in the comparison field was retained as the quality of matching for that object. The median of the distribution of all such maximum-interest values was selected as a metric of the overall forecast quality.
Results from the 32 cases suggest that, overall, the configuration of the ARW model used during the 2005 Spring Program performed slightly better than the configuration of the NMM model. The primary manifestation of the differing levels of performance was fewer false alarms (forecast rain areas with no observed counterpart) in the ARW. However, performance varied considerably from day to day, and on most days the two models were indistinguishable. Thus, a small number of poor NMM forecasts produced the overall difference between the two models.
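As a rough illustration of the matching logic summarized above, the Python sketch below combines two attribute interests with fixed weights and reduces the pairwise total-interest matrix to the median-of-maximum-interest summary. The attribute set, interest maps, and weights here are hypothetical placeholders; in MODE they are configurable and considerably more extensive.

```python
import numpy as np

# Hypothetical interest maps: each maps one attribute difference to [0, 1].
def interest_centroid_distance(d_km, zero_at=600.0):
    """Interest decays linearly from 1 at zero separation to 0 at zero_at km."""
    return max(0.0, 1.0 - d_km / zero_at)

def interest_orientation(dangle_deg, zero_at=90.0):
    """Interest from the difference in object orientation angles."""
    return max(0.0, 1.0 - abs(dangle_deg) / zero_at)

def total_interest(centroid_dist_km, angle_diff_deg, weights=(2.0, 1.0)):
    """Fuzzy-logic combination: weighted mean of the attribute interests."""
    vals = np.array([interest_centroid_distance(centroid_dist_km),
                     interest_orientation(angle_diff_deg)])
    w = np.asarray(weights, dtype=float)
    return float((w * vals).sum() / w.sum())

def median_of_max_interest(pair_attrs, n_fcst, n_obs):
    """pair_attrs[i][j] holds (centroid_dist_km, angle_diff_deg) for forecast
    object i versus observed object j. Keep each object's best (maximum)
    total interest against the other field; return the median of the maxima."""
    ti = np.array([[total_interest(*pair_attrs[i][j]) for j in range(n_obs)]
                   for i in range(n_fcst)])
    maxima = np.concatenate([ti.max(axis=1),   # best match per forecast object
                             ti.max(axis=0)])  # best match per observed object
    return float(np.median(maxima))
```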
Abstract
While traditional verification methods are commonly used to assess numerical model quantitative precipitation forecasts (QPFs) using a grid-to-grid approach, they generally offer little diagnostic information or reasoning behind the computed statistic. On the other hand, advanced spatial verification techniques, such as neighborhood and object-based methods, can provide more meaningful insight into differences between forecast and observed features in terms of skill with spatial scale, coverage area, displacement, orientation, and intensity. To demonstrate the utility of applying advanced verification techniques to mid- and coarse-resolution models, the Developmental Testbed Center (DTC) applied several traditional metrics and spatial verification techniques to QPFs provided by the Global Forecast System (GFS) and operational North American Mesoscale Model (NAM). Along with frequency bias and Gilbert skill score (GSS) adjusted for bias, both the fractions skill score (FSS) and Method for Object-Based Diagnostic Evaluation (MODE) were utilized for this study with careful consideration given to how these methods were applied and how the results were interpreted. By illustrating the types of forecast attributes appropriate to assess with the spatial verification techniques, this paper provides examples of how to obtain advanced diagnostic information to help identify what aspects of the forecast are or are not performing well.
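For readers unfamiliar with the neighborhood approach, a minimal FSS computation might look like the sketch below, assuming forecast and observed precipitation on a common grid. MET's implementation offers far more options for thresholds, neighborhood shapes, and aggregation; this is only the core calculation.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fss(fcst, obs, threshold, window):
    """Fractions skill score for one threshold and square neighborhood.

    fcst, obs : 2-D precipitation grids on a common grid
    threshold : exceedance threshold defining the binary event fields
    window    : neighborhood width in grid points
    """
    bf = (fcst >= threshold).astype(float)
    bo = (obs >= threshold).astype(float)
    # Fraction of event points within each neighborhood.
    pf = uniform_filter(bf, size=window, mode="constant")
    po = uniform_filter(bo, size=window, mode="constant")
    mse = np.mean((pf - po) ** 2)
    mse_ref = np.mean(pf ** 2) + np.mean(po ** 2)
    return 1.0 - mse / mse_ref if mse_ref > 0 else np.nan
```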
Abstract
A well-known problem in high-resolution ensembles has been a lack of sufficient spread among members. Modelers have often used mixed physics to increase spread, but this can introduce problems including computational expense, clustering of members, and members that are not all equally skillful. Thus, a detailed examination of the impacts of using mixed physics is important. The present study uses two years of Community Leveraged Unified Ensemble (CLUE) output to isolate the impact of mixed physics in 36-h forecasts made using a convection-permitting ensemble with 3-km horizontal grid spacing. One 10-member subset of the CLUE used only perturbed initial conditions (ICs) and lateral boundary conditions (LBCs) while another 10-member ensemble used the same mixed ICs and LBCs but also introduced mixed physics. The cases examined occurred during NOAA’s Hazardous Weather Testbed Spring Forecast Experiments in 2016 and 2017. Traditional gridpoint metrics applied to each member and the ensemble as a whole, along with object-based verification statistics for all members, were computed for composite reflectivity and 1- and 3-h accumulated precipitation using the Model Evaluation Tools (MET) software package. It is found that the mixed physics increases variability substantially among the ensemble members, more so for reflectivity than precipitation, such that the envelope of members is more likely to encompass the observations. However, the increased variability is mostly due to the introduction of substantial high biases in members using one microphysical scheme and low biases in members using others. Overall ensemble skill is not substantially different from the ensemble using a single physics package.
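The spread-versus-skill comparison at the heart of this study can be stated compactly; the sketch below computes two standard diagnostics (mean member standard deviation as spread, and RMSE of the ensemble mean as skill) that one could apply to each ensemble subset. The paper's full evaluation uses the MET package; this is only an illustrative shorthand.

```python
import numpy as np

def spread_and_rmse(members, obs):
    """Ensemble spread and skill diagnostics for one verification time.

    members : array of shape (n_members, ny, nx), one forecast grid per member
    obs     : array of shape (ny, nx), the verifying analysis or observation
    """
    ens_mean = members.mean(axis=0)
    spread = members.std(axis=0, ddof=1).mean()          # mean member std dev
    rmse = np.sqrt(np.mean((ens_mean - obs) ** 2))       # error of ens. mean
    return spread, rmse
```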
Abstract
An important focus of research in the forecast verification community is the development of alternative verification approaches for quantitative precipitation forecasts, as well as for other spatial forecasts. The need for information that is meaningful in an operational context and the importance of capturing the specific sources of forecast error at varying spatial scales are two primary motivating factors. In this paper, features of precipitation as identified by a convolution threshold technique are merged within fields and matched across fields in an automatic and computationally efficient manner using Baddeley’s metric for binary images.
The method is carried out on 100 test cases, and 4 representative cases are shown in detail. Results of merging and matching objects are generally positive in that they are consistent with how a subjective observer might merge and match features. The results further suggest that the Baddeley metric may be useful as a computationally efficient summary metric giving information about location, shape, and size differences of individual features, which could be employed for other spatial forecast verification methods.
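A minimal version of the Baddeley metric for two binary images, assuming the common concave cutoff w(t) = min(t, c) and Euclidean distance transforms, might read as follows. The paper's merging and matching procedure builds on this pointwise comparison of distance maps.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def baddeley_delta(a, b, p=2.0, cutoff=None):
    """Baddeley's delta metric between two binary images a and b.

    Distances from every pixel to the nearest "on" pixel of each image are
    compared pointwise; the cutoff w(t) = min(t, c) bounds the influence of
    pixels far from both features.
    """
    # distance_transform_edt measures distance to the nearest zero pixel,
    # so invert the masks to get distance to the nearest feature pixel.
    da = distance_transform_edt(~a.astype(bool))
    db = distance_transform_edt(~b.astype(bool))
    if cutoff is not None:
        da = np.minimum(da, cutoff)
        db = np.minimum(db, cutoff)
    return float(np.mean(np.abs(da - db) ** p) ** (1.0 / p))
```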
Abstract
Tropical cyclone (TC) forecast verification techniques have traditionally focused on track and intensity, as these are some of the most important characteristics of TCs and are often the principal verification concerns of operational forecast centers. However, there is a growing need to verify other aspects of TCs, as process-based validation techniques may be increasingly necessary for further track and intensity forecast improvements as well as for improving communication of the broad impacts of TCs, including inland flooding from precipitation. Here we present a set of TC-focused verification methods available via the Model Evaluation Tools (MET) ranging from traditional approaches to the application of storm-centric coordinates and the use of feature-based verification of spatially defined TC objects. Storm-relative verification using observed and forecast tracks can be useful for identifying model biases in precipitation accumulation in relation to the storm center. Using a storm-centric cylindrical coordinate system based on the radius of maximum wind adds additional storm-relative capabilities to regrid precipitation fields onto cylindrical or polar coordinates. This powerful process-based model diagnostic and verification technique provides a framework for improved understanding of feedbacks between forecast tracks, intensity, and precipitation distributions. Finally, object-based verification, including land-masking capabilities, provides even more nuanced verification options. Precipitation objects of interest, either the central core of TCs or extended areas of rainfall after landfall, can be identified, matched to observations, and quickly aggregated to build meaningful spatial and summary verification statistics.
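A bare-bones version of the storm-centric regridding step, assuming a Cartesian grid and a known storm-center location, could look like the sketch below. The function and parameter names are illustrative, not MET's API, and a production version would work on a lat/lon grid with a meteorological azimuth convention.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def to_storm_polar(precip, center_ij, rmw_gridpts, n_r=20, n_az=36, max_r=5.0):
    """Sample a gridded precipitation field onto storm-centered polar
    coordinates, with radius expressed in multiples of the radius of
    maximum wind (RMW).

    precip      : 2-D array indexed (row, col), assumed Cartesian
    center_ij   : (row, col) of the storm center
    rmw_gridpts : RMW in grid points
    Returns an (n_r, n_az) array of interpolated values.
    """
    r = np.linspace(0.0, max_r, n_r) * rmw_gridpts      # radii in grid points
    az = np.linspace(0.0, 2 * np.pi, n_az, endpoint=False)
    rr, aa = np.meshgrid(r, az, indexing="ij")
    rows = center_ij[0] + rr * np.sin(aa)               # math convention here
    cols = center_ij[1] + rr * np.cos(aa)
    # Bilinear interpolation at the polar sample points.
    return map_coordinates(precip, [rows.ravel(), cols.ravel()],
                           order=1, mode="nearest").reshape(n_r, n_az)
```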
Abstract
This study analyzes spatial and temporal characteristics of multiyear droughts and pluvials over the southwestern United States with a focus on the upper Colorado River basin. The study uses two multiscalar moisture indices: standardized precipitation evapotranspiration index (SPEI) and standardized precipitation index (SPI) on a 36-month scale (SPEI36 and SPI36, respectively). The indices are calculated from monthly average precipitation and maximum and minimum temperatures from the Parameter-Elevation Regressions on Independent Slopes Model dataset for the period 1950–2012. The study examines the relationship between individual climate variables as well as large-scale atmospheric circulation features found in reanalysis output during drought and pluvial periods. The results indicate that SPEI36 and SPI36 show similar temporal and spatial patterns, but that the inclusion of temperatures in SPEI36 leads to more extreme magnitudes in SPEI36 than in SPI36. Analysis of large-scale atmospheric fields indicates an interplay between different fields that yields extremes over the study region. Widespread drought (pluvial) events are associated with enhanced positive (negative) 500-hPa geopotential height anomaly linked to subsidence (ascent) and negative (positive) moisture convergence and precipitable water anomalies. Considering the broader context of the conditions responsible for the occurrence of prolonged hydrologic anomalies provides water resource managers and other decision-makers with valuable understanding of these events. This perspective also offers evaluation opportunities for climate models.
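For orientation, a simplified SPI calculation on a 36-month scale might proceed as below. This sketch fits one gamma distribution per calendar-month group of accumulations and ignores the zero-precipitation adjustment and goodness-of-fit checks that production SPI codes include; SPEI follows the same transform applied to precipitation minus potential evapotranspiration.

```python
import numpy as np
from scipy import stats

def spi(precip_monthly, scale=36):
    """Standardized precipitation index from a monthly series.

    Accumulate precipitation over `scale`-month windows, fit a gamma
    distribution to each calendar-month group of accumulations, and map
    the fitted CDF values to standard normal quantiles.
    """
    # x[k] is the total over months k .. k+scale-1.
    x = np.convolve(precip_monthly, np.ones(scale), mode="valid")
    out = np.full(len(x), np.nan)
    for m in range(12):                      # group windows by start month
        idx = np.arange(m, len(x), 12)
        sample = x[idx]
        a, loc, b = stats.gamma.fit(sample, floc=0)   # shape, loc, scale
        cdf = stats.gamma.cdf(sample, a, loc=loc, scale=b)
        out[idx] = stats.norm.ppf(cdf)       # standard normal quantiles
    return out
```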
Abstract
Evaluation of numerical weather prediction (NWP) is critical for both forecasters and researchers. Through such evaluation, forecasters can understand the strengths and weaknesses of NWP guidance, and researchers can work to improve NWP models. However, evaluating high-resolution convection-allowing models (CAMs) requires unique verification metrics tailored to high-resolution output, particularly when considering extreme events. Metrics used and fields evaluated often differ between verification studies, hindering the effort to broadly compare CAMs. The purpose of this article is to summarize the development and initial testing of a CAM-based scorecard, which is intended for broad use across research and operational communities and is similar to scorecards currently available within the enhanced Model Evaluation Tools package (METplus) for evaluating coarser models. Scorecards visualize many verification metrics and attributes simultaneously, providing a broad overview of model performance. A preliminary CAM scorecard was developed and tested during the 2018 Spring Forecasting Experiment using METplus, focused on metrics and attributes relevant to severe convective forecasting. The scorecard compared attributes specific to convection-allowing scales such as reflectivity and surrogate severe fields, using metrics like the critical success index (CSI) and fractions skill score (FSS). While this preliminary scorecard focuses on attributes relevant to severe convective storms, the scorecard framework allows for the inclusion of further metrics relevant to other applications. Development of a CAM scorecard allows for evidence-based decision-making regarding future operational CAM systems as the National Weather Service transitions to a Unified Forecast System as part of the Next-Generation Global Prediction System initiative.
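Two of the contingency-table metrics named above are easy to state directly. The sketch below computes CSI and frequency bias from thresholded grids; it is a simplification of what METplus computes, with no handling of the degenerate no-event case.

```python
import numpy as np

def contingency_scores(fcst, obs, threshold):
    """CSI and frequency bias from a 2x2 contingency table at one threshold."""
    f = fcst >= threshold
    o = obs >= threshold
    hits = np.sum(f & o)
    false_alarms = np.sum(f & ~o)
    misses = np.sum(~f & o)
    csi = hits / (hits + misses + false_alarms)        # critical success index
    freq_bias = (hits + false_alarms) / (hits + misses)  # fcst/obs event ratio
    return csi, freq_bias
```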
Abstract
Forecast verification and evaluation is a critical aspect of forecast development and improvement, day-to-day forecasting, and the interpretation and application of forecasts. In recent decades, the verification field has rapidly matured, and many new approaches have been developed. However, until recently, a stable set of modern tools to undertake this important component of forecasting has not been available. The Model Evaluation Tools (MET) was conceived and implemented to fill this gap. MET (https://dtcenter.org/community-code/model-evaluation-tools-met) was developed by the National Center for Atmospheric Research (NCAR), the National Oceanic and Atmospheric Administration (NOAA), and the U.S. Air Force (USAF) and is supported via the Developmental Testbed Center (DTC) and collaborations with operational and research organizations. MET incorporates traditional verification methods, as well as modern verification capabilities developed over the last two decades. MET stands apart from other verification packages due to its inclusion of innovative spatial methods, statistical inference tools, and a wide range of approaches to address the needs of individual users, coupled with strong community engagement and support. In addition, MET is freely available, which ensures that consistent modern verification capabilities can be applied by researchers and operational forecasting practitioners, enabling the use of consistent and scientifically meaningful methods by all users. This article describes MET and the expansion of MET to an umbrella package (METplus) that includes a database and display system and Python wrappers to facilitate the wide use of MET. Examples of MET applications illustrate some of the many ways that the package can be used to evaluate forecasts in a meaningful way.