Search Results
You are looking at 41–50 of 67 items for
- Author or Editor: Adam J. Clark
Abstract
Four different versions of the HAILCAST hail model have been tested as part of the 2014–16 NOAA Hazardous Weather Testbed (HWT) Spring Forecasting Experiments. HAILCAST was run as part of the National Severe Storms Laboratory (NSSL) WRF Ensemble during 2014–16 and the Community Leveraged Unified Ensemble (CLUE) in 2016. Objective verification using the Multi-Radar Multi-Sensor maximum expected size of hail (MRMS MESH) product was conducted using both object-based and neighborhood grid-based verification. Subjective verification and feedback were provided by HWT participants. Hourly maximum storm surrogate fields at a variety of thresholds and Storm Prediction Center (SPC) convective outlooks were also evaluated for comparison. HAILCAST was found to improve with each version due to feedback from the 2014–16 HWTs. The 2016 version of HAILCAST equaled or exceeded the skill of the tested storm surrogates across a variety of thresholds. The post-2016 version of HAILCAST was found to improve 50-mm hail forecasts through object-based verification, but 25-mm hail forecasting ability declined as measured through neighborhood grid-based verification. The skill of the storm surrogate fields varied widely as the threshold values used to determine hail size were varied. HAILCAST was found not to require such tuning, as it produced consistent results even when used across different model configurations and horizontal grid spacings. Additionally, different storm surrogate fields performed at varying levels of skill when forecasting 25- versus 50-mm hail, hinting at the different convective modes typically associated with small versus large hail. HAILCAST matched the best-performing storm surrogate field relatively consistently across multiple hail size thresholds.
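To make the neighborhood grid-based verification described above concrete, the following minimal sketch compares a forecast hail-size field against MRMS MESH within a square neighborhood and returns a critical success index. The 3-km grid spacing, 40-km radius, and 25-mm threshold are illustrative assumptions, not the exact configuration used in the experiments.

```python
# Hedged sketch: neighborhood grid-based verification of forecast hail size
# against MRMS MESH. Grid spacing, neighborhood radius, and the 25-mm
# threshold are assumed values, not the study's exact configuration.
import numpy as np
from scipy.ndimage import maximum_filter

def neighborhood_csi(fcst_hail_mm, mesh_mm, threshold_mm=25.0,
                     radius_km=40.0, dx_km=3.0):
    """Critical success index after applying a square neighborhood maximum."""
    width = 2 * int(round(radius_km / dx_km)) + 1       # neighborhood width in grid points
    fcst_hit = maximum_filter(fcst_hail_mm, size=width) >= threshold_mm
    obs_hit = maximum_filter(mesh_mm, size=width) >= threshold_mm
    hits = np.sum(fcst_hit & obs_hit)
    misses = np.sum(~fcst_hit & obs_hit)
    false_alarms = np.sum(fcst_hit & ~obs_hit)
    total = hits + misses + false_alarms
    return hits / total if total > 0 else np.nan

# Toy example with random stand-ins for HAILCAST output and MRMS MESH (mm)
rng = np.random.default_rng(0)
print(neighborhood_csi(rng.gamma(2.0, 8.0, (200, 200)),
                       rng.gamma(2.0, 8.0, (200, 200))))
```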
Abstract
This work evaluates the performance of a recently developed cloud-scale lightning data assimilation technique implemented within the Weather Research and Forecasting Model running at convection-allowing scales (4-km grid spacing). Data provided by the Earth Networks Total Lightning Network for the contiguous United States (CONUS) were assimilated in real time over 67 days spanning the 2013 warm season (May–July). The lightning data were assimilated during the first 2 h of simulations each day. Bias-corrected, neighborhood-based, equitable threat scores (BC-ETSs) were the chief metric used to quantify the skill of the forecasts utilizing this assimilation scheme. Owing to inferior observational data quality over mountainous terrain, this evaluation focused on the eastern two-thirds of the United States.
During the first 3 h following the assimilation (i.e., 3-h forecasts), all the simulations suffered from a high wet bias in forecast accumulated precipitation (APCP), particularly the lightning assimilation run (LIGHT). Forecasts produced by LIGHT, however, had a noticeable, statistically significant (α = 0.05) improvement over those from the control run (CTRL) up to 6 h into the forecast, with BC-ETS differences often exceeding 0.4. This improvement was seen independently of the APCP threshold (ranging from 2.5 to 50 mm) and the neighborhood radius (ranging from 0 to 40 km) selected. Past 6 h of the forecast, the APCP fields from LIGHT progressively converged to those of CTRL, probably because the longer-term evolution is bounded by the large-scale model environment. Thus, this computationally inexpensive lightning assimilation scheme shows considerable promise for routinely improving short-term (≤6 h) forecasts of high-impact weather by convection-allowing forecast models.
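The equitable threat score underlying the BC-ETS metric can be sketched as follows. This version applies a neighborhood maximum and the standard ETS formula only and omits the bias-correction step of the published BC-ETS; the 2.5-mm threshold and 5-gridpoint radius are assumed example values.

```python
# Sketch of a neighborhood equitable threat score (ETS) for accumulated
# precipitation (APCP). The published BC-ETS additionally applies a bias
# correction that is not reproduced here; the threshold and radius are
# assumed example values.
import numpy as np
from scipy.ndimage import maximum_filter

def neighborhood_ets(fcst_apcp, obs_apcp, thresh_mm=2.5, radius_pts=5):
    size = 2 * radius_pts + 1
    f = maximum_filter(fcst_apcp, size=size) >= thresh_mm
    o = maximum_filter(obs_apcp, size=size) >= thresh_mm
    hits = float(np.sum(f & o))
    misses = float(np.sum(~f & o))
    false_alarms = float(np.sum(f & ~o))
    hits_random = (hits + misses) * (hits + false_alarms) / f.size  # chance hits
    denom = hits + misses + false_alarms - hits_random
    return (hits - hits_random) / denom if denom > 0 else np.nan
```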
Abstract
Attempts at probabilistic tornado forecasting using convection-allowing models (CAMs) have thus far used CAM attribute [e.g., hourly maximum 2–5-km updraft helicity (UH)] thresholds, treating them as binary events—either a grid point exceeds a given threshold or it does not. This study approaches these attributes probabilistically, using empirical observations of storm environment attributes and the subsequent climatological tornado occurrence frequency to assign a probability that a point will be within 40 km of a tornado, given the model-derived storm environment attributes. Combining empirical frequencies and forecast attributes produces better forecasts than solely using mid- or low-level UH, even if the UH is filtered using environmental parameter thresholds. Empirical tornado frequencies were derived using severe right-moving supercellular storms associated with a local storm report (LSR) of a tornado, severe wind, or severe hail for a given significant tornado parameter (STP) value from Storm Prediction Center (SPC) mesoanalysis grids in 2014–15. The NSSL–WRF ensemble produced the forecast STP values and simulated right-moving supercells, which were identified using a UH exceedance threshold. Model-derived probabilities are verified using tornado segment data from just right-moving supercells and from all tornadoes, as are the SPC-issued 0600 UTC tornado probabilities from the initial day 1 forecast valid 1200–1159 UTC the following day. The STP-based probabilistic forecasts perform comparably to SPC tornado probability forecasts in many skill metrics (e.g., reliability) and thus could be used as first-guess forecasts. Comparison with prior methodologies shows that probabilistic environmental information improves CAM-based tornado forecasts.
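A minimal sketch of the probability assignment described above: wherever a member's hourly maximum UH indicates a simulated right-moving supercell, look up an empirical tornado frequency for the forecast STP value and average across members. The bin edges, frequencies, and UH threshold below are placeholders, not the values derived from the 2014–15 SPC mesoanalysis climatology.

```python
# Sketch: tornado probability where an ensemble member simulates a right-moving
# supercell (UH exceedance), using an empirical frequency table keyed on the
# forecast significant tornado parameter (STP). Bin edges, frequencies, and the
# UH threshold are placeholders, not the study's fitted climatology.
import numpy as np

STP_BIN_EDGES = np.array([0.0, 0.5, 1.0, 2.0, 4.0, 8.0])       # bins: [0,0.5), ..., [8,inf)
TOR_FREQ_GIVEN_SUPERCELL = np.array([0.02, 0.05, 0.10, 0.18, 0.30, 0.40])  # placeholder

def member_tornado_prob(uh, stp, uh_thresh=75.0):
    """Per-member probability field: empirical frequency wherever UH exceeds the threshold."""
    idx = np.clip(np.digitize(stp, STP_BIN_EDGES) - 1, 0, len(TOR_FREQ_GIVEN_SUPERCELL) - 1)
    return np.where(uh >= uh_thresh, TOR_FREQ_GIVEN_SUPERCELL[idx], 0.0)

def ensemble_tornado_prob(uh_members, stp_members, uh_thresh=75.0):
    """Average the per-member probability fields to form the ensemble forecast."""
    return np.mean([member_tornado_prob(u, s, uh_thresh)
                    for u, s in zip(uh_members, stp_members)], axis=0)
```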
Abstract
Probabilistic ensemble-derived tornado forecasts generated from convection-allowing models often use hourly maximum updraft helicity (UH) alone or in combination with environmental parameters as a proxy for right-moving (RM) supercells. However, when UH occurrence is a condition for tornado probability generation, false alarm areas can occur from UH swaths associated with nocturnal mesoscale convective systems, which climatologically produce fewer tornadoes than RM supercells. This study incorporates UH timing information with the forecast near-storm significant tornado parameter (STP) to calibrate the forecast tornado probability. To generate the probabilistic forecasts, three sets of observed climatological tornado frequencies given an RM supercell and STP value are incorporated with the model output, two of which use UH timing information. One method uses the observed climatological tornado frequency for a given 3-h window to generate the probabilities. Another normalizes the observed climatological tornado frequency by the number of hail, wind, and tornado reports observed in that 3-h window compared to the maximum number of reports in any 3-h window. The final method is independent of when UH occurs and uses the observed climatological tornado frequency encompassing all hours. The normalized probabilities reduce the false alarm area compared to the other methods but have a smaller area under the ROC curve and require a much higher percentile of the STP distribution to be used in probability generation to become reliable. Case studies demonstrate that the normalized probabilities highlight the most likely area for evening RM supercellular tornadoes, decreasing the nocturnal false alarm by assuming a linear convective mode.
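The report-count normalization can be sketched as a simple scaling of the climatological tornado frequency by how active a given 3-h window is relative to the most active window. The window labels and report counts below are invented placeholders, and the exact normalization in the study may differ.

```python
# Sketch of the report-count normalization: scale the climatological tornado
# frequency for a 3-h window by that window's share of severe local storm
# reports relative to the most active window. Counts are invented placeholders.
reports_per_window = {"18-21": 5200, "21-00": 6100, "00-03": 4300, "03-06": 1800}
max_reports = max(reports_per_window.values())

def normalized_frequency(climo_freq, window):
    """Down-weight windows (e.g., overnight) with relatively few severe reports."""
    return climo_freq * reports_per_window[window] / max_reports

print(normalized_frequency(0.12, "21-00"))  # most active window: essentially unchanged
print(normalized_frequency(0.12, "03-06"))  # overnight window: strongly reduced
```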
Abstract
During the 2019 Spring Forecasting Experiment in NOAA’s Hazardous Weather Testbed, two NWS forecasters issued experimental probabilistic forecasts of hail, tornadoes, and severe convective wind using NSSL’s Warn-on-Forecast System (WoFS). The aim was to explore forecast skill in the time frame between severe convective watches and severe convective warnings during the peak of the spring convective season. Hourly forecasts issued during 2100–0000 UTC and valid from 0100 to 0200 UTC demonstrate how forecasts change with decreasing lead time. Across all 13 cases in this study, the descriptive outlook statistics (e.g., mean outlook area, number of contours) change slightly and the measures of outlook skill (e.g., fractions skill score, reliability) improve incrementally with decreasing lead time. WoFS updraft helicity (UH) probabilities also improve slightly and less consistently with decreasing lead time, though both the WoFS and the forecasters generated skillful forecasts throughout. Larger skill differences with lead time emerge on a case-by-case basis, illustrating cases where forecasters consistently improved upon WoFS guidance, cases where the guidance and the forecasters recognized small-scale features as lead time decreased, and cases where the forecasters issued small areas of high probabilities using guidance and observations. While forecasts generally “homed in” on the reports with slightly smaller contours and higher probabilities, increased confidence could include higher certainty that severe weather would not occur (e.g., lower probabilities). Long-range (1–5 h) WoFS UH probabilities were skillful, and where the guidance erred, forecasters could adjust for those errors and increase their forecasts’ skill as lead time decreased.
Significance Statement
Forecasts are often assumed to improve as an event approaches and uncertainties resolve. This work examines the evolution of experimental forecasts valid over one hour with decreasing lead time issued using the Warn-on-Forecast System (WoFS). Because of its rapidly updating ensemble data assimilation, WoFS can help forecasters understand how thunderstorm hazards may evolve in the next 0–6 h. We found slight improvements in forecast and WoFS performance as a function of lead time over the full experiment; the first forecasts issued and the initial WoFS guidance performed well at long lead times, and good performance continued as the event approached. However, individual cases varied and forecasters frequently combined raw model output with observed mesoscale features to provide skillful small-scale forecasts.
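A minimal sketch of the fractions skill score used to evaluate the outlooks, assuming binary forecast and observed event grids and an arbitrary 25-gridpoint window:

```python
# Sketch of the fractions skill score (FSS): compare neighborhood event
# fractions from forecast and observed binary grids. The 25-gridpoint window
# is an arbitrary example value.
import numpy as np
from scipy.ndimage import uniform_filter

def fss(fcst_binary, obs_binary, window_pts=25):
    pf = uniform_filter(fcst_binary.astype(float), size=window_pts)
    po = uniform_filter(obs_binary.astype(float), size=window_pts)
    mse = np.mean((pf - po) ** 2)
    mse_ref = np.mean(pf ** 2) + np.mean(po ** 2)
    return 1.0 - mse / mse_ref if mse_ref > 0 else np.nan
```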
Abstract
The High Resolution Ensemble Forecast v2.1 (HREFv2.1), an operational convection-allowing model (CAM) ensemble, is an “ensemble of opportunity” wherein forecasts from several independently designed deterministic CAMs are aggregated and postprocessed together. Multiple dimensions of diversity in the HREFv2.1 ensemble membership contribute to ensemble spread, including model core, physics parameterization schemes, initial conditions (ICs), and time lagging. In this study, HREFv2.1 forecasts are compared against the High Resolution Rapid Refresh Ensemble (HRRRE) and the Multiscale data Assimilation and Predictability (MAP) ensemble, two experimental CAM ensembles that ran during the 5-week Spring Forecasting Experiment (SFE) in spring 2018. The HRRRE and MAP are formally designed ensembles with spread achieved primarily through perturbed ICs. Verification in this study focuses on composite radar reflectivity and updraft helicity to assess ensemble performance in forecasting convective storms. The HREFv2.1 shows the highest overall skill for these forecasts, matching subjective real-time impressions from SFE participants. Analysis of the skill and variance of ensemble member forecasts suggests that the HREFv2.1 exhibits greater spread and more effectively samples model uncertainty than the HRRRE or MAP. These results imply that to optimize skill in forecasting convective storms at 1–2-day lead times, future CAM ensembles should employ either diverse membership designs or sophisticated perturbation schemes capable of representing model uncertainty with comparable efficacy.
Abstract
Hail forecasts produced by the CAM-HAILCAST pseudo-Lagrangian hail size forecasting model were evaluated during the 2019, 2020, and 2021 NOAA Hazardous Weather Testbed (HWT) Spring Forecasting Experiments (SFEs). As part of this evaluation, HWT SFE participants were polled about their definition of a “good” hail forecast. Participants were presented with two different verification methods conducted over three different spatiotemporal scales, and were then asked to subjectively evaluate the hail forecast as well as the different verification methods themselves. The results supported using multiple verification methods tailored to the type of forecast expected by the end-user interpreting and applying the forecast. The hail forecasts evaluated during this period included an implementation of CAM-HAILCAST in the Limited Area Model of the Unified Forecast System with the Finite Volume 3 (FV3) dynamical core. Evaluation of FV3-HAILCAST over both 1- and 24-h periods found continued improvement from 2019 to 2021. The improvement largely stemmed from the wide variability among FV3 ensemble members with different microphysics parameterizations seen in 2019 lessening significantly during 2020 and 2021. Overprediction throughout the diurnal cycle also lessened by 2021. A combination of upscaling neighborhood verification and an object-based technique that retained only matched convective objects was necessary to understand the improvement, agreeing with the HWT SFE participants’ recommendations for multiple verification methods.
Significance Statement
“Good” forecasts of hail can be determined in multiple ways and must depend on both the performance of the guidance and the perspective of the end-user. This work looks at different verification strategies to capture the performance of the CAM-HAILCAST hail forecasting model across three years of the Spring Forecasting Experiment (SFE) in different parent models. Verification strategies were informed by SFE participant input via a survey. Skill variability among models decreased in SFE 2021 relative to prior SFEs. The FV3 model in 2021, compared to 2019, provided improved forecasts of both convective distribution and 38-mm (1.5 in.) hail size, as well as less overforecasting of convection from 1900 to 2300 UTC.
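The object-based technique that retains only matched convective objects can be sketched by labeling contiguous exceedance regions in the forecast and observed fields and keeping forecast objects whose centroids fall within a matching distance of an observed object centroid. The 25-mm threshold and 20-gridpoint match radius are assumptions, and the matching criteria in the actual evaluation may be more involved.

```python
# Sketch: object-based verification that keeps only "matched" objects, i.e.,
# forecast hail objects whose centroids lie within some distance of an observed
# (MESH) object centroid. The threshold and match radius are assumed values.
import numpy as np
from scipy.ndimage import label, center_of_mass

def matched_forecast_objects(fcst_mm, obs_mm, thresh_mm=25.0, match_radius_pts=20):
    fcst_lab, n_f = label(fcst_mm >= thresh_mm)
    obs_lab, n_o = label(obs_mm >= thresh_mm)
    fcst_cent = center_of_mass(fcst_mm, fcst_lab, range(1, n_f + 1))
    obs_cent = center_of_mass(obs_mm, obs_lab, range(1, n_o + 1))
    matched = []
    for obj_id, fc in enumerate(fcst_cent, start=1):
        dists = [np.hypot(fc[0] - oc[0], fc[1] - oc[1]) for oc in obs_cent]
        if dists and min(dists) <= match_radius_pts:
            matched.append(obj_id)               # retain only matched forecast objects
    return matched, n_f, n_o
```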
Abstract
Convection-allowing models offer forecasters unique insight into convective hazards relative to numerical models using parameterized convection. However, methods to best characterize the uncertainty of guidance derived from convection-allowing models are still unrefined. This paper proposes a method of deriving calibrated probabilistic forecasts of rare events from deterministic forecasts by fitting a parametric kernel density function to the model’s historical spatial error characteristics. This kernel density function is then applied to individual forecast fields to produce probabilistic forecasts.
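A minimal sketch of the general idea, assuming a Gaussian kernel: convert the deterministic forecast into an exceedance field and convolve it with a smoothing kernel whose width would, in the paper's approach, be fit to the model's historical spatial error characteristics rather than chosen arbitrarily as here.

```python
# Sketch, assuming a Gaussian kernel: convolve a deterministic exceedance field
# with a smoothing kernel to obtain a probability field. The kernel width here
# is arbitrary; in practice it would be fit to historical spatial errors.
import numpy as np
from scipy.ndimage import gaussian_filter

def smoothed_probability(field, thresh, sigma_km=40.0, dx_km=4.0):
    exceed = (field >= thresh).astype(float)                  # deterministic exceedance (0/1)
    return gaussian_filter(exceed, sigma=sigma_km / dx_km)    # values fall in [0, 1]
```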
Abstract
Examining forecasts from the Storm Scale Ensemble Forecast (SSEF) system run by the Center for Analysis and Prediction of Storms for the 2010 NOAA/Hazardous Weather Testbed Spring Forecasting Experiment, recent research diagnosed a strong relationship between the cumulative pathlengths of simulated rotating storms (measured using a three-dimensional object identification algorithm applied to forecast updraft helicity) and the cumulative pathlengths of tornadoes. This paper updates those results by including data from the 2011 SSEF system, and illustrates forecast examples from three major 2011 tornado outbreaks—16 and 27 April, and 24 May—as well as two forecast failure cases from June 2010. Finally, analysis updraft helicity (UH) from 27 April 2011 is computed using a three-dimensional variational data assimilation system to obtain 1.25-km grid-spacing analyses at 5-min intervals and compared to forecast UH from individual SSEF members.
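The cumulative pathlength of simulated rotating storms can be approximated by labeling contiguous swaths in the hourly maximum UH field and summing each swath's end-to-end extent. The study used a three-dimensional object identification algorithm; this two-dimensional simplification, the 75 m2 s−2 threshold, and the 4-km grid spacing are assumptions for illustration only.

```python
# Sketch: cumulative pathlength of simulated rotating storms. Label contiguous
# regions where hourly maximum 2-5-km UH exceeds a threshold and sum each
# swath's largest end-to-end extent. The study used a 3D object algorithm; this
# 2D version, the UH threshold, and the grid spacing are simplifying assumptions.
import numpy as np
from scipy.ndimage import label

def cumulative_uh_pathlength(uh_max, thresh=75.0, dx_km=4.0):
    labeled, nobj = label(uh_max >= thresh)
    total_km = 0.0
    for obj_id in range(1, nobj + 1):
        pts = np.argwhere(labeled == obj_id).astype(float)
        # swath "pathlength": largest pairwise distance among its grid points
        diffs = pts[:, None, :] - pts[None, :, :]
        total_km += np.sqrt((diffs ** 2).sum(axis=-1)).max() * dx_km
    return total_km
```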
Abstract
Probabilistic quantitative precipitation forecasts (PQPFs) from the storm-scale ensemble forecast system run by the Center for Analysis and Prediction of Storms during the spring of 2009 are evaluated using area under the relative operating characteristic curve (ROC area). ROC area, which measures discriminating ability, is examined for ensemble size n from 1 to 17 members and for spatial scales ranging from 4 to 200 km.
As expected, incremental gains in skill decrease with increasing n. Significance tests comparing ROC areas for each n to those of the full 17-member ensemble revealed that more members are required to reach statistically indistinguishable PQPF skill relative to the full ensemble as forecast lead time increases and spatial scale decreases. These results appear to reflect the broadening of the forecast probability distribution function (PDF) of future atmospheric states associated with decreasing spatial scale and increasing forecast lead time. They also illustrate that efficient allocation of computing resources for convection-allowing ensembles requires careful consideration of the desired spatial scale and forecast length.
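The ROC-area calculation for an n-member subset can be sketched as follows: the PQPF at each grid point is the fraction of members exceeding the APCP threshold, and the ROC curve is traced by varying the probability decision threshold over the n + 1 possible member fractions. The 25-mm threshold is an assumed example value.

```python
# Sketch: ROC area for a PQPF built from the first n ensemble members. The
# probability at each point is the fraction of members exceeding the APCP
# threshold; the ROC curve is traced over the possible member fractions.
# The 25-mm threshold is an assumed example value.
import numpy as np

def roc_area(member_apcp, obs_apcp, n, apcp_thresh=25.0):
    prob = np.mean(member_apcp[:n] >= apcp_thresh, axis=0).ravel()
    obs = (obs_apcp >= apcp_thresh).ravel()
    pods, pofds = [], []
    for k in range(n + 2):                      # k/n > 1 closes the curve at (0, 0)
        warn = prob >= k / n
        hits = np.sum(warn & obs)
        misses = np.sum(~warn & obs)
        false_alarms = np.sum(warn & ~obs)
        corr_neg = np.sum(~warn & ~obs)
        pods.append(hits / max(hits + misses, 1))
        pofds.append(false_alarms / max(false_alarms + corr_neg, 1))
    order = np.argsort(pofds)
    return np.trapz(np.array(pods)[order], np.array(pofds)[order])
```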