Search Results

Showing 1–10 of 42 items for Author or Editor: Fanyou Kong
Jeffrey D. Duda, Xuguang Wang, Fanyou Kong, and Ming Xue

Abstract

Two approaches for accounting for errors in quantitative precipitation forecasts (QPFs) due to uncertainty in the microphysics (MP) parameterization in a convection-allowing ensemble are examined: a mixed-MP ensemble (MMP) composed mostly of double-moment schemes, and an ensemble with perturbed parameters within the Weather Research and Forecasting (WRF) single-moment 6-class (WSM6) MP scheme (PPMP). The study draws on thirty-five cases of real-time storm-scale ensemble forecasts produced by the Center for Analysis and Prediction of Storms during the NOAA Hazardous Weather Testbed 2011 Spring Experiment.

The MMP ensemble had better fractions Brier scores (FBSs) for most lead times and thresholds, but the PPMP ensemble had better relative operating characteristic (ROC) scores for higher precipitation thresholds. The pooled ensemble formed by randomly drawing five members from the MMP and PPMP ensembles was no more skillful than the more accurate of the MMP and PPMP ensembles. Significant positive impact was found when the two were combined to form a larger ensemble.
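
As a concrete illustration of the fractions Brier score used in this comparison, the sketch below computes a neighborhood-fraction FBS on toy gridded fields; the threshold, the 5 x 5 neighborhood, and the synthetic fields are illustrative assumptions, not the study's configuration.

```python
# A minimal sketch of a fractions Brier score (FBS), assuming gridded
# forecast and observed precipitation on a common grid; the threshold,
# neighborhood size, and toy fields below are illustrative placeholders.
import numpy as np
from scipy.ndimage import uniform_filter

def fractions_brier_score(fcst, obs, threshold, size=5):
    """Mean squared difference between neighborhood exceedance fractions."""
    pf = uniform_filter((fcst >= threshold).astype(float), size=size, mode="constant")
    po = uniform_filter((obs >= threshold).astype(float), size=size, mode="constant")
    return float(np.mean((pf - po) ** 2))

rng = np.random.default_rng(0)
fcst = rng.gamma(2.0, 2.0, size=(100, 100))  # toy QPF field (mm)
obs = rng.gamma(2.0, 2.0, size=(100, 100))   # toy verifying analysis (mm)
print(fractions_brier_score(fcst, obs, threshold=5.0))  # lower is better
```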

The QPF and the systematic behaviors of derived microphysical variables were also examined. The skill of the QPF among different members depended on the thresholds, verification metrics, and forecast lead times. The profiles of microphysics variables from the double-moment schemes contained more variation in the vertical than those from the single-moment members. Among the double-moment schemes, the WRF double-moment 6-class scheme (WDM6) produced the smallest raindrops and very large number concentrations. Among the PPMP members, the behaviors were found to be consistent with the prescribed intercept parameters. The perturbed intercept parameters used in the PPMP ensemble fell within the range of values retrieved from the double-moment schemes.

Caren Marzban, Ranran Wang, Fanyou Kong, and Stephen Leyton

Abstract

The rank histogram (RH) is a visual tool for assessing the reliability of ensemble forecasts (i.e., the degree to which the forecasts and the observations have the same distribution). It is already known, however, that in certain situations it conveys misleading information. Here, it is shown that a temporal correlation can lead to a misleading RH, but such a correlation contributes only to the sampling variability of the RH, and so it is accounted for by producing an RH that explicitly displays sampling variability. A simulation is employed to show that the variance within each ensemble member (i.e., climatological variance), the correlation between ensemble members, and the correlation between the observations and the forecasts all have a confounding effect on the RH, making it difficult to use the RH for assessing the climatological component of forecast reliability. It is proposed that a “residual” quantile–quantile plot (denoted R-Q-Q plot) is better suited than the RH for assessing the climatological component of forecast reliability. The RH and R-Q-Q plots for temperature and wind speed forecasts at 90 stations across the continental United States are then computed. A wide range of forecast reliability is noted. For some stations, the nonreliability of the forecasts can be attributed to bias and/or under- or overdispersion relative to climatology. For others, the difference between the distributions can be traced to lighter or heavier tails, while for still other stations the distributions of the forecasts and the observations appear to be completely different. A spatial signature is also noted and discussed briefly.
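
For orientation, a bare-bones rank histogram can be built as below; the synthetic, temporally independent data are an assumption and deliberately sidestep the correlation effects the abstract discusses.

```python
# A minimal rank-histogram sketch, assuming an (n_cases, n_members) ensemble
# array and matching observations; the synthetic data are illustrative and
# temporally independent, unlike the correlated cases discussed above.
import numpy as np

def rank_histogram(ens, obs):
    """Count how often the observation falls into each rank position."""
    ranks = np.sum(ens < obs[:, None], axis=1)   # 0..n_members per case
    return np.bincount(ranks, minlength=ens.shape[1] + 1)

rng = np.random.default_rng(1)
ens = rng.normal(size=(2000, 10))  # toy 10-member ensemble
obs = rng.normal(size=2000)        # observations from the same climatology
print(rank_histogram(ens, obs))    # near-flat counts suggest reliability
```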

Eric D. Loken, Adam J. Clark, Ming Xue, and Fanyou Kong

Abstract

Given increasing computing power, an important question is whether additional computational resources would be better spent reducing the horizontal grid spacing of a convection-allowing model (CAM) or adding members to form CAM ensembles. The present study investigates this question as it applies to CAM-derived next-day probabilistic severe weather forecasts created by using forecast updraft helicity as a severe weather proxy for 63 days of the 2010 and 2011 NOAA Hazardous Weather Testbed Spring Forecasting Experiments. Forecasts derived from three sets of Weather Research and Forecasting Model configurations are tested: a 1-km deterministic model, a 4-km deterministic model, and an 11-member, 4-km ensemble. Forecast quality is evaluated using relative operating characteristic (ROC) curves, attributes diagrams, and performance diagrams, and forecasts from five representative cases are analyzed to investigate their relative quality and value in a variety of situations. While no statistically significant differences exist between the 4- and 1-km deterministic forecasts in terms of area under ROC curves, the 4-km ensemble forecasts offer weakly significant improvements over the 4-km deterministic forecasts over the entire 63-day dataset. Further, the 4-km ensemble forecasts generally provide greater forecast quality relative to either of the deterministic forecasts on an individual day. Collectively, these results suggest that, for purposes of improving next-day CAM-derived probabilistic severe weather forecasts, additional computing resources may be better spent on adding members to form CAM ensembles than on reducing the horizontal grid spacing of a deterministic model below 4 km.
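
As a sketch of the verification step, the area under the ROC curve for ensemble-derived event probabilities can be computed as follows; the toy probabilities and outcomes stand in for the updraft-helicity-based forecasts, and the function name is illustrative.

```python
# A minimal sketch of area under the ROC curve for probabilistic forecasts,
# assuming per-case event probabilities (e.g., the fraction of members whose
# updraft helicity exceeds a threshold) and binary outcomes; the data below
# are illustrative placeholders.
import numpy as np

def roc_area(prob, event):
    """Trapezoidal area under the ROC curve (hit rate vs. false-alarm rate)."""
    hits, fars = [1.0], [1.0]              # warn-on-everything endpoint
    for p in np.unique(prob):              # sweep decision thresholds upward
        warn = prob >= p
        hits.append(np.sum(warn & event) / max(np.sum(event), 1))
        fars.append(np.sum(warn & ~event) / max(np.sum(~event), 1))
    hits.append(0.0)                       # warn-on-nothing endpoint
    fars.append(0.0)
    h, f = np.array(hits), np.array(fars)
    return float(np.sum((h[:-1] + h[1:]) / 2.0 * (f[:-1] - f[1:])))

rng = np.random.default_rng(2)
event = rng.random(500) < 0.2                              # toy observed events
prob = np.clip(0.4 * event + 0.6 * rng.random(500), 0, 1)  # toy skillful probs
print(roc_area(prob, event))                               # > 0.5 implies skill
```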

Fanyou Kong, Kelvin K. Droegemeier, and Nicki L. Hickmon

Abstract

In Part I, the authors used a full-physics, nonhydrostatic numerical model with horizontal grid spacing of 24 km and nested grids of 6- and 3-km spacing to generate ensemble forecasts of an observed tornadic thunderstorm complex. The principal goal was to quantify the value added by fine grid spacing, as well as by the assimilation of Doppler radar data, in both probabilistic and deterministic frameworks. The present paper focuses exclusively on 3-km horizontal grid spacing ensembles and the impacts on forecast quality of temporal forecast sequencing, the construction of initial perturbations, and data assimilation. As in Part I, the authors employ a modified form of the scaled lagged average forecasting technique and use Stage IV accumulated precipitation estimates for verification. The ensemble mean and spread of accumulated precipitation are found to be similar in structure, mimicking their behavior in global models. Both the assimilation of Doppler radar data and the use of shorter (1–2 versus 3–5 h) forecast lead times improve ensemble precipitation forecasts. However, even at longer lead times and in certain situations without assimilated radar data, the ensembles are able to capture storm-scale features when the associated control forecast in a deterministic framework fails to do so. This indicates the potential value added by ensembles, although this single case is not sufficient for drawing general conclusions. The creation of initial perturbations using forecasts of the same grid spacing shows no significant improvement over simply extracting perturbations from forecasts made at coarser spacing and interpolating them to finer grids. However, forecast quality is somewhat dependent upon perturbation amplitude, with smaller scaling values leading to significant underdispersion. Traditional forecast skill scores show somewhat contradictory results for accumulated precipitation, with the equitable threat score most consistent with qualitative performance.
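
For reference, the equitable threat score mentioned at the end of the abstract can be computed from a 2 x 2 contingency table as sketched below; the binary toy fields are assumptions for illustration.

```python
# A minimal sketch of the equitable threat score (ETS), assuming binary
# forecast/observed exceedance grids; the toy fields are placeholders.
import numpy as np

def equitable_threat_score(fcst_yes, obs_yes):
    """ETS: threat score adjusted for hits expected by random chance."""
    hits = np.sum(fcst_yes & obs_yes)
    misses = np.sum(~fcst_yes & obs_yes)
    false_alarms = np.sum(fcst_yes & ~obs_yes)
    hits_random = (hits + misses) * (hits + false_alarms) / fcst_yes.size
    return (hits - hits_random) / (hits + misses + false_alarms - hits_random)

rng = np.random.default_rng(3)
obs = rng.random((200, 200)) < 0.1
fcst = obs ^ (rng.random((200, 200)) < 0.05)  # toy forecast: obs plus noise
print(equitable_threat_score(fcst, obs))      # 1 = perfect, <= 0 = no skill
```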

Jiangshan Zhu, Fanyou Kong, Lingkun Ran, and Hengchi Lei

Abstract

To study the impact of training sample heterogeneity on the performance of Bayesian model averaging (BMA), two BMA experiments were performed on probabilistic quantitative precipitation forecasts (PQPFs) in the northern China region in July and August of 2010 generated from an 11-member short-range ensemble forecasting system. One experiment, as in many conventional BMA studies, used an overall training sample that consisted of all available cases in the training period, while the second experiment used stratified sampling BMA by first dividing all available training cases into subsamples according to their ensemble spread, and then performing BMA on each subsample. The results showed that ensemble spread is a good criterion to divide ensemble precipitation cases into subsamples, and that the subsamples have different statistical properties. Pooling the subsamples together forms a heterogeneous overall sample. Conventional BMA is incapable of interpreting heterogeneous samples, and produces unreliable PQPF. It underestimates the forecast probability at high-threshold PQPF and local rainfall maxima in BMA percentile forecasts. BMA with stratified sampling according to ensemble spread overcomes the problem reasonably well, producing sharper predictive probability density functions and BMA percentile forecasts, and more reliable PQPF than the conventional BMA approach. The continuous ranked probability scores, Brier skill scores, and reliability diagrams of the two BMA experiments were examined for all available forecast days, along with a logistic regression experiment. Stratified sampling BMA outperformed the raw ensemble and conventional BMA in all verifications, and also showed better skill than logistic regression in low-threshold forecasts.
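
The stratification idea can be sketched as follows. For brevity this uses Gaussian member kernels with a fixed shared variance, which is a simplification; BMA for precipitation typically uses kernels suited to rainfall (e.g., a point mass at zero plus a skewed distribution). The data and the two spread strata are illustrative assumptions.

```python
# A minimal sketch of BMA with spread-stratified training samples, assuming
# Gaussian member kernels with a shared fixed variance; data, variance, and
# bin edges below are illustrative, not the study's configuration.
import numpy as np
from scipy.stats import norm

def bma_weights(ens, obs, sigma=1.0, iters=200):
    """EM estimate of member weights for a Gaussian-kernel BMA mixture."""
    w = np.full(ens.shape[1], 1.0 / ens.shape[1])
    for _ in range(iters):
        like = w * norm.pdf(obs[:, None], loc=ens, scale=sigma)  # E-step
        z = like / like.sum(axis=1, keepdims=True)
        w = z.mean(axis=0)                                       # M-step
    return w

rng = np.random.default_rng(4)
ens = rng.normal(size=(600, 11)) + rng.normal(scale=0.3, size=(600, 1))
obs = ens[:, 0] + rng.normal(scale=0.5, size=600)  # member 0 carries most skill
# Stratify training cases by ensemble spread, then fit BMA within each stratum.
spread = ens.std(axis=1)
cut = np.median(spread)
for lo, hi in [(0.0, cut), (cut, np.inf)]:
    sel = (spread >= lo) & (spread < hi)
    print(bma_weights(ens[sel], obs[sel]).round(3))
```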

Aaron Johnson, Xuguang Wang, Fanyou Kong, and Ming Xue

Abstract

Forecasts generated by the Center for Analysis and Prediction of Storms with 1- and 4-km grid spacing using the Advanced Research Weather Research and Forecasting Model (ARW-WRF; ARW1 and ARW4, respectively) for the 2009–11 NOAA Hazardous Weather Testbed Spring Experiments are compared and verified. Object-based measures, including average values of object attributes, the object-based threat score (OTS), and the median of maximum interest (MMI), are used for the verification. Verification was first performed against observations at scales resolvable by each forecast model and then at scales resolvable by both models by remapping ARW1 to the ARW4 grid (ARW1to4). Thirty-hour forecasts of 1-h accumulated precipitation initialized at 0000 UTC on 22, 36, and 33 days during the spring of 2009, 2010, and 2011, respectively, are evaluated over a domain covering most of the central and eastern United States. ARW1, ARW1to4, and ARW4 all significantly overforecasted the number of objects during diurnal convection maxima. The overforecasts by ARW1 and ARW1to4 were more pronounced than those by ARW4 during the first convection maximum at 1-h lead time. The average object area and aspect ratio were closer to observations for ARW1 and ARW1to4 than for ARW4. None of the models showed a significant advantage over the others for average orientation angle and centroid location. Increased accuracy for ARW1, compared to ARW4, was statistically significant for the MMI but not the OTS. However, ARW1to4 had MMI and OTS values similar to those of ARW4 at most lead times. These results are consistent with subjective evaluations that the greatest impact of grid spacing is on the smallest resolvable objects.
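
As a sketch of the object-identification step underlying scores such as OTS and MMI, the snippet below labels contiguous rain objects and extracts the attributes the abstract compares (area, centroid); the fuzzy-logic interest functions that combine such attributes into OTS and MMI are not reproduced, and the threshold and toy field are assumptions.

```python
# A bare-bones sketch of precipitation-object identification, assuming a
# gridded 1-h precipitation field; the smoothing, threshold, and toy field
# are illustrative placeholders, not the study's configuration.
import numpy as np
from scipy import ndimage

def precip_objects(field, threshold):
    """Label contiguous rain objects and return simple attributes."""
    labels, n = ndimage.label(field >= threshold)
    objs = []
    for i in range(1, n + 1):
        mask = labels == i
        cy, cx = ndimage.center_of_mass(mask)
        objs.append({"area": int(mask.sum()), "centroid": (cy, cx)})
    return objs

rng = np.random.default_rng(5)
field = ndimage.gaussian_filter(rng.gamma(2.0, 2.0, (120, 120)), sigma=4)
for obj in precip_objects(field, threshold=np.percentile(field, 95)):
    print(obj)
```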

Russ S. Schumacher, Adam J. Clark, Ming Xue, and Fanyou Kong

Abstract

From 9 to 11 June 2010, a mesoscale convective vortex (MCV) was associated with several periods of heavy rainfall that led to flash flooding. During the overnight hours, mesoscale convective systems (MCSs) developed that moved slowly and produced heavy rainfall over small areas in south-central Texas on 9 June, north Texas on 10 June, and western Arkansas on 11 June. In this study, forecasts of this event from the Center for Analysis and Prediction of Storms' Storm-Scale Ensemble Forecast system are examined. This ensemble, with 26 members at 4-km horizontal grid spacing, included a few members that very accurately predicted the development, maintenance, and evolution of the heavy-rain-producing MCSs, along with a majority of members that had substantial errors in their precipitation forecasts. The processes favorable for the initiation, organization, and maintenance of these heavy-rain-producing MCSs are diagnosed by comparing ensemble members with accurate and inaccurate forecasts. Even within a synoptic environment known to be conducive to extreme local rainfall, there was considerable spread in the ensemble's rainfall predictions. Because all ensemble members included an anomalously moist environment, the precipitation predictions were insensitive to the atmospheric moisture. However, the development of heavy precipitation overnight was very sensitive to the intensity and evolution of convection the previous day. Convective influences on the strength of the MCV and its associated dome of cold air at low levels determined whether subsequent deep convection was initiated and maintained. In all, this ensemble provides quantitative and qualitative information about the mesoscale processes that are most favorable (or unfavorable) for localized extreme rainfall.

Marc Berenguer, Madalina Surcel, Isztar Zawadzki, Ming Xue, and Fanyou Kong

Abstract

This second part of a two-paper series compares deterministic precipitation forecasts from the Storm-Scale Ensemble Forecast System (4-km grid) run during the 2008 NOAA Hazardous Weather Testbed (HWT) Spring Experiment, and from the Canadian Global Environmental Multiscale (GEM) model (15 km), in terms of their ability to reproduce the average diurnal cycle of precipitation during spring 2008. Moreover, radar-based nowcasts generated with the McGill Algorithm for Precipitation Nowcasting Using Semi-Lagrangian Extrapolation (MAPLE) are analyzed to quantify the portion of the diurnal cycle explained by the motion of precipitation systems, and to evaluate the potential of the NWP models for very short-term forecasting.

The observed diurnal cycle of precipitation during spring 2008 is characterized by the dominance of the 24-h harmonic, which shifts with longitude, consistent with precipitation traveling across the continent. Time–longitude diagrams show that the analyzed NWP models partially reproduce this signal, but show more variability in the timing of initiation and in the zonal motion of the precipitation systems than is observed from radar.
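
The dominance of the 24-h harmonic can be quantified by a least-squares fit of the first diurnal harmonic at each longitude, as sketched below for a single toy hourly climatology; the synthetic cycle and its peak hour are illustrative.

```python
# A minimal sketch of extracting the 24-h harmonic of a diurnal precipitation
# cycle, assuming an hourly climatology (24 values) at one longitude band;
# the synthetic cycle below is illustrative.
import numpy as np

hours = np.arange(24)
rng = np.random.default_rng(6)
# Toy hourly climatology: late-afternoon peak plus noise.
cycle = 1.0 + 0.6 * np.cos(2 * np.pi * (hours - 17) / 24) + rng.normal(0, 0.05, 24)

# Least-squares fit of the first (24-h) harmonic: mean + a*cos + b*sin.
X = np.column_stack([np.ones(24),
                     np.cos(2 * np.pi * hours / 24),
                     np.sin(2 * np.pi * hours / 24)])
mean, a, b = np.linalg.lstsq(X, cycle, rcond=None)[0]
amplitude = np.hypot(a, b)
peak_hour = (np.arctan2(b, a) * 24 / (2 * np.pi)) % 24
print(f"amplitude={amplitude:.2f}, peak near hour {peak_hour:.1f}")
```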

Traditional skill scores show that radar data assimilation is the main reason for differences in model performance; the analyzed models that do not assimilate radar observations have very similar skill.

The analysis of MAPLE forecasts confirms that the motion of precipitation systems is responsible for the dominance of the 24-h harmonic in the longitudinal range 103°–85°W, where 8-h MAPLE forecasts initialized at 0100, 0900, and 1700 UTC successfully reproduce the eastward motion of rainfall systems. Also, on average, MAPLE outperforms the radar-data-assimilating models for the first 3–4 h after initialization, and the models without radar data assimilation for up to 5 h after initialization.

Fanyou Kong, Kelvin K. Droegemeier, and Nicki L. Hickmon

Abstract

Using a nonhydrostatic numerical model with horizontal grid spacing of 24 km and nested grids of 6- and 3-km spacing, the authors employ the scaled lagged average forecasting (SLAF) technique, developed originally for global and synoptic-scale prediction, to generate ensemble forecasts of a tornadic thunderstorm complex that occurred in north-central Texas on 28–29 March 2000. This is the first attempt, to their knowledge, to apply ensemble techniques to a cloud-resolving model using radar and other observations assimilated within non-horizontally uniform initial conditions and full model physics. The principal goal of this study is to investigate the viability of ensemble forecasting in the context of explicitly resolved deep convective storms, with particular emphasis on the potential value added by fine grid spacing and on probabilistic versus deterministic forecasts. Further, the authors focus on the structure and growth of errors as well as on the application of suitable quantitative metrics to assess forecast skill for highly intermittent phenomena at fine scales.

Because numerous strategies exist for linking multiple nested grids in an ensemble framework with none obviously superior, several are examined, particularly in light of how they impact the structure and growth of perturbations. Not surprisingly, forecast results are sensitive to the strategy chosen, and owing to the rapid growth of errors on the convective scale, the traditional SLAF methodology of age-based scaling is replaced by scaling predicated solely upon error magnitude. This modification improves forecast spread and skill, though the authors believe errors grow more slowly than is desirable.
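
One plausible reading of magnitude-based rescaling is sketched below, assuming lagged forecasts valid at the analysis time whose difference fields are rescaled to a common RMS amplitude; the target amplitude, grid, and toy fields are illustrative choices, not the authors' settings.

```python
# A minimal sketch of SLAF-style perturbations rescaled by error magnitude
# rather than by forecast age, assuming lagged forecasts valid at the
# analysis time; target RMS and toy fields are illustrative assumptions.
import numpy as np

def slaf_perturbation(lagged_fcst, analysis, target_rms):
    """Scale (lagged forecast - analysis) to a fixed RMS amplitude."""
    raw = lagged_fcst - analysis
    rms = np.sqrt(np.mean(raw ** 2))
    return raw * (target_rms / rms)

rng = np.random.default_rng(7)
analysis = rng.normal(size=(80, 80))
members = []
for lag in (3, 6, 9):  # hours of lag; older forecasts drift farther
    lagged = analysis + rng.normal(scale=0.1 * lag, size=(80, 80))
    pert = slaf_perturbation(lagged, analysis, target_rms=0.5)
    members.append(analysis + pert)  # perturbed initial condition
# All members end up equidistant (in RMS) from the analysis.
print([float(np.sqrt(np.mean((m - analysis) ** 2))) for m in members])
```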

For all three horizontal grid spacings utilized, ensembles show both qualitative and quantitative improvement relative to their respective deterministic control forecasts. Nonetheless, the evolution of convection at 24- and 6-km spacings is vastly different from, and arguably inferior to, that at 3 km because, at 24-km spacing, the model cannot explicitly resolve deep convection, while at 6 km the deep convection closure problem is ill posed and clouds are neither implicitly nor explicitly represented (even at 3-km spacing, updrafts and downdrafts are only marginally resolved). Despite their greater spatial fidelity, the 3-km grid spacing experiments are limited in that the ensemble mean reflectivity tends to be much weaker in intensity, and much broader in areal extent, than that of any single 3-km spacing forecast, owing to the amplitude reduction and spatial smearing that occur when averaging is applied to spatially intermittent phenomena. The ensemble means of accumulated precipitation, on the other hand, preserve peak intensity quite well.

Although a single case study obviously does not provide sufficient information with which to draw general conclusions, the results presented here, as well as those in Part II (which focuses solely on 3-km grid spacing experiments), suggest that even a small ensemble of cloud-resolving forecasts may provide greater skill, and greater practical value, than a single deterministic forecast using either the same or coarser grid spacing.

Aaron Johnson, Xuguang Wang, Ming Xue, and Fanyou Kong

Abstract

Twenty-member real-time convection-allowing storm-scale ensemble forecasts with perturbations to model physics, dynamics, initial conditions (IC), and lateral boundary conditions (LBC) during the NOAA Hazardous Weather Testbed Spring Experiment provide a unique opportunity to study the relative impact of different sources of perturbation on convection-allowing ensemble diversity. In Part II of this two-part study, systematic similarity/dissimilarity of hourly precipitation forecasts among ensemble members from the spring season of 2009 is identified using hierarchical cluster analysis (HCA) with the fuzzy object-based threat score (OTS) developed in Part I. In addition to precipitation, HCA is also performed on ensemble forecasts using the traditional Euclidean distance for wind speed at 10 m and 850 hPa, and temperature at 500 hPa.

At early lead times (3 h, valid at 0300 UTC), precipitation forecasts cluster primarily by data assimilation and model dynamic core, indicating a dominant impact of the model, with secondary clustering by microphysics. The planetary boundary layer (PBL) scheme has an increasing impact on clustering, relative to the microphysics scheme, at later lead times. Forecasts of 10-m wind speed cluster primarily by the PBL scheme at early lead times, with an increasing impact of LBC at later lead times. Forecasts of midtropospheric variables cluster primarily by IC at early lead times and by LBC at later lead times. The impact of radar and Mesonet data assimilation (DA) is evident through the 12-h lead time (valid at 1200 UTC) for both precipitation and nonprecipitation variables, with members lacking DA falling into a distinct cluster. The implications for optimal ensemble design for storm-scale forecasts are also discussed.
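
For orientation, HCA on a member-by-member dissimilarity matrix can be sketched as below; here a plain Euclidean distance between toy forecast fields stands in for the 1 - OTS dissimilarity used for precipitation, and the two synthetic "physics families" are an illustrative assumption.

```python
# A minimal sketch of hierarchical cluster analysis on a member-by-member
# dissimilarity matrix; Euclidean distance between toy fields stands in for
# the study's fuzzy-OTS-based dissimilarity, and names are illustrative.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(8)
# Toy 20-member ensemble: two physics "families" plus member-level noise.
base = rng.normal(size=(2, 500))
members = np.vstack([base[i % 2] + 0.3 * rng.normal(size=500) for i in range(20)])

# Pairwise dissimilarity matrix, condensed for scipy's linkage().
d = np.sqrt(((members[:, None, :] - members[None, :, :]) ** 2).sum(axis=-1))
tree = linkage(squareform(d, checks=False), method="average")
print(fcluster(tree, t=2, criterion="maxclust"))  # recovers the two families
```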
