  • Alexander, C., S. Weygandt, S. G. Benjamin, T. G. Smirnova, J. M. Brown, P. Hofmann, and E. James, 2011: The High Resolution Rapid Refresh (HRRR): Recent and future enhancements, time-lagged ensembling, and 2010 forecast evaluation activities. Proc. 24th Conf. on Weather and Forecasting/20th Conf. on Numerical Weather Prediction, Seattle, WA, Amer. Meteor. Soc., 12B.2. [Available online at https://ams.confex.com/ams/91Annual/webprogram/Paper183065.html.]

  • Augustine, J. A., and K. W. Howard, 1988: Mesoscale convective complexes over the United States during 1985. Mon. Wea. Rev., 116, 685–701, doi:10.1175/1520-0493(1988)116<0685:MCCOTU>2.0.CO;2.

  • Baldwin, M. E., and S. Lakshmivarahan, 2003: Development of an events oriented verification system using data mining and image processing algorithms. Preprints, Third Conf. on Artificial Intelligence, Long Beach, CA, Amer. Meteor. Soc., 4.6. [Available online at http://ams.confex.com/ams/pdfpapers/57821.pdf.]

  • Basara, J. B., J. N. Maybourn, C. M. Peirano, J. E. Tate, P. J. Brown, J. D. Hoey, and B. R. Smith, 2013: Drought and associated impacts in the Great Plains of the United States—A review. Int. J. Geosci., 4, 72–81, doi:10.4236/ijg.2013.46A2009.

  • Benjamin, S. G., S. Weygandt, T. G. Smirnova, M. Hu, S. E. Peckham, J. M. Brown, K. Brundage, and G. S. Manikin, 2009: Assimilation of radar reflectivity data using a diabatic digital filter: Applications to the Rapid Update Cycle and Rapid Refresh and initialization of High Resolution Rapid Refresh forecasts with RUC/RR grids. Preprints, 13th Conf. on Integrated Observing and Assimilation Systems for Atmosphere, Oceans, and Land Surface (IOAS-AOLS), Phoenix, AZ, Amer. Meteor. Soc., 7B.3. [Available online at https://ams.confex.com/ams/pdfpapers/150469.pdf.]

  • Benjamin, S. G., S. Weygandt, C. Alexander, J. M. Brown, T. G. Smirnova, P. Hofmann, E. James, and G. Dimego, 2011: NOAA’s hourly-updated 3km HRRR and RUC/Rapid Refresh—Recent (2010) and upcoming changes toward improving weather guidance for air-traffic management. Proc. Second Aviation, Range, and Aerospace Meteorology Special Symp. on Weather–Air Traffic Management Integration, Seattle, WA, Amer. Meteor. Soc., 3.2. [Available online at https://ams.confex.com/ams/91Annual/webprogram/Paper185659.html.]

  • Benjamin, S. G., and Coauthors, 2013: Data assimilation and model updates in the 2013 Rapid Refresh (RAP) and High-Resolution Rapid Refresh (HRRR) analysis and forecast systems. NCEP/EMC Meeting, Washington, DC, NCEP/EMC/Model Evaluation Group. [Available online at http://ruc.noaa.gov/pdf/NCEP_HRRR_RAPv2_6jun2013-Benj-noglob.pdf.]

  • Billet, J., M. DeLisi, B. G. Smith, and C. Gates, 1997: Use of regression techniques to predict hail size and the probability of large hail. Wea. Forecasting, 12, 154–164, doi:10.1175/1520-0434(1997)012<0154:UORTTP>2.0.CO;2.

  • Bryan, G. H., J. C. Wyngaard, and J. M. Fritsch, 2003: Resolution requirements for the simulation of deep moist convection. Mon. Wea. Rev., 131, 2394–2416, doi:10.1175/1520-0493(2003)131<2394:RRFTSO>2.0.CO;2.

  • Burghardt, B. J., C. Evans, and P. J. Roebber, 2014: Assessing the predictability of convection initiation in the high plains using an object-based approach. Wea. Forecasting, 29, 403–418, doi:10.1175/WAF-D-13-00089.1.

  • Caine, S., T. P. Lane, P. May, C. Jakob, S. T. Siems, M. J. Manton, and J. O. Pinto, 2013: Statistical assessment of tropical convection-permitting model simulations using a cell-tracking algorithm. Mon. Wea. Rev., 141, 557–581, doi:10.1175/MWR-D-11-00274.1.

  • Carbone, R. E., and J. D. Tuttle, 2008: Rainfall occurrence in the U.S. warm season: The diurnal cycle. J. Climate, 21, 4132–4146, doi:10.1175/2008JCLI2275.1.

  • Carbone, R. E., J. D. Tuttle, D. Ahijevych, and S. B. Trier, 2002: Inferences of predictability associated with warm season precipitation episodes. J. Atmos. Sci., 59, 2033–2056, doi:10.1175/1520-0469(2002)059<2033:IOPAWW>2.0.CO;2.

  • Clark, A. J., R. G. Bullock, T. L. Jensen, M. Xue, and F. Kong, 2014: Application of object-based time-domain diagnostics for tracking precipitation systems in convection allowing models. Wea. Forecasting, 29, 517–542, doi:10.1175/WAF-D-13-00098.1.

  • Coniglio, M. C., J. Y. Hwang, and D. J. Stensrud, 2010: Environmental factors in the upscale growth and longevity of MCSs derived from Rapid Update Cycle analyses. Mon. Wea. Rev., 138, 3514–3539, doi:10.1175/2010MWR3233.1.

  • Coniglio, M. C., J. Correia, P. T. Marsh, and F. Kong, 2013: Verification of convection-allowing WRF Model forecasts of the planetary boundary layer using sounding observations. Wea. Forecasting, 28, 842–862, doi:10.1175/WAF-D-12-00103.1.

  • Crowe, B. A., and D. W. Miller, 1999: The benefits of using NEXRAD vertically integrated liquid water as an aviation weather product. Preprints, Eighth Conf. on Aviation, Range, and Aerospace Meteorology, Dallas, TX, Amer. Meteor. Soc., 168–171.

  • Davis, C. A., B. G. Brown, and R. G. Bullock, 2006: Object-based verification of precipitation forecasts. Part II: Application to convective rain systems. Mon. Wea. Rev., 134, 1785–1795, doi:10.1175/MWR3146.1.

  • Dixon, M., and G. Wiener, 1993: TITAN: Thunderstorm identification, tracking, analysis and nowcasting—A radar-based methodology. J. Atmos. Oceanic Technol., 10, 785–797, doi:10.1175/1520-0426(1993)010<0785:TTITAA>2.0.CO;2.

  • Ebert, E. E., and J. L. McBride, 2000: Verification of precipitation in weather systems: Determination of systematic errors. J. Hydrol., 239, 179–202, doi:10.1016/S0022-1694(00)00343-7.

  • Ebert, E. E., and W. A. Gallus Jr., 2009: Toward better understanding of the contiguous rain area (CRA) method for spatial forecast verification. Wea. Forecasting, 24, 1401–1415, doi:10.1175/2009WAF2222252.1.

  • Evans, J. E., and E. R. Ducot, 2006: Corridor Integrated Weather System. MIT Lincoln Lab. J., 16, 59–80.

  • Gallus, W. A., Jr., 2010: Application of object-based verification techniques to ensemble precipitation forecasts. Wea. Forecasting, 25, 144–158, doi:10.1175/2009WAF2222274.1.

  • Geerts, B., 1998: Mesoscale convective systems in the southeast United States during 1994–95: A survey. Wea. Forecasting, 13, 860–869, doi:10.1175/1520-0434(1998)013<0860:MCSITS>2.0.CO;2.

  • Greene, D. R., and R. A. Clark, 1972: Vertically integrated liquid water—A new analysis tool. Mon. Wea. Rev., 100, 548–552, doi:10.1175/1520-0493(1972)100<0548:VILWNA>2.3.CO;2.

  • Hallowell, R. G., and Coauthors, 1999: The Terminal Convective Weather Forecast Demonstration. Preprints, Eighth Conf. on Aviation, Range, and Aerospace Meteorology, Dallas, TX, Amer. Meteor. Soc., 200–204.

  • Hoerling, M., and Coauthors, 2013: An interpretation of the origins of the 2012 Central Great Plains drought. NOAA Drought Task Force Assessment Rep., 50 pp. [Available online at ftp://ftp.oar.noaa.gov/CPO/pdf/mapp/reports/2012-Drought-Interpretation-final.web-041113.pdf.]

  • Hogan, R. J., E. J. O’Connor, and A. J. Illingworth, 2009: Verification of cloud fraction forecasts. Quart. J. Roy. Meteor. Soc., 135, 1494–1511, doi:10.1002/qj.481.

  • Houze, R. A., Jr., 1993: Cloud Dynamics. Academic Press, 573 pp.

  • Jirak, I. L., W. R. Cotton, and R. L. McAnelly, 2003: Satellite and radar survey of mesoscale convective system development. Mon. Wea. Rev., 131, 2428–2449, doi:10.1175/1520-0493(2003)131<2428:SARSOM>2.0.CO;2.

  • Johnson, A., and X. Wang, 2013: Object-based evaluation of a storm-scale ensemble during the 2009 NOAA Hazardous Weather Testbed Spring Experiment. Mon. Wea. Rev., 141, 1079–1098, doi:10.1175/MWR-D-12-00140.1.

  • Johnson, A., X. Wang, F. Kong, and M. Xue, 2013: Object-based evaluation of the impact of horizontal grid spacing on convection-allowing forecasts. Mon. Wea. Rev., 141, 3413–3425, doi:10.1175/MWR-D-13-00027.1.

  • Jolliffe, I. T., 2008: The impenetrable hedge: A note on propriety, equitability and consistency. Meteor. Appl., 15, 25–29, doi:10.1002/met.60.

  • Kitzmiller, D. H., W. E. McGovern, and R. F. Saffle, 1995: The WSR-88D severe weather potential algorithm. Wea. Forecasting, 10, 141–159, doi:10.1175/1520-0434(1995)010<0141:TWSWPA>2.0.CO;2.

  • Koch, S. E., B. S. Ferrier, J. S. Kain, M. T. Stoelinga, E. J. Szoke, and S. J. Weiss, 2005: The use of simulated radar reflectivity fields in the diagnosis of mesoscale phenomena from high-resolution WRF Model forecasts. Preprints, 11th Conf. on Mesoscale Processes/32nd Conf. on Radar Meteorology, Albuquerque, NM, Amer. Meteor. Soc., J4J.7. [Available online at https://ams.confex.com/ams/pdfpapers/97032.pdf.]

  • Krozel, J., J. S. B. Mitchell, V. Polishchuk, and J. Prete, 2007: Maximum flow rates for capacity estimation in level flight with convective weather constraints. Air Traffic Control Quart., 15, 209–238.

  • Lack, S., G. J. Limpert, and N. I. Fox, 2010: An object-oriented multiscale verification scheme. Wea. Forecasting, 25, 79–92, doi:10.1175/2009WAF2222245.1.

  • Parker, M. D., and R. H. Johnson, 2000: Organizational modes of midlatitude mesoscale convective systems. Mon. Wea. Rev., 128, 3413–3436, doi:10.1175/1520-0493(2001)129<3413:OMOMMC>2.0.CO;2.

  • Parker, M. D., and D. Ahijevych, 2007: Convective episodes in the east-central United States. Mon. Wea. Rev., 135, 3707–3727, doi:10.1175/2007MWR2098.1.

  • Pinto, J. O., C. Phillips, M. Steiner, R. Rasmussen, N. Oien, M. Dixon, W. Wang, and M. Weisman, 2007: Assessment of the statistical characteristics of thunderstorms simulated with the WRF Model using convection permitting resolution. 33rd Int. Conf. on Radar Meteorology, Cairns, QLD, Australia, 5.5. [Available online at https://ams.confex.com/ams/pdfpapers/123712.pdf.]

  • Pinto, J. O., W. Dupree, S. Weygandt, M. Wolfson, S. Benjamin, and M. Steiner, 2010: Advances in the Consolidated Storm Prediction for Aviation (CoSPA). Preprints, 14th Conf. on Aviation, Range, and Aerospace Meteorology, Atlanta, GA, Amer. Meteor. Soc., J11.2. [Available online at https://ams.confex.com/ams/pdfpapers/163811.pdf.]

  • Pinto, J. O., J. A. Grim, D. Ahijevych, and M. Steiner, 2013: An automated system for detecting large-scale convective storms: Application to model evaluation. Proc. 16th Conf. on Aviation, Range, and Aerospace Meteorology, Austin, TX, Amer. Meteor. Soc., 9.4A. [Available online at https://ams.confex.com/ams/93Annual/webprogram/Paper222079.html.]

  • Schwartz, C. S., and Coauthors, 2009: Next-day convection-allowing WRF Model guidance: A second look at 2-km versus 4-km grid spacing. Mon. Wea. Rev., 137, 3351–3372, doi:10.1175/2009MWR2924.1.

  • Skamarock, W. C., and J. B. Klemp, 2008: A time-split nonhydrostatic atmosphere model for weather research and forecasting applications. J. Comput. Phys., 227, 3465–3485, doi:10.1016/j.jcp.2007.01.037.

  • Smalley, D. J., and B. J. Bennett, 2002: Using ORPG to enhance NEXRAD products to support FAA critical systems. Preprints, 10th Conf. on Aviation, Range, and Aerospace Meteorology, Portland, OR, Amer. Meteor. Soc., 3.6. [Available online at https://ams.confex.com/ams/pdfpapers/38861.pdf.]

  • Steiner, M., R. Bateman, D. Megenhardt, Y. Liu, M. Xu, M. Pocernich, and J. Krozel, 2010: Translation of ensemble weather forecasts into probabilistic air traffic capacity impact. Air Traffic Control Quart., 18, 229–254.

  • Stensrud, D. J., and Coauthors, 2009: Convective-scale warn-on-forecast: A vision for 2020. Bull. Amer. Meteor. Soc., 90, 1487–1499, doi:10.1175/2009BAMS2795.1.

  • Stensrud, D. J., and Coauthors, 2013: Progress and challenges with warn-on-forecast. Atmos. Res., 123, 2–16, doi:10.1016/j.atmosres.2012.04.004.

  • Stratman, D. R., M. C. Coniglio, S. E. Koch, and M. Xue, 2013: Use of multiple verification methods to evaluate forecasts of convection from hot- and cold-start convection allowing models. Wea. Forecasting, 28, 119–138, doi:10.1175/WAF-D-12-00022.1.

  • Thompson, G., P. R. Field, R. M. Rasmussen, and W. D. Hall, 2008: Explicit forecasts of winter precipitation using an improved bulk microphysics scheme: Part II: Implementation of a new snow parameterization. Mon. Wea. Rev., 136, 5095–5115, doi:10.1175/2008MWR2387.1.

  • Weisman, M. L., W. C. Skamarock, and J. B. Klemp, 1997: The resolution dependence of explicitly modeled convective systems. Mon. Wea. Rev., 125, 527–548, doi:10.1175/1520-0493(1997)125<0527:TRDOEM>2.0.CO;2.

  • Weygandt, S., and Coauthors, 2011: The Rapid Refresh—Replacement for the RUC, pre-implementation development and evaluation. Proc. 24th Conf. on Weather and Forecasting/20th Conf. on Numerical Weather Prediction, Seattle, WA, Amer. Meteor. Soc., 12B.1. [Available online at https://ams.confex.com/ams/91Annual/webprogram/Paper183027.html.]

  • Wilks, D. S., 2006: Statistical Methods in the Atmospheric Sciences. 2nd ed. Academic Press, 648 pp.

Fig. 1. Observed and modeled frequencies of VIL exceeding either 1.5 or 3.5 kg m⁻² as a function of lead time for the eastern United States (i.e., east of 105°W) for JJA 2012.

Fig. 2. Schematic representation of the MCS-I identification method. Red, blue, and magenta indicate the current, 1-h-previous, and 2-h-previous locations of the MCS, respectively. Colors from more recent times overlap those from previous times. Filled contour underlay gives the VIL obtained 1 h before present. The ellipses indicate the approximate search area for each MCS identified at current time with the orange ellipses indicating the MCS-I events and the cyan ellipses indicating which MCSs are not considered MCS-Is. Each current MCS is also numbered to aid discussion in the text.

Fig. 3. Observed seasonal mean MCS occurrence rates (No. per week) for (a) all times, (b) daytime (1100–2000 LST), and (c) nighttime (2100–1100 LST) for JJA (left) 2012 and (right) 2013. Labeled bands and boxes indicate regions used in additional analyses described in the text.

Fig. 4. Normalized diurnal cycle of MCSs by region (GP, MRV, and APP bands, as well as two smaller boxes for SE and TX, as shown in Fig. 2) for JJA 2012. Total numbers of MCS events for each region are listed in the legend.

Fig. 5. Maps of (a) observed MCS-I frequency (No. per week per unit area for the actual land area within a 500 km² search box) and (b) median time of MCS-I event during (left) 2012 and (right) 2013. The frequency is computed by summing over all times of day. The median time of MCS-I is only computed at points with at least one MCS-I per week per unit land area.

Fig. 6. Sensitivity of the MCS size distribution to choice of VIL threshold using 6-h forecasts and observations collected during 6–12 Jul 2012. The black line depicts the observed MCS size distribution obtained using a threshold of 3.5 kg m⁻².

Fig. 7. HRRR 6-h forecast valid at 0500 UTC 9 Jul 2013 of (a) column max reflectivity (dBZ) and (b) VIL (kg m⁻²) derived from modeled radar reflectivity. The white contours indicate storms identified as MCSs using a VIL threshold of 1.75 kg m⁻². Only the large area of convection over southwestern North Dakota would have been classified as an MCS using the standard VIL threshold of 3.5 kg m⁻² despite the fact that the modeled composite reflectivity within each MCS contour exceeds 35 dBZ.

Fig. 8. Modeled minus observed MCS frequency vs lead time obtained using optimized thresholds for (a) daytime (1100–2000 LST) and (b) nighttime (2100–1100 LST) for three longitudinal bands (GP, MRV, and APP), and two regions (TX and SE) during (left) 2012 and (right) 2013. The actual observed MCS frequencies (No. per week) for each region and time period are given in the legend.

Fig. 9. Frequency of modeled and observed MCSs for the (a) GP, (b) MRV, and (c) APP longitudinal belts for JJA (left) 2012 and (right) 2013. Frequencies are found by dividing the MCS count found at a given valid time by the total number of MCSs detected in the observations or the model forecast (for a given lead time) during the 92-day period.

Fig. 10. Model 6-h forecasted MCS occurrence rate minus the observed occurrence rate for (a) all times, (b) daytime (1100–2000 LST), and (c) nighttime (2000–1100 LST) for JJA (left) 2012 and (right) 2013. Black box denotes central MRV region used in Fig. 18.

Fig. 11. Maps of (a) seasonal mean MCS-I occurrence rate obtained from 6-h forecasts and (b) median time of MCS-I in the 6-h forecasts for JJA (left) 2012 and (right) 2013. The median time of MCS-I is only computed at points with at least one MCS-I per week per 500 km². The VIL threshold for identifying MCSs is shown at the top right.

Fig. 12. Schematic demonstrating how contingency tables are created. The red-filled contours depict the truth field (given a value of 2) and the yellow-filled contours represent a 50-km expansion of the truth field (given a value of 1). The dark blue contours indicate the model-predicted MCS locations (=2) and the light blue contour indicates the model-predicted MCS locations extended by 50 km (=1). The grid lines represent a 1° grid, with each grid square being classified as either a hit (indicated by an “H”), miss (indicated by an “M”), false alarm (indicated by an “F”), or correct null (indicated by an “N”) based on the classification method shown in Table 4.

Fig. 13. Model bias as a function of lead time and valid time of day in the (a) eastern U.S. and the (b) SE and (c) GP domains during (left) 2012 and (right) 2013. Solid curves indicate relative variations in observed MCS frequency with time of day for a given region (the max observed frequency is given for each region). Dashed diagonal lines indicate the lead time/valid time coordinates of the 0000 UTC forecasts.

Fig. 14. Model performance as a function of lead time and valid time of day as indicated by (a) SEDS, (b) POD, and (c) FAR for the eastern U.S. domain during (left) 2012 and (right) 2013. Dashed diagonal lines indicate the location of lead times associated with the 0000 UTC forecasts.

Fig. 15. As in Fig. 14, but for the SE domain.

Fig. 16. As in Fig. 14, but for the GP domain.

Fig. 17. Distributions of modeled (red) and observed (black) storm sizes occurring within regions bound by 29.5°N, 86.5°W and 33.5°N, 80.5°W (a subregion within the SE region) that were obtained for a subset of MCS-I events observed in 2013 in which max dimension of the modeled convective storms failed to reach 100 km. Note that the modeled storms larger than 100 km that were present in the subregion during an observed MCS-I event did not initiate within ±2 h of the observed event.

Fig. 18. Total number of modeled MCSs (3–14-h lead times of forecasts issued every even hour are shown, with the line for the 0000 LST run being blue and more reddish and brown colors being forecasts issued later in the day) and observed (connected dots) MCSs obtained during JJA 2013 for the central MRV region denoted in Fig. 10.


Assessment of the High-Resolution Rapid Refresh Model’s Ability to Predict Mesoscale Convective Systems Using Object-Based Evaluation

  • 1 Research Applications Laboratory, National Center for Atmospheric Research,* Colorado

Abstract

An object-based verification technique that keys off the radar-retrieved vertically integrated liquid (VIL) is used to evaluate how well the High-Resolution Rapid Refresh (HRRR) predicted mesoscale convective systems (MCSs) in 2012 and 2013. It is found that the modeled radar VIL values are roughly 50% lower than observed. This mean bias is accounted for by reducing the radar VIL threshold used to identify MCSs in the HRRR, which allows for a fairer evaluation of the model’s skill at predicting MCSs. Using an optimized VIL threshold for each summer, it is found that the HRRR reproduces the first (i.e., counts) and second moments (i.e., size distribution) of the observed MCS size distribution averaged over the eastern United States, as well as their aspect ratio, orientation, and diurnal variations. Despite threshold optimization, the HRRR tended to predict too many (few) MCSs at lead times less (greater) than 4 h because of lead time–dependent biases in the modeled radar VIL. The HRRR predicted too many MCSs over the Great Plains and too few MCSs over the southeastern United States during the day. These biases are related to the model’s tendency to initiate too many MCSs over the Great Plains and too few MCSs over the southeastern United States. Additional low biases found over the Mississippi River valley region at night revealed a tendency for the HRRR to dissipate MCSs too quickly. The skill of the HRRR at predicting specific MCS events increased between 2012 and 2013, coinciding with changes in both the model physics and the methods used to assimilate the three-dimensional radar reflectivity.

The National Center for Atmospheric Research is sponsored by the National Science Foundation.

Corresponding author address: James O. Pinto, Research Applications Laboratory, National Center for Atmospheric Research, 3450 Mitchell Ln., Boulder, CO 80301. E-mail: pinto@ucar.edu


1. Introduction

The improved prediction of mesoscale convective systems (MCSs) can enhance public safety and improve the decision-making of a number of industries (such as aviation, construction, utility and road maintenance, and outdoor recreation) that require timely and accurate forecasts of high-impact weather. Many of these industries could use more specific forecast information regarding the potential properties of convective storms. For example, aviation planners need information related to the size, orientation, and organization of convective areas to more efficiently route air traffic across the United States (e.g., Krozel et al. 2007; Steiner et al. 2010). In addition, understanding the performance characteristics of high-resolution ensemble forecasts of convection and their intrinsic properties is an important component of the National Weather Service’s (NWS) warn-on-forecast vision (Stensrud et al. 2009, 2013).

High-resolution models continue to evolve as computational resources grow, higher-resolution observations become available, and the representation of physical processes and data assimilation techniques improve. As model resolution increases, more detail about the forecasted storm properties becomes available. Understanding how well storm features, such as their size and organization, are reproduced is key for continued model development efforts and for the interpretation of model forecasts.

During the late 1980s, feature identification techniques were developed and applied to observational datasets. Augustine and Howard (1988) developed a method for identifying MCSs in satellite data and others have done the same for radar (e.g., Dixon and Wiener 1993; Jirak et al. 2003). More recently, feature-based techniques have been used in the evaluation of numerical models. Some of the first automated, feature-based, model evaluation studies were performed by Ebert and McBride (2000), with a proliferation of various implementations since, including the evaluation of high-resolution deterministic forecasts (e.g., Baldwin and Lakshmivarahan 2003; Davis et al. 2006; Lack et al. 2010; Ebert and Gallus 2009; Caine et al. 2013) and, more recently, high-resolution model ensembles (Gallus 2010; Johnson and Wang 2013; Johnson et al. 2013).

The time dimension is less frequently included in feature-based forecast verification methods. Davis et al. (2006) used feature-based verification software called the Method for Object-Based Diagnostic Evaluation (MODE) to assess how well the Weather Research and Forecasting (WRF) Model, version 1.3 (WRF1.3; Skamarock and Klemp 2008), predicts the spatiotemporal evolution of rain systems exceeding 400 km² in area. They found that, generally, WRF1.3 produced too many large storms (maximum dimension L > 120 km) and that the predicted storms lasted longer than observed. They also found evidence that storm timing (given by the difference in modeled and observed storm centroid times) was delayed in the model, possibly because of their slow formation. Using an object-based technique, Burghardt et al. (2014) found that a subkilometer-resolution implementation of WRF, version 3.3 (WRF3.3), tended to initiate too many storms over the high plains in an area of complex terrain. Clark et al. (2014) evaluated the ability of four members of the Center for Analysis and Prediction of Storms (CAPS) high-resolution ensemble with varying treatments of microphysics to predict the timing and evolution of rain systems using a modified version of MODE that included the time dimension. Their results demonstrated that predictions of the onset and dissipation of rain objects (including both mesoscale convective systems and smaller, less-organized areas of convection) were highly sensitive to the treatment of cloud and precipitation physics.

In the present study, storm-tracking software developed for radar analysis and forecasting has been modified to evaluate model skill at predicting the occurrence, characteristics, and timing (especially the onset) of MCSs. A key aspect of this work, which is distinct from most other object-based model evaluations, is that we adjust the threshold used to identify storm objects in the modeled vertically integrated liquid (VIL) field to remove mean bias prior to performing the model evaluation. We also extend recent object-based verification work by assessing regional variations in model performance. The datasets used in this study are described in section 2. The object-based verification technique is described in section 3, with results of its application on radar data given in section 4. In section 5, we describe a method in which the threshold used to identify MCSs in the model is optimized in order to more fairly evaluate the skill of a high-resolution numerical weather prediction model whose performance is charted over two summers. Finally, the results are summarized and put into context in section 6.
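The idea of lowering the model's object-identification threshold to offset a mean VIL bias can be illustrated with simple frequency matching. The sketch below is not the optimization used in this study (which is described in section 5); it only assumes that one picks the model threshold whose exceedance frequency equals the observed exceedance frequency at 3.5 kg m⁻². The function names and the sample values are hypothetical and purely illustrative.

```python
def exceedance_fraction(values, threshold):
    """Fraction of grid values at or above the threshold."""
    return sum(v >= threshold for v in values) / len(values)


def matched_threshold(obs_vil, model_vil, obs_threshold=3.5):
    """Return the model VIL threshold whose exceedance frequency matches
    the observed exceedance frequency at obs_threshold (frequency matching).

    Returns None when no observed value reaches obs_threshold.
    """
    target = exceedance_fraction(obs_vil, obs_threshold)
    ranked = sorted(model_vil, reverse=True)
    n_exceed = round(target * len(ranked))
    if n_exceed == 0:
        return None
    # The n_exceed-th largest model value is the lowest value that still
    # has to count as an exceedance for the frequencies to agree.
    return ranked[n_exceed - 1]


# Illustrative values only (kg m^-2): a "model" field that is uniformly
# 50% lower than the "observed" field.
obs = [0.0, 1.0, 2.0, 4.0, 5.0, 6.0, 0.0, 0.0, 7.0, 8.0]
model = [0.5 * v for v in obs]
print(matched_threshold(obs, model))
```

With a uniformly low-biased field, the matched threshold falls below the observed one, so the same fraction of grid points is flagged in both fields. A real application would also need to handle lead-time dependence of the bias, which this sketch ignores.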

2. Datasets

a. Radar dataset description

Vertically integrated liquid derived from radar data is often used by NWS meteorologists to detect areas of intense, possibly hail-containing convection across the country (e.g., Kitzmiller et al. 1995; Billet et al. 1997). Vertically integrated liquid is also used by aviation meteorologists and aviation planners to identify areas of convection that are hazardous for flight. In this study, we use VIL data from the Corridor Integrated Weather System (CIWS; Crowe and Miller 1999; Evans and Ducot 2006), which was developed for tactical decision-making by the aviation industry. The VIL field developed for CIWS is generated following the method described by Smalley and Bennett (2002), in which the reflectivity from each sweep of each available radar is first quality controlled and then converted to VIL, using the empirical relationship developed by Greene and Clark (1972):
$$\mathrm{VIL} = \sum_{i=1}^{N} 3.44 \times 10^{-6}\, Z_e^{4/7}\, dh, \qquad (1)$$
where $Z_e$ is the equivalent reflectivity factor, $dh$ is the depth of each radar volume, and $N$ is the number of radar volumes. Unlike other VIL products [e.g., the multiradar/multisensor (MRMS) method uses a maximum reflectivity of 56 dBZ in the VIL calculation], the layer mean reflectivity is not capped at any value. The two-dimensional VIL field obtained from each radar is then combined by advecting VIL to a common time and then taking the maximum plausible VIL value in areas where the radar swaths overlap. The VIL map includes radar data collected from each of the WSR-88D network radars (134 NEXRADs) operated by the NWS, 11 Terminal Doppler Weather Radars (TDWRs) operated by the Federal Aviation Administration (FAA), and five Canadian C-band (5 cm) radars north of the Canadian–U.S. border that are operated by Environment Canada. The resulting VIL dataset, which spans the entire United States and the southern Canadian provinces, and extends roughly 100 km offshore, has a horizontal resolution of 1 km, and is updated every 5 min. For this study, only hourly data are used. In addition, only data collected over the United States are used in our analyses, because the Canadian radars tend to attenuate rapidly in the heavy precipitation events that are of interest in this study. Gaps in radar coverage over the western United States limit the utility of the observational dataset in this region; thus, much of the analysis herein focuses on the eastern two-thirds of the country.
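As a rough illustration of the reflectivity-to-VIL conversion and the overlap rule described above, the following sketch applies the commonly quoted Greene and Clark (1972) form, VIL = Σ 3.44 × 10⁻⁶ Ze^(4/7) dh, with Ze in mm⁶ m⁻³, dh in m, and VIL in kg m⁻². It assumes layer-mean reflectivities are supplied in dBZ with no cap, omits quality control and advection to a common time, and reduces "maximum plausible value" to a plain maximum. The function names are hypothetical and this is not the CIWS implementation.

```python
def vil_from_layers(dbz_layers, depths_m):
    """Vertically integrated liquid (kg m^-2) from layer-mean reflectivities.

    dbz_layers: layer-mean reflectivity of each radar volume, in dBZ
    depths_m:   depth of each radar volume, in metres
    """
    if len(dbz_layers) != len(depths_m):
        raise ValueError("need one depth per reflectivity layer")
    total = 0.0
    for dbz, dh in zip(dbz_layers, depths_m):
        ze = 10.0 ** (dbz / 10.0)  # dBZ -> linear Ze (mm^6 m^-3), uncapped
        total += 3.44e-6 * ze ** (4.0 / 7.0) * dh
    return total


def composite_max(vil_fields):
    """Cell-wise maximum over per-radar VIL fields (flat lists of equal length).

    None marks a cell outside a radar's coverage; a cell seen by no radar
    stays None in the composite.
    """
    composite = []
    for cells in zip(*vil_fields):
        valid = [c for c in cells if c is not None]
        composite.append(max(valid) if valid else None)
    return composite


# A single 1-km-deep layer at 40 dBZ gives roughly 0.66 kg m^-2.
print(round(vil_from_layers([40.0], [1000.0]), 2))
print(composite_max([[1.0, None, 3.0], [2.0, None, 1.0]]))
```

Because the exponent is 4/7, a 10-dB increase in reflectivity raises the layer contribution by a factor of about 3.7, which is why capping reflectivity (as some other VIL products do) can noticeably change VIL in intense cores.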

b. Model dataset description

The High-Resolution Rapid Refresh (HRRR) model has been under development at the National Oceanic and Atmospheric Administration/Earth System Research Laboratory/Global Systems Division (NOAA/ESRL/GSD) since 2008 (Benjamin et al. 2009, 2011). Model outputs used in this study are from the summer months [June, July, and August (JJA)] of 2012 and 2013. The HRRR was run as part of the FAA’s summer evaluation (Alexander et al. 2011), serving as the model forecast input field for Consolidated Storm Prediction for Aviation (CoSPA; Pinto et al. 2010) forecasts. The HRRR was run once per hour at a convection-permitting grid spacing of 3 km. While this grid spacing is too coarse to resolve all scales of motion that are important in the development, organization, and evolution of convective storms (Bryan et al. 2003), it is believed that the larger-scale convective features of MCSs are well resolved when using this grid spacing (e.g., Weisman et al. 1997; Schwartz et al. 2009).

The data assimilation and physics of the HRRR and its parent model were modified between 2012 and 2013. The model configurations utilized during these two years are listed in Table 1. The 2013 configuration of the HRRR became operational at NCEP in September 2014. During both years, the HRRR was initialized and driven at the lateral boundaries using version 2 of the 13-km Rapid Refresh (RAP; Weygandt et al. 2011) model run in experimental mode at NOAA/ESRL. Both the RAP model and the HRRR use Gridpoint Statistical Interpolation analysis system (GSI) based three-dimensional variational data assimilation (3DVAR) to assimilate a wealth of observational data (Benjamin et al. 2013). However, two changes expected to have a positive impact on the simulation of convection were made between 2012 and 2013. First, the PBL scheme was changed from Mellor–Yamada–Janjić (MYJ) in 2012 to Mellor–Yamada–Nakanishi–Niino (MYNN) in 2013. This change was expected to improve the simulation of the boundary layer, as found by Coniglio et al. (2013) for simulations with convection-permitting resolution. Second, three-dimensional radar reflectivity data,1 which were only assimilated into RAP in 2012, were assimilated into both the RAP and the 3-km HRRR in 2013. This change was expected to improve the initialization of storm information on the 3-km grid, thus providing a more accurate depiction of ongoing convection at the earlier forecast lead times. We present evidence in section 5 that this is indeed the case.

Table 1.

HRRR configurations used in 2012 and 2013. Note that RAP, the parent model of the HRRR, was also updated. Details of these updates may be found in Benjamin et al. (2013).

Because VIL is used to identify MCSs in the model, the treatment of microphysics and the subsequent calculation of radar reflectivity are critical. The cloud microphysics scheme of Thompson et al. (2008) is used to simulate cloud and precipitation processes. Profiles of snow, rain, and graupel produced by the Thompson et al. microphysics scheme are converted to a reflectivity factor using the radar equation and assumptions pertaining to the shape of the size distribution of each precipitation type [see Koch et al. (2005) for details]. Profiles of reflectivity are then used to estimate the “radar retrieved” VIL using the empirical relationship developed for WSR-88D NEXRAD data following (1). While the equation used to calculate VIL from radar reflectivity data in the model is the same as that used on observed reflectivity, no attempt has been made to account for the impact of radar coverage gaps in the calculation of the simulated reflectivity field obtained from the model. Despite this simplifying assumption (which would lead to overestimates of radar-retrieved VIL), the model actually underforecasts VIL. In fact, the frequency of VIL values exceeding 3.5 kg m−2 is underforecasted by as much as a factor of 5, especially at the longer lead times (Fig. 1). Thus, the expected overprediction of VIL that would result from neglecting the impact of radar coverage gaps in the model-derived VIL is masked by biases associated with assumptions in the microphysics parameterization and subsequent computations of radar reflectivity that contribute to the underprediction of VIL.

Fig. 1.

Observed and modeled frequencies of VIL exceeding either 1.5 or 3.5 kg m−2 as a function of lead time for the eastern United States (i.e., east of 105°W) for JJA 2012.

Citation: Weather and Forecasting 30, 4; 10.1175/WAF-D-14-00118.1

3. Storm identification

The object-based technique used in this study was adapted from the Thunderstorm Identification Tracking Analysis and Nowcasting (TITAN) software, developed by Dixon and Wiener (1993), which was originally designed to identify and track convective storms in the reflectivity field of an individual radar. The software provides several characteristics of each storm it identifies including location, orientation, size, aspect ratio, storm trends, and motion. The TITAN software has since been adapted to work on radar mosaic data and, more recently, model output (Pinto et al. 2007; Caine et al. 2013). In this study, the adapted TITAN software is used to assess the timing and location of MCSs in both radar mosaic data and HRRR forecasts. Convective objects are identified in the observed VIL field using a threshold of 3.5 kg m−2. This VIL threshold, which is typically used to identify convection, corresponds to column maximum reflectivity values ranging from 30 to 40 dBZ, following Hallowell et al. (1999). To be considered an MCS, the convective object (which may include gaps between convective cells of up to 20 km) must have a maximum dimension (measured along the object’s major axis) exceeding 100 km for at least two consecutive hours. This MCS definition is similar to that used in previous studies to investigate the occurrence of MCSs (e.g., Houze 1993; Geerts 1998; Parker and Johnson 2000; Coniglio et al. 2010), except that our definition includes less-organized convective areas or clusters of convective cells that are typical of the southeastern United States.
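The length and duration criteria can be sketched as follows. This is a minimal illustration, not the TITAN implementation: it assumes that an object has already been tracked hour by hour and that its major-axis length (with gaps of up to 20 km already bridged) is available at each hour. The function name is hypothetical, and flagging every hour of a qualifying run is our assumption.

```python
def mcs_hours(major_axis_km, min_length_km=100.0, min_hours=2):
    """Flag each hour at which a tracked object meets the MCS size/duration test."""
    long_enough = [length > min_length_km for length in major_axis_km]
    flags = [False] * len(long_enough)
    run_start = None
    # A trailing False acts as a sentinel that closes a run ending at the last hour.
    for i, ok in enumerate(long_enough + [False]):
        if ok and run_start is None:
            run_start = i
        elif not ok and run_start is not None:
            if i - run_start >= min_hours:
                for j in range(run_start, i):
                    flags[j] = True
            run_start = None
    return flags
```

For example, an object with hourly lengths of 80, 120, 130, 90, 110, and 95 km would qualify only during its second and third hours, since the later 110-km hour is isolated.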

In this study, the initiation of an MCS is defined as occurring if no preexisting MCSs are present within the last 2 h and within a search radius of 125 km that extended from all edges of the MCS in question. The search radius is based on the median speed of MCSs reported in the literature (e.g., Carbone et al. 2002) and comparable to the separation distance of 144 km used by Davis et al. (2006) to distinguish separate convective storm objects. The use of look-back times of 1 and 2 h to identify MCS initiation (MCS-I) is similar to the 1–2-h MCS “genesis” stage described by Coniglio et al. (2010) during which time storm cells may first merge to form a near-continuous line of convection exceeding 100 km in length. Figure 2 illustrates the procedure used to identify MCS-I events. In this example, five MCS objects were identified at the analysis time. Objects 4 and 5 had clearly existed previously, as indicated by the blue and magenta objects from 1 and 2 h in the past, respectively. Farther west, objects 1 and 2 had no preexisting MCSs within 125 km in the past 2 h and are classified as MCS-I events. The convective storms evident to the east and southeast of objects 4 and 5 in the VIL field were too small and too far apart to be classified as MCSs. Finally, object 3, which should have been classified as an MCS-I event, was not identified as such because it formed within 125 km of object 4 that was present 2 h earlier. This missed MCS-I event represents a flaw in the algorithm; however, this situation was found to occur less than 10% of the time in a comparison with manually classified MCS-Is. Furthermore, it is expected that since the same technique is used on both observations and model data, the overall impact on the comparisons will be small.
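The look-back test described above can be sketched in a few lines. The sketch assumes each MCS object is represented by a list of (x, y) boundary points in kilometers on a locally flat grid, and it approximates the edge-to-edge separation by the smallest vertex-to-vertex distance. Both function names are hypothetical.

```python
import math


def edge_distance_km(obj_a, obj_b):
    """Smallest distance (km) between any two boundary points of two objects."""
    return min(math.dist(p, q) for p in obj_a for q in obj_b)


def is_mcs_initiation(current_obj, earlier_objs, radius_km=125.0):
    """True if no MCS from the look-back window lies within radius_km.

    earlier_objs: all MCS objects identified 1 h and 2 h before the analysis time.
    """
    return all(edge_distance_km(current_obj, old) > radius_km
               for old in earlier_objs)
```

Under this test, a genuinely new MCS that forms within 125 km of an older one is not flagged, which is the behavior noted for object 3 in Fig. 2.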

Fig. 2.

Schematic representation of the MCS-I identification method. Red, blue, and magenta indicate the current, 1-h-previous, and 2-h-previous locations of the MCS, respectively. Colors from more recent times overlap those from previous times. Filled contour underlay gives the VIL obtained 1 h before present. The ellipses indicate the approximate search area for each MCS identified at current time with the orange ellipses indicating the MCS-I events and the cyan ellipses indicating which MCSs are not considered MCS-Is. Each current MCS is also numbered to aid discussion in the text.

4. Observed MCS properties

The algorithm described above was used to detect MCSs and MCS-I events in the observed VIL mosaic obtained from CIWS for the summers of 2012 and 2013. Table 2 lists the mean macroscale properties of MCSs observed during these two summers. There was a 15% increase in the number of MCS events observed between 2012 and 2013, but their median size, shape, and orientation were essentially unchanged (<4% change). The median orientation, given as the angle at which the major axis of the storm is oriented clockwise from due north (values 0°–180°), ranged between 75° and 80° (indicating that the MCSs tended to extend from the west-southwest to east-northeast). The aspect ratio, given as the ratio of the MCS major radius to its minor radius, indicates that the length of MCSs was 3 times the width. MCS-I events were recorded about 10 times less often than MCS occurrences in both years. This large difference between MCS and MCS-I counts arises because a single MCS can be counted multiple times (once for each hour of its existence) over the course of its life cycle, whereas each MCS-I is treated as a singular event. Sensitivity studies, in which each algorithmic parameter was separately varied, revealed that MCS and MCS-I counts were most sensitive to the VIL threshold and permissible gap size, with the MCS counts being more sensitive than the MCS-I counts to changes in the MCS-defining characteristics. A 1 kg m−2 increase in VIL threshold resulted in a 40% (30%) decrease in the number of MCSs (MCS-Is), while a 10-km increase in permissible gap size resulted in a 100% (50%) increase in the number of MCSs (MCS-Is). The sensitivity of the observed MCS counts to VIL threshold reveals the importance of accounting for mean bias offsets between the modeled and observed VIL fields prior to evaluating model skill.

Table 2.

MCS characteristics observed for the summers of 2012 and 2013.

Maps of the observed MCS frequency of occurrence were generated using 1° grid boxes (Fig. 3). For each MCS detected, the grid boxes containing any part of the MCS were determined and the MCS count in each of these grid boxes was increased by one. The frequency is then determined by dividing the MCS count by the number of observation times in the period of interest. The frequency is then scaled to give the number of observed MCSs occurring within each grid box per week for a given time period (daytime is defined as 1100–2000 LST, and nighttime during the remaining hours). The largest seasonal (JJA) differences in MCS frequency between 2012 and 2013 are found over the central portion of both the Great Plains (GP) and Mississippi River valley (MRV) longitudinal bands at night (Fig. 3). The limited MCS activity observed over the central United States in 2012 corresponded with one of the driest summers on record in this region (Hoerling et al. 2013; Basara et al. 2013). In 2013, MCS activity over the central United States increased dramatically over 2012 and was much closer to normal.
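The gridding procedure can be sketched as follows. This is a minimal illustration under stated assumptions: each hourly MCS detection is supplied as a collection of (lat, lon) points covering its footprint, grid boxes are labeled by their southwest corner, and the caller supplies the number of observation times per week for the period of interest (168 is assumed here for hourly data at all times of day). The function and parameter names are hypothetical.

```python
import math
from collections import Counter


def mcs_frequency_per_week(mcs_footprints, n_obs_times, obs_times_per_week=168):
    """Return {(lat_box, lon_box): MCS occurrences per week} on a 1-degree grid.

    mcs_footprints: one iterable of (lat, lon) points per MCS detection.
    n_obs_times:    number of observation times in the period of interest.
    """
    counts = Counter()
    for points in mcs_footprints:
        # Each MCS increments a grid box at most once, however many points fall in it.
        boxes = {(math.floor(lat), math.floor(lon)) for lat, lon in points}
        for box in boxes:
            counts[box] += 1
    return {box: n * obs_times_per_week / n_obs_times
            for box, n in counts.items()}
```

Using a set of boxes per detection ensures that a large MCS spanning many grid boxes adds one count to each box it touches and no more.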

Fig. 3.

Observed seasonal mean MCS occurrence rates (No. per week) for (a) all times, (b) daytime (1100–2000 LST), and (c) nighttime (2100–1100 LST) for JJA (left) 2012 and (right) 2013. Labeled bands and boxes indicate regions used in additional analyses described in the text.

Regional variations in the observed diurnal cycle of MCSs are depicted in Fig. 4. Results from the three longitudinal bands and two additional evaluation boxes shown in Fig. 3 are given. Results from the Texas (TX) box represent a subset of the MCS activity occurring within the GP band, while the Southeast (SE) box overlaps the southern portions of both the MRV and Appalachians (APP) longitudinal bands. Both the amplitude (i.e., percentage of all MCSs occurring in a region) and phase (i.e., timing) of the diurnal cycle vary by region. Consistent with previous climatological studies (e.g., Carbone and Tuttle 2008), the center of the time (LST2) of peak MCS activity (defined as the midpoint between clear positively and negatively sloping portions of each curve and given in local solar time) occurs about 4 h later in the GP band (1900 LST) than that found for the APP band (1500 LST). There is also evidence to support the existence of an earlier ramp-up in MCS activity in the two southern regions (i.e., SE and TX) compared to their corresponding longitudinal band(s). This north–south variation in the timing of the ramp-up is related to the later initiation of MCSs to the north, as seen in Fig. 5b. The diurnal cycles with the largest amplitudes occur in the SE and APP regions, while the smallest-amplitude diurnal cycle is found for the MRV band. Very similar diurnal cycles were observed for each region in 2013 (not shown).

Fig. 4.

Normalized diurnal cycle of MCSs by region (GP, MRV, and APP bands, as well as two smaller boxes for SE and TX, as shown in Fig. 3) for JJA 2012. Total numbers of MCS events for each region are listed in the legend.

Fig. 5.

Maps of (a) observed MCS-I frequency (No. per week per unit area for the actual land area within a 500 km2 search box) and (b) median time of MCS-I event during (left) 2012 and (right) 2013. The frequency is computed by summing over all times of day. The median time of MCS-I is only computed at points with at least one MCS-I per week per unit land area.

While the observed ratio of MCS to MCS-I events is consistent between the two summers (Table 2), differences are evident in the observed spatial patterns of MCS-I frequency (Fig. 5a). These maps were created by counting all MCS-I events occurring within overlapping 5° × 5° boxes spaced 1° apart. This larger box size (compared to that used to build the MCS frequency maps) was necessary because of the relatively infrequent nature of MCS-I events. To remove the effects of offshore convection and the Canadian radar attenuation issue, the MCS-I frequencies are limited to only those occurring within nonoceanic grid boxes and within the continental United States. MCS-I frequency maxima are observed over the GP and APP bands, peaking over the Florida peninsula. In both years, the MCS-I frequency maximum in the APP band extends from the Gulf Coast northward along the spine of the Appalachians into western New York. This peak frequency of MCS-I over the Appalachians is consistent with that reported by Parker and Ahijevych (2007). Peak MCS-I activity in the GP band varies notably between 2012 and 2013, shifting from the Dakotas in 2012 to Nebraska and western Kansas in 2013 and increasing in both area and magnitude in 2013.

The median time of MCS-I events shown in Fig. 5b is found following a two-step method. First, a distribution of MCS-I counts versus LST is generated and smoothed using a cyclic 5-h boxcar average. Then, the time corresponding with the minimum MCS-I count in the smoothed distribution is found and used as the starting point to compute the median LST value using a circular time-of-day coordinate system. The median time of MCS-I events increases as one moves north and west across the United States (Fig. 5b), varying from around 1200 LST in the southeastern United States to 1800–2200 LST over much of the central and northern portions of the GP band. Localized areas of even later MCS-I activity are evident in the GP and MRV bands, especially in 2013. Summer-to-summer variations are also evident in northern Texas, where MCS-I occurred notably later in 2013, and in the northern third of the MRV band, which experienced much earlier MCS-I in 2013. The observed MCS-I frequency maxima found over the GP region in Fig. 5 are generally located upstream of the MCS frequency maxima shown in Fig. 3. The relationship between MCS-I location and subsequent downstream MCS frequency maxima is consistent with that previously discussed by Carbone and Tuttle (2008).
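The two-step median calculation can be sketched as follows. This is a minimal illustration that assumes event times are given in hours of local solar time and binned hourly; the function name is hypothetical, and taking the first hour at which the smoothed minimum occurs is our assumption for breaking ties.

```python
import statistics


def circular_median_lst(event_hours):
    """Median local solar time (h) of events on a 24-h circular clock."""
    counts = [0] * 24
    for h in event_hours:
        counts[int(h) % 24] += 1
    # Step 1: cyclic 5-h boxcar average (two hours either side, wrapping at midnight).
    smooth = [sum(counts[(i + k) % 24] for k in range(-2, 3)) / 5.0
              for i in range(24)]
    # Step 2: start the "day" at the quietest hour so the cut avoids the main peak.
    start = smooth.index(min(smooth))
    offsets = [(h - start) % 24 for h in event_hours]
    return (start + statistics.median(offsets)) % 24
```

For events at 2200, 2300, 0000, 0100, and 0200 LST, a conventional median of the clock values would give 0200 LST, whereas this approach returns 0000 LST, the center of the cluster.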

In the SE region, MCSs tend to form through the clustering and upscale growth of convective storms (e.g., Geerts 1998). The MCS-I frequency maxima in the SE region are coincident in time and space (Figs. 5a,b) with the peaks in the MCS frequency map shown in Fig. 3. In the region denoted by the SE box in Fig. 3, MCS-I generally occurs between 1100 and 1400 LST and peak MCS activity occurs between 1300 and 1700 LST (Fig. 4). Similar to that described by Geerts (1998), the timing of peak MCS-I and MCS frequencies in the SE region indicates that MCSs tend to form and dissipate over a short time period with limited propagation.

5. Evaluation of HRRR

The same storm identification algorithm used to detect MCSs and MCS-Is in the observations (see section 4) is used to evaluate the HRRR. Evaluations are performed both in a statistical sense (i.e., in which the statistical properties of observed and modeled storms are compared without being matched in time and location) and using standard pairwise metrics (in which modeled and observed storms are matched based on proximity to one another, as described in more detail below). Since MCS identification can be highly sensitive to the VIL threshold used to identify convection and because the modeled VIL field can be significantly biased depending on the choice of microphysical parameterization (e.g., Clark et al. 2014) and the assumptions used to compute the reflectivity, an attempt is made to remove the mean bias from the synthetic-radar-based VIL field (as shown in Fig. 1) prior to applying the MCS identification software.

a. The VIL threshold optimization

To account for model bias caused by assumptions in the microphysics and computation of radar-derived VIL and also potential biases in the observations, the threshold used to identify MCSs is chosen using an optimization procedure. This is done by comparing modeled MCS size distributions obtained using a range of VIL thresholds to that obtained from observations using the standard VIL threshold of 3.5 kg m−2. This procedure is illustrated in Fig. 6. It is seen that using the standard threshold of 3.5 kg m−2 on the model VIL results in large underforecasts of the number of MCSs in each size bin. In the example shown in Fig. 6, a VIL threshold of 1.5 kg m−2 produces the best match between the 6-h forecast of MCSs and those that were observed. This finding is consistent with that shown in Fig. 1, where the frequency of 6-h modeled VIL exceeding 1.5 kg m−2 corresponds with the frequency of observed VIL exceeding 3.5 kg m−2. However, because bias in the modeled VIL field varies with forecast lead time (Fig. 1), using a single threshold to detect MCSs in the model forecasts will translate into lead time–dependent biases in the modeled MCS frequencies. These lead time–dependent biases are an intrinsic property of the model and are not related to assumptions used in the computation of radar reflectivity.

Fig. 6.

Sensitivity of the MCS size distribution to choice of VIL threshold using 6-h forecasts and observations collected during 6–12 Jul 2012. The black line depicts the observed MCS size distribution obtained using a threshold of 3.5 kg m−2.

Sensitivity studies revealed that this optimized threshold can be obtained using a period as short as a week depending on the amount of convective activity. It is important to note that only the VIL threshold is varied during the optimization procedure. All other criteria used to identify MCSs in the model are identical to those used to identify MCSs in the observations. Using this optimization procedure, it was found that VIL thresholds of 1.5 and 1.75 kg m−2 worked best for the summers of 2012 and 2013, respectively. These thresholds are roughly a factor of 2 less than those used to detect MCSs in the observations. This may seem concerning since one might think these storms will appear to be too weak to be considered MCSs, but one must remember that we chose these thresholds to account for model biases that are predominantly in the model data postprocessing used to generate the radar-derived VIL field. When one compares the retrieved VIL field with the composite reflectivity field, it becomes clear that the threshold optimization method does a nice job of detecting convective storms that would otherwise have been missed when using the retrieved VIL field alone (Fig. 7).
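The threshold selection can be sketched as a simple search over candidate thresholds. The mismatch measure used here (the summed absolute difference in MCS counts across size bins) is an assumption made for illustration, since the measure of best match is not specified above. The function name is hypothetical.

```python
def best_vil_threshold(observed_hist, modeled_hists):
    """Pick the model VIL threshold whose MCS size histogram best matches observations.

    observed_hist: MCS counts per size bin from observations (3.5 kg m^-2 threshold).
    modeled_hists: {threshold in kg m^-2: MCS counts per size bin from the model}.
    """
    def mismatch(hist):
        # Assumed cost: total absolute count difference over all size bins.
        return sum(abs(m - o) for m, o in zip(hist, observed_hist))

    return min(modeled_hists, key=lambda thr: mismatch(modeled_hists[thr]))
```

Only the VIL threshold varies in this search; every histogram is assumed to have been built with the same gap, length, and duration criteria.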

Fig. 7.

HRRR 6-h forecast valid at 0500 UTC 9 Jul 2013 of (a) column max reflectivity (dBZ) and (b) VIL (kg m−2) derived from modeled radar reflectivity. The white contours indicate storms identified as MCSs using a VIL threshold of 1.75 kg m−2. Only the large area of convection over southwestern North Dakota would have been classified as an MCS using the standard VIL threshold of 3.5 kg m−2 despite the fact that the modeled composite reflectivity within each MCS contour exceeds 35 dBZ.

Application of these optimized thresholds resulted in median storm sizes, aspect ratios, and modal orientations that are nearly identical to the observed values reported in Table 2. The 2013 data provide a representative comparison of observed MCS counts with those obtained from the model using a threshold of 1.75 kg m−2 (Table 3). Biases in the counts of MCSs and MCS-Is obtained from all available 6-h forecasts are small (<10%) when evaluated over the United States east of 105°W (eUS), but can reach nearly a factor of 2 when evaluated regionally. Results given in Table 3 reveal that there were 45% too many MCSs predicted over the GP region and 15% too few predicted over the SE region in 2013 (where the MCS-I count was also dramatically underforecasted by 39%). Temporal and regional variations in model performance will be discussed in more detail below, along with how the model performance changed between 2012 and 2013.

Table 3.

Modeled and observed MCS and MCS-I counts for the summer of 2013.

b. Space–time variations in model performance

Spatiotemporal variations in the model’s skill at predicting MCSs were evaluated in two ways. First, regional variations in the seasonal mean statistical properties of modeled and observed MCSs and MCS-Is are compared. Then, a simple technique that matches modeled and observed objects while allowing for small position errors is used to assess the model’s skill at predicting specific MCS events.

1) Statistical assessment

While threshold optimization removes the impact of mean bias in the VIL field, biases in the prediction of MCSs that are a function of lead time, valid time period (i.e., day vs night), and region remain (Fig. 8). The model generally predicts too many MCSs during the earlier lead times and too few MCSs at the longer lead times. While this dependence on lead time is evident in both years, it was somewhat reduced in 2013. In both years, the model tended to overpredict the frequency of MCSs during both daytime and nighttime periods in the GP band while underpredicting the daytime MCS frequency in the SE and APP regions. While the model bias in some regions was consistent from one year to the next (e.g., nighttime high bias in the GP region), other regions experienced notable changes in performance. For example, the frequency of daytime MCSs in the APP and MRV regions was better captured in 2013.

Fig. 8.

Modeled minus observed MCS frequency vs lead time obtained using optimized thresholds for (a) daytime (1100–2000 LST) and (b) nighttime (2100–1100 LST) for three longitudinal bands (GP, MRV, and APP), and two regions (TX and SE) during (left) 2012 and (right) 2013. The actual observed MCS frequencies (No. per week) for each region and time period are given in the legend.

The diurnal cycle of MCS frequency of occurrence obtained from model forecasts of varying lead times is compared to the observed MCS frequency in Fig. 9. The MCS frequency for a given longitudinal band is obtained by dividing the number of MCSs obtained for a given band, lead time, and valid time by the corresponding total number of modeled MCSs obtained for a given band and lead time summed over all valid times. This normalization effectively removes the model biases shown in Fig. 8 so that any deviations between the forecasted and observed diurnal cycles may be interpreted as amplitude and/or timing errors for a given forecast lead time. All lead times of the model generally capture the observed regional variations in the phase and amplitude of the MCS diurnal cycle in both years, with the 2013 forecasts deviating from the observations less than the 2012 forecasts. Comparing results for the three longitudinal bands, the largest deviations are evident for the GP band in 2012 for lead times of 10 h or greater. The GP diurnal cycle obtained at these longer lead times appears to be several hours out of phase compared to the observed diurnal cycle, with both the ramp-up of MCS activity and the center time of the MCS frequency maximum occurring several hours later than observed (Fig. 9a). The opposite effect is found for the GP band in 2013, when the modeled ramp-up, peak, and decay phase of the diurnal cycle occur about 1–2 h ahead of that observed for all forecast lead times. The modeled timing and amplitude of the MCS diurnal cycle in the MRV band are well captured in both years, with the 2013 forecasts showing slightly smaller deviations from the observed diurnal cycle than those found in 2012. Finally, the amplitude of MCS activity in the APP band was closer to that observed in 2013, but the timing (ramp-up and decay of MCS frequency) remains delayed by about 2 h.
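The normalization described above amounts to dividing each valid-time count by the total over all valid times. A minimal sketch, assuming the counts for one band and one lead time are supplied as a list indexed by valid hour (the function name is hypothetical):

```python
def normalized_diurnal_cycle(counts_by_hour):
    """Fraction of a band's MCSs occurring at each valid hour for one lead time."""
    total = sum(counts_by_hour)
    if total == 0:
        # No MCSs detected: return zeros rather than dividing by zero.
        return [0.0] * len(counts_by_hour)
    return [count / total for count in counts_by_hour]
```

Because each curve sums to one, a forecast with twice as many MCSs as observed but the same timing produces the same normalized cycle as the observations, which is why remaining differences can be read as amplitude or timing errors.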

Fig. 9.

Frequency of modeled and observed MCSs for the (a) GP, (b) MRV, and (c) APP longitudinal belts for JJA (left) 2012 and (right) 2013. Frequencies are found by dividing the MCS count found at a given valid time by the total number of MCSs detected in the observations or the model forecast (for a given lead time) during the 92-day period.

A more detailed inspection of how well the model captures temporal and regional variations in MCS frequency of occurrence is shown in Fig. 10. The frequency differences shown in Fig. 10 are for 6-h forecasts valid during three time periods: daytime, nighttime, and all times. In 2012, the largest positive biases are found over Louisiana and Florida for forecasts valid during daytime hours, where the model predicts up to five occurrences per week more than observed (i.e., an overprediction by more than a factor of 2). The model also overpredicts the frequency of MCSs over vast areas of the GP band (particularly the northern half) and northeastern United States during the day and at night. In contrast, the model tended to underforecast the frequency of daytime MCSs in the southeastern United States especially near the coast of the Carolinas and northern Florida in both years. Other regions characterized by underforecasting biases include the western slope of the southern Appalachians and eastern Texas during the day and parts of the MRV band at night. The daytime high biases in the northeastern United States, Louisiana, and Florida were less pronounced in 2013. In contrast, the magnitude of biases in the GP region (positive bias) and central MRV region (negative bias) increased in 2013.

Fig. 10.

Model 6-h forecasted MCS occurrence rate minus the observed occurrence rate for (a) all times, (b) daytime (1100–2000 LST), and (c) nighttime (2100–1100 LST) for JJA (left) 2012 and (right) 2013. Black box denotes central MRV region used in Fig. 18.

The MCS frequency biases shown in Fig. 10 can be attributed, in part, to biases in the initiation of the MCSs in the model. Comparing Figs. 5 and 11, it is seen that while the model was able to roughly approximate the location of the observed MCS-I hot spots (i.e., especially the high plains portion of the GP band and Florida), it overforecasted the MCS-I frequency maximum over portions of the GP band (in terms of the extent of the local maximum) and grossly underforecasted the MCS-I frequency in the SE region and APP band (in terms of both extent and magnitude) except over parts of Florida where the model overforecasts the frequency of MCS-I (Fig. 11a). These MCS-I biases are generally consistent with MCS frequency biases shown in Fig. 10. Lesser MCS-I underforecasting biases are also evident throughout the MRV band, which does not seem to fully explain the nighttime MCS forecasting bias that is evident in 2013 (Fig. 10).

Fig. 11.

Maps of (a) seasonal mean MCS-I occurrence rate obtained from 6-h forecasts and (b) median time of MCS-I in the 6-h forecasts for JJA (left) 2012 and (right) 2013. The median time of MCS-I is only computed at points with at least one MCS-I per week per 500 km2. The VIL threshold for identifying MCSs is shown at the top right.

The model seems to capture the larger-scale variations in the timing of MCS-I fairly well in both years. As seen in Fig. 5b, MCS-I generally occurs later in the day as one moves north and west, with the latest MCS-Is in the central and northern portions of the MRV and GP bands. South-to-north variations in the median timing of MCS-Is are generally captured by the model fairly well in 2012; however, in 2013 the modeled MCS-I is much too early in the central portions of the GP and much too late over southern Texas (Fig. 11). In the APP band, there is a tendency for the modeled MCS-I to occur later than observed, with this late timing bias being larger in 2013. Finally, the model shows an area of later-forming MCSs similar in size to that observed; however, the modeled area of late-forming MCSs is displaced several hundred kilometers to the southeast of the observed area of late-forming MCSs.

2) Matched modeled and observed MCS analyses

Thus far, we have only compared the statistical properties of modeled and observed MCSs. In this section, we discuss the model’s skill at predicting specific MCS events using a simple object-matching technique. In this object-matching technique, modeled MCS objects that are within 100 km of an observed MCS are given credit for a hit, as shown in Fig. 12 and Table 4. The position and area covered by each MCS are defined using a centroid location and a 72-vertex polygon (one vertex every 5°). Each modeled and observed MCS object is then expanded by extending each vertex of the polygon by 50 km. Extending both modeled and observed MCS objects by 50 km allows forecast position errors of up to 100 km to be considered a hit. Hits, misses, false alarms, and correct nulls are determined following the rules given in Table 4. It is important to note that the standard skill scores computed using this technique are impacted by the size of the modeled and observed storms and how much they overlap. This upscaling of the modeled and observed MCSs is designed to give credit for near misses but not penalize for additional false alarms caused by the MCS expansions.
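The object expansion can be sketched as follows. The sketch assumes that each polygon is stored as 72 radial distances from the centroid, one per 5° of azimuth measured clockwise from north, and that "extending each vertex by 50 km" means moving it outward along its ray from the centroid. Both function names and the storage convention are hypothetical.

```python
import math


def expand_polygon(radii_km, buffer_km=50.0):
    """Push each of the 72 vertices outward along its ray from the centroid."""
    if len(radii_km) != 72:
        raise ValueError("expected one radius per 5 degrees of azimuth")
    return [r + buffer_km for r in radii_km]


def polygon_vertices(centroid_xy, radii_km):
    """Convert radial distances to (x, y) vertices in km (x east, y north)."""
    cx, cy = centroid_xy
    return [(cx + r * math.sin(math.radians(5 * i)),
             cy + r * math.cos(math.radians(5 * i)))
            for i, r in enumerate(radii_km)]
```

Applying the same 50-km buffer to both the modeled and the observed polygon is what allows two objects whose original edges are up to 100 km apart to overlap after expansion.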

Fig. 12.

Schematic demonstrating how contingency tables are created. The red-filled contours depict the truth field (given a value of 2) and the yellow-filled contours represent a 50-km expansion of the truth field (given a value of 1). The dark blue contours indicate the model-predicted MCS locations (=2) and the light blue contour indicates the model-predicted MCS locations extended by 50 km (=1). The grid lines represent a 1° grid, with each grid square being classified as either a hit (indicated by an “H”), miss (indicated by an “M”), false alarm (indicated by an “F”), or correct null (indicated by an “N”) based on the classification method shown in Table 4.

Citation: Weather and Forecasting 30, 4; 10.1175/WAF-D-14-00118.1

Table 4.

Description of 3 × 3 contingency table used in the matched comparison of modeled and observed MCSs. For the “Modeled MCS location value” and “Observed MCS location value,” a value of 2 is used to indicate the exact location of an MCS while a value of 1 indicates a 50-km buffer around the MCS. A value of zero indicates that no MCSs are present within 50 km.
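A grid-cell classification based on these location values might look like the following Python sketch. The individual cells of Table 4 are not reproduced in this text, so the rules coded here are an assumption chosen to be consistent with the description above (near misses earn a hit, and buffer-only cells are not counted as false alarms); the function names are hypothetical.

```python
def classify_cell(modeled, observed):
    """Classify one grid cell from its modeled and observed location values.

    Values: 2 = inside an MCS, 1 = inside the 50-km buffer, 0 = neither.
    ASSUMED rules (the cells of Table 4 are not available here):
      hit          : one field is 2 and the other is at least 1
      miss         : observed is 2 and modeled is 0
      false alarm  : modeled is 2 and observed is 0
      correct null : everything else (buffer-only or empty cells)
    """
    if modeled not in (0, 1, 2) or observed not in (0, 1, 2):
        raise ValueError("location values must be 0, 1, or 2")
    if (observed == 2 and modeled >= 1) or (modeled == 2 and observed >= 1):
        return "H"
    if observed == 2:
        return "M"
    if modeled == 2:
        return "F"
    return "N"

def tally(modeled_grid, observed_grid):
    """Count H, M, F, and N over two equally shaped grids of location values."""
    counts = {"H": 0, "M": 0, "F": 0, "N": 0}
    for m_row, o_row in zip(modeled_grid, observed_grid):
        for m, o in zip(m_row, o_row):
            counts[classify_cell(m, o)] += 1
    return counts

# Tiny synthetic grids, for illustration only.
modeled_grid = [[2, 1, 0], [0, 0, 2]]
observed_grid = [[1, 2, 2], [0, 1, 0]]
counts = tally(modeled_grid, observed_grid)
```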

The hit, miss, false alarm, and correct null counts are summed over the period from 1 June to 31 August for a given forecast lead time using the 3 × 3 contingency table provided in Table 4, from which the following standard skill scores are computed (Wilks 2006):
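$$\mathrm{POD} = \frac{H}{H + M}, \tag{2}$$
$$\mathrm{FAR} = \frac{F}{H + F}, \tag{3}$$
$$\mathrm{bias} = \frac{n_{\mathrm{fcsts}}}{n_{\mathrm{obs}}}, \tag{4}$$
where $H$, $M$, and $F$ are the summed hit, miss, and false alarm counts. Using Table 4, the frequency bias is obtained as the ratio of the number of modeled location values greater than or equal to two, $n_{\mathrm{fcsts}}$, to the number of observed location values greater than or equal to two, $n_{\mathrm{obs}}$.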
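For reference, the textbook definitions of these scores can be written as small Python helpers. This is a sketch under the assumption that FAR denotes the false alarm ratio, F/(H + F), and that the frequency bias is the ratio of the two counts described above; the function names and the sample counts are hypothetical.

```python
def pod(hits, misses):
    """Probability of detection: fraction of observed events that were forecast."""
    total = hits + misses
    return hits / total if total else float("nan")

def far(hits, false_alarms):
    """False alarm ratio: fraction of forecast events that did not verify."""
    total = hits + false_alarms
    return false_alarms / total if total else float("nan")

def frequency_bias(n_fcsts, n_obs):
    """Ratio of forecast event count to observed event count (1 = unbiased)."""
    return n_fcsts / n_obs if n_obs else float("nan")

# Synthetic counts, for illustration only.
example_pod = pod(hits=30, misses=10)
example_far = far(hits=30, false_alarms=20)
example_bias = frequency_bias(n_fcsts=50, n_obs=40)
```

A bias above 1 indicates that events are forecast more often than they are observed; a bias below 1 indicates the opposite.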
In addition, because we are trying to compare the skill for two seasons with varying event frequencies, we use the symmetric extreme dependence score, which is equitable and not impacted by the frequency of occurrence of the event being verified (Hogan et al. 2009):
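$$\mathrm{SEDS} = \frac{\ln\left[(H + F)/n\right] + \ln\left[(H + M)/n\right]}{\ln\left(H/n\right)} - 1, \tag{5}$$
$$n = H + M + F + N, \tag{6}$$
where $H$, $M$, $F$, and $N$ are the hit, miss, false alarm, and correct null counts and $n$ is their total.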
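The symmetric extreme dependence score of Hogan et al. (2009) can be computed directly from the four contingency counts. The Python sketch below uses the published form of the score; the function name and the sample counts are hypothetical, and the handling of degenerate tables (returning NaN) is an assumption.

```python
import math

def seds(hits, misses, false_alarms, correct_nulls):
    """Symmetric extreme dependence score from a 2 x 2 contingency table."""
    n = hits + misses + false_alarms + correct_nulls
    # The score is undefined when there are no hits or when every cell is a hit.
    if n == 0 or hits == 0 or hits == n:
        return float("nan")
    numerator = (math.log((hits + false_alarms) / n)
                 + math.log((hits + misses) / n))
    return numerator / math.log(hits / n) - 1.0

# Synthetic counts, for illustration only.
perfect = seds(hits=10, misses=0, false_alarms=0, correct_nulls=90)
no_skill = seds(hits=4, misses=16, false_alarms=16, correct_nulls=64)
```

In the second example the hit rate equals the product of the forecast and observed event frequencies (0.2 × 0.2 = 0.04), which is what a random forecast would give, so the score is zero; the first example is a perfect forecast and scores one.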
The seasonal mean skill scores are computed for 2012 and 2013 for three regions: the eastern United States (east of 105°W), and the SE and GP regions shown in Fig. 3. As with the statistical comparisons discussed in the previous section, the skill scores are a strong function of valid time, forecast lead time, and region (Figs. 13–16). The dashed line shown in these figures can be used to determine how the skill score of the 0000 UTC forecast varies with lead time. The mean skill scores for any other forecast issuance time may be found by following a line parallel to the dashed line that originates on the issuance time of interest.
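The diagonal lines follow from the relation valid time = issuance time + lead time (modulo 24 h). A short Python sketch, with hypothetical function names and an assumed maximum lead time, lists the lead time/valid time coordinates traced by a single forecast run:

```python
def valid_hour(issue_hour_utc, lead_h):
    """Valid time of day (UTC hour) for a given issuance hour and lead time."""
    return (issue_hour_utc + lead_h) % 24

def forecast_track(issue_hour_utc, max_lead_h=15):
    """(lead time, valid hour) pairs traced by one forecast run.

    max_lead_h is an assumed forecast length, used for illustration only.
    """
    return [(lead, valid_hour(issue_hour_utc, lead))
            for lead in range(max_lead_h + 1)]

# The 0000 UTC run falls on the line valid hour = lead time; a 2000 UTC run
# traces a parallel line that wraps past midnight.
track_00 = forecast_track(0, max_lead_h=3)
track_20 = forecast_track(20, max_lead_h=6)
```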
Fig. 13.

Model bias as a function of lead time and valid time of day in the (a) eastern U.S. and the (b) SE and (c) GP domains during (left) 2012 and (right) 2013. Solid curves indicate relative variations in observed MCS frequency with time of day for a given region (the max observed frequency is given for each region). Dashed diagonal lines indicate the lead time/valid time coordinates of the 0000 UTC forecasts.

Fig. 14.

Model performance as a function of lead time and valid time of day as indicated by (a) SEDS, (b) POD, and (c) FAR for the eastern U.S. domain during (left) 2012 and (right) 2013. Dashed diagonal lines indicate the location of lead times associated with the 0000 UTC forecasts.

Fig. 15.

As in Fig. 14, but for the SE domain.

Fig. 16.

As in Fig. 14, but for the GP domain.

The diurnal variation in frequency bias is shown for each region in Fig. 13. This figure provides more detailed information on the diurnal variation of bias than that given in Fig. 8. Figure 13 also includes curves that indicate the diurnal variation in the observed MCS frequency for each region. Comparing the curves in each panel reveals that the observed diurnal cycle of MCS frequency for the GP region is almost completely out of phase with that found for the SE region. Diurnal variations in the skill scores may be described in terms of the observed phase of MCS activity (i.e., dissipation vs initiation and growth). For example, in the SE region, the frequency bias peaks (more so in 2013) in forecasts issued between 0000 and 0800 UTC, which are valid during the dissipation phase of MCSs in this region, indicating the model’s tendency to dissipate storms too slowly in this region. At the same time, the frequency bias in the SE region was reduced during the storm initiation and growth phase (1600–2200 UTC) in 2013. In the GP region, forecasts issued after 1200 and before 0000 UTC had large frequency biases. These high biases are found primarily during the initiation phase of MCSs in this region, which is consistent with the overprediction of MCS-Is and MCSs in this region (Figs. 10 and 11). These high biases expanded in 2013 despite a 40% increase in observed MCS activity over 2012.

Like the frequency bias, each of the skill scores (i.e., SEDS, POD, and FAR) varies with lead time, valid time, region, and year (Figs. 14–16). For the eastern U.S. domain, the best performing forecasts were issued between 2000 and 0400 UTC (note the skill pickup along either side of the dashed line that identifies all of the valid times for the 0000 UTC forecast in Fig. 14). This valid time period corresponds with the observed period of enhanced MCS activity in the GP region (Fig. 13c). This period of peak skill is characterized by higher POD and lower FAR values than at other times of day. In contrast, model forecasts issued between 0600 and 1600 UTC (i.e., the period corresponding with reduced observed MCS activity compared to other times of day) perform worst (Fig. 14). These relationships are consistent with the conjecture that the assimilation of radar reflectivity improves model skill when appreciable convective activity is present at forecast initialization time.

Overall, the skill (given by SEDS) of the MCS forecasts is greater in 2013, particularly for longer-lead forecasts verifying between 0600 and 1200 UTC (i.e., overnight period). These increases in SEDS are coincident with simultaneous increases in POD and decreases in the FAR in the 2013 forecasts (Figs. 14b,c). This skill pickup was not simply due to the increased frequency of the observed MCS events (seen in Fig. 13c), as discussed by Jolliffe (2008), since SEDS is not sensitive to these fluctuations (Hogan et al. 2009). However, the year-to-year change in observed MCS frequency indicates that there were likely appreciable differences in the mean environmental conditions of this region between 2012 and 2013, which may have contributed to changes in model performance.

Regional variations in model performance are explored by studying two regions (SE and GP) with vastly different storm formation mechanisms and statistical performance characteristics [see section 5b(1)]. Values of SEDS vary more strongly with lead time in the SE region and more strongly with valid time in the GP region (Figs. 15 and 16, respectively). In the SE region, SEDS values are consistently highest for the shortest lead times, while in the GP region, SEDS peaks during the evening or overnight period (between 0000 and 1200 UTC). Variations in SEDS are related to variations in POD in the SE region, and to variations in both POD and FAR in the GP region. While the POD values are comparable between the two regions, the FAR values are much greater in the GP region. The high FAR values in the GP region after ~1200 UTC are consistent with the overprediction of MCS frequencies in this region discussed in section 5b(1); their cause is further explored in section 5c.

Values of SEDS in 2013 tend to be higher than those found for the 2012 forecasts in both regions. In the SE region, this increase in skill is limited to forecast lead times of 3–8 h that are valid between 1400 and 1700 UTC. In the GP region, SEDS increases in 2013 for forecasts issued between 0000 and 0600 UTC and for lead times of up to 10 h, while it decreases slightly at lead times longer than 8 h for forecasts issued between 1600 and 0000 UTC (i.e., issue times for which convective activity is limited, as seen in Fig. 13c). In both regions, the POD of the first few lead hours of forecasts issued in 2013 increases dramatically over that found in 2012. This increase in POD at the short lead times is consistent with what one might expect through improved assimilation of radar reflectivity. The increase in SEDS extends to longer lead times in the GP region where larger, longer-lived storms are more common than in the SE region; thus, their assimilation into the model can have a longer impact on the model’s performance. These findings are also consistent with Stratman et al. (2013), who discussed improvements associated with the assimilation of radar reflectivity into the CAPS 4-km model ensemble.

c. Exploring the cause for model biases

The evaluation presented above indicates that there are three consistent biases in the model’s prediction of MCSs. The three main biases are 1) underprediction of daytime MCSs in the SE, 2) overprediction of MCSs in the high plains and northern GP throughout the day, and 3) underprediction of nighttime MCSs in the central MRV. In this section, each of these biases is explored in more detail.

1) Daytime MCS underforecasting bias in the southeastern United States

The importance of evaluating the model at finer scales is evident when comparing longitudinal band averages of MCS frequencies (Fig. 8) with those obtained using a 100-km grid (Fig. 10). It is seen that while model-forecasted MCS frequencies had limited bias in the APP longitudinal band (Fig. 8), the southern and eastern areas within the APP band had large negative biases, especially during the daytime (Fig. 10). As mentioned earlier, a large fraction of the observed MCS-I events are missed in the southeastern third of the APP band in both years. Matched pair analyses revealed that only 18% of the observed MCS-I events in this region were simulated to within ±2 h in 2013. Of the missed MCS-I events, 25% of the time the model failed to initiate convection at any scale while the rest of the time convection was predicted, but storms failed to grow large enough to attain MCS status (Fig. 17). Here, it is seen that, for each observed MCS-I event, the size of the modeled convective storms tended to remain below the MCS criterion (i.e., 100 km). Quantification of the biases in the modeled environmental conditions associated with these missed MCS-I events is needed to guide future model improvement efforts.
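The ±2-h matching criterion can be illustrated with a short Python sketch. It is deliberately simplified: only initiation times are compared, the spatial matching of modeled and observed events is omitted, and the function name, argument layout, and sample times are hypothetical.

```python
def fraction_matched(observed_times_h, modeled_times_h, window_h=2.0):
    """Fraction of observed initiation times with a modeled initiation nearby.

    Times are hours since an arbitrary common reference (e.g., start of season).
    Spatial matching is NOT handled here; this is a time-only illustration.
    """
    if not observed_times_h:
        return float("nan")
    matched = sum(
        1 for t_obs in observed_times_h
        if any(abs(t_obs - t_mod) <= window_h for t_mod in modeled_times_h)
    )
    return matched / len(observed_times_h)

# Synthetic initiation times, for illustration only.
observed_events = [10.0, 34.0, 60.0, 82.0]
modeled_events = [11.5, 40.0, 59.0]
matched_fraction = fraction_matched(observed_events, modeled_events)
```

In this synthetic case two of the four observed events have a modeled counterpart within the window, so the matched fraction is 0.5.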

Fig. 17.

Distributions of modeled (red) and observed (black) storm sizes occurring within the region bounded by 29.5°N, 86.5°W and 33.5°N, 80.5°W (a subregion within the SE region) that were obtained for a subset of MCS-I events observed in 2013 in which the max dimension of the modeled convective storms failed to reach 100 km. Note that the modeled storms larger than 100 km that were present in the subregion during an observed MCS-I event did not initiate within ±2 h of the observed event.

2) MCS overforecasting bias over the high plains and northern plains

Model forecasts of MCSs exhibited a high bias in the GP longitudinal band throughout the day at all but the longest forecast lead times (Fig. 8). The high bias was most pronounced over the high plains (i.e., western half of the GP band) and northern Great Plains (Fig. 10), with areas experiencing the highest biases corresponding directly with areas where MCS-Is were also overpredicted. The overprediction of MCS-Is over the high plains is consistent with that reported by Burghardt et al. (2014), who found similar biases (the modeled MCS-Is being too early and occurring too often) in a set of high-resolution simulations of terrain-aided convective initiation over the high plains.

In terms of the matched skill scores, the model was able to detect more than 80% of the MCS events occurring during the period of enhanced MCS activity at night, but at the expense of FAR values of over 50% (Fig. 16). These FAR values are much larger than the 30% values found during the period of peak activity in the SE region (Fig. 15). Further inspection of the false alarm cases in the GP region revealed that they were primarily due to the model having too many MCS-I events, with many of these events producing an MCS lasting several hours. About 10% of the time, areas of primarily nonconvective precipitation were incorrectly identified as MCSs using the optimized threshold values.

3) Nighttime MCS underforecasting biases over the central Mississippi River valley

In contrast to the GP region, the model tended to underforecast the occurrence of MCSs in the MRV region during overnight hours (Fig. 8), especially in 2013. The largest underforecasts in this longitudinal band (with a deficit of nearly 40 per week in 2013) occurred over the northern and central parts of the MRV band (Fig. 10), where the model was unable to reproduce the local maximum in MCS activity observed to the south and west of Lake Michigan in 2013 (Fig. 3). Similar to the SE region, this underforecasting bias can be linked, in part, to an underprediction of MCS-I events. This is evident when comparing Figs. 5 and 11, which indicate that the model MCS-I frequency in 2013 is about 50% of the observed frequency over much of the central MRV band centered near Missouri. In addition to this MCS-I underprediction, the MCS underforecasting bias can also be attributed to the model dissipating nighttime MCSs too quickly, as demonstrated in Fig. 18. In this figure the number of observed and forecasted (given as a function of lead time for every other issue time) MCSs is summed over the entire summer. The forecasted MCSs are for all forecast lead times and issued every even hour of the day; thus, each curve overlaps the next by 12 h. The number of MCSs predicted between 2000 and 1200 LST starts near the observed number of MCSs, but then decreases with forecast lead time. In other words, over the course of each model run valid during this time period, the model tends to dissipate MCSs too quickly. Thus, the low bias in the MRV band at night is caused by a combination of underforecasted MCS-Is and a tendency to dissipate storms too quickly.

Fig. 18.

Total number of modeled MCSs (3–14-h lead times of forecasts issued every even hour are shown, with the line for the 0000 LST run being blue and more reddish and brown colors being forecasts issued later in the day) and observed (connected dots) MCSs obtained during JJA 2013 for the central MRV region denoted in Fig. 10.

6. Summary and context of results

An object-based verification technique has been used to evaluate the HRRR’s skill at predicting the occurrence, properties, and initiation of MCSs. The TITAN software (Dixon and Wiener 1993), which was originally developed to identify and track convective storms in radar data, has been adapted to assess the performance of high-resolution model forecasts of convection (e.g., Pinto et al. 2007; Caine et al. 2013). While a number of studies have used object-based techniques to evaluate convective precipitation forecasts in the past, few have attempted to quantify regional variations in performance, and fewer still have quantified the model’s ability to predict the initiation of larger storm systems over a region spanning many different climate zones. In addition, unlike most studies, this work employs a method to remove the mean bias in the modeled VIL field prior to identification of storm objects and the computation of performance statistics. This is done by adjusting the VIL threshold used to identify MCSs in the model forecasts until the modeled MCS size distribution closely matches that obtained from radar observations using a VIL threshold of 3.5 kg m⁻². This procedure effectively removes the impact of mean biases in both the model and the observational dataset.
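The threshold adjustment can be pictured as a one-dimensional search. The Python sketch below scans a list of candidate thresholds and keeps the one whose modeled size distribution lies closest to the observed distribution. The closeness measure (the largest gap between the two empirical cumulative distributions), the function names, and the synthetic data are all assumptions made for illustration; the measure actually used to match the distributions is not specified in this summary.

```python
import bisect

def ks_distance(sample_a, sample_b):
    """Largest gap between the empirical CDFs of two non-empty samples."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    gaps = []
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gaps.append(abs(cdf_a - cdf_b))
    return max(gaps)

def pick_threshold(candidates, modeled_sizes_at, observed_sizes):
    """Return the candidate whose modeled size distribution best matches."""
    return min(candidates,
               key=lambda thr: ks_distance(modeled_sizes_at(thr), observed_sizes))

# Synthetic storm sizes (km), for illustration only.
observed = [110.0, 140.0, 190.0, 260.0, 400.0]

def toy_modeled_sizes(threshold):
    """Hypothetical stand-in for rerunning storm identification at a threshold.

    Objects shrink as the threshold rises; the toy is built so that sizes
    equal the observed ones at a threshold of 2.0 (an arbitrary value).
    """
    scale = 1.0 + 0.4 * (2.0 - threshold)
    return [size * scale for size in observed]

best = pick_threshold([1.0, 1.5, 2.0, 2.5, 3.0], toy_modeled_sizes, observed)
```

In practice `modeled_sizes_at` would rerun the storm identification on the modeled VIL field at each candidate threshold, which is far more expensive than the toy function used here.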

It was found that using an optimized VIL threshold was critical in evaluating the HRRR’s skill at predicting MCSs. Optimized thresholds of 1.5 and 1.75 kg m⁻² were used to detect MCSs in HRRR runs produced in 2012 and 2013, respectively. Both of these thresholds are at least 50% less than the 3.5 kg m⁻² value used to identify areas of convection in the observed VIL field. While these thresholds are relatively low, and not what one would typically relate to convection, it was shown that these thresholds effectively capture areas of convection that might be seen using other fields such as composite maximum reflectivity. As such, a key aspect of this study is to make users of these data aware of potential biases in the modeled VIL field and to allow for a fair assessment of the model skill when using this field to identify MCSs, rather than penalizing the model for issues predominantly related to postprocessing.

Once the mean bias is removed using a single VIL threshold, other biases related to model physics and predictability issues can be explored. It is found that the remaining biases are a function of lead time, region, and time of day. Generally, the frequency of occurrence of predicted MCSs decreases with forecast lead time. When using a single optimized threshold, the HRRR predicts too many MCSs at the shorter lead times (by as much as a factor of 2 in the GP region during the day) and too few at the longer lead times (nearly 50% fewer than observed in the SE region at night; see Fig. 8 for details). This dependence on lead time was less pronounced in 2013, which is indicative of an overall improvement in the HRRR’s skill at predicting MCSs. The bias in the forecasted MCS frequencies obtained with the HRRR using a single optimized threshold ranged from −33% in the SE during the day to +75% in the GP region at night in both years (Fig. 7). It should be noted that if one had used 3.5 kg m⁻² to detect MCSs, the range of variations in the frequency bias reported above would be similar but the values would all be negative.

More detailed analyses revealed that the HRRR consistently overestimated the frequency of occurrence of MCSs over the Great Plains during both day and night (Fig. 10). This consistent overforecasting bias was related to the simulation of too many MCS-I events on the high plains in the late afternoon. Recent studies have shown that this bias may be insensitive to model resolution, as Burghardt et al. (2014) found similar biases in the initiation of convection over the high plains at a model grid spacing of 400 m. In contrast to the high bias found over the Great Plains, the model tended to underestimate MCS frequency in the SE during the day and in the MRV band at night. The low bias in the SE region was related to the model underforecasting the number of MCS-I events while the low bias in the MRV region was related to the model’s tendency to dissipate storms too quickly. While the model did a better job simulating MCS-I frequency over the Appalachians in 2013, the timing in this region was worse than in 2012 with MCS-I occurring 2–3 h later than observed. Nonetheless, broadly speaking, the model did a nice job capturing the regional variations in the timing of MCS-Is with good representation of the later initiation of MCSs as one moves north and west across the United States. Smaller-scale interannual variability appears to have been captured as well. Future changes in the HRRR’s skill at predicting MCSs and MCS-I events can be tracked at the NCAR/Research Applications Laboratory MCS verification website, which provides weekly and monthly performance graphics (http://rap.ucar.edu/projects/mcsprediction/statistics/).

Using optimized thresholds also allowed for a fairer evaluation of model skill. It was found that both SEDS and POD increased notably between 2012 and 2013 while FAR was reduced. This increase in POD was observed throughout the day for lead times of 3–5 h, with notable increases extending to longer lead times during the storm growth period in the SE region and at night in the GP region. Both periods in which the increase in SEDS extended to longer lead times coincided with forecast issue times that were more likely to have existing storms. The large increase in POD found at short lead times and the increase in SEDS of longer-lead forecasts that likely had ongoing convective activity at forecast issue time are consistent with what one might expect in response to improved assimilation of radar reflectivity. A more detailed investigation of the relationship between radar reflectivity assimilation and model performance is beyond the scope of this paper.

The object-based technique used in this study could easily be extended to evaluate model performance as a function of storm properties. For example, one could explore the relationship between skill and storm size or skill and storm organization (e.g., linear vs circular MCSs, broken squall lines, etc.) via computing skill scores as a function of permissible gap size or aspect ratio. The technique could also be used to assess the performance of high-resolution model ensemble forecasts following the work of Gallus (2010). Taking this a step further, one could apply the object-based technique on model ensemble forecasts to generate probabilistic forecasts of storm properties. Pinto et al. (2013) demonstrated this type of forecast system in which they generated forecasts of the likelihood of convective storms exceeding a given size threshold to aid aviation planners in optimally selecting air traffic flow structures in the presence of large convective storms.

Acknowledgments

The authors greatly appreciated access to CIWS data provided by Dr. Haig Iskenderian of the MIT Lincoln Laboratory. We would also like to thank Drs. Stan Benjamin, Steve Weygandt, and Curtis Alexander of NOAA/ESRL/GSD for providing the HRRR data as well as several stimulating discussions on HRRR performance over the years. The thoughtful comments by Drs. Tammy Weckwerth, Russ Schumacher, Matthew Bunkers, and another reviewer (anonymous) were greatly appreciated and helped sharpen the discussion of the material.

This research has been conducted in response to requirements of the Federal Aviation Administration (FAA), supplemented by funding provided by the National Science Foundation (NSF) to NCAR. The views expressed are those of the authors and do not necessarily represent the official policy or position of either the FAA or NSF.

REFERENCES

  • Alexander, C., S. Weygandt, S. G. Benjamin, T. G. Smirnova, J. M. Brown, P. Hofmann, and E. James, 2011: The High Resolution Rapid Refresh (HRRR): Recent and future enhancements, time-lagged ensembling, and 2010 forecast evaluation activities. Proc. 24th Conf. on Weather and Forecasting/20th Conf. on Numerical Weather Prediction, Seattle, WA, Amer. Meteor. Soc., 12B.2. [Available online at https://ams.confex.com/ams/91Annual/webprogram/Paper183065.html.]

  • Augustine, J. A., and K. W. Howard, 1988: Mesoscale convective complexes over the United States during 1985. Mon. Wea. Rev., 116, 685–701, doi:10.1175/1520-0493(1988)116<0685:MCCOTU>2.0.CO;2.

  • Baldwin, M. E., and S. Lakshmivarahan, 2003: Development of an events oriented verification system using data mining and image processing algorithms. Preprints, Third Conf. on Artificial Intelligence, Long Beach, CA, Amer. Meteor. Soc., 4.6. [Available online at http://ams.confex.com/ams/pdfpapers/57821.pdf.]

  • Basara, J. B., J. N. Maybourn, C. M. Peirano, J. E. Tate, P. J. Brown, J. D. Hoey, and B. R. Smith, 2013: Drought and associated impacts in the Great Plains of the United States—A review. Int. J. Geosci., 4, 72–81, doi:10.4236/ijg.2013.46A2009.

  • Benjamin, S. G., S. Weygandt, T. G. Smirnova, M. Hu, S. E. Peckham, J. M. Brown, K. Brundage, and G. S. Manikin, 2009: Assimilation of radar reflectivity data using a diabatic digital filter: Applications to the Rapid Update Cycle and Rapid Refresh and initialization of High Resolution Rapid Refresh forecasts with RUC/RR grids. Preprints, 13th Conf. on Integrated Observing and Assimilation Systems for Atmosphere, Oceans, and Land Surface (IOAS-AOLS), Phoenix, AZ, Amer. Meteor. Soc., 7B.3. [Available online at https://ams.confex.com/ams/pdfpapers/150469.pdf.]

  • Benjamin, S. G., S. Weygandt, C. Alexander, J. M. Brown, T. G. Smirnova, P. Hofmann, E. James, and G. Dimego, 2011: NOAA’s hourly-updated 3km HRRR and RUC/Rapid Refresh—Recent (2010) and upcoming changes toward improving weather guidance for air-traffic management. Proc. Second Aviation, Range, and Aerospace Meteorology Special Symp. on Weather–Air Traffic Management Integration, Seattle, WA, Amer. Meteor. Soc., 3.2. [Available online at https://ams.confex.com/ams/91Annual/webprogram/Paper185659.html.]

  • Benjamin, S. G., and Coauthors, 2013: Data assimilation and model updates in the 2013 Rapid Refresh (RAP) and High-Resolution Rapid Refresh (HRRR) analysis and forecast systems. NCEP/EMC Meeting, Washington, DC, NCEP/EMC/Model Evaluation Group. [Available online at http://ruc.noaa.gov/pdf/NCEP_HRRR_RAPv2_6jun2013-Benj-noglob.pdf.]

  • Billet, J., M. DeLisi, B. G. Smith, and C. Gates, 1997: Use of regression techniques to predict hail size and the probability of large hail. Wea. Forecasting, 12, 154–164, doi:10.1175/1520-0434(1997)012<0154:UORTTP>2.0.CO;2.

  • Bryan, G. H., J. C. Wyngaard, and J. M. Fritsch, 2003: Resolution requirements for the simulation of deep moist convection. Mon. Wea. Rev., 131, 2394–2416, doi:10.1175/1520-0493(2003)131<2394:RRFTSO>2.0.CO;2.

  • Burghardt, B. J., C. Evans, and P. J. Roebber, 2014: Assessing the predictability of convection initiation in the high plains using an object-based approach. Wea. Forecasting, 29, 403–418, doi:10.1175/WAF-D-13-00089.1.

  • Caine, S., T. P. Lane, P. May, C. Jakob, S. T. Siems, M. J. Manton, and J. O. Pinto, 2013: Statistical assessment of tropical convection-permitting model simulations using a cell-tracking algorithm. Mon. Wea. Rev., 141, 557–581, doi:10.1175/MWR-D-11-00274.1.

  • Carbone, R. E., and J. D. Tuttle, 2008: Rainfall occurrence in the U.S. warm season: The diurnal cycle. J. Climate, 21, 4132–4146, doi:10.1175/2008JCLI2275.1.

  • Carbone, R. E., J. D. Tuttle, D. Ahijevych, and S. B. Trier, 2002: Inferences of predictability associated with warm season precipitation episodes. J. Atmos. Sci., 59, 2033–2056, doi:10.1175/1520-0469(2002)059<2033:IOPAWW>2.0.CO;2.

  • Clark, A. J., R. G. Bullock, T. L. Jensen, M. Xue, and F. Kong, 2014: Application of object-based time-domain diagnostics for tracking precipitation systems in convection allowing models. Wea. Forecasting, 29, 517–542, doi:10.1175/WAF-D-13-00098.1.

  • Coniglio, M. C., J. Y. Hwang, and D. J. Stensrud, 2010: Environmental factors in the upscale growth and longevity of MCSs derived from Rapid Update Cycle analyses. Mon. Wea. Rev., 138, 3514–3539, doi:10.1175/2010MWR3233.1.

  • Coniglio, M. C., J. Correia, P. T. Marsh, and F. Kong, 2013: Verification of convection-allowing WRF Model forecasts of the planetary boundary layer using sounding observations. Wea. Forecasting, 28, 842–862, doi:10.1175/WAF-D-12-00103.1.

  • Crowe, B. A., and D. W. Miller, 1999: The benefits of using NEXRAD vertically integrated liquid water as an aviation weather product. Preprints, Eighth Conf. on Aviation, Range, and Aerospace Meteorology, Dallas, TX, Amer. Meteor. Soc., 168–171.

  • Davis, C. A., B. G. Brown, and R. G. Bullock, 2006: Object-based verification of precipitation forecasts. Part II: Application to convective rain systems. Mon. Wea. Rev., 134, 1785–1795, doi:10.1175/MWR3146.1.

  • Dixon, M., and G. Wiener, 1993: TITAN: Thunderstorm identification, tracking, analysis and nowcasting—A radar-based methodology. J. Atmos. Oceanic Technol., 10, 785–797, doi:10.1175/1520-0426(1993)010<0785:TTITAA>2.0.CO;2.

  • Ebert, E. E., and J. L. McBride, 2000: Verification of precipitation in weather systems: Determination of systematic errors. J. Hydrol., 239, 179–202, doi:10.1016/S0022-1694(00)00343-7.

  • Ebert, E. E., and W. A. Gallus Jr., 2009: Toward better understanding of the contiguous rain area (CRA) method for spatial forecast verification. Wea. Forecasting, 24, 1401–1415, doi:10.1175/2009WAF2222252.1.

  • Evans, J. E., and E. R. Ducot, 2006: Corridor Integrated Weather System. MIT Lincoln Lab. J., 16, 59–80.

  • Gallus, W. A., Jr., 2010: Application of object-based verification techniques to ensemble precipitation forecasts. Wea. Forecasting, 25, 144–158, doi:10.1175/2009WAF2222274.1.

  • Geerts, B., 1998: Mesoscale convective systems in the southeast United States during 1994–95: A survey. Wea. Forecasting, 13, 860–869, doi:10.1175/1520-0434(1998)013<0860:MCSITS>2.0.CO;2.

  • Greene, D. R., and R. A. Clark, 1972: Vertically integrated liquid water—A new analysis tool. Mon. Wea. Rev., 100, 548–552, doi:10.1175/1520-0493(1972)100<0548:VILWNA>2.3.CO;2.

  • Hallowell, R. G., and Coauthors, 1999: The Terminal Convective Weather Forecast Demonstration. Preprints, Eighth Conf. on Aviation, Range, and Aerospace Meteorology, Dallas, TX, Amer. Meteor. Soc., 200–204.

  • Hoerling, M., and Coauthors, 2013: An interpretation of the origins of the 2012 Central Great Plains drought. NOAA Drought Task Force Assessment Rep., 50 pp. [Available online at ftp://ftp.oar.noaa.gov/CPO/pdf/mapp/reports/2012-Drought-Interpretation-final.web-041113.pdf.]

  • Hogan, R. J., E. J. O’Connor, and A. J. Illingworth, 2009: Verification of cloud fraction forecasts. Quart. J. Roy. Meteor. Soc., 135, 1494–1511, doi:10.1002/qj.481.

  • Houze, R. A., Jr., 1993: Cloud Dynamics. Academic Press, 573 pp.

  • Jirak, I. L., W. R. Cotton, and R. L. McAnelly, 2003: Satellite and radar survey of mesoscale convective system development. Mon. Wea. Rev., 131, 2428–2449, doi:10.1175/1520-0493(2003)131<2428:SARSOM>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Johnson, A., , and Wang X. , 2013: Object-based evaluation of a storm-scale ensemble during the 2009 NOAA Hazardous Weather Testbed Spring Experiment. Mon. Wea. Rev., 141, 10791098, doi:10.1175/MWR-D-12-00140.1.

    • Search Google Scholar
    • Export Citation
  • Johnson, A., , Wang X. , , Kong F. , , and Xue M. , 2013: Object-based evaluation of the impact of horizontal grid spacing on convection-allowing forecasts. Mon. Wea. Rev., 141, 34133425, doi:10.1175/MWR-D-13-00027.1.

    • Search Google Scholar
    • Export Citation
  • Jolliffe, I. T., 2008: The impenetrable hedge: A note on propriety, equitability and consistency. Meteor. Appl., 15, 2529, doi:10.1002/met.60.

  • Kitzmiller, D. H., W. E. McGovern, and R. F. Saffle, 1995: The WSR-88D severe weather potential algorithm. Wea. Forecasting, 10, 141–159, doi:10.1175/1520-0434(1995)010<0141:TWSWPA>2.0.CO;2.

  • Koch, S. E., B. S. Ferrier, J. S. Kain, M. T. Stoelinga, E. J. Szoke, and S. J. Weiss, 2005: The use of simulated radar reflectivity fields in the diagnosis of mesoscale phenomena from high-resolution WRF Model forecasts. Preprints, 11th Conf. on Mesoscale Processes/32nd Conf. on Radar Meteorology, Albuquerque, NM, Amer. Meteor. Soc., J4J.7. [Available online at https://ams.confex.com/ams/pdfpapers/97032.pdf.]

  • Krozel, J., J. S. B. Mitchell, V. Polishchuk, and J. Prete, 2007: Maximum flow rates for capacity estimation in level flight with convective weather constraints. Air Traffic Control Quart., 15, 209–238.

  • Lack, S., G. J. Limpert, and N. I. Fox, 2010: An object-oriented multiscale verification scheme. Wea. Forecasting, 25, 79–92, doi:10.1175/2009WAF2222245.1.

  • Parker, M. D., and R. H. Johnson, 2000: Organizational modes of midlatitude mesoscale convective systems. Mon. Wea. Rev., 128, 3413–3436, doi:10.1175/1520-0493(2001)129<3413:OMOMMC>2.0.CO;2.

  • Parker, M. D., and D. Ahijevych, 2007: Convective episodes in the east-central United States. Mon. Wea. Rev., 135, 3707–3727, doi:10.1175/2007MWR2098.1.

  • Pinto, J. O., C. Phillips, M. Steiner, R. Rasmussen, N. Oien, M. Dixon, W. Wang, and M. Weisman, 2007: Assessment of the statistical characteristics of thunderstorms simulated with the WRF Model using convection permitting resolution. 33rd Int. Conf. on Radar Meteorology, Cairns, QLD, Australia, 5.5. [Available online at https://ams.confex.com/ams/pdfpapers/123712.pdf.]

  • Pinto, J. O., W. Dupree, S. Weygandt, M. Wolfson, S. Benjamin, and M. Steiner, 2010: Advances in the Consolidated Storm Prediction for Aviation (CoSPA). Preprints, 14th Conf. on Aviation, Range, and Aerospace Meteorology, Atlanta, GA, Amer. Meteor. Soc., J11.2. [Available online at https://ams.confex.com/ams/pdfpapers/163811.pdf.]

  • Pinto, J. O., J. A. Grim, D. Ahijevych, and M. Steiner, 2013: An automated system for detecting large-scale convective storms: Application to model evaluation. Proc. 16th Conf. on Aviation, Range, and Aerospace Meteorology, Austin, TX, Amer. Meteor. Soc., 9.4A. [Available online at https://ams.confex.com/ams/93Annual/webprogram/Paper222079.html.]

  • Schwartz, C. S., and Coauthors, 2009: Next-day convection-allowing WRF Model guidance: A second look at 2-km versus 4-km grid spacing. Mon. Wea. Rev., 137, 3351–3372, doi:10.1175/2009MWR2924.1.

  • Skamarock, W. C., and J. B. Klemp, 2008: A time-split nonhydrostatic atmosphere model for weather research and forecasting applications. J. Comput. Phys., 227, 3465–3485, doi:10.1016/j.jcp.2007.01.037.

  • Smalley, D. J., and B. J. Bennett, 2002: Using ORPG to enhance NEXRAD products to support FAA critical systems. Preprints, 10th Conf. on Aviation, Range, and Aerospace Meteorology, Portland, OR, Amer. Meteor. Soc., 3.6. [Available online at https://ams.confex.com/ams/pdfpapers/38861.pdf.]

  • Steiner, M., R. Bateman, D. Megenhardt, Y. Liu, M. Xu, M. Pocernich, and J. Krozel, 2010: Translation of ensemble weather forecasts into probabilistic air traffic capacity impact. Air Traffic Control Quart., 18, 229–254.

  • Stensrud, D. J., and Coauthors, 2009: Convective-scale warn-on-forecast: A vision for 2020. Bull. Amer. Meteor. Soc., 90, 1487–1499, doi:10.1175/2009BAMS2795.1.

  • Stensrud, D. J., and Coauthors, 2013: Progress and challenges with warn-on-forecast. Atmos. Res., 123, 2–16, doi:10.1016/j.atmosres.2012.04.004.

  • Stratman, D. R., M. C. Coniglio, S. E. Koch, and M. Xue, 2013: Use of multiple verification methods to evaluate forecasts of convection from hot- and cold-start convection allowing models. Wea. Forecasting, 28, 119–138, doi:10.1175/WAF-D-12-00022.1.

  • Thompson, G., P. R. Field, R. M. Rasmussen, and W. D. Hall, 2008: Explicit forecasts of winter precipitation using an improved bulk microphysics scheme: Part II: Implementation of a new snow parameterization. Mon. Wea. Rev., 136, 5095–5115, doi:10.1175/2008MWR2387.1.

  • Weisman, M. L., W. C. Skamarock, and J. B. Klemp, 1997: The resolution dependence of explicitly modeled convective systems. Mon. Wea. Rev., 125, 527–548, doi:10.1175/1520-0493(1997)125<0527:TRDOEM>2.0.CO;2.

  • Weygandt, S., and Coauthors, 2011: The Rapid Refresh—Replacement for the RUC, pre-implementation development and evaluation. Proc. 24th Conf. on Weather and Forecasting/20th Conf. on Numerical Weather Prediction, Seattle, WA, Amer. Meteor. Soc., 12B.1. [Available online at https://ams.confex.com/ams/91Annual/webprogram/Paper183027.html.]

  • Wilks, D. S., 2006: Statistical Methods in the Atmospheric Sciences. 2nd ed. Academic Press, 648 pp.

1

The three-dimensional radar reflectivity mosaic is part of the MRMS data suite produced by the National Severe Storms Laboratory in Norman, Oklahoma.

2

LST is used instead of UTC for many analyses, since the contiguous United States spans ~58° of longitude and, thus, local solar time varies by ~3.9 h across the country.
