This first part of a two-part study on storm-scale radar and satellite data assimilation provides an overview of a multicase study conducted as part of the NOAA Warn-on-Forecast (WoF) project. The NSSL Experimental WoF System for ensembles (NEWS-e) is used to produce storm-scale analyses and forecasts of six diverse severe weather events from spring 2013 and 2014. In this study, only Doppler reflectivity and radial velocity observations (and, when available, surface mesonet data) are assimilated into a 36-member, storm-scale ensemble using an ensemble Kalman filter (EnKF) approach. A series of 1-h ensemble forecasts are then initialized from storm-scale analyses during the 1-h period preceding the onset of storm reports. Of particular interest is the ability of these 0–1-h ensemble forecasts to reproduce the low-level rotational characteristics of supercell thunderstorms, as well as other convective hazards. For the tornado-producing thunderstorms considered in this study, ensemble probabilistic forecasts of low-level rotation generally indicated a rotating thunderstorm approximately 30 min before the time of first observed tornado. Displacement errors (often to the north of tornado-affected areas) associated with vorticity swaths were greatest in those forecasts launched 30–60 min before the time of first tornado. Similar forecasts were produced for a tornadic mesovortex along the leading edge of a bow echo and, again, highlighted a well-defined vorticity swath as much as 30 min prior to the first tornado.
The NOAA Warn-on-Forecast (WoF; Stensrud et al. 2009, 2013) research project is tasked with the development of very short–range (0–1 h) probabilistic forecasts that accurately predict severe convective storms and their hazards. The eventual goal of the WoF project is to increase tornado, severe thunderstorm, and flash flood warning lead times. Storm-scale data assimilation and ensemble forecasting in a future WoF system will most likely be performed on very high–resolution (grid spacing ~1 km or less), event-dependent grids, and will be used to anticipate hazardous weather such as tornadoes, damaging straight-line winds, hail, and flooding. The WoF system will need to ingest a variety of data types ranging from ground-based radar and mesonet observations to remotely sensed observations from satellites, such as cloud water path retrievals and radiances. To this end, development and testing of subhourly storm-scale data assimilation with a Weather Research and Forecasting (WRF) Model–based system has begun on somewhat coarser convection-allowing grids (3-km grid spacing). In this first part of a two-part study that evaluates forecasts produced by an experimental WoF system, results from several radar data–only assimilation experiments are presented. Additional work in Jones et al. [(2015, manuscript submitted to Wea. Forecasting), hereafter Part II] further evaluates the impact of combined radar and satellite data assimilation.
The NSSL Experimental WoF System for ensembles (NEWS-e; described fully in the next section) uses an ensemble Kalman filter (EnKF; Evensen 1994; Houtekamer and Mitchell 1998; Anderson and Collins 2007) to assimilate Doppler observations from multiple radars into the storm-scale ensemble. A number of earlier radar data assimilation studies have used this approach with a simplified cloud model, and demonstrated the ability of a storm-scale EnKF to produce accurate analyses of isolated supercell thunderstorms and improved forecasts (e.g., Snyder and Zhang 2003; Dowell et al. 2004; Tong and Xue 2005; Yussouf and Stensrud 2010). Aksoy et al. (2009, 2010) performed EnKF simulations of supercellular, linear, and multicellular convective modes using an idealized approach similar to the above-referenced studies, and found that the EnKF can perform effectively across all convective regimes. They also found that the representation of mesoscale uncertainty (in the form of perturbations to the horizontal wind components of the environmental sounding) significantly increased ensemble spread and improved fits to observations for both reflectivity and radial velocity.
Consistent with this previous work by Aksoy et al. (2009, 2010), studies by Stensrud and Gao (2010) and Yussouf et al. (2013a) found that an improved representation of mesoscale heterogeneity in the near-storm environment produced more accurate EnKF analyses of tornadic supercell thunderstorms and improved probabilistic forecasts of low-level rotation. More recent data assimilation case studies of isolated supercell thunderstorms (e.g., Dowell et al. 2010; Jung et al. 2012; Yussouf et al. 2013b) and mesoconvective systems (e.g., Snook et al. 2011, 2012; Putnam et al. 2014; Chang et al. 2014; Wheatley et al. 2014; Johnson et al. 2015) have increasingly used storm-scale EnKFs in conjunction with models initialized with real data, which are then run with full-physics options (in contrast to idealized simulations, where initial conditions are supplied by a one-dimensional sounding that varies with height). In addition to these retrospective studies, the EnKF has also been used in real-time forecast systems to assimilate conventional observations (i.e., no radar data), and then launch convection-permitting forecasts in the 1–12-h and next-day ranges (e.g., Romine et al. 2014; Schwartz et al. 2015).
In light of these previous works, this paper aims to provide an overview of a multicase study, with a model that uses full physics and mesoscale variability, conducted as part of development efforts surrounding the WoF project. For several severe weather events, NEWS-e is used to produce storm-scale analyses by only assimilating Doppler reflectivity and radial velocity observations from several radars (and, when available, surface mesonet data). The aspects of these experiments described in the next section—such as storm-scale grid placement and radar data assimilation start time and frequency—are chosen to mimic the constraints of an initial real-time WoF system, which will be on demand and focused on the “next hour” forecast problem. A series of 1-h ensemble forecasts (initialized from storm-scale analyses) are launched during the 30–60-min period preceding the onset of storm reports. Of particular interest is the ability of these 0–1-h ensemble forecasts to reproduce the low-level rotational characteristics of supercell thunderstorms. This suite of forecasts also will serve as a control for additional storm-scale experiments, conducted to evaluate the impact of combined radar and satellite data assimilation, which are presented in Part II.
Six severe weather events from spring 2013 and 2014 are considered as part of this study, in an effort to expand beyond the traditional approach of rigorously evaluating a single case study. The first three events were selected from spring 2013 and occurred over central Oklahoma, with a focus—like past studies—on isolated supercell thunderstorms (Figs. 1a,c,e). These include the Edmond-Carney tornado on 19 May (Fig. 1a), the Moore tornado on 20 May (Fig. 1c), and the El Reno tornado on 31 May (Fig. 1e), which were rated on the enhanced Fujita (EF) scale as EF3, EF5, and EF3, respectively.
The remaining three events are selected from spring 2014 and span different geographical regions (Figs. 2a,c,e). The first two events of 2014 were associated with a powerful springtime storm system that slowly migrated eastward across the central plains. On 27 April 2014, the Storm Prediction Center (SPC) day 1 convective outlook (issued at 2000 UTC) called for a moderate to high risk of severe thunderstorms across much of Arkansas. The only tornadic supercell thunderstorm tracked just to the north and east of Little Rock [see location of Little Rock’s radar (KLZK) in Fig. 2a], producing an EF4 tornado. The 28 April 2014 event (Fig. 2c) produced a widespread tornado outbreak over Mississippi and Alabama. There were 21 confirmed tornadoes across the Jackson, Mississippi, National Weather Service (NWS) forecast area alone. The EF4 that affected the city of Louisville, Mississippi, was the most devastating, resulting in 10 fatalities and 81 injuries. The final event for study occurred on 11–12 May over eastern Nebraska and western Iowa, and was marked by a supercell-to-MCS transition, with both convective modes associated with tornado and wind reports (Fig. 2e).
The next section provides a discussion of the ensemble and EnKF methodology applied to the above-described severe weather events. The assimilation results are presented in section 3. Finally, in section 4, the results of this study are summarized, their implications are discussed, and suggestions for future research on this topic are offered.
A multiscale data assimilation system using the advanced research version of WRF, version 3.4.1 (ARW; Skamarock et al. 2008), has been used to produce storm-scale ensemble analyses and forecasts of six diverse severe weather events during spring 2013 and 2014 (Table 1). The storm-scale ensemble is nested within a mesoscale ensemble that assimilates routinely available mesoscale observations, as well as observations from mesonet stations (see, e.g., McPherson et al. 2007). Radar observations are assimilated into the higher-resolution storm-scale ensemble with the ensemble adjustment Kalman filter (EAKF1; Anderson 2001) encoded in the Data Assimilation Research Testbed (DART; Anderson and Collins 2007; Anderson et al. 2009) software. The goal of the more detailed methodology that follows is to obtain reasonable analyses of the supercells present in each event and then attempt prediction of their subsequent evolution.
a. Mesoscale data assimilation
The experimental domain utilizes a one-way nest setup, whereby the parent and nested grids are run concurrently and the parent grid provides lateral boundary conditions for the nested run (with no feedback from the nested grid to the parent grid). The parent grid has a horizontal gridpoint spacing of 15 km and covers the continental United States, while the nested grid has a horizontal gridpoint spacing of 3 km and its location is event dependent (Fig. 3). Placement of the nested grid reflects the most likely region for severe convective development as guided by day 2 convective outlooks from the Storm Prediction Center. The vertical grid has 51 levels that are spaced from less than 100 m apart near the surface to over 1 km apart at model top, defined as the 10-hPa pressure surface. The prescription of vertical levels is nearly identical to that of the operational Rapid Refresh (RAP) and High-Resolution Rapid Refresh (HRRR) models, with some increased resolution (i.e., five additional sigma surfaces) below 2 km AGL. The DART state vector, which is updated by the EnKF, is composed of the WRF Model’s prognostic variables, including zonal and meridional velocity components (u and υ, respectively) in Cartesian coordinates, vertical velocity w, perturbation potential temperature, perturbation geopotential, and the perturbation surface pressure of dry air. The parent and nested grids utilize the Thompson microphysical scheme (Thompson et al. 2004, 2008), so the DART state vector is augmented by the six mass variables for water vapor, cloud water, rain, cloud ice, snow, graupel, as well as number concentration variables for rain and ice.
Initial conditions for the parent and nested grids are downscaled from the 21-member 0000 UTC Global Ensemble Forecast System (GEFS) forecast cycle. The GEFS also provides boundary conditions for the parent grid, which provides boundary conditions for the nested grid via the one-way nesting capability with the WRF Model. Different sets of WRF Model physics options are applied to each ensemble member to account for model physics uncertainties (e.g., Stensrud et al. 2000; Fujita et al. 2007; Meng and Zhang 2008; Wheatley et al. 2014; see Table 2). The physics options are the same on the parent and nested grids, except that no cumulus parameterization is used on the nested grid.
Mesoscale observations of pressure (altimeter setting), temperature, dewpoint, and horizontal wind components from the NOAA Meteorological Assimilation Data Ingest System2 (MADIS; Miller et al. 2007), as well as MADIS mesonet data, are ingested into the ensemble using the parallel version of the DART software, which is based upon the ensemble adjustment Kalman filter described in Anderson (2001). No radar data are used in producing the initial mesoscale “backgrounds” on either grid. The mesoscale data assimilation process begins at 0000 UTC on an event day with the forecast step, whereby a 1-h forecast is made from each ensemble member. The first mesoscale analysis is produced at 0100 UTC on day 1 with the assimilation of all mesoscale data observed during a 30-min window centered on the analysis time (i.e., 0100 UTC). Mesoscale observations are then assimilated every 1 h out to the beginning of a storm-scale experiment. The start times for the six storm-scale experiments were guided by mesoscale discussions and watch issuance from the Storm Prediction Center and are event dependent, with all occurring at the top of the hour between 1800 and 2100 UTC on day 1. While each experiment start time began within 30 min of watch issuance, the length of the storm-scale experiments (i.e., the number of radar data assimilation cycles) was then dictated by event. Since the mesoscale observations are available hourly with ~30-min latency, there was no mesoscale data assimilation on the parent grid once the temporally higher-resolution radar data assimilation began on the storm-scale (3 km) grid. The parent grid is merely integrated forward to continue to provide boundary conditions for the storm-scale grid.
The fifth-order piecewise rational function of Gaspari and Cohn [(1999); their Eq. (4.10)]—a Gaussian-like spatial localization function—is used to lessen the impact of or remove spurious correlations with state variables at considerable distances from observations and correlations associated with less than optimal ensemble size. Except for mesonet observations, the horizontal localization cutoff for all mesoscale observations is ~458 km, while the vertical localization cutoff (i.e., where the weight reaches zero) is 8 km [consistent with Wheatley et al. (2012)]. The relatively high-density mesonet observations have a horizontal localization cutoff of ~60 km [consistent with Sobash and Stensrud (2015)] with the same vertical localization cutoff as above. In an effort to maintain ensemble spread, the spatially and temporally varying prior adaptive inflation (Anderson 2007; DART namelist options: inf_initial = 1.0, inf_sd_initial = 0.8, inf_damping = 0.9, inf_lower_bound = 1.0, inf_upper_bound = 100, and inf_sd_lower_bound = 0.8) is applied to the prior ensemble estimate at the outset of each assimilation step.
b. Storm-scale data assimilation
Each storm-scale data assimilation experiment utilizes level II reflectivity greater than 10 dBZ and radial velocity observations from three WSR-88Ds closest to the studied convection (see Table 1 for event-specific radars). Reflectivity values below 10 dBZ are less likely to indicate precipitation and are at times noisy. The radar data are objectively analyzed to a 6-km Cartesian grid on each of the original conical elevation scans, using the Cressman scheme (Cressman 1959) encoded in the Observation Processing and Wind Synthesis (OPAWS; Majcen et al. 2008) software. The radar data are interpolated in an effort to decorrelate the observation errors, and to remove signals in the data of less than two horizontal grid lengths.
For each experiment, the first storm-scale analysis is produced at the top of the start hour with the assimilation of all radar data observed during a 5-min window centered on the analysis time. The use of a wider data window can give rise to spurious convection in the storm-scale analyses, owing to the convective time scale and often-rapid motion of severe thunderstorms. Radar data are then assimilated every 15 min until the time of observed tornado touchdown3 (see Table 1). Observation error standard deviations for radial velocity and reflectivity were fixed at 3 m s−1 and 5 dBZ, similar to the previous studies of Dowell et al. (2004), Aksoy et al. (2009), and Yussouf et al. (2013b). An outlier test in DART is used to discard any radar observation where the squared difference between an observation value and its prior ensemble mean estimate exceeds 3 times the sum of the observation error variance and prior ensemble variance. This test discards approximately 10%–20% of the reflectivity observations during the first three to five update cycles (i.e., during the spinup), and less than 10% in later update cycles. No more than a few percentage points of the radial velocity observations are rejected as outliers during the storm-scale data assimilation. The additive noise technique described in Dowell and Wicker (2009) is used to maintain spread during radar data assimilation. Local perturbations are added to each member’s wind (0.5 m s−1), temperature (0.5 K), and dewpoint (0.5 K) fields where the reflectivity observations exceed 25 dBZ (i.e., indicate precipitation) and the absolute difference between a reflectivity observation and its posterior ensemble mean estimate is <10 dBZ (Sobash and Wicker 2015). Where available (because of grid placement), the Oklahoma Mesonet data valid during the same 5-min window as the radar observations are also assimilated onto the 3-km grid.
Horizontal and vertical localization cutoffs of 18 and 6 km, respectively, are used for radar data. These choices for length scales are guided by previous data assimilation case studies of an isolated supercell thunderstorm and MCS by Yussouf et al. (2013b) and Wheatley et al. (2014). It is also noted that Sobash and Stensrud (2013) found that larger horizontal localization cutoffs could be beneficial when convective evolution is dominated by thunderstorm interactions or upscale growth into a mesoconvective system.
Beginning at ~1 h prior to the time of observed tornado touchdown for the six events (see Table 1), a series of ensemble forecasts are generated at 15-min intervals following the convective-scale data assimilation procedure. The analyses of the parent and nested grid from each ensemble member serve as initial conditions for the forecasts, while the downscaled GEFS members (see section 2a) provide the boundary conditions for the parent domain. Each forecast is run out to 1 h, with 5-min history files generated in order to permit high-temporal-resolution analysis of the impact of continued radar data assimilation on forecast evolution.
a. Observation-space diagnostics
For each severe thunderstorm event, the overall EnKF system performance is first evaluated by calculating observation-space diagnostics for the radar reflectivity and radial velocity observations ingested during the radar data assimilation period (Dawson et al. 2012; Yussouf et al. 2013b). The first diagnostic, root-mean-square innovation (RMSI), provides a measure of the overall fit of the analyses to the observations and is defined as
where or is the innovation (i.e., the difference between an observational value and the interpolated ensemble mean value), y0 is the observation, H is the forward operator that maps the model state to the observation location and type, xf and xa are particular model fields from the forecast or analysis state vector, an overbar indicates the ensemble mean, and angle brackets indicate an average at all observation locations within a radar volume scan. The second measure, total ensemble spread (Dowell and Wicker 2009), is defined as
where the observation error standard deviation is assumed to be 5 dBZ and 3 m s−1 for reflectivity and radial velocity observations, respectively; N (=36) is the ensemble size; and n is the ensemble member index. It is noted that radial velocity statistics are calculated at all observed locations within the 3-km model domain, while reflectivity statistics are calculated at locations where observed reflectivity exceeds 10 dBZ (as smaller values were not assimilated). This threshold is used such that the performance measure is more representative of the region in and around the convective system (Aksoy et al. 2009; Dowell et al. 2011; Dawson et al. 2012; Jung et al. 2012).
For reflectivity, values of prior RMSI in excess of 10 dBZ are common in those experiments where radar data assimilation begins around convective initiation and the model state (devoid of the observed convection) is being adjusted by a relatively small sample of observations (Figs. 4a,c,e and 5a,c,e). In general, these values steadily decrease with each subsequent assimilation cycle, with later prior RMSI values ranging from ~7 to 9 dBZ. The prior RMSI magnitudes for radial velocity range from ~4 to 5 m s−1 throughout the assimilation period (Figs. 4b,d,f and 5b,d,f). As convection initiates and matures, prior spread values associated with reflectivity grow in the first 30 min of assimilation in response to the additive noise and adaptive inflation, but then stabilize for the remainder of the assimilation period, ranging from ~8 to 9 dBZ. For all events, prior spread values for radial velocity range approximately from 4 to 6 m s−1 throughout the assimilation period.
The above numbers for the RMSI and total spread for each radar reflectivity and radial velocity are put into additional context by calculating the consistency ratio (see Dawson et al. 2012; Yussouf et al. 2013b), which gauges whether the prior spread and RMSI are in tune with the assumed observation error. This ratio is defined as
where a value of ~1.0 is indicative of sufficient ensemble prior spread for the assumed observation errors. In each case, the consistency ratios for reflectivity generally fall within the range 0.7–1.3, beyond the first few assimilation cycles (Fig. 6). The consistency ratios for radial velocity over the same period are somewhat larger and, generally, fall within the range 1.0–1.5. The values of the consistency ratio suggest a reasonable system configuration at the end of the spinup.
The quality of observation-space diagnostics for radar data are shown primarily to demonstrate that the data assimilation system is tuned reasonably well for these observations and has similar characteristics to prior radar data assimilation cases for supercellular and multicellular convection. Unfortunately, even a good model fit to the observations during the assimilation phase does not guarantee improved storm-scale forecasts. The following section evaluates the forecasts produced by NEWS-e, relevant to tornadic storms and convectively driven high-wind events.
b. Model-space diagnostics
After just five to nine radar data assimilation cycles (with the exception of the 27 April 2014 tornadic event), the ensemble posterior mean reflectivity fields derived from EnKF updates closest to the time of the first tornado report show simulated thunderstorms in proximity to those displayed in observed radar data [provided by the Multi-Radar/Multi-Sensor (MRMS) system] from each event (see Figs. 1 and 2). The simulated core of 40 dBZ or greater in the 11 May 2014 tornadic event shows the greatest discrepancy with its observed storm, with a highly underdeveloped eastern flank and spurious convection just to its north. It is also noted that, in general, the simulated cores of greater than 40 dBZ are spatially smaller than those of observed storms. This result is underscored by Figs. 4 and 5, where there was a persistent low bias in the reflectivity statistics, for each event. Such a bias may, in part, be attributable to the storm-scale grid’s 3-km horizontal spacing, characteristic behavior of the microphysical scheme, or a combination of both.
Figures 1 and 2 provide some confidence in the posterior ensemble mean fields produced by radar data assimilation, but they only show a single occurrence in the respective storms’ evolutions. To provide a better sense of ensemble “spinup” resulting from the radar data assimilation cycling, plots of maximum vertical velocity in the 2–5-km layer versus forecast time were derived from the storm-scale ensemble, within the subdomains (of the full storm-scale grid) displayed in Figs. 10 and 13 (Fig. 7). Maximum vertical velocity values are recorded for each ensemble member at the analysis time used to initialize the forecast and then from 5-min ensemble forecast output through the forecast end time. The ensemble mean maximum vertical velocity value at each forecast output time is calculated by averaging all members that lie within the 10th and 90th percentiles, to lessen the impact of outliers. Plots from every other forecast are shown for each event in Fig. 7, starting with the first ensemble forecast (initialized from the first storm-scale analysis) and ending with the last ensemble forecast, around the time of the first tornado report.
The three severe weather events from spring 2013, as well as the 11–12 May 2014 event, are similar in that radar data assimilation began around the time of convective initiation. In contrast, the severe weather events on 27 and 28 April were longer-duration events and characterized by more widespread regional coverage of convection, such that some elements of convection are ongoing as radar data assimilation begins. For the former events, except on 31 May 2013 (which was characterized by a very high-CAPE, high-shear environment), maximum vertical velocity values begin at less than 5 m s−1 and increase over the period of the 1-h forecast, suggesting that the radar data assimilation introduces storms into the model state that continue to grow in intensity (Figs. 7a–f). Maximum vertical velocity values gradually level off over the next five to nine radar data assimilation cycles—consistent with Yussouf and Stensrud (2010)—with many of the curves at these later forecast times showing a period of spinup (and then spindown) in the first 15–30 min of a given forecast. For the 27 and 28 April 2014 events, maximum vertical velocity values begin around 10 m s−1, as convection is ongoing on the storm-scale grid, and quickly level off to values of 10–15 m s−1 in later forecasts. The somewhat weaker updrafts for these two events likely are owed to the relatively lower-CAPE environments in which they developed.
It is also instructive to briefly consider the near-storm environment in which the simulated storms evolve, in order to provide broader context to the ensemble probabilistic forecasts presented below. To this end, observed 0–6-km hodographs derived from special NWS soundings (i.e., those soundings released at times other than 0000 and 1200 UTC, in anticipation of severe weather) have been compared to ensemble mean 0–6-km hodographs. The ensemble mean hodographs exhibit the general shape of the observed hodographs for each event (Fig. 8); however, for each event, it is seen that the observed hodograph often lies outside the individual ensemble members. The hodographs calculated at 2100 UTC 27 April 2014 from the individual ensemble members (Fig. 8d) show a rather wide distribution about the mean, particularly in the 0–1-km layer, owing to effects of nearby convection in the model.
Finally, temperature properties of the simulated, near-surface cold pools are evaluated by calculating the ensemble mean 2-m temperature field at the EnKF analysis time used to initialize an ensemble forecast (varies by event), and then from 5-min ensemble forecast output over the next 1-h period. Over the entire forecast period (13 output times), the minimum 2-m temperature value that occurs at each model grid point is retrieved. The minimum of hourly surface observations available for the same period (varies by event) is then overlaid on the resultant model temperature fields (see Fig. 9). In this study, the Oklahoma Mesonet provides the only relatively dense, high-temporal-frequency dataset for comparison, so the primary focus is on the three severe weather events that occurred over central Oklahoma.
Using the above forecast temperature fields, a domain-averaged RMS value was calculated for each of the 19, 20, and 31 May severe weather events, and these values were found to be 2.2°, 1.4°, and 1.7°C, respectively. More specifically, in Fig. 9, slight overcooling on the order of 1°–3°C is noted in some instances along the peripheries of cold pools, suggesting their spatial coverage is larger than those of the observed cold pools. Still, the above error magnitudes are on the order of the assumed observation-error standard deviation for temperature, and would hopefully make only minor contributions to errors in storm motion and evolution in these EnKF simulations. Part II includes further discussion on the forecasts of simulated surface temperature.
c. Ensemble probabilistic forecasts of low-level rotation
For each event, a series of ensemble probabilistic forecasts (initialized from storm-scale analyses) are launched during the 1-h period preceding the first tornado report. The ability of these forecasts to reproduce the low-level rotational characteristics of supercell thunderstorms is evaluated by calculating the probability of 0–2-km vertical vorticity greater than 0.004 s−1 over the 1-h forecast period, using the 3 × 3 neighborhood around the corresponding horizontal grid point. [Similar magnitudes of vorticity were utilized at 3-km horizontal gridpoint spacing in previous works by Trapp et al. (2007) and Yussouf et al. (2013b, 2015).] At a given time and horizontal grid point, the probability is approximated by the number of ensemble members in which the vertical vorticity exceeds the above threshold on any model sigma level below 2 km AGL, divided by the ensemble size. Vorticity fields are first calculated at the analysis time used to initialize the forecast, and then from 5-min ensemble forecast output through the forecast end time. The maximum probability recorded at each horizontal grid point is shown. The reader is reminded that—at 3-km horizontal gridpoint spacing—the resultant vorticity swaths may indicate the presence of rotating supercells, but do not indicate model-predicted tornados or even necessarily tornadic supercells. Specifically, Trapp et al. (2005) found that ~40% of thunderstorms with low-level mesocyclone detections were associated with tornadoes.
1) Warm season 2013
The first ensemble forecasts were launched at 2015, 1900, and 2200 UTC, respectively, for the 19, 20, and 31 May tornadic events. For the 19 May event, six radar data assimilation cycles were performed in the 75 min leading up to the first ensemble forecast. Similarly, for the 20 and 31 May events, five radar data assimilation cycles were performed in the 60 min before the first ensemble forecasts.
The first ensemble forecasts for the 19 and 20 May tornadic events produce probability magnitudes of ~0.2–0.4 (or less) prior to the first observed tornado reports (Figs. 10a,d). In contrast, the first ensemble forecast for the 31 May tornadic events produces significantly higher probabilities (greater than 0.6), which are spatially collocated with the observed tornado track (Fig. 10g), and similarly high probabilities persist in all subsequent forecasts launched for this event (Figs. 10h,i). More defined, albeit relatively weak (~0.4 or less), vorticity swaths are present in the 1-h ensemble forecasts launched approximately 45 min prior to the 19 and 20 May 2013 tornadoes, but they are displaced to the north of the observed tornado tracks (not shown). In each of these events, the northward displacement decreases in later forecasts, while the probability magnitudes associated with the 20 May tornadic event also increase (>0.6; Figs. 10b,e). Much like the 31 May 2013 tornadic event, the ensemble forecasts launched on 19 and 20 May within 30 min of the first tornado reports produce significantly higher probabilities (>0.6) that are spatially collocated with the observed tornado track (see, e.g., Figs. 10c,f).
The source of displacement errors in the vorticity swaths forecasts for 19 and 20 May 2013 is likely twofold. In each of the 19, 20, and 31 May 2013 events, the first 1-h ensemble forecasts are launched after relatively few4 (five, six, and five, respectively) radar assimilation cycles, when the horizontal scale of the developing convection is smaller than tens of kilometers (see black contours in Figs. 10a,b,d,e). On 19 May, only the largest convective cell (of three observed) is present in the ensemble mean reflectivity field at 2015 UTC (Fig. 11). The analysis increments appropriately reinforce this updraft but this has little initial impact on the smaller updrafts. Thus, the earlier forecasts miss a storm merger in the southwestern quadrant of the largest convective cell that appears to influence the location of maximum low-level rotation at later times (Figs. 10a–c).
In contrast to the 19 and 20 May events, observed convective initiation occurs rather rapidly on 31 May 2013 during the period 2130–2200 UTC, and it grows to larger spatial scales more quickly (than storms during the early development stages on 19 and 20 May). Thus, on 31 May, the simulated storms are likely better resolved at these earlier times by the 3-km mesh, which may support retention of information introduced by the radar data assimilation. It should also be noted that the observed and modeled storm environments on 31 May were very conducive to generating rotating thunderstorms (Fig. 8c). This environmental constraint may make this event inherently more predictable than the 19 and 20 May events, such that low probabilities at longer lead times in the two earlier events are entirely appropriate and a necessary consideration in the forecast decision process.
The above comparison of ensemble probabilistic forecasts of low-level rotation to observed tornado tracks qualitatively suggests forecast improvements with additional radar data assimilation cycles. For a more quantitative assessment, the equitable threat score (ETS; Wilks 2006) is computed in model space using the forecast ensemble mean reflectivity field from each experiment. For horizontal grid points that lie on the ~3-km model vertical grid level, a “hit” is counted if the ensemble mean reflectivity field and MRMS reflectivity exceed a 40-dBZ threshold within ±3 km horizontally and ±1 model levels vertically of a given grid point. A “false detection” is counted if the ensemble mean reflectivity exceeds the threshold of 40 dBZ and the observations do not. A “correct null forecast” is counted if neither the ensemble mean reflectivity field nor observations exceed this threshold. For a given event, a single ETS curve is plotted for each of the five forecasts launched during the 1-h period preceding the time of first tornado report, by averaging all ETS curves generated from individual ensemble members.
For a given experiment and forecast, it is shown that forecasts launched with each successive radar data assimilation cycle generally produce higher ETS than earlier forecasts, with this trend continuing until the last forecast around the time of first tornado report (Figs. 12a–c). The higher ETS relative to earlier forecasts is most pronounced over the first 30 min of a given forecast. In all forecasts across events, though, the ETS does decrease over time such that the skill of any newer forecast is much less pronounced in the trailing 15–30 min. Decreasing scores with time may be, in part, a consequence of storm dissipation; however, the earlier analysis of maximum vertical velocity with time (see Figs. 7a–c) showed that storms introduced into the ensemble were able to grow upscale and persist. Dispersion of storm “objects” in the ensemble is another factor in the lower scores, as increasing the neighborhood radius to 6 km (not shown in Fig. 12) produced somewhat higher ETSs at later forecast times.
The next subsection addresses ensemble probabilistic forecasts during the warm season of 2014.
2) Warm season 2014
The first ensemble forecasts were launched at 2300 UTC for the 27 April tornadic event, and at 2000 UTC for both the 28 April and 11 May tornadic events. The 27 and 28 April tornadic events were both characterized by widespread convection over several hours. Thirteen radar data assimilation cycles were performed over 3 h before the first ensemble forecast on 27 April, while nine radar data assimilation cycles were performed over 2 h before the first ensemble forecast on 28 April. For the 11 May event, five radar data assimilation cycles were performed in the hour period preceding the first ensemble forecast.
The first ensemble forecast for the 27 April event produces probabilities of ~0.4–0.6 in association with the eventual tornado-producing thunderstorm (Fig. 13a). The first ensemble forecast for the 28 April event produces somewhat higher probabilities (greater than 0.6) in association with the thunderstorm that eventually produced the Louisville, Mississippi, tornado, which had a history of producing several brief tornado touchdowns (storm 1 in Fig. 13d). In both events, these higher probabilities are notably greater than those produced at similar lead times (greater than 30 min prior to the first tornado report) for the 19 and 20 May 2013 tornado events, in which the radar data assimilation begins around the time of convective initiation. In contrast, on 11 May 2014, the first radar data assimilation cycle occurs within the hour preceding initiation, and no significant probabilities of low-level rotation are present in the forecasts until just 15 min before the first tornado report (Fig. 13g). It is noted, though, that (ongoing) cyclic mesocyclogenesis associated with the 11 May 2014 tornadic thunderstorm also appears to affect forecast quality. Multiple circulations are apparent in observed radial velocity fields (not shown) and in ensemble probabilistic forecasts as two local maxima in the probability field (see, e.g., Figs. 13h,i). However, the 15-min radar data assimilation updates are probably insufficient to temporally resolve this rapid physical process onto the model grid, particularly at the earlier forecast start times.
Later ensemble forecasts (i.e., those launched within 30 min of the first tornado report) on 27 and 28 April 2014 show clearly defined vorticity swaths associated with the observed tornadic thunderstorms (see, e.g., Figs. 13c,f). However, these same forecasts suggest additional rotating thunderstorms within the experimental domains (as observed for these events), most of which did not produce tornadoes. The ensemble forecasts launched between 2300 UTC 27 April and 0000 UTC 28 April suggest as many as four additional rotating thunderstorms (see Fig. 13a). Vorticity swaths associated with these nontornadic thunderstorms possess probability magnitudes of ~0.6–0.8, while probability magnitudes quickly exceed 0.8 with the only tornado-producing thunderstorm. Only at the last forecast time (0000 UTC 28 April) does any vorticity swath associated with a nontornadic thunderstorm produce probabilities in excess of 0.8.
The ensemble forecasts launched between 2000 and 2100 UTC 28 April show an additional vorticity swath tens of kilometers north-northeast of the relatively long-track tornado in Louisville, Mississippi, in association with a thunderstorm that produced a brief tornado touchdown between 1930 and 2000 UTC (storm 3 in Fig. 13d). In the forecast launched at 2000 UTC, probability magnitudes associated with this thunderstorm exceed those of the thunderstorm that produces the Louisville tornado (storm 1 in Fig. 13d), but they rapidly decrease in later forecasts (Figs. 13e,f). This forecast trend suggests the diminishing probability that the simulated storm will produce tornadoes. It is also noted that the ensemble forecasts for this event produce low probabilities of rotation in another thunderstorm located just north of the thunderstorm that produces the Louisville tornado (storm 2 in Fig. 13d). Ensemble probabilistic forecasts of reflectivity structure suggest that the simulated storm exhibits storm motion with too strong a north-northeasterly component (Fig. 14), such that it is undercut by the cold pool from the storm just to its north (not shown).
For additional context, it is noted that ETS curves were calculated for the 2014 warm season events using the same methodology employed for the 2013 events. Discussion of the ETS curves for the 2014 events (Figs. 12d–f) follows that provided above for the 2013 events (Figs. 12a–c). For each event, forecasts launched with successive radar data assimilation cycles generally produce higher ETSs than previous forecasts, particularly over the first 30 min of the most recent forecast. Furthermore, decreasing scores with time are likely a result in large part of the dispersion of storm “objects” in the ensemble, as the earlier analysis of maximum vertical velocity with time (Figs. 7d–f) showed that storms introduced into the ensemble were able to grow upscale and persist.
Thus, the combined suite of ensemble probabilistic forecasts of low-level rotation produced during tornadic events from spring 2013 and 2014 suggest that it is possible to highlight strongly rotating thunderstorms at lead times of ~30 min before the time of first tornado report. Particularly in cases of widespread convection, though, additional work (and likely greater horizontal gridpoint resolution) will be needed to understand whether these types of forecasts can predict the probability that a given storm will produce tornadoes.
3) Application of the forecasts for upscale growth and organization
Radar data were assimilated every 15 min for the period from 1900 UTC 11 May to 0300 UTC 12 May 2014 to gauge whether sampling errors (caused by the limited number of ensemble members) slowly degrade the quality of forecasts for a long-duration severe weather event. This period was marked by a supercell-to-MCS transition over eastern Nebraska and western Iowa, with both convective modes associated with tornado and high-wind reports. By 0300 UTC, a well-organized tornadic MCS was moving east-northeast across western Iowa. The most significant tornado occurring during this stage of the event produced a 10.35-mi-long path of EF2 damage just west of Des Moines, Iowa.
Ensemble mean reflectivity after 33 EnKF updates (i.e., at 0300 UTC) suggests that any accumulation of analysis errors has yet to prevent a reasonable simulation of this event (Fig. 15). Storm-scale analyses at these later times were used to launch a series of 1-h ensemble forecasts for the periods 0200–0300 (approximately 1 h before the EF2 tornado), 0230–0330, and 0300–0400 UTC (Figs. 16a–c). Probabilities of ~0.6–0.8 on the northern end of the line persist in all three forecast periods, in association with the northern line-end vortex (which produced no wind or tornado reports). More isolated convection along the southern end of the line also produces a second area of generally weaker probabilities (~0.4–0.6) to the south-southwest of the MCS.
Several local maxima in the probability field occur along the line’s north–south extent, in association with mesovortices embedded within the leading-edge convection. Some of these mesovortices are only marginally resolved and more transient in nature. In all three forecasts, though, the tornadic mesovortex—associated with the EF2 tornado—produces vorticity swaths (with probability magnitudes of 0.6–0.8 and greater) coincident with observed tornado tracks. While this finding suggests that NEWS-e has the ability to highlight potentially hazardous bow-echo mesovortices, future work (including more MCS case studies) will be needed to see if these types of forecasts can establish the categorization between nondamaging and damaging mesovortices.
4. Summary and discussion
In support of the Warn-on-Forecast project, NEWS-e has been used to produce storm-scale ensemble analyses and forecasts of six diverse severe weather events from spring 2013 and 2014. Reflectivity and radial velocity observations from several radars were assimilated—beginning just prior to convective initiation—using an EnKF approach, and a series of 1-h ensemble forecasts were then initialized from storm-scale analyses during the 1-h period preceding the onset of storm reports. Of particular interest was the ability of these 0–1-h ensemble forecasts to reproduce the low-level rotational characteristics of supercell thunderstorms, as well as other convective hazards.
All six EnKF experiments are able to reproduce rotating thunderstorms in association with the observed tornado-producing storms. With the exception of the 11 May 2014 tornadic event in central Nebraska, clearly defined vorticity swaths were generally present in 1-h ensemble forecasts launched approximately 30 min before the first tornado report. Displacement errors (often to the north of tornado-affected areas) associated with vorticity swaths were greatest in those forecasts launched 30–60 min before the time of first tornado. While other storm-scale data assimilation experiments have noted similar displacement errors and/or too fast storm motion [including those utilizing other weather forecasting models; e.g., Dawson et al. (2012)], an observation system simulation experiment (OSSE) performed at the storm scale by Potvin and Wicker (2013) did not see pronounced displacement or motion errors. This discrepancy suggests that these types of errors may be related to model error or some nonlinear effect between model error and the radar data assimilation.
In those events in which the observed convection initiates and matures rapidly (e.g., 31 May 2013) or is ongoing around the time of the first radar data assimilation cycle (e.g., 27 and 28 April 2014), ensemble probabilistic forecasts of low-level rotation often possess probability magnitudes of ~0.4–0.6 or greater as much as 45–60 min out from the first tornado report. In those events in which the first radar data assimilation cycle precedes convective initiation and matures more slowly (e.g., 19 and 20 May 2013), the probability magnitudes were generally lower (less than ~0.4–0.6) in those forecasts launched greater than 30 min out from the first tornado report. For all events, ensemble probabilistic forecasts of low-level rotation launched within 30 min of the first tornado report generally indicated rotating thunderstorms in close proximity to the observed tornado tracks.
Additional EnKF experiments suggest that NEWS-e is suitable for assimilating radar data over long-duration events, including severe MCSs. Similar forecasts were produced for a tornadic mesovortex along the leading edge of a bow echo and, again, produce a well-defined vorticity swath as much as 30 min prior to the first tornado.
The above results were generated using a relatively coarse 15-min radar data assimilation interval. However, the latency period associated with processing incoming data, running a complex data assimilation system, and visualizing the output will preclude efforts to run a quasi-real-time data assimilation system at higher temporal frequencies, in the near term. Still, there are a number of areas for active research. A next logical step will be to run the system at smaller horizontal gridpoint spacing, in order to investigate any potential improvements in the representation of convection in EnKF updates and subsequent ensemble forecasts. For example, in examining the 19 May 2013 event, the merger of the main updraft with a smaller storm on the southwestern flank—the latter of which is poorly resolved until sufficient upscale growth—yields a vorticity swath displaced too far northward. Another area of research involves distinguishing between nontornadic and tornadic supercell thunderstorms (which will undoubtedly require smaller horizontal grid spacing), as was a particular challenge in the events spanning 27 and 28 April 2014 in this study.
There was some suggestion from 1-h ensemble forecasts at later times during the 11–12 May 2014 event that this approach may be equally successful in highlighting potentially tornadic mesovortices within squall lines and bow echoes. While mesoscale convective systems can be as much as an order of magnitude larger than isolated supercell thunderstorms, bow-echo mesovortices can at times have diameters of less than 2 km. Evaluation of spatially higher-resolution forecasts is necessary to determine if mesovortices can be anticipated within a WoF context. Progress on these challenges and others is crucial for accurate very short–range probabilistic forecasts of severe convective storms, one of the major goals of the Warn-on-Forecast project.
The computing for this project was performed at the University of Oklahoma (OU) Supercomputing Center for Education and Research (OSCER). Funding for this research was provided by the NOAA/Office of Oceanic and Atmospheric Research under NOAA–University of Oklahoma Cooperative Agreement NA11OAR4320072, U.S. Department of Commerce. The comments and suggestions by three anonymous reviewers, as well as David Dowell, helped us improve our presentations.
The EAKF is one variant of the EnKF, so the term EnKF is used in its place throughout the rest of this manuscript.
Observation platforms included are METAR and marine surface stations, rawinsondes, the Aircraft Communications Addressing and Reporting System (ACARS), and NOAA GOES (horizontal wind components).
The 15-min assimilation period was chosen in an effort to mimic a quasi-real-time data assimilation system, which is temporally constrained by the latency period associated with processing incoming data, running a complex data assimilation system, and visualizing the output.
Yussouf and Stensrud (2010) found that approximately 8–10 radar data assimilation cycles were necessary to establish an isolated supercell thunderstorm within a model background.