1. Introduction
High-resolution, convection-permitting numerical weather prediction (NWP) models have been shown to produce better precipitation forecasts than coarser-resolution NWP models with parameterized convection (e.g., Done et al. 2004; Kain et al. 2006; Lean et al. 2008; Roberts and Lean 2008; Schwartz et al. 2009; Clark et al. 2010) and to provide valuable guidance for severe weather forecasting (e.g., Kain et al. 2008; Sobash et al. 2011; Clark et al. 2012, 2013; Weisman et al. 2013). Given these successes, convection-allowing forecasts are routinely run at the National Oceanic and Atmospheric Administration’s (NOAA) National Centers for Environmental Prediction (NCEP) and other operational centers.
Even with increased computational resources over the past decade, high-resolution simulations remain expensive and operational convection-permitting forecasts are necessarily regional, meaning specifying appropriate initial and lateral boundary conditions (LBCs) can be challenging. A common initialization technique is to simply interpolate analyses from larger-domain, coarser-resolution NWP models onto regional high-resolution domains. For example, convection-permitting Weather Research and Forecasting Model (WRF; Skamarock et al. 2008) forecasts examined in Done et al. (2004), Kain et al. (2006, 2008), Weisman et al. (2008), and Schwartz et al. (2009) were initialized by interpolating North American Mesoscale Model (NAM; Rogers et al. 2009) analyses onto the high-resolution domains. Similarly, NOAA’s experimental High-Resolution Rapid Refresh (HRRR; Smith et al. 2008) model is currently initialized each hour by downscaling 13-km Rapid Refresh (RR; Benjamin et al. 2007) model analyses onto its 3-km computational domain.
When this initialization approach is used, the skill of the regional model is intrinsically limited by the analysis quality of the “parent” model providing the initial conditions (ICs). Usually, the parent analysis is produced via some form of data assimilation (DA), which incorporates real observations to adjust the parent model’s gridded background field. For example, the NAM and RR models employ a three-dimensional variational (3DVAR; e.g., Parrish and Derber 1992; Lorenc et al. 2000; Barker et al. 2004) DA approach. Therefore, prior studies examining convection-permitting WRF forecasts with ICs from NAM analyses (e.g., Done et al. 2004; Kain et al. 2006, 2008; Weisman et al. 2008; Schwartz et al. 2009) were effectively initialized by downscaling 3DVAR analyses onto the computational domain.
Accurate ICs are especially critical for successful high-resolution forecasts (e.g., Weisman et al. 2008) since small-scale errors grow rapidly and can amplify onto larger scales (e.g., Lorenz 1969; Zhang et al. 2003; Hohenegger and Schär 2007). Therefore, it is important to improve the initialization of high-resolution NWP model configurations.
Fortunately, advanced DA techniques may offer a means to improve ICs and subsequent high-resolution forecasts. In particular, ensemble-based DA algorithms, such as the ensemble Kalman filter (EnKF; Evensen 1994; Burgers et al. 1998; Houtekamer and Mitchell 1998), may be particularly well suited to provide ICs for convective forecasts. EnKFs utilize short-term ensemble forecasts to calculate multivariate flow-dependent background error covariances (BECs) that represent “errors of the day” while producing an ensemble of analyses. Therefore, EnKFs can capture the forecast errors associated with the often-amplified flows responsible for convective development and initialize ensemble forecasts, which are of interest at meso- and convective scales (e.g., Stensrud et al. 2009). In contrast, 3DVAR typically employs time-invariant, isotropic BECs often based on model climatologies (e.g., Parrish and Derber 1992), and although tangent linear and adjoint models in four-dimensional variational (4DVAR; e.g., Huang et al. 2009) DA allow implicit evolution of the BECs, the use of fixed BECs at the start of each 4DVAR DA cycle represents a major limitation. Thus, deterministic variational DA methods may provide inaccurate or suboptimal depictions of forecast error when the actual flow deviates from the climatological patterns encapsulated in the static BECs.
Many experiments have shown that EnKF-initialized forecasts are comparable to or better than 3DVAR-initialized forecasts (e.g., Meng and Zhang 2008a,b; Whitaker et al. 2008; Torn and Hakim 2009; Torn 2010; Hamill et al. 2011a,b; Zhang et al. 2009a, 2013; M. Zhang et al. 2011) and competitive with 4DVAR-initialized forecasts (e.g., Buehner et al. 2010; Miyoshi et al. 2010; M. Zhang et al. 2011; Zhang et al. 2013) for a variety of applications. For convective scenarios, substantial effort has been devoted to storm-scale EnKF assimilation of radar data, typically in the form of case studies using very small (e.g., ~100 km × 100 km) high-resolution domains over short time periods (e.g., Snyder and Zhang 2003; Zhang et al. 2004; Dowell et al. 2004; Tanamachi et al. 2013 and references therein). Additionally, a few studies have used limited-area mesoscale EnKFs as a tool to generate downscaled ICs for convection-permitting forecasts. For example, Jones and Stensrud (2012) and Jones et al. (2013) employed a 15-km EnKF to initialize small-domain (~5° × 5°) 3-km forecasts for selected severe weather events with a focus on assimilating satellite observations and Melhauser and Zhang (2012) examined an ensemble of convection-permitting forecasts of a squall line initialized by a 30-km EnKF. Additionally, Romine et al. (2013) described a real-time continuously cycling limited-area mesoscale EnKF system that initialized 3-km convection-permitting forecasts over approximately three-fourths of the contiguous United States (CONUS) during the 2011 spring convective season. These studies illustrate that mesoscale EnKF analyses can be downscaled to initialize good convective forecasts, although Romine et al. (2013) discussed the considerable challenges of mitigating model bias in cycling limited-area EnKFs.
However, time-evolving forecast errors can also be incorporated into DA systems using non-EnKF techniques. In particular, flow-dependent, ensemble-derived BECs can be incorporated within a variational framework in a “hybrid” variational–ensemble DA approach (e.g., Hamill and Snyder 2000; Lorenc 2003; Buehner 2005; Wang et al. 2008a; Zhang et al. 2009b; Wang 2010; Clayton et al. 2012). While the various hybrid algorithms differ, they all contain tunable parameters that determine how much the total BECs are weighted toward static (i.e., 3DVAR) and flow-dependent (i.e., ensemble) contributions. The hybrid method is attractive since it can be easily implemented in preexisting variational DA systems and employs model-space covariance localization, which is beneficial for assimilating nonlocal observations, such as satellite radiances (Campbell et al. 2010). Furthermore, efforts with hybrid DA have been encouraging, and NWP model forecasts initialized with hybrid techniques have been comparable to or better than forecasts initialized with purely variational methods that do not incorporate ensemble BECs (Buehner 2005; Wang et al. 2008b; Buehner et al. 2010; Hamill et al. 2011b; Wang 2011; Zhang and Zhang 2012; Wang et al. 2013; Zhang et al. 2013). In fact, beginning May 2012 the NCEP Global Forecast System (GFS) model has been initialized with a hybrid–3DVAR system (before then 3DVAR DA was employed) and the Met Office uses a 4DVAR-based hybrid to initialize their global model (Clayton et al. 2012).
While Gao et al. (2010) assimilated synthetic radar observations in a storm-scale hybrid DA system, the utility of hybrid techniques that assimilate real observations as a method of initializing convective forecasts has not been examined. Thus, this study employed a cyclic limited-area hybrid DA system with 20-km horizontal grid spacing to provide downscaled ICs for convection-permitting forecasts with 4-km horizontal grid spacing spanning three-fourths of the CONUS over May and June 2011. The hybrid-initialized forecasts were compared to those initialized from parallel cycling EnKF and 3DVAR DA systems. Furthermore, forecasts initialized from the cycling EnKF, 3DVAR, and hybrid configurations were compared to benchmark forecasts initialized by interpolating NCEP’s operational GFS analyses onto the computational domain.
Section 2 details model and DA configurations and the experimental design, while section 3 describes the observations and their associated preprocessing and errors. Results are presented in section 4 before we conclude in section 5.
2. Model configurations and experimental design
a. Forecast model
Weather forecasts were produced by version 3.3.1 of the nonhydrostatic Advanced Research WRF (ARW-WRF, hereafter WRF; Skamarock et al. 2008) over a nested computational domain spanning the CONUS and adjacent areas (Fig. 1). The horizontal grid spacing was 20 km (300 × 200 grid points) in the outer domain and 4 km (801 × 616 grid boxes) in the inner nest. Both domains were configured with 57 vertical levels and a 10-hPa top. The time step was 90 s in the 20-km domain and 18 s in the 4-km nest, and two-way feedback linked the domains.
The following physical parameterizations were used: the Morrison double-moment microphysics scheme (Morrison et al. 2009), the Rapid Radiative Transfer Model for Global Climate Models (RRTMG; Mlawer et al. 1997; Iacono et al. 2008) longwave and shortwave radiation schemes with aerosol and ozone climatologies, the Mellor–Yamada–Janjić (MYJ; Mellor and Yamada 1982; Janjić 1994, 2002) planetary boundary layer scheme, the Noah land surface model (Chen and Dudhia 2001), and the Tiedtke cumulus parameterization (Tiedtke 1989; C. Zhang et al. 2011). These physics, as well as positive-definite moisture advection (Skamarock and Weisman 2009), were used on both domains, with the exception that cumulus parameterization was turned off on the 4-km grid. GFS forecasts provided LBC forcing for the 20-km domain every 3 h and the 20-km domain provided LBCs for the 4-km nest.
b. Data assimilation systems
The hybrid and 3DVAR algorithms from NCEP’s operational gridpoint statistical interpolation (GSI; Kleist et al. 2009) DA system were used. Five control variables were employed: streamfunction, pseudo–relative humidity (Dee and da Silva 2003), and unbalanced velocity potential, virtual temperature, and surface pressure. The hybrid formulation in GSI uses extended control variables (Lorenc 2003) and is described by Wang (2010), while Wu et al. (2002) detail the 3DVAR algorithm.
The static BECs used in the 3DVAR and hybrid algorithms were constructed by the National Meteorological Center (NMC, now known as NCEP) method (Parrish and Derber 1992), which computes BECs by taking differences between forecasts of different lengths valid at common times. Differences of 24- and 12-h WRF forecasts valid at 120 common times over May and June 2010 were used to compute the BECs. Empirically determined tuning factors modified the initially computed static BEC statistics. Specifically, the background error standard deviations of unbalanced surface pressure, virtual temperature, and pseudo–relative humidity were reduced by 50%, 30%, and 30%, respectively, similar to NCEP’s GSI 3DVAR configuration (Kleist et al. 2009). These same weightings were used for the 3DVAR and hybrid systems.
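The forecast-difference idea behind the NMC method, and the subsequent tuning of standard deviations, can be sketched in a few lines. This is a minimal illustration only: the function names, the synthetic data, and the choice to remove the mean difference are assumptions, and the actual GSI static BECs also involve balance relationships and correlation length scales that are not represented here.

```python
import numpy as np

def nmc_error_std(fcst_24h, fcst_12h):
    """Estimate error standard deviations from pairs of lagged forecasts.

    Both inputs have shape (n_times, n_points) and hold forecasts of
    different lengths that are valid at the same times.
    """
    diffs = fcst_24h - fcst_12h                  # forecast differences
    diffs = diffs - diffs.mean(axis=0)           # assumption: remove mean difference
    return np.sqrt((diffs ** 2).mean(axis=0))    # standard deviation at each point

def apply_tuning(std, reduction_percent):
    """Reduce a standard deviation by a percentage (e.g., 30 for 30%)."""
    return std * (1.0 - reduction_percent / 100.0)

# Synthetic stand-in for 120 common valid times at 5 grid points.
rng = np.random.default_rng(0)
f12 = rng.normal(size=(120, 5))
f24 = f12 + rng.normal(scale=2.0, size=(120, 5))

std = nmc_error_std(f24, f12)
tuned = apply_tuning(std, 30.0)   # e.g., a 30% reduction
```

In this sketch the tuning acts on standard deviations, so a 30% reduction scales the corresponding variances by 0.49.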
The hybrid BECs were weighted 75% toward the ensemble contribution and 25% toward the static (i.e., NMC generated) component. Horizontal and vertical localizations were also applied in the hybrid to limit the spatial extent of the ensemble contribution to the analysis increments, and the length scales were identical to those in the ensemble square root filter (EnSRF). The GSI 3DVAR algorithm employed two outer loops (OLs; Courtier et al. 1994; Kleist et al. 2009) during minimization. For consistency, the hybrid also used two OLs.
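The weighting described above can be illustrated by forming the blended covariance explicitly for a toy state vector. This is a sketch under stated assumptions, not the GSI implementation: GSI uses extended control variables and never builds these matrices, and the one-dimensional Gaussian localization function, the function names, and the synthetic ensemble are all hypothetical.

```python
import numpy as np

def ensemble_covariance(members):
    """Sample covariance from members with shape (n_members, n_state)."""
    perts = members - members.mean(axis=0)
    return perts.T @ perts / (members.shape[0] - 1)

def gaussian_localization(n_state, length_scale):
    """Distance-based correlation matrix on a 1D grid (illustrative choice)."""
    idx = np.arange(n_state)
    dist = np.abs(idx[:, None] - idx[None, :])
    return np.exp(-0.5 * (dist / length_scale) ** 2)

def hybrid_covariance(b_static, members, length_scale, ens_weight=0.75):
    """Blend static and localized ensemble covariances."""
    p_ens = ensemble_covariance(members)
    loc = gaussian_localization(b_static.shape[0], length_scale)
    # Element-wise (Schur) product applies the localization.
    return (1.0 - ens_weight) * b_static + ens_weight * (loc * p_ens)

rng = np.random.default_rng(1)
members = rng.normal(size=(50, 8))   # synthetic 50-member ensemble
b_static = np.eye(8)                 # placeholder static covariance
b_hybrid = hybrid_covariance(b_static, members, length_scale=2.0)
```

Setting `ens_weight=0.0` recovers the purely static (3DVAR-like) covariance, while `ens_weight=1.0` uses only the localized ensemble estimate.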
c. Experimental design
Parallel experiments employing limited-area 3DVAR, EnSRF, and hybrid DA configurations were performed. DA occurred solely on the 20-km grid, and the first analyses were at 0000 UTC 6 May 2011. The background for the initial 3DVAR and hybrid analysis was the deterministic 0.5° × 0.5° GFS analysis interpolated onto the 20-km domain (Fig. 1). The initial prior ensemble for the first EnSRF analysis was constructed by taking Gaussian random draws with zero mean and static BECs (Torn et al. 2006) and adding them to the 20-km 0000 UTC 6 May wind, surface pressure, temperature, water vapor, and geopotential height fields. LBCs for the ensemble system were perturbed similarly.
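The construction of an initial ensemble from random draws can be sketched as follows. The state vector, covariance matrix, and function name are hypothetical stand-ins; in practice the draws are made in the control-variable space of the static BECs rather than from an explicit covariance matrix.

```python
import numpy as np

def draw_initial_ensemble(first_guess, b_static, n_members, seed=0):
    """Add zero-mean Gaussian perturbations with covariance b_static."""
    rng = np.random.default_rng(seed)
    perts = rng.multivariate_normal(
        mean=np.zeros(first_guess.size), cov=b_static, size=n_members
    )
    return first_guess + perts   # shape (n_members, n_state)

first_guess = np.array([280.0, 285.0, 290.0])   # toy state vector
b_static = 0.5 * np.eye(3) + 0.5                # unit variances, 0.5 correlations
ensemble = draw_initial_ensemble(first_guess, b_static, n_members=50)
```

Because the perturbations have zero mean in expectation, the ensemble mean stays close to the first guess, while the spread reflects the static error statistics.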
The 0000 UTC 6 May analyses initialized 6-h WRF forecasts and the second set of 3DVAR, hybrid, and EnSRF analyses occurred at 0600 UTC 6 May using the previous 6-h forecasts as backgrounds. This cyclic analysis–forecast pattern with a 6-h period continued until 0000 UTC 21 June (inclusive). Hydrometeor and land surface fields were allowed to evolve freely throughout the entire ~1.5-month period, with the exception that the sea surface temperature (SST) was updated each 0000 UTC analysis from
The procedure for the cycling hybrid system was identical to that of Schwartz et al. (2013) and is illustrated in Fig. 2. Given a background at an analysis time T, separate EnSRF and hybrid analyses were performed. The hybrid utilized the ensemble valid at T to incorporate flow-dependent BECs into the variational framework. After each EnSRF analysis, a 6-h ensemble forecast produced the background ensemble at T + 6. Similarly, the deterministic hybrid analysis initialized a 6-h forecast that served as the background for the next hybrid analysis at T + 6. There was no interaction between the hybrid and ensemble systems in this configuration. Alternatively, as in Zhang et al. (2013), the analysis ensemble could be recentered about the hybrid analysis, which, while shifting the ensemble mean, preserves the perturbations about the mean. Whereas coupling the ensemble and hybrid systems may be more critical if cycling over longer time periods or for dual-resolution applications (e.g., Rainwater and Hunt 2013), both Schwartz et al. (2013) and Wang et al. (2013) noted little practical difference between approximately month-long single-resolution coupled and uncoupled cycling hybrid systems.
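The recentering alternative mentioned above amounts to replacing the ensemble mean while leaving the perturbations untouched. A minimal sketch, with hypothetical array shapes and names:

```python
import numpy as np

def recenter(analysis_ensemble, hybrid_analysis):
    """Shift the ensemble so its mean equals the hybrid analysis.

    analysis_ensemble has shape (n_members, n_state); hybrid_analysis
    has shape (n_state,). Perturbations about the mean are preserved.
    """
    perts = analysis_ensemble - analysis_ensemble.mean(axis=0)
    return hybrid_analysis + perts

rng = np.random.default_rng(4)
ens = rng.normal(loc=10.0, size=(50, 6))   # synthetic analysis ensemble
hybrid = np.linspace(9.0, 11.0, 6)         # synthetic hybrid analysis
recentered = recenter(ens, hybrid)
```

In the uncoupled configuration used in this study, this step is simply omitted and the ensemble evolves independently of the hybrid analysis.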
The cycling 3DVAR procedure was identical to the deterministic hybrid cycling (bottom circuit in Fig. 2), except the BECs were purely static.
From 9 May through 21 June, each 0000 UTC analysis initialized a 36-h WRF forecast containing a large nested 4-km domain (Fig. 1). The 4-km forecasts were initialized by using a monotone interpolation scheme (Smolarkiewicz and Grell 1992; Skamarock et al. 2008) to interpolate the 20-km analyses onto the 4-km grid. EnSRF-based forecasts were initialized from ensemble mean analyses. Thus, the EnSRF was utilized for two purposes: to provide perturbations for the hybrid and as a stand-alone DA system to initialize WRF forecasts. No digital filter initialization (DFI; Huang and Lynch 1993) or alternative initialization procedure was used to initialize WRF forecasts.
Additionally, operational 0000 UTC GFS analyses were interpolated onto the nested grids and initialized 36-h WRF forecasts between 9 May and 21 June to serve as a benchmark for the limited-area cycling DA experiments. During this period (2011), the GFS analysis system produced 3DVAR analyses at 0.5° × 0.5° horizontal grid spacing and assimilated many observations that were not assimilated in the 3DVAR, EnSRF, and hybrid experiments, such as satellite radiances.
3. Observations
A variety of surface and upper-air observations were assimilated in the 3DVAR, hybrid, and EnSRF experiments. Radiosonde observations of temperature, wind, surface pressure, and specific humidity were assimilated in addition to aircraft reports of temperature and wind. Furthermore, infrared and water vapor channel satellite-tracked wind, global positioning system radio occultation (GPSRO) refractivity, and wind profiler observations were assimilated. Surface observations from ship, buoy, surface synoptic observation (SYNOP), and aviation routine weather report (METAR) platforms were also assimilated. Whereas temperature, specific humidity, wind, and surface pressure observations were assimilated from ships and buoys, only surface pressure observations were assimilated from the land-based SYNOP and METAR stations. Observations taken within ±1.5 h of each analysis were assimilated, except the time window was shortened to ±0.5 h for METAR and SYNOP surface pressure observations. All observations were assumed to be valid at the analysis time for each experiment. A typical distribution of observations available for assimilation at 0000 UTC is shown in Fig. 3.
The initially specified observation error standard deviations (σo) are summarized in Table 1 for observations with height-invariant errors and Fig. 4 for observations where σo varied vertically. However, within GSI, σo was modified due to an observation’s vertical location, proximity to nearby observations, and other considerations. In some instances, the modifications were dramatic and the “final” observation error
Table 1. Initially specified observation errors σo and outlier check thresholds a for different observation platforms and types. See Fig. 4 for the error profiles for those observations with vertically varying observation errors. Specific humidity observations were rejected if their innovations exceeded 1.0 g kg−1.
The observations were subject to various forms of quality control (QC) and preprocessing. For example, aircraft reports and satellite-tracked winds were thinned to 40-km and 40-hPa resolution. Furthermore, satellite-tracked wind observations either above 200 hPa or over land were not assimilated. Additionally, an “outlier check” was applied whereby an observation was rejected if its innovation (difference between the observation and model-simulated observation based on the model guess) exceeded
All the above observation QC decisions, thinning, and error adjustments were made within GSI. Additionally, the EnSRF relied entirely on GSI for all observation considerations and forward operators. Specifically, for the EnSRF experiment, as in Hamill et al. (2011a), GSI was used to generate the prior model-simulated observations for each ensemble member, adjust the initial observation errors, and perform QC decisions. The observed values, prior model-simulated observations, and final observation errors were then ingested into the EnSRF.
This complete dependence on GSI for all observation-related issues ensured consistency of the forward operators and observation errors in the hybrid, 3DVAR, and EnSRF systems, and all three experiments used an identical observational dataset. Given these considerations, the only differences regarding observations that were actually assimilated involved those rejected due to the outlier checks, which reflect the quality of the background field. Overall, these settings and the experimental design permit a very clean assessment of the merits of mesoscale cycling 3DVAR, hybrid, and EnSRF DA techniques for the initialization of convection-permitting forecasts.
4. Results
Model output was assessed using several metrics. Characteristics of the 20-km analysis systems are first examined before more extensively focusing on 4-km precipitation forecasts. Output from the first 3 days of the cycling DA configurations was excluded from all verification statistics to allow the ensemble fields to “spin up” from the initial, randomly generated, ensemble.
a. Mean 20-km state characteristics
Since only the 0000 UTC analyses initialized 4-km forecasts, we focus on characteristics of the 0000 UTC backgrounds and analyses between 9 May and 21 June. Statistics for the EnSRF experiment were computed using the ensemble mean prior (background) and posterior (analysis) fields (average of the 50-member ensemble). The backgrounds, which provided the foundation for the analyses, are discussed first.
1) Background characteristics
To assess the quality of the three 20-km DA systems, we examine the 0000 UTC background fields, which were 6-h WRF forecasts initialized by the 1800 UTC analyses.
The aggregate 0000 UTC prior model biases (forecast minus observations) compared to aircraft observations over the entire 20-km domain were similar in the 3DVAR, EnSRF, and hybrid configurations (Figs. 5a–c). Aircraft observations are notoriously biased (e.g., Gao et al. 2012); relative to them, slight cold and slow meridional wind speed biases were noted throughout the column, while zonal winds below 700 hPa were too fast. Root-mean-square errors (RMSEs) compared to aircraft temperature observations were comparable, but the EnSRF had the smallest wind RMSEs throughout the troposphere, followed by the hybrid and 3DVAR.
Average prior 0000 UTC biases compared to radiosonde temperature, specific humidity, and wind observations were also usually similar (Figs. 5d–f). Compared to radiosonde temperature observations, the backgrounds were too warm throughout most of the atmosphere, a bias opposite in sign to that found relative to aircraft temperature observations, which was also noted by Romine et al. (2013). RMSEs compared to radiosonde temperature observations differed little among the configurations, although, as with aircraft observations, the EnSRF had the smallest zonal wind RMSE (the meridional wind pattern was similar and is not shown). The EnSRF also had the lowest RMSEs compared to specific humidity observations, while the 3DVAR and hybrid moisture RMSEs were comparable.
The hybrid consistently had lower wind RMSEs than 3DVAR, and the 3DVAR and hybrid moisture and temperature RMSEs were similar. These findings suggest the hybrid DA system performed better than 3DVAR. It is unsurprising that the mean prior EnSRF RMSEs were lower than those of the deterministic 3DVAR and hybrid configurations, as the EnSRF mean field is smoother and better in an RMSE sense than any individual member’s field, on average (Leith 1974). Similarly, the mean prior EnSRF field was smoother than the 3DVAR and hybrid fields that did not benefit from averaging procedures. These findings are consistent with Meng and Zhang (2008a), who suggested that ensemble averaging is an important component of EnKFs. As further confirmation of the importance of ensemble averaging, the prior RMSEs computed for individual ensemble members were considerably higher than the prior RMSEs for the ensemble mean background and often higher than the 3DVAR RMSEs (not shown).
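The RMSE advantage of an ensemble mean over individual members can be illustrated with synthetic data. This sketch assumes independent, unbiased member errors, which exaggerates the benefit relative to a real ensemble whose members share correlated errors; all names and values are hypothetical.

```python
import numpy as np

def rmse(field, truth):
    """Root-mean-square error of a field against a reference."""
    return float(np.sqrt(np.mean((field - truth) ** 2)))

rng = np.random.default_rng(2)
truth = np.sin(np.linspace(0.0, 2.0 * np.pi, 200))        # synthetic "truth"
members = truth + rng.normal(scale=1.0, size=(50, 200))   # 50 noisy members

member_rmse = np.array([rmse(m, truth) for m in members])
mean_rmse = rmse(members.mean(axis=0), truth)

print(f"average member RMSE: {member_rmse.mean():.3f}")
print(f"ensemble mean RMSE:  {mean_rmse:.3f}")
```

By the triangle inequality, the RMSE of the ensemble mean can never exceed the average RMSE of the members, regardless of how the member errors are distributed.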
Assimilation of observations updated the prior fields to produce the analyses, which were used to initialize the 4-km forecasts. Some aspects of the analysis fields are presented next.
2) Analysis characteristics
Although analysis fits to observations are easily manipulated by tuning the background error covariance and observation errors, since the observation errors were identical across the hybrid, EnSRF, and 3DVAR configurations, the analysis fits primarily reflect the magnitudes of the background error variances and reveal interesting differences between the experiments. All four analysis sets had a slow zonal wind speed bias compared to radiosonde observations at 500, 400, and 200 hPa (Fig. 6a), which may be related to domination of aircraft observations at these levels. Below 500 hPa, the 3DVAR, hybrid, and EnSRF zonal wind analyses were nearly unbiased, but the GFS analyses had a slow bias. Meridional wind analyses were nearly unbiased (Fig. 6b). Aggregate biases of radiosonde temperature fits (Fig. 6c) were usually similar in the 3DVAR, hybrid, and EnSRF experiments and collectively less biased than the GFS mean analysis throughout most of the troposphere. Biases compared to specific humidity observations (Fig. 6d) were comparable, although the EnSRF was slightly more biased between 925 and 700 hPa. For all variables, the hybrid and 3DVAR analysis RMSEs were comparable and smallest, while those of the EnSRF and GFS were largest. These patterns of behavior were broadly consistent with those found by Zhang et al. (2013).
However, the analysis fits provide no information about how the assimilation updated the model state at nonobservation locations, which is achieved through the BECs. To assess DA impacts at nonobservation locations, the aggregate 0000 UTC pure and bias-removed RMS differences between GFS analyses and the 3DVAR, EnSRF, and hybrid analyses were calculated between 9 May and 21 June (3DVAR/EnSRF/hybrid minus GFS for the pure differences). The mean 850-hPa mixing ratio differences revealed considerably drier GFS analyses south of ~45°N and east of ~100°W (Figs. 7a–c). The magnitudes and structures of the differences varied across the three cycling DA configurations, and the bias-removed RMS differences (Figs. 7d–f) varied substantially, as well. Debiased RMS differences for 700-hPa temperature (Figs. 8a–c) and 500-hPa wind speed (Figs. 8d–f) also revealed varying structures. For all variables, as expected, the differences were maximized between radiosonde locations, where the analysis increments were largely determined by the distribution of the observational content through the BECs. Thus, even though the 3DVAR and hybrid analysis fits to radiosonde observations were similar, substantial subsynoptic differences existed in surrounding areas. These subsynoptic differences between the three cycling DA experiments were manifested by different precipitation forecasts, which are now discussed.
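The distinction between pure and bias-removed RMS differences can be made concrete with a short sketch. It assumes analyses are stacked as arrays of shape (n_times, ny, nx) and that statistics are aggregated over time at each grid point; the function names and toy fields are hypothetical.

```python
import numpy as np

def mean_difference(exp, ref):
    """Time-mean difference (experiment minus reference) at each grid point."""
    return (exp - ref).mean(axis=0)

def rms_difference(exp, ref):
    """Pure RMS difference at each grid point."""
    return np.sqrt(((exp - ref) ** 2).mean(axis=0))

def debiased_rms_difference(exp, ref):
    """RMS difference after removing the time-mean difference."""
    diff = exp - ref
    return np.sqrt(((diff - diff.mean(axis=0)) ** 2).mean(axis=0))

rng = np.random.default_rng(5)
ref = rng.normal(size=(44, 4, 5))                          # reference analyses
exp = ref + 1.5 + rng.normal(scale=0.5, size=(44, 4, 5))   # offset plus noise

bias = mean_difference(exp, ref)
rms = rms_difference(exp, ref)
rms_debiased = debiased_rms_difference(exp, ref)
```

The three quantities satisfy rms² = bias² + rms_debiased² at every grid point, so the debiased field isolates the day-to-day variability of the differences from their systematic part.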
b. 4-km precipitation forecasts
Forecasts of hourly accumulated precipitation were compared to gridded stage IV (ST4) observations produced at NCEP (Lin and Mitchell 2005). Objective verification was performed over a fixed domain encompassing most of the central CONUS (Fig. 1), which was removed from lateral boundaries and where ST4 data were robust. ST4 observations were reported in eighths of a millimeter.
When possible, objective metrics were computed on the native 4-km grids. However, some scores required that the observations and model be on a common grid. Therefore, for certain calculations, precipitation forecasts were bilinearly interpolated onto the ST4 grid (~4.7-km grid spacing), which is hereafter referred to as the verification grid.
1) General precipitation characteristics
The total accumulated precipitation over the verification domain, calculated on native grids, aggregated each hour over the 44 forecasts, and normalized by 44 times the total number of grid boxes in the verification domain, is shown in Fig. 9. All four configurations captured the diurnal cycle well. Differences between the cycling experiments were usually small, with overprediction of rainfall in the first ~12 h and during the diurnal maximum. The GFS-initialized WRF forecasts (denoted as WRF-GFS) produced precipitation amounts that most closely agreed with ST4 observations between ~24 and 30 h but still produced too much rainfall.
Since no hydrometeors or vertical motions were present in the GFS ICs, the WRF-GFS required ~6 h to spin up precipitation. Conversely, a large “spike” was evident at 2 h in the 3DVAR-, EnSRF-, and hybrid-initialized WRF forecasts (labeled WRF-3DVAR, WRF-EnSRF, and WRF-Hybrid, respectively). To assess whether this feature was caused by an IC imbalance, several short-term 4-km forecasts were initialized after applying DFI to the 20-km ICs. When this procedure was performed, DFI yielded a precipitation spinup that was slower, but smoother, compared to when DFI was not used. Additional testing revealed that this spike at 2 h could be substantially reduced by using the Thompson microphysics scheme (Thompson et al. 2008) on both the 20- and 4-km domains without DFI. Thus, it appears that some aspect of the Morrison microphysics parameterization, combined with the initial hydrometeor fields, was responsible for the spike at 2 h, although an initial imbalance may also have played a role. Furthermore, employing Thompson microphysics reduced the afternoon overprediction, suggesting that Morrison microphysics contributed to the high afternoon precipitation bias. While these physics-related behaviors were not optimal, they impacted the three cycling experiments equally. Moreover, examination of individual events strongly suggested later precipitation forecasts were not adversely affected by the excessive rainfall at 2 h.
Figure 10 shows the fractional occurrence of various events, defined as the precipitation exceedance of various accumulation thresholds (q; e.g., q = 2.0 mm h−1), over the verification domain on the native grids and aggregated hourly over the 44 forecasts. Rainfall occurrence was rare: less than 6% of the verification domain contained precipitation ≥0.25 mm h−1 at any time, on average, and the event frequency decreased as q increased. For q = 0.25 and 1.0 mm h−1 (Figs. 10a,b), the GFS-initialized forecasts underpredicted the areal coverage at all times, and the hybrid-, EnSRF-, and 3DVAR-initialized forecasts produced coverage amounts that corresponded well with observations during the diurnal maximum, though they were biased slightly high. The WRF-3DVAR, WRF-EnSRF, and WRF-Hybrid coverage amounts were too small after ~27 h for q = 0.25 and 1.0 mm h−1. For q ≥ 5.0 mm h−1 (Figs. 10c,d), the WRF-3DVAR, WRF-Hybrid, and WRF-EnSRF forecasts overpredicted rainfall during the diurnal maximum. The WRF-GFS coverages agreed well with the ST4 coverages at q = 5.0 mm h−1 but overpredicted at the 10.0 mm h−1 threshold, though not as much as the EnSRF-, hybrid-, and 3DVAR-initialized forecasts. These fractional coverage amounts suggest that the excessive total rainfall (Fig. 9), including the Morrison-microphysics-induced spike at 2 h, was primarily due to overprediction at higher thresholds, representing more extreme events associated with convection.
The multiplicative bias, defined as the ratio of the number of forecast events to the number of observed events, was also determined over the verification domain on the verification grid. Biases aggregated hourly over 1–12- and 18–36-h forecasts are shown in Fig. 11 as a function of q. Similar to the areal coverage patterns and total precipitation, the WRF-3DVAR, WRF-Hybrid, and WRF-EnSRF had similar biases and the WRF-GFS followed a different trajectory. In the first 12 h (Fig. 11a), all configurations underpredicted rainfall (bias < 1) for q ≤ 1.0 mm h−1, with the WRF-GFS underpredicting the most. For q ≥ 5.0 mm h−1, the cycling DA experiments overpredicted precipitation (bias > 1), with the hybrid-initialized forecasts producing the highest biases. At the 10.0 mm h−1 threshold, WRF-GFS had nearly no bias.
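The multiplicative bias defined above reduces to a ratio of event counts. A minimal sketch on a toy grid, assuming an event is an accumulation at or above the threshold q and that both fields are already on a common grid (the function name and values are hypothetical):

```python
import numpy as np

def multiplicative_bias(forecast, observed, q):
    """Ratio of forecast to observed event counts for threshold q."""
    n_fcst = np.count_nonzero(forecast >= q)
    n_obs = np.count_nonzero(observed >= q)
    if n_obs == 0:
        return float("nan")   # bias is undefined without observed events
    return n_fcst / n_obs

# Toy hourly accumulations (mm) on a 2 x 3 grid.
forecast = np.array([[0.0, 1.0, 6.0], [12.0, 0.3, 0.0]])
observed = np.array([[0.0, 2.0, 0.0], [11.0, 0.0, 0.5]])

bias_1mm = multiplicative_bias(forecast, observed, 1.0)     # 3 / 2 = 1.5
bias_5mm = multiplicative_bias(forecast, observed, 5.0)     # 2 / 1 = 2.0
bias_10mm = multiplicative_bias(forecast, observed, 10.0)   # 1 / 1 = 1.0
```

Values above 1 indicate overprediction of event frequency and values below 1 indicate underprediction; the score says nothing about whether events are placed correctly.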
For 18–36-h forecasts (Fig. 11b), all four experiments produced too little rainfall for q ≤ 1.0 mm h−1, with WRF-GFS again underpredicting most. For q = 5.0 and 10.0 mm h−1, the hybrid-, 3DVAR-, and EnSRF-initialized forecasts all overpredicted rainfall, although the WRF-Hybrid had a slightly lower bias than WRF-EnSRF and WRF-3DVAR. However, the GFS-initialized forecasts produced the lowest bias for q = 10.0 mm h−1.
High biases with convection-permitting forecasts have been previously documented (e.g., Kain et al. 2008; Weisman et al. 2008; Schwartz et al. 2009, 2010), particularly for high rainfall thresholds. This overprediction may be related to insufficient resolution (e.g., Bryan et al. 2003) or uncertainties in the physical parameterizations. However, it is interesting that the WRF-GFS precipitation biases were noticeably lower, as all experiments utilized an identical forecast model, but the lower WRF-GFS precipitation totals were consistent with the drier GFS ICs (Fig. 7).
While there were occasionally small differences, overall, the general precipitation characteristics were comparable in the WRF-3DVAR, WRF-EnSRF, and WRF-Hybrid experiments. These findings indicate that 3DVAR, EnSRF, and hybrid DA methods can all initialize convection-permitting forecasts with reasonable ability in replicating the observed precipitation climatology, especially the timing of the diurnal cycle, although there were problems with amplitude. However, the areal coverage amounts, bias, and domain total precipitation do not quantify spatial skill. Thus, objective verification of precipitation placement follows next.
2) Assessment of precipitation placement
The fractions skill score (FSS) was used to objectively evaluate the spatial skill of precipitation forecasts. The FSS employs a “neighborhood” approach (Ebert 2009). Neighborhood approaches relax the requirement that forecast and observed events match exactly at the grid scale for a forecast to be considered perfect. A brief description of the FSS follows and more details may be found in Roberts and Lean (2008) and Schwartz et al. (2009).
The FSS requires that the observations and model be on a common grid. Precipitation accumulation thresholds (q) are chosen to define events and a radius of influence (r; e.g., r = 50 km) is selected. Then, for each grid point within the verification domain on the verification grid, the number of grid boxes falling within r is counted (Nb). A fractional value for each grid point is determined by dividing the number of points within r containing accumulated precipitation ≥q by Nb. This procedure is performed on both the model and observational (e.g., ST4) grids.
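The fraction calculation described above, and the score built from it, can be sketched as follows. The score uses the standard definitions from Roberts and Lean (2008): FBS is the mean squared difference between forecast and observed fractions, FBSworst is the sum of the mean squared forecast fractions and the mean squared observed fractions, and FSS = 1 − FBS/FBSworst. The function names are hypothetical, the radius is expressed in grid lengths (r divided by the grid spacing), and counting only in-domain points near the edges is an assumption.

```python
import numpy as np

def neighborhood_fractions(field, q, radius_pts):
    """Fraction of points within radius_pts (grid lengths) with field >= q."""
    events = (np.asarray(field, dtype=float) >= q).astype(float)
    ny, nx = events.shape
    hits = np.zeros((ny, nx))  # neighbors meeting the threshold
    nb = np.zeros((ny, nx))    # Nb: neighbors inside the domain (assumed edge rule)
    rad = int(np.floor(radius_pts))
    for dj in range(-rad, rad + 1):
        for di in range(-rad, rad + 1):
            if dj * dj + di * di > radius_pts ** 2:
                continue  # offset lies outside the circular neighborhood
            if abs(dj) >= ny or abs(di) >= nx:
                continue  # offset never lands inside the domain
            # Target points (tj, ti) whose neighbor at offset (dj, di) is in-domain.
            tj = slice(max(0, -dj), min(ny, ny - dj))
            ti = slice(max(0, -di), min(nx, nx - di))
            sj = slice(max(0, dj), min(ny, ny + dj))
            si = slice(max(0, di), min(nx, nx + di))
            hits[tj, ti] += events[sj, si]
            nb[tj, ti] += 1.0
    return hits / nb

def fss(fcst_frac, obs_frac):
    """Fractions skill score from forecast and observed fraction fields."""
    fbs = np.mean((fcst_frac - obs_frac) ** 2)
    fbs_worst = np.mean(fcst_frac ** 2) + np.mean(obs_frac ** 2)
    if fbs_worst == 0.0:
        return float("nan")  # no events in either field
    return 1.0 - fbs / fbs_worst
```

An FSS of 1 indicates identical fraction fields, and an FSS of 0 indicates no overlap between forecast and observed fractions at the chosen neighborhood size.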
Statistical significance of the FSS was assessed by a bootstrap resampling technique based on differences between pairs of experiments. Random samples (with replacement) were drawn from the distribution of daily FBS and FBSworst values separately for each of two experiments. Then, two corresponding resampled FSSs were calculated from the aggregate FBS and FBSworst, and the FSSs were differenced. This process was repeated 10 000 times. The 90% confidence interval (CI) for the FSS difference between two experiments was estimated from the distribution of resampled FSS differences. If zero was outside the bounds of the CI, then the difference between the FSSs was statistically significant (SS) at the 95% level.
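A minimal sketch of this bootstrap procedure is given below. The function name, the fixed random seed, and the toy inputs are hypothetical. The code assumes that days are resampled independently for each experiment, that each day's FBS and FBSworst stay paired, and that the aggregate FSS is computed from the summed FBS and FBSworst of the resampled days.

```python
import numpy as np

def bootstrap_fss_difference(fbs_a, fbs_worst_a, fbs_b, fbs_worst_b,
                             n_resamples=10_000, ci=90.0, seed=0):
    """Bootstrap CI for the aggregate FSS difference (experiment A minus B)."""
    rng = np.random.default_rng(seed)
    fbs_a = np.asarray(fbs_a, dtype=float)
    fbs_worst_a = np.asarray(fbs_worst_a, dtype=float)
    fbs_b = np.asarray(fbs_b, dtype=float)
    fbs_worst_b = np.asarray(fbs_worst_b, dtype=float)
    # Draw day indices with replacement, separately for each experiment (assumption).
    idx_a = rng.integers(0, fbs_a.size, size=(n_resamples, fbs_a.size))
    idx_b = rng.integers(0, fbs_b.size, size=(n_resamples, fbs_b.size))
    # Aggregate FSS per resample from summed daily FBS and FBSworst.
    fss_a = 1.0 - fbs_a[idx_a].sum(axis=1) / fbs_worst_a[idx_a].sum(axis=1)
    fss_b = 1.0 - fbs_b[idx_b].sum(axis=1) / fbs_worst_b[idx_b].sum(axis=1)
    diffs = fss_a - fss_b
    tail = (100.0 - ci) / 2.0
    lower, upper = np.percentile(diffs, [tail, 100.0 - tail])
    # The difference is flagged when zero lies outside the interval.
    significant = bool(lower > 0.0 or upper < 0.0)
    return float(lower), float(upper), significant

# Toy daily statistics for 44 forecasts, purely illustrative.
days = 44
lo, hi, sig = bootstrap_fss_difference(
    np.full(days, 0.02), np.full(days, 0.10),
    np.full(days, 0.05), np.full(days, 0.10),
    n_resamples=1000,
)
```

Summing the daily FBS and FBSworst before forming the ratio keeps days with little precipitation from dominating the aggregate score, which would happen if daily FSSs were averaged instead.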
As the FSS can be sensitive to bias (Mittermaier and Roberts 2010), the FSS was computed using percentile thresholds, which effectively removes bias and focuses solely on displacement errors. Percentiles were computed from a climatological perspective for each forecast hour for all experiments and the ST4 observations. The climatologies consisted of all rainfall amounts, including zeros, in the verification domain on the verification grid over all 44 forecasts. Separate percentile thresholds were computed for a particular hour by sorting that hour’s precipitation values from lowest to highest. The yth percentile means that (100 − y) percent of points contained precipitation with accumulations greater than qy, where qy is the physical accumulation threshold corresponding to the yth percentile. By including zeros in the calculation, this method ensured the climatological frequencies of precipitation exceeding qy were identical for all experiments and observations.
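The percentile-threshold calculation can be sketched as below. The function name and the toy sample are hypothetical, and NumPy's default linear interpolation between sorted values is an assumption about how ties and fractional ranks are handled. The function would be applied separately to each experiment and to the observations for each forecast hour.

```python
import numpy as np

def percentile_threshold(hourly_values, y):
    """Physical threshold q_y for one forecast hour, zeros included.

    hourly_values holds every rainfall amount for that hour over all
    verification grid points and all forecasts.
    """
    values = np.asarray(hourly_values, dtype=float).ravel()
    return float(np.percentile(values, y))  # sorts internally; linear interpolation assumed

# Toy climatology: 95 dry points and 5 wet points (mm), purely illustrative.
toy_values = np.concatenate([np.zeros(95), np.array([1.0, 2.0, 3.0, 4.0, 5.0])])
q90 = percentile_threshold(toy_values, 90.0)
q97 = percentile_threshold(toy_values, 97.0)
frac_above_q97 = np.count_nonzero(toy_values > q97) / toy_values.size
```

In this toy sample the 90th percentile threshold is zero because dry points dominate, while the 97th percentile threshold is nonzero and is exceeded by 3% of the points.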
Only very high percentiles corresponded to nonzero absolute thresholds (Fig. 12), as zeros dominated over this domain. The qy patterns were consistent with the areal coverage patterns (Fig. 10). Differences between the WRF-3DVAR, WRF-EnSRF, and WRF-Hybrid percentiles were usually small while the WRF-GFS percentiles differed notably from the other three experiments. At the 97th and 98th percentiles (Figs. 12a,b), the WRF-Hybrid percentiles agreed most closely with the observed percentiles during the diurnal peak, while at other thresholds, the WRF-GFS percentiles were nearest observations.
FSSs aggregated hourly over the first 12 h and all days over the verification domain on the verification grid are shown in Fig. 13. This period was chosen to focus on initial convective development. The horizontal lines denote “zero lines” for the bootstrap CIs based upon FSS differences between experimental pairs. As q increased, FSSs decreased, indicating less skill at predicting progressively heavier precipitation. The WRF-Hybrid had the highest FSSs that were usually significantly better than the WRF-EnSRF and WRF-3DVAR FSSs and significantly higher than WRF-GFS FSSs except at the 99.75th percentile threshold. Additionally, the WRF-EnSRF had higher FSSs than did WRF-3DVAR, with SS differences for a majority of r at all but the 99.75th percentile. WRF-EnSRF FSSs were also higher than WRF-GFS FSSs with many instances of statistical significance. The GFS-initialized forecasts were likely disadvantaged during earlier forecast periods since no hydrometeors or vertical velocities were present in the ICs while the other experiments did not suffer from these omissions.
FSSs aggregated over all days hourly between 18- and 36-h forecasts are shown in Fig. 14. This time period coincided with next-day convective initiation and evolution. Differences between WRF-GFS and WRF-3DVAR FSSs were usually not SS. The WRF-Hybrid had significantly higher FSSs than the three other forecast sets for all q and r, with the differences compared to WRF-GFS widening for increasing thresholds. Differences between the WRF-EnSRF and WRF-GFS were sometimes SS, particularly for larger radii. Other than at the 97th percentile threshold, differences between WRF-EnSRF and WRF-3DVAR were usually not SS.
Figure 15 shows a time series of the FSS for various percentile thresholds for r = 50 km aggregated over all forecasts. The curves become noisier as the threshold increases due to smaller sample sizes for more extreme events. These hourly statistics are consistent with FSSs aggregated over the first 12 h and between 18 and 36 h but provide more temporal detail. The WRF-3DVAR and WRF-EnSRF FSSs were usually lower than those of WRF-Hybrid, and the WRF-EnSRF typically performed better than WRF-3DVAR during the first half of the forecast. Notably, the WRF-Hybrid had unambiguously higher FSSs at most thresholds before 12 h; between ~18 and 24 h, during the convective-initiation period; and after ~28 h, as the convection evolved. Interestingly, the WRF-GFS had the highest FSSs at the heaviest rainfall thresholds between ~24 and 28 h but consistently performed the worst after ~28 h. The WRF-GFS also performed comparably to or better than the WRF-Hybrid between ~12 and 18 h, during the diurnal minimum.
Overall, at all thresholds, the WRF-Hybrid configuration produced the highest FSSs that were typically significantly higher than all other FSSs. The WRF-EnSRF FSSs were usually higher than WRF-3DVAR FSSs, but the differences were primarily SS only during the first 12 h. The WRF-3DVAR performed comparably to WRF-GFS and differences were mainly not SS. It is noteworthy that the WRF-Hybrid improvement compared to the WRF-GFS was maximized at higher thresholds during the 18–36-h period, even though model skill decreased for these more extreme events.
A brief example of a forecast that visually highlights some aspects of these objective measures follows next.
3) An example forecast
The 6- and 30-h 4-km forecasts of 1-h accumulated precipitation initialized from the 0000 UTC 24 May analysis are shown in Figs. 16 and 17, respectively, along with the corresponding ST4 observations. At 6 h, only WRF-EnSRF and WRF-Hybrid (Figs. 16b,c) correctly predicted a convective complex in Arkansas, although both forecasts displaced the convection slightly northward. The WRF-Hybrid forecast most accurately represented the orientation and curvature of the observed convective band. Conversely, WRF-GFS (Fig. 16a) failed to generate organized convection in Arkansas and WRF-3DVAR (Fig. 16d) misplaced the convection in Missouri. Elsewhere, all the forecasts predicted convection in northeast Colorado, western Nebraska, and the northeastern part of the domain. WRF-EnSRF performed particularly well in western Nebraska and had lesser areal coverage amounts over the Ohio Valley than did WRF-Hybrid and WRF-3DVAR, similar to observations.
At 30 h (Fig. 17), all the WRF forecasts missed convection in Texas. However, the experiments all correctly produced rainfall arching from central Nebraska through western Missouri and northwestern Arkansas. Within this feature, only small differences existed between the experiments, such as the orientation of individual bands, the number of discrete cells, and the exact locations of heaviest rainfall. For example, WRF-Hybrid and WRF-3DVAR (Figs. 17c,d) placed heavy precipitation in northern Missouri and extreme southern Iowa, while the other experiments primarily emphasized convection in southwestern Missouri. Additionally, the WRF-EnSRF had more precipitation in central and eastern Iowa than the other forecasts. Farther east, the WRF-3DVAR produced excessive rainfall around Washington, D.C., while the WRF-EnSRF and WRF-Hybrid generated too much precipitation in southeastern Virginia. However, the WRF-GFS kept these areas mainly dry, in agreement with the observations. Various degrees of spurious convection were noted in all experiments across the Ohio Valley.
Although this case represents just a small fraction of the dataset, it nicely illustrates some features that were implied by the objective statistics. For example, as evidenced in this case, all four forecast sets mostly predicted precipitation over broadly similar meso-α-scale regions. Additionally, missed events were usually missed by all experiments, as in Fig. 17 over Texas. Furthermore, even if a particular forecast feature was incorrectly placed, it was often possible to identify the observed entity the model was attempting to replicate (e.g., the WRF-3DVAR complex over southwestern Missouri corresponded to the observed precipitation over Arkansas in Fig. 16). Thus, statistics that measure climatological aspects, such as bias and areal coverage patterns, revealed few differences between the experiments. But, as evidenced in the case study, there were meso-β-scale differences in precipitation positions, which were quantified in the FSS. The FSSs clearly indicate that WRF-Hybrid had the greatest spatial skill.
5. Summary and conclusions
Limited-area 3DVAR, EnSRF, and hybrid DA configurations cycled continuously between 0000 UTC 6 May and 0000 UTC 21 June 2011 over a CONUS-spanning domain with 20-km horizontal grid spacing. Beginning 9 May, the 0000 UTC analyses initialized 36-h WRF forecasts that contained a large convection-permitting 4-km nest. These 4-km 3DVAR-, EnSRF-, and hybrid-initialized forecasts were compared to identically configured WRF forecasts initialized by interpolating 0000 UTC GFS analyses onto the computational domain. While the relative merits of 3DVAR, EnKF, and hybrid DA systems have also been examined elsewhere (e.g., Zhang et al. 2013), our detailed comparison of the utility of these different DA methods to initialize high-resolution forecasts is somewhat unique.
Climatological characteristics of the EnSRF-, 3DVAR-, and hybrid-initialized 4-km precipitation forecasts, such as total rainfall, bias, and timing of the diurnal cycle, were comparable, while the GFS-initialized forecasts differed from the other three experiments regarding rainfall amounts. However, there were notable differences throughout the 36-h forecast regarding precipitation placement, as measured by the FSS. In the aggregate statistics, the hybrid-initialized forecasts had the highest FSSs, which were usually statistically significantly better than those of the other experiments. The EnSRF-initialized forecasts usually performed second best, although differences compared to WRF-GFS and WRF-3DVAR were not always SS. FSSs from the 3DVAR- and GFS-initialized forecasts were lowest and the differences between them were usually not SS. The higher WRF-Hybrid and WRF-EnSRF FSSs suggest that limited-area DA configurations incorporating flow-dependent BECs can initialize better next-day convective forecasts than 3DVAR DA systems that employ static BECs.
The differences regarding convective placement directly resulted from the different ICs produced by the various analysis systems. These different precipitation patterns imply meaningful subsynoptic differences in the ICs that were not always quantifiable by examining the analysis fits to mass and wind observations. The different 3DVAR, EnSRF, and hybrid BECs distributed the observational content differently at nonobservation locations, leading to analyses with various degrees of balance and detail.
These findings are encouraging and suggest there is value from initializing convection-permitting forecasts with cycling mesoscale DA systems that incorporate flow-dependent BECs. In particular, the hybrid method of combining the static and ensemble-derived BECs seems promising. However, limited-area hybrid DA systems have not yet been introduced operationally. These results strongly suggest the need for additional development of regional hybrid DA with testing at higher resolution and using larger ensembles.
Acknowledgments
Thanks to Jeff Whitaker (NOAA/ESRL/PSD) for helping with the EnSRF configurations. Glen Romine (NCAR/MMM) contributed to the model physics and data assimilation settings. We are grateful to Mike Moran (NCAR/NESL) for his assistance with procuring computational support, which was provided by NCAR’s Computational and Information Systems Laboratory (CISL). Two anonymous reviewers provided constructive comments that improved this paper. This work was partially funded by the U.S. Air Force Weather Agency (AFWA).
REFERENCES
Barker, D. M., W. Huang, Y.-R. Guo, A. Bourgeois, and Q. N. Xiao, 2004: A three-dimensional variational data assimilation system for MM5: Implementation and initial results. Mon. Wea. Rev., 132, 897–914.
Benjamin, S. G., and Coauthors, 2004: An hourly assimilation–forecast cycle: The RUC. Mon. Wea. Rev., 132, 495–518.
Benjamin, S. G., and Coauthors, 2007: From radar-enhanced RUC to the WRF-based Rapid Refresh. Preprints, 22nd Conf. on Weather Analysis and Forecasting/18th Conf. on Numerical Weather Prediction, Park City, UT, Amer. Meteor. Soc., J3.4. [Available online at http://ams.confex.com/ams/pdfpapers/124827.pdf.]
Bryan, G. H., J. C. Wyngaard, and J. M. Fritsch, 2003: Resolution requirements for the simulation of deep moist convection. Mon. Wea. Rev., 131, 2394–2416.
Buehner, M., 2005: Ensemble-derived stationary and flow-dependent background error covariances: Evaluation in a quasi-operational NWP setting. Quart. J. Roy. Meteor. Soc., 131, 1013–1043.
Buehner, M., P. L. Houtekamer, C. Charette, H. L. Mitchell, and B. He, 2010: Intercomparison of variational data assimilation and the ensemble Kalman filter for global deterministic NWP. Part II: One-month experiments with real observations. Mon. Wea. Rev., 138, 1567–1586.
Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: Analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719–1724.
Campbell, W. F., C. H. Bishop, and D. Hodyss, 2010: Vertical covariance localization for satellite radiances in ensemble Kalman filters. Mon. Wea. Rev., 138, 282–290.
Chen, F., and J. Dudhia, 2001: Coupling an advanced land surface–hydrology model with the Penn State–NCAR MM5 modeling system. Part I: Model description and implementation. Mon. Wea. Rev., 129, 569–585.
Clark, A. J., W. A. Gallus Jr., and M. L. Weisman, 2010: Neighborhood-based verification of precipitation forecasts from convection-allowing NCAR WRF model simulations and the operational NAM. Wea. Forecasting, 25, 1495–1509.
Clark, A. J., J. S. Kain, P. T. Marsh, J. Correia, M. Xue, and F. Kong, 2012: Forecasting tornado pathlengths using a three-dimensional object identification algorithm applied to convection-allowing forecasts. Wea. Forecasting, 27, 1090–1113.
Clark, A. J., J. Gao, P. T. Marsh, T. Smith, J. S. Kain, J. Correia, M. Xue, and F. Kong, 2013: Tornado pathlength forecasts from 2010 to 2011 using ensemble updraft helicity. Wea. Forecasting, 28, 387–407.
Clayton, A. M., A. C. Lorenc, and D. M. Barker, 2012: Operational implementation of a hybrid ensemble/4D-Var global data assimilation system at the Met Office. Quart. J. Roy. Meteor. Soc., 139, 1445–1461, doi:10.1002/qj.2054.
Courtier, P., J.-N. Thépaut, and A. Hollingsworth, 1994: A strategy for operational implementation of 4D-Var, using an incremental approach. Quart. J. Roy. Meteor. Soc., 120, 1367–1387.
Dee, D. P., and A. M. da Silva, 2003: The choice of variable for atmospheric moisture analysis. Mon. Wea. Rev., 131, 155–171.
Done, J., C. A. Davis, and M. L. Weisman, 2004: The next generation of NWP: Explicit forecasts of convection using the Weather Research and Forecasting (WRF) model. Atmos. Sci. Lett., 5, 110–117, doi:10.1002/asl.72.
Dowell, D. C., F. Zhang, L. J. Wicker, C. Snyder, and N. A. Crook, 2004: Wind and temperature retrievals in the 17 May 1981 Arcadia, Oklahoma, supercell: Ensemble Kalman filter experiments. Mon. Wea. Rev., 132, 1982–2005.
Ebert, E. E., 2009: Neighborhood verification: A strategy for rewarding close forecasts. Wea. Forecasting, 24, 1498–1510.
Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10 143–10 162.
Gao, F., X. Zhang, N. A. Jacobs, X.-Y. Huang, X. Zhang, and P. P. Childs, 2012: Estimation of TAMDAR observational error and assimilation experiments. Wea. Forecasting, 27, 856–877.
Gao, J., M. Xue, and D. J. Stensrud, 2010: The development of a hybrid EnKF–3DVar algorithm for storm-scale data assimilation. Preprints, 25th Conf. on Severe Local Storms, Denver, CO, Amer. Meteor. Soc., P7.4. [Available online at https://ams.confex.com/ams/pdfpapers/176004.pdf.]
Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. Quart. J. Roy. Meteor. Soc., 125, 723–757.
Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter–3D variational analysis scheme. Mon. Wea. Rev., 128, 2905–2919.
Hamill, T. M., J. S. Whitaker, M. Fiorino, and S. G. Benjamin, 2011a: Global ensemble predictions of 2009’s tropical cyclones initialized with an ensemble Kalman filter. Mon. Wea. Rev., 139, 668–688.
Hamill, T. M., J. S. Whitaker, D. T. Kleist, M. Fiorino, and S. G. Benjamin, 2011b: Predictions of 2010’s tropical cyclones using the GFS and ensemble-based data assimilation methods. Mon. Wea. Rev., 139, 3243–3247.
Hohenegger, C., and C. Schär, 2007: Predictability and error growth dynamics in cloud-resolving models. J. Atmos. Sci., 64, 4467–4478.
Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.
Huang, X.-Y., and P. Lynch, 1993: Diabatic digital filter initialization: Application to the HIRLAM model. Mon. Wea. Rev., 121, 589–603.
Huang, X.-Y., and Coauthors, 2009: Four-dimensional variational data assimilation for WRF: Formulation and preliminary results. Mon. Wea. Rev., 137, 299–314.
Iacono, M. J., J. S. Delamere, E. J. Mlawer, M. W. Shephard, S. A. Clough, and W. D. Collins, 2008: Radiative forcing by long-lived greenhouse gases: Calculations with the AER radiative transfer models. J. Geophys. Res., 113, D13103, doi:10.1029/2008JD009944.
Janjić, Z. I., 1994: The step-mountain eta coordinate model: Further developments of the convection, viscous sublayer, and turbulence closure schemes. Mon. Wea. Rev., 122, 927–945.
Janjić, Z. I., 2002: Nonsingular implementation of the Mellor–Yamada level 2.5 scheme in the NCEP Meso Model. NCEP Office Note 437, 61 pp. [Available online at http://www.emc.ncep.noaa.gov/officenotes/newernotes/on437.pdf.]
Jones, T. A., and D. J. Stensrud, 2012: Assimilating AIRS temperature and mixing ratio profiles using an ensemble Kalman filter approach for convective-scale forecasts. Wea. Forecasting, 27, 541–564.
Jones, T. A., D. J. Stensrud, P. Minnis, and R. Palikonda, 2013: Evaluation of a forward operator to assimilate cloud water path into WRF-DART. Mon. Wea. Rev., 141, 2272–2289.
Kain, J. S., S. J. Weiss, J. J. Levit, M. E. Baldwin, and D. R. Bright, 2006: Examination of convection-allowing configurations of the WRF model for the prediction of severe convective weather: The SPC/NSSL Spring Program 2004. Wea. Forecasting, 21, 167–181.
Kain, J. S., and Coauthors, 2008: Some practical considerations regarding horizontal resolution in the first generation of operational convection-allowing NWP. Wea. Forecasting, 23, 931–952.
Kleist, D. T., D. F. Parrish, J. C. Derber, R. Treadon, W.-S. Wu, and S. Lord, 2009: Introduction of the GSI into the NCEP Global Data Assimilation System. Wea. Forecasting, 24, 1691–1705.
Lean, H. W., P. A. Clark, M. Dixon, N. M. Roberts, A. Fitch, R. Forbes, and C. Halliwell, 2008: Characteristics of high-resolution versions of the Met Office Unified Model for forecasting convection over the United Kingdom. Mon. Wea. Rev., 136, 3408–3424.
Leith, C., 1974: Theoretical skill of Monte Carlo forecasts. Mon. Wea. Rev., 102, 409–418.
Lin, Y., and K. E. Mitchell, 2005: The NCEP stage II/IV hourly precipitation analyses: Development and applications. Preprints, 19th Conf. on Hydrology, San Diego, CA, Amer. Meteor. Soc., 1.2. [Available online at http://ams.confex.com/ams/pdfpapers/83847.pdf.]
Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP—A comparison with 4D-VAR. Quart. J. Roy. Meteor. Soc., 129, 3183–3203.
Lorenc, A. C., and Coauthors, 2000: The Met. Office global three-dimensional variational data assimilation scheme. Quart. J. Roy. Meteor. Soc., 126, 2991–3012, doi:10.1002/qj.49712657002.
Lorenz, E. N., 1969: The predictability of a flow which possesses many scales of motion. Tellus, 21, 289–307.
Melhauser, C., and F. Zhang, 2012: Practical and intrinsic predictability of severe and convective weather at the mesoscales. J. Atmos. Sci., 69, 3350–3371.
Mellor, G. L., and T. Yamada, 1982: Development of a turbulence closure model for geophysical fluid problems. Rev. Geophys. Space Phys., 20, 851–875.
Meng, Z., and F. Zhang, 2008a: Tests of an ensemble Kalman filter for mesoscale and regional-scale data assimilation. Part III: Comparison with 3DVAR in a real-data case study. Mon. Wea. Rev., 136, 522–540.
Meng, Z., and F. Zhang, 2008b: Tests of an ensemble Kalman filter for mesoscale and regional-scale data assimilation. Part IV: Comparison with 3DVAR in a month-long experiment. Mon. Wea. Rev., 136, 3671–3682.
Mittermaier, M., and N. Roberts, 2010: Intercomparison of spatial forecast verification methods: Identifying skillful spatial scales using the fractions skill score. Wea. Forecasting, 25, 343–354.
Miyoshi, T., Y. Sato, and T. Kadowaki, 2010: Ensemble Kalman filter and 4D-Var intercomparison with the Japanese Operational Global Analysis and Prediction System. Mon. Wea. Rev., 138, 2846–2866.
Mlawer, E. J., S. J. Taubman, P. D. Brown, M. J. Iacono, and S. A. Clough, 1997: Radiative transfer for inhomogeneous atmospheres: RRTM, a validated correlated-k model for the long-wave. J. Geophys. Res., 102, 16 663–16 682.
Morrison, H., G. Thompson, and V. Tatarskii, 2009: Impact of cloud microphysics on the development of trailing stratiform precipitation in a simulated squall line: Comparison of one- and two-moment schemes. Mon. Wea. Rev., 137, 991–1007.
Parrish, D. F., and J. C. Derber, 1992: The National Meteorological Center’s spectral statistical interpolation analysis system. Mon. Wea. Rev., 120, 1747–1763.
Rainwater, S., and B. Hunt, 2013: Mixed resolution ensemble data assimilation. Mon. Wea. Rev., 141, 3007–3021.
Roberts, N. M., and H. W. Lean, 2008: Scale-selective verification of rainfall accumulations from high-resolution forecasts of convective events. Mon. Wea. Rev., 136, 78–97.
Rogers, E., and Coauthors, 2009: The NCEP North American Mesoscale modeling system: Recent changes and future plans. Preprints, 23rd Conf. on Weather Analysis and Forecasting/19th Conf. on Numerical Weather Prediction, Omaha, NE, Amer. Meteor. Soc., 2A.4. [Available online at https://ams.confex.com/ams/pdfpapers/154114.pdf.]
Romine, G., C. S. Schwartz, C. Snyder, J. Anderson, and M. Weisman, 2013: Model bias in a continuously cycled assimilation system and its influence on convection-permitting forecasts. Mon. Wea. Rev., 141, 1263–1284.
Schwartz, C. S., and Coauthors, 2009: Next-day convection-allowing WRF model guidance: A second look at 2-km versus 4-km grid spacing. Mon. Wea. Rev., 137, 3351–3372.
Schwartz, C. S., and Coauthors, 2010: Toward improved convection-allowing ensembles: Model physics sensitivities and optimizing probabilistic guidance with small ensemble membership. Wea. Forecasting, 25, 263–280.
Schwartz, C. S., Z. Liu, X.-Y. Huang, Y.-H. Kuo, and C.-T. Fong, 2013: Comparing limited-area 3DVAR and hybrid variational-ensemble data assimilation methods for typhoon track forecasts: Sensitivity to outer loops and vortex relocation. Mon. Wea. Rev., 141, 4350–4372.
Skamarock, W. C., and M. L. Weisman, 2009: The impact of positive-definite moisture transport on NWP precipitation forecasts. Mon. Wea. Rev., 137, 488–494.
Skamarock, W. C., and Coauthors, 2008: A description of the Advanced Research WRF version 3. NCAR Tech Note NCAR/TN-475+STR, 113 pp. [Available from UCAR Communications, P.O. Box 3000, Boulder, CO 80307.]
Smith, T. L., S. G. Benjamin, J. M. Brown, S. Weygandt, T. Smirnova, and B. Schwartz, 2008: Convection forecasts from the hourly updated, 3-km High-Resolution Rapid Refresh (HRRR) model. Preprints, 24th Conf. on Severe Local Storms, Savannah, GA, Amer. Meteor. Soc., 11.1. [Available online at https://ams.confex.com/ams/pdfpapers/142055.pdf.]
Smolarkiewicz, P. K., and G. A. Grell, 1992: A class of monotone interpolation schemes. J. Comput. Phys., 101, 431–440.
Snyder, C., and F. Zhang, 2003: Assimilation of simulated Doppler radar observations with an ensemble Kalman filter. Mon. Wea. Rev., 131, 1663–1677.
Sobash, R. A., J. S. Kain, D. R. Bright, A. R. Dean, M. C. Coniglio, and S. J. Weiss, 2011: Probabilistic forecast guidance for severe thunderstorms based on the identification of extreme phenomena in convection-allowing model forecasts. Wea. Forecasting, 26, 714–728.
Stensrud, D. J., and Coauthors, 2009: Convective-scale warn-on-forecast system. Bull. Amer. Meteor. Soc., 90, 1487–1499.
Tanamachi, R. L., L. J. Wicker, D. C. Dowell, H. B. Bluestein, D. T. Dawson, and M. Xue, 2013: EnKF assimilation of high-resolution, mobile Doppler radar data of the 4 May 2007 Greensburg, Kansas, supercell into a numerical cloud model. Mon. Wea. Rev., 141, 625–648.
Thompson, G., P. R. Field, R. M. Rasmussen, and W. D. Hall, 2008: Explicit forecasts of winter precipitation using an improved bulk microphysics scheme. Part II: Implementation of a new snow parameterization. Mon. Wea. Rev., 136, 5095–5115.
Tiedtke, M., 1989: A comprehensive mass flux scheme for cumulus parameterization in large-scale models. Mon. Wea. Rev., 117, 1779–1800.
Torn, R. D., 2010: Performance of a mesoscale ensemble Kalman filter (EnKF) during the NOAA High-Resolution Hurricane Test. Mon. Wea. Rev., 138, 4375–4392.
Torn, R. D., and G. J. Hakim, 2009: Ensemble data assimilation applied to RAINEX observations of Hurricane Katrina (2005). Mon. Wea. Rev., 137, 2817–2829.
Torn, R. D., G. J. Hakim, and C. Snyder, 2006: Boundary conditions for limited-area ensemble Kalman filters. Mon. Wea. Rev., 134, 2490–2502.
Wang, X., 2010: Incorporating ensemble covariance in the gridpoint statistical interpolation (GSI) variational minimization: A mathematical framework. Mon. Wea. Rev., 138, 2990–2995.
Wang, X., 2011: Application of the WRF hybrid ETKF–3DVAR data assimilation system for hurricane track forecasts. Wea. Forecasting, 26, 868–884.
Wang, X., D. Barker, C. Snyder, and T. M. Hamill, 2008a: A hybrid ETKF–3DVAR data assimilation scheme for the WRF model. Part I: Observing system simulation experiment. Mon. Wea. Rev., 136, 5116–5131.
Wang, X., D. Barker, C. Snyder, and T. M. Hamill, 2008b: A hybrid ETKF–3DVAR data assimilation scheme for the WRF model. Part II: Real observation experiments. Mon. Wea. Rev., 136, 5132–5147.
Wang, X., D. F. Parrish, D. T. Kleist, and J. S. Whitaker, 2013: GSI 3DVAR-based ensemble-variational hybrid data assimilation for NCEP Global Forecast System: Single-resolution experiments. Mon. Wea. Rev., 141, 4098–4117.
Weisman, M. L., C. A. Davis, W. Wang, K. W. Manning, and J. B. Klemp, 2008: Experiences with 0–36-h explicit convective forecasts with the WRF-ARW model. Wea. Forecasting, 23, 407–437.
Weisman, M. L., C. Evans, and L. Bosart, 2013: The 8 May 2009 Superderecho: Analysis of a real-time explicit convective forecast. Wea. Forecasting, 28, 863–892.
Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924.
Whitaker, J. S., and T. M. Hamill, 2012: Evaluating methods to account for system errors in ensemble data assimilation. Mon. Wea. Rev., 140, 3078–3089.
Whitaker, J. S., T. M. Hamill, X. Wei, Y. Song, and Z. Toth, 2008: Ensemble data assimilation with the NCEP Global Forecast System. Mon. Wea. Rev., 136, 463–482.
Wu, W.-S., D. F. Parrish, and R. J. Purser, 2002: Three-dimensional variational analysis with spatially inhomogeneous covariances. Mon. Wea. Rev., 130, 2905–2916.
Zhang, C., Y. Wang, and K. Hamilton, 2011: Improved representation of boundary layer clouds over the southeast Pacific in ARW-WRF using a modified Tiedtke cumulus parameterization scheme. Mon. Wea. Rev., 139, 3489–3513.
Zhang, F., C. Snyder, and R. Rotunno, 2003: Effects of moist convection on mesoscale predictability. J. Atmos. Sci., 60, 1173–1185.
Zhang, F., C. Snyder, and J. Sun, 2004: Impacts of initial estimate and observation availability on convective-scale data assimilation with an ensemble Kalman filter. Mon. Wea. Rev., 132, 1238–1253.
Zhang, F., Y. Weng, J. A. Sippel, Z. Meng, and C. H. Bishop, 2009a: Cloud-resolving hurricane initialization and prediction through assimilation of Doppler radar observations with an ensemble Kalman filter. Mon. Wea. Rev., 137, 2105–2125.
Zhang, F., M. Zhang, and J. A. Hansen, 2009b: Coupling ensemble Kalman filter with four-dimensional variational data assimilation. Adv. Atmos. Sci., 26, 1–8.
Zhang, F., M. Zhang, and J. Poterjoy, 2013: E3DVar: Coupling an ensemble Kalman filter with three-dimensional variational data assimilation in a limited-area weather prediction model and comparison to E4DVar. Mon. Wea. Rev., 141, 900–917.
Zhang, M., and F. Zhang, 2012: E4DVar: Coupling an ensemble Kalman filter with four-dimensional variational data assimilation in a limited-area weather prediction model. Mon. Wea. Rev., 140, 587–600.
Zhang, M., F. Zhang, X.-Y. Huang, and X. Zhang, 2011: Intercomparison of an ensemble Kalman filter with three- and four-dimensional variational data assimilation methods in a limited-area model over the month of June 2003. Mon. Wea. Rev., 139, 566–572.
The RR replaced the Rapid Update Cycle (RUC; Benjamin et al. 2004) model on 1 May 2012.