1. Introduction
Accurate prediction of deep convection is a well-known challenge (e.g., Fritsch and Carbone 2004; Ebert et al. 2003), and probabilistic guidance from regional convection-permitting ensemble forecast systems (CPEFSs) has been demonstrated as a viable forecast tool (e.g., Gebhardt et al. 2008; Schwartz et al. 2010; Clark et al. 2011; Migliorini et al. 2011). Moreover, CPEFSs provide added value over coarser-resolution ensembles for prediction of high-impact weather (e.g., Clark et al. 2009; Duc et al. 2013; Kain et al. 2013). Yet, best practice in ensemble design for CPEFSs remains unclear. Specifically, while techniques for generating ensemble dispersion on global scales are mature (e.g., Buizza et al. 2005), it is not clear whether the same methods are applicable at the meso- and convective scales. In practice, CPEFSs are characteristically biased and lack sufficient ensemble spread (or dispersion), such that the range of ensemble predictions often does not accurately represent forecast uncertainties (e.g., Hohenegger et al. 2008; Gebhardt et al. 2011; Vié et al. 2011; Schwartz et al. 2014, hereafter S14).
Methods that improve ensemble dispersion and forecast reliability do not necessarily improve the performance of the ensemble mean forecast and can reduce the deterministic skill of individual member forecasts, which may inadvertently reduce forecaster confidence in predictions of high-impact events. Yet, among the issues limiting greater adoption of ensemble products by operational forecasters, Novak et al. (2008) describe a lack of sufficient dispersion in forecasts, particularly for high-impact weather events. Thus, some degree of compromise appears warranted to increase dispersion if the consequences are manageable. Approaches to address underdispersion in CPEFS design include representing: 1) initial condition uncertainty, 2) lateral boundary condition uncertainty, and 3) forecast model error. A more detailed look at each component follows.
a. Initial condition uncertainty
One method to introduce spread in an ensemble is to vary the initial conditions among ensemble members while all members employ an identical forecast model. Choosing appropriate perturbations remains an ad hoc exercise, with a variety of approaches described in the literature, such as Gaussian noise (e.g., Hohenegger and Schär 2007), drawing perturbations from an external ensemble model system (e.g., Kong et al. 2009), or downscaling from a coarser-resolution ensemble system (e.g., Stensrud et al. 1999). Additionally, ensemble data assimilation, such as an ensemble Kalman filter (EnKF; Evensen 1994), can be used to generate an ensemble analysis for initial conditions, as has been demonstrated for global applications (e.g., Houtekamer et al. 2009) and at the mesoscale (Zhang et al. 2006). However, aside from select case studies (e.g., Aksoy et al. 2010; Migliorini et al. 2011; Melhauser and Zhang 2012; Dawson et al. 2012; Yussouf et al. 2013), EnKFs have received limited attention for initializing convection-permitting ensemble forecasts. S14 show that a CPEFS with initial condition uncertainty drawn from an EnKF is skillful, providing evidence that the main source of error for their CPEFS configuration is initial condition uncertainty. Yet, S14 also show that their CPEFS is underdispersive, suggesting that initial condition uncertainty alone is insufficient. Given this, lateral boundary condition uncertainty and model error representation are considered here as supplements to initial condition uncertainty from an EnKF.
b. Lateral boundary condition uncertainty
Convection-permitting models are computationally expensive, motivating the use of limited-area models (LAMs). However, relative to global models, the use of regional domains introduces a new uncertainty along the lateral boundaries. External lower-resolution analyses and forecasts from a deterministic global forecast system can provide reasonable lateral boundary solutions, but for an ensemble it is necessary to perturb each member to provide suitable boundary conditions (e.g., Torn 2010). Alternatively, global ensemble systems are increasingly available to provide boundary conditions, yet they offer only coarse resolution, lag in availability, and have limited ensemble size. Thus, there remains value in randomly perturbing a deterministic forecast to obtain ensemble lateral boundary conditions.
Perturbed (or global ensemble forecast system derived) lateral boundary conditions (PLBCs) have been used by a number of LAM ensembles to improve ensemble performance (Nutter et al. 2004a,b; Hohenegger et al. 2008; Gebhardt et al. 2011; Vié et al. 2011), and this approach is regularly employed in limited-area continuously cycled ensemble data assimilation systems to preserve spread (e.g., Torn et al. 2006; Torn 2010; Romine et al. 2013, hereafter R13). When the LAM area is small, variability in lateral boundary conditions among members is crucial to maintaining ensemble spread (e.g., Hohenegger and Schär 2007), while the influence of PLBCs should decrease as the spatial distance from the lateral boundaries to the region of interest increases. At the same time, as forecast lead time increases, so should the dependence on appropriate PLBCs.
c. Model error representation
Forecast models have uncertain representations of physical processes and may offer several parameterization configuration options that can be leveraged to improve spread. Assigning different sets of physical parameterizations or model formulations to different members increases ensemble dispersion, but the members then carry different biases and can no longer be described as equally likely, which complicates how to appropriately weight members in probabilistic forecasts. Still, this approach has often been leveraged in CPEFS design for multimodel (e.g., Kong et al. 2009; Johnson and Wang 2012), multiphysics (e.g., Clark et al. 2010; Gebhardt et al. 2011), and even multiparameter (e.g., Yussouf and Stensrud 2012) ensembles.
While a multimodel ensemble would seem like a practical approach to represent model error, there are some potential advantages to using a single, uniformly configured model, particularly when the same model is used for both the continuously cycled EnKF and the ensemble forecast system. These include: (i) a reduced adjustment period to the forecast model attractor relative to drawing initial states from a different model system, (ii) knowledge of analysis error relative to observations from the statistics of the cycled analysis system, which identifies opportunities to improve the model itself, (iii) an ensemble initial state that represents uncertainty in the initial state of the atmosphere, and (iv) individual ensemble member forecasts that are equally likely forecast outcomes. Based on these benefits, this study considers the single-model approach.
Ideally, a model error scheme would contribute random noise to each forecast member, such that when an ensemble of forecasts is employed the true state of the atmosphere is more likely to be encapsulated within the envelope of ensemble solutions. Assuming an imperfect forecast model, some forecasts would move closer to the truth, while other forecasts would move even further from the truth than the unperturbed forecast. For a purely random model error process, the ensemble mean, or average of the member forecasts, should remain about the same distance from the true state, within the limits of sampling error and approximately linear model behavior.1
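This expectation is easy to illustrate with synthetic data. The sketch below is illustrative only: the scalar truth, member statistics, and noise amplitude are arbitrary assumptions, not properties of the forecast system studied here. Adding zero-mean noise to each member of a toy ensemble increases the average error of individual members while leaving the ensemble-mean error nearly unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
n_members, n_cases = 30, 10000

truth = 0.0                                            # arbitrary scalar truth
members = rng.normal(0.5, 1.0, (n_cases, n_members))   # biased, dispersive toy ensemble
noise = rng.normal(0.0, 1.0, (n_cases, n_members))     # zero-mean random "model error"

def member_rmse(ens):
    # RMS error of individual members, pooled over members and cases
    return np.sqrt(np.mean((ens - truth) ** 2))

def mean_rmse(ens):
    # RMS error of the ensemble mean over cases
    return np.sqrt(np.mean((ens.mean(axis=1) - truth) ** 2))

print(member_rmse(members), member_rmse(members + noise))  # member error grows
print(mean_rmse(members), mean_rmse(members + noise))      # mean error nearly unchanged
```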
To this end, stochastic model error schemes (SMESs) have been developed as an alternative means to represent forecast model error while preserving the potential benefits of a single-model framework. This study considers two stochastic model error scheme options: (i) the stochastic kinetic energy backscatter scheme (SKEBS; Shutts 2005) and (ii) stochastically perturbed parameterization tendencies (SPPT; Buizza et al. 1999). SKEBS is intended to substitute for a "missing" physical process in the turbulent energy cascade, and represents structural uncertainty owing to this shortcoming in the model by introducing spatially and temporally correlated perturbations to streamfunction (rotational wind) and potential temperature tendencies. On the other hand, SPPT modifies the output of existing subgrid-scale parameterization schemes to provide a probabilistic parameterization process (Palmer et al. 2009). The SPPT as implemented in this study uses a stochastic forcing pattern similar to that of SKEBS, but the perturbations are applied multiplicatively to the tendencies from parameterized physical processes rather than additively to the state-variable tendencies. Berner et al. (2014, manuscript submitted to Mon. Wea. Rev., hereafter B14) tested SKEBS and SPPT at 45-km grid spacing and found that these model error schemes were beneficial throughout the atmosphere, especially near the surface, with SKEBS having the larger impact.
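The essential difference between the two schemes can be summarized at a single grid point. In the schematic below, the variable names and tendency magnitudes are illustrative assumptions rather than the WRF implementation: SKEBS adds a pattern-scaled term to the state tendency, whereas SPPT rescales the accumulated physics tendency.

```python
import numpy as np

rng = np.random.default_rng(1)
dt = 75.0                 # model time step (s), as on the 15-km domain

theta = 280.0             # e.g., potential temperature (K) at one grid point
dyn_tend = 1.0e-4         # resolved dynamical tendency (K s^-1), illustrative
phys_tend = 2.0e-4        # accumulated physics tendency (K s^-1), illustrative
r = rng.normal(0.0, 0.3)  # one draw from the stochastic pattern at this point

# SKEBS-like: an additive, pattern-scaled term in the state tendency
theta_skebs = theta + dt * (dyn_tend + phys_tend + 1.0e-5 * r)

# SPPT-like: the accumulated physics tendency is rescaled multiplicatively
theta_sppt = theta + dt * (dyn_tend + (1.0 + r) * phys_tend)
```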
During the spring of 2012, the National Center for Atmospheric Research (NCAR) generated real-time EnKF analyses and deterministic forecasts in support of the Deep Convective Clouds and Chemistry (DC3) field campaign (Barth et al. 2012). The output from this real-time ensemble analysis is leveraged by this study to explore CPEFS design with EnKF initial conditions along with model error representation from perturbed lateral boundaries, SKEBS, or SPPT. Methodologies for the analysis, ensemble forecast system, and verification approaches are discussed in section 2. Results from ensemble forecasts are presented in section 3, followed by a discussion (section 4) and summary (section 5).
2. Methodology
Ensemble forecasts are initialized from the ensemble analysis (analysis region is the larger area in Fig. 1) including downscaling to a convection-permitting nest (smaller box within the analysis region labeled “d02”). Lateral boundary conditions for the member nest are provided by the larger regional ensemble forecast. This study focuses on convection-permitting ensemble forecasts in the nest domain only. Descriptions of the forecast model and analysis system follow, along with a summary of the control forecast, perturbation methods, and metrics for perturbation impact and forecast verification.

Fig. 1. Geographic area for the cycled analysis (outer) and convection-permitting nest (inner, labeled d02), and a representative sample of the instantaneous stochastic forcing pattern for temperature (color fill).
a. WRF Model description
Daily ensemble forecasts include a convection-permitting (3-km grid spacing) nest within each forecast ensemble member that is initialized by downscaling each member analysis state from the cycled ensemble analysis system. Once daily at 0000 UTC for the period of study, extending from 25 May to 25 June 2012, a 30-member Weather Research and Forecasting (WRF) Model (Skamarock et al. 2008) forecast ensemble is integrated for 36 h, initialized from the first 30 members of the 50-member ensemble analysis (15-km grid spacing). S14 show that this simple approach for selecting which members to include in the forecast ensemble is sufficient.
All ensemble forecasts use an identical forecast model configuration, aside from activating model error schemes. This includes positive-definite moisture advection (Skamarock and Weisman 2009) and the following physical parameterizations on both domains: the Rapid Radiative Transfer Model for Global Climate Models (RRTMG; Mlawer et al. 1997; Iacono et al. 2008) longwave and shortwave radiation schemes with ozone and aerosol climatologies (Tegen et al. 1997), the Mellor–Yamada–Janjić (MYJ; Mellor and Yamada 1982; Janjić 1994, 2002) planetary boundary layer scheme, and the Noah land surface model (Chen and Dudhia 2001). On the 15-km grid spacing domain, the Tiedtke cumulus parameterization (Tiedtke 1989; C. Zhang et al. 2011) is also used. A 75-s (18.75-s) time step is applied on the outer (inner) domain.
R13 show that the performance of the ensemble data assimilation is sensitive to the model configuration. The model configuration used for this study (see Table 1) is similar to the best-performing configuration in R13, except for an updated WRF Model version (here version 3.3.1) and a change to the RRTMG radiation schemes. Notably, the Morrison explicit microphysics (Morrison et al. 2009) is used in the cycled data assimilation system, while the Thompson explicit microphysics (Thompson et al. 2008) is employed for the ensemble forecasts. Convection-permitting forecasts run in real time with the Morrison microphysics displayed spurious, intense rain rates during the first few forecast hours. Subsequent tests with the Thompson microphysics reduced the spurious precipitation, while forecast skill was qualitatively and quantitatively similar with both schemes (not shown). As such, the results presented here use the Thompson scheme.
Table 1. WRF Model options.
b. WRF/DART analysis system
As in R13, the ensemble data assimilation system uses a 50-member ensemble to provide short forecasts between 6-hourly analysis cycles. The system is initialized at 1800 UTC 30 April 2012, and thereafter the ensemble analysis state is integrated to the next analysis time each "cycle," providing the first-guess background for the next analysis as a continuously cycled analysis system. Lateral boundary conditions for the cycled analysis system are generated from perturbed Global Forecast System (GFS; Kalnay et al. 1990) forecasts with the fixed covariance perturbation technique (Torn et al. 2006). The ensemble data assimilation system uses the Data Assimilation Research Testbed (DART; Anderson et al. 2009) facility configured as an ensemble adjustment Kalman filter (Anderson 2001) coupled with the regional WRF Model (as described in section 2a). The continuously cycled analysis system (WRF/DART) is configured (Table 2) similarly to R13, except for a modest reduction in horizontal and vertical localization (i.e., a larger region of influence), a larger adaptive localization threshold, and adjustments to adaptive covariance inflation (Anderson 2009) to increase responsiveness to changes in the misfit with observations. Standard conventional observations are assimilated (see Table 3 of R13), including radiosondes, surface observations from ship and land stations, cloud-track winds, GPS radio occultation, and aircraft reports. Analysis variables include the three components of velocity, potential temperature, column mass of dry air, geopotential, and mixing ratio and number concentration for water vapor and microphysical species.
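For context, the ensemble adjustment Kalman filter used by DART updates the ensemble deterministically rather than assimilating perturbed observations. The sketch below is a minimal single-observation illustration, assuming one directly observed scalar state variable; it omits the localization, inflation, and state-variable regression that the actual WRF/DART system applies.

```python
import numpy as np

def eakf_update(x_prior, obs, obs_var):
    """Ensemble adjustment Kalman filter update (Anderson 2001) for one
    directly observed scalar state variable: shift the ensemble mean and
    contract the spread deterministically, with no perturbed observations."""
    xb = x_prior.mean()
    pb = x_prior.var(ddof=1)                  # prior ensemble variance
    pa = 1.0 / (1.0 / pb + 1.0 / obs_var)     # posterior variance
    xa = pa * (xb / pb + obs / obs_var)       # posterior mean
    return xa + np.sqrt(pa / pb) * (x_prior - xb)

rng = np.random.default_rng(2)
prior = rng.normal(300.0, 2.0, 50)            # 50-member prior, e.g., temperature (K)
posterior = eakf_update(prior, obs=302.0, obs_var=1.0)
```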
Table 2. DART options.
Observation processing is also modified relative to that in R13. For aircraft observations at flight level (200–300 hPa), observation errors are slightly increased and superobservations are averaged over larger horizontal boxes. These changes aim to reduce the influence on the analysis of the bias in Aircraft Meteorological Data Relay (AMDAR) observations noted in R13. For similar reasons, "significant level" radiosonde observations of temperature and moisture are also assimilated. Observation errors for surface observations are reduced to better match total spread with innovation statistics from assimilated observations (see also Ha and Snyder 2014). Collectively, the model and analysis system adjustments lead to improved analysis and forecast system performance during the spring of 2012 relative to the results shown in R13 (explicit comparisons not shown).
c. Control ensemble forecasts
Control ensemble forecasts use the first 30 members from the WRF/DART analysis as initial conditions and are integrated for 36 h. Lateral boundary conditions on the 15-km domain are identical among members for a particular forecast and come from GFS 3-hourly forecasts. (The initial analysis state has PLBCs carried over from the WRF/DART analysis, but tendencies become identical for all members beyond 3 h into the forecast.) S14 show the overall performance characteristics of the control ensemble, with initial condition perturbations only, using several verification approaches, and suggest that a 20- to 30-member ensemble sufficiently represents the forecast probability density function. Since the control ensemble forecasts employ neither lateral boundary perturbations nor a representation of model error, they provide a reference against which to measure the effects of PLBC, SKEBS, and SPPT.
d. Ensemble forecast experiments
To address the noted underdispersion of the control ensemble (S14), three approaches are explored that can enhance the growth of ensemble spread: perturbing lateral boundaries and two stochastic model error methods. The former approximates expected uncertainty in forecast lateral boundary conditions, while the stochastic approaches target sources of random error in the forecast model. Details for each approach follow.
1) Perturbed lateral boundary conditions
As noted in section 2b, the initial conditions of the control ensemble have perturbed lateral boundary states at the start of the forecast. As a simplified representation of continued lateral boundary condition uncertainty in the high-resolution ensemble forecasts, lateral boundary fields on the 15-km grid spacing outer domain are given a persistent, randomly selected perturbation beyond 3 h. These lateral boundary perturbations are implemented identically to those in the cycled analysis system, except that the ensemble forecast tendencies are adjusted toward a newly drawn perturbed state during the first 3 h of the forecast, after which the same perturbation is retained for each member for the remainder of the forecast boundary conditions. For the convection-permitting nest, the lateral boundary conditions come from the coarser-grid regional forecast and are thus far removed from the perturbed lateral boundaries of the coarse grid, which reduces the errors introduced by using a specified lateral boundary condition (e.g., Warner et al. 1997).
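As a rough sketch of this procedure (the array sizes, the uncorrelated random draw, and the linear 3-h blend are assumptions of this illustration; the actual system draws spatially correlated perturbations with the fixed covariance technique of Torn et al. 2006), one persistent perturbation per member might be applied as follows.

```python
import numpy as np

rng = np.random.default_rng(3)
n_bdy = 500                          # points along the lateral boundary (illustrative)
lead_hours = np.arange(0, 37, 3)     # 3-hourly boundary updates out to 36 h

# Placeholder for one member's perturbation; the actual system draws
# spatially correlated perturbations from a fixed background-error
# covariance (Torn et al. 2006) rather than uncorrelated noise.
member_pert = rng.normal(0.0, 1.0, n_bdy)

def perturbed_boundary(base_bdy, hour):
    """Blend from the unperturbed boundary state toward a single persistent
    perturbation over the first 3 h, then hold that perturbation fixed."""
    weight = min(hour / 3.0, 1.0)
    return base_bdy + weight * member_pert

bdy_states = {h: perturbed_boundary(np.zeros(n_bdy), h) for h in lead_hours}
```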
2) Stochastic kinetic energy backscatter scheme
As in Berner et al. (2011), SKEBS is activated for ensemble forecasts and compared against the control ensemble forecast. This SKEBS implementation (B14) is the one released in WRF version 3.6, containing bug fixes2 relative to the original WRF Model 3.3.1 release version. To mimic unresolved processes at the cloud-permitting scale, the SKEBS parameters (see Table 3) are chosen differently from those in Berner et al. (2011; B14). For example, no perturbations are applied to wavenumbers 1–3 of the temperature forcing, in order to target processes with horizontal correlations of 50–100 km, and the temporal decorrelation time is set to 3 h for both the temperature and streamfunction forcings. As in the default WRF release, the stochastic pattern varies horizontally but has a barotropic (i.e., constant) vertical structure. The pattern is generated on the parent domain and then interpolated to the inner nest, such that the inner and outer nests have consistent model error patterns. Thus, the stochastic forcing acts on relatively large scales compared with the grid scale of the model. An example of the stochastic spectral perturbation pattern is shown in Fig. 1. The spatial and temporal correlations are set by a length scale and a decorrelation time; a third parameter is the variance of the stochastic forcing at each grid point. The resulting stochastic pattern at each grid point follows a Gaussian distribution with zero mean and a prescribed variance. Random seeds provide a unique spatial pattern for each ensemble member and forecast cycle, and the pattern evolves according to an autoregressive model in spectral space with a decorrelation time of 3 h. Stochastic forcing is applied to the rotational component of the horizontal wind and to potential temperature.
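A minimal sketch of such a pattern generator follows, assuming an illustrative grid size, wavenumber band, and unit variance (the operational settings are those in Table 3): spectral coefficients evolve as a first-order autoregressive process with a 3-h decorrelation time, with no power at the lowest wavenumbers.

```python
import numpy as np

rng = np.random.default_rng(4)
nx = ny = 64                    # illustrative grid, not the WRF domain size
dt = 75.0                       # model time step (s)
tau = 3.0 * 3600.0              # 3-h decorrelation time
alpha = np.exp(-dt / tau)       # AR(1) coefficient per time step

kx = np.fft.fftfreq(nx) * nx    # integer zonal wavenumbers
ky = np.fft.fftfreq(ny) * ny    # integer meridional wavenumbers
ktot = np.sqrt(kx[None, :] ** 2 + ky[:, None] ** 2)

# Band-limited forcing: no power at total wavenumbers 1-3, with an upper
# cutoff (both band edges are illustrative choices in this sketch)
amp = np.where((ktot > 3) & (ktot <= 20), 1.0, 0.0)

coeffs = np.zeros((ny, nx), dtype=complex)

def step(coeffs):
    """Advance the spectral AR(1) process by one model time step."""
    noise = rng.normal(size=(ny, nx)) + 1j * rng.normal(size=(ny, nx))
    return alpha * coeffs + np.sqrt(1.0 - alpha**2) * amp * noise

for _ in range(100):            # spin the pattern up toward stationarity
    coeffs = step(coeffs)

pattern = np.fft.ifft2(coeffs).real   # physical-space forcing pattern
pattern /= max(pattern.std(), 1e-12)  # normalize; the variance is prescribed
```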
Table 3. Stochastic perturbation parameter options.
3) Stochastically perturbed parameterization tendencies
SPPT ensemble forecasts are activated in a manner similar to those with SKEBS. SPPT is based on the notion that, especially with increasing numerical resolution, the equilibrium assumption no longer holds and the subgrid-scale state should be sampled rather than represented by the equilibrium mean. Consequently, SPPT multiplies the accumulated physical tendencies at each grid point and time step by a stochastic pattern generated in the same way as in SKEBS, but with different parameter settings. Since the physical tendencies tend to have vertical structure, the net perturbations added to the state do as well. The parameter settings used here (Table 3) differ somewhat from those of B14 in an effort to adjust to the smaller horizontal grid spacing. For example, while a temporal decorrelation time of 6 h was used at 45-km grid spacing, it is reduced to 1 h in this study, since simulations with longer decorrelation times were not stable during integration. Otherwise, the parameter choices were based on physical intuition without conducting tuning experiments.
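Applied to a full model column, the same idea looks roughly as follows; the array dimensions, tendency magnitudes, and the bounding of the pattern are assumptions of this sketch rather than details of the WRF implementation. Note how a single 2D pattern, broadcast in the vertical, yields net perturbations whose vertical structure comes from the physics tendencies themselves.

```python
import numpy as np

rng = np.random.default_rng(5)
nz, ny, nx = 40, 64, 64        # illustrative model dimensions
dt = 75.0                      # model time step (s)

# 2D stochastic pattern (same generator as in the SKEBS sketch, with
# different parameters); bounding the pattern is an assumption here.
pattern2d = np.clip(rng.normal(0.0, 0.5, (ny, nx)), -0.9, 0.9)

phys_tend = rng.normal(1.0e-4, 2.0e-5, (nz, ny, nx))  # accumulated physics tendencies
dyn_tend = rng.normal(0.0, 1.0e-5, (nz, ny, nx))      # resolved dynamics tendencies

# The single 2D pattern is broadcast in the vertical, so the net perturbation
# inherits the vertical structure of the physics tendencies themselves.
theta = np.full((nz, ny, nx), 300.0)
theta += dt * (dyn_tend + (1.0 + pattern2d[None, :, :]) * phys_tend)
```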
e. Forecast intercomparison
To assess the relative impact of PLBCs and SMESs on forecast error growth, ensemble forecasts are compared to the control forecast. Ensemble spread is computed at each grid point from forecast products (e.g., 500-hPa temperature) and then averaged over all grid points within the nest domain for each forecast hour. Differences between ensemble mean forecasts are averaged to assess systematic impacts. Both domain-averaged products and forecast-averaged products at each horizontal grid point are examined for temporal and spatial trends in ensemble forecast differences.
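In code form, these two diagnostics reduce to the following sketch (assuming fields stored as member × latitude × longitude arrays; the function names are illustrative):

```python
import numpy as np

def domain_mean_spread(forecasts):
    """Ensemble spread (std dev across members) at each grid point, then
    averaged over the nest domain for one field and forecast hour.
    forecasts: (n_members, ny, nx) array; the layout is an assumption."""
    return forecasts.std(axis=0, ddof=1).mean()

def systematic_mean_difference(experiment, control):
    """Domain-averaged difference between two ensemble mean forecasts,
    used to assess systematic impacts."""
    return (experiment.mean(axis=0) - control.mean(axis=0)).mean()
```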
f. Forecast verification
For high-resolution precipitation forecasts, the local nature of individual convective elements, averaged over many member forecasts, leads to an ensemble mean forecast with an apparent low bias for intensity (e.g., Mass et al. 2002; Roberts and Lean 2008). Techniques to improve the usability of ensemble mean precipitation forecasts through statistical approaches (Ebert 2001) have had limited success (Surcel et al. 2014). Thus, interpreting the value of ensemble forecasts for high-impact weather relies on synthesizing member forecast scenarios over a larger neighborhood and evaluating the probabilistic forecast of rare events (Schwartz et al. 2010; Clark et al. 2011; Gebhardt et al. 2011; Johnson and Wang 2012). Hence, precipitation forecasts from the convection-permitting ensemble are verified using a neighborhood ensemble verification technique (Schwartz et al. 2010; S14) over a subset of the 3-km grid spacing domain (i.e., the "verification region" in Fig. 2). The ensemble probability at a particular grid point is determined from the number of ensemble members exceeding a defined precipitation rate threshold divided by the ensemble size. Then, the neighborhood ensemble probability is computed by averaging the point-based probability over a specified radius of influence. The hourly gridded stage IV (ST4; Lin and Mitchell 2005) accumulated precipitation analyses are used as "truth." The ST4 analyses are first bilinearly interpolated onto the model grid, in lieu of interpolating the model grids onto the observation grid, to reduce computational burden. Tests indicate only small differences are introduced by this approach, and relative differences between CPEFS configurations are insensitive to the interpolation approach (more details available in S14). Precipitation forecast metrics presented include attributes statistics (Wilks 2006), the area under the receiver operating characteristic (ROC) curve (e.g., Mason and Graham 2002), and fractions skill scores (FSSs; Roberts and Lean 2008) to assess the relative skill of each CPEFS.
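The neighborhood ensemble probability itself is simple to compute. The sketch below assumes hourly precipitation stored as a member × latitude × longitude array on the 3-km grid and uses a circular footprint; the function and argument names are illustrative.

```python
import numpy as np
from scipy import ndimage

def neighborhood_ensemble_probability(precip, threshold, radius_km, dx_km=3.0):
    """Neighborhood ensemble probability (Schwartz et al. 2010).
    precip: (n_members, ny, nx) hourly precipitation on the 3-km grid;
    the array layout and grid spacing are assumptions of this sketch."""
    # Point probability: fraction of members exceeding the threshold
    point_prob = (precip > threshold).mean(axis=0)

    # Circular footprint for the requested radius of influence
    r = int(round(radius_km / dx_km))
    y, x = np.ogrid[-r:r + 1, -r:r + 1]
    disk = (x**2 + y**2 <= r**2).astype(float)
    disk /= disk.sum()

    # Average the point probabilities over the neighborhood
    return ndimage.convolve(point_prob, disk, mode="nearest")
```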

Fig. 2. Surface elevation of the inner nest domain (color fill), location of the precipitation verification region (purple box, labeled "VERIF"), and the locations of operational soundings used for forecast verification (marked with stars and station labels).
Forecasts are also verified against a set of operational radiosonde observations of temperature and wind within the 3-km domain (see sounding locations in Fig. 2). Forecast soundings are extracted from the ensembles at the time and model grid point closest to the observed sounding times and locations. Verification statistics against soundings assume standard observation errors drawn from the National Centers for Environmental Prediction (NCEP) statistics, except that the height of maximum error is adjusted upward to be consistent with the typical height of the tropopause during late spring. Both the ensemble mean and the average of the individual member forecasts are verified deterministically. To assess the adequacy of ensemble forecast spread, the total spread (square root of the sum of ensemble variance and observation error variance) is also compared against the ensemble mean forecast root-mean-square (RMS) error.
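The spread-error comparison used here reduces to a single ratio per variable and lead time; the following sketch assumes forecasts already interpolated to the observation locations (array layout and names are illustrative):

```python
import numpy as np

def spread_error_ratio(ens, obs, obs_err_var):
    """Ratio of total spread to ensemble-mean RMS error at sounding sites.
    ens: (n_members, n_obs) forecasts at observation locations; obs: (n_obs,)
    radiosonde values. Array layout is an assumption of this sketch."""
    rmse = np.sqrt(np.mean((ens.mean(axis=0) - obs) ** 2))
    # Total spread: sqrt of ensemble variance plus observation error variance
    total_spread = np.sqrt(np.mean(ens.var(axis=0, ddof=1) + obs_err_var))
    return total_spread / rmse   # values near 1 indicate well-calibrated spread
```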
For verification versus both ST4 and radiosonde observations, a bootstrap resampling method (e.g., Hamill 1999) with 1000 random draws is used to estimate bounds of the 90% confidence interval (CI) to discriminate skill among ensemble configurations. Resamples were drawn assuming each forecast was independent. This technique is suitable for computing bootstrap CIs for metrics at one particular time (e.g., CIs for 24-h forecast RMSEs). But, for time-aggregated measures (e.g., forecast skill aggregated over 18–36-h forecasts), there is the potential for errors across consecutive forecast hours to be temporally correlated (Hamill 1999), such that temporal correlation should be considered when drawing resamples to compute bootstrap CIs. In recognition of this, a moving block bootstrapping approach (e.g., Wilks 1997) was also used to calculate CIs for time-aggregated measures using different block lengths. However, this approach yielded results that were nearly unchanged compared to assuming each sample was independent, and owing to the challenge of choosing an appropriate block length, the bootstrap CIs shown for metrics aggregated over several forecast hours were computed without a moving block.
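A percentile bootstrap of this kind takes only a few lines of code. The sketch below implements the independent-forecast resampling described above (the function name and interface are illustrative; a moving-block variant would resample contiguous blocks of forecast hours instead of individual forecasts):

```python
import numpy as np

def bootstrap_ci(scores, n_draws=1000, ci=0.90, seed=None):
    """Percentile bootstrap CI for the mean of per-forecast scores,
    treating each forecast as independent (as described above)."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores)
    n = scores.size
    means = np.array([scores[rng.integers(0, n, n)].mean()
                      for _ in range(n_draws)])
    lo, hi = np.percentile(means, [100 * (1 - ci) / 2, 100 * (1 + ci) / 2])
    return lo, hi
```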
3. Results
The primary objectives of this study are to 1) improve the dispersion characteristics of the CPEFS while achieving similar or improved 2) forecast skill and 3) forecast bias. Following a review of the control ensemble performance with only initial condition uncertainty, the relative contributions from the PLBC and SMES ensembles are assessed. Since the analysis domain is much larger than the area of interest for the forecast, the PLBCs are not expected to play a large role during the early forecast window. The inclusion of either SMES is expected to improve forecast reliability by reducing the occurrence of correlated model error in the ensemble-mean forecast and, overall, to improve the probabilistic performance of the CPEFS.
a. Control ensemble forecast
Characteristics of the control ensemble forecast skill are discussed in detail by S14 and are briefly summarized here. Control ensemble forecasts display skill across a broad range of rainfall intensities, yet overpredict precipitation (a wet bias) and are underdispersive. Individual member forecasts initialized from the ensemble analysis are indistinguishable from each other, on average, supporting the selection of an arbitrary subset of initial states from the full 50-member ensemble analysis. S14 find that the reliability and resolution of a 20–30-member subensemble are usually similar to the skill of the full 50-member ensemble in the control configuration for precipitation forecasts. S14 also find that the control CPEFS is skillful at discriminating both intense and light precipitation events, and that the most skillful and valuable guidance is achieved by evaluating neighborhood probabilistic forecasts. Based on these findings, this study compares 30-member ensembles using neighborhood ensemble probabilistic verification methods (see section 2f) to assess the relative skill of the perturbed ensemble forecasts against the control.
b. PLBC and stochastic model error representation impacts relative to control forecast
The intent of adding forecast perturbations (in addition to initial condition perturbations) is to increase the dispersion of the control ensemble, and this is indeed achieved by all three methods. Figures 3–5 summarize domain-averaged impacts on forecast variables from including forecast perturbations as a function of lead time. Spread increases as the forecast progresses, and more quickly in the forecasts with a SMES than with PLBCs. Accompanying the increase in spread are shifts in the mean state, or bias, relative to the control forecast, especially for the SMES ensembles (forecast bias relative to the true atmospheric state is discussed in section 3c). The greatest (least) forecast-to-forecast variability is found with the SPPT (PLBC) ensemble forecasts. Although SKEBS and SPPT generate random patterns that evolve at each grid point as a Gaussian noise process with zero mean, nonlinearities in the model (SKEBS) and the multiplicative nature of the noise (SPPT) can lead to a change of the mean state (e.g., Berner et al. 2012). This can result in an increase in bias relative to that of the control forecast. A discussion of a few variable-specific characteristics follows.

Fig. 3. (a),(c),(e) Difference in the ensemble mean for temperature (°C) forecasts with SKEBS (red), SPPT (green), or PLBC (mustard) from the control forecast ensemble mean, with the average difference over all forecasts overlain in heavy black lines with long dashes (SKEBS: control), solid lines (SPPT: control), or short dashes (PLBC: control). (b),(d),(f) As in (a),(c),(e), but for the ensemble spread of temperature, and inclusion of the control forecast spread (blue), and the average ensemble spread difference from the control forecast is shown in heavy black lines. Heights shown are at (a),(b) 300; (c),(d) 500; and (e),(f) 850 hPa. Each color line represents an individual 30-member ensemble forecast, run daily, initialized from 25 May to 25 Jun 2012.

Fig. 4. As in Fig. 3, but for the zonal, or U, component of the model grid horizontal winds (m s−1).

Fig. 5. As in Fig. 3, but for (a),(b) precipitable water (mm); (c),(d) 2-m water vapor (g kg−1); and (e),(f) 2-m temperature (°C).
For temperature (Fig. 3), spread growth is fastest with the SPPT ensemble in the mid- and upper troposphere, yet is similar in magnitude for both SMESs in the lower troposphere. Mean temperature differences compared to the control ensemble show that the SPPT ensemble becomes relatively more statically stable during the forecast, with warming aloft and cooling below (Fig. 3). Mean temperature differences and spread for the SPPT ensemble grow more rapidly during the diurnal peak in convection. The SKEBS ensemble also shows modest warming in the mid- and upper troposphere, while the PLBC ensemble is fairly neutral for temperature relative to the control.
Impacts on wind forecasts (Fig. 4) are more similar between the two SMES ensembles, except for the SPPT ensemble in the midtroposphere. Again, the SPPT ensemble shows the largest run-to-run variability in mean difference from the control ensemble, but averaged over many forecasts the impact is largely neutral. The PLBC ensemble consistently leads to slightly slower tropospheric zonal wind forecasts relative to the control ensemble, with a damped periodic oscillation beyond 3 h as the new perturbed lateral boundary state is enforced. Results are similar for the meridional wind component (not shown), except that no difference in systematic bias is found.
Impacts from forecast perturbations on moisture and surface variables (Fig. 5) again show a larger impact from the SMES ensembles. The SKEBS ensemble leads to the largest increase in spread in tropospheric moisture. At the surface, the impacts of the two SMESs are similar. The SMESs, and especially the SPPT ensemble, lead to a drift in the surface thermodynamic state toward cooler and drier conditions. The SPPT ensemble continues to show the greatest variability in forecast impact from run to run.
Systematic bias varies not only by forecast hour but also spatially across the forecast domain. Figures 6–9 highlight some of the systematic differences in 24-h forecasts relative to the control ensemble. For example, the 700-hPa mean state temperature from the control ensemble (Fig. 6d) features a thermal ridge along the Rockies, an anticyclone in central Texas, and a trough toward the East Coast. Relative to the control mean state, SKEBS (Fig. 6a) reduces the amplitude of the ridge–trough pattern and leads to a reduced southerly wind component in the high and central plains. The SPPT ensemble (Fig. 6b) cools temperatures fairly uniformly across the model domain and reduces the strength of the mean anticyclonic flow pattern across the eastern two-thirds of the domain. The PLBC ensemble (Fig. 6c) results in a westward shift of the Rockies-centered thermal ridge and a slight decrease in the westerly wind component across the Intermountain West. At the surface, both SMES ensembles lead to generally cooler and drier conditions across most of the domain over land, especially SPPT (Figs. 7 and 8). Collectively, this leads to as much as a 10% decline in conditional instability across portions of the central plains (not shown). A closer examination finds that the forecasts with SMESs, especially SPPT, produce more frequent cloudiness across the eastern half of the domain (not shown).

Fig. 6. Average 24-h ensemble mean temperature (°C) and wind vector forecast at 700 hPa for (d) the control forecast, and mean difference from the control for the (a) SKEBS, (b) SPPT, and (c) PLBC ensemble forecasts. Forecasts are averaged over the period initialized from 25 May to 25 Jun 2012. For winds in (d), the half pennant is 5 kt (1 kt = 0.5144 m s−1) and the full pennant is 10 kt. Representative wind vectors for (a)–(c) are shown at bottom left in (a).

Fig. 7. As in Fig. 6, but for 2-m temperature (°C) and 10-m wind vectors (m s−1).

Fig. 8. As in Fig. 6, but for precipitable water (mm) and 10-m wind vectors (m s−1).

Fig. 9. As in Fig. 6, but for accumulated precipitation (mm) through 24 h, and excludes wind vectors.
Consistent with the above-noted changes are displacements in where and how much rain is produced (Fig. 9). Both SMESs increase rainfall, particularly in the southeastern U.S. portion of the domain where surface moisture is more abundant, even though both are drier than the control (e.g., Fig. 8). The drying and cooling, particularly along the immediate Gulf Coast, weaken the sea-breeze circulation (Fig. 7), and precipitation amounts are reduced in these areas for the SMES forecasts. Increased precipitation in the simulations with a SMES could also drive the increase in mean static stability and the reduced atmospheric moisture content. The SPPT forecast perturbations uniquely lead to an increase in the frequency of more intense storms (not shown), which warrants future investigation. Simulations with PLBC show little systematic impact on precipitation forecasts.
Overall, ensemble forecasts with PLBC show the smallest increase in spread relative to the control ensemble forecast and are the slowest to show impact during the forecast. This is not surprising given the large spatial separation between the model lateral boundaries and the forecast verification region. The SPPT contributes to spread growth more quickly than the other methods, but shows greater variability in impact among daily ensemble forecasts. Systematic bias is modified by all perturbation methods, especially SPPT, while the PLBCs produce the least impact on the thermodynamic state relative to the control ensemble forecasts. The SPPT causes the largest change in temperature, particularly during the following diurnal heating maximum, with a trend toward a more statically stable mean state. Application of SKEBS and SPPT increases precipitation occurrence while reducing mean tropospheric humidity and cooling and drying the near-surface state, with a net decrease in conditional instability.
c. Ensemble forecast skill comparison
Precipitation forecasts from all of the ensembles display systematic bias, with greater areal coverage across a range of rainfall intensities than in the ST4 analyses (Fig. 10). Additionally, an early phase offset is noted in the diurnal cycle, with a substantial overshoot at the most intense rain rates during the first few hours of the forecast. Relative to the control ensemble, areal coverage in the ensembles with forecast perturbations is quite similar, although the SPPT ensemble again shows greater variability from forecast to forecast. The largest differences are noted during the diurnal minimum at the lower rain rates, where the simulations with a SMES produce greater areal coverage, although these differences are not statistically significant.

Fig. 10. Areal coverage at rain-rate thresholds of (a) 0.25, (b) 1.0, and (c) 10.0 mm h−1 for stage IV precipitation (heavy black line) and ensemble member forecasts for the control (blue), PLBC (mustard), SKEBS (red), and SPPT (green) as a function of forecast hour aggregated over all forecasts initialized from 25 May to 25 Jun 2012. To improve visibility of differences in the plot, only every third forecast member is plotted from each experiment (e.g., members 1, 4, 7, …).
Increased spread coincides with greater variability in rainfall forecasts, which increases ensemble forecast reliability, particularly at longer lead times (Fig. 11). Rank histograms indicate considerable improvement in ensemble spread relative to the control, as well as a wet bias with the SMESs (not shown). Consistent with the rates of spread growth noted in the forecast variables, the faster spread growth with the SPPT ensemble also contributes to the largest improvement in forecast reliability during the early window of the forecast, while the PLBCs have very little impact. During the later forecast window, both SMES ensembles lead to significant improvements in forecast reliability across a range of rainfall intensities. The PLBC also improves ensemble reliability during the later period, consistent with other metrics. Using a neighborhood larger than the 50 km presented here moves the curves closer to perfect reliability, while the relative differences remain about the same (not shown).

Fig. 11. Attributes diagrams at rain-rate thresholds of (a),(b) 0.25; (c),(d) 1.0; and (e),(f) 10.0 mm h−1 averaged over all forecast hours of (a),(c),(e) 1–12 and (b),(d),(f) 18–36 for the control (blue), PLBC (mustard), SKEBS (red), and SPPT (green) ensemble forecasts initialized from 25 May to 25 Jun 2012, using a 50-km neighborhood. The diagonal line indicates perfect reliability, the observed frequency is shown as a solid black line from the observed relative frequency axis, and the dashed line demarcates "no skill" relative to climatology. Numbers above each forecast probability bin are the frequency of the control forecast hits in each bin; −999 means the control had fewer than 1000 occurrences in that bin. Error bars indicate the bounds of the 90% confidence intervals. Where the control curve is not seen, it is behind the PLBC curve. Colored markers indicate where ensemble forecast configurations have statistically significant differences from the control ensemble forecast.
Forecast skill for the ensemble configurations presents mixed results. Skill is assessed using both the area under the ROC curve (AUC; Fig. 12) and the FSS (Fig. 13) across a range of rain accumulation thresholds. Considering AUC, all ensemble configurations show skill across a wide range of rain rates, although performance diminishes rapidly above 1 mm h−1. At lower rain-rate thresholds, forecasts with a SMES outperform the control ensemble forecast by small but statistically significant differences, particularly at longer lead times. At higher thresholds, performance at early lead times is similar among configurations, but the SMES ensembles show diminished performance at high rain rates. Close examination of individual ROC curves shows that the differences in performance with perturbed ensemble forecasts owe largely to improved detection at low forecast probabilities (i.e., fewer missed events relative to the control ensemble), while the occurrence of high forecast confidence in intense rain events is also reduced. The FSS (Fig. 13) indicates the ensemble forecasts are of similar value during the first period, while the SMES ensembles degrade the forecast at higher rain rates, particularly during the later forecast window for the SPPT ensemble. This is also shown by time series of the AUC for several rain-rate thresholds (Figs. 12c–e). These differences were fairly insensitive to the size of the neighborhood (not shown). The Brier skill score (Brier 1950) is also evaluated, and the results corroborate the FSS (not shown).
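For reference, the FSS compares neighborhood forecast probabilities with neighborhood observed fractions. The sketch below uses a square neighborhood via a uniform filter for brevity, whereas the verification here uses a circular radius of influence; the array layout and names are illustrative.

```python
import numpy as np
from scipy import ndimage

def fractions_skill_score(fcst_prob, obs_exceed, half_width):
    """FSS (Roberts and Lean 2008) comparing neighborhood forecast
    probabilities against neighborhood observed fractions.
    fcst_prob: (ny, nx) point ensemble probabilities of threshold exceedance;
    obs_exceed: (ny, nx) binary observed exceedance. Layout is assumed."""
    size = 2 * half_width + 1
    pf = ndimage.uniform_filter(fcst_prob, size=size, mode="nearest")
    po = ndimage.uniform_filter(obs_exceed.astype(float), size=size, mode="nearest")
    mse = np.mean((pf - po) ** 2)
    mse_ref = np.mean(pf ** 2) + np.mean(po ** 2)   # no-skill reference
    return 1.0 - mse / max(mse_ref, 1e-12)
```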

Fig. 12. Area under the relative operating characteristic curve (AUC) as a function of rain-rate threshold for the (a) early (forecast hours 1–12) and (b) late (forecast hours 18–36) periods for the control (blue), PLBC (mustard), SKEBS (red), and SPPT (green) aggregated over ensemble forecasts initialized from 25 May to 25 Jun 2012, using a 50-km neighborhood. Error bars indicate the bounds of the 90% confidence intervals. Where the control curve is not seen, it is behind the PLBC curve. Colored markers indicate where ensemble forecast configurations have statistically significant differences from the control ensemble forecast. AUC as a function of forecast hour for rain-rate thresholds of (c) 0.25, (d) 1.0, and (e) 10.0 mm h−1 aggregated over all forecasts initialized from 25 May to 25 Jun 2012, using a 50-km neighborhood.

Fig. 13. Fractions skill score as a function of radius of influence for probabilistic forecasts for the (a)–(c) early (forecast hours 1–12) and (d)–(f) late (forecast hours 18–36) periods from the control (blue), PLBC (mustard), SKEBS (red), and SPPT (green) averaged over ensemble forecasts initialized from 25 May to 25 Jun 2012, for rain-rate thresholds of (a),(d) 0.25; (b),(e) 1.0; and (c),(f) 10.0 mm h−1. Error bars indicate the bounds of the 90% confidence intervals. Where the control curve is not seen, it is behind the PLBC curve. Colored markers indicate where ensemble forecast configurations have statistically significant differences from the control ensemble forecast.
To further understand the relative forecast value, and to assess whether the bias shifts from the PLBC and SMES ensembles lead to forecast improvements, forecasts are also verified against a set of soundings within the forecast domain (see sounding locations in Fig. 2), which cover a broader region of the forecast domain than the precipitation verification region. Ensemble-mean 24-h forecasts verified against radiosonde temperature and zonal wind observations (Fig. 14) generally agree with the precipitation metrics. Perturbed ensembles show increased spread relative to the control forecast, but with relatively little impact on forecast skill, except for decreased skill and increased bias in the SPPT ensemble temperature forecasts. Forecast bias for winds is not significantly impacted by the perturbation methods relative to the control ensemble, aside from a slightly improved bias with the PLBC ensemble (Fig. 15). Ratios of total spread to RMS error suggest that the ensembles are overdispersive, particularly for wind, although much of the total spread contribution comes from assumed representativeness errors, which may be poorly calibrated for the grid spacing of this forecast system.

Station-averaged ensemble mean 24-h forecast verification for (a) temperature and (b) the zonal wind component averaged over ensemble forecasts initialized from 25 May to 25 Jun 2012. Shown are RMS error (solid), total spread (dot–dash), and bias (observation − forecast; dash; temperature only) for the control (blue), PLBC (mustard), SKEBS (red), and SPPT (green). The assumed observation error from the NCEP statistics is overlain (gray). Also shown is the ratio of total spread (square root of the sum of ensemble and observation variance) to RMS error for (c) temperature and (d) zonal wind.

Station-averaged ensemble member forecast verification for (a),(c) temperature and (b),(d) zonal wind for (a),(b) RMS error and (c),(d) bias for the control (blue), PLBC (mustard), SKEBS (red), and SPPT (green) averaged over ensemble forecasts initialized from 25 May to 25 Jun 2012. Error bars indicate the bounds of the 90% confidence intervals.
As expected, adding the PLBC or an SMES reduces the deterministic skill of individual member forecasts, on average, while the ensemble mean forecast remains about the same distance from the truth. For example, computing the RMS error for individual member forecasts and then averaging, the control ensemble indeed outperforms the perturbed ensembles (Fig. 15). As previously noted, a “perfect” model error scheme would add purely random error, and the schemes tested here largely do, as shown by the similar performance in the spatially averaged and ensemble-mean skill metrics (Figs. 12–14). Still, the SMES ensembles introduce systematic bias with regionally varying behavior relative to ensemble forecasts without model error or PLBCs; for example, the SMES ensembles increase the temperature bias at most levels, particularly SPPT. The ensemble mean forecast (Fig. 14), which averages out small spatial scales where the ensemble lacks skill, demonstrates that the apparent skill of the control ensemble relative to the other configurations is likely due to correlated forecast error. Finally, S14 found that (control) ensembles with 20 members gave probabilistic forecasts with skill comparable, on average, to a 30-member ensemble. A similar investigation of the PLBC and SMES ensembles found that mean forecast skill improved from 20- to 30-member ensembles. Thus, maximizing the value of the SMES ensemble forecasts required more members than for the control ensemble.
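The role of correlated forecast error in this comparison can be illustrated with a toy example. The sketch below is not from this study; the error magnitudes are arbitrary assumptions chosen so that the members have similar total error variance in both cases.

```python
# Toy illustration: ensemble averaging removes only the uncorrelated part
# of member error. Members that share most of their error look equally
# skillful individually, but their ensemble mean gains little.
import numpy as np

rng = np.random.default_rng(0)
truth = np.zeros(10_000)  # work in error space; the "truth" is zero
n_members = 30

for label, shared_std, indep_std in [("mostly shared error", 0.9, 0.44),
                                     ("mostly independent error", 0.3, 0.95)]:
    shared = rng.normal(0.0, shared_std, truth.size)  # common to all members
    members = shared + rng.normal(0.0, indep_std, (n_members, truth.size))
    member_rmse = np.sqrt(((members - truth) ** 2).mean(axis=1)).mean()
    mean_rmse = np.sqrt(((members.mean(axis=0) - truth) ** 2).mean())
    print(f"{label}: member-average RMSE = {member_rmse:.2f}, "
          f"ensemble-mean RMSE = {mean_rmse:.2f}")
```

In both cases the average member RMSE is near 1.0, but the ensemble-mean RMSE drops to roughly 0.35 only when the member errors are mostly independent, consistent with the interpretation that the control ensemble's apparent member skill reflects correlated error.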
4. Discussion
For this study, perturbed initial conditions from an EnKF analysis alone lead to skillful but underdispersive ensemble forecasts. Including additional perturbations during the forecast improves forecast reliability. The PLBC ensemble produces limited additional error growth and only modest improvement in spread, but has the least degradation in systematic bias. The SPPT ensemble shows the fastest perturbation growth and the largest improvement in forecast reliability, but degrades the skill of the precipitation forecasts and introduces the largest bias. The SKEBS ensemble offers a compromise: larger spread growth than the PLBC ensemble with only a modest increase in systematic bias.
Ensemble forecast studies for deep convection remain limited. S14 recently investigated this topic using a continuously cycled ensemble data assimilation system to provide initial condition uncertainty for a CPEFS. One recent study, Bouttier et al. (2012, hereafter B12), employed SPPT in a CPEFS. They also found that the scheme introduced systematic bias, but with differing manifestations: both the B12 CPEFS and this study show decreasing water vapor concentrations, although precipitation decreases in B12 and increases in this study. Both systems indicate that the spread contribution from SPPT was still increasing at 36 h into the forecast, and both show improved forecast reliability. The main implementation differences are parameter choices, a supersaturation check in B12, and B12's tapering of the perturbations at the upper and lower boundaries; however, the differing results may have more to do with their use of a different forecast model than with the SPPT settings.
In comparison to experiments with the same model at coarser resolution, the impact of SPPT is much stronger at a horizontal grid spacing of 3 km. In particular, in this study the spread and RMS error are markedly larger for the simulations with SPPT than for those with SKEBS, whereas B14 found the opposite. Tuning was necessary to adapt the model error schemes to cloud-resolving resolution and might be responsible for some of the differences; more in-depth analysis is planned but is beyond the scope of the current project.
For forecasters examining individual member forecasts from the ensemble, or single deterministic forecasts with an SMES, the random error introduced by the scheme may have undesirable side effects. For example, Fig. 16 demonstrates the practical impact of perturbations on precipitation forecasts. For this 24-h forecast, ensemble probabilities of accumulated precipitation exceeding 0.1 mm during the previous hour are shown, overlain with the ST4 precipitation exceeding 0.1 mm. The areal extent of any forecast probability at this threshold is much greater than observed, with scattered storms occurring in parts of extreme southern Kansas, western Oklahoma, and western Arkansas. Perturbations generally lead to lower probabilities of the precipitation forecast exceeding this threshold over northwest Oklahoma, as well as parts of Texas, while extending the areal coverage of lower probabilities to include areas where precipitation is observed but not forecast by any member of the control ensemble. Notably, for the SMES ensembles, the higher-probability forecast areas are significantly diminished, while the areal coverage of low probabilities increases markedly. For this particular example, the SKEBS forecast appears to provide the best compromise, maintaining higher probabilities over the observed event while also improving coverage over areas where the control forecast was overconfident and failed to trigger precipitation. Since the discrimination of high-intensity precipitation events declines slightly for the SMES ensembles on average (Fig. 12), future efforts will seek to better understand whether this decrease in skill can be tied to changes in the organization of precipitation systems by the imposed SMESs. Examination of other forecast events suggests that this behavior may be a common pattern.
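For context, the ensemble probabilities shown in Fig. 16 amount to a gridpoint exceedance frequency over the members. A minimal sketch follows; the array shapes and names are illustrative assumptions, and no neighborhood smoothing or calibration is applied here.

```python
# Minimal sketch: gridpoint ensemble probability of hourly accumulated
# precipitation meeting or exceeding a threshold (here 0.1 mm).
import numpy as np

def exceedance_probability(precip_members, threshold=0.1):
    """precip_members: (n_members, ny, nx) hourly accumulations (mm).
    Returns the fraction of members at or above the threshold."""
    return (precip_members >= threshold).mean(axis=0)
```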

Ensemble forecast probabilities of accumulated precipitation equal to or exceeding 0.1 mm during the previous hour (color fill), and 10-m wind barbs for the 24-h forecast initialized from 0000 UTC 29 May 2012 for the (a) SKEBS, (b) SPPT, (c) PLBC, and (d) the control forecast. Plots are overlain with the ST4 accumulated precipitation contour of 0.1 mm during the previous hour valid at 0000 UTC 30 May 2012 (bright green). For winds, a half pennant is 5 kt and a full pennant is 10 kt.
The PLBC and SMES ensembles shown here improve aspects of forecast performance relative to the control ensemble, although much room remains for further improvement, particularly for high-intensity rain events. Improvements in the forecast model are clearly warranted: significant systematic bias is found in the forecasts presented here, which undoubtedly limits forecast skill, and the SMESs in this configuration shift the ensemble bias further from the observed state. Further tuning of the stochastic parameters might lead to substantial improvement in performance, particularly for SPPT (e.g., Buizza et al. 1999). Combinations of the PLBC and SMES approaches could outperform their individual contributions. Approaches that improve the dispersion characteristics of the ensemble through posterior inflation of the ensemble initial conditions (e.g., Whitaker and Hamill 2012) might also prove beneficial.
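As one concrete example of such a posterior inflation method, the sketch below implements relaxation to prior spread, one of the approaches evaluated by Whitaker and Hamill (2012). It is illustrative only; the relaxation factor alpha and the array layout are assumptions, not settings from this study.

```python
# Minimal sketch of relaxation-to-prior-spread inflation: rescale the
# analysis (posterior) perturbations so the ensemble spread relaxes back
# toward the prior spread at each state element.
import numpy as np

def rtps_inflation(prior, posterior, alpha=0.5):
    """prior, posterior: (n_members, n_state) ensembles at analysis time.
    alpha: relaxation factor in [0, 1] (illustrative default)."""
    post_mean = posterior.mean(axis=0)
    sd_prior = prior.std(axis=0, ddof=1)
    sd_post = posterior.std(axis=0, ddof=1)
    # Multiplicative factor applied to the posterior perturbations.
    factor = alpha * (sd_prior - sd_post) / np.maximum(sd_post, 1e-12) + 1.0
    return post_mean + (posterior - post_mean) * factor
```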
An underlying issue exposed during this study was the challenge of characterizing the overall dispersion of the ensemble forecasts. For some verification metrics, such as the reliability of precipitation forecasts, the ensemble forecasts remain insufficiently broad even after applying SMESs (e.g., Fig. 11). Rank histograms of precipitation forecasts indicate underdispersive forecasts for all configurations but SPPT, along with a wet bias (not shown). More appropriate treatment of observation errors could change this picture by better sampling the expected errors in the observation reports (e.g., Hamill 2001), but no established error model currently exists for hourly stage IV precipitation analysis products. Meanwhile, verification against soundings suggests overdispersion, particularly for zonal wind (e.g., Fig. 14). While established observation error statistics exist for conventional observations, the representativeness error component should vary with the model grid spacing, and limited guidance exists for making appropriate adjustments (e.g., Hamill and Whitaker 2005). These are important topics in ensemble forecast verification that should be addressed in future work.
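For reference, a minimal sketch of the rank-histogram computation follows, with observation error handled by perturbing the members before ranking, in the spirit of Hamill (2001). The Gaussian error model, the names, and the omission of tie handling (relevant for zero precipitation amounts) are simplifying assumptions.

```python
# Minimal rank-histogram sketch: count where each observation falls within
# the ordered ensemble member values, optionally after adding draws from
# an assumed observation-error distribution to the members.
import numpy as np

def rank_histogram(forecasts, obs, obs_err_std=None, rng=None):
    """forecasts: (n_members, n_obs); obs: (n_obs,);
    obs_err_std: assumed observation-error std dev (scalar or (n_obs,))."""
    if rng is None:
        rng = np.random.default_rng()
    if obs_err_std is not None:
        forecasts = forecasts + rng.normal(0.0, obs_err_std, forecasts.shape)
    ranks = (forecasts < obs).sum(axis=0)  # rank of obs among members, 0..n
    return np.bincount(ranks, minlength=forecasts.shape[0] + 1)
```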
Notably, the limited response of the ensemble forecasts to PLBCs found in this study may reflect an inadequate perturbation approach. Torn (2010) used a method similar to that employed here, but with perturbations progressively inflated with increasing lead time. Alternatively, drawing lateral boundaries from a global ensemble forecast system might have increased the forecast sensitivity to PLBCs, but this remains impractical for real-time forecast applications owing to the limited availability of global ensemble forecasts of sufficient resolution, timeliness, and ensemble size (e.g., Torn et al. 2006). Because the outer-domain lateral boundary is spatially far from the inner-nest domain where forecast verification is conducted, its influence should be reduced in this study; nevertheless, alternate lateral boundary perturbation approaches warrant future investigation.
5. Summary
Ensemble forecasts at convection-permitting horizontal grid spacing provide an opportunity to improve predictions of deep convection. Current state-of-the-art convection-permitting ensemble forecast systems (CPEFSs) are persistently underdispersive, particularly for precipitation, and this study explores options to improve their performance. Four sets of 30-member convection-permitting ensemble forecasts over a 32-day period in the spring of 2012 are assessed. The ensemble forecasts draw initial conditions from a continuously cycled ensemble data assimilation system to represent initial-state uncertainty, and are evaluated against stage IV accumulated precipitation analyses and radiosonde observations. Ensemble forecasts are also compared with a control ensemble forecast to assess the relative impact of adding perturbations during the forecast; the control ensemble forecast was skillful but underdispersive for precipitation. To improve the forecast error growth of the ensemble forecasts, the control ensemble was supplemented with 1) perturbed lateral boundary conditions (PLBC), or a model error representation scheme, using either 2) the stochastic kinetic energy backscatter scheme (SKEBS) or 3) the stochastically perturbed parameterization tendencies (SPPT) scheme.
Spread is increased by all perturbation techniques, particularly at longer forecast lead times. For this forecast system, the SPPT ensemble produces the largest growth in spread for most state variables, followed by the SKEBS and then the PLBC ensemble. The additional spread improves ensemble forecast reliability and some skill metrics, but on average reduces the forecast value of individual ensemble members relative to the control forecast. As such, caution should be exercised in using an SMES for deterministic guidance, since the stochastic schemes tested here increased the distance from the truth, on average, for individual member forecasts. Thus, to fully leverage more reliable ensemble forecasts, less emphasis should be placed on examining individual member forecast scenarios and more on probabilistic evaluation. This may necessitate the development of more sophisticated probabilistic guidance on aspects of convective forecasts, such as the mode of convection, that are typically assessed by examining many individual forecasts.
Stochastic model error schemes (SMESs) are shown to introduce systematic bias with regionally varying behavior relative to forecasts without an SMES or PLBC. For this study, the SKEBS model error approach provides the best balance, improving spread with only a modest change in forecast model bias. It is also found that the greater dispersion introduced by the SMES ensembles may require a larger forecast ensemble size (~30 members) than the control (~20 members) to maximize probabilistic forecast skill for precipitation events, such that greater computational resources are needed to realize the full potential of SMES ensemble forecasts.
For precipitation forecasts, a wet bias is evident in all configurations. The control forecast is underdispersive at all lead times considered, but reliability is improved by the model error schemes, particularly at longer lead times. All forecast systems remain somewhat overconfident, with forecast probabilities generally greater than observed frequencies, particularly at high rain-rate thresholds. For verification against soundings, the ensemble is overdispersive for some observation types, though this study highlights the need to more closely examine the observation error assumptions.
Acknowledgments
We would like to acknowledge high-performance computing support from Yellowstone (ark:/85065/d7wd3xhc) provided by NCAR’s Computational and Information Systems Laboratory, sponsored by the National Science Foundation. So-Young Ha participated in helpful discussions related to this project. Three anonymous reviewers provided helpful suggestions that led to improvements in this study.
REFERENCES
Aksoy, A., D. C. Dowell, and C. Snyder, 2010: A multicase comparative assessment of the ensemble Kalman filter for assimilation of radar observations. Part II: Short-range ensemble forecasts. Mon. Wea. Rev., 138, 1273–1292, doi:10.1175/2009MWR3086.1.
Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903, doi:10.1175/1520-0493(2001)129<2884:AEAKFF>2.0.CO;2.
Anderson, J. L., 2009: Spatially and temporally varying adaptive covariance inflation for ensemble filters. Tellus, 61A, 72–83, doi:10.1111/j.1600-0870.2008.00361.x.
Anderson, J. L., T. Hoar, K. Raeder, H. Liu, N. Collins, R. Torn, and A. Arellano, 2009: The Data Assimilation Research Testbed: A community facility. Bull. Amer. Meteor. Soc., 90, 1283–1296, doi:10.1175/2009BAMS2618.1.
Barth, M., W. Brune, C. Cantrell, and S. Rutledge, 2012: Deep Convective Clouds and Chemistry (DC3) operations plan. NASA/NOAA/NSF, 106 pp. [Available online at http://www.eol.ucar.edu/projects/dc3/documents/DC3_Operations_Plan_28_Apr_2012.pdf.]
Berner, J., S.-Y. Ha, J. P. Hacker, A. Fournier, and C. Snyder, 2011: Model uncertainty in a mesoscale ensemble prediction system: Stochastic versus multiphysics representations. Mon. Wea. Rev., 139, 1972–1995, doi:10.1175/2010MWR3595.1.
Berner, J., T. Jung, and T. N. Palmer, 2012: Systematic model error: The impact of increased horizontal resolution versus improved stochastic and deterministic parameterizations. J. Climate, 25, 4946–4962, doi:10.1175/JCLI-D-11-00297.1.
Bouttier, F., B. Vié, O. Nuissier, and L. Raynaud, 2012: Impact of stochastic physics in a convection-permitting ensemble. Mon. Wea. Rev., 140, 3706–3721, doi:10.1175/MWR-D-12-00031.1.
Brier, G. W., 1950: Verification of forecasts expressed in terms of probability. Mon. Wea. Rev., 78, 1–3, doi:10.1175/1520-0493(1950)078<0001:VOFEIT>2.0.CO;2.
Buizza, R., M. Miller, and T. N. Palmer, 1999: Stochastic representation of model uncertainties in the ECMWF Ensemble Prediction System. Quart. J. Roy. Meteor. Soc., 125, 2887–2908, doi:10.1002/qj.49712556006.
Buizza, R., P. L. Houtekamer, G. Pellerin, Z. Toth, Y. Zhu, and M. Wei, 2005: A comparison of the ECMWF, MSC, and NCEP Global Ensemble Prediction Systems. Mon. Wea. Rev., 133, 1076–1097, doi:10.1175/MWR2905.1.
Chen, F., and J. Dudhia, 2001: Coupling an advanced land surface–hydrology model with the Penn State–NCAR MM5 modeling system. Part I: Model implementation and sensitivity. Mon. Wea. Rev., 129, 569–585, doi:10.1175/1520-0493(2001)129<0569:CAALSH>2.0.CO;2.
Clark, A. J., W. A. Gallus, M. Xue, and F. Kong, 2009: A comparison of precipitation forecast skill between small convection-allowing and large convection-parameterizing ensembles. Wea. Forecasting, 24, 1121–1140, doi:10.1175/2009WAF2222222.1.
Clark, A. J., W. A. Gallus, M. Xue, and F. Kong, 2010: Growth of spread in convection-allowing and convection-parameterizing ensembles. Wea. Forecasting, 25, 594–612, doi:10.1175/2009WAF2222318.1.
Clark, A. J., and Coauthors, 2011: Probabilistic precipitation forecast skill as a function of ensemble size and spatial scale in a convection-allowing ensemble. Mon. Wea. Rev., 139, 1410–1418, doi:10.1175/2010MWR3624.1.
Dawson, D. T., II, L. J. Wicker, E. R. Mansell, and R. L. Tanamachi, 2012: Impact of the environmental low-level wind profile on ensemble forecasts of the 4 May 2007 Greensburg, Kansas, tornadic storm and associated mesocyclones. Mon. Wea. Rev., 140, 696–716, doi:10.1175/MWR-D-11-00008.1.
Duc, L., K. Saito, and H. Seko, 2013: Spatial-temporal fractions verification for high-resolution ensemble forecasts. Tellus, 65A, 18171, doi:10.3402/tellusa.v65i0.18171.
Ebert, E. E., 2001: Ability of a poor man’s ensemble to predict the probability and distribution of precipitation. Mon. Wea. Rev., 129, 2461–2480, doi:10.1175/1520-0493(2001)129<2461:AOAPMS>2.0.CO;2.
Ebert, E. E., U. Damrath, W. Wergen, and M. E. Baldwin, 2003: The WGNE assessment of short-term quantitative precipitation forecasts. Bull. Amer. Meteor. Soc., 84, 481–492, doi:10.1175/BAMS-84-4-481.
Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10 143–10 162, doi:10.1029/94JC00572.
Fritsch, J. M., and R. E. Carbone, 2004: Improving quantitative precipitation forecasts in the warm season: A USWRP research and development strategy. Bull. Amer. Meteor. Soc., 85, 955–965, doi:10.1175/BAMS-85-7-955.
Gebhardt, C., S. Theis, P. Krahe, and V. Renner, 2008: Experimental ensemble forecasts of precipitation based on a convection-resolving model. Atmos. Sci. Lett., 9, 67–72, doi:10.1002/asl.177.
Gebhardt, C., S. Theis, M. Paulat, and Z. Ben Bouallègue, 2011: Uncertainties in COSMO-DE precipitation forecasts introduced by model perturbations and variation of lateral boundaries. Atmos. Res., 100, 168–177, doi:10.1016/j.atmosres.2010.12.008.
Ha, S.-Y., and C. Snyder, 2014: Influence of surface observations in mesoscale data assimilation using an ensemble Kalman filter. Mon. Wea. Rev., 142, 1489–1508, doi:10.1175/MWR-D-13-00108.1.
Hamill, T. M., 1999: Hypothesis tests for evaluating numerical precipitation forecasts. Wea. Forecasting, 14, 155–167, doi:10.1175/1520-0434(1999)014<0155:HTFENP>2.0.CO;2.
Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Wea. Rev., 129, 550–560, doi:10.1175/1520-0493(2001)129<0550:IORHFV>2.0.CO;2.
Hamill, T. M., and J. S. Whitaker, 2005: Accounting for the error due to unresolved scales in ensemble data assimilation: A comparison of different approaches. Mon. Wea. Rev., 133, 3132–3147, doi:10.1175/MWR3020.1.
Hohenegger, C., and C. Schär, 2007: Predictability and error growth dynamics in cloud-resolving models. J. Atmos. Sci., 64, 4467–4478, doi:10.1175/2007JAS2143.1.
Hohenegger, C., A. Walser, W. Langhans, and C. Schär, 2008: Cloud-resolving ensemble simulations of the August 2005 Alpine flood. Quart. J. Roy. Meteor. Soc., 134, 889–904, doi:10.1002/qj.252.
Houtekamer, P. L., H. L. Mitchell, and X. Deng, 2009: Model error representation in an operational ensemble Kalman filter. Mon. Wea. Rev., 137, 2126–2143, doi:10.1175/2008MWR2737.1.
Iacono, M. J., J. S. Delamere, E. J. Mlawer, M. W. Shephard, S. A. Clough, and W. D. Collins, 2008: Radiative forcing by long-lived greenhouse gases: Calculations with the AER radiative transfer models. J. Geophys. Res., 113, D13103, doi:10.1029/2008JD009944.
Janjić, Z. I., 1994: The step-mountain Eta coordinate model: Further developments of the convection, viscous sublayer, and turbulence closure schemes. Mon. Wea. Rev., 122, 927–945, doi:10.1175/1520-0493(1994)122<0927:TSMECM>2.0.CO;2.
Janjić, Z. I., 2002: Nonsingular implementation of the Mellor–Yamada level 2.5 scheme in the NCEP Meso model. NCEP Office Note 437, 61 pp. [Available online at http://www.emc.ncep.noaa.gov/officenotes/newernotes/on437.pdf.]
Johnson, A., and X. Wang, 2012: Verification and calibration of neighborhood and object-based probabilistic precipitation forecasts from a multimodel convection-allowing ensemble. Mon. Wea. Rev., 140, 3054–3077, doi:10.1175/MWR-D-11-00356.1.
Kain, J. S., and Coauthors, 2013: A feasibility study for probabilistic convection initiation forecasts based on explicit numerical guidance. Bull. Amer. Meteor. Soc., 94, 1213–1225, doi:10.1175/BAMS-D-11-00264.1.
Kalnay, E., M. Kanamitsu, and W. Baker, 1990: Global numerical weather prediction at the National Meteorological Center. Bull. Amer. Meteor. Soc., 71, 1410–1428, doi:10.1175/1520-0477(1990)071<1410:GNWPAT>2.0.CO;2.
Kong, F., and Coauthors, 2009: A real-time storm-scale ensemble forecast system: 2009 Spring Experiment. 23rd Conf. on Weather Analysis and Forecasting/19th Conf. on Numerical Weather Prediction, Omaha, NE, Amer. Meteor. Soc., 16A.3. [Available online at https://ams.confex.com/ams/pdfpapers/154118.pdf.]
Lin, Y., and K. E. Mitchell, 2005: The NCEP Stage II/IV hourly precipitation analyses: Development and applications. 19th Conf. on Hydrology, San Diego, CA, Amer. Meteor. Soc., 1.2. [Available online at http://ams.confex.com/ams/pdfpapers/83847.pdf.]
Mason, S. J., and N. E. Graham, 2002: Areas beneath the relative operating characteristics (ROC) and relative operating levels (ROL) curves: Statistical significance and interpretation. Quart. J. Roy. Meteor. Soc., 128, 2145–2166, doi:10.1256/003590002320603584.
Mass, C. F., D. Ovens, K. Westrick, and B. A. Colle, 2002: Does increasing horizontal resolution produce more skillful forecasts? Bull. Amer. Meteor. Soc., 83, 407–430, doi:10.1175/1520-0477(2002)083<0407:DIHRPM>2.3.CO;2.
Melhauser, C., and F. Zhang, 2012: Practical and intrinsic predictability of severe and convective weather at the mesoscales. J. Atmos. Sci., 69, 3350–3371, doi:10.1175/JAS-D-11-0315.1.
Mellor, G. L., and T. Yamada, 1982: Development of a turbulence closure model for geophysical fluid problems. Rev. Geophys., 20, 851–875, doi:10.1029/RG020i004p00851.
Migliorini, S., M. Dixon, R. Bannister, and S. Ballard, 2011: Ensemble prediction for nowcasting with a convection-permitting model. I: Description of the system and the impact of radar-derived surface precipitation rates. Tellus, 63A, 468–496, doi:10.1111/j.1600-0870.2010.00503.x.
Mlawer, E. J., S. J. Taubman, P. D. Brown, M. J. Iacono, and S. A. Clough, 1997: Radiative transfer for inhomogeneous atmospheres: RRTM, a validated correlated-k model for the longwave. J. Geophys. Res., 102, 16 663–16 682, doi:10.1029/97JD00237.
Morrison, H., G. Thompson, and V. Tatarskii, 2009: Impact of cloud microphysics on the development of trailing stratiform precipitation in a simulated squall line: Comparison of one- and two-moment schemes. Mon. Wea. Rev., 137, 991–1007, doi:10.1175/2008MWR2556.1.
Novak, D. R., D. R. Bright, and M. J. Brennan, 2008: Operational forecaster uncertainty needs and future roles. Wea. Forecasting, 23, 1069–1084, doi:10.1175/2008WAF2222142.1.
Nutter, P., D. Stensrud, and M. Xue, 2004a: Effects of coarsely resolved and temporally interpolated lateral boundary conditions on the dispersion of limited-area ensemble forecasts. Mon. Wea. Rev., 132, 2358–2377, doi:10.1175/1520-0493(2004)132<2358:EOCRAT>2.0.CO;2.
Nutter, P., M. Xue, and D. Stensrud, 2004b: Application of lateral boundary condition perturbations to help restore dispersion in limited-area ensemble forecasts. Mon. Wea. Rev., 132, 2378–2390, doi:10.1175/1520-0493(2004)132<2378:AOLBCP>2.0.CO;2.
Palmer, T. N., R. Buizza, F. Doblas-Reyes, T. Jung, M. Leutbecher, G. J. Shutts, M. Steinheimer, and A. Weisheimer, 2009: Stochastic parametrization and model uncertainty. ECMWF Tech. Memo. 598, 42 pp.
Roberts, N. M., and H. W. Lean, 2008: Scale-selective verification of rainfall accumulations from high-resolution forecasts of convective events. Mon. Wea. Rev., 136, 78–97, doi:10.1175/2007MWR2123.1.
Romine, G. S., C. S. Schwartz, C. Snyder, J. L. Anderson, and M. L. Weisman, 2013: Model bias in a continuously cycled assimilation system and its influence on convection-permitting forecasts. Mon. Wea. Rev., 141, 1263–1284, doi:10.1175/MWR-D-12-00112.1.
Schwartz, C. S., and Coauthors, 2010: Toward improved convection-allowing ensembles: Model physics sensitivities and optimizing probabilistic guidance with small ensemble membership. Wea. Forecasting, 25, 263–280, doi:10.1175/2009WAF2222267.1.
Schwartz, C. S., G. Romine, K. Smith, and M. Weisman, 2014: Characterizing and optimizing precipitation forecasts from a convection-permitting ensemble initialized by a mesoscale ensemble Kalman filter. Wea. Forecasting, in press.
Shutts, G. J., 2005: A kinetic energy backscatter algorithm for use in ensemble prediction systems. Quart. J. Roy. Meteor. Soc., 131, 3079–3102, doi:10.1256/qj.04.106.
Skamarock, W. C., and M. L. Weisman, 2009: The impact of positive-definite moisture transport on NWP precipitation forecasts. Mon. Wea. Rev., 137, 488–494, doi:10.1175/2008MWR2583.1.
Skamarock, W. C., and Coauthors, 2008: A description of the Advanced Research WRF version 3. NCAR Tech. Note NCAR/TN-475+STR, 113 pp. [Available online at http://www.mmm.ucar.edu/wrf/users/docs/arw_v3_bw.pdf.]
Stensrud, D., H. Brooks, J. Du, S. Tracton, and E. Rogers, 1999: Using ensembles for short-range forecasting. Mon. Wea. Rev., 127, 433–446, doi:10.1175/1520-0493(1999)127<0433:UEFSRF>2.0.CO;2.
Surcel, M., I. Zawadzki, and M. K. Yau, 2014: On the filtering properties of ensemble averaging for storm-scale precipitation forecasts. Mon. Wea. Rev., 142, 1093–1105, doi:10.1175/MWR-D-13-00134.1.
Tegen, I., P. Hollrig, M. Chin, I. Fung, D. Jacob, and J. Penner, 1997: Contribution of different aerosol species to the global aerosol extinction optical thickness: Estimates from model results. J. Geophys. Res., 102, 23 895–23 915, doi:10.1029/97JD01864.
Thompson, G., P. R. Field, R. M. Rasmussen, and W. D. Hall, 2008: Explicit forecasts of winter precipitation using an improved bulk microphysics scheme. Part II: Implementation of a new snow parameterization. Mon. Wea. Rev., 136, 5095–5115, doi:10.1175/2008MWR2387.1.
Tiedtke, M., 1989: A comprehensive mass flux scheme for cumulus parameterization in large-scale models. Mon. Wea. Rev., 117, 1779–1800, doi:10.1175/1520-0493(1989)117<1779:ACMFSF>2.0.CO;2.
Torn, R. D., 2010: Performance of a mesoscale ensemble Kalman filter (EnKF) during the NOAA high-resolution hurricane test. Mon. Wea. Rev., 138, 4375–4392, doi:10.1175/2010MWR3361.1.
Torn, R. D., G. J. Hakim, and C. Snyder, 2006: Boundary conditions for limited-area ensemble Kalman filters. Mon. Wea. Rev., 134, 2490–2502, doi:10.1175/MWR3187.1.
Vié, B., O. Nuissier, and V. Ducrocq, 2011: Cloud-resolving ensemble simulations of Mediterranean heavy precipitating events: Uncertainty on initial conditions and lateral boundary conditions. Mon. Wea. Rev., 139, 403–423, doi:10.1175/2010MWR3487.1.
Warner, T. T., R. A. Peterson, and R. E. Treadon, 1997: A tutorial on lateral boundary conditions as a basic and potentially serious limitation to regional numerical weather prediction. Bull. Amer. Meteor. Soc., 78, 2599–2617, doi:10.1175/1520-0477(1997)078<2599:ATOLBC>2.0.CO;2.
Whitaker, J. S., and T. M. Hamill, 2012: Evaluating methods to account for system errors in ensemble data assimilation. Mon. Wea. Rev., 140, 3078–3089, doi:10.1175/MWR-D-11-00276.1.
Wilks, D. S., 1997: Resampling hypothesis tests for autocorrelated fields. J. Climate, 10, 65–82, doi:10.1175/1520-0442(1997)010<0065:RHTFAF>2.0.CO;2.
Wilks, D. S., 2006: Statistical Methods in the Atmospheric Sciences: An Introduction. 2nd ed. Academic Press, 467 pp.
Yussouf, N., and D. J. Stensrud, 2012: Comparison of single-parameter and multiparameter ensembles for assimilation of radar observations using the ensemble Kalman filter. Mon. Wea. Rev., 140, 562–586, doi:10.1175/MWR-D-10-05074.1.
Yussouf, N., E. R. Mansell, L. J. Wicker, D. M. Wheatley, and D. J. Stensrud, 2013: The ensemble Kalman filter analyses and forecasts of the 8 May 2003 Oklahoma City tornadic supercell storm using single- and double-moment microphysics schemes. Mon. Wea. Rev., 141, 3388–3412, doi:10.1175/MWR-D-12-00237.1.
Zhang, C., Y. Wang, and K. Hamilton, 2011: Improved representation of boundary layer clouds over the southeast Pacific in ARW-WRF using a modified Tiedtke cumulus parameterization scheme. Mon. Wea. Rev., 139, 3489–3513, doi:10.1175/MWR-D-10-05091.1.
Zhang, F., Z. Meng, and A. Aksoy, 2006: Tests of an ensemble Kalman filter for mesoscale and regional-scale data assimilation. Part I: Perfect model experiment. Mon. Wea. Rev., 134, 722–736, doi:10.1175/MWR3101.1.
For a nonlinear model with imperfect representation of model uncertainties, adding noise to represent model error may degrade the mean behavior of the forecast.
These bug fixes do not change the basic behavior of the scheme. Details can be found online at http://www.cgd.ucar.edu/~berner/skebs.