## Abstract

This study examines the performance of a hybrid ensemble-variational data assimilation system (E3DVar) that couples an ensemble Kalman filter (EnKF) with the three-dimensional variational data assimilation (3DVar) system for the Weather Research and Forecasting (WRF) Model. The performance of E3DVar and the component EnKF and 3DVar systems are compared over the eastern United States for June 2003. Conventional sounding and surface observations as well as data from wind profilers, aircraft and ships, and cloud-tracked winds from satellites, are assimilated every 6 h during the experiments, and forecasts are verified using standard sounding observations. Forecasts with 12- to 72-h lead times are found to have noticeably smaller root-mean-square errors when initialized with the E3DVar system, as opposed to the EnKF, especially for the 12-h wind and moisture fields. The E3DVar system demonstrates similar performance as an EnKF, while using less than half the number of ensemble members, and is less sensitive to the use of a multiphysics ensemble to account for model errors. The E3DVar system is also compared with a similar hybrid method that replaces the 3DVar component with the WRF four-dimensional variational data assimilation (4DVar) method (denoted E4DVar). The E4DVar method demonstrated considerable improvements over E3DVar for nearly all model levels and variables at the shorter forecast lead times (12–48 h), but the forecast accuracies of all three ensemble-based methods (EnKF, E3DVar, and E4DVar) converge to similar results at longer lead times (60–72 h). Nevertheless, all methods that used ensemble information produced considerably better forecasts than the two methods that relied solely on static background error covariance (i.e., 3DVar and 4DVar).

## 1. Introduction

Variational data assimilation approaches have been implemented at many operational centers and research communities for improving numerical weather prediction in recent decades, either in the three-dimensional form [three-dimensional variational data assimilation (3DVar; Parrish and Derber 1992; Courtier et al. 1994; Gauthier et al. 1999; Lorenc et al. 2000; Barker et al. 2005)] or its four-dimensional extension [four-dimensional variational data assimilation (4DVar; Rabier et al. 2000; Honda et al. 2005; Zupanski et al. 2005; Gauthier et al. 2007; Huang et al. 2009)]. The background error covariance for variational methods is usually derived from assumptions of spatial and temporal homogeneity and isotropy; therefore, flow-dependent uncertainties (often referred to as the “errors of the day”) go unaccounted for during the data assimilation. Even though the 4DVar method is capable of resolving time-evolving background error covariance through the use of tangent linear and adjoint models (Lorenc 2003), the limitations of using a static background covariance for a series of 4DVar cycles are still significant (Navon et al. 2005; Buehner et al. 2010a,b), since the static covariance estimate is used at the beginning of each 4DVar window. The presence of strong nonlinearity and discontinuity in the forecast model may also be problematic for the adjoint during minimization (Zou et al. 1997; Sun and Crook 1997).

Ensemble-based data assimilation, in various forms of ensemble Kalman filters (EnKF), has become a popular alternative to the variational approaches. Evensen (1994) introduced the first EnKF, and Houtekamer and Mitchell (1998) were the first to apply the technique to the atmosphere. Unlike the deterministic analyses that result from a variational data assimilation cycle, the EnKF provides an explicit estimation of the analysis probability density function, sampled by an ensemble of members that can be used to initialize ensemble forecasts. EnKFs have been adopted for a variety of weather applications over the past decade, ranging from convective to global scales for both ideal and realistic model experiments (e.g., Anderson 2001; Whitaker and Hamill 2002; Snyder and Zhang 2003; Tong and Xue 2005; Zhang et al. 2006; Torn and Hakim 2008; Meng and Zhang 2007; Whitaker et al. 2008; Zhang et al. 2009a). The EnKF has demonstrated convincing advantages over 3DVar for both limited-area and global models, because of its use of flow-dependent background error covariance (Houtekamer et al. 2005; Whitaker et al. 2008; Meng and Zhang 2008a,b). Theoretical comparisons have been made between the EnKF and 4DVar systems, along with modeling studies that consider the head-to-head performance of each system (Lorenc 2003; Caya et al. 2005; Kalnay et al. 2007; Buehner et al. 2010a,b; Miyoshi et al. 2010; Zhang et al. 2011). In summary, the two methods are equivalent when used with a perfect linear model, and result in comparable performance for realistic settings in the presence of model error and nonlinear dynamics. Modeling studies have shown that the EnKF may provide better long-term forecast performance after multiple assimilation cycles have been completed, and the 4DVar method tends to fit observations better at the analysis time and requires fewer assimilation cycles to provide optimal performance. Many formulations of the EnKF require the sequential assimilation of observations to avoid large matrix inversions during each update cycle, so the computational expense of the Kalman filter update equations scales linearly with the amount of data. In contrast, the variational cost-function approach can efficiently assimilate large numbers of observations.

A hybrid data assimilation method that couples ensemble-based and variational data assimilation systems has emerged as an alternative approach (Lorenc 2003; Buehner 2005). The new system incorporates flow-dependent forecast uncertainties into the cost function during variational data assimilation, thus taking advantage of the relative strengths of the stand-alone methods. The hybrid system can easily be adapted from the existing variational framework, while providing more flexibility than conventional ensemble-based schemes in accounting for sampling error and model error (Wang et al. 2009).

Hamill and Snyder (2000) proposed the first hybrid data assimilation method by using a linear combination of ensemble-based and time-invariant covariance in a 3DVar cost function. Lorenc (2003) proposed a more sophisticated form of hybrid by introducing ensemble perturbations into the cost function as additional control variables; the method was first implemented in Buehner (2005). The control variables for the ensemble information are preconditioned for covariance localization to reduce sampling errors, much like what is done for the stand-alone EnKF (Houtekamer and Mitchell 2001). Wang et al. (2007, 2009) noted the equivalence of the two hybrid schemes, and tested the latter for a global two-layer primitive equation model under perfect- and imperfect-model scenarios. Liu et al. (2008, 2009) introduced a hybrid method that performs four-dimensional variational minimization of a cost function, using time-dependent error covariance estimated from an ensemble to replace the role of the tangent linear and adjoint model in the standard 4DVar system. Wang et al. (2008a,b) and Buehner (2010a,b) conducted data assimilation experiments in which these two hybrid methods were used to assimilate real observations for a limited-area and global model, respectively.

Zhang et al. (2009b) applied a four-dimensional data assimilation system for the Lorenz-96 model (Lorenz 1996) that directly couples the EnKF with 4DVar methods using a linear combination of ensemble and climatological background error covariance. Unlike the four-dimensional hybrid in Liu et al. (2008, 2009), this system introduces the ensemble perturbations at the beginning of the 4DVar observation window and uses the tangent linear and adjoint models to fit the observations. The component systems run in parallel, with ensemble perturbations being updated by the EnKF and the ensemble mean replaced by the 4DVar analysis. Zhang and Zhang (2012) further showed that such a 4DVar-based hybrid (denoted as E4DVar) can outperform the stand-alone EnKF and 4DVar in real data experiments, since it benefits from flow-dependent information provided from the two component systems. Another potential benefit of the hybrid approach is the treatment of sampling and model errors, since a careful weighting between climatological and ensemble-based background statistics are used (Wang et al. 2009; Zhang et al. 2009a).

As a follow-up of previous studies (Zhang and Zhang 2012; Wang et al. 2008a,b), a hybrid ensemble-variational data assimilation system (E3DVar) based on the WRF model is applied for a regional modeling study. The data assimilation system tested in this study is similar to the method presented in Wang et al. (2008a,b), except that 1) a square root EnKF is used instead of an ensemble transform Kalman filter, 2) the EnKF and 3DVAR are two-way coupled and running in parallel, and 3) a direct comparison is made to the EnKF, 4DVar, and E4DVar systems using the same forecast model. The performance of the coupled system is examined over the eastern part of the United States for the month of June 2003 by assimilating most available conventional observations, and comparing results from cases in which the stand-alone EnKF and 3DVar systems are applied. Sensitivity tests are performed for background error covariance weighting, ensemble size, multiphysics and single-physics ensembles, and additive inflation for a comprehensive investigation on the treatment of sampling and model errors. The data assimilation systems are described in section 2, followed by experiment design in section 3. A comparison of E3DVar and the stand-alone data assimilation methods is given in section 4. The sensitivity of E3DVar experiments to configurations of 3DVar and EnKF are presented in sections 5 and 6, respectively. E3DVar is compared with E4DVar in section 7, and concluding remarks are presented in section 8.

## 2. Hybrid algorithm

### a. Variational method

The variational approach seeks a balanced state analysis that is subject to both dynamical and statistical constraints by minimizing a cost function *J*:

The three right-hand-side terms in Eq. (1) are the background (*J _{b}*,), observational (

*J*), and penalty (

_{o}*J*) cost functions, respectively, and the subscript

_{c}*k*denotes an observation time during the assimilation window of length

*K*. In

*J*,

_{b}*δ*

**x**

_{0}is the analysis increment from the first guess at the initial time, and is the background error covariance. In

*J*,

_{o}_{k}and

_{k}are the tangent linear versions of the forecast model and observation operator, is the observational error covariance, and is the innovation vector at time

*k*. For 3DVar,

*k*=

*K*= 0, meaning the analysis occurs for a fixed time. It differs fundamentally from 4DVar, which performs the minimization over a time window using linear and adjoint models that are linearized about a trajectory (Huang et al. 2009). The additional penalty term

*J*enforces a set of balanced constraints for the analysis, but it is not used in the current study (

_{c}*J*= 0). In practice, the WRF variational data assimilation package uses an incremental formulation of Eq. (1) to reduce computational cost.

_{c}### b. EnKF

The update equations for the standard ensemble Kalman filter (Evensen 1994) are given by

where and represent the prior and posterior estimate (or first guess and analysis) at the analysis time, and *i* denotes ensemble members (*i* = [1, *N*]). The is the Kalman gain matrix, and represents the background error covariance, which is referred to as in the variational algorithm. A flow-dependent is estimated from an ensemble of short-range forecasts by Eq. (3), and observations are assimilated sequentially under the assumptions of independent observation errors. This study uses the same EnKF as Meng and Zhang (2008a,b); see Snyder and Zhang (2003) for a detailed description of the algorithm.

### c. Coupled system

3DVar and EnKF run separately in E3DVar, with two-way variable exchanges during each assimilation cycle. Figure 1 outlines the three major variable exchanges that are described in the following steps: (i) the ensemble-based background error covariance is introduced into the 3DVar cost function; (ii) the prior ensemble mean is used as the first guess for each 3DVar cycle; and (iii) the posterior ensemble mean is replaced by the 3DVar analysis, , for the next ensemble forecast.

The ensemble-based background error covariance is introduced in the cost function by separating the *J _{b}* in Eq. (1) into two parts:

where *J*_{b1} is the traditional background term as in Eq. (1) and *J*_{b2} is the cost assigned to the ensemble-based terms. The is the prior ensemble covariance valid at the analysis time; is the weighting coefficient for the two covariance estimates; is a correlation matrix used to localize the ensemble covariance; and is an elementwise multiplication or Schur product. The hybrid formulation approaches the standard 3DVar as approaches 0. When the alpha-control variable transform is applied, the hybrid incremental analysis can be calculated as a function of two control variables. These variables are the traditional control variable * ν* associated with the (National Meteorological Center) NMC-based covariance (Barker et al. 2005), and an additional control variable

*associated with the ensemble-based covariance (Lorenc 2003):*

**α**The in Eq. (6) transforms the control variable **ν** into the background covariance matrix and is preconditioned such that ^{T} is approximately equal to . Likewise, ^{f}, is a matrix holding the ensemble perturbations, and the control variable ** α** is preconditioned to ensure that

*δ*

**x**

_{ens}spans the space of the localized ensemble perturbations. Using Eqs. (5) and (6), the cost function in Eq. (1) can then be rewritten as

for E4DVar, where *k* = 0 for the E3DVar system. Our hybrid formulation follows closely Wang et al. (2008a,b) though an alternative approach is to include the beta term in Eq. (6), rather than in the cost function in Eq. (7), as is done in other studies [e.g., Eq. (7) of Buehner (2005)].

## 3. Experimental design

### a. Forecast model

This study applies the above-mentioned data assimilation systems for version 3.1.1 of the Advanced Research Weather Research and Forecasting (ARW-WRF) Model (Skamarock et al. 2005), using the same configuration as Zhang et al. (2011) and Zhang and Zhang (2012). All experiments are conducted over a single domain, which covers the continental United States and surrounding areas (Fig. 2) with a 71 × 51 horizontal mesh grid using 90-km spacing and 27 vertical levels up to 50 hPa. The Grell–Devenyi cumulus scheme (Grell and Devenyi 2002), WRF single-moment 6-class microphysics scheme (WSM6; Hong et al. 2004), and the Yonsei State University (YSU) planetary boundary layer (PBL) scheme (Noh et al. 2003) are used for all deterministic forecasts. For ensemble forecasts, we use different arrangements of physics parameterization schemes for each ensemble member to create a multiphysics ensemble; the combinations of schemes used in this study are identical to those used in Meng and Zhang (2008a,b). The first forecast cycle of this month-long experiment is initialized at 0000 UTC 1 June 2003, using the National Centers for Environmental Prediction (NCEP) global final analysis (FNL) data to create the initial and lateral boundary conditions (ICs and LBCs). In the following cycles, LBCs are interpolated from the FNL analyses, while ICs are provided by analyses produced by the tested data assimilation schemes.

### b. Data assimilation systems

This study uses the 3DVar and 4DVar systems that are available in version 3.1 of the WRF variational data assimilation package (WRFDA). The static background error covariance for the variational experiments is estimated from the NMC method (Parrish and Derber 1992), which uses differences between 24- and 12-h forecasts valid at the same time (i.e., every 0000 and 1200 UTC) over the preceding month. The covariance matrix is formulated using option 5 (CV5; Barker et al. 2005) in WRFDA for a set of control variables that include streamfunction, unbalanced temperature, surface pressure and velocity potential, and relative humidity. The variance-scale parameter is carefully tuned to optimize the performance of 3DVar (refer to section 5a).

The EnKF uses a multiphysics ensemble of 40 members, with a relaxation coefficient of 0.8 [see Eq. (5) of Zhang et al. (2004)], and prespecified correlation functions for covariance localization (Gaspari and Cohn 1999) that use a radius of influence of 1800 km for radiosondes and profilers, and 600 km for all other observations. A vertical covariance localization of 15 vertical grid points is applied to discrete, single-level observations such as surface data and satellite winds. These EnKF settings are based on past experiences and limited tuning sensitivity tests, which may not be optimal.

The initial ensemble perturbations are randomly generated at 0000 UTC 1 June 2003 using the CV5 background error covariance option of the WRFDA system (Barker et al. 2005), which are dynamically balanced by the same set of control variables used for the variational cost function. The perturbations are then added to the FNL analysis to form an initial ensemble, which is integrated for 12 h to evolve a flow-dependent background error covariance matrix before the first assimilation cycle at 1200 UTC 1 June 2003. The LBCs for the ensemble forecasts are perturbed from the FNL analyses at each analysis time in the same manner as the initial perturbations.

The EnKF and 3DVar are coupled in E3Dvar through an additional control variable transformation (Lorenc 2003) that was adapted for WRFDA by Wang et al. (2008a,b). Unlike the Wang et al. (2008a,b) implementation, the coupled system developed for this study runs the 3DVar and EnKF components in parallel of one another, with information exchanged between the two systems via additional steps that are completely separate from the original WRF-3DVar and EnKF (Fig. 1). The coefficient for weighting the NMC- and ensemble-based background error covariance estimates is set to 0.8 and 0.5 in this study as in Wang et al. (2008b). Additional experiments are performed to evaluate the sensitivity of E3DVar to different weighting coefficients paired with various ensemble sizes (Table 1).

### c. Observations

The data assimilation experiments make use of various types of meteorological observations, including wind, temperature, and moisture from radiosondes, ships, and surface stations; wind from profilers; wind and temperature from aircrafts; and cloud-tracked wind from satellites. Data sorting, quality control, and observational error assignment for all cases are performed through the observation preprocessing module of WRFDA (Barker et al. 2005). The first analysis time is 1200 UTC 1 June 2003, and each data assimilation system continuously cycles through a 6-h analysis–forecast cycle (every 0000, 0600, 1200, and 1800 UTC) until the end of the month.

## 4. Control experiments of 3DVar, EnKF, and E3DVar

In this section, we evaluate the performance of each approach over the month-long period. Each data assimilation experiment presented here is referred to as a control case, since the default configurations are used. The root-mean-square error (RMSE) of horizontal winds (*U*, *V*), temperature (*T*), and mixing ratio of water vapor (*Q*) are calculated between model forecasts and radiosonde observations over a subset of the model domain (dashed box in Fig. 2). The verification statistics use a total of fifty-nine 72-h deterministic forecasts, which are initialized from the 0000 and 1200 UTC analyses each day for each of the data assimilation experiments. We also compare these results with 72-h forecasts that are initialized twice a day from the FNL analyses. The operational FNL analysis is available on a 1° by 1° latitudinal–longitudinal grid that is based on a 3DVar-type assimilation approach, called gridpoint statistical interpolation (GSI), which assimilates many more observations than what is used for the experiments presented herein, including satellite radiances.

We first examine the vertical distribution of the month-long mean RMSE for 12-h forecasts of *U*, *V*, *T*, and *Q* (Fig. 3). The largest RMSE of *U*, *V*, and *T* for each control experiment is near the tropopause at 200 hPa, while the largest error of *Q* is in the mid- to lower troposphere, collocated with a secondary error maxima for *T*. The control E3DVar performs as well as the control EnKF for the temperature field, but has noticeably smaller RMSE than EnKF for horizontal winds and moisture in the mid- to lower troposphere. Both the E3DVar and EnKF have substantially smaller RMSEs than the control 3DVar throughout the troposphere. In comparison to the 12-h WRF forecasts initialized with the FNL analysis (denoted FNL-WRF), both of the control ensemble methods (E3DVar and EnKF) have slightly larger RMSEs than FNL-WRF in the horizontal wind fields, but smaller errors in the temperature and moisture fields (Fig. 3). All three control experiments have a 12-h forecast bias in the lower and midtroposphere that is nearly negligible for winds (less than 5% of the RMSE), and slightly negative for temperature (Fig. 4). Forecasts from all three WRF-based assimilation schemes have smaller biases than the 12-h forecasts from the FNL analysis, despite using fewer observations. The control EnKF also outperforms 3DVar, which is consistent with Meng and Zhang (2008b).

Figure 5 displays the vertical profiles of RMS differences between the analysis of each assimilation scheme and the sounding observations at the analysis times. These differences measure the extent to which the analyses fit observations that have been assimilated at each verifying time. In contrast to the 12-h forecast errors (Fig. 3), Fig. 5 shows the 3DVar scheme fitting the observations the closest, followed by E3DVar, while the EnKF analyses yield the largest differences between the radiosonde observations throughout the vertical domain. The control 3DVar case has an inflated background variance factor that applies a larger weight to the observations than would otherwise be assigned by a default configuration. It is generally undesirable to closely fit observations during data assimilation, since it may cause valuable information provided by the forecast model to be disregarded. Overweighting observations may also overfit observation noise, rather than the signals. Nevertheless, the 3DVar configuration used for the control case provided the most optimal results and is therefore compared with the ensemble approaches in this section.

Figure 6 shows the vertical distribution of mean RMSE from the 72-h WRF forecasts. There is virtually no difference in RMSE between the EnKF and E3DVAR at longer lead times; the ensemble-based methods clearly produce better forecasts than the control 3DVar and FNL-WRF configuration for all fields, especially in the lower and midtroposphere. Figure 7 shows the time evolution of the domain-averaged^{1} 72-h RMSE for all three control experiments. There are large fluctuations in the 72-h RMSE for each experiment, with the performance of E3DVar and EnKF being similar throughout the month. Larger performance gains for the EnKF and E3DVar experiments occur for active weather patterns, since the ensemble provides a better error covariance estimate when the background flow deviates from climatology. The advantage of the two ensemble methods over the control 3DVar is most evident for a few episodes between 8–13 June that feature the passage of several strong mesoscale convective systems (Davis et al. 2004; Hawblitzel et al. 2007). Figure 8 summarizes the domain-averaged RMSEs of each case by further averaging all 72-h forecasts throughout the month. Consistent with what is shown in Figs. 3 and 5–7, the E3DVar produces slightly better forecasts than the EnKF for all fields except temperature. Both of the ensemble-based methods substantially outperform the standard 3DVar and FNL for all variables and forecast lead times, though the control 3DVar fits the observations the closest at the analysis time. The ensemble-based methods show significant decreases in forecast error over 3DVar, thus demonstrating the benefit of incorporating flow-dependent background error covariance into the state estimation. The ensemble forecasts also provide a means of estimating multivariate correlations between moisture variables, which are nonexistent in the static covariance matrix used for the 3DVar tested in this study (Fig. 9).

Single-observation tests are used here to show the structure of the background error covariance used to assimilate an observation at one time. Figure 9 shows increments of temperature, horizontal winds, and moisture that are calculated in response to a 1-K warmer temperature difference at 500 hPa on 0000 UTC 8 June 2003. The hypothetical observation is in the vicinity of an upper-level short-wave trough, where static background errors often fail to capture the true forecast errors. The analysis increments from 3DVar (Fig. 9c) have an isotropic structure that is centered on the observation location and completely independent of the background flow. As mentioned above, the NMC-based background error covariance used for this study contains no correlations between moisture and other variables; therefore, no updates are made to the moisture field in the 3DVar case. In contrast, the increments produced from the EnKF and E3DVar (Figs. 9a,d) contain significant adjustments to moisture near the observation location. Corrections to the wind field and thermodynamic variables are maximized westward of the observation, spanning an elongated region ahead of the wave for the EnKF case. By combining the NMC- and ensemble-based background covariance, the E3DVar analysis will be adjusted toward the larger-scale flow field as depicted by the background ensemble mean and covariance, while producing significant updates to the subsynoptic-scale flow field near the observation.

## 5. Sensitivity to variational configurations

### a. Sensitivity to variance scale

We tested 3DVar for a reasonable range of parameter values, and found an inflation value of 3.0 to give the best performance. The larger inflation factor causes the analysis to fit the observations more closely than the default setting. Though the forecast differences between sensitivity experiments are small, the higher inflation factor gave consistently better performance than 3DVAR-Var1.0 at both 12- and 72-h lead forecast times when averaged over the entire month (Fig. 10). The reason why 3DVar performs better for the larger variance factor is beyond the scope of the current study. One possibility is due to the use of 12- and 24-h forecast differences in generating the background error covariance in the current study, instead of the 24- and 48-h differences that are often used in global models (Parrish and Derber 1992). The NMC-based background error covariance also fails to account for model error, which partially explains why an inflated variance provides slightly improved results.

### b. Sensitivity to weighting coefficient

In this section, we examine the sensitivity of the E3DVar system to the weighting coefficient value [*β* in Eq. (7)] that controls the respective weights of NMC- and ensemble-based error covariance. The coefficient is set to 0.8 in the control E3DVar, which means 80% of the covariance comes from the ensemble and 20% comes from the static estimate. A value of 1.0 gives all weight entirely to the ensemble-estimated error covariance (as in experiment E3DVar-Beta1.0),^{2} which is essentially the same as the control EnKF experiment except that the 3DVar cost function minimization is used to arrive at the analysis instead of the Kalman update equation.

One key difference between the variational and Kalman filtering algorithm is the treatment of covariance localization. A recursive filter is used in E3DVar to precondition the alpha control variables for localization, while the Gaspari and Cohn (1999) fifth-order correlation function is used in the EnKF. We made every effort to configure and tune the correlation length scale in the recursive filter to give a similar influence radius as what is used for the EnKF. The similarity between analysis increments for the two cases can be seen in the single-observation experiments (Figs. 9a,e). The overall performance of the EnKF and E3DVar-Beta1.0 is indeed similar for forecasts with lead times ranging from 24 to 72 h, but the control EnKF analysis fits the observations closer and produces slightly smaller 12-h forecast errors (Fig. 11).

Figure 11 shows the performance of E3DVar with a *β* value of 0.5 (experiment E3DVar-Beta0.5). E3DVar-Beta0.5 has a slightly closer fit to observations than the control, *β* = 0.8, E3DVar at the analysis time because the combined covariance tends toward the NMC-based structures near the observation point (Fig. 9e). Nevertheless, the control E3DVar has a slightly lower forecast RMSE than E3DVar-Beta0.5 and E3DVar-Beta1.0 for all variables and forecast times. A choice of *β* smaller than 0.5 would make the solution closer to 3DVar and further degrade the forecast performance (not shown). We, therefore, conclude that the E3DVar system is not highly sensitivity to the value of the weighting coefficient but it is more desirable to place a higher weight on the ensemble-estimated covariance.

## 6. Sensitivity to the ensemble configurations

### a. Sensitivity to ensemble size

Wang et al. (2008a,b) and Zhang et al. (2009b) showed that a hybrid ensemble-variational approach may lessen the computational demand of ensemble data assimilation by requiring a smaller ensemble size than what is typically required for the EnKF. Here we examine the sensitivity of E3DVar to ensemble size over the same month-long period. Experiments E3DVar-SIZE10 and E3DVar-Size20 are identically configured to the control E3DVar except for using ensemble sizes of 10 and 20, respectively. Given the large number of sensitivity experiments presented here, we compare the ensemble spread versus analysis–forecast error in terms of root-mean-difference total energy (RM-DTE), which summarize the RMSE of several forecast variables (*u*, *υ*, and *T*). As in Zhang et al. (2002), DTE = 0.5(*u*′*u*′ + *υ*′*υ*′ + *kT*′*T*′) where the primes denote a difference between the observations and verified fields, and *k* = *C _{p}*/

*T*(

_{r}*C*= 1004.7 J kg

_{p}^{−1}K

^{−1}and the reference temperature,

*T*= 290 K). Figure 12 shows that reducing the ensemble size in the control E3DVar to 10 members increases forecast errors at all lead times. Given the severity of the sampling error with only 10 ensemble members (cf. Fig. 9g), we performed two additional sensitivity experiments denoted E3DVar-Size10-Beta0.5 and E3DVar-Size10-Beta0.5-L900. Both experiments use a weighting coefficient of 0.5, which gives an equal amount of weight to the two background covariance estimates, and the latter experiment uses a radius of influence that has been reduced from 1800 km in the control to 900 km (cf. Fig. 9h). Both changes improve the performance of E3DVar when only 10 members are used, reducing the error to a level that is comparable to what resulted from the standard 3DVar.

_{r}The ensemble size is increased from 10 to 20 members in E3DVar-Size20, leading to relatively smaller sampling errors (cf. Fig. 9i). This E3DVar configuration is able to outperform the standard 3DVar at all forecast lead times, yielding forecast errors that are close to the 40-member control EnKF. The result is consistent with what was found for simulated data experiments in Wang et al. (2008a,b). We further increase the ensemble size of the EnKF experiment to 100 (denoted EnKF-Size100) to compare with the E3DVar experiments. A larger ensemble size improves the estimate of the background covariance and therefore improves the performance of the EnKF (cf. Fig. 9f). Nevertheless, the performance of the EnKF-Size100 configuration is similar to that of the control E3DVar, again suggesting that the hybrid system may achieve a level of performance that is similar to an EnKF, while using less than half the number of ensemble members.

### b. Sensitivity to multiphysics scheme

As in Meng and Zhang (2008a,b), the control EnKF and E3DVar experiments use different combinations of subgrid-scale physics parameterization schemes for each ensemble member to account for a portion of the model error. Experiments EnKF-S and E3DVar-S are configured to be identical to the control EnKF and E3DVar except for using the same set of physics options for each member. The choice of physics schemes is listed in section 3a, and is the same as what is used for all deterministic forecasts in this study. Figure 13 compares vertical profiles of the 12-h forecast RMSEs for all variables in each of the four experiments, and Fig. 14 shows the domain-averaged RMSEs for 0–72-h forecast lead times. The control EnKF with a multiphysics ensemble has a clear advantage over the single-physics scheme ensemble for thermodynamic variables at earlier forecast times, but the difference diminishes by 72 h, which is consistent with Meng and Zhang (2008a). Nevertheless, there is little or no difference in the performance between E3DVar and E3DVar-S. This result suggests that mixing static background error covariance with the ensemble-estimated values reduces the effectiveness of a multiphysics ensemble. As suggested in Zhang et al. (2009b), the hybrid data assimilation scheme may be less vulnerable to model error.

### c. Sensitivity to additive inflation

Past studies using global models have demonstrated that the performance of an EnKF can be improved by adding to ensemble perturbations a random sample drawn from a static background error covariance to account for sampling and model error (e.g., Hamill and Whitaker 2005; Houtekamer et al. 2009). This technique is essentially another approach of combining ensemble- and climate-based covariance, but for a pure ensemble framework. With this in mind, an experiment similar to the control EnKF, but with a random error included in the ensemble perturbations is performed (EnKF-Addi0.2). The additional perturbations are randomly selected from the NMC-based background error covariance and scaled by a factor of 0.2. The relaxation of posterior perturbations back to the prior perturbations for EnKF (Zhang et al. 2004) is rewritten as

where *α* is set to 0.8, and *γ* is an additive inflation coefficient that is set to 0.2 for these experiments.

With additive error inflation, the EnKF-Addi0.2 configuration produces lower domain-averaged RM-DTE than the control EnKF and is comparable in performance to the control E3DVar (Fig. 12). This result suggests that it may be possible to achieve the same performance of a hybrid system using an ensemble framework. One caveat is that the EnKF may be sensitive to the magnitude of the additive error. More advanced covariance inflation methods, such as the adaptive or flow-dependent approaches proposed by Anderson (2007, 2009), Miyoshi (2011), and Whitaker and Hamill (2012), might allow the EnKF to achieve a performance that is similar to a hybrid system. Nevertheless, a thorough testing of these techniques is beyond the scope of the current study. The E3DVar-Addi0.2 experiment gives the best performance in this study, even for the EnKF with a 100-member ensemble in this particular case. A small ensemble size can lead to an underestimation of background errors (Whitaker and Hamill 2002), and so an inflated variance and/or application of additive noise are needed to avoid filter divergence.

## 7. Comparison of E3DVar with E4DVar

The control experiments in section 4 show clear advantages of E3DVar over the stand-alone 3DVar and EnKF methods, while an E4DVar hybrid in Zhang and Zhang (2012) shows clear advantages over 4DVar and EnKF. Given the much higher computational cost needed to run E4DVar over E3DVar, a question emerges regarding whether the additional improvements in accuracy are worth the cost needed to apply an E4DVar system.

Figure 15 compares 12-h forecast RMSE for all control experiments, including simulations run from the two coupled systems, E3DVar and E4DVar, and the uncoupled component systems, 3DVar, 4DVar, and EnKF. All forecasts are verified using sounding observations, and errors are averaged over all 59 WRF forecasts for the same month used in the previously discussed sensitivity experiments. Figure 16 shows the corresponding 0- to 72-h time evolution of the domain-averaged RMSE for each control experiment, evaluated every 12 h. The configurations of the control 4DVar and E4DVar experiments follow a similar setup as the 3DVar and E3DVar cases, except the 4DVar option in the WRFDA system (Huang et al. 2009) is used. As first observed in Zhang et al. (2011), the WRF versions of the EnKF and 4DVar systems perform similarly, except that the EnKF has a substantially smaller error in the moisture analysis and a slightly lower error for winds and temperature. The EnKF and 4DVar systems perform considerably better than 3DVar, but are both less accurate than E3DVar (Figs. 15 and 16).

E4DVar shows further improvements over E3DVar by reducing 12-h forecast error at nearly all levels and variables (Fig. 15). The benefits of E4DVar over E3DVar persist for the 48-h forecast lead times, but the two methods converge to a similar level of forecast accuracy as the EnKF by 60–72 h (Fig. 16). It is also worth noting that the error amplitudes of all three ensemble-based data assimilation methods (EnKF, E3DVar, and E4DVar) are quantitatively similar at the 72-h forecast time, all of which are considerably smaller than those of the two variational-based methods (3DVar and 4DVar) that use static background error covariance (Fig. 16). As suggested in section 4, ensemble forecasts provide information regarding the background flow field that goes unresolved by the NMC-based covariance; therefore, the variational data assimilation cases experience large peaks in forecast error during days in which the uncertainty in the flow field deviates from the climatological statistics used to derive the static covariance matrix. The probabilistic methods may also be benefiting from an improved first-guess estimate at each data assimilation cycle that is provided from an ensemble mean.

## 8. Conclusions

This study tests four state-of-the-art data assimilation algorithms for limited-area WRF Model configurations, covering variational, ensemble-based, and hybrid methods: 3DVar, EnKF, and E3DVar. E3DVar is a coupled data assimilation method adapted from Wang et al. (2008a,b) that is based on the WRFDA framework. It requires two variable exchanges between the EnKF and 3DVar components; the ensemble-based background error covariance is introduced into the 3DVar cost function, and the mean of the EnKF analysis is replaced by the 3DVar analysis. The strengths and weaknesses of the set of data assimilation schemes are investigated under a realistic operational modeling scenario for a month-long cycling data assimilation experiment during June 2003. Observations were taken from atmospheric sounding and surface datasets, wind profilers, ships and aircrafts, and cloud-tracked winds from satellites every 6 h over North America. Data assimilation approaches that use flow-dependent, multivariate background error covariance produced superior forecasts than methods that solely rely on static covariance in a setting in which conventional data were assimilated. Data assimilation experiments using a 40-member multiple-physics ensemble show reductions in 12–72-h forecast root-mean-square error when simulations are initialized with the E3DVar system as opposed to a similarly configured EnKF. Additional sensitivity experiments show only a small sensitivity of forecast errors from E3DVar to the weighting coefficient that is used to balance the amount of ensemble-based and static background covariance. E3DVar may perform as well as a standard EnKF that uses a much larger ensemble size, given that the system is carefully tuned to account for sampling error. In addition, the hybrid system is less sensitive to the use of multiphysics ensembles that are typically used to treat model error in ensemble data assimilation systems. Our results also show that additive inflation may lead to further improvements in the ensemble-based approaches, because ensemble forecasts typically underestimate the true background uncertainty.

In the last part of this study, the E3DVar system is compared with an E4DVar hybrid that couples the EnKF with a four-dimensional variational system. E4DVar leads to further forecast improvements at nearly all model levels and variables for forecast lead times between 12 and 48 h, but converges to the same accuracy as the other ensemble-based methods (EnKF and E3DVar) for lead times between 60 and 72 h. Forecasts from the three ensemble-based methods have considerably smaller errors than forecasts initialized from the two variational methods (3DVar and 4DVar) at later lead times, likely due to the use flow-dependent forecast uncertainty and an ensemble mean first-guess field during cycling. Future studies will be needed to evaluate the relative performance of various data assimilation systems in higher-resolution models and with the inclusion of satellite radiances.

## Acknowledgments

We thank Drs. Xiang-Yu Huang, Chris Snyder, Dale Barker, Xuguang Wang, Yonghui Weng, and Zhiyong Meng for their support on the algorithm development and constructive comments. We also thank three anonymous reviewers for their insightful comments on an earlier version of the manuscript. This work was supported by the Office of Naval Research Grant N000140910526 and the National Science Foundation Grant ATM-0840651.

## REFERENCES

*Sixth WRF/15th MM5 Users’ Workshop,*Boulder, CO, NCAR, 17 pp. [Available online at http://www.mmm.ucar.edu/wrf/users/workshops/WS2005/presentations/session10/1-Barker.pdf.]

*Seminar on Predictability,*Vol. 1, ECMWF, 1–18. [Available online at http://www.ecmwf.int/publications/library/ecpublications/_pdf/seminar/1995/predictability_lorenz.pdf.]

*Computation Science–ICCS 2005,*V. S. Sunderam et al., Eds., Lecture Notes in Computer Science, Vol. 3515, Springer, 837–844.

## Footnotes

^{1}

In the domain average here, a simple vertical mean of the RMSE at different verification levels is used, which does not account for the difference in sample density of verifying observations and the difference in the air density, though these two effects tend to compensate each other.

^{2}

The hybrid cost function [Eq. (7)], does not allow a weighting coefficient of 1, so the value is actually set to 0.99.