## Abstract

A hybrid ensemble transform Kalman filter–three-dimensional variational data assimilation (ETKF–3DVAR) system for the Weather Research and Forecasting (WRF) Model is introduced. The system is based on the existing WRF 3DVAR. Unlike WRF 3DVAR, which utilizes a simple, static covariance model to estimate the forecast-error statistics, the hybrid system combines ensemble covariances with the static covariances to estimate the complex, flow-dependent forecast-error statistics. Ensemble covariances are incorporated by using the extended control variable method during the variational minimization. The ensemble perturbations are maintained by the computationally efficient ETKF. As an initial attempt to test and understand the newly developed system, both an observing system simulation experiment under the perfect model assumption (Part I) and the real observation experiment (Part II) were conducted. In these pilot studies, the WRF was run over the North America domain at a coarse grid spacing (200 km) to emphasize synoptic scales, owing to limited computational resources and the large number of experiments conducted. In Part I, simulated radiosonde wind and temperature observations were assimilated. The results demonstrated that the hybrid data assimilation method provided more accurate analyses than the 3DVAR. The horizontal distributions of the errors demonstrated the hybrid analyses had larger improvements over data-sparse regions than over data-dense regions. It was also found that the ETKF ensemble spread in general agreed with the root-mean-square background forecast error for both the first- and second-order measures. Given the coarse resolution, relatively sparse observation network, and perfect model assumption adopted in this part of the study, caution is warranted when extrapolating the results to operational applications.

## 1. Introduction

The present three-dimensional variational data assimilation (3DVAR) system for the Weather Research and Forecasting (WRF) Model, like many other operational 3DVAR systems (e.g., Parrish and Derber 1992; Courtier et al. 1998; Gauthier et al. 1998; Cohn et al. 1998; Lorenc et al. 2000), assumes that the background forecast-error covariances are static and nearly homogeneous and isotropic. In reality, the background-error covariances may vary substantially depending on the flow of the day. A four-dimensional variational data assimilation (4DVAR) system for WRF implicitly includes a time-evolving covariance model through the evolution of initial errors under tangent linear dynamics (Lorenc 2003). However, the evolved, flow-dependent covariance model may still be limited by usage of a static covariance model at the beginning of each 4DVAR cycle.

The ensemble Kalman filter (EnKF) provides an alternative to variational data assimilation systems. In the EnKF, the background-error covariances are estimated from an ensemble of short-term forecasts. The presumed benefit of utilizing these ensemble-based techniques is their ability to provide a flow-dependent estimate of the background-error covariances. Since the EnKF was described and tested by Evensen (1994) in an oceanographic application and by Houtekamer and Mitchell (1998) in the atmospheric application, ensemble Kalman filter techniques have been implemented in numerous studies with promising results. These studies include observing system simulation experiment (OSSEs; e.g., Houtekamer and Mitchell 1998, 2001; Anderson 2001; Whitaker and Hamill 2002; Szunyogh et al. 2005; Torn et al. 2006) and experiments with real numerical weather prediction (NWP) models and observations (e.g., Houtekamer et al. 2005; Whitaker et al. 2008; Szunyogh et al. 2008). Studies more recently have included limited-area models resolving meso- and convective scales (e.g., Snyder and Zhang 2003; Dowell et al. 2004; Tong and Xue 2005; Meng and Zhang 2008; Liu et al. 2008; Dirren et al. 2007; Torn and Hakim 2008). The EnKF has also been applied to the land surface (e.g., Reichle et al. 2002) and ocean models (e.g., Keppenne and Rienecker 2002). For reviews of the ensemble-based data assimilation, please refer to Evensen (2003), Lorenc (2003), and Hamill (2006).

The encouraging results in these studies suggest that if ensemble information is used in the variational data assimilation framework to augment the static background-error covariance, analyses can be improved. Hereinafter, we call this method a hybrid ensemble variational method, or more simply a “hybrid” scheme. In comparison with the conventional ensemble-based data assimilation, a hybrid scheme may be attractive for the following reasons. First, unlike conventional ensemble-based schemes, which adopt a framework that differs significantly from standard variational schemes, the hybrid schemes build upon with existing variational systems, and thus ensemble information can be incorporated relatively easily. Second, hybrids have been shown with simple model experiments to be more robust than conventional ensemble data assimilation schemes when the ensemble size is small or the model error is large (Wang et al. 2007a, 2008, hereinafter Part II).

Several studies have been conducted on the hybrid schemes. Studies by Hamill and Snyder (2000), Etherton and Bishop (2004), and Wang et al. (2007a) used simple models and simulated observations to suggest the effectiveness of incorporating ensembles in the 3DVAR to improve the analyses. Lorenc (2003) discussed how an ensemble-based covariance model could be adapted conveniently to the variational framework by extending the control variables. Two studies have adopted this framework for global data assimilation applications. Barker (1999) tested this framework with a single member from an error-breeding system using the Met Office’s global 3DVAR and suggested substantial forecast improvement would be possible if more ensemble members were used. Buehner (2005) tested the hybrid for the global 3DVAR at the Canadian Meteorological Centre, using ensembles generated either by the EnKF or by performing an ensemble of 3DVAR analyses, each using a different background and observations perturbed according to observation error statistics. His results showed modest forecast improvements and suggested revisiting the problem with increasing ensemble size. The method of using an extended control variable to incorporate the ensemble in the variational framework (Lorenc 2003) and the method of directly combining the ensemble covariance with the static covariance (Hamill and Snyder 2000) were recently proved to be theoretically equivalent to each other by Wang et al. (2007b).

In this study, we developed a hybrid data assimilation system for the WRF model, based on the existing WRF 3DVAR system. The ensemble mean is updated by the hybrid scheme using the extended control variable method proposed by Lorenc (2003) to incorporate ensemble covariance information. The ensemble perturbations are generated by the ensemble transform Kalman filter (ETKF; Wang and Bishop 2003; Wang et al. 2004, 2007a). Therefore, like conventional ensemble-based data assimilation, such a system can automatically generate initial ensembles for the subsequent ensemble forecasts. We chose to use the ETKF to generate the ensembles for the following reasons. An early study by Wang and Bishop (2003) showed the ETKF can provide ensemble perturbations that produce skillful ensemble forecasts while maintaining relatively inexpensive cost. Recent studies by Wang et al. (2007a) using a simple model demonstrated that a hybrid data assimilation system using the ETKF ensembles provided analyses almost as accurate as the full EnKF with moderate ensemble sizes, and the hybrid provided analyses better than the EnKF with small ensemble sizes. Also, the ETKF generates the ensemble perturbations in a less computationally expensive fashion since the update of the ensemble perturbations are performed in the low-dimensional ensemble subspace. Since we use the ETKF to generate ensembles, we call such a system the hybrid ETKF–3DVAR system for WRF.

In Part I of this two-part study, we test the new WRF hybrid ETKF–3DVAR system using an OSSE. In Part II, we will test the system by assimilating real observations. These studies are the first we are aware of that test the hybrid ETKF–3DVAR method for a limited-area NWP model. As an initial attempt to test and understand the newly developed system and given the limited computational resources, the experiments were conducted at a reduced resolution (200 km) to emphasize synoptic scales, and a subset of observational network was assimilated. Our results are thus not a direct analog to the operational regional-scale applications where much finer resolution and denser observations are assimilated. We hope that in future work we will be able to extend this research to higher-resolution simulations and more complete observation networks.

## 2. The Hybrid ETKF–3DVAR scheme in WRF

Figure 1 from Wang et al. (2007a) illustrates how the hybrid ETKF–3DVAR data assimilation cycle works. Suppose we start with an ensemble of *K* background forecasts at time *t*_{0}. The following four steps are then repeated for each data assimilation cycle: 1) Update the ensemble mean by the hybrid ensemble-3DVAR method. 2) Update the forecast perturbations using the ETKF. 3) Add the updated ensemble perturbations to the updated ensemble mean to generate *K* initial ensemble members. 4) Make *K* forecasts starting from the *K* initial ensemble members forward to the next analysis time. In sections 2a, b, we will describe steps 1 and 2.

### a. Incorporating an ensemble in WRF 3DVAR using extended control variables

We briefly consider the update of the ensemble mean with observations using the hybrid method (the description of the method can also be found from D. M. Barker et al., unpublished manuscript). In the WRF hybrid ensemble–3DVAR system, flow-dependent ensemble covariances are incorporated in the variational minimization by extending control variables, following section 5 of Lorenc (2003). We first introduce the terms generally used in the hybrid ensemble–3DVAR framework and then explain how it is applied within WRF 3DVAR.

The analysis increment of the hybrid, denoted as **x**′, is a sum of two terms, defined as

The first term, **x**′_{1}, in Eq. (1) is the increment associated with the WRF 3DVAR static background covariance. The second term is the increment associated with the flow-dependent ensemble covariance. In the second term of Eq. (1), **x**^{e}_{k} is the *k*th ensemble perturbation normalized by *K* − 1 where *K* is the ensemble size:

In Eq. (2), **x*** _{k}* is the

*k*th ensemble forecast and

**x**is the mean of the

*K*-member ensemble forecasts. The vectors

**a**

*,*

_{k}*k*= 1, . . . ,

*K*, denote the extended control variables for each ensemble member. The symbol ∘ denotes the Schur product (element by element product) of the vectors

**a**

*and*

_{k}**x**

^{e}

_{k}. In other words, the second term of Eq. (1) represents a local linear combination of ensemble perturbations. The coefficient

**a**

*for each member varies in space as discussed later, which determines the ensemble covariance localization scale [for the meaning of covariance localization see Gaspari and Cohn (1999) and Hamill et al. (2001)].*

_{k}The analysis increment **x**′ is obtained by minimizing the following hybrid cost function:

As compared with a normal 3DVAR cost function, a weighted sum of *J*_{1} and *J _{e}* terms in Eq. (3) replaces the usual background term. Next we describe each of the three terms in Eq. (3).

In Eq. (3), *J*_{1} is the traditional WRF 3DVAR background term associated with the static covariance 𝗕.

In the term *J _{e}*,

**a**is a vector formed by concatenating

*K*vectors

**a**

*,*

_{k}*k*= 1, . . . ,

*K*. In other words,

**a**

^{T}= (

**a**

^{T}

_{1},

**a**

^{T}

_{2}, . . . ,

**a**

^{T}

_{K}). As in Eq. (17) of Lorenc (2003), the extended control variables are constrained by a block-diagonal matrix 𝗔

Each of the *K* blocks contains the same prescribed correlation matrix 𝗦, which constrains the spatial variation of **a*** _{k}*. In other words, 𝗔 defines the spatial covariance, here spatial correlation (since variance is equal to 1) of

**a**, same as 𝗕 defines the spatial covariance of

**x**′

_{1}.

In Eq. (3), *J _{o}* is the observation term. As the traditional 3DVAR,

**y**

*′ =*

^{o}**y**

*−*

^{o}*H*(

**x**

*) is the innovation vector. Here*

^{b}**y**

*denotes the observation,*

^{o}**x**

*is the background forecast, and*

^{b}*H*is the nonlinear observation operator. In this study, the background forecast

**x**

*is the ETKF ensemble mean forecast. Here 𝗛 is the linearized observation operator, and 𝗥 is the observation-error covariance.*

^{b}In Eq. (3), there are two factors *β*_{1} and *β*_{2} that define the weights placed on the static background-error covariance and the ensemble covariance. To conserve the total background-error variance, *β*_{1} and *β*_{2} are constrained by

A similar constraint was applied in Hamill and Snyder (2000), Etherton and Bishop (2004), and Wang et al. (2007a).

To further comprehend the hybrid system defined by Eqs. (1)–(5), Wang et al. (2007b) explicitly proved that the solution from Eqs. (1) to (5) is equivalent to the solution by minimizing a cost function where the background-error covariance was explicitly defined as a sum of the static covariance and the ensemble covariance with localization applied through the Schur product:

where 𝗣* ^{e}* is the ensemble covariance defined as

Wang et al. (2007b) also proved that given the covariance, 〈**a*** _{k}*(

**a**

*)*

_{k}^{T}〉 = 𝗦,

*k*= 1, . . . ,

*K*, the covariance of the second term in Eq. (1) satisfies

Based on Eqs. (6)–(8), effectively the correlation matrix in the second term of Eq. (3) performs covariance localization on the ensemble covariance.

Practically, to effectively reduce the condition number during variational minimization (e.g., Lorenc et al. 2000), the *J*_{1} term in Eq. (3) is preconditioned by a control variable transform relating the control variables (**v**_{1}) and model space increments (**x**_{1}), that is, **x**′_{1} = 𝗨_{1}**v**_{1}, where the transform 𝗨_{1} approximates the square root of the static covariance 𝗕. In the hybrid system, the transform 𝗨_{1} is the same as in the WRF 3DVAR. For details, please refer to Barker et al. (2004). Similarly, the *J _{e}* term is preconditioned by a transform

**a**= 𝗨

_{2}

**v**

_{2}, where the transform by 𝗨

_{2}approximates the square root of the correlation matrix 𝗔. This transform 𝗨

_{2}is modeled using the simple recursive filter (Hayden and Purser 1995) in the WRF hybrid data assimilation system, different from Buehner (2005) where the correlation was modeled with a truncated spectral expansion.

To reduce the extra cost of minimization due to the increased number of control variables, we chose to let each **a*** _{k}*,

*k*= 1, . . . ,

*K*, vary only in the horizontal. In other words, the same two-dimensional field of coefficients

**a**

*was applied for all levels and all variables. Thus, we only applied horizontal recursive filters to model the correlation matrix 𝗔. As mentioned earlier, the correlation matrix 𝗔 determines the covariance localization to be applied to the ensemble. So in the current hybrid system, there is no vertical covariance localization. Nevertheless, as discussed in Buehner (2005) and Wang et al. (2007a), the use of the static covariance in Eq. (3) in addition to the ensemble covariance helps to reduce the detrimental effects of the sampling error of the ensemble covariance on the analysis. So, although there is no vertical localization through 𝗔, the static covariance 𝗕 can ameliorate the effects of sampling error in the vertical covariances estimated from the ensemble.*

_{k}The number of extended control variables in the current hybrid system is equal to the horizontal grid dimension (i.e., the number of grid points in the horizontal, *I* × *J*) of the model times the ensemble size *K*. The number of control variables in the traditional WRF 3DVAR cost function is equal to Σ^{N}_{n=1 }*I* × *J* × *L*_{n}, where *N* is the number of types of variables and *L _{n}* is the number of vertical levels for the

*n*th type of variable. In the current WRF 3DVAR, there are five uncorrelated control variables (Barker et al. 2004), and therefore,

*N*= 5. So, the additional number of control variables in the hybrid divided by the number of control variables of the traditional WRF 3DVAR is equal to

*K*/Σ

^{5}

_{n=1 }

*L*

_{n}, or about 0.46 in the experiments considered in this study.

### b. The ETKF ensemble generation scheme

We now consider the method for generating ensemble perturbations around the updated mean state. The ETKF is used to update the forecast ensemble perturbations to produce the analysis perturbations. Denote 𝗫* ^{e}* as the matrix whose

*K*column vectors contain the

*K*vectors of ensemble member perturbations from the mean. Denote 𝗫

*as the matrix of analysis perturbations (i.e., columns of 𝗫*

^{a}*contain*

^{a}*K*analysis perturbations). The ETKF updates 𝗫

*into 𝗫*

^{e}*through a transformation matrix. The transformation matrix is derived within the ensemble perturbation subspace. Assuming the covariance of the raw forecast ensemble perturbations were equal to the true forecast-error covariance, the goal of the ETKF is to choose the transformation matrix so that the outer product of the transformed perturbations were equal to true analysis error covariance. The ETKF formulation used here (Wang et al. 2007a) is*

^{a}where **C** contains the eigenvectors and **Γ** the eigenvalues of the *K* × *K* matrix (𝗫* ^{e}*)

^{T}𝗛

^{T}𝗥

^{−1}𝗛𝗫

*/(*

^{e}*K*− 1), and 𝗜 is the identity matrix. For more details on the derivation of Eq. (9), please refer to Wang et al. (2007a) and references therein.

In Eq. (9), the scalar factor Π is an inflation factor, and the scalar factor *ρ* accounts for the fraction of the forecast-error variance projected onto the ensemble subspace. Both factors are intended to ameliorate the systematic underestimate of the analysis-error variance by the ETKF because of the limited ensemble size. Wang et al. (2007a) provided details on how to estimate these two factors adaptively for each data assimilation cycle using the innovation statistics. Ultimately, we want to ensure that on average the background-error variance estimated from the spread of ensembles about the ensemble mean is consistent with the background-error variance estimated from the differences between the ensemble mean and the observations. Mathematically,

where tr denotes trace of a matrix. For a derivation of Eq. (10), please see Wang et al. (2007a).

As discussed in Wang et al. (2007a), the estimation of the above two scalar factors is also subject to sampling errors due to the limited number of observations in the innovation vector and the limited ensemble size. Since this study only considers the radiosonde observations over North America and uses a 50-member ensemble, we reduce sampling errors in the estimation of these two factors by taking the running mean of the previous 5 days’ values of the factors.

## 3. Experiment design

### a. Model, observations, ensemble configurations, and verification techniques

As an initial attempt to test and understand the newly developed hybrid data assimilation system for WRF, we designed experiments that were relatively simple and manageable with our limited computational resources. In this paper (Part I), we describe and report the OSSE. Results for the real-observation experiments are reported in Part II.

Experiments were performed running the WRF model (Skamarock et al. 2005), on a domain covering North America and the surrounding oceans (Fig. 1), the same as the ongoing WRF EnKF experiments (Caya et al. 2004). We ran WRF with a 200-km grid spacing on a 45 × 45 horizontal grid with 27 vertical levels and a model top at 50 hPa. The coarse grid spacing was chosen so that a large number of experiments could be conducted to find optimal tunable parameters (see sections 3b and 3c and experiment design in Part II) using the limited computational resources available.

Simulated observations of radiosonde temperature, and *u*- and *υ*-wind components were generated by adding random noise to the WRF nature run (the “truth”). The positions and vertical sampling rate of the radiosondes were obtained from the operational observation dataset at the National Centers for Environmental Prediction (NCEP; see online at http://www.emc.ncep.noaa.gov/mmb/data_processing/prepbufr.doc for the operational observation dataset). Figure 1 shows a snapshot of the horizontal distribution of the radiosondes. Simulated observations were generated up to 100 hPa. The observation-error covariance 𝗥 was assumed to be diagonal. The observation errors were obtained from the operational observation dataset at NCEP. The vertical profile of the observation errors is shown in Fig. 2. The random noise that was used to generate simulated observations was drawn from a Gaussian distribution with zero mean and standard deviation equal to the observation errors.

The simulation started at 0000 UTC 1 January 2003 and lasted for 4 weeks. The observations were assimilated every 12 h for 4 weeks, beginning 1200 UTC 1 January 2003.^{1} Initial and lateral boundary conditions (LBCs) for the nature run and for the ensemble were constrained following Caya et al. (2004). The initial condition and the LBCs for the nature run were obtained by adding perturbations to the NCEP “Final” analysis (FNL; see online at http://dss.ucar.edu/datasets/ds083.2). The perturbations were drawn from a multivariate normal distribution whose covariance was from WRF 3DVAR NCEP background-error covariances, following Torn et al. (2006). The structure of the WRF 3DVAR NCEP background-error covariances is similar to that described in Wu et al. (2002). We ran 50-member ETKF ensembles. The ensemble at 0000 UTC 1 January and the ensemble of LBCs during the 4 weeks were generated by superposing 50 perturbations to the NCEP FNL analysis for the appropriate date. As for the nature run, these perturbations were drawn from a multivariate normal distribution, having the same covariance as the one used to generate the nature run. This procedure assumed the uncertainty in the LBCs for the 4-week period and the uncertainty in the IC at the very beginning of the cycle were perfectly represented by the ensembles because the true state, by construction, was drawn from the same distribution as the ensembles. Figure 3 shows the spread (“uncertainty”) of the 50-member initial ensembles as a function of model grids in latitudinal direction at 0000 UTC 1 January. The lateral boundary width in this study is five model grids. The magnitude of the spread at the boundaries in Fig. 3 also shows the uncertainty of the LBCs from 50 random perturbations.

For the 3DVAR experiment, the background forecast at 1200 UTC 1 January 2003 was taken from the ensemble mean at that time to ensure that both assimilation schemes start with the same first guess. The LBCs for the 3DVAR experiment were from the NCEP FNL analysis.

We chose the verification region to be the inner quarter of the total domain (Fig. 1) to minimize artifacts from the lateral boundaries. The analyses and forecasts were verified against the truth (i.e., the WRF nature run). The statistics were collected after the first 5 days for all horizontal and vertical grids in the verification domain.

### b. Tuning the 3DVAR static background-error covariance

Since the default WRF 3DVAR NCEP background-error covariance may not represent the optimal static background-error covariance for the 3DVAR OSSE experiment with the perfect model assumption and simplified observation network, we constructed a new static background-error covariance using the 12-h ETKF ensemble forecasts that corresponded to the OSSE experiment setting. We ran the ETKF ensemble forecasts every 12 h for 4 weeks. The ensemble-mean forecast for the ETKF was updated by the WRF 3DVAR assimilating the simulated observations using the default background-error covariance. We then discarded the first 5 days and recalculated the static background-error covariance 𝗕 using the remaining 12-h ETKF ensemble forecasts. In constructing 𝗕, linear balance between mass and wind fields and horizontal homogenous error covariances was assumed. For details on calculating the static covariance for WRF 3DVAR, please refer to Skamarock et al. (2005, chapter 9). Then we reran the OSSE 3DVAR experiment using the newly generated 𝗕. As shown in Table 1, the analyses generated by the 3DVAR with this tuned static covariance were more accurate than using the default WRF 3DVAR NCEP background-error covariance.

We also performed several other experiments to determine the sensitivity of 3DVAR analysis to the 𝗕 produced by other inputs. We reproduced 𝗕 using a 2-week period of ETKF forecasts instead of a 4-week period. The root-mean-square (rms) error of the 3DVAR analysis was not sensitive to this. Since for the OSSE, the true state was available and thus the error was known, we also calculated a new static covariance using the known 12-h forecast-error samples for the 4-week period. The rms error of the 3DVAR analysis using this new 𝗕 was similar to that using the 𝗕 generated by the ETKF ensemble. Consequently, in the following 3DVAR and hybrid data assimilation experiments, we used the tuned static background covariance generated by the 4-week ETKF ensemble.

### c. Weighting factors and ensemble covariance localization scales

There are two tunable parameters in the hybrid that may affect the performance of the hybrid [Eq. (3)]. One is the weighting factor that determines the relative influence of the static covariance and the ensemble covariance during the analysis. The other is the ensemble covariance localization scale. In the following hybrid experiments, we tried five different values for the weighting factor *β*_{1}, 1/*β*_{1} = 1.0, 0.8, 0.5, 0.2, 0. A value of 1/*β*_{1} = 1.0 indicated that all weight was placed on the static covariance; a value of 1/*β*_{1} = 0.5 indicated that the static covariance and ensemble covariance used equal weights.

As discussed in section 2a, the correlation matrix 𝗔 (which determines the horizontal localization for the ensemble covariance) is modeled by using a recursive filter (Hayden and Purser 1995). The correlation length scale of the recursive filter (*S*) determines the degree to which the ensemble covariance was localized. We tried four different values for *S*, *S* = 250, 500, 1000 and 1500 km. If iterated repeatedly, the recursive filter approximates a Gaussian-shaped response and the length scale *S* corresponds to an *e*-folding distance of *S _{e}* = 22

*S*= 707, 1414, 2828, 4242 km of the Gaussian response function (Barker et al. 2003). For each of the four weighting factors 1/

*β*

_{1}= 0.8, 0.5, 0.2, 0, the four different values of correlation length scales were tried.

## 4. Results

### a. Examples of flow-dependent increments

Figure 4 provides two examples showing that the ETKF ensemble can provide flow-dependent estimates of the background-error covariance and that the extended control variable method can utilize such information to provide flow-dependent analysis increments. Figure 4 shows the 850-hPa temperature increments from the assimilation of a single 850-hPa temperature observation that was 1 K warmer than the background forecast. A weighting factor of 1/*β*_{1} = 0 and a covariance localization scale of *S _{e}* = 22

*S*= 2000 km were used in both cases.

The first case corresponds to the ETKF ensemble at 0000 UTC 14 January 2003. The observation location was in the middle of a region with a strong temperature gradient. Positive increments of temperature were elongated along the isotherms while negative increments were located in the western part of the cold pool. The dipole-shaped increment suggested a more negative tilt to the analyzed thermal trough. In this first case, the ETKF ensemble primarily indicated uncertainty in the forecasting of the *position and orientation* of the baroclinic zone.

The second case corresponds to the ETKF ensemble at 0000 UTC 24 January 2003. In this case, the baroclinic zone had passed to the south and east of the observation location, and the simulated observation was again 1 K warmer than the background forecast. There was a positive increment in the cold air around the observation and negative increment in the warm air to the east. The dipole-shaped increment in this case suggested the observation should weaken the baroclinic zone east of the observation. In contrast to the first case, in the second case the ETKF ensemble indicated a disagreement on the *strength* of the baroclinic zone among ensemble members. Note also that the second case demonstrated that the ETKF ensemble covariance indicated that it was appropriate to extrapolate the observation information into the relative data void in the western Atlantic. Also note that the magnitude of maximum increment was smaller in the second case than the first case.

In summary, the ETKF ensemble estimated the flow-dependent structure and the magnitude of the background-error covariance. In both cases, the new hybrid assimilation system included these flow-dependent covariances during the minimization and thus produced flow-dependent analysis increments.

### b. Verification of the analyses and forecasts

In this section, we evaluate the characteristics of the analysis and forecast errors for the hybrid and the 3DVAR experiments. The analyses and forecasts were verified against the truth (i.e., the WRF nature run) for all model horizontal and vertical grids within the verification region. Table 2 shows the rms analysis error for wind (defined as the square root of the mean squared zonal plus meridional wind errors; same is used for all wind verification) and potential temperature. The rms analysis error for the hybrid is shown as a function of the weighting coefficient 1/*β*_{1} and the localization scales *S*_{e}. As shown in Table 2, for most of the combinations of 1/*β*_{1} and *S _{e}*, the hybrid analysis was more accurate than the 3DVAR, except for the combinations of small 1/

*β*

_{1}and relatively large

*S*. The optimal parameters were 1/

_{e}*β*

_{1}= 0.2,

*S*= 1414 km for the wind, and 1/

_{e}*β*

_{1}= 0.5,

*S*= 1414 km for the potential temperature. The best-performing hybrid analyses improved upon the 3DVAR analyses by 20.6% and 14.7% for the wind and the potential temperature, respectively. For all other variables of WRF, the hybrid analyses were also more accurate than the 3DVAR (not shown).

_{e}In Table 2, the difference between the 3DVAR and 1/*β*_{1} = 1.0 experiments was that the background forecast for the former was from the single control forecast, whereas for the latter it was from the ETKF ensemble-mean forecast. The improvement of the analyses for the 1/*β*_{1} = 1.0 experiment over the 3DVAR experiment was presumably because the background forecast from the ensemble mean was more accurate than the single control forecast.

Examining the columns of Table 2, for each localization scale, as 1/*β*_{1} decreased from 1.0 (i.e., more weight on the ensemble covariance), the rms analysis error first decreased and then increased. In general, the value of the optimal 1/*β*_{1} decreased with decreasing covariance localization scale. For example, for wind, the optimal 1/*β*_{1} decreased from 0.5 to 0.2 as *S _{e}*decreased from 2828–4242 to 707–1414 km. Relative to 1/

*β*

_{1}= 1.0, the improvement of the hybrid analysis as 1/

*β*

_{1}started to decrease demonstrated the improvement of the analysis through incorporating the flow-dependent ETKF ensemble covariance. For some localization scales, when further reducing 1/

*β*

_{1}after it reached the optimal value, the analysis became worse than the analysis for 1/

*β*

_{1}= 1.0. The value of 1/

*β*

_{1}that produced a worse analysis than 1/

*β*

_{1}= 1.0 decreased with decreasing localization scales.

Examining the rows of Table 2, with increasing weighting factor, 1/*β*_{1} (i.e., increasing weight on the static covariance), the rms analysis error of the hybrid became less sensitive to the ensemble covariance localization scales. As discussed in Buehner (2005) and Wang et al. (2007a), this was because the hybridization of the static covariance already served to ameliorate the detrimental effects of the sampling error in the ensemble covariance. Also note that in general, as 1/*β*_{1} decreased (i.e., more weight on the ensemble covariance), the optimal localization scale decreased (i.e., tighter localization). Corresponding results for the 12-h forecasts show similar variations of the errors with respect to *S _{e}* and 1/

*β*

_{1}.

Consider the horizontal (Fig. 5) distribution of the improvement of the hybrid analysis over the 3DVAR analysis. Corresponding plots for the 12-h forecasts show qualitatively similar results. The hybrid with the weighting factor 1/*β*_{1} = 0.2 and the localization scale *S _{e}* = 1414 km was used. Figure 5 shows the horizontal distribution of the difference between the hybrid rms analysis error and the 3DVAR rms analysis error for the wind over the verification domain. For each scheme, we first calculated the rms analysis error averaged over all levels in the verification domain and then plotted the error of the 3DVAR minus the error of the hybrid. Positive values in Fig. 5 indicate that the hybrid analysis was more accurate than the 3DVAR analysis. The rms analysis error of the hybrid was smaller than the 3DVAR throughout the domain. The hybrid had larger improvement near the data-sparse regions, such as the western Atlantic, Gulf of Mexico, and eastern Pacific. We also found that the improvement of the hybrid over 3DVAR was in general larger over the western part of the continent than the eastern part. Probably this is because the western continent is immediately downstream of eastern Pacific, where there are no radiosonde observations. The relatively large improvement of the analysis over the eastern Pacific produced relatively large forecast improvement, and thus relatively large analysis improvement downstream. These results are consistent with previous studies in Hamill and Snyder (2000) and Whitaker et al. (2004, 2008), who showed that the flow-dependent ensemble background-error covariances have the largest impact over and downstream of where the observational network is sparse.

Blue lines in Fig. 6 shows the rms analysis errors of the hybrid and the 3DVAR as a function of model levels. For both the hybrid and the 3DVAR, the rms error peaked at about 300 hPa for wind and 200 hPa for potential temperature, and then increased again in the lower stratosphere. The improvement of the hybrid (solid) over the 3DVAR (dotted) was nearly uniform except near the model top, where the improvement of the hybrid was much smaller than at lower levels. To understand why the analysis error was large near the model top and why the improvement of the hybrid was small at the model top, we conducted extra diagnostics and experiments.

We first calculated the time-averaged and domain-averaged vertical correlation of the 12-h forecast error between the model top level (level 27) and other levels. The correlation decreased quickly from 1.0 at level 27 to 0.16 at level 24 (about 100 hPa) for wind and to 0.1 at level 24 for potential temperature. The correlation then decreased to nearly zero at level 21 (about 180 hPa).

The observations were below 100 hPa in the study. The weak correlation between the model uppermost levels and the observation levels clearly has the potential to lead to relatively larger errors at the uppermost model levels. We have confirmed this through experiments with observations extended to 50 hPa. These revealed significantly reduced errors (by more than 50% at level 27) at the model top for both the hybrid and the 3DVAR. Significant improvement of the hybrid over the 3DVAR was also found even at the model top, consistent with the performance of the hybrid at lower levels (not shown).

In the hybrid system, the linear combination of the static covariance 𝗕 with the ensemble covariance ameliorates but does not remove the effect of the sampling errors of the vertical error correlation estimated by the ensemble. The weak correlation between the model uppermost levels and the levels below 100 hPa has the potential to increase the detrimental effects of sampling errors in the hybrid, since for fixed ensemble size, expected errors in the sample estimate of a correlation are largest when the correlation is small. We hypothesize that the small improvement of the hybrid relative to the 3DVAR at the model top shown in Fig. 6 was due to the sampling errors. To test this hypothesis, we conducted another hybrid experiment with the ensemble size increased from 50 to 100. The corresponding profile of the rms analysis error is shown in Fig. 6 as the black line. The improvements from the larger ensemble (and its reduced sampling error) were largest at the model top. With the larger ensemble, the hybrid also significantly improved on the 3DVAR even at the uppermost levels.

Figure 6 also shows the rms forecast errors at 12- (red lines) and 24-h (green lines) lead times. As the analysis, the hybrid forecast is more accurate than the 3DVAR forecasts nearly uniformly below level 25. The hybrid 12-h forecast is even more accurate than the 3DVAR analysis below level 15. Note that since the same LBCs were used to run the forecasts for both the hybrid and the 3DVAR, with increasing forecast lead times the influence of the LBCs becomes more dominant over the verification region and the difference of the forecasts between the hybrid and the 3DVAR becomes smaller. Plotting the forecast errors separately over the western and the eastern part of the verification domain shows that the differences of the forecast errors between the hybrid and the 3DVAR starts to diminish earlier over the western part than the eastern part (not shown), which verifies this hypothesis.

### c. Verification of the ETKF ensemble spread

In this section, we compare the 12-h ETKF ensemble forecast spread to the 12-h background forecast errors to see how well the ensemble spread estimated the magnitude of the background forecast errors. The results shown below correspond to the hybrid experiment with 1/*β*_{1} = 0.2 and *S _{e}* = 1414 km.

As mentioned in section 2b, the ETKF employs two tunable factors, the inflation factor (Π), and the factor (*ρ*) that estimates the fraction of the first-guess (ensemble mean) error variance projected onto the ensemble subspace, in the ETKF. These factors are intended to ameliorate the systematic underestimate on the error variance by the ETKF due to the limited ensemble size. Both factors were determined adaptively in the ETKF. Figure 7 shows the factors Π and *ρ* during the 4-week data assimilation period. The factors started to converge to constant values after 5 days (10 cycles). The average values after day 5 for Π and *ρ* were 3.4% and 28.7%, respectively.

To evaluate how well the ensemble spread performs as an estimate of forecast errors, Fig. 8 shows the vertical profile of the 12-h ensemble spread versus the rms error of the 12-h background forecast for wind and potential temperature. For each model level, the spread and rms error were averaged over all grid points within the verification domain over the verification period. The ETKF ensemble spread in general agreed with the first-guess rms error, except at the top levels where the ensemble spread was systematically smaller than the first-guess error. As discussed in section 2b, the adaptive inflation factor for the ETKF was determined by the innovations over all observations. Therefore, it only ensured that over the observation space, the ensemble spread on average agreed with the first-guess error. The relatively large discrepancy of the spread and the first-guess error at the top levels may be because the lack of observation above 100 hPa in this experiment. For the experiment where observations were extended to the top model level, the corresponding plot (not shown) shows that the magnitude of the spread and that of the first-guess error were about the same at the top model levels. Note that the adaptive inflation factor used in this study had the same value for all model grids and variables, which will not ensure matching of the spread and the first-guess error for each level and each variable. For future work, we may explore and apply the inflation factor so that it is not only adaptive in time but also adaptive in space (J. L. Anderson 2008, unpublished manuscript).

Comparing the ensemble spread to the rms first-guess error averaged over all grid points on a level like in Fig. 8 provides a first-order measurement of the relationship of the spread and the error at that level. Next we consider a second-order measure; namely, whether the ETKF spread can distinguish large background forecast error from small background forecast error. For example, Fig. 9 shows that for the first-order measure, the ETKF spread agreed with the rms background error at model level 17 (∼330 hPa). To provide a second-order measure for that level, we used a method similar to that used in Majumdar et al. (2001) and Wang et al. (2007a). We first produced a scatterplot where the ordinate and abscissa of each point were the squared first-guess error and the ensemble variance for a particular variable of interest, respectively. We collected samples over all grid points at level 17 over the verification domain and all times. We then divided these points into equally populated bins, arranged in order of increasing ensemble variance. Next, we averaged the squared forecast error and ensemble variance separately for each bin. Note that as discussed in Wang and Bishop (2003), comparing a squared first-guess error with the ensemble variance does not make sense since the former is a random draw from the latter if the ensemble variance is perfect. In other words, we need to average the squared first-guess error first before comparing it with the ensemble variance. We then plotted the square roots of the averaged squared forecast error against the square roots of the averaged ensemble variance. The connecting curve described the relationship between the ensemble spread and the rms forecast error. Figure 9 shows such a curve for the wind and potential temperature, where we used 12 bins. Results using three bins are smoother than that shown in Fig. 9. The curve in Fig. 9 closely aligned with the 45° line, which means the ETKF ensemble spread can distinguish large background errors from small background errors and thus make quantitative predictions of the expected background errors. To see if the result is statistically significant, we randomly picked half of the samples and plotted the same curve 10 times. Thus, in each bin we collected 10 pairs of square roots of averaged squared first-guess errors and the averaged ensemble covariance. We then calculated the standard deviations from these 10 samples in each bin. The bars in Fig. 9 shows the plus and minus three standard deviations with respect to both axes. This test suggested that the conclusion that the ETKF ensemble spread can well predict the expected forecast errors is statistically robust.

## 5. Conclusions and discussion

In this study, we introduced a new hybrid ETKF–3DVAR data assimilation method developed for WRF. The system was built based on the existing WRF 3DVAR. In the hybrid system, the background-error statistics were estimated from a combination of the traditional static covariance with the flow-dependent ensemble covariance. Specifically, the ensemble covariance was included in the variational update through extending control variables. At each data assimilation time, the deviations of the ensemble members about the mean were updated by the computationally efficient ETKF.

As a pilot study on the WRF ETKF–3DVAR hybrid data assimilation system, in this Part I we conducted an identical-twin OSSE experiment with a coarse grid spacing (200 km) and a relatively sparse network of simulated radiosonde wind and temperature observations in a region around the North America. (The accompanying Part II discusses real-data experiments). The hybrid provided ∼15%–20% more accurate analyses than the 3DVAR. The horizontal distribution of the rms analysis error showed that the improvement of the hybrid over the 3DVAR was larger over data-sparse regions than over data-dense regions. We also found that the ETKF ensemble spread in general was consistent with the rms first-guess error in both the first- and the second-order measurements.

The benefit of the hybrid may be diminished in real-world implementations with imperfect models and a denser observation network. Since an ensemble of simulations with an imperfect forecast model will not be able to describe the error covariances due to model error and the current state-of-the-art method to account for model error in the ensembles may not be effective, the positive impact of ensemble may thus diminish (Part II). Also, several previous studies have shown that as observation density increases, the impact of the flow-dependent covariances is diminished (Hamill and Snyder 2000; Whitaker et al. 2004, 2008).

Our choice of a coarse grid spacing in this study emphasized the synoptic scales. The real-world regional-scale application adopts a much finer resolution. As grid spacing is refined and mesoscale features are resolved, it is possible that these mesoscale features, which are more sensitive to the details of imperfect model parameterizations, may not be handled as well as the synoptic scales. On the other hand, increasing resolution may reduce the model error for synoptic scales and thus improve the ensemble covariance and the analysis of the hybrid. Also note that the hybrid may demonstrate its advantage in mesoscales because mesoscales are relatively poorly observed as compared with large scales, and they do not exhibit as strongly the balances assumed by the 3DVAR covariance model. Therefore, caution needs to be taken to extrapolate the results of this study to a more realistic context. Parameters such as the weighting coefficients and the localization length scale may need to be retuned.

A hybrid formulation using the ETKF has some appealing characteristics. The hybrid is straightforward to implement in the framework of operational variational schemes. It may be more robust than conventional ensemble filters when the ensemble size is small or when the model error is large. The computational expense of the ensemble update step using the ETKF is much more efficient than the conventional ensemble-based data assimilation schemes. Conceptually, there is no reason why a 4DVAR system could not be hybridized with an ETKF ensemble.

If one needs to run ensemble forecasts anyway, the only extra cost of the hybrid results from the analysis procedure. The hybrid formulation proposed here produced a modest increase in computational expense from the extension of the control variables in the variational minimization. In the current experiments, with this 50-member ensemble hybrid and horizontal covariance localization, the number of control variables was increased by 46% relative to 3DVAR. The computational expense of minimization for the hybrid was about twice that of 3DVAR. Here, to minimize the computational expense, the extended control variables in the current system did not apply localization for the vertical covariance estimated by the ensemble (the effect of the sampling error of the vertical covariance estimated by the ensemble was somewhat ameliorated by the static covariance). It is unclear how much the hybrid filter accuracy will improve with vertical covariance localization through extended control variables. It may be possible to include the vertical covariance localization at modest expense through one of several possible techniques. These include modeling the vertical correlation using truncated EOFs, similar to the vertical transform in the current WRF 3DVAR (Barker et al. 2004), or to use the extended control variables on a coarser grid. Other improvements in the efficiency of the hybrid may be possible through further optimization.

## Acknowledgments

The participation of the first author was funded by National Science Foundation Grants ATM-0205612, ATM-0205655, and a NOAA THORPEX grant. We thank Syed Rizvi and John Bray for their help with technical problems of WRF VAR at the early stages of the study. We appreciate Yongsheng Chen for sharing the WRF LBC ensemble perturbation scripts. Discussions with Jeff Whitaker improved the manuscript. We also appreciate the constructive comments from the reviewers.

## REFERENCES

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

## Footnotes

* The National Center for Atmospheric Research is sponsored by the National Science Foundation

*Corresponding author address:* Dr. Xuguang Wang, NOAA/Earth System Research Lab, Physical Sciences Division, 325 Broadway, R/PSD1, Boulder, CO 80305-3328. Email: xuguang.wang@noaa.gov

^{1}

For the OSSE, the dates were defined from the dates in the NCEP FNL analysis, which were used to start the data assimilation cycle and define the subsequent LBCs.