## 1. Introduction

The accuracy of numerical weather prediction (NWP) depends critically on the qualities of the initial conditions and the forecast model. The initial conditions of an NWP model usually come from data assimilation, a procedure that aims to estimate the state and uncertainty of the atmosphere as accurately as possible by combining all available information (including both model forecasts and observations, and their respective uncertainties).

In the data assimilation community, the ensemble Kalman filter (EnKF; Evensen 1994), which estimates the background error covariance with a short-term ensemble forecast, is drawing increasing attention. Since its first application in atmospheric sciences (Houtekamer and Mitchell 1998), the EnKF has been widely examined with different models at different scales (e.g., Hamill and Snyder 2000; Anderson 2001; Whitaker and Hamill 2002; Mitchell et al. 2002; Snyder and Zhang 2003; Zhang and Anderson 2003; Zhang et al. 2004, 2006, 2009a; Aksoy et al. 2005, 2006a,b; Houtekamer et al. 2005, 2009; Tong and Xue 2005; Dirren et al. 2007; Meng and Zhang 2007, 2008a,b; Whitaker et al. 2008; Torn and Hakim 2008a, 2009a; Buehner et al. 2010a,b). There are several recent review articles on the EnKF, including Evensen (2003, 2007), Hamill (2006), and Ehrendorfer (2007). However, none of these is dedicated to EnKF applications ranging from regional to meso- and convective scales in limited-area models (LAMs), which is the focus of the current review.^{1}

The first LAM applications of the EnKF appeared in Snyder and Zhang (2003) and Zhang et al. (2004), where synthetic radar data were assimilated into a cloud model. Those two studies demonstrated that the EnKF analysis can faithfully approximate the truth in terms of both dynamic and thermodynamic variables of a supercell storm (Fig. 1).

The first real-data application appeared in Dowell et al. (2004) in which the same EnKF was used to assimilate real radar observations for a tornadic supercell thunderstorm. This EnKF was further demonstrated to be comparable to a four-dimensional variational data assimilation (4DVar) system when implemented in the same cloud model (Caya et al. 2005). Similar to global-scale EnKF applications, the LAM EnKF progressed from earlier perfect-model Observing System Simulation Experiments (OSSEs) to more real-data, real-time, quasi-operational applications (Tong and Xue 2005; Barker 2005; Zhang et al. 2006; Chen and Snyder 2007; Meng and Zhang 2007; Fujita et al. 2007, 2008; Hacker et al. 2007; Meng and Zhang 2008a,b; Torn and Hakim 2008a, 2009a; Zhang et al. 2009a; Aksoy et al. 2009, 2010). The first pseudo-operational regional-scale EnKF system, based on the Weather Research and Forecasting model (WRF), was implemented at the University of Washington in January 2005 (Torn and Hakim 2008a). The 2-yr performance of this system was found to have slightly larger errors of wind and temperature fields, but smaller errors in moisture in comparison to deterministic output of different operational forecast models (Fig. 2).

Fig. 2. The performance of a quasi-operational WRF-EnKF system implemented at the University of Washington in comparison to selected operational forecasts in terms of RMS error (solid) and bias (forecast − observation, dashed) in 24-h forecasts of (a) temperature, (b) meridional wind, (c) geopotential height, and (d) dewpoint temperature from 1 Jan 2005 to 1 Jan 2007. All forecasts are verified against the same set of rawinsonde observations. The black line denotes the European Centre for Medium-Range Weather Forecasts (ECMWF) rawinsonde observation error standard deviation assumed during data assimilation (adapted from Torn and Hakim 2008a).

Citation: Monthly Weather Review 139, 7; 10.1175/2011MWR3418.1

Most recently, a WRF-based LAM EnKF system has also been used to assimilate real Doppler radar radial velocity observations for cloud-resolving hurricane analysis, initialization, and prediction (Zhang et al. 2009a). Deterministic forecasts initialized from the EnKF analysis were able to predict the rapid formation and intensification of Hurricane Humberto (2007) (Figs. 3c,d), providing analyses and forecasts superior to those from a WRF-based, three-dimensional variational data assimilation (3DVar) system. This EnKF data assimilation system is capable of ingesting airborne and ground-based radar observations and was implemented for real-time hurricane analysis and forecasts in 2008 and 2009 (Y. Weng 2010, personal communication).

Fig. 3. Real-data applications of convective-scale radar data assimilation with two independent EnKF systems: (a),(b) the 8 May 2003 Oklahoma City tornadic supercell storm case [adapted from Lei et al. (2008), courtesy of M. Xue at OU] and (c),(d) Hurricane Humberto (2007) [adapted from Zhang et al. (2009a)]. The 500-m forecast reflectivity [shaded in (b)] initialized from the EnKF analysis by assimilating radar radial velocity, radar reflectivity, and surface observations displays large agreement with the observed reflectivity valid at 2210 UTC 8 May 2003 [as shown in (a)]. Also shown in (b) are wind vectors and vertical vorticity at 1 km at the same time. A WRF-based EnKF analysis [shown in (d)] can successfully capture detailed structure of the radial velocity field of Hurricane Humberto [shown in (c)], which is observed at the 0.5° base scan of the KHGX radar at 0300 UTC 13 Sep 2007.

The potential of using ensemble-based data assimilation at regional scales is also being explored in several operational meteorological centers. For example, the Italian National Meteorological Service compared a local ensemble transform Kalman filter (LETKF) with its operational 3DVar algorithm for a regional NWP system at realistic model resolution, but with reduced observation density and without satellite radiance data and scatterometer winds. Their results showed that the LETKF clearly outperformed 3DVar when the same model configuration was implemented (Bonavita et al. 2010). Other operational centers in active pursuit of an ensemble-based data assimilation approach for improving LAM prediction systems include (but are not limited to) Environment Canada (L. Fillion 2010, personal communication), the Japanese Meteorological Agency, the Met Office (UKMO), and the National Centers for Environmental Prediction (NCEP) as reported at the fourth EnKF workshop held during 6–10 April 2010 in New York (more information is available online at http://hfip.psu.edu/EDA2010).

The performance of the EnKF can be subject to the particular model and to the scale of the geophysical system at hand, because of its strong dependence on the accuracy of the forecast model and the dynamics and predictability of the underlying weather systems. There are several differences between applications of the EnKF in global and limited-area models. First, the LAM EnKF needs a proper way to perturb lateral boundary conditions. Second, as a result of the smaller scale of the systems of interest, model error might be more severe since the dynamics and physics of meso- to convective-scale systems are less well understood and thus likely to be more poorly represented in the model. Furthermore, there are more inhomogeneities in the spatial and temporal coverage of observations and more data-void areas for the LAM EnKF applications, especially considering our increasing desire to explicitly resolve moist convection. Associated with the data-sparseness problem, the error features of any given mesoscale forecast are poorly known, and as a result it is more difficult to generate initial perturbations and to verify the LAM EnKF results relative to the results of its large-scale counterpart. Moreover, the error growth dynamics of meso- to convective-scale systems are substantially different from those of large-scale systems; they tend to be more multiscale in nature, more nonlinear, and non-Gaussian (Dévényi and Schlatter 1994). For practical purposes, different error dynamics demand a different treatment of model error.

The objective of this review is to summarize recent advances and challenges in the development and applications of the LAM EnKF, some of which were highlighted in a recent World Meteorological Organization/World Weather Research Programme/The Observing System Research and Predictability Experiment (WMO/WWRP/THORPEX) workshop on 4DVar and EnKF intercomparisons held in Argentina in November 2008, as well as in the third and the fourth EnKF workshops held in Austin, Texas, in April 2008, and in Rensselaerville, New York, in April 2010. Section 2 gives a brief introduction to the LAM EnKF. Section 3 provides an overview of issues specific to the LAM EnKF, including the generation of initial and boundary perturbations, as well as the respective errors in observations, modeling, and sampling. Progress obtained in the comparison and hybridization of the LAM EnKF with variational data assimilation methods is summarized in section 4. Several applications of the LAM EnKF beyond state estimation are presented in section 5. The conclusions are given in section 6.

## 2. An overview of the LAM EnKF

The Kalman filter is a linear, recursive estimator that produces the unbiased minimum variance estimate, in a least squares sense, under the assumption of unbiased noise processes (Kalman 1960; Kalman and Bucy 1961). To circumvent the coding of tangent linear and adjoint models, as well as the high computational cost posed by the weighting matrix calculation and the propagation of the background error covariance in the classic Kalman filter, Evensen (1994) proposed the use of an ensemble Kalman filter by representing the best estimate of the state vector and its covariance by using a random ensemble with a limited number of ensemble members.

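In the EnKF, the analysis state vector $\mathbf{x}^{a}_{i}$ for each member $i$ of the ensemble with size $n$ is obtained by adding to the background state vector $\mathbf{x}^{b}_{i}$ a weighted difference between observations $\mathbf{y}^{o}$ and the background vector projected to observation space through an observation operator $H$:

$$\mathbf{x}^{a}_{i} = \mathbf{x}^{b}_{i} + \mathbf{K}\left(\mathbf{y}^{o} - H\mathbf{x}^{b}_{i}\right), \qquad i = 1, \ldots, n.$$

Here the weighting matrix $\mathbf{K}$ (the Kalman gain) takes the standard form $\mathbf{K} = \mathbf{P}^{b}H^{\mathrm{T}}\left(H\mathbf{P}^{b}H^{\mathrm{T}} + \mathbf{R}\right)^{-1}$, where $\mathbf{P}^{b}$ is the background error covariance estimated from the ensemble and $\mathbf{R}$ is the observation error covariance. The analysis ensemble is then propagated forward from time $t-1$ to $t$ when the next observations are available:

$$\mathbf{x}^{b}_{i}(t) = M\left[\mathbf{x}^{a}_{i}(t-1)\right],$$

where $M$ is a nonlinear model. The EnKF procedure is a Monte Carlo approximation to the computationally overwhelming propagation of the full probability density function (PDF) forward in time, at least to the extent that the analysis ensemble is a random sample from the full PDF. The ensemble-based algorithm asymptotically approaches the Kalman filter in the limit of a large ensemble and Gaussian error distributions.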

Based on the method for generating the analysis ensemble, various EnKFs can be characterized as stochastic, where the analysis ensemble is obtained with the Kalman gain and randomly perturbed observations, or deterministic, where the analysis ensemble is created by deterministically transforming the forecast ensemble without perturbing the observations (Tippett et al. 2003).
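The stochastic variant described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than any operational implementation: it assumes a linear observation operator stored as a matrix `H`, a small state dimension so that the sample covariances fit in memory, and no covariance localization or inflation. The function name and argument layout are hypothetical.

```python
import numpy as np

def enkf_analysis_stochastic(Xb, y_obs, H, R, rng):
    """Perturbed-observation EnKF analysis step (illustrative sketch).

    Xb    : (m, n) background ensemble, one member per column
    y_obs : (p,)   observation vector
    H     : (p, m) observation operator, assumed linear here
    R     : (p, p) observation error covariance
    rng   : numpy.random.Generator used to perturb the observations
    """
    n = Xb.shape[1]
    # Ensemble perturbations about the background mean.
    Xp = Xb - Xb.mean(axis=1, keepdims=True)
    # The same perturbations projected to observation space.
    Yp = H @ Xp
    # Sample estimates of P^b H^T and H P^b H^T.
    PHt = Xp @ Yp.T / (n - 1)
    HPHt = Yp @ Yp.T / (n - 1)
    # Kalman gain K = P^b H^T (H P^b H^T + R)^{-1}, via a linear solve.
    K = np.linalg.solve((HPHt + R).T, PHt.T).T
    # Each member is updated with its own randomly perturbed observations.
    eps = rng.multivariate_normal(np.zeros(len(y_obs)), R, size=n).T
    Y_pert = y_obs[:, None] + eps
    return Xb + K @ (Y_pert - H @ Xb)

# Toy usage: three state variables, 20 members, one observation of variable 0.
rng = np.random.default_rng(0)
Xb = rng.normal(size=(3, 20))
H = np.array([[1.0, 0.0, 0.0]])
R = np.array([[0.25]])
Xa = enkf_analysis_stochastic(Xb, np.array([1.0]), H, R, rng)
```

A deterministic filter would replace the perturbed observations with a transformation of the perturbations `Xp`, so that the analysis spread is obtained without adding sampling noise.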

A stochastic filter first implemented in Houtekamer and Mitchell (1998) divides the ensemble into two or more subgroups to avoid a deficient ensemble spread in the case of small ensembles. This is accomplished by updating the state vectors of one subgroup ensemble using the weights calculated from a different subgroup ensemble. Houtekamer et al. (2009) showed that with their current operational configuration of four subensembles, rather good agreement between the ensemble mean error and the ensemble spread was obtained in a perfect-model context without any need for covariance inflation. This approach has been shown to be effective not only in global assimilation systems (Houtekamer et al. 2005, 2009), but also with LAMs (Charron et al. 2006). The deterministic method, however, has been more widely used in LAM applications than the stochastic method.
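The subgroup idea can be illustrated with a short sketch. It assumes one particular reading of the scheme, namely that the gain applied to each subgroup is computed from all members outside that subgroup; it also assumes a linear observation operator and omits localization. All names are hypothetical and the code is not taken from any of the cited systems.

```python
import numpy as np

def kalman_gain(X, H, R):
    """Sample Kalman gain from the ensemble X with shape (m, n)."""
    n = X.shape[1]
    Xp = X - X.mean(axis=1, keepdims=True)
    Yp = H @ Xp
    PHt = Xp @ Yp.T / (n - 1)
    S = Yp @ Yp.T / (n - 1) + R
    return PHt @ np.linalg.inv(S)

def subgroup_update(Xb, y_obs, H, R, n_groups, rng):
    """Update each subgroup with a gain estimated from the other members."""
    n = Xb.shape[1]
    members = np.arange(n)
    Xa = np.empty_like(Xb)
    for idx in np.array_split(members, n_groups):
        # Assumption: the gain comes from the complement of this subgroup,
        # so no member contributes to the weights used for its own update.
        others = np.setdiff1d(members, idx)
        K = kalman_gain(Xb[:, others], H, R)
        # Perturbed observations, one realization per member in the subgroup.
        eps = rng.multivariate_normal(np.zeros(len(y_obs)), R, size=len(idx)).T
        innov = y_obs[:, None] + eps - H @ Xb[:, idx]
        Xa[:, idx] = Xb[:, idx] + K @ innov
    return Xa
```

Because each member is excluded from the sample used to compute its own weights, the update is less prone to the systematic spread underestimation that arises when a small ensemble is used to both estimate and apply the gain.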

The most commonly employed deterministic method has the form of ensemble square root filters (EnSRF) as reviewed in Tippett et al. (2003), such as the serial EnSRF (Whitaker and Hamill 2002), the ensemble transform Kalman filter (ETKF; Bishop et al. 2001), and the ensemble adjustment Kalman filter (EAKF; Anderson 2001; Aksoy et al. 2009, 2010). The serial EnSRF (Whitaker and Hamill 2002) processes the observations one by one assuming independent observation errors, which has been shown to be effective in mesoscale-ensemble-based data assimilation (Snyder and Zhang 2003; Barker 2005; Tong and Xue 2005; Zhang et al. 2006; Fujita et al. 2007; Meng and Zhang 2008a,b; Torn and Hakim 2008a). However, this sequential method may become computationally inefficient if the observations get very dense in space. It also has difficulties in assimilating observations with correlated observation errors.
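The one-observation-at-a-time processing of the serial EnSRF can be sketched as follows. The sketch assumes a linear observation operator whose rows correspond to individual scalar observations and uncorrelated observation errors, and it leaves out covariance localization. The reduced gain factor `alpha` follows the scalar-observation form given by Whitaker and Hamill (2002); the function and variable names are hypothetical.

```python
import numpy as np

def ensrf_serial(Xb, y_obs, H, r_var):
    """Serial ensemble square root filter update (illustrative sketch).

    Xb    : (m, n) background ensemble, one member per column
    y_obs : (p,)   observations with mutually independent errors
    H     : (p, m) linear observation operator, row j maps the state to obs j
    r_var : (p,)   observation error variances
    """
    X = Xb.copy()
    n = X.shape[1]
    for j in range(len(y_obs)):
        xm = X.mean(axis=1)
        Xp = X - xm[:, None]
        hx_mean = H[j] @ xm                 # mean in observation space
        hxp = H[j] @ Xp                     # perturbations in obs space, (n,)
        hph = hxp @ hxp / (n - 1)           # scalar H P^b H^T
        pht = Xp @ hxp / (n - 1)            # vector P^b H^T, (m,)
        K = pht / (hph + r_var[j])          # gain applied to the mean
        # Reduced gain factor for the perturbations (no perturbed obs needed).
        alpha = 1.0 / (1.0 + np.sqrt(r_var[j] / (hph + r_var[j])))
        xm = xm + K * (y_obs[j] - hx_mean)
        Xp = Xp - alpha * np.outer(K, hxp)
        # The updated ensemble becomes the background for the next observation.
        X = xm[:, None] + Xp
    return X
```

With this choice of `alpha`, the analysis variance in observation space after a single observation equals `hph * r / (hph + r)`, which is the value the Kalman filter prescribes. The loop over observations also makes visible why the cost grows with observation density.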

The ETKF (Bishop et al. 2001) uses a transform matrix to directly transform the forecast error covariance to an analysis error covariance in a smaller subensemble space, thus reducing computational cost. Instead of assimilating data sequentially, the local ensemble Kalman filter (LEKF; Ott et al. 2004) updates independent grid points simultaneously using only observations in a localized subspace. By combining the ETKF and the LEKF, Hunt et al. (2007) proposed the LETKF, which is more efficient and flexible for nonlocal observations such as satellite radiances. The LETKF has been shown to be very useful in convective systems (Miyoshi and Aranami 2006). Encouraging results have been achieved in 4DEnKF, which assimilates observations instantaneously as they are measured by expanding the state vector through finding the linear combination of the ensemble trajectories that best fits the observations at the appropriate times (Hunt et al. 2004; Houtekamer and Mitchell 2005). To the best of our knowledge, the 4DEnKF has so far been applied with the Lorenz model (Hunt et al. 2004) and with global models (Whitaker et al. 2008; Houtekamer et al. 2009), but not with LAMs.

Because of the large dimension of the model state vector and the large number of observations needed in the LAM-EnKF scenario, various algorithms of the LAM EnKF have been proposed to improve computational efficiency. Anderson and Collins (2007) proposed a parallel ensemble Kalman filter in the least squares framework by arbitrarily partitioning the background ensemble to a set of processors that can be easily implemented into a variety of EnKFs. For a linear observation operator, the result of this parallel Kalman filter is the same as that of a single-processor filter, and similar results can be obtained for a nonlinear observation operator. The parallel Kalman filter has been implemented in the National Center for Atmospheric Research (NCAR) Data Assimilation Research Testbed (DART) and was used in Aksoy et al. (2009, 2010). A preemptive forecast method was also proposed to reduce the computational cost by propagating current analysis increments to update the future forecast, assuming that preemptive forecasts were of similar quality to the updated model forecasts (Etherton 2007). Algorithms that are efficient in terms of covariance calculation will be described in sections 3b and 3c.

## 3. Issues specific to the LAM EnKF

### a. Ensemble initialization

The LAM EnKF may be initialized from an existing global or larger-scale ensemble (Zhang et al. 2010). If a global ensemble forecast is not readily accessible, the most common alternative is to randomly sample the climatological uncertainties of the initial state (Aksoy et al. 2006b) or to derive random perturbations from the background error statistics of an existing 3D/4DVar system (e.g., Barker 2005; Meng and Zhang 2008a,b; Torn and Hakim 2008a), as is done for the global EnKF (e.g., Houtekamer et al. 2005; Whitaker et al. 2008).

How to generate an initial ensemble for convective-scale EnKF systems remains an open question because of the lack of accurate error statistics. Random sampling of a static variational background error covariance may not be applicable for convective scales because of its balance constraint and large length scale. For many convective-scale applications, Gaussian noise can be added to a horizontally uniform background (sounding) for all state variables (Snyder and Zhang 2003; Tong and Xue 2005) or for some cases, only the horizontal wind components (Aksoy et al. 2009, 2010), or only where rain was observed (Caya et al. 2005).
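The simplest of these convective-scale initializations, Gaussian noise added to a horizontally uniform sounding, might look like the sketch below. The sounding values, grid size, and noise amplitude are placeholders chosen for illustration only; the cited studies use their own amplitudes and, in some cases, restrict the noise to particular variables or regions.

```python
import numpy as np

def init_ensemble_from_sounding(sounding, ny, nx, n_members, noise_std, rng):
    """Build an ensemble for one variable from a single vertical profile.

    sounding  : (nz,) profile, taken to be horizontally uniform
    noise_std : standard deviation of the added Gaussian noise (assumed value)
    Returns an array of shape (n_members, nz, ny, nx).
    """
    nz = sounding.shape[0]
    # Horizontally uniform background: the same profile at every grid column.
    base = np.broadcast_to(sounding[:, None, None], (nz, ny, nx))
    # Independent Gaussian noise for every member and grid point.
    noise = rng.normal(0.0, noise_std, size=(n_members, nz, ny, nx))
    return base + noise

# Hypothetical potential temperature profile (K) on five levels.
theta = np.array([300.0, 302.0, 305.0, 310.0, 318.0])
ens = init_ensemble_from_sounding(theta, ny=8, nx=8, n_members=20,
                                  noise_std=1.0, rng=np.random.default_rng(0))
```

Noise that is uncorrelated from point to point carries no balance or spatial structure, which is one reason the first few assimilation cycles are usually needed before the ensemble covariances become meaningful.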

### b. Boundary perturbations and nesting

Compared to initial condition uncertainty, proper representation of boundary uncertainties may have a larger impact on the LAM EnKF. Lack of sufficient ensemble spread on the lateral boundaries may propagate inward and lead to filter divergence. Filter divergence means that the ensemble mean deviates farther and farther away from the truth as a result of the underestimated variance of the forecast ensemble, which results in more weight being given to the prior (model forecast) than to the observations. An additional consideration in generating boundary perturbations is the necessity of flow-dependent perturbations.
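The link between underestimated spread and filter divergence can be made concrete with a scalar example. For one directly observed variable the gain reduces to K = σ_b² / (σ_b² + σ_o²), and the analysis error variance actually obtained with any gain K is (1 − K)² σ_b² + K² σ_o², assuming uncorrelated background and observation errors. The numbers below are hypothetical.

```python
def scalar_gain(var_b, var_o):
    """Kalman gain for a single directly observed scalar."""
    return var_b / (var_b + var_o)

def analysis_error_variance(assumed_var_b, true_var_b, var_o):
    """Actual analysis error variance when the gain is built from an
    assumed (ensemble-estimated) background variance."""
    k = scalar_gain(assumed_var_b, var_o)
    return (1.0 - k) ** 2 * true_var_b + k ** 2 * var_o

TRUE_VAR_B = 4.0   # hypothetical true background error variance
VAR_O = 1.0        # hypothetical observation error variance

for spread_var in (4.0, 1.0, 0.1):
    k = scalar_gain(spread_var, VAR_O)
    var_a = analysis_error_variance(spread_var, TRUE_VAR_B, VAR_O)
    print(f"ensemble variance {spread_var:4.1f}: gain {k:.2f}, "
          f"actual analysis error variance {var_a:.2f}")
```

With a correctly specified spread the gain is 0.80 and the analysis error variance drops to 0.80. When the ensemble variance collapses to 0.1 the gain falls to about 0.09, the observation is nearly ignored, and the analysis error variance stays near 3.31, close to the background value.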

Boundary perturbations can be generated using the same methods as ensemble initialization described above, such as random sampling from a specified multivariate Gaussian distribution (Barker 2005; Torn et al. 2006; Torn and Hakim 2008a; Meng and Zhang 2008a), scaling deviations from randomly drawn climatological time series, or by using a smaller sample of global ensemble forecast or limited-area ensemble analysis to directly perturb only a subset of the boundary grid points, while perturbing all other points using an assumed covariance model (Torn et al. 2006). Torn et al. (2006) showed that the errors caused by perturbing the boundary condition around a mean following a certain form of PDF in comparison to a perfect “global” EnKF that extended beyond the limited-area domain were mostly constrained to areas near the boundaries. On the other hand, since higher spatial resolution may need more frequent boundary updates, random climatological states may not be adequate in terms of temporal frequency for LAM boundary perturbations (Dirren et al. 2007). Meng and Zhang (2008b) perturbed the lateral boundaries of the data assimilation domain, which is the inner domain, by updating the outer domain using the 6-h NCEP final analyses (FNL) perturbed by random draws from the WRF-3DVar background error covariance scaled to approximate the forecast uncertainty of the Global Forecast System (GFS) at different lead times. This flow-dependent perturbation method was effective in preventing the system inside the assimilation domain from drifting away from the truth.

Besides the boundary conditions of the outermost domain, nesting is also an important issue of the LAM EnKF across different domains. Since the inner domain usually decreases the horizontal grid spacing by a factor of 3 (thus the number of grid points increases by a factor of 9 over the same area of the coarse domain), performing the EnKF analysis and the model forecast in the inner domains is much more computationally expensive. One approach to deal with this problem is to use a coarser-resolution ensemble to estimate the background error covariance while the EnKF analysis and the control forecast are performed on a higher-resolution grid (Gao and Xue 2008; Yang et al. 2009).

How to perform the data thinning and covariance localization for nested domains is another important issue. A commonly used approach is one where all domains assimilate the same data and use a fixed radius of influence. Zhang et al. (2009a) proposed a successive covariance localization (SCL) technique in which a larger radius of influence (ROI) is used to assimilate a relatively small subset of observations in the coarser domains, while a smaller ROI is used to assimilate higher-density observations in the inner domains. This method will be detailed in section 3e(2) for sampling error treatment. Performing data assimilation in all domains may cause inconsistencies and imbalances near the boundaries, whose impacts have not received much attention in the literature.

### c. Observational issues

Different observing platforms at diverse spatial and temporal resolutions may have dissimilar impacts on the EnKF performance and the quality of the initial conditions (ICs) at different scales. For regional-scale LAM applications, radiosonde observations have been found to have a larger impact on the quality of ICs and forecasts than wind profiles and surface observations (Meng and Zhang 2008a). However, surface observations are a very important data source for mesoscale systems due to their higher resolution and thus their capability of reflecting more detailed mesoscale features, which reduces spinup time relative to coarse observations. Surface observations, including precipitation, have been shown to be useful for improving the simulation skill of mesoscale convective systems (MCSs), such as the location and intensity of the dryline, frontal boundaries, as well as the depth and structure of the planetary boundary layer (PBL; Fujita et al. 2007, 2008), although the results are not always positive (Miyoshi and Aranami 2006). Surface pressure is not only useful in reconstructing three-dimensional fields in large-scale models (Whitaker et al. 2004; Anderson et al. 2005), but also very helpful in retrieving different structures at the mesoscale (Dirren et al. 2007). Surface observations may also be beneficial for simulating boundary layer processes (Hacker and Snyder 2005; Aksoy et al. 2005). It has been demonstrated that the surface observations may affect the entire PBL (Hacker et al. 2007). How to effectively assimilate surface observations is important because PBL structure plays a key role in determining the type of deep moist convection.

However, because of the error from the difference between the real and the model terrain height and uncertainties in the parameterization of boundary layer and land surface physical processes, the assimilation of surface observations has been a major challenge in the mesoscale data assimilation field. Fujita et al. (2007) showed that using different physical parameterization schemes for different members can improve the quality of background error covariance and thus noticeably reduce forecast error, especially for thermodynamic variables. However, Fujita et al. also found that the forecast error when assimilating surface observations remained smaller than without assimilation only for 6 h after the assimilation period. The EnKF performance may also be sensitive to different formulations of the same observations. For example, the assimilation of altimeter setting, which is the surface pressure reduced to sea level using the standard atmosphere temperature profile, and is merely a function of surface pressure and terrain height, may result in a larger analysis error reduction than assimilating the 1-h surface pressure tendency when depicting mesoscale pressure patterns (Wheatley and Stensrud 2010). The EnKF is also found to have a reduced analysis error when assimilating potential temperature and dewpoint instead of temperature and specific humidity at the surface, which is likely due to the larger variability and less Gaussian distribution of the latter variables (Fujita et al. 2007).

To describe more detailed mesoscale features, observations with higher resolution than conventional surface observations are required. More and more attention is being paid to the assimilation of remotely sensed observations for the LAM EnKF. Doppler radars [e.g., Doppler-on-Wheels (DOW) and Weather Surveillance Radar-1988 Doppler (WSR-88D), airborne] may be the only observing platform that has sufficient temporal and spatial coverage to constrain convective clouds. The effectiveness of using the EnKF to assimilate Doppler radar velocities for supercell storms was first demonstrated in OSSEs in Snyder and Zhang (2003) (Fig. 1) and with real data in Dowell et al. (2004) and Lei et al. (2008). Good agreement was achieved between the EnKF prediction of the finescale supercell structure and the observations (Figs. 3a,b). Doppler radar velocities have also been shown in recent studies to improve the accuracy of both track and intensity forecasts for tropical cyclones (Zhang et al. 2009a; Weng et al. 2011), whereby the assimilation of radial velocity could draw the WRF-EnKF analysis of radial velocity to be very close to the observations (Figs. 3c,d). Assimilation of airborne radar Doppler velocity may have similar impacts as those of ground-based radars on hurricane forecasting (Y. Weng 2010, personal communication). Rapid-scan Phased Array Radar (PAR) observations have also been used to achieve a better analysis and subsequent ensemble forecast of an MCS while using a shorter assimilating time than WSR-88D data (Yussouf and Stensrud 2010).

The effectiveness of assimilating radar reflectivity and other hydrometeor-related quantities for convective-scale analysis and forecasts, on the other hand, remains an open question. Radar reflectivity has been shown to be less effective than Doppler radial velocity (Tong and Xue 2005). The likely non-Gaussian error distribution, weak cross correlations between state variables, inherent small-scale variability, and strong dependence of these quantities on the accuracy of model microphysics schemes appear to be the main limiting factors. Nevertheless, the assimilation of differential reflectivity *Z*_{DR}, reflectivity difference *Z*_{dp}, and specific differential phase *K*_{DP} beyond radar reflectivity and/or radial velocity seems to improve the storm analysis in the OSSEs of Jung et al. (2008). Moreover, the assimilation of even the echo-free radar observations, defined as radar reflectivity below a threshold value of 5 dB*Z*, sometimes effectively suppresses spurious convection (Aksoy et al. 2009).

With large volumes of radar observations recorded at a much higher resolution than the forecast model grid spacing for the EnKF data assimilation, significant data thinning of observations may be necessary. The process of combining multiple observations into one high-accuracy “super” observation (SO) is often referred to as “superobbing.” A data thinning and quality control procedure was developed in Zhang et al. (2009a) to generate SOs for ground-based Doppler radars (e.g., WSR-88Ds), with the observation error for radial velocity assumed to be 3 m s^{−1}. To avoid averaging of radial velocities (Vr) with significantly different directions, the averaging bin is defined as the area in a sector between two arcs that must satisfy all the following conditions: 1) the angle at center is no larger than 5°, 2) the length of the outer arc is no larger than 5 km, and 3) the distance between the two arcs is no larger than 5 km. Additional quality control procedures are applied during the superobbing to minimize the impact of ground clutter and to correct the failures in the dealiasing step, while further quality controls are implemented in the processing of the EnKF analysis (Zhang et al. 2009a). A similar procedure was used in Weng et al. (2011) in assimilating airborne Doppler radar observations.
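The three geometric conditions on the averaging bin can be written down directly. The sketch below only encodes the limits quoted above (5° sector angle, 5-km outer arc, 5-km radial depth) and a plain average of the radial velocities in a bin; the quality control and dealiasing checks of the actual procedure are not represented, and all function names are hypothetical.

```python
import math
from statistics import fmean

MAX_ANGLE_DEG = 5.0        # limit on the angle at the radar
MAX_OUTER_ARC_KM = 5.0     # limit on the length of the outer arc
MAX_RADIAL_DEPTH_KM = 5.0  # limit on the distance between the two arcs

def is_valid_bin(r_inner_km, r_outer_km, angle_deg):
    """Check whether a sector between two arcs satisfies all three limits."""
    outer_arc_km = r_outer_km * math.radians(angle_deg)
    return (angle_deg <= MAX_ANGLE_DEG
            and outer_arc_km <= MAX_OUTER_ARC_KM
            and (r_outer_km - r_inner_km) <= MAX_RADIAL_DEPTH_KM)

def max_sector_angle_deg(r_outer_km):
    """Largest sector angle allowed for a bin with the given outer range."""
    if r_outer_km <= 0.0:
        return MAX_ANGLE_DEG
    arc_limited = math.degrees(MAX_OUTER_ARC_KM / r_outer_km)
    return min(MAX_ANGLE_DEG, arc_limited)

def superob(radial_velocities):
    """Combine the radial velocities (m/s) falling in one bin into one SO."""
    return fmean(radial_velocities)
```

Close to the radar the 5° limit is the binding constraint; beyond a range of roughly 57 km, where a 5° sector already spans a 5-km arc, the arc-length limit takes over and the permitted sector narrows with distance.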

As a special case of data thinning, subsampling of observations has been found to produce results similar to those obtained when much more data are used (Torn and Hakim 2008b, 2009b). Torn and Hakim (2008b) found that assimilating the *O*(100) most significant observations may produce a forecast-metric variance similar to that obtained by assimilating thousands of observations with a statistically significant metric-mean change. How frequently in time the observations should be assimilated has remained empirical. Too frequent assimilation may not allow enough time for the short-term ensemble forecast to be initialized appropriately and to develop meaningful background error covariance, and for the imbalance that was introduced during the assimilation cycle to be adjusted (e.g., Meng and Zhang 2008a).

In addition to radars, an examination has already begun on the assimilation of satellite observations and/or satellite-derived products in the LAM EnKF. Challenging Minisatellite Payload (CHAMP) radio occultation refractivity has been found beneficial in regions where conventional high-quality observations are sparse (H. Liu et al. 2008). The model performances after assimilating univariate and multivariate specific humidity retrieved from the Atmospheric Infrared Sounder (AIRS) were compared using the LETKF (J. Liu et al. 2009). It was found that the largest improvement was obtained by multivariate specific humidity assimilation when the specific humidity was updated by all data types.

Compared to applications in global models (Houtekamer et al. 2005; Miyoshi and Sato 2007; Whitaker et al. 2009; Aravéquia et al. 2011), direct assimilation of satellite radiance with the LAM EnKF is still in its infancy. At the fourth EnKF workshop held during 6–10 April 2010 in New York, Z. Liu reported a better performance in assimilating Advanced Microwave Sounding Unit (AMSU) radiance for a tropical cyclone event in comparison to 3DVar (see online at http://hfip.psu.edu/EDA2010/LiuZQ.pdf). However, many issues remain to be explored in satellite radiance assimilation. For example, observation bias correction requires long-term stationary statistics of satellite observations over large areas, which is usually not available for mesoscale models. Additionally, mesoscale models usually do not have a high-enough model top, which may induce difficulties in the forward operator for some radiance measurements (the response function usually spans a large altitude range). It may be possible and necessary for the LAM EnKF to take advantage of existing bias correction approaches for satellite observations that are already in place for global (operational) NWP models.

Besides the aforementioned in situ and remotely sensed data, some special synthesized object-oriented observations, such as the vortex position of tropical cyclones, can also be easily assimilated by the EnKF and have been demonstrated to be helpful in improving hurricane forecast ability (Chen and Snyder 2007; Torn and Hakim 2009a). Wu et al. (2010) found that the EnKF performs well in initializing tropical cyclones after assimilating synthetic observations including position, intensity, and size derived from dropsondes and satellites.

### d. Treatment of model error

Model error may be the single most critical challenge that limits all aspects of NWP. It can result from inadequate parameterization of subgrid-scale physical processes, numerical inaccuracy, truncation error, ill-defined boundary conditions, or other random errors. Model error, especially at the mesoscale, is generally difficult to identify and deal with because of the chaotic nature of the atmosphere, its flow-dependent characteristics, and the lack of sufficiently dense observations for verification (e.g., Stensrud et al. 2000). The presence of model error often results in both a large bias in the ensemble mean and too little spread, which may ultimately cause the ensemble forecast to fail. Model error may lead to ensemble spread deficiency because of the missing model error term in the ensemble-based calculation of forecast error covariance (Hamill 2006). Additionally, model error components tend to be projected onto more stable modes, which will also limit the growth of ensemble spread (Mitchell et al. 2002). Since the EnKF depends critically on the quality of the first guess and the forecast error covariance estimated from a short-term ensemble forecast, the presence of model error may lead to poor filter performance and even filter divergence (e.g., Hamill and Whitaker 2005; Houtekamer et al. 2009; Li et al. 2009).

Several ad hoc approaches that have been used to account for model error in the context of the EnKF, including covariance inflation, bias correction, and/or the use of multimodel or multiphysics ensemble, will be reviewed in detail in this section. Alternative approaches that include simultaneous state and parameter estimation will be discussed in section 5a.

Additive covariance inflation, where a set of ensemble perturbations that can reflect forecast uncertainty is added to the forecast ensemble (e.g., Mitchell et al. 2002; Hamill and Whitaker 2005) has been shown to be effective in improving the performance of the LAM EnKF (Barker 2005). More detailed discussion on covariance inflation will be given in section 3e for the issue of sampling error.

Though most statistical data assimilation methods assume that the model forecast (or first guess) is unbiased, that is rarely the case. Model bias error can systematically cause the model to drift away from the truth. Since bias is a part of the model error, a better performance of the EnKF may be achieved through both bias correction and the treatment of random error (Li et al. 2009). Using a multimodel or multiphysics ensemble (discussed next) and simultaneous state and parameter estimation (section 5a) may also help to correct the bias.

Over the past decade, there has been an increasing amount of evidence demonstrating the advantages and effectiveness of using multimodel ensembles (over single-model ensembles) to account for model error in the prediction system (Krishnamurti et al. 1999; Palmer et al. 2004; Weigel et al. 2008; Weisheimer et al. 2009). A multimodel ensemble may provide a more realistic ensemble spread (better error covariance) and even reduce the error or the bias in the ensemble mean estimate (better first guess), which shows great potential for improving the EnKF (although it may lead to artificial clustering with little correspondence to the forecast uncertainties of the day). However, given technical implementation difficulties associated with inherent differences in model numerics, dynamical coordinates, and/or (prognostic) state variables among different forecast models, multimodel ensembles have not been used for the LAM or global EnKF.

Since a considerable part of the model error comes from parameterization of subgrid-scale physical processes (e.g., Stensrud et al. 2000), a more practical approach is to use a variety of physical parameterization schemes available in the same forecast model for different members to account for model uncertainties. Similar to the multimodel approach, each of the physics schemes in the multiphysics ensemble usually has its own advantages and disadvantages based on its own underlying physical assumptions whose accuracy is hard to distinguish a priori. For example, Wang and Seaman (1997) compared the performance of a mesoscale model with different cumulus parameterization schemes for several synoptic events and found that no particular scheme performs consistently better than another scheme. Multiphysics ensembles have been shown to be effective in accounting for model error in both global (Houtekamer et al. 1996) and mesoscale (Stensrud et al. 2000) ensemble forecast systems.

A multiphysics ensemble to account for model error in the LAM EnKF was first reported in Meng and Zhang (2007) through OSSEs and Fujita et al. (2007) for real surface data assimilation. Meng and Zhang (2007) demonstrated how a multiphysics ensemble may greatly improve the performance of the EnKF, especially for thermodynamic variables, in the presence of model error introduced by physical parameterization schemes. The effectiveness of a multiphysics ensemble has been confirmed in follow-up real-data experiments (Meng and Zhang 2008a,b), which showed that the use of a multiphysics ensemble results in consistently smaller error for different variables throughout the troposphere than the use of a single-physics ensemble (Fig. 4). A multiphysics ensemble contributes to the performance of the EnKF, likely through improved ensemble mean estimates, increased ensemble spread, and a more effective background error covariance. Fujita et al. (2007) compared the method of generating an ensemble by perturbing the physics with a multiphysics approach to other methods, such as perturbing only the initial field and perturbing both the initial field and the physics. It was shown that using a multiphysics ensemble may result in larger variance of temperature and dewpoint. This conclusion is consistent with the larger improvement obtained in the thermodynamic variables with a multiphysics ensemble in Meng and Zhang (2007, 2008a,b).

Fig. 4. The impact of using a multiphysics ensemble (EnKF_m; solid red) to account for model error originating from physical parameterization schemes on the performance of a WRF-EnKF in comparison to a single-physics ensemble (EnKF_s; dashed red), 3DVar (solid blue), and FNL_GFS (solid black) in terms of month-averaged RMSEs of 12-h forecast of (a) horizontal wind speed, (b) temperature, and (c) water vapor mixing ratio for the entire month of June 2003 [adapted from Meng and Zhang (2008b)].

Citation: Monthly Weather Review 139, 7; 10.1175/2011MWR3418.1

### e. Sampling error, covariance inflation, and localization

As a result of computational constraints, only a limited ensemble size can be afforded in the EnKF, which will result in some sampling error, especially in the presence of model error and nonlinearity. The number of members sufficient for the EnKF to minimize the impact of sampling error remains an open question. Variables with a weaker correlation may be more vulnerable to sampling error (Barker 2005). Until now, most published LAM EnKF studies have used an ensemble size between 30 and 100 (Barker 2005; Torn and Hakim 2008a; Zhang et al. 2009a). It is possible to artificially augment the ensemble size and to alleviate sampling error by including a series of perturbed state vectors from each forecast run at time levels properly selected around the analysis time, as proposed by Xu et al. (2008). Because of the limited ensemble size, the EnKF generically suffers from a rank deficiency problem in which only part of the phase space can be spanned by the ensemble. As a result, the ensemble spread tends to be systematically underestimated.

#### 1) Covariance inflation

The underestimation of ensemble spread is commonly treated by covariance inflation through multiplicative (Anderson 2001; Whitaker and Hamill 2002) or additive (e.g., Mitchell et al. 2002; Houtekamer et al. 2005) scaling, or covariance relaxation (Zhang et al. 2004). Multiplicative covariance inflation is achieved by multiplying all ensemble perturbations before or after the EnKF analysis with a constant slightly larger than 1 [e.g., 1.05 in Whitaker and Hamill (2002)]. There are reported benefits of using a time-dependent, but spatially invariant, inflating factor through matching the ensemble spread to the forecast error in the LAM EnKF (Barker 2005; Bonavita et al. 2008). However, multiplicative covariance inflation with a spatially constant inflation factor may cause a model to become unstable because of excessive spread in data-sparse regions. Anderson (2009) proposed a Bayesian algorithm that determines a spatially and temporally varying adaptive inflation factor for each element of the model state vector. This method has been shown to be effective in producing a smaller posterior error and a more consistent variance when sampling error dominates and across a variety of observation densities and frequencies. The additive method, which was described in section 3d, has not been widely used in the LAM EnKF, likely because of difficulties in constructing proper additive perturbations.
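Multiplicative inflation with a constant factor can be sketched in a few lines. This is a minimal illustration, assuming the ensemble is stored as a list of members, each a list of state values; the function names are hypothetical and the default factor of 1.05 simply echoes the example quoted above.

```python
def ensemble_mean(ensemble):
    """Mean of each state element across the ensemble members."""
    n_members = len(ensemble)
    return [sum(member[i] for member in ensemble) / n_members
            for i in range(len(ensemble[0]))]


def inflate_multiplicative(ensemble, factor=1.05):
    """Scale every member's deviation from the ensemble mean by `factor`.

    The ensemble mean is unchanged; the variance grows by factor**2.
    """
    mean = ensemble_mean(ensemble)
    return [[m + factor * (x - m) for x, m in zip(member, mean)]
            for member in ensemble]


def ensemble_variance(ensemble):
    """Unbiased sample variance of each state element."""
    n_members = len(ensemble)
    mean = ensemble_mean(ensemble)
    return [sum((member[i] - mean[i]) ** 2 for member in ensemble)
            / (n_members - 1)
            for i in range(len(mean))]
```

Because the same factor is applied to every element, the sketch also makes the drawback noted above visible: elements that no observation ever constrains are inflated at every cycle as well.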

Covariance relaxation (Zhang et al. 2004) relaxes the posterior (analysis) perturbations $(\mathbf{x}^{a})'$ to the prior forecast perturbations $(\mathbf{x}^{b})'$:

$$(\mathbf{x}^{a}_{\mathrm{new}})' = (1 - \alpha)\,(\mathbf{x}^{a})' + \alpha\,(\mathbf{x}^{b})',$$

where $\alpha$ is the covariance relaxation coefficient. For example, $\alpha = 0.5$ means 50% of the analysis perturbation is directly from the perturbations of the prior forecast ensemble. The inflation results from a generally larger ensemble spread of forecast than analysis. This method only inflates those grid points that are updated by observations, thus avoiding the overinflation deficiency of the conventional inflation method. Though the relaxation method is ad hoc and likely violates the dominant balances in the system, it has been found useful in properly alleviating the inbreeding problem. Meng and Zhang (2007, 2008a,b) and Zhang et al. (2009a) used $\alpha = 0.7$ for real-data applications. This covariance relaxation method was also used in Torn and Hakim (2008a) for their quasi-operational LAM-EnKF system.
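A minimal sketch of covariance relaxation follows, assuming the weighted form implied by the text: a fraction `alpha` of each new analysis perturbation is taken from the prior perturbation and the remainder from the original analysis perturbation. The function name and the list-of-lists ensemble layout are assumptions made for illustration.

```python
def relax_to_prior(prior, posterior, alpha=0.5):
    """Blend analysis perturbations with prior (forecast) perturbations.

    new' = (1 - alpha) * analysis' + alpha * prior'
    The relaxed perturbations are added back to the analysis mean.
    """
    n_members = len(prior)
    size = len(prior[0])
    prior_mean = [sum(m[i] for m in prior) / n_members for i in range(size)]
    post_mean = [sum(m[i] for m in posterior) / n_members for i in range(size)]
    relaxed = []
    for xb, xa in zip(prior, posterior):
        member = []
        for i in range(size):
            pert = ((1.0 - alpha) * (xa[i] - post_mean[i])
                    + alpha * (xb[i] - prior_mean[i]))
            member.append(post_mean[i] + pert)
        relaxed.append(member)
    return relaxed
```

At a grid point that no observation updates, the prior and analysis perturbations are identical, so the blend leaves that point unchanged. This is the sense in which relaxation inflates only the updated points.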

Localized covariance inflation may also be achieved through either an additive or multiplicative method by inflating only the areas updated by observations (Tong and Xue 2005) or only adding random perturbations to the affected areas (Caya et al. 2005; Dowell and Wicker 2009). Note that noise sometimes introduced through the multiplicative or additive methods to avoid filter divergence may degrade the analysis if it interferes with the dynamic balance of the ensemble perturbation (Peña et al. 2010).

#### 2) Covariance localization

In covariance localization, the weights $\rho_V$ and $\rho_H$ are functions that decrease smoothly from one at the observation point to zero at a certain distance from the observation, according to the pattern determined by the localization function. The subscripts $V$ and $H$ denote vertical and horizontal, respectively, and “∘” denotes entry-wise multiplication. The impact of an observation is thus confined within a limited distance, usually called the radius of influence (ROI). Localization may not only decrease spurious distant correlation, but also reduce the computational cost and alleviate the rank-deficiency problem caused by the limited ensemble size.

For real-data LAM applications, a horizontal ROI of 1000–2000 (60–150) km is often used for standard radiosonde (surface) observations (e.g., Fujita et al. 2007; Meng and Zhang 2008a,b; Torn and Hakim 2008a), while a much smaller ROI (6–8 km) is used for radar observations (e.g., Aksoy et al. 2009, 2010). However, the selection of an optimum ROI remains an area of active research. The ROI should depend on ensemble size, observation type and density, model error and resolution, as well as the characteristic scales of the underlying dynamic system. For example, a smaller ROI may be necessary for a smaller ensemble size and/or in the presence of a more severe model error (Zhang et al. 2009b). Lorenc (2003) suggested that the ROI should be 2–3 times the forecast error scales. An optimum ROI for the LAM EnKF may be harder to define because of the complicated multiscale interactions.

A successive covariance localization (SCL) technique was proposed by Zhang et al. (2009a) to assimilate dense radar observations that contain information about the state of the atmosphere at a wide range of scales. SCL assumes that both large- and small-scale errors are simultaneously present and was designed to reduce computational cost and sampling errors. This technique uses the Gaspari and Cohn (1999) fifth-order correlation function for covariance localization, but a different ROI is used for different subsets of randomly grouped observations. First, one tries to remove dynamically important aspects of the large-scale error by assimilating a relatively small subset of observations with a large ROI. Next, the ROI is made smaller, and higher-density observations are used to constrain both smaller-scale errors and what remains of the large-scale error. The process is repeated until all scales resolved by the observational network have been adequately dealt with. The SCL method has some resemblance to the successive correction method used in earlier empirical objective analysis schemes (e.g., Barnes 1964), though in the EnKF the same observations are not used twice. Zhang et al. (2009a) showed clear advantages of using the SCL method over using single ROIs in the assimilation of dense radar observations for a rapidly developing landfalling hurricane.
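The grouping step of SCL can be sketched as follows. This is only an illustration of the idea of pairing random, non-overlapping observation subsets with successively smaller ROIs; the function name, the batch fractions, and the ROI values in the usage note are hypothetical and are not the settings of Zhang et al. (2009a).

```python
import random


def scl_batches(obs_ids, fractions, rois_km, seed=0):
    """Randomly split observations into batches, each paired with one ROI.

    fractions: share of all observations assigned to each batch (sums to 1).
    rois_km:   ROI for each batch, strictly decreasing from first to last.
    Each observation lands in exactly one batch, so none is used twice.
    """
    if len(fractions) != len(rois_km):
        raise ValueError("need one ROI per batch")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    if any(a <= b for a, b in zip(rois_km, rois_km[1:])):
        raise ValueError("ROIs must decrease from batch to batch")

    shuffled = list(obs_ids)
    random.Random(seed).shuffle(shuffled)  # random grouping, reproducible

    batches = []
    start = 0
    for k, (fraction, roi) in enumerate(zip(fractions, rois_km)):
        if k == len(fractions) - 1:
            end = len(shuffled)  # last batch takes whatever remains
        else:
            end = start + int(round(fraction * len(shuffled)))
        batches.append((roi, shuffled[start:end]))
        start = end
    return batches
```

For example, `scl_batches(range(100), [0.1, 0.3, 0.6], [900.0, 300.0, 100.0])` returns three batches of 10, 30, and 60 observations, to be assimilated in that order with the ROI shrinking at each pass.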

Though equally important and possibly more difficult to implement correctly, vertical covariance localization has received much less attention than horizontal localization. The vertical ROI is sometimes set to the depth of the atmospheric model for conventional observation assimilation (e.g., Meng and Zhang 2008a,b), also using the Gaspari and Cohn (1999) covariance localization function. A much smaller vertical ROI is sometimes used for assimilating radar observations at the convective scales (e.g., Aksoy et al. 2009, 2010), while no vertical covariance localization has been used for tropical cyclones as in Zhang et al. (2009a). Vertical localization is likely more of an issue when assimilating satellite radiance observations. For example, Miyoshi and Sato (2007) have shown that a channel-dependent vertical localization may be effective in damping the sampling error.

Besides ROI, the selection of a covariance localization function may also be important. The most widely used covariance localization is the fifth-order correlation function of Gaspari and Cohn (1999). The choice of localization function may become more complicated for observations that have complex spatial (such as the heterogeneously distributed surface observations), temporal, and physical attributes, such as those without a well-defined location, at a different time from the state specification, or with an unknown relation with the state variable. The importance of flow-dependent covariance localization, which has been demonstrated in global models (Anderson 2006; Bishop and Hodyss 2007), has not been seen in the literature for the LAM EnKF.
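For reference, the Gaspari and Cohn (1999) fifth-order function can be written as a short routine. The sketch below assumes that the ROI is the distance at which the weight reaches zero, so that the function's half-width parameter is ROI/2; some implementations define the length scale differently, so this convention should be checked against the system at hand. The function name is hypothetical.

```python
def gaspari_cohn(distance, roi):
    """Fifth-order piecewise rational localization weight.

    Returns 1 at zero distance and decreases smoothly to 0 at `roi`.
    Assumes the half-width c of the function equals roi / 2.
    """
    if roi <= 0.0:
        raise ValueError("roi must be positive")
    z = abs(distance) / (roi / 2.0)
    if z <= 1.0:
        return (-0.25 * z**5 + 0.5 * z**4 + 0.625 * z**3
                - (5.0 / 3.0) * z**2 + 1.0)
    if z < 2.0:
        value = (z**5 / 12.0 - 0.5 * z**4 + 0.625 * z**3
                 + (5.0 / 3.0) * z**2 - 5.0 * z + 4.0 - 2.0 / (3.0 * z))
        return max(0.0, value)  # guard against round-off just below zero
    return 0.0
```

A vertical and a horizontal weight computed this way can be multiplied to give a combined localization weight for each pair of observation and grid point.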

While accounting for sampling error, covariance localization may also cause imbalance when one observation is selected to update the state vector at one grid point, but not selected for a neighboring grid point. Consequently, covariance localization may produce analyses with weaker flow balance and stronger divergence, which may result in inaccurately balanced background error statistics. Methods used to reduce imbalance in large-scale models such as applying a digital filter (e.g., Lynch and Huang 1992; Huang and Lynch 1993) or covariance localization performed in the streamfunction-velocity potential rather than the wind component space (Kepert 2009), have still not been tested for the LAM EnKF.

### f. Verification issues

Because of a lack of dense, conventional mesoscale observations, verification of the LAM EnKF can be more difficult than verification of larger-scale prediction systems, especially for radar data assimilation. One way is to compare the analysis and/or forecast error against radar observations that are not assimilated, but saved specifically for assessment (Zhang et al. 2009a).

To verify the result in terms of radar reflectivity, one metric is the equitable threat score (ETS; Wilks 2006), which has been widely used for precipitation verification. However, this metric tends to be very sensitive to the phase error and thus may be misleading sometimes. An alternative metric that is more pattern based is the reflectivity correlation coefficient (Aksoy et al. 2010) between the observed and simulated reflectivity in observation space, which is similar to the centered anomaly correlation (Wilks 2006).
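The two metrics can be sketched as follows. The ETS uses the standard contingency-table form for exceedance of a threshold, and the pattern metric is an ordinary centered correlation. The function names and the flat-list data layout are assumptions made for illustration, not the verification code of the cited studies.

```python
import math


def equitable_threat_score(forecast, observed, threshold):
    """ETS for exceedance of `threshold` from paired forecast/observed values."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_yes, o_yes = f >= threshold, o >= threshold
        if f_yes and o_yes:
            hits += 1
        elif o_yes:
            misses += 1
        elif f_yes:
            false_alarms += 1
    total = len(forecast)
    # Hits expected by chance, given the forecast and observed event counts.
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denominator = hits + misses + false_alarms - hits_random
    if denominator == 0:
        return float("nan")  # score undefined, e.g., no events at all
    return (hits - hits_random) / denominator


def centered_correlation(x, y):
    """Correlation of two fields after removing each field's own mean."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a constant field
    return cov / math.sqrt(var_x * var_y)
```

The sensitivity to phase error is easy to reproduce: an observed feature `[0, 0, 40, 40, 0, 0, 0, 0]` forecast two points too far along as `[0, 0, 0, 0, 40, 40, 0, 0]` scores a negative ETS at a threshold of 30, even though the feature's shape and amplitude are forecast perfectly.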

For severe weather systems, such as hurricanes, how to choose an appropriate error metric is still an open question. Since a small displacement of a storm center may result in a substantially large root-mean-square error (RMSE) of wind, the gridpoint-based RMSE of the wind fields integrated over the whole model domain may not be adequate for assessing EnKF performance. In this case, performance could be better assessed using feature-based verification, such as RMSE comparison for hurricane intensity and center position for individual members (e.g., Torn and Hakim 2009a; Zhang et al. 2009a).
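The effect of a small displacement on gridpoint RMSE can be shown with a toy one-dimensional wind profile. The values below are made up purely for illustration, and the helper names are hypothetical.

```python
import math


def rmse(a, b):
    """Gridpoint root-mean-square error between two equally sized fields."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def peak_feature(field):
    """Return (peak value, index of peak) as simple intensity/position features."""
    value = max(field)
    return value, field.index(value)


# Made-up wind profiles (m/s): the forecast vortex is identical to the
# truth but displaced by two grid points.
truth = [5.0, 10.0, 40.0, 10.0, 5.0, 5.0, 5.0, 5.0]
forecast = [5.0, 5.0, 5.0, 10.0, 40.0, 10.0, 5.0, 5.0]

gridpoint_rmse = rmse(forecast, truth)
intensity_error = peak_feature(forecast)[0] - peak_feature(truth)[0]
position_error = peak_feature(forecast)[1] - peak_feature(truth)[1]
```

Here the gridpoint RMSE is large although the intensity error is zero and the position error is only two grid points, which is the kind of information a feature-based verification separates out.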

## 4. Intercomparison and hybrid with variational schemes

Despite many of the challenging issues discussed in the previous section, there are several appealing advantages of the EnKF in comparison to the variational data assimilation techniques. These advantages include the following: 1) the background error covariance is flow dependent, which reflects the error of the day; 2) the model and observation operator can be nonlinear; 3) it provides not only the best estimation of the state, but also the associated flow-dependent uncertainty; therefore, it can be seamlessly coupled with ensemble forecasting; 4) there is no need to code a tangent linear or adjoint model; 5) it is easier to account for model error because of its use of an ensemble forecast; and 6) the ensemble members can be run simultaneously, making it easy to parallelize (e.g., Evensen 2003; Hamill 2006; Zhang and Snyder 2007).

Nevertheless, the variational data assimilation techniques have been predominantly used at several operational NWP centers around the globe. Though the EnKF has many advantages over the variational method, a widespread operational implementation has yet to be achieved. A global (regional) scale EnKF has been put into operational practice at the Canadian Meteorological Centre (Italian Weather Service). There are a few quasi-operational LAM-EnKF systems such as those performed at the University of Washington (UW), the Pennsylvania State University (PSU), and NCAR. The operational implementation of the EnKF may be affected by the following factors: 1) the EnKF needs almost as much computer resources as does 4DVar. Some weather centers currently relying on 3DVar systems may not be able to afford the cost. 2) Most centers have established a variational system. It is likely more attractive to set up a hybrid system instead of establishing a brand new standalone EnKF system. However, it remains to be seen whether the EnKF approach will supersede the variational methods or whether the hybrid approaches will prevail. This section reviews recent advances in the intercomparison and hybridization between these two state-of-the-art data assimilation approaches from the aspect of limited-area model applications.

### a. Intercomparison between the EnKF and 3DVar/4DVar

Similar to the results obtained from a global-scale perspective (Whitaker et al. 2004, 2008; Houtekamer et al. 2005; Miyoshi and Yamane 2007; Szunyogh et al. 2008), the LAM EnKF generally compares favorably with 3DVar. Meng and Zhang (2008a,b) compared a WRF-based EnKF with 3DVar in a mesoscale convective vortex (MCV) case for a month-long experiment in which standard radiosonde observations were ingested every 12 h (Fig. 4). The results showed that the EnKF generally outperformed 3DVar for the time period of interest. The 12-h forecasts from the EnKF analysis also outperformed the 12-h forecasts initiated from the FNL/NCEP analyses that assimilated many additional observations, including satellite radiances. In the case study of Zhang et al. (2009a), a WRF-based EnKF assimilating coastal Doppler radar observations was capable of simulating a rapidly intensifying landfalling hurricane, while the WRF-3DVar configuration, assimilating the same observations, failed almost completely.

The LAM EnKF also compares favorably with 4DVar. A cloud-model-based EnKF was shown to have a larger error than 4DVar at the beginning, but started to produce better analyses than 4DVar after several assimilation cycles, especially for model variables not functionally related to the observations (Caya et al. 2005). This time-dependent relative performance is consistent with what has been observed in comparisons between EnKF and 4DVar using an operational global model (Buehner et al. 2010a,b), where a better (worse) forecast initialized from the ensemble mean analysis for the EnKF was produced in the medium (short) range. The forecast may initially suffer from the imbalance due to the ensemble averaging but recover gradually over time. Zhang et al. (2011) compared the forecast error of a WRF EnKF with WRF 4DVar and WRF 3DVar for a 1-month period (Fig. 5). It was found that the advantage of the EnKF over both 3DVar and 4DVar becomes very evident after the 36-h forecast time for all prognostic variables examined, while the EnKF moisture forecast field is superior to both 3DVar and 4DVar at all lead times despite fitting less closely to the observations at the analysis time. This result is consistent with the time dependency of the relative performance of 4DVar and EnKF in previous works (Caya et al. 2005). It is rather remarkable that the 72-h forecast error of the EnKF is comparable in magnitude to the 48-h error of 3DVar and 4DVar, a gain of nearly 1-day lead time in forecast accuracy.

Fig. 5. Comparison between WRF-based EnKF, 3DVar, and 4DVar in terms of domain-averaged RMSEs averaged over all 59 WRF forecasts from June 2003 for each DA experiment at forecast lead times from 0 to 72 h evaluated every 12 h for (a) *U* (m s^{−1}), (b) *V* (m s^{−1}), (c) *T* (K), and (d) *Q* (g kg^{−1}) [adapted from Zhang et al. (2011)].

Citation: Monthly Weather Review 139, 7; 10.1175/2011MWR3418.1

### b. Hybrid of the EnKF with 3DVar/4DVar

Given the disadvantage of using static, mostly isotropic background covariance, there are increasing efforts to introduce the flow-dependent error statistics estimated from short-term ensemble forecasts into a variational data assimilation technique. Results show that 3DVar may benefit from using a homogeneous background error covariance calculated from an ensemble at either the initial time only or evolving with time (Buehner 2005; Meng and Zhang 2008a).

Lorenc (2003) proposed an extended control variable method using an ensemble-based background error covariance in the background term of a variational analysis scheme. Another approach to ingest an ensemble background error covariance into a variational method is by directly combining the two error covariances (Hamill and Snyder 2000). The extended control variable and direct covariance combination approaches have been proven to be theoretically equivalent by Wang et al. (2007). As an extension of Lorenc (2003), C. Liu et al. (2008, 2009) designed an ensemble-based, four-dimensional variational algorithm for both a 1D shallow-water model and WRF by calculating the gradient of the cost function using an ensemble background covariance in the observation space. This avoids the use of a tangent linear and adjoint model. This method quickly converges to the true solution and can produce results that are comparable to 4DVar, but at far less computational cost in its minimization.
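The direct covariance combination can be illustrated as a weighted sum of a static covariance and an ensemble sample covariance. This is a minimal sketch under that assumption; the weight name `ens_weight`, the function names, and the toy two-variable example are hypothetical and do not reproduce any particular system.

```python
def sample_covariance(ensemble):
    """Unbiased sample covariance matrix of an ensemble (list of members)."""
    n_members = len(ensemble)
    dim = len(ensemble[0])
    mean = [sum(m[i] for m in ensemble) / n_members for i in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for member in ensemble:
        dev = [x - mu for x, mu in zip(member, mean)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += dev[i] * dev[j] / (n_members - 1)
    return cov


def blend_covariances(static_cov, ensemble_cov, ens_weight):
    """Return (1 - w) * static + w * ensemble, with w = ens_weight in [0, 1]."""
    if not 0.0 <= ens_weight <= 1.0:
        raise ValueError("ens_weight must lie in [0, 1]")
    return [[(1.0 - ens_weight) * b + ens_weight * p
             for b, p in zip(static_row, ens_row)]
            for static_row, ens_row in zip(static_cov, ensemble_cov)]


def det2(matrix):
    """Determinant of a 2 x 2 matrix, used here only to expose rank deficiency."""
    return matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]
```

With only two members, for instance `[[1.0, 2.0], [3.0, 6.0]]`, the sample covariance of a two-variable state is singular, whereas blending it equally with an identity static covariance gives a full-rank matrix that still carries the ensemble's cross-variable covariance.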

Why is a flow-dependent background error covariance beneficial for data assimilation? Zhang (2005) examined the dynamics and structure of the mesoscale error covariance of a snowstorm that occurred along the U.S. east coast in 2000, using an ensemble forecast in the perfect-model context. The error covariance valid at different times showed dramatic differences in magnitude, structure, and sign. The initial smaller-scale, uncorrelated, mostly random perturbations evolved into larger-scale, quasi-balanced disturbances with coherent structures within 12–24 h (also in Meng and Zhang 2007). This upscale spreading of error was also clearly demonstrated in the power spectrum analysis of the total difference energy for the same case in Zhang et al. (2003). Furthermore, the structures of the quasi-balanced disturbances starting from a different set of initial perturbations were found to be qualitatively similar, although details could differ. A highly flow-dependent covariance was also observed in cloud models (e.g., Snyder and Zhang 2003; Tong and Xue 2005), indicating that the true error covariance is likely flow dependent and ultimately determined by the underlying governing dynamics. Consequently, the use of a more realistic estimation of the background error covariance could be the reason why the EnKF compares favorably with variational methods.

Both the EnKF and variational method have their own advantages and disadvantages. The EnKF benefits from its flow-dependent background error covariance but suffers from rank deficiency, while the variational technique has advantages in its analysis algorithm, processing complex observations and applying physical constraints but suffers from the static and homogeneous initial background error covariance (Zhang et al. 2009b). Instead of settling on one particular method, more and more efforts are devoted to the hybridization of the two approaches. Wang et al. (2008a,b) proposed and tested one such hybrid system of ETKF 3DVar for both OSSE and real-data scenarios using the extended control variable method (Lorenc 2003). The hybrid algorithm provides a more accurate analysis than 3DVar, especially in data-sparse areas. Another example is a dual-resolution 3DVar-EnKF hybrid method (J. Gao 2010, personal communication), which uses a high-resolution 3DVar to provide analyses for a low-resolution EnKF, while the low-resolution EnKF provides flow-dependent ensemble covariance for the 3DVar to adjust the static error covariance. This method can reduce the computational cost of a regular 3DVar-EnKF hybrid without significantly reducing the quality of the analysis.

A hybrid of EnKF with 4DVar is regarded as one of the most advanced and most promising (as well as most computationally and technically demanding) data assimilation methods in both the research and operational communities. The EnKF was first coupled with 4DVar in the Lorenz-96 model under both perfect- and imperfect-model assumptions (Zhang et al. 2009b). The fully coupled assimilation scheme benefits from using the state-dependent uncertainty provided by the EnKF, while taking advantage of the 4DVar, which prevents the EnKF from diverging: the 4DVar analysis produces posterior maximum likelihood solutions by minimizing a cost function about which the ensemble perturbations are transformed. The hybrid system shows better performance and is less sensitive to ensemble size, assimilation window length, and model error than the stand-alone 4DVar and EnKF systems. A similar EnKF-4DVar coupled system has been recently implemented in WRF that was shown to outperform both the standalone WRF EnKF and WRF 4DVar (Zhang 2010). However, a hybrid system will likely inherit issues or complexity from the component systems such as the rank deficiency problem in the EnKF and the use of an outer-loop in 4DVar. One approach to deal with the rank deficiency problem in the hybrids of 4DVar and EnKF is through combining (or adding) the ensemble covariance with the static background error covariance, as implemented in the hybrid method of Zhang et al. (2009b), Zhang et al. (2011), and Zhang (2010). This is to some extent similar to using additive covariance inflation for the EnKF as proposed in Hamill and Whitaker (2005). For the outer-loop iteration issue in the hybrid/coupled systems, it is possible to use Kalman smoothing in the EnKF component and/or use relinearization of the dynamic model with an improved prior in 4DVar. More efforts are needed to tackle these issues in the future.

Considering the encouraging results obtained by coupling the EnKF with variational methods, several major operational NWP centers, including Environment Canada (Buehner et al. 2010a,b), Météo-France (Berre and Desroziers 2010), ECMWF, the Met Office, and NCEP, have already started testing or implementing such approaches in their global data assimilation systems. Similar efforts for limited-area data assimilation systems have not yet been reported.

The EnKF may also be coupled with the nudging method (L. Lei 2010, personal communication), in which case the nudging coefficients are calculated using the EnKF error covariance. This coupling may allow the Kalman gain matrix to be applied gradually in time, thus potentially leading to better intervariable consistency and retention of observational information than the EnKF alone.

## 5. Applications of the LAM EnKF beyond assimilation and forecasting

Since the EnKF naturally combines ensemble forecasts and data assimilation, it may be useful in many applications besides state estimation. To date, the LAM EnKF has been adapted for model error correction, sensitivity analysis, and observation targeting.

### a. Parameter estimation

As mentioned above, the parameterization of subgrid physical processes is a major source of error in numerical prediction. One important reason for this error is that almost all parameters of subgrid physical parameterization schemes are empirical because of a lack of direct observations, and could therefore have large and unknown uncertainties. The EnKF can be used to estimate these parameters by the maximum likelihood method (Mitchell and Houtekamer 2000) or the state augmentation method (Anderson 2001; Annan et al. 2005; Aksoy et al. 2005). This technique, usually called parameter estimation, may help improve the performance of the EnKF via a model error correction. Here we mainly focus on the results of parameter estimation obtained with limited-area models.
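As a rough illustration of the state augmentation idea, the sketch below appends an uncertain parameter to the state so that a single scalar observation updates both through their sample covariance with the observed quantity. The deterministic square-root form of the update (after Whitaker and Hamill 2002), the function and variable names, and the toy linear relation between state and parameter are all assumptions made for illustration; none of this is the configuration of any study cited above.

```python
from statistics import fmean


def sample_cov(a, b):
    """Unbiased sample covariance of two equally long sequences."""
    mean_a, mean_b = fmean(a), fmean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def augmented_update(fields, obs_prior, obs_value, obs_var):
    """Assimilate one scalar observation into every augmented-state field.

    fields    : dict mapping a name to its prior ensemble (list of floats);
                uncertain parameters sit alongside ordinary state variables
    obs_prior : prior ensemble of the observed quantity H(x)
    obs_value : the observation
    obs_var   : observation error variance R
    """
    hph = sample_cov(obs_prior, obs_prior)  # ensemble estimate of HPH^T
    obs_mean = fmean(obs_prior)
    innovation = obs_value - obs_mean
    # Reduced gain factor applied to perturbations (serial square-root filter).
    alpha = 1.0 / (1.0 + (obs_var / (hph + obs_var)) ** 0.5)

    posterior = {}
    for name, ens in fields.items():
        # The parameter is updated only through its covariance with H(x).
        gain = sample_cov(ens, obs_prior) / (hph + obs_var)
        mean = fmean(ens)
        new_mean = mean + gain * innovation
        posterior[name] = [
            new_mean + (e - mean) - alpha * gain * (h - obs_mean)
            for e, h in zip(ens, obs_prior)
        ]
    return posterior


# Toy setup (hypothetical): state x = 2 * p, true p = 2.0, x observed directly.
prior_p = [0.5, 1.0, 1.5, 2.0, 2.5]
prior_x = [2.0 * p for p in prior_p]
post = augmented_update({"x": prior_x, "p": prior_p}, prior_x, 4.0, 0.01)
```

In this toy case the parameter is never observed directly, yet its ensemble mean moves from 1.5 toward the true value because it is perfectly correlated with the observed state. With a weakly correlated parameter the gain would be small and the parameter would barely change.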

Since there are no direct observations or physical evidence describing the variability of the various parameters, generating a realistic initial ensemble for an estimated parameter is even more difficult than for standard state variables. A common practice is simply to use random perturbations drawn from an arbitrary distribution. To maintain the ensemble spread of the estimated parameter, a conditional covariance inflation method has been proposed, based on rescaling the spread to a predefined value (Aksoy et al. 2005).
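The spread-rescaling idea described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the threshold `min_spread` are hypothetical, and the rule of rescaling only when the spread falls below the threshold is one plausible reading of "rescaling of spread to a predefined value", not necessarily the exact procedure of the cited study.

```python
from statistics import fmean, stdev


def restore_parameter_spread(param_ens, min_spread):
    """Rescale a parameter ensemble about its mean if its spread is too small.

    param_ens  : posterior ensemble of one parameter (list of floats)
    min_spread : hypothetical predefined standard deviation to maintain
    """
    spread = stdev(param_ens)
    if spread >= min_spread or spread == 0.0:
        return list(param_ens)  # wide enough, or degenerate: leave unchanged
    mean = fmean(param_ens)
    factor = min_spread / spread
    # Stretch the perturbations; the ensemble mean is untouched.
    return [mean + factor * (p - mean) for p in param_ens]


collapsed = [1.9, 2.0, 2.1]
restored = restore_parameter_spread(collapsed, 0.25)
```

Only the perturbations are stretched, so the estimate of the parameter itself is unchanged while the filter retains enough parameter variance to keep responding to new observations.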

Not all parameters can be successfully estimated. The performance of a parameter estimation algorithm is determined by the degree to which a parameter is identifiable, which depends on the correlation between the parameter and the model variables; the relationship between the two can be strongly nonlinear. How readily a parameter can be identified is also closely related to the EnKF configuration, such as observation type and location, ROI, ensemble size, and the realizations of the initial perturbations for both the estimated parameter and the model variables (Aksoy et al. 2005; Tong and Xue 2008a,b). For example, when simulated radiosonde and surface observations were assimilated, the vertical-eddy-mixing coefficient of the fifth-generation PSU–NCAR Mesoscale Model (MM5) converged to the true value (Aksoy et al. 2006a). Conventional and polarimetric radar measurements were found to be beneficial for microphysical parameter estimation as well (Tong and Xue 2008a,b; Jung et al. 2010).

Parameter estimation performance with the EnKF is also closely tied to the number of parameters estimated simultaneously. In an OSSE context, estimating a single imperfect parameter was found to be very effective at drawing the model variables close to the corresponding perfect-parameter case, both with a 2D sea-breeze model (Aksoy et al. 2006b) and with an MM5-based EnKF (Aksoy et al. 2006a). Increasing the number of estimated parameters inevitably reduces the improvement gained from parameter estimation, although, in terms of the error statistics, the result still holds an overall advantage over the imperfect case without parameter estimation. The benefits of single-parameter estimation were also found in a PBL-model-based EnKF (Hacker and Snyder 2005) and a cloud-scale EnKF (Tong and Xue 2008a,b).

In addition to all of the above OSSE studies, Hu et al. (2010) reported a successful parameter estimation study in which real-data observations were assimilated with a LAM EnKF to estimate uncertain parameters in the Asymmetrical Convective Model, version 2 (ACM2) PBL parameterization scheme. As shown in Fig. 6, the simultaneous state and parameter estimation with the EnKF (SSPE) produces a smaller forecast error and bias than the EnKF without parameter estimation (NoPE), both of which outperform deterministic forecasting without data assimilation (NoDA). These results indicate that parameter estimation is helpful not only in improving state estimation, but also in producing a better-performing PBL scheme. It was also found that a better wind profile could be achieved by parameter estimation, through the correction of a near-surface cold bias and momentum mixing in the boundary layer.

Fig. 6. Real-data application of a LAM EnKF in parameter estimation (PE) via assimilating wind profiler observations during 29 Aug–2 Sep 2006 in Texas [adapted from Hu et al. (2010)]. Two parameters of the ACM2 PBL scheme implemented in WRF are estimated; namely *p*, an exponent affecting the magnitude and vertical distribution of eddy diffusivity within the unstably stratified PBL, and Rc, a critical Richardson number determining the transition between relatively large and small values of eddy diffusivity. Four experiments were performed including SSPE, NoPE, NoDA, and the deterministic forecast with estimated *p* and Rc from SSPE (NoDAnew). (a) The time evolution of the mean bias and RMSE of the 2-m temperature (T2) with respect to the unassimilated hourly observations at the 204 National Weather Service and Federal Aviation Administration (NWS/FAA) sites for the four experiments. The estimated *p* and Rc by SSPE are plotted in red solid lines in (b) and (c) with the blue dashed lines representing their respective default values. The red dashed lines in (b) and (c) denote the standard deviation of the estimated parameter.

Citation: Monthly Weather Review 139, 7; 10.1175/2011MWR3418.1


### b. Ensemble sensitivity analysis and observation targeting

Ensemble sensitivity analysis examines how small changes in the initial field may affect subsequent forecasts. The response of a chosen forecast metric, of the full state variables, or of both to a given initial perturbation or observation can be predicted (Ancell and Hakim 2007; Hakim and Torn 2008; Torn and Hakim 2008b, 2009b; Torn 2010; Sippel and Zhang 2010). As mentioned in section 3a, the ensemble is usually initialized by perturbing the initial field with fixed covariance perturbations randomly drawn from the default background covariance of an existing 3DVar system. Torn and Hakim (2008b) found sensitive regions for sea level pressure and precipitation forecast metrics by examining the climatological forecast sensitivity and the impact of observations.
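In its simplest univariate form, ensemble sensitivity is the linear regression of a scalar forecast metric onto one initial-condition variable across the ensemble. The sketch below illustrates that regression; the function name and the toy numbers are assumptions for illustration, and the multivariate and observation-space extensions used in the cited studies are not shown.

```python
from statistics import fmean


def ensemble_sensitivity(metric_ens, initial_ens):
    """Regression slope dJ/dx = cov(J, x) / var(x) estimated from an ensemble.

    metric_ens  : forecast metric J for each member (list of floats)
    initial_ens : one initial-condition variable x for each member
    """
    mean_j, mean_x = fmean(metric_ens), fmean(initial_ens)
    cov_jx = sum((j - mean_j) * (x - mean_x)
                 for j, x in zip(metric_ens, initial_ens))
    var_x = sum((x - mean_x) ** 2 for x in initial_ens)
    return cov_jx / var_x  # the 1 / (N - 1) factors cancel


# Toy ensemble in which the metric responds linearly to the initial value.
x0 = [1.0, 2.0, 3.0, 4.0, 5.0]
metric = [3.0 * x + 7.0 for x in x0]
slope = ensemble_sensitivity(metric, x0)
predicted_change = slope * 0.5  # response of J to a +0.5 initial perturbation
```

The slope can be multiplied by any hypothetical initial perturbation to predict the change in the metric, which is what makes the approach inexpensive compared with rerunning the forecast model.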

In addition to examining the response of a forecast metric to currently available observations, as done in ensemble sensitivity analysis, the EnKF can also be used to locate the region where new observations should be added (usually called a “sensitive area”) to minimize a targeted forecast uncertainty (Hamill and Snyder 2002). This approach is often referred to as targeted or adaptive observation. Wu et al. (2009) compared several approaches to observation targeting for tropical cyclones in the western North Pacific and found that the ETKF identifies a sensitive area similar to that given by the adjoint-derived sensitivity steering vector. Stuart et al. (2007) examined the effect that targeted air quality observations may have on forecasts from a 2D sea-breeze-model-based EnKF and found that the sensitive area was similar before and after the assimilation of regular network observations.
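One way to make the targeting idea concrete is to estimate, for each candidate site, how much the variance of the forecast metric would shrink if a single scalar observation were assimilated there. Under linear, Gaussian assumptions this reduction is cov(J, Hx)^2 / (var(Hx) + R). The sketch below ranks hypothetical sites by that quantity; the site names, the ensemble values, and the function names are invented for illustration.

```python
from statistics import fmean


def sample_cov(a, b):
    """Unbiased sample covariance of two equally long sequences."""
    mean_a, mean_b = fmean(a), fmean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def expected_variance_reduction(metric_ens, obs_prior, obs_var):
    """Reduction in var(J) expected from one hypothetical scalar observation."""
    cov_jh = sample_cov(metric_ens, obs_prior)
    return cov_jh ** 2 / (sample_cov(obs_prior, obs_prior) + obs_var)


def rank_candidates(metric_ens, candidates, obs_var):
    """Sort candidate sites by expected reduction of forecast-metric variance."""
    scores = {
        name: expected_variance_reduction(metric_ens, ens, obs_var)
        for name, ens in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical five-member ensemble: the metric J and the model equivalent
# of an observation at each of two candidate sites.
metric = [1.0, 2.0, 3.0, 4.0, 5.0]
sites = {
    "site_a": [1.0, 2.0, 3.0, 4.0, 5.0],  # perfectly correlated with J
    "site_b": [2.0, 1.0, 3.0, 5.0, 4.0],  # less strongly correlated
}
ranking = rank_candidates(metric, sites, obs_var=0.5)
```

The top-ranked site plays the role of the sensitive area. Note that the ranking depends on the assumed observation error variance as well as on the ensemble correlations.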

The EnKF is also used to examine predictability of mesoscale systems such as hurricanes and MCSs. Error growth features in the mesoscale model were found to be predominantly guided by the underlying balanced dynamics and moist convection (Zhang 2005). In comparison with smaller, marginally resolvable scales, larger-scale error could be reduced more effectively by the EnKF (Zhang et al. 2006), and even more so in the presence of model error (Meng and Zhang 2007). In addition, the presence of deep moisture and high CAPE in the initial conditions could be the two most important factors for tropical cyclone genesis (Sippel and Zhang 2010).

## 6. Summary and conclusions

Since the first application to a cloud model in Snyder and Zhang (2003), great progress has been made in various aspects of *limited-area ensemble-based data assimilation*. This article reviewed recent advances and challenges in the development and applications of the EnKF, including its comparison and hybridization with variational methods, and the use of limited-area models that resolve weather systems from convective to meso- and regional scales.

As an approximation to the classic Kalman filter and as an emerging data assimilation technique, the EnKF unavoidably faces many challenging issues, for LAMs and convective/mesoscale systems in particular. How to generate perturbations to form the initial and boundary ensemble remains a difficult issue for LAM EnKF systems, mostly because of the lack of error statistics. The ideal method would be to use a consistent global ensemble forecast system to directly provide the initial and boundary perturbations. When such a global ensemble model is not available, the initial and boundary perturbations can be generated by randomly sampling from a climatology-based background error covariance with proper tuning for regional scales, or by adding random uncorrelated noise to a horizontally homogeneous background for cloud-scale modeling. There is some evidence suggesting that using the latter methods causes noticeably larger error only near the boundary relative to using a global ensemble system.

Surface, radar, and satellite observations are three data sources that contain more mesoscale information relative to radiosondes, although satellite radiance assimilation is still in its infancy. There are many open questions in surface observation assimilation such as the mismatch between observed and model terrain height, the heterogeneous distribution, and the related determination of the radius of influence. Radar radial velocity has been shown to have more of a positive impact than reflectivity. Some special or synthetic observations such as vortex position/intensity/size have also been found to have apparent positive effects on the performance of the LAM EnKF.

The presence of model errors can often result in a large bias of the ensemble mean and too little spread, which can ultimately cause the ensemble forecast to fail (filter divergence). Model uncertainties can be accounted for by perturbing the forecast field or the model itself. Random model errors can be treated by covariance inflation with an adaptive inflating factor or relaxation of analysis perturbation to the forecast perturbation, while model bias correction or parameter estimation may be needed in order to account for systematic model uncertainties. Another promising alternative approach for taking into account model uncertainties is through the use of multimodel or multiphysics forecast ensembles, which has been shown to be more effective in improving the analysis of thermodynamic variables via bias correction and improving the background error covariance structure.
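The relaxation of analysis perturbations toward forecast perturbations mentioned above can be sketched in a few lines. The blending weight `alpha` is a tuning parameter whose value here is purely hypothetical, and the function name is an assumption for illustration.

```python
from statistics import fmean


def relax_to_prior(prior_ens, post_ens, alpha):
    """Blend posterior and prior perturbations about the posterior mean.

    New perturbation = (1 - alpha) * analysis pert. + alpha * forecast pert.
    alpha = 0 leaves the analysis unchanged; alpha = 1 restores the prior spread.
    """
    prior_mean, post_mean = fmean(prior_ens), fmean(post_ens)
    return [
        post_mean + (1.0 - alpha) * (a - post_mean) + alpha * (f - prior_mean)
        for f, a in zip(prior_ens, post_ens)
    ]


prior = [0.0, 2.0, 4.0]   # forecast ensemble, wide spread
post = [2.5, 3.0, 3.5]    # analysis ensemble, narrow spread
relaxed = relax_to_prior(prior, post, alpha=0.5)
```

The analysis mean is preserved while part of the spread removed by the update is restored, which counteracts the tendency toward too little spread.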

Because of computational constraints, the ensemble is often too small, which leads to sampling error, more so in the presence of model error and nonlinearity. Sampling error in the EnKF can result in underestimation of the analysis ensemble spread and in unphysical, distant correlations that generate spurious analysis increments. These two problems can be alleviated, respectively, through additive or multiplicative covariance inflation or relaxation methods, and through covariance localization, which constrains the impact of an observation to a certain distance (ROI) from the observation using an imposed localization function. It may also be beneficial to use an adaptive ROI and covariance localization function.
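A widely used localization function is the fifth-order compactly supported correlation function of Gaspari and Cohn (1999). The sketch below implements that function and applies it as a distance-dependent taper on a sample covariance. How the ROI quoted in a given study maps onto the `half_width` length scale used here is an assumption; in this form the weight reaches exactly zero at twice `half_width`.

```python
def gaspari_cohn(distance, half_width):
    """Fifth-order piecewise rational taper (Gaspari and Cohn 1999).

    Returns 1 at zero separation, decreases smoothly, and is exactly 0
    for separations beyond 2 * half_width.
    """
    r = abs(distance) / half_width
    if r <= 1.0:
        return (-0.25 * r**5 + 0.5 * r**4 + 0.625 * r**3
                - (5.0 / 3.0) * r**2 + 1.0)
    if r <= 2.0:
        return (r**5 / 12.0 - 0.5 * r**4 + 0.625 * r**3
                + (5.0 / 3.0) * r**2 - 5.0 * r + 4.0 - 2.0 / (3.0 * r))
    return 0.0


def localize(sample_covariance, distance, half_width):
    """Taper a sample covariance by its distance from the observation."""
    return sample_covariance * gaspari_cohn(distance, half_width)


# Weights at a few separations (in km) for a hypothetical 100-km half-width.
weights = [gaspari_cohn(d, 100.0) for d in (0.0, 100.0, 200.0, 300.0)]
```

Multiplying each sample covariance by this weight before computing the gain leaves nearby covariances nearly intact and removes the distant ones, which are the most likely to be spurious.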

Many efforts have been made concerning the intercomparison between the LAM EnKF and 3DVar/4DVar. Generally speaking, the LAM EnKF compares favorably with both 3DVar and 4DVar, though larger errors may occur early on in the EnKF configuration, likely due to the imbalance issue as a result of using the sample covariance to mimic the classical background error covariance. The slightly higher performance of the EnKF over variational methods is likely due to its more realistic flow-dependent background error covariance. Many studies have clearly demonstrated the benefit of including the ensemble-based background error covariance into variational methods. Instead of choosing one approach, future data assimilation may rely on a hybrid of the two methods, with the EnKF benefiting from the dynamical constraints enforced by the variational method and the variational schemes adopting flow dependence in their background error covariance from ensembles.

In addition to state estimation, the EnKF has been applied to estimate parameters in certain physical parameterization schemes, to examine the predictability of certain weather phenomena, and to predict the response of certain forecast metrics to initial perturbations or observations. Parameter estimation is an effective way of accounting for model error in data assimilation. Ensemble sensitivity analysis can be used for observation targeting, which aims to find locations where extra observations would minimize a targeted forecast uncertainty, or to better design an observing network.

There are many foreseeable improvements for the LAM EnKF. The most important, and most difficult, is better treatment of model error. An effort must be made to examine why the multiphysics ensemble method is effective and how it can be improved, as well as how parameter estimation can be performed in the presence of model error. Another very important aspect is how to better deal with sampling error. Localization in terms of scale, variables, and observations, and its deleterious impact on the desired balance, need to be examined for optimality. The assimilation of satellite radiances is another issue that requires deep consideration, since the extent to which vertical localization should be applied, bias correction, and data thinning are all open questions. Much more attention should be focused on surface assimilation to improve mesoscale weather prediction, because surface data contain rich information on mesoscale phenomena. Improving computational efficiency is also an important issue given the great operational potential of the EnKF.

Errors in convective-to-mesoscale systems depicted in LAMs may be more nonlinear and non-Gaussian. Though the EnKF can be used in nonlinear and non-Gaussian systems, its performance may be affected and ultimately limited by these two characteristics. Algorithms for highly nonlinear and non-Gaussian systems, such as the particle filter (Snyder et al. 2008; Lei et al. 2010) and a morphing method (Beezley and Mandel 2008; Lawson and Hansen 2005), have been proposed for simple models. Currently, it is very hard to apply these filters to high-dimensional atmospheric models. How to construct proper algorithms to deal with high nonlinearity and non-Gaussianity awaits further efforts.

Finally, as summarized in the previous section, coupling the EnKF with the variational method seems to be a promising endeavor. Considerable effort is being made on this subject for global models in several major weather forecast centers, such as Environment Canada, Météo-France, NCEP, ECMWF, and the Met Office. A hybrid between the EnKF and the variational method may become the prospective form of operational data assimilation algorithms in both global and limited-area data assimilation in the foreseeable future.

## Acknowledgments

We are very grateful for the constructive comments from Ryan Torn, Yonghui Weng, Meng Zhang, Ben Green, and Jon Poterjoy, and for proofreading by Poterjoy and Erin Munsell. We also benefited greatly from discussions with and among participants of the Third EnKF Workshop in Austin, Texas (April 2008) and the WWRP/THORPEX Workshop on 4DVar and EnKF intercomparisons in Buenos Aires, Argentina (November 2008). ZM is supported by Grants 2009CB421504, NSFC40940024, NSFC41075031, NSFC40921160380, NSFC40730948, and GYHY200906025 from China. FZ is supported by the U.S. Office of Naval Research under Grants N000140410471 and N000140910526, by NSF Grant ATM-084065, and by NOAA under the Hurricane Forecast Improvement Project (HFIP).

## REFERENCES

Aksoy, A., F. Zhang, J. W. Nielsen-Gammon, and C. C. Epifanio, 2005: Ensemble-based data assimilation for thermally forced circulations. *J. Geophys. Res.*, **110**, D16105, doi:10.1029/2004JD005718.

Aksoy, A., F. Zhang, and J. W. Nielsen-Gammon, 2006a: Ensemble-based simultaneous state and parameter estimation with MM5. *Geophys. Res. Lett.*, **33**, L12801, doi:10.1029/2006GL026186.

Aksoy, A., F. Zhang, and J. W. Nielsen-Gammon, 2006b: Ensemble-based simultaneous state and parameter estimation in a two-dimensional sea-breeze model. *Mon. Wea. Rev.*, **134**, 2951–2970.

Aksoy, A., D. C. Dowell, and C. Snyder, 2009: A multicase comparative assessment of the ensemble Kalman filter for assimilation of radar observations. Part I: Storm-scale analyses. *Mon. Wea. Rev.*, **137**, 1805–1824.

Aksoy, A., D. C. Dowell, and C. Snyder, 2010: A multicase comparative assessment of the ensemble Kalman filter for assimilation of radar observations. Part II: Short-range ensemble forecasts. *Mon. Wea. Rev.*, **138**, 1273–1292.

Ancell, B., and G. J. Hakim, 2007: Comparing adjoint- and ensemble-sensitivity analysis with applications to observation targeting. *Mon. Wea. Rev.*, **135**, 4117–4134.

Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. *Mon. Wea. Rev.*, **129**, 2884–2903.

Anderson, J. L., 2006: Exploring the need for localization in ensemble data assimilation using a hierarchical ensemble filter. *Physica D*, **230**, 99–111.

Anderson, J. L., 2009: Spatially and temporally varying adaptive covariance inflation for ensemble filters. *Tellus*, **61A**, 72–83.

Anderson, J. L., and N. Collins, 2007: Scalable implementations of ensemble filter algorithms for data assimilation. *J. Atmos. Oceanic Technol.*, **24**, 1452–1463.

Anderson, J. L., B. Wyman, S. Zhang, and T. Hoar, 2005: Assimilation of surface pressure observations using an ensemble filter in an idealized global atmospheric prediction system. *J. Atmos. Sci.*, **62**, 2925–2938.

Annan, J. D., D. J. Lunt, J. C. Hargreaves, and P. J. Valdes, 2005: Parameter estimation in an atmospheric GCM using the ensemble Kalman filter. *Nonlinear Processes Geophys.*, **12**, 363–371.

Aravéquia, A. J., I. Szunyogh, E. J. Fertig, E. Kalnay, D. Kuhl, and E. J. Kostelich, 2011: Evaluation of a strategy for the assimilation of satellite radiance observations with the local ensemble transform Kalman filter. *Mon. Wea. Rev.*, **139**, 1932–1951.

Barker, D. M., 2005: Southern high-latitude ensemble data assimilation in the Antarctic Mesoscale Prediction System. *Mon. Wea. Rev.*, **133**, 3431–3449.

Barnes, S. L., 1964: A technique for maximizing details in numerical weather map analysis. *J. Appl. Meteor.*, **3**, 396–409.

Beezley, J., and J. Mandel, 2008: Morphing ensemble Kalman filters. *Tellus*, **60A**, 131–140.

Berre, L., and G. Desroziers, 2010: Filtering of background error variances and correlations by local spatial averaging: A review. *Mon. Wea. Rev.*, **138**, 3693–3720.

Bishop, C. H., and D. Hodyss, 2007: Flow-adaptive moderation of spurious ensemble correlations and its use in ensemble-based data assimilation. *Quart. J. Roy. Meteor. Soc.*, **133**, 2029–2044.

Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. *Mon. Wea. Rev.*, **129**, 420–436.

Bonavita, M., L. Torrisi, and F. Marcucci, 2008: The ensemble Kalman filter in an operational regional NWP system: Preliminary results with real observations. *Quart. J. Roy. Meteor. Soc.*, **134**, 1733–1744.

Bonavita, M., L. Torrisi, and F. Marcucci, 2010: Ensemble data assimilation with the CNMCA regional forecasting system. *Quart. J. Roy. Meteor. Soc.*, **136**, 132–145.

Buehner, M., 2005: Ensemble-derived stationary and flow-dependent background-error covariances: Evaluation in a quasi-operational NWP setting. *Quart. J. Roy. Meteor. Soc.*, **131**, 1013–1043.

Buehner, M., P. L. Houtekamer, C. Charette, H. L. Mitchell, and B. He, 2010a: Intercomparison of variational data assimilation and the ensemble Kalman filter for global deterministic NWP. Part I: Description and single-observation experiments. *Mon. Wea. Rev.*, **138**, 1550–1566.

Buehner, M., P. L. Houtekamer, C. Charette, H. L. Mitchell, and B. He, 2010b: Intercomparison of variational data assimilation and the ensemble Kalman filter for global deterministic NWP. Part II: One-month experiments with real observations. *Mon. Wea. Rev.*, **138**, 1567–1586.

Caya, A., J. Sun, and C. Snyder, 2005: A comparison between the 4DVAR and the ensemble Kalman filter techniques for radar data assimilation. *Mon. Wea. Rev.*, **133**, 3081–3094.

Charron, M., P. L. Houtekamer, and P. Bartello, 2006: Assimilation with an ensemble Kalman filter of synthetic radial wind data in anisotropic turbulence: Perfect model experiments. *Mon. Wea. Rev.*, **134**, 618–637.

Chen, Y., and C. Snyder, 2007: Assimilating vortex position with an ensemble Kalman filter. *Mon. Wea. Rev.*, **135**, 1828–1845.

Dévényi, D., and T. W. Schlatter, 1994: Statistical properties of three-hour prediction “errors” derived from the Mesoscale Analysis and Prediction System. *Mon. Wea. Rev.*, **122**, 1263–1280.

Dirren, S., R. D. Torn, and G. J. Hakim, 2007: A data assimilation case study using a limited-area ensemble Kalman filter. *Mon. Wea. Rev.*, **135**, 1455–1473.

Dowell, D. C., and L. J. Wicker, 2009: Additive noise for storm-scale ensemble data assimilation. *J. Atmos. Oceanic Technol.*, **26**, 911–927.

Dowell, D. C., F. Zhang, L. J. Wicker, C. Snyder, and N. A. Crook, 2004: Wind and temperature retrievals in the 17 May 1981 Arcadia, Oklahoma, supercell: Ensemble Kalman filter experiments. *Mon. Wea. Rev.*, **132**, 1982–2005.

Ehrendorfer, M., 2007: A review of issues in ensemble-based Kalman filtering. *Meteor. Z.*, **16**, 795–818.

Etherton, B. J., 2007: Preemptive forecasts using an ensemble Kalman filter. *Mon. Wea. Rev.*, **135**, 3484–3495.

Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. *J. Geophys. Res.*, **99**, 10 143–10 162.

Evensen, G., 2003: The ensemble Kalman filter: Theoretical formulation and practical implementation. *Ocean Dyn.*, **53**, 343–367.

Evensen, G., 2007: *Data Assimilation: The Ensemble Kalman Filter*. Springer, 279 pp.

Fujita, T., D. J. Stensrud, and D. C. Dowell, 2007: Surface data assimilation using an ensemble Kalman filter approach with initial condition and model physics uncertainties. *Mon. Wea. Rev.*, **135**, 1846–1868.

Fujita, T., D. J. Stensrud, and D. C. Dowell, 2008: Using precipitation observations in a mesoscale short-range ensemble analysis and forecasting system. *Wea. Forecasting*, **23**, 357–372.

Gao, J., and M. Xue, 2008: An efficient dual-resolution approach for ensemble data assimilation and tests with simulated Doppler radar data. *Mon. Wea. Rev.*, **136**, 945–963.

Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. *Quart. J. Roy. Meteor. Soc.*, **125**, 723–757.

Hacker, J. P., and C. Snyder, 2005: Ensemble Kalman filter assimilation of fixed screen-height observations in a parameterized PBL. *Mon. Wea. Rev.*, **133**, 3260–3275.

Hacker, J. P., J. L. Anderson, and M. Pagowski, 2007: Improved vertical covariance estimates for ensemble-filter assimilation of near-surface observations. *Mon. Wea. Rev.*, **135**, 1021–1036.

Hakim, G. J., and R. D. Torn, 2008: Ensemble synoptic analysis. *Synoptic–Dynamic Meteorology and Weather Analysis and Forecasting: A Tribute to Fred Sanders*, L. F. Bosart and B. Bluestein, Eds., Amer. Meteor. Soc., 147–162.

Hamill, T. M., 2006: Ensemble-based atmospheric data assimilation. *Predictability of Weather and Climate*, T. Palmer and R. Hagedorn, Eds., Cambridge University Press, 124–156.

Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter–3D variational analysis scheme. *Mon. Wea. Rev.*, **128**, 2905–2919.

Hamill, T. M., and C. Snyder, 2002: Using improved background error covariances from an ensemble Kalman filter for adaptive observations. *Mon. Wea. Rev.*, **130**, 1552–1572.

Hamill, T. M., and J. S. Whitaker, 2005: Accounting for the error due to unresolved scales in ensemble data assimilation: A comparison of different approaches. *Mon. Wea. Rev.*, **133**, 3132–3147.

Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. *Mon. Wea. Rev.*, **126**, 796–811.

Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. *Mon. Wea. Rev.*, **129**, 123–137.

Houtekamer, P. L., and H. L. Mitchell, 2005: Ensemble Kalman filtering. *Quart. J. Roy. Meteor. Soc.*, **131**, 3269–3289.

Houtekamer, P. L., L. Lefaivre, J. Derome, H. Ritchie, and H. L. Mitchell, 1996: A system simulation approach to ensemble prediction. *Mon. Wea. Rev.*, **124**, 1225–1242.

Houtekamer, P. L., H. L. Mitchell, G. Pellerin, M. Buehner, M. Charron, L. Spacek, and B. Hansen, 2005: Atmospheric data assimilation with an ensemble Kalman filter: Results with real observations. *Mon. Wea. Rev.*, **133**, 604–620.

Houtekamer, P. L., H. L. Mitchell, and X. Deng, 2009: Model error representation in an operational ensemble Kalman filter. *Mon. Wea. Rev.*, **137**, 2126–2143.

Hu, X., F. Zhang, and J. W. Nielsen-Gammon, 2010: Ensemble-based simultaneous state and parameter estimation for treatment of mesoscale model error: A real-data study. *Geophys. Res. Lett.*, **37**, L08802, doi:10.1029/2010GL043017.

Huang, X., and P. Lynch, 1993: Diabatic digital-filtering initialization: Application to the HIRLAM model. *Mon. Wea. Rev.*, **121**, 589–603.

Hunt, B. R., and Coauthors, 2004: Four-dimensional ensemble Kalman filtering. *Tellus*, **56A**, 273–277.

Hunt, B. R., E. J. Kostelich, and I. Szunyogh, 2007: Efficient data assimilation for spatiotemporal chaos: A local ensemble transform Kalman filter. *Physica D*, **230**, 112–126.

Jung, Y., M. Xue, G. Zhang, and J. M. Straka, 2008: Assimilation of simulated polarimetric radar data for a convective storm using the ensemble Kalman filter. Part II: Impact of polarimetric data on storm analysis. *Mon. Wea. Rev.*, **136**, 2246–2260.

Jung, Y., M. Xue, and G. Zhang, 2010: Simultaneous estimation of microphysical parameters and the atmospheric state using simulated polarimetric radar data and an ensemble Kalman filter in the presence of observation operator error. *Mon. Wea. Rev.*, **138**, 539–562.

Kalman, R. E., 1960: A new approach to linear filtering and prediction problems. *Trans. ASME J. Basic Eng.*, **82D**, 35–45.

Kalman, R. E., and R. S. Bucy, 1961: New results in linear filtering and prediction theory. *Trans. ASME J. Basic Eng.*, **83D**, 95–108.

Kepert, J. D., 2009: Covariance localisation and balance in an ensemble Kalman filter. *Quart. J. Roy. Meteor. Soc.*, **135**, 1157–1176.

Krishnamurti, T. N., C. M. Kishtawal, T. E. LaRow, D. R. Bachiochi, Z. Zhang, C. E. Williford, S. Gadgil, and S. Surendran, 1999: Improved weather and seasonal climate forecasts from multi-model superensemble. *Science*, **285**, 1548–1550.

Lawson, W. G., and J. A. Hansen, 2005: Alignment error models and ensemble-based data assimilation. *Mon. Wea. Rev.*, **133**, 1687–1709.

Lei, J., P. Bickel, and C. Snyder, 2010: Comparison of ensemble Kalman filters under non-Gaussianity. *Mon. Wea. Rev.*, **138**, 1293–1306.