## 1. Introduction

One important goal of the ensemble approach in atmospheric data assimilation is to approximate moments of the probability distribution functions (PDFs) of the analyses and forecasts using a group of random realizations. The ensemble Kalman filter (EnKF) is an objective way to obtain a set of analyses and also initialize ensemble forecasts. Furthermore, an advantage of using the EnKF algorithm is that it estimates and updates the background error covariances with a short-term ensemble forecast in each cycle, thus taking into account situation-dependent features.

EnKF algorithms have been developed for a wide range of spatial scales. For large scales, Houtekamer and Mitchell (2005) implemented an EnKF system at the Canadian Meteorological Center (CMC) to assimilate observations with the Global Environmental Multiscale (GEM) model. This EnKF system provides an ensemble of initial conditions for the CMC's medium-range ensemble prediction system. Their study demonstrated that the EnKF can be used successfully for operational atmospheric data assimilation. Szunyogh et al. (2008) employed the local ensemble transform Kalman filter (LETKF) algorithm with the National Centers for Environmental Prediction (NCEP) global model. They found that the LETKF provides more accurate analyses than the spectral statistical interpolation (SSI) analyses in sparse observation regions.

At regional scales, Torn and Hakim (2008) used a limited-area EnKF system based on the Weather Research and Forecasting Model (WRF) with conventional data. They found that upper-tropospheric wind and midtropospheric temperature are correlated with the water vapor field, which suggests that assimilating cloud motion wind and aircraft temperature observations may have a significant impact on the moisture analysis. The Italian National Meteorological Service also applied the LETKF in regional numerical weather prediction (NWP; Bonavita et al. 2010). The results showed that, according to a root-mean-square error verification metric, the LETKF-based forecasts generally outperformed their operational counterparts, which were based on three-dimensional variational data assimilation (3D-Var) with constant background error covariances.

The application of the EnKF technique at the storm scale is relatively new, and the research is focused on the accuracy of EnKF analyses. Using simulated Doppler winds, Snyder and Zhang (2003) first applied the EnKF algorithm coupled with a cloud-resolving model. The results demonstrated the potential of the EnKF at convective scales. Tong and Xue (2005) examined the impact of assimilating both Doppler and reflectivity data in a series of observing system simulation experiments (OSSEs). They concluded that the best results are obtained when both Doppler wind and reflectivity data are used. Dowell et al. (2004, 2011) tested the EnKF algorithm with real radar data. All the above studies showed that by assimilating Doppler winds and/or reflectivity, realistic storm-scale structures can be obtained in the analyses. Recently, several investigations have turned to very short-term forecasts and to specific weather phenomena (Zhang et al. 2009; Stensrud and Gao 2010; Aksoy et al. 2010).

Forecast error covariances play a crucial role in data assimilation algorithms. However, their structure at convective scales is not well understood. Some studies (Bannister et al. 2011; Montmerle and Berre 2010) have shown that instead of using climatological synoptic-scale statistics, it is preferable to construct situation-dependent background error statistics. In the 3D-Var framework, Brousseau et al. (2012) examined the impact of using situation-dependent background error covariances (provided by a six-member ensemble) at convective scales. They showed the impact on analysis increments and found improvements in short-term forecasts.

In this study, a high-resolution ensemble Kalman filter (HREnKF) system has been adapted for the limited-area model GEM_LAM from the global EnKF system (Houtekamer and Mitchell 2005; Houtekamer et al. 2009) currently operational at the CMC. The goal for the near future is to develop a convective-scale data assimilation system that assimilates radar data. Before discussing systematic assimilation of real radar observations, this paper examines the transition from homogeneous isotropic to situation-dependent short-term forecast error covariances at cloud-resolving scales. Many studies have shown the advantages of propagating the information of flow-dependent forecast errors via cycling procedures in EnKF systems. However, forecast errors at the convective scale are not well understood. Investigating the complex structure of forecast errors at cloud-resolving scales also helps to provide optimal values for various parameters (e.g., localization length) in the EnKF system.

The paper is structured as follows. In section 2, the HREnKF system is introduced, while the configuration of the limited-area model and the method used to specify the initial perturbation in the HREnKF are presented in section 3. Section 4 describes a case study and the performance of the deterministic forecast. The results of background error covariances at the mesoscale/convective scale are presented in section 5. The summary and some suggestions for future work are given in section 6.

## 2. The high-resolution EnKF (HREnKF) system

**y** is a set of perturbed observations (**y** =
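
For reference, the analysis step of a perturbed-observation EnKF is commonly written in the form below. This is the standard textbook form (e.g., Houtekamer and Mitchell 2005), offered here as an assumption about what Eqs. (2.1) and (2.2) denote; the symbols are illustrative and may differ from the notation of this paper.

```latex
% Update of ensemble member i (N members in total)
\mathbf{x}^{a}_{i} = \mathbf{x}^{f}_{i}
  + \mathbf{K}\left(\mathbf{y}_{i} - \mathbf{H}\mathbf{x}^{f}_{i}\right),
  \qquad i = 1, \ldots, N

% Kalman gain built from the ensemble-estimated forecast error covariance
\mathbf{K} = \mathbf{P}^{f}\mathbf{H}^{\mathrm{T}}
  \left(\mathbf{H}\mathbf{P}^{f}\mathbf{H}^{\mathrm{T}} + \mathbf{R}\right)^{-1}

% Perturbed observations for member i
\mathbf{y}_{i} = \mathbf{y}^{o} + \boldsymbol{\epsilon}_{i},
  \qquad \boldsymbol{\epsilon}_{i} \sim \mathcal{N}(\mathbf{0}, \mathbf{R})
```

In this sketch, **x**^{f} and **x**^{a} are the background and analysis states, **H** is a (linearized) observation operator, **P**^{f} is the forecast error covariance estimated from the ensemble, and **R** is the observation error covariance.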

The HREnKF assimilation system has the following features.

### a. Sequential processing of batches of observations

In operational atmospheric data assimilation systems, the observation vector typically has *O*(10^{6}) elements or more. To deal with issues of storage and inversion of matrices, the observations are divided into batches that are assimilated sequentially. Compared with assimilating all observations simultaneously, this batching is strictly valid as long as observations whose errors are correlated with each other are processed in the same batch (Houtekamer and Mitchell 2001).
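
The equivalence between sequential and simultaneous assimilation can be checked on a toy problem. The sketch below is not the HREnKF code: it assumes a single scalar state that is observed directly, with each "batch" holding one observation whose error is independent of the others. All function names and numbers are hypothetical.

```python
def update(mean, var, obs, obs_var):
    """Scalar Kalman update for a directly observed state."""
    gain = var / (var + obs_var)
    return mean + gain * (obs - mean), (1.0 - gain) * var


def assimilate_sequentially(mean, var, batches):
    """Assimilate one batch at a time; each analysis is the next prior."""
    for obs, obs_var in batches:
        mean, var = update(mean, var, obs, obs_var)
    return mean, var


def assimilate_simultaneously(mean, var, batches):
    """Assimilate all observations at once by summing precisions."""
    precision = 1.0 / var + sum(1.0 / r for _, r in batches)
    weighted = mean / var + sum(y / r for y, r in batches)
    return weighted / precision, 1.0 / precision


# Hypothetical prior and two observations with uncorrelated errors.
prior_mean, prior_var = 20.0, 4.0
batches = [(21.0, 1.0), (19.5, 2.0)]

seq = assimilate_sequentially(prior_mean, prior_var, batches)
sim = assimilate_simultaneously(prior_mean, prior_var, batches)
print(seq, sim)
```

The two results agree to rounding error because the observation errors are independent. If the errors of the two observations were correlated, splitting them into separate batches would no longer reproduce the simultaneous analysis.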

### b. Partitioning the ensemble

In the EnKF algorithm, the same set of prior fields could be used both to provide initial guesses and to compute the Kalman gain. This double use of the same information may lead to an underestimation of the spread in the ensemble (Houtekamer and Mitchell 2001). In the current system, the ensemble is partitioned into four subensembles, and the gain matrix used for each subensemble is computed by using the prior fields from the other subensembles, thus improving the correspondence between ensemble spread and the ensemble mean error. The disadvantage of such a scheme is that the estimates of covariance are noisier, because of the smaller size of the subensembles.
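
The bookkeeping behind this partitioning can be sketched as follows. The example assumes an 80-member ensemble split into four interleaved subensembles; the interleaved assignment and the function names are illustrative assumptions, not a description of the operational code.

```python
def partition(members, n_groups=4):
    """Split member indices into n_groups interleaved subensembles."""
    return [members[g::n_groups] for g in range(n_groups)]


def gain_sources(groups):
    """For each subensemble, list the members used to compute its gain:
    all members that belong to the other subensembles."""
    sources = []
    for g in range(len(groups)):
        others = [m for h, grp in enumerate(groups) if h != g for m in grp]
        sources.append(others)
    return sources


members = list(range(80))
groups = partition(members)
sources = gain_sources(groups)

for g, (grp, src) in enumerate(zip(groups, sources)):
    print(f"subensemble {g}: {len(grp)} members updated, "
          f"gain estimated from {len(src)} other members")
```

No member contributes to the gain that is used to update it, which is the property that avoids the double use of the same prior information.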

### c. Localization

### d. Simulation of model errors

It is important to account for model error properly, since neglecting it may lead to a very small ensemble spread, which in turn may cause a convergence problem of the filter. Unfortunately, the model error in NWP, especially at the convective scale, is not well understood. Following Houtekamer et al. (2009), the model error component of the HREnKF applies a simplified, reduced-amplitude form of homogeneous and isotropic background error correlations. This is done by adding an ensemble of random perturbation fields with a specified covariance structure to the ensemble of background fields.

The HREnKF system consists of a set of parallel short-term forecast and data assimilation steps. Figure 1 illustrates the cycling procedure between analysis and forecast steps. In our study, the first initial guess is from a previous, unperturbed (deterministic) forecast. By adding prescribed random perturbations (based on the aforementioned procedure for dealing with model error) to the deterministic forecast, an initial set of ensemble members is obtained. The random errors are added to simulate errors of the numerical model. To take into account the uncertainty in observations, these are also perturbed according to their estimated errors. Via the data assimilation process (analysis step), one is able to update the analyses and launch the model (forecast step) to produce very short-term forecasts. The analysis and forecast steps are repeated (dashed line) in the system as the cycling proceeds.

## 3. Configuration of the experiment

### a. Limited-area model

The fully compressible limited-area model GEM_LAM is used in our study. The model employs an implicit scheme in time and a semi-Lagrangian scheme in space. Detailed descriptions of the GEM model dynamics and physics formulations are available in Côté et al. (1998) and Mailhot et al. (1998), respectively.

A three-level nested domain (Fig. 2) is used in the model configuration to obtain a deterministic forecast in our experiments. The global grid forecast was run using GEM at a 15-km resolution (hereafter GLB-15km). The GLB-15km forecast, which used the Sundqvist condensation scheme (Sundqvist 1978), was run from 1200 UTC 21 July to 0300 UTC 22 July 2010. These hourly forecasts were used as initial conditions (1200 UTC) and lateral boundary conditions to launch a limited-area model in domain A with a horizontal resolution of 15 km (hereafter LAM-15km). The Milbrandt and Yau (2005) double-moment microphysics scheme used in LAM-15km predicts the mass mixing ratio and total number concentration of six hydrometeor categories (cloud water, rain, ice, snow, graupel, and hail). This approach is more precise than the Sundqvist condensation scheme for the computation of microphysical growth/decay rates and precipitation, and it is expected to shorten the spinup phase. A second nested forecast (domain B) is started 6 h later (1800 UTC) at a 2.5-km resolution (hereafter LAM-2.5km). This domain (564 × 494 grid points) covers the southern part of the provinces of Ontario and Québec. A 1-km resolution simulation (domain C, LAM-1km) is launched a further 6 h later, at 0000 UTC 22 July 2010, with an integration time step of 30 s. The LAM-1km is centered on the Montréal region (300 × 300 grid points) for the purpose of eventually assimilating S-band radar data provided by McGill University.

The limited-area simulations are fully nonhydrostatic with 58 hybrid vertical levels and a lid at 10 hPa. The land surface scheme called Interaction between Soil–Biosphere–Atmosphere (ISBA; see Noilhan and Planton 1989) is applied. The Kain–Fritsch convective scheme (Kain and Fritsch 1990) is applied in LAM-15km; however, no convective parameterization is used in either LAM-2.5km or LAM-1km. In addition, in contrast to the global EnKF system, which uses a multimodel option (different versions of physical parameterizations), we currently keep all the physical schemes fixed when running the ensemble forecasts. We point out that this configuration may cause an underestimation of the error covariances. The double-moment version of the Milbrandt and Yau (2005) microphysics scheme is used for the grid-scale processes. Note that, besides the standard model control variables [horizontal wind (*u*, *υ*), temperature *T*, and specific humidity HU], the mixing ratio and number concentration of six hydrometeor variables (cloud water, rain, snow, ice, graupel, and hail) are also carried from the driving conditions.

### b. Method of adding initial perturbations for ensemble members

It is feasible to obtain a set of ensemble initial states from the global EnKF system. However, as mentioned previously, in this study the initial states are constructed by generating random perturbations and adding them to a deterministic forecast. By providing homogeneous and isotropic perturbations to obtain initial states, one is able to examine the transition to situation-dependent forecast error covariance structures. In addition, the evolution of situation-dependent error structures in different locations can be fairly compared.

The HREnKF includes a background error covariance simulator that produces random perturbations, from which we sample different fields for different members of the ensemble. The random perturbations are generated from the bi-Fourier decomposition in the spectral domain (Fillion et al. 2010). The error simulator considers independent perturbations for streamfunction, divergence, temperature, humidity, and surface pressure, which are then transformed into wind, temperature, specific humidity, and surface pressure background errors. The initial perturbations of background errors in the HREnKF system are generated as horizontally homogeneous and isotropic in the limited-area domain. We stress here that no well-tuned mesoscale (or convective-scale) data assimilation system is available to us at this stage of our study that could serve as a guide. Contrary to the global EnKF, we do not have at hand reliable operational nonseparable spectral homogeneous and isotropic correlation statistics. We thus used a simple specification of correlation scales in the horizontal and vertical using the separability assumption for background error correlations. This error specification is obviously an approximation. Nevertheless, we show clearly in the following that a rapid transition to situation-dependent structures occurs. The background error standard deviations for the control variables are 3 m s^{−1} for horizontal wind, 0.5° for temperature, and 0.1 for the logarithm of specific humidity. Those statistics were obtained from a 2.5-km National Meteorological Center (NMC, now known as NCEP) approach (e.g., Fillion et al. 2010). We imposed a 10-km horizontal correlation length for streamfunction, velocity potential, temperature, and logarithm of specific humidity, and a correlation length of 200 hPa in the vertical. No cross correlations are imposed between different variables. 
In addition, we note that in general, it is important to consider perturbations of lateral boundary conditions (Caron 2013). However, for the first step of implementing the Canadian HREnKF system, we only added the errors to the fields over the entire analysis domain and did not perturb the lateral boundary conditions.
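
One consequence of perturbing the logarithm of specific humidity is that the resulting humidity errors scale with the local humidity itself. The sketch below illustrates this with uncorrelated Gaussian perturbations of ln(HU) using the 0.1 standard deviation quoted above. The two humidity values, the sample size, and the function names are hypothetical, and the spatial correlation of the real perturbation fields is ignored.

```python
import math
import random


def perturb_humidity(hu, rng, sigma_ln=0.1):
    """Add a Gaussian perturbation to ln(HU) and return the perturbed HU."""
    return hu * math.exp(rng.gauss(0.0, sigma_ln))


def humidity_spread(hu, n=20000, seed=1):
    """Sample standard deviation of HU after perturbing ln(HU) n times."""
    rng = random.Random(seed)
    samples = [perturb_humidity(hu, rng) for _ in range(n)]
    mean = sum(samples) / n
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))


dry, moist = 0.002, 0.012  # hypothetical specific humidities (kg/kg)
spread_dry = humidity_spread(dry)
spread_moist = humidity_spread(moist)
print(spread_dry, spread_moist, spread_moist / spread_dry)
```

The same perturbation of ln(HU) produces a spread six times larger at the moister point, so a homogeneous perturbation of ln(HU) does not yield a homogeneous error field for HU.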

## 4. Description of the case study

The period selected for the case study was 21–22 July 2010. During this period, a mesoscale cyclone developed near the border between the provinces of Québec and Ontario, Canada, and subsequently moved eastward over Québec around 1300 UTC 21 July 2010. Precipitation was observed over the Montréal region from 1800 UTC 21 July to 0600 UTC 22 July. A deterministic model simulation was produced as the control run, and compared to remote sensing observations to assess the performance of the forecast.

Figure 3a shows the brightness temperature from the 11-*μ*m channel of Geostationary Operational Environmental Satellite (GOES) data around 2300 UTC 21 July, and Fig. 3b shows the model 5-h simulation (2300 UTC) of surface precipitation at 2.5-km resolution. The similarity in the pattern of weather systems (I and II) between observations and simulation indicates that the driving model provides reasonable initial and lateral boundary conditions in this case. The CMC radar composite over southern Québec is shown in Fig. 4a. The precipitation rate of the 1-km resolution forecast at 0030 UTC 22 July 2010 is depicted in Fig. 4b. At a lead time of 30 min, the LAM-1km forecast predicted precipitation over the Montréal region. Two well-structured weather systems are simulated in the analysis domain: one in the north and another in the southeast of the domain. In addition, some convective cells are scattered in the center and south of the domain. Compared to radar observations, the LAM-1km is able to simulate the precipitation over the Montréal domain, but with different structures and locations. By examining the time sequence of radar composites over southern Québec, we found that a phase error seems to occur. One should note that this is not an uncommon issue at mesoscale and convective scales.

## 5. Results of the forecast error at the mesoscale/convective scale

Adaptation of the global EnKF data assimilation system for convective scales required a large number of modifications. It is imperative for the new code to pass basic validation tests, which we describe in the appendix.

The following results are based on the perturbation method introduced in section 3. By using the same random perturbation method for a range of correlation lengths appropriate for convective scales (from 10 to 20 km), sensitivity tests showed that one can obtain qualitatively similar error structures. Our study and discussions focus on the transition to situation-dependent background error covariances, and how the error structures vary in different regions, both in the horizontal and the vertical.

### a. Horizontal error structure

Figure 5 shows the horizontal error correlations of *u* wind, temperature, and humidity based on an 80-member forecast ensemble at the initial time (*t* = 0 min). For visualization purposes, the analysis domain is divided into 25 subdomains, and the error correlation is computed with respect to the center of each subdomain. In general, the error structure reflects the use of homogeneous isotropic initial perturbations, except for the specific humidity field. This is because humidity perturbations are generated from the logarithm of specific humidity, ln(HU), so the error structure depends on the mean state of the field. Because of the sampling error, the discrepancy in each subdomain is discernible in areas of weak correlations. The accuracy of the correlations is estimated by (Stuart and Ord 1986) where

*n* ensemble members. Substituting *n* = 80 and the value
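
The order of magnitude of this sampling error can be illustrated with a commonly used large-sample approximation for the standard error of a sample correlation, (1 − ρ²)/√(*n* − 1). This form is assumed here for illustration only and is not necessarily the exact expression used in this study; the function name is hypothetical.

```python
import math


def correlation_std_error(rho, n):
    """Approximate standard error of a sample correlation rho
    estimated from n independent ensemble members (assumed formula)."""
    return (1.0 - rho ** 2) / math.sqrt(n - 1)


# Sampling error for an 80-member ensemble at several true correlations.
for rho in (0.0, 0.3, 0.6, 0.9):
    print(rho, round(correlation_std_error(rho, 80), 3))
```

Under this approximation, an 80-member ensemble gives an uncertainty of about 0.11 for correlations near zero, which is why weak correlations look noisy from one subdomain to the next, while strong correlations are estimated more accurately.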

Figure 6 shows the forecast error structures that develop after 5 min of integration of GEM_LAM. The deformation of the forecast error of *u* wind (and likewise of *υ* wind, not shown) is manifest in each subdomain (Fig. 6a) as compared to the forecast error of the temperature (Fig. 6b) and humidity (Fig. 6c). The deformation of horizontal wind appears similar in all subdomains and has longer correlation lengths compared to the error structure we specified at the initial time. Daley (1985, his Fig. 1) shows error correlations of the *u* wind that are similar to those in Fig. 6a. His study shows that when the flow has both rotational and divergent components, the error structures tend toward an oval shape (positive correlation) with negative correlation lobes on the sides, but the direction of deformation is no longer along the east–west direction for *u* wind. This indicates that dynamic processes quickly affect and dominate the error structures of the *u* wind in the very early stage of model integration. After 10 min of model integration, the errors of *u* wind (Fig. 7a) evolve into various structures in each subdomain, and the development of the forecast error of temperature and specific humidity is more significant (Figs. 7b,c) than at *t* = 5 min (Figs. 6b,c). Furthermore, there is a strong resemblance between the temperature and humidity error structures. At 15 min, all the control variables exhibit the transition from purely homogeneous isotropic background error correlations to situation-dependent correlations (not shown). The evolution of forecast errors in the first 15 min indicates that as the model is integrated in time, the control variables rapidly build up their own error structures based on dynamic, thermodynamic, and physical processes. The *u*-wind error structure is established rapidly, but error structures of thermodynamic variables (temperature and humidity) evolve on a longer time scale.

After 30 min of model integration, significant heterogeneous error structure develops in the analysis domain (Fig. 8). The situation-dependent error structure (in each subdomain) is clearly present in all variables. One can see that the error structures of all variables are aligned along the lateral boundaries, and that the mean flow (see Fig. 4b, wind vectors) affects the error structures of the wind component (Fig. 8a). In addition, it is evident that the correlation lengths are shorter in some of the subdomains and longer in others. For instance, the subregion at *x* = 119–179 and *y* = 0–59 shows that the error structure can be extremely localized. This is associated with tiny convection cells appearing in the southern domain of Fig. 4b, and it suggests that this physical process is isolated and decorrelates quickly with other processes occurring in the surrounding environment. Moreover, the spatial deformation of the errors has different orientations in each subdomain. Compared with the surface precipitation from the deterministic forecast in Fig. 4b, one can see that, in general, the error correlation length is much smaller surrounding the precipitation area and larger in the nonprecipitating regions. For instance, the error correlation length in the southwest of the domain (nonprecipitating) is much longer than in the center and southeast of the domain (precipitating).

Next, we examined the performance of short-term forecasts. Compared with the control run, the 80 ensemble members exhibit a variety of intensities and locations of precipitation. However, none of the forecasted precipitation patterns resembles the radar observations at 0030 UTC 22 July 2010 (not shown). The ensemble spread in the horizontal illustrates the uncertainty in numerical forecasts in space. The ensemble spread for horizontal wind (*u*, *υ*), vertical velocity *w*, and temperature *T* at 800 hPa corresponding to the 30-min forecast is plotted in Fig. 9. The very small values of spread, visible near the boundaries in all variables, are a result of the lack of perturbations in the lateral boundary conditions. Larger values of spread occur near the two main weather systems and in the southern part of the analysis domain. In addition, relatively small ensemble spread is visible in the southwest portion of the domain, which is a region without precipitation in most of the ensemble forecasts. The ensemble forecasts exhibit very little precipitation in the southwest of the domain because the atmosphere is relatively stable over that area, and therefore the uncertainty is small. Furthermore, localized convective storms occur in the southern part of the domain. The intensity and the location of the storms vary from one member to another, so the uncertainty is large over that area. Moreover, the ensemble spread reveals that the precipitation system in the southeast corresponds to higher uncertainty compared to the system in the north of the domain.
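
The ensemble spread discussed above is the standard deviation across members at each grid point. A minimal sketch is given below; it assumes the sample (*n* − 1) standard deviation, and the four-member, three-point "fields" are hypothetical values chosen only to contrast a stable region with a convectively active one.

```python
import statistics


def ensemble_spread(members):
    """Standard deviation across members at each grid point.

    members: list of fields, each a list of values on the same grid.
    """
    return [statistics.stdev(values) for values in zip(*members)]


# Hypothetical temperature values (one row per member, one column per point).
members = [
    [10.0, 12.0, 15.0],
    [10.1, 13.0, 11.0],
    [9.9, 11.0, 18.0],
    [10.0, 12.0, 16.0],
]
spread = ensemble_spread(members)
print(spread)
```

Points where the members nearly agree, as near unperturbed lateral boundaries or in stable air, give a small spread, whereas points where storm location and intensity differ between members give a large spread.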

### b. Vertical error structure

Figure 10 shows the vertical correlation structure obtained from the background error simulator that was used to generate the initial ensemble of perturbations. Correlations shown are relative to the 600-hPa pressure surface. As we have already shown for the horizontal structure of forecast error, the vertical structure also develops situation-dependent features rapidly (i.e., within the first 15 min of the forecast). Vertical profiles of the error correlation for temperature corresponding to the 30-min forecast are presented in Fig. 11 with 25 subdomains (as before, the error correlation is computed with respect to the center of each subdomain). The temperature vertical error correlation exhibits different structures in different subdomains. The vertical profiles are clearly different in nonprecipitating (southwest of the domain) and precipitating areas. We select three areas (see Fig. 11) in the analysis domain for further examination of the vertical error structure: subdomain 7 (no precipitation), 24 (precipitation), and 10 (precipitation). All members forecasted precipitation in subdomains 10 and 24, although with different intensities and patterns.

The vertical error structure is computed in each subdomain and averaged over the area of the subdomain (3600 grid points). Figure 12a presents the vertical error structure in subdomain 7, which is the nonprecipitating subdomain. The vertical error structure is nearly symmetric, and the correlation length is slightly shorter than 200 hPa (the initial correlation length). In addition, negative error correlations occur at high and low levels of the profile. This is the typical temperature vertical error correlation structure observed in large-scale data assimilation systems, related to the hydrostatic balance over nonprecipitating areas [see Fig. 13 in Gustafsson et al. (1999)]. We emphasize that the original random perturbations prescribed have a Gaussian vertical structure without negative lobes. However, as the 30-min forecast indicates, once the model is launched, the temperature rapidly develops error structures with such negative lobes over nonprecipitating regions. On the other hand, the vertical profiles in subdomain 24 (Fig. 12b) and subdomain 10 (Fig. 12c) exhibit shorter correlation lengths in the vertical, and the correlation structure is no longer close to symmetric as in subdomain 7 (Fig. 12a), but rather more correlated above the reference level (600 hPa).

Since physical processes are the main difference between the precipitating and nonprecipitating areas, we suspect that these processes play an important role in determining the vertical error structure of temperature. To examine the details of the vertical error structure in precipitating regions, the temperature tendencies are computed, being careful to discriminate between dynamics and physics contributions. The tendencies due to advection and ageostrophic accelerations are referred to as the dynamical forcing term. The tendencies from the physical parameterizations (radiation, condensation, etc.) are referred to as physical forcing. Figure 13 shows the ratio of temperature tendencies due to physics and dynamics (*F*_{T} is the temperature tendency) at 600 hPa. A ratio equal to or larger than 1 indicates that the physical tendency is at least as important as the dynamical tendency. The plot illustrates the fact that the dynamics dominates in nonprecipitating regions (white color means negligible or zero tendency due to physics); around precipitating areas, it is quite common to see the value of the ratio larger than 1. In some locations, physical processes can even dominate (red and purple colors), which shows the importance of physical processes in precipitation regions.

To examine this issue further, we imposed a temperature observation near 600 hPa at *x* = 198, *y* = 258 in subdomain 24 (see the cross in Fig. 13). Figure 14a shows the vertical increment of the temperature cross section at point A (158 < *x* < 238 and *y* = 258). This single-observation test showed that the increment is of significant (positive) amplitude between 650 and 400 hPa. This feature matches the average of vertical error structures in subdomain 24 (Fig. 12b). Furthermore, the ensemble mean of total temperature tendency due to the physics is plotted in Fig. 14b. The vertical cross section shows that most of the heating is localized between 700 and 400 hPa, and has a vertical spread corresponding to the vertical increment in Fig. 14a. We have confirmed that most of the contribution to the physics temperature tendency comes from condensation (not shown), which suggests that localized diabatic heating contributes to such a narrow vertical error structure. The horizontal error correlations of temperature at *t* = 30 min at the location of the imposed observation are plotted in Figs. 14c and 14d. The correlations are much shorter at midlevels (Fig. 14d) compared to those at low levels (Fig. 14c), suggesting that the error horizontal length scales tend to be larger at low levels, where dynamics plays a dominant role, whereas the correlations are relatively localized in the horizontal and vertical at levels where physical processes are active.

Our study clearly demonstrates that the error structures are quite different in nonprecipitating and precipitating regions. Similar conclusions were found in other studies (e.g., Caron and Fillion 2010; Montmerle and Berre 2010). As part of their 3D-Var assimilation system, Montmerle and Berre (2010) used a strategy for modeling background error correlations based on discrimination between precipitating and nonprecipitating regions as indicated by radar observations. The HREnKF data assimilation system indicates that care should be taken when attempting such an approach. To illustrate this point, in Fig. 15a we show the area-averaged (1800 grid points) vertical error structure in the region defined by the black box in Fig. 4b, where no surface precipitation occurs in any of the 15-min ensemble forecasts. The temperature error correlation, computed at approximately 600 hPa, resembles the characteristic correlation for precipitating regions (with a narrow and asymmetric error correlation) as in Figs. 12b and 12c. Figure 15b shows an example of a cross section (east–west; see the arrow in Fig. 4b) of the cloud mixing ratio from one ensemble member. Even though there is no precipitation at the surface over this area, clouds have developed. This means that once physical processes become active, the vertical error structures change rapidly before the onset of precipitation.

## 6. Summary

For the purposes of data assimilation at cloud-resolving scales, as well as the future assimilation of radar data, the global EnKF system at the Canadian Meteorological Center has been modified for a high-resolution LAM grid. This system is called HREnKF.

This study focused on investigating the forecast error at the convective scale with a summer case valid on 22 July 2010. With an 80-member ensemble, the forecast error correlation structures evolve and exhibit a “situation dependence” rapidly after launching the GEM_LAM, typically within 15 min. In addition, we found that the situation-dependent error structures for different control variables tend to develop on different time scales. Dynamic variables such as horizontal wind evolve faster than the thermodynamic variables (temperature and humidity), and the humidity error typically resembles the temperature error. The error variances derived from ensemble forecasts (e.g., from 30-min integrations) illustrate the uncertainty of weather forecasts at cloud-resolving scales. Larger error variances tend to be found inside and around regions of precipitation (both embedded storm cells and mesoscale systems forced by large-scale motions), while the error variance usually remains small in nonprecipitating areas where the atmosphere is stable. Furthermore, an examination of the forecast error in the horizontal and the vertical plane demonstrates that the error structures are characteristically different in precipitating and nonprecipitating regions. By computing the temperature tendencies due to dynamics and physics, we have shown that physical processes are as important as dynamics at convective scales. The ensemble mean of the temperature tendency due to physics confirmed that diabatic heating is the major factor that modifies the temperature error structure. This indicates that the error structures both in the horizontal *and* vertical need to be addressed carefully at convective scales. Furthermore, our study shows that once physical processes are active, the error structures change rapidly before precipitation occurs.

The next step following this study is to assimilate McGill S-band radar data in the system. We plan to examine and report on the impact of assimilating radar data using a set of summer and winter cases.

Thanks to Dr. Mateusz Reszka for the internal review. We appreciated the comments and suggestions from three anonymous reviewers.

# APPENDIX

## Validation of the HREnKF System

A single-observation experiment is performed for validation of the HREnKF analysis procedure. This experiment also allows us to examine the actual analysis response based on forecast error structures provided by the HREnKF system.

### a. Validation experiments and sampling errors

The experiment is performed with 80 ensemble members, and we impose the single test observation at the center of the analysis domain (*x* = 150, *y* = 150). A simulated temperature innovation of 1° with an observation error standard deviation of 1° is imposed at 850 hPa. The horizontal localization distance is 60 km, and the vertical localization is configured to force the covariances to zero at a distance of two scale heights.
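As an illustration of how such a horizontal localization can taper covariances to zero, the sketch below uses the Gaspari and Cohn fifth-order piecewise rational function, a common choice in EnKF systems. This appendix does not name the localization function, so both that choice and the reading of 60 km as the distance at which the weight vanishes are assumptions; the function name `gaspari_cohn` and the sample distances are hypothetical.

```python
import numpy as np

def gaspari_cohn(dist, cutoff):
    """Fifth-order compactly supported correlation function (Gaspari-Cohn form).

    dist   : separation(s) from the observation, same units as cutoff
    cutoff : distance at which the weight reaches exactly zero (assumed 60 km here)
    """
    c = cutoff / 2.0  # half-width; the function has support [0, 2c]
    r = np.abs(np.atleast_1d(np.asarray(dist, dtype=float))) / c
    w = np.zeros_like(r)

    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)

    ri = r[inner]
    w[inner] = (-0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3
                - (5.0 / 3.0) * ri**2 + 1.0)

    ro = r[outer]
    w[outer] = (ro**5 / 12.0 - 0.5 * ro**4 + 0.625 * ro**3
                + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0 - 2.0 / (3.0 * ro))
    return w

# Hypothetical sample distances (km) from the single observation.
distances_km = np.array([0.0, 15.0, 30.0, 45.0, 60.0, 90.0])
weights = gaspari_cohn(distances_km, cutoff=60.0)
print(weights)
```

Multiplying the ensemble covariance between the observation point and each grid point by these weights leaves the increment unchanged at the observation location and removes it entirely beyond the cutoff.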

Figure A1a shows the temperature analysis increment for the HREnKF analysis. The increment is nearly isotropic in space, which reflects the prescribed error structure. The amplitude of the horizontal analysis increment decreases to zero at about 60 km from the imposed observation (heavy solid contour line), which is consistent with the prescribed localization distance. The maximum increment of 0.25° is at the observation location. This is consistent with Eqs. (2.1) and (2.2) when the ensemble-estimated forecast error standard deviation at the observation point (estimated as 0.57°) is used.
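The quoted maximum increment can be checked with a scalar Kalman update. Equations (2.1) and (2.2) are not reproduced in this appendix, so the sketch assumes they reduce, for a single directly observed variable at the observation point (where the localization weight is 1), to the standard gain of background variance over the sum of background and observation variances. It also reads the quoted 0.57° as a standard deviation, since it carries units of degrees. The function name is hypothetical.

```python
def scalar_kalman_increment(innovation, sigma_b, sigma_o):
    """Analysis increment at the observation point for one scalar observation.

    innovation : observation minus background (degrees)
    sigma_b    : background (forecast) error standard deviation (degrees)
    sigma_o    : observation error standard deviation (degrees)
    """
    gain = sigma_b**2 / (sigma_b**2 + sigma_o**2)
    return gain * innovation

# Values quoted in the text: innovation 1 deg, observation error 1 deg,
# ensemble-estimated background error 0.57 deg (assumed to be a std dev).
increment = scalar_kalman_increment(innovation=1.0, sigma_b=0.57, sigma_o=1.0)
print(round(increment, 2))  # 0.25

# For contrast: treating 0.57 as a variance gives a larger increment.
increment_if_variance = 0.57 / (0.57 + 1.0**2) * 1.0
print(round(increment_if_variance, 2))  # 0.36
```

Only the standard-deviation reading reproduces the 0.25° maximum shown in Fig. A1a.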

To provide an estimate of the side effect of sampling error, we illustrate the impact of the simulated temperature observation on wind analysis increments. Note that in principle, the cross correlation of temperature and winds should be zero as a result of our original correlation modeling assumptions. Figure A1b zooms in on the center of the increment area, where the increments of the horizontal wind (vectors) and temperature (contours) are plotted. Figures A2a and A2b present the error cross correlations between temperature and the *u* and *υ* components, respectively, based on 80 ensemble members. The correlation between temperature and wind clearly explains the horizontal wind increments due to the temperature observation. Given the cross-correlation values in Fig. A2 (between −0.3 and 0.3), Fig. A1b shows that perceptible, random divergent circulations can be induced by our algorithm as a result of the finite sample size. We speculate that this phenomenon can potentially trigger fictitious deep convection when the ensemble is too small and/or the acceptance of innovations is too permissive (i.e., poor data quality control). The strong nonlinearity of convective-scale flow thus requires special care regarding error sampling issues.
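The size of such spurious correlations can be gauged with a small Monte Carlo sketch. It assumes independent Gaussian samples for two truly uncorrelated variables, which is an idealization of the ensemble, and the function name, trial count, and seed are hypothetical. For uncorrelated Gaussian variables the sample correlation from N members has a standard deviation of 1/sqrt(N − 1), about 0.11 for N = 80.

```python
import numpy as np

def null_sample_correlations(n_members, n_trials=20000, seed=0):
    """Sample correlations between two independent Gaussian variables.

    Each trial draws n_members values of each variable, mimicking an
    ensemble in which the true cross correlation is zero.
    """
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n_trials, n_members))
    b = rng.standard_normal((n_trials, n_members))
    a -= a.mean(axis=1, keepdims=True)  # remove the ensemble mean
    b -= b.mean(axis=1, keepdims=True)
    return (a * b).sum(axis=1) / np.sqrt((a**2).sum(axis=1) * (b**2).sum(axis=1))

corr = null_sample_correlations(n_members=80)
spread = corr.std()                      # empirical spread of spurious correlations
expected = 1.0 / np.sqrt(80 - 1)         # theoretical value for zero true correlation
frac_large = np.mean(np.abs(corr) > 0.3) # how often |r| exceeds 0.3 by chance

print(f"empirical std = {spread:.3f}, theory = {expected:.3f}")
print(f"fraction with |r| > 0.3 = {frac_large:.4f}")
```

Under these assumptions a value of ±0.3 lies between two and three standard deviations from zero, so it is uncommon at any single grid point but can be expected somewhere in a field with many grid points. Neighboring grid points are not independent, so this gives only an indicative picture of the extremes seen in Fig. A2.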

### b. Transition to situation-dependent covariances

In this experiment, we examine a single-observation result based on the ensemble of forecasts in the HREnKF. The prescribed innovation and observation error are the same as before, and the location is again at the center of the analysis domain (*x* = 150, *y* = 150).

Figure A3 shows increments of the temperature and wind after the model was integrated in time with the HREnKF. The temperature increment clearly exhibits the situation-dependent structure of the forecast error built up from ensemble members. In the vertical plane, the temperature increment (Figs. A4a,b) exhibits a tilted structure, which may play a role at mesoscales and convective scales.

To examine how the wind changes in response to the single temperature observation, the error cross correlation between *u* (*υ*) and temperature *T* is plotted in Fig. A5a (Fig. A5b). The error cross correlations between winds and temperature are stronger than those in Fig. A2, and the background error structure is a result of the dynamical and physical processes inherent in the model forecast.
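The link between a temperature–wind cross correlation and the resulting wind increment can be sketched with an ensemble-based update of an unobserved variable. The ensemble values below are synthetic and purely illustrative, not taken from the HREnKF; the function name is hypothetical, and localization is ignored (weight of 1), which corresponds to evaluating the wind at the observation point.

```python
import numpy as np

def cross_variable_increment(t_ens, u_ens, innovation, sigma_o):
    """Increment to wind component u from a single temperature observation.

    t_ens, u_ens : ensemble values of T at the observation point and of u
                   at the grid point being updated
    innovation   : temperature observation minus ensemble-mean background
    sigma_o      : observation error standard deviation
    """
    n = t_ens.size
    t_pert = t_ens - t_ens.mean()
    u_pert = u_ens - u_ens.mean()
    var_t = (t_pert**2).sum() / (n - 1)          # ensemble variance of T
    cov_ut = (u_pert * t_pert).sum() / (n - 1)   # ensemble covariance of u and T
    return cov_ut / (var_t + sigma_o**2) * innovation

# Synthetic 80-member ensemble (illustrative numbers only).
rng = np.random.default_rng(1)
t_ens = rng.normal(0.0, 0.57, 80)
u_ens = 0.8 * t_ens + rng.normal(0.0, 0.5, 80)   # wind partly correlated with T

du = cross_variable_increment(t_ens, u_ens, innovation=1.0, sigma_o=1.0)
rho = np.corrcoef(u_ens, t_ens)[0, 1]
print(f"cross correlation = {rho:.2f}, wind increment = {du:.2f}")
```

The increment equals the cross correlation times the product of the two ensemble standard deviations, divided by the sum of the temperature and observation error variances. Its sign and size therefore follow the cross-correlation maps directly, whether that correlation is physical (as in Fig. A5) or a sampling artifact (as in Fig. A2).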

## REFERENCES

Aksoy, A., D. C. Dowell, and C. Snyder, 2010: A multicase comparative assessment of the ensemble Kalman filter for assimilation of radar observations. Part II: Short-range ensemble forecasts. *Mon. Wea. Rev.*, **138**, 1273–1292.

Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. *Mon. Wea. Rev.*, **129**, 2884–2903.

Bannister, R. N., S. Migliorini, and M. A. G. Dixon, 2011: Ensemble prediction for nowcasting with a convection-permitting model. II: Forecast error statistics. *Tellus*, **63A**, 497–512.

Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. *Mon. Wea. Rev.*, **129**, 420–436.

Bonavita, M., L. Torrisi, and F. Marcucci, 2010: Ensemble data assimilation with the CNMCA regional forecasting system. *Quart. J. Roy. Meteor. Soc.*, **136**, 132–145.

Brousseau, P., L. Berre, F. Bouttier, and G. Desroziers, 2012: Flow-dependent background-error covariances for a convective-scale data assimilation system. *Quart. J. Roy. Meteor. Soc.*, **138**, 310–322.

Caron, J. F., 2013: Mismatching perturbations at the lateral boundaries in limited-area ensemble forecasting: A case study. *Mon. Wea. Rev.*, **141**, 356–374.

Caron, J. F., and L. Fillion, 2010: An examination of the incremental balance in a global ensemble-based 3D-Var data assimilation system. *Mon. Wea. Rev.*, **138**, 3946–3966.

Côté, J., S. Gravel, A. Méthot, A. Patoine, M. Roch, and A. Staniforth, 1998: The operational CMC-MRB Global Environmental Multiscale (GEM) model. Part I: Design considerations and formulation. *Mon. Wea. Rev.*, **126**, 1373–1395.

Daley, R., 1985: The analysis of synoptic scale divergence by a statistical interpolation procedure. *Mon. Wea. Rev.*, **113**, 1066–1079.

Dowell, D. C., F. Q. Zhang, L. J. Wicker, C. Snyder, and N. A. Crook, 2004: Wind and temperature retrievals in the 17 May 1981 Arcadia, Oklahoma, supercell: Ensemble Kalman filter experiments. *Mon. Wea. Rev.*, **132**, 1982–2005.

Dowell, D. C., L. J. Wicker, and C. Snyder, 2011: Ensemble Kalman filter assimilation of radar observations of the 8 May 2003 Oklahoma City supercell: Influences of reflectivity observations on storm-scale analyses. *Mon. Wea. Rev.*, **139**, 272–294.

Fillion, L., and Coauthors, 2010: The Canadian Regional Data Assimilation and Forecasting system. *Wea. Forecasting*, **25**, 1645–1669.

Gustafsson, N., and Coauthors, 1999: Three-dimensional variational data assimilation for a high resolution limited area model (HIRLAM). HIRLAM Tech. Rep. 40, January 1999, 74 pp.

Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. *Mon. Wea. Rev.*, **129**, 123–137.

Houtekamer, P. L., and H. L. Mitchell, 2005: Ensemble Kalman filtering. *Quart. J. Roy. Meteor. Soc.*, **131**, 3269–3289.

Houtekamer, P. L., H. L. Mitchell, and X. X. Deng, 2009: Model error representation in an operational ensemble Kalman filter. *Mon. Wea. Rev.*, **137**, 2126–2143.

Kain, J. S., and J. M. Fritsch, 1990: A one-dimensional entraining/detraining plume model and its application in convective parameterization. *J. Atmos. Sci.*, **47**, 2784–2802.

Mailhot, J., and Coauthors, 1998: Scientific description of RPN physics library, version 3.6. Recherche en prévision numérique, 197 pp. [Available from RPN, 2121 Trans-Canada Highway, Dorval, PQ H9P 1J3, Canada.]

Milbrandt, J. A., and M. K. Yau, 2005: A multimoment bulk microphysics parameterization. Part I: Analysis of the role of the spectral shape parameter. *J. Atmos. Sci.*, **62**, 3051–3064.

Montmerle, T., and L. Berre, 2010: Diagnosis and formulation of heterogeneous background-error covariances at the mesoscale. *Quart. J. Roy. Meteor. Soc.*, **136**, 1408–1420.

Noilhan, J., and S. Planton, 1989: A simple parameterization of land surface processes for meteorological models. *Mon. Wea. Rev.*, **117**, 536–549.

Snyder, C., and F. Q. Zhang, 2003: Assimilation of simulated Doppler radar observations with an ensemble Kalman filter. *Mon. Wea. Rev.*, **131**, 1663–1677.

Stensrud, D. J., and J. D. Gao, 2010: Importance of horizontally inhomogeneous environmental initial conditions to ensemble storm-scale radar data assimilation and very short-range forecasts. *Mon. Wea. Rev.*, **138**, 1250–1272.

Stuart, A., and K. Ord, 1986: *Kendall's Advanced Theory of Statistics.* Vol. 1, *Distribution Theory,* 5th ed. Charles Griffin and Company Limited, 604 pp.

Sundqvist, H., 1978: Parameterization scheme for non-convective condensation including prediction of cloud water-content. *Quart. J. Roy. Meteor. Soc.*, **104**, 677–690.

Szunyogh, I., E. J. Kostelich, G. Gyarmati, E. Kalnay, B. R. Hunt, E. Ott, E. Satterfield, and J. A. Yorke, 2008: A local ensemble transform Kalman filter data assimilation system for the NCEP global model. *Tellus*, **60A**, 113–130.

Tong, M. J., and M. Xue, 2005: Ensemble Kalman filter assimilation of Doppler radar data with a compressible nonhydrostatic model: OSS experiments. *Mon. Wea. Rev.*, **133**, 1789–1807.

Torn, R. D., and G. J. Hakim, 2008: Performance characteristics of a pseudo-operational ensemble Kalman filter. *Mon. Wea. Rev.*, **136**, 3947–3963.

Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. *Mon. Wea. Rev.*, **130**, 1913–1924.

Zhang, F. Q., Y. H. Weng, J. A. Sippel, Z. Y. Meng, and C. H. Bishop, 2009: Cloud-resolving hurricane initialization and prediction through assimilation of Doppler radar observations with an ensemble Kalman filter. *Mon. Wea. Rev.*, **137**, 2105–2125.