## 1. Introduction

The ensemble Kalman filter (EnKF; Evensen 1994; Burgers et al. 1998), a Monte Carlo approximation to the Kalman filter (Kalman 1960), has been developed and widely applied in atmospheric applications (e.g., Houtekamer and Mitchell 1998; Whitaker et al. 2008; Houtekamer et al. 2014). The EnKF has many advantages in data assimilation; however, a substantial problem of the EnKF with affordable ensemble sizes is the spurious sample covariances between widely separated observations and grid points, which introduce errors in the analysis. To reduce these spurious covariances and improve filter performance, a localization technique is typically applied that limits the impact of an observation on physically distant state variables, as demonstrated by Houtekamer and Mitchell (2001) and Hamill et al. (2001). Early studies used a boxcar function that has value 1.0 within a cutoff radius and 0.0 otherwise (e.g., Houtekamer and Mitchell 1998), but most later studies (e.g., Houtekamer and Mitchell 2001) have used a compactly supported fifth-order polynomial approximation of a Gaussian function [herein called the Gaspari–Cohn (GC) localization function; Gaspari and Cohn 1999], often for both horizontal and vertical localization.

The length scale of the GC function must be tuned for good filter performance for a given application and ensemble size. This length scale is typically described by the half-width, which is half the distance at which the GC localization function goes to zero. However, tuning even this single parameter can be computationally expensive for atmospheric applications. Considering first the horizontal localization, many past studies used half-widths of several hundred kilometers for both global and regional atmospheric models (e.g., Raeder et al. 2012; Romine et al. 2013), where conventional observations, like temperature and wind from radiosondes, are assimilated. Meanwhile, for convective-scale applications (e.g., Snyder and Zhang 2003; Dowell et al. 2004), where often only radar observations are assimilated, a half-width of only a few kilometers is commonly used (e.g., Tong and Xue 2005; Aksoy et al. 2009). Some more recent studies have sought to treat the assimilation of radar and conventional observations separately by varying GC half-widths for different types or groups of observations (e.g., Zhang et al. 2009; Otkin 2012; Sobash and Stensrud 2013; Lange and Craig 2014). However, to date, no theoretical studies have explored whether there is a need for different localization functions for different observation types.

Appropriate flow-dependent covariance structures should vary in time (e.g., Buehner et al. 2010) and space (e.g., Houtekamer and Mitchell 1998); thus, adaptive localization approaches that permit this variation could provide better analyses than static localization. The hierarchical ensemble filter proposed by Anderson (2007) detects and corrects sampling error, relaxing the need for localization. The ensemble correlations raised to a power (ECO-RAP; Bishop and Hodyss 2009a,b) method dynamically moves the localization function and adapts to the width of the true error correlation function. Also, the sampling error correction developed by Anderson (2012) automatically computes the localization as a function of ensemble size and sample correlation. However, these adaptive localization techniques require additional computations during the assimilation.

Localization in the vertical is also required for good filter performance in atmospheric applications, but there are few theoretical studies of vertical localization. The GC localization function has been adopted for vertical localization in several studies, though there remains a challenge in defining the appropriate half-width. Houtekamer and Mitchell (2005) used a half-width of 1 ln *p* unit for all observations, including radiances. Half-widths of several kilometers were used in studies of assimilating radar observations (e.g., Tong and Xue 2005; Aksoy et al. 2009). Whitaker et al. (2004) chose a vertical localization function that has a value of 1.0 below

This study will use the empirical localization function (ELF; Anderson and Lei 2013) approach to investigate the horizontal and vertical localization functions in regions with and without precipitation for a regional model, the Weather Research and Forecasting (WRF) Model (Skamarock et al. 2008). The ELF takes sampling error and other potential errors into account and can automatically estimate the localization for any possible observation type with a given kind of state variable using the output of an observing system simulation experiment (OSSE). The ELF makes few a priori assumptions about the shape of the localization function and can require less computation than tuning the GC half-width. For example, tuning the GC half-width requires several simulations with different GC half-widths, while the ELF can provide an estimate of the localization function based on a single simulation. Promising results were obtained by applying the ELF in the simple Lorenz-96 model (Lorenz and Emanuel 1998), the dynamical core of the Geophysical Fluid Dynamics Laboratory (GFDL) B-grid climate model (Anderson et al. 2004), and the Community Atmosphere Model, version 5.0 (CAM5; Neale et al. 2012; Anderson and Lei 2013; Lei and Anderson 2014a,b,c).

Lei and Anderson (2014b) found that ELFs that vary with geographic region had advantages over global ELFs. Extending this approach, the hypothesis that all observations, not just radar observations, need unique horizontal localization in precipitating regions is examined in this study. Montmerle (2012) found shorter length scales of horizontal error correlations in precipitating regions owing to diabatic processes. This study will provide theoretical support for the reduced localization length scales typically used in convective-scale data assimilation, such as for radar observations that are available primarily within precipitating regions. Part of the motivation to use reduced localization length scales may be subjective in response to the increased observation density or expectations that sampling error might be larger within precipitation systems. Similar studies that focus on the structure and impact of the specified background error covariances in regions with precipitation were recently explored for variational data assimilation systems (Ménétrier and Montmerle 2011; Michel et al. 2011; Montmerle 2012).

There is also evidence that different localization functions are needed for different observation types (Houtekamer and Mitchell 2005; Tong and Xue 2005; Kang et al. 2011; Anderson and Lei 2013) and even different state variables (Anderson 2007, 2012). ELFs for different observation types are also investigated here.

Section 2 describes the experimental design. The ELF algorithm and the localization functions provided by the ELF for precipitating and nonprecipitating regions, as well as for different observation types, are discussed in section 3. Section 4 presents the fitted ELFs that are used in additional OSSEs, the assimilation results using the fitted ELFs, and a discussion of the imbalances caused by localization. Section 5 presents a general discussion and summary.

## 2. Experimental design

### a. The WRF/DART system

An OSSE is conducted using WRF V3.3.1 (Skamarock et al. 2008) coupled with the Data Assimilation Research Testbed (DART; Anderson and Collins 2007; Anderson et al. 2009). The model domain covers the continental United States (CONUS) and portions of the eastern Pacific, Canada, and the Gulf of Mexico, as shown by the outer domain in Fig. 1. The simulation has 15-km horizontal grid spacing on a 415 × 325 horizontal grid and 40 vertical levels with model top at 50 hPa.

The model configuration includes positive definite moisture advection (Skamarock and Weisman 2009) and the following physical parameterizations: the Thompson microphysics scheme (Thompson et al. 2008), the Mellor–Yamada–Janjić (MYJ) planetary boundary layer scheme (Mellor and Yamada 1982; Janjić 1994, 2002), the Noah land surface model (Ek et al. 2003), and Rapid Radiative Transfer Model for Global Climate Models (RRTMG) longwave and shortwave radiation schemes (Mlawer et al. 1997; Iacono et al. 2008). The cumulus parameterization uses the Tiedtke cumulus scheme (Tiedtke 1989; Zhang et al. 2011) that was released with WRF version 3.3.

The ensemble data assimilation system for this study is the DART toolkit configured as an ensemble adjustment Kalman filter (EAKF; Anderson 2001) and is used to combine synthetic observations with an ensemble of forecasts from WRF to produce an ensemble of analyses. To maintain appropriate ensemble spread, spatially varying and time-varying state space adaptive inflation (Anderson 2009) is applied to the prior state. The adaptive inflation uses 1.0, 0.6, and 0.9 as the initial value, fixed standard deviation, and damping settings, respectively. Also, sampling error correction (Anderson 2012) is used to further improve ensemble spread and overcome sampling errors because of a limited ensemble size. To reduce the influence from spurious correlations, the GC localization with a half-width of 0.1 rad (1 rad = 6371 km) is used as the default covariance localization, where the vertical distance is converted to equivalent radians by normalizing the height difference. A height difference of 80 km is equivalent to 1 rad. Thus, the horizontal (vertical) localization decreases to zero at 1274 (16) km away from the observation location. The empirical localization functions will be discussed in sections 3 and 4.
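The cutoff distances quoted above follow from the conversion factors in the text, since a GC function reaches zero at twice its half-width. A minimal arithmetic sketch (the function and constant names are illustrative and are not part of DART):

```python
EARTH_RADIUS_KM = 6371.0      # horizontal distance equivalent to 1 rad, as stated in the text
VERTICAL_KM_PER_RAD = 80.0    # height difference treated as 1 rad, as stated in the text


def gc_cutoff_km(half_width_rad, km_per_rad):
    """Distance (km) at which a GC function with this half-width reaches zero."""
    # A GC function goes to zero at twice its half-width.
    return 2.0 * half_width_rad * km_per_rad


horizontal_cutoff = gc_cutoff_km(0.1, EARTH_RADIUS_KM)     # about 1274 km
vertical_cutoff = gc_cutoff_km(0.1, VERTICAL_KM_PER_RAD)   # about 16 km
```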

The GC half-width of 0.1 rad is adapted from Schwartz et al. (2014), along with results from previous formal (e.g., Romine et al. 2013) and informal experiments that involved evaluating observation-space diagnostics from cycling experiments over periods of several weeks to identify optimal localization settings. Based on these results, 0.1 rad can be viewed as a nearly optimal GC half-width.

### b. Experimental design of the OSSE

The initial conditions (IC) and lateral boundary conditions (LBC) are generated from the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS) analysis. The ICs for the true state and ensemble state are generated by adding perturbations that sample the NCEP background error covariance using the WRF three-dimensional variational data assimilation system (WRFDA-3DVAR; Barker et al. 2012) to the IC from the GFS analysis (Torn and Hakim 2008). The LBCs for the true state and ensemble state are generated using the fixed covariance perturbation technique (Torn et al. 2006). Thus, similar to the ensembles, the truth has perturbations added to the IC and LBCs. It is drawn from the same distribution as the ensemble members. In this study, the ensemble size *N* is 50.

The evolving true state is obtained by advancing WRF from the true IC at 0000 UTC 24 May 2012 to 1800 UTC 30 May 2012. Synthetic observations of temperature and *u* and *υ* winds are generated by adding random draws from a normal distribution with mean 0 and specified observation error variances to spatially interpolated values from the gridded true state. The observation error variances are 1 K^{2} for temperature and 4 m^{2} s^{−2} for *u* and *υ* winds. The synthetic observations are uniformly distributed in the horizontal and range from 1000 to 100 hPa in the vertical on standard radiosonde mandatory pressure levels (1000, 850, 700, 500, 400, 300, 200, 150, and 100 hPa). There are 95 observation profiles in the domain, as shown in Fig. 1. The synthetic observations are available every 6 h.
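The generation of synthetic observations can be sketched as follows. The snippet assumes the true state has already been interpolated to the observation locations; the function name, dictionary keys, and sample values are hypothetical, and only the error variances come from the text.

```python
import random

# Observation error variances quoted in the text (K^2 and m^2 s^-2).
OBS_ERROR_VARIANCE = {"temperature": 1.0, "u_wind": 4.0, "v_wind": 4.0}


def make_synthetic_obs(true_values, obs_type, rng):
    """Add zero-mean Gaussian noise to truth values at the observation locations."""
    sigma = OBS_ERROR_VARIANCE[obs_type] ** 0.5   # standard deviation from variance
    return [value + rng.gauss(0.0, sigma) for value in true_values]


rng = random.Random(0)              # fixed seed so the sketch is reproducible
truth_at_obs = [288.0] * 20000      # hypothetical interpolated true temperatures (K)
obs = make_synthetic_obs(truth_at_obs, "temperature", rng)

# The sample variance of the added errors should be close to the specified 1 K^2.
errors = [o - t for o, t in zip(obs, truth_at_obs)]
sample_variance = sum(e * e for e in errors) / len(errors)
```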

An OSSE with the default GC localization and a half-width of 0.1 rad (GC0.1; Table 1) is produced first from 0000 UTC 24 May to 1800 UTC 30 May 2012. Starting from 0600 UTC 24 May, synthetic observations are assimilated every 6 h. When the localization functions for precipitating and nonprecipitating regions are analyzed in section 3, the first 2 days are discarded to eliminate transient effects, and the remaining 5 days are used. The data used for computing the ELF are within the verification region (Fig. 1) that is chosen to avoid areas with complex terrain. Data from areas with complex terrain lead to noisy ELFs where model levels are no longer quasi horizontal. Moreover, the terrain increases the complexity in subsetting the domain for computation of the ELF. For instance, an observation and state variable pair with or without terrain between them should not lie in the same subset.

Table 1. Summary of the experiments with different localization functions.

Leveraging the ELFs derived in section 3, in section 4, ELFs are smoothed and then used in subsequent assimilation experiments during the same period as GC0.1 (see Table 1). As in the approach used for deriving ELFs, the results from the first 2 days in OSSEs are discarded to eliminate transient effects, and the remaining data in the same region and period as that used to compute the ELFs are used for verification. Since the true state is known in an OSSE, the verification is based on the state variables at every model grid point within the verification region. The time series of averaged root-mean-square error (RMSE) of the ensemble mean from the truth and ensemble spread for state variables of temperature and *u* and *υ* winds are used for evaluation. The vertical profiles of temporally and horizontally averaged RMSE and ensemble spread are also examined. The temporally and horizontally averaged inflation for state variables is also presented. Results are shown for the RMSE and ensemble spread of the prior, but qualitatively similar results are obtained for the posterior.
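The two verification metrics can be sketched as below. The RMSE follows the description above (ensemble mean against the truth). The spread definition, the square root of the grid-point-averaged ensemble variance with an N − 1 denominator, is a common convention assumed here rather than one stated in the text. All names and sample values are illustrative.

```python
import math


def ensemble_mean(members):
    """Mean across members at each grid point; members is a list of equal-length lists."""
    n = len(members)
    return [sum(column) / n for column in zip(*members)]


def rmse_of_mean(members, truth):
    """RMSE of the ensemble mean with respect to the known true state."""
    mean = ensemble_mean(members)
    return math.sqrt(sum((m - t) ** 2 for m, t in zip(mean, truth)) / len(truth))


def spread(members):
    """Square root of the grid-point-averaged ensemble variance (assumed definition)."""
    n = len(members)
    mean = ensemble_mean(members)
    total = 0.0
    for j, m in enumerate(mean):
        total += sum((member[j] - m) ** 2 for member in members) / (n - 1)
    return math.sqrt(total / len(mean))


# Tiny hypothetical example: three members, two grid points.
members = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
truth = [2.0, 5.0]
```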

## 3. Localization scales for precipitating and nonprecipitating regions

The ELF approach (Anderson and Lei 2013) uses the output of an ensemble OSSE and minimizes the RMS difference between the true value and the posterior ensemble mean (a more detailed description is given in the next few paragraphs). The approach was demonstrated to produce appropriate localization functions in a simple atmospheric general circulation model (Lei and Anderson 2014a) and a more realistic atmospheric general circulation model (Lei and Anderson 2014b). Thus, the ELF method is used here to analyze the localization scale for precipitating and nonprecipitating regions in a regional model. A horizontal location at which the 6-h accumulated precipitation (cumulus and grid scale precipitation) is larger than 0.25 mm is defined as a location with precipitation. The procedures to compute the ELF are briefly described next.

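Let **Y** be the set of observations to be used in a subsequent assimilation (OSSE here) and **X** be the set of model state variables to be modified by assimilating **Y**. Let $\mathbf{X} = \{x_1, x_2, \ldots, x_L\}$, where *L* is the number of state variables; and let $\mathbf{Y} = \{y^o_1, y^o_2, \ldots, y^o_M\}$, where *M* is the number of observations, and the superscript *o* denotes the observation. All pairs (*y*, *x*) are divided into subsets. For a given subset, the ELF is the localization value $\alpha$ that minimizes the RMS difference between the posterior ensemble mean and the true value of the state variable,

$$
\sqrt{\frac{1}{K}\sum_{k=1}^{K}\left(\bar{x}^{p}_{k} + \alpha\,\Delta\bar{x}_{k} - x^{t}_{k}\right)^{2}},
$$

where *k* indexes each (*y*, *x*) pair of *K* total pairs in this subset, superscripts *t* and *p* denote the true value and prior, an overbar denotes the ensemble mean, and $\Delta\bar{x}_{k}$ is the increment to the ensemble mean of the state variable obtained by regressing the observation increment onto the state variable without localization.

After the localization value $\alpha$ is computed for a subset, a *z* test is used to assess the significance of the localization (Lei and Anderson 2014a). Given the localization value and its standard error, the *z* test is applied with a null hypothesis that the localization $\alpha$ is zero. If the *z* value is outside the critical region for 95% confidence, the null hypothesis cannot be rejected, and the value of the localization for the subset is set to 0.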
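Minimizing the RMS difference between the posterior ensemble mean and the truth over one subset is a one-parameter least-squares problem. The sketch below assumes the posterior mean of each pair is the prior mean plus a localization value `alpha` times the unlocalized regression increment, so that `alpha` has a closed form. The function names and sample values are hypothetical, and the *z* test is not included.

```python
def empirical_localization(prior_means, increments, truths):
    """Least-squares alpha minimizing sum((prior + alpha * increment - truth) ** 2)."""
    numerator = sum((t - p) * d for p, d, t in zip(prior_means, increments, truths))
    denominator = sum(d * d for d in increments)
    if denominator == 0.0:
        return 0.0   # no increments in this subset, so nothing to localize
    return numerator / denominator


def rms_error(alpha, prior_means, increments, truths):
    """RMS difference between the localized posterior mean and the truth."""
    k = len(truths)
    squared = sum((p + alpha * d - t) ** 2
                  for p, d, t in zip(prior_means, increments, truths))
    return (squared / k) ** 0.5


# Hypothetical subset in which exactly half of each increment is useful.
prior_means = [0.0, 0.0, 0.0, 0.0]
increments = [1.0, 2.0, -1.0, 2.0]
truths = [0.5, 1.0, -0.5, 1.0]
alpha = empirical_localization(prior_means, increments, truths)
```

Setting the derivative of the summed squared error with respect to `alpha` to zero gives the ratio computed in `empirical_localization`.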

As in Lei and Anderson (2014b), the localization is approximated by the product of the localization for the horizontal separation and the localization for the vertical separation. Also, the ELF can be computed for every observation type with the same state variable kind. Thus, the ELFs are computed for temperature and *u* and *υ* winds in both the horizontal and vertical for precipitating and nonprecipitating regions separately.

To increase the sample size *K* and produce a smoother ELF, the gridded state values of temperature and *u* and *υ* winds on nine selected model levels **Q** = {1, 9, 13, 18, 20, 24, 28, 31, 35} that are closest to each of the nine mandatory radiosonde levels are used for the true observation values in Eq. (2), since the true gridded state values are known from an OSSE. In this way, the sample size of **Y**, given a vertical level at an analysis time of one state variable for both precipitating and nonprecipitating regions, increases from 95 to 29 241, where 95 is number of observation profiles, and 29 241 is the number of horizontal grid points in the verification region. Please note there are only 95 observation profiles that are assimilated each cycle for GC0.1.

ELFs as a function of horizontal separation are constructed from the output of GC0.1. To compute the ELF of temperature given a model level *q* for precipitating regions, the subset STP(*s*, *q*) contains all pairs (*y*, *x*) where *y* is a temperature state variable at level *q* given any analysis time *t* with precipitation and *x* is a temperature state variable at time *t* on the same model level as *y* also with precipitation, and the separation of *x* from *y* is between (*s* − 1) × 0.002 and *s* × 0.002 rad, where *s* = 1, …, *S* and *S* is the total number of subsets. A similar procedure is used to construct the subset STN(*s*, *q*) for the ELF of temperature given a model level *q* for nonprecipitating regions, except that *y* and *x* are without precipitation. Subsets for the ELFs of *u* and *υ* winds with and without precipitation [SUP(*s*, *q*), SUN(*s*, *q*), SVP(*s*, *q*), and SVN(*s*, *q*)] are constructed in the same way as for temperature.
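The subsetting of pairs by horizontal separation and precipitation can be sketched as follows. The bin width and the 0.25-mm threshold come from the text; the treatment of bin edges (half-open intervals) and all function names are assumptions made for illustration.

```python
import math

BIN_WIDTH_RAD = 0.002          # subset width quoted in the text
PRECIP_THRESHOLD_MM = 0.25     # 6-h accumulated precipitation threshold from section 3


def subset_index(separation_rad):
    """Return s such that (s - 1) * 0.002 <= separation < s * 0.002, with s starting at 1."""
    return int(math.floor(separation_rad / BIN_WIDTH_RAD)) + 1


def is_precipitating(accumulated_precip_mm):
    """True where the 6-h accumulated precipitation exceeds the threshold."""
    return accumulated_precip_mm > PRECIP_THRESHOLD_MM


def subset_key(separation_rad, precip_y_mm, precip_x_mm):
    """Label a same-level (y, x) pair as ('P', s) or ('N', s); None for mixed pairs."""
    y_wet = is_precipitating(precip_y_mm)
    x_wet = is_precipitating(precip_x_mm)
    if y_wet != x_wet:
        return None   # mixed pairs belong to neither subset family
    return ("P" if y_wet else "N", subset_index(separation_rad))
```

For instance, `subset_key(0.003, 1.0, 0.5)` places a pair with both points precipitating in the second separation bin.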

The horizontal ELFs of temperature for precipitating and nonprecipitating regions averaged over the nine selected model levels are shown in Fig. 2a. The circles denote sample size for each subset for precipitating and nonprecipitating regions. Since most state locations do not have precipitation, the sample size for nonprecipitating regions is much larger than that for precipitating regions. The ELF for nonprecipitating regions (ELFNP) has similar shape to GC0.1 except that ELFNP has a slightly broader tail and is a bit noisy owing to the smaller sample size. The ELF for precipitating regions (ELFPP) is narrower than GC0.1 and ELFNP, which suggests a smaller localization scale is appropriate for temperature observations for precipitating regions than for nonprecipitating regions. Dashed lines show the correlation between temperature observations and temperature variables as a function of separation. The correlation of temperature for precipitating regions diminishes faster than that for nonprecipitating regions, and this is consistent with the ELFs.

The horizontal ELFs of *u* and *υ* winds for precipitating and nonprecipitating regions averaged over the nine selected model levels are shown in Figs. 2b and 2c. Similar to temperature (Fig. 2a), the sample sizes for nonprecipitating regions are much larger than those for precipitating regions. Consistent with correlations of temperature, *u*- and *υ*-wind correlations also diminish more rapidly in precipitating than nonprecipitating regions. The ELFPPs of *u* and *υ* winds are narrower than GC0.1, and they are similar to the ELFPP of temperature. The ELFNPs of *u* and *υ* winds are qualitatively similar to the ELFNP of temperature; they are similar to GC0.1 at small separations but have broader tails than GC0.1. Thus for observations of temperature and *u* and *υ* winds, precipitating regions have smaller correlation and localization scales than nonprecipitating regions in the horizontal, which is consistent with findings by Michel et al. (2011).

The horizontal ELFs shown by Fig. 2 are averaged over the nine selected model levels. There are variations of the horizontal ELFs with height (not shown). The localization scale for *u* and *υ* winds in regions without precipitation gradually increases with height. But this feature is not obvious for *u* and *υ* winds in regions with precipitation and temperature in regions with and without precipitation.

The horizontal ELFs of *y* in precipitating regions and *x* in nonprecipitating regions (not shown) are very similar to those with both *y* and *x* in precipitating regions (blue solid lines in Fig. 2), except that the former has a slightly broader tail than the latter when separation is larger than 0.05 rad. Similarly, the horizontal ELFs of *y* in nonprecipitating regions and *x* in precipitating regions (not shown) are very similar to those with both *y* and *x* in nonprecipitating regions (red solid lines in Fig. 2), except that the former has a slightly narrower tail than the latter when separation is larger than 0.05 rad. Therefore, the horizontal ELFs of *y* in precipitating (nonprecipitating) regions and *x* in either precipitating or nonprecipitating regions can be approximated by those with both *y* and *x* in precipitating (nonprecipitating) regions.

ELFs as a function of vertical separation are also constructed from the output of GC0.1 for precipitating and nonprecipitating regions separately. To compute the ELF of temperature for precipitating regions, the subset STP(*s*) contains all pairs (*y*, *x*) where *y* is a temperature state variable at any analysis time *t* with precipitation, and *x* is a temperature state variable at time *t* in the same vertical column, and the separation of *x* from *y* is between (*s* − 1) and *s* km. Unlike the subsets for horizontal ELFs that contain *y* given a vertical level *q*, the subsets for vertical ELFs use *y* in all nine selected model levels, because the sample size *K* for vertical ELFs is much smaller than for horizontal ELFs. A similar procedure is used to construct the subset STN(*s*) for the ELF of temperature for nonprecipitating regions, except that *y* and *x* are in columns without precipitation. Subsets for the ELFs of *u* and *υ* winds with and without precipitation [SUP(*s*), SUN(*s*), SVP(*s*), and SVN(*s*)] are done the same way as for temperature.

Figure 3 shows the vertical ELFs of temperature and *u* and *υ* winds for precipitating and nonprecipitating regions. The ELFPP of temperature is similar to GC0.1 for small separations (<4 km), but it is much broader than GC0.1 at larger separations. The ELFNP of temperature is smaller than GC0.1 for small separations (<4 km), and it is also much broader than GC0.1 at larger separations. The ELFPP generally has larger localization values than the ELFNP. The correlation of temperature observations with temperature variables for both precipitating and nonprecipitating regions quickly decreases with separation for separations less than 2 km and then gradually diminishes to zero for larger separations, which is consistent with the broad ELFs that decrease to zero at large separations. The correlation for precipitating regions is generally larger than that for nonprecipitating regions, similar to the comparison between ELFPP and ELFNP.

For the vertical ELFs of *u* and *υ* winds (Figs. 3b,c), the ELFPPs are slightly larger than the ELFNPs, and the ELFPPs and ELFNPs are smaller than GC0.1 when separations are smaller than 4 km. When separations are larger than 4 km, the ELFPPs and ELFNPs are broader than GC0.1, and ELFPPs are broader than ELFNPs. Similar to the correlations of temperature, the correlations of *u* and *υ* winds quickly decrease from 0 to 2 km and then decrease more gradually to 0 from 2 to 18 km. Consistent with the localization functions, the correlations for precipitating regions are slightly larger than those for nonprecipitating regions, especially between 4 and 14 km. Therefore, precipitating regions have larger correlation and localization scales than nonprecipitating regions in the vertical. The enhanced vertical motion and characteristics of a vertical column with an activated cumulus parameterization scheme in precipitating regions are possible reasons for the broader vertical correlation and localization.

The ELFNPs of *u* and *υ* winds are smaller than the ELFNP of temperature between 4- and 14-km separation. The ELFPPs of *u* and *υ* winds are smaller than the ELFPP of temperature with small separations (<4 km), and they are larger than the ELFPP of temperature between 4 and 12 km. Thus, the structures of ELFPPs and ELFNPs demonstrate that optimal vertical localizations differ by observation type, as well as for regions with and without precipitation.

The ELF vertical localizations for both precipitating and nonprecipitating regions are broader than GC0.1, which is consistent with the vertical localization results found with CAM (Lei and Anderson 2014b). Also, the vertical localization functional form is not very Gaussian, similar to results found in Hacker et al. (2007), where the vertical localization of near-surface observations was examined using the hierarchical filter (Anderson 2007).

## 4. Application of the ELF in a subsequent OSSE

The ELFs discussed in section 3 are now applied in OSSEs and are compared against GC0.1. Since the ELFs have noisy tails (Figs. 2 and 3), the GC function and a cubic spline function (Hastie and Tibshirani 1990) are fit to the ELFs to produce smooth localization functions (Lei and Anderson 2014a,b,c). To investigate the impact of ELF in detail, three sets of fitted smooth localization functions (ELFFs) are applied in additional OSSEs, which are summarized in Table 1. The first ELFF experiment (ELFOneA) employs one horizontal and one vertical ELFF. The second OSSE (ELFOnePN) uses two horizontal and two vertical ELFFs, with separate ELFFs for precipitating and nonprecipitating regions. The third OSSE (ELFObsPN) is like ELFOnePN but uses separate horizontal and vertical ELFFs for temperature and *u* and *υ* winds, while also varying by precipitating and nonprecipitating regions. The three OSSE experiments extend from 0000 UTC 24 May to 1800 UTC 30 May 2012. The construction of ELFFs is described in the next section.

### a. ELFF construction

The output of GC0.1 is used to construct the ELF for the OSSE ELFOneA in the same way as the ELF described in section 3, except that precipitating and nonprecipitating regions are combined. The horizontal ELFs are shown by the black dots in Fig. 4a, which are approximately Gaussian, so a GC localization function is fit to the horizontal ELF. The half-width of the GC function is determined by minimizing the RMS difference between the GC function and the ELFs. The blue line in Fig. 4a presents the horizontal ELFF, which has the same half-width as in the assimilation experiment GC0.1.
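Fitting a GC function to an ELF by minimizing the RMS difference can be sketched as below. The piecewise fifth-order form is the standard compactly supported function of Gaspari and Cohn (1999), written in terms of the half-width. The simple search over candidate half-widths and the synthetic "ELF" values are assumptions for illustration, not the procedure used in the study.

```python
import math


def gaspari_cohn(distance, half_width):
    """Compactly supported fifth-order GC function; zero beyond twice the half-width."""
    r = abs(distance) / half_width
    if r <= 1.0:
        return -0.25 * r**5 + 0.5 * r**4 + 0.625 * r**3 - (5.0 / 3.0) * r**2 + 1.0
    if r <= 2.0:
        return (r**5 / 12.0 - 0.5 * r**4 + 0.625 * r**3 + (5.0 / 3.0) * r**2
                - 5.0 * r + 4.0 - 2.0 / (3.0 * r))
    return 0.0


def fit_half_width(separations, elf_values, candidates):
    """Pick the candidate half-width with the smallest RMS difference from the ELF."""
    def rms(c):
        squared = sum((gaspari_cohn(d, c) - v) ** 2
                      for d, v in zip(separations, elf_values))
        return math.sqrt(squared / len(elf_values))
    return min(candidates, key=rms)


# Synthetic check: values sampled from a GC function with half-width 0.1 rad.
separations = [0.002 * s for s in range(1, 101)]
synthetic_elf = [gaspari_cohn(d, 0.1) for d in separations]
candidates = [0.01 * i for i in range(5, 21)]    # 0.05 to 0.20 rad
best = fit_half_width(separations, synthetic_elf, candidates)
```

A continuous optimizer could replace the candidate search; the grid keeps the sketch dependency-free.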

*c* is the GC half-width,

Table 2. Parameters defined in Eq. (5) for the vertical ELFF.

Using the ELFs varying with precipitating and nonprecipitating regions, the second set of ELFFs is constructed similarly to the first set, which is shown by Fig. 5, and the parameters of which are displayed in Table 2. The horizontal ELFF for nonprecipitating regions (ELFFNP) has a slightly larger half-width than GC0.1, while the horizontal ELFF for precipitating regions (ELFFPP) has a smaller half-width than GC0.1. The vertical ELFPP is close to GC0.1 with small separations (<4 km) and extends with nearly constant values for vertical separation between 4 and 9 km. The vertical ELFNP has smaller localization values than ELFPP and gradually diminishes to zero with increasing separations.

The third set of ELFFs is constructed in the same way as the second one, except also differentiating by observation type. Figure 6 shows the third set of ELFFs, and Table 2 presents the corresponding parameters. The horizontal temperature ELFFNP has a half-width of 0.1 rad, matching GC0.1, while the wind ELFFNP has a slightly larger half-width than GC0.1. The horizontal temperature ELFFPP has a slightly larger half-width than the wind ELFFPP, and both of them have smaller half-widths than GC0.1. The vertical wind ELFFPP is the same as that of temperature when separation is smaller than 3 km, but it has larger localization values than that for temperature when separation is larger than 5 km. The wind ELFFNP has slightly larger values than that of temperature with small separations (<3 km), but it is narrower than that of temperature when separation is larger than 3 km.

For the second and third sets of ELFFs, ELFFPP is used for *y* in precipitating regions, and ELFFNP is used for *y* in nonprecipitating regions for all *x* (as discussed in section 3). The ELF can be computed for any type of observation with any kind of state variable (Lei and Anderson 2014a). However, the ELFs and ELFFs here are for observations that are the same kind as the state variables, because it is impractical to compute an ELF for each observation type with each state variable kind for the WRF Model, and the use of unique ELFs by state and observation type may exacerbate the imbalance caused by data assimilation.

### b. OSSE ELFF results

RMSE and ensemble spread are computed using data from the same period and region as that used to compute the ELFs. Figure 7 presents the time series of RMSE and ensemble spread for temperature and *u* and *υ* winds averaged in the verification region for GC0.1 and the three ELFF experiments. RMSE is calculated based on the prior ensemble mean and the truth. ELFOneA has a smaller RMSE of temperature and *u* and *υ* winds than GC0.1. Recall the horizontal localization was unchanged from the control; thus, the vertical localization provided by the ELFF has advantages over the standard GC function. ELFOnePN decreases the RMSE of temperature and *u* and *υ* winds more than ELFOneA. Thus, the advantages of having localizations varying between precipitating and nonprecipitating regions are demonstrated. When the ELFFs are further discriminated by observation types (ELFObsPN), the RMSEs of temperature and *u* wind are very similar to those of ELFOnePN, while the *υ*-wind RMSE is slightly larger than ELFOnePN. As such, applying different localizations for observation types of temperature and wind does not further decrease the RMSE.

The vertical profiles of RMSE and spread averaged in the verification region for temperature and *u* and *υ* winds are shown in Fig. 8. ELFOneA, ELFOnePN, and ELFObsPN have smaller temperature RMSEs than GC0.1 for nearly every model level, especially between the surface and 18 km. Similar structures are obtained for *u* and *υ* winds. Results are much the same when the RMSEs are computed over regions with and without precipitation separately.

A statistical significance test using bootstrap resampling with replacement (Efron and Tibshirani 1993) is applied to the RMSE from the different experiments. The null hypothesis is that the RMSE difference between two experiments is zero, and it is tested at the 95% confidence level. The RMSE differences between the three ELFF experiments and GC0.1 are found to be statistically significant. The same is true for ELFOnePN and ELFObsPN compared to ELFOneA. However, the RMSE differences between ELFOnePN and ELFObsPN are not statistically significant.
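A bootstrap test of this kind can be sketched as follows. The text does not say which quantities are resampled, so the sketch assumes paired per-cycle RMSE differences between two experiments and a percentile confidence interval for their mean. The function names and sample values are hypothetical.

```python
import random


def bootstrap_mean_ci(differences, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile confidence interval for the mean of paired RMSE differences."""
    rng = random.Random(seed)
    n = len(differences)
    means = []
    for _ in range(n_resamples):
        # Resample the differences with replacement and record the resampled mean.
        sample = [differences[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    tail = (1.0 - confidence) / 2.0
    lower = means[int(tail * n_resamples)]
    upper = means[int((1.0 - tail) * n_resamples) - 1]
    return lower, upper


def is_significant(differences, **kwargs):
    """Reject the zero-difference null when the interval excludes zero."""
    lower, upper = bootstrap_mean_ci(differences, **kwargs)
    return lower > 0.0 or upper < 0.0


# Hypothetical per-cycle RMSE differences (experiment A minus experiment B).
consistent_gain = [0.05, 0.08, 0.06, 0.07, 0.04, 0.09, 0.05, 0.06, 0.07, 0.08] * 2
no_clear_gain = [-0.1, 0.1] * 10
```

Here `is_significant(consistent_gain)` is true because every resampled mean is positive, while the symmetric `no_clear_gain` sample gives an interval that straddles zero.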

As shown in Fig. 7, the use of spatially homogeneous ELFFs (ELFOneA) reduces the spread relative to GC0.1, and similar spread reduction is obtained when using separate ELFFs for precipitating and nonprecipitating regions (ELFOnePN). When ELFFs vary by precipitating and nonprecipitating regions and also observation types, the spread is slightly smaller than GC0.1 but larger than the other two ELFF experiments. Similarly, the vertical profiles (Fig. 8) show the spread reduction with ELFFs is fairly consistent with the time series for all state variables.

The vertical profiles of inflation for temperature are shown in Fig. 9. The inflations of *u* and *υ* winds (not shown) are similar to that of temperature. ELFOneA has larger inflation than GC0.1. ELFOneA has the same GC half-width as GC0.1 in the horizontal but broader localization than GC0.1 in the vertical. The response of the adaptive inflation was to increase the inflation values as more state points in the vertical were impacted by observations. While ELFOnePN and ELFObsPN have a similar GC half-width to the control for nonprecipitating regions, smaller half-widths than GC0.1 are applied for precipitating regions, leading, on average, to a reduced number of state points impacted by observations. Yet ELFOnePN and ELFObsPN have similar inflation to GC0.1 and less than ELFOneA. In the vertical, ELFOnePN and ELFObsPN generally have broader localization than ELFOneA for precipitating regions and narrower localization than ELFOneA for nonprecipitating regions. Thus, on average, ELFOnePN and ELFObsPN have smaller localization than ELFOneA and require less inflation.

While the localization limits the sampling error, it may also introduce imbalance into the simulation (Kepert 2009). The imbalance is diagnosed by the domain-averaged surface pressure tendency. The surface pressure tendency is also averaged over each assimilation cycle. The average surface pressure tendency during the first 2-h forecast after assimilation is shown in Fig. 10. The truth is generated from the truth forecast, stopping the model every 6 h, but without data assimilation. The surface pressure tendencies of the assimilation experiments right after assimilation are larger than for the true state. Although the surface pressure tendencies quickly decrease with time, the assimilation experiments still have larger surface pressure tendencies than the true state throughout the forecast period. Thus, data assimilation with localization causes imbalance. The three sets of ELFFs have similar surface pressure tendencies, and they have slightly smaller surface pressure tendencies than GC0.1 immediately after assimilation and throughout the forecast period. Since ELFOneA only varies in the vertical localization, the vertical localization function produced by ELFF likely leads to the improved balance over the GC localization.
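As a rough illustration of this diagnostic, the sketch below computes a domain-averaged surface pressure tendency from a sequence of surface pressure fields. It assumes that the absolute value of the tendency is averaged over the domain, which is a common choice for imbalance diagnostics but is not stated in the text. The function name `mean_abs_psfc_tendency` and the array layout are hypothetical.

```python
import numpy as np


def mean_abs_psfc_tendency(psfc, dt_seconds):
    """Domain-averaged absolute surface pressure tendency.

    psfc       : array of shape (time, y, x), surface pressure in Pa
                 (assumed layout, for illustration only)
    dt_seconds : spacing between consecutive output times in seconds
    Returns an array of length time - 1 in Pa s^-1.
    """
    psfc = np.asarray(psfc, dtype=float)
    if psfc.ndim != 3 or psfc.shape[0] < 2:
        raise ValueError("psfc must have shape (time, y, x) with at least two times")

    # Finite-difference tendency between consecutive output times.
    tendency = np.diff(psfc, axis=0) / dt_seconds

    # Assumption: average the magnitude so that positive and negative
    # tendencies do not cancel across the domain.
    return np.abs(tendency).mean(axis=(1, 2))
```

Averaging the result further over ensemble members and assimilation cycles would give a single curve per experiment as a function of forecast lead time.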

## 5. Discussion and summary

This study considers the impact of using more complex localization functions to improve ensemble Kalman filter (EnKF)-based analysis system performance over use of a standard Gaspari–Cohn-type localization function (GC0.1). Localization functions discriminated by different observation types and for precipitating and nonprecipitating regions are examined using the empirical localization function (ELF). For temperature and *u* and *υ* winds, the horizontal ELFs for nonprecipitating regions (ELFNP) have a similar shape to GC0.1, and the horizontal ELFs for precipitating regions (ELFPP) are narrower than GC0.1 and ELFNP. The correlations of temperature and *u* and *υ* winds for precipitating regions diminish faster than those for nonprecipitating regions. Thus, precipitating regions have smaller correlation and localization scales than nonprecipitating regions in the horizontal.
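For reference, the GC function against which the ELFs are compared can be written compactly. The sketch below uses the standard fifth-order piecewise polynomial of Gaspari and Cohn (1999), parameterized by the half-width defined in the introduction (half the distance at which the function reaches zero). The function name `gaspari_cohn` is hypothetical. The half-width value 0.1 in the usage comment only mirrors the GC0.1 label, and its units are not restated in this section.

```python
import numpy as np


def gaspari_cohn(distance, half_width):
    """Gaspari-Cohn fifth-order compactly supported localization weights.

    distance   : scalar or array of separations (same units as half_width)
    half_width : half the distance at which the function goes to zero
    Returns a 1D array of weights in [0, 1].
    """
    z = np.abs(np.atleast_1d(np.asarray(distance, dtype=float))) / half_width
    weights = np.zeros_like(z)

    inner = z <= 1.0                 # 0 <= z <= 1
    outer = (z > 1.0) & (z < 2.0)    # 1 < z < 2; weights stay 0 for z >= 2

    zi = z[inner]
    weights[inner] = (-0.25 * zi**5 + 0.5 * zi**4 + 0.625 * zi**3
                      - (5.0 / 3.0) * zi**2 + 1.0)

    zo = z[outer]
    weights[outer] = (zo**5 / 12.0 - 0.5 * zo**4 + 0.625 * zo**3
                      + (5.0 / 3.0) * zo**2 - 5.0 * zo + 4.0
                      - 2.0 / (3.0 * zo))
    return weights


# Example: weights at 0, 1, and 2 half-widths for a half-width of 0.1
# gaspari_cohn([0.0, 0.1, 0.2], 0.1)
```

The weight is 1 at zero separation, 5/24 at one half-width, and exactly 0 at two half-widths and beyond, which is the compact support that the broader-tailed vertical ELFs lack.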

The vertical ELFNPs are smaller than GC0.1 for small separations (<4 km), but have broader tails than GC0.1. The vertical ELFPPs are more similar to GC0.1 than the vertical ELFNPs for separations smaller than 4 km, and the vertical ELFPPs are broader than GC0.1 and ELFNPs for separations larger than 4 km. The vertical ELFPPs of *u* and *υ* winds are broader than those of temperature between 4 and 10 km. Correlations for precipitating and nonprecipitating regions quickly decrease with separation when separations are smaller than 2 km and then diminish more gradually to zero for separations larger than 2 km. The vertical correlations for precipitating regions are generally larger than those for nonprecipitating regions. Therefore, precipitating regions have both larger correlation and larger localization scales than nonprecipitating regions in the vertical.

The impact of different localization functions for different observation types and for precipitating and nonprecipitating regions is examined by applying the fitted ELFs (ELFFs) in subsequent OSSEs. Using a single horizontal and a single vertical ELFF yields slightly smaller (but statistically significant) RMSEs of temperature and *u* and *υ* winds than GC0.1. ELFFs varying by precipitating and nonprecipitating regions have similar RMSE to ELFFs varying by precipitating and nonprecipitating regions and also observation types, and both of them have slightly smaller (but statistically significant) RMSE than the single horizontal and vertical ELFF and GC0.1. Thus, the advantages of the vertical localization provided by ELFF and of varying localization for precipitating and nonprecipitating regions are demonstrated, although the localization functions varying by observation types do not show additional benefits. However, a bigger impact of varying localization functions with observation types would be expected for nonlocal observations like radiance (e.g., Houtekamer and Mitchell 2005; Anderson and Lei 2013).

Improper localization can introduce imbalance into the analysis (Greybush et al. 2011); thus, the balance is compared when varying localization functions by examining the domain-averaged surface pressure tendency. The three sets of ELFFs have similar surface pressure tendencies that are consistently slightly smaller than those with GC0.1. Thus, the ELFFs have improved balance over a GC localization function.

When the three sets of ELFFs are used in a period that is independent from the assimilation period used for computing the ELFFs, ELFOneA has a slightly smaller RMSE of temperature and *u* and *υ* winds than GC0.1, but ELFOnePN and ELFObsPN have a slightly larger RMSE than GC0.1 (not shown). There may be regional or synoptic pattern aspects that impact the optimal shape for ELFFs. This might indicate a need to subset the pairs that consist of an observation and a state variable not just by separation, but also by location, or the need to compute ELFs on the fly. Further studies to better define the pairs and subsets with changing synoptic conditions and varying with precipitating and nonprecipitating regions are needed.
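To make the idea of subsetting pairs concrete, the sketch below bins (observation, state variable) pairs by separation and estimates one localization value per bin. This section does not restate how an ELF is computed. The sketch assumes a least-squares form in which the localization value is the scalar multiplier on the unlocalized regression increment that minimizes the squared error against the truth within each subset. The function name `empirical_localization` and its arguments are hypothetical.

```python
import numpy as np


def empirical_localization(separation, increment, error, bin_edges):
    """Least-squares localization value per separation bin (illustrative).

    separation : separation of each (observation, state variable) pair
    increment  : unlocalized ensemble-mean increment to the state variable
                 produced by the observation (assumed input)
    error      : truth minus prior ensemble mean of the state variable
    bin_edges  : increasing edges defining the separation subsets
    Returns an array with one value per bin (NaN where a bin is empty).
    """
    separation = np.asarray(separation, dtype=float)
    increment = np.asarray(increment, dtype=float)
    error = np.asarray(error, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)

    n_bins = bin_edges.size - 1
    elf = np.full(n_bins, np.nan)

    # Bin index for each pair; pairs outside the edges get -1 or n_bins
    # and are therefore never selected below.
    which = np.digitize(separation, bin_edges) - 1

    for b in range(n_bins):
        in_bin = which == b
        denom = np.sum(increment[in_bin] ** 2)
        if denom > 0.0:
            # alpha minimizing sum((error - alpha * increment)^2) in this bin
            elf[b] = np.sum(error[in_bin] * increment[in_bin]) / denom
    return elf
```

Subsetting by location or by precipitating versus nonprecipitating regions, as suggested above, would amount to adding further masks alongside `in_bin` before the sums are formed.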

Results presented in this study motivate future work to investigate ELFs for convective-scale data assimilation. The horizontal grid spacing here is 15 km, and cumulus parameterization is applied. It is unclear whether the structures of the ELFs, especially the vertical ELFs, would hold at higher resolution with explicit convection. Thus, the next step is to construct and test the ELFs for convective-scale assimilation. In this study, the pairs that belong to precipitating regions are determined by a nonzero precipitation rate at the surface. Columns with hydrometeors aloft, but not resulting in precipitation at the surface, are not included in the precipitating regions pairs. The need to subset the pairs in a more detailed and appropriate way will be investigated in the future. A closer examination of localization functions for moisture variables is also warranted, especially for convective-scale data assimilation.

A nearly uniform observation network is used here, and further exploration of the performance of the ELF with different observation densities and networks is needed. Moreover, a time-varying and spatially varying localization algorithm that takes into account ensemble size, state variable kind, observation type, density and location, and the dynamics of the observed system would be preferred. The ELFs are computed from the output of an OSSE here; however, they can also be computed from an ensemble simulation system with real observations (Anderson and Lei 2013). Thus, further investigation of constructing the ELFs with real observations and of applying the ELFs with varying observation types and regions in real observation experiments is needed.

## Acknowledgments

Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the National Science Foundation. Thanks to Tim Hoar and Nancy Collins for technical support and to Doug Nychka for helpful discussions. We would like to acknowledge high-performance computing support from Yellowstone (ark:/85065/d7wd3xhc) provided by NCAR’s Computational and Information Systems Laboratory, sponsored by the National Science Foundation. Insightful comments from three anonymous reviewers significantly improved this report.

## REFERENCES

Aksoy, A., D. C. Dowell, and C. Snyder, 2009: A multicase comparative assessment of the ensemble Kalman filter for assimilation of radar observations. Part I: Storm-scale analyses. *Mon. Wea. Rev.*, **137**, 1805–1824, doi:10.1175/2008MWR2691.1.

Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. *Mon. Wea. Rev.*, **129**, 2884–2903, doi:10.1175/1520-0493(2001)129<2884:AEAKFF>2.0.CO;2.

Anderson, J. L., 2007: Exploring the need for localization in ensemble data assimilation using a hierarchical ensemble filter. *Physica D*, **230**, 99–111, doi:10.1016/j.physd.2006.02.011.

Anderson, J. L., 2009: Spatially and temporally varying adaptive covariance inflation for ensemble filters. *Tellus*, **61A**, 72–83, doi:10.1111/j.1600-0870.2008.00361.x.

Anderson, J. L., 2012: Localization and sampling error correction in ensemble Kalman filter data assimilation. *Mon. Wea. Rev.*, **140**, 2359–2371, doi:10.1175/MWR-D-11-00013.1.

Anderson, J. L., and N. Collins, 2007: Scalable implementations of ensemble filter algorithms for data assimilation. *J. Atmos. Oceanic Technol.*, **24**, 1452–1463, doi:10.1175/JTECH2049.1.

Anderson, J. L., and L. Lei, 2013: Empirical localization of observation impact in ensemble Kalman filters. *Mon. Wea. Rev.*, **141**, 4140–4153, doi:10.1175/MWR-D-12-00330.1.

Anderson, J. L., et al., 2004: The new GFDL global atmosphere and land model AM2–LM2: Evaluation with prescribed SST simulations. *J. Climate*, **17**, 4641–4673, doi:10.1175/JCLI-3223.1.

Anderson, J. L., T. Hoar, K. Raeder, H. Liu, N. Collins, R. Torn, and A. Arellano, 2009: The Data Assimilation Research Testbed: A community facility. *Bull. Amer. Meteor. Soc.*, **90**, 1283–1296, doi:10.1175/2009BAMS2618.1.

Barker, D., et al., 2012: The Weather Research and Forecasting Model’s Community Variational/Ensemble Data Assimilation System: WRFDA. *Bull. Amer. Meteor. Soc.*, **93**, 831–843, doi:10.1175/BAMS-D-11-00167.1.

Bishop, C. H., and D. Hodyss, 2009a: Ensemble covariances adaptively localized with ECO-RAP. Part 1: Tests on simple error models. *Tellus*, **61A**, 84–96, doi:10.1111/j.1600-0870.2008.00371.x.

Bishop, C. H., and D. Hodyss, 2009b: Ensemble covariances adaptively localized with ECO-RAP. Part 2: A strategy for the atmosphere. *Tellus*, **61A**, 97–111, doi:10.1111/j.1600-0870.2008.00372.x.

Buehner, M., P. L. Houtekamer, C. Charette, H. L. Mitchell, and B. He, 2010: Intercomparison of variational data assimilation and the ensemble Kalman filter for global deterministic NWP. Part I: Description and single-observation experiments. *Mon. Wea. Rev.*, **138**, 1550–1566, doi:10.1175/2009MWR3157.1.

Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: Analysis scheme in the ensemble Kalman filter. *Mon. Wea. Rev.*, **126**, 1719–1724, doi:10.1175/1520-0493(1998)126<1719:ASITEK>2.0.CO;2.

Dowell, D. C., F. Zhang, L. J. Wicker, C. Snyder, and N. A. Crook, 2004: Wind and temperature retrievals in the 17 May 1981 Arcadia, Oklahoma, supercell: Ensemble Kalman filter experiments. *Mon. Wea. Rev.*, **132**, 1982–2005, doi:10.1175/1520-0493(2004)132<1982:WATRIT>2.0.CO;2.

Efron, B., and R. J. Tibshirani, 1993: *An Introduction to the Bootstrap*. Chapman and Hall, 436 pp.

Ek, M. B., K. E. Mitchell, Y. Lin, E. Rogers, P. Grunmann, V. Koren, G. Gayno, and J. D. Tarpley, 2003: Implementation of Noah land surface model advances in the National Centers for Environmental Prediction operational mesoscale Eta model. *J. Geophys. Res.*, **108**, 8851, doi:10.1029/2002JD003296.

Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. *J. Geophys. Res.*, **99**, 10 143–10 162, doi:10.1029/94JC00572.

Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. *Quart. J. Roy. Meteor. Soc.*, **125**, 723–757, doi:10.1002/qj.49712555417.

Greybush, S. J., E. Kalnay, T. Miyoshi, K. Ide, and B. R. Hunt, 2011: Balance and ensemble Kalman filter localization techniques. *Mon. Wea. Rev.*, **139**, 511–522, doi:10.1175/2010MWR3328.1.

Hacker, J. P., J. L. Anderson, and M. Pagowski, 2007: Improved vertical covariance estimates for ensemble-filter assimilation of near-surface observations. *Mon. Wea. Rev.*, **135**, 1021–1036, doi:10.1175/MWR3333.1.

Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. *Mon. Wea. Rev.*, **129**, 2776–2790, doi:10.1175/1520-0493(2001)129<2776:DDFOBE>2.0.CO;2.

Hastie, T., and R. Tibshirani, 1990: *Generalized Additive Models*. Chapman & Hall/CRC Monogr. Stat. Appl. Probab., No. 43, Chapman and Hall, 352 pp.

Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. *Mon. Wea. Rev.*, **126**, 796–811, doi:10.1175/1520-0493(1998)126<0796:DAUAEK>2.0.CO;2.

Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. *Mon. Wea. Rev.*, **129**, 123–137, doi:10.1175/1520-0493(2001)129<0123:ASEKFF>2.0.CO;2.

Houtekamer, P. L., and H. L. Mitchell, 2005: Ensemble Kalman filtering. *Quart. J. Roy. Meteor. Soc.*, **131**, 3269–3289, doi:10.1256/qj.05.135.

Houtekamer, P. L., X. Deng, H. L. Mitchell, S. Baek, and N. Gagnon, 2014: Higher resolution in an operational ensemble Kalman filter. *Mon. Wea. Rev.*, **142**, 1143–1162, doi:10.1175/MWR-D-13-00138.1.

Iacono, M. J., J. S. Delamere, E. J. Mlawer, M. W. Shephard, S. A. Clough, and W. D. Collins, 2008: Radiative forcing by long-lived greenhouse gases: Calculations with the AER radiative transfer models. *J. Geophys. Res.*, **113**, D13103, doi:10.1029/2008JD009944.

Janjić, Z. I., 1994: The step-mountain eta coordinate model: Further developments of the convection, viscous sublayer, and turbulence closure schemes. *Mon. Wea. Rev.*, **122**, 927–945, doi:10.1175/1520-0493(1994)122<0927:TSMECM>2.0.CO;2.

Janjić, Z. I., 2002: Nonsingular implementation of the Mellor–Yamada level 2.5 scheme in the NCEP Meso model. NCEP Office Note 437, 61 pp. [Available online at http://www.emc.ncep.noaa.gov/officenotes/newernotes/on437.pdf.]

Kalman, R. E., 1960: A new approach to linear filtering and prediction problems. *J. Basic Eng.*, **82**, 35–45, doi:10.1115/1.3662552.

Kang, J.-S., E. Kalnay, J. Liu, I. Fung, T. Miyoshi, and K. Ide, 2011: “Variable localization” in an ensemble Kalman filter: Application to the carbon cycle data assimilation. *J. Geophys. Res.*, **116**, D09110, doi:10.1029/2010JD014673.

Kepert, J., 2009: Covariance localisation and balance in an ensemble Kalman filter. *Quart. J. Roy. Meteor. Soc.*, **135**, 1157–1176, doi:10.1002/qj.443.

Lange, H., and G. C. Craig, 2014: The impact of data assimilation length scales on analysis and prediction of convective storms. *Mon. Wea. Rev.*, **142**, 3781–3808, doi:10.1175/MWR-D-13-00304.1.

Lei, L., and J. L. Anderson, 2014a: Comparisons of empirical localization techniques for ensemble Kalman filters in a simple atmospheric general circulation model. *Mon. Wea. Rev.*, **142**, 739–754, doi:10.1175/MWR-D-13-00152.1.

Lei, L., and J. L. Anderson, 2014b: Empirical localization of observations for serial ensemble Kalman filter data assimilation in an atmospheric general circulation model. *Mon. Wea. Rev.*, **142**, 1835–1851, doi:10.1175/MWR-D-13-00288.1.

Lei, L., and J. L. Anderson, 2014c: Impacts of frequent assimilation of surface pressure observations on atmospheric analyses. *Mon. Wea. Rev.*, **142**, 4477–4483, doi:10.1175/MWR-D-14-00097.1.

Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model. *J. Atmos. Sci.*, **55**, 399–414, doi:10.1175/1520-0469(1998)055<0399:OSFSWO>2.0.CO;2.

Mellor, G. L., and T. Yamada, 1982: Development of a turbulence closure model for geophysical fluid problems. *Rev. Geophys.*, **20**, 851–875, doi:10.1029/RG020i004p00851.

Ménétrier, B., and T. Montmerle, 2011: Heterogeneous background-error covariances for the analysis and forecast of fog events. *Quart. J. Roy. Meteor. Soc.*, **137**, 2004–2013, doi:10.1002/qj.802.

Michel, Y., T. Auligné, and T. Montmerle, 2011: Heterogeneous convective-scale background error covariances with the inclusion of hydrometeor variables. *Mon. Wea. Rev.*, **139**, 2994–3015, doi:10.1175/2011MWR3632.1.

Mlawer, E. J., S. J. Taubman, P. D. Brown, M. J. Iacono, and S. A. Clough, 1997: Radiative transfer for inhomogeneous atmosphere: RRTM, a validated correlated-k model for the long-wave. *J. Geophys. Res.*, **102**, 16 663–16 682, doi:10.1029/97JD00237.

Montmerle, T., 2012: Optimization of the assimilation of radar data at the convective scale using specific background error covariances in precipitation. *Mon. Wea. Rev.*, **140**, 3495–3506, doi:10.1175/MWR-D-12-00008.1.

Neale, R. B., et al., 2012: Description of the NCAR Community Atmosphere Model (CAM 5.0). NCAR Tech. Note NCAR/TN-486+STR, 268 pp. [Available online at http://www.cesm.ucar.edu/models/cesm1.0/cam/docs/description/cam5_desc.pdf.]

Otkin, J. A., 2012: Assessing the impact of the covariance localization radius when assimilating infrared brightness temperature observations using an ensemble Kalman filter. *Mon. Wea. Rev.*, **140**, 543–561, doi:10.1175/MWR-D-11-00084.1.

Raeder, K., J. L. Anderson, N. Collins, T. J. Hoar, J. E. Kay, P. H. Lauritzen, and R. Pincus, 2012: DART/CAM: An ensemble data assimilation system for CESM atmospheric models. *J. Climate*, **25**, 6304–6317, doi:10.1175/JCLI-D-11-00395.1.

Romine, G. S., C. S. Schwartz, C. Snyder, J. L. Anderson, and M. L. Weisman, 2013: Model bias in a continuously cycled assimilation system and its influence on convection-permitting forecasts. *Mon. Wea. Rev.*, **141**, 1263–1284, doi:10.1175/MWR-D-12-00112.1.

Schwartz, C. S., G. S. Romine, K. R. Smith, and M. L. Weisman, 2014: Characterizing and optimizing precipitation forecasts from a convection-permitting ensemble initialized by a mesoscale ensemble Kalman filter. *Wea. Forecasting*, **29**, 1295–1318, doi:10.1175/WAF-D-13-00145.1.

Skamarock, W. C., and M. L. Weisman, 2009: The impact of positive-definite moisture transport on NWP precipitation forecasts. *Mon. Wea. Rev.*, **137**, 488–494, doi:10.1175/2008MWR2583.1.

Skamarock, W. C., et al., 2008: A description of the Advanced Research WRF version 3. NCAR Tech Note NCAR/TN-475+STR, 113 pp. [Available online at http://www.mmm.ucar.edu/wrf/users/docs/arw_v3_bw.pdf.]

Snyder, C., and F. Zhang, 2003: Assimilation of simulated Doppler radar observations with an ensemble Kalman filter. *Mon. Wea. Rev.*, **131**, 1663–1677, doi:10.1175//2555.1.

Sobash, R. A., and D. J. Stensrud, 2013: The impact of covariance localization for radar data on EnKF analyses of a developing MCS: Observing system simulation experiments. *Mon. Wea. Rev.*, **141**, 3691–3709, doi:10.1175/MWR-D-12-00203.1.

Thompson, G., P. R. Field, R. M. Rasmussen, and W. D. Hall, 2008: Explicit forecasts of winter precipitation using an improved bulk microphysics scheme. Part II: Implementation of a new snow parameterization. *Mon. Wea. Rev.*, **136**, 5095–5115, doi:10.1175/2008MWR2387.1.

Tiedtke, M., 1989: A comprehensive mass flux scheme for cumulus parameterization in large-scale models. *Mon. Wea. Rev.*, **117**, 1779–1800, doi:10.1175/1520-0493(1989)117<1779:ACMFSF>2.0.CO;2.

Tong, M., and M. Xue, 2005: Ensemble Kalman filter assimilation of Doppler radar data with a compressible nonhydrostatic model: OSS experiments. *Mon. Wea. Rev.*, **133**, 1789–1807, doi:10.1175/MWR2898.1.

Torn, R. D., and G. J. Hakim, 2008: Performance characteristics of a pseudo-operational ensemble Kalman filter. *Mon. Wea. Rev.*, **136**, 3947–3963, doi:10.1175/2008MWR2443.1.

Torn, R. D., G. J. Hakim, and C. Snyder, 2006: Boundary conditions for limited-area ensemble Kalman filters. *Mon. Wea. Rev.*, **134**, 2490–2502, doi:10.1175/MWR3187.1.

Whitaker, J. S., G. P. Compo, X. Wei, and T. M. Hamill, 2004: Reanalysis without radiosondes using ensemble data assimilation. *Mon. Wea. Rev.*, **132**, 1190–1200, doi:10.1175/1520-0493(2004)132<1190:RWRUED>2.0.CO;2.

Whitaker, J. S., T. M. Hamill, X. Wei, Y. Song, and Z. Toth, 2008: Ensemble data assimilation with the NCEP Global Forecast System. *Mon. Wea. Rev.*, **136**, 463–482, doi:10.1175/2007MWR2018.1.

Zhang, C., Y. Wang, and K. Hamilton, 2011: Improved representation of boundary layer clouds over the southeast Pacific in ARW-WRF using a modified Tiedtke cumulus parameterization scheme. *Mon. Wea. Rev.*, **139**, 3489–3513, doi:10.1175/MWR-D-10-05091.1.

Zhang, F., Y. Weng, J. A. Sippel, Z. Meng, and C. H. Bishop, 2009: Cloud-resolving hurricane initialization and prediction through assimilation of Doppler radar observations with an ensemble Kalman filter. *Mon. Wea. Rev.*, **137**, 2105–2125, doi:10.1175/2009MWR2645.1.

^{1} Radar observations in “clear air” outside of precipitation regions include 1) reflectivity, where the absence of echoes is detected, and 2) Doppler radial velocity observations, where scattering is not from precipitation, most often close to the radar site (e.g., Dowell et al. 2004).