## 1. Introduction

Because of their easy implementation, especially in a massively parallelized computing environment, ensemble Kalman filters (EnKFs; Evensen 1994, 2007; Hamill and Snyder 2000; Anderson and Anderson 1999) are becoming operational data assimilation methods in the weather and climate community. Another advantage of EnKFs over variational analysis methods is the flow-dependent background error covariance that is evaluated by a model ensemble. However, because of the sampling error from a finite ensemble size, the ensemble-evaluated background variance is usually underestimated, and spurious correlations exist between a state variable and remote observations. Various static additive (e.g., F. Zhang et al. 2004; Whitaker et al. 2008; Houtekamer et al. 2009) and multiplicative (e.g., Anderson and Anderson 1999) variance inflation schemes, as well as adaptive inflation methods (e.g., Anderson 2007b; Anderson 2009; Li et al. 2009; Miyoshi 2011), have been developed to address the first issue. To remove the long-distance spurious correlations and increase the reliability of the ensemble-evaluated background covariance, the localization technique was introduced into ensemble-based filters (Houtekamer and Mitchell 1998). Note that although localization can also be applied to the background error covariance in the observation space (e.g., Houtekamer and Mitchell 1998; Ott et al. 2004; Hunt et al. 2007; Greybush et al. 2011), we focus on background error covariance localization in the state space in this study.

Originally, Houtekamer and Mitchell (1998) investigated the impact of the accuracy of background error covariance on the analysis quality, which brought about the fixed (parametric or static) localization methods (e.g., Hamill et al. 2001; Houtekamer and Mitchell 2001; Anderson 2001; Szunyogh et al. 2005, 2008). The fixed localization is usually realized by a Schur product (an element-by-element multiplication) of the ensemble-estimated covariance with an analytic localization operator. A widely used fixed localization function is the compactly supported fifth-order polynomial approximation (Gaspari and Cohn 1999, hereafter the GC function) of a normal probability distribution. Other parametric models include the exponential function, the Matérn function, and so on. Zhang et al. (2009) adopted several different localization distances to account for different physical scales. Zhu et al. (2011) sampled a fixed localization function by a set of local correlation function ensemble members so that the filter can assimilate nonlocal observations. All fixed covariance models need to determine a cutoff distance (impact radius) that defines the maximum impact range of observations. The impact radius significantly influences the analysis quality in EnKF, and the optimal cutoff radius is related to ensemble size as well as the properties of observing system and numerical model (Houtekamer and Mitchell 1998; Mitchell et al. 2002). However, it is expensive to tune the optimal cutoff distance given a specific ensemble size, model, and observing system. Therefore, many efforts have been made toward the nonparametric (adaptive) localization algorithms. Anderson (2007a) used a hierarchical filter to estimate the localization function using a group of ensembles. Bishop and Hodyss (2007) proposed the flow-dependent moderation (localization) functions that are built from powers of smoothed ensemble correlations. 
Then the moderation function is advanced to the ensemble correlations raised to a power (ECO-RAP method; Bishop and Hodyss 2009a,b) that propagates and adjusts the width of the localization function by computing powers of raw ensemble correlations. Bishop and Hodyss (2011) extended the ECO-RAP method to a global ensemble four-dimensional variational data assimilation scheme and demonstrated that the covariance function can adapt to anisotropic aspects of the flow. As a follow-up study, Bishop et al. (2011) proposed a computationally efficient algorithm for incorporating fixed localized ensemble covariance into variational data assimilation schemes. Jun et al. (2011) presented a kernel smoothing method with variable bandwidth to adaptively localize the covariance, and their results demonstrate that the nonparametric method provides a more accurate estimate of background covariance than the GC function. Anderson and Lei (2013) developed an empirical localization technique that computes localization from an observing system simulation experiment. Results in a low-order model show that the proposed method produces lower root-mean-square errors in most cases compared to assimilations using tuned localizations. Recently, Lei and Anderson (2014) investigated this localization algorithm in the Community Atmosphere Model, version 5, and obtained promising results. Although the adaptive models are promising, a certain limitation of the adaptive methods is the high computational cost. Besides the nonadaptive and adaptive correlation models, wavelet approaches (e.g., Deckmyn and Berre 2005; Pannekoucke et al. 2007), recursive filters (e.g., Wu et al. 2002; Purser et al. 2003), diffusion-based models (e.g., Weaver and Courtier 2001; Pannekoucke and Massart 2008; Weaver and Mirouze 2013; Yaremchuk and Nechaev 2013), and the wavelet and diffusion hybrid method (Pannekoucke 2009) are also developed to localize the covariances. These methods are also computationally complex. 
A more detailed review of localization methods is documented by Berre and Desroziers (2010). Note that most of the current localization techniques have pros and cons. For the parametric methods, one advantage is the low computational cost while one deficiency is the strong dependence on the impact radius. In this study, we present an approach to compensate for the following two disadvantages of the fixed localization models if the cutoff distance is not optimal: loss of longwave observational information due to a small cutoff distance, and contamination of the analysis model states by noise caused by the long-distance spurious correlation if a large cutoff distance is used. The compensatory approach uses a multiple-scale analysis (MSA) technique to retrieve multiple-scale information from the observational residuals (the differences between observations and the interpolated analysis ensemble means produced by EnKF) and adds the analysis fields to the ensemble mean of EnKF. The hybrid method proposed in this study is inspired by previous studies that deal with the different spatial scales in filtering. Hamill and Snyder (2000), Lorenc (2003), as well as Rainwater and Hunt (2013) combined static (low-resolution ensemble evaluated) background error covariance and (high resolution) ensemble-evaluated background error covariance for the variational (ensemble filter) method. Buehner (2012) put forward a spatial/spectral localization approach that separately accounts for different-scale error covariances through a bandpass filter. Results of their data assimilation experiment show that this method can reduce the error in spatial correlation estimates. Motivated by the work of Buehner (2012), Miyoshi and Kondo (2013) proposed a dual-localization method. Results of a perfect twin experiment show a great advantage of their method over the single localization method. Afterward, Kondo et al. (2013) investigated the sensitivities of the parameters in the dual-localization approach, including the smoothing function and two localization scales. While these multiscale localization methods attempt to use ensemble-based flow-dependent covariance in longer-range covariances, the proposed EnKF–MSA hybrid method addresses the issue of losing longwave information (contaminating the analysis) caused by overly small (overly large) impact radii. With a global barotropic spectral model and a biased twin experiment as well as an idealized observing system, the performance of the proposed algorithm is investigated in depth.

After the introduction, section 2 briefly describes the global barotropic spectral model, an EnKF algorithm [ensemble adjustment Kalman filter (EAKF); Anderson (2003)], the idea of MSA, as well as the implementation of the hybrid method. Section 3 introduces a biased twin experiment. Section 4 thoroughly investigates the performance of the proposed method. The impact of the compensatory scheme on the weather forecast is presented in section 5, while a summary and a general discussion are given in section 6.

## 2. Methodology

### a. The model

[…] and *f* represent the relative vorticity and planetary vorticity (i.e., the Coriolis parameter), respectively; and *H* is the depth of the atmospheric layer. After introducing the geostrophic streamfunction […] *β* plane (i.e., *f* = *f*_{0} + *βy*), the absolute vorticity (i.e., […] *y* represents the northward meridional distance from the equator; and […]

A rhomboidal 21 truncation is applied for the transformation between spectral coefficients and grid values. The state variables are spectral coefficients [the atmospheric streamfunction at the 64 (longitude) × 54 (latitude) Gaussian grid points] for the time stepping (the data assimilation). The integration step size is a half-hour. A leapfrog time step is used to integrate the model and a Robert–Asselin time filter (Robert 1969; Asselin 1972) is applied to damp the spurious computational modes.
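The time stepping described above can be sketched as follows. This is a minimal illustration rather than the model code: the function name `leapfrog_ra`, the forward-Euler start-up step, and the particular form of the filter (a displacement `nu * (x_prev - 2 * x_curr + x_next)` added to the central time level) are assumptions, and `nu` stands for the time filter coefficient.

```python
import numpy as np

def leapfrog_ra(tendency, x0, dt, nsteps, nu=0.01):
    """Integrate dx/dt = tendency(x) for nsteps steps of size dt.

    Hypothetical helper: leapfrog stepping with a Robert-Asselin filter
    of coefficient nu applied to the central time level.
    """
    x_prev = np.asarray(x0, dtype=float)
    # Assumed start-up: one forward-Euler step to obtain the second time level.
    x_curr = x_prev + dt * tendency(x_prev)
    for _ in range(nsteps - 1):
        # Leapfrog step: centered difference over two time levels.
        x_next = x_prev + 2.0 * dt * tendency(x_curr)
        # Robert-Asselin filter: damp the computational mode of the leapfrog scheme.
        x_filt = x_curr + nu * (x_prev - 2.0 * x_curr + x_next)
        x_prev, x_curr = x_filt, x_next
    return x_curr

# Usage: a harmonic oscillator with state x = (position, velocity).
oscillator = lambda x: np.array([x[1], -x[0]])
x_end = leapfrog_ra(oscillator, [1.0, 0.0], dt=1e-3, nsteps=1000)
```

A toy oscillator is used here only because its exact solution is known; in the barotropic model the tendency would act on the spectral coefficients.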

### b. The EAKF algorithm in Anderson (2001)

**x**^{b} represents the background of the state vector **x** with the dimension of *M* × 1; **y**^{o} is the observation vector with the dimension of *K* × 1; […]^{b} are the observation error covariance matrix and the background error covariance matrix, respectively. Table 1 lists several notations used in this study.

Table 1. Glossary of notations in this study.

[…]^{b} is estimated by *N* forecasted dynamical ensemble members: […] Here, […] *n*th ensemble of background perturbation, which is defined as […] where […] *n*th realization of background field. As the ensemble size is usually much smaller than the model dimension for typical atmospheric and oceanic applications, […] **ρ** is an *M* × *M* local support correlation matrix whose *i*th-row and *j*th-column element *ρ*_{i,j} represents the compactly supported correlation coefficient between the *i*th model grid and the *j*th model grid. Then, the Kalman-gain matrix […]

[…] *y*^{o}, the adjustments of ensemble mean and ensemble perturbation^{1} of *y*^{o} are first computed by (see Anderson 2001, 2003): […] and […], respectively. The posterior and prior ensemble means of *y*^{o} are denoted by […] *R* and […] *y*^{o}. The *i*th prior ensemble of *y*^{o}, […]^{2} exists between […] *i*th ensemble perturbation of the *j*th state variable *x*_{j}, respectively.

[…] *x*_{j} and *y*^{o}. Therefore, a quantity, namely, the increment, is defined to combine the adjustments of ensemble mean and ensemble perturbation (Anderson 2003): […] where […] *y*^{o} and the state increment of *x*_{j} for the *i*th ensemble, respectively.
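The two-step update described above (increments computed in observation space, then regressed onto each state variable) can be sketched for a single scalar observation as follows. This is a minimal sketch written from the commonly used form of the EAKF, not a transcription of the equations of this section; the function names and array layouts are hypothetical, and localization is left out.

```python
import numpy as np

def obs_space_increments(y_prior, y_obs, r_var):
    """Increments of the prior observation ensemble for one scalar observation.

    Assumed form: the posterior mean and variance follow from the product of
    two Gaussians, and prior perturbations are contracted linearly.
    """
    mean_p = y_prior.mean()
    var_p = y_prior.var(ddof=1)
    var_u = 1.0 / (1.0 / var_p + 1.0 / r_var)            # posterior variance
    mean_u = var_u * (mean_p / var_p + y_obs / r_var)    # posterior mean
    y_post = np.sqrt(var_u / var_p) * (y_prior - mean_p) + mean_u
    return y_post - y_prior

def regress_to_state(x_prior, y_prior, dy):
    """Map observation increments dy (N,) onto state members x_prior (N, M)."""
    n = len(y_prior)
    x_pert = x_prior - x_prior.mean(axis=0)
    y_pert = y_prior - y_prior.mean()
    cov_xy = x_pert.T @ y_pert / (n - 1)   # sample covariance of each x_j with y
    var_y = y_pert @ y_pert / (n - 1)      # sample variance of y
    return np.outer(dy, cov_xy / var_y)    # linear regression of dy onto x_j

# Usage with a 20-member ensemble and 3 state variables.
rng = np.random.default_rng(0)
x_prior = rng.normal(size=(20, 3))
y_prior = x_prior[:, 0]                    # observe the first state variable
dy = obs_space_increments(y_prior, y_obs=0.5, r_var=0.25)
x_post = x_prior + regress_to_state(x_prior, y_prior, dy)
```

Because the increment carries both the mean shift and the perturbation contraction, a single regression updates each member of each state variable.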

[…] *y*^{o} and *x*_{j} only appears in the linear regression formula in Eq. (15). Therefore, when the covariance localization is introduced into EAKF, the local support correlation is imposed in the numerator of the coefficient of linear regression as […] where *ρ*_{j,y} represents the localization factor between *y*^{o} and *x*_{j}.

[…] *b* denotes the physical distance between *y*^{o} and *x*_{j}, and *a* represents the half-width of the GC function (that is, half of the impact radius).
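For reference, the GC function can be evaluated as in the sketch below. It uses the standard compactly supported fifth-order piecewise rational form of Gaspari and Cohn (1999), written in terms of the distance *b* and the half-width *a*; the function name `gaspari_cohn` and the array handling are assumptions. The factor vanishes beyond 2*a*, which matches the statement that *a* is half of the impact radius.

```python
import numpy as np

def gaspari_cohn(b, a):
    """GC fifth-order localization factor for distance b and half-width a.

    Equals 1 at b = 0 and is exactly 0 for b >= 2a (the impact radius).
    """
    r = np.abs(np.atleast_1d(np.asarray(b, dtype=float))) / a
    rho = np.zeros_like(r)
    near = r <= 1.0
    far = (r > 1.0) & (r < 2.0)
    rn, rf = r[near], r[far]
    rho[near] = (-0.25 * rn**5 + 0.5 * rn**4 + 0.625 * rn**3
                 - (5.0 / 3.0) * rn**2 + 1.0)
    rho[far] = (rf**5 / 12.0 - 0.5 * rf**4 + 0.625 * rf**3
                + (5.0 / 3.0) * rf**2 - 5.0 * rf + 4.0 - 2.0 / (3.0 * rf))
    return rho

# Usage: factors at distances 0, a, 2a and 3a for a = 1500 km.
rho = gaspari_cohn([0.0, 1500.0, 3000.0, 4500.0], a=1500.0)
```

In a localized regression, such a factor would multiply the sample covariance between the observation and the state variable before the division by the observation variance.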

### c. Some limitations in the fixed covariance localization methods

[…] *a*.^{3} Note that the time mean ensemble mean RMSE in this study is defined as […] where *S* represents the number of analysis steps, *s* indexes the analysis step while RMSE_{s} denotes the RMSE at the *s*th analysis step; *im* and *jm* are the dimensions of zonal and meridional model grids (that is, 64 and 54), respectively. The ensemble mean of the atmospheric streamfunction […] “*t*” denote the prior and the truth values, respectively. The optimal value of *a* is about 1500 km and the results of EnKF are sensitive to *a*. Figure 2 displays a snapshot of […] *a*s of 125 (Fig. 2a), 500 (Fig. 2b), 1000 (Fig. 2c), and 2000 km (Fig. 2d). For a small *a* (like 125 km), many longwaves are lost, which is especially significant where the observations are sparsely distributed (see solid circles in Fig. 4). For 500- and 1000-km values of *a*, some longwave signals are still lost by EnKF. When *a* exceeds a critical value, the localization cannot effectively suppress the long-distance spurious correlations, which in turn contaminate the analysis solution of EnKF (Fig. 2d). Thus, for extreme *a* values (like 125 and 2000 km here), multiscale information, which is even stronger than the observational error (the solid and dashed curves in Fig. 2), is left in the truth residuals. Here, the truth residual is the difference between truth and the analysis of EnKF: […] with **x**^{t} representing the truth of **x**. However, only the observational residual defined as […] is available in practice. For overly large and overly small *a* values, **y**^{res} still contains some multiscale information of […]
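The two diagnostics used in this section can be sketched as follows, assuming that RMSE_{s} is the root-mean-square difference over the *im* × *jm* grid at analysis step *s*, that the time mean is a plain average over the *S* steps, and that the observational residual is the observation minus the analysis ensemble mean interpolated to the observation positions. The function names and the matrix `h_op` standing for the interpolation operator are hypothetical.

```python
import numpy as np

def rmse_per_step(psi_mean, psi_truth):
    """RMSE_s for each analysis step; inputs have shape (S, im, jm)."""
    diff = np.asarray(psi_mean) - np.asarray(psi_truth)
    return np.sqrt(np.mean(diff**2, axis=(1, 2)))

def time_mean_rmse(psi_mean, psi_truth):
    """Average of RMSE_s over the S analysis steps."""
    return rmse_per_step(psi_mean, psi_truth).mean()

def observational_residual(y_obs, h_op, x_analysis_mean):
    """Observations minus the analysis ensemble mean mapped to observation space."""
    return np.asarray(y_obs) - h_op @ x_analysis_mean

# Usage on synthetic fields with the grid dimensions of this study.
truth = np.zeros((3, 64, 54))
ens_mean = np.full((3, 64, 54), 2.0)
score = time_mean_rmse(ens_mean, truth)
```

Unlike the truth residual, the observational residual needs no knowledge of the truth, which is why it is the quantity available in practice.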

### d. Multiple scale analysis

Based on the analyses above, in this section, we introduce an MSA method to address the described issue. The MSA approach is a variant of a multigrid method that was initially suggested for solving differential equations (Briggs et al. 2000) and later introduced into the data assimilation community (e.g., Li et al. 2008, 2010; Xie et al. 2011).

[…] *l*th-scale level is formulated as […] where *L* is the number of the scale levels; *δ***x**^{(l)}, […]^{(l)}, […]^{(l)}, […] **d**^{(l)} represent the increment of the state vector **x**, the linear projection operator from the observation space to the state space, the background error covariance matrix, the observational error covariance matrix in MSA, and the observational innovation vector for the *l*th-scale level, respectively. Note that the observation term in Eq. (22) is slightly different from that in the cost function of the traditional variational algorithm, which projects the model state to observation space. MSA here conversely maps the observation to state space through the operator […]_{MSA} to the pseudoinverse of […]^{T} […]^{−1} […]^{(l)} is the smoothing matrix.^{4} The dimensions of the above five matrices are *M* × 1, *M* × *K*, *M* × *M*, *M* × *M*, and *K* × 1, respectively. For each level, **d**^{(l)} is defined as […] where […]^{(l)} is the bilinear interpolation operator from the state space to the observation space for the *l*th-scale level, and […] **x** in MSA; *δ***x**_{MSA}^{(l–1)} represents the analysis result of the (*l* − 1)th scale level [see the following Eq. (25)].

[…] *J*^{(l)} with respect to the control vector *δ***x**^{(l)} can be derived as […] **x** produced by MSA is […]^{(l)}. To be consistent with the GC localization in EnKF, the element of **L**^{(l)}, *L*_{ij}^{(l)}, which denotes the weight of the *j*th observation on the *i*th state variable, also employs the GC function […] where *a*^{(l)} represents the GC localization half-width for the *l*th-scale level, and *b*_{ij} is the physical distance between the *j*th observation and the *i*th state variable. The denominator is a normalization factor.
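The sketch below illustrates only the level-by-level structure of the analysis: GC weights of the *j*th observation on the *i*th state variable, normalized over the observations, and an innovation that is recomputed at each level from what the previous levels have already explained. It is a deliberately simplified sketch and not the MSA of this section: the smoothing term of the cost function is omitted, the normalization (a row sum) is an assumption, and the names `msa_weights`, `multiscale_pass`, `dist`, and `h_op` are hypothetical.

```python
import numpy as np

def gaspari_cohn(b, a):
    """Standard GC fifth-order factor for distance b and half-width a."""
    r = np.abs(np.asarray(b, dtype=float)) / a
    rho = np.zeros_like(r)
    near = r <= 1.0
    far = (r > 1.0) & (r < 2.0)
    rn, rf = r[near], r[far]
    rho[near] = (-0.25 * rn**5 + 0.5 * rn**4 + 0.625 * rn**3
                 - (5.0 / 3.0) * rn**2 + 1.0)
    rho[far] = (rf**5 / 12.0 - 0.5 * rf**4 + 0.625 * rf**3
                + (5.0 / 3.0) * rf**2 - 5.0 * rf + 4.0 - 2.0 / (3.0 * rf))
    return rho

def msa_weights(dist, a_l):
    """Row-normalized GC weights; dist[i, j] is the distance from grid i to obs j."""
    w = gaspari_cohn(dist, a_l)
    total = w.sum(axis=1, keepdims=True)
    # Grid points with no observation inside the impact radius get zero weight.
    return np.divide(w, total, out=np.zeros_like(w), where=total > 0)

def multiscale_pass(y_res, h_op, dist, half_widths):
    """Accumulate increments level by level, from longwave to shortwave."""
    dx = np.zeros(dist.shape[0])
    for a_l in half_widths:
        d = y_res - h_op @ dx              # innovation left for this level
        dx = dx + msa_weights(dist, a_l) @ d
    return dx

# Usage: a 1D line of 10 grid points, 500 km apart, all of them observed.
pos = 500.0 * np.arange(10)
dist = np.abs(pos[:, None] - pos[None, :])
dx = multiscale_pass(np.full(10, 3.0), np.eye(10), dist, [4000.0, 2000.0, 1000.0])
```

With a spatially uniform residual, the first (largest) level already explains the whole signal, so the later levels add nothing; structured residuals are picked up progressively by the narrower levels.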

### e. An EnKF–MSA hybrid method

To overcome the limitations of the fixed covariance localization in EnKF described in section 2c, we present an EnKF–MSA hybrid method in this section. Figure 3 shows the flowchart of the hybrid method for a data assimilation cycle. The sequential implementation of the hybrid method is as follows:

- Step 1: Adjust the ensemble members of the state variable using the observations with the standard EnKF algorithm with a GC half-width of *a*.
- Step 2: Linearly project the analysis ensemble mean produced by EnKF to the observation positions to get the EnKF-estimated posterior observation values. Then Eq. (21) is used to compute the observational residuals.
- Step 3: Apply MSA to the observational residuals to extract multiscale information from longwave to shortwave. Under this circumstance, the localization factor (here the GC half-width) for the *i*th-scale level in MSA [i.e., *a*^{(i)}] should decrease monotonically from *a*^{(1)} to *a*^{(L)} as *i* increases from 1 to *L*. In addition, to remain consistent with the localization of EnKF, the GC half-width in MSA for the last scale level [i.e., *a*^{(L)}], which has the smallest scale, is set to *a*.
- Step 4: Add the analysis field generated by MSA to the ensemble mean produced by EnKF to obtain the final ensemble mean.
- Step 5: Add the new ensemble mean to the ensemble perturbations to generate the final ensemble members.
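Steps 2 to 5 can be sketched as a single function that takes an ensemble already updated by the EnKF (Step 1). This is a minimal sketch under stated assumptions: the name `hybrid_update`, the matrix `h_op` standing for the linear projection to the observation positions, and the callable `msa_analysis` that returns a state-space increment from the residuals are all hypothetical placeholders.

```python
import numpy as np

def hybrid_update(ens_analysis, y_obs, h_op, msa_analysis):
    """Apply Steps 2-5 to EnKF-updated members of shape (N, M)."""
    mean_enkf = ens_analysis.mean(axis=0)
    perts = ens_analysis - mean_enkf       # perturbations are left unchanged
    y_res = y_obs - h_op @ mean_enkf       # Step 2: observational residuals
    dx_msa = msa_analysis(y_res)           # Step 3: multiscale analysis
    mean_final = mean_enkf + dx_msa        # Step 4: corrected ensemble mean
    return mean_final + perts              # Step 5: final ensemble members

# Usage with a placeholder analysis that returns half of the residual.
rng = np.random.default_rng(1)
ens = rng.normal(size=(5, 4))
y_obs = ens.mean(axis=0) + 1.0
ens_final = hybrid_update(ens, y_obs, np.eye(4), lambda r: 0.5 * r)
```

Only the ensemble mean is modified by the second stage; the ensemble spread remains the one produced by the EnKF.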

[…] **y**^{res}. While the standard EnKF can extract observational information implied in the first term with a fixed localization factor, it cannot deal with the second term, which is the observational residual. For extreme values of impact factors, the observational residual may contain multiscale information (see Figs. 2a,d). Under this circumstance, the MSA in the hybrid method is used to retrieve the multiscale signals from observational residuals [Eq. (21)] to compensate for the loss of longwave information (the contamination of the analysis) in the standard EnKF caused by an overly small (large) cutoff distance. Thus, in the hybrid method, the analysis solution of the ensemble mean of the model state is […] *δ***x**_{MSA} [Eq. (26)], which is retrieved by MSA from **y**^{res}. In other words, the observations are assimilated in two steps but the two analysis increments are completely compensatory. Second, since MSA is applied to the observational residuals, which leaves the prior information (e.g., the background error covariance) of the truth residual [Eq. (20)] unknown, the background term in the cost function [Eq. (22)] of MSA is neglected.

[…] **y**^{(1)} is actually the observational residual **y**^{res} in Eq. (21). Thus, the background field of MSA is actually the analysis field of EnKF (i.e., […] **x**^{(l)} is reshaped by a 2D matrix […]^{(l)} with (*im*, *jm*) dimension, the expression of the smoothing term in Eq. (22) can be formulated by […] where […] (*i*, *j*)th element of […]^{(l)}, and the four coefficients are […] where lon(*i*) and lat(*j*) represent the longitude and latitude of the (*i*, *j*)th model grid, respectively. It is easy to infer that the smoothing matrix […] *O*( […] *δ***x**^{(l)} through balancing the smoothing term and the observation term. Although the classical observation error covariance matrix […]

The final ensemble members can be obtained through adding the above ensemble mean to the ensemble perturbations updated by the standard EnKF.

## 3. Biased twin-experiment setup

A biased twin-experimental framework is designed to investigate the performances of two assimilation schemes. The sole source of model error is assumed to arise from the uncertainty of the time filter coefficient. We set the time filter coefficient value as 0.02 in the assimilation model, which produces an apparent bias with the truth model that uses 0.01 as its time filter coefficient value.

Starting from the streamfunction at 1200 UTC 1 January 1991 derived from the European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis 500-hPa *u* and *υ* data, both the truth model and assimilation model are spun up for 30 days to derive their own initial conditions. Then the truth model is integrated for another 240 days to generate “observations” which sample the “truth” model states. The observational interval is set to 6 h (12 time steps). Gaussian noise with a standard deviation of 10^{6} m^{2} s^{−1} is added to the truth streamfunction to simulate the “observational” error. Furthermore, to simply reflect the spatial structure of the observing system, observations for all model grids in the Northern Hemisphere (NH) are assumed to be available. In the Southern Hemisphere (SH), observations on odd *x*-index and *y*-index grids are assumed to be available. That is, only ¼ of the grid points are observed in the SH. Figure 4 displays the observing system (solid circle) and the model grids (plus sign). The numerical values of *K* and *M* in this experiment are 2176 and 3456, respectively.
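The observing system can be sketched as a Boolean mask on the 64 × 54 grid. The sketch assumes 1-based grid indices and a meridional index that increases from south to north, so that the SH occupies the first 27 latitude rows; this ordering is an assumption, chosen because it reproduces the stated values *K* = 2176 and *M* = 3456. The function name and the zero-valued `truth` field are placeholders.

```python
import numpy as np

IM, JM = 64, 54  # zonal and meridional grid dimensions

def observation_mask(im=IM, jm=JM):
    """True where a grid point is observed (1-based indices, south to north)."""
    i = np.arange(1, im + 1)[:, None]      # zonal index
    j = np.arange(1, jm + 1)[None, :]      # meridional index
    northern = j > jm // 2                 # every NH grid point is observed
    southern_odd = (j <= jm // 2) & (i % 2 == 1) & (j % 2 == 1)
    return northern | southern_odd

mask = observation_mask()

# Synthetic "observations": truth sampled at observed points plus Gaussian noise
# with the stated standard deviation of 1e6 m^2 s^-1.
rng = np.random.default_rng(0)
truth = np.zeros((IM, JM))                 # placeholder truth streamfunction
obs = truth[mask] + rng.normal(0.0, 1.0e6, size=int(mask.sum()))
```

Under these assumptions the mask contains 64 × 27 observed points in the NH and 32 × 14 in the SH.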

Initial ensemble perturbations of the atmospheric streamfunction for the assimilation are generated by adding Gaussian white noise with the same standard deviation as the observational error to the biased initial condition generated by the assimilation model. The ensemble size is set to a typical value of 20. Additionally, because the leapfrog scheme is used to integrate the model, a two time-level adjustment method (S. Zhang et al. 2004) is applied to the data assimilation. That is, observations at time *t* are used to adjust the model states at time *t* and *t* − 1.

Three experiments are conducted to evaluate the performances of the two assimilation algorithms. The first one is the ensemble control run (without observational constraint), serving as the reference experiment, denoted as CTL; the second one is the standard EnKF; and the third one is the EnKF with MSA. To simply examine the validity of MSA, it is only applied to observational residuals in the SH in this study. Additionally, the MSA in this study is activated after 20 model days, which is roughly the length of the spinup of the standard EnKF. The goal of this setting is to investigate whether MSA can further enhance the accuracy of the ensemble mean after the standard EnKF reaches its equilibrium. In fact, extra experiments that apply the MSA from the first model day reach equilibria similar to those of the experiments in this study with a shorter spinup (not shown). The two data assimilation algorithms use the same observing system and ensemble initial conditions as those in the first experiment. Discarding the assimilation results in the first 140 days as the spinup, the results of the last 100 days are used to conduct error statistics and analysis. The time mean RMSE of […]^{7} m^{2} s^{−1}. Because of the overly large error of […]

[…] *a*, seven values of *a*, 125, 250, 500, 1000, 1500, 2000, and 2500 km, are used. The default relationship^{5} between *a* and {*a*^{(i)}, *i* = 1, …, *L*} as well as *L* is defined in Table 2, where *a*^{(i)} can be formulated as […] Thus, *a*^{(1)} and *a*^{(L)} are constantly set to 4000 km and *a*, respectively.
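As an illustration only, the sketch below generates one schedule of half-widths that satisfies the stated end points, with the first level at 4000 km and the last level at *a*. The halving rule between the end points is an assumption and is not taken from Table 2; it is chosen because it reproduces the two configurations quoted elsewhere in the text (five levels for the 250-km *a* value, and the two levels 4000 and 2500 km for the 2500-km *a* value). The function name is hypothetical.

```python
def msa_half_widths(a, a_max=4000.0):
    """Half-widths from a_max down to a, halving at each level (assumed rule)."""
    levels = []
    width = float(a_max)
    while width > a:
        levels.append(width)
        width /= 2.0
    levels.append(float(a))   # the last (smallest) level is always a itself
    return levels

# Usage: schedules for the seven values of a used in this study.
schedules = {a: msa_half_widths(a) for a in (125, 250, 500, 1000, 1500, 2000, 2500)}
```

Whatever rule is used, the schedule must decrease monotonically so that the analysis proceeds from longwave to shortwave.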

Table 2. The relationship between the half-width of GC localization (denoted as *a*, km) in the EnKF of the hybrid method and the half-widths of GC localization [denoted as *a*^{(i)}, km] in the MSA of the hybrid method in this study.

Additionally, trial-and-error tests show that assimilation experiments with the optimal static inflation factors lead to the same conclusions as those with no inflation. Therefore, to simplify the issues and separate the effect of variance inflation, we did not employ variance inflation in the data assimilation experiments in this study.

## 4. Results of the hybrid method

The dependences of the hybrid method on the GC half-width in EnKF and the number of scale levels in MSA are first investigated in this section. Then, the comparisons between the hybrid method and the standard EnKF as well as the EnKF–SCM method are conducted. Afterward, the sensitivity study of the hybrid algorithm with respect to observing system is performed while a simple analysis of the computational cost is presented at last.

### a. Dependence on the GC localization half-width

[…] *a* values in the EnKF of the hybrid method, the number of scale levels and the GC half-widths in MSA are listed in Table 2. The dashed triangle curve in Fig. 1 gives the time-averaged ensemble mean RMSE [Eq. (19)] of […] *a* takes various values in the hybrid method. Apparently, the optimal *a* for the hybrid method is about 1500 km. To further understand the performance of the hybrid method, we examine the time series of the RMSE and the spatial distribution of RMSE. Here the first RMSE is defined as the RMSE_{s} in Eq. (19) while the second RMSE for the (*i*, *j*)th grid is computed by […]

Figure 5 shows the time series of RMSE of […] *a* values: 125 (red curve), 500 (black curve), 2000 (blue curve), and 2500 km (green curve). As *a* increases from 125 to 2000 km, the RMSE is significantly reduced. For a 2500-km *a* value, the spinup period of the hybrid method is much longer than in the other cases. After the spinup period (here about 100 days), the RMSE of […] *a* value is the worst among the seven cases.

Figure 6 displays the spatial RMSEs of […] *a* values as 250 (Figs. 6a,b), 1000 (Figs. 6c,d), and 2500 km (Figs. 6e,f) in the standard EnKF (Figs. 6b,d,f) and the hybrid method (Figs. 6a,c,e). For a small *a* value (like 250 km) or a large *a* value (like 2500 km), compared to the standard EnKF, the hybrid method can greatly reduce the error of […] *a* value (like 1500 km), although the error in the SH produced by the hybrid method is a little larger than that generated by the standard EnKF, the difference is much less than that for GC half-width values of 250 and 2500 km.

[…]_{hybrid}), second is the RMSE of […]_{EnKF in hybrid}), and third is the difference between RMSE_{hybrid} and RMSE_{EnKF in hybrid} (i.e., RMSE_{hybrid} − RMSE_{EnKF in hybrid}). Here, the first two quantities are respectively calculated by […] and […] where the superscript “hybrid posterior” (“posterior of EnKF in hybrid”) represents the posterior field produced by the hybrid method (the EnKF in the hybrid method). This definition can directly examine the validity of the MSA. Obviously, a negative difference means that the MSA is valid. Figure 7 plots the time series of the difference for the 2500-km *a* value. Apparently, during 20–100 days, most of the differences are less than zero, demonstrating that the MSA in the hybrid method can continuously refine the quality of the analysis solution of the EnKF. After the spinup period of the hybrid method, the error of […]

### b. Dependence on the number of scale levels

To investigate the dependence of the hybrid method on the number of scale levels in MSA, we choose a relatively small value (here 250 km) of *a* in EnKF. The default five scale levels in MSA are first compressed to the first one, two, three, and four levels, and then extended to include 8000 km and (8000 and 16 000 km) levels to generate seven configurations of *a*. Thus, eight experiments in total are conducted, including the experiment with 0 levels in MSA, which reduces to the standard EnKF.

Figure 8a shows the sensitivity of the time mean RMSE [Eq. (19)] of […] *a* value cannot completely retrieve the observational signals whose spatial scales are within 250 km. Results of the time series of the RMSE [i.e., the RMSE_{s} in Eq. (19)] of […]

### c. Comparison with the standard EnKF

In this section, we first quantitatively compare the hybrid method with the standard EnKF. Then the error analysis of the time series and spatial distributions of RMSEs for the two methods is conducted. Note that the two RMSEs here are computed in the same way as those in Figs. 5 and 6.

From Fig. 1, compared with the standard EnKF (the solid circle curve), the hybrid method has a much weaker dependence on the cutoff distance. For relatively small or large *a* values, the time mean RMSE of […] *a* values (such as 1000 and 1500 km here), the hybrid algorithm is a little worse than the standard EnKF. From Fig. 2c, the signal in the SH implied in the truth residual [Eq. (20)] is much weaker than the standard deviation of observational error for the 1000-km *a* value. Thus, under this circumstance, the observational residual [Eq. (21)] is noise dominant. Although Fig. 2 is a snapshot result, further examinations draw the same conclusion. Without the background term in the cost function, it is very hard for MSA to extract useful information from the observational residuals.

[…] *a* value cases, the hybrid method reduces the error of […]^{6} m^{2} s^{−1} of the standard EnKF to 5.7 × 10^{5} m^{2} s^{−1}). On the other hand, if we simply define the sensitivity of the data assimilation scheme with respect to *a* as […] where *i* indexes different values of *a* and num equals 7 in this study, the sensitivities of the standard EnKF and the hybrid method are 1.2 × 10^{6} m^{2} s^{−1} and 1.2 × 10^{5} m^{2} s^{−1}, respectively. Thus, relative to the standard EnKF, the sensitivity of the hybrid method is reduced by 90%.

Figure 9 displays the time series of the RMSE of […] *a* values as 250 (Fig. 9a) and 2500 km (Fig. 9b) in the standard EnKF (black curve) and the hybrid method (blue curve). We discuss the advantages of the hybrid method relative to the standard EnKF from the following two aspects.

Then, we examine the SH^{6} results of the hybrid method at the first data assimilation cycle (i.e., 0600 on the 20th day). Figure 10 shows the SH results of […] *a* is set to 250 km. Since the cutoff distance is very small, the signals implied in the observational residuals are expected to be longwave dominant (Fig. 10b). Moreover, comparisons among Figs. 10a–c show that, on the one hand, observational residuals more or less contain true longwave information with strong signals; on the other hand, MSA can reasonably retrieve the multiple scale information from the observational residuals. The hybrid method can effectively reduce the error where the signal is strong; that is, the darker blue in Fig. 10d occurs in the same areas as the darker red or blue places in Fig. 10a. This also validates the correctness of the analysis at the end of section 2e.

To reflect the analysis process of MSA, we plot the results of MSA for the first one (Fig. 11a), the sum of first two (Fig. 11b), the sum of first three (Fig. 11c), and the sum of all five scale levels (Fig. 11d).^{7} It is obvious that MSA can sequentially extract the multiple scale information from longwave to shortwave. And the total analysis of MSA can reflect strong signals implied in the observational residuals.

For a large *a* value (such as 2500 km), even with two scale levels in MSA (see Table 2), the hybrid method can greatly reduce the error (see the blue and black curves in Fig. 9b) of model state after a long spinup period of data assimilation. According to the analysis of Fig. 7, the model state can be gradually refined through adding back the two longwave signals. Additionally, examinations show that the long spinup period of the hybrid method here is mainly caused by too few scale levels and an overly large localization factor. Therefore, the spinup periods are expected to be shortened through including some small scale levels into MSA.

### d. Improvement of the performance of the hybrid method for overly large *a* values

Motivated by the analysis in section 4c, we redesign the experiment of the hybrid method for the 2500-km *a* value. The two scale levels in MSA are replaced by five levels: 250, 500, 1000, 2000, and 4000 km. The green curve in Fig. 9b presents the time series of the RMSE of […]^{5} to 6.1 × 10^{5} m^{2} s^{−1}. Thus, the dependence of the hybrid method on *a* will be further reduced.

Here, we also check the performance of the redesigned experiment at the first data assimilation cycle (Fig. 12). Since the EnKF with an overly large *a* value contaminates the model state, the observational residuals may include information at various scales (Fig. 12b), which is different from the situation (Fig. 10b) with the 250-km *a* value. Therefore, the MSA should include some small-scale levels. Figure 12c shows that the MSA here can also extract the strong signals from the observational residuals (Fig. 12b) and subsequently restore the contaminated model state where the amplitudes of the observational residuals are relatively large (Fig. 12d).

It is worth mentioning that one may argue that MSA with only two levels (i.e., 2500 and 4000 km) in the original design of the hybrid method for 2500-km GC half-width can also trace the shortwave information of the model state (see Figs. 6e,f) and improve the analysis of the standard EnKF. The reason is that although MSA can only retrieve the longwave information from the observational residuals with large scales, it is also only valid in the places (not shown) where the signals are strong. On the one hand, when the MSA analysis is added to the EnKF ensemble mean analysis, some shortwave information can also be improved. On the other hand, the net effect at each data assimilation cycle during the spinup period of the hybrid method is that the RMSE [i.e. the RMSE_{s} in Eq. (19)] produced by the hybrid method is smaller than that produced by the standard EnKF. Because of the too few and too large-scale levels, MSA can only gradually refine the model state with a long spinup period (see the green curve in Figs. 5 and 7 and the blue curve in Fig. 9b).

Combining the above results with those for small *a* values suggests that, in practical applications, for an overly large *a* we should include some scales smaller than *a* in MSA; for an overly small *a* we should fix the smallest scale [i.e., the last scale *a*^{(L)}] to *a* and include larger *a*^{(i)} values in MSA.
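The scale-selection guidance above can be written as a small helper. This is only a sketch: the function name `msa_scale_levels`, the candidate list, and the `a_is_overly_large` flag are hypothetical and were not part of the experiments, and deciding whether a given *a* is overly large or overly small remains a matter of judgment.

```python
def msa_scale_levels(a_km, candidate_scales_km, a_is_overly_large):
    """Pick MSA scale levels (km), ordered from largest to smallest.

    Hypothetical helper illustrating the guidance in the text:
    - overly large a: use the candidates, which must contain scales smaller than a;
    - overly small a: keep only candidates larger than a and end with a itself.
    """
    candidates = sorted(set(candidate_scales_km), reverse=True)
    if a_is_overly_large:
        if not any(s < a_km for s in candidates):
            raise ValueError("need at least one scale smaller than a")
        return candidates
    # Overly small a: larger scales first, the last (smallest) level fixed to a.
    return [s for s in candidates if s > a_km] + [a_km]


candidates = [250, 500, 1000, 2000, 4000]  # assumed candidate scales (km)
print(msa_scale_levels(2500, candidates, a_is_overly_large=True))
print(msa_scale_levels(125, candidates, a_is_overly_large=False))
```

With these assumed candidates, the first call returns the five levels used in the redesigned 2500-km experiment, and the second appends 125 km as the last level.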

### e. Comparison with the EnKF–SCM method

As analyzed at the end of section 2e, without the background term, the overall impact of MSA is to move the analysis ensemble mean of EnKF closer to the observations than the raw covariances would. When the smoothing term is dropped from the cost function of MSA, the EnKF–MSA hybrid method reduces to the EnKF–SCM method. Since the EnKF–MSA method was improved in the last section, it is worthwhile to compare the performance of these two hybrid methods. Because of limited space, we analyze only the results of the experiments with 250- and 2500-km *a* values here.

The red curve in Fig. 9 shows the time series of the RMSE of the atmospheric streamfunction with 250- (Fig. 9a) and 2500-km (Fig. 9b) *a* values for the EnKF–SCM scheme. For extreme *a* values, even without the smoothing term, the hybrid method can still gradually refine the analysis ensemble mean produced by EnKF, especially for an overly large *a* value. The reason is that the truth residuals contain signals stronger than the standard deviation of observational error (Figs. 2a,d). Under this precondition, even when the model state at the observed model grids is successively corrected from the EnKF analysis toward the observation, the signal-to-noise ratio of the adjustment is high. For the sparse observing system, the net effect may be an improvement of the model state. When the smoothing term is introduced, the assimilation quality is greatly enhanced further, because the local small-scale noise (such as “bull’s-eyes”) is filtered and the signal extracted from the observational residuals is more deterministic. Thus, the EnKF–MSA method may benefit in practice from the smoothing term, although the smoothing operator here is relatively simple.
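The role of the smoothing term can be illustrated with a one-dimensional toy problem. The sketch below is not the MSA cost function of this study: the quadratic form, the second-difference smoothing operator, the single scale level, and all names and parameter values are assumptions chosen only to show how a fit to residuals at observed grid points produces isolated spikes unless a smoothness penalty is added.

```python
import numpy as np


def analyze_residual(n, obs_idx, residual, obs_err_std, smooth_weight):
    """Minimize 0.5*|Hx - d|^2 / sigma^2 + 0.5*w*|Dx|^2 on a 1D grid of n points.

    H samples the grid at obs_idx; D is a second-difference operator
    (an assumed, deliberately simple smoothing operator).
    """
    H = np.zeros((len(obs_idx), n))
    H[np.arange(len(obs_idx)), obs_idx] = 1.0
    D = np.diff(np.eye(n), n=2, axis=0)  # shape (n-2, n)
    A = H.T @ H / obs_err_std**2 + smooth_weight * (D.T @ D)
    b = H.T @ np.asarray(residual, dtype=float) / obs_err_std**2
    # lstsq returns the minimum-norm solution when A is singular (w = 0).
    return np.linalg.lstsq(A, b, rcond=None)[0]


def roughness(x):
    """Sum of squared second differences of the analyzed increment."""
    return float(np.sum(np.diff(x, n=2) ** 2))


obs_idx = [5, 15, 25, 35]
residual = [1.0, 0.8, 1.2, 0.9]
no_smoothing = analyze_residual(40, obs_idx, residual, 0.5, smooth_weight=0.0)
with_smoothing = analyze_residual(40, obs_idx, residual, 0.5, smooth_weight=10.0)
print(roughness(no_smoothing), roughness(with_smoothing))
```

Without the penalty, the increment equals the residual at the observed points and is zero elsewhere, which is the bull's-eye pattern; with the penalty, the increment is spread smoothly between observations.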

### f. Dependence on observing systems

As the foregoing analysis shows, the quality of MSA in the proposed hybrid method is sensitive to the signal-to-noise ratio of the observational residual. Because of the complex observing network in the real world, the signal-to-noise ratio is highly geographically dependent. Therefore, the assimilation quality of the new method must be sensitive to the observing system. Although the observing system in this study is highly simplified, we can still conceptually evaluate the dependence of the hybrid method on the observing system and determine whether MSA in the hybrid method should be applied to the dense observing system for extreme *a* values.

Taking 125 and 2500 km as examples of extreme *a* values, we apply MSA in the hybrid method to global observations. Comparing the results here with those in section 4a for 125 km and those in section 4d for 2500 km answers the above question. Figure 13 shows the time series of RMSE [see the RMSE_{s} in Eq. (19) for the definition] of the streamfunction for 125- (Fig. 13a) and 2500-km (Fig. 13b) *a* values, where the blue (black) curve represents the results of the hybrid method without (with) MSA applied to NH observations. Note that the blue curve in Fig. 13b is the same as the green curve in Fig. 9b. For an overly small *a* value, applying MSA to the densely observed model grids degrades the quality of the model state. The reason is that signals in the truth residual [defined in Eq. (20)] in the NH are weaker than the standard deviation of observational error (see Fig. 2a), so it is difficult for MSA to correctly retrieve useful information from the observational residual. Thus, caution should be taken when the new scheme is applied to dense observations with an overly small cutoff distance. For an overly large *a* value, however, when MSA is applied to observations in the NH, the model state is further refined compared to the results of the redesigned experiment in section 4d. This can also be explained by Fig. 2d, which indicates that the truth residuals in the NH contain some signals stronger than the standard deviation of observational error. Note that the oscillations in the black curves in Fig. 13 are caused by the inconsistency between the analysis of MSA and the model dynamics. The analysis process of MSA in this study contains no model dynamics, which may cause shocks between the analysis of MSA and the analysis of EnKF. Figure 14, which plots the spatial RMSEs [see Eq. (33) for the definition] of the streamfunction for the two *a*-value experiments, also supports the above conclusions.

### g. Analysis of the computational cost

Following the description of the hybrid algorithm in section 2e, we roughly analyze the computational cost of the hybrid method here. Since the MSA updates only the ensemble mean, the additional computational cost relative to the standard EnKF comes entirely from the MSA.

Taking a 500-km *a* value as an example, Fig. 15 shows the variation of the normalized values of the MSA cost function in the hybrid method with respect to iteration steps for the first (solid dot), second (hollow dot), third (solid diamond), and fourth (hollow diamond) scale levels. Here, the normalization factor is the value of the cost function at the first iteration step. The cost function converges fastest for the first scale level. For all cases, the largest number of iterations required for the cost function to converge is about 10. Under this precondition, the time consumed by MSA is very small relative to that consumed by the standard EnKF.
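The normalization used for Fig. 15 can be sketched in a few lines. The convergence test below (change in normalized cost smaller than a tolerance) and the names `normalized_costs`, `iterations_to_converge`, and `tol` are assumptions for illustration; the stopping criterion actually used by the minimizer is not restated here.

```python
def normalized_costs(costs):
    """Divide each cost value by the value at the first iteration step."""
    first = costs[0]
    if first == 0:
        raise ValueError("cost at the first iteration step must be nonzero")
    return [c / first for c in costs]


def iterations_to_converge(costs, tol=1e-3):
    """Return the 1-based step at which the normalized cost first changes by
    less than tol relative to the previous step, or None if it never does.
    The tolerance-based criterion is an assumed, illustrative choice."""
    norm = normalized_costs(costs)
    for k in range(1, len(norm)):
        if abs(norm[k] - norm[k - 1]) < tol:
            return k + 1
    return None


costs = [8.0, 4.0, 2.0, 2.0]  # made-up cost values for one scale level
print(normalized_costs(costs))
print(iterations_to_converge(costs))
```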

## 5. Impact of the compensatory scheme on weather forecast

*a* values. Note that here the analysis results of the hybrid method with the 2500-km *a* value are produced by the redesigned experiment in section 4d. The 20 forecast cases are integrated up to 60 days for the two assimilation methods. The global anomaly correlation coefficient (ACC) and RMSE of the forecasted ensemble mean are used to evaluate the pattern correlation and amplitude error relative to the truth. The formulas of these two quantities for the *s*th lead time are and respectively. Here *R* equals 20 and *r* (*i* and *j*) indexes the forecast case (model grid). The superscripts “*f*” and “*t*” represent the forecasted and truth quantities. Here (*i*, *j*)th grid point for the assimilation (truth) model. Here (*i*, *j*)th grid point, denoted by

Figure 16 shows the variations of ACC (Figs. 16a,c) and RMSE (Figs. 16b,d) with the lead time of the forecasted ensemble means for the 125- and 2500-km *a* values, for the hybrid method (blue curve) and the standard EnKF (black curve). For the 125-km *a* value case, although the advantage is not so evident, the hybrid method maintains higher ACCs and smaller RMSEs than the standard EnKF during the first 4 days of lead time. That is, the proposed approach can modestly enhance the short-term weather forecast skill for an overly small cutoff distance. For the 2500-km *a* value case, the hybrid method greatly increases the short-term weather forecast skill relative to the standard EnKF (see the blue and black curves in Figs. 16c,d). If an ad hoc value of 0.6 ACC is used to evaluate the valid time scale of forecast (Hollingsworth et al. 1980), the hybrid method extends the valid weather forecast time scale of the standard EnKF by about 10 days. In addition, the results in Fig. 1 confirm that the improvement from the compensatory method relative to the standard EnKF for an overly large localization factor is much larger than that for an overly small localization factor. Therefore, the forecast results are consistent with the analysis results.
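For readers who wish to compute similar skill scores, the following sketch uses the common textbook forms of the ACC and RMSE together with the 0.6 ACC threshold. It is an assumption-laden illustration rather than the exact formulas of this study: the sums are unweighted (no area weighting), the array shapes are chosen for convenience, and the function names are hypothetical.

```python
import numpy as np


def acc_and_rmse(forecast, truth, clim_forecast, clim_truth):
    """ACC and RMSE at one lead time.

    forecast, truth: arrays of shape (R, ny, nx) for R forecast cases.
    clim_forecast, clim_truth: (ny, nx) climatologies of the assimilation
    and truth models. Sums are unweighted, which is an assumption.
    """
    fa = forecast - clim_forecast  # forecast anomalies
    ta = truth - clim_truth        # truth anomalies
    acc = np.sum(fa * ta) / np.sqrt(np.sum(fa**2) * np.sum(ta**2))
    rmse = np.sqrt(np.mean((forecast - truth) ** 2))
    return float(acc), float(rmse)


def valid_lead_steps(acc_by_lead, threshold=0.6):
    """Number of leading lead-time steps whose ACC stays at or above threshold."""
    steps = 0
    for value in acc_by_lead:
        if value < threshold:
            break
        steps += 1
    return steps


rng = np.random.default_rng(0)
truth = rng.normal(size=(20, 4, 8))          # synthetic fields, 20 cases
forecast = truth + 0.1 * rng.normal(size=truth.shape)
clim = np.zeros((4, 8))
print(acc_and_rmse(forecast, truth, clim, clim))
print(valid_lead_steps([0.95, 0.8, 0.61, 0.59, 0.7]))
```

Counting only the leading steps above the threshold means that a later recovery of the ACC, as in the last made-up series, does not extend the valid forecast time.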

## 6. Summary and discussion

Covariance localization was initially introduced to enhance the reliability of ensemble-evaluated background error covariance by reducing the long-distance spurious correlations resulting from the sampling errors of a finite ensemble. Although fixed covariance localization can greatly improve the analysis quality, it has significant limitations: a small cutoff distance provides insufficient longwave information, a large cutoff distance contaminates the analysis states, and tuning an optimal cutoff distance is always difficult. Under these circumstances, we develop an EnKF and multiple-scale analysis (MSA) hybrid method to overcome these limitations and improve the performance of the standard EnKF. At each analysis step, after the standard EnKF is done, the MSA is used to extract multiscale information from the observational residual (the difference between observations and interpolated analysis ensemble means produced by EnKF). The performance of the proposed method is examined within a biased twin-experiment framework based on a global barotropic spectral model and an idealized observing system. Results show that the hybrid method is superior to a standard EnKF for overly small or large cutoff distances and that it depends less on the cutoff distance. Consistently, the compensatory scheme can enhance the short-term weather forecast skill, especially for an overly large cutoff distance. In addition, sensitivity studies with respect to observing systems show that caution should be taken when the new scheme is applied to dense observations with an overly small cutoff distance. Also, the new method has a computational cost nearly equivalent to that of the standard EnKF, and thus it is suitable for GCM applications.

Although the compensatory approach presented in this study is promising, many challenges remain before it can be applied to real weather and climate models for state estimation and prediction initialization.

First, the MSA method in this study is actually a spatial smoothing of observational residuals. Given the observing network of Fig. 4 in this study, which provides rather homogeneous and sufficiently dense coverage, the spatial averaging is robust and informative. Additional experiments (not shown) that assume a sparser observing system than that used in this study have also demonstrated the superiority of the hybrid method over the standard EnKF. However, real observing systems are complex: they are highly heterogeneous, very sparse, and irregular, such as the ocean in the pre-Argo era or even the atmosphere in the presatellite era. Applying multiple-scale analysis to assimilate instrumental measurements into a realistic atmospheric, oceanic, or atmosphere–ocean coupled general circulation model should therefore be examined to identify the problems and seek solutions.

Second, the compensatory approach should be compared with adaptive localization models to increase our understanding of the multiple-scale analysis.

Third, the results in this study show that MSA can improve the accuracy of the ensemble mean for an overly small or an overly large localization factor. Under this circumstance, the analysis ensemble perturbations should be smaller in response to the more accurate analysis of the ensemble mean. However, the proposed hybrid method currently does not change the ensemble perturbations in response to the corrections made by MSA. To remove this inconsistency, the hybrid method should be further improved. Indeed, extra experiments that add variance inflation to the hybrid method yield larger errors of the model state than those without inflation, and the optimal inflation factors are less than 1.0, which further indicates that squeezing the ensemble perturbations may be more important than variance inflation for the current version of the hybrid method. Additionally, the results of experiments with optimal inflation factors show increasing tendencies similar to those in Figs. 5, 8b, 9, and 13, demonstrating that the rising trends of the RMSEs in the hybrid method in this study are not caused by the absence of variance inflation.
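To make the notion of an inflation factor below 1.0 concrete, the sketch below rescales ensemble perturbations about the ensemble mean. It is a generic multiplicative scheme with assumed conventions: the factor multiplies the perturbations (so the variance scales with its square), the array layout is (members, state size), and the function name is hypothetical. How the perturbations should be adjusted in the hybrid method is left open above.

```python
import numpy as np


def rescale_perturbations(ensemble, factor):
    """Scale ensemble perturbations about the ensemble mean.

    ensemble: array of shape (members, state_size).
    factor > 1 inflates the spread; factor < 1 squeezes it.
    The ensemble mean is left unchanged.
    """
    mean = ensemble.mean(axis=0, keepdims=True)
    return mean + factor * (ensemble - mean)


rng = np.random.default_rng(0)
ensemble = rng.normal(size=(20, 5))          # synthetic 20-member ensemble
squeezed = rescale_perturbations(ensemble, 0.9)
print(ensemble.var(axis=0, ddof=1))
print(squeezed.var(axis=0, ddof=1))
```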

Fourth, when the forward operator is more complex than interpolation (e.g., for satellite radiances), the validity of MSA should also be examined.

Last, since the optimal localization scale increases with ensemble size, the longwave information loss caused by localization is reduced for larger ensembles. Whether the hybrid method can still outperform the EnKF with large ensemble sizes for extreme localization factors should also be investigated.

## Acknowledgments

The authors thank three anonymous reviewers for their thorough and helpful suggestions on the earlier version of this manuscript. This research is cosponsored by grants from the National Natural Science Foundation (Grants 41030854, 41306006, 41376015, 41376013, 41106005, 41176003, and 41206178).

## REFERENCES

Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. *Mon. Wea. Rev.*, **129**, 2884–2903, doi:10.1175/1520-0493(2001)129<2884:AEAKFF>2.0.CO;2.

Anderson, J. L., 2003: A local least squares framework for ensemble filtering. *Mon. Wea. Rev.*, **131**, 634–642, doi:10.1175/1520-0493(2003)131<0634:ALLSFF>2.0.CO;2.

Anderson, J. L., 2007a: Exploring the need for localization in ensemble data assimilation using a hierarchical ensemble filter. *Physica D*, **230**, 99–111, doi:10.1016/j.physd.2006.02.011.

Anderson, J. L., 2007b: An adaptive covariance inflation error correction algorithm for ensemble filters. *Tellus*, **59A**, 210–224, doi:10.1111/j.1600-0870.2006.00216.x.

Anderson, J. L., 2009: Spatially and temporally varying adaptive covariance inflation for ensemble filters. *Tellus*, **61A**, 72–83, doi:10.1111/j.1600-0870.2008.00361.x.

Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. *Mon. Wea. Rev.*, **127**, 2741–2758, doi:10.1175/1520-0493(1999)127<2741:AMCIOT>2.0.CO;2.

Anderson, J. L., and L. L. Lei, 2013: Empirical localization of observation impact in ensemble Kalman filters. *Mon. Wea. Rev.*, **141**, 4140–4153, doi:10.1175/MWR-D-12-00330.1.

Asselin, R., 1972: Frequency filter for time integrations. *Mon. Wea. Rev.*, **100**, 487–490, doi:10.1175/1520-0493(1972)100<0487:FFFTI>2.3.CO;2.

Berre, L., and G. Desroziers, 2010: Filtering of background error variance and correlations by local spatial averaging: A review. *Mon. Wea. Rev.*, **138**, 3693–3720, doi:10.1175/2010MWR3111.1.

Bishop, C. H., and D. Hodyss, 2007: Flow adaptive moderation of spurious ensemble correlations and its use in ensemble-based data assimilation. *Quart. J. Roy. Meteor. Soc.*, **133**, 2029–2044, doi:10.1002/qj.169.

Bishop, C. H., and D. Hodyss, 2009a: Ensemble covariances adaptively localized with ECO-RAP. Part 1: Tests on simple error models. *Tellus*, **61A**, 84–96, doi:10.1111/j.1600-0870.2008.00371.x.

Bishop, C. H., and D. Hodyss, 2009b: Ensemble covariances adaptively localized with ECO-RAP. Part 2: A strategy for the atmosphere. *Tellus*, **61A**, 97–111, doi:10.1111/j.1600-0870.2008.00372.x.

Bishop, C. H., and D. Hodyss, 2011: Adaptive ensemble covariance localization in ensemble 4D-Var state estimation. *Mon. Wea. Rev.*, **139**, 1241–1255, doi:10.1175/2010MWR3403.1.

Bishop, C. H., D. Hodyss, P. Steinle, H. Sims, A. M. Clayton, A. C. Lorenc, D. M. Barker, and M. Buehner, 2011: Efficient ensemble covariance localization in variational data assimilation. *Mon. Wea. Rev.*, **139**, 573–580, doi:10.1175/2010MWR3405.1.

Bratseth, A. M., 1986: Statistical interpolation by means of successive corrections. *Tellus*, **38A**, 439–447, doi:10.1111/j.1600-0870.1986.tb00476.x.

Briggs, W. L., V. E. Henson, and S. F. McCormick, 2000: *A Multigrid Tutorial.* 2nd ed. Society for Industrial and Applied Mathematics, 193 pp.

Buehner, M., 2012: Evaluation of a spatial/spectral covariance localization approach for atmospheric data assimilation. *Mon. Wea. Rev.*, **140**, 617–636, doi:10.1175/MWR-D-10-05052.1.

Deckmyn, A., and L. Berre, 2005: A wavelet approach to representing background error covariances in a limited-area model. *Mon. Wea. Rev.*, **133**, 1279–1294, doi:10.1175/MWR2929.1.

Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. *J. Geophys. Res.*, **99**, 10 143–10 162, doi:10.1029/94JC00572.

Evensen, G., 2007: *Data Assimilation: The Ensemble Kalman Filter.* Springer Press, 187 pp.

Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. *Quart. J. Roy. Meteor. Soc.*, **125**, 723–757, doi:10.1002/qj.49712555417.

Greybush, S. J., E. Kalnay, T. Miyoshi, K. Ide, and B. R. Hunt, 2011: Balance and ensemble Kalman filter localization techniques. *Mon. Wea. Rev.*, **139**, 511–522, doi:10.1175/2010MWR3328.1.

Haltiner, G. J., and R. T. Williams, 1980: *Numerical Prediction and Dynamic Meteorology.* 2nd ed. Wiley, 477 pp.

Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter-3D variational analysis scheme. *Mon. Wea. Rev.*, **128**, 2905–2919, doi:10.1175/1520-0493(2000)128<2905:AHEKFV>2.0.CO;2.

Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. *Mon. Wea. Rev.*, **129**, 2776–2790, doi:10.1175/1520-0493(2001)129<2776:DDFOBE>2.0.CO;2.

Hollingsworth, A., K. Arpe, M. Tiedtke, M. Capaldo, and H. Savijärvi, 1980: The performance of a medium-range forecast model in winter—Impact of physical parameterizations. *Mon. Wea. Rev.*, **108**, 1736–1773, doi:10.1175/1520-0493(1980)108<1736:TPOAMR>2.0.CO;2.

Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. *Mon. Wea. Rev.*, **126**, 796–811, doi:10.1175/1520-0493(1998)126<0796:DAUAEK>2.0.CO;2.

Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. *Mon. Wea. Rev.*, **129**, 123–137, doi:10.1175/1520-0493(2001)129<0123:ASEKFF>2.0.CO;2.

Houtekamer, P. L., H. L. Mitchell, and X. Deng, 2009: Model error representation in an operational ensemble Kalman filter. *Mon. Wea. Rev.*, **137**, 2126–2143, doi:10.1175/2008MWR2737.1.

Hunt, B. R., E. J. Kostelich, and I. Szunyogh, 2007: Efficient data assimilation for spatiotemporal chaos: A local ensemble transform Kalman filter. *Physica D*, **230**, 112–126, doi:10.1016/j.physd.2006.11.008.

Jazwinski, A. H., 1970: *Stochastic Processes and Filtering Theory.* Academic Press, 376 pp.

Jun, M., I. Szunyogh, M. G. Genton, F. Zhang, and C. H. Bishop, 2011: A statistical investigation of the sensitivity of ensemble-based Kalman filters to covariance filtering. *Mon. Wea. Rev.*, **139**, 3036–3051, doi:10.1175/2011MWR3577.1.

Kondo, K., T. Miyoshi, and H. L. Tanaka, 2013: Parameter sensitivities of the dual-localization approach in the local ensemble transform Kalman filter. *SOLA*, **9**, 174–178, doi:10.2151/sola.2013-039.

Lei, L. L., and J. L. Anderson, 2014: Empirical localization of observations for serial ensemble Kalman filter data assimilation in an atmospheric general circulation model. *Mon. Wea. Rev.*, **142**, 1835–1851, doi:10.1175/MWR-D-13-00288.1.

Li, H., E. Kalnay, and T. Miyoshi, 2009: Simultaneous estimation of covariance inflation and observation errors within an ensemble Kalman filter. *Quart. J. Roy. Meteor. Soc.*, **135**, 523–533, doi:10.1002/qj.371.

Li, W., Y. Xie, Z. He, G. Han, K. Liu, J. Ma, and D. Li, 2008: Application of the multigrid data assimilation scheme to the China Seas’ temperature forecast. *J. Atmos. Oceanic Technol.*, **25**, 2106–2116, doi:10.1175/2008JTECHO510.1.

Li, W., Y. Xie, S.-M. Deng, and Q. Wang, 2010: Application of the multigrid method to the two-dimensional Doppler radar radial velocity data assimilation. *J. Atmos. Oceanic Technol.*, **27**, 319–332, doi:10.1175/2009JTECHA1271.1.

Liu, D. C., and J. Nocedal, 1989: On the limited memory BFGS method for large scale optimization. *Math. Program.*, **45**, 503–528, doi:10.1007/BF01589116.

Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP—A comparison with 4D-VAR. *Quart. J. Roy. Meteor. Soc.*, **129**, 3183–3203, doi:10.1256/qj.02.132.

Mitchell, H. L., P. L. Houtekamer, and G. Pellerin, 2002: Ensemble size, balance, and model-error representation in an ensemble Kalman filter. *Mon. Wea. Rev.*, **130**, 2791–2808, doi:10.1175/1520-0493(2002)130<2791:ESBAME>2.0.CO;2.

Miyoshi, T., 2011: The Gaussian approach to adaptive covariance inflation and its implementation with the local ensemble transform Kalman filter. *Mon. Wea. Rev.*, **139**, 1519–1535, doi:10.1175/2010MWR3570.1.

Miyoshi, T., and K. Kondo, 2013: A multi-scale localization approach to an ensemble Kalman filter. *SOLA*, **9**, 170–173, doi:10.2151/sola.2013-038.

Ott, E., and Coauthors, 2004: A local ensemble Kalman filter for atmospheric data assimilation. *Tellus*, **56A**, 415–428, doi:10.1111/j.1600-0870.2004.00076.x.

Pannekoucke, O., 2009: Heterogeneous correlation modeling based on the wavelet diagonal assumption and on the diffusion operator. *Mon. Wea. Rev.*, **137**, 2995–3012, doi:10.1175/2009MWR2783.1.

Pannekoucke, O., and S. Massart, 2008: Estimation of the local diffusion tensor and normalization for heterogeneous correlation modeling using a diffusion equation. *Quart. J. Roy. Meteor. Soc.*, **134**, 1425–1438, doi:10.1002/qj.288.

Pannekoucke, O., L. Berre, and G. Desroziers, 2007: Filtering properties of wavelets for local background error correlations. *Quart. J. Roy. Meteor. Soc.*, **133**, 363–379, doi:10.1002/qj.33.

Purser, R. J., W. Wu, D. F. Parrish, and N. M. Roberts, 2003: Numerical aspects of the application of recursive filters to variational statistical analysis. Part II: Spatially inhomogeneous and anisotropic general covariances. *Mon. Wea. Rev.*, **131**, 1536–1548, doi:10.1175/2543.1.

Rainwater, S., and B. Hunt, 2013: Mixed-resolution ensemble data assimilation. *Mon. Wea. Rev.*, **141**, 3007–3021, doi:10.1175/MWR-D-12-00234.1.

Robert, A., 1969: The integration of a spectral model of the atmosphere by the implicit method. *Proc. WMO/IUGG Symp. on NWP,* Tokyo, Japan, Japan Meteorological Society, 19–24.

Szunyogh, I., E. J. Kostelich, G. Gyarmati, D. J. Patil, B. R. Hunt, E. Kalnay, E. Ott, and J. A. Yorke, 2005: Assessing a local ensemble Kalman filter: Perfect model experiments with the National Centers for Environmental Prediction global model. *Tellus*, **57A**, 528–545, doi:10.1111/j.1600-0870.2005.00136.x.

Szunyogh, I., E. J. Kostelich, G. Gyarmati, E. Kalnay, B. R. Hunt, E. Ott, E. Satterfield, and J. A. Yorke, 2008: A local ensemble transform Kalman filter data assimilation system for the NCEP global model. *Tellus*, **60A**, 113–130, doi:10.1111/j.1600-0870.2007.00274.x.

Weaver, A. T., and P. Courtier, 2001: Correlation modeling on a sphere using a generalized diffusion equation. *Quart. J. Roy. Meteor. Soc.*, **127**, 1815–1846, doi:10.1002/qj.49712757518.

Weaver, A. T., and I. Mirouze, 2013: On the diffusion equation and its application to isotropic and anisotropic correlation modeling in variational assimilation. *Quart. J. Roy. Meteor. Soc.*, **139**, 242–260, doi:10.1002/qj.1955.

Whitaker, J. S., T. M. Hamill, X. Wei, Y. Song, and Z. Toth, 2008: Ensemble data assimilation with the NCEP Global Forecast System. *Mon. Wea. Rev.*, **136**, 463–482, doi:10.1175/2007MWR2018.1.

Wu, W.-S., R. J. Purser, and D. F. Parrish, 2002: Three-dimensional variational analysis with spatially inhomogeneous covariances. *Mon. Wea. Rev.*, **130**, 2905–2916, doi:10.1175/1520-0493(2002)130<2905:TDVAWS>2.0.CO;2.

Xie, Y., S. Koch, J. McGinley, S. Albers, P. E. Bieringer, M. Wolfson, and M. Chan, 2011: A space–time multiscale analysis system: A sequential variational analysis approach. *Mon. Wea. Rev.*, **139**, 1224–1240, doi:10.1175/2010MWR3338.1.

Yaremchuk, M., and D. Nechaev, 2013: Covariance localization with the diffusion-based correlation models. *Mon. Wea. Rev.*, **141**, 848–860, doi:10.1175/MWR-D-12-00089.1.

Zhang, F., C. Snyder, and J. Sun, 2004: Impacts of initial estimate and observation availability on convective-scale data assimilation with ensemble Kalman filter. *Mon. Wea. Rev.*, **132**, 1238–1253, doi:10.1175/1520-0493(2004)132<1238:IOIEAO>2.0.CO;2.

Zhang, F., Y. Weng, J. Sippel, and C. Bishop, 2009: Cloud-resolving hurricane initialization and prediction through assimilation of Doppler radar observations with an ensemble Kalman filter. *Mon. Wea. Rev.*, **137**, 2105–2125, doi:10.1175/2009MWR2645.1.

Zhang, S., J. L. Anderson, A. Rosati, M. J. Harrison, S. P. Khare, and A. Wittenberg, 2004: Multiple time level adjustment for data assimilation. *Tellus*, **56A**, 2–15, doi:10.1111/j.1600-0870.2004.00040.x.

Zhu, J., F. Zheng, and X.-C. Li, 2011: A new localization implementation scheme for ensemble data assimilation for non-local observations. *Tellus*, **63A**, 244–255, doi:10.1111/j.1600-0870.2010.00486.x.

^{1} Note that the prior ensemble of the observation *y*^{o} is usually projected by the background ensemble of model state through the linearization of the observation operator *h*.

^{2} Note that here the linear relationship is actually the local linearization of the nonlinear observation operator *h* through a least squares regression between the observation and the model state to be adjusted.

^{4} Note the matrices ^{(l)}, ^{(l)}, ^{(l)}, and ^{(l)} will be specified in section 2e.

^{6} Since the MSA is only applied in the SH, the results there are shown.