## 1. Introduction

The spatial resolutions of numerical atmospheric and oceanic circulation models have steadily increased over the past decades. Horizontal grid spacing down to the order of 1 km is now often used for regional models. These fine-resolution models thus encompass a wide range of temporal and spatial scales. In contrast, the formulation of data assimilation algorithms has remained essentially unchanged in many fundamental aspects, although a variety of parameters have been reestimated and tuned in response to the increased resolutions. A recasting of the basic data assimilation formulations to accommodate fine-resolution models is the focus of this study.

We argue that the current formulation of data assimilation is inherently ineffective when applied to fine-resolution models. The ineffectiveness arises from its filtering properties. The current formulation, referred to as basic data assimilation for convenience later on, is based on a minimum error variance solution or a maximum likelihood estimation, known as an optimal estimation (e.g., Lorenc 1986; Cohn 1997). The optimal estimation hinges on the error covariance associated with the background fields, known as the background error covariance. The background error covariance is a statistical quantity by definition in the ensemble sense, and large-scale components can be dominant even in fine-resolution models (e.g., Berre 2000). In comparison, the small-scale components account for only a small portion of the total covariance, and intermittently occurring, but energetic, small-scale components cannot be adequately represented.

One consequence of the underrepresentation of small-scale components in the background error covariance is a large decorrelation scale. The decorrelation scale, which is also known as a correlation length scale, is defined as the spatial distance over which the correlation decreases from 1 to 1/*e* (*e* is the mathematical constant that is the base of the natural logarithm), that is, the Daley correlation scale (Daley 1991). In the implementation of basic data assimilation, the background error covariance is generally characterized by a single spatial decorrelation scale (e.g., Gaspari and Cohn 1999). The decorrelation scale is the parameter that dictates the filtering effect of the data assimilation scheme. A large decorrelation scale imposes strong filtering on small scales (Daley 1991, also see section 2b). We infer that it is this filtering effect that led decorrelation scales to be empirically reduced in a number of recent studies aiming at effectively assimilating high-resolution observations, such as radar measurements, into high-resolution models (e.g., Xie et al. 2011; Zhang et al. 2009, 2011). Further, these studies demonstrated that a sequence of data assimilation should be applied for a set of decreasing decorrelation length scales. We present here a framework for using a sequence of decorrelation length scales, dubbed a multiscale data assimilation (MS-DA) scheme.

To mitigate the above-mentioned ineffectiveness, the essential strategy of the MS-DA scheme is to untangle distinct spatial scales. The basic data assimilation scheme seeks to minimize a cost function to obtain the optimal estimate (e.g., Lorenc 1986). To untangle the spatial scales, we decompose the cost function for distinct spatial scales. Accordingly, the background error covariance is decomposed, and the background error covariance is estimated for the distinct spatial scales. The data assimilation scheme with the decomposed cost function, hence, allows explicit incorporation of multiple decorrelation length scales in the background error covariances. The data assimilation problem is then solved sequentially from large to small scales.

The MS-DA scheme is also formulated to more effectively assimilate observations with different properties, in particular, high-resolution observations into high-resolution models. Observations of high resolution are increasingly available through advancements in satellite remote sensing and radar technologies. These observations are often localized, clustered, or patchy. Effectively assimilating such localized and patchy high-resolution observations using the basic data assimilation algorithm remains a challenge (Toth et al. 2014). Assimilation of such observations can become even more complicated when they are assimilated along with sparse conventional observations. In basic data assimilation, the single spatial decorrelation scale is often a mean decorrelation scale estimated using observations (Hollingsworth and Lonnberg 1986; Lonnberg and Hollingsworth 1986), deterministic model data [known as the National Meteorological Center (NMC) method; Parrish and Derber 1992], or ensemble-based model data. The use of multi-decorrelation-scale background error covariances can reduce the filtering effects on fine structures present in high-resolution observations and enhance a spatial spreading of the observational increments from sparse observations, which are conflicting objectives for a single-decorrelation-scale based scheme. The MS-DA scheme is thus effective in assimilating observations of disparate resolutions.

Further, the decomposition of the cost function outlined above provides a pathway for mitigating the effect of scale aliasing. Scale aliasing is the misrepresentation of small-scale waves as large-scale waves (e.g., Daley 1991). It is a classic problem in data analysis, but has barely been addressed so far in data assimilation. The decomposed cost functions highlight properties of scale aliasing in data assimilation. The aliasing may occur in such a way that the small-scale component is misrepresented as a large-scale component, but it is also possible for the large-scale components to impact the small-scale components in the analysis, even if there is no background error correlation between the two. The decomposed cost functions also reveal that the aliasing is associated with inherent additional representativeness errors in the basic data assimilation scheme. The additional representativeness errors are referred to as multiscale representativeness errors and will be described in section 3. When high-resolution observations are assimilated, the effects of scale aliasing and the inherent additional representativeness errors can be mitigated in MS-DA by assimilating observations that are decomposed appropriately.

In this paper, we will describe the MS-DA scheme in detail and use analytical and numerical solutions from a one-dimensional problem to elucidate its general and specific properties. The outline of this paper is as follows: section 2 presents a brief description of the formulation of basic data assimilation and its filtering properties and section 3 derives the MS-DA formulation and elucidates its algorithmic characteristics. In section 4, the configurations of the one-dimensional experiments are described and the randomization of the parameters for the statistical analyses is discussed. Section 5 presents a set of data assimilation experiments used to elucidate the performance of basic schemes and MS-DA using different observational scenarios, including complete, patchy, and mixed sparse and patchy high-resolution observations. Finally, a summary and discussion are given in section 6.

## 2. Basic data assimilation formulation and filtering properties

To proceed, we first describe the basic data assimilation scheme. Emphasis is placed on the description of representativeness errors and the filtering characteristics that are closely related to the properties of the MS-DA scheme.

### a. Basic data assimilation scheme and representativeness error

*N*-dimensional vector, known as the incremental state variable, which is defined as

*t* indicates the unknown true state. The

*M*-dimensional vector

### b. Filtering property

In a one-dimensional (1D) problem with *N*-grid points, we can assume that there are observations at every grid point, that is, complete observations. In this case, we have *m*. With the transform *N*,

The spectral form in (14) illuminates the filtering property, that is, the smaller scales are strongly filtered and thus the data assimilation acts as a low-pass filter. This filtering property arises from the fact that *m*. For example, a Gaussian function is often a good approximation to represent a correlation function. For a Gaussian function, *m* and becomes virtually zero for the components with wavelengths smaller than twice the decorrelation scale. For white noise observation errors, *m* and the magnitude of *m* and

With white noise observation errors, another property revealed by (14) is that the analysis spectral coefficients are independent, and each spectral coefficient is optimally estimated. Thus, the filtering nature is intertwined with the optimality of individual spectral coefficients. These filtering characteristics lead to the inability of the basic data assimilation scheme to effectively assimilate high-resolution observations into fine-resolution models. To obtain (14), complete observations are assumed. When observations are incomplete, the effectiveness of the assimilation further deteriorates as we will show in the next section.
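The low-pass behavior described above can be illustrated numerically. The following Python sketch assumes the standard form in which, for complete observations and homogeneous errors, each spectral coefficient of the increment is weighted by b_m / (b_m + r_m), where b_m and r_m are the background and observation error spectra. The periodic grid, the Gaussian form exp(-(r/D)^2), the parameter values, and the function name `spectral_gain` are illustrative assumptions, not taken from this paper.

```python
import numpy as np

def spectral_gain(n, decorr, sigma_b, sigma_o):
    """Per-wavenumber analysis weight b_m / (b_m + r_m) on a periodic grid."""
    # Periodic (circular) distance from grid point 0 to every grid point.
    idx = np.arange(n)
    r = np.minimum(idx, n - idx)
    # Assumed Gaussian covariance row; the correlation falls to 1/e at r = decorr.
    b_row = sigma_b**2 * np.exp(-(r / decorr) ** 2)
    # A homogeneous (circulant) covariance is diagonalized by the DFT,
    # so its spectrum is the DFT of its first row.
    b_hat = np.fft.fft(b_row).real
    b_hat = np.clip(b_hat, 0.0, None)  # remove tiny negative round-off
    # White-noise observation errors have a flat spectrum.
    r_hat = np.full(n, sigma_o**2)
    return b_hat / (b_hat + r_hat)

# Illustrative values only: background error twice the observational error.
gain = spectral_gain(n=256, decorr=10.0, sigma_b=0.3, sigma_o=0.15)
print(gain[[0, 10, 40, 128]])  # weights at increasing wavenumber
```

Under these assumptions the weight is close to one at the largest scales and decays toward zero at the smallest scales, which is the filtering effect discussed in this subsection.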

Since representativeness errors are crucial for the MS-DA scheme and they are spatially correlated as discussed in the next section, we examine here the impact of the spatial correlation of observational errors on the analysis based on (14). If observation errors are homogeneous as we assume for the background errors, *m*. The spatial correlation of observational errors reduces the filtering effect on small scales. For a correlation given by a Gaussian function, *m* and becomes virtually zero for the components with wavelengths smaller than twice the decorrelation scale. In other words, the observational errors primarily affect the scales larger than twice the specified decorrelation length scale. This property is important in the MS-DA implementation.

## 3. Multiscale data assimilation scheme

A spectral expansion is a basic mathematical method for scale decomposition. In the previous section, complete observations were assumed, so that a spectral expansion could be applied to the observations, thus untangling spatial scales in the estimation given in (14). However, observations are never complete. Therefore, rather than using a spectral expansion, we decompose the fields only into a limited set of distinct spatial scales. We use two spatial scales in the ensuing discussion.

### a. Formulation of multiscale data assimilation

In the partitioned cost functions in (18) and (19), the background error covariances can then be characterized by two distinct decorrelation length scales. Thus, a multi-decorrelation length scale background error covariance is incorporated. In contrast, when a background error covariance is characterized by a single length scale, which can be an average decorrelation length scale that is estimated using observations (Hollingsworth and Lonnberg 1986; Lonnberg and Hollingsworth 1986) or model generated data (known as the NMC method; Parrish and Derber 1992), we refer to it as a single length scale error covariance.
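As a rough numerical illustration of a two-scale additive background error covariance, the Python sketch below assumes that each partitioned cost function keeps the standard quadratic form and treats the other scale's background error covariance, mapped to observation space, as an extra observation-error term. The grid size, decorrelation scales, error magnitudes, observation layout, and the helper names `gaussian_cov` and `increment` are all hypothetical choices for illustration.

```python
import numpy as np

def gaussian_cov(n, decorr, sigma):
    """Gaussian covariance with an assumed form sigma^2 * exp(-(r/decorr)^2)."""
    x = np.arange(n)
    r = np.abs(x[:, None] - x[None, :])
    return sigma**2 * np.exp(-(r / decorr) ** 2)

n = 120
obs_idx = np.arange(0, n, 3)            # observe every third grid point
H = np.eye(n)[obs_idx]                  # selection-type observation operator
B_large = gaussian_cov(n, 30.0, 0.25)   # large decorrelation length scale
B_small = gaussian_cov(n, 3.0, 0.15)    # small decorrelation length scale
R = 0.15**2 * np.eye(obs_idx.size)      # white-noise observation errors

rng = np.random.default_rng(1)
d = rng.standard_normal(obs_idx.size)   # synthetic innovation vector

def increment(B_own, B_other):
    """Minimizer of 0.5 x^T B_own^-1 x + 0.5 (Hx - d)^T R_eff^-1 (Hx - d)."""
    # The other scale's background error acts as an added observation-error term.
    R_eff = R + H @ B_other @ H.T
    return B_own @ H.T @ np.linalg.solve(H @ B_own @ H.T + R_eff, d)

dx_large = increment(B_large, B_small)
dx_small = increment(B_small, B_large)

# Increment from a single cost function using the summed covariance.
B_sum = B_large + B_small
dx_total = B_sum @ H.T @ np.linalg.solve(H @ B_sum @ H.T + R, d)
print(np.allclose(dx_large + dx_small, dx_total))
```

Under these assumptions the two partitioned increments add up to the increment obtained with the summed covariance, because both solves share the same observation-space matrix.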

### b. Multiscale representativeness error and scale aliasing

Comparing the cost function in (1) with the partitioned cost functions in (18) and (19), we notice additional terms appear that are added to the observational error covariance

We next examine the effect of the multiscale representativeness errors on data assimilation. To illustrate, we consider again the case with complete observations. In this case, we have

When the observations are incomplete, the large-scale analysis depends on the small-scale representativeness errors. However, with the assumption of uncorrelated large- and small-scale background errors, the large-scale analysis should be independent of the small-scale error covariance and the component of the observations corresponding to the small scale. Such dependency is scale contamination by nature, and it is essentially scale aliasing (Ooyama 1987). This scale aliasing should be eliminated or mitigated whenever possible. We will show that the effect of scale aliasing can be mitigated in the MS-DA scheme for the high-resolution observations as detailed in the next section.

The large-scale representativeness error in (19) is often large in magnitude but generally has a limited effect on the small-scale analysis. This is true even with incomplete observations, because the large-scale representativeness errors have a decorrelation length scale much larger than the spatial scales of the small-scale component, as discussed in the previous section. This limited effect of the large-scale representativeness error will be further illustrated using the experiment results presented in section 5.

### c. Effectiveness of assimilation of different observations

Using this MS-DA, high-resolution observations can be more effectively assimilated without being overly smoothed through the small-scale component, while the information from the sparse observations is spread out more effectively through the large-scale component. This is an advantage that is needed for current atmospheric and oceanic observing systems, and it will be illustrated in the experiments presented later.

In reality, an observing network generally consists of both high-resolution observations acquired from radar and satellite remote sensing, and sparse observations acquired from conventional observing platforms. The sparse observations are practically difficult to partition. Some high-resolution measurements may also be practically difficult to partition. For practical applications of MS-DA, we here formulate cost functions for assimilating partitioned and nonpartitioned observations simultaneously.

For convenience in the ensuing discussions, we will refer to the DA methodology defined by (32) and (33) or by (34) and (35), in which partitioned high-resolution observations are assimilated, as MS-DA, and to that defined by (18) and (19) as the additive background error covariance DA, denoted AB-DA. The scheme that uses a single length scale error covariance is denoted as SS-DA.

## 4. Configuration for experiments

In this section, we illustrate the properties of MS-DA using an array of experiments. These experiments are performed in a one-dimensional (1D) framework. Experiments with AB-DA and SS-DA are also presented to show the differences among the three schemes.

### a. Basic configuration

The experiments follow a typical identical-twin procedure. An identical-twin experiment is often used to validate and verify data assimilation schemes. The procedure here can be described in five steps: 1) a model is employed to generate true or control states; 2) background states, which are generally used as first guesses, are generated by introducing errors to the true states; 3) observations are generated by adding different random errors to the true states; 4) the observations are assimilated into the background states; and 5) the effectiveness of the data assimilation scheme is assessed by examining errors in the analyses. Steps 4–5 are the focus of section 5. This subsection describes the generation of the true states, background states, and observations required by steps 1–3.
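The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the configuration used in this paper: the true state, the background error, the periodic Gaussian covariance, and all parameter values are hypothetical, and step 4 uses the standard closed-form solution of a quadratic cost function.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = np.arange(n)

# 1) True state: a hypothetical sum of two sinusoids.
truth = np.sin(2 * np.pi * x / n) + 0.3 * np.sin(2 * np.pi * 8 * x / n)

# 2) Background state: the truth plus a smooth, large-scale error.
background = truth + 0.3 * np.cos(2 * np.pi * 2 * x / n)

# 3) Observations at every grid point: the truth plus random error.
sigma_o = 0.15
obs = truth + sigma_o * rng.standard_normal(n)

# 4) Assimilation: x_a = x_b + B H^T (H B H^T + R)^-1 (y - H x_b).
H = np.eye(n)
dist = np.abs(x[:, None] - x[None, :])
dist = np.minimum(dist, n - dist)               # periodic distance
B = 0.3**2 * np.exp(-(dist / 10.0) ** 2)        # assumed Gaussian covariance
R = sigma_o**2 * np.eye(n)
innovation = obs - H @ background
analysis = background + B @ H.T @ np.linalg.solve(H @ B @ H.T + R, innovation)

# 5) Assessment: compare root-mean-square errors against the truth.
def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

print(rmse(background, truth), rmse(obs, truth), rmse(analysis, truth))
```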

*t* stands for a true state as before. The true state has been given as a discrete function of the number of grid points *N*. The total function number *K* is given as *N*/5. The quantity

*γ* is, the more dominant the large-scale components become. In the ensuing experiments, we will examine the dependence of SS-DA, AB-DA, and MS-DA on *γ*, as it ranges from 0 to 2. This range of *γ* covers a typical range of spatial structures in atmospheric and oceanic flows. To avoid a dominance in the true state of a small number of large spatial scale components, we let

*M*, is smaller than *N*, that is, only a subset of the grid is used for sampling observations. By using different spatial distributions for these subsets, a variety of sampling schemes will be examined representing various observing platforms or networks in the experiments described later.

### b. Spatial-scale decomposition

We note that it is often difficult to decompose observations using a spectral expansion in practical applications. An alternative is to use a smoothing method for scale decomposition. The use of a Gaussian smoothing will be examined in section 5c, and the corresponding MS-DA scheme is denoted as MS-DA GAU.

A question arising here is how to determine an optimal

### c. Background error correlation matrices

One essential difference between SS-DA, AB-DA, and MS-DA is in the background error correlations. Here we discuss the construction of those correlations using Gaussian functions.

#### 1) Correlation matrices with a single decorrelation scale

*i* and *j*, and *D* is a decorrelation length scale. Since a single decorrelation length scale is used in representing the error covariance in (51), it is a single-scale error covariance.

The decorrelation length scale *D* is central to the performance of SS-DA. In the ensuing experiments, we will see that the decorrelation length scale *D* determines the filtering properties of SS-DA. The dependence of the performance of SS-DA on *D* will also be illustrated using the results from the experiments.
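A single-scale Gaussian correlation matrix of this kind can be built as in the Python sketch below. The exact expression in (51) is not reproduced here; the sketch assumes the form exp(-(r/D)^2), chosen so that the correlation falls to 1/e at a separation of *D*, consistent with the definition of the decorrelation scale given in the introduction. The function name `gaussian_correlation` and the grid size are hypothetical.

```python
import numpy as np

def gaussian_correlation(n, decorr):
    """Correlation matrix with entries exp(-(r_ij / decorr)^2) (assumed form)."""
    x = np.arange(n)
    # r_ij: distance, in grid points, between grid points i and j.
    r = np.abs(x[:, None] - x[None, :])
    return np.exp(-(r / decorr) ** 2)

C = gaussian_correlation(n=100, decorr=10.0)
print(C[0, 10])  # correlation at a separation of one decorrelation scale
```

A larger `decorr` widens the band of significant correlations around the diagonal, which is what strengthens the filtering of small scales in SS-DA.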

#### 2) Correlation matrices for multiscale data assimilation

These two decorrelation length scales

The background error correlations are not consistent with the background error specified in (43). The given correlations lead to an underestimation of the background error on the small scales. We have argued in the introduction that such an underestimation is unavoidable within the framework of the basic data assimilation. Here we aim to illustrate how the MS-DA scheme improves the effectiveness in assimilating high-resolution observations by reducing the small-scale errors.

### d. Implementation and statistical analyses of experiments

To quantify the differences between SS-DA and MS-DA, we have randomized a set of parameters in the expressions of the true states in (36), the background states in (39), and the observations in (44) to represent their different characteristics, as well as different background and observational errors. With these randomized parameters, we can then perform a large number of experiments, resulting in robust statistics.

We perform each experiment as many as 215 times. With this number of realizations of each experiment, the statistics, that is, the root-mean-square error (RMSE), tend to converge, and further realizations lead to differences of no more than 2%.
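One simple way to monitor this kind of convergence is to track the running mean of the RMSE over realizations. The Python sketch below is illustrative only: the synthetic "analysis errors" are random numbers standing in for real experiment output, and the helper names `rmse` and `running_mean` are hypothetical.

```python
import numpy as np

def rmse(estimate, truth):
    """Root-mean-square error between two fields."""
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))

def running_mean(values):
    """Mean of the first k values, for k = 1 .. len(values)."""
    values = np.asarray(values, dtype=float)
    return np.cumsum(values) / np.arange(1, values.size + 1)

rng = np.random.default_rng(0)
truth = np.zeros(200)
# Stand-in for 215 realizations: synthetic errors, not real analyses.
rmses = [rmse(0.15 * rng.standard_normal(200), truth) for _ in range(215)]

means = running_mean(rmses)
# Relative change of the ensemble-mean RMSE when the last member is added.
rel_change = abs(means[-1] - means[-2]) / means[-1]
print(rel_change)
```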

In the expressions given previously in this section, some more parameters need to be prescribed. The observational errors are homogeneous and specified as *γ*. Here we intentionally specify a background error that is twice as large as the observational error in order to highlight the impact of the observations, and note that the relative magnitudes of the background and observational errors do not affect the results presented.

## 5. Experiments and results

Following the procedure outlined and using the configuration described in the previous section, we perform experiments for three observational scenarios: 1) complete high-resolution observations, 2) patchy high-resolution observations, and 3) sparse and patchy high-resolution observations. We describe here the results with these three observational scenarios in order.

### a. Complete high-resolution observations

This is the simplest observational scenario. The associated experiments serve to illuminate the filtering properties of SS-DA and MS-DA. Figure 1b presents analyses from AB-DA and MS-DA. In the MS-DA experiments, the cost functions in (32) and (33) are used, and thus partitioned observations are assimilated. The partitioning of the observations follows (49) and (50). For SS-DA, four experiments are presented using decorrelation length scales of

The results shown in Fig. 1 illustrate three things that are of particular interest. First, both AB-DA and MS-DA generate an accurate analysis, and the analyses are very similar. This similarity confirms that the representativeness errors that are present in (32) and (33) have no effect as shown in section 2b. Additional experiments show that neither AB-DA nor MS-DA is sensitive to the specified background errors

To further elucidate the performance of SS-DA, AB-DA, and MS-DA, we examine their analysis increments, which are presented in Fig. 2. For MS-DA, the large-scale analysis increment obtained by minimizing (32) corrects the large-scale background error, while the small-scale analysis increment obtained by minimizing (33) corrects the small-scale background error, as expected. A comparison of the AB-DA and MS-DA increments shows that the large- and small-scale analysis increments from the AB-DA are similar to those of MS-DA (Figs. 2c and 2d).

The SS-DA increments (Figs. 2a and 2b) are revealing of its filtering nature. With a small decorrelation length scale of

The above comparisons and analyses are associated with a particular experiment. We next analyze a large number of experiments to make statistical comparisons. Figure 3 presents the mean RMSEs calculated over ensembles, each of which consists of up to 215 experiments as described in section 4d. Figure 3 is also used to examine the dependence of the analysis RMSEs on the spectral power distribution of the background errors, that is, the spatial scale characteristics of the background errors.

Figure 3 presents results essentially consistent with those derived from the single experiment discussed above. Both AB-DA and MS-DA generate accurate analyses, and their analysis errors are about two-thirds of the observational error when small-scale components dominate in the background errors and less than half the observational error when large-scale components dominate. The AB-DA and MS-DA analysis RMSEs show little difference. Among the SS-DA experiments, the analysis RMSEs with a decorrelation length scale of

We next examine the sensitivity of the AB-DA and MS-DA to the selection of

### b. Incomplete observations

For complete observations, we have shown the effectiveness of the AB-DA and MS-DA in assimilation of high-resolution observations. We illustrate here how MS-DA improves the effectiveness of the assimilation of incomplete observations.

#### 1) Patchy high-resolution observations

High-resolution observations often have spatial distributions that consist of patches or swaths as are often seen in radar or satellite datasets. Here we examine the assimilation of patchy high-resolution observations. The patchy observations are taken over three isolated intervals. In the intervals with observations, observations are sampled at every grid point, thus representing high-resolution observations. There is a gap consisting of 40 grid points between two patches of observations. We note that the size of the gaps has a profound effect on the behavior of a data assimilation scheme, a point that will be carefully addressed later.
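A patchy sampling pattern of this kind corresponds to an observation operator that selects only the grid points inside the observed intervals. The Python sketch below is a minimal illustration: the domain size of 200 grid points and the patch length of 40 are assumptions made for the example (only the 40-point gap comes from the text), and the function name `patchy_obs_operator` is hypothetical.

```python
import numpy as np

def patchy_obs_operator(n, patch_starts, patch_len):
    """Selection matrix observing every grid point inside each patch."""
    idx = np.concatenate([np.arange(s, s + patch_len) for s in patch_starts])
    # Each row of H picks out one observed grid point.
    return np.eye(n)[idx], idx

# Three patches of 40 points separated by gaps of 40 unobserved points.
H, idx = patchy_obs_operator(n=200, patch_starts=[0, 80, 160], patch_len=40)
print(H.shape)
```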

Following the experiments with complete high-resolution observations, we first present results from single experiments. The analyses from AB-DA and MS-DA are presented in Fig. 4b, and the analyses from SS-DA in Figs. 4c and 4d. In the MS-DA experiments, the observations are partitioned as in (49) and (50), and the partitioned observations are assimilated using the cost functions in (32) and (33). For SS-DA, four experiments are presented again, using decorrelation length scales of

These figures indicate that the MS-DA analysis stands out among all the analyses, as evidenced by an RMSE that is smaller than the others. The MS-DA analysis RMSE (0.085) is about 40% smaller than the observational error (0.15), while the analysis RMSEs from all the other experiments are close to or larger than the observational error.

To better illustrate the performance of MS-DA, we compare the analysis errors in the intervals with observations to the errors in the gaps without observations separately. In the intervals with observations, the analysis errors of SS-DA, AB-DA, and MS-DA are similar to those in the case with complete observations. AB-DA, MS-DA, and SS-DA with a decorrelation length scale of

In the gaps without observations, there are notable differences among the SS-DA, AB-DA, and MS-DA analyses. The MS-DA analysis more realistically reproduces the true solution, while the other analyses show substantial errors there. Except for MS-DA, the errors in the gaps without observations account for most of the total analysis RMSEs. Note that the small-scale background errors in the gaps cannot be corrected by data assimilation, but the errors with spatial scales larger than the size of the gaps can be corrected. The reason why the MS-DA analysis shows particularly small RMSEs is that it faithfully reproduces the large-scale components of the true state within the gaps. This will become more evident when we examine the analysis increments.

Figure 5 presents the analysis increments corresponding to the analyses shown in Fig. 4. For SS-DA with a decorrelation length scale of

With the patchy observations assimilated here, we note that the analysis RMSEs are sensitive to the spatial distribution of background errors. It is crucial to perform statistical analyses over ensembles of a large number of experiments when estimating the background errors. As in Fig. 3 for the case with complete observations, Fig. 6 presents the mean RMSEs calculated over ensembles of 215 members.

The main conclusion that can be drawn from Fig. 6 is that the MS-DA analysis stands out among all the analyses: its analysis RMSE is much smaller than those of AB-DA and SS-DA. In fact, the analysis RMSEs of MS-DA tend to be as small as those with the complete observations. Also, it can be seen that the AB-DA outperforms SS-DA for all four decorrelation length scales, except that with a decorrelation length scale of *γ* in a complicated way.

We here again examine the question of whether the results are sensitive to the selection of

#### 2) Sparse and high-resolution observations

In this set of experiments, both nonpartitioned sparse observations

We here again first analyze single experiments to illustrate the differences between MS-DA, AB-DA, and SS-DA. Figure 7 presents the analyses, along with the true state, background state and observations. As in the case with patchy observations, we examine the analyses within the half domain with sparse observations and the half domain with complete observations separately. For the half domain with the complete observations, the differences among the analyses are essentially the same as those in the experiments using complete observations described in section 5a.

For the half domain with sparse observations, the major difference occurs in the areas adjacent to the area with the complete observations. The MS-DA analysis better reproduces the large-scale component there, and the overall MS-DA analysis error is smaller than that obtained using either AB-DA or SS-DA. This is particularly apparent in the analysis increments shown in Fig. 8. Figure 8 shows that MS-DA tends to more effectively reduce the large-scale error in the domain with sparse observations, in particular, within approximately one large-scale decorrelation length scale of the area with complete observations. We note that the differences among the analyses for this half domain are sensitive to the background state and its associated errors, and also the observational errors. Thus, the advantages of MS-DA using (34) and (35) and assimilating the partitioned high-resolution observations must be verified statistically.

Figure 9 presents mean RMSEs calculated over ensembles, each of which consists of 215 experiments as described in section 4d. Both the AB-DA and MS-DA analyses produce smaller analysis errors than the four SS-DA experiments in this case. In particular, MS-DA is superior to either of the other schemes. This indicates that the superior performance of MS-DA occurs through more accurately reproducing the large-scale component in the domain with sparse observations, in particular, within a range of approximately one large-scale decorrelation length scale away from the area with complete observations.

### c. Practical considerations on decomposing observations

The previous discussion focused on results obtained for the scale decompositions based on a spectral expansion. Spatial scales can be well defined in spectral space, as the basis functions are usually trigonometric functions or other eigenfunctions that are distinct in space. In most practical applications, spectral decompositions of observations may be difficult or impossible. In this case, a horizontal spatial smoothing operator may be used instead. Using such a smoothing, the orthogonality between the large- and small-scale components may be lost, resulting in a spatial correlation in the observational errors and thus a potential negative impact on the MS-DA performance.
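For a field known at every grid point of a periodic domain, a spectral scale decomposition can be sketched with a discrete Fourier transform, as in the Python example below. This is an illustration under stated assumptions rather than the decomposition in (49) and (50): the truncation wavenumber, the test field, and the function name `spectral_split` are hypothetical.

```python
import numpy as np

def spectral_split(field, cutoff):
    """Split a periodic 1D field into large and small scales at wavenumber `cutoff`."""
    coeffs = np.fft.rfft(field)
    large_coeffs = coeffs.copy()
    # Keep wavenumbers 0..cutoff in the large-scale part; zero the rest.
    large_coeffs[cutoff + 1:] = 0.0
    large = np.fft.irfft(large_coeffs, n=field.size)
    # The small-scale part is the residual, so the two parts sum to the field.
    return large, field - large

n = 128
x = np.arange(n)
field = np.sin(2 * np.pi * 2 * x / n) + 0.5 * np.sin(2 * np.pi * 20 * x / n)
large, small = spectral_split(field, cutoff=5)
print(abs(np.dot(large, small)))  # near zero: the two components are orthogonal
```

Because the two components occupy disjoint sets of wavenumbers, their inner product vanishes to round-off, which is the orthogonality referred to above.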

Here we examine the impact of a Gaussian smoothing on the MS-DA performance. In Gaussian smoothing, the weights are given by *r* is the distance between two given grid points, and *D* is a length scale. This length scale is taken to be 0.5 times the truncation wavelength. The smoothed fields are assumed to be the large-scale component, and the residual is the small-scale component.
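A decomposition by Gaussian smoothing can be sketched as follows. The weight expression used in the paper is not reproduced here; the Python sketch assumes weights proportional to exp(-(r/D)^2), normalized so that each row sums to one. The test field, the length scale, and the function name `gaussian_smooth_split` are hypothetical.

```python
import numpy as np

def gaussian_smooth_split(field, length_scale):
    """Large scale = Gaussian-smoothed field; small scale = residual."""
    x = np.arange(field.size)
    r = np.abs(x[:, None] - x[None, :])
    # Assumed weight form; rows are normalized so a constant field is preserved.
    weights = np.exp(-(r / length_scale) ** 2)
    weights /= weights.sum(axis=1, keepdims=True)
    large = weights @ field
    return large, field - large

n = 128
x = np.arange(n)
field = np.sin(2 * np.pi * 2 * x / n) + 0.5 * np.sin(2 * np.pi * 20 * x / n)
large, small = gaussian_smooth_split(field, length_scale=8.0)
overlap = float(np.dot(large, small))
print(overlap)  # generally nonzero: the components are no longer orthogonal
```

Unlike a sharp spectral truncation, the smoother attenuates each wavenumber only partially, so both components retain energy at the same scales and their inner product is generally nonzero.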

For the case of complete observations, the use of the Gaussian smoothing has only a small impact as measured by the RMSEs (Fig. 3). In the cases of incomplete observations, however, the use of the Gaussian smoothing clearly results in deterioration in the MS-DA performance (Figs. 6 and 9). The negative impact on the performance is greatest in the observational scenario with three patches of observations. These results are understandable. Gaussian smoothing does not have much, if any, effect on the small scales, and thus the deterioration is primarily associated with scale aliasing. In the case of complete observations, there is no scale aliasing, and the use of Gaussian smoothing results in little impact on the performance. With incomplete observations, the use of Gaussian smoothing reduces the ability of MS-DA to mitigate scale aliasing. In particular, when *γ* is small, that is, small scales dominate the background fields and observations, the smoothing itself gives rise to scale aliasing that negates the positive benefits of MS-DA. The results here thus suggest that while a Gaussian smoothing could be used, a smoothing method that is able to retain more orthogonality would be preferable.

## 6. Summary and discussion

We have formulated a multiscale variational data assimilation (MS-DA) scheme for fine-resolution models that encompass a wide range of spatial scales. Because small-scale components are generally underrepresented in estimates of the background error covariance used in most data assimilation schemes, the background error decorrelation scale is often so large as to strongly filter out fine structures in the observations. The basic data assimilation scheme is thus inherently ineffective for fine-resolution models. The MS-DA scheme is formulated and implemented to mitigate this ineffectiveness.

In this MS-DA scheme, the cost function is decomposed for a set of distinct spatial scales. The background error covariance is then estimated for the distinct spatial scales separately, and multi-decorrelation scales are explicitly incorporated in the background error covariances. We used here a decomposition of the cost function into separate components for the large and small scales. MS-DA then minimizes the partitioned cost functions sequentially from large to small scales. The large decorrelation length scale in the large-scale background error covariance allows for the spreading of sparse observations more effectively, while the small decorrelation length scale in the small-scale background error covariance allows for extracting the fine structure information from high-resolution observations.

The decomposition of the cost function also reveals some important limitations of the basic data assimilation scheme, that is, the presence of scale aliasing and multiscale representativeness errors. The large-scale background errors are multiscale representativeness errors for the small-scale data assimilation, since they act on the small-scale analysis as representativeness errors. The effect of small-scale representativeness errors turns out to be associated with scale aliasing. The MS-DA scheme provides an avenue to mitigate the effect of scale aliasing and multiscale representativeness errors for high-resolution observations. The mitigation is achieved through assimilating their partitioned components.

An array of one-dimensional experiments was conducted to elucidate the properties of the MS-DA scheme. In this one-dimensional context, we addressed issues arising from a wide variety of multiscale structures in the background states. Emphasis was placed on the assimilation of patchy high-resolution observations, which aim to represent radar or satellite swath data, as well as the assimilation of such observations alongside sparse observations representing those from conventional observing platforms. The major conclusions can be summarized as follows: 1) a data assimilation scheme with a single-scale background error covariance (SS-DA) is shown to suffer from inherent limitations in assimilating high-resolution observations, and these inherent limitations are especially apparent when the high-resolution observations are localized and patchy; 2) a data assimilation scheme that uses an additive multiscale background error covariance (AB-DA) is shown to be useful in mitigating the limitations of SS-DA related to the filtering effect; 3) MS-DA is demonstrated to further improve on the AB-DA scheme by assimilating partitioned high-resolution observations, which mitigates the effect of scale aliasing and multiscale representativeness error and thus improves the effectiveness of the assimilation of patchy high-resolution observations alongside sparse observations; and 4) the performance of MS-DA is not particularly sensitive to the definition of large and small scales in the decomposition, which makes MS-DA robust and flexible to use.
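The filtering effect behind conclusions 1 and 2 can be sketched in spectral terms for an idealized case. The sketch assumes complete observations on a periodic grid with uncorrelated observation errors, so that the analysis acts on each wavenumber independently. The Gaussian-shaped spectra, the parameter values, and the helper name `gaussian_spectrum` are hypothetical and are not taken from the experiments.

```python
import numpy as np

def gaussian_spectrum(n, length, variance):
    """Eigenvalues of a periodic covariance with a Gaussian-shaped spectrum."""
    k = 2.0 * np.pi * np.fft.fftfreq(n)           # radians per grid point
    shape = np.exp(-0.5 * (length * k) ** 2)
    return variance * shape / shape.mean()        # grid-point variance equals `variance`

n = 256
obs_var = 0.05                                    # hypothetical observation error variance

b_large = gaussian_spectrum(n, length=20.0, variance=1.0)
b_small = gaussian_spectrum(n, length=2.0, variance=0.25)

# With H = I and R = obs_var * I, the analysis increment at each wavenumber
# is the innovation multiplied by the gain b / (b + obs_var).
gain_ss = b_large / (b_large + obs_var)           # single large decorrelation scale
b_additive = b_large + b_small
gain_ab = b_additive / (b_additive + obs_var)     # additive two-scale covariance

for m in (1, 5, 20, 60):
    print(f"wavenumber {m:3d}: SS gain = {gain_ss[m]:.3f}, AB gain = {gain_ab[m]:.3f}")
```

A gain near zero means that the observed signal at that wavenumber is filtered out of the analysis. In this idealization the single large-scale covariance rejects the small scales almost entirely, while the additive covariance retains them. The sketch does not address scale aliasing, which arises only with incomplete observations.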

In recent years, model resolutions have been rapidly increasing, and a variety of radar and satellite sensors increasingly provide high-resolution observations. The results presented here suggest that this MS-DA scheme holds promise as a data assimilation methodology for assimilating high-resolution radar or satellite swath measurements into very high-resolution models. In such circumstances, by contrast, the results suggest that a data assimilation scheme using a single-scale background error covariance is inadequate.

We note that the implementation of this MS-DA in a three-dimensional variational data assimilation (3DVar) system is straightforward, because the decomposed cost function is algorithmically the same as the original cost function for 3DVar. We have applied this MS-DA framework to an oceanic 3DVar system (Li et al. 2008a,b); the resulting system, dubbed MS-3DVar, has operationally supported a coastal ocean observing system for a number of years. In Li et al. (2015), the practical issues involved in its implementation, including scale decomposition, estimates of the background error covariance, and assimilation of different types of observations, were detailed, and results from OSSEs and its operational application were presented to illustrate the advantages of MS-3DVar over 3DVar. We also note the similarity to a dual-resolution ensemble Kalman filter (e.g., Gao and Xue 2008; Rainwater and Hunt 2013) and a dual-resolution hybrid variational-ensemble data assimilation (Schwartz et al. 2015), since they use additive background error covariances estimated from ensembles that are produced from models with two different spatial resolutions. The integration of this MS-DA into the dual-resolution ensemble Kalman filter or hybrid variational-ensemble data assimilation is a topic worth exploring.

## Acknowledgments

The research described in this publication was carried out, in part, at the Jet Propulsion Laboratory (JPL), California Institute of Technology, under a contract with the National Aeronautics and Space Administration (NASA). This research was also supported in part by the Office of Naval Research (N00014-12-1-093 and N00014-10-1-0557). The authors thank Prof. Fuqing Zhang and the anonymous reviewers for comments that were very helpful in improving the manuscript.

## References

Berre, L., 2000: Estimation of synoptic and mesoscale forecast error covariances in a limited area model. *Mon. Wea. Rev.*, **128**, 644–667, doi:10.1175/1520-0493(2000)128<0644:EOSAMF>2.0.CO;2.

Boer, G. J., 1983: Homogeneous and isotropic turbulence on sphere. *J. Atmos. Sci.*, **40**, 154–163, doi:10.1175/1520-0469(1983)040<0154:HAITOT>2.0.CO;2.

Cohn, S. E., 1997: Estimation theory for data assimilation problems: Basic conceptual framework and some open questions. *J. Meteor. Soc. Japan*, **75** (1B), 257–288.

Daley, R., 1991: *Atmospheric Data Assimilation*. Cambridge University Press, 457 pp.

Desroziers, G., O. Brachemi, and B. Hamadache, 2001: Estimation of the representativeness error caused by the incremental formulation of variational data assimilation. *Quart. J. Roy. Meteor. Soc.*, **127**, 1775–1794, doi:10.1002/qj.49712757516.

Gao, J., and M. Xue, 2008: An efficient dual-resolution approach for ensemble data assimilation and tests with simulated Doppler radar data. *Mon. Wea. Rev.*, **136**, 945–963, doi:10.1175/2007MWR2120.1.

Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. *Quart. J. Roy. Meteor. Soc.*, **125**, 723–757, doi:10.1002/qj.49712555417.

Hollingsworth, A., and P. Lonnberg, 1986: The statistical structure of short-range forecast error as determined from radiosonde data. Part I: The wind fields. *Tellus*, **38A**, 111–136, doi:10.1111/j.1600-0870.1986.tb00460.x.

Ide, K., P. Courtier, M. Ghil, and A. C. Lorenc, 1997: Unified notation for data assimilation: Operational sequential and variational. *J. Meteor. Soc. Japan*, **75** (1B), 71–79.

Jazwinski, A. H., 1970: *Stochastic Processes and Filtering Theory*. Academic Press, 376 pp.

Li, Z., Y. Chao, J. C. McWilliams, and K. Ide, 2008a: A three-dimensional variational data assimilation scheme for the Regional Ocean Modeling System. *J. Atmos. Oceanic Technol.*, **25**, 2074–2090, doi:10.1175/2008JTECHO594.1.

Li, Z., Y. Chao, J. C. McWilliams, and K. Ide, 2008b: A three-dimensional variational data assimilation scheme for the Regional Ocean Modeling System: Implementation and basic experiments. *J. Geophys. Res.*, **113**, C05002, doi:10.1029/2006JC004042.

Li, Z., J. C. McWilliams, K. Ide, and J. D. Fararra, 2015: Coastal ocean data assimilation using a multi-scale three-dimensional variational scheme. *Ocean Dyn.*, **65**, 1001–1015, doi:10.1007/s10236-015-0850-x.

Lonnberg, P., and A. Hollingsworth, 1986: The statistical structure of short-range forecast error as determined from radiosonde data. Part II: The covariance of height and wind errors. *Tellus*, **38A**, 137–161, doi:10.1111/j.1600-0870.1986.tb00461.x.

Lorenc, A. C., 1986: Analysis methods for numerical weather prediction. *Quart. J. Roy. Meteor. Soc.*, **112**, 1177–1194, doi:10.1002/qj.49711247414.

Ooyama, K. V., 1987: Scale-controlled objective analysis. *Mon. Wea. Rev.*, **115**, 2479–2506, doi:10.1175/1520-0493(1987)115<2479:SCOA>2.0.CO;2.

Parrish, D. F., and J. C. Derber, 1992: The National Meteorological Center's spectral-interpolation system. *Mon. Wea. Rev.*, **120**, 1747–1763, doi:10.1175/1520-0493(1992)120<1747:TNMCSS>2.0.CO;2.

Rainwater, S., and B. Hunt, 2013: Mixed resolution ensemble data assimilation. *Mon. Wea. Rev.*, **141**, 3007–3021, doi:10.1175/MWR-D-12-00234.1.

Schwartz, C. S., Z. Liu, and X.-Y. Huang, 2015: Sensitivity of limited-area hybrid variational-ensemble analyses and forecasts to ensemble perturbation resolution. *Mon. Wea. Rev.*, doi:10.1175/MWR-D-14-00259.1, in press.

Toth, Z., M. Tew, D. Birkenheuer, S. Albers, Y. Xie, and B. Motta, 2014: Multiscale data assimilation and forecasting. *Bull. Amer. Meteor. Soc.*, **95**, ES30–ES33, doi:10.1175/BAMS-D-13-00088.1.

Wu, W.-S., R. J. Purser, and D. F. Parrish, 2002: Three-dimensional variational analysis with spatially inhomogeneous covariances. *Mon. Wea. Rev.*, **130**, 2905–2916, doi:10.1175/1520-0493(2002)130<2905:TDVAWS>2.0.CO;2.

Xie, Y., S. Koch, J. McGinley, S. Albers, P. E. Bieringer, M. Wolfson, and M. Chan, 2011: A space–time multiscale analysis system: A sequential variational analysis approach. *Mon. Wea. Rev.*, **139**, 1224–1240, doi:10.1175/2010MWR3338.1.

Zhang, F., Y. Weng, J. A. Sippel, Z. Meng, and C. H. Bishop, 2009: Cloud-resolving hurricane initialization and prediction through assimilation of Doppler radar observations with an ensemble Kalman filter. *Mon. Wea. Rev.*, **137**, 2105–2125, doi:10.1175/2009MWR2645.1.

Zhang, F., Y. Weng, J. F. Gamache, and F. D. Marks, 2011: Performance of convection-permitting hurricane initialization and prediction during 2008–2010 with ensemble data assimilation of inner-core airborne Doppler radar observations. *Geophys. Res. Lett.*, **38**, L15810, doi:10.1029/2011GL048469.