## Abstract

Strongly coupled data assimilation (SCDA), where observations of one component of a coupled model are allowed to directly impact the analysis of other components, sometimes fails to improve the analysis accuracy with an ensemble Kalman filter (EnKF) as compared with weakly coupled data assimilation (WCDA). It is well known that an observation’s area of influence should be localized in EnKFs since the assimilation of distant observations often degrades the analysis because of spurious correlations. This study derives a method to estimate the reduction of the analysis error variance by using estimates of the cross covariances between the background errors of the state variables in an idealized situation. It is shown that the reduction of analysis error variance is proportional to the squared background error correlation between the analyzed and observed variables. From this, the authors propose an offline method to systematically select which observations should be assimilated into which model state variable by cutting off the assimilation of observations when the squared background error correlation between the observed and analyzed variables is small. The proposed method is tested with the local ensemble transform Kalman filter (LETKF) and a nine-variable coupled model, in which three Lorenz models with different time scales are coupled with each other. The covariance localization with the correlation-cutoff method achieves an analysis more accurate than either the full SCDA or the WCDA methods, especially with smaller ensemble sizes.

## 1. Introduction

The use of coupled models in Earth sciences is becoming increasingly common. In addition to atmosphere–ocean coupled models used for predicting El Niño–Southern Oscillation (ENSO), Earth system models coupled with many components such as ice sheets, the biosphere, aerosols, and chemical species have improved our understanding of the Earth system and enabled projecting future climate with less artificial approximations.

The purpose of data assimilation (DA) is to prepare initial conditions for these models by estimating the current (real-time analysis) or past (reanalysis) state of the system combining the models and observations. Traditionally, the state of each component is estimated by using a background state estimate from an uncoupled model and observations of the same component. This approach is known as uncoupled DA. For example, the oceanic state is often analyzed using an ocean general circulation model (GCM) and observations of only the oceanic state, even if the analysis is then used for a prediction by a coupled model. Coupled data assimilation (CDA) substitutes the uncoupled models with a coupled model to generate the background and is expected to provide a more self-consistent and accurate analysis compared with uncoupled DA (e.g., Zhang et al. 2007; Penny and Hamill 2017).

One type of CDA known as weakly coupled DA (WCDA) uses a coupled model to generate the background but then updates the analysis separately for each component. Observations of one component are directly assimilated only into the state variables of the same component. The analysis increments of one component can, therefore, only indirectly impact the state of the other components through the forward integration of the coupled model. Weakly coupled ocean–atmosphere DA has been successfully implemented operationally and shown to be an improvement over uncoupled DA (e.g., Saha et al. 2010; Laloyaux et al. 2016).

Another type of CDA is called strongly coupled DA (SCDA; Penny and Hamill 2017). SCDA also uses a coupled model in the forecast step of DA as well but then uses the cross covariances of the background errors to assimilate observations of one component into the state estimate of the other components directly. Therefore, a model variable can be affected by the assimilation of all the possibly relevant observations of all the components, which should minimize the analysis uncertainty of the variable. In addition, SCDA is expected to achieve more self-consistent analysis by alleviating the initialization shocks present in WCDA, which are caused by each component assimilating a disjoint set of observations (Mulholland et al. 2015).

Some recent experiments with SCDA show encouraging results. Sluka et al. (2016) showed that the direct assimilation of atmospheric observations into the ocean analysis reduced the error in the ocean analysis by ~46% compared to WCDA using an intermediate complexity atmosphere–ocean coupled model under a perfect model scenario. However, SCDA does not always produce improved analyses. Han et al. (2013) showed that the use of cross covariance between components is only beneficial for the slower component (i.e., the ocean in an atmosphere–ocean model) and only if using a very large ensemble (). Kang et al. (2011) studied SCDA with a coupled atmosphere and carbon model and showed that the best analysis was obtained when ignoring the cross covariance between carbon flux/concentrations and a subset of the dynamic variables such as specific humidity. This approach, named “variable localization,” was based on the assumption that those pairs of variables did not have any physical interaction, unlike wind and carbon concentration. Variable localization is implemented by specifying a localization weight for each pair of analyzed and observed variable types (e.g., carbon concentration and zonal wind) in addition to the regular localization that depends on the physical distance between the analyzed grid point and the observation.

The main difficulty of CDA is due to the presence of multiple time scales in a coupled system dynamics. A coupled atmosphere–ocean system, for example, has high-frequency modes such as convection and baroclinic instability (minutes to days) and low-frequency modes such as ENSO (seasonal to interannual). However, most of the observable quantities are (possibly nonlinear) combinations of those modes. Therefore, when observations are assimilated to the physical state variables, irrelevant modes projected onto the observation can introduce noise into the mode of interest (e.g., Tardif et al. 2014). Lu et al. (2015a,b) explicitly tackled the multi-time-scale characteristic of coupled systems. They first showed that the use of coupled covariance (i.e., SCDA relative to WCDA) is only beneficial in the deep tropics, where the ocean drives the atmosphere through anomalous sea surface temperature (Ruiz-Barradas et al. 2017). They attributed the negative impact of SCDA outside the tropics to high-frequency weather noise and proposed the leading average coupled covariance (LACC) method, in which the innovations from the atmospheric observations were averaged over several days to enhance the cross correlation between the atmosphere and the ocean.

The aforementioned, apparently contradicting results raise an important question: Under what conditions does SCDA work better than WCDA? In our study, we address this question and propose an offline method to determine which observations should be assimilated into which variables during the analysis update.

## 2. Theoretical analysis

In this section, we derive an expression that estimates the analysis uncertainty reduction by the assimilation of each observation.

Here, we assume that only a single observation is assimilated at a time. This is not a strong assumption because, in both Gaussian and Bayesian frameworks, theoretical analysis shows that the observations can be assimilated sequentially without changing the resulting analysis if they have mutually independent error distributions (Houtekamer and Mitchell 2001; Anderson 2003). Furthermore, Anderson (2003) pointed out that the observations with correlated errors can be transformed into ones with uncorrelated errors by performing a singular value decomposition on the observation error covariance matrix . Note that when considering a second or later observation in sequential assimilation, the background error covariance in the following derivation should be replaced with the one used for the assimilation of the observation of interest (i.e., the *analysis* error covariance after assimilating all the previous scalar observations).

We start our derivation from the state-update equations of the Kalman filter (Kalman 1960). Assuming that the background error covariance and observation error covariance are correctly specified, and that the observation errors are not correlated with the background errors, the analysis error covariance is given by

where is the Kalman gain, is the identity matrix, and is a linearized observation operator (e.g., Gelb et al. 1974).

Consider the analysis error variance of the *i*th model variable,

where *n* is the size of the state vector and *p* is the number of observations. The fractional decrease of the uncertainty for the *i*th model variable is given by

Assuming that there is only one observation (), the observation error variance can be expressed by a scalar as . With this assumption, we can reduce Eq. (4) to

where we used the Kalman gain [Eq. (2)] and the single observation assumption repeatedly (note that and are a matrix and a scalar, respectively). We then rewrite the covariance between the background errors of the observable () and the *i*th model variable () as a product of their correlation and standard deviations ( and for the observable and the *i*th model variable, respectively) as . We finally obtain

where is the standard deviation of the analysis error of the *i*th model variable [a similar derivation for a two-variable example is provided in Hamill et al. (2001)]. It is informative to compare this equation with the analysis uncertainty reduction in the univariate analysis, in which a single state variable is directly observed by a single observation,

where , , and are the error variances for the background, analysis, and observation, respectively. Equation (6) is similar to Eq. (7), except that the right-hand side is multiplied by the square of the correlation between the background errors of the analyzed and observed variables.

Equation (6) indicates that the relative improvement of the estimate of the state of each model variable by the assimilation of a single observation is the product of two quantities: (i) the ratio of the background error variance at the observation location and the sum of the background and observation error variance at the observation location (which is close to one when the observation is relative to the background) and (ii) the square of the background error correlation between the analyzed and observed variables. This equation also provides a quantitative estimate of the analysis error reduction by SCDA using estimates of the cross covariances between background errors in the different components.

We hypothesize that in an ensemble Kalman filter (EnKF; Evensen 1994), the assimilation of “irrelevant” observations in SCDA degrades the analysis if the detrimental effect of spurious correlations from the limited ensemble size exceeds the expected error reduction from the Kalman filter. This hypothesis implies that for a strongly coupled EnKF, we should use a correlation-cutoff method to localize the analysis, in which we only consider cross covariances between variables that have strong background error correlation.

## 3. Methods

Local EnKFs such as the local ensemble transform Kalman filter (LETKF; Hunt et al. 2007) allows us to assimilate different subsets of observations for each model variable. Therefore, we can define a “localization pattern”, in which we select observations to be assimilated into each model variable depending on which component the observation and the model variable are located in (see details in section 3e and Fig. 1). In this section, the optimal localization pattern for a simple coupled model is sought by estimating the strength of cross correlation using an offline analysis cycle of the LETKF. Then, the localization pattern is tested in independent LETKF cycles with various ensemble sizes, and the accuracy of the resulting analysis is compared to that obtained with other localization patterns.

### a. Model

We test the correlation-cutoff method with a nine-variable, multi-time-scale coupled model proposed by Peña and Kalnay (2004). The governing equations of the model are as follows:

This coupled model consists of three components: a fast “extratropical atmosphere” (*x*_{e}, *y*_{e}, *z*_{e}), a fast “tropical atmosphere” (), and a slow “(tropical) ocean” (). Each component is a Lorenz (1963) three-variable model coupled by coefficients *c*, , and . The “ocean” is slowed down by a factor of 10 through *τ* to mimic the slower variations of the ocean. The extratropical atmosphere is only weakly coupled () with the tropical atmosphere, and the tropical atmosphere is strongly coupled () with the ocean. There is no direct coupling between the extratropical atmosphere and the ocean. The parameters are kept the same as in Peña and Kalnay (2004): . The model is integrated using the fourth-order Runge–Kutta scheme with time steps nondimensional time units.

Despite its extreme simplicity and computational efficiency, the multi-time-scale coupled model shares several important characteristics with the real coupled Earth system and is an excellent testbed for testing ideas for CDA problems. The model shows a chaotic behavior with two distinct regimes: the coupled tropical atmosphere and ocean cycle into a random number of “normal years” (between 2 and 7), interrupted by an “El Niño year” with large negative anomaly in *X*, before returning to normal years [see Fig. 2 of Peña and Kalnay (2004)]. Since this asymmetric oscillation neither occurs in the uncoupled tropical atmosphere nor ocean, it is regarded as an intrinsically coupled instability. Therefore, the model developers called the coupled tropical atmosphere and ocean as an ENSO-like coupled system. The extratropical atmosphere, on the other hand, behaves almost like an individual chaotic system because of the weak coupling with the other components. Norwood et al. (2013) examined the properties of the coupled model and showed it to have two positive, five negative, and two near-zero Lyapunov exponents.

### b. Data assimilation method

We use the LETKF (Hunt et al. 2007), one of the deterministic implementations of the ensemble Kalman filters classified as ensemble square root filters (EnSRFs; Tippett et al. 2003). The LETKF allows us to assimilate only a subset of the observations into the analysis of each variable.

According to Ng et al. (2011) and Trevisan and Palatella (2011), the dimension of the subspace spanned by perturbations is at most for a *K*-member ensemble and this dimension should be equal or larger than the number of nonnegative Lyapunov exponents. Hence, our coupled model needs at least five ensemble members to cover the entire unstable subspace of the model given its four nonnegative Lyapunov exponents. Therefore, we conduct our experiments with , and 10 ensemble members. With 10 members, the ensemble can span the entire 9-dimensional model space, so it represents a case with sufficient members. We expect the 4-member experiment to represent a case with insufficient members, and the 6-member experiment to represent an intermediate situation.

Although we also conducted some experiments with 100 members, the resulting temporal mean analysis RMS error was not qualitatively different from that of the 10-member experiments, and only the results from smaller ensemble experiments are shown below. Note that the LETKF is designed to provide the same analysis mean and analysis error covariance matrix as those of the Kalman filter for linear forward operators and with sufficient ensemble members to factorize the background error covariance matrix into ensemble perturbation matrices (Hunt et al. 2007). However, the premises may be violated if the model is biased or stochastic, or if the nonlinearity is significant (i.e., the errors are too large to neglect the second- and higher-order terms in the Taylor expansions of the nonlinear forward operators). In these difficult situations, larger ensemble size will be beneficial because the statistical sampling of the stochastic or nonlinear error growth becomes more accurate. The insensitivity of EnSRF’s averaged analysis error to excessive ensemble size is thoroughly discussed in Sakov and Oke (2008).

For covariance inflation, we use multiplicative adaptive inflation of Wang and Bishop (2003). The diagnosed inflation factor is first limited within and then temporarily smoothed with a forgetting factor [we follow the notation of Li et al. (2009)].

### c. Experimental settings

We test our method by performing identical twin experiments. The model [Eq. (8)] is started from random initial conditions and spun up by integrating for 25 000 time steps before saving the subsequent 75 000 time steps as the truth. Observations are produced by adding Gaussian noise to the truth with a mean of zero and standard deviation of and . The observations are available once every eight time steps, and only one variable in each component (, , *Y*) is observed to simulate a sparse observation network. We use the *y* variables here because the observations of *y* are the most informative when assimilated in the three-variable Lorenz model (Yang et al. 2006).

The ensemble members are initialized with random numbers (with different random seeds from the one used for the truth) and spun up for 25 000 time steps before starting the analysis cycle so that the background ensemble members for the first analysis are random samples on the model’s attractor. Analysis experiments are conducted for the subsequent 75 000 time steps, the same period as the one for which we have saved the truth and the observations. The analysis is updated every 8 time steps, and therefore, the observations are only available at the end of each window. Within the 75 000 time steps (9375 analysis windows), only the last 50 000 time steps (6250 analysis windows) are used for calculating the background error correlation (Fig. 2) and the analysis error (Fig. 3) in the following subsections because we are only interested in the filter performance after its initial transient.

### d. Offline experiment and error statistics

We first conduct an offline experiment to obtain the error statistics of the model. For this, we use the same analysis system as discussed in section 3c but with the truth, observations, and initial ensemble members independent from the main experiments. We use the fully coupled LETKF (*Full* pattern in section 3e) with members for this offline run.

For each pair of model variables and , we first calculate an instantaneous background error (perturbation) correlation from the ensemble at each analysis time *t*:

where is the *i*th model variable of the *k*th ensemble member at time *t*, and is the ensemble mean of the *i*th model variable at time *t*. Then we obtain the temporal mean of the squared correlation for each pair :

where is the number of assimilation windows used to estimate the error statistics.

Figure 2 shows the mean of squared error correlation for each pair of variables. In this model there are only weak error correlations () between the extratropical atmosphere and the other components, whereas the errors in the tropical atmosphere and the ocean are more strongly correlated (). These offline statistics suggest ignoring the cross covariance between the extratropical atmosphere and the other components and using only the background error covariance between the tropical atmosphere and the ocean (referred to as *ENSO-coupling* in section 3e) when performing data assimilation.

### e. Covariance localization

We test the five covariance localization patterns shown in Fig. 1.

*Full*is the standard SCDA in which every observation is assimilated into the analysis of every state variable.*Adjacent*uses the cross covariance only between directly interacting components. The cross covariances between the extratropical atmosphere and ocean are therefore ignored.*ENSO-coupling*is the pattern suggested by our theoretical analysis and the offline experiment. The observations of the ENSO-like coupled system (i.e., the tropical atmosphere and the ocean) are mutually assimilated, but the extratropical atmosphere is analyzed individually.*Atmos-coupling*analyzes the extratropical and tropical atmosphere together but the ocean separately. This pattern separately analyzes the fast and slow components.*Individual*analyzes each component individually. The background is updated by the coupled model, but the analysis step is individually implemented for each component, which is equivalent to WCDA for this three-component model.

## 4. Results

The resulting analysis errors are plotted in Fig. 3.

*Full* (standard SCDA) performs worse than *Individual* (WCDA) when the ensemble size is small (). The negative impact of SCDA in this case is expected given the rank deficiency and the resulting spurious correlations of the filter. As the ensemble becomes larger, *Full* gradually becomes better, whereas the analysis accuracy of *Individual* is not so sensitive to the ensemble size. This result suggests the importance of a larger ensemble for successful implementation of the strongly coupled LETKF.

As Eq. (6) indicates, the assimilation of any type of observations with the Kalman filter will, on average, not increase the analysis uncertainty if the background and observation error covariance matrices are accurately specified. When the ensemble size is sufficient, the background error is sufficiently small to ignore nonlinearity, and no localization is applied; thus, the assimilation of an observation uncorrelated with a state variable will be neither beneficial nor harmful since the LETKF converges to the Kalman filter. The number of ensemble members needed for successful implementation of SCDA will be highly model dependent and may be affected by other factors like the use of covariance inflation.

The *ENSO-coupling* pattern suggested by the correlation-cutoff method did perform best in essentially all experiments, as we expected. In comparison to *Individual* (WCDA) and *Atmos-coupling*, *ENSO-coupling* was superior regardless of the ensemble size. The inferior performance of *Individual* and *Atmos-coupling* is noticeable in the tropical atmosphere and the ocean, between which these inferior patterns ignore strong background error cross covariances. This comparison shows the importance of including the background error covariances between the tropical atmosphere and the ocean in this model. In contrast to *Full* (standard SCDA) and *Adjacent*, *ENSO-coupling* performed well with the smaller ensembles (), though all patterns using the cross covariance between the tropical atmosphere and the ocean performed similarly well when the ensemble size was large enough (). The inferior performance of *Full* and *Adjacent* with insufficient ensemble size is seen in all components and can be attributed to the use of spurious correlations between the extratropical atmosphere and the other components.

These comparisons support the *ENSO-coupling* pattern, or the decision of ignoring the weak cross covariance between the “extratropical atmosphere” and the other components while considering the strong covariance between the “tropical atmosphere” and the “ocean,” as suggested by Fig. 2.

## 5. Summary

We first derived a simplified equation for the expected analysis error reduction when assimilating an observation into the analysis of each model variable. The results from the tests with five different covariance localization patterns support the intuitive idea that SCDA benefits from using the cross covariance between variables of different components only when they have strong background error correlations.

We then experimentally showed that the use of cross covariance in the LETKF could be detrimental when the ensemble size is too small. This supports the claim of Han et al. (2013) that a large ensemble is needed to improve the analysis using the full cross covariance. With a limited number of ensemble members, localizing the background error covariance is essential to obtain a better analysis. We proposed the correlation-cutoff method: first, estimate the mean of the squared background error correlation with an offline run, then, uncouple the data assimilation if the background error correlation between the analyzed and the observed variables is weak. In our experiments with the nine-variable coupled model of Peña and Kalnay (2004), the correlation-cutoff method, intermediate to the standard SCDA and WCDA approaches, results in the best analysis and is the most robust to the choice of ensemble size.

Covariance localization guided by the correlation-cutoff method is a general idea to increase the signal-to-noise ratio of data assimilation. This method, however, is particularly important for the SCDA, where the correlation strength between different model components cannot be summarized by a simple function of distance, as represented by the carbon data assimilation of Kang et al. (2011). Although the distance-dependent localization (Hamill et al. 2001) showed great success in atmospheric and oceanic DA, it cannot deal with characteristics of the dynamics that are distance independent. On the other hand, the squared ensemble correlation is a norm-independent quantity between 0 and 1, which can be measured between any pair of observation and model variables. Furthermore, the method is also applicable before the implementation of SCDA; if a weakly coupled EnKF system has been already implemented, by measuring the squared ensemble correlations, one can assess the variance reduction that could be achieved by implementing the SCDA in advance. With these two characteristics, the correlation-cutoff method can be particularly useful for coupled EnKF applications.

In the toy model we used there was a clear distinction between strongly and weakly correlated pairs of variables (Fig. 2), and therefore, it was clear where to stop the coupled data assimilation. In more realistic configurations, the cutoff could be more complex. For example, Smith et al. (2017) estimated the coupled error correlation with a single column model and showed that strong cross correlations are mostly limited to the atmospheric boundary layer and the oceanic mixed layer. Such limited distribution of cross correlation was also observed with an operational coupled GCM (T. Sluka 2017, personal communication). We expect that in such realistic models, as in the simple coupled atmosphere–ocean model we tested, the squared background error correlation will provide guidance on where to stop the coupled assimilation.

## Acknowledgments

We gratefully acknowledge that T. Yoshida is supported by the Japanese Government Long-term Overseas Fellowship Program. We thank Dr. Travis Sluka and three anonymous reviewers for making several insightful suggestions that significantly improved the clarity of the paper.

## REFERENCES

*Applied Optimal Estimation.*M.I.T. Press, 374 pp.

## Footnotes

© 2018 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).