## Abstract

Coupled data assimilation uses a coupled model consisting of multiple time-scale media to extract information from observations that are available in one or more media. Because of the instantaneous exchanges of information among the coupled media, coupled data assimilation is expected to produce self-consistent and physically balanced coupled state estimates and optimal initialization for coupled model predictions. It is also expected that applying coupling error covariance between two media into observational adjustments in these media can provide direct observational impacts crossing the media and thereby improve the assimilation quality. However, because of the different time scales of variability in different media, accurately evaluating the error covariance between two variables residing in different media is usually very difficult. Using an ensemble filter together with a simple coupled model consisting of a Lorenz atmosphere and a pycnocline ocean model, which characterizes the interaction of multiple time-scale media in the climate system, the impact of the accuracy of coupling error covariance on the quality of coupled data assimilation is studied. Results show that it requires a large ensemble size to improve the assimilation quality by applying coupling error covariance in an ensemble coupled data assimilation system, and the poorly estimated coupling error covariance may otherwise degrade the assimilation quality. It is also found that a fast-varying medium has more difficulty being improved using observations in slow-varying media by applying coupling error covariance because the linear regression from the observational increment in slow-varying media has difficulty representing the high-frequency information of the fast-varying medium.

## 1. Introduction

Given the importance of the balance and coherence of different model components (or media) in coupled model initialization, it has been realized that for the purpose of climate estimation and model initialization, data assimilation should be performed within a coupled model framework (e.g., Chen et al. 1995; Zhang et al. 2007; Chen 2010). When the observed data in one or more media are assimilated into the model, information is exchanged among different media in the coupled system. Such an assimilation procedure is called coupled data assimilation (CDA). CDA is expected to produce self-consistent coupled state estimates and optimal initialization for coupled model predictions (e.g., Zhang et al. 2007; Singleton 2011).

In a CDA system, there exists two ways through which observed information in a medium can be transferred to the other media. One way is through fluxes between two coupled media, which is built in by coupled dynamics. Coupled dynamics transfer information between coupled media in a dynamically consistent way, but observational signals may be distorted by the imperfect model. The second way is to directly project observed information in one medium to another by coupling error covariance. Although coupling error covariance also uses the imperfect model information, the observational signals can be transferred to other media instantaneously. A coupling error covariance measures the covarying strength of two variables residing in different coupled media and thus is a key to project observed information from one medium to another. However, the time scales at which the flows vary in different media are different; therefore, it is difficult to accurately evaluate a coupling error covariance between two variables in different media. Apparently, if the evaluated coupling error covariance is not accurate enough, the direct observational adjustment resulting from the coupling error covariance may not improve the assimilation quality. How to use coupling error covariance to enhance observational constraints in CDA is an important and urgent research topic.

In this study, we employ an ensemble filter (e.g., Evensen 1994; Whitaker and Hamill 2002; Anderson 2001, 2003) that consists of a simple coupled model and the ensemble adjustment Kalman filter (EAKF; Anderson 2001) to examine the impact of the accuracy of the evaluated coupling error covariance on the quality of CDA. The simple coupled model includes a fast atmosphere and slow upper ocean coupled to a much slower deep ocean as well as an idealized sea ice component. Although the simple coupled model does not have complex physics as a coupled general circulation model (CGCM), it does characterize the interaction of multiple time-scale media in the climate system (Zhang et al. 2013). The simple model can help us clearly understand the essence of the problem we want to address here.

This paper is organized as follows. Section 2 briefly describes the simple coupled model and the ensemble filter. The impacts of coupling error covariance on the quality of CDA and climate prediction skill are investigated in sections 3 and 4, respectively. Conclusions and discussion are given in section 5.

## 2. Methodology

### a. Brief description of a simple atmosphere–ocean–ice coupled model

To clearly address the fundamental issues in the use of coupling error covariance for ensemble CDA raised in section 1, we employ a simple coupled model that consists of Lorenz’s chaotic atmosphere model (Lorenz 1963), the variations of upper-ocean and pycnocline depth (Zhang 2011a,b; Zhang et al. 2012), and a simple sea ice model (Zhang et al. 2013). The governing equations are

where the six model variables represent the atmosphere, the ocean, and the sea ice [*x*_{1}, *x*_{2}, and *x*_{3} are for the atmosphere (hereafter denoted by *x*_{1,2,3} if present together), *w* is for the slab ocean, *η* is for the deep-ocean pycnocline, and *ϕ* is for the sea ice concentration]. A dot above the variable denotes the time tendency. For the Lorenz atmosphere, the original parameters of *σ*, *κ*, and *b* with their standard values of 9.95, 28, and 8/3 sustain the chaotic nature of the atmosphere, while *c*_{1} represents the oceanic forcing on the atmosphere. For the equation of *w*, *O*_{m} is the heat capacity of the ocean and *c*_{2} represents the atmospheric forcing on the ocean. The parameters *c*_{3} and *c*_{5} denote the linear forcing of the deep ocean and the nonlinear interaction of the upper and deep oceans. The term *O*_{d} denotes the damping coefficient of the slab ocean variable *w*. The parameters *S*_{m} and *S*_{s} define the magnitudes of the annual mean and seasonal cycle of the forcings. The term *S*_{pd} is chosen as 10 so that the period of the forcing is comparable with the oceanic time scale, defining the time scale of the model seasonal cycle. The parameter *c*_{7} is the coupling coefficient from the sea ice to the slab ocean. For the pycnocline model, *η* represents the anomaly of ocean pycnocline depth, and its tendency equation is derived from the two-term balance model of the zonal–time mean pycnocline (Gnanadesikan 1999). The Γ is a constant of proportionality, while *c*_{4} and *c*_{6} represent the linear forcing of the upper ocean and the nonlinear interaction of the upper and deep oceans. Finally, the sea ice model is a simple nonlinear function that transfers an enthalpy space to the sea ice concentration space. While the details of the idealized sea ice model can be found in Zhang et al. (2013), here we briefly describe it as follows.

To solve the assimilation issue caused by the discontinuous distribution of sea ice concentration (Zhang et al. 2013), the enthalpy (*H* = *c*_{8}*x*_{2}^{2} + *c*_{9}*w*^{2} + *c*_{10}*ϕ*_{t−1})—an extremely simplified nonlinear function of *x*_{2}, *w*, and *ϕ*—is introduced to define the sea ice medium. The nonlinear transformation function from the enthalpy to ice concentration is

where *H*_{ig} and *H*_{im} represent the enthalpy thresholds at the point of ice genesis and ice maintenance, respectively, and *H*_{0} is a coefficient to adjust the shape of the curve between 0 and 1.

The model is set with the parameter values of (*σ*, *κ*, *b*, *c*_{1}, *c*_{2}, *c*_{3}, *c*_{4}, *c*_{5}, *c*_{6}, *c*_{7}, *c*_{8}, *c*_{9}, *c*_{10}, *O*_{m}, *O*_{d}, *S*_{m}, *S*_{s}, *S*_{pd}, Γ, *γ*, *H*_{ig}, *H*_{im}, and *H*_{0}) = (9.95, 28, 8/3, 0.1, 1, 0.01, 1, 0.01, 0.01, 0.1, 0.1, 10, 10, 10, 1, 10, 1, 10, 100, 0.25, 50, 10, and 80). By using a leapfrog time differencing scheme (Δ*t* = 0.01) with a Robert–Asselin time filter (Robert 1969; Asselin 1972), where the time filtering coefficient is set as *γ* = 0.25 and starting from the initial conditions (*x*_{1}, *x*_{2}, *x*_{3}, *w*, *η*, and *ϕ*) = (0, 1, 0, 0, 0, and 0), the model is first spun up for 10^{4} nondimensional time units (TUs; 1 TU = 100 time steps). Figure 1 shows the time series of *x*_{2}, *w*, *η*, and *H* [note that the time axis is set to reflect the different time scale for each variable and obviously that *x*_{2} (*η*) has the shortest (longest) time scale]. According to the formula of *H*, it has time scales of both *x*_{2} and *w*. Such a simple model system that has multiple time scales can facilitate our investigation of the use of coupling error covariance cross multiple media that have different time scales.

### b. Ensemble coupled data assimilation

In this study, we employ a derivative of the sequential ensemble Kalman filter EAKF (Anderson 2001) to implement the coupling error covariance adjustment. The assumption of independence of observational error allows EAKF to sequentially assimilate observations. While the sequential implementation provides much computational convenience for data assimilation, the EAKF maintains the nonlinearity of background flows as much as possible (Anderson 2003; Zhang and Anderson 2003). For the *k*th observation, denoted as *y*_{k}, the EAKF consists of the following two steps: the first step computes the observational increment [see Eqs. (2)–(5) in Zhang et al. (2007)], and the second step projects the observational increment onto related model variables using a uniform linear regression formula:

where Δ*y*_{k,i} represents the observational increment of *y*_{k} for the *i*th ensemble member. The term Δ*x*_{k,i} is the contribution of the *k*th observation to the model variable *x* for the *i*th ensemble member. Here, cov(*x*, *y*_{k}) denotes the error covariance between the prior ensemble of *x* and the model-estimated ensemble of *y*_{k}. The parameter *σ*_{k} is the standard deviation of the model-estimated ensemble of *y*_{k}. If *x* and *y* are from different media, then *x*–*y* coupling error covariance is involved.

### c. Biased twin experiment setup

To imitate the real world scenario with various model biases, we design a biased twin experiment that assumes the parameter error is the only source of model biases. The coupled model with the standard parameter values produces the true solution and observations. After the model is spun up for 10^{4} TUs, starting from the initial condition given in section 2a, observations are produced by adding a Gaussian white noise to the true values of *x*_{1,2,3}, *w*, and *ϕ* at the intervals of 5 time steps for *x*_{1,2,3} and 20 time steps for *w* and *ϕ*. These observational frequencies simulate the feature of the real climate observing system in which the atmospheric observations are available more frequently than the ocean and sea ice. According to the governing equation of *w*, the seasonal cycle is defined by 10 TUs, and therefore a model year (decade) is 10 (100) TUs. Also, to simulate the lack of deep-ocean measurements in the real observing system, no observation is available for *η*. The standard deviations of observational errors are 2 for *x*_{1,2,3}, 0.5 for *w*, and 0.1 for *ϕ*. When the sea ice observations are available, the observations of *H* are calculated according to the definition of *H*, by using the observations of *x*_{2}, *w*, and *ϕ*.

The assimilation model in which all physical parameters of *σ*, *κ*, *b*, *c*_{1}, *c*_{2}, *c*_{3}, *c*_{4}, *c*_{5}, *c*_{6}, *c*_{7}, *c*_{8}, *c*_{9}, *c*_{10}, *O*_{m}, *O*_{d}, *S*_{m}, *S*_{s}, *S*_{pd}, Γ, and *γ* are set to be 1% greater than their standard values is used in the assimilation experiments. A Gaussian white noise with the same standard deviations as the observational errors is added on the model state at the end of spinup to form the ensemble initial conditions from which the ensemble filtering data assimilation starts. The total data assimilation period is 10 000 TUs with the first 1000 TUs being the spinup of data assimilation. All statistics are computed using the results of the last 9000 TUs. A two time level adjustment scheme (Zhang et al. 2004), that is the observations at time *t* are allowed to adjust states at time *t* − 1 and *t* simultaneously, is employed in all data assimilation experiments in this study. Note that, throughout this study, we use coupling error covariance to define the error covariance of two variables residing in different media while using “cross error covariance” to represent the error covariance between two variables residing in the same medium.

The analysis is based on the verification of four experiments: 1) model control run (CTL) with no observational constraint; 2) single variable assimilation (SVA), in which the observations of a model variable are only allowed to impact this variable itself ; 3) single medium assimilation (SMA), in which multivariate assimilation is performed in each medium through the use of cross error covariance within the medium; and 4) multiple media assimilation (MMA), in which the coupling error covariance is used to assimilate one medium’s observations into the other media. Table 1 lists the details of the three schemes of SVA, SMA, and MMA, where the superscript “obs” represents observations and *H*^{obs} is the observation of enthalpy that is computed with the observations of *x*_{2}, *w*, and *ϕ*. Although a nonlinear relationship exists between *H* and *x*_{2}, *w*, and *ϕ*, we still simply set the standard deviation of observational errors of *H* as 3, without changing the essence of the problem we are addressing. In SVA, observations of *x*_{1}, *x*_{2}, *x*_{3}, *w*, and *H* are only allowed to adjust themselves. Based on SVA, observations of *x*_{1}, *x*_{2}, and *x*_{3} in SMA are further allowed to adjust (*x*_{2}, *x*_{3}), (*x*_{1}, *x*_{3}), and (*x*_{1}, *x*_{2}), respectively. Based on SMA, observations of *x*_{1}, *x*_{2}, *x*_{3}, *w*, and *H* in MMA are further allowed to adjust (*w*, *H*), (*w*, *H*), (*w*, *H*), (*x*_{1}, *x*_{2}, *x*_{3}, *η*, *H*), and (*x*_{1}, *x*_{2}, *x*_{3}, *w*), respectively. Thus, the difference between SMA and MMA is the use of only cross error covariance for SMA and both cross and coupling error covariances for MMA in the filtering equation [Eq. (3)]. From the view of the Kalman gain, SVA constrains the Kalman gain matrix to be diagonal, and SMA constrains the cross covariances between media to vanish, and MMA uses the full ensemble Kalman filter. It should also be noted that SVA, SMA, and MMA use the same observations.

To examine the impact of ensemble size on the quality of the evaluated coupling error covariance, we perform an array of experiments with different ensemble size (20, 50, 100, and 200) for all three data assimilation schemes. Since the coupling error covariance may require a large ensemble size to reach a high quality, we also examine some large ensemble sizes (e.g., 500, 1000, 2000, 5000, 10 000, and 15 000). Although many sophisticated algorithms of the inflation scheme (e.g., Anderson 2007, 2009; Li et al. 2009; Miyoshi 2011) exist for the filtering in an atmosphere model, the filtering inflation for a coupled model is a new subject due to the multiple time-scale nature of the system (more discussions on this issue will be given in section 3b). Furthermore, trial-and-error experiments show that the usual forms of inflation (e.g., only applying inflation to the atmosphere or equally inflating ensemble spreads of all variables) lead the analysis to increase to infinity, so the numerical calculation halts because of the overflow. Thus, for simplicity and convenience of comparison, in this study, we do not use any inflation for any of our assimilation experiments.

## 3. Impact of coupling error covariance on ensemble coupled data assimilation

### a. Sensitivity of the coupling error covariance with respect to ensemble size

We first examine the performances of SVA, SMA, and MMA with a default ensemble size (20) that has been used in previous studies (e.g., Zhang and Anderson 2003; Zhang 2011a,b; Liu et al. 2013). Figure 2 shows the root-mean-square errors (RMSEs) of *x*_{1,2,3}, *w*, *η*, and *H* for SVA, SMA, and MMA with this small ensemble size, where the RMSE of *x*_{1,2,3} is the arithmetical average of the RMSEs of *x*_{1}, *x*_{2}, and *x*_{3}. Here, the RMSE used is defined by

where *N* represents the number of analysis steps during the last 9000 TUs for the state variable *z* that is a unified notation of *x*_{1}, *x*_{2}, *x*_{3}, *w*, *η*, and *H*, and and *z*^{t} are the ensemble mean and the truth of *z*, respectively.

Note that the RMSEs in Fig. 2 have been normalized by the corresponding values for CTL (14.4 for *x*_{1,2,3}, 1.66 for *w*, 0.86 for *η*, and 66.94 for *H*) that are also computed according to Eq. (4). It clearly shows that SVA behaves best while MMA performs worst. To look into the details, we show the time series of RMSEs^{1} of *x*_{2}, *w*, *η*, and *H* (see Fig. 3). Note that since both the truth and CTL are spun up, then the variability of the RMSE of CTL is just the RMS difference of two randomly selected trajectories from the two runs (i.e., there is no initial condition information or observational constraint in the time evolution). SVA maintains a small and nearly stationary error for all model variables, while SMA introduces large oscillations into high-frequency variables (*x*_{1,2,3} and *H*) with the use of cross error covariances among *x*_{1}, *x*_{2}, and *x*_{3}. Compared to SMA, the coupling error covariances make MMA even more unstable for all variables whose errors are almost comparable to that of CTL. Therefore, for a small ensemble size, the introduction of cross error covariance or coupling error covariance may worsen the state estimate.

We perform an array of experiments for understanding the sensitivity of the qualities of cross error covariance and coupling error covariance with respect to the ensemble size (50, 100, 200, 500, 1000, 2000, 5000, 10 000, and 15 000) until the RMSEs of all variables converge (only a tiny change occurs as the ensemble size increases). Figures 4–6 show the time series of RMSEs for *x*_{2} (Figs. 4a, 5a, and 6a), *w* (Figs. 4b, 5b, and 6b), *η* (Figs. 4c, 5c, and 6c), and *H* (Figs. 4d, 5d, and 6d) in SVA, SMA, and MMA, with the ensemble sizes of 20 (blue), 2000 (black), and 15 000 (green). In SVA (Fig. 4), the RMSEs of all model variables cannot be distinguished from each other among these three ensemble sizes. In SMA (Fig. 5), evident fluctuations exist in the RMSE of all variables (especially the high-frequency variables) in the case of an ensemble size of 20. Again, with this small ensemble size, the *x*_{2} error is larger in SMA than in SVA. The degraded assimilation quality resulting from the introduction of cross covariance means that the cross error covariance among *x*_{1}, *x*_{2}, and *x*_{3} is poorly estimated. Because of the coupling effects, the RMSEs of *w* and *η* are also influenced. As the ensemble size increases, the amplitude of oscillations of RMSEs for *x*_{2} is greatly reduced, which also indirectly indicates that the qualities of cross error covariances among *x*_{1}, *x*_{2}, and *x*_{3} are sensitive to the ensemble size.

Based on SMA, the coupling error covariance is introduced into MMA, aiming to further refine the quality of model state estimation. However, compared to SMA, a small ensemble size introduces more noises to all variables (the blue curve in Fig. 6), which also indirectly demonstrates that the coupling error covariance is poorly estimated for a small ensemble size. Similar to SMA, a large ensemble size can effectively enhance the qualities of model variables in MMA (the green curve in Fig. 6) that justifies that the coupling error covariance is more sensitive to the ensemble size (in this case, the coupling error covariance is the error covariance among *x*_{1,2,3}, *w*, and *H*). More experiments show that the ensemble size by which a robust coupling error covariance is available for MMA is about 10 000–15 000.

Based on the above analysis, the sensitivities of the qualities of estimated model variables with respect to ensemble size in SVA, SMA, and MMA are shown in Fig. 7. We can see that MMA is most sensitive to the ensemble size compared to SVA and SMA. For a small ensemble size, the introduction of cross error covariance (SMA) and coupling error covariance (MMA) worsens the qualities of estimated model variables. For a large ensemble size, based on SMA, MMA can further refine the estimates for *w* and *η* while conversely enlarge the error of *x*_{2} and *H*. Here, in order to quantitatively evaluate the quality of cross (coupling) error covariance, a saturate ensemble size of 15 000 is chosen to act as a reference. Then the distances (measured by a Frobenius norm) of cross (coupling) error covariance in the ensemble sizes of 20, 50, 100, 200, 500, 1000, 2000, 5000, and 10 000 from the corresponding quantity with an ensemble size of 15 000, in SMA, are computed (Fig. 8) as follows

where *M* represents ensemble size and cov represents either cross covariance or coupling error covariance.

Note that the analysis intervals for *x*_{1}–*x*_{2} cross error covariance and *x*_{2}–*w* coupling error covariance are the observational intervals of *x*_{1,2,3} and *w*, respectively. As ensemble size increases, the accuracy of cross (coupling) error covariance is gradually enhanced, indicating that the quantity is sensitive to ensemble size.

### b. Impact of MMA on climate state estimation

Based on the above analysis, with an ensemble size of 15 000, all error covariances in SVA (only using autoerror covariance), SMA (using autoerror covariance and cross error covariance), and MMA (using autoerror covariance, cross error covariance, and coupling error covariance) are well evaluated by the model ensemble. Here, we examine whether the coupling adjustment from coupling error covariance can further improve the quality of model state estimates with an ensemble size of 15 000.

Figure 9 shows the time series of the RMSEs of *x*_{2} (Fig. 9a), *w* (Fig. 9b), *η* (Fig. 9c), and *H* (Fig. 9d) with this large ensemble size in SVA (red), SMA (black), and MMA (green). We can observe that the RMSEs of *w* and *η* for MMA are less than SVA and SMA. However, for an atmospheric variable such as *x*_{2} and sea ice variable such as *H*, the large oscillations still exist in MMA. Figure 10 displays the RMSEs of *x*_{1,2,3}, *w*, *η*, and *H* produced by SVA, SMA, and MMA. With the large ensemble size, the RMSEs of *x*_{1,2,3}, *w*, *H*, and *η* are greatly reduced in all assimilation schemes compared to their counterparts with an ensemble size of 20 (see Fig. 2). In particular, because of the use of well-estimated coupling error covariance, the errors of state estimates for all media in MMA are dramatically reduced. It is observed that for an ensemble size of 15 000, while the RMSEs of *w* and *η* in MMA are smaller than those in SVA and SMA, the errors of *x*_{1,2,3} and *H* in MMA are a little larger than that in SMA, especially for *x*_{1,2,3}. This means that while MMA significantly improves the estimate of low-frequency variables, the use of observations in low-frequency media based on the coupling error covariance has difficulties producing extra improvement for the fast-varying medium that has strong nonlinearity.

As the qualities of estimated model variables have been enhanced greatly by the large ensemble size, the qualities of cross error covariance and coupling error covariance must be improved accordingly. We may attribute the larger errors of *x*_{1,2,3} and *H* in MMA than in SMA to the following three reasons: 1) large observational interval of *w*, 2) no inflation applied to the assimilation, and 3) the local linear regression property in the second step of EAKF mentioned in section 2b.

We conduct more experiments to examine the sensitivity of the quality of the atmosphere and sea ice state estimates in MMA on different observational intervals of *w*. The results show that increasing the frequency of *w* observations does not improve the qualities of estimated *x*_{1,2,3} and *H*. To enhance the representation of the model ensemble to the model uncertainty, an inflation technique is usually used to enlarge the spread of the model ensemble in the filtering with an atmospheric model (e.g., Anderson 2007, 2009; Li et al. 2009; Miyoshi 2011). However, in a coupled model, since multiple time scales are involved to develop low-frequency signals, how to apply inflation to different time-scale components so that the slow-varying signals would not be adversely impacted is a new research topic (e.g., Zhang and Rosati 2010). Our test experiments show that solely introducing the inflation factor into the high-frequency medium (*x*_{1,2,3}) makes the ensemble increase to infinity since the simple inflation artificially changes the interaction of different media. Furthermore, to understand the potential role of inflation in improving the quality of coupling error covariance, we examine the dependence of standard deviations of variables residing in fast-varying and slow-varying media and their correlation on the ensemble size. For example, we show the variations of *x*_{2} and *w* standard deviations as well as their correlation and error covariance in the ensemble size space in Fig. 11 in the SMA case. From Fig. 11, we find that the *x*_{2} (*w*) standard deviation converges fastest (slowest) as the ensemble size increases while the convergent rate of the *x*_{2}–*w* correlation stays in between. These results suggest that an appropriate inflation scheme may enhance the accuracy of coupling error covariance to some degree by increasing the ensemble spread. However, because of the dependence of the correlation of different time-scale variables on the ensemble size, we may not expect that inflation resolves the issue of coupling error covariance entirely. Given the restriction of the ensemble size, designing an appropriate inflation scheme to enhance the quality of coupled model data assimilation could be a meaningful research topic.

To understand the influence of the local linear regression property in the second step in the EAKF on the state estimates for a medium with strong nonlinearity, we expand a model adjustment increment Δ*x* with respect to an observational increment Δ*y*:

Equation (3) is a linear (i.e., the first order) approximation of Eq. (6), where serves as the linear regression coefficient in Eq. (3). For SMA, because the atmospheric observation and the model variables being adjusted have the same time scale, the local least squares regression can well approximate Eq. (6). Therefore, SMA can further refine the state estimate of the atmosphere compared to SVA. For MMA, if the low-frequency observation (*w*) is used to adjust the fast and nonlinear atmosphere or sea ice, the first-order approximation cannot include high-frequency (nonlinear; i.e., the higher order) information. So the coupling error covariance cannot improve the fast and nonlinear atmosphere and sea ice. However, when the high-frequency observations (*x*_{1,2,3} and *H*) are used to adjust the slow-varying variable (*w*), the linear regression can filter the high-frequency noise and preserve the low-frequency signals. Under this circumstance, the use of coupling error covariance can enhance the quality of state estimation for slow-varying media.

Figure 12 gives an example in which MMA fails to constrain the transient atmospheric attractor using an ensemble size of 15 000, where the black dots, and red, black, and green curves represent the observation, truth, and analysis fields of SMA and MMA, respectively. The inappropriate linear fitness from low-frequency observational increment of slow-varying variables (*w* for this case) to the adjustment of fast-varying variables (*x*_{1,2,3} for this case) leads to a larger error in MMA (compared to SMA) in this particular transition period of attractor lobes.

## 4. Impact of coupling error covariance on climate prediction

To evaluate the impact of MMA on climate predictions, a suite of forecast experiments (20 for this case) are initialized from the coupled states of the assimilation results during 6300–6490 TUs every 10 TUs for each assimilation scheme with an ensemble size of 15 000. Each forecast is integrated forward up to 200 TUs (20 yr). The anomaly correlation coefficient (ACC) and the RMSE of the ensemble mean are used to assess the quality of prediction. Here, the formulas of ACC and RMSE at a given lead time are, respectively,

and

where *L* represents the number of forecast cases (20), *z*_{i} and denote the ensemble mean and the truth of the forecast variable *z* for the *i*th forecast case, and and stand for the means of *z*_{i} and . An ad hoc value of 0.6 for the ACC is used to show the valid lead time of forecasts (Hollingsworth et al. 1980).

Figure 13 shows the variations of ACCs (Figs. 13a,c) and RMSEs (Figs. 13b,d) with the forecast lead time for the upper-ocean *w* (Figs. 13a,b) and the deep-ocean *η* (Figs. 13c,d) in SVA (red), SMA (black), and MMA (green). Because of the improvement of state estimates from multivariate adjustments in the atmosphere, the valid forecast skill of *w* in SMA is 1 TU longer than that in SVA (from 6 TUs in SVA to 7 TUs in SMA). The use of the coupling error covariance in MMA further extends the valid forecast skill of *w* to 9 TUs, and the RMSEs of *w* are consistently reduced by MMA from SMA.

For the deep-ocean variable *η* whose characteristic time scale is 100 TUs (i.e., 10 yr), although no direct observation is available to adjust it, the coupled assimilation in MMA can also extend its valid forecast lead time and produce lower RMSEs throughout the entire forecast lead time compared to SVA and SMA (Figs. 13c,d).

## 5. Summary and discussion

A coupling error covariance measures the covarying strength of two variables residing in different media that have different time scales in a coupled climate system. The role of coupling error covariance in coupled data assimilation (CDA) has been thoroughly examined using an ensemble filter together with a simple coupled model consisting of a Lorenz atmosphere and a pycnocline ocean model. The simple model characterizes the interaction of multiple time-scale media in the climate system and thus is an appropriate tool to test the impact of coupling error covariance on CDA. Results show that the improvement of coupled assimilation quality with the use of coupling error covariance strongly depends on the accuracy of the evaluated coupling error covariance—only when the ensemble size is very large can the direct observational adjustments resulting from the coupling error covariance improve the assimilation quality and enhance the climate prediction skill. It is also found that the use of coupling error covariance can efficiently improve the estimate of slow-varying media while a quickly changing medium such as the atmosphere has more difficulty being improved by observations in slow-varying media. These results suggest that when the real climate observations are assimilated into a coupled general circulation model (CGCM) for climate estimation and model initialization, a robust coupling error covariance should be guaranteed (using a large ensemble size in this case) before applying coupling error covariance to the assimilation.

From this simple model study, we learned that when a coupling error covariance is used in CDA to directly project observed information from one medium to another, the high accuracy of coupling error covariance is the key. Because of complex physics, a CGCM must have a different performance on the quality of coupling error covariance. When a CGCM is used to assimilate the atmospheric and oceanic observations for climate estimation and model initialization, how to enhance the accuracy of coupling error covariance is an important research topic. For example, an appropriate inflation scheme with physical balances among different time-scale media may improve the quality of coupling error covariance to some degree and reduce the saturate ensemble size for coupling error covariance from a practically impossible value to a reasonable one. Given the climate observing system, once the well-described coupling error covariance is available, it is promising to increase the observational constraints and improve the quality of climate estimation and model initialization through the application of coupling error covariances. Additionally, the impacts of the observational interval and the analysis update cycle length on the three schemes of CDA performed in this paper should be investigated in follow-up studies.

## Acknowledgments

This research is cosponsored by the grants of the National Basic Research Program (2013CB430304), National Natural Science Foundation (41030854, 41106005, 41176003, and 41206178), National High-Tech R&D Program (2013AA09A505), and National Polar Science Strategic Research Foundation (20100209) of China. This research is also sponsored by the NSF Grant of the United States (0968383).

## REFERENCES

**30,**1235–1248,

*Proc. WMO/IUGG Symp. on NWP,*Tokyo, Japan, Japan Meteorological Society, 19–24.

**24,**6210–6226.

**26,**2368–2378.

## Footnotes

^{1}

Note that the RMSE in Figs. 3–6 and 9 is different from that in Figs. 2, 7, and 10 that use Eq. (4). At a given analysis step the RMSE is normally defined as , where *K* represents the number of state variables or spatial grids and *i* indexes variable or grid. For the one-dimensional model in this study, the RMSE of a single variable at the given analysis step degrades to the absolute error (i.e., ). Thus, when the time series of RMSE are mentioned for a single variable, we actually mean the absolute error.