## Abstract

This study addresses the issue of model errors with the ensemble Kalman filter. Observations generated from the NCEP–NCAR reanalysis fields are assimilated into a low-resolution AGCM. Without an effort to account for model errors, the performance of the local ensemble transform Kalman filter (LETKF) is seriously degraded when compared with the perfect-model scenario. Several methods to account for model errors, including model bias and system noise, are investigated. The results suggest that the two pure bias removal methods considered [Dee and Da Silva (DdSM) and low dimensional (LDM)] are not able to beat the multiplicative or additive inflation schemes used to account for the effects of total model errors. In contrast, when the bias removal methods are augmented by additive noise representing random errors (DdSM+ and LDM+), they outperform the pure inflation schemes. Of these augmented methods, the LDM+, where the constant bias, diurnal bias, and state-dependent errors are estimated from a large sample of 6-h forecast errors, gives the best results. The advantage of the LDM+ over other methods is larger in data-sparse regions than in data-dense regions.

## 1. Introduction

After more than 10 years of research, variants of the ensemble Kalman filter (EnKF) proposed by Evensen (1994) are now becoming viable candidates for the next generation of data assimilation in operational NWP. The advance is primarily due to the fact that 1) they include a flow-dependent background error covariance; 2) they are easy to code and implement; and 3) they automatically generate an optimal ensemble of analysis states to initialize ensemble forecasts. Many studies to date have tested EnKF systems under the perfect-model assumption with simulated observations, and only within the last few years have there been tests of the EnKF assimilating real atmospheric observations (e.g., Houtekamer et al. 2005; Whitaker et al. 2008; Szunyogh et al. 2008; Torn and Hakim 2008; Meng and Zhang 2008). Notably, an EnKF has been operational since 2005 in the Canadian Meteorological Centre (Houtekamer and Mitchell 2005) and when using the same model and observations, the EnKF scores are about the same as the scores with a four-dimensional variational data assimilation (4DVAR) system (see online at http://4dvarenkf.cima.fcen.uba.ar/Download/Session_7/Intercomparison_4D-Var_EnKF_Buehner.pdf).

Numerical weather forecast errors grow as a result of errors in the initial conditions and model deficiencies. Model errors can result from approximate parameterizations of physical processes, the failure to represent subgrid-scale events, and numerical discretization, etc. When preparing a data assimilation scheme for the inclusion of real observations, assuming that the model is perfect is overly optimistic (Dee 1995). In this case, the “true” forecast error covariance should include uncertainties from both inaccurate initial condition and model errors. As a result, if the background error covariance in the EnKF attributes errors only to the initial conditions, it will be smaller than the true forecast error covariance. To address this problem, approaches include inflating the background error covariance with multiplicative inflation (Anderson and Anderson 1999), additive inflation (Mitchell and Houtekamer 2000; Mitchell et al. 2002; Hamill and Whitaker 2005; Corazza et al. 2007; Whitaker et al. 2008) and the covariance relaxation method (Zhang et al. 2004). Though these techniques are easy to implement, they account for model errors in the second moment of the ensemble, not the mean. Other methods have been proposed to deal with the mean of model errors. Fujita et al. (2007) and Meng and Zhang (2007, 2008) reported that using different physical parameterization schemes resulted in both better background error covariance and mean estimates in their mesoscale EnKF experiments. Aksoy et al. (2006a,b) proposed a simultaneous state and parameter estimation method to estimate uncertain model parameters in an EnKF environment. Dee and da Silva (1998) proposed a method for the online estimation and correction of model bias. This bias correction method [i.e., the Dee and da Silva bias estimation method (DdSM)] has been successfully tested, for example, by Dee and Todling (2000), Carton et al. (2000), Chepurin et al. (2005) in three-dimensional variational data assimilation (3DVAR) and optimal interpolation (OI) data assimilation systems, and by Keppenne et al. (2005) in an EnKF system. Recently, Baek et al. (2006) developed another bias correction method similar to the DdSM except that it allows for “adjusting” the observations, rather than the model bias. It also accounts for the cross correlation of uncertainties in model state and bias ignored by the DdSM. They successfully tested this approach with the Lorenz-96 model. However, these two bias correction methods assume that model errors are constant in time, resulting in estimates of only the slowest varying component of model errors. In reality, model errors are likely to vary with time (e.g., errors in the diurnal cycle) or with the state of the atmosphere (e.g., biases are different during an El Niño episode).

Danforth et al. (2007) proposed an approach where the state-independent model error (bias) was estimated from the average of a large ensemble of 6-h forecast minus analysis fields (i.e., 6-h apparent forecast errors). Diurnal errors were estimated from the dominant empirical orthogonal functions (EOFs), and state-dependent errors were determined using the leading terms in a singular value decomposition (SVD). They found this low-dimensional method (LDM), where the state-dependent errors are expressed in terms of very few degrees of freedom (d.o.f.), to be very successful and computationally efficient. This method has been successfully tested with the coupled Lorenz-96 model (Danforth and Kalnay 2008) and the simple but realistic global Simplified Parametrization Primitive Equation Dynamics (SPEEDY) model (Danforth et al. 2007). However, neither study addressed data assimilation; the initial conditions were taken to be a system trajectory and the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis, respectively. Here we expand the application of the LDM to more realistic situations where the forecast–analysis is cycled and, as a result, forecast error includes both model error and dynamical growing error due to the imperfect initial condition.

In this study, we investigate the performance of an ensemble Kalman filter in a perfect model and in the presence of significant model errors, and then compare approaches for dealing with model errors in ensemble data assimilation. The local ensemble transform Kalman filter (LETKF; Hunt et al. 2007) is used as a representative of other EnKF systems. A review of the LETKF is given in section 2. The different techniques for treating model errors are described in section 3. Section 4 reports the performance of the LETKF in a perfect-model scenario. In section 5, the LETKF performance in the presence of model errors due to assimilating observations generated from the NCEP–NCAR reanalysis fields is examined. Two inflation schemes (multiplicative and additive inflation) and two bias correction methods (DdSM and LDM) are applied to account for and/or correct model errors. Their results are compared and discussed in both a uniform and a rawinsonde-like observation network. Section 6 gives our summary and discussion.

## 2. LETKF data assimilation scheme

The LETKF is one of the ensemble square root filters (Tippett et al. 2003) in which the observations are assimilated to update only the ensemble mean by

where **x**^{a} and **x **^{f} are the ensemble mean of analysis and background (forecast) respectively; 𝗞 is the Kalman gain; **y**^{o} is the observations; and *H*() is the observation operator. Here we use the superindex *f* rather than *b* to denote the background (forecasts) in order to avoid confusion with the subindex *b* used to denote the bias in the next section.

The ensemble perturbations are updated by transforming the background perturbations through a transform matrix 𝗧, as proposed by Bishop et al. (2001):

where 𝗫* ^{a}* and 𝗫

*are the analysis and background ensemble perturbations (matrices whose columns are the difference between the ensemble members and the ensemble mean), and the error covariances are given by 𝗣*

^{f}^{a,f}= 𝗫

^{a,f}(𝗫

^{a,f})

^{T}.

In the LETKF, the Kalman gain and transform matrix are given by

where **P̃**^{a}, the analysis error covariance in ensemble space, is given by

with dimension *K* × *K*, where the ensemble size *K* is usually much smaller than both the dimension of the model and the number of observations. As a result, the LETKF performs the analysis in the space spanned by the forecast ensemble members, which greatly reduces the computational cost. The LETKF is computed locally for each grid point by choosing the observations that will influence that grid point. More details about the LETKF are available in Hunt et al. (2007) and Szunyogh et al. (2008).

## 3. Methods to deal with model errors

### a. Multiplicative inflation

Multiplicative inflation simply inflates the ensemble error covariance 𝗣* _{e}^{f}* by a factor 1 + Δ to approximate the true error covariance 𝗣

*:*

^{f}where Δ is a tunable parameter. Equation (6) provides for an increase in the ensemble covariance 𝗣* _{e}^{f}* to account for the model errors not included in the original 𝗣

*. This method implicitly assumes that model errors have the same error structure as the internal errors so that their error covariance is proportional to the dynamically evolved error covariance 𝗣*

_{e}^{f}*.*

_{e}^{f}### b. Additive inflation

Additive inflation parameterizes model errors by adding random perturbations with a certain covariance structure to each ensemble member. Following Whitaker et al. (2008), in this study we randomly select samples from a subset of NCEP–NCAR reanalysis (NNR; Kalnay et al. 1996) 6-h tendency fields in January and February for the years 1982–86. Unlike random numbers, these randomly selected tendency fields are geostrophically balanced. In each analysis cycle, we randomly select *K* tendency fields, remove their mean, scale these zero-mean fields (we tune the scaling factor), and add each of these scaled fields to one background ensemble member:

so that

and denote

Here *k* is the index for each ensemble member, denotes the *k*th ensemble forecast, **q*** _{k}* is the field added to ensemble member

*k*, and

*r*is the globally uniform scaling factor, a tunable parameter. Our procedure is similar to that of Whitaker et al. (2008), ensuring that the added fields will only enlarge the background error covariance by 𝗤 and will not change the ensemble mean.

### c. DdSM

Dee and da Silva (1998) developed a two-stage bias estimation algorithm, in which the estimation procedures for the bias and the state are carried out successively. At the first step bias is estimated on every model grid point by

where the matrix and is the forecast error covariance for the state variables and for the bias, respectively.

In practice the bias forecast error covariance is unknown, so that following Dee and da Silva (1998) we assume that

where *α* is a tunable parameter.

In the second step, the analysis for the state variables is obtained using the standard analysis procedure with the unbiased forecast state **x *** ^{f}* −

**b**

*:*

^{a}As for the bias forecast model, following Carton et al. (2000) we use damped persistence:

where *i* denotes time step, and *μ* < 1 is a tunable parameter.

The cost of the DdSM is about twice that of no bias estimation, since the updated equations are solved twice, first for the bias estimation and then for the state variables. However, this doubling of the cost can be avoided if *α* ≪ 1 in which case Eq. (13) becomes

Reversing the order of the bias estimation step and that of the state analysis step, we obtain a simplified version of the DdSM (Radakovich et al. 2001):

In this approach the computation of Eq. (19) is almost cost free after the state analysis **x*** ^{a}* has been updated by Eq. (18), since 𝗞

*[*

_{x}**y**

*−*

^{o}*H*(

**x**

*−*

^{f}**b**

*)] is simply the analysis increment for the state variables.*

^{f}Above we give the general equations for the DdSM and its simplified version. In the application of the DdSM to an EnKF system, no additional ensemble members are required for the bias since the bias forecast error covariance is obtained directly from the state forecast error covariance . Therefore, the term **x *** ^{f}* in Eqs. (10) and (19) is actually the ensemble mean forecast.

### d. LDM

We assume the NNR as a reference field to approximate the true atmosphere and create a reference 6-h model forecast initialized from the NNR without data assimilation. The (apparent) 6-h forecast error **x*** ^{e}* is then defined as the difference between the reference forecast and the NNR valid at the same time.

The low-dimensional scheme (Danforth et al. 2007) assumes that model error **x*** _{m}^{e}* has three components, namely the bias, periodic errors dominated by errors in the diurnal cycle, and model errors that are state-dependent:

in which *t* denotes the time step. The forecast bias **b** is obtained by averaging the forecast errors **x*** ^{e}* over a certain training time period

**b**= 〈

**x**

^{e}〉, and the leading EOFs

**e**

*from the anomalous error field*

_{l}**x**

^{e′}=

**x**

^{e}− 〈

**x**

^{e}〉 are used to estimate diurnal or other periodic errors. The state-dependent model error component is given by the leading SVD modes

**f**

*of the covariance of the coupled model state anomalies*

_{n}**x**

^{f ′}=

**x**

^{f}− 〈

**x**

^{f}〉 and corresponding error anomalies

**x**

^{e′}. Here

*L*and

*N*are the number of retained leading modes of EOFs and SVDs, respectively. The spatial fields

**b**,

**e**

*, and*

_{l}**f**

*are time independent and computed offline using the training period samples. We call this approach low-dimensional because the spatial shape of the model errors is preestimated separately and only the amplitudes*

_{n}*β*(

_{l}*t*) and

*γ*(

_{n}*t*), which have a much lower dimension (

*L*and

*N*) than the full model dimension, are estimated online.

During the training period, the time series of *β _{l}* is calculated by projecting

**x**

^{e′}onto the EOFs

**e**

*. Since*

_{l}*β*is dominated by the bias in the diurnal cycle, the time-dependent

_{l}*β*(

_{l}*t*) can be estimated by averaging the

*β*over the diurnal cycle in the training period. For example, we can use the samples in the training period to calculate the average of

_{l}*β*for 0000, 0600, 1200, and 1800 UTC separately and time interpolate them in the current time step. For the calculation of the state-dependent component using SVD, the readers are referred to Danforth et al. (2007).

_{l}## 4. The LETKF performance in perfect-model experiments

The AGCM used in this study, SPEEDY (Molteni 2003), solves the primitive equations for prognostic variables of zonal wind (*u*), meridional wind (*υ*), temperature (*T*), specific humidity (*q*), and surface pressure (*p _{s}*) at triangular truncation T30, corresponding to 96 × 48 grid points at each of the 7 sigma levels. First, we assume a perfect model where a nature run is generated by integrating the SPEEDY model from 0000 UTC 1 January 1987 to 1800 UTC 15 February 1987. The observations are simulated by adding normally distributed random noise to the nature run, and are available at each model grid point for

*p*and every second model grid point for

_{s}*u*,

*υ*,

*T*,

*q*, in both the zonal and meridional directions (i.e., 25% of the number of model grid points). The amplitudes of the observation errors are 1 m s

^{−1}for

*u*,

*υ*wind, 1 K for

*T*, 10

^{−4}kg kg

^{−1}for

*q*, and 1 hPa for

*p*. For each forecast–analysis cycle, the SPEEDY model is used to generate the 6-h forecast and the observations are assimilated by the LETKF with 30 ensemble members and a Δ = 0.05 multiplicative inflation parameter to deal with sample errors in the covariance. Figure 1 shows the time series of analysis root-mean-square error (RMSE) with respect to the nature run, averaged over the whole globe, for zonal wind (

_{s}*u*), geopotential height (

*Z*), temperature (

*T*), specific humidity (

*q*) at 500 hPa, and surface pressure (

*p*). It is clear that after the initial spinup period, the analysis RMSE for all the variables is much smaller than the observational error standard deviations.

_{s}To investigate why the LETKF performs well, the background ensemble spread is compared with the background error. Carrying out perfect-model experiments, the true background error with respect to the nature run can be calculated. In order for the LETKF to perform well, the ensemble spread should be representative of the true background error. Figure 2 shows the 500-hPa height background ensemble mean error field (shaded) and the ensemble spread (contour) at 1200 UTC 3 February 1987, an arbitrarily chosen time. Although noisy, these two fields generally agree with each other in location. The background RMSE of the height field and the corresponding spread averaged over the whole globe are then compared in Fig. 3. The spread is slightly smaller at low levels and slightly larger in the upper levels compared to the background RMSE, but they are still quite close. Figures 2 and 3 together demonstrate that the spread among 30 ensemble members has captured the true background error well in both structure and magnitude.

## 5. Accounting for and correcting model errors in the LETKF

To assess the performance of the LETKF in the presence of model errors, we replace the nature run in the perfect-model experiments by the NNR fields. Since the NNR assimilated real observations, we assume that the NNR fields are an approximate estimate of the unknown true atmosphere (a quantitative validation of this assumption is beyond the scope of this research). Random noise with the same standard deviation used in the perfect-model experiments is added to simulate “NNR observations.” The density of observations remains the same as that in the perfect-model experiments.

### a. Effects of model errors on the LETKF

Serving as a benchmark, the control run is carried out using the same configuration (Δ = 0.05 and 30 ensemble members) as that in section 4 but replacing the observations generated from the nature run with the NNR observations. No additional method is applied to deal with model errors. After an initial spinup period of 15 days, the analyses and forecasts are verified against the NNR.

The strong negative influence of the model errors on the performance of the LETKF is very clear in the control run (results not shown). In the presence of model errors and without accounting for them, the 500-hPa height analysis RMSE increases from 2.4 (in the perfect model) to 50 m because of the model errors and their accumulated effects. An increase by more than an order of magnitude is also observed for other variables and regions. However, it should be noted that with a more sophisticated and high-resolution numerical model, such as those currently used in operational centers, the negative impact of model errors would be much smaller.

The background ensemble spread of the height field is plotted in Fig. 4 to investigate why the LETKF performs poorly in the presence of large model errors. It is worth noting that the spread is very close to that obtained in the perfect-model experiment, while ideally it should be as large as the actual forecast error, as it happens for a perfect model. This result indicates that when using the same model, the forecast ensembles are blind to model errors, and therefore the ensemble spread underestimates the actual forecast error, leading to excessive confidence in the forecasts, less weight given to the observations, and, as a result, to the poor performance of the LETKF.

### b. Accounting for and correcting model errors

The methods described in section 3 are then applied to the SPEEDY-LETKF system to account for model errors. As in the control run, experiments are implemented for the period 0000 UTC 1 January–1800 UTC 15 February 1987 and the verification statistics are computed for analyses and forecasts against the NNR fields after the initial spinup period of 15 days.

#### 1) Multiplicative inflation versus additive inflation

Figure 5 shows the analysis RMSE of 200-hPa *u*, 500-hPa *Z*, 850-hPa *q*, and 925-hPa *T* fields, from multiplicative inflation (with an optimal parameter of Δ = 1.5), additive inflation (with an optimal amplitude of *r* = 1.5), and the control run. Both inflation schemes result in much better analyses than the control run for all the fields. As found in previous studies (e.g., Hamill and Whitaker 2005; Whitaker et al. 2008), additive inflation outperforms multiplicative inflation.

#### 2) Dee and da Silva method combined with additive inflation (DdSM+)

The DdSM aims to estimate and correct model bias, but does not account for state-dependent and random errors. Since the performance of the additive inflation is better than that of the multiplicative inflation, we combine the DdSM with additive inflation to account for system noise. The additive noise is obtained in the same manner as for the additive inflation scheme. We refer to the DdSM augmented with additive noise as DdSM+.

Recall that there are two variables (*μ* and *α*) to be tuned in the pure DdSM. If additive inflation is used to model the system noise, the amplitude (*r*) of additive noise is another parameter that also needs to be tuned. To simplify the task of tuning these three parameters, first we fix *α* = 0.5 [following the recommendation of Dee and da Silva (1998)] and *μ* = 1.0 (assuming a persistence model for bias prediction) and then tune the amplitude (*r*) of the additive noise. We start at 0000 UTC 1 January 1987 by assuming zero bias and run the SPEEDY–LETKF system. However, no matter how small *r* is, the filter diverges, especially for temperature fields in the lower levels, and the bigger *r* is, the faster the divergence. Using the same SPEEDY–LETKF system but with pure DdSM (*r* = 0), Miyoshi (2005) observed similar filter divergence for all choices of *α* when fixing *μ* = 1.0. Thus, we let *μ* be less than 1 to reduce the impact of bias from the previous time step and find that *μ* = 0.9 is successful for a wide range of choices for *r* and gives better results than *μ* = 0.8. Fixing *μ* = 0.9, the pairs of (*α*, *r*) are tuned. The results in terms of 500-hPa RMSE computed over the 1-month period after the initial 15-day spinup period are summarized in Table 1. It is clear that accounting for random system noise is essential in order for the LETKF to perform well. Without additive noise, the pure DdSM (*r* = 0) is not competitive with pure additive inflation (*α* = 0) with an optimal amplitude of *r* = 1.5. Using a small additive noise (*r* = 0.25), the DdSM+ outperforms the pure additive inflation scheme, but the optimal choice of *α* is large (*α* = 0.75). When *r* is increased to 0.5, the value of the optimal *α* reduces to 0.5. These results can be better understood by the expression of where *α* is an explicit parameter and *r* is an implicit factor (through affecting ) to determine the bias forecast error covariance . When *r* is small, the system requires a large value of *α* to obtain an optimal , while as *r* increases, the optimal value of *α* decreases because the forecast error covariance for the state variables has already been increased. By increasing *r* from 0 to 0.5, a large improvement is found. Beyond *r* = 0.5, there is little further improvement. Therefore, we choose (*μ* = 0.9, *α* = 0.5, *r* = 0.6) as the optimal setting of the parameters for the DdSM+.

#### 3) LDM with additive inflation (LDM+)

Since experiments are carried out for January and February in 1987, the 5-yr climatology of the same months for the years 1982–86 is chosen as the training period, following Danforth et al. (2007). In this training period, the 6-h SPEEDY forecasts initialized with the NNR fields are conducted; the samples of forecast error **x*** ^{e}* are then obtained by taking the differences between the SPEEDY 6-h forecasts and the NNR fields valid at the same time.

Three types of model errors in Eq. (20) with LDM are corrected but, as was the case with the pure DdSM, the pure LDM without accounting for system noise was unable to beat pure additive inflation (Fig. 6). To parameterize system noise, randomly selected NNR 6-h tendency fields are added to each background ensemble member and their amplitude is tuned as was done in the DdSM+ scheme. As seen in Fig. 6, the LDM, plus a small amount (*r* = 0.4) of additive noise (hereinafter LDM+) significantly outperforms the pure additive inflation scheme, suggesting the necessity to deal with both model bias and system noise in the presence of complicated model errors.

#### 4) Overall comparison

Finally all the methods with their optimal configurations are compared with each other. As before, the results are verified against the NNR fields.

##### (i) Analysis verification

Figure 7 shows the analysis RMSE comparison. Among all methods, the LDM+ gives the best results: it outperforms other methods in all the fields throughout all pressure levels, especially at lower levels. The DdSM+ generally outperforms both pure inflation schemes. Though the pure multiplicative inflation scheme produces the worst results of the four methods, it should be noted that all the methods have made major analysis improvements compared to the control run (gray dotted curve in Fig. 8). These results indicate that accounting for model errors is essential for the EnKF and correcting model biases is, in general, better than only accounting for their effects in the second moment of the ensemble, assuming we have a good method for estimating them.

##### (ii) 48-h forecast verification

So far we have focused on the comparisons in terms of the analysis accuracy. However, the goal of developing more accurate analyses is to improve the short-term forecasts. Within an imperfect model, the short-term forecast errors come from both growing errors in the initial condition and model deficiencies. Although the model errors could be corrected within the forecast model, here we would like to see if the advantage of one method in the data assimilation process can be retained over a forecast period. Otherwise, there would be no benefit in improving the initial analysis on short-term forecasts. Figure 9 shows the global-averaged 48-h forecast RMSE at all pressure levels. The advantage of DdSM+ over additive inflation becomes less obvious for most fields, but remains significant for geopotential height fields at all levels. The large advantage of the LDM+ over the other two methods also decreases due to the contamination of the model errors. Nevertheless, it is still quite obvious and significant, except for the zonal wind above 200 hPa and the humidity above 700 hPa.

Above we focused on the impact of initial analysis on the short-term forecast and did not attempt to correct the model errors during the forecast process. The LDM can also be used to estimate and correct the short-term model errors in the forecast phase, as shown by Danforth et al. (2007) with the SPEEDY model.

### c. Sensitivity to observational network

Thus far, different methods have been compared using a globally uniform observation network (the upper-right panel in Fig. 10), as if they were based only on satellite observations, resulting in uniform error fields (not shown). However, in reality there are more rawinsonde observations over land and fewer over the ocean. To investigate the sensitivity of each method to the choice of observational system, we assimilate a rawinsonde-like network (the upper-left panel in Fig. 10) for *u*, *υ*, *T*, *q* (*p _{s}* is still available everywhere) using pure additive inflation, the DdSM+, and the LDM+. We retune the parameters and choose the optimal setting for each method. Figure 10 (lower-left panel) shows the zonal and time-averaged latitudinal profiles of geopotential height analysis RMSE at 500 hPa for the three methods with a rawinsonde-like network. For comparison, the results from experiments described in section 5b with a uniform observation network are shown in the lower-right panel. With uniform observations, the performance of each method is less latitude dependent. We note that the sawtooth behavior of the error observed in the DdSM+ and pure additive inflation methods is due to the fact that observations are available every other latitude. With rawinsonde-like observations, though the DdSM+ is still better than pure additive inflation, both are far more sensitive to observation density than LDM+. In the Southern Hemisphere and the northern polar region where few upper-air observations are available, the DdSM+ and pure additive inflation behave significantly worse than that of LDM+. Figure 10 demonstrates that the DdSM+ and pure additive inflation schemes are more sensitive to the choice of observation network, and perform poorly in regions with sparse upper-air observations, while the LDM+ is more robust. This was confirmed in an additional intercomparison in which surface pressure was only observed at rawinsonde locations. Although there was a slight further degradation in skill, the relative performance of the methods remained similar to that in Fig. 10 (not shown), indicating again that the advantage of the LDM+ over other two methods is larger in data-sparse regions than in data-dense regions. These results are not unexpected since the LDM+ bias correction is done in model space while the DdSM+ is performed in observation space, and additive inflation accounts for model errors through an inflated background error covariance that relies on the presence of observations to affect the final analysis.

## 6. Summary and discussion

In this study we addressed the issue of model errors with the ensemble Kalman filter. Though focusing on the LETKF, an efficient approach within the EnKF family, the results are applicable to other EnKF systems. First, we performed data assimilation experiments using the LETKF with the SPEEDY model under a perfect-model scenario. It was found that the LETKF works very well in this ideal case. Then we relaxed the perfect-model assumption and assimilated observations generated from the NCEP–NCAR reanalysis fields. Without any additional effort to handle model errors, the performance of the LETKF was seriously degraded. The background ensemble spread was similar to that in the perfect model and therefore much smaller than the true forecast error (that now includes model errors). The “blindness” of the LETKF to model error is likely due to the fact that each ensemble member is integrated with the same model. If forecasts from different systems are available, as in the superensemble of Krishnamurti et al. (2000), we would expect to be able to at least partially represent model errors.

We investigated two simple ways to represent the effect of model errors and two methods to estimate and remove model bias. Our results suggest that multiplicative inflation is worse than additive inflation. The pure bias removal methods (DdSM and LDM) correct model bias, but cannot remove system noise; as a result, they are unable to beat inflation schemes that account for the total model errors. Supplemented by additive noise to represent the random errors, bias removal methods generally outperform the pure inflation schemes. Of all these methods, the low-dimensional method with additive inflation (LDM+) where the time-averaged model bias, diurnal bias, and state-dependent errors are estimated from a large number of 6-h forecast errors, gives the most accurate analyses and 48-h forecasts. The advantage of the LDM+ over other methods is larger in data-spare regions than in data-dense regions due to the fact that its bias correction is done in model space and is thus less sensitive to the absence of abundant observations. We note that in training the LDM we have used a long reanalysis that has spread information from both satellite and rawinsonde observations throughout the atmosphere.

Although the DdSM combined with inflation (DdSM+) produces results not as good as the LDM+, it is generally superior to both pure inflation schemes. The disadvantages of DdSM+ are the doubling of the computational cost and exclusive reliance on observations. When the observations are sparser, the impact of the bias correction is limited. In the worst case, where the observations themselves are biased, it is not at all obvious that this algorithm can work correctly. Our results of LDM estimation may be too optimistic, since in our applications we assume the NNR field is an approximation of the unknown truth and use it to generate the samples of model errors during the training period. In reality, the NNR field could be biased, and generating good samples of model errors is a challenge to the LDM+. It is not clear whether the model error samples generated from the NNR fields are good enough to represent the true model errors. In practice, we could use a more advanced reanalysis, like the 40-yr European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-40; Uppala et al. 2005) or the Japanese 25-yr Re-Analysis (JRA-25; Onogi et al. 2007) to replace the NNR. We could also verify the analysis against a more advanced reanalysis, and thus simulate the fact that the truth is not available for training. Alternatively, the EnKF analysis increments could also be used as model error samples instead of using a reanalysis. For the training of error samples to derive model bias, this would imply a spinup of the bias correction since at first the analyses are also biased, so that the analysis increments cannot initially sample model error well. We intend to explore this idea with the SPEEDY model to see whether the final model error samples after convergence are good enough to represent the true model errors.

## Acknowledgments

The authors are very grateful to the members of the Weather-Chaos group at the University of Maryland for helpful discussions, and to two anonymous reviewers for their constructive suggestions that helped us to improve the manuscript. This research was partially supported by 973 Program (2009CB421500), NASA Grant NNG04G29G, NOAA Grant NA04OAR4310103, and CMA Grant GYHY200806029.

## REFERENCES

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**(

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

## Footnotes

*Corresponding author address:* Hong Li, Shanghai Typhoon Institute, 166 Puxi Rd., Shanghai 200030, China. Email: lih@mail.typhoon.gov.cn

This article included in the Mathematical Advances in Data Assimilation (MADA) special collection.