## 1. Introduction

No matter how well understood a physical process is, predictions of that process derived from numerical integration of models are likely to suffer from two factors. First, nonlinearities amplify uncertainties in the initial conditions, causing similar states of the system to diverge quickly on small scales. Second, deficiencies in the numerical model introduce errors during integration. These deficiencies may be structural problems (wrong equations) and are induced by inaccurate forcings and parameterizations used to represent the effect of subgrid-scale physical processes and result in large-scale systematic forecast errors.

Leith (1978) proposed a statistical method to account for model bias and systematic errors linearly dependent on state anomalies. Leith derived a state-dependent empirical correction to a simple dynamical weather model by minimizing the tendency errors relative to a reference time series. The resulting correction operator attempts to predict the error in the model tendency as a function of the model state. While Leith’s empirically estimated state-dependent correction term is only optimal for a linear model, it was shown to reduce the nonlinear model’s bias.

DelSole and Hou (1999) perturbed the parameters of a two-layer quasigeostrophic (QG) model on an 8 × 10 grid (*N _{d}* = 160 degrees of freedom) to generate a “nature” run and then modified it to create a “model” containing a primarily state-dependent error. They found that a state-independent error correction did not improve the forecast skill. By adding a state-dependent empirical correction to the model, inspired by the procedure proposed by Leith, they were able to extend forecast skill up to the limits imposed by observation error. However, Leith’s technique requires the solution of a

*N*-dimensional linear system. As a result, before the procedure can be considered useful for operational use, a low-dimensional representation of Leith’s empirical correction operator is required.

_{d}Wilks (2005) used the Lorenz ’96 coupled system as the truth and an uncoupled version of the same system as a model, and developed a stochastic parameterization of the effects of the unresolved variables. The correction resulted in improved agreement between model and system climatologies, as well as improved ensemble mean and spread for short-range forecasts. Individually deterministic forecasts were degraded by the stochastic parameterization methods. Wilks found the improvement resulting from stochastic forcing to depend strongly on both the standard deviation and time scale of the stochastic term, and weakly on its spatial scale.

In what follows, we use the same experimental setup as Wilks with a low-dimensional representation of Leith’s empirical correction operator using singular value decomposition (SVD; Golub and Van Loan 1996) developed by Danforth et al. (2007). We use the resulting SVD modes as a basis for deterministic parameterization of the tendencies of the Lorenz ’96 system unresolved by the uncoupled model. Empirical correction of the uncoupled model using the SVD modes results in significant forecast improvement (anomaly correlation and RMSE) when compared with Leith’s operator, at the expense of weakening ensemble spread. The SVD method can be extremely computationally efficient, requiring only an inner product and the solution of a low-dimensional linear system. The paper concludes with a discussion of applications to numerical weather prediction.

## 2. Empirical correction

**x**(

*t*) and 𝗠[

**x**(

*t*)] are the model state vector and model tendency at step

*t*, respectively; 𝗠 is the best available representation of the governing dynamics of the physical process whose future behavior we are attempting to predict. Let

**x**

*(*

^{a}*t*) denote the true state of the dynamical system at step

*t*(estimated, e.g., from an analysis) and

**x**

^{f}

_{Δt}(

*t*) denote a prediction of

**x**

*(*

^{a}*t*) generated by integrating 𝗠 for time Δ

*t*from the state

**x**

*(*

^{a}*t*− Δ

*t*). Leith considered the difference between

**x**

*(*

^{a}*t*) and

**x**

^{f}

_{Δt}(

*t*) for small Δ

*t*to be an approximation of the model tendency error. The residual at step

*t*is given by the difference between the truth

**x**

*(*

^{a}*t*) and the model forecast state

**x**

^{f}

_{Δt}(

*t*) namely,where Δ

*t*is the forecast lead time. The smaller Δ

*t*, the better

*δ*

**x**

^{a}

_{Δt}(

*t*) is as an approximation of the model error. The time average of the residuals is an estimate of the model bias, or state-independent error. It is given bywhere

*N*denotes number of individual verifications that can be made comparing forecast and truth. The truth, model predictions, and corresponding residuals are then separated into their anomalous and time average components, namely,so that deviations from the mean can be analyzed.

_{a}### a. Leith’s empirical correction operator

*debiased*model predictions can be generated using an improved model 𝗠

^{+}, defined by the tendency equationThe time series of anomalous residuals of 𝗠

^{+}, obtained by comparing predictions made by 𝗠

^{+}with

**x**

*(*

^{a}*t*), provides an estimate of the linear state-dependent model error. Leith (1978) suggested that these residuals could be used to form a

*state*-

*dependent*correction. Leith sought an improved model of the formwhere

**x**′(

*t*) =

**x**(

*t*) − 〈

**x**

*〉 is the anomalous model state at time*

^{a}*t*and 𝗟

**x**′(

*t*) is the state-dependent error correction. The tendency error of the improved model is given bywhere

**ẋ**

^{a}(

*t*) is the instantaneous time derivative of the true state. Of course, the true time tendency is unknown, so

**ẋ**

^{a}(

*t*) can only be approximated by finite differences using the reference time series. The mean square tendency error of the improved model is given by 〈

**g**(

^{⊤}*t*)

**g**(

*t*)〉. Minimizing this tendency error with respect to 𝗟, Leith’s state-dependent correction operator is given bywhere

**ẋ**

*(*

^{a}*t*) − 𝗠

^{+}[

**x**

*(*

^{a}*t*)] is approximated by the residuals,which for an operational weather model are typically available from preimplementation testing. As a result, the operator 𝗟 may be estimated with no additional model integrations.

*δ*

**x**

^{a}

_{Δt}′(

*t*) is computed using the online debiased model 𝗠

^{+}. The

*cross*-

*covariance*(Bretherton et al. 1992) of the residuals with their corresponding true states, the

*lagged*cross-covariance, and the true state covariance are given byrespectively. The empirical correction operator (10) is then given by

We define **w**(*t*) = 𝗖_{xaxa}^{−1} · **x**′(*t*) to be the anomalous state normalized by its empirically derived covariance so that the matrix · vector product **w**(*t*) in Eq. (8) is the best estimate of the anomalous residual corresponding to the anomalous model state **x**′ over the dependent sample. Assuming that sampling errors are small and that model errors evolve linearly with respect to lead time, this correction should improve the forecast model 𝗠^{+}. Errors in initial state grow exponentially with lead time, but those forced by model error tend to grow linearly (e.g., Dalcher and Kalnay 1987; Reynolds et al. 1994). Therefore, the Leith operator should provide a useful estimate of the state-dependent model error.

Using a model with very few degrees of freedom and errors that were strongly state-dependent, DelSole and Hou (1999) found that the Leith operator was very successful in correcting state-dependent errors relative to a nature run. However, the direct computation of 𝗟**x**′ requires *O*(*N* ^{3}_{d}) floating point operations every time step. For operational models, *N _{d}* =

*O*(10

^{9}); it is clear that this operation would be prohibitive. Approaches to reduce the dimensionality of the Leith correction are now described.

### b. Low-dimensional approximation

*t*) and

*t*). They are then used to compute the cross-correlationwhere normalization is required to make

*t*) and

*t*). The SVD is given bywhere the columns of the orthonormal matrices 𝗨 and 𝗩 are the left and right singular vectors

**u**

*and*

_{k}**v**

*;*

_{k}**Σ**is a diagonal matrix containing singular values

*σ*whose magnitude decreases with increasing

_{k}*k*. The leading patterns

**u**

_{1}and

**v**

_{1}associated with the largest singular value

*σ*

_{1}are the dominant coupled signals in the time series

**u**

*and*

_{k}**v**

*represent the*

_{k}*k*th most significant coupled signals. Expansion coefficients or principal components (PCs)

*a*(

_{k}*t*),

*b*(

_{k}*t*) describe the magnitude and time dependence of the projection of the coupled signals onto the reference time series. They are given by

## 3. Low-dimensional correction

*N*-dimensional linear system 𝗖

_{d}_{xaxa}

**w**(

*T*) =

**x′**(

*T*) for

**w**at each time

*T*during a forecast integration. Assuming that Δ

*t*is small (error growth is approximately linear during the short forecasts used for training), we can approximate

_{xaxa}≈

**w**= 𝗖

_{xfxf}

^{−1}

**x′**as a linear combination of the leading orthonormal right singular vectors

**v**

*, namely,where*

_{k}*K*≪

*N*should be chosen such that the explained variance, given byexceeds a system-dependent threshold for

_{d}*k*=

*K*. From trial and error, it appears that an explained variance of

*r*(

*K*) ≥ 0.95 results in the best forecast improvement for the simple model discussed in the following section. As a result,

*K*should be chosen to fulfill this, or a similar, inequality. For the SPEEDY (simplified parameterizations, primitive equation dynamics) model, Danforth et al. (2007) found the best results for the anomaly correlation using a truncation of

*K*= 10.

**w**in the

*K*-dimensional space spanned by the right singular vectors

**v**

*can contribute to the empirical correction defined by (20). This dependence can be exploited as follows. Assume the model state at time*

_{k}*T*during a numerical model integration is given by

**x**(

*T*). The normalized state anomaly

**x**′(

*T*) =

**x**(

*T*) − 〈

**x**

*〉 normalized by the standard deviation of*

^{a}**x**

^{f}

_{Δt}′ over the dependent sample. The component of

**v**

*may then be estimated by computing the new principal component*

_{k}*b*(

_{k}*T*) =

**v**

^{T}

_{k}·

**= 〈**

_{bb}**bb**

^{T}〉, calculated using

**b**

*from (19). Owing to the orthogonality of the right singular vectors*

_{k}**v**

*, assuming an infinite sample, PCs*

_{k}*b*and

_{k}*b*are uncorrelated for

_{j}*k*≠

*j*. As a result, we restrict the finite sample covariance 𝗖

**to be diagonal. The linear systemmay then be solved for**

_{bb}*γ*at time

*T*. As a result, the cost of solving (22) is

*O*(

*K*) where

*K*is the number of SVD modes retained, as opposed to the (

*N*×

_{d}*N*) linear system required by Leith’s full-dimensional empirical correction. The solution of (22) gives an approximation of

_{d}**w**(

*T*); namely,where

**w̃**(

*T*) is generated by solving the linear system (22) in the space of the leading

*K*singular vectors, while

**w**(

*T*) requires solving the

*N*-dimensional linear system (23) in the space of the model grid. Writing

_{d}**u**

*for the error signal*

^{c}_{k}**u**

*weighted by the standard deviation of*

_{k}*δ*

**x**

^{a}

_{Δt}′ over the dependent sample, the

*k*th component of the state-dependent error correction at time

*T*is given bywhere

*σ*is the coupling strength over the dependent sample [given by the SVD (17)]. The weight

_{k}*γ*(

_{k}*T*) assigned to residual signal

**u**

*indicates the sign and magnitude of the correction that may amplify, dampen, or shift the flow anomaly local to the pattern*

^{c}_{k}**u**

*. Then the SVD-corrected model is given at time*

^{c}_{k}*T*byso that

*during*a prediction, a few (

*K*) dominant model state signals

**v**

*can be projected onto the anomalous, normalized model state vector. The resulting sum Σ*

_{k}

^{K}_{k=1}

**z**

_{k}is the best representation of the original residual anomalies

*δ*

**x**

^{a}

_{Δt}′ in terms of the current forecast state

**x**(

*T*). If the correlation between the normalized state anomaly

**v**

*is small, the new expansion coefficient*

_{k}*b*(

_{k}*T*) will be negligible, no correction by

**u**

*will be made at time*

^{c}_{k}*T*; therefore, no harm will be done to the prediction. This fact is particularly important with respect to predicting behavior that may vary on a time scale longer than the training period, for example, El Niño–Southern Oscillation (ENSO) events (Barnston et al. 1999).

The SVD representation of the error is advantageous compared to Leith’s correction operator for several reasons. First, it reduces the sampling errors by identifying the most robust coupled signals between the residual and forecast state anomalies. Second, the added computation is trivial; it requires solving a *K*-dimensional linear system and computing *K* inner products. Finally, the SVD signals identified by the technique can be used by modelers to isolate flow-dependent model deficiencies. In ranking these signals by strength, SVD gives modelers the ability to evaluate the relative importance of various model errors.

## 4. Numerical experiments

### a. Lorenz ’96 model

*N*-dimensional governing equations, given by Lorenz (1996), arewhere

_{d}*N*= (

_{d}*J*+ 1)

*I*and the subscripts

*i*and

*j*are treated as periodic with period

*I*and

*J*respectively. For example,

*x*

_{I}_{+1}≡

*x*

_{1}so that the variables form a cyclic chain. Equation (27) describes the behavior of a set of slowly changing, large-amplitude unspecified scalar meteorological quantities, such as temperature, at

*I*equally spaced grid sites on a latitude circle. Each

*x*is coupled to

_{i}*J*quickly changing, small-amplitude variables

*y*whose dynamics are described by (28). The notation floor [(

_{j}*j*− 1)/

*J*] describes integer truncation of the bracketed term and indicates that each of the small-amplitude

*y*variables in a group is equally affected by the large-amplitude

*x*variable to which it belongs. In our experiments, we have used the same parameter values as Wilks (2005), namely,

*I*= 8 and

*J*= 32 for a total of

*N*= 264 state variables,

_{d}*h*= 1,

*c*= 10, and

*b*= 10 (which has the effect of making the small-amplitude variables

*y*oscillate 10 times more rapidly than the large-amplitude variables

_{i}*x*), and the forcing is chosen to be either

_{i}*F*= 8, 14, or 18. Wilks (2005) chose

*F*= 18 and

*F*= 20 to ensure that the deterministic parameterizations would be competitive with the stochastic parameterizations.

This system shares certain properties with many atmospheric models: a nonlinear advection-like term, a linear term representing loss of energy to thermal dissipation, and a constant forcing term *F* to provide energy. It has been used in several previous predictability studies to represent atmospheric behavior (e.g., Lorenz and Emanuel 1998; Wilks 2005; Danforth and Yorke 2006) and for data assimilation (e.g., Anderson 2001; Whitaker and Hamill 2002; Ott et al. 2004). The time unit represents the dissipative decay time of 5 days (Lorenz and Emanuel 1998) and there are 13 positive Lyapunov exponents.

*x*of “true” values of the slow variables. We then set

^{a}*h*= 0 in Eq. (27) and add a bias term with weight

*α*to obtain the modelwhich fails to resolve any of the small-amplitude behavior. The sinusoidal bias term, weighted by

*α*= 1, is included as an additional source of model error with respect to model (27), meant to represent a longitudinally dependent misrepresentation of the dissipation or forcing. Forecasts

**x**

^{f}

_{Δt}generated by this model exhibit model error with respect to

**x**

*that is sinusoidally dependent on the grid point, but independent of the state. Training data is then compiled by generating 10*

^{a}^{7}short forecasts (Δ

*t*= 0.1 time units ≈ 12 h) of model (29) and comparing these forecasts to

**x**

*. Throughout the experiment, a fourth-order Runge–Kutta integration scheme is used with a time step of 0.001. The model bias is then given by the time average of the difference between*

^{a}**x**

*and*

^{a}**x**

^{f}

_{12}, namely, 〈

*δ*

**x**

^{a}

_{12}〉 [see (2)]. The anomalous errors and forecasts are used to generate Leith’s correction operator 𝗟 (15), and the corresponding modes (17) for SVD correction.

### b. Empirical correction experiments

**x**

*. These initial states are distinct from those used for training, and consecutive initial states are separated by 50 time units (250 days). Methods are distinguished by the explicit form of the empirical correction term*

^{a}*D*(

**x**) in (30), which is meant to represent the small-amplitude behavior and reduce the bias, as indicated below:Term

*D*

^{(1)}(

**x**) represents the original model (29); forecasts made with no empirical correction will be used to gauge the success of other methods. Term

*D*

^{(2)}(

**x**) is the time average residual observed in forecasts made by model (29) and represents a state-independent correction (see Fig. 1). Term

*D*

^{(3)}(

**x**) represents Leith’s empirical correction operator (8), and term

*D*

^{(4)}(

**x**) represents the SVD correction described by (25). Term

*D*

^{(5)}(

**x**) represents forecasts made by system (27), (28) with observational noise (see next subsection), but with no model error. Skill scores are made for the ensemble mean using anomaly correlation and rms error and verifying against the time series

**x**

*.*

^{a}### c. Ensemble initialization

**x**

*(*

^{a}*T*), for example, is found by gathering analogs from long integrations of (27), (28). Analogs are defined to be states within an

*I*-dimensional hypercube (with side

*i*having length equal to 5% of the climatological span of

*x*) centered on

_{i}**x**

*(*

^{a}*T*). As in Wilks (2005), the analog integrations were performed until each of the 10 000 cubes contained a minimum of 100 states. The

*I*×

*I*covariance matrix for the analogs of the state

**x**

*(*

^{a}*T*) is denoted 𝗖 (

*T*). The distribution from which the initial ensemble for the forecast of

**x**

*(*

^{a}*T*) is chosen is given bywhere

*λ*is the average eigenvalue of 𝗖(

*T*) and

*σ*

_{clim}is the climatological standard deviation of

**x**

*. The covariance 𝗖*

^{a}^{init}(

*T*) has the same eigenvectors and correlations as 𝗖(

*T*), but is scaled so that the average standard deviation is 5% of the climatology of the true time series

**x**

*(Wilks 2005). Control states for each ensemble forecast are generated by adding appropriately shaped random noise to each of the 10 000 true states,where*

^{a}**y**(1,

*k*) is an

*I*-dimensional vector whose entries are independent random numbers chosen from a Gaussian distribution. The square root of 𝗖

^{init}is computed offline for each initial state using the Cholesky decomposition (Golub and Van Loan 1996). Ensemble members are then generated from a multivariate Gaussian distribution by performing the same operation on

**x**

^{f}

_{0}(1,

*k*), namely,where

**y**is different for each of the 10 000 initial states and

*N*= 20 ensemble members.

_{e}### d. Results

*y*and the additional state-independent error termThe time-average effect of these model errors, namely 〈

_{i}*q*〉, is given byThe time-average residual 〈

*δ*

**x**

^{a}

_{12}〉 (3) is an empirical estimate of 〈

**q**〉. Figure 1 shows the true bias 〈

**q**〉, and 〈

*δ*

**x**

^{a}

_{12}〉 weighted by a factor of −12

^{−1}, representing the fact that it is a correction applied every hour but was calculated by averaging 12-h errors. The trained empirical correction term

*D*

^{(2)}(

**x**) = 〈

*δ*

**x**

^{a}

_{12}〉 slightly underestimates and shifts the true bias of model (29).

Coupled signals between normalized, anomalous residuals and forecasts—namely, **u*** _{k}* and

**v**

*are shown in Fig. 2 for forecasts made with model (29); they are superimposed on a shifted scale for visual simplicity. Unit vectors*

_{k}**u**

_{1}and

**v**

_{1}suggest that states of model (29) of the shape

**u**

_{1}are typically misrepresented. Because each of the coupled signal pairs for the Lorenz ’96 model with forcings

*F*= 14 and

*F*= 18 roughly satisfies

**u**

*≈ −*

_{k}**v**

*, the state anomalies will be damped by the SVD empirical correction. The SVD method is also capable of suggesting amplification of anomalies if*

_{k}**u**

*≈*

_{k}**v**

*, as seen for*

_{k}*F*= 8 mode 3 and as demonstrated by Danforth et al. (2007).

The explained variance (21) for the spectrum of singular values of the cross-covariance matrix *F* = 8, but not for *F* = 18. The additional computational expense of including the few modes required to reach 95% is negligible for this model, where the number of degrees of freedom that one can attempt to correct is a maximum of *K* = *I* = 8. For an operational weather model, the spectrum is likely to be significantly flatter than that observed for *F* = 8. As a result, we may be forced to correct many forecast patterns (large *K*) to see improvement. Fortunately, the SVD technique that we are describing is very cheap, even for large *K*.

A sample of 10^{7} short forecasts was used to train the operators in order to predict a maximum of *I* = 8 degrees of freedom. In practice, such a large sample size is unavailable for training. In the case of a small training set, the singular value spectrum may be steep, not due to the importance of the leading modes but due to the smaller sample size (Hamill et al. 2001). The larger the sample size, the more likely the operator will represent the true covariance and, hopefully, the greater the number of forecast patterns that can be corrected.

Typical 10-day, 20-member ensemble forecasts of *x*_{1} using model (30) and *F* = 14, with empirical correction terms described by (31), are shown in Fig. 4. Forecasts empirically corrected by the observed bias of model (29), *D*^{(2)}, perform slightly better than forecasts not corrected at all, *D*^{(1)}. Ensemble divergence is typically significant by day 5 for both *D*^{(1)} and *D*^{(2)}. State-dependent empirical correction significantly improves forecasts. Ensemble spread is weak for both Leith’s empirical correction *D*^{(3)} and the SVD correction *D*^{(4)} with mode truncation *K* = 5. However, small spread is seen for perfect model forecasts *D*^{(5)}, and the effect is less evident for *F* = 8 and *F* = 18. Since the ensemble spread represents the uncertainty in the forecast and since the forecast skill is clearly improved by the Leith and SVD empirical corrections, this result should be expected.

Figure 5 shows the average anomaly correlation and rms error (RMSE) of the ensemble mean of 10 000 independent 20-member ensemble forecasts. The state-independent correction adds approximately 1 time unit (5 days) to the usefulness of *F* = 8 forecasts, and 0.1 time units (12 h) to the usefulness of *F* = 14 and *F* = 18 forecasts. For *F* = 14, Leith’s operator improves forecasts by 710% (27.2 days), and the SVD correction results in an improvement of 1176% (45 days). The SVD correction term *D*^{(4)}(**x**) is chosen to have *K* = 7, 5, and 2 modes for forcings *F* = 18, 14, and 8, respectively; the truncation was chosen to explain 95% of the variance in the cross-covariance matrix *N _{e}* = 20, AC scores for

*N*= 1 and

_{e}*N*= 50 are qualitatively similar, indicating that the performance of the SVD method is insensitive to ensemble size.

_{e}*α*= 0,

*F*= 18, and the actual tendencies of system (27), (28), to approximate model error. The collection of tendency errors for each resolved variable

*x*were then fit with a degree four polynomialwhere the term

_{i}*e*was a stochastic component and

_{i}*β*

_{0}corresponds to 〈

*δ*

**x**

^{a}

_{Δt}〉 from (3). For

*e*= 0, Wilks found that, on average, 20-member ensemble forecasts crossed the 0.6 anomaly correlation line at a time of 4 days. This improvement is illustrated by the × in the bottom left window of Fig. 5. Wilks demonstrated a particular choice for the stochastic term

_{i}*e*to increase the crossing time of 20-member ensemble forecasts by 180% of the crossing time for single integrations with a deterministic parameterization of model error. Ensemble spread was also shown to improve as a result of the stochastic parameterization, with deterministic parameterizations resulting in smaller spread. Figure 5 demonstrates Leith’s state-dependent linear correction to improve on the crossing time of 20-member ensemble forecasts by 292% for

_{i}*F*= 18 and the SVD correction to improve on the same measure by 338%. For

*F*= 8 and

*F*= 14, the SVD method outperforms Leith’s method by a much larger margin (see Table 1).

Figure 6 shows the average ensemble spread versus time and versus RMSE. The ensemble dispersion is good for the SVD correction *D*^{(4)} for *F* = 8, but for both *F* = 14 and *F* = 18 there is essentially no spread. We believe that this is related to the number of degrees of freedom used to correct the model as shown in Fig. 3. Only *K* = 2 modes are used for SVD correction of *F* = 8, while *K* = 5 and *K* = 7 modes are used for *F* = 14 and *F* = 18 respectively. As a result, for *F* = 8 there are enough degrees of freedom to allow for unstable modes in the SVD-corrected model (Fig. 2), and the ensemble spread is quite good (Fig. 6). By contrast, for *F* = 14 and *F* = 18, all of the modes used to correct forecasts result in damping of anomalies (Fig. 2), and consequently in damping the ensemble spread as well. In a more realistic model, the number of SVD modes needed for the empirical correction should be much smaller than the number of degrees of freedom of the model, and the reduction in spread may not be as severe as observed in this model. It should be noted that both SVD and the Leith empirical correction methods essentially find the maximum likelihood estimate of the probability distribution of corrections observed during the training period, given the current state. It may be possible to derive within the SVD scheme a low-order method for estimating the uncertainty associated with each correction. In that case the improvement of the ensemble spread could be obtained by adding random corrections drawn from this distribution to each ensemble member, as suggested by an anonymous reviewer.

## 5. Discussion

Leith’s method consistently improves forecasts for short lead times, outperforming the SVD method for the first 10 days of *F* = 14 and *F* = 18 forecasts. After 10 days, the ensemble spread of forecasts made using Leith’s method grows rapidly, while the spread in SVD method forecasts remains small. The *F* = 14 and *F* = 18 forecasts made with the SVD correction deteriorate rapidly the first few days after which time they degrade at essentially the same rate as forecasts made with a perfect model. This second dynamic behavior is an indication that after the first few days, the SVD method is an excellent parameterization of the behavior of the small-amplitude variables. In fact, the SVD method performs as well or better than the perfect model for the first 10 days of *F* = 8 forecasts. However, we see in Fig. 3 that as *F* increases, the SVD method requires a greater number of modes to represent the cross-correlation matrix utilized by Leith’s method. As a result, in the SVD method, *F* = 18 forecasts are corrected by modes whose coupling is less statistically significant than *F* = 8 and *F* = 14. This is demonstrated by mode *k* = 8 in Fig. 2, which significantly harms SVD-corrected forecasts [see final *D*^{(4)}(**x**) row in Table 1] relative to truncation at mode *K* = 7.

Clearly, these results are overoptimistic in that the model error in (29) relative to system (27) is highly state-dependent. However, Fig. 5 indicates that both Leith’s empirical correction operator and the SVD approximation do an excellent job representing the state-dependent component of the unresolved small-amplitude behavior. In fact, the SVD method isolates and ranks the most relevant spatial correlations described by Leith’s operator. As a result, truncation can actually improve performance. This was verified by using *K* = *I* = 8 modes for term *D*^{(4)}(**x**); forecasts were slightly worse than those made using Leith’s operator for forcings *F* = 14 and *F* = 18.

The methods presented here have relied on an exact characterization of the true state for a very long training period in order to understand the best possible impact of empirical correction. While the analysis increments for an operational weather model are typically available from preimplementation testing, they are computed as the difference between an analysis that suffers from deficiencies in the model used to create it, and are only available for short training periods. Future studies will examine the effectiveness of model error parameterization by SVD using less accurate estimates of the true state and shorter training periods.

## 6. Conclusions

A new method of state-dependent error correction was introduced, based on singular value decomposition of coupled residual and forecast state anomalies. The cross-covariance is the same as that which appears in Leith’s formulation, but it would be prohibitive to compute for the grid density required by operational weather models. The new method uses the SVD modes as a basis for linear regression and results in significant forecast improvement. The new method is also many orders of magnitude faster than Leith’s empirical correction. The method can be applied at a rather low cost, both in the training and in the correction phases, and yields significant forecast improvements, at least for the Lorenz ’96 model and the simple but realistic global QG and SPEEDY models (Danforth et al. 2007). It could be applied with low computational cost and minimal sampling problems to data assimilation and ensemble numerical weather prediction, applications where accounting for model errors has been found to be important. The method may be particularly useful for forecasting of severe weather events where a posteriori bias correction will typically weaken anomalies. Furthermore, the patterns identified by SVD could also be used to identify sources of model deficiencies and thereby guide future model improvements. Further development of the SVD method will include a low-order method for estimating the uncertainty in the correction terms.

## Acknowledgments

This research was supported by NOAA THORPEX Grant NOAA/NA040AR4310103 and NASA Phase-II Grant NNG 06GE87G to the Vermont Advanced Computing Center.

## REFERENCES

Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation.

,*Mon. Wea. Rev.***129****,**2884–2903.Barnston, A. G., , M. H. Glantz, , and Y. He, 1999: Predictive skill of statistical and dynamical climate models in SST forecasts during the 1997–98 El Niño episode and the 1998 La Niña onset.

,*Bull. Amer. Meteor. Soc.***80****,**217–243.Bretherton, C., , C. Smith, , and J. Wallace, 1992: An intercomparison of methods for finding coupled patterns in climate data.

,*J. Climate***5****,**541–560.Dalcher, A., , and E. Kalnay, 1987: Error growth and predictability in operational ECMWF forecasts.

,*Tellus***39A****,**474–491.Danforth, C. M., , and J. A. Yorke, 2006: Making forecasts for chaotic physical processes.

,*Phys. Rev. Lett.***96****.**144102, doi:10.1103/PhysRevLett.96.144102.Danforth, C. M., , E. Kalnay, , and T. Miyoshi, 2007: Estimating and correcting global weather model error.

,*Mon. Wea. Rev.***135****,**281–299.DelSole, T., , and A. Y. Hou, 1999: Empirical correction of a dynamical model. Part I: Fundamental issues.

,*Mon. Wea. Rev.***127****,**2533–2545.Golub, G., , and C. F. Van Loan, 1996:

*Matrix Computations*. 3rd ed. The Johns Hopkins University Press, 694 pp.Hamill, T., , J. Whitaker, , and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter.

,*Mon. Wea. Rev.***129****,**2776–2790.Leith, C. E., 1978: Objective methods for weather prediction.

,*Annu. Rev. Fluid Mech.***10****,**107–128.Lorenz, E. N., 1996: Predictability: A problem partly solved.

*Proc. ECMWF Seminar on Predictability*, Reading, United Kingdom, European Centre for Medium-Range Weather Forecasts, 1–18.Lorenz, E. N., , and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model.

,*J. Atmos. Sci.***55****,**399–414.Ott, E., and Coauthors, 2004: A local ensemble Kalman filter for atmospheric data assimilation.

,*Tellus***56A****,**415–428.Reynolds, C., , P. J. Webster, , and E. Kalnay, 1994: Random error growth in NMC’s global forecasts.

,*Mon. Wea. Rev.***122****,**1281–1305.Whitaker, J., , and T. Hamill, 2002: Ensemble data assimilation without perturbed observations.

,*Mon. Wea. Rev.***130****,**1913–1924.Wilks, D. S., 2005: Effects of stochastic parametrizations in the Lorenz ’96 system.

,*Quart. J. Roy. Meteor. Soc.***131****,**389–407.

Improvement in crossing time of anomaly correlation scores with 0.6 for different empirical correction schemes relative to *D*^{(1)}(**x**) = 0. For the anomaly correlations, see Fig. 5 where *D*^{(4)}(**x**) is truncated at mode *K* = 2, 5, and 7 for the Lorenz ’96 model with forcings *F* = 8, 14, and 18, respectively. These improvements are shown in bold in the chart. The truncation was chosen to explain 95% of the variance in the cross-covariance matrix