Abstract

The purpose of the present study is to use a new method of empirical model error correction, developed by Danforth et al. in 2007, based on estimating the systematic component of the nonperiodic errors linearly dependent on the anomalous state. The method uses singular value decomposition (SVD) to generate a basis of model errors and states. It requires only a time series of errors to estimate covariances and uses negligible additional computation during a forecast integration. As a result, it should be suitable for operational use at a relatively small computational expense.

The method is tested with the Lorenz ’96 coupled system as the truth and an uncoupled version of the same system as a model. The authors demonstrate that the SVD method explains a significant component of the effect that the model’s unresolved state has on the resolved state and shows that the results are better than those obtained with Leith’s empirical correction operator. The improvement is attributed to the fact that the SVD truncation effectively reduces sampling errors. Forecast improvements of up to 1000% are seen when compared with the original model. The improvements come at the expense of weakening ensemble spread.

1. Introduction

No matter how well understood a physical process is, predictions of that process derived from numerical integration of models are likely to suffer from two factors. First, nonlinearities amplify uncertainties in the initial conditions, causing similar states of the system to diverge quickly on small scales. Second, deficiencies in the numerical model introduce errors during integration. These deficiencies may be structural problems (wrong equations) and are induced by inaccurate forcings and parameterizations used to represent the effect of subgrid-scale physical processes and result in large-scale systematic forecast errors.

Leith (1978) proposed a statistical method to account for model bias and systematic errors linearly dependent on state anomalies. Leith derived a state-dependent empirical correction to a simple dynamical weather model by minimizing the tendency errors relative to a reference time series. The resulting correction operator attempts to predict the error in the model tendency as a function of the model state. While Leith’s empirically estimated state-dependent correction term is only optimal for a linear model, it was shown to reduce the nonlinear model’s bias.

DelSole and Hou (1999) perturbed the parameters of a two-layer quasigeostrophic (QG) model on an 8 × 10 grid (Nd = 160 degrees of freedom) to generate a “nature” run and then modified it to create a “model” containing a primarily state-dependent error. They found that a state-independent error correction did not improve the forecast skill. By adding a state-dependent empirical correction to the model, inspired by the procedure proposed by Leith, they were able to extend forecast skill up to the limits imposed by observation error. However, Leith’s technique requires the solution of a Nd-dimensional linear system. As a result, before the procedure can be considered useful for operational use, a low-dimensional representation of Leith’s empirical correction operator is required.

Wilks (2005) used the Lorenz ’96 coupled system as the truth and an uncoupled version of the same system as a model, and developed a stochastic parameterization of the effects of the unresolved variables. The correction resulted in improved agreement between model and system climatologies, as well as improved ensemble mean and spread for short-range forecasts. Individually deterministic forecasts were degraded by the stochastic parameterization methods. Wilks found the improvement resulting from stochastic forcing to depend strongly on both the standard deviation and time scale of the stochastic term, and weakly on its spatial scale.

In what follows, we use the same experimental setup as Wilks with a low-dimensional representation of Leith’s empirical correction operator using singular value decomposition (SVD; Golub and Van Loan 1996) developed by Danforth et al. (2007). We use the resulting SVD modes as a basis for deterministic parameterization of the tendencies of the Lorenz ’96 system unresolved by the uncoupled model. Empirical correction of the uncoupled model using the SVD modes results in significant forecast improvement (anomaly correlation and RMSE) when compared with Leith’s operator, at the expense of weakening ensemble spread. The SVD method can be extremely computationally efficient, requiring only an inner product and the solution of a low-dimensional linear system. The paper concludes with a discussion of applications to numerical weather prediction.

2. Empirical correction

Following Leith (1978), consider an arbitrary dynamical system

 
formula

where x(t) and 𝗠[x(t)] are the model state vector and model tendency at step t, respectively; 𝗠 is the best available representation of the governing dynamics of the physical process whose future behavior we are attempting to predict. Let xa(t) denote the true state of the dynamical system at step t (estimated, e.g., from an analysis) and xfΔt(t) denote a prediction of xa(t) generated by integrating 𝗠 for time Δt from the state xa(t − Δt). Leith considered the difference between xa(t) and xfΔt(t) for small Δt to be an approximation of the model tendency error. The residual at step t is given by the difference between the truth xa(t) and the model forecast state xfΔt(t) namely,

 
formula

where Δt is the forecast lead time. The smaller Δt, the better δxaΔt(t) is as an approximation of the model error. The time average of the residuals is an estimate of the model bias, or state-independent error. It is given by

 
formula

where Na denotes number of individual verifications that can be made comparing forecast and truth. The truth, model predictions, and corresponding residuals are then separated into their anomalous and time average components, namely,

 
formula
 
formula
 
formula

so that deviations from the mean can be analyzed.

a. Leith’s empirical correction operator

Online bias corrected or debiased model predictions can be generated using an improved model 𝗠+, defined by the tendency equation

 
formula

The time series of anomalous residuals of 𝗠+, obtained by comparing predictions made by 𝗠+ with xa(t), provides an estimate of the linear state-dependent model error. Leith (1978) suggested that these residuals could be used to form a state-dependent correction. Leith sought an improved model of the form

 
formula

where x′(t) = x(t) − 〈xa〉 is the anomalous model state at time t and 𝗟x′(t) is the state-dependent error correction. The tendency error of the improved model is given by

 
formula

where a(t) is the instantaneous time derivative of the true state. Of course, the true time tendency is unknown, so a(t) can only be approximated by finite differences using the reference time series. The mean square tendency error of the improved model is given by 〈g(t)g(t)〉. Minimizing this tendency error with respect to 𝗟, Leith’s state-dependent correction operator is given by

 
formula

where a(t) − 𝗠+[xa(t)] is approximated by the residuals,

 
formula

which for an operational weather model are typically available from preimplementation testing. As a result, the operator 𝗟 may be estimated with no additional model integrations.

To estimate 𝗟, the time series of residuals δxaΔt′(t) is computed using the online debiased model 𝗠+. The cross-covariance (Bretherton et al. 1992) of the residuals with their corresponding true states, the lagged cross-covariance, and the true state covariance are given by

 
formula
 
formula
 
formula

respectively. The empirical correction operator (10) is then given by

 
formula

We define w(t) = 𝗖xaxa−1 · x′(t) to be the anomalous state normalized by its empirically derived covariance so that the matrix · vector product w(t) in Eq. (8) is the best estimate of the anomalous residual corresponding to the anomalous model state x′ over the dependent sample. Assuming that sampling errors are small and that model errors evolve linearly with respect to lead time, this correction should improve the forecast model 𝗠+. Errors in initial state grow exponentially with lead time, but those forced by model error tend to grow linearly (e.g., Dalcher and Kalnay 1987; Reynolds et al. 1994). Therefore, the Leith operator should provide a useful estimate of the state-dependent model error.

Using a model with very few degrees of freedom and errors that were strongly state-dependent, DelSole and Hou (1999) found that the Leith operator was very successful in correcting state-dependent errors relative to a nature run. However, the direct computation of 𝗟x′ requires O(N3d) floating point operations every time step. For operational models, Nd = O(109); it is clear that this operation would be prohibitive. Approaches to reduce the dimensionality of the Leith correction are now described.

b. Low-dimensional approximation

An alternative formulation of Leith’s correction operator is described here, based on the correlation of the leading SVD modes. For a more detailed derivation, see Danforth et al. (2007). The dependent sample of anomalous residuals and model predictions are normalized at each grid point by their standard deviation so that they have unit variance: they are denoted (t) and (t). They are then used to compute the cross-correlation

 
formula

where normalization is required to make a correlation matrix. The matrix is then decomposed to identify pairs of spatial patterns that explain as much as possible of the mean-squared temporal covariance between the fields (t) and (t). The SVD is given by

 
formula

where the columns of the orthonormal matrices 𝗨 and 𝗩 are the left and right singular vectors uk and vk; Σ is a diagonal matrix containing singular values σk whose magnitude decreases with increasing k. The leading patterns u1 and v1 associated with the largest singular value σ1 are the dominant coupled signals in the time series and , respectively (Bretherton et al. 1992). Patterns uk and vk represent the kth most significant coupled signals. Expansion coefficients or principal components (PCs) ak(t), bk(t) describe the magnitude and time dependence of the projection of the coupled signals onto the reference time series. They are given by

 
formula
 
formula

3. Low-dimensional correction

The most significant computational expense required by Leith’s empirical correction (8) involves solving the Nd-dimensional linear system 𝗖xaxaw(T) = x′(T) for w at each time T during a forecast integration. Assuming that Δt is small (error growth is approximately linear during the short forecasts used for training), we can approximate and 𝗖xaxa. Then a substantial reduction in computation for this operation can be achieved by expressing w = 𝗖xfxf−1x′ as a linear combination of the leading orthonormal right singular vectors vk, namely,

 
formula

where KNd should be chosen such that the explained variance, given by

 
formula

exceeds a system-dependent threshold for k = K. From trial and error, it appears that an explained variance of r(K) ≥ 0.95 results in the best forecast improvement for the simple model discussed in the following section. As a result, K should be chosen to fulfill this, or a similar, inequality. For the SPEEDY (simplified parameterizations, primitive equation dynamics) model, Danforth et al. (2007) found the best results for the anomaly correlation using a truncation of K = 10.

It is important to note that only the component of w in the K-dimensional space spanned by the right singular vectors vk can contribute to the empirical correction defined by (20). This dependence can be exploited as follows. Assume the model state at time T during a numerical model integration is given by x(T). The normalized state anomaly is given by the vector x′(T) = x(T) − 〈xa〉 normalized by the standard deviation of xfΔt′ over the dependent sample. The component of explained by the signal vk may then be estimated by computing the new principal component bk(T) = vTk · . The right PC covariance over the dependent sample is given by 𝗖bb = 〈bbT〉, calculated using bk from (19). Owing to the orthogonality of the right singular vectors vk, assuming an infinite sample, PCs bk and bj are uncorrelated for kj. As a result, we restrict the finite sample covariance 𝗖bb to be diagonal. The linear system

 
formula

may then be solved for γ at time T. As a result, the cost of solving (22) is O(K) where K is the number of SVD modes retained, as opposed to the (Nd × Nd) linear system required by Leith’s full-dimensional empirical correction. The solution of (22) gives an approximation of w(T); namely,

 
formula
 
formula

where (T) is generated by solving the linear system (22) in the space of the leading K singular vectors, while w(T) requires solving the Nd-dimensional linear system (23) in the space of the model grid. Writing uck for the error signal uk weighted by the standard deviation of δxaΔt′ over the dependent sample, the kth component of the state-dependent error correction at time T is given by

 
formula

where σk is the coupling strength over the dependent sample [given by the SVD (17)]. The weight γk(T) assigned to residual signal uck indicates the sign and magnitude of the correction that may amplify, dampen, or shift the flow anomaly local to the pattern uck. Then the SVD-corrected model is given at time T by

 
formula

so that during a prediction, a few (K) dominant model state signals vk can be projected onto the anomalous, normalized model state vector. The resulting sum ΣKk=1zk is the best representation of the original residual anomalies δxaΔt′ in terms of the current forecast state x(T). If the correlation between the normalized state anomaly and the local pattern vk is small, the new expansion coefficient bk(T) will be negligible, no correction by uck will be made at time T; therefore, no harm will be done to the prediction. This fact is particularly important with respect to predicting behavior that may vary on a time scale longer than the training period, for example, El Niño–Southern Oscillation (ENSO) events (Barnston et al. 1999).

The SVD representation of the error is advantageous compared to Leith’s correction operator for several reasons. First, it reduces the sampling errors by identifying the most robust coupled signals between the residual and forecast state anomalies. Second, the added computation is trivial; it requires solving a K-dimensional linear system and computing K inner products. Finally, the SVD signals identified by the technique can be used by modelers to isolate flow-dependent model deficiencies. In ranking these signals by strength, SVD gives modelers the ability to evaluate the relative importance of various model errors.

4. Numerical experiments

a. Lorenz ’96 model

In this section we demonstrate the empirical correction procedures using a simple nonlinear system to define the truth (the quantity that will be predicted by a model). The Nd-dimensional governing equations, given by Lorenz (1996), are

 
formula
 
formula

where Nd = (J + 1)I and the subscripts i and j are treated as periodic with period I and J respectively. For example, xI+1x1 so that the variables form a cyclic chain. Equation (27) describes the behavior of a set of slowly changing, large-amplitude unspecified scalar meteorological quantities, such as temperature, at I equally spaced grid sites on a latitude circle. Each xi is coupled to J quickly changing, small-amplitude variables yj whose dynamics are described by (28). The notation floor [(j − 1)/J] describes integer truncation of the bracketed term and indicates that each of the small-amplitude y variables in a group is equally affected by the large-amplitude x variable to which it belongs. In our experiments, we have used the same parameter values as Wilks (2005), namely, I = 8 and J = 32 for a total of Nd = 264 state variables, h = 1, c = 10, and b = 10 (which has the effect of making the small-amplitude variables yi oscillate 10 times more rapidly than the large-amplitude variables xi), and the forcing is chosen to be either F = 8, 14, or 18. Wilks (2005) chose F = 18 and F = 20 to ensure that the deterministic parameterizations would be competitive with the stochastic parameterizations.

This system shares certain properties with many atmospheric models: a nonlinear advection-like term, a linear term representing loss of energy to thermal dissipation, and a constant forcing term F to provide energy. It has been used in several previous predictability studies to represent atmospheric behavior (e.g., Lorenz and Emanuel 1998; Wilks 2005; Danforth and Yorke 2006) and for data assimilation (e.g., Anderson 2001; Whitaker and Hamill 2002; Ott et al. 2004). The time unit represents the dissipative decay time of 5 days (Lorenz and Emanuel 1998) and there are 13 positive Lyapunov exponents.

We use Eqs. (27) and (28) to generate a time series xa of “true” values of the slow variables. We then set h = 0 in Eq. (27) and add a bias term with weight α to obtain the model

 
formula

which fails to resolve any of the small-amplitude behavior. The sinusoidal bias term, weighted by α = 1, is included as an additional source of model error with respect to model (27), meant to represent a longitudinally dependent misrepresentation of the dissipation or forcing. Forecasts xfΔt generated by this model exhibit model error with respect to xa that is sinusoidally dependent on the grid point, but independent of the state. Training data is then compiled by generating 107 short forecasts (Δt = 0.1 time units ≈ 12 h) of model (29) and comparing these forecasts to xa. Throughout the experiment, a fourth-order Runge–Kutta integration scheme is used with a time step of 0.001. The model bias is then given by the time average of the difference between xa and xf12, namely, 〈δxa12〉 [see (2)]. The anomalous errors and forecasts are used to generate Leith’s correction operator 𝗟 (15), and the corresponding modes (17) for SVD correction.

We then experiment with empirical correction of (29) using the improved model

 
formula

where the term D(x) attempts to correct the sinusoidal bias and represent the behavior unresolved by (29), namely, the coupling to the small-amplitude variables described by (28).

b. Empirical correction experiments

Five different versions of model (30) are used to forecast a set of 10 000 uncorrelated initial states chosen from the true time series xa. These initial states are distinct from those used for training, and consecutive initial states are separated by 50 time units (250 days). Methods are distinguished by the explicit form of the empirical correction term D(x) in (30), which is meant to represent the small-amplitude behavior and reduce the bias, as indicated below:

 
formula

Term D(1)(x) represents the original model (29); forecasts made with no empirical correction will be used to gauge the success of other methods. Term D(2)(x) is the time average residual observed in forecasts made by model (29) and represents a state-independent correction (see Fig. 1). Term D(3)(x) represents Leith’s empirical correction operator (8), and term D(4)(x) represents the SVD correction described by (25). Term D(5)(x) represents forecasts made by system (27), (28) with observational noise (see next subsection), but with no model error. Skill scores are made for the ensemble mean using anomaly correlation and rms error and verifying against the time series xa.

Fig. 1.

The empirically generated bias 〈δxa12〉 (time-average residual) in model (29) relative to (27) slightly underestimates and shifts the true bias 〈q〉. The true bias is a combination of the sinusoidal state-independent error and the bulk effect of ignoring the small-amplitude modes. It is described by Eq. (35). The Lorenz ’96 model with forcing F = 18 exhibits a slightly larger bias due to the effect the large-amplitude variables (with increased energy) have on the small-amplitude variables.

Fig. 1.

The empirically generated bias 〈δxa12〉 (time-average residual) in model (29) relative to (27) slightly underestimates and shifts the true bias 〈q〉. The true bias is a combination of the sinusoidal state-independent error and the bulk effect of ignoring the small-amplitude modes. It is described by Eq. (35). The Lorenz ’96 model with forcing F = 18 exhibits a slightly larger bias due to the effect the large-amplitude variables (with increased energy) have on the small-amplitude variables.

c. Ensemble initialization

Our numerical experiments are initialized in a manner inspired by Wilks (2005); each ensemble forecast is initialized by choosing random perturbations from a distribution that approximates the shape of the attractor local to the initial state. The distribution corresponding to xa(T), for example, is found by gathering analogs from long integrations of (27), (28). Analogs are defined to be states within an I-dimensional hypercube (with side i having length equal to 5% of the climatological span of xi) centered on xa(T). As in Wilks (2005), the analog integrations were performed until each of the 10 000 cubes contained a minimum of 100 states. The I × I covariance matrix for the analogs of the state xa(T) is denoted 𝗖 (T). The distribution from which the initial ensemble for the forecast of xa(T) is chosen is given by

 
formula

where λ is the average eigenvalue of 𝗖(T) and σclim is the climatological standard deviation of xa. The covariance 𝗖init(T) has the same eigenvectors and correlations as 𝗖(T), but is scaled so that the average standard deviation is 5% of the climatology of the true time series xa (Wilks 2005). Control states for each ensemble forecast are generated by adding appropriately shaped random noise to each of the 10 000 true states,

 
formula

where y(1, k) is an I-dimensional vector whose entries are independent random numbers chosen from a Gaussian distribution. The square root of 𝗖init is computed offline for each initial state using the Cholesky decomposition (Golub and Van Loan 1996). Ensemble members are then generated from a multivariate Gaussian distribution by performing the same operation on xf0(1, k), namely,

 
formula

where y is different for each of the 10 000 initial states and Ne = 20 ensemble members.

d. Results

The bias in model (29), relative to (27), results from the unresolved behavior of the small-amplitude variables yi and the additional state-independent error term

 
formula

The time-average effect of these model errors, namely 〈q〉, is given by

 
formula

The time-average residual 〈δxa12(3) is an empirical estimate of 〈q〉. Figure 1 shows the true bias 〈q〉, and 〈δxa12〉 weighted by a factor of −12−1, representing the fact that it is a correction applied every hour but was calculated by averaging 12-h errors. The trained empirical correction term D(2)(x) = 〈δxa12〉 slightly underestimates and shifts the true bias of model (29).

Coupled signals between normalized, anomalous residuals and forecasts—namely, ′ and ′ respectively—are identified and ranked by singular value decomposition (17). The left and right singular vectors uk and vk are shown in Fig. 2 for forecasts made with model (29); they are superimposed on a shifted scale for visual simplicity. Unit vectors u1 and v1 suggest that states of model (29) of the shape u1 are typically misrepresented. Because each of the coupled signal pairs for the Lorenz ’96 model with forcings F = 14 and F = 18 roughly satisfies uk ≈ −vk, the state anomalies will be damped by the SVD empirical correction. The SVD method is also capable of suggesting amplification of anomalies if ukvk, as seen for F = 8 mode 3 and as demonstrated by Danforth et al. (2007).

Fig. 2.

Coupled signals between normalized, anomalous residuals and forecasts—namely, δxa12 and ′, respectively—are identified and ranked by singular value decomposition (17). The left and right singular vectors uk (solid) and vk (dashed) are shown for forecasts made with model (29); they are superimposed for visual simplicity. For example, unit vectors u1 and v1 represent the most dominant coupled signal between errors and forecasts, respectively. Qualitative similarities are seen between the modes found for the F = 14 and F = 18 Lorenz ’96 model. Mode 3 for F = 8 suggests a strengthening of anomalies of the shape v3, while mode 3 for both F = 14 and F = 18 suggests a weakening of anomalies of the shape v3.

Fig. 2.

Coupled signals between normalized, anomalous residuals and forecasts—namely, δxa12 and ′, respectively—are identified and ranked by singular value decomposition (17). The left and right singular vectors uk (solid) and vk (dashed) are shown for forecasts made with model (29); they are superimposed for visual simplicity. For example, unit vectors u1 and v1 represent the most dominant coupled signal between errors and forecasts, respectively. Qualitative similarities are seen between the modes found for the F = 14 and F = 18 Lorenz ’96 model. Mode 3 for F = 8 suggests a strengthening of anomalies of the shape v3, while mode 3 for both F = 14 and F = 18 suggests a weakening of anomalies of the shape v3.

The explained variance (21) for the spectrum of singular values of the cross-covariance matrix is shown in Fig. 3. It indicates that the most of the empirically estimated state-dependent model error can be captured with the first few modes for F = 8, but not for F = 18. The additional computational expense of including the few modes required to reach 95% is negligible for this model, where the number of degrees of freedom that one can attempt to correct is a maximum of K = I = 8. For an operational weather model, the spectrum is likely to be significantly flatter than that observed for F = 8. As a result, we may be forced to correct many forecast patterns (large K) to see improvement. Fortunately, the SVD technique that we are describing is very cheap, even for large K.

Fig. 3.

The explained variance (21) for the spectrum of singular values of the cross-covariance matrix shows how much of the empirically estimated state-dependent model error can be captured with the leading modes. To explain 95% of the variance K = 7, 5, and 2 modes are required for the Lorenz ’96 model with forcings F = 18, 14, and 8, respectively. Steep spectrums, like that seen for F = 8, indicate that the SVD representation is likely to be able to capture the relevant model error information with very few degrees of freedom.

Fig. 3.

The explained variance (21) for the spectrum of singular values of the cross-covariance matrix shows how much of the empirically estimated state-dependent model error can be captured with the leading modes. To explain 95% of the variance K = 7, 5, and 2 modes are required for the Lorenz ’96 model with forcings F = 18, 14, and 8, respectively. Steep spectrums, like that seen for F = 8, indicate that the SVD representation is likely to be able to capture the relevant model error information with very few degrees of freedom.

A sample of 107 short forecasts was used to train the operators in order to predict a maximum of I = 8 degrees of freedom. In practice, such a large sample size is unavailable for training. In the case of a small training set, the singular value spectrum may be steep, not due to the importance of the leading modes but due to the smaller sample size (Hamill et al. 2001). The larger the sample size, the more likely the operator will represent the true covariance and, hopefully, the greater the number of forecast patterns that can be corrected.

Typical 10-day, 20-member ensemble forecasts of x1 using model (30) and F = 14, with empirical correction terms described by (31), are shown in Fig. 4. Forecasts empirically corrected by the observed bias of model (29), D(2), perform slightly better than forecasts not corrected at all, D(1). Ensemble divergence is typically significant by day 5 for both D(1) and D(2). State-dependent empirical correction significantly improves forecasts. Ensemble spread is weak for both Leith’s empirical correction D(3) and the SVD correction D(4) with mode truncation K = 5. However, small spread is seen for perfect model forecasts D(5), and the effect is less evident for F = 8 and F = 18. Since the ensemble spread represents the uncertainty in the forecast and since the forecast skill is clearly improved by the Leith and SVD empirical corrections, this result should be expected.

Fig. 4.

Typical 10-day ensemble forecasts of x1 using model (30), F = 14, with empirical correction terms described by (31). The dashed curve is a true solution of system (27), (28). The solid curves are a 20-member ensemble forecast of model (30), initialized according to Eq. (34). Forecasts empirically corrected by the observed bias of model (29)—namely, D(2)—perform slightly better than forecasts not corrected at all, D(1). Ensemble divergence is typically significant by day 5 for both D(1) and D(2). Ensemble spread is weak for both Leith’s empirical correction D(3) and the SVD correction D(4) with mode truncation K = 5. However, small spread is seen for perfect model forecasts D(5), and the effect is less evident for F = 8 and F = 18.

Fig. 4.

Typical 10-day ensemble forecasts of x1 using model (30), F = 14, with empirical correction terms described by (31). The dashed curve is a true solution of system (27), (28). The solid curves are a 20-member ensemble forecast of model (30), initialized according to Eq. (34). Forecasts empirically corrected by the observed bias of model (29)—namely, D(2)—perform slightly better than forecasts not corrected at all, D(1). Ensemble divergence is typically significant by day 5 for both D(1) and D(2). Ensemble spread is weak for both Leith’s empirical correction D(3) and the SVD correction D(4) with mode truncation K = 5. However, small spread is seen for perfect model forecasts D(5), and the effect is less evident for F = 8 and F = 18.

Figure 5 shows the average anomaly correlation and rms error (RMSE) of the ensemble mean of 10 000 independent 20-member ensemble forecasts. The state-independent correction adds approximately 1 time unit (5 days) to the usefulness of F = 8 forecasts, and 0.1 time units (12 h) to the usefulness of F = 14 and F = 18 forecasts. For F = 14, Leith’s operator improves forecasts by 710% (27.2 days), and the SVD correction results in an improvement of 1176% (45 days). The SVD correction term D(4)(x) is chosen to have K = 7, 5, and 2 modes for forcings F = 18, 14, and 8, respectively; the truncation was chosen to explain 95% of the variance in the cross-covariance matrix (see Fig. 3). Table 1 summarizes the improvement in AC scores. While we present results for Ne = 20, AC scores for Ne = 1 and Ne = 50 are qualitatively similar, indicating that the performance of the SVD method is insensitive to ensemble size.

Fig. 5.

Average anomaly correlation and RMSE the ensemble mean of 10 000 independent 20-member ensemble forecasts. The state-independent correction D(2) adds approximately 1 time unit (5 days) to the usefulness of forecasts with no correction (D(1)) for F = 8, and 0.1 time units (12 h) to the usefulness of F = 14 and F = 18 forecasts. With a parameterization of model error in F = 18 forecasts, Wilks (2005) improved forecasts by a similar length of time (see the × in the lower left hand window). For F = 14, Leith’s operator D(3) improves forecasts by an average of 710%, and the SVD correction D(4) results in an average improvement of 1176%. The SVD correction is chosen to have K = 7, 5, and 2 modes for forcings F = 18, 14, and 8, respectively; the truncation was chosen to explain 95% of the variance in the cross-covariance matrix (see Fig. 3). The only source of error in forecasts made with the perfect model D(5) is observational noise. See Table 1 for a complete list of improvements.

Fig. 5.

Average anomaly correlation and RMSE the ensemble mean of 10 000 independent 20-member ensemble forecasts. The state-independent correction D(2) adds approximately 1 time unit (5 days) to the usefulness of forecasts with no correction (D(1)) for F = 8, and 0.1 time units (12 h) to the usefulness of F = 14 and F = 18 forecasts. With a parameterization of model error in F = 18 forecasts, Wilks (2005) improved forecasts by a similar length of time (see the × in the lower left hand window). For F = 14, Leith’s operator D(3) improves forecasts by an average of 710%, and the SVD correction D(4) results in an average improvement of 1176%. The SVD correction is chosen to have K = 7, 5, and 2 modes for forcings F = 18, 14, and 8, respectively; the truncation was chosen to explain 95% of the variance in the cross-covariance matrix (see Fig. 3). The only source of error in forecasts made with the perfect model D(5) is observational noise. See Table 1 for a complete list of improvements.

Table 1.

Improvement in crossing time of anomaly correlation scores with 0.6 for different empirical correction schemes relative to D(1)(x) = 0. For the anomaly correlations, see Fig. 5 where D(4)(x) is truncated at mode K = 2, 5, and 7 for the Lorenz ’96 model with forcings F = 8, 14, and 18, respectively. These improvements are shown in bold in the chart. The truncation was chosen to explain 95% of the variance in the cross-covariance matrix ; see Fig. 3.

Improvement in crossing time of anomaly correlation scores with 0.6 for different empirical correction schemes relative to D(1)(x) = 0. For the anomaly correlations, see Fig. 5 where D(4)(x) is truncated at mode K = 2, 5, and 7 for the Lorenz ’96 model with forcings F = 8, 14, and 18, respectively. These improvements are shown in bold in the chart. The truncation was chosen to explain 95% of the variance in the cross-covariance matrix ; see Fig. 3.
Improvement in crossing time of anomaly correlation scores with 0.6 for different empirical correction schemes relative to D(1)(x) = 0. For the anomaly correlations, see Fig. 5 where D(4)(x) is truncated at mode K = 2, 5, and 7 for the Lorenz ’96 model with forcings F = 8, 14, and 18, respectively. These improvements are shown in bold in the chart. The truncation was chosen to explain 95% of the variance in the cross-covariance matrix ; see Fig. 3.

Wilks (2005) used differences between the tendencies of the resolved variables in model (29), with α = 0, F = 18, and the actual tendencies of system (27), (28), to approximate model error. The collection of tendency errors for each resolved variable xi were then fit with a degree four polynomial

 
formula

where the term ei was a stochastic component and β0 corresponds to 〈δxaΔt〉 from (3). For ei = 0, Wilks found that, on average, 20-member ensemble forecasts crossed the 0.6 anomaly correlation line at a time of 4 days. This improvement is illustrated by the × in the bottom left window of Fig. 5. Wilks demonstrated a particular choice for the stochastic term ei to increase the crossing time of 20-member ensemble forecasts by 180% of the crossing time for single integrations with a deterministic parameterization of model error. Ensemble spread was also shown to improve as a result of the stochastic parameterization, with deterministic parameterizations resulting in smaller spread. Figure 5 demonstrates Leith’s state-dependent linear correction to improve on the crossing time of 20-member ensemble forecasts by 292% for F = 18 and the SVD correction to improve on the same measure by 338%. For F = 8 and F = 14, the SVD method outperforms Leith’s method by a much larger margin (see Table 1).

Figure 6 shows the average ensemble spread versus time and versus RMSE. The ensemble dispersion is good for the SVD correction D(4) for F = 8, but for both F = 14 and F = 18 there is essentially no spread. We believe that this is related to the number of degrees of freedom used to correct the model as shown in Fig. 3. Only K = 2 modes are used for SVD correction of F = 8, while K = 5 and K = 7 modes are used for F = 14 and F = 18 respectively. As a result, for F = 8 there are enough degrees of freedom to allow for unstable modes in the SVD-corrected model (Fig. 2), and the ensemble spread is quite good (Fig. 6). By contrast, for F = 14 and F = 18, all of the modes used to correct forecasts result in damping of anomalies (Fig. 2), and consequently in damping the ensemble spread as well. In a more realistic model, the number of SVD modes needed for the empirical correction should be much smaller than the number of degrees of freedom of the model, and the reduction in spread may not be as severe as observed in this model. It should be noted that both SVD and the Leith empirical correction methods essentially find the maximum likelihood estimate of the probability distribution of corrections observed during the training period, given the current state. It may be possible to derive within the SVD scheme a low-order method for estimating the uncertainty associated with each correction. In that case the improvement of the ensemble spread could be obtained by adding random corrections drawn from this distribution to each ensemble member, as suggested by an anonymous reviewer.

Fig. 6.

Average ensemble spread is shown vs time and vs RMSE for 10 000 independent 20-member ensemble forecasts. Terms D(1) and D(2) have been removed for visual clarity. Weak ensemble dispersion is seen for D(4) for F = 14 and F = 18. Since K = 2 modes were used for SVD correction of F = 8, the ensemble spread is quite good. As more modes are used to correct the forecast, the empirical correction appears to overpower the model dynamics.

Fig. 6.

Average ensemble spread is shown vs time and vs RMSE for 10 000 independent 20-member ensemble forecasts. Terms D(1) and D(2) have been removed for visual clarity. Weak ensemble dispersion is seen for D(4) for F = 14 and F = 18. Since K = 2 modes were used for SVD correction of F = 8, the ensemble spread is quite good. As more modes are used to correct the forecast, the empirical correction appears to overpower the model dynamics.

5. Discussion

Leith’s method consistently improves forecasts for short lead times, outperforming the SVD method for the first 10 days of F = 14 and F = 18 forecasts. After 10 days, the ensemble spread of forecasts made using Leith’s method grows rapidly, while the spread in SVD method forecasts remains small. The F = 14 and F = 18 forecasts made with the SVD correction deteriorate rapidly the first few days after which time they degrade at essentially the same rate as forecasts made with a perfect model. This second dynamic behavior is an indication that after the first few days, the SVD method is an excellent parameterization of the behavior of the small-amplitude variables. In fact, the SVD method performs as well or better than the perfect model for the first 10 days of F = 8 forecasts. However, we see in Fig. 3 that as F increases, the SVD method requires a greater number of modes to represent the cross-correlation matrix utilized by Leith’s method. As a result, in the SVD method, F = 18 forecasts are corrected by modes whose coupling is less statistically significant than F = 8 and F = 14. This is demonstrated by mode k = 8 in Fig. 2, which significantly harms SVD-corrected forecasts [see final D(4)(x) row in Table 1] relative to truncation at mode K = 7.

Clearly, these results are overoptimistic in that the model error in (29) relative to system (27) is highly state-dependent. However, Fig. 5 indicates that both Leith’s empirical correction operator and the SVD approximation do an excellent job representing the state-dependent component of the unresolved small-amplitude behavior. In fact, the SVD method isolates and ranks the most relevant spatial correlations described by Leith’s operator. As a result, truncation can actually improve performance. This was verified by using K = I = 8 modes for term D(4)(x); forecasts were slightly worse than those made using Leith’s operator for forcings F = 14 and F = 18.

The methods presented here have relied on an exact characterization of the true state for a very long training period in order to understand the best possible impact of empirical correction. While the analysis increments for an operational weather model are typically available from preimplementation testing, they are computed as the difference between an analysis that suffers from deficiencies in the model used to create it, and are only available for short training periods. Future studies will examine the effectiveness of model error parameterization by SVD using less accurate estimates of the true state and shorter training periods.

6. Conclusions

A new method of state-dependent error correction was introduced, based on singular value decomposition of coupled residual and forecast state anomalies. The cross-covariance is the same as that which appears in Leith’s formulation, but it would be prohibitive to compute for the grid density required by operational weather models. The new method uses the SVD modes as a basis for linear regression and results in significant forecast improvement. The new method is also many orders of magnitude faster than Leith’s empirical correction. The method can be applied at a rather low cost, both in the training and in the correction phases, and yields significant forecast improvements, at least for the Lorenz ’96 model and the simple but realistic global QG and SPEEDY models (Danforth et al. 2007). It could be applied with low computational cost and minimal sampling problems to data assimilation and ensemble numerical weather prediction, applications where accounting for model errors has been found to be important. The method may be particularly useful for forecasting of severe weather events where a posteriori bias correction will typically weaken anomalies. Furthermore, the patterns identified by SVD could also be used to identify sources of model deficiencies and thereby guide future model improvements. Further development of the SVD method will include a low-order method for estimating the uncertainty in the correction terms.

Acknowledgments

This research was supported by NOAA THORPEX Grant NOAA/NA040AR4310103 and NASA Phase-II Grant NNG 06GE87G to the Vermont Advanced Computing Center.

REFERENCES

REFERENCES
Anderson
,
J. L.
,
2001
:
An ensemble adjustment Kalman filter for data assimilation.
Mon. Wea. Rev.
,
129
,
2884
2903
.
Barnston
,
A. G.
,
M. H.
Glantz
, and
Y.
He
,
1999
:
Predictive skill of statistical and dynamical climate models in SST forecasts during the 1997–98 El Niño episode and the 1998 La Niña onset.
Bull. Amer. Meteor. Soc.
,
80
,
217
243
.
Bretherton
,
C.
,
C.
Smith
, and
J.
Wallace
,
1992
:
An intercomparison of methods for finding coupled patterns in climate data.
J. Climate
,
5
,
541
560
.
Dalcher
,
A.
, and
E.
Kalnay
,
1987
:
Error growth and predictability in operational ECMWF forecasts.
Tellus
,
39A
,
474
491
.
Danforth
,
C. M.
, and
J. A.
Yorke
,
2006
:
Making forecasts for chaotic physical processes.
Phys. Rev. Lett.
,
96
.
144102, doi:10.1103/PhysRevLett.96.144102
.
Danforth
,
C. M.
,
E.
Kalnay
, and
T.
Miyoshi
,
2007
:
Estimating and correcting global weather model error.
Mon. Wea. Rev.
,
135
,
281
299
.
DelSole
,
T.
, and
A. Y.
Hou
,
1999
:
Empirical correction of a dynamical model. Part I: Fundamental issues.
Mon. Wea. Rev.
,
127
,
2533
2545
.
Golub
,
G.
, and
C. F.
Van Loan
,
1996
:
Matrix Computations.
3rd ed. The Johns Hopkins University Press, 694 pp
.
Hamill
,
T.
,
J.
Whitaker
, and
C.
Snyder
,
2001
:
Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter.
Mon. Wea. Rev.
,
129
,
2776
2790
.
Leith
,
C. E.
,
1978
:
Objective methods for weather prediction.
Annu. Rev. Fluid Mech.
,
10
,
107
128
.
Lorenz
,
E. N.
,
1996
:
Predictability: A problem partly solved. Proc. ECMWF Seminar on Predictability, Reading, United Kingdom, European Centre for Medium-Range Weather Forecasts, 1–18
.
Lorenz
,
E. N.
, and
K. A.
Emanuel
,
1998
:
Optimal sites for supplementary weather observations: Simulation with a small model.
J. Atmos. Sci.
,
55
,
399
414
.
Ott
,
E.
, and
Coauthors
,
2004
:
A local ensemble Kalman filter for atmospheric data assimilation.
Tellus
,
56A
,
415
428
.
Reynolds
,
C.
,
P. J.
Webster
, and
E.
Kalnay
,
1994
:
Random error growth in NMC’s global forecasts.
Mon. Wea. Rev.
,
122
,
1281
1305
.
Whitaker
,
J.
, and
T.
Hamill
,
2002
:
Ensemble data assimilation without perturbed observations.
Mon. Wea. Rev.
,
130
,
1913
1924
.
Wilks
,
D. S.
,
2005
:
Effects of stochastic parametrizations in the Lorenz ’96 system.
Quart. J. Roy. Meteor. Soc.
,
131
,
389
407
.

Footnotes

Corresponding author address: Chris Danforth, Dept. of Mathematics and Statistics, University of Vermont, Burlington, VT 05401. Email: chris.danforth@uvm.edu