## Abstract

The performance of an ensemble prediction system is inherently flow dependent. This paper investigates the flow dependence of the ensemble performance with the help of linear diagnostics applied to the ensemble perturbations in a small local neighborhood of each model gridpoint location ℓ. A local error covariance matrix 𝗣_{ℓ} is defined for each local region, and the diagnostics are applied to the linear space defined by the range of the ensemble-based estimate of 𝗣_{ℓ}. The particular diagnostics are chosen to help investigate the efficiency of this linear space in capturing the space of analysis and forecast uncertainties.

Numerical experiments are carried out with an implementation of the local ensemble transform Kalman filter (LETKF) data assimilation system on a reduced-resolution [T62 and 28 vertical levels (T62L28)] version of the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS). Both simulated observations under the perfect model scenario and observations of the real atmosphere in a realistic setting are used in these experiments. It is found that (i) paradoxically, the linear space provides an increasingly better estimate of the space of forecast uncertainties as the time evolution of the ensemble perturbations becomes more nonlinear with increasing forecast time; (ii) the linear space provides a more reliable representation of the space of forecast uncertainties for cases of more rapid error growth (i.e., for cases of lower predictability); and (iii) the ensemble dimension (*E* dimension) is a reliable predictor of the performance of the ensemble in predicting the space of forecast uncertainties.

## 1. Introduction

The purpose of an ensemble prediction system is to account for the influence of the spatiotemporal changes in predictability on the forecasts (e.g., Kalnay 2002; Palmer and Hagedorn 2006). Kuhl et al. (2007, hereafter KEA07) showed that the spatiotemporal changes in the predictability make the performance of an ensemble prediction system of a finite number of ensemble members inherently flow dependent. Thus, a uniformly good performance of the system over all weather situations cannot be expected. We refer to this flow dependence of the performance of the finite size ensemble as the local predictability of the ensemble performance.

The study of KEA07 was based on assimilating randomly located simulated observations under the perfect model scenario with an implementation of the local ensemble transform Kalman filter (LETKF) data assimilation system (Hunt et al. 2007; Szunyogh et al. 2008) on a reduced-resolution [T62 and 28 vertical levels (T62L28)] version of the model component of the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS). The purpose of the present study is to extend the investigation of KEA07, building to a more realistic setting, by first assimilating simulated observations in realistic locations under a perfect model scenario and then assimilating an operationally used set of observations of the real atmosphere.

The main goal of our study is to lay the theoretical foundation of a practical approach to predict the spatiotemporal changes in the performance of an ensemble prediction system. In particular, we define a local volume around each grid point ℓ and define a local state vector that represents the model state in the local volume. Then, we investigate the efficiency of the linear space 𝒮_{ℓ}, defined by the range of the ensemble-based estimate of the local covariance matrix 𝗣_{ℓ} for the components of the local state vector, in capturing the true forecast uncertainties. (The quality of the prediction of the magnitude of the analysis and forecast uncertainty and of the skill of the ensemble in distinguishing between the importance of the different state space directions within 𝒮_{ℓ} will be the subject of our next paper.) The motivation to utilize the local approach here is twofold: first, it provides a natural framework to study the spatial changes in the ensemble performance; second, the experience with ensemble-based Kalman filter data assimilation schemes strongly suggests that an ensemble of practically attainable size can provide a sufficiently accurate estimate of the covariance between different components of the state vectors, at short forecast times, only in localized regions (e.g., Houtekamer and Mitchell 2001; Szunyogh et al. 2005; Anderson 2007; Whitaker and Hamill 2002; Hamill et al. 2001).

In this study, we use the LETKF algorithm to generate the ensemble initial conditions for the investigation of the relationship between strong instabilities, local dimensionality, and the ability of the ensemble to capture the local structure of the forecast errors; however, we expect our results to remain valid for any suitably formed ensemble-based Kalman filter scheme. Because we do not attempt to account for the effect of model errors in our formulation of the ensemble, our results should not be used directly to interpret the behavior of an ensemble system that also employs model perturbations: for example, the ensemble prediction system of the European Centre for Medium-Range Weather Forecasts (ECMWF; Berner et al. 2009). We believe, however, that our diagnostic approach could be used to validate existing model perturbation schemes.

The structure of this paper is as follows: In section 2, we introduce, in a more rigorous fashion than in KEA07, the diagnostics we use to assess and explain the performance of the ensemble prediction system at the different locations and times. In section 3, we describe the design of the numerical experiments. In section 4, we examine the spatiotemporal evolution of the forecast errors, which is our preferred way to assess the spatiotemporal evolution of predictability, and we analyze the relationship between predictability and our diagnostics. In section 5, we summarize our conclusions.

## 2. Diagnostics

We use linear diagnostics applied to the ensemble perturbations in a small local neighborhood of each model grid point to explore the spatiotemporally changing predictive qualities of the ensemble. In particular, we define a local state vector and the associated local covariance matrix to represent the state and the uncertainty in the state estimate at each grid point. In addition, we introduce a set of local diagnostics based on the eigensolution of the local covariance matrix and a measure of nonlinearity in the evolution of the local state vectors.

### a. Local vectors and their covariance

We define a local state vector **x**(ℓ) with all *N* state variables of the model representation of the state within a local volume centered at location (grid point) ℓ. For the rest of this article, we will discuss what to do at an arbitrary location ℓ, so we now drop the argument ℓ from the notation of the local state vectors. The mathematical model we adopt to predict the evolution of uncertainty in a local state estimate (analysis or forecast) **x**^{e} is based on the assumption that the error in the state estimate,

$$\boldsymbol{\xi} = \mathbf{x}^{e} - \mathbf{x}^{t}, \qquad (1)$$

is a random variable. In Eq. (1), **x**^{t} is the model representation of the true state of the atmosphere, which is, in practice, unknown. The covariance between the different components of **ξ** is described by the error covariance matrix 𝗣_{ℓ}.

We employ a *K*-member ensemble of local state estimates **x**^{(k)}, *k* = 1, … , *K*, to predict the uncertainty in the knowledge of the local state. The ensemble-based estimate of the covariance matrix 𝗣_{ℓ} is

$$\hat{\mathsf{P}}_{\ell} = \frac{1}{K-1} \sum_{k=1}^{K} \mathbf{x}'^{(k)} \left(\mathbf{x}'^{(k)}\right)^{\mathrm{T}}, \qquad (2)$$

where the ensemble perturbations **x**′^{(k)}, *k* = 1, … , *K*, are defined by the difference

$$\mathbf{x}'^{(k)} = \mathbf{x}^{(k)} - \bar{\mathbf{x}} \qquad (3)$$

between the ensemble members **x**^{(k)}, *k* = 1, … , *K*, and the ensemble mean,

$$\bar{\mathbf{x}} = \frac{1}{K} \sum_{k=1}^{K} \mathbf{x}^{(k)}. \qquad (4)$$

In Eq. (2), superscript T denotes the matrix transpose. The linear space 𝒮_{ℓ} defined by the range of P̂_{ℓ} is spanned by the *K* ensemble perturbations. (We use the subscript ℓ in the notation 𝗣_{ℓ}, P̂_{ℓ}, and 𝒮_{ℓ} to emphasize that the linear space we are interested in is defined for the local neighborhood of each grid point.) Based on Eqs. (3) and (4), the sum of the ensemble perturbations is zero at all forecast lead times: that is,

$$\sum_{k=1}^{K} \mathbf{x}'^{(k)} = \mathbf{0}. \qquad (5)$$

Equation (5) indicates that the *K* ensemble perturbations are not linearly independent. Thus, the dimension of the linear space 𝒮_{ℓ} cannot be larger than *K* − 1.
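The computation in Eqs. (2)–(5) can be sketched in a few lines of NumPy; the toy dimensions and variable names below are ours, not the paper's (the paper uses *N* = 1975 and *K* = 40):

```python
import numpy as np

# Sketch of Eqs. (2)-(5): ensemble mean, perturbations, and the
# ensemble-based estimate of the local covariance matrix.
rng = np.random.default_rng(0)
N, K = 12, 5                                # toy local state dimension, ensemble size
X = rng.standard_normal((N, K))             # columns are ensemble members x^(k)

x_bar = X.mean(axis=1, keepdims=True)       # ensemble mean, Eq. (4)
Xp = X - x_bar                              # perturbations x'^(k), Eq. (3)
P_hat = (Xp @ Xp.T) / (K - 1)               # covariance estimate, Eq. (2)

# Eq. (5): the perturbations sum to zero, so rank(P_hat) <= K - 1
perturbation_sum = Xp.sum(axis=1)
rank = np.linalg.matrix_rank(P_hat)
```

Because the *K* columns of `Xp` sum to zero, the rank of `P_hat` cannot exceed *K* − 1, which is exactly the statement following Eq. (5).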

To obtain a convenient orthonormal basis in 𝒮_{ℓ} for the definition and computation of our diagnostics, we compute the eigensolution of P̂_{ℓ}. Because P̂_{ℓ} is a nonnegative definite and symmetric *N* × *N* matrix, it has *N* nonnegative eigenvalues, *λ*_{1} ≥ *λ*_{2} ≥ … ≥ *λ*_{N} ≥ 0, and the *N* associated eigenvectors **u**_{n}, *n* = 1, … , *N*, are orthogonal with respect to the Euclidean inner product. That is, when the eigenvectors are chosen to be of unit length with respect to the Euclidean vector norm,

$$\mathbf{u}_i^{\mathrm{T}} \mathbf{u}_j = \delta_{ij}, \qquad (6)$$

where *δ*_{ij} = 1 for *i* = *j* and *δ*_{ij} = 0 for *i* ≠ *j*. When the number of components of the local state vector is larger than the number of ensemble members (*N* > *K*), only the first *K* − 1 eigenvalues can be larger than zero (in what follows, *N* > *K* is assumed unless noted otherwise). In this case, the normalized eigenvectors associated with the first *K* − 1 eigenvalues, **u**_{k}, *k* = 1, … , *K* − 1, define an orthonormal basis in 𝒮_{ℓ}. The physical interpretation of the vectors **u**_{k}, *k* = 1, … , *K* − 1, is that they represent linearly independent patterns of uncertainty in the ensemble perturbations in the local region at ℓ.

An arbitrary local state vector **x** can be decomposed as

$$\mathbf{x} = \bar{\mathbf{x}} + \delta\mathbf{x}, \qquad (7)$$

where *δ***x** is the difference between **x** and the ensemble mean **x̄**. The perturbation vector *δ***x** can be further decomposed as

$$\delta\mathbf{x} = \delta\mathbf{x}^{(\|)} + \delta\mathbf{x}^{(\perp)}, \qquad (8)$$

where *δ***x**^{(‖)} is the component that projects into 𝒮_{ℓ}; that is,

$$\delta\mathbf{x}^{(\|)} = \sum_{k=1}^{K-1} a_k \mathbf{u}_k, \qquad (9)$$

where the coordinate *a*_{k}, *k* = 1, … , *K* − 1, can be computed by

$$a_k = \mathbf{u}_k^{\mathrm{T}}\, \delta\mathbf{x}. \qquad (10)$$

The vector *δ***x**^{(⊥)} is the component of *δ***x** that projects into the null space of P̂_{ℓ}. That is, *δ***x**^{(⊥)} cannot be represented by the ensemble perturbations. Using this notation, the error **ξ** in the state estimate **x**^{e} can be decomposed as

$$\boldsymbol{\xi} = \boldsymbol{\xi}^{(\|)} + \boldsymbol{\xi}^{(\perp)}. \qquad (11)$$

Although the ensemble mean, or the error in the ensemble mean, does not appear directly in the local decomposition of the error [the rhs of Eq. (11)], the ensemble mean provides the reference point for the definition of the basis vectors that span the space 𝒮_{ℓ}.
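The decomposition in Eqs. (7)–(11) amounts to an orthogonal projection onto the span of the leading eigenvectors; a sketch with toy sizes and our own variable names:

```python
import numpy as np

# Project an arbitrary perturbation onto the ensemble subspace (Eqs. 9-10)
# and form the null-space residual.
rng = np.random.default_rng(2)
N, K = 10, 4
Xp = rng.standard_normal((N, K))
Xp -= Xp.mean(axis=1, keepdims=True)
lam, U = np.linalg.eigh((Xp @ Xp.T) / (K - 1))
order = np.argsort(lam)[::-1]
U_S = U[:, order[:K - 1]]                   # orthonormal basis u_k, k = 1..K-1

dx = rng.standard_normal(N)                 # arbitrary perturbation delta-x
a = U_S.T @ dx                              # coordinates, Eq. (10)
dx_par = U_S @ a                            # in-subspace component, Eq. (9)
dx_perp = dx - dx_par                       # null-space component
```

The two components are orthogonal, and their sum recovers the original perturbation, mirroring Eq. (8).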

The main focus of our investigation in this paper is the linear space 𝒮_{ℓ}. We choose diagnostics and design numerical experiments to identify the conditions under which the evolution of the forecast uncertainties can be efficiently described in 𝒮_{ℓ}. We emphasize that an ensemble forecast system can, in principle, describe the evolution of the forecast uncertainty even if it is nonlinear. In that case, however, it is not guaranteed that the probability distribution of the uncertainty is Gaussian, and the knowledge of the mean and the covariance matrix may not be sufficient to fully describe the probability distribution. We still restrict our attention to 𝒮_{ℓ}, because our objective is to study spatiotemporal changes in the performance of the ensemble, which requires a measure of performance that can be computed for an arbitrary time and location.

### b. Explained variance

Our choice of this measure is the explained variance, which measures the projection of the analysis and forecast errors onto 𝒮_{ℓ}. Formally, the explained variance (EV) is calculated as

$$\mathrm{EV} = \frac{\left\|\boldsymbol{\xi}^{(\|)}\right\|^{2}}{\left\|\boldsymbol{\xi}\right\|^{2}}, \qquad (12)$$

where ‖·‖ is the Euclidean vector norm on the space of the local state vectors. (Because 𝒮_{ℓ} is a subspace of the space of the local state vectors, this norm can be used to measure the magnitude of both the error and its projection into 𝒮_{ℓ}.)

The larger EV is, the more efficient 𝒮_{ℓ} is in capturing the uncertain components of the analysis and forecast fields; EV takes its maximum value of one when the entire forecast error projects into 𝒮_{ℓ} and its minimum value of zero when the forecast error has no projection into 𝒮_{ℓ}. Our definition of explained variance is similar to the perturbation versus error correlation analysis (PECA) diagnostic defined in Wei and Toth (2003) when PECA is calculated using an optimally combined perturbation vector. The main difference between the two diagnostics is that we calculate the explained variance using local regions as opposed to the global domain used in Wei and Toth (2003). Finally, we emphasize that the explained variance measures one specific aspect of the performance of the ensemble system, and a good performance with respect to the explained variance does not guarantee good performance with respect to another verification score.
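A sketch of the explained-variance computation (the helper name `explained_variance` is ours, not from the paper):

```python
import numpy as np

# EV = ||xi_parallel||^2 / ||xi||^2, Eq. (12): the fraction of the squared
# error magnitude that lies in the span of the ensemble perturbations.
def explained_variance(Xp, xi):
    U, s, _ = np.linalg.svd(Xp, full_matrices=False)
    U = U[:, s > 1e-10]                     # orthonormal basis of the subspace
    xi_par = U @ (U.T @ xi)                 # projection of the error
    return float(xi_par @ xi_par) / float(xi @ xi)

rng = np.random.default_rng(3)
Xp = rng.standard_normal((10, 4))
Xp -= Xp.mean(axis=1, keepdims=True)

ev_inside = explained_variance(Xp, Xp[:, 0])   # an error lying in the subspace
```

An error that lies entirely within the ensemble subspace yields EV = 1; an error orthogonal to it yields EV = 0.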

### c. E dimension

The ensemble dimension (*E* dimension),

$$E = \frac{\left(\sum_{k=1}^{K} \sqrt{\lambda_k}\right)^{2}}{\sum_{k=1}^{K} \lambda_k}, \qquad (13)$$

which was introduced by Patil et al. (2001) and discussed in detail by Oczkowski et al. (2005), characterizes the local complexity of the dynamics. The *E* dimension is a spatiotemporally evolving measure of the steepness of the eigenvalue spectrum, *λ*_{1} ≥ *λ*_{2} ≥ … ≥ *λ*_{K}, having smaller values for a steeper spectrum (Szunyogh et al. 2007). For our choice of the perturbations, where the *K* perturbations are linearly dependent, *λ*_{K} = 0, and the largest possible value of *E* is *K* − 1. For a set of linearly independent ensemble perturbations, the maximum value of *E* is equal to the number of ensemble perturbations *K*, which occurs when the uncertainty predicted by the ensemble is evenly distributed between *K* linear spatial patterns in 𝒮_{ℓ}.
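The *E* dimension is straightforward to compute from the eigenvalue spectrum; a sketch:

```python
import numpy as np

# E dimension of an eigenvalue spectrum: equals the number of equal nonzero
# eigenvalues when variance is evenly distributed, and approaches 1 for a
# very steep spectrum.
def e_dimension(lam):
    lam = np.clip(np.asarray(lam, dtype=float), 0.0, None)
    return float(np.sqrt(lam).sum() ** 2 / lam.sum())

flat = e_dimension([1.0, 1.0, 1.0, 0.0])    # K = 4, K - 1 equal eigenvalues
steep = e_dimension([1.0, 0.01, 0.01, 0.0]) # steep spectrum, low E dimension
```

With *K* − 1 equal nonzero eigenvalues (the zero-sum perturbation case), *E* attains its maximum of *K* − 1, consistent with the bound stated above.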

### d. Linearity of the local dynamics

The period of time in which model dynamics can be approximated as linear is commonly assumed to be 2–3 days (e.g., Palmer et al. 1994). Gilmour et al. (2001), using an objective measure of the importance of nonlinear effects on the dynamics called "relative nonlinearity," argued that the assumption of linearity may not be valid for longer than 24 h. To define the relative nonlinearity measure, Gilmour et al. (2001) considered the evolution of twin pairs of ensemble members, which were obtained by adding the same ensemble perturbation to the analysis with both a positive and a negative sign. We introduce a measure of nonlinearity that is motivated by that of Gilmour et al. (2001).

The evolution of the *k*th ensemble member is governed by the equation

$$\mathbf{x}_g^{e(k)} = \mathbf{F}\left(\bar{\mathbf{x}}_g^{a} + \delta\mathbf{x}_g^{a(k)}\right) = \mathbf{F}\left(\bar{\mathbf{x}}_g^{a}\right) + \mathsf{L}\,\delta\mathbf{x}_g^{a(k)} + \mathsf{N}\left(\delta\mathbf{x}_g^{a(k)}, \bar{\mathbf{x}}_g^{a}\right), \qquad (15)$$

where 𝗟 is a linear operator and 𝗡 is a nonlinear function of the analysis perturbation and the analysis **x̄**_{g}^{a}. The subscript *g* indicates that Eq. (15) is for global state vectors instead of the local state vectors we consider elsewhere in this paper. Equation (15) is based on the Taylor expansion of the dynamics 𝗙 about **x̄**_{g}^{a} in the direction of the analysis perturbation *δ***x**_{g}^{a(k)}. Also, when **x**_{g}^{e(k)} refers to an ensemble member at analysis time, 𝗙 and 𝗟 are the identity and 𝗡 is zero. Because

$$\bar{\mathbf{x}}_g^{e} = \frac{1}{K}\sum_{k=1}^{K} \mathbf{x}_g^{e(k)} \qquad (16)$$

and, by the linearity of 𝗟,

$$\frac{1}{K}\sum_{k=1}^{K} \mathsf{L}\,\delta\mathbf{x}_g^{a(k)} = \mathsf{L}\left(\frac{1}{K}\sum_{k=1}^{K} \delta\mathbf{x}_g^{a(k)}\right), \qquad (17)$$

taking the ensemble mean of the two sides of Eq. (15), we obtain

$$\bar{\mathbf{x}}_g^{e} = \mathbf{F}\left(\bar{\mathbf{x}}_g^{a}\right) + \mathsf{L}\left(\frac{1}{K}\sum_{k=1}^{K} \delta\mathbf{x}_g^{a(k)}\right) + \frac{1}{K}\sum_{k=1}^{K} \mathsf{N}\left(\delta\mathbf{x}_g^{a(k)}, \bar{\mathbf{x}}_g^{a}\right). \qquad (18)$$

Applying the global equivalent of Eq. (5), the second term on the rhs of Eq. (18) is zero. Thus, introducing the notation

$$\mathbf{e} = \frac{1}{K}\sum_{k=1}^{K} \mathsf{N}\left(\delta\mathbf{x}_g^{a(k)}, \bar{\mathbf{x}}_g^{a}\right) \qquad (19)$$

for the ensemble mean of the nonlinearly evolving component of the forecast ensemble members, Eq. (18) yields

$$\bar{\mathbf{x}}_g^{e} = \mathbf{F}\left(\bar{\mathbf{x}}_g^{a}\right) + \mathbf{e}. \qquad (20)$$

We define the local relative nonlinearity measure *ρ*_{ℓ} as the ratio between the magnitude of **e** and the ensemble average of the magnitude of the ensemble perturbations for the local state vectors: that is,

$$\rho_{\ell} = \frac{\left\|\mathbf{e}\right\|}{\frac{1}{K}\sum_{k=1}^{K} \left\|\mathbf{x}'^{(k)}\right\|}. \qquad (21)$$

Note that Eq. (21) is based on local state vectors. One important difference between our measure and that of Gilmour et al. (2001) is that, instead of a pair of perturbations, we consider a *K*-member ensemble. We make this choice because the *K*-member analysis ensemble generated by the LETKF algorithm is not composed of pairs, but it has the property that the sum of the ensemble perturbations is zero at analysis time. For comparison, we also apply our measure to global state vectors. The study of Gilmour et al. (2001) used global state vectors, although their measure, in principle, could be applied locally as well.
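A sketch of the relative nonlinearity of Eq. (21); the toy inputs stand in for the nonlinear residual **e** and the local perturbations:

```python
import numpy as np

# rho = ||e|| / (mean over k of ||x'^(k)||), Eq. (21).
def relative_nonlinearity(e, Xp):
    mean_pert = float(np.mean(np.linalg.norm(Xp, axis=0)))  # avg perturbation magnitude
    return float(np.linalg.norm(e)) / mean_pert

rng = np.random.default_rng(5)
Xp = rng.standard_normal((8, 4))
rho_linear = relative_nonlinearity(np.zeros(8), Xp)   # purely linear evolution
rho = relative_nonlinearity(0.1 * Xp[:, 0], Xp)       # weak nonlinearity
```

When the evolution is perfectly linear, **e** vanishes and *ρ*_{ℓ} = 0; larger values indicate a stronger nonlinear contribution relative to the perturbation amplitude.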

## 3. Experiment design

We carry out numerical experiments both under the perfect model scenario and in a realistic NWP setting. In the perfect model experiments, we generate simulated observations of the hypothetical “true” trajectory of the atmospheric state, where the time series of true states **z** is generated by a 60-day model integration of the GFS model at T62L28 resolution starting from an operational NCEP analysis truncated to T62L28 resolution. We first repeat the experiment of KEA07 to verify that the findings of that paper remain valid for the different time period investigated here (January and February 2004 instead of January and February 2000).^{1} Then, we build to the realistic NWP setting in two steps: first, we replace the randomly located simulated observations by simulated observations taken by a realistic observing network, then we replace the simulated observations with observations of the real atmosphere.

### a. Observational datasets

#### 1) Randomly placed simulated observations

The truth **z** is taken to be an integration of the GFS model starting from the operational NCEP analysis at 0000 UTC 1 January 2004. At each grid point and model level, we generate simulated observations of the two horizontal components of the wind, the temperature, and the surface pressure by perturbing the true states with normally distributed observational errors that have zero mean and standard deviations of 1 K, 1.1 m s^{−1}, and 0.6 hPa for temperature, wind, and surface pressure, respectively. Next, similar to Szunyogh et al. (2005) and KEA07, we randomly choose 2000 soundings, which corresponds to an observational coverage of about 10% of the model grid. By choosing the observation locations randomly, we ensure that the simulated observing network has little systematic impact on the geographical distribution of the analysis and forecast errors.
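The observation-simulation step can be sketched as follows (toy grid sizes; the standard deviations are the ones stated in the text):

```python
import numpy as np

# Perturb "true" values with zero-mean Gaussian errors and pick ~10% of
# the grid columns at random as sounding locations.
rng = np.random.default_rng(6)
sd = {"temperature": 1.0, "wind": 1.1, "surface_pressure": 0.6}  # K, m/s, hPa

truth_T = 250.0 + rng.standard_normal(2000)                 # toy true temperatures
obs_T = truth_T + sd["temperature"] * rng.standard_normal(2000)

n_columns = 20000                                           # toy number of grid columns
soundings = rng.choice(n_columns, size=2000, replace=False) # ~10% coverage
```

Sampling without replacement guarantees that no grid column is observed twice, so the coverage fraction is exact.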

#### 2) Simulated observations at realistic locations

In the second set of experiments, we assimilate simulated observations at the locations of routine nonradiance observations of the real atmosphere. These simulated observations are generated by adding random observational noise to the true gridpoint values of the surface pressure, the temperature, and the two horizontal components of the wind vector; the noise is drawn using the standard deviation of the estimated observational error provided with each observation by NCEP. The locations and types of the observations are obtained from a database that includes all nonradiance observations (including satellite-derived winds) operationally assimilated at NCEP between 0000 UTC 1 January and 0000 UTC 15 February 2004. We also exclude all surface observations, except for the surface pressure and the scatterometer wind measurements over the oceans.

#### 3) Observations of the real atmosphere

Finally, we assimilate the observations of the real atmosphere that were used to define the types and locations of the simulated observations in the previous experiment.

### b. Selection of the LETKF parameters

For each observational dataset, an analysis is obtained at the native model resolution every 6 h. Diagnostics are computed at a reduced 2.5° × 2.5° grid resolution. We assimilate observations between 0000 UTC 1 January and 0000 UTC 15 February 2004. In these experiments, multiplicative covariance inflation is used at each analysis step to increase the estimated analysis uncertainty to compensate for the loss of ensemble variance resulting from sampling errors, the effects of nonlinearities, and model errors. In the current version of the LETKF code, the variance inflation coefficient is constant in time and in the zonal direction (along latitudes). A near-optimal value of the coefficient is found by numerical experimentation, trying to ensure that trace(P̂_{ℓ}) ≈ ‖**ξ**^{(‖)}‖² at analysis time.^{2}

The parameters of the LETKF used in this experiment are as follows:

- The ensemble has *K* = 40 members.
- Observations are considered for assimilation within an 800-km horizontal radius of the grid point where the state is estimated.
- Observations have equal weight within a 500-km radius of the given grid point, beyond which the weight of the observations tapers linearly to zero at 800 km.
- Observations are considered within a vertical patch centered at the grid point. This patch has a depth of 0.35 scale height between model levels 1 and 15 and gradually increases to a depth of 2 scale heights at the top of the model atmosphere.
- Surface pressure is assimilated at the first model level; temperature and the zonal and meridional winds are assimilated at all 28 model levels.
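The horizontal weighting in the list above corresponds to a piecewise-linear taper; a sketch:

```python
import numpy as np

# Observation weight as a function of distance from the analyzed grid point:
# 1 within 500 km, linear taper to 0 at 800 km, 0 beyond.
def obs_weight(r_km, full=500.0, cutoff=800.0):
    r = np.asarray(r_km, dtype=float)
    return np.clip((cutoff - r) / (cutoff - full), 0.0, 1.0)

w = obs_weight([0.0, 500.0, 650.0, 800.0, 900.0])
```

At 650 km, halfway through the taper zone, the weight is 0.5; beyond the 800-km cutoff the observation is ignored entirely.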

For simulated randomly distributed observations, we use a 25% covariance inflation at all vertical levels in all geographic regions. For the simulated observations taken at realistic locations, the covariance inflation is 2.5% at all vertical levels in the Southern Hemisphere (SH) extratropics and 10% in the Northern Hemisphere (NH) extratropics. In the tropics, the covariance inflation varies from 2.5% to 7.5%. For the conventional observations of the real atmosphere, the covariance inflation tapers from 50% at the surface to 40% at the top of the model atmosphere in the SH extratropics and from 100% to 60% in the NH extratropics, and it changes smoothly in the tropics (between 25°S and 25°N) from the values of the SH extratropics to the values of the NH extratropics.
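Multiplicative inflation can be sketched as a rescaling of the perturbations; here we assume the quoted percentage applies to the ensemble variance, so the perturbations are scaled by its square root:

```python
import numpy as np

# Scale perturbations so that trace(P_hat) grows by the inflation factor.
def inflate(Xp, inflation):
    return np.sqrt(1.0 + inflation) * Xp    # e.g. inflation = 0.25 for 25%

rng = np.random.default_rng(7)
Xp = rng.standard_normal((6, 4))
Xp -= Xp.mean(axis=1, keepdims=True)
var_before = np.trace(Xp @ Xp.T) / (Xp.shape[1] - 1)
Xi = inflate(Xp, 0.25)
var_after = np.trace(Xi @ Xi.T) / (Xi.shape[1] - 1)
```

Scaling every perturbation by the same factor leaves the ensemble mean and the subspace 𝒮_{ℓ} unchanged; only the estimated variance grows.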

### c. Initialization

In the two sets of experiments that assimilate observations in realistic locations, high-frequency oscillations (typically associated with gravity waves) are filtered from all background ensemble members with a digital filter scheme (Huang and Lynch 1993), which is part of the NCEP GFS model and can be turned on or off by choice. (Unlike in the original formulation of the digital filter algorithm, where a filtered analysis is produced, the NCEP filter provides only a filtered background field.) We use the filter with a 3-h cutoff frequency. We find that turning the digital filter on in these two sets of experiments leads to a major improvement of the analyses.

In the experiments with randomly placed observations, turning the digital filter on degrades the analysis in the tropics (Fig. 1). More precisely, the surface pressure errors with the digital filter turned on (Fig. 1, top) have a clear wavenumber-2 pattern in the tropics. A more careful examination of the structure of the error fields reveals that the digital filter wipes out the semidiurnal tidal wave.^{3} We illustrate this effect of the filter by showing the spectrum of Fourier amplitudes of the time series of surface pressure at the location 0°N, 160°W for the nature run and the analyses prepared with and without the use of the digital filter (Fig. 2). The 12-h frequency oscillation characteristic of the semidiurnal tidal wave is not present in the run that uses the digital filter, even though this oscillation is the dominant signal in the nature run and in the analysis cycles that do not use the digital filter. We note that the digital filter initialization also has a negative effect on the analysis of the semidiurnal tidal wave in the two experiments that assimilate observations at realistic locations. In those experiments, however, the problem is not exposed, because the beneficial effect of the filter in removing spurious gravity waves, which are not present in the experiment based on uniformly distributed observations, outweighs the degradation caused by wiping out the semidiurnal tidal wave.

The semidiurnal tidal wave is primarily caused by the absorption of solar radiation by ozone in the stratosphere and the mesosphere. The response to this stratospheric excitation propagates downward in the form of an inertia–gravity wave (Chapman and Lindzen 1970). Our conjecture is that the digital filter affects this inertia–gravity wave. We also suspect that applying a digital filter initialization to the analysis increment instead of filtering the 6-h background forecast, which is the general practice for variational data assimilation schemes, would eliminate the negative effect of the filter on the semidiurnal tidal wave, as suggested by Sankey et al. (2007).^{4}

### d. Forecasts

We prepare the deterministic forecasts twice daily, starting them from the mean analysis at 0000 and 1200 UTC, with output every 12 h. These model integrations provide the state estimate **x**^{a,f}. At analysis time and at short forecast lead times (while the time evolution of the ensemble perturbations stays linear), this state estimate provides our best deterministic estimate of the state. At longer lead times, **x**^{a,f} simply represents a forecast for which the analysis was drawn from a probability distribution that is consistent with our estimate of the analysis uncertainty.

In addition to the state estimate, the LETKF also generates an ensemble of analyses to estimate the uncertainty in the state estimate. These analyses serve as initial conditions for the ensemble of forecasts. Ensemble forecasts are obtained once daily, started from the ensemble of analyses at 0000 UTC and output every 12 h. Both the deterministic forecast and the ensemble forecasts are carried out to a five-day lead time. Unlike the experiments that use realistically placed observations, forecasts for the experiment that assimilates observations in random locations are run without the use of the digital filter. We note that turning the digital filter off in this experiment slightly increases the forecast error up to 12-h lead times, after which the filter has no effect on the forecast errors.

Forecast error statistics are computed by comparing the deterministic forecasts **x**^{a,f} to the true states **z**. Forecasts started from analyses generated by assimilating conventional observations of the real atmosphere are verified using the high (T254L64) resolution operational NCEP analyses truncated to 2.5° × 2.5° resolution as proxy for the true state. (The local state vectors are defined on the 2.5° × 2.5° grid and not on the nominally higher-resolution native computational grid of the model.) These operational analyses were obtained by NCEP assimilating a large number of satellite radiance observations in addition to the conventional observations used in our experiments. Forecast error statistics are generated for the 36-day period of 0000 UTC 11 January–0000 UTC 15 February 2004.

### e. Selection of the parameters of the diagnostics

We define the local state vector by all temperature, wind, and surface pressure gridpoint variables in a cube that is defined by 5 × 5 horizontal grid points and the entire column of the model atmosphere. For the computation of the diagnostics, the different components of the state vector are scaled so that they have a dimension of square root of energy (e.g., Oczkowski et al. 2005). Based on our previous experience (Oczkowski et al. 2005; KEA07), for a specific choice of the local volume, the spatiotemporal variability in the *E* dimension and the explained variance for the first 5 forecast days are associated with synoptic-scale atmospheric processes. The choice of the local state vector also determines the minimum number of ensemble members required to provide a sufficiently accurate estimate of the local error covariance matrix 𝗣_{ℓ}. The problem of achieving a proper balance between the size of the local region and the number of ensemble members was investigated in detail in the context of ensemble-based Kalman filtering by Szunyogh et al. (2005). Because we expect the dynamically active number of degrees of freedom to be the largest at analysis time, we chose the number of ensemble members to be *K* = 40, a choice that was found to be sufficient for an efficient data assimilation with the LETKF using similar size local regions by Szunyogh et al. (2005). Because the dimension of the local state vector is *N* = 1975, our choices for the local state vector and the ensemble size satisfy the condition *K* < *N*, which was assumed in section 2.
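Assembling a local state vector from gridded fields can be sketched as below (toy field shapes and our own function name; the energy-norm scaling is omitted, and the resulting toy dimension differs from the paper's *N* = 1975):

```python
import numpy as np

# Local state vector: all T, u, v gridpoint values in a 5 x 5 horizontal
# patch over every model level, plus the 5 x 5 surface pressure values.
rng = np.random.default_rng(9)
nlev, nlat, nlon = 28, 20, 30
T = rng.standard_normal((nlev, nlat, nlon))   # temperature
u = rng.standard_normal((nlev, nlat, nlon))   # zonal wind
v = rng.standard_normal((nlev, nlat, nlon))   # meridional wind
ps = rng.standard_normal((nlat, nlon))        # surface pressure

def local_state(j, i, half=2):
    js, isl = slice(j - half, j + half + 1), slice(i - half, i + half + 1)
    parts = [T[:, js, isl], u[:, js, isl], v[:, js, isl], ps[js, isl]]
    return np.concatenate([p.ravel() for p in parts])

x_loc = local_state(10, 15)                   # local vector at grid point (10, 15)
```

Each grid point thus gets its own flattened vector, to which the covariance and eigenvalue diagnostics of section 2 are applied independently.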

To illustrate the only modest sensitivity of our diagnostics to the selection of the local region size, which is a desirable property considering that the choice of the local region is somewhat arbitrary, we show snapshots of the *E* dimension and explained variance for two different size local regions at the 120-h forecast time (Figs. 3, 4). The results indicate, in agreement with Oczkowski et al. (2005), that, although the values of the *E* dimension somewhat increase and the values of the explained variance somewhat decrease with doubling the size of the local region in both horizontal directions, the general locations of the low-dimension and high-explained-variance regions are mostly unaffected. The figures also show that the smaller region size, which is used in the rest of the paper, provides a sharper resolution of the diagnostics.

## 4. Numerical experiments

### a. Forecast errors

First, to illustrate the general spatial distribution of the errors in the state estimate **x**_{g}^{e}, we examine the absolute error in the analyses and forecasts of the meridional wind component at 500 hPa. We choose the meridional wind instead of the more commonly used geopotential height, because this way we can use the same quantity to characterize the errors in the tropics and the extratropics. Plots of the absolute error are obtained by computing the time average of ‖**ξ**‖ at each location (grid point). Figure 5 shows the time-mean absolute error at analysis time and at the 72-h forecast lead time for all three experiments. The results obtained by assimilating simulated observations in randomly placed locations show that the largest analysis errors are in the tropics and the smallest analysis errors are in midlatitude storm-track regions, in agreement with Szunyogh et al. (2005). Forecast errors become dominant in the storm-track regions within 48–72 h. In comparison, when simulated observations are placed in realistic locations, the results show that the distribution of the magnitude of the analysis errors is strongly modulated by the observation density: the lowest errors are over continents in the Northern Hemisphere and the highest errors are over Antarctica and in the oceanic region between Cape Horn and the Antarctic Peninsula.

We see strong similarities in the spatial distribution of the errors at analysis time and for short-term forecasts in both experiments that assimilate observations in realistic locations. This similarity indicates that observation density plays a more dominant role than model error in determining the large-scale spatial variation of the analysis and the short-term forecast errors. Nevertheless, the results obtained by assimilating observations of the real atmosphere show that the magnitude of the forecast error is almost double the forecast error found in the experiments that used simulated observations. In all three experiments, we find rapid growth of forecast errors in the midlatitude storm-track regions, which become the dominant regions of forecast error by the 72-h forecast time. The dominance of the synoptic-scale error structures at and beyond the 72-h forecast time is also well illustrated by Fig. 6, which shows the evolution of the spectral distribution of the forecast errors: initially, the magnitude of the errors is similar in the 1–5, 6–10, and 21–42 wavenumber ranges and slightly higher in the 11–20 wavenumber range, but at longer lead times the errors in the 6–10 wavenumber range become dominant (the dominance of the 6–10 wavenumber range occurs more rapidly for the experiments that assimilate realistically placed observations). Errors in the wavenumber range 21–42 saturate much faster than the errors at the larger scales. The increasing wavelength of the dominant error structures contributes to the general tendency of a decrease of the *E* dimension. Also, the increasing dominance of the synoptic-scale error structures leads to the development of local minima of the *E* dimension in the regions of the main synoptic-scale features.

### b. E dimension and explained variance

Szunyogh et al. (2005) showed that, for lower values of *E* dimension, the ensemble more certainly captured the structure of the background error. KEA07 extended the *E*-dimension diagnostic to study predictability of the performance of ensemble forecasts and found that, in the extratropics, atmospheric instabilities^{5} that led to fast local error growth also led to low *E* dimension and therefore to increased certainty that a greater portion of the forecast error was efficiently captured by the ensemble.
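For concreteness, both diagnostics can be computed from a matrix **X** of local ensemble perturbations (ensemble mean removed). The sketch below assumes the standard definitions used in this line of work — the *E* dimension from the eigenvalue spectrum of the ensemble-based covariance, and the explained variance as the fraction of the error variance captured by the orthogonal projection of the error onto the span of the perturbations; the function names are ours:

```python
import numpy as np

def e_dimension(X):
    """E dimension of the local ensemble-based covariance:
    E = (sum_i sqrt(lam_i))**2 / sum_i lam_i, where lam_i are the
    eigenvalues of the sample covariance (squared singular values of X)."""
    # X: (n_state, k) matrix whose columns are local ensemble perturbations.
    s = np.linalg.svd(X, compute_uv=False)      # singular values, sqrt(lam_i)
    return s.sum() ** 2 / (s ** 2).sum()

def explained_variance(X, err):
    """Fraction of the error variance captured by the span of the
    ensemble perturbations (orthogonal projection of err onto range(X))."""
    Q, _ = np.linalg.qr(X)                      # orthonormal basis of the span
    proj = Q @ (Q.T @ err)                      # projection of the error
    return (proj @ proj) / (err @ err)
```

With *k* ensemble members, the *E* dimension ranges from 1 (variance confined to a single direction) to *k* (variance spread evenly over all ensemble directions), consistent with the interpretation above that low *E* dimension signals a few dominant directions of uncertainty.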

We investigate the relationship between the *E* dimension, explained variance, and forecast errors with the help of joint probability distribution functions (JPDFs). The JPDF shown in Fig. 7 is obtained by calculating the number of occurrences in each bin defined by Δ*E* × ΔEV, where Δ*E* denotes the bin increment for *E* dimension and ΔEV denotes the bin increment for the explained variance. The number of occurrences is then normalized by Δ*E* × ΔEV × *n*, where *n* is the total sample size, which is equal to the total number of grid points in a geographic region multiplied by the total number of verification times. This normalization ensures that the integral of the plotted values over all bins is equal to one. At analysis time, we find lower values of *E* dimension corresponding to higher values of explained variance for the experiments that use realistically placed observations (middle and bottom panels on the left-hand side of Fig. 7) than for the experiment that uses randomly placed simulated observations. As forecast lead time increases, lower values of *E* dimension have a greater probability of corresponding to high values of explained variance. In good agreement with KEA07, we find that, at the 120-h lead time (Fig. 7, right), the lower the *E* dimension, the greater the probability that the explained variance is high. We find this relationship independent of experiment and geographic region. We also note that the evolution of the *E* dimension and explained variance is similar to that in KEA07 for all three experiments. Our local results are in agreement with the findings of Wei and Toth (2003) for the global ensemble perturbations and errors, which showed higher PECA values with increasing lead time because of the collapse of the phase space of ensemble perturbations and forecast errors into a smaller dimensional subspace.
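The normalization described above can be sketched as follows (an illustrative NumPy fragment; the function name and bin edges are ours):

```python
import numpy as np

def jpdf(e_dim, ev, e_edges, ev_edges):
    """Joint PDF of E dimension and explained variance, normalized so
    that the sum over bins of (density * bin area) equals one."""
    # Raw counts of occurrences in each (Delta_E x Delta_EV) bin.
    counts, _, _ = np.histogram2d(e_dim, ev, bins=[e_edges, ev_edges])
    dE = np.diff(e_edges)[:, None]              # bin increments for E
    dEV = np.diff(ev_edges)[None, :]            # bin increments for EV
    n = e_dim.size                              # total sample size
    return counts / (dE * dEV * n)              # density per unit bin area
```

Here `e_dim` and `ev` would hold one value per grid point per verification time for the chosen geographic region, matching the sample size *n* defined in the text.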

A unique feature of the results for the experiments that use real observations (Fig. 7, two bottom panels) is that the values of explained variance never reach their theoretical upper limit of one. This behavior is most likely due to the effects of the model errors, because in the two experiments that use simulated observations the largest values of the explained variance are near one. We cannot determine, however, based on the results of our experiments, whether this reduction in the maximum of the explained variance occurs because some of the forecast errors are orthogonal to the model attractor (thus, an ensemble of model forecasts cannot capture them) or because our approach to generate the ensemble perturbations does not enable the members of the forecast ensemble to explore that part of the model attractor that includes the true system state.

The results shown in Fig. 7 suggest that the *E* dimension may be a good linear predictor of the lower bound of the explained variance at a given time and location. To further explore this idea, we break up the 36-day dataset into two sets of 18 days, and we try to find a quantitative relationship between the *E* dimension and the lower bound of the explained variance based on the first 18 days (training period), which we can then use to predict the lower bound of the explained variance based on the *E* dimension for the second 18-day period. To obtain a prognostic equation, we first order the values of *E* dimension for the training period and divide them into 100 bins, each containing an equal number of data points. Each bin provides a pair of data: a value of *E* defined by its mean for the bin and the minimum min_{*E*}EV of the explained variance in the bin. Based on this paired dataset, the correlations between *E* and min_{*E*}EV are 0.9224, 0.6248, and 0.7353 at the 120-h lead time for the experiments that assimilate simulated observations in random locations, simulated observations in realistic locations, and observations of the real atmosphere, respectively; these correlations are statistically significant at the 99.99% level by a *t* test. This supports our hypothesis that *E* may be a good linear predictor of min_{*E*}EV. Thus, we compute the linear regression coefficients between the paired data over all of the bins; that is, we obtain *a* and *b* such that

min_{*E*}EV = *aE* + *b*. (22)

The results are shown in Fig. 8. The correlation values (0.906, 0.8206, and 0.669) between the predicted and the actual minimum of the explained variance indicate a linear relationship, which is statistically significant at the 99.99% confidence level by a *t* test. Nevertheless, when we use the related linear regression coefficients to predict min_{*E*}EV based on *E*, the predicted values are clearly overestimated in some cases. This problem is most noticeable for the experiment that assimilates observations of the real atmosphere at analysis time. A more careful inspection of Fig. 7 suggests that this overestimation of the minimum explained variance by the linear model may be due to a few statistical outliers of min_{*E*}EV in the training datasets. To dampen the effects of the outliers, we repeat our calculations using the 5th percentile value EV_{(5)} of the explained variance instead of min_{*E*}EV. That is, we obtain *a* and *b* such that

*P*(EV < *aE* + *b* | *E*) = 0.05, (23)

where *P*(·|*E*) is the conditional probability, given *E*. The results are summarized in Fig. 9, which shows that the linear expression based on the *E* dimension provides a better prediction of the 5th percentile than of the minimum of EV. The prediction of the 5th percentile of explained variance is especially accurate at longer forecast lead times; for example, at the 120-h lead time, the correlation between the predicted and the true value of the explained variance is greater than 0.94 and the root-mean-square error of the prediction is less than 0.02 in all three experiments.
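The binning-and-regression procedure described above can be sketched as follows. This is a minimal NumPy illustration; the function name, parameters, and synthetic usage are ours rather than the paper's:

```python
import numpy as np

def fit_percentile_predictor(e_dim, ev, n_bins=100, pct=5.0):
    """Sort training samples by E dimension, split them into n_bins
    equally populated bins, pair each bin's mean E with the pct-th
    percentile of explained variance in the bin, and fit EV_pct = a*E + b.
    (Setting pct close to 0 recovers the minimum-based fit of Eq. 22.)"""
    order = np.argsort(e_dim)
    e_sorted, ev_sorted = e_dim[order], ev[order]
    x = np.array([b.mean() for b in np.array_split(e_sorted, n_bins)])
    y = np.array([np.percentile(b, pct) for b in np.array_split(ev_sorted, n_bins)])
    a, b = np.polyfit(x, y, 1)          # linear regression over the bin pairs
    return a, b
```

Coefficients fitted on the training period would then give *a*·*E* + *b* as the predicted lower bound of the explained variance at each grid point of the independent verification period.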

To further illustrate the predictive value of the linear relationship we found, we plot a snapshot of the actual values of explained variance and the predicted 5th percentile at the 120-h forecast time for the experiment that assimilates conventional observations of the real atmosphere (Fig. 10). The top panel of Fig. 10 shows an area of local low dimensionality that develops in the northeast Pacific in an atmospheric region dominated by a strong ridge. The location of the low-dimensional area suggests that there is an uncertainty in the forecast of the ridge and that this uncertainty is dominated by a few state space directions. The second panel of Fig. 10 shows that, as expected based on Fig. 7, the ensemble is efficient in capturing the space of uncertainties in the low-dimensional region (EV ≈ 0.93). When we use the linear expression of Eq. (23), we obtain a value of 0.85 for the 5th percentile (which is slightly lower than the actual value of 0.93, as expected), which correctly predicts a good performance of the ensemble in capturing the uncertain forecast patterns in the ridge region. We emphasize that a good prediction of the uncertain forecast patterns is not a guarantee of a good prediction of the magnitude or the probability distribution of the forecast uncertainty.

### c. Explained variance and forecast error

Figure 11 shows the JPDF for the explained variance and the state estimation error in the NH extratropics. Although there is no obvious relationship between the magnitude of the analysis errors and the explained variance (Fig. 11, left), similar to the results of KEA07, the ensemble captures the patterns associated with larger forecast errors more efficiently. In addition, both the minimum and the maximum of explained variance increase with forecast time in all three experiments, which indicates that the linear space spanned by the ensemble perturbations provides an increasingly better representation of the space of forecast uncertainty with increasing forecast time (results are shown only for the 120-h forecast time).

Figure 12 shows the mean *E* dimension for the bins in the JPDF for the analysis and forecast error and the explained variance. Interestingly, the distribution of *E* dimension with explained variance at analysis time is more similar for the two experiments that assimilate realistically distributed observations. For these two experiments, we find locations where the explained variance is high and the *E* dimension is low but the analysis error is relatively large. These are locations where the ensemble efficiently captures the space of uncertainties, but there are no observations available to take advantage of this information. Such locations do not exist for the experiment that assimilates randomly placed observations, as in that experiment the observational coverage is sufficiently dense at all locations to effectively remove the background errors at locations of high explained variance (low *E* dimension).

At the 120-h forecast time, the findings of KEA07 extend to the more realistic settings: the instabilities that lead to large forecast errors also lead to low *E* dimension and therefore to higher explained variance. However, because low *E* dimension can occur even if the forecast error is small, the *E* dimension cannot be used to predict the magnitude of the forecast error. That is, we may not be able to predict that the forecast error will be large at a given time and location, but we know that, if the error is large, the ensemble will provide a reliable representation of the potential forecast error patterns.

### d. Local linearity

As mentioned earlier in the analysis of Fig. 11, within the five-day forecast range investigated in this paper, the minimum and maximum of the explained variance increase with increasing forecast time. In other words, the linear space provides an increasingly better representation of the space of the forecast errors at longer forecast lead times. This is a counterintuitive result, because the ensemble perturbations are expected to evolve linearly only for the shortest forecast lead times, when their magnitude is still small. In that case, they can be considered a representation of the tangent space to the nonlinear system trajectory. The evolution of the ensemble perturbations is expected to become more nonlinear as their magnitude grows with increasing forecast time. In the remainder of this section, we investigate the roots of this seemingly paradoxical result.

A potential resolution of the paradox is that, although the evolution of the ensemble perturbations becomes less linear with increasing forecast time with respect to a global measure of nonlinearity, there may be local regions where linearity becomes stronger with increasing forecast time. To explore this possibility, we first investigate the local variability of the strength of nonlinearity. We do this by comparing the mean and the standard deviation of the local values of the relative nonlinearity for the different forecast lead times (Fig. 13). Although the globally averaged values of relative nonlinearity computed with and without localization are similar, the standard deviations of the values computed using localization show a strong local variability in the degree of linearity. This variability is especially striking for the experiment that assimilates randomly placed observations: values of the relative nonlinearity as low as its mean at the 24-h lead time are within one standard deviation at 120 h. Conversely, values as high as the mean at 84 h are within one standard deviation at the 24-h forecast lead time. Although the local variability of the relative nonlinearity is not negligible, the results do not support a scenario in which nonlinearity could locally decrease at most locations. To confirm that high explained variance is not due to strong local linearity, we also prepared scatterplots for the explained variance and the relative nonlinearity (Fig. 14). It is obvious, based on the scatterplots, that weak nonlinearity is not a necessary condition for high explained variance. Our results are also in line with earlier studies, which found that a tangent linear model can qualitatively predict the fastest-growing error patterns for much longer times than could be expected based on the nonlinearity index (e.g., Reynolds and Rosmond 2003).

Our finding that model errors double the magnitude of the forecast errors but do not greatly reduce the efficiency of the ensemble-based linear space in representing the space of forecast errors leads us to speculate that a large part of the model errors may simply act as an additional stochastic forcing.

## 5. Conclusions

In this paper, we studied the spatiotemporally changing nature of predictability by coupling a reduced-resolution version of the model component of the NCEP GFS with the LETKF data assimilation scheme. Our focus was on exploring the value of the linear space spanned by the ensemble perturbations in predicting the space of uncertainties for the first five forecast days. We employed a hierarchy of increasingly realistic experiment designs to investigate the effects of the temporal and spatial distribution of the observations and the effects of model errors. Although we found that the distribution of the observations modulates the distribution of the analysis and forecast errors and that model errors lead to a doubling of the average and maximum magnitude of the errors, the following main findings apply to all experiment setups, including the one that assimilated observations of the real atmosphere:

We have found that the linear space spanned by the ensemble perturbations provides an increasingly better representation of the space of uncertainties with increasing forecast time.

We have shown that the improving performance of the ensemble-spanned linear space with increasing forecast time is not due to local linear error growth but rather to nonlinearly evolving forecast errors that have a growing projection on that linear space. We have argued that our result is consistent with the theory that views synoptic-scale eddies as stochastically forced disturbances on a baroclinically stable background flow.

We have shown that the *E* dimension is strongly anticorrelated with the error variance explained by the ensemble; thus, it is a reliable linear predictor of the performance of the ensemble-spanned linear space in capturing the most important forecast error patterns. What makes this finding especially valuable is that the *E* dimension is always low when the forecast error is large; that is, we can have high confidence in the skill of the ensemble in capturing the error patterns associated with a large error in the deterministic forecast, and we can use a simple linear regression to provide a quantitative prediction of this skill at a given time and location.

We conclude this paper by noting that a good prediction of the space of forecast uncertainties does not automatically guarantee a good prediction of the magnitude or the probability distribution of the uncertainties. An assessment of the skill of the ensemble in predicting the contributions of the different directions within the ensemble-spanned linear space to the forecast uncertainties is the subject of our next paper.

## Acknowledgments

The authors thank David Kuhl, Eric Kostelich, and Gyorgyi Gyarmati for their contributions to this work. The two anonymous reviewers made several helpful suggestions that improved the presentation of our results. The research reported in this paper was funded by the National Science Foundation (Grant ATM-0722721).


## Footnotes

*Corresponding author address:* Elizabeth Satterfield, Department of Atmospheric Sciences, Texas A&M University, College Station, TX 77843-3150. Email: easatterfield@ariel.met.tamu.edu

^{1}

Another important difference between the experiment designs of the two studies is that we use a later version of the LETKF. Most importantly, the LETKF used in this study provides more accurate analyses in the polar regions. We also note that the current implementation of the LETKF defines the local region in terms of distance rather than by model grid points. Because we generate randomly placed simulated observations to cover 10% of the model grid, as in KEA07, our experiment considers a greater number of observations for analyses at higher latitudes.

^{2}

We found, based on numerical experimentation, that further inflating the ensemble perturbations to account for the part of the error magnitude associated with the error component orthogonal to the ensemble-spanned linear space does not improve the analysis. In other words, overinflating the ensemble in the correctly resolved directions of uncertainties to correct the ensemble spread for the loss associated with not capturing all directions of uncertainties does not improve the analysis.

^{4}

We are currently in the process of developing such an initialization algorithm for the LETKF scheme.

^{5}

Here, we use the term “instability” in the mathematical sense; that is, it refers to the divergence of nearby model trajectories in state space. These instabilities are not always directly related to atmospheric instabilities characterized by the generation of transient kinetic energy (e.g., baroclinic instability, barotropic instability, and convection).