## 1. Introduction

Knowledge of how observations, dynamical instabilities, and model error determine analysis and forecast error covariance is fundamental to data assimilation, ensemble prediction, and observational network design. Data assimilation schemes use forecast error covariance information in order to reject faulty data and to optimally combine observations with model first-guess fields (e.g., Daley 1991; Cohn 1997). Data assimilation schemes need the best available information about forecast error covariances in order to maximize their ability to reduce analysis error variance (Daley 1991). Ensemble prediction schemes need accurate analysis error covariance information in order to sample effectively the distribution of analysis errors (e.g., Houtekamer et al. 1996; Toth and Kalnay 1997; Barkmeijer et al. 1998, 1999). Observational network designers, in particular, need to know how the interplay between observations, dynamical instabilities, and data assimilation schemes determines forecast and analysis error covariance (Lorenz and Emanuel 1998; Palmer et al. 1998; Berliner et al. 1999; Baker and Daley 2000; Bishop et al. 2001).

In operational settings, model error and poorly specified observation error covariances, together with the prohibitive computational expense of running an extended Kalman filter or smoother (e.g., Jazwinski 1970), severely limit our ability to accurately estimate analysis and forecast error covariance. To gain insight into how flow dynamics and observations determine error covariances, we sought exact error covariance information for simplified yet atmospherically relevant systems. Section 2 presents our first results: closed form expressions for the infinite time Kalman filter error covariances for systems with a perfect time-independent linear dynamics operator, a time-independent observation operator, and a time-independent observation error covariance. While the utility of these solutions is limited by the idealizations made to obtain them, a variety of pedagogical and practical applications might be possible.

For example, an educational application would be to use them to demonstrate explicitly how optimal data assimilation schemes use dynamical and observational information to minimize error variance. A practical application would be to use the exact covariances to test approximate error covariance modeling schemes, in much the same way that exact analytic solutions of partial differential equations are used to test approximate numerical integration schemes.

The World Weather Research Program (see Web site http://box.mmm.ucar.edu/uswrp/wwrp/cross.html), the U.S. Weather Research Program (USWRP; see Web site http://www.mmm.ucar.edu/uswrp/), Emanuel et al. (1995, 1997), and Dabbert et al. (1996) have all stressed the importance of tools for designing cost-effective observational networks. One such tool is the Observation System Simulation Experiment (OSSE; see Arnold and Dey 1986; Daley 1991). In an OSSE, a long integration (3 months, say) of a sophisticated numerical model is used to generate “pseudodata” associated with proposed observational networks and then, for each proposed network, a 3-month data assimilation/forecast cycle is performed to assess the network's value. The number of observational network configurations that can be tested by a conventional OSSE is strongly limited by available computational resources.

In conventional OSSEs, the error distributions propagate about time-evolving nonperiodic flow. Authors such as Marshall and Molteni (1993) have argued that atmospheric low-frequency variability may be understood in terms of the movement of the atmosphere from one unstable quasi-stationary state to another. This perspective suggests that an ideal observational network would control error growth about all of the quasi-stationary states within which the atmospheric trajectory is likely to fall.^{1} Steady-state OSSEs in which the tangent linear propagator was fixed in time could be used to assess the value of observational networks for each of the relevant quasi-steady states. The closed form solutions presented here allow steady-state OSSEs for a very large number of different possible configurations of the global observing network to be performed with very small computational expense once a relevant stationary state and the unstable eigenvectors of its tangent linear propagator have been found. If the nontrivial problems of finding the relevant stationary states and eigenvectors could be solved, the solutions presented here could be used to test the error-reducing effect of large numbers of different possible networks. Thus, the solutions could allow for the testing of a much larger number of observational networks than regular OSSEs at the expense of not properly accounting for time variations in the basic flow. A thorough exploration of this possibility is beyond the scope of this paper. Here, our objective is to illustrate with a simple model how the error covariance solutions enable fairly sophisticated search algorithms to be applied to the problem of optimizing the global observing network for two relevant quasi-steady states in a T21 quasigeostrophic (QG) global model with three vertical levels. Details are given in sections 3 and 4, while concluding remarks follow in section 5.

## 2. Steady-state error covariances of an optimal data assimilation cycle

### a. Preliminaries

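The analysis error covariance $\mathbf{P}^a_k$ of an optimal data assimilation scheme is related to the forecast error covariance $\mathbf{P}^f_k$ at time $t_k$ by

$$\mathbf{P}^a_k = \mathbf{P}^f_k - \mathbf{P}^f_k\mathbf{H}^T\left(\mathbf{H}\mathbf{P}^f_k\mathbf{H}^T + \mathbf{R}\right)^{-1}\mathbf{H}\mathbf{P}^f_k. \tag{1}$$

We assume that observations are taken at the times $t_1, t_2, \ldots, t_k, \ldots$ and that $t_k - t_{k-1} = \text{constant}$, for all $k$. With no model error,

$$\mathbf{P}^f_k = \mathbf{M}\mathbf{P}^a_{k-1}\mathbf{M}^T, \tag{2}$$

where $\mathbf{M}$ is the linear dynamics propagator, $\mathbf{R}$ is the observation error covariance matrix, and $\mathbf{H}$ is the $p \times n$ matrix observation operator mapping the $n$-dimensional model state $\mathbf{x}$ to the $p$-dimensional space of observations. (Our notation follows that of Ide et al. 1997, as closely as possible.) Since the state vector $\mathbf{x}$ lists all variables defining the state of the atmosphere, different elements of $\mathbf{x}$ may have different units, for example, temperature (kelvins) and pressure (pascals). Similarly, the elements of the $n \times n$ covariance matrices $\mathbf{P}^f_k$ and $\mathbf{P}^a_k$ may have different units.

We assume that $\mathbf{M}$ is diagonalizable^{2} and can be expressed as

$$\mathbf{M} = \mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^{-1}, \tag{3}$$

where the columns $\mathbf{v}_i$ of the matrix $\mathbf{V}$ are the eigenvectors of $\mathbf{M}$ and the diagonal matrix $\boldsymbol{\Lambda}$ lists the corresponding eigenvalues of $\mathbf{M}$. Throughout this paper, we assume that the columns of eigenvector matrices are ordered by eigenvalue magnitude with the first and last columns corresponding to the largest and smallest eigenvalue, respectively. The eigenvectors of $\mathbf{M}$ often come in complex conjugate pairs. Typically, in atmospheric applications, such conjugate pairs together with their corresponding complex eigenvalues describe propagating waves. The real eigenvectors of $\mathbf{M}$ correspond to stationary waves. The matrix $\mathbf{U}$, whose columns are the adjoint or left eigenvectors of $\mathbf{M}$, is defined by $\mathbf{U} = (\mathbf{V}^{-1})^H = \mathbf{V}^{-H}$, where the superscripts H and −H indicate the Hermitian (the conjugate transpose) and the inverse of the Hermitian, respectively.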

Since we are unaware of any atmospheric examples where the dynamics are nondiagonalizable (i.e., defective) and since the two stationary states we examined with our QG model were found to have a diagonalizable 𝗠, we have no reason to believe that the assumption of diagonalizability made in (3) significantly limits the relevance of our results. We suspect that our assumptions of no model error and a temporally invariant 𝗠 limit the geophysical relevance of our results more strongly than our assumption that 𝗠 be diagonalizable.

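Since $\mathbf{V}$ is invertible, its columns span $\mathbb{C}^n$ and the analysis error covariance can be expressed as

$$\mathbf{P}^a_k = \mathbf{V}\mathbf{G}^a_k\mathbf{V}^H, \tag{4}$$

where $\mathbf{G}^a_k = \mathbf{U}^H\mathbf{P}^a_k\mathbf{U}$ and $\mathbf{U} = \mathbf{V}^{-H}$. The requirement that $\mathbf{P}^a_k$ be real and symmetric constrains $\mathbf{G}^a_k$; in particular, $\mathbf{G}^a_k$ must be Hermitian.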
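The eigenvector and adjoint-eigenvector relations of this subsection can be checked numerically. The following is a minimal sketch, assuming NumPy and a small hypothetical propagator: the 3 × 3 matrix `M` and the covariance `P` are invented for illustration and do not come from the QG model, and the name `G` for the kernel matrix is an assumed label.

```python
import numpy as np

# Hypothetical 3x3 real propagator: a growing rotation (complex-conjugate
# eigenvalue pair, i.e., a propagating wave) coupled to one decaying real mode.
theta = 0.4
M = np.array([[1.2 * np.cos(theta), -1.2 * np.sin(theta), 0.3],
              [1.2 * np.sin(theta),  1.2 * np.cos(theta), 0.1],
              [0.0,                  0.0,                 0.7]])

lam, V = np.linalg.eig(M)
order = np.argsort(-np.abs(lam))      # largest eigenvalue magnitude first
lam, V = lam[order], V[:, order]

# Adjoint (left) eigenvectors: U = V^{-H}, so that U^H V = I.
U = np.linalg.inv(V).conj().T

# An arbitrary real symmetric positive definite covariance (illustrative only).
A = np.array([[1.0, 0.2, 0.0],
              [0.5, 1.0, 0.3],
              [0.1, 0.0, 1.0]])
P = A @ A.T

G = U.conj().T @ P @ U                # kernel matrix, G = U^H P U
P_rebuilt = (V @ G @ V.conj().T).real # P = V G V^H recovers the covariance
```

Because `P` is real and symmetric, `G` comes out Hermitian even though `V` and `U` are complex.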

### b. Decaying eigenvectors not required

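To see why the decaying eigenvectors are not needed to describe $\mathbf{P}^a_k$ and $\mathbf{P}^f_k$ in the limit of large $k$, first consider a cycle in which no observations are assimilated. In that case, $\mathbf{P}^a_k = \mathbf{M}^k\mathbf{P}^a_0(\mathbf{M}^T)^k$ and

$$\mathbf{u}^H_i\mathbf{P}^a_k\mathbf{u}_i = (\mathbf{G}^a_k)_{ii} = |\lambda_i|^{2k}(\mathbf{G}^a_0)_{ii}, \tag{5}$$

where $\mathbf{G}^a_k = \mathbf{U}^H\mathbf{P}^a_k\mathbf{U}$ is the kernel matrix of Eq. (4), and we have used $\mathbf{U}^H\mathbf{V} = \mathbf{I}$ and the notation $(\ )_{ij}$ to indicate the $ij$th element of the parenthesized matrix. The error variance on decaying modes with $|\lambda_i| < 1$ goes exponentially to zero in the limit of large $k$. Since the Kalman filter is the optimal data assimilation method minimizing error in all possible directions, the Kalman filter forecast and analysis error distributions must have zero variance on the space of decaying adjoint eigenmodes also.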

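Suppose that the analysis error remains bounded, so that there exists a finite number $b$ such that the leading eigenvalue $\lambda_1(\mathbf{P}^a_k)$ of the analysis error covariance satisfies $\lambda_1(\mathbf{P}^a_k) \le b$ for all $k$. Consequently, if $\mathbf{u}_i$ and $\mathbf{u}_j$ correspond to a nondecaying and decaying adjoint eigenvector, respectively, then

$$|\mathbf{u}^H_i\mathbf{P}^a_k\mathbf{u}_j|^2 \le (\mathbf{u}^H_i\mathbf{P}^a_k\mathbf{u}_i)(\mathbf{u}^H_j\mathbf{P}^a_k\mathbf{u}_j) \le b\,(\mathbf{u}^H_i\mathbf{u}_i)\,|\lambda_j|^{2k}\,(\mathbf{u}^H_j\mathbf{P}^a_0\mathbf{u}_j). \tag{6}$$

The first inequality is the Cauchy–Schwarz inequality for a positive semidefinite matrix. The second uses the bound $b$ together with the fact that the variance of an optimal filter on a decaying adjoint mode cannot exceed the value $|\lambda_j|^{2k}\,\mathbf{u}^H_j\mathbf{P}^a_0\mathbf{u}_j$ that it would have without observations. Since $|\lambda_j| < 1$, the right-hand side of (6) tends to zero as $k \to \infty$. This proves that the steady-state covariance between decaying and nondecaying eigenvectors is zero. This means that as $k \to \infty$, the kernel matrix $\mathbf{G}^a_k = \mathbf{U}^H\mathbf{P}^a_k\mathbf{U}$ takes the block form

$$\mathbf{G}^a_k \to \begin{pmatrix}\mathbf{G}^a_K & \mathbf{0}\\ \mathbf{0} & \mathbf{0}\end{pmatrix},$$

where $\mathbf{G}^a_K$ is a $K \times K$ Hermitian matrix and $K$ is the number of nondecaying eigenmodes of the dynamics. Consequently, the steady-state form of Eq. (4) is

$$\mathbf{P}^a = \mathbf{V}_K\mathbf{G}^a_K\mathbf{V}^H_K,$$

where $\mathbf{V}_K = [\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_K]$. The same arguments apply to the steady-state forecast error covariance matrix. It now remains to find expressions for the matrices $\mathbf{G}^a_K$ and $\mathbf{G}^f_K$ that determine the steady-state error covariances.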
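The decay of variance along adjoint eigenvectors when no observations are assimilated can be illustrated with a small numerical check. This is a sketch under stated assumptions: the eigenvector matrix `V`, the eigenvalues `lam`, and the initial covariance `P0` are hypothetical values chosen only so that one mode grows and two decay.

```python
import numpy as np

# Hypothetical real eigenvectors (columns of V) and eigenvalues: one growing
# mode and two decaying modes.
V = np.array([[1.00, 0.50, 0.25],
              [0.50, 1.00, 0.50],
              [0.25, 0.50, 1.00]])
lam = np.array([1.3, 0.9, 0.6])
M = V @ np.diag(lam) @ np.linalg.inv(V)

# Adjoint eigenvectors; V is real here, so U = V^{-H} reduces to V^{-T}.
U = np.linalg.inv(V).T

P0 = np.eye(3)                        # illustrative initial error covariance
k = 10
Mk = np.linalg.matrix_power(M, k)
Pk = Mk @ P0 @ Mk.T                   # covariance after k cycles, no observations

var_0 = np.diag(U.T @ P0 @ U)         # u_i^T P_0 u_i
var_k = np.diag(U.T @ Pk @ U)         # u_i^T P_k u_i
predicted = np.abs(lam) ** (2 * k) * var_0
```

Here `var_k` matches `predicted`: the variance on the two decaying adjoint modes shrinks geometrically while the variance on the growing mode increases.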

### c. Infinite time Kalman filter error covariances

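In the limit $k \to \infty$, the analysis and forecast error covariances become independent of the initial error covariances and the time step $k$. They satisfy

$$\mathbf{P}^a = \mathbf{P}^f - \mathbf{P}^f\mathbf{H}^T\left(\mathbf{H}\mathbf{P}^f\mathbf{H}^T + \mathbf{R}\right)^{-1}\mathbf{H}\mathbf{P}^f, \qquad \mathbf{P}^f = \mathbf{M}\mathbf{P}^a\mathbf{M}^T.$$

Eliminating $\mathbf{P}^a$, writing $\mathbf{P}^f = \mathbf{V}_K\mathbf{G}^f_K\mathbf{V}^H_K$, and pre- and postmultiplying by $\mathbf{U}^H_K$ and $\mathbf{U}_K$, respectively, gives

$$\mathbf{G}^f_K = \boldsymbol{\Lambda}_K\left[\mathbf{G}^f_K - \mathbf{G}^f_K\mathbf{E}^H_K\left(\mathbf{E}_K\mathbf{G}^f_K\mathbf{E}^H_K + \mathbf{R}\right)^{-1}\mathbf{E}_K\mathbf{G}^f_K\right]\boldsymbol{\Lambda}^H_K, \tag{11}$$

where $\mathbf{E}_K = \mathbf{H}\mathbf{V}_K$, the columns of $\mathbf{U}_K$ are the $K$ nondecaying adjoint eigenvectors, and $\boldsymbol{\Lambda}_K$ is the $K \times K$ diagonal matrix of nondecaying eigenvalues. Taking the inverse of (11) and rewriting it using the Sherman–Morrison–Woodbury formula (Golub and van Loan 1996) gives

$$\boldsymbol{\Lambda}^H_K\mathbf{G}^{f-1}_K\boldsymbol{\Lambda}_K = \mathbf{G}^{f-1}_K + \mathbf{E}^H_K\mathbf{R}^{-1}\mathbf{E}_K.$$

Since $\boldsymbol{\Lambda}_K$ is diagonal, it is straightforward to solve this equation for the elements of $\mathbf{G}^{f-1}_K$:

$$\left(\mathbf{G}^{f-1}_K\right)_{ij} = \frac{\left(\mathbf{E}^H_K\mathbf{R}^{-1}\mathbf{E}_K\right)_{ij}}{\lambda^*_i\lambda_j - 1}.$$

Pre- and postmultiplying the forecast step $\mathbf{P}^f = \mathbf{M}\mathbf{P}^a\mathbf{M}^T$ by $\mathbf{U}^H_K$ and $\mathbf{U}_K$ gives,

$$\mathbf{G}^f_K = \boldsymbol{\Lambda}_K\mathbf{G}^a_K\boldsymbol{\Lambda}^H_K, \qquad \mathbf{G}^a_K = \boldsymbol{\Lambda}^{-1}_K\mathbf{G}^f_K\boldsymbol{\Lambda}^{-H}_K.$$

When the nondecaying eigenvectors amplify little over one assimilation interval, the magnitudes of the diagonal elements of $\boldsymbol{\Lambda}_K$ will be near unity. In this case, $\mathbf{P}^f$ and $\mathbf{P}^a$ will be very similar to each other. Maximum eigenvalue magnitude is a decreasing function of the doubling time $\Delta t_d$ of the most rapidly growing eigenvector and an increasing function of the time interval $t_{k+1} - t_k$ between observations. Consequently, $\mathbf{P}^f$ and $\mathbf{P}^a$ will be similar to each other whenever the time interval between observations is a small fraction of the doubling time. When $t_{k+1} - t_k$ is greater than or equal to $\Delta t_d$, then $\mathbf{P}^f$ and $\mathbf{P}^a$ may differ substantially.

Combining the two preceding results gives

$$\left(\mathbf{G}^{a-1}_K\right)_{ij} = \frac{\lambda^*_i\lambda_j}{\lambda^*_i\lambda_j - 1}\left(\mathbf{E}^H_K\mathbf{R}^{-1}\mathbf{E}_K\right)_{ij},$$

so that $\left(\mathbf{G}^{a-1}_K\right)_{ij}$ is independent of eigenvalue for $|\lambda_i\lambda_j| \gg 1$. This illustrates how optimal data assimilation schemes reduce error most in the most uncertain directions of the forecast field.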
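The closed-form solution can be compared against a brute-force Kalman filter cycle. The sketch below assumes the element-wise form in which the inverse forecast kernel equals $(\mathbf{E}^H\mathbf{R}^{-1}\mathbf{E})_{ij}/(\lambda_i^*\lambda_j - 1)$ with $\mathbf{E} = \mathbf{H}\mathbf{V}_K$, and assumes no eigenvalue lies exactly on the unit circle. The 4 × 4 system (`V`, `lam`, `H`, `R`), the function names, and the kernel names `Gf` and `Ga` are all hypothetical.

```python
import numpy as np

def steady_state_covariances(M, H, R):
    """Closed-form steady-state (Pf, Pa) built from the growing eigenmodes of M."""
    lam, V = np.linalg.eig(M)
    order = np.argsort(-np.abs(lam))            # largest magnitude first
    lam, V = lam[order], V[:, order]
    K = int(np.sum(np.abs(lam) > 1.0))          # number of growing modes
    lamK, VK = lam[:K], V[:, :K]
    E = H @ VK                                  # observation operator on eigenvectors
    C = E.conj().T @ np.linalg.solve(R, E)      # E^H R^{-1} E
    Gf = np.linalg.inv(C / (np.outer(lamK.conj(), lamK) - 1.0))
    Ga = Gf / np.outer(lamK, lamK.conj())       # Lambda^{-1} Gf Lambda^{-H}
    Pf = (VK @ Gf @ VK.conj().T).real
    Pa = (VK @ Ga @ VK.conj().T).real
    return Pf, Pa

def iterate_kalman_filter(M, H, R, n_cycles=300):
    """Run the covariance equations of a perfect-model Kalman filter for many cycles."""
    Pa = np.eye(M.shape[0])                     # arbitrary initial analysis covariance
    for _ in range(n_cycles):
        Pf = M @ Pa @ M.T                       # forecast step, no model error
        S = H @ Pf @ H.T + R
        Pa = Pf - Pf @ H.T @ np.linalg.solve(S, H @ Pf)
        Pa = 0.5 * (Pa + Pa.T)                  # keep symmetric against roundoff
    return Pf, Pa

# Hypothetical system: two growing and two decaying modes, two observations.
V = np.array([[1.0, 0.5, 0.2, 0.0],
              [0.0, 1.0, 0.3, 0.1],
              [0.0, 0.0, 1.0, 0.4],
              [0.2, 0.0, 0.0, 1.0]])
lam = np.array([1.5, 1.2, 0.8, 0.5])
M = V @ np.diag(lam) @ np.linalg.inv(V)
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
R = np.eye(2)

Pf_closed, Pa_closed = steady_state_covariances(M, H, R)
Pf_cycled, Pa_cycled = iterate_kalman_filter(M, H, R)
```

In this toy case only the first observed variable projects onto the two growing eigenvectors, yet the steady-state covariances stay bounded and the two routes to `Pf` and `Pa` agree.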

## 3. Observational design in a T21L3 quasigeostrophic model

### a. Model description

The model used in this study is the QG model described in Marshall and Molteni (1993). It is a global spectral model with a T21 horizontal truncation and has three vertical levels corresponding to 800, 500, and 200 hPa. Diabatic processes are represented by forcing and dissipative terms. The model forcing is composed of specified source terms of potential vorticity (PV) that are spatially varying but temporally constant and produce time mean and variance fields that correspond to the observed northern winter climatology. The model has three types of dissipative forcing: the first term represents Newtonian relaxation of thickness with a relaxation coefficient of (25 days)^{−1}, the second term represents Ekman dissipation expressed as a linear drag term on the 800-hPa wind with a coefficient that varies with topography from (3 days)^{−1} to (1.5 days)^{−1}, and the third term is a horizontal scale-selective (∇^{8}) dissipation such that spherical harmonics of the time-varying potential vorticity with total wavenumber 21 are damped on a 2-day timescale. The model, with 1449 degrees of freedom, is complex enough to capture baroclinic synoptic-scale processes important in forecast error growth, but small enough that the eigenvectors and adjoint eigenvectors are easily computed.

### b. Identification of relevant time-invariant basic states

In the appendixes, we show that when eigenvectors exist whose eigenvalues are infinitesimally greater than unity, it can take an unbounded amount of time for a Kalman filter to produce the infinite time error covariance solutions given in section 2. As such, we wish to identify stationary states that lie near observed atmospheric states. Here, a descent algorithm is used to find a quasi-stationary state near the observed state whose rate of evolution is several orders of magnitude less than the rate of evolution of the observed state. We then recognize that many of the terms in operational numerical models provide error-prone representations of physical processes. Furthermore, the parameters that determine these error-prone representations are uncertain. Having made this recognition, we then search for a new representation of physical processes that is as plausible as the old representation and for which the state obtained from the descent algorithm is precisely stationary.

To introduce the *direct descent algorithm,* consider the problem of perturbing a flow of interest to obtain an initial state that remains quasi-stationary over the analysis cycle. If the model state vector of length $n$ is represented by $\mathbf{x}^{a}$ and $\mathbf{x}^{f}$ at analysis and forecast time, respectively, and small perturbations to the state vectors are represented by $\mathbf{x}'^{a}$ and $\mathbf{x}'^{f}$, then

$$M(\mathbf{x}^{a} + \mathbf{x}'^{a}) \approx M(\mathbf{x}^{a}) + \mathbf{M}\mathbf{x}'^{a} = \mathbf{x}^{f} + \mathbf{x}'^{f},$$

where $M$ represents the nonlinear model and $\mathbf{M}$ is the $n \times n$ forward tangent propagator matrix based on the nonlinear time-dependent trajectory. We seek a perturbation $\mathbf{x}'^{a}$ such that

$$\mathbf{x}^{f} + \mathbf{x}'^{f} = \mathbf{x}^{a} + \mathbf{x}'^{a}.$$

Substituting $\mathbf{M}\mathbf{x}'^{a}$ for $\mathbf{x}'^{f}$ yields

$$\mathbf{x}'^{a} = (\mathbf{M} - \mathbf{I})^{-1}(\mathbf{x}^{a} - \mathbf{x}^{f}),$$

where $\mathbf{I}$ is the $n \times n$ identity matrix.

For our T21L3 model, direct inversion of 𝗠 − 𝗜 was feasible; however, it led to a perturbation **x**′^{a} that was far too large to evolve linearly. To avoid this problem, a singular value decomposition of 𝗠 − 𝗜 was used to create a pseudoinverse that excludes the trailing 10% (i.e., fastest decaying) singular vectors of 𝗠 − 𝗜. This pseudoinverse operator produced a reasonably small initial perturbation that, when added to **x**^{a}, produced an initial state vector whose 12-h trajectory ended closer to its starting point than did the previous trajectory. The tangent linear propagator around this new trajectory was then computed and the procedure was repeated. Iterations were continued until |**x**^{a} − **x**^{f}| no longer decreased. In the cases we examined, seven iterations were enough to reduce the difference between the initial and evolved state vectors by two orders of magnitude.
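One step of the descent procedure with a truncated pseudoinverse can be sketched as follows. This is an illustration under stated assumptions, not the code used for the QG model: the function names are hypothetical, and a random 10-variable linear map stands in for the nonlinear model, so its tangent propagator is constant. For a nonlinear model the tangent propagator would be recomputed about each new trajectory before the next step.

```python
import numpy as np

def truncated_pinv(A, drop_fraction=0.10):
    """Pseudoinverse of A that discards the trailing fraction of singular vectors."""
    U, s, Vt = np.linalg.svd(A)                 # singular values in descending order
    n_keep = len(s) - int(drop_fraction * len(s))
    return (Vt[:n_keep].T / s[:n_keep]) @ U[:, :n_keep].T

def descent_step(model, tangent, xa, drop_fraction=0.10):
    """Perturb xa so that the forecast started from it ends nearer its start."""
    xf = model(xa)
    A = tangent(xa) - np.eye(xa.size)           # M - I
    return xa + truncated_pinv(A, drop_fraction) @ (xa - xf)

# Toy linear stand-in for the forecast model (hypothetical values).
n = 10
rng = np.random.default_rng(0)
B = 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)
c = rng.standard_normal(n)

def model(x):
    return B @ x + c

def tangent(x):
    return B                                    # constant because the model is linear

xa0 = np.zeros(n)
xa1 = descent_step(model, tangent, xa0)
res0 = np.linalg.norm(xa0 - model(xa0))         # |x^a - x^f| before the step
res1 = np.linalg.norm(xa1 - model(xa1))         # |x^a - x^f| after the step
```

With 10 variables and `drop_fraction=0.10`, one singular vector is discarded, so the step removes every component of the residual except the one along that discarded direction.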

Having used the descent method to obtain slowly varying states, we next computed the PV forcing *S*′ for which the initial state of the slowly varying state would be precisely stationary. By adding *S*′ to the original PV forcing *S,* the model supporting the precisely stationary state is obtained. To assess whether this new model is as plausible as the original model, we divided the root-mean-square of the change in *S* (*S*′) by the root-mean-square of *S.* This ratio was found to be less than 0.02 for all of the quasi-stationary states considered in this paper. We also divided the maximum absolute value of *S*′ by the maximum absolute value of *S.* This ratio was found to be less than 0.006 for all states considered. Since Marshall and Molteni's (1993) method of estimating *S* is based on a crude statistical sampling technique, we concluded that the models that supported precisely stationary states were as plausible as the original model. Furthermore, it was found that all of the results presented in this paper are insensitive to whether the stationary-state or quasi-stationary-state eigenvectors and eigenvalues are used.

Other possible methods of finding quasi-stationary states near observed states include variational approaches in which the functional *J* = |(𝗠 − 𝗜)**x**′^{a} − (**x**^{a} − **x**^{f})|^{2} is minimized, the Newton–Picard method discussed in Lust et al. (1998), and numerical methods for solving the discretized equations one obtains after setting the time derivatives to zero.

## 4. Results

For the results shown here, the analysis cycle interval is 12 h and the observational network consists solely of measurements of streamfunction. The observation errors are assumed to be uncorrelated in both the vertical and horizontal. The streamfunction observation error variance at all three model levels would be equivalent to 100 m^{2} geopotential height error variance under geostrophic balance at 45°. The results are presented in these equivalent height error variance units, although the model state vector is streamfunction. All observation locations coincide with the Gaussian grid points of the model.

### a. Time-invariant basic states and their eigenvectors

Using the methods of section 3b, we computed two different time-invariant basic states; one of which approximated the Northern Hemisphere winter time-mean model state^{3} while the other approximated the flow on a day when a blocking high was present in the North Pacific. Most of our analysis is focused on the wintertime mean model basic state. The blocking high basic state is solely used to illustrate the sensitivity of effective observing networks to changes in the basic state.

The Eady index $\sigma_E$, a measure of baroclinic instability, is defined as

$$\sigma_E = 0.31\, f \left|\frac{\partial V}{\partial z}\right| N^{-1},$$

where $f$ is the Coriolis parameter, $V$ is the magnitude of the vector wind, and $N$ is the Brunt–Väisälä frequency. Here, $\sigma_E$ is calculated for the 200–800-hPa layer based on the stationary time-mean fields. The areas of baroclinic instability in the vicinities of the west Pacific and Atlantic jets are clearly apparent.
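As a rough illustration of the magnitudes involved, the Eady index can be evaluated for a single column. The sketch assumes the standard form $0.31\,f\,|\partial V/\partial z|/N$; the function name and the input values (a 30 m s^{−1} wind difference over 10 km and N = 0.01 s^{−1} at 45° latitude) are hypothetical and are not taken from the model fields.

```python
import numpy as np

OMEGA = 7.292e-5   # rotation rate of the earth (s^-1)

def eady_index(lat_deg, dV_dz, N):
    """Eady index (s^-1) from latitude, vertical wind shear (s^-1), and N (s^-1)."""
    f = 2.0 * OMEGA * np.sin(np.deg2rad(lat_deg))   # Coriolis parameter
    # Absolute values are used so the index is positive in both hemispheres.
    return 0.31 * np.abs(f) * np.abs(dV_dz) / N

# Hypothetical midlatitude column: 30 m/s of shear over a 10-km-deep layer.
sigma = eady_index(45.0, 30.0 / 10_000.0, 0.01)
sigma_per_day = sigma * 86400.0                     # convert s^-1 to day^-1
```

For these illustrative inputs the index is a little under 1 day^{−1}.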

As described in section 2, the analysis and forecast error covariances are completely spanned by the nondecaying eigenvectors of the forward tangent propagator 𝗠. Figure 2 shows the 12-h amplification factors (i.e., the eigenvalue magnitudes) for the eigenvectors of 𝗠. Only 110 of the 1449 eigenvectors are nondecaying (i.e., have magnitudes greater than or equal to 1.0). In this case, no modes are exactly neutral.

Figure 3 shows the real and imaginary components of the streamfunction of the leading eigenvector of 𝗠, which has a 12-h amplification factor of 1.18 (equal to an exponential growth rate of 0.35 day^{−1}, or a doubling time of 2 days). The eigenvector is not localized, having significant amplitude in both the North Pacific and North Atlantic, with amplitude increasing with height. Some eigenvectors (not shown) exhibit significant amplitude in both Northern and Southern Hemispheres. Vertical cross sections (not shown) indicate a westward tilt with height, particularly between 800 and 500 hPa. The root-mean-square (rms) of the first 20 leading eigenvectors (not shown) exhibits amplitude maxima over the North Pacific and North Atlantic, just downstream from the regions of maximum baroclinicity.

Also of interest are the adjoint eigenvectors, that is, the eigenvectors of 𝗠^{T}. The adjoint eigenvector matrix is equivalent to the inverse transpose of the eigenvector matrix of 𝗠. These adjoint modes are the perturbations that optimally excite the eigenvectors of 𝗠 (Farrell 1988, 1989). The leading adjoint eigenvector streamfunction is shown in Fig. 4. This perturbation optimally excites the leading eigenvector shown in Fig. 3. This mode has maximum amplitude over Europe and Asia, with larger amplitudes between 500 and 800 hPa, and almost no energy at 200 hPa. In contrast to the leading eigenvectors, the rms of the first 20 adjoint eigenvectors (not shown) has the largest amplitude at 500 hPa, with maxima located upstream from the eigenvector maxima.

### b. Selection of time interval between observations

As discussed in section 2, a key variable in determining the infinite time forecast and analysis error covariances is the ratio of the time interval *t*_{k+1} − *t*_{k} between assimilation times to the doubling time Δ*t*_{d} for the most rapidly growing eigenvector. Simmons et al. (1995) estimate the wintertime 2000/01 doubling time of rms 500-mb height errors in forecasts for 1994 from the European Centre for Medium-Range Weather Forecasts (ECMWF) to be 1.6 days. The time interval between assimilation times for most operational weather forecasting centers is 6 h. [Note, however, that radiosonde observations are typically made only every 12 h while other observation types are irregularly distributed through time (see Daley 1991).] Thus, the ratio of error doubling time to the time interval between assimilation times for the atmospheric data assimilation problem is about 6.4.

With grid column observations at every grid point the corresponding doubling time for Northern Hemisphere 500-hPa rms errors in our idealized system is 3.2 days. (This doubling time is extrapolated from 12-h amplification assuming exponential growth.) We choose our time interval between observations and assimilation times so that Δ*t*_{d}/(*t*_{k+1} − *t*_{k}) is also equal to 6.4. With Δ*t*_{d} = 3.2 days, this gives *t*_{k+1} − *t*_{k} = 12 h.
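The arithmetic behind this choice of assimilation interval can be written out explicitly. The sketch assumes exponential growth, as the text does; the helper names are hypothetical.

```python
import numpy as np

def doubling_time(amplification, interval_days):
    """Doubling time (days) implied by an amplification factor over one interval."""
    return interval_days * np.log(2.0) / np.log(amplification)

def amplification_per_cycle(doubling_time_days, interval_days):
    """Amplification over one interval implied by a doubling time."""
    return 2.0 ** (interval_days / doubling_time_days)

# Operational setting quoted in the text: 1.6-day doubling time, 6-h cycle.
ratio_operational = 1.6 / 0.25

# Idealized system: 3.2-day doubling time, interval chosen to match the ratio.
interval_model = 3.2 / ratio_operational          # days
amp_model = amplification_per_cycle(3.2, interval_model)
```

Matching the operational ratio of 6.4 gives an interval of 0.5 day, that is, 12 h, over which an error growing with a 3.2-day doubling time amplifies by roughly 11%.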

### c. Error covariances with observations in every grid column

The first experiments are based on observations at every grid column with uncorrelated observation error variances of 100 m^{2} at all three vertical levels. The vertically averaged 12-h forecast height error variance for this system is shown in Fig. 5. The forecast error variance increases with height, and maxima occur near the exit regions of the Northern Hemisphere jets. These maxima are collocated with the maxima of the leading eigenvectors of 𝗠.

The analysis error variance fields (not shown) exhibit similar spatial patterns with smaller magnitudes. The global average forecast and analysis height error variances are 0.49 and 0.42 m^{2}, respectively. This means that rms analysis error is just 6.4% of rms observation error for a globe-covering observing network. This is an upper limit to the accuracy with which the atmospheric state can be known with radiosonde-like observations in each grid column.

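The structure of the forecast error covariance can be examined through the analysis increment produced by a single observation. For a single observation, $\mathbf{H}$ is a row vector and $\mathbf{H}\mathbf{P}^f\mathbf{H}^T$ is a scalar giving the forecast error variance of the observed variable, so the normalized single observation increment is $\mathbf{P}^f\mathbf{H}^T(\mathbf{H}\mathbf{P}^f\mathbf{H}^T)^{-1}$. Thus, if the observed variable is a state variable, then the single observation increment describes the column of $\mathbf{P}^f$ that lists the covariances of all state variables with the observed variable normalized by the variance of the observed variable.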
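A minimal sketch of the normalized single observation increment follows. It assumes the observed variable is itself a state variable, so the observation operator is a unit row vector; the function name and the 4 × 4 covariance are hypothetical.

```python
import numpy as np

def single_observation_increment(Pf, j):
    """Normalized increment Pf H^T / (H Pf H^T) for an observation of state variable j."""
    h = np.zeros(Pf.shape[0])
    h[j] = 1.0                                  # H picks out state variable j
    return Pf @ h / (h @ Pf @ h)

# Hypothetical symmetric positive definite forecast error covariance.
A = np.array([[1.0, 0.4, 0.0, 0.2],
              [0.0, 1.0, 0.5, 0.0],
              [0.3, 0.0, 1.0, 0.1],
              [0.0, 0.2, 0.0, 1.0]])
Pf = A @ A.T

increment = single_observation_increment(Pf, j=2)
```

The increment equals 1 at the observed variable, and its other entries are the covariances with that variable divided by its variance.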

Figure 6 displays such increments for 500-mb streamfunction observations at two different horizontal locations. The structures of the increments are complex and differ significantly between the two locations. The vertically tilted localized wave packet structures exhibited in the vertical cross sections of the increments are somewhat reminiscent of the early stage baroclinic wave packet development examined in Simmons and Hoskins (1979).

The optimal single observation analysis increments are not very localized in space. The Northern Hemisphere increments show values above 0.3 in both the Atlantic and Pacific basins. (The increment has magnitude 1 at the observed variable.) The far field effect of a single observation assimilated with perfect forecast error covariances for a steady-state basic state is particularly evident for the Southern Hemisphere observation. These figures demonstrate the existence of true error covariances between variables spatially separated by distances of the order of magnitude of the radius of the earth. These long distance covariances could be artifacts of neglecting model error, the stationarity of the basic state or the infinite time limit. However, without further research our result makes it difficult to ignore the possibility that true long distance error covariances exist. Current operational data assimilation schemes preclude the representation of such covariances.

### d. Determining effective locations for observations

The error covariance solution can be used to find the optimal location of a single grid column observation by simply evaluating the globally averaged forecast error variance for all possible locations of the grid column observation. The observation location that minimizes the global forecast error variance is the optimal site for a single grid column observation. Having identified this first observation site, one can then use the error covariance solution to find the optimal location of an additional observation given the existence of the first observation. This is done by evaluating the global forecast error variance for all possible two-column-observation networks consisting of an observation at the first site and an observation column at another site. Adding more observations to the network by iterating this approach creates an observational network that we shall call *effective.* This serial observational network design technique is not guaranteed to be optimal because we do not evaluate the forecast error variance resulting from every possible combination of observation locations. However, we can call it effective because our search procedure ensures that the network has a certain degree of nonredundancy. Similar serial observation network design approaches have been suggested by Evans et al. (1998).
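The serial search described above can be sketched on a toy system. The example assumes the closed-form steady-state solution in the element-wise form $(\mathbf{E}^H\mathbf{R}^{-1}\mathbf{E})_{ij}/(\lambda_i^*\lambda_j - 1)$ for the inverse forecast kernel, treats each state variable as a candidate observation site, and uses a hypothetical 6-variable propagator; none of the names or values come from the QG experiments.

```python
import numpy as np

def mean_forecast_variance(M, H, R):
    """Mean steady-state forecast error variance from the closed-form solution."""
    lam, V = np.linalg.eig(M)
    grow = np.abs(lam) > 1.0                    # growing modes only
    lamK, VK = lam[grow], V[:, grow]
    E = H @ VK
    C = E.conj().T @ np.linalg.solve(R, E)      # E^H R^{-1} E
    Gf = np.linalg.inv(C / (np.outer(lamK.conj(), lamK) - 1.0))
    Pf = (VK @ Gf @ VK.conj().T).real
    return np.trace(Pf) / M.shape[0]

def serial_network(M, n_obs, obs_var=1.0):
    """Add observations one at a time, each at the site that minimizes the variance."""
    n = M.shape[0]
    chosen, history = [], []
    for _ in range(n_obs):
        best_site, best_var = None, np.inf
        for site in range(n):                   # a site may be chosen more than once
            sites = chosen + [site]
            H = np.eye(n)[sites]                # each row observes one state variable
            R = obs_var * np.eye(len(sites))    # uncorrelated observation errors
            var = mean_forecast_variance(M, H, R)
            if var < best_var:
                best_site, best_var = site, var
        chosen.append(best_site)
        history.append(best_var)
    return chosen, history

# Hypothetical propagator: eigenvector entries 0.5**|i-j|, three growing modes.
n = 6
idx = np.arange(n)
V = 0.5 ** np.abs(idx[:, None] - idx[None, :])
lam = np.array([1.4, 1.2, 1.1, 0.9, 0.7, 0.5])
M = V @ np.diag(lam) @ np.linalg.inv(V)

sites, history = serial_network(M, n_obs=3)
```

Each pass costs one small matrix inversion per candidate site, which is what makes an exhaustive scan over sites affordable; as noted in the text, the resulting network is not guaranteed to be the best of all possible combinations.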

Figure 7a shows the globally averaged forecast height error variance as a function of the location of a single column observation (contours greater than 24 × 10^{5} m^{2} have been omitted). The absence of Southern Hemisphere contours in Fig. 7a is a consequence of the fact that a single column observation lying in a Northern Hemisphere baroclinic zone controls error variance in the direction of fastest growing (Northern Hemisphere) eigenvectors much more effectively than any single column observation lying in the Southern Hemisphere. The most effective observation location in this case is actually over Baja California. It appears that this location has a projection onto both the Pacific- and Atlantic-region eigenvectors.

Figure 7b gives the locations of the first 10 effective observations. In general, the optimal location for an additional observation occurs in a region remote from the other observations, usually in the opposite hemisphere of the previous observation. The first 150 effective column observations (Fig. 7c) tend to cluster in two midlatitude belts. Local maxima are found in the vicinity and downstream of the Pacific and Atlantic storm tracks. Local maxima over Eurasia may be related to the adjoint modes, that is, the perturbations that optimally excite the leading eigenvectors.

Within this simple system it is feasible to compare the effective column observations with other types of suboptimal observing configurations. Figure 8 shows the global average forecast height error variance for *N* effective column observations compared with land-only observing networks and networks where the observations are equally spaced over the globe. For reference, column observations at every grid column (2048) will produce a global forecast height error variance of 0.49 m^{2}. Eight effective observations are sufficient to bring the globally averaged 12-h forecast error variance below the observational error variance of 100 m^{2}; 154 effective observations will produce a global forecast error variance comparable to that of the 601 observations placed at all land grid points between 70°S and 70°N. When the land-based observation network density is decreased by a factor of 4, then 151 land-based observations produce a forecast error variance of 9.5 m^{2}, 3.8 times larger than that of 151 effective observations (2.50 m^{2}). The effective observations are also considerably better than equally spaced observations and slightly better than observations placed at every grid column at the latitudes where the effective observations cluster. Consistent with the results of Morss et al. (2001), sensitivity to observation placement is larger when the observing networks are less dense.

To test the sensitivity of the location of the first 150 effective observing sites to changes in the stationary basic state, we have also computed the first 150 effective observing sites for a stationary basic state corresponding to a Northern Hemisphere blocking event. Figure 9 shows the results of these calculations. Comparison of Fig. 9b with Fig. 7c reveals that the two sets of 150 effective observing sites have many sites in common. Unsurprisingly, the largest differences between site locations tend to maximize in regions where the differences between the two basic-state flows are maximized (cf. Fig. 9a with Fig. 1).

This effective network design algorithm allows for the possibility of taking two (or more) independent observations at the same site. However, in the calculations performed to construct Figs. 7–9, the algorithm never selected the same site more than once.

It is also possible to examine the impact of adding one observation to a fixed observing network as a function of the location of that observation. Figure 10 shows the percent global forecast error variance reduction obtained by adding one column observation to a fixed land-based observing system (column observations at every land grid point between 70°S and 70°N), as a function of the location of the additional observation. The results indicate that a reduction in global forecast error variance of over 4% is obtainable through an optimal placement of one additional column observation in either the South or North Pacific.

### e. Which are better, upper-level or lower-level observations?

Results considered so far were produced using column observations. Observations at single model levels have also been considered. The vertically averaged forecast height error variances for observations at every grid point at 200 hPa only and at 500 hPa only are shown in Figs. 11a and 11b, respectively. The observation errors are considered to be spatially uncorrelated and, as in the previous sections, have observation error variances of 100 m^{2}. The global average 12-h forecast height error variance for the 200-hPa observations is 1.23 m^{2}, which is 0.76 of the corresponding 12-h forecast error variance of 1.62 m^{2} for 500-hPa observations (and 0.4 of the 3.10 m^{2} resulting from 800-hPa observations, not shown). However, even though the 200-hPa observations result in smaller global error variances, local forecast error variance maxima over the storm track regions of the North Pacific and North Atlantic are larger for the 200-hPa observations than for the 500-hPa observations. Figure 11c is the ratio between the vertically averaged forecast error variance fields shown in Figs. 11a and 11b. The field is shaded where the ratio is greater than one, indicating regions where the 500-hPa observations result in lower forecast error variance than the 200-hPa observations. The 500-hPa observations result in lower error variances primarily at and downstream of baroclinic regions. This is consistent with singular vector and adjoint sensitivity studies that find that the fastest growing perturbations in strongly baroclinic environments tend to occur in the middle to lower troposphere. In non- or weakly baroclinic environments, the fastest growing perturbations tend to occur at higher levels (Buizza and Palmer 1995).

To test the sensitivity of this result to forecast lead time, we also computed the 72-h forecast error variance for either 500- or 200-hPa observations at 12-h intervals. The global average 72-h forecast height error variance for the 200-hPa observations was 2.90 m^{2}, which is 0.78 of the corresponding 72-h forecast error variance of 3.71 m^{2} for 500-hPa observations (cf. 0.78 with the 0.76 obtained for 12-h forecasts). Thus, although 500-hPa observations are still less effective at reducing global forecast error variance than 200-hPa observations at the 72-h lead time, it is evident that at the 72-h forecast lead time 500-hPa observations are more useful relative to 200 hPa than at the 12-h forecast lead time. Plots like Fig. 11c showing the ratio of 72-h forecast error variance from 500- and 200-hPa observations showed that the regions for which 500-hPa observations produced less error than 200-hPa observations were generally larger than the corresponding regions for the 12-h forecast. To summarize, the relative importance of 500- and 200-hPa observations is fairly insensitive to changes in forecast lead time but the lower-level observations do slightly better at longer forecast lead times.

## 5. Concluding discussion

Exact solutions have been found for the infinite time error covariances of Kalman filter state estimates with a linear time-independent nondefective unstable dynamics propagator, time-independent observation operator, and time-independent observation error covariance. As far as the authors are aware, the solutions are new.

Since the Kalman filter is the optimal state estimation method, these solutions describe the upper bound to what can be known and predicted in such systems. They describe how knowledge of the state of a system is limited by dynamical error amplification, the projection of the observation operator onto the amplifying eigenvectors of the dynamics, and observational error covariance. Under the assumptions by which the solutions are derived, error variance is wholly confined to the vector subspace of the nondecaying adjoint eigenvectors. Only the nondecaying eigenvectors of the dynamics propagator are required to precisely describe the infinite time error covariances.
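The confinement of error variance to a low-dimensional subspace can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical four-variable system with two amplifying and two decaying modes, every variable observed, and unit observation error variance; none of these choices come from the T21L3 experiments. It simply cycles the perfect-model Kalman filter covariance equations until they settle and then counts the numerically nonzero directions of the analysis error covariance.

```python
import numpy as np


def kalman_cycle(Pa, M, H, R):
    """One forecast/analysis cycle of the error covariance (perfect model)."""
    Pf = M @ Pa @ M.T                        # forecast step, no model error term
    S = H @ Pf @ H.T + R                     # innovation covariance
    Pa_new = Pf - Pf @ H.T @ np.linalg.solve(S, H @ Pf)
    return 0.5 * (Pa_new + Pa_new.T), Pf     # symmetrize against round-off


# Hypothetical propagator: columns of V are eigenvectors, two of which amplify.
V = np.array([[1.0, 0.5, 0.0, 0.2],
              [0.0, 1.0, 0.3, 0.0],
              [0.1, 0.0, 1.0, 0.4],
              [0.0, 0.2, 0.0, 1.0]])
eigenvalues = np.array([1.5, 1.2, 0.5, 0.3])
M = V @ np.diag(eigenvalues) @ np.linalg.inv(V)
H = np.eye(4)                                # assumed: observe every variable
R = np.eye(4)                                # assumed: unit observation error variance

Pa = np.eye(4)
for _ in range(300):
    Pa, Pf = kalman_cycle(Pa, M, H, R)

n_nondecaying = int(np.sum(np.abs(eigenvalues) >= 1.0))
rank = int(np.linalg.matrix_rank(Pa, tol=1e-8))
print("nondecaying modes:", n_nondecaying, "| numerical rank of Pa:", rank)
```

In this toy setting the long-time analysis error covariance has as many nonzero directions as there are nondecaying modes, which is the behavior the paragraph above describes.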

Once the nondecaying eigenvectors of the dynamics operator have been computed, the exact solutions allow the error variance controlling properties of a wide variety of observational networks to be quickly computed. In other words, the solutions allow OSSEs to be performed for stationary basic states in a trivial amount of time. This computational efficiency allows sophisticated observation network design algorithms to be employed. Here, we have demonstrated a serial network design algorithm that, although not formally optimal for networks with more than a single observation, avoids highly redundant observation placements.
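A serial design of this kind can be sketched as a greedy loop. The example below is an illustration under stated assumptions and does not use the closed-form solutions of this paper: it estimates the long-time forecast error variance by brute-force cycling of the covariance equations, and it uses a hypothetical six-point ring model in which errors are advected one point per cycle and amplified by a site-dependent factor. The function names, the growth factors, and the number of cycles are all invented for the illustration.

```python
import numpy as np


def forecast_variance(M, sites, obs_var=1.0, n_cycles=400):
    """Trace of the forecast error covariance after many assimilation cycles
    when single-variable observations are taken at the given sites."""
    n = M.shape[0]
    H = np.eye(n)[list(sites)]               # one row per observed variable
    R = obs_var * np.eye(len(sites))
    Pa = np.eye(n)
    for _ in range(n_cycles):
        Pf = M @ Pa @ M.T                    # perfect-model forecast step
        S = H @ Pf @ H.T + R
        Pa = Pf - Pf @ H.T @ np.linalg.solve(S, H @ Pf)
        Pa = 0.5 * (Pa + Pa.T)
    return float(np.trace(M @ Pa @ M.T))


def serial_design(M, n_obs):
    """Add one site at a time, each optimal given the sites already chosen."""
    chosen, history = [], []
    for _ in range(n_obs):
        candidates = [s for s in range(M.shape[0]) if s not in chosen]
        scores = {s: forecast_variance(M, chosen + [s]) for s in candidates}
        best = min(scores, key=scores.get)
        chosen.append(best)
        history.append(scores[best])
    return chosen, history


# Hypothetical ring model: shift by one point per cycle with local growth.
growth = np.array([1.6, 1.0, 0.9, 1.4, 0.8, 1.1])
shift = np.roll(np.eye(6), 1, axis=0)
M = shift @ np.diag(growth)

sites, variances = serial_design(M, n_obs=3)
print("chosen sites:", sites)
print("forecast error variance after each addition:", variances)
```

Because each new site is chosen with the earlier sites held fixed, the result is not guaranteed to be the jointly optimal set, which mirrors the caveat in the text about networks with more than a single observation.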

Key features of the error covariance fields for differing observational networks have been computed for a T21L3 global QG model. This simple model provides a representation of atmospheric processes such as large-scale midlatitude baroclinic and barotropic instability that is qualitatively correct. Marshall and Molteni (1993) have shown that its representation of low-frequency variability also has some qualitatively reasonable features. The interval between data assimilation times for our solutions was selected so that the ratio of error doubling time of the most rapidly growing normal mode to the time interval between data assimilation times was similar to that for the atmospheric data assimilation problem.

For an observational network featuring radiosonde-like grid column observations of streamfunction in every grid column, and a stationary basic state corresponding to the Northern Hemisphere winter time-mean, global rms analysis errors are just 7% of the rms observation error. This represents an upper bound of the analysis error accuracy that can be achieved with a network consisting solely of grid column streamfunction observations. Plots of vertically averaged analysis error and forecast error variance showed that uncertainty in state estimates was largest in the exit regions of Northern Hemisphere jets.

Optimal single observation increments for this network demonstrated the existence of true error covariances between variables spatially separated by distances of the order of magnitude of the radius of the earth. While it is possible that this result is peculiar to our assumptions of no model error and time-invariant dynamics, without further research one must also accept the possibility that such long distance error covariances occur in more realistic systems. Consequently, the result might be of interest to ensemble Kalman filter developers, such as Houtekamer and Mitchell (2001), Anderson (2001), and Whitaker and Hamill (2002), who have suggested that all long distance error covariances produced by ensemble-based error correlations should be removed by means of a Schur inner product with some prescribed error correlation matrix. Work by Houtekamer and Mitchell (2001) on a simplified system suggests that error covariance localization can be discarded with little ill effect when the number of ensemble members exceeds 250.
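To make the Schur product operation concrete, the sketch below applies an element-wise product between an ensemble sample covariance and a prescribed correlation matrix. The Gaussian taper, the one-dimensional grid, the length scale, and the synthetic ensemble are assumptions made for illustration only; they are not the correlation functions used in the cited ensemble Kalman filter studies.

```python
import numpy as np


def localize(sample_cov, positions, length_scale):
    """Schur (element-wise) product of a sample covariance with a prescribed
    correlation matrix that decays with separation distance."""
    separation = np.abs(positions[:, None] - positions[None, :])
    taper = np.exp(-0.5 * (separation / length_scale) ** 2)  # assumed Gaussian
    return sample_cov * taper, taper


rng = np.random.default_rng(0)
n_members, n_points = 20, 40
positions = np.arange(n_points, dtype=float)
ensemble = rng.standard_normal((n_members, n_points))  # synthetic perturbations
sample_cov = np.cov(ensemble, rowvar=False)

localized_cov, taper = localize(sample_cov, positions, length_scale=5.0)
print("taper between the two most distant points:", taper[0, -1])
```

Variances on the diagonal are untouched, while covariances between distant points are driven toward zero whether they are sampling noise or, as the result above suggests can happen, real.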

Petterssen and Smebye (1971) identified two archetypes of atmospheric cyclogenesis: type A, which was initiated at low levels; and type B, which was initiated at upper levels. Hoskins et al.'s (1985) description of baroclinic instability in terms of counterpropagating Rossby waves located the sources of instabilities in meridional potential vorticity gradients. Buizza and Palmer (1995) found that perturbations that maximize energy growth typically have maximal amplitude at lower levels (at and below 500 mb) in strongly baroclinic regions and in upper levels (around 200 mb) in weakly baroclinic regions. All of these descriptions of baroclinic development invite suggestions about whether observations should be concentrated at lower, mid-, or upper levels in order to ensure that baroclinic development is accurately forecast. The exact error covariance expressions introduced in this paper make it easy to answer such questions for the systems for which the solutions are valid. For the T21L3 QG model with 12-h data assimilation cycle, it was found that upper-level (200 mb) observations were more effective at reducing globally averaged error variance, but midlevel (500 mb) observations were more effective at reducing error variance in the baroclinic regions associated with midlatitude jets. Low-level observations (800 mb) were the least effective. It would be interesting to see if this result changed if a more sophisticated atmospheric model were used.

To apply the error covariance solutions introduced in this paper to the sophisticated numerical models used in operational data assimilation schemes, one needs to identify stationary states that lie near observed states of interest. We were able to do this in our T21L3 QG model by using a descent algorithm to find a quasi-stationary state and then by adding a small forcing term to the model that rendered the quasi-stationary state precisely stationary. In the cases we examined, the additional forcing term was so small that the model that included the additional forcing term was as plausible as the original model. Given the tangent linear of a sophisticated nonlinear model and its adjoint, the first part of this procedure could be achieved for a sophisticated model by using a conjugate gradient scheme similar to that used in 4D-Var to find slowly varying states near observed states of interest. The second part of this procedure could also be achieved by adding to the model's equations forcing terms that would render the quasi-stationary state stationary. Whether or not this forcing would be small enough to have a negligible effect on the plausibility of the model would depend on (a) the uncertainties in the model's parameterization schemes and (b) the slowness of the slowly varying state identified by the descent algorithm.
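The two-part procedure described above can be sketched on a toy problem. The example assumes a hypothetical two-variable nonlinear model and uses plain gradient descent with a finite-difference Jacobian in place of the adjoint-based conjugate gradient scheme mentioned in the text; the model, the starting state, and the number of descent steps are all illustrative.

```python
import numpy as np


def tendency(x):
    """Hypothetical two-variable nonlinear model, dx/dt = tendency(x)."""
    return np.array([x[1] - x[0] ** 3, -x[0] - 0.5 * x[1] + 1.0])


def jacobian(f, x, eps=1e-6):
    """Finite-difference Jacobian of f at x (stands in for a tangent linear model)."""
    f0 = f(x)
    J = np.empty((f0.size, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps
        J[:, j] = (f(x + dx) - f0) / eps
    return J


def descend(f, x0, n_steps=20):
    """Reduce the cost 0.5*|f(x)|^2 by gradient descent with backtracking."""
    x = x0.copy()
    for _ in range(n_steps):
        fx = f(x)
        cost = 0.5 * fx @ fx
        grad = jacobian(f, x).T @ fx
        step = 1.0
        while step > 1e-12:
            trial = f(x - step * grad)
            if 0.5 * trial @ trial < cost:
                break                        # accepted: the cost decreased
            step *= 0.5
        if step <= 1e-12:
            break                            # no further decrease found
        x = x - step * grad
    return x


x0 = np.array([2.0, -1.0])                   # stands in for an observed state
x_quasi = descend(tendency, x0)              # part 1: slowly varying state
forcing = -tendency(x_quasi)                 # part 2: forcing that cancels the residual


def forced_tendency(x):
    return tendency(x) + forcing


print("tendency norm at x0:     ", np.linalg.norm(tendency(x0)))
print("tendency norm at x_quasi:", np.linalg.norm(tendency(x_quasi)))
print("forced tendency at x_quasi:", forced_tendency(x_quasi))
```

The size of `forcing` relative to the other terms of the model is the quantity that would decide whether the modified model remains plausible.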

Given any existing fixed observational network, the exact solutions make it feasible to compute the optimal location for an additional observation. Through the iterative use of this facility, *effective* global observing networks were designed. These networks were found to be considerably superior to observing strategies such as land-based observations, equally spaced observations or storm-track-based observations. Nine effective radiosonde column observations (27 in total) rendered globally averaged 12-h forecast error variance less than radiosonde observation error variance. We find it interesting to compare this result with the fact that current operational data assimilation schemes require about 10^{5} observations to render analysis error variance less than radiosonde observation error variance. [See Simmons and Hollingsworth (2002) for evidence that analysis error variance is now less than radiosonde observation error variance, and see Daley (1991) for estimates of numbers of observations in the routine observing network.]

To test whether something as simple as weak error growth between data assimilation times in the T21L3 model was responsible for its errors being so much easier to constrain than the atmosphere's, the “effective network” experiment was rerun with a 72-h interval between observation times. In this case, 49 column observations were required to make the globally averaged 72-h rms forecast error be less than rms observation error. With 72 h between observation times, the leading eigenvector amplifies by a factor of 2.8 between each observation time. Simmons and Hollingsworth (2002) estimate error-doubling times during the first 24 h of integration of the ECMWF model to be about 1.12 days; consequently, the 6-h error growth between nominal observation times in the ECMWF model is likely to be far less than the 72-h error growth in the T21L3 QG model. Thus, lack of error growth in the T21L3 QG model does not appear to be a likely reason for the huge difference in the numbers of observations required to make rms forecast error be smaller than rms observation error in the T21L3 and ECMWF systems. Consequently, the discrepancy must be attributed to some combination of the following: a larger number of amplifying error directions in the atmosphere than in the T21L3 model, suboptimality of the ECMWF data assimilation scheme, random model error that significantly projects onto rapidly growing error structures, model error bias, poor observation locations, mis-specified observation error covariances, nonlinear transfer of error from unresolved rapidly evolving small scales to resolved scales, or a time evolving basic state.
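The comparison of growth between observation times can be made explicit with a short calculation. The sketch assumes purely exponential error growth, so that a doubling time converts directly to an amplification factor over any interval; the function names are illustrative.

```python
import math


def growth_factor(interval_hours, doubling_time_hours):
    """Amplification over an interval, assuming exponential growth."""
    return 2.0 ** (interval_hours / doubling_time_hours)


def doubling_time(interval_hours, factor):
    """Doubling time implied by a given amplification over an interval."""
    return interval_hours * math.log(2.0) / math.log(factor)


# ECMWF: doubling time of about 1.12 days, 6 h between nominal observation times.
ecmwf_6h_growth = growth_factor(6.0, 1.12 * 24.0)

# T21L3 QG model: leading eigenvector amplifies by 2.8 over 72 h.
qg_doubling_hours = doubling_time(72.0, 2.8)

print(f"ECMWF 6-h amplification: {ecmwf_6h_growth:.2f}")
print(f"T21L3 doubling time implied by 2.8 per 72 h: {qg_doubling_hours:.1f} h")
```

Under this assumption the 6-h amplification is roughly 1.17, well below the factor of 2.8 per cycle in the 72-h experiment, which is the point made in the paragraph above.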

Operational forecast models are not perfect. The Kalman filter error covariance solutions presented here assume a perfect model. Algebra, not shown here, indicates that the extension of the solution to the case of an imperfect model is fairly straightforward provided that the part 𝗤 of the forecast error covariance due to model error is one or two orders of magnitude smaller than the part 𝗠𝗣^{a}𝗠^{T} due to the propagation of analysis error covariance.^{4} However, since so little is known about the nature of model error covariance in operational models, the operational relevance of any hypothesized 𝗤 would be largely unknown. Nevertheless, Farrell and Ioannou's (2001a,b) work on systems with stable dynamics propagators shows that model error can, in principle, profoundly affect the structures of the minimal set of perturbations required to explain the maximum amount of error variance. Etherton and Bishop (2002), in a simple model study, found that the presence of model error whose statistics and bias are unknown increased long-term error variances by several orders of magnitude. Orrell et al. (2001) suggest that model error, rather than initial condition error, might be the primary cause of forecast error although Simmons and Hollingsworth (2002) suggest the converse. The neglect of model error may be a prime contributor to the fact that observations reduce error much more effectively in our idealized problem than they do in the ECMWF's data assimilation scheme.

While only uncertain estimates of the qualitative characteristics of 𝗤 are available, the bounds analysis techniques employed in Tippett et al. (2000) remain a promising means of predicting likely impacts of 𝗤 on error covariance and its dimension. Another approach would be to investigate the effect of model error using the steady-state error covariance solutions for suboptimal data assimilation schemes given in Gajic and Qureshi (1995).

The relevance of our solution is also limited by the assumption of a time-independent basic state. Work by Dritschel et al. (1991), Bishop (1993), and Bishop and Thorpe (1994) illustrates how the stability characteristics of shear flows are profoundly affected when deformation forces the shear flow to evolve.

The degree of relevance to the atmosphere of our assumptions of time independence and vanishingly small model error depends on the scale of atmospheric motion considered. At large scales, atmospheric flow evolves slowly in time and is well resolved by current weather prediction models whereas small-scale evolution is often rapid and poorly represented in models. Marshall and Molteni (1993) argue that low-frequency atmospheric flow can be viewed in terms of a series of transitions between quasi-stationary states. Haarsma et al. (2001) have found such quasi-stationary states to be a key feature of 40-kyr integrations of coupled climate models.

Suppose one had identified a set of large-scale atmospherically relevant quasi-stationary basic states together with data on the percentage of time that the atmosphere spends in these states (see Marshall and Molteni 1993; Lust et al. 1998). For each quasi-stationary state, all amplifying normal modes could be computed (Lehoucq et al. 1998). With these in hand, one could use infinite time error covariance solutions to rapidly assess the error controlling qualities of some candidate global observing network for each of the relevant large-scale quasi-stationary atmospheric states. An overall measure of the value of any particular candidate network could then be deduced by using data on the percentage of time that the atmosphere spends in each state to appropriately weight the value of the network to each state. In this way, the infinite time error covariance expressions presented in this paper might be used to help design observational networks. However, because of the many assumptions made in computing our solutions and since the amount of time such quasi-stationary states remained “quasi stationary” would often be less than the amount of time required for error covariances to become stationary, such results would need to be treated with caution until OSSEs could be performed to evaluate them.
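The weighting step in this proposal amounts to an occupancy-weighted average. The sketch below uses invented regime names, occupancy fractions, and error variances purely to show the bookkeeping; in practice each variance would come from evaluating the candidate network for the corresponding quasi-stationary state.

```python
# Hypothetical regimes: (fraction of time occupied,
#                        global forecast error variance in m^2 for one candidate network)
regimes = {
    "zonal": (0.5, 1.4),
    "blocked": (0.3, 2.6),
    "transitional": (0.2, 1.9),
}


def weighted_network_score(regimes):
    """Occupancy-weighted mean forecast error variance over all regimes."""
    total_weight = sum(weight for weight, _ in regimes.values())
    weighted_sum = sum(weight * variance for weight, variance in regimes.values())
    return weighted_sum / total_weight


score = weighted_network_score(regimes)
print(f"occupancy-weighted forecast error variance: {score:.2f} m^2")
```

Candidate networks could then be ranked by this single score, subject to the cautions stated in the text.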

The economic value of accurate forecasts of regime changes from high index blocked flow to low index zonal flow is high. How could the routine observational network be altered in order to reduce errors in forecasts of regime changes? Might an observational network designed to accurately monitor linear deviations from a stationary state lying near a high index blocked regime also be useful in identifying the deviations that lead to the nonlinear transition from blocked to unblocked flow? More research is needed to answer these questions.

## Acknowledgments

The authors gratefully acknowledge helpful discussions of this work with Roger Daley and Ron Gelaro. Thanks also to Ron Gelaro and Phil Merilees for providing resources for a visit to the Naval Research Laboratory, Monterey, during which much of the research for this paper was performed. The support of the Office of Naval Research Grant N00014-00-1-0106 and Program Element 0601153N, Project Number BE-0333-0345 is gratefully acknowledged. The Department of Defense High Performance Computing Program at NAVO MSRC provided computing resources.

## REFERENCES

Anderson, J. L., 2001: An ensemble adjustment filter for data assimilation. *Mon. Wea. Rev.*, **129**, 2884–2903.

Arnold, C., and C. Dey, 1986: Observation system simulation experiments: Past, present, and future. *Bull. Amer. Meteor. Soc.*, **67**, 687–695.

Baker, N. L., and R. Daley, 2000: Observation and background adjoint sensitivity in the adaptive observation-targeting problem. *Quart. J. Roy. Meteor. Soc.*, **126**, 1431–1454.

Barkmeijer, J., M. van Gijzen, and F. Bouttier, 1998: Singular vectors and estimates of the analysis error covariance metric. *Quart. J. Roy. Meteor. Soc.*, **124**, 1695–1713.

Barkmeijer, J., R. Buizza, and T. N. Palmer, 1999: 3D-Var Hessian singular vectors and their potential use in the ECMWF Ensemble Prediction System. *Quart. J. Roy. Meteor. Soc.*, **125**, 2333–2351.

Berliner, L. M., Q. Lu, and C. Snyder, 1999: Statistical design for adaptive weather observations. *J. Atmos. Sci.*, **56**, 2536–2552.

Bishop, C. H., 1993: On the behaviour of baroclinic waves undergoing horizontal deformation. Part 2: Error bound amplification and Rossby wave diagnostics. *Quart. J. Roy. Meteor. Soc.*, **119**, 241–269.

Bishop, C. H., and A. J. Thorpe, 1994: Frontal stability during moist deformation frontogenesis. Part II: The suppression of nonlinear wave development. *J. Atmos. Sci.*, **51**, 852–873.

Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. *Mon. Wea. Rev.*, **129**, 420–436.

Buizza, R., and T. N. Palmer, 1995: The singular-vector structure of the atmospheric global circulation. *J. Atmos. Sci.*, **52**, 1434–1456.

Cohn, S. E., 1997: An introduction to estimation theory. *J. Meteor. Soc. Japan*, **75**, 257–288.

Cohn, S. E., and D. P. Dee, 1988: Observability of discretized partial differential equations. *SIAM J. Numer. Anal.*, **25**, 586–617.

Dabberdt, W. F., and Coauthors, 1996: Research opportunities from emerging atmospheric observing and modeling capabilities. *Bull. Amer. Meteor. Soc.*, **77**, 305–323.

Daley, R., 1991: *Atmospheric Data Analysis.* Vol. 2. Cambridge University Press, 457 pp.

Dritschel, D. G., P. H. Haynes, M. N. Juckes, and T. G. Shepherd, 1991: The stability of a two-dimensional vorticity filament under uniform strain. *J. Fluid Mech.*, **230**, 647–665.

Eady, E. T., 1949: Long waves and cyclone waves. *Tellus*, **1**, 17–31.

Emanuel, K., and Coauthors, 1995: Report of the First Prospectus Development Team of the U.S. Weather Research Program to NOAA and the NSF. *Bull. Amer. Meteor. Soc.*, **76**, 1194–1208.

Emanuel, K., and Coauthors, 1997: Observations in aid of weather prediction for North America: Report of Prospectus Development Team Seven. *Bull. Amer. Meteor. Soc.*, **78**, 2859–2868.

Evans, M. N., A. Kaplan, and M. A. Cane, 1998: Optimal sites for coral-based reconstruction of global sea surface temperature. *Paleoceanography*, **13**, 502–516.

Farrell, B., 1988: Optimal excitation of neutral Rossby waves. *J. Atmos. Sci.*, **45**, 163–172.

Farrell, B., 1989: Optimal excitation of baroclinic waves. *J. Atmos. Sci.*, **46**, 1193–1206.

Farrell, B., and P. J. Ioannou, 2001a: Accurate low-dimensional approximation of the linear dynamics of fluid flows. *J. Atmos. Sci.*, **58**, 2771–2789.

Farrell, B., and P. J. Ioannou, 2001b: State estimation using a reduced-order Kalman filter. *J. Atmos. Sci.*, **58**, 3666–3680.

Gajic, Z., and M. Qureshi, 1995: *Lyapunov Matrix Equation in System Stability and Control.* Academic Press, 255 pp.

Golub, G. H., and C. F. van Loan, 1996: *Matrix Computations.* 3d ed. The Johns Hopkins University Press, 694 pp.

Haarsma, R. J., J. D. Opsteegh, F. M. Selten, and X. Wang, 2001: Rapid transitions and ultra-low frequency behaviour in a 40 kyr integration with a coupled climate model of intermediate complexity. *Climate Dyn.*, **17**, 559–570.

Hoskins, B. J., and P. J. Valdes, 1990: On the existence of storm-tracks. *J. Atmos. Sci.*, **47**, 1854–1864.

Hoskins, B. J., M. E. McIntyre, and A. W. Robertson, 1985: On the use and significance of isentropic potential vorticity maps. *Quart. J. Roy. Meteor. Soc.*, **111**, 877–946.

Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. *Mon. Wea. Rev.*, **129**, 123–137.

Houtekamer, P. L., L. Lefaivre, J. Derome, H. Ritchie, and H. L. Mitchell, 1996: A system simulation approach to ensemble prediction. *Mon. Wea. Rev.*, **124**, 1225–1242.

Ide, K., P. Courtier, M. Ghil, and A. C. Lorenc, 1997: Unified notation for data assimilation: Operational, sequential and variational. *J. Meteor. Soc. Japan*, **75**, 181–189.

Jazwinski, A., 1970: *Stochastic Processes and Filtering Theory.* Academic Press, 376 pp.

Lehoucq, R. B., D. C. Sorensen, and C. Yang, 1998: *ARPACK Users Guide: Solution of Large-Scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods.* Society for Industrial and Applied Mathematics, 160 pp.

Lindzen, R. S., and B. Farrell, 1980: A simple approximate result for the maximum growth rate of baroclinic instabilities. *J. Atmos. Sci.*, **37**, 1648–1654.

Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model. *J. Atmos. Sci.*, **55**, 399–414.

Lust, K., D. Roose, A. Spence, and A. R. Champneys, 1998: An adaptive Newton–Picard algorithm with subspace iteration for computing periodic solutions. *SIAM J. Sci. Comput.*, **19**, 1188–1209.

Marshall, J., and F. Molteni, 1993: Toward a dynamical understanding of planetary-scale flow regimes. *J. Atmos. Sci.*, **50**, 1792–1818.

Morss, R. E., K. A. Emanuel, and C. Snyder, 2001: Idealized adaptive observation strategies for improving numerical weather prediction. *J. Atmos. Sci.*, **58**, 210–232.

Orrell, D., L. Smith, J. Barkmeijer, and T. N. Palmer, 2001: Model error in weather forecasting. *Nonlinear Processes Geophys.*, **8**, 357–371.

Palmer, T. N., R. Gelaro, J. Barkmeijer, and R. Buizza, 1998: Singular vectors, metrics, and adaptive observations. *J. Atmos. Sci.*, **55**, 633–653.

Petterssen, S., and S. J. Smebye, 1971: On the development of extratropical cyclones. *Quart. J. Roy. Meteor. Soc.*, **97**, 457–482.

Simmons, A. J., and B. J. Hoskins, 1979: The downstream and upstream development of unstable baroclinic waves. *J. Atmos. Sci.*, **36**, 1239–1254.

Simmons, A. J., and A. Hollingsworth, 2002: Some aspects of the improvement in skill of numerical weather prediction. *Quart. J. Roy. Meteor. Soc.*, **128**, 647–678.

Simmons, A. J., R. Mureau, and T. Petroliagis, 1995: Error growth and estimates of predictability from the ECMWF forecasting system. *Quart. J. Roy. Meteor. Soc.*, **121**, 1739–1771.

Tippett, M. K., S. E. Cohn, R. Todling, and D. Marchesin, 2000: Low-dimensional representation of error covariance. *Tellus*, **52A**, 533–553.

Toth, Z., and E. Kalnay, 1997: Ensemble forecasting at NCEP and the breeding method. *Mon. Wea. Rev.*, **125**, 3297–3319.

Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. *Mon. Wea. Rev.*, **130**, 1913–1924.

## APPENDIX

### Finite Time Error Covariances for Unstable Time-Invariant Diagonalizable Dynamics, No Model Error, and Fixed Observational Network

The *information* forms of (1) and (2) are, respectively, given by

(𝗣^{a}_{k})^{−1} = (𝗠^{T})^{−1}(𝗣^{a}_{k−1})^{−1}𝗠^{−1} + 𝗛^{T}𝗥^{−1}𝗛 […]

[…] *k* is […] **x** is proportional to exp(−½**x**^{T}(𝗣^{a}_{k})^{−1}**x**); thus, from (A5), the probability density of finding an analysis error equal to the *i*th eigenvector **v**_{i} is proportional to exp[−½(𝗪_{k})_{ii}], where (𝗪_{k})_{ij} is the *ij*th component of 𝗪_{k} = 𝗩^{H}(𝗣^{a}_{k})^{−1}𝗩, and from (A5) and the diagonality of **Λ**, […] *i*th eigenvector when *k* is large). However, the independence of the evolving error covariance from the initial error covariance just ensures that the first term on the right-hand side of (A5) is negligible. To achieve stationary error statistics the second term on the right-hand side of (A5) must also be time independent. Consequently, satisfaction of (A7) together with […]

^{A1} By the time the error covariances in the subspace of amplifying eigenvectors satisfy (A8), error covariances in the decaying eigenvector subspace would typically be of negligible amplitude. For any given *k*, a (*λ*_{i}*λ*_{i}) value close enough to unity can violate (A8). Consequently, if eigenvalues exist such that (*λ*_{i}*λ*_{i}) is infinitesimally greater than unity, then (A8) will only be satisfied for infinite *k*. In such cases, a quasi-stationary state would not remain quasi stationary long enough for steady-state error covariances to develop. Consequently, in the absence of certain knowledge of the nonexistence of eigenvalues infinitesimally greater than unity, the eventual production by a Kalman filter of the error covariance solutions given in this paper can only be counted on when the Kalman filter is applied to a flow that is precisely steady (or precisely periodic).
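As a hedged sketch of the kind of calculation this appendix carries out, and not a reproduction of its numbered equations, the following assumes that the analysis error covariance obeys the standard perfect-model information-form recursion with a real propagator $\mathbf{M} = \mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^{-1}$, observation operator $\mathbf{H}$, and observation error covariance $\mathbf{R}$. The matrix $\mathbf{G}$, the summation index $m$, and the choice of $k = 0$ as the initial time are notation introduced here for illustration.

```latex
\begin{align}
(\mathbf{P}^{a}_{k})^{-1}
  &= (\mathbf{M}^{T})^{-1}(\mathbf{P}^{a}_{k-1})^{-1}\mathbf{M}^{-1}
     + \mathbf{H}^{T}\mathbf{R}^{-1}\mathbf{H}, \\
\mathbf{W}_{k} \equiv \mathbf{V}^{H}(\mathbf{P}^{a}_{k})^{-1}\mathbf{V}
  &= \boldsymbol{\Lambda}^{-H}\,\mathbf{W}_{k-1}\,\boldsymbol{\Lambda}^{-1} + \mathbf{G},
  \qquad
  \mathbf{G} \equiv \mathbf{V}^{H}\mathbf{H}^{T}\mathbf{R}^{-1}\mathbf{H}\mathbf{V}, \\
(\mathbf{W}_{k})_{ij}
  &= (\lambda_{i}^{*}\lambda_{j})^{-k}\,(\mathbf{W}_{0})_{ij}
     + G_{ij}\sum_{m=0}^{k-1}(\lambda_{i}^{*}\lambda_{j})^{-m}.
\end{align}
```

The second line uses $\mathbf{M}^{-1}\mathbf{V} = \mathbf{V}\boldsymbol{\Lambda}^{-1}$ and, because $\mathbf{M}$ is real, $\mathbf{V}^{H}(\mathbf{M}^{T})^{-1} = \boldsymbol{\Lambda}^{-H}\mathbf{V}^{H}$. When $|\lambda_{i}^{*}\lambda_{j}| > 1$ the first term of the last line decays with $k$ and the geometric sum converges to $G_{ij}/[1 - (\lambda_{i}^{*}\lambda_{j})^{-1}]$, giving a time-independent limit; when $|\lambda_{i}^{*}\lambda_{j}| < 1$ both terms grow without bound, so the corresponding error covariances keep shrinking rather than settling to a nonzero steady state.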

^{1} Marshall and Molteni (1993) suggest how both observational and dynamical considerations might be used to determine which quasi-stationary states might be most relevant.

^{2} The propagator is diagonalizable, for instance, when the eigenvalues of the dynamics are all distinct (e.g., Golub and van Loan 1996). A nondiagonalizable matrix is *defective.*

^{3} The time-mean state is not a quasi-stationary solution of the governing equations. Nevertheless, the methods of section 3b allow us to identify a quasi-stationary state near the time-mean state.

^{4} Unlike the no-model-error case, error covariances are not confined to the subspace of amplifying eigenvectors when 𝗤 ≠ 0.

Error covariances within the subspace of nonamplifying eigenvectors continually diminish and hence, they never reach a steady state. For example, (𝗪_{k})_{ij} is unbounded when |*λ*_{i}*λ*_{j}| ≤ 1 and *k* → ∞.