Using Improved Background-Error Covariances from an Ensemble Kalman Filter for Adaptive Observations

Thomas M. Hamill NOAA–CIRES Climate Diagnostics Center, Boulder, Colorado

and
Chris Snyder National Center for Atmospheric Research,* Boulder, Colorado


Abstract

A method for determining adaptive observation locations is demonstrated. This method is based on optimal estimation (Kalman filter) theory; it determines the observation location that will maximize the expected improvement, which can be measured in terms of the expected reduction in analysis or forecast variance. This technique requires an accurate model for background error statistics that vary both in space and in time. Here, these covariances are generated using an ensemble Kalman filter assimilation scheme. A variant is also developed that can estimate the analysis improvement in data assimilation schemes where background error statistics are less accurate.

This approach is demonstrated using a quasigeostrophic channel model under perfect-model assumptions. The algorithm is applied here to find the supplemental rawinsonde location to add to a regular network of rawinsondes that will reduce analysis errors the most. The observation network is configured in this experiment so there is a data void in the western third of the domain. One-hundred-member ensembles from three data assimilation schemes are tested as input to the target selection procedure, two variants of the standard ensemble Kalman filter and a third perturbed observation (3DVAR) ensemble. The algorithm is shown to find large differences in the expected variance reduction depending on the observation location, the flow of the day, and the ensemble used in the adaptive observation algorithm. When using the two variants of the ensemble Kalman filter, the algorithm defined consistently similar adaptive locations to each other, and assimilation of the adaptive observation typically reduced analysis errors significantly. When the 3DVAR ensemble was used, the algorithm picked very different observation locations and the analyses were not improved as much.

The amount of improvement from assimilating a supplemental adaptive observation instead of a fixed observation in the middle of the void depended on whether the observation was assimilated sporadically or during every analysis cycle. For sporadic assimilation, the adaptive observation provided a dramatic improvement relative to the supplemental fixed observation. When an adaptive observation was regularly assimilated every cycle, the improvement was smaller.

For the sporadic assimilation of an adaptive observation, targeting based simply on the maximum spread in background forecasts provided similar target locations and similar analysis improvements to those generated with the full algorithm. The improvement from the regular assimilation of an adaptive observation based on the spread algorithm was no larger than when observations from a fixed target in the middle of the void were regularly assimilated.

Corresponding author address: Dr. Thomas M. Hamill, NOAA–CIRES Climate Diagnostics Center, R/CDC 1, 325 Broadway, Boulder, CO 80305-3328. Email: hamill@cdc.noaa.gov

1. Introduction

It has long been recognized that the quality of a numerical weather forecast is related to the quality of its initial condition, or “analysis.” If the analysis has large errors, or if it has moderate errors in regions where forecast errors grow quickly, then the resulting numerical forecast may be poor.

Let us assume that in addition to a routine network of observations, additional observations could be collected sporadically for a moderate cost. These observations, which might come from dropsondes, pilotless drones, driftsondes, or satellites, would be taken at a location(s) chosen to maximize the expected improvement in some aspect of the ensuing analysis or the subsequent forecasts. This general problem is known as targeting, or adaptive observations (Emanuel et al. 1995; Snyder 1996; Lorenz and Emanuel 1998).

The development of existing methods for adaptive observations has been driven by practical opportunities in field experiments such as FASTEX (Snyder 1996; Joly et al. 1997; Emanuel and Langland 1998), NORPEX (Langland et al. 1999a), and the Winter Storms Reconnaissance Program (Szunyogh et al. 2000). These methods include the singular vector technique (Palmer et al. 1998; Buizza and Montani 1999; Bergot et al. 1999; Gelaro et al. 1999, 2000; Bergot 2001), a quasi-linear inverse approach (Pu et al. 1997, 1998; Pu and Kalnay 1999), gradient and sensitivity approaches (Bergot et al. 1999; Langland et al. 1999b; Baker and Daley 2000), ensemble spread techniques (Lorenz and Emanuel 1998; Hansen and Smith 2000; Morss 1998; Morss et al. 2001), the ensemble transform technique (Bishop and Toth 1999; Szunyogh et al. 1999), and the ensemble transform Kalman filter (Bishop et al. 2001; Majumdar et al. 2001).

Considering the adaptive observation problem on a somewhat more theoretical level, determining the optimal observation location requires that we predict the influence of a given observation on the uncertainty of the analysis or the subsequent forecast. That influence is determined partly by the type and accuracy of the observation and partly by how errors will grow during the subsequent forecast (if we are interested in forecasts). It is also strongly affected by the uncertainty in the prior, or “background” forecast. Berliner et al. (1999) provide the analytical tools for understanding how analysis and forecast error characteristics are related to observation and background uncertainty. Their framework, which is reviewed in section 2, is an application of ideas of statistical design and estimation theory to adaptive observations (see their section 2, appendix A; and Cohn 1997). The choices of observation locations derived through this framework are optimal in the case that the required probability distributions are normal and forecast-error evolution is linear. This framework differs from many of the existing approaches that do not incorporate the effects of background uncertainty, such as the singular vector technique or sensitivity techniques (as implemented in practice, though not in principle; see Ehrendorfer and Tribbia 1997; Barkmeijer et al. 1998; Palmer et al. 1998; Hansen and Smith 2000 for discussions of issues related to the choice of initial norm). As a consequence, when using these methods, the same observation location is defined regardless of how large or small the background error is in a given location, and regardless of how accurate or inaccurate the observation (Baker and Daley 2000).

The data assimilation scheme is an additional factor that determines the influence of an observation on the background uncertainty. Results from field experiments show that adding observations to operational analysis–forecast systems can sometimes degrade the subsequent forecasts, and this has often been blamed on inadequacies of the operational assimilation schemes. Bergot (2001) found that supplemental observations provided a more consistently positive impact when they were assimilated with a four-dimensional variational analysis (4DVAR; Le Dimet and Talagrand 1986; Rabier et al. 1998) rather than a three-dimensional variational analysis (3DVAR; Lorenc 1986; Parrish and Derber 1992). Note, however, that occasional degradations are inherent in statistical assimilation schemes; see Morss and Emanuel (2002).

Very little work has been done showing how to use the new and potentially very accurate ensemble-based data assimilation methods for adaptive observations. Our primary intent in this paper is to demonstrate that covariances provided by the ensemble Kalman filter (EnKF) are of sufficient quality to use in designing observing networks. Under the assumptions of a perfect model, an infinite ensemble, normality of observation and background errors, and linearity of error growth, the EnKF (Evensen 1994; Evensen and van Leeuwen 1996; Houtekamer and Mitchell 1998, 2001; van Leeuwen 1999; Keppenne 2000; Mitchell and Houtekamer 2000; Hamill and Snyder 2000; Hamill et al. 2001; Anderson 2001; Whitaker and Hamill 2002) provides the minimum-variance estimate of an updated analysis state and a correct model of the background- and analysis-error covariances (Burgers et al. 1998). If high-quality background-error covariances are available, this in turn permits the adaptive observation problem to be tackled rigorously and with only minor approximations. We will also show that the lessened improvement from assimilating an adaptive observation using suboptimal assimilation schemes can be predicted as well, provided an accurate estimate of background uncertainty is also available.

To explore the consequences of using different assimilation schemes for adaptive observations, we will use ensembles of forecasts produced by two variants of the EnKF. We will also test an ensemble produced by a perturbed observation 3DVAR scheme. Using these ensembles, one can predict the reduction in error variance from the assimilation of the regular network of observations. Subsequently, it is possible to apply a conceptually simple algorithm to estimate the magnitude of the subsequent variance reduction across the domain from an adaptive observation. A large number of potential observation locations can be evaluated very quickly. The adaptive observation algorithm then determines the location where the expected reduction in this variance is largest. We will also describe, but not test, an adaptive observation algorithm that finds the observation locations maximizing the expected reduction in forecast error.

Other investigations of adaptive observations have used ensemble techniques to estimate background uncertainty (e.g., Lorenz and Emanuel 1998; Bishop and Toth 1999; Morss et al. 2001; Hansen and Smith 2000). Only the ensemble transform Kalman filter (ETKF) of Bishop et al. (2001), however, has used that estimate to calculate explicitly the influence of a given observation. Were both our proposed algorithm and the ETKF given the same set of ensemble members after the assimilation of the regular network of observations, the two techniques would produce equivalent results (and both are approximations to the results of Berliner et al. 1999). However, we aim for a slightly different goal than in Bishop et al. (2001) and their subsequent work (e.g., Majumdar et al. 2001). Their research has focused on what can be done with operational ensemble forecast data, and they make some algorithmic simplifications that permit greater computational efficiency but reduced accuracy. Our purpose is to demonstrate that this rigorous approach to adaptive observations is feasible and effective using an EnKF, in that it can accurately predict the effect of observations on analysis uncertainty.

In part to limit the scope of this paper, we focus on choosing additional observations to minimize expected analysis errors. The algorithm we develop, however, has a straightforward extension to the case of minimizing expected forecast errors, as described in section 3d. Minimizing expected analysis errors is also of interest in its own right, as it is the natural approach if one desires to optimize forecast quality simultaneously at multiple lead times or from multiple initialization times (Berliner et al. 1999). Minimizing expected analysis errors also avoids potential complications arising from the nonlinearity of forecast dynamics, and the associated non-Gaussianity of forecast errors (Hansen and Smith 2000).

2. Design of the experiment

The rest of the paper will use a quasigeostrophic (QG) channel model as a vehicle for testing algorithms for adaptive observations. For these experiments, we assume the forecast model is perfect. A long reference integration of the QG model provides the true state; the assimilation and forecast experiments then use that same model together with imperfect observations of the true state.

Errors will be measured in a total energy norm. Let f denote the Coriolis parameter (here, 10⁻⁴ s⁻¹); m is the dimension of the model state vector; N is the Brunt–Väisälä frequency (here, 1.13 × 10⁻² s⁻¹), and Φ′ is a geopotential perturbation. Then the energy norm is denoted as
[Eq. (1): definition of the total energy norm in terms of f, N, m, and Φ′]

a. Model and observations

The QG model is documented in Snyder et al. (2001, manuscript submitted to Mon. Wea. Rev.) and was used in Hamill and Snyder (2000) and Hamill et al. (2000). It is a midlatitude, beta-plane, gridpoint channel model that is periodic in x (east–west), has impermeable walls on the north–south boundaries, and rigid lids at the top and bottom. There is no terrain, nor are there surface variations such as land and water. Pseudopotential vorticity (PV) is conserved except for Ekman pumping at the surface, ∇⁴ horizontal diffusion, and forcing by relaxation to a zonal mean state. The domain is 16 000 × 8000 × 9 km; there are 129 grid points east–west, 65 north–south, and 8 model forecast levels, with additional staggered top and bottom levels at which potential temperature θ is specified. Forecast model parameters are set as in Hamill et al. (2000).

A single fixed observational network is tested here (Fig. 1), with a data void in the western third of the domain. All “control” observations are presumed to be rawinsonde soundings, with u- and υ-wind components and θ observed at each of the eight model levels. Observation errors (Table 1) are assumed to be normal and uncorrelated between vertical levels and uncorrelated in time. The same observation error variances are used in the data assimilation and in the generation of synthetic control observations. These control observations and new analyses are generated every 12 h, followed by a 12-h forecast with the QG model that produces the background state at the next analysis time.

b. Data assimilation schemes

The adaptive observation algorithm to be described in section 3 requires an ensemble whose sample covariance matrix approximates that of the background errors (prior to the assimilation of the additional observations). That ensemble will depend on the specific data assimilation scheme used to assimilate previous observations.

We will use three assimilation schemes, a “perturbed observation” 3DVAR algorithm (Houtekamer and Derome 1995; Hamill et al. 2000) and two variants of the EnKF. All are described in detail in the appendix, and parameter settings are listed in Table 2. The two versions of the EnKF differ in the way that background-error covariances are approximated given an ensemble of background states. In the first, the deviation of each member from the ensemble mean is “inflated” (i.e., multiplied by a scalar constant greater than 1) before their use in the EnKF. In the second, the assimilation is based on a “hybrid” covariance model in which the background-error covariance matrix is approximated as a weighted sum of the sample covariance from the ensemble and a stationary covariance matrix (specifically, that used in the 3DVAR scheme). Both versions of the EnKF use covariance localization, as discussed in the appendix.
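The two covariance models just described can be illustrated with a minimal NumPy sketch. The function names, the toy dimensions, and the (1 − α, α) weighting convention for the hybrid are assumptions made for illustration; the actual parameter settings are those of Table 2 and the appendix, which are not reproduced here.

```python
import numpy as np

def inflate(Xens, r):
    """Multiply each member's deviation from the ensemble mean by a scalar r > 1."""
    mean = Xens.mean(axis=1, keepdims=True)
    return mean + r * (Xens - mean)

def hybrid_covariance(Xens, B, alpha):
    """Weighted sum of the ensemble sample covariance and a stationary matrix B.

    The (1 - alpha, alpha) weighting is an assumed convention for this sketch.
    """
    n = Xens.shape[1]
    dev = Xens - Xens.mean(axis=1, keepdims=True)
    P_ens = dev @ dev.T / (n - 1)
    return (1.0 - alpha) * P_ens + alpha * B

# Toy ensemble: 4 state variables (rows), 50 members (columns).
rng = np.random.default_rng(1)
Xens = rng.standard_normal((4, 50))
Xinf = inflate(Xens, 1.1)                         # hypothetical inflation factor
P_hyb = hybrid_covariance(Xens, np.eye(4), 0.3)   # hypothetical B and weight
```

Inflation leaves the ensemble mean unchanged and scales the sample covariance by r², whereas the hybrid changes the covariance model without altering the members themselves.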

The required ensemble is generated in the same manner for each of these assimilation schemes. Suppose that we have an ensemble of prior forecasts. Then, given new control observations, each member of this ensemble is updated separately with those observations perturbed by an independent realization from the observation-error distribution; this we term a perturbed observation scheme. The resulting ensemble of analyses can then be used to produce an ensemble of short-range forecasts valid at the next observation time. Previous work has shown that in a perfect-model context, such ensembles have desirable sampling characteristics when used either with a 3DVAR assimilation (Hamill et al. 2000) or in the context of the EnKF (Houtekamer and Mitchell 1998, 2001; Hamill and Snyder 2000; Hamill et al. 2001). Again, further details of the implementation appear in the appendix. For ensemble data assimilation schemes that do not involve perturbing the observations, see Lermusiaux and Robinson (1999), Anderson (2001), and Whitaker and Hamill (2002).
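As a rough illustration of the perturbed observation update, the sketch below updates every member with the same gain but with its own realization of observation noise. It assumes a linear observation operator, a gain built from the raw ensemble sample covariance (no inflation, hybridization, or localization), and toy dimensions; the name `perturbed_obs_update` is hypothetical.

```python
import numpy as np

def perturbed_obs_update(Xens, H, R, y, rng):
    """Update each column of Xens (m x n) with independently perturbed observations."""
    n = Xens.shape[1]
    dev = (Xens - Xens.mean(axis=1, keepdims=True)) / np.sqrt(n - 1)
    Hdev = H @ dev
    PbHt = dev @ Hdev.T                    # P^b H^T without forming P^b explicitly
    S = Hdev @ Hdev.T + R                  # H P^b H^T + R
    K = PbHt @ np.linalg.inv(S)            # gain from the ensemble covariance
    # One independent draw from N(0, R) per member, arranged as columns.
    eps = rng.multivariate_normal(np.zeros(len(y)), R, size=n).T
    return Xens + K @ (y[:, None] + eps - H @ Xens)

# Toy problem: 5 state variables, all observed, 200 members.
rng = np.random.default_rng(0)
Xb_ens = rng.standard_normal((5, 200))
H = np.eye(5)
R = 0.1 * np.eye(5)
y = np.zeros(5)
Xa = perturbed_obs_update(Xb_ens, H, R, y, rng)
```

Propagating each column of `Xa` with the forecast model would then give the background ensemble for the next analysis time.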

3. Methodology for choosing adaptive observation locations

The methodology we implement for the selection of an adaptive observation location follows closely from the theory of Berliner et al. (1999). Our emphasis here is to demonstrate that this rigorous approach to adaptive observations is feasible and effective, in that it can accurately predict the effect of observations on analysis uncertainty using an appropriately constructed ensemble. In addition, this methodology is able to predict the impact of additional observations even when those observations are assimilated with a suboptimal assimilation scheme. Notation generally follows the conventions suggested in Ide et al. (1997).

Though our algorithm is conceptually simpler, this methodology is mathematically identical to the ETKF of Bishop et al. (2001) assuming both algorithms input the same ensemble after the assimilation of the regular network of observations. The major difference is determination of how the regular network of observations will reduce the variance in the ensemble between the background and the analysis. The ETKF estimates the reduction in the subspace of the ensemble in a complex but computationally efficient manner. We assume the ensemble of background states will be changed by the assimilation of real or synthetic observations using the actual analysis scheme. This is significantly more expensive but also is conceptually simpler and more accurate, since aspects like “covariance localization” (Houtekamer and Mitchell 2001; Hamill et al. 2001) can be included when processing the regular network of observations.

a. Equations to predict analysis-error variance

First, consider the analysis calculation, written in the general form
$$\mathbf{x}^a = \mathbf{x}^b + \hat{\mathbf{K}}\,(\mathbf{y}^o - \mathbf{H}\mathbf{x}^b), \tag{2}$$
where xa is the m-dimensional analyzed state vector, yo is a p-dimensional vector of observations, K̂ is an approximate gain matrix defined below, and xb is the background state, which is typically a forecast from the previous analysis but more generally is our best estimate of the state prior to assimilating the observations. The linear operator 𝗵 relates the true state xt to the observations through
$$\mathbf{y}^o = \mathbf{H}\mathbf{x}^t + \boldsymbol{\epsilon}, \qquad \boldsymbol{\epsilon} \sim N(\mathbf{0}, \mathbf{R}). \tag{3}$$
In reality, the relation between the model state and the observations is often nonlinear, and most existing assimilation schemes satisfy (2) and (3) only approximately.
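The gain matrix K̂ is specific to the assimilation scheme. In all the schemes considered here, K̂ has the form
$$\hat{\mathbf{K}} = \hat{\mathbf{P}}^b\mathbf{H}^{\mathrm{T}}\,(\mathbf{H}\hat{\mathbf{P}}^b\mathbf{H}^{\mathrm{T}} + \mathbf{R})^{-1}, \tag{4}$$
where P̂b is a model or approximation of the actual background-error covariance matrix,
$$\mathbf{P}^b = \langle(\mathbf{x}^t - \mathbf{x}^b)(\mathbf{x}^t - \mathbf{x}^b)^{\mathrm{T}}\rangle, \tag{5}$$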
where 〈·〉 denotes the expected value. For example, many 3DVAR systems use P̂b = 𝗯, where 𝗯 is a stationary, isotropic covariance matrix (e.g., Parrish and Derber 1992), while the EnKF bases P̂b on the sample covariance of an ensemble of background states.
Next, we derive a general expression for the analysis-error covariance,
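$$\mathbf{P}^a = \langle(\mathbf{x}^t - \mathbf{x}^a)(\mathbf{x}^t - \mathbf{x}^a)^{\mathrm{T}}\rangle. \tag{6}$$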
Subtracting both sides of (2) from xt gives
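$$\mathbf{x}^t - \mathbf{x}^a = (\mathbf{I} - \hat{\mathbf{K}}\mathbf{H})(\mathbf{x}^t - \mathbf{x}^b) - \hat{\mathbf{K}}\boldsymbol{\epsilon}. \tag{7}$$
Substituting this result into (6) and assuming that the observation and background errors are uncorrelated, that is, 〈ϵ(xt − xb)T〉 = 0, we obtain, for an imperfect model of the background-error covariances,
$$\mathbf{P}^a = (\mathbf{I} - \hat{\mathbf{K}}\mathbf{H})\,\mathbf{P}^b\,(\mathbf{I} - \hat{\mathbf{K}}\mathbf{H})^{\mathrm{T}} + \hat{\mathbf{K}}\mathbf{R}\hat{\mathbf{K}}^{\mathrm{T}}. \tag{8}$$
If the assimilation scheme uses the correct background-error covariance matrix (P̂b = 𝗽b), then K̂ becomes the Kalman gain matrix, 𝗸 = 𝗽b𝗵T(𝗵𝗽b𝗵T + 𝗿)−1, and
$$\mathbf{P}^a = (\mathbf{I} - \mathbf{K}\mathbf{H})\,\mathbf{P}^b, \tag{9}$$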
which is the familiar updating of covariances in the Kalman filter.

Equations (8) and (9) thus provide us with a framework for estimating analysis-error covariances for a given 𝗵 in the case of imperfect and perfect background-error statistics, respectively. These equations express how assimilation of new observations changes the uncertainty of the analysis relative to that of the background. This change depends on the form and location of the observations through 𝗵, the data assimilation scheme through K̂, and the background uncertainty through 𝗽b. In particular, that change does not depend on the actual observations yo, and one can predict the effects of additional observations prior to the measurements themselves (Berliner et al. 1999).

We emphasize that (8) accounts naturally for the assimilation scheme and, moreover, that the influence of the assimilation scheme on the analysis uncertainty cannot be fully quantified without knowledge of the true 𝗽b. In addition, note that while the derivations of (8) and (9) do not make assumptions about the form of the underlying probability distributions for the forecast and analysis, those equations will be useful only when the covariance matrices 𝗽a and 𝗽b are useful summaries of uncertainty, that is, when those distributions are not too far from Gaussian. The usefulness of (8) is also limited to those assimilation schemes in which the update is approximately linear as assumed in (2).
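The relationship between the two covariance updates can be checked numerically. The sketch below uses the expression obtained by substituting the analysis error (I − K̂H)(xt − xb) − K̂ϵ into the definition of the analysis-error covariance, namely Pa = (I − K̂H)Pb(I − K̂H)T + K̂RK̂T, and compares a gain built from the true Pb with one built from a stationary, isotropic stand-in. The matrices, dimensions, and function names are illustrative assumptions, not values from the experiments.

```python
import numpy as np

def gain(P_model, H, R):
    """Gain computed from whatever covariance model the scheme uses."""
    return P_model @ H.T @ np.linalg.inv(H @ P_model @ H.T + R)

def analysis_cov(Pb, K_hat, H, R):
    """Analysis-error covariance for an arbitrary gain K_hat and true Pb."""
    A = np.eye(Pb.shape[0]) - K_hat @ H
    return A @ Pb @ A.T + K_hat @ R @ K_hat.T

rng = np.random.default_rng(0)
m, p = 6, 2
L = rng.standard_normal((m, m))
Pb = L @ L.T + np.eye(m)                  # toy "true" background-error covariance
H = np.zeros((p, m))
H[0, 1] = 1.0                             # observe state elements 1 and 4
H[1, 4] = 1.0
R = 0.5 * np.eye(p)

K = gain(Pb, H, R)                        # Kalman gain (correct covariances)
Pa_opt = analysis_cov(Pb, K, H, R)
Pa_kf = (np.eye(m) - K @ H) @ Pb          # familiar Kalman filter update

B = np.eye(m) * np.trace(Pb) / m          # stationary, isotropic stand-in
K_hat = gain(B, H, R)                     # suboptimal gain
Pa_sub = analysis_cov(Pb, K_hat, H, R)
```

With the Kalman gain the two expressions agree, and the trace of `Pa_sub` cannot be smaller than that of `Pa_opt`, since the Kalman gain minimizes the total analysis-error variance.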

b. Adaptively observing to reduce analysis-error variance

Now suppose we want to choose the location of a single observation to minimize the expected analysis-error variance.1 Formally, this amounts to maximizing the trace tr(𝗽b − 𝗽a) over a set of observation operators 𝗵 consisting of all possible locations for the observation. Assuming hereafter that an ensemble is available that provides a reasonable and computationally tractable estimate of 𝗽b, then (8) or (9) allow us to determine the best 𝗵 by evaluating tr(𝗽b − 𝗽a) for each 𝗵. Typically, this additional observation will supplement an existing network of routine observations. The background or prior estimate to which 𝗽b pertains is then the analysis with all routine observations.

For each potential observation location, there is an associated 𝗵; however, it may be economically feasible to observe more than one location at a time. With two locations, one would potentially have to evaluate all the combinations of locations to find the two that would reduce variance the most. Instead, we will make the simplifying assumption that the correct combination of locations can be determined with a serial, or “greedy” algorithm (Lu et al. 2000; Bishop et al. 2001). This serial approach is applicable when successive observations have independent errors.

Specifically, to determine a sequence of multiple locations, the following steps are repeated: first, tr(𝗽b − 𝗽a) is computed for each candidate observation location (each 𝗵). The location with the maximum trace is then selected, and an updated ensemble is generated whose sample covariance approximates 𝗽a implied by assimilating an observation at that location. [Note that real observations need not be assimilated at this point; the important detail is that the ensemble can be updated with some synthetic set of observations, since (8) and (9) depend only on the observation error covariance 𝗿 and not on the actual observations.] The adaptive observation algorithm is then applied again using the updated ensemble of analyses as background forecasts to select the next location.
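The serial selection loop can be sketched as follows for scalar observations. For brevity this sketch updates the covariance matrix directly with the Kalman filter update after each selection, rather than regenerating an ensemble with synthetic observations as described above; that substitution, the function names, and the toy covariance are assumptions made for illustration.

```python
import numpy as np

def expected_reduction(P, h, r):
    """tr(P^b - P^a) for one scalar observation with operator row h and error variance r."""
    c = P @ h                                  # P^b h^T
    return (c @ c) / (h @ c + r)

def greedy_targets(P, candidates, r, k):
    """Pick k observation locations one at a time, updating P after each pick."""
    P = P.copy()
    chosen = []
    for _ in range(k):
        scores = [expected_reduction(P, h, r) for h in candidates]
        best = int(np.argmax(scores))
        chosen.append(best)
        h = candidates[best]
        c = P @ h
        P -= np.outer(c, c) / (h @ c + r)      # P^a = (I - K H) P^b for this observation
    return chosen, P

# Toy case: three uncorrelated state elements with variances 4, 1, and 9.
P0 = np.diag([4.0, 1.0, 9.0])
candidates = np.eye(3)                         # each row observes one element directly
chosen, P_final = greedy_targets(P0, candidates, r=1.0, k=2)
```

In this toy case the element with the largest variance is selected first and the next largest second, because the errors are uncorrelated; with realistic covariances the ranking also reflects how far an observation's influence spreads.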

If the observations will be assimilated using a less accurate model of background-error covariances, perhaps as are used in 3DVAR, then (8) should be used instead of (9) and prediction of the variance reduction in (8) requires an ensemble (such as the perturbed-observation ensemble discussed in the appendix) that reflects uncertainty in the background using a given assimilation scheme. Of course, if the ensemble estimate of 𝗽b is good enough for this purpose, it would also be natural to include it in the assimilation scheme and (8) would not be required.

c. Making the adaptive observation algorithm computationally efficient

The algorithm just outlined involves evaluating the influence of an observation on the analysis error over many different observation locations. In this section, we outline a relatively inexpensive technique for computing the expected reduction in analysis variance for a given observation.

The technique begins from an ensemble of background states, written as {xbi, i = 1, … , n}, where subscripts denote ensemble members. The ensembles considered here all approximate random samples from the conditional distribution of xt given other information. The background state xb is thus replaced by the ensemble mean, x̄b = (1/n) Σ(i = 1, … , n) xbi, and 𝗽b is estimated in (8) by
$$\hat{\mathbf{P}}^b = \mathbf{X}^b\mathbf{X}^{b\mathrm{T}}, \tag{10}$$
where 𝘅b is the matrix whose ith column is (n − 1)^(−1/2) (xbi − x̄b). For the remainder of this section, we will simply replace 𝗽b by P̂b in (9), with the assumption that P̂b approximates 𝗽b with sufficient accuracy. Our subsequent results will demonstrate that this is so in a moderately complex, quasigeostrophic model.
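A short sketch of this construction, assuming the ensemble is stored as an m × n array with one member per column (the array layout and names are illustrative):

```python
import numpy as np

def scaled_deviations(Xens):
    """Matrix whose ith column is (n - 1)^(-1/2) times member i minus the ensemble mean."""
    n = Xens.shape[1]
    return (Xens - Xens.mean(axis=1, keepdims=True)) / np.sqrt(n - 1)

rng = np.random.default_rng(2)
Xens = rng.standard_normal((30, 8))    # toy ensemble: m = 30, n = 8
Xb = scaled_deviations(Xens)
P_hat = Xb @ Xb.T                      # ensemble estimate of the background-error covariance
```

The product reproduces the usual unbiased sample covariance, and because the deviations sum to zero its rank is at most n − 1, which is what makes the reduced-dimension calculations that follow possible.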
If we are evaluating the reduction from assimilating a single radiosonde, the matrix (𝗵𝗽b𝗵T + 𝗿) in (9) is of full rank, relatively low order, symmetric, and positive definite. Hence it can be decomposed as 𝗾Λo𝗾T, where 𝗾 is an orthogonal matrix whose columns are the normalized eigenvectors and Λo a diagonal matrix of associated eigenvalues. Since 𝗾−1 = 𝗾T,
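$$(\mathbf{H}\mathbf{P}^b\mathbf{H}^{\mathrm{T}} + \mathbf{R})^{-1} = \mathbf{Q}\boldsymbol{\Lambda}_o^{-1}\mathbf{Q}^{\mathrm{T}} = (\mathbf{Q}\boldsymbol{\Lambda}_o^{-1/2})(\mathbf{Q}\boldsymbol{\Lambda}_o^{-1/2})^{\mathrm{T}}. \tag{11}$$
This square root formulation in (11) is attractive since we can now write 𝗽b − 𝗽a as a matrix square root:
$$\mathbf{P}^b - \mathbf{P}^a = \mathbf{P}^b\mathbf{H}^{\mathrm{T}}(\mathbf{H}\mathbf{P}^b\mathbf{H}^{\mathrm{T}} + \mathbf{R})^{-1}\mathbf{H}\mathbf{P}^b = (\mathbf{P}^b\mathbf{H}^{\mathrm{T}}\mathbf{Q}\boldsymbol{\Lambda}_o^{-1/2})(\mathbf{P}^b\mathbf{H}^{\mathrm{T}}\mathbf{Q}\boldsymbol{\Lambda}_o^{-1/2})^{\mathrm{T}}. \tag{12}$$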
However, in computing the term in parentheses on the right-hand side of (12), a matrix multiplication by 𝗽b is still necessary, and if 𝗵 is sparse, this typically will be the most computationally intensive step.
In calculating the trace of (12), the product 𝗽b𝗵T is evaluated as 𝘅b(𝗵𝘅b)T, as in (A2) from the appendix. To render this more computationally efficient, we perform a singular value decomposition (SVD) on 𝘅b, so that
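$$\mathbf{X}^b = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{\mathrm{T}}, \tag{13}$$
where 𝘂 is an m × (n − 1) matrix with orthonormal columns, Σ is an (n − 1) × (n − 1) diagonal matrix of nonzero singular values, and 𝘃 is an (n − 1) × (n − 1) orthogonal matrix. Similarly, 𝗵𝘅b = 𝗵𝘂Σ𝘃T so (𝗵𝘅b)T = 𝘃ΣT(𝗵𝘂)T = 𝘃Σ(𝗵𝘂)T since ΣT = Σ. Using this and 𝘃T𝘃 = 𝗶, (12) can be rewritten as
$$\mathbf{P}^b - \mathbf{P}^a = \left(\mathbf{U}\boldsymbol{\Sigma}^2(\mathbf{H}\mathbf{U})^{\mathrm{T}}\mathbf{Q}\boldsymbol{\Lambda}_o^{-1/2}\right)\left(\mathbf{U}\boldsymbol{\Sigma}^2(\mathbf{H}\mathbf{U})^{\mathrm{T}}\mathbf{Q}\boldsymbol{\Lambda}_o^{-1/2}\right)^{\mathrm{T}}. \tag{14}$$
Computing the trace of (14) can be further simplified. Since the columns of 𝘂 are orthonormal, the leading multiplication by 𝘂 in each of the factors on the right-hand side can be omitted without changing the trace of the product, and
$$\mathrm{tr}(\mathbf{P}^b - \mathbf{P}^a) = \mathrm{tr}\left[\left(\boldsymbol{\Sigma}^2(\mathbf{H}\mathbf{U})^{\mathrm{T}}\mathbf{Q}\boldsymbol{\Lambda}_o^{-1/2}\right)\left(\boldsymbol{\Sigma}^2(\mathbf{H}\mathbf{U})^{\mathrm{T}}\mathbf{Q}\boldsymbol{\Lambda}_o^{-1/2}\right)^{\mathrm{T}}\right]. \tag{15}$$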

This equation is relatively inexpensive to compute. There is an up-front cost of performing a singular value decomposition of 𝘅b, but this need be done only once, and after this decomposition is performed, the evaluation of (15) at any particular observation location can be performed quickly. The operation (𝗵𝘂)T is (for this model) the matrix transpose of a simple interpolation to the observation locations using the ensemble of singular vectors instead of the raw ensemble data. Further, the multiplication by Σ² is inexpensive since Σ² is diagonal. An eigenvalue decomposition of (𝗵𝗽b𝗵T + 𝗿) must be performed for each potential observation location, but the order of this matrix is relatively small, and so its decomposition is inexpensive.
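The calculation can be sketched in NumPy as follows: one SVD of the scaled deviation matrix up front, then a small eigendecomposition per candidate location. The toy dimensions, the three-element "sounding" operator, the observation-error variance, and all function names are assumptions for illustration; the sketch also checks the fast result against the direct evaluation of tr[PbHT(HPbHT + R)−1HPb].

```python
import numpy as np

def trace_reduction(U, s, H, R):
    """tr(P^b - P^a) from the SVD factors of X^b, without forming P^b."""
    HU = H @ U                                   # interpolate singular vectors to obs locations
    S = (HU * s**2) @ HU.T + R                   # H P^b H^T + R = HU Sigma^2 (HU)^T + R
    lam, Q = np.linalg.eigh(S)                   # S = Q Lambda_o Q^T
    Z = ((s**2)[:, None] * HU.T) @ (Q / np.sqrt(lam))   # Sigma^2 (HU)^T Q Lambda_o^(-1/2)
    return float(np.sum(Z**2))                   # tr(Z Z^T)

def sounding_operator(j, m, p=3):
    """Hypothetical H observing p consecutive state elements starting at index j."""
    H = np.zeros((p, m))
    H[np.arange(p), j + np.arange(p)] = 1.0
    return H

rng = np.random.default_rng(3)
m, n = 40, 10
Xens = rng.standard_normal((m, n))
Xb = (Xens - Xens.mean(axis=1, keepdims=True)) / np.sqrt(n - 1)

# Up-front SVD; keep the n - 1 nonzero singular values and vectors.
U, s, _ = np.linalg.svd(Xb, full_matrices=False)
U, s = U[:, :n - 1], s[:n - 1]

R = 0.25 * np.eye(3)
scores = [trace_reduction(U, s, sounding_operator(j, m), R) for j in range(m - 2)]
best = int(np.argmax(scores))                    # candidate with the largest expected reduction

# Direct (expensive) evaluation at the chosen location, for comparison.
Pb = Xb @ Xb.T
H = sounding_operator(best, m)
G = Pb @ H.T
direct = float(np.trace(G @ np.linalg.inv(H @ G + R) @ G.T))
```

Each candidate evaluation involves only (n − 1)-dimensional quantities and a p × p eigendecomposition, so the per-location cost does not grow with the full square of the state dimension m.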

Note that for computational reasons, we have made one simplification that may reduce the accuracy of this adaptive observation scheme. Background-error covariances in the adaptive observation algorithm are assumed to be a direct outer product of the ensemble member forecasts' deviations from their mean, as in (10); that is, 𝗽b is modeled strictly in a reduced, n-dimensional subspace to make the computations tractable. The model of covariances used in this algorithm thus assumes neither the localization nor the hybridization of ensemble-based and stationary covariances that have been found to improve EnKF performance. Even though these features may be a part of the actual data assimilation, their inclusion would make the computations here much more expensive. This simplifying assumption may cause some minor misestimation of the actual benefits of assimilating an observation. Results (not shown) indicated that the discrepancies introduced by making these approximations resulted only in a very small misestimation of the expected reduction in analysis variance.

d. Adaptive observations to reduce forecast-error variance

Next, consider choosing locations for additional observations with the goal of minimizing the forecast-error variance. This requires comparing the forecast from xa, the analysis including both routine and additional observations, with that from xb, the analysis based only on routine observations. Denoting quantities pertaining to these two forecasts by superscripts f|a and f|b, respectively, the change in forecast-error variance produced by the additional observations is tr(𝗽f|b − 𝗽f|a). Our methodology is again similar to that proposed in Bishop et al. (2001) and used in Majumdar et al. (2001).

If the analysis errors are not too large, then 𝗽f|b − 𝗽f|a ≈ 𝗺(𝗽b − 𝗽a)𝗺T, where 𝗺 is the linearization of the nonlinear forecast operator M. Using (9) and writing 𝗽b = 𝘅b𝘅bT,
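$$\mathbf{P}^{f|b} - \mathbf{P}^{f|a} \approx \mathbf{M}\mathbf{X}^{b}(\mathbf{H}\mathbf{X}^{b})^{\mathrm{T}}(\mathbf{H}\mathbf{P}^{b}\mathbf{H}^{\mathrm{T}} + \mathbf{R})^{-1}\mathbf{H}\mathbf{X}^{b}(\mathbf{M}\mathbf{X}^{b})^{\mathrm{T}}. \tag{16}$$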
Now consider the ensemble of forecasts from the background ensemble, $\mathbf{x}^{f|b}_{i} = M(\mathbf{x}^{b}_{i})$ for i = 1, … , n. With the same accuracy, 𝗺𝘅b in (16) can be replaced by 𝘅f|b, the matrix whose ith column is $(n-1)^{-1/2}(\mathbf{x}^{f|b}_{i} - \overline{\mathbf{x}}^{f|b})$, and (16) becomes
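$$\mathbf{P}^{f|b} - \mathbf{P}^{f|a} \approx \mathbf{X}^{f|b}(\mathbf{H}\mathbf{X}^{b})^{\mathrm{T}}(\mathbf{H}\mathbf{P}^{b}\mathbf{H}^{\mathrm{T}} + \mathbf{R})^{-1}\mathbf{H}\mathbf{X}^{b}\mathbf{X}^{f|b\,\mathrm{T}}. \tag{17}$$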
An efficient calculation of tr(𝗽f|b − 𝗽f|a) now proceeds as in (14) with the eigendecomposition of (𝗵𝗽b𝗵T + 𝗿) and singular-value decomposition (SVD) of 𝘅f|b as in (13) as
$$\mathbf{X}^{f|b} = \mathbf{U}^{f|b}\boldsymbol{\Sigma}^{f|b}\mathbf{V}^{f|b\,\mathrm{T}}. \tag{18}$$
Thus,
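$$\mathrm{tr}(\mathbf{P}^{f|b} - \mathbf{P}^{f|a}) \approx \mathrm{tr}\left[\boldsymbol{\Sigma}^{f|b}\mathbf{V}^{f|b\,\mathrm{T}}(\mathbf{H}\mathbf{X}^{b})^{\mathrm{T}}(\mathbf{H}\mathbf{P}^{b}\mathbf{H}^{\mathrm{T}} + \mathbf{R})^{-1}\mathbf{H}\mathbf{X}^{b}\mathbf{V}^{f|b}\boldsymbol{\Sigma}^{f|b}\right], \tag{19}$$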
again omitting a factor of 𝘂f|b that does not change the trace, as in (15).

Algorithmically, then, assuming an ensemble of forecasts has been generated from the analyses without adaptive observations, the first step is to perform an SVD of the forecast perturbations as in (18). Then, for each candidate observation location (each 𝗵), compute the expected reduction in forecast-error variance using (19). After each 𝗵 has been tested, the adaptive observation location is the one whose 𝗵 yields the largest trace.
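
The selection loop just described can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the experiments: the function name `select_forecast_target`, the random linear stand-in `M` for the forecast model, and the single-point candidate operators are all hypothetical, and the trace is evaluated with a direct solve of the small matrix 𝗵𝗽b𝗵T + 𝗿 rather than with its eigendecomposition.

```python
import numpy as np

def select_forecast_target(Xb, Xf, candidates, R):
    """Score each candidate H by the expected forecast-variance reduction.

    Xb : (m, n) scaled background perturbations
    Xf : (m, n) scaled forecast perturbations evolved from the same members
    candidates : list of (p, m) observation operators, one per location
    R  : (p, p) observation-error covariance
    """
    # One SVD of the forecast perturbations, Xf = U Sigma V^T.
    # U is discarded because it does not change the trace.
    _, sig, Vt = np.linalg.svd(Xf, full_matrices=False)
    VS = Vt.T * sig                      # V Sigma, shape (n, k)
    scores = []
    for H in candidates:
        Y = H @ Xb                       # H Xb
        S = Y @ Y.T + R                  # H Pb H^T + R
        A = Y @ VS                       # H Xb V Sigma
        scores.append(float(np.trace(A.T @ np.linalg.solve(S, A))))
    return int(np.argmax(scores)), scores

# Toy setup (illustrative assumptions throughout).
rng = np.random.default_rng(3)
m, n = 30, 8
members = rng.standard_normal((m, n))
Xb = (members - members.mean(axis=1, keepdims=True)) / np.sqrt(n - 1)
M = rng.standard_normal((m, m)) / np.sqrt(m)    # stand-in linear forecast model
Xf = M @ Xb
candidates = [np.eye(m)[[j]] for j in range(m)]  # one point observation per grid point
R = 0.25 * np.eye(1)

best, scores = select_forecast_target(Xb, Xf, candidates, R)
print("selected candidate index:", best)
```

Because the toy forecast model is exactly linear, the scores here coincide with tr[𝗺(𝗽b − 𝗽a)𝗺T]; with a nonlinear model the agreement would only be approximate.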

4. Performance of the ensemble data assimilation methods

Before demonstrating the adaptive observation location method, we document the general performance of the three data assimilation methods using the regularly available observations at the fixed network of rawinsondes in Fig. 1. We describe the general error characteristics of each ensemble, as the error characteristics will affect the amount of improvement that can be expected from a new observation. As well, the sampling characteristics of the ensembles are briefly documented to justify using each ensemble to estimate background error covariances. In the subsequent section, the ensemble from each of the three data assimilation methods will be tested for its efficacy in defining adaptive observation locations using (15).

For each of the three assimilation methods, a 90-day cycle of short-range forecasts and analyses was generated, with an updated analysis produced every 12 h. We document the performance of three data assimilation schemes as described in section 2b, the appendix, and Table 2: an inflated ensemble Kalman filter, a hybrid EnKF–3DVAR scheme, and a perturbed observation 3DVAR (PO–3DVAR) ensemble where the covariances are stationary, as in 3DVAR. For each experiment, a 100-member ensemble was used.

Figures 2a–c show a time series of analysis errors in the total energy norm [Eq. (1)] for each member and for the ensemble mean. As expected, for each of the three ensembles, the mean analysis is substantially lower in error than the large majority of individual ensemble member analyses. Errors for the inflated ensemble are slightly lower than for the hybrid, and both of these are dramatically lower in error than for the PO–3DVAR ensemble, indicating the dramatic benefits that may be achievable with accurate, flow-dependent background-error covariances (though the relative improvement may be unrepresentative of the results in real world weather prediction, since these experiments are conducted with a relatively simple model in a perfect-model framework).

We also provide a second metric of forecast quality, measuring the ability of the ensemble to sample properly from the distribution of plausible forecast states. For a properly constructed ensemble, low analysis error should be accompanied by uniformly distributed rank histograms (Hamill 2001 and references therein). The rank of the truth relative to a sorted n-member ensemble of forecasts should be equally likely to occur in any of the n + 1 possible ranks if the truth and ensemble sample the same underlying probability distribution. Hence, over many samples, a histogram of the ranks of truth relative to the sorted ensemble should be approximately uniform.
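
As a concrete illustration of the rank histogram, the following minimal sketch counts, for each sample, how many ensemble members fall below the truth and tallies the resulting ranks. The function name `rank_histogram` and the synthetic Gaussian data are assumptions for illustration; ties are ignored, which is reasonable for continuous variables.

```python
import numpy as np

def rank_histogram(truth, ensemble):
    """Tally the rank of the truth within each sorted n-member ensemble.

    truth    : (k,) verifying values, one per sample
    ensemble : (k, n) ensemble values for the same samples
    Returns an array of n + 1 counts, one per possible rank.
    """
    n = ensemble.shape[1]
    # Rank = number of members smaller than the truth (0 .. n).
    ranks = np.sum(ensemble < truth[:, None], axis=1)
    return np.bincount(ranks, minlength=n + 1)

# Synthetic check: truth and members drawn from the same distribution.
rng = np.random.default_rng(1)
k, n = 20000, 9
truth = rng.standard_normal(k)
ensemble = rng.standard_normal((k, n))

counts = rank_histogram(truth, ensemble)
print(counts)
```

When the truth and the members are drawn from the same distribution, as in this synthetic case, the counts should be close to k/(n + 1) in every bin; an underdispersive ensemble would instead pile up counts in the lowest and highest ranks.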

Figures 3a–c provide rank histograms for the model level 4 potential temperature, generated using a subset of 20 times from the time series, with the first sample analysis taken 10 days after the start of the cycle and with 4 days between each sample analysis. Samples are taken every 250 km in the domain [as noted in Hamill (2001), samples spaced this closely together may not have independent ranks; the general shape of rank histograms may be useful, but hypothesis tests for uniformity using χ2 tests are likely to be misleading]. There is a slight excess of population at the highest ranks, more notably for the two variants of the ensemble Kalman filter. Interestingly, there appears to be more nonuniformity for the inflated ensemble, where analysis errors were lowest. This showed up in many other simulations as well; often lower analysis errors were accompanied by more nonuniform rank histograms, suggesting that it is difficult to optimize the ensemble simultaneously for minimum error characteristics and optimum sampling characteristics. In any case, the departures from uniformity are quite mild. Given that 1) the ensemble-data assimilation schemes are generating analyses with greatly reduced errors relative to 3DVAR, and 2) the ensemble data appear relatively reliable, we proceed under the assumption that background-error covariance estimates required by the adaptive observation algorithm should be reasonably estimated by the sample covariance of the ensemble.

5. Adaptive observation results

a. Location selection with full algorithm

We now test the scheme that selects the adaptive observation location that will maximize the expected reduction in analysis error variance [Eq. (15), the “full” algorithm]. The adaptive observation results shown here are primarily based on the same subset of 20 times in this series, starting 10 days into the analysis cycle and every 4 days thereafter. The analyses produced by the assimilation of the fixed network of rawinsondes (raobs) are used as the background states for the adaptive observation tests performed here. This is a generally justifiable assumption to make if the observations are assimilated serially (Gelb 1974; Anderson and Moore 1979), though see Whitaker and Hamill (2002) for circumstances under which this approximation is not valid.

As a first check of our adaptive observation algorithm, we assess whether the expected reduction of variance computed via (15) is consistent with the actual reduction in variance achieved during the data assimilation. To determine this, it was assumed that a raob profile of winds and temperatures with statistics from Table 1 (adapted from operational statistics cited in Parrish and Derber 1992) would be available at each of the fixed set of locations shown in Fig. 4 on each of the 20 case days. For each location and time, the appropriate 𝗵 operator was developed, which extracts a background wind and temperature profile at the observation location. The expected tr(𝗽b − 𝗽a) was computed via (15) for each sample. We normalized this by the number of grid points according to
i1520-0493-130-6-1552-e20
generating a vector of b's.
We compare this against the reduction in variance when an observation is actually assimilated. For each location and time, a sample control observation was generated, and then a set of perturbed observations. The perturbed observations were assimilated using a standard ensemble Kalman filter algorithm, with no localization of covariances, no inflation of member deviations, nor hybridization. We then computed the actual reduction in ensemble variance
i1520-0493-130-6-1552-e21
in the energy norm from the ensemble of analyzed states. The expected reduction from (20) in the sample analysis variance ought to closely match the actual reduction from (21). Some minor variations can be expected as a result of using the ensemble Kalman filter approach, since it is possible that the use of perturbed observations can generate spurious background-observation error correlations (see Whitaker and Hamill 2002). In any event, this effect should be small when the ensemble size is large, and as shown in Fig. 5a, b is near perfectly correlated with a. In other words, the reduction in the ensemble sample variance from assimilating an observation can be predicted nearly perfectly without actually assimilating that observation by using (15).
Since we have not compared the ensemble against the true state, the calculation performed above tells us nothing about whether the analysis was improved by the assimilation of that observation, only the magnitude of the expected error variance reduction. The more interesting question is of course whether the assimilation of observations actually reduces the ensemble error. Forecast ensembles provide an estimate of the conditional distribution of the true state xt (given all previously available observations). The sample mean is an estimate of xt. For each individual observation profile that was assimilated, we calculate the actual reduction in mean squared error according to
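$$c = \lVert\overline{\mathbf{x}}^{b} - \mathbf{x}^{t}\rVert^{2} - \lVert\overline{\mathbf{x}}^{a} - \mathbf{x}^{t}\rVert^{2}. \tag{22}$$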
It can be shown that in an expected-value sense, c and b ought to be of similar magnitudes. To assess whether this relationship exists here, c is plotted against b in Fig. 5b. There is much less of an obvious linear relationship, though in general large expected reductions in variance are more typically associated with large reductions in ensemble mean error. We suspect that the lack of a clear relationship may be due to the small sample size (also, sample points are not independent, since error structures are correlated for sample points on the same case day, and since there are only 20 case days). Also, note that 28% of the assimilated observations actually increased the error.
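
The three diagnostics discussed above (the expected variance reduction b, the actual reduction in ensemble variance a, and the reduction in ensemble mean squared error c) can be illustrated with a small perturbed-observation EnKF update. This is a minimal sketch under several assumptions: the state is a toy vector rather than the quasigeostrophic model state, a plain Euclidean norm replaces the energy norm, no normalization by the number of grid points is applied, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, p = 20, 200, 3                     # state size, members, observations (toy values)

xt = rng.standard_normal(m)              # "true" state
center = xt + rng.standard_normal(m)     # background mean carries its own error
ens_b = center[:, None] + rng.standard_normal((m, n))

H = np.zeros((p, m))
H[np.arange(p), [4, 9, 14]] = 1.0        # observe three state elements directly
R = 0.1 * np.eye(p)

xb_mean = ens_b.mean(axis=1, keepdims=True)
Xb = (ens_b - xb_mean) / np.sqrt(n - 1)
Y = H @ Xb
K = Xb @ Y.T @ np.linalg.inv(Y @ Y.T + R)    # gain from the sample covariance

# Expected reduction: tr(Pb - Pa) = tr(K H Pb), no observation needed.
b = float(np.trace(K @ Y @ Xb.T))

# Perturbed-observation update: a control observation plus one perturbed copy per member.
y_control = H @ xt + rng.multivariate_normal(np.zeros(p), R)
y_pert = y_control[:, None] + rng.multivariate_normal(np.zeros(p), R, size=n).T
ens_a = ens_b + K @ (y_pert - H @ ens_b)

def total_variance(ens):
    """Sum over state elements of the sample variance about the ensemble mean."""
    return float(np.sum(np.var(ens, axis=1, ddof=1)))

a = total_variance(ens_b) - total_variance(ens_a)    # actual variance reduction
c = float(np.sum((ens_b.mean(axis=1) - xt) ** 2)
          - np.sum((ens_a.mean(axis=1) - xt) ** 2))  # reduction in mean squared error

print(f"b = {b:.3f}, a = {a:.3f}, c = {c:.3f}")
```

In this sketch a tracks b closely because both are computed from the same sample covariance, whereas c depends on the particular realization of background and observation errors and can be negative in individual cases.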

Why do some of the assimilated observations increase the ensemble mean analysis error? (See Morss and Emanuel 2002 for an extended discussion of this topic.) First, the EnKF provides a model of background-error covariances, but there is no guarantee that these error statistics are perfect. As well, the assimilated observations are imperfect, and sometimes the errors in the observations may be large enough for the observation to worsen the analysis (see Morss and Emanuel 2002, Fig. 11 for a nice illustration of this). The nature of the analysis process is statistical and of course subject to random errors; on average, observations provide benefit but they are not guaranteed to do so in every individual instance. For this perfect-model simulation with a known true state, we can assess the importance of observation errors by simply assimilating perfect observations. When they are assimilated, as shown in Fig. 5c, only 12% of the instances yield a mean square analysis error increase, and the magnitude of the typical degradation is significantly smaller. This suggests that the majority of the degradations were associated with errors in the observations.

Next, we demonstrate the operation of the adaptive observation algorithm on several selected case days. Ensemble data from each of the three data assimilation systems (inflated, hybrid, and PO–3DVAR) were used for location selection under the assumption that the ensemble could provide a perfect model of the covariances; that is, ensembles from each were input to (9) to assess the impact assuming that a Kalman filter approach was used for the data assimilation. For each horizontal grid point in the domain on each of the 20 case days, we tested the assimilation of a hypothetical adaptive observation profile at that location using (15). At each location, we calculated tr(𝗽b − 𝗽a)/tr(𝗽b), a measure of the fractional reduction of expected analysis-error variance over the entire domain. Figures 6–8 provide maps of the patterns of expected fractional reduction on three different case days using the inflated ensemble. The three cases show days where assimilating a raob profile could be expected to produce small, moderate, and large improvements, respectively. Several interesting features are shown here. First, the difference in the expected improvement between Figs. 6b and 8b is quite dramatic: from less than a 10% fractional improvement from assimilating a raob profile to approximately a 55% improvement. This suggests that the algorithm may be able to define days when supplemental observations will be particularly helpful, as well as where in the domain the observation should be taken to provide the most benefit. Also note that a synthetic observation was actually assimilated in each case, with concomitant reductions in analysis variance, as illustrated in panel c of Figs. 6–8. These show maps of the expected improvement when the adaptive observation algorithm was applied a second time, after the first adaptive raob had been assimilated.

Figures 7b and 8b also suggest that an optimal adaptive location may differ from that which the casual user might pick from inspection of the flow on that day. In Fig. 7, the ensemble apparently indicated greater uncertainty about the details of the cutoff low in the northern part of the data void than the structure of the jet. Similarly, in Fig. 8, the trough in the southwest part of the domain was apparently poorly defined. Also note that the errors between the regions of the primary and secondary maxima in Fig. 7b were likely to be uncorrelated, given their distance from each other and that the primary location was in a cutoff low detached from the main jet. In Fig. 7c, after assimilation of the primary adaptive observation, most of the error variance near the primary location had been eliminated but not so near the secondary location.

The adaptive observation examples shown so far were generated with the inflated ensemble. Are the targets and patterns of expected improvements similar when generated from the hybrid and perturbed observation 3DVAR ensembles? Figures 9a–c present the expected improvement from the hybrid ensemble computed using (15); these panels should be compared respectively to Figs. 6b, 7b, and 8b. The patterns of expected improvement were quite similar, and the observation locations for the latter two cases were almost identical.

The PO–3DVAR ensemble was also examined. In this test, the expected improvement was evaluated using (15), so that P̂b was estimated from the PO–3DVAR ensemble using (9). This, in essence, assumed that 𝗽b was correctly estimated from the PO–3DVAR ensemble, and that the subsequent data assimilation was done with the EnKF instead of 3DVAR (though, in actuality, the data assimilation did use 3DVAR). Figures 10a–c present the expected improvements for the three case days discussed. The expected improvements that might be obtained are much larger than for the inflated and hybrid ensembles, concomitant with the variance in this ensemble being larger. The regions of large improvement are also more diffuse, indicating that the PO–3DVAR ensemble is generally more uncertain about the state of the atmosphere over large regions, whereas the inflated and hybrid EnKFs were able to narrow down the regions of uncertainty. Of course, one would not run a perturbed observation, 3DVAR ensemble and then switch to assimilating an adaptive observation via the EnKF; presumably, 3DVAR would be used for the assimilation of the adaptive observation. We will revisit shortly the impact of an adaptive observation when much less accurate 3DVAR statistics are used for the data assimilation instead of the ensemble-based statistics.

However, let us briefly return to assessing the impact of these adaptive observations on improving analysis errors. To assess the improvement, for each of the 20 case days, the one optimal observation location was determined for the inflated, hybrid, and PO–3DVAR ensemble using (15). Because the accuracy of the subsequent analysis may depend upon the accuracy of the observation, for each case day we generated five independent realizations of the control observations by adding errors to the true state, with the errors consistent with 𝗿. Each observation was then separately assimilated using the same set of background forecasts. The values of c and b were computed from (22) and (20), respectively, and c versus b is plotted in Figs. 11a–c for the inflated, hybrid, and PO–3DVAR ensembles, respectively. The expected reduction in variance and the actual reduction in ensemble mean squared error were roughly consistent for the inflated and hybrid ensembles; generally, larger actual reductions in the ensemble mean error were associated with larger expected reductions in analysis variance. However, the actual reduction for the PO–3DVAR ensemble was much less than predicted. This, as noted in the preceding paragraph, was a consequence of the actual data assimilation being performed with 3DVAR while the adaptive observation algorithm assumes that the assimilation was performed with an EnKF. Now, suppose that the ensemble really does provide an accurate model of 𝗽b, but the much less accurate 3DVAR statistics are to be used for the data assimilation. Then we can evaluate the improvement from an adaptive observation based on (8) instead of (9); here, we compute the trace of (8) assuming P̂b is the stationary, 3DVAR covariance model and 𝗽b is the covariance estimate from the PO–3DVAR ensemble. Figure 11d shows c versus b under these assumptions. Now, the expected improvement from assimilating via 3DVAR based on (8) was consistent with the ensemble mean squared errors.
We note that accurately evaluating the improvement from assimilating adaptive observations using (8) requires a near-perfect estimate of the background-error covariances, such as may be supplied from an EnKF; if one has such an estimate and could perform the assimilation via an EnKF as readily as via 3DVAR, one might as well assimilate the data with the EnKF. Note also that greater improvements from adaptive observations when using a more sophisticated data assimilation system have previously been suggested by Bergot (2001) and Bishop et al. (2001).
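
The effect of assimilating with less accurate statistics can be sketched with the standard expression for the analysis-error covariance under an arbitrary gain, 𝗽a = (𝗶 − K̂𝗵)𝗽b(𝗶 − K̂𝗵)T + K̂𝗿K̂T, where K̂ is built from the assumed covariance P̂b. We assume here that this plays the same role as (8), which is not reproduced in this section; the function name, the toy covariances, and the Gaussian-shaped stationary model are all illustrative assumptions.

```python
import numpy as np

def variance_reduction(Pb_true, Pb_assumed, H, R):
    """tr(Pb - Pa) when the gain is computed from Pb_assumed but errors follow Pb_true."""
    K = Pb_assumed @ H.T @ np.linalg.inv(H @ Pb_assumed @ H.T + R)
    ImKH = np.eye(Pb_true.shape[0]) - K @ H
    Pa = ImKH @ Pb_true @ ImKH.T + K @ R @ K.T   # valid for any gain K
    return float(np.trace(Pb_true - Pa))

# Toy covariances (illustrative assumptions).
rng = np.random.default_rng(4)
m = 12
A = rng.standard_normal((m, m))
Pb_true = A @ A.T / m                            # stand-in for flow-dependent covariances
dist = np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
Pb_stationary = np.exp(-(dist / 2.0) ** 2)       # stand-in for a stationary 3DVAR-like model

H = np.zeros((2, m))
H[0, 3] = 1.0
H[1, 8] = 1.0
R = 0.2 * np.eye(2)

optimal = variance_reduction(Pb_true, Pb_true, H, R)
suboptimal = variance_reduction(Pb_true, Pb_stationary, H, R)
print(f"optimal gain: {optimal:.3f}, suboptimal gain: {suboptimal:.3f}")
```

Because the Kalman gain minimizes tr(𝗽a), the reduction obtained with the mismatched gain can never exceed the optimal one, and it can even be negative when the assumed covariances are poor enough.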

Consider now whether the algorithm picked similar locations using each of the three ensembles. Figure 9 suggests that hybrid and inflated locations were often quite similar, while Fig. 10 suggests that PO–3DVAR locations were often different. Figure 12a shows just how similar the inflated and hybrid locations were. The exact same observation location was picked on half the case days, and only three days had substantially different locations, one of which is illustrated in Figs. 6b and 9a. However, when comparing the locations from the inflated ensemble against the PO–3DVAR ensemble (Fig. 12b), there were many cases when locations were quite different. The differences in adaptive observation locations do not necessarily indicate a problem with the PO–3DVAR ensemble; rather, they highlight that different data assimilation schemes will produce different background-error statistics.

b. Improvement from adaptive versus supplemental fixed observations

We now attempt to provide an estimate of the benefit of assimilating a supplemental adaptive observation relative to assimilating a supplemental fixed observation in the middle of the void. We test this in two ways. First, we compare the analysis-error reduction when either a fixed or adaptive observation is sporadically assimilated. Next, we consider the case when an adaptive or new fixed observation replaces one of the fixed observations in the data-rich region during every data assimilation cycle.

Using the inflated ensemble and the set of 20 times used previously in Figs. 5 and 11, we applied the adaptive observation algorithm (15). The fractional reduction in the ensemble mean analysis error c from (22) was computed and then compared to the fractional reduction that would be achieved with a fixed supplemental raob profile at the grid point (30, 33), in the middle of the void. A scatterplot of the reduction is shown in Fig. 13. There is a dramatic improvement from using the adaptive observation relative to the fixed observation. The mean improvement is more than four times larger for the adaptive relative to the fixed. The adaptive observation improved the analysis in 19 of 20 cases versus only 15 of 20 for the fixed.

We also performed an experiment where one observation profile in the middle of the data-rich region was removed (the observation at x = 80, y = 45 in Fig. 1, chosen because of the abundance of other nearby observations), and either a new fixed observation at (30, 33) or an adaptive observation was assimilated during every cycle. The time-averaged relative improvement now is not nearly as dramatic (Fig. 14). There were substantial reductions in the ensemble mean analysis error from inserting a fixed observation in the middle of the void (compare to ensemble mean error of 1.07 in Fig. 2a). With an adaptive observation, there was further improvement, but not to the extent suggested from the experiments where an adaptive observation was introduced sporadically. There may be a number of factors that limit the improvement with cycled adaptive observations. First, relatively quickly, the adaptive observations reduce error variance in the data void. The primary benefit of adaptive observations occurs when the background errors are quite large; then the observation has a great impact (see Morss and Emanuel 2002 as well). When an adaptive observation is continually assimilated, it reduces the maximum background errors substantially, and errors are not likely to grow back to their original magnitude in the 12 h to the next assimilation cycle, similar to a result noted in Gelaro et al. (2000). Thus, in some sense an adaptive observation can make subsequent adaptive observations less necessary. Another possibility is that features with high errors eventually flow near enough to the fixed observations to be effectively corrected using the EnKF covariances.

c. Adaptive observations based on ensemble spread

The algorithm described in (15) still requires a nonnegligible amount of computing time and involves a moderate amount of coding. Since it is theoretically justifiable based on filtering theory and requires only minor approximations, it does provide a nice baseline for the evaluation of simpler methods. We examined one such method, selecting a location where the ensemble spread was largest. Such a technique has been suggested in the past in Lorenz and Emanuel (1998), Morss (1998), Hansen and Smith (2000), and Morss et al. (2001). Here, we used the squared spread (the variance about the ensemble mean) of column total energy generated from the inflated ensemble and compared it to the observation locations selected using (15) with the inflated ensemble. Figures 15a–c show the squared spread in the ensemble on the same three days as pictured in Figs. 6–8; note the strong correlation in the patterns of spread and the magnitudes of expected improvement in Figs. 6b, 7b, and 8b. Figure 16a shows the strong correspondence of locations over the 20 cases and Fig. 16b shows how the expected improvement using (15) was quite similar to the expected improvement at the spread observation locations.

The strong correspondence was somewhat to be expected. The Kalman gain 𝗸 = 𝗽b𝗵T(𝗵𝗽b𝗵T + 𝗿)−1 is the product of two factors. The first, 𝗽b𝗵T, is the covariance between the observation location and other grid points. The second, (𝗵𝗽b𝗵T + 𝗿)−1, factors in the relative accuracy of the observation and the background at the observation location. If many grid points have background errors that strongly co-vary with background errors at the observation site, then the observation will make large corrections to the analysis over those co-varying grid points. Conversely, if background errors at other grid points near the observation are relatively uncorrelated with errors at the observation location, the corrective influence of that observation will be small (Berliner et al. 1999). If the extent of background-error covariance is rather similar from grid point to grid point, then the spread in the ensemble is the primary factor in determining location; however, if the spread is similar everywhere, variations in the background-error covariance will play a bigger role in determining the location. For the intermittent assimilation of observations, the location apparently was determined largely by the geographical variations of spread more than by the covariance structure in the background.
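
The interplay of the two factors can be made concrete for the simplest case of a single direct observation of state element j with error variance r, for which tr(𝗽b − 𝗽a) reduces to the squared norm of column j of 𝗽b divided by (𝗽b)jj + r. The sketch below is an assumption-laden toy, not the experiment reported here: the state is uncorrelated noise with one high-spread point, and the names `scalar_obs_reduction`, `amplitude`, and the chosen sizes are hypothetical.

```python
import numpy as np

def scalar_obs_reduction(Xb, j, r):
    """tr(Pb - Pa) for one direct observation of state element j with error variance r."""
    cov_col = Xb @ Xb[j]          # column j of Pb: covariances with the observed point
    # Numerator: how strongly other points co-vary with the observed point.
    # Denominator: background variance (spread) plus observation-error variance.
    return float(cov_col @ cov_col / (cov_col[j] + r))

# Toy ensemble whose spread is much larger at one point (index 7).
rng = np.random.default_rng(5)
m, n, r = 25, 50, 0.5
amplitude = np.ones(m)
amplitude[7] = 3.0
members = amplitude[:, None] * rng.standard_normal((m, n))
Xb = (members - members.mean(axis=1, keepdims=True)) / np.sqrt(n - 1)

spread_sq = np.sum(Xb ** 2, axis=1)                       # ensemble variance per point
reductions = np.array([scalar_obs_reduction(Xb, j, r) for j in range(m)])

spread_pick = int(np.argmax(spread_sq))    # location chosen by maximum spread
full_pick = int(np.argmax(reductions))     # location chosen by expected variance reduction
print("spread pick:", spread_pick, "full pick:", full_pick)
```

When spatial variations in spread dominate, as in this toy case, the two criteria select the same point; if the spread were nearly uniform, differences in the covariance column would decide the location instead.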

To demonstrate what improvement may be realized from assimilating an adaptive observation based on spread during every analysis cycle, we performed an experiment similar to the one used to generate Fig. 14. We conducted a 90-day assimilation cycle, assimilating all of the fixed observations shown in Fig. 1 except the observation at x = 80, y = 45. We then assimilated a replacement adaptive observation at the location with maximum spread. This resulted in a reduction of ensemble mean error in the energy norm of about 18% (Fig. 17). However, perhaps introducing an observation somewhere out in the middle of the void is what was of importance more than the specific location of the observation. To test this, we performed the same experiment of removing the observation at x = 80, y = 45 and inserting a new, fixed observation at x = 30, y = 33, near the middle of the void. The improvement for this network was about the same as with the adaptive observation based on spread.

Collectively, these results suggest that sporadic use of adaptive observations based on spread generated from an appropriate ensemble should be both useful and simple to implement. However, when cycled, the variance in the ensemble is quickly reduced and homogenized, and the spread algorithm is not very effective. Further improvements mostly depend upon using information on the covariance structure of background errors, as evidenced by the improvement noted in Fig. 14 but not in Fig. 17.

6. Discussion and conclusions

The underlying theory of data assimilation provides a rational basis for the selection of an adaptive observation location. Under the theory, if the background-error covariance is accurately modeled and depends upon the dynamics of the flow, then the effect of an adaptive observation upon analysis-error variance can be estimated. At first glance, the equation for estimating the reduction in analysis- or forecast-error variance appears to be too computationally demanding to be useful, owing to the high dimensionality of the background-error covariance matrix. However, if background-error covariances can be modeled using ensemble-based data assimilation methods such as the EnKF, these covariances can be estimated in a reduced-dimension subspace and the computations made more efficient.

In this paper, we demonstrated the application of an algorithm to select the optimal adaptive observation location using the background-error statistics from an ensemble Kalman filter coupled to a quasigeostrophic model. A perfect-forecast model was assumed, and an experiment was conducted with an observation network with a dramatic data void covering the western third of the domain. The algorithm was able to determine locations on each day where a supplementary observation was expected to reduce analysis error the most. The algorithm was also able to quantify how the expected benefit changed from day to day. When tested in a simple quasigeostrophic channel model under perfect-model assumptions, the algorithm predicted large day-to-day variations in the expected improvement to be realized from an adaptive observation. This suggests that it may be possible to define a small subset of days when such supplemental observations will be especially helpful in reducing analysis errors. For brevity, we did not consider the problem of adaptive observations to reduce forecast errors, though in principle the problem is no more complex; the specific technique that would be employed for reducing forecast error was described in section 3d.

As well as developing a model for predicting the influence of observations when background-error statistics are estimated correctly, we also developed and demonstrated a technique to estimate observation influence when background-error statistics are imperfect, as they clearly are in the time-averaged covariances in 3DVAR. As expected, the improvement from supplemental observations is significantly lessened when the adaptive observation is assimilated with a scheme using less accurate background-error covariances. These results underscore the importance of accurate estimates of the background-error covariance matrix when locating or assimilating adaptive observations.

We also compared the amount of improvement when adaptive observations were either assimilated sporadically or during every analysis cycle. When an adaptive observation was assimilated sporadically using the EnKF, that observation increased the percentage of analysis-error variance reduction fourfold compared to the assimilation of an observation at a fixed location in the middle of the data void. However, if either a fixed or an adaptive observation was assimilated during every data assimilation cycle, the reduction in error compared to assimilating the fixed observation was substantial but less dramatic. Adaptive observations are most helpful in situations when the background errors are large. Thus, if previous adaptive observations have already dramatically reduced analysis errors, subsequent adaptive observations will be less useful.

As a proxy for the full adaptive observation algorithm developed here, we examined the efficacy of assimilating an adaptive observation based on the spread in the ensemble. If such an observation was sporadically assimilated, it provided nearly the same level of benefit as an observation taken at the location determined from the full adaptive observation algorithm. However, if an adaptive observation based on ensemble spread was assimilated every cycle, the reduction in error relative to a fixed observation was negligible. This suggests that the spread algorithm efficiently determined locations where background errors were large, and assimilation of the adaptive observation significantly reduced the analysis error. However, once the background errors had been made more uniformly distributed from the assimilation of previous adaptive observations, the spread algorithm provided little or no subsequent benefit. Likely this was because the reduction in analysis error from a given observation is a function of both the variance in background errors at the observation location and the structure of how errors are correlated between the observation location and the analysis grid point. This latter effect was apparently more important in situations where there were not dramatic spatial variations in background error variances.

Readers are cautioned not to overinterpret the results presented here. These results used a simplified, quasigeostrophic channel model under the assumption of no model error. Adaptive observation strategies for improving analyses were tested, but not strategies for improving forecasts. Also, the observational network we tested was somewhat unrealistic, including a dramatic data void and observations with simple error characteristics. In reality, observations are available throughout the real-world data voids, though often the observations are of lesser quality and do not contain detailed vertical structure. The algorithms presented here may not work as well if there is a large amount of nonlinearity in the forecast or non-Gaussianity of error distributions. As a relatively new and computationally expensive technology, ensemble-based data assimilation techniques have yet to be demonstrated operationally. Still, no intractable problems have been encountered in tests conducted so far with a wide variety of models of varying complexity, as discussed in the previously cited literature; further, many research groups are currently working toward testing these ideas in operational models. Nonetheless, our results should be interpreted as estimating an upper bound for the usefulness of adaptive observations. Overall, this algorithmic approach may be very attractive, since it is theoretically consistent with the underpinnings of current data assimilation systems.

The application of such an algorithm in real-world numerical weather prediction and data analysis presupposes the existence of an operational EnKF or other similar algorithm. While many groups are working toward this goal, as yet there is no operational EnKF for atmospheric data assimilation. Perhaps the clear benefit of ensemble-based data assimilation methods, not only for straightforward data assimilation but also for these ancillary applications, will increase their appeal at operational numerical forecast facilities.

Acknowledgments

This research was started with support through NCAR's U. S. Weather Research Program and finished at the NOAA–CIRES Climate Diagnostics Center (CDC). CDC is thanked for allowing the lead author to complete this research. We thank Rebecca Morss (NCAR), Craig Bishop (Penn State and NRL), Jim Hansen (MIT), and Ron Gelaro (NASA Goddard) for their advice and detailed reviews of a draft of the manuscript, and we thank Jeff Whitaker (CDC) for his extensive consultations.

REFERENCES

  • Anderson, B. D., and J. B. Moore, 1979: Optimal Filtering. Prentice-Hall, Inc., 357 pp.
  • Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.
  • Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758.
  • Baker, N. L., and R. Daley, 2000: Observation and background adjoint sensitivity in the adaptive observation-targeting problem. Quart. J. Roy. Meteor. Soc., 126, 1431–1454.
  • Barkmeijer, J., M. van Gijzen, and F. Bouttier, 1998: Singular vectors and estimates of the analysis error covariance metric. Quart. J. Roy. Meteor. Soc., 124, 1695–1713.
  • Bergot, T., 2001: Influence of the assimilation scheme on the efficiency of adaptive observations. Quart. J. Roy. Meteor. Soc., 127, 635–660.
  • Bergot, T., G. Hello, A. Joly, and S. Malardel, 1999: Adaptive observations: A feasibility study. Mon. Wea. Rev., 127, 743–765.
  • Berliner, L. M., Z-Q. Lu, and C. Snyder, 1999: Statistical design for adaptive weather observations. J. Atmos. Sci., 56, 2536–2552.
  • Bishop, C. H., and Z. Toth, 1999: Ensemble transformation and adaptive observations. J. Atmos. Sci., 56, 1748–1765.
  • Bishop, C. H., B. J. Etherton, and S. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part 1: Theoretical aspects. Mon. Wea. Rev., 129, 420–436.
  • Buizza, R., and A. Montani, 1999: Targeting observations using singular vectors. J. Atmos. Sci., 56, 2965–2985.
  • Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: Analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719–1724.
  • Cohn, S. E., 1997: An introduction to estimation theory. J. Meteor. Soc. Japan, 75 (1B), 257–288.
  • Ehrendorfer, M., and J. J. Tribbia, 1997: Optimal prediction of forecast error covariances through singular vectors. J. Atmos. Sci., 54, 286–313.
  • Emanuel, K. A., and R. Langland, 1998: FASTEX adaptive observations workshop. Bull. Amer. Meteor. Soc., 79, 1915–1919.
  • Emanuel, K. A., and Coauthors, 1995: Report of the first prospectus development team of the U.S. Weather Research Program to NOAA and the NSF. Bull. Amer. Meteor. Soc., 76, 1194–1208.
  • Evensen, G., 1994: Sequential data assimilation with a nonlinear quasigeostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99 (C5), 10143–10162.
  • Evensen, G., and P. J. van Leeuwen, 1996: Assimilation of Geosat altimeter data for the Agulhas current using the ensemble Kalman filter with a quasigeostrophic model. Mon. Wea. Rev., 124, 85–96.
  • Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. Quart. J. Roy. Meteor. Soc., 125, 723–757.
  • Gelaro, R., R. Langland, G. D. Rohaly, and T. E. Rosmond, 1999: An assessment of the singular vector approach to targeted observations using the FASTEX data set. Quart. J. Roy. Meteor. Soc., 125, 3299–3327.
  • Gelaro, R., C. A. Reynolds, R. H. Langland, and G. D. Rohaly, 2000: A predictability study using geostationary satellite wind observations during NORPEX. Mon. Wea. Rev., 128, 3789–3807.
  • Gelb, A., Ed., 1974: Applied Optimal Estimation. MIT Press, 374 pp.
  • Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Wea. Rev., 129, 550–560.
  • Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter–3D variational analysis scheme. Mon. Wea. Rev., 128, 2905–2919.
  • Hamill, T. M., C. Snyder, and R. E. Morss, 2000: A comparison of probabilistic forecasts from bred, singular-vector, and perturbed observation ensembles. Mon. Wea. Rev., 128, 1835–1851.
  • Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790.
  • Hansen, J. A., and L. A. Smith, 2000: The role of operational constraints in selecting supplementary observations. J. Atmos. Sci., 57, 2859–2871.
  • Houtekamer, P. L., and J. Derome, 1995: Methods for ensemble prediction. Mon. Wea. Rev., 123, 2181–2196.
  • Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.
  • Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123–137.
  • Ide, K., P. Courtier, M. Ghil, and A. C. Lorenc, 1997: Unified notation for data assimilation: Operational, sequential, and variational. J. Meteor. Soc. Japan, 75 (1B), 181–189.
  • Joly, A., and Coauthors, 1997: The Fronts and Atlantic Storm-Track Experiment (FASTEX): Scientific objectives and experimental design. Bull. Amer. Meteor. Soc., 78, 1917–1940.
  • Keppenne, C. L., 2000: Data assimilation into a primitive equation model with a parallel ensemble Kalman filter. Mon. Wea. Rev., 128, 1971–1981.
  • Langland, R., and Coauthors, 1999a: The North Pacific Experiment (NORPEX-98): Targeted observations for improved North American weather forecasts. Bull. Amer. Meteor. Soc., 80, 1363–1384.
  • Langland, R., R. Gelaro, G. D. Rohaly, and M. A. Shapiro, 1999b: Targeted observations in FASTEX: Adjoint based targeting procedures and data impact experiments in IOP/7 and IOP/8. Quart. J. Roy. Meteor. Soc., 125, 3241–3270.
  • Le Dimet, F-X., and O. Talagrand, 1986: Variational algorithms for analysis and assimilation of meteorological observations: Theoretical aspects. Tellus, 38A, 97–110.
  • Lermusiaux, P. F. J., and A. R. Robinson, 1999: Data assimilation via error subspace statistical estimation. Mon. Wea. Rev., 127, 1385–1407.
  • Lorenc, A. C., 1986: Analysis methods for numerical weather prediction. Quart. J. Roy. Meteor. Soc., 112, 1177–1194.
  • Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary observations: Simulation with a small model. J. Atmos. Sci., 55, 399–414.
  • Lu, Z-Q., L. M. Berliner, and C. Snyder, 2000: Experimental design for spatial and adaptive observations. Studies in the Atmospheric Sciences, L. M. Berliner, D. Nychka, and T. Hoar, Eds., Lecture Notes in Statistics, Vol. 144, Springer-Verlag, 199.
  • Majumdar, S. J., C. H. Bishop, I. Szunyogh, and Z. Toth, 2001: Can an Ensemble Transform Kalman Filter predict the reduction in forecast error variance produced by targeted observations? Quart. J. Roy. Meteor. Soc., 127, 2803–2820.
  • Mitchell, H. L., and P. L. Houtekamer, 2000: An adaptive ensemble Kalman filter. Mon. Wea. Rev., 128, 416–433.
  • Morss, R. E., 1998: Adaptive observations: Idealized sampling strategies for improving numerical weather prediction. Ph.D. dissertation, Massachusetts Institute of Technology, 225 pp. [Available from UMI Dissertation Services, P. O. Box 1346, 300 N. Zeeb Rd., Ann Arbor, MI, 48106-1346.]
  • Morss, R. E., and K. A. Emanuel, 2002: Influence of added observations on analysis and forecast errors: Results from idealized systems. Quart. J. Roy. Meteor. Soc., 128, 285–322.
  • Morss, R. E., K. A. Emanuel, and C. Snyder, 2001: Idealized adaptive observation strategies for improving numerical weather prediction. J. Atmos. Sci., 58, 210–234.
  • Palmer, T. N., R. Gelaro, J. Barkmeijer, and R. Buizza, 1998: Singular vectors, metrics, and adaptive observations. J. Atmos. Sci., 55, 633–653.
  • Parrish, D. F., and J. C. Derber, 1992: The National Meteorological Center's spectral statistical interpolation system. Mon. Wea. Rev., 120, 1747–1763.
  • Pu, Z-X., and E. Kalnay, 1999: Targeting observations with the quasi-linear inverse and adjoint NCEP global models: Performance during FASTEX. Quart. J. Roy. Meteor. Soc., 125, 3329–3337.
  • Pu, Z-X., E. Kalnay, J. Sela, and I. Szunyogh, 1997: Sensitivity of forecast errors to initial conditions with a quasi-inverse linear method. Mon. Wea. Rev., 125, 2479–2503.
  • Pu, Z-X., S. J. Lord, and E. Kalnay, 1998: Forecast sensitivity with dropwindsonde data and targeted observations. Tellus, 50A, 391–410.
  • Rabier, F., J-N. Thepaut, and P. Courtier, 1998: Extended assimilation and forecast experiments with a four-dimensional variational assimilation system. Quart. J. Roy. Meteor. Soc., 124, 1–39.
  • Schubert, S. D., and M. Suarez, 1989: Dynamical predictability in a simple general circulation model: Average error growth. J. Atmos. Sci., 46, 353–370.
  • Snyder, C., 1996: Summary of an informal workshop on adaptive observations and FASTEX. Bull. Amer. Meteor. Soc., 77, 953–965.
  • Szunyogh, I., Z. Toth, K. A. Emanuel, C. H. Bishop, C. Snyder, R. E. Morss, J. Woolen, and T. Marchok, 1999: Ensemble based targeting experiments during FASTEX: The impact of dropsonde data from the Lear jet. Quart. J. Roy. Meteor. Soc., 125, 2189–3218.
  • Szunyogh, I., Z. Toth, R. E. Morss, S. Majumdar, B. J. Etherton, and C. H. Bishop, 2000: The effect of targeted dropsonde observations during the 1999 Winter Storms Reconnaissance Program. Mon. Wea. Rev., 128, 3520–3537.
  • van Leeuwen, P. J., 1999: Comment on “Data assimilation using an ensemble Kalman filter technique.” Mon. Wea. Rev., 127, 1374–1377.
  • Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., in press.

APPENDIX

Data Assimilation Methods

Each of the three ensembles is generated by conducting parallel data assimilation cycles, with different member cycles receiving different perturbed observations. We start with an ensemble of n member analyses at some time $t_0$. These perturbed analyses were generated by adding scaled differences between random model states (Schubert and Suarez 1989) to a control analysis. We then repeat the following three-step process for each data assimilation cycle: 1) Make n forecasts to the next analysis time, here 12 h hence. These forecasts will be used as background fields for n subsequent parallel objective analyses. 2) Given the already imperfect observations at this next analysis time (hereafter called the control observations), generate n independent sets of perturbed observations by adding random noise to the control observations. The noise is drawn from the same distributions as the observation errors (see section 2a). The perturbations are constructed so that the mean of the perturbed observations equals the control observation. 3) Perform n objective analyses, updating each of the n background forecasts using the associated set of perturbed observations. The rationale for this methodology is outlined in Burgers et al. (1998). The details of how the objective analysis is performed for each of the three ensembles are discussed below.
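Step 2 of this cycle can be sketched in Python as follows. The text does not state how the perturbations are constrained to have zero mean; this sketch assumes the simplest option, subtracting the sample-mean perturbation from every member. The function name `perturbed_observations` and the sample values are hypothetical.

```python
import random


def perturbed_observations(control_obs, obs_std, n_members, rng):
    """Return n_members perturbed copies of control_obs whose mean equals control_obs.

    control_obs: list of control observation values.
    obs_std: list of observation-error standard deviations, one per value.
    rng: a random.Random instance, passed in so results are reproducible.
    """
    # Draw Gaussian noise with the assumed observation-error standard deviations.
    noise = [[rng.gauss(0.0, std) for std in obs_std] for _ in range(n_members)]
    # Assumed re-centering: remove the sample-mean perturbation of each observation.
    mean_noise = [sum(column) / n_members for column in zip(*noise)]
    return [
        [y + e - e_bar for y, e, e_bar in zip(control_obs, row, mean_noise)]
        for row in noise
    ]


rng = random.Random(0)
control = [280.0, 12.5, -3.0]  # hypothetical temperature, u, and v values
sets = perturbed_observations(control, [1.0, 2.0, 2.0], 100, rng)
ensemble_mean = [sum(column) / len(sets) for column in zip(*sets)]
print(ensemble_mean)
```

Removing the sample mean slightly reduces the variance of the perturbations, by a factor of (n − 1)/n, which is small for a 100-member ensemble.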

Additional complexity will be introduced here to the basic design of the EnKF. As noted in previous work (e.g., Houtekamer and Mitchell 1998; van Leeuwen 1999; Hamill et al. 2001), these details are added to simplify computations, to improve the analysis, and perhaps most importantly, to avoid the effects of a detrimental process known as “filter divergence.” This is a process whereby errors can start a cyclical and worsening underestimation of background covariances that results in the ensemble ignoring the influence of new observations. A discussion of this problem is provided in Hamill et al. (2001).

A variety of methods have been tried to prevent filter divergence. Houtekamer and Mitchell (1998) and Mitchell and Houtekamer (2000) propose the use of a “double” EnKF, and more recently, a localization of ensemble covariance estimates, explained later (Houtekamer and Mitchell 2001). Anderson and Anderson (1999) suggest inflating the deviation of background members with respect to their mean by a small amount. Hamill and Snyder (2000) proposed a hybrid ensemble Kalman filter–3DVAR data assimilation system, where background-error covariances are modeled as a weighted linear combination of covariances from the ensemble and stationary covariances from 3DVAR. By including a small amount of 3DVAR covariances, which have more degrees of freedom and are larger in magnitude (by virtue of being a less accurate data assimilation scheme), the algorithm draws the analyses more toward the observations and adjusts them in more directions in phase space than in a straight EnKF. This tends to prevent filter divergence.

We have coded the assimilation algorithm here in a general manner, permitting (a) covariance localization, (b) the inflation of member deviations from their mean, and/or (c) the hybridization with or even total usage of 3DVAR covariances. However, the implementation of the hybrid as used here is somewhat different from that described in Hamill and Snyder (2000); notably, though the same forecast model is used, the analysis variable is now geopotential rather than potential vorticity, and the analysis equations are solved in observation space. Also, the 3DVAR statistics are calculated in a different manner, and covariances from the ensemble are localized. More details are provided later.

Recall that $\mathbf{x}_i^b$, $i = 1, \ldots, n$, is defined as the m-dimensional model state vector for the ith member background forecast of an n-member ensemble. The state vector $\mathbf{x}$ for the QG model data assimilation system comprises the streamfunction at each level and grid point, and the potential temperature at each grid point of the top and bottom boundaries.

Presuming one starts with an ensemble of initial conditions generated in a rational manner, the first step in the data assimilation is to integrate an ensemble of forecasts to the next time when observations are available. If the option to inflate the ensemble is invoked, the next step is to replace each background state with a new background state inflated about the ensemble mean forecast $\bar{\mathbf{x}}^b$. Background forecast deviations from the mean are inflated by a factor r, slightly greater than 1.0:
$$\mathbf{x}_i^b \leftarrow r\left(\mathbf{x}_i^b - \bar{\mathbf{x}}^b\right) + \bar{\mathbf{x}}^b$$
Here, the operation ← denotes a replacement of the previous value.
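As a minimal sketch of this inflation step, assuming each member is stored as a flat list of state values, the replacement of each member by its inflated counterpart might be written in Python as follows. The function name `inflate` and the two-member example are hypothetical.

```python
def inflate(ensemble, r):
    """Inflate member deviations about the ensemble mean by the factor r.

    `ensemble` is a list of members, each a list of state values.
    """
    n = len(ensemble)
    mean = [sum(column) / n for column in zip(*ensemble)]
    # x_i <- r * (x_i - mean) + mean, applied element by element.
    return [[r * (x - xb) + xb for x, xb in zip(member, mean)] for member in ensemble]


background = [[1.0, 2.0], [3.0, 6.0]]  # two hypothetical members, mean [2.0, 4.0]
print(inflate(background, 1.5))
```

The ensemble mean is unchanged by this operation, while the ensemble covariances are multiplied by r squared.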
Next, following the standard EnKF formulation, each member of the ensemble is updated. The analysis equation for the ith member is
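$$\mathbf{x}_i^a = \mathbf{x}_i^b + \hat{\mathbf{P}}^b\mathbf{H}^{\mathrm{T}}\left[\mathbf{H}\hat{\mathbf{P}}^b\mathbf{H}^{\mathrm{T}} + \mathbf{R}\right]^{-1}\left(\mathbf{y}_i^o - \mathbf{H}\mathbf{x}_i^b\right) \tag{A1}$$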
Here, $\mathbf{x}_i^a$ is the subsequently analyzed state; $\mathbf{y}^o$ denotes the set of $n_o$ control observations, with distinct perturbed observations $\mathbf{y}_i^o$ generated for each member forecast; $\hat{\mathbf{P}}^b$ is an approximation of the background-error covariances, described below; and $\mathbf{H}$ (here assumed linear) is an operator that converts the model state to the observation type and location. Here, $\mathbf{H}$ is a simple extraction of winds and temperature from the background at the observation location. The matrix $\mathbf{R}$ is the $n_o \times n_o$ measurement error covariance matrix; that is, the observations are related to the true state $\mathbf{x}^t$ by $\mathbf{y}^o = \mathbf{H}\mathbf{x}^t + \boldsymbol{\epsilon}$, where $\boldsymbol{\epsilon}$ is a normally distributed random vector with zero mean and covariance matrix $\mathbf{R}$. Note also that the operation sequence $\hat{\mathbf{P}}^b\mathbf{H}^{\mathrm{T}}[\mathbf{H}\hat{\mathbf{P}}^b\mathbf{H}^{\mathrm{T}} + \mathbf{R}]^{-1}$ is often referred to as the gain matrix; it represents how the observation increment $\mathbf{y}_i^o - \mathbf{H}\mathbf{x}_i^b$ will change the background state at every grid point.
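The member update can be illustrated with a deliberately simplified Python sketch. It assumes a single scalar observation that directly measures one element of the state (so that the observation operator is a simple extraction and the bracketed term in the gain is a scalar), and it uses raw ensemble covariances with no localization and no 3DVAR contribution. The function name `enkf_update_scalar_obs` and the numbers are hypothetical.

```python
def enkf_update_scalar_obs(background, obs_index, perturbed_obs, obs_var):
    """Update each member with its own perturbed observation of one state element.

    background: list of n members, each a list of m state values.
    obs_index: index of the observed state element (assumed extraction operator).
    perturbed_obs: list of n perturbed observations, one per member.
    obs_var: observation-error variance (the scalar R).
    """
    n = len(background)
    m = len(background[0])
    mean = [sum(column) / n for column in zip(*background)]
    # H x_i^b: the observed element of each background member.
    hx = [member[obs_index] for member in background]
    hx_mean = mean[obs_index]
    # Ensemble estimate of P^b H^T (one value per state element).
    pht = [
        sum((member[k] - mean[k]) * (h - hx_mean) for member, h in zip(background, hx))
        / (n - 1)
        for k in range(m)
    ]
    hpht = pht[obs_index]  # H P^b H^T is the variance at the observed element
    gain = [p / (hpht + obs_var) for p in pht]
    return [
        [x + g * (y - h) for x, g in zip(member, gain)]
        for member, y, h in zip(background, perturbed_obs, hx)
    ]


background = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
analysis = enkf_update_scalar_obs(background, 0, [2.0, 2.0, 2.0], 1.0)
print(analysis)
```

In this example the second state element is never observed, yet it is adjusted through its ensemble covariance with the observed element; identical observation values are used for all members only to keep the arithmetic easy to follow.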

In this data assimilation scheme, $n_r$ individual fixed-location raob profiles are assimilated serially; that is, the set of analyses generated by updating the background states with the first raob serves as the background states for assimilation of the second raob, and so on, until all $n_r$ profiles are assimilated. Then these member analyses are used as the background forecasts for assimilation of an adaptive observation. Because raob errors should be independent of each other, the analysis produced by the serial assimilation of raobs should be similar to the analysis produced by assimilating all raobs together (Anderson and Moore 1979, though see caveats in Whitaker and Hamill 2002). Further, this makes the rank of $[\mathbf{H}\hat{\mathbf{P}}^b\mathbf{H}^{\mathrm{T}} + \mathbf{R}]$ rather low, so computation of its inverse is not expensive.
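The equivalence of serial and simultaneous assimilation for independent observation errors can be checked with a small worked example. The Python sketch below assumes the simplest possible setting, a scalar state with a known background variance that is observed directly twice; it illustrates the exact Kalman filter result and not the ensemble approximation, for which the cited caveats apply. All names and numbers are hypothetical.

```python
def kalman_update(x, p, y, r):
    """Scalar Kalman update of mean x and variance p with observation y of variance r."""
    gain = p / (p + r)
    return x + gain * (y - x), (1.0 - gain) * p


def serial_update(x, p, observations):
    """Assimilate (value, variance) pairs one at a time."""
    for y, r in observations:
        x, p = kalman_update(x, p, y, r)
    return x, p


def batch_update(x, p, observations):
    """Assimilate all observations at once in inverse-variance (information) form.

    This form is valid when the observation errors are mutually independent.
    """
    precision = 1.0 / p + sum(1.0 / r for _, r in observations)
    mean = (x / p + sum(y / r for y, r in observations)) / precision
    return mean, 1.0 / precision


obs = [(2.0, 1.0), (1.0, 2.0)]
print(serial_update(0.0, 4.0, obs))
print(batch_update(0.0, 4.0, obs))
```

Both calls give the same analysis mean and variance to within rounding error, and the order in which the two observations are assimilated serially does not matter.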

As in Evensen (1994) and Houtekamer and Mitchell (1998, 2001), for computational efficiency, the matrix operations $\hat{\mathbf{P}}^b\mathbf{H}^{\mathrm{T}}$ and $\mathbf{H}\hat{\mathbf{P}}^b\mathbf{H}^{\mathrm{T}}$ in (A1) are computed together using data from the ensemble of background states. Again, $\mathbf{X}^b$ is the matrix whose ith column is $(n-1)^{-1/2}(\mathbf{x}_i^b - \bar{\mathbf{x}}^b)$. Then
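$$\hat{\mathbf{P}}^b\mathbf{H}^{\mathrm{T}} = (1-\alpha)\,\rho \circ \left[\mathbf{X}^b\left(\mathbf{H}\mathbf{X}^b\right)^{\mathrm{T}}\right] + \alpha\,\mathbf{B}\mathbf{H}^{\mathrm{T}} \tag{A2}$$
$$\mathbf{H}\hat{\mathbf{P}}^b\mathbf{H}^{\mathrm{T}} = (1-\alpha)\,\mathbf{H}\mathbf{X}^b\left(\mathbf{H}\mathbf{X}^b\right)^{\mathrm{T}} + \alpha\,\mathbf{H}\mathbf{B}\mathbf{H}^{\mathrm{T}} \tag{A3}$$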
There are two terms in each equation. The first term represents the contribution of flow-dependent statistics derived from the ensemble, and the second term represents the stationary, 3DVAR contribution, where $\mathbf{B}$ is the 3DVAR background-error covariance model. The two terms are weighted by $\alpha$, a tunable, fixed constant, $0.0 \le \alpha \le 1.0$. For the 3DVAR part, $\mathbf{B}$ is modeled as $\mathbf{B} = c\langle\hat{\mathbf{P}}^b\rangle$, where $\langle\cdot\rangle$ denotes an average covariance from a 100-member EnKF over a 180-day cycle. Since these time-averaged covariances have magnitudes consistent with the covariances of the EnKF but are less accurate (see section 4 also), an empirical multiplier $c > 1.0$ is applied to the covariance model; after testing the accuracy of 3DVAR over a range of values, $c = 16$ was found to produce the best analyses, and this constant is used here.
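The weighting of the two contributions can be sketched in Python as follows. The sketch assumes that α is the weight on the stationary covariances (as stated in the caption of Table 2) and that the ensemble term receives the complementary weight 1 − α, as in Hamill and Snyder (2000). It operates on small explicit covariance matrices rather than on the observation-space products used in the actual computation, and the function name `hybrid_covariance` and the matrices are hypothetical.

```python
def hybrid_covariance(p_ensemble, p_time_mean, alpha, c):
    """Blend flow-dependent and stationary covariances.

    p_ensemble: ensemble-derived covariance matrix (list of rows).
    p_time_mean: time-averaged EnKF covariance matrix, same shape.
    alpha: weight on the stationary part, between 0.0 and 1.0.
    c: empirical multiplier applied to the time-averaged covariances.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0.0 and 1.0")
    return [
        [(1.0 - alpha) * pe + alpha * c * pm for pe, pm in zip(row_e, row_m)]
        for row_e, row_m in zip(p_ensemble, p_time_mean)
    ]


p_ens = [[2.0, 1.0], [1.0, 2.0]]
p_avg = [[1.0, 0.0], [0.0, 1.0]]
print(hybrid_covariance(p_ens, p_avg, 0.5, 16.0))
```

Setting α = 0.0 recovers the pure ensemble covariances, and α = 1.0 uses only the scaled stationary covariances, which corresponds to the total usage of 3DVAR covariances mentioned above.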

The operation $\rho\,\circ$ in (A2) denotes a Schur product (an element-by-element multiplication) of a correlation matrix $\mathbf{S}$ with the covariance model generated by the ensemble, that is, a localization of covariances. The Schur product of matrices $\mathbf{A}$ and $\mathbf{B}$ is a matrix $\mathbf{C}$ of the same dimension, where $C_{ij} = A_{ij}B_{ij}$. For serial data assimilation, the function $\mathbf{S}$ depends upon the observation location; it has a maximum of 1.0 at the observation location and typically decreases monotonically to zero at some finite distance from the observation. The Schur product is not applied in (A3), a minor approximation; the $\mathbf{H}$ operator involves a limited stencil of grid points near the observation location, and the correlation at all of these grid points is approximately 1.0. See Houtekamer and Mitchell (2001) and Hamill et al. (2001) for further explanations of the rationale for covariance localization.
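A Schur product is straightforward to write down. The Python sketch below multiplies a small, hypothetical covariance matrix element by element with a hypothetical correlation matrix; the function name `schur_product` is illustrative only.

```python
def schur_product(a, b):
    """Element-by-element product of two matrices of identical shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same dimensions")
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


covariance = [[4.0, 2.0], [2.0, 9.0]]
localization = [[1.0, 0.5], [0.5, 1.0]]  # 1.0 on the diagonal, smaller elsewhere
print(schur_product(covariance, localization))
```

The variances on the diagonal are unchanged, while the off-diagonal covariances are damped by the localization weights.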

Because the forecast model we use has impermeable walls on the north and south boundaries, $\mathbf{S}$ cannot be modeled strictly using a simple isotropic localization function around the observation such as that suggested by Gaspari and Cohn (1999); the Schur product of this with $\hat{\mathbf{P}}^b\mathbf{H}^{\mathrm{T}}$ will produce different elements in the gain matrix for the grid points along the north and south walls. This in turn will cause analysis increments to vary along the walls, producing a model state that violates the boundary conditions. Hence, a modified form of covariance localization is used that permits the same covariance value to be used at all points along the wall.

To localize covariances, we use the compactly supported, fifth-order function in Gaspari and Cohn (1999). Define a correlation length scale $l_c$, measured in model grid points, and let $F_c = (10/3)l_c$. Define $\|D_{ij}\|$ to be the Euclidean distance in grid points between grid point (i, j) and the observation location. Then an isotropic localization function $w_{ij}$ is defined for every grid point (i, j) in the domain according to $w_{ij}(i, j) = \Omega(F_c, \|D_{ij}\|)$, where
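$$\Omega(a, b) = \begin{cases} -\dfrac{1}{4}\left(\dfrac{b}{a}\right)^5 + \dfrac{1}{2}\left(\dfrac{b}{a}\right)^4 + \dfrac{5}{8}\left(\dfrac{b}{a}\right)^3 - \dfrac{5}{3}\left(\dfrac{b}{a}\right)^2 + 1, & 0 \le b \le a, \\[2ex] \dfrac{1}{12}\left(\dfrac{b}{a}\right)^5 - \dfrac{1}{2}\left(\dfrac{b}{a}\right)^4 + \dfrac{5}{8}\left(\dfrac{b}{a}\right)^3 + \dfrac{5}{3}\left(\dfrac{b}{a}\right)^2 - 5\left(\dfrac{b}{a}\right) + 4 - \dfrac{2}{3}\left(\dfrac{b}{a}\right)^{-1}, & a < b \le 2a, \\[2ex] 0, & b > 2a. \end{cases} \tag{A4}$$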
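The compactly supported fifth-order function of Gaspari and Cohn (1999) can be sketched in Python as follows. The coefficients are those of the standard published piecewise polynomial; the two-argument form, with a the length scale and b the distance, follows the usage Ω(F_c, ‖D_ij‖) in the text. The function name `gaspari_cohn` is hypothetical.

```python
def gaspari_cohn(a, b):
    """Fifth-order compactly supported weight at distance b for length scale a.

    The weight is 1.0 at b = 0 and is exactly zero for b >= 2a.
    """
    if a <= 0.0:
        raise ValueError("length scale a must be positive")
    z = abs(b) / a
    if z <= 1.0:
        return -0.25 * z**5 + 0.5 * z**4 + 0.625 * z**3 - (5.0 / 3.0) * z**2 + 1.0
    if z < 2.0:
        return (
            z**5 / 12.0
            - 0.5 * z**4
            + 0.625 * z**3
            + (5.0 / 3.0) * z**2
            - 5.0 * z
            + 4.0
            - 2.0 / (3.0 * z)
        )
    return 0.0


# Weights at increasing distance (in grid points) for a length scale of 2.5.
for distance in (0.0, 1.25, 2.5, 3.75, 5.0):
    print(distance, gaspari_cohn(2.5, distance))
```

The two polynomial branches agree at b = a, so the weight decreases smoothly from 1.0 to zero, which is what allows the localized covariances to vanish at a finite distance from the observation.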
We also define a function $w_j(j)$, which is maximized at the walls and decreases quickly toward zero away from them. Let $n_j$ equal the number of grid points in the north–south direction (here, 65). Define a distance from the nearest wall $\|D_j\|$ according to
[Equation (A5)]
Then $w_j(j) = \Omega(2.5, \|D_j\|)$. Finally, we define the overall localization matrix operator $\mathbf{S}$ with element $s_{ij}$ at the (i, j)th grid point. Here, $s_{ij}$ is a combination of the isotropic function and the zonally averaged function, with the weight given to each depending on j:
[Equation (A6)]
Examples of what this localization function looks like for a grid point in the center of the domain and near a wall are shown in Figs. A1a,b.

Fig. 1. Location of fixed rawinsondes for network with data void

Citation: Monthly Weather Review 130, 6; 10.1175/1520-0493(2002)130<1552:UIBECF>2.0.CO;2

Fig. 2. Time series of analysis errors for ensemble assimilating rawinsonde data using the fixed network in Fig. 1. Dots indicate errors of individual ensemble members, and the solid line the error of the ensemble mean in the total-energy norm. Time average of errors for individual members and for ensemble mean are denoted by the numbers on rhs of plot. (a) Inflated EnKF, (b) hybrid EnKF–3DVAR, and (c) PO–3DVAR

Fig. 3. Rank histograms for analyzed θ at model level 4. (a) Inflated ensemble, (b) hybrid, and (c) PO–3DVAR

Fig. 4. Observation locations for testing of expected vs actual analysis variance reduction

Fig. 5. Comparisons of variances when assimilating observations at locations in Fig. 4 using inflated ensemble. (a) Actual reduction in analysis variance a after assimilation of adaptive observation vs predicted reduction in analysis variance b. (b) Fractional reduction in ensemble mean squared error variance c vs b when using imperfect observations, and (c) as in (b) but using perfect observations

Fig. 6. Expected fractional reduction of analysis error variance from application of adaptive observation algorithm on day 14 of the 90-day integration of the inflated ensemble assimilation scheme. (a) True geopotential height (solid) at model level 8 and θT (potential temperature on top lid; dashed). (b) Expected fractional reduction in analysis error variance for each potential observation location in the domain; the value at a given location thus denotes the fractional reduction over the entire domain if an observation were to be assimilated at that location (normalized by the sum of background-error variances before the assimilation of an adaptive observation). Dots indicate locations of fixed network of observations previously assimilated. Star indicates location of maximum expected reduction (the target location). Contours at 2% and every 4% thereafter. (c) As in (b) but the improvement after the first adaptive observation has been assimilated. Again, the fractional reduction is normalized by the background-error variance

Fig. 7. As in Fig. 6 but for day 54

Fig. 8. As in Fig. 6 but for day 70

Fig. 9. (a) As in Fig. 6b but for hybrid ensemble, (b) as in Fig. 7b but for hybrid ensemble, and (c) as in Fig. 8b, but for hybrid ensemble

Fig. 10. (a) As in Fig. 6b but for PO–3DVAR ensemble. However, contour interval is changed to 4% and every 8% thereafter. (b) As in Fig. 7b but for perturbed observation ensemble, and (c) as in Fig. 8b, but for perturbed observation ensemble

Fig. 11. (a) Reduction in ensemble mean squared error c vs expected reduction in analysis error variance b for optimal target locations from inflated ensemble. Vertical row of five dots indicate the range of error reduction for 5 independent control observations tested for each of the 20 case days. (b) As in (a) but for hybrid ensemble, and (c) as in (a), but for PO–3DVAR ensemble. (d) As in (c), but where Eq. (8) is used instead of (9) to predict expected improvement. Note different scales for axes in each figure

Fig. 12. (a) Difference in selected optimal adaptive observation locations when using inflated ensemble (darkened dots) and hybrid ensemble (diamonds). Darkened diamonds indicate that adaptive observation locations were identical. Locations for the same case day are connected by solid line. (b) As in (a) but for inflated ensemble vs PO–3DVAR ensemble

Fig. 13. Improvement in ensemble mean analysis error when assimilating adaptive vs fixed observations on each of 20 case days using inflated ensemble

Fig. 14. Time series of ensemble mean analysis errors when replacing observation profile at grid location (80, 45) during every data assimilation cycle with either a fixed profile at (30, 33) or an adaptive observation. Compare against time series of ensemble mean error from Fig. 2(a)

Fig. 15. Squared spread in column total energy from the inflated ensemble (shaded) and model level 8 geopotential (dark solid lines). Target locations are marked with a star. Contours for spread at 1, 2, 3, 5, 10, 15, 20, 30, 40, 50, and 60 m² s⁻². (a) Case day 14 [compare with Fig. 6(b)]. (b) Case day 54 [compare with Fig. 7(b)]. (c) Case day 70 [compare with Fig. 8(b)]

Fig. 16. (a) Difference in selected adaptive observation locations when using full algorithm with inflated ensemble (darkened dots) and locations based on maximum column total energy spread in inflated ensemble (diamonds). (b) Expected reduction in analysis error variance as evaluated from ensemble when locations are defined by full algorithm (abscissa) vs at locations with maximum spread (ordinate)

Fig. 17. Ensemble mean errors in the energy norm using the inflated ensemble. Dashed line indicates errors for where a single sounding from the fixed network at the location x = 80, y = 45 has been replaced by a sounding at x = 30, y = 33. Solid line indicates errors where sounding at x = 80, y = 45 is replaced by an adaptive observation with the location determined by the maximum spread

Fig. A1. Covariance localization functions for (a) grid point near the center of the channel, and (b) grid point near wall. Correlation length scale in this example is 15 grid points

Table 1. Observation error variances for temperature (K²), and u and v wind components (m² s⁻²)

Table 2. Parameters used for the three data assimilation approaches tested here. Here, α is the percentage weight applied to stationary covariances; c is an inflation factor for the time mean covariances derived from an EnKF; r is the amount that background forecast deviations about the mean are inflated before the data assimilation proceeds, and lc is the correlation length scale (in grid points) for the covariance localization

* The National Center for Atmospheric Research is sponsored by the National Science Foundation.

1 This minimization can be carried out for measures of uncertainty other than total variance. A variety of choices are discussed in Berliner et al. (1999).

Save
  • Anderson, B. D., and J. B. Moore, 1979: Optimal Filtering. Prentice-Hall, Inc., 357 pp.

  • Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.

  • Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758.

  • Baker, N. L., and R. Daley, 2000: Observation and background adjoint sensitivity in the adaptive observation-targeting problem. Quart. J. Roy. Meteor. Soc., 126, 1431–1454.

  • Barkmeijer, J., M. van Gijzen, and F. Bouttier, 1998: Singular vectors and estimates of the analysis error covariance metric. Quart. J. Roy. Meteor. Soc., 124, 1695–1713.

  • Bergot, T., 2001: Influence of the assimilation scheme on the efficiency of adaptive observations. Quart. J. Roy. Meteor. Soc., 127, 635–660.

  • Bergot, T., G. Hello, A. Joly, and S. Malardel, 1999: Adaptive observations: A feasibility study. Mon. Wea. Rev., 127, 743–765.

  • Berliner, L. M., Z-Q. Lu, and C. Snyder, 1999: Statistical design for adaptive weather observations. J. Atmos. Sci., 56, 2536–2552.

  • Bishop, C. H., and Z. Toth, 1999: Ensemble transformation and adaptive observations. J. Atmos. Sci., 56, 1748–1765.

  • Bishop, C. H., B. J. Etherton, and S. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part 1: Theoretical aspects. Mon. Wea. Rev., 129, 420–436.

  • Buizza, R., and A. Montani, 1999: Targeting observations using singular vectors. J. Atmos. Sci., 56, 2965–2985.

  • Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: Analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719–1724.

  • Cohn, S. E., 1997: An introduction to estimation theory. J. Meteor. Soc. Japan, 75 (1B), 257–288.

  • Ehrendorfer, M., and J. J. Tribbia, 1997: Optimal prediction of forecast error covariances through singular vectors. J. Atmos. Sci., 54, 286–313.

  • Emanuel, K. A., and R. Langland, 1998: FASTEX adaptive observations workshop. Bull. Amer. Meteor. Soc., 79, 1915–1919.

  • Emanuel, K. A., and Coauthors, 1995: Report of the first prospectus development team of the U.S. Weather Research Program to NOAA and the NSF. Bull. Amer. Meteor. Soc., 76, 1194–1208.

  • Evensen, G., 1994: Sequential data assimilation with a nonlinear quasigeostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99 (C5), 10143–10162.

  • Evensen, G., and P. J. van Leeuwen, 1996: Assimilation of Geosat altimeter data for the Agulhas current using the ensemble Kalman filter with a quasigeostrophic model. Mon. Wea. Rev., 124, 85–96.

  • Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. Quart. J. Roy. Meteor. Soc., 125, 723–757.

  • Gelaro, R., R. Langland, G. D. Rohaly, and T. E. Rosmond, 1999: An assessment of the singular vector approach to targeted observations using the FASTEX data set. Quart. J. Roy. Meteor. Soc., 125, 3299–3327.

  • Gelaro, R., C. A. Reynolds, R. H. Langland, and G. D. Rohaly, 2000: A predictability study using geostationary satellite wind observations during NORPEX. Mon. Wea. Rev., 128, 3789–3807.

  • Gelb, A., Ed., 1974: Applied Optimal Estimation. MIT Press, 374 pp.

  • Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Wea. Rev., 129, 550–560.

  • Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter–3D variational analysis scheme. Mon. Wea. Rev., 128, 2905–2919.

  • Hamill, T. M., C. Snyder, and R. E. Morss, 2000: A comparison of probabilistic forecasts from bred, singular-vector, and perturbed observation ensembles. Mon. Wea. Rev., 128, 1835–1851.

  • Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790.

  • Hansen, J. A., and L. A. Smith, 2000: The role of operational constraints in selecting supplementary observations. J. Atmos. Sci., 57, 2859–2871.

  • Houtekamer, P. L., and J. Derome, 1995: Methods for ensemble prediction. Mon. Wea. Rev., 123, 2181–2196.

  • Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.

  • Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123–137.

  • Ide, K., P. Courtier, M. Ghil, and A. C. Lorenc, 1997: Unified notation for data assimilation: Operational, sequential, and variational. J. Meteor. Soc. Japan, 75 (1B), 181–189.

  • Joly, A., and Coauthors, 1997: The Fronts and Atlantic Storm-Track Experiment (FASTEX): Scientific objectives and experimental design. Bull. Amer. Meteor. Soc., 78, 1917–1940.

  • Keppenne, C. L., 2000: Data assimilation into a primitive equation model with a parallel ensemble Kalman filter. Mon. Wea. Rev., 128, 1971–1981.

  • Langland, R., and Coauthors, 1999a: The North Pacific Experiment (NORPEX-98): Targeted observations for improved North American weather forecasts. Bull. Amer. Meteor. Soc., 80, 1363–1384.

  • Langland, R., R. Gelaro, G. D. Rohaly, and M. A. Shapiro, 1999b: Targeted observations in FASTEX: Adjoint based targeting procedures and data impact experiments in IOP/7 and IOP/8. Quart. J. Roy. Meteor. Soc., 125, 3241–3270.

  • Le Dimet, F-X., and O. Talagrand, 1986: Variational algorithms for analysis and assimilation of meteorological observations: Theoretical aspects. Tellus, 38A, 97–110.

  • Lermusiaux, P. F. J., and A. R. Robinson, 1999: Data assimilation via error subspace statistical estimation. Mon. Wea. Rev., 127, 1385–1407.

  • Lorenc, A. C., 1986: Analysis methods for numerical weather prediction. Quart. J. Roy. Meteor. Soc., 112, 1177–1194.

  • Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary observations: Simulation with a small model. J. Atmos. Sci., 55, 399–414.

  • Lu, Z-Q., L. M. Berliner, and C. Snyder, 2000: Experimental design for spatial and adaptive observations. Studies in the Atmospheric Sciences, L. M. Berliner, D. Nychka, and T. Hoar, Eds., Lecture Notes in Statistics, Vol. 144, Springer-Verlag, 199.

  • Majumdar, S. J., C. H. Bishop, I. Szunyogh, and Z. Toth, 2001: Can an Ensemble Transform Kalman Filter predict the reduction in forecast error variance produced by targeted observations? Quart. J. Roy. Meteor. Soc., 127, 2803–2820.

  • Mitchell, H. L., and P. L. Houtekamer, 2000: An adaptive ensemble Kalman filter. Mon. Wea. Rev., 128, 416–433.

  • Morss, R. E., 1998: Adaptive observations: Idealized sampling strategies for improving numerical weather prediction. Ph.D. dissertation, Massachusetts Institute of Technology, 225 pp. [Available from UMI Dissertation Services, P. O. Box 1346, 300 N. Zeeb Rd., Ann Arbor, MI, 48106-1346.]

  • Morss, R. E., and K. A. Emanuel, 2002: Influence of added observations on analysis and forecast errors: Results from idealized systems. Quart. J. Roy. Meteor. Soc., 128, 285–322.

  • Morss, R. E., K. A. Emanuel, and C. Snyder, 2001: Idealized adaptive observation strategies for improving numerical weather prediction. J. Atmos. Sci., 58, 210–234.

  • Palmer, T. N., R. Gelaro, J. Barkmeijer, and R. Buizza, 1998: Singular vectors, metrics, and adaptive observations. J. Atmos. Sci., 55, 633–653.

  • Parrish, D. F., and J. C. Derber, 1992: The National Meteorological Center's spectral statistical interpolation system. Mon. Wea. Rev., 120, 1747–1763.

  • Pu, Z-X., and E. Kalnay, 1999: Targeting observations with the quasi-linear inverse and adjoint NCEP global models: Performance during FASTEX. Quart. J. Roy. Meteor. Soc., 125, 3329–3337.

  • Pu, Z-X., E. Kalnay, J. Sela, and I. Szunyogh, 1997: Sensitivity of forecast errors to initial conditions with a quasi-inverse linear method. Mon. Wea. Rev., 125, 2479–2503.

  • Pu, Z-X., S. J. Lord, and E. Kalnay, 1998: Forecast sensitivity with dropwindsonde data and targeted observations. Tellus, 50A, 391–410.

  • Rabier, F., J-N. Thepaut, and P. Courtier, 1998: Extended assimilation and forecast experiments with a four-dimensional variational assimilation system. Quart. J. Roy. Meteor. Soc., 124, 1–39.

  • Schubert, S. D., and M. Suarez, 1989: Dynamical predictability in a simple general circulation model: Average error growth. J. Atmos. Sci., 46, 353–370.

  • Snyder, C., 1996: Summary of an informal workshop on adaptive observations and FASTEX. Bull. Amer. Meteor. Soc., 77, 953–965.

  • Szunyogh, I., Z. Toth, K. A. Emanuel, C. H. Bishop, C. Snyder, R. E. Morss, J. Woolen, and T. Marchok, 1999: Ensemble based targeting experiments during FASTEX: The impact of dropsonde data from the Lear jet. Quart. J. Roy. Meteor. Soc., 125, 2189–3218.

  • Szunyogh, I., Z. Toth, R. E. Morss, S. Majumdar, B. J. Etherton, and C. H. Bishop, 2000: The effect of targeted dropsonde observations during the 1999 Winter Storms Reconnaissance Program. Mon. Wea. Rev., 128, 3520–3537.

  • van Leeuwen, P. J., 1999: Comment on “Data assimilation using an ensemble Kalman filter technique.” Mon. Wea. Rev., 127, 1374–1377.

  • Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., in press.

  • Fig. 1.

    Location of fixed rawinsondes for network with data void

  • Fig. 2.

    Time series of analysis errors for ensemble assimilating rawinsonde data using the fixed network in Fig. 1. Dots indicate errors of individual ensemble members, and the solid line indicates the error of the ensemble mean in the total-energy norm. Time averages of errors for individual members and for the ensemble mean are denoted by the numbers on the right-hand side of the plot. (a) Inflated EnKF, (b) hybrid EnKF–3DVAR, and (c) PO–3DVAR

  • Fig. 3.

    Rank histograms for analyzed θ at model level 4. (a) Inflated ensemble, (b) hybrid, and (c) PO–3DVAR

  • Fig. 4.

    Observation locations for testing of expected vs actual analysis variance reduction

  • Fig. 5.

    Comparisons of variances when assimilating observations at locations in Fig. 4 using inflated ensemble. (a) Actual reduction in analysis variance a after assimilation of adaptive observation vs predicted reduction in analysis variance b. (b) Fractional reduction in ensemble mean squared error variance c vs b when using imperfect observations, and (c) as in (b) but using perfect observations

  • Fig. 6.

    Expected fractional reduction of analysis error variance from application of adaptive observation algorithm on day 14 of the 90-day integration of the inflated ensemble assimilation scheme. (a) True geopotential height (solid) at model level 8 and θT (potential temperature on top lid; dashed). (b) Expected fractional reduction in analysis error variance for each potential observation location in the domain; the value at a given location thus denotes the fractional reduction over the entire domain if an observation were to be assimilated at that location (normalized by the sum of background-error variances before the assimilation of an adaptive observation). Dots indicate locations of fixed network of observations previously assimilated. Star indicates location of maximum expected reduction (the target location). Contours at 2% and every 4% thereafter. (c) As in (b) but the improvement after the first adaptive observation has been assimilated. Again, the fractional reduction is normalized by the background-error variance

  • Fig. 7.

    As in Fig. 6 but for day 54

  • Fig. 8.

    As in Fig. 6 but for day 70

  • Fig. 9.

    (a) As in Fig. 6b but for hybrid ensemble, (b) as in Fig. 7b but for hybrid ensemble, and (c) as in Fig. 8b, but for hybrid ensemble

  • Fig. 10.

    (a) As in Fig. 6b but for PO–3DVAR ensemble, with contours at 4% and every 8% thereafter. (b) As in Fig. 7b but for perturbed observation ensemble, and (c) as in Fig. 8b, but for perturbed observation ensemble

  • Fig. 11.

    (a) Reduction in ensemble mean squared error c vs expected reduction in analysis error variance b for optimal target locations from inflated ensemble. Each vertical row of five dots indicates the range of error reduction for 5 independent control observations tested for each of the 20 case days. (b) As in (a) but for hybrid ensemble, and (c) as in (a), but for PO–3DVAR ensemble. (d) As in (c), but where Eq. (8) is used instead of (9) to predict expected improvement. Note different scales for axes in each figure

  • Fig. 12.

    (a) Difference in selected optimal adaptive observation locations when using inflated ensemble (darkened dots) and hybrid ensemble (diamonds). Darkened diamonds indicate that adaptive observation locations were identical. Locations for the same case day are connected by solid line. (b) As in (a) but for inflated ensemble vs PO–3DVAR ensemble

  • Fig. 13.

    Improvement in ensemble mean analysis error when assimilating adaptive vs fixed observations on each of 20 case days using inflated ensemble

  • Fig. 14.

    Time series of ensemble mean analysis errors when replacing observation profile at grid location (80, 45) during every data assimilation cycle with either a fixed profile at (30, 33) or an adaptive observation. Compare against time series of ensemble mean error from Fig. 2(a)

  • Fig. 15.

    Squared spread in column total energy from the inflated ensemble (shaded) and model level 8 geopotential (dark solid lines). Target locations are marked with a star. Contours for spread at 1, 2, 3, 5, 10, 15, 20, 30, 40, 50, and 60 m² s⁻². (a) Case day 14 [compare with Fig. 6(b)]. (b) Case day 54 [compare with Fig. 7(b)]. (c) Case day 70 [compare with Fig. 8(b)]

  • Fig. 16.

    (a) Difference in selected adaptive observation locations when using full algorithm with inflated ensemble (darkened dots) and locations based on maximum column total energy spread in inflated ensemble (diamonds). (b) Expected reduction in analysis error variance as evaluated from ensemble when locations are defined by full algorithm (abscissa) vs at locations with maximum spread (ordinate)

  • Fig. 17.

    Ensemble mean errors in the energy norm using the inflated ensemble. Dashed line indicates errors where a single sounding from the fixed network at the location x = 80, y = 45 has been replaced by a sounding at x = 30, y = 33. Solid line indicates errors where the sounding at x = 80, y = 45 is replaced by an adaptive observation with the location determined by the maximum spread

  • Fig. A1. Covariance localization functions for (a) grid point near the center of the channel, and (b) grid point near wall. Correlation length scale in this example is 15 grid points
