Evaluation of the Impact of Observations on Analyses in 3D- and 4D-Var Based on Information Content

Cristina Lupu Department of Earth and Atmospheric Sciences, Université du Québec à Montréal, Montreal, Quebec, Canada

Pierre Gauthier Department of Earth and Atmospheric Sciences, Université du Québec à Montréal, Montreal, Quebec, Canada

Stéphane Laroche Meteorological Research Division, Environment Canada, Dorval, Quebec, Canada


Abstract

The degrees of freedom for signal (DFS) is used in data assimilation applications to measure the self-sensitivity of the analysis to different observation types. This paper describes a practical method to estimate the DFS of observations from a posteriori statistics. The method does not require the consistency of the error statistics in the analysis system, and it is shown that the observational impact on analyses can be estimated from observation departures with respect to the analysis or the forecast. This method is first introduced to investigate the impact of a complete set, or subsets, of observations on the analysis for idealized one-dimensional variational data assimilation (1D-Var) analysis experiments and is then applied in the framework of the three-dimensional (3D)- and four-dimensional (4D)-Var schemes developed at Environment Canada.

Corresponding author address: Cristina Lupu, European Centre for Medium-Range Weather Forecasts, Shinfield Park, Reading RG2 9AX, United Kingdom. Email: cristina.lupu@ecmwf.int

1. Introduction

Recently, new methods have been developed to quantify the impact of the observations on the assimilation and on the ensuing forecast. For the analysis, this can be achieved by evaluating the information content expressed in terms of degrees of freedom for signal (DFS; Purser and Huang 1993; Rodgers 2000; Rabier et al. 2002). For any data assimilation system, this diagnostic quantifies the information brought by any given type of observations and is useful to assess the relative impact of the different types of observations being assimilated. With the increasing number of datasets used by modern data assimilation systems, such as those from the hyperspectral infrared sounders Atmospheric Infrared Sounder (AIRS) and Infrared Atmospheric Sounding Interferometer (IASI), it is important to know the information content associated with the radiance measurements, because this knowledge makes it possible to reduce the volume of data associated with these new instruments. An example of how this diagnostic was applied to channel selection is presented in Rabier et al. (2002) in the context of IASI-simulated data by evaluating the impact of the different channels on the analysis.

Proposed by Cardinali et al. (2004), the influence matrix gives a measure of how much any given observation impacted the analysis. They used this approach to estimate the information content supplied by different types of observational data to analyses produced by the European Centre for Medium-Range Weather Forecasts (ECMWF) four-dimensional variational data assimilation (4D-Var) system. The sensitivity of the analysis to observations then showed that about 25% of the information was provided by ground-based observing systems and 75% by satellite systems. This approach also allows the partial influence of observational subsets to be examined based on geographical area and observation type. Another method was applied with the Action de Recherche Petite Echelle Grande Echelle (ARPEGE) 4D-Var system of Météo-France (Chapnik et al. 2006). It is based on a method proposed in Girard (1987) in which perturbations to both the background and the observations are introduced to measure the sensitivity of the resulting analysis given the uncertainty in both the observations and the background. In any data assimilation system, the impact on the analysis depends critically on the observation and background error statistics used in the assimilation.

In numerical weather prediction, one is interested in knowing the impact of the observation on the forecast made from the analysis. Traditionally, the observation impact on forecasts has been obtained from Observing System Experiments (OSEs) in which selected datasets are systematically added or removed from the assimilation system (e.g., Kelly et al. 2007). Using the OSEs, the impact of various observation network configurations can be assessed by comparing forecast scores from experiments that use different observation scenarios. This approach is expensive and only provides a global view of the impact of observations. Recently, adjoint-based sensitivities with respect to observations have also been proposed to assess the observation impact on short-range forecasts without carrying out data-denial experiments (Baker and Daley 2000; Langland and Baker 2004; Zhu and Gelaro 2008; Cardinali 2009). Zhu and Gelaro (2008) showed that the adjoint-based method provides accurate assessments of the forecast sensitivity with respect to most of the observations assimilated. Gelaro and Zhu (2009) and Cardinali (2009) have recently applied adjoint-based impact calculations to results from OSEs to show that the two methods provide complementary information.

The objective of this paper is to propose a simple approach that makes it easy to evaluate the information content associated with observations used in any data assimilation system directly from observation departures from the analysis and forecast, a natural by-product of the assimilation process. The emphasis is then on the impact of observations on the analysis only, not the forecasts. Following Desroziers et al. (2005), observation departures from analyses and forecasts can be used to diagnose the consistency of the observation and background error statistics used in the assimilation. If these error statistics are suboptimal, they showed that this information can be used to recalibrate the error statistics to meet the χ² optimality criteria. The methods are presented in Desroziers and Ivanov (2001) and Chapnik et al. (2006). What they show is that observation departures with respect to the background and the analysis are directly related to the observation, background, and analysis error covariances. Based on this, they showed that any inconsistency between those diagnostics and the a priori error statistics used in the assimilation can be used to recalibrate the observation and background error statistics. As pointed out in Chapnik et al. (2006), these relationships can provide an estimate of the information content, provided the error statistics are consistent. What we show in this paper is that these relationships provide a reliable estimate of the information content as evaluated with the perturbation method of Girard (1987).

An analytic derivation is presented to show how the DFS can be evaluated from the a posteriori statistics. Section 2 describes the methodology for computing the information content brought by the observations. Based on the results of Desroziers et al. (2005), it is shown that the information content brought in by the data assimilation system can be estimated from observation departures from the analysis and forecast, even when the expected statistics of innovation vector differ from those specified in the assimilation system. A unique aspect of the method proposed here is that it does not require the consistency of the error statistics in the analysis system. In section 3, results obtained with the simplified one-dimensional (1D)-Var scheme are presented and discussed. In section 4, this is applied to results from three-dimensional (3D)- and 4D-Var to show how the impact of observations depends on the assimilation method. Those results were obtained from analyses produced with the 3D- and 4D-Var systems of Environment Canada (Gauthier et al. 1999, 2007). Finally, the summary and conclusions are given in section 5.

2. Estimation of information content brought by the observations

Consider a data assimilation scheme that provides an optimal analysis xa:
$$\mathbf{x}_a = \mathbf{x}_b + \mathbf{K}\,[\mathbf{y} - H(\mathbf{x}_b)], \tag{1}$$
where xb is the background state, y is a vector of observational data, and H is the nonlinear observation operator, while
$$\mathbf{K} = \mathbf{B}\mathbf{H}^{T}(\mathbf{H}\mathbf{B}\mathbf{H}^{T} + \mathbf{R})^{-1} \tag{2}$$
is the optimal Kalman gain matrix expressed in terms of the background error covariance matrix 𝗕, the observation error covariance matrix 𝗥, and 𝗛 the tangent linear model of H, linearized in the vicinity of xb.
The DFS is used in data assimilation applications to measure the self-sensitivity of analysis to different observation types (Rodgers 2000). The DFS is the image in observation space of the trace of the derivative of the analysis with respect to observations:
$$\mathrm{DFS} = \mathrm{tr}\left\{\frac{\partial H(\mathbf{x}_a)}{\partial \mathbf{y}}\right\}, \tag{3}$$
where tr{·} denotes trace of {·}. In the linear case, (1) and (3) imply that
$$\mathrm{DFS} = \mathrm{tr}(\mathbf{H}\mathbf{K}). \tag{4}$$
Because of the size of the matrices involved, the evaluation of the DFS using (4) is not straightforward. Moreover, because the Kalman gain matrix is not readily available in a variational scheme, Cardinali et al. (2004) compute an estimate of tr(𝗛𝗞) using the leading singular vectors of the Hessian of the cost function provided by the Lanczos conjugate gradient algorithm, while Fisher (2003) applied numerical methods for directly calculating the trace of a large sparse matrix. Another approach is based on a randomization technique proposed by Chapnik et al. (2006). In their study, the trace of 𝗛𝗞 is evaluated from simple consistency diagnostics introduced by Desroziers et al. (2005).
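For a system small enough to hold the matrices in memory, (2) and (4) can be evaluated directly. The sketch below is only an illustration of that direct evaluation: the 4-point state, the two directly observed points, and the numerical values of B and R are hypothetical and are not taken from the paper.

```python
import numpy as np

def kalman_gain(B, H, R):
    """K = B H^T (H B H^T + R)^{-1}, as in (2)."""
    D = H @ B @ H.T + R                      # a priori innovation covariance
    return B @ H.T @ np.linalg.inv(D)

def dfs_direct(B, H, R):
    """DFS = tr(HK), as in (4)."""
    return float(np.trace(H @ kalman_gain(B, H, R)))

# Toy problem (hypothetical): 4 state points, points 0 and 2 observed directly.
n = 4
idx = np.arange(n)
B = np.exp(-0.5 * (idx[:, None] - idx[None, :]) ** 2)   # unit variances, smooth correlations
H = np.zeros((2, n))
H[0, 0] = 1.0
H[1, 2] = 1.0
R = 0.5 * np.eye(2)

dfs = dfs_direct(B, H, R)
print(dfs)   # lies between 0 and the number of observations (2)
```

In a scalar case with equal background and observation error variances, the same function returns 0.5: the observation and the background share the weight equally.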
Desroziers et al. (2005) developed a set of diagnostics in observation space based on combinations of differences between observation and background [$\mathbf{d}^o_b = \mathbf{y} - H(\mathbf{x}_b)$], observation and analysis [$\mathbf{d}^o_a = \mathbf{y} - H(\mathbf{x}_a)$], and background and analysis [$\mathbf{d}^a_b = H(\mathbf{x}_a) - H(\mathbf{x}_b) \equiv \mathbf{H}\delta\mathbf{x}_a$], the last being the image of the analysis increment in observation space. From these quantities, it is possible to diagnose a posteriori observation, background, and analysis error statistics in observation space. The mean diagnostics are the following:
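$$E[\mathbf{d}^o_a(\mathbf{d}^o_b)^{T}] = \mathbf{R}\mathbf{D}^{-1}\tilde{\mathbf{D}} \equiv \tilde{\mathbf{R}}, \tag{5a}$$
$$E[\mathbf{d}^a_b(\mathbf{d}^o_b)^{T}] = \mathbf{H}\mathbf{B}\mathbf{H}^{T}\mathbf{D}^{-1}\tilde{\mathbf{D}} \equiv \mathbf{H}\tilde{\mathbf{B}}\mathbf{H}^{T}, \tag{5b}$$
$$E[\mathbf{d}^a_b(\mathbf{d}^o_a)^{T}] \equiv \mathbf{H}\tilde{\mathbf{A}}\mathbf{H}^{T}, \tag{5c}$$
$$E[\mathbf{d}^o_b(\mathbf{d}^o_b)^{T}] = \tilde{\mathbf{D}}, \tag{5d}$$
where E[·] is the statistical expectation operator, $\mathbf{D} = \mathbf{H}\mathbf{B}\mathbf{H}^{T} + \mathbf{R}$ is the a priori innovation covariance, $\tilde{\mathbf{D}} = \mathbf{H}\tilde{\mathbf{B}}\mathbf{H}^{T} + \tilde{\mathbf{R}}$ is the estimated covariance from innovations, and 𝗔 is the analysis error covariance, of which $\mathbf{H}\tilde{\mathbf{A}}\mathbf{H}^{T}$ is the diagnosed counterpart in observation space. The diagnosed observation and background error covariances in observation space are $\tilde{\mathbf{R}} = E[\mathbf{d}^o_a(\mathbf{d}^o_b)^{T}]$ and $\mathbf{H}\tilde{\mathbf{B}}\mathbf{H}^{T} = E[\mathbf{d}^a_b(\mathbf{d}^o_b)^{T}]$, respectively.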
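It is important to stress that $\mathbf{d}^a_b$ and $\mathbf{d}^o_a$ are related to the innovation vector $\mathbf{d}^o_b$ by
$$\mathbf{d}^a_b = \mathbf{H}\mathbf{K}\,\mathbf{d}^o_b, \tag{6a}$$
$$\mathbf{d}^o_a = (\mathbf{I} - \mathbf{H}\mathbf{K})\,\mathbf{d}^o_b = \mathbf{R}\mathbf{D}^{-1}\mathbf{d}^o_b. \tag{6b}$$
An expression for the DFS can also be derived from these two expressions. The statistical expectation of the outer product of $\hat{\mathbf{d}}^a_b = \mathbf{R}^{-1/2}\mathbf{d}^a_b$ with $\hat{\mathbf{d}}^o_a = \mathbf{R}^{-1/2}\mathbf{d}^o_a$ is
$$E[\hat{\mathbf{d}}^a_b(\hat{\mathbf{d}}^o_a)^{T}] = \mathbf{R}^{-1/2}\,\mathbf{H}\mathbf{K}\,\tilde{\mathbf{D}}\mathbf{D}^{-1}\,\mathbf{R}^{1/2}, \tag{7}$$
and therefore,
$$\mathrm{tr}\{E[\hat{\mathbf{d}}^a_b(\hat{\mathbf{d}}^o_a)^{T}]\} = \mathrm{tr}(\mathbf{H}\mathbf{K}\,\tilde{\mathbf{D}}\mathbf{D}^{-1}). \tag{8}$$
By using the property that the statistical expectation and the trace operator commute, that is, tr{E[·]} = E{tr[·]} and tr(abT) = bTa for any two vectors a and b, (8) reduces to
$$E[(\mathbf{d}^o_a)^{T}\mathbf{R}^{-1}\mathbf{d}^a_b] = \mathrm{tr}(\mathbf{H}\mathbf{K}\,\tilde{\mathbf{D}}\mathbf{D}^{-1}). \tag{9}$$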

a. Case with consistent error statistics

When the sample covariance matches the prescribed innovation covariance ($\tilde{\mathbf{D}} = \mathbf{D}$), (9) provides an estimate of the information content relative to an analysis scheme (3D-/4D-Var). The globally estimated trace of 𝗛𝗞 for all observation types is the total DFS, which is then given by
$$\mathrm{DFS} = \mathrm{tr}(\mathbf{H}\mathbf{K}) = E[(\mathbf{d}^o_a)^{T}\mathbf{R}^{-1}\mathbf{d}^a_b]. \tag{10}$$
Equation (10) gives a simple and efficient way to estimate the DFS for an optimal assimilation scheme because only by-products of the data assimilation scheme are necessary.
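As a rough numerical check of (10), the sketch below draws background and observation errors that are consistent with the prescribed B and R, forms the departures for a linear analysis, and compares the sample mean of the departure product with tr(HK). The system size, the covariance values, and the sample size are hypothetical choices made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linear system: 4 state points, 3 direct observations.
n, p, N = 4, 3, 20000
idx = np.arange(n)
B = np.exp(-0.5 * (idx[:, None] - idx[None, :]) ** 2)
H = np.eye(n)[[0, 1, 3]]
R = np.diag([0.5, 1.0, 2.0])
D = H @ B @ H.T + R
HK = H @ B @ H.T @ np.linalg.inv(D)

# Errors consistent with B and R; innovation = eps_o - H eps_b (one sample per row).
eps_b = rng.multivariate_normal(np.zeros(n), B, size=N)
eps_o = rng.multivariate_normal(np.zeros(p), R, size=N)
d_ob = eps_o - eps_b @ H.T
d_ab = d_ob @ HK.T            # H(xa) - H(xb) = HK d_ob
d_oa = d_ob - d_ab            # y - H(xa)  = (I - HK) d_ob

# Sample mean of (d_oa)^T R^{-1} d_ab, the right-hand side of (10).
R_inv = np.linalg.inv(R)
dfs_departures = float(np.mean(np.sum((d_oa @ R_inv) * d_ab, axis=1)))
dfs_exact = float(np.trace(HK))
print(dfs_departures, dfs_exact)
```

The two printed numbers should agree up to sampling noise, which decreases as the number of samples grows.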
For many observation types, the observation error covariance matrix 𝗥 can reasonably be assumed to be diagonal, that is, the observation errors are assumed to be uncorrelated. There are of course limitations to this assumption, but it remains reasonable to a certain extent. Reliable estimates of the information content can be obtained for any subset of data whose observation errors are uncorrelated with those of the other subsets. In that case, the partial DFS of the kth subset ($\mathbf{y}_k = \boldsymbol{\Pi}_k\mathbf{y}$), extracted from the full observation vector by means of the projection operator $\boldsymbol{\Pi}_k$, is given by
$$\mathrm{DFS}_k = E[(\boldsymbol{\Pi}_k\mathbf{d}^o_a)^{T}(\boldsymbol{\Pi}_k\mathbf{R}\boldsymbol{\Pi}_k^{T})^{-1}(\boldsymbol{\Pi}_k\mathbf{d}^a_b)]. \tag{11}$$
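The partial DFS can be illustrated with exact covariances rather than samples. The sketch below assumes consistent statistics, a diagonal R, and a selection matrix playing the role of the projection operator; the two observation subsets and all numerical values are hypothetical. It checks that the partial contributions add up to the total tr(HK).

```python
import numpy as np

# Hypothetical setup: 3 state points, 5 observations split into two subsets.
rng = np.random.default_rng(0)
n, p = 3, 5
H = rng.standard_normal((p, n))
B = np.eye(n)
R = np.diag([1.0, 1.0, 1.0, 0.25, 0.25])   # diagonal: subsets are mutually uncorrelated
D = H @ B @ H.T + R
HK = H @ B @ H.T @ np.linalg.inv(D)

# With consistent statistics, E[d_ab d_oa^T] = HK D (I - HK)^T = HK R.
E_dab_doa = HK @ R

subsets = {"type_1": [0, 1, 2], "type_2": [3, 4]}   # hypothetical observation types
partial = {}
for name, rows in subsets.items():
    Pi = np.eye(p)[rows]                     # selection operator for subset k
    Rk_inv = np.linalg.inv(Pi @ R @ Pi.T)
    # E[(Pi d_oa)^T R_k^{-1} (Pi d_ab)] = tr(R_k^{-1} Pi E[d_ab d_oa^T] Pi^T)
    partial[name] = float(np.trace(Rk_inv @ Pi @ E_dab_doa @ Pi.T))

total = float(np.trace(HK))
print(partial, total)
```

Because R is diagonal here, each partial value equals the sum of the corresponding diagonal elements of HK, so the subsets partition the total DFS.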

b. Case with inconsistent error statistics

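These results hold insofar as the innovation error statistics are consistent with those specified in the assimilation, namely, that $\tilde{\mathbf{D}} = E[\mathbf{d}^o_b(\mathbf{d}^o_b)^{T}] = \mathbf{D} = \mathbf{R} + \mathbf{H}\mathbf{B}\mathbf{H}^{T}$, so that the a priori $\mathbf{D}$ and a posteriori $\tilde{\mathbf{D}}$ terms in (9) cancel each other out. However, as pointed out by Desroziers et al. (2005), if they differ, the diagnosed covariance matrices in (5a) and (5b) may be seen as adjusted covariance estimates. The a posteriori Kalman gain matrix is now defined as
$$\tilde{\mathbf{K}} = \tilde{\mathbf{B}}\mathbf{H}^{T}(\mathbf{H}\tilde{\mathbf{B}}\mathbf{H}^{T} + \tilde{\mathbf{R}})^{-1} = \tilde{\mathbf{B}}\mathbf{H}^{T}\tilde{\mathbf{D}}^{-1}. \tag{12}$$
Therefore, the estimate of $\mathrm{tr}(\mathbf{H}\tilde{\mathbf{K}})$ from the a posteriori statistics is
$$\mathrm{tr}(\mathbf{H}\tilde{\mathbf{K}}) = \mathrm{tr}(\mathbf{H}\tilde{\mathbf{B}}\mathbf{H}^{T}\tilde{\mathbf{D}}^{-1}), \tag{13}$$
where $\tilde{\mathbf{D}}^{-1}$ denotes the pseudoinverse of $\tilde{\mathbf{D}}$. A generalization of the usual inverse matrix (Golub and van Loan 1996) must be used here because $\tilde{\mathbf{D}}$ may be singular. It follows that the information content can be determined either from the a posteriori statistics or from the a priori statistics.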
A more interesting form can be obtained by introducing (5b) in (13). Using the properties that the trace and expectation operators commute and that 𝗫E[(·)] = E[𝗫(·)] for any nonrandom matrix 𝗫 then leads to the following result:
$$\widetilde{\mathrm{DFS}} = \mathrm{tr}(\mathbf{H}\tilde{\mathbf{K}}) = E[(\mathbf{d}^o_b)^{T}\tilde{\mathbf{D}}^{-1}\mathbf{d}^a_b]. \tag{14}$$
In other words, the DFS associated with any assimilation system can be directly obtained from $\tilde{\mathbf{D}} = E[\mathbf{d}^o_b(\mathbf{d}^o_b)^{T}]$ and (14).

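The equivalence established here states that the DFS evaluated using the diagnostics $E[\mathbf{d}^a_b(\mathbf{d}^o_b)^{T}] = \mathbf{H}\tilde{\mathbf{B}}\mathbf{H}^{T}$ and $E[\mathbf{d}^o_a(\mathbf{d}^o_b)^{T}] = \tilde{\mathbf{R}}$ yields the same result as if a perturbation method were used to evaluate the DFS associated with the a priori error statistics. This is the method proposed by Chapnik et al. (2006). Inspection of (5a) and (5b) indicates that $\tilde{\mathbf{R}} = \mathbf{R}\mathbf{D}^{-1}\tilde{\mathbf{D}}$ and $\mathbf{H}\tilde{\mathbf{B}}\mathbf{H}^{T} = \mathbf{H}\mathbf{B}\mathbf{H}^{T}\mathbf{D}^{-1}\tilde{\mathbf{D}}$ differ from their a priori definitions by the same factor, $\mathbf{D}^{-1}\tilde{\mathbf{D}}$. When using those a posteriori definitions, those factors cancel out to retrieve the same DFS as would be obtained using the a priori error statistics.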
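The cancellation described above can be checked numerically. The sketch below takes the diagnosed matrices in (5a) and (5b) to be the a priori ones multiplied on the right by the a priori inverse innovation covariance and the actual innovation covariance, and verifies that the a posteriori trace equals the a priori tr(HK). The observation operator and all covariance values are hypothetical, and the "true" statistics are deliberately different from the prescribed ones.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 5, 4
H = rng.standard_normal((p, n))

# A priori statistics used by the assimilation (hypothetical values).
B = np.eye(n)
R = np.eye(p)
D = H @ B @ H.T + R

# "True" error statistics, deliberately inconsistent with the a priori ones.
idx = np.arange(n)
B_true = 2.0 * np.exp(-0.5 * (idx[:, None] - idx[None, :]) ** 2)
R_true = np.diag([0.5, 1.5, 1.0, 3.0])
D_tilde = H @ B_true @ H.T + R_true            # E[d_ob d_ob^T]

D_inv = np.linalg.inv(D)
R_tilde = R @ D_inv @ D_tilde                   # diagnosed observation term, (5a)
HBHt_tilde = H @ B @ H.T @ D_inv @ D_tilde      # diagnosed background term, (5b)

# A posteriori trace, using a pseudoinverse of the diagnosed innovation covariance.
dfs_posteriori = float(np.trace(HBHt_tilde @ np.linalg.pinv(D_tilde)))
dfs_priori = float(np.trace(H @ B @ H.T @ D_inv))   # tr(HK) with the a priori statistics
print(dfs_posteriori, dfs_priori)
```

The two values coincide to rounding error, and the two diagnosed terms add up to the actual innovation covariance, as expected from their definitions.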

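A difficulty remains, however, since (14) requires that $\tilde{\mathbf{D}}$ be inverted, which is not immediate as it embeds both the observation error and the background error. The latter cannot be assumed to be uncorrelated, which makes $\tilde{\mathbf{D}}$ nondiagonal. However, an alternative approach can be taken to simplify the computation. The analysis sensitivity matrix introduced in Cardinali et al. (2004), 𝗦 = 𝗞T𝗛T, can also be defined with respect to the a posteriori statistics. Using (5a) and (5c), it is easily shown that
$$\tilde{\mathbf{S}} = \tilde{\mathbf{K}}^{T}\mathbf{H}^{T} = \tilde{\mathbf{R}}^{-1}\mathbf{H}\tilde{\mathbf{A}}\mathbf{H}^{T},$$
and consequently,
$$\widetilde{\mathrm{DFS}} = \mathrm{tr}(\tilde{\mathbf{K}}^{T}\mathbf{H}^{T}) = \mathrm{tr}(\tilde{\mathbf{R}}^{-1}\mathbf{H}\tilde{\mathbf{A}}\mathbf{H}^{T}). \tag{15}$$
Substituting (5c), $\mathbf{H}\tilde{\mathbf{A}}\mathbf{H}^{T} = E[\mathbf{d}^a_b(\mathbf{d}^o_a)^{T}]$, into (15), the a posteriori $\widetilde{\mathrm{DFS}}$ can be rewritten as
$$\widetilde{\mathrm{DFS}} = E[(\mathbf{d}^o_a)^{T}\tilde{\mathbf{R}}^{-1}\mathbf{d}^a_b]. \tag{16}$$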
This has the same form as (10), except that the estimated observation error covariance matrix $\tilde{\mathbf{R}}$ is to be used. This matrix is possibly a full (nondiagonal) matrix that, in general, may not be symmetric and may contain cross correlations due to the presence of background error in its estimate, as indicated by (5a). To calculate the generalized inverse $\tilde{\mathbf{R}}^{-1}$, a singular value decomposition (SVD) can be used, $\tilde{\mathbf{R}} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{V}^{T}$, where $\mathbf{U}$ and $\mathbf{V}$ denote the matrices formed by the left and right singular vectors, respectively, and $\boldsymbol{\Lambda}$ is the diagonal matrix of singular values. In that case, $\tilde{\mathbf{R}}^{-1} = \mathbf{V}\boldsymbol{\Lambda}^{-1}\mathbf{U}^{T}$ and the DFS in (16) can be evaluated at the cost of a few dot products. This would also be the approach to take to evaluate $\tilde{\mathbf{D}}^{-1}$ when computing the DFS using (14).
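A generalized inverse built from the SVD can be sketched as follows. The function name, the relative threshold used to discard negligible singular values, and the small nonsymmetric rank-deficient test matrix are all hypothetical; they only illustrate the construction and do not represent a diagnosed covariance from the paper.

```python
import numpy as np

def pinv_svd(M, rcond=1e-12):
    """Generalized inverse V diag(1/s) U^T, with negligible singular values set to zero."""
    U, s, Vt = np.linalg.svd(M)
    keep = s > rcond * s.max()               # hypothetical relative cutoff
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    return Vt.T @ np.diag(s_inv) @ U.T

# Nonsymmetric, rank-1 matrix: the ordinary inverse does not exist.
M = np.array([[2.0, 1.0],
              [4.0, 2.0]])
M_pinv = pinv_svd(M)
print(M_pinv)
```

The cutoff matters in practice: singular values that are tiny only because of sampling noise would otherwise be amplified when inverted.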

For many observation types, such as radiosondes and ground-based instruments, the observation error is uncorrelated between distinct observations. We then introduce the assumption that $\tilde{\mathbf{R}}$ can be approximated as a block-diagonal matrix, each block being of the form $\boldsymbol{\Pi}_k^{T}\tilde{\mathbf{R}}^{(k)}\boldsymbol{\Pi}_k$, where $\tilde{\mathbf{R}}^{(k)}$ is the diagnosed observation error variance associated with the kth observation type. This is justified when the observation error is expected to be uncorrelated for observations coming from independent instruments, which may not be valid for measurements from satellite instruments. As stated in Talagrand (1999), the approach for computing the a posteriori covariances cannot provide any new information about 𝗥 and 𝗕 without imposing an external hypothesis to disentangle the observation and background error embedded within the innovation error statistics. An important fact is that only the observation error variances are extracted from the diagnosed statistics by assuming that the observation error is uncorrelated. This is where the evaluation of the DFS using (16) shows a clear advantage over (14): the matrix to be inverted can be assumed to be diagonal. However, this remains to be verified.

In the next section, a simple 1D-Var system is used to investigate the extent to which this assumption is reasonable in an idealized context in which an ensemble of analyses can be generated.

3. Application to 1D-Var system

Using the methodology presented in the previous section, we discuss the estimation of the DFS with a simplified 1D-Var scheme. The 1D domain contains N = 256 points uniformly distributed over a circle of latitude (at approximately 41° latitude) with a perimeter of 30 000 km. The true background error covariance matrix 𝗕t in physical space, which assumes isotropic error correlations, is defined as
i1520-0493-139-3-726-e17
where σb(i) and σb(j) are the true background error standard deviations of components i and j of 𝗕t, respectively; rij is the Euclidean distance between points i and j; and Lt is the true horizontal length scale, taken to be 300 km. In our experiments, we consider three different values of the background correlation length (300, 500, and 1000 km) in the a priori background error statistics. The observing system is fixed to be 60 observations at every other three-grid point. The observations are simulated by adding Gaussian random noise to the truth, and the innovation vector y′ is defined as y′ = y − H(xb) ≅ εo − 𝗛εb, where εb and εo represent the errors in the background state and the observations, respectively. Every observation is taken directly as a value at a grid point and all the observations have the same error variance. Therefore, 𝗥t is defined as $\mathbf{R}^{t} = (\sigma_o^{t})^{2}\mathbf{I}$, with $\mathbf{I}$ the identity matrix and $(\sigma_o^{t})^{2}$ the true observation error variance. In this context, it is possible to repeat the analysis for a number of realizations based on the true observation and background error, which may differ from the a priori statistics used in the assimilation. Based on the true error statistics, an ensemble of 2000 analyses was produced to estimate the a posteriori error statistics.

a. Estimation of the off-diagonal terms in the observation error covariance

The first experiment examines whether the a posteriori estimate of the observation error covariance can be assumed to be diagonal. The importance of the off-diagonal terms for the definition of $\tilde{\mathbf{R}}$ is discussed in this section.

The nondiagonal elements of $\tilde{\mathbf{R}}$ were estimated using (5a), assuming the observation error variance to be the same for all 60 observations used in this experiment. Moreover, the error covariance is assumed to be identical when the distance between the observations is the same. The observation error covariance $\tilde{R}(i, j)$ between components i and j, as a sample mean, is given by
$$\tilde{R}(i, j) = \overline{d^o_a(i)\,d^o_b(j)}, \tag{18}$$
where the overbar represents the sample mean over the whole ensemble of 2000 analyses. With consistent error statistics, the observation and background error variances are perfectly known, that is, the specified variances are equal to the true ones, but different values for the horizontal length scale, Lc = 300, 500, and 1000 km, were used. For all cases, the magnitude of the off-diagonal elements of $\tilde{\mathbf{R}}$ is very small compared with that of the diagonal elements. Figure 1 shows $\tilde{R}(i, j)$ as a function of the distance rij between points i and j. The examination of the off-diagonal elements in the observation error covariance matrix reveals small values (below 10%). This shows that the diagnosed observation error covariance matrix may be considered diagonal.

b. Degrees of freedom for signal

In a second set of experiments, the DFS is evaluated using the a posteriori statistics and compared with that obtained using the perturbation method (Girard 1987). Results are also shown when the a posteriori diagnostics are evaluated using either (14) or (16). Finally, the DFS is estimated using (16), but retaining only the estimated diagonal elements of $\tilde{\mathbf{R}}$. The objective of this experiment is to show that the information content estimated from the a posteriori and a priori statistics concur. One has to keep in mind that the estimated DFS is a reflection of the error statistics used in the assimilation. The DFS estimated from the true statistics gives what would be obtained if the error statistics of the assimilation were consistent with the estimation based on observation departures from the background state and the analysis.

In general, the direct evaluation of tr(𝗛𝗞) is not straightforward because the Kalman gain matrix is not explicitly available in a variational data assimilation system. However, the calculation of this trace can be accomplished in the simplified 1D-Var model here considered. In particular, assuming that 𝗥 and 𝗕 are the covariances used in the assimilation, the theoretical DFS can be evaluated as
$$\mathrm{DFS} = \mathrm{tr}(\mathbf{H}\mathbf{K}), \tag{19}$$
in which 𝗞 is the gain matrix.
For more complex systems, Girard (1987) proposed a randomization method to approximate the trace of a matrix known only as a composition of operators. A practical method that requires a random perturbation of the vector of observations was introduced in Desroziers and Ivanov (2001) and was employed in Chapnik et al. (2006). It can be shown that a randomized estimate of tr(𝗛𝗞), where 𝗞 is based on the specified 𝗥 and 𝗕 covariances that were used in the analysis, is given by
$$\mathrm{tr}(\mathbf{H}\mathbf{K}) \approx (\mathbf{y}^{*} - \mathbf{y})^{T}\mathbf{R}^{-1}(\mathbf{H}\delta\mathbf{x}_a^{*} - \mathbf{H}\delta\mathbf{x}_a), \tag{20}$$
where $\mathbf{H}\delta\mathbf{x}_a^{*}$ and $\mathbf{H}\delta\mathbf{x}_a$ contain the analysis increments obtained from perturbed and unperturbed observations, respectively. The observations are perturbed by adding small perturbations εo = 𝗥1/2ξ to the original set of observations, y* = y + 𝗥1/2ξ, where ξ is a vector of random numbers with zero mean and unit variance.
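The randomized estimate can be sketched for a small linear analysis in which the gain is available explicitly, so that the estimate can be compared with the exact trace. Everything numerical here (state size, observed points, covariances, number of random draws) is hypothetical. The estimate is averaged over many draws because, for so few observations, a single draw is noisy.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical linear analysis: 6 state points, 4 direct observations.
n, p = 6, 4
idx = np.arange(n)
B = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / 1.5) ** 2)
H = np.eye(n)[[0, 2, 3, 5]]
r_var = np.array([0.5, 1.0, 1.0, 2.0])       # diagonal of R
R = np.diag(r_var)
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)

def increment_in_obs_space(y, xb):
    """Analysis increment mapped to observation space, H K (y - H xb)."""
    return H @ (K @ (y - H @ xb))

xb = np.zeros(n)
y = rng.standard_normal(p)                   # any fixed set of observations

def girard_sample():
    xi = rng.standard_normal(p)              # zero mean, unit variance
    y_star = y + np.sqrt(r_var) * xi         # perturbed observations
    diff = increment_in_obs_space(y_star, xb) - increment_in_obs_space(y, xb)
    return float((y_star - y) @ (diff / r_var))   # (y* - y)^T R^{-1} (difference)

estimate = float(np.mean([girard_sample() for _ in range(5000)]))
exact = float(np.trace(H @ K))
print(estimate, exact)
```

Only two analyses per draw are needed (perturbed and unperturbed), which is what makes the approach usable when the gain matrix is known only through the assimilation code.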
In our study, we propose that the DFS can be computed directly from observation departures from the analysis and forecast. Relying on expressions (13) and (15), the DFS can also be evaluated from (14) using the a posteriori statistics:
$$\mathrm{DFS} = E[(\mathbf{d}^o_b)^{T}\tilde{\mathbf{D}}^{-1}\mathbf{d}^a_b], \tag{21}$$
or, equivalently, using (16):
$$\mathrm{DFS} = E[(\mathbf{d}^o_a)^{T}\tilde{\mathbf{R}}^{-1}\mathbf{d}^a_b]. \tag{22}$$
The question then becomes which a posteriori relation should be used. In particular, for (21), the inversion of $\tilde{\mathbf{D}}$ may be complicated by the fact that $\tilde{\mathbf{D}}$ may be singular. By replacing the a posteriori observation error covariance matrix $\tilde{\mathbf{R}}$ by a diagonal matrix $\tilde{\mathbf{R}}_d$ that retains only its diagonal elements, (22) simplifies to
$$\mathrm{DFS} = E[(\mathbf{d}^o_a)^{T}\tilde{\mathbf{R}}_d^{-1}\mathbf{d}^a_b]. \tag{23}$$
In the following experiment, the DFS has been estimated from 2000 analyses. This is compared with the DFS computed with Girard’s method in (20) and the DFS calculated using the a posteriori statistics as introduced in (21)–(23).

Table 1 shows the estimates of DFS obtained with the true background and observation errors. In this case, the estimated 𝗞 is equal to the true Kalman gain matrix. The a posteriori estimate of the DFS agrees to within 0.1% with that found from Girard’s method and is in good agreement with the analytic value. Since the DFS is a function of 𝗕, the horizontal model correlations affect the DFS: when the correlation length increases, the DFS tends to decrease. This can be seen in the results of Table 1, which illustrate the influence of the background correlation length on the DFS.

The second set of experiments is similar to the previous one except that the observation error variance is now underestimated and taken to be $\sigma_o^2 = 2.25$. The results, shown in Table 2, are similar to those of Table 1. Similarly, Table 3 presents the results obtained when both the background and observation error variances are underestimated ($\sigma_b^2 = 0.25$ and $\sigma_o^2 = 2.25$, respectively). In both experiments, the DFS calculations using the full estimate of the a posteriori observation error covariance matrix give results similar to those obtained using the randomized Girard method. A good approximation is still achieved when only the diagonal elements are considered. The relative difference between the values of the DFS calculated as in (20) and (23) is around 3% when the background correlation length is assumed to be 300 km.

The conclusions from these experiments are now summarized. When the a priori error statistics differ from those estimated from observation departures, the estimated observation error covariance matrix might show cross correlations due in part to the presence of background error in its estimate. In this study, the nondiagonal elements of $\tilde{\mathbf{R}}$ were shown to be small, so that the diagnosed matrix can be approximated as a diagonal matrix. The idealized experiments with the 1D-Var show that it is possible to obtain the appropriate value for the DFS from a posteriori statistics. In all experiments, the information content estimated from the a posteriori statistics is quite similar to that estimated from the a priori statistics. A simple method has been introduced in which the estimated observation error covariances are assumed to be diagonal. The results obtained are also found to be in good agreement with the method proposed by Girard (1987) and Chapnik et al. (2006), and with the analytical solution provided.

4. Evaluation of the information content in 3D-Var and 4D-Var

In this section, the diagnostics introduced in the previous section are used to evaluate the DFS from the 3D- and 4D-Var systems of the Meteorological Service of Canada (MSC). The 3D- and 4D-Var experiments used in this study are those described in Laroche and Sarrazin (2010a,b). The 3D- and 4D-Var systems have been cycled over the period 21 December 2006 to 28 February 2007 using a 6-h assimilation window. All diagnostics exclude the first 11 days, the spinup period of the analysis. The incremental 4D-Var is used (Gauthier et al. 2007) in which the analysis increment is calculated at a lower horizontal resolution (∼170 km). The 4D-Var analysis is obtained after two outer loops by interpolating this lower-resolution analysis increment to the same grid (∼35 km) as the background state before adding the two. The subsets of observations assimilated in either 3D- or 4D-Var during winter 2006–07 include radiosondes (RAOB), aircraft data (AI), surface and ship data (SF), wind profiler data (PR), atmospheric motion vectors (AMVs) from geostationary satellites and those from Moderate Resolution Imaging Spectroradiometer (MODIS AMVs), and radiances from polar-orbiting satellites [Advanced Microwave Sounding Unit (AMSU-A/B)] and from geostationary satellites [Geostationary Operational Environmental Satellite (GOES-East) and (GOES-West)]. A summary is given in Table 4.

a. A posteriori diagnostics and consistency checks

The variational data assimilation formulation relies on a number of hypotheses on the background and observation error statistics. The validity of these hypotheses is an important factor in determining the optimality of the analysis. The chi-square (χ²) diagnostic can be used to check whether the sample covariances of innovations in a region, or for a given observing system, are very different from what has been prescribed. For data assimilation, χ² is defined as
$$\chi^{2} = (\mathbf{d}^o_b)^{T}\mathbf{D}^{-1}\mathbf{d}^o_b,$$
and its expected value is $E[\chi^{2}] = \mathrm{tr}(\mathbf{D}^{-1}\tilde{\mathbf{D}})$. Assuming that $\mathbf{D} = \tilde{\mathbf{D}}$, then E[χ²] = p, where p is the total number of observations used in the analysis. In 3D-/4D-Var, χ² can be obtained from the value of the cost function at its minimum, $J(\mathbf{x}_a)$, which is
$$J(\mathbf{x}_a) = \tfrac{1}{2}(\mathbf{d}^o_b)^{T}\mathbf{D}^{-1}\mathbf{d}^o_b = \tfrac{1}{2}\chi^{2}. \tag{24}$$
Equation (24) provides a simple diagnostic to check the global consistency of an assimilation algorithm. In Table 5, the averages over January and February 2007 of the estimated values of χ² in the 3D- and 4D-Var systems are shown and compared to the number of observations. The estimated value of χ²/p is less than 1, which implies that either the background or observation error variances, or both, have been overestimated.
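The behavior of this consistency check can be sketched with synthetic innovations. The example assumes every state point is observed directly and uses hypothetical variances: innovations are drawn with a known covariance, and the normalized chi-square is evaluated once with the correct prescribed covariance and once with all variances doubled.

```python
import numpy as np

rng = np.random.default_rng(4)
p, N = 50, 2000

# Hypothetical case with direct observations: true innovation variance is 2
# (unit background variance plus unit observation variance).
d_ob = np.sqrt(2.0) * rng.standard_normal((N, p))   # one innovation vector per row

def mean_chi2_over_p(D_prescribed):
    """Sample mean of d^T D^{-1} d divided by the number of observations."""
    D_inv = np.linalg.inv(D_prescribed)
    chi2 = np.sum((d_ob @ D_inv) * d_ob, axis=1)
    return float(np.mean(chi2) / p)

ratio_consistent = mean_chi2_over_p(2.0 * np.eye(p))      # prescribed = true
ratio_overestimated = mean_chi2_over_p(4.0 * np.eye(p))   # prescribed variances doubled
print(ratio_consistent, ratio_overestimated)
```

The first ratio is close to 1 and the second close to 0.5, the value expected when the prescribed innovation covariance is twice the actual one.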

The consistency diagnostic has been calculated for the observation and background error covariances as in (5a) and (5b). Results confirm the overestimation of the error statistics for most observation types and consequently the suboptimality of the system here considered. However, as previously shown, the DFS calculation is not affected by any degree of the system suboptimality.

b. Computation of DFS in MSC’s 3D-Var and 4D-Var

The DFS can be computed for different data types and regions. Let $\mathrm{DFS}^{k}_{\mathrm{Region}}$ denote the DFS of the kth observation subset over a given region. For instance, if the region is the whole globe, the relative DFS is defined as
$$\mathrm{DFS}^{k}_{\mathrm{Globe}}\,(\%) = 100 \times \frac{\mathrm{DFS}^{k}_{\mathrm{Globe}}}{\mathrm{DFS}_{\mathrm{Globe}}},$$
and represents the ratio of the $\mathrm{DFS}^{k}_{\mathrm{Globe}}$ obtained from a particular subset of observations to the total $\mathrm{DFS}_{\mathrm{Globe}}$ extracted from all observations. Expressed as a percentage, it then represents the relative contribution of any subset of observations to the global DFS. More generally, for a particular region, the relative $\mathrm{DFS}^{k}_{\mathrm{Region}}$ of different observation types can be written as
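$$\mathrm{DFS}^{k}_{\mathrm{Region}}\,(\%) = 100 \times \frac{\mathrm{DFS}^{k}_{\mathrm{Region}}}{\mathrm{DFS}_{\mathrm{Region}}}.$$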
Figure 2 presents estimates of the total DFS averaged over January–February 2007 in the MSC 3D- and 4D-Var systems for the following regions: the globe, Northern Hemisphere (20°–90°N), tropics (20°S–20°N), and Southern Hemisphere (90°–20°S). These results indicate that the DFS for 3D-Var is larger than for 4D-Var over all regions.

Figure 3 shows the DFS percentage in the 3D- and 4D-Var for different observation types over the globe. Results show that the most important observations in terms of information content in the analyses are radiosonde and brightness temperature data types (AMSU-A/B), followed by aircraft data. Different results have been obtained at the ECMWF (Cardinali et al. 2004), where satellite observations [AMSU-A, High Resolution Infrared Radiation Sounder (HIRS), and Special Sensor Microwave Imager (SSM/I)] contribute more to the DFS than conventional observations. The MSC 3D/4D-Var relies on fewer satellite data than ECMWF. It is also observed that radiosonde, wind profiler, aircraft, and AMSU-B data have more relative impact in 4D-Var than in 3D-Var. In the Northern Hemisphere, the largest DFS is obtained for radiosonde and aircraft data, while satellite radiances are dominant in the Southern Hemisphere. The fact that for satellite data the DFS is smaller in 4D-Var than in 3D-Var and that, in general, the DFS is larger for conventional observations than for satellite data indicates a need for model error covariance recalibration.

For any selected subset of data, the observation influence (OI) is defined as the DFS normalized by the number of observations p in the subset:
\[ \mathrm{OI} = \frac{\mathrm{DFS}}{p} . \]
Figure 4 shows the impact of individual observations in both 3D- and 4D-Var. The observation influence is largest for the radiosonde data in both data assimilation systems; all other data types show a much smaller impact per observation. The AMSU-B data also have a larger mean influence than the AMSU-A data. AMSU-B data carry information mainly on humidity, while the AMSU-A channels are sensitive to upper-tropospheric and lower-stratospheric temperature variations.
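The distinction between total DFS and observation influence can be made concrete with a short Python sketch. The helper name `observation_influence`, the subset labels, and the counts are hypothetical; they only illustrate that two subsets with the same DFS can have very different influence per observation.

```python
def observation_influence(dfs, n_obs):
    """DFS normalized by the number of observations in the subset."""
    if n_obs <= 0:
        raise ValueError("the subset must contain at least one observation")
    return dfs / n_obs


# Hypothetical (DFS, number of observations) pairs, not values from the paper.
subsets = {"type_a": (50.0, 1000), "type_b": (50.0, 100000)}
oi = {name: observation_influence(dfs, n) for name, (dfs, n) in subsets.items()}
for name, value in oi.items():
    print(f"{name}: OI = {value:.5f}")
```

Here both subsets contribute equally to the total DFS, but each observation of the sparse subset carries one hundred times more influence than each observation of the dense one.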

Figure 5 shows the DFS (%) for the different AMSU-A/B channels. Our system assimilates 7 radiance channels from the AMSU-A instrument (channels 4–10) and 4 from the AMSU-B instrument (channels 2–5). In particular, the weighting functions of AMSU-A channels 9 and 10 peak around 50–100 hPa, and a fraction of their weighting functions lies above the model top. A large part of the DFS comes from stratospheric AMSU-A channel 10. However, in 4D-Var, AMSU-A channel 9 (lower stratosphere) shows a negative DFS, which is difficult to interpret. The method proposed in this paper assumes that observation departures are unbiased, an assumption that may not hold exactly for results obtained from an operational system.

Figure 6 shows the information content for the main data types in the 3D- and 4D-Var as a function of the observation time within the assimilation window. The regions represented here are the Northern and Southern Hemispheres. The results suggest that radiosonde and surface pressure data have the largest DFS near the middle of the assimilation window, as most of these data are available at the synoptic time. On the other hand, the satellite data are spread roughly evenly across the assimilation window but have the largest DFS at the end of the window. The DFS is expected to be larger at the end of the assimilation window because of the evolution of the covariance matrices within the window. The comparison of DFS as a function of time in the assimilation window indicates no significant difference between the 3D- and 4D-Var systems.
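A partition of the DFS by observation time, as discussed above, amounts to summing per-observation contributions in time bins. The Python sketch below assumes that such per-observation contributions are available together with the observation time relative to the analysis time; the function name `dfs_by_time_bin`, the 6-h window, and the sample records are hypothetical.

```python
def dfs_by_time_bin(records, window_start, window_end, n_bins):
    """Sum per-observation DFS contributions into equal-width time bins.

    records: iterable of (time, contribution) pairs, with time expressed
    in the same units as window_start and window_end.
    Observations outside the window are ignored.
    """
    width = (window_end - window_start) / n_bins
    totals = [0.0] * n_bins
    for time, contribution in records:
        if not window_start <= time <= window_end:
            continue
        # An observation exactly at the end of the window goes in the last bin.
        index = min(int((time - window_start) / width), n_bins - 1)
        totals[index] += contribution
    return totals


# Hypothetical records: (hours relative to analysis time, DFS contribution).
records = [(-3.0, 0.2), (-0.1, 0.5), (0.0, 0.4), (2.9, 0.3), (3.0, 0.1), (4.0, 9.9)]
bins = dfs_by_time_bin(records, -3.0, 3.0, 6)
print(bins)
```

The record at +4.0 h falls outside the assumed window and is discarded, so the binned values sum to the DFS of the observations inside the window only.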

5. Conclusions

As described in this paper, several approaches have recently been used to evaluate the value of observations in data assimilation systems. The DFS is used in data assimilation applications to indicate the self-sensitivity of the analysis to different observation types. In this paper, a new method to assess the information content of observations in analyses is presented and applied to calculate the DFS of a complete set, or subsets, of observations in the MSC’s 3D- and 4D-Var systems. Building on the results of Desroziers et al. (2005), which inspired these a posteriori estimates, it is shown that the information content brought in by the data assimilation system can be estimated from observation departures from the analysis and the background state. The main point made in this paper is that even though the error statistics may not be consistent, the observation departures can still be used to measure the information content in observations associated with the a priori error statistics used in the assimilation. It was shown here that by introducing the additional assumption that the observation error is uncorrelated, the method is easily applicable as a diagnostic of the results produced by any data assimilation system. One has to be aware that the method implicitly assumes that the observation departures are unbiased, which may not hold in practice. A simplified 1D-Var system was used to test the validity of the method, and the results confirmed that the estimates agree with those obtained with the method proposed by Girard (1987). With error statistics differing from the true ones, it was shown that the a posteriori estimate of the observation error covariance is reasonably diagonal, which justifies the hypothesis made on the a posteriori estimate of the observation error covariances.
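The statement that departures remain informative when the prescribed error statistics are inconsistent can be illustrated with a toy calculation. The Python sketch below is not the authors' code: it assumes independent scalar observation locations, and it uses a ratio of departure cross products that follows from Desroziers et al. (2005)-type identities as one possible diagonal estimator. All function names and numerical values are hypothetical. For contrast, it also normalizes the same cross product by the assumed observation error variance.

```python
import random


def simulate_departures(b_assumed, r_assumed, innovation_var_true, n_samples, rng):
    """Simulate departures at one scalar observation location.

    The gain is built from the ASSUMED variances, while the innovations
    are drawn with the TRUE variance, so the statistics may be inconsistent.
    """
    gain = b_assumed / (b_assumed + r_assumed)  # scalar HK
    samples = []
    for _ in range(n_samples):
        d_ob = rng.gauss(0.0, innovation_var_true ** 0.5)  # y - H(x_b)
        d_ab = gain * d_ob                                  # H(x_a) - H(x_b)
        d_oa = d_ob - d_ab                                  # y - H(x_a)
        samples.append((d_ab, d_oa, d_ob))
    return gain, samples


def dfs_ratio_estimate(samples):
    """Ratio of summed cross products: sum(d_ab*d_oa) / sum(d_oa*d_ob)."""
    numerator = sum(d_ab * d_oa for d_ab, d_oa, _ in samples)
    denominator = sum(d_oa * d_ob for _, d_oa, d_ob in samples)
    return numerator / denominator


def dfs_assumed_r_estimate(samples, r_assumed):
    """For contrast: mean(d_ab*d_oa) normalized by the assumed variance."""
    mean_cross = sum(d_ab * d_oa for d_ab, d_oa, _ in samples) / len(samples)
    return mean_cross / r_assumed


rng = random.Random(0)
# (assumed b, assumed r, true innovation variance); hypothetical values.
# Consistent statistics would have true variance == b + r.
locations = [(1.0, 1.0, 8.0), (1.0, 3.0, 16.0), (2.0, 1.0, 12.0)]

true_dfs = ratio_dfs = assumed_r_dfs = 0.0
for b, r, var_true in locations:
    gain, samples = simulate_departures(b, r, var_true, 2000, rng)
    true_dfs += gain
    ratio_dfs += dfs_ratio_estimate(samples)
    assumed_r_dfs += dfs_assumed_r_estimate(samples, r)

print(f"trace(HK)          = {true_dfs:.3f}")
print(f"ratio estimate     = {ratio_dfs:.3f}")
print(f"assumed-R estimate = {assumed_r_dfs:.3f}")
```

In this scalar toy case the ratio reproduces the gain algebraically, whatever the true innovation variance, whereas the estimate normalized by the assumed variance is inflated because the true innovation variance was set to four times the assumed one. In a realistic system with correlated errors and a finite sample, a departure-based estimate is only statistical.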

The DFS calculation was also applied to the MSC’s 3D- and 4D-Var systems. Partitioning by observation type makes it possible to diagnose the relative influence of different observing systems on the analysis. The results suggest that radiosondes are the most influential data type of the global observing system, followed by brightness temperature data (AMSU-A/B) and aircraft data. It is worth mentioning that the largest observation influence, that is, the DFS per observation, is provided by radiosonde and AMSU-B data. It has already been shown that the DFS is useful to evaluate the sensitivity of the analysis to different channels of a particular radiometer. The estimation of the a posteriori error standard deviations for satellite radiances indicates that the errors are generally overestimated in the MSC’s 3D- and 4D-Var schemes. It is, however, planned to investigate more carefully the a posteriori estimation of the observation error variance for radiometer channels sounding the upper atmosphere.

The results shown in this paper indicate some deficiencies in the current estimates of the error statistics used in the assimilation. Future work is needed to recalibrate the error statistics to reflect changes made to the system. These diagnostics will be used to evaluate the information content of a complete set, or subsets, of observations in the 4D-Var scheme that was implemented operationally in 2008. Since then, the number of Advanced Television and Infrared Observation Satellite (TIROS) Operational Vertical Sounder (ATOVS) and AMV observations has been increased in the new system, and new observation types such as AIRS, SSM/I (clear-sky radiances), and QuikSCAT SeaWinds are now assimilated. It is worth mentioning that, among its different applications, the DFS can be used to map the evolution of the model covariance matrix.

Acknowledgments

The authors thank Mr. Pierre Koclas of the Meteorological Service of Canada, who helped us use the observation database to build the observation-space diagnostics used in this study. Environment Canada provided partial financial support for this study, in addition to the computing facilities and technical assistance for the use of their assimilation system. The authors would also like to thank Drs. Mark Buehner of Environment Canada and Carla Cardinali of ECMWF for their comments and discussions, which led to significant improvements to the paper. Comments from an anonymous reviewer also helped to improve the paper and are acknowledged.

This work has been funded in part by Grant 500-b of the Canadian Foundation for Climate and Atmospheric Sciences (CFCAS) for the project on the Impact of Observing Systems on Forecasting Extreme Weather in the short, medium and extended range: A Canadian contribution to THORPEX, with additional support from Discovery Grant 357091 of the Natural Sciences and Engineering Research Council (NSERC) of Canada.

REFERENCES

  • Baker, N. L., and R. Daley, 2000: Observation and background adjoint sensitivity in the adaptive observation-targeting problem. Quart. J. Roy. Meteor. Soc., 126, 1431–1454.
  • Cardinali, C., 2009: Monitoring the observation impact on the short-range forecast. Quart. J. Roy. Meteor. Soc., 135, 239–250.
  • Cardinali, C., S. Pezzulli, and E. Andersson, 2004: Influence-matrix diagnostic of a data assimilation system. Quart. J. Roy. Meteor. Soc., 130, 2767–2786.
  • Chapnik, B., G. Desroziers, F. Rabier, and O. Talagrand, 2006: Diagnosis and tuning of observational error in a quasi-operational data assimilation setting. Quart. J. Roy. Meteor. Soc., 132, 543–565.
  • Desroziers, G., and S. Ivanov, 2001: Diagnosis and adaptive tuning of observation-error parameters in a variational assimilation. Quart. J. Roy. Meteor. Soc., 127, 1433–1452.
  • Desroziers, G., L. Berre, B. Chapnik, and P. Poli, 2005: Diagnosis of observation, background and analysis-error statistics in observation space. Quart. J. Roy. Meteor. Soc., 131, 3385–3396.
  • Fisher, M., 2003: Estimation of entropy reduction and degrees of freedom for signal for large variational analysis systems. ECMWF Tech. Memo. 397, 18 pp.
  • Gauthier, P., C. Charette, L. Fillion, P. Koclas, and S. Laroche, 1999: Implementation of a 3D variational data assimilation system at the Canadian Meteorological Centre. Part I: The global analysis. Atmos.–Ocean, 37, 103–156.
  • Gauthier, P., M. Tanguay, S. Laroche, S. Pellerin, and J. Morneau, 2007: Extension of 3DVAR to 4DVAR: Implementation of 4DVAR at the Meteorological Service of Canada. Mon. Wea. Rev., 135, 2339–2354.
  • Gelaro, R., and Y. Zhu, 2009: Examination of observation impacts derived from observing system experiments (OSEs) and adjoint models. Tellus, 61A, 179–193.
  • Girard, D., 1987: A fast Monte Carlo cross-validation procedure for large least squares problems with noisy data. Tech. Rep. 687-M, IMAG, Grenoble, France, 22 pp.
  • Golub, G. H., and C. F. van Loan, 1996: Matrix Computations. The Johns Hopkins University Press, 728 pp.
  • Kelly, G., J. N. Thépaut, R. Buizza, and C. Cardinali, 2007: The value of observations. I: Data denial experiments for the Atlantic and the Pacific. Quart. J. Roy. Meteor. Soc., 133, 1803–1815.
  • Langland, R. H., and N. L. Baker, 2004: Estimation of observation impact using the NRL atmospheric variational data assimilation adjoint system. Tellus, 56A, 189–201.
  • Laroche, S., and R. Sarrazin, 2010a: Impact study with observations assimilated over North America and the North Pacific Ocean on the MSC global forecast system. Part I: Contribution of radiosonde, aircraft and satellite data. Atmos.–Ocean, 48, 10–25.
  • Laroche, S., and R. Sarrazin, 2010b: Impact study with observations assimilated over North America and the North Pacific Ocean on the MSC global forecast system. Part II: Sensitivity experiments. Atmos.–Ocean, 48, 26–38.
  • Purser, R. J., and H. L. Huang, 1993: Estimating effective data density in a satellite retrieval or an objective analysis. J. Appl. Meteor., 32, 1092–1107.
  • Rabier, F., N. Fourrié, D. Chafai, and P. Prunet, 2002: Channel selection methods for Infrared Atmospheric Sounding Interferometer radiances. Quart. J. Roy. Meteor. Soc., 128, 1011–1027.
  • Rodgers, C., 2000: Inverse Methods for Atmospheric Sounding: Theory and Practice. World Scientific Publishing, 256 pp.
  • Talagrand, O., 1999: A posteriori evaluation and verification of the analysis and assimilation algorithms. Proc. Workshop on Diagnosis of Data Assimilation Systems, Reading, United Kingdom, ECMWF, 17–28.
  • Zhu, Y., and R. Gelaro, 2008: Observation sensitivity calculations using the adjoint of the Gridpoint Statistical Interpolation (GSI) analysis system. Mon. Wea. Rev., 136, 335–351.

Fig. 1.

Off-diagonal terms in the observation error covariance as a function of the distance $r_{ij}$ between points i and j.

Citation: Monthly Weather Review 139, 3; 10.1175/2010MWR3404.1

Fig. 2.

The total DFS 2-month average (January–February 2007) in the MSC 3D- and 4D-Var analysis over the 4 regions: the entire globe, the Northern Hemisphere (20°–90°N), the tropics (20°S–20°N), and the Southern Hemisphere (90°–20°S).

Fig. 3.

DFS 2-month average (January–February 2007) in the MSC 3D- and 4D-Var analysis for the 8 data types over the globe.

Fig. 4.

Observation influence in the MSC 3D- and 4D-Var analysis for the 8 data types over the globe.

Fig. 5.

DFS 2-month average over the globe in the MSC 3D- and 4D-Var analysis for each channel of (a) AMSU-A and (b) AMSU-B.

Fig. 6.

DFS for the main data types in the 3D- and 4D-Var systems as a function of observation time relative to the assimilation window. The observing platforms are color coded and given in the legend.

Table 1.

DFS estimate values as a function of the background correlation length scale $L_c$: $\mathrm{DFS}_{\mathrm{ANALYTIC}}$ as calculated from the prescribed statistics; $\mathrm{DFS}_{\mathrm{GIRARD}}$ as computed with Girard’s method; and the diagnosed estimates, including $\widetilde{\mathrm{DFS}}_{\mathrm{DIAG}}$, as obtained from (21)–(23). The a priori values are perfectly known.

Table 2.

As in Table 1, but for the experiment with an underestimated value of the observation error variance ($\sigma_o^2 = 2.25$).

Table 3.

As in Table 1, but for the experiment with both the observation and background error variances underestimated ($\sigma_o^2 = 2.25$ and $\sigma_b^2 = 0.25$, respectively).

Table 4.

List of observations assimilated in the 3D- and 4D-Var systems of Environment Canada during the winter of 2006–07. Variables retain their standard definitions.

Table 5.

Comparison between the estimated values of χ2 and the number of observations p in 3D- and 4D-Var, averaged over a 2-month winter period (1 Jan–28 Feb 2007).

Table 5.
Save
  • Baker, N. L. , and R. Daley , 2000: Observation and background adjoint sensitivity in the adaptative observation-targeting problem. Quart. J. Roy. Meteor. Soc., 126 , 14311454.

    • Search Google Scholar
    • Export Citation
  • Cardinali, C. , 2009: Monitoring the observation impact on the short-range forecast. Quart. J. Roy. Meteor. Soc., 135 , 239250.

  • Cardinali, C. , S. Pezzulli , and E. Andersson , 2004: Influence-matrix diagnostic of a data assimilation system. Quart. J. Roy. Meteor. Soc., 130 , 27672786.

    • Search Google Scholar
    • Export Citation
  • Chapnik, B. , G. Desroziers , F. Rabier , and O. Talagrand , 2006: Diagnosis and tuning of observational error in a quasi-operational data assimilation setting. Quart. J. Roy. Meteor. Soc., 132 , 543565.

    • Search Google Scholar
    • Export Citation
  • Desroziers, G. , and S. Ivanov , 2001: Diagnosis and adaptive tuning of observation-error parameters in a variational assimilation. Quart. J. Roy. Meteor. Soc., 127 , 14331452.

    • Search Google Scholar
    • Export Citation
  • Desroziers, G. , L. Berre , B. Chapnik , and P. Poli , 2005: Diagnosis of observation, background and analysis-error statistics in observation space. Quart. J. Roy. Meteor. Soc., 131 , 33853396.

    • Search Google Scholar
    • Export Citation
  • Fisher, M. , 2003: Estimation of entropy reduction and degrees of freedom for signal for large variational analysis systems. ECMWF Tech. Memo. 397, 18 pp.

    • Search Google Scholar
    • Export Citation
  • Gauthier, P. , C. Charette , L. Fillion , P. Koclas , and S. Laroche , 1999: Implementation of a 3D variational data assimilation system at the Canadian Meteorological Centre. Part I: The global analysis. Atmos.–Ocean, 37 , 103156.

    • Search Google Scholar
    • Export Citation
  • Gauthier, P. , M. Tanguay , S. Laroche , S. Pellerin , and J. Morneau , 2007: Extension of 3DVAR to 4DVAR: Implementation of 4DVAR at the Meteorological Service of Canada. Mon. Wea. Rev., 135 , 23392354.

    • Search Google Scholar
    • Export Citation
  • Gelaro, R. , and Y. Zhu , 2009: Examination of observation impacts derived from observing system experiments (OSEs) and adjoint models. Tellus, 61A , 179193.

    • Search Google Scholar
    • Export Citation
  • Girard, D. , 1987: A fast Monte Carlo cross-validation procedure for large least squares problems with noisy data. Tech. Rep. 687-M, IMAG, Grenoble, France, 22 pp.

    • Search Google Scholar
    • Export Citation
  • Golub, G. H. , and C. F. van Loan , 1996: Matrix Computations. The Johns Hopkins University Press, 728 pp.

  • Kelly, G. , J. N. Thépaut , R. Buizza , and C. Cardinali , 2007: The value of observations. I: Data denial experiments for the Atlantic and the Pacific. Quart. J. Roy. Meteor. Soc., 133 , 18031815.

    • Search Google Scholar
    • Export Citation
  • Langland, R. H. , and N. L. Baker , 2004: Estimation of observation impact using the NRL atmospheric variational data assimilation adjoint system. Tellus, 56A , 189201.

    • Search Google Scholar
    • Export Citation
  • Laroche, S. , and R. Sarrazin , 2010a: Impact study with observations assimilated over North America and the North Pacific Ocean on the MSC global forecast system. Part I: Contribution of radiosonde, aircraft and satellite data. Atmos.–Ocean, 48 , 1025.

    • Search Google Scholar
    • Export Citation
  • Laroche, S. , and R. Sarrazin , 2010b: Impact study with observations assimilated over North America and the North Pacific Ocean on the MSC global forecast system. Part II: Sensitivity experiments. Atmos.–Ocean, 48 , 2638.

    • Search Google Scholar
    • Export Citation
  • Purser, R. J. , and H. L. Huang , 1993: Estimating effective data density in a satellite retrieval or an objective analysis. J. Appl. Meteor., 32 , 10921107.

    • Search Google Scholar
    • Export Citation
  • Rabier, F. , N. Fourrié , D. Chafai , and P. Prunet , 2002: Channel selection methods for Infrared Atmospheric Sounding Interferometer radiances. Quart. J. Roy. Meteor. Soc., 128 , 10111027.

    • Search Google Scholar
    • Export Citation
  • Rodgers, C. , 2000: Inverse Methods for Atmospheric Sounding Theory and Practice. World Scientific Publishing, 256 pp.

  • Talagrand, O. , 1999: A posteriori evaluation and verification of the analysis and assimilation algorithms. Proc. Workshop on Diagnosis of Data Assimilation Systems, Reading, United Kingdom, ECMWF, 17–28.

    • Search Google Scholar
    • Export Citation
  • Zhu, Y. , and R. Gelaro , 2008: Observation sensitivity calculations using the adjoint of the Gridpoint Statistical Interpolation (GSI) analysis system. Mon. Wea. Rev., 136 , 335351.

    • Search Google Scholar
    • Export Citation
  • Fig. 1.

    Off-diagonal terms in the observation error covariance as function of distance rij between points i and j.

  • Fig. 2.

    The total DFS 2-month average (January–February 2007) in the MSC 3D- and 4D-Var analysis over the 4 regions: the entire globe, the Northern Hemisphere (20°–90°N), the tropics (20°S–20°N), and the Southern Hemisphere (90°–20°S).

  • Fig. 3.

    DFS 2-month average (January–February 2007) in the MSC 3D- and 4D-Var analysis for the 8 data types over the globe.

  • Fig. 4.

    Observation influence in the MSC 3D- and 4D-Var analysis for the 8 data types over the globe.

  • Fig. 5.

    DFS 2-month average over the globe in the MSC 3D- and 4D-Var analysis for each channel of (a) AMSU-A and (b) AMSU-B.

  • Fig. 6.

    DFS for the main data types in the 3D- and 4D-Var systems as a function of observation time relative to the assimilation window. The observing platforms are color coded and given in the legend.

All Time Past Year Past 30 Days
Abstract Views 0 0 0
Full Text Views 1802 1372 613
PDF Downloads 275 106 7