1. Introduction
Recently, new methods have been developed to quantify the impact of the observations on the assimilation and on the ensuing forecast. For the analysis, this can be achieved by evaluating the information content expressed in terms of degrees of freedom for signal (DFS; Purser and Huang 1993; Rodgers 2000; Rabier et al. 2002). For any data assimilation system, this diagnostic quantifies the information brought by any given type of observations and is useful to assess the relative impact of the different types of observations being assimilated. With the increasing number of datasets used by modern data assimilation systems, such as those from the hyperspectral infrared sounders Atmospheric Infrared Sounder (AIRS) and Infrared Atmospheric Sounding Interferometer (IASI), it is important to know the information content of the radiance measurements, since this knowledge makes it possible to reduce the volume of data from these new instruments. An example of how this diagnostic was applied to channel selection is presented in Rabier et al. (2002), who used IASI-simulated data to evaluate the impact of the different channels on the analysis.
Proposed by Cardinali et al. (2004), the influence matrix gives a measure of how much any given observation impacted the analysis. They used this approach to estimate the information content supplied by different types of observational data to analyses produced by the European Centre for Medium-Range Weather Forecasts (ECMWF) four-dimensional variational data assimilation (4D-Var) system. The sensitivity of the analysis to observations then showed that about 25% of the information was provided by ground-based observing systems and 75% by satellite systems. This approach also allows the partial influence of observational subsets to be examined based on geographical area and observation type. Another method was applied with the Action de Recherche Petite Echelle Grande Echelle (ARPEGE) 4D-Var system of Météo-France (Chapnik et al. 2006). It is based on a method proposed in Girard (1987) in which perturbations to both the background and the observations are introduced to measure the sensitivity of the resulting analysis given the uncertainty in both the observations and the background. In any data assimilation system, the impact on the analysis depends critically on the observation and background error statistics used in the assimilation.
In numerical weather prediction, one is interested in knowing the impact of the observation on the forecast made from the analysis. Traditionally, the observation impact on forecasts has been obtained from Observing System Experiments (OSEs) in which selected datasets are systematically added or removed from the assimilation system (e.g., Kelly et al. 2007). Using the OSEs, the impact of various observation network configurations can be assessed by comparing forecast scores from experiments that use different observation scenarios. This approach is expensive and only provides a global view of the impact of observations. Recently, adjoint-based sensitivities with respect to observations have also been proposed to assess the observation impact on short-range forecasts without carrying out data-denial experiments (Baker and Daley 2000; Langland and Baker 2004; Zhu and Gelaro 2008; Cardinali 2009). Zhu and Gelaro (2008) showed that the adjoint-based method provides accurate assessments of the forecast sensitivity with respect to most of the observations assimilated. Gelaro and Zhu (2009) and Cardinali (2009) have recently applied adjoint-based impact calculations to results from OSEs to show that the two methods provide complementary information.
The objective of this paper is to propose a simple approach for evaluating the information content of the observations used in any data assimilation system directly from observation departures from the analysis and the forecast, which are a natural by-product of the assimilation process. The emphasis is therefore on the impact of observations on the analysis only, not on the forecasts. Following Desroziers et al. (2005), observation departures from analyses and forecasts can be used to diagnose the consistency of the observation and background error statistics used in the assimilation. If these error statistics are suboptimal, they showed that this information can be used to recalibrate the error statistics to meet the χ² optimality criterion. The methods are presented in Desroziers and Ivanov (2001) and Chapnik et al. (2006). What they show is that observation departures with respect to the background and the analysis are directly related to the observation, background, and analysis error covariances. Based on this, they showed that any inconsistency between those diagnostics and the a priori error statistics used in the assimilation can be used to recalibrate the observation and background error statistics. As pointed out in Chapnik et al. (2006), these relationships can also provide an estimate of the information content, provided the error statistics are consistent. What we show in this paper is that these relationships provide a reliable estimate of the information content as evaluated with the perturbation method of Girard (1987).
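For reference, the relationships of Desroziers et al. (2005) mentioned above can be written compactly as follows. The notation is an assumption made for this summary, not necessarily that of the numbered equations of this paper: y is the observation vector, x_b and x_a the background and analysis, H the (linearized) observation operator, B, R, and A the background, observation, and analysis error covariances, and K the gain matrix. The three covariance identities hold when the statistics specified in the assimilation are consistent with the true ones.

```latex
\begin{align}
  d^o_b &= y - H(x_b), \qquad
  d^o_a = y - H(x_a), \qquad
  d^a_b = H(x_a) - H(x_b), \\
  E\!\left[ d^a_b \,(d^o_b)^{T} \right] &= H B H^{T}, \\
  E\!\left[ d^o_a \,(d^o_b)^{T} \right] &= R, \\
  E\!\left[ d^a_b \,(d^o_a)^{T} \right] &= H A H^{T}, \\
  \mathrm{DFS} &= \mathrm{tr}(HK)
     = \mathrm{tr}\!\left[ H B H^{T} \left( H B H^{T} + R \right)^{-1} \right].
\end{align}
```

Since K = A H^T R^{-1} in the consistent case, the last two lines together suggest why products of analysis and background departures carry the information needed to estimate the DFS.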
An analytic derivation is presented to show how the DFS can be evaluated from the a posteriori statistics. Section 2 describes the methodology for computing the information content brought by the observations. Based on the results of Desroziers et al. (2005), it is shown that the information content brought in by the data assimilation system can be estimated from observation departures from the analysis and forecast, even when the expected statistics of the innovation vector differ from those specified in the assimilation system. A unique aspect of the method proposed here is that it does not require the consistency of the error statistics in the analysis system. In section 3, results obtained with a simplified one-dimensional variational (1D-Var) scheme are presented and discussed. In section 4, the method is applied to results from three-dimensional (3D-Var) and 4D-Var systems to show how the impact of observations depends on the assimilation method. Those results were obtained from analyses produced with the 3D- and 4D-Var systems of Environment Canada (Gauthier et al. 1999, 2007). Finally, the summary and conclusions are given in section 5.
2. Estimation of information content brought by the observations




















a. Case with consistent error statistics



b. Case with inconsistent error statistics












The equivalence established here states that the DFS evaluated using diagnostics of E[








For many observation types like radiosondes and ground-based instruments, the observation error is uncorrelated between distinct observations. We then introduce the assumption that
In the next section, a simple 1D-Var system is used to investigate the extent to which this assumption is reasonable in an idealized context in which an ensemble of analyses can be generated.
3. Application to 1D-Var system




a. Estimation of the off-diagonal terms in the observation error covariance
The first experiment is to examine whether the a posteriori estimate of observation error covariance can be assumed to be diagonal and their importance for the definition of








b. Degrees of freedom for signal
In a second set of experiments, the DFS is evaluated using the a posteriori statistics and compared with that obtained using the perturbation method (Girard 1987). Results are also shown when the a posteriori diagnostics are evaluated using either (14) or (16). Finally, the DFS is estimated using (16), but retaining only the estimated diagonal elements of






Table 1 shows the estimates of DFS obtained with the true background and observation errors. In this case, the estimated 𝗞 is equal to the true Kalman gain matrix. The a posteriori estimate of DFS agrees to within 0.1% with that found from Girard’s method and is in good agreement with the analytic value. Since the DFS is a function of 𝗕, the horizontal model correlations affect the DFS: when the correlation length increases, the DFS tends to decrease. This can be seen in the results of Table 1, which illustrate the influence of the background correlation length on the DFS.
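The kind of comparison reported in Table 1 can be illustrated with a small toy experiment. The following is a minimal sketch under stated assumptions, not the 1D-Var configuration used in this study: the grid size, the exponential covariance model, the identity observation operator, the variances, and all function names (`exp_cov`, `dfs_estimates`) are hypothetical choices made for illustration. The full a posteriori estimate relies on the identity E[d_ab d_oa^T] = HK E[d_oa d_ob^T]^T, which follows from the linearity of the analysis and is offered here as one way of realizing the estimate described in the text. The perturbation estimate is a randomized trace estimate in the spirit of Girard (1987).

```python
import numpy as np


def exp_cov(n, variance, length):
    """Exponential covariance on a regular 1D grid (an illustrative choice)."""
    idx = np.arange(n)
    return variance * np.exp(-np.abs(idx[:, None] - idx[None, :]) / length)


def dfs_estimates(B_spec, R_spec, B_true, R_true, H, n_samples=5000, seed=0):
    """Compare three DFS values for a linear analysis with gain K.

    B_spec, R_spec: covariances specified in the assimilation (define K).
    B_true, R_true: covariances used to draw the simulated errors.
    """
    rng = np.random.default_rng(seed)
    p, n = H.shape
    HBHt = H @ B_spec @ H.T
    HK = HBHt @ np.linalg.inv(HBHt + R_spec)

    # 1) Value implied by the specified statistics: DFS = tr(HK).
    dfs_analytic = float(np.trace(HK))

    # 2) Perturbation estimate: for eps ~ N(0, R_spec),
    #    E[eps^T R_spec^{-1} HK eps] = tr(HK).
    eps = rng.standard_normal((n_samples, p)) @ np.linalg.cholesky(R_spec).T
    quad = np.sum((eps @ np.linalg.inv(R_spec)) * (eps @ HK.T), axis=1)
    dfs_perturb = float(np.mean(quad))

    # 3) A posteriori estimates from simulated departures. The truth cancels
    #    in the innovation, so only the two error samples are needed.
    e_b = rng.standard_normal((n_samples, n)) @ np.linalg.cholesky(B_true).T
    e_o = rng.standard_normal((n_samples, p)) @ np.linalg.cholesky(R_true).T
    d_ob = e_o - e_b @ H.T   # y - H x_b
    d_ab = d_ob @ HK.T       # H x_a - H x_b
    d_oa = d_ob - d_ab       # y - H x_a

    S = d_ab.T @ d_oa / n_samples      # sample E[d_ab d_oa^T]
    R_est = d_oa.T @ d_ob / n_samples  # sample E[d_oa d_ob^T]

    # Full estimate: S = HK R_est^T for a linear analysis, so HK = S R_est^{-T}.
    dfs_full = float(np.trace(S @ np.linalg.inv(R_est.T)))
    # Diagonal approximation: keep only the diagonal of R_est.
    dfs_diag = float(np.sum(np.diag(S) / np.diag(R_est)))

    return {"analytic": dfs_analytic, "perturbation": dfs_perturb,
            "a_posteriori_full": dfs_full, "a_posteriori_diag": dfs_diag}


if __name__ == "__main__":
    n = 40
    H = np.eye(n)                       # every grid point observed (assumed)
    B = exp_cov(n, 1.0, 3.0)
    R = 4.0 * np.eye(n)
    print(dfs_estimates(B, R, B, R, H))                  # consistent case
    print(dfs_estimates(B, 2.25 * np.eye(n), B, R, H))   # R underestimated
```

In this sketch the full a posteriori estimate reproduces tr(HK) for the specified gain whether or not the specified statistics match those used to draw the errors, whereas the diagonal approximation and the perturbation estimate are subject to sampling noise. Increasing the `length` argument of `exp_cov` gives a simple way to explore the dependence on the background correlation length.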
The second set of experiments is similar to the previous one except that the observation error variance is now underestimated and taken to be σ_o² = 2.25. The results, shown in Table 2, are similar to those of Table 1. Similarly, Table 3 presents the results obtained when both the background and observation error variances are underestimated (σ_b² = 0.25 and σ_o² = 2.25, respectively). In both experiments, the DFS calculations using the full estimate of the a posteriori observation error covariance matrix
The conclusions from these experiments are now summarized. When the a priori error statistics differ from those estimated from observation departures, the estimated observation error covariance matrix might show cross correlations due in part to the presence of background error in its estimate. In this study, the nondiagonal elements of
4. Evaluation of the information content in 3D-Var and 4D-Var
In this section, the diagnostics introduced in the previous section are used to evaluate the DFS from the 3D- and 4D-Var systems of the Meteorological Service of Canada (MSC). The 3D- and 4D-Var experiments used in this study are those described in Laroche and Sarrazin (2010a,b). The 3D- and 4D-Var systems have been cycled over the period 21 December 2006 to 28 February 2007 using a 6-h assimilation window. All diagnostics exclude the first 11 days, the spinup period of the analysis. The incremental 4D-Var is used (Gauthier et al. 2007), in which the analysis increment is calculated at a lower horizontal resolution (∼170 km). The 4D-Var analysis is obtained after two outer loops by interpolating this lower-resolution analysis increment to the same grid (∼35 km) as the background state before adding the two. The subsets of observations assimilated in either 3D- or 4D-Var during winter 2006–07 include radiosondes (RAOB), aircraft data (AI), surface and ship data (SF), wind profiler data (PR), atmospheric motion vectors (AMVs) from geostationary satellites and those from the Moderate Resolution Imaging Spectroradiometer (MODIS AMVs), and radiances from polar-orbiting satellites [Advanced Microwave Sounding Unit (AMSU-A/B)] and from geostationary satellites [Geostationary Operational Environmental Satellite (GOES)-East and GOES-West]. A summary is given in Table 4.
a. A posteriori diagnostics and consistency checks


The consistency diagnostic has been calculated for the observation and background error covariances as in (5a) and (5b). Results confirm the overestimation of the error statistics for most observation types and, consequently, the suboptimality of the system considered here. However, as previously shown, the DFS calculation is not affected by the degree of suboptimality of the system.
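The global χ² check that accompanies such consistency diagnostics (Talagrand 1999) can be summarized as follows. The notation is assumed for this summary: d^o_b is the innovation vector, p the number of observations, and B and R the covariances specified in the assimilation. For a linear analysis, twice the cost function at its minimum is a quadratic form in the innovation, and its expectation equals p when the specified innovation covariance matches the actual one.

```latex
\begin{align}
  \chi^2 &= 2 J(x_a)
          = (d^o_b)^{T} \left( H B H^{T} + R \right)^{-1} d^o_b, \\
  E\!\left[ \chi^2 \right]
         &= \mathrm{tr}\!\left\{ \left( H B H^{T} + R \right)^{-1}
            E\!\left[ d^o_b \,(d^o_b)^{T} \right] \right\}
          = p
  \quad \text{if } E\!\left[ d^o_b \,(d^o_b)^{T} \right] = H B H^{T} + R .
\end{align}
```

A ratio χ²/p smaller than one therefore indicates that the specified covariances are, on the whole, larger than those of the actual innovations, which corresponds to an overestimation of the error statistics.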
b. Computation of DFS in MSC’s 3D-Var and 4D-Var

Figure 3 shows the DFS percentage in the 3D- and 4D-Var for different observation types over the globe. Results show that the most important observations in terms of information content in the analyses are radiosonde and brightness temperature data types (AMSU-A/B), followed by aircraft data. Different results have been obtained at the ECMWF (Cardinali et al. 2004), where satellite observations [AMSU-A, High Resolution Infrared Radiation Sounder (HIRS), and Special Sensor Microwave Imager (SSM/I)] contribute more to the DFS than conventional observations. The MSC 3D/4D-Var assimilates fewer satellite data than the ECMWF system. It is also observed that radiosonde, wind profiler, aircraft, and AMSU-B data have more relative impact in 4D-Var than in 3D-Var. In the Northern Hemisphere, the largest DFS is obtained for radiosonde and aircraft data, while satellite radiances are dominant in the Southern Hemisphere. The fact that for satellite data the DFS is smaller in 4D-Var than in 3D-Var and that, in general, the DFS is larger for conventional observations than for satellite data indicates a need for model error covariance recalibration.
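A partition of the DFS by observation type, as displayed in Fig. 3, could be computed along the following lines. This is a minimal sketch under the assumption that the a posteriori observation error covariance is treated as diagonal with a single variance pooled over each observation type; the pooling strategy, the function name `dfs_by_type`, and the array layout are hypothetical and do not describe the MSC implementation.

```python
import numpy as np


def dfs_by_type(d_ob, d_ab, d_oa, obs_type):
    """Partition an a posteriori DFS estimate by observation type.

    d_ob: innovations y - H(x_b), one value per observation.
    d_ab: analysis increments in observation space, H(x_a) - H(x_b).
    d_oa: analysis residuals y - H(x_a).
    obs_type: label of each observation (e.g. "RAOB", "AMSU").

    Returns (dfs, share): the DFS per type and its percentage of the total.
    """
    d_ob, d_ab, d_oa = (np.asarray(a, dtype=float) for a in (d_ob, d_ab, d_oa))
    obs_type = np.asarray(obs_type)
    dfs = {}
    for t in np.unique(obs_type):
        sel = obs_type == t
        # Pooled a posteriori observation error variance for this type.
        var_est = np.mean(d_oa[sel] * d_ob[sel])
        # Sum of d_ab * d_oa scaled by the estimated variance.
        dfs[str(t)] = float(np.sum(d_ab[sel] * d_oa[sel]) / var_est)
    total = sum(dfs.values())
    share = {t: 100.0 * v / total for t, v in dfs.items()}
    return dfs, share
```

As a sanity check on the design, if every observation of a given type had the same scalar self-sensitivity k, so that d_ab = k d_ob and d_oa = (1 - k) d_ob, this estimate would return k times the number of observations of that type.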
Figure 5 shows the
Figure 6 shows the information content for the main data types in the 3D- and 4D-Var as a function of the observation time within the assimilation window. The regions represented here are the Northern and Southern Hemispheres. The results suggest that radiosonde and surface pressure data have the largest DFS near the middle of the assimilation window, as most of these data are available at the synoptic time. On the other hand, the satellite data are roughly evenly spread across the assimilation window but have the largest DFS at the end of the assimilation window. The DFS is expected to be larger at the end of the assimilation window because of the evolution of the covariance matrices within the window. The DFS comparisons as a function of time in the assimilation window indicate no significant difference between the 3D- and 4D-Var systems.
5. Conclusions
As described in this paper, a number of approaches have recently been used to evaluate the value of observations in data assimilation systems. The DFS is used in data assimilation applications to indicate the self-sensitivity of the analysis to different observation types. In this paper, a new method to assess the information content of observations on analyses is presented and applied to calculate the DFS of a complete set, or subsets, of observations in the MSC’s 3D- and 4D-Var systems. Based on the results of Desroziers et al. (2005), it is shown that the information content brought in by the data assimilation system can be estimated from observation departures from the analysis and the background state. The main point made in this paper is that even though the error statistics may not be consistent, the observation departures can still be used to measure the information content in observations associated with the a priori error statistics used in the assimilation. These a posteriori estimates were inspired by the results of Desroziers et al. (2005). It was shown here that by introducing the additional assumption that the observation error is uncorrelated, the method is easily applicable as a diagnostic of the results produced by any data assimilation system. One has to be aware that the observation departures are implicitly assumed to be unbiased, which may not hold in practice. A simplified 1D-Var system was used to test the validity of the method, and the results confirmed that the estimates obtained agree with those of the method proposed by Girard (1987). With error statistics differing from the true ones, it was shown that the a posteriori estimate of the observation error covariance is reasonably diagonal, which justifies the hypothesis made on the a posteriori estimate of the observation error covariances.
The DFS calculation was also applied to the MSC’s 3D- and 4D-Var systems. The partition by observation type allows the relative influence of different observing systems on the analysis to be diagnosed. The results suggest that radiosondes are the most influential data type of the global observing system, followed by brightness temperature data types (AMSU-A/B) and aircraft data. It is worth mentioning that the largest observation influence is provided by radiosonde and AMSU-B data. It has already been shown that the DFS is useful to evaluate the sensitivity of the analysis to different channels for a particular radiometer. The estimation of the a posteriori error standard deviations for satellite radiances indicates that the errors are generally overestimated in the MSC’s 3D- and 4D-Var schemes. It is, however, planned to investigate more carefully the a posteriori estimation of the observation error variance for radiometer channels sounding the high atmosphere.
The results shown in the paper indicate some deficiencies in the current estimate of the error statistics used in the assimilation. Future work will be needed to recalibrate the error statistics to reflect changes brought to the system. These diagnostics will be used to evaluate the information content of a complete set, or subsets, of observations in the 4D-Var scheme that was implemented operationally in 2008. Since then, the number of Advanced Television and Infrared Observation Satellite (TIROS) Operational Vertical Sounder (ATOVS) and AMV observations has been increased, and new observation types such as AIRS, SSM/I (clear-sky radiances), and QuikSCAT Seawinds are now assimilated. It is worth mentioning that, among its different applications, the DFS can be used to map the evolution of the model covariance matrix.
Acknowledgments
The authors thank Mr. Pierre Koclas of the Meteorological Service of Canada, who helped us use the observation database to build the diagnostics in observation space used in this study. Environment Canada provided partial financial support for this study in addition to the computing facilities and technical assistance for the use of their assimilation system. The authors would also like to thank Drs. Mark Buehner from Environment Canada and Carla Cardinali from ECMWF for their comments and discussions, which led to significant improvements to the paper. Comments from an anonymous reviewer also helped to improve the paper and are acknowledged.
This work has been funded in part by Grant 500-b of the Canadian Foundation for Climate and Atmospheric Sciences (CFCAS) for the project on the Impact of Observing Systems on Forecasting Extreme Weather in the short, medium and extended range: A Canadian contribution to THORPEX, with additional support from Discovery Grant 357091 of the Natural Sciences and Engineering Research Council (NSERC) of Canada.
REFERENCES
Baker, N. L., and R. Daley, 2000: Observation and background adjoint sensitivity in the adaptive observation-targeting problem. Quart. J. Roy. Meteor. Soc., 126, 1431–1454.
Cardinali, C., 2009: Monitoring the observation impact on the short-range forecast. Quart. J. Roy. Meteor. Soc., 135, 239–250.
Cardinali, C., S. Pezzulli, and E. Andersson, 2004: Influence-matrix diagnostic of a data assimilation system. Quart. J. Roy. Meteor. Soc., 130, 2767–2786.
Chapnik, B., G. Desroziers, F. Rabier, and O. Talagrand, 2006: Diagnosis and tuning of observational error in a quasi-operational data assimilation setting. Quart. J. Roy. Meteor. Soc., 132, 543–565.
Desroziers, G., and S. Ivanov, 2001: Diagnosis and adaptive tuning of observation-error parameters in a variational assimilation. Quart. J. Roy. Meteor. Soc., 127, 1433–1452.
Desroziers, G., L. Berre, B. Chapnik, and P. Poli, 2005: Diagnosis of observation, background and analysis-error statistics in observation space. Quart. J. Roy. Meteor. Soc., 131, 3385–3396.
Fisher, M., 2003: Estimation of entropy reduction and degrees of freedom for signal for large variational analysis systems. ECMWF Tech. Memo. 397, 18 pp.
Gauthier, P., C. Charette, L. Fillion, P. Koclas, and S. Laroche, 1999: Implementation of a 3D variational data assimilation system at the Canadian Meteorological Centre. Part I: The global analysis. Atmos.–Ocean, 37, 103–156.
Gauthier, P., M. Tanguay, S. Laroche, S. Pellerin, and J. Morneau, 2007: Extension of 3DVAR to 4DVAR: Implementation of 4DVAR at the Meteorological Service of Canada. Mon. Wea. Rev., 135, 2339–2354.
Gelaro, R., and Y. Zhu, 2009: Examination of observation impacts derived from observing system experiments (OSEs) and adjoint models. Tellus, 61A, 179–193.
Girard, D., 1987: A fast Monte Carlo cross-validation procedure for large least squares problems with noisy data. Tech. Rep. 687-M, IMAG, Grenoble, France, 22 pp.
Golub, G. H., and C. F. van Loan, 1996: Matrix Computations. The Johns Hopkins University Press, 728 pp.
Kelly, G., J. N. Thépaut, R. Buizza, and C. Cardinali, 2007: The value of observations. I: Data denial experiments for the Atlantic and the Pacific. Quart. J. Roy. Meteor. Soc., 133, 1803–1815.
Langland, R. H., and N. L. Baker, 2004: Estimation of observation impact using the NRL atmospheric variational data assimilation adjoint system. Tellus, 56A, 189–201.
Laroche, S., and R. Sarrazin, 2010a: Impact study with observations assimilated over North America and the North Pacific Ocean on the MSC global forecast system. Part I: Contribution of radiosonde, aircraft and satellite data. Atmos.–Ocean, 48, 10–25.
Laroche, S., and R. Sarrazin, 2010b: Impact study with observations assimilated over North America and the North Pacific Ocean on the MSC global forecast system. Part II: Sensitivity experiments. Atmos.–Ocean, 48, 26–38.
Purser, R. J., and H. L. Huang, 1993: Estimating effective data density in a satellite retrieval or an objective analysis. J. Appl. Meteor., 32, 1092–1107.
Rabier, F., N. Fourrié, D. Chafai, and P. Prunet, 2002: Channel selection methods for Infrared Atmospheric Sounding Interferometer radiances. Quart. J. Roy. Meteor. Soc., 128, 1011–1027.
Rodgers, C., 2000: Inverse Methods for Atmospheric Sounding: Theory and Practice. World Scientific Publishing, 256 pp.
Talagrand, O., 1999: A posteriori evaluation and verification of the analysis and assimilation algorithms. Proc. Workshop on Diagnosis of Data Assimilation Systems, Reading, United Kingdom, ECMWF, 17–28.
Zhu, Y., and R. Gelaro, 2008: Observation sensitivity calculations using the adjoint of the Gridpoint Statistical Interpolation (GSI) analysis system. Mon. Wea. Rev., 136, 335–351.
Off-diagonal terms in the observation error covariance as function of distance rij between points i and j.
Citation: Monthly Weather Review 139, 3; 10.1175/2010MWR3404.1
The total DFS 2-month average (January–February 2007) in the MSC 3D- and 4D-Var analysis over the 4 regions: the entire globe, the Northern Hemisphere (20°–90°N), the tropics (20°S–20°N), and the Southern Hemisphere (90°–20°S).
DFS 2-month average (January–February 2007) in the MSC 3D- and 4D-Var analysis for the 8 data types over the globe.
Observation influence in the MSC 3D- and 4D-Var analysis for the 8 data types over the globe.
DFS 2-month average over the globe in the MSC 3D- and 4D-Var analysis for each channel of (a) AMSU-A and (b) AMSU-B.
DFS for the main data types in the 3D- and 4D-Var systems as a function of observation time relative to the assimilation window. The observing platforms are color coded and given in the legend.
DFS estimate values as a function of background correlation length scale Lc. DFSANALYTIC as calculated from the prescribed statistics; DFSGIRARD as computed with Girard’s method; and
As in Table 1, but for the experiment with
As in Table 1, but for the experiment with both the observation and background error variances underestimated (σ_o² = 2.25 and σ_b² = 0.25, respectively).
List of observations assimilated in the 3D- and 4D-Var systems of Environment Canada during the winter of 2006–07. Variables retain their standard definitions.
Comparison between estimated values of χ² and the number of observations p in 3D- and 4D-Var, averaged over a 2-month winter period (1 Jan–28 Feb 2007).