The Center for Weather Forecast and Climate Studies [Centro de Previsão de Tempo e Estudos Climáticos (CPTEC)] at the Brazilian National Institute for Space Research [Instituto Nacional de Pesquisas Espaciais (INPE)] has recently put into operation a three-dimensional variational data assimilation (3DVAR) scheme based on the Gridpoint Statistical Interpolation analysis system (GSI). Implementation of the GSI system within the atmospheric global circulation model from CPTEC/INPE (AGCM-CPTEC/INPE) is hereafter referred to as the Global 3DVAR (G3DVAR) system. The results of an observing system experiment (OSE) measuring the impacts of radiosonde, satellite radiance, and GPS radio occultation (RO) data on the new G3DVAR system are presented here. The observational impact of each of these platforms was evaluated by measuring the degradation of the geopotential height anomaly correlation and the amplification of the RMSE of the wind. Losing the radiosonde, GPS RO, and satellite radiance data in the OSE resulted in negative impacts on the geopotential height anomaly correlations globally. Nevertheless, the strongest impacts were found over the Southern Hemisphere and South America when satellite radiance data were withheld from the data assimilation system.
The Center for Weather Forecast and Climate Studies [Centro de Previsão de Tempo e Estudos Climáticos (CPTEC)] at the Brazilian National Institute for Space Research [Instituto Nacional de Pesquisas Espaciais (INPE)] recently implemented the Gridpoint Statistical Interpolation analysis system (GSI; Wu et al. 2002; Kleist et al. 2009) [with a three-dimensional variational data assimilation (3DVAR) approach] in the CPTEC/INPE atmospheric global circulation model (AGCM-CPTEC/INPE). This implementation of the GSI system, known as the Global 3DVAR (G3DVAR) system, has been operational since January 2013 and initializes AGCM-CPTEC/INPE forecasts on a global grid every 6 h. This implementation of the GSI system has replaced the Physical-space Statistical Analysis System (PSAS; Cohn et al. 1998), which was previously used to initialize the AGCM-CPTEC/INPE. The transition to the GSI system has increased the maximum number of observations we can assimilate into our model and has provided the ability to assimilate satellite radiance data.
Since numerical weather prediction (NWP) is an initial value problem, the data assimilation process used to initialize forecasting models can have a significant impact on the quality of forecasts. Data assimilation is the process of combining observed data with short-range forecasts, therein considering the errors in the observations and errors associated with the numerical model, to generate an optimal estimate of the current state of the atmosphere (Talagrand 1997; Tsuyuki and Miyoshi 2007; Herdies et al. 2008). The information in the observing systems (i.e., the quantity and quality of the observations) plays a key role in the data assimilation process; it impacts the resulting analysis and consequently affects the quality of the forecasts. The resulting forecasts should benefit from a careful evaluation of how the different observing systems impact the NWP system since the inclusion of certain observations may degrade the forecasts. Furthermore, knowledge of which datasets provide better estimates of weather conditions can be used to optimize data assimilation systems by improving the process of selecting observations that contribute positively to the analysis.
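The idea of weighting a short-range forecast against an observation according to their respective errors can be illustrated with a single variable. The following is a minimal sketch and not part of the G3DVAR system; the function name, the temperatures, and the error variances are hypothetical values chosen only for illustration.

```python
def scalar_analysis(background, observation, var_background, var_observation):
    """Minimum-variance combination of one background value and one observation.

    All inputs are illustrative; a real system works with millions of
    variables and full error covariance matrices.
    """
    # Weight given to the observation: large when the background is uncertain.
    gain = var_background / (var_background + var_observation)
    analysis = background + gain * (observation - background)
    # The analysis error variance is never larger than the background variance.
    var_analysis = (1.0 - gain) * var_background
    return analysis, var_analysis


# Hypothetical 500-hPa temperature (K) with equal background and observation errors.
analysis, var_analysis = scalar_analysis(
    background=280.0, observation=282.0, var_background=1.0, var_observation=1.0
)
```

With equal error variances the analysis falls halfway between the two inputs; making the observation error larger pulls the analysis back toward the background.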
The observing system experiment (OSE) technique is a popular method for determining the impacts of observing platforms on NWP forecasts. Following Lupu et al. (2011), one or more observing systems are excluded from the data assimilation process to assess the impact of the inclusion or exclusion of a specific observation platform on the quality of the forecast of the model. According to Atlas (2002), experiments of this type provide a quantitative assessment of each data source used in the data assimilation system. This type of information can be used to improve the utility of different observational datasets in the data assimilation system and to determine the relative importance of each type of data.
Several OSE-based studies have demonstrated the importance of satellite data for the Southern Hemisphere. English et al. (2013) evaluated the impact of satellite data on the ECMWF forecast system on a global scale. The authors found a large gap in forecasting ability between the Northern and Southern Hemispheres in the years around 2000 and showed that this gap subsequently narrowed dramatically. Their study suggested that the main reason for the gap between the Northern and Southern Hemispheres was the low availability of in situ observations in the Southern Hemisphere. It is reasonable to attribute the closing of the gap after 2000 to improved satellite observations (English et al. 2013). McNally (2012) confirmed that the availability of observations from polar satellites had a clear positive impact on the forecast accuracy and improved the predictability in the Southern Hemisphere by 30%. Recently, Cucurull and Anthes (2014) conducted a study that compared the impacts of infrared, microwave, and radio occultation (RO) satellite observations on NCEP’s operational global forecast model during March 2013. The authors concluded that satellite data impacted the predictability differently in the two hemispheres: satellite observations had a much stronger impact on forecasting ability in the Southern Hemisphere than in the Northern Hemisphere. Cucurull and Anthes (2014) also found that the largest improvement in forecasting ability resulted from the assimilation of all three types of data. Additionally, the assimilation of one type of satellite observation may help improve the assimilation of other types of observations. Bonavita (2014) and Bauer et al. (2014) both showed that the anchoring effect of assimilated GPS RO data improved the bias correction process needed for the assimilation of radiance observations.
Although an OSE follows similar standard procedures at different operational centers, each data assimilation system (i.e., a numerical model plus a data assimilation algorithm) shows a unique sensitivity to the observational datasets selected. Therefore, it is appropriate to evaluate the impact of the observations after the data assimilation or NWP system in an operational center undergoes major changes. The current study proposes an investigation of the relative impacts of different observing systems on the CPTEC operational model to add these results to the international pool of model evaluations. Such information is critical for understanding how a numerical weather prediction evolves daily. It also helps us to establish a baseline for comparison with other operational centers. Consequently, an OSE has been conducted to complement the implementation of the G3DVAR system at CPTEC/INPE. In this paper, we describe the impacts of data-denial experiments using satellite radiances, GPS RO data, and information from radiosondes within the G3DVAR system framework. Section 2 outlines the methods used in this study, including details of the numerical model and the data assimilation system, the experimental setup, and the statistical evaluation techniques used. Section 3 presents the results and a discussion of them, and the conclusions are presented in section 4.
a. Atmospheric general circulation model
The AGCM-CPTEC/INPE runs at a resolution of TQ0299L064, that is, a spectral triangular truncation at zonal wavenumber 299, corresponding to a horizontal resolution of approximately 44 km near the equator, and 64 vertical layers in sigma coordinates. This model is currently used for weather forecasting at CPTEC/INPE. The AGCM-CPTEC/INPE is based on the COLA AGCM (Kinter et al. 1997) with various improvements in its physical parameterizations, dynamic core, code structure, and parallelism (Cavalcanti et al. 2002; Panetta et al. 2006; Maciel 2009; Kubota 2012). The physical parameterization schemes of this model include the microphysics of Rasch and Kristjansson (1998), the CLIRAD shortwave scheme developed by Chou and Suarez (1999) and modified by Tarasova and Fomin (2007), the longwave scheme of Harshvardhan et al. (1987), the vertical diffusion scheme of Mellor and Yamada (1982) with the modifications of Kubota (2012), the simplified Simple Biosphere (SSiB) surface scheme developed by Xue et al. (1991), the gravity wave scheme of Alpert et al. (1988), the cloud fraction scheme of Slingo (1987), the shallow diffusion scheme of Tiedtke (1983), and the scheme of Grell and Dévényi (2002) with Grell closure (GD-Grell). This model also has the ability to simulate the main characteristics of the climates of the Southern and Northern Hemispheres (Cavalcanti et al. 2002).
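The quoted spacing of about 44 km can be checked with a short calculation. The sketch below assumes that the "Q" in TQ0299 denotes a quadratic Gaussian grid, for which at least 3N + 1 longitudes are needed at truncation N; the Earth radius and the function name are likewise assumptions made for illustration, and the operational grid may use a slightly larger number of longitudes.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def quadratic_grid_spacing_km(truncation, n_lon=None):
    """Approximate equatorial grid spacing for a triangular spectral truncation.

    Assumes a quadratic transform grid, which needs at least 3N + 1
    longitudes; pass n_lon to use the actual grid size if it is known.
    """
    if n_lon is None:
        n_lon = 3 * truncation + 1
    equator_length_km = 2.0 * math.pi * EARTH_RADIUS_KM
    return equator_length_km / n_lon


spacing_km = quadratic_grid_spacing_km(299)
```

Under these assumptions the spacing comes out between 44 and 45 km, consistent with the resolution stated above.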
b. Data assimilation system
The GSI has been developed as the NCEP operational global data assimilation system using recursive filters in gridpoint space (Wu et al. 2002). This system is able to assimilate a wide range of observations including synoptic, satellite, and radar data. The GSI-based analysis scheme currently employed at CPTEC/INPE uses a 6-h cycle on a synoptic time scale (Fisher and Andersson 2001). The state variables from the global model fields that are updated by the 3DVAR scheme are the virtual temperature, vorticity, divergence, specific humidity, ozone concentration, liquid water tracers, and fields from the land and ocean surfaces. The control vector of the minimization algorithm of the GSI system is composed of the streamfunction, unbalanced velocity potential, unbalanced temperature, unbalanced surface pressure, pseudo–relative humidity, ozone mixing ratio, and total cloud water condensate. Once the GSI system has completed the minimization process, the updated fields are passed back to the AGCM as the state variables listed above.
In this study, the GSI system was configured to use only one outer loop and one inner loop. The minimization algorithm of the GSI system iterates until it reaches the convergence condition or the maximum number of iterations, which, in our system, was set to 100 iterations. This stop criterion was found to be computationally feasible and to produce results of reasonable quality. No nonlinear quality control was applied during the minimization process. Small weighting factors (0.005) were used to reduce the number of negative water vapor and supersaturation points in the analysis step; however, further tests need to be performed to identify the optimum values for this system.
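The stopping rule described above (convergence or a maximum of 100 iterations) can be sketched on a toy variational problem. The example below minimizes a standard quadratic 3DVAR cost function with a textbook conjugate gradient loop; it is not the GSI minimizer, and the diagonal error variances, the observation operator `h`, the tolerance, and all numerical values are hypothetical.

```python
def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def minimize_3dvar(xb, b_var, y, r_var, h, max_iter=100, tol=1e-10):
    """Minimize J(x) = 0.5 (x-xb)^T B^-1 (x-xb) + 0.5 (Hx-y)^T R^-1 (Hx-y).

    B and R are taken as diagonal (variances b_var, r_var) for simplicity.
    Stops when the gradient norm falls below tol or after max_iter iterations.
    """
    n, m = len(xb), len(y)
    # Hessian of J: B^-1 + H^T R^-1 H.
    hessian = [[0.0] * n for _ in range(n)]
    for i in range(n):
        hessian[i][i] = 1.0 / b_var[i]
    for k in range(m):
        for i in range(n):
            for j in range(n):
                hessian[i][j] += h[k][i] * h[k][j] / r_var[k]
    # Start from the background; the residual is minus the gradient of J there.
    x = list(xb)
    innovation = [yk - hxk for yk, hxk in zip(y, matvec(h, xb))]
    r = [sum(h[k][i] * innovation[k] / r_var[k] for k in range(m)) for i in range(n)]
    p = list(r)
    rs_old = dot(r, r)
    iterations = 0
    while iterations < max_iter and rs_old ** 0.5 > tol:
        hp = matvec(hessian, p)
        alpha = rs_old / dot(p, hp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * hpi for ri, hpi in zip(r, hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
        iterations += 1
    return x, iterations


# Two grid points; one observation of their average (all values hypothetical).
x_analysis, n_iter = minimize_3dvar(
    xb=[280.0, 284.0], b_var=[1.0, 1.0], y=[283.0], r_var=[1.0], h=[[0.5, 0.5]]
)
```

In this toy case a single observation of the two-point average moves both points by the same increment, and the loop exits on the convergence test long before the iteration cap is reached. In a full-size system the cap matters because each iteration is expensive.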
The default background error (BE) covariance matrix that was distributed with the GSI system was used as is in the G3DVAR system. Although tuning the BE covariance matrix for the AGCM would be optimal, the authors assumed that any differences between the default and tuned BE covariance matrices would be minimal because NCEP’s model and the AGCM are similar in many ways, including their spectral natures. The authors believed that the lack of tuning would have a minimal impact on the results presented here because the same covariance matrix was used for all the experiments. The length scales can also be tuned by the user through multiplicative factors applied to the fixed values in the BE matrix. The scale factor for the vertical correlation lengths applied to the BE matrix was 0.7, whereas three horizontal scales were used, with default scaling factors of 1.7, 0.8, and 0.5 and default relative weights of 0.45, 0.3, and 0.25, respectively. Observations, which were assimilated within a window of ±3 h from the analysis time, were obtained from WMO Binary Universal Form for Representation of Meteorological Data (BUFR) files processed at NCEP (called prepBUFR files), and errors were assigned to each type of observation. Additionally, the errors assigned to the satellite observations varied according to the sensors, channels, and sky conditions (clear or cloudy radiance). The satellite radiance data were separated into groups with different thinning mesh values that vary from 145 to 180 km (no thinning was applied to the conventional data). The GSI system was also able to minimize the bias in the radiative transfer model by correcting the slowly evolving changes in the satellite scan angles and the bias that varied with the atmospheric state; these are often referred to as an “angle bias correction” and an “airmass bias correction,” respectively.
In this study, 1 month of spinup time was necessary for the coefficients used to correct the satellite biases to converge. Furthermore, no direct assimilation of GPS RO refractivity data or bending angle was performed; only the retrieved refractivity data were assimilated. Therefore, the GPS RO data were considered to be conventional data.
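The ±3-h assimilation window and the radiance thinning mesh described above can be pictured with a simplified selection routine. This is only a sketch under stated assumptions: the record layout (`id`, `time`, `lat`, `lon`), the latitude-longitude boxes, and the rule of keeping the observation closest in time to the analysis are hypothetical simplifications, and the thinning actually performed inside the GSI system is more elaborate.

```python
from datetime import datetime, timedelta

KM_PER_DEGREE = 111.2  # approximate length of one degree of latitude


def select_observations(observations, analysis_time, half_window_h=3.0, mesh_km=None):
    """Keep observations inside the time window, optionally thinned to one per box.

    Boxes are square in degrees, so their east-west size in km shrinks toward
    the poles; this is a simplification for illustration only.
    """
    window = timedelta(hours=half_window_h)
    kept = [o for o in observations if abs(o["time"] - analysis_time) <= window]
    if mesh_km is None:
        return kept  # no thinning, as for the conventional data
    box_deg = mesh_km / KM_PER_DEGREE
    best = {}
    for o in kept:
        key = (int(o["lat"] // box_deg), int(o["lon"] // box_deg))
        offset = abs(o["time"] - analysis_time)
        # Within each box, prefer the observation closest to the analysis time.
        if key not in best or offset < abs(best[key]["time"] - analysis_time):
            best[key] = o
    return list(best.values())


analysis_time = datetime(2013, 1, 1, 12)
observations = [
    {"id": "A", "time": datetime(2013, 1, 1, 10, 0), "lat": -10.0, "lon": -50.0},
    {"id": "B", "time": datetime(2013, 1, 1, 11, 30), "lat": -10.2, "lon": -50.3},
    {"id": "C", "time": datetime(2013, 1, 1, 16, 0), "lat": -10.1, "lon": -50.1},
]
in_window = select_observations(observations, analysis_time)
thinned = select_observations(observations, analysis_time, mesh_km=145.0)
```

Observation C falls outside the window and is dropped in both calls; A and B share a 145-km box, so the thinned set keeps only the one closer to the analysis time.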
An OSE technique was employed to estimate the impacts of the different observing platforms on the G3DVAR system following Atlas (2002), Andreoli et al. (2008), and Jung (2013). The experiments performed using this technique were as follows:
CONTROL, where all observational data available at CPTEC were assimilated;
NO_SAT, where all available data except the satellite radiance data, including the Advanced Microwave Sounding Unit (AMSU-A), the Microwave Humidity Sounder (MHS), the High Resolution Infrared Radiation Sounder (HIRS-4), the Infrared Atmospheric Sounding Interferometer (IASI), and the Atmospheric Infrared Sounder (AIRS), were assimilated;
NO_RAD, where all available data except the radiosonde data were assimilated; and
NO_GPS, where all available data except the GPS RO data were assimilated.
A summary of the experiments is presented in Table 1, which lists the different observing systems used in each experiment.
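The data-denial design listed above can be written down as a small configuration, which makes explicit that each experiment differs from CONTROL by exactly one withheld platform. The platform labels below, including the catch-all `other_conventional`, are hypothetical names introduced for this sketch and are not taken from Table 1.

```python
# Platform labels are illustrative; "other_conventional" stands for all
# remaining data types that are assimilated in every experiment.
ALL_PLATFORMS = frozenset(
    {"satellite_radiance", "radiosonde", "gps_ro", "other_conventional"}
)

# Each experiment is defined by the set of platforms it withholds.
WITHHELD = {
    "CONTROL": frozenset(),
    "NO_SAT": frozenset({"satellite_radiance"}),
    "NO_RAD": frozenset({"radiosonde"}),
    "NO_GPS": frozenset({"gps_ro"}),
}


def assimilated_platforms(experiment):
    """Return the platforms that remain in the assimilation for an experiment."""
    return ALL_PLATFORMS - WITHHELD[experiment]


no_rad_platforms = assimilated_platforms("NO_RAD")
```

Defining the experiments by what they remove, instead of by what they keep, mirrors the OSE logic: any difference from CONTROL is attributed to the withheld platform.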
The simulations were performed on a Cray SX6 supercomputer at CPTEC/INPE for a 2-month period during the austral summer from December 2012 through January 2013; forecasts were run out to lead times of 120 h. The first month of each experiment was discarded to minimize any possible shock due to the removal of a key component of the global observing system. The immediate and prolonged removal of key observation systems can cause instabilities in forecast metrics as the model adjusts to the loss of data. Therefore, only January 2013 was used in the evaluation to ensure that the model had reached a steady state after the data were removed. To ensure consistency, the satellite bias was corrected independently in each experiment.
Degradation in forecasting ability due to the removal of an observing system is unlikely to be uniform across the globe; therefore, statistical metrics were calculated for different regions: the Southern Hemisphere (SH), consisting of the region between 20° and 80°S; the Northern Hemisphere (NH), consisting of areas between 20° and 80°N; the tropical region (TR), consisting of the area between 20°N and 20°S; South America and the adjoining oceans (SAAO), covering the area between 0° and 120°W and 60°S and 12°N; and finally, the entire globe (G). The results of each experiment were compared with those of the CONTROL experiment.
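The verification regions defined above translate directly into latitude and longitude bounds. The sketch below assumes longitudes in the range from -180° to 180° with west negative; the dictionary layout and the function name are illustrative choices, and only the bounds themselves come from the text.

```python
# Bounds follow the region definitions in the text; longitudes are assumed
# to run from -180 to 180 degrees with west negative.
REGIONS = {
    "G": {"lat": (-90.0, 90.0), "lon": (-180.0, 180.0)},
    "NH": {"lat": (20.0, 80.0), "lon": (-180.0, 180.0)},
    "SH": {"lat": (-80.0, -20.0), "lon": (-180.0, 180.0)},
    "TR": {"lat": (-20.0, 20.0), "lon": (-180.0, 180.0)},
    "SAAO": {"lat": (-60.0, 12.0), "lon": (-120.0, 0.0)},
}


def regions_containing(lat, lon):
    """List the verification regions that contain a grid point."""
    return [
        name
        for name, box in REGIONS.items()
        if box["lat"][0] <= lat <= box["lat"][1]
        and box["lon"][0] <= lon <= box["lon"][1]
    ]


# A point in central Brazil (coordinates chosen only as an example).
example_regions = regions_containing(-15.8, -47.9)
```

Note that the regions overlap: SAAO spans parts of both TR and SH, so a single grid point can contribute to several regional scores.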
3. Results and discussion
When the number of observations assimilated per experiment was computed, the CONTROL experiment assimilated more conventional and nonconventional data than did the other experiments. There was a decrease in the total amount of radiance data assimilated in the NO_GPS and NO_RAD experiments, which confirmed the findings of Bonavita (2014). Bonavita (2014) concluded that the GPS RO data served as anchoring observations for correcting the radiance bias, which allowed more radiance data to be assimilated. It is likely that, in the G3DVAR system, the radiosonde and GPS data both serve as anchoring observations, which allows more radiance observations to be assimilated.
Figure 1 shows the anomaly correlation of 500-hPa geopotential height over the globe. The bottom portion of Fig. 1 shows the results of applying a Student’s t test to the reduction in the geopotential height anomaly correlation for each simulation. The information was significant at the 95% confidence level when the curves were outside of their corresponding boxes. The removal of all three data platforms reduced the anomaly correlation, with the greatest degradation occurring when the satellite radiance data were removed. Globally, this degradation was statistically significant in the NO_SAT experiment starting at lead times of 12 h, in the NO_RAD experiment starting at 36 h, and in the NO_GPS experiment starting at 84 h.
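For reference, an anomaly correlation of the kind plotted in Fig. 1 can be computed as follows. The text does not state which form of the score was used, so this sketch assumes the common uncentered form with optional area weights (for example, the cosine of latitude); the function name and the sample heights are hypothetical.

```python
import math


def anomaly_correlation(forecast, analysis, climatology, weights=None):
    """Uncentered anomaly correlation between a forecast and a verifying analysis.

    Anomalies are departures from climatology; weights can carry area
    weighting such as cos(latitude). Uniform weights are used by default.
    """
    if weights is None:
        weights = [1.0] * len(forecast)
    f_anom = [f - c for f, c in zip(forecast, climatology)]
    a_anom = [a - c for a, c in zip(analysis, climatology)]
    covariance = sum(w * fa * aa for w, fa, aa in zip(weights, f_anom, a_anom))
    f_norm = sum(w * fa * fa for w, fa in zip(weights, f_anom))
    a_norm = sum(w * aa * aa for w, aa in zip(weights, a_anom))
    return covariance / math.sqrt(f_norm * a_norm)


# Hypothetical 500-hPa heights (m) at three grid points.
climatology = [5500.0, 5600.0, 5700.0]
analysis = [5510.0, 5610.0, 5690.0]
forecast = [5510.0, 5600.0, 5690.0]
ac = anomaly_correlation(forecast, analysis, climatology)
```

A perfect forecast yields a score of 1, and the score decreases as the forecast anomalies lose their resemblance to the analyzed anomalies, which is the degradation tracked in the data-denial experiments.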
The impacts of the nonuniform nature of the global observing system are shown in Figs. 2 and 3. These figures are equivalent to Fig. 1 for the Northern Hemisphere and the Southern Hemisphere, respectively. These figures confirm what many previous studies have found: it is more difficult to forecast with acceptable skill in the Southern Hemisphere than in the Northern Hemisphere. Additionally, withholding satellite radiance data had a much greater impact on the ability to forecast in the SH than the NH and a greater impact than the other data types tested on the ability to forecast in the SH. These results are consistent with the findings of Bouttier and Kelly (2001), Kelly et al. (2007), and Jung (2013). The differences in the impact of satellite data on the SH and the NH are probably due to differences in the availability of data between the two hemispheres. The SH is mostly covered by oceans and lacks a significant number of radiosonde and synoptic stations that the NH possesses. Satellite observations help fill these SH data voids, and the loss of these observations causes forecasts to degrade significantly more so than under the loss of data of other types.
In the SH (Fig. 3), the error caused by withholding data in the NO_GPS and NO_RAD experiments only became statistically significant at 36 h of forecast lead time; for comparison, in the NO_SAT experiment, the error became statistically significant at 12 h of forecast lead time. Despite the difference in hours between the NO_GPS and NO_RAD experiments and the NO_SAT experiment, this finding is important because of the reduced number of observations in the SH, where the lack of any information could result in a decrease in the ability to model that region. In the NH (Fig. 2), the absence of GPS data resulted in slightly better skill than in the CONTROL experiment between 48 and 72 h. In this study, GPS RO was the only observing system whose inclusion degraded the forecasts over some range of lead times. Furthermore, in both hemispheres, satellite radiance data had a positive impact, especially over the SH.
As in the Southern Hemisphere, the analysis of the geopotential height anomaly correlation at 500 hPa over the SAAO region showed that satellite radiance data had a significant impact on the forecasting ability. It was also apparent that the limited amount of conventional data over that region helped the data assimilation system mitigate the loss of satellite radiance data compared with the SH. The NO_GPS and NO_RAD experiments also produced degraded forecasts in the SAAO region; however, they exhibited smaller impacts. As seen in Fig. 4, the loss of satellite radiance data began to significantly degrade the model’s forecasting ability after 12 h of forecast lead time, which highlighted how critical the radiance information was for NWP over the SAAO region. Nevertheless, the impact of the NO_RAD and NO_GPS experiments started at forecast hour 48. The difference between the time at which the forecast began to degrade in the NO_RAD and NO_GPS experiments compared with the NO_SAT experiment may be due to the very limited amount of conventional information available for that region.
The RMSEs of the zonal wind component over the tropical region at 850 and 250 hPa are shown in Figs. 5 and 6, respectively. All the experiments were compared with the CONTROL experiment, which was considered the “truth.” The bottom panels show the statistical significance of the results computed using a Student’s t test. Although there was no direct relationship with the wind (in contrast to radiosondes), the loss of radiance data had a greater impact in that region. Nevertheless, this result was only statistically significant in the lower troposphere (at 850 hPa), whereas the GPS and radiosonde data had significant impacts at higher levels.
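The two statistics used here, the RMSE and a Student's t test on the differences between experiments, can be sketched as follows. The text does not specify the exact form of the test, so a paired test on daily score differences is assumed; the function names and the sample values are hypothetical.

```python
import math
import statistics


def rmse(forecast, reference):
    """Root-mean-square error of a forecast against a reference field."""
    squared = [(f - r) ** 2 for f, r in zip(forecast, reference)]
    return math.sqrt(sum(squared) / len(squared))


def paired_t_statistic(scores_experiment, scores_control):
    """t statistic for the mean of paired differences (experiment minus control).

    The result is compared with a tabulated critical value for n - 1 degrees
    of freedom; for 30 degrees of freedom the two-sided 95% value is about 2.04.
    """
    diffs = [e - c for e, c in zip(scores_experiment, scores_control)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error


# Hypothetical daily RMSEs (m/s) of the 850-hPa zonal wind for five forecasts.
control_rmse = [2.0, 2.1, 1.9, 2.2, 2.0]
denial_rmse = [2.1, 2.3, 2.2, 2.4, 2.2]
t_value = paired_t_statistic(denial_rmse, control_rmse)
example_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Pairing the scores by forecast date removes the day-to-day variability common to both experiments, so even a small but consistent increase in RMSE can be detected as significant.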
The RMSEs of the meridional wind are shown in Figs. 7 and 8 at 850 and 250 hPa, respectively. As in the analysis of the zonal component, the loss of the radiance data had a greater impact on the error, but this was only statistically significant in the lower troposphere. Although the loss of GPS and radiosonde data resulted in a smaller RMSE compared with the loss of satellite radiance data, both were statistically significant in the upper and lower troposphere. Therefore, the three observing systems improved the forecasts. Furthermore, we observe that the largest errors were found at high levels for both components of the wind, although the error was not statistically significant in all cases. Although radiosondes are relatively sparse in the tropics, one must keep in mind that they measure the wind directly. Despite the reduced number of radiosondes, the influence of observations from remote regions of the globe can spread and contribute to the impact on the tropics.
Although information from only two levels, the upper and lower troposphere (250 and 850 hPa, respectively), was available for this study, one can consider the relative amount of information from each source on each level (i.e., radiance versus radiosonde data) to infer that this result may be related to the lack of conventional information at high levels over the tropics. Nevertheless, further investigation is required to narrow such a broad conclusion since one requires information from other levels and a better understanding of the role of other observations in the lower and upper troposphere.
Three experiments were conducted using the new G3DVAR assimilation scheme implemented for the CPTEC/INPE global model to assess the impact of satellite radiance, GPS RO, and radiosonde data at forecast lead times between 0 and 120 h. These experiments were conducted during January 2013 and evaluated over five regions: the globe, the Northern Hemisphere, the Southern Hemisphere, the tropics, and South America and the adjacent oceans.
The results of the G3DVAR experiments confirm what has been found in previous studies using other data assimilation systems: satellite data are extremely important for maintaining the ability to forecast in the Southern Hemisphere. The loss of an observing platform has less impact on the ability to forecast in the Northern Hemisphere because it is more data dense; that is, neighboring observations are able to provide enough information to limit the degradation in ability.
This study shows that, in the Southern Hemisphere, the loss of satellite radiance data starts to degrade the forecasting ability significantly from 12 h of forecast lead time. Consequently, the G3DVAR system depends strongly on satellite observations in the SH and in the SAAO region, whereas in the NH, the model’s ability to forecast is maintained for up to 72 h in the absence of radiance data.
In the tropics, the loss of radiance data impacts the lower troposphere most significantly; at higher levels (250 hPa), the impact is not as significant. Nonetheless, the greatest forecast degradation due to the loss of GPS and radiosonde data was found at 250 hPa. Additionally, similar analyses (not shown) were conducted in the intermediate levels, between 850 and 250 hPa, confirming that the impact due to the loss of radiance data decreases with increasing altitude.
The observing platforms examined in this study were shown to have a significant global impact on the G3DVAR analysis and to be particularly critical for maintaining the AGCM’s ability to forecast over South America, CPTEC’s main region of interest.
The authors thank the Group on Data Assimilation Development (GDAD) for its support, Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) for the financial support of HBdA, and Arlindo da Silva for his contributions.