## Abstract

The Global Positioning System (GPS) radio occultation (RO) technique is becoming a robust global observing system. GPS RO refractivity is typically modeled at the ray perigee point by a “local refractivity operator” in a data assimilation system. Such modeling does not take into account the horizontal gradients that affect the GPS RO refractivity. A new observable (linear excess phase), defined as an integral of the refractivity along some fixed ray path within the model domain, has been developed in earlier studies to account for the effect of horizontal gradients.

In this study, the error statistics of both observables (refractivity and linear excess phase) are estimated using the GPS RO data from the Formosa Satellite 3–Constellation Observing System for Meteorology, Ionosphere and Climate (FORMOSAT-3/COSMIC) mission. The National Meteorological Center (NMC) method, which is based on lagged forecast differences, is applied for evaluation of the model forecast errors that are used for estimation of the GPS RO observational errors. Also used are Weather Research and Forecasting (WRF) model forecasts in the East Asia region at 45-km resolution for one winter month (mid-January to mid-February) and one summer month (mid-August to mid-September) in 2007.

Fractional standard deviations of the observational errors of refractivity and linear excess phase both show an approximately linear decrease with height in the troposphere and a slight increase above the tropopause; their maximum magnitude is about 2.2% (2.5%) for refractivity and 1.1% (1.3%) for linear excess phase in the lowest 2 km for the winter (summer) month. An increase of both fractional observational errors near the surface in the summer month is attributed mainly to a larger amount of water vapor. The results indicate that the fractional observational error of refractivity is about twice as large as that of linear excess phase, regardless of season. The observational errors of both linear excess phase and refractivity are much less latitude dependent for summer than for winter. This difference is attributed to larger latitudinal variations of the specific humidity in winter.

## 1. Introduction

The Global Positioning System (GPS) radio occultation (RO) technique has emerged as a robust global observing system that provides valuable data to support operational numerical weather prediction (e.g., Healy and Thépaut 2006; Cucurull et al. 2007; Anthes et al. 2008; Cucurull and Derber 2008; Healy 2008). Several observables, retrieved from GPS RO measurements, which range from raw excess phases to retrieved moisture and/or temperature profiles, can be used in data analysis and assimilation (Kuo et al. 2000). Such an observable as the bending angle, defined as the angle between the incoming and outgoing directions of a GPS-transmitted electromagnetic ray (Kursinski et al. 2000), can be retrieved under the assumption of spherical symmetry of refractivity. The refractivity can be retrieved from the bending angle (using Abel inversion) and assigned to the occultation (ray tangent) point. In the past, many studies on the assimilation of Abel-retrieved refractivity and bending angle data demonstrated positive impacts on regional as well as global weather prediction (Kuo et al. 1998; Zou et al. 1999, 2000, 2004; Liu and Zou 2003; Huang et al. 2005; Healy et al. 2005; Healy and Thépaut 2006; Cucurull et al. 2006, 2007; Healy 2008; Poli et al. 2009; Chen et al. 2009; Huang et al. 2010).

In most of the regional data assimilation studies, a local refractivity observation operator is used to represent the GPS RO Abel-retrieved refractivity as local refractivity. This introduces certain representativeness errors because the Abel-retrieved refractivity derived from GPS RO observation is influenced by the atmosphere along the ray path. In a grossly simplified approximation, the Abel-retrieved refractivity is similar to a density-weighted average over the horizontal path of about 300 km, centered at the ray perigee point. There can be significant inhomogeneity in the horizontal along the 300-km averaging path. To reduce this representativeness error, Sokolovskiy et al. (2005a) suggested a new observable, a linear excess phase, which is defined as the integrated amount of refractivity along a fixed (e.g., straight line) ray path, to account for the nonlocal nature of the Abel-retrieved refractivity. This follows from the smaller apparent errors normalized by the standard deviations of the weather-induced variations of the excess phase than those of the refractivity, as was shown by Sokolovskiy et al. (2005b). Because the nonlocal observation operator can achieve better accuracy at reasonable computational expenses, it has been implemented in the data assimilation system of the Weather Research and Forecasting (WRF; e.g., Liu et al. 2008; Chen et al. 2009) and National Centers for Environmental Prediction (NCEP) Gridpoint Statistical Interpolation (e.g., Ma et al. 2009).

The Formosa Satellite 3–Constellation Observing System for Meteorology, Ionosphere and Climate (FORMOSAT-3/COSMIC) mission is now providing substantially more RO soundings (around 2000 soundings per day) with better penetration and smaller measurement errors in the lower troposphere (with the use of the open-loop tracking technique) than other relevant missions such as Challenging Minisatellite Payload (CHAMP), which uses a phase-locked loop tracking technique [see Anthes et al. (2008)]. To appropriately assimilate the FORMOSAT-3/COSMIC GPS RO data, using either the local refractivity operator or the nonlocal linear excess phase operator, one needs the observational error statistics based on these two operators.

In this study we estimate observational errors of refractivity and linear excess phase based on FORMOSAT-3/COSMIC data over East Asia and the western Pacific for one winter and one summer month (referring to the Northern Hemisphere). We investigate latitudinal dependence of both refractivity and linear excess phase observational errors by stratifying the observations into different latitudinal bins. A brief introduction of the methodologies (including the observational error, local and nonlocal operators, and the WRF model) and the experiment design is given in section 2. The estimated observational errors are discussed in section 3. Section 4 concludes this study.

## 2. Methodologies and experiment design

### a. Apparent errors, observational errors, and forecast errors

Observational errors include measurement errors and representativeness errors (Daley 1991; Kuo et al. 2004; Sokolovskiy et al. 2005a). While the observational errors can be estimated theoretically, this is a very difficult task and commonly they are estimated by comparing to other observations with uncorrelated error characteristics. In this study we use the model forecast. Then the variance of the apparent error is related to the variances of the observational and forecast errors and as follows:

Note that the apparent errors are the total errors, defined as the differences between the model forecast and observation. This is equivalent to observation minus background (*O* − *B*), or the innovation, as defined in standard data assimilation literature (e.g., Daley 1991). Given the apparent errors, observational errors can be determined once forecast errors have been calculated.

Forecast errors can be estimated using the National Meteorological Center (NMC) method (Parrish and Derber 1992) based on lagged forecast differences, or by calculating the correlation between innovations with uncorrelated observational errors (Rutherford 1972; Hollingsworth and Lönnberg 1986). The Hollingsworth and Lönnberg (H-L) method needs complementary observations (e.g., radiosondes), whereas the NMC method does not. However, the NMC method tends to underestimate model forecast errors because of the partial cancellation of the representativeness errors. In this study, the NMC method was used since over the ocean, where error estimates are important, there is an insufficient amount of radiosonde data to apply the H-L method. The differences between the forecasts at two different times, 24 h and 12 h, for the WRF model are used in the NMC method in this study.

The observational error covariance matrix is used in the minimization of the cost function. To reduce the computational cost, in the WRF three-dimensional variational data assimilation (3DVAR) it is assumed that the observational errors are only autocorrelated and thus the error covariance matrix is a diagonal matrix. This diagonal error covariance matrix is constructed from the fractional standard deviations (i.e., the standard deviations normalized by the mean observables). In this study, the observational errors of the two observables (refractivity and linear excess phase) are represented in terms of fractional standard deviations for better visualization of their vertical variations.

### b. Prediction model and preprocessor

The WRF model (version 2.1.2) was used in this study. A description of the model can be found at the online (see http://wrf-model.org/index.php) as well as in Skamarock et al. (2005). The model is compressible and nonhydrostatic, and features multiple dynamical cores with high-order numerics to improve numerical accuracy. A preprocessor is used to map the model variables (i.e., pressure, temperature, and water vapor) from WRF forecast grid fields to the time and location of the observations. The WRF forecasts at 6-h intervals are linearly interpolated into the measurement time of RO soundings. Then the local and nonlocal observation operators are used for the calculation of model refractivity from the model variables and for the line integration of model refractivity (i.e., linear excess phase), respectively.

### c. Observation operators

The retrieved or derived observables are the Abel-retrieved atmospheric refractivity and the linear excess phase (the integral of refractivity along a ray path). Atmospheric refractivity *N* is related to the model predictive variables (i.e., pressure, temperature, and water vapor pressure; Bevis et al. 1994). In earlier studies, GPS RO refractivity was often assimilated by assuming that it was representative of a local value at the ray perigee point. Such modeling of GPS RO refractivity, though computationally efficient, does not take into consideration the effects of horizontal gradients and is associated with certain representativeness errors.

To account for the effects of horizontal gradients, in the nonlocal excess phase operator (Sokolovskiy et al. 2005a), the refractivity is integrated along a fixed ray path, which approximately models the true ray path but does not depend on refractivity. We note that thus defined linear excess phase used in this study is different from the commonly used excess phase of GPS RO measurements, which is the phase between transmitter and receiver in excess to the phase in a vacuum. A geometric illustration of the linear excess phase operator is given in Fig. 1 of Sokolovskiy et al. (2005b). In the simplest case (which results in reasonably good accuracy), this fixed ray path is a straight-line tangent to the real ray path at the perigee point. The linear excess phase, which is treated as a new observable linear on refractivity, is defined as

where *dl* is the differential ray path, and *A* and *B* are the incoming and outgoing points of the straight line within the model domain, respectively. The linear excess phase *S* is calculated in the same way for the 3D model refractivity, thus accounting for the effects of horizontal gradients, and for the observational counterpart, one-dimensional (1D) Abel-retrieved refractivity (where the effects of horizontal gradients are contained in convolved form). Thus, Eq. (2) allows an arbitrary factor, which is set to 1. Integration of the refractivity along the ray path is stopped once the ray has reached the model boundaries [e.g., upper, lateral, or lower (terrain)].

### d. Data processing

Application for the nonlocal operator is briefly described as follows. First, we obtained retrieved profiles, the atmPrf files, with the highest level of near 60 km from the COSMIC Data Analysis and Archival Center (CDAAC; http://www.cosmic.ucar.edu), which include the longitude, latitude, and height of a tangent point, the azimuth of the ray, and the refractivity at tangent point. Descriptions of data processing at CDAAC can be found in Kuo et al. (2004) and Ho et al. (2009). Then, we define the mean model vertical grid by averaging heights of all model grid points for each vertical level. When interpolating observations to this mean model vertical grid, the observational latitude, longitude, azimuth, and logarithm of refractivity are vertically averaged between the midpoints of the contiguous mean model layers above and below the mean vertical level. For calculation of the linear excess phase, both the observation and the model refractivities are integrated along a ray path according to the observed azimuthal direction with a step size of 5 km, starting from the tangent point in both directions and ending at the model boundaries (lateral, upper, or lower). For the integration, both the model and the observational refractivities are log-linearly interpolated between the mean model levels. Although calculation of the linear excess phase for the observational refractivity, which is treated as spherically symmetric, can be reduced to a 1D integral, it is performed in exactly the same way as for the 3D model refractivity in order to eliminate the error arising from different discrete representations. The nonlocal operator in WRF 3DVAR may also assimilate refractivity simply by deactivating the ray’s integration (Chen et al. 2009). The horizontal smear of tangent point trajectories has been taken into account both in local and nonlocal operators; additionally, the nonlocal operator uses the azimuth of the ray at tangent point.

### e. Model settings and calculation steps

In this study, we use the WRF forecast model in a single domain (151 × 151 grids) at horizontal resolution of 45 km, as shown in Fig. 1, covering latitudes from about 10°S to 50°N and longitudes from 80° to 160°E, with a model top of 50 hPa (31 model vertical levels) with varied vertical resolution. The horizontal resolution of 45 km in this study should be appropriate for the error estimation according to Chen et al. (2006). They computed the innovation using the fifth-generation Pennsylvania State University–National Center for Atmospheric Research (NCAR) Mesoscale Model (MM5) with different horizontal resolutions and found that the representativeness error from 30-km resolution is compatible with that of 10-km resolution. In this study, the model physical parameterizations include the cumulus parameterization of the Kain–Fritsch scheme (Kain 2004), the cloud microphysics of the Purdue–Lin scheme (Chen and Sun 2002), and the planetary boundary layer parameterization of the Yonsei University (YSU) scheme (Hong et al. 2006) (for details of the schemes, please refer to http://wrf-model.org/index.php).

The 12- and 24-h forecasts are taken from the WRF model predictions for two periods, each of one month, one (15 January–15 February 2007) during the winter and the other (15 August–15 September 2007) during the summer. During the winter (summer) month, there are 2999 (2845) RO soundings within the model domain (Fig. 1 shows the distribution of the winter soundings) available from the FORMOSAT-3/COSMIC mission. The RO data counts decrease near the surface because of different penetration of retrieved profiles. Only about 500 (800) RO soundings penetrate close to surface for the summer (winter) month. Table 1 provides a list of the numbers of soundings within different latitudinal bins in the model domain. There are several hundred RO soundings in each bin that passed CDAAC quality control, sufficient for reliable statistical estimation of the observational errors. Data quality control has been conducted routinely at CDAAC (see Kuo et al. 2004). In addition, the observations (refractivity and linear excess phase) are discarded in this study if they deviate from the model forecasts by more than five standard deviations of the apparent errors (we use the same criteria as in WRF 3DVAR), as an additional quality check to remove outliers (Table 1).

The steps for calculation of the observational errors are outlined as follows:

Step 1: The NCEP Aviation (AVN) analyses were used as the initial condition for the WRF model. The model was initialized four times (0000, 0600, 1200, and 1800 UTC) each day during the two months, and each model forecast was integrated for 24 h. Thus, for each month, there are a total of 125 forecasts, including 0000 UTC on the last day.

Step 2: The forecast errors are evaluated using the NMC method, which computes the differences between two consecutive forecasts (24- and 12-h forecasts) projected to the observation space. The daily 12- and 24-h forecasts for the summer and winter months are interpolated to the time and location of RO measurements. The modeled refractivity and linear excess phase are then calculated using the observation operators. Thus, the standard deviations of the forecast errors are derived.

Step 3: The observational errors of both refractivity and linear excess phase are computed using Eq. (1) at each vertical level for the two months. Although, in general, the NMC method tends to underestimate the forecast errors, there are still a limited number of cases when the forecast errors are close to or exceed the apparent errors. When this happens, the observational error variance calculated from Eq. (1) is close to zero or negative, thus having no physical meaning. These cases were considered outliers and removed from the dataset in this study.

## 3. Observational error estimation

### a. The fractional errors

The fractional apparent errors before (black curves) and after (red curves) passing the abovementioned criteria for both winter and summer months are shown in Fig. 2. We truncate error profiles at ∼18 km because close to the model top (50 hPa), the linear excess phase basically represents the local refractivity (because of the short ray path). The largest errors (outliers) occur mainly in the lower troposphere, which may not necessarily indicate the largest fractional observational errors since forecast errors may also be large. Although the numbers of outliers are rather limited after the quality control applied by CDAAC, we still find the additional criteria useful.

The fractional apparent errors of refractivity (linear excess phase) after applying the additional criteria, in general, show approximately linear decay with height in the troposphere, and their maximum magnitude in the lowest several kilometers is about 12% (7%) for both the winter and summer months. Linear excess phase errors are smaller than refractivity errors, especially in the lower troposphere, by about a factor of 2. This reduction is to be expected because the linear excess phase is related to integration of refractivity, and by itself does not prove reduction of the representativeness errors related to horizontal gradients. The latter follows from reduction of the apparent errors normalized by the standard deviations of the weather-related variations of the observables (Sokolovskiy et al. 2005b). Calculation of the linear excess phase from refractivity (which is reduced to the Abel transform in spherically symmetric case) not only reduces the magnitude of the variations and the errors but also increases their vertical correlations. This follows from the frequency response of the Abel transform (Lohmann 2005). However, in this study we do not consider the vertical error correlations. Figure 3 shows the fractional apparent, forecast, and observational errors (i.e., fractional standard deviations) of linear excess phase and refractivity for both months. The ratio of the fractional errors of refractivity and linear excess phase in the lower troposphere, 2:1, appears to hold true for the apparent errors as well as for the forecast errors, and thus also for the observational errors. Fractional observational errors have minima at approximately 10 km and maxima in the lower troposphere, 2.2% (2.5%) for refractivity and 1.1% (1.3%) for linear excess phase in the winter (summer) month. Slightly larger errors in the summer are likely related to a larger amount of water vapor in the lower troposphere.

### b. Latitudinal dependence

Latitudinal variations of the observational errors of refractivity have been investigated by Kuo et al. (2004) using CHAMP data from December 2001. To investigate latitudinal dependency of the observational errors of both refractivity and linear excess phase, we stratified the RO observations by latitude. Figure 4 shows the fractional observational errors of refractivity and linear excess phase for latitudes southward and northward of 30°N for both months. In the winter month, the fractional observational errors in the lower troposphere are about 3 times larger in lower latitudes than those in higher latitudes, with maxima of about 3.0% (1.5%) for refractivity (linear excess phase). In the summer month, when the intertropical convergence zone shifts to the north and covers most of the model domain, the latitudinal dependency of errors is not pronounced. For all months and latitudes the fractional apparent errors of the linear excess phase in the lower troposphere are about a factor of 2 smaller than those of the refractivity [in agreement with Sokolovskiy et al. (2005b)]. The fractional observational errors of refractivity for the winter month have minima at 8 km (12 km) and maxima of 1.1% (3.0%) near the surface for higher (lower) latitudes, in good agreement with the results of Fig. 13 in Kuo et al. (2004).

For further investigation, we stratify the RO observations in different latitudinal bins of 5°–10° (with sufficient amount of data in each bin) and show the results in Fig. 5. An additional piece of information contained in Fig. 5 compared to Fig. 4 is that the maximal fractional observational errors in the lower troposphere for the winter month, reaching maxima of about 3.7% (1.7%) for refractivity (linear excess phase), are not at the equator, but at the 15°–25° latitude band. This largest fractional observational error is most likely related to larger errors of the GPS RO (due to superrefraction and insufficient tracking depth) (Sokolovskiy 2003) and also may be related to underresolving of the sharp atmospheric boundary layer (ABL) in the subtropical region by the model (resulting in underestimation of the model errors). For both observational errors near the surface, the ratio of the largest to the smallest magnitudes in the considered latitudinal band is about 5:1 for the winter month but reduces to about 2:1 for the summer month. The variability of the errors is related to the variability of the real atmosphere and needs to be taken into account in assimilation.

To provide a better understanding of the latitudinal structure of errors, the forecast errors, stratified in the same way as the observational errors (Fig. 5), are shown in Fig. 6. For the winter month, fractional forecast errors for both the refractivity and linear excess phase exhibit significant latitudinal variations, which in general are consistent with variations of the fractional observational errors, except that the largest fractional forecast errors are found in the band of 5°–15° rather than in 15°–25° as for the fractional observational errors (Figs. 5a,b). However, such similarity between the fractional forecast and observational error variations is not as obvious in the summer month. The latitudinal dependence also can be found in the monthly, zonally averaged specific humidity of 12-h WRF forecasts in the winter month, as shown in Fig. 7a. Weaker latitudinal dependence of the forecast errors in summer is intimately related to the smaller latitudinal variations of the corresponding monthly averaged specific humidity (Fig. 7b). A comparison of Fig. 6 and Figs. 7c,d shows that the latitudinal dependences of the forecast errors are very similar to the forecast differences in specific humidity. This suggests that the latitudinal dependency of the observational errors is in fact strongly affected by the atmospheric moisture. The latitudinal and vertical variations of the apparent (total) errors are more similar to the variations of the observational errors in both summer and winter (figures not shown). This is because the forecast errors, on average, are smaller than the observational errors.

The vertical profiles of the observational errors estimated in this study contain small-scale structures, especially in the lower troposphere. We have attempted to fit the vertical profiles of observational errors with several fitting curves from a polynomial of tenth order for each of the latitudinal zones, as shown in Fig. 8. The fitting curves can represent the gross characteristics of the observational errors with respect to latitudes and seasons. As a result of stronger latitudinal dependence in the winter month, eight fitting curves are derived for different latitudinal bins (Figs. 8a,b). Because the observational errors for different bins do not vary considerably in the summer month, we can obtain two well-fitting curves for refractivity and linear excess phase as shown in Figs. 8c and 8d, respectively. For convenience, a lookup table (Table 2) is provided for the fractional standard deviations of the observational errors below the height of 14 km for the winter and summer months. Above 14 km, the fractional observational errors, as shown in Fig. 5, increase with height for linear excess phase; this increase is less pronounced for refractivity. Therefore, the fractional observational errors of refractivity at 14 km can be used as a proxy above 14 km for both months. A uniform error of 0.3% for refractivity near and above the tropopause is also assumed in WRF 3DVAR. For linear excess phase, the vertical gradients of the fractional observational errors above 14 km in the winter and summer months are about 0.075% and 0.1% km^{−1}, respectively. We note that the estimated observational errors are valid only for the domain used in this study and may need reevaluation in other domains or for the models with substantially different resolution.

## 4. Conclusions

In this study, we investigate the observational errors of refractivity and linear excess phase retrieved from FORMOSAT-3/COSMIC RO data for two months: one month (15 January–15 February 2007) in the winter and the other (15 August–15 September 2007) in the summer, in East Asia centered around Taiwan. We use the NMC method to calculate forecast errors, which are in turn used for the estimation of observational errors for both observables (refractivity and linear excess phase).

Fractional standard deviations of observational errors for refractivity and linear excess phase both show a roughly linear decrease with height in the troposphere and a slight increase above the tropopause; their maximum magnitude is about 2.2% (2.5%) for refractivity and 1.1% (1.3%) for linear excess phase in the lowest 2 km for the winter (summer) month. The observational errors of refractivity demarcated by 30°N in this study are comparable with those in Kuo et al. (2004). The latitudinal dependence of both observational errors (refractivity and linear excess phase) is stronger in the winter month and weaker in the summer month. Through analysis of specific humidity fields, the latitudinal dependence of the observational error is found to be influenced by the variations of the atmospheric moisture. Such latitudinal dependence of the observational errors in winter and summer must be taken into account in RO data assimilation. The errors for other seasons may be approximately evaluated by interpolation between winter and summer.

This study uses the regional WRF model to estimate observational errors, and the results are most useful for numerical weather prediction applications over this region at comparable resolution. Application of these results for other regions of the world (which may have different moisture structure and variability) or for models with substantially different horizontal and vertical resolutions may not be warranted. Because the estimated observational errors depend on atmospheric structures and their representation by a model, similar studies shall be conducted for applications related to other regions and/or models.

## Acknowledgments

This study was supported by the National Science Council in Taiwan under Grant NSC 96-2111-M-008-003 and by the National Space Organization (NSPO) in Taiwan under Grant 97-NSPO(B)-SP-FA07-02(B). The first author, S.-Y. Chen, was also supported by UCAR/COSMIC and an NSF ATM grant during her two-year visit to NCAR.

## REFERENCES

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

## Footnotes

*Corresponding author address:* Ching-Yuang Huang, Dept. of Atmospheric Sciences, National Central University, 300 Jhongda Rd., Jhongli City, Taoyuan County 32001, Taiwan. Email: hcy@atm.ncu.edu.tw