The simulation of observations—a critical component of the Community Global Observing System Simulation Experiment (OSSE) Package (CGOP)—is validated first by comparing error-free simulated observations for the first 24 h of the nature run (NR) to the real observations for those sensors that operated during that period. Sample results of this validation are presented here for existing low-Earth-orbiting (LEO) infrared (IR) and microwave (MW) brightness temperature (BT) observations, for radio occultation (RO) bending angle observations, and for various types of conventional observations. For sensors not operating at the start of the NR, a qualitative validation is obtained by comparing geographic and statistical characteristics of observations over the initial day for such a sensor and an existing similar sensor. The comparisons agree to within the uncertainties caused by real observation errors, time and space collocation differences, radiative transfer uncertainties, and differences between the NR and reality, with no significant unexplained biases. To validate channels of a proposed future MW sensor with no equivalent existing spaceborne sensor channel, multiple linear regression is used to relate these channels to existing similar channels. The validation then compares observations simulated from the NR to observations predicted by applying the regression relationship to actual real observations of the existing channels. Overall, the CGOP simulations of error-free observations from the conventional and satellite platforms that make up the global observing system are found to be reasonably accurate and suitable as a starting point for creating realistic simulated observations for OSSEs. These findings complete a critical step in the CGOP validation, thereby reducing the caveats required when interpreting OSSE results.
Observing system experiments (OSEs) are data assimilation (DA) and forecast experiments that measure the impact of an observing system by comparing analysis and forecast results with and without the particular observing system. As described by Boukabara et al. (2016), observing system simulation experiments (OSSEs) extend the concept of OSEs to proposed future sensors by using observations simulated from the nature run (NR). OSSEs support (i) decision-makers by providing valuable quantitative information on the impact of proposed future sensors, (ii) system engineers through trade studies to tune the design of the proposed sensor, and (iii) data assimilation scientists by allowing the development and testing of new data assimilation algorithms to optimally extract the information content of the proposed sensor before that sensor exists (Atlas 1997; Atlas et al. 2001, 2015). However, no OSSE is perfect. The OSSE system components and the entire OSSE system must be validated and calibrated, and deficiencies noted during this process must be documented to ensure the correct interpretation of the OSSE results (Hoffman and Atlas 2016). First, the development of the individual OSSE system components must include sufficient testing, validation, and calibration to ensure realistic results. Second, because OSSE systems are complex, approximations made in one component may have substantial impacts on other downstream components, and some iteration of the validation of the overall OSSE system may be required. Third, since even minor deficiencies or inconsistencies may be crucial in designing, conducting, and/or interpreting an experiment for a particular proposed future sensor, it is critical to document and review such deficiencies during the entire OSSE process. In addition, users should bear in mind that OSSE results are only one factor in the decision-making process concerning proposed sensors.
In section 2 of Boukabara et al. (2016), the Community Global OSSE Package (CGOP) was described in detail from a functional viewpoint. Currently, as described in Boukabara et al. (2016), CGOP includes the NOAA DA (Kleist and Ide 2015) and forecast systems. Here, in Part II, we report the quantitative validation of the first component of the package: the simulation of perfect (i.e., with no added explicit errors) observations from both existing and proposed sensors. All observation types described in Boukabara et al. (2016) have been validated. This includes all infrared (IR) and microwave (MW) sensors listed in Table 4 of Boukabara et al. (2016), namely, the Atmospheric Infrared Sounder (AIRS), Advanced Microwave Sounding Unit-A (AMSU-A), AMSU-B, Advanced Technology Microwave Sounder (ATMS), Cross-track Infrared Sounder (CrIS), High-Resolution Infrared Radiation Sounder-3 (HIRS-3), HIRS-4, Infrared Atmospheric Sounding Interferometer (IASI), Microwave Humidity Sounder (MHS), sounder (SNDR), Spinning Enhanced Visible and Infrared Imager (SEVIRI), and Special Sensor Microwave Imager/Sounder (SSMIS; acronyms are also defined in the appendix). In this paper, examples or summaries are presented for each data type.
The present study has two important outcomes. First, on a simplistic level, it “validates” the mechanics of simulating perfect observations, that is, the interpolation of the NR and the application of the forward problem procedures. Second, and more importantly, it sets goals for the modeling of explicitly added errors: differences found in comparing real and perfect simulated observations reveal the complexity, structure, and causes of errors that should be present in the error-added simulated observations. In this study, references to validation of perfect observations are to validation in these two senses: the checking of the simulation of perfect observations, and the description and characterization of the explicit simulated errors that should be added to the perfect observations.
Section 2 describes our overall approach to validate the perfect simulated observations, using the NR initial day. Section 3 summarizes the validation conducted by Gelaro et al. (2015) of the 7-km Goddard Earth Observing System Model, version 5 (GEOS-5) nature run (G5NR). Insights identified during the validation of the G5NR are critical to our discussion of the perfect observations validation. The sections that follow describe the validation of the error-free simulated observations. There we validate the simulated brightness temperature (BT) observations from existing low-Earth-orbiting (LEO) MW and IR satellite sensors (section 4); bending angle observations from Global Navigation Satellite System (GNSS) radio occultation (RO) observations (section 5); and conventional observations from existing sensors based on land, ships, buoys, balloons, and aircraft (section 6). Following the operational convention, we include atmospheric motion vector (AMV) and satellite surface wind observations in the category of conventional observations. Then, in section 7, we present qualitative validations of simulated BT from a proposed MW sensor on a constellation of small satellites. Section 8 provides a summary, conclusions, and brief discussions of the remaining validation and calibration activities of other CGOP components and of the application of the entire OSSE system. These latter topics will be the subject of a separate paper.
2. Strategies to compare real and perfect simulated observations
Since every OSSE conducted with CGOP depends on the NR and the simulated observations, assessing the accuracy of the perfect simulated observations and explaining any differences with respect to reality are critical prerequisites before the subsequent steps of calibrating and validating the rest of the OSSE system. First, differences between the NR and reality will be mirrored in simulated and real observations. In this study, we limit these differences by restricting our comparisons to the 24 h at the beginning of the NR, that is, the NR initial day. Below, we present detailed discussions of both the principal advantage and the principal disadvantage of this approach: that one-to-one comparisons of observations are possible but that the results may not be representative of other periods. Second, differences between the forward problem and how real sensors observe the earth result in further differences between simulated and real observations. Differences of both types should be included in the errors explicitly added to the simulated perfect observations. It is important to note that two types of validation are described here and that neither of these validations of NR-based simulations is to be confused with a thorough sensor validation, which is performed to assess the accuracy of the measurements and to establish instrumental error characteristics (typically during sensor calibration–validation activities).
In general, when conventional and satellite observations are simulated from the NR, all of these should be simulated consistently with coverage, resolution, and accuracy comparable to real observations. In addition, bias and horizontal and vertical correlations of errors with each other and with the synoptic situation should be introduced appropriately. The validation of the simulated observations should include, for example, side-by-side comparisons of coverage maps, quality-control (QC) yields, and frequency distributions (histograms) of observations and observation innovations (observations minus DA background).
The present study validates so-called perfect simulated observations, which do not include the explicit addition of random and bias components of error. When compared to real observations, perfect observations include implicit errors and representativeness differences (Errico et al. 2013). Here, implicit observation error includes interpolation error and forward problem error. For example, the difference between perfect observations and reality strongly depends on the accuracy of the forward operators and of the required parameters that must be provided, such as surface emissivity in the case of satellite radiances. Representativeness differences refer to scales and phenomena that are present in reality but not in the NR model.
Since the NR is a free-running numerical weather prediction (NWP) forecast, the NR atmosphere will substantially differ from the real atmosphere, and as a result the simulated observations will be uncorrelated with the real observations even for the same period. The exception, of course, is at the beginning of the NR, where close agreement of the NR and reality is expected. The period used here for our assessment is the NR initial day, unless otherwise noted. The first type of validation used in this study is quantitative and involves the comparison, for each data source, of perfect simulated to actual observations that match up on a one-to-one basis. This is possible only for sensors that were functioning on 16 May 2005 and whose observations the DA system is capable of assimilating. These one-to-one comparisons are very valuable for demonstrating the correctness of the implementation of the OSSE observation simulation software (as well as other OSSE system components). In the current CGOP, as a result of the use of quality-controlled data in PREPBUFR files [i.e., observations preprocessed and quality controlled as described by Keyser (2017) in Binary Universal Form for the Representation of Meteorological Data (BUFR) format] and other template files, simulated and real observations agree in time and location. Observed values should match closely, since the NR is linearly interpolated in both time and location. Even in this type of quantitative validation, however, the G5NR and the real atmosphere diverge over the 24-h period, and therefore we cannot expect the perfect observation increments (observation minus perfect simulated observation) to be as small as instrumental-noise-plus-representativeness-difference levels.
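The CGOP interpolation code is not reproduced in this paper. Purely as an illustration of linear interpolation in time and location, a minimal numpy sketch might look like the following (function names are hypothetical; the actual system must additionally handle the G5NR cubed-sphere geometry, vertical interpolation, and land–sea masking):

```python
import numpy as np

def bilinear(field, lats, lons, lat, lon):
    """Bilinear interpolation of a regular lat-lon gridded field to one point."""
    i = np.searchsorted(lats, lat) - 1          # index of grid cell south of point
    j = np.searchsorted(lons, lon) - 1          # index of grid cell west of point
    wy = (lat - lats[i]) / (lats[i + 1] - lats[i])   # fractional position in cell
    wx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    return ((1 - wy) * (1 - wx) * field[i, j] +
            (1 - wy) * wx * field[i, j + 1] +
            wy * (1 - wx) * field[i + 1, j] +
            wy * wx * field[i + 1, j + 1])

def interp_nr(field_t0, field_t1, lats, lons, t_frac, lat, lon):
    """Linear-in-time, bilinear-in-space interpolation between the two NR
    output times bracketing the observation (t_frac in [0, 1])."""
    v0 = bilinear(field_t0, lats, lons, lat, lon)
    v1 = bilinear(field_t1, lats, lons, lat, lon)
    return (1 - t_frac) * v0 + t_frac * v1
```

For a field varying linearly in space and time this interpolation is exact; for real NR fields, the residual interpolation error is one contribution to the implicit observation error.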
The figures presented in this paper that are based on one-to-one comparisons of “perfect” simulated observations (P) to real observations (O) also show the perfect observation increments (O − P; increments). For existing observations, simulated observations were created for each actual observation present in the template files: PREPBUFR for conventional observations, BUFR for satellite RO observations, and BUFR thinned to match the Gridpoint Statistical Interpolation analysis system (GSI) radiance statistics and diagnostics (RADSTAT) output for satellite BT observations. PREPBUFR observations have undergone some preliminary QC. For satellite BT and RO observations, the plots include all observations during the NR initial day, that is, the four 16 May 2005 DA cycles. For existing conventional observations, the plots include all observations during the 0000 UTC 16 May 2005 DA cycle. In figures that display maps in the top half and histograms in the bottom half (Figs. 1, 3, and 9), letters identifying the panels are reused for maps and histograms, but it is always clear from the text whether a map or histogram is being referred to. In each half (maps and histograms) O, P, and O − P are displayed from top to bottom. This top-to-bottom ordering is maintained in other figures (Figs. 2, 5, and 8) as well.
One-to-one comparisons are possible only for data types actually observed in reality at the start of the NR. For example, ATMS data are available after 2012, but the G5NR extends only through 16 June 2007. The same, of course, applies to any novel proposed sensor. Therefore, the second type of validation used in this study is based on the qualitative comparison of the perfect simulated observations of sensors that are proposed, or are otherwise not available in reality, to actual observations from similar data sources on the NR initial day. For these qualitative validations, we rely on comparing maps and histograms and verifying that they have similar characteristics, namely, data coverage and patterns in the case of maps, and distribution shape and mode location in the case of histograms. An example is given in section 7a; in that case, placing the proposed sensor in the same orbit as an existing sensor allows a one-to-one comparison.
In the case of totally novel sensors, proxy cross validation is employed, in which comparisons are made to proxy observations. That is, for proposed sensor channels with no spectrally and geometrically similar counterpart among the real observations on the NR initial day, we rely on proxy (predicted) observations based on real observations from channels of a sensor, existing at the start of the NR, that have similar sensitivity to the environment as each proposed channel. These real observations are transformed to create proxy measurements that can then be substituted for real observations in the validation process. The relationship used to predict the proxy measurements is developed by accurately calculating BTs for both the existing and proposed channels over a diverse database of geophysical conditions and then applying one of a number of estimation techniques, in the present case multiple linear regression (section 7b).
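As an illustrative sketch of the proxy approach (not the CGOP code; variable and function names are hypothetical, and the actual regression is trained on BTs computed over a diverse database of geophysical conditions), the multiple linear regression can be fit and applied as follows:

```python
import numpy as np

def fit_proxy(bt_existing, bt_proposed):
    """Least-squares fit of BT_proposed ~ c0 + sum_k c_k * BT_existing[:, k].

    bt_existing: (n_profiles, n_channels) BTs computed for existing channels.
    bt_proposed: (n_profiles,) BTs computed for the proposed channel.
    Returns the regression coefficients (intercept first).
    """
    A = np.column_stack([np.ones(len(bt_existing)), bt_existing])
    coef, *_ = np.linalg.lstsq(A, bt_proposed, rcond=None)
    return coef

def predict_proxy(coef, bt_real):
    """Apply the fitted relationship to real observations of the existing
    channels to produce proxy observations of the proposed channel."""
    A = np.column_stack([np.ones(len(bt_real)), bt_real])
    return A @ coef
```

The proxy observations produced by `predict_proxy` then stand in for real observations of the proposed channel when comparing against the perfect simulated observations.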
It must be noted that the start of the NR is not ideal for all validation and calibration, since 1) adjustment processes will be active and 2) only a small portion of the annual cycle will be sampled. Therefore, further validation and calibration should be done for the period(s) of interest. Except at the start of the NR, simulated observations can be validated only in terms of characteristics and statistics. Even in terms of statistics, large discrepancies between the simulated and real observations may occur. An additional diagnostic is to compare differences between (error added) simulated and perfect simulated observations for the study period to differences between real observations and perfect simulated observations for the NR initial day described in this study. If the errors added explicitly to the simulated observations are realistic, then maps and statistics of these two differences should be similar, after accounting for seasonal changes. The reasons for any large discrepancies observed between the simulated and real observations should be understood, and possible effects on the applicability of the OSSE system should be clearly stated. For example, cloud and precipitation effects present in real observations are currently included in the CGOP simulated observations only in terms of data coverage: radiances that would be affected by clouds are not simulated, and those that are simulated are computed for clear-sky conditions. Therefore, large discrepancies are expected unless simulated observations are compared to clear-sky real observations.
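The diagnostic described above amounts to comparing the statistical moments of two difference fields. A minimal sketch (function and variable names hypothetical; in practice the comparison would also be stratified geographically and by season):

```python
import numpy as np

def compare_error_models(real, perfect_initial, erradd, perfect_study):
    """Compare moments of (real - perfect) on the NR initial day with
    (error-added - perfect) for the study period. If the explicitly added
    errors are realistic, the two sets of moments should be similar."""
    a = real - perfect_initial       # implicit errors, NR initial day
    b = erradd - perfect_study       # explicitly added errors, study period
    return {"mean": (a.mean(), b.mean()),
            "std": (a.std(ddof=1), b.std(ddof=1))}
```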
3. Nature run validation
As the operational DA and forecast systems evolve, there is a continuing need for improved NRs with higher spatial and temporal resolutions and improved representation of physical processes and meteorological phenomena. With this motivation, the NASA Global Modeling and Assimilation Office (GMAO) produced the G5NR, described briefly in section 3 of Boukabara et al. (2016) and in depth by Putman et al. (2015) and in other references cited in Boukabara et al. (2016). The G5NR is an advanced global atmospheric 7-km nonhydrostatic NR that is more than 2 yr long (16 May 2005–16 June 2007), and it includes aerosol and land surface components.
GMAO conducted extensive validation of the G5NR in comparison to reality (Gelaro et al. 2015). To paraphrase the abstract of Gelaro et al. (2015), this validation favored comparisons to observational datasets, but it made use of reanalyses and other long model simulations, and it covered many characteristics of the G5NR: the time mean fields of the model variables, the energy spectra, the water cycle, the representation of synoptic features, the surface characteristics and fluxes, the effects of clouds and radiation on the dynamics, the dynamics of the upper atmosphere, and the representation of aerosols and trace gases. While the G5NR was produced prior to the development of the CGOP, this validation is a critical step in the process of validating the CGOP, since the NR is the basis of any OSSE system. In general, the G5NR is a very detailed and realistic simulation, including representations of extreme weather events (e.g., Reale et al. 2017). However, there are definite discrepancies. Some of the differences compared to reality that have been documented by Gelaro et al. (2015) and that may be important in the OSSE context are as follows: 1) because of the very strong computational damping at the smallest resolved scales, the true resolution of the G5NR should be considered several times coarser than its 7-km grid spacing; 2) land surface temperature biases—potentially important to the simulation of microwave observations—include a cool bias over North Africa and a warm bias in northern high latitudes; 3) tropical tropopause temperatures are warmer in the G5NR compared to reanalyses for all months; 4) specific humidities are too high above 700 hPa; 5) precipitation is overestimated over land and in the Pacific intertropical convergence zone (ITCZ); 6) nonprecipitating cloud water amounts in the G5NR are high, resulting in too much reflected shortwave radiation at the top of the atmosphere; and 7) G5NR clouds tend to be biased brighter, but they also tend to be biased lower in altitude, for both high and low clouds. Of these findings, difference 1 implies that small scales that give rise to representativeness errors in reality will not be present in the perfect simulated observations and is one of the reasons that observation errors must be tuned; differences 2–4 are discussed in the comparisons presented below; and differences 5–7 are not relevant to the comparisons reported here because NR clouds and precipitation are not included in the clear-sky BTs that are simulated.
4. Validation of simulated BT observations
In this section, we compare real to perfect simulated BTs. In the CGOP, all BTs are simulated using the Community Radiative Transfer Model (CRTM; Chen et al. 2008; Ding et al. 2011). Comparisons were made for all channels of all instruments making BT observations listed in the introduction [and in Table 4 of Boukabara et al. (2016)], but here detailed comparisons are presented only for three channels of the AMSU-A and three channels of the AIRS. These selected channels are described below and provide comparisons that include sensitivity to atmospheric temperature and moisture and to the surface, and are broadly representative of a large fraction of the set of all comparisons.
In the figures of this section, the panels are ordered by channel number from left to right. The reader is cautioned that, within a figure, for the same variable (e.g., BT) the x and y axes or color scales often vary from panel to panel for different channels because of the wide variation in response with respect to frequency. The axis and color scales are different again for panels that plot increments. Observations are plotted in time sequence in maps, and where there is overlap only the newest observation is visible. In figures displaying both maps and histograms, the data samples displayed in the maps match the data samples used to calculate the histograms, with two exceptions for radiance observations. First, the histograms exclude locations over sea ice (ice) and over ice- and snow-covered land (snow). Second, side-by-side comparisons of cloud- or precipitation-affected BTs are expected to be poor, since the information contained in the G5NR is not sufficient to properly simulate cloud- and precipitation-impacted BTs; this would require accurate specification of the vertical variation of the particle size distribution for each type of hydrometeor. Further, even at the start of the NR, the exact locations of cloud boundaries and precipitation areas may differ from reality. For BTs, the strategy used here is to restrict quantitative comparisons (e.g., histograms) to simulated clear-sky BTs by restricting the sample to BT locations that are considered clear sky by the GSI QC procedures, that is, with an estimated cloud fraction of less than 0.10. Of course, there may still be some signals from clouds and precipitation in the real BT observations. The simulated observations in all cases are calculated for clear-sky conditions; that is, the NR clouds and precipitation have no impact on the simulated BTs.
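The sampling strategy just described can be sketched as a simple mask (a hypothetical illustration only, not the GSI QC code; the actual cloud detection and surface typing are far more elaborate):

```python
import numpy as np

def clear_sky_sample(obs, sim, cloud_frac, surface_type):
    """Restrict BT pairs to the quantitative-comparison sample: estimated
    cloud fraction < 0.10, and not over sea ice or snow-covered land
    (hypothetical surface_type labels). Returns O, P, and O - P."""
    keep = (cloud_frac < 0.10) & ~np.isin(surface_type, ("ice", "snow"))
    return obs[keep], sim[keep], obs[keep] - sim[keep]
```

Histograms and moments of the returned O, P, and O − P arrays then correspond to the clear-sky comparisons shown in the figures.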
a. MW observations
AMSU-A is a 15-channel microwave instrument operating in the frequency range 23.8–89 GHz (Rosenkranz 2001). AMSU-A channels 1 and 2 are very sensitive to atmospheric water vapor and to the surface. Channel 15 is known as a surface channel because the measured radiances are mostly emitted by the surface. AMSU-A channels 3–14 are temperature sounding channels centered around the oxygen absorption band at 57 GHz. In Fig. 1 we present results from three representative AMSU-A channels—channel 3 (50.3 GHz), a lower-tropospheric (LT) channel that is also sensitive to the surface and therefore to clouds; channel 8 (55.5 GHz), an upper-troposphere (UT), lower-stratosphere (LS) temperature sounding channel; and channel 15 (89 GHz), the surface channel—to demonstrate that the observation simulation in the MW properly handles variations in temperature and surface properties. Table 1 lists the channels that are the main examples in this study.
In Fig. 1, and in all subsequent plots except where noted, the data sample corresponds to the NR initial day. Channel 8 shows that the UT–LS is cold over the winter pole but warm over the summer pole. Southwest of Australia, there is a noticeable warm anomaly with respect to the zonal average. Channel 8 is virtually unaffected by the surface, and the O − P increments are small everywhere. Even the pattern of cold striping at the swath edges in the tropics, a result of increased incidence angle, is correctly matched by the simulated observations. Climatologically, Gelaro et al. (2015) report that the tropical tropopause temperature is warmer in the G5NR compared to reanalyses for all months, but this bias is not apparent in this initial-day comparison. Channel 15, the surface channel, depicts the pole-to-equator and land-versus-ocean temperature gradients. While O and P agree visually, there are many differences on the order of 10 K. These differences correspond to precipitation over the ocean, which is present in reality but not included in the observation simulation. Differences over land correspond to differences between the real emissivities and surface temperatures and those assumed in the forward problem, especially over the deserts. Differences over snow and ice, which are very noticeable in the polar regions, occur because our knowledge of snow and ice emissivities is imperfect.
Note that the precipitation signal changes sign from cool near the equator to warm at high latitudes because the meridional gradient of the sea surface brightness temperature crosses that of the effective radiative temperature of precipitation. As a result of the low emissivity of the ocean in the MW, the surface brightness temperature at this frequency varies from roughly 270 K at the equator to 200 K at high latitudes (Fig. 1h). Since cloud tops are higher and colder at the equator than at high latitudes, the precipitation signal is cool at the equator and increases poleward. Maps of channel 3, the LT temperature channel, are similar in character to those of channel 15, the surface channel, but with reduced amplitude. However, the amplitudes of the snow and ice emissivity signals are as large in this channel as in the surface channel because at these high, drier latitudes the channel weighting functions shift to lower elevation and become more influenced by the surface.
The histograms for land and ocean in Fig. 1 quantify some of the discussion of the maps. For channel 8, the UT–LS temperature channel, the increments are O(1) K or less. (Basic statistics for the distributions displayed in the histograms are given in Table 2.) Differences between land and ocean are due to the differences in the distribution of land with latitude. For channel 15, the surface channel, we see that ocean BT varies from 200 to 270 K (as noted earlier), while land BTs are mostly confined to 260–300 K. However, the real observations have a noticeable number of locations over land with BTs as low as 240 K. Increments for this channel are mostly in the range of −10 to +10 K, with some land differences reaching −20 K (reality colder than simulation) as a result of surface emissivity impacts over deserts, and some ocean differences of as much as +15 K (reality warmer than simulation) as a result of ice and precipitation effects at high latitudes.
Figure 2 shows the mean and standard deviation of O − P for each AMSU-A channel. The largest differences occur for channels 1–3 and 15 because these channels are sensitive to the surface, which allows discrepancies between real and modeled emissivities to have a large impact on O − P. In general, for AMSU-A channels 4–14, which are sensitive to temperature above the midtroposphere (MT), increments are very small. For channels 1 and 2, the sensitivity to the surface is greater than for channel 15, the surface channel, and hence the increments are larger, with standard deviations >10 K for land and slightly <10 K for ocean. Channel 1, at the 23-GHz water vapor resonance, is used to measure integrated water vapor and in dry conditions is essentially a window channel. Similarly, channel 2 is the 31.4-GHz water vapor continuum channel, and the atmosphere is also very transparent at this frequency in dry conditions. Therefore, under dry conditions we expect large differences in these two channels between real and perfect simulated observations as a result of differences between the emissivities in reality and in the CRTM. This is especially the case over land, but it is also a factor over the ocean, where emissivity depends strongly on surface wind speed at these frequencies. There is a noticeable difference between land and ocean in both O and P in the mid- and upper stratosphere (channels 11–14), again as a result of the distribution of land with latitude.
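Statistics such as those in Fig. 2 reduce to per-channel moments of the increments. A minimal numpy sketch (illustrative only; in practice these statistics are accumulated from GSI RADSTAT output):

```python
import numpy as np

def increment_stats(obs, sim):
    """Per-channel mean and standard deviation of O - P.

    obs, sim: arrays of shape (n_obs, n_channels).
    Returns (mean, std), each of shape (n_channels,).
    """
    d = obs - sim
    return d.mean(axis=0), d.std(axis=0, ddof=1)
```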
b. IR observations
AIRS is a high-spectral-resolution, multiaperture, echelle-grating spectrometer with 2378 channels in the thermal infrared (3.7–15.4 µm or 2665–649 cm−1) and a resolution of 13.5 km at nadir (Aumann et al. 2003). AIRS channels are sensitive to the surface temperature, surface emissivity, atmospheric temperature, humidity, and other atmospheric constituents, including carbon dioxide, ozone, methane, and carbon monoxide. The operational DA system currently uses fewer than half (120) of the 281-channel subset (Susskind et al. 2003; Le Marshall et al. 2006) of the AIRS channels that are made available to NWP centers. For example, since there is no provision to use atmospheric trace gases as CRTM inputs, channels sensitive to these constituents are not well simulated by the CRTM and therefore are not used for NWP.
Figure 3 shows comparison results for three representative IR channels: channel 221 (712.73 cm−1), a far-IR UT–LS CO2 temperature sounding channel; channel 1252 (1131.19 cm−1), a window channel; and channel 1449 (1330.97 cm−1), an MT water vapor sounding channel. Channel 221 shows a fairly uniform pole-to-equator temperature gradient that is symmetric with respect to the equator, except that Antarctica, which is in darkness at this date, is considerably cooler than the Arctic. High cold clouds affect O in this channel (e.g., south of Mexico in Fig. 3a), and this is clearly seen as large negative values in the map of O − P (Fig. 3c). Channel 1252, as a window channel, is more affected by clouds and responds strongly to the high desert surface temperatures in Africa and South Asia. The O − P differences are generally negative as a result of cloud effects, but there are some areas where O − P reaches +10 K. Over Antarctica, this may be due to differences in emissivity, while over the continental areas, daytime heating may be underestimated by the G5NR. Maps of channels 1252 and 1449 are similar in many respects. Both channels are sensitive to clouds, and in dry conditions both are sensitive to the surface. But channel 1449 is also sensitive to humidity, and this is clearly seen in P along the ITCZ in the western equatorial Pacific.
In the histograms for AIRS (Fig. 3), for channel 221, differences between land and ocean again reflect the different distribution of land with respect to latitude. In this channel, residual cloud effects result in a distribution of O − P that has a standard deviation of <1.5 K and that is slightly skewed negatively, with a mode at −1 K and a longer, fatter negative tail. For channel 1252 there is a wide range of land surface temperatures, with corresponding BTs ranging from 265 to 325 K. Ocean BT also reaches lows of 265 K, but it has a strong high cutoff at 300 K. Again, residual cloud effects result in negatively skewed distributions of O − P, with standard deviations of approximately 1.5 K for ocean but 5 K for land. For channel 1449, ocean and land histograms are similar, but with the ocean cooler and with a long positive tail for land. For this channel, O − P is not negatively skewed and has standard deviations on the order of 3.5 and 2.0 K over land and ocean, respectively.
Figure 4 plots the mean and standard deviation of AIRS O and P for a 2-week period in August, for land and ocean, versus frequency for the channels used in the DA system, in channel subsets for the temperature sounding channels (top), the window and ozone channels (middle), and the water vapor channels (bottom). The overall patterns of the mean and standard deviation of O and P agree very well. Channel by channel, the standard deviation over land is generally larger, often by a factor of 2, than over ocean in both O and P.
5. Validation of simulated GNSS RO observations
GNSS RO observations provide highly accurate information related to atmospheric thermodynamic structure with global coverage that is not affected by clouds or precipitation (Kursinski et al. 1997). In NWP systems GNSS RO observations provide critical anchoring data for the variational radiance bias correction procedures (e.g., Cucurull et al. 2014). While profiles of atmospheric temperature and moisture can be retrieved from GNSS RO observations, NWP centers typically assimilate a more fundamental observation—refractivity, bending angle, or phase delay. In CGOP the bending angle is assimilated following Cucurull et al. (2013). The most important source of GNSS RO observations is the Constellation Observing System for Meteorology, Ionosphere and Climate-1 (COSMIC-1; Anthes et al. 2008). However, COSMIC-1 was launched in April 2006 and RO observations were not used operationally by NCEP until May 2007. Here we validate the simulation of bending angle observations from the Challenging Minisatellite Payload (CHAMP; Wickert et al. 2001). While not available in real time in 2005, CHAMP BUFR files were routinely collected by NCEP and were used in testing RO assimilation procedures by Cucurull et al. (2007).
Figure 5 plots the bending angle observations used by the DA system in the 10–20-km height range. Each occultation provides hundreds of observations, but these are thinned for DA. In terms of horizontal locations, the CHAMP RO data are relatively sparse, with only approximately 50 bending angle profiles during the 6-h window centered at 0000 UTC 16 May 2005. The maps of O and P (Figs. 5a and 5b, respectively) show good general agreement.
Distributions of bending angle differ distinctly among high, middle, and low elevations, but within each elevation range the distributions for O and P are very similar (e.g., Figs. 5d and 5e, respectively). The O − P values are all very small, with highly peaked and long-tailed distributions (e.g., Fig. 5f).
Figure 6 shows profiles of the mean and standard deviation of the bending angle. The bending angle increases exponentially with pressure and with specific humidity. As a result, so too do the standard deviations and O − P statistics. The statistics for O and P match very well. To display the relative difference statistics, we also plot the mean and standard deviation of O − P normalized by O in Fig. 6d. The relative difference statistics show that the simulation is biased high at both high and low elevations, and that relative errors are large below 10 km and very large above 40 km. These statistics are consistent with our understanding of the error structure of the real observations.
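The normalized statistics described above can be sketched as follows. This is an illustrative helper, not the CGOP code; the function name, argument layout, and the choice of height bins are assumptions for the example.

```python
import numpy as np

def relative_difference_profile(o, p, heights, bin_edges):
    """Bin-averaged mean and standard deviation of (O - P)/O.

    o, p      : arrays of observed and simulated bending angles
    heights   : array of impact heights (km), one per observation
    bin_edges : monotonically increasing height-bin edges (km)
    """
    rel = (o - p) / o                      # normalize O - P by O
    means, stds = [], []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (heights >= lo) & (heights < hi)
        means.append(rel[mask].mean() if mask.any() else np.nan)
        stds.append(rel[mask].std(ddof=1) if mask.sum() > 1 else np.nan)
    return np.array(means), np.array(stds)
```

Normalizing by O is what makes the exponential growth of the bending angle toward the surface drop out, so biases at different heights become directly comparable.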
6. Validation of simulated conventional observations
Figure 7 shows that the real and perfect simulated radiosonde observations have nearly the same statistical (mean and variance) variation with pressure for temperature, specific humidity, and wind components. As expected, temperature decreases with height through the troposphere, and specific humidity and its variability decrease rapidly with height. Note that humidity is notoriously difficult to measure at high altitudes and that currently humidity observations above 300 hPa are not assimilated. The u wind component is at a maximum at jet stream level (250 hPa), and both u and υ wind variability are largest at this level.
Observations of surface pressure (Fig. 8) are influenced by synoptic features and topography and in general agree closely between reality and simulation. Even in the one 6-h period plotted in Fig. 8, there are more than 50 000 observations from rawinsondes and land and ocean surface stations. Note that surface pressure is the pressure at the elevation of the observations in reality. The map of increments (Fig. 8c) shows that O pressures are higher than P pressures over South Africa, Turkey, Iran, and China, and lower over the Rockies and Thailand. However, difference magnitudes are generally less than 2 hPa. Differences caused by mismatches among the station topography in reality, in the DA system, and in the NR are minimized here because the simulated pressure observation is the G5NR pressure interpolated to the location and elevation of the real observation.
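Interpolating model pressure to a station elevation typically involves a hydrostatic adjustment of the kind sketched below. This is a minimal illustration using the standard hypsometric relation; the function name is an assumption, and the actual CGOP interpolation of the G5NR pressure may differ in detail.

```python
from math import exp

G = 9.80665    # gravitational acceleration (m s-2)
RD = 287.05    # dry-air gas constant (J kg-1 K-1)

def pressure_at_station(p_model, z_model, t_model, z_station):
    """Hydrostatically adjust a model pressure (hPa) valid at model
    elevation z_model (m) to the station elevation z_station (m),
    assuming the model near-surface temperature t_model (K) applies
    over the (small) elevation difference."""
    return p_model * exp(-G * (z_station - z_model) / (RD * t_model))
```

For example, raising the target elevation by 100 m at 288 K reduces a 1000-hPa pressure by roughly 12 hPa, which is why even modest topography mismatches would dominate the sub-2-hPa differences seen in Fig. 8c if they were not accounted for.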
Maps and histograms were also studied for temperature, specific humidity, wind components, and wind speed. For each, the vertical domain was segmented into low, middle, and upper levels. In general, the spatial patterns of conventional observations are very similar for O and P and the distributions seen in the histograms for land and ocean are roughly similar (basic statistics for some of these distributions are given in Table 2). Specific findings are described in the following paragraphs.
For conventional temperature observations, the difference plots of O − P show some areas of larger coherent difference in the troposphere. In the surface and near-surface layer (p > 1000 hPa), O is greater than P by more than 2 K over land along the U.S. Gulf Coast. In the layer from 300 to 1000 hPa, there are a number of mountainous areas, such as the Appalachians, where O is greater than P by 2.5 K or so. The temperature difference (O − P) is distributed approximately normally with standard deviations increasing toward the surface.
For conventional specific humidity observations, the difference plots of O − P show areas of larger coherent difference in the boundary layer. Notable areas of difference are in the high mountains of North America and Asia in the 300–950-hPa layer, in the United States east of the Mississippi River and in northern India in the foothills of the Himalayas in the 950–1000-hPa layer, and in Siberia just north of Kazakhstan in the layer below 1000 hPa. Because humidity increases exponentially toward the surface, the difference plots, even for the 300–950-hPa layer, are dominated by the lowest observations in each layer. The O − P histograms for specific humidity have standard deviations of approximately 1.5 g kg−1 and appear to be non-Gaussian, with high central peaks and long tails.
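The "high central peaks and long tails" noted above can be quantified with the sample excess kurtosis, which is zero for a Gaussian and strongly positive for peaked, heavy-tailed distributions. A minimal sketch (the function name is ours, not from the paper):

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis of x: E[z^4] - 3 for standardized z.
    Values well above 0 indicate a distribution that is more peaked
    and heavier tailed than a Gaussian, as described for the
    specific humidity O - P histograms."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4) - 3.0
```

As a reference point, a Laplace (double-exponential) distribution, often used as a simple model of peaked innovation statistics, has an excess kurtosis of 3.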
For winds, note that the low-level winds include and are dominated by scatterometer observations. The plots of O − P show many areas of smaller-scale coherent differences for both wind speed and wind components. The histograms of wind speed show the typical Weibull distribution (Pavia and O’Brien 1986). The distribution of the differences is approximately normal for high-level wind speed, but it becomes increasingly peaked lower in the atmosphere. The histograms of O − P for wind components are approximately normal, with standard deviations on the order of 5, 4, and 3 m s−1 for high-, middle-, and low-level winds, respectively.
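The Weibull character of wind speed mentioned above can be illustrated by drawing a synthetic sample via inverse-CDF sampling and checking it against the theoretical Weibull mean. The shape and scale values below are hypothetical stand-ins, not fits to the observations in this study.

```python
import numpy as np
from math import gamma

# Hypothetical Weibull parameters for surface wind speed.
k, lam = 2.0, 8.0            # shape (dimensionless) and scale (m s-1)

rng = np.random.default_rng(42)
u = rng.random(50_000)
# Inverse CDF of Weibull(k, lam): F^-1(u) = lam * (-ln(1-u))^(1/k)
speeds = lam * (-np.log1p(-u)) ** (1.0 / k)

# Theoretical mean of Weibull(k, lam): lam * Gamma(1 + 1/k)
expected_mean = lam * gamma(1.0 + 1.0 / k)
```

With 50 000 draws the sample mean falls within a few hundredths of a meter per second of the theoretical value, the kind of agreement one would also look for between the O and P wind speed histograms.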
7. Validation of simulated BT observations from proposed sensors
Validation of simulated BT observations from proposed future sensors is much more difficult than the validation of existing observations described in the previous sections. First, the validation of existing observations compares forward model calculations and sensor observations that are both successfully used in operational DA. Second, for some proposed sensors, forward model calculations and/or the creation of comparison sensor observations may require entirely novel methods. Validation of simulated BT observations from proposed sensors is also critical for OSSEs: confidence in the simulation of observations from proposed sensors is required for the OSSE results to be reliable and actionable.
For validation of a proposed sensor, exact matchups of real and simulated observations may not be possible, because the proposed sensor may have different characteristics than any existing sensor. For example, in this section we validate the simulation of BTs for the channels of the proposed Microsized Microwave Atmospheric Satellite-2 (MicroMAS-2) sensor (Blackwell 2017). MicroMAS-2 has channels both very similar to and very different from existing sensor channels. Examples of each are given in the next two sections.
In comparisons involving proposed sensors, we must usually account for differences in observing geometry and coverage. Since AMSU-A and AMSU-B are in LEO, most of Earth is observed over the course of 24 h, except for some diamond-shaped regions around the equator. In contrast, both the constellation of 12 MicroMAS-2 sensors in the Time-Resolved Observations of Precipitation Structure and Storm Intensity with a Constellation of Smallsats (TROPICS) mission, hosted on small satellites in low-inclination (30°) orbits, and a geostationary microwave satellite constellation (Geo-MW) will observe the entire low-latitude region frequently. AMSU-A, AMSU-B, and MicroMAS-2 are cross-track scanners, so these sets of observations correspond to a wide range of scan angles (here, the sensor scan angle, sometimes referred to as the viewing angle, is the angle between nadir and the pointing direction of the sensor telescope). However, for two different sensors, these angles generally do not match at locations on the surface because of the difference in orbits. For example, the scan angle is zero only at the five subsatellite points for geostationary Earth orbit (GEO) sensors, but it is zero all along the subsatellite track for the LEO sensors. Thus, in the general case, validation is performed by statistical comparisons stratified by scan angle, latitude band, and land–ocean flag. For the purpose of the present MicroMAS-2 validation, we assume that the observing geometry of MicroMAS-2 exactly matches that of AMSU-A or AMSU-B. In this case, for channels present in both MicroMAS-2 and AMSU, the validation applies to both (e.g., section 7a).
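The stratified comparison described above amounts to grouping observations by discretized scan angle, latitude band, and surface type before computing statistics. The sketch below uses hypothetical dictionary keys and bin widths (10° scan-angle bins, 30° latitude bands) purely for illustration.

```python
from collections import defaultdict
from statistics import fmean, stdev

def stratified_stats(obs):
    """Mean and standard deviation of BT, stratified by scan-angle bin,
    latitude band, and land-ocean flag.

    obs: iterable of dicts with (hypothetical) keys
         'bt' (K), 'scan_angle' (deg), 'lat' (deg), 'is_land' (bool).
    """
    groups = defaultdict(list)
    for o in obs:
        key = (10 * int(o['scan_angle'] // 10),   # 10-degree scan-angle bins
               30 * int(o['lat'] // 30),          # 30-degree latitude bands
               'land' if o['is_land'] else 'ocean')
        groups[key].append(o['bt'])
    return {k: (fmean(v), stdev(v) if len(v) > 1 else 0.0)
            for k, v in groups.items()}
```

Because the statistics are computed per stratum rather than per matched observation pair, this approach works even when the two sensors never view the same point at the same scan angle.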
a. Microwave (183 GHz) water vapor channel observations
Figure 9 compares brightness temperature observed by AMSU-B and simulated for MicroMAS-2 for the NR initial day. From right to left, these channels sense the UT, MT, and LT humidity. The frequencies of these channels are all centered on the strong 183.31-GHz water vapor spectral line. At the line center a moist atmosphere is opaque, and the farther away a channel is from the line center, the farther into the atmosphere it senses. In the maps of Fig. 9, we are essentially seeing the atmospheric temperature at one optical depth. The largest overall difference is that P (Fig. 9b) is much colder than O (Fig. 9a), indicating significantly more moisture in the G5NR UT and resulting in areas of higher O − P (Fig. 9c), particularly in the Indian Ocean. Since moisture is correlated vertically, we see similar spatial patterns going from left to right, but with the BT values increasing as the channels see deeper into the atmosphere and with the magnitude of the differences decreasing. In the LT, the overall moisture in the G5NR is similar to reality (Figs. 9g and 9h). In the MT, the G5NR is somewhat moister than reality (Figs. 9d and 9e). The spatial patterns match in the NR and reality, but there are substantial small-scale differences as might be expected for a short-term forecast of humidity.
In Fig. 9, the shapes of the histograms are somewhat similar, but the simulated MicroMAS-2 BTs are skewed to lower values compared to the real AMSU-B BTs, especially for the UT (cf. Figs. 9a and 9b). Gelaro et al. (2015) report that G5NR specific humidities are climatologically high above 700 hPa, and this is seen even in this initial-day comparison.
b. Microwave (118 GHz) temperature sounding channel observations
Proposed future microwave sounding sensors such as MicroMAS-2 (Blackwell 2017) and GeoStorm (Lambrigtsen et al. 2017) exploit frequencies around the 118-GHz oxygen absorption band for temperature sounding. To date, sensor designers have preferred the 60-GHz oxygen band for temperature sounding, but the 118-GHz band is favored for GEO or for small satellites, since a smaller antenna may be used. To validate these channels, multiple linear regression making use of one or more of the 60-GHz channels is used to predict the BT of 118-GHz channels. For example, MicroMAS-2 channel 8, with a frequency of 118.59 GHz, was regressed on AMSU-A channels 9 and 10, with frequencies of 57.29 and 57.29 ± 0.22 GHz. These two AMSU channels have weighting function peaks that bracket the peak of MicroMAS-2 channel 8. The training sample is perfect simulated observations for all three channels at the AMSU-A locations during the first 6 h of the NR initial day, and the regression was determined separately for land and ocean. Figure 10 (top panels) illustrates the goodness of fit achieved (R2 = 0.996 for land and 0.981 for ocean, where R is the correlation coefficient). Excluding the imagery channel (channel 12) over land, all values of R2 were greater than 0.94 using either one or two predictors. Then, predicted 118.59-GHz BTs were determined using the regression relationships and real BTs from ATMS channels 10 and 11, which are identical to AMSU-A channels 9 and 10. Figure 10 (center panels) compares the distributions of both predicted (red) and simulated (black) 118.59-GHz BTs over a 1-month period. These histograms show relatively good agreement between the predicted and simulated observations for both land and ocean. The predicted BTs are based on the August 2014 ATMS observations, and the simulated BTs are evaluated from the G5NR during August 2006 at the August 2014 ATMS observation locations.
Thus, these two sets of observations agree in their distribution with respect to time of day, time of year, and geographic location. Figure 10 (bottom panels) extends the statistical comparison to all channels and shows generally good agreement between real and simulated BT statistics. For the LT and MT temperature channels (2–6) and for the water vapor channels (9–11), simulated BTs are cool over the ocean and warm over land compared to reality. The simulated and predicted means of the surface channel (channel 1) over land and the UT temperature channels (channels 7 and 8) agree well, but over the ocean the surface channel’s simulated BTs are much cooler. Simulated and predicted standard deviations are similar over the ocean, but over land the simulated standard deviations are generally smaller.
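The regression step underlying this validation is ordinary least squares with an intercept, scored by the coefficient of determination R². The sketch below is a generic implementation under our own naming; the operational fit in the paper may differ in detail (e.g., in predictor selection or quality control).

```python
import numpy as np

def fit_and_r2(X, y):
    """Ordinary least squares y ~ b0 + X @ b, returning the coefficient
    vector (intercept first) and R^2. Here X would hold the 60-GHz
    predictor channels and y the 118-GHz target channel."""
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return beta, r2
```

Once fitted on simulated BTs, the same `beta` can be applied to real 57-GHz BTs to produce the "predicted" 118.59-GHz observations compared against the simulated ones in Fig. 10.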
8. Concluding remarks
As stated in Boukabara et al. (2016), rigorous OSSEs require a complex infrastructure. The CGOP provides all components of such an infrastructure for global NWP applications—the NR, the simulation of existing and proposed future observations, and the entire NWP system. As part of the CGOP release process, the individual system components and the entire OSSE system are being tested, validated, and calibrated. The CGOP is by design modular, and individual components can be replaced relatively easily. However, the complexity and sensitivity of the interactions between components mean that the introduction of a new component may require extensive validation and calibration of the OSSE system as a whole, including multiple OSSE components. For example, a new, more realistic NR may allow changes that make the simulated observations more realistic, may require retuning of the simulated observation errors caused by differences in representativeness, and may require modifications to the forecast model to maintain realistic predictability properties.
In this study, the simulation of observations—a critical CGOP component—is validated first by comparison of error-free simulated observations for the NR initial day to the real observations for that period. Sample results of this validation are presented here for LEO MW and IR BT observations, for RO bending angle observations, and for various types of conventional observations. For sensors not operating at the start of the NR, a qualitative validation is obtained either by comparing geographic and statistical characteristics of observations for such a sensor with those of an existing similar sensor or by using an existing sensor to predict the observations of the proposed sensor. Sample results of this validation are presented here for proposed MW BT observations. The comparisons agree, with no significant unexplained bias, and to within the uncertainties caused by real observation errors, time and space collocation differences, radiative transfer uncertainties, and representativeness differences (differences between the NR model and the real world). Overall, the CGOP simulations of error-free observations for conventional and satellite-based measurements (microwave, infrared, radio occultation, geostationary, and polar platforms) are found to be reasonably accurate and suitable as a starting point for creating realistic simulated observations for OSSEs.
The inconsistencies that are observed are explained by the following factors. First, the real clear-sky BTs have some residual effects of precipitation and clouds that were not detected by the GSI QC procedures. Second, the NR representation of the radiative characteristics of land surfaces differs from reality and differs from the representation in the CRTM forward radiative model used in the CGOP simulation of BTs. These inconsistencies are acceptable if cloud- and surface-affected radiances are not assimilated, as is the current practice. In the future, when DA systems assimilating all-sky radiances are included in OSSE systems, accurate simulation of precipitation- and cloud-affected radiances will be necessary.
The remaining CGOP validation and calibration activities will be the subject of a separate paper. These activities include checking the quality control yields, tuning the explicitly added errors so that the simulated observation innovation statistics match those observed in reality, comparing forecast skill in the OSSE system and in reality, and tuning and validating the OSSE system as a whole. This last activity includes a demonstration that the impact of existing observing systems is similar in reality and in the OSSE.
Many scientists have contributed to the development of the CGOP over a period of more than a decade. They come from numerous laboratories and centers within NOAA and NASA, the JCSDA, and the NCEP and ECMWF operational prediction centers. We thank these scientists and their institutions. The authors particularly acknowledge the contributions from other members of the NOAA QOSAP OSSE team—Krishna Kumar, Zhenglong Li, Michiko Masutani, Isaac Moradi and Jack Woollen—and our NASA GMAO colleagues—Ron Gelaro, Bill Putman, and many others—who produced the G5NR at NASA GMAO. In addition, we thank the team that developed and maintain the CRTM, including Mark Liu and the late Paul VanDelst. We thank the reviewers. Responding to their insightful suggestions improved this paper.
We gratefully acknowledge support for this work provided by NOAA—directly, under the Disaster Relief Appropriations Act of 2013 (H.R. 152), under the Scientific and Technical Services II Federal Contracting Vehicle (DOCDG133E12CQ0020), and under the auspices of the cooperative institutes given in the author affiliations on the title page through Cooperative Agreements NA14NES4320003 for CICS, NA15OAR4320064 for CIMAS, NA14OAR4320125 for CIRA, and NA15OAR4320137 for CIRES.
Definition of Acronyms
This article has a companion article, which can be found at http://journals.ametsoc.org/doi/abs/10.1175/JTECH-D-16-0012.1.
Errico et al. (2013) included representativeness differences in their definition of implicit errors, but we will keep them separate in this discussion.
We use the phrase representativeness differences, since there can be no representativeness errors in the OSSE system where the NR is considered to be the truth.
GeoStorm is a particular mission implementation of the Geostationary Synthetic Thinned-Aperture Radiometer (GeoSTAR) design (Lambrigtsen et al. 2007).