A modeling investigation explores the impacts of the assimilation of satellite-retrieved soundings on forecast error in the Fifth-Generation Pennsylvania State University–National Center for Atmospheric Research Mesoscale Model (MM5). Simulations of the period of the U.S. Air Force’s Contrail Experiment (18–29 September 1995) vary the initialization method and datasets assimilated, the performance of first-guess reanalysis, the imposition of quality control (QC) on the satellite data, and the frequency of the model update cycle. MM5 experiments employing four-dimensional data assimilation (FDDA) are compared with a control experiment without FDDA. In the former, combinations of conventional surface and radiosonde observations and retrieved temperature and moisture soundings from the Defense Meteorological Satellite Program (DMSP) and Television and Infrared Observation Satellite Operational Vertical Sounder (TOVS) satellite instruments are assimilated. Forecast error statistics for the experiments are computed and analyzed. It is found that for retrieved temperatures the DMSP and TOVS sounding datasets used have similar, reasonable accuracy, but for retrieved dewpoints they display significant, and more differing, errors. Overall, the TOVS retrievals obtained are of poorer quality than are the DMSP retrievals. Sensitivity tests reveal that imposing a QC filter on the satellite data prior to assimilation does improve the resultant MM5 simulations. With such QC, it is found that assimilating DMSP and TOVS soundings with the methods used can significantly improve the forecasts of both temperature and moisture variables in the MM5. Model performance, however, can still reflect the relative quality of the satellite retrievals assimilated, with the lower-error DMSP data yielding better simulations than do the TOVS data. 
Tests exploring the reanalysis of first-guess fields obtained from FDDA show that it does benefit the short-term (0–12 h) forecast but that significant gains diminish thereafter.
Contrail forecasting poses a unique challenge for the U.S. Air Force Weather Agency (AFWA). To aid in the prediction of contrail formation, AFWA has developed algorithms that receive data on environmental relative humidity (RH) and pressure and output critical temperatures at or below which contrails will form (see, e.g., Appleman 1953). Enhancing contrail forecast accuracy requires improved temperature and relative humidity predictions through the troposphere and lower stratosphere; although mesoscale models offer the potential for such improvements, their guidance must be interpreted with care. Previous verification studies of operational (Nested Grid Model, Eta) and research (MM5, see below) mesoscale models (e.g., in the Storm-Scale Operational and Research Meteorology–Fronts Experiment Systems Test and Winter Icing and Storms Project field projects) have shown that they can have significant root-mean-square errors (up to 30%) and bias errors in upper-level moisture forecasts (Kuo et al. 1994; see also Manning and Davis 1997). Given that improved model initializations can reduce such errors, and in light of AFWA’s objective of improving contrail and other forecast parameters, this study examines the impact of the assimilation of satellite soundings on the performance of a widely used mesoscale model in simulating temperature and moisture through the troposphere and lower stratosphere. The model is the Fifth-Generation Pennsylvania State University–National Center for Atmospheric Research Mesoscale Model (MM5).
Errors in model-predicted temperature and RH may be due both to 1) temperature and moisture errors in the initial conditions and 2) deficiencies in model physics. With respect to RH, errors in initial conditions at upper levels reflect the inability of synoptic radiosonde networks to resolve moisture gradients (often smaller in scale than those of temperature) and, particularly at upper levels, inaccuracies in the sparse observations themselves (from the degraded performance of humidity sensors aloft). To improve mesoscale simulations of temperature and humidity, better initial thermodynamic analyses are needed. One avenue for generating these improved analyses is the assimilation of temperature and moisture soundings from satellites, which can provide much denser observational coverage with greater horizontal resolution than can the operational radiosonde network; the latter leaves large data voids over the majority of the globe.
This study investigates the impacts of satellite sounding assimilation on the performance of a popular mesoscale model, the MM5. The satellite datasets used contain temperature and moisture measurements from the Defense Meteorological Satellite Program (DMSP) and Television and Infrared Observation Satellite (TIROS) platforms. Model experiments are performed in which these satellite data and conventional (i.e., surface synoptic and rawinsonde) data both are assimilated. The simulations cover a 12-day period of the Air Force’s Contrail Experiment, which took place in the northeastern United States from 18 to 29 September 1995. As part of this program, special high-resolution radiosondes were launched regularly at five sites in New England. These soundings provide a database for both assessment of the satellite data and verification of the model.
This work investigates the effects of the assimilation of DMSP and TOVS (described below) satellite data on MM5 simulations of not only upper-tropospheric conditions significant in contrail prediction, but also the lower- and middle-tropospheric conditions necessary for general forecasting. Presented are results of satellite data-quality analyses and satellite data-assimilation experiments. The results given herein reflect analyses of the full troposphere, namely, the 900–250-hPa depth.
The MM5 (Grell et al. 1995) is a nonhydrostatic, primitive-equation, mesoscale forecast model. This section summarizes the MM5’s configuration for the experiments performed. For full model details the reader may consult Dudhia (1993), Grell et al. (1995), and Haagenson et al. (1994).
The MM5 configuration used for the experiments features two domains in a two-way nested setup. The coarse- and fine-grid horizontal resolutions are 54 and 18 km, respectively. There are 45 vertical σ levels (described below) between the surface and the model top at 70 hPa. Figures 1a,b show the model domains. In Fig. 1a the dots locate the standard synoptic upper-air sounding sites; in Fig. 1b they locate the sites of the special, high-resolution soundings of the Contrail Experiment.
In the MM5, the variables pressure p, temperature T, and density ρ are defined through a constant reference state and perturbations from it (Grell et al. 1995). For example, pressure is defined by
$$p(x, y, z, t) = p_0(z) + p'(x, y, z, t), \tag{1}$$
where p0 is the reference, or base state, pressure, and p′ is the deviation from it. Here x, y, and z are spatial coordinates, and t is time. In this setting, pressure perturbation p′ is the predicted variable. The MM5 uses a terrain-following σ vertical coordinate defined as
$$\sigma = \frac{p_0 - p_t}{p_s - p_t}, \tag{2}$$
where ps is the reference state surface pressure, and pt is the model-top pressure (here 70 hPa) (Grell et al. 1995).
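As a minimal illustration of the σ coordinate, the sketch below maps reference-state pressures to σ values, assuming the standard MM5 form σ = (p0 − pt)/(ps − pt). The function name and the 1000-hPa sample surface pressure are hypothetical; only the 70-hPa model top comes from the configuration described here.

```python
def sigma_from_pressure(p0_hpa, ps_hpa, pt_hpa=70.0):
    """Terrain-following sigma: 0 at the model top, 1 at the surface."""
    if ps_hpa <= pt_hpa:
        raise ValueError("surface pressure must exceed model-top pressure")
    return (p0_hpa - pt_hpa) / (ps_hpa - pt_hpa)


# Hypothetical column with a 1000-hPa reference surface pressure.
for p0 in (1000.0, 535.0, 70.0):
    print(p0, sigma_from_pressure(p0, ps_hpa=1000.0))
```

Because σ depends only on the reference state, the σ surfaces follow the terrain through ps while remaining fixed in time.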
The moist physics package consists of an ice-process explicit scheme for grid-resolved processes and a cumulus parameterization for subgrid-scale processes. The explicit scheme (Dudhia 1989; see also Grell et al. 1995) has prognostic equations for water vapor, cloud water, and rainwater. An ice phase transition occurs as the freezing level is crossed, but there is no mixed-phase treatment. Implemented in this scheme is an ice sedimentation parameterization (Heymsfield and Donner 1990) to improve the simulation of cloud ice above 400 hPa (see Manning and Davis 1997). The cumulus scheme is the Kain–Fritsch parameterization (Kain et al. 1993). The Blackadar high-resolution PBL scheme (Blackadar 1979; Zhang and Anthes 1982) simulates boundary layer processes.
The model experiments consist of series of simulations of 48-h segments (referred to here as “cases”) initialized every 12 h and covering the period 1200 UTC 18 September–0000 UTC 29 September 1995. For the initialization of the first segment (i.e., for 1200 UTC 18 September) the National Centers for Environmental Prediction (NCEP) global analysis provides the first-guess field; for the initializations of the subsequent segments, the previous case’s 12-h MM5 forecast provides the first-guess field. There are reanalyses of the first-guess fields with observations through a Cressman scheme (Manning and Haagenson 1992), which can be regarded as intermittent, “static” (see, e.g., Daley 1991) assimilations. These steps shall be referred to as “reanalysis,” in contrast to the dynamic assimilations performed during FDDA (described below). The lateral boundary conditions for all of the experiments are derived from archived analyses and 6-hourly forecast data from NCEP’s spectral, or aviation (AVN) model for the study period.
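The static reanalysis step can be pictured with a single-pass sketch of a Cressman-type correction. It assumes the classic Cressman weight w = (R² − r²)/(R² + r²) inside an influence radius R and zero outside it. The function name, array layout, and all numbers are hypothetical, and the sketch does not reproduce the MM5 preprocessor (Cressman analyses commonly use several passes with shrinking radii, which are omitted here).

```python
import numpy as np


def cressman_pass(grid_xy, first_guess, obs_xy, obs_val, fg_at_obs, radius):
    """One Cressman correction pass on a set of grid points.

    grid_xy: (G, 2) grid-point coordinates (km); obs_xy: (K, 2) observation
    coordinates (km); fg_at_obs: first guess interpolated to the observations.
    """
    grid_xy = np.asarray(grid_xy, dtype=float)
    obs_xy = np.asarray(obs_xy, dtype=float)
    innov = np.asarray(obs_val, dtype=float) - np.asarray(fg_at_obs, dtype=float)
    # Squared distance from every grid point to every observation: shape (G, K).
    r2 = ((grid_xy[:, None, :] - obs_xy[None, :, :]) ** 2).sum(axis=2)
    big_r2 = radius ** 2
    w = np.where(r2 < big_r2, (big_r2 - r2) / (big_r2 + r2), 0.0)
    wsum = w.sum(axis=1)
    # Weighted-mean innovation; points with no observation in range keep the first guess.
    corr = np.divide(w @ innov, wsum, out=np.zeros_like(wsum), where=wsum > 0)
    return np.asarray(first_guess, dtype=float) + corr


# Hypothetical example: two observations, three grid points along a line.
analysis = cressman_pass(
    grid_xy=[[0.0, 0.0], [100.0, 0.0], [500.0, 0.0]],
    first_guess=[10.0, 10.0, 10.0],
    obs_xy=[[0.0, 0.0], [100.0, 0.0]],
    obs_val=[12.0, 11.0],
    fg_at_obs=[10.0, 10.0],
    radius=200.0,
)
print(analysis)
```

The third grid point lies outside the influence radius of both observations, so it retains its first-guess value.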
The dynamic assimilation method (see, e.g., Daley 1991) used is four-dimensional data assimilation (FDDA) via Newtonian relaxation (i.e., “nudging”). The technique is summarized here; for full details the reader is referred to Stauffer and Seaman (1990) and Grell et al. (1995). The relaxation method involves the application of forcing functions to the model equations at every time step to nudge the simulation toward observations or toward gridded analyses. The data assimilation is performed in a 12-h time window [the 12 h preceding the beginning time of the free (unnudged) forecast of each case]. During this period, both analysis and observation nudging are used. In the former, gridded analyses are used to nudge the model, and in the latter, conventional surface, conventional upper-air, and satellite-retrieved observations are applied. The inclusion of the conventional observations in the analyses implies that the effect of these observations is felt from both the analysis and the observation nudging; the satellite measurements are felt from the observation nudging only.
In the nudging scheme, the model’s prognostic equations are adjusted through artificial tendency terms that are based on the difference between the model and analyzed/observed states. The artificial tendency terms appear in the model equations as follows:
$$\frac{\partial p^{*}\alpha}{\partial t} = F + G_{\alpha} W_{t}\, p^{*} (\alpha_{0} - \alpha), \tag{3}$$
where α is the model variable to be nudged, p* = ps − pt [with ps and pt as in Eq. (2)], and F represents the model physical forcing terms. The value Gα is a nudging coefficient establishing the relative weight of the nudging term, Wt is a four-dimensional weighting function, and α0 is the gridded analysis value or the observation value. Note that, in an initial series of experiments, there was no static reanalysis with standard surface and upper-air observations after the 12-h nudging period. However, testing suggested that such a reanalysis following the dynamic assimilation would be beneficial, and thus all of the experiments described below (except CONV_NORA) do include post-FDDA static reanalysis.
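The behavior of Newtonian relaxation can be sketched with a toy scalar problem. The sketch assumes a tendency of the form dα/dt = F + G·W·(α0 − α), that is, the nudging equation with p* held constant and divided out, integrated with forward Euler. The function name, the coefficient value, the time step, and the temperatures are illustrative assumptions, not the settings of the experiments.

```python
def nudge(alpha_init, alpha_obs, g_coef, weight, dt, n_steps,
          forcing=lambda alpha, t: 0.0):
    """Integrate d(alpha)/dt = F + G * W * (alpha_obs - alpha) with forward Euler."""
    alpha, t = alpha_init, 0.0
    history = [alpha]
    for _ in range(n_steps):
        tendency = forcing(alpha, t) + g_coef * weight * (alpha_obs - alpha)
        alpha += dt * tendency
        t += dt
        history.append(alpha)
    return history


# Hypothetical numbers: a 2-K model-observation mismatch relaxed over a 12-h window.
G = 3.0e-4   # s^-1, illustrative nudging coefficient (assumed)
DT = 60.0    # s, illustrative time step (assumed)
steps = int(12 * 3600 / DT)
temps = nudge(alpha_init=270.0, alpha_obs=272.0, g_coef=G, weight=1.0,
              dt=DT, n_steps=steps)
print(round(temps[-1], 3))
```

With no physical forcing, the mismatch decays with an e-folding time of roughly 1/(G·W). A small G lets the model physics dominate, and a large G forces the solution tightly toward the data.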
The datasets for the nudging assimilations include: 1) standard surface and upper-air observations obtained from 12-hourly radiosondes and 3-hourly surface station reports, 2) gridded analyses created by the objective analysis of the standard surface and upper-air observations, and 3) satellite-retrieved temperature and moisture soundings from the DMSP platform and the TIROS Operational Vertical Sounder (TOVS) instrument (see Smith et al. 1985). Because of the highly suspect quality of radiosonde humidity measurements above 300 hPa (see, e.g., Elliot and Gaffen 1991; Wade 1994), there is no nudging of humidity at 250 hPa and above. From the satellites, the retrieved quantities are on mandatory pressure levels through 70 hPa, with TOVS also providing a surface value. During the Contrail Experiment (18–29 September 1995) there were 50 passes of the DMSP satellite and 99 passes of the TOVS satellite over the MM5 domains. Figure 2 presents an example of the distribution of DMSP observations in one pass. The horizontal resolution of the DMSP retrievals is about 46 km; that of the TOVS retrievals is about 75 km.
During the Contrail Experiment, five sites in New England (located in Fig. 1b) launched special radiosondes at 3- and 12-hourly intervals (the frequency varied with the day and time period). These high-resolution ascents delivered measurements of temperature, moisture, and winds every 0.2–1.0 hPa. The special sounding data are used in the model verifications but were not used in either the nudging or reanalysis to keep the simulations over their relatively narrow area (Fig. 1b) independent of them.
A comparison of the radiosonde and the satellite sounding measurements revealed discrepancies sufficient to warrant a quality check of the satellite data. Thus, in the experiments described, a quality-check (QC) procedure implementing error thresholds was imposed on the DMSP and TOVS data prior to their assimilation. Section 3 describes the results regarding QC.
Table 1 lists the experiments performed. These experiments vary the kind of data assimilated, QC procedure, reanalysis, and update cycle. The experiments have either a 12-h or a 24-h update cycle—that is, they use either the 12-h or the 24-h forecast as the first guess for the new simulation (new 48-h segment). In the FDDA runs, analysis nudging is performed on the coarse grid only, but observation nudging is done on both grids. An experiment consists of a series of 48-h “cases,” each of which represents a new 48-h simulation initialized 12 h after the beginning of the previous case segment. Table 1 shows the number of cases run for each experiment. This number is either 22 for experiments running from 18–29 September or 12 for experiments running from 18–24 September. The shorter experiments target specific sensitivity questions that did not require simulation of the full Contrail Experiment period. These questions are described at the end of section 4.
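The case counts can be checked with a short sketch that enumerates initialization times. It assumes that cases are initialized every 12 h from 1200 UTC 18 September 1995 through 0000 UTC on the last day inclusive; the function name is hypothetical, and the inclusive end points are an assumption rather than a statement of how the runs were scheduled.

```python
from datetime import datetime, timedelta


def case_start_times(first, last, step_hours=12):
    """Initialization times from first to last inclusive, every step_hours."""
    times, t = [], first
    while t <= last:
        times.append(t)
        t += timedelta(hours=step_hours)
    return times


# Assumed end points: full period (to 29 September) and shortened period (to 24 September).
full = case_start_times(datetime(1995, 9, 18, 12), datetime(1995, 9, 29, 0))
short = case_start_times(datetime(1995, 9, 18, 12), datetime(1995, 9, 24, 0))
print(len(full), len(short))
```

Under these assumptions the enumeration yields 22 and 12 initialization times, in agreement with the case counts quoted for the long and short experiments.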
The control experiment (CTRL) consists of the series of 48-h runs with no FDDA. Two variants of the control are CTRL_12H and CTRL_24H: each case in CTRL_12H is initialized with a 12-h MM5 forecast first guess, and each case in CTRL_24H is initialized with a 24-h MM5 forecast first guess. Note that all of the experiments besides CTRL_24H use a 12-h forecast for the first guess (either with or without FDDA during such 12-h period). The three main nudging assimilation experiments are CONV, DMSP, and TOVS, and they reflect different combinations of datasets. CONV uses standard (conventional) surface and upper-air observations (i.e., 3-hourly surface and 12-hourly upper-air); DMSP uses the standard observations plus the DMSP satellite observations; and TOVS uses the standard observations, plus the DMSP observations, plus the TOVS instrument observations.
Two additional pairs of experiments are variants of CONV and TOVS. A pair of CONV experiments addresses the issue of static reanalysis following the 12-h FDDA period: CONV_REAN and CONV_NORA. In CONV_REAN the first guess yielded by the 12-h assimilation period is objectively reanalyzed with conventional surface and upper-air observations; in CONV_NORA there is no such reanalysis, and the free forecast proceeds directly upon the end of the nudging period. Two TOVS experiments, TOVS_QC and TOVS_NOQC, examine the importance of imposing QC on the satellite data. In TOVS_QC, QC is performed on the satellite retrievals before their assimilation; no such QC is done in TOVS_NOQC.
To assess the forecast impact of the data assimilations, verifications were performed with both the conventional radiosondes and the special soundings. To do this verification, the model values at grid points neighboring the sounding locations were interpolated to such locations and compared with the observed values. Verification skill scores of root-mean-square error (rmse), bias error (bias), and correlation coefficient (CC) were then computed. Section 4 examines these results.
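The three verification scores can be sketched as follows for a set of paired model and observed values. The function name is hypothetical, and the sign convention for bias (model minus observation) is an assumption.

```python
import numpy as np


def verification_scores(forecast, observed):
    """Return (rmse, bias, correlation coefficient) for paired values."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    err = f - o                               # assumed convention: model minus observation
    rmse = float(np.sqrt(np.mean(err ** 2)))
    bias = float(np.mean(err))
    cc = float(np.corrcoef(f, o)[0, 1])       # Pearson correlation
    return rmse, bias, cc


# Hypothetical temperatures (deg C) at four sounding sites.
rmse, bias, cc = verification_scores([1.0, 2.0, 3.0, 4.0], [0.0, 2.0, 4.0, 4.0])
print(rmse, bias, cc)
```

A forecast can have zero bias and still a nonzero rmse, as in this example, which is why the scores are examined together.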
To evaluate the statistical significance of the differences in experiment performance, as reflected in the skill score variations, Student’s t test (Panofsky and Brier 1965) analyses have been conducted. In this setting, the null hypothesis is that two given experiments run over an infinite number of cases would have the same mean scores. Rejection of this hypothesis at the 5% level indicates that the difference is significant at the 95% confidence level. The t test value is computed as follows:
$$t = \frac{D\sqrt{N}}{s}, \tag{4}$$
where N is the number of cases, D is the mean of the differences in scores between the two experiments averaged over the sample cases, and s is the standard deviation of the differences in the sample. The value D is defined as
$$D = \frac{1}{N}\sum_{n=1}^{N} D_n, \tag{5}$$
where Dn is the difference in the given score for the nth case. Here, s is formulated as
$$s = \left[\frac{1}{N-1}\sum_{n=1}^{N}\left(D_n - D\right)^{2}\right]^{1/2}. \tag{6}$$
If the calculated t value is greater than the critical t value [from tables reflecting the degrees of freedom (N − 1) and the significance level], the experiment differences are statistically significant. Section 4 addresses the significant differences in the experiments.
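A paired t statistic of this kind can be sketched in a few lines. The sketch assumes the standard paired form t = D·√N/s, with s the sample standard deviation of the differences computed with N − 1 in the denominator. The function name and the scores are hypothetical.

```python
import math


def paired_t(scores_a, scores_b):
    """Paired t statistic for per-case score differences (a minus b)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two cases")
    d_mean = sum(diffs) / n
    s = math.sqrt(sum((d - d_mean) ** 2 for d in diffs) / (n - 1))
    return d_mean * math.sqrt(n) / s


# Hypothetical rmse scores for four cases of two experiments.
t_value = paired_t([1.0, 2.0, 3.0, 4.0], [0.5, 1.0, 2.5, 3.0])
print(t_value)
```

The resulting value is then compared with the tabulated critical t for N − 1 degrees of freedom at the chosen significance level.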
Assessment of satellite data quality
To assess their accuracy, the DMSP and TOVS retrievals collected during the Contrail Experiment period were compared with the ground-based regular and special soundings. The comparisons were made through pairing the satellite soundings with the observed profiles. A satellite–radiosonde observation pairing was done for all satellite soundings that were within 2 h and 0.2° latitude/longitude of the time and location of the radiosonde launch. Obtaining a radiosonde counterpart at the specific time of the satellite profile was done through the linear temporal interpolation of the surrounding observed soundings.
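The pairing criteria and the temporal interpolation can be sketched as below. The dictionary layout and function names are hypothetical, and applying the 0.2° tolerance separately to latitude and longitude is an assumption about how the criterion was implemented.

```python
def is_collocated(sat, raob, max_hours=2.0, max_deg=0.2):
    """True if a satellite sounding lies within the time and space windows.

    sat and raob are dicts with 'time_h', 'lat', and 'lon' keys (hypothetical layout).
    """
    return (abs(sat["time_h"] - raob["time_h"]) <= max_hours
            and abs(sat["lat"] - raob["lat"]) <= max_deg
            and abs(sat["lon"] - raob["lon"]) <= max_deg)


def interp_in_time(t, t0, value0, t1, value1):
    """Linear interpolation of a sounding value between launches at t0 and t1."""
    if t0 == t1 or not (t0 <= t <= t1):
        raise ValueError("t must lie between two distinct launch times")
    frac = (t - t0) / (t1 - t0)
    return value0 + frac * (value1 - value0)


# Hypothetical satellite pass at 1330 UTC near a site with 1200 and 1500 UTC launches.
sat = {"time_h": 13.5, "lat": 42.1, "lon": -71.0}
raob = {"time_h": 12.0, "lat": 42.0, "lon": -71.15}
paired = is_collocated(sat, raob)
td_at_pass = interp_in_time(13.5, 12.0, -10.0, 15.0, -16.0)
print(paired, td_at_pass)
```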
Figures 3a,b show examples of the distribution of satellite-retrieved versus observed (i.e., radiosonde) dewpoints (Td). The values shown are for the full vertical range of measurements for the respective satellite instruments (discussed in section 2). The amount of agreement between the radiosonde and satellite values varies considerably from retrieval to retrieval, with the scatter of the TOVS data (Fig. 3b) greater than that of DMSP data (Fig. 3a).
Table 2 presents the calculated DMSP and TOVS data rms errors, or, perhaps more accurately, rms differences (rmsd), and biases. The results in Table 2a reveal that the tropospherically averaged (over the levels shown) Td rmsd for the DMSP data is about 1.7°C lower than that for the TOVS data, with the DMSP rmsd averaging 7.18°C and the TOVS rmsd averaging 8.85°C. The DMSP Td biases vary from a maximum negative of −0.68°C at 850 hPa to a maximum positive of 3.89°C at 500 hPa. The TOVS Td biases vary from a maximum negative of −0.32°C at 850 hPa to a maximum positive of 8.40°C at 300 hPa (Table 2). It is found that the TOVS Td retrievals have a larger average rmsd than do the DMSP data at a given level. Note that, overall, the quality of the satellite Td’s, both TOVS and DMSP, was so poor at 250 hPa and above that they were not used in the assimilations. Consistent with this fact, the subsequent analyses and discussions focus on levels up to 250 hPa.
For temperature T, the average DMSP and TOVS rmsds are very similar. For both datasets, these values are about 2.8°C through the 850–70-hPa depth, although the average DMSP rmsd is slightly lower. The average biases for temperature differ more substantially, however, with the DMSP data having a negative (cool) bias of ∼−0.5°C, and the TOVS data having a positive (warm) bias of ∼0.8°C.
The evaluations of the retrieved soundings made clear the necessity of running a QC procedure on the satellite data prior to their assimilation. The QC method employed first estimated the error in the retrievals, determined by comparing the retrieved values with time-interpolated values of the gridded analysis fields (available at hours 0 and 12 of the nudging period). The error e here was defined as e = analysis values − satellite retrieval. Maximum error tolerances emax were then set at 3°C for T and 6°C for Td, with these maxima based on the results described above. Satellite measurements differing from the analysis values by more than emax were excluded from the assimilation. Upon application of QC, about 25% of the DMSP T values and over 38% of the TOVS T values were rejected. For Td, 30% of the DMSP values and 41% of the TOVS values were rejected. During the 12-day period analyzed, there was no drift in the flagging rate for either platform’s retrievals.
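The threshold test at the heart of the QC step can be sketched as follows. The tolerances of 3°C for T and 6°C for Td and the error definition e = analysis − retrieval come from the text; the function name and the sample values are hypothetical, and the time interpolation of the analysis to the retrieval time is taken as already done.

```python
T_MAX_C = 3.0    # maximum tolerated |e| for temperature (deg C)
TD_MAX_C = 6.0   # maximum tolerated |e| for dewpoint (deg C)


def qc_filter(retrievals, analysis_values, e_max):
    """Split retrievals into kept and rejected lists using |analysis - retrieval| <= e_max."""
    kept, rejected = [], []
    for value, ref in zip(retrievals, analysis_values):
        e = ref - value                      # e = analysis value - satellite retrieval
        (kept if abs(e) <= e_max else rejected).append(value)
    return kept, rejected


# Hypothetical retrieved temperatures and analysis values at the same points (deg C).
kept, rejected = qc_filter(
    retrievals=[-20.0, -25.5, -18.0, -30.0],
    analysis_values=[-21.0, -22.0, -18.5, -26.9],
    e_max=T_MAX_C,
)
print(kept, rejected, len(rejected) / 4)
```

Because the test uses the magnitude of e, retrievals that are too warm and too cold relative to the analysis are treated symmetrically.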
The first four experiments listed in Table 1, CTRL, CONV, DMSP, and TOVS, constitute the core group. To evaluate their differences, Tables 3a–d present the mean rms and bias errors in the temperature and relative humidity predictions for the 22 cases (i.e., 48-h simulated periods) of these experiments. The results presented here are for the hour-12, hour-24, and hour-36 verification times of each case, and eight pressure levels (see tables) are considered. The boldface entries in the tables indicate errors in the assimilation runs (CONV, DMSP, TOVS) that are at least 0.1°C less for T or 1% less for RH than the corresponding errors in the control (CTRL—no data assimilation). The boldface entries thus show where the assimilation runs are superior.
With respect to the temperature simulations [Tables 3a (rmse), 3b (bias)], the experiments with satellite data assimilation (especially DMSP) do perform better than the control run. For example, DMSP and TOVS show 0.10°–0.17°C lower rms error than CTRL in the 900- and 700-hPa forecasts at 24 h. Moreover, although CONV also performs better than CTRL, CONV is generally not as good as DMSP and TOVS. Overall, DMSP shows the best performance. With respect to the vertically averaged temperature rmse scores, DMSP has the lowest error, about 0.05°–0.07°C less than that of CTRL. With respect to T biases, CONV and DMSP generally have the lowest error over the 700–300-hPa layer at 12–24 h; DMSP is superior to CONV in the 500–300-hPa layer at 24 h.
For relative humidity [Tables 3c (rmse), 3d (bias)], the nudging experiments show lower rmse than does CTRL in the lower troposphere for all verification times. In fact, for the 24–36-h forecasts, the DMSP and TOVS configurations show superior performance extending to 500 hPa. Above 500 hPa, however, FDDA was not clearly superior in reducing RH rmse. At the 300-hPa level (e.g., at a contrail height) CONV and DMSP yield improvements at 24 h; at 250 hPa CONV shows a poorer (higher) rmse score at 24–36 h than CTRL does (Table 3c). With respect to the vertically averaged rmse (Table 3c), DMSP shows the lowest RH errors at 24–36 h by just a slight margin. For biases, the nudging experiments CONV, DMSP, and TOVS all have lower-magnitude RH biases than CTRL does in the lower troposphere (900–850 hPa) at 24–36 h. Up through the middle troposphere (600 hPa), DMSP generally shows the lowest-magnitude biases through 36 h. It is notable, however, that the run without satellite data (CONV) has the lowest bias in the upper troposphere (300 hPa) for the full 12–36-h range; the data-enhanced DMSP and TOVS runs show higher biases than does CONV, although they are generally better than CTRL. At 250 hPa, the satellite data runs are not consistently superior to CTRL. TOVS is poorer at all times. The DMSP results are mixed; they are worse at hour 12, slightly better at hour 24, and the same at hour 36. These results suggest that assimilation of the satellite-retrieved values through the troposphere might actually degrade the model’s upper-level performance in comparison with that of simulations either without FDDA or with FDDA using conventional upper-air data.
Figures 4a,b show vertical profiles of temperature and RH rmse averaged over the 48-h cases for CTRL and DMSP. DMSP is compared with CTRL here because, in general, DMSP’s performance is superior to that of the other assimilation runs. The profiles in Figs. 4a,b reflect the rmse at 18 levels for every 50 hPa between 1000 and 150 hPa. It is seen that, for temperature, DMSP has lower rms errors through the entire column than CTRL does. DMSP’s improvement is greatest in the lower troposphere, specifically below 850 hPa and in the 750–700-hPa layer. For RH (Fig. 4b), DMSP also displays lower rmse in the lower troposphere, mainly in the 900–850- and 750–700-hPa layers. Overall, however, the RH forecast improvement from the DMSP data is small. That it is negligible in the upper-tropospheric/lower-stratospheric region (e.g., above 350 hPa) in part reflects the lack of usable satellite data at 250 hPa and above.
All of the core experiments (CTRL, CONV, DMSP, TOVS) show a maximum of RH error at 300 hPa at all verification times (see, e.g., Fig. 4b). This result is considered to be due, in part, first to differences in the actual and simulated tropopause heights, where 300 hPa is at or near this boundary and its associated moisture gradient. Second, even in the ice scheme used (Dudhia 1989), the MM5’s tendency to retain cloud ice aloft (reflecting overproduction and insufficient fallout) contributes to a moister model environment at upper levels (see Manning and Davis 1997). Although the simulations did employ ice sedimentation modifications (Heymsfield and Donner 1990), these modifications do not eliminate overpredictions of moisture above 400 hPa, as illustrated by Manning and Davis (1997). Also, as seen here, the magnitude of the upper-level moisture error can increase with forecast length. Third, the error at 300 hPa may be linked to error in the upper-level RH initializations, inaccuracy that reflects relatively poor humidity measurements and thus reanalyses, and, in part, the reanalysis procedure used. The latter relates to an MM5 preprocessor that checks for observations and reanalysis point values above 300 hPa that have an RH of less than 10%. Such values are reset (i.e., increased as necessary) to a minimum of 10%. This standard procedure was not changed for these experiments, to make the results more applicable to past and current applications of the MM5. Note that the relatively high RH error aloft is consistent with what has been found in other MM5 studies (Kuo et al. 1994).
The above results address skill score differences between CTRL and the three assimilation experiments (CONV, DMSP, TOVS). The statistical significance of these differences has been evaluated, and Figs. 5a–c present the results in terms of comparisons of the rmse performance. These plots show the numbers of fields for which the assimilation runs perform significantly better than CTRL (number of fields below the 0 ordinate value) and significantly worse than CTRL (number of fields above the 0 ordinate value) for different verification times. The results in Fig. 5 reflect the evaluations of five variables [T, qυ (water vapor mixing ratio), RH, Td, and WND (wind speed)] on eight pressure levels from 900 to 250 hPa (900, 850, 700, 600, 500, 400, 300, and 250 hPa). There are thus 40 quantities for each verification time, for a total of 160 for the four verification times (hours 12, 24, 36, 48). Although it may be argued that Td and qυ might sometimes overlap in occurrences of significant fields, the results are essentially the same if the set of variables examined excludes one (e.g., either Td or qυ), and checks of this experiment have revealed that the conclusions below are not weakened by doing so. In Fig. 5, the unshaded portions of the bars reflect the number of fields that are significantly different at the 90% confidence level, and the shaded portions of the bars reflect the number of fields significantly different at the 95% level.
Figure 5a reveals that the nudging with the conventional (standard surface and upper-air) data in CONV does improve the forecast over CTRL, especially at 12–24 h: for 12 fields, model performance is better than it is without the assimilation (i.e., better than CTRL); for three fields, model performance is poorer with it. With the additional assimilation of DMSP data, the improvement becomes significant over all 48 forecast hours, with DMSP performance better than CTRL for 29 fields and poorer for just 2 fields (Fig. 5b). The further addition of TOVS data, however, yields no additional benefit (Fig. 5c). In fact, the gain over CTRL is smaller than in the DMSP run: TOVS performance is improved over CTRL for 24 fields and poorer than CTRL for 4 fields.
Biases and correlation coefficients have been similarly compared for the experiments, and Fig. 6 presents the results for CTRL versus DMSP. As in Fig. 5, the five variables are evaluated at eight pressure levels between 900 and 250 hPa. Here, DMSP is the focus among the assimilation runs because of its superior showing in the results (Fig. 5) discussed above. The results in Fig. 6 confirm that MM5 with DMSP data assimilation does perform significantly better than without it. For bias, 17 fields are better (lower biases) in DMSP (as compared with 3 fields better for CTRL); for correlation coefficient, 29 fields are better (higher CC) in DMSP (as compared with 2 fields better in CTRL).
Comparisons of significant DMSP and CTRL rmse differences as functions of pressure levels and variables are depicted in Figs. 7a,b. The improvement from the assimilation of DMSP data appears for all pressure levels (Fig. 7a) and for all verification variables (T, qυ, RH, Td, WND) (Fig. 7b). Fields showing improvement are most numerous at lower levels (i.e., below 700 hPa), with a lesser, secondary, maximum between 300 and 250 hPa (Fig. 7a). Showing the breakdown by field for all analyzed levels, Fig. 7b reveals that the field improved most significantly is temperature. In summary, in terms of enhancing forecasting, there is improvement aloft from the additional FDDA ingest of satellite data (beneficial in particular for contrail applications), but the forecast performance gains are actually maximized at lower levels (Fig. 7a).
From the above comparisons of rmse, bias, and CC, it is seen that the improvement to MM5’s 12–36-h forecasts of temperature and moisture from the satellite sounding assimilation occurs mostly in the lower troposphere, and secondarily in the upper troposphere. For a comprehensive comparison of the CTRL and DMSP experiments in terms of temperature and moisture (a composite of the qυ, RH, and Td scores), Tables 4a,b present the results in terms of vertical level and forecast hour. In these tables, “R,” “B,” and “C” mark those levels and forecast times for which there are statistically significant CTRL–DMSP differences (at the 90% confidence level) of rmse, bias, and CC, respectively. Boldface entries indicate DMSP performance that was better than CTRL, and italicized entries indicate DMSP performance that was worse than CTRL. The results show that the assimilation of DMSP satellite data in MM5 can yield statistically significant improvements in both the initializations (hour-0 analyses) and forecasts of temperature and moisture. For the hour-0 analyses, temperature improvements occur throughout the troposphere; for moisture, improvements are mainly either below 850 hPa or aloft at 400–300 hPa. For the forecasts, the temperature improvements occur in the lower troposphere below 700 hPa through all 48 h of the simulations and above 300 hPa for the 24–48-h period. The moisture forecast improvements are mainly at lower levels below 700 hPa for the 12–36-h period; at upper levels, the improvements are less uniform, appearing at different levels for different times (i.e., at 300 hPa for hour 24 and at 400–500 hPa for hour 36). Nonetheless, the assimilation in DMSP does consistently improve the model’s performance over that in CTRL. The only instance in which DMSP’s performance lags that of CTRL is that of the 12-h 500-hPa moisture forecast.
For the sensitivity experiments, Table 5 presents the vertically averaged temperature and RH rmse over 18 pressure levels for three sensitivity experiment pairs: TOVS_QC–TOVS_NOQC, CONV_REAN–CONV_NORA, and CTRL_12H–CTRL_24H (described in section 2). The pressure levels in the vertical averages are every 50 hPa between 1000 and 150 hPa. Recall that the impact of a quality check on the satellite data prior to assimilation was to be addressed in TOVS_QC and TOVS_NOQC. The results show that the QC does decrease temperature and RH rmse, particularly during the first 12 h of the forecast. The TOVS_QC reductions of temperature error of 0.13°–0.4°C and of RH error of 1.2%–5.5% in comparison with TOVS_NOQC indicate a benefit from imposing a QC step prior to the assimilation of the satellite data.
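The vertical averaging over the 18 levels can be sketched as below. The level list follows the description in the text (every 50 hPa between 1000 and 150 hPa); the function name, the use of an unweighted mean, and the sample values are assumptions.

```python
PRESSURE_LEVELS_HPA = list(range(1000, 149, -50))  # 1000, 950, ..., 150 hPa


def vertical_mean(rmse_by_level):
    """Unweighted mean of per-level rmse values keyed by pressure (hPa)."""
    missing = [p for p in PRESSURE_LEVELS_HPA if p not in rmse_by_level]
    if missing:
        raise KeyError(f"missing levels: {missing}")
    return sum(rmse_by_level[p] for p in PRESSURE_LEVELS_HPA) / len(PRESSURE_LEVELS_HPA)


# Hypothetical profiles: one experiment 0.25 deg C better than the other at every level.
with_qc = {p: 1.5 for p in PRESSURE_LEVELS_HPA}
without_qc = {p: 1.75 for p in PRESSURE_LEVELS_HPA}
reduction = vertical_mean(without_qc) - vertical_mean(with_qc)
print(len(PRESSURE_LEVELS_HPA), reduction)
```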
The CONV_REAN and CONV_NORA experiments investigate the effect of reanalyzing the FDDA first-guess field. Comparisons of the CONV_REAN and CONV_NORA errors in Table 5 show that reanalysis after the preforecast FDDA period (done in CONV_REAN) can significantly reduce the hour-0 error, with the improvements here being a 0.51°C reduction in the vertically averaged temperature rmse and an 8.3% reduction in the vertically averaged RH rmse. Although the reanalysis benefits appear to attenuate at 12 h and beyond, Fig. 8 shows that some fields are still significantly improved in the later periods. In Fig. 8, the number of fields improved in CONV_REAN (i.e., having statistically significantly lower rmse than in CONV_NORA) is indicated by the totals above the 0 ordinate value. It is found that reanalysis can improve the 12–48-h forecasts of temperature and moisture at isolated levels (e.g., hour-12 temperature at 400 hPa and hour-36–48 moisture at 850 hPa).
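The kind of tally shown in Fig. 8 can be sketched as follows. The significance test used in the study is not specified in this section, so the sketch substitutes an exact paired sign-flip permutation test on per-case rmse values, evaluated at the 90% confidence level; this is a swapped-in technique for illustration, not the authors' procedure. The function names and the dictionary keyed by (variable, level, forecast hour) are hypothetical.

```python
from itertools import product


def paired_signflip_pvalue(err_a, err_b):
    """Two-sided exact sign-flip permutation p-value for paired differences."""
    diffs = [a - b for a, b in zip(err_a, err_b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    # Enumerate every assignment of signs to the paired differences.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:
            hits += 1
    return hits / total


def count_improved_fields(rmse_test, rmse_ref, alpha=0.10):
    """Count fields whose rmse in rmse_test is significantly lower.

    Both arguments are hypothetical dicts mapping a field key, such as
    ("T", 400, 12), to a list of per-case rmse values (one per 48-h case).
    """
    improved = 0
    for field, a in rmse_test.items():
        b = rmse_ref[field]
        mean_diff = sum(x - y for x, y in zip(a, b)) / len(a)
        if mean_diff < 0 and paired_signflip_pvalue(a, b) < alpha:
            improved += 1
    return improved
```

Swapping the two arguments gives the count of degraded fields, which corresponds to totals plotted below the 0 ordinate value. Exhaustive enumeration is practical only for a small number of cases, since the number of sign assignments doubles with each added case.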
The last pair of experiments in Table 5 (CTRL_12H–CTRL_24H) tests the sensitivity of forecast error to length of update cycle, in the absence of FDDA. These runs compare a 12-h update cycle (CTRL_12H) with a 24-h update cycle (CTRL_24H). Here, neither the vertically averaged temperature rmse nor the vertically averaged RH rmse shows a strong difference signal. Thus, at least where FDDA is not performed, the MM5 forecast accuracy (as reflected in vertically averaged T and RH rmse) does not appear to be significantly sensitive to 12-h versus 24-h update cycle frequencies. Last, note, in a cross-comparison of the experiments in Table 5, that the CONV_REAN 0-h errors show a slight improvement over those in CTRL_12H (i.e., that the data assimilation is beneficial). Although both experiments involve a reanalyzed 12-h forecast as a first guess, CONV_REAN reflects data assimilation during the 12-h period.
Summary and conclusions
This study has explored the impacts of the assimilation of satellite and conventional observational datasets on the forecast errors in a widely used mesoscale model, the MM5. The focus has been on the assimilation of DMSP and TOVS satellite soundings, with an ultimate aim being future tunings of both MM5 initialization and physics to optimize predictions of conditions important for aircraft contrail forecasting. It is emphasized, however, that this presentation has offered results reflecting analyses of the whole troposphere, not simply the highest regions and those most associated with contrails.
The experiments performed vary the types of data assimilated and the model configuration. The datasets assimilated contain standard surface and upper-air observations (“conventional” data), DMSP satellite soundings, and TOVS satellite soundings. The experiments consist of series of simulations of 48-h “cases,” or individual 2-day periods of the Contrail Experiment (18–29 September 1995). The experiment variations reflect the datasets assimilated, update cycle, satellite data QC procedure, and reanalysis. The investigations have included analyses of the quality of the retrieved DMSP and TOVS data. Error statistics for differing variables, vertical levels, and forecast times in the experiments have been computed and analyzed, and the main results are as follows.
The quality of the satellite data sought to be used in mesoscale model initializations is a significant concern. For retrieved temperatures, the DMSP and TOVS datasets for the experiment period examined here had similar accuracy. They both averaged an approximately 2.8°C rms difference (rmsd) from the rawinsonde measurements through the troposphere and lower stratosphere, although the DMSP individual rmsd was slightly lower than that of TOVS. For retrieved dewpoint temperatures, both the DMSP and TOVS datasets had much more significant differences. The DMSP data had a vertically averaged Td rmsd of 7.18°C, while the TOVS data fared worse, with a Td rmsd of 8.85°C. In regard to this result, note that, because moisture can be more spatially variable than temperature, there is the possibility of a less-accurate comparison of Td’s from the satellite and ground-based soundings. Nonetheless, the significantly greater rmsd found in the satellite Td’s is a strong-enough signal to indicate an actually poorer dataset. In summary, the results motivate imposing QC on DMSP and TOVS data prior to their assimilation, and mesoscale modelers seeking to use such soundings should do so with caution. The QC and no-QC sensitivity tests performed confirm that imposing a QC step prior to the assimilation of these data will indeed improve simulations.
The benefit to a forecast from nudging assimilation with conventional (standard surface and upper air) observations and with DMSP and TOVS satellite data varies with the levels, variables, and forecast times considered. Overall, however, the added satellite data do consistently improve (at statistically significant levels) both the analyses (initializations) and the forecasts of thermodynamic variables in the MM5. With respect to temperature, the main areas of improvement were the hour-0 analyses throughout the troposphere, the hour-12 to hour-48 forecasts in the lower troposphere (below 700 hPa), and the 24–48-h forecasts in the upper troposphere (300–250 hPa). The latter represents a benefit to contrail forecasting ability. With respect to moisture (Td, RH, and qυ), the main areas of improvement were the hour-0 analyses at levels below 850 hPa and aloft in the 400–300-hPa layer, and the 12–36-h forecasts in the lower troposphere (below 700 hPa). At upper levels, however, no consistent improvements in the nudged moisture forecasts were found, only improvements at isolated levels and times (e.g., the hour-24 forecast at 300 hPa; hour-36 forecast at 400–500 hPa). For the uppermost levels analyzed here (of most concern in contrail forecasting per se), there was no strong improvement in the moisture forecast, given the assimilated satellite data (Td through 300 hPa). Improvements were more consistent at 300 hPa than above and for the DMSP sounding data (0- and 24-h forecasts) in comparison with runs based on TOVS soundings. At 250 hPa, as much degradation as improvement is associated with the use of the satellite data. Thus, even with a QC procedure, the assimilation of satellite soundings does not consistently guarantee better mesoscale moisture simulations in the upper troposphere.
As suggested by, and consistent with, the findings of Lindzen and Fox-Rabinovitz (1989), limitations on the vertical resolution of the retrievals are weaknesses in the soundings and are likely a contributing factor to the mixed success of the assimilation of the DMSP and TOVS data. Note, however, that with the QC employed the maximum error levels for the satellite data used in the core group of experiments were circumscribed. The detriment to the forecast from the data inaccuracies associated with the retrieval algorithms was thus, to an extent, mitigated by the QC procedure.
Within the bounds of QC, the relative quality of the satellite datasets was important in model performance. Thus, although the assimilated TOVS and DMSP sounding values both met the QC tolerances, it was found that assimilating the resulting lower-error DMSP data yielded better MM5 simulations (in terms of skill statistics) than using the lower-accuracy TOVS data. Although not presented here, other tests have been conducted involving the assimilation of temperature and moisture soundings retrieved from GOES (Geostationary Operational Environmental Satellite) sounder brightness temperature datasets. Upon analysis, however, the GOES data were found to be of significantly poorer quality than either the DMSP or TOVS data. For example, the vertically averaged (850–300-hPa layer) rms error in dewpoint temperature was 13°C, as compared with Td rmse of about 7.2° and 8.9°C in the DMSP and TOVS data, respectively. Experiments using the GOES data with QC constitute ongoing work to determine the forecast impacts of this ostensibly poorer dataset.
Reanalyses of the first-guess fields obtained from a 12-h nudging data assimilation period are found to reduce error in both the MM5 hour-0 analyses and the hour-12 forecasts. Thus, to obtain the short-term forecast benefits, the results recommend a reanalysis following the FDDA period. However, with respect to 24-h and longer forecasts, the improvements are marginal and present only at isolated levels.
With respect to the above findings, it should be understood that not only the assimilation method but also the kinds of satellite data retrievals employed are influential. To explore further how to improve temperature and moisture forecasts with MM5 through satellite sounding assimilation, future work seeks to address the inclusion of such datasets through 4-dimensional variational data assimilation (4DVAR). The 4DVAR is much more computationally intensive than FDDA, however, and is still in development within the MM5 system. Note that, whether assimilated through nudging FDDA or through variational techniques, satellite data should prove particularly beneficial in initializing mesoscale models over the data-sparse oceans. Even with the kinds of data errors identified here, satellite data’s inclusion can better define initial horizontal structures over these problem areas. In the future, mesoscale model forecast improvement is expected from enhancements in both satellite data quality and assimilation methodology.
The authors thank the U.S. Air Force Research Laboratory for supporting this work, and Don Norquist of AFWA for assistance, the satellite data, and helpful comments. The authors thank Dr. Ying-Hwa Kuo of NCAR’s Mesoscale and Microscale Meteorology (MMM) Division for discussions and helpful comments. The authors also thank Wei Huang of NCAR’s MMM Division for data preparation, and experiment analysis and coding. This work has been sponsored by the U.S. Air Force Research Laboratory under Contract F49620-96-C-0020.
Corresponding author address: Dr. Jordan G. Powers, Mesoscale and Microscale Meteorology Division, National Center for Atmospheric Research, Boulder, CO 80307.
A condensation trail (contrail) is a visible trail of ice crystals left behind a flying aircraft. It results from condensation attending the reduction in pressure above the wing surfaces or condensation of water vapor in engine exhaust gases.
The deviations from the radiosonde data may be described as differences because error is introduced from the temporal and spatial interpolations of the satellite data and from inaccuracy in the radiosonde measurements themselves.
The poor performance of moisture sensors that was historically seen at upper levels and problems with unrealistic moisture determinations originally motivated this check.