Basic Diagnosis and Prediction of Persistent Contrail Occurrence Using High-Resolution Numerical Weather Analyses/Forecasts and Logistic Regression. Part I: Effects of Random Error

David P. Duda National Institute of Aerospace, Hampton, Virginia

and
Patrick Minnis Science Directorate, NASA Langley Research Center, Hampton, Virginia


Abstract

Straightforward application of the Schmidt–Appleman contrail formation criteria to diagnose persistent contrail occurrence from numerical weather prediction data is hindered by significant bias errors in the upper-tropospheric humidity. Logistic models of contrail occurrence have been proposed to overcome this problem, but basic questions remain about how random measurement error may affect their accuracy. A set of 5000 synthetic contrail observations is created to study the effects of random error in these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System (ARPS) weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy: the percent correct (PC) and the Hanssen–Kuipers discriminant (HKD). To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher (i.e., the forecasts are more skillful) when the climatological frequency of contrail occurrence is used as the critical threshold, whereas the PC scores are higher (i.e., the forecasts are more accurate) when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85% for the prediction of both contrail occurrence and nonoccurrence, although, in practice, larger errors would be anticipated.

Corresponding author address: David P. Duda, NASA Langley Research Center, Mail Stop 420, Hampton, VA 23681-2199. Email: david.p.duda@nasa.gov


1. Introduction

Contrail-induced cloud cover could be a significant factor in regional climate change over the United States (Minnis et al. 2004). As air traffic increases, the potential for globally significant impacts also rises. To better understand and predict these potential climatic effects, it is necessary to develop models that can accurately represent contrail properties based on ambient atmospheric variables, including temperature, relative humidity, and winds.

Several high-resolution numerical weather analyses (NWAs), including the 20-km Rapid Update Cycle (RUC; Benjamin et al. 2004) and the University of Oklahoma Center for Analysis and Prediction of Storms (CAPS) Advanced Regional Prediction System (ARPS; Xue et al. 2003), can provide the temperature, humidity, and wind information necessary to diagnose contrail formation and persistence at time and space scales close to those of observed contrails. One outstanding problem that must be addressed to achieve a realistic simulation of contrails is the uncertainty in upper-tropospheric humidity (UTH) in numerical weather analyses. Current numerical weather analyses tend to underestimate UTH because of dry biases in the balloon soundings used to construct the analyses (e.g., Minnis et al. 2005). Numerical weather prediction (NWP) models are usually built for the prediction of storms and precipitation, and the accurate prediction of UTH is of secondary importance. This underestimation of humidity makes the straightforward calculation of contrail formation via the classical Schmidt–Appleman (Schumann 1996) thermodynamic criteria difficult, at best. In addition, numerical weather models are modified periodically, leading to changes in the way meteorological variables are computed in the model. The contrail forecast model therefore must also be modified to reflect these changes but in an objective and consistent manner. An additional problem in using numerical weather analyses is that, although their humidity fields appear to correlate with the location of persistent contrail coverage, the agreement is not exact. Nevertheless, there is some relationship between the structure of the NWA humidity fields and the longevity, spreading rate, and optical depth of the observed contrails. The results from previous studies (e.g., Duda et al. 2004) show that the thickest, longest-lasting trails tend to occur in the moistest areas of the NWA.

To deal with these problems, weather forecasters have used statistically processed numerical weather model data to make probabilistic forecasts of weather variables for many years. One of the earliest models reported in the literature was developed by Lund (1955), and the model output statistics (MOS) method (Glahn and Lowry 1972) provided some of the first widely used probabilistic forecasts developed from numerical weather forecasts. By using a statistical technique such as logistic regression, forecasts of the occurrence or nonoccurrence of a weather-related event can be derived from the meteorological analyses and forecasts provided by operational NWP models. Assuming that the NWP models assimilate data consistently, logistic regression can obtain relationships between contrail occurrence and meteorological variables without requiring error-free data (which are necessary for the Schmidt–Appleman criteria). Logistic regression techniques also provide an objective method to deal with any necessary changes resulting from the reformulation of the NWP model.

Probabilistic forecasting has already been applied to the contrail formation problem. Travis et al. (1997) used a combination of rawinsonde temperature and Geostationary Operational Environmental Satellite (GOES) 6.7-μm water vapor absorption data to develop a logistic model to predict the occurrence of widespread persistent contrail coverage. Jackson et al. (2001) created a statistical contrail prediction model using surface observations and rawinsonde measurements of temperature, humidity, and wind.

Despite the success of these probabilistic forecast models, some questions remain about the usefulness of logistic models. Most importantly, neither study attempted to determine the potential impacts of random measurement error on the quality of the forecasts. In this paper, we assess the ability of logistic models to provide a valuable and accurate diagnosis/prediction of persistent contrail occurrence via numerical weather models under typical random errors expected in meteorological measurements.

The next section briefly reviews classical contrail formation theory and its limitations, whereas section 3 introduces the logistic regression technique used to create the probabilistic model. A set of probabilistic persistent contrail occurrence forecasts is then created from examples of synthetic meteorological data based on operational numerical weather analyses, and the effects of random error in the meteorological variables are studied in section 4. The final section briefly summarizes and discusses the results.

2. Brief overview of contrail formation theory

Many contrail forecasting techniques rely on the Schmidt–Appleman theory to determine the meteorological conditions necessary for persistent contrail formation. This theory is described in detail by Schumann (1996); only a brief description is provided here.

The Schmidt–Appleman theory computes a theoretical critical temperature Tc at which the mixture of aircraft engine exhaust and the ambient air reaches saturation with respect to water. The critical temperature is a function of the ambient temperature and the fuel combustion efficiency of the aircraft. The Schmidt–Appleman theory assumes that the aircraft exhaust and ambient air mix adiabatically and isobarically. If the heat and moisture within this aircraft plume mix similarly, then the mixing can be described on a vapor pressure versus temperature diagram as a straight line. The slope of this mixing line is determined by the fuel combustion efficiency of the aircraft. Using this mixing line, Tc can be found either graphically (Appleman 1953) or numerically (Schrader 1997) by matching the slope of the line with the derivative of the saturation vapor pressure curve with respect to temperature on the vapor pressure–temperature diagram. If the ambient vapor pressure is greater than or equal to the saturation vapor pressure with respect to ice, a persistent contrail will form for temperatures less than or equal to the points along the appropriate mixing line. Therefore, for a constant aircraft propulsion efficiency, persistent contrail formation at a particular pressure level is ostensibly determined by the ambient temperature and humidity only. In the context of an operational contrail forecast where the resolutions of the temperature and humidity data are on the order of tens to hundreds of kilometers, temperature and humidity are not precisely known. To determine the occurrence or nonoccurrence of persistent contrails from the Schmidt–Appleman theory, accurate and consistent meteorological data are required. This requirement limits the accuracy of contrail prediction models based strictly on the Schmidt–Appleman criteria. 
Meteorological data are subject to bias and random measurement errors that must be corrected before the Schmidt–Appleman theory can be applied successfully.

Another factor complicating the prediction of persistent contrail occurrence is that other variables (including vertical velocity and the atmospheric lapse rate) may affect the formation and the development of persistent contrails. Duda et al. (2009) matched several months of contrail coverage statistics derived from surface and satellite observations to a number of meteorological variables (e.g., upper-tropospheric humidity, vertical velocity, wind shear, and atmospheric stability) in two operational numerical weather analyses. The relationships between contrail occurrence and the NWA-derived statistics were analyzed to determine under which atmospheric conditions persistent contrail formation is favored within NWAs. Humidity is the most important factor determining whether contrails are short-lived or persistent, and persistent spreading contrails are more likely to appear when vertical velocities are positive. Carleton et al. (2008) also note the relationship between persistent contrail occurrence, humidity, and rising motion in the upper troposphere. Because the Schmidt–Appleman theory only deals with the formation of contrails and not the development of persistent contrails, these factors are not considered in models based on the Schmidt–Appleman criteria.

To overcome these limitations, probabilistic models using logistic regression have been developed. Not only can logistic models include an arbitrary number of atmospheric variables related to the occurrence of persistent contrails, the logistic model was considered in this study because it can handle the effects of a consistent, systematic bias error effectively. For example, if all relative humidity measurements used to create a logistic model of persistent contrail occurrence were reduced in magnitude by 15%, the probabilistic model developed from the modified data would be as accurate as the model developed from the original data. It is not as clear, however, how random error would impact the logistic model. In the next section, we develop a test model using synthetic meteorological data to determine how much random error affects the ability of logistic models to forecast persistent contrail occurrence.
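The insensitivity to a consistent bias can be illustrated with a minimal sketch. If every value of a predictor is multiplied by a constant factor, dividing the fitted slope by that factor reproduces exactly the same probabilities, so a maximum likelihood fit to the biased data can reach the same quality of fit. The coefficient values below are hypothetical and chosen only for illustration; they are not taken from the models in this study.

```python
import math


def logistic_probability(beta0, beta1, x):
    """Probability from a one-predictor logistic model."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))


# Hypothetical coefficients for a model fit to unbiased RHI (in percent).
BETA0, BETA1 = -20.0, 0.2
BIAS_FACTOR = 0.85  # every RHI value reduced in magnitude by 15%


def rescaled_slope(beta1, factor):
    """Slope giving the same fit when the predictor is scaled by `factor`."""
    return beta1 / factor


rhi_values = [60.0, 90.0, 100.0, 110.0, 125.0]

# Probabilities from the original model and unbiased data.
p_original = [logistic_probability(BETA0, BETA1, r) for r in rhi_values]

# Probabilities from the rescaled model applied to the biased data.
p_biased = [
    logistic_probability(BETA0, rescaled_slope(BETA1, BIAS_FACTOR), BIAS_FACTOR * r)
    for r in rhi_values
]
```

The two lists agree to within floating-point rounding, which is why a multiplicative bias alone does not degrade the logistic model. Random error has no such compensating rescaling.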

3. Development of logistic models using synthetic data

Logistic models are an effective method to build probabilistic forecasts. Unlike the Schmidt–Appleman criteria, logistic models are not affected by a consistent temperature or humidity bias in the observations used to develop them. We will examine a logistic model developed using synthetic meteorological data with perfectly known random variances, and we will use this model to estimate the effects of random error in the NWAs on logistic models.

a. Statistical technique

Logistic regression (Hosmer and Lemeshow 1989) can be used to create a probabilistic estimate of persistent contrail formation. Logistic regression techniques are commonly used where the predictand, such as in this case, is a dichotomous (yes/no) variable. Although multiple linear regressions can also be used to make probabilistic forecasts (e.g., Glahn and Lowry 1972), logistic regression offers two advantages over linear regression: in logistic regression the forecast values cannot fall outside of the 0–1 probability range, and each predictor can be fit in a nonlinear way to the predictand. The logistic model assumes the following fit:
$$P = \frac{1}{1 + \exp[-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p)]} \qquad (1)$$
where P is the predictand (probability of persistent contrail occurrence) and βi (for i = 1, … , p) is the set of coefficients used to fit the predictors xi to the model. All predictors used in this study are based on meteorological quantities in the upper troposphere that are assumed to be related physically to the formation of spreading, persistent contrails. Initially, we consider two variables that come directly from the Schmidt–Appleman theory: humidity and temperature. Another variable, vertical velocity, will also be considered for the purpose of examining how the addition of other factors might affect the accuracy of the logistic model.
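A minimal sketch of evaluating the logistic form described above is given here. It assumes the conventional layout in which the first coefficient is an intercept and the remaining coefficients multiply the predictors; the function name and the example coefficient values are hypothetical.

```python
import math


def contrail_probability(coefficients, predictors):
    """Evaluate a logistic model.

    coefficients: [intercept, beta_1, ..., beta_p]
    predictors:   [x_1, ..., x_p]
    """
    intercept, *slopes = coefficients
    if len(slopes) != len(predictors):
        raise ValueError("need exactly one slope per predictor")
    z = intercept + sum(b * x for b, x in zip(slopes, predictors))
    # Numerically stable evaluation of 1 / (1 + exp(-z)).
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical coefficients for two predictors (RHI in percent, TMP in K).
example_p = contrail_probability([10.0, 0.15, -0.11], [110.0, 220.0])
```

Whatever the predictor values, the result stays within the 0–1 range, which is the first advantage over linear regression noted above.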

The maximum likelihood method was used to estimate the unknown coefficients βi and to fit the logistic regression model to the data. The chi-square statistic χ² was used to assess the goodness of fit of each logistic model to the meteorological data. To reduce the number of predictors to an optimal number, a stepwise regression technique was used. In each step of the technique, a new predictor is added to the logistic model and the chi-square statistic is compared with that of the previous model. The new predictor that produces the largest improvement in model fit (i.e., the largest increase in χ²) is added to the model. To avoid overfitting of the model, the stepwise regression technique is allowed to add predictors to the model until the test for statistical significance reaches a significance level (i.e., p value) of 0.05. The set of predictors with the highest overall chi-square statistic is selected as the best group of predictors for each model.
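The comparison made at each stepwise step can be sketched as follows. The sketch assumes that the chi-square statistic is the likelihood-ratio (model) chi-square, which the text does not state explicitly, and that one predictor is added per step, so the 0.05 significance level corresponds to the standard critical value of 3.841 for one degree of freedom. The function names are hypothetical.

```python
import math


def log_likelihood(probabilities, outcomes):
    """Bernoulli log-likelihood of 0/1 outcomes given model probabilities."""
    total = 0.0
    for p, y in zip(probabilities, outcomes):
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep the logarithms finite
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def chi_square_improvement(ll_smaller_model, ll_larger_model):
    """Likelihood-ratio chi-square gained by the larger (nested) model."""
    return 2.0 * (ll_larger_model - ll_smaller_model)


# Chi-square critical value for 1 degree of freedom at p = 0.05.
CRITICAL_CHI2_1DF_P05 = 3.841


def should_add_predictor(ll_smaller_model, ll_larger_model):
    """True when the added predictor is significant at the 0.05 level."""
    gain = chi_square_improvement(ll_smaller_model, ll_larger_model)
    return gain > CRITICAL_CHI2_1DF_P05


# Toy data: a constant-probability model versus a sharper model.
toy_outcomes = [1, 0, 1, 0]
ll_null = log_likelihood([0.5, 0.5, 0.5, 0.5], toy_outcomes)
ll_fit = log_likelihood([0.9, 0.1, 0.9, 0.1], toy_outcomes)
```

In a full stepwise procedure this test would be repeated for every candidate predictor, and the candidate with the largest gain would be retained.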

b. Sample meteorological data

To build the test model, atmospheric profiles of temperature, humidity, and vertical velocity were derived from the 27-km horizontal resolution ARPS in 25-hPa intervals from 400 to 150 hPa. The ARPS data were obtained from the hourly conterminous United States (CONUS) domain analyses. Because of computing limitations, the ARPS data were stored at approximately 1° × 1° resolution. Atmospheric humidity expressed in the form of relative humidity with respect to ice was computed from the ARPS fields of potential temperature and specific humidity.

c. Synthetic meteorological data

To test the logistic regression technique, two simple sets of synthetic meteorological data and contrail observations were created based on the ARPS meteorological datasets and on the Schmidt–Appleman theory. First, distributions of ARPS 250-hPa relative humidity with respect to ice (denoted RHI), temperature (denoted TMP), and vertical velocity (denoted VV) data were created by selecting 176 days of data uniformly throughout 2 yr (April 2004–March 2006) of ARPS hourly analyses. Each distribution contains over 7.5 million individual data points throughout the ARPS model domain across the CONUS and surrounding oceans. These distributions are represented as solid lines in the graphs in Fig. 1. The relative humidity with respect to ice is distributed more or less uniformly.

The ARPS humidity data usually originate from the RUC modeling system, and the data presented here were analyzed using the nonvariational Bratseth (1986) successive correction scheme. The temperature distribution is somewhat skewed because of the changing temperature patterns throughout the year, but during short time periods (one or two days) the ARPS 250-hPa temperature is almost normally distributed. Figure 1 shows the ARPS temperature distribution for 4–5 February 2006 as a dotted line. The vertical velocity distribution is nearly symmetric about 0 cm s−1 and can be approximated by a logistic distribution. The probability density of the logistic distribution can be written as
$$f(x) = \frac{\exp[-(x - \mu)/s]}{s\,\{1 + \exp[-(x - \mu)/s]\}^{2}} \qquad (2)$$
where μ is the mean of the distribution and s is a shape factor determining the width of the distribution.

Next, two sets of synthetic 250-hPa meteorological data were created to approximate the ARPS data. For the humidity data, a random uniform distribution from 5% to 125% was used. This humidity distribution is similar in form to the distribution used by Buehler and Courcoux (2003), which is based on radiosonde data. The humidity distribution was made slightly moister than the ARPS distribution to offset the suspected dry bias in the ARPS model (and to increase the overall persistent contrail occurrence rate), but this change in the distribution is not expected to affect the overall conclusions of this study. The contrail occurrence rate in the synthetic dataset is approximately 16%, which is similar to the occurrence rate derived from surface observations (Minnis et al. 2003). The synthetic temperature distribution is a random normal distribution with a mean of 223 K and a standard deviation of 5 K. The synthetic distribution roughly approximates a typical ARPS temperature distribution during January. The vertical velocity distribution was approximated by using a random logistic distribution with μ = 0 cm s−1 and s = 1.25 cm s−1. For both synthetic datasets, a total of 5000 simulated 250-hPa observations were produced for each of the three meteorological variables, and the resulting distributions for the first dataset are shown in Fig. 1 as dashed lines. The first dataset is used to build the logistic models, whereas the second dataset is used to evaluate the predictive skill of the models. Although the two synthetic datasets have the same general meteorological statistics, they are not identical.
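The construction of the two synthetic datasets can be sketched with the distribution parameters quoted above. The logistic variate is drawn by inverting the logistic cumulative distribution, x = μ + s ln[u/(1 − u)] for uniform u. The function names, the seeds, and the assumption that the three variables are drawn independently are illustrative choices, not details given in the text.

```python
import math
import random


def sample_logistic(rng, mu, s):
    """Draw one logistic variate by inverting the logistic CDF."""
    u = rng.random()
    while u <= 0.0:  # avoid log(0); rng.random() never returns 1.0
        u = rng.random()
    return mu + s * math.log(u / (1.0 - u))


def make_synthetic_observations(n, seed):
    """Return n tuples (RHI in %, TMP in K, VV in cm/s) at 250 hPa."""
    rng = random.Random(seed)
    observations = []
    for _ in range(n):
        rhi = rng.uniform(5.0, 125.0)          # uniform, 5% to 125%
        tmp = rng.gauss(223.0, 5.0)            # normal, mean 223 K, sd 5 K
        vv = sample_logistic(rng, 0.0, 1.25)   # logistic, mu = 0, s = 1.25 cm/s
        observations.append((rhi, tmp, vv))
    return observations


# One dataset to build the models and a second to verify them.
build_set = make_synthetic_observations(5000, seed=1)
verify_set = make_synthetic_observations(5000, seed=2)
```

Using different seeds gives two datasets with the same general statistics that are nonetheless not identical, as required for independent verification.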

Finally, persistent contrail occurrences for two scenarios (A and B) were determined for each simulated observation using two sets of contrail formation criteria. In scenario A, persistent contrail formation occurred when the relative humidity with respect to ice was 100% or greater and the temperature was less than or equal to 226.6 K, which is the critical temperature for contrail formation at 250 hPa when RHI = 100% and the aircraft fuel combustion efficiency is 0.4. Scenario A represents persistent contrail formation simply in terms of the Schmidt–Appleman contrail formation theory and assumes that only temperature and humidity influence contrail formation. Because it is expected that other meteorological factors affect the development of persistent contrails, scenario B allows for the effects of vertical velocity on contrail occurrence. Vertical velocity was selected because it is known to affect the occurrence of persistent contrails. Duda et al. (2009) showed that surface observations of contrail occurrence appeared to be more likely in regions with rising motion in the upper troposphere, and Duda et al. (2004) reported that sinking motions of 1.5 cm s−1 in the upper troposphere correlated with the suppression of persistent contrail occurrence in satellite imagery. In addition, Carleton et al. (2008) showed that contrail outbreaks are associated with regions where the vertical motions in the upper troposphere are changing from subsiding to ascending. In scenario B, an adjusted relative humidity is computed in percent from
$$\mathrm{RHI}_{\mathrm{adj}} = \mathrm{RHI} + 5 \times \mathrm{VV} \qquad (3)$$

Contrail occurrence is then determined using the same temperature and humidity criteria as in scenario A (of course substituting RHIadj for RHI). Thus, rising motion would increase the likelihood of contrail occurrence, and sinking motion would decrease the likelihood of occurrence. Although this formula is arbitrary and was developed solely to demonstrate the possible effects of vertical velocity in contrail forecasting, it is well known that rising vertical motion can directly affect humidity by adiabatic cooling. From elementary thermodynamic theory (Rogers 1979), in a well-mixed layer, the change in humidity with height when RH = 70% and T = 225 K is 6.6% per 100 m. Thus, lifting a parcel 76 m would produce a 5% increase in humidity and would require approximately 2 h for a vertical velocity of 1 cm s−1.
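The two occurrence criteria and the lifting-time estimate above can be sketched as follows. The adjustment RHI + 5 × VV is taken from the definition of the R5V predictor in section 3d, with VV assumed to be in cm s−1 and positive for rising motion; the function names are hypothetical.

```python
T_CRITICAL_K = 226.6      # critical temperature at 250 hPa for RHI = 100%
RHI_THRESHOLD_PCT = 100.0  # ice saturation
VV_WEIGHT = 5.0            # percent of RHI per cm/s of vertical velocity


def contrail_scenario_a(rhi, tmp):
    """Scenario A: temperature and humidity criteria only."""
    return rhi >= RHI_THRESHOLD_PCT and tmp <= T_CRITICAL_K


def adjusted_rhi(rhi, vv):
    """Vertical velocity-adjusted humidity used in scenario B."""
    return rhi + VV_WEIGHT * vv


def contrail_scenario_b(rhi, tmp, vv):
    """Scenario B: scenario A criteria applied to the adjusted humidity."""
    return contrail_scenario_a(adjusted_rhi(rhi, vv), tmp)


def lift_time_hours(rh_increase_pct, rh_gradient_pct_per_100m, w_cm_per_s):
    """Time for steady ascent to raise humidity by the requested amount."""
    lift_m = 100.0 * rh_increase_pct / rh_gradient_pct_per_100m
    return lift_m / (w_cm_per_s / 100.0) / 3600.0


# A 5% increase at 6.6% per 100 m needs about 76 m of lift,
# which takes roughly 2 h at 1 cm/s.
hours_for_5_percent = lift_time_hours(5.0, 6.6, 1.0)
```

With these definitions, an observation with RHI = 96% and rising motion of 1 cm s−1 satisfies the scenario B humidity criterion, whereas one with RHI = 102% and sinking motion of 1 cm s−1 does not.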

d. Predictors and skill scores

In addition to the three synthetic data variables, 19 other predictors were selected to develop the test case contrail prediction models (Table 1). Five additional predictors are uniformly distributed random variables that have no relation to the predictand and four more are a product of a synthetic data variable and an unrelated random variable. These variables are included to test the ability of the regression method to accept or reject data that are known to be unrelated to the predictand. Another six predictors are the products of one or more of the three synthetic data variables, whereas the remaining four variables are more complicated combinations of vertical velocity and another synthetic meteorological variable. In particular, variable R5V (RHI + 5 × VV) reflects the adjusted RHI used in scenario B.

Two groups of statistical contrail models (scenarios A and B) then were derived from the first database of 5000 synthetic contrail observations and the 19 selected predictors. For simplicity, both sets of models are fit to all 5000 observations, and the results are verified using the second set of 5000 observations. To determine the accuracy of the contrail models, two statistical measures were employed. Both of these measures have been used to quantify the accuracy of previous categorical (i.e., yes/no and occurrence/nonoccurrence) contrail formation forecasts (Jackson et al. 2001; Walters et al. 2000). The contrail formation forecasts are separated into four categories based on the forecast and its outcome: a is the number of cases where persistent contrail formation is forecast and persistent contrails are observed (hits); b is the number of cases where contrails are predicted but no contrails are observed (false alarms); c is the number of cases where contrails are not forecast but contrails are observed (misses); and d is the number of cases where contrails are not forecast and no contrails are observed (correct rejections). The first measure is the percent correct (PC), which is calculated as (a + d)/(a + b + c + d). The percent correct represents the percentage of forecasts in which the method correctly predicted the observed event. The second measure is known as the Hanssen–Kuipers discriminant (HKD), or the true skill statistic (Wilks 1995). The HKD is calculated as (ad − bc)/[(a + c)(b + d)]. This measure of forecasting skill can also be interpreted as (accuracy for events) + (accuracy for nonevents) − 1, and it measures the skill of the “yes forecasts” and “no forecasts” of contrail occurrence equally, regardless of the relative numbers of each forecast. Although in cases where the forecast event is rare (such as contrail occurrence) HKD might be viewed as unduly rewarding yes forecasts, Gandin and Murphy (1992) show that HKD is the only equitable skill score for a two-event (i.e., yes or no) forecast. Equitable skill scores require that constant forecasts of a particular event are not favored over constant forecasts of other events (in this case, the no forecast should not be favored because persistent contrails rarely form, and thus a no forecast would most likely be the correct forecast).
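Both skill measures follow directly from the four contingency-table counts. The sketch below uses hypothetical counts (not results from this study) and also evaluates a constant "no" forecast to illustrate why HKD is regarded as equitable.

```python
def percent_correct(a, b, c, d):
    """PC = (hits + correct rejections) / all forecasts."""
    return (a + d) / (a + b + c + d)


def hanssen_kuipers(a, b, c, d):
    """HKD = (ad - bc) / [(a + c)(b + d)]."""
    return (a * d - b * c) / ((a + c) * (b + d))


def event_and_nonevent_accuracy(a, b, c, d):
    """Fraction of observed events and of observed nonevents forecast correctly."""
    return a / (a + c), d / (b + d)


# Hypothetical table: hits, false alarms, misses, correct rejections.
a, b, c, d = 700, 600, 100, 3600
pc = percent_correct(a, b, c, d)
hkd = hanssen_kuipers(a, b, c, d)
acc_yes, acc_no = event_and_nonevent_accuracy(a, b, c, d)

# A forecast that always says "no" for the same 800 events and 4200 nonevents.
pc_always_no = percent_correct(0, 0, 800, 4200)
hkd_always_no = hanssen_kuipers(0, 0, 800, 4200)
```

For the hypothetical table, PC is 0.86 and HKD equals the sum of the two accuracies minus 1 (about 0.73). The constant "no" forecast still scores a PC of 0.84 because the event is rare, but its HKD is 0.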

The logistic regression provides a probability of occurrence for an event between 0 and 1, but the skill scores rely on a dichotomous yes/no (persistent contrail occurs/does not occur) choice. What is the appropriate probability threshold to discriminate between yes and no? Jackson et al. (2001) predicted contrails when the probability was 0.5 or more and predicted no contrail when the probability was less than 0.5. Gandin and Murphy (1992) argue that the critical threshold for translating probabilistic forecasts into categorical forecasts in the two-event situation is the climatological mean probability of the event. In the case of Jackson et al. (2001), the climatological mean probability of contrail occurrence (either persistent or nonpersistent) was near 0.5 (0.64), but the occurrence of persistent contrails is a relatively rare event and the choice of threshold is pertinent. In this study, we test the effects of both thresholds on contrail forecast model accuracy.
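The effect of the two critical thresholds can be sketched as follows. The probabilities and outcomes are hypothetical, and a forecast is counted as "yes" when the probability equals or exceeds the threshold, following the convention attributed above to Jackson et al. (2001).

```python
def climatological_frequency(outcomes):
    """Fraction of cases in which the event was observed."""
    return sum(1 for observed in outcomes if observed) / len(outcomes)


def to_yes_no(probabilities, threshold):
    """Convert probabilities to categorical forecasts (True means yes)."""
    return [p >= threshold for p in probabilities]


# Hypothetical sample in which the event is rare (1 case in 6).
observed = [True, False, False, False, False, False]
model_probabilities = [0.45, 0.30, 0.10, 0.05, 0.02, 0.60]

climatology = climatological_frequency(observed)
yes_at_half = to_yes_no(model_probabilities, 0.5)
yes_at_climatology = to_yes_no(model_probabilities, climatology)
```

In this toy sample the 0.5 threshold yields one yes forecast and misses the observed event, whereas the climatological threshold yields three yes forecasts, capturing the event at the cost of an additional false alarm.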

e. Random error

As mentioned earlier, the logistic model was considered in this study because it can handle the effects of a consistent systematic bias error effectively. The effects of random error on the model, however, are not as clear. To study the impact of random error on the logistic model, various levels of normally distributed random error were added to both databases of 5000 synthetic observations. Table 2 presents the different random errors used in the simulations. The random errors are expressed in terms of the standard deviation of the added random error. Each of the contrail models developed in this section is named using the following convention. Models developed using the climatological mean probability as the critical threshold are designated as A1x or B1x, whereas models using 0.5 as the threshold are called A2x or B2x, where x is the random error label described in Table 2 and A and B refer to the contrail formation criteria used to determine contrail occurrence. Note that although each logistic model is created using perturbed meteorological data (except cases A1a, A2a, B1a, and B2a), the forecasts of contrail occurrence from those models are always compared to the same set of contrail occurrences that is based on the original, unperturbed data.
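The perturbation step and the naming convention can be sketched as below. The error standard deviations in the example are hypothetical placeholders rather than the Table 2 values, and the function names are illustrative. The key point carried into the code is that only the predictors are perturbed; the contrail occurrences used for verification remain those computed from the unperturbed data.

```python
import random


def add_random_error(observations, sd_rhi, sd_tmp, sd_vv, seed):
    """Add zero-mean normal errors to (RHI, TMP, VV) tuples."""
    rng = random.Random(seed)
    return [
        (
            rhi + rng.gauss(0.0, sd_rhi),
            tmp + rng.gauss(0.0, sd_tmp),
            vv + rng.gauss(0.0, sd_vv),
        )
        for rhi, tmp, vv in observations
    ]


THRESHOLD_CODES = {"climatological": "1", "0.5": "2"}


def model_name(scenario, threshold_kind, error_label):
    """Build names such as 'A1a' or 'B2m' from the convention in the text."""
    return f"{scenario}{THRESHOLD_CODES[threshold_kind]}{error_label}"


# Hypothetical unperturbed observations and error magnitudes.
clean = [(110.0, 220.0, 0.5), (80.0, 225.0, -1.0), (101.0, 228.0, 0.0)]
noisy = add_random_error(clean, sd_rhi=10.0, sd_tmp=1.0, sd_vv=1.0, seed=42)

# Occurrences are always derived from `clean`, never from `noisy`.
unchanged = add_random_error(clean, 0.0, 0.0, 0.0, seed=42)
```

Setting all three standard deviations to zero returns the original values, which corresponds to the error-free cases labeled a.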

Although the random errors chosen for this study are intended to demonstrate the effect of the error on the logistic model, the actual expected magnitude of the meteorological errors is not certain. The values chosen for this study are based on previous estimates. Walters et al. (2000) estimate uncertainties in temperature of ±2 K, as a result of measurement errors by radiosonde and spatial and temporal differences between the radiosonde measurement and the contrail observation, and relative humidity errors of −7.5%, as a result of a systematic bias in radiosonde measurements. Gettelman et al. (2006) report a comparison of Atmospheric Infrared Sounder (AIRS) data with in situ aircraft measurements of temperature and relative humidity. The standard deviation of the differences between AIRS and in situ data was 1.5 K or less for temperature and 9% for relative humidities at pressure levels below 250 hPa. The root-mean-square differences between upper-tropospheric temperature and relative humidity computed in the RUC analyses and radiosonde observations are 0.5 K and 8%, respectively, at 300 hPa for the period between 11 September and 31 December 2002 (Benjamin et al. 2004). Mapes et al. (2003) studied random errors in tropical rawinsonde-array budgets and determined that the unresolved variability in such arrays is 0.5 K for temperature measurements and 15% for relative humidity measurements in the mid–upper troposphere. The random error in computed vertical velocity resulting from errors in the vertical integration of wind divergence was estimated by Mapes et al. (2003) to be on the order of 4 × 10−4 hPa s−1, or approximately 1 cm s−1, based on typical meteorological conditions at 250 hPa. We expect that the values of random error in Table 2 are at least representative of the random errors likely to be present in the RUC/ARPS data. Although the Mapes et al. (2003) study is based on tropical soundings, which probably have less variability than midlatitude soundings, where most persistent contrails occur, the RUC/ARPS models benefit from finer spatial and temporal resolution than rawinsonde arrays.
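The conversion of the vertical velocity error from pressure coordinates to height coordinates can be checked with the hydrostatic approximation w ≈ −ω/(ρg), with the air density obtained from the ideal gas law. The sketch assumes a temperature of 223 K at 250 hPa (the mean of the synthetic temperature distribution) and dry air; these "typical conditions" are assumptions made here, not values stated in the text.

```python
R_DRY_AIR = 287.05   # specific gas constant of dry air, J kg^-1 K^-1
GRAVITY = 9.80665    # standard gravity, m s^-2


def omega_to_w_cm_per_s(omega_hpa_per_s, pressure_hpa, temperature_k):
    """Hydrostatic conversion of pressure velocity to vertical velocity."""
    density = pressure_hpa * 100.0 / (R_DRY_AIR * temperature_k)  # kg m^-3
    w_m_per_s = -(omega_hpa_per_s * 100.0) / (density * GRAVITY)
    return w_m_per_s * 100.0


# Error magnitude of 4e-4 hPa/s at 250 hPa and an assumed 223 K.
w_error_cm_per_s = abs(omega_to_w_cm_per_s(4.0e-4, 250.0, 223.0))
```

Under these assumptions the result is close to 1.04 cm s−1, consistent with the value of approximately 1 cm s−1 quoted above.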

4. Results from synthetic dataset

The stepwise regression technique was applied to the original 5000 synthetic observations from the first database and to the two sets of 12 perturbed observations containing the various levels of random error described in Table 2. Each contrail formation scenario therefore produced 13 logistic models, and probability forecasts for each model were converted into two sets of yes/no persistent contrail occurrence forecasts based on the two critical probability thresholds. The forecasts were evaluated using the second dataset of synthetic observations, and the skill scores computed for each contrail formation scenario are presented and discussed in the next two subsections.

a. Scenario A

The temperature and relative humidity criteria described in section 3c are the only variables that determine persistent contrail occurrence for scenario A. Although the stepwise regression technique would sometimes produce more than one (equally well-fitted) set of predictors for each of the 13 datasets and the chosen groups of predictors sometimes varied between datasets, one group of predictors was most commonly chosen among the 13 datasets. For scenario A, the preferred set of predictors was RHI, TMP, TMP2, and RT. Table 3 presents the skill scores for each of the 13 datasets and both sets of critical probability thresholds for forecasts based on these four predictors. The climatological occurrence rate is simply the overall occurrence rate of persistent contrails determined from the contrail formation criteria in scenario A applied to the first group of 5000 synthetic observations, and it equals 0.1598. A comparison of scenarios A1 and A2 shows that the choice of 0.5 as the critical probability threshold increases PC but decreases HKD, because the occurrence of contrail persistence is relatively rare. The use of the critical probability threshold of 0.5 increases the number of no forecasts, which is the more likely event. Conversely, the HKD decreases because it tends to reward the prediction of rare events more than common events. Using the climatological occurrence rate tends to increase the number of yes forecasts and leads to an increase in the number of false alarms, but it also decreases the number of misses.

The accuracy of the logistic models remains high regardless of the random error added to the synthetic meteorological data. Even in case m, the HKD for scenario A1 is 0.767 and the accuracies of the yes and no forecasts are 89% and 85%, respectively. Random errors in relative humidity tend to affect the accuracy of the scenario A logistic models the most. Random errors in vertical velocity have no effect on model accuracy, because vertical velocity plays no role in the scenario A occurrence criteria.

b. Scenario B

In scenario B, persistent contrail occurrence is controlled by temperature and a vertical velocity–adjusted relative humidity. Because the determination of contrail occurrence is more complicated in scenario B, the accuracy of the logistic models is slightly lower overall than in scenario A. The set of predictors most commonly chosen by the logistic regression for scenario B is TMP, TMP2, RT, RV, TV, T5V, and R10V. The skill scores for each of the models derived from these seven predictors are presented in Table 3. The PC values range from 0.970 for the error-free case B1a to 0.844 for case B1m, which has the largest random errors. The random errors in relative humidity tend to have the largest impact on the accuracy of the forecast models, and temperature errors have the smallest effect.

A comparison of the skill scores from the four-predictor models with the skill scores from the seven-predictor models shows that, for scenario A, the results are nearly identical. For scenario B, the seven-predictor models have about 5% better (absolute) accuracy than the four-predictor models when the random errors are small, and the models have nearly the same accuracy for the cases with the largest random errors. The influence of vertical velocity on the determination of contrail occurrence in this simulation is therefore minor, although the actual effects of vertical velocity on persistent contrail occurrence are not well known.

Although not shown here, other sets of predictors were sometimes chosen by the stepwise regression technique as the best model. The skill scores from those predictor sets were similar to the presented results. Not surprisingly, the logistic regression method nearly always chose some combination of relative humidity, temperature, and vertical velocity (for scenario B) as predictors. Rarely, one of the random variables was chosen as one of the predictors but only for the cases with the largest random errors. Thus, the logistic model was able to distinguish the proper predictors from a group of random variables, but sometimes variables such as R10V with subtle differences from the actual contrail occurrence selector were chosen ahead of the true selector (R5V).

The results from this test case based on the synthetic meteorological data demonstrate that the logistic method can develop highly accurate contrail prediction models when the meteorological data contain the expected levels of random error. We note, however, that these results represent a best-case scenario for the logistic regression technique. The factors that affect contrail occurrence are few and well known, and all are included in the set of potential predictors. It is implicitly assumed that all of the synthetic observations occur within areas of air traffic, so that persistent contrails will occur if the conditions favor occurrence. Logistic models created using actual meteorological data and contrail occurrence observations are not expected to be as accurate. For a more complete assessment of contrail model accuracy, Duda and Minnis (2009) show examples of logistic models developed from numerical weather model data and from actual contrail observations.
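The overall experiment design (build a model from perturbed synthetic observations, then score it on a second synthetic dataset) can be sketched as below. Everything specific here is an assumption made for illustration: the distributions, the occurrence criteria (RHI of at least 100% and temperature at or below 223 K), the error magnitudes, the use of only two linear predictors, the perturbation of the evaluation data, and the plain gradient-descent fit, which stands in for the stepwise regression used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_dataset(n, rhi_sigma, tmp_sigma):
    """Synthetic observations; all distributions and criteria are hypothetical."""
    rhi = rng.uniform(0.0, 150.0, n)          # RHI (%), uniform
    tmp = rng.normal(220.0, 5.0, n)           # temperature (K), normal
    occurs = (rhi >= 100.0) & (tmp <= 223.0)  # assumed occurrence criteria
    # Add normally distributed random error to the "measured" predictors.
    rhi_obs = rhi + rng.normal(0.0, rhi_sigma, n)
    tmp_obs = tmp + rng.normal(0.0, tmp_sigma, n)
    return np.column_stack([rhi_obs, tmp_obs]), occurs

def design(x, mean, std):
    """Standardize predictors and prepend an intercept column."""
    return np.column_stack([np.ones(len(x)), (x - mean) / std])

def fit_logistic(x, y, steps=3000, lr=0.5):
    """Fit logistic regression by gradient descent on the mean log loss."""
    mean, std = x.mean(axis=0), x.std(axis=0)
    z = design(x, mean, std)
    w = np.zeros(z.shape[1])
    target = y.astype(float)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(z @ w)))
        w -= lr * (z.T @ (p - target)) / len(target)
    return w, mean, std

def predict(model, x):
    w, mean, std = model
    return 1.0 / (1.0 + np.exp(-(design(x, mean, std) @ w)))

def scores(forecast_yes, observed_yes):
    """Return (PC, HKD) from yes/no forecasts and observations."""
    hits = np.sum(forecast_yes & observed_yes)
    false_alarms = np.sum(forecast_yes & ~observed_yes)
    misses = np.sum(~forecast_yes & observed_yes)
    correct_neg = np.sum(~forecast_yes & ~observed_yes)
    pc = (hits + correct_neg) / len(observed_yes)
    hkd = hits / (hits + misses) - false_alarms / (false_alarms + correct_neg)
    return float(pc), float(hkd)

# First dataset builds the model; second, independent dataset evaluates it.
x_train, y_train = make_dataset(5000, rhi_sigma=10.0, tmp_sigma=1.0)
x_test, y_test = make_dataset(5000, rhi_sigma=10.0, tmp_sigma=1.0)

model = fit_logistic(x_train, y_train)
prob = predict(model, x_test)
climatological_rate = y_train.mean()

pc_half, hkd_half = scores(prob >= 0.5, y_test)
pc_clim, hkd_clim = scores(prob >= climatological_rate, y_test)
print(f"threshold 0.5:  PC = {pc_half:.3f}, HKD = {hkd_half:.3f}")
print(f"climatological: PC = {pc_clim:.3f}, HKD = {hkd_clim:.3f}")
```

Changing `rhi_sigma` and `tmp_sigma` gives a rough feel for how random error in each variable degrades the scores, although the numbers from this toy setup are not comparable to those in Table 3.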

5. Summary and concluding remarks

Straightforward application of the contrail formation criteria from the Schmidt–Appleman theory to diagnose persistent contrail occurrence is hindered by significant humidity errors within numerical weather prediction models. Logistic models of contrail occurrence have been proposed to overcome these problems, but basic questions remain about their accuracy. To investigate logistic models, we created sets of 5000 synthetic contrail observations to study the effects of random error in meteorological variables on the development of these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy: the percent correct and the Hanssen–Kuipers discriminant. To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher when the climatological frequency of contrail occurrence is used as the critical threshold, whereas the PC scores are higher when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity derived from comparisons with radiosonde measurements are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85% for the prediction of both contrail occurrence and nonoccurrence. In practice, larger errors would be anticipated because persistent contrails are expected to be influenced by more atmospheric variables (and thus more uncertainty) than those presented in this study.

Some issues concerning the effectiveness of the logistic model are not addressed here and require future study. In the synthetic dataset, both the meteorological data and the occurrence of contrails are known exactly. In practice, the occurrence of contrails is not always known; cloud cover may obscure both surface and satellite observations of contrails, and observations may not be available for all times and locations. Also, aircraft may not fly at all times through some regions where persistent contrails are possible, although this is not expected to be a major problem for this study because most of the CONUS is nearly continuously traveled by jet aircraft throughout the day. The impacts of these factors on the determination of contrail occurrence by logistic models should be quantified.

More work is needed to realize the potential of logistic contrail forecasts. The most direct way to improve the logistic models is to reduce the errors within the meteorological data used to build the models. Meteorological errors directly affect the regressions developed in the logistic model; if the errors are large enough, they may cause the model to choose less pertinent predictors, which would further reduce model accuracy. Meteorological analyses could be improved by using the Atmospheric Infrared Sounder on the Aqua satellite to supplement the temperature and relative humidity data in numerical weather models. Methods to reduce errors in the determination of contrail occurrence could also be pursued. Additional studies are needed to determine if other regionally or temporally averaged variables would increase the accuracy of logistic models based on numerical weather forecasts and if other atmospheric variables may be relevant. Regional and seasonal models of contrail occurrence may help improve the overall performance of this type of persistent contrail prediction model.

Finally, logistic models of contrail occurrence provide an additional advantage that has not been used here. Because logistic models compute a probability of occurrence, they could be useful in global circulation model (GCM) simulations of contrail coverage (Ponater et al. 2002; Marquart et al. 2003) to determine the impact of contrail radiative forcing on global climate. Such models use a simple analytical formula based on relative humidity and cirrus cloud coverage to determine contrail coverage. The logistic models could be easily used within the GCM to determine an appropriate contrail coverage fraction for a region based upon the product of the air traffic and the computed probability. Because the logistic model could be developed by comparing GCM simulations to actual contrail observations, it may provide more accurate simulations of contrail coverage than current methods.
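As a rough illustration of the product described above, the sketch below multiplies an air traffic term by a logistic-model probability. The function name, the assumption that the air traffic term has already been scaled to a potential coverage fraction, and the clipping to the range 0 to 1 are all hypothetical choices made here, not a parameterization from the paper or from the cited GCM studies.

```python
def contrail_coverage_fraction(air_traffic_factor, probability):
    """Hypothetical coverage estimate: air traffic term times occurrence probability.

    air_traffic_factor: assumed to be a nonnegative, dimensionless potential
        coverage per grid cell (an assumption made for this sketch).
    probability: persistent contrail probability from a logistic model.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if air_traffic_factor < 0.0:
        raise ValueError("air_traffic_factor must be nonnegative")
    # Clip so that the result remains a valid coverage fraction.
    return min(1.0, air_traffic_factor * probability)

# Example: a grid cell with a potential coverage of 0.2 and a probability of 0.5.
example_fraction = contrail_coverage_fraction(0.2, 0.5)
print(example_fraction)
```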

Acknowledgments

This material is based upon work supported by the NASA Earth Science Enterprise Radiation Sciences Division, the NASA Modeling, Analysis, and Prediction Program, NASA Contracts NAG1-02044 and NCCI-02043 NIA-2579, and by the National Science Foundation under Grant 0222623.

REFERENCES

  • Appleman, H., 1953: The formation of exhaust condensation trails by jet aircraft. Bull. Amer. Meteor. Soc., 34, 14–20.

  • Benjamin, S. G., and Coauthors, 2004: An hourly assimilation–forecast cycle: The RUC. Mon. Wea. Rev., 132, 495–518.

  • Bratseth, A. M., 1986: Statistical interpolation by means of successive corrections. Tellus, 38A, 439–447.

  • Buehler, S. A., and N. Courcoux, 2003: The impact of temperature errors on perceived humidity supersaturation. Geophys. Res. Lett., 30, 1759. doi:10.1029/2003GL017691.

  • Carleton, A. M., D. J. Travis, K. Master, and S. Vezhapparambu, 2008: Composite atmospheric environments of jet contrail outbreaks for the United States. J. Appl. Meteor. Climatol., 47, 641–667.

  • Duda, D. P., and P. Minnis, 2009: Basic diagnosis and prediction of persistent contrail occurrence using high-resolution numerical weather analyses/forecasts and logistic regression. Part II: Evaluation of sample models. J. Appl. Meteor. Climatol., 48, 1790–1802.

  • Duda, D. P., P. Minnis, L. Nguyen, and R. Palikonda, 2004: A case study of the development of contrail clusters over the Great Lakes. J. Atmos. Sci., 61, 1132–1146.

  • Duda, D. P., R. Palikonda, and P. Minnis, 2009: Relating observations of contrail persistence to numerical weather analysis output. Atmos. Chem. Phys., 9, 1357–1364.

  • Gandin, L. S., and A. H. Murphy, 1992: Equitable skill scores for categorical forecasts. Mon. Wea. Rev., 120, 361–370.

  • Gettelman, A., E. J. Fetzer, A. Eldering, and F. W. Irion, 2006: The global distribution of supersaturation in the upper troposphere from the Atmospheric Infrared Sounder. J. Climate, 19, 6089–6103.

  • Glahn, H. R., and D. A. Lowry, 1972: The use of model output statistics (MOS) in objective weather forecasting. J. Appl. Meteor., 11, 1203–1211.

  • Hosmer, D. W., and S. Lemeshow, 1989: Applied Logistic Regression. John Wiley & Sons, 307 pp.

  • Jackson, A., B. Newton, D. Hahn, and A. Bussey, 2001: Statistical contrail forecasting. J. Appl. Meteor., 40, 269–279.

  • Lund, I. A., 1955: Estimating the probability of a future event from dichotomously classified predictors. Bull. Amer. Meteor. Soc., 36, 325–328.

  • Mapes, B. E., P. E. Ciesielski, and R. H. Johnson, 2003: Sampling errors in rawinsonde-array budgets. J. Atmos. Sci., 60, 2697–2714.

  • Marquart, S., M. Ponater, F. Mager, and R. Sausen, 2003: Future development of contrail cover, optical depth, and radiative forcing: Impacts of increasing air traffic and climate change. J. Climate, 16, 2890–2904.

  • Minnis, P., J. K. Ayers, M. L. Nordeen, and S. P. Weaver, 2003: Contrail frequency over the United States from surface observations. J. Climate, 16, 3447–3462.

  • Minnis, P., J. K. Ayers, R. Palikonda, and D. Phan, 2004: Contrails, cirrus trends, and climate. J. Climate, 17, 1671–1685.

  • Minnis, P., Y. Yi, J. Huang, and K. Ayers, 2005: Relationships between radiosonde and RUC-2 meteorological conditions and cloud occurrence determined from ARM data. J. Geophys. Res., 110, D23204. doi:10.1029/2005JD006005.

  • Ponater, M., S. Marquart, and R. Sausen, 2002: Contrails in a comprehensive global climate model: Parameterization and radiative forcing results. J. Geophys. Res., 107, 4164. doi:10.1029/2001JD000429.

  • Rogers, R. R., 1979: A Short Course in Cloud Physics. 2nd ed. Pergamon Press, 235 pp.

  • Schrader, M. L., 1997: Calculations of aircraft contrail formation critical temperatures. J. Appl. Meteor., 36, 1725–1729.

  • Schumann, U., 1996: On conditions for contrail formation from aircraft exhausts. Meteor. Z., 5, 3–22.

  • Travis, D. J., A. M. Carleton, and S. A. Changnon, 1997: An empirical model to predict widespread occurrences of contrails. J. Appl. Meteor., 36, 1211–1220.

  • Walters, M. K., J. D. Shull, and J. P. Asbury III, 2000: A comparison of exhaust condensation trail forecast algorithms at low relative humidity. J. Appl. Meteor., 39, 80–91.

  • Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences. Academic Press, 467 pp.

  • Xue, M., D. Wang, J. Gao, K. Brewster, and K. K. Droegemeier, 2003: The Advanced Regional Prediction System (ARPS), storm-scale numerical weather prediction and data assimilation. Meteor. Atmos. Phys., 82, 139–170.

Fig. 1.

(a) Normalized probability density functions of 250-hPa relative humidity with respect to ice computed from the ARPS model over the model domain (solid line) and a 5000-point simulated distribution (dashed line) based on a random uniform distribution. (b) Normalized probability density functions of 250-hPa temperature computed from the ARPS model over an 18-month period (solid line), the ARPS model over a 2-day period in February 2006 (dotted line), and a 5000-point simulation based on a random normal distribution (dashed line). (c) Normalized probability density functions of 250-hPa vertical velocity computed from the ARPS model (solid line) and a 5000-point random logistic distribution (dashed line).

Citation: Journal of Applied Meteorology and Climatology 48, 9; 10.1175/2009JAMC2056.1

Table 1.

Atmospheric parameters used as predictors in the logistic models.

Table 2.

Scenarios of normally distributed random error added to the synthetic meteorological measurements. The magnitude of the added random error is represented in each scenario in terms of the standard deviation of the error.

Table 3.

Skill scores (PC/HKD) computed for each of the 13 synthetic meteorological datasets based on a set of either four or seven predictors. Each scenario represents a combination of critical probability threshold and contrail occurrence criteria.
Save
  • Appleman, H., 1953: The formation of exhaust condensation trails by jet aircraft. Bull. Amer. Meteor. Soc., 34 , 1420.

  • Benjamin, S. G., and Coauthors, 2004: An hourly assimilation–forecast cycle: The RUC. Mon. Wea. Rev., 132 , 495518.

  • Bratseth, A. M., 1986: Statistical interpolation by means of successive corrections. Tellus, 38A , 439477.

  • Buehler, S. A., and N. Courcoux, 2003: The impact of temperature errors on perceived humidity supersaturation. Geophys. Res. Lett., 30 , 1759. doi:10.1029/2003GL017691.

    • Search Google Scholar
    • Export Citation
  • Carleton, A. M., D. J. Travis, K. Master, and S. Vezhapparambu, 2008: Composite atmospheric environments of jet contrail outbreaks for the United States. J. Appl. Meteor. Climatol., 47 , 641667.

    • Search Google Scholar
    • Export Citation
  • Duda, D. P., and P. Minnis, 2009: Basic diagnosis and prediction of persistent contrail occurrence using high-resolution numerical weather analyses/forecasts and logistic regression. Part II: Evaluation of sample models. J. Appl. Meteor. Climatol., 48 , 17901802.

    • Search Google Scholar
    • Export Citation
  • Duda, D. P., P. Minnis, L. Nguyen, and R. Palikonda, 2004: A case study of the development of contrail clusters over the Great Lakes. J. Atmos. Sci., 61 , 11321146.

    • Search Google Scholar
    • Export Citation
  • Duda, D. P., R. Palikonda, and P. Minnis, 2009: Relating observations of contrail persistence to numerical weather analysis output. Atmos. Chem. Phys., 9 , 13571364.

    • Search Google Scholar
    • Export Citation
  • Gandin, L. S., and A. H. Murphy, 1992: Equitable skill scores for categorical forecasts. Mon. Wea. Rev., 120 , 361370.

  • Gettelman, A., E. J. Fetzer, A. Eldering, and F. W. Irion, 2006: The global distribution of supersaturation in the upper troposphere from the Atmospheric Infrared Sounder. J. Climate, 19 , 60896103.

    • Search Google Scholar
    • Export Citation
  • Glahn, H. R., and D. A. Lowry, 1972: The use of model output statistics (MOS) in objective weather forecasting. J. Appl. Meteor., 11 , 12031211.

    • Search Google Scholar
    • Export Citation
  • Hosmer, D. W., and S. Lemeshow, 1989: Applied Logistic Regression. John Wiley & Sons, 307 pp.

  • Jackson, A., B. Newton, D. Hahn, and A. Bussey, 2001: Statistical contrail forecasting. J. Appl. Meteor., 40 , 269279.

  • Lund, I. A., 1955: Estimating the probability of a future event from dichotomously classified predictors. Bull. Amer. Meteor. Soc., 36 , 325328.

    • Search Google Scholar
    • Export Citation
  • Mapes, B. E., P. E. Ciesielski, and R. H. Johnson, 2003: Sampling errors in rawinsonde-array budgets. J. Atmos. Sci., 60 , 26972714.

  • Marquart, S., M. Ponater, F. Mager, and R. Sausen, 2003: Future development of contrail cover, optical depth, and radiative forcing: Impacts of increasing air traffic and climate change. J. Climate, 16 , 28902904.

    • Search Google Scholar
    • Export Citation
  • Minnis, P., J. K. Ayers, M. L. Nordeen, and S. P. Weaver, 2003: Contrail frequency over the United States from surface observations. J. Climate, 16 , 34473462.

    • Search Google Scholar
    • Export Citation
  • Minnis, P., J. K. Ayers, R. Palikonda, and D. Phan, 2004: Contrails, cirrus trends, and climate. J. Climate, 17 , 16711685.

  • Minnis, P., Y. Yi, J. Huang, and K. Ayers, 2005: Relationships between radiosonde and RUC-2 meteorological conditions and cloud occurrence determined from ARM data. J. Geophys. Res., 110 , D23204. doi:10.1029/2005JD006005.

    • Search Google Scholar
    • Export Citation
  • Ponater, M., S. Marquart, and R. Sausen, 2002: Contrails in a comprehensive global climate model: Parameterization and radiative forcing results. J. Geophys. Res., 107 , 4164. doi:10.1029/2001JD000429.

    • Search Google Scholar
    • Export Citation
  • Rogers, R. R., 1979: A Short Course in Cloud Physics. 2nd ed. Pergamon Press, 235 pp.

  • Schrader, M. L., 1997: Calculations of aircraft contrail formation critical temperatures. J. Appl. Meteor., 36 , 17251729.

  • Schumann, U., 1996: On conditions for contrail formation from aircraft exhausts. Meteor. Z., 5 , 322.

  • Travis, D. J., A. M. Carleton, and S. A. Changnon, 1997: An empirical model to predict widespread occurrences of contrails. J. Appl. Meteor., 36 , 12111220.

    • Search Google Scholar
    • Export Citation
  • Walters, M. K., J. D. Shull, and J. P. Asbury III, 2000: A comparison of exhaust condensation trail forecast algorithms at low relative humidity. J. Appl. Meteor., 39 , 8091.

    • Search Google Scholar
    • Export Citation
  • Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences. Academic Press, 467 pp.

  • Xue, M., D. Wang, J. Gao, K. Brewster, and K. K. Droegemeier, 2003: The Advanced Regional Prediction System (ARPS), storm-scale numerical weather prediction and data assimilation. Meteor. Atmos. Phys., 82 , 139170.

    • Search Google Scholar
    • Export Citation
  • Fig. 1.

    (a) Normalized probability density functions of 250-hPa relative humidity with respect to ice computed from the ARPS model over model domain (solid line) and a 5000-point simulated distribution (dashed line) based on a random uniform distribution. (b) Normalized probability density functions of 250-hPa temperature computed from the ARPS model over an 18-month period (solid line), the ARPS model over a 2-day period in February 2006 (dotted line), and a 5000-point simulation based on a random normal distribution (dashed line). (c) Normalized probability density functions of 250-hPa vertical velocity computed from the ARPS model (solid line) and a 5000-point random logistic distribution (dashed line).

All Time Past Year Past 30 Days
Abstract Views 0 0 0
Full Text Views 182 73 7
PDF Downloads 133 53 3