## Introduction

This study is motivated by the fact that it is often necessary to perform regional transport and dispersion calculations with coupled atmospheric dynamic and dispersion–chemistry models when there is considerable uncertainty in many aspects of the modeling process (e.g., adequacy of physical-process parameterizations, data quality). Such challenging, nonideal modeling conditions may be encountered in many practical situations where coupled models are used for environmental planning and analysis, emergency response to toxic releases, and retrospective forensic analyses. The conventional approach has been to make particular choices about the modeling system configuration and to perform a single simulation with the coupled system (e.g., Yamada et al. 1992; Uliasz 1993). However, especially in uncertain conditions, it is beneficial to apply an ensemble of simulations with the coupled system, using various model physics and data estimates, in order to span the space of possible outcomes. The spread of the resulting ensemble-member solutions provides information on the uncertainty in the coupled model results. This need for better assessment of the uncertainty in air quality simulations has recently been articulated by Pielke (1998) and Straume et al. (1998). Dabberdt and Miller (2001), Straume et al. (1998), Straume (2001), and Mosca et al. (1998) present some of the first examples of ensemble air quality simulations. The study reported here extends that work.

The work described in this paper was part of a larger Department of Defense effort in which forensic analyses were performed of the atmospheric transport and dispersion of toxic material that may have been released from various facilities in Iraq in 1991, during and after the Gulf War. Westphal et al. (1999) and Warner and Sheu (2000) describe the application of two different modeling systems for the generation of mesogamma-scale reanalyses of the meteorological conditions that prevailed during the potential toxic release from the Khamisiyah, Iraq, weapons bunker. The modeling systems used were the U.S. Navy Coupled Ocean–Atmosphere Mesoscale Prediction System (COAMPS) and the Pennsylvania State University–National Center for Atmospheric Research (PSU–NCAR) Fifth-Generation Mesoscale Model (MM5). Both models were run in a data-assimilation mode, and a small ensemble of simulations was produced by COAMPS and MM5 using different data input and model physics. The simulation from each model that validated best against sparse meteorological data was used to provide input to a dispersion model for calculation of dosages. This same approach was subsequently used for a case in which there was a possible release of toxic material at Al Muthanna, Iraq.

The differences in the MM5 and COAMPS model solutions (both inter- and intramodel) for the Khamisiyah and Al Muthanna cases suggest a need to determine the extent to which differences in mesoscale dynamic-model simulations for the same case affect the results of dispersion-model simulations. To accomplish this, a number of Al Muthanna MM5 simulations are performed using a variety of configurations, and each is used to provide the meteorological input to the Second-Order Closure Integrated PUFF (SCIPUFF) dispersion model. The ensemble of coupled model simulations of ground-level dosage patterns is then examined to gain insight into the sensitivity of the dispersion model's results to mesoscale model simulation uncertainty. The ensemble of dosage fields is also used to construct explicit dosage-exceedance probability fields for comparison with those obtained from SCIPUFF. The latter calculation is made using ensemble mean wind fields, including the variability in the individual wind fields, as the meteorological inputs to SCIPUFF. Because the focus here is on coupled-model ensembling techniques, some of SCIPUFF's capabilities, such as dry and wet deposition, are not exercised.

The aspects of the work described here that are relatively new, and that complement other recent efforts, are the use of ensemble techniques 1) for mesoscale reanalysis applications and 2) with a coupled atmospheric dynamic model and a probabilistic dispersion model. Regarding the first point, most previous ensemble applications have focused on weather prediction needs. The computational constraints of operational forecasting limited the model resolution to mesoalpha and larger scales. However, this study uses model ensemble techniques for the reanalysis of historical periods; that is, the generation of physically consistent atmospheric analyses of available data. This has enabled us to work on the computationally more demanding smaller mesobeta and mesogamma scales. The second point refers to the fact that this study is one of the first applications of ensemble techniques to a coupled atmospheric dynamic model and dispersion model. To the best of our knowledge, this is also the first test of the ability of a *probabilistic* dispersion model to quantify the effects of uncertainties in a dynamic model ensemble on the dispersion model's predictions of atmospheric transport and dispersion.

The desert areas considered in the model simulations included northern Saudi Arabia, Iraq, Kuwait, and western Iran. There is considerable mesoscale local forcing resulting from the existence of the Persian Gulf and from the orographic gradients associated with the Zagros Mountains in western Iran, the Tigris–Euphrates valley in Iraq, and the mountains on the western third of the Arabian Peninsula. In addition, there are modest variations in the vegetation of the desert surface that can influence the planetary boundary layer (PBL) through effects on the surface heat and moisture budgets. Warner and Sheu (2000) show that these local forcing effects, superimposed upon the synoptic-scale processes, produce large spatial and temporal variability in the PBL depth and the mean wind within the PBL. The product of the PBL depth and the mean horizontal wind within the PBL is defined as the PBL “ventilation,” which is a metric of the ability of the PBL to horizontally transport surface-released contaminants. For this geographic area, Warner and Sheu (2000) show that the daytime ventilation has a spatial and day-to-day variability of over a factor of 6. Thus, transport and dispersion processes can be complex.
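As a minimal illustration of the ventilation metric defined above, the following sketch multiplies a PBL depth by a PBL-mean wind speed. The function name and the sample values are hypothetical; they are chosen only to show how a factor-of-6 contrast can arise and are not taken from the simulations.

```python
def pbl_ventilation(pbl_depth_m, mean_wind_ms):
    """Return PBL ventilation (m^2 s^-1): PBL depth times PBL-mean wind speed."""
    if pbl_depth_m < 0 or mean_wind_ms < 0:
        raise ValueError("depth and wind speed must be non-negative")
    return pbl_depth_m * mean_wind_ms


# Hypothetical values, for illustration only.
weak = pbl_ventilation(500.0, 2.0)     # shallow PBL, light wind
strong = pbl_ventilation(1500.0, 4.0)  # deeper PBL, stronger wind
ratio = strong / weak                  # contrast between the two cases
```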

Sections 2 and 3 describe the meteorological and dispersion models, respectively. The experimental design is discussed in section 4, and the prevailing meteorological conditions are summarized in section 5. Section 6 describes the verification of the meteorological ensemble; section 7 discusses the ensemble dosage simulations with the dispersion model; and section 8 provides a summary and discussion.

## The meteorological model

Details about the PSU–NCAR MM5 modeling system used in this study can be found in Dudhia (1993) and Grell et al. (1994). The triply nested computational grids used in the ensemble study are depicted in Fig. 1. The inner (grids 3N and 3S), middle (grid 2), and outer (grid 1) computational domains had mesh sizes of 127 × 127, 106 × 133 and 72 × 72 points and grid increments of 3.3, 10, and 30 km, respectively. The highest-resolution grids were deemed necessary because Warner and Sheu (2000) show the impact of the finescale desert landscape properties on the PBL depth, and the possible dynamic effects of the lakes in grid 3N should be resolved. There were two inner grids. One was centered over Al Muthanna in central Iraq (grid 3N) where dispersion simulations were required. Another was centered over Hafar al Batin, Saudi Arabia (grid 3S), the area closest to Al Muthanna with a similarly arid climate and with surface and radiosonde data available for comparison with the simulations. The nested grids, each with 35 computational layers in the vertical, were two-way interacting during the simulation. Simulations proceeded simultaneously on both grids 3S and 3N. Because the lowest model computational layer was approximately 40 m above ground level (AGL), with increasing layer depths above, it was not possible for the model to resolve the shallow nocturnal PBL well. The use of a shallower layer near the surface sometimes causes numerical stability problems, requiring a reduction of the time step. The model top was located at 50 hPa.

To create the ensemble of MM5 simulations, various options were employed for the physical-process parameterizations and for the global-scale analyses that were combined with local data to generate regional atmospheric analyses. Table 1 defines the model configurations for the various experiments performed. The MM5 model physics options used in the ensemble study included three PBL parameterizations: 1) the Medium-Range Forecast (MRF) technique used in the MRF Model of the National Centers for Environmental Prediction (NCEP) (Hong and Pan 1996), 2) the turbulent kinetic energy (TKE) parameterization (Shafran et al. 2000), and 3) the Burk and Thompson (1989) parameterization (BT). The Grell (1993) convective-precipitation parameterization was employed on grid 1 only, because of the inapplicability of the assumptions of the parameterization for the higher resolutions of grids 2 and 3. A simple explicit treatment of cloud microphysics based on Dudhia (1989) was employed. This scheme permits both ice and liquid phases for clouds and precipitation but does not permit mixed phases. The Dudhia (1989) radiation scheme in which longwave and shortwave radiation interact with the clear atmosphere, clouds, precipitation, and ground was used in all MM5 runs.

Both simple and relatively complex approaches were used for the surface energy and moisture budgets. The simpler approach employed the “slab model,” developed by Blackadar (1976, 1979) and further tested by Zhang and Anthes (1982), in which ground temperature is calculated for a single soil layer and there is no explicit representation of vegetation effects. The more complex approach used the land surface model (LSM) described in Chen and Dudhia (2001). The LSM, which employs one canopy layer and four soil layers, currently can be used in MM5 only in combination with the MRF PBL scheme. The slab model's soil moisture was consistent with the LSM's initial soil moisture.

The model initial conditions were defined by analyzing radiosonde and surface data to the model grids using a successive-correction, objective analysis procedure (Benjamin and Seaman 1985) with three different first-guess fields. The three first-guess fields were the NCEP global analysis, the European Centre for Medium-Range Weather Forecasts (ECMWF) global analysis, and the U.S. Navy Operational Global Atmospheric Prediction System (NOGAPS) analysis. The ECMWF and NCEP analyses were available on a 2.5° × 2.5° grid, while the NOGAPS analysis was available on a 1.0° × 1.0° grid. Lateral boundary conditions for the outer grid (grid 1) were defined using linear temporal interpolation between 12-hourly objective analyses employing the same three first-guess fields. Note that there is no zero-order discontinuity in the transition from one set of lateral boundary conditions to the next. Any shock to the model solution that results from the coarse temporal resolution of the boundary conditions, as dictated by the 12-h frequency of the radiosondes on which the analyses are based, is somewhat minimized by the significant distance of the boundaries from the area of meteorological interest in the center of the grids.

Because the purpose of the modeling system in this study was to produce a best analysis of the atmosphere rather than a forecast, four-dimensional data assimilation (FDDA) by Newtonian relaxation (Seaman et al. 1995; Stauffer and Seaman 1990; Stauffer et al. 1991) was employed on grid 1. In this approach, the model solution on grid 1 (only) was “nudged” toward analyses of upper-air and surface observations of temperature, wind, and specific humidity. The relaxation coefficients employed were standard values described in the above references. The upper-air (surface) data were available every 12 h (6 h) for construction of these analyses. In Fig. 1, the locations of the surface and radiosonde stations that were used to construct the objective analyses employed for the model initialization, lateral boundary conditions, and Newtonian relaxation are shown. Observations from some stations were occasionally unavailable. Except over Iraq, where no observations were available, typical distances were 250–500 km between radiosonde soundings and 100–200 km between surface observations. The density, especially of surface observations, was clearly variable across the study area.

The simulations were permitted to evolve freely without FDDA on grids 2 and 3, but the model solutions on these grids were constrained by both the large-scale information passing from grid 1 to grid 2 and the land surface forcing. Figure 2 shows the topography for grids 1 and 2. The duration of the continuous simulations was 4.5 days, from 0000 UTC 7 February through 1200 UTC 11 February 1991. The initial time of the simulation was 23 h before the time of the gas release in order to allow the model to dynamically respond to local forcing before the solution was used as input to the dispersion model.
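The two numerical ideas in this section, linear temporal interpolation of boundary values between 12-hourly analyses and Newtonian relaxation toward an analysis, can be sketched for a single scalar as follows. This is an illustrative sketch only, not MM5 code: the function names, the relaxation coefficient, the time step, and the temperatures are all hypothetical, since the text states only that standard coefficient values were used.

```python
def interpolate_boundary(value_t0, value_t1, hours_since_t0, interval_h=12.0):
    """Linearly interpolate a boundary value between two analyses interval_h apart."""
    if not 0.0 <= hours_since_t0 <= interval_h:
        raise ValueError("time must lie within the analysis interval")
    weight = hours_since_t0 / interval_h
    return (1.0 - weight) * value_t0 + weight * value_t1


def nudge_step(model_value, analysis_value, physics_tendency, g_coeff, dt_s):
    """Advance one explicit time step with a Newtonian relaxation term added.

    d(alpha)/dt = physics_tendency + g_coeff * (analysis_value - alpha)
    """
    relaxation = g_coeff * (analysis_value - model_value)
    return model_value + dt_s * (physics_tendency + relaxation)


# Hypothetical numbers: temperatures in K, G = 3e-4 s^-1, 90-s time step.
t_boundary = interpolate_boundary(280.0, 286.0, hours_since_t0=3.0)

# The interpolated value equals the later analysis at the end of the interval,
# so the boundary value itself is continuous from one interval to the next.
t_end_of_interval = interpolate_boundary(280.0, 286.0, hours_since_t0=12.0)

t_model = 290.0
for _ in range(100):
    # With zero physics tendency, the model value relaxes toward the analysis.
    t_model = nudge_step(t_model, 285.0, physics_tendency=0.0,
                         g_coeff=3.0e-4, dt_s=90.0)
```

In this toy setting the difference between the model value and the analysis shrinks by the factor (1 − G Δt) each step, which is why a larger coefficient constrains the solution more tightly.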

## The dispersion model

The probabilistic Lagrangian puff dispersion model used in this study was the SCIPUFF model (Sykes et al. 1984, 1988, 1993). The acronym SCIPUFF describes two aspects of the model. First, the numerical technique employed to solve the dispersion model equations is the Gaussian puff method (Bass 1980) in which a collection of overlapping three-dimensional puffs is used to represent an arbitrary time-dependent concentration field. The number of puffs is internally determined by the model and depends on such factors as the release characteristics, the size of the domain, the numerical resolution choices, and the meteorological conditions. Second, the turbulent diffusion parameterization used in SCIPUFF is based on the second-order closure theories of Donaldson (1973) and Lewellen (1977), providing a direct relationship between measurable velocity statistics and the turbulent dispersion rates.

The Lagrangian puff methodology affords a number of advantages for applications to atmospheric dispersion from localized sources. It avoids the artificial diffusion problems inherent in any Eulerian advection scheme and allows an accurate treatment of the wide range of length scales that prevail as a plume or cloud grows from its initial size and spreads onto larger atmospheric scales. The model is highly efficient for multiscale dispersion problems because puffs can split or merge as they grow.

The turbulence closure relations provide a general representation of the dispersion rate, which can be applied at arbitrary scales provided that an estimate of the velocity statistics is available. For planetary boundary layer scales, the production of turbulence by mechanical shears and buoyancy fluxes is relatively well understood, and turbulence statistics can be predicted from a knowledge of surface fluxes of momentum and heat under idealized conditions. For larger scales and upper-atmosphere stable conditions, the turbulence description is based on climatological information. The SCIPUFF model has been applied to dispersion on local scales of up to 50 km in range (Sykes et al. 1988) and on continental scales of up to 3000 km in range (Sykes et al. 1993).

The turbulence closure is also the basis of the probabilistic aspect of SCIPUFF, which uses a prediction of the concentration fluctuation variance, in combination with the mean value, to assess the statistical nature of the turbulent dispersion. Because turbulent motions are always chaotic, turbulent dispersion is inherently uncertain. The closure model relates the fluctuation intensity in the concentration field to the random variations in the wind field. This feature is of particular interest in the current application because we are concerned with an ensemble of wind fields with random variations representing the uncertainties in large-scale initial and boundary conditions and model physics parameterizations for the mesoscale prediction. In the present application, we can consider the fluctuations in the wind field within the ensemble of MM5 calculations as an additional component of the wind variability, augmenting the atmospheric turbulence with energy at the appropriate scales.

In the present study, SCIPUFF was used in both the ensemble mode and in an explicit or deterministic mode. In the explicit mode, SCIPUFF was used with each dynamic-model ensemble member to create an ensemble of dispersion calculations. In the ensemble mode, the ensemble mean wind field and the velocity variances from the ensemble of wind fields were used as inputs to SCIPUFF to give a single simulation of the ensemble dispersion statistics. The ensemble SCIPUFF results can be compared with the results from the explicit ensemble of dispersion simulations.

The application of SCIPUFF to a meteorological forecast ensemble is new. It should be noted that, in addition to the velocity variance field, the turbulence description requires a mesoscale length-scale estimate as input. The length scale associated with the velocity fluctuations is used to determine both the correlation timescale for the concentration flux and the dissipation timescale for the concentration fluctuation variance. The concentration fluctuation dissipation rate is modeled using a Kolmogorov (−5/3) inertial range assumption to obtain the turbulent energy at the cloud scale when the plume is smaller than the specified velocity length scale (Sykes et al. 1984). The general effect of increasing the length scale is to increase the uncertainty in the concentration field, since there will be less energy providing small-scale mixing and diffusion of the plume and more energy causing variability in the plume location. At present, the specification of this correlation length scale is not well understood. In section 7b of this study, we examine the sensitivity of SCIPUFF's simulations to the length-scale input and also estimate it from the wind field ensemble.

## Experimental design

The goal of traditional ensemble weather forecasting is to predict the probability of future weather events as completely as possible (Epstein 1969; Leith 1974; Mullen and Baumhefner 1994; Molteni et al. 1996). The ensemble approach is motivated by the fact that forecasts are sensitive to uncertainties in the model initial conditions (Lorenz 1963), physics (Harrison et al. 1999; Stensrud et al. 2000), numerics, surface landscape properties, and lateral boundary conditions. Consequently, any single deterministic simulation may be unreasonable. An ensemble is created by employing equally reasonable realizations of the model initial conditions, model physics, or other aspects of the model configuration, and producing forecasts or simulations using each realization.

The use of ensemble techniques in atmospheric dynamic modeling has largely been limited to weather forecasting, primarily on the synoptic and global scale. However, equivalent benefits can be derived from the use of the ensemble approach to applications where the model is used in retrospective studies to provide a best estimate of the atmospheric state and processes [for example, transport and dispersion, as in Straume et al. (1998), Straume (2001), and Mosca et al. (1998)]. For such reanalyses, the model is frequently used in a data-assimilation mode in which available data are ingested by the model during the simulation period. In this approach, in theory, the uncertainties in the model-based analysis system are accounted for by the use of an ensemble of simulations, where the ensemble mean represents a superior estimate of the actual state, as compared with what would be obtained from a single deterministic simulation. In addition, the variance of the ensemble provides a metric of the uncertainty in the analysis. In such retrospective studies, one of the major limitations of ensemble *prediction*—the computational demand associated with producing many model integrations—becomes unimportant because there are no computational constraints of the sort associated with operational forecasting.

It is important to consider the implications of this application of the ensemble approach in terms of the spread of the solutions of the ensemble members. For an ensemble of global model forecasts, the forecasts will continue to diverge from each other, over time, until the mean separation between the forecasts equals the mean separation between randomly chosen atmospheric states. However, the situation is somewhat different for limited-area ensemble forecasting or retrospective simulation with mesoscale models. Here, the lateral boundary conditions are specified based on forecasts from larger-scale (e.g., global) models or are based on analyses of large-scale weather conditions. In either case, the spread of the ensemble solutions that results from the use of various equally reasonable initial states or model physics parameterizations is limited by the use of identical lateral boundary conditions. Thus, for ensemble analysis or prediction by limited-area modeling systems, it also is reasonable to use various equally likely analyses for lateral boundary conditions (this study). Here, the same three analyses were used for the lateral boundary conditions, the initial conditions, and the data assimilation on grid 1.

The coupled MM5–SCIPUFF modeling system was used to calculate ground-level dosages (concentrations integrated over time) for a hypothetical instantaneous release of an inert gas at Al Muthanna. The source characteristics assumed in the SCIPUFF simulations are summarized in Table 2. With the exception of the release time and location, these source characteristics are not intended to bear any resemblance to the actual toxic gas release that may have occurred at Al Muthanna. The MM5 output for grid 3N (see Fig. 1), consisting of the three Cartesian wind velocity components, PBL depth, surface heat flux, potential temperature, and terrain elevation, was provided as input to SCIPUFF at 20-min intervals over the 85-h simulation period. The dosages calculated by SCIPUFF were interpolated at hourly intervals from SCIPUFF's adaptive grid to a latitude–longitude grid with an approximate grid interval of 3.3 km.
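A minimal sketch of dosage as concentration summed over time is given below, together with the arithmetic linking the 4.5-day meteorological simulation, the 23-h lead time before the release, and the 85-h dispersion period. The function name, the concentration series, and the 60-s step are hypothetical and serve only to illustrate the calculation.

```python
def dosage(concentrations, dt_s):
    """Sum concentration samples times a fixed time step (a rectangle-rule integral)."""
    return sum(c * dt_s for c in concentrations)


# Hypothetical concentration series (kg m^-3) at one ground-level point.
series = [0.0, 2.0e-6, 4.0e-6, 1.0e-6, 0.0]
total = dosage(series, dt_s=60.0)  # dosage in kg s m^-3

# Period bookkeeping from the text: 4.5 days of MM5 simulation,
# with the release 23 h after the initial time.
simulation_hours = 4.5 * 24
dispersion_hours = simulation_hours - 23
```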

The SCIPUFF model was used to calculate dosages in two different ways. First, SCIPUFF was executed using the MM5 output from each member of the dynamic-model ensemble under the assumption that the mesoscale variance or uncertainty in the horizontal wind components was zero. Second, SCIPUFF was executed using the MM5 ensemble mean fields, with the mesoscale uncertainty represented by the variances and covariance of the horizontal wind components from the individual MM5 runs. This second ensemble calculation required an estimate of a mesoscale length scale, which at present is not well understood. Several different length scales were used to examine SCIPUFF's sensitivity to this parameter.
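The statistics required by the second approach can be illustrated at a single grid point as follows. This sketch assumes equally weighted members and a population (1/N) normalization; both the normalization and the sample wind values are assumptions made for illustration, and the function is not part of SCIPUFF or MM5.

```python
def ensemble_wind_statistics(u_members, v_members):
    """Return (u_mean, v_mean, var_u, var_v, cov_uv) across ensemble members.

    Inputs are the horizontal wind components (m s^-1) from each member
    at one grid point and one time.
    """
    n = len(u_members)
    if n == 0 or n != len(v_members):
        raise ValueError("need equal, nonzero numbers of u and v values")
    u_mean = sum(u_members) / n
    v_mean = sum(v_members) / n
    var_u = sum((u - u_mean) ** 2 for u in u_members) / n
    var_v = sum((v - v_mean) ** 2 for v in v_members) / n
    cov_uv = sum((u - u_mean) * (v - v_mean)
                 for u, v in zip(u_members, v_members)) / n
    return u_mean, v_mean, var_u, var_v, cov_uv


# Hypothetical three-member example.
u_mean, v_mean, var_u, var_v, cov_uv = ensemble_wind_statistics(
    [2.0, 4.0, 6.0], [-1.0, -3.0, -2.0])
```

In the first approach described above, the same inputs would instead be passed member by member with the variances and covariance set to zero.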

Table 1 summarizes the combinations of model physics options and global analyses used in the MM5 simulations for each ensemble member (EXP). Because precipitation was not a significant influence on the simulations, alternative convective parameterizations were not used. It is important to reiterate that these various choices of model physics and large-scale data analyses are all expected to be equally reasonable. The boundary layer parameterizations employed have been widely used and are accepted by the community. Although one of the surface-physics representations is more sophisticated than the other, that does not necessarily mean that it is superior. The simpler one has been used successfully for 25 years. With respect to the global analyses, all three are routinely used for various research and operational purposes. Thus, given the roughly equivalent skill of the various versions of the model, the individual ensemble-member simulations can be treated as equally likely realizations of the true state of the atmosphere. In addition, the differences in the dosages simulated by the ensemble members can be interpreted as reflecting the typical dosage uncertainty associated with those meteorological modeling choices.

It is important to recognize that this particular ensemble approach does not account for all sources of uncertainty in the coupled modeling system. There are additional uncertainties in the accuracy of 1) the regional observations that are employed to enhance the global analyses for initial and lateral boundary conditions and data assimilation in the dynamic model; 2) the landscape properties in the dynamic model; 3) the parameters used in SCIPUFF, such as the source characteristics; and 4) the relaxation coefficients used in the data assimilation. In addition, the overall data-assimilation approach used (e.g., continuous vs intermittent) could be varied to produce additional members. Thus, the simulated variance of the ensemble solutions reflects a lower limit to the overall uncertainty. To reflect the total uncertainty in the coupled system, the above-listed uncertainties also would have to be included.

## The prevailing meteorological conditions

Large-scale weather conditions aloft and near the surface were relatively undisturbed by storms during the 4.5-day period of the meteorological model simulations, from 0000 UTC 7 February through 1200 UTC 11 February 1991. At 500 hPa, moderately weak winds surrounding Iraq varied between westerly and northwesterly, with typical speeds of 10–20 m s^{−1}. The highest speeds of 25–30 m s^{−1} prevailed around 1200 UTC 9 February 1991. It is impossible to deduce the fine structure in the low-level flow (at 850 hPa and at the surface) in Iraq because of the absence of observations (Fig. 1). The general flow patterns can still be estimated based on surrounding data. The large-scale 850-hPa winds for the entire period are analyzed as northwesterly over Iraq. Some analyses for the 48-h period after 0000 UTC 9 February show weak winds in northern Saudi Arabia becoming easterly, but it is impossible to estimate how far north into Iraq these easterly winds might extend. Satellite images show periodic cloudiness during the whole period, but the overall pattern involved clear skies under the influence of high pressure (Walters et al. 1992).

## MM5 results

The southernmost of the pair of inner grids (grid 3S) was included to allow verification of the MM5 simulations against the closest radiosonde sounding at Hafar al Batin and the nearby surface station. The meteorological model verification statistics for Hafar al Batin serve as surrogates for those at Al Muthanna, which cannot be ascertained. Naturally, model skill statistics based on observations at only a few locations should be viewed with some skepticism. During the 4.5-day simulation period, there were 10 radiosonde profiles and 57 surface observations in grid 3S. Figures 3 and 4 show examples of the surface and upper-air model errors, respectively, for Hafar al Batin and the nearby surface station for each of the ensemble members for the 108-h period. The errors shown are the simulated and observed temperature differences (simulated minus observed), and the magnitude of the vector difference between the simulated and observed wind (which reflects both direction and speed errors). For the surface verification, the model solution for the wind was extrapolated from the lowest computational level at 40 m AGL to the 10-m-AGL observation level using similarity theory. The modeled temperature was extrapolated downward to the 2-m-AGL level using a standard lapse rate. For surface and upper-air error calculations, the model solution was bilinearly interpolated to the observation point from the surrounding four model grid points. In Fig. 3, the surface temperature error shows temperatures for the first 2 days that are generally too warm at night and too cool during the day. This error, which is common among most ensemble members, is especially apparent from simulation hours 48 to 60. 
This error could result from a number of contributing factors, including errors in the estimate of the surface physical properties in the model, the model vertical resolution near the ground as it affects the model's ability to resolve very shallow and strong nocturnal inversions, and the procedure by which the temperature is extrapolated from the lowest model computation level to the 2-m level of the measurements. Davis et al. (1999) report a comparable diurnal pattern and magnitude to the 2-m temperature error for a similar modeling system applied over an arid area. At individual times, the temperature-error difference among experiments is as large as 2°–3°C. The temperature error becomes increasingly negative with time for all ensemble members for the first 3 days. After that, the errors cease to become more negative. Wind speeds tend to be too high at night and too low during the day, with the magnitude of the speed error (not shown) ranging from 1 to 5 m s^{−1} and being largely associated with the diurnal error. The speed errors among the experiments differ by 1–2 m s^{−1}. Wind direction errors (not shown) have less diurnal consistency among experiments, with differences between ensemble members ranging from 60° to 120°. The upper-air and 10-m-AGL wind speed and direction errors of these simulations are comparable to those obtained from Warner and Sheu (2000). A direct quantitative comparison of published values is difficult because speed and direction error statistics are tabulated separately in Warner and Sheu (2000), while the speed of the vector wind error is provided here. Specific humidity errors (not shown) are both positive and negative for the different experiments, with the larger error values of 1–2 g kg^{−1}. 
The multiday trends in the errors for temperature and wind likely result from the fact that the statistics are computed for a relatively small area around Hafar al Batin, and the weather events that slowly traverse this area are simulated with varying degrees of accuracy by the model. Generally speaking, when model performance statistics are computed over a larger area, there is less day-to-day variability.
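The verification quantities described above can be sketched as follows: bilinear interpolation from the four surrounding grid points, the signed temperature error, and the magnitude of the vector wind difference. The function names and sample values are hypothetical, and the unit-cell coordinate convention is an assumption made for this illustration.

```python
import math


def bilinear(f00, f10, f01, f11, x, y):
    """Interpolate inside a unit grid cell.

    f00, f10, f01, f11 are values at corners (0,0), (1,0), (0,1), (1,1);
    x and y are the fractional position of the observation point in the cell.
    """
    return (f00 * (1 - x) * (1 - y) + f10 * x * (1 - y)
            + f01 * (1 - x) * y + f11 * x * y)


def temperature_error(t_sim, t_obs):
    """Signed error, simulated minus observed."""
    return t_sim - t_obs


def vector_wind_error(u_sim, v_sim, u_obs, v_obs):
    """Magnitude of the vector difference; reflects speed and direction errors."""
    return math.hypot(u_sim - u_obs, v_sim - v_obs)


# Hypothetical values at an observation point in the middle of a grid cell.
t_sim = bilinear(10.0, 12.0, 14.0, 16.0, x=0.5, y=0.5)
t_err = temperature_error(t_sim, t_obs=15.0)
wind_err = vector_wind_error(3.0, 4.0, 0.0, 0.0)
```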

For most experiments, the upper-air statistics in Fig. 4 show similar temporal variability in the biases, and there is no clear trend toward increasing error with increasing simulation time. This is likely a result of the use of analyses for lateral boundary conditions on grid 1, and of the assimilation of observations on that grid. The large variations in the wind biases with time are explained in part by the fact that only one sounding is available. Simulated daytime (1200 UTC, 1500 LT) PBL depths were within 25 hPa (∼250 m) of the observed values.

The near-surface and upper-air errors shown in Figs. 3 and 4 suggest that there is no ensemble member that is clearly better than the rest. Also, tabulated surface and upper-air mean error, bias, and root-mean-square error averaged for the entire period (not shown) do not indicate the clear superiority of any ensemble member. When errors for a particular variable are smaller for one of the ensemble members, the errors in the other variables are not also smaller. In addition, no global analysis, no boundary layer parameterization, and no surface-physics treatment performs consistently better. The typical ensemble-modeling assumption that the model configurations and data for all ensemble members are roughly equally accurate is consistent with the relative uniformity in the skill statistics. It is worth noting that these error statistics are not dissimilar from those obtained by Warner and Sheu (2000) in their reanalysis of the Khamisiyah, Iraq, mesoscale conditions with a similar modeling system. For example, their mean absolute 2-m temperature error for the simulation period was 2°C for each of the members of their small ensemble.

## SCIPUFF results

Two measures are used to illustrate differences among the ensemble members in the dispersion produced by the coupled modeling system: total dosage within ∼200 km of the release, and the time evolution of the area exposed above a dosage threshold. To contrast the two previously described approaches to ensembling, plots of dosage-exceedance probabilities are compared.

### Dosage fields

Dosage (*D*) was calculated according to Eq. (1) as a summation over time of the concentration (*C*), where *i* is the model time step number and Δ*t* is the model time step:

$$D = \sum_{i} C_{i}\,\Delta t \qquad (1)$$

Even though the gas moved generally to the southeast for all ensemble members, there clearly are significant differences among the solutions. In some experiments, the plume remained narrow as it traveled to the southeast. In others, the same initial movement prevailed, but the plume widened rapidly, especially toward the west. These differences result from the fact that some ensemble members carry low-level easterlies into southern and central Iraq (thus causing a westward displacement of the plume boundary), while other ensemble members do not. There are some consistent patterns in the relationships between plume spread and experimental conditions. For example, all the members that employed the ECMWF analysis (1, 4, 7, and 10) showed a broad plume, whereas three out of the four members that employed the NCEP analysis (2, 5, 8, and 11) had a narrow plume. Which scenario is correct is impossible to verify. The large-scale analysis for 850 hPa shows weak easterly flow in northern Saudi Arabia, but the lack of upper-air data in Iraq makes it impossible to say how far north the easterlies extended.

Some experiments (e.g., EXP 3) produced northwesterly winds at the lowest model level (40 m AGL) in excess of 5 m s^{−1} at Al Muthanna and to the southeast after the gas release. These winds sweep most of the gas off grid 3N in less than 12 h. In other experiments (e.g., EXP 1), the flow is much weaker, the gas is not swept to the southeast as rapidly, and the gas becomes caught in an easterly flow that develops in the southern half of the grid after 12 h. Figure 6 shows the 40-m-AGL winds simulated for 12 and 20 h after the gas release for every fifth point of grid 3N for EXP 1 and 3. The EXP 1 winds at Al Muthanna (denoted by the star in Fig. 6) and to the southeast are from the northwest and are weak (about 2 m s^{−1}) for the first 12 h after release (12 h is shown). At this speed, the gas is transported about one-third of the distance to the southeast corner of the grid in 12 h. In EXP 3, northwesterly winds in excess of 5 m s^{−1} during the first 12 h after the release carry the gas off the grid before the winds turn easterly. This easterly flow, which influences the gas in EXP 1, is shown in the figure at 20 h after the release. Thus, important distinctions among the ensemble members are the strength of the northwesterly flow and the degree to which the winds turn easterly before the gas is transported off the grid.
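The transport distances implied by these wind speeds can be checked with simple arithmetic. The sketch below assumes steady winds, a release near the center of grid 3N, and a square grid whose inscribed circle has the 210-km radius quoted for grid 3N; the helper name `advection_distance_km` is hypothetical.

```python
import math

SECONDS_PER_HOUR = 3600.0

def advection_distance_km(speed_m_s, hours):
    """Distance (km) covered at a constant wind speed."""
    return speed_m_s * hours * SECONDS_PER_HOUR / 1000.0

# Assumption: the release is near the grid center, so the southeast corner
# lies along the diagonal of a square with a 210-km half-width.
half_width_km = 210.0
corner_distance_km = half_width_km * math.sqrt(2.0)

exp1_km = advection_distance_km(2.0, 12.0)   # weak flow, as in EXP 1
exp3_km = advection_distance_km(5.0, 12.0)   # lower bound for EXP 3 (winds exceed 5 m/s)
fraction_exp1 = exp1_km / corner_distance_km

print(f"EXP 1: {exp1_km:.1f} km, {fraction_exp1:.2f} of the way to the corner")
print(f"EXP 3: at least {exp3_km:.0f} km vs. a {half_width_km:.0f}-km half-width")
```

Under these assumptions the 2 m s^{−1} flow covers roughly 86 km in 12 h, a little under one-third of the distance to the corner, while a 5 m s^{−1} flow covers at least 216 km, which is comparable to the distance from the grid center to its edge.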

One way to quantify the practical implications of the spread in the model solutions is to plot the time evolution of the area covered by dosage above some threshold (e.g., the dosage corresponding to the “first noticeable effects” or the “general population limit”). We arbitrarily chose the lowest dosage plotted in Fig. 5 for this purpose. (Note that all dosages scale exactly with the initial mass of the gas release.) Area-coverage computations were limited to the part of the grid that lies within the 210-km-radius circle that is tangent to the sides of grid 3N. Figure 7 shows that the area with dosage above the threshold varies by more than a factor of 4 within the ensemble. The area coverage for the threshold dosage also continues to increase out to almost 30 h for some simulations, whereas for others the area exposed reaches its maximum in as few as 8 h.
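The area-coverage diagnostic can be sketched as follows: accumulate dosage as the running sum of concentration times the time step, then sum the area of grid cells above the threshold that lie within a given radius of the grid center. The function names, the toy 5 × 5 grid, the 100-km cell size, and the concentration values are all hypothetical and chosen only to make the example easy to follow; the actual grid spacing and output processing used in the study are not assumed here.

```python
import math

def running_dosage(conc_series, dt_s):
    """Running dosage fields D_k = sum_{i<=k} C_i * dt (kg s m^-3).

    conc_series: list over time of 2-D lists of concentration (kg m^-3).
    """
    ny, nx = len(conc_series[0]), len(conc_series[0][0])
    total = [[0.0] * nx for _ in range(ny)]
    fields = []
    for conc in conc_series:
        for j in range(ny):
            for i in range(nx):
                total[j][i] += conc[j][i] * dt_s
        fields.append([row[:] for row in total])  # snapshot at this time
    return fields

def exposed_area_km2(dosage, threshold, cell_km, radius_km):
    """Area of cells above threshold whose centers lie within radius_km
    of the grid center."""
    ny, nx = len(dosage), len(dosage[0])
    yc, xc = (ny - 1) / 2.0, (nx - 1) / 2.0
    area = 0.0
    for j in range(ny):
        for i in range(nx):
            dist_km = math.hypot(i - xc, j - yc) * cell_km
            if dist_km <= radius_km and dosage[j][i] > threshold:
                area += cell_km ** 2
    return area

# Toy case: steady concentration at the center cell and at one corner cell.
conc = [[0.0] * 5 for _ in range(5)]
conc[2][2] = 1e-9   # center, inside the 210-km circle
conc[0][0] = 1e-9   # corner, about 283 km from the center, outside the circle
fields = running_dosage([conc, conc], dt_s=60.0)
areas = [exposed_area_km2(f, 1e-7, cell_km=100.0, radius_km=210.0) for f in fields]
print(areas)
```

Because dosage scales linearly with the released mass, the same fields can be rescaled for a different source mass without rerunning the accumulation.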

### Dosage probabilities

A comparison of the explicit and ensemble probabilities of exceeding a specified dosage value is a practical measure of the skill of the probabilistic prediction. However, there is the limitation that the finite ensemble, containing *N* members, gives discrete probability levels in multiples of 1/*N*. This limitation is easily addressed in a contour map of isoprobability because we can limit our attention to regions where the probability significantly exceeds 1/*N*, or about 10% in our case. Our probabilistic comparisons focus on the probability of exceeding specified dosage levels, ranging from a relatively low dosage covering a large area to a relatively high dosage, where the small region of influence is close to the source.
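A minimal sketch of the member-counting estimate makes the discreteness concrete: the probability at each grid point is the fraction of ensemble members whose dosage exceeds the threshold, so it can only take the values 0, 1/*N*, 2/*N*, and so on. The function name and the toy dosage values below are hypothetical.

```python
def explicit_exceedance_probability(member_dosages, threshold):
    """Fraction of members exceeding the threshold at each grid point.

    member_dosages: list (one entry per member) of flat lists of dosage,
    all of the same length (one value per grid point).
    """
    n_members = len(member_dosages)
    n_points = len(member_dosages[0])
    return [
        sum(1 for member in member_dosages if member[p] > threshold) / n_members
        for p in range(n_points)
    ]

# Toy ensemble of N = 12 members at three grid points (hypothetical values):
# point 0 is exceeded by every member, point 1 by three members, point 2 by none.
members = [[1e-6, 1e-6 if k < 3 else 0.0, 0.0] for k in range(12)]
probs = explicit_exceedance_probability(members, threshold=1e-7)
print(probs)
```

With 12 members the smallest nonzero probability is 1/12, which is why contours much below about 10% carry little information.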

As noted in section 3, the SCIPUFF ensemble simulation requires an estimate of the correlation length scale associated with the velocity fluctuations in the meteorological ensemble. In general, we expect this correlation to be an anisotropic function of both three-dimensional space and time. However, given the limited availability of correlation information and our limited understanding of the relationship between the correlation and the empirical scale in the closure model, we chose to make the dispersion simulation using a single horizontal length scale Λ_{H} to describe the velocity correlation. The horizontal scale is thus a model parameter that can be adjusted to provide the optimum simulation, and several calculations were made to examine the sensitivity. We note that, because the model simulations are compared for different dosage exceedance levels and over the entire region affected by the plume, the single parameter adjustment is not simply “fitting” the observations.

Figure 8 shows the explicit probability of the dosage exceeding 10^{−7} kg s m^{−3}, together with SCIPUFF ensemble simulations using Λ_{H} = 10 and 50 km. The explicit probability field was estimated directly from the 12 explicit SCIPUFF simulations. That is, the probability at each surface grid location was defined as the number of ensemble simulations that exceeded 10^{−7} kg s m^{−3}, divided by 12. For the SCIPUFF ensemble simulations, the probability was obtained from the clipped normal distribution using the predicted dosage mean and variance. When compared with the explicit ensemble, the Λ_{H} = 10 km simulation gives a larger region where the probability is greater than 40%, and both the Λ_{H} = 10 and 50 km simulations give smaller areas for the 10% probability contour. The explicit maximum probability has better agreement with that based on Λ_{H} = 10 km. Although not shown, probabilities resulting from the use of Λ_{H} values greater than 50 km are almost identical to the 50-km result for this exceedance level, indicating that the effective dissipation is negligible for the larger Λ_{H} values at this relatively short range.

For the lower dosage value of 10^{−9} kg s m^{−3} in Fig. 9, the probability simulations are more sensitive to the length-scale assumption, and Λ_{H} = 50 km shows the best agreement with the explicit estimates. The Λ_{H} = 10 km assumption gives probabilities too close to unity, indicating that the variance is too small because of the higher dissipation rate associated with the smaller length scale. The Λ_{H} = 50 km result generally tends to underestimate the area of the highest probabilities, indicating that a length scale slightly smaller than 50 km may give the optimum simulation. For the lowest dosage level of 10^{−11} kg s m^{−3}, shown in Fig. 10, the Λ_{H} = 10 km result compares better with the explicit estimate.

Figure 11 shows the temporal autocorrelation of the *u* and *υ* wind components vertically averaged over the model levels up to 1100 m. The correlations are shown for locations at the center of the model domain, close to the release location, and at the center of the southeast quadrant, in the direction of the plume transport. The autocorrelation, which was averaged over the entire time period of the simulation using the velocity values at the discrete times *t*_{n} = *n*Δ*t*, was estimated as

[equation missing from source]

for the *u* component, where the lag time *τ* = *s*Δ*t*, *N*_{T} is the number of time samples, and [symbol missing from source] is the correlation between the velocities at times *t*_{i} and *t*_{j}. The subscript *m* refers to the ensemble member, and the ensemble mean and variance are defined as

[equation missing from source]

The same analysis is applied to the *υ* component to give the *R*^{υυ} correlation.
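As a rough illustration of this kind of estimate, the sketch below computes a lag-*s* autocorrelation from an ensemble of time series: at each pair of times separated by *s* samples it correlates the members' deviations from the ensemble mean, normalizes by the ensemble standard deviations at the two times, and averages over all available pairs. The normalization (division by the number of members) and the handling of zero-spread times are assumptions made here, and the function name and toy data are hypothetical; the study's exact estimator may differ.

```python
import math

def ensemble_autocorrelation(u, s):
    """Time-averaged lag-s correlation across ensemble members.

    u[m][n] is the velocity of member m at time sample n.
    """
    n_members, n_times = len(u), len(u[0])
    # Deviations from the ensemble mean at each time sample.
    dev = []
    for n in range(n_times):
        mean_n = sum(u[m][n] for m in range(n_members)) / n_members
        dev.append([u[m][n] - mean_n for m in range(n_members)])
    var = [sum(d * d for d in dev[n]) / n_members for n in range(n_times)]

    total, count = 0.0, 0
    for n in range(n_times - s):
        denom = math.sqrt(var[n] * var[n + s])
        if denom == 0.0:
            continue  # no ensemble spread at one of the two times
        cov = sum(a * b for a, b in zip(dev[n], dev[n + s])) / n_members
        total += cov / denom
        count += 1
    if count == 0:
        raise ValueError("no time pairs with nonzero ensemble spread")
    return total / count

# Members that differ by constant offsets stay perfectly correlated in time.
offsets = [[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0], [4.0, 5.0, 6.0, 7.0]]
r_offsets = [ensemble_autocorrelation(offsets, s) for s in range(3)]

# Members whose deviations flip sign every step are anticorrelated at lag 1.
flipping = [[1.0, -1.0, 1.0, -1.0], [-1.0, 1.0, -1.0, 1.0]]
r_flip = [ensemble_autocorrelation(flipping, s) for s in range(3)]
print(r_offsets, r_flip)
```

With only a dozen members, each pairwise correlation is noisy, so the averaging over time pairs does most of the work in producing a usable curve.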

Figure 11 indicates that the correlation timescale ranges between 3.5 and 6 h, based on the point where the correlation function falls to 0.5. Some of the correlations fall smoothly to near zero, while others remain finite for long times, making an integral definition of the correlation scale very unreliable. The slow decay of some correlations may be due to the small number of ensemble members, but our limited sampling makes it difficult to deduce any general result. The average (*u,* *υ*) velocities at the two locations are (3.1, −3.1) and (3.4, −3.7) m s^{−1}, resulting in correlation length scales ranging from 55 to 108 km. The actual correlation integral timescale is expected to be proportional to our simple estimate, with the constant of proportionality depending on the shape of the correlation function. The timescale corresponding to a correlation value of 0.5 is chosen as a robust estimate that is not sensitive to the limited sampling statistics. However, we only expect Λ_{H} to be proportional to the correlation, following the approach of Mellor and Yamada (1974) where all the turbulence scales are assumed to be proportional to a “master” scale, so the actual value for the correlation scale is not critical. Given the flexibility of an *O*(1) constant, the estimated value of 50 km for Λ_{H} is not inconsistent with the estimated correlation scales. Further investigation with different meteorological conditions is needed to determine whether this correlation measure bears a fixed relation to Λ_{H}.
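The quoted range of length scales can be reproduced by multiplying the mean wind speed by the correlation timescale. The sketch below assumes that the shortest timescale is paired with the slower mean wind and the longest with the faster one, which brackets the range; that pairing and the function name are assumptions.

```python
import math

def correlation_length_km(u_m_s, v_m_s, timescale_h):
    """Length scale (km) = mean wind speed x correlation timescale."""
    speed_m_s = math.hypot(u_m_s, v_m_s)
    return speed_m_s * timescale_h * 3600.0 / 1000.0

# Mean (u, v) velocities and timescales quoted in the text.
shortest_km = correlation_length_km(3.1, -3.1, 3.5)   # about 55 km
longest_km = correlation_length_km(3.4, -3.7, 6.0)    # about 108.5 km
print(f"{shortest_km:.1f} km to {longest_km:.1f} km")
```

Both values are within an *O*(1) factor of the 50-km value of Λ_{H} that gave the best agreement at the intermediate dosage level.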

## Summary

For a variety of reasons, coupled dynamic meteorological models and dispersion models are most often used in a deterministic sense. That is, choices are made about a variety of aspects of the model physics and data, and a single simulation is performed. Unfortunately, the choices made can often significantly affect the model solution. The work described here illustrates an ensemble-simulation approach that addresses this uncertainty issue. Specifically, a number of coupled-model simulations were performed with different equally plausible sources of dynamic model data and model physics. Dosage exceedance probability statistics can be computed from the resulting ensemble of simulations of meteorological conditions. As an alternative, this paper examines the feasibility of obtaining equivalent dosage probability statistics from a single ensemble run of a probabilistic dispersion model.

The case employed in this study is one in which a toxic gas may have been released near Al Muthanna, Iraq during the Gulf War. In order to estimate exposure of personnel, calculations are required for the transport and dispersion of the gas. Unfortunately, such analyses must often be accomplished in nonideal conditions. In this case, no meteorological data are available for Iraq during the period of interest, and meteorological model physics representations have not been well tested for this region.

Twelve meteorological model simulations were performed using different, but equally reasonable, choices for model physics and sources of large-scale meteorological conditions. Even though similar basic meteorological features were apparent in all of the simulations, the differences were sufficient to produce significant sensitivity in the resulting dosage fields for the inert gas. For example, the areas near the source exposed to selected dosage thresholds vary by up to a factor of 4 among members of the ensemble. Moreover, for some ensemble members a significant fraction of the plume remains within the area of interest for a period that is over 4 times as long as for other members of the ensemble. Clearly, using any individual ensemble member to estimate event exposure would have been potentially misleading. In addition, if these estimates had been forecasts from the coupled modeling system, rather than retrospective diagnostic simulations, the emergency responses based on individual ensemble members would have differed significantly.

The large variations in the dosage fields simulated by the coupled MM5–SCIPUFF modeling system for the individual ensemble members illustrate why it is desirable to quantify the effects of meteorological uncertainties in dispersion model hazard predictions. One way of doing this is to perform the dispersion model calculations for each member of the meteorological ensemble and use the results to estimate dosage exceedance probabilities. However, SCIPUFF's probabilistic simulation capability enables dosage-exceedance probabilities to be estimated in a single run with ensemble meteorological inputs. Although only one ensemble was considered in this study, the relatively good correspondence between the explicit and ensemble dosage probabilities is encouraging. It suggests that SCIPUFF's probabilistic methodology can be of value in quantifying the uncertainty in a dispersion simulation that arises from uncertainties in the dynamic meteorological-model simulation. It is also reasonable to assume that SCIPUFF would have similar value in quantifying the effects of meteorological uncertainties other than those estimated from an ensemble of dynamic model simulations. However, further investigation of the correlation length Λ_{H} is needed before the results in this paper can be generalized.

## Acknowledgments

This research was funded by the Office of the Special Assistant to the Deputy Secretary of Defense for Gulf War Illnesses through a U.S. Army Test and Evaluation Command Interagency Agreement with the National Science Foundation. Daran Rife assisted in the performance of some of the numerical calculations. Nelson Seaman and David Stauffer provided valuable advice about the experimental design and the analysis of the results.

## REFERENCES

Bass, A. 1980. Modelling long range transport and diffusion. *Proc. Second Joint Conf. on Applications of Air Pollution Meteorology,* New Orleans, LA, Amer. Meteor. Soc., Air Pollution Control Association, 193–215.

Benjamin, S. G., and N. L. Seaman. 1985. A simple scheme for objective analysis in curved flow. *Mon. Wea. Rev.* 113:1184–1198.

Blackadar, A. K. 1976. Modeling the nocturnal boundary layer. Preprints, *Third Symp. on Atmospheric Turbulence, Diffusion and Air Quality,* Raleigh, NC, Amer. Meteor. Soc., 46–49.

Blackadar, A. K. 1979. High resolution models of the planetary boundary layer. *Advances in Environmental Science and Engineering,* J. Pfafflin and E. Ziegler, Eds., Vol. 1, No. 1, Gordon and Breach, 50–85.

Burk, S. D., and W. T. Thompson. 1989. A vertically nested regional numerical weather prediction model with second-order closure physics. *Mon. Wea. Rev.* 117:2305–2324.

Chen, F., and J. Dudhia. 2001. Coupling an advanced land surface–hydrology model with the Penn State–NCAR MM5 modeling system. Part I: Model implementation and sensitivity. *Mon. Wea. Rev.* 129:569–585.

Dabberdt, W. F., and E. Miller. 2001. Uncertainty, ensembles and air quality dispersion modeling: Applications and challenges. *Atmos. Environ.* 34:4667–4673.

Davis, C., T. Warner, E. Astling, and J. Bowers. 1999. Development and application of an operational, relocatable, mesogamma-scale weather analysis and forecasting system. *Tellus* 51A:710–727.

Donaldson, C. du P. 1973. Atmospheric turbulence and the dispersal of atmospheric pollutants. *Workshop on Micrometeorology,* D. A. Haugen, Ed., Amer. Meteor. Soc., 313–390.

Dudhia, J. 1989. Numerical study of convection observed during the winter monsoon experiment using a mesoscale two-dimensional model. *J. Atmos. Sci.* 46:3077–3107.

Dudhia, J. 1993. A nonhydrostatic version of the Penn State/NCAR mesoscale model: Validation tests and the simulation of an Atlantic cyclone and cold front. *Mon. Wea. Rev.* 121:1493–1513.

Epstein, E. S. 1969. Stochastic dynamic prediction. *Tellus* 21:739–759.

Grell, G. A. 1993. Prognostic evaluation of assumptions used by cumulus parameterizations. *Mon. Wea. Rev.* 121:764–787.

Grell, G. A., J. Dudhia, and D. R. Stauffer. 1994. A description of the Fifth Generation Penn State/NCAR Mesoscale Model (MM5). NCAR Tech. Note NCAR/TN 398+STR, 138 pp. [Available from NCAR, P. O. Box 3000, Boulder, CO 80307.]

Harrison, M. S. J., T. N. Palmer, D. S. Richardson, and R. Buizza. 1999. Analysis and model dependencies in medium range ensembles: Two transplant case studies. *Quart. J. Roy. Meteor. Soc.* 125:2487–2515.

Hong, S-Y., and H-L. Pan. 1996. Nonlocal boundary layer vertical diffusion in a medium-range forecast model. *Mon. Wea. Rev.* 124:2322–2339.

Leith, C. E. 1974. Theoretical skill of Monte Carlo forecasts. *Mon. Wea. Rev.* 102:409–418.

Lewellen, W. S. 1977. Use of invariant modeling. *Handbook of Turbulence,* W. Frost and T. H. Moulden, Eds., Plenum Press, 237–280.

Lorenz, E. N. 1963. Deterministic nonperiodic flow. *J. Atmos. Sci.* 20:130–141.

Mellor, G. L., and T. Yamada. 1974. A hierarchy of turbulence closure models for planetary boundary layers. *J. Atmos. Sci.* 31:1791–1806.

Molteni, F., R. Buizza, T. N. Palmer, and T. Petroliagis. 1996. The ECMWF ensemble prediction system. *Quart. J. Roy. Meteor. Soc.* 122:73–119.

Mosca, S., G. Graziani, W. Klug, R. Bellasio, and R. Bianconi. 1998. A statistical methodology for the evaluation of long-range dispersion models: An application of the ETEX exercise. *Atmos. Environ.* 32:4307–4324.

Mullen, S. L., and D. P. Baumhefner. 1994. Monte Carlo simulations of explosive cyclogenesis. *Mon. Wea. Rev.* 122:1548–1567.

Pielke, R. A., Sr. 1998. The need to assess uncertainty in air quality evaluations. *Atmos. Environ.* 32:1467–1468.

Seaman, N. L., D. R. Stauffer, and A. M. Lario-Gibbs. 1995. A multiscale four-dimensional data assimilation system applied in the San Joaquin Valley during SARMAP. Part I: Modeling design and basic performance characteristics. *J. Appl. Meteor.* 34:1739–1761.

Shafran, P. C., N. L. Seaman, and G. A. Gayno. 2000. Evaluation of numerical predictions of boundary layer structure during the Lake Michigan Ozone Study. *J. Appl. Meteor.* 39:412–426.

Stauffer, D. R., and N. L. Seaman. 1990. Use of four-dimensional data assimilation in a limited-area mesoscale model. Part I: Experiments with synoptic data. *Mon. Wea. Rev.* 118:1250–1277.

Stauffer, D. R., N. L. Seaman, and F. S. Binkowski. 1991. Use of four-dimensional data assimilation in a limited-area mesoscale model. Part II: Effects of data assimilation within the planetary boundary layer. *Mon. Wea. Rev.* 119:734–754.

Stensrud, D. J., J-W. Bao, and T. T. Warner. 2000. Using initial condition and model physics perturbations in short-range ensemble simulations of mesoscale convective systems. *Mon. Wea. Rev.* 128:2077–2107.

Straume, A. G. 2001. A more extensive investigation of the use of ensemble forecasts for dispersion model evaluation. *J. Appl. Meteor.* 40:425–445.

Straume, A. G., E. N. Koffi, and K. Nodop. 1998. Dispersion modeling using ensemble forecasts compared to ETEX measurements. *J. Appl. Meteor.* 37:1444–1456.

Sykes, R. I., W. S. Lewellen, and S. F. Parker. 1984. A turbulent-transport model for concentration fluctuations and fluxes. *J. Fluid Mech.* 139:193–218.

Sykes, R. I., W. S. Lewellen, S. F. Parker, and D. S. Henn. 1988. A hierarchy of dynamic plume models incorporating uncertainty. Vol. 4, Second-order Closure Integrated Puff. Electric Power Research Institute EPRI EA-6095, Project 1616–28, 99 pp. [Available from R. I. Sykes, ARAP/Titan, 50 Washington Rd., P.O. Box 2229, Princeton, NJ 08543-2229.]

Sykes, R. I., S. F. Parker, D. S. Henn, and W. S. Lewellen. 1993. Numerical simulation of ANATEX tracer data using a turbulent closure model for long-range dispersion. *J. Appl. Meteor.* 32:929–947.

Uliasz, M. 1993. The atmospheric mesoscale dispersion modeling system. *J. Appl. Meteor.* 32:139–149.

Walters, K. R., K. M. Traxler, M. T. Gilford, R. D. Arnold, R. C. Bonam, and K. R. Gibson. 1992. Gulf War weather. U.S. Air Force Environmental Technical Applications Center Tech. Note USAFETAC/TN-92/003, 243 pp. [Available from USAFETAC, Scott Air Force Base, IL 62225-5438.]

Warner, T. T., and R-S. Sheu. 2000. Multiscale local forcing of the Arabian Desert daytime boundary layer, and implications for the dispersion of surface-released contaminants. *J. Appl. Meteor.* 39:686–707.

Westphal, D. L., and Coauthors. 1999. Meteorological reanalyses for the study of Gulf War illnesses: Khamisiyah case study. *Wea. Forecasting* 14:215–241.

Yamada, T., S. Bunker, and M. Moss. 1992. Numerical simulations of atmospheric transport and diffusion over coastal complex terrain. *J. Appl. Meteor.* 31:565–578.

Zhang, D., and R. A. Anthes. 1982. A high-resolution model of the planetary boundary layer—sensitivity tests and comparisons with SESAME-79 data. *J. Appl. Meteor.* 21:1594–1609.

Experimental conditions for each of the ensemble-member simulations. See section 2 for explanation of the abbreviations.

Characteristics of the source in the SCIPUFF transport and dispersion calculation

^{+} The National Center for Atmospheric Research is sponsored principally by the National Science Foundation.