## 1. Introduction

Source characterization for airborne contaminants is an important capability in air-quality analysis and in homeland security. The purpose of source characterization is to ascertain the locations, times, and strengths of one or more emissions. Once a source is identified and characterized, releases from the source may be mitigated or, at the very least, modeled to predict their potential harm to society.

The tools currently available for characterizing contaminant sources include forward-predicting transport and dispersion models and backward-looking models. Output from forward dispersion models can be interpreted as an expectation for the subsequent concentration field. In contrast, backward models can be applied to identify the sources of measured contaminants.

The approach presented in this paper is to couple a forward dispersion model with a backward model using a genetic algorithm (GA). Previous studies referred to below have approached this problem using a genetic algorithm to optimize the calibration factor coupling the dispersion model with the backward model, where the optimal calibration factor apportions the correct strength of pollutant to each candidate source and thus represents the source characterization. The current work takes a more comprehensive approach than previous work, using the GA to optimize source timing and location.

Works such as Holland (1975) and Goldberg (1989) first introduced the GA and its broad range of applications. Genetic algorithms have been used to calculate calibration factors in a simpler context by Cartwright and Harris (1993), and Haupt (2005) demonstrated the effectiveness of this technique in a series of sensitivity studies. Haupt and Haupt (2004) discuss the many different types of GAs, of which the continuous-parameter GA is most appropriate for tuning the calibration factors of a coupled model. Haupt et al. (2006, hereinafter referred to as HYA) provide a summary of the continuous-parameter GA used in the current study. HYA validated the concept using the Gaussian plume dispersion equation with a backward model inspired by chemical mass balance (CMB), using synthetic receptor data produced by the Gaussian plume equation. The coupled model performed well, even with a moderate amount of noise introduced in the synthetic data.

The next step, covered in this paper, is to demonstrate the GA-coupled modeling approach in a real-data application. The goal is to use the model as a source characterization tool in the context of an operational dispersion model and with real data. Such a coupled model could be useful in source characterization for hazardous-release events for which both monitored contaminant and meteorological data are available.

The primary upgrade from the HYA coupled model is the replacement of the Gaussian plume equation with the much more sophisticated Second-Order Closure Integrated Puff (SCIPUFF) dispersion model (Sykes et al. 1998). The impact of upgrading the coupled model’s forward component is tested using synthetic data produced by SCIPUFF itself. This step mimics the synthetic data tests applied to the Gaussian plume equation by HYA. Subsequent tests, including validation with real data, can then be performed with knowledge that any issues encountered are not related to the incorporation of SCIPUFF into the coupled model. One advantage of using synthetic data is that noise can be added to the data to directly measure the impact of noise on the coupled model’s performance. Also, with synthetic data an exact solution is known.

Real-data runs are then conducted with the coupled model using neutrally buoyant tracer concentration data from the Dipole Pride 26 (DP26) field tests. These runs determine the model’s ability to characterize pollutant sources correctly despite the stochastic scatter of realizations around the forecast ensemble mean.

This work demonstrates the applicability of the coupled model approach and the ability of the genetic algorithm to compute the optimal calibration factors linking the dispersion model with the receptor data. With real data there is never a perfect match, because of the differences between stochastic realizations and ensemble-mean predictions as well as imperfections in models, boundary conditions, and initial conditions. Dispersion-model output therefore must be calibrated to match monitored data, and the GA-coupled model provides one such approach.

## 2. Validation method

The backward model is expressed as the matrix equation

𝗖**S** = 𝗥, (1)

where 𝗖 is the matrix of expected pollutant concentrations contributed by each source *n* at the receptor for each observation period *m*, as computed by the dispersion model using assumed source emission rates; 𝗥 is the column (1D) matrix representing the measured concentration of pollutant at the receptor, with one value for each time period *m*; and **S** is a vector of the unknown calibration factors linking the two. Each element *C*_{mn} denotes a run of the forward dispersion model at time *m* from source *n*. This equation can be applied to continuous releases or time-dependent releases, because multiple time periods (denoted by *m*) are contained in the equation. The equation also accounts for multiple sources (denoted by *n*). Furthermore, the equation can be expanded by an extra dimension to account for any other varying parameter, as in section 4 for multiple receptors.

The GA is used to compute the **S** vector that provides the best fit to the data according to the backward model (HYA). This matrix problem is often poorly conditioned, and therefore simple inversion techniques are typically not able to solve the problem. Haupt (2005), however, demonstrates the GA’s ability to find the solution.
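A hedged sketch of a continuous-parameter GA for this fitting problem follows. The rms residual cost, the blend crossover, and all parameter defaults are illustrative choices, not the exact settings of the GA described in HYA:

```python
import numpy as np

def ga_calibrate(C, R, pop_size=8, mutation_rate=0.15, bounds=(0.0, 5.0),
                 iterations=2000, seed=0):
    """Minimal continuous-parameter GA sketch: evolve candidate S vectors to
    minimize the rms residual of C S - R (an illustrative cost, not the
    paper's exact normalized form)."""
    rng = np.random.default_rng(seed)
    n = C.shape[1]
    lo, hi = bounds
    pop = rng.uniform(lo, hi, size=(pop_size, n))

    def cost_of(p):
        return np.sqrt(np.mean((p @ C.T - R) ** 2, axis=1))

    for _ in range(iterations):
        pop = pop[np.argsort(cost_of(pop))]   # rank by cost, best first
        half = pop_size // 2
        parents = pop[:half]                  # selection: keep the better half
        pairs = rng.integers(0, half, size=(half, 2))
        beta = rng.uniform(size=(half, n))    # per-gene blend crossover
        children = (beta * parents[pairs[:, 0]]
                    + (1.0 - beta) * parents[pairs[:, 1]])
        pop = np.vstack([parents, children])
        mutate = rng.uniform(size=pop.shape) < mutation_rate
        mutate[0, :] = False                  # elitism: never mutate the best
        pop[mutate] = rng.uniform(lo, hi, size=mutate.sum())

    costs = cost_of(pop)
    best = int(np.argmin(costs))
    return pop[best], float(costs[best])
```

Because the cost is convex in **S**, blend crossover produces children no worse than their worse parent, while mutation supplies the exploration needed to escape a collapsed population.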

As the forward component of the coupled model, SCIPUFF’s role is to compute the contributions from each source filling the elements of matrix 𝗖 in (1). SCIPUFF is an ensemble mean dispersion model designed to compute the time-dependent field of expected concentrations resulting from one or more sources. The model solves the transport equations using a second-order closure scheme and treats releases as a collection of Gaussian puffs (Sykes et al. 1986; Sykes and Gabruk 1997). SCIPUFF can be used to predict expected concentrations of emitted gases, particulates, or hazardous releases. One specific application is as an aid to health agencies in predicting dispersion of hazardous pollutants (Cox et al. 1998). SCIPUFF is an appropriate choice for our coupled model because of its ability to compute expected concentrations over predefined time periods for any number of sources. Thus, its output can be easily integrated into the backward model defined above.

In the current application, SCIPUFF is run once for each source, with each run’s output corresponding to a particular element in the 𝗖 matrix. These forward-model computations need to be performed only once, prior to the initialization of the GA, and therefore the use of a sophisticated dispersion model for this component does not impose a computational cost proportional to the number of generations used in the GA’s refinement of the calibration factors. During the validation phase of this study, SCIPUFF was also used to create synthetic data [column matrix 𝗥 in (1)]. Sets of synthetic data corresponding to particular source configurations were created using a synthetic two-dimensional wind field. The output fed to the coupled model is the column matrix of instantaneous contaminant concentrations at a previously defined receptor located 5 m above the surface; each observation supplies one element of 𝗥.

One thing for which (1) does not account is background concentrations of pollutants. A primary objective of the coupled model is to provide source characterization for homeland security applications—to be specific, the release of toxic substances for which there is unlikely to be a background concentration. The GA method could be modified to account for a background concentration by including it as a parameter that is added to all of the concentrations in 𝗖 and is optimized by the GA along with the source characterization.

The coupled model is currently designed for point sources, because that is most appropriate for hazardous releases. The coupled model could be applied to area sources in addition to point sources by adjusting the parameters within the dispersion model. Use with area sources has not been tested, but we could characterize any type of source that the dispersion model allows.

## 3. Synthetic source validation

Validation is conducted here using synthetic receptor data derived from three different source configurations designed to test the GA’s robustness in the presence of sources of various strengths and distances from the receptor. Each configuration includes 32 independent meteorological periods (here, hours) and 16 sources. The matrix problem requires the number of independent sampling periods to be at least the number of sources; otherwise there are infinitely many solutions.

The validation runs also employ synthetic meteorological data. The wind rotates clockwise by 22.5° every hour, resulting in 16 unique wind directions that repeat twice. The wind speeds are 5 m s^{−1} for the first 16 h and 10 m s^{−1} for the second 16 h. The meteorological data assume Pasquill stability class D, corresponding to neutral stratification (Pasquill 1961).
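The synthetic meteorology described above can be sketched as follows; the starting direction (0°) and the degrees-clockwise convention are assumptions for illustration:

```python
import numpy as np

hours = np.arange(32)
wind_dir = (22.5 * hours) % 360.0            # rotates clockwise 22.5 deg each hour
wind_spd = np.where(hours < 16, 5.0, 10.0)   # m/s: 5 for hours 0-15, then 10

print(len(np.unique(wind_dir)))  # 16 unique directions, each occurring twice
```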

The validation is performed using a 100-run Monte Carlo simulation for each set of synthetic data. Because the GA initializes solutions using random numbers, the Monte Carlo approach allows us to compute error bars and confidence intervals to quantify the uncertainty associated with the GA’s solutions. Each run compares the best-fit **S** found by the GA after 20 000 iterations with the known solution; 20 000 iterations typically yield a minimum cost-function value near 10^{−2} (HYA), indicative of an accurate solution.
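The Monte Carlo bookkeeping can be sketched as below. The `solutions` array here is fabricated stand-in data, not output from this study; it only shows how the mean, interrun standard deviation, and an empirical 90% interval would be computed from 100 GA runs for one source:

```python
import numpy as np

rng = np.random.default_rng(1)
solutions = 1.0 + 0.0127 * rng.standard_normal(100)  # stand-in: 100 GA runs, one source

mc_mean = solutions.mean()
mc_std = solutions.std(ddof=1)                  # interrun standard deviation
lo90, hi90 = np.percentile(solutions, [5, 95])  # empirical 90% interval
```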

### a. Results without noise

The first configuration is a circular source array with a radius of 5 km and sources of equal strengths [i.e., **S** = (1, 1, . . . , 1)^{T}]. The mean solution averaged across all sources and all 100 Monte Carlo runs was 1.0004, with an interrun standard deviation (averaged across the 16 sources) of 0.0127. Ninety percent of the solutions fell within 0.02 of the actual solution of 1.

The second configuration uses the same circular source array but varies the source strength [i.e., **S** = (0, 1, 2, 3, . . . , 0, 1, 2, 3)^{T}]. The purpose of this configuration is to test the GA’s ability to diagnose varying source strengths. Table 1 shows the means and standard deviations for 4 of the 16 sources, each corresponding to a different **S** value (0, 1, 2, or 3). With the exception of source 1, all standard deviations are of the same order of magnitude as when **S** was 1 for every source. The standard deviation for source 1 is higher because the GA imposes a lower bound of 0 on the solutions.

The third configuration arranges the sources in a spiral formation, with **S** = (0, 1, 2, 3, . . . , 0, 1, 2, 3)^{T}. The source-to-receptor distances vary linearly from 2.5 to 17.5 km, to confirm that the coupled model’s performance in the previous configurations is not a function of each source’s distance from the receptor. Table 2 shows the means and standard deviations for 4 of the 16 sources in the same format as is used in Table 1. Again, the GA does an exceptional job of closing in on the solution, not just in the mean of the Monte Carlo simulation, but in individual runs as well, as demonstrated by the small values of the standard deviations.

Results across all configurations were exceptional, especially considering that each GA run was stopped after 20 000 iterations, which is a small number given the low population size. The results are comparable to those found in HYA for the Gaussian plume model.

### b. Results with noise

Real data, unlike synthetic data, are often subject to monitoring errors. As discussed in HYA, these errors may arise from the stochastic variation of turbulent flow but also may arise from uncertainties in weather data, source strength characterization, or the monitoring of receptor concentrations. Therefore, we contaminate our data with noise to simulate the variability and errors present in monitored concentration data.

To test the impact of such errors, additional Monte Carlo simulations were run for each of the three source configurations discussed above. For each configuration, 12 sets of Monte Carlo simulations were made, each with a different signal-to-noise ratio. The values used spanned the range from infinity (no noise) to 0.01 (noise > signal by two orders of magnitude). Additive and multiplicative noise were both tested. Results are only shown for the spiral configuration with **S** = (0, 1, 2, 3, . . . , 0, 1, 2, 3)^{T}, because results for other configurations are similar, and the spiral configuration provides the sternest test.
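A sketch of how such noise might be imposed at a prescribed signal-to-noise ratio follows; the amplitude convention (noise rms equal to signal rms divided by the SNR) and the clipping at zero are assumptions, and the paper's exact noise definitions may differ:

```python
import numpy as np

def contaminate(R, snr, kind="additive", seed=0):
    """Sketch: contaminate receptor data R with noise at signal-to-noise
    ratio `snr`, either additively or multiplicatively."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(R.shape)
    if kind == "additive":
        scale = np.sqrt(np.mean(R ** 2)) / snr   # noise rms = signal rms / snr
        out = R + scale * noise
    else:  # multiplicative
        out = R * (1.0 + noise / snr)
    return np.maximum(out, 0.0)  # concentrations cannot be negative
```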

Results with noise from the spiral configuration are shown in Figs. 1 and 2, which depict the effects of additive and multiplicative noise, respectively. The figures show the GA’s computed calibration factor for 4 of the 16 sources as a function of the logarithm of the signal-to-noise ratio, with error bars signifying plus/minus one standard deviation (dashed), and the 90% confidence interval (dotted). These graphs demonstrate that the GA is capable of producing the solution when noise is as large as the signal for single runs or when noise is 2 times the signal for the average of the 100-run Monte Carlo set. Other differences between additive and multiplicative noise are discussed in HYA.

## 4. Field-data test method

To this point, we have used synthetic receptor data produced by models that predict ensemble means of concentrations. Now that SCIPUFF’s incorporation into the coupled model is validated, we can apply the model to a case involving real data to see how well it performs for individual realizations with respect to characterization of source location, emission time, and strength. The dataset used here is from the DP26 field experiments.

The DP26 field experiments were carried out in November of 1996 at the Nevada Test Site (Biltoft 1998). The tests released sulfur hexafluoride at locations surrounding the receptors. The releases were instantaneous rather than continuous. There were 17 different field tests carried out during the DP26 experiments. Our study used 14 of these tests; three were discarded because of missing data. Figure 3 shows the test domain and orientation of sources and receptors. Labels N2, N3, S2, and S3 are the source locations, and the thick black lines show the approximate receptor locations. Further details on these field experiments can be found in Biltoft (1998) and Watson et al. (1998).

Chang et al. (2003) used the DP26 data to validate various dispersion models, including SCIPUFF. They found that about 50%–60% of SCIPUFF-predicted concentrations came within a factor of 2 of the observations. Most large errors occurred when the modeled puff missed the receptors altogether as a result of errors in the wind field. Still, SCIPUFF performed as well as any other dispersion model they validated.

As part of the GA-coupled model, we are also using SCIPUFF to model the DP26 experiments. Therefore, we should expect the same discrepancies between expected concentrations and monitored values as were found by Chang et al. (2003). Errors in the coupled model’s results can perhaps be attributed to these discrepancies, which result in part from the difference between stochastic realization measurements and ensemble mean predictions. Chang et al. (2003) attributed the differences mainly to the wind field interpolated by SCIPUFF. The GA-coupled model is designed to help to account for these issues by calibrating the dispersion-model output to match the receptor data. Also, a future coupled model could tune the wind field to correct for the occasions on which the modeled puff misses the receptors.

For the multiple-receptor application, the matrices 𝗖 and 𝗥 in (1) are expanded by an extra dimension, where *r* indexes the receptors. As before, *m* indexes the observation periods and *n* indexes the sources. The **S** vector is not expanded in this extra dimension, given that the goal of the GA is to find a single **S** that provides the best fit across all receptors. If the model matched the data perfectly, a single **S** vector would provide an exact fit for all receptors. The basic cost function remains the same, but (2) is modified to produce a new equation, (3), that sums the squared differences over all receptors as well as over all observation periods.

Several initial runs were made with the coupled model using the DP26 data. The goal of these runs is to characterize the emission locations and times (strength characterization is the focus of subsequent sections). These runs used the four emission locations (N2, N3, S2, S3) at two times each, for a total of eight candidate sources. Because each field test emitted from no more than two sources, **S** should equal 1 at the emitting sources and 0 for all nonemitters, if all else is perfect. In other words, the coupled model should detect which source was the actual emitter for each field test by comparing receptor data with SCIPUFF-predicted contributions from each potential source; **S** values are not restricted to 0 or 1 but rather span a continuous range, initially 0–5.

The total expected concentration contribution from each source *n* is summed across all time periods and receptors. These totals are normalized by the maximum contribution from any source *n* to produce a scale factor ranging from 0 to 1. Multiplying the candidate solutions by this scale factor forces the downwind-source **S**_{n} values to 0. This method does not assume any prior knowledge regarding which sources are potential emitters but does provide objective estimates of each source’s potential contribution to the domain. For the Dipole Pride dataset, half of the candidate sources are downwind of the receptor domain in most runs, and therefore the scale factor eliminates this half. The GA then uses the cost function to zero out the remaining nonemitters on the upwind half of the domain.

The range of values allowed by the GA should be increased to counter the inclusion of the scale factor, which may act to narrow the range for potentially correct sources. The range should not be made too large, however, because the run-to-run variability in solutions is proportional to this range.
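The scale-factor construction described above amounts to the following sketch, using the multiple-receptor array shape for 𝗖; the implementation details are assumptions:

```python
import numpy as np

def scale_factors(C):
    """Sketch of the scale factor: each source's modeled contribution summed
    over receptors and periods, normalized by the largest such total.
    C has shape (r, m, n), as in the multiple-receptor formulation."""
    totals = C.sum(axis=(0, 1))   # total contribution of each source n
    return totals / totals.max()  # in [0, 1]; sources that never reach a receptor -> 0
```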

The cost function is also modified to consider logarithms of the squared differences, causing the best-fit **S** vector to depend less on magnitudes and more on whether a source is emitting at all. With a logarithmic cost function, overestimates of expected concentration, as are often found with this dataset, do not so greatly degrade the model’s performance. For example, if the receptor data value is 200 parts per trillion (ppt) but the model’s predicted concentration is 2000 ppt, a logarithmic cost function identifies the value of 2000 ppt as closer to optimal than a value of 0 ppt. This outcome is desired because the main issue in source identification is not strictly the magnitude but rather the nonzero nature of each source’s contribution.

The use of logarithms may be necessary only because of the dispersion model used. Concentrations calculated by Gaussian and puff models such as SCIPUFF may differ from realization measurements by up to an order of magnitude, in particular with the Dipole Pride dataset (Chang et al. 2003). Thus, the logarithmic scheme presented in (6) is necessary for our application. Results in section 5b show that the logarithmic cost function (6) gives better model performance than does the linear cost function (4).
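The magnitude-insensitivity argument can be checked with a small sketch; the `floor` value used to avoid log(0) is an assumption, not a parameter from this study:

```python
import numpy as np

def log_cost(pred, obs, floor=1.0):
    """Sketch of a logarithmic cost: differences of log10 concentrations
    de-emphasize magnitude errors relative to a squared-difference cost."""
    p = np.log10(np.maximum(pred, floor))
    o = np.log10(np.maximum(obs, floor))
    return np.sqrt(np.mean((p - o) ** 2))

# An overprediction of 2000 ppt against a 200-ppt observation costs less
# than predicting 0 ppt, matching the behavior described above.
print(log_cost(2000.0, 200.0) < log_cost(0.0, 200.0))  # True
```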

## 5. Performance optimization

The goal is to produce an automated process that best characterizes the location, time, and strength of the pollutant source(s). Now we seek to optimize the performance of the coupled model with the DP26 dataset by performing various tests, each designed to determine the impact of different parameters. Although the optimization is specific to DP26, many of the results can be applied to the coupled model in general for other datasets.

### a. GA versus random search

The first test determines whether solving the matrix problem requires the GA at all. The GA’s performance is compared with that of a random-search method, which produces random **S** values and evaluates them with the same cost function. Figure 4 shows the minimum cost for one of the DP26 tests, as found by the GA (dashed) and the random search (solid), averaged over five runs, each with 20 000 iterations. Although the “number of iterations” applies specifically to the GA, the corresponding computing time for the random-search method is normalized to the equivalent number of GA iterations so that the graph provides an equal comparison.

It is clear that the random search took much longer to find a solution with a low cost-function value. In fact, out to 20 000 iterations, the random search never caught up to the GA. The GA converged to the optimal solution quickly (in about 7000 iterations), whereas the random search remained several thousand iterations away from reaching the optimal solution. This result shows that a random search is inefficient and that more-sophisticated optimization methods such as a GA are required. HYA and Haupt (2005) tested other methods such as matrix inversion, but because the matrix problem is often poorly conditioned these matrix-inversion techniques did not perform well.
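The random-search baseline amounts to the following sketch; the rms residual cost is an illustrative stand-in for the paper's cost function:

```python
import numpy as np

def random_search(C, R, evals=20_000, bounds=(0.0, 5.0), seed=0):
    """Sketch of the random-search baseline: draw S uniformly at random
    and keep the lowest-cost candidate seen so far."""
    rng = np.random.default_rng(seed)
    best, best_cost = None, np.inf
    for _ in range(evals):
        S = rng.uniform(*bounds, size=C.shape[1])
        cost = np.sqrt(np.mean((C @ S - R) ** 2))
        if cost < best_cost:
            best, best_cost = S, cost
    return best, best_cost
```

Because each draw is independent, the expected improvement per evaluation decays rapidly, consistent with the slow convergence of the solid curve in Fig. 4.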

### b. Cost functions

As mentioned earlier, the cost function was modified to consider logarithms of squared differences, resulting in enhanced performance. We now extend this study with four different cost functions. Each of these cost functions is either logarithmic or nonlogarithmic and uses one of two normalization schemes. Details on these cost-function schemes not covered previously can be found in appendix A.

The coupled model is run with data from each DP26 field test for each cost function. Because the run-to-run variability in solutions was very small, long Monte Carlo simulations are not necessary. This also applies to the sensitivity studies that follow.

Table 3 shows the success scores for each cost function. The success scores are described in appendix B; higher success scores are better. Although the choice of cost function did not affect the ability of the GA to identify the correct release time, the logarithmic, postnormalization cost function (6) used up to this point scored highest in identifying the release location.

The logarithmic, postnormalization cost function performed best, but computing the logarithms in the cost function adds 25% to the run time. The improved performance in correct source identification is, however, worth the extra computational cost, because the logarithmic cost function correctly identified three more sources than did the nonlogarithmic cost function.

### c. Population sizes and mutation rates

Haupt (2005) performed a sensitivity study on GA population sizes and mutation rates using synthetic data and found that two combinations of sizes and rates were optimal: high population sizes coupled with low mutation rates and low population sizes coupled with high mutation rates. A similar sensitivity study is made using the DP26 dataset using 5 of the 14 field tests. The goal is to find an optimal combination of population size and mutation rate in terms of a minimum number of cost-function evaluations required for convergence to an optimal solution (defined as being within a specific tolerance of the “exact” solution) and therefore minimum computing time.

Figure 5 shows the number of cost-function calls for 80 combinations of population sizes and mutation rates, averaged across five runs for each field test. The optimal mutation rate was 0.15, and the optimal population size was in the range of 4–12. Previous DP26 runs used 0.20 and 8, respectively, and therefore we were not far off the optimal case already. The values used do not affect the ability of the GA to find the optimal solution; they only affect the number of iterations needed to obtain it. Therefore, there is no need to rerun previous tests with the optimal mutation rate.

### d. Output intervals

The DP26 dataset provides receptor data every 15 min. SCIPUFF can also output values every 15 min; however, the DP26 receptor data are not instantaneous concentrations but rather are time-integrated averages. Shortening the output interval in SCIPUFF and then averaging back up to 15 min may improve the GA’s performance by producing a better match to the 15-min averages.

Four output intervals are tested: 15, 5, 3, and 1.5 min. SCIPUFF-produced concentrations are averaged into 15-min time periods for inclusion into (3). The coupled model is only run for the field tests for which the output interval chosen affected performance, based on the findings of initial runs.
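The averaging step can be sketched as follows; the helper name and array layout are illustrative, not from the study's code:

```python
import numpy as np

def average_to_15min(conc, dt_min):
    """Sketch: average fine-interval model output back to 15-min periods.
    Assumes dt_min divides 15 min evenly (true for 15, 5, 3, and 1.5)."""
    per = int(round(15.0 / dt_min))      # samples per 15-min period
    conc = np.asarray(conc, dtype=float)
    return conc.reshape(-1, per).mean(axis=1)

print(average_to_15min([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 3))  # [2. 7.]
```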

Table 4 shows the success scores (appendix B) for this test. Source identification skill is not affected by the output interval, but the 3-min interval resulted in the best performance for time and strength identification. Surprisingly, still smaller output intervals did not perform better: shortening the interval below 3 min did not appreciably improve accuracy.

### e. Receptor sets

The DP26 dataset includes data for 90 different receptors. So far the model has used all 90 at the expense of computing time. The cost-function evaluations take less time when only using one receptor, decreasing computing time by a factor of 14. Because of this large difference, it is useful to investigate whether using a subset of the receptors results in similar (or even better) performance.

Six different receptor subsets are tested: all 90, the 200 line (see Fig. 3), the middle receptor (referred to as No. 215), the highest-magnitude receptor (in terms of measured concentration), the highest-magnitude receptor on the 200 line (Fig. 3), and the highest-magnitude range of five adjacent receptors. The latter three are unique for each field test and were determined by objectively analyzing the data.

Table 5 shows the success scores (appendix B). The best source and time identifications are found when including all 90 receptors. However, the field-test-specific highest-magnitude receptor sets result in the best strength identification. A possible reason is that many receptors recorded concentrations near 0 for the majority of sampling periods during the field tests, and including these receptors in the analysis lowers the GA-calculated strength.

Using a subset of the receptors within the centerline of the plume gives a more exact strength characterization, but it is not always best for source identification. The modeled plume may be slightly off-center of the actual plume and thus outside this subset of receptors. Subsequent analyses use all 90 receptors so as to optimize source and time identification.

### f. Specific source and time characterization

So far, the coupled model’s source array has only included sources N2, N3, S2, and S3 (Fig. 3) at each of two times within each field test. We now use a source array that does not assume a particular release location or time by using a grid of locations and multiple times. This study is split into three parts.

The first part uses a 0.06° × 0.06° grid of points with 0.02° resolution as the source array. The grid is located on the side of the domain where the release was known to originate. Each source specifies the previously known correct release time, so that the coupled model is only characterizing the location within the grid. Because there are 16 sources and 14 sampling periods, this configuration appears to violate the requirement of the matrix problem that the number of sources not exceed the number of sampling periods; however, if all 90 receptors are used, there are effectively 14 × 90 pieces of information. The coupled model was able to pinpoint the correct emission grid location for 6 of 14 field tests, and an additional 3 field tests came within one grid point.

The second part of the study considers 16 different emission times separated by 10 min each. Each source specifies the previously known correct location, and therefore this portion of the study only characterizes the release time. The release time was pinpointed exactly for 9 of 19 emissions throughout the 14 field tests, and for 14 of 19 emissions the GA-produced emission time was within 30 min of the actual time. Some of the field tests had major issues related to missing or faulty data (both receptor and surface meteorological data); model runs that involve these field tests have performed poorly in every study thus far but are still included in the overall analysis.

The first two parts of this study each assume some previously known information, either the location or the time of emission. The final portion of the study did not assume either but rather attempted to pinpoint the location and time simultaneously, using four locations on the corners of a 0.04° × 0.04° grid and four times separated by 40 min each.

The problem is ill posed; therefore, the coupled model is unable to diagnose upwind location and release time simultaneously. Figure 6 provides an illustration. Consider a receptor at the location marked by the X and two sources A and B, both of which are upwind of the receptor. Suppose source A emits at time 0, and source B emits at time 3. Because both puffs would reach the receptor X at time 6, no model could apportion the correct amount of pollutant to each source because the puffs have merged. Source A may be the correct emitter in this theoretical example, but because there are many possible solutions and our model is robust enough to find one of them, the GA attributes some random combination of pollutant to both A and B that satisfies the conditions.
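The arithmetic of this ambiguity can be made concrete with illustrative numbers (not taken from DP26), assuming a steady wind blowing from the sources toward the receptor:

```python
# A source 30 km upwind releasing at t = 0 and a source 15 km upwind
# releasing 3000 s later are indistinguishable at the receptor.
u = 5.0                              # wind speed, m/s (illustrative)
arrival_A = 0.0 + 30_000.0 / u       # source A: 30 km upwind, released at t = 0 s
arrival_B = 3_000.0 + 15_000.0 / u   # source B: 15 km upwind, released at t = 3000 s
print(arrival_A == arrival_B)        # True: both puffs arrive at t = 6000 s
```

Any travel-time measurement at the receptor is therefore consistent with a one-parameter family of (location, time) pairs, which is the degeneracy the GA cannot resolve.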

Using multiple receptors in the crosswind direction does not necessarily solve this problem, because there can be large overlap of puffs from two or more sources that lie nearly along the wind direction. Off-centerline receptors do not solve the problem because they would sample the fringes of both puffs (Fig. 6). Even though the puffs from sources A and B are of different sizes, the model still disperses some of each source’s puff to an off-center receptor, and so the problem remains. One possible solution is to place additional receptors between the candidate sources; the correlation problem exists in Dipole Pride because all sources are located outside the receptor domain.

This example illustrates the impact of correlation between multiple parameters on the ability of a GA, or any other optimization algorithm, to tune them correctly. In basic terms, parameters that affect different aspects of the plume are easy to optimize, whereas those that affect primarily the same aspects, such as upwind location and release time, are difficult to optimize. This same effect has been seen in many optimization problems using various optimization methods (Robins and Thomas 2005; Chow et al. 2006). As shown in HYA, our GA is more robust than simple matrix inversion for solving such ill-conditioned problems. In general, one would expect iterative improvement methods such as a GA to outperform single-pass methods such as matrix inversion because they damp rather than amplify round-off error (HYA).

### g. Source-strength characterization

In an effort to improve source-strength characterization, the next set of coupled-model runs takes the source locations and release times found in the previous section as given and tunes only the source strength. The best-fit source locations and times were used instead of the actual documented emissions so as to minimize the impact of SCIPUFF’s transport and dispersion errors on the result. The coupled model has thus far underestimated the strengths when incorporating all 90 receptors, but performance should improve when the coupled model is forced to attribute all of the pollution to only one or two sources. Different receptor sets are retested to see whether the results agree with the previous analysis (section 5e).

Unlike the case for a larger source array, using all 90 receptors with just the correct sources produces a better strength characterization than the other receptor sets tested (highest single receptor, and highest adjacent range of 5). In cases in which SCIPUFF’s centerline does not agree with the data, and only the receptors within the actual puff are used in the smaller receptor sets, SCIPUFF’s predicted concentrations at the included receptors are low and the GA compensates by greatly overestimating the source strength. This does not happen in larger receptor sets because all receptors are included in the analysis, including those that the modeled puff envelops. Receptors that the modeled puff passes through, but the actual puff does not, therefore act to decrease the strength characterization to a more realistic value.

Using all receptors, source strengths came within an order of magnitude of the actual strength reported by Biltoft (1998) for only 9 of 19 emission scenarios. Although the coupled model performs well with regard to location and emission time, its strength specification has thus far been less precise. This is merely a reflection of the disparity between ensemble averages and realizations, where an ensemble model necessarily gives a broader, weaker, and smoother plume than would an individual turbulent realization.

### h. Multiple-emission field tests

Another study analyzes the coupled model’s performance for the seven field tests with multiple emissions. For these field tests, the coupled model is able to characterize both releases correctly only 43% of the time (three of seven). Additional analysis is done for the other seven field tests with only one emission, for which the coupled model produces a high-confidence one-emission solution (i.e., the solution did not indicate there were multiple emissions) for five of seven field tests.

Because there were difficulties in the data or execution of many of the multiple-release field tests, it is not surprising that less than one-half of the two-emission tests result in a correct characterization of each. Some of the field tests included a second emission only because the first emission missed the receptors (Biltoft 1998). Thus, many of the multiple-emission field tests include at least one faulty emission, making this an ill-posed problem. Furthermore, even when both emissions reached the receptors, one release often gave higher concentrations than the other even when the releases were of equal magnitude, further complicating the strength analysis. This does not mean the coupled model is unable to account for multiple emissions; validation with synthetic data (section 3) shows that it can.

## 6. Multistage process for source characterization

Because different candidate source arrays work better for characterizing source location, time, or strength, a combination of methods offers possible improvement in at least some of the parameters. The goal is to combine the best features of the runs described above to produce a multistage process involving multiple model runs to determine progressively the correct source, time, and strength characterization. Each stage in the process uses a different candidate source array. This candidate source array is coarse at first but is refined in each stage of the multistage process until the correct source(s) are found. The multistage process, outlined in Fig. 7, is detailed in the following paragraphs.

The first stage starts with a coarse grid designed to estimate the number of emission sources and time of emission. We start with 32 sources—eight locations at four unique emission times—that encompass all possible source locations and emission times within each field test. The locations are shown in Fig. 8. The coupled model calculates how much pollutant should be attributed to each location–time combination. The contributions from all locations are summed for each time period to give a total contribution for each time period; if a time period’s sum is at least 1/3 of the maximum sum over all time periods (1/3 was found to be a good benchmark in previous analyses), that time is included in the next stage. This stage allows for a maximum of four unique emissions (one at each of the four times).
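The stage-1 screening can be sketched as follows. This is a minimal illustration of the summing-and-thresholding logic only; the apportioned strengths and time labels below are made up, whereas in the actual process they come from the GA run.

```python
# Stage 1 screening sketch: sum the apportioned strengths over the eight
# coarse locations for each of the four candidate emission times, then
# keep any time whose total is at least 1/3 of the largest total.
# (Illustrative values, not field-test data.)
apportioned = {   # time period -> strengths at the 8 coarse locations
    "t1": [0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.1, 0.0],
    "t2": [0.2, 1.1, 0.3, 0.1, 0.0, 0.2, 0.1, 0.0],
    "t3": [0.1, 0.9, 0.2, 0.1, 0.1, 0.1, 0.0, 0.0],
    "t4": [0.0, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.1],
}

totals = {t: sum(s) for t, s in apportioned.items()}
threshold = max(totals.values()) / 3.0
kept_times = [t for t, total in totals.items() if total >= threshold]
print(kept_times)  # times passed on to stage 2
```

With these illustrative numbers, two of the four candidate times survive the 1/3 cut, so stage 2 would run once for each of them.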

The second stage performs a separate coupled model run for each time output from the first iteration. The goal is to find each time period’s most probable location using a 0.06° × 0.06° grid with 0.02° resolution (as done in section 5). The location returning the maximum **S** value, indicating the most probable location for each emission time, is then carried to the next stage.
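The stage-2 grid construction can be sketched as below; the center coordinates are hypothetical placeholders, not locations from the field tests.

```python
import numpy as np

# Stage 2 sketch: build a 0.06 deg x 0.06 deg candidate-location grid at
# 0.02 deg resolution around a stage-1 location (center is hypothetical).
center_lat, center_lon = 36.80, -116.00
offsets = np.arange(-0.03, 0.03 + 1e-9, 0.02)   # -0.03, -0.01, 0.01, 0.03
candidates = [(center_lat + dy, center_lon + dx)
              for dy in offsets for dx in offsets]
print(len(candidates))  # 16 candidate locations per emission time
```

Each of these candidate locations becomes one entry in the source array for that time period's coupled model run, and the location with the maximum tuned strength is carried forward.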

The third stage runs the coupled model once with the locations found in the second stage in the source array, using release times separated by 10 min to refine the release time characterization. The goal of this stage is to pinpoint the most probable emission time for each release by comparing the **S** values.

The final stage performs one more model run with the source–time combinations found in the third stage with the goal of strength specification. After the GA determines the best-fit **S** vector, we filter out any source whose strength was found to be less than 1/3 of the maximum found for any source. If any sources are eliminated, this step is repeated without those sources to allow for more exact strength calculation for the remaining sources. The final result is a list of emission locations, times, and strengths.
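The final-stage filtering loop can be sketched as follows. The `refit` callback here is a stand-in for a GA strength-tuning run on the surviving sources, and all numbers are illustrative.

```python
import numpy as np

def prune_sources(strengths, refit, frac=1.0 / 3.0):
    """Sketch of the final-stage filtering: drop any source whose tuned
    strength falls below `frac` of the maximum, then rerun the strength
    fit on the survivors, repeating until no source is removed.
    `refit(indices)` stands in for a GA run on the given sources."""
    active = list(range(len(strengths)))
    s = np.asarray(strengths, dtype=float)
    while True:
        keep = [i for i, v in zip(active, s) if v >= frac * s.max()]
        if len(keep) == len(active):
            return keep, s
        active = keep
        s = np.asarray(refit(active), dtype=float)  # retune survivors

# Illustrative use: three candidates; the weak third source is pruned and
# the remaining two are retuned once.
first_pass = [2.4, 1.9, 0.3]
keep, final = prune_sources(first_pass,
                            refit=lambda idx: [2.5, 2.0][:len(idx)])
print(keep)  # indices of the sources surviving the 1/3 cut
```

The repeat-until-stable structure mirrors the description above: removing a spurious source frees the GA to apportion the pollutant more exactly among the sources that remain.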

This four-stage process was run once on each of the 14 field tests to mimic a real-world attempt to specify the sources, times, strengths, and number of emissions. Source characterization for one of the field tests was perfect in finding the correct location, time, and number of emissions, and the strength characterization for this field test was too low by a factor of 2. Locations determined for other field tests were typically close, and the release times were generally within 40 min of the actual time. The majority of strength characterizations were still underestimated, but they were closer than in runs from previous sections. Because the final stage considers only the best-fit sources found by the previous stages, those sources received higher strength specifications.

The model determined the correct number of sources in 6 of 14 field tests while overestimating the number of sources on 5 of the other 8 field tests. The remaining three field tests each involved receptor data in which one puff likely missed the receptor domain, and thus they resulted in underestimations of the number of sources.

To improve the performance of the multistage process, the factor of 1/3 used to filter out nonemitting sources could be optimized further; however, the specific value is subjective and may not apply to other datasets. We considered whether the last stage might benefit from removing the logarithm from the cost function (6) but found that this change did not improve strength characterization.

This process could surely be fine-tuned further but, as used here, shows the real-world applicability of this coupled model toward source characterization. The process can be applied to other datasets, although specifics such as the 1/3 filtering threshold would need to be reoptimized and likely changed. When the coupled model is applied to other datasets, it is expected that prior information regarding some aspects of the sources would improve the ability of the coupled model to characterize their remaining aspects. Examples of useful prior information include a range of possible sources, times, and strengths, as used in this study. It is not currently known exactly how much uncertainty the coupled model can cope with in determining the correct source characterization. The results from section 3 and HYA, however, suggest that if noise is on the same order of magnitude as the signal then correct characterization is still possible.

## 7. Conclusions and recommendations

A GA-coupled model incorporating a sophisticated dispersion model, SCIPUFF, characterizes source strength and location using synthetic data and is validated successfully. Results using noise-free synthetic data are exceptional and are still very good with noise as long as the magnitude of the noise does not exceed that of the signal. The coupled model is limited, however, by two factors: the inability of ensemble dispersion models to predict individual realization measurements, which depend strongly on the details of the turbulent flow, and the loss of turbulence information associated with a backward approach. In contrast, when validating with synthetic data, the receptor data *are* the ensemble mean expectation, because SCIPUFF was used to produce them. Another source of error is that hourly meteorological data may not be sufficient to model pollutant dispersion to the resolution desired. Notwithstanding these limitations, the GA-coupled model has performed to a high degree of accuracy.

Results from the validation of SCIPUFF’s incorporation into the coupled model are strikingly similar to previous results using a much simpler dispersion model. Comparison of Fig. 1 from this paper with Fig. 11 from HYA shows that the graphs are almost indistinguishable. The graphs from other source configurations (not shown) are also nearly identical. This result suggests that the performance of the GA in obtaining the solution to the matrix problem is not dependent on the dispersion model used and that models of increasing complexity can be used in the coupled model without hindering performance. This validation approach should work with any dispersion model that produces consistent results, because the dispersion model itself is also used to produce the synthetic receptor data. Computing time depends more on the GA than on the dispersion model, because the dispersion model is run only once for each source, further supporting the ability of the coupled model to use a dispersion model of any level of complexity. Of course, this result does not imply that incorporating SCIPUFF into the coupled model fails to improve performance in a real-data application; it implies only that doing so neither increases the programming complexity nor degrades the performance of the GA.

Validation of the model using data from the Dipole Pride 26 field experiments demonstrates the model’s potential applications. The coupled model performs well in identifying the correct emission locations and times to as good a degree of accuracy as can be expected given the disparity between the ensemble nature of dispersion modeling and real data resulting from a single realization of turbulent flows. The model struggles somewhat when working with multiple-emission field tests, but this can be attributed mostly to difficulties with the data and execution of these multiple-emission tests and not to a shortcoming in the coupled model. Validation with synthetic data shows that the coupled model can diagnose multiple emissions.

Multiple GA runs can be used with different candidate source arrays to produce the best source/time/strength characterization. The multistage process described in section 6 can in theory be tuned to apply to other datasets. The specifics of the multistage process are project dependent, but the multistage process as a whole should apply to any dataset or situation, with additional fine-tuning. If general information regarding the nature of the sources is known (e.g., an idea of when, where, how many, and how much), the coupled model is more robust in finding a solution. This robustness arises partly because the coupled model cannot, on its own, properly differentiate between groups of parameters with highly correlated effects on receptor values, such as upwind source location and release time.

One advantage of the Dipole Pride dataset is the availability of meteorological surface data every 15 min at several points within the domain. Surface data of this resolution in time and space are seldom available in other applications. A lack of sufficient meteorological data is likely to hinder the coupled model’s performance in source characterization. It is recommended that the coupled model be used to help to locate the emission source(s) given sufficient meteorological data and at least a rough initial estimate of the location and time of emission.

Work with the coupled model is ongoing, including plans to use the GA concept to find errors in meteorological data. Errors of measurement and prediction are present not only in model output and monitored receptor data but also in meteorological data. A future paper discusses the application of a GA to tune the surface wind direction along with the source characterization to account for these multiple sources of error. Preliminary work indicates that, as long as the errors are not highly correlated and a sufficient amount of receptor data is available, the GA method is capable of addressing uncertainty in both the meteorological data and the monitored receptor data.

## Acknowledgments

This work was supported by internal research and development funds from the Pennsylvania State University Applied Research Laboratory. We thank Joseph Chang for providing us the Dipole Pride 26 data and Ian Sykes for helpful information on SCIPUFF. The GA used here was coauthored by Randy L. Haupt.

## REFERENCES

Biltoft, C. A., 1998: Dipole Pride 26: Phase II of Defense Special Weapons Agency transport and dispersion model validation. Dugway Proving Ground Doc. DPG-FR-98-001, prepared for Defense Threat Reduction Agency by Meteorology and Obscurants Divisions, 76 pp.

Cartwright, H. M., and S. P. Harris, 1993: Analysis of the distribution of airborne pollution using GAs. *Atmos. Environ.*, **27A**, 1783–1791.

Chang, J. C., P. Franzese, K. Chayantrakom, and S. R. Hanna, 2003: Evaluations of CALPUFF, HPAC, and VLSTRACK with two mesoscale field datasets. *J. Appl. Meteor.*, **42**, 453–466.

Chow, F. K., B. Kosovic, and S. T. Chan, 2006: Source inversion for contaminant plume dispersion in urban environments using building-resolving simulations. Preprints, *Sixth Symp. on the Urban Environment*, Atlanta, GA, Amer. Meteor. Soc., 1–9.

Cox, R. M., J. Sontowski, R. N. Fry, C. M. Dougherty, and T. J. Smith, 1998: Wind and diffusion modeling for complex terrain. *J. Appl. Meteor.*, **37**, 996–1009.

Goldberg, D. E., 1989: *Genetic Algorithms in Search, Optimization, and Machine Learning*. Addison-Wesley, 412 pp.

Haupt, R. L., and S. E. Haupt, 2004: *Practical Genetic Algorithms*. 2d ed. with CD. John Wiley and Sons, 255 pp.

Haupt, S. E., 2005: A demonstration of coupled receptor/dispersion modeling with a genetic algorithm. *Atmos. Environ.*, **39**, 7181–7189.

Haupt, S. E., G. S. Young, and C. T. Allen, 2006: Validation of a receptor/dispersion model coupled with a genetic algorithm using synthetic data. *J. Appl. Meteor.*, **45**, 476–490.

Holland, J. H., 1975: *Adaptation in Natural and Artificial Systems*. The University of Michigan Press, 211 pp.

Pasquill, F., 1961: The estimation of the dispersion of windborne material. *Meteor. Mag.*, **90**, 33–49.

Robins, P., and P. Thomas, 2005: Non-linear Bayesian CBRN source term estimation. *Eighth Int. Conf. on Information Fusion*, Philadelphia, PA, Institute of Electrical and Electronics Engineers.

Sykes, R. I., and R. S. Gabruk, 1997: A second-order closure model for the effect of averaging time on turbulent plume dispersion. *J. Appl. Meteor.*, **36**, 1038–1045.

Sykes, R. I., W. S. Lewellen, and S. F. Parker, 1986: A Gaussian plume model of atmospheric dispersion based on second-order closure. *J. Climate Appl. Meteor.*, **25**, 322–331.

Sykes, R. I., S. F. Parker, D. S. Henn, C. P. Cerasoli, and L. P. Santos, 1998: PC-SCIPUFF version 1.2PD technical documentation. Titan Research and Technology Division ARAP Rep. 718, 172 pp.

Watson, T. B., R. E. Keislar, B. Reese, D. H. George, and C. A. Biltoft, 1998: The Defense Special Weapons Agency Dipole Pride 26 field experiment. NOAA Air Resources Laboratory Tech. Memo. ERL ARL-225, 90 pp.

## APPENDIX A

### Cost Functions

HYA showed that the “L2” norm produced the most efficient cost function. Therefore, all cost functions considered here involve some form of a squared difference. The two normalization schemes discussed here affect model performance, but the specific values used are arbitrary, because the GA mating mechanism is based on ranking and not on absolute difference.
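A generic cost function consistent with this description can be sketched as follows. This is an illustrative form only, not the paper's exact Eq. (6) or its normalization schemes: a squared difference of predicted versus observed receptor concentrations, with an optional logarithmic transform of the concentrations.

```python
import numpy as np

def cost(predicted, observed, use_log=True, eps=1e-12):
    """Generic L2-type cost sketch: squared difference of (optionally
    log-transformed) predicted vs. observed receptor concentrations.
    Illustrative form only; `eps` guards against log(0)."""
    p = np.asarray(predicted, dtype=float)
    o = np.asarray(observed, dtype=float)
    if use_log:
        p, o = np.log(p + eps), np.log(o + eps)
    return float(np.sum((p - o) ** 2))

obs = np.array([1.0, 0.5, 0.1])
print(cost(np.array([1.1, 0.4, 0.2]), obs, use_log=False))
print(cost(np.array([1.1, 0.4, 0.2]), obs, use_log=True))
```

Because the GA mating mechanism ranks candidate solutions rather than comparing absolute cost values, any monotonic rescaling of such a cost leaves the GA's behavior unchanged, which is the point made above about the normalization constants being arbitrary.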

## APPENDIX B

### Success Scores

Here *S*_{loc}(*f*) is the location score, *S*_{time}(*f*) is the emission time score, *S*_{str}(*f*) is the strength score, *f* denotes one particular field test, and *F* is the total number of field tests included in the computation. The scores are added up across all field tests to produce total success scores for source location, emission time, and source strength. The sum of these three scores is the overall performance for source characterization.

The higher the success score is, the better is the performance. When 14 field tests are included in the analysis, as in Tables 3 and 5, the best possible score across all field tests is 56 + 56 + 56 = 168 and the worst possible score is 14 + 14 + 14 = 42. In Table 4, only 7 field tests are included in the analysis, and therefore the best possible total score is 28 + 28 + 28 = 84 and the worst possible total score is 7 + 7 + 7 = 21.

Success scores are based on which source (i.e., location–emission time combination) is given the highest *S _{n}*. For example, if the source with the highest strength is at the correct location, and the source strength is much higher than for other locations (indicating high confidence in the solution), *S*_{loc}(*f*) is 4. If the source is at the incorrect location, *S*_{loc}(*f*) is 1. If the source is at the correct location, but there is less confidence in the solution (i.e., the strength is only somewhat higher than for other locations), *S*_{loc}(*f*) is either 2 or 3, depending on how confident one is in the solution. The *S*_{time}(*f*) scores are given in a similar fashion to those of *S*_{loc}(*f*). For *S*_{str}(*f*), the score depends on how close the highest source strength for any source is to the actual strength, independent of whether the source with the highest strength is at the correct location or emission time. For field tests with multiple emissions, each emission is given a separate score for *S*_{loc}(*f*), *S*_{time}(*f*), and *S*_{str}(*f*). These scores are then averaged between the two emissions to give a final *S*_{loc}(*f*), *S*_{time}(*f*), and *S*_{str}(*f*) on a scale of 1–4 that is included in the total sum.

Table 1. Means and standard deviations for four sources in the circular configuration setup across 100 Monte Carlo runs.

Table 2. Means and standard deviations for four sources in the spiral configuration setup across 100 Monte Carlo runs.

Table 3. Success scores given to four cost functions in identifying (ID) the correct emission source, time, and strength, and the sum of those three, across the 14 Dipole Pride 26 field tests. A higher score indicates more success. The cost functions include either a logarithmic squared difference or just a squared difference, and either of two normalization schemes.

Table 4. Success scores given to four time steps in identifying the correct emission source, time, and strength, and the sum of those three, across 7 of the 14 Dipole Pride 26 field tests. A higher score indicates more success. Receptor data are given every 15 min; therefore smaller time steps averaged several expected concentrations within each larger 15-min time step.

Table 5. Success scores given to six receptor sets in identifying the correct emission source, time, strength, and the sum of those three, across the 14 Dipole Pride 26 field tests. A higher score indicates more success. The receptor sets include “All 90” (all receptors included), “200 line” (Fig. 3), “No. 215” (the middle receptor in the 200 line), “Highest single” (highest-magnitude receptor with respect to its values), “Highest in 200 line” (same as highest single, but restricted to the 200 row in Fig. 3), and “Highest range of 5” (highest-magnitude range of five adjacent receptors’ values).