A Study of Tracer Distribution Parameter Estimation from Sparse Samples

J. Z. Holland, Department of Meteorology, University of Maryland, College Park, Maryland

Abstract

How do design parameters such as the spacing of sampling stations affect the quality of information obtained from atmospheric dispersion experiments? In large-scale experiments such as the Cross-Appalachian Tracer Experiment (CAPTEX) and the Across North America Tracer Experiment (ANATEX), the average crosswind spacing between surface sampling stations, Δy, may be of the same order as, or greater than, the tracer cloud width parameter σy. For such sparse samplings, the errors in estimating five parameters of the crosswind distribution of tracer dosage or concentration are investigated: the crosswind integrated dosage (M); the centroid coordinate; the lateral dispersion parameter (σy); the skewness (S); and the kurtosis (K). These errors are examined as functions of the ratio of σy to Δy, the nonuniformity of station spacing, and the location of the centroid relative to the nearest sampling station. Tracer experiments are simulated with the Gaussian distribution model, as well as with a non-Gaussian model consisting of the sum of two Gaussian distributions with different means and variances. It is found that increasingly sparse sampling produces an increasing negative bias in estimates of σy and a positive bias in K, as well as increasing root-mean-square errors of all the parameters. Root-mean-square errors also increase with increasing range of skewness and kurtosis of mixed populations of non-Gaussian distributions and with increasing range of random spacings of stations.
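As an illustration of the kind of simulated trial described above, the following minimal sketch samples a Gaussian crosswind profile at evenly spaced stations and estimates the five parameters from the samples. It is not the paper's procedure: the trapezoid-like station weights, the function names, and the chosen spacings are all assumptions made for this example, and the non-Gaussian (two-Gaussian) model is not included.

```python
import math

def gaussian_dosage(y, total=1.0, centroid=0.0, sigma=1.0):
    """Gaussian crosswind dosage profile whose integral over y equals `total`."""
    z = (y - centroid) / sigma
    return total / (sigma * math.sqrt(2.0 * math.pi)) * math.exp(-0.5 * z * z)

def sampled_profile(spacing, offset, half_width=8.0):
    """Stations at offset + k*spacing (hypothetical layout) and the dosage seen at each."""
    n = round(half_width / spacing)
    stations = [offset + k * spacing for k in range(-n, n + 1)]
    values = [gaussian_dosage(y) for y in stations]
    return stations, values

def sample_moments(stations, values):
    """Estimate (M, centroid, sigma_y, S, K) from station samples.

    Assumption: each station is weighted by half the distance between its
    two neighbours (trapezoid rule); the paper's weighting may differ.
    """
    n = len(stations)
    weights = [0.5 * (stations[min(i + 1, n - 1)] - stations[max(i - 1, 0)])
               for i in range(n)]
    m0 = sum(w * c for w, c in zip(weights, values))
    ybar = sum(w * c * y for w, c, y in zip(weights, values, stations)) / m0

    def central(order):
        # Normalized central moment of the sampled distribution.
        return sum(w * c * (y - ybar) ** order
                   for w, c, y in zip(weights, values, stations)) / m0

    var = central(2)
    if var <= 0.0:
        # Only one station saw tracer: width, skewness, kurtosis are undefined.
        return m0, ybar, 0.0, float("nan"), float("nan")
    sigma = math.sqrt(var)
    skew = central(3) / sigma ** 3
    kurt = central(4) / sigma ** 4   # equals 3 for an exact Gaussian
    return m0, ybar, sigma, skew, kurt

# True values are M = 1, centroid = 0, sigma_y = 1, S = 0, K = 3.
for spacing, offset in [(0.1, 0.0), (2.0, 0.0), (2.0, 1.0)]:
    est = sample_moments(*sampled_profile(spacing, offset))
    print(f"dy={spacing:4.1f} offset={offset:3.1f} ->",
          " ".join(f"{v:8.4f}" for v in est))
```

With a spacing of 0.1 σy the estimates are essentially exact, while at a spacing of 2 σy they depend on where the centroid falls relative to the nearest station, which is one of the dependencies the study examines.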

To examine the feasibility of inferring errors in the five distribution parameters from field experiments, large sets of simulated trials are performed in which selected “observables” as well as the distribution parameter errors are computed. Sets of trials differ with respect to the range of non-Gaussian distribution parameters, the range of random variation of station spacing, and the range of average sampling station density. For each set of trials, conditional means and variances of the errors in estimates of the five distribution parameters are computed as functions of simulated observable parameters representing apparent sampling grid density, displacement of the nearest station relative to the apparent centroid, and central grid interval anomaly relative to average spacing. These functions account for a large fraction of the error variance for σy, S, and K, but a relatively small fraction for M and the centroid coordinate.
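The conditional-statistics step can be sketched in the same spirit. The example below runs a set of simulated Gaussian trials, records one observable (the apparent grid density, taken here as the estimated σy divided by Δy) together with the error in σy, and then bins the errors by the observable. The single observable, the uniform station layout, the sampling ranges, and the bin edges are all assumptions chosen for illustration; the study uses three observables and also varies the station spacing randomly.

```python
import math
import random

def estimate_sigma(stations, values):
    """Second-moment width estimate with trapezoid-like weights (an assumption)."""
    n = len(stations)
    w = [0.5 * (stations[min(i + 1, n - 1)] - stations[max(i - 1, 0)])
         for i in range(n)]
    m0 = sum(wi * c for wi, c in zip(w, values))
    ybar = sum(wi * c * y for wi, c, y in zip(w, values, stations)) / m0
    var = sum(wi * c * (y - ybar) ** 2
              for wi, c, y in zip(w, values, stations)) / m0
    return math.sqrt(var)

def run_trials(n_trials=2000, seed=0):
    """Return (observable, error) pairs for Gaussian trials with true sigma_y = 1."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_trials):
        # True ratio sigma_y / dy drawn between 0.4 and 2 (hypothetical range).
        spacing = 1.0 / rng.uniform(0.4, 2.0)
        # Random position of the grid relative to the true centroid at y = 0.
        offset = rng.uniform(-0.5, 0.5) * spacing
        stations = [offset + k * spacing for k in range(-20, 21)]
        values = [math.exp(-0.5 * y * y) for y in stations]
        sigma_hat = estimate_sigma(stations, values)
        apparent_density = sigma_hat / spacing   # computable from field data alone
        pairs.append((apparent_density, sigma_hat - 1.0))
    return pairs

def conditional_stats(pairs, edges):
    """Count, mean, and variance of the error within each bin of the observable."""
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        errs = [e for x, e in pairs if lo <= x < hi]
        if errs:
            mean = sum(errs) / len(errs)
            var = sum((e - mean) ** 2 for e in errs) / len(errs)
        else:
            mean = var = float("nan")
        out.append((lo, hi, len(errs), mean, var))
    return out

for lo, hi, count, mean, var in conditional_stats(run_trials(), [0.0, 0.5, 1.0, 1.5, 3.0]):
    print(f"density in [{lo:.1f}, {hi:.1f}): n={count:5d} "
          f"mean error={mean:+.4f} variance={var:.5f}")
```

Binning is the simplest way to form a conditional mean; a regression or smoothing fit over the observables would serve the same purpose and may be closer to what a full analysis requires.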

Errors estimated from such sets of simulations, given observable predictors from eight CAPTEX cases with very sparse crosswind sampling in a narrow (100 km) crosswind strip, are compared with errors estimated from the observations by using denser samplings, obtained from a wider (300 km) annular strip, as “truth.” Errors in σy, S, and K, predicted by the conditional means from the simulated trials, correlated well with those estimated from the CAPTEX data, but errors in M and the centroid coordinate did not. The Gaussian and non-Gaussian distribution sets differ little in their ability to predict conditional bias errors, but the residual rms errors estimated from comparison with CAPTEX data are grossly underestimated by the Gaussian model and appear to be better predicted using appropriate non-Gaussian models.

The results suggest that traditional analysis of data from a sampling array such as that employed in CAPTEX may systematically underestimate the lateral dispersion by more than 30%, and overestimate the kurtosis by more than a factor of 3. The observed distributions appear to have systematically large kurtosis compared to the Gaussian.
