## Abstract

Based on Bayesian statistical decision theory, a probabilistic quality control (QC) technique is developed to identify and flag migrating-bird-contaminated sweeps of level II velocity scans at the lowest elevation angle using the QC parameters presented in Part I. The QC technique can use either each single QC parameter or all three in combination. The single-parameter QC technique is shown to be useful for evaluating the effectiveness of each QC parameter, based either on the smallness of the tested percentages of wrong decision obtained with the ground truth information (if available) or on the smallness of the estimated probabilities of wrong decision (if there is no ground truth information). The multiparameter QC technique is demonstrated to be much better than any of the three single-parameter QC techniques, as indicated by the very small value of the tested percentages of wrong decision for no-flag decisions (not contaminated by migrating birds). Since the averages of the estimated probabilities of wrong decision are quite close to the tested percentages of wrong decision, they can provide useful information about the probability of wrong decision when the multiparameter QC technique is used for real applications (with no ground truth information).

## 1. Introduction

The recent success of the Collaborative Radar Acquisition Field Test (CRAFT) project (Droegemeier et al. 2002) leveraged the Internet-II high-speed communication capability and demonstrated the feasibility of real-time access to the level II data from any of the 120 National Weather Service (NWS) Weather Surveillance Radar-1988 Doppler (WSR-88D) radars and any of the 26 Department of Defense and 12 Federal Aviation Administration radars. This brings tremendous new opportunities as well as challenges for operational applications of level II Doppler radar data, especially in data assimilation. Currently, the National Centers for Environmental Prediction (NCEP) is preparing to assimilate WSR-88D level II wind data from the 120 NWS WSR-88D radars into their operational numerical weather prediction (NWP) systems. The Navy’s Coupled Ocean/Atmosphere Mesoscale Prediction System (COAMPS; Hodur 1997) is also preparing to assimilate real-time level II data from either a single radar or multiple radars (Xu et al. 2004; Zhao et al. 2004). For these NWP data assimilation applications, as reviewed and explained in the introduction of Part I (Zhang et al. 2005), the existing radar data quality control (QC) techniques are insufficient and mostly inadequate. It is thus necessary to leverage the existing radar data QC techniques and develop new techniques for level II data QC. A very important task of this undertaking is to develop a QC technique to flag migrating-bird-contaminated level II wind fields. To this end, as reported in Part I, a comprehensive dataset was prepared together with ground truth information [for the 2003 spring migrating season (15 April–15 June)]. Based on the radar ornithological studies of Gauthreaux et al. (1998) and Gauthreaux and Belser (1998), the main features of bird echoes were extracted from the prepared dataset and quantified by three QC parameters.
The histograms presented in Part I suggest that the three QC parameters can be used to statistically discriminate bird-contaminated sweeps from noncontaminated sweeps of velocity scans. How to best utilize these QC parameters to accomplish the task, not yet studied and demonstrated, is the topic of this paper.

In this paper, a QC technique is developed based on the Bayesian decision theory (Duda et al. 2001) by using the QC parameters presented in Part I. To provide independent tests, the dataset prepared in Part I is partitioned into two. The first set, called dataset I, covers the period from 1 May to 15 June, and it contains 4442 sweeps of nighttime scans at the lowest elevation angle (0.5°). The second set, called dataset II, covers the period from 15 to 30 April, and it contains 1340 sweeps of nighttime scans at the lowest elevation angle. Dataset I will be used to produce the prior probability of migrating-bird contamination and the likelihood for each QC parameter. By using the Bayes theorem, the posterior probability of migrating-bird contamination can be calculated based on the QC parameter values computed for each sweep selected from dataset II. The posterior probability can then be used to decide whether the selected sweep is contaminated by migrating birds. The probability of wrong decision can also be estimated. The decision can be tested against the ground truth information in dataset II. The estimated probabilities of wrong decision can then be verified statistically. The detailed QC technique is presented in the next section. The independent test results are reported in section 3. Conclusions are given in section 4.

## 2. Method

The three QC parameters selected in Part I are the mean reflectivity (MRF), the velocity data coverage (VDC), and the averaged percentage of along-beam perturbation velocity sign changes (VSC). As explained in Part I, MRF quantifies the reflectivity enhanced by migrating birds, VDC quantifies the enhanced velocity coverage due to the presence of migrating birds, and VSC quantifies the grainy textures in the velocity images caused by migrating birds. The precise formulations for calculating these QC parameters are given in (1)–(3) of Part I. The QC technique that uses these QC parameters is introduced step by step in the following three subsections.

### a. Prior probability and likelihood

Denote by *H*_{0} (or *H*_{1}) the hypothetical situation in which a concerned sweep of velocity scan is noncontaminated (or contaminated) by migrating birds. The prior probability of *H*_{0} being true is estimated by *P*(*H*_{0}) = *N*_{0}/(*N*_{0} + *N*_{1}) and the prior probability of *H*_{1} being true is estimated by *P*(*H*_{1}) = *N*_{1}/(*N*_{0} + *N*_{1}), where *N*_{0} is the number of noncontaminated sweeps, *N*_{1} is the number of contaminated sweeps, and *N*_{0} + *N*_{1} is the total number of nighttime sweeps of velocity scans at the lowest elevation angle (0.5°) in dataset I (for the period 1 May–15 June 2003). Clearly, *H*_{0} and *H*_{1} are mutually exclusive and *P*(*H*_{0}) + *P*(*H*_{1}) = 1. The prior probabilities estimated from dataset I are *P*(*H*_{0}) = 0.573 and *P*(*H*_{1}) = 1 − *P*(*H*_{0}) = 0.427.

The three QC parameters are denoted by *x*_{i} (*i* = 1, 2, 3), where *x*_{1} = MRF, *x*_{2} = VDC, and *x*_{3} = VSC. Denote by *p*(*x*_{i}|*H*_{0}) the likelihood of *x*_{i} (i.e., the empirical probability density estimated from the data) conditioned by *H*_{0} being true, and by *p*(*x*_{i}|*H*_{1}) the likelihood of *x*_{i} conditioned by *H*_{1} being true. When only a single one of the three random variables is considered, the notation *x*_{i} can be simplified into *x* by dropping the subscript index *i*, as long as the meaning is clearly understood. In this case, the maximum and minimum of *x* can be denoted by *x*_{max} and *x*_{min}, respectively. The entire range of *x* can be divided into *M* intervals, and the *m*th interval is between *x*_{m−1} and *x*_{m}, where *x*_{m} = *x*_{min} + *m*Δ*x* (*m* = 0, 1, 2, . . . , *M*) and Δ*x* = (*x*_{max} − *x*_{min})/*M*. The probability of *x* within the *m*th interval conditioned by *H*_{0} being true can then be estimated by

$$\int_m p(x|H_0)\,dx = \frac{n_{0m}}{N_0}, \tag{1}$$

where ∫_{m} (•) *dx* denotes the integral over the *m*th interval of *x*, *n*_{0m} is the number of noncontaminated sweeps for which *x* is within the *m*th interval, and *N*_{0} is the total number of noncontaminated sweeps in dataset I. Similarly, the probability of *x* within the *m*th interval conditioned by *H*_{1} being true can then be estimated by

$$\int_m p(x|H_1)\,dx = \frac{n_{1m}}{N_1}, \tag{2}$$

where ∫_{m} (•) *dx* is as in (1), *n*_{1m} is the number of contaminated sweeps for which *x* is within the *m*th interval, and *N*_{1} is the total number of contaminated sweeps in dataset I.
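The counting estimates described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `estimate_priors_and_bins`, its arguments, and the use of NumPy are assumptions introduced here, and the ground truth is assumed to be available as one boolean flag per sweep.

```python
import numpy as np

def estimate_priors_and_bins(x, contaminated, n_bins):
    """Estimate P(H0), P(H1) and the per-interval conditional probabilities.

    x            : one QC parameter value per sweep (hypothetical input)
    contaminated : True where the ground truth marks bird contamination
    n_bins       : M, the number of equal-width intervals
    """
    x = np.asarray(x, dtype=float)
    contaminated = np.asarray(contaminated, dtype=bool)

    n1 = int(contaminated.sum())       # N1: contaminated sweeps
    n0 = int((~contaminated).sum())    # N0: noncontaminated sweeps
    p_h0 = n0 / (n0 + n1)              # prior P(H0)
    p_h1 = n1 / (n0 + n1)              # prior P(H1)

    # Interval edges x_m = x_min + m * dx for m = 0, 1, ..., M
    edges = np.linspace(x.min(), x.max(), n_bins + 1)

    # n_0m and n_1m: per-interval counts for each class
    n0m, _ = np.histogram(x[~contaminated], bins=edges)
    n1m, _ = np.histogram(x[contaminated], bins=edges)

    # Ratios n_0m / N0 and n_1m / N1 (interval probabilities)
    return p_h0, p_h1, edges, n0m / n0, n1m / n1
```

Each returned ratio array sums to one because every sweep falls in exactly one interval between the minimum and maximum of *x*.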

The continuous form of *p*(*x*|*H*_{0}) is expressed by a truncated expansion of Legendre polynomials; that is,

$$p(x|H_0) = \sum\nolimits_k c_k g_k(x), \tag{3}$$

where *c*_{k} is the *k*th expansion coefficient, *g*_{k}(*x*) is the *k*th-order Legendre polynomial, the summation Σ_{k} is over *k* (= 1, 2, . . . , *N*_{k}), and *N*_{k} is the order of the truncation. Substituting (3) into (1) for each interval of *x*, we obtain *M* algebraic equations for the expansion coefficients *c*_{k} (*k* = 1, 2, . . . , *N*_{k}). By limiting the truncation to *N*_{k} + 1 < *M*, the expansion coefficients *c*_{k} can be solved by minimizing the residuals of these equations in the sense of least squares under the constraint of ∫*p*(*x*|*H*_{0})*dx* = 1, where the integral is over the full range of *x*. Substituting the obtained coefficients *c*_{k} back to (3) gives the continuous form of *p*(*x*|*H*_{0}). Similarly, the continuous form of *p*(*x*|*H*_{1}) is obtained by substituting (3) into (2) and solving for *c*_{k} from the resulting algebraic equations under the constraint of ∫*p*(*x*|*H*_{1})*dx* = 1.
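One way to carry out such a constrained least squares fit is sketched below. Several choices here are assumptions rather than details given in the text: the range of *x* is mapped linearly onto [−1, 1], the expansion starts from the zeroth-order polynomial, the equality constraint is imposed with a Lagrange multiplier, and the function name `fit_legendre_density` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as L

def fit_legendre_density(edges, bin_prob, degree):
    """Fit a truncated Legendre expansion to interval probabilities.

    edges    : the M + 1 interval edges x_0, ..., x_M
    bin_prob : the M interval probabilities (e.g., n_0m / N0)
    degree   : highest polynomial order kept (assumed to start at order 0)
    """
    edges = np.asarray(edges, dtype=float)
    bin_prob = np.asarray(bin_prob, dtype=float)
    n_coef = degree + 1
    if n_coef + 1 >= bin_prob.size:
        raise ValueError("need fewer coefficients than intervals")

    x_min, x_max = edges[0], edges[-1]
    half_range = 0.5 * (x_max - x_min)
    t = (edges - x_min) / half_range - 1.0   # map edges onto [-1, 1]

    # A[m, k] = integral of the k-th Legendre polynomial over interval m
    A = np.empty((bin_prob.size, n_coef))
    for k in range(n_coef):
        basis = np.zeros(n_coef)
        basis[k] = 1.0
        antiderivative = L.legint(basis)
        A[:, k] = half_range * np.diff(L.legval(t, antiderivative))

    # Constraint: the density integrates to one over the full range
    w = A.sum(axis=0)

    # Solve the equality-constrained least squares problem (KKT system)
    kkt = np.zeros((n_coef + 1, n_coef + 1))
    kkt[:n_coef, :n_coef] = 2.0 * A.T @ A
    kkt[:n_coef, n_coef] = w
    kkt[n_coef, :n_coef] = w
    rhs = np.concatenate([2.0 * A.T @ bin_prob, [1.0]])
    coef = np.linalg.solve(kkt, rhs)[:n_coef]

    def density(x):
        tx = (np.asarray(x, dtype=float) - x_min) / half_range - 1.0
        return L.legval(tx, coef)

    return coef, density
```

The sketch does not force the fitted curve to stay nonnegative, so a high truncation order fitted to sparse counts may dip below zero near the ends of the range.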

The estimated continuous forms of *p*(*x*|*H*_{0}) and *p*(*x*|*H*_{1}) are plotted for each of the three QC parameters *x* = *x*_{i} (*i* = 1, 2, 3) in Figs. 1a–c. As shown in Fig. 1a for the first QC parameter (*x*_{1} = MRF), *p*(*x*_{1}|*H*_{0}) is unimodal (with a single peak) but *p*(*x*_{1}|*H*_{1}) is bimodal (with two peaks). The second peak of *p*(*x*_{1}|*H*_{1}) at MRF = 13 dB*Z* is very small and quite far away from the main peak at MRF = 8 dB*Z*, while the latter is quite far away from the main peak of *p*(*x*_{1}|*H*_{0}) at MRF = 4 dB*Z*. For the second QC parameter (*x*_{2} = VDC), *p*(*x*_{2}|*H*_{0}) and *p*(*x*_{2}|*H*_{1}) are both unimodal and peaked, far away from each other, at VDC = 43% and 65%, respectively. Thus, VDC should be a better QC parameter than MRF. For the third QC parameter (*x*_{3} = VSC), *p*(*x*_{3}|*H*_{0}) and *p*(*x*_{3}|*H*_{1}) are both unimodal and peaked at VSC = 38% and 41%, respectively. The peak of *p*(*x*_{3}|*H*_{1}) is sharp, so VSC can be a good QC parameter. Each of the above-estimated likelihood functions fits closely to its counterpart histogram (not shown) estimated from dataset I, while each histogram estimated here is very close to that estimated from the total dataset in Part I. Thus, the above-described features for each estimated likelihood function were also seen from the corresponding histogram in Part I. This implies that the statistical characteristics of each QC parameter are stable and persistent over the period of the 2003 spring migrating season (15 April–15 June).

### b. Single-parameter QC

This section describes how a single QC parameter, say, the *i*th parameter, is used to identify whether or not a sweep is contaminated by migrating birds within a specified range of probability. This QC technique is built on the Bayes decision theory (Duda et al. 2001). First, by using the *i*th formula in (1)–(3) of Part I, the value of *x*_{i} is calculated for the concerned sweep of velocity scan (in dataset II). Conditioned by this given value of *x*_{i}, the posterior probability of *H*_{0} being true (noncontaminated situation) is then computed by using the Bayes formula,

$$P(H_0|x_i) = \frac{p(x_i|H_0)\,P(H_0)}{p(x_i)}, \tag{4}$$

where *p*(*x*_{i}) = *p*(*x*_{i}|*H*_{0})*P*(*H*_{0}) + *p*(*x*_{i}|*H*_{1})*P*(*H*_{1}). Similarly, the posterior conditional probability of *H*_{1} being true (contaminated situation) is given by

$$P(H_1|x_i) = \frac{p(x_i|H_1)\,P(H_1)}{p(x_i)}. \tag{5}$$

It is easy to see from (4) and (5) that *P*(*H*_{0}|*x*_{i}) + *P*(*H*_{1}|*x*_{i}) = 1. Here, the probabilities *P*(*H*_{0}) and *P*(*H*_{1}) as well as the likelihood functions *p*(*x*_{i}|*H*_{0}) and *p*(*x*_{i}|*H*_{1}) are considered to be known (a priori estimated from dataset I as shown in the previous subsection). The posterior probabilities in (4) and (5) can be readily computed and used to identify the likely situation (*H*_{0} or *H*_{1}) of the concerned sweep in dataset II (independent of dataset I). In particular, the true situation is more likely to be *H*_{0} than *H*_{1} if *P*(*H*_{0}|*x*_{i}) > *P*(*H*_{1}|*x*_{i}) or, equivalently, *P*(*H*_{0}|*x*_{i}) > 0.5 or *P*(*H*_{1}|*x*_{i}) < 0.5. By the same token, the true situation is more likely to be *H*_{1} than *H*_{0} if *P*(*H*_{0}|*x*_{i}) < *P*(*H*_{1}|*x*_{i}) or, equivalently, *P*(*H*_{0}|*x*_{i}) < 0.5 or *P*(*H*_{1}|*x*_{i}) > 0.5.

When a decision is made by using the above rule based on the derived conditional probability regarding the situation (*H*_{0} or *H*_{1} being true) of the concerned sweep, the probability of wrong decision can also be estimated. Denote by *D*_{0} (or *D*_{1}) the decision that a concerned sweep is noncontaminated (or contaminated) by migrating birds. Denote by *P*(*D*_{0} = *H*_{0}|*x*_{i}) the probability of *D*_{0} being correct conditioned by a given value of *x*_{i}, and by *P*(*H*_{0}|*x*_{i}, *D*_{0}) the probability of *H*_{0} being correct conditioned by given *x*_{i} and *D*_{0}. According to the above rule, we have *P*(*D*_{0} = *H*_{0}|*x*_{i}) = *P*(*H*_{0}|*x*_{i}, *D*_{0}). Similarly, the probability of *D*_{1} being correct (given *x*_{i}) is *P*(*D*_{1} = *H*_{1}|*x*_{i}) = *P*(*H*_{1}|*x*_{i}, *D*_{1}). On the other hand, the probability of *D*_{0} being wrong (given *x*_{i}) can be denoted by *P*(*D*_{0} ≠ *H*_{0}|*x*_{i}) and, clearly, *P*(*D*_{0} ≠ *H*_{0}|*x*_{i}) = 1 − *P*(*H*_{0}|*x*_{i}, *D*_{0}) = *P*(*H*_{1}|*x*_{i}, *D*_{0}). Similarly, the probability of *D*_{1} being wrong (given *x*_{i}) is *P*(*D*_{1} ≠ *H*_{1}|*x*_{i}) = *P*(*H*_{0}|*x*_{i}, *D*_{1}). Thus, it is easy to see that the probability of wrong decision is always smaller than 0.5; that is,

$$\begin{aligned} P(D_0 \ne H_0|x_i) &= P(H_1|x_i, D_0) < 0.5, \\ P(D_1 \ne H_1|x_i) &= P(H_0|x_i, D_1) < 0.5. \end{aligned} \tag{6}$$

The two formulas in (6) will be used to estimate the probability of wrong decision when the single-parameter QC technique is applied to dataset II in section 3. Since the decision made on each sweep can be tested against the ground truth information in dataset II, the estimated probabilities of wrong decision can also be tested statistically. The single-parameter QC technique will be used to evaluate the effectiveness of each QC parameter in section 3.
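The single-parameter rule can be illustrated with a short Python sketch. The function name `single_parameter_decision` and its interface are hypothetical. The default priors are the values estimated from dataset I in section 2a, and the two likelihood values are assumed to have been evaluated beforehand at the sweep's *x*_{i}. Assigning an exact tie to *D*_{0} is a choice made here, not one stated in the text.

```python
def single_parameter_decision(lik_h0, lik_h1, p_h0=0.573, p_h1=0.427):
    """Apply the Bayes rule to one sweep and one QC parameter.

    lik_h0, lik_h1 : p(x_i|H0) and p(x_i|H1) evaluated at the sweep's x_i
    p_h0, p_h1     : prior probabilities P(H0) and P(H1)

    Returns the decision label ("D0" or "D1") and the estimated
    probability that this decision is wrong.
    """
    evidence = lik_h0 * p_h0 + lik_h1 * p_h1   # p(x_i)
    if evidence <= 0.0:
        raise ValueError("both likelihoods vanish at this x_i")

    post_h0 = lik_h0 * p_h0 / evidence          # P(H0|x_i)
    post_h1 = lik_h1 * p_h1 / evidence          # P(H1|x_i)

    if post_h1 > post_h0:
        return "D1", post_h0    # wrong if the sweep is actually clean
    return "D0", post_h1        # wrong if the sweep is actually contaminated


# Example: equal likelihoods leave only the priors to decide.
decision, p_wrong = single_parameter_decision(1.0, 1.0)
```

Because the decision always follows the larger posterior, the returned probability of wrong decision never exceeds 0.5.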

### c. Multiparameter QC

The above QC technique can be extended and improved if all three QC parameters are used in combination. If the three QC parameters are not independent from each other, then it is necessary to consider their joint likelihood (empirical probability density function) in the three-dimensional space of **x** = (*x*_{1}, *x*_{2}, *x*_{3}). In this case, the procedure remains essentially the same as described in section 2b except that the single scalar random variable *x*_{i} should be replaced by the vector random variable **x**. Ideally, this approach should yield much improved QC capability. The current dataset I, however, is not sufficiently large to reliably support this approach. As shown by a scatterplot of QC parameter data points (Fig. 2), the parameter data points (for contaminated sweeps) are quite sparse in subspace (*x*_{1}, *x*_{2}), even though nearly 50% of the total points are within the interval of 40% < *x*_{3} < 42% over the peak of *p*(*x*_{3}|*H*_{1}).

Because the parameter data points are sparse and irregularly scattered in the three-dimensional space of **x**, the joint likelihood cannot be reliably estimated from the current dataset I. Their marginal likelihood functions, however, can be and have been well estimated in each one-dimensional subspace of **x** (see section 2a). These marginal likelihood functions can be used jointly to improve the QC technique. For NWP data assimilation applications, it is desirable to eliminate as many unqualified observations as possible without throwing away too many qualified observations. Based on this principle, a concerned sweep of velocity scan is considered to be contaminated by migrating birds if *P*(*H*_{1}|*x*_{i}) > *P*(*H*_{0}|*x*_{i}) for any one of the three *x*_{i} (*i* = 1, 2, 3). In this case, the probability of wrong decision can be estimated as follows:

where med and min denote the median and minimum, respectively, among the three (*i* = 1, 2, 3). The usefulness of (7) is examined in the next section, where the multiparameter QC technique is tested with dataset II.
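The flagging rule itself (contaminated if any one of the three parameters says so) is simple enough to sketch in Python. The function name `multiparameter_decision` is hypothetical, and the sketch relies on the equivalence from section 2b that *P*(*H*_{1}|*x*_{i}) > *P*(*H*_{0}|*x*_{i}) exactly when *P*(*H*_{1}|*x*_{i}) > 0.5. It covers only the decision, not the wrong-decision estimate in (7).

```python
def multiparameter_decision(post_h1):
    """Combine the three single-parameter posteriors P(H1|x_i), i = 1, 2, 3.

    The sweep is flagged (decision "D1") as soon as any one parameter
    favors contamination; otherwise it is passed (decision "D0").
    """
    return "D1" if any(p > 0.5 for p in post_h1) else "D0"


# Example: only the third parameter favors contamination, yet the
# sweep is still flagged.
flag = multiparameter_decision([0.10, 0.20, 0.60])
```

A sweep is passed only when all three posteriors stay at or below 0.5, which is what makes this rule stricter than any single-parameter rule.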

## 3. Test experiments with dataset II

### a. Posterior probabilities computed from dataset I

In this section, single-parameter and multiparameter QC techniques are applied to the 1340 nighttime sweeps at the lowest elevation angle (0.5°) in dataset II and the results are tested against the ground truth information in dataset II. The prior probabilities have been estimated from dataset I, and their values are *P*(*H*_{0}) = 0.573 and *P*(*H*_{1}) = 1 − *P*(*H*_{0}) = 0.427 (see section 2a). The likelihood functions have also been estimated from the 4442 nighttime sweeps at 0.5° in dataset I with the ground truth information, and the results are presented in Fig. 1. Using these priors and likelihood functions, the posterior probabilities, *P*(*H*_{0}|*x*_{i}) and *P*(*H*_{1}|*x*_{i}), can be computed from (4) and (5), respectively. The computed posterior probabilities are plotted as functions of each *x*_{i} in Fig. 3. Note that *P*(*H*_{0}|*x*_{i}) + *P*(*H*_{1}|*x*_{i}) = 1 (see section 2b), so the variation of *P*(*H*_{0}|*x*_{i}) is exactly opposite to that of *P*(*H*_{1}|*x*_{i}), and the two probability curves always intersect at the probability value of 0.5 in each panel of Fig. 3. Thus, we only need to examine *P*(*H*_{1}|*x*_{i}), that is, the posterior probability of bird contamination conditioned by a given value of each *x*_{i}.

As shown in Fig. 3a, *P*(*H*_{1}|*x*_{1}) is zero when *x*_{1} = MRF < 2 dB*Z* but increases monotonically to 0.6 as MRF increases from 2 to 8 dB*Z*, decreases rapidly to 0.2 as MRF increases continuously to 11 dB*Z*, and then increases again sharply to 0.8 as MRF increases further to 13 dB*Z*. For the reason explained above, the variation of *P*(*H*_{0}|*x*_{1}) is exactly opposite to that of *P*(*H*_{1}|*x*_{1}). The two curves *P*(*H*_{0}|*x*_{1}) and *P*(*H*_{1}|*x*_{1}) intersect three times at the probability value of 0.5. The posterior probability of bird contamination is higher than 0.5 when MRF is in the range of (6, 10.0) or (11.5, 14) dB*Z*. These two ranges of MRF correspond exactly to those in Fig. 1a over which *p*(*x*_{1}|*H*_{1}) > *p*(*x*_{1}|*H*_{0}). Figure 3b shows that the posterior probability *P*(*H*_{1}|*x*_{2}) is zero or nearly zero when *x*_{2} = VDC < 35% and increases almost linearly to nearly 1 as VDC increases from 35% to 70%. The posterior probability of bird contamination is higher than 0.5 when VDC > 53%. This range of high probability corresponds exactly to that in Fig. 1b over which *p*(*x*_{2}|*H*_{1}) > *p*(*x*_{2}|*H*_{0}). Similarly, as shown in Fig. 3c, the posterior probability *P*(*H*_{1}|*x*_{3}) is zero or nearly zero when *x*_{3} = VSC < 31% and increases nearly monotonically to 0.68 as VSC increases from 31% to 43%. The posterior probability of bird contamination is higher than 0.5 only when VSC > 39%. This range corresponds exactly to that in Fig. 1c over which *p*(*x*_{3}|*H*_{1}) > *p*(*x*_{3}|*H*_{0}).

### b. Results of single-parameter QC

The posterior probabilities in Figs. 3a–c are used by the single-parameter QC technique (see section 2b) to check the 1340 sweeps in dataset II, and the probability of wrong decision is estimated for each decision made on each sweep. The estimated probabilities (EP) of wrong decision are then averaged in two groups; that is,

$$\begin{aligned} \mathrm{EP}(D_0 \ne H_0) &= \frac{1}{M_0}\sum\nolimits_0 P(D_0 \ne H_0|x_i), \\ \mathrm{EP}(D_1 \ne H_1) &= \frac{1}{M_1}\sum\nolimits_1 P(D_1 \ne H_1|x_i), \end{aligned} \tag{8}$$

where *P*(*D*_{0} ≠ *H*_{0}|*x*_{i}) = *P*(*H*_{1}|*x*_{i}, *D*_{0}) and *P*(*D*_{1} ≠ *H*_{1}|*x*_{i}) = *P*(*H*_{0}|*x*_{i}, *D*_{1}) are used as in (6); Σ_{0} and Σ_{1} denote summations over the decisions of *D*_{0} and *D*_{1}, respectively; while *M*_{0} and *M*_{1} are the total numbers of decisions of *D*_{0} and *D*_{1}, respectively. The results are listed in the first and third rows of Table 1 for each QC parameter.

The decision made on each sweep is tested against the ground truth information in dataset II. The tested percentages (TP) of wrong decision are also calculated in two groups by

$$\mathrm{TP}(D_0 \ne H_0) = \frac{W_0}{M_0}, \tag{9a}$$

$$\mathrm{TP}(D_1 \ne H_1) = \frac{W_1}{M_1}, \tag{9b}$$

where *W*_{0} and *W*_{1} are the numbers of wrong decisions of *D*_{0} and *D*_{1}, respectively, and *M*_{0} and *M*_{1} are as in (8). The results are listed in the second and fourth rows of Table 1 for each QC parameter. The performance of each QC parameter can be evaluated by the smallness of TP(*D*_{0} ≠ *H*_{0}) and TP(*D*_{1} ≠ *H*_{1}). Clearly, as shown in Table 1, VSC performs better than VDC and VDC performs better than MRF. The performance of each QC parameter can also be assessed by the smallness of EP(*D*_{0} ≠ *H*_{0}) and EP(*D*_{1} ≠ *H*_{1}), without using the ground truth information. As shown in Table 1, the assessed performance rating is consistent with the above-evaluated rating, although EP(*D*_{0} ≠ *H*_{0}) underestimates the probability of wrong decision evaluated by TP(*D*_{0} ≠ *H*_{0}), while EP(*D*_{1} ≠ *H*_{1}) overestimates the probability of wrong decision evaluated by TP(*D*_{1} ≠ *H*_{1}).
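The two group averages can be computed side by side, as in the following Python sketch. The function name `wrong_decision_scores` and the 0/1 encoding of decisions and ground truth are assumptions made for illustration, and the scores are returned as fractions rather than percentages.

```python
import numpy as np

def wrong_decision_scores(decisions, p_wrong, truth):
    """Average estimated and tested wrong-decision rates per decision group.

    decisions : 0 for D0 (not flagged), 1 for D1 (flagged), one per sweep
    p_wrong   : estimated probability of wrong decision for each sweep
    truth     : 0 for H0 (clean), 1 for H1 (contaminated), ground truth

    Returns {0: (EP, TP), 1: (EP, TP)}.
    """
    decisions = np.asarray(decisions, dtype=int)
    p_wrong = np.asarray(p_wrong, dtype=float)
    truth = np.asarray(truth, dtype=int)

    scores = {}
    for d in (0, 1):
        group = decisions == d
        m_d = int(group.sum())                 # M_0 or M_1
        if m_d == 0:
            scores[d] = (float("nan"), float("nan"))
            continue
        ep = p_wrong[group].sum() / m_d        # averaged estimate (EP)
        w_d = int((truth[group] != d).sum())   # W_0 or W_1
        scores[d] = (ep, w_d / m_d)            # (EP, TP)
    return scores
```

Only the TP values need the ground truth, so the EP half of the calculation remains available in real applications where no ground truth exists.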

### c. Results of multiparameter QC

The posterior probabilities in Figs. 3a–c are used in combination by the multiparameter QC technique (see section 2c) to check the 1340 sweeps in dataset II, and the probability of wrong decision is estimated for each decision. The estimated probabilities of wrong decision are then averaged in two groups; that is,

$$\begin{aligned} \mathrm{EP}(D_0 \ne H_0) &= \frac{1}{M_0}\sum\nolimits_0 P(D_0 \ne H_0|\mathbf{x}), \\ \mathrm{EP}(D_1 \ne H_1) &= \frac{1}{M_1}\sum\nolimits_1 P(D_1 \ne H_1|\mathbf{x}), \end{aligned} \tag{10}$$

where (7) is used, and the summations Σ_{0} and Σ_{1} and associated *M*_{0} and *M*_{1} have the same meanings as those in (8). The results are listed in the first and third rows of the last column in Table 1.

The decision made by the multiparameter QC on each sweep is tested against the ground truth information in dataset II. The tested percentages of wrong decision are calculated in two groups in the same way as in (9a) and (9b). The calculated TP(*D*_{0} ≠ *H*_{0}) and TP(*D*_{1} ≠ *H*_{1}) are listed in the second and fourth rows of the last column in Table 1. Note that TP(*D*_{0} ≠ *H*_{0}) is now much smaller than those of single-parameter QC, although TP(*D*_{1} ≠ *H*_{1}) is not. This fits well the purpose for which the multiparameter QC is designed, that is, to eliminate as many unqualified observations as possible without throwing away too many qualified observations for NWP data assimilation applications. Evaluated by the smallness of TP(*D*_{0} ≠ *H*_{0}), the multiparameter QC technique performs much better than any of the three single-parameter QC techniques. The superior performance of the multiparameter QC is also well reflected by the smallness of the estimated EP(*D*_{0} ≠ *H*_{0}). The estimated EP(*D*_{1} ≠ *H*_{1}) is also quite close to TP(*D*_{1} ≠ *H*_{1}), as shown in Table 1. Thus, when the multiparameter QC technique is used for real applications (with no ground truth information), (7) and (10) can be used to estimate the probability of wrong decision for individual decisions and the averaged probability of wrong decision.

## 4. Conclusions

By using the three QC parameters presented in Part I (Zhang et al. 2005), a probabilistic QC technique is developed based on the Bayesian decision theory (Duda et al. 2001) to identify and flag migrating-bird-contaminated sweeps of level II velocity scans at the lowest elevation angle (0.5°). To provide independent tests, the dataset prepared in Part I is partitioned into dataset I (containing 4442 nighttime sweeps at 0.5° for the period 1 May–15 June) and dataset II (containing 1340 nighttime sweeps at 0.5° for the period 15–30 April). Dataset I is used to estimate the prior probability of migrating-bird contamination and the likelihood (empirical probability density estimated from the data) for each QC parameter. Dataset II is used for the tests. The QC technique can use either each single QC parameter or all three in combination.

The single-parameter QC technique is used to evaluate the effectiveness of each QC parameter based on the smallness of the tested percentages of wrong decision, defined by TP(*D*_{0} ≠ *H*_{0}) and TP(*D*_{1} ≠ *H*_{1}) in (9). The results show that VSC is more effective than VDC, while VDC is more effective than MRF (see Table 1). For each QC decision (bird-contaminated or not contaminated), the probability of wrong decision is also estimated [see (6) and (8)], without using the ground truth information. The averages of the estimated probabilities of wrong decision are shown to be useful for assessing the effectiveness of each QC parameter (without knowing the ground truth information).

The multiparameter QC technique is designed and developed for NWP data assimilation applications. Because accepting one bad observation may ruin the analysis more than throwing away a dozen good observations, the principle of the design is to eliminate as many unqualified observations as possible without throwing away too many qualified observations. This is often necessary for data assimilation, especially radar data assimilation. Since volume scans of level II velocity are available in real time every 5 or 10 min from each of the 120 WSR-88D radars, the existing or near-future operational data assimilation systems will be heavily loaded by radar wind observations, unless the observations are thinned (into fewer superobservations) in time and space. There may also be a considerable degree of information redundancy in level II wind observations with regard to the information and related resolutions that can be effectively absorbed by the current operational NWP systems. This justifies the aforementioned principle in designing the multiparameter QC technique. The multiparameter QC technique is tested with dataset II. As evaluated by the smallness of TP(*D*_{0} ≠ *H*_{0}), the multiparameter QC technique is indeed, as expected, much better than any of the three single-parameter QC techniques (see Table 1). The probability of wrong decision is also estimated [by using (7)] for each decision. The averages of the estimated probabilities of wrong decision [see (10)] are quite close to the tested percentages of wrong decision [see (9) and the last column of Table 1] obtained by using the ground truth information in dataset II. Thus, the formulas in (7) and (10) can provide valid and useful estimates of the probabilities of wrong decision and their averages when the multiparameter QC technique is used for real applications (with no ground truth information).

The performance of the QC techniques can also be evaluated by their hit rates and false alarm rates. The hit rate is defined as the percentage of correctly detected contaminated scans among all actually contaminated scans, while the false alarm rate is the percentage of noncontaminated scans incorrectly flagged as contaminated among all truly noncontaminated scans. The results are listed in the last two rows in Table 1. As shown, the maximum hit rate of the single-parameter QC technique is 0.714 and is smaller (by 0.232) than that of the multiparameter QC technique, while the false alarm rate of the multiparameter QC technique is larger (by at least 0.242) than that of the single-parameter QC technique. The hit rate of the multiparameter QC is 0.946, which means that almost all contaminated scans are detected successfully. This, of course, is at the cost of the increased false alarm rate (the percentage of discarded noncontaminated sweeps among the total noncontaminated sweeps), which is consistent with the aforementioned principle in designing the multiparameter QC technique.
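The two rates as defined above can be computed from the decisions and the ground truth with a short Python sketch. The function name `hit_and_false_alarm_rates` and the 0/1 encoding are assumptions, and the rates are returned as fractions, matching the way the values are quoted in the text.

```python
def hit_and_false_alarm_rates(decisions, truth):
    """Compute the hit rate and false alarm rate as defined in the text.

    decisions : 1 if the sweep was flagged as contaminated, else 0
    truth     : 1 if the sweep is actually contaminated, else 0
    """
    pairs = list(zip(decisions, truth))
    n_contaminated = sum(1 for _, t in pairs if t == 1)
    n_clean = sum(1 for _, t in pairs if t == 0)
    if n_contaminated == 0 or n_clean == 0:
        raise ValueError("both classes must be present in the ground truth")

    hits = sum(1 for d, t in pairs if d == 1 and t == 1)
    false_alarms = sum(1 for d, t in pairs if d == 1 and t == 0)

    # Hit rate: flagged contaminated sweeps / all contaminated sweeps.
    # False alarm rate: flagged clean sweeps / all clean sweeps.
    return hits / n_contaminated, false_alarms / n_clean
```

Note that the false alarm rate here is normalized by the number of truly noncontaminated sweeps, following the definition in the text, rather than by the number of flagged sweeps.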

A flowchart is summarized in Fig. 4 to show how the multiparameter QC is implemented as a part of the total QC system for level II velocity data. As shown, the QC parameters are calculated first as soon as the radar raw data become available. These QC parameters are used not only for bird-contamination identification but also for other level II velocity data quality problems (Liu et al. 2003; Zhang et al. 2003). Because the prior probabilities and likelihood functions are estimated in advance, the posterior probabilities can be computed very efficiently. This computational efficiency is very important for operational applications, especially if the QC system is used to process all the volume scans in real time from the 120 WSR-88D radars.

The multiparameter QC technique has so far been tested only with velocity data collected by the KTLX radar during the 2003 spring migrating season. It is not clear how well the estimated prior probabilities and likelihood functions can be applied to level II velocity data collected in the fall migrating seasons and/or in other regions of the continental United States. This problem requires further investigation.

## Acknowledgments

The authors are thankful to anonymous reviewers for their comments and suggestions that improved the presentation of the results. The research work was supported by the NOAA A8R2WRP project and FAA Contract IA#DTFA03-01-X-9007 to NSSL and by ONR Grants N000140310822 and N000140410312 to the University of Oklahoma.

## Footnotes

*Corresponding author address:* Dr. Qin Xu, National Severe Storms Laboratory, 1313 Halley Circle, Norman, OK 73069. Email: Qin.Xu@noaa.gov