## 1. Introduction

Knowing what precipitation type is reaching the ground is a fundamental prerequisite for accurate determination of amount. Thus, for quantitative precipitation estimation (QPE), first a correct classification needs to be made so that appropriate semiempirical relations can be chosen to estimate the corresponding rates and/or accumulations. Because of sensitivity to hydrometeor concentration, shape, orientation, dielectric constant, and size, polarimetric variables have emerged as leading discriminators of precipitation type (Zrnic and Ryzhkov 1999).

Very early in the development of differential polarization measurements, it became apparent that these could be used to determine the presence of hail and possibly gauge its size (Seliga and Bringi 1978). Later, Hall et al. (1980) proposed a table of relations between the reflectivity factor and differential reflectivity to classify hydrometeors; noteworthy are the descriptive (nonquantitative) entries in the table that, it turns out, are very suitable for building “fuzzy” classification rules. A more detailed table can be found in Doviak and Zrnic (1993); it served as a starting point for some decision-tree type classification schemes (e.g., Lopez and Aubagnac 1997; Carey and Rutledge 1996), which were used to discriminate among rain, graupel, and hail. A similar decision type rule has been advanced and applied to data collected with a 5-cm wavelength radar by Holler et al. (1994).

Building on the table in Doviak and Zrnic (1993), Straka and Zrnic (1993) developed a scheme based on “fuzzy” logic principles in which boundaries between classes were allowed to overlap, but the weighting (membership) functions were of a pulse type (i.e., values of 0 and 1). Subsequently, Straka (1996) used a variety of overlapping weighting functions with smooth transitions in one and two dimensions; this scheme became the basis of several subsequent attempts for classification of hydrometeors (Zrnic and Ryzhkov 1999; Vivekanandan et al. 1999). A further step in the development is reported by Liu and Chandrasekar (2000), who used a feedback scheme to adjust the weights of the fuzzy classifier; they also give a comprehensive set of values for the one-dimensional membership functions.

Meanwhile Straka et al. (2000) have presented an extensive set of relations in the form of tables and two-dimensional graphs that delineate regions, in the space of *Z*_{h} (the reflectivity factor for the horizontally polarized wave) and any polarimetric variable, where specific hydrometeor signatures reside. These graphs (or variants thereof) have been used by the authors to build membership functions for “fuzzy” classification schemes (Straka 1996; Ryzhkov and Zrnic 1999), one of which has been implemented on NCAR's S-Pol radar (Vivekanandan et al. 1999).

The purpose of this paper is to present details of the algorithm not found elsewhere and a methodology for sensitivity analysis to the various polarimetric variables used for classification. Verification, let alone simple comparisons of classification algorithms, is a daunting task because in situ observations are too few and often not coincidental with the radar measurements. Moreover, the radar sampling volumes are several orders of magnitude larger than the typical particle probes whose images contain uncertainties as well. Because there are other simpler means to develop and evaluate the algorithm, no comparison with in situ probes is attempted herein. Rather, we rely on spatial continuity, height above ground, and comparison with conceptual models to qualify the algorithm's performance. An important thrust of our paper is a procedure we developed to compare the merits of the polarimetric variables for classifying various hydrometeors. This procedure and self-consistency checks based on intuition, precipitation physics, and conceptual models can be carried out before in situ comparisons are attempted.

In section 2 of the paper we briefly describe the algorithm. Then follows a description of data that are used by the algorithm. In section 3 we present the performance of the algorithm on the data from the Cimarron (Zahrai and Zrnic 1993) and the S-Pol polarimetric radars (Lutz et al. 1997). Ramifications for operational radars are then mentioned, and relative merits of various polarimetric variables to isolate hydrometeor types are discussed.

## 2. Algorithm

The classification algorithm described herein belongs to the “fuzzy logic” family. Much has been written about this method (Mendel 1995), and several articles with applications to meteorological problems have appeared (Straka 1996; Cornman et al. 1998; Vivekanandan et al. 1999; Liu and Chandrasekar 2000). Nonetheless, none of the jargon used in that discipline is needed to understand its principles. In the following description, we utilize the key nomenclature, define two-dimensional membership functions, and rely on analogy with probability density functions for reasons that will be apparent shortly.

In its essence a classifier assigns (maps) an observed point *X* in the multiparameter space to a class *j* (which refers to the bulk hydrometeor type such as rain, hail, graupel, snow, etc.). In this paper, the point *X* has for coordinates the following six variables:

- the reflectivity factor for horizontally polarized waves *Z*_{h},
- the differential reflectivity *Z*_{DR},
- the specific differential phase *K*_{DP},
- the cross-correlation coefficient between the horizontally and vertically polarized (copolar) waves *ρ*_{hv}(0),
- the linear depolarization ratio *L*_{DR}, and
- the environmental temperature *T*.

The objective is to partition this space into regions that correspond to the individual classes (*j*). The classical statistical decision theory solves, in principle, this type of a problem as follows. It starts with the probability densities *P*_{j}(*X*) for each class. Then, the integral

$$\int_{V_j} P_j(X)\,dX \tag{1}$$

over the volume *V*_{j}, corresponding to the likely class *j*, gives the probability of correct classification, whereas the integral over the complement of *V*_{j} is the probability of misclassification of the specific class *j*. The choice of the volume *V*_{j} amounts to identification of a boundary delineating the class; misclassification occurs because of the ambiguity at and in the vicinity of the boundaries between classes. By considering consequences of either decision (the point does or does not belong to a class), one can partition the space of the polarimetric variables (universe in the nomenclature of set theory) to optimize the outcome. Normally an accepted false alarm rate for misclassification would be adopted, and by trial and error, the boundary would be changed as long as the probability of correct classification increases while the probability of misclassification is kept at or below the desired level. This is analogous to the dilemma facing weather forecasters who are about to issue warnings. Overwarning increases the probability of detection at the expense of more false alarms. A proper balance between these two decisions is often a subjective call.
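The trade-off described above can be made concrete with a one-dimensional toy problem. The sketch below assumes two hypothetical Gaussian class densities (the means, spreads, and candidate intervals are illustrative and not taken from any dataset). For a candidate region *V*_{j} it evaluates the probability of correct classification and the probability that a competing class falls inside the region (a false alarm).

```python
import math


def gaussian_cdf(x, mean, sigma):
    """Cumulative distribution function of a normal density."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def prob_in_interval(lo, hi, mean, sigma):
    """Integral of the normal density over the interval [lo, hi]."""
    return gaussian_cdf(hi, mean, sigma) - gaussian_cdf(lo, mean, sigma)


# Hypothetical (mean, sigma) of two one-dimensional class densities.
CLASS_J = (40.0, 4.0)   # class of interest
CLASS_K = (52.0, 4.0)   # competing class


def evaluate_region(lo, hi):
    """Return (P[correct], P[false alarm]) for the candidate region V_j = [lo, hi]."""
    p_correct = prob_in_interval(lo, hi, *CLASS_J)
    p_false_alarm = prob_in_interval(lo, hi, *CLASS_K)
    return p_correct, p_false_alarm


narrow = evaluate_region(34.0, 44.0)
wide = evaluate_region(30.0, 50.0)
print("narrow region:", narrow)
print("wide region:  ", wide)
```

With these assumed densities, widening the region raises the probability of correct classification from about 0.77 to about 0.99, but the false alarm probability grows from about 0.02 to about 0.31.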

In most cases of practical significance, and certainly for hydrometeor classification, the probabilities of classes are not available, and it is unlikely that these would become known in the near future. Therefore, simplifications are sought such that reasonable classification can be made.

Weighting (membership) functions *W*_{j}(*Y*_{i}), somewhat similar to probabilities, are defined in a way that mimics one's expectation (Straka and Zrnic 1993; Vivekanandan et al. 1999; Liu and Chandrasekar 2000). Here, *Y*_{i} stands for any one of the polarimetric variables. An intuitive yet good practice is to restrict the maxima of the *W*_{j}s to be 1 and locate these at values for which the variable almost always corresponds to the *j*th class. Farther from these maxima, the weighting function should decrease to reach zero in regions where the variable is unlikely to be associated with the *j*th class. To distinguish between reliable and less reliable variables *Y*_{i}, one can multiply the *W*_{j}(*Y*_{i}) by a factor *A*_{i} less than or equal to one. In this manner, relative changes of confidence in various variables can be easily accounted for. Thus, the classification based on this scheme involves sums

$$S_j = \frac{\sum_{i=1}^{M} A_i W_j(Y_i)}{\sum_{i=1}^{M} A_i}, \tag{2}$$

where *M* is the number of variables, and the natural assignment is the one that maximizes *S*_{j}.
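A minimal sketch of such a sum-based classifier follows. It assumes one-dimensional trapezoidal weighting functions, two classes, and two variables; the trapezoid corners and the multipliers are hypothetical placeholders, and the scores are normalized by the sum of the multipliers so that they fall between 0 and 1.

```python
def trapezoid(x, a, b, c, d):
    """Weight rising from 0 at a to 1 at b, equal to 1 up to c, falling to 0 at d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# Hypothetical trapezoid corners (a, b, c, d) for each class and variable.
WEIGHTS = {
    "rain": {"Zh": (20.0, 25.0, 50.0, 55.0), "ZDR": (0.0, 0.5, 3.0, 4.0)},
    "hail": {"Zh": (45.0, 50.0, 70.0, 75.0), "ZDR": (-1.5, -1.0, 0.5, 1.0)},
}

# Hypothetical multipliers A_i (each <= 1) expressing confidence in a variable.
MULTIPLIERS = {"Zh": 1.0, "ZDR": 0.8}


def class_scores(obs):
    """Normalized weighted sum S_j of the weighting functions for every class j."""
    total = sum(MULTIPLIERS.values())
    scores = {}
    for name, funcs in WEIGHTS.items():
        weighted = sum(MULTIPLIERS[v] * trapezoid(obs[v], *funcs[v]) for v in funcs)
        scores[name] = weighted / total
    return scores


def classify(obs):
    """Assign the class that maximizes S_j and return it with all scores."""
    scores = class_scores(obs)
    best = max(scores, key=scores.get)
    return best, scores


label, scores = classify({"Zh": 58.0, "ZDR": 0.2})
print(label, scores)
```

The gap between the largest and the second-largest score in `scores` gives a simple indication of how unambiguous the assignment is.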

The sums offer more than a class label. If the weights represent the “truth” well, the numerical values of *S*_{j} (in %) can be interpreted as the confidence in the outcome of the classification procedure for the bulk hydrometeors in the resolution volume. That is, the higher the max (*S*_{j}) is, and the larger the difference between the max (*S*_{j}) and the next to max (*S*_{j}), the more likely it is that the classification is correct.

A short discussion concerning maximization of the sum (2), as opposed to some other combination of the weights, is in order. For example, Liu and Chandrasekar (2000) maximize the product of weights and report considerable skill of their classifier. Simple considerations suggest that with the sum, the probability of correct classification should be larger. That is, if due to noise, one of the weights (for a specific class) is zero, the rest still can contribute significantly to bring the datum into the correct category. On the other hand, the percent of false (absurd) classifications should be smaller in the product maximization scheme. That is, if one variable is considerably out of range for a given class, its very small weight will suppress that class. Another difference between the two procedures concerns the way that confidence in the variables is expressed. In the maximization of sums, the multiplicative factors *A*_{i} primarily determine the confidence, whereas the width of the weighting functions has a secondary role. In the maximization of products, the width of the weighting functions determines the confidence in the variables; increase of the width lowers the confidence in the variable. Because comparisons between the two procedures have yet to be made, further research and tests are required to evolve an optimum scheme.
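The difference between the two combination rules can be illustrated with a toy case. The weights below are hypothetical: four variables, equal multipliers, and one weight of the true class driven to zero (for instance by noise).

```python
def sum_score(weights):
    """Combine the weights of one class by summation."""
    return sum(weights)


def product_score(weights):
    """Combine the weights of one class by multiplication."""
    result = 1.0
    for w in weights:
        result *= w
    return result


# Hypothetical weights of two classes for four variables.
weights = {
    "true_class": [1.0, 0.9, 0.0, 0.8],   # third variable corrupted by noise
    "other_class": [0.3, 0.4, 0.5, 0.3],
}

by_sum = max(weights, key=lambda c: sum_score(weights[c]))
by_product = max(weights, key=lambda c: product_score(weights[c]))
print("sum rule picks:    ", by_sum)
print("product rule picks:", by_product)
```

The sum still selects the true class, whereas the single zero weight removes it under the product rule. The same mechanism is what lets the product rule suppress a class for which one variable is far out of range.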

Because of high dimensionality and unknown shape of the partition boundaries, a reasonable approach is to deal with projections to smaller dimensions. At the moment, this is a sound practical approach as much has been learned about how values of individual variables relate to hydrometeor types (Herzegh and Jameson 1992; Doviak and Zrnic 1993; Holler et al. 1994; Straka et al. 2000). Although simple, the use of one-dimensional weighting functions gives a less tight partition than what is possible by capitalizing on the dependencies (correlations) between the variables. If two polarimetric variables (corresponding to a specific class of hydrometeor types) are related, their scattergram has a trend as opposed to a centrally symmetric shape for uncorrelated variables; see, for example, the *Z*_{h}, *Z*_{DR} scattergrams (Leitao and Watson 1984), the *Z*_{h}, *K*_{DP} scattergrams (Balakrishnan and Zrnic 1990a; Ryzhkov et al. 1997), and the *Z*_{h}, *ρ*_{hv} scattergrams (Balakrishnan and Zrnic 1990b). Thus, a more precise classification would be achieved if the class is delineated with a boundary that follows a fixed percentage contour in the scattergram of the two variables for a specific hydrometeor type. Straka et al. (2000) suggest how to partition pairs of variables (two-dimensional subspaces) into regions of dominant hydrometeor types. According to several investigators, discrimination in two-dimensional (2D) regions produced very promising results (Straka 1996; Carey and Rutledge 1998; Lopez and Aubagnac 1997; Zrnic and Ryzhkov 1999; Vivekanandan et al. 1999; Liu and Chandrasekar 2000).

Simulations (Scarchilli et al. 1996) and observations allow definitions of regions in the subspace of three or more variables (triplet *Z,* *Z*_{DR}, *K*_{DP} in the cited reference) whereby hydrometeors of one type prevail. Nonetheless, direct partition (i.e., creation of boundaries) in larger than two-dimensional subspaces is quite complex and, to our knowledge, has not yet been attempted. Moreover, two-dimensional partitioning is readily visualized, and therefore, the connection between changes in the boundaries and the corresponding effects on classification becomes apparent. This is important for the evolution of the algorithm, which requires numerous iterations, and this visual-cognitive feedback can lead to a rapid convergence toward satisfactory performance. Accordingly, we have defined 2D weighting functions *W*_{j}(*Z*_{h}, *Y*_{i}) where *Z*_{h} is the reflectivity factor for horizontally polarized waves, and *Y*_{i} is one of the other five variables.

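For the pair *Z*_{h}, *Z*_{DR}, for example, the weighting function is the product of an a priori and a conditional weighting function:

$$W_j(Z_h, Z_{DR}) = W_j(Z_h)\, W_j(Z_{DR} \mid Z_h). \tag{3}$$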

As an example, the shape of the weighting function for the class “moderate rain” (Straka et al. 2000) is plotted in Fig. 1. Here, *W*_{MR}(*Z*_{h}) = 1 for *Z*_{h} between 35 and 45 dB*Z*; from these boundaries, the weighting function decreases linearly (with the slope of 0.2 dB^{−1}) so that if *Z*_{h} is displaced by 5 dB from the boundary, it reaches 0. The conditional weighting functions are also trapezoidal except the coordinates defining each trapezoid (Fig. 1) depend on the value of *Z*_{h}; the decrease of *W*_{MR}(*Z*_{DR} | *Z*_{h}) from 1 to 0 occurs over the interval of 0.3 dB in *Z*_{DR}.
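The moderate-rain example can be sketched as follows. The 35 to 45 dB*Z* plateau, the 5-dB ramps, and the 0.3-dB transition in *Z*_{DR} come from the text; the product form follows the notation *W*_{MR}(*Z*_{DR} | *Z*_{h}). The function `zdr_bounds` is a hypothetical placeholder: the actual *Z*_{h}-dependent limits are read from the diagrams of Straka et al. (2000) and are not reproduced here.

```python
def trapezoid(x, a, b, c, d):
    """Weight rising from 0 at a to 1 at b, equal to 1 up to c, falling to 0 at d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def w_mr_zh(zh):
    """A priori weight: 1 between 35 and 45 dBZ, linear 5-dB ramps on both sides."""
    return trapezoid(zh, 30.0, 35.0, 45.0, 50.0)


def zdr_bounds(zh):
    """Hypothetical Z_h-dependent limits (dB) of the plateau in Z_DR."""
    return 0.02 * zh - 0.2, 0.05 * zh + 0.5


def w_mr_zdr_given_zh(zdr, zh, ramp=0.3):
    """Conditional weight: trapezoid in Z_DR whose corners depend on Z_h."""
    lo, hi = zdr_bounds(zh)
    return trapezoid(zdr, lo - ramp, lo, hi, hi + ramp)


def w_mr(zh, zdr):
    """Two-dimensional weight as the product of a priori and conditional weights."""
    return w_mr_zh(zh) * w_mr_zdr_given_zh(zdr, zh)


print(w_mr(40.0, 1.0))    # inside both plateaus
print(w_mr(32.5, 1.0))    # halfway down the Z_h ramp
print(w_mr(40.0, 0.45))   # halfway down the assumed Z_DR ramp
```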

In a similar way, all other 2D weighting functions are defined on, altogether, five pairs of variables. All the “a priori” and “conditional” *W*s are trapezoids whose top vertices are prescribed by the diagrams in Straka et al. (2000) in a similar manner, as explained in the previous paragraph. The weighting functions overlap and intersect at the value of ½. Some minor deviations from these diagrams have evolved in the course of this work.

The increments in the polarimetric variables over which various weighting functions change from 1 to 0 (or vice versa) are listed in Table 1. These were obtained in a subjective manner from experience, examination of scattergrams, and consideration of statistical errors in estimates of the polarimetric variables.

Tests of the algorithm demonstrated that the zero degree isotherm of the environmental temperature was often above the one inferred by the location of the melting layer (which is lowered by the downdraft). This can produce inconsistent weighting functions. To avoid such contradictions, we modify the environmental temperature profile so that its zero degree height coincides with the minimum of *ρ*_{hv}. That is, in cases where the vertical cross section of *ρ*_{hv} shows a well-defined melting zone (Zrnic et al. 1993), a vertical profile through the melting zone is constructed, and the location of its minimum is found. In this way, the heights of the melting layer bottom obtained from the temperature and polarimetric measurements are forced to coincide. But through convective cores, the melting zone is lifted in the updraft and lowered in the downdraft and, thus, offset from the imposed zero *T* height. Nonetheless, the polarimetric variables have such strong signatures in the convective cell that their weighting functions overwhelm the temperature's, and hence, the offset does not affect the outcome of classification. We emphasize that it is not necessary to have an actual temperature profile. A standard atmosphere profile suffices if it is modified to have the zero isotherm at a height determined by the radar.
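The adjustment of the temperature profile can be sketched as a simple vertical shift. The example assumes a hypothetical sounding with a constant lapse rate of 6.5°C km^{−1} and a hypothetical vertical profile of *ρ*_{hv}; the function names and all numerical values are illustrative only.

```python
def zero_isotherm_height(heights_km, temps_c):
    """Linearly interpolate the height at which the sounding crosses 0 degC."""
    for k in range(len(heights_km) - 1):
        t0, t1 = temps_c[k], temps_c[k + 1]
        if t0 >= 0.0 > t1:
            frac = t0 / (t0 - t1)
            return heights_km[k] + frac * (heights_km[k + 1] - heights_km[k])
    raise ValueError("sounding does not cross 0 degC")


def rhohv_minimum_height(heights_km, rhohv):
    """Height of the minimum of a vertical rho_hv profile through the melting zone."""
    k_min = min(range(len(rhohv)), key=lambda k: rhohv[k])
    return heights_km[k_min]


def shift_sounding(heights_km, temps_c, rho_heights_km, rhohv):
    """Shift the sounding vertically so its 0 degC height matches the rho_hv minimum."""
    shift = rhohv_minimum_height(rho_heights_km, rhohv) - zero_isotherm_height(
        heights_km, temps_c
    )
    return [h + shift for h in heights_km], shift


# Hypothetical sounding: 30.55 degC at the ground, 6.5 degC per km lapse rate.
heights = [float(h) for h in range(11)]
temps = [30.55 - 6.5 * h for h in heights]

# Hypothetical rho_hv profile with a minimum in the melting zone.
rho_heights = [3.0, 3.2, 3.4, 3.6, 3.8, 4.0]
rho_values = [0.98, 0.97, 0.95, 0.90, 0.96, 0.98]

new_heights, shift = shift_sounding(heights, temps, rho_heights, rho_values)
print("shift (km):", shift)
print("new 0 degC height (km):", zero_isotherm_height(new_heights, temps))
```

Only the height of the zero isotherm is tied to the radar data here; the shape of the profile is left unchanged, in line with the remark that a standard atmosphere profile suffices.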

The weighting functions of temperature are rooted in physical principles and are also trapezoidal, except the ones for rain are open ended at temperatures larger than 0°C. We have tabulated the transition points of the weighting trapezoidal functions in Table 2, whereas Fig. 2 relates the tabulated values to the typical trapezoidal weights.

Currently, there are 11 hydrometeor classes that build on the synthesis in Straka et al. (2000). The hydrometeor classes are

- light rain (LR < 5 mm h^{−1}),
- moderate rain (MR: 5 to 30 mm h^{−1}),
- heavy rain (HR > 30 mm h^{−1}),
- rain dominated by large drops (LD),
- rain/hail mixture (R/H),
- graupel and/or small hail (GSH),
- hail (HA),
- dry snow (DS),
- wet snow (WS),
- horizontally oriented ice crystals (ICH), and
- vertically oriented ice crystals (ICV).

The real-time version of this algorithm implemented on the S-Pol radar (Vivekanandan et al. 1999) differs in some categories. Drizzle, cloud drops, supercooled droplets, irregular ice crystals, and insects are additional classes of scatterers, but there is no distinction between horizontally and vertically oriented ice crystals, nor is there a category for rain with large drops. Vertical orientation of crystals is caused by strong electric fields and their presence might be a precursor to the onset of lightning. The extra hydrometeor categories for the S-Pol radar may be possible because it measures *L*_{DR}, which is not available on the Cimarron radar.

In addition to the described weighting functions, there are thresholds on differential reflectivity and temperature such that the confidence (sum *S*_{j}) in a hydrometeor class *j* is set to zero if the thresholds are satisfied. Table 3 lists the thresholds and the affected hydrometeors. It is self-evident that these thresholds prevent absurd classification.
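The suppression step can be sketched as a rule table applied to the class scores before the maximum is taken. The rules below are hypothetical placeholders chosen for illustration; they are not the entries of Table 3.

```python
# Hypothetical suppression rules: (class, variable, comparison, limit).
# These are NOT the thresholds of Table 3; they only illustrate the mechanism.
RULES = [
    ("LR", "T", "<", -10.0),   # assumed: no light rain in very cold air
    ("DS", "T", ">", 5.0),     # assumed: no dry snow in warm air
    ("HA", "ZDR", ">", 3.0),   # assumed: no hail at very large Z_DR
]


def apply_thresholds(scores, obs, rules=RULES):
    """Set S_j to zero for every class whose suppression rule is satisfied."""
    out = dict(scores)
    for cls, var, op, limit in rules:
        value = obs[var]
        satisfied = value < limit if op == "<" else value > limit
        if satisfied and cls in out:
            out[cls] = 0.0
    return out


scores = {"LR": 0.7, "DS": 0.6, "HA": 0.2}
obs = {"T": -15.0, "ZDR": 0.5}
checked = apply_thresholds(scores, obs)
print(checked, "->", max(checked, key=checked.get))
```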

Finally, the multipliers *A*_{i} representing the importance attached to the various polarimetric variables are listed in Table 4. Experience combined with trial and error led us to adopt these values. Reasons related to hardware on the Cimarron radar cause bias in the *ρ*_{hv} such that the values are lower and noisier than expected, and therefore, *A*_{ρ} is set to 0.3 as opposed to 0.5 for the S-Pol radar.

It is of utmost importance to present valid data to this or any other automated algorithm. To that end, we utilize a median filter on *Z*_{h}, a running average to smooth the *Z*_{DR}, *ρ*_{hv}, and *L*_{DR}, and a least-squares fit (Ryzhkov and Zrnic 1996) to obtain specific differential phase. Furthermore, we discard *K*_{DP} in and adjacent to regions where its values are below −0.5° km^{−1}. Reflectivity factor and differential reflectivity are corrected for attenuation following the procedure suggested by Bringi et al. (1990).
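Part of this preprocessing can be sketched for data along a single radial. The window half-widths and the one-gate margin around censored *K*_{DP} values are assumptions made for illustration; the least-squares estimation of *K*_{DP} and the attenuation correction are not shown.

```python
from statistics import median


def median_filter(values, half_width=2):
    """Running median; the window is truncated at the ends of the radial."""
    n = len(values)
    return [
        median(values[max(0, k - half_width):min(n, k + half_width + 1)])
        for k in range(n)
    ]


def running_mean(values, half_width=2):
    """Running average; the window is truncated at the ends of the radial."""
    n = len(values)
    smoothed = []
    for k in range(n):
        window = values[max(0, k - half_width):min(n, k + half_width + 1)]
        smoothed.append(sum(window) / len(window))
    return smoothed


def censor_kdp(kdp, threshold=-0.5, margin=1):
    """Replace K_DP by None where it is below the threshold and at adjacent gates."""
    n = len(kdp)
    bad = [False] * n
    for k, value in enumerate(kdp):
        if value < threshold:
            for m in range(max(0, k - margin), min(n, k + margin + 1)):
                bad[m] = True
    return [None if flag else value for value, flag in zip(kdp, bad)]


zh = median_filter([30.0, 31.0, 90.0, 32.0, 33.0], half_width=1)
zdr = running_mean([1.0, 2.0, 3.0, 4.0], half_width=1)
kdp = censor_kdp([0.2, 0.1, -0.8, 0.3, 0.5], margin=1)
print(zh, zdr, kdp)
```

The median filter removes the isolated 90-dB*Z* spike without smearing it into neighboring gates, which a running average would do.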

## 3. Sensitivity tests

Objective comparison of polarimetric schemes for classification could be done, in principle, if a large set of comprehensive in situ observations were available. Currently, there are no platforms to provide such observations over large volumes and at rates compatible with radar scanning times; nonetheless, for the purpose of this paper, such comparisons are premature. For preliminary comparisons and crude adjustments of the weighting (membership) functions, there is a much simpler procedure that we have adopted. It consists of visual identification of the obvious gross errors and determination of the responsible variables. Then, the boundaries and/or membership functions built on these variables are changed until the gross errors are reduced or disappear. This requires care so that the ripple effect on the previously correct classification is minimized, whereas the reduction in erroneous classes is maximized.

The reflectivity factor and environmental sounding are available to all those with access to ubiquitous nonpolarimetric radars. Hence, it is pertinent to determine how these two variables fare in comparison with the full set of polarimetric variables. In addition, it is useful to know which polarimetric variable in combination with the reflectivity factor has the most discriminatory power. To illustrate the effects of temperature, we present results of classification whereby use of only the pair *Z*_{h}, *T* is contrasted with application of the other variables. Similarly, we consider the other pairs individually and in suitable combinations.

We have chosen a dataset obtained with the Cimarron polarimetric radar from a hail storm and a dataset obtained with the S-Pol radar from an airmass storm in Florida to demonstrate the effects variables have on the classification outcome. Besides presenting images of classified hydrometeor fields, we also make a somewhat subjective but nonetheless quantitative ranking of the variables according to their relative importance.

### a. Hail storm observed with the Cimarron radar

#### 1) Visual analysis of classified fields

The polarimetric variables available on the Cimarron radar are reflectivity factor *Z*_{h}, differential reflectivity *Z*_{DR}, specific differential phase *K*_{DP}, and cross-correlation coefficient *ρ*_{hv}. Vertical cross sections of the fields of these variables through a convective storm reveal some well-defined hydrometeor signatures as well as artifacts (Fig. 3). For example, the high reflectivity core aloft (Fig. 3a) likely contains hail. The positive *Z*_{DR} near ground indicates rain, and the column of positive *Z*_{DR} at 80 km (Fig. 3b) contains large drops in relatively low concentration. Positive *Z*_{DR} above the storm top is due to mismatch of antenna sidelobes for vertical and horizontal polarizations; these data are excluded from the forthcoming presentations. The *K*_{DP} shows concentration of rain between 70 and 80 km, and there is a column of *K*_{DP} that overlaps the *Z*_{DR} column (Fig. 3d).

The censored values aloft correspond to *K*_{DP} < −0.5° km^{−1}, which are likely due to artifacts (Ryzhkov and Zrnic 1998). The field of correlation is least reliable, and there is no indication of the melting layer in this storm (Fig. 3c). An abrupt increase of *Z*_{DR} in the lower part of precipitation in this case hints at the onset of melting of graupel and snow. Because the strong vertical gradient of *Z*_{DR} at its highest point above ground (at the range of 40 km) almost coincides with the 0°C isotherm of the sounding, no adjustment was made to the environmental temperature.

The algorithm was applied to the fields of polarimetric variables (Fig. 3), and it created the hydrometeor classes represented in Fig. 4. We consider this field to be the standard against which fields obtained with less than the full set of variables can be gauged. Although we do not have independent verification about the quality of the field in Fig. 4, we submit that the comparison can reveal some useful information. For example, the relative merit of polarimetric variables and the average confidence (Mendel 1995) in various classes of hydrometeors can be quantified.

Figures 5a–d contain the fields of hydrometeors, each of which was obtained from a single pair of variables (one variable in any of the pairs is always *Z*_{h}); comparison with Fig. 4 reveals the relative significance of the selected pair. This significance will be quantified after a brief qualitative description of salient features attributed to the various pairs.

The primary variables responsible for locating the onset of rain are the *Z*_{h} and *Z*_{DR}, as can be seen in Fig. 5a. Further, the heavy rain and rain with large drops at 50 km as well as aloft at 80 km are also identified from the *Z*_{h}, *Z*_{DR} pair. The light rain in the anvil is erroneous and is a consequence of inability to distinguish between light rain and dry snow in the two-dimensional subspace *Z*_{h}, *Z*_{DR}. The areas of light rain and dry snow overlap in a large region of the *Z*_{h}, *Z*_{DR} subspace (see Figs. 2a and 2b in Straka et al. 2000). With further research, it might be possible to decrease the overlapping area.

The pair *Z*_{h}, *K*_{DP} identifies some horizontally oriented ice crystals, but at the top of the cloud it misclassifies crystals into light rain. Otherwise, graupel and rain/hail mixture are the two dominant categories. Both extend into the region where their presence (in this case) is physically forbidden; for rain/hail that is throughout a large upper part of storm, and for graupel it is the northwest part near ground. These errors are a consequence of inadequate information in the *Z*_{h}, *K*_{DP} pair. Sporadic data void regions (black patches within the field) are where the *K*_{DP} is excessively negative.

The pair *Z*_{h}, *T* (Fig. 5d) produces a smooth field in which the vertical stratification of hydrometeors is controlled by the temperature profile. Note how the phase transition at zero temperature is at the constant height of about 4 km; it is followed in height with graupel/small hail, and dry snow is at the top. This general pattern is modulated in the horizontal direction by the influence of the reflectivity factor. Although pleasing to the eye and somewhat physically satisfying, the classification misses important details seen in Fig. 4.

The pair *Z*_{h}, *ρ*_{hv} produces the least reliable classification due to extensive overlap of discrimination regions (Straka et al. 2000). The classes close to the ground are credible mainly due to the influence of *Z*_{h}. Data-void regions are where the values of the correlation coefficient are below 0.3.

Figures 5a–d indicate that none of the pairs by themselves can create wet snow in the melting layer; its presence is revealed only if the algorithm is applied to all the polarimetric variables (Fig. 4). The pairs *Z*_{h}, *Z*_{DR}, and *Z*_{h}, *T* produce fields that closely match what is expected as a function of height in storm cells. Other works (Wakimoto and Bringi 1988) and our experience suggest that the boundary between frozen and liquid precipitation at low levels is faithfully delineated using the *Z*_{h}, *Z*_{DR} pair.

For further comparison and analysis, it is worthwhile to examine classified fields obtained from the sets of the polarimetric variables in which one polarimetric variable is absent (Fig. 6). Absence of *Z*_{DR} (Fig. 6a) produces a field in which the strong influence of *K*_{DP} and *T* is seen, whereas the absence of *K*_{DP} (Fig. 6b) replicates fairly well the field in Fig. 4; neither one of these fields contains wet snow in the melting layer. Only if *Z*_{h}, *Z*_{DR}, and *K*_{DP} are simultaneously present do the weights add up to produce wet snow in the melting layer (Figs. 6c and d).

Absence of *ρ*_{hv} (Fig. 6c) has a minor effect on the classification outcome; however, it increases the thickness of the wet snow region. The value of temperature in eliminating physically impossible classes is seen in Fig. 6d. These are the sporadic areas of wet snow at heights above 7 km, spots of rain/hail mixture at similar heights, patches of light rain at the edges of storm, some ice crystals close to the ground, and wet snow in the upper part of the storm.

Comparison of Figs. 6a–d clearly shows that *Z*_{DR} is the most important polarimetric variable. In its absence, the classified fields clash with those expected from conceptual models and physical considerations.

#### 2) Relative importance of the variables

Next, we will quantify the discussion concerning relative contribution of variables to hydrometeor classes in Figs. 4–6. This we do, as in the previous section, by examining classification with single pairs versus the rest of variables. The reader is referred to appendix A for explanation of the measure *μ* that we assign to the field of classified hydrometeors. Briefly, the measure is the number of data points that are classified to the same category of hydrometeors. With appropriate normalization of measures, we obtain various percentages (akin to probabilities) of classification, which are entered in tables. Thus, from the tables one can objectively establish relative merits of variables used in the classification scheme. Two tables (5 and 6) are presented and discussed in detail herein; several others are in appendix B.

We will often make reference to appendix A (Fig. A1 and Table A1) in the following text. Thus, *μ*(A), *μ*(P), and *μ*(R) stand for the number of data in a class obtained from all the available variables, a single pair of variables, and the rest of the variables (i.e., excluding the polarimetric variable that was part of the pair), respectively, as explained in appendix A. The notation in Table A1 clarifies the meaning of various measures presented here and in appendix B. The total number of classified data points (i.e., resolution volumes) for this storm was 14050.

We start with Table 5 to illustrate values obtained for the Oklahoma hailstorm data. The entries are percentages with respect to the case in which all the variables are used for classification (see also Table A1). Thus, a large value indicates that a particular pair contributes significantly to the classified category, whereas a very small value indicates irrelevance of the variable. If a column contains one large value and other very small values, then the variable corresponding to the large value is almost exclusively responsible for the classification to the corresponding category. In the twelfth column are the percentages for the whole data field (i.e., for all the classes combined).
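One plausible reading of these percentages can be sketched as follows. The exact definitions are given in appendix A and are not reproduced here, so the normalization below is an assumption: for each class, the number of resolution volumes that a single pair assigns to the same class as the full set of variables, divided by the number assigned to that class by the full set. The label lists are hypothetical.

```python
from collections import Counter


def agreement_percent(labels_all, labels_pair):
    """Per-class and overall agreement (%) of a pair with the full-variable result."""
    n_all = Counter(labels_all)
    n_same = Counter(a for a, p in zip(labels_all, labels_pair) if a == p)
    per_class = {c: 100.0 * n_same[c] / n for c, n in n_all.items()}
    total = 100.0 * sum(n_same.values()) / len(labels_all)
    return per_class, total


# Hypothetical class labels for six resolution volumes.
labels_all = ["LR", "LR", "HA", "HA", "HA", "DS"]    # all variables
labels_pair = ["LR", "DS", "HA", "HA", "LR", "DS"]   # one pair only

per_class, total = agreement_percent(labels_all, labels_pair)
print(per_class, total)
```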

A glance at Table 5 reveals that the *Z*_{h}, *Z*_{DR} pair is the most significant of the pairs. Column 14 lists the number of hydrometeor classes for which the percent of correct classification is larger than 70. (For Gaussian distributions, this is almost equal to the probability within one standard deviation.) If we rank the variables according to the number of hydrometeor classes for which the pair produces ≥70% of agreement with that produced by the complete set, the order would be *Z*_{DR}, *T,* *ρ*_{hv}, *K*_{DP} (6 categories >70% for *Z*_{DR}, 5 for *T,* 4 for *ρ*_{hv}, and 2 for *K*_{DP}). But if the rank is according to the number of categories for which the percent is ≤30%, the ranking would be slightly altered to *Z*_{DR}, *T,* *K*_{DP}, *ρ*_{hv}; in this ranking the variable with the smallest number of categories is ranked highest because it fails to classify the smallest number of categories. (The other variables fail to classify more categories.) Ranking according to the percentage for all classes (column 12) is *Z*_{DR}, *T,* *K*_{DP}, *ρ*_{hv}.

It is significant that the *Z*_{h}, *T* pair is ranked second, although the multiplier for its weighting function is 0.4 (Table 4); only the multiplier *A*_{ρhv} has a smaller weight of 0.3. This strong influence of temperature is partly due to its independence of the radar variables (which inherently are somewhat related). Furthermore, the temperature information influences a large number of data above the melting layer and in weak reflectivity regions where the classification is often ambiguous. These factors and the strong effect of *Z*_{h} (in the horizontal direction) are likely the cause for such a high importance of the *Z*_{h}, *T* pair.

Even though the *Z*_{h}, *K*_{DP} pair has a 0.8 weight multiplier, for classification it is of secondary importance, similar to the *Z*_{h}, *ρ*_{hv} pair. This is because for the majority of hydrometeors, the boundaries in the *Z*_{h}, *K*_{DP} space overwhelmingly overlap. Further in this case, there is a large region of *K*_{DP} close to 0, which is ambiguous. But the specific differential phase offers advantages for rainfall measurement (Zrnic and Ryzhkov 1996), which are not included in our ranking.

Further examination of the table reveals features in data that are a direct consequence of the classification scheme; for example, the large drop category is detected exclusively with the *Z*_{h}, *Z*_{DR} pair. Dry snow, wet snow, and vertically oriented ice crystals are not at all classified by this pair. Vertically oriented ice crystals are exclusively detected with the *Z*_{h}, *K*_{DP} pair, but in this particular case, *K*_{DP} misses large drops and horizontally oriented crystals.

Another aspect of the scheme and the variables' influence can be gleaned by normalizing with respect to the number of points obtained if a particular pair (*P*) or lack thereof (i.e., the rest *R* of the pairs) is used in the classification. For that reason, we present Table 6. The values in Table 6 correspond to the categories that are wrongly classified (analogous to false alarm) by the use of the pair. That is, for each category, the measures are normalized to the total number of points classified into that category. For example, 71% of the data are erroneously classified as light rain out of all data classified as LR from the *Z*_{h}, *Z*_{DR} pair; these are mostly in the anvil (Fig. 5a). The missing entries indicate that there were no such categories identified by the pair [i.e., the corresponding *μ*(*P*) is 0]. This lack of identification is a deficiency rooted in the fact that the particular hydrometeor signature in the polarimetric pair is weak or nonexistent. From Table 6, it becomes immediately apparent what pairs misclassify which categories. So in addition to the *Z*_{h}, *Z*_{DR} pair, significant misclassification of LR is caused by the *Z*_{h}, *K*_{DP} pair and also by the *Z*_{h}, *ρ*_{hv} pair. Substantial misclassifiers of other categories are *Z*_{h}, *K*_{DP} for R/H, WS, and ICH; and *Z*_{h}, *ρ*_{hv} for LR, MR, R/H, and WS. Use of combined variables eliminates these misclassifications. Nonetheless, we have no independent confirmation that the combined use of all the variables produces a correct field of hydrometeors. Hence as stated earlier, the values in this and other tables are only meant for relative comparisons. As far as the number of categories that can be identified (that is, minimum of missing entries), the ranking would be *K*_{DP}, *Z*_{DR}, *T,* *ρ*_{hv}. The total percentages imply the following ranking: *Z*_{DR}, *T,* *K*_{DP}, *ρ*_{hv}.

Other aspects of merits of different polarimetric variables can be seen in Tables B1 to B6 in appendix B. This is where the ranking of variables is justified and explained. The summary of the rankings according to the columns labeled total (Table 7) indicates that the *Z*_{h}, *Z*_{DR} pair (note that *Z*_{h} is always used) is overall most effective in the classification scheme. The strong importance of temperature, in spite of its relatively low weight (0.4 in Table 4), suggests that ambiguities in polarimetric signatures and statistical uncertainty might be significant spoilers. It remains to be seen if the addition of spatial filters and quantitative use of weighting functions can reduce the importance of temperature. (This is pertinent for cases where *T* is not available.) As expected, because of the 0.3 multiplier, the *ρ*_{hv} contributes least to our current classification scheme.

### b. Ordinary storm observed with the S-Pol radar

During August and September 1998, the S-Pol radar was located near Melbourne, Florida, to support the Texas-Florida Underflights (TEFLUN) experiment. Because the S-Pol radar measures *L*_{DR} and produces better quality *ρ*_{hv} than the Cimarron, this set allows an augmented analysis and a more extensive comparison of the utility of the various variables. Also, comparison of two different storm types adds value to the analysis.

#### 1) Visual analysis of classified fields

The fields of the polarimetric variables (Fig. 7) indicate the storm is well developed with peak reflectivities of about 55 dB*Z* below the melting layer. The transition from ice to rain is seen in the *Z*_{DR} as an abrupt increase between 3.5 and 4 km; it is also apparent in the *L*_{DR} field as a sharp decrease to less than −26 dB. The bright band in the *ρ*_{hv} is well defined and discernible in a few places even within the storm core. The height of the minimum of the bright band was used to shift the temperature profile of the environmental sounding down by 1.1 km (0°C isotherm from 4.7 to 3.6 km). Both *Z*_{DR} and *K*_{DP} have visible columns that coincide, and *K*_{DP} indicates heavy rain (>100 mm h^{−1}) between 20 and 23 km from the radar. Note the *L*_{DR} cap (Jameson et al. 1996; Hubbert et al. 1998) at a range of 20 km that coincides with the top of the *Z*_{DR} column. It indicates freezing of the supercooled drops lifted by the updraft. Adjacent to it (at farther range) is a vertical band of *L*_{DR} minima that suggests the presence of graupel.
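The adjustment of the sounding described above is a uniform downward shift of the height axis. A minimal Python sketch follows; the sounding values are hypothetical (a constant lapse rate chosen only so that the unshifted 0°C level sits at 4.7 km), and the function name is an illustration rather than part of the original processing.

```python
import numpy as np

def zero_isotherm_height(heights_km, temps_c):
    """Height (km) at which a profile crosses 0 deg C, by linear interpolation.

    Assumes temperature decreases monotonically with height.
    """
    h = np.asarray(heights_km, dtype=float)
    t = np.asarray(temps_c, dtype=float)
    # np.interp needs increasing x values, so reverse the decreasing temperatures.
    return float(np.interp(0.0, t[::-1], h[::-1]))

# Hypothetical environmental sounding: 6.5 deg C per km lapse rate, 0 deg C at 4.7 km.
heights = np.array([0.0, 2.0, 4.0, 6.0, 8.0])
temps = 6.5 * (4.7 - heights)

# Shift the whole profile down so that the 0 deg C level matches the
# height of the bright-band minimum (3.6 km in the text).
shift_km = 4.7 - 3.6
shifted_heights = heights - shift_km

original_level = zero_isotherm_height(heights, temps)
shifted_level = zero_isotherm_height(shifted_heights, temps)
```

Shifting the heights rather than the temperatures keeps the shape of the profile unchanged, which is what a constant 1.1-km displacement implies.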

Classification from the whole suite of variables and from single pairs produces the results in Fig. 8. Graupel/small hail is located in the frozen region of the three distinct cells (Fig. 8a). Dry snow fills most of the rest of the cloud above the melting layer, and at the top of the cloud, some horizontally and even vertically oriented ice crystals are indicated. The sporadic presence of vertically oriented crystals suggests that these are likely artifacts. Although the horizontally oriented crystals are expected at cloud tops, the detection herein could be fortuitous; this is because the crystals also appear at the bottom of the anvil in the region of weak reflectivity where they do not belong. Heavy rain and R/H mixture are interleaved below the main core, and there is a small patch of hail in the core just above the melting zone. Noteworthy is the presence of snow in the bright band, and a sliver below it is rain composed of large drops.

Examination of the *Z*_{DR} field (Fig. 7b) indicates its strong influence on the moderate, heavy, and large drop rain (Fig. 8b). The wet snow is correctly located, and its chief identifier is the *ρ*_{hv} (Fig. 8f). Hydrometeors within the major cell (17–27 km from the radar) are almost equally well identified with the *Z*_{DR}, *Z*_{h} pair as with the *K*_{DP}, *Z*_{h} pair (Fig. 8c), except the field obtained from the latter one is noisier and has a larger area of heavy rain. The phase of precipitation in the two weak cells is correctly separated with the *Z*_{DR}, *Z*_{h} pair but is not separable with the *K*_{DP}, *Z*_{h} pair. Note that the *Z*_{DR}, *Z*_{h} pair fails to discriminate between light rain and dry snow in a large region above the freezing level; neither does it detect vertically or horizontally oriented ice crystals. Clearly *K*_{DP}, *Z*_{h} identifies the horizontally and vertically oriented ice crystals. Note the area of vertical crystals just above the cell at 40 km (Fig. 8c). Although in the composite classification (Fig. 8a) this area is absent, it might have been wrongly eliminated; similar signatures above growing storms have been previously noted and attributed to aligned crystals (Zrnic and Ryzhkov 1999; Caylor and Chandrasekar 1996).

Identification based on the *T,* *Z*_{h} pair is too smooth, generally credible, but lacks details of wet snow or rain/hail mixture. We stress again that in this particular case, the minimum of *ρ*_{hv} was used to place the zero degree isotherm at a correct altitude. If the *L*_{DR}, *Z*_{h} pair is the only discriminant, the field of hydrometeors shows structure, but with the exception of some light and moderate rain below the melting layer and some snow above, the types are mostly wrong. The field produced by the *ρ*_{hv}, *Z*_{h} is surprisingly smooth (Fig. 8f). The strength of this pair is in identifying the wet snow and moderate to heavy rain below the melting layer. As expected, it is not possible to discriminate between rain and ice forms above the melting layer.

By examining the effects of omission of variables from the classification process, one can obtain another feel for the relative merits among the variables (Fig. 9). Absence of *Z*_{DR} caused misclassification in the rain region where some LR and MR are assigned to GSH. This would degrade rainfall measurement. Also in the region of frozen hydrometeors, there are substantial sporadic areas of horizontally and vertically oriented ice crystals (although the area above the two weak cells at 40 km could contain vertically oriented crystals). For most of the hydrometeor categories, the absence of *K*_{DP} has little or no effect (Fig. 9c), except in the core of the strongest cell; there, below the melting layer, GSH occupies a larger area, and the patch touching the ground looks suspicious. Otherwise the fields are smoother and, thus, more appealing than in Fig. 9a! Perhaps *K*_{DP} should be used only in regions where its signatures are strong (moderate or larger rain, rain/hail mixture, etc.). It is very satisfying that the absence of *T* does not change the basic patterns of the classified fields (Fig. 9d). As expected, the field looks noisier; sporadic dry snow appears below the melting layer, and speckles of light rain are seen in the region of snow. The absence of either *L*_{DR} or *ρ*_{hv} has a similar effect. Minor differences are in the depth of regions with oriented crystals at the top (larger in the absence of *L*_{DR}) and in the area of rain/hail mixture (slightly larger in the absence of *ρ*_{hv}).

The presented images indicate the significance of *Z*_{h}, *Z*_{DR} in convective cores, the importance of *ρ*_{hv} for wet snow, and the temperature for resolving ambiguities above and below the melting zone. It is significant that the absence of temperature information leaves generally consistent fields. Nonetheless, *L*_{DR} or *T* is needed to discriminate between dry snow and light rain, as can be seen in Fig. 10. Absence of *L*_{DR} and *T* (Fig. 10a) produces light rain throughout most of the upper part of the storm where the ice phase should dominate. Inclusion of *L*_{DR} at the expense of *ρ*_{hv} and *T* (Fig. 10b) restores the snow region above the melting layer. This finding is significant for applications to winter storms in which the spatial distribution of temperature is not available, and hence, *L*_{DR} might be the most important discriminator between rain and snow.

In the absence of temperature, hydrometeor classes appear noisier but, nonetheless, have coherent structure and are at the expected location. This attests to the strength of the polarimetric variables. Thus, we believe that in similar cases, additional processing with spatial filters would reduce the incoherent speckles.
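One simple spatial filter of the kind suggested above is a majority (mode) filter over neighboring resolution volumes. The Python sketch below is an illustration only: the 3 × 3 window, the integer class codes, and the function name are assumptions, and no such filter was applied to the fields shown here.

```python
import numpy as np

def majority_filter(classes, n_classes):
    """Replace each interior gate by the most frequent class in its 3x3 neighborhood.

    classes   : 2D array of integer class codes (0 .. n_classes - 1)
    n_classes : number of hydrometeor classes
    Edge gates are left unchanged; ties go to the lowest class code.
    """
    classes = np.asarray(classes)
    out = classes.copy()
    rows, cols = classes.shape
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = classes[r - 1:r + 2, c - 1:c + 2].ravel()
            counts = np.bincount(window, minlength=n_classes)
            out[r, c] = np.argmax(counts)
    return out

# Hypothetical field: a uniform region of class 2 with one isolated speckle of class 0.
field = np.full((5, 5), 2)
field[2, 2] = 0
smoothed = majority_filter(field, n_classes=3)
```

A filter like this removes isolated speckles but also erases genuinely small features, so the window size would need to be chosen with the radar resolution in mind.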

#### 2) Relative importance of the variables

Some 14 812 data values were classified (between 10 and ∼50 km from the radar), and the smallest number in a class was 20 for hail; the next smallest was 220 for wet snow. Because of insufficient sample size, we will refrain from quantitative interpretation of data classified as hail.

The percentage of hydrometeors classified using a pair of variables that agree with those obtained from the full set is tabulated in Table 8. The ranking according to column 13 would be first *Z*_{DR} and *T,* second *ρ*_{hv}, and third *L*_{DR} and *K*_{DP}, but according to column 14, the ranking would be *Z*_{DR} and *T,* *K*_{DP}, *ρ*_{hv}, *L*_{DR}. For all classes combined, the temperature leads and is followed by *Z*_{DR}, *K*_{DP}, *L*_{DR}, and *ρ*_{hv}. The 25% lead of *T* over *Z*_{DR} is somewhat unexpected considering that in the Oklahoma storm, temperature was a close second (Table 5). But in the Florida event, a large portion of frozen precipitation has weak reflectivity whereby classification is often ambiguous (Fig. 9d is an example where the *T* is excluded), unlike the Oklahoma storm, which has highly reflecting upper part (see how few obviously wrong classes are in the frozen part in Fig. 6d). Perhaps the reason is that the 0°C isotherm is set to the correct height, which is constant for the Florida storm.

The equivalents to false alarms, that is, percentages of data wrongly classified by the pair, are in Table 9. From this table and according to the totals, the order of variables is *T,* *Z*_{DR}, *K*_{DP}, *L*_{DR}, and *ρ*_{hv}. It is noteworthy that the three “less useful variables” *K*_{DP}, *L*_{DR}, and *ρ*_{hv} each have a large false classification for a specific hydrometeor type. Here, *K*_{DP} misclassifies 100% of hail, *L*_{DR} misclassifies 97% of the large drops, and *ρ*_{hv} misclassifies 85% of the large drop rain.

Relative merits of other variables are discussed and tabulated in appendix C, and the rank of variables according to the metrics in these tables is listed in Table 10. Comparison of the rows of this table with the corresponding rows of Table 7 brings out the following curious fact: the first variable in Table 7 is always second in Table 10 and vice versa. Moreover, according to all the criteria in Table 10, the *ρ*_{hv} is ranked consistently last. Because of immense practical implications of a simultaneous scheme proposed for the WSR-88D (Doviak et al. 2000), we further focus on the relative merits of *L*_{DR} versus *ρ*_{hv}.

Current plans are to simultaneously transmit and receive the *H* and *V* polarizations, and this precludes measurement of *L*_{DR} (unless it is done on alternate scans). At least in this Florida storm, *L*_{DR} does not seem to add significantly more to the classification, as can be seen by the fact that omission of either *L*_{DR} or *ρ*_{hv} (with the use of temperature) has almost the same effect (95% and 96% agreement in column 12 of Table C1). This we have verified in an alternate manner by comparing the classification that omits *L*_{DR} with the one that omits *ρ*_{hv} (93% of classified data agree). A more important question (to which we have no answer) is, “Can the addition of *L*_{DR} be the only way to detect a significant precipitation type (like icing conditions)?”

We do have, however, an objective way to determine the potential utility of *L*_{DR} in special meteorological conditions such as transition between rain and snow in the horizontal direction. Because surface temperature measurements are sparse, the horizontal *T* field might not be available, in which case, the polarimetric variables must stand on their own. Figure 10 suggests that in the absence of temperature, *L*_{DR} should be a much better discriminator of dry snow than *ρ*_{hv}. This assertion is quantified in Table 11, where the percent of “correctly” classified hydrometeors is presented for the case without *T* and *L*_{DR}, as well as for the case without *T* and *ρ*_{hv}. If *L*_{DR} is lacking, only 36% of dry snow is “correctly” identified. In the absence of *ρ*_{hv} (but presence of *L*_{DR}) 88% of the dry snow is still “correctly” classified. Although these results are for vertical cross sections, they apply equally well to horizontal fields of the classification variables.

## 4. Discussion

Both the specifics of our classification scheme and the particular dataset to which it is applied bear on the deduced merits of the variables. Presently, we have not yet evolved the boundaries to the point of diminished return, nor do we have optimum weighting functions. Therefore, we cannot isolate the two effects. Nonetheless, comparison of the relative performance of classification for the two storms can, in some instances, point out to what extent data affect the classifier.

Although the two storms are convective, they have distinct characteristics that might induce substantially different responses of the classifier. Note that the percentages of identified hydrometeor categories out of the total for the two storms differ by a factor of 2 in five categories (Table 12); these are R/H, HA, DS, ICH, and ICV. The contrast in hail is good and expected considering that the Oklahoma storm was a cell in a squall line, and the Florida storm was an airmass storm. This is also the reason that the Florida storm has substantially more dry snow; the deficit of the Oklahoma storm in dry snow is made up by an increase in graupel-small hail.

It is gratifying that all four rain categories are comparable (Table 12); this implies consistency in detecting rain. Similar conclusions apply to the wet snow. As far as the rain and wet snow are concerned, the algorithm is not sensitive to the type of convective storm or the radar.

The algorithm's relative performance on the Oklahoma and Florida data is similar. Comparison of entries in Tables 5, 6, B1–6 with corresponding Tables 8, 9, C1–6 indicates that on the average, the difference between likewise entries in the tables exceeds 30% in only 15% of the total number of entries. Thus, in these cases, the relative merits of the polarimetric variables are invariant to differences in the type of storm and environment, the extra *L*_{DR} available on the S-Pol radar, and the quality of *ρ*_{hv} data.
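The comparison above amounts to counting how often corresponding table entries differ by more than a threshold. A minimal Python sketch follows, assuming the tables are held as arrays of percentages with missing entries stored as NaN and skipped; the function name and the toy numbers are hypothetical and are not values from the tables.

```python
import numpy as np

def fraction_exceeding(table_a, table_b, threshold=30.0):
    """Fraction of corresponding entries whose absolute difference exceeds threshold.

    Entries that are missing (NaN) in either table are excluded from the count.
    """
    a = np.asarray(table_a, dtype=float)
    b = np.asarray(table_b, dtype=float)
    valid = ~np.isnan(a) & ~np.isnan(b)
    diff = np.abs(a[valid] - b[valid])
    return float(np.mean(diff > threshold))

# Toy 2x2 "tables" of percentages (hypothetical numbers).
oklahoma = np.array([[71.0, np.nan], [10.0, 50.0]])
florida = np.array([[30.0, 20.0], [15.0, 90.0]])
frac = fraction_exceeding(oklahoma, florida)
```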

Comparison of Tables 7 and 10 reveals that for all rows except the seventh the leading two variables (either *Z*_{DR} or *T*) are opposite. It is significant that the *L*_{DR} and *ρ*_{hv} rank similarly behind the rest of the variables, and the *L*_{DR} is ahead of *ρ*_{hv}. For several reasons, the polarimetric scheme contemplated for upgrades of the WSR-88D is one with simultaneous transmission and reception of *H* and *V* waves (Doviak et al. 2000). In that mode the *L*_{DR} is not available, but if needed, it could be obtained in alternate volume scans that transmit *H* and receive *H* and *V.* Our analysis supports the notion that *L*_{DR} might not be essential for precipitation identification in convective storms. But where temperature is not available, such as when it changes through zero on the ground, like in winter storms, *L*_{DR} might offer advantages not found in the other variables. This was deduced by excluding the temperature and comparing classifications in the presence and absence of *L*_{DR} and *ρ*_{hv}.

A very important aspect for future examination is ranking classification according to the value associated with the knowledge of a specific hydrometeor type. Such ranking might not even be unique; rather, it would be a function of the utility in a given situation. The value of detecting hail in convective storms is much higher than that of detecting snow aloft. But in a snow storm, the principal utility would be to detect the freezing zone and accurately measure the amounts. Pragmatic reasons suggest that in the short run, tuning the algorithm for a storm type might result in a faster evolution toward better performance.

We realize that our analysis incorporates partly a circular argument. That is, we have given weights (which depend on the confidence in the variables, Table 4) to the variables, and the results of the analysis are generally consistent with these weights. Yet, specifics differ. For example, the *Z*_{h}, *Z*_{DR}, which both have a weight of one, come in as a strong first in most comparisons. But *T,* which has a weight of 0.4, is a very close second (behind *Z*_{DR}). Two reasons could explain its importance: one, *T* has information independent of the polarimetric variables, and two, it is paired with *Z,* which has very strong discriminatory power.
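The interplay between weights and outcome can be illustrated with a small Python sketch of a weighted-sum classifier of the kind described here: each class receives the weighted, normalized sum of the pair membership values, and the class with the largest sum is chosen. The weights for *Z*_{DR} (1), *T* (0.4), and *ρ*_{hv} (0.3) are those quoted in the text; the two-class setup, the membership values, and all names are hypothetical and serve only to show how a low-weight variable can still decide the outcome.

```python
import numpy as np

CLASSES = ["LR", "DS"]  # light rain, dry snow (two classes only, for illustration)

# Weights quoted in the text; other variables are left out of this sketch.
WEIGHTS = {"ZDR": 1.0, "T": 0.4, "RHOHV": 0.3}

def classify(memberships, weights=WEIGHTS):
    """Return (index of winning class, normalized scores).

    memberships maps a variable name to the membership values, one per class,
    of the pair (Z_h, variable) evaluated at a single resolution volume.
    """
    used = [v for v in memberships if v in weights]
    total_weight = sum(weights[v] for v in used)
    score = sum(weights[v] * np.asarray(memberships[v], dtype=float) for v in used)
    score = score / total_weight
    return int(np.argmax(score)), score

# Hypothetical gate above the melting layer with weak reflectivity:
# the polarimetric pairs are ambiguous, the temperature pair is not.
gate = {"ZDR": [0.6, 0.5], "T": [0.0, 1.0], "RHOHV": [0.5, 0.5]}

with_t, _ = classify(gate)
without_t, _ = classify({k: v for k, v in gate.items() if k != "T"})
```

In this toy case the full set selects dry snow, whereas omitting *T* selects light rain, which mirrors the ambiguity between these two classes noted for Fig. 10.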

## 5. Conclusions

Polarimetric measurements from an Oklahoma squall line and a Florida airmass storm were automatically interpreted. The automatic interpretation is in the form of an algorithm that classifies hydrometeors. The principles behind the algorithm are so simple that the only prerequisite for understanding is high school algebra. Because the equations are linear sums, it would seem that there should be no difficulty in analyzing and improving its performance. Yet this is far from the truth. It is the multidimensionality that obfuscates the progress here. The partitioning boundaries are many, and they overlap; furthermore, there are numerous combinations of values of the variables that can cause maxima in the confidence factors. Therefore, the analysis is involved and the evolution tedious. It will take much time to develop a fine version of the algorithm. At this early stage of algorithm development, in situ verification seems premature; thus, we have opted to assess its attributes by examining self-consistency of data, spatial continuity, and compliance with conceptual models.

Our first goal was to test sensitivity of the algorithm to the various polarimetric variables. This was done subjectively by comparing the fields of hydrometeors obtained using the full set of available polarimetric variables with a diminished set whereby some variables have been left out. Comparisons reveal the following. The reflectivity factor and differential reflectivity combined have the strongest discriminating power. Inclusion of the temperature profile helps eliminate a substantial number of spurious errors. Although the absence of temperature information degrades the scheme, it appears that the resultant fields are generally coherent and not far off from the fields obtained by adding *T* to the suite of polarimetric variables.

An objective way to test the sensitivity of the algorithm to variables and rank their utility was devised. It involves definition of a measure (appendix A), which turns out to be a number of data points classified into a category in the presence or absence of variables. Ratios of various measures (similar to probabilities) define the percentage of occurrence of a class. By comparing these percentages for cases in which some variables are excluded to those where all are included, a relative merit can be assigned to the variables.

## Acknowledgments

Partial funding for this research was provided by the National Science Foundation Grants ATM-9120009, ATM-9311911, EAR-9512145, and ATM-9617318, the National Severe Storms Laboratory, the Cooperative Institute for Mesoscale Meteorological Studies, the Graduate College of the University of Oklahoma (Dr. E. C. Smith), and the Federal Aviation Administration. Mike Schmidt and Richard Wahkinney have maintained and calibrated the Cimarron radar; Joan O'Bannon drafted Figs. 2 and A1, and Chris Curtis produced Fig. 1. Finally, a perceptive reviewer pointed out an inconsistency with the weighting functions for temperature and prompted us to correct it.

## REFERENCES

Balakrishnan, N., and D. S. Zrnic, 1990a: Estimation of rain and hail rates in mixed phase precipitation. *J. Atmos. Sci.*, **47**, 565–583.

——, and ——, 1990b: Use of polarization to characterize precipitation and discriminate large hail. *J. Atmos. Sci.*, **47**, 1525–1540.

Bringi, V. N., V. Chandrasekar, N. Balakrishnan, and D. S. Zrnic, 1990: An examination of propagation effects in rainfall on radar measurements at microwave frequencies. *J. Atmos. Oceanic Technol.*, **7**, 829–840.

Carey, L. D., and S. A. Rutledge, 1996: A multiparameter radar case study of the microphysical and kinematic evolution of a lightning producing storm. *Meteor. Atmos. Phys.*, **59**, 33–66.

Caylor, I. J., and V. Chandrasekar, 1996: Time-varying ice crystal orientation in thunderstorms observed with multiparameter radar. *IEEE Trans. Geosci. Remote Sens.*, **34**, 847–858.

Cornman, L. B., R. B. Goodrich, C. S. Morse, and W. L. Ecklund, 1998: A fuzzy logic method for improved moment estimation from Doppler spectra. *J. Atmos. Oceanic Technol.*, **15**, 1287–1305.

Doviak, R. J., and D. S. Zrnic, 1993: *Doppler Radar and Weather Observations.* Academic Press, 562 pp.

——, V. Bringi, A. Ryzhkov, A. Zahrai, and D. S. Zrnic, 2000: Considerations for polarimetric upgrades of operational WSR-88D radars. *J. Atmos. Oceanic Technol.*, **17**, 257–278.

Hall, M. P. M., S. M. Cherry, J. W. F. Goddard, and G. R. Kennedy, 1980: Raindrop sizes and rainfall rate measured by dual-polarization radar. *Nature*, **285**, 195–198.

Herzegh, P. H., and A. R. Jameson, 1992: Observing precipitation through dual-polarization radar measurements. *Bull. Amer. Meteor. Soc.*, **73**, 1365–1374.

Holler, H., V. N. Bringi, J. Hubbert, M. Hagen, and P. F. Meischner, 1994: Life cycle and precipitation formation in a hybrid-type hailstorm revealed by polarimetric and Doppler radar measurements. *J. Atmos. Sci.*, **51**, 2500–2522.

Hubbert, J. V., V. N. Bringi, and L. D. Carey, 1998: CSU-CHILL polarimetric radar measurements from a severe hail storm in Eastern Colorado. *J. Appl. Meteor.*, **37**, 749–775.

Jameson, A. R., M. J. Murphy, and E. P. Krider, 1996: Multiple-parameter radar observations of isolated Florida thunderstorms during the onset of electrification. *J. Appl. Meteor.*, **35**, 343–354.

Leitao, M. J., and P. A. Watson, 1984: Application of dual linearly polarized radar data for prediction of microwave path attenuation at 10–30 GHz. *Radio Sci.*, **19**, 209–221.

Liu, H., and V. Chandrasekar, 2000: Classification of hydrometeor type based on multiparameter radar measurements: Development of a neuro-fuzzy system and in situ verification. *J. Atmos. Oceanic Technol.*, **17**, 140–164.

Lopez, R. E., and J. P. Aubagnac, 1997: The lightning activity of a hailstorm as a function of changes in its microphysical characteristics inferred from polarimetric radar observations. *J. Geophys. Res.*, **102**, 16 799–16 813.

Lutz, J., B. Rilling, J. Wilson, T. Weckwerth, and J. Vivekanandan, 1997: S-Pol after three operational deployments, technical performance, siting experiences, and some data examples. Preprints, *28th Conf. on Radar Meteorology,* Austin, TX, Amer. Meteor. Soc., 286–295.

Mandel, J., 1995: Fuzzy logic systems for engineering: A tutorial. *Proc. IEEE*, **83**, 345–377.

Papoulis, A., 1965: *Probability, Random Variables and Stochastic Processes.* McGraw-Hill Co., 583 pp.

Ryzhkov, A., and D. S. Zrnic, 1996: Assessment of rainfall measurement that uses specific differential phase. *J. Appl. Meteor.*, **35**, 2080–2090.

——, and ——, 1998: Beamwidth effects on the differential phase measurements of rain. *J. Atmos. Oceanic Technol.*, **15**, 624–634.

——, ——, and D. Atlas, 1997: Polarimetrically tuned R(Z) relations and comparison of radar rainfall methods. *J. Appl. Meteor.*, **36**, 341–349.

Scarchilli, G., E. Gorgucci, and V. Chandrasekar, 1996: Self-consistency of polarization diversity measurement of rainfall. *IEEE Trans. Geosci. Remote Sens.*, **30**, 22–26.

Seliga, T. A., and V. N. Bringi, 1978: Differential reflectivity and differential phase shift: Applications in radar meteorology. *Radio Sci.*, **13**, 271–275.

Straka, J. M., 1996: Hydrometeor fields in a supercell storm as deduced from dual-polarization radar. Preprints, *18th Conf. on Severe Local Storms,* San Francisco, CA, Amer. Meteor. Soc., 551–554.

——, and D. S. Zrnic, 1993: An algorithm to deduce hydrometeor types and contents from multiparameter radar data. Preprints, *26th Int. Conf. on Radar Meteorology,* Norman, OK, Amer. Meteor. Soc., 513–516.

——, ——, and A. V. Ryzhkov, 2000: Bulk hydrometeor classification and quantification using polarimetric radar data: Synthesis of relations. *J. Appl. Meteor.*, **39**, 1341–1372.

Vivekanandan, J., D. S. Zrnic, S. M. Ellis, R. Oye, A. V. Ryzhkov, and J. Straka, 1999: Cloud microphysics retrieval using S-band dual-polarization radar measurements. *Bull. Amer. Meteor. Soc.*, **80**, 381–388.

Wakimoto, R. M., and V. N. Bringi, 1988: Dual-polarization observations of microbursts associated with intense convection: The 20 July storm during the MIST project. *Mon. Wea. Rev.*, **116**, 1521–1539.

Zahrai, A., and D. S. Zrnic, 1993: The 10-cm wavelength polarimetric radar at NOAA's National Severe Storms Laboratory. *J. Atmos. Oceanic Technol.*, **10**, 649–662.

Zrnic, D. S., and A. Ryzhkov, 1996: Advantages of rain measurements using specific differential phase. *J. Atmos. Oceanic Technol.*, **13**, 454–464.

——, and ——, 1999: Polarimetry for weather surveillance radars. *Bull. Amer. Meteor. Soc.*, **80**, 389–406.

——, N. Balakrishnan, C. L. Ziegler, V. N. Bringi, K. Aydin, and T. Matejka, 1993: Polarimetric signatures in the stratiform region of a mesoscale convective system. *J. Appl. Meteor.*, **32**, 678–693.

## APPENDIX A

### Metrics for Quantitative Comparisons

Definitions of metrics for quantitative comparisons of the effects of various polarimetric variables on the outcome of the classification scheme are presented herein.

To aid the reader, we start with a conceptual diagram (Fig. A1) of the field of an identified class of hydrometeors. One can think of it as a horizontal or vertical cross section in which only one class of hydrometeors is shown. The set *A* corresponds to the data points (here from radar resolution volumes), which have been classified as one hydrometeor type (e.g., hail) using *all* (*A*) the available variables; this set is a standard for comparisons, and the corresponding classification will be called “correct.” The set *P* corresponds to the same hydrometeor class (hail) except that the data have been classified using a single *pair* (*P*), and the set *R* corresponds to the same class (hail), but the data have been classified using the *rest* (*R*) of the variables (all except the pair that was used to generate *P*). For the sake of simplicity in the sequel, the sets *P,* *R,* and *A* refer to any single hydrometeor type or to all the classified hydrometeors. Contrast this to the cumbersome *P*_{LD}(*Z*_{h}, *Z*_{DR}), which could otherwise signify the set of large drops classified using the pair *Z*_{h}, *Z*_{DR}. From the context and in the tables, it should be clear to which hydrometeor classes the shorthand notation refers.

Next, we prove that the partitions (i.e., subsets) in Fig. A1 are the only ones possible. That is, for the classification procedure based on Eqs. (2) and (3), agreement of the pair with the rest implies that the use of all the variables will also agree. In other words, the seven subsets including the null category in Fig. A1 are the only ones possible.

Suppose the pair is *Z*_{h}, *Z*_{DR}, and the classified category by the pair and by the rest is the same, say snow (*s*). Thus, the weighting function for the pair satisfies

$$W_s(Z_h, Z_{DR}) > W_j(Z_h, Z_{DR}), \quad j \neq s. \tag{A1}$$

Likewise, the rest of the variables satisfy

$$\sum_i A_i W_s(Z_h, Y_i) > \sum_i A_i W_j(Z_h, Y_i), \tag{A2}$$

where *j* ≠ *s,* and the summation is over all the variables *Y*_{i} except *Z*_{DR}. Normalization by the Σ *A*_{i} is omitted from (A2) as it has no bearing on the proof.

Now, multiply both sides of (A1) with *A*_{ZDR} and add the right side of (A1) to the right side of (A2) and the left side of (A1) to the left side of (A2). This operation does not change the inequality, neither does normalization by Σ *A*_{i}, which is actually used when all the variables are present. This completes the proof.
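Written out, the step reads as follows (a sketch that assumes the weight *A*_{ZDR} is positive and that both inequalities are strict for every *j* ≠ *s*). Multiplying the inequality for the pair by *A*_{ZDR} and adding it to the inequality for the rest gives

$$A_{ZDR}\, W_s(Z_h, Z_{DR}) + \sum_i A_i W_s(Z_h, Y_i) > A_{ZDR}\, W_j(Z_h, Z_{DR}) + \sum_i A_i W_j(Z_h, Y_i), \quad j \neq s,$$

where the sums run over all the variables *Y*_{i} except *Z*_{DR}. Dividing both sides by the positive quantity $A_{ZDR} + \sum_i A_i$ leaves the inequality intact, so the normalized sum over all the variables is also largest for class *s*; that is, the classification with all the variables agrees with the pair and with the rest.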

The sets *P,* *R,* *A* belong to a Borel field (e.g., Papoulis 1965) and so do the various intersections and unions of these sets. Next, we discuss assignment of a measure to these sets, from which some quantitative comparison of performance will ensue. Briefly, a measure *μ* on a Borel field is an assignment of a number *μ*(*E*) for every element *E* that belongs to the field such that *μ*(*E*) ≥ 0. For our purpose, the measure will be the number of data points in a classified category of hydrometeors. (Each data point corresponds to a single resolution volume.) That is, we count the number of classified data points (in a specific hydrometeor category) for which use of all variables, use of the single pair, and use of the remaining variables (without the single pair) agree or disagree. Then, a suitable combination and normalization of these measures can reveal the relative importance of the variables. Thus in our notation, we define metrics as ratios of measures (completely analogous to probabilities); for example,

$$\frac{\mu(P \cap A)}{\mu(A)} \quad \text{and} \quad \frac{\mu(P - P \cap A)}{\mu(P)}.$$

The first ratio is the fraction of the data classified with all the variables that the pair alone classifies identically; the second is the fraction of the data classified by the pair that disagrees with the classification from all the variables.
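Because the measure is a count of data points, these ratios reduce to counting resolution volumes in label arrays. The Python sketch below assumes that the agreement metric is normalized by the count from all the variables and the misclassification metric by the count from the pair, as described in the text for the false-alarm table; the function name, the label arrays, and the use of `None` for missing entries [*μ*(*P*) = 0] are illustrative choices.

```python
import numpy as np

def pair_metrics(labels_all, labels_pair, category):
    """Ratios of measures for one hydrometeor category.

    labels_all  : class label per resolution volume from all the variables (set A)
    labels_pair : class label per resolution volume from a single pair (set P)
    Returns (agreement, misclassification):
      agreement         = mu(P & A) / mu(A)
      misclassification = mu(P - P & A) / mu(P)
    A ratio is None when its denominator is zero (a missing table entry).
    """
    in_a = np.asarray(labels_all) == category
    in_p = np.asarray(labels_pair) == category
    mu_a = int(in_a.sum())
    mu_p = int(in_p.sum())
    mu_pa = int((in_p & in_a).sum())
    agreement = mu_pa / mu_a if mu_a else None
    misclassification = (mu_p - mu_pa) / mu_p if mu_p else None
    return agreement, misclassification

# Hypothetical labels for six resolution volumes.
labels_all = ["LR", "LR", "DS", "DS", "DS", "HA"]
labels_pair = ["LR", "LR", "LR", "LR", "DS", "DS"]

lr_agree, lr_false = pair_metrics(labels_all, labels_pair, "LR")
ha_agree, ha_false = pair_metrics(labels_all, labels_pair, "HA")
```

In this toy case the pair finds every light rain volume but half of what it calls light rain is wrong, and it identifies no hail at all, so the hail misclassification entry is missing.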

## APPENDIX B

### Relative Comparison of Variables Utility—Oklahoma Storm

Tables with various metrics for relative comparison of the variables' utility in the classification scheme are listed herein. In tables with measures on *P* and subsets of *P,* the second member of the relevant pair is listed in the vertical column defining the rows. (The first member of the pair is always the reflectivity factor *Z*_{h}.) Thus, the variable is either part of the pair used exclusively for classification, or it is the one that is excluded from the classification; from the table titles and appropriate label of the defining column, it should be clear which one it is. The indicated class categories are as listed in the text and the figures. The meaning of various measures presented in the forthcoming tables is explained in appendix A.

Similar to Table 4, Table B1 presents percentages of classified precipitation, except it is for the case whereby a variable has been excluded. Large percentages indicate that the variable is relatively insignificant; that is, without it classification is still successful; small numbers indicate the opposite. Ranking of the variables according to the number of classes for which the percent is ≥70% is *Z*_{DR}, *T,* *K*_{DP}, *ρ*_{hv}; smaller numbers in column 13 mean increased importance of the excluded variable because fewer classes are correctly identified by the rest of the variables. Ranking of the variables according to the number of classes for which the percent is ≤30% is *Z*_{DR}, *K*_{DP}, *T,* *ρ*_{hv}, and according to the percentage in the total (column 13), the ranking would be *T,* *Z*_{DR}, *K*_{DP}, *ρ*_{hv}.

To determine how one pair and the rest of the variables contribute to the various categories, we present in Table B2 the percentages of correctly classified data for which the pair and the absence of the variables produces the same class (see Table A1 for definition). The table can loosely be interpreted as the information carried by the pair that is also present in the rest of the pairs. This means that the pair is worth as much as the rest; hence, large numbers imply higher ranking of the pair. Thus in this data, a good portion of light rain, graupel/small hail, and hail would be classified the same with or without the *Z*_{h}, *Z*_{DR} pair; that is for these categories, this pair is as good as the rest of the pairs. The information about heavy rain (86% of data in Table B2) and hail (76%) carried by the pair *Z*_{h}, *T* is also very high. The *Z*_{h}, *ρ*_{hv} pair is almost sufficient for LR, MR, and HR classification. This occasional redundancy of a pair is partly caused by the common presence of *Z*_{h} in all pairs.

Small percentages in B2 indicate that the main contributor is either the variable or the rest but not both. According to the total, the variables (understood to be always paired with *Z*_{h}) would rank as *Z*_{DR}, *T,* *K*_{DP}, *ρ*_{hv}.

Counting the data points in sets described by rows 4 and 5 of Table A1 produces Tables B3 and B4 (for clarity, see also Fig. A1). Thus, Table B3 gives the percentage of data that is missed in the absence of a variable, and Table B4 gives the percentage of data that is missed in the absence of the rest of the variables. According to Table B3, HR and LD are primarily classified by the *Z*_{h}, *Z*_{DR} pair, the significant contributor to MR is the pair *Z*_{h}, *T,* and vertically oriented ice crystals are due to *Z*_{h}, *K*_{DP}. In the overall hydrometeor categories, the percent of contribution by any single pair is 19% or less, and according to the totals, the ranking is *T,* *Z*_{DR}, and *K*_{DP} second, and *ρ*_{hv} last.
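A minimal sketch of how the counts behind Tables B2, B3, and B4 could be formed is given below. It assumes that, for one category, *A*, *P*, and *R* are the sets of data points assigned to that category by all variables, by the pair alone, and by the rest of the variables, respectively, and that μ simply counts points. The label lists and the helper names are hypothetical.

```python
def points_in_class(labels, category):
    """Return the set of data-point indices assigned to `category`."""
    return {i for i, label in enumerate(labels) if label == category}


def overlap_metrics(all_labels, pair_labels, rest_labels, category):
    """Percentages in the sense of the captions of Tables B2, B3, and B4."""
    A = points_in_class(all_labels, category)   # all variables
    P = points_in_class(pair_labels, category)  # pair alone
    R = points_in_class(rest_labels, category)  # rest of the variables
    if not A:
        return None  # mu(A) = 0: the percentages are undefined
    common = R & P & A
    scale = 100.0 / len(A)
    return {
        "B2": scale * len(common),            # pair and rest agree
        "B3": scale * len((P & A) - common),  # found by the pair only
        "B4": scale * len((R & A) - common),  # found by the rest only
    }


# Toy labels for eight data points (illustrative, not from the paper).
all_labels  = ["HR", "HR", "HR", "HR", "LR", "LR", "MR", "MR"]
pair_labels = ["HR", "HR", "LR", "HR", "LR", "HR", "MR", "LR"]
rest_labels = ["HR", "LR", "HR", "MR", "LR", "LR", "MR", "MR"]

result = overlap_metrics(all_labels, pair_labels, rest_labels, "HR")
print(result)
```

In this toy case one of the four HR points is found by both the pair and the rest, two by the pair only, and one by the rest only.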

Table B4 is almost, but not quite, complementary to B3, and from column 12, the rank of variables is *Z*_{DR}, *T,* *K*_{DP}, *ρ*_{hv}.

Table B5 presents somewhat complementary information to Table 6. It indicates the percent of misclassified data for cases in which a specific variable is excluded from the scheme. Small percentages indicate that the excluded variable is not very relevant for a classified category (i.e., the lack of it does not cause significant misclassification). Large values signify substantial misclassification due to the missing variable. Hence, this criterion would rank the variables as *T,* *Z*_{DR}, *K*_{DP}, *ρ*_{hv}. Note again that *Z*_{DR} and *T* lead because the first is needed to identify LD and the second to identify WS. The missing entries are where *μ*(*R*) = 0; hence, the omitted variables prevent classification in the corresponding category.
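The measure of Table B5, μ(*R* − *R* ∩ *A*)/μ(*R*), and its missing entries where μ(*R*) = 0 can be illustrated with a short sketch. It assumes that, for a given category, *R* and *A* are the sets of data points assigned to that category by the rest of the variables and by all variables, and that μ counts points. The function name and the toy labels are hypothetical.

```python
def wrongly_classified_percent(all_labels, rest_labels, category):
    """Percent of points put in `category` by the rest of the variables
    that the full set of variables places elsewhere.

    Returns None when the rest of the variables never produce the
    category (mu(R) = 0), which corresponds to a missing table entry.
    """
    A = {i for i, c in enumerate(all_labels) if c == category}
    R = {i for i, c in enumerate(rest_labels) if c == category}
    if not R:
        return None
    return 100.0 * len(R - A) / len(R)


# Toy labels (illustrative only): classification with all variables
# and with one variable excluded.
all_labels  = ["LD", "LD", "WS", "WS", "DS", "DS"]
rest_labels = ["LR", "LR", "WS", "DS", "DS", "WS"]

for category in ("LD", "WS", "DS"):
    value = wrongly_classified_percent(all_labels, rest_labels, category)
    print(category, "missing" if value is None else f"{value:.0f}%")
```

Returning `None` rather than zero keeps a category that cannot be produced at all distinct from one that is produced without error.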

Finally, Table B6 presents the percentage of correctly identified hydrometeors for which all the variables are needed. These percentages are relative with respect to the pair and its absence. Such data would otherwise be misclassified because the pair assigns them one class and the rest assign them another class, but use of all the variables assigns them a class different from either of these two. That is, the combined use of the pair and the rest overrides either one. For example, consider dry snow (DS) and the pair *Z*_{h}, *Z*_{DR}; if this pair and the rest were used separately, then 36% of dry snow would be missed. Note the high percent of wet snow obtained by combining the pair *Z*_{h}, *K*_{DP} and the rest, which could not be obtained by either one alone. As expected, the categories where percentages are relatively high are those for which there is significant overlap in the boundaries (i.e., DS, WS, ICH).
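The measure in the caption of Table B6 is an inclusion-exclusion count: subtracting μ(*P* ∩ *A*) and μ(*R* ∩ *A*) from μ(*A*) removes the common part *P* ∩ *A* ∩ *R* twice, so it is added back once, which leaves the points of *A* found by neither the pair nor the rest. The sketch below checks this on toy sets, assuming μ counts points; the set contents and the function name are hypothetical.

```python
def needs_all_variables_percent(A, P, R):
    """Percent of A classified correctly only when all variables are used,
    computed with the inclusion-exclusion form of the Table B6 caption."""
    if not A:
        return None  # mu(A) = 0: undefined
    count = len(A) - len(P & A) - len(R & A) + len(P & A & R)
    return 100.0 * count / len(A)


# Toy index sets for one category (illustrative only).
A = set(range(10))          # assigned by all variables
P = {0, 1, 2, 3, 11}        # assigned by the pair alone
R = {2, 3, 4, 5, 6, 12}     # assigned by the rest of the variables

# Direct count of the points of A found by neither the pair nor the rest.
direct = 100.0 * len(A - (P | R)) / len(A)
print(needs_all_variables_percent(A, P, R), direct)
```

Both expressions give the same value for these sets, as expected from the identity.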

## APPENDIX C

### Relative Comparison of Variables Utility—Florida Storm

In Table C1 are the percentages of classified hydrometeors that agree if one variable is excluded from the set with those obtained from all the variables. Large percentages go with relatively unimportant variables. Thus, the leading variable according to the number of categories that can be classified (column 13) would be *Z*_{DR}; the rest are of about equal importance. This same ranking ensues if the total % of the field is a benchmark (column 12).

The information carried by a pair of variables that is also present in the rest of the variables is summarized in Table C2. Here, the large percentages imply that the pair has relatively high importance. A surprisingly strong influence of *T* is seen in both the total percentage (column 12) and the number of categories (column 13). This is perhaps due to the generally weaker polarimetric signatures in the Florida storm compared to the Oklahoma one.

The strength of a pair for classification can be gauged by the percent of data that would be missed in the absence of it (Table C3). The dominant pair for hail detection is *Z*_{h}, *Z*_{DR}; in its absence, 83% of the hail would not be detected. (The importance of this miss is diminished by the fact that sample size for hail is very small.) The order of importance according to the total percentage is *Z*_{DR}, *T,* *K*_{DP}, and the last spot is shared by *L*_{DR} and *ρ*_{hv}.

The importance of *T* is seen in Table C4 whereby the percent of correctly classified data in the absence of a variable is indicated. Small percentages mean that the rest of the variables contribute relatively little to the class. Accordingly, we rank in order of importance *T,* *Z*_{DR}, *K*_{DP}, *L*_{DR}, and *ρ*_{hv}.

Percentages of wrongly classified hydrometeors due to the absence of one variable are in Table C5. From the total percent of misclassified data (Table C5), we note that the absence of *Z*_{DR} has the largest effect on the total: 30% of the misses (column 12). This is followed by *T,* then *L*_{DR}, *K*_{DP}, and *ρ*_{hv}.

The last table, C6, indicates the percent of classified hydrometeors for which combination of the pair and the rest is needed to accomplish classification. The results are similar to the ones in Table B6. There are a few differences, such as in wet snow for (*Z*_{h}, *K*_{DP}) and horizontally oriented crystals (*Z*_{h}, *Z*_{DR}), which have a much lower percentage for the Florida storm.

Increment of polarimetric variables over which the weighting function changes from 1 to 0

The transition points for the weighting trapezoidal functions of temperature

Values of thresholds and types of suppressed hydrometeors

Multipliers (*A _{i}*) of the weighting functions

Percent of “correctly” classified hydrometeors μ(*P* ∩ *A*)/μ(*A*) using the pair alone; Oklahoma storm

Percent of “wrongly” classified data (per category) μ(*P* − *P* ∩ *A*)/μ(*P*) using a pair of variables out of all those classified using the same pair; Oklahoma storm

Relative rank of the variables; Oklahoma storm

Percent of “correctly” classified hydrometeors μ(*P* ∩ *A*)/μ(*A*) using the pair alone; Florida storm

Percent of “wrongly” classified data (per category) μ(*P* − *P* ∩ *A*)/μ(*P*) using a pair of variables out of all those classified using the same pair; Florida storm

Relative rank of the variables; Florida storm

Percent of hydrometeors “correctly” classified without *T* and either *L*_{DR} or ρ_{hv}; Florida storm

Percent of hydrometeors classified in various categories for the Oklahoma (OK) data and the Florida (FL) data

Table A1. Metrics used for relative comparisons between variables.*

Table B1. Percent of “correctly” classified hydrometeors μ(*R* ∩ *A*)/μ(*A*) using one less variable; the variable not used is indicated to the left of the table

Table B2. Percent of “correctly” classified hydrometeors μ(*R* ∩ *P* ∩ *A*)/μ(*A*) for which the pair and its absence (i.e., the rest of variables) produce the same class

Table B3. Percent of “correctly” classified hydrometeors μ(*P* ∩ *A* − *R* ∩ *P* ∩ *A*)/μ(*A*) that would be misclassified in the absence of the variable

Table B4. Percent of “correctly” classified hydrometeors μ(*R* ∩ *A* − *R* ∩ *P* ∩ *A*)/μ(*A*) that would be missed in the absence of the remaining variables

Table B5. Percent of “wrongly” classified data μ(*R* − *R* ∩ *A*)/μ(*R*) using the rest of the variables out of all those classified using the rest of the variables

Table B6. Percent of hydrometeors μ(*A* − *P* ∩ *A* − *R* ∩ *A* + *P* ∩ *A* ∩ *R*)/μ(*A*) that can be “correctly” classified only if all the variables are used

Table C1. Percent of “correctly” classified hydrometeors *μ*(*R* ∩ *A*)/*μ*(*A*) using one less variable; the variable not used is indicated to the left of the table

Table C2. Percent of “correctly” classified hydrometeors μ(*R* ∩ *P* ∩ *A*)/μ(*A*) for which the pair and its absence (i.e., the rest of variables) produce the same category

Table C3. Percent of “correctly” classified hydrometeors μ(*P* ∩ *A* − *R* ∩ *P* ∩ *A*)/μ(*A*) that would be misclassified in the absence of the variable

Table C4. Percent of “correctly” classified hydrometeors μ(*R* ∩ *A* − *R* ∩ *P* ∩ *A*)/μ(*A*) that would be missed in the absence of the remaining variables

Table C5. Percent of “wrongly” classified data μ(*R* − *R* ∩ *A*)/μ(*R*) using the rest of the variables out of all those classified using the rest of the variables

Table C6. Percent of hydrometeors μ(*A* − *P* ∩ *A* − *R* ∩ *A* + *P* ∩ *A* ∩ *R*)/μ(*A*) that can be “correctly” classified only if all the variables are used