  • Aggarwal, C. C., and Reddy C. K., Eds., 2013: Data Clustering: Algorithms and Applications. Chapman and Hall/CRC, 652 pp.

  • Al-Sakka, H., Boumahmoud A.-A., Fradon B., Frasier S. J., and Tabary P., 2013: A new fuzzy logic hydrometeor classification scheme applied to the French X-, C-, and S-band polarimetric radars. J. Appl. Meteor. Climatol., 52, 2328–2344, doi:10.1175/JAMC-D-12-0236.1.

  • Baldini, L., Gorgucci E., Chandrasekar V., and Peterson W., 2005: Implementations of CSU hydrometeor classification scheme for C-band polarimetric radars. 32nd Conf. on Radar Meteorology, Albuquerque, NM, Amer. Meteor. Soc., P11.4. [Available online at https://ams.confex.com/ams/32Rad11Meso/techprogram/paper_95865.htm.]

  • Bechini, R., and Cremonini R., 2002: The weather radar system of north-western Italy: An advanced tool for meteorological surveillance. Proceedings of the Second European Conference on Radar in Meteorology and Hydrology, Copernicus, 400–404.

  • Bechini, R., Baldini L., and Chandrasekar V., 2013: Polarimetric radar observations in the ice region of precipitating clouds at C-band and X-band radar frequencies. J. Appl. Meteor. Climatol., 52, 1147–1169, doi:10.1175/JAMC-D-12-055.1.

  • Brandes, E. A., Vivekanandan J., Tuttle J. D., and Kessinger C. J., 1995: A study of thunderstorm microphysics with multiparameter radar and aircraft observations. Mon. Wea. Rev., 123, 3129–3143, doi:10.1175/1520-0493(1995)123<3129:ASOTMW>2.0.CO;2.

  • Bringi, V. N., and Chandrasekar V., 2001: Polarimetric Doppler Weather Radar: Principles and Applications. Cambridge University Press, 648 pp.

  • Bringi, V. N., Chandrasekar V., Balakrishnan N., and Zrnić D. S., 1990: An examination of propagation effects in rainfall on radar measurements at microwave frequencies. J. Atmos. Oceanic Technol., 7, 829–840, doi:10.1175/1520-0426(1990)007<0829:AEOPEI>2.0.CO;2.

  • Carey, L. D., and Rutledge S. A., 1996: A multiparameter radar case study of the microphysical and kinematic evolution of a lightning producing storm. Meteor. Atmos. Phys., 59, 33–64, doi:10.1007/BF01032000.

  • Caylor, I. J., and Chandrasekar V., 1996: Time-varying ice crystal orientation in thunderstorms observed with multiparameter radar. IEEE Trans. Geosci. Remote Sens., 34, 847–858, doi:10.1109/36.508402.

  • Chandrasekar, V., Keränen R., Lim S., and Moisseev D., 2013: Recent advances in classification of observations from dual polarization weather radars. Atmos. Res., 119, 97–111, doi:10.1016/j.atmosres.2011.08.014.

  • Conway, J. W., and Zrnić D. S., 1993: A study of embryo production and hail growth using dual-Doppler and multiparameter radars. Mon. Wea. Rev., 121, 2511–2528, doi:10.1175/1520-0493(1993)121<2511:ASOEPA>2.0.CO;2.

  • CSU–CHILL, cited 2014: National Weather Radar facility. [Available online at http://www.chill.colostate.edu.]

  • Davies, D. L., and Bouldin D. W., 1979: A cluster separation measure. IEEE Trans. Pattern Anal. Mach. Intell., PAMI-1, 224–227, doi:10.1109/TPAMI.1979.4766909.

  • Dolan, B., and Rutledge S. A., 2009: A theory-based hydrometeor identification algorithm for X-band polarimetric radars. J. Atmos. Oceanic Technol., 26, 2071–2088, doi:10.1175/2009JTECHA1208.1.

  • Dolan, B., Rutledge S. A., Lim S., Chandrasekar V., and Thurai M., 2013: A robust C-band hydrometeor identification algorithm and application to a long-term polarimetric radar dataset. J. Appl. Meteor. Climatol., 52, 2162–2186, doi:10.1175/JAMC-D-12-0275.1.

  • Fisher, R., Perkins S., Walker A., and Wolfart E., cited 2003: Connected components labeling. [Available online at http://homepages.inf.ed.ac.uk/rbf/HIPR2/label.htm.]

  • Giangrande, S. E., Krause J. M., and Ryzhkov A. V., 2008: Automatic designation of the melting layer with a polarimetric prototype of the WSR-88D radar. J. Appl. Meteor., 47, 1354–1364, doi:10.1175/2007JAMC1634.1.

  • Gonzalez, R., and Woods R., Eds., 2002: Digital Image Processing. 2nd ed. Prentice Hall, 793 pp.

  • Gourley, J. J., Tabary P., and du Chatelet J. P., 2007: A fuzzy logic algorithm for the separation of precipitating from nonprecipitating echoes using polarimetric radar observations. J. Atmos. Oceanic Technol., 24, 1439–1451, doi:10.1175/JTECH2035.1.

  • Hendry, A., and McCormick G. C., 1976: Radar observations of the alignment of precipitation particles by electrostatic fields in thunderstorms. J. Geophys. Res., 81, 5353–5357, doi:10.1029/JC081i030p05353.

  • Hooke, R., and Jeeves T., 1961: “Direct search” solution of numerical and statistical problems. J. Assoc. Comput. Mach., 8, 212–229, doi:10.1145/321062.321069.

  • Hubbert, J. C., and Bringi V. N., 1995: An iterative filtering technique for the analysis of copolar differential phase and dual-frequency radar measurements. J. Atmos. Oceanic Technol., 12, 643–648, doi:10.1175/1520-0426(1995)012<0643:AIFTFT>2.0.CO;2.

  • Junyent, F., Chandrasekar V., McLaughlin D., Insanic E., and Bharadwaj N., 2010: The CASA Integrated Project 1 networked radar system. J. Atmos. Oceanic Technol., 27, 61–78, doi:10.1175/2009JTECHA1296.1.

  • Kennedy, P. C., and Rutledge S. A., 2011: S-band dual polarization radar observations of winter storms. J. Appl. Meteor. Climatol., 50, 844–858, doi:10.1175/2010JAMC2558.1.

  • King, W. D., 1984: Air flow and particle trajectories around aircraft fuselages. I: Theory. J. Atmos. Oceanic Technol., 1, 5–13, doi:10.1175/1520-0426(1984)001<0005:AFAPTA>2.0.CO;2.

  • King, W. D., 1986: Air flow and particle trajectories around aircraft fuselages. IV: Orientation of ice crystals. J. Atmos. Oceanic Technol., 3, 433–439, doi:10.1175/1520-0426(1986)003<0433:AFAPTA>2.0.CO;2.

  • Korolev, A., Emery E., and Creelman K., 2013: Modification and tests of particle probe tips to mitigate effects of ice shattering. J. Atmos. Oceanic Technol., 30, 690–708, doi:10.1175/JTECH-D-12-00142.1.

  • Lakshmanan, V., Rabin R., and DeBrunner V., 2001: Segmenting radar reflectivity data using texture. Preprints, 30th Int. Conf. on Radar Meteorology, Munich, Germany, Amer. Meteor. Soc., P2.2. [Available online at https://ams.confex.com/ams/30radar/techprogram/paper_21594.htm.]

  • Lim, S., Chandrasekar V., and Bringi V. N., 2005: Hydrometeor classification system using dual-polarization radar measurements: Model improvements and in situ verification. IEEE Trans. Geosci. Remote Sens., 43, 792–801, doi:10.1109/TGRS.2004.843077.

  • Lim, S., Cifelli R., Chandrasekar V., and Matrosov S. Y., 2013: Precipitation classification and quantification using X-band dual-polarization weather radar: Application in the hydrometeorology testbed. J. Atmos. Oceanic Technol., 30, 2108–2120, doi:10.1175/JTECH-D-12-00123.1.

  • Liu, H., and Chandrasekar V., 1998: Classification of hydrometeor type based on multiparameter radar measurements. Preprints, Int. Conf. on Cloud Physics, Everett, WA, Amer. Meteor. Soc., 253–256.

  • Liu, H., and Chandrasekar V., 2000: Classification of hydrometeors based on polarimetric radar measurements: Development of fuzzy logic and neuro-fuzzy systems, and in situ verification. J. Atmos. Oceanic Technol., 17, 140–164, doi:10.1175/1520-0426(2000)017<0140:COHBOP>2.0.CO;2.

  • Mahalanobis, P., 1936: On the generalized distance in statistics. Proc. Natl. Inst. Sci. India, 2, 49–55.

  • Marzano, F., Scaranari D., and Vulpiani G., 2007: Supervised fuzzy-logic classification of hydrometeors using C-band weather radars. IEEE Trans. Geosci. Remote Sens., 45, 3784–3799, doi:10.1109/TGRS.2007.903399.

  • Openshaw, S., 1983: The Modifiable Areal Unit Problem. Concepts and Techniques in Modern Geography Series, Vol. 38, Geo Books, 40 pp.

  • Otto, T., and Russchenberg H. W. J., 2011: Estimation of specific differential phase and differential backscatter phase from polarimetric weather radar measurements of rain. IEEE Geosci. Remote Sens. Lett., 8, 988–992, doi:10.1109/LGRS.2011.2145354.

  • Park, H. S., Ryzhkov A. V., Zrnić D. S., and Kim K.-E., 2009: The hydrometeor classification algorithm for the polarimetric WSR-88D: Description and application to an MCS. Wea. Forecasting, 24, 730–748, doi:10.1175/2008WAF2222205.1.

  • Parzen, E., 1962: On estimation of a probability density function and mode. Ann. Math. Stat., 33, 1065–1076, doi:10.1214/aoms/1177704472.

  • Ryzhkov, A. V., and Zrnić D. S., 1998: Discrimination between rain and snow with a polarimetric radar. J. Appl. Meteor., 37, 1228–1240, doi:10.1175/1520-0450(1998)037<1228:DBRASW>2.0.CO;2.

  • Schneebeli, M., Grazioli J., and Berne A., 2014: Improved estimation of the specific differential phase shift using a compilation of Kalman filter ensembles. IEEE Trans. Geosci. Remote Sens., 52, 5137–5149, doi:10.1109/TGRS.2013.2287017.

  • Straka, J. M., Zrnić D. S., and Ryzhkov A. V., 2000: Bulk hydrometeor classification and quantification using polarimetric radar data: Synthesis of relations. J. Appl. Meteor., 39, 1341–1372, doi:10.1175/1520-0450(2000)039<1341:BHCAQU>2.0.CO;2.

  • Testud, J., Bouar E. L., Obligis E., and Ali-Mehenni M., 2000: The rain profiling algorithm applied to polarimetric weather radar. J. Atmos. Oceanic Technol., 17, 332–356, doi:10.1175/1520-0426(2000)017<0332:TRPAAT>2.0.CO;2.

  • Vivekanandan, J., Ellis S. M., Oye R., Zrnić D. S., Ryzhkov A. V., and Straka J., 1999: Cloud microphysics retrieval using S-band dual-polarization radar measurements. Bull. Amer. Meteor. Soc., 80, 381–388, doi:10.1175/1520-0477(1999)080<0381:CMRUSB>2.0.CO;2.

  • Wang, Y. T., and Chandrasekar V., 2009: Algorithm for estimation of the specific differential phase. J. Atmos. Oceanic Technol., 26, 2565–2578, doi:10.1175/2009JTECHA1358.1.

  • Wolberg, G., 1998: Cubic spline interpolation: A review. Columbia University Computer Science Tech. Rep. CUCS-389-88, 14 pp. [Available online at http://hdl.handle.net/10022/AC:P:12065.]

  • Zrnić, D. S., Ryzhkov A., Straka J., Liu Y., and Vivekanandan J., 2001: Testing a procedure for automatic classification of hydrometeor types. J. Atmos. Oceanic Technol., 18, 892–913, doi:10.1175/1520-0426(2001)018<0892:TAPFAC>2.0.CO;2.

  • Fig. 4. Average quality of the wet ice class for different temperature shifts during the maximization loop (block 1 of the diagram in Fig. 1). The number below each curve represents the estimated second-order derivative (curvature; % °C⁻²). All curves show values below (higher curvature) the adopted threshold of −0.3% °C⁻².

  • Fig. 5. Geometry of the contiguity window for (left) an elevation scan and (right) an azimuth scan. The colors represent the cos(ϕ) term in Eq. (9) (penalty term).

  • Fig. 6. Checkerboard representation of the Boolean penalty matrix terms Bp in Eq. (9). The x and y axis labels represent the hydrometeor classes; see text for details. Black boxes in a given column indicate hydrometeors to be penalized above a candidate (top labels), while black boxes in a row indicate hydrometeors to be penalized below a candidate (left labels).

  • Fig. 7. Number of bin classification changes at each iteration of the cluster analysis for three different cases (section 3) and two fixed values of λ [α = 1 in Eq. (10)].

  • Fig. 8. Illustration of the CCL algorithm for the case study reported in Fig. 10. Two connected regions identified as (a) hail and (b) three regions of rain are identified and sequentially labeled (overplotted numbers).

  • Fig. 9. Observed distribution (histograms) of (a) the reflectivity and (b) the correlation coefficient within the connected region 2 classified as hail in Fig. 8a. The solid line represents the corresponding membership function (right y axis), while the dashed line is the kernel-density-estimated PDF of the observations.

  • Fig. 10. The RHI along the 135° azimuth at 2306 UTC 22 Jun 2013 from CHILL radar. (a) Reflectivity and (b) final region-based hydrometeor classification with λ = 0.8 and α = 0.75. See the text for the list of classes in the legend. Panels (c)–(f) are Zdr, Kdp, the bin-based hydrometeor classification after temperature adjustment, and the region-based classification using λ = 1.0 and α = 0.95. The overplotted solid black line in (a)–(d) marks the separation between solid and liquid–mixed phase particles in the final classification of (b).

  • Fig. 11. The RHI along the 45° azimuth at 2257 UTC 2 Jul 2008, from CHILL radar: (a) reflectivity, (b) region-based hydrometeor classification, (c) Kdp, and (d) Zdr.

  • Fig. 12. The PPI at 1.2° elevation from the C-band Bric radar, at 0130 UTC 27 Apr 2009: (a) observed Zh, (b) attenuation-corrected Zdr, (c) ρHV, (d) bin-based hydrometeor classification, (e) cluster-based classification, and (f) final region-based classification.

  • Fig. 13. CASA IP1 RHIs of dual-polarization variables: (a) observed Zh, (b) attenuation-corrected Zh, (c) Zdr, (d) ρHV, (e) Kdp, and (f) final hydrometeor classification at 0414 UTC 20 May 2011, along the azimuth 101.6°.

  • Fig. 14. K-means unsupervised classification (K-means_bin in Table 3) for (a) CHILL 2008 case (Fig. 11) and (b) CASA case (Fig. 13).

  • Fig. 15. Sensitivity to an artificial Gaussian white noise in the input observations (a)–(e) Zh and (f)–(j) Zdr for the cases analyzed in section 3. The plots report the number of polar bins that changed class with respect to the reference hydrometeor map without added noise for the bin-based and the cluster-based classifications (output of block 3 in Fig. 2, with λ = 0.8, α = 0.75). In addition, the right y axes report the fraction (%) of bins changed with respect to the total number of valid bins. To facilitate the interpretation, the area between the two lines is light blue when the cluster-based classification produced fewer class changes and orange otherwise.

  • Fig. 16. As in Fig. 15, but for the input observations (a)–(e) ρHV and (f)–(j) Kdp. The Kdp noise introduced is scaled with the radar frequency.

  • Fig. 17. As in Fig. 15, but the sensitivity to an artificial bias is shown, for the input observations (a)–(e) Zh and (f)–(j) Zdr.
A Semisupervised Robust Hydrometeor Classification Method for Dual-Polarization Radar Applications

  • 1 Colorado State University, Fort Collins, Colorado

Abstract

Most recent hydrometeor classification schemes are based on fuzzy logic. When the input radar observations are noisy, the output classification can also be noisy, since the process is bin based and the information from neighboring radar cells is not considered. This paper employs cluster analysis, in combination with fuzzy logic, to improve the hydrometeor classification from dual-polarization radars using a multistep approach. The first step involves a radar-based optimization of an input temperature profile from auxiliary data. Then a first-guess fuzzy logic processing produces the classification used to initiate a cluster analysis with contiguity and penalty constraints. The result of the cluster analysis is then processed to identify the regions populated with adjacent bins assigned to the same hydrometeor class. Finally, the set of connected regions is passed to the fuzzy logic algorithm for the final classification, exploiting the statistical sample formed by the distribution of the dual-polarization and temperature observations within the regions. Example applications to radars in different environments and meteorological situations, and using different operating frequency bands (namely, S, C, and X bands), are shown. The results are discussed with specific attention to the robustness of the method and the segregation of the data space. The sensitivity to noise and bias in the input variables is also analyzed.

Corresponding author address: Renzo Bechini, Department of Electrical and Computer Engineering, Colorado State University, 1373 Campus Delivery, Fort Collins, CO 80523. E-mail: rbechini@engr.colostate.edu

1. Introduction

Most of the recent hydrometeor classification schemes are based on fuzzy logic (Liu and Chandrasekar 1998; Vivekanandan et al. 1999; Liu and Chandrasekar 2000; Zrnić et al. 2001; Dolan and Rutledge 2009; Al-Sakka et al. 2013). When the fuzzy logic algorithm is applied to noisy input variables, the resulting classification is also noisy. This is essentially because the classification is performed on the individual radar resolution bins (simply bins) with a typical size ~1° × 100 m, irrespective of the surrounding observations. For operational applications in particular, it is desirable to have a smooth and clear output to ease the interpretation by the end user. An obvious way to get a smooth output is by filtering either the input variables or the final classification. In this paper we present an attempt to produce a noiseless classification by combining in a unique way the following items:

  1. the quality (or strength) of the classification

  2. the spatial coherence and self-aggregation propensity of the observations

  3. basic microphysical constraints

The aim is to define an objective and completely automatic procedure that yields a result as close as possible to the classification a human expert would perform. The items listed above are natural sources of information that a radar expert normally considers when partitioning the physical space into different categories based on the multiple inputs provided by the polarimetric radar observations. The adopted approach lies within the class of semisupervised learning and is implemented here around a core cluster analysis module. In this type of learning method, also called the constraint-based method, auxiliary background knowledge or a given amount of labeled data is used to improve the partition of the data space. The technique is basically heuristic: the supervised component of the analysis relies on constraints and rules that can ultimately be considered background knowledge supported by precipitation microphysics.

The proposed method assumes that a standard bin-based classification scheme is already available and is intended to improve the initial output through noise reduction and exploitation of the self-aggregation process inherent in cluster analysis. The ultimate goal of the method is to attain a simplified picture of the hydrometeor distribution, summarized by a limited number of connected regions in the two-dimensional observation space [either plan position indicator (PPI) or range–height indicator (RHI)].

The development of the classification methodology relies considerably on techniques commonly used in data mining and digital image processing. In particular an adaptation of the K-means clustering algorithm (Aggarwal and Reddy 2013) is developed to incorporate a spatial contiguity constraint and a penalty term for the inclusion of microphysical constraints. One of the distinguishing points of the proposed methodology is that the final classification is performed over connected regions, as opposed to traditional methods, which are applied to the individual range bins or Cartesian grid points. The connected regions in the radar observations’ domain can be derived based on digital image analysis techniques. In this work, the connected component labeling (CCL) algorithm (Gonzalez and Woods 2002; Fisher et al. 2003) is employed for the identification and unique labeling of regions populated with adjacent bins assigned to the same hydrometeor class.
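As a rough illustration of the region-identification idea, the sketch below labels connected regions of identical class in a small two-dimensional class map using a breadth-first flood fill. It is only a minimal stand-in for the CCL algorithm cited above: the function name `label_regions`, the choice of 4-connectivity, and the toy class map are assumptions made for illustration, and the azimuthal wrap-around of a full PPI is ignored.

```python
from collections import deque

def label_regions(class_map):
    """Label 4-connected regions of identical class (hypothetical helper).

    class_map: list of rows (ray index, range index) holding class codes,
    with None marking bins without a valid classification.
    Returns a same-shaped grid of region ids (0 = no data) and the region count.
    """
    n_rows, n_cols = len(class_map), len(class_map[0])
    labels = [[0] * n_cols for _ in range(n_rows)]
    n_regions = 0
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if class_map[r0][c0] is None or labels[r0][c0]:
                continue  # no data, or already part of a labeled region
            n_regions += 1
            labels[r0][c0] = n_regions
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < n_rows and 0 <= cc < n_cols
                            and not labels[rr][cc]
                            and class_map[rr][cc] == class_map[r][c]):
                        labels[rr][cc] = n_regions
                        queue.append((rr, cc))
    return labels, n_regions

# Toy class map (assumed values): rain (RA), hail (HA), and missing bins.
toy_map = [
    ["RA", "RA", "HA", None],
    ["RA", "HA", "HA", "RA"],
    ["RA", "RA", None, "RA"],
]
labels, n_regions = label_regions(toy_map)
```

In this toy map the two rain areas are not adjacent, so they receive different region labels even though they share the same hydrometeor class.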

For the definition of the dual-polarization variables’ ranges and membership functions for the set of hydrometeor classes, we rely on the relevant published work, in particular, Straka et al. (2000), Liu and Chandrasekar (2000), and Zrnić et al. (2001) for S band; Baldini et al. (2005), Marzano et al. (2007), and Dolan et al. (2013) for C band; and Dolan and Rutledge (2009), Chandrasekar et al. (2013), and Lim et al. (2013) for X band.

In section 2, the proposed methodology is described in detail, and in section 3 application examples using data from different radars operating at frequencies ranging from S band to X band are illustrated. Section 4 presents a statistical evaluation of the method, and finally section 5 summarizes the main results.

2. Method for noiseless hydrometeor classification

The basic building block of the classification procedure is the widely used fuzzy logic method depicted in Fig. 1. The success of fuzzy logic techniques in hydrometeor classification is likely due to their ease of implementation: all that is needed is a proper set of rules, generally implemented through analytical membership functions. The specific fuzzy logic technique adopted here is essentially based on the classification methods described in Liu and Chandrasekar (1998) and Zrnić et al. (2001), with a few modifications discussed in section 2a. The overall structure of the classification algorithm is represented by the diagram in Fig. 2. The building blocks 1–4 encompassed by the dashed line are discussed in detail in the following subsections. Hereinafter, we will refer to the output of blocks 2–4 in Fig. 2 as bin-based, cluster-based, and region-based classifications, respectively.

Fig. 1.

Diagram of the basic fuzzy logic classification. Adapted from Liu and Chandrasekar (2000).

Citation: Journal of Atmospheric and Oceanic Technology 32, 1; 10.1175/JTECH-D-14-00097.1

Fig. 2.

Diagram of the full four-step classification. The box labeled “fuzzy hydroclass” is represented in detail in Fig. 1. See text for further details.


The input radar variables for the hydrometeor classification are reflectivity Zh, differential reflectivity Zdr, specific differential phase shift Kdp (the range derivative of the differential phase shift Φdp), and cross-polar correlation (horizontal–vertical) coefficient ρHV. The vertical profile of temperature T from either a nearby observed sounding or numerical model output is also considered, leading to a total of Nvar = 5 inputs. In addition, for the identification of nonmeteorological echoes (section 2a), the spatial variance of the differential reflectivity σ(Zdr), the spatial variance of the total differential phase shift σ(Ψdp), and the Doppler velocity V are also used. The observed total differential phase shift is defined as
$$\Psi_{dp} = \Phi_{dp} + \delta_{hv} \qquad (1)$$
where δhv is the differential phase shift upon backscattering (Bringi and Chandrasekar 2001). To avoid overestimating the variability of Zdr and Ψdp due to physical trends in the observations (e.g., Ψdp increasing in heavy rain), the variances are calculated on the residuals after linear regression over five consecutive range bins.
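A minimal sketch of such a detrended texture is given below. It assumes an ordinary least-squares line fitted to equally spaced bins and a population (divide-by-n) variance of the residuals; the function names and the handling of the ray edges are illustrative assumptions, not details given in the text.

```python
def detrended_variance(values):
    """Variance of the residuals after a least-squares straight-line fit.

    values: samples at consecutive, equally spaced range bins (e.g. 5 bins).
    """
    n = len(values)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    slope = sxy / sxx
    residuals = [y - (y_mean + slope * (x - x_mean)) for x, y in zip(xs, values)]
    # Population variance of the residuals (their mean is zero by construction).
    return sum(r * r for r in residuals) / n

def texture(ray, window=5):
    """Detrended variance in a sliding window along one ray (edges left as None)."""
    half = window // 2
    out = [None] * len(ray)
    for i in range(half, len(ray) - half):
        out[i] = detrended_variance(ray[i - half:i + half + 1])
    return out

# A steadily increasing phase profile (assumed values, in degrees) has a
# detrended variance close to zero, although its plain variance is large.
psi_dp = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0]
psi_texture = texture(psi_dp)
```

Removing the linear trend first means that a smooth increase of Ψdp in heavy rain is not mistaken for the bin-to-bin variability that is typical of nonmeteorological echoes.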

The algorithm is designed to work on the radar observations and textures in the two-dimensional (either PPI or RHI) polar domain (radial θ, range R). The choice is in the first instance dictated by the need to limit the overall computational resources for real-time applications, avoiding interpolation of all radar variables and textures. Working in the polar domain also has the notable advantage of avoiding loss of information near the radar and preserving the contiguity properties in the observation space.

The classification is partitioned into Ncl = 11 hydrometeor classes: large drops (LD), drizzle (DR), rain (RA), heavy rain (HR), rain plus hail (RH), hail (HA), graupel (GR), wet ice (WI), dry ice (DI), crystals (CR), and dendrites (DN). In addition to these 11 hydrometeor classes, nonhydrometeor decision classes, namely clutter (CL) and clear air (CA), are considered in the preliminary fuzzy logic classification (section 2a). The most notable difference between the hydrometeor classes used here and those of most previous studies is the addition of a specific class for dendrite crystals. Kennedy and Rutledge (2011) for S-band radar and Bechini et al. (2013) for C-band and X-band radars have shown that dendrite crystals have specific dual-polarization signatures. The quantities Kdp (as high as 2° km⁻¹ at C band) and the environmental temperature (ranging between approximately −10° and −20°C) are the most relevant variables employed for the identification. Another new feature of this classification system is a macro-class (LIQUID, MIXED, SOLID, HAIL) associated with the hydrometeor classes, for use in the final step of the classification (block 4 in Fig. 2). The LIQUID macro-class includes four hydrometeor types (LD, DR, RA, and HR); the MIXED macro-class is composed only of WI; SOLID includes GR, DI, CR, and DN; and HAIL includes RH and HA.
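The class and macro-class definitions listed above can be written down as a small lookup table. The sketch below is only one possible encoding; the variable names `MACRO_CLASSES` and `MACRO_OF` are hypothetical, while the abbreviations and groupings are those given in the text.

```python
# Class abbreviations and macro-class membership as listed in the text.
MACRO_CLASSES = {
    "LIQUID": ("LD", "DR", "RA", "HR"),
    "MIXED": ("WI",),
    "SOLID": ("GR", "DI", "CR", "DN"),
    "HAIL": ("RH", "HA"),
}

# Inverse lookup: hydrometeor class -> macro-class.
MACRO_OF = {
    cls: macro
    for macro, members in MACRO_CLASSES.items()
    for cls in members
}

# The four macro-classes together cover the Ncl = 11 hydrometeor classes.
n_classes = len(MACRO_OF)
```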

a. Preliminary processing and fuzzy logic classification

Several processing tasks that are normally performed separately are integrated here in a single procedure. The reason is twofold: on one hand, the total computational time is reduced; on the other hand, the adopted scheme provides an environment well suited to handling the interconnection between attenuation correction and hydrometeor classification in the development process.

The preliminary processing of the dual-polarization moments includes clutter identification, Ψdp filtering, Kdp calculation, and attenuation correction. The fuzzy logic hydrometeor identification algorithm follows the classical fuzzification–inference–aggregation–defuzzification flow depicted in Fig. 1. In addition to the class for a specific bin, the fuzzy logic algorithm outputs the quality (described hereinafter) associated with the specific class assignment. This is an important point because the quality of the classification is needed in the subsequent cluster analysis (section 2c). To represent the membership functions (MBF), both beta functions (Liu and Chandrasekar 2000) and trapezoidal functions (Zrnić et al. 2001) have been used. Park et al. (2009) proposed asymmetric trapezoidal functions to better approximate the shapes of the probability distributions. Following a similar approach, we define an asymmetric beta distribution, with values ranging from 0 to 1, as
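$$\mathrm{MBF}(x) = \begin{cases} \dfrac{1}{1 + \left[\left(\dfrac{x - m}{a_l}\right)^{2}\right]^{b_l}}, & x < m \\[2ex] \dfrac{1}{1 + \left[\left(\dfrac{x - m}{a_r}\right)^{2}\right]^{b_r}}, & x \ge m \end{cases} \qquad (2)$$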
where x is the value of the input variable (radar or temperature); m is the center of the function; al and ar are the left- and right-side half-widths, respectively; and bl and br are the left- and right-side slopes. In the current implementation, br = bl for all hydrometeor classes, while different values are used for the nonmeteorological classes clutter and clear air. In fact, the parameters for the clutter and clear air classes are derived by means of statistical analysis over samples of known echo type (Gourley et al. 2007), and the values of bl and br can be set to better match the empirical distributions.
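The sketch below evaluates an asymmetric beta membership function of this kind. It assumes the standard beta form of Liu and Chandrasekar (2000), 1 / (1 + |(x − m)/a|^(2b)), with the half-width and slope chosen according to the side of the center on which x falls; the function name `asym_beta` and the parameter values are illustrative assumptions and are not taken from Table 1.

```python
def asym_beta(x, m, a_l, a_r, b_l, b_r):
    """Asymmetric beta membership function with values in (0, 1]."""
    if x < m:
        a, b = a_l, b_l   # left side of the center
    else:
        a, b = a_r, b_r   # right side of the center (and the center itself)
    return 1.0 / (1.0 + abs((x - m) / a) ** (2.0 * b))

# Illustrative parameters only: center 1.0, half-widths 0.5 (left) and 1.0 (right).
params = dict(m=1.0, a_l=0.5, a_r=1.0, b_l=5.0, b_r=5.0)
at_center = asym_beta(1.0, **params)       # membership at x = m
at_left_edge = asym_beta(0.5, **params)    # membership at x = m - a_l
at_right_edge = asym_beta(2.0, **params)   # membership at x = m + a_r
```

With this form the membership equals 1 at the center and 0.5 at m − al and m + ar, which is the half-width convention used for the shaded areas in Fig. 3.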

For the hydrometeor categories, the parameters m, al, ar, bl, and br are derived for each class and radar parameter, based on relevant observational and modeling studies. Specifically, we mainly rely on Straka et al. (2000), Liu and Chandrasekar (2000), and Zrnić et al. (2001) for S band; Baldini et al. (2005), Marzano et al. (2007), and Dolan et al. (2013) for C band; and Dolan and Rutledge (2009), Chandrasekar et al. (2013), and Lim et al. (2013) for X band.

As an example, Table 1 reports the parameters defining the beta function for the four dual-polarization variables at S band. Similar to Zrnić et al. (2001), we deal with physical dependencies between radar variables for specific hydrometeor classes by expressing the parameters m, al, and ar as functions of Zh. This is visually represented in Fig. 3, where the MBF of Zdr is plotted for the large drops and rain–heavy rain classes. The large drops class is defined much like the rain category, following a similar increase with Zh but with higher Zdr. The large drops class can be found either below the freezing level, originating from the melting of large snowflakes, or within strong updrafts, indicating the presence of supercooled liquid drops (Conway and Zrnić 1993; Brandes et al. 1995).

Table 1.

Values of the parameters used to define the asymmetric beta function [Eq. (2)] for the dual-polarization radar variables at S band. When ar ≠ al both values are indicated; otherwise, a single value is reported. For some classes the parameters m, al, and ar are expressed as a function of Zh to account for the expected physical correlation.

Fig. 3.

Membership functions for the input variable Zdr and hydrometeor classes large drops and rain–heavy rain. As reported in Table 1, the values of m, al, and ar are expressed as a function of Zh. The thick solid line represents the 1.0 central value of the beta function m(Zh), while the lower and upper boundaries of the colored areas represent the 0.5 value at m(Zh) − al(Zh) and m(Zh) + ar(Zh), respectively (al = ar in this case).

Citation: Journal of Atmospheric and Oceanic Technology 32, 1; 10.1175/JTECH-D-14-00097.1

The inference rule is given by the weighted sum of the MBF of temperature and radar parameters (Zrnić et al. 2001; Lim et al. 2005):
$$
S_k = \frac{\sum_i w_k(x_i)\,\mathrm{MBF}_k(x_i)}{\sum_i w_k(x_i)}
\tag{3}
$$
where wk(xi) are the weights associated with the input variable xi and hydrometeor class k. Term Sk is the resulting confidence associated with the class k, and the output class is the one for which Sk is maximum. The additive inference rule has the advantage of mitigating the effect of measurement errors, especially on the dual-polarization parameters, which may show unreliable values mainly because of partial beam filling, sidelobe effects, and low signal-to-noise ratio. The weights are assigned a priori based on the confidence and the discriminating capability of the temperature and radar variables for every hydrometeor type.
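The additive inference rule can be sketched as follows. The sketch assumes that the weighted sum is normalized by the sum of the weights, so that the confidence stays in the 0–1 range; the function names, the class names, and all membership values and weights are hypothetical.

```python
def class_confidence(mbf_values, weights):
    """Weighted mean of the membership values of one class.

    Normalizing by the sum of the weights is an assumption of this sketch.
    """
    total_weight = sum(weights)
    return sum(w * v for w, v in zip(weights, mbf_values)) / total_weight


def classify(mbf_by_class, weights_by_class):
    """Return (best_class, confidences) for a single radar bin."""
    confidences = {
        name: class_confidence(mbf_by_class[name], weights_by_class[name])
        for name in mbf_by_class
    }
    best = max(confidences, key=confidences.get)
    return best, confidences


# Hypothetical membership values for (Zh, Zdr, Kdp, rhoHV, T) and weights
mbf = {"rain": [0.9, 0.8, 0.6, 1.0, 1.0], "hail": [0.7, 0.2, 0.5, 0.4, 0.3]}
wts = {"rain": [1.0, 0.8, 0.5, 0.6, 1.0], "hail": [1.5, 0.8, 0.5, 0.6, 1.0]}
best, conf = classify(mbf, wts)
print(best, conf)
```

Because the rule is additive, a single unreliable input (for example, a noisy Kdp) lowers the confidence of a class without setting it to zero, which is the robustness property discussed above.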

The weight associated with the temperature is not a priori fixed as for the radar variables but is defined to be a parabolic function of the temperature itself, with a minimum value (1.0) at +1.25°C. The weights then double at −14°C (+16°C) and triple at −20°C (+23°C). Lim et al. (2005) proposed a hybrid scheme, where the sum inference rule is used for Zdr, Kdp, and ρHV, while the product rule is used for Zh and T. In such a scheme the reflectivity and temperature are strong constraints, with the advantage of reducing most misclassifications due to overlapping dual-polarization radar MBF. On the other hand, the multiplicative membership function of T introduces very sharp and often unrealistic transitions across precipitation phase changes. The use of the parabolic weight for the temperature is intended to exploit the temperature information to minimize misclassifications in regions far from phase transitions and to rely more on radar information to discriminate near the melting layer.
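A parabola that approximately reproduces the quoted values can be sketched as below. The exact coefficients are not given in the text, so the 15°C scale is an assumption chosen here so that the weight roughly doubles and triples at the stated temperatures; the function name is hypothetical.

```python
def temperature_weight(t_celsius, t_min=1.25, scale=15.0):
    """Parabolic temperature weight with minimum 1.0 at t_min (deg C).

    The scale (deg C) is an assumed value, not taken from the paper.
    """
    return 1.0 + ((t_celsius - t_min) / scale) ** 2


for t in (-20.0, -14.0, 1.25, 16.0, 23.0):
    print(t, round(temperature_weight(t), 2))
```

With this choice the weight is close to 2 at −14°C and +16°C and close to 3 at −20°C and +23°C, so the temperature contributes little near the melting layer and progressively more away from it.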

A special feature of the algorithm implementation is that the clutter class is treated exactly as an additional class and is identified (optionally, depending on the specific signal processor filtering activated) within the same fuzzy logic volume processing. This allows a reduced total computation time. More specifically, within the loop over the radials, the clutter class is identified at radial n by imposing a threshold value on its confidence (when Sclutter > 0.5, the bin is flagged as clutter, irrespective of the confidence associated with the other meteorological classes). The rest of the hydrometeor classes are treated within the same loop but considering the preceding radial n − 1, which is already flagged for clutter. This is devised to allow possible calculation of averages and textures in range for the purpose of hydrometeor classification, excluding the bins marked as clutter.

The processing of the nth radial also includes filtering Ψdp using the Hubbert and Bringi (1995) scheme to remove δhv, after the clutter bins are flagged. Term Kdp is calculated by a linear moving window regression in a two-step procedure: a first step estimate is obtained over a large window (e.g., ±16 range bins). In the second step, the initial Kdp estimate is used to set the proper window width (smaller for higher Kdp). In this way it is possible to retain the small-scale variability in heavy precipitation and at the same time limit the noise in weaker precipitation regions. More sophisticated algorithms are available for the estimation of Kdp, but mainly for application to rainfall estimation (Wang and Chandrasekar 2009; Otto and Russchenberg 2011; Schneebeli et al. 2014). For the purpose of hydrometeor classification, the current approach has the advantage of being simple to implement, robust [it provides an estimate of Kdp even in regions of low signal-to-noise ratio (SNR) where Ψdp is more noisy], and completely independent of the other polarimetric variables. However, if a Kdp estimate is already available from a separate processing, then this can be used as input for the hydrometeor classification.
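The two-step moving-window regression can be sketched as follows, using the standard relation Kdp = ½ dΦdp/dr applied to the filtered differential phase. The ±16-bin first-pass window comes from the text; the Kdp thresholds and the narrower window widths in the second pass are hypothetical values chosen only for illustration.

```python
def slope(x, y):
    """Least-squares slope of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def kdp_window(phidp, ranges_km, i, half):
    """Kdp (deg/km) at bin i: half the regression slope over +/- half bins."""
    lo, hi = max(0, i - half), min(len(phidp), i + half + 1)
    return 0.5 * slope(ranges_km[lo:hi], phidp[lo:hi])


def kdp_two_step(phidp, ranges_km, wide=16, narrow_by_kdp=((2.0, 4), (0.5, 8))):
    """First pass over a wide window, second pass with a Kdp-dependent window.

    narrow_by_kdp holds hypothetical (Kdp threshold, half-width) pairs,
    ordered from the highest threshold to the lowest.
    """
    out = []
    for i in range(len(phidp)):
        first = kdp_window(phidp, ranges_km, i, wide)
        half = wide
        for threshold, narrow in narrow_by_kdp:
            if first >= threshold:
                half = narrow  # smaller window for higher Kdp
                break
        out.append(kdp_window(phidp, ranges_km, i, half))
    return out


# Synthetic differential phase increasing at 2 deg/km, i.e., Kdp = 1 deg/km
ranges = [0.1 * i for i in range(60)]
phidp = [2.0 * r for r in ranges]
kdp = kdp_two_step(phidp, ranges)
```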

The rain profiling algorithm based on Testud et al. (2000) is applied to correct the horizontal reflectivity for path attenuation, while differential attenuation is linearly estimated from the horizontal attenuation αH (Bringi et al. 1990). Attenuation correction is performed exclusively in the rain medium (i.e., below the freezing level) to avoid introducing a bias due to differential phase increases in the ice region. However, especially for X band, the attenuation-corrected measurements are subject to increasing uncertainty with range, because of the inherent approximation of the attenuation estimate. An obvious implication is that the Zh and Zdr measurements at farther ranges, when attenuation is relevant, should be given less confidence in the hydrometeor classification. This is implemented by applying an arbitrarily defined attenuation-dependent and exponentially decaying weight to Zh and Zdr [multiplying the weight wk(xi) in Eq. (3)], as a function of the estimated horizontal attenuation:
e4
where the values of the decay parameter are set to 20 dBZ for Zh and 10 dB for Zdr.

Before hydrometeor classification is performed, optional range averaging over the radar observations can be applied, excluding the bins flagged as clutter.

Based on the MBF defined in Eq. (2), it is possible to define the quality (or strength) of the classification in several ways, for example:

  1. confidence of the identified particle class [maximum confidence among all hydrometeor classes: maxk(Sk)];

  2. difference between the confidence of the first and second identification (Al-Sakka et al. 2013);

  3. difference between the confidence of the first identification and the highest confidence of a particle with a different macro-class.

After a detailed evaluation, the simplest process (method 1) was chosen for exhibiting the most robust performance in the definition of the seeds for the cluster analysis (section 2c).
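The three candidate quality definitions listed above can be compared with a short sketch. The function names, the class names, and the macro-class mapping are hypothetical; the confidences are assumed to be the Sk values of a single bin.

```python
def quality_metrics(confidences):
    """Return (q_max, q_gap) for one bin: methods 1 and 2 in the list above."""
    ranked = sorted(confidences.values(), reverse=True)
    q_max = ranked[0]
    q_gap = ranked[0] - ranked[1] if len(ranked) > 1 else ranked[0]
    return q_max, q_gap


def quality_macro_gap(confidences, macro_class):
    """Method 3: gap to the best class belonging to a different macro-class."""
    best = max(confidences, key=confidences.get)
    others = [s for k, s in confidences.items()
              if macro_class[k] != macro_class[best]]
    return confidences[best] - max(others) if others else confidences[best]


# Hypothetical confidences and macro-class mapping
conf_bin = {"rain": 0.8, "drizzle": 0.7, "hail": 0.4}
macro = {"rain": "LIQUID", "drizzle": "LIQUID", "hail": "SOLID"}
q1, q2 = quality_metrics(conf_bin)
q3 = quality_macro_gap(conf_bin, macro)
```

In this example method 2 yields a low quality because two liquid classes have similar confidence, whereas method 1 only reflects the confidence of the winning class.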

b. Adjustment of the freezing-level height

Because of the uncertainty typically affecting the temperature vertical profile (space–time representativeness of observed soundings, model uncertainty), a preliminary step in the algorithm (block 1 in Fig. 2) aims at fine-tuning the location of the freezing level. In the literature several approaches have been proposed, based on the dual-polarization measurements with a focus on the location of the minimum of ρHV (Ryzhkov and Zrnić 1998; Giangrande et al. 2008), or considering the change in the vertical gradient of Zdr (Lim et al. 2005). Here, an iterative procedure is adopted, to adjust the level of the 0°C isotherm based on the results of the bin-based hydrometeor classification itself. A direct search algorithm (Hooke and Jeeves 1961) is used to find the temperature increment (positive or negative) that maximizes the average quality of the bins identified as wet ice. The proposed recursive approach exploits all the dual-polarization information available, without the need for additional specific processing.

The temperature optimization for the application examples discussed in section 3 is represented in Fig. 4. To accept the estimated temperature shift, the curvature of the wet ice quality versus dT curves represented in the plots is considered. A threshold on the second-order derivative estimated from nonuniformly spaced data points (Wolberg 1998) is used in the current implementation, which allowed for accepting all temperature shifts corresponding to the maxima in Fig. 4.
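The acceptance test on the curvature can be sketched as below. For simplicity the sketch scans a fixed set of temperature shifts instead of running the Hooke and Jeeves direct search, and it estimates the second derivative with the three-point formula for nonuniformly spaced points. The callable `wet_ice_quality` is a hypothetical stand-in for rerunning the bin-based classification with a shifted profile and averaging the quality (in %) of the wet ice bins.

```python
def curvature(p0, p1, p2):
    """Second derivative of the parabola through three (x, y) points."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    return 2.0 * (y0 / ((x0 - x1) * (x0 - x2))
                  + y1 / ((x1 - x0) * (x1 - x2))
                  + y2 / ((x2 - x0) * (x2 - x1)))


def best_shift(shifts, wet_ice_quality, max_curvature=-0.3):
    """Return the shift maximizing the quality, or None if the peak is too flat."""
    samples = sorted((dt, wet_ice_quality(dt)) for dt in shifts)
    i = max(range(len(samples)), key=lambda j: samples[j][1])
    if i == 0 or i == len(samples) - 1:
        return None  # maximum on the edge of the scanned interval
    c = curvature(samples[i - 1], samples[i], samples[i + 1])
    return samples[i][0] if c < max_curvature else None


# Synthetic quality curves (percent) peaking at a shift of -2 deg C
shifts = [-5.0, -3.75, -2.0, -1.0, 0.5, 2.0]
sharp = lambda dt: 80.0 - 0.5 * (dt + 2.0) ** 2   # curvature -1.0
flat = lambda dt: 80.0 - 0.05 * (dt + 2.0) ** 2   # curvature -0.1
print(best_shift(shifts, sharp), best_shift(shifts, flat))
```

The sharp curve is accepted because its curvature lies below the −0.3% °C−2 threshold, while the flat curve is rejected and the original profile would be kept.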

Fig. 4.

Average quality of the wet ice class for different temperature shifts during the maximization loop (block 1 of the diagram in Fig. 2). The number below each curve represents the estimated second-order derivative (curvature; % °C−2). All curves show values below (higher curvature) the adopted threshold of −0.3% °C−2.


Finally, the whole temperature profile is shifted by the resulting increment. When the absolute value of this final temperature increment is significant (e.g., >0.5°C), the new estimation of the freezing-level height is used to rerun the attenuation correction and the bin-based hydrometeor classification. Although discontinuities may arise from one scan to the next, in an operational implementation, a stable estimate should be attained through filtering the available preceding time series of the temperature shift.

The current implementation relies on the four polarimetric radar variables (Zh, Zdr, ρHV, and Kdp) and temperature for the identification of the wet ice class, as for all the other hydrometeor classes. It is likely that the adjustment of the freezing level could be further improved by additionally considering the variance of Ψdp and the vertical derivatives of Zdr specifically for the wet ice category. In fact, these additional parameters should improve the identification of wet ice and also make it less susceptible to small biases in Zh and Zdr.

c. Cluster analysis

The bin-based fuzzy logic classification performed using the adjusted temperature profile is the initial step of the cluster analysis (block 3 in Fig. 2). The centroids of each hydrometeor class are calculated and used as initial values (seeds) for an analysis based on K-means clustering (Aggarwal and Reddy 2013).

The K centroids (K being the number of hydrometeor classes found in the current data: K ≤ Ncl) are defined by a vector μk whose components are the quality-weighted average values of the radar variables Zh, Zdr, Kdp, and ρHV (observation vector x) for each hydrometeor class. The ith component of the vector μk can be expressed as follows:
$$
\mu_k^{i} = \frac{\sum_{n=1}^{N_k} S_k(n)\, x_i(n)}{\sum_{n=1}^{N_k} S_k(n)}
\tag{5}
$$
where Sk is the confidence [Eq. (3)] associated with the class assigned by the bin-based classification to the nth bin and xi is the ith radar variable. The quantity Nk is the number of observations (bins) that have been assigned to the class k ∈ [1:K].
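The quality-weighted centroids used as seeds can be computed with a few lines. This is a minimal sketch: the function name, the class names, and the observation values are hypothetical, and only two variables (Zh, Zdr) are used to keep the example short.

```python
def weighted_centroids(observations, labels, confidences):
    """Quality-weighted mean observation vector for each class present."""
    sums, weights = {}, {}
    for x, k, s in zip(observations, labels, confidences):
        acc = sums.setdefault(k, [0.0] * len(x))
        for i, xi in enumerate(x):
            acc[i] += s * xi          # numerator: sum of S_k * x_i
        weights[k] = weights.get(k, 0.0) + s  # denominator: sum of S_k
    return {k: [v / weights[k] for v in acc] for k, acc in sums.items()}


# Hypothetical (Zh, Zdr) observations, bin-based classes, and confidences
obs = [(40.0, 1.0), (50.0, 3.0), (20.0, 0.2)]
cls = ["rain", "rain", "drizzle"]
qual = [0.5, 1.0, 0.8]
seeds = weighted_centroids(obs, cls, qual)
print(seeds)
```

Bins classified with low confidence contribute less to the seed, so the initial centroids are dominated by the most reliable identifications.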

The temperature T is not included in the cluster analysis, which entirely relies on the radar observations. In fact the inclusion of T in this step may lead to an excessive vertical stratification in the resulting analysis. In addition, only the 11 hydrometeor classes are considered hereinafter; that is, the identification of clutter and clear air is finalized in the bin-based classification step. It is also worth emphasizing that the membership functions and associated weights for the radar parameters are not used in the clustering process, which is designed to extract additional information from the observations with a “learn from data” approach.

The K-means clustering approach is used to partition the data space by iteratively computing the affinity between a hydrometeor assignment and each of the K already existing clusters. Lakshmanan et al. (2001) proposed a segmentation technique for the radar reflectivity using textures. Applying a K-means clustering approach to the vector x of dual-polarization variables, under a smoothness assumption, a cost function Ek can be defined for every bin in the polar domain:
$$
E_k = \lambda D_k + (1 - \lambda)\, C_k + \gamma P_k
\tag{6}
$$
where k indicates the hydrometeor class. Term Dk is the Mahalanobis distance (Mahalanobis 1936) between the cluster vector μk for the class k and the observation vector x for the current bin, defined as
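$$
D_k = \sqrt{(\mathbf{x} - \boldsymbol{\mu}_k)^{T}\, \Sigma^{-1}\, (\mathbf{x} - \boldsymbol{\mu}_k)}
\tag{7}
$$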
where Σ is the covariance matrix of the observations and the superscript T indicates the transposed vector. Since the covariance matrix is, in general, positive definite, the matrix Σ can be first decomposed into a product of a lower triangular matrix and its transpose using Cholesky decomposition and then inverted. The Mahalanobis distance is widely used in cluster analysis and provides an objective method for the normalization of the observation vector, whose components (Zh, Zdr, Kdp, and ρHV) represent different variables with different physical units [dBZ, dB, (°) km−1, and unitless, respectively]. The use of the Mahalanobis distance for vector normalization allows for overcoming one of the typical limitations of K-means clustering, that is, the tendency to produce spherical clusters arising from the adoption of the Euclidean metric.
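The Mahalanobis distance can be evaluated through the Cholesky factor as sketched below. Instead of explicitly inverting the covariance matrix, this sketch solves the lower-triangular system by forward substitution, which gives the same distance; the function names and the example covariance values are hypothetical.

```python
import math


def cholesky(a):
    """Lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def mahalanobis(x, mu, cov):
    """Distance sqrt((x - mu)^T cov^-1 (x - mu)) via forward substitution."""
    low = cholesky(cov)
    d = [xi - mi for xi, mi in zip(x, mu)]
    z = []
    for i in range(len(d)):  # solve low @ z = d
        z.append((d[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    return math.sqrt(sum(v * v for v in z))


# Hypothetical two-variable example with correlated components
print(mahalanobis((1.0, 1.0), (0.0, 0.0), [[2.0, 1.0], [1.0, 2.0]]))
```

With a diagonal covariance the distance reduces to a Euclidean distance on variables scaled by their standard deviations, which is how the metric removes the dependence on the physical units of Zh, Zdr, Kdp, and ρHV.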
Term Ck is a measure of contiguity given by the number of neighboring bins with a hydrometeor class different from the candidate class k for the current bin:
e8
where Jt,r is the class of the neighboring bin in the polar array; NR and Nθ are the number of bins in range and azimuth (or elevation), respectively; and Ntot is the number of valid observations, defined as
eq1
Terms NR and Nθ are set according to a desired width and height of the contiguity window, typically of the order of ~(1 km × 1 km) or less. For the higher elevations in vertical scans (RHI), it is necessary to cut the polar contiguity window to keep a uniform geometry in the Cartesian projection, as shown in Fig. 5.
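A contiguity measure of this kind can be sketched as follows. The sketch makes several assumptions that are not stated in the text: the count is normalized by the number of valid neighbors, the candidate bin itself is excluded, invalid bins are marked with `None`, and neither azimuthal wraparound nor the cut of the window at high elevations is handled. All names are hypothetical.

```python
def contiguity(class_map, t, r, k, half_t=1, half_r=1):
    """Fraction of valid neighbours of bin (t, r) whose class differs from k.

    class_map[t][r] holds the class of the bin at ray t and range gate r,
    or None where no valid observation is available.
    """
    different = valid = 0
    for tt in range(t - half_t, t + half_t + 1):
        for rr in range(r - half_r, r + half_r + 1):
            if (tt, rr) == (t, r):
                continue  # skip the candidate bin itself
            if 0 <= tt < len(class_map) and 0 <= rr < len(class_map[0]):
                neighbour = class_map[tt][rr]
                if neighbour is not None:
                    valid += 1
                    different += neighbour != k
    return different / valid if valid else 0.0


# Hypothetical 3 x 3 classification patch
patch = [["rain", "rain", "hail"],
         ["rain", "hail", "hail"],
         [None, "rain", "rain"]]
print(contiguity(patch, 1, 1, "rain"), contiguity(patch, 1, 1, "hail"))
```

For the central bin the candidate class rain disagrees with two of the seven valid neighbors, while hail disagrees with five, so the contiguity term favors reassigning the isolated hail bin to rain.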
Fig. 5.

Geometry of the contiguity window for (left) an elevation scan and (right) an azimuth scan. The colors represent the cos(ϕ) term in Eq. (9) (penalty term).


Term Pk is a penalty term introduced to provide some weak physical constraints to the cluster analysis. In the actual configuration, the constraints only involve the relative position in the vertical of different hydrometeors. For example, it is more likely to find hail rather than rain above a candidate rain-plus-hail bin. The checkerboard diagram in Fig. 6 illustrates the implemented pairwise constraints. The columns indicate the unlikely hydrometeors (black boxes) above a given candidate k (top labels), while the rows indicate the unlikely hydrometeors below a given candidate (left labels). Those penalty terms add up after proper weighting according to the relative position of the neighboring bin with respect to the candidate bin. The penalty term Pk can then be expressed as
e9
where Bp represents the Boolean penalty matrix in Fig. 6 (black boxes are ones and white boxes are zeroes), with k representing the column index and Jt,r representing the row index. Term ϕ is the angle between the vertical and the line connecting the candidate bin k and the neighboring bin. The polar bins in Fig. 5 are colored according to the weighting term cos(ϕ). A good performance has been achieved over many different cases with the multiplying factor γ in Eq. (6) set to a value of the order of 10. The penalty term is clearly more relevant for RHI scans, where the observations are distributed on a vertical plane rather than on a low-elevation PPI conical surface (Fig. 5).
Fig. 6.

Checkerboard representation of the Boolean penalty matrix terms Bp in Eq. (9). The x and y axis labels represent the hydrometeor classes; see text for details. Black boxes in a given column indicate hydrometeors to be penalized above a candidate (top labels), while black boxes in a row indicate hydrometeors to be penalized below a candidate (left labels).


Figure 7 gives a picture of the typical convergence rate for two values of λ (λ = 0.2 and λ = 0.9). With low values of λ (more weight to contiguity), the convergence is, in general, faster. Decreasing the value of λ with increasing iteration n was found to be a good option to provide a robust classification result in few iterations:
e10
Fig. 7.

Number of bin classification changes at each iteration of the cluster analysis for three different cases (section 3) and two fixed values of λ [α = 1 in Eq. (10)].


In this way during the first iterations, the K clusters may change significantly by self-adapting to the observations, while the last iterations provide the desired spatial smoothing. The iteration is stopped when the number of bins changed in the classification array falls below a given threshold (1% of the total number of valid bins).

d. Region-based classification

Depending on the value of the parameters in the cost function [Eq. (6)], especially when high values of λ are used, the K-means cluster analysis may cause some clusters to drift too much from the initial values. As a consequence the original hydrometeor class may become no longer appropriate for a given region. The formulation of the cost function with increasing emphasis on the contiguity component [Eq. (10)] yields a virtually noiseless classification map. On this map it is practical to identify a limited number of connected regions (Nr) by running a CCL algorithm (Gonzalez and Woods 2002; Fisher et al. 2003). The implemented CCL algorithm works by checking for eight connections (four horizontal and vertical neighbors, and four diagonal neighbors) in binary input images. One hydrometeor class is processed at a time, by setting all the pixels marked with the current class to one and setting the pixels with a different class to zero. From this input binary image, all the connected regions are sequentially labeled. An example illustrating the CCL algorithm results for two hydrometeor classes is reported in Fig. 8, where the labels identifying the connected regions are represented by different gray levels. In this example, referring to the case study analyzed in section 3a, two separate connected regions of hail and three regions of rain are identified. These plots elucidate one of the main features of the proposed approach, that is, the aim to deal with homogeneous and spatially connected regions instead of individual range bins.
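The labeling step can be illustrated with a compact eight-connectivity implementation based on a breadth-first flood fill. This is a generic sketch of connected-component labeling, not the specific algorithm used in the implementation; the function name and the small classification map are hypothetical.

```python
from collections import deque


def label_regions(class_map, target):
    """Label 8-connected regions of bins equal to target; 0 marks other bins."""
    rows, cols = len(class_map), len(class_map[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for i in range(rows):
        for j in range(cols):
            if class_map[i][j] != target or labels[i][j]:
                continue
            count += 1                      # start a new connected region
            labels[i][j] = count
            queue = deque([(i, j)])
            while queue:
                ci, cj = queue.popleft()
                for di in (-1, 0, 1):       # visit the eight neighbours
                    for dj in (-1, 0, 1):
                        ni, nj = ci + di, cj + dj
                        if (0 <= ni < rows and 0 <= nj < cols
                                and class_map[ni][nj] == target
                                and not labels[ni][nj]):
                            labels[ni][nj] = count
                            queue.append((ni, nj))
    return labels, count


# Hypothetical map: H = hail, R = rain
grid = [["H", "R", "R"],
        ["R", "H", "R"],
        ["R", "R", "R"],
        ["H", "R", "R"]]
hail_labels, n_hail = label_regions(grid, "H")
rain_labels, n_rain = label_regions(grid, "R")
```

Processing one class at a time is equivalent to building the binary image described above: bins of the current class are treated as ones and all other bins as zeros. The two diagonal hail bins merge into one region because diagonal neighbors count as connected.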

Fig. 8.

Illustration of the CCL algorithm for the case study reported in Fig. 10. (a) Two connected regions of hail and (b) three connected regions of rain are identified and sequentially labeled (overplotted numbers).


The fuzzy logic hydrometeor classification can now be applied to the set of connected regions in a way that is similar to the way it is applied to the individual bins. It is in principle only necessary to identify a proper parameter (typically the mean or the median) representing the underlying statistical distribution for every region. Alternatively, a direct comparison between the probability density function (PDF) of the univariate observations and the associated membership function can be performed. This latter approach is illustrated in Fig. 9. In practice the value of MBFk(xi) in Eq. (3) is replaced by an affinity measure defined as
$$
\mathrm{AFF}_k(x_i) = 1 - \frac{A_{\mathrm{out}}}{A_{\mathrm{tot}}}
\tag{11}
$$
where Aout is the portion of the area under the PDF (dashed line in Fig. 9) for variable xi lying outside the domain defined by the membership function for the class k (hatched area in Fig. 9). Term Atot is the total integral of the PDF, obtained from the observations using a Gaussian kernel smoothing (Parzen 1962). The AFFk(xi) parameter, spanning the 0–1 range just like the membership function in Eq. (2), provides a measure of the conformity of the observed distribution to the theoretical expectation. For most cases this definition provides an estimate similar to the one simply (and quickly) obtained using the average or median value of the distribution. In some instances, however (for example, highly nonnormal distributions or distributions with large tails exceeding both sides of the membership function), it gives a more robust indication of the affinity of the region with a given hydrometeor class. The classification then proceeds as for the bin-based application described in section 2a.
Fig. 9.

Observed distribution (histograms) of (a) the reflectivity and (b) the correlation coefficient within the connected region 2 classified as hail in Fig. 8a. The solid line represents the corresponding membership function (right y axis), while the dashed line is the kernel-density-estimated PDF of the observations.


The resulting region-based classification may either confirm the original class of the connected region (which means that the cluster analysis did not drift excessively) or propose a new assignment. At this point, we can further exploit the availability of a statistical sample for every connected region (as opposed to the single-value bin) to introduce some simple physically based rules. These rules rely on the 0%, 25%, 50%, 75%, and 100% quantiles of the temperature distributions and are implemented as if statements within the loop over the hydrometeor classes, for a given connected region nreg. With k being the index of the candidate class:

  • If macro-class(k) is LIQUID

    • – if then skip k

    • – if then skip k

  • If macro-class(k) is SOLID

    • – if then skip k

    • – if then skip k

The superscript of T represents the percentile of the temperature. Terms and are the 25% and 75% quantiles, respectively, over all the regions identified as wet ice. Therefore, some of the above-mentioned conditions are only applicable if at least one wet ice region is present in the classification map. The scope of the rules is to avoid some clearly unreliable situations, for example, when a rain region is entirely above the melting layer. This kind of situation is not likely to occur but is still possible, since the temperature information is not included in the cluster analysis. Other common features, in convective systems, are regions of liquid hydrometeors extending from near the surface to above the ambient freezing level within an updraft. These regions will not be affected by the adopted rules and will be preserved.

The region is finally reassigned only if the quality of the new assignment is higher than the quality of the original class by a given relative threshold (parameter β in block 4 of the diagram in Fig. 2). The higher the value of β, the higher the confidence placed in the self-adapting behavior of the data. Lower values of β are more suitable for an operational conservative approach. In the following application examples, a value of β = 4% is used. This means that the new proposed assignment must have a confidence at least 4% higher than the confidence associated with the original class to be accepted.
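The acceptance rule for a proposed reassignment can be written as a one-line test. The sketch interprets the threshold as relative to the quality of the original class, as stated above; the function name and the example qualities are hypothetical.

```python
def reassign(original_class, original_quality, new_class, new_quality, beta=0.04):
    """Keep the original class unless the new quality exceeds the original
    quality by at least the relative margin beta."""
    if (new_class != original_class
            and new_quality >= original_quality * (1.0 + beta)):
        return new_class
    return original_class


# Hypothetical region qualities: the threshold is 0.50 * 1.04 = 0.52
print(reassign("hail", 0.50, "graupel", 0.53))  # accepted
print(reassign("hail", 0.50, "graupel", 0.51))  # rejected
```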

The region-based classification is not performed if a region is too scarcely populated (<50 bins) because a reliable PDF may not be estimated. The bins of these regions are assigned after the loop in block 4 (Fig. 2) is completed, based on the mode of the surrounding bins.

Although at the moment only a few very simple rules are adopted, it is clear that working with connected regions considerably expands the potential of the classification. The properties of the spatially connected statistical samples (density function, orientation, vertical/horizontal extension, etc.) can be further exploited to introduce more physically based constraints.

3. Application examples

To test the method, dual-polarization data from different radar systems operating at S, C, and X bands are considered (Table 2). In particular, the following examples are based on measurements collected by the S-band University of Chicago–Illinois State Water Survey (CHILL) Colorado State University radar in Colorado (CSU–CHILL 2014), the C-band Bric della Croce (Bric) radar in Italy (Bechini and Cremonini 2002), and the Collaborative Adaptive Sensing of the Atmosphere (CASA) Integrated Project 1 (IP1) X-band system in Cyril, Oklahoma (Junyent et al. 2010).

Table 2.

Main characteristics of the three radars used for the application and evaluation of the method. SHV stands for simultaneously transmitting H and V polarizations. The PRF of the C-band radar is staggered with a ratio of .


The classification may vary significantly depending on the choice of the parameters λ, α, and γ in Eqs. (6) and (10). In the following examples, except where explicitly noted, the initial value of λ is set to 0.8, α = 0.75, and γ = 12.0. The λ parameter in particular determines the adopted approach to the cluster analysis. High values of λ allow the clusters to vary significantly from the first-guess classification. The input variables may assume values departing from the initial definition of the classes through the membership functions. Low values of λ, on the other hand, produce a classification relying more on spatial contiguity. In this case the distribution of the input variables within the regions may widen, although the average values will not change significantly. With the set of chosen values, it is found that, in general, less than 10% of the regions are reassigned in the final step of the classification (fuzzy logic applied to the connected regions, as depicted by block 4 in Fig. 2). This is considered an acceptable compromise between a heavy data-adaptive approach (high values of λ) and an analysis based on spatial contiguity.

a. S band, deep convection (Colorado)

Figure 10 shows an RHI scan through a deep convective storm in Colorado observed by the CHILL radar at 2306 UTC 22 June 2013. The radar operated in alternate transmit simultaneous receive (ATSR) transmission, with 128 integrated pulses and a radial resolution of 0.25°. The closest and most recent temperature profile available for real-time processing is from the Denver, Colorado, 1200 UTC sounding. From this profile the height of the freezing level is located around 4650 m MSL. The temperature optimization loop (block 1 in Fig. 2) provides a considerable temperature shift of −3.75°C, as inferred from the average quality of the wet ice bins in Fig. 4. Although the analyzed scene is mainly convective, the more stratiform region closer to the radar provides enough data for a fairly robust adjustment of the freezing-level height (relocated around 4000 m MSL), as can be inferred by visual inspection of the differential reflectivity field in the 20–60-km range. In particular, the separation line between solid and liquid/mixed phase particles in Fig. 10c follows quite well the boundary between marked positive and weakly positive Zdr. The bin-based classification after the temperature adjustment is represented in Fig. 10e, while Fig. 10f shows the region-based final classification using a different pair of coefficients (λ = 1.0, α = 0.95). Both region-based classifications (Figs. 10b and 10f) show a considerable improvement with respect to the first-guess bin-based map (Fig. 10e) in terms of noise reduction, but the choice of the λ and α coefficients determines some difference in the output results. As previously mentioned, the classification with λ = 0.8 and α = 0.75 is a compromise between a conservative approach (low λ), which would basically act like a spatial filter, and a more data-adaptive approach (high λ). In this case the final classification is basically the output of the cluster analysis (block 3 in Fig. 2), since only a couple of small wet ice connected regions (composed of fewer than 100 polar bins) have been reassigned in the final step (block 4).

Fig. 10.

The RHI along the 135° azimuth at 2306 UTC 22 Jun 2013 from CHILL radar. (a) Reflectivity and (b) final region-based hydrometeor classification with λ = 0.8 and α = 0.75. See the text for the list of classes in the legend. Panels (c)–(f) are Zdr, Kdp, the bin-based hydrometeor classification after temperature adjustment, and the region-based classification using λ = 1.0 and α = 0.95. The overplotted solid black line in (a)–(d) marks the separation between solid and liquid–mixed phase particles in the final classification of (b).


The use of high values of λ through the K-means cluster iterations (λ = 1.0, α = 0.95), on the other hand, may significantly modify the characteristics of the clusters. Several regions showing a major evolution in their extension and properties have been reassigned based on the fuzzy logic classification of the connected regions. Specifically, five regions (three of which are composed of more than 10³ polar bins) have been reassigned in the final step of the classification, out of 36 total connected regions. Some of the large drizzle regions in Fig. 10f were previously classified as wet ice after the cluster analysis. These regions have grown considerably during the K-means cluster iterations, reaching an extension from the freezing level down to the surface, which caused the subsequent automatic reassignment to a more plausible class. This specific behavior can be seen as a lack of a well spatially defined signature for the wet ice (although still sufficient for the freezing-level adjustment). A moderate enhancement in Zdr is apparent, while both Zh and ρHV do not present the expected spatial trend in a typical well-defined melting layer.

The most relevant difference with respect to the classification in Fig. 10b is the lack of a hail region, replaced in the lowest levels by a slightly enlarged rain-plus-hail region. The rain-plus-hail class is distinguished from the overhanging graupel class mainly for the lower correlation coefficient (and higher temperature), while Kdp reaches values up to 1° km−1 in the rain class in the 65–75-km range but Kdp ~ 0° km−1 between 75 and 80 km near the surface and above (Fig. 10d). The hail area aloft grew during the cluster analysis at the expense of the surrounding graupel, leading to the inclusion of excessively low reflectivity values to be still acceptable for a hail classification. The median reflectivity was lowered from 54.6 dBZ in the bin-based classification to 50.9 dBZ, leading to a final reassignment to the graupel class. The distinction between hail and graupel in this case merely lies in the absolute value of the reflectivity, since all the other polarimetric variables show rather weak signatures. This causes the hail area to grow within the graupel’s bigger region until equilibrium is reached, when the two regions have similar extensions. This behavior can be viewed as a limitation of the method, but it is actually stressing the low discriminating power between hail and graupel based on the available observations in this specific case. An easy workaround in this type of situation may be a later reassignment of a graupel subregion solely based on the reflectivity value.

Figure 11 illustrates another example of a deep convective storm observed in July 2008 by the CHILL radar. As for the previous case, these dual-polarization measurements are also collected in ATSR mode, but with a larger number of integrated pulses (256) and a radial resolution of 0.2°, resulting in high-quality moments and very little noise. This can be considered a textbook case, with very well-defined dual-polarization signatures. In particular, the Kdp signal is very strong and helps to designate precisely the boundary between the hail and rain-plus-hail regions. The rain-plus-hail region is characterized by high Zh and Kdp but lowering Zdr, in particular in the 50–55-km range. In the stratiform region, on the other hand, Kdp provides very little information for the region-based classification, while Zdr is fundamental for the definition of the transition between dry ice, wet ice, and rain.

Fig. 11.

The RHI along the 45° azimuth at 2257 UTC 2 Jul 2008, from CHILL radar: (a) reflectivity, (b) region-based hydrometeor classification, (c) Kdp, and (d) Zdr.

Citation: Journal of Atmospheric and Oceanic Technology 32, 1; 10.1175/JTECH-D-14-00097.1

The two examples presented illustrate the most noteworthy characteristic of this method. The region-based classification benefits from a controllable degree of self-adaptation to the most prominent spatial polarimetric signatures. This essentially means that the procedure itself does not need any specific tuning for a particular radar or operating frequency. At S band the classification is especially sensitive to Zh and Zdr, while at attenuating frequencies like C and X band, Kdp automatically gains importance owing to the frequency scaling and the consequent better definition of spatial patterns for several hydrometeors.

b. C band, stratiform precipitation (Italy)

Observations collected in PPI mode with 1.0° angular resolution by the C-band dual-polarization operational radar located in Torino, Italy, are discussed in this section. The case presented refers to a warm frontal stratiform precipitation system and is discussed in detail in Bechini et al. (2013). The freezing-level optimization loop estimates a −2.25°C temperature shift (Fig. 4), lowering the 0°C altitude from approximately 2.3 km (Milano Linate sounding at 0000 UTC) to about 2.0 km MSL. Figure 12 shows the PPI at 1.2° elevation of the radar dual-polarization observations and the resulting hydrometeor classification. The radar is located on a hill and is surrounded by complex orography. The dual-polarization measurements are subject to sidelobe contamination, which enhances the measurement noise and produces low ρHV values over some parts of the domain. Hydrometeor classification in widespread stratiform precipitation is, in general, an easy task at high elevation angles, owing to the sharp transitions along the range on the conical surface. For low elevations, however, the designation of the melting-layer boundaries is not straightforward. In Fig. 12d the bin-based classification shows a noisy map, with ambiguous results especially in the low-reflectivity area in the western sector. The low correlation and noisy Zdr caused by sidelobe effects produced many clearly wrong wet ice identifications close to the surface. The final classification, in contrast, shows a definite improvement, mainly in terms of overall noise reduction and a very well-defined separation between liquid (drizzle and rain) and mixed (wet ice) precipitation. The lower boundary of the melting layer appears reasonably well defined, with adequate spatial resolution.

Fig. 12.

The PPI at 1.2° elevation from the C-band Bric radar, at 0130 UTC 27 Apr 2009: (a) observed Zh, (b) attenuation-corrected Zdr, (c) ρHV, (d) bin-based hydrometeor classification, (e) cluster-based classification, and (f) final region-based classification.


In the bin-based classification (Fig. 12d), 157 bins are suspiciously identified as being in the rain-plus-hail class because of the relatively high attenuation-corrected reflectivity (the 25th, 50th, and 75th percentiles are 50.1, 51.1, and 52.5 dBZ, respectively) and the low correlation coefficient (0.817, 0.870, and 0.898, respectively). From this initial classification, an enlarged region grew during the cluster analysis (Fig. 12e), encompassing about 1200 bins with 25th, 50th, and 75th percentiles of Zh and ρHV of (46.4, 48.0, 49.6 dBZ) and (0.839, 0.874, 0.902), respectively. This expanded rain-plus-hail region (along with a few other smaller regions) is eventually reassigned to wet snow in the final step, because of the low affinity measure AFFk(xi) = 0.49 for the intensity-reduced Zh distribution. The high Kdp values up to 1.8° km−1 (not shown) well above the melting layer are responsible for the two regions of dendrite crystals adjacent to (and at higher altitude than) the extended graupel area. This classification appears fairly consistent with the existence of a weak, widespread midlevel updraft that is associated with light riming above the melting layer (graupel) and sustains crystal growth by vapor deposition around the −15°C temperature region (dendrites).

c. X band, convective precipitation (Oklahoma)

Figure 13 shows the RHI observations (0.25° radial resolution) and hydrometeor classification from the CASA IP1 test bed data in Oklahoma at 0414 UTC 20 May 2011. The temperature profile is taken from the Norman, Oklahoma, sounding observations at 0000 UTC, which report the height of the freezing level at around 4.5 km. The 0°C level is then lowered to about 4.1 km MSL after the temperature optimization loop, which estimates a −3.75°C shift (Fig. 4).

Fig. 13.

CASA IP1 RHIs of dual-polarization variables: (a) observed Zh, (b) attenuation-corrected Zh, (c) Zdr, (d) ρHV, (e) Kdp, and (f) final hydrometeor classification at 0414 UTC 20 May 2011, along the azimuth 101.6°.


The X-band observations are corrected for attenuation and differential attenuation. Figures 13a and 13b show the observed and the attenuation-corrected reflectivity, respectively. Attenuation up to 17 dB for Zh and 2.2 dB for Zdr is estimated. Given these high attenuation estimates, the weights associated with Zh and Zdr are reduced according to Eq. (4). This explains how the wet ice region identification is possible, despite negative Zdr values beyond the rain-plus-hail core. The presence of the rain-plus-hail region may actually be responsible for an underestimation of the differential attenuation, which underlines the importance of coupling the hydrometeor classification and attenuation correction for short-wavelength radars. The classification appears, in general, reasonable, with a few well-distinguished regions. The relevance of the Kdp observations is especially apparent for delineating the modulated boundary between liquid and frozen particles.

The polarimetric signatures of Kdp and Zdr at approximately 10-km height and 30-km range (negative values), respectively, point to the existence of vertically oriented ice crystals, associated with electrical activity inside the storm (Hendry and McCormick 1976; Caylor and Chandrasekar 1996; Carey and Rutledge 1996). This specific hydrometeor class is not considered in our scheme, and the more generic definition adopted for crystals implies a relatively low reflectivity (≤22 dBZ), which eventually causes this upper portion of the storm to lie within the dry ice region.

4. Evaluation of the method

A comprehensive direct validation of hydrometeor classification algorithms from radar is, in general, not possible because of the scarcity of in situ measurements. Even when in situ observations are available, the comparison is challenging considering the largely different sampling volumes of radar and cloud probes, as well as specific issues with airborne cloud particle sampling (King 1984, 1986; Korolev et al. 2013). In this work we rely on an internal evaluation approach, considering the application examples illustrated in the previous section. In addition to the obvious noise reduction in the final classification, in the following subsections we focus on a metric for the evaluation of clustering algorithms and on the sensitivity to noise and bias in the input observations.

a. Clustering evaluation with DB index

Davies and Bouldin (1979) introduced a metric for the internal evaluation of clustering algorithms, with the useful property of being independent of both the number of clusters analyzed and the method used to produce the data partition. The internal evaluation scheme provides a measure of the goodness of the cluster partitioning, based on quantities inherent to the employed dataset. The definition of the Davies–Bouldin (DB) index relies on the ratio of a measure of the scatter within the cluster (intracluster distance) to a measure of the separation between the clusters (intercluster distance):
$$\mathrm{DB} = \frac{1}{K} \sum_{k=1}^{K} R_k, \tag{12}$$
where
$$R_k = \max_{j = 1, \ldots, K;\; j \neq k} \left( \frac{S_k + S_j}{d_{k,j}} \right). \tag{13}$$
In Eq. (13) $K$ is the number of clusters; $S_k$ and $S_j$ represent the dispersions (average distance between each point in the cluster and the centroid of the cluster) of clusters k and j, respectively; and $d_{k,j}$ is the distance between the centroids of clusters k and j, using the same metric defined in Eq. (7). It is clear that a lower DB value is associated with a better partitioning of the data, indicating that the hydrometeor classes are well separated based on the input dual-polarization measurements.
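
The index defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, clusters are plain lists of feature tuples, and Euclidean distance stands in for the metric of Eq. (7), which is not reproduced in this section.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def davies_bouldin(clusters, dist=math.dist):
    """Davies-Bouldin index for a list of clusters (each a list of points).

    `dist` is an assumption of this sketch: Euclidean distance replaces
    the paper's Eq. (7) metric. At least two clusters with distinct
    centroids are required.
    """
    centers = [centroid(c) for c in clusters]
    # Dispersion: mean distance between each point and its cluster centroid.
    scatter = [sum(dist(p, ctr) for p in c) / len(c)
               for c, ctr in zip(clusters, centers)]
    k = len(clusters)
    total = 0.0
    for a in range(k):
        # Largest scatter-to-separation ratio against any other cluster.
        total += max((scatter[a] + scatter[b]) / dist(centers[a], centers[b])
                     for b in range(k) if b != a)
    return total / k

# Two synthetic partitions: same scatter, different separation.
tight = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
loose = [[(0, 0), (0, 2)], [(4, 0), (4, 2)]]
print(davies_bouldin(tight), davies_bouldin(loose))
```

With equal scatter, the partition whose centroids are farther apart yields the lower index, which is the sense in which a lower DB value indicates better-separated classes.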

Rather than the absolute value of the DB index, we consider the relative variation of the index between different classification schemes. This allows, in particular, for assessing the improvement achieved using the proposed methodology with respect to the traditional bin-based classification. Table 3 reports the DB index obtained for the cases described in section 3 (with a supplemental higher elevation for the Bric case) in the three steps of the classification, after the adjustment for the temperature profile (blocks 2–4 in Fig. 2). In addition, the results from two completely unsupervised K-means clusterings are reported. The first one (K − meansbin) is obtained using the seeds from the bin-based classification (with the same number of classes). The second K-means clustering (K − meansopt) is performed using the optimal number of classes (with random seeds), determined as the one that minimizes the DB index. In this case, several experiments with independent initializations revealed that the effect of the random seeds on the final partitions is quite limited; that is, the variability in the number of output classes is, in general, ≤1 and the standard deviation of the DB index is ≤0.02.
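
The selection of the number of classes by minimizing the DB index can be illustrated with a small, self-contained sketch. Several choices here are assumptions made for reproducibility and are not taken from the paper: the function names are hypothetical, the data are synthetic, distances are Euclidean rather than the metric of Eq. (7), and the K-means seeds are chosen by a deterministic farthest-point rule instead of the random seeds described above.

```python
import math

def mean(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def kmeans(points, k, iterations=50):
    """Lloyd's K-means with deterministic farthest-point seeding (assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing seed.
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the previous center if a cluster ends up empty.
        updated = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return [c for c in clusters if c]

def db_index(clusters):
    """Davies-Bouldin index; needs >= 2 clusters with distinct centroids."""
    centers = [mean(c) for c in clusters]
    scatter = [sum(math.dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, centers)]
    k = len(clusters)
    return sum(max((scatter[a] + scatter[b]) / math.dist(centers[a], centers[b])
                   for b in range(k) if b != a) for a in range(k)) / k

def best_k(points, candidates):
    """Return (k, DB) for the candidate k with the lowest DB index."""
    scores = {k: db_index(kmeans(points, k)) for k in candidates}
    k = min(scores, key=scores.get)
    return k, scores[k]

# Three well-separated synthetic groups of four points each.
square = [(0, 0), (0, 1), (1, 0), (1, 1)]
points = ([(x, y) for x, y in square]
          + [(x + 10, y) for x, y in square]
          + [(x, y + 10) for x, y in square])
print(best_k(points, range(2, 5)))
```

On this synthetic set the minimum falls at three clusters. With real radar observations the minimum is usually less sharp, which is why the number of classes selected this way can differ from the number of physically defined hydrometeor classes.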

Table 3.

The DB internal evaluation index (DBI) for the classification examples in Figs. 10–13. The first three columns report the values of the index for the intermediate (bin-based hydroclass and cluster analyses) and final (region-based hydroclass) classification results. Parentheses give the number of distinct hydrometeor classes. The last two columns report the DBI value for the unsupervised K-means clustering, with the same number of classes as resulting from the bin-based classification (K − meansbin) and with the optimal number of classes (K − meansopt). All classifications use λ = 0.8 and α = 0.75 except where denoted with an asterisk, which use λ = 1.0 and α = 0.95 (Fig. 10f).


For the cases considered, with the exception of the CASA X-band RHI, the cluster analysis produced an improvement in the partition of the hydrometeor classes with respect to the first-guess bin-based classification. This may sound tautological given the purpose of the clustering algorithm. In fact, however, considering the specific minimization process including the additional contiguity and penalty terms [Eq. (6)], a reduction of the DB index is not automatically guaranteed. This result indicates that the method is able to combine a reduction in the output classification noise with a tighter affinity within the same class and a better separation of the hydrometeor classes. In most cases, the DB index reduction is achieved with a concurrent reduction in the number of hydrometeor classes; that is, some hydrometeor types poorly represented in the bin-based classification have been suppressed in the cluster analysis.

During the last step of the classification process (region-based hydroclass in Fig. 2), the number of classes can be further reduced following the reassignment of one or more connected regions. This is the case, for example, for the CHILL 2013 RHI with λ = 0.8 and α = 0.75, where a couple of residual wet ice regions are converted to rain and drizzle, and the DB index of the final region-based classification is further decreased. For the same case, but with λ = 1.0 and α = 0.95, on the other hand, the reassignment of some connected regions (in particular the hail region to graupel) results in an increase of the DB index. In fact, while the high weight of the distance component in the clustering analysis (λ = 1.0) is responsible for a better clustering (DB = 1.83 as compared to DB = 2.24 for the same case with λ = 0.8), the excessive deviation from the reference membership functions and the subsequent reassignment discussed in section 3a imply a deterioration of the clustering tightness, although the performance remains better than that of the bin-based classification (DB = 3.96).

The unsupervised K-means clustering using the seeds from the bin-based classification (K − meansbin in Table 3) predictably shows a better performance in terms of the DB index than the region-based classification. The DB index for the optimized K-means clustering (K − meansopt) is the lowest by definition, but may imply a substantial change in the number of hydrometeor classes. In fact, the number of represented classes in the analysis is much lower than in the final region-based classification for the CHILL 2013 and CASA cases, while it is higher for the Bric PPIs.

The K-means clustering classifications using the initial classes from the bin-based classification are shown in Fig. 14 for the CHILL 2008 case and the CASA case, which are shown in Figs. 11 and 13, respectively. The limitation of these analyses is clear, with several hydrometeor classes deviating too much from the original definition. Both the hail and the rain-plus-hail regions have grown excessively in the CHILL case, including much lower reflectivity values. In addition, the lack of a temperature constraint in this process has led to the wrong dry ice classification of many bins below the melting layer and to drizzle in the highest portion of the cloud (low reflectivity, high correlation, and near-zero Kdp and Zdr). In the CASA case (Fig. 14b), the weak reflectivity region closer to the radar, owing to the lack of any well-defined polarimetric signature (and low ρHV likely arising from nonuniform beam filling), resulted in a noisy classification from the surface up to about 7 km MSL. A fairly coherent region in the upper portion of the storm is identified and marked as hail, whereas this region most likely contains vertical ice crystals originating from cloud electrification processes.

Fig. 14.

K-means unsupervised classification (K − meansbin in Table 3) for (a) CHILL 2008 case (Fig. 11) and (b) CASA case (Fig. 13).


Looking back at the cluster analysis, the increased DB value for the CASA RHI (DB = 3.44 in Table 3, compared to DB = 3.09 for the bin-based classification) seems actually ascribable to the influence of the weak reflectivity region closer to the radar. In fact if the DB index is calculated only for the main storm, by simply applying a reflectivity threshold of 25 dBZ, the resulting DB values are 2.30 (11 classes), 1.84 (9 classes), and 1.66 (7 classes) for the bin-based, cluster-based, and region-based classifications, respectively. These values present a similar qualitative behavior as in the other cases, showing a significant improvement in the final classification.

This case emphasizes the importance of the supervised component in the classification process. The membership functions, variable weighting, and temperature optimization are all important components for deriving a reliable first guess. Furthermore, the contiguity/penalty terms in the cluster analysis and the subsequent possible reassignment of the connected regions have a fundamental role in controlling the aggregation process. The aim of the proposed method is ultimately to balance the data-adaptive strength of the clustering technique with the conventional, physically based fuzzy logic approach. In fact, a purely unsupervised clustering may give better partitioning results in terms of the DB index, but at the cost of results that are not physically meaningful (Fig. 14).

Finally, the very limited reduction of the DB index for the C-band 1.2° PPI scan in Table 3 may indicate a difficulty in improving the first-guess hydrometeor classification, in terms of cluster tightness, for low-elevation radar scans. Data at low elevations sometimes lack well-defined polarimetric signatures, especially in vertically stratified clouds, where the radar resolution volume may encompass particles in different thermodynamic phases. Running the method on a higher elevation (10°) from the same volume actually produces a better result (last line in Table 3), in agreement with the RHI scans. The cluster analysis in fact achieves a more significant decrease of the DB index in this case (from ~2.0 to ~1.6).

b. Sensitivity to noise and bias

In this section, we consider the sensitivity to input noise and bias, with the aim to assess the impact of the cluster analysis in mitigating the effect of perturbed inputs. For this purpose we compare the bin-based classification with the cluster-based classification (output of block 3 in Fig. 2), after properly altering the original radar measurements described in section 3.

The sensitivity to additional noise in the input variables Zh and Zdr is shown in Fig. 15, while Fig. 16 reports the corresponding results for ρHV and Kdp. The white noise is generated using a Gaussian distribution with increasing standard deviation (x axis). The extent of the noise considered is clearly higher than the normal levels found in real observations but is used here to emphasize the fundamental qualitative trend. The lines in the plots represent the number of polar bins that changed class, compared to the reference classification using the original unperturbed observations. The colored areas between the two lines (bin-based and cluster-based classifications) highlight the improvement (light blue) or worsening (orange) of the proposed method in terms of robustness to the noise. The fact that the classification using perturbed inputs remains closer to the original one is taken as an indication of its robustness, especially for real-time applications.
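
The change-count metric used in these plots can be sketched as follows. The classifier here is a hypothetical three-class threshold rule on Zh, used only as a stand-in so that the example runs; it is not the fuzzy logic or cluster-based scheme of the paper, and the synthetic ray and noise levels are likewise assumptions.

```python
import random

def toy_classify(zh):
    """Hypothetical stand-in for a per-bin classifier: three classes by Zh (dBZ)."""
    if zh < 20.0:
        return "drizzle"
    if zh < 45.0:
        return "rain"
    return "hail"

def count_changed(values, classify, sigma, rng):
    """Number of bins whose class differs after adding N(0, sigma) noise."""
    reference = [classify(v) for v in values]
    perturbed = [classify(v + rng.gauss(0.0, sigma)) for v in values]
    return sum(r != p for r, p in zip(reference, perturbed))

rng = random.Random(0)                      # fixed seed for repeatability
zh = [10.0 + 0.5 * i for i in range(100)]   # synthetic ray, 10.0 to 59.5 dBZ
for sigma in (0.0, 1.0, 2.0, 4.0):
    changed = count_changed(zh, toy_classify, sigma, rng)
    print(f"sigma={sigma:.1f} dB  changed={changed}  "
          f"({100.0 * changed / len(zh):.0f}% of valid bins)")
```

Running the same count for two classifiers on identical perturbed inputs gives the two curves compared in each panel; the classifier with the lower count at a given standard deviation is the more robust one in the sense used here.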

Fig. 15.

Sensitivity to an artificial Gaussian white noise in the input observations (a)–(e) Zh and (f)–(j) Zdr for the cases analyzed in section 3. The plots report the number of polar bins that changed class with respect to the reference hydrometeor map without added noise for the bin-based and the cluster-based classifications (output of block 3 in Fig. 2, with λ = 0.8, α = 0.75). In addition, the right y axes report the fraction (%) of bins changed with respect to the total number of valid bins. To facilitate the interpretation, the area between the two lines is light blue when the cluster-based classification produced fewer class changes and orange otherwise.


Fig. 16.

As in Fig. 15, but for the input observations (a)–(e) ρHV and (f)–(j) Kdp. The Kdp noise introduced is scaled with the radar frequency.


With few exceptions, the classification resulting from the proposed method shows a lower sensitivity to the noise in the radar observations compared to the bin-based classification. The most notable improvement is obtained for the ρHV and Kdp radar measurements. The specific asymptotic behavior in the ρHV plots (Figs. 16a–e) seems to be caused by the fact that although the measurable range of ρHV is 0–1, the actual range for meteorological targets is about one order of magnitude narrower, that is, ~(0.85–1.0). Consequently, the contribution of ρHV to the particle classification becomes practically irrelevant for additional white noise with a standard deviation in excess of 0.1.

The plots referring to the 1.2° PPI from the Bric radar (all four radar variables in Figs. 15d,i and 16d,i) and the 2013 RHI from CHILL (limited to Zh and Zdr in Figs. 15b,g) show a small impact for the new classification method. As observed in section 4a, the beneficial effect of partition clustering is quite limited for the low-elevation Bric scan, and this may explain the neutral impact in terms of sensitivity to additional noise in the radar observations. For the CHILL 2013 case, on the other hand, there is a lower sensitivity to noise in ρHV and Kdp, while for Zh and Zdr no significant differences are noted. A more detailed analysis shows that this specific behavior may be ascribed to the generation of several small regions of wet ice in the stratiform portion of the storm when the noise in Zh or Zdr is increased. This splitting into multiple small regions around the melting layer eliminates the advantage of the cluster analysis in relation to noise sensitivity. The same negative effect is not seen in the CHILL 2008 case, which also shows a substantial stratiform region. The main difference between the two cases is that in the 2008 case (Fig. 11), the melting-layer signature is well marked in several variables, namely Zdr and ρHV (not shown) and, more weakly, Zh and Kdp, while in the 2013 case (Fig. 10) only Zdr shows the characteristic increase due to the transition between snow and rain. As a consequence, when the noise in Zdr is increased, the cluster analysis has no other observations to rely on in order to preserve the original consolidated regions.

The sensitivity to additional bias in the input power variables Zh and Zdr is also considered and represented in Fig. 17. While the effect of additional reflectivity bias (Figs. 17a–e) is rather ambiguous, the sensitivity to the Zdr bias (Figs. 17f–j) is significantly lower for the cluster-based classification compared to the bin-based classification. This is interpreted as one of the beneficial effects of the data-adaptive aggregation process during the cluster analysis. The unsupervised K-means clustering would actually be completely insensitive to systematic biases in the observations. The proposed semisupervised procedure is, however, able to maintain a certain degree of insensitivity. The key point is the adaptive aggregation of the observations during the cluster analysis, eventually attaining a set of homogeneous connected regions. This is possible of course when the classification relies on multiple input variables, so that if one variable is biased, then the other variables will still drive a specific hydrometeor identification and involve a degree of self-adaptation to the biased observations in the cluster analysis.
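
The contrast between a fixed-threshold rule and a purely data-adaptive partition under a systematic bias can be shown with a one-variable toy example. Everything in this sketch is an assumption for illustration: the Zdr values are synthetic, the 1.0-dB threshold is arbitrary, and the two-class K-means is a minimal version seeded at the data extremes rather than the procedure used in the paper.

```python
def two_means_labels(values, iterations=20):
    """Plain 1D K-means with K=2, seeded at the minimum and maximum.

    Requires at least two distinct values.
    """
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to the nearer of the two centers.
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        lo = sum(group0) / len(group0)
        hi = sum(group1) / len(group1)
    return labels

def threshold_labels(values, threshold=1.0):
    """Fixed rule: class 0 below the threshold, class 1 otherwise."""
    return [0 if v < threshold else 1 for v in values]

zdr = [0.1, 0.2, 0.3, 0.4, 1.6, 1.8, 2.0, 2.2]   # synthetic Zdr (dB)
biased = [v + 1.0 for v in zdr]                  # +1.0 dB systematic bias

print(two_means_labels(zdr) == two_means_labels(biased))    # partition kept
print(threshold_labels(zdr) == threshold_labels(biased))    # partition lost
```

The clustering partition follows the data when every value is shifted by the same amount, whereas the fixed rule reclassifies the low-Zdr group. The semisupervised procedure sits between these two extremes, since the membership functions still anchor the classes to physical values.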

Fig. 17.

As in Fig. 15, but the sensitivity to an artificial bias is shown, for the input observations (a)–(e) Zh and (f)–(j) Zdr.


The reflectivity does not show a clear improvement in the sensitivity to the additional bias. This behavior may be explained considering the peculiarity of the reflectivity measurements, as opposed to the other dual-polarization variables considered. The correlation coefficient ρHV and the differential variables Zdr and Kdp are often characterized by abrupt changes corresponding to transitions between different hydrometeor types. This natural segregation behavior is exploited in the cluster analysis. Reflectivity by contrast is an absolute power measurement that does not depend on the shape of the hydrometeors, so it most often behaves as a continuum involving only gradual quantitative transitions. For this reason it appears to be less suited for a data-adaptive approach in a clustering analysis. As a matter of fact, a key limitation of the pure K-means clustering is the tendency to produce clusters of approximately similar size, as was shown in the CHILL 2013 example (Fig. 10) using λ = 1.0. In that case the hail region expanded excessively mainly because of the continuous decrease of the reflectivity outward from the storm core. The current method already limits this issue through the use of the cost function involving the additional contiguity and penalty terms (λ ≤ 1, α ≪ 1, respectively).

c. Implementation and performance evaluation

The procedure is coded in C for efficiency and portability. The implementation is modular, in the sense that various parts of the procedure can be activated by editing a text configuration file. In fact, depending on the specific radar data being used, some of the processing may be redundant; for example, in some cases the input radar data may already contain a Kdp estimate. All the radar-specific parameters (radar constant, operating frequency, etc.), membership functions (one table for each class and frequency band), and algorithm-specific parameters (λ, α, γ, extension of the contiguity window, etc.) are defined in external configuration files. This provides considerable flexibility for testing and adaptation to new radars/wavelengths, without the need to edit and recompile the source code.

Table 4 presents a summary of the algorithm performance for the cases analyzed, running on a 2010 laptop hosting an Intel i7-620M at 2.67 GHz. The total wall-clock time is clearly dependent on the number of range bins processed, as reported in the first row. While the RHI scans took about 10–12 s to complete the classification procedure, the processing of the low-elevation PPI needed almost twice that time to complete. The 10° elevation PPI, on the other hand, only took ~8 s. One of the most time-consuming modules—that is, the temperature optimization loop—does not need to run at every radar scan. In fact the height of the freezing level is not expected to vary significantly at the time scale of the scan update frequency. In an operational implementation, this module may only be executed, for example, every 30 min and only for selected elevations from a 3D polar volume. Overall, the performance appears compatible with real-time operations.
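
The idea of running the temperature optimization only occasionally can be sketched as a small cache around the expensive step. The class and parameter names below are hypothetical, the 30-min interval is the example value mentioned above, and the sketch is written in Python for brevity rather than in the C of the actual implementation.

```python
from datetime import datetime, timedelta

class FreezingLevelCache:
    """Reuse the last temperature shift until it is older than `max_age`."""

    def __init__(self, optimize, max_age=timedelta(minutes=30)):
        self.optimize = optimize      # hypothetical callable: scan -> shift (deg C)
        self.max_age = max_age
        self.last_time = None
        self.shift = None

    def get(self, scan, now):
        """Return the cached shift, recomputing it only when it has expired."""
        if self.last_time is None or now - self.last_time >= self.max_age:
            self.shift = self.optimize(scan)
            self.last_time = now
        return self.shift

# Usage with a dummy optimizer that records how often it is called.
calls = []
cache = FreezingLevelCache(lambda scan: calls.append(scan) or -2.25)
start = datetime(2009, 4, 27, 1, 30)
for minutes in (0, 10, 20, 30, 40):
    cache.get(f"scan+{minutes}min", start + timedelta(minutes=minutes))
print(calls)   # the optimizer ran for the first scan and again at +30 min
```

All other modules would still run at every scan; only the temperature shift is reused between refreshes.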

Table 4.

Summary of the detailed performance of the hydrometeor classification procedure for the different cases analyzed. The total number of valid range bins processed is indicated in the first row. The partial times are provided for the modules depicted in Fig. 2, with item 2 including Ψdp filtering and the calculation of Kdp. For items 1 and 3, the number of iterations performed is indicated in parentheses. The input/output (I/O) time represents the time spent reading and writing the radar data files.


5. Discussion and conclusions

This study has two major goals:

  1. to exploit the spatial information content of dual-polarization radar observations for the purpose of hydrometeor classification

  2. to enhance the standard bin-based fuzzy logic classification by additionally considering the quality of the classification itself, the spatial coherence, and the self-aggregation propensity of the radar observations

Thus, the basic fuzzy logic classification method is improved with a data-adaptive approach, typical of clustering methods (unsupervised component). Additional supervised components are provided in the form of pairwise constraints (penalty term) and elementary physical rules (temperature limits for the connected regions).

The method is implemented in a four-step procedure (Fig. 2), including optimization of the temperature profile (step 1), bin-based fuzzy logic classification (step 2), cluster analysis (step 3), and fuzzy logic classification applied to the connected regions (step 4). The key features of the proposed method are a degree of self-adaptability and the shift from classical bin-based classification schemes to a region-based classification.

The method has been tested on a few cases in different meteorological situations and using data from radars operating at different frequencies. These preliminary application examples show promising results, especially for RHI scans. Continuous scanning in elevation clearly helps to detect the vertical variations in the dual-polarization observations, with a resulting higher segregation across a region encompassing a range of different microphysical species. The classification also worked fairly well for the analyzed PPI scan in widespread precipitation, likely because of the large amount of available data across the transition regions. However, more tests considering a wider range of precipitation events are needed to assess the potential for real-time application to volume scans resulting from a collection of PPIs at fixed elevation angles.

An analysis of the performance showed that the method is suitable for operational applications. In addition, for real-time applications the recent history of cluster descriptions should be considered to improve their statistical representativeness, providing a classification both robust and stable over time. To this end, the use of specific RHI interleaved with volume scans (some radars—e.g., during CASA IP1—actually perform this type of complex strategy) will be investigated to better train the clusters of the hydrometeor classes.

The sensitivity to noise and bias in the input variables has been shown for five cases (three RHIs and two PPIs). The semisupervised approach seems able to maintain a certain degree of insensitivity to a measurement bias in Zdr. For Zh, however, there is no clear improvement with respect to the bin-based method. In addition, the method showed a reduced sensitivity to noisy observations. In fact, the addition of white Gaussian noise to the four radar variables (Zh, Zdr, Kdp, and ρHV) resulted in fewer changes in the classification resulting from the cluster analysis (compared to the classification with original measurements) when the proposed method is used. This is a notable outcome of the method, which is able to provide a clean and appealing classification for operational applications as well. It is clear that for a generalization of these preliminary findings, the analysis of a larger and more comprehensive dataset is needed.

Going from a bin-based to a region-based classification has several advantages. The regions are identified from the semisupervised analysis, which is able to reduce the intracluster distance and increase the intercluster distance (lower Davies–Bouldin index). This allows for a clearer separation between the different hydrometeors. The final fuzzy logic classification applied to the connected regions is very robust, being based on a statistical sample rather than on single-point measurements. In addition, the classification is locally adaptive, in the sense that it automatically exploits and adapts to the most marked spatial signatures in the polarimetric observations.

Having defined a classification map with (relatively) few connected regions may in the future allow the inclusion of further spatial processing, for example, based on the polygons defining the regions, the boundaries between different regions, or their orientation (e.g., horizontally elongated for melting hydrometeors, vertically elongated for hail). In general, working with sets of contiguous bins, instead of pointwise measurements, considerably widens the potential of the classification algorithm, by allowing the inclusion of more sophisticated physical constraints based on either spatial properties or microphysical processes.

Working with regions, however, also introduces some issues specific to spatial analysis, in particular the modifiable areal unit problem (Openshaw 1983). In fact, when point-based measurements are aggregated into contiguous regions, the resulting statistics are affected (biased) by the size and boundaries of the regions. The size effect, mainly controlled by the contiguity window in our procedure (Fig. 5), must be considered in relation to the processes one is aiming to represent. The scale of the process is generally different from the spatial resolution of the data. In cloud microphysics the scale of the process can range from subpixel (processes within the radar resolution bins, e.g., turbulent mixing) to much larger scales, like organized updrafts, phase changes across the melting layer, etc. The heuristic approach adopted in the proposed methodology is clearly aimed at representing the larger-scale features in the cloud, by aggregating a multitude of radar bins with similar characteristics.
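
A minimal numerical illustration of the boundary effect, using synthetic reflectivity values along a single ray and a hypothetical helper function: the same six bins give different regional statistics depending only on where the region boundaries are drawn.

```python
from statistics import mean

def region_means(values, boundaries):
    """Mean of each contiguous region; `boundaries` are the split indices."""
    edges = [0, *boundaries, len(values)]
    return [mean(values[a:b]) for a, b in zip(edges, edges[1:])]

zh = [20.0, 20.0, 20.0, 50.0, 50.0, 50.0]   # synthetic Zh along a ray (dBZ)

print(region_means(zh, [3]))      # two regions aligned with the transition
print(region_means(zh, [2, 4]))   # three regions, one straddling the transition
```

In the second partition the middle region mixes the two populations and reports an intermediate mean that matches neither of them, which is the kind of aggregation bias the contiguity window has to keep under control.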

Besides the parameters specified for the definition of the membership functions and the weights associated with the input variables, the current procedure involves further parameters in the cost function (λ, α, γ) and a few other arbitrarily defined thresholds. The parameters of the cost function in particular can be set depending on the specific purpose of the classification. For example, if λ is very low (near zero), then the method simply acts like a smoother. If λ is set to a high value (close to one), then the method could be useful for exploratory analysis and ultimately for refinement of the membership functions.

On the downside, the increased degrees of freedom in this method may complicate the setup for a specific system and also the interpretation of the final results. However, the proposed method is mainly devised to be superimposed on an existing fuzzy logic classification algorithm. Therefore, the membership functions are assumed to be already fairly established and the setup of the procedure should only involve the specification of the cost function and a few other parameters.

Acknowledgments

This research is supported by the NASA GPM program and the NSF-AIR program. The CSU–CHILL National Radar Facility is funded by NSF Cooperative Agreement ATM 0735110. The Bric radar data are provided by Arpa Piemonte.

REFERENCES

  • Aggarwal, C. C., and C. K. Reddy, Eds., 2013: Data Clustering: Algorithms and Applications. Chapman and Hall/CRC, 652 pp.

  • Al-Sakka, H., A.-A. Boumahmoud, B. Fradon, S. J. Frasier, and P. Tabary, 2013: A new fuzzy logic hydrometeor classification scheme applied to the French X-, C-, and S-band polarimetric radars. J. Appl. Meteor. Climatol., 52, 2328–2344, doi:10.1175/JAMC-D-12-0236.1.

  • Baldini, L., E. Gorgucci, V. Chandrasekar, and W. Peterson, 2005: Implementations of CSU hydrometeor classification scheme for C-band polarimetric radars. 32nd Conf. on Radar Meteorology, Albuquerque, N. M., Amer. Meteor. Soc., P11.4. [Available online at https://ams.confex.com/ams/32Rad11Meso/techprogram/paper_95865.htm.]

  • Bechini, R., and R. Cremonini, 2002: The weather radar system of north-western Italy: An advanced tool for meteorological surveillance. Proceedings of the Second European Conference on Radar in Meteorology and Hydrology, Copernicus, 400–404.

  • Bechini, R., L. Baldini, and V. Chandrasekar, 2013: Polarimetric radar observations in the ice region of precipitating clouds at C-band and X-band radar frequencies. J. Appl. Meteor. Climatol., 52, 1147–1169, doi:10.1175/JAMC-D-12-055.1.

  • Brandes, E. A., J. Vivekanandan, J. D. Tuttle, and C. J. Kessinger, 1995: A study of thunderstorm microphysics with multiparameter radar and aircraft observations. Mon. Wea. Rev., 123, 3129–3143, doi:10.1175/1520-0493(1995)123<3129:ASOTMW>2.0.CO;2.

  • Bringi, V. N., and V. Chandrasekar, 2001: Polarimetric Doppler Weather Radar: Principles and Applications. Cambridge University Press, 648 pp.

  • Bringi, V. N., V. Chandrasekar, N. Balakrishnan, and D. S. Zrnić, 1990: An examination of propagation effects in rainfall on radar measurements at microwave frequencies. J. Atmos. Oceanic Technol., 7, 829–840, doi:10.1175/1520-0426(1990)007<0829:AEOPEI>2.0.CO;2.

  • Carey, L. D., and S. A. Rutledge, 1996: A multiparameter radar case study of the microphysical and kinematic evolution of a lightning producing storm. Meteor. Atmos. Phys., 59, 33–64, doi:10.1007/BF01032000.

  • Caylor, I. J., and V. Chandrasekar, 1996: Time-varying ice crystal orientation in thunderstorms observed with multiparameter radar. IEEE Trans. Geosci. Remote Sens., 34, 847–858, doi:10.1109/36.508402.

  • Chandrasekar, V., R. Keränen, S. Lim, and D. Moisseev, 2013: Recent advances in classification of observations from dual polarization weather radars. Atmos. Res., 119, 97–111, doi:10.1016/j.atmosres.2011.08.014.

  • Conway, J. W., and D. S. Zrnić, 1993: A study of embryo production and hail growth using dual-Doppler and multiparameter radars. Mon. Wea. Rev., 121, 2511–2528, doi:10.1175/1520-0493(1993)121<2511:ASOEPA>2.0.CO;2.

  • CSU–CHILL, cited 2014: National Weather Radar facility. [Available online at http://www.chill.colostate.edu.]

  • Davies, D. L., and D. W. Bouldin, 1979: A cluster separation measure. IEEE Trans. Pattern Anal. Mach. Intell., PAMI-1, 224–227, doi:10.1109/TPAMI.1979.4766909.

  • Dolan, B., and S. A. Rutledge, 2009: A theory-based hydrometeor identification algorithm for X-band polarimetric radars. J. Atmos. Oceanic Technol., 26, 2071–2088, doi:10.1175/2009JTECHA1208.1.

  • Dolan, B., S. A. Rutledge, S. Lim, V. Chandrasekar, and M. Thurai, 2013: A robust C-band hydrometeor identification algorithm and application to a long-term polarimetric radar dataset. J. Appl. Meteor. Climatol., 52, 2162–2186, doi:10.1175/JAMC-D-12-0275.1.

  • Fisher, R., S. Perkins, A. Walker, and E. Wolfart, cited 2003: Connected components labeling. [Available online at http://homepages.inf.ed.ac.uk/rbf/HIPR2/label.htm.]

  • Giangrande, S. E., J. M. Krause, and A. V. Ryzhkov, 2008: Automatic designation of the melting layer with a polarimetric prototype of the WSR-88D radar. J. Appl. Meteor., 47, 1354–1364, doi:10.1175/2007JAMC1634.1.

  • Gonzalez, R., and R. Woods, Eds., 2002: Digital Image Processing. 2nd ed. Prentice Hall, 793 pp.

  • Gourley, J. J., P. Tabary, and J. P. du Chatelet, 2007: A fuzzy logic algorithm for the separation of precipitating from nonprecipitating echoes using polarimetric radar observations. J. Atmos. Oceanic Technol., 24, 1439–1451, doi:10.1175/JTECH2035.1.

  • Hendry, A., and G. C. McCormick, 1976: Radar observations of the alignment of precipitation particles by electrostatic fields in thunderstorms. J. Geophys. Res., 81, 5353–5357, doi:10.1029/JC081i030p05353.

  • Hooke, R., and T. Jeeves, 1961: “Direct search” solution of numerical and statistical problems. J. Assoc. Comput. Mach., 8, 212–229, doi:10.1145/321062.321069.

  • Hubbert, J. C., and V. N. Bringi, 1995: An iterative filtering technique for the analysis of copolar differential phase and dual-frequency radar measurements. J. Atmos. Oceanic Technol., 12, 643–648, doi:10.1175/1520-0426(1995)012<0643:AIFTFT>2.0.CO;2.

  • Junyent, F., V. Chandrasekar, D. McLaughlin, E. Insanic, and N. Bharadwaj, 2010: The CASA Integrated Project 1 networked radar system. J. Atmos. Oceanic Technol., 27, 61–78, doi:10.1175/2009JTECHA1296.1.

  • Kennedy, P. C., and S. A. Rutledge, 2011: S-band dual polarization radar observations of winter storms. J. Appl. Meteor. Climatol., 50, 844–858, doi:10.1175/2010JAMC2558.1.

  • King, W. D., 1984: Air flow and particle trajectories around aircraft fuselages. I: Theory. J. Atmos. Oceanic Technol., 1, 5–13, doi:10.1175/1520-0426(1984)001<0005:AFAPTA>2.0.CO;2.

  • King, W. D., 1986: Air flow and particle trajectories around aircraft fuselages. IV: Orientation of ice crystals. J. Atmos. Oceanic Technol., 3, 433–439, doi:10.1175/1520-0426(1986)003<0433:AFAPTA>2.0.CO;2.

  • Korolev, A., E. Emery, and K. Creelman, 2013: Modification and tests of particle probe tips to mitigate effects of ice shattering. J. Atmos. Oceanic Technol., 30, 690–708, doi:10.1175/JTECH-D-12-00142.1.

  • Lakshmanan, V., R. Rabin, and V. DeBrunner, 2001: Segmenting radar reflectivity data using texture. Preprints, 30th Int. Conf. on Radar Meteorology, Munich, Germany, Amer. Meteor. Soc., P2.2. [Available online at https://ams.confex.com/ams/30radar/techprogram/paper_21594.htm.]

  • Lim, S., V. Chandrasekar, and V. N. Bringi, 2005: Hydrometeor classification system using dual-polarization radar measurements: Model improvements and in situ verification. IEEE Trans. Geosci. Remote Sens., 43, 792–801, doi:10.1109/TGRS.2004.843077.

  • Lim, S., R. Cifelli, V. Chandrasekar, and S. Y. Matrosov, 2013: Precipitation classification and quantification using X-band dual-polarization weather radar: Application in the hydrometeorology testbed. J. Atmos. Oceanic Technol., 30, 2108–2120, doi:10.1175/JTECH-D-12-00123.1.

  • Liu, H., and V. Chandrasekar, 1998: Classification of hydrometeor type based on multiparameter radar measurements. Preprints, Int. Conf. on Cloud Physics, Everett, WA, Amer. Meteor. Soc., 253–256.

  • Liu, H., and V. Chandrasekar, 2000: Classification of hydrometeors based on polarimetric radar measurements: Development of fuzzy logic and neuro-fuzzy systems, and in situ verification. J. Atmos. Oceanic Technol., 17, 140–164, doi:10.1175/1520-0426(2000)017<0140:COHBOP>2.0.CO;2.

  • Mahalanobis, P., 1936: On the generalized distance in statistics. Proc. Natl. Inst. Sci. India, 2, 49–55.

  • Marzano, F., D. Scaranari, and G. Vulpiani, 2007: Supervised fuzzy-logic classification of hydrometeors using C-band weather radars. IEEE Trans. Geosci. Remote Sens., 45, 3784–3799, doi:10.1109/TGRS.2007.903399.

  • Openshaw, S., 1983: The Modifiable Areal Unit Problem. Concepts and Techniques in Modern Geography Series, Vol. 38, Geo Books, 40 pp.

  • Otto, T., and H. W. J. Russchenberg, 2011: Estimation of specific differential phase and differential backscatter phase from polarimetric weather radar measurements of rain. IEEE Geosci. Remote Sens. Lett., 8, 988–992, doi:10.1109/LGRS.2011.2145354.

  • Park, H. S., A. V. Ryzhkov, D. S. Zrnić, and K.-E. Kim, 2009: The hydrometeor classification algorithm for the polarimetric WSR-88D: Description and application to an MCS. Wea. Forecasting, 24, 730–748, doi:10.1175/2008WAF2222205.1.

  • Parzen, E., 1962: On estimation of a probability density function and mode. Ann. Math. Stat., 33, 1065–1076, doi:10.1214/aoms/1177704472.

  • Ryzhkov, A. V., and D. S. Zrnić, 1998: Discrimination between rain and snow with a polarimetric radar. J. Appl. Meteor., 37, 1228–1240, doi:10.1175/1520-0450(1998)037<1228:DBRASW>2.0.CO;2.

  • Schneebeli, M., J. Grazioli, and A. Berne, 2014: Improved estimation of the specific differential phase shift using a compilation of Kalman filter ensembles. IEEE Trans. Geosci. Remote Sens., 52, 5137–5149, doi:10.1109/TGRS.2013.2287017.

  • Straka, J. M., D. S. Zrnić, and A. V. Ryzhkov, 2000: Bulk hydrometeor classification and quantification using polarimetric radar data: Synthesis of relations. J. Appl. Meteor., 39, 1341–1372, doi:10.1175/1520-0450(2000)039<1341:BHCAQU>2.0.CO;2.

  • Testud, J., E. L. Bouar, E. Obligis, and M. Ali-Mehenni, 2000: The rain profiling algorithm applied to polarimetric weather radar. J. Atmos. Oceanic Technol., 17, 332–356, doi:10.1175/1520-0426(2000)017<0332:TRPAAT>2.0.CO;2.

  • Vivekanandan, J., S. M. Ellis, R. Oye, D. S. Zrnić, A. V. Ryzhkov, and J. Straka, 1999: Cloud microphysics retrieval using S-band dual-polarization radar measurements. Bull. Amer. Meteor. Soc., 80, 381–388, doi:10.1175/1520-0477(1999)080<0381:CMRUSB>2.0.CO;2.

  • Wang, Y. T., and V. Chandrasekar, 2009: Algorithm for estimation of the specific differential phase. J. Atmos. Oceanic Technol., 26, 2565–2578, doi:10.1175/2009JTECHA1358.1.

  • Wolberg, G., 1998: Cubic spline interpolation: A review. Columbia University Computer Science Tech. Rep. CUCS-389-88, 14 pp. [Available online at http://hdl.handle.net/10022/AC:P:12065.]

  • Zrnić, D. S., A. Ryzhkov, J. Straka, Y. Liu, and J. Vivekanandan, 2001: Testing a procedure for automatic classification of hydrometeor types. J. Atmos. Oceanic Technol., 18, 892–913, doi:10.1175/1520-0426(2001)018<0892:TAPFAC>2.0.CO;2.
