Comparison of Clustering Approaches in a Multimodel Ensemble for U.S. East Coast Cold Season Extratropical Cyclones

Benjamin M. Kiel, School of Marine and Atmospheric Sciences, Stony Brook University, State University of New York, Stony Brook, New York

and
Brian A. Colle, School of Marine and Atmospheric Sciences, Stony Brook University, State University of New York, Stony Brook, New York


Abstract

Several clustering approaches are evaluated for 1–9-day forecasts using a multimodel ensemble that includes the GEFS, ECMWF, and Canadian ensembles. Six clustering algorithms and three clustering spaces are evaluated using mean sea level pressure (MSLP) and 12-h accumulated precipitation (APCP) for cool-season extratropical cyclones across the Northeast United States. Using the MSLP cluster membership to obtain the APCP clusters is also evaluated, along with applying clustering determined at one lead time to cluster forecasts at a different lead time. Five scenarios from each clustering algorithm are evaluated using displacement and intensity/amount errors from the scenario nearest to the MSLP and 12-h APCP analyses in the NCEP GFS and ERA5, respectively. Most clustering strategies yield similar improvements over the full ensemble mean and are similar in probabilistic skill, except that 1) intensity-displacement space gives lower MSLP displacement and intensity errors; and 2) Euclidean space and agglomerative hierarchical clustering, when using either full or average linkage, struggle to produce reasonably sized clusters. Applying clusters derived from MSLP to 12-h APCP forecasts is not as skillful as clustering by 12-h APCP directly, especially if several members contain little precipitation. Using the cluster membership from one lead time to cluster the forecast at another lead time is less skillful than clustering independently at each forecast lead time. Finally, the number of members within a cluster does not necessarily indicate which scenario is the best forecast, especially at the longer lead times, when the probability of the smallest cluster being the best scenario is usually underestimated.
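The workflow summarized in the abstract (group ensemble members into five scenarios, then find the scenario nearest the analysis) can be illustrated with a minimal sketch. It assumes plain k-means in a two-dimensional feature space as a stand-in for the six algorithms and three clustering spaces evaluated in the study, and it uses synthetic members rather than GEFS, ECMWF, or Canadian data. The function `kmeans`, the variable names, the member count, and the `analysis` point are all hypothetical and chosen only for illustration.

```python
import math
import random


def kmeans(members, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means on equal-length feature vectors (illustrative only)."""
    rng = random.Random(seed)
    # Seed the centers with k distinct members chosen at random.
    centers = [list(m) for m in rng.sample(members, k)]
    labels = None
    for _ in range(n_iter):
        # Assign each member to its nearest center (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: math.dist(m, centers[c]))
            for m in members
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of its members; an empty cluster
        # keeps its previous center.
        for c in range(k):
            group = [m for m, lab in zip(members, labels) if lab == c]
            if group:
                centers[c] = [sum(col) / len(group) for col in zip(*group)]
    return labels, centers


# Synthetic "ensemble": 60 hypothetical members scattered around five modes
# in an assumed 2D clustering space (for example, two leading PCs).
rng = random.Random(42)
true_modes = [(-2.0, 0.0), (0.0, 2.0), (2.0, 0.0), (0.0, -2.0), (0.0, 0.0)]
members = [
    (mx + rng.gauss(0, 0.5), my + rng.gauss(0, 0.5))
    for mx, my in true_modes
    for _ in range(12)
]
analysis = (1.8, 0.3)  # hypothetical verifying analysis in the same space

n_scenarios = 5
labels, centers = kmeans(members, k=n_scenarios)

# Cluster size, and the naive probability implied by cluster size.
sizes = [labels.count(c) for c in range(n_scenarios)]
probs = [n / len(members) for n in sizes]

# Error of each scenario mean relative to the analysis; the "best" scenario
# is the non-empty cluster whose mean lies nearest the analysis.
errors = [math.dist(center, analysis) for center in centers]
best = min(
    (c for c in range(n_scenarios) if sizes[c] > 0),
    key=lambda c: errors[c],
)

for c in range(n_scenarios):
    flag = " <- nearest to analysis" if c == best else ""
    print(f"scenario {c}: n={sizes[c]:2d} p={probs[c]:.2f} err={errors[c]:.2f}{flag}")
```

Repeating this over many cases and comparing `probs[best]` with how often each size rank actually verifies would be one way to examine the abstract's last finding, that cluster size alone tends to underestimate the chance of the smallest cluster being the best scenario at longer lead times.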

Significance Statement

Numerical weather prediction ensembles are widely used, but more postprocessing tools are necessary to help forecasters interpret and communicate the possible outcomes. This study evaluates various clustering approaches, which group a large number of model forecasts with similar attributes into a small number of scenarios. The 1–9-day forecasts of both sea level pressure and 12-h precipitation are used to evaluate the clustering approaches for a large number of U.S. East Coast winter cyclones, an important forecast problem for this region.

© 2024 American Meteorological Society. This published article is licensed under the terms of the default AMS reuse license. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Benjamin M. Kiel, benjaminmkiel@gmail.com
