  • Abrol, V., and P. Sharma, 2020: A geometric approach to archetypal analysis via sparse projections. Proc. 37th Int. Conf. on Machine Learning, Online, ICML, 42–51, http://proceedings.mlr.press/v119/abrol20a/abrol20a.pdf.

  • Adames, A. F., and J. M. Wallace, 2014: Three-dimensional structure and evolution of the MJO and its relation to the mean flow. J. Atmos. Sci., 71, 2007–2026, https://doi.org/10.1175/JAS-D-13-0254.1.

  • Alexander, M. A., I. Bladé, M. Newman, J. R. Lanzante, N.-C. Lau, and J. D. Scott, 2002: The atmospheric bridge: The influence of ENSO teleconnections on air–sea interaction over the global oceans. J. Climate, 15, 2205–2231, https://doi.org/10.1175/1520-0442(2002)015<2205:TABTIO>2.0.CO;2.

  • Aloise, D., A. Deshpande, P. Hansen, and P. Popat, 2009: NP-hardness of Euclidean sum-of-squares clustering. Mach. Learn., 75, 245–248, https://doi.org/10.1007/s10994-009-5103-0.

  • Ashok, K., S. K. Behera, S. A. Rao, H. Weng, and T. Yamagata, 2007: El Niño Modoki and its possible teleconnection. J. Geophys. Res., 112, C11007, https://doi.org/10.1029/2006JC003798.

  • Bauckhage, C., 2014: A note on archetypal analysis and the approximation of convex hulls. arXiv, 1410.0642, https://doi.org/10.48550/arXiv.1410.0642.

  • Bauckhage, C., and C. Thurau, 2009: Making archetypal analysis practical. Pattern Recognition, J. Denzler, G. Notni, and H. Süße, Eds., Lecture Notes in Computer Science, Vol. 5748, Springer, 272–281.

  • Bauckhage, C., and K. Manshaei, 2014: Kernel archetypal analysis for clustering web search frequency time series. 22nd Int. Conf. on Pattern Recognition, Stockholm, Sweden, IEEE, 1544–1549.

  • Boumal, N., B. Mishra, P.-A. Absil, and R. Sepulchre, 2014: Manopt, a Matlab toolbox for optimization on manifolds. J. Mach. Learn. Res., 15, 1455–1459.

  • Boyd, S. P., and L. Vandenberghe, 2004: Convex Optimization. 1st ed. Cambridge University Press, 727 pp.

  • Cai, W., and T. Cowan, 2009: La Niña Modoki impacts Australia autumn rainfall variability. Geophys. Res. Lett., 36, L12805, https://doi.org/10.1029/2009GL037885.

  • Cai, W., T. Cowan, and A. Sullivan, 2009: Recent unprecedented skewness towards positive Indian Ocean Dipole occurrences and its impact on Australian rainfall. Geophys. Res. Lett., 36, L11705, https://doi.org/10.1029/2009GL037604.

  • Capotondi, A., and P. D. Sardeshmukh, 2015: Optimal precursors of different types of ENSO events. Geophys. Res. Lett., 42, 9952–9960, https://doi.org/10.1002/2015GL066171.

  • Capotondi, A., and P. D. Sardeshmukh, 2017: Is El Niño really changing? Geophys. Res. Lett., 44, 8548–8556, https://doi.org/10.1002/2017GL074515.

  • Capotondi, A., A. T. Wittenberg, J.-S. Kug, K. Takahashi, and M. J. McPhaden, 2020: ENSO Diversity. El Niño Southern Oscillation in a Changing Climate, Geophys. Monogr., Vol. 253, Amer. Geophys. Union, 65–86, https://doi.org/10.1002/9781119548164.ch4.

  • Chen, D., and Coauthors, 2015: Strong influence of westerly wind bursts on El Niño diversity. Nat. Geosci., 8, 339–345, https://doi.org/10.1038/ngeo2399.

  • Chen, Y., J. Mairal, and Z. Harchaoui, 2014: Fast and robust archetypal analysis for representation learning. Conf. on Computer Vision and Pattern Recognition, Columbus, OH, IEEE, 1478–1485.

  • Christiansen, B., 2007: Atmospheric circulation regimes: Can cluster analysis provide the number? J. Climate, 20, 2229–2250, https://doi.org/10.1175/JCLI4107.1.

  • Christensen, H. M., J. Berner, D. R. B. Coleman, and T. N. Palmer, 2017: Stochastic parameterization and El Niño–Southern Oscillation. J. Climate, 30, 17–38, https://doi.org/10.1175/JCLI-D-16-0122.1.

  • Cichocki, A., R. Zdunek, A. H. Phan, and S.-I. Amari, 2009: Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and Blind Source Separation. 1st ed. Wiley Publishing, 504 pp.

  • Cutler, A., and L. Breiman, 1994: Archetypal analysis. Technometrics, 36, 338–347, https://doi.org/10.1080/00401706.1994.10485840.

  • Elad, M., 2010: Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. 1st ed. Springer, 376 pp.

  • Elden, L., 2019: Matrix Methods in Data Mining and Pattern Recognition. 2nd ed. Society for Industrial and Applied Mathematics, 229 pp.

  • Esposito, F., 2021: A review on initialization methods for nonnegative matrix factorization: Towards omics data experiments. Mathematics, 9, 1006, https://doi.org/10.3390/math9091006.

  • Eugster, M. J. A., and F. Leisch, 2011: Weighted and robust archetypal analysis. Comput. Stat. Data Anal., 55, 1215–1225, https://doi.org/10.1016/j.csda.2010.10.017.

  • Fligner, M. A., and J. S. Verducci, 1986: Distance based ranking models. J. Roy. Stat. Soc., 48B, 359–369, https://doi.org/10.1111/j.2517-6161.1986.tb01420.x.

  • Franzke, C. L. E., T. J. O’Kane, D. P. Monselesan, J. S. Risbey, and I. Horenko, 2015: Systematic attribution of observed Southern Hemisphere circulation trends to external forcing and internal variability. Nonlinear Processes Geophys., 22, 513–525, https://doi.org/10.5194/npg-22-513-2015.

  • Fu, C., H. F. Diaz, and J. O. Fletcher, 1986: Characteristics of the response of sea surface temperature in the central Pacific associated with warm episodes of the Southern Oscillation. Mon. Wea. Rev., 114, 1716–1739, https://doi.org/10.1175/1520-0493(1986)114<1716:COTROS>2.0.CO;2.

  • Gan, G., C. Ma, and J. Wu, 2020: Data Clustering: Theory, Algorithms and Applications. 2nd ed. SIAM Press, 406 pp.

  • Gerber, S., L. Pospisil, M. Navandar, and I. Horenko, 2020: Low-cost scalable discretization, prediction, and feature selection for complex systems. Sci. Adv., 6, eaaw0961, https://doi.org/10.1126/sciadv.aaw0961.

  • Gillis, N., 2020: Nonnegative Matrix Factorization. Society for Industrial and Applied Mathematics, 350 pp.

  • Gini, C., 1921: Measurement of inequality of incomes. Econ. J., 31, 124–126, https://doi.org/10.2307/2223319.

  • Han, R., B. Osting, D. Wang, and Y. Xu, 2022: Probabilistic methods for approximate archetypal analysis. Inf. Inference J. IMA, 2022, iaac008, https://doi.org/10.1093/imaiai/iaac008.

  • Hannachi, A., 2021: Further topics. Patterns Identification and Data Mining in Weather and Climate, A. Hannachi, Ed., Springer, 367–413.

  • Hannachi, A., and N. Trendafilov, 2017: Archetypal analysis: Mining weather and climate extremes. J. Climate, 30, 6927–6944, https://doi.org/10.1175/JCLI-D-16-0798.1.

  • Hasselmann, K., 1988: PIPs and POPs: The reduction of complex dynamical systems using principal interaction and oscillation patterns. J. Geophys. Res., 93, 11 015–11 021, https://doi.org/10.1029/JD093iD09p11015.

  • Horenko, I., 2009: On robust estimation of low-frequency variability trends in discrete Markovian sequences of atmospheric circulation patterns. J. Atmos. Sci., 66, 2059–2072, https://doi.org/10.1175/2008JAS2959.1.

  • Horenko, I., 2010a: Finite element approach to clustering of multidimensional time series. SIAM J. Sci. Comput., 32, 62–83, https://doi.org/10.1137/080715962.

  • Horenko, I., 2010b: On clustering of non-stationary meteorological time series. Dyn. Atmos. Oceans, 49, 164–187, https://doi.org/10.1016/j.dynatmoce.2009.04.003.

  • Horenko, I., 2010c: On the identification of nonstationary factor models and their application to atmospheric data analysis. J. Atmos. Sci., 67, 1559–1574, https://doi.org/10.1175/2010JAS3271.1.

  • Hotelling, H., 1933: Analysis of a complex of statistical variables into principal components. J. Educ. Psychol., 24, 417–441, https://doi.org/10.1037/h0071325.

  • Hurley, N., and S. Rickard, 2009: Comparing measures of sparsity. IEEE Trans. Inf. Theory, 55, 4723–4741, https://doi.org/10.1109/TIT.2009.2027527.

  • IPCC, 2013: Climate Change 2013: The Physical Science Basis. Cambridge University Press, 1535 pp., https://doi.org/10.1017/CBO9781107415324.

  • IPCC, 2019: The Ocean and Cryosphere in a Changing Climate. H.-O. Pörtner et al., Eds., Cambridge University Press, 766 pp., https://www.ipcc.ch/site/assets/uploads/sites/3/2022/03/SROCC_FullReport_FINAL.pdf.

  • Izenman, A. J., 2008: Linear dimensionality reduction. Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning, 1st ed. A. J. Izenman, Ed., Springer Texts in Statistics, Springer, 195–236.

  • Jolliffe, I. T., 1986: Principal Component Analysis. Springer Verlag, 271 pp.

  • Jolliffe, I. T., and J. Cadima, 2016: Principal component analysis: A review and recent developments. Philos. Trans. Roy. Soc., 374A, 20150202, https://doi.org/10.1098/rsta.2015.0202.

  • Kao, H.-Y., and J.-Y. Yu, 2009: Contrasting eastern-Pacific and central-Pacific types of ENSO. J. Climate, 22, 615–632, https://doi.org/10.1175/2008JCLI2309.1.

  • Keller, S. M., M. Samarin, F. Arend Torres, M. Wieser, and V. Roth, 2021: Learning extremal representations with deep archetypal analysis. Int. J. Comput. Vis., 129, 805–820, https://doi.org/10.1007/s11263-020-01390-3.

  • Kobayashi, S., and Coauthors, 2015: The JRA-55 reanalysis: General specifications and basic characteristics. J. Meteor. Soc. Japan, 93, 5–48, https://doi.org/10.2151/jmsj.2015-001.

  • Komarov, O., 2021: okomarov/ginicoeff. GitHub, accessed 10 November 2021, https://github.com/okomarov/ginicoeff.

  • Liang, Y., A. V. Fedorov, and P. Haertel, 2021: Intensification of westerly wind bursts caused by the coupling of the Madden-Julian oscillation to SST during El Niño onset and development. Geophys. Res. Lett., 48, e2020GL089395, https://doi.org/10.1029/2020GL089395.

  • Lorenz, E., 1956: Empirical orthogonal functions and statistical weather prediction. MIT Department of Meteorology Statistical Forecasting Project Scientific Rep. 1, 49 pp., https://eapsweb.mit.edu/sites/default/files/Empirical_Orthogonal_Functions_1956.pdf.

  • Mair, S., and U. Brefeld, 2019: Coresets for archetypal analysis. 9 pp., https://papers.nips.cc/paper/2019/file/7f278ad602c7f47aa76d1bfc90f20263-Paper.pdf.

  • Mair, S., A. Boubekki, and U. Brefeld, 2017: Frame-based data factorizations. Int. Conf. on Machine Learning, Sydney, New South Wales, Australia, ICML, 2305–2313, http://proceedings.mlr.press/v70/mair17a/mair17a.pdf.

  • Mairal, J., 2014: Sparse modeling for image and vision processing. Found. Trends Comput. Graph. Vis., 8, 85–283, https://doi.org/10.1561/0600000058.

  • Mairal, J., 2017: SPAMS: A SPArse Modeling Software, v 2.6. http://thoth.inrialpes.fr/people/mairal/spams/doc/html/index.html.

  • Matthews, A. J., 2011: A multiscale framework for the origin and variability of the South Pacific convergence zone. Quart. J. Roy. Meteor. Soc., 138, 1165–1178, https://doi.org/10.1002/qj.1870.

  • Meilă, M., 2007: Comparing clusterings—An information based distance. J. Multivar. Anal., 98, 873–895, https://doi.org/10.1016/j.jmva.2006.11.013.

  • Mo, K. C., and M. Ghil, 1987: Statistics and dynamics of persistent anomalies. J. Atmos. Sci., 44, 877–902, https://doi.org/10.1175/1520-0469(1987)044<0877:SADOPA>2.0.CO;2.

  • Monahan, A. H., J. C. Fyfe, M. H. Ambaum, D. B. Stephenson, and G. R. North, 2009: Empirical orthogonal functions: The medium is the message. J. Climate, 22, 6501–6514, https://doi.org/10.1175/2009JCLI3062.1.

  • Mørup, M., and L. K. Hansen, 2012: Archetypal analysis for machine learning and data mining. Neurocomputing, 80, 54–63, https://doi.org/10.1016/j.neucom.2011.06.033.

  • Motevalli-Soumehsaraei, B., and A. Barnard, 2019: Archetypal analysis package, version 1. CSIRO, https://doi.org/10.25919/5d3958889f7ff.

  • Newman, M., and Coauthors, 2016: The Pacific decadal oscillation, revisited. J. Climate, 29, 4399–4427, https://doi.org/10.1175/JCLI-D-15-0508.1.

  • Nguyen, L. H., and S. Holmes, 2019: Ten quick tips for effective dimensionality reduction. PLOS, 15, e1006907, https://doi.org/10.1371/journal.pcbi.1006907.

  • North, G. R., 1984: Empirical orthogonal functions and normal modes. J. Atmos. Sci., 41, 879–887, https://doi.org/10.1175/1520-0469(1984)041<0879:EOFANM>2.0.CO;2.

  • O’Kane, T. J., J. S. Risbey, C. Franzke, I. Horenko, and D. P. Monselesan, 2013: Changes in the metastability of the midlatitude Southern Hemisphere circulation and the utility of nonstationary cluster analysis and split-flow blocking indices as diagnostic tools. J. Atmos. Sci., 70, 824–842, https://doi.org/10.1175/JAS-D-12-028.1.

  • O’Kane, T. J., J. S. Risbey, D. P. Monselesan, I. Horenko, and C. L. E. Franzke, 2016: On the dynamics of persistent states and their secular trends in the waveguides of the Southern Hemisphere troposphere. Climate Dyn., 46, 3567–3597, https://doi.org/10.1007/s00382-015-2786-8.

  • O’Kane, T. J., D. P. Monselesan, J. S. Risbey, I. Horenko, and C. L. E. Franzke, 2017: On memory, dimension, and atmospheric teleconnections. Math. Climate Wea. Forecasting, 3, 1–27, https://doi.org/10.1515/mcwf-2017-0001.

  • Quinn, C., D. Harries, and T. J. O’Kane, 2021: Dynamical analysis of a reduced model for the North Atlantic Oscillation. J. Atmos. Sci., 78, 1647–1671, https://doi.org/10.1175/JAS-D-20-0282.1.

  • Rasmusson, E. M., and T. H. Carpenter, 1982: Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Niño. Mon. Wea. Rev., 110, 354–384, https://doi.org/10.1175/1520-0493(1982)110<0354:VITSST>2.0.CO;2.

  • Reynolds, R. W., T. M. Smith, C. Liu, D. B. Chelton, K. S. Casey, and M. G. Schlax, 2007: Daily high-resolution-blended analyses for sea surface temperature. J. Climate, 20, 5473–5496, https://doi.org/10.1175/2007JCLI1824.1.

  • Richardson, D., A. S. Black, D. P. Monselesan, T. S. Moore, J. S. Risbey, A. Schepen, D. T. Squire, and C. R. Tozer, 2021: Identifying periods of forecast model confidence for improved subseasonal prediction of precipitation. J. Hydrometeor., 22, 371–385, https://doi.org/10.1175/JHM-D-20-0054.1.

  • Risbey, J. S., T. J. O’Kane, D. P. Monselesan, C. Franzke, and I. Horenko, 2015: Metastability of Northern Hemisphere teleconnection modes. J. Atmos. Sci., 72, 35–54, https://doi.org/10.1175/JAS-D-14-0020.1.

  • Risbey, J. S., D. P. Monselesan, A. S. Black, T. S. Moore, D. Richardson, D. T. Squire, and C. R. Tozer, 2021: The identification of long-lived Southern Hemisphere flow events using archetypes and principal components. Mon. Wea. Rev., 149, 1987–2010, https://doi.org/10.1175/MWR-D-20-0314.1.

  • Seitola, T., V. Mikkola, J. Silen, and H. Järvinen, 2014: Random projections in reducing the dimensionality of climate simulation data. Tellus, 66A, 25274, https://doi.org/10.3402/tellusa.v66.25274.

  • Seth, S., and M. J. A. Eugster, 2016: Probabilistic archetypal analysis. Mach. Learn., 102, 85–113, https://doi.org/10.1007/s10994-015-5498-8.

  • Steinschneider, S., and U. Lall, 2015: Daily precipitation and tropical moisture exports across the eastern United States: An application of archetypal analysis to identify spatiotemporal structure. J. Climate, 28, 8585–8602, https://doi.org/10.1175/JCLI-D-15-0340.1.

  • Suleman, A., 2017a: On ill-conceived initialization in archetypal analysis. Adv. Data Anal. Classif., 11, 785–808, https://doi.org/10.1007/s11634-017-0303-0.

  • Suleman, A., 2017b: Validation of archetypal analysis. Int. Conf. on Fuzzy Systems, Naples, Italy, IEEE, 1–6, https://doi.org/10.1109/FUZZ-IEEE.2017.8015385.

  • Takaya, K., and H. Nakamura, 2001: A formulation of a phase-independent wave-activity flux for stationary and migratory quasigeostrophic eddies on a zonally varying basic flow. J. Atmos. Sci., 58, 608–627, https://doi.org/10.1175/1520-0469(2001)058<0608:AFOAPI>2.0.CO;2.

  • Takens, F., 1981: Detecting strange attractors in turbulence. Dynamical Systems and Turbulence, Warwick 1980, D. Rand and L.-S. Young, Eds., Vol. 898, Springer, 366–381.

  • Thurau, C., K. Kersting, and C. Bauckhage, 2009: Convex non-negative matrix factorization in the wild. Ninth IEEE Int. Conf. on Data Mining, Miami Beach, FL, IEEE, 523–532, https://doi.org/10.1109/ICDM.2009.55.

  • Thurau, C., K. Kersting, M. Wahabzada, and C. Bauckhage, 2011: Convex non-negative matrix factorization for massive datasets. Knowl. Inf. Syst., 29, 457–478, https://doi.org/10.1007/s10115-010-0352-6.

  • Trenberth, K. E., 1976: Spatial and temporal variations of the Southern Oscillation. Quart. J. Roy. Meteor. Soc., 102, 639–653, https://doi.org/10.1002/qj.49710243310.

  • Trenberth, K. E., and L. Smith, 2006: The vertical structure of temperature in the tropics: Different flavors of El Niño. J. Climate, 19, 4956–4973, https://doi.org/10.1175/JCLI3891.1.

  • Trendafilov, N., and M. Gallo, 2021: Data analysis on simplexes. Multivariate Data Analysis on Matrix Manifolds, N. Trendafilov and M. Gallo, Eds., Springer Series in the Data Sciences, Springer, 373–402.

  • Vinué, G., 2017: Anthropometry: An R package for analysis of anthropometric data. J. Stat. Software, 77, 1–39, https://doi.org/10.18637/jss.v077.i06.

  • Wang, B., 1995: Interdecadal changes in El Niño onset in the last four decades. J. Climate, 8, 267–285, https://doi.org/10.1175/1520-0442(1995)008<0267:ICIENO>2.0.CO;2.

  • Yu, H.-F., N. Rao, and I. S. Dhillon, 2016: Temporal regularized matrix factorization for high-dimensional time series prediction. Advances in Neural Information Processing Systems 29, Barcelona, Spain, NIPS, 847–855, https://proceedings.neurips.cc/paper/2016/hash/85422afb467e9456013a2a51d4dff702-Abstract.html.

  • Zonoobi, D., A. A. Kassim, and Y. V. Venkatesh, 2011: Gini index as sparsity measure for signal reconstruction from compressive samples. IEEE J. Sel. Top. Signal Process., 5, 927–932, https://doi.org/10.1109/JSTSP.2011.2160711.

  • Fig. 1. Toy datasets illustrating the results of AA applied to 100 samples of 1000 points each, drawn from (a) a 3D normal distribution of unit variance, (b) a 3D uniform distribution centered on the origin, and (c) the distribution of points from the first three scaled PCs of daily SST anomalies, $\lambda_{1\text{--}3}\mathrm{PC}_{1\text{--}3}$. In all examples, the dark gray points depict the 1000 points of the last of the 100 samples, and the light gray points represent their projections onto the XY, YZ, and ZX planes of the coordinate system. The eight archetypes resulting from AA are shown in color for all samples.

  • Fig. 2. Simplex representations of detrended monthly SST anomalies over 1982–2020 for archetype cardinalities of (a) 4 and (b) 8.

  • Fig. 3. AA and RSAA results for the stochastic matrices $S_{AA}$, $C_{AA}$, $S_{RSAA}$, and $C_{RSAA}$, computed for eight archetypes on detrended monthly SST anomalies over the 1982–2020 period (468 monthly records). The top row shows the resulting S (left) and C (right) for AA on the full dataset, to be compared with the middle row showing S and C when all 468 principal components are retained in RSAA. The bottom row shows the absolute difference in S and C between AA and RSAA.

  • Fig. 4. (top) Explained variances and (bottom) sum-of-squares errors for an increasing number of principal components or archetypes for PCA (blue) and AA (red), for both detrended (solid) and full (dashed) monthly SST anomalies. The AA explained variance [Eq. (8)] reported for both the full and detrended cases corresponds to RSAA results for cardinalities from 3 to 20 when all 468 PCs are retained.

  • Fig. 5. Gini coefficients as a function of retained dimensions (scaled principal components) for matrices (top) C and (bottom) S, for different archetype cardinalities (shown as different colors). Shown are Gini coefficients (solid), AA explained variances for the reduced problem (dashed), and AA explained variances [Eq. (8)] as fractions of the total variance (dotted).

  • Fig. 6. AA spatial pattern and time series results using (a) full and (b) detrended monthly SST anomalies over 1982–2020 for a selected archetype number of 4. The two left columns in each panel show archetypes constructed by $XC$ and $X\tilde{S}^{T}$, respectively, with $\tilde{S}_{p\times t} = S_{p\times t}/\sum_{i=1}^{t} S(p,i)$. The two right columns show the C and S matrix time series, respectively, with the multivariate ENSO index (MEI) time series (gray) included on both.

  • Fig. 7. As in Fig. 6, but for an AA cardinality of 6.

  • Fig. 8. As in Fig. 6, but for an AA cardinality of 8.

  • Fig. 9. (a) Stacked bar plots of S-matrix probabilities for detrended monthly SST anomalies over 1982–2020 and (b) corresponding matched archetypes using pattern correlation for cardinalities ranging from 2 to 8. Each row in (a) corresponds to AA results ranging from 2 (top row) to 8 (bottom row), with the MEI time series (black) included on all. The bar color codes correspond to matched archetypes referenced to AA results for a cardinality of 8, whereas the labels $A_i$ in each row indicate the archetype ranks based on the time mean of the AA stochastic matrix $S_{n_{AA}}$, with $n_{AA} = 2, \ldots, 8$ and $i = 1, \ldots, n_{AA}$, in decreasing order of $\bar{S}_{n_{AA}}(i)$. The AA patterns in (b), arranged in 7 columns × 8 rows, correspond to matched archetypes (rows) referenced to the AA results for a cardinality of 8 (last column) across cardinalities ranging from 2 (first column) to 8 (last column).

  • Fig. 10. AA composite results using detrended monthly SST anomalies over 1982–2020 for a selected archetype number of 4. (left) The resulting spatial patterns of SST anomalies constructed with the S matrix values, (center) the 300-hPa zonal wind component (shading) with superimposed 500-hPa geopotential height anomalies (green contours) and thermal wind anomaly components (vectors), and (right) monthly averaged daily maximum surface wind speed anomalies (shading) with superimposed velocity potential difference anomalies between the 150- and 850-hPa levels (green contours) and wave activity flux anomaly components at 200 hPa (vectors).

Archetypal Analysis of Geophysical Data Illustrated by Sea Surface Temperature

  • (a) International Laboratory for High-Resolution Earth System Prediction, Texas A&M University, College Station, Texas
  • (b) CSIRO Oceans and Atmosphere, Hobart, Tasmania, Australia
  • (c) Centre for Southern Hemisphere Oceans Research, Hobart, Tasmania, Australia
  • (d) Department of Meteorology, Stockholm University, Stockholm, Sweden
  • (e) Department of Human and Social Sciences, University of Naples L’Orientale, Naples, Italy

Abstract

The ability to find and recognize patterns in high-dimensional geophysical data is fundamental to climate science and critical for meaningful interpretation of weather and climate processes. Archetypal analysis (AA) is one technique that has recently gained traction in the geophysical science community for its ability to find patterns based on extreme conditions. While traditional empirical orthogonal function (EOF) analysis can reveal patterns based on data covariance, AA seeks patterns from the points located at the edges of the data distribution. The utility of any objective pattern method depends on the properties of the data to which it is applied and the choices made in implementing the method. Given the relative novelty of the application of AA in geophysics, it is important to develop experience in applying the method. We provide an assessment of the method, implementation, sensitivity, and interpretation of AA with respect to geophysical data. As a demonstration, we apply AA to a 39-yr sea surface temperature (SST) reanalysis dataset. We show that the decisions made to implement AA can significantly affect the interpretation of results, but also, in the case of SST, that the analysis is exceptionally robust under both spatial and temporal coarse graining.

Significance Statement

Archetypal analysis (AA), when applied to geophysical fields, is a technique designed to find typical configurations or modes in underlying data. This technique is relatively new to the geophysical science community and has been shown to be beneficial to the interpretation of extreme modes of the climate system. The identification of extreme modes of variability and their expression in day-to-day weather or in the state of the climate at longer time scales may help in elucidating the interplay between major teleconnection drivers and their evolution in a changing climate. The purpose of this work is to bring together a comprehensive report of the AA methodology using an SST reanalysis for demonstration. It is shown that the AA results are significantly affected by each implementation decision, but can also be resilient to spatiotemporal averaging. Any application of AA should provide clear documentation of the choices made in applying the method.

Amanda Black and Didier Monselesan are co-first authors.

© 2022 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Didier P. Monselesan, didier.monselesan@csiro.au

1. Introduction

Pattern recognition plays an important role in quantitative climatology, helping to diagnose and understand climate processes. Archetypal analysis (AA) is one technique that is gaining traction in the geophysical science community for its ability to find patterns based on extreme modes of data. Because the utility of this analysis for geophysical problems has been recognized only recently, resources and references are scattered for the researcher who wishes to implement the technique. The goal of this paper is to present the AA method along with a detailed description of the decisions made in its implementation and the effect each decision may have on the final output. We also provide a discussion on the interpretation of AA with respect to geophysical data.

Empirical orthogonal function (EOF) decomposition or factorization has become a hallmark of statistical analysis and data reduction (Hotelling 1933; Jolliffe 1986) since its application in the mid-1950s by Lorenz (1956) to weather and climate studies. Also known as principal component analysis (PCA), EOF analysis constructs patterns in the spatial dimension that maximize variance. The constructed EOFs are not directly interpretable in terms of the original data, and therefore any attempt to attribute a particular dynamical mechanism to any one EOF pattern is discouraged when analyzing geophysical data (Monahan et al. 2009). By comparison, AA seeks patterns from the extreme points of a convex hull, or envelope, surrounding the data in state space. It follows that the constructed archetypes may be interpreted in terms of the original data, as shown in the derivation section of this paper and by previous studies (Mørup and Hansen 2012; Bauckhage 2014). Other pattern recognition methods discussed here are nonnegative matrix factorization (Cichocki et al. 2009; Gillis 2020; Mairal 2014, 2017), clustering, and optimization on manifolds (Boumal et al. 2014; Hannachi and Trendafilov 2017; Hannachi 2021; Trendafilov and Gallo 2021).
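The contrast between variance-maximizing EOFs and patterns taken from the edge of the data cloud can be illustrated with a small NumPy sketch. This is not the AA algorithm itself: the toy data, the function names `leading_eofs` and `extreme_samples`, and the random-direction search for hull points are all assumptions made here for illustration only.

```python
import numpy as np

def leading_eofs(X, k):
    """k leading EOFs (as rows) and their explained-variance fractions.

    X is a (time, space) data matrix; the time mean is removed first.
    """
    Xc = X - X.mean(axis=0)
    # Rows of Vt are orthonormal spatial patterns ordered by decreasing variance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    frac = s**2 / np.sum(s**2)
    return Vt[:k], frac[:k]

def extreme_samples(X, n_dirs=64, seed=0):
    """Indices of samples that maximize the projection onto random directions.

    Each returned sample is a vertex of the convex hull of the rows of X
    (ties aside), i.e., an actual observation at the edge of the data cloud.
    """
    rng = np.random.default_rng(seed)
    directions = rng.standard_normal((X.shape[1], n_dirs))
    return np.unique(np.argmax(X @ directions, axis=0))

# Toy data (an assumption): 500 points drawn uniformly in the square [-1, 1]^2.
rng = np.random.default_rng(42)
X = rng.uniform(-1.0, 1.0, size=(500, 2))

eofs, frac = leading_eofs(X, k=2)
edge_idx = extreme_samples(X)

# EOFs are abstract orthonormal directions; edge samples are rows of X itself.
print("EOFs orthonormal:", np.allclose(eofs @ eofs.T, np.eye(2)))
print("number of edge samples found:", edge_idx.size)
```

The EOFs returned here are directions in state space, whereas the edge samples are observed states, which is the sense in which hull-based patterns remain interpretable in terms of the original data.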

Like any data mining tool, archetypal analysis involves many decisions that tweak or optimize certain parameters based on the needs of the user, and its output depends strongly on the decisions made along the way. A 39-yr sea surface temperature (SST) reanalysis dataset is used here as a paradigm for the method and to illustrate some of these choices.

We demonstrate the utility of this analysis method and the benefits that arise particularly when analyzing climate and weather datasets. We show that the decisions made to implement AA can greatly affect the interpretation of results and should therefore be considered carefully and documented thoroughly in all work involving AA. The structure of this paper is as follows. Sections 1 and 2 provide a historical perspective and the rationale for the work. Next, we introduce the datasets in section 3. In section 4, we contrast both PCA and AA methods, and introduce the minimization algorithm for reduced space archetypal analysis (RSAA). Section 5 describes some decisions required on the input data and on the archetypal analysis, and their impacts on the final result. We then examine potential generalization and extension of the AA method in section 6. Section 7 illustrates teleconnections derived from extreme conditions resolved by AA applied to SSTA. Last, section 8 provides a summary and conclusion statement. The appendix touches on the available computing packages.

2. Data

We apply AA to the Optimum Interpolation Sea Surface Temperature (OISST), version 2.1 (v2.1), high-resolution dataset (Reynolds et al. 2007), which provides SST on a 0.25° global grid; a subset from 1982 to 2020 is reinterpolated to a 4° × 4° resolution when computational efficiency is required. Some illustrations of the technique consider daily anomalies, but here we focus on monthly anomalies. SST anomalies (SSTA) are defined in the standard way: daily and monthly SST anomalies are departures from daily and monthly climatological values, defined as the time mean for each day of the year and each month of the year, respectively, across all years in the interval 1982–2020. Only complete years are considered. The archetypes' spatial and temporal imprints, and their linkages to extratropical atmospheric circulation, are revealed by compositing JRA-55 reanalysis atmospheric fields (Kobayashi et al. 2015) at the surface and on isobaric levels with the corresponding level of temporal aggregation.
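
The monthly anomaly definition above can be sketched in a few lines. This is a minimal illustration only, assuming the SST field is held as a NumPy array of shape (space, time) that starts in January and contains complete years; the function name `monthly_anomalies` and the synthetic data are hypothetical and not part of the original analysis.

```python
import numpy as np

def monthly_anomalies(sst):
    """Remove the monthly climatology from a (space, time) array.

    Assumes the time axis starts in January and spans complete years only.
    """
    n_space, n_time = sst.shape
    if n_time % 12 != 0:
        raise ValueError("time axis must contain complete years")
    # View the time axis as (year, month) and average over the years.
    by_month = sst.reshape(n_space, n_time // 12, 12)
    climatology = by_month.mean(axis=1, keepdims=True)
    return (by_month - climatology).reshape(n_space, n_time)

# Synthetic example: 5 grid cells, 39 years of monthly values.
rng = np.random.default_rng(0)
months = np.arange(39 * 12)
seasonal = 2.0 * np.sin(2 * np.pi * months / 12)
sst = 15.0 + seasonal + rng.normal(size=(5, months.size))
ssta = monthly_anomalies(sst)
```

Anomalies built this way have zero time mean at every grid cell, because each calendar month is centered separately over the same set of years.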

The SST dataset is used principally to illustrate the AA method: the weather and climate research community is familiar with its variability patterns across multiple spatiotemporal scales, and its teleconnections have been extensively studied and can be readily compared with the AA results presented hereafter.

3. Mathematical derivation

AA belongs to an ever increasing class of data analysis methods called matrix factorization (Cichocki et al. 2009; Elad 2010; Elden 2019; Gan et al. 2020; Gillis 2020; Hannachi 2021), where factorization allows one to represent the original data as a combination of factors or components that are provably easier to interpret. Another advantage of factorization is that the dimensionality of the dataset, and hence its complexity, can also be reduced. Due diligence when these methods are employed requires thoroughly testing their domain of applicability. Hereafter, we will focus on PCA, one of the oldest and most widespread techniques in statistical data analysis (Hotelling 1933; Jolliffe 1986), and AA, a lesser known one that is emerging in applications to geophysical problems (Steinschneider and Lall 2015; Hannachi and Trendafilov 2017; Richardson et al. 2021; Risbey et al. 2021).

The goal of factorization, when it is suitable for the problem at hand, is to extract a reduced representation of the underlying processes generating some imprint in the data. In particular, let us assume that the underlying phenomenon is El Niño–Southern Oscillation (ENSO), leading to SSTA, the dataset we wish to investigate. Without loss of generality, we assume SSTA to be represented by a two-dimensional data matrix $X = X_{s\times t}$, where the dimensions s and t represent the space and time coordinates, respectively. Here, the space coordinate s maps into a two-dimensional longitude and latitude coordinate domain. SSTA are typically centered with respect to the time dimension t,
$$\sum_{j=1}^{t} X(i,j) = 0, \quad \forall i, \tag{1}$$
for all geographical locations if the climatological mean is considered over the entire time domain. Throughout the paper, the subscripts, s × t, are used to remind the reader of matrix dimensions, whereas round brackets, (i, j), refer to single matrix elements.
Mathematically, X can be decomposed or factored by a standard algorithm called singular value decomposition (SVD), available in most data analysis software in use today,
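$$X_{s\times t} \approx X^{r}_{s\times t} = U_{s\times r}\,\Lambda_{r\times r}\,V^{T}_{r\times t} = \sum_{i=1}^{r} \lambda_i\,\mathrm{PC}_i(t)\,\mathrm{EOF}_i(s), \tag{2}$$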
where r ≤ min(s, t). When the rank r of X equals min(s, t), $X = X^{r}$ to machine precision. The factors, the matrices U and V, or component-wise the functions $\mathrm{EOF}_i$ and $\mathrm{PC}_i$, identify data-driven spatial patterns and associated time series, in contrast to other decomposition methods such as Fourier or spherical harmonics, for example, where basis functions are chosen a priori. In geophysics, the spatial patterns are referred to as EOFs and the time series as principal components (PCs). An underlying constraint in the SVD algorithm is that both EOFs and PCs are orthonormal,
$$\sum_{i=1}^{s}\sum_{j=1}^{s} U(i,t)\,U(j,t) = \delta_{ij}, \quad \forall t, \tag{3}$$
and, correspondingly,
$$\sum_{i=1}^{t}\sum_{j=1}^{t} V(s,i)\,V(s,j) = \delta_{ij}, \quad \forall s. \tag{4}$$
The matrix $\Lambda = \lambda_i\delta_{ij}$, with i, j = 1, …, r ≤ min(s, t), is diagonal and positive semidefinite, Λ ≥ 0. The $\lambda_i \ge 0$ are where the true physical scales of the data reside. An interesting property of SVD is that the factors,
$$\frac{\lambda_i^{2}}{\mathrm{Trace}\{XX^{T}\}}, \tag{5}$$
correspond to the fraction of the variance explained by each individual product or mode, $\mathrm{PC}_i(t)\,\mathrm{EOF}_i(s)$, and are related to the eigenvalues, $\lambda_i^{2}$, of the covariance of X (Jolliffe 1986; Jolliffe and Cadima 2016; Hannachi 2021). The positiveness of Λ has the added advantage that modes can be ranked as a function of explained variance. This property can be further exploited in two ways: 1) to isolate modes of variability based on the explained variance of X and 2) to reduce the dimensionality of X for computational convenience and interpretability. PCA modes also display the important property of “nestedness,” whereby the variance explained by the lower rank approximation $X^{r}$ is also contained in $X^{r+1}$, $r \le \min(s,t) - 1$. Note, however, that PCA modes of variability do not necessarily map neatly into physical modes of variability (North 1984; Mo and Ghil 1987; Hasselmann 1988; Monahan et al. 2009). As in Richardson et al. (2021) and Risbey et al. (2021), we will apply PCA as a dimension reduction tool.
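
As a minimal sketch of the SVD factorization and of the explained-variance fractions described above, the following uses NumPy on a small random matrix standing in for SSTA. The matrix sizes, the random data, and the helper name `truncate` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
s, t = 50, 30                       # illustrative sizes (space, time)
X = rng.normal(size=(s, t))
X -= X.mean(axis=1, keepdims=True)  # center in time at every location

# Thin SVD: X = U diag(lam) Vt, with min(s, t) singular values.
U, lam, Vt = np.linalg.svd(X, full_matrices=False)

# Fraction of variance explained by each mode: lam_i^2 / Trace{X X^T}.
explained = lam**2 / np.trace(X @ X.T)

def truncate(r):
    """Rank-r approximation built from the r leading modes."""
    return U[:, :r] @ np.diag(lam[:r]) @ Vt[:r, :]

X_full = truncate(min(s, t))        # lossless up to machine precision
X_3 = truncate(3)                   # keeps only the three leading modes
```

The fractions in `explained` sum to one and are sorted in decreasing order, and the variance retained by `X_3` equals the sum of its first three entries, which is the nestedness property in numerical form.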
AA has been applied to the analysis of weather and climate data only recently. For a comprehensive description of various implementations of the method, the reader is referred to Steinschneider and Lall (2015), Hannachi and Trendafilov (2017), Richardson et al. (2021), Risbey et al. (2021), and Han et al. (2022). Only a brief derivation is presented hereafter. Similar to PCA in Eq. (2), AA is also a matrix factorization method, whereby the original data matrix $X = X_{s\times t}$ is approximated by the product of two factors, XC and S, according to the optimization problem
$$\underset{C,\,S}{\arg\min}\ \lVert X - XCS \rVert_F^{2}, \tag{6}$$
where $\lVert\cdot\rVert_F^{2}$ stands for the squared Frobenius norm and the solutions, C and S, are to be sought in sets of left-stochastic¹ matrices $C = C_{t\times p}$ and $S = S_{p\times t}$ for a prescribed order p. The columns of the factor $XC_{s\times p} = X_{s\times t}C_{t\times p}$, p convex combinations of t data points, are called archetypes. Furthermore, each data point in X can also be approximated by a convex combination of p archetypes according to $XCS = XC_{s\times p}S_{p\times t} = X_{s\times t}C_{t\times p}S_{p\times t}$. Although not immediately apparent in the formulation of the AA problem, the archetypes can be shown to lie on the convex hull² enclosing the data, a welcome characteristic if one is interested in identifying extremes or outliers of datasets (Cutler and Breiman 1994; Bauckhage and Thurau 2009; Bauckhage 2014).
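
The minimization in Eq. (6) can be attacked in several ways. As a minimal sketch only, the following alternates projected-gradient steps on S and C, projecting every column back onto the probability simplex after each step. This is a generic textbook scheme chosen for brevity, not the solver used in the studies cited above; the function names, step sizes, iteration count, and random initialization are all assumptions.

```python
import numpy as np

def project_columns_to_simplex(M):
    """Euclidean projection of every column of M onto the probability simplex."""
    n = M.shape[0]
    sorted_desc = -np.sort(-M, axis=0)
    shifted_cumsum = np.cumsum(sorted_desc, axis=0) - 1.0
    ranks = np.arange(1, n + 1)[:, None]
    # Number of components that remain positive after the shift (always >= 1).
    k = np.sum(sorted_desc - shifted_cumsum / ranks > 0, axis=0)
    theta = shifted_cumsum[k - 1, np.arange(M.shape[1])] / k
    return np.maximum(M - theta, 0.0)

def aa_loss(X, C, S):
    """Residual sum of squares ||X - X C S||_F^2."""
    return np.linalg.norm(X - X @ C @ S) ** 2

def archetypal_analysis(X, p, n_iter=200, seed=0):
    """Alternating projected gradient for min ||X - X C S||_F^2 (sketch)."""
    rng = np.random.default_rng(seed)
    t = X.shape[1]
    C = project_columns_to_simplex(rng.random((t, p)))
    S = project_columns_to_simplex(rng.random((p, t)))
    x_norm2 = np.linalg.norm(X, 2) ** 2            # squared spectral norm of X
    history = [aa_loss(X, C, S)]
    for _ in range(n_iter):
        Z = X @ C                                   # current archetypes (s x p)
        # Step 1/L with L the Lipschitz constant of the gradient in S.
        step_S = 1.0 / max(np.linalg.norm(Z, 2) ** 2, 1e-12)
        S = project_columns_to_simplex(S - step_S * Z.T @ (Z @ S - X))
        # Same idea for C, with L = ||X||_2^2 ||S||_2^2.
        step_C = 1.0 / max(x_norm2 * np.linalg.norm(S, 2) ** 2, 1e-12)
        grad_C = X.T @ (X @ C @ S - X) @ S.T
        C = project_columns_to_simplex(C - step_C * grad_C)
        history.append(aa_loss(X, C, S))
    return C, S, history
```

A call such as `C, S, history = archetypal_analysis(X, p=4)` returns left-stochastic factors and a nonincreasing loss history. Because the joint problem is not convex, different seeds may end in different local minima, so several restarts are advisable.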
We consider hereafter a modification of AA, called RSAA, introduced by Richardson et al. (2021), Risbey et al. (2021), and Han et al. (2022) to substantially reduce the computational burden incurred by large datasets. RSAA is performed on a reduced form of the data matrix, $X^{r}$, derived by applying a PCA-driven dimension reduction step to X, where the factorization in Eq. (2) is truncated to a given order r informed by a predefined percentage of the total variance explained to be kept in the analysis. This consideration is mainly driven by the computational resources available and by an investigation of what processes, if any, account for the residual variance excluded. RSAA seeks to minimize the following cost function:
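$$\lVert X^{r} - X^{r}C^{r}S^{r} \rVert_F^{2} = \lVert U\Lambda V^{T} - U\Lambda V^{T}C^{r}S^{r} \rVert_F^{2} = \lVert U(\Lambda V^{T} - \Lambda V^{T}C^{r}S^{r}) \rVert_F^{2} = \lVert \Lambda V^{T} - \Lambda V^{T}C^{r}S^{r} \rVert_F^{2}, \tag{7}$$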
for the left-stochastic matrices $C^{r} = C^{r}_{t\times p}$ and $S^{r} = S^{r}_{p\times t}$, the superscript r indicating here the solution for the reduced problem in contrast to Eq. (6). Elementwise, the stochasticity constraints are such that both $C^{r}, S^{r} \ge 0$ with $\sum_{i=1}^{t} C^{r}(i,p) = 1, \forall p$, and $\sum_{i=1}^{p} S^{r}(i,t) = 1, \forall t$. The invariance of the Frobenius norm, $\lVert\cdot\rVert_F^{2}$, under unitary transformation has been employed in Eq. (7) to eliminate the spatial dependency U from the minimization. The parameter p, the archetype cardinality³ or order, is set a priori. The reduced space archetypes are defined as the products $\Lambda V^{T}C^{r}$, and all records in $\Lambda V^{T}$, the matrix built on the eigenvalues $\lambda_i$ and $\mathrm{PC}_i$, can be approximated by $\Lambda V^{T}C^{r}S^{r}$. To recover the archetypes of the original (albeit reduced) dataset $X^{r}$, the reduced space archetypes are simply left multiplied by the matrix U, built on the $\mathrm{EOF}_i$, giving $U\Lambda V^{T}C^{r}$. Similarly, all records in $X^{r}$ are approximated by $U\Lambda V^{T}C^{r}S^{r}$.
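
The equality of the full and reduced cost functions in Eq. (7) can be checked numerically for any pair of left-stochastic matrices. The sketch below is illustrative only: the matrix sizes are arbitrary, and C and S are random Dirichlet draws rather than AA solutions.

```python
import numpy as np

rng = np.random.default_rng(1)
s, t, r, p = 40, 25, 10, 4                 # illustrative sizes
X = rng.normal(size=(s, t))
X -= X.mean(axis=1, keepdims=True)

U, lam, Vt = np.linalg.svd(X, full_matrices=False)
U_r = U[:, :r]                             # leading EOFs
LVt = np.diag(lam[:r]) @ Vt[:r, :]         # scaled PCs, Lambda V^T (r x t)
X_r = U_r @ LVt                            # reduced data matrix X^r

# Arbitrary left-stochastic matrices: every column sums to 1.
C = rng.dirichlet(np.ones(t), size=p).T    # t x p
S = rng.dirichlet(np.ones(p), size=t).T    # p x t

full_cost = np.linalg.norm(X_r - X_r @ C @ S) ** 2
reduced_cost = np.linalg.norm(LVt - LVt @ C @ S) ** 2

# Archetypes in the original space: left-multiply the reduced ones by U.
archetypes = U_r @ (LVt @ C)
```

Here `full_cost` and `reduced_cost` agree to rounding error because the columns of `U_r` are orthonormal, while the reduced problem works with an r × t matrix instead of an s × t one.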

PCA and AA are data-driven factorization methods belonging to the family of unsupervised clustering techniques (Mørup and Hansen 2012). Only the PCA truncation order r and archetype cardinality p (akin to the number of clusters in clustering methods) allowed in the decomposition are predefined. Although not explicitly indicated in our notation, so as not to render it too cumbersome, we weight the SSTA data matrix $X_{s\times t}$ spatially by cos(θ), θ being the latitude of the grid cell s, prior to applying SVD. It is also important to realize that, in both PCA and AA cases, the factorization does not take into account serial correlations existing in the data. Indeed, any permutation of time records changes neither the $\mathrm{EOF}_i$ nor the archetypes XC; and vice versa, any permutation of the space records s changes neither the $\mathrm{PC}_i$ nor the stochastic matrix S, given that permutations are also unitary transformations leaving the Frobenius norm invariant. Further postprocessing steps are therefore required to tease out any spatiotemporal relationship.

The constraints in the minimization procedure in both cases critically affect the factorization. For PCA and a predefined truncation level r = min(s, t), the factorization is lossless. In other words, the original dataset X is identical to its factorization expression within machine precision. For AA, the positiveness and stochasticity (convexity) constraints on both factors C and S lead to sparser and lossy representations as a function of the selected cardinality p and retained dimension r. AA representations will never explain the total variance of the original dataset, even when r = min(s, t), unless the number of archetypes trivially equals the number of observations, that is, p = t (Bauckhage 2014). In the following, the predefined truncation level r ≤ min(s, t) corresponds to the number of retained dimensions or $\mathrm{PC}_i$ with i = 1, …, r, ranked by fraction of explained variance, used in RSAA.

The appeal of AA over PCA has been succinctly summarized by Bauckhage (2014):

Archetypes are convex combinations of data points and data points are approximated in terms of convex combinations of archetypes.

The convexity characteristic of AA is crucial and leads to a probabilistic interpretation of both archetypes XC and data point representations XCS, the matrices C and S being stochastic. However, Bauckhage's (2014) summary description glosses over the fact that archetypes are “closer” to individual data points and possibly more representative than EOF patterns. EOF patterns may never be observed, as the covariance of X is the mean (expectation) of the product of anomalies derived from the climatology computed over the entire dataset.

Finally, to reveal potential teleconnections based on AA, the composites $A_F$ for p archetypes from any geophysical field, $F = F_{s\times t}$, can be constructed directly from the stochastic matrices, $C = C_{t\times p}$ or $S = S_{p\times t}$, as the products $A_F = FC$ or $A_F = F\tilde{S}^{T}$, both C and $\tilde{S}_{p\times t} = S_{p\times t}/[\sum_{j=1}^{t} S(p,j)]$ depending solely on time. We note that the field F to composite need not be interpolated on the grid of the original dataset X, but time records should overlap and aggregation levels should match for consistency (i.e., daily to daily, monthly to monthly averaged records).
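
The two composite definitions above amount to weighted time averages of the field F. A minimal sketch follows, in which the function name `composites` and the random test matrices are assumptions made for illustration.

```python
import numpy as np

def composites(F, C, S):
    """Return the composites F C and F S~^T for a field F of shape (space, time).

    C is t x p with columns summing to 1; S is p x t. Each row of S is
    rescaled so that the weights of every archetype sum to 1 over time.
    A row of S that is entirely zero would make the rescaling undefined.
    """
    S_tilde = S / S.sum(axis=1, keepdims=True)
    return F @ C, F @ S_tilde.T

# Illustrative inputs: F lives on its own grid (7 points), sharing only time.
rng = np.random.default_rng(0)
t, p = 20, 3
C = rng.dirichlet(np.ones(t), size=p).T    # t x p
S = rng.dirichlet(np.ones(p), size=t).T    # p x t
F = rng.normal(size=(7, t))
A_from_C, A_from_S = composites(F, C, S)   # both 7 x p
```

Since both sets of weights sum to one over time, a field that is constant in time is returned unchanged by either composite, which is a convenient sanity check.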

4. Data-driven characteristics impacting AA

a. Data structure and distribution

Since the inception of AA by Cutler and Breiman (1994), it has been recognized that the location of archetypes is intimately linked to the underlying distribution of the data under investigation and driven by its outliers or extremes. As with any clustering technique, the behavior of the method on selected datasets should therefore be well understood prior to its implementation. As illustrated by elliptically shaped density level sets⁴ of points drawn from a normal distribution N(μ, Σ) with mean μ = [1, 1] and covariance
$$\Sigma = \begin{bmatrix} 1 & 0.8 \\ 0.8 & 1 \end{bmatrix},$$
Cutler and Breiman (1994) show AA solutions for four archetypes are preferentially located along the level set ellipse’s major 2a and minor 2b axes with a/b = 3.⁵ When applied to 100 samples of 1000 points each, the nonisotropic character of the underlying distribution comes to the fore for eccentricities $e = \sqrt{1 - (b^{2}/a^{2})} > 0$. One can easily convince oneself that, when a = b (e = 0), the distribution is spherical and no preferential directions away from the mean μ are privileged by the method. The archetype locations in data space will strongly depend on outliers in each of the samples, as the underlying distribution is invariant under rotations centered on the mean, μ, and outliers are equally likely in any direction.
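
The axis ratio and eccentricity of the level-set ellipse follow from the eigenvalues of Σ, since the semi-axes of a normal density level set scale with the square roots of those eigenvalues. A short check, with variable names chosen here for illustration:

```python
import numpy as np

Sigma = np.array([[1.0, 0.8],
                  [0.8, 1.0]])
eigvals = np.linalg.eigvalsh(Sigma)        # ascending order: 0.2 and 1.8
b, a = np.sqrt(eigvals)                    # semi-minor and semi-major scales
axis_ratio = a / b                         # sqrt(1.8 / 0.2) = 3
eccentricity = np.sqrt(1.0 - b**2 / a**2)  # sqrt(8 / 9), about 0.94
```

The eigenvalues 1.8 and 0.2 give a/b = 3 and an eccentricity close to 1, that is, a strongly elongated ellipse.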
When 100 samples of 1000 points each are drawn from an isotropic distribution in 3 dimensions, for example, a multivariate normal distribution, with mean μ = [0, 0, 0] and trivial covariance
$$\Sigma = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},$$
the eight archetypes resulting from AA applied to each individual sample in Fig. 1a tend to be uniformly distributed on a spherical shell loosely bounded by the convex hulls of all samples. As expected, no preferential direction can be detected by the method, but the ensemble of archetypes over all samples is mainly located on the outer shell, furthest away from the center. In contrast, if the samples are drawn from a multivariate uniform distribution centered on the origin, the distribution support being a three-dimensional cube with edges of unit length, the eight archetypes are preferentially located close to the cube vertices and therefore correspond to extreme points of the distribution. In Fig. 1b, the archetypes are labeled as a function of their distances to a given vertex and the method detects preferential directions in the data space, toward the eight vertices of the cube.
Fig. 1.

Toy datasets illustrating the results of AA applied to 100 samples of 1000 points per sample drawn from (a) a 3D normal distribution of unit variance, (b) a 3D uniform distribution centered on the origin, and (c) the distribution of points from the first three scaled PCs of daily SST anomalies, $\lambda_{1\text{–}3}\mathrm{PC}_{1\text{–}3}$. In all examples, the dark gray points depict the 1000 points of the last of the 100 samples and the light gray points represent their projections on the XY, YZ, and ZX planes of the coordinate system. The eight archetypes resulting from AA applied to each sample are shown in color.

Citation: Artificial Intelligence for the Earth Systems 1, 3; 10.1175/AIES-D-21-0007.1

In general, geophysical observables do not have a spherically or elliptically shaped distribution. As a tractable example, Fig. 1c displays the results of AA for eight archetypes applied to 100 samples of 1000 daily SSTA records drawn from the 14 245 records of sea surface temperatures over the 1982–2020 period. When the first three scaled PCs, $\lambda_{1\text{–}3}\mathrm{PC}_{1\text{–}3}$, are retained for AA, the reduced dataset seems at first glance elliptically distributed given that the eigenvalues in the singular value decomposition of SSTA are such that $\lambda_1 > \lambda_2 > \lambda_3$. However, a closer inspection of the three-dimensional point cloud reveals that, at times, excursions occur away from the broad ellipsoidally shaped distribution; these outliers are readily identified by the AA method and are consistently arranged in small spatially coherent clusters across all 100 samples.

To represent AA results for dimensions larger than 3, a two-dimensional representation of the stochastic matrix $S = S_{p\times t}$ and corresponding archetypes $X_{s\times t}C_{t\times p}$ can be constructed given that the points S(⋅, t) belong to the simplex
$$\Delta^{p-1} = \left\{ (s_1, \ldots, s_p) \in \mathbb{R}^{p} \;\middle|\; \sum_{i=1}^{p} s_i = 1,\ s_i \ge 0 \right\}.$$

Points in $\Delta^{p-1}$ can be projected on the plane perpendicular to the diagonal of the nonnegative orthant⁶ of the p-dimensional hypercube centered at the origin of $\mathbb{R}^{p}$. The simplex projection footprint corresponds to the edges and interior points of regular p-polygons (Seth and Eugster 2016; Hannachi and Trendafilov 2017), where vertices correspond to the pure archetypes XC(s, ⋅), reached when all but one element of S(⋅, t) at time t equal zero, the sole nonzero element being necessarily equal to 1. Any other point either on the edges or in the interior of the p-polygons corresponds to a convex mixture of archetypes. Figure 2 displays simplex projections of the stochastic matrix $S_{p\times t}$ for detrended monthly SSTA over 1982–2020 for four (Fig. 2a) and eight (Fig. 2b) archetypes, respectively. The vertex color indicates how close the data record representation lies to a single archetype. Mixed colors away from vertices toward the 4- and 8-polygon centers indicate records represented by mixtures of archetypes. The simplex representation may help to visualize the clustering of “extreme” records in the datasets.
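
One common way to draw such a simplex plot, given here as an assumption since the exact projection used for Fig. 2 is not specified in this section, is to place the p archetypes at the vertices of a regular polygon on the unit circle and to map each column of S to the correspondingly weighted average of those vertices. The function name `simplex_to_polygon` is hypothetical.

```python
import numpy as np

def simplex_to_polygon(S):
    """Map the columns of a left-stochastic S (p x t) to 2D points in a regular p-gon."""
    p = S.shape[0]
    angles = 2.0 * np.pi * np.arange(p) / p
    vertices = np.vstack([np.cos(angles), np.sin(angles)])   # 2 x p
    # Each record becomes the S-weighted average of the polygon vertices.
    return vertices @ S                                       # 2 x t

# Pure archetypes land on the vertices; an even mixture lands at the center.
pure = simplex_to_polygon(np.eye(4))
center = simplex_to_polygon(np.full((4, 1), 0.25))
```

With this construction, records dominated by a single archetype plot near a vertex, and mixtures move toward the interior of the polygon.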

Fig. 2.

Simplex representations of detrended monthly SST anomalies over 1982–2020 for archetype cardinalities of (a) 4 and (b) 8.


b. Dimensionality reduction

The spatiotemporal dimensions of the SST and JRA-55 datasets considered throughout this work are commensurate with the resolutions of other major global reanalysis efforts and ocean–atmosphere general circulation model (OAGCM) simulations. For OISST v2.1, the daily (monthly) data matrix $X = X_{s\times t}$ consists of s = 691 119 spatial points times t = 14 245 (468) daily (monthly) records, in total about $O(10^{10})$ data points for daily records and an order of magnitude less for monthly records, $O(10^{9})$. Similarly, JRA-55 atmospheric reanalysis fields used hereafter to investigate teleconnections correspond to 41 760 spatial points × 14 245 daily records, also of order $O(10^{9})$ for single pressure level data on a 1.25° × 1.25° latitude–longitude grid. The large and inflationary increase in spatial and temporal resolution over the last four decades has led to exponential growth in the size of available datasets and is mainly due to the advent of both spaceborne observing platforms and the supercomputing platforms used to collect and assimilate observations into circulation models. Because of its high dimensionality, this voluminous amount of data is particularly hard to manipulate when one is interested in the study of large scale phenomena and their global teleconnection imprints. A direct application of AA to the original data matrix X, an example of a Frobenius distance clustering problem, incurs a large computational burden and is likely to be ill posed. AA optimization times are driven by the nondeterministic polynomial (NP)-hard⁷ character of the minimization algorithms (Aloise et al. 2009; Bauckhage and Manshaei 2014).

To alleviate the dimensionality issue, we may first consider a reduction of the domain size or spatiotemporal averaging. This is suitable for certain problems; for example, when AA is used to find extreme patterns involving mesoscale ocean processes and both local and remote linkages to atmospheric circulation, or when variability at smaller scales and higher frequencies can safely be ignored. For example, applying 4° × 4° spatial averaging to the original 0.25° × 0.25° OISST v2.1 dataset decreases the spatial dimension by a factor of 16 × 16 = 256, two orders of magnitude. Additional aggregation of the time dimension from daily to monthly records further reduces the total number of data points, for a combined factor of 14 245/468 × 256 ≈ 7792, without substantially changing SST AA results for large scale phenomena such as ENSO, for example.
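
The reduction factors quoted above follow from simple arithmetic, reproduced here with variable names chosen for illustration:

```python
# Spatial coarsening from 0.25 degrees to 4 degrees in both directions.
spatial_factor = (4.0 / 0.25) ** 2          # 16 x 16 = 256

# Temporal aggregation from 14 245 daily to 468 monthly records.
temporal_factor = 14245 / 468               # about 30.4

total_factor = spatial_factor * temporal_factor
```

The product is about 7792, matching the combined reduction in the number of data points stated in the text.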

Another class of dimensionality reduction rests on general matrix factorization procedures (Mørup and Hansen 2012; Nguyen and Holmes 2019), where the original data matrix X can be decomposed into the sum or product of factors with the view to reducing its complexity and dimensionality, diminishing the computational burden of clustering algorithms such as AA. Throughout this work, we focus on one dimension reduction method, PCA, as formulated in the mathematical derivation section (section 3). PCA is an unsupervised, linear dimension reduction method with the following advantages: 1) the dimension reduction level can be informed by the fractional amount of explained variance desired to be maintained in the reduced data passed to AA and 2) the global data structure, its spatial covariance that is, is preserved in contrast to other more sophisticated methods (Izenman 2008; Nguyen and Holmes 2019). It is important to note, however, that linear reduction methods do not necessarily “remove” the inherent nonlinear characteristics of the reduced dataset.

To illustrate the equivalence between AA and RSAA methods as in Eq. (7), we apply AA and RSAA to spatially averaged OISST v2.1 monthly anomalies on a 4° × 4° grid. The AA and RSAA data matrix dimensions are $X = X_{2643\times 468}$ and $\Lambda V^{T} = X_{\mathrm{RSAA},\,468\times 468}$, where all 468 scaled PCs, $\Lambda V^{T}$, in the singular value decomposition $U\Lambda V^{T}$ of X, have been retained. No information loss has been incurred given that 100% of the variance is explained by 468 modes in this case, leading to a dimension reduction factor of 2643/468 ≈ 6. Figure 3 compares the eight-archetype AA and RSAA stochastic matrix components, $S_{\mathrm{AA}}$, $C_{\mathrm{AA}}$, $S_{\mathrm{RSAA}}$, and $C_{\mathrm{RSAA}}$ (top and middle rows), and their absolute differences $|S_{\mathrm{AA}} - S_{\mathrm{RSAA}}|$ and $|C_{\mathrm{AA}} - C_{\mathrm{RSAA}}|$ (bottom row). Differences in the resulting S and C matrices are of a small order of magnitude, $O(10^{-3})$. The data points approximated by the convex combinations of archetypes, $XC_{\mathrm{AA}}S_{\mathrm{AA}}$ and $UX_{\mathrm{RSAA}}C_{\mathrm{RSAA}}S_{\mathrm{RSAA}}$, are similar but not identical, the AA optimization procedure being an NP-hard problem for both. We note that the absolute differences $|S_{\mathrm{AA}} - S_{\mathrm{RSAA}}|$ and $|C_{\mathrm{AA}} - C_{\mathrm{RSAA}}|$ remain of the order of $O(10^{-3})$ for archetype cardinality ranging from 2 to at least 20 (not shown).

Fig. 3.

AA and RSAA results for the stochastic matrices, $S_{\mathrm{AA}}$, $C_{\mathrm{AA}}$, $S_{\mathrm{RSAA}}$, and $C_{\mathrm{RSAA}}$, computed for eight archetypes on detrended monthly SST anomalies over the 1982–2020 period (468 monthly records). The top row shows the resulting S (left) and C (right) for AA on the full dataset, to be compared to the middle row showing S and C when all (468) principal components are retained in RSAA. The bottom row shows the absolute difference in S and C between AA and RSAA.


A difficulty encountered by most dimensionality reduction methods is to justify the level of truncation or reduction to apply to the original dataset. Usually for PCA, researchers rely on the first “significant” step change between consecutive values of the ranked fraction of variance explained [Eq. (5)] when displayed on a scree plot. However, the optimal truncation level is conditioned by the data itself. The SSTA data matrix spectral characteristics show a rather smooth and incremental decrease between consecutive eigenvalues without an obvious step change and, therefore, the truncation levels applied throughout this work are mainly informed by the percentage of variance one wishes to retain and driven by computational considerations: typically, 90%–100% of the variance, corresponding to reduced dimensions of the order of $O(10^{3})$ or less. The residual variance made up of the excluded modes of variability can be displayed to gain insight into where it lies and what it represents. Similarly, a spectral analysis of the associated PCs indicates which time scales are excluded from the reduced dataset. Hereafter, the variance explained is given as the ranked fraction of variance of the full or reduced dataset and ranges from 0 to 1.

The attentive reader may realize that the difficulty mentioned for the dimension reduction problem also applies to the number of archetypes to be chosen in AA. How does one decide how many clusters are needed in order to detect existing or new climate and weather regimes? Christiansen (2007) cautions against the application of clustering techniques to atmospheric regimes without a comprehensive analysis of the robustness and reproducibility of the results. In Fig. 4, the variance explained and sum of squares errors for increasing levels of 1) PCA truncation (blue curves) and 2) archetype cardinality (red curves) are shown to change gradually for both SSTA and detrended SSTA. Here, the fraction of variance explained by the AA factorization, or in short the AA explained variance, is equal to the difference between the data matrix total variance, $\Sigma^{2} = \lVert X\rVert_F^{2} = \mathrm{Trace}\{XX^{T}\}$, and the AA residual sum of squares, $\lVert X - XCS\rVert_F^{2}$, divided by $\lVert X\rVert_F^{2}$,
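$$\Sigma_{\mathrm{AA}}^{2} = 1 - \frac{\lVert X - XCS\rVert_F^{2}}{\lVert X\rVert_F^{2}} = 1 - \frac{\lVert X - XCS\rVert_F^{2}}{\Sigma^{2}}. \tag{8}$$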
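
The AA explained variance defined above is a one-line computation once C and S are available. A minimal sketch, with the function name `aa_explained_variance` and the random inputs chosen here for illustration:

```python
import numpy as np

def aa_explained_variance(X, C, S):
    """Fraction of the total variance of X explained by the factorization X C S."""
    residual = np.linalg.norm(X - X @ C @ S) ** 2   # residual sum of squares
    total = np.linalg.norm(X) ** 2                  # Trace{X X^T}
    return 1.0 - residual / total

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 12))
X -= X.mean(axis=1, keepdims=True)

# Trivial case p = t with C = S = I reproduces the data exactly.
trivial = aa_explained_variance(X, np.eye(12), np.eye(12))

# Arbitrary (non-optimized) left-stochastic factors for p = 3.
C = rng.dirichlet(np.ones(12), size=3).T
S = rng.dirichlet(np.ones(3), size=12).T
arbitrary = aa_explained_variance(X, C, S)
```

The value is bounded above by 1, reached in the trivial case p = t, and it can be small or even negative for poorly chosen factors, which is why it is evaluated at the optimized C and S.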
Fig. 4.

(top) Explained variances and (bottom) sum of squares errors for increasing number of principal components or archetypes for PCA (blue) and AA (red) for both detrended (continuous) and full (dashed) monthly SST anomalies. The AA explained variance [Eq. (8)] reported for both the full and detrended cases corresponds to RSAA results for cardinality from 3 to 20 when all 468 PCs are retained.


No obvious spectral gaps can be readily identified and the spectral characteristics of the RSAA factorization, as for PCA, are of little help.

In investigating the impact of dimension reduction on RSAA factorization, one does not fail to notice that both stochastic matrices, solutions of Eq. (7) C and S, display levels of sparsity depending on both the truncation order and the archetype cardinality. First, one notes that the level of sparsity for both C and S increases as a function of the number of archetypes. This behavior is understandable when one considers the extreme case where the number of archetypes equals the number of observation records t. Trivially, both C and S stochastic matrices are the sparsest with entries taking the values of 1 or 0 when Xs×t = XCS, that is, I = CS exactly, with C and S equal to the identity matrix I = It×t.

The sparsity dependence on the retained dimension order is less obvious and an objective measure of sparsity is required. Following Hurley and Rickard (2009), who systematically compared a range of metrics, we adopt here a novel application of an evaluation metric called the Gini coefficient or index (Gini 1921; Hurley and Rickard 2009; Zonoobi et al. 2011; Abrol and Sharma 2020), defined as follows. Given a set of N observations, $O = (o_1, \ldots, o_N)$, whose values can be ordered from smallest to largest such that $o_{(1)} \le \cdots \le o_{(N)}$, with (1), …, (N) now indexing the sorted elements of O, the Gini coefficient Γ(O) is defined as
$$\Gamma(O) = 1 - 2\sum_{i=1}^{N} \frac{o_{(i)}}{\lVert O\rVert_1} \left( \frac{N - i + 1/2}{N} \right),$$
where $\lVert O\rVert_1$ corresponds to the $l_1$ norm of O, $\lVert O\rVert_1 = \sum_{i=1}^{N} |o_i|$.
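
The Gini coefficient defined above translates directly into code. In the sketch below, the function name `gini` is hypothetical, the input is assumed nonnegative (as is the case for C and S), and pooling all entries of a matrix into a single set of observations is an assumption about how the coefficient is aggregated.

```python
import numpy as np

def gini(values):
    """Gini coefficient of a nonnegative array, pooled over all its entries."""
    o = np.sort(np.ravel(values))             # ascending: o_(1) <= ... <= o_(N)
    n = o.size
    l1 = o.sum()                              # l1 norm for nonnegative entries
    if l1 == 0:
        raise ValueError("Gini coefficient is undefined for an all-zero input")
    i = np.arange(1, n + 1)
    return 1.0 - 2.0 * np.sum(o / l1 * (n - i + 0.5) / n)

even = gini(np.ones(10))                      # perfectly even weights
one_hot = gini(np.eye(1, 10).ravel())         # a single nonzero entry
```

Perfectly even weights give 0, while a single nonzero entry among N gives 1 − 1/N, which approaches 1 for long records.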

The Gini coefficient has been used previously in economics as a measure of wealth inequality, where Gini = 1 indicates maximum inequality and Gini = 0 indicates wealth distributed perfectly evenly across the $o_i$ values. In the application of AA, Gini = 1 for the probability matrix C indicates that the resulting archetype patterns are each expressed by a single data record, whereas a lower Gini number would indicate archetype patterns expressed by more than one record. In general, we would like the sparse matrix, C, to be fairly distributive so as not to be overwhelmed by one particular record for each archetype, but still representative of extremes in the data distribution. Therefore, there lies some “sweet spot” that may be indicated by the Gini coefficient for maintaining a distributed C matrix while simultaneously staying true to the extreme mode nature of AA.⁸

Figure 5 illustrates the effect of increasing retained dimensions on how well dispersed the mixture weights are, as well as a comparison of the variance explained by the reduced problem versus the explained variances compared to the total variance of the original data. The top panel plots the Gini coefficient as a function of retained dimension (the number of scaled PCs) for seven selected archetype numbers (shown by different colored lines). For C, the Gini coefficients monotonically decrease from values close to 1, total inequality, to values ranging from 0.98 to 0.91 for the full problem (100% of the total original variance with 468 PCs retained) over the range of archetypal numbers from 3 to 20 considered here. Clearly, the sparsity of C has only been marginally affected by the number of dimensions retained. The “wealth redistribution” or sparsity lessening in C occurs more quickly as function of retained dimensions with fewer archetypes.

Fig. 5.

Gini coefficients as a function of retained dimensions (scaled principal components) for matrices (top) C and (bottom) S for different archetype cardinalities (shown as different colors). Gini coefficients (solid), AA explained variances for the reduced problem (dashed), and AA explained variances [Eq. (8)] as fractions of the total variance (dotted).


The impact on S is, however, remarkable in the bottom panel of Fig. 5. The Gini coefficient values are shown as solid lines, overlaid by the AA explained variances for the reduced problem (dashed) and the AA explained variances compared to the total variance (dotted). For RSAA applied to monthly SSTA, we observe that the Gini coefficients increase approximately monotonically as a function of the number of archetypes and approach an asymptote as a function of the number of retained dimensions. This asymptote is reached earlier when fewer archetypes are selected, and without any substantial gain in the total variance explained by the AA factorization. For any given AA cardinality, the “sweet spot” for the number of retained dimensions used for the analysis 1) corresponds approximately to the fastest change in the AA explained variances for the reduced problem on the dashed curves and 2) lies where the total variance explained on the dotted lines remains unchanged in Fig. 5.

When the dimensionality of the original data becomes computationally prohibitive even for PCA, “approximate” methods of dimension reduction can be deployed. For geophysical datasets, Seitola et al. (2014) and Hannachi (2021) have recently illustrated that the issue of dimensionality could be addressed through random projections (RP), where the PCA decomposition factors, U, Λ, and V in Eq. (2), can be approximated at low computational cost. We refer the reader to the appendix in Seitola et al. (2014) for details on how to generate RP approximations $\Lambda_{\mathrm{RP}}V_{\mathrm{RP}}^{T}$ as proxies for $\Lambda V^{T}$ in RSAA.

c. Trend analysis

Geophysical data often display nonstationary or trending behavior, of which a notable example is the observed warming of SST caused by globally rising temperatures due to increasing greenhouse gas concentrations (IPCC 2013, 2019). It is important to ascertain the impact of this trend on natural climate modes. For example, there is clear observational evidence that significant changes in the nature of key ENSO indicators occurred after 1980, when at least three major El Niño episodes occurred in the 39-yr period between 1982 and 2020 compared to the previous 39 years (Capotondi and Sardeshmukh 2015, 2017; Capotondi et al. 2020). Throughout this work, we focus only on SSTA over the satellite era (Reynolds et al. 2007). Given the relatively short record of near-global SST coverage available since the advent of satellite observing platforms, the power of any statistical analysis to investigate the interplay between SSTA variability and a “warming” or slowly changing mean state is limited. Therefore, we will not attempt hereafter to explain this interaction. However, if we are interested in detecting the different “flavors” of ENSO, removing a linear trend from SSTA prior to AA implementation is a legitimate step to prevent the global warming signature from potentially “washing out” the ENSO global extreme imprints, if one assumes that internal variability of the climate system can be neatly separated from anthropogenic forcing effects. Conversely, if we are interested in the change of ENSO extreme impacts under climate change, we would want to retain that trend. It remains part of the due diligence in the application of AA to properly formulate the questions that this method aims to address.

To illustrate the impact of trend removal, we compare the resulting archetype patterns, XC and XST, and stochastic weights, C and S, for full and linearly detrended SSTA for nAA = 4, 6, and 8 numbers of archetypes in Figs. 6b, 7b, and 8b, respectively. When no SSTA trend is removed, the overall warming is captured in all cases by two archetypes dominated by global cooling and warming patterns: rows 1 and 3 in Fig. 6a for nAA = 4, rows 1 and 2 in Fig. 7a for nAA = 6, and rows 2 and 4 in Fig. 8a for nAA = 8. The associated weights, C and S, also reflect this nonstationary behavior, with stochastic weights S approximately decreasing from 1982 to 2010 for the cooling pattern and increasing from 2011 to 2020 for the warming pattern. The remaining archetypes resemble patterns associated with the different “flavors” of ENSO (for a recent review, see Capotondi et al. 2020). When removing the linear trend prior to applying AA, the global warming and cooling patterns disappear and, for an equal number of archetypes, are “replaced” by additional patterns also bearing resemblance to typical ENSO conditions studied in the literature (Ashok et al. 2007; Cai and Cowan 2009). The effect of the trend removal is especially clear in C and S, both taking nonzero values spread over the entire time period under consideration.
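The detrending step itself can be sketched as a least-squares fit of a straight line in time at every grid point, followed by subtraction of the fitted line. The snippet below is a minimal illustration on synthetic data; the function name `detrend_linear` and the array layout (space × time) are our assumptions, and the fit also removes the time mean, which is harmless for anomalies.

```python
import numpy as np

def detrend_linear(X):
    """Remove a least-squares linear trend in time from each row (grid point)
    of X, an s x t array of anomalies."""
    t = X.shape[1]
    # Design matrix [1, time] shared by all grid points.
    A = np.column_stack([np.ones(t), np.arange(t, dtype=float)])
    coef, *_ = np.linalg.lstsq(A, X.T, rcond=None)   # 2 x s (intercepts, slopes)
    return X - (A @ coef).T

rng = np.random.default_rng(0)
time = np.arange(468)                      # 39 years of monthly records
X = 0.01 * time + rng.standard_normal((5, time.size))
Xd = detrend_linear(X)
slopes = np.polyfit(time, Xd.T, 1)[0]      # residual slopes after detrending
```

Running AA on `X` and on `Xd` with the same cardinality then gives the "full" and "detrended" cases compared in the text.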

Fig. 6.

AA spatial pattern and time series results using (a) full and (b) detrended monthly SST anomalies over 1982–2020 for a selected archetype number of 4. The two left columns on each subplot show archetypes constructed by $XC$ and $X\tilde{S}^{T}$, with $\tilde{S}_{p\times t} = S_{p\times t} / \sum_{i=1}^{t} S(p,i)$, respectively. The two right columns show C and S matrix time series, respectively, with multivariate ENSO index (MEI) time series (gray) included on both.

Fig. 7.

As in Fig. 6, but for an AA cardinality of 6.

Fig. 8.

As in Fig. 6, but for an AA cardinality of 8.


5. Archetypal analysis computation and extensions

a. Initialization and convergence

Esposito (2021) has recently reviewed initialization methods for nonnegative matrix factorization (NMF). NMF shares algorithmic similarities with AA, and so do the initialization strategies employed to solve the optimization problem sketched in section 4, Eq. (7). Here, we combine random-based and clustering-based, or data-driven, initialization procedures. Being simpler to implement, random-based procedures are used as a benchmark for more sophisticated ones but require a thorough investigation of their robustness and reproducibility. For AA, a suite of strategies has been adopted in the literature (Bauckhage and Thurau 2009; Thurau et al. 2009, 2011; Eugster and Leisch 2011; Mørup and Hansen 2012; Bauckhage and Manshaei 2014; Suleman 2017a,b; Mair et al. 2017; Mair and Brefeld 2019). Our benchmark in all cases, the first of typically many randomly sampled initialization trials per optimization, is always the data-driven “FurthestSum” procedure advocated by Mørup and Hansen (2012). FurthestSum initializes the AA procedure, through the matrix C, with a number of observation points equal to the desired AA cardinality, such that the points 1) are located on the convex hull of the dataset and 2) are the furthest away from each other, by taking into account the distance, in this instance the Frobenius norm, between all data points and the initial candidates.

However, Suleman (2017a) criticizes FurthestSum as “ill-conceived” and potentially leading to archetype redundancy after convergence for increasing archetype cardinality. To protect against this eventuality, we implement the initialization procedure prescribed by Mair and Brefeld (2019) based on coreset construction for AA, algorithm 2 in Mair and Brefeld (2019), where initial seed archetypes are randomly drawn from a distribution constructed from the squared Euclidean distance of each data point X(⋅, t) from the time mean of $X_{s\times t}$, which sits at the origin in our case given that X corresponds to anomalies. The AA coreset strategy privileges initialization points that are the furthest away from the mean but, in contrast to FurthestSum, are not necessarily located on the data convex hull.
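The sampling idea behind this initialization can be sketched as follows: each time record is drawn with probability proportional to its squared Euclidean distance from the time mean. This is a simplified illustration under our own assumptions and not a transcription of algorithm 2 in Mair and Brefeld (2019), which may differ in its details; the function name `sample_seeds` and the choice of sampling without replacement are hypothetical.

```python
import numpy as np

def sample_seeds(X, p, seed=0):
    """Draw p distinct time indices with probability proportional to the
    squared Euclidean distance of each record X[:, t] from the time mean."""
    rng = np.random.default_rng(seed)
    d2 = np.sum((X - X.mean(axis=1, keepdims=True)) ** 2, axis=0)
    q = d2 / d2.sum()                       # sampling distribution over records
    idx = rng.choice(X.shape[1], size=p, replace=False, p=q)
    # Encode the seeds as a t x p left-stochastic matrix C (one-hot columns).
    C = np.zeros((X.shape[1], p))
    C[idx, np.arange(p)] = 1.0
    return idx, C

rng = np.random.default_rng(2)
X = rng.standard_normal((6, 50))            # synthetic s x t anomalies
idx, C = sample_seeds(X, p=4)
seed_archetypes = X @ C                     # initial archetypes XC (s x p)
```

Because the seeds are drawn at random, repeating the call with different `seed` values gives the independent trials used in the outer loop of the optimization.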

For all results reported in the paper, we use the MATLAB code PCHA of Mørup and Hansen (2012), suitably modified to accommodate both FurthestSum and AA coreset initializations (see Table A1 in the appendix for references). The AA procedure runs through an outer loop consisting of 1000 initialization trials, 999 random coreset trials and one FurthestSum trial, where for each individual trial, the iterative nonlinear least squares algorithm in PCHA, the inner loop, is allowed to converge with a relative sum of square error (SSE) stopping criterion of $10^{-8}$. We report the solution that minimizes the relative SSE across all 1000 trials. We note that FurthestSum, throughout our many experiments, never corresponds to the optimum. All computations are performed in double precision.

Table 1

Archetypes nesting for SSTA with trend (left-hand side) and SSTA with linear trend removed (right-hand side). The numbers in each row label the archetype rank based on the time mean of the AA stochastic matrix $S_{n_{AA}}$, $\bar{S}_{n_{AA}}(i) = [\sum_{j=1}^{t} S(i,j)]/t$ with $n_{AA}$ = 4, …, 10 and i = 1, …, $n_{AA}$ in decreasing order of $\bar{S}_{n_{AA}}(i)$. Each row corresponds to near-identical archetypes found across $\mathrm{AA}_{n_{AA}}$ for $n_{AA}$ = 4, …, 10. Circled archetype numbers on the left-hand side label the trending patterns.

Table A1

A nonexhaustive list of archetypal analysis package URLs with corresponding computing language types and main references.


b. Archetype cardinality

We observe no clear “knee point” in the evolution of either the AA explained variance or the sum of square errors between the full SSTA data and the AA representations as a function of archetype cardinality in Fig. 4. Therefore, a balance has to be struck between AA cardinality and representation of extreme conditions in the original dataset. To avoid the pitfalls of archetype redundancy mentioned by Suleman (2017a), for example, several initialization procedures, dimension reduction truncation and aggregation levels have to be tested for a number of archetype orders and the results compared (Bauckhage and Thurau 2009; Suleman 2017b).

It is interesting to note, somewhat unexpectedly, that the global SSTA archetypes “nest” in contrast to the assumption of Risbey et al. (2021) for AA applied to geopotential anomalies at 500 hPa. A pattern correlation distance is applied to identify archetype correspondence for different cardinalities. The correspondence is corroborated by visual inspection directly in Fig. 9. Overall, pattern correlations across near-identical archetypes are typically larger than 0.8. Table 1 shows the AA correspondence across several archetype orders from nAA = 4 to 10, for both the full and detrended cases, where each row corresponds to near-identical archetypes independent of the order nAA, at least when nAA is small, ≤10. Such a correspondence could not be readily established when comparing the full with nAA = p and the detrended with nAA = p − 2 sets of archetypes. For the nondetrended SSTA, 2 archetypes account for the cooling and warming patterns. As a linear trend has been removed, the cooling and warming patterns found in the full case have to be absent in the detrended case. For example, Fig. 6b detrended archetypes for a cardinality of 4 (nAA = p − 2 with p = 6) can only be compared to Fig. 7a archetypes 3–6 for a cardinality of 6 (nAA = p) given that archetypes 1 and 2 correspond to global cooling and warming patterns. The mismatch between Fig. 6b detrended archetypes for a cardinality of 4 and the remaining archetypes 3–6 in Fig. 7a for the full problem possibly indicates that a clean separation between a slow-changing mean state and natural modes of variability is elusive, at least as far as the distribution of extremes is concerned.
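Matching archetypes across cardinalities by pattern correlation can be sketched as below. The snippet uses a plain (unweighted) centered correlation between flattened spatial patterns; any area weighting or masking applied in practice is not shown, and the function names are hypothetical.

```python
import numpy as np

def pattern_correlation(A, B):
    """Centered pattern correlations between columns of A (s x p) and B (s x q)."""
    A0 = A - A.mean(axis=0)
    B0 = B - B.mean(axis=0)
    A0 /= np.linalg.norm(A0, axis=0)
    B0 /= np.linalg.norm(B0, axis=0)
    return A0.T @ B0                         # p x q matrix of correlations

def match_archetypes(A, B):
    """For each archetype in A, index and correlation of its best match in B."""
    R = pattern_correlation(A, B)
    return R.argmax(axis=1), R.max(axis=1)

# Synthetic check: B holds the archetypes of A in shuffled order, plus noise.
rng = np.random.default_rng(3)
A = rng.standard_normal((100, 4))
B = A[:, [2, 0, 3, 1]] + 0.01 * rng.standard_normal((100, 4))
best, corr = match_archetypes(A, B)
```

With real archetypes, a match would only be accepted when the correlation exceeds a chosen level, for instance the value of 0.8 quoted above for near-identical patterns.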

Fig. 9.

(a) Stacked bar plots of S-matrix probabilities for detrended monthly SST anomalies over 1982–2020 and (b) corresponding matched archetypes using pattern correlation for cardinalities ranging from 2 to 8. Each row in (a) corresponds to AA results ranging from 2 (top row) to 8 (bottom row), with MEI time series (black) included on all. The bar color codes correspond to matched archetypes referenced to AA results for a cardinality of 8, whereas the labels Ai in each row indicate the archetype ranks based on the time mean of the AA stochastic matrix SnAA with nAA = 2, …, 8 and i = 1, …, nAA in decreasing order of S¯nAA(i). The 7 columns × 8 rows AA patterns in (b) correspond to matched archetypes (rows) referenced to the AA results for cardinality of 8 (last column) across cardinalities ranging from 2 (first column) to 8 (last column).


The nesting properties of AA, when applied to SSTA, could be utilized to increase computational speed when stepping through the archetype orders by “recycling” the results from the previous order, or perturbations thereof, as initial seeds for the next order. Throughout this paper, we assume at the outset that archetypes do not nest, even approximately, and we reinitialize our trials randomly, independently of the results for the previous order.

As a companion to Table 1, Fig. 9 illustrates the pattern “nestedness” and the impact of cardinality on the affiliation probabilities expressed by the matrix $S = S_{p\times t}$ for detrended SSTA and cardinalities ranging from p = 2 to 8. The stacked bar plots of S-matrix affiliation probabilities as a function of time, Fig. 9a, take advantage of the stochastic constraint along the cardinality dimension p, where each row corresponds to AA results for cardinalities ranging from 2 (top row) to 8 (bottom row). The bar color codes correspond to matched archetypes referenced to AA results for a cardinality of 8 and are given in the legend of the last row. For each time record, the length of each colored bar corresponds to the probability of a given archetype to be expressed. The labels Ai in each row indicate the archetype ranks based on the time mean of the AA stochastic matrix $S_{n_{AA}}$, $\bar{S}_{n_{AA}}(i) = [\sum_{j=1}^{t} S(i,j)]/t$ with $n_{AA}$ = 2, …, 8 and i = 1, …, $n_{AA}$ in decreasing order of $\bar{S}_{n_{AA}}(i)$. As a guide, the 7 columns × 8 rows AA patterns in Fig. 9b correspond to matched archetypes (rows) referenced to the AA results for cardinality of 8 (last column) across all tested cardinalities ranging from 2 (first column) to 8 (last column).

Figure 9a illustrates the hierarchical nature of the AA power of discrimination for extreme SSTA conditions as a function of increasing cardinality. For example, it shows that no spurious blending of archetypes occurs when “it matters,” that is, when extreme conditions occur. This can be seen when the archetype corresponding to the three major Niño intervals between 1982 and 2020 matches across 2–8 cardinalities as the pink colored intervals in 1982–83, 1997–98, and 2015–16 indicate. For a cardinality of 2 in Fig. 9a first row, all records, being convex combinations of archetypes, have to be expressed as “blended” patterns by construction unless they correspond to strongly expressed Niño and Niña intervals (respectively pink and light green) as depicted in Fig. 9b fifth and third rows. Individual data records not corresponding to extremes will lead to a low discrimination score Δp(t) introduced in Eq. (10) or a low Gini coefficient Γp(t) in Eq. (11) for S introduced in section 5c. This occurs when the probabilities of being expressed are approximately the same (≈1/p) for all archetypes for a given cardinality p. The extra patterns introduced by increasing the cardinality could be viewed as transition patterns or potentially new “extreme” conditions in SSTA identified by AA.

c. Serial correlation and causality

As mentioned in section 4, a direct AA factorization of SSTA, $X_{s\times t} \approx XC_{s\times p} S_{p\times t} = X_{s\times t} C_{t\times p} S_{p\times t}$ with s × t spatiotemporal dimensions and p archetype cardinality, does not explicitly capture serial correlation or temporal patterns existing in the data. Serial correlations or causal relationships between records need to be extracted from the matrix $S_{p\times t}$, which explains how individual records are expressed in terms of archetypes $XC_{s\times p}$ and how the AA representation evolves with time.

Risbey et al. (2021) introduced a simple metric, a discrimination score, to isolate persistent atmospheric flow patterns based on the matrix $S_{p\times t}$. The discrimination score $\Delta_p(t)$ takes advantage of the stochastic nature of S, $\sum_{i=1}^{p} S(i,t) = 1, \forall t$, such that
$$\Delta_p(t) = 1 - \left(\frac{1}{p-1}\right)\left[\frac{1}{\max_{i=1,\ldots,p} S(i,t)} - 1\right] = 1 - \left(\frac{1}{p-1}\right)\left[\frac{1}{S(i_{\max},t)} - 1\right], \tag{10}$$
where $\max_{i=1,\ldots,p} S(i,t)$ and $i_{\max} = i_{\max}(t)$ correspond to the maximum values of S(⋅, t) and archetypes labeled by $i_{\max}$ contributing with highest probability to the snapshot X(⋅, t) for each time record t. Combined with a persistence criterion, the discrimination score $\Delta_p(t)$ and archetype affiliation $i_{\max}(t)$ with $i_{\max}(t) \in \{1,\ldots,p\}$ can be used to study the evolution of “extreme” conditions or their AA representation over the period of interest. A discrimination score equal to 1 means that the archetype corresponding to $i_{\max}(t)$ is expressed with a probability weight of 1 in record t, whereas a zero discrimination score means that all archetypes are equally expressed in record t, all with the same probability of 1/p.
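The discrimination score is straightforward to evaluate from S. The following sketch, with a hypothetical function name and a small hand-made matrix, returns both $\Delta_p(t)$ and the winning archetype index (zero-based in Python, whereas the text counts from 1).

```python
import numpy as np

def discrimination_score(S):
    """Delta_p(t) = 1 - (1/(p-1)) * (1/max_i S(i,t) - 1) for a p x t
    left-stochastic matrix S; also returns the winning archetype i_max(t)."""
    p = S.shape[0]
    s_max = S.max(axis=0)
    return 1.0 - (1.0 / s_max - 1.0) / (p - 1), S.argmax(axis=0)

# Three records for p = 4: fully expressed, fully blended, and intermediate.
S = np.array([[1.0, 0.25, 0.7],
              [0.0, 0.25, 0.1],
              [0.0, 0.25, 0.1],
              [0.0, 0.25, 0.1]])
delta, i_max = discrimination_score(S)   # delta is approximately [1.0, 0.0, 0.857]
```

The first two records reproduce the limiting cases described above: a score of 1 for a single archetype expressed with probability 1, and a score of 0 when all archetypes share the probability 1/p.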
Unsurprisingly, the Gini coefficient⁹ [Eq. (9)], $\Gamma_p(t)$, based on the sum of $S = S_{p\times t}$ row vector components weighted by their normalized ascending rank index for each record t,
$$\Gamma_p(t) = 1 - 2\sum_{i=1}^{p} \frac{S[\pi_t^p(i),t]}{\lVert S(\cdot,t)\rVert_1}\left(\frac{p - i + 1/2}{p}\right) = 1 - 2\sum_{i=1}^{p} \frac{S[\pi_t^p(i),t]}{\sum_{j=1}^{p} S(j,t)}\left(\frac{p - i + 1/2}{p}\right) = 1 - 2\sum_{i=1}^{p} S[\pi_t^p(i),t]\left(\frac{p - i + 1/2}{p}\right), \tag{11}$$
highly correlates with the discrimination score $\Delta_p(t)$ [Eq. (10)] with correlation values ≥ 0.95 for archetype cardinality p ranging from 3 to 20. Here, the permutation $\pi_t^p$ ranks in increasing order the values of $S(\cdot,t), \forall t$, noting that S is left stochastic, that is, $\sum_{j=1}^{p} S(j,t) = 1, \forall t$.
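Equation (11) can be evaluated column by column after sorting each record in ascending order. The sketch below uses a hypothetical function name and keeps the normalization by the column sum so that it also works when S is not exactly stochastic.

```python
import numpy as np

def gini_coefficient(S):
    """Gamma_p(t) of Eq. (11) for each column of a p x t non-negative matrix S."""
    p = S.shape[0]
    S_sorted = np.sort(S, axis=0)                  # ascending rank per record
    w = (p - np.arange(1, p + 1) + 0.5) / p        # weights (p - i + 1/2) / p
    return 1.0 - 2.0 * (w[:, None] * S_sorted).sum(axis=0) / S.sum(axis=0)

# Same three records as in the discrimination-score illustration (p = 4).
S = np.array([[1.0, 0.25, 0.7],
              [0.0, 0.25, 0.1],
              [0.0, 0.25, 0.1],
              [0.0, 0.25, 0.1]])
gamma = gini_coefficient(S)
```

Working through the formula, a fully blended record gives $\Gamma_p = 0$, while a record expressed by a single archetype gives $\Gamma_p = 1 - 1/p$, so the upper bound depends on the cardinality, unlike $\Delta_p$.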
To complement the previous approach based on a univariate discrimination score, $\Delta_p(t)$ or $\Gamma_p(t)$, based on the “winning” archetype for time record t, the whole archetype ranking in $S_{p\times t}$ can be exploited. The AA representation can be viewed as a form of discretization or categorization of the dynamics where, for a given archetype cardinality p, the time evolution can be represented by a series of permutations $\pi_t^p$,
$$\pi_t^p = \begin{Bmatrix} 1 & \cdots & p \\ \pi_t^p(1) & \cdots & \pi_t^p(p) \end{Bmatrix}.$$

Transitions from one record S(⋅, t) to the next S(⋅, t + 1) can now be analyzed from the changes from $\pi_t^p$ to $\pi_{t+1}^p$, where $\pi_t^p$ corresponds to the permutations needed to rank in decreasing order the probabilities recorded in $S_{p\times t}$. The time mean of the AA stochastic matrix S, $\bar{S}(i) = [\sum_{j=1}^{t} S(i,j)]/t$ with i = 1, …, p, ranked in decreasing order of $\bar{S}(i)$ corresponds here to the identity or the reference permutation and is originally used to numerically label the set of archetypes by Mørup and Hansen (2012). A suitable distance function or metric over the set of permutations of p elements could be chosen (Fligner and Verducci 1986) not only to track the changes in $\pi_t^p$, but also to compare clustering methods as in Meilă (2007), for example. We intend to pursue this line of enquiry in future work with a focus on onset, persistence, and decay of archetypal states.
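To make the permutation view concrete, the sketch below ranks the archetypes of each record in decreasing order of probability and measures the change between two rankings with the Kendall tau distance (the number of archetype pairs whose relative order differs). The Kendall tau distance is our illustrative choice of metric, not one prescribed in the text, and the function names are hypothetical.

```python
import numpy as np
from itertools import combinations

def rank_permutations(S):
    """Row t of the result lists archetype indices by decreasing S(., t)."""
    return np.argsort(-S, axis=0, kind="stable").T     # t x p

def kendall_tau_distance(a, b):
    """Number of archetype pairs ordered differently by permutations a and b."""
    pos_a = np.argsort(a)      # pos_a[k] = rank position of archetype k in a
    pos_b = np.argsort(b)
    return sum(
        (pos_a[i] - pos_a[j]) * (pos_b[i] - pos_b[j]) < 0
        for i, j in combinations(range(len(a)), 2)
    )

# Two consecutive records for p = 3 archetypes.
S = np.array([[0.1, 0.5],
              [0.6, 0.2],
              [0.3, 0.3]])
perms = rank_permutations(S)                 # rows: [1, 2, 0] and [0, 2, 1]
d = kendall_tau_distance(perms[0], perms[1])
```

Here the second ranking is the exact reverse of the first, so the distance reaches its maximum of p(p − 1)/2 = 3; applied to consecutive records of a real S matrix, small distances would indicate persistence and large ones a transition.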

As an alternative to postprocessing results obtained from the direct application of AA, one could imagine AA to be applied to datasets where serial correlation has been explicitly modeled. Horenko (2009, 2010a,b,c), O’Kane et al. (2013), Risbey et al. (2015), Franzke et al. (2015), O’Kane et al. (2016), Yu et al. (2016), O’Kane et al. (2017), Gerber et al. (2020), and Quinn et al. (2021) extend matrix factorization techniques to time series predictions, where lags of the dataset under investigation are included in vector autoregressive or dynamic linear models. These methods have to be combined with an appropriate level of regularization as the number of free parameters typically increases quadratically with the spatial, latent or retained¹⁰ dimensions s multiplied by the number of lags L, $O(Ls^2)$. Additional levels of regularization are often further imposed on solutions to handle ill-conditioning and overfitting plaguing problems of these types for high-dimensional datasets.

A simpler approach would be to apply time-embedding (Takens 1981) to construct an augmented data matrix $\Xi = \Xi_{(m+1)s\times(t-m)} = [X(\cdot, m+1:t); \ldots; X(\cdot, 1:t-m)]$ by stacking lagged versions of the original data matrix $X_{s\times t}$, where the embedding dimension m is the maximum number of selected lags. Dimension reduction techniques could be applied to the augmented data matrix $\Xi_{(m+1)s\times(t-m)}$ to further reduce the spatial dimensions as in RSAA implemented in section 4.
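The stacking of lagged copies can be written in a few lines. The sketch below, with a hypothetical function name, follows the block ordering of the expression above, translating its 1-based column ranges into Python slices.

```python
import numpy as np

def time_embed(X, m):
    """Stack m+1 lagged copies of X (s x t) into Xi of shape ((m+1)s, t-m):
    Xi = [X(:, m+1:t); ...; X(:, 1:t-m)] in the 1-based notation of the text."""
    s, t = X.shape
    blocks = [X[:, m - lag : t - lag] for lag in range(m + 1)]
    return np.vstack(blocks)

X = np.arange(10.0).reshape(1, 10)     # one grid point, ten time records
Xi = time_embed(X, m=2)                # shape (3, 8)
```

Each column of `Xi` then holds a record together with its m predecessors, so that AA applied to `Xi` (after dimension reduction if needed) sees short temporal sequences rather than isolated snapshots.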

d. Multivariate RSAA

The time-embedding construction followed by RSAA described previously is a special case of a more general method where a combined EOF analysis is employed (O’Kane et al. 2017; Hannachi 2021) followed by RSAA on the augmented data matrix $\Xi_{(s_1+\cdots+s_m)\times t} = [X_1; \ldots; X_m]$. The matrix Ξ is constructed by stacking m geophysical fields $X_i = X^{i}_{s_i\times t}$ with i = 1, …, m defined over an identical temporal domain. Each individual field $X_i$ needs to be suitably scaled so as not to favor any particular field in the PCA dimension reduction step, as the $X_i$ may have wildly different scales and physical units. A conventional AA is then performed on the retained scaled principal components $\Lambda V^{T}_{r\times t}$, which now capture some of the variance common to all $X_i$. One notes that $\Xi_{(s_1+\cdots+s_m)\times t} = U_{(s_1+\cdots+s_m)\times r}\Lambda_{r\times r}V^{T}_{r\times t}$ for $r \le \min(s_1+\cdots+s_m, t)$ retained dimensions and the resulting archetype field patterns can be simply recovered as $U_{(s_1+\cdots+s_m)\times r}\Lambda_{r\times r}V^{T}_{r\times t}C_{t\times p}$ for a cardinality equal to p.
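The combined construction can be sketched as follows. Each field is scaled here by its Frobenius norm so that all fields carry the same total variance; this particular scaling is an assumption made for illustration (the text only requires a "suitable" scaling), as are the function name and the synthetic fields.

```python
import numpy as np

def combined_scaled_pcs(fields, r):
    """Stack scaled fields (each s_i x t), then return U (sum(s_i) x r)
    and the scaled principal components Lambda V^T (r x t)."""
    scaled = []
    for F in fields:
        A = F - F.mean(axis=1, keepdims=True)      # anomalies in time
        scaled.append(A / np.linalg.norm(A))       # assumed scaling: unit total variance
    Xi = np.vstack(scaled)
    U, lam, Vt = np.linalg.svd(Xi, full_matrices=False)
    return U[:, :r], lam[:r, None] * Vt[:r]

rng = np.random.default_rng(4)
f1 = 300.0 + 5.0 * rng.standard_normal((6, 20))    # e.g., a temperature-like field
f2 = 1.0e4 * rng.standard_normal((4, 20))          # a field with much larger units
U, pcs = combined_scaled_pcs([f1, f2], r=3)

# Given a t x p matrix C from AA on pcs, patterns follow as U @ pcs @ C.
C = np.zeros((20, 2)); C[[3, 11], [0, 1]] = 1.0     # placeholder C, not an AA solution
patterns = U @ pcs @ C                             # (s_1 + s_2) x p
```

The rows of `patterns` can be split back into the individual fields using the sizes `s_i`, after undoing the scaling applied to each field.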

6. Application of archetypal analysis

In section 5, we illustrate in Fig. 1c how AA identifies extreme SST conditions in the reduced space spanned by the first three scaled PCs of SSTA and we describe the impact of detrending the dataset to separate extremes in the interannual variability from those driven by anthropogenic forcing. Hereafter, we present an application of AA to characterize ENSO, building on the example of Hannachi and Trendafilov (2017). For a choice of four archetypes applied to global SST anomaly fields, the resulting archetypes display patterns indicative of the four ENSO types: the classical eastern Pacific-type El Niño and La Niña (Rasmusson and Carpenter 1982) and the central Pacific-type [coined “Modoki” by Ashok et al. (2007)] El Niño and La Niña (e.g., Fu et al. 1986; Wang 1995; Trenberth and Smith 2006; Kao and Yu 2009; Cai et al. 2009). These patterns are shown in column 1 in Fig. 10.

Fig. 10.

AA composite results using detrended monthly SST anomalies over 1982–2020 for a selected archetype number of 4. (left) The resulting spatial patterns of SST anomalies constructed with the S matrix values, followed by (center) 300-hPa zonal wind component (shading) with superimposed 500-hPa geopotential height anomalies (green contours) and thermal wind anomalies components (vectors) and, by (right) monthly averaged daily maximum surface wind speed anomalies (shading) with superimposed velocity potential difference anomalies between 150- and 850-hPa levels (green contours) and wave activity flux anomalies components at 200 hPa (vectors).


The time series of the archetype coefficients, C or S, can be used as indices of the ENSO types, and to form composites of other flow fields corresponding to each archetype. Lagged compositing could be used to investigate the state of the ocean or atmosphere in a certain period leading up to, or caused by, extreme conditions, though composite results reported in this paper correspond to lag zero. There are a multitude of compositing techniques one might employ using either of the C or S matrix time series. We have already discussed two methods for building (time) mean composites of any given field F based on AA in section 4, where the stochastic matrices C and S are employed to derive composites by simple matrix multiplications where all individual entries of C and S are taken into account. However, the discrimination score $\Delta_p(t)$ [Eq. (10)] or the Gini coefficient $\Gamma_p(t)$ [Eq. (11)] can also be used to exclude time records from the stochastic matrices, C and S, upon which the composites of F are formed, if these entries are below a given probability threshold informed by the distribution of probabilities across time records of any given archetype. Furthermore, as applied by Risbey et al. (2021) to isolate long-lasting atmospheric blocking events, a persistence criterion combined with a discrimination score or Gini-coefficient-based “thresholding” can be applied to only retain time records to composite upon when a given archetype 1) has a high probability of being expressed and 2) persists over several time records.
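An S-based composite with optional thresholding can be sketched as below. The weights are the rows of S normalized to sum to 1 over time, as in the definition of $\tilde{S}$ used for the archetype patterns, and records whose discrimination score falls below a threshold are given zero weight. The function name, the thresholding rule, and the toy data are assumptions for illustration, and no persistence criterion is applied.

```python
import numpy as np

def s_composites(F, S, delta=None, threshold=0.0):
    """Composite a field F (n x t) onto p archetypes using row-normalized
    weights from S (p x t). Records with delta < threshold are excluded."""
    W = np.array(S, dtype=float)
    if delta is not None:
        W[:, np.asarray(delta) < threshold] = 0.0   # drop weakly discriminated records
    row_sums = W.sum(axis=1, keepdims=True)
    # Normalize each archetype's weights over time; empty rows stay at zero.
    W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    return F @ W.T                                   # n x p composite patterns

F = np.array([[1.0, 3.0, 5.0, 7.0]])                 # one grid point, four records
S = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
comp_all = s_composites(F, S)                        # [[2., 6.]]
comp_thr = s_composites(F, S, delta=[1.0, 0.0, 1.0, 1.0], threshold=0.5)
```

In the thresholded case the second record no longer contributes to the first archetype, so its composite reduces to the single remaining record.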

In the following examples, we formed composites based on the S matrix as explained in section 4. Columns 2 and 3 in Fig. 10 provide two sets of atmospheric diagnostic composites to elucidate teleconnection patterns associated with the classical (archetypes 3 and 4) and Modoki (archetypes 1 and 2) ENSO patterns.

Column 2 in Fig. 10 illustrates the anomalous 300-hPa zonal wind component, 500-hPa geopotential height anomalies, and thermal wind anomaly vectors corresponding to the spatial pattern of SST anomaly for each ENSO archetype to the left. Several key features match with previously identified ENSO behavior. Conventional understanding of a Northern Hemisphere wintertime La Niña episode includes a strong westerly Pacific jet stream that splits around a well-developed North Pacific high pressure system (Alexander et al. 2002; Newman et al. 2016; Christensen et al. 2017; Capotondi et al. 2020). Both classical and Modoki La Niña archetypes (Fig. 10, rows 2 and 3) follow this behavior, though the classical La Niña pattern has a more coherent strengthening of the subtropical jet over Asia into the North Pacific. This strengthening of the jet occurs in association with, or in response to, the enhanced thermal wind induced by the warm and cold SST anomalies across the North Pacific Ocean in this pattern. In the Southern Hemisphere the main response for La Niña is in the polar jet stream. For La Niña Modoki (Fig. 10, row 2) there is an almost circumglobal change from warm SST anomalies to cold SST anomalies around 50°S latitude, which results in westerly thermal wind anomalies and an enhanced polar jet stream. The stronger polar jet stream is associated with lower geopotential height poleward of the jet, indicating enhanced storminess at high latitudes. At lower latitudes in the South Pacific, the subtropical jet stream is weakened for La Niña Modoki by the strong easterly thermal wind anomaly.

The Modoki El Niño in the top row has a strong anticyclone at 500 hPa in geopotential height over the Gulf of Alaska with evidence of an in situ response to thermal wind anomalies, consistent with findings by Kao and Yu (2009) that indicated Modoki ENSO tends to favor in situ development forced primarily by the atmosphere. In the Southern Hemisphere, both El Niño types (rows 1 and 4) feature SST gradients that enhance the thermal wind in the vicinity of the subtropical jet in the Pacific, though this is much stronger for classical El Niño (row 4). The geopotential height anomalies for both El Niño types feature a ridge and trough about South America reminiscent of the Pacific–South America pattern.

A different set of atmospheric diagnostic composites for the same archetype patterns is provided in column 3 of Fig. 10. Here, monthly averaged maximum daily wind speed anomalies at the surface are shaded and superimposed with contours of velocity potential difference anomalies between 150 and 850 hPa as in Adames and Wallace (2014), as well as vectors of anomalous wave activity flux (WAF) at 200 hPa (Takaya and Nakamura 2001). In line with recent studies (Liang et al. 2021; Chen et al. 2015), both flavors of El Niño are associated with a lessening of the easterly trade winds. Conversely, both flavors of La Niña correspond to a strengthening of the easterly trade winds. The well-developed North Pacific block in the Modoki La Niña (Fig. 10, row 2) is supported by strong WAF activity from the tropics into the North Pacific.

The composite behavior of WAF in the Southern Hemisphere may prompt questions about the influence of ENSO on the South Pacific convergence zone (SPCZ), which has been described as a “graveyard for fronts” (Trenberth 1976) and more recently associated with Rossby wave breaking (Matthews 2011). In each of the ENSO archetype composites there is a flux of wave activity in the Southern Hemisphere polar waveguide, which tends to move equatorward across the Australian continent and into the SPCZ region, consistent with the analysis of Matthews (2011). However, there are variations in this picture from case to case as indicated by the set of longitudes where the flux out of the waveguide moves equatorward. For El Niño Modoki (Fig. 10, row 1) and classical La Niña (Fig. 10, row 3) the strong equatorward flux is in the Indian Ocean and Australian continent region. For classical El Niño (Fig. 10, row 4) and La Niña Modoki (Fig. 10, row 2) the equatorward flux is over the Australian continent and there is convergence of wave activity flux in the South Pacific region. In these two cases the impact on the SPCZ seems stronger as indicated by the more coherent northwest–southeast-oriented anomaly in maximum surface wind for these composites.

The strongest anomaly composites are found in the tropical to subtropical bands for classical El Niño (Fig. 10, row 4) and Modoki La Niña (Fig. 10, row 2) in both surface and atmospheric fields aloft. Monthly averages of maximum daily surface wind speed, velocity potential difference, Δ150 − 850, and thermal wind anomaly composites show a clear correspondence to ENSO phases. For classical El Niños there is a slackening of the surface trade winds in the central Pacific, enhanced/reduced convection activity in the eastern/western Pacific, and symmetrically diverging thermal wind anomalies from the equatorial region. The opposite conditions are observed for Modoki La Niñas with a reinforcement of the trade winds in the western to central Pacific, a reduced/enhanced convection activity in the eastern/western Pacific and symmetrically convergent northeasterly and southeasterly thermal wind anomalies toward the equator.

One notable feature of the composite teleconnection patterns associated with the four ENSO types is that they are not particularly symmetric. That is, the teleconnections for classical versus Modoki forms are different for the same ENSO type, just as the El Niño and La Niña forms are different. In some cases, the teleconnection for the Modoki form more closely resembles that for the opposite ENSO type than it does the classical equivalent. This is understandable in that the teleconnections form in response to the global SST patterns for each type, which can be very different outside the tropics. More local gradients in SST in the archetype patterns can drive thermal wind responses that modify the jets and dynamical response in a region.

7. Summary and conclusions

This paper has demonstrated the utility of the AA method and the benefits that arise particularly when analyzing geophysical data. A derivation of RSAA is first provided as the foundation for working with large datasets that require an initial dimensionality reduction step to increase computational efficiency. Using a prototype dataset of monthly SST anomalies between 1982 and 2020, we have shown how outliers around a broadly ellipsoid-shaped distribution may be readily identifiable as archetypes of the data. These spatial archetype patterns resemble anomalies of SST associated with ENSO. If trends are of interest, the nondetrended data yield archetypes that show gradual warming from an initially cold pattern to a warmer one. Detrending the data prior to AA may remove a global warming pattern and instead reveal different flavors of ENSO, like the central Pacific (aka Modoki) versus classical eastern Pacific ENSO. As the number of archetypes increases from 4 to 8, the spatial patterns increase in diversity while still retaining familiar ENSO patterns. The Gini coefficient is introduced as a tool to inform on the number of principal components to be retained in the analysis based on the total variance explained by RSAA and conditioned on archetype cardinality. The Gini coefficient can also be used as a univariate discrimination score to identify extreme conditions and their persistence. Last, a useful application of AA is presented to show that compositing around the time series of the AA matrix factors reveals familiar atmospheric teleconnection patterns associated with extreme SST anomaly patterns.

We show that the decisions made to implement AA can greatly affect the interpretation of results. There is a priori no guarantee that solutions exist for the minimization problem for any given task, or that the solutions found will be meaningful. Results of AA should therefore be considered carefully through the lens of each individual decision made. Chosen methods should be documented thoroughly in all work involving AA to encourage reproducibility and understanding.

Acknowledgments.

This research was supported by the Multiyear Climate Project at the Commonwealth Scientific and Industrial Research Organisation, Oceans and Atmosphere. Bernadette Sloyan and Christopher Chapman were also funded by The Centre for Southern Hemisphere Oceans Research, Hobart, Tasmania, Australia. Abdelwaheb Hannachi and Nikolay Trendafilov received no external support for this work. We thank Richard Matear at the Commonwealth Scientific and Industrial Research Organisation, Oceans and Atmosphere, for his constant support and encouragement. We are also grateful to the reviewers whose insightful comments led to substantial improvement of the paper.

Data availability statement.

SST data are from the OISST v2.1 high-resolution dataset provided by the NOAA/OAR/ESRL Physical Sciences Laboratory. These data are available at https://psl.noaa.gov/. The atmospheric reanalysis data used to relate extreme events to large-scale climate modes come from the JRA-55 project carried out by the Japan Meteorological Agency (JMA). JRA-55 data are available at https://jra.kishou.go.jp.

Footnotes

1

A left-stochastic matrix is a non-negative matrix with each column summing to 1.

2

In geometry, the convex hull of a set S of points sampled from r-dimensional Euclidean space is the smallest convex r-polytope that encloses the entire dataset and whose vertices are points of S.

3

The cardinality of a set corresponds to the number of elements in the set.

5

The square root √(λ1/λ2) of the ratio of the eigenvalues of the covariance matrix Σ, with λ1 = 1.8 and λ2 = 0.2.

6

The non-negative orthant is the generalization of the first quadrant in two dimensions to n dimensions.

7

Qualifies nondeterministic polynomial acceptable problems with reference to the computing time needed to find their “near optimal” solutions.

8

Abrol and Sharma (2020) recently used the Gini sparsity measure to develop a computationally efficient greedy AA (GAA) algorithm, in which the measure is used within the AA optimization procedure to update the sparse stochastic matrix C.

10

For example, if a dimension reduction step has been implemented.

APPENDIX

Available AA Packages

A number of open-source AA packages are available online. They have been implemented for most major computing languages in use today, such as MATLAB, Python, and R. The reader is referred to Table A1 for a nonexhaustive selection of package URLs. Throughout the paper, we have exclusively used the AA package developed by Mørup and Hansen (2012), after trialing several of the implementations listed in Table A1. We have found the pure MATLAB script “PCHA” extremely robust and easy to modify for our purpose. The computation speed of PCHA is comparable to that of the MATLAB version of the Sparse Modeling Software (SPAMS), version 2.6, of Mairal (2014, 2017), and the two yield near-identical solutions for the stochastic matrices C and S, with sums of squared differences of order O(10⁻⁴) for monthly SSTA and an archetype cardinality of 6. We note, however, that the Python pymf-class packages may not be as robust for high-dimensional datasets because of an internal issue with the optimization routine implemented therein. This issue was also reported by Chen et al. (2014). Finally, we have also tested the elegant AA implementation of Hannachi and Trendafilov (2017) and Trendafilov and Gallo (2021), which uses the optimization-on-manifolds package “Manopt” developed by Boumal et al. (2014), and have found it computationally efficient and twice as fast as PCHA for the “Trust Region” optimization solver. All AA optimizations were performed on an Intel Core i7-10875H CPU at 2.30 GHz laptop for small problems, on a dual Intel Xeon Gold 6132 CPU at 2.60 GHz desktop for intermediate problems, and on a dual Intel Xeon CPU E5-2697A v4 at 2.60 GHz blade server for large problems.
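A comparison of the stochastic matrices returned by two AA implementations, such as the one described above, could be sketched as follows in Python. This is only an illustration under stated assumptions: S is assumed to be stored with one row per archetype and one column per time step, the archetype ordering is assumed to be arbitrary between packages (so rows are aligned before differencing), and the function names and the small matrices are hypothetical. It is not the comparison script used for the paper.

```python
from itertools import permutations


def is_left_stochastic(matrix, tol=1e-9):
    """True if all entries are non-negative and every column sums to 1."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    if any(v < -tol for row in matrix for v in row):
        return False
    return all(
        abs(sum(matrix[i][j] for i in range(n_rows)) - 1.0) <= tol
        for j in range(n_cols)
    )


def sum_sq_diff(a, b):
    """Sum of squared element-wise differences between two equal-shaped matrices."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def aligned_sum_sq_diff(s_ref, s_other):
    """Smallest sum of squared differences over all row orderings of s_other.

    Brute force over p! orderings, which is affordable for the small
    archetype cardinalities (4 to 8) considered here.
    """
    p = len(s_ref)
    return min(
        sum_sq_diff(s_ref, [s_other[k] for k in perm])
        for perm in permutations(range(p))
    )


# Hypothetical S matrices (3 archetypes, 3 time steps) from two packages;
# s_b holds the same weights as s_a with the archetypes in a different order.
s_a = [[0.7, 0.1, 0.2],
       [0.2, 0.8, 0.3],
       [0.1, 0.1, 0.5]]
s_b = [s_a[2], s_a[0], s_a[1]]

print(is_left_stochastic(s_a), is_left_stochastic(s_b))
print(sum_sq_diff(s_a, s_b))          # large: archetypes not yet aligned
print(aligned_sum_sq_diff(s_a, s_b))  # zero once the ordering is matched
```

Without the alignment step, two otherwise identical solutions can appear to differ substantially simply because the archetypes are returned in a different order.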

REFERENCES

  • Abrol, V., and P. Sharma, 2020: A geometric approach to archetypal analysis via sparse projections. Proc. 37th Int. Conf. on Machine Learning, Online, ICML, 42–51, http://proceedings.mlr.press/v119/abrol20a/abrol20a.pdf.

  • Adames, A. F., and J. M. Wallace, 2014: Three-dimensional structure and evolution of the MJO and its relation to the mean flow. J. Atmos. Sci., 71, 2007–2026, https://doi.org/10.1175/JAS-D-13-0254.1.

  • Alexander, M. A., I. Bladé, M. Newman, J. R. Lanzante, N.-C. Lau, and J. D. Scott, 2002: The atmospheric bridge: The influence of ENSO teleconnections on air–sea interaction over the global oceans. J. Climate, 15, 2205–2231, https://doi.org/10.1175/1520-0442(2002)015<2205:TABTIO>2.0.CO;2.

  • Aloise, D., A. Deshpande, P. Hansen, and P. Popat, 2009: NP-hardness of Euclidean sum-of-squares clustering. Mach. Learn., 75, 245–248, https://doi.org/10.1007/s10994-009-5103-0.

  • Ashok, K., S. K. Behera, S. A. Rao, H. Weng, and T. Yamagata, 2007: El Niño Modoki and its possible teleconnection. J. Geophys. Res., 112, C11007, https://doi.org/10.1029/2006JC003798.

  • Bauckhage, C., 2014: A note on archetypal analysis and the approximation of convex hulls. arXiv, 1410.0642, https://doi.org/10.48550/arXiv.1410.0642.

  • Bauckhage, C., and C. Thurau, 2009: Making archetypal analysis practical. Pattern Recognition, J. Denzler, G. Notni, and H. Süße, Eds., Lecture Notes in Computer Science, Vol. 5748, Springer, 272–281.

  • Bauckhage, C., and K. Manshaei, 2014: Kernel archetypal analysis for clustering web search frequency time series. 22nd Int. Conf. on Pattern Recognition, Stockholm, Sweden, IEEE, 1544–1549.

  • Boumal, N., B. Mishra, P.-A. Absil, and R. Sepulchre, 2014: Manopt, a Matlab toolbox for optimization on manifolds. J. Mach. Learn. Res., 15, 1455–1459.

  • Boyd, S. P., and L. Vandenberghe, 2004: Convex Optimization. 1st ed. Cambridge University Press, 727 pp.

  • Cai, W., and T. Cowan, 2009: La Niña Modoki impacts Australia autumn rainfall variability. Geophys. Res. Lett., 36, L12805, https://doi.org/10.1029/2009GL037885.

  • Cai, W., T. Cowan, and A. Sullivan, 2009: Recent unprecedented skewness towards positive Indian Ocean Dipole occurrences and its impact on Australian rainfall. Geophys. Res. Lett., 36, L11705, https://doi.org/10.1029/2009GL037604.

  • Capotondi, A., and P. D. Sardeshmukh, 2015: Optimal precursors of different types of ENSO events. Geophys. Res. Lett., 42, 9952–9960, https://doi.org/10.1002/2015GL066171.

  • Capotondi, A., and P. D. Sardeshmukh, 2017: Is El Niño really changing? Geophys. Res. Lett., 44, 8548–8556, https://doi.org/10.1002/2017GL074515.

  • Capotondi, A., A. T. Wittenberg, J.-S. Kug, K. Takahashi, and M. J. McPhaden, 2020: ENSO Diversity. El Niño Southern Oscillation in a Changing Climate, Geophys. Monogr., Vol. 253, Amer. Geophys. Union, 65–86, https://doi.org/10.1002/9781119548164.ch4.

  • Chen, D., and Coauthors, 2015: Strong influence of westerly wind bursts on El Niño diversity. Nat. Geosci., 8, 339–345, https://doi.org/10.1038/ngeo2399.

  • Chen, Y., J. Mairal, and Z. Harchaoui, 2014: Fast and robust archetypal analysis for representation learning. Conf. on Computer Vision and Pattern Recognition, Columbus, OH, IEEE, 1478–1485.

  • Christiansen, B., 2007: Atmospheric circulation regimes: Can cluster analysis provide the number? J. Climate, 20, 2229–2250, https://doi.org/10.1175/JCLI4107.1.

  • Christensen, H. M., J. Berner, D. R. B. Coleman, and T. N. Palmer, 2017: Stochastic parameterization and El Niño–Southern Oscillation. J. Climate, 30, 17–38, https://doi.org/10.1175/JCLI-D-16-0122.1.

  • Cichocki, A., R. Zdunek, A. H. Phan, and S.-I. Amari, 2009: Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and Blind Source Separation. 1st ed. Wiley Publishing, 504 pp.

  • Cutler, A., and L. Breiman, 1994: Archetypal analysis. Technometrics, 36, 338–347, https://doi.org/10.1080/00401706.1994.10485840.

  • Elad, M., 2010: Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. 1st ed. Springer, 376 pp.

  • Elden, L., 2019: Matrix Methods in Data Mining and Pattern Recognition. 2nd ed. Society for Industrial and Applied Mathematics, 229 pp.

  • Esposito, F., 2021: A review on initialization methods for nonnegative matrix factorization: Towards omics data experiments. Mathematics, 9, 1006, https://doi.org/10.3390/math9091006.

  • Eugster, M. J. A., and F. Leisch, 2011: Weighted and robust archetypal analysis. Comput. Stat. Data Anal., 55, 1215–1225, https://doi.org/10.1016/j.csda.2010.10.017.

  • Fligner, M. A., and J. S. Verducci, 1986: Distance based ranking models. J. Roy. Stat. Soc., 48B, 359–369, https://doi.org/10.1111/j.2517-6161.1986.tb01420.x.

  • Franzke, C. L. E., T. J. O’Kane, D. P. Monselesan, J. S. Risbey, and I. Horenko, 2015: Systematic attribution of observed Southern Hemisphere circulation trends to external forcing and internal variability. Nonlinear Processes Geophys., 22, 513–525, https://doi.org/10.5194/npg-22-513-2015.

  • Fu, C., H. F. Diaz, and J. O. Fletcher, 1986: Characteristics of the response of sea surface temperature in the central Pacific associated with warm episodes of the Southern Oscillation. Mon. Wea. Rev., 114, 1716–1739, https://doi.org/10.1175/1520-0493(1986)114<1716:COTROS>2.0.CO;2.

  • Gan, G., C. Ma, and J. Wu, 2020: Data Clustering: Theory, Algorithms and Applications. 2nd ed. SIAM Press, 406 pp.

  • Gerber, S., L. Pospisil, M. Navandar, and I. Horenko, 2020: Low-cost scalable discretization, prediction, and feature selection for complex systems. Sci. Adv., 6, eaaw0961, https://doi.org/10.1126/sciadv.aaw0961.

  • Gillis, N., 2020: Nonnegative Matrix Factorization. Society for Industrial and Applied Mathematics, 350 pp.

  • Gini, C., 1921: Measurement of inequality of incomes. Econ. J., 31, 124–126, https://doi.org/10.2307/2223319.

  • Han, R., B. Osting, D. Wang, and Y. Xu, 2022: Probabilistic methods for approximate archetypal analysis. Inf. Inference J. IMA, 2022, iaac008, https://doi.org/10.1093/imaiai/iaac008.

  • Hannachi, A., 2021: Further topics. Patterns Identification and Data Mining in Weather and Climate, A. Hannachi, Ed., Springer, 367–413.

  • Hannachi, A., and N. Trendafilov, 2017: Archetypal analysis: Mining weather and climate extremes. J. Climate, 30, 6927–6944, https://doi.org/10.1175/JCLI-D-16-0798.1.

  • Hasselmann, K., 1988: PIPs and POPs: The reduction of complex dynamical systems using principal interaction and oscillation patterns. J. Geophys. Res., 93, 11 015–11 021, https://doi.org/10.1029/JD093iD09p11015.

  • Horenko, I., 2009: On robust estimation of low-frequency variability trends in discrete Markovian sequences of atmospheric circulation patterns. J. Atmos. Sci., 66, 2059–2072, https://doi.org/10.1175/2008JAS2959.1.

  • Horenko, I., 2010a: Finite element approach to clustering of multidimensional time series. SIAM J. Sci. Comput., 32, 62–83, https://doi.org/10.1137/080715962.

  • Horenko, I., 2010b: On clustering of non-stationary meteorological time series. Dyn. Atmos. Oceans, 49, 164–187, https://doi.org/10.1016/j.dynatmoce.2009.04.003.

  • Horenko, I., 2010c: On the identification of nonstationary factor models and their application to atmospheric data analysis. J. Atmos. Sci., 67, 1559–1574, https://doi.org/10.1175/2010JAS3271.1.

  • Hotelling, H., 1933: Analysis of a complex of statistical variables into principal components. J. Educ. Psychol., 24, 417–441, https://doi.org/10.1037/h0071325.

  • Hurley, N., and S. Rickard, 2009: Comparing measures of sparsity. IEEE Trans. Inf. Theory, 55, 4723–4741, https://doi.org/10.1109/TIT.2009.2027527.

  • IPCC, 2013: Climate Change 2013: The Physical Science Basis. Cambridge University Press, 1535 pp., https://doi.org/10.1017/CBO9781107415324.

  • IPCC, 2019: The Ocean and Cryosphere in a Changing Climate. H.-O. Pörtner et al., Eds., Cambridge University Press, 766 pp., https://www.ipcc.ch/site/assets/uploads/sites/3/2022/03/SROCC_FullReport_FINAL.pdf.

  • Izenman, A. J., 2008: Linear dimensionality reduction. Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning, 1st ed. A. J. Izenman, Ed., Springer Texts in Statistics, Springer, 195–236.

  • Jolliffe, I. T., 1986: Principal Component Analysis. Springer Verlag, 271 pp.

  • Jolliffe, I. T., and J. Cadima, 2016: Principal component analysis: A review and recent developments. Philos. Trans. Roy. Soc., 374A, 20150202, https://doi.org/10.1098/rsta.2015.0202.

  • Kao, H.-Y., and J.-Y. Yu, 2009: Contrasting eastern-Pacific and central-Pacific types of ENSO. J. Climate, 22, 615–632, https://doi.org/10.1175/2008JCLI2309.1.

  • Keller, S. M., M. Samarin, F. Arend Torres, M. Wieser, and V. Roth, 2021: Learning extremal representations with deep archetypal analysis. Int. J. Comput. Vis., 129, 805–820, https://doi.org/10.1007/s11263-020-01390-3.

  • Kobayashi, S., and Coauthors, 2015: The JRA-55 reanalysis: General specifications and basic characteristics. J. Meteor. Soc. Japan, 93, 5–48, https://doi.org/10.2151/jmsj.2015-001.

  • Komarov, O., 2021: okomarov/ginicoeff. GitHub, accessed 10 November 2021, https://github.com/okomarov/ginicoeff.

  • Liang, Y., A. V. Fedorov, and P. Haertel, 2021: Intensification of westerly wind bursts caused by the coupling of the Madden-Julian oscillation to SST during El Niño onset and development. Geophys. Res. Lett., 48, e2020GL089395, https://doi.org/10.1029/2020GL089395.

  • Lorenz, E., 1956: Empirical orthogonal functions and statistical weather prediction. MIT Department of Meteorology Statistical Forecasting Project Scientific Rep. 1, 49 pp., https://eapsweb.mit.edu/sites/default/files/Empirical_Orthogonal_Functions_1956.pdf.

  • Mair, S., and U. Brefeld, 2019: Coresets for archetypal analysis. 9 pp., https://papers.nips.cc/paper/2019/file/7f278ad602c7f47aa76d1bfc90f20263-Paper.pdf.

  • Mair, S., A. Boubekki, and U. Brefeld, 2017: Frame-based data factorizations. Int. Conf. on Machine Learning, Sydney, New South Wales, Australia, ICML, 2305–2313, http://proceedings.mlr.press/v70/mair17a/mair17a.pdf.

  • Mairal, J., 2014: Sparse modeling for image and vision processing. Found. Trends Comput. Graph. Vis., 8, 85–283, https://doi.org/10.1561/0600000058.

  • Mairal, J., 2017: SPAMS: A SPArse Modeling Software, v 2.6. http://thoth.inrialpes.fr/people/mairal/spams/doc/html/index.html.

  • Matthews, A. J., 2011: A multiscale framework for the origin and variability of the South Pacific convergence zone. Quart. J. Roy. Meteor. Soc., 138, 1165–1178, https://doi.org/10.1002/qj.1870.

  • Meilă, M., 2007: Comparing clusterings—An information based distance. J. Multivar. Anal., 98, 873–895, https://doi.org/10.1016/j.jmva.2006.11.013.

  • Mo, K. C., and M. Ghil, 1987: Statistics and dynamics of persistent anomalies. J. Atmos. Sci., 44, 877–902, https://doi.org/10.1175/1520-0469(1987)044<0877:SADOPA>2.0.CO;2.

  • Monahan, A. H., J. C. Fyfe, M. H. Ambaum, D. B. Stephenson, and G. R. North, 2009: Empirical orthogonal functions: The medium is the message. J. Climate, 22, 6501–6514, https://doi.org/10.1175/2009JCLI3062.1.

  • Mørup, M., and L. K. Hansen, 2012: Archetypal analysis for machine learning and data mining. Neurocomputing, 80, 54–63, https://doi.org/10.1016/j.neucom.2011.06.033.

  • Motevalli-Soumehsaraei, B., and A. Barnard, 2019: Archetypal analysis package, version 1. CSIRO, https://doi.org/10.25919/5d3958889f7ff.

  • Newman, M., and Coauthors, 2016: The Pacific decadal oscillation, revisited. J. Climate, 29, 4399–4427, https://doi.org/10.1175/JCLI-D-15-0508.1.

  • Nguyen, L. H., and S. Holmes, 2019: Ten quick tips for effective dimensionality reduction. PLOS, 15, e1006907, https://doi.org/10.1371/journal.pcbi.1006907.

  • North, G. R., 1984: Empirical orthogonal functions and normal modes. J. Atmos. Sci., 41, 879–887, https://doi.org/10.1175/1520-0469(1984)041<0879:EOFANM>2.0.CO;2.

  • O’Kane, T. J., J. S. Risbey, C. Franzke, I. Horenko, and D. P. Monselesan, 2013: Changes in the metastability of the midlatitude Southern Hemisphere circulation and the utility of nonstationary cluster analysis and split-flow blocking indices as diagnostic tools. J. Atmos. Sci., 70, 824–842, https://doi.org/10.1175/JAS-D-12-028.1.

  • O’Kane, T. J., J. S. Risbey, D. P. Monselesan, I. Horenko, and C. L. E. Franzke, 2016: On the dynamics of persistent states and their secular trends in the waveguides of the Southern Hemisphere troposphere. Climate Dyn., 46, 3567–3597, https://doi.org/10.1007/s00382-015-2786-8.

  • O’Kane, T. J., D. P. Monselesan, J. S. Risbey, I. Horenko, and C. L. E. Franzke, 2017: On memory, dimension, and atmospheric teleconnections. Math. Climate Wea. Forecasting, 3, 1–27, https://doi.org/10.1515/mcwf-2017-0001.

  • Quinn, C., D. Harries, and T. J. O’Kane, 2021: Dynamical analysis of a reduced model for the North Atlantic Oscillation. J. Atmos. Sci., 78, 1647–1671, https://doi.org/10.1175/JAS-D-20-0282.1.

  • Rasmusson, E. M., and T. H. Carpenter, 1982: Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Niño. Mon. Wea. Rev., 110, 354–384, https://doi.org/10.1175/1520-0493(1982)110<0354:VITSST>2.0.CO;2.

  • Reynolds, R. W., T. M. Smith, C. Liu, D. B. Chelton, K. S. Casey, and M. G. Schlax, 2007: Daily high-resolution-blended analyses for sea surface temperature. J. Climate, 20, 5473–5496, https://doi.org/10.1175/2007JCLI1824.1.

  • Richardson, D., A. S. Black, D. P. Monselesan, T. S. Moore, J. S. Risbey, A. Schepen, D. T. Squire, and C. R. Tozer, 2021: Identifying periods of forecast model confidence for improved subseasonal prediction of precipitation. J. Hydrometeor., 22, 371–385, https://doi.org/10.1175/JHM-D-20-0054.1.

  • Risbey, J. S., T. J. O’Kane, D. P. Monselesan, C. Franzke, and I. Horenko, 2015: Metastability of Northern Hemisphere teleconnection modes. J. Atmos. Sci., 72, 35–54, https://doi.org/10.1175/JAS-D-14-0020.1.

  • Risbey, J. S., D. P. Monselesan, A. S. Black, T. S. Moore, D. Richardson, D. T. Squire, and C. R. Tozer, 2021: The identification of long-lived Southern Hemisphere flow events using archetypes and principal components. Mon. Wea. Rev., 149, 1987–2010, https://doi.org/10.1175/MWR-D-20-0314.1.

  • Seitola, T., V. Mikkola, J. Silen, and H. Järvinen, 2014: Random projections in reducing the dimensionality of climate simulation data. Tellus, 66A, 25274, https://doi.org/10.3402/tellusa.v66.25274.

  • Seth, S., and M. J. A. Eugster, 2016: Probabilistic archetypal analysis. Mach. Learn., 102, 85–113, https://doi.org/10.1007/s10994-015-5498-8.

  • Steinschneider, S., and U. Lall, 2015: Daily precipitation and tropical moisture exports across the eastern United States: An application of archetypal analysis to identify spatiotemporal structure. J. Climate, 28, 8585–8602, https://doi.org/10.1175/JCLI-D-15-0340.1.

  • Suleman, A., 2017a: On ill-conceived initialization in archetypal analysis. Adv. Data Anal. Classif., 11, 785–808, https://doi.org/10.1007/s11634-017-0303-0.

  • Suleman, A., 2017b: Validation of archetypal analysis. Int. Conf. on Fuzzy Systems, Naples, Italy, IEEE, 1–6, https://doi.org/10.1109/FUZZ-IEEE.2017.8015385.

  • Takaya, K., and H. Nakamura, 2001: A formulation of a phase-independent wave-activity flux for stationary and migratory quasigeostrophic eddies on a zonally varying basic flow. J. Atmos. Sci., 58, 608–627, https://doi.org/10.1175/1520-0469(2001)058<0608:AFOAPI>2.0.CO;2.

  • Takens, F., 1981: Detecting strange attractors in turbulence. Dynamical Systems and Turbulence, Warwick 1980, D. Rand and L.-S. Young, Eds., Vol. 898, Springer, 366–381.

  • Thurau, C., K. Kersting, and C. Bauckhage, 2009: Convex non-negative matrix factorization in the wild. Ninth IEEE Int. Conf. on Data Mining, Miami Beach, FL, IEEE, 523–532, https://doi.org/10.1109/ICDM.2009.55.

  • Thurau, C., K. Kersting, M. Wahabzada, and C. Bauckhage, 2011: Convex non-negative matrix factorization for massive datasets. Knowl. Inf. Syst., 29, 457–478, https://doi.org/10.1007/s10115-010-0352-6.

  • Trenberth, K. E., 1976: Spatial and temporal variations of the Southern Oscillation. Quart. J. Roy. Meteor. Soc., 102, 639–653, https://doi.org/10.1002/qj.49710243310.

  • Trenberth, K. E., and L. Smith, 2006: The vertical structure of temperature in the tropics: Different flavors of El Niño. J. Climate, 19, 4956–4973, https://doi.org/10.1175/JCLI3891.1.

  • Trendafilov, N., and M. Gallo, 2021: Data analysis on simplexes. Multivariate Data Analysis on Matrix Manifolds, N. Trendafilov and M. Gallo, Eds., Springer Series in the Data Sciences, Springer, 373–402.

  • Vinué, G., 2017: Anthropometry: An R package for analysis of anthropometric data. J. Stat. Software, 77, 1–39, https://doi.org/10.18637/jss.v077.i06.

  • Wang, B., 1995: Interdecadal changes in El Niño onset in the last four decades. J. Climate, 8, 267–285, https://doi.org/10.1175/1520-0442(1995)008<0267:ICIENO>2.0.CO;2.

  • Yu, H.-F., N. Rao, and I. S. Dhillon, 2016: Temporal regularized matrix factorization for high-dimensional time series prediction. Advances in Neural Information Processing Systems 29, Barcelona, Spain, NIPS, 847–855, https://proceedings.neurips.cc/paper/2016/hash/85422afb467e9456013a2a51d4dff702-Abstract.html.

  • Zonoobi, D., A. A. Kassim, and Y. V. Venkatesh, 2011: Gini index as sparsity measure for signal reconstruction from compressive samples. IEEE J. Sel. Top. Signal Process., 5, 927–932, https://doi.org/10.1109/JSTSP.2011.2160711.
