• Allen, M. R., and L. A. Smith, 1997: Optimal filtering in singular spectrum analysis. Phys. Lett., 234A, 419428, https://doi.org/10.1016/S0375-9601(97)00559-8.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Andrews, T., J. M. Gregory, and M. J. Webb, 2015: The dependence of radiative forcing and feedback on evolving patterns of surface temperature change in climate models. J. Climate, 28, 16301648, https://doi.org/10.1175/JCLI-D-14-00545.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Armour, K. C., J. Marshall, J. R. Scott, A. Donohoe, and E. R. Newsom, 2016: Southern Ocean warming delayed by circumpolar upwelling and equatorward transport. Nat. Geosci., 9, 549554, https://doi.org/10.1038/ngeo2731.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Barnes, E. A., J. W. Hurrell, I. Ebert-Uphoff, C. Anderson, and D. Anderson, 2019: Viewing forced climate patterns through an AI lens. Geophys. Res. Lett., 46, 13 38913 398, https://doi.org/10.1029/2019GL084944.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Barnes, E. A., B. Toms, J. W. Hurrell, I. Ebert-Uphoff, C. Anderson, and D. Anderson, 2020: Indicator patterns of forced change learned by an artificial neural network. J. Adv. Model. Earth Syst., 12, e2020MS002195, https://doi.org/10.1029/2020MS002195.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Bell, T. L., 1986: Theory of optimal weighting of data to detect climatic change. J. Atmos. Sci., 43, 16941710, https://doi.org/10.1175/1520-0469(1986)043<1694:TOOWOD>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Bellomo, K., L. N. Murphy, M. A. Cane, A. C. Clement, and L. M. Polvani, 2018: Historical forcings as main drivers of the Atlantic multidecadal variability in the CESM large ensemble. Climate Dyn., 50, 36873698, https://doi.org/10.1007/s00382-017-3834-3.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Bellucci, A., A. Mariotti, and S. Gualdi, 2017: The role of forcings in the twentieth-century North Atlantic multidecadal variability: The 1940–75 North Atlantic cooling case study. J. Climate, 30, 73177337, https://doi.org/10.1175/JCLI-D-16-0301.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Bonan, D. B., J. E. Christian, and K. Christianson, 2019: Influence of North Atlantic climate variability on glacier mass balance in Norway, Sweden and Svalbard. J. Glaciol., 65, 580594, https://doi.org/10.1017/jog.2019.35.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Booth, B. B., N. J. Dunstone, P. R. Halloran, T. Andrews, and N. Bellouin, 2012: Aerosols implicated as a prime driver of twentieth-century North Atlantic climate variability. Nature, 484, 228232, https://doi.org/10.1038/nature10946.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Booth, B. B., G. R. Harris, A. Jones, L. Wilcox, M. Hawcroft, and K. S. Carslaw, 2018: Comments on “Rethinking the lower bound on aerosol radiative forcing.” J. Climate, 31, 94079412, https://doi.org/10.1175/JCLI-D-17-0369.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Bretherton, C. S., C. Smith, and J. M. Wallace, 1992: An intercomparison of methods for finding coupled patterns in climate data. J. Climate, 5, 541560, https://doi.org/10.1175/1520-0442(1992)005<0541:AIOMFF>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Broccoli, A. J., K. A. Dahl, and R. J. Stouffer, 2006: Response of the ITCZ to Northern Hemisphere cooling. Geophys. Res. Lett., 33, L01702, https://doi.org/10.1029/2005GL024546.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Cane, M. A., A. C. Clement, A. Kaplan, Y. Kushnir, D. Pozdnyakov, R. Seager, S. E. Zebiak, and R. Murtugudde, 1997: Twentieth-century sea surface temperature trends. Science, 275, 957960, https://doi.org/10.1126/science.275.5302.957.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Cane, M. A., A. C. Clement, L. N. Murphy, and K. Bellomo, 2017: Low-pass filtering, heat flux, and Atlantic multidecadal variability. J. Climate, 30, 75297553, https://doi.org/10.1175/JCLI-D-16-0810.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Chen, X., J. M. Wallace, and K.-K. Tung, 2017: Pairwise-rotated EOFs of global SST. J. Climate, 30, 54735489, https://doi.org/10.1175/JCLI-D-16-0786.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Christian, J. E., N. Siler, M. Koutnik, and G. Roe, 2016: Identifying dynamically induced variability in glacier mass-balance records. J. Climate, 29, 89158929, https://doi.org/10.1175/JCLI-D-16-0128.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Coats, S., and K. Karnauskas, 2017: Are simulated and observed twentieth century tropical Pacific sea surface temperature trends significant relative to internal variability? Geophys. Res. Lett., 44, 99289937, https://doi.org/10.1002/2017GL074622.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Coats, S., J. E. Smerdon, B. I. Cook, and R. Seager, 2015: Are simulated megadroughts in the North American Southwest forced? J. Climate, 28, 124142, https://doi.org/10.1175/JCLI-D-14-00071.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Cowtan, K., and R. G. Way, 2014: Coverage bias in the HadCRUT4 temperature series and its impact on recent temperature trends. Quart. J. Roy. Meteor. Soc., 140, 19351944, https://doi.org/10.1002/qj.2297.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • DelSole, T., M. K. Tippett, and J. Shukla, 2011: A significant component of unforced multidecadal variability in the recent acceleration of global warming. J. Climate, 24, 909926, https://doi.org/10.1175/2010JCLI3659.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Déqué, M., 1988: 10-day predictability of the Northern Hemisphere winter 500-mb height by the ECMWF operational model. Tellus, 40A, 2636, https://doi.org/10.3402/tellusa.v40i1.11780.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Deser, C., and M. L. Blackmon, 1993: Surface climate variations over the North Atlantic Ocean during winter: 1900–1989. J. Climate, 6, 17431753, https://doi.org/10.1175/1520-0442(1993)006<1743:SCVOTN>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Deser, C., R. Knutti, S. Solomon, and A. S. Phillips, 2012a: Communication of the role of natural variability in future North American climate. Nat. Climate Change, 2, 775779, https://doi.org/10.1038/nclimate1562.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Deser, C., A. Phillips, V. Bourdette, and H. Teng, 2012b: Uncertainty in climate change projections: The role of internal variability. Climate Dyn., 38, 527546, https://doi.org/10.1007/s00382-010-0977-x.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Deser, C., A. Phillips, M. A. Alexander, and B. V. Smoliak, 2014: Projecting North American climate over the next 50 years: Uncertainty due to internal variability. J. Climate, 27, 22712296, https://doi.org/10.1175/JCLI-D-13-00451.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Deser, C., L. Terray, and A. S. Phillips, 2016: Forced and internal components of winter air temperature trends over North America during the past 50 years: Mechanisms and implications. J. Climate, 29, 22372258, https://doi.org/10.1175/JCLI-D-15-0304.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Deser, C., J. W. Hurrell, and A. S. Phillips, 2017: The role of the North Atlantic Oscillation in European climate projections. Climate Dyn., 49, 31413157, https://doi.org/10.1007/s00382-016-3502-z.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Deser, C., and Coauthors, 2020a: Insights from Earth system model initial-condition large ensembles and future prospects. Nat. Climate Change, 10, 277286, https://doi.org/10.1038/s41558-020-0731-2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Deser, C., and Coauthors, 2020b: Isolating the evolving contributions of anthropogenic aerosols and greenhouse gases: A new CESM1 large ensemble community resource. J. Climate, 33, 78357858, https://doi.org/10.1175/JCLI-D-20-0123.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Eade, R., D. Smith, A. Scaife, E. Wallace, N. Dunstone, L. Hermanson, and N. Robinson, 2014: Do seasonal-to-decadal climate predictions underestimate the predictability of the real world? Geophys. Res. Lett., 41, 56205628, https://doi.org/10.1002/2014GL061146.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Eisenman, I., T. Schneider, D. S. Battisti, and C. M. Bitz, 2011: Consistent changes in the sea ice seasonal cycle in response to global warming. J. Climate, 24, 53255335, https://doi.org/10.1175/2011JCLI4051.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Enfield, D. B., A. M. Mestas-Nuñez, and P. J. Trimble, 2001: The Atlantic multidecadal oscillation and its relation to rainfall and river flows in the continental U.S. Geophys. Res. Lett., 28, 20772080, https://doi.org/10.1029/2000GL012745.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • England, M. H., and Coauthors, 2014: Recent intensification of wind-driven circulation in the Pacific and the ongoing warming hiatus. Nat. Climate Change, 4, 222227, https://doi.org/10.1038/nclimate2106.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Frankcombe, L. M., M. H. England, M. E. Mann, and B. A. Steinman, 2015: Separating internal variability from the externally forced climate response. J. Climate, 28, 81848202, https://doi.org/10.1175/JCLI-D-15-0069.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Frankignoul, C., G. Gastineau, and Y.-O. Kwon, 2017: Estimation of the SST response to anthropogenic and external forcing and its impact on the Atlantic multidecadal oscillation and the Pacific decadal oscillation. J. Climate, 30, 98719895, https://doi.org/10.1175/JCLI-D-17-0009.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Greatbatch, R. J., G. Gollan, T. Jung, and T. Kunz, 2012: Factors influencing Northern Hemisphere winter mean atmospheric circulation anomalies during the period 1960/61 to 2001/02. Quart. J. Roy. Meteor. Soc., 138, 19701982, https://doi.org/10.1002/qj.1947.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Guo, R., C. Deser, L. Terray, and F. Lehner, 2019: Human influence on winter precipitation trends (1921–2015) over North America and Eurasia revealed by dynamical adjustment. Geophys. Res. Lett., 46, 34263434, https://doi.org/10.1029/2018GL081316.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Harzallah, A., and R. Sadourny, 1995: Internal versus SST-forced atmospheric variability as simulated by an atmospheric general circulation model. J. Climate, 8, 474495, https://doi.org/10.1175/1520-0442(1995)008<0474:IVSFAV>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Hasselmann, K., 1979: On the signal-to-noise problem in atmospheric response studies. Meteorology over the Tropical Oceans, D. B. Shaw, Ed., Royal Meteorological Society, 251–259.

  • Hasselmann, K., 1993: Optimal fingerprints for the detection of time-dependent climate change. J. Climate, 6, 19571971, https://doi.org/10.1175/1520-0442(1993)006<1957:OFFTDO>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Haustein, K., and Coauthors, 2019: A limited role for unforced internal variability in twentieth-century warming. J. Climate, 32, 48934917, https://doi.org/10.1175/JCLI-D-18-0555.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Hawkins, E., and R. Sutton, 2009: The potential to narrow uncertainty in regional climate predictions. Bull. Amer. Meteor. Soc., 90, 10951108, https://doi.org/10.1175/2009BAMS2607.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Hegerl, G. C., H. von Storch, K. Hasselmann, B. D. Santer, U. Cubasch, and P. D. Jones, 1996: Detecting greenhouse-gas-induced climate change with an optimal fingerprint method. J. Climate, 9, 22812306, https://doi.org/10.1175/1520-0442(1996)009<2281:DGGICC>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Held, I. M., and B. J. Soden, 2006: Robust responses of the hydrological cycle to global warming. J. Climate, 19, 56865699, https://doi.org/10.1175/JCLI3990.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Hurrell, J. W., 1995: Decadal trends in the North Atlantic Oscillation: Regional temperatures and precipitation. Science, 269, 676679, https://doi.org/10.1126/science.269.5224.676.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Jeffrey, S., L. Rotstayn, M. Collier, S. Dravitzki, C. Hamalainen, C. Moeseneder, K. Wong, and J. Syktus, 2013: Australia’s CMIP5 submission using the CSIRO Mk3.6 model. Aust. Meteor. Oceanogr. J., 63 (1), 114, https://doi.org/10.22499/2.6301.001.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Kang, S. M., I. M. Held, D. M. Frierson, and M. Zhao, 2008: The response of the ITCZ to extratropical thermal forcing: Idealized slab-ocean experiments with a GCM. J. Climate, 21, 35213532, https://doi.org/10.1175/2007JCLI2146.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Kay, J., and Coauthors, 2015: The Community Earth System Model (CESM) large ensemble project: A community resource for studying climate change in the presence of internal climate variability. Bull. Amer. Meteor. Soc., 96, 13331349, https://doi.org/10.1175/BAMS-D-13-00255.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Khodri, M., and Coauthors, 2017: Tropical explosive volcanic eruptions can trigger El Niño by cooling tropical Africa. Nat. Commun., 8, 778, https://doi.org/10.1038/S41467-017-00755-6.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Kirchmeier-Young, M. C., F. W. Zwiers, and N. P. Gillett, 2017: Attribution of extreme events in Arctic sea ice extent. J. Climate, 30, 553571, https://doi.org/10.1175/JCLI-D-16-0412.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Kohyama, T., D. L. Hartmann, and D. S. Battisti, 2017: La Niña–like mean-state response to global warming and potential oceanic roles. J. Climate, 30, 42074225, https://doi.org/10.1175/JCLI-D-16-0441.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Kushner, P. J., I. M. Held, and T. L. Delworth, 2001: Southern Hemisphere atmospheric circulation response to global warming. J. Climate, 14, 22382249, https://doi.org/10.1175/1520-0442(2001)014<0001:SHACRT>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Lawley, D., 1956: Tests of significance for the latent roots of covariance and correlation matrices. Biometrika, 43, 128136, https://doi.org/10.2307/2333586.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Lehner, F., C. Deser, and L. Terray, 2017: Toward a new estimate of “time of emergence” of anthropogenic warming: Insights from dynamical adjustment and a large initial-condition model ensemble. J. Climate, 30, 77397756, https://doi.org/10.1175/JCLI-D-16-0792.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Lorenz, E., 1975: Climatic predictability. The Physical Basis of Climate and Climate Modelling, B. Bolin et al., Eds., GARP Publication Series, Vol. 16, World Meteorological Organization, 132–136.

  • Maher, N., S. McGregor, M. H. England, and A. Sen Gupta, 2015: Effects of volcanism on tropical variability. Geophys. Res. Lett., 42, 60246033, https://doi.org/10.1002/2015GL064751.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Maher, N., and Coauthors, 2019: The Max Planck Institute Grand Ensemble: Enabling the exploration of climate system variability. J. Adv. Model. Earth Syst., 11, 20502069, https://doi.org/10.1029/2019MS001639.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Mantua, N. J., S. R. Hare, Y. Zhang, J. M. Wallace, and R. C. Francis, 1997: A Pacific interdecadal climate oscillation with impacts on salmon production. Bull. Amer. Meteor. Soc., 78, 10691079, https://doi.org/10.1175/1520-0477(1997)078<1069:APICOW>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • McKinnon, K. A., A. Poppick, E. Dunn-Sigouin, and C. Deser, 2017: An “observational large ensemble” to compare observed and modeled temperature trend uncertainty due to internal variability. J. Climate, 30, 75857598, https://doi.org/10.1175/JCLI-D-16-0905.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • McPhaden, M., T. Lee, and D. McClurg, 2011: El Niño and its relationship to changing background conditions in the tropical Pacific Ocean. Geophys. Res. Lett., 38, L15709, https://doi.org/10.1029/2011GL048275.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Meehl, G. A., and Coauthors, 2009: Decadal prediction: Can it be skillful? Bull. Amer. Meteor. Soc., 90, 14671486, https://doi.org/10.1175/2009BAMS2778.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Merrifield, A., F. Lehner, S.-P. Xie, and C. Deser, 2017: Removing circulation effects to assess central U.S. land–atmosphere interactions in the CESM large ensemble. Geophys. Res. Lett., 44, 99389946, https://doi.org/10.1002/2017GL074831.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Newman, M., and Coauthors, 2016: The Pacific decadal oscillation, revisited. J. Climate, 29, 43994427, https://doi.org/10.1175/JCLI-D-15-0508.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • North, G. R., T. L. Bell, R. F. Cahalan, and F. J. Moeng, 1982: Sampling errors in the estimation of empirical orthogonal functions. Mon. Wea. Rev., 110, 699706, https://doi.org/10.1175/1520-0493(1982)110<0699:SEITEO>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Palmer, T. N., 1999: A nonlinear dynamical perspective on climate prediction. J. Climate, 12, 575591, https://doi.org/10.1175/1520-0442(1999)012<0575:ANDPOC>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Pausata, F. S. R., D. Zanchetti, C. Karamperidou, R. Caballero, and D. Battisti, 2020: ITCZ shift and extratropical teleconnections drive ENSO response to volcanic eruptions. Sci. Adv., 23, eaaz5006, https://doi.org/10.1126/sciadv.aaz5006.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Rodgers, K. B., J. Lin, and T. L. Frölicher, 2015: Emergence of multiple ocean ecosystem drivers in a large ensemble suite with an Earth system model. Biogeosciences, 12, 33013320, https://doi.org/10.5194/bg-12-3301-2015.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Saffioti, C., E. M. Fischer, S. C. Scherrer, and R. Knutti, 2016: Reconciling observed and modeled temperature and precipitation trends over Europe by adjusting for circulation variability. Geophys. Res. Lett., 43, 81898198, https://doi.org/10.1002/2016GL069802.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Santer, B. D., W. Brüggemann, U. Cubasch, K. Hasselmann, H. Höck, E. Maier-Reimer, and U. Mikolajewica, 1994: Signal-to-noise analysis of time-dependent greenhouse warming experiments. Climate Dyn., 9, 267285, https://doi.org/10.1007/BF00204743.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Scaife, A., and D. Smith, 2018: A signal-to-noise paradox in climate science. npj Climate Atmos. Sci., 1, 28, https://doi.org/10.1038/S41612-018-0038-4.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Scaife, A., and Coauthors, 2014: Skillful long-range prediction of European and North American winters. Geophys. Res. Lett., 41, 25142519, https://doi.org/10.1002/2014GL059637.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Schneider, T., 2001: Analysis of incomplete climate data: Estimation of mean values and covariance matrices and imputation of missing values. J. Climate, 14, 853871, https://doi.org/10.1175/1520-0442(2001)014<0853:AOICDE>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Schneider, T., and S. M. Griffies, 1999: A conceptual framework for predictability studies. J. Climate, 12, 31333155, https://doi.org/10.1175/1520-0442(1999)012<3133:ACFFPS>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Schneider, T., and I. M. Held, 2001: Discriminants of twentieth-century changes in Earth surface temperatures. J. Climate, 14, 249254, https://doi.org/10.1175/1520-0442(2001)014<0249:LDOTCC>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Seager, R., N. Naik, and G. A. Vecchi, 2010: Thermodynamic and dynamic mechanisms for large-scale changes in the hydrological cycle in response to global warming. J. Climate, 23, 46514668, https://doi.org/10.1175/2010JCLI3655.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Seager, R., M. Cane, N. Henderson, D.-E. Lee, R. Abernathey, and H. Zhang, 2019: Strengthening tropical Pacific zonal sea surface temperature gradient consistent with rising greenhouse gases. Nat. Climate Change, 9, 517522, https://doi.org/10.1038/s41558-019-0505-x.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Semenov, V. A., M. Latif, J. H. Jungclaus, and W. Park, 2008: Is the observed NAO variability during the instrumental record unusual? Geophys. Res. Lett., 35, L11701, https://doi.org/10.1029/2008GL033273.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Semenov, V. A., M. Latif, D. Dommenget, N. S. Keenlyside, A. Strehz, T. Martin, and W. Park, 2010: The impact of North Atlantic–Arctic multidecadal variability on Northern Hemisphere surface air temperature. J. Climate, 23, 56685677, https://doi.org/10.1175/2010JCLI3347.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Shindell, D. T., and Coauthors, 2013: Radiative forcing in the ACCMIP historical and future climate simulations. Atmos. Chem. Phys., 13, 29392974, https://doi.org/10.5194/acp-13-2939-2013.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Siler, N., C. Proistosescu, and S. Po-Chedley, 2019: Natural variability has slowed the decline in western U.S. snowpack since the 1980s. Geophys. Res. Lett., 46, 346355, https://doi.org/10.1029/2018GL081080.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Simpson, I. R., C. Deser, K. A. McKinnon, and E. A. Barnes, 2018: Modeled and observed multidecadal variability in the North Atlantic jet stream and its connection to sea surface temperatures. J. Climate, 31, 83138338, https://doi.org/10.1175/JCLI-D-18-0168.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Sippel, S., N. Meinshausen, A. Merrifield, F. Lehner, A. G. Pendergrass, E. Fischer, and R. Knutti, 2019: Uncovering the forced climate response from a single ensemble member using statistical learning. J. Climate, 32, 56775699, https://doi.org/10.1175/JCLI-D-18-0882.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Smoliak, B. V., J. M. Wallace, P. Lin, and Q. Fu, 2015: Dynamical adjustment of the Northern Hemisphere surface air temperature field: Methodology and application to observations. J. Climate, 28, 16131629, https://doi.org/10.1175/JCLI-D-14-00111.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Solomon, A., and M. Newman, 2012: Reconciling disparate twentieth-century Indo-Pacific ocean temperature trends in the instrumental record. Nat. Climate Change, 2, 691699, https://doi.org/10.1038/nclimate1591.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Solomon, A., and Coauthors, 2011: Distinguishing the roles of natural and anthropogenically forced decadal climate variability: Implications for prediction. Bull. Amer. Meteor. Soc., 92, 141156, https://doi.org/10.1175/2010BAMS2962.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Stern, W., and K. Miyakoda, 1995: Feasibility of seasonal forecasts inferred from multiple GCM simulations. J. Climate, 8, 10711085, https://doi.org/10.1175/1520-0442(1995)008<1071:FOSFIF>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Stolpe, M. B., I. Medhaug, and R. Knutti, 2017: Contribution of Atlantic and Pacific multidecadal variability to twentieth-century temperature changes. J. Climate, 30, 62796295, https://doi.org/10.1175/JCLI-D-16-0803.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Stolpe, M. B., I. Medhaug, J. Sedláček, and R. Knutti, 2018: Multidecadal variability in global surface temperatures related to the Atlantic meridional overturning circulation. J. Climate, 31, 28892906, https://doi.org/10.1175/JCLI-D-17-0444.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Sun, L., M. Alexander, and C. Deser, 2018: Evolution of the global coupled climate response to Arctic sea ice loss during 1990–2090 and its contribution to climate change. J. Climate, 31, 78237843, https://doi.org/10.1175/JCLI-D-18-0134.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Takahashi, C., and M. Watanabe, 2016: Pacific trade winds accelerated by aerosol forcing over the past two decades. Nat. Climate Change, 6, 768772, https://doi.org/10.1038/nclimate2996.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Tandon, N. F., and P. J. Kushner, 2015: Does external forcing interfere with the AMOC’s influence on North Atlantic sea surface temperature? J. Climate, 28, 63096323, https://doi.org/10.1175/JCLI-D-14-00664.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Thompson, D. W., S. Solomon, P. J. Kushner, M. H. England, K. M. Grise, and D. J. Karoly, 2011: Signatures of the Antarctic ozone hole in Southern Hemisphere surface climate change. Nat. Geosci., 4, 741749, https://doi.org/10.1038/ngeo1296.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Thompson, D. W., E. A. Barnes, C. Deser, W. E. Foust, and A. S. Phillips, 2015: Quantifying the role of internal climate variability in future climate trends. J. Climate, 28, 64436456, https://doi.org/10.1175/JCLI-D-14-00830.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Ting, M., Y. Kushnir, R. Seager, and C. Li, 2009: Forced and internal twentieth-century SST trends in the North Atlantic. J. Climate, 22, 14691481, https://doi.org/10.1175/2008JCLI2561.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Trenberth, K. E., and J. W. Hurrell, 1994: Decadal atmosphere–ocean variations in the Pacific. Climate Dyn., 9, 303319, https://doi.org/10.1007/BF00204745.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Trenberth, K. E., and D. J. Shea, 2006: Atlantic hurricanes and natural variability in 2005. Geophys. Res. Lett., 33, L12704, https://doi.org/10.1029/2006GL026894.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Ulbrich, U., and M. Christoph, 1999: A shift of the NAO and increasing storm track activity over Europe due to anthropogenic greenhouse gas forcing. Climate Dyn., 15, 551559, https://doi.org/10.1007/s003820050299.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Venzke, S., M. R. Allen, R. T. Sutton, and D. P. Rowell, 1999: The atmospheric response over the North Atlantic to decadal changes in sea surface temperature. J. Climate, 12, 25622584, https://doi.org/10.1175/1520-0442(1999)012<2562:TAROTN>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Wallace, J. M., Q. Fu, B. V. Smoliak, P. Lin, and C. M. Johanson, 2012: Simulated versus observed patterns of warming over the extratropical Northern Hemisphere continents during the cold season. Proc. Natl. Acad. Sci. USA, 109, 14 33714 342, https://doi.org/10.1073/pnas.1204875109.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Watanabe, M., and H. Tatebe, 2019: Reconciling roles of sulphate aerosol forcing and internal variability in Atlantic multidecadal climate changes. Climate Dyn., 53, 46514665, https://doi.org/10.1007/s00382-019-04811-3.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Wills, R. C., D. S. Battisti, D. L. Hartmann, and T. Schneider, 2017: Extracting modes of variability and change from climate model ensembles. Proc. Seventh Int. Workshop on Climate Informatics: CI 2017, V. Lyubchich et al., Eds., NCAR Tech. Note NCAR/TN-536+PROC, 25–28.

  • Wills, R. C., T. Schneider, J. M. Wallace, D. S. Battisti, and D. L. Hartmann, 2018: Disentangling global warming, multidecadal variability, and El Niño in Pacific temperatures. Geophys. Res. Lett., 45, 24872496, https://doi.org/10.1002/2017GL076327.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Wills, R. C., K. C. Armour, D. S. Battisti, and D. L. Hartmann, 2019a: Ocean–atmosphere dynamic coupling fundamental to the Atlantic multidecadal oscillation. J. Climate, 32, 251272, https://doi.org/10.1175/JCLI-D-18-0269.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Wills, R. C., D. S. Battisti, C. Proistosescu, L. Thompson, D. L. Hartmann, and K. C. Armour, 2019b: Ocean circulation signatures of North Pacific decadal variability. Geophys. Res. Lett., 46, 16901701, https://doi.org/10.1029/2018GL080716.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Yin, J. H., 2005: A consistent poleward shift of the storm tracks in simulations of 21st century climate. Geophys. Res. Lett., 32, L18701, https://doi.org/10.1029/2005GL023684.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Zhang, R., and Coauthors, 2013: Have aerosols caused the observed Atlantic multidecadal variability? J. Atmos. Sci., 70, 11351144, https://doi.org/10.1175/JAS-D-12-0331.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Zhang, R., R. Sutton, G. Danabasoglu, Y.-O. Kwon, R. Marsh, S. G. Yeager, D. E. Amrhein, and C. M. Little, 2019: A review of the role of the Atlantic meridional overturning circulation in Atlantic multidecadal variability and associated climate impacts. Rev. Geophys., 57, 316375, https://doi.org/10.1029/2019rg000644.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Zhang, X., and J. E. Walsh, 2006: Toward a seasonally ice-covered Arctic Ocean: Scenarios from the IPCC AR4 model simulations. J. Climate, 19, 17301747, https://doi.org/10.1175/JCLI3767.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Zwiers, F., 1996: Interannual variability and predictability in an ensemble of AMIP climate simulations conducted with the CCC GCM2. Climate Dyn., 12, 825847, https://doi.org/10.1007/s003820050146.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Fig. 1.

    (a)–(d) S/N-maximizing patterns of seasonal-mean surface temperature anomalies in the CESM-LE historical and RCP8.5 simulations over the time period 1920–2019, with N = 200 EOFs retained. The time evolution of the S/N-maximizing patterns in all ensemble members is shown as standard deviation anomalies with gray lines. The black line shows the ensemble-mean time evolution of each pattern (i.e., ⟨t_k⟩). Note that the seasonal cycle in the ensemble-mean time evolution indicates forced changes in the seasonality of surface temperature. The 100-yr ensemble-mean trends in each pattern are shown separately for (left to right) January–March (JFM), April–June (AMJ), July–September (JAS), and October–December (OND) in the bar chart insets. The y scale for the bar chart insets is half that for the time series in (c).

  • Fig. 2.

    Fingerprint patterns $u_k$ (°C$^{-1}$) of seasonal-mean surface temperature anomalies in the CESM-LE historical and RCP8.5 simulations over the time period 1920–2019, with N = 200 EOFs retained (cf. S/N-maximizing patterns $v_k$ in Figs. 1a and 1b).

  • Fig. 3.

    (a) Signal fraction ($s_k$) and percent of the total (ensemble-mean) forced variance captured by the leading S/NPs. The dashed line gives the minimum value of $s_k$ that is significant at the 5% significance level computed by block bootstrapping. Note that the percentages of the total forced variance do not add to exactly 100% because of the nonorthogonality of the S/NPs. (b) Global mean of the gridpoint squared correlation between the pattern filtered estimate of the forced response $\langle X_{\mathrm{S/NP}} \rangle$ from one 20-member half-ensemble and the simple ensemble mean $\langle X \rangle$ of the opposite 20-member half-ensemble, as a function of the number of S/NPs included M and the number of EOFs retained N. The dashed line gives the global-mean gridpoint squared correlation between 20-member half-ensembles when no pattern filtering is applied.

  • Fig. 4.

    Spatial maps of the squared correlation between estimates of the forced response in 20-member half-ensembles: (a) when the forced response is estimated by a simple ensemble mean and (b) when the forced response is estimated by S/NP filtering with M = 11 and N = 200. (c) Difference between (a) and (b). Spatial maps of the root-mean-square error (RMSE) between estimates of the forced response in 20-member half-ensembles: (d) when the forced response is estimated by a simple ensemble mean and (e) when the forced response is estimated by S/NP filtering with M = 11 and N = 200. (f) Difference between (d) and (e).

  • Fig. 5.

    Forced responses of large-scale temperature indices in the CESM-LE, computed from (left) a simple ensemble mean $\langle X \rangle$ and (right) the pattern filtered ensemble mean $\langle X_{\mathrm{S/NP}} \rangle$, from S/NP filtering of seasonal surface temperature anomalies with M = 11 and N = 200. Blue and orange lines show the first and second 20-member half-ensembles of the CESM-LE, respectively. The black line shows the full 40-member CESM-LE. The squared correlation between the 20-member half-ensembles is shown in the bottom right of each panel. North Atlantic SST is averaged over 40°–60°N and 0°–80°W. The Pacific east–west SST difference is the difference between the eastern equatorial Pacific (90°–150°W, 6°S–6°N) and the western equatorial Pacific (120°E−180°, 6°S–6°N). U.S. land surface temperatures are averaged over 30°–45°N, including most of the contiguous United States and parts of Mexico and Canada.

  • Fig. 6.

    (a)–(c) Multi-field S/N-maximizing patterns of seasonal-mean surface temperature (TS), sea level pressure (SLP), and precipitation (PR) anomalies in the CESM-LE historical and RCP8.5 simulations over the time period 1920–2019, with N = 200 EOFs retained. The time evolution of the S/N-maximizing patterns in all ensemble members is shown as standard deviation anomalies with gray lines. The black line shows the ensemble-mean time evolution of each pattern. The 100-yr ensemble-mean trends in each pattern are shown separately for (left to right) JFM, AMJ, JAS, and OND in the bar chart insets. The y scale for the bar chart insets is half that for the time series in (a).

  • Fig. 7.

    Forced responses of large-scale precipitation and SLP indices in the CESM-LE, computed from (left) a simple ensemble mean $\langle X \rangle$ and (right) the pattern filtered ensemble mean $\langle X_{\mathrm{S/NP}} \rangle$, from multi-field S/NP filtering of seasonal surface temperature, precipitation, and SLP anomalies with M = 12 and N = 200. Blue and orange lines show the first and second 20-member half-ensembles of the CESM-LE, respectively. The black line shows the full 40-member CESM-LE. The squared correlation between the 20-member half-ensembles is shown in the bottom right of each panel. China precipitation is averaged over land within 100°–120°E and 20°–40°N, which includes small parts of Southeast Asia. U.S. Southwest (U.S.-SW) precipitation is averaged over land within 105°–125°W and 30°–40°N, which includes small parts of northwest Mexico. An approximate North Atlantic Oscillation (NAO) index is computed from the unnormalized SLP anomaly difference (Pa) between Lisbon and Reykjavik. The North Pacific index is the average SLP anomaly over 160°E−140°W and 30°–65°N, as in Trenberth and Hurrell (1994).

  • Fig. 8.

    (a) The global-mean gridpoint squared correlation and (b) the root-mean-square error (RMSE) between estimates of the forced surface temperature response in $n_E$-member subensembles and a reference estimate of the forced response, computed from S/NP filtering of 20 CESM-LE ensemble members that are withheld from the subensembles. Within the subensembles, the forced response is estimated by a simple ensemble mean (blue), S/NP filtering (black), and LFP filtering (orange). (c)–(f) As in (a), but with spatial averaging computed before computing the squared correlation between forced response estimates. Spatial averages are computed as in Fig. 5. All values are the mean over five random choices of ensemble member sampling (without replacement); error bars show ±1 standard deviation spread due to sampling uncertainty.

  • Fig. 9.

    The number of ensemble members needed to constrain the forced response in local temperature using (a) a simple ensemble mean and (b) S/NP filtering. The criterion used is that the forced response must share 80% of its variance with the reference estimate (S/NP filtering of the opposite 20-member half-ensemble; i.e., no ensemble members are shared between the estimate and the reference). Values shown are the median over five random choices of ensemble member sampling.

  • Fig. 10.

    Low-frequency patterns (LFPs) of seasonal-mean surface temperature anomalies in the CESM-LE historical and RCP8.5 simulations over the time period 1920–2019, with N = 50 EOFs retained. The time evolution of the LFPs in all ensemble members is shown as standard deviation anomalies with gray lines. The orange (blue) lines show the ensemble member with the most (least) change in LFP-1 over 2000–19. The black line shows the ensemble-mean time evolution of each pattern. Modified from Wills et al. (2017).

  • Fig. 11.

    (left) Spread in time evolutions of large-scale temperature indices in individual members of the CESM-LE and (right) spread in time evolutions of the same large-scale temperature indices after the application of LFP filtering in individual members of the CESM-LE. Averaging regions for the large-scale temperature indices are defined in the caption of Fig. 5. Note the different y-axis scales for the Pacific east–west SST difference and the U.S. land surface temperature. For reference, the forced response estimate from S/NP filtering of the full 40-member CESM-LE (as in the right-hand side of Fig. 5) is shown in green (same on left and right).

  • Fig. 12.

    Low-frequency patterns (LFPs) of seasonal surface temperature anomalies, and their time evolution in standard deviation anomalies, from the infilled HadCRUT4 (Cowtan and Way 2014) observational product over the time period 1920–2019, with N = 25 EOFs retained.

  • Fig. 13.

    Partitioning of observed trends (from HadCRUT4; Cowtan and Way 2014) into an LFP-filtered component, based on LFP filtering with M = 2 LFPs included and N = 25 EOFs retained, and a residual. Trends are shown in units of degrees Celsius per trend length [e.g., °C (40 yr)$^{-1}$]. Note that the residual component of 1920–2019 temperature trends is multiplied by a factor of 3 for ease of comparison.

  • Fig. 14.

    Time evolution of (a) global-mean surface temperature, (b) North Atlantic SST averaged over 40°–60°N (i.e., the North Atlantic warming hole), and (c) the SST difference between the eastern and western equatorial Pacific (averaging regions as in Fig. 5) in HadCRUT4 (Cowtan and Way 2014), before and after applying LFP filtering.

  • Fig. 15.

    Comparison across four single-model large ensembles and HadCRUT4 observations of the time evolution of (a) global-mean surface temperature, (b) North Atlantic SST averaged over 40°–60°N (i.e., the North Atlantic warming hole), and (c) the SST difference between the eastern and western equatorial Pacific (averaging regions as in Fig. 5). In models, the time series shown are averaged over the full ensemble after application of S/NP filtering. In the analysis of CESM-LE, CSIRO-LE, GFDL-LE, and MPI-LE, we choose a number of EOFs to retain between 90% and 91% of the total variance (200, 225, 150, and 225, respectively); we choose the number of S/NPs based on a 5% significance threshold in $s_k$, computed by block bootstrapping (11, 8, 9, and 10, respectively). The observations are LFP filtered, as shown in Fig. 14.

  • Fig. 16.

    Global mean of gridpoint squared correlation between $X_{\mathrm{LFP}}$, computed entirely from the HadCRUT4 observational product, and $\langle X_{\mathrm{S/NP}} \rangle$, computed from four different large ensembles, over the time period 1930–2019 (excluding the micro-initialization spinup period), as a function of the number M of LFPs included and the number N of EOFs retained in the observational LFP filtering. The values of M and N used in the S/NP filtering are given in the caption of Fig. 15.


Pattern Recognition Methods to Separate Forced Responses from Internal Variability in Climate Model Ensembles and Observations

Robert C. J. Wills, University of Washington, Seattle, Washington
David S. Battisti, University of Washington, Seattle, Washington
Kyle C. Armour, University of Washington, Seattle, Washington
Tapio Schneider, California Institute of Technology, Pasadena, California
Clara Deser, National Center for Atmospheric Research, Boulder, Colorado

Abstract

Ensembles of climate model simulations are commonly used to separate externally forced climate change from internal variability. However, much of the information gained from running large ensembles is lost in traditional methods of data reduction such as linear trend analysis or large-scale spatial averaging. This paper demonstrates how a pattern recognition method (signal-to-noise-maximizing pattern filtering) extracts patterns of externally forced climate change from large ensembles and identifies the forced climate response with up to 10 times fewer ensemble members than simple ensemble averaging. It is particularly effective at filtering out spatially coherent modes of internal variability (e.g., El Niño, North Atlantic Oscillation), which would otherwise alias into estimates of regional responses to forcing. This method is used to identify forced climate responses within the 40-member Community Earth System Model (CESM) large ensemble, including an El Niño–like response to volcanic eruptions and forced trends in the North Atlantic Oscillation. The ensemble-based estimate of the forced response is used to test statistical methods for isolating the forced response from a single realization (i.e., individual ensemble members). Low-frequency pattern filtering is found to skillfully identify the forced response within individual ensemble members and is applied to the HadCRUT4 reconstruction of observed temperatures, whereby it identifies slow components of observed temperature changes that are consistent with the expected effects of anthropogenic greenhouse gas and aerosol forcing.

Corresponding author: Robert C. Jnglin Wills, rcwills@uw.edu

1. Introduction

The observed increase in global temperatures over the past century has not been uniform in space or time. Variability in the rate and pattern of global warming arises from a combination of anthropogenic influences, natural external forcing (e.g., from volcanic sulfur emissions), and internal climate variability arising from processes within (and interactions between) the atmosphere, oceans, cryosphere, and land surface. A primary goal of climate science is to separate the influences of external forcing and internal variability on the global temperature record, as is needed to attribute observed climate changes, to estimate the climate response to future changes in radiative forcing, and to characterize and understand internal climate variability.

The separation of externally forced climate change and internal variability has typically been addressed by computing the climate response that is robust across an ensemble of simulations (Harzallah and Sadourny 1995; Hawkins and Sutton 2009; Ting et al. 2009; Solomon et al. 2011; Deser et al. 2014; Frankcombe et al. 2015). Averaging over multiple ensemble members removes internal variability that varies in phase between realizations. Externally forced climate change can be estimated by the ensemble mean, and internal variability can be estimated by deviations from the ensemble mean. However, multimodel ensembles such as the Coupled Model Intercomparison Project (CMIP) conflate model biases with internal variability. This has motivated the use of single-model large ensembles (e.g., Kay et al. 2015; Deser et al. 2020a), where the same model is run multiple times with the same forcing but small differences in the initial condition.

Estimating the climate response to forcing from large ensembles is subject to any model biases in the forced response. This has led to a wide range of conclusions on, for example, the extent to which multidecadal variability in Atlantic sea surface temperatures (SSTs) represents true internal variability or is modified by anthropogenic forcing (Ting et al. 2009; Booth et al. 2012; Zhang et al. 2013; Tandon and Kushner 2015; Bellucci et al. 2017; Bellomo et al. 2018; Watanabe and Tatebe 2019) and the extent to which the observed strengthening of the Pacific trade winds and east–west SST gradient since the late 1970s is forced or unforced (McPhaden et al. 2011; England et al. 2014; Takahashi and Watanabe 2016; Coats and Karnauskas 2017; Kohyama et al. 2017; Seager et al. 2019). Comparing across multiple climate models can give insights into which aspects of the forced response are robust and which are not, but this approach becomes computationally intensive as large ensembles are needed for multiple climate models. It is therefore important to identify how many ensemble members are needed to identify forced climate responses and what if anything can be gleaned from individual simulations or from observations.

Seminal work by Deser et al. (2012b, 2014) emphasized that as many as 10–40 ensemble members or more may be needed to identify regional climate responses on time scales up to a few decades, particularly for fields with large internal variability such as precipitation and sea level pressure (SLP). This has motivated modeling centers to run large ensembles with between 20 and 100 ensemble members (Jeffrey et al. 2013; Kay et al. 2015; Rodgers et al. 2015; Kirchmeier-Young et al. 2017; Sun et al. 2018; Maher et al. 2019; Deser et al. 2020a). Now that these large ensembles are available as a test bed, it is possible to revisit the question of how many ensemble members are needed, in order to inform future modeling efforts.

Many studies diagnose the forced response based on the ensemble average of a linear trend or large-scale spatial average. However, this ignores spatiotemporal covariance information that can be valuable in separating forced climate responses from internal variability. A number of studies have demonstrated spatiotemporal analysis methods for isolating the forced climate response from a single realization (Schneider and Held 2001; Wallace et al. 2012; Smoliak et al. 2015; Deser et al. 2016; Frankignoul et al. 2017; Wills et al. 2018; Sippel et al. 2019), with the ultimate goal of isolating the forced component of observed climate changes. However, there has been less focus on the best way to extract forced climate responses from small ensembles (2–10 ensemble members). In this study, we use large ensembles to test statistical methods for isolating forced climate responses, with the goal of identifying the forced response from small ensembles and/or from a single realization. We demonstrate how pattern recognition methods (e.g., Déqué 1988; Allen and Smith 1997; Schneider and Griffies 1999; Ting et al. 2009) can be used to separate patterns of forced response (i.e., patterns with high signal-to-noise ratios) from patterns of internal variability (i.e., patterns with low signal-to-noise ratios) within climate model ensembles, reducing the number of ensemble members needed to skillfully estimate the forced response.

Spatiotemporal analysis methods to estimate the forced response within individual realizations fall into two categories: 1) time-scale separation methods and 2) dynamical adjustment. Taking advantage of the fact that forced climate change operates on a longer time scale than most internal variability, time-scale separation methods seek to identify the slowest evolving anomaly patterns and use them to estimate the forced response (Schneider and Held 2001; Frankignoul et al. 2017; Wills et al. 2018). For example, low-frequency component analysis (LFCA; Wills et al. 2018) filters out patterns of anomalies that exhibit primarily high-frequency variability (i.e., that have a small ratio of low-pass filtered variance to total variance). Dynamical adjustment instead estimates the influence of atmospheric internal variability on a target variable by regression against a variable that is representative of the atmospheric circulation (e.g., SLP). This approach has been successful, especially for removing the influence of internal variability on temperature and precipitation changes at midlatitudes (Wallace et al. 2012; Smoliak et al. 2015; Deser et al. 2016; Saffioti et al. 2016; Merrifield et al. 2017; Lehner et al. 2017; Sippel et al. 2019; Guo et al. 2019) and on snowpack or glacier mass balance changes (Christian et al. 2016; Siler et al. 2019; Bonan et al. 2019). However, in cases where atmospheric circulation changes are important to the forced response (see, e.g., Palmer 1999), dynamical adjustment requires a separate method to estimate forced circulation changes (e.g., the mean over a large ensemble). We are interested in a more general method that could, for example, be applied directly to estimate forced changes in atmospheric circulations. We therefore focus on time-scale separation methods rather than dynamical adjustment in this paper. We refer the reader to Sippel et al. (2019) for a thorough discussion of how to approach this problem using dynamical adjustment.

This paper is organized as follows. In section 2, we describe the pattern recognition methods considered in this study and the climate model simulations and observational data analyzed. In section 3, we demonstrate how identifying signal-to-noise-maximizing patterns improves estimates of the forced climate response within climate model ensembles compared to a simple ensemble average. We show that it isolates forced responses in quantities with low signal-to-noise ratios such as the east–west SST gradient across the equatorial Pacific, SLP over the North Pacific, and precipitation over the southwest United States. In section 4, we show that this method can identify many aspects of the forced response with fewer than five ensemble members. In section 5, we demonstrate how identifying low-frequency patterns can be used to estimate the forced climate response from a single ensemble member and apply this method to characterize long-term changes in observed temperatures that are consistent with the expected responses to external forcing. In section 6, we summarize our conclusions and discuss the generalizability and applications of the statistical methods presented herein.

2. Methods and data

In this paper, we use statistical methods that identify patterns of externally forced or low-frequency changes. These methods rely on a pattern recognition method called linear discriminant analysis (a type of supervised machine learning) to find spatial patterns, or linear combinations of empirical orthogonal functions (EOFs), that maximize a particular type of variance representing a “signal” compared to “noise” that exists within internal variability or among realizations (Déqué 1988; Allen and Smith 1997; Schneider and Griffies 1999; Venzke et al. 1999; Schneider and Held 2001; Ting et al. 2009; DelSole et al. 2011; Wills et al. 2018). This broad category of analyses has variously been referred to as optimal filtering, predictable component analysis, or signal-to-noise-maximizing EOF analysis.

We use two types of such analyses, which differ in their definition of what type of variance constitutes a signal and what type of variance constitutes noise. In signal-to-noise-maximizing pattern (S/NP) filtering, signal is defined by the mean over an ensemble of simulations; therefore, at least two ensemble members are required. Noise is defined as differences between ensemble members and includes all internal variability, regardless of time scale. S/NP filtering is based on earlier work by Schneider and Griffies (1999, hereafter SG99) and Ting et al. (2009, hereafter T09). Similar to multivariate analysis of variance (MANOVA) methods (e.g., Harzallah and Sadourny 1995; Stern and Miyakoda 1995; Zwiers 1996), it tests whether anomaly patterns within an ensemble are distinct in periods with different external forcing (i.e., predictability of the second kind; Lorenz 1975). In low-frequency pattern (LFP) filtering, signal is defined as variance that makes it through a low-pass filter. Noise is defined as all variability at time scales shorter than the low-pass cutoff. LFP filtering has also been called low-frequency component analysis and is based on earlier work by Wills et al. (2018, hereafter W18); see also Schneider and Held (2001, hereafter SH01).

In both cases, “filtering” refers to the retention of only the leading-order patterns (i.e., S/NPs or LFPs), such that patterns of (high-frequency) internal variability are removed from the dataset. These methods thus use the spatial structure of covariance in climate noise to optimally filter it out.

a. S/N-maximizing pattern filtering

The goal of S/NP filtering is to find anomaly patterns (S/NPs) for which different ensemble members agree on the temporal evolution [i.e., patterns with a high signal-to-noise ratio (S/N); SG99; T09]. The variability not described by these patterns can then be truncated, such that patterns of ensemble member disagreement (i.e., noise from internal variability) do not alias into the ensemble average.

We seek anomaly patterns associated with time series $t_k$ that maximize the ratio of (ensemble mean) signal to total variance:

$$s_k = \frac{\langle t_k \rangle^T \langle t_k \rangle}{t_k^T t_k}. \tag{1}$$

Here, angle brackets denote an ensemble average. These time series are determined by the projection of a fingerprint pattern $u_k$ onto the ensemble data matrix $X$:

$$t_k = X u_k. \tag{2}$$

The $n n_E \times p$ ensemble data matrix $X$ is constructed by concatenating the $n \times p$ data matrices $X_i$ from each ensemble member in the time dimension, where $n$ is the length of the time series, $n_E$ is the number of ensemble members, and $p$ is the spatial dimension. Each ensemble member data matrix $X_i$ is weighted by the square root of grid cell area, such that the covariance matrix is area weighted.
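The construction of the ensemble data matrix can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name `build_ensemble_matrix`, the input array layout, and the use of cos(latitude) as a proxy for grid cell area on a regular latitude–longitude grid are all assumptions.

```python
import numpy as np

def build_ensemble_matrix(members, lat_deg):
    """Stack ensemble members in time and apply sqrt(area) weights.

    members: array (n_E, n, n_lat, n_lon) of anomalies (hypothetical layout).
    lat_deg: array (n_lat,) of grid-cell latitudes in degrees.
    Returns X with shape (n_E * n, p), p = n_lat * n_lon, and the weights.
    """
    n_e, n, n_lat, n_lon = members.shape
    # Assumption: cell area on a regular lat-lon grid scales with cos(latitude).
    area = np.repeat(np.cos(np.deg2rad(lat_deg)), n_lon)
    weights = np.sqrt(area / area.sum())
    # Flatten space; C-order reshape concatenates the members along time.
    X = members.reshape(n_e * n, n_lat * n_lon) * weights
    return X, weights
```

With this weighting, the product $X^T X$ is an area-weighted covariance (up to normalization), which is what the EOF analysis in the following steps operates on.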

To ensure that the identified patterns correspond to variability that actually occurs within the ensemble, the fingerprint patterns $u_k$ are required to be linear combinations of the $N$ leading ensemble EOFs $a_k$, with normalized weight vectors $e_k$:

$$u_k = \left[ \frac{a_1}{\sigma_1} \ \ \frac{a_2}{\sigma_2} \ \ \cdots \ \ \frac{a_N}{\sigma_N} \right] e_k. \tag{3}$$

The ensemble EOFs $a_k$ are eigenvectors of the ensemble-mean covariance matrix $C$,

$$C a_k = \sigma_k^2 a_k, \tag{4}$$

where $\sigma_k^2$ is the variance associated with the $k$th EOF. The ensemble-mean covariance matrix $C$ (i.e., the pooled covariance matrix) can be computed as

$$C = n_E^{-1} \sum_{i=1}^{n_E} C_i, \tag{5}$$

where $C_i = (n-1)^{-1} X_i^T X_i$ are the individual ensemble member climatological covariance matrices. The ensemble EOFs are normalized such that $\|a_k\| = 1$ and the principal components $c_k = \sigma_k^{-1} X a_k$ have unit variance over the entire ensemble.

We can solve for the linear-combination coefficients $e_k$ that give $u_k$ and $t_k$ that maximize $s_k$ by plugging (2) and (3) into (1) and using the definition of a principal component $c_k = \sigma_k^{-1} X a_k$ to turn this into an eigenvalue problem, $S e_k = s_k e_k$, where

$$S_{mn} = \langle c_m \rangle^T \langle c_n \rangle, \quad m, n \in [1, N]. \tag{6}$$

The matrix $S$ has $N$ eigenvectors $e_k$, with eigenvalues that give the ratio $s_k$ of signal to total variance. Finally, the S/N-maximizing patterns $v_k$ are determined by the regression of the ensemble data matrix $X$ onto each $t_k$:

$$v_k = X^T t_k = X^T X u_k = \left[ \sigma_1 a_1 \ \ \sigma_2 a_2 \ \ \cdots \ \ \sigma_N a_N \right] e_k. \tag{7}$$

In this analysis, the time series $t_k$ retain their orthogonality (like principal components), but the S/NPs $v_k$ do not.
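The step from the variance ratio to the eigenvalue problem can be written out as follows. This is a sketch of the reasoning rather than a derivation taken from the text; $e_{ik}$ denotes the $i$th element of the weight vector $e_k$ (notation introduced here), and constant normalization factors are left implicit. Substituting the fingerprint pattern into the projection gives

$$t_k = X u_k = \sum_{i=1}^{N} \sigma_i^{-1} X a_i \, e_{ik} = \sum_{i=1}^{N} c_i \, e_{ik}, \qquad \langle t_k \rangle = \sum_{i=1}^{N} \langle c_i \rangle \, e_{ik}.$$

Because the principal components $c_i$ are mutually orthogonal with equal variance and $e_k$ is normalized, the total variance $t_k^T t_k$ is the same constant for every choice of $e_k$, so

$$s_k \propto \langle t_k \rangle^T \langle t_k \rangle = \sum_{m=1}^{N} \sum_{n=1}^{N} e_{mk} \, \langle c_m \rangle^T \langle c_n \rangle \, e_{nk} = e_k^T S e_k.$$

Maximizing this quadratic form over unit vectors $e_k$ is a Rayleigh-quotient problem, whose solutions are the eigenvectors of the symmetric matrix $S$. The same orthogonality of the $c_i$ implies $t_j^T t_k \propto e_j^T e_k = 0$ for $j \neq k$, which is why the time series remain orthogonal.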

The S/NPs are sorted by $s_k$ such that the leading S/NPs are patterns of forced response within the ensemble. This is equivalent to sorting by S/N, which is uniquely determined by the eigenvalue $s_k$:

$$\mathrm{S/N} = s_k (1 - s_k)^{-1}. \tag{8}$$

The first FP is the linear combination of the leading N EOFs with the maximum possible S/N.
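The relation between the signal fraction and S/N can be made explicit with a short worked example. It assumes that the noise variance of a pattern is its total variance minus its signal variance, consistent with noise being defined as differences between ensemble members:

$$\mathrm{S/N} = \frac{\text{signal}}{\text{total} - \text{signal}} = \frac{s_k}{1 - s_k}, \qquad s_k = 0.5 \Rightarrow \mathrm{S/N} = 1, \qquad s_k = 0.8 \Rightarrow \mathrm{S/N} = 4.$$

Since $s_k (1 - s_k)^{-1}$ increases monotonically with $s_k$ on $[0, 1)$, ordering the patterns by $s_k$ and ordering them by S/N give the same ranking.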

Note the difference between the fingerprint patterns $u_k$ [Eq. (3)] and the S/NPs $v_k$ [Eq. (7)]: the fingerprint patterns are weight vectors used to detect the signal, which have no direct physical meaning; in contrast, the S/NPs characterize the signal itself. Fingerprint patterns are also used in the optimal fingerprinting method for detection and attribution, in order to detect a model-based signal within observational data (Hasselmann 1979, 1993; Bell 1986; Hegerl et al. 1996). Here, in contrast, the signal (as characterized by the S/NPs) is determined empirically within a single model-based dataset.

Once the S/NPs have been calculated, the forced response is isolated by constructing a truncated dataset from the M leading S/NPs:

$$X_{\mathrm{S/NP}} = \sum_{k=1}^{M} t_k v_k^T. \tag{9}$$

We will show that the ensemble average of the truncated dataset $\langle X_{\mathrm{S/NP}} \rangle$ (i.e., S/NP filtering) gives a better estimate of the forced response than a simple ensemble average $\langle X \rangle$. The inclusion of $M$ S/NPs to construct an estimate of the forced response $\langle X_{\mathrm{S/NP}} \rangle$ is what distinguishes S/NP filtering from the method of T09, which focuses on the leading pattern in order to estimate the contribution of forcing to Atlantic multidecadal variability.
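The full procedure in Eqs. (1)–(9) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `snp_filter` is hypothetical, the EOFs are obtained from a singular value decomposition of $X$ rather than by forming $C$ explicitly, each member's anomalies are assumed to have zero time mean, and the normalization factors (`dof` and `n - 1`) are chosen here so that the eigenvalues $s_k$ lie between 0 and 1.

```python
import numpy as np

def snp_filter(X, n_e, N, M):
    """S/N-maximizing pattern filtering of an ensemble data matrix.

    X: (n_e * n, p) area-weighted anomalies, members stacked in time.
    Returns signal fractions s (N,), time series T (n_e * n, N),
    patterns V (p, N), and the truncated dataset X_snp (n_e * n, p).
    """
    n = X.shape[0] // n_e
    dof = n_e * (n - 1)                     # assumed pooled-covariance normalization
    # EOFs a_k of C = X^T X / dof are the right singular vectors of X.
    _, sing, Vt = np.linalg.svd(X, full_matrices=False)
    A = Vt[:N].T                            # (p, N), unit-norm EOFs
    sigma = sing[:N] / np.sqrt(dof)         # sigma_k, with sigma_k^2 the EOF variance
    C_pc = (X @ A) / sigma                  # principal components c_k (unit variance)
    # Signal matrix S from the ensemble-mean principal components, Eq. (6).
    C_mean = C_pc.reshape(n_e, n, N).mean(axis=0)
    S = C_mean.T @ C_mean / (n - 1)
    s, E = np.linalg.eigh(S)                # S is symmetric
    order = np.argsort(s)[::-1]             # sort by decreasing signal fraction
    s, E = s[order], E[:, order]
    U = (A / sigma) @ E                     # fingerprint patterns u_k, Eq. (3)
    T = X @ U                               # time series t_k, Eq. (2)
    V = (A * sigma) @ E                     # S/NPs v_k, Eq. (7)
    X_snp = T[:, :M] @ V[:, :M].T           # truncated dataset, Eq. (9)
    return s, T, V, X_snp
```

An estimate of the forced response then follows from averaging the truncated dataset over members, for example `X_snp.reshape(n_e, n, -1).mean(axis=0)`.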

S/NP filtering has two hyperparameters: $N$, the number of EOFs retained, and $M$, the number of S/NPs used in constructing the truncated dataset. The number of EOFs $N$ should generally not exceed the degrees of freedom in the signal of interest, which in the case of the ensemble mean used here is approximately $n - 1$. We pick $N$ to retain 75%–95% of the total variance. We choose $M$ either by computing a significance threshold for $s_k$ based on block bootstrapping or by empirically finding the value of $M$ that maximizes agreement between subsets of the large ensemble (i.e., by comparison to a validation set; see section 3). We find similar answers with methods to choose $M$ based on the eigenvalue spectrum $s_k$ (cf. North et al. 1982). Our results are generally insensitive to these hyperparameter choices for $50 < N < 400$ and $2 < M < 20$ (see section 3).
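One way to turn the variance criterion for $N$ into code is sketched below. The function name `n_eofs_for_variance` is hypothetical, and the sketch covers only the retained-variance rule; the block-bootstrap threshold for $M$ is not shown because its details are not given here.

```python
import numpy as np

def n_eofs_for_variance(X, fraction):
    """Smallest number of EOFs of X that retains at least `fraction` of the variance."""
    sing = np.linalg.svd(X, compute_uv=False)
    cumulative = np.cumsum(sing ** 2) / np.sum(sing ** 2)
    # Index of the first cumulative value >= fraction, converted to a count.
    n_keep = int(np.searchsorted(cumulative, fraction)) + 1
    return min(n_keep, sing.size)
```

For example, `n_eofs_for_variance(X, 0.90)` returns the number of EOFs needed to retain 90% of the total variance of the ensemble data matrix.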

A similar method was presented by DelSole et al. (2011) that looks for patterns that maximize the variance in a simulation of forced climate change relative to a preindustrial control run. This has the advantage of requiring only one forced simulation and one preindustrial control run (rather than at least two forced simulations). However, it could miss forced responses where forcing only modifies the timing (i.e., phase) of a mode of internal variability. In most other respects these methods would identify similar patterns of forced response.

b. Low-frequency pattern filtering

S/NP filtering relies on the computation of an ensemble mean to diagnose the variance that is forced within a dataset. In the case that only a single realization is available, it is necessary to come up with a new variance criterion to distinguish forced from unforced variance. Responses to anthropogenic forcing generally differ from most internal variability in terms of their long time scale. We can therefore look for the slowest evolving patterns within a dataset, which will predominantly include the forced response. One method to find the slowest evolving patterns is low-frequency component analysis (LFCA; W18; see also SH01), which solves for patterns with the maximum ratio of low-frequency to total variance (i.e., LFPs).

LFCA uses the same linear algebra machinery as S/NP filtering, but instead seeks anomaly patterns associated with time series $t_k$ that maximize the ratio of low-frequency signal to total variance:

$$r_k = \frac{\tilde{t}_k^T \tilde{t}_k}{t_k^T t_k}. \tag{10}$$

Low-frequency signal is defined as any variation that makes it through a low-pass filter (denoted by a tilde). Here, we apply a linear Lanczos filter with a 10-yr low-pass cutoff to anomalies from the linear trend to focus on variability at decadal and longer time scales (i.e., multidecadal variability). In low-pass filtering, we do not filter over discontinuities between ensemble members; the data from each ensemble member are filtered separately and then concatenated into a single $\tilde{t}_k$.

The LFPs $v_k$ and their time series $t_k$ are determined by Eqs. (7) and (2), respectively, but with weight vectors $e_k$ that are normalized eigenvectors of the covariance matrix $R$ of the first $N$ low-pass filtered principal components $\tilde{c}_k$:

$$R_{mn} = \tilde{c}_m^T \tilde{c}_n, \quad m, n \in [1, N]. \tag{11}$$

The matrix $R$ has $N$ eigenvectors, $R e_k = r_k e_k$, with eigenvalues that give the ratio $r_k$ of low-frequency to total variance. The LFPs are sorted by $r_k$ such that the leading LFPs are the anomaly patterns that maximize the ratio of low-frequency to total variance.

Just as in S/NP filtering, a truncated dataset is created that contains just the variability captured by the leading M LFPs:

$$X_{\mathrm{LFP}} = \sum_{k=1}^{M} t_k v_k^{T}.$$
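As a rough illustration of the steps above, the following NumPy sketch computes LFPs and the truncated dataset from an anomaly matrix. Because Eqs. (7) and (2) are not reproduced in this section, the sketch assumes that the principal components ck are the unit-norm left singular vectors of X, that tk is their combination weighted by ek, and that vk = X^T tk. The names `lfca` and `running_mean` are hypothetical, and the running mean is only a stand-in for the Lanczos filter.

```python
import numpy as np

def running_mean(t, width):
    """Simple linear low-pass filter (stand-in for the Lanczos filter)."""
    return np.convolve(t, np.ones(width) / width, mode="same")

def lfca(X, lowpass, N, M):
    """Low-frequency component analysis of an (n_time, n_space) anomaly matrix.

    lowpass: callable applying a linear low-pass filter to a 1-D time series.
    N: number of EOFs retained; M: number of LFPs kept in the truncated dataset.
    Returns (r, T, V, X_lfp): variance ratios r_k, time series t_k (columns of T),
    patterns v_k (columns of V), and the truncated dataset.
    """
    X = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    C = U[:, :N]                                   # unit-norm principal components c_k
    C_lp = np.column_stack([lowpass(C[:, k]) for k in range(N)])
    R = C_lp.T @ C_lp                              # R_mn = c~_m^T c~_n
    r, E = np.linalg.eigh(R)                       # R is symmetric
    order = np.argsort(r)[::-1]                    # sort by decreasing ratio r_k
    r, E = r[order], E[:, order]
    T = C @ E                                      # time series t_k, with t_k^T t_k = 1
    V = (S[:N, None] * Vt[:N]).T @ E               # patterns v_k = X^T t_k
    X_lfp = T[:, :M] @ V[:, :M].T                  # sum over k <= M of t_k v_k^T
    return r, T, V, X_lfp
```

With unit-norm tk and a linear filter, each eigenvalue rk equals the ratio of low-frequency to total variance of tk, and setting M = N recovers the rank-N EOF truncation of X.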

In addition to the hyperparameters of S/NP filtering (N and M), LFP filtering depends in general on the properties of the filter used, although we will not explore this particular sensitivity here. A detailed discussion of the robustness of LFPs to the choice of parameters and filter can be found in W18. Unlike principal component analysis of low-pass filtered data, LFCA uses information about spatiotemporal covariance at all time scales (e.g., in computing the EOFs ak). LFCA thus provides a method to isolate the regions and physical mechanisms important at long time scales while avoiding the issues with attributing lead–lag relationships based on filtered data (Cane et al. 2017; Wills et al. 2019a,b).

c. Model output and observational datasets

We focus primarily on surface temperature anomalies in the 40-member CESM1 large ensemble (CESM-LE; Kay et al. 2015), analyzing years 1920–2005 from the historical simulations and years 2006–19 from the RCP8.5 simulations. Each ensemble member experiences the same historical and RCP8.5 forcing from greenhouse gases, anthropogenic aerosols, volcanic sulfur emissions, solar variability, and ozone. They differ by machine-precision atmospheric perturbations on 1 January 1920 (so-called micro initialization). Seasonal (3-monthly) anomalies are computed with respect to each ensemble member’s climatological seasonal cycle over 1920–2019. Results are unchanged if the anomalies are computed instead with respect to the ensemble-mean climatology. In section 3b, we also include analysis of seasonal precipitation and SLP anomalies.
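The two anomaly definitions mentioned above can be sketched as follows, assuming (hypothetically) that the seasonal means are stored in an array of shape (members, years, 4 seasons, grid points). The function name and the `ensemble_mean_climatology` flag are illustrative only.

```python
import numpy as np

def seasonal_anomalies(data, ensemble_mean_climatology=False):
    """Remove the climatological seasonal cycle from seasonal-mean data.

    data: array of shape (n_members, n_years, 4, n_space).
    By default the climatology is computed separately for each member;
    with ensemble_mean_climatology=True it is shared across all members.
    """
    if ensemble_mean_climatology:
        clim = data.mean(axis=(0, 1), keepdims=True)   # one climatology for the ensemble
    else:
        clim = data.mean(axis=1, keepdims=True)        # one climatology per member
    return data - clim
```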

For comparison, we also analyze a 30-member ensemble of the CSIRO-Mk3.6 climate model (CSIRO-LE; Jeffrey et al. 2013), a 20-member ensemble of the GFDL-CM3 climate model (GFDL-LE; Sun et al. 2018), and a 100-member ensemble of the MPI-ESM climate model (MPI-LE; Maher et al. 2019), including years 1920–2005 from the historical simulations and years 2006–19 from the RCP8.5 simulations. As in the CESM-LE, the GFDL-LE uses micro initialization in 1920. The ensemble members of the CSIRO-LE and MPI-LE, however, are all started from different ocean states in 1850 (so-called macro initialization). For computational efficiency, all analysis is done on grids that are half the atmospheric models’ resolution (~1° in CESM-LE; ~1.8° in CSIRO-LE and MPI-LE; ~2° in GFDL-LE) such that four model grid points are averaged into one analysis grid point. For the observational analysis in section 5c, we use the infilled surface temperature reconstruction of Cowtan and Way (2014), based on HadCRUT4 data, for the period 1920–2019.
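Averaging four model grid points into one analysis grid point can be sketched as a 2 × 2 block mean. This is a minimal sketch under stated assumptions: the function name is hypothetical, the grid dimensions are assumed to be even, and the mean is unweighted, whereas the source does not say whether area weights were used within each block.

```python
import numpy as np

def coarsen_2x2(field):
    """Average each 2 x 2 block of grid points into one analysis grid point.

    field: array of shape (..., n_lat, n_lon) with even n_lat and n_lon.
    Returns an array of shape (..., n_lat // 2, n_lon // 2).
    """
    *lead, n_lat, n_lon = field.shape
    blocks = field.reshape(*lead, n_lat // 2, 2, n_lon // 2, 2)
    return blocks.mean(axis=(-3, -1))   # average over the two within-block axes
```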

3. Improved identification of forced climate responses

a. Forced surface temperature responses

We begin by identifying the S/NPs of seasonal (3-monthly) surface temperature anomalies in the 40-member CESM-LE over the time period 1920–2019. S/NP-1 shows the predominant pattern of long-term global warming (Fig. 1a) and can be detected based on changes in temperature throughout the subtropical oceans (Fig. 2a). All ensemble members show approximately the same timing of its evolution (gray lines in Fig. 1a) and are tightly clustered about the ensemble-mean time series (black line in Fig. 1a). S/NP-1 captures centennial global warming punctuated by volcanically induced global cooling due to the eruptions of Agung in 1963, El Chichón in 1982, and Pinatubo in 1991. However, it is not the only pattern of forced response: S/NP-2, which shows hemispherically asymmetric temperature anomalies, also has a common temporal evolution in all ensemble members (Fig. 1b). The signal fraction (i.e., the eigenvalue sk) is only slightly lower for S/NP-2 than for S/NP-1 (0.81 vs 0.97; Fig. 3a) and both have a S/N well above 1. The timing of S/NP-2 corresponds to Northern Hemisphere cooling between 1940 and 1970, and warming since, consistent with anthropogenic aerosol forcing (Shindell et al. 2013). S/NP-2 also shows large negative anomalies (cold Northern Hemisphere) following volcanic eruptions. S/NP-2 can be detected based on the asymmetry in subtropical ocean warming between the Northern and Southern Hemisphere (Fig. 2b).

Fig. 1.

(a)–(d) S/N-maximizing patterns of seasonal-mean surface temperature anomalies in the CESM-LE historical and RCP8.5 simulations over the time period 1920–2019, with N = 200 EOFs retained. The time evolution of the S/N-maximizing patterns in all ensemble members is shown as standard deviation anomalies with gray lines. The black line shows the ensemble-mean time evolution of each pattern (i.e., ⟨tk⟩). Note that the seasonal cycle in the ensemble-mean time evolution indicates forced changes in the seasonality of surface temperature. The 100-yr ensemble-mean trends in each pattern are shown separately for (left to right) January–March (JFM), April–June (AMJ), July–September (JAS), and October–December (OND) in the bar chart insets. The y scale for the bar chart insets is half that for the time series in (c).

Citation: Journal of Climate 33, 20; 10.1175/JCLI-D-19-0855.1

Fig. 2.

Fingerprint patterns uk (°C−1) of seasonal-mean surface temperature anomalies in the CESM-LE historical and RCP8.5 simulations over the time period 1920–2019, with N = 200 EOFs retained (cf. S/N-maximizing patterns vk in Figs. 1a and 1b).

Fig. 3.

(a) Signal fraction (sk) and percent of the total (ensemble-mean) forced variance captured by the leading S/NPs. The dashed line gives the minimum value of sk that is significant at the 5% significance level computed by block bootstrapping. Note that the percentages of the total forced variance do not add to exactly 100% because of the nonorthogonality of the S/NPs. (b) Global mean of the grid point squared correlation between the pattern filtered estimate of the forced response XS/NP from one 20-member half-ensemble and the simple ensemble mean X of the opposite 20-member half-ensemble, as a function of the number of S/NPs included M and the number of EOFs retained N. The dashed line gives the global-mean gridpoint squared correlation between 20-member half-ensembles when no pattern filtering is applied.

The next two S/NPs (sk = 0.77 and 0.67, respectively) capture centennial changes in the seasonal cycle of temperature, which manifest themselves in annual cycles in the corresponding ensemble-mean time series (black lines in Figs. 1c,d), with opposite phasing in the early and later parts of the simulations (insets in Fig. 1 show ensemble-mean trends separately for each season). These S/NPs have the largest anomalies in regions of sea ice cover (Figs. 1c,d), indicating that they are capturing changes in the seasonal extent of sea ice (as discussed in Zhang and Walsh 2006; Eisenman et al. 2011). Higher-order S/NPs (not shown) capture nonmonotonic long-term changes, regional responses to volcanic eruptions, and evolution from the common initial ocean state in January 1920 (a result of micro initialization).

To construct an estimate of the forced response, we must choose the number of patterns M to retain. Two possible methods for choosing M are 1) finding a significance level for sk using block bootstrapping or 2) using the large ensemble to empirically determine the number of patterns that works best. To test the block bootstrapping approach, we take random 10-yr samples from the 40-member ensemble (with replacement) to construct randomized ensembles where the members should not agree on the timing of climate responses. We then rerun the S/NP filtering on these randomized ensembles. Using the 95th percentile of s1 in the bootstrapped ensembles as an estimate of the 5% significance level, we find that sk > 0.15 are significant.1 S/NPs with sk below this level could occur due to random chance. According to this bootstrapping test, 11 S/NPs are statistically significant (Fig. 3a). We find a similar answer if we use the simpler North et al. (1982) test, based on the separation of neighboring eigenvalues relative to the sampling uncertainty in those eigenvalues, $\delta s_k = s_k (2/\mathrm{DOF})^{1/2}$, where the degrees of freedom (DOF) is approximately the number of seasonal time steps minus one (i.e., 399).
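The resampling step of the block bootstrap can be sketched as below. This is one possible reading of the procedure, not the code used here: it assumes that each randomized member is assembled from 10-yr blocks (40 seasonal steps) drawn with replacement from random members and random start times. The function names are hypothetical, and `stat` is a placeholder for whatever routine returns s1 from an S/NP analysis of an ensemble.

```python
import numpy as np

def randomized_ensemble(X, block_len, rng):
    """Build an ensemble of shuffled blocks with no common timing of responses.

    X: array of shape (n_members, n_time, n_space).
    Each output member is a concatenation of blocks of length block_len taken
    from random members and random start times (sampling with replacement).
    """
    n_mem, n_time, _ = X.shape
    n_blocks = -(-n_time // block_len)              # ceiling division
    out = np.empty_like(X)
    for m in range(n_mem):
        pieces = []
        for _ in range(n_blocks):
            src = rng.integers(n_mem)
            start = rng.integers(n_time - block_len + 1)
            pieces.append(X[src, start:start + block_len])
        out[m] = np.concatenate(pieces)[:n_time]    # trim to the original length
    return out

def bootstrap_threshold(X, stat, block_len=40, n_boot=100, pct=95, seed=0):
    """Percentile of stat(randomized ensemble) over n_boot bootstrap samples."""
    rng = np.random.default_rng(seed)
    values = [stat(randomized_ensemble(X, block_len, rng)) for _ in range(n_boot)]
    return np.percentile(values, pct)
```

Passing a function that returns the leading signal fraction as `stat` would give an estimate of the 5% significance level in the sense described above.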

Within a large ensemble, we can also empirically test which value of M best estimates the forced response (which should be the same in all subsets of the large ensemble). To do so, we split the ensemble in half, apply S/NP filtering to one 20-member half-ensemble (the training set), and test how well the resulting XS/NP agrees with the ensemble mean X of the opposite 20-member half-ensemble (the validation set). We test agreement based on the global average of the squared correlation between the two estimates of the forced response at a grid point (Fig. 3b). As long as two or more S/NPs are included, S/NP filtering improves the agreement with the ensemble mean of the validation set (the agreement between the ensemble means of the opposite half-ensembles is shown with a dashed line in Fig. 3b). The large jump in agreement between M = 1 and M = 2 means that it is critical to include at least two DOF (two patterns) in an estimate of the forced response.
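The agreement metric used in this test can be sketched as the mean over grid points of the squared correlation between two estimates of the forced response. The function names and the optional `weights` argument (e.g., grid cell areas) are assumptions; the source does not specify how the global average is weighted.

```python
import numpy as np

def gridpoint_r2(A, B):
    """Squared correlation in time between A and B at each grid point.

    A, B: arrays of shape (n_time, n_space) holding two forced response estimates.
    """
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    cov = (A * B).sum(axis=0)
    return cov**2 / ((A**2).sum(axis=0) * (B**2).sum(axis=0))

def global_mean_r2(A, B, weights=None):
    """Global (optionally weighted) mean of the gridpoint squared correlation."""
    return np.average(gridpoint_r2(A, B), weights=weights)
```

For example, `global_mean_r2(X_filtered_half1, X_mean_half2)` would compare a pattern-filtered training estimate with the ensemble mean of the validation set, where both array names are placeholders.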

For the case where 200 EOFs (91.2% of the total variance) are included in the analysis, including M = 10 S/NPs maximizes the agreement with the ensemble mean of the validation set.2 Including further EOFs increases the number M of S/NPs required to maximize this agreement without substantially improving the maximum value of the global-mean squared correlation. The reduction in agreement beyond M = 10–12 is a sign of overfitting to the evolution of anomalies in the particular ensemble members used. We choose representative hyperparameter values of N = 200 EOFs and M = 11 S/NPs for most of the analysis that follows.

Spatial maps of the squared grid point correlation between 20-member half-ensembles, before (Fig. 4a) and after (Fig. 4b) applying S/NP filtering, show that S/NP filtering substantially increases the agreement between subensembles. The largest improvements are over the Northern Hemisphere continents, the North Pacific, the tropical Pacific, Australia, and Antarctica (Fig. 4c). We find qualitatively similar results if we instead use the root-mean-square error (RMSE) between two half-ensembles to measure their agreement (Figs. 4d–f). Note that the S/NP filtering of each subensemble is independent and no information (e.g., EOFs) is shared between analyses.

Fig. 4.

Spatial maps of the squared correlation between estimates of the forced response in 20-member half-ensembles: (a) when the forced response is estimated by a simple ensemble mean and (b) when the forced response is estimated by S/NP filtering with M = 11 and N = 200. (c) Difference between (a) and (b). Spatial maps of the root-mean-square error (RMSE) between estimates of the forced response in 20-member half-ensembles: (d) when the forced response is estimated by a simple ensemble mean and (e) when the forced response is estimated by S/NP filtering with M = 11 and N = 200. (f) Difference between (d) and (e).

Detecting climate signals in grid point temperature is significantly harder than detecting climate signals in a large-scale spatial average because of the larger amplitude of internal variability at small scales (e.g., Deser et al. 2012a). We would therefore like to test whether the improved identification of the forced response by S/NP filtering extends to large-scale averages. Again, we compare agreement between the two 20-member half-ensembles before and after applying pattern filtering.

The time evolution of global-mean surface temperature is in good agreement between the two half-ensembles, even before applying pattern filtering (squared correlation of 0.98; Fig. 5a). S/NP filtering improves this agreement (squared correlation of 0.99; Fig. 5b), but only marginally so. The global average already averages out most internal variability, so pattern filtering does not substantially improve the estimate of the forced response in global-mean surface temperature. Note, however, that it does improve the global-mean surface temperature response estimate when fewer ensemble members are available (see section 4).

Fig. 5.

Forced responses of large-scale temperature indices in the CESM-LE, computed from (left) a simple ensemble mean X and (right) the pattern filtered ensemble mean XS/NP, from S/NP filtering of seasonal surface temperature anomalies with M = 11 and N = 200. Blue and orange lines show the first and second 20-member half-ensembles of the CESM-LE, respectively. The black line shows the full 40-member CESM-LE. The squared correlation between the 20-member half-ensembles is shown in the bottom right of each panel. North Atlantic SST is averaged over 40°–60°N and 0°–80°W. The Pacific east–west SST difference is the difference between the eastern equatorial Pacific (90°–150°W, 6°S–6°N) and the western equatorial Pacific (120°E−180°, 6°S–6°N). U.S. land surface temperatures are averaged over 30°–45°N, including most of the contiguous United States and parts of Mexico and Canada.

The improved identification of climate responses by S/NP filtering is again apparent if we examine regional temperature anomalies such as the North Atlantic (NA) SST (40°–60°N, including the NA warming hole), the SST difference between the eastern and western equatorial Pacific, or the U.S. land surface temperature averaged over 30°–45°N (Figs. 5c–h). Particularly noteworthy is that the 20-member and even 40-member ensemble means of the equatorial Pacific east–west SST difference show substantial noise from El Niño–Southern Oscillation (ENSO) (Fig. 5e), which is removed in the S/NP-filtered estimate of the forced response (Fig. 5f). The squared correlation between the two half-ensembles is only 0.16 before S/NP filtering, but increases to 0.40 after. This reveals an El Niño–like response to volcanic forcing that was not apparent in the 20- or 40-member ensemble means. This response has been studied elsewhere (Maher et al. 2015; Khodri et al. 2017; Pausata et al. 2020), but has only been identifiable by compositing over hundreds of modeled eruption responses. Pattern filtering also reveals ensemble agreement on evolution from a common La Niña–like initial state in January 1920 (a result of micro initialization) and a weak El Niño–like trend since ~1990. In the U.S.-average land temperature, a simple ensemble average shows a long-term warming trend punctuated by cooling in response to volcanic eruptions, but it also has considerable seasonal-to-interannual noise superimposed (Fig. 5g). S/NP filtering identifies the same forced climate signal, but with almost all of this noise removed (Fig. 5h).

b. Forced precipitation and SLP responses

Identifying climate signals in surface temperature is generally easier than in other variables, because the pattern of global warming differs from dominant modes of temperature variability (see, e.g., Santer et al. 1994). To test whether the improved identification of climate responses by S/NP filtering extends to other variables, we consider seasonal precipitation and SLP anomalies in the 40-member CESM-LE. For both variables, S/NP filtering considerably improves the agreement between halves of the CESM-LE on their estimates of the forced response, compared to a simple ensemble mean. Using the metric in Fig. 3b, S/NP filtering (with 10 patterns retained) improves the skill in identifying the spatiotemporal evolution of the forced response from 0.08 to 0.13 for precipitation and from 0.14 to 0.20 for SLP (cf. from 0.62 to 0.73 for surface temperature; Fig. 3b). While more noise remains in these variables after S/NP filtering, the fractional improvement is actually greater than for temperature.

Further improvement can be made by performing a combined analysis on all three fields. We will show the results from this three-field analysis before returning to discuss how it differs from the individual-field analyses at the end of this section. For the multi-field analysis, seasonal precipitation and SLP anomaly matrices are concatenated with the surface temperature anomaly matrix X in the spatial dimension (i.e., creating a new data matrix X with 3 times the spatial dimension). This is analogous to the generalization of EOF analysis to multiple field variables (Bretherton et al. 1992; Deser and Blackmon 1993). Each field variable is normalized by the trace of its covariance matrix such that all variables are unitless and weighted equally. The rest of the multi-field analysis proceeds exactly as in the individual-field case. By using a combined analysis of all three fields, we hope to take advantage of the relatively high S/N in surface temperature anomalies to identify contemporaneous forced responses in precipitation and SLP.
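The construction of the multi-field data matrix can be sketched as follows. This is a minimal sketch under one interpretation of the normalization: each field is divided by the square root of the trace of its covariance matrix so that every field contributes unit total variance. The function name is hypothetical.

```python
import numpy as np

def combine_fields(fields):
    """Concatenate several anomaly fields along the spatial dimension.

    fields: list of arrays, each of shape (n_time, n_space_i).
    Each field is scaled so that the trace of its covariance matrix (its total
    variance) equals 1, making the fields unitless and equally weighted.
    """
    scaled = []
    for F in fields:
        F = F - F.mean(axis=0)
        total_var = (F**2).sum() / (F.shape[0] - 1)   # trace of the covariance matrix
        scaled.append(F / np.sqrt(total_var))
    return np.concatenate(scaled, axis=1)
```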

The multi-field S/N-maximizing patterns show similar temperature anomaly patterns to those found in the temperature-only analysis (Fig. 6, cf. Fig. 1). However, the multi-field analysis additionally identifies contemporaneous precipitation and SLP anomaly patterns. Multi-field S/NP-1 shows increasing SLP in the subtropics and midlatitudes and decreasing SLP in the Arctic and Antarctic (Fig. 6a), trends associated with the poleward shift of the storm tracks and jet streams in both hemispheres (Kushner et al. 2001; Yin 2005). The associated precipitation anomaly pattern shows on average that the dry subtropical regions get drier and the wet extratropical regions get wetter (Held and Soden 2006; Seager et al. 2010), but there is also considerable variability with longitude. In particular, there is strong drying in Southeast Asia and Indonesia, which in this model is a response to the centennial increase in aerosol optical depth in this region (Deser et al. 2020b). Multi-field S/NP-2 shows positive SLP anomalies in the Pacific and Indian Oceans and negative SLP anomalies over Southeast Asia, North America, and the midlatitude Southern Ocean (Fig. 6b). It also shows a shift of the South Pacific convergence zone (SPCZ) toward the southwest and positive precipitation anomalies over China, Southeast Asia, and tropical South America, which looks similar to this model’s precipitation response to anthropogenic aerosol forcing over 1930–79 (Deser et al. 2020b). On average it shows a northward shift in precipitation, consistent with hemispherically asymmetric heating due to anthropogenic Northern Hemisphere aerosol loading (Broccoli et al. 2006; Kang et al. 2008). The presence of precipitation responses to anthropogenic aerosol forcing in both patterns indicates that greenhouse gas and anthropogenic aerosol forcing can still be mixed between the first two patterns. 
Multi-field S/NP-3 shows changes in the seasonality of extratropical sea level pressure anomalies associated with changes in the seasonality of sea ice and high-latitude temperature anomalies, especially in the Aleutian low region and the Southern Ocean. The associated precipitation anomalies are weaker and of smaller spatial scale than those in multi-field S/NP-1 and S/NP-2.

Fig. 6.

(a)–(c) Multi-field S/N-maximizing patterns of seasonal-mean surface temperature (TS), sea level pressure (SLP), and precipitation (PR) anomalies in the CESM-LE historical and RCP8.5 simulations over the time period 1920–2019, with N = 200 EOFs retained. The time evolution of the S/N-maximizing patterns in all ensemble members is shown as standard deviation anomalies with gray lines. The black line shows the ensemble-mean time evolution of each pattern. The 100-yr ensemble-mean trends in each pattern are shown separately for (left to right) JFM, AMJ, JAS, and OND in the bar chart insets. The y scale for the bar-chart insets is half that for the time series in (a).

As with surface temperature, S/NP filtering improves the identification of forced responses in large-scale precipitation anomalies and SLP indices including global-mean precipitation, precipitation averaged over China (land within 100°–120°E and 20°–40°N), precipitation averaged over the Southwest United States (U.S.-SW; land within 105°–125°W and 30°–40°N), the SLP difference between Lisbon and Reykjavik [an unnormalized variant of the North Atlantic Oscillation (NAO) index of Hurrell (1995)], and the North Pacific index [NPI; SLP averaged over 160°E–140°W and 30°–65°N, as in Trenberth and Hurrell (1994)]. Most of these forced responses have a low S/N and are therefore difficult to detect with simple ensemble averaging of 20-member or even 40-member ensembles (left side of Fig. 7). However, by S/NP filtering with the leading 12 multi-field S/N-maximizing patterns (which maximizes the agreement with the ensemble mean of a 20-ensemble-member validation set for N = 200 EOFs; 79.9% of the total variance retained), both 20-member half-ensembles find the same forced responses in these precipitation and SLP indices (right side of Fig. 7).

Fig. 7.

Forced responses of large-scale precipitation and SLP indices in the CESM-LE, computed from (left) a simple ensemble mean X and (right) the pattern filtered ensemble mean XS/NP, from multi-field S/NP filtering of seasonal surface temperature, precipitation, and SLP anomalies with M = 12 and N = 200. Blue and orange lines show the first and second 20-member half-ensembles of the CESM-LE, respectively. The black line shows the full 40-member CESM-LE. The squared correlation between the 20-member half-ensembles is shown in the bottom right of each panel. China precipitation is averaged over land within 100°–120°E and 20°–40°N, which includes small parts of Southeast Asia. U.S. Southwest (U.S.-SW) precipitation is averaged over land within 105°–125°W and 30°–40°N, which includes small parts of northwest Mexico. An approximate North Atlantic Oscillation (NAO) index is computed from the unnormalized SLP anomaly difference (Pa) between Lisbon and Reykjavik. The North Pacific index is the average SLP anomaly over 160°E−140°W and 30°–65°N, as in Trenberth and Hurrell (1994).

With the exception of changes in global-mean precipitation (Figs. 7a,b), the forced responses uncovered by multi-field S/NP filtering would be difficult to detect using more traditional methods. For example, while the long-term decreasing trend in China precipitation would be easy enough to detect in 20-member or even smaller ensembles using standard ensemble averaging or linear trend analysis (Fig. 7c), the reduction in precipitation following volcanic eruptions and the long-term trend in seasonality (toward wetter winters and drier summers) are not apparent until after the S/NP filtering is applied (Fig. 7d). In U.S.-SW precipitation, the signal is small compared to internal variability such that it is completely swamped by noise, even when averaging over a 40-member ensemble (Fig. 7e). However, a weak but robust signal is found in both 20-member half-ensembles using S/NP filtering (Fig. 7f): increased precipitation following volcanic eruptions and a very small long-term positive trend (<0.1 mm day−1 century−1). Recent work by Coats et al. (2015) has investigated whether external forcing, such as from volcanoes, has influenced long-term droughts in this region and concluded that they are dominated by internal variability. While we also find that internal variability is a bigger influence than external forcing on precipitation in this region, we find that volcanic eruptions lead to a detectable shift toward wetter conditions over the subsequent several years (in CESM), likely linked to the El Niño–like response to eruptions.

SLP anomalies have very high amplitude internal variability, which is aliased into even the 40-member ensemble average (Figs. 7g,i). Long-term forced shifts in the NAO or NPI are therefore hard to detect, although there is much interest in knowing the relative contribution of forcing to observed trends (Hurrell 1995; Ulbrich and Christoph 1999; Semenov et al. 2008; Greatbatch et al. 2012; Deser et al. 2017). S/NP filtering provides a means to characterize forced responses in these indices within large ensembles. The CESM-LE shows a forced positive trend in the NAO between 1950 and 1990 (Fig. 7h), especially in the winter half-year, which corresponds roughly to the timing and magnitude of the observed trend over that period (Hurrell 1995; Ulbrich and Christoph 1999; Semenov et al. 2008). There is also a forced negative trend in the NAO between 1990 and 2019. In the Pacific, the CESM-LE shows a forced positive trend in the NPI over the entire century (Fig. 7j), which is focused in the summer half-year. This trend is punctuated by negative anomalies following volcanic eruptions. Another interesting feature isolated by the S/NP filtering is a 200-Pa NPI anomaly in the first three months of 1920, a symptom of the micro initialization.

Multi-field S/NP filtering uncovers a rich spatiotemporal complexity within the forced responses of precipitation and SLP in CESM-LE that would be missed by other methods. This does benefit from the use of surface temperature in the analysis, as individual-field analyses of precipitation or SLP alone do not give as good agreement between the 20-member half-ensembles (reducing the squared correlations given on the right-hand side of Fig. 7 to 0.78, 0.92, 0.36, 0.36, and 0.45, from top to bottom, compared to the multi-field analysis values given in the figure). Individual-field S/NP filtering is still considerably better than a simple ensemble mean (cf. values on the left side of Fig. 7), with the notable exception of global-mean precipitation, forced changes in which are underestimated by individual-field S/NP filtering. The forced response of global-mean precipitation is retained in the multi-field analysis (Fig. 7b), presumably because of its correlation with aspects of the surface temperature response. Overall, multi-field S/NP filtering isolates the forced responses of precipitation and SLP somewhat better than individual-field S/NP filtering, especially for global-mean precipitation.

4. How many ensemble members are needed?

Now that we have shown how S/NP filtering improves estimates of the forced response (for the specific case of 20-member half-ensembles), we return to investigate the question of how many ensemble members are needed to identify the forced response. To do so, we reserve one S/NP-filtered 20-member half-ensemble for comparison, which we will refer to as the reference estimate, and test how well this forced response can be identified within subsets of the remaining ensemble members. To quantify sampling uncertainty, we repeat this procedure for five random choices of ensemble member sampling (without replacement). For simplicity, we use M = 11 S/NPs for all ensemble sizes, although it would be an easy generalization to identify the optimal value of M for each ensemble size.

For the case of identifying the forced evolution of temperature at a grid point, S/NP filtering gives a dramatic improvement in squared correlation with the reference estimate compared to simple ensemble averaging (Figs. 8a,b). This is true for all ensemble sizes between 2 and 20 members. The S/NP-filtered estimate of the forced response based on 3 ensemble members is better than the simple ensemble average of 20 members, both in terms of squared correlation and root-mean-square error with the reference estimate. The S/NP-filtered estimate based on 2 ensemble members is only slightly worse. This means that S/NP filtering reduces the number of ensemble members needed to estimate the forced response by a factor of ~7–10 compared to simple ensemble averaging.

Fig. 8.

(a) The global-mean gridpoint squared correlation and (b) the root-mean-square error (RMSE) between estimates of the forced surface temperature response in nE-member subensembles and a reference estimate of the forced response, computed from S/NP filtering of 20 CESM-LE ensemble members that are withheld from the subensembles. Within the subensembles, the forced response is estimated by a simple ensemble mean (blue), S/NP filtering (black), and LFP filtering (orange). (c)–(f) As in (a), but with spatial averaging computed before computing the squared correlation between forced response estimates. Spatial averages are computed as in Fig. 5. All values are the mean over five random choices of ensemble member sampling (without replacement); error bars show ±1 standard deviation spread due to sampling uncertainty.

We can characterize the number of ensemble members needed to estimate the forced response based on where the variance shared with the reference estimate exceeds a threshold (e.g., 80%). Based on the 80% threshold, four ensemble members are needed with S/NP filtering, while more than 20 ensemble members would be needed with simple ensemble averaging (Fig. 8a). We can map how many ensemble members are needed to detect the forced response in different local temperature anomalies by computing the number of ensemble members at which the subensemble forced response estimate first exceeds an 80% squared correlation with the reference estimate (Fig. 9). Using a simple ensemble mean, more than 20 ensemble members are needed for more than half of grid points globally (Fig. 9a), whereas 2–3 ensemble members are generally enough to detect local forced responses with S/NP filtering (Fig. 9b). Only a few locations, such as the North Pacific, the equatorial Pacific between 150°E and 180°, India, and some regions of the North Atlantic and Southern Ocean (regions of small-scale and/or low-frequency variability) require more than 10 ensemble members when using S/NP filtering.
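The threshold criterion described above can be sketched as a search for the smallest ensemble size at which the squared correlation with the reference estimate reaches the threshold. The function name and array layout are assumptions; locations that never reach the threshold for the sizes tested are marked with NaN.

```python
import numpy as np

def members_needed(sizes, r2, threshold=0.8):
    """Smallest ensemble size at which r2 first reaches the threshold.

    sizes: ensemble sizes in ascending order, shape (n_sizes,).
    r2:    squared correlation with the reference estimate for each size and
           grid point (or index), shape (n_sizes, n_space).
    Returns an array of shape (n_space,), with NaN where no size suffices.
    """
    sizes = np.asarray(sizes, dtype=float)
    ok = np.asarray(r2) >= threshold
    first = ok.argmax(axis=0)                 # index of the first True in each column
    return np.where(ok.any(axis=0), sizes[first], np.nan)
```

Applying this to the median of `r2` over several random choices of ensemble member sampling would give a map comparable in spirit to the one described above.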

Fig. 9.

The number of ensemble members needed to constrain the forced response in local temperature using (a) a simple ensemble mean and (b) S/NP filtering. The criterion used is that the forced response must share 80% of its variance with the reference estimate (S/NP filtering of the opposite 20-member half-ensemble; i.e., no ensemble members are shared between the estimate and the reference). Values shown are the median over five random choices of ensemble member sampling.

Similar results hold for detecting forced responses in large-scale average temperature anomalies. For NA SST anomalies, 3 ensemble members are needed with S/NP filtering versus 7 with a simple ensemble mean (Fig. 8d); for U.S. average land surface temperature, 2–3 members are needed versus 14–15 (Fig. 8f). Fewer ensemble members are needed to capture the forced response in global-mean surface temperature: 2 ensemble members with S/NP filtering versus 4 with simple ensemble averaging (here based on a stricter 95% variance criterion; Fig. 8c). The forced response in the Pacific SST gradient does not satisfy the 80% squared correlation criterion for any choice of ensemble size, but the squared correlation does not increase further after about 10 ensemble members, suggesting that including more than ~10 ensemble members in an estimate of the forced response (based on S/NP filtering) has marginal returns.

Similar results are found for the three other large ensembles (CSIRO-LE, GFDL-LE, and MPI-LE): using S/NP filtering, these ensembles require 6, 3, and 5 ensemble members, respectively, to meet the 80% threshold in global-mean squared correlation (cf. Fig. 8a). They need 2–4 ensemble members to meet the 95% squared correlation threshold for global-mean surface temperature, 2–3 ensemble members to meet the 80% squared correlation threshold for U.S. temperature, and 2–12 ensemble members to meet the 80% squared correlation threshold for NA SST. None of the ensembles exceed a 50% squared correlation for the east–west Pacific SST difference, not even the two 50-member subensembles of MPI-LE. However, this could simply be a result of these models not having a strong response of the Pacific SST gradient to forcing over the past 100 years.

For all temperature indices except the Pacific SST gradient, S/NP filtering with 2–3 ensemble members already gives a reasonable estimate of the forced response, which raises the question of what can be done with a single ensemble member. We will answer this question in the next section.

5. Estimating the forced response from a single realization

a. Testing LFP filtering within the CESM-LE

For the case of a single ensemble member (or, equivalently, observations), agreement on the timing of evolution of large-scale temperature anomaly patterns can no longer be used as a metric for whether they are forced or unforced. Another major difference between forced changes and (most) internal variability is their longer time scale. We can take advantage of this longer time scale to identify patterns that are representative of the forced response. This was first proposed by SH01, who solved for patterns of global surface temperature anomalies that maximize the variance between decadal means relative to the total variance. This was further explored by W18, who solved for patterns of Pacific SST anomalies that maximized the ratio of low-frequency (low-pass filtered) to total variance and found that this can cleanly separate long-term warming from variability associated with the Pacific decadal oscillation (PDO) and ENSO. Here, we use the CESM-LE to test how well the method used in W18 (and described in section 2b) can isolate the forced climate response within a single realization.

First, we show the low-frequency patterns (LFPs) of the full 40-member CESM-LE (Fig. 10). We retain only 50 EOFs in the analysis (vs. 200 in S/NP filtering), amounting to 76.7% of the total variance, because there are fewer DOF in a low-pass filtered 100-yr time series than there are in the full 100-yr time series. The leading LFP shows a global warming pattern, with amplified warming over land and at high latitudes, similar to S/NP-1 (Fig. 10a, cf. Fig. 1a; pattern correlation = 0.999). The second LFP shows cooling of the North Atlantic, Arctic, and Northern Hemisphere land through the 1950s and 1960s and a subsequent recovery, as well as opposite signed changes in the Southern Ocean (Fig. 10b), similar to S/NP-2 (Fig. 1b; pattern correlation = 0.85). The variance among ensemble members is somewhat greater for LFP-2 than for S/NP-2, likely because of the greater projection onto regions of Atlantic multidecadal variability (Enfield et al. 2001; Wills et al. 2019a; Zhang et al. 2019). The third LFP shows low-frequency internal variability associated with the PDO (Fig. 10c) (Mantua et al. 1997; Newman et al. 2016; Wills et al. 2019b). There is only a small excursion in the ensemble mean time series, before 1930, resulting from memory of common ocean initial conditions in January 1920. The fourth LFP also shows somewhat PDO-like low-frequency internal variability, but with opposite signed anomalies in the Greenland, Norwegian, Barents, and Kara Seas (Fig. 10d). It shows little agreement on the timing of its evolution among ensemble members, except for a small response to the twentieth-century volcanic eruptions. The remaining LFPs show internal variability with increasingly shorter time scales.
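The low-frequency pattern calculation described above (maximizing the ratio of low-pass filtered to total variance within a truncated EOF basis) can be sketched as follows. This is a simplified illustration, not the authors' released code: it assumes a centered running mean as a stand-in for the low-pass filter, omits area weighting, and the names `running_mean` and `lfca` and their arguments are hypothetical.

```python
import numpy as np

def running_mean(x, window):
    """Centered running mean along the time axis (axis 0), zero-padded at the
    ends. An assumed stand-in for whatever low-pass filter is preferred."""
    kernel = np.ones(window) / window
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="same"), 0, x)

def lfca(X, n_eofs, window):
    """Low-frequency patterns of X, an array of anomalies (n_time, n_space).

    Returns
    r    : ratio of low-pass to total variance for each component (descending)
    lfcs : unit-variance, mutually uncorrelated time series (n_time, n_eofs)
    lfps : corresponding spatial patterns (n_eofs, n_space)
    """
    X = X - X.mean(axis=0)
    n_time = X.shape[0]
    # EOF truncation via the singular value decomposition.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U, s, Vt = U[:, :n_eofs], s[:n_eofs], Vt[:n_eofs]
    pcs = U * np.sqrt(n_time - 1)              # unit-variance principal components
    # Covariance of the low-pass filtered, whitened principal components.
    pcs_lp = running_mean(pcs, window)
    cov_lp = pcs_lp.T @ pcs_lp / (n_time - 1)
    # Its eigenvectors give the combinations with the largest low-frequency
    # variance fraction (total variance is 1 for any unit-norm combination).
    r, E = np.linalg.eigh(cov_lp)
    order = np.argsort(r)[::-1]
    r, E = r[order], E[:, order]
    lfcs = pcs @ E
    lfps = (E.T * (s / np.sqrt(n_time - 1))) @ Vt
    return r, lfcs, lfps
```

With these outputs, `lfcs @ lfps` reproduces the EOF-truncated anomalies, and a filtered dataset that keeps only the leading two patterns would be `lfcs[:, :2] @ lfps[:2]`.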

Fig. 10.

Low-frequency patterns (LFPs) of seasonal-mean surface temperature anomalies in the CESM-LE historical and RCP8.5 simulations over the time period 1920–2019, with N = 50 EOFs retained. The time evolution of the LFPs in all ensemble members is shown as standard deviation anomalies with gray lines. The orange (blue) lines show the ensemble member with the most (least) change in LFP-1 over 2000–19. The black line shows the ensemble-mean time evolution of each pattern. Modified from Wills et al. (2017).

As with S/NP filtering, we need to choose how many patterns to include in estimating the forced response. Using the CESM-LE, we can determine the ratio of forced signal to total variance sk for each LFP. The only LFPs that exceed the sk ≈ 0.15 cutoff used in the S/NP filtering analysis are LFP-1 (sk = 0.95), LFP-2 (sk = 0.62), and LFP-48 (sk = 0.27). LFP-48 is not low-frequency (i.e., it has low rk); it shows primarily changes in the seasonal cycle and is excluded here (see footnote 3). LFP-3, for comparison, has sk = 0.09. We therefore include the leading two LFPs in an estimate of the forced response. Applying LFP filtering to individual ensemble members, we also find that using M = 2 patterns maximizes the agreement with a reference estimate (the ensemble mean of 20 ensemble members not included in the LFP filtering).
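As a rough illustration of the ratio of forced signal to total variance, the sketch below estimates it for one pattern from that pattern's time series in each ensemble member. It assumes that the forced signal is approximated by the ensemble mean; the function name `signal_fraction` is hypothetical, and the estimate is biased high for small ensembles because the ensemble mean retains a fraction of roughly 1/(number of members) of the noise variance.

```python
import numpy as np

def signal_fraction(ts):
    """Ratio of ensemble-mean (forced) variance to total variance.

    ts : array (n_members, n_time), time series of one pattern in each member
    Returns a value between 0 and 1.
    """
    anomalies = ts - ts.mean()           # anomalies from the grand mean
    signal = anomalies.mean(axis=0)      # ensemble mean at each time
    return float(np.mean(signal ** 2) / np.mean(anomalies ** 2))
```

A pattern whose members all follow the same trajectory gives a value near 1, whereas independent noise in a 40-member ensemble gives a value near 1/40.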

We find that LFP filtering of a single ensemble member provides a better estimate of the forced response than a 20-member ensemble mean (Fig. 8), capturing ~80% of the spatiotemporal variations in the forced response as diagnosed by the reference estimate. It remains the best method to estimate the forced response for up to about 4–5 ensemble members (depending on the metric used), beyond which S/NP filtering is the best method. For global-mean surface temperature (Fig. 8c) and U.S. land surface temperature (Fig. 8f), LFP filtering remains nearly as good an estimate of the forced response as S/NP filtering for up to 20 ensemble members. The benefits of LFP filtering are not as clear for ocean regions with substantial low-frequency internal variability, such as the North Atlantic (in terms of squared correlation), but the RMSE is substantially reduced. The reduction in RMSE can be seen in Fig. 11, which shows the distribution of individual ensemble member time series before and after applying LFP filtering. LFP filtering reduces the spread in the responses by a factor of 2 for global-mean surface temperature and by as much as a factor of 10 in other metrics (note the different y axes). Note, however, that LFP filtering does remove some signals, such as the El Niño–like response to volcanic eruptions and some of the changes in seasonality.

Fig. 11.

(left) Spread in time evolutions of large-scale temperature indices in individual members of the CESM-LE and (right) spread in time evolutions of the same large-scale temperature indices after the application of LFP filtering in individual members of the CESM-LE. Averaging regions for the large-scale temperature indices are defined in the caption of Fig. 5. Note the different y-axis scales for the Pacific east–west SST difference and the U.S. land surface temperature. For reference, the forced response estimate from S/NP filtering of the full 40-member CESM-LE (as in the right-hand side of Fig. 5) is shown in green (same on left and right).

b. Filtering with linear inverse models

With similar goals in mind, Frankignoul et al. (2017) described an optimal perturbation filter (LIMopt) based on linear inverse models (LIMs), and showed that it is among the best available methods for determining the forced climate response from a single realization. Specifically, they considered methods that do not require multiple ensemble members and compared the LIMopt method to a linear trend, quadratic trend, regression against global-mean SST, and multivariate ensemble empirical mode decomposition. We have also tested the LIMopt method for the isolation of the forced response from subsets of the CESM-LE (see the online supplemental material). We find that LFP filtering performs better for global-mean surface temperature and for grid point temperatures, and that it has skill equal to or greater than LIMopt for most large-scale temperature metrics. Furthermore, LFP filtering scales better with the addition of further ensemble members. Comparing with the work of Frankignoul et al. (2017), this also means that LFP filtering isolates the forced response within individual ensemble members better than a linear trend, quadratic trend, regression against global-mean SST, or multivariate ensemble empirical mode decomposition.

c. Application to HadCRUT4 observations

Given the success of LFP filtering in estimating the forced response from individual ensemble members (Figs. 8 and 11), we would like to see what this method can tell us about the forced response in observations. We examine the HadCRUT4 infilled observational surface temperature product (Cowtan and Way 2014). We compute the LFPs of seasonal (3-monthly) surface temperature anomalies over the period 1920–2019, retaining 25 EOFs (67.6% of the total variance) (Fig. 12). While the infilling of missing data can in general lead to biases in the estimated covariance matrix and thus in the LFPs, we find similar results when using HadCRUT3 data imputed with a regularized expectation maximization algorithm (Schneider 2001) (not shown).

Fig. 12.

Low-frequency patterns (LFPs) of seasonal surface temperature anomalies, and their time evolution in standard deviation anomalies, from the infilled HadCRUT4 (Cowtan and Way 2014) observational product over the time period 1920–2019, with N = 25 EOFs retained.

LFP-1 and LFP-2 of observed temperature anomalies are similar to LFP-1 and LFP-2 of the CESM-LE (pattern correlations of 0.52 and 0.55, respectively). This suggests that LFP filtering with M = 2 LFPs would help to isolate the forced response, as in the large ensemble. LFP-3 and LFP-4 are both somewhat PDO-like (cf. W18), giving additional motivation to exclude them from the LFP filtering.

Most long-term trends in observations can be attributed to the first two LFPs (Fig. 13). Over the full century, the influence of the residual is small, and most temperature changes are captured by the LFP-filtered data. Over 1940–79, Northern Hemisphere cooling, which is thought to result in part from aerosol forcing, is retained in the LFP-filtered data. Over this period, there are also weak cooling trends in the eastern Atlantic and eastern Pacific and a strong warming/cooling dipole over western Eurasia and Siberia (captured by the residual). The recent trend over 1980–2019 is largely captured by the LFP-filtered data, except for a negative PDO-like trend in the Pacific and a weak cooling trend in the Atlantic.

Fig. 13.

Partitioning of observed trends (from HadCRUT4; Cowtan and Way 2014) into an LFP-filtered component, based on LFP filtering with M = 2 LFPs included and N = 25 EOFs retained, and a residual. Trends are shown in units of degrees Celsius per trend length [e.g., °C (40 yr)−1]. Note that the residual component of 1920–2019 temperature trends is multiplied by a factor of 3 for ease of comparison.
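The trend partitioning in this figure relies on the linearity of the least squares trend: the trend of the raw data equals the sum of the trends of the LFP-filtered component and the residual. A minimal sketch of a trend expressed per trend length is given below; the function name `trend_over_period` and its arguments are hypothetical, and the filtered and residual fields are assumed to have been computed beforehand.

```python
import numpy as np

def trend_over_period(series, time_years, length_years):
    """Least squares linear trend, expressed as change per `length_years`.

    series       : array (n_time,) or (n_time, n_space), in degrees Celsius
    time_years   : array (n_time,), time in (fractional) years
    length_years : trend length used for the units, e.g. 40 for degC per 40 yr
    """
    t = time_years - time_years.mean()       # centering improves conditioning
    slope_per_year = np.polyfit(t, series, 1)[0]
    return slope_per_year * length_years
```

Because the fit is linear in the data, `trend_over_period(filtered + residual, ...)` equals the sum of the two separate trends up to rounding error.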

We also use LFP filtering to examine the slow component of observed changes in key large-scale temperature indices (Fig. 14). Almost all of the observed global-mean surface temperature changes and much of the observed Atlantic multidecadal variability remain in the LFP-filtered data. The Pacific east–west SST gradient is dominated by high-frequency internal variability (i.e., ENSO), but it also exhibits a slow La Niña–like trend since 1980. Note, however, that the LFP-filtered trend in the east–west SST gradient is smaller than the trend in the raw data (Fig. 13).

Fig. 14.

Time evolution of (a) global-mean surface temperature, (b) North Atlantic SST averaged over 40°–60°N (i.e., the North Atlantic warming hole), and (c) the SST difference between the eastern and western equatorial Pacific (averaging regions as in Fig. 5) in HadCRUT4 (Cowtan and Way 2014), before and after applying LFP filtering.

Interpreting this observational analysis in the context of the results from our LFP-filtering analysis of the CESM-LE (Figs. 8, 10, and 11) may give insight into the forced and unforced components of observed temperature changes. In particular, Fig. 8 suggests that the LFP filtering gives a good estimate of the forced component of changes in large-scale temperature indices from a single realization, roughly equivalent to an estimation of the forced response from a 5-member ensemble mean. This means that the LFP-filtered time series in Fig. 14 approximate the forced responses in these indices. However, it is important to keep in mind that the analysis is only guaranteed to isolate the slow component, which happens to be a better approximation of the forced response than the full unfiltered dataset in most cases. The LFP-filtered time series can still contain some amount of low-frequency internal variability, and should be interpreted with the spread in Figs. 11b, 11d, and 11f in mind.

The LFP-filtered observations are broadly consistent with the forced component (based on S/NP filtering) of temperature changes in four different large ensembles (Fig. 15): CESM-LE (Kay et al. 2015), CSIRO-LE (Jeffrey et al. 2013), GFDL-LE (Sun et al. 2018), and MPI-LE (Maher et al. 2019). One model (GFDL CM3) has too much midcentury cooling of both global-mean temperatures and NA SSTs, suggesting that its aerosol forcing may be too strong. It also seems to overestimate warming in the past two decades, suggesting that its climate sensitivity may be too high. Another model (MPI-ESM) has too little midcentury cooling of NA SSTs, suggesting that its aerosol forcing may be too weak. This is consistent with a diagnosis of aerosol radiative forcing based on simulations with fixed SST (Booth et al. 2018), where these two models span the range of diagnosed aerosol forcing strength in CMIP5 models. In general, the models show midcentury cooling of NA SSTs that occurs earlier than in observations (Fig. 15b), though the timing in observations could also be influenced by Atlantic multidecadal variability.

Fig. 15.

Comparison across four single-model large ensembles and HadCRUT4 observations of the time evolution of (a) global-mean surface temperature, (b) North Atlantic SST averaged over 40°–60°N (i.e., the North Atlantic warming hole), and (c) the SST difference between the eastern and western equatorial Pacific (averaging regions as in Fig. 5). In models, the time series shown are averaged over the full ensemble after application of S/NP filtering. In the analysis of CESM-LE, CSIRO-LE, GFDL-LE, and MPI-LE, we choose a number of EOFs to retain between 90% and 91% of the total variance (200, 225, 150, and 225, respectively); we choose the number of S/NPs based on a 5% significance threshold in sk, computed by block bootstrapping (11, 8, 9, and 10, respectively). The observations are LFP filtered, as shown in Fig. 14.
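Choosing the number of EOFs from a target fraction of total variance, as described in this caption, can be sketched as follows. This is an unweighted illustration under simplifying assumptions (no area weighting or other normalization), and the function name `n_eofs_for_variance` is hypothetical.

```python
import numpy as np

def n_eofs_for_variance(X, target):
    """Smallest number of EOFs whose cumulative variance fraction >= target.

    X      : array (n_time, n_space) of anomalies
    target : fraction of total variance to retain, e.g. 0.90
    Returns the number of EOFs and the cumulative variance fractions.
    """
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    frac = np.cumsum(s ** 2) / np.sum(s ** 2)
    n = int(np.searchsorted(frac, target) + 1)
    return n, frac
```

The returned `frac` array can be inspected to confirm that the chosen truncation falls within the intended range (here, between 90% and 91%).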

The response of the Pacific east–west SST difference varies across models from positive (El Niño–like) to weakly negative (La Niña–like) (Fig. 15c). None of the other models show as strong an El Niño–like response to volcanic eruptions as CESM. Observations show a La Niña–like trend between the 1970s and present that is outside of the range of model forced responses (Fig. 15c), as has been found in several studies looking at the full twentieth century (Cane et al. 1997; Solomon and Newman 2012; Coats and Karnauskas 2017). The LFP-filtered observations show a hint of an El Niño–like response to the eruption of Mt. Pinatubo even though LFP filtering was found to smooth out El Niño–like responses to volcanic eruptions in the CESM-LE (cf. Fig. 11).

The best agreement between LFP-filtered observations and ensemble-based estimates of the forced response is actually found with M = 1 observational LFP, but the estimate with M = 2 observational LFPs remains in good agreement with the ensemble-based forced response estimates (Fig. 16). Including LFP-2 may reduce agreement with the models because the observational LFP-2 reaches its minimum somewhat later than the CESM-LE LFP-2, in the mid-1980s instead of around 1970 (Figs. 10 and 12). Overall, the forced responses in the MPI-LE and the CESM-LE have the highest correlation with the observational record (Fig. 16).

Fig. 16.

Global mean of gridpoint squared correlation between XLFP, computed entirely from the HadCRUT4 observational product, and XS/NP, computed from four different large ensembles, over the time period 1930–2019 (excluding the micro-initialization spinup period), as a function of the number M of LFPs included and the number N of EOFs retained in the observational LFP filtering. The values of M and N used in the S/NP filtering are given in the caption of Fig. 15.
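The metric in this caption can be sketched as the squared temporal correlation at each grid point, averaged over the globe. The sketch below assumes cosine-of-latitude area weights on a regular latitude–longitude grid, which is an assumption rather than a detail taken from the caption; the function name `global_mean_r2` is hypothetical.

```python
import numpy as np

def global_mean_r2(a, b, lat):
    """Area-weighted global mean of the gridpoint squared correlation.

    a, b : arrays (n_time, n_lat, n_lon), e.g. two forced-response estimates
    lat  : array (n_lat,), latitude in degrees
    Grid points with zero variance in either field would need to be masked
    before calling this function.
    """
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    cov = (a * b).sum(axis=0)
    r2 = cov ** 2 / ((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))
    weights = np.cos(np.deg2rad(lat))[:, None] * np.ones(r2.shape)
    return float(np.sum(weights * r2) / np.sum(weights))
```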

The observed trend in temperature asymmetry between the Northern and Southern Hemispheres during the period 1940–79 shows up in the LFP-filtered component in our analysis (Fig. 13), but only if two LFPs are included. This trend in hemispheric asymmetry could have been caused by anthropogenic aerosols (Booth et al. 2012; Tandon and Kushner 2015; Bellucci et al. 2017; Bellomo et al. 2018; Watanabe and Tatebe 2019), stratospheric ozone changes (Thompson et al. 2011), unforced AMOC variability (Semenov et al. 2010; DelSole et al. 2011; Chen et al. 2017), or a transient response of ocean circulations to climate change (Armour et al. 2016; Stolpe et al. 2018). The key to disentangling the forced and unforced components of observed global temperature changes lies in distinguishing between these hypotheses. LFP filtering provides a potential path forward by identifying the main slowly changing temperature pattern (LFP-2) in need of attribution. Climate model ensembles with individual forcing from greenhouse gases, aerosols, and ozone may help to attribute these hemispherically asymmetric temperature changes.

Overall, estimates of the forced and unforced components of observed temperature trends based on LFP filtering largely agree with other estimates in the literature (Frankcombe et al. 2015; Frankignoul et al. 2017; Bellucci et al. 2017; Stolpe et al. 2017, 2018; Haustein et al. 2019), with the exception of T09, DelSole et al. (2011), and Chen et al. (2017), who use related statistical analyses but suggest that only the first pattern is forced and therefore conclude that a large portion of recent warming can be attributed to internal climate variability. In the case of T09 and DelSole et al. (2011), this comes from requiring that forced responses show up in a multimodel average, which could average out aerosol-forced climate responses that differ in pattern, strength, or timing between models.

6. Discussion and conclusions

a. Summary and conclusions

Here, we have demonstrated how S/NP filtering improves estimates of the forced response within climate model ensembles. Within the CESM-LE, this uncovers forced responses that would otherwise be difficult to detect, such as an El Niño–like response to volcanic eruptions, increased (decreased) precipitation in the U.S. Southwest (China) following volcanic eruptions, forced trends in the NAO, and regional changes in the seasonality of temperature, precipitation, and SLP. While all of these signals have a small S/N in a particular year or season, this method uncovers the time progression of local climate change signals that, when averaged over 30 or so years (or sufficient volcanic eruptions), would be statistically significant. The details of the diagnosed forced responses differ across models, but in all four large ensembles tested, S/NP filtering identifies the forced response with fewer ensemble members than a simple ensemble average. The inclusion of at least two DOF (patterns of change) in the forced response is critical in all cases, suggesting that methods that include only one pattern of forced response will generally underestimate the contribution of external forcing to observed temperature changes. Dynamical adjustment (Wallace et al. 2012; Smoliak et al. 2015; Deser et al. 2016; Sippel et al. 2019) may perform similarly for some applications, but does not allow for the detection of forced atmospheric circulation responses, as were identified in the CESM-LE using S/NP filtering.

Using pattern filtering methods for estimating the forced response within climate model ensembles, we revisited the question of how many ensemble members are needed to isolate the forced climate response from internal variability. We tested the number of ensemble members needed (from one-half of the CESM-LE) to converge on the same forced response estimate as was obtained from the other half of the CESM-LE. The answer depends on the particular climate response of interest and on the error tolerance level. For global-mean surface temperature, even a simple ensemble mean is able to isolate the forced response with about three ensemble members. However, S/NP filtering is able to isolate the forced global-mean surface temperature response with two ensemble members and LFP filtering with a single ensemble member. To capture 80% of the full spatiotemporally variable climate response globally, more ensemble members are required: 4–5 when using S/NP filtering. This is a large improvement over simple ensemble averaging, which would need well over 20 ensemble members to reach this threshold. Even in noisy climate metrics such as the tropical Pacific SST gradient, U.S. Southwest precipitation, or the NAO, the addition of ensemble members beyond an ensemble size of about 10 has marginal returns for the identification of the forced response. For future modeling efforts, increasing the number and quality (e.g., resolution) of, for example, 5-member or 10-member ensembles would provide greater benefit than increasing the ensemble size.

Using the CESM-LE as a test bed, we showed that LFP filtering can give an estimate of the forced response from a single realization, although it can miss rapid forced signals such as the response to volcanic eruptions. LFP filtering differs from simple low-pass filtering because it includes information about the spatiotemporal structure of the high-frequency noise in order to optimally filter it out. LFP filtering of a single ensemble member captures more than 80% of the spatiotemporal variance in the ensemble’s forced climate response. With these results as motivation, we used LFP filtering to approximate the forced and unforced components of observed temperature trends, without using any model-based information. Our results support the conjecture that most of the multidecadal changes in global-mean surface temperature and North Atlantic SST are forced and that there has been an externally forced strengthening of the tropical Pacific SST gradient over the past four decades. This approach to estimating the forced response from observations provides an alternative to approaches that combine observational and model-based information (e.g., detection and attribution), which are subject to model biases in the forced response.

b. Generalizability

The number of ensemble members needed to isolate forced climate signals will depend in general on the amplitude of the signal of interest and the characteristics of the noise in the model used. We have focused on simulations of global climate change over 1920–2019, where the forced response is comparable in amplitude to modes of internal variability. Fewer ensemble members would be needed to isolate the forced climate response in simulations with stronger forcing, such as simulations of twenty-first-century climate change or of a quadrupling of CO2. Properties of the internal variability within climate model ensembles and observations also influence the ability to isolate the forced response. Higher-amplitude noise from internal variability does not necessarily make climate responses harder to detect, because this high-amplitude noise could all be contained in a few spatial patterns (e.g., ENSO). The climate variability that is most difficult to remove from estimates of the forced response is that which is on small spatial scales (such that it does not show up in the leading EOFs) and/or on long time scales (such that it has fewer temporal DOF).

A number of studies have pointed out that observations are more predictable than expected from comparison to individual members of climate model ensembles, despite similar amplitudes of climate variability in models and observations, especially on seasonal-to-decadal time scales in the North Atlantic (Scaife et al. 2014; Eade et al. 2014; Scaife and Smith 2018). One potential explanation for this so-called signal-to-noise paradox is that the fraction of atmospheric variability driven by variations in SST is larger in observations than in models, such that a model that is able to correctly predict the evolution of SSTs may correctly predict the timing (but not the amplitude) of observed atmospheric variability once the unpredictable atmospheric noise is averaged out (see, e.g., Simpson et al. 2018). The implications of the signal-to-noise paradox for the skill of pattern filtering in isolating the forced climate response are not clear cut; more unpredictable atmospheric noise in models would make it harder to isolate the forced response in models (and therefore overestimate the difficulty in observations), but more multidecadal coupled atmosphere–ocean variability in observations would pose a challenge for isolating the forced response in observations. Based on this literature, we have no reason to believe that our analysis in section 5 systematically overestimates or underestimates what can be learned about the forced climate response from a single realization.

One limitation of the pattern filtering methods presented here is that they only consider linear combinations of state variables. This may lead to underestimates of nonlinear climate responses (e.g., in cases where positive and negative anomalies have different patterns or amplitudes). This may be apparent in the estimated El Niño–like response to volcanic eruptions (Fig. 5f; cf. Fig. 5e). Future work should investigate whether nonlinear machine learning methods can be constructed that take advantage of patterns with high signal-to-noise ratio, in a similar spirit to the analyses shown here (e.g., Barnes et al. 2019, 2020).

c. Further applications

Estimates of forced responses from pattern filtering are complementary to estimates of the uncertainty in long-term trends, as can be computed from unforced variability in control runs or observations (Thompson et al. 2015; McKinnon et al. 2017). To characterize the unforced variability in observations, these studies rely on removing the forced response, either through detrending or the subtraction of a model-based forced response estimate. However, some of the variability about the long-term trend likely comes from aerosol forcing and other nonmonotonic forcing, as encompassed in LFC-2 of observed temperatures (Fig. 12). If these nonmonotonic forced responses are not fully removed (e.g., if there are biases in the modeled forced response), then this may bias the estimates of unforced variability in observations. By first removing nonmonotonic forced responses using LFP filtering, the uncertainty in long-term trends that results from internal variability could be better estimated from observations.

Separating the forced response from the internal variability also helps us understand internal decadal variability, which may help improve decadal climate predictions (Meehl et al. 2009). Current methods of removing the forced component from indices of internal variability, such as removing the linear trend (Enfield et al. 2001) or global-mean SST (Trenberth and Shea 2006), will become less effective as the forced climate change pattern evolves over time (Andrews et al. 2015). LFP filtering provides a way to identify and remove the forced response from indices of climate variability.
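For reference, the two conventional approaches mentioned above can be written in a few lines. This is a generic sketch of linear detrending and of subtracting the global-mean time series, not the exact procedure of the cited studies; the function names are hypothetical.

```python
import numpy as np

def remove_linear_trend(index, time):
    """Subtract the least squares linear fit from an index time series."""
    t = time - time.mean()
    slope, intercept = np.polyfit(t, index, 1)
    return index - (slope * t + intercept)

def remove_global_mean(index, global_mean):
    """Subtract the global-mean time series (both taken as anomalies)."""
    return (index - index.mean()) - (global_mean - global_mean.mean())
```

Linear detrending leaves behind any forced signal that is not linear in time: for example, a purely quadratic forced signal still has a nonzero residual after `remove_linear_trend` is applied.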

Pattern filtering methods can also be useful for the analysis of multimodel ensembles (e.g., CMIP), as shown in Ting et al. (2009) and DelSole et al. (2011). However, if the timing of a particular forced response pattern differs across models, application of S/NP filtering to a multimodel ensemble would filter this response out. Therefore, in order to study intermodel differences in the forced response, it is necessary to apply pattern filtering separately to each climate model.

Overall, the common framework of S/NP and LFP filtering provides a powerful set of tools for separating forced and unforced components of climate change in climate model ensembles and observations, thereby identifying the full spatiotemporal complexity of the climate system’s response to radiative forcing.

Acknowledgments

R.C.J.W. and D.S.B. acknowledge support from the National Science Foundation (Grant AGS-1929775) and the Tamaki Foundation. R.C.J.W. and K.C.A. acknowledge support from the National Science Foundation (Grant AGS-1752796). R.C.J.W. is also supported by the University of Washington eScience Institute. T.S. is supported by Eric and Wendy Schmidt by recommendation of the Schmidt Futures program and by the Earthrise Alliance. The CESM project is supported primarily by the National Science Foundation (NSF). This material is based on work supported by the National Center for Atmospheric Research, which is a major facility sponsored by the NSF under Cooperative Agreement 1852977. We thank Dennis Hartmann, Cristian Proistosescu, Flavio Lehner, Elizabeth Maroon, Mingfang Ting, and David Bonan for valuable input on this work. The code for S/NP filtering is available at github.com/rcjwills/forced-patterns. The code for LFCA is available at github.com/rcjwills/lfca.

REFERENCES

  • Allen, M. R., and L. A. Smith, 1997: Optimal filtering in singular spectrum analysis. Phys. Lett., 234A, 419–428, https://doi.org/10.1016/S0375-9601(97)00559-8.
  • Andrews, T., J. M. Gregory, and M. J. Webb, 2015: The dependence of radiative forcing and feedback on evolving patterns of surface temperature change in climate models. J. Climate, 28, 1630–1648, https://doi.org/10.1175/JCLI-D-14-00545.1.
  • Armour, K. C., J. Marshall, J. R. Scott, A. Donohoe, and E. R. Newsom, 2016: Southern Ocean warming delayed by circumpolar upwelling and equatorward transport. Nat. Geosci., 9, 549–554, https://doi.org/10.1038/ngeo2731.
  • Barnes, E. A., J. W. Hurrell, I. Ebert-Uphoff, C. Anderson, and D. Anderson, 2019: Viewing forced climate patterns through an AI lens. Geophys. Res. Lett., 46, 13 389–13 398, https://doi.org/10.1029/2019GL084944.
  • Barnes, E. A., B. Toms, J. W. Hurrell, I. Ebert-Uphoff, C. Anderson, and D. Anderson, 2020: Indicator patterns of forced change learned by an artificial neural network. J. Adv. Model. Earth Syst., 12, e2020MS002195, https://doi.org/10.1029/2020MS002195.
  • Bell, T. L., 1986: Theory of optimal weighting of data to detect climatic change. J. Atmos. Sci., 43, 1694–1710, https://doi.org/10.1175/1520-0469(1986)043<1694:TOOWOD>2.0.CO;2.
  • Bellomo, K., L. N. Murphy, M. A. Cane, A. C. Clement, and L. M. Polvani, 2018: Historical forcings as main drivers of the Atlantic multidecadal variability in the CESM large ensemble. Climate Dyn., 50, 3687–3698, https://doi.org/10.1007/s00382-017-3834-3.
  • Bellucci, A., A. Mariotti, and S. Gualdi, 2017: The role of forcings in the twentieth-century North Atlantic multidecadal variability: The 1940–75 North Atlantic cooling case study. J. Climate, 30, 7317–7337, https://doi.org/10.1175/JCLI-D-16-0301.1.
  • Bonan, D. B., J. E. Christian, and K. Christianson, 2019: Influence of North Atlantic climate variability on glacier mass balance in Norway, Sweden and Svalbard. J. Glaciol., 65, 580–594, https://doi.org/10.1017/jog.2019.35.

  • Booth, B. B., N. J. Dunstone, P. R. Halloran, T. Andrews, and N. Bellouin, 2012: Aerosols implicated as a prime driver of twentieth-century North Atlantic climate variability. Nature, 484, 228–232, https://doi.org/10.1038/nature10946.

  • Booth, B. B., G. R. Harris, A. Jones, L. Wilcox, M. Hawcroft, and K. S. Carslaw, 2018: Comments on “Rethinking the lower bound on aerosol radiative forcing.” J. Climate, 31, 9407–9412, https://doi.org/10.1175/JCLI-D-17-0369.1.

  • Bretherton, C. S., C. Smith, and J. M. Wallace, 1992: An intercomparison of methods for finding coupled patterns in climate data. J. Climate, 5, 541–560, https://doi.org/10.1175/1520-0442(1992)005<0541:AIOMFF>2.0.CO;2.

  • Broccoli, A. J., K. A. Dahl, and R. J. Stouffer, 2006: Response of the ITCZ to Northern Hemisphere cooling. Geophys. Res. Lett., 33, L01702, https://doi.org/10.1029/2005GL024546.

  • Cane, M. A., A. C. Clement, A. Kaplan, Y. Kushnir, D. Pozdnyakov, R. Seager, S. E. Zebiak, and R. Murtugudde, 1997: Twentieth-century sea surface temperature trends. Science, 275, 957–960, https://doi.org/10.1126/science.275.5302.957.

  • Cane, M. A., A. C. Clement, L. N. Murphy, and K. Bellomo, 2017: Low-pass filtering, heat flux, and Atlantic multidecadal variability. J. Climate, 30, 7529–7553, https://doi.org/10.1175/JCLI-D-16-0810.1.

  • Chen, X., J. M. Wallace, and K.-K. Tung, 2017: Pairwise-rotated EOFs of global SST. J. Climate, 30, 5473–5489, https://doi.org/10.1175/JCLI-D-16-0786.1.

  • Christian, J. E., N. Siler, M. Koutnik, and G. Roe, 2016: Identifying dynamically induced variability in glacier mass-balance records. J. Climate, 29, 8915–8929, https://doi.org/10.1175/JCLI-D-16-0128.1.

  • Coats, S., and K. Karnauskas, 2017: Are simulated and observed twentieth century tropical Pacific sea surface temperature trends significant relative to internal variability? Geophys. Res. Lett., 44, 9928–9937, https://doi.org/10.1002/2017GL074622.

  • Coats, S., J. E. Smerdon,