Cloud Climatologies from Global Climate Models—A Comparison of CMIP5 and CMIP6 Models with Satellite Data

Axel Lauer (a) (https://orcid.org/0000-0002-9270-1044), Lisa Bock (a), Birgit Hassler (a), Marc Schröder (b), and Martin Stengel (b)

(a) Deutsches Zentrum für Luft- und Raumfahrt (DLR), Institut für Physik der Atmosphäre, Oberpfaffenhofen, Germany
(b) Deutscher Wetterdienst (DWD), Offenbach, Germany

Abstract

Simulating clouds with global climate models is challenging as the relevant physics involves many nonlinear processes covering a wide range of spatial and temporal scales. As clouds are key components of the hydrological cycle and the climate system, evaluating clouds in models used for climate projections is an important prerequisite for assessing the confidence in the results from these models. Here, we compare output from models contributing to phase 6 of the Coupled Model Intercomparison Project (CMIP6) with satellite data and with results from their predecessors (CMIP5). We use multiproduct reference datasets to estimate the observational uncertainties associated with different sensors and with internal variability on a per-pixel basis. Selected cloud properties are also analyzed by region and by dynamical regime and thermodynamic conditions. Our results show that for parameters such as total cloud cover, cloud water path, and cloud radiative effect, the CMIP6 multimodel mean performs slightly better than the CMIP5 ensemble mean in terms of mean bias, pattern correlation, and relative root-mean-square deviation. The intermodel spread in CMIP6, however, is not reduced compared to CMIP5. Compared with CALIPSO-ICECLOUD data, the CMIP5/6 models overestimate cloud ice, particularly in the lower and middle troposphere, partly because the ice fractions are too high for given temperatures. This bias is reduced in the CMIP6 multimodel mean. While many known biases such as an underestimation of cloud cover in stratocumulus regions remain in CMIP6, we find that the CMIP5 problem of too few but too reflective clouds over the Southern Ocean is significantly improved.

© 2022 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Axel Lauer, axel.lauer@dlr.de

1. Introduction

In meteorology, clouds are defined as a visible aggregate or aerosol of microscopic water droplets and/or ice particles in the atmosphere above Earth’s surface. As a key component of the hydrological cycle, clouds play an important role in weather and climate. By reflecting incoming solar radiation and by absorbing outgoing longwave radiation, clouds have a large impact on the Earth’s radiation budget called cloud radiative effect (cre) (e.g., Ramanathan et al. 1989). In addition, clouds play an important role in tropospheric chemistry through their involvement in multiphase reactions such as oxidation of SO2 to SO4 and scavenging of atmospheric trace gases and aerosol particles (e.g., Herckes and Collett 2015). Furthermore, cloud–climate feedbacks have important implications for climate sensitivity (Bjordal et al. 2020; Bony et al. 2015; Zelinka et al. 2020) and thus for the amplitude and pace of future climate change. Despite their pivotal role in weather and climate, clouds are still quite challenging to simulate with global climate models as the relevant physics includes many nonlinear processes on temporal and spatial scales covering several orders of magnitude (Bony et al. 2015). The numerical representation of clouds in global climate models therefore requires a high degree of parameterization, including formation, growth, and sedimentation of water particles. This makes a careful evaluation of clouds in global climate models an important part when assessing the skills of the models in reproducing observed climate as well as building confidence in projections of future climate change.

In this paper, we document and evaluate the mean cloud properties from models contributing to the latest phase of the Coupled Model Intercomparison Project (CMIP6) with satellite data and compare the results with output from the predecessor phase CMIP5. We focus on the cloud properties cloud fraction, cloud liquid water and cloud ice water, and total cloud water path (tcwp) as well as on the shortwave (swcre) and longwave (lwcre) cloud radiative effects at the top of the atmosphere. Section 2 describes the CMIP model ensembles, the model simulations evaluated, and the satellite data used for comparison with the models. Section 3 describes the methods applied including a brief overview on the Earth System Model Evaluation Tool (ESMValTool) used for the analyses. Results are presented in section 4 including climatologies, interannual variability, seasonal cycle, cloud properties by dynamical regime, and analyses of the cloud regimes in the Southern Ocean, southeastern Pacific, and Pacific intertropical convergence zone (ITCZ). A summary of the findings and conclusions is presented in section 5.

2. Models, model simulations, and satellite data

a. CMIP models

In this study we investigate the performance of the models participating in the current phase of CMIP (CMIP6; Eyring et al. 2016) by comparing results from the “historical” simulations [in which forcings due to both natural causes such as volcanic eruptions and solar variability and human factors such as CO2 concentration (e.g., fossil fuel burning), aerosols, and land use were included over the time period 1850–2014] with the corresponding CMIP5 (Taylor et al. 2012) model runs (historical simulations for 1850–2005 conducted with the best record of natural and anthropogenic climate forcing) and satellite data. We used one ensemble member (typically the first member) per model. The 20-yr means are calculated by using the last 20 years of the historical simulation (i.e., 1986–2005 for CMIP5 and 1995–2014 for CMIP6). While this choice of model years is somewhat arbitrary, we found that it has very little impact on the multiyear ensemble averages. This is not surprising as ESMs are not expected to reproduce the exact observed phase of climate modes largely controlling present-day variability of clouds but rather their statistical properties.

A particular focus of this work is to assess how much progress has been made over the recent years by the model ensembles as a whole. This is done by evaluating multimodel means from the CMIP5 and CMIP6 model ensembles rather than trying to trace individual models as a number of new models contribute to CMIP6 that have not been part of CMIP5 and because some individual models changed so substantially that it would be difficult to compare them with older model versions. Table 1 gives an overview of the 35 CMIP5 and 28 CMIP6 models analyzed here.

Table 1

List of models analyzed. If more than one ensemble member is available, only the first ensemble member “r1i1p1” (CMIP5, historical) or “r1i1p1f1” (CMIP6, historical) is analyzed. Models providing cloud liquid water path instead of total cloud water path are marked with an asterisk (*); see section 3c.

b. Satellite data

In the following, each satellite and reanalysis dataset that is used in the comparison with the model simulations is briefly described. An overview of each of the datasets is provided in Table 2 (for variable definitions see Table 3).

Table 2

Observationally based datasets used in the comparison with model simulations. For variable definitions, see Table 3; for details on the estimated uncertainty range (calculated if two or more datasets are available for the variable), see section 3d.

Table 3

Variables (CMOR names) and derived variables used. Derived variables are marked with an asterisk (*).

1) CALIPSO

The Cloud–Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) combines an active lidar instrument (CALIOP) with passive infrared and visible imagers to obtain global vertically resolved properties of thin clouds and aerosols. CALIPSO and CloudSat (see below) are highly complementary and part of the A-Train satellite constellation consisting of several satellites flying in formation (Stephens et al. 2018). Here, we use the GCM-Oriented CALIPSO Cloud Product (CALIPSO-GOCCP; Chepfer et al. 2010) providing 3D observations of the cloud fraction and the CALIPSO version 1.00 lidar Level 3 Ice Cloud Data Product (CALIPSO-ICECLOUD; NASA/LARC/SD/ASDC 2018) providing histograms of the 3D cloud ice content. To calculate the grid box average ice water content for comparison with the models, we follow the procedure described in the CALIPSO-ICECLOUD Data Quality Summary available at https://www-calipso.larc.nasa.gov/resources/calipso_users_guide/qs/cal_lid_l3_ice_cloud_v1-00.php (last access 8 July 2021). For more details, we refer to the CALIPSO data user’s guide and the references therein available at https://www-calipso.larc.nasa.gov/resources/calipso_users_guide/data_quality/level_3_icecloud_summary_v100.php (last access 8 July 2021).

2) CERES-EBAF

The Clouds and the Earth’s Radiant Energy System (CERES) Energy Balanced and Filled (EBAF) dataset provides global monthly-mean top-of-atmosphere (TOA) and surface longwave (LW), shortwave (SW), and net radiative fluxes under clear and all-sky conditions. CERES instruments are flown on NASA’s Terra and Aqua satellites. For the generation of the CERES-EBAF dataset, an objective constrainment algorithm is used to adjust SW and LW TOA fluxes within their ranges of uncertainty to remove the inconsistency between average global net TOA flux and the observed ocean heating rate. In addition, gaps in the CERES clear-sky TOA flux maps are filled using clear-sky fluxes from the Moderate Resolution Imaging Spectroradiometer (MODIS) (Loeb et al. 2009, 2012). Here, we use monthly mean CERES-EBAF Ed 4.1 data that are provided on a 1° × 1° grid (NASA/LARC/SD/ASDC 2019).

3) CLARA-A2.1

The satellite-derived climate data record CLARA (CM SAF Cloud, Albedo and Radiation dataset from AVHRR data) is based on data from the Advanced Very High Resolution Radiometer (AVHRR) and covers the time period 1982–2018. In this study, we use monthly data from edition 2.1 (Karlsson et al. 2020), which is an extended version of CLARA-A2 (Karlsson et al. 2017), for the cloud liquid water and ice water path as well as for total cloud fraction provided on a global 0.25° × 0.25° grid. The data and corresponding documentation are available at https://doi.org/10.5676/EUM_SAF_CM/CLARA_AVHRR/V002_01.

4) CloudSat

CloudSat uses a cloud profiling radar to provide retrievals of global vertically resolved cloud liquid water content from space (Stephens et al. 2002). In this study, we use 3D cloud liquid water content data from individual CloudSat orbits (level 2 data: 2B_CWC_RO “radar only”) that are binned onto a 2° × 2° grid with 40 vertical levels. Individual orbits and days are averaged to monthly means. CloudSat is sensitive to both cloud and precipitation liquid and ice particles. Estimates of cloud liquid water from CloudSat are therefore expected to overestimate the actual cloud liquid water content (e.g., Waliser et al. 2009). Similar to Li et al. (2018), cases when precipitation is present are therefore not included in this analysis. Precipitating pixels are identified using the CloudSat level 2 data for column precipitation (2C_PRECIP_COLUMN). For comparison with the CMIP models, CloudSat data are converted from kg m−3 to kg kg−1 using the air density from the U.S. Standard Atmosphere 1976. In addition, the in-cloud values from CloudSat are converted to the grid box averages provided by the models assuming that missing values in the CloudSat orbits are actually cloud-free (i.e., zero cloud liquid water). For more details on CloudSat, refer to Stephens et al. (2002) and Stephens et al. (2018).
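The unit conversion and the grid-box averaging described above can be illustrated with a short Python sketch. It is not the processing code used in this study: the function names are hypothetical, missing values are represented by `None` as an assumption, and the density formula covers only the tropospheric layer (0–11 km) of the U.S. Standard Atmosphere 1976, whereas the gridded CloudSat data extend higher.

```python
# Constants of the U.S. Standard Atmosphere 1976 (troposphere, 0-11 km)
T0 = 288.15      # sea-level temperature (K)
P0 = 101325.0    # sea-level pressure (Pa)
LAPSE = 0.0065   # temperature lapse rate (K m-1)
R_DRY = 287.053  # specific gas constant of dry air (J kg-1 K-1)
G0 = 9.80665     # standard gravitational acceleration (m s-2)


def air_density(height_m):
    """Standard-atmosphere air density (kg m-3) at a geopotential height."""
    if not 0.0 <= height_m <= 11000.0:
        raise ValueError("this sketch covers the troposphere (0-11 km) only")
    temperature = T0 - LAPSE * height_m
    pressure = P0 * (temperature / T0) ** (G0 / (R_DRY * LAPSE))
    return pressure / (R_DRY * temperature)


def grid_box_mean_mixing_ratio(in_cloud_lwc, height_m):
    """Grid-box mean cloud liquid water (kg kg-1) from in-cloud samples.

    in_cloud_lwc: samples in kg m-3 falling into one grid box and level;
    None marks a missing value, which is treated as cloud-free (zero).
    """
    if not in_cloud_lwc:
        raise ValueError("at least one sample is required")
    filled = [0.0 if value is None else value for value in in_cloud_lwc]
    mean_lwc = sum(filled) / len(filled)    # kg m-3, grid-box average
    return mean_lwc / air_density(height_m)  # kg kg-1
```

Dividing by the standard-atmosphere density rather than by a retrieved density keeps the conversion independent of any additional dataset, at the cost of ignoring local temperature and pressure deviations.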

5) ESA CCI

For this study we use two datasets from the European Space Agency’s Climate Change Initiative (ESA CCI; e.g., Hollmann et al. 2013), ESACCI-CLOUD and CM SAF/CCI TCWV-global (COMBI).

The version 3 cloud datasets of ESA CCI CLOUD are based on data from the passive imager satellite sensors AVHRR and (A)ATSR, and are named AVHRR-AMv3, AVHRR-PMv3 (Stengel et al. 2020), and ATSR2-AATSRv3 (Poulsen et al. 2020). In this study, Level 3C data from AVHRR-AMv3 and AVHRR-PMv3 are averaged to include both AVHRR morning (AM) and afternoon (PM) satellite measurements, resulting in a combined dataset with monthly mean data at a horizontal resolution of 0.5° × 0.5° for the overlapping time period (1992–2016) of both source datasets. Variables used in this study are total cloud cover, liquid water path, and ice water path. Known limitations of the AVHRR data include an underrepresentation of optically very thin clouds and sparse temporal sampling as well as satellite drift affecting in particular data before the year 2001 (Ji and Brown 2017). For details, refer to Stengel et al. (2020).

The dataset CM SAF/CCI TCWV-global (COMBI) combines microwave and near-infrared imager based TCWV over the ice-free ocean as well as over land, coastal ocean, and sea ice. COMBI is available on a global grid with a spatial resolution of 0.5° × 0.5° and 0.05° × 0.05° and a daily and monthly temporal resolution (here we use the 0.5° and monthly resolutions). The data record relies on microwave observations from SSM/I, SSMIS (Fennig et al. 2020), AMSR-E, and TMI and on near-infrared observations from the Medium Resolution Imaging Spectrometer (MERIS; third reprocessing), MODIS-Terra (collection 6.1), and the Ocean and Land Color Instrument (OLCI; first reprocessing). Details of the retrieval for microwave imagers can be found in Andersson et al. (2010) and Graw et al. (2017) and for near-infrared imagers in Diedrich et al. (2015), Fischer et al. (2021), and Lindstrot et al. (2012).

6) ISCCP-FH

From the third-generation version of the International Satellite Cloud Climatology Project (ISCCP) we use TOA radiative fluxes (cloud radiative effects) and water vapor path data for this study (ISCCP-FH; Rossow and NOAA CDR Program 2017). The ISCCP H-series products use data derived from geostationary and polar-orbiting satellite imaging radiometers with common visible and infrared channels. They supersede their predecessors from the D-series. Key improvements include, for instance, an improved quality control, higher spatial resolution input and output products, and updated documentation and metadata (Young et al. 2018). The ISCCP-FH data used here are provided as monthly means on a 1° × 1° grid. We would like to note that variable wvp included in the ISCCP-FH dataset is one of the “sub-product parameters” and originally comes from ISCCP-HGG.

7) MAC-LWP

The Multisensor Advanced Climatology of Liquid Water Path (MAC-LWP; Elsaesser et al. 2017) is an updated and enhanced version of the University of Wisconsin (UWisc; O’Dell et al. 2008) cloud liquid water path climatology. The data consist of monthly mean liquid water path over the ocean provided on a 1° × 1° grid covering the time period 1988–2016. MAC-LWP uses input data from SSM/I, TMI, AMSR-E, WindSat, SSMIS, AMSR-2, and GMI. A known limitation of this dataset is the increased uncertainty in regions with substantial precipitation as the microwave emission signal from cloud water is similar to that of precipitation. In contrast to many other climatological LWP datasets, MAC-LWP explicitly accounts for the diurnal cycle of cloud liquid water, despite being limited to ice-free ocean regions only.

8) MODIS

Here, we use data from the Moderate Resolution Imaging Spectroradiometer (MODIS) L3 Atmosphere Product, Collection 6.1 covering the time period 2003–18. For details on the instrument, retrievals, and derived cloud products, we refer to Platnick et al. (2003), Platnick et al. (2017), and Marchant et al. (2016); for the atmosphere L3 monthly product, we refer to Platnick et al. (2015). In this study, we use the variables clivi, clt, and lwp.

9) PATMOS-x

The Advanced Very High Resolution Radiometer (AVHRR) Pathfinder Atmospheres – Extended (PATMOS-x) dataset is a satellite-based long-term record providing properties of tropospheric clouds and aerosols, Earth’s surface, Earth’s radiation budget, and relevant ancillary variables (Heidinger et al. 2014). PATMOS-x is composed of data from 17 different sensors flown on a number of polar-orbiting NOAA and MetOp satellites. Here, we use data for clt covering the time period 1982–2016 on a 0.1° × 0.1° grid. For details on the dataset, processing, and limitations, we refer to Heidinger et al. (2014).

c. Reanalysis data

The two reanalysis datasets used in this study are briefly described in the following. A summary is provided in Table 2.

1) ERA5

The European Centre for Medium-Range Weather Forecasts (ECMWF) fifth-generation reanalysis ERA5 is the successor of the widely used ERA-Interim reanalysis (Dee et al. 2011). ERA5 is based on four‐dimensional variational (4D-Var) data assimilation and uses Cycle 41r2 of the Integrated Forecasting System (IFS) (Copernicus Climate Change Service 2017). In this study, ERA5 data served on the Copernicus Climate Change Service Climate Data Store (CDS) (Copernicus Climate Change Service 2017) are used. Atmospheric variables are provided on the CDS interpolated to 37 pressure levels ranging from 1000 hPa near the surface to 1 hPa (ECMWF 2020). The ERA5 datasets used here are provided on a 0.25° × 0.25° grid. For more details on the ERA5 dataset we refer to Hersbach et al. (2020). Here, we use monthly means of the variables ta, wap, cl, clivi, clt, and lwp.

2) ERA-Interim

The reanalysis ERA-Interim (Dee et al. 2011) is a global dataset providing a multitude of atmospheric variables over the time period January 1979 through August 2019. ERA-Interim was produced by ECMWF and has been widely used in climate studies. It has been superseded by the ERA5 reanalysis. Similar to ERA5, ERA-Interim uses 4D-Var data assimilation but is based on the older Cycle 31r2 of the ECMWF IFS. The dataset has a spatial resolution of 0.75° × 0.75° and is provided on 60 vertical levels from the surface up to 0.1 hPa. For more details, please refer to Dee et al. (2011) or, for example, https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era-interim (last access: 1 September 2021).

3. Methods

a. Earth system model evaluation tool

All analyses in this study are made with the open-source community diagnostics and performance metrics tool for evaluation of Earth system models “Earth System Model Evaluation Tool” (ESMValTool; Eyring et al. 2020; Lauer et al. 2020; Righi et al. 2020; Weigel et al. 2021). The ESMValTool allows for consistent processing of all datasets (e.g., regridding to common grids, masking of land/sea and missing values, vertical interpolation, etc.) and traceability and reproducibility of the results by providing provenance records for all results. With the ESMValTool “recipes” (i.e., configuration files defining input data, preprocessing steps, and diagnostics to be applied) recipe_lauer22jclim_fig*.yml, figures shown in this paper can be reproduced. For use with the ESMValTool, the variables and metadata of the input datasets have to be formatted following the CMOR (Climate Model Output Rewriter; Taylor et al. 2006) tables and definitions [e.g., https://github.com/PCMDI/cmip6-cmor-tables/tree/master/Tables (PCMDI 2019) for CMIP6]. While the CMIP5 and CMIP6 model output follow this standard, some observational data have to be reformatted according to this standard. For this, the ESMValTool contains a set of scripts that are provided with the ESMValTool source code and are publicly available on GitHub (https://github.com/ESMValGroup/ESMValTool). These scripts give exact downloading and processing instructions to create the observational datasets used in this publication.

b. Regridding, masking, and vertical interpolation

To average different datasets and to calculate differences or spatial correlations with a reference dataset, data have been linearly regridded to a regular 2° × 2° latitude–longitude grid using the regridding function of the ESMValTool preprocessor (Righi et al. 2020). Application of land/sea masks and area selection were also performed with the ESMValTool preprocessor. Vertically resolved variables such as cloud liquid water or cloud ice water content from the CMIP models are typically provided on model-specific hybrid levels. To calculate differences, data have been vertically interpolated to the pressure levels defined by the CMIP6 coordinate “plev27” (27 pressure levels from 1000 to 100 hPa) or the height levels defined by “alt40” (40 height levels up to ∼19 km) (e.g., https://github.com/PCMDI/cmip6-cmor-tables/tree/master/Tables, last access 7 November 2020) depending on the variable.
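For illustration, the vertical interpolation of a single model column to fixed pressure levels could look like the following Python sketch. It does not reproduce the ESMValTool preprocessor: the function name is hypothetical, linear interpolation in pressure is an assumption (the interpolation scheme is not specified in the text), and target levels outside the range of the model levels are returned as missing (`None`) rather than extrapolated.

```python
def interpolate_profile(pressures, values, target_pressures):
    """Linearly interpolate one column to the target pressure levels.

    pressures: pressures of the model levels (Pa), assumed to be distinct
    values: variable on the model levels
    target_pressures: pressures of the output levels (Pa)
    Targets outside the range of the model levels yield None.
    """
    pairs = sorted(zip(pressures, values))  # ascending pressure
    result = []
    for target in target_pressures:
        interpolated = None
        # Search the pair of neighboring levels that brackets the target
        for (p_lo, v_lo), (p_hi, v_hi) in zip(pairs, pairs[1:]):
            if p_lo <= target <= p_hi:
                fraction = (target - p_lo) / (p_hi - p_lo)
                interpolated = v_lo + fraction * (v_hi - v_lo)
                break
        result.append(interpolated)
    return result
```

For example, a column given at 1000, 800, and 500 hPa can be interpolated to 900 hPa, whereas a target level of 300 hPa lies above the uppermost model level in this column and is therefore flagged as missing.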

c. Calculation of liquid water path and total water path

Cloud liquid water path (lwp) from the CMIP models is calculated by subtracting the ice water path (clivi) from the vertically integrated total cloud water path (clwvi = ice + liquid). Some models erroneously provide lwp only instead of total cloud water path. Subtracting clivi from clwvi would then result in significant negative and thus unphysical values. Most of the affected CMIP5 models are listed on the CMIP5 errata page (https://pcmdi.llnl.gov/mips/cmip5/errata.html; last access on 8 October 2021). Similarly, some CMIP6 models are affected by the same issue. All CMIP5 and CMIP6 models affected and used in this study are marked in Table 1 with an asterisk. For these models we presume that variable clwvi provides liquid water path data only.

For the same reason, we do not use variable clwvi directly as total cloud water path, but calculate total cloud water path (tcwp) for all models as tcwp = lwp + clivi.
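The calculation described in this section can be summarized in a minimal Python sketch. The function name and the flag `clwvi_is_liquid_only` are hypothetical; the flag stands for the asterisk in Table 1 and is an assumption about how affected models would be marked in practice.

```python
def liquid_and_total_water_path(clwvi, clivi, clwvi_is_liquid_only=False):
    """Return (lwp, tcwp) in the units of the inputs (e.g., kg m-2).

    clwvi: model output variable clwvi (nominally ice + liquid)
    clivi: ice water path
    clwvi_is_liquid_only: True for models whose clwvi is presumed to
        contain the liquid water path only
    """
    if clwvi_is_liquid_only:
        lwp = clwvi          # clwvi already holds liquid water only
    else:
        lwp = clwvi - clivi  # remove the ice part from the total
    tcwp = lwp + clivi       # total cloud water path for all models
    return lwp, tcwp
```

Computing tcwp from lwp and clivi for every model, instead of reading clwvi directly, gives the same total for a correctly reported model and for an affected one.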

d. Multiobservational datasets and estimation of observational uncertainties

For evaluation of the climatologies from the models, multiobservational means $\bar{X}$ are calculated as average over the time ($t$) means $\bar{x}_n = (1/T_n)\sum_{t=1}^{T_n} x_{t,n}$ of all $N_\mathrm{obs}$ observational datasets whenever two or more observational datasets are available for a given variable $x$:
$$\bar{X} = \frac{1}{N_\mathrm{obs}} \sum_{n=1}^{N_\mathrm{obs}} \bar{x}_n. \tag{1}$$
Here, all observational datasets are given the same weight, independent of their record length. In most cases, the record length used is the maximum record available (see Table 2) (i.e., of similar length or longer than for the models) but is allowed to be shorter if fewer years of data are available. This allows for estimating the observational uncertainty on a per-pixel basis by calculating the average standard deviation of individual annual means against the multiobservational climatology. The standard deviation $\sigma_n$ of dataset $n$ is calculated over the individual years ($t$) and using the multiobservational mean $\bar{X}$:
$$\sigma_n = \sqrt{\frac{1}{T_n - 1} \sum_{t=1}^{T_n} \left(\bar{X} - x_{t,n}\right)^2}. \tag{2}$$
This standard deviation is then averaged over all observational datasets using the same weight for each dataset:
$$\bar{\sigma} = \sqrt{\frac{1}{N_\mathrm{obs}} \sum_{n=1}^{N_\mathrm{obs}} \sigma_n^2}. \tag{3}$$
Constructed this way, $\bar{\sigma}$ includes estimates for the uncertainty introduced by natural interannual variability as well as deviations of the individual observational datasets from the multiobservational mean. Considering interannual variability is important in this context as individual years from ESMs do not necessarily correspond to the same observed years as relevant climate modes are not necessarily in the same phase in the models and in the real world.

In the figures presented here, areas are stippled if the absolute values of the differences between the model climatology and the multiobservational mean do not exceed $\bar{\sigma}$.
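As a sketch of how the multiobservational mean, the uncertainty estimate, and the stippling criterion fit together, the following Python example evaluates them for a single grid cell. The function names are hypothetical and the example is not taken from the ESMValTool recipes; each dataset is represented simply as a list of annual means, which may differ in length between datasets.

```python
import math


def multiobs_mean(datasets):
    """Unweighted mean of the time means of all observational datasets."""
    time_means = [sum(years) / len(years) for years in datasets]
    return sum(time_means) / len(time_means)


def dataset_sigma(years, reference):
    """Deviation of the annual means of one dataset from the reference."""
    squared = sum((reference - value) ** 2 for value in years)
    return math.sqrt(squared / (len(years) - 1))


def observational_uncertainty(datasets):
    """Return (multiobs mean, uncertainty estimate) for one grid cell."""
    if len(datasets) < 2:
        raise ValueError("two or more observational datasets are required")
    reference = multiobs_mean(datasets)
    variances = [dataset_sigma(years, reference) ** 2 for years in datasets]
    return reference, math.sqrt(sum(variances) / len(variances))


def is_stippled(model_climatology, reference, sigma_bar):
    """True if the model bias does not exceed the uncertainty estimate."""
    return abs(model_climatology - reference) <= sigma_bar
```

Because each annual mean is compared with the multiobservational mean rather than with the mean of its own dataset, a systematic offset of one dataset enlarges the estimate in the same way as interannual variability does.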

The estimates of the global average uncertainty of the observational datasets given in Table 2 are calculated as the square root of the average of the area-weighted squared differences between the values of the respective dataset and the multiobservational mean climatology. The observational datasets used to calculate a multiobservational dataset for a given variable are listed in Table 4.
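The global average uncertainty described above amounts to an area-weighted root-mean-square difference. A minimal Python sketch follows; the function name is hypothetical, and weighting each grid cell by the cosine of its latitude is an assumption that approximates the cell area on a regular latitude–longitude grid.

```python
import math


def area_weighted_rmsd(field, reference, lats_deg):
    """Area-weighted root-mean-square difference between two fields.

    field, reference: lists of rows (one row per latitude), each row
        holding the values along longitude
    lats_deg: latitude of each row in degrees
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row_f, row_r, lat in zip(field, reference, lats_deg):
        weight = math.cos(math.radians(lat))  # approximate cell area
        for value_f, value_r in zip(row_f, row_r):
            weighted_sum += weight * (value_f - value_r) ** 2
            weight_total += weight
    return math.sqrt(weighted_sum / weight_total)
```

With this weighting, a difference at 60° latitude contributes half as much to the global value as the same difference at the equator.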

Table 4

Observational datasets used to calculate the multiobservational datasets per variable. Reanalysis datasets are not included when calculating multiobs products.

e. Temporal variability and seasonal cycle amplitude

The temporal (interannual) variability at each grid cell is estimated as the temporal standard deviation of the deseasonalized monthly means ($\sigma_\mathrm{temp}$). The monthly mean values of a variable ($x$) are normalized by the average over the entire time series ($\bar{x}$). The (relative) temporal variability of the multimodel ensembles and of the multiobservational reference ($\bar{\sigma}_\mathrm{temp}$) is calculated by averaging over all $N$ individual model or observational datasets:
$$\bar{\sigma}_\mathrm{temp} = \frac{1}{N} \sum_{i=1}^{N} \frac{\sigma_{\mathrm{temp},i}}{\bar{x}_i} \times 100\%. \tag{4}$$
The (relative) seasonal cycle amplitude ($A_\mathrm{seas}$) is calculated as difference between the minimum ($x_\mathrm{min}$) and maximum value ($x_\mathrm{max}$) of the average seasonal cycle normalized by the climatological annual mean value ($\bar{x}$). As for the interannual variability, the seasonal cycle amplitude of the multimodel ensembles and of the multiobservational reference ($\bar{A}_\mathrm{seas}$) is calculated by averaging over all $N$ individual model or observational datasets:
$$\bar{A}_\mathrm{seas} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{x_{\mathrm{max},i} - x_{\mathrm{min},i}}{\bar{x}_i} \right| \times 100\%. \tag{5}$$
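The two quantities defined in this section can be sketched for a single grid cell and a single dataset as follows. The function names are hypothetical. Two details are assumptions because the text does not specify them: deseasonalizing is done by subtracting the mean seasonal cycle, and the population standard deviation is used. The ensemble values then follow by averaging the per-dataset results with equal weights.

```python
import statistics


def monthly_climatology(monthly):
    """Mean seasonal cycle (12 values) of a series of complete years."""
    if not monthly or len(monthly) % 12 != 0:
        raise ValueError("series must consist of complete years")
    return [statistics.fmean(monthly[m::12]) for m in range(12)]


def relative_temporal_variability(monthly):
    """Std. dev. of deseasonalized monthly means in % of the mean."""
    climatology = monthly_climatology(monthly)
    anomalies = [x - climatology[i % 12] for i, x in enumerate(monthly)]
    return statistics.pstdev(anomalies) / statistics.fmean(monthly) * 100.0


def relative_seasonal_amplitude(monthly):
    """Range of the mean seasonal cycle in % of the annual mean."""
    climatology = monthly_climatology(monthly)
    annual_mean = statistics.fmean(climatology)
    return abs((max(climatology) - min(climatology)) / annual_mean) * 100.0


def ensemble_average(per_dataset_values):
    """Equal-weight average over all models or observational datasets."""
    return statistics.fmean(per_dataset_values)
```

Normalizing before averaging means that datasets with very different mean values, as is the case for the liquid water path, contribute comparably to the ensemble value.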

4. Comparison of CMIP6 with CMIP5 and satellite data

Modeled clouds and satellite observations are usually difficult to compare because observations are affected by the satellite instrument’s sensitivity, the temporal and spatial sampling, and the vertical overlap of the cloud layers, while the clouds in climate models are assumed to be plane-parallel and are of coarse horizontal and vertical resolution. Ideally, a satellite simulator such as COSP [Cloud Feedback Model Intercomparison Project (CFMIP) Observation Simulator Package; Bodas-Salcedo et al. 2011] is used during the model simulation to mimic the satellite viewing geometry, temporal sampling, and specific instrument characteristics such as cut-off values. Many CMIP5 and CMIP6 historical simulations, however, have been run without such a satellite simulator, or variables of interest such as cloud water path or 3D cloud liquid and ice water content are not available. Alongside the evaluation of satellite simulator output of 3D cloud fraction [section 4a(3)], we also use multiproduct observational datasets (see section 3d) that provide a per-pixel estimation of observational uncertainties by different sensors and internal variability when output from satellite simulators was not available. Using multiple independent datasets for comparison with the models is considered one possibility to at least partly reduce limitations of observational datasets caused by observational uncertainties and internal variability when no other information is available (e.g., Flato et al. 2013). This method is, however, not able to reduce sampling biases introduced when only datasets from polar-orbiting satellites are available (twice daily sampling). If no satellite simulators are available for the models, the only way to reduce this kind of sampling bias would be a correction applied to the satellite observations by the data providers as done, for example, for the sampling-bias-corrected AIRS Obs4MIPs V2.1 dataset (Tian and Hearty 2020).

a. Multiyear annual mean climatologies

1) Geographical distribution

Figure 1 shows maps of the multiyear annual mean climatologies of total cloud cover (clt), liquid (lwp) and ice water path (iwp), longwave (lwcre) and shortwave cloud radiative effect (swcre), and the water vapor path (wvp) for the reference dataset (multiobservational mean, herein called the “multiobs mean”) and the differences between the two CMIP ensembles and the reference. This allows for a comparison of typical regional features in the geographical distribution of these parameters. Table 5 summarizes global values of mean, bias, pattern correlation, and root-mean square deviation for all mentioned parameters.

Fig. 1.

The 20-yr annual averages (models) of total cloud cover (clt), liquid water path (lwp), ice water path (iwp), TOA shortwave (swcre) and longwave (lwcre) cloud radiative effects, and water vapor path (wvp), shown from top to bottom. (left) Observational reference datasets (multiobs means) and the differences of the (center) CMIP5 and (right) CMIP6 multimodel means to the observational reference dataset. Differences that are smaller than the observational uncertainty estimate calculated from interannual variability and variability across individual observational datasets compared with the multiobs mean [Eq. (3)] are stippled.

Citation: Journal of Climate 36, 2; 10.1175/JCLI-D-22-0181.1

Table 5

Range (minimum and maximum) of the 20-yr global annual mean, average bias, pattern correlation, and root-mean-square deviation (RMSD) from individual observations and individual CMIP5 and CMIP6 models compared with the multiobs reference. For the models, the 10% and 90% quantiles are also given in parentheses.

The observed distribution of clt shows high values in the midlatitude storm track regions associated with frontal systems, the intertropical convergence zone (ITCZ) associated with frequent and strong convection, and the stratocumulus decks off the west coasts of the continents in the subtropical subsidence regions. The underestimation of total cloud cover over the Southern Ocean and in stratocumulus regions is about 5% smaller in the CMIP6 models than in the CMIP5 models. The overestimation of clt north and south of the equator has increased by about 5% in CMIP6 compared with CMIP5 and reaches up to 15%. The overestimation in clt south of the equator is also seen in other quantities such as lwp or lwcre. This double-ITCZ bias has been found to be only slightly reduced from CMIP5 to CMIP6 (Tian and Dong 2020). There is also a stronger overestimation in high latitudes, reaching more than 30% in CMIP6, which has also been reported for single models (e.g., Nor-ESM; Seland et al. 2020), and which is not seen in CMIP5 to the same extent. However, many satellite products have shortcomings in detecting optically thin clouds (e.g., Karlsson et al. 2017). Stengel et al. (2018) found that removing modeled clouds with optical thicknesses below 1 can reduce the cloud fraction by several tens of percent, which indicates that clt in at least some observational datasets used here can be assumed to be significantly biased low in the polar regions. This makes the comparison between observations and models challenging in these regions even though the differences between the multimodel mean (MMM) and the multiobs mean are larger than the estimated ±1σ uncertainty in these regions.

Cloud liquid water path measurements from satellites show large uncertainties resulting in a large spread among different datasets. In the global annual average we find about 20% less lwp in CMIP6 than in CMIP5, which is in better agreement with the multiobs mean. Most differences of the MMMs from the multiobs mean are, however, below the uncertainty estimate of ±1σ of the individual observed time series to the multiobs mean [Eq. (3)] except in the tropical regions near or within the ITCZ and in some polar regions. The pattern correlation (Pearson product-moment coefficient of linear correlation between two fields) of the MMM with the multiobs reference dataset increased from 0.71 in CMIP5 to 0.76 in CMIP6. The bias of the modeled liquid water path toward low values at low latitudes and high values in the extratropics has also been reported in single-model results (e.g., Nor-ESM; Seland et al. 2020).
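The three summary statistics used throughout this section (mean bias, pattern correlation, and RMSD) can be illustrated with a minimal sketch. It assumes both fields have already been regridded to a common grid and flattened to plain lists, and it uses no area weighting; whether and how grid cells are weighted in the actual analysis is not stated here, so the unweighted form is an assumption. The function name `field_stats` is hypothetical.

```python
import math

def field_stats(model, ref):
    """Mean bias, Pearson pattern correlation, and RMSD of two flattened fields.

    Assumption: equal-length lists on a common grid, no area weighting.
    """
    n = len(ref)
    mean_m = sum(model) / n
    mean_r = sum(ref) / n
    bias = mean_m - mean_r
    # Pearson product-moment correlation between the two fields
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(model, ref))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    corr = cov / math.sqrt(var_m * var_r)
    # Root-mean-square deviation of the raw (not centered) differences
    rmsd = math.sqrt(sum((m - r) ** 2 for m, r in zip(model, ref)) / n)
    return bias, corr, rmsd

ref = [1.0, 2.0, 3.0, 4.0]
model = [2.0, 3.0, 4.0, 5.0]  # reference field shifted by a constant +1
bias, corr, rmsd = field_stats(model, ref)
```

A constant offset leaves the pattern correlation at 1 while bias and RMSD both equal the offset, which is why the three measures are reported side by side.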

The observational uncertainty in ice water path is similarly high to that in lwp, as indicated by the large spread between different satellite products (see Table 5). Compared to CMIP5, we find a reduction by about 15% in the global annual MMM from CMIP6 to 32 g m−2. This value is slightly below the range of observational datasets analyzed (36–61 g m−2). The pattern correlation, however, increased from 0.73 in CMIP5 to 0.78 in CMIP6.

Compared with CMIP5, the CMIP6 MMM of the shortwave cloud radiative effect (swcre) shows an improvement in the overestimation of shortwave cooling north and south of the ITCZ by about 5–10 W m−2. Over the Southern Ocean, the underestimation of the observed shortwave cooling in the CMIP6 models is improved by about 5 W m−2. This improvement is at least partly based on advances in model parameterizations and was already shown for several individual models (e.g., Danabasoglu et al. 2020; Kawai et al. 2019; Madeleine et al. 2020; Voldoire et al. 2019). In contrast, there is rather little change in the underestimation of swcre in the subtropical stratocumulus regions between CMIP5 and CMIP6 as described, for example, for the IPSL model by Madeleine et al. (2020). Biases of up to 30–40 W m−2 remain, for example, in the southeastern Pacific. Pattern correlation of swcre improved slightly from 0.91 in CMIP5 to 0.94 in CMIP6. This agrees very well with results of Jian et al. (2020).

Similar to swcre, there is an overestimation in the longwave cloud radiative effect north and south of the ITCZ in the models. This bias is improved in CMIP6 by about 5 W m−2 compared with the CMIP5 MMM, as already seen for CNRM-CM6-1 (Voldoire et al. 2019) and CESM2 (Danabasoglu et al. 2020). Otherwise, there are very few changes from CMIP5 to CMIP6. Pattern correlations are high and amount to 0.93 in CMIP5 and 0.94 in CMIP6.

The geographical distribution of water vapor path in CMIP6 shows a small improvement with smaller biases over the tropical ocean of about 1–2 kg m−2, a smaller overestimation of wvp north and south of the ITCZ, and about 1–2 kg m−2 less underestimation over the tropical Atlantic. The global annual average of the CMIP6 MMM (24.8 kg m−2) is slightly larger than in CMIP5 (24.1 kg m−2) and now within the range of the analyzed observational datasets (24.8–26.2 kg m−2). The pattern correlations of the MMMs are high and amount to about 0.99 for both CMIP5 and CMIP6.

For some of the parameters investigated here (lwp, iwp, swcre, wvp), we find that the CMIP6 intermodel spread is larger than in CMIP5 (see Table 5), while the average root-mean-square deviation is decreased (clt, lwp, swcre, wvp). We would expect this increased intermodel spread to be at least partly caused by an increased complexity of the CMIP6 models compared to CMIP5 introducing additional degrees of freedom.

Biases in simulated sea surface temperatures (SSTs) can affect simulated cloud properties. We therefore also analyzed some results from simulations using the atmosphere components of the CMIP models and prescribed observed SSTs (the so-called AMIP runs) for the same time periods (not shown). Similar to Lauer and Hamilton (2013), who investigated CMIP3 and CMIP5 models, we found that in general the skill of the AMIP multimodel means (MMMs) in reproducing the observed cloud climatologies measured by global means, biases, pattern correlations, and RMSDs does not systematically differ from the MMM obtained from the historical simulations. This is the case for both CMIP5 and CMIP6. While there are some improvements of specific biases such as, for example, the double-ITCZ visible in the geographical distributions of lwp and iwp, the AMIP models do not systematically outperform the coupled models in reproducing observed mean cloud properties.

2) Taylor diagrams

The Taylor diagrams shown in Fig. 2 give the spatial standard deviation and linear pattern correlation of the 20-yr annual means of a variable from individual models and observational climatologies with the corresponding multiobs mean reference dataset (see also Table 4). The spatial standard deviation is given as the ratio between the value for the individual dataset and the value from the reference dataset. The reference dataset is represented in each panel by the filled black circle on the x axis at x = 1. In the polar coordinate system used by the Taylor diagrams, the linear distance between the individual dataset and the reference dataset is proportional to the (centered) root-mean-square error (rmse) (Taylor 2001) and can be estimated using the green circles centered on the reference (black) dots. Using the multiobs mean as reference data allows for also including the individual observational datasets as an estimate of the observational uncertainty.
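The geometry described above follows from the law-of-cosines relation given by Taylor (2001): with the standard deviation normalized by the reference value, the centered RMS difference is sqrt(s² + 1 − 2sR), where s is the standard deviation ratio and R the pattern correlation. The sketch below computes the three coordinates of one point in such a diagram. It assumes unweighted, flattened fields on a common grid, and the function name `taylor_point` is hypothetical.

```python
import math

def taylor_point(model, ref):
    """Return (std ratio, pattern correlation, normalized centered RMS difference).

    Assumption: equal-length flattened fields, no area weighting.
    """
    n = len(ref)
    mean_m = sum(model) / n
    mean_r = sum(ref) / n
    sd_m = math.sqrt(sum((x - mean_m) ** 2 for x in model) / n)
    sd_r = math.sqrt(sum((y - mean_r) ** 2 for y in ref) / n)
    corr = sum((x - mean_m) * (y - mean_r)
               for x, y in zip(model, ref)) / (n * sd_m * sd_r)
    ratio = sd_m / sd_r
    # Law of cosines in the polar diagram: distance to the reference at (1, 0)
    crmse = math.sqrt(max(0.0, ratio ** 2 + 1.0 - 2.0 * ratio * corr))
    return ratio, corr, crmse

ref = [1.0, 2.0, 3.0, 4.0]
model = [2.0, 4.0, 6.0, 8.0]  # same pattern, twice the amplitude
ratio, corr, crmse = taylor_point(model, ref)
```

In this example the point lies on the x axis at a ratio of 2, one unit away from the reference: a perfectly correlated field can still have a nonzero centered RMS difference if its amplitude is wrong.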

Fig. 2.

Taylor diagrams showing the 20-yr annual average performance of the CMIP5 (light blue) and CMIP6 (light red) models for total cloud cover (clt), liquid water path (lwp), ice water path (iwp), TOA shortwave (swcre) and longwave (lwcre) cloud radiative effects, and water vapor path (wvp). Individual models are shown by the filled circles, the multimodel means by the stars, and the individual observational datasets by the colored circles. The multiobs reference is shown by the filled black circle on the x axis at x = 1.

For all six variables investigated, the performance of the MMMs from CMIP5 and CMIP6 in reproducing the observed annual mean distribution is quite similar. For clt, the pattern correlation of the MMM decreased slightly from 0.86 in CMIP5 to 0.83 in CMIP6 while the spread of the individual models’ correlations narrowed from 0.35–0.89 in CMIP5 to 0.60–0.88 in CMIP6. For comparison, data from the reanalyses ERA-Interim and ERA5 are also included. Their pattern correlations are 0.90 and 0.89, respectively. Correlations of the individual observational datasets range between 0.96 and 0.99.

The models’ skill in reproducing the geographical distribution of shortwave and longwave cloud radiative effects as well as water vapor path is high compared to the other variables investigated. Correlations range from 0.91 for swcre for CMIP5 (CMIP6 = 0.94) to 0.99 for wvp (CMIP5 and CMIP6). In contrast, the intermodel spread as well as the interobservation spread for lwp and iwp are very high regarding the pattern correlations as well as the ratio of the spatial standard deviations. For lwp, the pattern correlations of the MMMs (CMIP5 = 0.71, CMIP6 = 0.76) are within the range given by individual observational datasets, between 0.49 (MAC-LWP) and 0.94 (MODIS), which makes an assessment of the models’ performance difficult and confirms the challenges of obtaining accurate global maps of lwp from satellite retrievals. For iwp, the pattern correlations of the individual observational datasets range from 0.74 (CLARA-AVHRR) to 0.94 (ESACCI-CLOUD). For comparison, the pattern correlation of the MMM with the multiobs reference is 0.73 for CMIP5 and 0.78 for CMIP6. Again, because of the large spread among the different observational datasets, this improvement in CMIP6 is difficult to assess. More accurate measurements of the global iwp distribution would be needed.

3) Zonal means

The vertical distribution of clouds plays an important role in their effect on the radiation budget. A negative net cloud radiative effect, calculated as the sum of swcre and lwcre, corresponds to a net cooling effect of clouds and a positive value to a net warming effect. While for optically thick low-level clouds the net cloud radiative effect is typically dominated by swcre, resulting in a cooling net effect, lwcre becomes more important with increasing cloud-top height as the temperature difference between the cloud top and the surface increases.

Figure 3 shows a comparison of the zonal means of the 3D cloud fraction from CALIPSO-GOCCP (Chepfer et al. 2010) and the CMIP5 and CMIP6 MMMs of the subset of models providing the corresponding output (clcalipso) from the satellite simulation software COSP. The observations show large cloud fractions in the lowermost troposphere (0–2 km) over the Southern Ocean and in Northern Hemispheric middle and high latitudes. The 3D cloud fraction decreases with height, with the values in midlatitudes being higher and appearing separated from the adjacent subtropical latitudes with low cloud fractions. Another local maximum in the observed zonal mean distribution of the 3D cloud fraction is found in the upper troposphere (12–15 km) in the central tropics with the highest values in the ITCZ. This geographical distribution is qualitatively reproduced by the models. While both model generations underestimate cloud fraction in the lower and middle troposphere by roughly 5%, and by up to 30% (absolute differences) in the boundary layer over the Southern Ocean, cloud fractions are overestimated by 3%–5% in a band extending about 2–3 km below the tropopause region across most latitudes. This overestimation of the cloud fraction is more pronounced in the CMIP6 MMM than in CMIP5, as already indicated for single models (e.g., the IPSL model; Madeleine et al. 2020). An exception to this is the tropical tropopause region, in which the CMIP5 models underestimate the observed cloud fraction by up to 5%. This underestimation is not present in the CMIP6 MMM.

Fig. 3.

Multiyear annual averages of zonal mean cloud fraction (%). The top row shows (left) data from CALIPSO-GOCCP, as well as differences of the (center) CMIP5 (9 models) and (right) CMIP6 (7 models) multimodel mean (models providing COSP output) compared with CALIPSO-GOCCP. The bottom row shows (left) data from ERA5, as well as differences of the (center) CMIP5 (38 models) and (right) CMIP6 (35 models) multimodel mean compared with ERA5. Stippled differences are not statistically significant at a 95% confidence level.

As COSP output for clcalipso was only available from 9 CMIP5 and 7 CMIP6 models, we also calculated zonal means of the native (unprocessed) 3D cloud fraction (cl) that was available from 38 CMIP5 and 35 CMIP6 models. These are compared to ERA5 reanalysis results as no directly comparable observations are available. The zonal mean cloud fraction from ERA5 is qualitatively similar to the one from CALIPSO, but shows generally higher values of up to 10% throughout the domain with exceptions being the lowermost troposphere over the Southern Ocean (−20%) and the northern midlatitudes (−15%), as well as the middle troposphere in the ITCZ region (−5%). Both model generations generally overestimate the ERA5 cloud fraction by about 0%–5% (CMIP5) and 0%–10% (CMIP6). Particularly in CMIP5, there is no clearly outstanding pattern in the bias of cloud fraction visible related to specific regions or cloud types.

The vertical distribution of cloud liquid water and cloud ice water from observations from CloudSat and from CALIPSO-ICECLOUD, respectively, and the deviation of the CMIP5 and CMIP6 MMMs from the observations are shown in Fig. 4. Zonal mean cloud liquid water (clw) observations show maxima centered in the lowermost troposphere at about 800–900 hPa and 60°S and 60°N. Bands of high values extend from these locations to about 10°N (the ITCZ region) and 400 hPa. Another band of high values is found in the lowermost troposphere at about 850 hPa across almost all latitudes. Both model generations overestimate clw in the boundary layer below 900 hPa by up to 0.03 g kg−1 (CMIP5) and 0.02 g kg−1 (CMIP6). Here, CloudSat shows almost no cloud liquid water. We note that CloudSat’s cloud profiling radar is known to have difficulties in detecting low-level clouds because of surface clutter (Marchand et al. 2008).

Fig. 4.

The top row shows the multiyear annual average of (left) zonal mean cloud liquid water content (g kg−1) from CloudSat-L2, and differences of the (center) CMIP5 (32 models) and (right) CMIP6 (28 models) multimodel mean compared with CloudSat. The bottom row shows (left) the zonal mean cloud ice water content (g kg−1) from CALIPSO-ICECLOUD, and differences of the (center) CMIP5 (32 models) and (right) CMIP6 (27 models) multimodel mean compared with the CALIPSO data. Stippled differences are not statistically significant at a 95% confidence level.

In contrast, the models underestimate clw in both bands with observed high liquid water values extending from about 900 hPa at 60°S and 60°N to about 10°N at 400 hPa. The CMIP5 MMM underestimates clw in these bands by about 50% in the midlatitudes and up to about 75% in the tropics. This underestimation is even more pronounced in the CMIP6 MMM, amounting to about 75% in the midlatitudes and up to 90% in the tropics. It is noteworthy that the CloudSat values used here do not include pixels with precipitating clouds as CloudSat is known to be sensitive to both cloud water and precipitation [see section 2b(5)]. The CloudSat clw values are therefore rather regarded as a lower estimate, making the underestimation (above the boundary layer) of clw in the models even more pronounced.

For cloud ice water content (cli), the CALIPSO-ICECLOUD observations show a maximum in the zonal means at about 10°N and 200 hPa (ITCZ) as well as two bands of high values in northern and southern midlatitudes at around 300 hPa. The CMIP5 models overestimate cli in the lower and middle troposphere in midlatitudes by up to 0.01 g kg−1 and in the upper tropical troposphere at about 250 hPa by up to 0.02 g kg−1. This bias is reduced in the CMIP6 MMM, which shows deviations in the lower and middle troposphere in midlatitudes of up to 0.008 g kg−1 in the Southern Hemisphere and up to 0.005 g kg−1 in the Northern Hemisphere. This improvement was already seen for the IPSL models (Madeleine et al. 2020). In the upper tropical troposphere, the CMIP6 MMM bias is reduced to 0.01 g kg−1. In contrast to this overestimation, both model generations show an underestimation of cli in the upper troposphere in the middle and high latitudes by about 0.003 g kg−1. We note that the vertical integral of the 3D CALIPSO-ICECLOUD data (not shown) results in iwp values about a factor of 2 higher than the multiobs mean [see section 4a(1)]. This suggests that the overestimation of cli by the models is not simply caused by rather low cli values in the CALIPSO-ICECLOUD dataset due to saturation effects of the lidar instrument.

4) Ice fraction

To investigate the pattern of too much cloud ice in the lower and middle troposphere and too little ice in the upper troposphere in the models [section 4a(3)] and to better understand the differences between CMIP5 and CMIP6, we calculate average cloud ice fraction as a function of temperature. Similar to Zelinka et al. (2020), we use monthly means for the calculation of the ice fraction but instead of vertical integrals of cloud liquid water and cloud ice, we use the full 3D fields. For each 3D grid cell containing cloud water, the ice fraction is calculated as cli/(cli + clw) and then binned by temperature from the same grid cell. We use 10⁻⁶ kg kg−1 as a threshold for the minimum cloud water to calculate the ice fraction within the corresponding grid cell. Grid cells with less cloud water (sum of ice + liquid) are ignored. As an observationally based estimate, we do the same calculation using cli from CALIPSO-ICECLOUD, clw from CloudSat, and air temperature (ta) from ERA5. Figure 5 shows the results for nine individual CMIP5/CMIP6 model pairs that we identified by almost identical model names (BCC-CSM1/2, CESM1/2-WACCM, GFDL-CM3/4, GISS-E2-H/GISS-E2-1-H, HadGEM2-CC/HadGEM3-GC31-LL, MIROC-ESM/MIROC-ES2L, MPI-ESM-LR/MPI-ESM1-2-LR, MRI-ESM1/MRI-ESM2-0, NorESM1-M/NorESM2-LM) and the MMM of all 32 CMIP5 and 28 CMIP6 models with the required data available. The observations show a very steep gradient in ice fraction from >90% at 245 K to about 10% at 260 K. The models show a wide spread in the relation between temperature and ice fraction. A number of models and in particular the MMMs show a strongly overestimated ice fraction in the temperature range 250–270 K and an underestimation at temperatures below about 245 K. The intermodel spread is, however, as shown by the shaded area indicating the ±1 standard deviation range in Fig. 5, very large and includes the observations at most temperatures except around 260 K.
While some models (BCC-CSM, CESM-WACCM, GISS-E2, MRI-ESM) show large differences between the CMIP5 and the CMIP6 version, the models GFDL-CM, HadGEM, MPI-ESM, and NorESM show rather little difference between the two model generations. Particularly the CMIP6 versions of BCC-CSM, CESM-WACCM, and MRI-ESM are now closer to the observations [also shown in, e.g., Frey and Kay (2018) and Kawai et al. (2019)]. While the average ice fraction per temperature bin is higher in some CMIP6 models compared with their CMIP5 counterparts (BCC-CSM, CESM-WACCM, GFDL-CM) it is lower in other models (GISS-E2, HadGEM, MRI-ESM) as shown in Bodas‐Salcedo et al. (2019). Across the whole model ensemble (MMM), both model ensembles show few differences in the mean and intermodel spread across most temperatures. An exception is the temperature range 245–255 K, in which the CMIP6 models predict a cloud ice fraction that is about 4%–5% smaller than in CMIP5 (absolute differences). This suggests that the ice fraction for a given air temperature, which we expect to be to a large degree influenced by cloud microphysics, is probably not the only reason explaining the difference in the zonal means of clw and cli between CMIP5 and CMIP6. Other reasons contributing to the differences in simulated cloud ice could be differences in the simulated temperature fields. An analysis of the annual mean zonal average temperature (not shown) shows that the models simulate slightly lower temperatures compared with ERA5 in regions where cloud ice is overestimated by the models. Similar to Tian et al. (2013), who reported a tropospheric cold bias in many CMIP5 models, we find that temperatures are underestimated in CMIP5 in northern and southern midlatitudes at around 450 hPa by about 2 K, and in the upper tropical troposphere at around 250 hPa by about 1 K. 
This cold bias is improved by about 0.5 K in the CMIP6 MMM, which might also contribute to a small degree to the reduced cloud ice in the CMIP6 MMM in these parts of the troposphere.
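The ice fraction diagnostic described at the start of this subsection can be sketched as follows. The sketch takes flattened lists of cli, clw (both in kg kg−1), and ta (in K) from the same grid cells, applies the 10⁻⁶ kg kg−1 threshold on total cloud water, and averages cli/(cli + clw) per temperature bin. The 5-K bin width and the function name `ice_fraction_by_temperature` are assumptions made for illustration; the bin width used for Fig. 5 is not given in the text.

```python
import math
from collections import defaultdict

MIN_CLOUD_WATER = 1e-6  # kg kg-1, minimum ice + liquid per grid cell (from the text)

def ice_fraction_by_temperature(cli, clw, ta, bin_width=5.0):
    """Mean ice fraction cli / (cli + clw) per temperature bin.

    bin_width is an assumed value; keys are the lower bin edges in K.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ice, liquid, temp in zip(cli, clw, ta):
        total = ice + liquid
        if total < MIN_CLOUD_WATER:
            continue  # too little cloud water: grid cell is ignored
        lower_edge = math.floor(temp / bin_width) * bin_width
        sums[lower_edge] += ice / total
        counts[lower_edge] += 1
    return {edge: sums[edge] / counts[edge] for edge in sorted(sums)}

# Five illustrative grid cells; the last one falls below the threshold
cli = [9e-6, 8e-6, 1e-6, 0.0, 1e-8]
clw = [1e-6, 2e-6, 9e-6, 1e-5, 1e-8]
ta = [242.0, 243.0, 261.0, 262.0, 250.0]
fractions = ice_fraction_by_temperature(cli, clw, ta)
```

Averaging the per-cell fractions, rather than dividing summed ice by summed total water, gives every cloudy grid cell equal weight regardless of its water content; which of the two is used in the paper is not specified here.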

Fig. 5.

Global average fraction of ice water content in clouds calculated from 20 years (models) of monthly means of 3D cloud ice and cloud liquid water content binned by temperature. Ice fraction is shown for 9 individual CMIP5 models (blue dots) and their corresponding CMIP6 counterparts (red dots) for which all data were available. Also shown are the CMIP5 and CMIP6 ensemble means for each bin calculated from 32 models (CMIP5) and 28 models (CMIP6) as well as their ±1 standard deviation (shading). CMIP5 models are shown in blue, CMIP6 models in red, and the observational reference [CloudSat-L2 (clw), CALIPSO-ICECLOUD (cli), ERA5 (ta)] in black.

In addition to the global averages, we also calculated the ice fraction for the Southern Ocean (latitude belt 30°–65°S) and Pacific ITCZ (135°E–85°W, 0°–12°N) (not shown). While we found the results for the Southern Ocean to differ very little from the global averages, the models show higher ice fractions in the Pacific ITCZ. Here, the CMIP5 MMM shows an increase in ice fraction of up to about 10% at 260 K and the CMIP6 MMM of up to about 15% at 250 K. For the temperature range of ∼250–270 K, this increases the overestimation of the ice fraction in this region compared with the global averages. This suggests that an important contribution to the overestimation of modeled cloud ice in this temperature range is related to convective clouds.

Our results suggest that the overestimation of the cloud ice fraction over a large temperature range (∼250–270 K) in the models probably explains part of the overestimation of cloud ice in the lower and middle troposphere as cloud ice can either be formed or persist in the models at temperatures higher than shown in the observations (i.e., at lower altitudes in the troposphere). We tested this hypothesis by applying the observed ice fraction to the model data whenever the modeled ice fraction for a given temperature was larger than the observed one, that is, by multiplying the modeled ice concentration in those cases by the ratio of the observed to the modeled ice fraction at that temperature. This results in a decrease of the positive bias in simulated zonally averaged MMM cloud ice in the lower and middle troposphere in midlatitudes by roughly 70%–80%. Another reason for the overestimation in cloud ice could be that cloud ice is not removed quickly enough from the atmosphere (e.g., through sedimentation).
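The sensitivity test described above can be sketched as a per-grid-cell scaling. The sketch assumes that observed and modeled ice fractions are available as dictionaries keyed by the lower edge of each temperature bin; the 5-K bin width, the dictionary layout, and the function name `scale_cloud_ice` are all hypothetical choices for illustration.

```python
import math

def scale_cloud_ice(cli, ta, frac_mod, frac_obs, bin_width=5.0):
    """Scale modeled cloud ice by frac_obs / frac_mod where the model has too much ice.

    frac_mod, frac_obs: ice fraction per temperature bin (lower edge in K -> fraction).
    Cells in bins where the observed fraction is not smaller are left unchanged.
    """
    scaled = []
    for ice, temp in zip(cli, ta):
        key = math.floor(temp / bin_width) * bin_width
        f_mod = frac_mod.get(key)
        f_obs = frac_obs.get(key)
        if f_mod and f_obs is not None and f_obs < f_mod:
            ice = ice * f_obs / f_mod
        scaled.append(ice)
    return scaled

frac_mod = {250.0: 0.8, 240.0: 0.9}   # illustrative modeled ice fractions
frac_obs = {250.0: 0.4, 240.0: 0.95}  # illustrative observed ice fractions
scaled = scale_cloud_ice([1e-5, 1e-5], [252.0, 243.0], frac_mod, frac_obs)
```

The correction is one-sided by design: in the 240-K bin, where the illustrative model has less ice than observed, the modeled value is kept as is.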

b. Interannual variability

The interannual variability is estimated by subtracting the average seasonal cycle from the time series of monthly means and then calculating the relative temporal standard deviation [Eq. (4) and section 3e]. The observations in Fig. 6 show a high interannual variability of all cloud parameters in the Pacific ITCZ as well as the tropical ocean regions and the South Pacific convergence zone (SPCZ). Particularly in the midlatitudes, the continents often show a higher interannual variability than the ocean regions at the same latitude. Other regions of high variability are, for example, the Sahara in North Africa. This is a result of very small cloud amounts in this region so that even small changes lead to large relative changes. This general pattern is qualitatively reproduced by the models. However, a common feature across different variables is an underestimation of the observed interannual variability in the Pacific ITCZ by the models. For CMIP5, this underestimation ranges from about 20% to 30% for lwcre and swcre and from around 30% to 40% for lwp, and it has been reduced by about 10% for these three variables in CMIP6. For all eight variables investigated here, the differences between the CMIP5 and CMIP6 MMMs throughout most of the domain are, however, rather small (typically <5%). It is noteworthy that the global average interannual variabilities of wvp and clt are small (<15%) in both the observations and the models. The observed variabilities of lwp and iwp, however, are high, about 60% and 108%, respectively. In both cases, the average variability is strongly underestimated in both model generations, amounting to about 40% for lwp and >60% for iwp.
While part of these differences can probably be attributed to the large uncertainties in lwp and iwp data obtained from space, this could also point to the total cloud water path in the models not being sensitive enough to changes in dynamics and thermodynamic structure of the atmosphere as this underestimation is found in many different cloud regimes and geographical regions. The observed interannual variability of lwcre and swcre lies in between the ones for clt and for cloud water path with about 28% (lwcre) and 27% (swcre). Both model ensembles show similar global averages of about 31% for lwcre and about 27% for swcre.
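The interannual variability measure introduced at the start of this subsection can be sketched for a single grid cell as below. Eq. (4) is not reproduced in this part of the text, so the normalization is an assumption: the sketch divides the standard deviation of the deseasonalized monthly anomalies by the long-term mean and expresses the result in percent. The function name `relative_interannual_std` is hypothetical.

```python
import math

def relative_interannual_std(monthly):
    """Relative temporal standard deviation (%) of deseasonalized monthly means.

    monthly: whole years of monthly means for one grid cell, starting in January.
    Assumption: normalization by the long-term mean of the series.
    """
    n_years = len(monthly) // 12
    # Average seasonal cycle: mean of each calendar month over all years
    climatology = [sum(monthly[m::12]) / n_years for m in range(12)]
    anomalies = [value - climatology[i % 12] for i, value in enumerate(monthly)]
    std = math.sqrt(sum(a * a for a in anomalies) / len(anomalies))
    mean = sum(monthly) / len(monthly)
    return 100.0 * std / mean

# Two years without a seasonal cycle: every month is 9 in year 1 and 11 in year 2
series = [9.0] * 12 + [11.0] * 12
variability = relative_interannual_std(series)
```

Because the measure is relative, regions with very small mean cloud amounts (such as the Sahara case mentioned above) yield large values even for small absolute anomalies.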

Fig. 6.

Relative temporal standard deviation of 20 years (models) of total cloud cover (clt), liquid water path (lwp), ice water path (iwp), TOA shortwave (swcre) and longwave (lwcre) cloud radiative effects, and water vapor path (wvp), shown from top to bottom, from the (center) CMIP5 and (right) CMIP6 models compared with (left) multiobs averages. The temporal standard deviations are calculated from monthly anomalies after subtracting the climatological mean seasonal cycle. These have then been averaged over all models [Eq. (4)].

c. Seasonal cycle

As a measure for the strength of the seasonal cycle, we calculate its amplitude as the relative difference between minimum and maximum monthly mean values within the average seasonal cycle [Eq. (5) and section 3e]. Figure 7 shows the seasonal cycle amplitude for the six parameters clt, lwp, iwp, swcre, lwcre, and wvp. For total cloud cover, the observed seasonal cycle amplitude is about 40%–60% over large parts of the tropical and subtropical oceans, between 10% and 20% over midlatitude oceans, and <10% in large parts of the Southern Ocean. The amplitudes are higher particularly over large parts of low- and midlatitude continental areas. While this pattern is qualitatively reproduced by the models, the CMIP5 models overestimate the seasonal cycle amplitude by about 10%, particularly over the tropical ocean and the central parts of the Southern Ocean. This overestimation is slightly reduced in the CMIP6 MMM by about 5%. In northern India, northern Australia, and southern Africa, the models underestimate the observed seasonal cycle amplitude of clt. While there is little difference in these negative biases between CMIP5 and CMIP6 in India and Australia, the underestimation in southern Africa is about 10% stronger in CMIP6 than in CMIP5. The observed seasonal cycle amplitude of lwp and iwp in midlatitude ocean regions and in the Southern Ocean in particular is underestimated by the models by about 10%–20%. This negative bias is slightly reduced in the CMIP6 MMM by about 5%–10%. The observed seasonal cycle amplitudes are overestimated by the models over the tropical ocean by about 10% for swcre and by up to 20%–30% for lwcre. This bias is slightly larger (∼5%–10%) in CMIP6 than in the CMIP5 MMM. Similar to clt, the seasonal cycle amplitude of wvp is overestimated by the models particularly over tropical and subtropical oceans by about 10%. This did not change significantly from CMIP5 to CMIP6.
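The seasonal cycle amplitude defined in the first sentence of this subsection can be sketched for one grid cell as follows. The defining equation is not reproduced in this part of the text, so the normalization is an assumption: here the difference between the maximum and minimum of the average seasonal cycle is divided by the annual mean and given in percent. The function name `seasonal_cycle_amplitude` is hypothetical.

```python
def seasonal_cycle_amplitude(monthly):
    """Relative amplitude (%) of the average seasonal cycle for one grid cell.

    monthly: whole years of monthly means, starting in January.
    Assumption: (max - min) of the 12-month climatology divided by its mean.
    """
    n_years = len(monthly) // 12
    climatology = [sum(monthly[m::12]) / n_years for m in range(12)]
    annual_mean = sum(climatology) / 12
    return 100.0 * (max(climatology) - min(climatology)) / annual_mean

# Two identical years alternating between 5 and 15 from month to month
series = [5.0, 15.0] * 12
amplitude = seasonal_cycle_amplitude(series)
```

With this normalization the amplitude can exceed 100% where the annual mean is small, so a different denominator (for example the maximum) would change the numbers but not the geographical pattern.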

Fig. 7.

Relative amplitude of the mean seasonal cycle averaged over 20 years (models) of total cloud cover (clt), liquid water path (lwp), ice water path (iwp), shortwave (swcre) and longwave (lwcre) cloud radiative effects, and water vapor path (wvp), shown from top to bottom, from the (center) CMIP5 and (right) CMIP6 models compared with (left) multiobs averages [Eq. (5)].

d. Dynamical regime

The relative occurrence of specific cloud types is sensitive to the large-scale circulation (dynamical regime) and local thermodynamic conditions (Bony et al. 2004). Bony and Dufresne (2005) used monthly mean 500-hPa vertical velocity (ω500) as a proxy for the large-scale circulation over tropical oceans (latitude belt from 30°S to 30°N) to calculate composites of climate variables. Similar to Williams et al. (2003), we calculate two-dimensional composites of cloud properties binned by 500-hPa vertical velocity as a proxy for the large-scale circulation and binned by SST as a proxy for local thermodynamic conditions. We used all grid cells over the ocean. As observational reference, we use ESACCI-CLOUD data for total cloud cover and total cloud water path as well as ERA5 for SST and ω500.
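The two-dimensional compositing described above amounts to averaging a cloud property over all ocean grid cells and months that fall into the same SST–ω500 bin, and counting how often each bin occurs. The sketch below assumes flattened lists of co-located monthly values; the bin widths (1 K and 1 Pa min−1) and the function name `regime_composite` are hypothetical, as the binning used for Fig. 8 is not specified in the text.

```python
import math
from collections import defaultdict

def regime_composite(sst, omega500, values, sst_width=1.0, omega_width=1.0):
    """Bin-mean of `values` and relative frequency per (SST, omega500) bin.

    Keys are (lower SST edge in K, lower omega500 edge in Pa min-1).
    Bin widths are assumed values for illustration.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for t, w, v in zip(sst, omega500, values):
        key = (math.floor(t / sst_width) * sst_width,
               math.floor(w / omega_width) * omega_width)
        sums[key] += v
        counts[key] += 1
    n = len(values)
    means = {k: sums[k] / counts[k] for k in sums}
    freq = {k: counts[k] / n for k in counts}
    return means, freq

# Two ascending warm-SST cells (ITCZ-like) and two subsiding cool-SST cells
sst = [300.2, 300.7, 291.5, 291.1]
omega500 = [-3.5, -3.2, 1.5, 1.2]
clt = [90.0, 80.0, 60.0, 50.0]
means, freq = regime_composite(sst, omega500, clt)
```

For a multimodel mean, the composite would be computed per model first and the resulting two-dimensional distributions averaged afterwards, which keeps models with more grid cells from dominating the mean.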

Figure 8 shows such composites for total cloud cover (top) and total cloud water path (middle). The CMIP5 and CMIP6 ensemble means have been calculated by averaging the two-dimensional distributions from each individual model participating in CMIP5 and CMIP6, respectively. In this analysis, the ITCZ can be roughly characterized by SSTs > 300 K and ω500 < −3 Pa min−1. The observations show an increase in clt from about 75% to >90% with updraft velocity. This increase is qualitatively reproduced by the CMIP5 models, but absolute values are underestimated by several percent with maximum cloud cover values of about 85%. This is improved in CMIP6 particularly for strong updraft velocities. The tcwp increases from about 0.2 to 0.3 kg m−2 with increasing updraft velocities in the observations, which is overestimated in CMIP5 models and is better matched in CMIP6.

Fig. 8.

Two-dimensional distribution of average (top) total cloud cover (clt) and (middle) total cloud water path (tcwp) binned by SST (x axis) and vertical velocity at 500 hPa (ω500; y axis) averaged over 20 years and all grid cells over the ocean. (bottom) The relative frequency of occurrence of the SST–ω500 combinations.

In the subtropical stratocumulus regions, roughly characterized in this analysis by SSTs between 290 and 295 K and subsidence (vertical velocities of >0 Pa min−1), observations show that total cloud cover is particularly sensitive to SST and decreases roughly from 65% at 290 K to 55% at 295 K. This sensitivity is underestimated in CMIP5 (decrease from roughly 55% at 290 K to 50% at 295 K). The CMIP6 MMM also underestimates this gradient in clt with SST but shows a larger sensitivity of total cloud cover to SST (about 60% at 290 K to 50% at 295 K) that is in better agreement with observations. In both model ensembles, absolute clt values are underestimated compared with the observations. Absolute changes in tcwp are small, as tcwp values in this regime are typically rather small, and also decrease with increasing SST with little difference between CMIP5 and CMIP6.

Extratropical conditions are characterized by SSTs being typically below 290–295 K depending on the season. Here, we find an increase in total cloud cover with updraft velocities and a decrease with increasing SST. This behavior is qualitatively reproduced by the models but the CMIP5 MMM underestimates clt values by about 10% throughout most of the extratropical regime. The CMIP6 MMM is in better agreement with observations showing higher absolute values. The maximum in clt at moderate vertical velocities (<4 Pa min−1) and low SSTs (<280 K) is underestimated in both model generations, but again with improved absolute values in CMIP6. Total cloud water path in the extratropical regime is more sensitive to vertical velocity than to SST. While this is qualitatively reproduced by the CMIP models, particularly the CMIP5 MMM strongly overestimates the sensitivity of tcwp to vertical velocity. Here, CMIP6 models are in better agreement with the observations regarding absolute values as well as the gradient in tcwp with vertical velocity.

The relative frequency of occurrence of the different SST–ω500 combinations is shown in the bottom row of Fig. 8. The reanalysis data show maxima over cold ocean surfaces (T < 285 K) combined with weakly ascending conditions (−2 Pa min−1 < ω500 < 0) associated particularly with mid- and high latitudes, over warm ocean surfaces (290 K < T < 300 K) combined with weak subsidence (0 < ω500 < 3 Pa min−1) associated with the subtropical subsidence regions, and over the ocean with T > 295 K combined with weak to strong ascending motion (−6 Pa min−1 < ω500 < 0) associated with the rising branch of the Hadley cell in the tropics. The two multimodel ensembles qualitatively reproduce the shape of this distribution. The biggest relative differences are found in the tropics with the CMIP5 models underestimating the frequency of occurrence of SST–ω500 combinations of T ∼ 302 K and −5 Pa min−1 < ω500 < −2 Pa min−1 by about 50%, the CMIP6 models by about 40%. In contrast, for strong ascending motions (ω500 < −6 Pa min−1) in this temperature range the CMIP5 models overestimate the frequency of occurrence by roughly 50%–100%. In CMIP6, this overestimation is even slightly more pronounced. This overestimation is probably partly related to the so-called double-ITCZ bias (e.g., Tian and Dong 2020) in some of the models that can lead to two ascending branches of the Hadley cell in the tropics, which are visible in zonal annual means of the vertical velocity (not shown). As this problem in the zonal annual mean distribution of the vertical velocity is found to be reduced in CMIP6 (not shown), this suggests that, for example, convection might be too strong or too sensitive to high SSTs.

e. Regional cloud analyses

1) Southern Ocean

As a key region for connecting the surface ocean with the deep ocean, the Southern Ocean plays an important role in climate through its major contribution to the global ocean heat uptake and storage (e.g., Sallée 2018). The Southern Ocean is known to be a challenging region for ESMs with many models underestimating reflected shortwave radiation at the top of the atmosphere as a result of too few clouds (e.g., Trenberth and Fasullo 2010). Other models are simulating clouds over the Southern Ocean that are too reflective, which is thought to be caused by an overestimation of cloud ice and a corresponding underestimation of supercooled cloud liquid water (e.g., Bodas-Salcedo et al. 2012; Lohmann and Neubauer 2018).

To investigate the relationship between cloud radiative effect, total cloud water path, and cloud cover, we apply a diagnostic similar to Lauer et al. (2018), who binned monthly mean cloud radiative effect by cloud cover. Here we apply the same kind of binning also to tcwp. As the incoming solar radiation is at a maximum in austral summer and the effect of errors in model clouds on radiation is the largest, we focus on the months December, January, and February (DJF). Figure 9 shows 20-yr seasonal (DJF) averages of swcre and of tcwp versus total cloud fraction averaged over the Southern Ocean. The Southern Ocean is defined as all ocean grid cells in the latitude belt 30°–65°S. For cloud cover values ranging between about 20% and 75%, the CMIP5 MMM overestimates swcre (i.e., more negative values) compared to the ESACCI-CLOUD observations. In contrast, the CMIP6 MMM shows improved agreement with the observations up to clt values of about 70%. In both model ensembles, swcre is underestimated (i.e., values not negative enough) for total cloud cover values over 80%. Comparison of the total cloud water path per total cloud cover value suggests that this improved agreement is at least partially caused by a reduction in tcwp in the CMIP6 models, which is in better agreement with the ESACCI-CLOUD data than in CMIP5. The frequency distributions of simulated clt values, also shown in Fig. 9, indicate that the models tend to overestimate the frequency of values below about 80% and strongly underestimate very high values (above 80%), which are the most frequent in the observations. While this underestimation of very high clt values is still present in CMIP6, this bias is significantly reduced. The cumulative frequency of occurrence of clt values above 80% is 0.65 in the ESACCI-CLOUD data compared to 0.27 in the CMIP5 MMM and 0.47 in CMIP6.
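The binning by cloud cover described above can be sketched in a few lines. This is an illustration under stated assumptions, not the diagnostic code behind Fig. 9: the helper `bin_by_cloud_cover`, the 5% bin width, the unweighted averaging (no grid-cell area weights), and the synthetic data are hypothetical.

```python
import numpy as np


def bin_by_cloud_cover(clt, values, edges):
    """Mean of `values` per cloud-cover bin and relative frequency of each bin.

    Hypothetical helper: `clt` is total cloud cover in %, `values` is any
    co-located quantity (e.g. swcre or tcwp), `edges` are ascending bin edges.
    """
    clt = np.asarray(clt, dtype=float).ravel()
    values = np.asarray(values, dtype=float).ravel()
    valid = np.isfinite(clt) & np.isfinite(values)
    clt, values = clt[valid], values[valid]

    nbins = len(edges) - 1
    idx = np.digitize(clt, edges) - 1          # 0-based bin index
    idx[clt == edges[-1]] = nbins - 1          # put clt == last edge into last bin
    inside = (idx >= 0) & (idx < nbins)

    counts = np.bincount(idx[inside], minlength=nbins)
    sums = np.bincount(idx[inside], weights=values[inside], minlength=nbins)
    means = np.full(nbins, np.nan)             # empty bins stay NaN
    np.divide(sums, counts, out=means, where=counts > 0)
    freq = counts / max(counts.sum(), 1)
    return means, freq


# Synthetic stand-in data (not model output or satellite data)
rng = np.random.default_rng(1)
clt = rng.uniform(0.0, 100.0, size=5_000)                      # %
swcre = -1.5 * clt + rng.normal(0.0, 5.0, size=clt.size)       # W m-2, negative = cooling

edges = np.arange(0.0, 105.0, 5.0)                             # assumed 5% bins
swcre_mean, clt_freq = bin_by_cloud_cover(clt, swcre, edges)

# Cumulative frequency of occurrence of clt values of 80% and above
frac_above_80 = clt_freq[edges[:-1] >= 80.0].sum()
```

The same call with tcwp in place of swcre gives the second binned quantity, and summing the bin frequencies above a threshold gives a cumulative frequency comparable in kind to the values quoted for clt above 80%.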

Fig. 9.

The 20-yr seasonal average (DJF) of (top) TOA cloud shortwave radiative effect (W m−2) and (middle) total cloud water path (tcwp) vs total cloud fraction (clt; %) averaged over the Southern Ocean defined as latitude belt 30°–65°S (ocean grid cells only). Shown are (left) CMIP5 and (right) CMIP6 multimodel means as red filled circles and lines. The black circles and lines show observational estimates obtained from ESACCI-CLOUD data. The red shaded areas represent the range between the 10th and 90th percentiles of the results from all individual models within each model ensemble. (bottom) The frequency distribution of monthly mean total cloud cover with the red curve showing the multimodel average, the blue curve the ESACCI-CLOUD data, and the thin gray lines the individual models. The red shading shows ±1 standard deviation of the intermodel spread.

Citation: Journal of Climate 36, 2; 10.1175/JCLI-D-22-0181.1

These findings suggest that the CMIP5 problem of “too few, too bright” clouds over the Southern Ocean (e.g., Trenberth and Fasullo 2010) is reduced in CMIP6: the frequency of occurrence of high total cloud cover values in CMIP6 is closer to the observations than in CMIP5, and at the same time swcre for a given total cloud cover is in better agreement with ESACCI-CLOUD data than in CMIP5 over a wide range of cloud cover values. This is at least partly caused by an improvement in the simulated total cloud water path, which is in closer agreement with ESACCI-CLOUD data in CMIP6 than in CMIP5. The average ratio of simulated to observed tcwp across all total cloud cover bins is 2.1 in CMIP5 and improves to 1.4 in CMIP6.
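An average ratio across cloud cover bins can be computed from the binned means as sketched below. Whether the values quoted above are a mean of per-bin ratios (as assumed here) or a ratio of bin-averaged values is not stated in the text, so this choice, the helper name `mean_bin_ratio`, and the numbers in the example are assumptions for illustration only.

```python
import numpy as np


def mean_bin_ratio(sim_bin_means, obs_bin_means):
    """Average of simulated/observed ratios over all bins where both are defined."""
    sim = np.asarray(sim_bin_means, dtype=float)
    obs = np.asarray(obs_bin_means, dtype=float)
    # Skip bins that are empty (NaN) in either dataset or have a zero reference
    usable = np.isfinite(sim) & np.isfinite(obs) & (obs != 0.0)
    if not usable.any():
        return float("nan")
    return float(np.mean(sim[usable] / obs[usable]))


# Made-up bin means of tcwp (kg m-2); the last observed bin is empty
obs_tcwp = np.array([0.02, 0.04, 0.06, np.nan])
sim_tcwp = np.array([0.04, 0.06, 0.12, 0.20])
ratio = mean_bin_ratio(sim_tcwp, obs_tcwp)   # mean of 2.0, 1.5 and 2.0
```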

2) Southeastern Pacific

The southeastern Pacific (SEP) off the west coasts of northern Chile, Peru, and Ecuador is home to the largest and most persistent subtropical stratocumulus regime in the world (e.g., Bretherton et al. 2010). The stratocumulus clouds are formed and maintained by complex interactions between a relatively cool ocean surface and a moist marine boundary layer capped by a strong inversion with warm and dry air aloft (e.g., Lin et al. 2014). Because of their low cloud tops, these clouds have a cooling effect on climate by reflecting a large fraction of the incoming solar radiation while contributing only very little to the TOA longwave cloud radiative effect. Despite their importance in climate (e.g., Wood 2012), stratocumulus clouds have been notoriously difficult to simulate with global climate models (e.g., Caldwell et al. 2013; Lauer and Hamilton 2013). Common model problems in this region are an underestimation of cloud cover, too high cloud tops, and an unrealistic cloud albedo (Jian et al. 2020). In particular, the underestimation of cloud cover results in an overly weak shortwave cloud radiative effect, leading to too warm sea surface temperatures (Lin et al. 2014).

We investigate the relationship between cloud radiative effect, total cloud water path, and cloud cover over the SEP, defined as a rectangular ocean region covering 85°–95°W, 25°–5°S. As stratocumulus clouds play an important role in climate throughout the year, we focus on annual averages. Associated results are shown in Fig. 10.

Fig. 10.

As in Fig. 9, but for annual averages and over the southeast Pacific (SEP) defined as the region 85°–95°W, 25°–5°S.

The observations show an almost linear increase in tcwp with cloud fraction, which is reproduced by the multimodel ensemble means up to monthly mean clt values of about 80%. Beyond about 80% (CMIP5) and 90% (CMIP6), the intermodel spread indicated by the red shaded areas shows a very large variability of simulated tcwp among the individual models. On average over all total cloud cover bins, the CMIP5 MMM overestimates observed tcwp values by about 75%, the CMIP6 MMM by about 40%. As tcwp is an important factor determining the shortwave cloud radiative effect, this decrease in average tcwp per total cloud cover class in CMIP6 is also reflected in the corresponding swcre. On average over all total cloud cover bins, amplitudes of swcre are underestimated in the CMIP5 MMM by about 5% whereas this underestimation amounts to about 15% in the CMIP6 MMM. Similar to the Southern Ocean, the frequencies of high monthly mean clt values are increased in CMIP6 compared to CMIP5 and thus in better agreement with observations. The cumulative frequency of cloud cover larger than 60% is 37% in CMIP5 and 48% in CMIP6, compared to 79% in the observations. This means that the occurrence of stratocumulus clouds in the southeast Pacific is still underestimated in CMIP6 even though there are clear improvements compared to CMIP5. In contrast, swcre for a given total cloud fraction is more strongly underestimated in CMIP6 than in CMIP5, which can be partly attributed to a decrease in simulated tcwp per cloud cover class. This agrees with the results from Jian et al. (2020), who also saw that the AMIP6 models underestimate cloud albedos and have more difficulties than AMIP5 models in capturing the linear relationship of cloud fraction and planetary albedo over the marine stratocumulus regions, and especially in the SEP.

3) Pacific ITCZ

The ITCZ is a low pressure region in the tropics generated by convergence of the trade winds. As an important part of the global circulation, the ITCZ forms the upward branch of the Hadley cell that redistributes heat from the tropics to midlatitudes. Frequent occurrence of convection, including deep convection, is common in the ITCZ (e.g., Waliser and Gautier 1993). The ITCZ is a challenging region for global climate models with biases in the simulated clouds and convection. A well-known bias through different generations of global climate models including CMIP6 models is the so-called double ITCZ, an unrealistic zone over the southeastern Pacific and South Atlantic at about 10°S parallel to the equator that somewhat resembles a second ITCZ (Tian and Dong 2020). To capture the ITCZ over the eastern and western Pacific, we here define the region “Pacific ITCZ” used for the analysis as a rectangular region covering 135°E–85°W, 0°–12°N. Associated analysis results are shown in Fig. 11.
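Selecting such rectangular regions on a regular latitude-longitude grid can be sketched as follows. This is an illustrative example, not the regional selection used in ESMValTool: the helpers `box_mask` and `area_weighted_mean`, the 1° grid, and the cosine-of-latitude weights are assumptions, and the restriction to ocean grid cells is omitted (a land-sea mask could be combined with the box mask by a logical AND).

```python
import numpy as np


def box_mask(lat, lon, lat_min, lat_max, lon_west, lon_east):
    """Boolean (nlat, nlon) mask of a lat-lon box; longitudes in degrees east.

    Longitudes are normalised to 0-360, so a box across the dateline
    (e.g. 135E-85W) is a plain interval. Only a box across the prime
    meridian needs the wrap-around branch.
    """
    lat = np.asarray(lat, dtype=float)
    lon = np.mod(np.asarray(lon, dtype=float), 360.0)
    lon_west, lon_east = lon_west % 360.0, lon_east % 360.0
    in_lat = (lat >= lat_min) & (lat <= lat_max)
    if lon_west <= lon_east:
        in_lon = (lon >= lon_west) & (lon <= lon_east)
    else:
        in_lon = (lon >= lon_west) | (lon <= lon_east)
    return in_lat[:, None] & in_lon[None, :]


def area_weighted_mean(field, lat, mask):
    """Mean of `field` over `mask`, weighted by cos(latitude) as an area proxy."""
    field = np.asarray(field, dtype=float)
    weights = np.cos(np.deg2rad(np.asarray(lat, dtype=float)))[:, None] * mask
    valid = np.isfinite(field) & (weights > 0)
    if not valid.any():
        return float("nan")
    return float(np.sum(field[valid] * weights[valid]) / np.sum(weights[valid]))


# Assumed 1-degree grid with cell centres at x.5 degrees
lat = np.arange(-89.5, 90.0, 1.0)
lon = np.arange(-179.5, 180.0, 1.0)

itcz = box_mask(lat, lon, 0.0, 12.0, 135.0, -85.0)      # Pacific ITCZ: 135E-85W, 0-12N
sep = box_mask(lat, lon, -25.0, -5.0, -95.0, -85.0)     # SEP: 85-95W, 25-5S

field = np.ones((lat.size, lon.size))                   # placeholder field
itcz_mean = area_weighted_mean(field, lat, itcz)
```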

Fig. 11.

As in Fig. 9, but for annual averages and averaged over the Pacific ITCZ region defined as 135°E–85°W, 0°–12°N.

When averaged over all total cloud cover bins, the MMMs of CMIP5 and CMIP6 show larger tcwp values by a factor of 1.5 and 1.1, respectively, than the observational reference. Similar to the SEP, amplitudes of swcre are underestimated in CMIP5 by about 3% when averaged over all cloud cover bins, in CMIP6 by about 17%. Again, the reduction in swcre amplitude per cloud cover in CMIP6 can be at least partially attributed to the corresponding decrease in tcwp. In contrast to the Southern Ocean and the SEP, the frequencies of high monthly mean cloud cover values are overestimated in the CMIP6 MMM. The frequency of occurrence of a monthly mean total cloud cover of 60% and larger is 62% in the observations and 70% in CMIP6 compared with 58% in CMIP5. Possible reasons for this could be more active convection or longer cloud lifetimes in CMIP6.

5. Summary and conclusions

In this study, we compared results from historical simulations from models contributing to CMIP6 with satellite data and with output from the previous phase CMIP5. Whenever possible, reference datasets based on multiple satellite products were used for evaluation of the models in order to allow for an estimation of the observational uncertainties associated with different sensors and with internal variability on a per-pixel basis. A particular focus of this work was to assess how much progress has been made over the recent years by the model ensembles as a whole by comparing the MMMs from CMIP5 and CMIP6 with satellite data and with each other. In total, 35 models contributing to CMIP5 and 28 models contributing to CMIP6 were used for the analysis.

Our results show that for many cloud parameters investigated here such as total cloud cover, cloud water path, and cloud radiative effect, the CMIP6 MMM performs slightly better than the CMIP5 ensemble mean in terms of mean bias, pattern correlation, and relative root-mean-square deviation. The intermodel spread in CMIP6, however, is not reduced or is even slightly larger than in CMIP5. This is probably a result of the increased model complexity and thus an increased number of degrees of freedom in the models. Compared with CALIPSO-ICECLOUD data, the CMIP5/6 models overestimate cloud ice particularly in the lower and middle troposphere. This bias in the cloud ice climatology is reduced in CMIP6. Our results suggest that the overestimation of the cloud ice fraction over a relatively large temperature range (∼250–270 K) in the models can probably explain part of the cloud ice bias in the lower and middle troposphere as cloud ice is probably formed in the models at too high temperatures. Another reason for the overestimation in cloud ice could be that cloud ice is not removed quickly enough from the atmosphere, e.g., through sedimentation, resulting in cloud ice being too persistent (i.e., too long residence times in the atmosphere) in this temperature range.

While many known biases such as an underestimation in cloud cover in stratocumulus regions or unrealistic cloud distributions in the tropics due to a double ITCZ in some models remain a problem in CMIP6, we find that the CMIP5 problem of too few but too reflective clouds over the Southern Ocean is significantly reduced in CMIP6. Here, CMIP6 models simulate high total cloud cover values more frequently, which is in better agreement with satellite data than the frequency distribution of cloud cover from the CMIP5 models. At the same time, the shortwave cloud radiative effect for a given total cloud cover value in CMIP6 is reduced and in better agreement with satellite data compared with CMIP5.

While we found little change in interannual variability and amplitude of seasonal cycle of the investigated variables between CMIP5 and CMIP6, cloud cover and total cloud water path from CMIP6 are in better agreement with observations over a wide range of dynamical regimes characterized by 500-hPa vertical velocity and sea surface temperature. In CMIP6, total cloud cover is found to be increased in regions with high SSTs and strong upward motion (ITCZ) while showing a reduced sensitivity of the total cloud water path with updraft velocity at 500 hPa, which is in better agreement with observations than in CMIP5. In contrast, the sensitivity of total cloud cover to SST in the subtropical stratocumulus regions is underestimated in the models, with a slight improvement in CMIP6 compared to CMIP5. Total cloud water path in the extratropical regime is more sensitive to vertical velocity than to SST and is overestimated by the models. CMIP6 models are found to be in better agreement with the observations regarding absolute values as well as the gradient in tcwp with vertical velocity than the CMIP5 models. The results further show that the models are overestimating the frequency of occurrence of high updraft values in regions with high SSTs by about 50%–100%. In CMIP6, this overestimation is even slightly more pronounced than in CMIP5, and is probably partly related to the double-ITCZ bias in some of the models that can lead to two ascending branches of the Hadley cell in the tropics. As this problem in the zonal annual mean distribution of the vertical velocity is found to be reduced in CMIP6, this suggests that, for example, convection might be too strong or too sensitive to high SSTs.

We note that despite the increased complexity of the CMIP6 models and the resulting increase in degrees of freedom, the MMM climatology of many investigated cloud parameters has improved in some regions compared with satellite data and results from CMIP5. Among the most notable improvements is an improved agreement of cloud amount and reflectivity of clouds over the Southern Ocean. Many challenges remain, however, such as simulating clouds in the ITCZ or in the stratocumulus regions. Clouds remain a very challenging task for global climate models and further evaluation studies are therefore important for future improvements of cloud parameterizations in these models. For this, more widespread application of satellite simulators covering more instruments (e.g., Eliasson et al. 2019, 2020) in upcoming model intercomparison projects such as CMIP7 would clearly help to evaluate the models and improve our understanding of differences and model deficiencies.

Acknowledgments.

The development of ESMValTool is supported by several projects. The diagnostic development of ESMValTool v2 for this paper was supported by the ESA Climate Change Initiative Climate Model User Group (ESA CCI CMUG). We acknowledge the World Climate Research Program’s (WCRP’s) Working Group on Coupled Modelling (WGCM), which is responsible for CMIP, and we thank the climate modelling groups for producing and making available their model output in the framework of ESGF. The CMIP data of this study were replicated and made available for this study by the DKRZ.

This work used resources of the Deutsches Klimarechenzentrum (DKRZ) granted by its Scientific Steering Committee (WLA) under project ID bd0854.

CALIPSO-GOCCP data were obtained from the service ClimServ of the Institut Pierre Simon Laplace (IPSL). CALIPSO-ICECLOUD data were obtained from the NASA Langley Research Center Atmospheric Science Data Center. The work performed was done using CLARA-AVHRR data from the EUMETSAT Satellite Application Facility on Climate Monitoring (CM SAF), source: https://www.cmsaf.eu/. CloudSat-L2 data have been obtained from the CloudSat Data Processing Center run by the Cooperative Institute for Research in the Atmosphere (CIRA) at Colorado State University. This manuscript contains modified Copernicus Climate Change Service (2021) information with ERA5 data retrieved from the Climate Data Store (neither the European Commission nor ECMWF is responsible for any use that may be made of the Copernicus information or data it contains). This manuscript contains modified (time averaged) ERA-Interim data, copyright European Centre for Medium-Range Weather Forecasts (ECMWF), source: www.ecmwf.int; ERA-Interim is published under a Creative Commons Attribution 4.0 International (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/). ECMWF does not accept any liability whatsoever for any error or omission in the data, their availability, or for any loss or damage arising from their use. The ESA Climate Change Initiative (CCI) and Cloud_cci project are kindly acknowledged. The combined microwave and near-infrared imager based product COMBI (CM SAF/CCI TCWV-global) was initiated, funded, and provided by the Water Vapour project of the ESA CCI, with contributions from Brockmann Consult, Spectral Earth, Deutscher Wetterdienst, and the EUMETSAT Satellite Application Facility on Climate Monitoring (CM SAF). The combined MW and NIR product will be owned by EUMETSAT CM SAF and will be released by CM SAF via https://wui.cmsaf.eu.
The ISCCP H-Series Cloud Properties CDR used in this study was acquired from the NOAA National Centers for Environmental Information (NCEI; formerly NCDC) (https://www.ncei.noaa.gov). This CDR was developed by William B. Rossow and colleagues supported by NOAA’s CDR Program. The MAC-LWP dataset is provided by the Goddard Earth Sciences Data and Information Services Center (GES DISC) funded by NASA’s Science Mission Directorate (SMD). The AVHRR Clouds Properties–PATMOS-x CDR used in this study was acquired from NOAA’s National Climatic Data Center (http://www.ncdc.noaa.gov). This CDR was originally developed by A. Heidinger and colleagues for NOAA’s CDR Program. The CERES-EBAF data were obtained from the NASA Langley Research Center Atmospheric Science Data Center.

The MODIS dataset used in this work was obtained from the obs4MIPs [https://esgf-node.llnl.gov/projects/obs4mips/ (accessed on 2 November 2021)] project hosted on the Earth System Grid Federation [https://esgf.llnl.gov (accessed on 2 November 2021)].

We thank Sabine Brinkop (DLR) for helpful comments on the manuscript.

Data availability statement.

CMIP5 and CMIP6 data are available freely and publicly from the Earth System Grid Federation (ESGF). Observations used in the evaluation are detailed in Table 2. The observational datasets are not distributed with the ESMValTool, whose distribution is restricted to the code as open-source software. Observational datasets that are available through the Observations for Model Intercomparisons Project (obs4MIPs; https://esgf-node.llnl.gov/projects/obs4mips/) can be downloaded freely from the ESGF and directly used in the ESMValTool. For all other observational datasets, the ESMValTool provides a collection of scripts (NCL and Python) with exact downloading and processing instructions to recreate the datasets used in this publication.

Code availability statement.

ESMValTool v2 is released under the Apache License, version 2.0. The latest release of ESMValTool v2 is publicly available on Zenodo at https://doi.org/10.5281/zenodo.3387139. The source code of the ESMValCore package, which is installed as a dependency of the ESMValTool v2, is also publicly available on Zenodo at https://doi.org/10.5281/zenodo.3401363. ESMValTool and ESMValCore are developed on the GitHub repositories available at https://github.com/ESMValGroup.

REFERENCES

  • Adachi, Y., and Coauthors, 2013: Basic performance of a new Earth system model of the Meteorological Research Institute (MRI-ESM1). Pap. Meteor. Geophys., 64, 1–19, https://doi.org/10.2467/mripapers.64.1.

  • Andersson, A., K. Fennig, C. Klepp, S. Bakan, H. Graßl, and J. Schulz, 2010: The Hamburg Ocean atmosphere parameters and fluxes from satellite data—HOAPS-3. Earth Syst. Sci. Data, 2, 215–234, https://doi.org/10.5194/essd-2-215-2010.

  • Arora, V. K., and Coauthors, 2011: Carbon emission limits required to satisfy future representative concentration pathways of greenhouse gases. Geophys. Res. Lett., 38, L05805, https://doi.org/10.1029/2010GL046270.

  • Bentsen, M., and Coauthors, 2013: The Norwegian Earth system model, NorESM1-M—Part 1: Description and basic evaluation of the physical climate. Geosci. Model Dev., 6, 687–720, https://doi.org/10.5194/gmd-6-687-2013.

  • Bi, D., and Coauthors, 2013: The ACCESS coupled model: Description, control climate and evaluation. Aust. Meteor. Oceanogr. J., 63, 41–64, https://doi.org/10.22499/2.6301.004.

  • Bjordal, J., T. Storelvmo, K. Alterskjaer, and T. Carlsen, 2020: Equilibrium climate sensitivity above 5°C plausible due to state-dependent cloud feedback. Nat. Geosci., 13, 718–721, https://doi.org/10.1038/s41561-020-00649-1.

  • Bodas-Salcedo, A., and Coauthors, 2011: COSP: Satellite simulation software for model assessment. Bull. Amer. Meteor. Soc., 92, 1023–1043, https://doi.org/10.1175/2011BAMS2856.1.

  • Bodas-Salcedo, A., K. D. Williams, P. R. Field, and A. P. Lock, 2012: The surface downwelling solar radiation surplus over the Southern Ocean in the Met Office model: The role of midlatitude cyclone clouds. J. Climate, 25, 7467–7486, https://doi.org/10.1175/JCLI-D-11-00702.1.

  • Bodas-Salcedo, A., J. P. Mulcahy, T. Andrews, K. D. Williams, M. A. Ringer, P. R. Field, and G. S. Elsaesser, 2019: Strong dependence of atmospheric feedbacks on mixed-phase microphysics and aerosol–cloud interactions in HadGEM3. J. Adv. Model. Earth Syst., 11, 1735–1758, https://doi.org/10.1029/2019MS001688.

  • Bony, S., and J. L. Dufresne, 2005: Marine boundary layer clouds at the heart of tropical cloud feedback uncertainties in climate models. Geophys. Res. Lett., 32, L20806, https://doi.org/10.1029/2005GL023851.

  • Bony, S., J. L. Dufresne, H. Le Treut, J.-J. Morcrette, and C. Senior, 2004: On dynamic and thermodynamic components of cloud changes. Climate Dyn., 22, 71–86, https://doi.org/10.1007/s00382-003-0369-6.

  • Bony, S., and Coauthors, 2015: Clouds, circulation and climate sensitivity. Nat. Geosci., 8, 261–268, https://doi.org/10.1038/ngeo2398.

  • Boucher, O., and Coauthors, 2020: Presentation and evaluation of the IPSL-CM6A-LR climate model. J. Adv. Model. Earth Syst., 12, e2019MS002010, https://doi.org/10.1029/2019MS002010.

  • Bretherton, C. S., R. Wood, R. C. George, D. Leon, G. Allen, and X. Zheng, 2010: Southeast Pacific stratocumulus clouds, precipitation and boundary layer structure sampled along 20°S during VOCALS-REx. Atmos. Chem. Phys., 10, 10 639–10 654, https://doi.org/10.5194/acp-10-10639-2010.

  • Caldwell, P. M., Y. Zhang, and S. A. Klein, 2013: CMIP3 subtropical stratocumulus cloud feedback interpreted through a mixed-layer model. J. Climate, 26, 1607–1625, https://doi.org/10.1175/JCLI-D-12-00188.1.

  • Cameron-Smith, P., J.-F. Lamarque, P. Connell, C. Chuang, and F. Vitt, 2006: Toward an Earth system model: Atmospheric chemistry, coupling, and petascale computing. J. Phys. Conf. Ser., 46, 343–350, https://doi.org/10.1088/1742-6596/46/1/048.

  • Cao, J., and Coauthors, 2018: The NUIST Earth System Model (NESM) version 3: Description and preliminary evaluation. Geosci. Model Dev., 11, 2975–2993, https://doi.org/10.5194/gmd-11-2975-2018.

  • Chepfer, H., S. Bony, D. Winker, G. Cesana, J. L. Dufresne, P. Minnis, C. J. Stubenrauch, and S. Zeng, 2010: The GCM-Oriented CALIPSO cloud product (CALIPSO-GOCCP). J. Geophys. Res., 115, D00H16, https://doi.org/10.1029/2009JD012251.

  • Collins, W. J., and Coauthors, 2011: Development and evaluation of an Earth-system model—HadGEM2. Geosci. Model Dev., 4, 1051–1075, https://doi.org/10.5194/gmd-4-1051-2011.

  • Copernicus Climate Change Service, 2017: ERA5: Fifth generation of ECMWF atmospheric reanalyses of the global climate. C3S and CDS, accessed 2 November 2021, https://confluence.ecmwf.int/display/CKB/ERA5%3A+data+documentation.

  • Danabasoglu, G., and Coauthors, 2020: The Community Earth System Model version 2 (CESM2). J. Adv. Model. Earth Syst., 12, e2019MS001916, https://doi.org/10.1029/2019MS001916.

  • Dee, D. P., and Coauthors, 2011: The ERA-Interim reanalysis: Configuration and performance of the data assimilation system. Quart. J. Roy. Meteor. Soc., 137, 553–597, https://doi.org/10.1002/qj.828.

  • Diedrich, H., R. Preusker, R. Lindstrot, and J. Fischer, 2015: Retrieval of daytime total columnar water vapour from MODIS measurements over land surfaces. Atmos. Meas. Tech., 8, 823–836, https://doi.org/10.5194/amt-8-823-2015.

  • Donner, L. J., and Coauthors, 2011: The dynamical core, physical parameterizations, and basic simulation characteristics of the atmospheric component AM3 of the GFDL global coupled model CM3. J. Climate, 24, 3484–3519, https://doi.org/10.1175/2011JCLI3955.1.

  • Dufresne, J.-L., and Coauthors, 2013: Climate change projections using the IPSL-CM5 Earth System model: From CMIP3 to CMIP5. Climate Dyn., 40, 2123–2165, https://doi.org/10.1007/s00382-012-1636-1.

  • Dunne, J. P., and Coauthors, 2012: GFDL’s ESM2 global coupled climate–carbon Earth system models. Part I: Physical formulation and baseline simulation characteristics. J. Climate, 25, 6646–6665, https://doi.org/10.1175/JCLI-D-11-00560.1.

  • Dunne, J. P., and Coauthors, 2020: The GFDL Earth system model version 4.1 (GFDL-ESM 4.1): Overall coupled model description and simulation characteristics. J. Adv. Model. Earth Syst., 12, e2019MS002015, https://doi.org/10.1029/2019MS002015.

  • ECMWF, 2020: ERA5 data documentation. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), accessed 20 July 2020, https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels-monthly-means?tab=overview.

  • Eliasson, S., K. G. Karlsson, E. van Meijgaard, J. F. Meirink, M. Stengel, and U. Willén, 2019: The Cloud_cci simulator v1.0 for the Cloud_cci climate data record and its application to a global and a regional climate model. Geosci. Model Dev., 12, 829–847, https://doi.org/10.5194/gmd-12-829-2019.

  • Eliasson, S., K. G. Karlsson, and U. Willén, 2020: A simulator for the CLARA-A2 cloud climate data record and its application to assess EC-Earth polar cloudiness. Geosci. Model Dev., 13, 297–314, https://doi.org/10.5194/gmd-13-297-2020.

  • Elsaesser, G. S., C. W. O’Dell, M. D. Lebsock, R. Bennartz, T. J. Greenwald, and F. J. Wentz, 2017: The Multisensor Advanced Climatology of Liquid Water Path (MAC-LWP). J. Climate, 30, 10 193–10 210, https://doi.org/10.1175/JCLI-D-16-0902.1.

  • Eyring, V., S. Bony, G. A. Meehl, C. A. Senior, B. Stevens, R. J. Stouffer, and K. E. Taylor, 2016: Overview of the Coupled Model Intercomparison Project phase 6 (CMIP6) experimental design and organization. Geosci. Model Dev., 9, 1937–1958, https://doi.org/10.5194/gmd-9-1937-2016.

  • Eyring, V., and Coauthors, 2020: Earth System Model Evaluation Tool (ESMValTool) v2.0—An extended set of large-scale diagnostics for quasi-operational and comprehensive evaluation of Earth system models in CMIP. Geosci. Model Dev., 13, 3383–3438, https://doi.org/10.5194/gmd-13-3383-2020.

  • Fennig, K., M. Schröder, A. Andersson, and R. Hollmann, 2020: A fundamental climate data record of SMMR, SSM/I, and SSMIS brightness temperatures. Earth Syst. Sci. Data, 12, 647–681, https://doi.org/10.5194/essd-12-647-2020.

  • Fischer, J., R. Preusker, M. Schröder, and O. Danne, 2021: Water Vapour Climate Change Initiative (WV_cci)–CCI+ Phase 1. ATBD Part I: MERIS-MODIS-OLCI L2 Products, ESA, 43 pp., https://climate.esa.int/media/documents/Water_Vapour_cci_D2.2_ATBD_Part_1_-_MERIS-MODIS-OLCI_L2_products_v2.1.pdf.

  • Flato, G., and Coauthors, 2013: Evaluation of climate models. Climate Change 2013: The Physical Science Basis, T. F. Stocker et al., Eds., Cambridge University Press, 741–866.

  • Fogli, P. G., and Coauthors, 2009: INGV-CMCC Carbon (ICC): A carbon cycle Earth system model. CMCC Research Paper No. 61, 31 pp., https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1517282M.
