Clouds are a key component of the climate system affecting radiative balances and the hydrological cycle. Previous studies from the Coupled Model Intercomparison Project phase 3 (CMIP3) showed quite large biases in the simulated cloud climatology affecting all GCMs as well as a remarkable degree of variation among the models that represented the state of the art circa 2005. Here the progress that has been made in recent years is measured by comparing mean cloud properties, interannual variability, and the climatological seasonal cycle from the CMIP5 models with satellite observations and with results from comparable CMIP3 experiments. The focus is on three climate-relevant cloud parameters: cloud amount, liquid water path, and cloud radiative forcing. The comparison shows that intermodel differences are still large in the Coupled Model Intercomparison Project phase 5 (CMIP5) simulations, and reveals some small improvements of particular cloud properties in some regions in the CMIP5 ensemble over CMIP3. In CMIP5 there is an improved agreement of the modeled interannual variability of liquid water path and of the modeled longwave cloud forcing over mid- and high-latitude oceans with observations. However, the differences in the simulated cloud climatology from CMIP3 and CMIP5 are generally small, and there is very little to no improvement apparent in the tropical and subtropical regions in CMIP5.
Comparisons of the results from the coupled CMIP5 models with their atmosphere-only versions run with observed SSTs show remarkably similar biases in the simulated cloud climatologies. This suggests the treatments of subgrid-scale cloud and boundary layer processes are directly implicated in the poor performance of current GCMs in simulating realistic cloud fields.
The simulation of clouds with global climate models (GCMs) involves many nonlinear processes spanning a large range of spatial and temporal scales. While cloud systems such as the subtropical stratocumulus decks can extend over thousands of kilometers, cloud droplet formation and droplet growth occur on the micrometer scale. Similarly, time scales relevant to clouds and cloud microphysics range from weeks down to fractions of a second. All of this makes the simulation of clouds with climate models very difficult (Solomon et al. 2007). Because of the large impact of clouds on the radiation budget and their pivotal role in the hydrological cycle, even small changes in cloud properties could have a significant impact on climate (e.g., Hartmann and Doelling 1991). Clouds and their response to climate change therefore remain a major source of uncertainty for projections of the climate response to anticipated anthropogenic forcing (e.g., Cess et al. 1990; Bony and Dufresne 2005; Stowasser et al. 2006; Solomon et al. 2007; Medeiros et al. 2008; Lauer et al. 2010). Even for simulations of the long-term mean cloud fields under present-day conditions, GCMs generally display rather large deviations from observations, and there are quite large disagreements among GCMs in various aspects of their cloud climatology simulations (Weare 2004; Zhang et al. 2005; Waliser et al. 2007, 2009; Lauer et al. 2010; Chen et al. 2011). Deficiencies in the GCM representation of cloud fields will affect the application of such models to simulate chemistry–climate interactions (e.g., by causing biases in photolysis rates) and, specifically, to simulate aerosol indirect climate forcing (e.g., an underestimate of cloud frequency could lead to an underestimate of the aerosol indirect effect). Significant biases in the simulation of present-day clouds also raise concerns about the accurate representation of cloud feedback processes in climate change projections.
In this study we investigate the performance of state-of-the-art global coupled GCMs from the fifth phase of the Coupled Model Intercomparison Project (CMIP5) (Taylor et al. 2012) by comparing simulated cloud properties with observations. To address how much progress has been made over the recent years, we compare the CMIP5 models to results from the previous generation of global coupled models (CMIP3). Four cloud parameters are of particular interest as they largely determine the impact of clouds on the radiation budget and climate in the model simulations: total cloud amount (CA), liquid water path (LWP), ice water path (IWP), and top of the atmosphere (ToA) cloud forcing (CF). Both LWP and IWP contribute to CA and CF. Here we focus on CA, LWP, and CF, which we will partition between shortwave cloud forcing (SCF) and longwave cloud forcing (LCF). All three of these cloud parameters are also available from the CMIP3 models, allowing for a side-by-side comparison of CMIP5 with the previous-generation models. This enables us to assess the progress of simulating clouds with global coupled models. As we will demonstrate, both the CMIP3 and CMIP5 models display a great deal of disagreement among models in their cloud climatology simulations. Models developed at individual centers may have changed greatly between their CMIP3 and CMIP5 versions, so we do not try to track improvements of individual models. Results from the ensemble of all CMIP models are used to document the general progress in simulating clouds in comprehensive GCMs. We will also concentrate on the geographical patterns of vertically integrated cloud parameters. A more detailed evaluation of vertical profiles of water vapor and cloud water content from individual CMIP5 models can be found in Jiang et al. (2012).
We will show that even the state-of-the-art CMIP5 models display substantial regional biases in their cloud climatology simulations. These coupled GCMs also all display considerable regional biases in the basic aspects of the climate, as evidenced particularly by their simulation of sea surface temperature (SST). It is natural to ask how much the deficiencies in the cloud simulations are directly related to the biases in SST simulations. We performed a side-by-side comparison of results from the CMIP5 models’ coupled ocean–atmosphere runs with their counterparts using prescribed SSTs based on observations. This comparison allowed us to assess the role of biases in simulated SSTs in mean biases and the intermodel spread of the cloud properties investigated here.
Section 2 describes the models, the present-day model simulations, and the satellite data used for this intercomparison. The performance metrics and results from the comparison of the CMIP5 models with CMIP3 results and satellite observations are presented in section 3. Also discussed in section 3 is the comparison between coupled model simulations and those with prescribed SSTs. The main conclusions are given in section 4.
2. Models, model simulations, and satellite data
We analyze the results of 27 CMIP5 models (Table 1) from the “historical” model runs (twentieth-century simulations for 1850–2005 conducted with the best record of natural and anthropogenic climate forcing), the results from 18 CMIP5 models run with prescribed observed SSTs [Atmospheric Model Intercomparison Project (AMIP) runs; Table 2] (Taylor et al. 2012), and the results from 23 CMIP3 models (Table 3) for the twentieth-century runs with natural and anthropogenic forcings (20C3M experiments) (Meehl et al. 2007). The model data were obtained from the World Climate Research Programme’s (WCRP) CMIP3 and CMIP5 data archive, which is operated by the Program for Climate Model Diagnosis and Intercomparison (PCMDI). The model data used for this comparison cover 20-yr time periods from 1986 to 2005 for CMIP5 and from 1980 to 1999 for CMIP3. These time periods have been chosen to allow for a maximum overlap with available satellite data, particularly with global observations of liquid water path (O’Dell et al. 2008), which are available for 1988–2007. The satellite data, instruments, and time periods used for this comparison are summarized in Table 4.
We picked the three satellite datasets for LWP, CA, and CF listed in Table 4 particularly for their long time coverage of at least 20 years. A long record is needed for an evaluation of cloud properties on a statistical basis because individual years from the global coupled models do not correspond to any specific observed year. Multiyear climate modes such as ENSO can have a strong impact on cloud properties (e.g., Park and Leovy 2004). For the purpose of this study, the long time series of global radiative fluxes from the International Satellite Cloud Climatology Project flux data (ISCCP-FD), which are available since 1983, seem preferable to shorter time series like those from the Earth Radiation Budget Experiment (ERBE, 1984–1990) (Barkstrom 1984) or the Clouds and Earth’s Radiant Energy System (CERES, 2000–2011) (Wielicki et al. 1996). (Note that the CMIP5 historical runs are only required to extend through 2005 and the CMIP3 20C3M runs only through 1999.) The LWP climatology from O’Dell et al. (2008) has been chosen not only for the number of years with observations available but also because O’Dell et al. adjusted their results to be representative of the diurnal mean. The diurnal cycle of LWP can be important when comparing modeled monthly means with satellite data available only at certain times of the day (e.g., overpass time at 1030 local time), as neglecting it might introduce errors of unknown magnitude.
The uncertainties in the observational data given in Table 4 are estimated as the systematic errors, assuming that statistical errors are small when averaging over 20 years of data. O’Dell et al. (2008) give potential systematic errors in the LWP climatology of the order of 15%–30%. The uncertainty estimate for monthly regional average ToA fluxes from ISCCP-FD is 5–10 W m−2 (Zhang et al. 2004), translating into 25%–50% for the ToA cloud forcing. Total cloud amount from ISCCP is estimated to be about 10% too low over land and “about right” over the ocean (Rossow and Schiffer 1999).
With the exception of the data shown in Figs. 1 (and 4) (which show results on the native model grids), all model data have been regridded to a common 1° × 1° grid using bilinear interpolation for comparison with observations and for calculating multimodel means. If there are multiple ensemble members available for any given model and model experiment, we only consider the first ensemble member in our analysis: run1 for CMIP3 and r1i1p1 for CMIP5. This facilitates the comparison as the number of available ensemble members varies greatly among the individual models and between CMIP3 and CMIP5. An analysis of the spread in the 20-yr-mean LWP among the ensemble members of individual models shows that its relative standard deviation is typically small and ranges between 5% and 15% in regions with frequent deep convection and between 2% and 5% otherwise. The interensemble spread for LCF and SCF is even smaller. This interensemble spread is much smaller than the intermodel spread. We therefore do not expect our results to change significantly when using ensemble members other than run1 and r1i1p1 or when using ensemble means.
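The regridding step can be illustrated with a minimal bilinear interpolation onto a common grid. This is a simplified stand-in for whatever regridding software was actually used: it assumes monotonic coordinates and ignores longitude wrap-around and missing values, and the function name is ours.

```python
import numpy as np

def regrid_bilinear(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinearly interpolate a field(lat, lon) onto a destination grid.
    Simplified sketch: assumes monotonic coordinates, no longitude
    wrap-around, and no masked (missing) cells."""
    # first interpolate along longitude for every source latitude row
    tmp = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    # then interpolate along latitude for every destination longitude
    out = np.array([np.interp(dst_lat, src_lat, tmp[:, j])
                    for j in range(len(dst_lon))]).T
    return out
```

In practice each model's monthly fields would be regridded to the common 1° × 1° grid before any multimodel statistics are computed.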
3. Comparison of CMIP5 with CMIP3 and satellite data
a. Multiyear annual mean
Figure 1 shows the 20-yr annual mean liquid water path averaged over the years 1986–2005 from 24 CMIP5 models that have LWP as an available variable. These maps are compared with the multimodel mean and with the University of Wisconsin LWP climatology (UWisc). UWisc is based on satellite observations from the Special Sensor Microwave Imager (SSM/I), the Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI), and the Advanced Microwave Scanning Radiometer for Earth Observing System (EOS) (AMSR-E) in the years 1988–2007. The model LWP values are obtained by subtracting the vertically integrated ice water path (clivi) from the vertically integrated cloud water path (ice + liquid) clwvi. However, some models provided only the vertically integrated liquid water path as clwvi. Some of these models are listed on the PCMDI CMIP5 errata page (http://cmip-pcmdi.llnl.gov/cmip5/errata/cmip5errata.html). In addition to the models listed there, we found that subtracting the ice water path from the cloud water path for IPSL-CM5B-LR results in significant negative LWP values (monthly means of individual grid cells range down to −300 g m−2). We therefore assume that clwvi from IPSL-CM5B-LR is also liquid water only. We treat clwvi from the following 9 out of 24 CMIP5 models as liquid water only: CCSM4, CSIRO-Mk3.6.0, IPSL-CM5A-MR, IPSL-CM5A-LR, IPSL-CM5B-LR, MIROC-ESM, MIROC-ESM-CHEM, MPI-ESM-LR, and MPI-ESM-P. Similarly, clwvi from the following CMIP3 models is assumed to be liquid water only: BCCR-BCM2.0, CSIRO-Mk3.0, and CSIRO-Mk3.5 (Jiang et al. 2012).
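This bookkeeping can be made concrete with a small helper: LWP is derived as clwvi − clivi, and a model whose difference turns substantially negative is flagged as having archived liquid-only clwvi. This is only a sketch of the logic described above; the function name and the threshold value are illustrative assumptions, not part of the CMIP5 specification.

```python
import numpy as np

def derive_lwp(clwvi, clivi, neg_threshold=-1.0):
    """Derive liquid water path as clwvi - clivi (CMIP convention:
    clwvi = total condensed water path, clivi = ice water path).
    If the difference is substantially negative anywhere, the model
    most likely archived liquid-only clwvi, so return clwvi as-is.
    `neg_threshold` (same units as the inputs, here g m-2) is an
    illustrative choice, not a CMIP-defined value."""
    lwp = np.asarray(clwvi) - np.asarray(clivi)
    if np.nanmin(lwp) < neg_threshold:
        return np.asarray(clwvi)  # treat clwvi as liquid water only
    return lwp
```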
The models show a large spread in the overall LWP amounts as well as in the geographical pattern of the simulated LWP. Averaged over all grid cells with available satellite observations, the mean LWP values in the models range from 37 to 167 g m−2 with a median of 83 g m−2. The mean LWP from satellite observations is 85 g m−2 (±25 g m−2 assuming an uncertainty of ±30%) and that from the multimodel mean is 87 g m−2. The intermodel spread among the CMIP5 models has narrowed compared with the CMIP3 models, which give global annual average LWP values ranging between 28 and 209 g m−2. The CMIP3 median is 73 g m−2, and the multimodel mean is 78 g m−2. Interestingly, BCC-CSM1.1, CCSM4, and NorESM1-M show similar biases in the simulated annual mean LWP, with a strong overestimation in middle and higher latitudes. All three of these models have in common that the atmosphere component is based on a version of the National Center for Atmospheric Research (NCAR) Community Atmosphere Model (CAM) [CCSM4 (Gent et al. 2011); BCC-CSM1.1 (Wu et al. 2010); NorESM1-M (Kirkevåg et al. 2013)]. This suggests that the biases in the simulated LWP climatology are determined to a large degree by the atmosphere component of the model.
The linear pattern correlation of annual mean LWP between the individual CMIP5 models and the satellite observations ranges from a low of 0.03 to a high of 0.77. It is remarkable that some models simulate annual mean LWP fields with almost no geographical correlation with observations. The correlations of individual CMIP3 model simulations with observations range from 0.22 to 0.70. The pattern correlation of the CMIP5 multimodel mean with observations (0.59) did not improve on that of the CMIP3 multimodel mean (0.64).
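The pattern correlations quoted here can be computed, for example, as an area-weighted linear correlation over all grid cells with valid observations. The sketch below uses cosine-of-latitude weighting, which is our assumption; the exact weighting convention used for the published numbers may differ.

```python
import numpy as np

def pattern_correlation(model, obs, lat):
    """Area-weighted linear pattern correlation of two lat-lon fields.
    `lat` holds the latitudes (degrees) of the fields' rows; weighting
    by cos(latitude) approximates grid-cell area on a regular grid."""
    w = np.cos(np.deg2rad(lat))[:, None] * np.ones_like(model)
    m = model - np.average(model, weights=w)
    o = obs - np.average(obs, weights=w)
    cov = np.average(m * o, weights=w)
    return cov / np.sqrt(np.average(m * m, weights=w) *
                         np.average(o * o, weights=w))
```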
A measure of the performance of the CMIP model ensemble in reproducing observed mean cloud properties is obtained by calculating the differences between the modeled (xmod) and observed (xobs) 20-yr means. These differences are then averaged over all N models in the CMIP3 or CMIP5 ensemble to calculate the multimodel ensemble mean bias Δmm, which is defined at each grid point as

Δmm = (1/N) Σ_{i=1}^{N} (xmod,i − xobs).   (1)
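In code, the multimodel ensemble mean bias reduces to a one-liner once all N model fields share a common grid (a sketch; `models` is assumed to stack the regridded 20-yr mean fields):

```python
import numpy as np

def ensemble_mean_bias(models, obs):
    """Multimodel ensemble mean bias at each grid point:
    the mean over the N model fields minus the observed field,
    i.e. (1/N) * sum_i (x_mod_i - x_obs)."""
    return np.asarray(models).mean(axis=0) - np.asarray(obs)
```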
Figure 2 shows 20-yr annual means for liquid water path, total cloud amount, and ToA CF from satellite observations and the ensemble mean bias of the CMIP3 and CMIP5 models. The most striking overall impression from this figure is how similar the geographical distributions of the biases are in the CMIP3 and CMIP5 ensemble means for all four cloud properties shown. Both the CMIP3 and the CMIP5 ensemble means overestimate LWP in the midlatitude storm track regions, a bias that is apparent in many of the individual models as well (Fig. 1). This overestimation is not reduced in CMIP5 compared with CMIP3. Expressed as a fraction of the observed values, the LWP biases in most parts of the storm track regions are 30%–50% in CMIP5 compared with 20%–40% in CMIP3. The negative bias in the Pacific intertropical convergence zone (ITCZ) in the CMIP3 models (−30% to −50% of the observed values) is slightly improved in CMIP5 (−20% to −50%), particularly in the central Pacific. The systematic underestimation of LWP in the stratocumulus regions, particularly off the coasts of South America (−40% to −50%), South Africa (−30%), and west Australia (−10% to −20%), however, did not improve in CMIP5, with mean biases similar to those in CMIP3. The representation of LWP in the South Pacific convergence zone (SPCZ) did not improve in CMIP5 compared with the previous generation of coupled climate models.
In contrast to the overestimation of LWP, the models generally underestimate total CA in low and middle latitudes. While there are small improvements in reproducing the observed total CA in high latitudes in CMIP5, the underestimation of CA, particularly in midlatitudes, is found to be slightly larger in CMIP5 (−20%) than in CMIP3 (−10% to −20%). The underestimation of the cloud amount in the stratocumulus regions off the subtropical west coasts of the continents is similar in the CMIP5 and CMIP3 ensemble mean results, and the bias ranges between −30% and −50% of the observed CA. It is noteworthy that the ISCCP data used here are known to often show an artifact over the Indian Ocean between 50° and 100°E. This artifact in total cloud amount is related to a dependence of the retrieved data on the satellite zenith angle (Rossow and Garder 1993).
The CF is defined as the difference between ToA all-sky and clear-sky outgoing radiation in the solar spectral range (SCF) and in the thermal spectral range (LCF). A negative CF corresponds to an energy loss and a cooling effect, and a positive CF corresponds to an energy gain and a warming effect. Biases in annual average SCF are slightly reduced in CMIP5 over the subtropical central North Pacific (30%–50%) and the North Atlantic (20%–40%), as well as over North Africa (−10% to −50%) and South Asia (−20% to −60%), compared with CMIP3. Very little change between CMIP5 and CMIP3 is found in the stratocumulus regions (−30% to −60%), the ITCZ (−5% to −25%), the Southern Hemisphere oceans, and the Americas. In contrast to SCF, LCF improved significantly over the oceans, particularly in middle and high latitudes. The large biases in LCF found in CMIP3 in the Pacific and Atlantic south of the ITCZ and in the SPCZ are still present in CMIP5 even though the magnitude of the biases is reduced (CMIP3: 100%–200%; CMIP5: 50%–150%). The CMIP5 models show a larger underestimation in LCF, particularly over the Americas, Australia, and Asia (−20% to −30%), than do the CMIP3 models (−10% to −20%).
The overall comparisons of the annual mean cloud properties with observations are summarized for individual models and for the ensemble means by the Taylor diagrams for CA, LWP, SCF, and LCF shown in Fig. 3. These give the standard deviation and linear correlation with satellite observations of the total spatial variability calculated from 20-yr annual means. The standard deviations are normalized by the observed values, so the observed climatology is represented in each panel by the black stars on the x axis at x = 1. In this polar coordinate system, the linear distance between the observations and each model is proportional to the root-mean-square error (rmse) (Taylor 2001) and can be gauged using the green circles centered on the observational dots in Fig. 3 (contour labels in standard deviation). The linear correlation coefficients for total CA among the individual CMIP3 models range from 0.12 to 0.87 (multimodel mean = 0.76); in CMIP5 this range is from 0.11 to 0.83 (multimodel mean = 0.80). The rmse of the simulated 20-yr-mean CA ranges from 11% to 20% among the CMIP3 models (multimodel mean = 11%) and from 10% to 23% (multimodel mean = 12%) in CMIP5.
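The three quantities plotted in a Taylor diagram can be computed as below (an unweighted sketch; an area-weighted version would weight each term by cos(latitude)). The centered rmse E′, the two standard deviations, and the correlation R satisfy E′² = σ_m² + σ_o² − 2 σ_m σ_o R (Taylor 2001), which is what lets all three be read off a single polar plot.

```python
import numpy as np

def taylor_stats(model, obs):
    """Statistics for one point on a Taylor diagram (unweighted sketch):
    normalized standard deviation, pattern correlation, and centered
    rmse normalized by the observed standard deviation."""
    m = model - model.mean()
    o = obs - obs.mean()
    sig_m, sig_o = m.std(), o.std()
    r = (m * o).mean() / (sig_m * sig_o)       # pattern correlation
    crmse = np.sqrt(((m - o) ** 2).mean())     # centered rmse
    return sig_m / sig_o, r, crmse / sig_o
```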
Total CA is strongly determined by LWP and IWP. Similar to Jiang et al. (2012), we find that the model spread in total CA is smaller than that in LWP (and IWP). A possible reason for this could be that biases in LWP and IWP partly compensate each other.
In addition to the standard model variables, some CMIP5 models provide diagnostics resembling satellite observations of, for instance, total cloud amount, as it would be seen from ISCCP: CanESM2, IPSL-CM5A-LR, IPSL-CM5A-MR, MIROC-ESM, and MIROC-ESM-CHEM. These diagnostics are derived with the Cloud Feedback Model Intercomparison Project (CFMIP) (Bony et al. 2011) integrated satellite simulator, the CFMIP Observation Simulator Package (COSP) (Bodas-Salcedo et al. 2011). We compared the performance of these five models in reproducing the 20-yr-mean ISCCP observed total cloud amount with and without application of COSP. We found that, while the pattern correlation of CanESM2 improves considerably from 0.67 to 0.81 when using COSP, the performance of the four other models measured by the rmse hardly changes. We therefore do not expect the large intermodel spread in total cloud amount to be reduced considerably when comparing the simulated ISCCP total cloud amount instead of the models’ native total cloud amount, as done here.
Just as for CA, the performance in reproducing the observed multiyear annual mean LWP did not improve considerably in CMIP5 compared with CMIP3. The rmse ranges between 20 and 129 g m−2 in CMIP3 (multimodel mean = 22 g m−2) and between 23 and 95 g m−2 in CMIP5 (multimodel mean = 24 g m−2). For SCF and LCF, the spread among the models is much smaller compared with CA and LWP. The agreement of modeled SCF and LCF with observations is also better than that of CA and LWP. The linear correlations for SCF range between 0.83 and 0.94 (multimodel mean = 0.95) in CMIP3 and between 0.80 and 0.94 (multimodel mean = 0.95) in CMIP5. The rmse of the multimodel mean for SCF is 8 W m−2 in both CMIP3 and CMIP5. As in the case for SCF, the CMIP3 and CMIP5 model performance in simulating a realistic LCF climatology is clearly better than for CA and LWP. The correlation of the multimodel mean LCF from CMIP3 is 0.92 (rmse = 5 W m−2), with that of the individual models ranging between 0.66 and 0.90 (rmse = 6–11 W m−2). For CMIP5, the correlation of the multimodel mean LCF is 0.93 (rmse = 4 W m−2) and ranges between 0.70 and 0.92 (rmse = 4–11 W m−2) for the individual models.
In both CMIP3 and CMIP5, the large intermodel spread and biases in CA and LWP contrast strikingly with a much smaller spread and better agreement of global average SCF and LCF with observations. The SCF and LCF directly affect the global mean radiative balance of the earth, so it is reasonable to suppose that modelers have focused on “tuning” their results to reproduce aspects of SCF and LCF, as the global energy balance is of crucial importance for long climate integrations.
The problem of compensating biases in simulated cloud properties is not new and has been reported in previous studies. For example, Zhang et al. (2005) find that many CMIP3 models underestimate cloud amount while overestimating optically thick clouds, explaining why models simulate the ToA cloud forcing reasonably well while showing large biases in simulated LWP (Weare 2004). The problem of too few but too bright clouds has also been identified in some CMIP5 models (e.g., Kay et al. 2012; Nam and Quaas 2012).
b. Coupled models versus prescribed SSTs
A first-principles simulation of even the main features of regional climate in coupled atmosphere–ocean models is challenging, as errors can be amplified by positive air–sea coupling feedbacks. Current coupled GCMs typically simulate the mean present-day climate with substantial biases relative to observations. A well-known example affecting many coupled models is a cold bias in the eastern Pacific equatorial SST accompanied by deficiencies in the overlying atmospheric circulation, along with the appearance of an unrealistic “double ITCZ” feature (e.g., Lin 2007). Figure 4 shows the biases in simulated annual mean climatological SST in 26 CMIP5 models. The multimodel mean SST has positive biases of 1°–3°C in the stratocumulus regions off the west coasts of North and South America, as well as off Africa, and in middle and high latitudes south of 45°S. SSTs in the multimodel mean are too low compared with observations in the North Pacific (−2°C) and the North Atlantic (−2° to −4°C). The large SST biases in the stratocumulus regions are of particular concern as ocean temperature there may significantly impact the boundary layer cloud fields (e.g., Bony 1997). It is reasonable to ask how much of the large deficiencies we have documented in cloud fields in the CMIP models may be ascribed to the overall biases in the simulated climate in the coupled model runs. Here we compare results from near present day in the CMIP5 coupled model historical runs (Table 1) with results from experiments using the atmosphere components of the CMIP5 models and prescribed observed SSTs (the so-called AMIP runs). As shown in Table 2, 18 of the CMIP5 models have provided AMIP results we can use. The performance of the models in reproducing observed multiyear mean cloud properties is summarized in the Taylor diagrams for CA, LWP, SCF, and LCF, shown in Fig. 5.
As in the case of the coupled CMIP5 models discussed above, we find a large intermodel spread for CA and LWP, while the spread is smaller for SCF and LCF. The corresponding biases and rmse are smaller for the modeled SCF and LCF than for CA and LWP. The intermodel spread in both pattern correlation and rmse does not narrow when going from the coupled model runs to the AMIP experiments. For LWP, the range of disagreement with observations among individual models is similar in the AMIP runs (linear pattern correlation = 0.35–0.81, rmse = 23–96 g m−2) and in the coupled simulations (correlation = 0.03–0.77, rmse = 23–95 g m−2). The median correlation of the AMIP models with observations is 0.49 (median rmse = 42 g m−2) and that of the coupled models is 0.49 (median rmse = 43 g m−2). Similar results are seen for the other cloud parameters investigated here (CA, SCF, and LCF). This suggests that the AMIP models do not systematically outperform the coupled models in reproducing observed mean cloud properties. Furthermore, this suggests that the large intermodel spread in CA and LWP is attributable to the representation of cloud processes rather than to biases in the SST and related aspects of the circulation in the coupled models.
Figure 6 shows the biases in simulated cloud properties from the CMIP5 multimodel means computed for 13 coupled models that also provide data from AMIP runs (models marked with an asterisk in Table 1) and are averaged over the corresponding AMIP experiments (Table 2). Also shown in the rightmost panels are the differences in the magnitude of the biases between the coupled and the AMIP model runs (|coupled| − |AMIP|). Positive values refer to larger biases in the coupled runs, and negative values refer to larger bias amplitudes in the AMIP runs. The differences in the bias amplitudes between the coupled and the AMIP model runs are generally small and are typically less than 10% of the modeled values. The geographical patterns of the biases are also very similar in the coupled and AMIP runs.
Figure 7 shows the coupled model SST biases and the biases in simulated 20-yr-mean LWP in the coupled and AMIP simulations for four selected individual CMIP5 models. For this purpose, we chose the models that have the smallest rmse in annual mean LWP in both the coupled and the AMIP runs: MIROC5 (rmsecoupled = 23 g m−2, rmseAMIP = 23 g m−2), MRI-CGCM3 (24 g m−2, 24 g m−2), CNRM-CM5 (30 g m−2, 34 g m−2), and CSIRO-Mk3.6.0 (35 g m−2, 39 g m−2). As for the multimodel means, we find the differences in simulated LWP between the coupled and the AMIP simulations to be remarkably small, with differences in the bias magnitudes typically around 10%–20% of the modeled values or less. In the subtropical stratocumulus regions off the coasts of North and South America and off the coast of Africa, three of the four coupled models show positive SST biases, yet the corresponding negative biases in simulated LWP are similar to those in the AMIP runs. Negative SST biases are found over most of the other parts of the subtropical oceans, with smaller bias magnitudes in LWP in the coupled model runs than in the AMIP experiments. This is particularly the case for the MIROC5 and the MRI-CGCM3 models. Warm biases in SST in southern mid-to-high latitudes also seem to be related to smaller LWP biases in the coupled version of CNRM-CM5 compared with the AMIP run.
c. Interannual variability
The interannual variability is estimated as the temporal standard deviation of the deseasonalized monthly means (σmod) calculated at each grid cell. The monthly means are normalized by the average over the entire time series, x̄mod. The interannual variability of the multimodel ensemble σmm is calculated by averaging over all N individual CMIP5 models:

σmm = (1/N) Σ_{i=1}^{N} σmod,i.   (2)
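Per model and grid cell, the relative temporal standard deviation σmod can be estimated as sketched below. The function assumes the record holds whole years of monthly means, and the function name is ours.

```python
import numpy as np

def relative_interannual_std(monthly, months_per_year=12):
    """Relative temporal standard deviation of deseasonalized monthly
    means at one grid cell: remove the mean seasonal cycle, take the
    standard deviation, and normalize by the record mean."""
    x = np.asarray(monthly, dtype=float)
    n_years = x.size // months_per_year
    clim = x.reshape(n_years, months_per_year).mean(axis=0)  # mean seasonal cycle
    anom = x - np.tile(clim, n_years)                        # deseasonalized anomalies
    return anom.std() / x.mean()
```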
Figure 8 shows the relative temporal standard deviation [Eq. (2)] from satellite observations (Table 4) and the deviation of the modeled interannual variability from that of the observations for CA, LWP, SCF, and LCF. The observations show the largest interannual variability in LWP in the ITCZ and SPCZ regions, as well as in the subtropical regions of the central and west Pacific and the Indian Ocean. The modeled interannual variability of LWP in CMIP5 is overestimated throughout most of the globe, ranging between 5% and 15% in the midlatitudes and between 15% and 25% in the subtropics, with the largest overestimation in the subtropical east Pacific. In contrast, the models underestimate the interannual variability in LWP in the equatorial east and central Pacific (−10% to −30%). This qualitative pattern is also found in the CMIP3 results, but CMIP5 shows improved agreement with the observed interannual variability of LWP, particularly in midlatitudes and the subtropical west Pacific and Indian Ocean. Differences between modeled and observed interannual variability in CA, LCF, and SCF are smaller than those in LWP, with the main features of the geographical distribution of the deviations being similar to those for LWP. In contrast to LWP, the differences between the interannual variability in CMIP3 and CMIP5 for CA, SCF, and LCF are small. The overestimation in interannual variability of CA in CMIP5 is somewhat reduced over the continents compared with CMIP3, but slightly increased over the subtropical oceans. The differences in observed and modeled interannual variability in SCF and LCF in CMIP5 and their changes compared with CMIP3 are qualitatively similar to those in CA.
d. Seasonal cycle
We analyze the ability of the models to reproduce the observed geographical patterns of the seasonal cycle of LWP, CA, SCF, and LCF by calculating the differences in seasonal averages over the months June–August (JJA) and December–February (DJF). Figure 9 shows a comparison of the differences in JJA and DJF averages from the CMIP3 and CMIP5 models with satellite observations. Both the CMIP3 and the CMIP5 multimodel means capture the geographical pattern of the JJA–DJF amplitudes reasonably well. The differences in JJA–DJF amplitudes for LWP, CA, and LCF between CMIP3 and CMIP5 are mostly small throughout the globe, with little to no improvement in CMIP5 over CMIP3. In contrast, the amplitudes of the differences in summer and winter averages of SCF from CMIP5 increased in midlatitudes and high latitudes by 10%–15% compared with CMIP3. The SCF amplitudes from CMIP5 in these regions are in better agreement with satellite observations than those from CMIP3.
For each grid point and each model, a mean seasonal cycle was determined by averaging the data for January over all years, for February over all years, etc. We then normalized this mean seasonal cycle (xj) for each model and grid point by dividing by the average over all months and years, x̄. As a performance measure of the shape of the seasonal cycle, we calculate the rmse of the normalized modeled mean seasonal cycle (xmod,j/x̄mod) compared with that from observations (xobs,j/x̄obs). The rmse of the multimodel ensemble (rmsemm) is calculated by averaging over all rmse values for the N individual CMIP3 or CMIP5 models:

rmsemm = (1/N) Σ_{i=1}^{N} rmse_i,  with  rmse_i = sqrt[ (1/12) Σ_{j=1}^{12} (xmod,j/x̄mod − xobs,j/x̄obs)² ].   (3)
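The per-model rmse of the normalized seasonal cycle at one grid point can be sketched as below. The inputs are assumed to hold whole years of monthly means, and the function name is ours; because each cycle is divided by its record mean, a constant multiplicative bias drops out and only the shape of the seasonal cycle is compared.

```python
import numpy as np

def seasonal_cycle_rmse(model_monthly, obs_monthly):
    """RMSE between the normalized mean seasonal cycles of model and
    observations at one grid point."""
    def norm_cycle(x):
        x = np.asarray(x, dtype=float)
        cyc = x.reshape(-1, 12).mean(axis=0)  # mean seasonal cycle x_j
        return cyc / x.mean()                 # normalize by overall mean
    d = norm_cycle(model_monthly) - norm_cycle(obs_monthly)
    return float(np.sqrt((d ** 2).mean()))
```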
The largest errors in the multimodel mean seasonal cycle of LWP are found in the tropical Atlantic and east Pacific as well as in the Indian Ocean and the west Pacific (Fig. 10). Here the average rmse ranges between 50% and 80%. The errors in the tropics are largely unchanged in CMIP5 compared with CMIP3. The errors in the LWP mean seasonal cycle in midlatitudes are much smaller than in the tropics and typically range between 10% and 20%. Again, there is only very little change between CMIP3 and CMIP5 in the models’ ability to reproduce the observed LWP mean seasonal cycle.
CA, SCF, and LCF show behavior similar to that of LWP, with large errors in reproducing the observed mean seasonal cycle in the tropics and much smaller errors in midlatitudes. In general, the error amplitudes are smallest for cloud amount and largest for LCF and LWP. Overall, there is little to no improvement in reproducing the observed seasonal cycles of LWP, CA, SCF, and LCF in CMIP5 compared with CMIP3.
Comprehensive global models of the kind used for long-term climate projections are known to display significant deficiencies in their cloud simulations. The simulated cloud climate feedbacks activated in global warming projections differ enormously among state-of-the-art models, and this large degree of disagreement has been a constant feature documented for successive generations of GCMs from the time of the first Intergovernmental Panel on Climate Change (IPCC) assessment (Cess et al. 1990) through the CMIP3 generation models used in the fourth IPCC assessment (Bony and Dufresne 2005; Lauer et al. 2010). Much of the difference among state-of-the-art global models in their simulated climate sensitivity is due to the simulated cloud feedbacks, and the recalcitrance of the cloud simulation issue accounts for the often-lamented fact that the range of GCM-estimated climate sensitivity to increased greenhouse gas concentrations scarcely narrowed from the first IPCC assessment to the fourth IPCC assessment. Even the model-simulated cloud climatologies for present-day conditions are known to depart significantly from observations and, once again, the variation among models is quite remarkable (e.g., Weare 2004; Zhang et al. 2005; Waliser et al. 2007, 2009; Lauer et al. 2010; Chen et al. 2011).
In the project reported here, we have evaluated the quality of the cloud climatology simulations in the CMIP5 coupled models, which represent the contemporary state of the art in models to be used for long-term climate projections. The evaluation focused on the geographical patterns of vertically integrated cloud properties, notably, liquid water path, total cloud amount, and shortwave and longwave cloud radiative forcing. We have also tried to assess whether an overall improvement in model performance for these fields can be found relative to that of the CMIP3 models, which represented the state of the art circa 2005. Of course, such a comparison of model performance across model generations is complicated by the fact that many separate changes may be introduced by developers at any particular climate research center over such a long period, and improving cloud simulations will be only one of many motivations for such changes. However, given the first-order deficiencies seen in the critically important cloud fields in global GCMs, it is worthwhile to see whether an overall improvement in the state-of-the-art models is apparent. A simple approach to characterizing this overall progress is through the use of multimodel mean ensemble results. Possible improvements in performance in the multimodel mean cloud fields are also of interest since the multimodel means are sometimes regarded, intuitively, as the most reliable or probable projections from the CMIP model intercomparison exercises.
We found that the long-term mean vertically integrated cloud fields have quite significant deficiencies in all the CMIP5 model simulations. For example, the highest spatial correlation of annual mean LWP with the satellite climatology for any of the CMIP5 models is 0.77, and for one of the models the correlation is as small as 0.03. The correlation of the multimodel mean for CMIP5 in this case is 0.59, similar to the value of 0.64 for the CMIP3 ensemble. Both the CMIP5 and CMIP3 models display a clear bias toward simulating too-high LWP in midlatitudes. This bias is not reduced in the CMIP5 models. We also investigated the seasonal cycle of cloud properties in the CMIP3 and CMIP5 runs and found significant discrepancies from observations, but only a very small improvement in this regard in CMIP5.
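A spatial (pattern) correlation of the kind quoted above can be computed as sketched below. This is a hedged illustration: cosine-of-latitude area weighting is an assumption here, as the exact weighting used in the study is not restated in this section, and the function name is hypothetical.

```python
import numpy as np

def pattern_correlation(field_a, field_b, lats):
    """Area-weighted spatial correlation of two lat-lon fields.

    field_a, field_b: 2-D arrays of shape (nlat, nlon), e.g. annual
    mean LWP from a model and from satellite observations.
    lats: latitudes in degrees; cos(lat) weighting is an assumption.
    """
    a = np.asarray(field_a, dtype=float)
    b = np.asarray(field_b, dtype=float)
    # Build normalized area weights, broadcast to the full grid.
    w = np.cos(np.deg2rad(np.asarray(lats, dtype=float)))[:, None] * np.ones_like(a)
    w /= w.sum()

    def center(f):
        return f - (w * f).sum()   # remove the weighted spatial mean

    ac, bc = center(a), center(b)
    cov = (w * ac * bc).sum()
    return cov / np.sqrt((w * ac * ac).sum() * (w * bc * bc).sum())
```

Two fields that differ only by a positive linear rescaling have a pattern correlation of 1, which is why this metric measures the agreement of geographic structure rather than of absolute magnitude.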
Our analysis of the root-mean-square error of simulated LWP, CA, SCF, and LCF supports our finding of little to no change in the skill of reproducing the observed LWP and CA. In general, the models show higher skill in reproducing observed SCF and LCF than CA and LWP. Of course, since SCF and LCF directly affect the global mean radiative balance of the earth, efforts by model developers to “tune” their subgrid-scale parameterizations may focus on these variables. Our study included a detailed comparison of present-day cloud climatologies in the CMIP5 coupled historical runs and in AMIP simulations, in which the atmospheric components are forced with observed SSTs. We found that, even though the historical coupled runs displayed significant mean SST biases, the overall cloud simulations in the coupled and AMIP runs were quite similar. This result seems quite robust, and it is noteworthy that the AMIP simulations do not outperform the coupled runs in reproducing observed mean cloud properties. This suggests that the deficiencies in state-of-the-art GCM cloud climatology simulations may be rather directly attributable to the cloud, convection, and boundary layer parameterizations employed. A positive implication of our findings is that some aspects of cloud-related subgrid-scale parameterizations may be tested adequately through AMIP-type simulations and that coupled feedbacks with the ocean may not be of first-order significance in this regard.
We also conducted an analysis of the interannual variability of the cloud fields in the coupled CMIP3 and CMIP5 simulations. The CMIP5 models showed improvements, particularly in LWP in the midlatitudes and the subtropical west Pacific and Indian Ocean compared with CMIP3. The CMIP5 versus CMIP3 differences in the statistics of interannual variability of SCF and LCF are quite modest, although a systematic overestimation in interannual variability of CA in CMIP3 is slightly improved over the continents in CMIP5.
Our analysis showed that there is still a wide intermodel spread, particularly in the modeled LWP and CA. There is generally only very modest improvement in the simulated cloud climatology in CMIP5 compared with CMIP3. The better performance of the models in reproducing observed annual mean SCF and LCF therefore suggests that this good agreement is mainly a result of careful model tuning rather than of an accurate fundamental representation of cloud processes in the models. Simulating clouds in comprehensive climate models remains a very challenging problem, and, unfortunately, the pace of improvement in recent years has been modest at best. An important issue not addressed by this study is cloud ice. Cloud ice contributes to the total cloud amount as well as to the cloud radiative forcing. Biases in LWP and ice water path (IWP) can partly compensate for each other, potentially masking further problems in the simulated cloud climatology. A closer look at the models’ capability to reproduce observed cloud ice, and at the contribution of biases in cloud liquid and cloud ice to deficiencies in the simulated climatology of cloud amount and cloud radiative forcing, is the logical next step in understanding which cloud processes need particular attention for future model improvements.
This research was supported by the Japan Agency for Marine-Earth Science and Technology (JAMSTEC), by NASA through Grant NNX07AG53G, and by NOAA through Grant NA09OAR4320075, which sponsored research at the International Pacific Research Center. We acknowledge the World Climate Research Programme’s Working Group on Coupled Modelling, which is responsible for CMIP, and we thank the climate modeling groups (listed in Tables 1–3 of this paper) for producing and making available their model output. For CMIP, the U.S. Department of Energy’s Program for Climate Model Diagnosis and Intercomparison provides coordinating support and led development of software infrastructure in partnership with the Global Organization for Earth System Science Portals. We thank the journal reviewers for valuable comments on the manuscript.