The authors present an observationally based evaluation of the vertically resolved cloud ice water content (CIWC) and vertically integrated cloud ice water path (CIWP) as well as radiative shortwave flux downward at the surface (RSDS), reflected shortwave (RSUT), and radiative longwave flux upward at top of atmosphere (RLUT) of present-day global climate models (GCMs), notably twentieth-century simulations from the fifth phase of the Coupled Model Intercomparison Project (CMIP5), and compare these results to those of the third phase of the Coupled Model Intercomparison Project (CMIP3) and two recent reanalyses. Three different CloudSat and/or Cloud–Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) combined ice water products and two methods are used to remove the contribution from the convective core ice mass and/or precipitating cloud hydrometeors with variable sizes and falling speeds so that a robust observational estimate can be obtained for model evaluations.
The results show that, for annual mean CIWC and CIWP, models differ from observations by factors of 2–10 (either over- or underestimates) for a majority of the GCMs and for a number of regions. Most of the GCMs in CMIP3 and CMIP5 significantly underestimate the total ice water mass because models only consider suspended cloud mass, ignoring falling and convective core cloud mass. For the annual means of RSDS, RLUT, and RSUT, a majority of the models have significant regional biases ranging from −30 to 30 W m−2. Based on these biases in the annual means, there is virtually no progress in the simulation fidelity of RSDS, RLUT, and RSUT fluxes from CMIP3 to CMIP5, even though the bias in global annual mean CIWP is reduced by about 50% from CMIP3 to CMIP5. It is concluded that at least a part of these persistent biases stems from the common GCM practice of ignoring the effects of precipitating and/or convective core ice and liquid in their radiation calculations.
Representing atmospheric convection, precipitating/nonprecipitating clouds, and their multiscale organization as well as their radiation interaction in GCMs remains a pressing challenge to reduce and quantify uncertainties associated with climate change projections (Randall et al. 2007; Stephens 2005). Atmospheric radiative structures, such as fluxes and the vertical/horizontal distributions of heating, are among the most important factors determining global weather and climate. In particular, clouds can exert a strong influence on these radiative structures and on the regional radiative balance by reflecting shortwave (SW) radiation back to space and by trapping longwave (LW) radiation and radiating it back to the surface, providing one of the strongest feedbacks in the climate system. The balance of these fluxes is essential for understanding Earth’s climate system and constraining the energy balance for climate models (Stephens 2005).
Global constraints and information for developing and evaluating clouds and radiation in GCM simulations were typically derived from cloud cover observations from the International Satellite Cloud Climatology Project (ISCCP) and related products (e.g., Han et al. 1999; Rossow and Zhang 1995; Rossow and Schiffer 1999) and from radiation budget observations from the Earth Radiation Budget Experiment/Clouds and the Earth’s Radiant Energy System (ERBE/CERES) (Wielicki et al. 1996). In the last decade, the first satellite simulator was available for the ISCCP (Rossow and Schiffer 1999) to serve for evaluation and intercomparison of climate model clouds (e.g., Norris and Weaver 2001; Lin and Zhang 2004; Zhang et al. 2005; Schmidt et al. 2006; Cole et al. 2011; Kay et al. 2012). More recently, the Cloud Feedback Model Intercomparison Project (CFMIP) (e.g., Bony et al. 2011) has been coordinating development of the CFMIP Observation Simulator Package (COSP) and includes a number of new satellite observations from the Multiangle Imaging SpectroRadiometer (MISR), Moderate Resolution Imaging Spectroradiometer (MODIS), CALIPSO, and Polarization and Anisotropy of Reflectances for Atmospheric Sciences coupled with Observations from a Lidar (PARASOL). COSP has been used widely to understand and quantify climate model cloud biases (e.g., Chepfer et al. 2008; Bodas-Salcedo et al. 2008, 2011; Zhang et al. 2010; Kay et al. 2012; Kodama et al. 2012; Nam and Quaas 2012).
A key step of obtaining an accurate top of atmosphere (TOA) and surface radiation budget is the representation of clouds; for GCMs, TOA balance is often achieved by tuning models’ TOA radiative fluxes toward observations through quantities such as cloud cover, cloud particle effective radius, and cloud mass, which have been largely unconstrained because of the lack of observations for cloud water mass and particle size. This is especially the case for the vertical structure of cloud water mass, which leaves too many degrees of freedom unconstrained. The recent availability of the first tropospheric vertically resolved cloud radar reflectivity and derived ice/liquid profiles from CloudSat (Austin et al. 2009), combined with CALIPSO (Deng et al. 2010, 2013; Delanoë and Hogan 2008, 2010), provides new means for global cloud mass evaluation (e.g., Chepfer et al. 2008; Li et al. 2012, 2013; Waliser et al. 2009; Chen et al. 2011; Jiang et al. 2012; Gettelman et al. 2010; Klein and Jakob 1999; Webb et al. 2001; Delanoë and Hogan 2008, 2010; Delanoë et al. 2011; Bodas-Salcedo et al. 2008, 2011; Zhang et al. 2010; Kay et al. 2012; Kodama et al. 2012). Among those exploring this issue, Li et al. (2011, 2012, 2013) and Waliser et al. (2009, 2011) point out that considerable care and caution are required in order to make judicious comparisons/interpretations regarding atmospheric liquid/ice and its associated interactions with radiation. This is because most GCMs typically only represent the “suspended” hydrometeors associated with some/most clouds, while satellite observations include both clouds and falling hydrometeors (e.g., rain or snow) as well as convective core cloud mass, as illustrated in Fig. 13-1. Note that the observations from sensors such as the CloudSat radar and the CERES instruments are sensitive to a broader range of particles for ice/liquid water mass, including clouds, falling snow/rain, and convective core water mass.
In contrast, most GCMs, including all CMIP3 models and most of the models in the CMIP5, only model the radiation impacts from the cloud-related hydrometeors and, in some cases, not even all the clouds (e.g., deep convection). Given that most models from CMIP3 and CMIP5, for example, significantly underestimate (or do not explicitly model all) the total water mass, this may result in possible biases in the radiation fields. An observation-based modeling study by Waliser et al. (2011) led to the hypothesis that the typical practice of ignoring the impacts of precipitating hydrometeors would account for at least a portion of this systematic bias. As the persistence of this practice continues, it is worth examining if the same systematic bias might be evident in CMIP5.
In this chapter, we highlight the recent evaluations of the model representations of cloud ice water content (CIWC) and cloud ice water path (CIWP) from a number of studies performed in recent years on cloud ice (e.g., Li et al. 2005, 2007, 2008, 2012; Waliser et al. 2009; Chen et al. 2011) and the radiation budget (e.g., Li et al. 2013). This includes a measure of observational uncertainty and an illustrative, quantitative set of evaluation diagnostics for assessing how the fidelity of the models may have changed between CMIP3 and CMIP5. We then discuss systematic radiation budget biases in CMIP3 and CMIP5, in particular, by evaluating the radiative shortwave flux downward at the surface (RSDS), the reflected shortwave flux at TOA (RSUT), and the radiative longwave flux upward at TOA (RLUT), and by examining their biases in light of the ice water biases.
The model simulations considered in this study are from twentieth-century CMIP3 and CMIP5 simulations as well as the NASA Goddard Earth Observing System version 5 (GEOS5) AGCM with prescribed sea surface temperatures (SSTs) and Modern-Era Retrospective Analysis for Research and Applications (MERRA) data if available. Observation-based reference data for the TOA fluxes are derived from contemporary satellite radiation measurements, while surface fluxes are derived from satellite-constrained calculations using a radiative transfer model.
In section 2, we describe the observational resources for cloud water (IWC/IWP), including the way the different retrievals and other methodologies are combined to form a robust observational estimate with some quantitative information on uncertainty, as well as the observed and derived radiative fluxes at TOA and at the surface, respectively. In section 3, we briefly describe the models and reanalysis datasets utilized in this evaluation study. In section 4, we illustrate and discuss the results of our model evaluation. Section 5 summarizes and draws conclusions.
2. Observed cloud water and radiation
a. Observed estimates of IWC and IWP
The A-Train constellation of satellites, which includes CloudSat and CALIPSO flying only 15 s apart, provides a global view of the vertical structure of clouds, including cloud condensate, such as IWC. CloudSat provides vertical profiles of radar reflectivity measured by a 94-GHz cloud profiling radar (CPR) with a minimum sensitivity of ~−30 dBZ. The profiles extend between the surface and 30-km altitude with a vertical resolution of 240 m and have a footprint of about 2.5 km along track and 1.4 km across track. The CALIPSO lidar measures parallel and perpendicular backscattered laser energy at 532 nm and total backscattering at 1064 nm at altitude-dependent vertical resolutions and footprints (75 m vertically with about a 0.3-km along-track footprint above 8.2 km and 30 m vertically with about a 1.0-km along-track footprint below 8.2 km). To date, a series of retrieval algorithms using either CloudSat radar or CALIPSO lidar or both provide global retrievals of IWC, effective radius (Re), and the extinction coefficient from the thinnest cirrus (seen only by the lidar) to the thickest ice cloud (Austin and Stephens 2001; Hogan 2006; Delanoë and Hogan 2008, 2010; Mace et al. 2009; Young and Vaughan 2009; Sassen et al. 2009; Deng et al. 2013; Stein et al. 2011).
There are three IWC and IWP products retrieved from three different algorithms available from CloudSat CPR data combined with other satellites’ data that can be used to help account for observational uncertainty. They are as follows:
2B-CWC-RO4 (Austin et al. 2009): The CloudSat Science Team level-2B radar-only cloud water content product (2B-CWC-RO4) provides estimates of IWC and Re using measured radar reflectivity from CloudSat 2B-GEOPROF to constrain the retrieved IWC. The retrieved IWC profiles are obtained by assuming constant ice particle density with a spherical shape and a lognormal particle-size distribution (PSD). An a priori PSD is specified based on its temperature dependencies obtained from European Centre for Medium-Range Weather Forecasts (ECMWF) operational analyses. The cloud water contents for both liquid and ice phases are retrieved for all heights using separate assumptions. Then a composite profile is created by using the retrieved ice properties at temperatures colder than −20°C, the retrieved liquid water content at temperatures warmer than 0°C, and a linear combination of the two in intermediate temperatures. This reduces the total IWC as the temperature approaches 0°C. The sensitivity and uncertainty of this retrieval algorithm are discussed in Austin et al. (2009). The time period of this dataset is from January 2007 to December 2010. The vertical and horizontal resolutions are the same as the CloudSat instrument discussed above.
DARDAR (Hogan 2006; Delanoë and Hogan 2008, 2010): DARDAR is a synergistic ice cloud retrieval product derived from the combination of the CloudSat radar and CALIPSO lidar using a variational method for retrieving profiles of the extinction coefficient, IWC, and Re of the ice cloud. DARDAR assumes a unified PSD given by Field et al. (2005). The mass-size and area-size relations of nonspherical particles are considered using in situ measurements (Brown and Francis 1995; Francis et al. 1998; Delanoë et al. 2011; Stein et al. 2011). For DARDAR, CALIPSO backscatter and temperature were used to find supercooled water in the 0° to −40°C range, while the depolarization is too noisy to use at the CALIPSO resolution (Delanoë and Hogan 2010). The time period of this dataset is from July 2006 to June 2009.
2C-ICE (Deng et al. 2013): Similar to DARDAR, the CloudSat level-2C ice cloud property product (2C-ICE) is a synergistic ice cloud retrieval derived from the combination of the CloudSat radar and CALIPSO lidar using a variational method for retrieving profiles of the extinction coefficient, IWC, and Re in ice clouds. The CALIPSO attenuated backscattering coefficients are collocated to the CloudSat vertical and horizontal resolutions. The ice cloud microphysical model assumes a first-order Gamma particle-size distribution of idealized nonspherical ice crystals (Yang et al. 2000). The Mie scattering of radar reflectivity is calculated in a forward model lookup table according to a discrete dipole approximation calculation (Hong 2007). The 2C-ICE cloud identification is provided by the CloudSat cloud classification (CLDCLASS)-lidar product, which takes advantage of CALIPSO lidar backscatter (sensitive to water clouds), lidar depolarization (sensitive to nonspherical ice particles), and CloudSat radar (sensitive to large ice particles). Readers desiring a more in-depth description of the 2C-ICE algorithm should refer to Deng et al. (2010) for details. The time period of this dataset is from January 2007 to December 2008.
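The temperature-based compositing described for 2B-CWC-RO4 above can be sketched as follows. This is a minimal illustration and not the operational algorithm: the function names are hypothetical, and the weights are assumed to vary linearly with temperature between −20°C and 0°C, which is one reading of "a linear combination of the two."

```python
import numpy as np


def composite_ice_fraction(temp_c, t_ice=-20.0, t_liq=0.0):
    """Weight given to the ice retrieval at each level.

    1 at temperatures colder than t_ice, 0 at temperatures warmer than
    t_liq, and (by assumption) linear in temperature in between.
    """
    temp_c = np.asarray(temp_c, dtype=float)
    frac = (t_liq - temp_c) / (t_liq - t_ice)
    return np.clip(frac, 0.0, 1.0)


def composite_profile(iwc_ice_only, lwc_liquid_only, temp_c):
    """Blend separately retrieved ice and liquid profiles by temperature."""
    f_ice = composite_ice_fraction(temp_c)
    iwc = f_ice * np.asarray(iwc_ice_only, dtype=float)
    lwc = (1.0 - f_ice) * np.asarray(lwc_liquid_only, dtype=float)
    return iwc, lwc
```

Under this assumption the ice weight at −10°C is 0.5, which illustrates why the composite IWC decreases as the temperature approaches 0°C.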
There is one important aspect to keep in mind regarding model and observation compatibility. All three products, to first order, represent total tropospheric ice, including “floating” ice and the precipitating cloud hydrometeors with variable size and falling speed, as the measurements are sensitive to a wide range of particle sizes, including small (quasi-suspended/cloud) particles and large (falling/precipitating) particles. The latter, including those particles associated with convective clouds, are generally not included as prognostic variables (see Fig. 13-1) in most GCMs (e.g., Li et al. 2008, 2011, 2012, 2013; Waliser et al. 2009). It is generally assumed that convective core areas in a GCM grid box are small for a GCM with a gridbox size that is commonly larger than 100 km2. Thus, their contribution to total water content is not very large. Even though they are prognostically determined, the relative contribution does not change. However, the gridbox resolution in most current state-of-the-art GCMs is much higher, with gridbox sizes of 100 km2 or less, so that the IWC contribution from the convective core should be considered. Thus, for a meaningful model–observation comparison between the satellite-estimated and model-simulated IWC, an estimate of the convective/precipitating ice mass needs to be removed from the satellite-derived IWC/IWP values.
Two independent approaches to distinguish ice mass associated with clouds from ice mass associated with precipitation and convection follow:
FLAG method (Li et al. 2008, 2012; Waliser et al. 2009): All the retrievals in any profile that are flagged as precipitating at the surface and any retrieval within the profile whose cloud type is classified as deep convection or cumulus (from CloudSat 2B-CLDCLASS data) are excluded. By excluding these portions of the ice mass, an estimate of the cloud-only portion of the IWP/IWC (CIWP/CIWC) is obtained. This methodology of estimating CIWP/CIWC was used in model–data comparisons in many studies (e.g., Li et al. 2008, 2012; Waliser et al. 2009; Chen et al. 2011; Gettelman et al. 2010; Song et al. 2012; Donner et al. 2011; Ma et al. 2012).
PSD method (Chen et al. 2011): This method uses the ice PSD parameters associated with each CloudSat retrieval to separate the total IWC into mass with particle sizes smaller and larger than a selected particle-size threshold. Based on the analysis in Chen et al. (2011) and references therein, the size separation of cloud ice and precipitating ice on a global mean basis likely falls between 100 and 200 μm in diameter. A threshold of 150 μm is chosen for the present study, and the integrated mass of particles with diameter smaller than this size is considered representative of the CIWC/CIWP. In this case, such estimates are based on a quantitative, microphysical characterization (i.e., PSD) regardless of the presence of surface precipitation or cloud type; thus, the vertical distributions of cloud ice versus precipitating ice mass can be derived from each CloudSat profile. The CIWC derived by this method has been shown to agree well with estimates based on the FLAG method (Li et al. 2012), and these CIWC estimates have been applied to evaluate the atmospheric ice in the ECMWF IFS and the NASA Goddard Multiscale Modeling Framework (fvMMF) GCM (Chen et al. 2011).
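The two filtering approaches above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' processing code: the function names are hypothetical, the cloud-type codes are assumed values that should be checked against the 2B-CLDCLASS documentation, excluded retrievals are marked as missing (whether they are treated as missing or as zero when averaging is a separate choice), and the PSD is assumed to be supplied as per-bin number concentrations with particle mass proportional to a power of diameter (exponent 3 for constant-density spheres).

```python
import numpy as np

# Assumed cloud-type codes (verify against the 2B-CLDCLASS product documentation).
CUMULUS, DEEP_CONVECTION = 6, 8


def flag_filter(iwc, precip_at_surface, cloud_type,
                excluded_types=(CUMULUS, DEEP_CONVECTION)):
    """FLAG-style filter: drop precipitating profiles and convective retrievals.

    iwc, cloud_type: arrays of shape (n_profiles, n_levels)
    precip_at_surface: boolean array of shape (n_profiles,)
    Returns IWC with excluded retrievals set to NaN.
    """
    iwc = np.asarray(iwc, dtype=float)
    exclude = np.isin(cloud_type, excluded_types)
    # A profile flagged as precipitating at the surface is excluded at all levels.
    exclude |= np.asarray(precip_at_surface, dtype=bool)[:, None]
    return np.where(exclude, np.nan, iwc)


def psd_cloud_mass_fraction(diam_um, number_conc, threshold_um=150.0,
                            mass_exponent=3.0):
    """PSD-style split: fraction of ice mass in particles below the threshold.

    diam_um: bin-center diameters (micrometers)
    number_conc: number concentration in each bin
    """
    diam_um = np.asarray(diam_um, dtype=float)
    mass = np.asarray(number_conc, dtype=float) * diam_um ** mass_exponent
    total = mass.sum()
    if total == 0.0:
        return np.nan
    return mass[diam_um < threshold_um].sum() / total
```

Multiplying the total IWC of a retrieval by the returned fraction would give a PSD-based CIWC estimate for that retrieval, which does not depend on surface precipitation flags or cloud type.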
It should be underscored that, with present satellite/retrieval technology, it is not possible to absolutely separate floating/cloudy forms of ice from falling/precipitating forms, yet models often try to make this distinction. Specific retrievals of this sort will require collocated vertical velocity information, such as from a Doppler radar capability, and/or multiple frequency radar to better characterize particle size.
To account for observational uncertainty, Li et al. (2012) produce four different estimates of cloud ice (i.e., CIWP/CIWC) from the three retrieval products and two precipitation/convection filtering methods described above. These include the FLAG method applied to all three of the retrieval products as well as the PSD method applied to the 2B-CWC-RO4 product. The ensemble mean of these four estimates is used as the “observed” or “reference” value (herein referred to as such), and the spread between the four estimates can be used as a measure of observational uncertainty.
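As a minimal sketch of how such a reference and its uncertainty could be computed, assuming the four estimates have already been placed on a common grid: the function name is hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) as the measure of spread is an assumption.

```python
import numpy as np


def ensemble_reference(estimates):
    """Ensemble mean and spread across observational estimates.

    estimates: array-like of shape (n_estimates, nlat, nlon), with NaN
    marking grid cells that an estimate does not cover.
    Returns (mean, spread), each of shape (nlat, nlon).
    """
    stack = np.asarray(estimates, dtype=float)
    mean = np.nanmean(stack, axis=0)    # the "observed" or "reference" value
    spread = np.nanstd(stack, axis=0)   # population standard deviation (ddof=0)
    return mean, spread
```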
Figure 13-2 shows multiple annual mean maps of IWP quantities associated with our observational estimates. The four columns represent estimates of total ice (TIWP; Figs. 13-2a–d), precipitating and convective ice (PCIWP; Figs. 13-2e–h), CIWP (Figs. 13-2i–l), and ensemble information (Figs. 13-2m–q). Overall, it is evident that the cloud ice (Figs. 13-2i–l) represents a smaller contribution to the total ice mass (Figs. 13-2a–d) than the precipitating/convective contribution (Figs. 13-2e–h), ranging from 10% to 30% depending on the product and location. It accounts for a smaller contribution in the two radar + lidar products (Figs. 13-2b,f,j,n and Figs. 13-2c,g,k,p) and in the tropics and storm-track regions in all products. In general, the CIWP estimates agree relatively well, typically within a factor of two, and most of the differences can be explained by the different microphysical assumptions (Deng et al. 2013; Delanoë et al. 2011). Differences in the mass-size and area-size relations and in the cloud occurrence identification criteria between CloudSat 2C-ICE and DARDAR contribute to some subtle differences between those two datasets (Deng et al. 2013). Figures 13-2m and 13-2n show the ensemble mean and standard deviation of the three observational estimates of TIWP (Figs. 13-2a–c), while the same is shown in Figs. 13-2o and 13-2p for the four estimates of CIWP. It is the latter two maps, which are based on the four individual estimates in the third column of Fig. 13-2, that represent the observational basis for CIWP and the GCM evaluations in this study.
Figure 13-3 is similar to Fig. 13-2, but displays the data as zonally averaged annual mean IWC as a function of height, rather than as vertically integrated IWP. The general commonalities and differences between the products and different filtering methods are the same as described above for CIWP and Fig. 13-2. In addition, the overall vertical structure of IWC in each product exhibits three local maxima: one is in the tropics near 300 hPa and is associated with deep convection, and the other two are at 800 and 700 hPa in the midlatitudes of the Southern and Northern Hemispheres, respectively, and correspond to the storm tracks. In general, these maxima tend to be lower in the two radar + lidar products (Figs. 13-3b,f,j,n and Figs. 13-3c,g,k,o) and higher in the radar-only products (Figs. 13-3a,e,i,m and Figs. 13-3d,h,l,p). Moreover, there is a tendency for the maxima to be higher in the CIWC profiles (Figs. 13-3i–l) compared to the PCIWC profiles (Figs. 13-3e–h), particularly in the midlatitudes, which has some intuitive merit, as the larger, precipitating particles with larger falling velocities are found at lower altitudes within the cloud(s). As with Fig. 13-2, the four estimates of CIWC (Figs. 13-3i–l) and their ensemble information (Figs. 13-3o,p) are used to evaluate the model/reanalysis representations of CIWC. The information in Figs. 13-3m and 13-3n can be used to compare with two GCMs [i.e., the GFDL atmospheric general circulation model (AM3) and CM3] examined in this study that provide outputs of TIWC/TIWP (for details, see section 3).
b. Radiation data sources
The most up-to-date source of the RLUT and RSUT fluxes is the CERES Energy Balanced and Filled (EBAF) product (CERES_EBAF-TOA_Ed2.6r) (Loeb et al. 2009, 2012). The CERES EBAF product includes the latest instrument calibration improvements, algorithm enhancements, and other updates. CERES TOA SW and LW fluxes in the EBAF product are adjusted within their range of uncertainty to remove the inconsistency between average global net TOA flux and heat storage in the earth–atmosphere system, as determined primarily from ocean heat content anomaly (OHCA) data [see supplementary information in Loeb et al. (2012) for more details]. Note that the RLUT and RSUT fluxes used for direct model evaluation are from CERES EBAF; they are directly measured and then adjusted to balance the global energy budget.
RSDS is taken from the EBAF-Surface product. This surface flux radiation product is constrained by TOA CERES–derived flux with EBAF adjustments (Kato et al. 2010, 2011, 2012, 2013) and is based on two CERES data products. Edition 3-lite SYN1deg-Month provides computed irradiances to be adjusted, and EBAF Ed2.6r (Loeb et al. 2009, 2012) provides the constraint. In addition, temperature and humidity profiles used in the computations are from the GEOS Data Assimilation System reanalysis (GEOS-4 and 5). MODIS-derived cloud properties (Minnis et al. 2011) are combined with geostationary satellite (GEO)-derived cloud properties (Minnis et al. 1994) to resolve the diurnal cycle used in the SYN1deg-Month flux computations. Note that, unlike TOA irradiances, the global estimate of irradiance at the surface is only possible by using a radiative transfer model. The errors in cloud and atmospheric properties used as inputs, therefore, directly affect the accuracy and stability of modeled surface irradiances (Kato et al. 2013). In addition, model-computed TOA irradiances do not necessarily agree with CERES-derived observed TOA irradiances. Therefore, to mitigate these problems, computed fluxes in the EBAF-Surface product are constrained to be consistent with CERES-derived observed TOA fluxes to within their uncertainties. In the constraining process, CloudSat radar- and CALIPSO lidar-derived cloud vertical profiles (Kato et al. 2010), as well as Atmospheric Infrared Sounder (AIRS)-derived temperature and humidity profiles, are used to determine the uncertainty in cloud and atmospheric properties.
Figure 13-4a shows the annual mean map of RSDS estimated from EBAF-Surface, while Fig. 13-4b shows the annual mean map of RLUT (W m−2) from CERES EBAF. Shown in Fig. 13-4c is an annual mean map of RSUT from CERES EBAF. The data are monthly means collected from January 2000 to December 2010. For more details on the observed radiation reference datasets and their uncertainties, readers are referred to Li et al. (2013) and Kato et al. (2012, 2013) for RSDS and Loeb et al. (2012) for RSUT and RLUT.
3. Modeled values of cloud water and radiative fluxes
Using the observations described in the previous section, we present results from the evaluation of the CIWP/CIWC and radiative fluxes in Li et al. (2013) in reanalysis datasets, including ECMWF [ERA-Interim; Dee et al. (2011)] and NASA MERRA, coupled atmosphere–ocean GCMs (CGCMs) from the CMIP3 (for CIWP only), CGCMs from CMIP5, and two additional state-of-the-art GCMs, including the University of California, Los Angeles, (UCLA) CGCM (Ma et al. 2012) and the NASA GEOS5 GCM. The CMIP3 simulations are the same as those described in Waliser et al. (2009). Note that CMIP3 model output did not include CIWC. The models used for CMIP5 simulations are listed in Table 13-1. Outlines of the cloud microphysics parameterizations and SW/LW radiation parameterizations used in the selected CMIP5 models, as well as in the UCLA CGCM, GEOS5, MERRA, and ERA-Interim models, are available in Li et al. (2012) and Li et al. (2013).
Unlike all other models examined, which do not include ice mass from convective-type clouds in their CIWC, the two GFDL models include grid means over shallow cumulus, deep cumulus cells, and convective mesoscale clouds, weighted by their respective area fractions. In the GFDL CM3, however, precipitating ice that has fallen out of large-scale stratiform clouds and into clear areas is not included. Thus, the GFDL models should be considered somewhat carefully with respect to the others, as they include cloud mass from clouds whose contributions have typically been ignored, and their IWC/IWP fields would be more commensurate with TIWC/TIWP. In the CSIRO model, diagnostic falling precipitation is considered, while cloud hydrometeors from convective-type clouds are not included. Thus, the CSIRO model should be considered as lying somewhere between the cloud-only and total ice water content/path. For both the GCM and observational datasets, all fields have been regridded to 40 levels (with a constant pressure interval of 25 hPa) and mapped onto common 8° × 4° longitude by latitude grids.
The specific experimental scenario in CMIP5 is the historical twentieth-century simulation, which used observed twentieth-century greenhouse gas, ozone, aerosol, and solar forcing. The time period used for the long-term mean is 1970–2005, and if a model provided an ensemble of simulations, only one of them was chosen for this evaluation. For both the GCM and observational datasets, all fields have been regridded and mapped onto common 2° × 2° latitude by longitude grids for radiative fluxes and onto common 8° × 4° longitude by latitude grids for IWC/IWP because of the narrow track of CloudSat–CALIPSO.
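The common grids described above can be illustrated with a short sketch. It is not the authors' regridding procedure: simple box averaging of along-track samples is an assumption, the 1000–25-hPa bounds of the 40 levels are an assumption (only the level count and the 25-hPa interval are given), and the names are hypothetical.

```python
import numpy as np

# 40 pressure levels at a constant 25-hPa interval (bounds assumed: 1000 to 25 hPa).
PRESSURE_LEVELS_HPA = np.arange(1000.0, 0.0, -25.0)


def bin_average(lon, lat, values, dlon=8.0, dlat=4.0):
    """Box-average scattered samples onto a regular longitude-latitude grid.

    lon: sample longitudes in [0, 360); lat: sample latitudes in [-90, 90]
    Returns an array of shape (360/dlon, 180/dlat); cells with no samples are NaN.
    """
    lon_edges = np.arange(0.0, 360.0 + dlon, dlon)
    lat_edges = np.arange(-90.0, 90.0 + dlat, dlat)
    bins = [lon_edges, lat_edges]
    total, _, _ = np.histogram2d(lon, lat, bins=bins, weights=values)
    count, _, _ = np.histogram2d(lon, lat, bins=bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)
```

With a narrow satellite track, a coarser grid box collects more samples per cell, which is the motivation given above for using the 8° × 4° grid for IWC/IWP.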
4. Characterizing and understanding cloud ice and radiation budget biases
a. Biases of modeled IWC and IWP
Figure 13-5 shows the long-term annual mean spatial distributions of simulated values of CIWP from the 15 CMIP5 CGCMs (see Table 13-1), the multimodel ensemble mean from the 15 CMIP5 models (Fig. 13-5p), GEOS5 (Fig. 13-5s), UCLA CGCM (Fig. 13-5t), and the two analyses ERA-Interim (Fig. 13-5u) and MERRA (Fig. 13-5v), as well as the ensemble mean (Fig. 13-5y) and standard deviation (Fig. 13-5z) of the four observed estimates of CIWP discussed above. Overall, the multimodel mean CMIP5 CIWP values are spatially similar to observations but nonetheless are biased high. Individually, most models tend to qualitatively capture the global and regional CIWP patterns. This includes the relatively high values of CIWP in the intertropical convergence zone (ITCZ), South Pacific convergence zone (SPCZ), warm pool, and storm tracks from the subtropics to high latitudes and over convectively active continental areas over central Africa and South America. Note that the relative magnitudes between tropical and midlatitude values can be quite different across models (this will be more evident when discussing Fig. 13-8 below). About three of the CMIP5 models do a good job at representing both the observed patterns and magnitudes of CIWP (i.e., CNRM-CM5, CanESM2, MRI). A number of models, however, significantly (~factor of 2) underestimate tropical CIWP (i.e., NorESM, BCC, BCC_CSM1, CCSM4) and two severely (~factor of 10) underestimate CIWP (i.e., INM-CM4.0 and INM-CM4.0-ESM). The two GISS GCMs greatly overestimate (~factor of 5) tropical CIWP. The IPSL, CSIRO, MIROC5, MIROC4h, and the two GISS GCMs moderately overestimate CIWP in the extratropics. For the non-CMIP5 GCMs, the GEOS5 AGCM significantly underestimates (~factor of 3) CIWP in the storm tracks, while the UCLA CGCM does remarkably well over most of the globe. 
The two reanalyses, ECMWF and MERRA, show relatively good CIWP patterns and magnitudes, with MERRA being biased a bit low in midlatitudes, which is not surprising given that the base model (GEOS5) exhibits such a strong negative bias. While the above model–observation differences are still substantial in many regards, it is worth noting that the ensemble of CMIP5 CIWP values examined here appears to exhibit improvement compared to the ensemble CMIP3 models evaluated in our previous study (Li et al. 2012, appendix Fig. A1); this will be discussed and quantified further below. The two GFDL models (Figs. 13-5q,r) that simulate and provide output on TIWC each exhibit fairly good TIWP in the tropical ITCZ, warm pool, and convectively active continental regions but significantly underestimate TIWP in the extratropical storm-track regions compared to the ensemble mean TIWP shown in Fig. 13-5w.
To further quantify and synthesize the comparative information discussed above, we can use a Taylor diagram (Taylor 2001) to summarize both the spatial pattern correlation and the standard deviation of CIWP among the CMIP5 CGCMs, including their multimodel mean, two analyses, three other GCMs, and four observed CIWP estimates. The ensemble mean of the latter is used as the reference dataset, and their spread is used to help quantify observational uncertainty. The Taylor diagram relates three statistical measures of model fidelity: the centered root-mean-square error, the spatial correlation, and the spatial standard deviations. These statistics are calculated for the long-term time mean and over the global domain (area weighted). The reference dataset is plotted along the x axis at the value 1.0.
Figure 13-6 shows Taylor diagrams for CMIP3 (Fig. 13-6a) and CMIP5 (Fig. 13-6b). The observed estimates are plotted in blue, the CMIP GCMs in red, their ensemble means in green, and the reanalyses and non-CMIP GCMs in black. The red, roughly rectangular region illustrates a measure of observational uncertainty developed and shown in conjunction with Fig. 13-2. Not surprisingly, the four individual observed estimates, reanalyses, and AGCM simulations (i.e., specified SST; GEOS5 version 2.5) perform as a group considerably better than the CMIP coupled GCMs. The former all tend to have correlations at around 0.9 or better and standard deviation ratios of between about 0.8 and 1.5. For the CMIP values (red), most have correlations between about 0.4 and 0.7 with standard deviation ratios well above 1, with some well above 3 and even up to 5. The CMIP3 and CMIP5 multimodel means do not exhibit the best overall performance relative to the individual models because of the few strong outliers in the ensembles. Noteworthy, however, is that the CMIP3 and CMIP5 multimodel means (green) have correlations of 0.54 and 0.76 and standard deviation ratios of about 3.1 and 1.4, respectively, indicating a rather considerable performance improvement from CMIP3 to CMIP5 for representing CIWP. While this progress is encouraging, keep in mind also that all models shown still exhibit a very poor correlation against the reference dataset, with values less than 0.8, and none of the CMIP GCMs fall within the (red box) range of observational uncertainty. With regard to details of specific models’ performance, readers are referred to Li et al. (2012).
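The three Taylor-diagram statistics can be computed as in the following sketch. The function name is hypothetical, and the use of cos(latitude) area weights on a regular latitude-longitude grid is an assumption about how the area weighting is done; the standard deviation and centered RMSE are normalized by the reference standard deviation so that the reference plots at 1.0.

```python
import numpy as np


def taylor_statistics(model, reference, lat_deg):
    """Area-weighted Taylor statistics for two (nlat, nlon) fields.

    Returns the standard deviation ratio (model/reference), the spatial
    pattern correlation, and the centered RMSE normalized by the
    reference standard deviation.
    """
    model = np.asarray(model, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Assumed area weights: proportional to cos(latitude), normalized to sum to 1.
    w = np.cos(np.deg2rad(lat_deg))[:, None] * np.ones_like(reference)
    w = w / w.sum()
    m_anom = model - (w * model).sum()
    r_anom = reference - (w * reference).sum()
    std_m = np.sqrt((w * m_anom ** 2).sum())
    std_r = np.sqrt((w * r_anom ** 2).sum())
    corr = (w * m_anom * r_anom).sum() / (std_m * std_r)
    crmse = np.sqrt((w * (m_anom - r_anom) ** 2).sum())
    return {
        "std_ratio": std_m / std_r,
        "correlation": corr,
        "centered_rmse": crmse / std_r,
    }
```

The three returned quantities are not independent: the squared normalized centered RMSE equals std_ratio² + 1 − 2 · std_ratio · correlation, which is the geometric relation that allows all three to be read from a single diagram (Taylor 2001).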
Next, we present the fidelity of the models’ CIWC vertical structure. A comparison is given in Fig. 13-7 showing the CIWC zonal and annual mean values from the 13 CMIP5 CGCMs (note that the CNRM-CM5 CGCM CIWC is not available from the CMIP5 data portal at this time), the GEOS5 AGCM (Fig. 13-7q) and the UCLA CGCM (Fig. 13-7p), as well as the ERA-Interim (Fig. 13-7r) and MERRA (Fig. 13-7s). These models provide output specifically on cloud ice. The two GFDL GCMs, on the other hand, are shown in Figs. 13-7n and 13-7o and provide output for TIWC. Overall, there are significant disparities between the CMIP5 CGCMs and the observed ensemble mean (Fig. 13-7v), with overall discrepancies ranging from multiplicative factors of about 0.25 of the observations (i.e., INM-CM4.0) to factors of 10 (i.e., GISS GCMs). Moreover, the general character of their vertical distributions with respect to pressure levels is considerably different. For example, the IPSL exhibits significant overestimates of CIWC over the storm-track regions. Six of the CMIP5 models do a fair job at representing the vertical structure and magnitude of CIWC [i.e., CanESM2 (Fig. 13-7f), BCC_CSM1.1 ESM (Fig. 13-7g), NorESM1 (Fig. 13-7h), MIROC5 (Fig. 13-7k), MRI (Fig. 13-7l), and CCSM4 (Fig. 13-7m)]. The rest of the models [CSIRO (Fig. 13-7i), MIROC4h (Fig. 13-7j), and ERA-Interim (Fig. 13-7r)] generally tend to qualitatively capture the patterns but overestimate CIWC over midlatitudes and below 700 hPa. The GEOS5 model, on the other hand, tends to slightly overestimate CIWC in the tropics but significantly underestimates CIWC in the mid-to-high latitudes by about a factor of 2–3. The analyses from MERRA as well as the simulation from the UCLA CGCM show a realistic CIWC vertical structure with values close to the observed ensemble mean, albeit not extending as close to the surface as the observed estimate.
However, it is reasonable to exercise caution when considering the robustness of the observed values in these lower-tropospheric regions or anywhere below the freezing level, as there are artificial limitations applied to the retrievals that involve separating ice from liquid contributions. Compared to the observed TIWC (Fig. 13-7t), the two GFDL models both capture the ITCZ in tropical regions reasonably well but significantly underestimate TIWC in the extratropical storm track and polar regions. A realistic ITCZ is found in the GFDL uncoupled AGCM (Fig. 13-7o), while a more notable double ITCZ is evident in the GFDL CGCM (Fig. 13-7n). However, the higher values in the midtropospheric tropics in these models relative to the observed value might be due to an unrealistic amount of cloud ice for temperatures above freezing in the tropics.
b. Bias of modeled radiative fluxes
While most GCMs (including all CMIP3 and most CMIP5 models) typically represent only the suspended hydrometeors associated with some/most clouds, satellite observations include both clouds and falling hydrometeors (e.g., snow) as well as convective core cloud mass (see section 1 and Fig. 13-1). These GCMs, however, have typically tuned their radiation and cloud fields to the observations, which naturally are sensitive to all/most hydrometeors in the atmosphere, even though the models usually exclude precipitating hydrometeors and the ice and liquid in convective cores. Figure 13-8 presents CMIP3 (Fig. 13-8a) and CMIP5 (Fig. 13-8b) multimodel mean biases of TIWP (cloud + convective core + precipitating), where the observed estimate is the ensemble mean of three total ice water path estimates from the standard CloudSat, a version of DARDAR (Delanoë and Hogan 2008, 2010), and a version of 2C-ICE (Deng et al. 2013) satellite products. The modeled IWP values include only the contributions of suspended clouds, as that is all they typically represent, with no contribution from convective cores or precipitation. The observed IWP values are based on an ensemble of CloudSat + CALIPSO estimates (Li et al. 2012), which do include contributions from precipitation and all clouds, including convective cores. Figure 13-8 shows that both CMIP3 and CMIP5 models significantly underestimate ice mass over the ITCZ, the SPCZ, a part of the Southern Ocean and tropical continents, and the Indian monsoon regions. Thus, contributions from falling/precipitating hydrometeors are unaccounted for and/or erroneously accounted for by other processes, such as those in the radiation calculation and the hydrological cycle. The models are trying to achieve a radiative balance at TOA without representing all the ice mass in the atmosphere (e.g., Waliser et al. 2011; Li et al. 2013).
However, the observations from sensors such as the CloudSat radar and the CERES instruments are sensitive to a broader range of particles for ice/liquid water mass. It is not surprising, yet it is important to highlight, that this missing water mass will have some interaction with radiation. Here, we highlight the radiative biases for CMIP3/CMIP5 by presenting only the multimodel mean biases for each radiative flux. For each individual model’s performance in radiative fluxes at TOA and at the surface, readers are referred to Li et al. (2013).
Figure 13-9 illustrates the RSDS multimodel mean biases (Figs. 13-9a,b) and the multimodel mean standard deviation of the error (SDE; Figs. 13-9c,d) against the observed estimate. The SDE is defined as a root-mean-square error with the mean bias removed and is calculated across the models for each of the ensembles. Overall, CMIP3 RSDS (Fig. 13-9a) shows low negative biases that are fairly uniform globally, except over the ITCZ, off the coasts of Peru and California, and over the Indian monsoon regions. While the CMIP5 global area average bias (Fig. 13-9b) is reduced (~30%) to the value of 2.5 W m−2 from the value of −6.9 W m−2 in CMIP3, it exhibits more distinct spatial gradients with greater local extreme biases and a higher bias over most land regions. While the bias figure emphasizes the sign and magnitude of systematic biases across the two model archives, the SDE figure emphasizes errors irrespective of the sign. The similar pattern of the SDE between the two model ensembles indicates that CMIP3 and CMIP5 share many of the same systematic errors in simulating radiation fluxes. The fact that the magnitude of the SDE is about the same between CMIP3 and CMIP5 indicates that little improvement from CMIP3 has been made. The high SDE values in the equatorial regions of the Pacific and Atlantic (Figs. 13-9c,d), combined with the low bias in the same regions (Figs. 13-9a,b), indicate significant disparity in the manner in which models represent surface radiation in these regions. This is similarly true over the mountainous regions of Asia and South America and, to a lesser extent, over the storm tracks.
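The distinction between the multimodel mean bias and the SDE can be made concrete with a short Python sketch. It follows one reading of the definition above, in which the error spread is evaluated across models at each grid point; the function name and the array layout (model, latitude, longitude) are assumptions for illustration only.

```python
import numpy as np

def multimodel_bias_and_sde(models, observed):
    """Multimodel mean bias and standard deviation of the error (SDE).

    models   : array of shape (n_models, n_lat, n_lon), one time-mean map per model
    observed : array of shape (n_lat, n_lon), the observed estimate
    """
    # Error map of each model against the observed estimate
    errors = models - observed

    # Multimodel mean bias: signed error averaged across models
    mean_bias = errors.mean(axis=0)

    # SDE: root-mean-square error across models after removing the mean bias,
    # i.e., the spread of the individual model errors about their mean
    sde = np.sqrt(((errors - mean_bias) ** 2).mean(axis=0))
    return mean_bias, sde


# Two hypothetical models with opposite errors of 10 W m-2 at every grid point:
# the mean bias cancels to zero while the SDE remains 10 W m-2.
obs = np.full((2, 3), 200.0)
ens = np.stack([obs + 10.0, obs - 10.0])
bias, sde = multimodel_bias_and_sde(ens, obs)
```

Under this definition the squared root-mean-square error across models equals the squared mean bias plus the squared SDE, which is why a region can show a low multimodel mean bias and a high SDE at the same time.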
Figure 13-10 shows the multimodel performance of CMIP3 and CMIP5 in representing the time-mean pattern of RLUT in terms of the multimodel mean bias (Figs. 13-10a,b) and the multimodel mean SDE (Figs. 13-10c,d) against the observed estimate. Interestingly, both CMIP3 (Fig. 13-10a) and CMIP5 (Fig. 13-10b) exhibit very similar RLUT bias patterns and magnitudes. Notable is the consistency of having a high bias over the ITCZ, SPCZ, a part of the Southern Ocean and tropical continents, and the Indian monsoon regions. The CMIP5 global area average RLUT bias value of −1.9 W m−2 is about a factor of 2 smaller compared to the CMIP3 value of −3.5 W m−2. The SDE figure indicates a similar pattern of systematic errors in the tropics, with no substantial change in the global mean SDE between CMIP3 (9.8 W m−2) and CMIP5 (8.9 W m−2). The bias and SDE together indicate that the models are performing relatively well in the mid and high latitudes but exhibit significant shortcomings in the tropics. Given that all but one of these models are coupled, it is possible that some part of the SDE could result from spatial variations in the location of the ITCZ and associated cloud structure that have a substantive impact on the RLUT field.
Figure 13-11 illustrates the multimodel mean biases (Figs. 13-11a,b) and the SDE (Figs. 13-11c,d) of RSUT against the observed estimate. As with RSDS and RLUT, the RSUT SDE and bias in both CMIP3 and CMIP5 exhibit very similar patterns, and there is a clear systematic underestimation over the ITCZ, the SPCZ, and parts of the Southern Ocean, tropical continents, and the Indian monsoon regions. While the CMIP5 global area average RSUT bias is about a factor of 2 smaller (2.5 W m−2) compared to the CMIP3 value of 4.5 W m−2, the two ensembles share very similar pattern distributions in bias and SDE. The SDE figure indicates a slight improvement of CMIP5 (14.1 W m−2) over CMIP3 (14.7 W m−2).
To illustrate the effect of the underestimated cloud mass due to excluding the cloud mass from precipitation and convective cores in the conventional GCMs, we draw a conceptual sketch of cloud–precipitation–radiation interactions for the real world versus those for conventional GCMs in Fig. 13-12. The figure shows that the underestimated values of cloud mass in the CMIP3 and CMIP5 multimodel means might directly, and in part, lead to the overestimations of RSDS (Fig. 13-9) and RLUT (Fig. 13-10) and the underestimation of RSUT (Fig. 13-11) across the models and multimodel mean in the heavily precipitating regions. This conjecture is supported by Fig. 13-13, which shows that the latitude of the maximum zonal mean precipitation (180°–360°; 0°–15°N) is strongly correlated with the latitude of the maximum/minimum multimodel mean bias of RSDS, RLUT, and RSUT of the CMIP5 models. In other words, the RSDS, RLUT, and RSUT biases in CMIP3 and CMIP5 are likely attributable, at least in part, to ignoring the radiative impacts from the precipitating clouds and convective core water mass. As cloud–climate feedback will undoubtedly represent a key uncertainty in the next Intergovernmental Panel on Climate Change (IPCC) assessment report, it is essential that cloud and radiation observations be utilized to their full extent and in concert to provide more complete constraints and that clouds, convection, precipitation, and radiation be treated in a consistent manner, as shown in the left panel of Fig. 13-12.
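The kind of diagnostic behind such a comparison can be sketched in Python as follows. The function locates the latitude at which the zonal mean of a field, averaged over a longitude sector, reaches its maximum (e.g., precipitation or a positive flux bias) or minimum (e.g., a negative flux bias). The function name, its arguments, and the inclusive box bounds are hypothetical, and how the resulting latitudes are paired before correlating them is not specified here, so this is only one plausible building block rather than the procedure used for Fig. 13-13.

```python
import numpy as np

def latitude_of_zonal_extreme(field, lats, lons,
                              lon_range=(180.0, 360.0),
                              lat_range=(0.0, 15.0),
                              kind="max"):
    """Latitude of the zonal-mean maximum or minimum within a lon/lat box.

    field : 2D array of shape (n_lat, n_lon)
    lats  : 1D array of latitudes (degrees north)
    lons  : 1D array of longitudes (degrees east, 0 to 360)
    kind  : "max" or "min"
    """
    # Select the longitude sector and latitude band (bounds inclusive)
    lon_mask = (lons >= lon_range[0]) & (lons <= lon_range[1])
    lat_mask = (lats >= lat_range[0]) & (lats <= lat_range[1])

    # Zonal mean over the selected longitudes, one value per latitude
    zonal_mean = field[np.ix_(lat_mask, lon_mask)].mean(axis=1)

    index = zonal_mean.argmax() if kind == "max" else zonal_mean.argmin()
    return lats[lat_mask][index]
```

Given one such latitude for precipitation and one for a flux bias per model, the two sets of latitudes could then be compared with a standard correlation coefficient (e.g., `np.corrcoef`).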
5. Summary and discussion
In this study, with the missing falling particles/convective mass interactions with radiation in mind, we evaluate the representation of surface and TOA fluxes of atmospheric radiation in GCMs, namely CMIP GCMs, with the focus on the fluxes most strongly influenced by clouds (i.e., RSDS, RSUT, and RLUT) associated with heavily precipitating regions. This means that, apart from a general assessment of the fidelity of the models’ representation of radiation, we seek to relate the radiation biases to the practice of ignoring the interaction of radiation with precipitating and convective clouds, which is common in most CGCMs contributing to CMIP3 and CMIP5 [see cloud liquid and ice evaluations in Li et al. (2012)].
We first presented an evaluation of the representation of atmospheric ice and radiation in model simulations of present-day climatology, including those of CMIP5 and their comparison to those of CMIP3. Observational reference values and their uncertainty were addressed by using four different estimates of CIWC/CIWP that accounted for different approaches to the retrieval and to the methods of filtering out the contribution to the retrieved ice mass from large-particle/precipitating components, a contribution that is typically not represented in GCMs as a prognostic or column-resolved quantity (see section 2). The models evaluated included 15 simulations of present-day climate available to date in CMIP5 and two other GCMs of interest (GEOS-5 AGCM and UCLA CGCM). The evaluation also included two modern reanalyses (MERRA and ERA-Interim).
Overall, based on a number of diagnostics, there is a fairly wide disparity in the fidelity of CIWP/CIWC representations in the models examined. Even for the annual mean maps considered, the differences between observations and modeled values easily reach factors of 2, and nearly up to 10, for most of the GCMs and for a number of regions (Figs. 13-5–13-8). As expected, the two reanalyses examined performed relatively well compared to the group as a whole because of their use of observed SSTs and the incorporation of a wide array of constraining observations. This result is still notable, since the reanalyses do not assimilate cloud ice observations and thus rely on (parameterized) model physics to represent this quantity. However, even with the assimilation of many other/related quantities and the benefit of observed SSTs, neither MERRA’s nor ERA-Interim’s performance was within the level of uncertainty of the observations for both the standard deviation ratio and pattern correlation. Considering even these results alone and the remaining disparities between the observations and modeled values of CIWP, it is evident that while the models may be providing roughly the correct radiative energy budget at TOA, many are accomplishing it by means of unrealistic cloud characteristics, cloud ice mass at a minimum, which, in turn, likely indicates unrealistic cloud particle sizes and cloud cover.
Comparing the overall performance of CMIP3 and CMIP5 across a number of diagnostics, we find a significant and quantifiable improvement in the representation of CIWP, as shown by the reduction (by about 50% or more) in the multimodel mean bias of the annual mean maps of CIWP from CMIP3 to CMIP5 (Fig. 13-6). Note that the overall assessment of the improvement from CMIP3 to CMIP5 is done with different sets of models because the participating centers/models in CMIP3/CMIP5 are very different. Only six modeling centers participated in both CMIP3 and CMIP5, and the models from the same modeling center might be very different between CMIP3 and CMIP5.
The shortcomings of GCMs that ignore the cloud mass associated with precipitating hydrometeors (e.g., snow and rain) and convective core ice and liquid mass are clearly presented in the CMIP3 (Fig. 13-8a) and CMIP5 (Fig. 13-8b) multimodel mean biases of total ice water path (TIWP = cloud + convective core + precipitating). Figure 13-8 clearly shows that both CMIP3 and CMIP5 models significantly underestimate ice mass over the ITCZ, the SPCZ, a part of the Southern Ocean and tropical continents, and the Indian monsoon regions, all of which also tend to be high-precipitation regions. In addition, Fig. 13-8 suggests that the IWP biases against observed total IWP are significantly worse in CMIP5 models compared to CMIP3 models (in the multimodel means), yet the radiative fields are slightly improved overall in terms of the global area averages shown in Figs. 13-9–13-11.
Observational reference values and their uncertainties are addressed by using the EBAF-Surface product for RSDS and the CERES TOA radiative fluxes for RLUT and RSUT. Overall, there is a fairly wide disparity in the fidelity of RSDS, RLUT, and RSUT representations in the models examined. Even for the annual mean bias maps considered, there are local biases easily as high as 30 W m−2 and as low as −30 W m−2. Based on a number of diagnostics, there has been a small degree of improvement in the representation of RSDS, RLUT, and RSUT from CMIP3 to CMIP5. This is demonstrated, in terms of the global mean, by the reduction in the multimodel mean bias of RSDS (by about 30%), RLUT (about 50%), and RSUT (about 40%). In particular, the multimodel mean bias of RSDS has been reduced from CMIP3 (−6.9 W m−2) to CMIP5 (~2.5 W m−2). However, this is mostly because the positive biases over land became larger while the negative biases over the ocean remained about the same. In addition, an overall improvement in the representation of the quantities studied from CMIP3 to CMIP5 is not evident when considering the SDE computed across the models (i.e., Figs. 13-9–13-11).
Persistent and systematic spatial pattern biases appear across most of the models: the multimodel ensemble mean values are underestimated in RSUT and overestimated in RSDS and RLUT in the convectively active regions of the tropics (i.e., ITCZ/SPCZ, warm pool, Indian monsoon, South America, and central Africa; see Figs. 13-9–13-11). Given that a number of these RSDS, RLUT, and RSUT biases occur in conjunction with heavy precipitation and with biases in cloud liquid and ice (Li et al. 2012), we hypothesize that at least a part of these persistent radiation biases stems from GCMs ignoring the effects of precipitating and/or convective core ice and liquid in their radiation calculations, as illustrated in Fig. 13-12 (e.g., Waliser et al. 2011; Li et al. 2013).
The fact that viable observed estimates of TOA radiation fields and observation-driven modeled values at the surface have been available for many years and yet the biases are still sizeable suggests either that the modeling groups face challenges in utilizing the observations or that too many degrees of freedom remain unconstrained (e.g., cloud cover, cloud mass, particle size, vertical structure, and particle shape). There is certainly evidence of this with regard to cloud liquid and ice content (e.g., Waliser et al. 2009; Li et al. 2012). In addition, GCMs have typically tuned their radiation and cloud fields to the observations, which naturally are sensitive to all/most hydrometeors in the atmosphere despite the fact that most of the models typically only represent the suspended hydrometeors associated with clouds and usually do not include ice and liquid in convective cores. Thus, contributions from falling/precipitating hydrometeors are unaccounted for and/or erroneously accounted for by other processes, such as the calculation of radiation and the hydrological cycle.
While it is beyond the scope of this study to probe the causes of the model-to-model differences and model-to-observation biases in cloud water and radiation, based on the results, we hypothesize that the lack of an explicit representation of the cloudy, precipitating, and convective core components of the ice (and liquid) mass might play an important role in the biases in RSDS, RLUT, and RSUT. Our recent study (Waliser et al. 2011) showed that ignoring radiative effects of the precipitating components of the ice mass can result in nontrivial biases in the shortwave and longwave radiation budgets at the surface and top of atmosphere and even more significant impacts in the vertical radiative heating profile. While more work needs to be pursued in this area, there is a strong suggestion from these studies that GCMs should strive to explicitly represent a broader range of ice and liquid hydrometeors, namely the larger falling hydrometeors (rain and snow), as well as convective core mass, and include their effects in the radiative heating calculations, where, for the moment, they are largely ignored. Moreover, the evaluation results of this study show that the radiation balance in the CMIP class of GCMs is still underconstrained and, in many cases, is likely to have been achieved in unrealistic ways.
Taken together, these points indicate the need for additional observational resources to adequately characterize and constrain cloud–precipitation–radiation interactions. Some potentially useful observational resources are a multichannel radar/lidar measurement to characterize the profile and spectrum of cloud and precipitation particle sizes, as well as a Doppler radar capability to provide information on cloud and precipitation dynamics. In addition, satellite observations are affected by spatiotemporal sampling, instrument sensitivity, and retrieval assumptions. Simulators are one method available to emulate these idiosyncrasies within a climate model and thus can be an invaluable tool for robust evaluation of model-simulated clouds. In the future, we plan to integrate these methodologies into our evaluation studies. The use of these additional observational resources, in conjunction with systematic model experimentation practices, will likely be a constructive strategy for improving the cloud–precipitation–radiation interactions alluded to above.
We thank Dr. Bo-Wen Shen for useful comments. This work has been supported in part by AIST-11 JPL Advanced Information Systems Technology. The contributions by DEW, SWL, and JLL to this study were carried out on behalf of the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.