Previous evaluations of model simulations of the cloud and water vapor feedbacks in response to El Niño warming have singled out two common biases in models from phase 3 of the Coupled Model Intercomparison Project (CMIP3): an underestimate of the negative feedback from the shortwave cloud radiative forcing (SWCRF) and an overestimate of the positive feedback from the greenhouse effect of water vapor. Here, the authors check whether these two biases are alleviated in the CMIP5 models. While encouraging improvements are found, particularly in the simulation of the negative SWCRF feedback, the biases in the simulation of these two feedbacks remain prevalent and significant. It is shown that the bias in the SWCRF feedback correlates well with biases in the corresponding feedbacks from precipitation, large-scale circulation, and the longwave radiative forcing of clouds (LWCRF). By dividing the CMIP5 models into two categories—high score models (HSM) and low score models (LSM)—based on their individual skill in simulating the SWCRF feedback, the authors further find that ocean–atmosphere coupling generally lowers the score of the simulated feedbacks of water vapor and clouds but that the LSM are more affected by the coupling than the HSM. They also find that the SWCRF feedback is simulated better in the models that have a more realistic zonal extent of the equatorial cold tongue, suggesting that the continuing existence of an excessive cold tongue is a key factor behind the persistence of the feedback biases in models.
Cloud and water vapor are major modulators of the climate system and strongly influence both the global circulation and the energy balance through their radiative effects (Manabe and Wetherald 1967; Hartmann and Short 1980; Harrison et al. 1990; Kiehl and Trenberth 1997; Soden 1997; Houghton et al. 2001; Stephens 2005). The sensitivity of the climate to the rise of manmade greenhouse gases in the atmosphere depends critically on the feedbacks from water vapor and clouds. However, these feedbacks are considered the largest source of uncertainty in climate predictions (Bony and Dufresne 2005; Randall et al. 2007). Therefore, assessing the accuracy and narrowing the uncertainties of the cloud and water vapor feedbacks in current leading climate models is of obvious importance. Indeed, persistent efforts have been made in this regard (Cess et al. 1990; Sun and Held 1996; Bony and Dufresne 2005; Stephens 2005; Bony et al. 2006; Lin 2007; Zhu et al. 2007; Soden et al. 2008; Sun et al. 2009, hereafter S09).
Work on the evaluation of the feedbacks from water vapor and clouds has employed two main methodologies. The first involves examining the response of cloud and water vapor in different models to a prescribed or model-predicted global warming (Cess et al. 1990, 1996; Soden and Held 2006; among others). These studies have revealed that the cloud feedbacks differ greatly among models, while the globally averaged feedback from water vapor in the models follows that of a constant relative humidity model. As noted by S09, however, these results have neither identified which model has the right cloud feedback nor ruled out the possibility that all the models share a similar bias in the water vapor feedback. In addition, a uniform increase in SST is often used to examine the response of cloud and water vapor to global warming in models, yet there is uncertainty in the spatial pattern of global warming: the surface temperature is not likely to increase uniformly across the globe (Xie et al. 2010).
The other methodology used to examine the cloud and water vapor feedbacks in climate models involves examining the response of cloud and water vapor to SST changes on the time scales of El Niño–Southern Oscillation (ENSO) (Sun et al. 2003, 2006; S09; Lloyd et al. 2009, 2011, 2012; among others). By comparing the response of cloud and water vapor to ENSO forcing in nature with that in Atmospheric Model Intercomparison Project (AMIP) simulations by some leading climate models, Sun et al. (2006) revealed two common biases in the AMIP runs of the models: 1) an underestimate of the strength of the negative shortwave cloud radiative forcing (SWCRF) feedback and 2) an overestimate of the positive feedback from the greenhouse effect of water vapor. Extending the same analysis to the fully coupled simulations of these models as well as other coupled models in phase 3 of the Coupled Model Intercomparison Project (CMIP3), S09 found that these two biases persist. Studies by Lloyd et al. (2009, 2011, 2012) using CMIP3 outputs show that the bias in the SWCRF feedback is the main source of model uncertainty and that the biases in the cloud response to dynamical changes dominate the modeled SWCRF feedback. The main purpose of the present study is to update the results of S09 using CMIP5 models, specifically to explore whether the newer models improve in simulating the two aforementioned feedbacks and, if so, the origins of those improvements.
The remainder of the paper is organized as follows. After describing the data and methods used in this study in section 2, we show the changes and improvements from CMIP3 to CMIP5 in section 3, particularly in reference to the key results from S09. We then show the correlation among biases in different feedbacks and explore the causes for the biases in section 4. Finally, a summary and discussion are given in section 5.
2. Data and methodology
a. Model, observational, and reanalysis datasets
The model data are derived from the CMIP5 multimodel dataset (Taylor et al. 2012). Outputs from 18 CMIP5 models participating in the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5) are used in this study. Their acronyms and the names of the institutions that provided them to IPCC AR5 are listed in Table 1. We chose these 18 models for our initial focus because they are the models whose outputs, including both historical runs and AMIP runs, had been submitted to the Program for Climate Model Diagnosis and Intercomparison (PCMDI) at the time of our analysis. The historical (AMIP) runs of these models cover the period 1850–2005 (1979–2005).
The radiative fluxes supplied by the International Satellite Cloud Climatology Project (ISCCP; Zhang et al. 2004) and the precipitation derived from the Global Precipitation Climatology Project (GPCP; Huffman et al. 2009) are used in this study. In addition, two further datasets are employed: 1) the Hadley Centre Sea Ice and Sea Surface Temperature dataset (HadISST; Rayner et al. 2003) and 2) the National Centers for Environmental Prediction–U.S. Department of Energy (NCEP–DOE) Global Reanalysis 2 (NCEP2; Kanamitsu et al. 2002). As the surface heat fluxes of NCEP2 are subject to larger errors in the eastern Pacific (Cronin et al. 2006), to provide a lower bound to the uncertainty, the turbulent heat fluxes from the objectively analyzed air–sea fluxes (OAFlux; Yu and Weller 2007) product are employed to calculate the observed feedbacks associated with the surface heat flux. In the present analysis, all the monthly datasets cover the period from 1984 to 2005 and have been interpolated to a common 2.5° × 2.5° horizontal grid.
b. Analysis method
To allow a direct comparison with the results shown in S09 and to showcase the improvements from CMIP3 to CMIP5 in the key feedbacks underscored in S09, the primary methodology used here is the same as that employed in S09: we obtain SST anomalies (SSTA) averaged over the equatorial Pacific (5°S–5°N, 150°E–110°W) by removing the monthly annual cycle of SST and then calculate the values of the feedbacks during ENSO through a linear regression between the interannual variations of the SSTA and those of the corresponding variables. To obtain a more complete characterization of the model deficiencies and identify common factors responsible for the discrepancies in physical processes, we further introduce a skill score as an additional measure of the model deficiencies. Following previous skill score formulas for model performance (Taylor 2001; Hirota et al. 2011), the skill score S in the present study is given by

S = (1 + R)^4 / [4(SDR + 1/SDR)^2],    (1)
where R is the pattern correlation between the observations and the models and SDR is the ratio of the spatial standard deviation of the models to that of the observations. It is clear from Eq. (1) that both the spatial distribution and the magnitude are taken into account when model results are compared with observations.
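As an illustration, the two diagnostics above can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions: the function names are ours, and the skill score is written in the Taylor (2001) form with the maximum attainable correlation set to 1, which is one common realization of such a score.

```python
import numpy as np

def feedback_coefficient(ssta, flux_anom):
    """Feedback strength (W m-2 K-1): slope of a least-squares fit of
    the interannual flux anomaly onto the equatorial-Pacific SSTA."""
    slope, _intercept = np.polyfit(ssta, flux_anom, 1)
    return slope

def skill_score(model_field, obs_field):
    """Skill score combining the pattern correlation R and the ratio
    SDR of the model's spatial standard deviation to the observed one.
    Equals 1 only for a perfect pattern with the correct amplitude."""
    m, o = model_field.ravel(), obs_field.ravel()
    r = np.corrcoef(m, o)[0, 1]
    sdr = m.std() / o.std()
    return (1.0 + r) ** 4 / (4.0 * (sdr + 1.0 / sdr) ** 2)
```

With these definitions a model field identical to the observations scores 1, while a field with the right pattern but doubled amplitude (SDR = 2) scores 0.64, so errors in either pattern or magnitude lower the score.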
3. Changes and improvements from CMIP3 to CMIP5
a. Comparison of SWCRF feedback between CMIP3 and CMIP5
The feedbacks from cloud and water vapor to El Niño warming in CMIP5 models are presented in Table 2, which is the same as Tables 1 and 2 in S09, except that the results are from 18 CMIP5 models, whereas the results in S09 are based on 12 CMIP3 models. The ensemble model mean values for high score models and low score models are also listed in Table 2 for later discussion.
As shown in column 1 of Table 2, nearly all the CMIP5 models still overestimate the positive feedback from the greenhouse effect of water vapor (Ga). The magnitude of the bias is similar to that found in the CMIP3 models. Although there are variations among models, the CMIP5 models on average do not show significant improvements in their representation of the water vapor feedback. In contrast, the underestimate of the strength of the negative SWCRF feedback (Cs), emphasized in S09, is notably alleviated in some of the CMIP5 models. A striking improvement is noted in the National Center for Atmospheric Research (NCAR) and Institute of Atmospheric Physics (IAP) models. Their CMIP5 versions (CCSM4 and FGOALS-g2) now have negative SWCRF feedbacks (−12.73 and −14.46 W m−2 K−1 in the historical runs) that are very close to the observed value (−13.33 W m−2 K−1); recall that the corresponding values were −3.69 and −2.25 W m−2 K−1, respectively, in the coupled runs of CMIP3 (CCSM3 and FGOALS-g1.0) (see Table 2). In addition, the SWCRF feedbacks in the historical runs of CCSM4, FGOALS-g2, GISS-E2-R, and NorESM1-M agree well with observations over the equatorial Pacific. This is also an encouraging improvement over the CMIP3 models, nearly all of which have notable biases in simulating the SWCRF feedback (see the italicization in Table 2). However, the negative SWCRF feedback is still underestimated in the historical runs of more than half of the CMIP5 models examined here. The numbers in the parentheses in column 4 of Table 2 are the SWCRF feedback estimated from the corresponding AMIP runs. It is evident that most CMIP5 models perform better in simulating the SWCRF feedback in the AMIP runs than in the historical runs.
As listed in the second column of Table 2, the positive feedback from the longwave radiative forcing of clouds (LWCRF) is also underestimated in most models. However, the magnitude of the biases in the LWCRF feedback does not always compensate for those in the feedbacks from water vapor and SWCRF, leading to diverse results among models in the feedback from the total greenhouse effect of water vapor and clouds (Ga + Cl) (column 3 of Table 2) and the net cloud feedback (Cl + Cs) (column 5 of Table 2). The feedback from the atmospheric transport, the net atmospheric feedback, and the feedback from the net surface heat flux into the ocean are also listed in Table 2. In general, they show a slight improvement in the present CMIP5 models relative to the CMIP3 models (S09).
b. Spatial structure of the feedbacks in El Niño warming
Figures 1 and 2 further show the spatial structure of the response of SWCRF to El Niño warming in the AMIP runs and the historical runs of the CMIP5 models, respectively. The spatial distribution and magnitude of the response of SWCRF are generally captured well in most AMIP runs (Fig. 1). In reference to the results of S09, the AMIP results show a slight improvement from CMIP3 to CMIP5. Comparisons with the observations also reveal that the CMIP5 models simulate the SWCRF feedback better in the AMIP runs than in the historical runs. Relative to the AMIP runs, the corresponding historical runs have a larger spread among models and a greater difference between models and observations (Fig. 2). The most notable discrepancies with observations are in the location and magnitude of the maximum response of SWCRF. First, with a few exceptions (i.e., ACCESS1.0, CCSM4, BCC_CSM1.1, FGOALS-g2, GISS-E2-R, and NorESM1-M), the maximum response of SWCRF has an obvious westward extension in the remaining models. Second, the maximum response of SWCRF is significantly underestimated in most CMIP5 models, except for CCSM4, CNRM-CM5, FGOALS-g2, GISS-E2-R, and NorESM1-M.
Compared to their corresponding AMIP runs, 9 of the 18 CMIP5 models examined here (i.e., CanESM2, CSIRO Mk3.6.0, INM-CM4.0, IPSL-CM5A-LR, FGOALS-s2.0, MIROC5, the two MPI models, and MRI-CGCM3) show a westward shift of the maximum response of SWCRF in the historical runs. In addition, the maximum response of SWCRF in the historical runs (Fig. 2) is generally weaker than in the respective AMIP runs, which agree better with the observations (Fig. 1). This common bias exists in more than half of the 18 CMIP5 models examined herein.
Overall, in terms of both structure and magnitude, the AMIP simulations of the response of SWCRF to El Niño warming in the CMIP5 models show a slight improvement over the CMIP3 models. In addition, the performance in the historical runs is significantly improved in about one-third of the CMIP5 models relative to the CMIP3 results (S09). Nonetheless, a large spread still exists among models, and the biases in the representation of the cloud and water vapor feedbacks in the ENSO cycle are generally amplified in the historical runs compared to the corresponding AMIP runs.
4. An analysis of the origin of the biases
a. Relationship of biases among different feedbacks
To further evaluate model performance and reveal the origin of the biases analyzed above, we have compared the skill scores of the feedbacks of SWCRF, LWCRF, precipitation, and vertical velocity at 500 hPa (ω500) (Fig. 3). As shown in Figs. 3a–c, there is a positive correlation between the intermodel variations in the simulated SWCRF feedback and the intermodel variations in the other simulated feedbacks: the feedback of precipitation, the LWCRF feedback, and the feedback of the large-scale circulation (e.g., ω500). This suggests that models with higher skill in simulating the response of SWCRF to El Niño warming generally perform better in the other physical feedback processes as well (Figs. 3a–c). The apparent correlations among model biases imply that all the biases might result from the same fundamental origin.
Hence, according to the skill in simulating the SWCRF feedback in the historical runs, we divide the 18 CMIP5 models into two categories: high score models and low score models. The HSM, defined by skill scores above 0.5 (see the models above the dashed line in Fig. 3d), comprises nine models: CCSM4, CNRM-CM5, NorESM1-M, FGOALS-g2, ACCESS1.0, BCC_CSM1.1, BNU-ESM, GISS-E2-R, and HadGEM2-ES (sorted by the score in the simulation of the SWCRF feedback). The LSM comprises the models with skill scores below 0.5 (see the models below the dashed line in Fig. 3d). The ensemble of the HSM clearly performs better than the ensemble of the LSM in the historical runs over the tropical region (Figs. 2t,u). However, the difference between the HSM and the LSM in simulating the SWCRF feedback is smaller in the AMIP runs (see Figs. 1t,u and 2t,u). This suggests that a better representation of the SWCRF feedback in the AMIP runs cannot guarantee a better simulation in its historical counterpart (as shown in Fig. 3d). As shown in the last two rows of Table 2, the values of most feedback coefficients listed in Table 2 generally agree better with observations in the HSM than in the LSM, even though only variations in the equatorial Pacific (5°S–5°N, 150°E–110°W) are used to obtain these values.
b. Origins of the biases
Based on the results analyzed above, one may ask what causes the biases of the cloud and water vapor feedbacks in models and the significant differences between the AMIP runs and the historical runs.
Given the improvement from the CMIP3 models to the CMIP5 models in the historical runs, it is tempting to attribute the feedback biases to biases in the simulation of ENSO itself. However, a better simulation of ENSO variability is generally not accompanied by a more realistic SWCRF feedback to ENSO. Figure 4 displays the relationship between the skill in reproducing ENSO variability, measured by the ratio of the standard deviation (STD) of the simulated Niño-3 SSTA to the STD of the observed Niño-3 SSTA, and the skill in the corresponding SWCRF feedbacks in the historical runs. The red open circles and the blue open squares in the figure indicate the results from the HSM and the LSM, respectively. The figure shows clearly that a higher skill score for the SWCRF feedback is not necessarily accompanied by a better simulation of ENSO. For example, the CCSM4 model, which overestimates the amplitude of ENSO, simulates the SWCRF feedback well in its historical runs; conversely, IPSL-CM5A-LR, whose ENSO amplitude is close to the observation, does not simulate the SWCRF feedback well. Therefore, the bias in the simulation of ENSO is unlikely to be a major factor behind the biases of the feedbacks in the coupled GCMs.
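The amplitude metric used in Fig. 4 is simply the ratio of simulated to observed Niño-3 SSTA standard deviations, computed after the monthly annual cycle is removed. A minimal sketch (function names are ours; monthly series are assumed to span whole years):

```python
import numpy as np

def monthly_anomalies(series):
    """Remove the monthly annual cycle from a monthly time series
    whose length is a multiple of 12 (year-major ordering)."""
    months = series.reshape(-1, 12)
    return (months - months.mean(axis=0)).ravel()

def enso_amplitude_ratio(model_nino3_sst, obs_nino3_sst):
    """Ratio of the STD of the simulated Nino-3 SSTA to that of the
    observed SSTA; values near 1 indicate a realistic ENSO amplitude,
    values above 1 an overestimate."""
    return (monthly_anomalies(model_nino3_sst).std()
            / monthly_anomalies(obs_nino3_sst).std())
```

Because the annual cycle is removed from both series, a model whose interannual variability is, say, 50% too strong yields a ratio of 1.5 regardless of biases in its mean seasonal cycle.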
The leading candidate for the feedback biases in the historical runs may be the excessive cold tongue in the models (Sun et al. 2006; S09). To test this hypothesis, we have examined the climatological mean tropical Pacific SST in the historical runs of the HSM and the LSM (Figs. 5b,c). Although the spatial pattern of the mean SST is similar to the observations, a cold bias in the equatorial Pacific appears in both the HSM and the LSM (except in the eastern equatorial Pacific), and a warm bias prevails adjacent to the Peruvian coast; both biases were pointed out earlier and continue to exist in current coupled GCMs (Mechoso et al. 1995; Davey et al. 2002; Latif et al. 2001; Sun et al. 2006). Note that the bias of the climatological mean SST is generally smaller in the HSM than in the LSM, especially over the central equatorial Pacific. Deep convection tends to occur where SST exceeds a threshold of about 28°C (Graham and Barnett 1987), and the dependence of precipitation on SST is highly nonlinear (see Fig. 5a in S09). Therefore, the cold bias of SST in the central equatorial Pacific may reduce the precipitation over the central equatorial Pacific and shift the deep convection westward. By comparing the response of precipitation to El Niño warming in both the AMIP and historical runs of the two categories of models (figures not shown), we find that the response of precipitation in the LSM historical runs is significantly weaker than that in observations and is shifted westward, whereas the response of precipitation in the HSM historical runs is only slightly weaker and its location matches the observations better. In contrast, the precipitation feedback is well simulated in the corresponding AMIP runs of both the HSM and the LSM (skill scores of 0.92 and 0.93, respectively).
Recall that the AMIP runs are forced by the same observed SST; this suggests that the mean SST bias in the historical runs, especially the excessive cold tongue, may contribute to the underestimate of the SWCRF feedback and to the westward shift of its maximum response.
To further test this hypothesis, we have examined the relationship between the cold-tongue biases and the skill scores of the SWCRF feedback in all the historical runs (Fig. 6). In Fig. 6, we use the mean SST bias over the equatorial Pacific (5°S–5°N, 150°E–110°W) as a measure of the cold-tongue bias: a colder SST bias in this region indicates a more excessive cold tongue. Figure 6 shows that the models with smaller cold-tongue biases generally have higher scores for the SWCRF feedback (i.e., a better simulation of the SWCRF feedback), implying that reducing the cold bias of the mean SST may be beneficial for the simulation of the SWCRF feedback. Also note that most of the CMIP5 coupled models exhibit cold SST biases. Consistent with Fig. 5, the models in the HSM group generally exhibit smaller cold SST biases, or even warm biases, compared to the models in the LSM.
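The cold-tongue index used in Fig. 6 is an area-averaged SST bias over the regression box. One possible implementation is sketched below; the function name and the cosine-latitude weighting are our own assumptions, not details taken from the paper.

```python
import numpy as np

def cold_tongue_bias(model_sst, obs_sst, lat, lon):
    """Area-weighted mean SST bias (model minus observation, K) over
    the equatorial Pacific box 5S-5N, 150E-110W (110W = 250E).
    Negative values indicate an excessive (too cold) cold tongue."""
    lat = np.asarray(lat)
    lon = np.asarray(lon)
    ilat = np.abs(lat) <= 5.0
    ilon = (lon >= 150.0) & (lon <= 250.0)
    diff = (model_sst - obs_sst)[np.ix_(ilat, ilon)]
    # cos(latitude) weights, broadcast across the selected longitudes
    w = np.cos(np.deg2rad(lat[ilat]))[:, None] * np.ones(ilon.sum())
    return float((diff * w).sum() / w.sum())
```

Fields are assumed to be on a regular latitude-longitude grid (as after the 2.5° × 2.5° interpolation described in section 2), with longitude given in degrees east.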
It should be noted, however, that the excessive cold tongue in the models may in turn be affected by the biases in the net atmospheric feedbacks. According to Sun et al. (2003, see their Fig. 6), the prevalence of the excessive cold tongue is possibly related to an incorrect representation of the net atmospheric feedback (i.e., the combined feedback from water vapor, clouds, and atmospheric transport). Figure 7 shows the observed and model-simulated net atmospheric feedback over the tropical Pacific. In the historical runs, the net atmospheric feedback in both the HSM and the LSM is much weaker than the observation (see Figs. 7b,c), and that in the LSM is weaker still (note that this feedback in the LSM is barely negative over the central equatorial Pacific). Also, as shown in the bottom of the second column of Table 2, the net atmospheric feedback over the equatorial Pacific (5°S–5°N, 150°E–110°W) is −7.23 ± 0.78 W m−2 K−1 in the HSM historical runs and −0.31 ± 0.65 W m−2 K−1 in the LSM historical runs; the latter feedback is much weaker than the former. Therefore, the cold SST bias is less effectively damped by the atmospheric processes in the coupled models, leading to the formation of an excessive cold tongue in these models. The excessive cold tongue then further weakens the net atmospheric feedback. Such a vicious cycle may be a root cause of the difficulty coupled GCMs have in simulating the equatorial Pacific cold tongue correctly.
Indeed, we find in our analysis that the air–sea interaction, or coupling between the atmosphere and ocean, amplifies the errors in the atmospheric feedbacks. Comparing the skill scores of the HSM and the LSM over the tropical Pacific (see Fig. 8) shows that the simulated feedbacks in the AMIP experiments are generally better than those in the historical experiments. Although the representations of the feedbacks in the HSM AMIP runs are very close to those in the LSM AMIP runs, they are very different in the corresponding historical runs: the simulated feedbacks in the HSM historical runs are much better than those in the LSM historical runs. Apparently, biases of the cloud and water vapor feedbacks in the LSM AMIP runs are amplified more easily in their historical counterparts than those in the HSM AMIP runs. This also implies that improvements in the AMIP runs can yield an amplified gain in the corresponding coupled runs.
The persistence of the overestimated water vapor feedback in the CMIP5 models is linked to biases in the simulated water vapor itself. As the greenhouse effect of water vapor scales more closely with its percentage change than with its absolute change (Shine and Sinha 1991; Schneider et al. 1999; Zhang and Sun 2008), we have calculated the response of the percentage change of specific humidity (anomalies/mean) to El Niño warming in both the AMIP runs and the historical runs of the CMIP5 models, in the same manner as Zhang and Sun (2006, 2008). We find an excessive response of water vapor in the upper troposphere in all models (figures not shown).
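The percentage-change convention above can be sketched as regressing the fractional humidity anomaly at a given level onto the SSTA. This is a minimal illustration under our own naming, not the authors' code:

```python
import numpy as np

def humidity_percent_response(q, ssta):
    """Slope (% per K) of the fractional specific-humidity anomaly,
    100 * (q - mean(q)) / mean(q), regressed onto the SST anomaly,
    following the percentage-change convention of Zhang and Sun (2008)."""
    q_pct = 100.0 * (q - q.mean()) / q.mean()
    slope, _intercept = np.polyfit(ssta, q_pct, 1)
    return slope
```

Using the fractional rather than the absolute anomaly makes responses in the dry upper troposphere, where absolute humidity is tiny but the greenhouse effect is most sensitive, directly comparable with those at lower levels.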
5. Summary and discussion
Previous studies have revealed two common biases in the simulation of the response of cloud and water vapor to El Niño warming in the CMIP3 models: an underestimate of the negative cloud albedo feedback and an overestimate of the positive water vapor feedback. To examine whether these two biases still exist in current climate models, we assessed the performance of the CMIP5 models in representing these feedbacks. As indicated by the traditional feedback calculations as well as by a skill score designed to further evaluate the model simulations over the entire tropical Pacific, the major characteristics of the feedbacks from cloud and water vapor over the tropical Pacific in response to ENSO forcing are generally well captured by the CMIP5 models. The skill in simulating the SWCRF feedback is found to be significantly higher in the CMIP5 models than in the CMIP3 models. The most impressive improvement in this regard is noted in the NCAR Community Climate System Model (Gent et al. 2011). Nevertheless, a large spread among models and a notable difference between the models and observations still exist in the historical runs of the CMIP5 models. In addition, most of the CMIP5 models do not show significant improvements in simulating the water vapor feedback. More efforts need to be made to alleviate the positive bias in the water vapor feedback, which remains prevalent in the CMIP5 models.
Given that the biases in many other feedbacks correlate with the bias in the SWCRF feedback, the 18 CMIP5 models were further classified into the HSM and the LSM according to their scores in simulating the SWCRF feedback in the historical runs. By analyzing the differences in the climatological mean tropical Pacific SST between the HSM and the LSM, we show that the LSM tend to have a more severe excessive cold tongue than the HSM. Further analysis demonstrates that the excessive cold tongue generally leads to a westward shift and a significant weakening of the maximum response of SWCRF to El Niño warming. Among the coupled models, there is a correlation between the cold bias in the equatorial cold tongue and the bias in the SWCRF feedback.
There are some improvements from the CMIP3 models to the CMIP5 models in simulating the SWCRF feedback in the AMIP runs, but they remain small relative to the intermodel differences. Nonetheless, understanding the causes of these improvements is important, as we have noted that a small gain in the AMIP runs generally results in a much reduced bias in the coupled runs. Factors that are potentially responsible for the improvements in the AMIP runs include improved convection parameterizations, cloud microphysics schemes, and model resolution (Zhang et al. 1998; Lin 2007; Li and Zhang 2008; Guilyardi et al. 2009; Lloyd et al. 2009, 2011, 2012). For example, there are some improvements in the SWCRF feedback in the atmospheric component model of FGOALS-g2 [Grid-Point Atmospheric Model of IAP State Key Laboratory of Numerical Modeling for Atmospheric Sciences and Geophysical Fluid Dynamics (LASG) version 2 (GAMIL2)] compared to its previous version (GAMIL1) (Li et al. 2013b). Li et al. (2013a) pointed out that the biases in the distribution of shortwave and longwave cloud forcing are reduced in GAMIL2 and that the strength of the atmospheric response to El Niño warming [i.e., the regression coefficient between the Southern Oscillation index (SOI) and the Niño-3 SSTA] is much closer to the observations than in GAMIL1. The former improvement arises from the updated cloud macro- and microphysics schemes and their coordination with other schemes (e.g., the convection parameterization). The latter is related to the artificially amplified liquid water path (LWP) in GAMIL1, which leads to an unrealistic response of the SWCRF and other corresponding variables to the ENSO cycle. To be sure, factors responsible for the improvements in the AMIP runs may vary with models or be model dependent.
We thank two anonymous reviewers for their constructive and insightful comments. The discussion with Dr. Li Lijuan about the model FGOALS-g2 is gratefully acknowledged. This study is jointly supported by the "Strategic Priority Research Program Climate Change: Carbon Budget and Relevant Issues" of the Chinese Academy of Sciences (Grant XDA05110302), the National Key Program for Developing Basic Sciences Grant 2010CB950502, the U.S. National Science Foundation Climate Dynamics Program under AGS 0852329, and grants from the NOAA Office of Global Programs (MAPP and ESS). We also acknowledge the modeling groups, the PCMDI, and the World Climate Research Programme's (WCRP) Working Group on Coupled Modelling (WGCM) for their roles in making available the WCRP CMIP5 multimodel dataset. Support of this dataset is provided by the Office of Science, U.S. Department of Energy.
This article is included in the North American Climate in CMIP5 Experiments special collection.