The representation of tropical precipitation has never been a strength of global climate models. Some of the reasons are well known but have proven difficult to address with classical climate modeling approaches. These include the representation of moist convection, which produces the majority of precipitation in the tropics but is a process that coarse-resolution climate models must parameterize with the help of resolved processes. It is known that model differences in precipitation arising from such an approach can be substantial (e.g., Dai 2006; Stevens and Bony 2013). In reviewing progress over past phases of the Coupled Model Intercomparison Project (CMIP), Stouffer et al. (2017) identify six “particularly important and long-standing biases” that the authors hope will be reduced in CMIP’s sixth phase (CMIP6). The first of these is related to the misrepresentation of tropical precipitation, in the form of tropical rainbands being too hemispherically symmetric, something known as the double intertropical convergence zone (ITCZ) bias. Other studies have pointed to further deficiencies, for example in the representation of the summer monsoon (Zhang et al. 2015), modes of internal variability (Ahn et al. 2017), and the intensity distribution and extremes of precipitation (Stephens et al. 2010).
A correct simulation of the tropical climate matters, not only directly for the region, but also indirectly by influencing the response of the general circulation to forcing at global scales (Held 1983; Palmer and Owen 1986; Zhou and Xie 2015). Precipitation is important due to its many impacts, ranging from ecosystems (Cox et al. 2000) to air pollution (Rodhe and Grandell 1972; Baker and Charlson 1990; Bourgeois and Bey 2011). Hence the past decades have witnessed substantial efforts to improve precipitation in climate models, including the representation of the hydrological cycle in the tropics. Despite these efforts, progress has proven unsatisfactory in past CMIP phases (Hawkins and Sutton 2011; Knutti and Sedláček 2012; Flato et al. 2013), so much so that it has been suggested to pay the computational price of resolving precipitating convection and to abandon the traditional approach to climate modeling with parameterized convection for studying tropical precipitation (Schär et al. 2020; Palmer and Stevens 2019; Satoh et al. 2019). In evaluating these arguments it seems sensible to ask whether progress in simulating tropical precipitation is as unsatisfactory as past evaluations of CMIP models suggest. This question motivates the present study, which revisits tropical precipitation over the three major phases of CMIP: CMIP3, CMIP5, and now CMIP6.
At first glance, the hope that CMIP6 models would substantially address the long-standing biases in precipitation appears unfulfilled. CMIP6 models continue to show large differences in precipitation compared to observations (Fig. 1). Half of the global precipitation occurs between 30°S and 30°N, a region we refer to as the tropics. Regional model biases relative to data from the Tropical Rainfall Measuring Mission (TRMM; Huffman et al. 2007, 2010) range from −3 to 4 mm day−1 (Fig. 1). These occur partly in regions where the absolute amount is smaller than the tropical mean of 3.23 mm day−1 (e.g., in the southeast Pacific and southern Atlantic). The main spatial disagreements are a southward-displaced precipitation maximum over the Atlantic Ocean, a double-ITCZ pattern in precipitation over the Pacific Ocean, and an east–west precipitation anomaly over the Indian Ocean.
The question remains whether biases in tropical precipitation in CMIP6 models have been reduced compared to previous phases of CMIP. By combining the expertise of many authors, we apply a range of previously used methods to broadly assess the representation of tropical precipitation across models participating in CMIP6. By applying the same methods to model output from the third and fifth phases of CMIP, we evaluate the extent to which model developments have been successful in improving tropical precipitation. Much of what we show effectively extends previous studies on tropical precipitation in earlier CMIP models to CMIP6. The novelty of the present study thus lies not in any specific analysis, but rather in our use of existing techniques to develop and take stock of the big picture. Specifically, by looking systematically at the representation of tropical precipitation by three generations of CMIP models, across different regions and scales as measured by various metrics, we assess the status and progress in climate modeling for tropical precipitation.
For the purpose of our study, we collected observations and model output from historical simulations with 3-hourly to monthly resolution from 97 different data sources and applied 14 different analysis approaches. The analyses are based on known methods and chosen for their merit for giving a broad view on different characteristics. Our data and the analysis strategy are introduced in the next section (section 2), followed by the presentation of the results of this analysis, which are distributed across four sections, focusing on the climatology (section 3), natural cycles associated with solar radiative effects (section 4) and modes of internal variability (section 5), and long-term trends in the twentieth century (section 6). Opportunities for future research are discussed in section 7. We end with our conclusions in section 8.
2. Data and methods
a. Data sources
1) Model output
We assess the historical simulations of global coupled climate models produced for the last three major phases of the Coupled Model Intercomparison Project: CMIP3 (Meehl et al. 2007), CMIP5 (Taylor et al. 2012), and CMIP6 (Eyring et al. 2016). In these simulations, the boundary conditions (e.g., irradiation, aerosols, orbital parameters, and greenhouse gas concentrations in the atmosphere) represent those estimated for the historical time period in the CMIP phase and therefore differ slightly from one another. The historical simulations in all phases of CMIP start in 1850 but end in 2000, 2005, and 2014 for CMIP3, CMIP5, and CMIP6, respectively. (Tables S1–S4 in the online supplemental material list the model output used here.)
The availability of model output differs across the CMIP phases and the participating models. We therefore chose the data considering the availability and intended analyses as follows: 1991–2000 for subdaily (3-hourly), 1961–2000 for daily, and 1900–2000 for monthly and annual analyses. For CMIP6, we additionally use data in the period 2000–14 for comparison against the current state-of-the-art observational record for the same time period (section 2b). Analyzed variables are total surface precipitation for all output frequencies as well as near-surface winds and top-of-the-atmosphere outgoing longwave radiation for daily to annual time scales. All CMIP data are averages over the given output intervals. CMIP3 and CMIP5 simulation results are summarized in the corresponding chapters of the fourth and fifth IPCC Assessment Reports (Randall et al. 2007; Flato et al. 2013).
Access to the CMIP data is facilitated by the Earth System Grid Federation (ESGF; Williams et al. 2016). For practical reasons, we use ESGF-published model output that had been replicated by the German Climate Computing Center [Deutsches Klimarechenzentrum (DKRZ)] by 1 October 2019. Additionally, we use the not-yet-published model output from MPI-ESM-LR produced by the Max Planck Institute for Meteorology for CMIP6.
2) Observations
We use four observational datasets, listed in Table 1 and introduced here. The diversity in estimated precipitation among the datasets is taken as a measure of observational uncertainty, which for some ocean and mountainous regions with a sparse ground-based observation network can be considerable (e.g., for the Asian monsoon region) (Ceglar et al. 2017). The rainfall retrieval product of the Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA; Huffman et al. 2007) version 7 provides 3-hourly data for 1998–2019. This dataset, TRMM hereafter, combines data from passive microwave sensors, calibrated by the TRMM precipitation radar, with infrared sensors (Huffman et al. 2010), and is corrected to match rain gauge data. We further use the 3-hourly precipitation estimate from the Climate Prediction Center morphing technique (CMORPH) version 1.0 for 1998–2017 (Joyce et al. 2004). CMORPH uses data from passive microwave measurements and cloud advection vectors from correlated images of infrared sensors. For climate change assessments, we use the gridded precipitation product of the Climatic Research Unit (CRU) time series version 4.03 (Harris et al. 2014) for 1901–2014 with 0.5° spatial resolution, based on gauge networks on land.
Table 1. Overview of the precipitation observations used. Listed are the characteristics of the data and the means for the tropics (G), tropical land (L), and tropical ocean (O), and the ratio of land and ocean precipitation rates (L/O).
To test the observational uncertainty, we additionally use the monthly satellite-gauge product (“3IMERGM”) of the Integrated Multisatellite Retrievals for GPM (IMERG; Huffman et al. 2019) from the Global Precipitation Measurement (GPM) mission (Hou et al. 2014). IMERG extends the concept of TRMM but instead uses a dual-frequency precipitation radar paired with more passive microwave and infrared sensors. Overall, the observed mean precipitation rate for 2000–14 ranges from 3 mm day−1 (CMORPH) to 3.5 mm day−1 (IMERG) across our four observational datasets (Table 1). Individual regions can show larger observational differences, with the largest observational ranges exceeding 2 mm day−1 over islands, in the lee of mountain ranges, and in coastal areas (Fig. S1). The products mainly disagree over central Africa (CMORPH wet bias), in the Pacific warm pool (CMORPH dry bias), in the lee of mountain ranges in West India and the Malay Peninsula (CMORPH dry bias), on the Caribbean islands (CRU wet bias), and in Central America (CMORPH dry bias). More details including seasonal differences are provided in the supplemental information. CMORPH and TRMM capture the observational range across the assessed satellite products over land and ocean. We therefore use differences between these two products to measure the observational uncertainty in our analyses.
b. Data analysis strategy
All datasets have been screened and standardized for easy handling. This includes remapping the data to the same horizontal grid between 30°S and 30°N. Typically one would choose the coarsest resolution as the common grid to avoid generating information that the model did not simulate, but this approach would have led to a crude comparison since models in CMIP3 had substantially coarser grids than in CMIP6. As a compromise, we use the T63 grid, which is the native grid of MPI-M’s low-resolution configuration of MPI-ESMs in CMIP5 and CMIP6. This grid has 192 points along the equator, and hence a spatial resolution of approximately 200 km. We unify the precipitation unit of all datasets by converting to mm day−1.
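The unit harmonization mentioned above can be illustrated with a minimal sketch. It assumes that model precipitation is stored as a mass flux in kg m−2 s−1 and that a satellite product is stored in mm h−1; the function names are hypothetical and not taken from the study.

```python
import numpy as np

SECONDS_PER_DAY = 86400.0
HOURS_PER_DAY = 24.0


def flux_to_mm_per_day(pr_flux):
    """Convert a precipitation mass flux (kg m-2 s-1) to mm day-1.

    One kilogram of liquid water spread over one square metre forms a
    layer 1 mm deep (density 1000 kg m-3), so only the time unit changes.
    """
    return np.asarray(pr_flux, dtype=float) * SECONDS_PER_DAY


def mm_per_hour_to_mm_per_day(rate):
    """Convert a precipitation rate in mm h-1 to mm day-1."""
    return np.asarray(rate, dtype=float) * HOURS_PER_DAY


# Illustrative values only, not data from the study.
model_mm_day = flux_to_mm_per_day([0.0, 4.0e-5])
obs_mm_day = mm_per_hour_to_mm_per_day([0.0, 0.125])
```

With both inputs in mm day−1, fields from different sources can be differenced directly once they share a grid.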
In addition to performing an analysis over the entire tropics, separate analyses are performed for tropical land and ocean. For this purpose we use the land–sea mask of MPI-ESM1.2. We count grid cells with more than 50% ocean surface as ocean and otherwise as land. This approach implies that small islands are assigned to ocean regions. All tropical lakes are defined as land. Results from the analyses for tropical land and ocean are shown if relevant.
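A minimal sketch of the land–ocean split described above, assuming the mask is available as an ocean-area fraction per grid cell. The function names, the weights, and the toy values are illustrative assumptions.

```python
import numpy as np


def split_land_ocean(ocean_fraction, threshold=0.5):
    """Return boolean (land, ocean) masks.

    Cells with MORE than `threshold` ocean fraction count as ocean;
    all other cells, including those exactly at the threshold, are land.
    """
    ocean = np.asarray(ocean_fraction, dtype=float) > threshold
    return ~ocean, ocean


def masked_mean(field, mask, weights):
    """Weighted mean of `field` over the cells where `mask` is True."""
    w = np.where(mask, weights, 0.0)
    return float(np.sum(field * w) / np.sum(w))


# Toy 2 x 3 grid; values are illustrative only.
ocean_fraction = np.array([[1.0, 0.5, 0.0],
                           [0.8, 0.51, 0.2]])
precip = np.array([[4.0, 2.0, 1.0],
                   [5.0, 3.0, 2.0]])
weights = np.ones_like(precip)  # equal-area cells assumed for the toy case

land, ocean = split_land_ocean(ocean_fraction)
land_mean = masked_mean(precip, land, weights)
ocean_mean = masked_mean(precip, ocean, weights)
land_ocean_ratio = land_mean / ocean_mean
```

The strict inequality mirrors the wording "more than 50% ocean surface"; a cell with exactly 50% ocean is therefore treated as land.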
The output from models that provide more than one simulation for the historical period is averaged across those simulations before computing the mean of a CMIP phase. By this procedure, we avoid giving too much weight to an individual model that produced particularly many simulations. The model output includes both precipitation contributions from the model’s subgrid parameterizations and the fractions associated with atmospheric dynamics explicitly resolved on the model grids.
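The two-step averaging can be sketched as follows. The data layout (a list of model-name and field pairs) and the function name are assumptions made for illustration.

```python
from collections import defaultdict

import numpy as np


def phase_mean(runs):
    """Average a CMIP phase so that every model counts once.

    runs: iterable of (model_name, field) pairs, one pair per simulation.
    Returns (multimodel mean, intermodel standard deviation, per-model means).
    """
    by_model = defaultdict(list)
    for model, field in runs:
        by_model[model].append(np.asarray(field, dtype=float))

    # Step 1: average all simulations that belong to the same model.
    model_means = {m: np.mean(np.stack(f), axis=0) for m, f in by_model.items()}

    # Step 2: average over models, each with equal weight.
    stacked = np.stack(list(model_means.values()))
    return stacked.mean(axis=0), stacked.std(axis=0, ddof=1), model_means


# Toy example: model "A" has three runs, model "B" has one.
runs = [("A", [1.0, 1.0]), ("A", [2.0, 2.0]), ("A", [3.0, 3.0]), ("B", [4.0, 4.0])]
mean_field, spread, per_model = phase_mean(runs)
naive_mean = np.mean([field for _, field in runs], axis=0)
```

In the toy case the naive average over all four runs is pulled toward model "A", whereas the two-step average weights both models equally.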
As discussed in the introduction, none of the analysis techniques we employ are novel. Most are widely used in the climate modeling community (e.g., statistics over different time and length scales as well as analyses under different meteorological regimes). Some techniques are less familiar (e.g., the standardized precipitation index and the concept of Jennings scaling; Jennings 1950). These are included to present a broader view of how precipitation is represented in models. We further analyze precipitation associated with a range of different atmospheric features like cloud regimes, monsoons, and intra- and interseasonal variability. The details of these techniques are introduced in the relevant sections.
A comparison of models and observations encompassing different time periods poses a number of challenges. One challenge is the definition of a common time period for the comparison, as not only do the different CMIP phases end in different years, they also overlap differently with satellite datasets. Initially we compared TRMM against CMIP6 for the overlapping time period 2000–14 as validation, and CMIP6 against CMIP5 and CMIP3 for the overlapping period 1900–99 to determine the development across CMIP generations. We found, however, only small differences in the statistics of CMIP6 for 1900–99 and 2000–14 in all our results, consistent with similar long-term mean statistics for CMIP6 (Table 2) and the small past trend in tropical mean precipitation (section 6). For instance, the spatial correlation coefficient of the CMIP6 precipitation climatologies for the twentieth and twenty-first centuries over the tropics is 0.998, much larger than the average correlations between CMIP models and TRMM (Table 3). For simplicity, and because a more temporally consistent comparison adds no new information, we compare TRMM and CMORPH for 2000–14 directly with the different CMIP phases for 1900–99. Another challenge was to establish to what extent changes across phases of CMIP were simply the result of a different mix of models in each phase. To test this possibility we selected the subset of models that participated in all phases of CMIP and tested to what extent this sample of models influenced our conclusion for the climatological mean. We found that using all the CMIP models or just the subset participating in all CMIP phases yielded similar results (not shown). We further tested averaging over related models to account for different processing practices (Abramowitz et al. 2019). To this end, we calculated the standardized precipitation index on averages of related models in CMIP6, and identified only small differences that did not change our conclusions (not shown).
Table 2. Long-term mean statistics for tropical precipitation. (from left to right) Listed are the CMIP phase, the time period, the number of models for calculating the long-term statistics, the means ± 1 standard deviation in precipitation for the tropics (G), tropical land (L), and tropical ocean (O), and the ratio of land and ocean precipitation rates (L/O).
Table 3. Long-term mean comparison of tropical precipitation. Listed are the root-mean-square error/difference (RMSE/D) and the spatial correlation coefficients (r) for the tropics (G), tropical land (L), and tropical ocean (O) as well as the correlation coefficient of the differences between June–August and December–February means as a measure of the seasonal amplitude (S). The top row shows CMIP6 for 1900–99 (20th) against CMIP6 2000–14 (21st), followed by TRMM against CMIP6 for 2000–14 and rows below TRMM (2000–14) against CMIPs (1900–99). The statistics are computed on the multimodel mean precipitation in the three CMIP phases against TRMM.
3. Climatology
a. Tropical mean
There has been, and continues to be, a long-standing discrepancy between energy-budget inferences of precipitation and estimates of precipitation based on observations, whereby the former tend to be larger than the latter (Stephens et al. 2012; Stevens and Schwartz 2012; Wild et al. 2012). The tropical precipitation from the CMIP models assessed here is also larger than the observational estimates. Compared to the tropical mean from TRMM of 3.23 mm day−1, CMIP3 overestimates precipitation by 0.21 mm day−1, and CMIP5 and CMIP6 by 0.34 mm day−1 (Table 2). The tropical means of CMIP5 and CMIP6 are outside of the spread in the satellite observations (Table 1). The intermodel standard deviation is larger than the mean bias for CMIP3, but smaller for CMIP5 and CMIP6. The overestimation is also seen for precipitation averaged over oceans, with CMIP5 and CMIP6 being outside of the observational range (Tables 1 and 2). For land, we find a slight underestimation in CMIP3 and CMIP5, but CMIP6 is in the observational range. The observed land-to-ocean ratios in precipitation of 0.86–0.99 are consistently underestimated in all CMIP means. The land–ocean ratio has, however, slightly increased across the CMIP phases, with CMIP6 (0.82) being the closest to the lower bound of the observational range of the land–ocean ratio (Table 1).
The spatial pattern of tropical precipitation shows a systematic improvement across the CMIP phases, although the values still do not fall within the observational uncertainty. We measure this with the spatial correlations, r, in the annual mean tropical precipitation between CMIP and TRMM, with r = 0.75 in CMIP3, r = 0.79 in CMIP5, and r = 0.84 in CMIP6 for the entire tropics (Table 3). Improvements across CMIP in r are also found for both tropical ocean and land separately, with r being slightly larger over ocean than over land (Table 3). The observed pattern differences, measured by r, are larger over land than over ocean (Fig. 2), but none of the CMIP means fall within the observational uncertainty for r, measured as the spatial correlation between CMORPH and TRMM. Only the two best CMIP6 models for this metric, CESM2 and CESM2-WACCM with r > 0.9 over tropical land, come close to the observational range for r, reflecting a regional improvement in the tropical precipitation pattern of these models (Woelfle et al. 2019). The root-mean-square errors (RMSE) for precipitation compared to TRMM have also decreased on average over the CMIP phases, from 1.85 mm day−1 in CMIP3 to 1.80 mm day−1 in CMIP5 and 1.55 mm day−1 in CMIP6, but again these are larger than the observational uncertainty (Table 3). RMSEs are slightly larger over ocean than over land in both CMIP3 and CMIP5, but this behavior has reversed in CMIP6.
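The two pattern statistics used here, RMSE and the spatial correlation r, can be sketched for fields on a common latitude–longitude grid. Weighting grid cells by the cosine of latitude and using a centered (mean-removed) correlation are assumptions of this sketch; the text does not state which variants were used.

```python
import numpy as np


def area_weights(lat_deg, n_lon):
    """Cosine-of-latitude weights broadcast to a (lat, lon) grid."""
    w = np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))
    return np.repeat(w[:, None], n_lon, axis=1)


def weighted_rmse(model, obs, weights):
    """Area-weighted root-mean-square error between two 2D fields."""
    return float(np.sqrt(np.average((model - obs) ** 2, weights=weights)))


def weighted_pattern_correlation(model, obs, weights):
    """Area-weighted, centered spatial correlation between two 2D fields."""
    m_anom = model - np.average(model, weights=weights)
    o_anom = obs - np.average(obs, weights=weights)
    cov = np.average(m_anom * o_anom, weights=weights)
    var_m = np.average(m_anom ** 2, weights=weights)
    var_o = np.average(o_anom ** 2, weights=weights)
    return float(cov / np.sqrt(var_m * var_o))


# Synthetic fields for illustration only.
rng = np.random.default_rng(1)
lat = np.array([-15.0, 0.0, 15.0])
obs = rng.gamma(shape=2.0, scale=2.0, size=(3, 4))
weights = area_weights(lat, n_lon=4)
biased_model = obs + 1.0  # uniform wet bias of 1 mm day-1

rmse = weighted_rmse(biased_model, obs, weights)
r = weighted_pattern_correlation(biased_model, obs, weights)
```

The synthetic case shows why both statistics are reported: a uniform bias leaves the centered correlation at 1 while the RMSE equals the bias.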
Figure 2 shows the standard deviations of CMIP models in comparison to TRMM for land and ocean. This measure of annual-mean variability is too small in CMIP3, whereas CMIP5 and CMIP6 are closer to observations over tropical land. Over tropical ocean, the standard deviation has been similar across the CMIP phases. The difference between the mean precipitation for June–August and December–February is used as a measure for the seasonal amplitude S. The spatial correlations of S between the models and TRMM have improved across the CMIP phases (Fig. 2 and Table 3), but all values fall outside of the observational uncertainty.
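A minimal sketch of the seasonal amplitude S as the June–August mean minus the December–February mean. It assumes a 12-month climatology ordered January to December and gives each month equal weight regardless of its length; both are assumptions of the sketch.

```python
import numpy as np

JJA = [5, 6, 7]    # zero-based indices for June, July, August
DJF = [11, 0, 1]   # December, January, February


def seasonal_amplitude(monthly_clim):
    """Return S = mean(JJA) - mean(DJF) at every grid point.

    monthly_clim: array of shape (12, nlat, nlon), January first.
    """
    clim = np.asarray(monthly_clim, dtype=float)
    return clim[JJA].mean(axis=0) - clim[DJF].mean(axis=0)


# Toy climatology: one grid point whose value equals the month number.
toy = np.arange(1.0, 13.0).reshape(12, 1, 1)
S = seasonal_amplitude(toy)
```

The resulting field S can then be correlated spatially between a model and an observational product in the same way as the annual mean.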
We test the hypothesis that improvements in precipitation across the CMIP phases occur in tandem with a reduction in large-scale SST biases. Climate models typically underestimate SSTs by several degrees in large parts of the tropical oceans, especially the cold tongue region in the Pacific Ocean (e.g., Woelfle et al. 2019), while they overestimate SSTs in the upwelling regions at the eastern side of the basins (Li and Xie 2012). We do not, however, find a clear indication that the large-scale precipitation difference over tropical oceans is tightly linked with model differences in SSTs, either for the entire tropical oceans or for the cold tongue in the Pacific, although some of the SST biases in CMIP6 are smaller than in CMIP3 and CMIP5 by up to 1 K (Figs. S2 and S3).
b. Zonal mean
Despite evidence of improvements in the spatial pattern of precipitation, we find no sign of improvement for the zonal mean precipitation across the CMIP phases (Figs. 3a–c). The zonally averaged annual mean precipitation is remarkably robust across all phases of CMIP. The Northern Hemisphere rainfall maximum is well matched compared to TRMM. In the Southern Hemisphere, the rainfall maximum in CMIP6 and CMIP5 is, however, overestimated compared to both TRMM and CMIP3. This is likely related to the too-pronounced double ITCZ in the models (e.g., Li and Xie 2014) and possibly explains the mean differences in tropical precipitation in the central Pacific (Fig. 1). In previous works, it has been related to biases in the ocean–atmosphere feedbacks in the tropical Pacific (Lin 2007), errors in cloud simulations (Li and Xie 2014), and the cold tongue bias in the tropical Pacific (Samanta et al. 2019).
The double ITCZ in CMIP6 (Figs. 3a–c) shares the same biases as have been previously reported for CMIP3 and CMIP5 (Zhang et al. 2015). As quantitative comparison, we compute the double-ITCZ index I (Samanta et al. 2019):
Here PN is the mean precipitation in the northern box (5°–15°N, 160°E–120°W), PS is the mean precipitation in the southern box (15°–5°S, 160°E–120°W), and PE is the mean precipitation in the equatorial box (5°S–5°N, 160°E–120°W). The median double-ITCZ index is largely unchanged across different phases of CMIP, with I = 4.3 in CMIP3, I = 3.6 in CMIP5, and I = 4.0 in CMIP6 (Fig. 3d). Compared to the observational estimates of I = 1.5 (CMORPH) and I = 1.7 (TRMM), the median of I is too large by more than a factor of 2 in all CMIP phases. This means that the tropical precipitation over the Pacific Ocean is overestimated (cf. Fig. 1c). The model spread decreases as we move through CMIP generations, but this is not an improvement: some models reproduced the observed double-ITCZ index in both CMIP3 and CMIP5, but none do in CMIP6.
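The three box means PN, PS, and PE can be computed as in the following sketch. The box boundaries are those given above; the cosine-latitude weighting, the handling of the date line, and the function name are assumptions. How the three means are combined into I follows Samanta et al. (2019) and is deliberately not reproduced here.

```python
import numpy as np


def box_mean(precip, lat, lon, lat_bounds, lon_bounds=(160.0, 240.0)):
    """Area-weighted mean of precip(lat, lon) inside a lat-lon box.

    Longitudes are mapped to 0-360 so that 160E-120W becomes 160-240
    and the box does not wrap around the date line.
    """
    lat = np.asarray(lat, dtype=float)
    lon360 = np.mod(np.asarray(lon, dtype=float), 360.0)
    lat_sel = (lat >= lat_bounds[0]) & (lat <= lat_bounds[1])
    lon_sel = (lon360 >= lon_bounds[0]) & (lon360 <= lon_bounds[1])
    sub = precip[np.ix_(lat_sel, lon_sel)]
    w = np.cos(np.deg2rad(lat[lat_sel]))[:, None] * np.ones(sub.shape[1])
    return float(np.average(sub, weights=w))


# Synthetic field: 6 mm/day north of 5N, 4 south of 5S, 2 in between.
lat = np.arange(-28.0, 29.0, 2.0)
lon = np.arange(-180.0, 180.0, 2.0)  # -180..178 convention on purpose
band = np.where(lat > 5, 6.0, np.where(lat < -5, 4.0, 2.0))
precip = band[:, None] * np.ones(lon.size)

P_N = box_mean(precip, lat, lon, (5.0, 15.0))
P_S = box_mean(precip, lat, lon, (-15.0, -5.0))
P_E = box_mean(precip, lat, lon, (-5.0, 5.0))
```

Grid points that lie exactly on a shared boundary (for example 5°N) would be counted in two boxes with these inclusive bounds, so the boundary convention should be fixed before comparing datasets.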
c. Intensity distribution
Frequency and intensity are important precipitation characteristics with implications for hydrology and aerosol burden. For instance, a large model spread for surface runoff has been identified in CMIP5 (Lehner et al. 2019), and for aerosol burden in aerosol–climate models (e.g., Baker and Charlson 1990; Textor et al. 2006; Fan et al. 2018). Even models with a relatively accurate representation of the spatial pattern of precipitation may have large biases in the frequency and intensity (e.g., Trenberth et al. 2003; Pendergrass and Hartmann 2014). Models in CMIP3 and CMIP5 are known to produce too-frequent drizzle (e.g., Baker and Huang 2014; Pendergrass and Hartmann 2014; Sun et al. 2015). Here, we test to what extent this behavior has improved using long-term statistics of the frequency of wet 3-h means, the 1-day lag autocorrelation, the number of consecutive dry days, and scaling relationships between precipitation amount and its duration.
1) Wet and dry frequency
All CMIP phases consistently produce more frequent wet 3-h means in tropical precipitation than observed (Fig. 4a). This overestimation has been slightly reduced in CMIP6 with 85% of the 3-hourly means being wet, compared to 93% in CMIP3. However, this is still a substantial overestimation of the occurrence of precipitation compared to the observed frequency of 44%–54%. The improvement in tropical precipitation frequency in CMIP6 is primarily explained by the reduction of wet 3-h means over tropical oceans, whereas the frequency over land has only slightly decreased compared to CMIP3 (Figs. S4 and S5). We note, however, a substantial model spread for the frequency of precipitation rates in all CMIP phases (Fig. S6).
We measure the day-to-day variability and spatiotemporal coherence of the tropical precipitation with the 1-day lag autocorrelation (Fig. 4b). A realistic lag autocorrelation is associated with an improved representation of deep convection and convection coupled to equatorial waves, including the Madden–Julian oscillation (Peters et al. 2017; Ma et al. 2019) assessed in section 5a. Atmospheric models with parameterized moist convection are known to have unrealistic day-to-day variability in precipitation (Peters et al. 2017, 2019) due to deficiencies in the physical parameterization schemes that lead to too-frequent triggering of deep convection (Klingaman et al. 2017; Peters et al. 2017). This behavior is characterized by too-large 1-day lag autocorrelations (i.e., wet episodes over several days are not sufficiently interrupted by dry days). We identify a slight improvement in the 1-day lag autocorrelation from CMIP3 and CMIP5 to CMIP6, namely, a reduction from roughly 0.60 in CMIP3 to 0.50 in CMIP6. Since the lag autocorrelation is sensitive to the representation of convection (Klingaman et al. 2017; Peters et al. 2017), this result indicates that the past model development between CMIP phases contributed to a slightly better day-to-day variability of moist convection. However, compared to the observed 1-day lag autocorrelation of 0.35 in both TRMM and CMORPH, CMIP6 models still substantially overestimate this quantity, pointing to too-little intermittence of rainy episodes.
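The 1-day lag autocorrelation at a grid point can be estimated as in this sketch, which correlates each daily series with itself shifted by one day. The estimator and the synthetic test series are illustrative assumptions, not the exact procedure of the study.

```python
import numpy as np


def lag1_autocorrelation(daily):
    """Lag-1 autocorrelation along the first (time) axis.

    daily: array of shape (time,) or (time, ...) with daily precipitation.
    Returns NaN where a series has no variance.
    """
    x = np.asarray(daily, dtype=float)
    a = x[:-1] - x[:-1].mean(axis=0)
    b = x[1:] - x[1:].mean(axis=0)
    denom = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))
    with np.errstate(invalid="ignore", divide="ignore"):
        return (a * b).sum(axis=0) / denom


# Synthetic AR(1) series with a known persistence of 0.5 (not real data).
rng = np.random.default_rng(0)
noise = rng.standard_normal(20000)
series = np.empty_like(noise)
series[0] = noise[0]
for t in range(1, noise.size):
    series[t] = 0.5 * series[t - 1] + noise[t]

r_ar1 = lag1_autocorrelation(series)
r_alternating = lag1_autocorrelation(np.tile([1.0, 0.0], 50))
```

A series that alternates between wet and dry days gives a value of −1, whereas persistent multi-day wet spells push the value toward 1, which is the direction of the model bias described above.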
The maximum number of consecutive dry days (CDD) is used to quantify the length of dryness. Following the Expert Team on Climate Change Detection and Indices (ETCCDI) as used by Frich et al. (2002), CDD is defined as the maximum number of consecutive days within a year that have total daily precipitation amounts of less than 1 mm day−1. This threshold removes days with light drizzle events that are difficult to measure. Although TRMM (TMPA) is better for light rain events than other satellite-based data (e.g., Burdanowitz et al. 2015), it misses light precipitation (Behrangi et al. 2014), which affects the frequency of occurrence of precipitation events (Klepp et al. 2018). By eliminating days with such light drizzle events, we determine the differences in CDD considering more regular to extreme precipitation events. We show the spatial and temporal average of CDD (Fig. 4c) and the probability distribution of CDD across time and space (Fig. 4d). The latter primarily indicates spatial variability for CMIP due to the small year-to-year changes in ensemble-averaged CDD with standard deviations of 0.47–0.69 (not shown).
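A minimal sketch of the CDD calculation for one year of daily data, using the 1 mm day−1 threshold from the text. The array layout (time as the first axis) and the function name are assumptions.

```python
import numpy as np


def max_consecutive_dry_days(daily_mm, threshold=1.0):
    """Longest run of days with precipitation below `threshold`.

    daily_mm: array of shape (time, ...) covering one year, in mm day-1.
    Returns an integer array with the shape of a single time slice.
    """
    dry = np.asarray(daily_mm, dtype=float) < threshold
    run = np.zeros(dry.shape[1:], dtype=int)
    longest = np.zeros(dry.shape[1:], dtype=int)
    for day in dry:
        # Extend the current dry spell, or reset it to zero on a wet day.
        run = np.where(day, run + 1, 0)
        longest = np.maximum(longest, run)
    return longest


# Toy example: five days at two grid points (values in mm day-1).
daily = np.array([[0.0, 5.0],
                  [0.5, 0.0],
                  [0.9, 0.0],
                  [2.0, 3.0],
                  [0.0, 0.0]])
cdd = max_consecutive_dry_days(daily)
```

Days with exactly 1 mm count as wet under the strict inequality used here, consistent with "less than 1 mm day−1".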
There is an improvement over the three CMIP generations in averaged CDD and their probability of occurrence (Figs. 4c,d), but CMIP6 models still produce shorter dry periods on average than observed (Fig. 4c). The remaining difference from observations stems from the poor representation of extremely long dry episodes (Fig. 4d). For instance, the climate models show too-low probabilities for CDD longer than 130 days in CMIP3, and 200 days in CMIP5 and CMIP6, compared to the observations (Fig. 4d). The underestimation of such extremely long dry episodes is primarily explained by the mismatch in CDD over oceans (Figs. S4 and S5), but this is also the region where the improvement across CMIP generations was largest. Compared to the ocean, the number of CDD over land is generally better captured by CMIP models, except for CDDs longer than 250 days. The probability of occurrence for these extremely long dry episodes has slightly improved from CMIP3 to CMIP6, but the occurrence of more than 300 CDDs in deserts is still underestimated. This finding has implications for other processes in the Earth system (e.g., dust-aerosol emissions, which are influenced by soil moisture and the lack of vegetation cover) (e.g., Shao 2001; Kok et al. 2014).
2) Jennings scaling
Jennings (1950) discovered a scaling law, P ~ D^α, that describes the global maximum of precipitation, P, observed at rain gauges over land during an interval of some duration, D, with the exponent α ~ 1/2 for periods of minutes to 1 year. Even earlier research on thresholds of rainfall extremes supports this power-law scaling (Wussow 1922). This type of scaling can be reproduced by simple thermodynamic models whose large-scale input is modulated by stochastic forcing (e.g., Field and Shutts 2009; Zhang et al. 2013b). The scaling relationships described by Jennings, sometimes called maximum depth–duration graphs, have entered textbooks in hydrology, but their application to the evaluation of climate model output is less common (Zhang et al. 2013a), which motivates the present analysis. Moreover, whereas previous studies focused on time periods of minutes to one year, we test here the extension of the Jennings scaling to decades by calculating the slope for data over longer averaging intervals. We find that the Jennings slopes are very similar in TRMM and all phases of CMIP.
The rainfall maximum P for a given D is determined from the spatial distributions of tropical precipitation. The literature typically refers to P as depth. The depth is the maximum across time and space in the running means of daily precipitation over the duration, calculated at every grid point. Durations range here from 1 day to 1 decade, with steps of 1 day, for all datasets, except for CMIP6 and TRMM, where we also use the entire overlapping period 2001–14. We show three examples of the resulting points, which fall on a line in the depth–duration space (Fig. 5a). We find that all regression lines closely fit the data points, with R² = 0.97 being the smallest coefficient of determination across all datasets here. The slope α of each line is shown in Fig. 5b and is known as the Jennings slope.
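The depth–duration calculation can be sketched as follows. The sketch takes the depth as the accumulated amount over the window (the running mean multiplied by the duration), so that the fitted slope is comparable to the exponent α in P ~ D^α. This interpretation, the log–log least-squares fit, and the function names are assumptions.

```python
import numpy as np


def depth_duration(daily, durations):
    """Maximum accumulated precipitation over any window of each duration.

    daily: array of shape (time, npoints) in mm day-1.
    durations: iterable of window lengths in days.
    The maximum is taken across both time and space.
    """
    daily = np.asarray(daily, dtype=float)
    zeros = np.zeros((1,) + daily.shape[1:])
    csum = np.concatenate([zeros, np.cumsum(daily, axis=0)], axis=0)
    depths = []
    for d in durations:
        window_sums = csum[d:] - csum[:-d]  # all running sums of length d
        depths.append(window_sums.max())
    return np.array(depths)


def jennings_slope(durations, depths):
    """Slope and R^2 of a straight-line fit in log(depth) vs log(duration)."""
    x = np.log(np.asarray(durations, dtype=float))
    y = np.log(np.asarray(depths, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
    return float(slope), float(r2)


# Limiting case: constant rain of 1 mm/day gives depth = D and slope 1.
durations = np.arange(1, 11)
depths_const = depth_duration(np.ones((30, 2)), durations)
slope_const, r2_const = jennings_slope(durations, depths_const)
```

Steady rain is the upper limit with a slope of 1; intermittent rainfall yields smaller slopes, since long windows inevitably include dry spells.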
In both the CMIP output and the data from TRMM, α is larger than the value determined from the earlier gauge measurements by Jennings (1950). In addition to the different spatial scales, Jennings (1950) covers minutes to 1 year, while we start with daily precipitation and move to decadal scales for TRMM and CMIP. The line in Fig. 5a indicates that the steeper slopes in TRMM are primarily explained by the interval from 1 year to 1 decade (i.e., there is some curvature in the slope when moving to longer averaging intervals). Paired with the different spatial representation of gauge measurements and the gridded data, this explains why CMIP and TRMM produce slopes that are more similar to one another than to that of Jennings (1950), with CMIP6 following the observations better than the previous phases of CMIP.
There is considerable variability in the estimates of α from the CMIP output, although less in CMIP6 than in previous CMIP phases. The relatively good match between TRMM and CMIP6 based estimates of α suggests that despite biases in the distribution of precipitation, the tendency for long-duration events to be associated with more intense rainfall is well captured by the models.
d. Low-level and deep clouds
We investigate tropical precipitation associated with different cloud regimes, namely, low-level and deep clouds. Low-level cloud regimes cover large parts of the tropics away from the ITCZ. Using an outgoing longwave radiation (OLR) threshold of >250 W m−2 to exclude areas of deep convection, we estimate the observed fractional area coverage of low-level cloud regimes from the daily CERES product (Loeb et al. 2009) to be 68% (not shown). We choose the threshold of 250 W m−2, corresponding to a brightness temperature of 258 K and similar to other studies (e.g., Masunaga et al. 2005). Note that this includes both low-level cumuli and stratiform clouds. It also includes a fraction of cumulus congestus, which is not distinguishable from low-level clouds with OLR. Sensitivity tests with other thresholds of 240–260 W m−2, consistent with Stubenrauch et al. (1999), give qualitatively similar results to 250 W m−2. For the analysis, we use daily OLR and precipitation data, available from 16 CMIP3 models, 32 CMIP5 models, and 14 CMIP6 models, marked by indices 2 and 3 in Tables S1–S4.
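The OLR threshold and the regime statistics can be sketched as below. The brightness temperature is obtained by inverting the Stefan–Boltzmann law, assuming blackbody emission; the regime function, its name, and the toy values are illustrative assumptions.

```python
import numpy as np

STEFAN_BOLTZMANN = 5.670374419e-8  # W m-2 K-4


def brightness_temperature(olr):
    """Equivalent blackbody temperature (K) for an OLR in W m-2."""
    return (np.asarray(olr, dtype=float) / STEFAN_BOLTZMANN) ** 0.25


def low_cloud_regime_stats(olr, precip, weights, threshold=250.0):
    """Area fraction and precipitation fraction where OLR > threshold.

    olr, precip, weights: arrays of identical shape (e.g., time x lat x lon).
    """
    low = np.asarray(olr) > threshold
    area_fraction = np.sum(weights[low]) / np.sum(weights)
    wet = precip * weights
    precip_fraction = np.sum(wet[low]) / np.sum(wet)
    return float(area_fraction), float(precip_fraction)


t_b = brightness_temperature(250.0)

# Toy sample of four cells; values are illustrative only.
olr = np.array([260.0, 240.0, 270.0, 200.0])
precip = np.array([1.0, 10.0, 1.0, 8.0])
weights = np.ones(4)
area_frac, precip_frac = low_cloud_regime_stats(olr, precip, weights)
```

For 250 W m−2 the inversion gives a temperature just under 258 K, consistent with the value quoted in the text. In the toy sample the regime covers half of the area but contributes only a tenth of the precipitation.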
The CMIP means have a fractional low-level cloud area similar to the observations, with a slight increase from 65% for CMIP3 to 69% for CMIP6. Despite a similar areal coverage, the models differ substantially by 50%–100% in the amount of precipitation associated with low-level clouds (Fig. 6a). There is no clear improvement over the CMIP phases, although the very large outliers evident in CMIP3 and CMIP5, with precipitation fractions associated with low-level cloud regimes larger by a factor of four to five, have reduced in CMIP6. Some models in CMIP6 lie within the observational range for the fractional precipitation amount associated with low-level clouds (10%–14%), namely, BCC-ESM1 (12%), CNRM-CM6-1 (10%), and CNRM-ESM2-1 (11%). These three models, however, tend to underestimate the fractional area coverage of the low-level cloud regimes with 59%, 65%, and 65%, respectively.
We extend the analysis to regimes with deep convection, which we identify with regions of particularly low OLR. In these regimes the observations differ considerably (Fig. 6b). For an OLR of 120 W m−2 the precipitation rates are 25% lower in CERES-CMORPH (150 mm day−1) than in CERES-TRMM (200 mm day−1), consistent with the lower frequency of these precipitation rates in CMORPH than in TRMM (Fig. S6). CMIP5 and CMIP6 have a better representation of the relationship between OLR and precipitation rate than CMIP3 for OLR of 120–270 W m−2, and align closely with what is diagnosed from the CERES-TRMM measurements. For more moderate precipitation, between 10 and 100 mm day−1, the observations are more consistent and suggest that the models require deeper convection to produce these rain rates (too-low OLR) across the CMIP phases.
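One way to tabulate a relationship between OLR and precipitation rate is to average precipitation within OLR bins. The sketch below assumes paired, unweighted samples of daily OLR and precipitation; the bin edges and the function name are hypothetical, and the binning behind Fig. 6b may differ.

```python
def mean_precip_by_olr_bin(olr, precip, bin_edges):
    """Mean precipitation rate in each OLR bin [edge_i, edge_{i+1}).

    olr, precip : equally long sequences of paired samples
    bin_edges   : increasing OLR values (W m-2) delimiting the bins
    Returns one mean per bin, or None for a bin without samples.
    """
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for o, p in zip(olr, precip):
        for i in range(n_bins):
            if bin_edges[i] <= o < bin_edges[i + 1]:
                sums[i] += p
                counts[i] += 1
                break  # each sample belongs to at most one bin
    return [s / c if c else None for s, c in zip(sums, counts)]
```

Running the same binning on a model and on each observational product yields curves that can be compared bin by bin.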
In summary, models produce more precipitation from low-level clouds than is observed, consistent with the persistent overestimation of drizzle in CMIP. For more moderate precipitation rates, the CMIP models are associated with lower OLR, pointing to deeper clouds or more overcast conditions than is observed. For stronger rain rates (p > 100 mm day−1) substantial divergence between the observational datasets makes an evaluation of the models difficult, but CMIP3 clearly lies outside of the observational range, whereas CMIP5 and CMIP6 are closer to the observations.
4. Solar radiative effects
Model-based climate change projections are essentially an exercise in assessing how a model’s climate responds to radiative forcing. In this context, the fidelity of the models’ response to known changes in the radiation budget, for instance, as associated with the seasonal and daily cycles of the sun, provides a useful test of their plausibility. The response of precipitation to radiative forcing associated with atmospheric composition changes, as manifest by a global increase in surface temperatures, is likely different from the response to seasonal and daily cycles in irradiance. There is, however, little reason to believe that a model could capture a forced response in precipitation (e.g., to radiative forcing of greenhouse gases) if it poorly represents the observed cycles induced by radiative perturbations as strong as those associated with the seasonal and daily changes of irradiance. This is one of our motivations for this analysis across the CMIP models.
a. Seasonal cycle
The seasonal cycle of tropical precipitation determines the regional climate in many tropical areas (Knoben et al. 2019). Hence, quite apart from being a generic test of how models respond to natural changes in the radiation budget, the ability of CMIP models to reproduce the seasonal cycle of tropical precipitation with fidelity is relevant on its own. Through the influence of precipitation on the regional energy budget, an accurate simulation of tropical precipitation also matters for other aspects of climate on both regional and global scales.
1) Zonal means
Models in CMIP3 and CMIP5 are drier than observations early in the wet season and too wet later on in both hemispheres (Seth et al. 2013). That study identified two systematic biases in tropical precipitation. First, most CMIP3 and CMIP5 models underestimate the precipitation near the equator between January and June. Others also documented a regional underestimation of precipitation in the 4°–8°N band within the tropical Pacific from March to April (Mechoso et al. 1995; Bellucci et al. 2010). Second, Seth et al. (2013) showed that most CMIP5 models overestimate precipitation at 4°–20°S, particularly strongly between February and May. This is consistent with observations showing more hemispheric asymmetry in the zonal-mean annual precipitation, and a more dominant ITCZ signature north of the equator, than most CMIP models (Fig. 3).
Figure 7 shows that CMIP6 models still do not correctly represent the observed seasonal cycle of zonal-mean precipitation over tropical land and ocean. We find that they are generally wetter than observations in the summer hemisphere by 0.5–2.5 mm day−1. Furthermore, we find a too dry–too wet pattern between January and May, explained by a rain belt that is displaced too far to the south. This model behavior might also delay the onset of the summer monsoon in the Northern Hemisphere.
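A month-by-latitude bias of this kind can be sketched as the difference between model and observed zonal-mean climatologies. The example assumes both datasets are monthly climatologies already interpolated to a common grid and stored as nested lists indexed [month][lat][lon], with None marking masked cells (for example, when only land or only ocean is considered); the function names are hypothetical.

```python
def zonal_mean(field):
    """Average a [lat][lon] field over longitude; None marks a masked cell."""
    means = []
    for row in field:
        valid = [v for v in row if v is not None]
        means.append(sum(valid) / len(valid) if valid else None)
    return means


def seasonal_zonal_bias(model, obs):
    """Model-minus-observation zonal-mean precipitation, indexed [month][lat]."""
    bias = []
    for model_month, obs_month in zip(model, obs):
        row = []
        for m, o in zip(zonal_mean(model_month), zonal_mean(obs_month)):
            row.append(None if m is None or o is None else m - o)
        bias.append(row)
    return bias
```

Positive entries mark latitudes and months in which the model is wetter than the observations, and negative entries mark those in which it is drier.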
2) Summer monsoons
Monsoon rainfall dominates the annual variability in the tropics (e.g., Trenberth et al. 2000; Wang and Ding 2008), affecting many tropical regions. Previous studies show that CMIP5 models simulate the monsoonal circulation climatology and variability better than CMIP3 models (e.g., Sperber et al. 2013), but they still suffer from systematic regional biases. For example, the CMIP5 mean tends to underestimate precipitation over the eastern Indian Ocean, the Bay of Bengal, the equatorial western Pacific, and tropical Brazil, but overestimate precipitation over the Maritime Continent, the Philippines, and high-elevation terrain such as the Andes, Sierra Madre, and the Tibetan Plateau (Lee et al. 2010; Lee and Wang 2014). Despite these regional biases, the CMIP5 mean reproduces the observed monsoon intensity and area (Lee and Wang 2014).
We assess the monsoon across the CMIP phases with a bulk measure for the monsoon area and intensity following earlier approaches (Wang and Ding 2006; Wang et al. 2011). For each model simulation, the monsoon regions are defined with two criteria: 1) the annual range of precipitation (summer minus winter mean) exceeds 2 mm day−1; and 2) the summertime precipitation contributes at least 55% of the annual total. The monsoon intensity is then defined as the area-weighted average of summer precipitation (i.e., June–August in the Northern Hemisphere and December–February in the Southern Hemisphere) within the monsoon area. The l