Simulated Tropical Precipitation Assessed across Three Major Phases of the Coupled Model Intercomparison Project (CMIP)

Stephanie Fiedler aUniversity of Cologne, Institute of Geophysics and Meteorology, Cologne, Germany

Traute Crueger bMax Planck Institute for Meteorology, Hamburg, Germany

Roberta D’Agostino bMax Planck Institute for Meteorology, Hamburg, Germany

Karsten Peters cDeutsches Klimarechenzentrum (DKRZ), Hamburg, Germany

Tobias Becker bMax Planck Institute for Meteorology, Hamburg, Germany

David Leutwyler bMax Planck Institute for Meteorology, Hamburg, Germany

Laura Paccini bMax Planck Institute for Meteorology, Hamburg, Germany

Jörg Burdanowitz dUniversität Hamburg, Hamburg, Germany

Stefan A. Buehler dUniversität Hamburg, Hamburg, Germany

Alejandro Uribe Cortes eMISU, Stockholm University, Stockholm, Sweden

Thibaut Dauhut bMax Planck Institute for Meteorology, Hamburg, Germany

Dietmar Dommenget fARC Centre of Excellence for Climate Extremes, Monash University, Melbourne, Australia

Klaus Fraedrich bMax Planck Institute for Meteorology, Hamburg, Germany

Leonore Jungandreas bMax Planck Institute for Meteorology, Hamburg, Germany

Nicola Maher bMax Planck Institute for Meteorology, Hamburg, Germany

Ann Kristin Naumann bMax Planck Institute for Meteorology, Hamburg, Germany

Maria Rugenstein bMax Planck Institute for Meteorology, Hamburg, Germany

Mirjana Sakradzija bMax Planck Institute for Meteorology, Hamburg, Germany

Hauke Schmidt bMax Planck Institute for Meteorology, Hamburg, Germany

Frank Sielmann dUniversität Hamburg, Hamburg, Germany

Claudia Stephan bMax Planck Institute for Meteorology, Hamburg, Germany

Claudia Timmreck bMax Planck Institute for Meteorology, Hamburg, Germany

Xiuhua Zhu bMax Planck Institute for Meteorology, Hamburg, Germany

Bjorn Stevens bMax Planck Institute for Meteorology, Hamburg, Germany

Open access

Abstract

The representation of tropical precipitation is evaluated across three generations of models participating in phases 3, 5, and 6 of the Coupled Model Intercomparison Project (CMIP). Compared to state-of-the-art observations, improvements in tropical precipitation in the CMIP6 models are identified for some metrics, but we find no general improvement in tropical precipitation on different temporal and spatial scales. Our results indicate little overall change across the CMIP phases for the summer monsoons, the double-ITCZ bias, and the diurnal cycle of tropical precipitation. We find a reduced number of drizzle events in CMIP6, but tropical precipitation still occurs too frequently. Continuous improvements across the CMIP phases are identified for the number of consecutive dry days, for the representation of modes of variability, namely, the Madden–Julian oscillation and El Niño–Southern Oscillation, and for the trends in dry months in the twentieth century. The observed positive trend in extremely wet months is, however, not captured by any of the CMIP phases, which simulate negative trends for extremely wet months in the twentieth century. The regional biases are larger than the climate change signal that one hopes to identify with the models. Given the pace of climate change compared to the pace of model improvements in simulating tropical precipitation, we question the past strategy of developing the present class of global climate models as the mainstay of the scientific response to climate change. We suggest the exploration of alternative approaches such as high-resolution storm-resolving models that can offer better prospects to inform us about how tropical precipitation might change with anthropogenic warming.

Denotes content that is immediately available upon publication as open access.

This article is licensed under a Creative Commons Attribution 4.0 license (http://creativecommons.org/licenses/by/4.0/).

Former affiliation: Max Planck Institute for Meteorology, Hamburg, Germany.

Corresponding author: Stephanie Fiedler, stephanie.fiedler@uni-koeln.de

1. Introduction

The representation of tropical precipitation has never been a strength of global climate models. Some reasons are well known, but have proven difficult to address with classical climate modeling approaches. These include the representation of moist convection, which produces the majority of precipitation in the tropics, but is a process that coarse-resolution climate models must parameterize with the help of resolved processes. It is known that model differences in precipitation arising from such an approach can be substantial (e.g., Dai 2006; Stevens and Bony 2013). In reviewing progress over past phases of the Coupled Model Intercomparison Project (CMIP), Stouffer et al. (2017) identify six “particularly important and long-standing biases” that the authors hope will be reduced in CMIP’s sixth phase (CMIP6). The first among these is related to the misrepresentation of tropical precipitation, in the form of tropical rainbands being too hemispherically symmetric, something known as the double intertropical convergence zone (ITCZ) bias. Other studies have pointed to further deficiencies, for example in the representation of the summer monsoon (Zhang et al. 2015), modes of internal variability (Ahn et al. 2017), and the intensity distribution and extremes of precipitation (Stephens et al. 2010).

A correct simulation of the tropical climate matters, not only directly for the region, but also indirectly by influencing the response of the general circulation to forcing at global scales (Held 1983; Palmer and Owen 1986; Zhou and Xie 2015). Precipitation is important due to its many impacts, ranging from ecosystems (Cox et al. 2000) to air pollution (Rodhe and Grandell 1972; Baker and Charlson 1990; Bourgeois and Bey 2011). Hence the past decades have witnessed substantial efforts to improve precipitation in climate models, including the representation of the hydrological cycle in the tropics. Despite these efforts, progress has proven unsatisfactory in past CMIP phases (Hawkins and Sutton 2011; Knutti and Sedláček 2012; Flato et al. 2013), so much so that it has been suggested to pay the computational price of resolving precipitating convection and to abandon the traditional approach to climate modeling with parameterized convection for studying tropical precipitation (Schär et al. 2020; Palmer and Stevens 2019; Satoh et al. 2019). In evaluating these arguments it seems sensible to ask whether progress in simulating tropical precipitation is as unsatisfactory as past evaluations of CMIP models suggest. This question motivates the present study, which revisits tropical precipitation over the three major phases of CMIP: CMIP3, CMIP5, and now CMIP6.

At first glance, the hope that CMIP6 models would substantially address the long-standing biases in precipitation appears unfulfilled. CMIP6 models continue to show large differences in precipitation compared to observations (Fig. 1). Half of the global precipitation occurs between 30°S and 30°N, a region we refer to as the tropics. Regional model biases relative to data from the Tropical Rainfall Measuring Mission (TRMM; Huffman et al. 2007, 2010) range from −3 to 4 mm day−1 (Fig. 1). These occur partly in regions where the absolute amount is smaller than the tropical mean of 3.58 mm day−1 (e.g., in the southeast Pacific and southern Atlantic). Spatial disagreements are a southward displaced precipitation maximum over the Atlantic Ocean, a double-ITCZ pattern in precipitation over the Pacific Ocean, and an east–west precipitation anomaly over the Indian Ocean.

Fig. 1.

Long-term multimodel means of CMIP6 precipitation. Shown are the spatial distributions of the present-day (2000–14) precipitation statistics of CMIP6 as (a) the multimodel mean, (b) bias over land including small islands compared to gridded station observations from CRU, and (c) the mean bias in the tropics against TRMM. The thick contour indicates the isoline for the tropical mean precipitation of CMIP6 (3.58 mm day−1) for an easier comparison of regional biases to the precipitation amount. Biases are calculated from the monthly climatology for 2000–14. We use ensemble averages for models with several historical simulations.

Citation: Monthly Weather Review 148, 9; 10.1175/MWR-D-19-0404.1

The question remains whether biases in tropical precipitation in CMIP6 models have been reduced compared to previous phases of CMIP. By combining the expertise of many authors, we apply here different previously used methods to broadly assess the representation of tropical precipitation across models participating in CMIP6. By applying the same methods to model output from the third and fifth phases of CMIP, we evaluate the extent to which model developments have been successful in improving tropical precipitation. Much of what we show effectively extends previous studies on tropical precipitation in earlier CMIP models to CMIP6. The novelty of the present study is thus not in any specific analysis, but rather through our use of existing techniques to develop and take stock of the big picture. Specifically by looking systematically at the representation of tropical precipitation by three generations of CMIP models, across different regions and scales as measured by various metrics, we assess the status and progress in climate modeling for tropical precipitation.

For the purpose of our study, we collected observations and model output from historical simulations with 3-hourly to monthly resolution from 97 different data sources and applied 14 different analysis approaches. The analyses are based on known methods and chosen for their merit for giving a broad view on different characteristics. Our data and the analysis strategy are introduced in the next section (section 2), followed by the presentation of the results of this analysis, which are distributed across four sections, focusing on the climatology (section 3), natural cycles associated with solar radiative effects (section 4) and modes of internal variability (section 5), and long-term trends in the twentieth century (section 6). Opportunities for future research are discussed in section 7. We end with our conclusions in section 8.

2. Data and methods

a. Data sources

1) Model output

We assess the historical simulations of global coupled climate models produced for the last three major phases of the Coupled Model Intercomparison Project: CMIP3 (Meehl et al. 2007), CMIP5 (Taylor et al. 2012), and CMIP6 (Eyring et al. 2016). In these simulations, the boundary conditions (e.g., irradiation, aerosols, orbital parameters, and greenhouse gas concentrations in the atmosphere) represent those estimated for the historical time period in the CMIP phase and therefore differ slightly from one another. The historical simulations in all phases of CMIP start in 1850 but end in 2000, 2005, and 2014 for CMIP3, CMIP5, and CMIP6, respectively. (Tables S1–S4 in the online supplemental material list the model output used here.)

The availability of model output differs across the CMIP phases and the participating models. We therefore chose the data considering the availability and intended analyses as follows: 1991–2000 for subdaily (3-hourly), 1961–2000 for daily, and 1900–2000 for monthly and annual analyses. For CMIP6, we additionally use data in the period 2000–14 for comparison against the current state-of-the-art observational record for the same time period (section 2b). Analyzed variables are total surface precipitation for all output frequencies as well as near-surface winds and top-of-the-atmosphere outgoing longwave radiation for daily to annual time scales. All CMIP data are averages over the given output intervals. CMIP3 and CMIP5 simulation results are summarized in the corresponding chapters of the fourth and fifth IPCC Assessment Reports (Randall et al. 2007; Flato et al. 2013).

Access to the CMIP data is facilitated by the Earth System Grid Federation (ESGF; Williams et al. 2016). For practical reasons, we use ESGF-published model output, which was already replicated by the German Climate Computing Center [Deutsches Klimarechenzentrum (DKRZ)] until 1 October 2019. Additionally, we use the not-yet-published model output from MPI-ESM-LR produced by the Max-Planck-Institute for Meteorology for CMIP6.

2) Observations

We use four observational datasets, listed in Table 1 and introduced here. The diversity in estimated precipitation among the datasets is taken as a measure of observational uncertainty, which for some ocean and mountainous regions with a sparse ground-based observation network can be considerable (e.g., for the Asian monsoon region) (Ceglar et al. 2017). The rainfall retrieval product of the Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA; Huffman et al. 2007) version 7 provides 3-hourly data for 1998–2019. This dataset, TRMM hereafter, combines data from passive microwave sensors, calibrated by the TRMM precipitation radar, with infrared sensors (Huffman et al. 2010), and is corrected to match rain gauge data. We further use the 3-hourly precipitation estimate from the Climate Prediction Center morphing technique (CMORPH) version 1.0 for 1998–2017 (Joyce et al. 2004). CMORPH uses data from passive microwave measurements and cloud advection vectors from correlated images of infrared sensors. For climate change assessments, we use the gridded precipitation product of the Climatic Research Unit (CRU) time series version 4.03 (Harris et al. 2014) for 1901–2014 with 0.5° spatial resolution, based on gauge networks on land.

Table 1.

Overview of used precipitation observations. Listed are the characteristics of the data and the means for the tropics (G), tropical land (L), and tropical ocean (O), and the ratio of land and ocean precipitation rates (L/O).

To test the observational uncertainty, we additionally use the monthly satellite-gauge product (“3IMERGM”) of the Integrated Multisatellite Retrievals for GPM (IMERG; Huffman et al. 2019) from the Global Precipitation Measurement (GPM) mission (Hou et al. 2014). IMERG extends the concept of TRMM but uses a dual-frequency precipitation radar paired with more passive microwave and infrared sensors. Overall, the observed mean precipitation rate for 2000–14 ranges from 3 mm day−1 (CMORPH) to 3.5 mm day−1 (IMERG) across our four observational datasets (Table 1). Individual regions can show larger observational differences, with the largest observational ranges exceeding 2 mm day−1 over islands, in the lee of mountain ranges, and in coastal areas (Fig. S1). The products mainly disagree over central Africa (CMORPH wet bias), in the Pacific warm pool (CMORPH dry bias), in the lee of mountain ranges in West India and the Malay Peninsula (CMORPH dry bias), on the Caribbean islands (CRU wet bias), and in Central America (CMORPH dry bias). More details including seasonal differences are provided in the supplemental information. CMORPH and TRMM capture the observational range across the assessed satellite products over land and ocean. We therefore use differences between these two products to measure the observational uncertainty in our analyses.

b. Data analysis strategy

All datasets have been screened and standardized for easy handling. This includes remapping the data to the same horizontal grid between 30°S and 30°N. Typically one would choose the coarsest resolution as the common grid to avoid generating information that the model did not simulate, but this approach would have led to a crude comparison since models in CMIP3 had substantially coarser grids than in CMIP6. As a compromise, we use the T63 grid, which is the native grid of MPI-M’s low-resolution configuration of MPI-ESM in CMIP5 and CMIP6. This grid has 192 points along the equator, and hence a spatial resolution of approximately 200 km. We unify the precipitation unit of all datasets by converting to mm day−1.
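The unit conversion and the relation between the number of grid points and the grid spacing can be sketched as follows. This is a minimal illustration, not the processing chain used for the study: it assumes the input precipitation is a mass flux in kg m−2 s−1 (the usual unit of CMIP precipitation output) and a water density of 1000 kg m−3, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400.0
EARTH_CIRCUMFERENCE_KM = 40_075.0  # approximate equatorial circumference


def flux_to_mm_per_day(pr_kg_m2_s):
    """Convert a precipitation flux in kg m-2 s-1 to mm day-1.

    One kilogram of liquid water spread over one square metre forms a
    layer 1 mm deep, so only the time unit has to change.
    """
    return pr_kg_m2_s * SECONDS_PER_DAY


def equatorial_spacing_km(n_lon):
    """Zonal grid spacing at the equator for n_lon equally spaced longitudes."""
    return EARTH_CIRCUMFERENCE_KM / n_lon
```

For example, a flux of 4 × 10−5 kg m−2 s−1 corresponds to about 3.5 mm day−1, which is of the order of the tropical mean values discussed in this study.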

In addition to performing an analysis over the entire tropics, separate analyses are performed for tropical land and ocean. For this purpose we use the land–sea mask of MPI-ESM1.2. We count grid cells with more than 50% ocean surface as ocean and otherwise as land. This approach implies that small islands are assigned to ocean regions. All tropical lakes are defined as land. Results from the analyses for tropical land and ocean are shown if relevant.
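The land–ocean separation described above can be sketched in a few lines. The 50% threshold is taken from the text; the cos(latitude) area weighting, the array layout (latitude, longitude), and the function names are assumptions made for this illustration, and the special treatment of lakes is not reproduced.

```python
import numpy as np


def split_land_ocean(ocean_fraction, threshold=0.5):
    """Return boolean (ocean, land) masks from a field of ocean fractions (0-1).

    Cells with more than `threshold` ocean surface count as ocean,
    all other cells count as land.
    """
    ocean = np.asarray(ocean_fraction) > threshold
    return ocean, ~ocean


def area_weighted_mean(field, lat_deg, mask):
    """Mean of field(lat, lon) over the cells selected by mask.

    Weights are cos(latitude), an approximation of the cell area on a
    regular latitude-longitude grid (assumption of this sketch).
    """
    field = np.asarray(field, dtype=float)
    weights = np.cos(np.deg2rad(lat_deg))[:, None] * np.ones_like(field)
    return float(np.sum(field * weights * mask) / np.sum(weights * mask))
```

A cell with exactly 50% ocean is treated as land here, which follows from the "more than 50%" wording.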

The output from models that provide more than one simulation for the historical period are averaged before computing the mean of a CMIP phase. By this procedure, we avoid giving too much weight to an individual model that produced particularly many simulations. The model output includes both precipitation contributions from the model’s subgrid parameterizations and the fractions associated with atmospheric dynamics explicitly resolved on the model grids.
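The two-step averaging can be illustrated with a short sketch. The data structure (a dictionary mapping a model name to a list of equally shaped arrays, one per historical simulation) and the function name are hypothetical and chosen only for this example.

```python
import numpy as np


def multimodel_mean(runs_by_model):
    """Average the runs of each model first, then average across models.

    runs_by_model maps a model name to a list of equally shaped arrays,
    one array per historical simulation of that model.
    """
    per_model = {name: np.mean(np.stack(runs), axis=0)
                 for name, runs in runs_by_model.items()}
    phase_mean = np.mean(np.stack(list(per_model.values())), axis=0)
    return per_model, phase_mean
```

With three runs from one model and a single run from another, each model contributes half of the phase mean, whereas pooling all four runs would give the first model three quarters of the weight.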

As discussed in the introduction, none of the analysis techniques we employ are novel. Most are widely used in the climate modeling community (e.g., statistics over different time and length scales as well as analyses under different meteorological regimes). Some techniques are less familiar (e.g., the standardized precipitation index and the concept of Jennings scaling; Jennings 1950). These are included to present a broader view of how precipitation is represented in models. We further analyze precipitation associated with a range of different atmospheric features like cloud regimes, monsoons, and intra- and interseasonal variability. The details of these techniques are introduced in the relevant sections.

A comparison of models and observations encompassing different time periods poses a number of challenges. One challenge is the definition of a common time period for the comparison, as not only do the different CMIP phases end on different years, they also overlap differently with satellite datasets. Initially we compared TRMM against CMIP6 for the overlapping time period 2000–14 as validation, and CMIP6 against CMIP5 and CMIP3 for the overlapping period 1900–99 to determine the development across CMIP generations. We found, however, only small differences in the statistics of CMIP6 for 1900–99 and 2000–14 in all our results, consistent with similar long-term mean statistics for CMIP6 (Table 2) and the small past trend in tropical mean precipitation (section 6). For instance, the spatial correlation coefficient of the CMIP6 precipitation climatologies for the twentieth and twenty-first centuries over the tropics is 0.998, much larger than the average correlations between CMIP models and TRMM (Table 3). For simplicity, and because a more temporally consistent comparison adds no new information, we compare TRMM and CMORPH for 2000–14 directly with the different CMIP phases for 1900–99. Another challenge was to establish to what extent changes across phases of CMIP were simply the result of a different mix of models in each phase. To test this possibility we selected the subset of models that participated in all phases of CMIP and tested to what extent this sample of models influenced our conclusion for the climatological mean. We found that using all the CMIP models, or just the subset participating in all CMIP phases yielded similar results (not shown). We further tested averaging over related models to account for different processing practices (Abramowitz et al. 2019). To this end, we calculated the standardized precipitation index on averages of related models in CMIP6, and identified only small differences that did not change our conclusions (not shown).

Table 2.

Long-term mean statistics for tropical precipitation. (from left to right) Listed are the CMIP phase, the time period, the number of models for calculating the long-term statistics, the means ± 1 standard deviation in precipitation for the tropics (G), tropical land (L), and tropical ocean (O), and the ratio of land and ocean precipitation rates (L/O).

Table 3.

Long-term mean comparison of tropical precipitation. Listed are the root-mean-square error/difference (RMSE/D) and the spatial correlation coefficients (r) for the tropics (G), tropical land (L), and tropical ocean (O) as well as the correlation coefficient of the differences between June–August and December–February means as a measure of the seasonal amplitude (S). The top row shows CMIP6 for 1900–99 (20th) against CMIP6 2000–14 (21st), followed by TRMM against CMIP6 for 2000–14 and rows below TRMM (2000–14) against CMIPs (1900–99). The statistics are computed on the multimodel mean precipitation in the three CMIP phases against TRMM.

3. Climatology

a. Tropical mean

There has long been a discrepancy between energy-budget inferences of precipitation and estimates of precipitation based on observations, whereby the former tend to be larger than the latter (Stephens et al. 2012; Stevens and Schwartz 2012; Wild et al. 2012). The tropical precipitation from the CMIP models assessed here is also larger than the observational estimates. Compared to the tropical mean from TRMM of 3.23 mm day−1, CMIP3 overestimates precipitation by 0.21 mm day−1, and CMIP5 and CMIP6 by 0.34 mm day−1 (Table 2). The tropical means of CMIP5 and CMIP6 are outside of the spread in the satellite observations (Table 1). The intermodel standard deviation is larger than the mean bias for CMIP3, but smaller for CMIP5 and CMIP6. The overestimation is also seen for precipitation averaged over oceans, with CMIP5 and CMIP6 being outside of the observational range (Tables 1 and 2). For land, we find a slight underestimation in CMIP3 and CMIP5, but CMIP6 is in the observational range. The observed land-to-ocean ratios in precipitation of 0.86–0.99 are consistently underestimated in all CMIP means. The land–ocean ratio has, however, slightly increased across the CMIP phases, with CMIP6 (0.82) being the closest to the lower bound of the observational range of the land–ocean ratio (Table 1).

The spatial pattern of tropical precipitation shows a systematic improvement across the CMIP phases, although the values still do not fall within the observational uncertainty. We measure this with the spatial correlations, r, in the annual mean tropical precipitation between CMIP and TRMM, with r = 0.75 in CMIP3, r = 0.79 in CMIP5, and r = 0.84 in CMIP6 for the tropical mean (Table 3). Improvements across CMIP in r are also found for both tropical ocean and land separately, with r being slightly larger over ocean than over land (Table 3). The observed pattern differences, measured by r, are larger over land than over ocean (Fig. 2), but none of the CMIP means fall within the observational uncertainty for r, measured as the spatial correlation between CMORPH and TRMM. Only the two best CMIP6 models for this metric, CESM2 and CESM2-WACCM with r > 0.9 over tropical land, come close to the observational range for r, reflecting a regional improvement in the tropical precipitation pattern of these models (Woelfle et al. 2019). The root-mean-square errors (RMSE) for precipitation compared to TRMM have also decreased on average over the CMIP phases, from 1.85 mm day−1 in CMIP3 to 1.80 mm day−1 in CMIP5 and 1.55 mm day−1 in CMIP6, but again these are larger than the observational uncertainty (Table 3). RMSEs are slightly larger over ocean than over land in both CMIP3 and CMIP5, but this behavior has reversed in CMIP6.
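The two pattern statistics used in this paragraph can be sketched as below. This is an illustration under stated assumptions rather than the exact procedure behind Table 3: it uses a centred (mean-removed) correlation and cos(latitude) area weights on a regular latitude-longitude grid, and the function name is hypothetical.

```python
import numpy as np


def pattern_stats(model, obs, lat_deg):
    """Area-weighted centred pattern correlation r and RMSE of two (lat, lon) maps."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Normalised cos(latitude) weights (assumption: regular lat-lon grid).
    w = np.cos(np.deg2rad(lat_deg))[:, None] * np.ones_like(model)
    w = w / w.sum()
    # Remove the weighted spatial mean before correlating the patterns.
    m_anom = model - np.sum(w * model)
    o_anom = obs - np.sum(w * obs)
    cov = np.sum(w * m_anom * o_anom)
    r = cov / np.sqrt(np.sum(w * m_anom**2) * np.sum(w * o_anom**2))
    rmse = np.sqrt(np.sum(w * (model - obs) ** 2))
    return float(r), float(rmse)
```

A uniform offset between the two maps leaves r at 1 but shows up fully in the RMSE, which is why the two measures are reported side by side.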

Fig. 2.

Taylor diagrams for tropical precipitation. Shown are the correlation coefficient, spatial standard deviation, and the root-mean-square error following Taylor (2001) of the tropical precipitation over (left) land and (right) ocean. Statistics are calculated on the (a),(b) long-term means and (c),(d) the difference between June–August and December–February means for the models (colored circles) against TRMM (black star). We mark the spread and average for all models per CMIP phase (colored lines) and the average for the selection of those models that participated in all CMIP phases (colored stars). We show CMIP6 model data for 1900–99 only, since the differences in the statistics of CMIP6 for 2000–14 and 1900–99 are small. The observational uncertainty is indicated by calculating the same statistics for CMORPH (gray star) against TRMM.

Figure 2 shows the standard deviations of CMIP models in comparison to TRMM for land and ocean. This measure of annual-mean variability is too small in CMIP3, whereas CMIP5 and CMIP6 are closer to observations over tropical land. Over tropical ocean, the standard deviation has been similar across the CMIP phases. The difference between the mean precipitation for June–August and December–February is used as a measure for the seasonal amplitude S. The spatial correlations of S between the models and TRMM have improved across the CMIP phases (Fig. 2 and Table 3), but all values fall outside of the observational uncertainty.
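The seasonal amplitude S can be computed from a monthly climatology as sketched here. The array layout (12 months, latitude, longitude), the equal weighting of the three months in each season, and the function name are assumptions of this example.

```python
import numpy as np

JJA = [5, 6, 7]    # zero-based indices of June, July, August
DJF = [11, 0, 1]   # December, January, February


def seasonal_amplitude(monthly_clim):
    """S = JJA mean minus DJF mean for a (12, lat, lon) monthly climatology."""
    monthly_clim = np.asarray(monthly_clim, dtype=float)
    return monthly_clim[JJA].mean(axis=0) - monthly_clim[DJF].mean(axis=0)
```

The resulting map of S can then be compared between a model and an observational product with the same spatial correlation that is used for the annual mean.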

We test the hypothesis that improvements in precipitation across the CMIP phases occur in tandem with a reduction in large-scale SST biases. Climate models typically underestimate SSTs by several degrees in large parts of the tropical oceans, especially in the cold tongue region of the Pacific Ocean (e.g., Woelfle et al. 2019), while they overestimate SSTs in the upwelling regions on the eastern side of the basins (Li and Xie 2012). We find, however, no clear indication that the large-scale precipitation differences over tropical oceans are tightly linked with model differences in SSTs, either for the entire tropical oceans or for the cold tongue in the Pacific, although some of the SST biases in CMIP6 are smaller than in CMIP3 and CMIP5 by up to 1 K (Figs. S2 and S3).

b. Zonal mean

Despite evidence of improvements in the spatial pattern of precipitation, we find no sign of improvement for the zonal mean precipitation across the CMIP phases (Figs. 3a–c). The zonally averaged annual mean precipitation is remarkably robust across all phases of CMIP. The Northern Hemisphere rainfall maximum is well matched compared to TRMM. In the Southern Hemisphere, the rainfall maximum in CMIP6 and CMIP5 is, however, overestimated compared to both TRMM and CMIP3. This is likely related to the too-pronounced double ITCZ in the models (e.g., Li and Xie 2014) and possibly explains the mean differences in tropical precipitation in the central Pacific (Fig. 1). In previous work, the double ITCZ has been related to biases in the ocean–atmosphere feedbacks in the tropical Pacific (Lin 2007), errors in cloud simulations (Li and Xie 2014), and the cold tongue bias in the tropical Pacific (Samanta et al. 2019).

Fig. 3.

Zonal mean precipitation. Shown are annual means across tropical latitudes for (a) CMIP6 compared to TRMM, (b) CMIP5 compared to CMIP6, and (c) CMIP3 compared to CMIP6, with shading indicating the model spread as one standard deviation, and (d) the double-ITCZ index I calculated using tropical precipitation in the regions defined by Samanta et al. (2019) and explained in the text. In (d), the box-and-whisker plots indicate the median, quartiles, and extremes in CMIP3, CMIP5, and CMIP6, and the horizontal lines are the TRMM and CMORPH observational means.

The double ITCZ in CMIP6 (Figs. 3a–c) shows the same biases as previously reported for CMIP3 and CMIP5 (Zhang et al. 2015). As a quantitative comparison, we compute the double-ITCZ index I (Samanta et al. 2019):

$$I = \frac{P_N + P_S}{2 P_E}.$$

Here $P_N$ is the mean precipitation in the northern box (5°–15°N, 160°E–120°W), $P_S$ is the mean precipitation in the southern box (15°–5°S, 160°E–120°W), and $P_E$ is the mean precipitation in the equatorial box (5°S–5°N, 160°E–120°W). The median double-ITCZ index is largely unchanged across the phases of CMIP, with I = 4.3 in CMIP3, I = 3.6 in CMIP5, and I = 4.0 in CMIP6 (Fig. 3d). Compared to the observational estimates of I = 1.5 (CMORPH) and I = 1.7 (TRMM), the median of I is too large by more than a factor of 2 in all CMIP phases. This means that the tropical precipitation over the Pacific Ocean is overestimated (cf. Fig. 1c). The model spread decreases from one CMIP generation to the next, but this is not an improvement: some models reproduced the observed double-ITCZ index in both CMIP3 and CMIP5, but none do in CMIP6.
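A box-based index of this kind can be sketched as follows. The sketch assumes that the index is the ratio of the mean off-equatorial precipitation to the equatorial precipitation, I = (P_N + P_S) / (2 P_E); it further assumes longitudes in degrees east from 0 to 360 (so that 120°W is 240°E), cos(latitude) weights, and inclusive box edges. All function names are hypothetical.

```python
import numpy as np


def box_mean(pr, lat, lon, lat_min, lat_max, lon_min=160.0, lon_max=240.0):
    """cos(lat)-weighted mean of pr(lat, lon) in a box; lon in degrees east, 0-360.

    Box edges are inclusive, so a grid point located exactly on a shared
    edge (for example 5N) is counted in both neighbouring boxes.
    """
    lat_sel = (lat >= lat_min) & (lat <= lat_max)
    lon_sel = (lon >= lon_min) & (lon <= lon_max)
    sub = pr[np.ix_(lat_sel, lon_sel)]
    w = np.cos(np.deg2rad(lat[lat_sel]))[:, None] * np.ones_like(sub)
    return float(np.sum(sub * w) / np.sum(w))


def double_itcz_index(pr, lat, lon):
    """Ratio of mean off-equatorial to equatorial precipitation (assumed form)."""
    p_n = box_mean(pr, lat, lon, 5.0, 15.0)     # northern box
    p_s = box_mean(pr, lat, lon, -15.0, -5.0)   # southern box
    p_e = box_mean(pr, lat, lon, -5.0, 5.0)     # equatorial box
    return (p_n + p_s) / (2.0 * p_e)
```

With this form, an index of 1 would mean equal precipitation on and off the equator, and larger values indicate a more pronounced pair of off-equatorial rainbands.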

c. Intensity distribution

Frequency and intensity are important precipitation characteristics with implications for hydrology and aerosol burden. For instance, a large model spread for surface runoff has been identified in CMIP5 (Lehner et al. 2019), and for aerosol burden in aerosol–climate models (e.g., Baker and Charlson 1990; Textor et al. 2006; Fan et al. 2018). Even models with a relatively accurate representation of the spatial pattern of precipitation may have large biases in the frequency and intensity (e.g., Trenberth et al. 2003; Pendergrass and Hartmann 2014). Models in CMIP3 and CMIP5 are known to produce too-frequent drizzle (e.g., Baker and Huang 2014; Pendergrass and Hartmann 2014; Sun et al. 2015). Here, we test to what extent this behavior has improved using long-term statistics of the frequency of wet 3-h means, the 1-day lag autocorrelation, the number of consecutive dry days, and scaling relationships between precipitation amount and its duration.

1) Wet and dry frequency

All CMIP phases consistently produce more frequent wet 3-h means in tropical precipitation than observed (Fig. 4a). This overestimation has been slightly reduced in CMIP6 with 85% of the 3-hourly means being wet, compared to 93% in CMIP3. However, this is still a substantial overestimation of the occurrence of precipitation compared to the observed frequency of 44%–54%. The improvement in tropical precipitation frequency in CMIP6 is primarily explained by the reduction of wet 3-h means over tropical oceans, whereas the frequency over land has only slightly decreased compared to CMIP3 (Figs. S4 and S5). We note, however, a substantial model spread for the frequency of precipitation rates in all CMIP phases (Fig. S6).
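
The wet frequency can be illustrated with a short sketch that flattens a 3-hourly precipitation array in time and space and counts the wet values. The threshold that separates wet from dry 3-h means is not stated in this section, so `wet_threshold` is a hypothetical parameter that the caller has to supply.

```python
import numpy as np

def wet_frequency(precip_3h, wet_threshold):
    """Fraction of 3-hourly means above wet_threshold.

    precip_3h may have any shape, e.g. (time, lat, lon); it is flattened
    into a single dimension and missing values (NaN) are ignored.
    wet_threshold is a hypothetical parameter, not a value from the paper.
    """
    values = np.asarray(precip_3h, dtype=float).ravel()
    values = values[np.isfinite(values)]
    return np.count_nonzero(values > wet_threshold) / values.size
```

Note that this simple count gives every grid cell the same weight, which is an assumption of the sketch.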

Fig. 4.

Wet and dry periods. (a) The frequency of wet 3-h means calculated by flattening 3-hourly CMIP data and observations in time and space into a single dimension and counting the number of precipitation events; (b) the 1-day lag autocorrelation of total daily precipitation, temporally and spatially averaged for CMIP and observations; and (c),(d) the number of consecutive dry days (CDD) as (c) box-and-whisker plot for the time and spatial average of CDD of CMIP and observations plotted as horizontal lines and (d) the probability of occurrence of CDD across time and space.


We measure the day-to-day variability and spatiotemporal coherence of the tropical precipitation with the 1-day lag autocorrelation (Fig. 4b). A realistic lag autocorrelation is associated with an improved representation of deep convection and convection coupled to equatorial waves, including the Madden–Julian oscillation (Peters et al. 2017; Ma et al. 2019) assessed in section 5a. Atmospheric models with parameterized moist convection are known to have unrealistic day-to-day variability in precipitation (Peters et al. 2017, 2019) due to deficiencies in the physical parameterization schemes that lead to too-frequent triggering of deep convection (Klingaman et al. 2017; Peters et al. 2017). This behavior is characterized by too-large 1-day lag autocorrelations (i.e., wet episodes over several days are not sufficiently interrupted by dry days). We identify a slight improvement in the 1-day lag autocorrelation from CMIP3 and CMIP5 to CMIP6, namely, a reduction from roughly 0.60 in CMIP3 to 0.50 in CMIP6. Since the lag autocorrelation is sensitive to the representation of convection (Klingaman et al. 2017; Peters et al. 2017), this result indicates that the past model development between CMIP phases contributed to a slightly better day-to-day variability of moist convection. However, compared to the observed 1-day lag autocorrelation of 0.35 in both TRMM and CMORPH, CMIP6 models still substantially overestimate this quantity, pointing to too-little intermittence of rainy episodes.
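
As an illustration of the metric, the sketch below computes the 1-day lag autocorrelation of daily precipitation at every grid point. It uses one common estimator (lagged covariance of anomalies divided by the total variance); the exact estimator and the averaging used for Fig. 4b are not specified here, so this choice is an assumption.

```python
import numpy as np

def lag1_autocorrelation(daily_precip):
    """1-day lag autocorrelation along axis 0 (time) at each grid point.

    Assumed estimator: sum of products of anomalies one day apart,
    divided by the sum of squared anomalies.
    """
    x = np.asarray(daily_precip, dtype=float)
    anomalies = x - x.mean(axis=0)
    covariance = np.sum(anomalies[1:] * anomalies[:-1], axis=0)
    variance = np.sum(anomalies ** 2, axis=0)
    # Grid points with constant precipitation have zero variance and yield NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return covariance / variance
```

A spatial average can then be taken with `np.nanmean`. Values near 1 indicate persistent wet or dry spells, whereas values near 0 indicate day-to-day intermittence.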

The maximum number of consecutive dry days (CDD) is used to quantify the length of dry periods. Following the Expert Team on Climate Change Detection and Indices (ETCCDI) as used by Frich et al. (2002), CDD is defined as the maximum number of consecutive days within a year that have total daily precipitation amounts of less than 1 mm day−1. This threshold removes days with light drizzle events that are difficult to measure. Although TRMM (TMPA) is better for light rain events than other satellite-based data (e.g., Burdanowitz et al. 2015), it misses light precipitation (Behrangi et al. 2014), which affects the frequency of occurrence of precipitation events (Klepp et al. 2018). By eliminating days with such light drizzle events, we determine the differences in CDD considering more regular to extreme precipitation events. We show the spatial and temporal average of CDD (Fig. 4c) and the probability distribution of CDD across time and space (Fig. 4d). The latter primarily indicates spatial variability for CMIP due to the small year-to-year changes in ensemble-averaged CDD with standard deviations of 0.47–0.69 (not shown).
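
The CDD definition can be sketched for a single grid point and a single year as the longest run of days below the 1 mm day−1 threshold. The function name is illustrative; applying it per grid point and per year is left to the caller.

```python
def max_consecutive_dry_days(daily_precip, dry_threshold=1.0):
    """Longest run of days with precipitation below dry_threshold (mm/day).

    daily_precip is a 1D sequence of daily totals for one year at one
    grid point.
    """
    longest = 0
    current = 0
    for value in daily_precip:
        if value < dry_threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a wet day ends the dry spell
    return longest
```

For example, the sequence 0, 0.5, 2, 0, 0, 0.9, 5, 0 mm day−1 contains dry spells of 2, 3, and 1 days, so its CDD is 3.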

There is an improvement over the three CMIP generations in averaged CDD and their probability of occurrence (Figs. 4c,d), but CMIP6 models still produce shorter dry periods on average than observed (Fig. 4c). The remaining difference from the observations stems from the poor representation of extremely long dry episodes (Fig. 4d). For instance, the climate models show too-low probabilities for CDD longer than 130 days in CMIP3, and 200 days in CMIP5 and CMIP6, compared to the observations (Fig. 4d). The underestimation of such extremely long dry episodes is primarily explained by the mismatch in CDD over oceans (Figs. S4 and S5), but this is also the region where the improvement across CMIP generations was largest. Compared to the ocean, the number of CDD over land is generally better captured by CMIP models, except for CDDs longer than 250 days. The probability of occurrence for these extremely long dry episodes has slightly improved from CMIP3 to CMIP6, but the occurrence of more than 300 CDDs in deserts is still underestimated. This finding has implications for other processes in the Earth system (e.g., dust-aerosol emissions, which are influenced by the soil moisture and lack of vegetation cover) (e.g., Shao 2001; Kok et al. 2014).

2) Jennings scaling

Jennings (1950) discovered a scaling law, $P \sim D^{\alpha}$, that describes the global maximum of precipitation, P, observed at rain gauges over land during an interval of some duration, D, with the exponent α ~ 1/2 for periods of minutes to 1 year. Even earlier research on thresholds of rainfall extremes supports this power-law scaling (Wussow 1922). This type of scaling can be reproduced by simple thermodynamic models whose large-scale input is modulated by stochastic forcing (e.g., Field and Shutts 2009; Zhang et al. 2013b). The scaling relationships described by Jennings, sometimes called maximum depth–duration graphs, have entered textbooks in hydrology, but their application to the evaluation of climate model output is less common (Zhang et al. 2013a), which motivates the present analysis. Moreover, whereas previous studies focused on time periods of minutes to one year, we test here the extension of the Jennings scaling to decades by calculating the slope for data over longer averaging intervals. We find that the Jennings slopes are very similar in TRMM and all phases of CMIP.

The rainfall maximum P for a given D is determined from the spatial distributions of tropical precipitation. The literature typically refers to P as depth. The depth is the maximum across time and space in the running means of daily precipitation over the duration, calculated at every grid point. Durations range here from 1 day to 1 decade, with steps of 1 day, for all datasets, except for CMIP6 and TRMM, where we also use the entire overlapping period 2001–14. We show three examples of the resulting points that fall on a line in the depth–duration space (Fig. 5a). We find that all regression lines closely fit the data points, with $R^2$ = 0.97 being the smallest coefficient of determination across all datasets here. The slopes α of these lines are shown in Fig. 5b and are known as Jennings slopes.
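
The depth–duration calculation can be sketched as follows. The sketch assumes that the depth is the accumulated amount over the duration, that is, the running mean multiplied by D, and that the Jennings slope is obtained from an ordinary least-squares fit in log–log space. Both choices and all function names are assumptions made for illustration; the text above specifies only running means and a regression line.

```python
import numpy as np

def depth_for_duration(daily_precip, duration):
    """Maximum over time and space of the accumulation over `duration` days.

    daily_precip has shape (time, ...) in mm/day. The accumulation equals
    the running mean times the duration (an assumption of this sketch).
    """
    x = np.asarray(daily_precip, dtype=float)
    csum = np.cumsum(x, axis=0)
    # Prepend a zero so that differences of cumulative sums give running sums.
    csum = np.concatenate([np.zeros((1,) + x.shape[1:]), csum], axis=0)
    running_sum = csum[duration:] - csum[:-duration]
    return running_sum.max()

def jennings_slope(daily_precip, durations):
    """Slope alpha of log10(depth) against log10(duration)."""
    durations = np.asarray(durations)
    depths = np.array(
        [depth_for_duration(daily_precip, int(d)) for d in durations]
    )
    slope, _intercept = np.polyfit(np.log10(durations), np.log10(depths), 1)
    return slope
```

In this formulation, rain that falls at a constant rate yields α = 1, whereas a single isolated wet day yields α = 0; realistic data fall in between.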

Fig. 5.

Jennings scaling. (a) Three examples for the calculated data points that fall on a line in the depth–duration (P-D) space and (b) the Jennings slopes α of that line across the CMIP phases and in TRMM, compared to the gauge measurements used by Jennings (1950). The box-and-whisker plots show the means, quartiles, and extremes across the CMIP phases.


In both the CMIP output and the data from TRMM, α is larger than the value determined from the earlier gauge measurements by Jennings (1950). In addition to the different spatial scales, Jennings (1950) covers minutes to 1 year, while we start with daily precipitation and move to decadal scales for TRMM and CMIP. The line in Fig. 5a indicates that the steeper slopes in TRMM are primarily explained by the interval from 1 year to 1 decade (i.e., there is some curvature in the slope when moving to longer averaging intervals). Paired with the different spatial representation of gauge measurements and the gridded data, this explains why CMIP and TRMM produce slopes that are more similar to one another than to that of Jennings (1950), with CMIP6 following the observations better than the previous phases of CMIP.

There is considerable variability in the estimates of α from the CMIP output, although less in CMIP6 than in previous CMIP phases. The relatively good match between TRMM and CMIP6 based estimates of α suggests that despite biases in the distribution of precipitation, the tendency for long-duration events to be associated with more intense rainfall is well captured by the models.

d. Low-level and deep clouds

We investigate tropical precipitation associated with different cloud regimes, namely, low-level and deep clouds. Low-level cloud regimes cover large parts of the tropics away from the ITCZ. Using an outgoing longwave radiation (OLR) threshold of >250 W m−2 to exclude areas of deep convection, we estimate the observed fractional area coverage of low-level cloud regimes from the daily CERES product (Loeb et al. 2009) to be 68% (not shown). We choose the threshold of 250 W m−2, corresponding to a brightness temperature of 258 K and similar to other studies (e.g., Masunaga et al. 2005). Note that this includes both low-level cumuli and stratiform clouds. It also includes a fraction of cumulus congestus, which is not distinguishable from low-level clouds with OLR. Sensitivity tests with other thresholds of 240–260 W m−2, consistent with Stubenrauch et al. (1999), give qualitatively similar results to 250 W m−2. For the analysis, we use daily OLR and precipitation data, available from 16 CMIP3 models, 32 CMIP5 models, and 14 CMIP6 models, marked by indices 2 and 3 in Tables S1–S4.
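
The correspondence between the OLR threshold and the brightness temperature can be checked with the Stefan–Boltzmann law. The sketch assumes blackbody emission, OLR = σT⁴; how the conversion was actually done is not stated in the text.

```python
STEFAN_BOLTZMANN = 5.670374419e-8  # W m^-2 K^-4

def brightness_temperature(olr):
    """Blackbody temperature (K) for an outgoing longwave flux (W m^-2).

    Assumes OLR = sigma * T**4, i.e. an emissivity of one.
    """
    return (olr / STEFAN_BOLTZMANN) ** 0.25

# About 258 K for the 250 W m^-2 threshold.
threshold_temperature = brightness_temperature(250.0)
```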

The CMIP means have a fractional low-level cloud area similar to the observations, with a slight increase from 65% for CMIP3 to 69% for CMIP6. Despite a similar areal coverage, the models differ substantially by 50%–100% in the amount of precipitation associated with low-level clouds (Fig. 6a). There is no clear improvement over the CMIP phases, although the very large outliers evident in CMIP3 and CMIP5, with precipitation fractions associated with low-level cloud regimes larger by a factor of four to five, have been reduced in CMIP6. Some models in CMIP6 lie within the observational range for the fractional precipitation amount associated with low-level clouds (10%–14%), namely, BCC-ESM1 (12%), CNRM-CM6-1 (10%), and CNRM-ESM2-1 (11%). These three models, however, tend to underestimate the fractional area coverage of the low-level cloud regimes with 59%, 65%, and 65%, respectively.
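
The two regime statistics discussed here can be sketched from collocated daily OLR and precipitation fields. The sketch counts every grid cell and day equally, which is a simplification, and the function name is illustrative.

```python
import numpy as np

def low_cloud_regime_stats(daily_precip, daily_olr, olr_threshold=250.0):
    """Area and precipitation fractions of the low-level cloud regime.

    Both inputs share the same shape, e.g. (time, lat, lon). The regime is
    defined by daily OLR > olr_threshold (W m^-2). Grid cells are weighted
    equally here, without area weighting (an assumption of this sketch).
    """
    precip = np.asarray(daily_precip, dtype=float)
    olr = np.asarray(daily_olr, dtype=float)
    low = olr > olr_threshold
    area_fraction = low.mean()
    # Precipitation in the regime divided by total tropical precipitation.
    precip_fraction = precip[low].sum() / precip.sum()
    return area_fraction, precip_fraction
```

A regime that covers much of the tropics but contributes only a small share of the rain, as in the observations, shows up as a large `area_fraction` together with a small `precip_fraction`.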

Fig. 6.

Precipitation associated with clouds of different depth. (a) The fraction of precipitation associated with low-level clouds, defined as the daily precipitation in regions with daily outgoing longwave radiation (OLR) greater than 250 W m−2 divided by the total tropical precipitation amount. (b) Tropical mean in daily OLR against daily precipitation amount binned by steps of 10 mm day−1. Shaded areas mark half the standard deviation of the model spread. The probability density functions of individual models are shown in the supplemental material (Fig. S6). In both (a) and (b) the black (gray) line is the precipitation observed by TRMM (CMORPH) and OLR based on CERES. We use here CMIP model data marked with indices 2 and 3 in Tables S1–S4, a slightly smaller set than in analyses that use only daily precipitation, because of the limited availability of OLR output.
