Metrics for Evaluating CMIP6 Representation of Daily Precipitation Probability Distributions

Cristian Martinez-Villalobos (a, b), J. David Neelin (c), and Angeline G. Pendergrass (d, e)

a Faculty of Engineering and Sciences, Universidad Adolfo Ibañez, Peñalolen, Santiago, Chile
b Data Observatory Foundation, Santiago, Chile
c Department of Atmospheric and Oceanic Sciences, University of California, Los Angeles, Los Angeles, California
d Department of Earth and Atmospheric Sciences, Cornell University, Ithaca, New York
e National Center for Atmospheric Research, Boulder, Colorado

Abstract

The performance of GCMs in simulating daily precipitation probability distributions is investigated by comparing 35 CMIP6 models against observational datasets (TRMM-3B42 and GPCP). In these observational datasets, PDFs on wet days follow a power-law range for low and moderate intensities below a characteristic precipitation cutoff scale. Beyond the cutoff scale, the probability drops much faster, hence controlling the size of extremes in a given climate. In the satellite products analyzed, PDFs have no interior peak. Contributions to the first and second moments tend to be single-peaked, implying a single dominant precipitation scale; the relationship to the cutoff scale and log-precipitation coordinate and normalization of frequency density are outlined. Key metrics investigated include the fraction of wet days, PDF power-law exponent, cutoff scale, shape of probability distributions, and number of probability peaks. The simulated power-law exponent and cutoff scale generally fall within observational bounds, although these bounds are large; GPCP systematically displays a smaller exponent and cutoff scale than TRMM-3B42. Most models simulate a more complex PDF shape than these observational datasets, with both PDFs and contributions exhibiting additional peaks in many regions. In most of these instances, one peak can be attributed to large-scale precipitation and the other to convective precipitation. Similar to previous CMIP phases, most models also rain too often and too lightly. These differences in wet-day fraction and PDF shape occur primarily over oceans and may relate to deterministic scales in precipitation parameterizations. It is argued that stochastic parameterizations may contribute to simplifying simulated distributions.

© 2022 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Cristian Martinez-Villalobos, cristian.martinez.v@uai.cl

1. Introduction

Precipitation is an essential element to society and environment. Beyond its role in providing for agriculture, industry, and personal water needs, precipitation interacts with other climate variables, ultimately shaping the world as we know it. In a given location, a first-order picture of the effects of precipitation is given by its temporal mean (i.e., the total precipitation that falls in a given period), but this leaves out important details, such as the variability of precipitation in the region. Thus, because society is not only adapted to mean conditions, a complete assessment of local precipitation should characterize the whole temporal distribution of rainfall.

An important tool for projecting the local precipitation response to a variety of forcings, including different global warming scenarios, is the use of highly sophisticated global climate models (GCMs). These models aim to simulate credible realizations that can be plausibly compared to the actual evolution of weather and climate. Due to the complex interactions that give rise to rainfall, precipitation is one of the most challenging variables for GCMs to simulate (Flato et al. 2013). Indeed, different models often use different versions of large-scale and convective precipitation parameterizations (which parameterize subgrid-scale processes not explicitly simulated), with no approach being immune to modeling issues. Previous phases of the Coupled Model Intercomparison Project (CMIP) have revealed long-standing problems in simulating, for example, how often and how hard it rains (Stephens et al. 2010; Rosa and Collins 2013; Terai et al. 2018), the magnitude of extremes (O’Gorman and Schneider 2009; Gervais et al. 2014; Wehner et al. 2014; Abdelmoaty et al. 2021), and the shape of the PDF (Pendergrass and Hartmann 2014; Terai et al. 2018; Chen et al. 2021)—simulated daily precipitation PDFs are often more complex than observed, including deviations in the low- to medium-intensity regime. The main goals of this paper are to introduce a set of metrics to evaluate the probability distributions of daily precipitation, to place them in context of literature on their physical interpretation, and to apply them in an initial evaluation of simulations from phase 6 of CMIP (CMIP6).

There is considerable discussion regarding biases in frequency and intensity of wet-day precipitation [e.g., Flato et al. (2013)], but for a number of reasons these metrics do not necessarily correspond to the fundamental physical processes on which simulations of precipitation are based. This may be why they have not led to substantial improvement over the generations of model development during which awareness of these issues has grown. More physically motivated metrics, and those that are robust to spatial and temporal resolution, may be able to break this deadlock.

Since the last generation of CMIP simulation, our understanding of how physical processes govern the shape of daily precipitation PDFs has improved. Martinez-Villalobos and Neelin (2019), using a model based on a simplified version of the moisture equation (Stechmann and Neelin 2014; Neelin et al. 2017), provide a first-order explanation on how the moisture budget controls the shape of daily precipitation PDFs and why they have shapes that are often approximated by gamma or similar distributions. A gamma distribution has historically been one of the most popular choices to empirically fit daily precipitation PDFs over wet days (Barger and Thom 1949; Thom 1958; Ropelewski et al. 1985; Groisman et al. 1999; Wilby and Wigley 2002; Watterson and Dix 2003; Husak et al. 2007; Martinez-Villalobos and Neelin 2018; Chang et al. 2021), although other gamma-like alternatives are also used (e.g., Wilks 1998; Wilson and Toumi 2005; Papalexiou and Koutsoyiannis 2012).

To leading order, the bulk of the PDF of observed rainfall contains two ranges, governed by different physical balances and characterized by different metrics: 1) a range with no dominant physical scale (“scale-free range”) at low-to-medium intensities, approximated by a power law with exponent $-\tau_P$ ($\tau_P < 1$) controlling the probability of low and moderate daily precipitation values; and 2) a range governed by a dominant scale, namely the precipitation cutoff scale $P_L$, which controls the probability of medium-to-large events. For present purposes, these two ranges can be captured to a leading approximation by a gamma distribution. We emphasize that we are not relying on conformance to a particular distribution, but we use gamma-like distribution properties to inform metrics, their interpretations, and the relationships among them. For applications to more subtle features, such as deviations from the approximate power-law scaling at low values (Papalexiou and Koutsoyiannis 2016) or accurately capturing the folding of the very extreme tail (Papalexiou and Koutsoyiannis 2013; Cavanaugh et al. 2015; Papalexiou and Koutsoyiannis 2016), distributions with an additional parameter (e.g., generalized gamma distribution, Burr type XII distribution) can be better suited (Papalexiou and Koutsoyiannis 2012); similar considerations to those presented here can still apply, although with added complexity, as further discussed in section 2. The approximate power-law range arises from fluctuations across the threshold between raining and nonraining conditions. For daily average precipitation, a main control of the exponent $\tau_P$ is the number of individual precipitating events within wet days (Martinez-Villalobos and Neelin 2019): all else being equal, regions with fewer events per day tend to have steeper power-law ranges. For the approximately exponential range governing large events, the cutoff scale $P_L$, which is the main precipitation scale in observed PDFs, is given by a balance between the variability of moisture converging during precipitating events and a measure of moisture loss by precipitation in them (Stechmann and Neelin 2014; Neelin et al. 2017; Martinez-Villalobos and Neelin 2019).

We expect a variety of different model representations of the processes that yield $P_L$ and $\tau_P$, so as a first-order picture we evaluate how well models simulate these parameters. However, models may sometimes deviate from the power-law and cutoff-scale picture that tends to hold in satellite-based precipitation products (see below) and station data (Schiro et al. 2016; Martinez-Villalobos and Neelin 2018, 2019; Chang et al. 2020). Simulated PDFs can thus be more complex; for instance, a bump can indicate an artificial scale introduced into the scale-free range. Thus, in addition to $P_L$, $\tau_P$, and other commonly used scalar metrics (mean, standard deviation, fraction of wet days), we also employ metrics that evaluate the “shape” of simulated probability distributions and their probability distance compared to their observed counterparts.

The paper is organized as follows. Section 2 presents the data and introduces the metrics used. Section 3 gives an overview of the uncertainty between different observational products used to compare to models. This observational uncertainty is necessary to adequately evaluate model performance. Section 4 presents the model evaluation. Section 5 summarizes the study and discusses its implications.

2. Data and methods

a. CMIP6 models and observational datasets

We use daily precipitation from the first variant of 35 CMIP6 models (see Table 1) over the period 1990–2014. To estimate observational uncertainty bounds for meaningful comparison with models, we use six different daily precipitation products: TRMM-3B42 v7.0 (50°S–50°N, 1998–2016) and its microwave-calibrated infrared (IR) and microwave-only (MW) variants (Huffman et al. 2007), CMORPH V1.0 CRT (60°S–60°N, 1998–2017) (Xie et al. 2017), PERSIANN CDR v1 r1 (50°S–50°N, 1983–2017) (Ashouri et al. 2015), and GPCP 1DD CDR v1.3 (90°S–90°N, 1997–2017) (Huffman et al. 2001), all taken from the Frequent Rainfall Observations on Grids (FROGS) database (Roca et al. 2019). These are satellite-based products with correction to gauges over land. Some models have relatively coarse native resolutions (see Table 1), so all models and datasets are coarsened, using the ESMF_regrid function in NCL under the “conserve” option, onto a 3° × 3° latitude–longitude grid prior to analysis.

Table 1. List of models.

b. PDFs and contributions

In this subsection and the following one, we aim to reconcile our terminology with the framework of Pendergrass and Hartmann (2014) for looking at the distribution of precipitation. We calculate PDFs as normalized histograms with bins approximately constant in $\log(P)$ space, with $P$ denoting daily precipitation. The first and second moments of the precipitation distribution are used to calculate the mean and variance of precipitation. A moment ratio will be used as an estimator of the precipitation scale, so it is useful to examine the contributions to each of these integrals as a function of precipitation. We thus define
$$\hat{C}_{\mathrm{amount}}(P) = P \times \mathrm{PDF}(P), \qquad \hat{C}_{\mathrm{var}}(P) = P^2 \times \mathrm{PDF}(P).$$
Here, $\hat{C}_{\mathrm{amount}}$ is the quantity that integrates to the mean precipitation over wet days [$\bar{P}_{\mathrm{wet}}$, for $P$ measured in millimeters per day (mm day−1); if $P$ were measured in millimeters the integral would yield total precipitation]. Similarly, $\hat{C}_{\mathrm{var}}$ is the quantity that integrates to the second moment ($m_2$), a quantity closely related to the variance on wet days ($\sigma_P^2$):
$$\bar{P}_{\mathrm{wet}} = \int_0^\infty \hat{C}_{\mathrm{amount}}(P)\,dP, \qquad m_2 = \int_0^\infty \hat{C}_{\mathrm{var}}(P)\,dP, \qquad \sigma_P^2 = m_2 - \bar{P}_{\mathrm{wet}}^2.$$
Since the quantities that $\hat{C}_{\mathrm{amount}}$ and $\hat{C}_{\mathrm{var}}$ integrate to are evaluated with other metrics (see next subsection), it is useful to define the normalized contributions (referred to simply as contributions in what follows) as
$$C_{\mathrm{amount}} = \frac{\hat{C}_{\mathrm{amount}}}{\bar{P}_{\mathrm{wet}}}, \qquad C_{\mathrm{var}} = \frac{\hat{C}_{\mathrm{var}}}{m_2},$$
such that $\int_0^\infty C_{\mathrm{amount}}\,dP = 1$ and $\int_0^\infty C_{\mathrm{var}}\,dP = 1$. Thus, any difference between observed and modeled contributions is in their shape, which facilitates the construction of metrics for differences in the shape of distributions.
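The definitions above can be sketched numerically as follows. This is an illustrative NumPy sketch, not the authors' code: the function name `pdf_and_contributions`, the number of bins, and the synthetic gamma sample standing in for a pooled daily series are all assumptions made for the example.

```python
import numpy as np

def pdf_and_contributions(precip, wet_threshold=0.1, n_bins=40):
    """Wet-day PDF on log-spaced bins and the normalized contributions.

    precip is a 1D array of daily precipitation (mm/day). The threshold
    default follows the text; n_bins is an illustrative choice.
    """
    wet = precip[precip >= wet_threshold]
    # Bin edges equally spaced in log(P) between the threshold and the maximum.
    edges = np.geomspace(wet_threshold, wet.max(), n_bins + 1)
    widths = np.diff(edges)
    centers = np.sqrt(edges[:-1] * edges[1:])   # geometric bin centers
    counts, _ = np.histogram(wet, bins=edges)
    pdf = counts / (counts.sum() * widths)      # integrates to 1 over P
    c_amount_hat = centers * pdf                # integrates to the wet-day mean
    c_var_hat = centers**2 * pdf                # integrates to the second moment
    c_amount = c_amount_hat / np.sum(c_amount_hat * widths)
    c_var = c_var_hat / np.sum(c_var_hat * widths)
    return centers, widths, pdf, c_amount, c_var

# Synthetic stand-in for a pooled daily series: gamma with tau_P = 0.5, P_L = 10.
rng = np.random.default_rng(0)
sample = rng.gamma(shape=0.5, scale=10.0, size=200_000)
centers, widths, pdf, c_amount, c_var = pdf_and_contributions(sample)
```

Because both contributions are normalized to unit integral, any remaining difference between two datasets processed this way lies in the shape of the curves.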

c. Terminology and normalization

It is worth clarifying differences in nomenclature and in normalization between (linear) precipitation and log-precipitation variables that exist in the literature. Each approach is self-consistent, but confusion can arise especially when comparing and interpreting the different approaches. In one approach, the PDF normalized in precipitation (Fig. 1a), plotted here in log-$P$ space, has a long history (e.g., Barger and Thom 1949; Thom 1958; Groisman et al. 1999; Katz 1999; Watterson and Dix 2003; Wehner et al. 2014). From this point of view, Figs. 1b and 1c (or Figs. 1e,f) give the quantities that integrate to the first and second moment, referred to here and elsewhere as contributions (to the relevant integral) (Karl and Knight 1998; Neelin et al. 2009; Klingaman et al. 2017; Kuo et al. 2018; Wang et al. 2021); in other cases, these are referred to more explicitly as the frequency density times the variable $P$ (Watterson and Dix 2003). In a second approach, the PDF normalized in log-$P$ has been termed the frequency distribution, and the log-$P$ frequency density multiplied by $P$ has been termed the amount distribution (Pendergrass and Hartmann 2014; Kooperman et al. 2016a,b; Pendergrass et al. 2017; Akinsanola et al. 2020). The translation between calculations in $P$ and log-$P$ coordinates is simply a factor of $P$. Specifically, if we denote a PDF calculated in $P$ coordinates as $f_P(P)$ and its counterpart calculated in $G(P) = \log(P)$ coordinates as $\hat{f}_G[G(P)] = \hat{f}_G(P)$, then they are related as $f_P(P) = \hat{f}_G(P)\,dG/dP$ (von Storch and Zwiers 1999). In this case, this translates to $\hat{f}_{\log(P)}(P) = P f_P(P)$.

Fig. 1. Daily precipitation (left) probability density function (PDF) or frequency density, (center) contribution to precipitation amount $C_{\mathrm{amount}}$, and (right) contribution to variance $C_{\mathrm{var}}$ over the Niño-3.4 region (5°S–5°N, 190°–240°E) according to TRMM-3B42, TRMM-3B42 (IR), TRMM-3B42 (MW), CMORPH, PERSIANN, and GPCP. The same data are displayed in (a)–(c) log-log, (d)–(f) log-linear, and (g)–(i) linear-log axes to illustrate the features of each (see text). Grid data within the Niño-3.4 region are pooled prior to calculation. We estimate the cutoff scale $P_L$ and power-law exponent $\tau_P$ according to Eq. (5), with values ($\hat{P}_L$, $\hat{\tau}_P$): TRMM-3B42 (IR) (18.1 mm, 0.76), TRMM-3B42 (MW) (12.2 mm, 0.74), TRMM-3B42 (16.1 mm, 0.76), CMORPH (17.1 mm, 0.75), PERSIANN (15.5 mm, 0.65), and GPCP (9.1 mm, 0.46). The location of $P_L$ in the PDF in each of these datasets is shown by filled circles in (a), (d), and (g). The role of $P_L$ as the e-folding scale of the large-event range is also schematized for the slope of one dataset in (g). Note the relationship between the shape of $C_{\mathrm{amount}}$ and $C_{\mathrm{var}}$ and the shape of log-$P$ frequency and log-$P$ amount distributions, respectively, outlined in section 2c and Table 2. Blue boxes indicate the panel choices for display in subsequent figures.

Citation: Journal of Climate 35, 17; 10.1175/JCLI-D-21-0617.1

This factor of P implies that the PDF or frequency density normalized in log-P coordinates (Pendergrass and Hartmann 2014) has the same shape as the precipitation amount contribution in P coordinates, and that the amount distribution normalized in log(P) coordinates has the same shape as the precipitation variance contribution in P coordinates.
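The factor of $P$ between the two normalizations can be checked numerically. The sketch below is an illustration under stated assumptions (synthetic gamma data, natural logarithm, 30 log-spaced bins); it normalizes the same histogram counts per unit $P$ and per unit $\ln P$ and compares the two densities.

```python
import numpy as np

rng = np.random.default_rng(1)
wet = rng.gamma(shape=0.5, scale=10.0, size=100_000)
wet = wet[wet >= 0.1]                       # wet-day threshold from the text

edges = np.geomspace(wet.min(), wet.max(), 31)
centers = np.sqrt(edges[:-1] * edges[1:])   # geometric bin centers
counts, _ = np.histogram(wet, bins=edges)

# Same counts, two normalizations: per unit P and per unit ln(P).
f_p = counts / (counts.sum() * np.diff(edges))
f_logp = counts / (counts.sum() * np.diff(np.log(edges)))

# The two densities differ by the factor P (evaluated at the bin centers).
occupied = counts > 0
ratio = f_logp[occupied] / (centers[occupied] * f_p[occupied])
```

The ratio is close to 1 in every occupied bin, up to a small discretization error from the finite bin width. If base-10 logarithms were used instead, an extra constant factor of $\ln 10$ would appear.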

We briefly summarize advantages of each approach, which are useful for different purposes, illustrating these in the context of three common axis choices (Fig. 1). Regardless of normalization, a log-log plot of the PDF (Fig. 1a) facilitates comparison of the light precipitation range to a power law. In the contributions to the first and second moments (Figs. 1b,c) the slope of this range is increased by 1 and 2, respectively. A log-linear plot facilitates examination of the degree to which the large-event range is approximately exponential (Figs. 1g–i). A linear-y axis with log-x axis (Figs. 1d–f) makes it easier to see differences among models in the low-medium precipitation range. Using a log-P normalization with a linear-y axis has the advantage of providing a log-P frequency density plot (Fig. 1e), with area visually proportional to its integral (Pendergrass and Hartmann 2014). The apparent difference in interpretation of frequency of light rain between the P and log-P normalizations is resolved by noting that an integral over a range of precipitation, such as over the interval (0–0.1 mm day−1) in P coordinates, is spread over a semi-infinite interval in log-P coordinates, with corresponding adjustment in the frequency density. Distributions expressed and normalized in terms of linear P are the traditional basis for statistically fit distributions such as the gamma distribution (discussed below), but we emphasize that the linear- and log-P approaches are equivalent as long as one is consistent in their use. The choice of P or log-P normalization also has some bearing on how we interpret the numerical value of the probability distance metrics in section 2d(2)(ii).

While contributions provide a generic way of referring to these quantities, their significance in model evaluation merits a more specific nomenclature, summarized in Table 2. In what follows, when discussing in terms of linear-P normalization, we equivalently use PDF or frequency density (Fig. 1a), precipitation amount contribution for its first moment (Fig. 1e), and precipitation variance contribution for its second moment (Fig. 1f). When discussing in terms of log-P normalization, we use log-P frequency density (Fig. 1e) and log-P amount (Fig. 1f).

Table 2. Nomenclature and mathematical relationship among the normalizations of the precipitation distribution and its moments; compare to Fig. 1.

d. Metrics used to evaluate models

1) Bulk measures of the PDF

Consider, for reference, the form of a gamma distribution:
$$\mathrm{PDF}_{\mathrm{gamma}}(P) \propto P^{-\tau_P} \exp\left(-\frac{P}{P_L}\right) \quad (\tau_P < 1). \tag{4}$$
It has some desirable properties in terms of approximately reflecting the two physical regimes discussed in the introduction and elaborated below (see Figs. 1a,g for an example) while maintaining simplicity; we emphasize again that we use this form to inform metrics and are not relying on strict conformance to this distribution. Anticipating departures from this, assume that precipitation follows a distribution over wet days
$$\mathrm{PDF}_P(P) \propto P^{-\tau_P} M(P) \exp\left[-N\left(\frac{P}{P_L}\right)\right] \quad (\tau_P < 1),$$
where the unknown functions $M(P)$ and $N(x)$ express (with some redundancy) departures from the simplest case of a gamma distribution [for which $M(P) = 1$ and $N(x) = x$]. For a specific example, we can consider the generalized gamma distribution, for which $N(P/P_L) = (P/P_L)^{\nu}$, still governed by the same physical precipitation scale $P_L$, but where departures of $\nu$ from 1 can affect the behavior of the extreme tail of the PDF, decreasing (increasing) extremes for $\nu$ greater than (less than) 1.

The factor $P^{-\tau_P}$ provides a scale-free power-law range over low and moderate values, with probability decreasing by a constant factor [$\log(\mathrm{PDF}_{\mathrm{gamma}}) \sim -\tau_P \log(P)$] for each order of magnitude increase in $P$. This power-law range continues until the PDF approaches a characteristic cutoff scale $P_L$ [the factor $\exp(-P/P_L)$], where the probability drops much faster [$\log(\mathrm{PDF}_{\mathrm{gamma}}) \sim -P/P_L$], as schematized in Fig. 1a, thus effectively bounding the probability of extremes. The differences among the slopes of the different observational datasets in Fig. 1g are primarily associated with different $P_L$ values; if the precipitation axis is rescaled by $P_L$, the medium-to-large event portion of the curves collapses to a common dependence to good approximation (Martinez-Villalobos and Neelin 2021). The $\tau_P$ and $P_L$ parameters also have physical interpretations. Results from a stochastic model based on the moisture budget (Stechmann and Neelin 2014; Neelin et al. 2017; Martinez-Villalobos and Neelin 2019) suggest that the power-law range is steeper (larger $\tau_P$, implying probability decreasing faster in the low and moderate range) in generally dry regions where few precipitating events per day occur, and $P_L$ (thus, also extremes) is larger in regions of higher moisture convergence variance (Martinez-Villalobos and Neelin 2019). We also expect departures from power-law behavior if event durations (from precipitation onset to termination) are not well separated from the daily averaging interval.

An important consideration for metrics is that they be simple, easily interpretable, and robust to modest departures in PDF shape. The mean and variance over wet days, $\bar{P}_{\mathrm{wet}}$ and $\sigma_P^2$, are familiar quantities. These can be rearranged into metrics closely related to method of moments estimators (Waggoner 1989; Watterson and Dix 2003) for $\tau_P$ and $P_L$. For the gamma distribution,
$$\hat{P}_L = \frac{\sigma_P^2}{\bar{P}_{\mathrm{wet}}}, \qquad \hat{\tau}_P = 1 - \frac{\bar{P}_{\mathrm{wet}}}{\hat{P}_L}. \tag{5}$$
As the PDF shape departs from the gamma distribution, these remain useful metrics for the two ranges. For strong departures, they should no longer be considered estimators, but simply a precipitation scale and a nondimensional quantity created from the first two moments. For example, for the generalized gamma distribution, the moment estimator is proportional to the scale $P_L$ with a prefactor
$$\frac{\Gamma[(3-\tau_P)/\nu]}{\Gamma[(2-\tau_P)/\nu]} - \frac{\Gamma[(2-\tau_P)/\nu]}{\Gamma[(1-\tau_P)/\nu]}$$
that is larger (smaller) than 1 for $\nu$ below (above) 1.

Although different estimation methods such as maximum likelihood or linear regression (in log-log or log-linear coordinates for relevant ranges of the PDF) may provide different numerical values, these are generally spatially well correlated (Martinez-Villalobos and Neelin 2019). We consider a day wet when the daily precipitation is at least 0.1 mm. In some instances we plot $1 - \hat{\tau}_P$ (ranging from 0 to $\infty$) instead of $\hat{\tau}_P$ (ranging from $-\infty$ to 1). A small value of $1 - \hat{\tau}_P$ indicates a steep power-law range.
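As an illustration of the moment-based estimators, the following sketch computes $\hat{P}_L$ and $\hat{\tau}_P$ from the wet-day mean and variance. The function name `moment_estimators` and the synthetic gamma sample are assumptions made for the example; NumPy's gamma shape parameter corresponds to $1 - \tau_P$ and its scale to $P_L$.

```python
import numpy as np

def moment_estimators(precip, wet_threshold=0.1):
    """Return (P_L_hat, tau_P_hat) from the wet-day mean and variance."""
    wet = precip[precip >= wet_threshold]
    mean_wet = wet.mean()
    var_wet = wet.var()
    p_l_hat = var_wet / mean_wet            # variance over mean
    tau_p_hat = 1.0 - mean_wet / p_l_hat    # 1 - mean^2 / variance
    return p_l_hat, tau_p_hat

rng = np.random.default_rng(0)
sample = rng.gamma(shape=0.5, scale=10.0, size=200_000)  # tau_P = 0.5, P_L = 10

# Threshold set to zero here so the sample is an untruncated gamma distribution.
p_l_hat, tau_p_hat = moment_estimators(sample, wet_threshold=0.0)
```

With the threshold at zero the estimates recover the prescribed parameters to within sampling noise. Applying the 0.1 mm wet-day threshold truncates the lightest values and therefore shifts both estimates somewhat, which is one reason to apply the same threshold to models and observations.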

The power-law exponent and cutoff scale summarize the wet-day PDF. To provide a complete description of the daily precipitation PDF for all days, we also calculate the fraction of wet days,
$$f_{\mathrm{wet}} = \frac{\#\,\text{wet days}}{\#\,\text{days}},$$
the mean over all days $\bar{P}_{\mathrm{all}}$, and the mean over wet days $\bar{P}_{\mathrm{wet}}$. They are related by
$$\bar{P}_{\mathrm{all}} = f_{\mathrm{wet}}\,\bar{P}_{\mathrm{wet}}.$$
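A small worked example of these three quantities is sketched below. The function name `wet_day_summary` and the ten-day toy series are hypothetical. The sketch also makes one assumption explicit: amounts below the 0.1 mm threshold are counted as zero when averaging over all days, which is what makes the relation between the two means exact.

```python
import numpy as np

def wet_day_summary(precip, wet_threshold=0.1):
    """Return (f_wet, mean over wet days, mean over all days)."""
    is_wet = precip >= wet_threshold
    f_wet = is_wet.mean()
    p_wet = precip[is_wet].mean()
    # Sub-threshold amounts are treated as zero (an assumption of this sketch).
    p_all = np.where(is_wet, precip, 0.0).mean()
    return f_wet, p_wet, p_all

# Toy ten-day series in mm/day; four days reach the 0.1 mm threshold.
precip = np.array([0.0, 0.05, 2.0, 0.0, 8.0, 0.3, 0.0, 0.0, 12.0, 0.0])
f_wet, p_wet, p_all = wet_day_summary(precip)
```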

Alternatively, wet and dry days could be assessed jointly by considering a mixed-type PDF [conditional and unconditional moments are related as in Eqs. (18) and (19) in Papalexiou (2018)]. Here, we choose to analyze wet and dry times separately.

2) Evaluating the fit and shape

(i) Gamma distribution fit

The estimators for $P_L$ and $\tau_P$ can be expected to approach their actual values as long as the gamma distribution provides a good fit. We note that several other distributions produce gamma-like features over a range of their parameters (Cho et al. 2004; Kirchmeier-Young et al. 2016) and may account for some subtle features, such as deviation from a strict exponential decay of the extreme tail (Papalexiou and Koutsoyiannis 2013; Cavanaugh et al. 2015), that are not accounted for by the gamma distribution. In cases or regions where the gamma distribution fit is suboptimal, the interpretation of $\hat{P}_L$ and $\hat{\tau}_P$ [calculated as in Eq. (5)] as cutoff scale and power-law estimators is modified. They should be understood simply as a scale from the (wet day) variance over mean and the nondimensional square of the mean over the variance (as a departure from 1). A well-performing GCM should still reproduce their values.

A simple method, using scalar quantities, to identify regions where the gamma distribution is expected to provide good or bad fits is to compare predictions of theoretical gamma distributions to observations. For a gamma distribution of form $p_P = \{1/[\Gamma(1-\tau_P)\,P_L^{1-\tau_P}]\}\,P^{-\tau_P}\exp(-P/P_L)$, the $n$th uncentered moment is given by $\langle P^n \rangle = P_L^n\,\{\Gamma(n+1-\tau_P)/[\Gamma(1-\tau_P)]\}$, with $\Gamma$ being the gamma function. Using the property $\Gamma(z+1) = z\Gamma(z)$, the moment ratio $r_n$ is given by $r_n = \langle P^n \rangle / \langle P^{n-1} \rangle = P_L(n - \tau_P)$. Noting that $r_1$ and $r_2$ are used to define the estimators $\hat{P}_L$ and $\hat{\tau}_P$ [from (5), $\hat{P}_L = r_2 - r_1$, $\hat{\tau}_P = (r_2 - 2r_1)/(r_2 - r_1)$], we evaluate the gamma distribution fit by comparing the observed (or modeled) third-order moment ratio $r_3$ and its expected value from the gamma distribution, $r_3^{\mathrm{gamma}} = \hat{P}_L(3 - \hat{\tau}_P)$. This measure is given by
$$e_{\mathrm{gamma}} = \frac{r_3^{\mathrm{gamma}}}{r_3}.$$

A value of $e_{\mathrm{gamma}}$ close to 1 implies reasonably good fits while significant deviations from 1 point to progressively degraded ones.
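The moment-ratio check can be sketched in a few lines. The function name `gamma_fit_ratio` and the synthetic test sample are assumptions for illustration; the arithmetic follows the moment ratios $r_1$, $r_2$, $r_3$ defined above.

```python
import numpy as np

def gamma_fit_ratio(precip, wet_threshold=0.1):
    """e_gamma: gamma-predicted third moment ratio over the sample value."""
    wet = precip[precip >= wet_threshold]
    m1, m2, m3 = (np.mean(wet**n) for n in (1, 2, 3))
    r1, r2, r3 = m1, m2 / m1, m3 / m2        # r_n = <P^n> / <P^(n-1)>
    p_l_hat = r2 - r1
    tau_p_hat = (r2 - 2.0 * r1) / (r2 - r1)
    r3_gamma = p_l_hat * (3.0 - tau_p_hat)   # value expected for a gamma PDF
    return r3_gamma / r3

rng = np.random.default_rng(0)
gamma_sample = rng.gamma(shape=0.5, scale=10.0, size=200_000)
e_gamma = gamma_fit_ratio(gamma_sample, wet_threshold=0.0)
```

For a sample drawn from a gamma distribution the ratio stays close to 1, as expected. Third moments are sensitive to the largest values in a record, so for short series some scatter around 1 should be expected even when the underlying distribution is gamma-like.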

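(ii) Distance between observed and modeled PDFs and contributions

To measure how well a modeled PDF ($\mathrm{PDF}_{\mathrm{model}}$) approaches an observed one ($\mathrm{PDF}_{\mathrm{obs}}$), we define a PDF distance metric $e_{\mathrm{pdf}}$ as follows:
$$e_{\mathrm{pdf}} = \int_0^\infty \left|\mathrm{PDF}_{\mathrm{model}}(P) - \mathrm{PDF}_{\mathrm{obs}}(P)\right| dP. \tag{9}$$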

This distance is the simplest case of a family of more general probability distances (Zolotarev 1977; Korolev and Gorshenin 2020) and provides comparable results to other commonly used probability distance definitions (Martinez-Villalobos and Neelin 2021).

Similarly, we define $e_{C_{\mathrm{amount}}}$ and $e_{C_{\mathrm{var}}}$ as the probability distance between the modeled and observed precipitation amount and variance contributions, respectively. Note that $C_{\mathrm{amount}}$ and $C_{\mathrm{var}}$ are weighted progressively toward larger values, with PDFs giving more weight to the low-intensity range and $C_{\mathrm{var}}$ giving more weight to the extreme range. So, $e_{\mathrm{pdf}}$, $e_{C_{\mathrm{amount}}}$, and $e_{C_{\mathrm{var}}}$ provide complementary information on differences in modeled probabilities in the low, moderate, and extreme ranges.
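A discretized version of this distance can be sketched as follows, assuming both PDFs are estimated as histograms on a common set of log-spaced bins. The function name `pdf_distance`, the bin range, and the synthetic samples are illustrative assumptions. The same sum applied to the normalized contributions would give the distances for the amount and variance contributions.

```python
import numpy as np

def pdf_distance(sample_model, sample_obs, edges):
    """Integrated absolute difference between two histogram PDFs (0 to 2)."""
    widths = np.diff(edges)
    pdf_model, _ = np.histogram(sample_model, bins=edges, density=True)
    pdf_obs, _ = np.histogram(sample_obs, bins=edges, density=True)
    return np.sum(np.abs(pdf_model - pdf_obs) * widths)

rng = np.random.default_rng(0)
edges = np.geomspace(0.1, 200.0, 41)   # common wet-day bins, log spaced
obs = rng.gamma(shape=0.5, scale=10.0, size=100_000)
same_climate = rng.gamma(shape=0.5, scale=10.0, size=100_000)
smaller_cutoff = rng.gamma(shape=0.5, scale=5.0, size=100_000)

d_noise = pdf_distance(same_climate, obs, edges)     # sampling noise only
d_shift = pdf_distance(smaller_cutoff, obs, edges)   # cutoff scale halved
```

Comparing a distance against the value obtained from two samples of the same distribution gives a rough sense of how much of it is sampling noise.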

(iii) The shape of the PDF

A large probability distance between modeled and observed PDFs ($e_{\mathrm{pdf}}$) may occur because the parameters of the PDFs ($\tau_P$ and $P_L$) differ substantially (although the basic shape of the PDF may be well simulated) and/or because significant deviations in the modeled shape occur compared to the power-law range and cutoff-scale picture that holds in observational datasets. One example of these deviations is the presence of extra peaks in probability. So, to complement the information provided by $e_{\mathrm{gamma}}$ and $e_{\mathrm{pdf}}$, we also track the number of peaks in the PDF, $C_{\mathrm{amount}}$, and $C_{\mathrm{var}}$ in models compared to observational products. We note that there are other more subtle features that also imply a deviation from form, for example minima or maxima in derivatives of the PDF. For this paper, we limit ourselves to counting peaks as a proxy for deviations from the observed shape. The algorithm used to identify these peaks takes several precautions to avoid misidentifying them (Savitzky and Golay 1964). Details are given in Text S1 in the online supplemental material.
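The peak-detection procedure itself is specified in Text S1 and is not reproduced here. Purely as an illustration of the idea, the sketch below counts interior peaks of a curve sampled on log-spaced bins after light smoothing. It swaps in a plain moving average where the paper cites Savitzky–Golay filtering, and the function name `count_interior_peaks`, the window length, and the relative-height threshold are all assumptions.

```python
import numpy as np

def count_interior_peaks(y, window=5, min_rel_height=0.05):
    """Count interior local maxima of y after moving-average smoothing.

    window must be odd. Peaks lower than min_rel_height times the
    smoothed maximum are ignored to limit noise-driven detections.
    """
    half = window // 2
    padded = np.pad(y, half, mode="edge")   # avoids artificial dips at the ends
    smooth = np.convolve(padded, np.ones(window) / window, mode="valid")
    mid = smooth[1:-1]
    is_peak = (mid > smooth[:-2]) & (mid > smooth[2:])
    is_peak &= mid >= min_rel_height * smooth.max()
    return int(is_peak.sum())

p = np.geomspace(0.1, 200.0, 60)
pdf_like = p**-0.5 * np.exp(-p / 10.0)      # gamma-like PDF, no interior peak
amount_like = p**0.5 * np.exp(-p / 10.0)    # single-peaked contribution
two_scales = (np.exp(-0.5 * ((np.log(p) - np.log(1.0)) / 0.4) ** 2)
              + np.exp(-0.5 * ((np.log(p) - np.log(30.0)) / 0.4) ** 2))

n_pdf = count_interior_peaks(pdf_like)
n_amount = count_interior_peaks(amount_like)
n_two = count_interior_peaks(two_scales)
```

On these idealized curves the counts are 0, 1, and 2, matching the gamma-like picture for the first two and a distribution with two distinct scales for the third.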

3) Model summary score for each metric

To calculate an overall score on a particular metric we need to reduce noise, which is especially important if the metric involves the calculation of the PDF and contributions. Thus, prior to evaluation we divide the area within 50°S–50°N into 240 different regions of 10° latitude and 15° longitude. Then, we pool the time series in each region and calculate a single value of $\hat{P}_L$, $\hat{\tau}_P$, $\bar{P}_{\mathrm{all}}$, $\bar{P}_{\mathrm{wet}}$, $f_{\mathrm{wet}}$, and $\sigma_P$, as well as the PDF and contributions, representative of the region. To evaluate the overall performance in $P_L$, $\tau_P$, $\bar{P}_{\mathrm{all}}$, $\bar{P}_{\mathrm{wet}}$, $\sigma_P$, and $f_{\mathrm{wet}}$, we use a root-mean-square (RMS) error given by
$$\mathrm{RMS\ error}_x = \sqrt{\frac{\sum_i A_i \left[x_i^{\mathrm{obs}} - x_i^{\mathrm{model}}\right]^2}{\sum_i A_i}},$$
where $i$ denotes a particular 10° × 15° region, $A_i$ denotes its area (which scales with the cosine of the latitude), and $x_i^{\mathrm{obs}}$ and $x_i^{\mathrm{model}}$ denote the value of the metric in a particular observational dataset and model, respectively. This RMS error gives a measure of the typical deviation (of any sign) of a particular model compared to observations.
Similarly, an overall error in the distance between observed and modeled PDFs [section 2d(2)(ii)] is given by
$$e_{\mathrm{pdf}} = \frac{\sum_i A_i\, e_{\mathrm{pdf}}^{\,i}}{\sum_i A_i},$$
where $e_{\mathrm{pdf}}^{\,i}$ is the probability distance between PDFs [Eq. (9)] in region $i$. Total errors in the simulation of contributions are calculated similarly.
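The two area-weighted summaries can be sketched as below. The region layout follows the 10° × 15° tiling of 50°S–50°N described above, with weights taken as the cosine of the central latitude of each region; the function names and the placeholder metric values are assumptions made for the example.

```python
import numpy as np

# Region centers for a 10 deg x 15 deg tiling of 50S-50N.
lat_centers = np.arange(-45.0, 50.0, 10.0)   # 10 latitude bands
lon_centers = np.arange(7.5, 360.0, 15.0)    # 24 longitude sectors
lat2d, _ = np.meshgrid(lat_centers, lon_centers, indexing="ij")
area = np.cos(np.deg2rad(lat2d))             # relative area weights A_i

def area_weighted_rms(x_obs, x_model, area):
    """RMS difference between regional metric values, weighted by area."""
    return np.sqrt(np.sum(area * (x_obs - x_model) ** 2) / np.sum(area))

def area_weighted_mean(values, area):
    """Area-weighted mean, e.g. of the regional PDF distances."""
    return np.sum(area * values) / np.sum(area)

# Placeholder regional values: a model offset by 2 mm from "observations".
rng = np.random.default_rng(0)
p_l_obs = rng.uniform(5.0, 20.0, size=area.shape)
p_l_model = p_l_obs + 2.0
rms_p_l = area_weighted_rms(p_l_obs, p_l_model, area)
mean_e_pdf = area_weighted_mean(np.full(area.shape, 0.3), area)
```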

Finally, we condense the overall differences in probability peaks between models and observations by calculating the percentage of the 240 regions previously defined where models and observational products disagree in the number of PDF, $C_{\mathrm{amount}}$, and $C_{\mathrm{var}}$ peaks.

For all metrics, the overall score shown and discussed in the rest of the paper is a weighted average of the model differences compared separately to TRMM-3B42 and GPCP. We chose these datasets because they tend to bracket the observational estimates of the other datasets in most metrics (see next section). To contextualize the difference between models and observations, we compare each metric against the difference between GPCP and TRMM-3B42 estimates to provide a measure of the observational uncertainty. Given that TRMM-3B42 and GPCP share some input data (Huffman et al. 2007), this observational uncertainty is admittedly a conservative estimate.

3. Comparison among observational products

a. PDFs and contributions and uncertainty quantification

Different daily precipitation observational datasets are known to have substantial differences (Donat et al. 2014; Pendergrass and Deser 2017; Klingaman et al. 2017; Sun et al. 2018; Rajulapati et al. 2020; Alexander et al. 2020; Martinez-Villalobos and Neelin 2021). Before evaluating models it is important to be aware of these differences, and use them to provide a measure of observational uncertainty.

Figure 1 shows the daily precipitation PDF over the Niño-3.4 area using the six different observational datasets considered. In all cases the PDFs follow a similar shape: a power-law range and an approximately exponential drop in probability. The power-law range can be seen as a straight line in the log-log plot (Fig. 1a), occurring from the lowest value to approximately the location of the cutoff scale PL (shown in circles), and the drop in probability associated with the cutoff occurs for P ≳ PL. However, the estimators of parameters PL and τP differ in all cases, with P^L ranging from 9.2 mm in GPCP to 18.4 mm in TRMM-3B42. Similarly, the contributions have a similar shape (Figs. 1e,f), but the differences in PL and τP imply different locations of their peaks and widths.

To the extent that the PDFs in Fig. 1a are well described by gamma distributions of shape (4), then the contributions in Figs. 1b and 1c would approximately follow the mathematical form
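$$C_{\mathrm{amount}} \propto P \times \mathrm{PDF} \propto P^{-\tau_P+1}\exp\left(-\frac{P}{P_L}\right), \qquad C_{\mathrm{var}} \propto P^{2} \times \mathrm{PDF} \propto P^{-\tau_P+2}\exp\left(-\frac{P}{P_L}\right). \tag{12}$$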

This implies that the PDFs in Fig. 1a, the contribution to total precipitation in Fig. 1b, and the contribution to variance in Fig. 1c follow a similar shape in the large event range, with the main differences being in the power-law exponent (−τP for the PDF, 1 − τP for Camount, and 2 − τP for Cvar). The differences in power-law exponent imply a different shape for the low and moderate range, which results in Cvar preferentially weighted toward larger values, Camount weighted toward more moderate values, and the PDF having more of its weight in the light precipitation range. This implies that the extreme range contributes more to the second daily precipitation moment and the moderate range contributes preferentially to the total (or mean) precipitation.

Figure 2 shows the zonal average of P^L and τ^P in the six different observational datasets considered. These have a more symmetric pattern between hemispheres in P^L than occurs for the mean (see below), as features like the ITCZ, seen clearly in the mean pattern, are attenuated or absent in the spatial pattern of P^L. Generally, larger values of P^L occur in the tropics through the equatorward flank of midlatitude storm tracks; poleward of the storm tracks P^L quickly decreases. However, considerable differences may be noted in the details of the P^L pattern among observational datasets. The power-law exponent estimator τ^P has a more consistent pattern among observational estimates, with smaller values in regions where we expect frequent precipitation (as expected from theory; see Martinez-Villalobos and Neelin 2019), like the ITCZ and storm tracks, and larger values (a steeper power-law range) in regions with little precipitation, as in the subtropics.

Fig. 2.

Observational estimates of the zonal average of (a) P^L and (b) τ^P according to TRMM-3B42 (IR), TRMM-3B42 (MW), TRMM-3B42, CMORPH, PERSIANN, and GPCP.

Citation: Journal of Climate 35, 17; 10.1175/JCLI-D-21-0617.1

Despite qualitative agreement, these different satellite products differ quantitatively, indicating a substantial degree of uncertainty. To a large extent, GPCP and TRMM-3B42 bracket the range of these products; we therefore use the difference between these two datasets in each metric as a measure of observational uncertainty, as in Martinez-Villalobos and Neelin (2021), although this is a conservative estimate. Comparisons between models and each of these satellite products differ in some cases, so we report a weighted model error e as follows:
$$e = \frac{1}{2}\left(e_{\mathrm{GPCP}} + e_{\mathrm{TRMM}}\right), \tag{13}$$
where eGPCP is the model error compared to GPCP and eTRMM is the error compared to TRMM-3B42.

b. Relationships among metrics

The metrics defined here, in particular PL, P¯wet, and number of Camount and Cvar peaks, have several connections to the rain frequency density and amount peaks defined in Pendergrass and Deser (2017), based on the rain frequency and amount distributions defined in Pendergrass and Hartmann (2014, hereafter PH14), and used in several other studies [e.g., Kooperman et al. (2016a); Pendergrass et al. (2017); Terai et al. (2018); Akinsanola et al. (2020)]. Recall from section 2c that these log-P frequency density and amount distributions have the same shape as Camount and Cvar, respectively, in (linear) P normalization. If PDFs [Eq. (4)] and contributions [Eq. (12)] are well described by gamma distributions, then the location of their peaks can be calculated analytically. Noting that the observed range in τP tends to be within the interval [0, 1) (see below), we find that PDFs in GPCP and TRMM-3B42 satellite products have no interior peak (i.e., the largest daily precipitation probability occurs at the smallest resolvable amount, for τP > 0), whereas Camount and Cvar have a single peak given by
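$$P_{C_{\mathrm{amount}}}^{\mathrm{peak}} = (1-\hat{\tau}_P)\,\hat{P}_L = \bar{P}_{\mathrm{wet}} \quad (\tau_P < 1), \qquad P_{C_{\mathrm{var}}}^{\mathrm{peak}} = (2-\hat{\tau}_P)\,\hat{P}_L = \bar{P}_{\mathrm{wet}} + \hat{P}_L \quad (\tau_P < 2). \tag{14}$$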

That is, the peak of Camount is given by the mean on wet days P¯wet, and the difference between Cvar and Camount peaks is given by P^L. Similarly, the standard deviation of Cvar (a quantity proportional to its width) is given by P^L√(3 − τ^P). We note that P^L and τ^P predict the peak of Cvar (or PH14 amount distribution) more robustly than the peak of Camount (or PH14 frequency density distribution), as any error in determining τP has a larger impact in the latter case, especially if τ^P is close to one. While observed PDFs and contributions may deviate from forms (4) and (12), we report good agreement between the actual location of Camount peaks and P¯wet (spatial correlation coefficients equal to r = 0.76 in TRMM-3B42, r = 0.88 in GPCP, and r = 0.9 in the CMIP6 multimodel mean; not shown) and a better agreement between Cvar peaks and P¯wet + PL (r = 0.9 in TRMM-3B42, r = 0.91 in GPCP, and r = 0.94 in the CMIP6 multimodel mean; see Fig. S1 in the online supplemental material).
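The analytic peak locations can be checked numerically. The sketch below is illustrative only: it evaluates the gamma-type forms of the contributions on a fine grid and locates their maxima by brute force. The parameter values (τP = 0.6, PL = 15 mm) and all function names are hypothetical choices made for the example, not estimates from any dataset.

```python
import math

def contribution(p, tau, p_l, power):
    """P**power times the unnormalised gamma-type PDF P**(-tau) * exp(-P / P_L)."""
    return p ** (power - tau) * math.exp(-p / p_l)

def peak_on_grid(tau, p_l, power, p_max=200.0, n=20_000):
    """Brute-force location of the maximum on a uniform grid with step p_max / n."""
    grid = [p_max * (i + 1) / n for i in range(n)]
    return max(grid, key=lambda p: contribution(p, tau, p_l, power))

tau, p_l = 0.6, 15.0  # illustrative values only

peak_amount = peak_on_grid(tau, p_l, power=1)  # analytic prediction: (1 - tau) * p_l
peak_var = peak_on_grid(tau, p_l, power=2)     # analytic prediction: (2 - tau) * p_l
```

With these values the grid maxima should fall close to 6 mm and 21 mm, so the two peaks are separated by approximately PL, as the relations above predict.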

4. Model evaluation

In this section we evaluate models according to the metrics defined in section 2. We exclude regions poleward of 50°, as TRMM-3B42 is only given within 50°N and 50°S latitude bands. We start with the evaluation of the suitability of the gamma distribution in observations and models. Then, we evaluate the model representation of cutoff scales and power-law ranges and, subsequently, the probability distance between observed and modeled PDFs and contributions to precipitation amount and variance. These probability distances depend on how well models simulate the power-law exponent and cutoff scale parameters but also on how well models simulate the basic “shape” of the PDF. Accordingly, to end this section we evaluate model deviations from the observed shape in GPCP and TRMM-3B42 satellite products using the number of peaks in PDFs and contributions as a proxy.

a. Evaluation of the gamma distribution approximation

A global map evaluating the suitability of the gamma distribution to approximate PDFs in satellite products and in the multimodel mean is given in Fig. 3 (first and second row for TRMM-3B42 and GPCP, and third row for the multimodel mean). The first column shows the ratio between the third and second moment r3 [defined in section 2d(2)(i)], the second column shows the expected ratio if the gamma distribution held perfectly r3gamma, and the third column shows egamma [Eq. (8)], the ratio between the two. Visual comparison between r3 and r3gamma shows similar features between these quantities in the satellite products and the multimodel mean. This implies that the gamma distribution provides a reasonable first-order picture of the PDFs. More subtle differences between r3 and r3gamma are revealed by egamma. In the case of TRMM-3B42, egamma deviates from 1 (implying degraded fits) mainly in regions with low precipitation. The reason for this is likely twofold. First, PH14 and Pendergrass et al. (2017) report inconsistent behavior between TRMM-3B42 and GPCP at low precipitation rates over ocean, which are likely related to differences in the assumptions of their algorithms since similar data goes into each of these products. Second, from a theoretical point of view, regions with few precipitating events are characterized by steep power-law ranges, with τP values that may exceed 1 (Martinez-Villalobos and Neelin 2019), that is, beyond the range of the gamma distribution. These cases can exhibit the PDF form given in (4) above the minimum observable rain rate, but the power law is too steep to normalize the PDF over a range that includes 0, and thus the expression (5) for τ^P is not a good estimate of the power-law exponent. In the case of GPCP, deviations are seen mainly in the tropical Indo-Pacific collocated with the intertropical convergence zone (ITCZ). Visual inspection in that region reveals PDFs decaying slightly faster than exponential (not shown). 
Deviations of egamma in the CMIP6 multimodel mean are mainly collocated with the deviations occurring in TRMM-3B42, although they tend to be more accentuated. In addition, the fit is poorer over the poles than in the GPCP case. In most regions, however, the gamma distribution parameters conveniently summarize leading-order information on the full wet-day PDF.
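The idea behind this consistency check can be illustrated on synthetic data. The sketch below rests on several assumptions that are not taken from the definitions in section 2: that the estimators are moment based (P^L as variance over mean and 1 − τ^P as squared mean over variance on wet days), that r3 is the ratio of the raw third to the raw second moment, and that egamma is r3 divided by its gamma-predicted value. All names are hypothetical.

```python
import random

def raw_moments(sample):
    """First three raw moments of a sample of wet-day amounts."""
    n = len(sample)
    m1 = sum(sample) / n
    m2 = sum(p ** 2 for p in sample) / n
    m3 = sum(p ** 3 for p in sample) / n
    return m1, m2, m3

def gamma_consistency(sample):
    """Return (P_L estimate, tau estimate, assumed e_gamma) for a wet-day sample."""
    m1, m2, m3 = raw_moments(sample)
    variance = m2 - m1 ** 2
    p_l_hat = variance / m1              # assumed moment estimator of the cutoff scale
    tau_hat = 1.0 - m1 ** 2 / variance   # assumed moment estimator of the exponent
    r3 = m3 / m2                         # assumed definition of the moment ratio
    r3_gamma = (3.0 - tau_hat) * p_l_hat # value of m3 / m2 for an exact gamma PDF
    return p_l_hat, tau_hat, r3 / r3_gamma

# Synthetic wet days drawn from a gamma distribution with shape 1 - tau = 0.4
# and scale P_L = 15 mm (illustrative values).
rng = random.Random(0)
wet_days = [rng.gammavariate(0.4, 15.0) for _ in range(200_000)]
p_l_hat, tau_hat, e_gamma = gamma_consistency(wet_days)
```

For a sample that is truly gamma distributed, the ratio stays close to 1 up to sampling noise; values away from 1 would flag a distribution whose tail is not captured by the two fitted parameters.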

Fig. 3.

Spatial pattern of the (a) observed third-order moment ratio r3, (b) third-order moment ratio predicted by a gamma distribution r3gamma, and (c) the ratio between the two (egamma) according to TRMM-3B42. (d)–(f) As in (a)–(c), but using GPCP. (g)–(i) As in (a)–(c), but for the CMIP6 multimodel mean. To calculate the multimodel mean, we calculate r3, r3gamma, and egamma in each model individually and then average. A value of egamma close to 1 indicates regions where better fits are expected. See details in section 2d(2)(i).

b. Evaluating model simulation of PDF power-law exponent, cutoff scale, and fraction of wet days

Global maps of P^L, 1 − τ^P, P¯all, P¯wet, and the fraction of wet days fwet are shown in Fig. 4 for TRMM-3B42 (50°S–50°N), GPCP, and the multimodel mean. We note a substantial degree of observational uncertainty in P^L and 1 − τ^P, and to some degree also in P¯wet and fwet. To a large extent, the satellite products have a similar mean precipitation P¯all pattern, but their PDFs are different (even though the paradigm of power law and cutoff scale is well followed in both products), with larger extremes (larger PL) and a sharper power-law range (smaller 1 − τP) in TRMM-3B42. The CMIP6 multimodel mean patterns of both P^L and τ^P tend to be within GPCP and TRMM-3B42 estimates, although closer to GPCP in magnitude. This implies that (given that a day is wet) models tend to simulate weaker extremes than TRMM-3B42 but stronger extremes than GPCP. Despite differences in magnitude, the multimodel mean spatial patterns of P^L and τ^P are reasonably well correlated with the corresponding GPCP and TRMM-3B42 patterns (correlation coefficients of 0.73 and 0.76 in the case of P^L, and 0.67 and 0.77 in the case of τ^P, for GPCP and TRMM-3B42 respectively; correlations are taken over 50°S–50°N). These correlation coefficients are comparable with the corresponding correlation coefficients between TRMM-3B42 and GPCP (0.75 for P^L, 0.77 for τ^P). This good agreement between the CMIP6 multimodel mean and observed patterns has previously been noted by Martinez-Villalobos and Neelin (2021) in the case of P^L and suggests that, after cancellation of the models' random errors, the CMIP6 ensemble simulates a good spatial representation of the processes yielding extremes, albeit of different magnitude.

Fig. 4.

Spatial pattern of the cutoff scale estimator P^L [Eq. (5)] according to (a) TRMM-3B42, (b) GPCP, and (c) CMIP6 multimodel mean. (d)–(f) As in (a)–(c), but for the power-law exponent estimator τ^P [Eq. (5)] (1 − τ^P is plotted). (g)–(i) As in (a)–(c), but for the mean daily precipitation P¯all. (j)–(l) As in (a)–(c), but for the mean daily precipitation over wet days P¯wet. (m)–(o) As in (a)–(c), but for the fraction of wet days [Eq. (6)], expressed as a percent. See details in section 2d(1). Boxes in the upper two rows show the Niño-3.4 region, southern Europe, and the western United States, used in Fig. 7.

As is the case in previous CMIP phases (Flato et al. 2013), the mean precipitation P¯all pattern in the CMIP6 ensemble (Fig. 4, third row) captures the observations to a good degree (see also Fig. 5d), both spatially and in magnitude, although traces of the double ITCZ problem in the eastern Pacific (Mechoso et al. 1995; de Szoeke and Xie 2008; Bellucci et al. 2010) can still be seen. The agreement occurs, however, because errors in the mean over wet days P¯wet (Fig. 4, fourth row) and in the fraction of wet days fwet (Fig. 4, fifth row) cancel each other. As in previous CMIP phases, the long-standing bias of too frequent (large fwet) and too weak (small P¯wet) precipitation (Dai et al. 1999; Sun et al. 2006; Stephens et al. 2010; Rosa and Collins 2013; Catto et al. 2019) persists in CMIP6. This bias is largest over ocean and smaller over land, where modeled values match satellite products to a large degree (see the fifth row of Fig. 4). There are caveats, however, when evaluating models against satellite products, especially over ocean. The data going into these satellite products are known not to capture light precipitation, especially in the subtropics, and over ocean there are no gauges to correct the satellite data (Berg et al. 2010; Kay et al. 2018). Furthermore, the frequency and intensity of wet-day precipitation are sensitive to the wet-day threshold (0.1 mm in this case).
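The cancellation between wet-day frequency and wet-day intensity can be made concrete with a toy calculation. In the sketch below, the two ten-day series are invented for illustration, the function name is hypothetical, and treating a day as wet when its total is at or above the 0.1 mm threshold (rather than strictly above) is an assumption.

```python
WET_DAY_THRESHOLD_MM = 0.1  # threshold quoted in the text

def wet_day_stats(daily_precip_mm, threshold=WET_DAY_THRESHOLD_MM):
    """Return (fraction of wet days, mean over wet days, mean over all days)."""
    wet = [p for p in daily_precip_mm if p >= threshold]
    n_days = len(daily_precip_mm)
    f_wet = len(wet) / n_days
    mean_wet = sum(wet) / len(wet) if wet else 0.0
    mean_all = sum(daily_precip_mm) / n_days
    return f_wet, mean_wet, mean_all

# Two invented series with the same mean over all days (2 mm per day).
infrequent_heavy = [0, 0, 0, 12, 0, 0, 8, 0, 0, 0]
frequent_light = [2, 1, 3, 0, 4, 2, 1, 3, 2, 2]

stats_heavy = wet_day_stats(infrequent_heavy)
stats_light = wet_day_stats(frequent_light)
```

Both series share the same all-day mean, yet the second rains on 90% of days with a wet-day mean near 2.2 mm, while the first rains on 20% of days with a wet-day mean of 10 mm. A good match in the all-day mean therefore says little about either factor on its own.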

Fig. 5.

Zonal average daily precipitation (a) PDF cutoff scale estimator P^L, (b) power-law exponent estimator τ^P, (c) standard deviation on wet days σP, (d) mean over all days P¯all, (e) mean over wet days P¯wet, and (f) fraction of wet days, according to GPCP (blue), TRMM-3B42 (red), CMIP6 multimodel mean (thick solid black), and individual models (thin solid black).

While the CMIP6 ensemble provides credible spatial patterns of P^L and τ^P, the model spread is substantial, as can be seen in Figs. 5a and 5b. This spread arises from the relatively large model spread in variance (Fig. 5c) and mean over precipitating days (Fig. 5e) combining to produce a large spread in τ^P and P^L, especially in the tropics. The large model spread in P^L, and consequently extreme percentiles, in the tropics suggests that its origin might reside in the different convective parameterizations used (O’Gorman 2015). It is interesting to note that, while the ITCZ signature is clearly present in the mean (Figs. 5d,e) and variance (Fig. 5c), it is largely absent from the extremes (as measured by P^L; Fig. 5a), in both models and observations. We also note that, although the P^L and τ^P CMIP6 ensemble mean tends to be within observational estimates, there are several individual models producing estimates outside the bounds of the satellite products.

Both the mean precipitation over wet days (Fig. 5e) and fraction of wet days (Fig. 5f) in models tend to follow the latitudinal pattern of observations, but the bias previously mentioned (models raining too frequently and too little) is evident. An exception is that the strength of precipitation over the ITCZ on wet days (P¯wet) is well simulated. However, the double ITCZ problem is clearer in the zonally averaged picture. In the fwet case, we note that no model simulates a smaller fraction of wet days, over any latitude band, compared to observations.

It is clear from Fig. 5 that some models are substantially closer to observations than others. Figure 6 provides an evaluation of their individual performance for P^L and τ^P using the methodology outlined in section 2d (a similar plot for P¯all, σP, P¯wet, and fwet is shown in Fig. S2). We note that for P^L and τ^P most models are closer to the weighted combination of the observational products used (GPCP and TRMM-3B42) than the observational products are to each other. RMS errors for GPCP compared to TRMM-3B42 are on the order of 5 mm for P^L and 0.25 for τ^P. In the case of P^L only five models are outside observational bounds (Fig. 6a), while in the case of τ^P 13 out of 35 are (Fig. 6b).

Fig. 6.

Overall model RMS error (blue bars), as calculated in (10), in how they simulate (a) the cutoff scale estimator P^L and (b) the power-law exponent estimator τ^P. We compare this error to the corresponding observational error between GPCP and TRMM-3B42 (black bar). Models with smaller errors than the difference between observational datasets simulate the numerical value of these parameters (5) better than this measure of observational uncertainty.

c. Evaluating the distance between modeled and observed PDFs

To illustrate how well models simulate daily precipitation probabilities, Fig. 7 shows PDFs and amount and variance contributions for the best and lowest performing models based on the epdf, eCamount, and eCvar metrics in three different regions (shown in the top two rows of Fig. 4): the Niño-3.4 region (5°S–5°N, 190°–240°E; Fig. 7, top row), southern Europe (40°–50°N, 0°–20°E; Fig. 7, middle row), and the western United States (30°–48°N, 236°–257°E; Fig. 7, bottom row). These are chosen to show examples of PDFs and contributions in a variety of climates: a maritime tropical region (Niño-3.4 region), a relatively wet midlatitude region (southern Europe), and a relatively dry midlatitude region (western United States). Although located in very different climates, there is a large degree of commonality in the shape of PDFs and contributions in the GPCP and TRMM-3B42 products and to a good extent also in models over these regions. In all cases, the paradigm of a power-law range and a cutoff scale for the PDFs (Fig. 7, first column) is well followed, although with some slight differences that deserve attention in the western United States. In this particular case TRMM-3B42 displays a sharper power-law range with τP exceeding one, which is not unexpected in dry regions with few precipitating events per day (Martinez-Villalobos and Neelin 2019). This leads to a TRMM-3B42 contribution to precipitation amount that peaks at the lowest resolvable intensity (Fig. 7h), in contrast to Camount in other regions (with one exception in Fig. 7b) and in other datasets over the western United States, which display a single peak.

Fig. 7.

Best (red) and lowest (blue) performing models under (left) probability error metrics for the PDF (epdf), (center) contribution to precipitation amount Camount (eCamount), and (right) contribution to variance (eCvar) in three different regions: (a)–(c) the Niño-3.4 region (5°S–5°N, 190°–240°E), (d)–(f) southern Europe (40°–50°N, 0°–20°E), and (g)–(i) the western United States (30°–48°N, 236°–257°E). Comparison is against GPCP (gray) and TRMM-3B42 (black).

Contributions to variance Cvar (Fig. 7, third column) are single peaked in all cases, and are more robust in terms of shape, consistent with previous studies (Pendergrass and Deser 2017). While the shape of PDFs and contributions to amount and variance tend to be well simulated by these models in these regions, the main difference between the best and lowest performing model is in how well they simulate the power-law range and cutoff scale. Errors in these lead to deviations in probability weight (e.g., MPI-ESM-1-2-HAM puts too much probability weight in the light precipitation range in the Niño-3.4 region; Fig. 7b) and in where the contribution peaks are located.

In the examples in Fig. 7, we note that the largest difference in epdf occurs in the Niño-3.4 region, with the highest and lowest scoring models performing similarly close to satellite products in the midlatitude regions. This result tends to hold in general, with tropical regions having a larger model spread compared to midlatitudes (Fig. 8). While the model spread is large in tropical regions, on average the dry subtropics is where models have the largest differences from satellite products (Figs. 8d,e), with the exception of Cvar where the entire subtropical/tropical regions are worse simulated than the midlatitudes (Fig. 8f). To put these results in context we should note, however, that uncertainties between satellite products are large and tend to mirror model errors, with larger uncertainties over the ocean and tropical and subtropical regions (Figs. 8a–c) as highlighted in Pendergrass and Deser (2017). Overall, compared to the range between satellite products, model probability errors tend to be larger in the light and moderate range (as measured by epdf and eCamount in Figs. 8d,e), with extreme probabilities (as measured by Cvar) more similar to GPCP and TRMM-3B42, in agreement with Martinez-Villalobos and Neelin (2021). However, GPCP has known issues for heavy precipitation, which should also temper our interpretation at this end of the distribution (Bador et al. 2020).

Fig. 8.

Spatial map of the CMIP6 multimodel mean of (a) PDF probability error epdf, (b) Camount probability error eCamount, and (c) Cvar probability error eCvar. Hatching denotes regions where the multimodel mean probability error is smaller than the probability distance between TRMM-3B42 and GPCP. (d)–(f) Multimodel mean (red thick solid) of the zonal average of epdf, eCamount, and eCvar respectively. Thin red lines show the 5th–95th percentiles of these quantities across the 35 CMIP6 models. Dashed line shows the corresponding zonal averages of epdf, eCamount, and eCvar probability distances between GPCP and TRMM-3B42.

A ranking of the models in terms of their simulation of daily precipitation PDFs [based on the integrated epdf error; section 2d(2)(ii)] is given in Fig. 9a (Fig. S3 shows the corresponding ranking for Camount and Cvar, as well as a comparison to another metric, the Kullback–Leibler divergence; Kullback and Leibler 1951). Model performance in simulating power-law exponents and cutoff scales is a good predictor of how well models simulate PDFs: models with low RMS error in both P^L and τ^P (Fig. 6) tend to be the same models with low epdf errors (Fig. 9a). This does not capture the full story, however. Model simulation of P^L and τ^P tends to be within observational estimates (Fig. 6), while integrated errors in the simulation of the PDFs (and Camount and to some extent Cvar) are not (Fig. 9a). This implies that deviations of modeled PDFs from the power-law range and cutoff shape, which occur to some extent in models but are rare in observations, also play a role.

Fig. 9.

(a) Ranking of model performance in simulating daily precipitation probability density functions (PDFs) according to the epdf metric [Eqs. (9) and (11)]. This metric calculates the probability distance between models and observations. (b) Scatter between the observed epdf (y axis; integrated over 50°S–50°N) and an estimation of epdf that only takes into account model errors in the cutoff scale PL and power-law exponent τP (x axis). (c) Scatter between model epdf (y axis) and the percentage of 10° × 15° latitude–longitude regions in each model (within 50°S–50°N) that display a larger number of daily precipitation PDF peaks compared to observations. (d) As in (b), but for Cvar. The thick solid black lines in (b)–(d) display regression lines with corresponding correlation coefficients displayed in the legend. Individual models' values are shown in (b)–(d) by a blue dot and are numbered as in the x axis of (a). The black dot shows the corresponding value for the distance between GPCP and TRMM-3B42.

To quantify the extent to which model performance in simulating PDFs (Fig. 9a) can be explained by model performance in simulating cutoff scales (Fig. 6a) and power-law exponents (Fig. 6b), we calculate an epdf measure that can be attributed solely to errors in the simulation of P^L and τ^P. To do this, we generate long synthetic “daily precipitation” time series that are perfectly gamma distributed with PL and τP parameters given by their observed or modeled values. From these time series we calculate an overall epdf value [using Eqs. (4), (9), and (11)], which is not affected by deviations from the assumed gamma distribution shape. This epdf, due solely to errors in PL and τP (Fig. 9b, x axis), can be compared to the measured epdf (Fig. 9b, y axis), which also includes deviations from the assumed shape. We note that the two quantities are well correlated (r = 0.7 across models; Fig. 9b), which implies that P^L and τ^P are indeed good measures to quantify errors in the PDF; however, they do not tell the whole story. (We note that errors in P^L and τ^P are better predictors of errors in Camount and Cvar, with r = 0.88 in both cases; see Fig. 9d for Cvar.) This prompts us to investigate modeled PDF deviations from the power-law and cutoff-scale picture that tends to hold in observational datasets.
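A rough sketch of this synthetic-series procedure follows. It is not the calculation behind Fig. 9b: the probability distance of Eq. (9) is not reproduced here, so a stand-in is used (half the summed absolute difference between probabilities in logarithmically spaced bins), and the bin range, bin count, sample size, and function names are all assumptions. The cutoff scales 18.4 mm and 9.2 mm echo the Niño-3.4 estimates quoted earlier for TRMM-3B42 and GPCP, while τP = 0.6 is an arbitrary illustrative value.

```python
import math
import random

def synthetic_wet_days(tau, p_l, n, rng):
    """Draw n gamma-distributed amounts with shape 1 - tau and scale p_l (mm)."""
    return [rng.gammavariate(1.0 - tau, p_l) for _ in range(n)]

def log_bin_probabilities(sample, p_min=0.1, p_max=1000.0, n_bins=40):
    """Fraction of in-range values falling in each logarithmically spaced bin."""
    counts = [0] * n_bins
    span = math.log(p_max / p_min)
    for p in sample:
        if p_min <= p < p_max:
            idx = min(int(math.log(p / p_min) / span * n_bins), n_bins - 1)
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]

def probability_distance(sample_a, sample_b):
    """Stand-in distance (not Eq. 9): half the summed absolute bin differences."""
    pa = log_bin_probabilities(sample_a)
    pb = log_bin_probabilities(sample_b)
    return 0.5 * sum(abs(a - b) for a, b in zip(pa, pb))

rng = random.Random(1)
n_days = 50_000
reference = synthetic_wet_days(0.6, 18.4, n_days, rng)
same_params = synthetic_wet_days(0.6, 18.4, n_days, rng)
other_params = synthetic_wet_days(0.6, 9.2, n_days, rng)

d_sampling = probability_distance(reference, same_params)    # sampling noise only
d_parameters = probability_distance(reference, other_params) # due to the P_L change
```

Because every series is exactly gamma distributed by construction, any distance well above the sampling-noise level can be attributed to the parameter difference alone, which is the logic of the comparison described above.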

d. Counting the number of peaks

To a very good approximation, observed daily precipitation PDFs are characterized by a scale-free range (the power-law range, with exponent usually in the 0–1 range) and a single physical scale (the cutoff scale). This implies that daily precipitation PDFs have no interior peak (the most probable daily precipitation value is the lowest resolvable amount) and that contributions are single-peaked, with the Camount peak giving the daily precipitation intensity that most contributes to precipitation amount and the Cvar peak giving the scale that most contributes to the second moment (section 2b).

As illustrated in the bottom row of Fig. 10, important differences in the shape of observed and simulated PDFs and contributions may occur, which in the most severe cases may include additional peaks not present in observations. While we note that deviations from the power-law range and cutoff-scale shape may be more subtle, here we provide a first quantification of model differences in shape by counting the number of simulated peaks in the PDF and contributions versus observations. In contrast to other metrics, observational datasets tend to agree in these measures—both GPCP and TRMM-3B42 display zero interior peaks in the PDF and one peak in Cvar almost everywhere (Figs. 10a,c), with some differences for Camount (Fig. 10b). We should note, however, that these observational products miss light rain (Kay et al. 2018), so the existence of additional peaks in that range is not ruled out (see also section 5b). In the case of Camount, GPCP and TRMM-3B42 tend to display a single peak almost everywhere (97.5% of regions within 50°S–50°N in GPCP and 75% in TRMM-3B42); however, TRMM-3B42 tends to display no interior peaks in dry subtropical regions (22.9% of regions; see Fig. 10b), associated with a steeper power-law range there [τP tending to exceed one; see Eq. (14)].
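One simple way to count peaks, and to summarize disagreement across regions, is sketched below. The criterion (a strict rise followed by a non-rise on a binned curve, with no smoothing or prominence threshold) and the function names are assumptions made for illustration and may differ from the procedure of section 2. The two-scale curve is an artificial mixture built to produce a second peak and is not a model output.

```python
import math

def count_interior_peaks(curve):
    """Count interior local maxima of a binned curve (end points are never counted).

    A flat step followed by a further rise would also be counted, so noisy
    curves should be smoothed before using this simple criterion.
    """
    peaks = 0
    for i in range(1, len(curve) - 1):
        if curve[i] > curve[i - 1] and curve[i] >= curve[i + 1]:
            peaks += 1
    return peaks

def percent_disagreeing(model_counts, obs_counts):
    """Percentage of regions where model and observed peak counts differ."""
    differing = sum(1 for m, o in zip(model_counts, obs_counts) if m != o)
    return 100.0 * differing / len(obs_counts)

# Logarithmically spaced intensities from 0.1 to 1000 mm.
grid = [0.1 * 10 ** (4 * i / 399) for i in range(400)]

# Gamma-type PDF with tau = 0.6 and P_L = 15 mm, and its contribution to amount.
pdf = [p ** -0.6 * math.exp(-p / 15.0) for p in grid]
c_amount = [p * f for p, f in zip(grid, pdf)]

# Artificial mixture of two cutoff scales (1 mm and 50 mm).
two_scale = [20 * p ** 0.4 * math.exp(-p) + p ** 0.4 * math.exp(-p / 50.0)
             for p in grid]
```

Here the gamma-type PDF has no interior peak, its contribution to amount has one, and the two-scale mixture has two, mirroring the contrast between the observed shapes and the more complex simulated ones.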

Fig. 10.