## Abstract

Interannual variation in precipitation totals is a critical factor governing the year-to-year availability of water resources, yet the connection between interannual precipitation variability and underlying event- and season-scale precipitation variability remains unclear. In this study, tropical and midlatitude precipitation characteristics derived from extensive station records and high-frequency satellite observations were analyzed to attribute the fraction of interannual variability arising as a result of individual variability in precipitation event intensity, frequency, and seasonality, as well as the cross-correlation between these factors at the global scale. This analysis demonstrates that variability in the length of the wet season is the most important factor globally, causing 52% of the total interannual variability, while variation in the intensity of individual rainfall events contributes 31% and variability in interstorm wait times contributes only 17%. Spatial patterns in the contribution of each of these intra-annual rainfall characteristics are informative, with regions such as Indonesia and southwestern North America primarily influenced by seasonality, while regions such as the eastern United States, central Africa, and the upper Amazon basin are strongly influenced by storm intensity and frequency. A robust cross-correlation between climate characteristics is identified in the equatorial Pacific, revealing an increased interannual variability over what is expected based on the variability of individual events. This decomposition of interannual variability identifies those regions where accurate representation of daily and seasonal rainfall statistics is necessary to understand and correctly model rainfall variability at longer time scales.

## 1. Introduction

Higher levels of atmospheric carbon dioxide and associated elevated global temperatures have led to an acceleration of the global hydrologic cycle (Held and Soden 2006) that has resulted in changes in the occurrence of both extreme climate events and interannual variability in precipitation (O’Gorman 2012; Allan and Soden 2008; Sun et al. 2012; Polade et al. 2014; Portmann et al. 2009). Accompanying these climatic shifts are observed changes in the intensity of precipitation events (O’Gorman 2012), the distribution of the length of dry spells between events (Polade et al. 2014), and the seasonality of precipitation (Portmann et al. 2009). The specific intra-annual climate characteristics of precipitation event frequency, intensity, and seasonality are all understood to directly affect hydrologic (Rodriguez-Iturbe et al. 1999; Zanardo et al. 2012) and biologic function (Knapp et al. 2008; Good and Caylor 2011; Guan et al. 2014), and all contribute to total interannual variability of precipitation. However, the relative contribution of each of these intra-annual climatological factors to interannual precipitation variability remains unclear.

Previous research into the causes of interannual variability has focused on linkages with large-scale patterns, and the association between interannual variability in precipitation and climate phenomena such as El Niño–Southern Oscillation (ENSO) and the North Atlantic Oscillation (NAO) is well established (New et al. 2001). While understanding the strength of the correlation between precipitation variability and various climate modes is critical, interpreting the consequences of such a correlation structure remains challenging. By understanding how rainfall variability is directly manifested, such as through changes in the time between storms as compared to changes in the amount of rain in each storm, we can focus further investigations, observations, and models on those climatic properties most relevant to local variability. For example, in the Amazon basin, variability in the timing of the rainy season strongly influences annual precipitation totals, with the onset of the wet season also weakly related to rainy season rainfall rate (Liebmann and Marengo 2001). Similarly, on the Indian subcontinent, interannual variability of the monsoon rainfall has been linked to both a large-scale persistent seasonal component and intraseasonal (i.e., frequency and intensity) components (Krishnamurthy and Shukla 2000). Though a number of regional investigations have connected interannual variability with large-scale indices at specific locations, investigations of the same question at a global scale remain rare (Fatichi et al. 2012).

Spatial and temporal limitations in data availability complicate the analysis of global patterns in interannual precipitation variability. Studies such as Fatichi et al. (2012) have examined interannual variability based on worldwide networks of precipitation gauges and reanalysis products. However, these networks have limited coverage over the oceans and in sparsely populated regions of Africa, Asia, and South America. Alternatively, general circulation model outputs are global in extent and have been used to separate interannual variability into components attributed to ocean, atmosphere, and land processes at a global scale (Koster et al. 2000). However, the representation of precipitation intensity, frequency, and seasonality in these global climate models does not usually match observed distributions, with most models producing too much convective and too little stratiform precipitation (Dai 2006; Crétat et al. 2014). Over the last two decades, major advancements in satellite monitoring of precipitation through projects such as the Tropical Rainfall Measuring Mission (TRMM) have resulted in large-scale estimates of rainfall. TRMM produces estimates of precipitation based on a combination of space-based microwave radiometry, active radar, and visible-infrared scanning, merged with ground rainfall gauges at the monthly scale (Huffman et al. 2007). Because TRMM reports the aggregated precipitation over an entire grid cell (~27 km at the equator), a scale mismatch exists between individual station measurements and satellite observations. Comparisons between TRMM and ground validation locations have generally shown good agreement, though the satellite observations tend to overestimate frequency and underestimate rainfall intensity (Wolff and Fisher 2008; Zhou et al. 2008).

In this study, we use data from both gauge stations and satellites to examine the climatological drivers of interannual variability in precipitation through a probabilistic framework. Station data from the Global Historical Climatology Network (GHCN)-Daily (Menne et al. 2012) are used to scale satellite-retrieved precipitation from the TRMM 3B42 version 7 (v7) dataset (Huffman et al. 2007) such that global grids of intra-annual precipitation climatology are consistent with the point-based statistics of precipitation frequency, intensity, and seasonality from the gauge data. We define frequency and intensity of precipitation on an event basis, though these characteristics are also analyzed at the daily scale, and we define the wet season as the period of the year during which 70% of that year’s precipitation falls. The GHCN dataset contains quality-controlled daily precipitation records from thousands of stations worldwide (Klein Tank et al. 2002), while the TRMM 3B42 v7 product contains 3-h precipitation estimates at a 0.25° grid resolution from 50°N to 50°S from 1998 onward (Huffman et al. 2007). Because of limitations in the length of the satellite record, our assessment addresses interannual variability only over the last two decades; however, within this time frame, the TRMM precipitation estimates provide an exceptional dataset with which to investigate the role of frequency and intensity of precipitation events at a global scale (Biasutti and Yuter 2013).

In this study we decompose the total interannual variability in scaled TRMM precipitation over the 1998–2013 period into contributions arising from variability in the frequency of precipitation events, variability in precipitation amounts associated with each event, and variability in the length of the dominant precipitation season. Using scaled TRMM observations, we reconstruct the mean and interannual variability of precipitation based on these statistics and determine the fraction of total variance attributable to each component. This approach does not attempt to investigate the physical mechanisms that give rise to each of these components of climate variability; however, the decomposition of interannual variability into the relative influence of separate intrascale rainfall characteristics allows for further investigation into those parameters most relevant for specific regions throughout the globe.

## 2. Methodology

### a. Conceptual framework

Total precipitation on any given day *P*_{d} (millimeters) can be thought of as the sum of a random number of events *N*, with each event having a random precipitation amount *A* (millimeters; i.e., *P*_{d} = *A*_{1} + *A*_{2} + ⋅⋅⋅ + *A*_{N}). Similarly, the total precipitation falling during the wet season *P*_{s} (millimeters) can be thought of as the sum of all precipitation occurring during a season of *T* days in length, with each day having a random precipitation amount of *P*_{d}. Therefore, total wet season rainfall is given by

$$P_{s} = \sum_{i=1}^{T} P_{d,i} = \sum_{i=1}^{T} \sum_{j=1}^{N_{i}} A_{ij}, \tag{1}$$

where in the simplest case each precipitation event is conceptualized as occurring instantaneously, and *A*, *N*, and *T* may be considered as independent and identically distributed random variables that characterize a homogeneous wet season. In reality, these parameters may not be fully independent, the wet season may not be homogeneous (or continuous), and precipitation events are not instantaneous. However, the conceptual framework of Eq. (1) encapsulates the critical components of precipitation climatology and serves as a base case from which rainfall and its variability may be examined.

This analysis is presented by defining the rainfall process at the event scale at a single point; even though individual events may not be specified in an aggregated daily product, all events occurring in a single day sum to daily rainfall. Note that this definition of rainfall frequency as the events per day (Rodriguez-Iturbe et al. 1984) differs from that of other studies, which define rainfall frequency as the fraction of time that rainfall is occurring (i.e., a percentage). Each of the components that contribute to annual rainfall and its variability (*A*, *N*, and *T*) can be described by its mean E[·] and variance Var[·]. Here, we define average precipitation depth as *α* = E[*A*], average event frequency as *λ* = E[*N*], average wet season length as *τ* = E[*T*], variance in precipitation depth as *σ*_{α}^{2} = Var[*A*], variance in event frequency as *σ*_{λ}^{2} = Var[*N*], and variance in season length as *σ*_{τ}^{2} = Var[*T*].

The expected total annual rainfall and its interannual variability can be expressed based on the statistical properties of the separate climatological parameters *A*, *N*, and *T*. We define the wet season as the continuous period of the year during which 70% (*f*_{w} = 0.7) of that year’s total precipitation *P*_{y} (millimeters) occurs. A value of *f*_{w} = 0.7 is chosen because this definition of wet season length was found to be most consistent (see results) with the more commonly used seasonality index (SI) of Markham (1970). Because we define the wet season as a constant fraction of the total precipitation (i.e., *P*_{s} = *f*_{w}*P*_{y}), the moments of the yearly rainfall are also scalar multiples of the seasonal statistics (for the means, *μ*_{s} = *f*_{w}*μ*_{y}; for the standard deviations, *σ*_{s} = *f*_{w}*σ*_{y}), and for brevity we will focus this analysis on the seasonal statistics.

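Using the law of total expectation, the average daily rainfall *μ*_{d} is given by

$$\mu_{d} = \mathrm{E}[P_{d}] = \mathrm{E}[N]\,\mathrm{E}[A] = \alpha\lambda, \tag{2}$$

and the average wet season rainfall *μ*_{s} described by Eq. (1) is given by

$$\mu_{s} = \mathrm{E}[P_{s}] = \mathrm{E}[T]\,\mathrm{E}[P_{d}] = \alpha\lambda\tau. \tag{3}$$

In a similar fashion, the law of total variance is used to express the expected variability in daily precipitation *σ*_{d}^{2}:

$$\sigma_{d}^{2} = \mathrm{E}[N]\,\mathrm{Var}[A] + \mathrm{E}[A]^{2}\,\mathrm{Var}[N] = \lambda\sigma_{\alpha}^{2} + \alpha^{2}\sigma_{\lambda}^{2}, \tag{4}$$

and the interseasonal variability in precipitation total *σ*_{s}^{2} is given by

$$\sigma_{s}^{2} = \tau\sigma_{d}^{2} + \mu_{d}^{2}\sigma_{\tau}^{2} = \tau\lambda\sigma_{\alpha}^{2} + \tau\alpha^{2}\sigma_{\lambda}^{2} + \alpha^{2}\lambda^{2}\sigma_{\tau}^{2}. \tag{5}$$

In Eq. (5), the first term on the right-hand side describes the contribution of variability in event intensity *A* to *σ*_{s}^{2}, the second term describes the contribution of variability in event frequency *N* to *σ*_{s}^{2}, and the third term describes the contribution of variability in event seasonality *T* to *σ*_{s}^{2}. As the variability in any climate parameter (*σ*_{α}^{2}, *σ*_{λ}^{2}, or *σ*_{τ}^{2}) approaches zero, the contribution of that term to total interseasonal and interannual variability diminishes.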
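The compound-sum moments can be checked numerically. The following is a minimal sketch, not the study's code: it assumes Poisson event counts, exponential event depths, and a discrete uniform season length purely for illustration, and all function names are hypothetical. It compares the seasonal mean *αλτ* and the three-term variance (intensity, frequency, and season length terms) against a Monte Carlo simulation of independent draws.

```python
import math
import random
from statistics import mean, pvariance


def theoretical_moments(alpha, var_alpha, lam, var_lam, tau, var_tau):
    """Mean and variance of wet season rainfall for independent A, N, and T."""
    mu_s = alpha * lam * tau
    var_s = (tau * lam * var_alpha            # event intensity term
             + tau * alpha**2 * var_lam       # event frequency term
             + alpha**2 * lam**2 * var_tau)   # season length term
    return mu_s, var_s


def poisson(rng, rate):
    """Draw a Poisson variate with Knuth's multiplication method."""
    limit, k, p = math.exp(-rate), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_season_totals(n_seasons, seed=0):
    """Simulate wet season totals (mm) under the illustrative distributions."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_seasons):
        length = rng.randint(80, 120)                 # T: season length, days
        total = 0.0
        for _day in range(length):
            for _event in range(poisson(rng, 0.5)):   # N: events on this day
                total += rng.expovariate(1.0 / 8.0)   # A: depth per event, mm
        totals.append(total)
    return totals


# Moments of the illustrative (assumed) distributions used in the simulation.
mu_s, var_s = theoretical_moments(
    alpha=8.0, var_alpha=64.0,                # exponential: variance = mean**2
    lam=0.5, var_lam=0.5,                     # Poisson: variance = mean
    tau=100.0, var_tau=(41**2 - 1) / 12.0,    # discrete uniform on 80..120
)
totals = simulate_season_totals(4000)
print("theory:    ", mu_s, var_s)
print("simulation:", mean(totals), pvariance(totals))
```

Because the simulated draws are independent, the two rows should agree to within sampling error; correlated draws would not, which is the case the overdispersion term is meant to capture.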

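With the interseasonal variance *σ*_{s}^{2} defined, we determine the fraction of interannual variance in precipitation attributable to the frequency, intensity, and seasonality. For each climatological component, its fractional contribution of variability to total interannual variance is found by separately dividing each term in Eq. (5) by *σ*_{s}^{2}:

$$\varphi_{\alpha} = \frac{\tau\lambda\sigma_{\alpha}^{2}}{\sigma_{s}^{2}}, \tag{6}$$

$$\varphi_{\lambda} = \frac{\tau\alpha^{2}\sigma_{\lambda}^{2}}{\sigma_{s}^{2}}, \tag{7}$$

$$\varphi_{\tau} = \frac{\alpha^{2}\lambda^{2}\sigma_{\tau}^{2}}{\sigma_{s}^{2}}. \tag{8}$$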

Finally, we note that because the contributions of variability in event intensity, frequency, and seasonality are expressed as fractions and because the seasonal and yearly statistics are linearly related, the fractional contribution of each term in Eq. (5) is the same for the entire year.
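As a small worked illustration, the fractional contributions can be computed by dividing each of the three variance terms (intensity, frequency, seasonality) by their sum. This is a sketch under the independence assumption; the function name and the parameter values are hypothetical and are not taken from the study.

```python
def variance_fractions(alpha, var_alpha, lam, var_lam, tau, var_tau):
    """Fraction of seasonal variance contributed by each rainfall characteristic."""
    terms = {
        "intensity": tau * lam * var_alpha,
        "frequency": tau * alpha**2 * var_lam,
        "seasonality": alpha**2 * lam**2 * var_tau,
    }
    total = sum(terms.values())
    return {name: value / total for name, value in terms.items()}


# Hypothetical site: 8 mm per event, 0.5 events per day, 100-day wet season.
fractions = variance_fractions(
    alpha=8.0, var_alpha=64.0,
    lam=0.5, var_lam=0.5,
    tau=100.0, var_tau=140.0,
)
for name, value in fractions.items():
    print(f"{name}: {value:.2f}")
```

The three fractions sum to one by construction, and multiplying every variance term by the same constant (as happens when moving from seasonal to yearly totals) leaves them unchanged.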

The value of *σ*_{s}^{2} calculated with Eq. (5) represents the theoretical variance if *A*, *N*, and *T* are independent and identically distributed random variables. Following Wilks (1999), we also calculate the variance overdispersion Σ_{(α,λ,τ)} as the ratio of the observed interseasonal variability to the value of *σ*_{s}^{2} calculated with Eq. (5), minus one. This variable expresses the fractional bias in the theoretical approach:

$$\Sigma_{(\alpha,\lambda,\tau)} = \frac{\hat{\sigma}_{s}^{2}}{\sigma_{s}^{2}} - 1, \tag{9}$$

where $\hat{\sigma}_{s}^{2}$ denotes the observed interseasonal variance.

The Σ_{(α,λ,τ)} term expresses the contribution to total variance contained in any correlation between *N*, *A*, and/or *T* as well as any autocorrelation within the variables themselves. It is important to note that the influence of correlation between *A*, *N*, and *T* may either add or subtract variance from the total process. As an example, in the case that longer wet seasons are associated with stronger daily precipitation, the Σ_{(α,λ,τ)} term is likely to be positive. Conversely, if on days with more events each rainfall event is smaller, overdispersion is likely to be negative. Throughout our analysis, we do not expect the theoretical variance of Eq. (5) to equal the observed variance, and therefore we do not expect Σ_{(α,λ,τ)} to be zero, because it is very unlikely that precipitation frequency, intensity, and seasonality are unrelated. Instead, the overdispersion term is considered a fourth derived piece of information (in addition to the fractional contributions of intensity *φ*_{α}, frequency *φ*_{λ}, and seasonality *φ*_{τ}) that describes the rainfall characteristics at each location.
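A minimal sketch of the overdispersion calculation follows. It assumes the observed variance is the unbiased sample variance of the seasonal totals (the text does not specify the estimator), and the function name and numbers are hypothetical.

```python
from statistics import variance


def overdispersion(season_totals, theoretical_variance):
    """Ratio of observed to theoretical seasonal variance, minus one."""
    observed_variance = variance(season_totals)   # assumed: sample variance
    return observed_variance / theoretical_variance - 1.0


# Hypothetical seasonal totals (mm) and a theoretical variance (mm^2).
totals = [300.0, 500.0, 400.0, 450.0, 350.0]
sigma = overdispersion(totals, theoretical_variance=8640.0)
print(f"overdispersion = {sigma:.3f}")
```

A negative value, as in this example, means the observed totals vary less from year to year than independent intensity, frequency, and season length would imply; a positive value means they vary more.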

### b. Parameter estimation

For each year from 1998 through 2013, we sum the GHCN- and TRMM-based rainfall estimates at each station or grid location to determine the observed mean annual precipitation *μ*_{y} and the observed interannual variability in precipitation *σ*_{y}^{2}. We note that 16 years of data is a relatively short period over which to analyze interannual statistics; however, given the limited availability of long-term global data, we do have a large record of daily rainfall statistics from which to estimate intra-annual rainfall statistics. Only GHCN stations with data records that were 95% complete were included in this study. Seasonality is assessed through the use of circular statistics following Markham (1970), where daily precipitation amounts are translated into vectors, with the magnitude of each vector corresponding to the daily precipitation amount and the angle of each vector corresponding to the day of year. For each year, all the daily vectors are summed, and the resulting vector direction date *θ* is taken to correspond to the center of the wet season. The magnitude of the resultant vector can be divided by the total yearly precipitation to determine the commonly used SI (Markham 1970). For each year, the wet season length is estimated by finding the number of days, centered at *θ*, which include 70% of that year’s rainfall, with each year treated as having periodic boundary conditions. The mean *τ* and variance *σ*_{τ}^{2} of the wet season duration are then calculated from the 16 separate wet season lengths.
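The circular-statistics steps above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: it assumes a 365-day year with nonzero total rainfall, a zero-based day index, and a window grown symmetrically one day on each side of the center day until it holds the target fraction. Function names are hypothetical.

```python
import math


def seasonality(daily_mm):
    """Return (center day index, seasonality index) for one year of daily rain."""
    n = len(daily_mm)
    # Each day is a vector: magnitude = rainfall, angle = position in the year.
    x = sum(p * math.cos(2 * math.pi * d / n) for d, p in enumerate(daily_mm))
    y = sum(p * math.sin(2 * math.pi * d / n) for d, p in enumerate(daily_mm))
    theta = math.atan2(y, x) % (2 * math.pi)      # direction of resultant vector
    center = int(round(theta / (2 * math.pi) * n)) % n
    si = math.hypot(x, y) / sum(daily_mm)         # resultant length / yearly total
    return center, si


def wet_season_length(daily_mm, f_w=0.7):
    """Days in the symmetric window around the center holding f_w of the rain."""
    n = len(daily_mm)
    center, _ = seasonality(daily_mm)
    target = f_w * sum(daily_mm)
    captured = daily_mm[center]
    half = 0
    while captured < target and 2 * half + 1 < n:
        half += 1
        # Modulo indexing treats the year as periodic.
        captured += daily_mm[(center - half) % n] + daily_mm[(center + half) % n]
    return min(2 * half + 1, n)


# Hypothetical year: 1 mm on each of 100 consecutive days, dry otherwise.
year = [1.0 if 150 <= d < 250 else 0.0 for d in range(365)]
print(seasonality(year), wet_season_length(year))
```

Applying `wet_season_length` to each year separately and taking the mean and variance of the results gives the season length statistics described in the text.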

To estimate the statistics of event frequency and intensity during the wet season, we combine all the identified wet seasons into a single time series. From this time series we estimate the statistics of daily wet season rainfall (*μ*_{d} and *σ*_{d}^{2}) directly. Because all precipitation observations are an aggregated product (i.e., it is impossible to know how many individual events occur in a single period over which precipitation is reported), characterization of the frequency of individual events is accomplished through assessment of the distribution of the lengths of consecutive dry days. If precipitation events are approximately Poisson in nature, then the interevent waiting times would be exponentially distributed; however, it is likely that the distribution of the lengths of dry spells will have a heavier/longer tail (Wilks 1999). As a more general alternative to a simple Poisson assumption, we assume that the number of individual events in a period of *t* days is drawn from a distribution approximating a negative binomial. Thus, the probability of a dry period of length *t* occurring is given by

$$\Pr(t) = \left(1 + \frac{\lambda t}{r}\right)^{-r}, \tag{10}$$

and as *r* approaches infinity, waiting times become exponential (see the appendix). Equation (10) is analogous to the survival function for a long-tailed Pareto type II distribution. Based on the number of occurrences of dry periods of *t* days in our time series, we calculate this probability for all values of *t* up to 10 days and fit values of *λ* and *r* using Eq. (10). The variance in the number of events per day is then given by *σ*_{λ}^{2} = *λ* + *λ*^{2}/*r*. Note that our definition of precipitation frequency, and its associated variance, is event based and therefore fundamentally different than some other climate studies (e.g., Zhou et al. 2008; Guan et al. 2014) where frequency is simply defined as the percentage of days with rainfall (i.e., *λ* ≤ 1), and as such our *λ* may be greater than one and is expected to be larger than frequencies defined as the percentage of wet days. However, we also assess this alternate conception of rainfall frequency (i.e., wet day frequency), with *N* considered a simple Bernoulli random variable that takes a value of either zero or one, with mean *λ* and variance *σ*_{λ}^{2} = *λ*(1 − *λ*).
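The fitting step can be illustrated with a short sketch. It assumes the dry-period probability takes the Pareto type II survival form (1 + *λt*/*r*)^(−*r*), which is consistent with the exponential limit and the negative binomial count model described above, and it uses a plain least-squares grid search as a stand-in, since the text does not state the fitting procedure. All names and the synthetic data are hypothetical.

```python
import math


def dry_spell_survival(t, lam, r):
    """Probability of no events in t days under the Pareto type II form."""
    return (1.0 + lam * t / r) ** (-r)


def fit_lambda_r(observed, lam_grid, r_grid):
    """Least-squares grid search; `observed` maps t (days) to a probability."""
    best = None
    for lam in lam_grid:
        for r in r_grid:
            sse = sum((dry_spell_survival(t, lam, r) - p) ** 2
                      for t, p in observed.items())
            if best is None or sse < best[0]:
                best = (sse, lam, r)
    return best[1], best[2]


def event_count_variance(lam, r):
    """Negative binomial variance of daily event counts with mean lam."""
    return lam + lam**2 / r


# Synthetic "observations" for t = 1..10 days, generated from known parameters.
observed = {t: dry_spell_survival(t, 0.6, 2.0) for t in range(1, 11)}
lam_grid = [i / 20 for i in range(1, 41)]     # 0.05 .. 2.00 events per day
r_grid = [j / 4 for j in range(1, 41)]        # 0.25 .. 10.0
lam_fit, r_fit = fit_lambda_r(observed, lam_grid, r_grid)
print(lam_fit, r_fit, event_count_variance(lam_fit, r_fit))

# For very large r the survival function approaches the exponential exp(-lam * t).
print(dry_spell_survival(3, 0.6, 1e6), math.exp(-0.6 * 3))
```

With real records, a continuous optimizer would likely replace the grid search; the grid is used here only to keep the example dependency-free.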

To estimate the statistics of event amounts, we directly use the statistics of daily rainfall totals (*μ*_{d}, *σ*_{d}^{2}) as well as the event frequency values obtained previously. Rearranging Eq. (2), the average precipitation depth associated with a single event is given by *α* = *μ*_{d}/*λ*. Similarly, by rearranging Eq. (4), the variance in the precipitation depth for a single event is given as *σ*_{α}^{2} = (*σ*_{d}^{2} − *α*^{2}*σ*_{λ}^{2})/*λ*. Using this approach to estimate *α* and *σ*_{α}^{2} ensures that the statistics of daily rainfall totals and the statistics for the distribution of length of dry spells match those observed from the precipitation record. Note that in the assessment of wet day frequency (i.e., the Bernoulli case), *α* and *σ*_{α}^{2} refer to the mean and variance of wet day rainfall totals.
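The rearrangement can be written as a two-line calculation. The sketch below inverts the daily mean (*μ*_{d} = *αλ*) and daily variance (*σ*_{d}^{2} = *λσ*_{α}^{2} + *α*^{2}*σ*_{λ}^{2}) relations to recover the per-event statistics; the function name and the example values are hypothetical.

```python
def event_statistics(mu_d, var_d, lam, var_lam):
    """Per-event mean and variance from daily moments and event frequency."""
    alpha = mu_d / lam                                  # mean depth per event
    var_alpha = (var_d - alpha**2 * var_lam) / lam      # variance of event depth
    return alpha, var_alpha


# Hypothetical daily statistics: mean 4 mm/day, variance 64 mm^2,
# with 0.5 events per day and an event-count variance of 0.5.
alpha, var_alpha = event_statistics(mu_d=4.0, var_d=64.0, lam=0.5, var_lam=0.5)
print(alpha, var_alpha)

# Round trip: the recovered values reproduce the daily moments.
print(alpha * 0.5, 0.5 * var_alpha + alpha**2 * 0.5)
```

Note that the recovered variance is only meaningful when *σ*_{d}^{2} exceeds *α*^{2}*σ*_{λ}^{2}; otherwise the subtraction yields a negative value, which would signal inconsistent inputs.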

An investigation into the relative importance of intra-annual rainfall characteristics requires global estimates of precipitation event frequency and intensity that accurately reflect the daily statistics of precipitation observed with rain gauges, and the scale mismatch between the global TRMM data and GHCN point data must be accounted for. Comparisons between satellite rainfall estimates and station data have demonstrated that TRMM systematically overestimates the occurrence of precipitation and underestimates the amounts when analyzing the 0.25° by 0.25° gridded satellite estimates (Huffman et al. 2007; Zhou et al. 2008), as is expected for a spatially averaged metric. Furthermore, our analysis is conducted at the point scale, and so we scale our TRMM-derived parameters to be consistent with point-based rainfall gauge data. In our approach we minimized this discrepancy by calculating the climate parameters of *μ*_{y}, *σ*_{y}, *μ*_{d}, *σ*_{d}, *τ*, *σ*_{τ}, *λ*, and *σ*_{λ} from each GHCN station and each TRMM grid cell. The GHCN parameters of all stations within each 0.25° by 0.25° grid cell were then averaged, and subsequently each TRMM parameter was multiplied by a single global scale factor so as to minimize the root-mean-square error between each TRMM and GHCN parameter. Scaled TRMM parameters were then used to calculate the *α* and *σ*_{α} and thereafter calculate the contributions of frequency, intensity, and seasonality to interannual precipitation.
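A single multiplicative factor that minimizes the root-mean-square error between scaled satellite values and gauge values has a closed form: it is the least-squares slope through the origin. The sketch below shows this for one parameter; the function names and the sample values are hypothetical, and the grid-cell averaging of stations is assumed to have been done beforehand.

```python
import math


def global_scale_factor(trmm_values, ghcn_values):
    """Scale factor s minimizing the RMSE between s * TRMM and GHCN values."""
    numerator = sum(t * g for t, g in zip(trmm_values, ghcn_values))
    denominator = sum(t * t for t in trmm_values)
    return numerator / denominator


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


# Hypothetical wet season lengths (days) at four grid cells with gauges.
trmm = [210.0, 180.0, 250.0, 120.0]
ghcn = [204.0, 176.0, 240.0, 118.0]
s = global_scale_factor(trmm, ghcn)
scaled = [s * t for t in trmm]
print(f"scale factor = {s:.3f}")
print(f"RMSE before = {rmse(trmm, ghcn):.2f}, after = {rmse(scaled, ghcn):.2f}")
```

Each parameter would receive its own factor computed this way, with the same factor then applied to every grid cell.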

## 3. Results

In total, 7663 GHCN stations in 5265 grid cells (Fig. 1) reported at least 95% of daily data during the TRMM interval (1998–2013), with the station-based estimates of rainfall characteristics broadly consistent with the scaled TRMM estimates. Statistics were calculated for the GHCN station data across a range of defined fractions of the precipitation occurring in the wet season *f*_{w}, and the final average fractional contributions *φ*_{α}, *φ*_{λ}, and *φ*_{τ} were relatively insensitive to *f*_{w} values between 0.4 and 0.8 (Fig. 2), indicating that the defined wet seasons exhibit a degree of internal homogeneity over this range. The estimated length of the wet season is consistent with the conventional SI of Markham (1970), with the highest correlation occurring at *f*_{w} = 0.7. Based on this value, the average length of the wet season is 203 days for the GHCN database, with the wet season length bound between 256 = *f*_{w}365 and 0 days, while the SI values are bound between 1 and 0. This upper bound on the wet season length arises because, in the extreme case, when all precipitation is distributed perfectly uniformly throughout the year (i.e., SI = 0), *f*_{w}365 days will encompass the seasonal rainfall of *P*_{s} = *f*_{w}*P*_{y} of that year. Thus, given an SI value, the length of the wet season can be accurately approximated as τ = *f*_{w}365(1 − SI), with a root-mean-square error of approximately 11 days and *r*^{2} of 0.96 (Fig. 3). The seasonality index has been found to be strongly correlated with other measures of seasonality such as the precipitation concentration index (PCI) and the seasonality concentration index (SCI) (Fatichi et al. 2012), and thus our estimate of wet season length *τ* is likely correlated with these other metrics as well.
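The approximation τ = *f*_{w}365(1 − SI) is simple enough to express directly. The helper below is a hypothetical illustration of that relation and its two bounds, not part of the study's processing.

```python
def wet_season_length_from_si(si, f_w=0.7, days_per_year=365):
    """Approximate wet season length (days) from the seasonality index."""
    if not 0.0 <= si <= 1.0:
        raise ValueError("SI must lie between 0 and 1")
    return f_w * days_per_year * (1.0 - si)


# Perfectly uniform rainfall (SI = 0) gives the upper bound of about 256 days;
# all rainfall on a single day (SI = 1) gives the lower bound of 0 days.
for si in (0.0, 0.25, 0.5, 1.0):
    print(si, round(wet_season_length_from_si(si), 1))
```

Given the reported root-mean-square error of approximately 11 days, values from this relation are best read as estimates rather than exact season lengths.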

Scaling factors were established based on the TRMM and GHCN statistics calculated at the GHCN locations using a value of *f*_{w} = 0.7 (Fig. 4). At these locations, TRMM overestimates *μ*_{y} by about 3% and overestimates *σ*_{y} by 2%. After scaling TRMM-derived estimates of *μ*_{y} by 0.97 and *σ*_{y} by 0.98, the adjusted TRMM dataset has a root-mean-square error of 180 mm yr^{−1} for *μ*_{y} and 54 mm yr^{−1} for *σ*_{y} compared with the GHCN station data. For daily rainfall, TRMM estimates of *μ*_{d} are sufficiently accurate such that no scaling is required, while TRMM overestimates *σ*_{d} by 5% and a scaling factor of 0.95 is applied. After scaling, the TRMM daily statistics have a root-mean-square error of 0.6 mm day^{−1} for *μ*_{d} and 1.7 mm day^{−1} for *σ*_{d} when compared with the GHCN station data. We find that TRMM estimates of the fraction of wet days during the wet season are also very close to GHCN data, with a scale factor of 0.996 used. The lengths of the wet seasons estimated via TRMM are also found to be 3% higher, with their variability underestimated by 12%, and scaling factors of 0.97 and 1.12 are used. Estimating *λ* and *σ*_{λ} for both the station and satellite data demonstrates that the frequency of individual events is overestimated with TRMM by 11% and its variability is overestimated by 2%.

The climate parameters of frequency and intensity are estimated such that the distribution of daily rainfall totals and dry period lengths are maintained. Using the Pareto type II interstorm waiting times, the global average precipitation amount for a single storm is 5.6 mm, with an average of 4.9 mm over the oceans and 7.4 mm over land. The average storm frequency is 0.71 globally, with an average of 0.76 over oceans and 0.57 over land during the wet season, as assessed by fitting Eq. (10). By using this long-tailed Pareto type II distribution, *λ* and *σ*_{λ} are able to adequately fit the probability of extended dry spells (Fig. 5). This approach considerably improves on exponential models of dry spell lengths for dry spell lengths greater than one day. The exponential model is identical to results expected from defining the frequency of precipitation at the daily scale as a Bernoulli trial. Defining rainfall frequency as the occurrence of wet days fails to accurately capture dry spells of any duration greater than one day and thus also underestimates total temporal variability in the occurrence of wet days. Based on the Pareto distribution approach, the ratio of *σ*_{λ} to *λ* is 1.7 on average globally, where over the oceans this ratio is 1.5 and over land this ratio is 2.1. Note that for an exponential assumption, this ratio is always assumed to be 1:1, and as denoted by the increased *σ*_{λ}:*λ* ratio over land, models that utilize exponential distributions of interstorm wait times (e.g., Laio et al. 2001) may considerably underestimate the occurrence of extended dry periods.

The scale mismatch between spatial satellite averages and point observations coupled with heterogeneities in topography and reporting station densities will result in discrepancies between the TRMM- and GHCN-derived parameter sets. However, after scaling the TRMM parameters of *μ*_{y}, *σ*_{y}, *μ*_{d}, *σ*_{d}, *τ*, *σ*_{τ}, *λ*, and *σ*_{λ}, we find that we have removed nearly all bias in the TRMM-predicted values of the intensity contribution *φ*_{α} (average of GHCN − TRMM = 0.05 ± 0.11, with the value after ± denoting one standard deviation), the frequency contribution *φ*_{λ} (−0.06 ± 0.09), and the seasonality contribution *φ*_{τ} (0.01 ± 0.16) relative to the contributions predicted at stations. Based on estimated statistics of precipitation frequency, intensity, and seasonality, we also calculate the expected mean annual precipitation *μ*_{y}, its variability *σ*_{y}, and its coefficient of variation *σ*_{y}/*μ*_{y} with both GHCN and TRMM data. These theoretical predictions quantitatively match the observed precipitation statistics (gray points in Figs. 4a,e for observed and modeled values at GHCN stations). The theoretical prediction of mean annual precipitation, with *μ*_{s} calculated using Eq. (3), has a root-mean-square error of 93 mm (*r*^{2} = 0.99) when compared to the observed GHCN mean annual precipitation values and a root-mean-square error of 19 mm (*r*^{2} = 0.99) when compared with the observed TRMM mean annual precipitation values. The theoretical model reproduces the observed standard deviation in annual precipitation less accurately, with a root-mean-square error of 98 mm (*r*^{2} = 0.73) for the GHCN stations and 109 mm (*r*^{2} = 0.66) for the entire TRMM dataset. We note that perfect congruence between the theoretical and observed mean and interannual variance is not expected, as the covariance between climate characteristics captured in the overdispersion metric is not represented in Eqs. (3) and (5). The average overdispersion value at GHCN locations is −0.43, with the overdispersion calculated from TRMM and GHCN at station locations also consistent (average of GHCN − TRMM = 0.10 ± 0.37).

The global distribution of mean annual and interannual variability is characterized by elevated variability throughout the tropical oceans and in the large continental basins (Figs. 6a,b). When interannual variability is normalized by total precipitation, locations with the least rainfall have the largest variability relative to their precipitation. Global patterns in the mean precipitation event frequency and its variability (Figs. 6c,d) mirror those of mean annual precipitation totals; however, patterns in the mean and variability of precipitation event intensity (Figs. 6e,f) are dramatically different from both each other and previous patterns. On average, the most intense precipitation occurs in the Great Plains of North America and in the Pampas region of South America, as well as northern India, the Sahel, and the Horn of Africa. The largest variability in precipitation event intensity occurs on the eastern sides of the northern and southern landmasses at approximately 25° latitude. Patterns in wet season length (Figs. 6g,h) are less distinct, with the longest wet season occurring near the equator and at higher latitudes.

Globally, the percentage of total interannual variability that arises as a result of variability in precipitation event intensity is 31%, the percentage resulting from variability in precipitation event frequency is 17%, and the percentage that arises as a result of variability in wet season length is 52%. The fractional contributions of intensity *φ*_{α}, frequency *φ*_{λ}, and seasonality *φ*_{τ}, together with the variance overdispersion Σ_{(α,λ,τ)}, all exhibit coherent spatial patterns throughout the globe (Fig. 7). The fractional contributions derived from GHCN stations qualitatively match the TRMM pattern; however, because of the clustering of stations in North America, Europe, and coastal Australia, a 1:1 congruence is not expected. No strong relationships are observed between mean annual precipitation and *φ*_{α}, *φ*_{λ}, or *φ*_{τ}, with demonstrating a slightly elevated influence at higher rainfall amounts (2000–3000 mm) and weakly diminished influence at lower rainfall amounts (500–1500 mm). As variability in interannual rainfall totals increases, becomes an increasingly large contributor to total variability, while the relative contribution of and decreases.

Spatially, the contribution of variability in storm intensity is larger over the oceans (34%) than over land (24%) and is strongest over midlatitude oceans. Though the length of the wet season generally drives interannual variability over land, regions such as southern Australia and central Africa are also strongly influenced by variability in storm intensity. Similarly, eastern North America and central Asia are also influenced by the intensity contribution *φ*_{α}. Over the oceans, the contribution of intensity is greatest at around 25°N and around 25°S and reaches its minimum near the equator. This strong oceanic pattern is less distinct over land, with a contribution varying between 10% and 40% in these latitudes (Fig. 8), with local minima near about 20°N and about 20°S.

The contribution of variability in storm frequency is the smallest of the three principal components, contributing 14% over the oceans and 25% over land. However, in portions of southern Europe this component is the principal driver of interannual variability. Elsewhere throughout the globe, the frequency contribution *φ*_{λ} plays a nominal part in determining interannual variation in precipitation. When precipitation frequency is described by the occurrence of wet days, frequency values decrease relative to those assessed on an event basis. Using this definition, the contribution of rainfall frequency to interannual variability is also less, and since we maintain the statistics of daily rainfall here, the contribution of rainfall intensity is increased (gray lines in Fig. 8), though the shift is small. In either definition of rainfall frequency, seasonality and the estimated variance overdispersion are not affected.

Spatially, variability in wet season length contributes 52% over the oceans and 51% over land. Over the tropical Pacific and Atlantic Oceans, variability in wet season length is the main driver of interannual variability. In Mexico and the southwestern United States, the seasonality contribution *φ*_{τ} is also the primary cause of interannual variability. In northern Australia, Florida, portions of East Africa, and most of southern Africa, interannual variability is also primarily caused by variability in wet season length. Additionally, Indonesia and northern Australia are also regions where *φ*_{τ} is the largest contributor to interannual variability. Over both the oceans and land, the contribution of seasonality exceeds the contribution of intensity and frequency except at southern midlatitudes (Fig. 8).

Much important information about global precipitation variability is contained in the global pattern of variance overdispersion (Fig. 7d). The average global value of Σ_{(α,λ,τ)} is −25%, which is heavily influenced by the average ocean value of −19% when compared to the average land value of −41%. The strong oceanic overdispersion is centered in the equatorial Pacific. In this region the positive covariance between climatology parameters considerably increases the total interannual variability over the amount estimated based on the assumptions of independent, identically distributed random variables. A few regions, such as western Mexico and Saudi Arabia, contain a negative covariance that decreases observed interannual precipitation variability below that predicted from the theoretical relationship. When zonally averaging the variance overdispersion (Fig. 9), we see the strong peak over the oceans in Σ_{(α,λ,τ)} at the equator, whereas this term fluctuates around zero at the midlatitudes. Over land, small negative depressions in Σ_{(α,λ,τ)} occur around about 20°N, about 0°, and about 20°S, though the overdispersion on land is relatively constant across latitudes.

## 4. Discussion and conclusions

The decomposition of interannual rainfall variability into the contributions of event intensity, frequency, and seasonality identifies the importance of each of these factors in relation to total rainfall variability. Our analysis demonstrates that the majority of interannual precipitation variability arises as a result of variation in the length of the predominant wet season, consistent with similar findings of Fatichi et al. (2012) that locations where precipitation is concentrated in a few months are most susceptible to interannual variability. The contribution of seasonality is highest in the tropics where migration of the intertropical convergence zone (ITCZ) creates distinct wet season–dry season differences. As suggested by Krishnamurthy and Shukla (2000), the influence of seasonality is found to be most pronounced in those regions classically associated with monsoon regimes, such as India and the Indonesia–Australian region. Because these regions have both high seasonality and negative overdispersion values, an anticorrelation between the length of the wet season and the intensity and/or frequency of precipitation events during the wet season suggests that stronger monsoonal years are likely characterized by less rainfall per day. This conclusion is partly supported by analysis of hourly rainfall observations over eastern China (Yu et al. 2007).

The individual contribution of variability in the frequency and intensity of precipitation events is less than that of seasonality; however, these two factors combined contribute about half the observed variance in interannual precipitation. Regional differences in rainfall intensity are related to the intensity of convection (Biasutti and Yuter 2013), and thus patterns in the contribution of intensity indicate locations of variability in convection strength. Regions of large variability in intensity are predominantly found over midlatitudes, with the eastern United States most influenced by the intensity of individual events. This indicates that different mechanisms govern interannual precipitation variability in different latitude zones. Globally, a variety of atypical rainfall regimes, such as bimodal and skewed precipitation distributions, are known to occur (Wang and LinHo 2002), and these regions are more difficult to characterize through a single global methodology. In this work, we define the wet season as the portion of the year encompassing 70% of the total rainfall, and therefore we do not explicitly consider the possibility of multiple separate wet periods. However, under our definition of seasonality, this type of climate regime will be characterized by higher variability in the frequency of events as a result of multiple periods of frequent events being grouped with a set of longer interstorm wait times. Thus, regions where variability in event frequency is a strong driver of interannual variability are indicative of the influence of multimodal precipitation regimes. Finally, our analysis uses only 16 years of TRMM observations (1998–2013), and longer-term climatic phenomena may not be manifest in our results.
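The 70% wet season definition can be illustrated with a short Python sketch. It assumes the wet season is the shortest contiguous window, allowed to wrap across the calendar boundary, whose rainfall reaches the chosen fraction of the annual total. That reading, the function name `wet_season`, and the monthly resolution are assumptions made for illustration and are not taken from the study.

```python
def wet_season(rain, fraction=0.7):
    """Return (start_index, length) of the shortest circular window whose
    rainfall sum is at least `fraction` of the annual total.

    `rain` is a sequence of non-negative totals (e.g., 12 monthly values).
    Returns None when the year has no rainfall.
    """
    n = len(rain)
    total = sum(rain)
    if total <= 0:
        return None
    target = fraction * total
    doubled = list(rain) + list(rain)  # lets windows wrap past the year end
    for length in range(1, n + 1):
        # Slide a window of this length around the year, tracking the wettest one.
        window = sum(doubled[:length])
        best_start, best_sum = 0, window
        for start in range(1, n):
            window += doubled[start + length - 1] - doubled[start - 1]
            if window > best_sum:
                best_start, best_sum = start, window
        if best_sum >= target:
            return best_start, length
    return 0, n  # fallback; not reached for non-negative input with fraction <= 1


if __name__ == "__main__":
    # Hypothetical monthly totals (mm) with a single mid-year wet season.
    monthly = [5, 5, 10, 40, 80, 120, 100, 60, 20, 10, 5, 5]
    print(wet_season(monthly))
```

Because a single window is returned, a bimodal regime with two separate wet periods would be summarized by one longer window spanning both, which is the limitation noted in the paragraph above.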

Our analysis does not specifically address the underlying mechanisms and climate dynamics that give rise to the observed patterns in intra-annual precipitation variability. However, because yearly rainfall must arrive in individual events, large-scale climate modes such as ENSO must manifest themselves by altering the frequency, intensity, or seasonality of precipitation from normal conditions or by altering the cross-correlation between these subannual factors. The highly elevated cross-correlation of subannual precipitation characteristics in the equatorial Pacific indicates a possible manifestation of El Niño or La Niña oscillations in this region. We find that rainfall in this region is more variable than expected from the individual variability of TRMM-observed rainfall frequency, intensity, and seasonality alone. This increase over the theoretical expectation suggests that these factors exhibit a positive correlation, though we have not yet analyzed the overdispersion term in sufficient detail to suggest a cause with high confidence. Additional continental regions with high overdispersion are southeastern Australia and the western United States, though we do not speculate on the factors driving these patterns. Further investigation of global high-frequency precipitation observations is needed to decompose Σ_{(α,λ,τ)} and determine how multidecadal fluctuations manifest themselves at subannual time scales. The relatively short length of reliable high-frequency gridded precipitation data currently limits the ability to investigate long-term climate variability.

Our findings are particularly relevant for general circulation models (GCMs) attempting to reproduce patterns in observed interannual variability. GCMs have varying skill in their ability to produce rainfall with realistic characteristics, and the under- or oversimulation of rainfall variability will alter predicted water resource availability away from actual conditions (Rocheta et al. 2014), with potential drastic consequences for water end users such as agricultural and natural ecosystems. The specific representation of the relative occurrence of convective precipitation versus stratiform precipitation and their variability in GCMs determines both short-term and long-term frequency of precipitation (Dai 2006), and the categorization of the regional influence of each of these parameters provides a focal point for further model development. As noted by Polade et al. (2014), GCMs predict that increased interannual variability in precipitation will be caused by a decreased number of days per year with precipitation. Our results provide a method for understanding future variability, and we suggest that if this decrease in the number of wet days annually is temporally clustered (i.e., a shift in seasonality) as opposed to evenly distributed (i.e., a shift in frequency or intensity), the resulting increase in interannual variability will be larger.

Finally, precipitation variability is strongly linked to ecosystem function and water resource availability. Land–atmosphere coupling of precipitation is governed by the availability of water and energy at the surface (Koster et al. 2000), and the degree of intra-annual precipitation variability indicates the time scale of variation in surface conditions of available water. Variation in surface conditions then propagates into interannual variation in soil and canopy water, energy, and carbon fluxes, as well as ecosystem growth and structure (Raich et al. 2002; Ma et al. 2007; Tian et al. 1998; D’Odorico and Bhattachan 2012). Thus, regions characterized by elevated variability in rainfall seasonality as compared to regions of elevated variability in rainfall intensity or rainfall frequency are likely to have divergent ecosystem structure and productivity, as has already been shown for regions of contrasting mean seasonality and intensity (Good and Caylor 2011; Guan et al. 2014; Fatichi and Ivanov 2014). The importance of seasonality in determining interannual variability suggests that wet season–dry season differences in ecosystem function are therefore also likely the dominant factor in interannual variability in ecosystems’ water, carbon, and energy fluxes.

## Acknowledgments

This material is based in part upon work supported by the National Science Foundation under Grants BCS-1026334, BCS-1115009, EF-01241286, and SES-1360421, and the Princeton Environmental Institute. K.G. acknowledges the NASA Earth and Space Science Fellowship. The support and resources from the Center for High Performance Computing at the University of Utah are also gratefully acknowledged.

### APPENDIX

#### The Negative Binomial Distribution

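The negative binomial distribution *N* ~ NB(*r*; *p*) is a two-parameter discrete distribution bounded at zero with the following probability mass function:

$$
P(N = n) = \binom{n + r - 1}{n} p^{n} (1 - p)^{r}, \tag{A1}
$$

where $E[N] = pr/(1 - p)$ and $\mathrm{Var}[N] = pr/(1 - p)^{2}$. The distribution in Eq. (A1) represents the probability of *n* successful Bernoulli trials before a specified number *r* of failures occurs, where *p* is the probability of a successful trial. The binomial coefficient may be rewritten using the gamma function Γ(⋅) and expressed with respect to its mean value (specified here as *λt*), such that as *r* goes to infinity *p* approaches zero to keep the mean value constant. The probability mass function is then given by

$$
P(N = n) = \frac{\Gamma(n + r)}{n!\,\Gamma(r)} \left(\frac{\lambda t}{r + \lambda t}\right)^{n} \left(\frac{r}{r + \lambda t}\right)^{r}, \tag{A2}
$$

where now *r* is a real-valued dispersion parameter, $E[N] = \lambda t$, and $\mathrm{Var}[N] = \lambda t + (\lambda t)^{2}/r$. Note that as *r* → ∞ the above becomes

$$
P(N = n) = \frac{(\lambda t)^{n} e^{-\lambda t}}{n!}, \tag{A3}
$$

which is the Poisson distribution, where $E[N] = \mathrm{Var}[N] = \lambda t$.

For dry periods *t* days in length, we assess $P(N = 0)$, and Eq. (A2) becomes

$$
P(N = 0) = \left(1 + \frac{\lambda t}{r}\right)^{-r}, \tag{A4}
$$

which is a long-tailed Pareto type II survival function. Again, as *r* → ∞ the above becomes

$$
P(N = 0) = e^{-\lambda t}, \tag{A5}
$$

which is the exponential survival function associated with Poisson process wait times.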
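The limiting behavior described in this appendix can be checked numerically. The sketch below is a minimal illustration using only the Python standard library. It evaluates the negative binomial probability mass function in its mean and dispersion form, $\Gamma(n + r)/[n!\,\Gamma(r)]\,[\lambda t/(r + \lambda t)]^{n}\,[r/(r + \lambda t)]^{r}$, and the dry period survival function $(1 + \lambda t/r)^{-r}$. The function names and the parameter values in the demonstration are hypothetical choices for illustration.

```python
import math


def nb_pmf(n, lam_t, r):
    """Negative binomial pmf with mean lam_t and real-valued dispersion r."""
    log_p = (math.lgamma(n + r) - math.lgamma(r) - math.lgamma(n + 1)
             + n * math.log(lam_t / (r + lam_t))
             + r * math.log(r / (r + lam_t)))
    return math.exp(log_p)


def poisson_pmf(n, lam_t):
    """Poisson pmf with mean lam_t (the r -> infinity limit of nb_pmf)."""
    return math.exp(n * math.log(lam_t) - lam_t - math.lgamma(n + 1))


def dry_survival(t, lam, r):
    """Probability of no event in t days: (1 + lam*t/r)^(-r), Pareto type II."""
    return math.exp(-r * math.log1p(lam * t / r))


if __name__ == "__main__":
    lam, t = 0.2, 10.0  # hypothetical event rate (per day) and dry period (days)
    for r in (0.5, 5.0, 500.0):
        print(r, nb_pmf(3, lam * t, r), dry_survival(t, lam, r))
    print("Poisson / exponential limits:", poisson_pmf(3, lam * t), math.exp(-lam * t))
```

Small values of *r* give a heavier tail than the exponential limit, so long dry periods are more probable for the same mean event rate.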

If the probability of *t* dry days occurring is log–linear, the underlying renewal process is likely Poisson in nature. In this case, fitting the observed distribution of precipitation dry spells from the TRMM time series with Eq. (A4) will result in a survival function close to $e^{-\lambda t}$, with *r* taking large values. However, when the observed survival function has a distinct curvature in log space, fitted values of *r* will be much smaller. By finding the values of *λ* and *r* that best match the distribution of dry spell lengths, the waiting time distribution of Eq. (A4) is able to capture the range of precipitation event frequencies from Poisson-like processes to more heavy-tailed distributions.
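One way to picture this fitting step is sketched below in Python. The text does not specify the fitting procedure, so the approach here is an assumption: a coarse grid search that minimizes the squared difference between the logarithm of an empirical survival estimate and the logarithm of the survival function $(1 + \lambda t/r)^{-r}$. The helper names, the grids, and the synthetic dry spell lengths are all hypothetical, and continuous spell lengths are used for simplicity.

```python
import math

# Hypothetical search grids for the rate (per day) and dispersion parameters.
LAM_GRID = [0.05, 0.1, 0.2, 0.4]
R_GRID = [0.5, 1.0, 2.0, 5.0, 10.0, 50.0, 200.0]


def log_survival(t, lam, r):
    """Log of the dry spell survival function (1 + lam*t/r)^(-r)."""
    return -r * math.log1p(lam * t / r)


def empirical_survival(spells):
    """Pair sorted spell lengths with plotting-position survival estimates."""
    ts = sorted(spells)
    n = len(ts)
    return [(t, (n - i - 0.5) / n) for i, t in enumerate(ts)]


def fit_lam_r(spells, lam_grid, r_grid):
    """Grid search for (lam, r) minimizing squared error in log-survival space."""
    points = empirical_survival(spells)
    best = None
    for lam in lam_grid:
        for r in r_grid:
            sse = sum((log_survival(t, lam, r) - math.log(s)) ** 2
                      for t, s in points)
            if best is None or sse < best[0]:
                best = (sse, lam, r)
    return best[1], best[2]


def synthetic_spells(lam, r, n=200):
    """Deterministic quantiles of the survival function; r = math.inf gives
    the exponential (Poisson process) limit."""
    spells = []
    for i in range(n):
        s = (n - i - 0.5) / n  # target survival probability
        if math.isinf(r):
            spells.append(-math.log(s) / lam)
        else:
            spells.append((r / lam) * (s ** (-1.0 / r) - 1.0))
    return spells


if __name__ == "__main__":
    print(fit_lam_r(synthetic_spells(0.2, 2.0), LAM_GRID, R_GRID))
    print(fit_lam_r(synthetic_spells(0.2, math.inf), LAM_GRID, R_GRID))
```

For the heavy-tailed synthetic sample the search recovers a small *r*, while the exponential sample pushes *r* toward the top of the grid, mirroring the contrast described above. A maximum likelihood fit on integer day counts would be a reasonable alternative.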

## REFERENCES

*Biogeosci. Discuss.*,

**11**,