Records of daily rainfall accumulations from 447 rain gauge stations over the central United States (Minnesota, Wisconsin, Michigan, Iowa, Illinois, Indiana, Missouri, Kentucky, Tennessee, Arkansas, Louisiana, Alabama, and Mississippi) are used to assess past changes in the frequency of heavy rainfall. Each station has a record of at least 50 yr, and the data cover most of the twentieth century and the first decade of the twenty-first century. Analyses are performed using a peaks-over-threshold approach, and, for each station, the 95th percentile is used as the threshold. Because of the count nature of the data and to account for both abrupt and slowly varying changes in the heavy rainfall distribution, a segmented regression is used to detect changepoints at unknown points in time. The presence of trends is assessed by means of a Poisson regression model to examine whether the rate of occurrence parameter is a linear function of time (by means of a logarithmic link function). The results point to increasing trends in heavy rainfall over the northern part of the study domain. Examination of the surface temperature record suggests that these increasing trends occur over the area with the largest increasing trends in temperature and, consequently, with an increase in atmospheric water vapor.
1. Introduction
Over the past few years, the Mississippi River basin has experienced several large flood events (e.g., 2008, 2010, 2011) that caused economic damage of several billion dollars and numerous fatalities (http://www.ncdc.noaa.gov/oa/reports/billionz.html). These large flood events are associated with heavy rainfall over an extended period of time. The large economic and societal repercussions of these catastrophic events underscore the need to assess whether there have been increasing trends in heavy rainfall over this region [see also the discussion in Trenberth et al. (2003)].
Previous studies have examined the presence of trends in heavy rainfall over the central United States (e.g., Karl et al. 1996; Karl and Knight 1998; Kunkel et al. 1999; Groisman et al. 2001, 2004, 2005). Heavy rainfall is generally defined using a peaks-over-threshold (POT) approach, in which the number of days exceeding a selected threshold value is counted for every year. These results are generally areally averaged over spatially homogeneous regions and a linear trend (computed using ordinary least squares or a nonparametric counterpart) is fitted to these time series. The reasoning for this approach is that by aggregating stations, it would be easier to obtain statistically significant trends because of the larger sample size (e.g., Groisman et al. 2005). The definition of a “homogeneous” region, however, is not generally provided, and stations are grouped over large areas (e.g., multiple states). The implications of aggregating stations over a large spatial domain on the trend analysis are unclear. In this study, we avoid defining homogeneous regions by working with data from individual stations.
Because we are working with rare events, the presence of a signal may be buried in the noise (e.g., Frei and Schär 2001), complicating the signal detection. It is therefore important to select appropriate statistical tools to improve our capability of detecting trends in the data, if present. When working with extremes, the results obtained by including information about the parametric distribution generating the data were found to outperform the trend estimation based on the ordinary least squares method and the Mann–Kendall test (Zhang et al. 2004). Because of the discrete nature of these records (number of days per year exceeding a threshold), Poisson regression provides the appropriate statistical framework.
Another element that is often overlooked is the presence of possible abrupt changes in the heavy rainfall distribution. These shifts, associated with climate variations (e.g., Alley et al. 2003; Swanson and Tsonis 2009), or inhomogeneities and artifacts (e.g., Peterson et al. 1998; Changnon and Kunkel 2006), can have large impacts on the trend results and should be accounted for (e.g., Peterson et al. 1998; Villarini et al. 2009; Dai et al. 2011). Is it possible that undetected abrupt changes were responsible for increasing trends in heavy rainfall over this region? In this study we model the number of heavy rainfall days using a Poisson regression model that accounts for the presence of abrupt changes in the rate of occurrence parameter of the Poisson distribution. Moreover, we do not define homogeneous regions and pool stations together, but work at the level of individual stations. These results will provide valuable information about the presence of trends in heavy rainfall not over large regions but over more localized areas.
This paper is organized as follows. In the next section we describe the rainfall data and provide some details on the statistical methods used. In section 3 we describe the results regarding changes in the frequency of heavy rainfall days over the study region. Section 4 summarizes the results and concludes this paper.
2. Data and methodology
We use daily rainfall measurements from 447 stations [obtained from the National Climatic Data Center (NCDC) surface daily data] located over the central United States (this area includes Minnesota, Wisconsin, Michigan, Iowa, Illinois, Indiana, Missouri, Kentucky, Tennessee, Arkansas, Louisiana, Alabama, and Mississippi; Fig. 1). We limit our analyses to stations with a record of at least 50 yr and ending no earlier than the year 2000. The vast majority of the stations have data up to at least 2009 (318 out of 447), and only 46 stations have data ending between 2000 and 2002. These 46 stations do not exhibit any coherent spatial pattern. Approximately 75% of the stations have between 50 and 70 yr of data and about 10% of them have at least 100 yr of data, with the longest record spanning 125 yr. A year is considered complete if less than 10% of its daily data are missing, and we allow no more than one gap in the time series (2 yr is the maximum gap length allowed).
We use a peaks-over-threshold approach (e.g., Davison and Smith 1990), in which we set a threshold and count the number of days exceeding it every year. Different thresholds have been proposed, in some cases based on absolute values (e.g., 25 mm day−1) and in other cases based on the rainfall empirical probability distribution (e.g., 95th percentile). For each station we compute the 95th percentile of the nonzero rainfall values and use this value as the threshold. Over this region, the 95th percentile varies significantly, with a clear south-to-north gradient (Fig. 1). Values range from about 50 mm day−1 (~2 in. day−1) in the coastal areas of the Gulf of Mexico to about 25 mm day−1 (~1 in. day−1) in the northern part of the study region.
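The counting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`percentile`, `annual_exceedance_counts`) are hypothetical, the percentile is computed by linear interpolation between order statistics (the text does not say which empirical estimator was used), an exceedance is taken to be a day strictly above the threshold, and missing days are assumed to have been removed beforehand.

```python
import math


def percentile(values, q):
    """Empirical q-th percentile (0-100), linear interpolation between order statistics."""
    xs = sorted(values)
    if not xs:
        raise ValueError("no data")
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def annual_exceedance_counts(daily, q=95.0):
    """Count, for each year, the days with rainfall above the station threshold.

    daily: iterable of (year, rainfall_mm) pairs for ONE station.
    The threshold is the q-th percentile of the nonzero daily values.
    Returns (threshold, {year: count}); years with no exceedance get 0.
    """
    daily = list(daily)
    wet = [rain for _, rain in daily if rain > 0]
    threshold = percentile(wet, q)
    counts = {year: 0 for year, _ in daily}
    for year, rain in daily:
        if rain > threshold:  # assumption: strict exceedance
            counts[year] += 1
    return threshold, counts
```

The resulting dictionary of yearly counts is the kind of discrete series that the Poisson models discussed next take as input.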
Because of the discrete nature of these data, we resort to Poisson regression. This is a form of a generalized linear model (GLM; e.g., McCullagh and Nelder 1989; Cameron and Trivedi 1998; Dobson 2001), in which the predictand has the form of count data and follows a Poisson distribution. We denote the counts in year i as Ni, and assume that Ni has a conditional Poisson distribution with the rate of occurrence parameter λi as shown:
$$P(N_i = k \mid \lambda_i) = \frac{e^{-\lambda_i}\,\lambda_i^{k}}{k!}, \qquad k = 0, 1, 2, \ldots$$
The rate of occurrence is potentially a random variable. We examine two forms of the model. To examine abrupt changes, λi is viewed as a random process that changes levels at random times. To examine trends, for each station we assess the presence of statistically significant increasing or decreasing trends in the frequency of heavy rainfall by fitting a Poisson model in which the logarithm of the rate of occurrence parameter depends linearly on time ti (i.e., a logarithmic link function), of the form λi = exp(β0 + β1 ti). If the estimated coefficient for time β1 is different from zero at the 5% significance level, then there is statistical evidence to support the presence of trends in the frequency of heavy rainfall (e.g., Villarini et al. 2012).
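As a rough illustration of the trend test, the sketch below fits the log-link Poisson model by Newton-Raphson and applies a two-sided Wald test to β1. It is not the authors' code. The function name `fit_poisson_trend` is hypothetical, time is centered on the mean year (which changes only the intercept), and the Wald test is an assumption, since the text does not state which test was used to compare β1 with zero.

```python
import math


def fit_poisson_trend(years, counts, max_iter=100, tol=1e-10):
    """Fit N_i ~ Poisson(exp(b0 + b1 * t_i)) by Newton-Raphson.

    t_i is the year minus the mean year, so b0 refers to centered time.
    Returns b0, b1, the standard error of b1, the Wald z statistic and
    the two-sided p-value for H0: b1 = 0 (normal approximation).
    """
    n = len(years)
    if n < 3 or sum(counts) == 0:
        raise ValueError("need at least 3 years and at least one exceedance")
    t_mean = sum(years) / n
    t = [y - t_mean for y in years]
    b0, b1 = math.log(sum(counts) / n), 0.0  # start from the constant-rate model

    def information(b0, b1):
        # Fitted rates and the entries of the 2x2 Fisher information matrix.
        mu = [math.exp(b0 + b1 * ti) for ti in t]
        i00 = sum(mu)
        i01 = sum(m * ti for m, ti in zip(mu, t))
        i11 = sum(m * ti * ti for m, ti in zip(mu, t))
        return mu, i00, i01, i11

    for _ in range(max_iter):
        mu, i00, i01, i11 = information(b0, b1)
        g0 = sum(c - m for c, m in zip(counts, mu))               # score for b0
        g1 = sum((c - m) * ti for c, m, ti in zip(counts, mu, t))  # score for b1
        det = i00 * i11 - i01 * i01
        d0 = (i11 * g0 - i01 * g1) / det
        d1 = (i00 * g1 - i01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break

    _, i00, i01, i11 = information(b0, b1)
    se_b1 = math.sqrt(i00 / (i00 * i11 - i01 * i01))  # from the inverse information
    z = b1 / se_b1
    p_value = math.erfc(abs(z) / math.sqrt(2.0))      # two-sided normal p-value
    return {"b0": b0, "b1": b1, "se_b1": se_b1, "z": z, "p_value": p_value}
```

A station would be flagged as having a trend when `p_value` is below 0.05, with the sign of `b1` giving its direction. Because of the log link, `b1` is the change in the logarithm of the expected number of heavy rainfall days per year.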
One of the drawbacks of this approach is that we do not account for the presence of possible abrupt changes in the data. These step changes could be associated with shifts from one climate regime to another (e.g., Alley et al. 2003; Swanson and Tsonis 2009) or with data inhomogeneities (e.g., rain gauge relocation, or changes in the instrument or in the environment around it; e.g., Easterling and Peterson 1995; Peterson et al. 1998). If not detected and accounted for, these artifacts can have a large impact on the outcome of studies of this kind, leading to erroneous statements about human-induced climate change (e.g., Peterson et al. 1998; Villarini et al. 2009; Dai et al. 2011).
A number of tests have been proposed and developed to assess the homogeneity of a time series, each of them based on different assumptions (e.g., single or multiple changepoints, known or unknown year of the changepoint, Gaussian-distributed data). These tests are generally developed assuming continuous distributions (e.g., Peterson et al. 1998; Reeves et al. 2007; Beaulieu et al. 2012). Our records, however, are discrete. To address this problem, we use segmented regression (Muggeo 2003) to detect changepoints at unknown points in time, together with the estimation of the regression lines. While this regression model has been used in fields such as ecology and epidemiology, it has received much less attention for hydroclimatological applications (Ferguson and Villarini 2012).
In segmented regression, the relation between the response and the predictor is piecewise linear, that is, two or more lines are connected at the changepoint(s). Moreover, we are not restricted to a single changepoint, and the estimation procedure allows inference for all of the model’s parameters, including the location of the changepoint (i.e., it provides confidence intervals of the changepoints). One of the drawbacks is that we need to provide starting values for the changepoints. For each time series, we first fit the data using locally weighted regression (loess with a span of 0.75; Cleveland 1979) to guide the choice of the initial guess (Muggeo 2003). We then perform several trials to assess the sensitivity of the results to the initial values. Following Muggeo (2003), we include only changepoints for which the 95% confidence intervals do not include the first or last years in the record. Moreover, we select the “segmented model” if it has an Akaike information criterion (AIC; Akaike 1974) value smaller than the AIC from the model without changepoints. We also restrict the presence of breakpoints, so that the smallest subperiod has a length of 10 yr. Segmented regression is performed in R (R Development Core Team 2009) using the freely available segmented package (Muggeo 2008).
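The model-selection logic can be illustrated with a simplified stand-in. The sketch below does not reproduce the iterative estimation of Muggeo (2003) or the R segmented package. It replaces it with a plain grid search over candidate changepoint years, fits a continuous piecewise-linear log-rate Poisson model at each candidate, and keeps the segmented model only if its AIC is lower than that of the model without changepoints. All names (`solve`, `poisson_glm`, `select_model`) are hypothetical. The sketch assumes consecutive, sorted years, counts the changepoint as a fourth parameter in the AIC, and omits the confidence-interval screening described above.

```python
import math


def solve(a, b):
    """Solve the small linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def poisson_glm(X, y, iters=50):
    """Newton-Raphson fit of a log-link Poisson GLM; the first column of X must be 1s.

    Returns (coefficients, maximized log-likelihood).
    """
    p = len(X[0])
    beta = [math.log(max(sum(y) / len(y), 1e-8))] + [0.0] * (p - 1)

    def rates(beta):
        return [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]

    for _ in range(iters):
        mu = rates(beta)
        grad = [sum((yi - mi) * row[j] for yi, mi, row in zip(y, mu, X)) for j in range(p)]
        info = [[sum(mi * row[j] * row[k] for mi, row in zip(mu, X)) for k in range(p)]
                for j in range(p)]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    mu = rates(beta)
    loglik = sum(yi * math.log(mi) - mi - math.lgamma(yi + 1) for yi, mi in zip(y, mu))
    return beta, loglik


def select_model(years, counts, min_len=10):
    """Compare a log-linear Poisson trend with a one-changepoint segmented trend by AIC."""
    t0 = sum(years) / len(years)
    t = [yr - t0 for yr in years]
    _, ll_lin = poisson_glm([[1.0, ti] for ti in t], counts)
    aic_lin = 2 * 2 - 2 * ll_lin
    best_aic, best_year = None, None
    # Candidate changepoints leave at least min_len years on either side.
    for j in range(min_len, len(years) - min_len):
        X = [[1.0, ti, max(ti - t[j], 0.0)] for ti in t]  # hinge term joins the two lines
        _, ll = poisson_glm(X, counts)
        aic = 2 * 4 - 2 * ll  # assumption: the changepoint counts as a parameter
        if best_aic is None or aic < best_aic:
            best_aic, best_year = aic, years[j]
    if best_aic is not None and best_aic < aic_lin:
        return {"model": "segmented", "breakpoint_year": best_year,
                "aic_linear": aic_lin, "aic_segmented": best_aic}
    return {"model": "linear", "breakpoint_year": None,
            "aic_linear": aic_lin, "aic_segmented": best_aic}
```

A grid search is slower and coarser than the estimation in the segmented package and yields no confidence interval for the changepoint, but it makes the AIC comparison and the 10-yr minimum subperiod explicit.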
3. Results
We use segmented regression to evaluate the presence of abrupt changes and trends in the frequency of occurrence of heavy rainfall events. In all records, at most one changepoint per series was detected. There are 72 stations with a changepoint. If we divide the region into South (Arkansas, Tennessee, Louisiana, Mississippi, and Alabama) and North (the remaining states), then 25 stations are located in the South and 47 in the North. In the South, 14 (6) stations have a changepoint after 1980 (1990), while 11 (6) changepoints occur after 1980 (1990) in the North. The number of stations with a changepoint prior to 1970 is 24 in the North and 2 in the South. Based on these results, abrupt changes tend to occur earlier in the North and later in the South. Because many of these stations have been moved over time, it is difficult to state whether these changes are associated with data inhomogeneity or shifts from one climate regime to another.
For the stations without changepoints (Fig. 2, left panel), statistically significant increasing trends dominate over the decreasing ones (90 vs 3), with values of β1 ranging from 0.002 to 0.028 for the positive slopes and from −0.01 to −0.007 for the negative ones. Most of the stations with increasing trends are located in the northern part of our domain (74 in the North and 16 in the South), in particular over Minnesota, Wisconsin, Iowa, Illinois, and Missouri. Because of the spatial correlation of the rainfall process, we examined whether these results are field significant using Walker’s test and the false discovery rate (Wilks 2006). In both cases, we were able to reject the null hypothesis that all site-specific null hypotheses (β1 = 0) were true, pointing to the field significance of our results.
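The two field-significance checks can be sketched as follows, taking the per-station p-values of the β1 tests as input. This is a minimal illustration of the procedures as described in Wilks (2006), not the authors' implementation. The function names are hypothetical, and the global level and the false discovery rate are both set to 0.05 as an assumption, since the text does not report the levels used.

```python
def walker_test(p_values, alpha_global=0.05):
    """Walker's test: reject the global null if the smallest local p-value
    is no larger than 1 - (1 - alpha_global)^(1/K), with K local tests."""
    k = len(p_values)
    p_walker = 1.0 - (1.0 - alpha_global) ** (1.0 / k)
    return min(p_values) <= p_walker


def fdr_rejections(p_values, q=0.05):
    """False discovery rate (Benjamini-Hochberg) procedure.

    Sort the K p-values, find the largest rank j with p_(j) <= q * j / K,
    and reject every local null up to that rank. Returns a list of booleans
    in the original station order; the global null is rejected if any is True.
    """
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    largest = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / k:
            largest = rank
    rejected = [False] * k
    for rank, i in enumerate(order, start=1):
        if rank <= largest:
            rejected[i] = True
    return rejected
```

Field significance under the FDR approach corresponds to `any(fdr_rejections(p_values))` being true. The same output also identifies which individual stations survive the multiplicity correction.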
Out of the 72 stations with a changepoint, there are statistically significant increasing trends prior to the year of the changepoint in 20 of them (Fig. 2, middle panel), and decreasing trends in 2 of them. These stations are mostly located in the southern half of the study area. If we focus on the subseries after the changepoint (Fig. 2, right panel), then 21 stations exhibit increasing trends and 9 stations decreasing trends, with a tendency toward increasing (decreasing) trends in the northern (southern) part of the domain.
In agreement with previous studies (e.g., Karl et al. 1996; Karl and Knight 1998; Kunkel et al. 1999; Groisman et al. 2001, 2004, 2005), these results point to increasing trends in the frequency of heavy rainfall over large areas of the study region, in particular over the northern part [see also Kunkel et al. (1999)].
4. Discussion and conclusions
In this study we have examined changes in the frequency of heavy rainfall over the central United States (Minnesota, Wisconsin, Michigan, Iowa, Illinois, Indiana, Missouri, Kentucky, Tennessee, Arkansas, Louisiana, Alabama, and Mississippi). We used daily rainfall measurements from 447 stations with a record of at least 50 yr and ending no earlier than the year 2000.
We used a peaks-over-threshold (POT) approach and set the 95th percentile of the nonzero rainfall as the threshold for each station. Because of the discrete nature of the POT data, we used segmented regression to detect changepoints at unknown points in time and a Poisson model to test the data for the presence of trends in the rate of occurrence parameter. There are 72 stations with a changepoint, with a tendency of the changepoints to occur earlier (later) in the twentieth century in the northern (southern) part of the domain. By focusing on the stations without changepoints, we found that 93 of them showed statistically significant trends, the vast majority of which were increasing (90 vs 3). Stations with increasing trends tend to be located in the northern part of the domain, in particular over Missouri, Illinois, Iowa, Minnesota, and Wisconsin.
These results point to increasing trends in the frequency of heavy rainfall over large areas of the study region, in particular over the northern part. Villarini et al. (2011) examined changes in the annual maximum daily rainfall distribution over the midwestern United States during the twentieth century and found a slight tendency toward increasing trends. A possible interpretation of the findings of these studies is that the largest observed changes are not in the magnitude of the largest 1-day accumulation but in the number of heavy rainfall days. Simply put, storms may have become wetter, with less of a change in the wettest storms.
We have examined the temperature record to explain these results in light of temperature changes. Based on thermodynamic considerations, the saturation water vapor pressure increases roughly exponentially with increasing temperature. Therefore, there is more water vapor available for precipitating systems with increasing temperature. Allan and Soden (2008) examined the link between tropical precipitation and temperature, showing that heavy rainfall events increased (decreased) during warm (cold) periods [see also Allen and Ingram (2002); Vecchi and Soden (2007)]. Based on these considerations, we would expect larger increasing trends in temperature in the northern than in the southern part of the domain. We explore the link between temperature and heavy rainfall using three gridded temperature datasets (Fig. 3): National Aeronautics and Space Administration Goddard Institute for Space Studies (NASA GISS) (left panel; Hansen et al. 2012); the University of East Anglia–Met Office Hadley Centre Climate Research Unit temperature, version 4 (HadCRUT4) (middle panel; Morice et al. 2012); and the National Oceanic and Atmospheric Administration Merged Land–Ocean Surface Temperature Analysis (NOAA MLOST) (right panel; Smith and Reynolds 2005; Smith et al. 2008). Because extreme rainfall is concentrated during the March–October period (Villarini et al. 2011), we focus on trends in temperature averaged over these months (Fig. 3) during the period 1901–2011. Regardless of the temperature dataset used, the northern half of the domain, which corresponds to the area with the largest concentration of increasing trends in heavy rainfall, is the area that warms the most. The southern half, in contrast, has been warming at a much slower rate and, in places, has even been cooling. This is the region with the fewest increasing trends in heavy rainfall.
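The "roughly exponential" dependence of saturation vapor pressure on temperature mentioned above follows from the Clausius-Clapeyron relation. The worked estimate below is only an order-of-magnitude illustration. The symbols and numerical values (latent heat of vaporization L_v, gas constant for water vapor R_v, and a reference temperature of 288 K) are standard textbook approximations introduced here, not values taken from this study.

```latex
\frac{d \ln e_s}{dT} = \frac{L_v}{R_v T^{2}}
\approx \frac{2.5 \times 10^{6}\ \mathrm{J\,kg^{-1}}}
             {\left(461.5\ \mathrm{J\,kg^{-1}\,K^{-1}}\right)\left(288\ \mathrm{K}\right)^{2}}
\approx 0.065\ \mathrm{K^{-1}}
```

Under these assumptions the saturation vapor pressure e_s rises by roughly 6%-7% per kelvin of warming near typical surface temperatures, which is why stronger warming is expected to go along with more water vapor available to precipitating systems.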
The heterogeneous nature of central United States surface temperature trends, or “warming hole,” has been noted previously, yet its causes remain to be fully understood (e.g., Kunkel et al. 2006; Portmann et al. 2009; Goldstein et al. 2009). Based on these results and our understanding of the physical processes at play, it is reasonable to state that the observed increasing trends in heavy rainfall in the northern part of the domain are related to the observed increases in temperature. These results also highlight the need to better understand regional temperature trends.
It is also worth pointing out that DeAngelis et al. (2010) examined the effects of the increase in irrigation over the Ogallala Aquifer (Great Plains of the United States) as a possible mechanism for the increase in regional precipitation. They found an abrupt increase in rainfall in 1947 for the month of July over a region roughly including the central part of our domain (no other region/month showed statistically significant results). While it is not straightforward to relate the results in DeAngelis et al. (2010) to ours (different variables, regions, spatial and temporal scales, study period), it is likely that changes in land use/land cover and agricultural practice over this part of the United States are also going to play a role in increasing the amount of water vapor in the atmosphere.
This research was funded by the Willis Research Network and by NASA GPM. The authors thank Dr. Muggeo for making the segmented package (Muggeo 2008) freely available in R (R Development Core Team 2008), and Dr. Dai (editor) and two anonymous reviewers for their useful comments on a previous version of the manuscript.
Current affiliation: IIHR—Hydroscience & Engineering, University of Iowa, Iowa City, Iowa.