The spatial extent of an extreme precipitation event can be important for a basin’s hydrologic response and subsequent flood risk, and may yield insights into underlying atmospheric processes. Using a relaxed moving-neighborhood approach, we develop indicator semivariograms based on precipitation records from the Global Historical Climatology Network–Daily (GHCN-D) station network to directly quantify the climatological length scales of extreme daily precipitation over the United States during 1965–2014. We find that the length scales of extreme (90th percentile) daily precipitation events vary both regionally and seasonally. Over the eastern half of the United States, daily extreme precipitation length scales reach 400 km during the winter months, but are approximately half as large during the summer months. The Northwest region, on the other hand, exhibits little seasonal variation, with extreme precipitation length scales of approximately 150 km throughout the year. By leveraging in situ station measurements, our study avoids some of the uncertainties associated with satellite or interpolated precipitation data, and provides the longest climatological assessment of length scales of extreme daily precipitation over the United States to date. Although the length scales that we calculate can be sensitive to station density, neighborhood size, and neighborhood relaxation, we find that the interregional and interseasonal differences in length scales are relatively robust. Our method could be extended to quantify changes in the spatial extent of extreme daily precipitation in the recent past, and to investigate the underlying causes of any changes that are detected.
Extreme precipitation events tend to cause high surface runoff, and can subsequently lead to flooding that is capable of inflicting extensive economic, societal, and ecological damages. In 2017 alone, the United States experienced four major flooding events from extreme precipitation, resulting in 211 deaths and over $180 billion (U.S. dollars) in damages (NOAA/NCEI 2017). Typically, extreme precipitation events are of relatively short duration (lasting from a few hours to a few days) but produce damages comparable to those resulting from long-duration droughts (which often last a year or more). Additionally, recovery can be slow and costly, as evidenced by recent examples such as the 2013 Boulder floods (Gochis et al. 2015) and the 2016 Louisiana floods (van der Wiel et al. 2017). Understanding the spatial characteristics of these extreme precipitation events is important not only because of their considerable societal impacts, but also because these characteristics have been changing in recent decades as atmospheric temperatures increase (O’Gorman 2015).
The frequency and intensity of extreme wet events have been increasing across much of the globe (IPCC 2012), including over much of the United States during the past half century (DeGaetano 2009; Pryor et al. 2009; Gleason et al. 2008). In some cases, these changes in the frequency and/or intensity have been attributed to increasing greenhouse gas concentrations (Min et al. 2013; Diffenbaugh et al. 2017; Cohen et al. 2014). The observed increase in the intensity of extreme wet events has been partly attributed to the thermodynamic contribution of increasing temperatures, and in some cases partly attributed to changes in atmospheric circulation (Trenberth et al. 2003; Trenberth 2011; O’Gorman 2015; Diffenbaugh et al. 2017). Accordingly, as atmospheric temperatures continue to rise, the frequency and intensity of extreme wet events are expected to change further in the future, although such changes are likely to manifest nonuniformly in space and time (Diffenbaugh et al. 2005; Gleason et al. 2008; Singh et al. 2013; Trapp et al. 2007).
Studies have also started to investigate the effect of rising temperatures on the spatial characteristics of precipitation. For example, Wasko et al. (2016) found that increasing temperatures reduce the spatial extent of extreme rainstorms in Australia. Similarly, Chang et al. (2016) and Guinard et al. (2015) found that higher temperatures reduce the spatial extent of rainstorms over the United States. However, other studies, such as that of Dwyer and O’Gorman (2017), found inconclusive results when assessing the change in zonal length scale of precipitation events in the tropics. Although these studies (and others) have already begun to quantify changes in the spatial extent of precipitation in response to increasing temperatures, a robust assessment of the long-term historical climatology of the spatial extent of extreme precipitation has not yet been developed.
A key challenge in developing such a climatological assessment is the brevity and uncertainty of pixelated or gridded datasets, including those derived from ground-based weather radar, space-based satellite precipitation records, and statistically (e.g., Daymet; Thornton et al. 1997) or dynamically (e.g., reanalysis) interpolated datasets. Decadal variability of the climate system can induce apparent (but spurious) long-term trends (Deser et al. 2014, 2012; Thompson et al. 2015; Li et al. 2017; Endo et al. 2017; Hawkins et al. 2016)—and/or mask trends arising from changes in radiative forcing—in short (~30 yr) radar and/or satellite-derived observations (Schneider et al. 2013). These assessments can be further affected by the reported overestimation of the frequency of extreme precipitation in satellite data when compared to ground and radar measurements (Mehran and Aghakouchak 2014; Aghakouchak et al. 2011). Although station-based interpolated datasets can provide longer periods of record, they consistently underestimate extreme precipitation, especially over regions with sparser ground measurements (Sun and Barros 2010; Behnke et al. 2016).
Although measurement uncertainties in precipitation gauge data have been reported in multiple studies (Rasmussen et al. 2012; Sieck et al. 2007; Tokay et al. 2010; Liu et al. 2013), assessing the characteristics of extreme precipitation using in situ data directly overcomes the need for interpolation methods to estimate extreme precipitation. Additionally, station datasets provide a much longer period of record than is available from satellites or radar, allowing robust assessments of long-term trends in the characteristics of extreme precipitation.
Our objective in the present study is to infer the spatial extent of extreme daily precipitation using station data, thereby overcoming the limitations of gridded datasets. Given that we cannot assume that precipitation varies smoothly between adjacent stations, we quantify the length scale of extreme daily precipitation (rather than the size of continuous objects). Our framework, which we describe below, employs indicator semivariograms to infer the length scale of extreme daily precipitation using the Global Historical Climatology Network–Daily (GHCN-D) dataset of U.S. stations (Menne et al. 2012). We use our approach to identify regional and seasonal variations in the climatological length scales for the 1965–2014 period. Additionally, we assess the sensitivity of those variations to our methodological choices.
We use the GHCN-D station dataset, which contains daily precipitation records since 1861. To balance the need for a set of stations that has continuous data availability with the need for a period of record that is sufficiently long for climatological assessment, we focus our analysis on the period from 1965 to 2014. Over North America, 4512 stations in the GHCN-D dataset have (partially intermittent) records that start by 1965 and continue until or past 2014. These 4512 stations are shown in Fig. 1a. Across all stations, more than half of the observations used are based on 24-h accumulation periods recorded between 0700 and 0800 local time (LT), while 15% are recorded at 1800 LT, and approximately a tenth are recorded at midnight LT.
We first find all daily precipitation values from 1965 to 2014 that are greater than or equal to 1 mm day−1 for each station and each calendar month. We then define the 90th percentile value of these precipitation values as the respective extreme precipitation threshold for each station and each calendar month. We summarize these 90th percentile values in 3-month (or seasonal) averages in Fig. 2, using December–February (DJF) for winter, March–May (MAM) for spring, June–August (JJA) for summer, and September–November (SON) for autumn.
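As an illustration of this thresholding step, the following minimal sketch computes wet-day 90th percentile thresholds for a single station and each calendar month. The function name, the input layout (one precipitation value and one month number per day), and the use of NumPy's default linear-interpolation percentile are assumptions made for illustration; the percentile estimator used in the study is not specified in the text.

```python
import numpy as np

def monthly_p90_thresholds(precip_mm, months, wet_day_min=1.0, q=90):
    """Return {calendar month: wet-day percentile threshold} for one station.

    precip_mm : daily totals in mm/day (NaN where no observation exists)
    months    : calendar month (1-12) of each daily value
    """
    precip_mm = np.asarray(precip_mm, dtype=float)
    months = np.asarray(months)
    thresholds = {}
    for m in range(1, 13):
        # Keep only wet days (>= 1 mm/day) in month m; NaN fails the comparison.
        wet = precip_mm[(months == m) & (precip_mm >= wet_day_min)]
        # Assumption: NumPy's default (linear interpolation) percentile estimator.
        thresholds[m] = float(np.percentile(wet, q)) if wet.size else float("nan")
    return thresholds
```

For example, a station whose January wet-day totals are 1, 2, ..., 11 mm would receive a January threshold of 10 mm under this estimator, and months without any wet days are returned as NaN.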
Using this percentile threshold value for each month and each station, we create a binary, or indicator, extreme event dataset (hereafter “p90”) for the whole period of 1965–2014, where stations that have a daily precipitation value equal to or above the corresponding monthly 90th percentile threshold are set equal to 1, and stations that have a daily precipitation value less than that threshold are set equal to 0. Stations that have no recorded data on a given day are eliminated from the binary dataset for that day.
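The construction of the binary p90 dataset can be sketched as follows for one station. The names and data layout are hypothetical, and missing observations are represented here by NaN so that they can be dropped on a per-day basis later; this is one possible encoding, not necessarily the one used in the study.

```python
import numpy as np

def indicator_p90(precip_mm, months, thresholds):
    """Convert one station's daily precipitation to a 1/0/NaN indicator series.

    thresholds : {calendar month: 90th percentile threshold in mm/day}
    """
    precip_mm = np.asarray(precip_mm, dtype=float)
    # Look up the threshold of the calendar month that each day falls in.
    thr = np.array([thresholds[int(m)] for m in months], dtype=float)
    # 1 where the daily value is equal to or above the threshold, else 0.
    indicator = np.where(precip_mm >= thr, 1.0, 0.0)
    # Days without a recorded value are excluded (kept as NaN), not set to 0.
    indicator[np.isnan(precip_mm)] = np.nan
    return indicator
```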
a. Overview of methodology
We use indicator semivariograms to quantify the length scale of extreme precipitation using our binary p90 dataset. Indicator semivariograms have been used in previous studies to spatially interpolate the probability of the presence of a threshold variable using indicator kriging (e.g., Berezowski et al. 2016; Goovaerts et al. 2016; Haberlandt 2007). However, in our study, we use indicator semivariogram methods to directly compute the length scales of extreme precipitation, without kriging the binary variable. The indicator semivariogram quantifies half the squared difference between the indicator values of two data points as a function of the separation distance between those points. By calculating the indicator semivariogram for multiple pairs of stations (as a function of their separation distance) in a given region and time, we can find the distance beyond which p90 occurrences at pairs of stations are no longer substantially correlated in that region and time period. We use this distance to quantify the length scale of p90 precipitation for that given region and time.
As shown in previous studies using radar, satellite, and climate model data, length scales of precipitation can exhibit substantial regional variation over the United States (Guinard et al. 2015; Dwyer and O’Gorman 2017; Chang et al. 2016; Kursinski and Mullen 2008). To ensure that we capture the nonstationarity of spatial characteristics, we directly address regional differences in length scales by using a moving neighborhood approach similar to that developed by Haas (1990). Haas (1990) estimated local semivariogram parameters for sulfate deposition data by using neighborhoods of stations centered around locations of interest in order to provide kriging estimates, and found that this method provides a more accurate representation of the semivariogram structure than when using global semivariogram parameters. The moving neighborhood approach has also been used extensively to estimate local semivariogram parameter sets for spatial prediction (e.g., Lloyd 2005, 2010; Tadić et al. 2015), resulting in decreased prediction error relative to the use of a global semivariogram parameter set.
However, when using a moving neighborhood approach, the number of observations within a neighborhood or the size of the neighborhood must first be established. While most studies use a single neighborhood size (or density) to estimate local semivariogram parameters based on well-known physical traits of the variable of interest (e.g., Alkhaled et al. 2008; Hammerling et al. 2012; Tadić et al. 2015), other studies quantify the effect of the number of observations within a neighborhood on the errors of estimation. More specifically, Lloyd (2005) found that a larger number of observations within a neighborhood produced smaller errors for monthly precipitation estimation over the United Kingdom. Likewise, Oyler et al. (2015) found that minimum and maximum temperatures can be locally estimated with little error over the United States using ~80 stations per neighborhood. In our study, we test the sensitivity of our results to the number of observations in each neighborhood, and also to the size of the neighborhood.
Similar to Alkhaled et al. (2008) and Hammerling et al. (2012), we also “relax” the boundaries of a neighborhood in order to account for larger scales of variability, and to prevent unnatural cutoffs of extreme precipitation areas when calculating the spatial extent of extreme precipitation over a certain location. We relax the neighborhood by allowing stations outside of the boundary to be included as one member of a station pair in our semivariogram calculation. We also quantify the effect of this neighborhood relaxation on our calculation of the spatial extent, and assess whether a relaxed neighborhood reduces the sensitivity to different neighborhood sizes.
The theory and calculation of indicator semivariograms are described in detail by Goovaerts (1997) and have been applied in many studies to understand the spatial structure of environmental variables (e.g., Berezowski et al. 2016; Goovaerts et al. 2016; Haberlandt 2007). In the following subsections, we first describe the methods for selecting the data used in the calculation of indicator semivariograms (including the moving neighborhood method) so that we capture any seasonal and regional variations in p90 length scales. We then detail the use of raw, experimental, and theoretical semivariograms to quantify p90 precipitation length scales.
b. Seasonal and regional considerations
To define the length scale of extreme precipitation for a station, we first delineate a 500-km-radius neighborhood around that station. Using a 500-km neighborhood allows us to capture the spatial heterogeneity of extreme daily precipitation over different regions. We then ensure that there are at least 20 stations within that 500-km neighborhood. (A total of eight stations are eliminated because their 500-km neighborhoods encompass fewer than 20 stations.)
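A neighborhood of this kind could be delineated as in the sketch below. The great-circle (haversine) distance on a sphere of radius 6371 km is an assumption, since the distance metric is not stated in the text, and counting the central station toward the 20-station minimum is likewise an illustrative choice.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def neighborhood_mask(center_idx, lats, lons, radius_km=500.0):
    """Boolean mask of stations within radius_km of the central station."""
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    dist = haversine_km(lats[center_idx], lons[center_idx], lats, lons)
    return dist <= radius_km

def has_enough_stations(mask, min_stations=20):
    """True if the neighborhood holds at least min_stations stations."""
    return int(np.sum(mask)) >= min_stations
```

A central station whose mask fails `has_enough_stations` would be dropped from the analysis, as is the case for the eight stations mentioned above.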
Because we are interested in the climatological length scale of p90 precipitation for each season, we calculate the raw semivariogram for the days in which at least 10% of the stations in an individual neighborhood show a p90 event in a season (hereafter, “p90 days”). Given that our threshold for an extreme event is the 90th percentile, this 10% restriction ensures that our analysis only encompasses days that exceed the number of events expected to occur at random within each neighborhood. Because of spatial correlations in our p90 dataset, approximately 2%–12% of days in each season over the whole period (1965–2014) are selected as p90 days (see Fig. S1 in the online supplemental material). JJA shows the largest percentage of days eliminated by this criterion, especially in the eastern half of the United States. Interestingly, for neighborhoods on the Pacific coast, fewer than 50% of days exhibit at least one station with p90 precipitation (Fig. S1).
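The selection of p90 days can be sketched as follows. The array layout (days by stations, with NaN for missing observations) and the choice to compute the fraction among stations that report on a given day are assumptions made for illustration.

```python
import numpy as np

def select_p90_days(indicator, in_neighborhood, min_fraction=0.10):
    """Flag days on which at least min_fraction of neighborhood stations show p90.

    indicator       : array (n_days, n_stations) of 1, 0, or NaN (missing)
    in_neighborhood : boolean array (n_stations,)
    """
    sub = np.asarray(indicator, dtype=float)[:, np.asarray(in_neighborhood, dtype=bool)]
    n_reporting = np.sum(~np.isnan(sub), axis=1)   # stations with data that day
    n_events = np.nansum(sub, axis=1)              # stations at or above p90
    # Assumption: the fraction is taken over reporting stations only.
    fraction = np.where(n_reporting > 0, n_events / np.maximum(n_reporting, 1), 0.0)
    return fraction >= min_fraction
```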
For raw semivariogram calculations, we only use pairs of stations that include at least one station with a binary value of 1 on a given day. By using only pairs of stations that both meet the threshold (hereafter, “1–1 pairs”; red–red pairs in Fig. 3a) or for which one meets the threshold and one does not (hereafter, “1–0 pairs”; red–black in Fig. 3a), we ensure that pairs of two nonthreshold stations (hereafter, “0–0 pairs”; black–black in Fig. 3a) are not included in our characterization of the length scale of p90 daily precipitation. To implement our relaxation technique, we also use 1–1 and 1–0 pairs that are partly in the neighborhood (i.e., with one station inside the neighborhood and the other station outside the neighborhood; red–pink, red–gray, and black–pink pairs in Fig. 3a) to capture length scales that are representative of the neighborhood but that may be longer than the neighborhood dimensions. Pairs of stations that fall completely outside the neighborhood are excluded (pink–pink, gray–gray, and pink–gray pairs in Fig. 3a). All raw semivariogram calculations from all pairs on all p90 days are weighted equally.
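The pair-selection rules described above can be expressed compactly, as in the following sketch for a single day. The function name and inputs are hypothetical; setting `relaxed=False` reproduces the restricted case in which both stations must lie inside the neighborhood.

```python
import itertools
import math

def select_pairs(indicator_day, in_neighborhood, relaxed=True):
    """Return index pairs (i, j) used for the raw semivariogram on one day.

    indicator_day   : sequence of 1, 0, or NaN (missing) per station
    in_neighborhood : sequence of booleans per station
    """
    pairs = []
    # With relaxation, one station inside the neighborhood is enough.
    required_inside = 1 if relaxed else 2
    for i, j in itertools.combinations(range(len(indicator_day)), 2):
        zi, zj = indicator_day[i], indicator_day[j]
        if math.isnan(zi) or math.isnan(zj):
            continue  # stations without data are dropped for that day
        if zi == 0 and zj == 0:
            continue  # 0-0 pairs are excluded
        n_inside = int(bool(in_neighborhood[i])) + int(bool(in_neighborhood[j]))
        if n_inside >= required_inside:
            pairs.append((i, j))
    return pairs
```

With two stations inside the neighborhood (one at 1, one at 0) and two outside (one at 1, one at 0), the relaxed rule keeps the inside pair plus the three mixed inside and outside pairs, while the outside-only pair and the 0–0 pair are discarded.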
To assess regional variations in the length scales, we summarize our results using nine U.S. regions, as defined by the National Centers for Environmental Information (NCEI) (Karl and Koss 1984). The NCEI regions are shown in Fig. 1a. (Note that there are several stations in Mexico and Canada that are included in our single-station analysis but that are not included in any of the regional summaries.) To assess the statistical significance of regional differences in length scales within each season, we use the Mann–Whitney U test. To assess the statistical significance of the seasonal variations in each region, we use the paired Wilcoxon signed-rank test, which is a nonparametric paired difference test (McDonald 2009). We report the p values of the tests, where the p values have been adjusted using the Bonferroni method, which penalizes p values when conducting multiple tests using the same sample (Wright 1992); that is, 6 interseasonal comparisons per region (all pairs of the four seasons) and 36 interregional comparisons per season (all pairs of the nine regions).
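The number of comparisons entering the Bonferroni adjustment follows from counting unordered pairs of the four seasons and of the nine regions. The short sketch below shows this arithmetic together with the standard Bonferroni rule of multiplying each p value by the number of tests and capping the result at 1; the function name is hypothetical and the example p values are illustrative only, not results from this study.

```python
from math import comb

def bonferroni_adjust(p_values, n_tests):
    """Standard Bonferroni adjustment: scale by the number of tests, cap at 1."""
    return [min(1.0, p * n_tests) for p in p_values]

# Pairs of seasons compared within a region, and pairs of regions within a season.
n_season_pairs = comb(4, 2)   # 6
n_region_pairs = comb(9, 2)   # 36

# Illustrative p values only.
adjusted = bonferroni_adjust([0.001, 0.05], n_region_pairs)
```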
c. Raw and experimental semivariograms
On any given p90 day, a raw semivariogram value can be calculated for each selected pair of stations in the 500-km relaxed moving neighborhood around a station, for separation distances of up to 500 km. The raw semivariogram is calculated as shown in Eq. (1) and in Fig. 3b:

$$\gamma(h) = \frac{1}{2}\left[Z(x) - Z(x + h)\right]^{2}, \qquad (1)$$
where h is the distance between two stations, Z(x) is the p90 event value at a station at location x, and Z(x + h) is the p90 event value at a station that is at distance h away from x. Because the p90 event value, Z(x), is equal to 1 when a station’s precipitation exceeds or equals the p90 threshold and is equal to 0 when it falls below the threshold, the semivariogram γ(h) has only two possible outcomes: 0 when two stations have the same Z(x) value on a given day, and 0.5 when two stations have different Z(x) values on the same day. Because we are only using pairs of stations that have at least one Z(x) value of 1 (i.e., 1–1 or 1–0 pairs, or red–red, red–black, red–pink, red–gray, and black–pink pairs in Fig. 3a), all raw semivariogram values of 0 are calculated from 1–1 pairs and all values of 0.5 are calculated from 1–0 pairs.
Raw semivariogram values are calculated for p90 days in each season and for each station, and then grouped into 20 equally spaced intervals i of separation distance h (e.g., SON Station 599; Fig. 3b). We then calculate the experimental semivariogram by taking the mean of the raw semivariogram values in each interval i (black circles in Fig. 3c). Because raw indicator semivariogram values are either 0 (1–1 pairs) or 0.5 (1–0 pairs), this mean reduces to a simple ratio involving the number of 1–1 pairs (N1–1,i) and the number of 1–0 pairs (N1–0,i) of stations in each interval:

$$\hat{\gamma}(h_i) = \frac{0.5\,N_{1\text{-}0,i}}{N_{1\text{-}1,i} + N_{1\text{-}0,i}}. \qquad (2)$$
The experimental semivariogram (Fig. 3c; black circles) increases with greater separation distance and eventually asymptotes at a value close to 0.5. The shape of the relationship shows that stations farther apart are less likely to have p90 precipitation on the same day, meaning that there are more 1–1 pairs at smaller separation distances, while the number of 1–0 pairs mostly increases with larger distances, as expected.
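The binning described above can be sketched as follows, assuming that the separation distance and the two indicator values are already available for every selected station pair. The function and argument names are hypothetical. Each raw value is half the squared indicator difference (0 for 1–1 pairs, 0.5 for 1–0 pairs), and the experimental value in each interval is the mean of the raw values that fall in it.

```python
import numpy as np

def experimental_semivariogram(dist_km, z_a, z_b, max_dist_km=500.0, n_bins=20):
    """Bin raw indicator semivariogram values into equally spaced distance intervals.

    Returns (bin centers in km, mean raw value per bin, number of pairs per bin).
    """
    dist_km = np.asarray(dist_km, dtype=float)
    # Raw semivariogram: half the squared difference of the indicator values.
    raw = 0.5 * (np.asarray(z_a, dtype=float) - np.asarray(z_b, dtype=float)) ** 2
    keep = dist_km <= max_dist_km
    dist_km, raw = dist_km[keep], raw[keep]

    edges = np.linspace(0.0, max_dist_km, n_bins + 1)
    # Bin index 0..n_bins-1; a pair exactly at max_dist_km goes into the last bin.
    bin_idx = np.clip(np.digitize(dist_km, edges) - 1, 0, n_bins - 1)

    centers = 0.5 * (edges[:-1] + edges[1:])
    gamma = np.full(n_bins, np.nan)
    counts = np.zeros(n_bins, dtype=int)
    for i in range(n_bins):
        in_bin = bin_idx == i
        counts[i] = int(in_bin.sum())
        if counts[i] > 0:
            # Equals 0.5 * N_10 / (N_11 + N_10) for indicator data.
            gamma[i] = raw[in_bin].mean()
    return centers, gamma, counts
```

Intervals that contain no pairs are returned as NaN so that they can be skipped when a model is fitted.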
d. Length scale of extreme precipitation calculated from fitted semivariograms
To characterize the relationship between the experimental semivariogram and the distance h between station pairs we fit a theoretical semivariogram to the experimental semivariogram. Various models of semivariograms are commonly used in the geostatistical literature (Chiles and Delfiner 2012). Following visual inspection of the experimental semivariograms and recommendations from previous studies (e.g., Berndt et al. 2014; Western et al. 1998), we fit the exponential model for p90 experimental semivariograms as defined in Eq. (3):

$$\gamma(h) = c + b\left[1 - \exp\left(-\frac{3h}{\alpha}\right)\right], \qquad (3)$$
where c is the nugget, b is the partial sill, and α is the practical range of the semivariogram. The nugget c reflects measurement errors or microscale variability. The partial sill b is the amount by which the semivariogram rises above the nugget, so that the sill c + b is the asymptotic value of the exponential semivariogram at large separation distances. In this study, we focus on the practical range α, the distance at which the exponential model reaches approximately 95% of its sill, with which we represent the length scale of p90 daily precipitation (Goovaerts 1997; Haberlandt 2007; Fig. 3c). To fit exponential semivariograms to the experimental semivariograms calculated for each neighborhood during each season, we use the variofit function from the geoR package (Ribeiro and Diggle 2016) in R (R Core Team 2015). By using this function, we can automatically fit exponential semivariograms to the experimental semivariograms from all seasons and all neighborhoods.
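For readers who want to experiment with the fitting step outside of R, the sketch below fits an exponential model by ordinary least squares. It is not a reimplementation of variofit: it assumes the practical-range form c + b[1 − exp(−3h/α)], searches α over a user-supplied grid, and solves for c and b by linear least squares at each candidate α, without weighting by the number of pairs and without constraining c and b to be nonnegative. All names are hypothetical.

```python
import numpy as np

def exponential_model(h, c, b, alpha):
    """Exponential semivariogram with nugget c, partial sill b, practical range alpha."""
    return c + b * (1.0 - np.exp(-3.0 * np.asarray(h, dtype=float) / alpha))

def fit_exponential(h, gamma, alpha_grid):
    """Grid search over alpha; for each alpha, (c, b) come from linear least squares."""
    h = np.asarray(h, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    ok = ~np.isnan(gamma)          # skip empty distance intervals
    h, gamma = h[ok], gamma[ok]

    best = None
    for alpha in alpha_grid:
        # For a fixed alpha the model is linear in c and b.
        design = np.column_stack([np.ones_like(h), 1.0 - np.exp(-3.0 * h / alpha)])
        coef, *_ = np.linalg.lstsq(design, gamma, rcond=None)
        sse = float(np.sum((design @ coef - gamma) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(coef[0]), float(coef[1]), float(alpha))
    _, c, b, alpha = best
    return c, b, alpha
```

Under this form, the model evaluated at h = α lies at c + b(1 − e^−3), or about 95% of the way from the nugget to the sill, which is why α can be read directly as a length scale.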
Before presenting the results for the climatological length scale of extreme precipitation, we assess the results for two selected days for one station in Colorado, with the aim of assessing the implications of the relaxed moving neighborhood. For 10 September 2002, the calculated length scale was 242 km. For 10 September 2013, the calculated length scale was 73 km. The 2013 date indeed had much smaller clusters of stations with extreme precipitation, covering a smaller area, while the examined date in 2002 had a larger cluster of stations with extreme precipitation (see Fig. S2 for more details). The large extreme rainfall cluster centered over Illinois, Missouri, and Iowa in 2013 shows the importance of constraining the raw semivariogram calculations to pairs of stations with at least one station within a neighborhood, in order to eliminate any influences from distinct precipitation areas. In contrast, 2002 shows the importance of relaxing the neighborhood to include stations exhibiting p90 precipitation outside of the edge of the neighborhood, and to prevent an arbitrary, unnatural cutoff from the p90 stations located inside the neighborhood (e.g., see the northern edge of the neighborhood in Fig. S2a).
e. Sensitivity tests
We test the sensitivity of our results to a number of our methodological choices. First, to test the impact of the “relaxed” method, we recalculate the semivariograms using a restricted neighborhood, i.e., only using 1–1 and 1–0 pairs of stations within the neighborhood (red–red and red–black pairs in Fig. 3a). Second, to understand the effect of the neighborhood size, we recalculate the semivariograms using a 300-km neighborhood radius and a 700-km neighborhood radius, for both the relaxed and restricted neighborhood method. Last, we test the sensitivity of the number of stations per 500-km neighborhood by recalculating the semivariogram using a uniform number of 100, 200, and 300 stations within each neighborhood. For stations with neighborhoods that encompass greater than 100, 200, or 300 stations, we randomly subselect 100, 200, or 300 stations without replacement. We do not assess the sensitivity of the length scales for stations that originally had fewer than 100, 200, or 300 stations in their neighborhood. When selecting pairs of stations for calculating the raw semivariogram, stations outside the neighborhood remain intact. As a reference, 1%, 7%, and 30% of stations have fewer than 100, 200, and 300 stations, respectively, in their 500-km neighborhood.
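The uniform-density test can be sketched as a random draw without replacement from the stations inside a neighborhood, as below. The function name is hypothetical, and returning None for neighborhoods with too few stations is an illustrative way of marking them as not assessed. Stations outside the neighborhood are not touched by this step.

```python
import numpy as np

def subsample_neighborhood(inside_idx, n_keep, rng):
    """Randomly keep n_keep station indices from a neighborhood, without replacement.

    Returns None when the neighborhood originally holds fewer than n_keep stations.
    """
    inside_idx = np.asarray(inside_idx)
    if inside_idx.size < n_keep:
        return None  # sensitivity is not assessed for this neighborhood
    if inside_idx.size == n_keep:
        return np.sort(inside_idx)
    return np.sort(rng.choice(inside_idx, size=n_keep, replace=False))

# Example: thin a hypothetical 250-station neighborhood to 100 stations.
rng = np.random.default_rng(0)
kept = subsample_neighborhood(np.arange(250), 100, rng)
```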
a. Seasonal length scale of p90 precipitation
We calculate the seasonal climatology of the length scale extracted from the fitted semivariograms for extreme precipitation across the United States using a 500-km relaxed neighborhood (Fig. 4a). We find that the winter months (DJF) generally have the longest length scales of extreme precipitation and the summer months (JJA) generally have the shortest (Fig. 4a). Additionally, the longest length scales generally occur in the Central, East North Central, Northeast, South, and West regions, while the smallest occur across the Northwest, Southwest, and West North Central regions. We find that the Central region has the longest median length scale in DJF, MAM, and SON, while the West region has the longest median length scale in JJA (Figs. 4 and 5a).
Variation in length scales between regions is highest during winter, and subtler during summer (Fig. 5a), although the differences in length scales between regions are statistically significant in most cases (p < 0.01). In DJF, the largest regional differences are between the Northwest region and the South, Central, and East North Central regions (differences of >135 km in each case) (Figs. 4b and 5a). In contrast, the largest JJA variations occur between the West region (median length scale ~220 km) and the Southwest region (median length scale of ~130 km) (Figs. 4b and 5a).
In addition to these regional differences in median length scale, we find that within-region spatial heterogeneity of length scales also varies across regions and seasons. For example, in the Northeast, the DJF interquartile range (IQR; ~90 km) is triple the JJA IQR (~30 km). In contrast, the West exhibits an IQR of ~50–60 km in all seasons (Fig. 4b).
Although the magnitude of seasonal variations in climatological length scales differs across regions, the differences between seasonal length scales over each region are generally statistically significant (p < 0.001; Fig. 5b) and are plausibly related to underlying differences in the seasonally and regionally varying atmospheric processes that generate precipitation extremes. In most regions, JJA median length scales are significantly smaller than the median length scales in other seasons (p < 0.001), and DJF median length scales are significantly larger (p < 0.001). (In the Northwest, MAM median length scales are smaller than the median length scales in other seasons, and SON median length scales are larger.) The difference between JJA and DJF median length scales is largest in the South (difference of >135 km), and the differences between JJA and SON length scales are largest (differences of >90 km in each case) over the East North Central, South, and West North Central regions (Figs. 4b and 5b). Although there is still a clear seasonal cycle in the Northwest and Southwest (Fig. 4b), the magnitude of the variations among median seasonal length scales remains below 45 km (Fig. 5b).
b. Sensitivity of p90 length scales to neighborhood size, relaxation, and station density
While the magnitude of length scales is sensitive to neighborhood size, relaxation, and station density, the overall patterns of regional and seasonal variations are relatively robust across all methodological choices assessed in our sensitivity analysis. Compared to length scales when using a 500-km relaxed neighborhood, using a smaller 300-km neighborhood yields a shorter length scale in 90%–95% of stations across seasons, and using a larger 700-km relaxed neighborhood yields a longer length scale in 87%–93% of stations across seasons (see Fig. S4). Compared to length scales calculated using relaxed neighborhoods, using a restricted neighborhood decreases extreme precipitation length scales in DJF, MAM, and SON by less than 25% for the majority of stations (67%, 57%, and 57%, respectively); also, 70% of the length scales in JJA increase by less than 10% (see Fig. S4). Likewise, using uniform station densities decreases length scales by up to 40% (see Fig. S5).
Nonetheless, the regional variations in length scales remain robust in the sensitivity tests. For example, the Central region consistently has significantly longer length scales than any other region in DJF, MAM, and SON, while the West region has the longest length scales in JJA (see Fig. S6). The highest dependence of regional variations on different methodological choices is found in DJF. For instance, when using a 700-km relaxed neighborhood, the difference between the Northwest and West North Central length scale becomes positive and statistically significant. However, large differences in interregional variations are rare.
The seasonal variations are also robust to using different neighborhood sizes and station densities, as well as when using a restricted neighborhood method. For example, DJF length scales are largest in the Central, East North Central, South, Southeast, and West regions, while JJA length scales are smallest in all regions except for the Northwest over all methodological choices. The most prominent exceptions are the Northeast and West North Central regions, where methodological choices have a strong impact on the seasonal variations in length scales (see Fig. S7 for more details).
We note that these sensitivity analyses do not test all of the methodological choices in our framework. Though we test the sensitivity to the neighborhood size and the number of stations, we do not account for gradients in station density within the neighborhood. Given that the semivariogram calculation relies on the separation distance between stations, such gradients could ultimately bias our length scale calculation. For example, denser station availability in one part of a neighborhood could lead to a greater number of station pairs included in the experimental semivariogram at smaller separation distances from that part of the neighborhood. These station pairs would weight the experimental semivariogram at smaller separation distances, and consequently impact our length scale calculation.
Additional untested factors include the size of separation distance intervals for the calculation of the experimental semivariogram, the maximum separation distance to which the semivariograms are calculated, the years over which p90 thresholds and length scales are calculated, and the 10% threshold for choosing p90 days. However, our analysis shows that while the absolute magnitude of length scales can be sensitive to methodological choices (especially neighborhood size), the seasonal and regional differences in length scales are relatively robust.
5. Discussion and conclusions
This study presents a method for quantifying regional and seasonal variations in the spatial extent of extreme precipitation using station data, which is foundational to investigating changes in extreme precipitation in a warming world. Our method is advantageous in that the relaxed moving neighborhood allows for an assessment over a large region, and the geostatistical techniques we use allow us to employ longer station datasets without relying on interpolated precipitation data. Until the representations of extreme precipitation in satellite, radar, interpolated, and modeled datasets improve—and/or have been continuously observing the climate system for a longer period of time—our method can provide a foundation to understand the characteristics of extreme precipitation using direct, in situ measurements.
a. Linking length scales to seasonal and regional atmospheric phenomena
The robust seasonal and regional variations in length scales appear to correspond to well-known atmospheric phenomena. The significantly shorter length scales of extreme precipitation in the summer months over most of the continental United States (with the exception of the Pacific Northwest and northern Great Plains) are characteristic of the intense but relatively localized downpours associated with convective storms, which tend to dominate warm-season precipitation in these areas. The longest length scales in each region, on the other hand, occur during winter (as do the highest median length scales in many areas), coinciding with the passage of larger “synoptic scale” cyclones as the jet stream and associated storm track shift southward during the cool season (Schneider et al. 2011).
However, our findings show that smaller-scale processes could also be important in modifying the spatial extent of extreme precipitation during the cooler seasons. Short autumn, winter, and spring length scales in the Northwest (Fig. 4) are consistent with Rutz et al. (2014), who show that heavy rainfall from atmospheric rivers (ARs) rarely penetrates past the Cascade Range in Washington and Oregon. In the rain-shadowed region immediately to the east of the Cascades, regional topography allows both westerly and southwesterly flows to generate heavy precipitation regimes (e.g., Rutz et al. 2014), resulting in a more heterogeneous distribution of extremes and shorter length scales. Farther to the south (across California and western Nevada), however, longer p90 length scales do extend farther inland. As in the Pacific Northwest, this may be largely a product of the coastal topography: California’s coastal mountains are not as tall as the Cascades, and form a less perfect barrier to east-moving Pacific moisture, allowing longer length scales to extend inland across most of the state (Rutz et al. 2014). The much taller Sierra Nevada range, located along the eastern margin of California, has an effect similar to that of the Cascades, resulting in a strongly rain-shadowed region across the Great Basin in Nevada, and an eastward decrease in p90 length scales.
We note that p90 length scales across California are generally larger than across the Pacific Northwest, despite the predominance of AR-driven autumn–winter–spring extremes in both regions. We hypothesize that latitudinal asymmetry in the spatial orientation of ARs may be partially responsible for this difference. ARs may more frequently be zonally (east–west) oriented in the north as opposed to meridionally (north–south) oriented in the south (Rutz et al. 2015), meaning that these extreme events may affect a broader section of the north–south-oriented coastline in California than in Washington/Oregon. An assessment of the anisotropic length scales of extreme precipitation along zonal and meridional axes could yield further insights into topographical and storm track controls along the Pacific coast.
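As an illustration of how such an anisotropic assessment might be set up, the following minimal sketch computes an empirical indicator semivariogram using only station pairs whose separation is aligned with the zonal or the meridional axis. It is not the implementation used in this study: the function name, the projected station coordinates in kilometers, the angular tolerance, and the distance bins are all assumptions made for illustration, and the relaxed moving-neighborhood selection is omitted.

```python
import numpy as np

def directional_indicator_semivariogram(x_km, y_km, indicator, bin_edges_km,
                                        axis="zonal", tol_deg=22.5):
    """Empirical indicator semivariogram restricted to one direction.

    x_km, y_km   : station coordinates in a projected plane (km), length n_stations
    indicator    : array (n_days, n_stations) of 0/1 exceedance flags
    bin_edges_km : increasing distance-bin edges (km)
    axis         : "zonal" (east-west pairs) or "meridional" (north-south pairs)
    tol_deg      : hypothetical angular tolerance around the chosen axis
    """
    x = np.asarray(x_km, dtype=float)
    y = np.asarray(y_km, dtype=float)
    ind = np.asarray(indicator, dtype=float)

    # All unique station pairs (i < j).
    i, j = np.triu_indices(x.size, k=1)
    dx = x[j] - x[i]
    dy = y[j] - y[i]
    dist = np.hypot(dx, dy)

    # Angle of each pair's separation from the east-west axis, folded to [0, 90].
    angle = np.degrees(np.arctan2(np.abs(dy), np.abs(dx)))
    if axis == "zonal":
        aligned = angle <= tol_deg
    else:
        aligned = angle >= 90.0 - tol_deg
    aligned &= dist > 0  # skip collocated stations, whose direction is undefined

    # Half the mean squared indicator difference over all days, per pair.
    half_sq = 0.5 * np.mean((ind[:, i] - ind[:, j]) ** 2, axis=0)

    # Average the pair values within each distance bin.
    gamma = np.full(len(bin_edges_km) - 1, np.nan)
    for b in range(len(bin_edges_km) - 1):
        in_bin = aligned & (dist >= bin_edges_km[b]) & (dist < bin_edges_km[b + 1])
        if in_bin.any():
            gamma[b] = half_sq[in_bin].mean()
    return gamma
```

Comparing the distance at which the zonal and meridional curves level off would give one rough indication of anisotropy; how that range is estimated from each curve is left open here.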
Conversely, topography in the northeastern United States plays a relatively smaller role in producing heterogeneity in winter length scales. Here, a significant fraction of winter extreme precipitation occurs during strong coastal cyclones (known regionally as “nor’easters”). These nor’easters generate large but coastally confined swaths of extreme precipitation from the upper mid-Atlantic (as far south as Delaware) to northern New England (as far north as coastal Maine) (Agel et al. 2015), which are reflected in longer extreme precipitation length scales along the coast. Shorter p90 length scales are found in the vicinity of the Great Lakes, where a significant fraction of extreme wintertime precipitation results from intense but highly localized convective “lake-effect snow” bands (Niziol et al. 1995). During seasons other than winter, Northeastern p90 length scales are more spatially uniform, likely owing to the importance of eastward-moving, large-scale cyclonic storms in triggering precipitation extremes (Murray and Colle 2011).
Although summer length scales are more spatially uniform across the United States (relative to the cooler seasons), we still observe differences in length scales that coincide with regionalized summertime precipitation regimes. For example, the band of longer length scales across the Great Plains and portions of the Midwest (from southeast Texas to the edge of Lake Michigan) coincides with the region where mesoscale convective complexes (MCCs) most commonly form during the warm season. MCCs are organized storm systems, but they tend to have intermediate length scales (i.e., between larger-scale cyclones and individual thunderstorms), and frequently result in intense rainfall and severe weather across relatively broad regions (Laing and Fritsch 1997). On the other hand, parts of the Southeast, especially the Florida Peninsula and other areas along the Gulf Coast, feature intense but highly localized pulse or “popcorn” thunderstorms in the summer months, yielding much shorter characteristic length scales (Miller and Mote 2017). We note, however, that subdaily precipitation in the summer follows a strong diurnal cycle (Higgins et al. 1997), meaning that calculated length scales may be sensitive to the time of observation of daily precipitation accumulation among nearby stations.
In fact, given the daily temporal resolution of the station dataset, interpretations of the length scales of extreme precipitation in our results are not necessarily absolute. Longer length scales of daily extreme precipitation can be a result of a slow and large precipitation system, or a smaller but fast-moving system. An assessment of the length scales of subdaily or multiday precipitation using our method could yield more insight into the temporal controls on the length scales of extreme precipitation. Similar to Zhou and Matyas (2017), who assess the spatial characteristics of precipitation along hurricane tracks, our method could also be used to quantify the length scale of extreme precipitation along the trajectory of a given precipitation system by calculating the length scale of extreme precipitation around the centroid of the system. By following different storm tracks, we could disentangle the physical processes that produce different length scales of extreme precipitation for different types of precipitating systems, such as tropical storms or atmospheric rivers.
b. Strengths, limitations, and applications of the method
In contrast to the distinct regional and seasonal variations in climatological length scales found in our results, recent analyses of the spatial extents of extreme precipitation using radar and satellite data show more limited variations in climatological length scales within and between regions, and between seasons. For example, Guinard et al. (2015), who use 1-hourly radar data to assess precipitation objects over the United States from 1992 to 2001, present subtler seasonal variations in the size of precipitation objects over the Northeast, Central, and Southeast regions. Similarly, we show much more pronounced regional variations in p90 precipitation length scales in the eastern half of the United States (ranging from ~120 to 350 km; Fig. 5) than Dwyer and O’Gorman (2017), who used satellite data from 1998–2015 to assess the zonal length scale of p99 precipitation.
The magnified regional and seasonal variations found in our climatological quantification of p90 precipitation length scales may be a result of using a longer dataset, and/or of using in situ measurements of extreme precipitation directly. Applying our method to the same gridded datasets used in previous studies would help to identify the source of discrepancies in the regional and seasonal climatologies of extreme precipitation length scales. On the other hand, if the gridded dataset in question was originally oversmoothed, then applying our method could still result in misleading magnitudes of length scales due to that oversmoothing.
Discrepancies between our results and those of previous analyses could also be due to different thresholds of precipitation and different periods over which thresholds are calculated. In our study, long length scales of p90 precipitation over regions with large heterogeneities in p90 thresholds, such as east and west of the Sierra Nevada, may not necessarily represent extents of uniformly intense precipitation. In such regions with small-scale variations in p90 precipitation intensities resulting from topographic influences on storm tracks, quantifying length scales of extreme precipitation using a single, uniform, absolute precipitation threshold (rather than a percentile) may produce a more meaningful assessment of high-impact precipitation events, and may show more correspondence with previous uniform-threshold analyses of gridded datasets.
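The distinction between a station-specific percentile threshold and a single absolute threshold can be made concrete with a short sketch. The function below builds the 0/1 exceedance indicators under either choice. It is an illustration under stated assumptions, not the procedure used in this study: the function name, the wet-day cutoff of 1 mm, the use of wet days only when computing percentiles, and the "greater than or equal to" comparison are all hypothetical choices.

```python
import numpy as np

def exceedance_indicators(precip_mm, percentile=90.0, absolute_mm=None,
                          wet_day_mm=1.0):
    """Return (indicator, thresholds) for daily station precipitation.

    precip_mm   : array (n_days, n_stations) of daily totals in mm
    percentile  : percentile used for station-specific thresholds
    absolute_mm : if given, one uniform threshold (mm) replaces the percentile
    wet_day_mm  : assumed wet-day cutoff; percentiles use wet days only
    """
    p = np.asarray(precip_mm, dtype=float)

    if absolute_mm is not None:
        # One uniform threshold shared by every station.
        thresholds = np.full(p.shape[1], float(absolute_mm))
    else:
        # Station-specific threshold: percentile of that station's wet days.
        wet = np.where(p >= wet_day_mm, p, np.nan)
        thresholds = np.nanpercentile(wet, percentile, axis=0)

    # Broadcasting compares each day's value with its station's threshold.
    indicator = (p >= thresholds).astype(int)
    return indicator, thresholds
```

With a percentile threshold, a dry leeward station and a wet windward station flag a similar fraction of their wet days as extreme even if their totals differ severalfold; with an absolute threshold, the drier station may flag few days or none.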
We also note that gauge measurements of rain and snow possess their own uncertainties due to varying gauge types, wind effects, and other factors (Rasmussen et al. 2012; Sieck et al. 2007), which could impact our extreme precipitation dataset and, consequently, the calculated length scales. Although previous experimental efforts have assessed measurement uncertainty in rain and snow gauges in different locations (e.g., Tokay et al. 2010; Liu et al. 2013), we do not quantify these uncertainties for all ~4500 stations in this study. However, we directly use the station measurements to infer length scales without smoothing or interpolating the dataset, thereby avoiding the additional uncertainties that such processing could introduce. Further, our method provides a clear framework to assess the sensitivity of our quantification of the length scale of extreme precipitation to methodological choices.
Given that our methodological choices result in differences in the magnitude of length scales, but not in differences in relative regional and seasonal variations, our framework could be used to assess relative changes in the length scales over time, and to diagnose the changes in atmospheric variables that shape any trends that are detected. Although anthropogenic warming has generally increased the intensity of extreme precipitation (e.g., O’Gorman 2015), Guinard et al. (2015) show that the spatial extents of precipitation decreased in 2002–11 compared to 1992–2001. Similarly, Wasko et al. (2016) show that the spatial extents of rain storms decrease with increased local temperatures (although intensities of the storms increase). However, these studies have been limited in their temporal and/or spatial extent. Our method introduces the potential to provide a more robust assessment of trends in length scales over large continental areas, using longer station records. For example, using our method, we could quantify the length scale of extreme daily precipitation at each location for each year, and then evaluate trends in the length scale over the 50-yr period, while accounting for spatial and temporal dependencies.
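As a rough illustration of the trend step, the sketch below estimates a robust linear trend from a series of annual length scales at one location using the Theil–Sen estimator (the median of all pairwise slopes). The estimator is our choice for this example only; the text above does not prescribe one. The function name and inputs are hypothetical, and the sketch does not account for the spatial and temporal dependencies noted above, which a full analysis would need to address.

```python
import numpy as np

def theil_sen_slope(years, length_scale_km):
    """Median of pairwise slopes (km per year) for one location.

    years           : 1D array of distinct years
    length_scale_km : 1D array of annual length scales (NaN where missing)
    """
    t = np.asarray(years, dtype=float)
    v = np.asarray(length_scale_km, dtype=float)

    # Drop years with no length-scale estimate.
    ok = ~np.isnan(v)
    t, v = t[ok], v[ok]

    # Slope between every pair of years (i < j); years are assumed distinct.
    i, j = np.triu_indices(t.size, k=1)
    slopes = (v[j] - v[i]) / (t[j] - t[i])
    return float(np.median(slopes))
```

For a 50-yr record this evaluates 1225 pairwise slopes per location, which remains inexpensive even for several thousand stations.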
Moreover, the substantial heterogeneity in climatological length scales within some of the “climatically consistent” NCEI regions (e.g., the Northeast, South, and West North Central regions) points to potentially complex relationships between changes in the spatial extent of extreme precipitation and changes in atmospheric processes. For example, after calculating trends in the length scale of extreme precipitation, we could subsequently assess the thermodynamic and dynamic contributions to those trends by quantifying the corresponding trends in local temperature and precipitable water (Trenberth et al. 2003; Trenberth 2011; O’Gorman 2015), along with changes in the magnitude and spatial patterns of geopotential heights and winds (Horton et al. 2015). In this way, we can begin to identify the causes of changes in the spatial extent of extreme precipitation in the recent past.
Our method and results can also be used to test relationships between length scales of extreme precipitation and associated hydrologic responses, which may enable a more comprehensive understanding of regional flood risk. While the p90 precipitation threshold encompasses events that are likely to have substantial hydrological impacts, a higher threshold (e.g., p99 or p99.9) may be better suited to examining events that are most likely to result in severe flooding. Future work will evaluate changes in precipitation length scales using a range of precipitation thresholds, with a focus on thresholds that are most likely to cause adverse impacts to human and environmental systems.
We thank three anonymous reviewers for insightful and constructive feedback. We thank NOAA for providing the GHCN station data, and the Stanford Research Computing Center and Stanford’s Center for Computational Earth and Environmental Science for providing computational resources and support. We acknowledge funding support from the U.S. Department of Energy and Stanford University. D.L.S. was supported by the UCLA Sustainable LA Grand Challenge and the NatureNet Science Fellows Program through a collaboration between The Nature Conservancy and the University of California, Los Angeles.
Supplemental information related to this paper is available at the Journals Online website: https://doi.org/10.1175/JCLI-D-18-0019.s1.