Watersheds located in the Catskill Mountains of southeastern New York State contribute about 90% of the water to the New York City water supply system. Recent studies show that this region is experiencing increasing trends in total precipitation and extreme precipitation events. To assess the impact of this and other possible climatic changes on the water supply, there is a need to develop future climate scenarios that can be used as input to hydrological and reservoir models. Recently, stochastic weather generators (SWGs) have been used in climate change adaptation studies because of their ability to produce synthetic weather time series. This study examines the performance of a set of SWGs with varying levels of complexity to simulate daily precipitation characteristics, with a focus on extreme events. To generate precipitation occurrence, three Markov chain models (first, second, and third orders) were evaluated in terms of simulating average and extreme wet days and dry/wet spell lengths. For precipitation magnitude, seven models were investigated, including five parametric distributions, one resampling technique, and a polynomial-based curve fitting technique. The methodology applied here to evaluate SWGs combines several different types of metrics that are not typically combined in a single analysis. It is found that the first-order Markov chain performs as well as higher orders for simulating precipitation occurrence, and two parametric distribution models (skewed normal and mixed exponential) are deemed best for simulating precipitation magnitudes. The specific models that were found to be most applicable to the region may be valuable in bottom-up vulnerability studies for the watershed, as well as for other nearby basins.
Watersheds located in the Catskill Mountains of southeastern New York State provide a high-quality municipal water supply to over 9 million people, including city residents as well as other communities in the state (NYCDEP 2016). The system has a total storage capacity of ~2.20 × 109 m3, and in recent years has an average daily delivery rate of ~4.16 × 106 m3, making the system quite resilient to drought with approximately 1.5 years of storage. The system consists of three major subsystems: the Croton, which is the oldest and smallest and is located east of the Hudson River, and the two that are located west of the Hudson River, the Catskill and Delaware subsystems, which are collectively referred to as the West of Hudson (WOH) system and typically provide approximately 90% of the total supply. This WOH system covers an area of approximately 4103 km2 with an elevation range of 125–1275 m (Matonse et al. 2013). The spatial expanse of the system provides additional resilience to both drought and to water quality problems because when one basin is experiencing problems, water can be withdrawn from another, less problematic basin. Yet, despite the relative resilience compared to typical smaller water supply systems, significant vulnerability remains to water quantity and quality issues associated with climatic fluctuations and to extreme events. The New York City Department of Environmental Protection (NYCDEP), which operates and maintains the system, expends considerable effort in providing state-of-the-art system management tools to effectively meet the numerous system operation challenges and objectives (Porter et al. 2015).
This region experiences a humid continental climate, with summers noticeably cooler at higher elevations than in the surrounding lowlands, winter temperatures often below freezing, occasionally abundant snowfall, and year-round precipitation averaging between 1000 and 1200 mm yr−1 (Anandhi et al. 2011a). Precipitation has strong gradients associated along elevation and longitude related to orography and coastal storm tracks, resulting in greater total precipitation and larger extreme events in particular areas along the southern and eastern escarpments (Thaler 1996). Between the mid-twentieth century and the first decade of the twenty-first century, the contribution of snowfall to total annual precipitation has been 20%–30% (Frei et al. 2002; Pradhanang et al. 2011; Anandhi et al. 2011b).
Extreme hydrological events are, in general, responsible for a disproportionate transport of nutrients and sediment into the streams and reservoirs (Mukundan et al. 2013; Yoon and Raymond 2012). Past studies suggest increasing trends in total precipitation and in the frequency and magnitude of extreme precipitation events in the WOH basins. Burns et al. (2007) found that regional mean precipitation for the Catskill Mountain region increased significantly by 136 mm between 1952 and 2005. Matonse and Frei (2013) found that extreme warm season precipitation and streamflow events were more frequent between 2002 and 2012 than during any period of the twentieth century. DeGaetano and Castellano (2013) found that the annual frequency of extreme Catskills precipitation (number of events per year that produce ≥50.8 mm precipitation) has been increasing over the past 60 years, with the time series dominated by year-to-year and decade-to-decade variability. They also analyzed climate model projections from the North American Regional Climate Change Assessment Program (NARCCAP) that suggest that extreme precipitation will increase at a rate of 2%–3% decade−1 through 2069. Thibeault and Seth (2014) found that for the northeastern United States, CMIP5 models predict a significant increase in the frequency and magnitude of extreme precipitation, particularly in mountainous regions such as the Catskills. They also found that models predict a concurrent increase in total precipitation, which could contribute to wetter conditions antecedent to storms and therefore to increased extreme streamflow events. The potential effects of these precipitation changes include increased sediment erosion, increased nutrient loads, alteration of the annual cycle of reservoir thermal structure, and other factors that may pose challenges for water quality management.
As a part of the NYCDEP’s Climate Change Integrated Modeling Project (CCIMP), a series of studies (e.g., Anandhi et al. 2011a,b; Pradhanang et al. 2011; Anandhi et al. 2013; Matonse et al. 2013; Pradhanang et al. 2013) have examined the potential impacts of climate change on the availability of high-quality water in the WOH. These studies follow the “top down” approach, using change factor methodology (CFM) downscaled climate scenarios from global climate models (GCMs), to incorporate climate change into vulnerability analyses using NYCDEP’s integrated suite of models including watershed hydrology, water quality, water system operations, and reservoir hydrothermal models (Anandhi et al. 2011a). As an alternative to the top-down approach, “bottom up” or vulnerability-based methods to climate change vulnerability analyses have recently been applied to water resources (Wilby and Dessai 2010; Brown et al. 2011; Prudhomme et al. 2010, 2015). Using such approaches, which involve sensitivity studies of system responses to scenarios independent of GCM results, can explore the climate vulnerabilities of a system over a wider range of plausible climate change scenarios than the more traditional top-down approaches in which GCM projections completely define the parameter space of future scenarios (Wiley and Palmer 2008; Steinschneider and Brown 2013). The bottom-up approach can be implemented in a number ways. For example, Steinschneider and Brown (2013) first determine system vulnerabilities and then assess different adaptation measures to find the most robust measures under future uncertainty. While the bottom-up approach can include the results of GCM simulations, it also enables more flexible definitions of uncertainty.
Stochastic weather generators (SWGs) provide an advantage over more traditional change factor methods when implementing the bottom-up approach. SWGs are statistical models that produce synthetic weather time series based on observed statistical properties at a particular location and can include numerous weather variables (although this study is focused solely on precipitation). SWGs can be employed in bottom-up risk assessments to generate multiple scenarios of daily climate variations within which a water resource system model can be tested (Ray and Brown 2015). An SWG coupled with a single or series of response models facilitates more complete identification of system vulnerabilities and flexible, quantitative definitions of uncertainty, which can aid in the selection of robust adaption measures (Steinschneider and Brown 2013). One advantage over change factor methods is that SWGs provide the ability to produce probabilistic information about the frequency of extremes. Such information is obtained by running numerous simulations in a Monte Carlo type of analysis and focusing on values at the tails of the distribution. In contrast, traditional change factor methods provide information about only mean conditions.
While the potential of SWGs for vulnerability assessments for the New York City water supply system has not yet been explored in detail, a recent study (Rossi et al. 2015) used a multivariate, multisite weather generator to introduce incremental changes in mean precipitation and temperature to simulate a range of climate change scenarios that were used to study impacts on turbidity in the Ashokan Reservoir. However, the skill of weather generators used to simulate the observed precipitation characteristics was not discussed in detail. There are a number of general categories of SWGs available in the literature with different levels of complexity and varying location-specific skills in simulating observed precipitation characteristics (Chen and Brissette 2014b). It is therefore essential to examine model structure and performance in the target watershed before applying the SWG in climate change assessment studies for these basins.
The objective of this analysis is to demonstrate a method for evaluating a suite of SWGs for use in a particular region with a focus on extreme precipitation events. To ensure robust conclusions, we apply several different approaches to evaluate the simulation of extreme precipitation events, including daily and seasonal extremes, extreme events indices, and recurrence intervals of extreme events. Each of these approaches has been used before, but they are typically not used in combination. This combination of methods may be particularly useful for analyses of other water supply or other infrastructure systems in which large precipitation events present particular challenges. The SWG results for our watershed may be useful to others in our region requiring similar analyses. In future studies we will apply these results to a bottom-up sensitivity study of the New York City water supply system.
The remainder of the paper is organized as follows. Section 2 provides a brief description of historical datasets used in the study. Section 3 provides an overview of SWGs relevant to this study. The framework for calibrating SWGs is described in section 4. Sections 5 and 6 present results, and section 7 includes discussion and conclusions.
Observed daily precipitation data were obtained from Northeast Regional Climate Center (NRCC) at Cornell University. A total of 18 rain gauge stations of National Climatic Data Center (NCDC) stations are nonuniformly distributed across the WOH watersheds (Fig. 1). As each station has a different record history, precipitation data for the common period of record (1950–2009) were used. As the focus of the study is to analyze the average precipitation over each watershed, the weighted mean of nearby stations was calculated to obtain a single time series for each watershed. The Thiessen polygon method (Thiessen 1911) was used to estimate the weights based on the relative areas of each measurement station in the Thiessen polygon network. Table 1 describes the stations and their corresponding weights for each watershed. More details can be found in Anandhi et al. (2011b).
The decision to perform the analysis at the basin scale, rather than station scale, was based on a number of considerations. The results must ultimately be translated into basin-scale information for analyses of water supply and links to stream gauge observations. The translation to basin scale could potentially be made at a different point in the analysis, but would inevitably require some sort of modeling approach to evaluate finer-scale orographic effects that cannot be captured by the relatively sparse station network. We are aware of no product that includes both a consistent long-term record and a reasonable estimation of orographic effects. The PRISM dataset (Daly et al. 2008; http://prism.oregonstate.edu) includes orographic effects but is inappropriate for analysis on the historical time scale that we require because of inconsistency in the station distribution. The data product employed here (Anandhi et al. 2011a) is as good as any other in capturing the relative magnitudes of events at the basin (or larger) and daily (or longer) temporal and spatial scales. This has been confirmed by NYCDEP watershed hydrological models that are able to capture observed streamflow variations based on basin-averaged meteorological input (Zion et al. 2011).
3. Overview of SWGs
The use of SWGs to produce synthetic time series with statistical properties resembling those of an observed time series has wide applications in the modeling of weather and climate-sensitive systems such as crop growth and development, hydrological processes, and ecological systems, where the observed climate records are inadequate in terms of length or completeness (Wilks and Wilby 1999). A plethora of early studies dedicated to the development and advancement of SWGs (Gabriel and Neumann 1962; Todorovic and Woolhiser 1975; Katz 1977; Richardson 1981) have been summarized in several review articles (Wilks and Wilby 1999; Srikanthan and McMahon 2001; Ailliot et al. 2015). SWGs can be broadly classified into four groups: two-part models (the first part is dedicated to precipitation while the second part deals with other meteorological variables such as temperature or solar radiation), resampling models, transition probability models, and auto regressive moving average (ARMA) models (Srikanthan and McMahon 2001), with the first two approaches being the most common. In this study, we discuss only precipitation generation in two-part and resampling models.
Precipitation, which is unique among meteorological variables in its mixed nature as both a discrete (occurrence) and continuous (amount) process, has always been a key variable of interest in the construction of weather generators (Wilks and Wilby 1999). In a two-part model, SWGs analyze precipitation as a chain-dependent model, first simulating precipitation occurrence (wet or dry day) and then precipitation amount. Occurrence is usually simulated using either a Markov chain (MC)-based model or a renewal process, sometime referred to as a spell-length model. Two-state (i.e., precipitation occurs or does not occur) MC models relate the state of precipitation on the current day to the states of precipitation on preceding days where the number of preceding days considered is the order of the MC (Boulanger et al. 2007). Although the first-order MC model (precipitation occurrence depends only on the previous day) has been found satisfactory in most cases (Katz 1977; Richardson 1981; Wilks 1992), the higher-order model better simulated long wet and dry spells (Wilks 1999; Chen and Brissette 2014a). The alternating renewal process, rather than simulating occurrence for each day, fits a probability distribution to the sequence of alternating wet and dry spell lengths that are assumed to be independent (Buishand 1978; Roldan and Woolhiser 1982; Semenov and Barrow 2002). Various probability distributions have been evaluated for the best fit of wet and dry spell lengths such as logarithmic series, truncated negative binomial distribution, truncated geometric distribution, and semiempirical distribution (Wilks and Wilby 1999). The parameters of the distribution are typically assumed to remain constant within seasons or to change continually as estimated by a Fourier series. One drawback of the alternating renewal process method is its inability to handle the seasonality in the rainfall occurrence process (Srikanthan and McMahon 2001).
Given the occurrence of a wet day, the daily precipitation amount is then modeled, typically using a parametric distribution. The distributional pattern of daily precipitation is strongly skewed toward higher values, as very small daily precipitation events occur frequently, while heavy daily precipitation events are relatively rare (Wilks and Wilby 1999; Chen and Brissette 2014b). Numerous studies have compared several probability distributions for simulating daily precipitation, including both single and compound distributions such as exponential (EXP; Todorovic and Woolhiser 1975; Roldan and Woolhiser 1982), gamma (GAM; Ison et al. 1971; Richardson and Wright 1984), Weibull (Stöckle et al. 1999), skewed normal (SN; Nicks and Gander 1994), mixed exponential distribution (MEXP; Roldan and Woolhiser 1982; Wilks 1999), and hybrid exponential and Pareto distributions (EXPP; Furrer and Katz 2008; Li et al. 2012; Chen and Brissette 2014b).
Resampling, a data-driven nonparametric method, provides an alternative to parametric techniques. The k-nearest neighbor (k-NN) conditional bootstrap approach, the most popular resampling scheme for SWG, generates daily weather variables by resampling (with replacement) historical records associated with the wet–dry day series (Rajagopalan and Lall 1999). The k-nearest neighbors for each date are chosen by considering all historical dates within a specified time window. Subsequently, one of the k-nearest neighbors is chosen, with a higher probability given to closer neighbors (King et al. 2015). After the pioneering work by Young (1994) and Sharma et al.(1997), a number of studies extended and improved the k-NN approach (Rajagopalan and Lall 1999; Buishand and Brandsma 2001; Yates et al. 2003; Sharif and Burn 2007; King et al. 2015). Some studies (Apipattanavis et al. 2007; Steinschneider and Brown 2013) also combine the MC model and k-NN resampling into a semiparametric method. The kernel density estimation (KDE) method is another popular nonparametric approach for precipitation generation (Sharma and Lall 1999; Mehrotra and Sharma 2007). The underlying concept of this method is based on counting the relative frequency of data lying in a local neighborhood, which depends on the shape of kernel functions about the point of estimation (Sharma and Lall 1999).
Other less widely used theoretical constructs have been applied to generate precipitation amount in some studies. For example, hidden Markov models (HMMs), in which the probability of local daily rainfall occurrence is conditioned on a number of hidden weather states with Markovian daily transitions between them, have been proposed in this context (Zucchini and Guttorp 1991; Greene et al. 2011). Boulanger et al. (2007) introduced a multilayer perceptron-based neural network to generate synthetic precipitation time series. Chen et al. (2015) proposed the use of a second-degree polynomial curve fitting approach to fit a Weibull experimental frequency distribution of observed daily precipitation constrained on the probable maximum precipitation (PMP).
4. Implementation of SWGs
Chain-dependent models using different combinations of MC-based models of orders first (MC1), second (MC2), and third (MC3) to generate precipitation occurrence were implemented along with several parametric probability distributions, a resampling method, and a curve fitting technique to generate precipitation amount. The overall methodology to implement SWG in this study is outlined using a flowchart shown in Fig. 2. In this study only two-state (wet or dry day) MC models were applied. A threshold of 0.1 mm was employed to discriminate between wet and dry days at the basin-averaged scale.
a. Generation of the precipitation occurrence
In the first-order MC process, the probability of precipitation on a given day is based on the previous day’s condition whether it was wet or dry, which can be defined in terms of transition probabilities P01 and P11:
where the term “prob” means probability and the vertical line represents “conditional on.” Since precipitation either occurs or does not occur on a given day, the two complementary transition probabilities are P00 = 1 − P01 (dry day following a dry day) and P10 = 1 − P11 (dry day following a wet day).
The first-order MC model can be generalized to higher orders. Letting rt = 0 if day t is dry, and rt = 1 if day t is wet, Eqs. (1) and (2) can be extended to second- and third-order MC processes [Eqs. (3) and (4), respectively]:
where i, j, k, h are values of 0 or 1 for dry and wet days, respectively. Transition probabilities were estimated from the observed time series by using the maximum likelihood method for each biweekly period. The sequences of wet and dry days in the SWG were produced by applying a random number generator based on the uniform distribution from the interval [0, 1] for each day. The occurrence of precipitation was estimated for each day, in sequence, based on the precipitation state of the previous day(s) and a comparison of the random number to the corresponding transitional probability (Wilks and Wilby 1999).
b. Generation of precipitation amount
After computing rainfall occurrence, rainfall amounts on wet days were modeled using seven distribution models, including five parametric distributions, one resampling method (k-NN), and one curve fitting method. Parametric distributions include three single distributions—EXP, GAM, and SN—and two compound distributions—MEXP and EXPP. The parameters of the distributions were estimated using the maximum likelihood method for each biweekly time period, except for the SN distribution, where the method of moments was used. Following Rajagopalan and Lall (1999), we chose k = in the k-NN resampling method, where N equals the sample size and a 7-day time window was used to find the nearest neighbor for each date by considering all historical dates. The second-order polynomial-based curve fitting method (PN) to fit the Weibull experimental frequency of daily precipitation used here is similar to Chen et al. (2015), except in our case the maximum value is not constrained by the PMP. The length of all simulated daily precipitation series are 10 times the observed record length. The generation of such long synthetic series is required to avoid biased simulation of the true climate (Chen and Brissette 2014a).
5. Statistical evaluation of SWGs
In this section, we evaluate the skill of SWGs in terms of simulating the statistical characteristics of the full distribution of daily precipitation as well as the monthly and annual scales for all six watersheds.
a. Daily precipitation characteristics
As SWGs simulate daily precipitation occurrence first and then the amount separately, we evaluated each component independently by estimating a number of statistical metrics for observed and simulated time series for each watershed. Each metric is discussed here.
1) Precipitation occurrence
Markov chain–based models with different levels of complexity, namely, MC1, MC2, and MC3, are compared with observations with respect to reproducing the frequency of wet days per month and the distributions of wet and dry spell lengths. We compare observed to simulated number of wet days (Fig. 3) and spell lengths (Fig. 4, only Cannonsville and Ashokan are shown, as results are similar for other basins). Wet (dry) spell lengths are defined as the number of consecutive days with precipitation more (less) than the threshold value. Based on these charts, as well as on statistical metrics including mean, standard deviation, and 99th percentile values (not shown), it is determined that no significant improvement is achieved by increasing the order of the MC model. This result agrees with earlier work. For example, Wilks (1999) found the MC1 broadly appropriate for the central and eastern United States. Schoof and Pryor (2008) also found that the MC1 outperforms the MC2 and MC3 for a major part of the United States, including the northeastern region. Therefore, under the principle of parsimony, only MC1 is used in subsequent sections of this analysis.
2) Precipitation amount
Precipitation amount is evaluated based on the mean, median, standard deviation, interquartile range (IQR), and skewness coefficient of daily precipitation amount (including wet days only; Fig. 5). While all seven models adequately simulate mean daily precipitation for all watersheds, their skills vary in reproducing other statistics. For example, the EXP and GAM distributions consistently overestimate the median and underestimate the standard deviation, while EXPP and PN considerably overestimate both median and standard deviation for most watersheds.
As expected, the skewness coefficient of observed daily precipitation is greater than 3 for most watersheds, implying that the distribution of daily precipitation is extremely skewed toward higher values. Skewness is underestimated by EXP and GAM and overestimated by EXPP and PN (especially for the Schoharie watershed), while SN, MEXP, and k-NN appear to adequately simulate the skewness (this is quantified below). To visualize this issue in more detail, the cumulative density functions (cdf) of observed and simulated time series were estimated (Fig. 6) and plotted on the probability scale such that the values at the tails of the distribution can be more easily visually analyzed. The models tend to perform well at smaller precipitation values. In contrast, for the large events most models perform more poorly: EXP and GAM both underestimate the magnitudes of the largest value, while EXPP and PN overestimate the largest values and overestimate skewness. Similar results for these models have also been reported by other studies for different locations (e.g., Wilks and Wilby 1999; Chen and Brissette 2014b). However, the SN, MEXP, and k-NN capture the most prominent statistical characteristics of the observed cdf for both lower and higher precipitation magnitudes.
To quantify relative model accuracies, mean absolute percent error [MAPE; Eq. (5)], which is the average of the absolute values of the percent errors for each watershed, was calculated for each metric (Table 2):
where P is precipitation and subscripts “sim” and “obs” indicate simulated and observed values.
On the basis of the magnitude of MAPE, Lewis (1982) developed different classes to judge the accuracy of a model, namely, highly accurate (less than 10%), good forecast (11%–20%), reasonable forecast (21%–50%), and inaccurate forecast (51% or more). The model that produces lower MAPE provides a better representation of the rainfall characteristics. Except for mean and interquartile range, MAPE values for EXP and GAM show a reasonable forecast. With values less than 10% for all the statistics, SN, MEXP, and k-NN fall in the highly accurate category. EXPP and PN also perform as highly accurate, except for the skewness coefficient for which MAPE values indicate an inaccurate forecast. With regard to skewness, only SN, MEXP, and k-NN fall in the highly accurate category.
b. Monthly and annual precipitation characteristics
MAPE values suggest that all models provide highly accurate simulations of monthly mean precipitation (MAPE less than 10%) and annual mean precipitation (MAPE less than 2%; Table 3). While most models simulate the seasonal pattern of monthly mean precipitation fairly well (figure not shown), almost all models underestimate the April mean precipitation and overestimate May values more than during other months.
Model simulations of variability as represented by the standard deviation of monthly and annual values are poor (Table 3). The underestimation of the interannual variability of monthly and annual precipitation by all the models is expected and can be attributed to the lack of any terms in the models to represent low-frequency variation. The inclusion of such low-frequency variability has been addressed in the literature (Chen et al. 2010) and will be addressed for our study region at a future date. It is our intention here to focus this report on other aspects of the statistical simulation and to incorporate low-frequency variability at a future time.
6. Performance of SWGs for extreme precipitation events
As the study area has experienced an increased frequency of extreme precipitation events in recent decades, it is important to evaluate the abilities of SWGs to simulate extreme event probabilities. Here we apply a variety of metrics in order to evaluate model simulations in detail.
a. Daily extreme events
The first set of metrics we employ include quantile-based metrics of daily extreme event magnitude: the 95th (Q95) and 99th (Q99) percentiles (figure not shown). While all models provide reasonable estimates of Q95, Q99 is underestimated by EXP and GAM and tends to be overestimated by PN. In terms of MAPE, all models except EXP and GAM perform highly accurate simulations for both Q95 and Q99 (Table 4).
b. Seasonal extreme events
Matonse and Frei (2013) find that the seasonal cycle of extreme precipitation events is unimodal, with peak values occurring between August and October, the season during which the frequency of large events includes, but is not limited to, the influence of tropical storms and hurricanes in this region. Following Matonse and Frei (2013), box plots of all daily precipitation values equal to or greater than the 95th percentile for each watershed (Fig. 7; only Cannonsville and Ashokan are shown as results are similar for other basins) demonstrate that while EXP and GAM underestimate the observed median and interquartile range, other models perform well in capturing the observed pattern. Stochastic simulations produce a few values larger than observed, especially for EXPP and PN. The ability of stochastic models to produce events outside the range of observed values is one of their desirable attributes. However, these extremely large simulated events may also be due to the overestimation of high values toward the end of the tail of the distribution.
c. Extreme events indices
A set of 27 climate extreme indices based on daily temperature and precipitation has been proposed by the Expert Team on Climate Change Detection and Indices (ETCCDI; Klein Tank et al. 2009). Because of their robustness and straightforward calculation and interpretation, these indices have become popular in recent years for numerous applications in climate research (Sillmann et al. 2013). Following Zhang et al. (2011), SWG performance based on the four extreme event indices associated with large precipitation events was evaluated. These indices are
RX1day: maximum 1-day precipitation per year;
RX5day: maximum consecutive 5-day precipitation per year;
R95p: annual total precipitation due to events exceeding the 95th percentile over the entire data period (1950–2009); and
R99p: annual total precipitation due to events exceeding the 99th percentile over the entire data period (1950–2009).
Each index was calculated for each year individually, and yearly results are summarized in box-and-whisker plots in Fig. 8 (Cannonsville watershed) and Fig. 9 (Ashokan watershed) (results are similar for other watersheds). RX1day and RX5day are underestimated by EXP and GAM while other models perform better. For Cannonsville, all models overestimate R95p and R99p, while for Ashokan the R95p and R99p results are more mixed.
d. Extreme value theory analysis
We applied extreme value theory (EVT) to evaluate the abilities of SWG to simulate the probabilistic structure of observed extreme precipitation events. The general extreme value (GEV) distribution is commonly used to fit block maxima values. The GEV distribution is defined as
for the location −∞ < μ < ∞, scale σ > 0, and shape −∞ < ξ < ∞ parameters.
The GEV distribution was calibrated to the annual maximum precipitation time series to estimate precipitation magnitudes associated with return periods of 50, 75, and 100 years (Fig. 10). In GEV, the return level associated with the extreme events can be calculated in three steps: 1) calculate the annual maximum rainfall, 2) fit a GEV distribution, and 3) estimate the return level by calculating the quantiles of the GEV distribution (Coles 2001).
The results of the extreme value analysis are consistent with the evaluations presented in previous sections in that SN, MEXP, and k-NN provide superior simulations. EXP and GAM models consistently underestimate precipitation amount corresponding to all return periods. EXPP and PN do not accurately reproduce the upper tail of the observed daily precipitation because of their exceptional overestimation. In view of MAPE, SN, MEXP, and k-NN fall in either the highly accurate or good forecast category (Table 5). On the other hand EXP, GAM, EXPP, and PN provide reasonable forecasts.
7. Summary and concluding remarks
This work documents the application and tests the skill level of several SWGs over six watersheds that comprise the primary source for the New York City water supply system. This study compares the skill of three models for generating precipitation occurrence and seven models for simulating precipitation amount in the study area. With regard to precipitation occurrence, no significant improvement is achieved when increasing the order of the MC model. Since the number of parameters in an MC model increases exponentially with each increase in order (Wilks 1999), under the principle of parsimony MC1 is selected to simulate precipitation occurrence.
Of the seven models evaluated for generating daily precipitation amount (Table 3), five are based on parametric distributions, one is based on a semiparametric k-NN bootstrapping technique, and one is based on a second-degree polynomial curve–fitting approach. Parametric distribution models include three single distributions (one-parameter EXP, two-parameter GAM, and three-parameter SN), one compound distribution (three-parameter MEXP), and one hybrid distribution (three-parameter EXPP). While all these models reproduce mean daily precipitation reasonably well in all watersheds, their skills differ with regard to standard deviation, skewness coefficient, and extreme characteristics of daily precipitation. EXP and GAM consistently underestimate the standard deviation and skewness coefficient of daily precipitation for all watersheds because they are unable to preserve the shape of the daily precipitation distribution at the tail. As a result, these two distributions also underestimate the extreme indices and return intervals of annual maxima. The performance of SN is consistently better than the other single distributions in all respects. Such improved performance may be attributable to the inclusion of additional parameters, which increases the model’s flexibility, and also to the use of method of moment rather than maximum likelihood to estimate model parameters (Chen and Brissette 2014b). As in earlier studies (Wilks 1999; Chen and Brissette 2014b), the compound distribution MEXP performed better than single distributions except SN, perhaps because it specifically takes the entire range of precipitation distribution into account, not just the bulk. The EXPP distribution overestimates extremes, sometimes by an order of magnitude. This is because when the generalized Pareto distribution is used to simulate the tail of the daily precipitation distribution, a few cases of unreasonably large values are typically generated (Li et al. 2013). Chen and Brissette (2014b) described this issue in detail in their study. The k-NN method consistently performs better at reproducing the observed precipitation characteristics including extreme events for all the watersheds. Like EXPP, PN also overestimates precipitation amount in terms of standard deviation, skewness, and extremes. The second-degree polynomial is not the best choice to fit the Weibull experimental frequency distribution to observed daily precipitation at this study location. An evaluation of other polynomials, which have not previously appeared in the literature, may reveal a better fit, but this is beyond the scope of this analysis. However, for simulating monthly and annual mean precipitation, all models are quite comparable, though they are unable to reproduce the variability as they do not take into account the low-frequency component of climate variability.
In conclusion, though our results are fairly typical in the sense that some models perform better with regard to some metrics, while others are superior with regard to other metrics, we are able to identify three models (SN, MEXP, and k-NN) that, when combined with the first-order MC model, consistently reproduce the observed statistics of daily precipitation, including statistics of the extreme tail of the distribution. However, as the k-NN method is based on a resampling technique, k-NN is inappropriate for use in climate change impact studies because of its inability to produce values that are not found in the observations (Chen and Brissette 2014b). The inability of k-NN to extrapolate values that are plausible but outside of the observed range precludes its application, in its current form, in climate change analyses.
The methodology applied here to evaluate SWGs combines several different types of metrics that are not typically combined in a single analysis. This combination of metrics may prove useful to other water supply applications in different regions. The specific models that we find to be most applicable to our region (SN and MEXP) may be valuable in vulnerability studies for our watershed and for other nearby basins.
The application of these models in vulnerability studies will require at least two additional studies. First, since the SWGs in this evaluation include no representation of low-frequency (from interannual to decadal scale) variability of climatic parameters, an evaluation of several known approaches to incorporate such variability into SWGs (Chen et al. 2010; Khazaei et al. 2013) is forthcoming. Subsequently, vulnerability studies will be performed by coupling SWGs with a response model (or a series of response models) to facilitate a more complete documentation of system vulnerabilities. A number of researchers have been developing these so-called bottom-up approaches, whereby SWGs are used in conjunction with GCM climate scenarios and response models to evaluate water system vulnerabilities to extreme hydrological events including storms, floods, and droughts (Wilby and Dessai 2010; Brown et al. 2011; Prudhomme et al. 2010, 2015). The advantage of such approaches is that they allow the development of a quantitative and potentially Bayesian approach that includes future scenarios that are outside the range of GCM predictions, and the inclusion of a full set of uncertainties, enabling water managers to consider potential future conditions and uncertainties when making decisions.
The study was conducted as part of the Climate Change Integrated Modeling Project sponsored by the Bureau of Water Supply, New York City Department of Environmental Protection (NYCDEP). We thank Jordan Gass (NYCDEP) for making Fig. 1. We thank the Climate Change Initiative (CCI)/Climate and Ocean: Variability, Predictability, and Change (CLIVAR)/Joint Technical Commission for Oceanography and Marine Meteorology (JCOMM) Expert Team on Climate Change Detection and Indices (ETCCDI) for the use of their software “RClimdex” to estimate the extreme precipitation indices. We also thank Eric Gilleland from National Center for Atmospheric Research for his guidance on using “extRemes” software to do the extreme value analysis. Thanks to Rakesh Gelda (NYCDEP) and to two anonymous reviewers for comments that greatly improved earlier versions of this manuscript.
Current affiliation: International Research Institute for Climate and Society, Columbia University, Palisades, New York.