Seasonal drought forecasting is presented within a multivariate probabilistic framework. The standardized streamflow index (SSI) is used to characterize hydrologic droughts with different severities across the Gunnison River basin in the upper Colorado River basin. Since streamflow, and subsequently hydrologic droughts, are autocorrelated variables in time, this study presents a multivariate probabilistic approach using copula functions to perform drought forecasting within a Bayesian framework. The spring flow (April–June) is considered as the forecast variable and found to have the highest correlations with the previous winter (January–March) and fall (October–December). Incorporating copula functions into the Bayesian framework, two different forecast models are established to estimate the hydrologic drought of spring given either the previous winter (first-order conditional model) or previous winter and fall (second-order conditional model). Conditional probability density functions (PDFs) and cumulative distribution functions (CDFs) are generated to characterize the significant probabilistic features of spring droughts. According to forecasts, the spring drought is more sensitive to the winter status than the fall status, which approves the results of prior correlation analysis. The 90% predictive bound of the spring-flow forecast indicates the efficiency of the proposed model in estimating the spring droughts. The proposed model is compared with the conventional forecast model, the ensemble streamflow prediction (ESP), and it is found that their forecasts are generally in agreement with each other. However, the forecast uncertainty of the new method is more reliable than the ESP method. The new probabilistic forecast model can provide insights to water resources managers and stakeholders to facilitate the decision making and developing drought mitigation plans.
Accurate forecasting of hydrologic extreme events plays a significant role in developing appropriate policies to plan for available water resources. Although several studies have proposed promising methods to improve hydrologic forecasts, the observed effects of climate change on floods and droughts across different regions of the globe in recent decades highlights the need for more sophisticated methods in predicting extreme events (Mishra and Singh 2010; Moradkhani et al. 2010; Halmstad et al. 2012; Risley et al. 2011; Madadgar and Moradkhani 2011; Najafi et al. 2012). Motivated by the increasing frequency of droughts throughout the globe, this study proposes an advanced methodology for forecasting droughts within a probabilistic framework.
Drought development is rooted in shortages of precipitation over an extended period of time, resulting in water scarcity (Jaeger et al. 2013). As droughts expand across different spatial and temporal scales, the natural environment and human lives are strongly affected. Almost all the continents throughout the globe have been affected by various drought phenomena during recent decades (Mishra and Singh 2010). According to the U.S. Drought Monitor, more than 70% of the United States currently experiences dry spells of severities ranging from abnormally dry to exceptional droughts (Showstack 2012). Several recent efforts have attempted to enhance forecast accuracy, mitigation policies, and damage estimate of drought events in the globe, specifically in the United States. The Drought Impact Reporter (DIR), launched by National Drought Mitigation Center (NDMC), is a comprehensive database reporting damages caused by recent droughts within the United States. Reported by the North America Drought Monitor of National Oceanic and Atmospheric Administration (NOAA) National Climate Data Center (NCDC; http://www.ncdc.noaa.gov/temp-and-precip/drought/nadm/), droughts with an estimated damage of over 100 billion dollars (Lott and Ross 2000) have been among the costliest natural disasters in the United States since 1980. Lott and Ross (2006) estimated drought and heat wave induced damages to the U.S. economy at $174 billion, between 1980 and 2005. However, drought is a slowly developed phenomenon, and hence there might be a chance to mitigate drought impacts if the events are forecasted within an appropriate time.
There are various studies in the literature examining different aspects of drought phenomena. Drought identification and monitoring (Andreadis and Lettenmaier 2006; Sheffield and Wood 2007; Shukla et al. 2011; Van Loon and Van Lanen 2012; van Huijgevoort et al. 2012), developing drought indices (Niemeyer 2008; Mishra and Singh 2010), characterization of drought events (Shiau 2006; Dupuis 2007; Wong et al. 2010), evaluation of global warming and climate change impacts on future droughts (Blenkinsop and Fowler 2007; Ghosh and Mujumdar 2007; Sheffield and Wood 2008; Burke et al. 2010; Risley et al. 2011; Madadgar and Moradkhani 2011), early warning systems in reservoir operation under drought stress (Huang and Chou 2005, 2008), and many other drought-related analyses have been studied during last years. There are also several studies that have focused on qualified drought forecast for different regions. In an early attempt, Karl et al. (1987) analyzed the probability of receiving the amount of precipitation required to terminate an ongoing dry condition and return to the normal conditions over various time periods. The limitation, however, was using an unconditional gamma distribution—ignoring the dependency and correlation of precipitation in temporal scale—to obtain the probabilities. Lohani and Loganathan (1997) used a nonhomogeneous Markov chain model to generate the transition probability matrix of drought states. In another study, the Markov chain model was employed to evaluate drought transition probabilities, persistence, duration, and frequency within six categories of different severities (Steinemann 2003). Some other studies used the stochastic renewal models, stochastic autoregressive models, and artificial neural networks (ANN) to estimate different characteristics of future droughts and low-flow periods (Kendall and Dracup 1992; Loaiciga and Leipnik 1996; Mishra and Desai 2005, 2006; Barros and Bowden 2008). However, the autoregressive and neural network models were later questioned by Hwang and Carbone (2009) because of limiting the forecasts into the deterministic estimate of the mean drought status. Recently, Özger et al. (2012) developed a wavelet and fuzzy logic combination model for long-lead drought forecasting. The technique was found to outperform fuzzy logic, ANN, or coupled wavelet and fuzzy logic models, yet prior to an application it needs a significant work to find the appropriate independent predictors, which strongly affect the forecast. Without using any frequency-analysis methods, Cancelliere et al. (2007) derived the transition probabilities matrix by revising the statistics underlying the standardized precipitation index (SPI) series. They also questioned the validity of a Markov chain model in forming the transition probability matrix for forecasting SPI values. However, two major limitations of their study are 1) there are promising approaches like copula functions, as discussed later, for frequency analyses of drought status that look promising and avoid overwhelming procedures to analytically derive the transition probability matrix from the index formulas; 2) to reduce the computational burden, they assumed that aggregated monthly precipitations are uncorrelated and normally distributed variables, while this is not a valid assumption specifically when the method is expanded to other hydrologic variables like streamflow. Some other studies took advantage of seasonal climate forecasts to predict the future droughts. Carbone and Dow (2005) and Hwang and Carbone (2009) incorporated the seasonal forecast products of NOAA Climate Prediction Center (CPC) with historical climate records to address the uncertainties of future droughts. Despite the potential of CPC products in drought forecasting, Steinemann (2006) argued that the water managers are not usually able to properly interpret the forecast probabilities and uncertainty information provided by CPC seasonal precipitation outlooks.
This study approaches probabilistic seasonal drought forecast using copula functions (Joe 1997; Nelsen 1999). Despite the fact that copulas are admired in probability theory and statistics, they have yet to be effectively applied in probabilistic forecasts of drought events, except in a few studies in determining drought characteristics. Copulas are distribution functions that join several variables of diverse correlation and dependence structures. Since hydrologic variables are usually correlated to each other, the copula functions have been applied in probabilistic modeling of various hydrologic phenomena over the past few years, including flood analyses (e.g., Favre et al. 2004; Zhang and Singh 2007a; Salvadori and De Michele 2010), rainfall analyses (Zhang and Singh 2007b; Salvadori and De Michele 2006; Kao and Govindaraju 2008), spatial analysis of groundwater quality parameters (Bárdossy 2006; Bárdossy and Li 2008), synthesizing and downscaling of monthly river flow (Li et al. 2013), and low-flow/drought analyses (Shiau 2006; Dupuis 2007; Kao and Govindaraju 2010; Wong et al. 2010; Madadgar and Moradkhani 2011). Recently, Li et al. (2013) employed copula functions to synthesize and downscale the monthly river flows conditional on the past observations. In another recent study, a copula was employed in the postprocessing of hydrologic forecasts (Madadgar et al. 2013), and its performance was tested on 2500 hypothetical case studies as well as a real case study of streamflow forecast ensembles for Sprague River basin in southern Oregon. By reducing the uncertainty and increasing the reliability of forecasts, the developed technique could efficiently improve the streamflow predictions, and it was found to be a robust alternative for postprocessing or bias correction of hydrologic forecasts. This study aims at modeling and forecasting the hydrologic droughts regarding seasonal flows. Since the seasonal flow and corresponding drought conditions are dependent variables in time, copulas as capable tools are incorporated to model the Bayesian network of sequential events. Therefore, a future drought status, given the earlier drought conditions, is estimated using copula functions within a Bayesian network.
Aside from the probabilistic approach presented in this study to develop the seasonal drought forecasting, we also use the forecast tools incorporated by the operational streamflow forecast centers to evaluate how various forecast techniques are in agreement in seasonal drought forecasting. Ensemble Streamflow Prediction (ESP; Twedt et al. 1977; Day 1985) produced by the National Weather Service (NWS) River Forecast Centers (RFC) is a way to address the uncertainties of the future flows. Promising forecasts by ESPs have encouraged the researchers to extensively use them in estimating the uncertainties of various hydrometeorological forecasts, and several techniques have been recently developed to build qualified ESPs (e.g., Schaake et al. 2007; Wood and Lettenmaier 2008; Brown and Seo 2010; Dechant and Moradkhani 2011; Madadgar et al. 2013; Najafi et al. 2012). In seasonal flow forecasting, ESPs are generated using the watershed models driven by the resampled historical meteorology. Since the water supply forecasts are developed in advance when the future climate has not yet being recorded, the historical meteorology with an uncertainty of not knowing the exact future conditions is assumed to reasonably treat the future climatology. Therefore, a watershed model forced by the resampled historical meteorology produces an ensemble of streamflow forecasts that reflect the uncertainties of future unseen climatology.
The paper is organized as follows. Section 2 explains the Bayesian networks and adopts copula modeling to define the conditional probabilities of the Bayesian networks. Section 3 presents drought forecasting techniques and section 4 outlines the study basin. The drought index used in this study is described in section 5 followed by the analysis of historical droughts and developing probabilistic drought forecasts. The forecast products are presented for the first-order and second-order conditional forecasts; last, section 8 provides the final remarks and conclusion.
2. Copula-based modeling of Bayesian networks
Bayesian networks are probabilistic models that describe the conditional dependencies of a set of random variables via directed acyclic graphs (DAG). A DAG represents the sequence of events in a direct ordering with no direct circuits. Similar to a Bayesian network, the directed Markov network represents the sequence of variables in a directed acyclic order. The two networks are similar in representation of dependencies; however, their difference resides in that a Markov network is generally undirected and can be cyclic. Thus, the Bayesian networks are more general representations of directed acyclic networks. A detailed description of DAG is found in Thulasiraman and Swamy (1992). The random variables () that evolve over time (e.g., streamflow or drought states) can be shown in a DAG and their probabilistic queries can be represented within a Bayesian network (or a directed acyclic Markov network). The joint probability density function of the set of random variables in vector forming a Bayesian network can be written as the product of individual density functions conditional on their parent variables (Russell and Peter 2009):
where is the subset of representing the parent variables at time , and is the length of time period over which the random variables evolve. If the dependency ordering of random variables exactly follows the temporal sequence and the parent variables of is the set of all prior variables (), then Eq. (1) becomes as follows:
This is the chain rule in the probability theory, and the conditional probabilities from Eq. (2) can be written as
An intense analytical effort is required to directly model the joint behavior of the variables in Eq. (3) and obtain the joint density functions in the right-hand side of the equation. By the help of copula functions, Eq. (3) can be decomposed to a simpler form, which is explained later on in this section.
Copulas are defined as multivariate distribution functions on the -dimensional unit cube with uniform margins on the interval [0, 1], that is, (Joe 1997; Nelsen 1999). Based on Sklar’s theorem (Sklar 1959), a multivariate distribution can be expressed by a copula as follows:
where, , denoted by in the copula definition, represents the marginal distribution of the ith variable, and is the copula cumulative distribution function. A copula satisfies the “boundary” and “increasing” conditions, which is expressed as follows in the case of bivariate copulas:
If is absolutely continuous, the copula density is written as follows:
and the joint density function can be decomposed using the copula density function:
Hence, Eq. (3) can be rewritten as
where is the product operator, and is the density function of a marginal distribution. To find the conditional density function of Eq. (8), it is necessary to model the joint behavior of the variables by copula functions. This would take less effort than direct modeling of the joint density functions in the right-hand side of Eq. (3).
3. Drought forecasting methods
Similar to many other hydrologic states, the drought status of a location at a particular time is affected by its earlier status to some extent. Owing to the direct influence of streamflow volume on hydrologic droughts, the accurate modeling of future streamflow is essential to estimate the most probable status of future droughts. Upon the correlation of streamflow at a given time with a limited extent of its past observations, the Bayesian framework as discussed in the previous section can be utilized to reflect the sequential behavior of drought conditions within a probabilistic analysis on streamflow variable. Assuming the correlation with a lag time equal to , Eq. (8) will give us the conditional density function of streamflow at time given all the past observations. If , streamflow at any time would be only conditional on its previous value; and the conditional density function [Eq. (8)] would be simplified as follows:
If streamflow dependency is reasonably extended to its two previous time steps, Eq. (8) turns into the following form:
This study aims at forecasting seasonal droughts conditional on the drought status of the earlier seasons with highest correlations. As shown later for the study basin, the target season is correlated the most to its previous season; however, its correlation to the next earlier season (two prior seasons) is not insignificant enough to ignore. Thus, Eq. (9) and (10) are applied in the remainder of this paper to practice the probabilistic drought analysis of the target season given the drought status of either one or two earlier seasons.
To provide an operational insight toward the developed forecast methodology, the presented technique is compared with ESP (Twedt et al. 1977; Day 1985), having been used by various operational hydrological forecast centers including the NWS. ESPs are generated to characterize the uncertainty of hydrological forecasts. They use the observed meteorology in a historical time period to reflect on the unseen future climate. To generate the more reliable ESPs, the climate data might be weighted such that it reflects the similarity of the current climate condition to those of the historical data (Najafi et al. 2012). ESP can also be generated by driving the hydrologic model with the climate forecast. To predict the seasonal flow and generate the ESPs, this study uses the Precipitation-Runoff Modeling System (PRMS; Leavesley et al. 1983), which is a distributed-parameter watershed model developed by the U.S. Geological Survey (USGS) to simulate the effects of various combinations of climate, land use, soil type, etc. on the hydrologic response of the watersheds. The watershed model is driven by the historical climatology during the spinup period before the forecast date. Beginning from the forecast date, the model is forced by the resampled historical meteorology to produce an ensemble of hydrologic forecasts. Since the resampled climate data reasonably reflects the uncertainty of the unseen future meteorology, the generated ESP is assumed to properly model the uncertainty of future hydrology caused by unknown climatology. To implement the retrospective forecast of drought events in the Gunnison River basin, the duration of 1980–90 is taken as the spinup period and the ESPs are generated starting from 1 January of each year during 1990–2011 with a lead time of 6 months. Therefore, each ESP is built for the period of January–June of each year, whereas the PRMS is driven with the resampled meteorology of January–June from each year in the entire period (1980–2011)—except the year that the ESP is generated for.
4. Case study and data
The Gunnison River basin is one of the headwater subbasins of the Colorado River basin, located in the southwestern United States (Fig. 1). The Colorado River basin, with an approximate drainage area of 640 000 km2, is divided into upper and lower portions, and encompasses parts of seven states: Wyoming, Colorado, Utah, Nevada, California, New Mexico, and Arizona. The Gunnison River basin includes seven subbasins with a total drainage area of 5400 km2 at the conjunction of two upstream reaches: the Tomichi Creek and the Gunnison. Streamflow observations of the upstream reaches immediately before the basin outlet at USGS 09119000 (Tomichi Creek River) and USGS 09114500 (Gunnison River) are accumulated to use as the basin’s total outflow at any time. Streamflow record during 1950–90 is contributed to establish and train the forecast model and the rest (1990–2011) is used for verification purposes.
The Gunnison River basin is a snowmelt dominated watershed. According to the drought summary by Western Water Assessment (WWA) and National Integrated Drought Information System (NIDIS) released in July 2012 (WWA and NIDIS 2012), depletion of the snowpack and the early meltout in the spring of 2012 caused the below-average flow in April–July of 2012. During the past 118 yr, 2012 was the second warmest year on record in the state of Colorado. Regarding the inflows to Lake Powell, which reflects the runoff of the entire upper Colorado River basin, the water year 2012 was the fourth driest year in the past century. The intense drought of 2012 (comparable with the drought of 2002 across the region) in the states of Colorado, Utah, and Wyoming caused insufficient water supply, poor pasture and crop conditions, and regionwide wildfires. However, in general, the region has been undergoing various droughts since 2000, with the most intense drought occurring in 2002. Thus accurate forecast of future droughts is significant for reliable planning and management of available water resources across this area.
5. Probabilistic drought forecasting
a. Standardized streamflow index (SSI)
Droughts may occur in different phases of hydrologic cycle. Water movement through the hydrologic cycle is generally slow phenomenon, except for quick mass-transfer events like sudden storms. It happens that in a specific time window some hydrologic variables (e.g., soil moisture) experience a level of drought while some others (e.g., streamflow or water availability) do not undergo any identifiable drought categories. Therefore, drought status would appear differently upon the target hydrologic variable. To assess the drought status of a region, many drought indicators have been developed each using different hydrologic variables. The Palmer drought severity index (PDSI; Palmer 1965), SPI (McKee et al. 1993), crop moisture index (CMI; Palmer 1968), surface water supply index (SWSI; Shafer and Dezman 1982), and vegetation condition index (VCI; Liu and Kogan 1996) are among the most applied indices to characterize different drought types. Though all of these indices are widely used, each one focuses on particular hydrologic variables and has its own specific strengths and weaknesses.
This study adopts the definition of the meteorological drought index, SPI, for streamflow variable (SSI) to characterize the hydrological droughts of Gunnison River basin. In drought studies, the indicators defined similar to SPI are generally called standardized indices (SI), which are able to capture the anomalies from the average moisture status of a region regarding the drought variable in use. An SI may utilize hydrologic variables other than precipitation (as in SPI) such as streamflow, snowpack, soil moisture, etc. (McKee et al. 1993). The severity of droughts characterized by SI is usually identified by the U.S. Drought Monitor classification scheme as summarized in Table 1. The five drought categories (D0–D4) in dry periods are defined upon certain probability thresholds. The more severe droughts are associated with less probable categories (e.g., D4). Indeed, the SI drought indicators are normal variates, and hence the smaller values of drought indicator are in accordance with more severe and less probable droughts. The SSI = −0.5 is a threshold to separate the dry periods from the wet (normal) periods; however, the consistent variation in available water causes a dynamic transition either between dry and wet periods or within various droughts in a dry spell. To calculate the SSI as defined in this study, the monthly flow volumes () are aggregated starting from month m of the year y for the time window of length k (). Then, the marginal CDF of the aggregated flows is obtained as , which is further used to transform the aggregated flow to the standardized normal variable. Hence, the SSI is the inverse normal of :
According to Eq. (11), separate distributions fit the aggregated flows with different starting months. Therefore, this definition of SSI preserves the seasonality effect; otherwise, the seasonal flow pattern would be disregarded if the marginal distribution fitted to the entire series of the aggregated flows.
b. Analyses of historical droughts
According to Eq. (11), probabilistic drought analysis starts by fitting the marginal distributions to aggregated flows. For the seasonal drought forecast, monthly flow volumes are aggregated starting from each month over a sequence of 3 months (), and then a set of distributions is tested to find the best margin fitted to the aggregated flows . The following seven distributions are considered in this study: gamma, generalized extreme value (GEV), lognormal, Gaussian, Weibull, Gumbel, and exponential distributions. The method of maximum likelihood estimation (MLE) is used to estimate the parameters of each distribution. To find the best distribution fitted the seasonal flows, the Kolmogorov–Smirnov (K–S; Kolmogorov 1933; Massey 1951) test and the Akaike information criterion (AIC; Akaike 1974) test are applied. The K–S test evaluates the appropriateness of a particular distribution fitting a given dataset, while the AIC test can find the best alternative in a group of distributions. Hence, neither one is conclusive by itself to find the best choice in the group. The appropriateness of a distribution should be first accepted by the K–S test. The K–S test returns the p value, which should be greater than the significance level of to accept the null hypothesis. Under the null hypothesis, the dataset is assumed to come from the reference distribution. If the goodness of a particular distribution is approved by the K–S test, then its superiority to other alternative distributions is evaluated by the AIC test, where the distribution with the smallest AIC value is assumed to be the best choice among others. Table 2 summarizes the AIC and the p value associated with the K–S test for different distributions fitted to the seasonal flow volumes in the training period of 1950–90. The best distributions, with the smallest AIC and the p values greater than the significance level , are shown in bold. Either gamma or lognormal distribution is found to be the best fit to the seasonal flow volumes. Figure 2 illustrates the marginal distributions against the histogram of the seasonal flow volumes ().
After fitting the marginal distributions, a backward calculation in the SSI formula [Eq. (11)] would give us the range of seasonal flow within each drought category used by the U.S. Drought Monitor (Table 1). The bar chart in Fig. 3 shows the range of flow volume within each drought category D0–D4. From this figure, the seasonal pattern is evidently captured as expected from the SSI definition in Eq. (11). The high-flow season shows more variability in the required amount of water to transition to a different category. In other words, the same class of drought is likely to persist within a high-flow season (e.g., the spring) if the seasonal flow does not change significantly. Moreover, while a particular seasonal flow (e.g., 90 KAF) is defined as an absolutely wet condition in winter (January–March or February–April periods), the same amount of flow might lead to an exceptional drought condition (D4) in spring (e.g., April–June and May–July periods). This is the seasonality issue reflected in drought definitions that identifies the high dependency of drought status on the time of the year.
Using the marginal distributions found upon the aggregated flow during the training period (1950–90), the SSI with k = 3 [Eq. (11)] is calculated for the entire analysis period (1950–2011) as plotted in Fig. 4. Using the U.S. Drought Monitor categories shown in Table 1, is taken as a threshold to separate the dry and wet conditions, and the shaded areas in Fig. 4 illustrate the dry periods with . Figure 4 shows that the Gunnison River basin has been exposed to various droughts since 1950. During 61 yr of the analysis period, the drought of 2002 was the most severe “exceptional” drought (D4; ). Drought persistency is also obvious in Fig. 4. Evidently, several droughts frequently occurred in 1950s and 1960s; and the drought of 2000 continued for 5 yr in spite of the earlier long wet period from 1995 to 2000. To easily follow the temporal sequence of seasonal droughts in Fig. 4, the matrix plot in Fig. 5 shows the status of dry spells for each season. Each cell of the matrix represents the drought status of a particular season in the year, and as shown, several dry periods occurred during the 1950s and 1960s, especially in the falls and winters (October–December and January–March). General evidence of the matrix plot is that the droughts have been more evenly distributed in the springs and summers than in the falls and winters of the analysis period.
c. Products of probabilistic drought forecast
For the forecast phase, this study selects the spring (April–June) as the target season. To determine the predictor seasons in conditional drought forecasting, the correlation between the margin of spring flow and the margins of prior seasons should be obtained (Fig. 6). Given the correlations in Fig. 6, the transformed spring flow (found from the marginal distribution of spring flow) has the highest correlation with the transformed winter flow (January–March) among the other seasons of the year. Yet the correlation between the seasonal flow of spring and fall or even summer is not insignificant, and the analysis of spring flow should be established upon either one season (winter) or two seasons (winter and fall) earlier. Therefore, two different Bayesian networks are applied to study the conditional probabilities of spring flow. In the first network, the spring flow is assumed to be only dependent on the winter flow and hence the drought forecasting is conducted within Eq. (9). The second network adds the impact of fall status on spring drought and it considers the seasonal flow of both past winter and fall in drought forecasting of spring season [Eq. (10)]. However, given the greater correlation between the spring and winter flow, it is expected that the winter influences the spring flow more than the fall.
1) First-order conditional forecasts
In the first-order conditional forecast, the spring drought is assumed to depend only on its past winter flow. Thus, Eq. (9) applies to drought forecasting where and represent the seasonal flow of the spring and winter, respectively, and denotes the corresponding probabilities from associated marginal distributions. To implement the probabilistic forecast analysis, an appropriate bivariate copula function should join the marginal distributions of the spring and winter seasonal flow. The best marginal distributions fitted the observed flow of each season during the training period (1950–90) are the same as those found earlier. Among various copula families, the Archimedean and elliptical families (Embrechts et al. 2003; Nelsen 1999) are usually used in hydrological applications. The Clayton and Gumbel copulas, from the Archimedean family, and the Gaussian and t copulas, from the elliptical family, are tested with parameters estimated by the inference function for margins (IFM) method (Joe 1997). Copula functions, and associated parameter estimation methods, are described in detail in the above-mentioned references. To select the appropriate copula among several alternatives, a visual inspection between the empirical and parametric copulas might be helpful (Madadgar and Moradkhani 2011); however, there are several goodness-of-fit (GOF) tests (Genest et al. 2009) that are more analytically conclusive in copula selection. The parametric bootstrapping GOF test, introduced by Genest and Rémillard (2008), is employed in this study to select the best copula fitted to the dataset. This GOF test calculates the Cramér–von Mises (Anderson 1962) statistic as a measure of distance between the empirical and parametric copulas:
where is the Cramér–von Mises statistic, and and are, respectively, the empirical and parametric copulas fitted to the data with the size of . The Monte Carlo approach is used for bootstrap sampling procedure to find the p value of the test (Genest et al. 2009). If the p value is greater than a predefined significance level the null hypothesis () is accepted; otherwise, it is rejected and the parametric copula would not be a good choice with respect to the significance level of . In a group of copulas, the one with the smallest and the greatest p value (greater than the significance level as well) is selected as the best copula. In this study, the significance level is set to and the p values are obtained by the parametric bootstrapping procedure with 1000 replications. As summarized in Table 3, the Gaussian copula with the smallest and the greatest p value is the best choice among others.
After picking the best-fitted copula for the data, several probabilistic analyses of drought status in spring season can be conducted using Eq. (9). In such an analysis, one might be interested in spring-flow distribution conditional upon a given winter drought status in winter. In this case, the winter drought conditions might be fixed at the very transition point of drought categories, where a drought status turns into another. Figure 7 shows the distribution of spring flow (April–June) conditional on the drought status in winter (January–March). Each curve represents the probability distribution function (PDF) associated with a particular drought status in winter (D0–D4). The analysis is narrowed down to the fixed drought conditions in winter. According to the drought PDFs, as the winter flow increases, which is equivalent to less intense drought in winter, the spring drought is expected to be less intense as well. For example, if a D4 drought occurred in winter, the spring drought status is likely to be more intense than if a D0 winter drought occurred. Moreover, when an intense drought is experienced in winter, the distribution of spring flow is rather narrow around its mode. For instance, the PDF associated with D0 winter droughts is wider than the PDF associated with the D4 winter drought. This leads to a larger range of spring flow given the D0 drought status in winter, as compared to the D4 winter drought.
To expand the results and demonstrate the usefulness of this approach, the conditional probability of drought in spring season, given the drought condition in winter, is shown in a shaded scheme by Fig. 8. To show all the PDFs in a same plot, the probability distributions are scaled between 0 and 1; where 1 (the darkest area) represents the most probable spring flow (mode of the PDF), and hence drought condition, given the flow magnitude (i.e., hydrologic drought condition) in winter. Similarly, the areas with lighter shade represent the tails of the spring flow PDF given the winter flow. Therefore, the darker areas closely surround the mode of the PDFs; hence the associated range of spring flow is more likely to happen if the given winter flow is observed. The scatterplot of the spring flow against the winter flow during the entire analysis period, 1950–2011, is also shown in the x–y range of Fig. 8. The scatterplot is showing both training (1950–90) and validation (1990–2011) periods. As seen, the dark area (most likely situations given the winter flow) captures almost the entire scatterplot, which approves the reliable performance of the forecast methodology in both training and validation phases. The range of seasonal flows is split by the dashed lines, for both winter and spring, to illustrate the range of seasonal flows corresponding to various drought conditions, in each season. Figure 8 helps to find out the most likely drought status in spring, given the winter flow, by simply looking at the dark regions. Given D4 drought in winter, for instance, the spring drought status is most likely to be D4, D3, or D2 depending on the exact value of the winter flow. As another example, if a D2 drought occurs in winter, the spring drought status will likely be located in either D1 or D0 spans. Visual inspection of Fig. 7 also verifies that the PDF of the spring flow is narrower within the D1 and D0 spans, given the winter drought of type D2. Hence, Fig. 7 is a limited version of Fig. 8 where only a few conditional PDFs are illustrated.
To compare the forecast results of the presented method with ESP, Fig. 9 is developed showing the 90% predictive uncertainty bound of the retrospective forecasts within the validation period (1990–2011). As seen, there are only few spring seasons with drought conditions during 1990–2011. The uncertainty bound for the copula method is the limited representation of Fig. 8 where, given any winter flow, only the 5% and 95% bounds of the PDFs are shown. Unlike the copula-based forecast, the uncertainty bound of the ESP forecasts do not change smoothly as the winter flow increases. The uncertainty bound gradually expands in the copula-based forecasts, while it does not follow a particular trend for the ESP approach. To ensure the validity of the generated ESPs, the PRMS simulations are plotted versus the observations in Fig. 10. As seen, PRMS performs quite reasonably in simulating the 3-monthly flows implying that the generated ESPs (shown in Fig. 9) are reliable to compare with copula-based forecasts. According to Fig. 9, the significance of the copula-based forecast model over the ESP approach is that the generated uncertainty bound is reasonably large to encompass the observations showing a discernible trend against the increase in winter flow. Moreover, according to the expected values of forecasts, the performance of copula-based method is seen to be slightly better than the ESP forecasts.
While the conditional PDFs of spring flow, given the winter flow, are shown in Figs. 7 and 8, the probability of various droughts occurring in spring is not evaluated to this extent. In this study, the following expression is used to analyze the probability of spring droughts:
where is the spring flow causing a drought status. Equation (13) gives the probability of spring flow exceeding the thresholds defined for a particular drought status (), while the winter flow is observed (). Figure 11 shows the exceedance probability of spring flow () given the winter flow (). For instance, if the flow magnitude of 35 KAF (a D2 drought) occurs in winter, the probability of spring drought having a D0 or wetter status would be 0.44. Likewise, this probability would be equal to 0.57, 0.77, 0.85, and 0.93 for D1, D2, D3, and D4 drought conditions. The curve for the D0 drought (the lowest one) gives the probability of spring flow leading to a D0 or wetter condition. Hence, this plot is useful in drought mitigation planning and decision making where the probability of dry-period termination is of interest. The probability of terminating the dry periods with severities other than D0 (D4, D3, D2, or D1) can be also obtained using Fig. 11. Moreover, as either curve approaches 1, the difference between the exceedance probabilities decreases. This is in agreement with the fact that less-dry springs are anticipated when the winter flow increases, presuming that the correlation of spring flow with earlier seasonal flows dramatically decreases beyond the previous winter.
2) Second-order conditional forecasts
Given the correlation of spring flow to the past winter and fall seasons (Fig. 6), the spring droughts might be analyzed using the second-order conditional probabilities [Eq. (10)]. In Eq. (10), the random variables , , and denote the seasonal flow of spring, winter, and fall, respectively, with the corresponding probabilities of . The best copula to join the marginal distributions (, , and ) is found among multiple candidates as described in previous section. According to the Cramér–von Mises statistics [Eq. (12)], the best alternative to connect all three marginal variables is the trivariate Gaussian copula (Table 3). It is also necessary to find a bivariate copula to join the marginal distributions of the winter and fall flows ( and ). The results of GOF test indicate that the Gaussian copula is a suitable choice for the bivariate copula as well. The parameters of copulas are found within the same training period as used before (1950–90).
Equation (10) returns the second-order conditional PDF of spring flow given the winter and fall observations. Figure 12 shows the conditional PDFs of spring flow given various droughts in the past seasons. In Fig. 12a, the fall drought is fixed at D4 and the winter drought varies from D0 to D4. Similar to Fig. 7, the modes of PDFs move toward smaller spring flows as the winter drought becomes more intense. The very little difference between Fig. 12a and Fig. 7 indicates that the fall status does not have a noticeable impact on the spring drought. This fact is approved with Fig. 12b as well where, for a given winter drought (D4), even a big change in fall status (D0 to D4) does not make a significant change in spring status. As seen, the PDFs in Fig. 12b are clustered together and are quite similar to the PDF associated with D4 drought in Figs. 7 and 12a. This is evidence of a high influence of winter status on the spring drought. In Figs. 12a,b, all the PDFs are conditional on a particular drought (D4) fixed for either fall or winter. Figure 12c displays the effect of any possible variation in winter and fall status on the spring drought. The PDFs are associated with two situations (D0 and D4) for either fall or winter. While the fall status is fixed at D0 (circle markers), the spring drought would change rather significantly upon the magnitude of change in winter status. In contrast, if the winter status is fixed at a particular drought, for example, D0 (solid markers), the spring status does not considerably change with even a big change in fall status. Hence, the spring drought is found to be more sensitive to the winter status than the fall status, which approves practicing the reduced-dimension form of the forecast model (denoted as the first-order conditional forecast).
Another outlook to the spring drought would be the exceedance probability of spring flow given the winter and fall observations. This is equivalent to the probabilities developed in Fig. 11, with the exception that both fall and winter flows are used and therefore the 2D plot of Fig. 11 turns into a 3D plot (Fig. 13). Given the seasonal flow of winter and fall, Fig. 13 illustrates the probability of spring drought being equally wet or wetter than a particular drought status. For example, the probability layer associated with D0 (the lowest one) gives the probability of spring flow exceeding the threshold for D0 drought. Similar to Fig. 11, as the winter or fall flow increases, the probability layers approach 1 and get close together. This is a valid observation since the high-flow winters and falls are usually followed by wet rather than dry springs. Furthermore, for a particular drought status, the probabilities change more quickly with winter-flow variations than fall-flow variations. Given a particular winter flow and variable fall flow, the probabilities do not change as significantly as if the fall flow was fixed and the winter flow varied. This is consistent with the outcomes of Fig. 12 where the winter flow was found to have stronger impact on the spring status than the fall flow.
The seasonal flow hindcast during the validation period (1990–2011) is shown in Fig. 14. This plot is the 3D version of Fig. 9, where the conditional forecast of spring flow is upon the past winter and fall seasonal flows. The mesh grids show the uncertainty bound of spring flow generated by copula model. Similar to Fig. 9, the uncertainty bound associated with the copula model gradually expands as the winter and/or fall flow increases and it also captures the observed spring flows during the validation period. Furthermore, the variation of the uncertainty bound is more dependent on the variation of winter flow than the fall flow, as earlier approved by Figs. 12 and 13. This is the reason that the hindcasts of Figs. 9 and 14 are very similar.
6. Summary and conclusions
Hydrologic droughts across the Gunnison River basin, a subbasin of the upper Colorado River basin, were analyzed, and the probabilistic forecasts of seasonal droughts were obtained. The standardized streamflow index (SSI) was used to characterize drought events and was also adopted to address the seasonality effects in drought analysis. More fall and winter droughts occurred in the 1950s and 1960s, while the spring and summer droughts were distributed more evenly during the historical period (1950–2011). In the forecast phase, the Bayesian networks contributed to reflect the autocorrelation in the streamflow record and develop the conditional probabilities. An autocorrelation analysis of seasonal flows indicated that the spring flow (forecast variable) has the highest autocorrelation with the previous winter and the second highest correlation with the previous fall. A forecast model reflecting the dependency of seasonal flows was then developed by a family of multivariate distribution functions, the so-called copulas. Copulas are distribution functions that join several variables with different levels of correlation and dependence structures. Owing to the autocorrelation results, two different forecast models were established to obtain the first-order and second-order conditional probabilities of the spring flow. In the former (latter) case, the spring flow was estimated upon the observation of previous winter (winter and fall). The forecast products were presented as the conditional PDF and CDF of spring droughts.
In the first-order conditional model, it was seen that for a given drought status in winter, a less-severe drought is anticipated in spring. In addition, the probabilistic forecasts provided a tool to estimate the most probable status in future (here, spring), while the past status is known (e.g., winter). The probability of terminating the winter drought in the upcoming spring was also estimated for different drought states. A dry spell is less likely in spring if the winter experiences either normal or wet condition. In the second-order conditional model, the fall status displayed very slight impact on spring drought, which was in agreement with the results of autocorrelation analysis. For a given winter drought, even a big change in fall status could not make a significant change in spring status. As seen from the conditional CDFs, the 3D probability layers of spring flow approached 1 and got close together as the winter or fall flow were increasing. This fact indicated that the high-flow winters and falls are usually followed by wet rather than dry springs. According to forecasts, the spring drought was found to be more sensitive to the winter status than the fall status, which approves practicing the reduced-dimension form of the forecast model (denoted as the first-order conditional forecast).
The drought forecasts using the presented methodology were also compared with the conventional ESP forecasts. The ESP retrospective forecasts, starting from 1 January of each year from 1990 to 2011, were produced using the PRMS watershed model. The 90% predictive uncertainty bound of the spring drought forecasts generated by the copula-based method changed smoothly as the given winter and/or fall flow varied gradually. The uncertainty bound made by ESP forecasts did not follow any discernible trend, whereas copula-based forecasts showed a particular trend. The uncertainty bound slowly expands as the winter and or fall flow increases; however, the change is again much smaller in respect with fall flow than the winter flow.
The proposed probabilistic approach helps the decision makers to develop drought mitigation plans and policies with an appropriate insight toward the future drought status. Given the basics of copula modeling, the proposed methodology shows promising results for adequately correlated and dependent variables. However, correlation and dependency are the essential components of the conditional forecast methodologies, whereas without dependent variables, the conditional forecast would not be meaningful. According to the potential of the proposed methodology in probabilistic forecast of the future drought status, the next attempt will study the spatial variability of future drought across the region. Reported results in this study are based upon the streamflow record at the basin outlet; however, the spatial variability of dry spells throughout a region using this method may be of interest and this is subject to future studies. An alternative probabilistic approach to the presented method is the use of data assimilation where the uncertainties from observation, initial condition, and model can be integrated into the forecasting framework (Moradkhani et al. 2012; DeChant and Moradkhani 2012; ,Parrish et al. 2012).
Partial financial support for this study was provided by NOAA-MAPP Grant NA110AR4310140.
This article is included in the Advancing Drought Monitoring and Prediction Special Collection.