1. Introduction
Reliable monthly-to-seasonal streamflow forecasts can significantly improve the management and planning of water resources systems (Hamlet et al. 2002). Hydrologists commonly provide deterministic forecasts that estimate the volume of streamflow for a month or a season ahead; water managers, however, are often more interested in categorical and probabilistic streamflow forecasts that quantify the probability of occurrence of predefined events, such as below-normal or above-normal streamflows (Krzysztofowicz 2001; Ahmadisharaf et al. 2016). Streamflow forecasts are typically developed using either dynamical or statistical modeling. In the dynamical approach, downscaled climate forecasts from general circulation models (GCMs), such as precipitation and temperature forecasts, are used to force physical hydrologic models (e.g., lumped or distributed rainfall–runoff models) to develop streamflow predictions and other terrestrial hydrologic fluxes (Vrugt et al. 2006; Lohmann et al. 2004; Sinha and Sankarasubramanian 2013). To develop ensemble streamflow forecasts, the hydrologic models can be fed with an ensemble of forcings and/or perturbed initial hydrologic conditions (ICs). Another approach is to pool forecasts from multiple models into a multimodel ensemble that reduces model uncertainty (Ajami et al. 2006; Li and Sankarasubramanian 2012; Devineni et al. 2008).
Statistical forecasting is based on developing a statistical relationship between the predictand and relevant climatic predictors and/or observations available prior to the forecast period, such as initial soil moisture and streamflow conditions. Statistical models, such as principal component regression (PCR), commonly assume normality of predictands and linearity between predictors and predictands (Hsu et al. 1995; Garen 1992; Sankarasubramanian et al. 2008). However, it is well known that the relationship between precipitation and runoff is nonlinear (Jakeman et al. 1993; Sankarasubramanian and Vogel 2003). Over the years, considerable progress has been made in incorporating the influence of sea surface temperature (SST) anomalies, such as those associated with El Niño–Southern Oscillation (ENSO), into seasonal precipitation and streamflow forecasting, particularly in the tropics and subtropics (Ropelewski and Halpert 1986; Tootle et al. 2005; Hamlet and Lettenmaier 1999). Statistical models can also exploit the natural persistence of streamflow, using information on past streamflow conditions to improve forecasting skill. Previous studies have employed various statistical techniques, including parametric, semiparametric, and resampling methods [e.g., K-nearest neighbors (KNN) and model output statistics (MOS)], to develop probabilistic streamflow forecasts over a specific watershed or across a region (Grantz et al. 2005; Clark and Hay 2004).
In general, the statistical approach is simpler and computationally cheaper than the dynamical approach. In dynamical modeling, hydrologic models (e.g., land surface models) require climatic forcings at finer spatiotemporal resolutions than those of the climate forecasts issued from GCMs. Addressing this resolution mismatch is a challenge in itself and requires spatial downscaling and/or temporal disaggregation of GCM outputs before they can force hydrologic models (Wood et al. 2002; Wood and Lettenmaier 2006; Yuan et al. 2011). Both downscaling and disaggregation introduce errors into the climatic forcings and, subsequently, into the hydrologic products, thereby significantly affecting the reliability of streamflow forecasts; the magnitude of these errors depends largely on the location and the forecasting season (Mazrooei et al. 2015; Sinha et al. 2014; Seo et al. 2016).
Climate forecasts, which are available at large spatial scales, are typically issued as ensembles that quantify the uncertainty due to initial conditions (Goddard et al. 2003; Doblas-Reyes et al. 2005). Various studies have focused on reducing uncertainty in climate forecasts by combining multiple models (Barnston et al. 2003; Weigel et al. 2008) and on strategies for generating ensembles that quantify uncertainty in atmospheric conditions (Kumar et al. 2001; Li et al. 2008) and hydrologic states (Shukla and Lettenmaier 2011). In statistical downscaling, however, most studies use only the ensemble mean of the climate forecasts to develop streamflow forecasts, thereby ignoring the probabilistic information in the ensemble (Sinha and Sankarasubramanian 2013). The few studies that recognized the significance of this probabilistic information used the ensemble spread, or a subset of skillful ensemble members, to gain accuracy in the forecast products (Wilks and Hamill 2007; Regonda et al. 2006).
The main intent of this study is to evaluate the value of ensemble climate forecasts in developing categorical streamflow forecasts through statistical downscaling. We propose multinomial logistic regression (MLR) for downscaling the probabilistic information in large-scale ensemble climate forecasts to the basin scale and utilizing it for probabilistic streamflow forecasting. Unlike binary logistic regression, which handles only two categories of events, MLR can develop categorical probabilities for multiple predefined outcomes. Another advantage of MLR is that the model inputs can include both probabilistic data, such as probabilities derived from climate ensembles, and deterministic data, such as the initial land surface conditions of a catchment. In this study, we employed precipitation forecasts from the ECHAM4.5 GCM, along with past streamflow observations, to build an MLR model that develops 1-month-ahead categorical streamflow forecasts over six river basins spanning various hydroclimatic regimes in the U.S. Sun Belt. The performance of the MLR model was then compared with that of the commonly used PCR model to identify the added value of the probabilistic information in the ensemble climate forecasts for improving streamflow forecasts. Results from our experiments address the following questions associated with categorical streamflow forecasting:
How does the MLR model performance compare to the PCR model in developing probabilistic categorical streamflow forecasts during different seasons?
What is the information added to the categorical streamflow forecasts developed using the MLR model?
Is there any relationship between the skill of the MLR model and river basins’ characteristics and their regimes?
What is the role of climate forecast ensemble size in determining the skill of probabilistic streamflow forecasts?
This manuscript is organized as follows. Section 2 provides information about the selected river basins and the hydroclimatic data used in this study. Section 3 details the MLR and PCR experimental setup and verification metrics used for model evaluations. Section 4 presents the results, and section 5 summarizes the findings and conclusions from the study.
2. Study area and data
a. Study area
In this study, we consider six river basins that fall under arid, semiarid, or humid hydroclimatic regimes across the U.S. Sun Belt (Fig. 1). The Sun Belt is defined as the region south of 37°N that has short and mild winters and extended summers. Table 1 presents detailed information about the selected river basins and their streamflow gauge stations.

Map of the United States with the Sun Belt specified as the red shaded area as well as the locations of the selected river basins and their considered USGS gauging stations.
Citation: Journal of Hydrometeorology 18, 11; 10.1175/JHM-D-17-0021.1
USGS gauge site characteristics.


b. Streamflow data
Streamflow data are obtained from the U.S. Geological Survey’s (USGS) Hydro-Climatic Data Network (HCDN; Slack et al. 1993), which consists of streamflow gauging stations with minimal anthropogenic impacts from upstream reservoir operations, land use changes, and groundwater pumping. This dataset encompasses daily, monthly, and annual mean discharge values for 1659 sites across the United States. In this study, we used monthly streamflow records of the selected river basins during the period of 46 years from 1957 to 2002 (HCDN gauge numbers are shown in Fig. 1).
c. Climate forecasts
Precipitation forecasts from the ECHAM4.5 GCM were obtained from the International Research Institute for Climate and Society (IRI) data library (Li and Goddard 2005). The ECHAM model was developed at the Max Planck Institute and is currently used in real-time climate forecasting (Roeckner et al. 1992). We considered ECHAM4.5 because it provides a long period of retrospective monthly precipitation forecasts, and studies have shown that ECHAM4.5 precipitation forecasts can provide reliable skill in streamflow forecasting (Mazrooei et al. 2015; Sinha et al. 2014). We used both ECHAM4.5 monthly simulations and monthly forecasts, both available at 2.8° × 2.8° resolution. In the simulation scheme, observed SSTs force the ECHAM4.5 GCM, and the simulated ensemble of monthly precipitation contains 85 members. In the forecast scheme, the climate model is forced with updated SST forecasts, developed using the constructed analog SST (CA-SST; van den Dool 1994); 24 ensemble members are available from 1957 to present. We included the ECHAM4.5 simulations because their larger ensemble allows us to evaluate whether increased ensemble size provides additional information for developing probabilistic categorical streamflow forecasts. We obtained 1-month-ahead precipitation data from both ECHAM4.5 simulations and forecasts for the 46-yr period 1957–2002.
3. Methodology
For each river basin, we selected the grid point of ECHAM4.5 precipitation forecasts that had the highest correlation with the observed streamflow among all the grid points that overlay or neighbor the basin boundary. The precipitation forecasts from the selected grid point and the observed monthly streamflow prior to the forecasting time step were considered as the two predictors of our modeling framework; the observed streamflow in that forecasting month was considered the predictand variable. Two models, MLR and PCR, were considered to develop streamflow forecasts. The models were evaluated based on leave-5-out cross validation and split-sample validation to develop probabilistic streamflow tercile forecasts. This section provides more details about both models and the validation techniques.
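The grid-point selection step can be sketched as follows; this is a minimal illustration with synthetic data, and the function name and array layout are ours, not the study's code:

```python
import numpy as np

def select_grid_point(precip_grids, obs_flow):
    """Pick the GCM grid point whose precipitation series correlates best
    with observed streamflow.

    precip_grids : (n_months, n_grid) array of candidate grid-point series
                   (cells overlying or neighboring the basin boundary).
    obs_flow     : (n_months,) array of observed monthly streamflow.
    """
    corrs = np.array([np.corrcoef(precip_grids[:, g], obs_flow)[0, 1]
                      for g in range(precip_grids.shape[1])])
    best = int(np.argmax(corrs))
    return best, corrs[best]

# toy example: four candidate cells, one of which carries a real signal
rng = np.random.default_rng(0)
flow = rng.gamma(2.0, 50.0, size=120)
grids = rng.normal(size=(120, 4))
grids[:, 2] += 0.8 * (flow - flow.mean()) / flow.std()  # cell 2 is informative
idx, r = select_grid_point(grids, flow)
```

In practice the same selection would be repeated per basin, using the observed record over the calibration period only.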
a. Candidate models
1) PCR
The climate predictability tool (CPT) from IRI (Mason and Tippett 2016) was employed to develop the PCR model. For a given month, the predictor variables for the PCR model include the ensemble mean of the precipitation forecasts and the observed streamflow from the month prior to the forecast.
2) MLR
Multinomial logistic regression is an extension of binary logistic regression that enables the model to develop probabilistic predictions for multiple categories and outcomes. MLR is also capable of accepting inputs of mixed data types, such as probabilities and deterministic variables. To feed MLR with the probabilistic information from the ensemble of climate forecasts, we quantified the probabilities of below-normal (BN), normal (N), and above-normal (AN) precipitation from the forecast ensemble in two ways:
Bin counting: For each time step, the numbers of ensemble members falling below the GCM climatological 33rd percentile, between the 33rd and 67th percentiles, and above the 67th percentile were counted and divided by the total number of members (either 24 or 85, depending on the source of the precipitation forecasts) to obtain the tercile probabilities of the precipitation forecast. The MLR model that uses this type of probabilistic precipitation predictor is termed MLR1.
Distribution fitting: For each time step, a lognormal distribution was fit to the precipitation ensemble, and the fitted CDF was evaluated at the model climatological 33rd and 67th percentiles to obtain the tercile probabilities. The MLR model fed with this type of predictor is called MLR2.
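The two probability-extraction schemes can be sketched as follows (a minimal illustration; the function names and toy thresholds are ours, and SciPy's `lognorm` is one way to do the fit):

```python
import numpy as np
from scipy import stats

def tercile_probs_count(ens, q33, q67):
    """MLR1-style bin counting: fraction of ensemble members in each
    tercile of the model climatology (q33, q67 are the GCM climatological
    33rd and 67th percentiles)."""
    ens = np.asarray(ens, float)
    n = ens.size
    p_bn = np.sum(ens < q33) / n
    p_an = np.sum(ens > q67) / n
    return p_bn, 1.0 - p_bn - p_an, p_an

def tercile_probs_lognorm(ens, q33, q67):
    """MLR2-style distribution fitting: fit a lognormal to the ensemble
    and evaluate its CDF at the climatological tercile thresholds."""
    shape, loc, scale = stats.lognorm.fit(np.asarray(ens, float), floc=0)
    cdf33 = stats.lognorm.cdf(q33, shape, loc, scale)
    cdf67 = stats.lognorm.cdf(q67, shape, loc, scale)
    return cdf33, cdf67 - cdf33, 1.0 - cdf67
```

Either routine turns one month's precipitation ensemble into a (BN, N, AN) probability triplet that can be passed to the MLR model as a predictor.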
Given the predictor vector x = (P_BN, P_AN, Q), where P_BN and P_AN are the tercile probabilities of the precipitation forecasts and Q is the observed streamflow in the month prior to the forecast, the MLR model expresses the log odds of the BN and AN streamflow categories relative to the reference (N) category as linear functions of the predictors:

ln[Pr(y = BN)/Pr(y = N)] = α_BN + β_BN · x,
ln[Pr(y = AN)/Pr(y = N)] = α_AN + β_AN · x,

so that the categorical probabilities are

Pr(y = k) = exp(α_k + β_k · x) / [1 + Σ_j exp(α_j + β_j · x)],  j ∈ {BN, AN},

with Pr(y = N) = 1 − Pr(y = BN) − Pr(y = AN). The intercepts α and coefficient vectors β are estimated by maximum likelihood.
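The MLR fit described in this section can be sketched with scikit-learn's `LogisticRegression`, which fits the multinomial model by maximum likelihood; the synthetic data and variable names below are illustrative, not the study's actual inputs:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 200
p_bn = rng.uniform(0, 1, n)                 # forecast prob. of BN precipitation
p_an = (1 - p_bn) * rng.uniform(0, 1, n)    # forecast prob. of AN precipitation
q_prev = rng.gamma(2.0, 50.0, n)            # previous-month observed flow

# synthetic tercile category (0 = BN, 1 = N, 2 = AN) loosely tied to predictors
score = 1.5 * (p_an - p_bn) + 0.01 * (q_prev - q_prev.mean())
y = np.digitize(score, np.quantile(score, [1/3, 2/3]))

X = np.column_stack([p_bn, p_an, q_prev])
mlr = LogisticRegression(max_iter=1000).fit(X, y)
probs = mlr.predict_proba(X)                # one (P_BN, P_N, P_AN) row per month
```

`predict_proba` returns the three categorical probabilities directly, which is what makes the model convenient for tercile streamflow forecasting.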
b. Model calibration and validation
The MLR and PCR models were calibrated and evaluated using two different validation techniques, explained below.
1) Cross Validation
The cross-validation technique is a procedure that assesses the performance of a model by calibrating the model with a subset of the data and validating it for the left-out data (Craven and Wahba 1978). In this study, for a given month, leave-5-out cross validation is performed by removing a 5-yr window of data centered at the forecasting time step, calibrating the model based on the remaining 41 years of data, and then evaluating the model performance on the forecasting year.
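The windowed leave-out scheme can be sketched as an index generator; the function name is ours, and the window is clipped at the ends of the record:

```python
import numpy as np

def leave5out_indices(n_years):
    """For each forecast year i, drop the 5-yr window centered on i
    (clipped at the record ends) and train on the remaining years."""
    for i in range(n_years):
        window = set(range(max(0, i - 2), min(n_years, i + 3)))
        train = np.array([j for j in range(n_years) if j not in window])
        yield i, train

# with 46 years of data, an interior year leaves 41 training years
splits = dict((i, tr) for i, tr in leave5out_indices(46))
```

Each `(i, train)` pair would drive one calibration of the MLR or PCR model, with year `i` held out for verification.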
2) Split-Sample Technique
The split-sample technique, unlike the cross-validation technique, uses only a single subset of data to train the models. In this study, the first 26 years of data (1957–1982) are used as the calibration period, and the remaining 20 years (1983–2002) are used as the validation period.
Each validation technique has its own advantages and drawbacks. Cross validation stabilizes the structure of the model because it trains on multiple subsets of the data, and cross-validated forecasts typically exhibit higher skill than split-sample forecasts, since the latter leaves fewer years for calibration (Goutte 1997; Moradkhani et al. 2004). Cross validation is also beneficial when the record is short, making it impractical to set aside a large portion of the available data. The split-sample test, however, provides a more rigorous way to evaluate the model's operational forecasting skill, as it uses only past data to develop the forecasts (Klemeš 1986).
c. Forecasting skill metrics
The ranked probability score (RPS) measures the squared difference between the cumulative forecast and observed probabilities across the three categories:

RPS = Σ_{m=1}^{3} (F_m − O_m)²,

where F_m is the cumulative forecast probability through category m and O_m is the corresponding cumulative observed (0/1) probability. The ranked probability skill score compares the forecast RPS with that of the climatological forecast (1/3, 1/3, 1/3):

RPSS = 1 − RPS/RPS_clim,

so that RPSS ranges from −∞ to 1, with 1 indicating a perfect forecast and values above zero indicating skill relative to climatology.

The Brier score (BS; Brier 1950) evaluates probability forecasts of a binary event (e.g., the occurrence of a below-normal month):

BS = (1/N) Σ_{i=1}^{N} (f_i − o_i)²,

where N is the total number of forecast probabilities, f_i is the forecast probability, and o_i equals 1 if the event occurred and 0 otherwise. The BS can be decomposed into reliability (REL), resolution (RES), and uncertainty (UNC) components, BS = REL − RES + UNC, with

REL = (1/N) Σ_k n_k (f_k − ō_k)²,  RES = (1/N) Σ_k n_k (ō_k − ō)²,  UNC = ō(1 − ō),

where n_k is the number of forecasts issued with probability f_k, ō_k is the observed relative frequency of the event for those forecasts, and ō is the climatological frequency of the event. Lower REL indicates better agreement between forecast probabilities and observed frequencies, whereas higher RES indicates a greater ability to discriminate events from nonevents.
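These metrics can be computed with a few lines of NumPy; this is a minimal sketch with illustrative function names, assuming tercile forecasts are stored as (BN, N, AN) probability triplets:

```python
import numpy as np

def rpss(fcst, obs_cat, clim=(1/3, 1/3, 1/3)):
    """Ranked probability skill score for one tercile forecast.
    fcst: (p_BN, p_N, p_AN); obs_cat: 0 (BN), 1 (N), or 2 (AN)."""
    def rps(p):
        obs = np.zeros(3)
        obs[obs_cat] = 1.0
        return np.sum((np.cumsum(p) - np.cumsum(obs)) ** 2)
    return 1.0 - rps(np.asarray(fcst)) / rps(np.asarray(clim))

def brier_decomposition(f, o, n_bins=20):
    """Murphy decomposition BS = REL - RES + UNC for binary-event forecast
    probabilities f and outcomes o, using n_bins probability bins."""
    f, o = np.asarray(f, float), np.asarray(o, float)
    n, obar = f.size, o.mean()
    bins = np.minimum((f * n_bins).astype(int), n_bins - 1)
    rel = res = 0.0
    for k in range(n_bins):
        mask = bins == k
        nk = mask.sum()
        if nk:
            fk, ok = f[mask].mean(), o[mask].mean()
            rel += nk * (fk - ok) ** 2
            res += nk * (ok - obar) ** 2
    return rel / n, res / n, obar * (1 - obar)
```

With binned forecast probabilities, REL − RES + UNC recovers the Brier score up to small within-bin variance terms.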
4. Results
Based on the two MLR approaches described in section 3, we selected the model with the higher skill for discussing the results. Figure 2 compares the cross-validated RPSS between MLR and PCR for all six basins during four seasons. The RPSS metric ranges from −∞ to 1; to better visualize the variability in the plots, we limited the RPSS axes to between −1 and 1, which captures about 95% of the data points on average. The positive RPSS medians in all the box plots in Fig. 2 indicate that both the MLR and PCR forecasts perform better than climatology. In addition, the MLR model is more successful than the PCR model in almost all cases, with a higher median RPSS and interquartile range (IQR; Fig. 2). The Rio Grande basin exhibits the best forecasting performance, with more than 75% of the RPSS values from both models above zero for all seasons. In general, the MLR model delivers a wider distribution of RPSS values than the PCR model, indicating greater variability in the skill of its streamflow forecasts. MLR clearly outperforms PCR in the Guadalupe River basin during winter: the first quartile of the MLR RPSS lies approximately above the third quartile of the PCR RPSS, indicating that MLR is more skillful on at least 75% of occasions.

Comparison of the MLR and PCR models’ performances on cross-validated forecasts on a seasonal time scale.
Figure 3 shows the same analysis for RPSS values developed under the split-sample validation technique. Again, both the MLR and PCR forecasts are more skillful than climatology. Comparing the median RPSS between the two models, MLR still performs better than PCR in all basins over all seasons, with the single exception of spring in the Deep River basin. Figures 2 and 3 also show that the skill of the cross-validated forecasts varies less than that of the split-sample forecasts; cross validation uses more data for model training, resulting in less variability in model skill. Under both validation techniques, the MLR model's RPSS distributions are more skewed toward higher values, indicating that MLR issues better streamflow predictions over the considered period and suggesting its higher reliability.

Comparison of the MLR and PCR models’ performances on split-sample forecasts on a seasonal time scale.
Furthermore, another analysis was conducted on the cross-validated forecasts during BN and AN months using the BS and its decomposed components. Table 2 shows the difference in forecast verification metrics between the MLR and PCR forecasts over each river basin; positive (bold) numbers indicate the improvements achieved by the MLR forecasts over the PCR forecasts. REL and RES values were computed using 20 distinct forecast probability bins.
Improved forecast attributes (× 10−2; positive values in bold) in terms of BS, REL, and RES.
We also compared the performance of the two MLR models fed with the two types of climatic probabilistic information: one using bin counting (MLR1) to quantify the precipitation probabilities and the other fitting a lognormal distribution (MLR2) to obtain the probabilistic information from the precipitation ensembles (details in section 3). Figure 4 shows the individual RPSS values for all monthly time steps under the two validation techniques. Each month is also categorized as below normal, normal, or above normal by comparing the observed streamflow with the climatological terciles. The points in Fig. 4 are scattered almost equally on the two sides of the diagonal, indicating no significant difference in forecasting skill between MLR1 and MLR2.

Comparison of skill in individual streamflow forecasts between the models MLR1 and MLR2.
Figure 5 compares the skill of the MLR and PCR models in categorical streamflow forecasting for each individual monthly time step during the validation period. To simplify the comparison across models, the medians of the RPSS values for each category are shown on each axis as colored lines. The figure indicates that the RPSS of MLR is higher than that of PCR, since the majority of the points, and consequently the intersection of the median lines, lie below the diagonal. The point clouds in Fig. 5a show that the skill of the PCR model varies less (smaller deviation in RPSS values) for normal flows than for the other two categories. The MLR model performs particularly well in arid regions such as the Guadalupe River and Rio Grande basins, where most of the points fall toward the MLR side of the plots. Based on the medians of the RPSS values in Fig. 5a, both models show better skill in forecasting AN and BN months than N months. This is harder to conclude from Fig. 5b, where the split-sample validation leaves fewer data points in each flow category. The better performance of MLR in the arid basins, however, remains distinct.

Cloud plot comparison of skill in streamflow forecasting for BN, N, and AN months between the MLR model and PCR model under both cross validation and split-sample validation. The median lines of the cloud points are projected on each axis for each category.
To examine whether the skill advantage of MLR relates to flow variability, a monthly measure of flow variability based on σ, the standard deviation of flow values for a given month m computed over the 46 years of observations, was related to the difference in skill between the two models. This assessment was conducted by fitting a linear regression to the monthly data collected from all the basins during each season. Based on this seasonal analysis, we infer a significant positive relationship between the monthly variability of flows and the improvement of MLR over PCR.

Difference between MLR and PCR performances.
A similar analysis, based on monthly averages, was also conducted.
a. Relative contribution of model inputs
As mentioned earlier, the MLR model is forced with probabilistic information from precipitation forecasts as well as with deterministic streamflow observations from prior months. To further understand the role of each input component in determining the skill of probabilistic streamflow forecasting, MLR modeling was performed under two scenarios: one using only the probabilistic information from the precipitation forecasts (P) and the other using only the antecedent streamflow observations (Q).

Decomposed contribution of probabilistic information from precipitation forecasts P and antecedent streamflow observations Q as input variables in determining the skill of MLR streamflow forecasts under the cross-validation approach.
Citation: Journal of Hydrometeorology 18, 11; 10.1175/JHM-D-17-0021.1

Decomposed contribution of probabilistic information from precipitation forecasts P and antecedent streamflow observations Q as input variables in determining the skill of MLR streamflow forecasts under the cross-validation approach.
Citation: Journal of Hydrometeorology 18, 11; 10.1175/JHM-D-17-0021.1
Decomposed contribution of probabilistic information from precipitation forecasts P and antecedent streamflow observations Q as input variables in determining the skill of MLR streamflow forecasts under the cross-validation approach.
Citation: Journal of Hydrometeorology 18, 11; 10.1175/JHM-D-17-0021.1
b. Ensemble size analysis
One objective of our study is to analyze the role of the ensemble size of the precipitation forecasts in determining the performance of streamflow forecasting via the MLR model. For this analysis, we used the ECHAM4.5 precipitation simulations, with 85 ensemble members, instead of the actual ECHAM4.5 forecasts, which have only 24 members. We fitted the cross-validated MLR model 50 times for each ensemble size, ranging from 10 to 85 members in increments of 5. For a given ensemble size, in each of the 50 iterations, we randomly selected that many members out of the 85 available to form the ensemble used to obtain the precipitation forecast probabilities, yielding 50 MLR models per ensemble size. As the ensemble size approaches 85, however, the number of unique member subsets becomes more limited, resulting in less variation in model skill (see Fig. 8). The figure illustrates the results of this analysis for three sample basins during four seasons, represented in different colors, with shading showing the spread of the RPSS values. The analysis reveals that the skill of the MLR model is nearly independent of the precipitation ensemble size, as the plots show no significant trend. Note that the MLR skill in this figure need not match Fig. 2; Fig. 8 is based on the 85-member ECHAM4.5 simulations, whereas the results in Figs. 2–6 were developed using the retrospective, 24-member ECHAM4.5 forecasts.
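The subsampling experiment can be sketched for a single month; here the "skill" stand-in is simply the sampling error of the BN tercile probability against the full 85-member ensemble, which is the quantity the experiment probes (the data and variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
full_ens = rng.lognormal(mean=3.0, sigma=0.5, size=85)  # one month, 85 members
q33, q67 = np.quantile(full_ens, [1/3, 2/3])
p_full = np.mean(full_ens < q33)                        # reference BN probability

# for each ensemble size, draw 50 random member subsets and record the
# median absolute error of the recomputed BN probability
spread = {}
for m in range(10, 90, 5):
    errs = [abs(np.mean(rng.choice(full_ens, m, replace=False) < q33) - p_full)
            for _ in range(50)]
    spread[m] = float(np.median(errs))
```

Because the sampling error of a tercile probability decays roughly as 1/√m, the error flattens quickly with ensemble size, consistent with the weak sensitivity of MLR skill found here; in the study, each subset would instead feed a full cross-validated MLR fit scored by RPSS.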

The performance range of the MLR model in categorical streamflow forecasting as a function of ensemble size of precipitation forecasts during four seasons. The solid line, the darker shaded area, and the lighter shaded area in each color represent the median, IQR, and 90% confidence interval of the RPSSs, respectively, as derived from 50 iterations at each ensemble size.
Citation: Journal of Hydrometeorology 18, 11; 10.1175/JHM-D-17-0021.1

The performance range of the MLR model in categorical streamflow forecasting as a function of ensemble size of precipitation forecasts during four seasons. The solid line, the darker shaded area, and the lighter shaded area in each color represent the median, IQR, and 90% confidence interval of the RPSSs, respectively, as derived from 50 iterations at each ensemble size.
Citation: Journal of Hydrometeorology 18, 11; 10.1175/JHM-D-17-0021.1
The performance range of the MLR model in categorical streamflow forecasting as a function of ensemble size of precipitation forecasts during four seasons. The solid line, the darker shaded area, and the lighter shaded area in each color represent the median, IQR, and 90% confidence interval of the RPSSs, respectively, as derived from 50 iterations at each ensemble size.
Citation: Journal of Hydrometeorology 18, 11; 10.1175/JHM-D-17-0021.1
5. Discussion and concluding remarks
Categorical streamflow forecasts that provide the probability of occurrence of below-normal and above-normal flows are useful in contingency planning and in allocating resources in response to shifts in streamflow potential. Furthermore, categorical forecasting conveys the departure of monthly and seasonal streamflows from climatology and is therefore easy to communicate. In this regard, we applied multinomial logistic regression (MLR) as a potential approach for developing probabilistic categorical streamflow forecasts from the probabilistic information in climate forecasts and the previous month's streamflow. Coarse-scale ensemble precipitation forecasts from the ECHAM4.5 GCM, along with past streamflow observations from the HCDN dataset, served as the predictors of the MLR model to issue 1-month-ahead forecasts of BN, N, and AN streamflow occurrences over six river basins across the U.S. Sun Belt. We compared the performance of the MLR model with that of the traditional approach, principal component regression (PCR), which is commonly used to obtain categorical precipitation and streamflow forecasts. Our findings demonstrated that both the MLR and PCR models have higher forecasting skill than climatology for almost all seasons, with median RPSS values greater than zero. The analysis of both models under cross-validation and split-sample validation also revealed that the MLR model has higher skill than the PCR model in producing categorical streamflow forecasts. This is because the MLR model utilizes the probabilistic information in the climate forecast ensemble, while the PCR model is built on the ensemble mean and thus ignores the ensemble spread in issuing categorical forecasts. Furthermore, the MLR structure is based on the multinomial distribution, which naturally accommodates the skewness exhibited in the conditional distribution of flows.
Accordingly, the MLR model performs more accurately in arid basins and, over humid basins, during months with high skewness in flows. Grantz et al. (2005) showed that for snow-dominated river basins (i.e., the Rio Grande and Verde River basins), combining large-scale climatic forecasts with winter snowpack initial conditions as model predictors can lead to higher streamflow forecasting skill. Thus, for basins under a snowmelt regime, one could consider snow water equivalent (SWE) as a predictor instead of streamflow records, particularly for predictions during the melting seasons; for nonsnowmelt months, antecedent streamflow remains a good predictor for developing monthly streamflow forecasts. For basins under a rainfall–runoff regime, additional predictors, such as remotely sensed soil moisture products (e.g., SMAP) and groundwater levels (e.g., from the USGS Groundwater–Climate Response Network), could also be considered to enhance forecasting skill. Even though this study shows that the contribution of precipitation forecasts to skill improvement is limited, one could also consider tercile forecasts from multimodel ensembles, which generally improve the reliability of climatic probabilistic forecasts (Devineni and Sankarasubramanian 2010a; Singh and Sankarasubramanian 2014). We also infer that the skill of the streamflow forecasts generally improves for large basins in comparison to smaller basins; for instance, the Deep River basin has the lowest skill among the considered basins. Basins with significant groundwater storage [e.g., the Apalachicola, Chattahoochee, and Flint Rivers (ACF) basin] and strong persistence in streamflows are thus expected to derive their skill mostly from previous-month streamflow conditions.
In this study, two different methods were employed and evaluated to extract the probabilistic information in the climate forecasts: the MLR model was forced with tercile precipitation forecasts estimated either by 1) counting the ensemble members lying in each climatological tercile or 2) fitting a lognormal distribution to the forecast ensemble. Results revealed no significant difference in the skill of the MLR model between these two approaches. In addition, the role of the ensemble size of the precipitation forecasts in estimating the categorical streamflow forecasts was evaluated using the MLR model. Our analyses showed that the probabilistic information from 10 to 25 members is sufficient, and a further increase in ensemble size did not yield significant or consistent improvements in the skill of categorical streamflow forecasting. The proposed MLR approach thus offers an alternative for issuing the categorical streamflow forecasts that are typically needed to communicate changes in monthly and seasonal streamflow potential.
Acknowledgments
We thank the anonymous reviewers whose suggestions and comments helped us in improving the manuscript. In addition, we thank our funding sources: National Science Foundation Grants CBET-0954405, CBET-1204368, and CCF-1442909.
REFERENCES
Ahmadisharaf, E., A. J. Kalyanapu, and E.-S. Chung, 2016: Spatial probabilistic multi-criteria decision making for assessment of flood management alternatives. J. Hydrol., 533, 365–378, https://doi.org/10.1016/j.jhydrol.2015.12.031.
Ajami, N. K., Q. Duan, X. Gao, and S. Sorooshian, 2006: Multimodel combination techniques for analysis of hydrological simulations: Application to distributed model intercomparison project results. J. Hydrometeor., 7, 755–768, https://doi.org/10.1175/JHM519.1.
Antolik, M. S., 2000: An overview of the National Weather Service’s centralized statistical quantitative precipitation forecasts. J. Hydrol., 239, 306–337, https://doi.org/10.1016/S0022-1694(00)00361-9.
Arumugam, S., R. Boyles, A. Mazrooei, and H. Singh, 2015: Experimental reservoir storage forecasts utilizing climate-information based streamflow forecasts. Water Resources Research Institute of the University of North Carolina Rep. 456, 33 pp., https://repository.lib.ncsu.edu/bitstream/handle/1840.4/8661/NC-WRRI-456.pdf.
Barnston, A. G., S. J. Mason, L. Goddard, D. G. Dewitt, and S. E. Zebiak, 2003: Multimodel ensembling in seasonal climate forecasting at IRI. Bull. Amer. Meteor. Soc., 84, 1783–1796, https://doi.org/10.1175/BAMS-84-12-1783.
Brier, G. W., 1950: Verification of forecasts expressed in terms of probability. Mon. Wea. Rev., 78, 1–3, https://doi.org/10.1175/1520-0493(1950)078<0001:VOFEIT>2.0.CO;2.
Clark, M. P., and L. E. Hay, 2004: Use of medium-range numerical weather prediction model output to produce forecasts of streamflow. J. Hydrometeor., 5, 15–32, https://doi.org/10.1175/1525-7541(2004)005<0015:UOMNWP>2.0.CO;2.
Craven, P., and G. Wahba, 1978: Smoothing noisy data with spline functions. Numer. Math., 31, 377–403, https://doi.org/10.1007/BF01404567.
Devineni, N., and A. Sankarasubramanian, 2010a: Improved categorical winter precipitation forecasts through multimodel combinations of coupled GCMs. Geophys. Res. Lett., 37, L24704, https://doi.org/10.1029/2010GL044989.
Devineni, N., and A. Sankarasubramanian, 2010b: Improving the prediction of winter precipitation and temperature over the continental United States: Role of the ENSO state in developing multimodel combinations. Mon. Wea. Rev., 138, 2447–2468, https://doi.org/10.1175/2009MWR3112.1.
Devineni, N., A. Sankarasubramanian, and S. Ghosh, 2008: Multimodel ensembles of streamflow forecasts: Role of predictor state in developing optimal combinations. Water Resour. Res., 44, W09404, https://doi.org/10.1029/2006WR005855.
Doblas-Reyes, F. J., R. Hagedorn, and T. Palmer, 2005: The rationale behind the success of multi-model ensembles in seasonal forecasting–II. Calibration and combination. Tellus, 57A, 234–252, https://doi.org/10.1111/j.1600-0870.2005.00104.x.
Epstein, E. S., 1969: A scoring system for probability forecasts of ranked categories. J. Appl. Meteor., 8, 985–987, https://doi.org/10.1175/1520-0450(1969)008<0985:ASSFPF>2.0.CO;2.
Garen, D. C., 1992: Improved techniques in regression-based streamflow volume forecasting. J. Water Resour. Plann. Manage., 118, 654–670, https://doi.org/10.1061/(ASCE)0733-9496(1992)118:6(654).
Goddard, L., A. Barnston, and S. Mason, 2003: Evaluation of the IRI’s “net assessment” seasonal climate forecasts: 1997–2001. Bull. Amer. Meteor. Soc., 84, 1761–1781, https://doi.org/10.1175/BAMS-84-12-1761.
Goutte, C., 1997: Note on free lunches and cross-validation. Neural Comput., 9, 1245–1249, https://doi.org/10.1162/neco.1997.9.6.1245.
Grantz, K., B. Rajagopalan, M. Clark, and E. Zagona, 2005: A technique for incorporating large-scale climate information in basin-scale ensemble streamflow forecasts. Water Resour. Res., 41, W10410, https://doi.org/10.1029/2004WR003467.
Hamill, T. M., J. S. Whitaker, and S. L. Mullen, 2006: Reforecasts: An important dataset for improving weather predictions. Bull. Amer. Meteor. Soc., 87, 33–46, https://doi.org/10.1175/BAMS-87-1-33.
Hamlet, A. F., and D. P. Lettenmaier, 1999: Columbia River streamflow forecasting based on ENSO and PDO climate signals. J. Water Resour. Plann. Manage., 125, 333–341, https://doi.org/10.1061/(ASCE)0733-9496(1999)125:6(333).
Hamlet, A. F., D. Huppert, and D. P. Lettenmaier, 2002: Economic value of long-lead streamflow forecasts for Columbia River hydropower. J. Water Resour. Plann. Manage., 128, 91–101, https://doi.org/10.1061/(ASCE)0733-9496(2002)128:2(91).
Hsu, K.-l., H. V. Gupta, and S. Sorooshian, 1995: Artificial neural network modeling of the rainfall-runoff process. Water Resour. Res., 31, 2517–2530, https://doi.org/10.1029/95WR01955.
Jakeman, A., and G. Hornberger, 1993: How much complexity is warranted in a rainfall-runoff model? Water Resour. Res., 29, 2637–2650, https://doi.org/10.1029/93WR00877.
Klemeš, V., 1986: Operational testing of hydrological simulation models. Hydrol. Sci. J., 31, 13–24, https://doi.org/10.1080/02626668609491024.
Krzysztofowicz, R., 2001: The case for probabilistic forecasting in hydrology. J. Hydrol., 249, 2–9, https://doi.org/10.1016/S0022-1694(01)00420-6.
Kumar, A., A. G. Barnston, and M. P. Hoerling, 2001: Seasonal predictions, probabilistic verifications, and ensemble size. J. Climate, 14, 1671–1676, https://doi.org/10.1175/1520-0442(2001)014<1671:SPPVAE>2.0.CO;2.
Li, S., and L. Goddard, 2005: Retrospective forecasts with ECHAM4.5 AGCM. IRI Tech. Rep. 05-02.
Li, S., L. Goddard, and D. G. DeWitt, 2008: Predictive skill of AGCM seasonal climate forecasts subject to different SST prediction methodologies. J. Climate, 21, 2169–2186, https://doi.org/10.1175/2007JCLI1660.1.
Li, W., and A. Sankarasubramanian, 2012: Reducing hydrologic model uncertainty in monthly streamflow predictions using multimodel combination. Water Resour. Res., 48, W12516, https://doi.org/10.1029/2011WR011380.
Li, W., A. Sankarasubramanian, R. Ranjithan, and E. Brill, 2014: Improved regional water management utilizing climate forecasts: An interbasin transfer model with a risk management framework. Water Resour. Res., 50, 6810–6827, https://doi.org/10.1002/2013WR015248.
Lohmann, D., and Coauthors, 2004: Streamflow and water balance intercomparisons of four land surface models in the North American Land Data Assimilation System project. J. Geophys. Res., 109, D07S91, https://doi.org/10.1029/2003JD003517.
Mason, S. J., and M. K. Tippett, 2016: Climate Predictability Tool version 15.3.9. Columbia University Academic Commons, https://doi.org/10.7916/D8668DCW.
Mazrooei, A., T. Sinha, A. Sankarasubramanian, S. Kumar, and C. D. Peters-Lidard, 2015: Decomposition of sources of errors in seasonal streamflow forecasting over the U.S. Sunbelt. J. Geophys. Res. Atmos., 120, 11 809–11 825, https://doi.org/10.1002/2015JD023687.
Moradkhani, H., K.-l. Hsu, H. V. Gupta, and S. Sorooshian, 2004: Improved streamflow forecasting using self-organizing radial basis function artificial neural networks. J. Hydrol., 295, 246–262, https://doi.org/10.1016/j.jhydrol.2004.03.027.
Oh, J., and A. Sankarasubramanian, 2012: Interannual hydroclimatic variability and its influence on winter nutrient loadings over the Southeast United States. Hydrol. Earth Syst. Sci., 16, 2285–2298, https://doi.org/10.5194/hess-16-2285-2012.
Pagano, T. C., 2008: Probabilistic seasonal water supply forecasting in an operational environment: The USDA-NRCS perspective. World Environmental and Water Resources Congress 2008: Ahupua’A, Honolulu, HI, American Society of Civil Engineers, 1–10, https://doi.org/10.1061/40976(316)575.
Regonda, S. K., B. Rajagopalan, and M. Clark, 2006: A new method to produce categorical streamflow forecasts. Water Resour. Res., 42, W09501, https://doi.org/10.1029/2006WR004984.
Roeckner, E., and Coauthors, 1992: Simulation of the present-day climate with the ECHAM-3 model: Impact of model physics and resolution. Max Planck Institute for Meteorology Rep. 93, 171 pp.
Ropelewski, C. F., and M. S. Halpert, 1986: North American precipitation and temperature patterns associated with the El Niño/Southern Oscillation (ENSO). Mon. Wea. Rev., 114, 2352–2362, https://doi.org/10.1175/1520-0493(1986)114<2352:NAPATP>2.0.CO;2.
Sankarasubramanian, A., and R. M. Vogel, 2003: Hydroclimatology of the continental United States. Geophys. Res. Lett., 30, 1363, https://doi.org/10.1029/2002GL015937.
Sankarasubramanian, A., U. Lall, and S. Espinueva, 2008: Role of retrospective forecasts of GCMs forced with persisted SST anomalies in operational streamflow forecasts development. J. Hydrometeor., 9, 212–227, https://doi.org/10.1175/2007JHM842.1.
Seo, S., T. Sinha, G. Mahinthakumar, A. Sankarasubramanian, and M. Kumar, 2016: Identification of dominant source of errors in developing streamflow and groundwater projection under near-term climate change. J. Geophys. Res. Atmos., 121, 7652–7672, https://doi.org/10.1002/2016JD025138.
Shukla, S., and D. Lettenmaier, 2011: Seasonal hydrologic prediction in the United States: Understanding the role of initial hydrologic conditions and seasonal climate forecast skill. Hydrol. Earth Syst. Sci., 15, 3529–3538, https://doi.org/10.5194/hess-15-3529-2011.
Singh, H., and A. Sankarasubramanian, 2014: Systematic uncertainty reduction strategies for developing streamflow forecasts utilizing multiple climate models and hydrologic models. Water Resour. Res., 50, 1288–1307, https://doi.org/10.1002/2013WR013855.
Sinha, T., and A. Sankarasubramanian, 2013: Role of climate forecasts and initial conditions in developing streamflow and soil moisture forecasts in a rainfall–runoff regime. Hydrol. Earth Syst. Sci., 17, 721–733, https://doi.org/10.5194/hess-17-721-2013.
Sinha, T., A. Sankarasubramanian, and A. Mazrooei, 2014: Decomposition of sources of errors in monthly to seasonal streamflow forecasts in a rainfall–runoff regime. J. Hydrometeor., 15, 2470–2483, https://doi.org/10.1175/JHM-D-13-0155.1.
Slack, J., A. Lumb, and J. Landwehr, 1993: Hydro-Climatic Data Network (HCDN) streamflow data set, 1874–1988. Water-Resources Investigations Rep. 93-4076, U.S. Geological Survey, CD-ROM.
Tootle, G. A., T. C. Piechota, and A. Singh, 2005: Coupled oceanic-atmospheric variability and U.S. streamflow. Water Resour. Res., 41, W12408, https://doi.org/10.1029/2005WR004381.
van den Dool, H., 1994: Searching for analogues, how long must we wait? Tellus, 46A, 314–324, https://doi.org/10.3402/tellusa.v46i3.15481.
Vrugt, J. A., H. V. Gupta, B. Nualláin, and W. Bouten, 2006: Real-time data assimilation for operational ensemble streamflow forecasting. J. Hydrometeor., 7, 548–565, https://doi.org/10.1175/JHM504.1.
Weigel, A. P., M. A. Liniger, and C. Appenzeller, 2007: The discrete Brier and ranked probability skill scores. Mon. Wea. Rev., 135, 118–124, https://doi.org/10.1175/MWR3280.1.
Weigel, A. P., M. A. Liniger, and C. Appenzeller, 2008: Can multi-model combination really enhance the prediction skill of probabilistic ensemble forecasts? Quart. J. Roy. Meteor. Soc., 134, 241–260, https://doi.org/10.1002/qj.210.
Wilks, D. S., 2006: Statistical Methods in the Atmospheric Sciences. 2nd ed. International Geophysics Series, Vol. 100, Academic Press, 648 pp.
Wilks, D. S., and T. M. Hamill, 2007: Comparison of ensemble-MOS methods using GFS reforecasts. Mon. Wea. Rev., 135, 2379–2390, https://doi.org/10.1175/MWR3402.1.
Wood, A. W., and D. P. Lettenmaier, 2006: A test bed for new seasonal hydrologic forecasting approaches in the western United States. Bull. Amer. Meteor. Soc., 87, 1699–1712, https://doi.org/10.1175/BAMS-87-12-1699.
Wood, A. W., E. P. Maurer, A. Kumar, and D. P. Lettenmaier, 2002: Long-range experimental hydrologic forecasting for the eastern United States. J. Geophys. Res., 107, 4429, https://doi.org/10.1029/2001JD000659.
Yuan, X., E. F. Wood, L. Luo, and M. Pan, 2011: A first look at Climate Forecast System version 2 (CFSv2) for hydrological seasonal prediction. Geophys. Res. Lett., 38, L13402, https://doi.org/10.1029/2011GL047792.