South Asia is vulnerable to extreme weather events, and the Indus River basin (IRB) is subject to flood risk. Flooding occurred in Pakistan during the monsoon seasons from 2010 to 2012, causing major economic damage and significant loss of life. The nature of slow-rise discharge on the Indus and overtopping of riverbanks in the IRB indicates the need for extended warning time (1–10 days). Daily mean streamflow at Tarbela Dam, a major control structure on the Indus River, is reproduced by using numerical weather prediction initialization data from the ECMWF to drive the Variable Infiltration Capacity macroscale hydrologic model. A one-dimensional routing model conducts base flow and surface runoff from each grid cell through a stream network. Comparisons of reconstructions with inflow data at Tarbela Dam over a 3-yr period yield an R2 correlation of 0.93 and RMSE of 692 m3 s−1. From 2010 to 2012, 276 daily ensemble hindcasts are generated. In comparing the ensemble mean of 10-day hindcasts to reconstructed inflow, the R2 correlation was >0.85 for 2010, >0.70 for 2011, and >0.70 for 2012. The RMSE was <24% of mean streamflow for 2010, <28% for 2011, and <34% for 2012. A method to translate this type of probabilistic streamflow forecast data into communicable information is described. A two-dimensional model is described to simulate movement of overflow water into the floodplain.
The unanticipated variability of monsoon rainfall in the current and future climate can have devastating consequences for the developing world, where societies are centered on agrarian pursuits (Webster and Jian 2011). Ensembles of extended-range weather forecasts are now being used as forcing data for hydrological model simulations to determine the risk posed by potential weather extremes (Cloke and Pappenberger 2009). For example, hydrological forecasts and flood warnings are routinely provided to Bangladesh authorities out to 10-day lead times (RIMES 2013) by the Regional Integrated Multi-Hazard Early Warning System for Asia and Africa (RIMES), located in Bangkok, Thailand, using methodologies described in Hopson and Webster (2010), Webster et al. (2010), and Webster and Jian (2011).
The continuing expansion of settlements and agriculture in the Indus River basin (IRB) floodplain in Pakistan has made it one of the most vulnerable regions to extreme hydrometeorological events (Vorosmarty et al. 2000). In addition to human casualties, the loss of crops, farm animals, and purchased agricultural inputs from flood damage condemns successive generations to “a treadmill of poverty” (Webster et al. 2010, p. 1494). In 2010, massive flooding directly affected nearly 20 million people (Webster et al. 2011; Houze et al. 2011). Though not as pervasive and destructive, flooding in 2011 and 2012 also caused significant damage in southern and central portions of the country.
The Indus River originates on the Tibetan Plateau and flows to the west through India before entering Pakistan as shown with the National Aeronautics and Space Administration (NASA) Shuttle Radar Topography Mission (SRTM) data in Fig. 1. The river turns south in northern Pakistan and moves southwest through the center of the country before emptying into the Arabian Sea ~3200 km later. The major tributaries—the Jhelum, Beas, Ravi, Chenab, and Satluj Rivers—conduct runoff occurring in western India and eastern Pakistan westward to the Indus. These rivers meet with the Indus northeast of Sindh Province and add to flows originating in northern and central Pakistan.
a. Challenges for water resource management
Water resources in Pakistan are focused around three major rivers: the Indus, the Jhelum, and the Chenab. The flows of these rivers generally peak in spring and summer, with much lower overall discharge in fall and winter (Kahlown and Majeed 2003). Water is stored in a number of dams and barrages during the high-flow period and is released during the dry season. These storages benefit hydropower, irrigation, and flood control. The end of the monsoon season—typically late August or early September—is the most critical time for storage decisions, that is, exactly how much water to retain, how much to release, and when.
Flood risk in Pakistan is exacerbated by the annual threat of drought, and officials must strike a perilous balance between maximizing retention in dams while maintaining enough storage capacity to attenuate potential flood waves that begin farther upstream. To complicate matters further, the holding capacities of existing structures are declining. Since their initial construction, the combined active storage of the three large hydropower dams—Tarbela, Mangla, and Chashma—in Pakistan has degraded by ~22% because of silting, leaving 2.29 × 1010 m3 to supply the Indus basin irrigation system (PMPIU 2011). The major dams are capable of storing barely a month’s worth of flow in total (Briscoe and Qamar 2009). In contrast, the dams on the U.S. Colorado and the Australian Murray–Darling Rivers can store 900 days of river runoff, dams on the South African Orange River can store 500 days of runoff, and dams in India can store between 120 and 220 days of runoff (Qureshi 2011).
Since 2006, the Pakistan Water and Power Development Authority (WAPDA) has launched several major initiatives such as the building of the Mirani Dam (completed in 2007; 3.7 × 108 m3), the raising of Mangla Dam (completed in 2009; 3.58 × 109 m3), and the building of Gomal Zam Dam (completed in 2011; 1.41 × 109 m3; WAPDA 2012). Diamer-Bhasha Dam is also underway (estimated completion in 2023; 1 × 1010 m3). Despite these improvements and commitments, water management in Pakistan remains tenuous, as suggested by flooding in each of the last 3 years.
b. Aims of this work
There is a great need for extended horizon (5–10 day) streamflow forecasts in Pakistan to allow for advanced preparation/mitigation of flooding and to optimize water resource management. The basic aim of this work is to produce a hydrologic forecasting scheme for the IRB to support an operational framework. The IRB irrigation system is sufficiently advanced that with enough warning time, water could perhaps be diverted into canal systems in advance of large flood pulses, thereby ensuring the safety of expensive barrage structures. Preserving the barrage infrastructure is integral to the success of agricultural operations dependent on the related water diversions (PMPIU 2011).
2. Overview of 2010–12
We first analyze the storage situation at Tarbela Dam (8.26 × 109 m3; Fig. 1) during the 2010–12 monsoon seasons. The catchment upstream of Tarbela, covering an area of ~170 000 km2, is unregulated and predominately a barren, mountainous, and glaciated landscape (Asianics Agro Dev International 2000). We should expect a reasonable correlation between modeling efforts based on “naturalized” simulations and observed inflows. Naturalized or unregulated simulations are projections of river routing without any man-made diversions or storage. Tarbela Reservoir, along with the numerous barrages and headworks capable of modifying flow on the Indus and its tributaries, can be a critical resource if dangerous flow conditions on the Indus are anticipated.
Overbank floodwater from the main stem of the Indus River can spread rapidly to large areas, especially in the south. Sindh Province is often at great risk for floods and is home to the country’s second largest economy, with a substantial agricultural base along the Indus River. An example of flooding is shown in Fig. 2, which is a visual interpretation (Haq et al. 2012) of the inundated areas based on imagery captured by the NASA Moderate Resolution Imaging Spectrometer (MODIS) instrument. MODIS bands 1 (620–670 nm), 2 (841–876 nm), and 7 (2105–2155 nm) were used here. Stagnant water is given as a dark blue color and flowing water as a light blue color.
In 2010, intense rainfall in the upland catchment began in July. Figure 3 indicates the locations and magnitudes of significant rainfall measured at gauging stations in various areas of Pakistan (PMD 2013). On 27–30 July 2010, Risalpur, Cherat, Saidu Sharif, Peshawar, Mianwali, and Kohat all recorded new records with 415, 372, 338, 333, 271, and 262 mm, respectively. The reservoir level and mean daily discharge at Tarbela Dam were also recorded by the Flood Forecasting Division (FFD) of the Pakistan Meteorological Department (PMD). Figure 4 shows the daily level and corresponding inflow/outflow from 1 July to 30 September 2010. The maximum level is 472.44 m (1550 ft), as indicated by the plateau in late August. The gray overlay indicates the time window of significant rainfall events as reported by gauges.
Historically unprecedented flooding occurred throughout the country along the Indus main stem and in places where the typical courses of flow were altered (due to breaching of artificial levees). Syvitsky and Brakenridge (2013) recently conducted an analysis of multitemporal remote sensing observations and topography. They indicated that much of the 2010 flood damage was caused by dam- and barrage-related backwater effects, reduced water and sediment conveyance capacity, and multiple failures of irrigation system levees. These findings agreed with depositions to the Pakistan Supreme Court during its investigation of the flood.
From 1 to 30 September 2011, Mithi, Mirpur Khas, and Nawabshah all recorded new records with 760, 603, and 353 mm, respectively (PMD 2013), shown in Fig. 5. Figure 6 shows the daily level and corresponding inflow/outflow from 1 August to 31 October 2011. Major flooding was reported in Sindh in September. The gray overlay indicates the time window of significant rainfall events as reported by gauges. The travel time of water along the Indus River from Tarbela Dam to Sukkur Barrage (in northern Sindh) is approximately 8 days (PMD 2013). Even if forecasts for the latter part of this time window had been communicated 10 days in advance, Tarbela Reservoir would have been either nearly or completely full.
From 8 to 11 September 2012, Jacobabad, Khanpur, and Shorkot all recorded new rainfall records with 457, 290, and 152 mm, respectively (PMD 2013), shown in Fig. 7. Despite indications at the end of August that 2012 would turn out to be a low water year, the monsoon again persisted through September. Capacity at Tarbela in advance of these rainfalls was higher than in 2011 (Fig. 8), but the reservoir still reached capacity by the middle of September. Had rainfalls in the south extended for a longer period of time, as in 2011, much more flood damage could have occurred.
a. Challenges for streamflow modeling
Despite the intricacy of irrigation processes and the progress of hydropower development in the IRB, it is an unfortunate reality that little historical discharge data from the river network is publicly available. From a modeling standpoint, this means that we are forced to treat these vulnerable regions as essentially ungauged. Streamflow prediction is particularly difficult in these circumstances (Wagener and Wheater 2006), but not impossible. Hopson and Webster (2010), for example, described a multimodel framework to model water-balance physics explicitly (evapotranspiration, soil storage, etc.). This forecasting methodology was used to predict the severe flooding events on the Brahmaputra River in 2004, 2007, and 2008 despite the absence of upstream river discharge data from India. Discharge forecasts as a result of this scheme were able to significantly outperform both climatological and persistence forecasts.
b. Land surface model
Land surface models (LSMs) describe the terrestrial water cycle. These models provide implementations of our current understanding of hydrological systems (Gudmundsson et al. 2012). Adding complexity to any hydrologic model increases the risk of overcalibration, and overcalibration is an unavoidable circumstance of more comprehensive efforts. Overparameterization of models (particularly distributed models) relative to the information content of observational data available for calibration of parameter values is a common issue in hydrologic modeling (Beck 1987). Models strive to find an optimal means of fitting simulated outputs to observational data, but the information context is such that no approach, by definition, can yield a single unambiguous mathematical solution to the identification problem. Cornwell and Harvey (2007), for example, critically reviewed several LSMs typically used in global circulation models (GCMs) to simulate soil moisture and evaporation and found that soil moisture was treated and even defined differently in several different models.
Instead of making the case that a selected model provides the “best” method of environmental parameterization, it may be more reasonable to adhere to an equifinality principle (Beven 2006), that is, the idea that two models may lead to an equally acceptable representation of the observed natural processes and that there is much to be gained from an analysis of both methodologies. Here we employ the Variable Infiltration Capacity (VIC) model (Liang et al. 1994; Bohn et al. 2013), a research model that has been tested in a variety of watersheds with several recent applications in the western United States and southeastern Australia (Tang et al. 2010; Shukla et al. 2011; Zhao et al. 2012; Li. et al. 2013). VIC balances both the water and surface energy budgets within a grid cell; its subgrid variations are captured statistically.
By comparing model-simulated runoff to observations over large river basins, Maurer et al. (2002) described the benefits of generating a dataset of land surface states and fluxes in which both the land surface water and energy budgets balance at every time step. The VIC model has been used in several large international projects, such as the Project for Intercomparison of Land-Surface Parameterization Schemes (PILPS; Henderson-Sellers et al. 1995). One of the main goals of PILPS was to evaluate the ability of current land surface schemes to reproduce measured energy and water fluxes over multiple seasonal cycles across a climatically diverse, continental-scale basin (Wood et al. 1998). When modeled latent heat fluxes were compared to those estimated from radiosonde data, VIC reproduced the atmospheric budget evapotranspiration to within the approximated estimated errors of 5% and 10% for the annual and monthly warm season means (Liang et al. 1998). In the North American Land Data Assimilation System (NLDAS) project, VIC outputs were used to drive an independent routing model. A comparison of modeled runoff to observed flow showed a bias of less than 5% and a correlation of 0.9 over a 2-yr period (Lohmann et al. 2004).
c. Hydrological system framework
A flow diagram for the streamflow modeling framework centered on the VIC model is shown in Fig. 9. This diagram indicates the different types of geospatial databases used to produce the soil, vegetation, and routing parameters necessary to transform estimates of temperature and precipitation to streamflow. Soil textures at 0.25° resolution were obtained from Food and Agricultural Organization data (FAO 1998). Vegetation types and land cover information at 30-in. resolution were obtained from University of Maryland (UMD) land cover data (Hansen et al. 2000). Drainage basins and stream networks were obtained from the U.S. Geological Survey’s (USGS) HYDRO1K data (Verdin and Greenlee 1996). Elevations at 3-arc-s resolution based on void-filling interpolation methods (Reuter et al. 2007) for SRTM data were obtained from the Consultative Group for International Agricultural Research (CGIAR) Consortium for Spatial Information.
The basin delineation (Fig. 1, dark blue outline) is created from an aggregation of smaller catchments. Spatial soil water–holding properties are estimated by employing soil textures (Reynolds et al. 2000). These texture classes are linked to literature surveys of hydraulic soil properties (Rawls et al. 1982). In each grid cell, the FAO soil texture is used to estimate characteristics such as saturated hydraulic conductivity, bubbling pressure, quartz content, bulk density, particle density, field capacity, and wilting point. UMD land cover classes are used similarly to estimate overstory, architectural resistance, leaf area index, shortwave albedo, roughness length, displacement, and radiation attenuation factor. Fractional characterization of land cover within each grid cell is performed by subdivision into tiles. The seasonal dependence of some parameters is based on literature surveys about the Northern Hemisphere (Tian et al. 2004). Elevations are used to produce flow directions and accumulations. Path lengths are created from flow directions.
Meteorological variables are used as forcings for the LSM in 6-h, energy balance mode. To create representations of streamflow from output fluxes generated by VIC, we use a large-scale river routing algorithm designed to be coupled to land surface parameterization schemes (Lohmann et al. 1996). We have selected this routing approach because of its history of use with the VIC model and its computational efficiency. The application of this model allows for the comparison of predicted and measured streamflow data as an integrated spatiotemporal quantity, or “semidistributed” model. Base flow and runoff from each upstream grid cell are routed to each station based on one-dimensional (in-channel only) flow. The 1D routing model is predicated on assumptions of linearity and time invariance and requires the specification of a set of parameters estimated independently from the land surface scheme. It uses base flow separation techniques (Linsley et al. 1975) to determine the parameters for the routing within each grid cell. This type of horizontal routing process is suitable for macroscale (10–300 km) effects.
A simplified routing strategy can have limitations. For example, Li et al. (2013) note that although impulse response function (IRF) methods can perform well across a wide range of resolutions, they view the runoff transport system as linear and stationary. These methods may require significant calibration, and representation of flows from a particular spatial/temporal regime may not translate effectively to others. The IRF method also implicitly incorporates the subtleties of overland flow (travel time across hill slopes and through finescale stream networks), so these types of dynamics may be inaccurately modeled.
d. Quantitative precipitation forcings
The VIC model processes time series of subdaily meteorological drivers such as precipitation, air temperature, and wind speed. The model is most sensitive to precipitation forcings. Quantitative precipitation data at high spatial (0.25°) and temporal (6 h) resolution were considered for this study. Emphasis was placed on real-time products. Reanalysis products such as Interim European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-Interim) were not used because of their coarser spatial resolution (1.5°); the LSM is run here at 0.25°. Several different high-resolution satellite precipitation products are operationally available, for example, the National Oceanic and Atmospheric Administration (NOAA) Climate Prediction Center (CPC) morphing technique (CMORPH; Joyce et al. 2004) and the Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA; Huffman et al. 2007). Each is available quasi-globally at 3-h temporal and 0.25° spatial resolutions. Passive microwave (PMW)-based estimates of instantaneous precipitation are subject to significant uncertainties, and each of these products has strengths and weaknesses that can vary both spatially and temporally.
A comparison of CMORPH versus TRMM 3B42, version 6 (3B42V6), in the continental United States for 2003–06 (Tian et al. 2007) indicates that CMORPH has season-dependent biases, with overestimation in summer and underestimation in winter. This led to TRMM 3B42V6 showing a lower bias and higher correlation with gauge measurements when aggregated to annual or seasonal time scales. On daily time scales, however, CMORPH correlated slightly better with ground-based measurements than TRMM 3B42V6. The authors recommended the use of 3B42V6 for long-term and climatological studies and the use of CMORPH for short-term applications because of its higher probability of detection of rainfall events.
The 3B42V6 is typically available at significant delay, ~10–15 days after the end of each month. A real-time version of TRMM (3B42RT) is available. While both CMORPH and 3B42RT have significant potential for global hydrological application and research, both display errors in high-latitude and complex-terrain regions. A study over the Laohahe basin in northern China for 2003–08 (Jiang et al. 2010) indicated that 3B42V6 had the best correspondence with surface observations, and CMORPH performed better than 3B42RT. Comparing the daily real-time products, CMORPH had better correlation, lower bias, and lower mean error.
Yong et al. (2010) evaluated 3B42RT and 3B42V6 for a high-latitude hydrological application. When used to drive the VIC model and generate a daily streamflow series, 3B42V6 produced Nash–Sutcliffe coefficient efficiency (NSCE) = 0.55 with 12.78% bias. The 3B42RT, on the other hand, produced NSCE = −18.39 with 327.14% bias. Both model-parameter error analysis and hydrologic application suggested that VIC cannot tolerate the nonphysical overestimation behavior of 3B42RT through the hydrologic integration process, possibly owing to the difficulties of precipitation estimations at high latitudes/complex topography.
We note similar problems with utilizing such real-time PMW data to initialize our hydrological model in South Asia and thus present a different approach in this work. To produce real-time streamflow forecasts using high-resolution (0.125°) quantitative precipitation and temperature estimates, we chose the first two time steps of the ECMWF deterministic (EC-DET) forecasting system. EC-DET forecasts are initialized twice per day (0000 and 1200 UTC) with globally assimilated observational data. The first time step of each forecast is an “analysis” field that represents the observed state of the atmosphere at the time of integration. To obtain forcings every 6 h, we use the first two time steps of consecutive forecasts each day (0000 and 1200 UTC). This leads to the use of a forecast product for two of the four daily time steps, but this is currently unavoidable—precipitation analyses at 6-h, 0.125° resolution are not available. This does have the advantage of maintaining continuity between the precipitation-driven hydrological model initializations and precipitation forecast variables. For total precipitation, minimum temperature, and maximum temperature, four time steps per day are then aggregated spatially from 0.125° to 0.25°.
The quality of daily EC-DET precipitation assimilated in this manner was evaluated by comparing magnitudes over the basin to an observational dataset. For aforementioned reasons, CMORPH data were selected as observations. A comparison of EC-DET to CMORPH precipitation after spatial aggregation over the region 24°–27°N, 68°–72°E is shown in Fig. 10. EC-DET tends to underestimate the magnitude of rainfall pulses in relation to CMORPH. To minimize systematic model bias differences, we implement a quantile-to-quantile (q to q) mapping technique following Hopson and Webster (2010, their Fig. 3). Daily EC-DET and CMORPH precipitation values from 2011 are binned into quantiles, creating a cumulative distribution function (CDF) for each 0.25° cell. EC-DET values are mapped onto the CMORPH precipitation fields by identifying the EC-DET CDF bin corresponding to the estimate and then replace the estimate with the value found in the same CMORPH CDF bin. This creates a nonlinear statistical correction for each grid cell. Note that the same technique is later used to statistically adjust the precipitation fields in forecast mode.
e. 2010–12 reconstructions
The hydrological framework was used to simulate the inflow to Tarbela Dam for 2010–12. As previously mentioned, because of the naturalized catchment upstream of this location, reconstructed flows should match reasonably well to observed gauge data at Tarbela. Measurements of average daily inflow at Tarbela Dam in 2010–12 were provided by the PMD through the World Bank. In a calibration phase, tuning parameters were modified to produce optimum fit between simulated flow and gauge measurements. These parameters include the fraction of the maximum base flow where nonlinear base flow begins Ds (0–1); the fraction of the maximum soil moisture where nonlinear base flow occurs Ws (0–1); a description of the VIC binfilt (0–0.4); and the depths of each soil layer d1–d3 (0.1–2.0 m). The results are shown in Fig. 11. Over the 3-yr period, R2 = 0.93 and RMSE = 692 m3 s−1. Note the significant difference in hydrographs when comparing 2010 to 2011 and 2012; even though the latter two were both flood years, 2010 caused flooding in nearly 20% of the country.
In Fig. 12, daily mean flows in 2010, 2011, and 2012 are quantitatively compared to observational data from July to September. In 2010, 2011, and 2012, R2 was 0.87, 0.78, and 0.81, respectively. Calculating the cumulative sum of each series at a daily time step, the total flow moving through Tarbela at the end of the 3 months is overestimated by 0.46% in 2010, overestimated by 6.58% in 2011, and underestimated by 18.71% in 2012.
4. Probabilistic forcings
The meteorological community is now capable of leveraging numerical models that rely on real-time, satellite-dominated global observing systems involving the atmosphere, ocean, land, and cryosphere (AMS 2013; Webster 2013). Observed values from both remote sensing and direct measurements are synthesized to initialize the models.
a. Probabilistic meteorological forecasting background
The atmosphere is inherently a nonlinear and complex system; small errors in the initial conditions of a weather forecast can cause significant divergence in pairs of near analogs (Lorenz 1969). For numerical weather prediction (NWP) systems, predictability is limited by model errors linked to the approximate simulation of atmospheric processes. It follows that a single (deterministic) numerical forecast may not provide the best estimate of the true state of the atmosphere. Ensemble methods seek to approximate the probability density function of the initial state.
There are a number of ways to describe the distribution of initial errors and to subsequently sample the distribution (Buizza et al. 2005). The bred-vector perturbation method (Toth and Kalnay 1993) is based on the argument that fast-going perturbations develop naturally in a data assimilation cycle and will continue to grow as short- and medium-range forecast errors. A Monte Carlo–type procedure can also be adopted, for example, by generating initial conditions through assimilation of randomly perturbed observations using different model versions in a number of independent data assimilation cycles (Houtekamer and Derome 1995). Here we will use forecasts from the ECMWF Variable Resolution Ensemble Prediction System (VarEPS), which are created via selective sampling procedures. Singular vector-based methods are used to identify the directions of fastest growth (Buizza and Palmer 1995; Molteni et al. 1996).
Finite-element representations of the dynamics and physics of the atmosphere introduce further uncertainties, for example, the uncertainty of parameters describing subgrid-scale cumulus convection in a global model (Leutbecher and Palmer 2008). These two types of errors are not definitively separable because the estimation of the initial conditions involves a forecast model, and thus, initial condition errors are affected by model errors. Ensemble forecasting is a response to the limitations imposed by the inherent uncertainties in the prediction process. A deterministic forecast can be complemented by utilizing a range of simulations associated with the complex boundary conditions governing the atmosphere. The ECMWF VarEPS forecasts consist of 51 ensemble members initialized twice per day (0000 and 1200 UTC), and each ensemble member has a 10-day forecast horizon. The horizontal resolution of the model is ~25 km from 0 to 10 days.
b. Ensemble prediction systems for streamflow forecasting
The adoption of ensembles of NWPs to drive operational streamflow forecast models is becoming adopted more frequently (Cloke and Pappenberger 2009), suggesting that we can project weather forecast uncertainty successfully onto hydrologic forecasts for a variety of uses. Gouweleeuw et al. (2005) indicated that operational flood forecasting has appeared to follow a shift from a single-solution or “best guess” deterministic approach to an ensemble-based probabilistic approach. Ensemble-based streamflow predictions in the Meuse and Odra River basins provided complementary information in the medium forecast range, both in the case of under- and overprediction. In a case study of two Belgian catchments, Roulin (2007) showed that hydrological ensemble predictions had greater skill than deterministic ones, owing to the fact that greater economic value for many users was reliant on the choice of a probability threshold appropriate to the decision-making situation. Pappenberger et al. (2008) conducted a case study of a flood event in Romania and indicated that awareness could have been raised as early as 8 days before the event. The authors concluded that the forecasts could have provided increasing insight into the range of possible flood conditions. Boucher et al. (2011) compared the performance of ensemble and deterministic forecasts in six subcatchments of the Gatineau River basin. In terms of continuous ranked probability score and mean absolute error, ensemble forecasts outperformed the higher-resolution deterministic forecasts at each lead time, except for the first forecasting horizon.
Meteorological input uncertainty is usually assumed to represent the largest source of uncertainty for hydrological prediction systems. However, an increase in the skill of meteorological models does not always consequently lead to similar improvement in hydrological modeling (Pappenberger et al. 2011). Despite the availability of quasi-global precipitation forecasts, it is important to test how well perceived meteorological skill translates to hydrological signals in each probabilistic forecasting application.
c. Skill of precipitation forecasts in South Asia
Webster et al. (2011) indicated that the heavy rainfall events of July–August 2010 affecting regions of northern Pakistan were skillfully predictable in both magnitude and timing 6–8 days in advance (Fig. 13b). The result motivated Galarneau et al. (2012) to examine the further use of ECMWF Ensemble Prediction System forecasts to determine synoptic-scale features modulating the production of heavy rainfall. In the 96–144-h forecast initialized at 0000 UTC 24 July, the probability of precipitation (PoP) >50 mm during 28–29 July reached 60% over northern Pakistan, confirming the Webster et al. (2011) study. The region of highest PoP over northern Pakistan agreed with the observed TRMM-derived precipitation analysis.
Forecast lead time diagrams for 2011 and 2012 were created using a similar procedure. Statistically corrected precipitation data from the ECMWF VarEPS were used to analyze critical areas of the IRB. In Fig. 13, the ensemble forecasts are evaluated against observed data from the NOAA CMORPH system. Forecasts for the regions highlighted in green (Figs. 13a,c,e) are considered. The white-to-red color scheme corresponds to the probability that the ECMWF forecasts exceed the observed CMORPH climatology plus one standard deviation. The blue line represents the observed CMORPH rainfall averaged for the same region and time period. For 2011 rainfall events, the forecast lead time diagram (Fig. 13d) shows high probabilities for rainfall pulses ~8 days in advance of the events. For 2012 events, the diagram (Fig. 13f) shows the lead time for a pulse starting on 3 September exceeding 10 days. A significant pulse on 8 September 2012 was not well predicted by VarEPS, however.
We will now show a case study intended to evaluate integrations of spatially and temporally varying precipitation/temperature forecasts over the IRB. Using a semidistributed hydrological model, can we translate spatially and temporally varying precipitation/temperature effects into skillful streamflow forecasts?
5. Hindcast verifications
a. Forecast spinup and reduction in ensemble members
To spin up the hydrological model (propagating moisture/energy storages forward), we use the first two 6-h time steps of every EC-DET forecast prior to the forecast date up until 1 January of the previous year. The method is the same as that used to create the streamflow reconstructions (section 3d). For example, to simulate a hindcast for 1 July 2010, we assimilate EC-DET data from 1 January 2009 to 30 June 2010. To simulate a hindcast for 2 July 2010, we assimilate EC-DET data from 1 January 2009 to 1 July 2010, and so on. This scheme mimics operational use, where the hydrological initializations are updated with the latest analysis information in every cycle. To create a streamflow prediction system capable of producing daily forecasts available 2 h after receipt of NWP initializations, computational resource limitations currently necessitate that we condense the number of simulated ensembles to 11, or approximately one-fifth of the 51 ensembles provided by ECMWF VarEPS. VarEPS forecasts consist of one control member and 50 other members resulting from perturbed initial conditions. We simulate the control member unaltered. A reduction in the other 50 members is performed by ranking each of the ensembles by precipitation magnitude and averaging the meteorological variables in each five-member “block.” This leaves us with 1 + 10 or 11 total sets of variables with which to force the hydrological model.
b. 2010 hindcasts (raw QPF)
Figure 14 shows four 10-day streamflow hindcasts at Tarbela Dam: 27 July, 1 August, 13 August, and 23 August 2010. The blue vertical line indicates the first day of the forecast, the ensemble mean is plotted as a bold green line, and each ensemble member (11 total) is plotted as a thin green line. The reconstructed streamflow is plotted as a bold black line, and the black horizontal lines indicate the 70% and 95% levels of the expected overflow threshold (22 643 m3 s−1, or the “extremely high” flood limit as given by PMD). We will refer to this set of simulations as “raw quantitative precipitation forecast (QPF)” hindcasts.
In Fig. 15a, the 2010 reconstructed streamflow at Tarbela Dam (black dashed line) is plotted versus the observed inflow (green line). In Fig. 15b, the reconstructed inflow from 1 July to 30 September 2010 is plotted versus the ensemble mean at each lead time. The ensemble mean consistently underestimates the reconstruction, and the magnitude of the underestimation generally increases with lead day for most values of streamflow. Figure 16a shows scatterplots of the ensemble mean versus reconstructed inflow at 1, 3, 5, and 7 lead days, for 92 hindcasts (from 1 July through 30 September). Figure 16b shows linear regressions of each data series and indicates a trend of nonlinear biases. Figure 16c shows the correlation and Fig. 16d shows the RMSE as a function of lead day. A persistence hindcast was also generated on each hindcast day; this refers to the practice of using the river streamflow on day t to forecast the discharge on day t + tf. The ensemble-mean correlation outperforms persistence at all lead times—R2 > 0.90 throughout the 1–10-day time window—but the ensemble-mean RMSE is worse than persistence for days 5–10. The uncertainty boundaries resulting from streamflow hindcasts in 2010 are shown in Fig. 17. The four plots show 30%, 50%, 70%, and 90% of the total distribution created by the ensemble members.
Though the data correlation is favorable, the RMSE values are too large (at five lead days, the error is ~3000 m3 s−1 or 40% of mean flow from 1 July to 30 September). Error values must at least show improvement over persistence for this system to be useful. The underestimation trends displayed in Figs. 16a and 16b suggest that the process could benefit from some form of statistical postprocessing. We undertake procedures to minimize the error of the ensemble mean versus the streamflow reconstruction while attempting to preserve the correlation. Ideally, an automatic process would be implemented that “recenters” the data distributions shown in Fig. 16a. As previously mentioned, the LSM tends to be most sensitive to precipitation uncertainty; a systematic underestimation of forecasted precipitation typically leads to a systematic underestimation of streamflow. Because the biases are nonlinear, a q-to-q correction of the precipitation forcings was again used. Similar to the model initializations described in section 3d, a q-to-q mapping of VarEPS precipitation data was created from daily operational forecasts in 2011—in each grid cell and by each lead day—to CMORPH precipitation data.
c. 2010 hindcasts (adjusted QPF)
These modified precipitation forcings are used to create a new series of 92 hindcasts from 1 July to 30 September 2010. Hindcasts from 27 July, 1 August, 13 August, and 23 August are shown in Fig. 18; we will refer to these simulations as “adjusted” QPF hindcasts. Figure 19b shows the reconstructed inflow from 1 July to 30 September 2010 plotted versus the ensemble mean at each lead time. Note that the ensemble mean at each lead day is better centered about the reconstruction. Figures 20a and 20b illustrate the new data distributions by lead day via scatterplots and linear regressions. The fittings by lead day are closer to the 1-to-1 line, but there is still slight underestimation, especially at lower values of flow (<7500 m3 s−1). Figure 20c shows that the correlation resulting from the adjusted QPF hindcasts is slightly worse compared to the raw QPF simulations but is still >0.85 at all lead times and better than persistence on all lead days. Figure 20d shows that the RMSE is now better than persistence at all lead days, an improvement compared to raw QPF simulations. The RMSE of the ensemble mean versus reconstructed streamflow is now less than 1750 m3 s−1 at all lead days; this is <24% of mean streamflow from 1 July to 30 September 2010. This approach of minimizing the ensemble-mean error is not without drawbacks, however; a comparison of Figs. 17 and 21 shows that the upper bounds of the 5th–95th percentiles, at longer lead times (>3 days), may increase to an extent too large to be practically useful for evaluating uncertainty. As a result, a qualitative analysis of a particular hindcast (after correction) may need to be constrained to the ensemble mean and the middle 70% of the distribution (Figs. 21a–c).
d. 2011 hindcasts (adjusted QPF)
Figure 22 shows corrected 10-day streamflow hindcasts from 20 July, 20 August, 3 September, and 14 September 2011. Figure 23b shows the reconstructed inflow from 1 July to 30 September 2011 plotted versus the ensemble mean at each lead time. In Fig. 24b, linear regressions of the data by lead time show generally good agreement with the 1-to-1 line, but higher values of streamflow (>5000 m3 s−1) at longer lead times (>5 days) tend to be overestimated. Lower values of streamflow (<5000 m3 s−1) tend to be underestimated at longer lead times (>2 days). Figure 24c shows that the ensemble-mean correlation outperforms persistence at all lead times, with R2 > 0.7. Figure 24d shows that RMSE outperforms persistence on all lead days. The error magnitudes were <28% of mean streamflow from 1 July to 30 September 2011. The uncertainties generated by the ensemble spread at the upper quantiles at longer lead times may again be too large to be practically useful, shown in Fig. 25d.
e. 2012 hindcasts (adjusted QPF)
Figure 26 shows corrected 10-day streamflow hindcasts from 3 August, 20 August, 27 August, and 3 September 2012. Figure 27b shows the reconstructed inflow from 1 July to 30 September 2012 plotted versus the ensemble mean at each lead time. In Fig. 28b, linear regressions of the data by lead time show that higher values of streamflow (>5000 m3 s−1) tend to be underestimated at longer lead times (>2 days). Lower values of streamflow (<5000 m3 s−1) tend to be underestimated at longer lead times (>2 days). Figure 28c indicates that the correlation outperforms persistence at all lead times, with R2 > 0.7. Figure 28c indicates that the RMSE outperforms persistence at all lead times. The error magnitudes were <34% of mean streamflow from 1 July to 30 September 2012. The uncertainty generated by the ensemble spread for the upper quantiles, at longer lead times, may be too large to be practically useful, shown in Fig. 29d.
f. Alert states
Streamflow can be forecasted at any location in the basin (0.25° resolution). It should be noted, however, that we currently do not have access to daily historical data to the south of Tarbela Dam and simulations past this point are only model projections. In Fig. 30, discharge is shown from the three locations on the main stem of the Indus River (Fig. 12a): Tarbela Dam, green; southern Punjab province, blue; and southern Sindh province, red. Daily flows are plotted. The probabilities generated from the evolution of the ensemble distribution can be used to create alert states for each day of the 10-day forecast. Figure 30 shows these alerts for the three main stem locations based on river threshold estimations from the PMD FFD. The two black horizontal lines correspond to 70% and 95% of estimated bankfull discharge. In the case that at least four ensembles break a discharge threshold, the alert state is increased to either watch (yellow) or warning (red). If this condition is not met, the alert state is normal (white). Three probability states are also associated with the alert state: if less than 40% of the ensembles break a threshold, this indicates low confidence; 40%–80% breaking a threshold indicates medium confidence; and 80%–100% breaking a threshold indicates high confidence. The respective ensemble means are plotted as bold lines. For the 19 July 2010 hindcast, simulations projected a watch state at location 2 on 21 July and a warning state for 22–28 July. Simulations projected a warning state at location 3 from 24–28 July.
6. Floodplain projections
Currently, there are numerous limitations to the analysis of flood prediction in Pakistan. This is in large part due to data availability for researchers outside of Pakistan. More detailed information on the following is needed: 1) regulatory information for the highly managed IRB, 2) daily historical flows in central and southern parts of Pakistan, and 3) measurements of specific cross-sectional areas leading to overbank thresholds. In the absence of this information, we can make projections of naturalized flow into the floodplain resulting from a static overflow threshold throughout the basin. Our main focus is to indicate areas affected by “worst case” overflows (i.e., in the absence of any regulation) on the main stem of the Indus River, so we have selected a threshold of 22 650 m3 s−1 (8 × 105 ft3 s−1) for each cell in the basin delineation. In the future, we plan to make use of more realistic overflow thresholds by incorporating lower thresholds on each tributary and at specific levels at critical structures (e.g., Jinnah, Chashma, Taunsa, Guddu, Sukkur, and Kotri Barrages). Locations 2 and 3 (Figs. 12a, 30) in the “alert state” image described in the previous section are an indication of the potential for this forecast scheme; we can simulate daily streamflow at any point in the basin. The 2D model operates at relatively coarse spatial resolution (0.25°); it is intended to preserve the volumes of water overtopping the main stem of the Indus and to give a general assessment of worst-case inundation extent based on overages in each cell.
The output of the “naturalized” routing algorithm in each grid cell is imported to create a three-dimensional matrix; its dimensions are latitude, longitude, and time. Because discharge is simply volume per time, this matrix is a set of images describing the average volume of water moving through each cell at each time step. These water volumes serve as the raw inputs for a 2D flow module. The 2D module simulates flood-affected areas because of flow overages in image cells (pixels).
First, a threshold image is populated (the expected river threshold in each cell). We also populate an overflow image (the volume exceedance in each grid cell), a modifiable 1D image (same as the raw set of images, but capped at the channel saturation limits in each cell), a flow direction image (direction water moves out of each cell outlet), a flow distance image (the length of the flow path in each cell), an elevation image (the average terrain elevation in each cell), and a 2D image (describing flood-affected areas).
The following sequence is iterated for each simulated day. First, an overflow image is generated based on the water volumes exceeding each cell threshold. A modifiable 1D image is created by capping the raw volumes for this day at saturation. For each nonzero cell in the overflow image, the volume of water is removed from the next day’s modifiable 1D image according to the cell-to-cell travel times specified by the 1D routing solutions. The sequence continues until the flood volume has either moved past the basin outlet or beyond the image spatial extents, and it is repeated for each subsequent image in the time series.
We now have iteratively deconvolved the flood wave responses from the 1D, or in-channel, image series. The series of overflow images serves as the basis for raw water volumes to be modified by floodplain attributes. As far as 2D expansion into adjacent cells, we make three simplifying assumptions: 1) flow overtops the riverbanks omnidirectionally, 2) the velocity of overflow is much slower than the typical river flow, and 3) the reflow velocity of floodwater back into the river channel (related to persistence time) is even slower. This is not a fully hydrodynamic solution and thus faces limitations; for example, the physical mechanisms driving interactions between lateral and channelized flow are significantly more complicated than described here. We have made simplifications to support operational calculations.
For each cell in the overflow image, the surrounding total storage on the floodplain is calculated by estimating the volume of a cone based on the elevations image. The overflow water progresses radially outward into the floodplain according to the overflow velocity for one time step. If overflow volume of this particular cell is fully contained by the estimated floodplain storage, the outward movement of water stops, and reflow back into the channel takes over (as long as the channel is not currently saturated). If the overflow volume is greater than the estimated floodplain storage, we move to the next time step and so on until the storage condition has been met and reflow can take place.
We are left with two major outputs: 1) the modified 1D image describing the in-channel flows and 2) the modified 2D image describing the floodplain flows. Evaporation of water from the floodplain is not currently considered, but we plan to address this in future work. The combination of these images creates a sequence that simulates the sequence of overflow and drainage in the basin. Early results from this algorithm development were made in a report to the World Bank (Webster and Shrestha 2013).
Figure 31 shows the 2010 reconstruction. Three time series are plotted corresponding to three locations (Fig. 12a) on the Indus main stem. The observed inflow at Tarbela Dam is plotted as a black series. Note that the red and blue series plateau at 22 650 m3 s−1; this was our assumption for the mean daily flow level leading to bankfull discharge at points on the Indus main stem. Figure 31 also shows four spatial maps of flood-affected areas simulated by the model on four different days. Each of these images is a daily “snapshot” of the flow magnitudes in each cell in the basin network. Dark blue colors correspond to low discharge, and dark red colors to high discharge. On 16 July 2010, the naturalized model simulates in-channel flow for all major rivers. On 23 July 2010, the model simulates overbank flow just south of the confluence between the Indus main stem and its major tributaries (southern Punjab). On 29 July 2010, flooding in northern part of the country is simulated and flood-affected areas have increased in size. By 26 August 2010, the model simulates an intensification of flooding in the south central area of the country.
Figure 32 shows the 2011 reconstruction. On 28 July 2011, the model simulates in-channel flow for all major rivers. On 1 August 2011, the model simulates overbank flow in southern Sindh. This simulated flood pulse arrived earlier than news reports and remote sensing suggest (the Sindh floods began in mid-August 2011). As indicated by Fig. 1, a significant portion of the Indus River basin is affected by rains over India, which would ultimately drain westward into Pakistan in an unregulated scenario without withholding by dams and barrages. However, because of political agreements between the two countries, India has exclusive control of the Sutlej, Beas, and Ravi Rivers, while Pakistan controls the Indus, Jhelum, and Chenab. This means that major dams in India such as the Bhakra (8.63 × 109 m3), the Nathpa Jhakri (3.08 × 109 m3), and the Pong (7.4 × 109 m3), along with many other man-made diversion effects throughout Pakistan, significantly attenuate flows into Pakistan. Figure 31 shows the unregulated model projections in Pakistan shortly after heavy rains in northern India. The model simulates Sindh flooding in early September and intensification throughout the month.
Figure 33 shows the 2012 reconstruction. On 25 August 2012, the model simulates overbank flow in southern Punjab. This first flood pulse shows a “false positive” in comparison to flooding reports, but we again need to consider the mitigating properties of the large eastern dams (in India), which are not simulated. These images illustrate the benefits of heavy regulation throughout Pakistan and India and what might occur in a worst-case scenario. The progression of the projected flood wave in early September is shown. Dangerous flows on the main stem Indus enter southern Punjab starting 6 September 2012 and do not abate until 15 September 2012.
We have described a methodology for generating 10-day probabilistic streamflow forecasts by driving the VIC semidistributed macroscale hydrologic model with gridded ECMWF atmospheric forecasts. The forecast scheme is intended to be run daily, and computations are designed to complete 2 h after receipt of NWP data. The LSM operates at a 6-h time step and 0.25° spatial resolution, balances the surface energy budget within each grid cell, and creates estimates of output fluxes in each grid cell. Base flow and runoff from each upstream cell are combined based on one-dimensional flow routing. This type of horizontal routing process is suitable for large-scale (10–300 km) effects. The uncertainty of meteorological forecast variables is projected onto discharge estimations. This type of approach, as opposed to one reliant on physical gauges, is uniquely suited to situations where real-time data are suspect. One major advantage is the capability of the process to remain functional even in the case that extreme floods result in the loss of upstream equipment and/or data transmission failure from gauges (Cloke and Pappenberger 2009). An alert system is described that communicates alert states and the confidence of each forecast.
Comparisons of streamflow reconstructions with observed inflow data at Tarbela Dam over a 3-yr period yield an R2 correlation of 0.93 and RMSE of 692 m3 s−1. Probabilistic meteorological forcings were used to drive the hydrological modeling framework, generating 276 ten-day streamflow hindcasts from 2010 to 2012 (July–September). In comparing the ensemble mean of these hindcasts to reconstructed inflow, the R2 correlation was >0.85 for 2010, >0.70 for 2011, and >0.70 for 2012. The RMSE was <24% of mean streamflow for 2010, <28% for 2011, and <34% for 2012. In terms of these metrics, the ensemble-mean hindcasts outperformed persistence hindcasts at all lead times.
Previous efforts to develop rapid forecasting and warning in places such as Bangladesh (Hopson and Webster 2010; Webster et al. 2010) have demonstrated the utility of experimental systems based on probabilistic forecasting. The Bangladesh system, which was first used experimentally in 2004 and became operational in 2007, was developed with support from the U.S. National Science Foundation, the U.S. Agency for International Development, and the humanitarian agency Cooperative for Assistance and Relief Everywhere (Webster 2013). We have presented a methodology based on ensemble simulations that may be capable of providing similar early warnings (Fig. 30) for aid organizations. Operational hydrologic ensemble forecasting is relatively new, and details on these types of macroscale hydrometeorological ensemble approaches are just starting to emerge (Alfieri et al. 2012; Demargne et al. 2014).
Limitations and future work
Currently, there are numerous limitations to the analysis of flood prediction in Pakistan. This is in large part due to data availability for researchers outside of the country. More detailed information is needed on regulatory information for the highly managed IRB, historical flows in central and southern parts of Pakistan, and measurements of specific cross-sectional areas leading to overbank thresholds. In the meantime, we have described a fully automated operational process that is capable of operating without any of these requirements. The creation of a mechanism to obtain this clarifying information—especially in a real-time context—is ongoing.
We have sought to make quantitative comparisons to gauge information where available. A case study was described at Tarbela Dam, which is particularly relevant given the complex nature of water management decisions in the IRB. This methodology can be used to forecast flow at any point in the basin, and we have noted that forecast skill in the IRB (in comparison to our reconstructions) appears to increase as catchment size increases. Although a substantial amount of area is drained at Tarbela (~170 000 km2), the IRB in its entirety drains ~1 165 000 km2, and projections of floodwater are typically most critical at southern locations in Pakistan. Specific quantitative analyses at locations far to the south are not included in this paper because of a lack of historical data. As soon as data become available, we will repeat the hindcast analyses at other critical points in the basin.
A 2D routing scheme was described to project overflows onto the floodplain. The simulations of routing mechanics described here are relatively coarse in spatial scale (0.25°), owing to the logistics associated with performing daily ensemble forecasts. Simulations of more finescale effects are highly desirable, and recent research indicates that this is possible. For example, Neal et al. (2012) present a computationally efficient hydraulic model for simulating the spatially distributed dynamics of water surface elevation, wave speed, and inundation extent over large spatial domains. The scheme was shown to be numerically stable and scalable in test cases such as the Niger River in Mali. Li et al. (2013) present the Model for Scale Adaptive River Transport (MOSART), a physically based runoff routing model for simulating routing processes through hill slopes, sub-networks, and main channels. The model was shown to be capable of capturing the seasonal variation of both streamflow and channel velocities at different spatial resolutions. It may be possible to either couple outputs from the 1D scheme described in this work with subgrid-scale inundation modeling or replace the model in its entirety with a new paradigm.
For future work, we plan to increase computational resources in order to increase the number of streamflow ensembles. Operational forecasting systems are now being run on a daily basis with global coverage (Alfieri et al. 2012; Wu et al. 2012), indicating that it is possible to design a computationally efficient structure supporting large-scale LSM processes. Code optimizations following these methodologies are being pursued. There are also recent advancements to the VIC model that have not yet been leveraged, for example, improved daily long-/shortwave estimations in the presence of snow, subdaily humidity estimations, and computation of soil layer average temperatures (University of Washington 2012).
In each of the three study years, corrected streamflow hindcasts tended to underestimate low flow values. In 2011 and 2012, higher values of flow tended to be overestimated. Ideally, we would like to always bound the solution by the ensemble spread. In some cases (especially early July hindcasts), the solution was not contained by the 70% interpercentile bounds. This suggests that the quantile matching procedure is still not sufficient to address systematic forecast errors/biases. For computational simplicity, the quantile maps in each grid cell operated on a fixed temporal scale; that is, they were not updated on a real-time basis. We plan to update these quantile maps in the future on a dynamic basis, that is, by ingesting real-time CMORPH or TMPA data. Research is ongoing to determine whether such improvements to existing framework yield improved estimates for medium-range streamflow forecasting.
This research is supported by the National Science Foundation under Award 0965610. We would like to specifically thank Mr. Javaid Afzal of the World Bank, the late Dr. Kurt Frankel from Georgia Tech, Dr. Thomas Hopson from the National Center for Atmospheric Research, Dr. Dennis Lettenmaier from the University of Washington, and Raghav Vemula from ESRI for their help in this work. We would also like to thank all those at ECMWF who work to provide the numerical weather prediction data critical to this research.