1. Introduction
With water, the most essential natural resource for sustaining life, coming under threat (Vörösmarty et al. 2010), it becomes imperative to understand the hydrological cycle and monitor its constituents. This urgent need, however, has not been able to stem the steady decline in the number of precipitation and runoff gauges (Fig. 1), which still provide the only direct observations of terrestrial water cycle components. This is a setback both for our understanding of the evolution of the hydrological cycle and for the calibration and validation of hydrological and atmospheric models and ex situ measurements of those components. The situation is further aggravated by the lack of reliable data for evapotranspiration and water storage changes, which, together with precipitation and runoff, form the major components of the large-scale hydrological cycle.
Over the years, in the absence of measurements of the hydrological cycle components, it has been normal practice to resort to sophisticated hydrological and atmospheric reanalysis models to study the global water balance (e.g., Trenberth et al. 2007). However, the hydrological or land surface models suffer from the steady changes in the spatial distribution and number of precipitation or runoff gauges. For such models, observations are needed both as an input and also as a validation measure. Therefore, there is a desperate need for new and alternative methods.
Spaceborne measurements offer some respite, with a plethora of satellites now providing nearly all the components of the global hydrological cycle: precipitation from the Tropical Rainfall Measuring Mission (TRMM; Kummerow et al. 2000), water storage changes from the Gravity Recovery and Climate Experiment (GRACE; Schmidt et al. 2008), runoff from satellite altimetry and synthetic aperture radar (Alsdorf et al. 2007), soil moisture from the Soil Moisture and Ocean Salinity (SMOS) satellite (Kerr et al. 2010) or (beginning in late 2014) the Soil Moisture Active Passive (SMAP) satellite (Entekhabi et al. 2010), and a host of variables concerning the hydrological cycle from Aqua (Parkinson 2003). The spaceborne observations bring with them the advantages of global coverage and homogeneous accuracy, but with limited spatial and temporal resolutions. Hrachowitz et al. (2013) acknowledge these advancements and provide examples of spaceborne measurements becoming indispensable to hydrological research, especially in data-poor regions like West Africa, Asia, and the vast open oceans.
In the context of runoff, Alsdorf et al. (2007) provide a broad overview of the types of spaceborne runoff measurements and their relative merits and challenges. Recently, but not for the first time, Tourian et al. (2013) demonstrated the potential and accuracy of runoff derived from satellite altimetry. However, an application of this method to catchments still depends on the existence of contemporaneous or previous in situ runoff measurements. Therefore, our study focuses on independent methods using mass estimates of the water storage in river systems from GRACE, while observed runoff is used as a validation measure only.
The GRACE satellite mission by design provides indirect estimates of water storage changes (Tapley et al. 2004), which have revolutionized global- and continental-scale hydrological studies. This is corroborated by the wealth of literature dealing with a range of applications: estimation of basin- to continental-scale water storage (Crowley et al. 2006; Schmidt et al. 2008) or evapotranspiration (Rodell et al. 2004a), regional weather analysis (Seitz et al. 2008), large-scale moisture flux divergences (Swenson and Wahr 2006), groundwater depletion in India (Rodell et al. 2009), and polar ice mass loss monitoring (Schrama and Wouters 2011). Nevertheless, the GRACE data come with their own problems, the most persistent being the filtering problem (e.g., Werth et al. 2009). The aftereffects of the filtering, leakage, and poor spatial resolution limit the utility of GRACE data to catchments with an area of ~200 000 km² or more (Longuevergne et al. 2010). Despite these challenges, our previous studies (Riegger et al. 2012; Fersch et al. 2012) and current improvements in GRACE data processing (Dahle et al. 2013; Bettadpur 2012) provide enough grounds to pursue a GRACE-based approach for global-scale runoff estimation.
The use of GRACE water storage changes entails the use of the water balance equations, wherein runoff is computed from a combination of precipitation and evapotranspiration in addition to water storage changes. Such water budget–based methods have already been applied by Sheffield et al. (2009), who used GRACE and other remotely sensed data for computing streamflow from the Mississippi basin. Syed et al. (2005) used an atmospheric water balance for estimating discharge over the Amazon and Mississippi basins and, in the following years, total freshwater discharge from the pan-Arctic region (Syed et al. 2007) or continents and large river basins (Syed et al. 2009). Other methods of estimating runoff from GRACE are also being proposed (e.g., Riegger and Tourian 2014), but they are still in a very nascent state of development. For a long time, evapotranspiration has been elusive and has been the thorn in the flesh in global hydrological studies (e.g., Trenberth et al. 2007; Schlosser and Houser 2007). Lately, attempts have been made to provide global evapotranspiration estimates apart from what has been available from various hydrological and atmospheric reanalysis models (Salvucci and Gentine 2013; Miralles et al. 2011; Jung et al. 2009; Mu et al. 2007). Precipitation and runoff datasets are the most mature of all the hydrological datasets, but there are significant differences between the different precipitation datasets (Lorenz and Kunstmann 2012). With an accuracy of 10%–20%, observed runoff is the most accurately measured component of the hydrological cycle (Fekete et al. 2012), but suffers from variations in the time period of observation, substantial delays in data access, and large declines in the monitoring capacity (Fekete et al. 2002; Fekete and Vörösmarty 2007; Shiklomanov et al. 2002; Vörösmarty et al. 2001).
To analyze the performance of runoff estimations independent of gauge-based observations, we set out to derive large-scale runoff from catchments using water-balance-based methods. To this end, we use the most recent versions of widely used data sources of precipitation, evapotranspiration, water storage changes, and moisture flux divergences. By combining all these datasets, we end up with a total of 90 possible runoff estimates from a terrestrial water balance and 15 from an atmospheric–terrestrial water balance for each catchment. These estimates are then validated against observed runoff data from the Global Runoff Data Centre (GRDC; GRDC 2013) over 96 river basins. The performance of these estimates is also compared with the performances of sophisticated hydrological (land surface) models and an empirical method involving the precipitation–runoff (P–R) ratio. This allows us to:
evaluate the potential of runoff estimation from water-balance-based methods, land surface models, and an empirical predictor;
analyze the quality of the different contributing datasets; and
evaluate the current status of the closure of catchment-scale water budgets.
Our study is outlined as follows: in section 2, we elaborate on the water-balance-based methods and the empirical method for estimating runoff from precipitation; in section 3, the datasets that we have used in the different approaches are detailed; in section 4, we investigate the utility of the runoff estimates from water-balance-based methods as an alternative to observed runoff and also exploit those estimates to study the global water balance closure; and finally, we draw our conclusions in section 5.
2. Methods
The primary concern in this study is to estimate runoff globally from the landmasses, and we have chosen to use the water balance equations as the means to our end. Water balance equations define the hydrological cycle in quantitative terms by taking into account the amount of water entering and leaving a river basin within, for example, 1 month. This budgeting can be approached in several different ways, and here we have chosen both large-scale terrestrial and atmospheric–terrestrial water balance equations. Our choice was driven mainly by the availability of water storage change data from the GRACE satellite mission. All our calculations are performed on a monthly basis in order to be as consistent as possible with the monthly runoff observations as provided from the GRDC.
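As a minimal sketch of the two budgets described above, the terrestrial form can be written as R = P − ET − dM/dt and the atmospheric–terrestrial form as R = −∇·Q − dM/dt. The second form assumes that changes in atmospheric water content are negligible at monthly scales; this assumption, the function names, and the sample values are illustrative and are not taken from the equations of this paper.

```python
import numpy as np

def runoff_terrestrial(p, et, dm_dt):
    """Terrestrial budget: R = P - ET - dM/dt (all fluxes in mm/month)."""
    return np.asarray(p, float) - np.asarray(et, float) - np.asarray(dm_dt, float)

def runoff_atmospheric(div_q, dm_dt):
    """Atmospheric-terrestrial budget: R = -div(Q) - dM/dt.

    Assumption: the change in atmospheric water content is neglected.
    """
    return -np.asarray(div_q, float) - np.asarray(dm_dt, float)

# Hypothetical catchment-mean values in mm/month (not data from the study).
p = [120.0, 95.0, 60.0]        # precipitation
et = [70.0, 65.0, 55.0]        # actual evapotranspiration
dm_dt = [20.0, -5.0, -15.0]    # water storage change
div_q = [-50.0, -30.0, -5.0]   # negative divergence means moisture convergence

r_hyd = runoff_terrestrial(p, et, dm_dt)
r_atm = runoff_atmospheric(div_q, dm_dt)
print(r_hyd)
print(r_atm)
```

In this constructed example the two budgets agree because the sample moisture flux convergence was chosen to equal P − ET; with real datasets the difference between the two estimates reflects the budget imbalance.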
a. Hydrological budget approach
b. Hydrometeorological budget approach
c. Runoff–precipitation ratio
Due in part to the simplicity and empirical nature of the model, its success is tied to the dynamic behavior of the river system whose runoff is estimated. Runoff of a river system more prone to vagaries in the mean annual hydrological cycle, or in which there is a lot of anthropogenic intervention, might not be estimated well by this model. Furthermore, evapotranspiration and water storage do not appear explicitly in these formulas, but they certainly affect the relationship between precipitation and runoff (i.e., their impacts are “hidden” within the ratios). It is thus essential that both evapotranspiration and water storage changes follow a stable annual cycle, which does not change significantly from year to year. That being said, estimating runoff from catchments with, for example, a trend in one of these two variables might lead to large uncertainties.
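One plausible form of such a ratio model can be sketched as follows. The sketch assumes that the ratio is computed per calendar month as mean observed runoff divided by mean precipitation over a reference period, and that predicted runoff is this ratio times the precipitation of the target month. This specific form, the function names, and the numbers are assumptions for illustration and may differ from the formulas used in this study.

```python
import numpy as np

def monthly_ratio(r_obs, p_obs, month):
    """Ratio of mean runoff to mean precipitation for each calendar month (1..12).

    Months without reference data keep a ratio of NaN.
    """
    r_obs = np.asarray(r_obs, float)
    p_obs = np.asarray(p_obs, float)
    month = np.asarray(month, int)
    ratio = np.full(12, np.nan)
    for m in range(1, 13):
        sel = month == m
        if sel.any() and p_obs[sel].mean() > 0:
            ratio[m - 1] = r_obs[sel].mean() / p_obs[sel].mean()
    return ratio

def predict_runoff(p_new, month_new, ratio):
    """Scale precipitation by the ratio of the matching calendar month."""
    return np.asarray(p_new, float) * ratio[np.asarray(month_new, int) - 1]

# Hypothetical reference period: two Januaries and two Februaries (mm/month).
month = [1, 2, 1, 2]
p_obs = [100.0, 50.0, 120.0, 70.0]
r_obs = [40.0, 10.0, 48.0, 14.0]

ratio = monthly_ratio(r_obs, p_obs, month)
r_pred = predict_runoff([110.0, 80.0], [1, 2], ratio)
print(ratio[:2], r_pred)
```

Because the ratio is fixed per calendar month, any trend or interannual shift in evapotranspiration or storage is invisible to the predictor, which is the limitation noted above.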
To evaluate the performance of this empirical model as a predictor (i.e., for ungauged regions or runoff observations with large data gaps), the values of
d. Water budget imbalance
3. Data
The water balance equations (2) and (6) and the empirical model (9) use all the components of the hydrological cycle. We have ensured that all the datasets that we have used for our analysis here are observation based. In cases where direct observations are not possible, for example, moisture flux divergence, only partially modeled datasets are used. In addition to this, we have also taken care to use only datasets that are publicly available. An overview of the different data sources of precipitation, evapotranspiration, water storage changes, moisture flux divergences, and runoff is given in Table 1.
Summary of the observation- and model-based datasets containing precipitation (P), actual evapotranspiration (ETa), vertically integrated moisture flux divergences
a. Runoff
To attain maximum data coverage, both in space and time, data are collected from different sources: the GRDC; the Arctic Rapid Integrated Monitoring System (ArcticRIMS) project; Water Survey Canada; the U.S. Geological Survey (USGS); the U.S. Army Corps of Engineers (USACE); the Environmental Research Observatory (ORE) Geodynamical, Hydrological, and Biogeochemical Control of Erosion/Alteration and Material Transport in the Amazon Basin (HYBAM) project; the Department of Water, Land and Biodiversity Conservation, Government of Australia; and the Department of Water Affairs and Forestry, Republic of South Africa.
The GRACE-based approaches (i.e., Rhyd and Ratm) are essentially global methods capable of providing runoff estimates of all catchments without the necessity of any runoff observations. To validate these estimates, we need contemporaneous (2003–10) observed runoff data for all the catchments. Because of data gaps both in space and time, we choose catchments that have a minimum of five years of observed runoff data within the period 2003–10 so that we can perform a meaningful statistical assessment. Applying this restriction, we are left with 96 catchments out of a total of 403 catchments to perform our investigation. An overview of the different study regions, the respective river gauges, the area of the basins, and the mean annual discharge is given in Table 2 and Fig. 2. The number of available runoff observations is shown in Fig. 3, where the white and gray boxes indicate available and missing data, respectively.
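The selection rule can be sketched as a simple filter on the number of valid monthly values. The sketch assumes that "five years" means at least 60 non-missing months within 2003–10, not necessarily contiguous; this reading, the catchment names, and the data are hypothetical.

```python
import numpy as np

MIN_MONTHS = 5 * 12  # assumption: five years read as 60 monthly values

def select_catchments(runoff_by_catchment, min_months=MIN_MONTHS):
    """Return the names of catchments with enough non-missing monthly values."""
    selected = []
    for name, series in runoff_by_catchment.items():
        n_valid = np.count_nonzero(~np.isnan(np.asarray(series, float)))
        if n_valid >= min_months:
            selected.append(name)
    return selected

n_months = 8 * 12                   # monthly values for 2003-10
complete = np.ones(n_months)        # hypothetical gauge without gaps
gappy = np.full(n_months, np.nan)   # hypothetical gauge with three years of data
gappy[:36] = 1.0

selected = select_catchments({"catchment_A": complete, "catchment_B": gappy})
print(selected)
```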
River basins that were used in this study, including the respective river gauge, its location, the area of the basin, and the mean annual discharge as provided from GRDC. The abbreviation trib. stands for tributary, and d/s stands for downstream.
Spatially, the collection of catchments covers a wide range of climatic and geographic conditions. This includes some tropical catchments with the highest discharge worldwide (e.g., Amazon and Congo) and also the vast high-latitude basins (e.g., Lena, Ob, and Yana) with typically low discharge values during the winter months due to frozen conditions and peak values between June and August during and after the melting season. Also, the areal extent of the selected catchments varies from 4.7 × 10⁶ to 2.1 × 10⁴ km², where we have deliberately included the small catchments (<200 000 km²) as well. This allows the quantification of the area dependence of GRACE-based estimates. To better characterize the study regions, Figs. 4 and 5 show the annual cycle of precipitation, evapotranspiration, runoff, water storage changes, and moisture flux convergences for all the basins.
b. Precipitation
Precipitation is traditionally measured using various types of rain gauges, such as the nonrecording cylindrical container, the recording weighing gauge, floating gauges, and tipping-bucket-type gauges. More recently, precipitation has also been measured with weather radar and weather satellites. Because of the adequate number of sensors, global gridded precipitation data are available from different data providers. In this study, we used precipitation data from the following:
Global Precipitation Climatology Centre (GPCC; Schneider et al. 2014),
Global Precipitation Climatology Project (GPCP; Adler et al. 2003),
Climate Prediction Center (CPC) Unified precipitation (Chen et al. 2008),
Climatic Research Unit (CRU; Harris et al. 2013), and
University of Delaware (DEL; Matsuura and Willmott 2012).
c. Evapotranspiration
A wide variety of model approaches for estimating evapotranspiration exist, ranging from simple empirical ones to complex ones that include the radiative energy balance. Many studies have focused on remote sensing approaches, in which observations of surface temperature and vegetation indices are used to estimate evapotranspiration. In this study, we have used data from the following:
Global Land Surface Evaporation: The Amsterdam Methodology (GLEAM; Miralles et al. 2011),
GLDAS, version 2 (GLDAS2; Rodell et al. 2004b),
Interim European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-Interim; Berrisford et al. 2009; Dee et al. 2011),
the land-only version of the Modern-Era Retrospective Analysis for Research and Applications (MERRA-Land; Reichle et al. 2011),
FLUXNET Multi-Tree Ensemble (FLUXNET MTE; Jung et al. 2009, 2010), and
Moderate Resolution Imaging Spectroradiometer (MODIS) Global Evapotranspiration Project (MOD16; Mu et al. 2007, 2011).
Finally, ERA-Interim is a reanalysis in which numerous observations are assimilated into a global atmospheric model in order to provide reliable estimates of a wide variety of hydrometeorological and hydrological variables (see section 3e). A comparison of some of these datasets is presented in, for example, Mueller et al. (2011).
To get an impression of the level of agreement between the different precipitation and evapotranspiration datasets, the top row in Fig. 6 shows scatterplots between the mean annual and seasonal precipitation from GPCC and the other datasets. For the bottom row, evapotranspiration from MOD16 served as reference.
d. Water storage changes
Water storage changes are computed from GRACE observations of the temporal gravity field changes. Eight years of GRACE (release 5) data have been used in this study, and they come from two data processing centers: GeoForschungsZentrum (GFZ; Dahle et al. 2013) and Center for Space Research, The University of Texas at Austin (CSR; Bettadpur 2012). The data are provided as spherical harmonic coefficients of the monthly gravity field snapshots. Wahr et al. (1998) provide the mathematics (section 2a) to invert the monthly snapshots to surface mass anomalies ΔM. According to them, those surface mass anomalies represent anomalies in water storage over the continental landmasses. For this reason, we will denote the water storage changes estimated from GRACE as dM/dt. The GRACE data as such are plagued by a lot of errors, which necessitates further processing prior to their usage (see appendix for details). The postprocessed ΔM are taken on a 0.5° × 0.5° grid and then aggregated over catchments to derive catchment-specific time series. The time series are then numerically differentiated using three-point central differences to arrive at the required flux quantity, surface mass changes dM/dt.
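The differentiation step can be illustrated with a short sketch of a three-point central difference, dM/dt at month i being (M[i+1] − M[i−1]) / (2Δt). The sketch assumes an evenly spaced, gap-free monthly series and leaves the two end points undefined; the handling of missing GRACE months in this study is not specified here, and the function name and sample values are hypothetical.

```python
import numpy as np

def storage_change(mass_anomaly, dt=1.0):
    """Three-point central difference of a catchment mass anomaly series.

    mass_anomaly: evenly spaced monthly values (e.g., mm of equivalent water height)
    dt: time step in months
    Returns dM/dt with NaN at the first and last time step.
    """
    m = np.asarray(mass_anomaly, float)
    dm_dt = np.full(m.shape, np.nan)
    dm_dt[1:-1] = (m[2:] - m[:-2]) / (2.0 * dt)
    return dm_dt

# Hypothetical catchment-aggregated mass anomalies (mm).
m = np.array([0.0, 10.0, 30.0, 20.0, 5.0])
dm_dt = storage_change(m)
print(dm_dt)  # [nan 15. 5. -12.5 nan]
```

Note that the central difference acts as a smoother over a two-month window, so the resulting flux is not strictly a calendar-month quantity.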
In this study, we use three different GRACE-based datasets in order to gain information on differences between products from different data providers or filtering methods. Therefore, the GFZ data have been filtered with both a regularization filter (GFZreg; see appendix for details) as well as the widely used decorrelation (or destriping) filter in combination with a Gaussian filter (GFZdstr). The data from the CSR have been filtered with the regularization filter only (CSRreg).
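With the precipitation, evapotranspiration, and GRACE datasets listed in this section, the ensemble of terrestrial water balance estimates can be enumerated as a Cartesian product. The sketch below assumes that every precipitation dataset is paired with every evapotranspiration and every GRACE dataset; under that assumption the product reproduces the 90 terrestrial combinations quoted in the introduction. The variable names are illustrative.

```python
from itertools import product

precipitation = ["GPCC", "GPCP", "CPC", "CRU", "DEL"]
evapotranspiration = ["GLEAM", "GLDAS2", "ERA-Interim", "MERRA-Land",
                      "FLUXNET MTE", "MOD16"]
storage_change_sets = ["GFZreg", "GFZdstr", "CSRreg"]

# One runoff estimate per (P, ET, dM/dt) triple.
combinations = list(product(precipitation, evapotranspiration, storage_change_sets))
print(len(combinations))  # 90
print(combinations[0])    # ('GPCC', 'GLEAM', 'GFZreg')
```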
e. Moisture flux divergence
Vertically integrated moisture flux divergences can be obtained from global atmospheric reanalysis models. These models are combinations of a numerical model and an analysis scheme where the numerical model predicts the state of the earth’s atmosphere, oceans, land surface, and upper soil layers. The analysis scheme, on the other hand, assimilates a large number of observations into the numerical model. By the very nature of assimilation, the analysis scheme forces the estimates to stay within reasonable physical and numerical bounds, which, on the flip side, disturbs the model’s internal mass budgets (Trenberth et al. 2011; Trenberth and Fasullo 2013).
The present-day reanalysis models assimilate nearly 7 × 10⁶–9 × 10⁶ observations every 6–12 h, which goes to show their data-intensive nature. These reanalyses usually cover a period of multiple decades and provide estimates of the most important hydrological, hydrometeorological, and atmospheric quantities globally. Kalnay et al. [1996; see also Kistler et al. (2001)] categorize the gridded variables from reanalyses into three distinct classes according to the influence of the assimilated observations on them. The type A variables are mainly influenced by observations (e.g., upper-air temperatures or horizontal winds). Type B variables are influenced by both the observations and the model (e.g., surface and 2-m temperatures), while type C variables are solely derived from the model (e.g., precipitation and runoff).
The vertically integrated moisture flux divergences are derived from horizontal wind fields—a type A variable—and specific humidity—a type B variable [see Eq. (4)]. Therefore, the moisture flux divergence can be assumed to be more reliable and stable than, for example, modeled evapotranspiration. Here, data from three widely used state-of-the-art reanalysis models have been used: ERA-Interim; MERRA from the Goddard Space Flight Center (GSFC), National Aeronautics and Space Administration (NASA; Rienecker et al. 2011); and Climate Forecast System Reanalysis (CFSR; Saha et al. 2010) from the National Centers for Environmental Prediction (NCEP), National Oceanic and Atmospheric Administration (NOAA).
f. Land surface models
In this study, we also used monthly runoff and evapotranspiration from five different land surface models driven by the GLDAS (Rodell et al. 2004b) and MERRA-Land (Reichle et al. 2011). In contrast to traditional hydrological or land surface models, these estimates are strongly constrained by observations (similar to atmospheric reanalyses) in order to provide more reliable estimates of hydrological variables.
MERRA-Land is an offline (without atmosphere) rerun of a revised version of MERRA’s land component, which is forced by the atmospheric reanalysis near-surface fields (air temperature, wind, and humidity), surface pressure, radiation, and observed precipitation based on the CPC precipitation product (see section 3b). MERRA-Land thus provides a supplemental and improved set of land surface hydrological fields (Reichle et al. 2011).
GLDAS can be seen as a data assimilation framework, where different land surface models are combined with satellite- and ground-based observations in order to improve and constrain the simulated hydrological processes. In particular, the models currently driven by the GLDAS are the Noah land surface model (LSM; Chen et al. 1996; Koren et al. 1999; Ek et al. 2003), the Common Land Model (CLM; Bonan 1998; Dai et al. 2003), Mosaic (Koster and Suarez 1996), and the Variable Infiltration Capacity (VIC) model (Liang et al. 1994). These four realizations are usually referred to as GLDAS1 and share the same forcing datasets:
meteorological data from the Global Data Assimilation System (GDAS; Derber et al. 1991),
downscaled precipitation from the CPC’s Merged Analysis of Precipitation (CMAP; Xie and Arkin 1997), and
satellite-derived downward radiation from the Air Force Weather Agency Agricultural Meteorology Modeling System (AFWA AGRMET; Kopp and Kiess 1996).
It should be noted that these models do not consider a routing scheme for estimating the runoff (R. H. Reichle 2013, personal communication; M. Rodell 2012, personal communication). Therefore, integrating fields of surface and subsurface runoff over a large area on monthly or shorter time scales might lead to inaccuracies in, for example, peak runoff as varying flow directions and velocities (due to the topography of the basin) are not considered. The impact of a missing routing scheme, however, diminishes when considering longer time scales (seasonal and annual). For this study, we have therefore used the modeled monthly runoff as provided by the data centers without applying any routing scheme in order to analyze the quality of these estimates on monthly and longer time scales.
4. Results
a. Performance metrics
We compare the time series of runoff estimates from GRACE-based approaches with the observed runoff. To get a holistic understanding, we also compare them with the estimates from hydrological models and the runoff–precipitation ratio method. Such an analysis should provide us with a balanced view of the utility of the GRACE-based estimates relative to already available global-scale alternatives. In the current setup, we use the following three metrics to carry out our evaluation: correlation coefficient, percentage bias (PBIAS; Gupta et al. 1999), and Nash–Sutcliffe efficiency (NSE) coefficient (Nash and Sutcliffe 1970).
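The three metrics can be sketched as follows, using their commonly quoted definitions. The sign convention for PBIAS varies in the literature; the sketch assumes that positive values indicate overestimation of the observed runoff, which may differ from the convention used in this study. The function names and sample values are hypothetical.

```python
import numpy as np

def _valid_pairs(sim, obs):
    """Keep only time steps where both series have data."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    ok = ~(np.isnan(sim) | np.isnan(obs))
    return sim[ok], obs[ok]

def correlation(sim, obs):
    """Pearson correlation coefficient between estimate and observation."""
    sim, obs = _valid_pairs(sim, obs)
    return np.corrcoef(sim, obs)[0, 1]

def pbias(sim, obs):
    """Percentage bias; assumed convention: positive means overestimation."""
    sim, obs = _valid_pairs(sim, obs)
    return 100.0 * np.sum(sim - obs) / np.sum(obs)

def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    sim, obs = _valid_pairs(sim, obs)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Hypothetical monthly runoff (mm/month): the estimate has a constant offset.
obs = np.array([10.0, 20.0, 30.0, 20.0])
sim = obs + 2.0

print(correlation(sim, obs), pbias(sim, obs), nse(sim, obs))
```

The example shows why NSE combines the other two metrics: a constant offset leaves the correlation at 1 but lowers the NSE, here to 0.92.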
b. Notes on the interpretation of the results
The sheer number of combinations, due to the choice of multiple datasets for each of the budget components, forms an ensemble that helps to provide a broad overview of the approach in hand, but also makes the assessment a little cumbersome. The contributing datasets are shown in Fig. 7 together with a color code. The performance of the different approaches and data combinations at the monthly time scale is summarized in Figs. 8, 11, and 12. The upper part of these figures collects the values of the performance metrics into a matrix, where each cell denotes the individual performance values for each catchment and each estimate separately. In other words, each row of the matrix provides an overview of the performance of different combinations for a particular catchment, and each column provides an overview for a particular combination. The lower part summarizes each column of the matrix by sorting the performance metric values within prescribed intervals. The color scales in these matrices have been chosen in such a manner that darker colors indicate better performance and lighter colors indicate poorer performance. The datasets of a single combination can be identified with the aid of the color code and the legend (Fig. 7).
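The summary in the lower part of these figures amounts to counting, for each column of the performance matrix, how many catchments fall within each interval. A minimal sketch is given below; the interval edges of 10%, 25%, and 50% for absolute PBIAS are assumed from the thresholds quoted in the text, and the matrix values are hypothetical.

```python
import numpy as np

def interval_counts(metric_matrix, edges):
    """Count, per data combination, the catchments falling in each interval.

    metric_matrix: array of shape (n_catchments, n_combinations)
    edges: increasing interior edges; interval k covers (edges[k-1], edges[k]]
    Returns an integer array of shape (len(edges) + 1, n_combinations).
    """
    m = np.asarray(metric_matrix, float)
    n_bins = len(edges) + 1
    counts = np.zeros((n_bins, m.shape[1]), dtype=int)
    for j in range(m.shape[1]):
        col = m[:, j]
        col = col[~np.isnan(col)]                   # skip catchments without data
        idx = np.digitize(col, edges, right=True)   # interval index per catchment
        counts[:, j] = np.bincount(idx, minlength=n_bins)
    return counts

# Hypothetical |PBIAS| values (%) for 4 catchments (rows) and 2 combinations (columns).
abs_pbias = np.array([[5.0, 60.0],
                      [12.0, 30.0],
                      [25.0, 8.0],
                      [80.0, 10.0]])
counts = interval_counts(abs_pbias, edges=[10.0, 25.0, 50.0])
print(counts)
```

Each column of `counts` corresponds to one stacked bar, with rows ordered from the best interval (at most 10%) to the worst (above 50%).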
c. Runoff from terrestrial water budget: Rhyd
The runoff estimates from the terrestrial water budget equation form the largest group of the four different runoff estimates compared here. This is expected because the runoff estimates from the terrestrial water balance equation require three inputs: precipitation, evapotranspiration, and water storage change. The performance metric PBIAS (Fig. 8) shows a random behavior with no data combination providing an outright good performance consistently over all the catchments. However, for some catchments (Amazon, Rio Tapajos, and Rio Jari) Rhyd provides consistently good PBIAS values (≤25%). Also, for most of the catchments (≈60) there is at least one data combination that provides a PBIAS value of nearly zero (see Fig. 9). In other words, there is at least one data combination that provides a reliable mean runoff value for most of the catchments. In all this, the MERRA-Land dataset performs consistently poorly except when it is combined with GPCP (see stacked bars in Fig. 8). A possible reason is that GPCP is the wettest of all the precipitation datasets and MERRA-Land is the wettest of all the evapotranspiration datasets (see Fig. 6), and therefore their biases cancel out. But when combined with the other datasets, the bias in MERRA-Land becomes visible. Similarly, GPCP does not perform very well when combined with MOD16 and FLUXNET MTE, while other precipitation datasets perform very well with them. The other interesting feature of the PBIAS matrix is the consistently poor performance in the Arctic catchments (Yenisei, Lena, Ob, Yukon, Thelon, Anabar, Kazan, Mackenzie, Kolyma, Churchill, Yana, Luleaelven, Olenek, Kuskokwim, and Peel), where there is a substantial underestimation of the runoff values (also see Fig. 10). However, we also observe that in these catchments the data combinations that involve GPCP are relatively well behaved. This better performance might be due to the gauge undercatch correction performed in the GPCP dataset, which yields higher precipitation values in the Arctic catchments.
In contrast to PBIAS, the correlation matrix (Fig. 11), though random to an extent, shows a more homogeneous picture. In nearly 50% of the catchments, the correlation values are consistently more than 0.5 irrespective of the data combination, which demonstrates that most of the data combinations capture the dynamics of the catchment runoff to a large extent. Further, the Arctic and tropical catchments provide strong correlations with many data combinations, having a correlation upward of 0.75. Some catchments, however, show strong dependency on the data combination (e.g., Mackenzie and Bolshoy Anyuy provide very good correlation when using data combinations involving GLDASNoah3.3). Similarly, the tropical catchments are strongly influenced by the choice of precipitation dataset, where the best performance is given by the use of GPCC and GPCP. As in the case of PBIAS, the MERRA-Land dataset produces the lowest correlation values. Lowest correlations are also found in catchments (e.g., St. Lawrence, Neva, Don, Vuoksi, San Joaquin, Orange, Murray, and Colorado) where the runoff does not have a clear seasonal cycle and is constant throughout the year. These catchments are affected by the biases and errors in the datasets, which produce short-term artifacts, thus reducing the correlation with observed runoff data (see supplemental material). An interesting aspect of Fig. 11 is the influence of the choice of GRACE dataset in the Arctic catchments (e.g., Yenisei, Ob, Pechora, Yana, Anabar, Stinkine, and Peel). The reason for the influence of the water storage dataset is that there is a clear time shift between precipitation and evapotranspiration in the Arctic catchments (see supplemental material), and therefore, the water storage changes have an important role to play in the runoff estimation. Nevertheless, it should also be noted that there is no single GRACE dataset that consistently performs well in all of the Arctic catchments. This is also corroborated by the histograms in Fig. 13.
NSE, as mentioned before, summarizes the PBIAS and correlation metrics, where a data combination should have less bias, and also must capture the dynamics of the catchment runoff. It is clearly evident that no data combination fulfills the twin requirements over all the catchments (see Fig. 12). The Amazon is the only catchment that provides a value of more than 0.75, and only for select combinations. For example, the combinations involving CPC do not even produce a positive NSE value. Some reasonable (positive) values are provided by Rio Tapajos, Rio Xingu, Pechora, Severnaya Dvina, Stinkine, Tombigbee, Mezen, and Anabar over a variety of data combinations. Interestingly, many of these catchments are quite small (see Table 2) and are considered indiscernible by the GRACE mission. In spite of this optimism, Fig. 12 clearly demonstrates that the current status of global hydrological datasets does not allow a reasonable estimation of runoff globally. The main weakness lies in the biases of the individual datasets, which limit the ability of the terrestrial water budget equation to estimate runoff.
To study any missing patterns in the matrix plots, we also study the impact of a particular dataset in all the data combinations in which it was involved. This was carried out via histogram analysis as shown in Fig. 13. The histogram clearly shows that for all the datasets involved in the Rhyd approach, nearly 40% of the dataset provides a correlation of more than 0.5, but only 30% of the datasets have a relative bias of less than 25%. This again confirms that the major problem of the Rhyd method lies in the biases in the individual datasets. Further, these histograms also indicate that CPC precipitation and MERRA-Land evapotranspiration do not provide good estimates as they consistently perform poorly in all three metrics. The GRACE datasets differ predominantly in the way they help capture the dynamics, but not otherwise. The biggest difference between the datasets of one variable is seen in evapotranspiration, where GLDASNoah3.3, MOD16, and FLUXNET MTE perform consistently well in all three metrics.
d. Runoff from atmospheric water budget: Ratm
The Ratm approach uses only two data sources, and therefore, we can expect a better performance when compared with Rhyd as there is one less source of potential errors. However, the PBIAS values from Ratm (Fig. 8) show random behavior similar to that of the estimates from Rhyd. Also, the number of catchments with PBIAS values ≤10% is similar to that for Rhyd (see stacked bars in Fig. 8). On the other hand, there are slightly more catchments with bias values between 10% and 25%, which is mainly due to a better performance over Arctic catchments (e.g., Yenisei, Lena, Ob, Kuskokwim, and Thelon). Here, while Rhyd tends to underestimate runoff (cf. blue and orange dots in Fig. 10), Ratm shows a better water budget closure with remaining relative imbalances of less than 25%. As many precipitation datasets suffer from an underestimation of precipitation in high-latitude regions (e.g., due to gauge undercatch errors; Lorenz and Kunstmann 2012), runoff estimates from moisture flux divergences might be more reliable than those derived from precipitation and evapotranspiration over these regions. Of the three moisture flux divergence datasets, ERA-Interim and MERRA show a similar overall performance while CFSR reaches PBIAS values ≤25% over fewer than 30 catchments, indicating large biases in the reanalysis model.
In terms of correlation (Fig. 11), Ratm shows slightly worse results than the best combinations from Rhyd. However, there are still about 45 catchments where moisture fluxes from ERA-Interim and MERRA reach correlations ≥ 0.5, indicating reasonable dynamics in the runoff estimates from these combinations (see stacked bars in Fig. 11). Even if the better results over Arctic basins are not visible in the correlation matrix, there are still some catchments where Ratm shows higher correlations than Rhyd (e.g., Volga, Columbia, and Connecticut).
Finally, the matrix showing the NSE values (Fig. 12) reveals that the combinations using moisture fluxes from ERA-Interim and MERRA perform slightly better than the Rhyd estimates. Positive NSE values can be observed over at least 15 catchments, while there are only 8 catchments with values ≥ 0.5 (see stacked bars in Fig. 12). The matrix shows that Ratm provides better results mainly over Arctic catchments (e.g., Yenisei and Yana). Here, all combinations (except for those using CFSR moisture fluxes) reach values greater than 0.5, which indicates good agreement with the runoff observations.
The histograms (bottom row in Fig. 13) confirm the similar performance of moisture fluxes from ERA-Interim and MERRA and the significant biases in the CFSR dataset. Over roughly 50% of all the Ratm combinations and catchments involving moisture fluxes from ERA-Interim, correlations ≥0.5 with respect to observed runoff can be observed. In terms of PBIAS, MERRA has slightly more data points with biases ≤50%. Finally, the NSE shows that there is no superior data source of moisture flux divergences as both ERA-Interim and MERRA reach values between 0 and 0.5 (0.5 and 0.75) over about 10% (8%) of the data points.
e. Runoff from land surface models: Rmod
In terms of PBIAS (Fig. 8), the Rmod estimates from GLDASCLM and GLDASNoah3.3 show lower biases than the other land surface models. However, compared with Rhyd and Ratm, the models seem to suffer from some large biases over many catchments. For large basins like the Amazon, this can be partly explained by the lack of a routing scheme, which prevents the correct estimation of peak runoff and thus creates large differences with respect to the observations. However, even over these large basins, there are single models that show PBIAS values ≤10% (e.g., GLDASNoah3.3 over Amazon and Volga or GLDASCLM over Congo). This indicates that the lack of a routing scheme cannot fully explain the large deviations from the observations. Interestingly, even though GLDASCLM, GLDASMosaic, GLDASNoah2.7, and GLDASVIC are driven by the same forcing data, the PBIAS shows some significant scattering, as there is not a single catchment where all these models perform consistently well. Overall, GLDASMosaic and GLDASVIC show the largest biases, with values smaller than 50% over fewer than 40 catchments (see stacked bars in Fig. 8).
In contrast to the PBIAS analysis, the correlations between Rmod and the observations draw a much more homogeneous picture (Fig. 11). There are many catchments with correlation values greater than 0.5 from all models. This holds true for large catchments (e.g., Mississippi, Volga, and Danube) but also for very small basins with low discharge (e.g., Rio Mearim, Gilbert, or Rio Prado). Thus, the ability of land surface models to provide reasonable runoff dynamics does not depend on a catchment’s area or mean discharge. This performance is even more impressive when compared with the results from Rhyd and Ratm, as most land surface models show correlations of at least 0.5 (0.75) over 60 (30) or more catchments. Moreover, Rmod performs quite well over regions where Rhyd and Ratm do not capture the runoff dynamics. These are catchments where runoff has a much smaller amplitude than the other water cycle variables (e.g., Mississippi, Volga, Rhine, Don, and Neman) and/or where precipitation has no distinct annual cycle (e.g., Rhine, Don, Daugava, Neman, Potomac, Altamaha, Weser, or Savannah). In these cases, the biases of the data sources of precipitation and evapotranspiration create short-term variations that result in low correlation values.
The NSE matrix (Fig. 12) shows a similar performance of Rmod and Rhyd. However, the two methods often succeed in totally different catchments. Especially over smaller catchments with low discharge, Rmod seems to provide reasonable runoff estimates (e.g., Potomac, the western Fitzroy, Victoria, or Tugela). Here, we see a very good agreement in the time series between the observed and estimated runoff. Furthermore, there are several catchments where most of the Rmod estimates provide NSE values greater than 0 (e.g., Tombigbee, Klamath, or Victoria), which provides evidence for the reliability of the modeled runoff in these regions.
f. Runoff from precipitation–runoff ratio: Rpr
The rudimentary approach based on precipitation–runoff ratio is by far the most bias-free approach, with more than 45 (75) catchments that provide a PBIAS value of less than 10% (25%; see stacked bars in Fig. 8). Surprisingly, these values hold well irrespective of the precipitation dataset being used. Nevertheless, this method also fails in some catchments (St. Lawrence, Nelson, Churchill, Vuoksi, Rio Jequitinhonha, Suwanee, Orange, Murray, De Gray, and Rio Prado), wherein there is not a clear relationship between precipitation and runoff. The correlation values (Fig. 11) also demonstrate that the Rpr method captures the dynamics accurately, leading to a very good performance. Here again the choice of precipitation dataset does not matter at all for performance of the method, as nearly 60 catchments provide a correlation of more than 0.75. The catchments that returned poor PBIAS values also perform poorly in terms of correlation. The NSE values (Fig. 12) paint the same picture as PBIAS and correlation values. However, the CPC dataset performs relatively worse than the other precipitation datasets. In spite of this, the Rpr method performs the best of all the methods analyzed here.
This method is successful in catchments where there is a strong annual cycle (cyclostationarity) in both the precipitation and runoff datasets, which directly translates into a strong relationship in the P–R ratio. This method fails when the runoff has a nonstationary behavior (e.g., see the time series of Nelson, St. Lawrence, or Vuoksi in the supplemental material), thus resulting in a lack of relationship in said ratio. Another interesting attribute of this method is that the ratio absorbs the biases in the precipitation datasets: the higher the biases, the smaller the ratio and vice versa. Although the P–R ratio deals only with the stable relationship between runoff and precipitation, it implicitly demands the stability of the other two components of the water cycle: evapotranspiration and water storage change. For example, it is precisely because of the instability in water storage change that the method fails in catchments like St. Lawrence and Nelson, where the precipitation and evapotranspiration have a very strong annual cycle. Therefore, it can be said that the influence of evapotranspiration and water storage change is implicit in the P–R ratio.
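The way the ratio absorbs a precipitation bias can be shown with a small synthetic example. The sketch below is an illustration under stated assumptions, not the formulation used in this study: the ratio is taken as mean monthly runoff divided by mean monthly precipitation for each calendar month, the function names are hypothetical, and the precipitation and runoff series are synthetic and perfectly cyclostationary.

```python
from math import cos, sin, pi


def monthly_ratio(precip, runoff, months):
    """Runoff-precipitation ratio per calendar month (assumed: ratio of monthly means)."""
    ratio = {}
    for m in range(1, 13):
        p = [x for x, mm in zip(precip, months) if mm == m]
        r = [x for x, mm in zip(runoff, months) if mm == m]
        ratio[m] = (sum(r) / len(r)) / (sum(p) / len(p))
    return ratio


def predict_runoff(precip, months, ratio):
    """Scale each monthly precipitation value by the ratio of its calendar month."""
    return [ratio[m] * p for p, m in zip(precip, months)]


# Four synthetic years with a strong annual cycle in precipitation and runoff.
months = [m for _ in range(4) for m in range(1, 13)]
precip = [80.0 + 40.0 * sin(2 * pi * (m - 1) / 12) for m in months]
runoff = [(0.3 + 0.1 * cos(2 * pi * (m - 1) / 12)) * p for p, m in zip(precip, months)]

# A precipitation dataset with a 20% wet bias yields smaller ratios ...
biased_precip = [1.2 * p for p in precip]
ratio = monthly_ratio(precip, runoff, months)
biased_ratio = monthly_ratio(biased_precip, runoff, months)

# ... but the predicted runoff is unchanged, because the ratio absorbs the bias.
prediction = predict_runoff(precip, months, ratio)
biased_prediction = predict_runoff(biased_precip, months, biased_ratio)
```

With a nonstationary runoff series, the monthly ratios would no longer be representative of individual years, and the prediction would degrade accordingly.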
A theoretical advantage of Rhyd and Ratm over models (without routing) and Rpr is that anthropogenic influences should not play a role when considering the level of agreement with respect to GRDC. As the predictions of Rpr are based on the mean monthly runoff–precipitation ratio, anthropogenic effects have to be treated separately, or, more precisely, those effects cannot be taken into account when no runoff observations are available. However, we could not detect a better performance of the budget approaches compared to Rpr and Rmod over such catchments.
g. Water budget closure
Each of the data sources of precipitation, evapotranspiration, water storage changes, and moisture flux divergences might be biased over certain regions and must thus be considered as a potential source of errors. In some cases, such biases can get canceled out when computing Rhyd or Ratm, leading to random errors that scatter around zero. This is the case for catchments like Rio Tapajos, St. Lawrence, Rio Jari, Tombigbee, or even small basins like San Joaquin (Fig. 10). In these cases, the scattering gives information on the uncertainty level of the water cycle variables. Over most catchments, however, either a positive or negative bias for most combinations can be observed. This indicates general shortcomings in the current quality of the different water cycle variables. Most obviously, this can be observed for the Arctic (tropical) catchments, where the majority of Rhyd combinations consistently tend to underestimate (overestimate) runoff (Fig. 10).
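The notion of an imbalance that scatters around zero across combinations can be sketched as follows. The example assumes the terrestrial water balance in the form Rhyd = P − ET − dS/dt and defines the relative imbalance as the difference between mean estimated and mean observed runoff, relative to the mean observed runoff. The dataset names and all numbers are hypothetical.

```python
from itertools import product


def mean(values):
    return sum(values) / len(values)


def rhyd_ensemble(precip, evap, storage_change):
    """Runoff for every combination of datasets, assuming R = P - ET - dS/dt."""
    ensemble = {}
    for p, e, s in product(precip, evap, storage_change):
        ensemble[(p, e, s)] = [
            pi - ei - si
            for pi, ei, si in zip(precip[p], evap[e], storage_change[s])
        ]
    return ensemble


def relative_imbalance(estimate, observed):
    """Remaining imbalance in percent of the mean observed runoff."""
    return 100.0 * (mean(estimate) - mean(observed)) / mean(observed)


# Hypothetical catchment: the "true" fluxes close the budget exactly.
observed = [30.0, 50.0, 40.0, 20.0]
true_et = [40.0, 40.0, 40.0, 40.0]
true_ds = [5.0, -5.0, 5.0, -5.0]
true_p = [r + e + s for r, e, s in zip(observed, true_et, true_ds)]

# Datasets with opposite biases (names are placeholders, not real products).
precip = {"P_low": [0.9 * p for p in true_p], "P_high": [1.1 * p for p in true_p]}
evap = {"ET_low": [e - 3.0 for e in true_et], "ET_high": [e + 3.0 for e in true_et]}
storage_change = {"dS": true_ds}

ensemble = rhyd_ensemble(precip, evap, storage_change)
imbalances = {k: relative_imbalance(v, observed) for k, v in ensemble.items()}
for combination, value in sorted(imbalances.items()):
    print(combination, round(value, 1))
```

Here the individual members are biased in both directions while their mean imbalance vanishes; when most members share the same sign instead, the scatter no longer reflects random uncertainty but a systematic shortcoming of the input data.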
These heavily biased runoff estimates can be assumed to originate mainly from uncertainties in the precipitation and evapotranspiration datasets. According to Fig. 15, there is no clear contribution of water storage changes from GRACE to the catchments showing large remaining imbalances. For example, for Tocantins, where Rhyd is significantly overestimated, the GRACE datasets show an increase of water storage of less than 10 mm yr−1. The same holds true for several Arctic catchments (e.g., Yenisei or Lena). In general, the total annual water storage as depicted by GRACE is less than ±50 mm yr−1 for all catchments, and for most of the catchments, the total annual water storage values are within ±25 mm yr−1. Therefore, for the majority of the analyzed catchments, GRACE contributes little to the imbalances, and the over- and underestimation stems largely from the biases in the precipitation and evapotranspiration datasets.
5. Summary and conclusions
In this contribution, we have estimated runoff from water balance equations mainly using GRACE water storage changes. A wide variety of hydrological and hydrometeorological datasets have been used to arrive at the runoff estimates. The estimates were then compared to observed runoff. This was only possible in about 96 catchments because of the limited time span of the GRACE dataset and the limited data availability and large data gaps in the GRDC database. To quantify the estimates holistically, we included runoff estimates from sophisticated hydrological land surface models and an empirical model involving the ratio between precipitation and runoff in the analysis. The evaluation of the estimates was carried out using three performance metrics, namely, PBIAS, correlation coefficient, and NSE.
Based on the assessment, the following conclusions are drawn.
Overall, the estimates from the precipitation–runoff ratio, though being a simple method, provide by far the best results when compared to the budget-based methods and the land surface models. This also implies that for most catchments, the impact of evapotranspiration and water storage changes on runoff follows a stable annual cycle, which remains constant from year to year. However, the method can only succeed when the monthly relationship between runoff and precipitation remains constant over the years. For catchments with more or less random time series of precipitation or runoff, no stable precipitation–runoff ratio can be derived. For such catchments, the members of the Rpr ensemble showed consistently poor results.
In the water-balance-based estimates, there is no combination of datasets that consistently stands out as the best, and there is also no combination that behaves consistently over different catchments. In particular, the PBIAS analysis shows highly variable results from the Rhyd and Ratm estimates. The results further indicate that Rhyd tends to underestimate (overestimate) runoff over Arctic (tropical) catchments. For Arctic catchments, however, Ratm shows better PBIAS values. This implies that the quality of the different data sources of precipitation, evapotranspiration, water storage changes, and moisture fluxes differs significantly from catchment to catchment. The analysis further shows that the PBIAS is very sensitive to the choice of a member of the Rhyd ensemble.
This indicates that one cannot extrapolate the performance of a combination or method to other basins without thorough validation. To a certain extent, this also holds true for the estimates from the precipitation–runoff ratio. Performance of combinations thus requires scrutiny over individual catchments. This also has strong implications for the regionalization of these estimates to adjacent catchments.
The estimates from Rhyd and Ratm perform well in catchments where the runoff exhibits a strong annual cycle combined with a large volume of runoff. Over such catchments, most combinations reach correlations of at least 0.75. Basins where the magnitudes of the variations in the time series of runoff are far below those of the other water cycle variables suffer from biases in the different data sources of precipitation, evapotranspiration, water storage changes, and moisture flux divergences. These biases create short-term variations, which prevent a reasonable estimation of runoff.
Further, the water-balance-based methods are reliant on GRACE datasets, which are reported to be of inferior spatial resolution compared to the other datasets used here. In spite of this shortcoming, the Rhyd estimates still provide good performance in small catchments, but only those with a strong annual cycle. Moreover, the different GRACE datasets show significantly different results in catchments with a time shift between peak runoff, precipitation, and evapotranspiration. In these cases, GRACE has to take care of the water storage and thus has a major influence on the performance of the Rhyd estimates.
For most catchments, there is still a significant relative water budget imbalance with respect to the mean runoff. The majority of combinations from Rhyd suffer from a remaining imbalance of more than 25% over many basins. However, over most catchments, there is at least a single combination that leads to a relative imbalance close to 0%, indicating a reasonable water budget closure. Moreover, over several catchments, the absolute imbalance scatters around zero. Here, the water budget residual can be assumed to be randomly distributed between the different combinations of Rhyd. This emphasizes the use of ensemble-based methods for analyzing the water cycle over such catchments.
The models performed poorly compared with the other approaches. One reason for this might be the lack of a routing scheme. However, even on annual time scales, significant biases could be observed when compared with the observations. It could also be detected that over several catchments, modeled runoff might show a significant time lag. However, this is again not consistent throughout all the study regions, which makes it necessary to evaluate such data sources both in the individual catchments and globally.
Acknowledgments
We would like to thank the Helmholtz Alliance for Remote Sensing and Earth System Dynamics for funding this study. Further, we gratefully acknowledge the support of projects SN13/1, BA1150/11, KU2090/1 by the Deutsche Forschungsgemeinschaft (DFG) in the framework of the priority program SPP1257 Mass Transport and Mass Distribution in the System Earth. We thank the following data providers: Global Runoff Data Centre (GRDC); Global Precipitation Climatology Project (GPCP); Global Precipitation Climatology Centre (GPCC); University of Delaware (UDEL); Numerical Terradynamic Simulation Group at the University of Montana (NTSG UMT); Max Planck Institute for Biogeochemistry (MPI BGC); European Centre for Medium-Range Weather Forecasts (ECMWF); The University of Texas at Austin, Center for Space Research (CSR); NASA’s Goddard Earth Science Data and Information Services Center (GES DISC); NOAA’s Earth System Research Laboratory, Physical Sciences Division (ESRL PSD); and NCAR’s Computational and Information Systems Laboratory Research Data Archive (CISL RDA). Further thanks are due to Dr. Diego Miralles (University of Bristol) for kindly providing the GLEAM evapotranspiration data. All graphics in the document were produced with the Generic Mapping Tools (GMT; Wessel and Smith 1991).
APPENDIX
Treating the Errors in GRACE Mass Anomalies
The major sources of errors in the GRACE data are aliasing of residual tidal signal (Seo et al. 2008), poor estimates of C20 coefficients (Chen et al. 2005), and high-frequency noise in the spherical harmonic coefficients due to orbit geometry (Kusche et al. 2009). These errors are treated at the spherical harmonic coefficient level.
a. Tidal residuals
The tidal residuals alias into frequencies that depend on the orbit geometry of GRACE, the sampling rate of GRACE, and the natural frequency of the corresponding tides, which enables them to be modeled and removed from the signal to a large extent. This is accomplished by performing a least squares spectrum analysis on the residuals of GRACE temporal geoid anomalies, which are obtained after removing the mean annual behavior. The mean annual behavior is computed by taking the average of the 12 calendar months as is done for the precipitation–runoff ratio in Eq. (8) [see Tourian et al. (2011) for the efficacy of this method]. The spectral analysis is performed only for select tidal frequencies: M1, S1, S2, P1, Q1, O1, and N2. The estimates of the aliased frequencies are subsequently subtracted from the residuals of the spherical harmonic coefficients, which are then added back to the mean annual behavior to obtain the corrected monthly anomalies.
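The sequence of steps described above (remove the mean annual behavior, fit the aliased frequencies to the residuals by least squares, subtract the fit, and restore the mean annual behavior) can be sketched for a single coefficient time series as follows. This is a minimal illustration under stated assumptions: each alias period is fitted separately through its 2 × 2 normal equations, the function names are hypothetical, and the 161-day period and all amplitudes are illustrative values for the synthetic test signal, not values taken from this study.

```python
from math import cos, sin, pi


def mean_annual(series, months):
    """Mean annual behavior: average of each calendar month, expanded to the full series."""
    clim = {}
    for m in range(1, 13):
        vals = [x for x, mm in zip(series, months) if mm == m]
        clim[m] = sum(vals) / len(vals)
    return [clim[m] for m in months]


def fit_sinusoid(t, y, period):
    """Least squares fit of a*cos + b*sin at one fixed period (2x2 normal equations)."""
    w = 2 * pi / period
    c = [cos(w * ti) for ti in t]
    s = [sin(w * ti) for ti in t]
    cc = sum(ci * ci for ci in c)
    ss = sum(si * si for si in s)
    cs = sum(ci * si for ci, si in zip(c, s))
    cy = sum(ci * yi for ci, yi in zip(c, y))
    sy = sum(si * yi for si, yi in zip(s, y))
    det = cc * ss - cs * cs
    a = (cy * ss - cs * sy) / det
    b = (cc * sy - cs * cy) / det
    return [a * ci + b * si for ci, si in zip(c, s)]


def remove_alias(series, t, months, alias_periods):
    """Subtract fitted alias signals from the residuals, then restore the annual behavior."""
    annual = mean_annual(series, months)
    residual = [x - a for x, a in zip(series, annual)]
    for period in alias_periods:
        fitted = fit_sinusoid(t, residual, period)
        residual = [r - f for r, f in zip(residual, fitted)]
    return [a + r for a, r in zip(annual, residual)]


# Synthetic coefficient series: annual signal plus one aliased tidal residual.
n = 96  # eight years of monthly solutions
t = [15.0 + 30.4375 * k for k in range(n)]  # epochs in days
months = [k % 12 + 1 for k in range(n)]
annual_signal = [2.0 * cos(2 * pi * ti / 365.25) for ti in t]
alias_signal = [0.5 * sin(2 * pi * ti / 161.0) for ti in t]
series = [a + b for a, b in zip(annual_signal, alias_signal)]

corrected = remove_alias(series, t, months, [161.0])
```

A small part of the aliased signal leaks into the monthly means and therefore survives the correction, which is why the removal is effective "to a large extent" rather than exact.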
b. Erroneous C20 coefficients
The erroneous C20 coefficients are replaced with those estimated from satellite laser ranging (SLR) in the case of the CSR monthly solutions, as suggested by Cheng and Ries (2014). However, the C20 coefficients in the GFZ monthly solutions are retained, as they are well behaved. This is due to the use of the EIGEN-6C gravity field model as a background model in the processing scheme (Dahle et al. 2013), which includes data from GRACE, Gravity Field and Steady-State Ocean Circulation Explorer (GOCE), and Laser Geodynamics Satellite (LAGEOS; SLR) and also data from airborne and terrestrial gravimetry.
c. High-frequency noise
The high-frequency noise in the spherical harmonic coefficients manifests itself as stripe-like patterns in the spatial domain. This is dealt with by filtering, and here we apply two methods of filtering: a decorrelation filter in combination with a Gaussian filter with a filter radius of 350 km (Swenson and Wahr 2006) and a regularization filter as proposed by Kusche (2007), but with certain modifications. The modifications are that while Kusche (2007) computes the signal covariance from a combination of different models, we compute the signal covariance from the GRACE data itself, as described by Sasgen et al. (2006); Kusche (2007) also uses stationary signal and noise covariances, whereas we use cyclostationary signal covariances and nonstationary noise covariances. For the GFZ dataset, standard deviations of the spherical harmonic coefficients are provided, which were used for the simulation of noise covariances. The standard deviations for the CSR dataset are not provided, and therefore, we have used standard deviations from the release 4 dataset for the noise covariance simulation.
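To give an impression of how strongly a Gaussian filter of this radius damps the higher degrees, the sketch below computes degree-dependent smoothing weights with a commonly used recursion for the isotropic Gaussian averaging kernel. It is an illustration only: the normalization (weight 1 at degree 0), the mean Earth radius of 6371 km, and the maximum degree of 60 are assumptions, and neither the decorrelation step nor the regularization filter is represented.

```python
from math import cos, exp, log

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def gaussian_weights(radius_km, max_degree):
    """Degree-dependent weights of an isotropic Gaussian filter (normalized to 1 at degree 0).

    The forward recursion is known to become numerically unstable at high
    degrees, so it should only be used for moderate max_degree.
    """
    b = log(2.0) / (1.0 - cos(radius_km / EARTH_RADIUS_KM))
    weights = [1.0]
    if max_degree >= 1:
        weights.append((1.0 + exp(-2.0 * b)) / (1.0 - exp(-2.0 * b)) - 1.0 / b)
    for degree in range(1, max_degree):
        weights.append(-(2 * degree + 1) / b * weights[degree] + weights[degree - 1])
    return weights


weights = gaussian_weights(350.0, 60)
for degree in (0, 10, 20, 40, 60):
    print(degree, round(weights[degree], 4))
```

A smoothed spherical harmonic coefficient of degree l would then be obtained by multiplying the original coefficient by `weights[l]`, so that the noisy high-degree coefficients contribute little to the filtered field.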
REFERENCES
Adler, R. F., and Coauthors, 2003: The version-2 Global Precipitation Climatology Project (GPCP) monthly precipitation analysis (1979–present). J. Hydrometeor., 4, 1147–1167, doi:10.1175/1525-7541(2003)004<1147:TVGPCP>2.0.CO;2.
Alsdorf, D. E., E. Rodríguez, and D. P. Lettenmaier, 2007: Measuring surface water from space. Rev. Geophys., 45, RG2002, doi:10.1029/2006RG000197.
Berrisford, P., D. Dee, K. Fielding, M. Fuentes, P. Kallberg, S. Kobayashi, and S. Uppala, 2009: The ERA-Interim archive version 1.0. ERA Rep. Series, Rep. 1, ECMWF, 16 pp. [Available online at http://old.ecmwf.int/publications/library/ecpublications/_pdf/era/era_report_series/RS_1.pdf.]
Bettadpur, S., 2012: GRACE UTCSR level-2 processing standards document (for level-2 product release 0005). Rev. 4.0, Doc. GRACE 327-742 (CSR-GR-12-xx), Center for Space Research, The University of Texas at Austin, 17 pp. [Available online at ftp://podaac.jpl.nasa.gov/allData/grace/docs/L2-CSR0005_ProcStd_v4.0.pdf.]
Bonan, G. B., 1998: The land surface climatology of the NCAR Land Surface Model coupled to the NCEP Community Climate Model. J. Climate, 11, 1307–1326, doi:10.1175/1520-0442(1998)011<1307:TLSCOT>2.0.CO;2.
Chen, F., and Coauthors, 1996: Modeling of land-surface evaporation by four schemes and comparison with FIFE observations. J. Geophys. Res., 101, 7251–7268, doi:10.1029/95JD02165.
Chen, J. L., M. Rodell, C. R. Wilson, and J. S. Famiglietti, 2005: Low degree spherical harmonic influences on Gravity Recovery and Climate Experiment (GRACE) water storage estimates. Geophys. Res. Lett., 32, L14405, doi:10.1029/2005GL022964.
Chen, M., W. Shi, P. Xie, V. B. S. Silva, V. E. Kousky, R. W. Higgins, and J. E. Janowiak, 2008: Assessing objective techniques for gauge-based analyses of global daily precipitation. J. Geophys. Res., 113, D04110, doi:10.1029/2007JD009132.
Cheng, M., and J. Ries, 2014: Monthly estimates of C20 from 5 SLR satellites based on GRACE RL05 models. GRACE Tech. Note 07, Center for Space Research, The University of Texas at Austin, 1 pp. [Available online at ftp://podaac.jpl.nasa.gov/allData/grace/docs/TN-07_C20_SLR.txt.]
Crowley, J. W., J. X. Mitrovica, R. C. Bailey, M. E. Tamisiea, and J. L. Davis, 2006: Land water storage within the Congo basin inferred from GRACE satellite gravity data. Geophys. Res. Lett., 33, L19402, doi:10.1029/2006GL027070.
Dahle, C., F. Flechtner, C. Gruber, D. König, R. König, G. Michalak, and K.-H. Neumayer, 2013: GFZ GRACE level-2 processing standards document for level-2 product release 0005. Scientific Tech. Rep. STR12/02, GFZ German Research Centre for Geosciences, 26 pp., doi:10.2312/GFZ.b103-1202-25.
Dai, Y., and Coauthors, 2003: The Common Land Model. Bull. Amer. Meteor. Soc., 84, 1013–1023, doi:10.1175/BAMS-84-8-1013.
Dee, D. P., and Coauthors, 2011: The ERA-Interim reanalysis: Configuration and performance of the data assimilation system. Quart. J. Roy. Meteor. Soc., 137, 553–597, doi:10.1002/qj.828.
Derber, J. C., D. F. Parrish, and S. J. Lord, 1991: The new global operational analysis system at the National Meteorological Center. Wea. Forecasting, 6, 538–547, doi:10.1175/1520-0434(1991)006<0538:TNGOAS>2.0.CO;2.
Ek, M. B., K. E. Mitchell, Y. Lin, E. Rogers, P. Grunmann, V. Koren, G. Gayno, and J. D. Tarpley, 2003: Implementation of NOAH land surface model advances in the National Centers for Environmental Prediction operational mesoscale Eta model. J. Geophys. Res., 108, 8851, doi:10.1029/2002JD003296.
Entekhabi, D., and Coauthors, 2010: The soil moisture active passive (SMAP) mission. Proc. IEEE, 98, 704–716, doi:10.1109/JPROC.2010.2043918.
Falloon, P., R. Betts, A. Wiltshire, R. Dankers, C. Mathison, D. McNeall, P. Bates, and M. Trigg, 2011: Validation of river flows in HadGEM1 and HadCM3 with the TRIP river flow model. J. Hydrometeor., 12, 1157–1180, doi:10.1175/2011JHM1388.1.
Fekete, B. M., and C. J. Vörösmarty, 2007: The current status of global river discharge monitoring and potential new technologies complementing traditional discharge measurements. IAHS Publ., 309, 129–136. [Available online at http://iahs.info/uploads/dms/309015.pdf.]
Fekete, B. M., C. J. Vörösmarty, and W. Grabs, 2002: High-resolution fields of global runoff combining observed river discharge and simulated water balances. Global Biogeochem. Cycles, 16, 1042, doi:10.1029/1999GB001254.
Fekete, B. M., U. Looser, A. Pietroniro, and R. D. Robarts, 2012: Rationale for monitoring discharge on the ground. J. Hydrometeor., 13, 1977–1986, doi:10.1175/JHM-D-11-0126.1.
Fersch, B., H. Kunstmann, A. Bárdossy, B. Devaraju, and N. Sneeuw, 2012: Continental-scale basin water storage variation from global and dynamically downscaled atmospheric water budgets in comparison with GRACE-derived observations. J. Hydrometeor., 13, 1589–1603, doi:10.1175/JHM-D-11-0143.1.
GRDC, 2013: Tenth meeting of the GRDC Steering Committee, 15–17 June 2011, Koblenz, Germany. GRDC Rep. Series, Rep. 42, Global Runoff Data Centre, 31 pp., doi:10.5675/GRDC_Report_42.
Gupta, H., S. Sorooshian, and P. O. Yapo, 1999: Status of automatic calibration for hydrologic models: Comparison with multilevel expert calibration. J. Hydrol. Eng., 4, 135–143, doi:10.1061/(ASCE)1084-0699(1999)4:2(135).
Harris, I., P. D. Jones, T. J. Osborn, and D. H. Lister, 2013: Updated high-resolution grids of monthly climatic observations—The CRU TS3.10 dataset. Int. J. Climatol., 34, 623–642, doi:10.1002/joc.3711.
Hrachowitz, M., and Coauthors, 2013: A decade of Predictions in Ungauged Basins (PUB)—A review. Hydrol. Sci. J., 58, 1198–1255, doi:10.1080/02626667.2013.803183.
Huffman, G. J., R. F. Adler, D. T. Bolvin, and G. Gu, 2009: Improving the global precipitation record: GPCP version 2.1. Geophys. Res. Lett., 36, L17808, doi:10.1029/2009GL040000.
Jones, P., 1999: First- and second-order conservative remapping schemes for grids in spherical coordinates. Mon. Wea. Rev., 127, 2204–2210, doi:10.1175/1520-0493(1999)127<2204:FASOCR>2.0.CO;2.
Jung, M., M. Reichstein, and A. Bondeau, 2009: Towards global empirical upscaling of FLUXNET eddy covariance observations: Validation of a model tree ensemble approach using a biosphere model. Biogeosciences, 6, 2001–2013, doi:10.5194/bg-6-2001-2009.
Jung, M., and Coauthors, 2010: Recent decline in the global land evapotranspiration trend due to limited moisture supply. Nature, 467, 951–954, doi:10.1038/nature09396.
Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. Bull. Amer. Meteor. Soc., 77, 437–471, doi:10.1175/1520-0477(1996)077<0437:TNYRP>2.0.CO;2.
Kerr, Y. H., and Coauthors, 2010: The SMOS Mission: New tool for monitoring key elements of the global water cycle. Proc. IEEE, 98, 666–687, doi:10.1109/JPROC.2010.2043032.
Kistler, R., and Coauthors, 2001: The NCEP–NCAR 50-Year Reanalysis: Monthly means CD-ROM and documentation. Bull. Amer. Meteor. Soc., 82, 247–268, doi:10.1175/1520-0477(2001)082<0247:TNNYRM>2.3.CO;2.
Kopp, T. J., and R. B. Kiess, 1996: The Air Force Global Weather Central cloud analysis model. Preprints, 15th Conf. on Weather Analysis and Forecasting, Norfolk, VA, Amer. Meteor. Soc., 220–222.
Koren, V., J. Schaake, K. Mitchell, Q. Y. Duan, F. Chen, and J. M. Baker, 1999: A parameterization of snowpack and frozen ground intended for NCEP weather and climate models. J. Geophys. Res., 104, 19 569–19 585, doi:10.1029/1999JD900232.
Koster, R. D., and M. J. Suarez, 1996: Energy and water balance calculations in the Mosaic LSM. NASA Tech. Memo. 104606, Tech. Rep. Series on Global Modeling and Data Assimilation, Vol. 9, 60 pp. [Available online at http://gmao.gsfc.nasa.gov/pubs/docs/Koster130.pdf.]
Kummerow, C., and Coauthors, 2000: The status of the Tropical Rainfall Measuring Mission (TRMM) after two years in orbit. J. Appl. Meteor., 39, 1965–1982, doi:10.1175/1520-0450(2001)040<1965:TSOTTR>2.0.CO;2.
Kusche, J., 2007: Approximate decorrelation and non-isotropic smoothing of time-variable GRACE-type gravity field models. J. Geod., 81, 733–749, doi:10.1007/s00190-007-0143-3.
Kusche, J., R. Schmidt, S. Petrovic, and R. Rietbroek, 2009: Decorrelated GRACE time-variable gravity solutions by GFZ, and their validation using a hydrological model. J. Geod., 83, 903–913, doi:10.1007/s00190-009-0308-3.
Landerer, F. W., J. O. Dickey, and A. Güntner, 2010: Terrestrial water budget of the Eurasian pan-Arctic from GRACE satellite measurements during 2003–2009. J. Geophys. Res., 115, D23115, doi:10.1029/2010JD014584.
Liang, X., D. P. Lettenmaier, E. F. Wood, and S. J. Burges, 1994: A simple hydrologically based model of land surface water and energy fluxes for general circulation models. J. Geophys. Res., 99, 14 415–14 428, doi:10.1029/94JD00483.
Longuevergne, L., B. R. Scanlon, and C. R. Wilson, 2010: GRACE hydrological estimates for small basins: Evaluating processing approaches on the High Plains Aquifer, USA. Water Resour. Res., 46, W11517, doi:10.1029/2009WR008564.
Lorenz, C., and H. Kunstmann, 2012: The hydrological cycle in three state-of-the-art reanalyses: Intercomparison and performance analysis. J. Hydrometeor., 13, 1397–1420, doi:10.1175/JHM-D-11-088.1.
Matsuura, K., and C. J. Willmott, 2012: Terrestrial precipitation: 1900–2010 gridded monthly time series (version 3.02). Center for Climatic Research, University of Delaware, Newark, DE. [Available online at http://climate.geog.udel.edu/~climate/html_pages/download.html#P2011rev.]
Miralles, D. G., R. A. M. de Jeu, J. H. Gash, T. R. H. Holmes, and A. J. Dolman, 2011: An application of GLEAM to estimating global evaporation. Hydrol. Earth Syst. Sci. Discuss., 8, 1–27, doi:10.5194/hessd-8-1-2011.
Mu, Q., F. A. Heinsch, M. Zhao, and S. W. Running, 2007: Development of a global evapotranspiration algorithm based on MODIS and global meteorology data. Remote Sens. Environ., 111, 519–536, doi:10.1016/j.rse.2007.04.015.
Mu, Q., M. Zhao, and S. W. Running, 2011: Improvements to a MODIS global terrestrial evapotranspiration algorithm. Remote Sens. Environ., 115, 1781–1800, doi:10.1016/j.rse.2011.02.019.
Mueller, B., and Coauthors, 2011: Evaluation of global observations-based evapotranspiration datasets and IPCC AR4 simulations. Geophys. Res. Lett., 38, L06492, doi:10.1029/2010GL046230.
Nash, J. E., and J. V. Sutcliffe, 1970: River flow forecasting through conceptual models: Part 1. A discussion of principles. J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6.
Parkinson, C., 2003: Aqua: An earth-observing satellite mission to examine water and other climate variables. IEEE Trans. Geosci. Remote Sens., 41, 173–183, doi:10.1109/TGRS.2002.808319.
Peixoto, J. P., and A. H. Oort, 1992: Physics of Climate. American Institute of Physics, 520 pp.
Reichle, R. H., R. D. Koster, G. J. M. De Lannoy, B. A. Forman, Q. Liu, S. P. P. Mahanama, and A. Touré, 2011: Assessment and enhancement of MERRA land surface hydrology estimates. J. Climate, 24, 6322–6338, doi:10.1175/JCLI-D-10-05033.1.
Riegger, J., and M. J. Tourian, 2014: Characterization of runoff–storage relationships by satellite gravimetry and remote sensing. Water Resour. Res., 50, 3444–3466, doi:10.1002/2013WR013847.
Riegger, J., M. J. Tourian, B. Devaraju, and N. Sneeuw, 2012: Analysis of GRACE uncertainties by hydrological and hydro-meteorological observations. J. Geodyn., 59–60, 16–27, doi:10.1016/j.jog.2012.02.001.
Rienecker, M. M., and Coauthors, 2011: MERRA: NASA’s Modern-Era Retrospective Analysis for Research and Applications. J. Climate, 24, 3624–3648, doi:10.1175/JCLI-D-11-00015.1.
Roads, J. O., S. C. Chen, A. K. Guetter, and K. P. Georgakakos, 1994: Large-scale aspects of the United States hydrologic cycle. Bull. Amer. Meteor. Soc., 75, 1589–1610, doi:10.1175/1520-0477(1994)075<1589:LSAOTU>2.0.CO;2.
Rodell, M., J. S. Famiglietti, J. Chen, S. I. Seneviratne, P. Viterbo, S. Holl, and C. R. Wilson, 2004a: Basin scale estimates of evapotranspiration using GRACE and other observations. Geophys. Res. Lett., 31, L20504, doi:10.1029/2004GL020873.
Rodell, M., and Coauthors, 2004b: The Global Land Data Assimilation System. Bull. Amer. Meteor. Soc., 85, 381–394, doi:10.1175/BAMS-85-3-381.
Rodell, M., I. Velicogna, and J. Famiglietti, 2009: Satellite-based estimates of ground-water depletion in India. Nature, 460, 999–1002, doi:10.1038/nature08238.
Saha, S., and Coauthors, 2010: The NCEP Climate Forecast System Reanalysis. Bull. Amer. Meteor. Soc., 91, 1015–1057, doi:10.1175/2010BAMS3001.1.
Salvucci, G. D., and Gentine P., 2013: Emergent relation between surface vapor conductance and relative humidity profiles yields evaporation rates from weather data. Proc. Natl. Acad. Sci. USA, 110, 6287–6291, doi:10.1073/pnas.1215844110.
Sasgen, I., Martinec Z., and Fleming K., 2006: Wiener optimal filtering of GRACE data. Stud. Geophys. Geod., 50, 499–508, doi:10.1007/s11200-006-0031-y.
Schlosser, C. A., and Houser P. R., 2007: Assessing a satellite-era perspective of the global water cycle. J. Climate, 20, 1316–1338, doi:10.1175/JCLI4057.1.
Schmidt, R., Flechtner F., Meyer U., Neumayer K.-H., Dahle C., König R., and Kusche J., 2008: Hydrological signals observed by the GRACE satellites. Surv. Geophys., 29, 319–334, doi:10.1007/s10712-008-9033-3.
Schneider, U., Becker A., Finger P., Meyer-Christoffer A., Ziese M., and Rudolf B., 2014: GPCC’s new land surface precipitation climatology based on quality-controlled in situ data and its role in quantifying the global water cycle. Theor. Appl. Climatol., 115, 15–40, doi:10.1007/s00704-013-0860-x.
Schrama, E. J. O., and Wouters B., 2011: Revisiting Greenland Ice Sheet mass loss observed by GRACE. J. Geophys. Res., 116, B02407, doi:10.1029/2009JB006847.
Seitz, F., Schmidt M., and Shum C. K., 2008: Signals of extreme weather conditions in central Europe in GRACE 4-D hydrological mass variations. Earth Planet. Sci. Lett., 268, 165–170, doi:10.1016/j.epsl.2008.01.001.
Seo, K. W., Wilson C. R., Han S. C., and Waliser D. E., 2008: Gravity Recovery and Climate Experiment (GRACE) alias error from ocean tides. J. Geophys. Res., 113, B03405, doi:10.1029/2006JB004747.
Sheffield, J., Goteti G., and Wood E. F., 2006: Development of a 50-year high-resolution global dataset of meteorological forcings for land surface modeling. J. Climate, 19, 3088–3111, doi:10.1175/JCLI3790.1.
Sheffield, J., Ferguson C. R., Troy T. J., Wood E. F., and McCabe M. F., 2009: Closing the terrestrial water budget from satellite remote sensing. Geophys. Res. Lett., 36, L07403, doi:10.1029/2009GL037338.
Shiklomanov, A. I., Lammers R. B., and Vörösmarty C. J., 2002: Widespread decline in hydrological monitoring threatens pan-Arctic research. Eos, Trans. Amer. Geophys. Union, 83, 13–17, doi:10.1029/2002EO000007.
Swenson, S., and Wahr J., 2006: Estimating large-scale precipitation minus evapotranspiration from GRACE satellite gravity measurements. J. Hydrometeor., 7, 252–270, doi:10.1175/JHM478.1.
Syed, T. H., Famiglietti J. S., Chen J., Rodell M., Seneviratne S. I., Viterbo P., and Wilson C. R., 2005: Total basin discharge for the Amazon and Mississippi River basins from GRACE and a land–atmosphere water balance. Geophys. Res. Lett., 32, L24404, doi:10.1029/2005GL024851.
Syed, T. H., Famiglietti J. S., Zlotnicki V., and Rodell M., 2007: Contemporary estimates of pan-Arctic freshwater discharge from GRACE and reanalysis. Geophys. Res. Lett., 34, L19404, doi:10.1029/2007GL031254.
Syed, T. H., Famiglietti J. S., and Chambers D. P., 2009: GRACE-based estimates of terrestrial freshwater discharge from basin to continental scales. J. Hydrometeor., 10, 22–40, doi:10.1175/2008JHM993.1.
Tapley, B. D., Bettadpur S., Ries J. C., Thompson P. F., and Watkins M. M., 2004: GRACE measurements of mass variability in the Earth system. Science, 305, 503–505, doi:10.1126/science.1099192.
Tourian, M. J., Riegger J., Sneeuw N., and Devaraju B., 2011: Outlier identification and correction for GRACE aggregated data. Stud. Geophys. Geod., 55, 627–640, doi:10.1007/s11200-009-9007-z.
Tourian, M. J., Sneeuw N., and Bárdossy A., 2013: A quantile function approach to discharge estimation from satellite altimetry (ENVISAT). Water Resour. Res., 49, 4174–4186, doi:10.1002/wrcr.20348.
Trenberth, K. E., and Fasullo J. T., 2013: Regional energy and water cycles: Transports from ocean to land. J. Climate, 26, 7837–7851, doi:10.1175/JCLI-D-13-00008.1.
Trenberth, K. E., Smith L., Qian T., Dai A., and Fasullo J., 2007: Estimates of the global water budget and its annual cycle using observational and model data. J. Hydrometeor., 8, 758–769, doi:10.1175/JHM600.1.
Trenberth, K. E., Fasullo J. T., and Mackaro J., 2011: Atmospheric moisture transports from ocean to land and global energy flows in reanalyses. J. Climate, 24, 4907–4924, doi:10.1175/2011JCLI4171.1.
Vörösmarty, C. J., and Coauthors, 2001: Global water data: A newly endangered species. Eos, Trans. Amer. Geophys. Union, 82, 54–58, doi:10.1029/01EO00031.
Vörösmarty, C. J., and Coauthors, 2010: Global threats to human water security and river biodiversity. Nature, 467, 555–561, doi:10.1038/nature09440.
Wahr, J., Molenaar M., and Bryan F., 1998: The time-variability of the Earth’s gravity field: Hydrological and oceanic effects and their possible detection using GRACE. J. Geophys. Res., 103, 30 205–30 230, doi:10.1029/98JB02844.
Werth, S., Güntner A., Schmidt R., and Kusche J., 2009: Evaluation of GRACE filter tools from a hydrological perspective. Geophys. J. Int., 179, 1499–1515, doi:10.1111/j.1365-246X.2009.04355.x.
Wessel, P., and Smith W. H. F., 1991: Free software helps map and display data. Eos, Trans. Amer. Geophys. Union, 72, 441–446, doi:10.1029/90EO00319.
Xie, P., and Arkin P. A., 1997: Global precipitation: A 17-year monthly analysis based on gauge observations, satellite estimates, and numerical model outputs. Bull. Amer. Meteor. Soc., 78, 2539–2558, doi:10.1175/1520-0477(1997)078<2539:GPAYMA>2.0.CO;2.
Zaitchik, B. F., Rodell M., and Olivera F., 2010: Evaluation of the Global Land Data Assimilation System using global river discharge data and a source-to-sink routing scheme. Water Resour. Res., 46, W06507, doi:10.1029/2009WR007811.
Zhou, X., Zhang Y., Wang Y., Zhang H., Vaze J., Zhang L., Yang Y., and Zhou Y., 2012: Benchmarking global land surface models against the observed mean annual runoff from 150 large basins. J. Hydrol., 470–471, 269–279, doi:10.1016/j.jhydrol.2012.09.002.