• Bras, R., and I. Rodríguez-Iturbe, 1976: Evaluation of mean square error involved in approximating the areal average of a rainfall event by a discrete summation. Water Resour. Res., 12, 181–184.
  • Chen, M., P. Xie, J. E. Janowiak, and P. A. Arkin, 2002: Global land precipitation: A 50-yr monthly analysis based on gauge observations. J. Hydrometeor., 3, 249–266.
  • Dai, A., I. Y. Fung, and A. D. Del Genio, 1997: Surface observed global land precipitation variations during 1900–88. J. Climate, 10, 2943–2962.
  • Dai, A., T. Qian, K. E. Trenberth, and J. D. Milliman, 2009: Changes in continental freshwater discharge from 1948 to 2004. J. Climate, 22, 2773–2791.
  • Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99 (C5), 10 143–10 162.
  • Gao, H., Q. Tang, C. R. Ferguson, E. F. Wood, and D. P. Lettenmaier, 2010: Estimating the water budget of major U.S. river basins via remote sensing. Int. J. Remote Sens., 31, 3955–3978, doi:10.1080/01431161.2010.483488.
  • Jung, M., M. Reichstein, and A. Bondeau, 2009: Towards global empirical upscaling of FLUXNET eddy covariance observations: Validation of a model tree ensemble approach using a biosphere model. Biogeosci. Discuss., 6, 5271–5304.
  • Jung, M., and Coauthors, 2010: Recent decline in the global land evapotranspiration trend due to limited moisture supply. Nature, 467, 951–954.
  • Kalman, R. E., 1960: A new approach to linear filtering and prediction problems. J. Basic Eng., 82D, 35–45.
  • Leblanc, M. J., P. Tregoning, G. Ramillien, S. O. Tweed, and A. Fakes, 2009: Basin-scale, integrated observations of the early 21st century multiyear drought in southeast Australia. Water Resour. Res., 45, W04408, doi:10.1029/2008WR007333.
  • Liang, X., D. P. Lettenmaier, E. F. Wood, and S. J. Burges, 1994: A simple hydrologically based model of land surface water and energy fluxes for GSMs. J. Geophys. Res., 99 (D7), 14 415–14 428.
  • Liang, X., E. F. Wood, and D. P. Lettenmaier, 1996: Surface soil moisture parameterization of the VIC-2L model: Evaluation and modifications. Global Planet. Change, 13, 195–206.
  • Luo, L., and E. F. Wood, 2008: Use of Bayesian merging techniques in a multimodel seasonal hydrologic ensemble prediction system for the eastern United States. J. Hydrometeor., 9, 866–884.
  • Luo, L., and Coauthors, 2003: Validation of the North American Land Data Assimilation System (NLDAS) retrospective forcing over the Southern Great Plains. J. Geophys. Res., 108, 8843, doi:10.1029/2002JD003246.
  • Matsuura, K., and C. J. Willmott, 2010: Terrestrial air temperature and precipitation: 1900–2008 gridded monthly and annual time series. Version 2.01, Center for Climatic Research, University of Delaware. [Available online at http://climate.geog.udel.edu/~climate/html_pages/Global2_Ts_2009/README.global_t_ts_2009.html.]
  • McCabe, M. F., E. F. Wood, R. Wojcik, M. Pan, J. Sheffield, H. Gao, and H. Su, 2008: Hydrological consistency using multi-sensor remote sensing data for water and energy cycle studies. Remote Sens. Environ., 112, 430–444.
  • McLaughlin, D. B., 2002: An integrated approach to hydrologic data assimilation: Interpolation, smoothing, and filtering. Adv. Water Resour., 25, 1275–1286.
  • Milly, P. C. D., and K. A. Dunne, 2002: Macroscale water fluxes 1. Quantifying errors in the estimation of basin mean precipitation. Water Resour. Res., 38, 1205, doi:10.1029/2001WR000759.
  • Mitchell, T. D., and P. D. Jones, 2005: An improved method of constructing a database of monthly climate observations and associated high-resolution grids. Int. J. Climatol., 25, 693–712, doi:10.1002/joc.1181.
  • Morel, P., 2001: Why GEWEX? The agenda for a global energy and water cycle program. GEWEX News, Vol. 11, No. 1, International GEWEX Project Office, Silver Spring, MD, 1–11.
  • Murphy, B., and B. Timbal, 2008: A review of recent climate variability and climate change in southeastern Australia. Int. J. Climatol., 28, 859–879, doi:10.1002/joc.1627.
  • NASA, 2003: Earth Science Enterprise Strategy. National Aeronautics and Space Administration, Washington, DC, 94 pp. [Available online at http://science.nasa.gov/media/medialibrary/2010/03/31/ESE_Strategy2003.pdf.]

  • NASA NEWS Science Integration Team, 2007: Predicting energy and water cycle consequences of earth system variability and change. 89 pp. [Available at http://news.cisc.gmu.edu/doc/NEWS_implementation.pdf.]

  • Oki, T., T. Nishimura, and P. Dirmeyer, 1999: Assessment of annual runoff from land surface models using Total Runoff Integrating Pathways (TRIP). J. Meteor. Soc. Japan, 77, 235–255.
  • Pan, M., and E. F. Wood, 2006: Data assimilation for estimating terrestrial water budget using a constrained ensemble Kalman filter. J. Hydrometeor., 7, 534–547.
  • Pan, M., E. F. Wood, R. Wojcik, and M. McCabe, 2008: Estimation of regional terrestrial water cycle using multi-sensor remote sensing observations and data assimilation. Remote Sens. Environ., 112, 1282–1294.
  • Pan, M., H. Li, and E. F. Wood, 2010: Assessing the skill of satellite-based precipitation estimates in hydrologic applications. Water Resour. Res., 46, W09535, doi:10.1029/2009WR008290.
  • Rawlins, M. A., and Coauthors, 2010: Analysis of the Arctic system for freshwater cycle intensification: Observations and expectations. J. Climate, 23, 5715–5737.
  • Rodríguez-Iturbe, I., and J. M. Mejía, 1974: The design of rainfall networks in time and space. Water Resour. Res., 10, 713–728, doi:10.1029/WR010i004p00713.
  • Rossow, W. B., and R. A. Schiffer, 1999: Advances in understanding clouds from ISCCP. Bull. Amer. Meteor. Soc., 80, 2261–2287.
  • Sahoo, A. K., M. Pan, T. J. Troy, R. Vinukollu, J. Sheffield, and E. F. Wood, 2011: Reconciling the global terrestrial water budget using satellite remote sensing. Remote Sens. Environ., 115, 1850–1865, doi:10.1016/j.rse.2011.03.009.
  • Schneider, U., T. Fuchs, A. Meyer-Christoffer, and B. Rudolf, 2008: Global precipitation analysis products of the GPCC. GPCC Publication, 12 pp. [Available online at ftp://ftp.dwd.de/pub/data/gpcc/PDF/GPCC_intro_products_2008.pdf.]
  • Sheffield, J., and E. F. Wood, 2007: Characteristics of global and regional drought, 1950–2000: Analysis of soil moisture data from off-line simulation of the terrestrial hydrologic cycle. J. Geophys. Res., 112, D17115, doi:10.1029/2006JD008288.
  • Sheffield, J., G. Goteti, and E. F. Wood, 2006: Development of a 50-yr high-resolution global dataset of meteorological forcings for land surface modeling. J. Climate, 19, 3088–3111.
  • Sheffield, J., C. R. Ferguson, T. J. Troy, E. F. Wood, and M. F. McCabe, 2009: Closing the terrestrial water budget from satellite remote sensing. Geophys. Res. Lett., 36, L07403, doi:10.1029/2009GL037338.
  • Simmons, A. J., S. Uppala, D. Dee, and S. Kobayashi, 2006: ERA-Interim: New ECMWF reanalysis products from 1989 onwards. ECMWF Newsletter, No. 110, ECMWF, Reading, United Kingdom, 25–35. [Available online at http://www.ecmwf.int/publications/newsletters/pdf/110_rev.pdf.]
  • Simon, D., and T. L. Chia, 2002: Kalman filtering with state equality constraints. IEEE Trans. Aerosp. Electron. Syst., 38, 128–136.
  • Su, Z., 2002: The Surface Energy Balance System (SEBS) for estimation of turbulent heat flux. Hydrol. Earth Syst. Sci., 6, 85–99.
  • Swenson, S., and J. Wahr, 2002: Methods for inferring regional surface-mass anomalies from Gravity Recovery and Climate Experiment (GRACE) measurements of time-variable gravity. J. Geophys. Res., 107, 2193, doi:10.1029/2001JB000576.
  • Tang, Q., H. Gao, P. Yeh, T. Oki, F. Su, and D. P. Lettenmaier, 2010: Dynamics of terrestrial water storage change from satellite and surface observations and modeling. J. Hydrometeor., 11, 156–170.
  • Tian, Y., and C. D. Peters-Lidard, 2010: A global map of uncertainties in satellite-based precipitation measurements. Geophys. Res. Lett., 37, L24407, doi:10.1029/2010GL046008.
  • Troy, T. J., J. Sheffield, and E. F. Wood, 2010: Estimation of the terrestrial water budget over northern Eurasia through the use of multiple data sources. J. Climate, 24, 3272–3293.
  • Vinukollu, R. K., E. F. Wood, C. R. Ferguson, and J. B. Fisher, 2011: Global estimates of evapotranspiration for climate studies using multi-sensor remote sensing data: Evaluation of three process-based approaches. Remote Sens. Environ., 115, 801–823, doi:10.1016/j.rse.2010.11.006.
  • Yang, D., D. Kane, Z. Zhang, D. Legates, and B. Goodison, 2005: Bias corrections of long-term (1973–2004) daily precipitation data over the northern regions. Geophys. Res. Lett., 32, L19501, doi:10.1029/2005GL024057.
  • Fig. 5. Basin-mean precipitation gauge density (gauges per 10⁶ km²) during the study period. Density data for WM are not available and are assumed to be one-half of GPCC. The density–error curve is plotted on the left so that readers can look up the error level for a given gauge density.
  • Fig. 6. Example of merging of precipitation products for the Amazon. Monthly time series of (top) original products and (bottom) the merged estimate. The pie chart represents the mean of the merging weights applied to each product to form the merged estimate, and the gray area marks the range (between maximum and minimum) of inputs.
  • Fig. 7. As in Fig. 6 but of evapotranspiration including (middle panel) bias-corrected versions of the products.
  • Fig. 8. Mean merging weights for the precipitation products for the 32 selected basins.
  • Fig. 9. As in Fig. 8 but for the evapotranspiration.
  • Fig. 10. (left) Unconstrained and (right) constrained water budgets over the Amazon. The water budget imbalance is shown in the bottom panels. By construction, the constrained imbalance is zero. Also shown is the attribution of the imbalance to each variable averaged over the time period.
  • Fig. 11. Annual time series of all water budget components after balance constraining over 12 selected basins. The slope of the linear trend (mm yr⁻²) is annotated to the corresponding time series with the significance level in the brackets (annotated only if the significance level is greater than 95%). Trend and significance level are annotated in different fonts for different variables: bold font for p, normal font for e, and italic font for Δs. All time series values are in millimeters per year.
  • Fig. 12. Seasonal cycles of all water budget components after balance constraining for 12 selected basins.
  • Fig. 13. Mean annual water budgets.
  • Fig. 14. Mean error attribution among water budget terms.

Multisource Estimation of Long-Term Terrestrial Water Budget for Major Global River Basins

  • 1 Department of Civil and Environmental Engineering, Princeton University, Princeton, New Jersey

Abstract

A systematic method is proposed to optimally combine estimates of the terrestrial water budget from different data sources and to enforce the water balance constraint using data assimilation techniques. The method is applied to create global long-term records of the terrestrial water budget by merging a number of global datasets including in situ observations, remote sensing retrievals, land surface model simulations, and global reanalyses. The estimation process has three steps. First, a conventional analysis on the errors and biases in different data sources is conducted based on existing validation/error studies and other information such as sensor network density, model physics, and calibration procedures. Then, the data merging process combines different estimates so that biases and errors from different data sources can be compensated to the greatest extent and the merged estimates have the best possible confidence. Finally, water balance errors are resolved using the constrained Kalman filter technique. The procedure is applied to 32 globally distributed major basins for 1984–2006. The authors believe that the resulting global water budget estimates can be used as a baseline dataset for large-scale diagnostic studies, for example, integrated assessment of basin water resources, trend analysis and attribution, and climate change studies. The global scale of the analysis presents significant challenges in carrying out the error analysis for each water budget variable. For some variables (e.g., evapotranspiration) the assumptions underpinning the error analysis lack supporting quantitative analysis and, thus, may not hold for specific locations. Nevertheless, the merging and water balance constraining technique can be applied to many problems.

Corresponding author address: Ming Pan, Dept. of Civil and Environmental Engineering, Princeton University, Princeton, NJ 08544. E-mail: mpan@princeton.edu

1. Introduction

Comprehensive documentation of the terrestrial water cycle at the global scale and its evolution over time is fundamental to understanding the earth’s climate system and assessing the impacts owing to climate change. Such documentation is also needed to characterize the memories, pathways, and feedbacks between key water, energy, and biogeochemical cycles. With such an enhanced understanding, there is the potential for research programs, such as the World Climate Research Programme (WCRP) Global Energy and Water Cycle Experiment (GEWEX) (Morel 2001) and the National Aeronautics and Space Administration (NASA) Energy and Water Cycle Study (NEWS) (NASA NEWS Science Integration Team 2007), to resolve overarching scientific goals to document and enable improved, observationally based predictions of the energy and water cycles, and “to understand how the Earth is changing and what are the consequences for life on Earth” (NASA 2003). The GEWEX long-term scientific goal is to obtain a quantitative description of weather-scale variations in the global energy and water cycles over a period of at least 20 yr, which will provide the needed scientific basis for understanding climate variability and change. Such long-term datasets have been referred to as Earth System Data Records (ESDRs) by NASA’s Making Earth Science Data Records for Use in Research Environments (MEaSUREs) program, and climate data records (CDRs) by the NOAA National Climatic Data Center and the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT). More broadly, long-term, high-quality, and uninterrupted observational datasets are central to the goals of the Global Climate Observing System (GCOS) framework for the sustained monitoring of essential climate variables (ECV) in support of climate change research.

An underlying challenge in using current in situ measurements, remote sensing (RS) observations, terrestrial water cycle variables from reanalysis model output, or offline land surface models (LSMs) is resolving the uncertainty among the various estimates for a specific variable and the consistency among the variables that make up the terrestrial water budget. Rawlins et al. (2010) analyzed pan-Arctic terrestrial water budget datasets to assess the intensification of the Arctic hydrologic cycle. While they found consistency among the trends, precipitation (p) estimates from eight sources ranged from 420 to 520 mm yr⁻¹; evapotranspiration (e) from six sources ranged from 150 to 320 mm yr⁻¹; and river discharge (q), estimated as p − e, ranged from 180 to 290 mm yr⁻¹ (five sources). Sahoo et al. (2011) evaluated remote-sensing-only datasets in terms of their depiction of terrestrial water budgets over 20 large global river basins. Uncertainties were found to be largest across precipitation datasets relative to other budget components. This was borne out by Tian and Peters-Lidard (2010), who found uncertainties across remote sensing precipitation products on the order of 100%–140% in mid to high latitudes and, especially, for cold season and light precipitation over complex terrain.

Providing data records for all scales and time periods, with sufficient accuracy and consistency for all applications, is virtually impossible from a single estimation system. Therefore, a core strategy of initiatives such as GEWEX and MEaSUREs is to utilize multiple estimates when providing water cycle data records to users, for example, from in situ observations, remote sensing retrievals, atmospheric reanalyses, and offline LSMs. This multisource strategy has the potential to compensate for the limitations of each individual estimation method in terms of its accuracy, spatial coverage, temporal sampling frequency, and so forth. At the same time, this strategy also raises new questions and challenges. First, how do we allow for the discrepancies among different data sources, which can be significant? If a single combined estimate of a variable is to be made from many competing sources, how do we determine the accuracy and consistency of each source and assign uncertainties to the final estimate? Another challenge is that, in many cases, estimates of budget components from different sources do not close the water budget (Pan and Wood 2006; McCabe et al. 2008; Sheffield et al. 2009; Gao et al. 2010; Sahoo et al. 2011); that is, the basic physical constraint of mass balance of water is not satisfied. For example, Sheffield et al. (2009) attempted to close the water budget for the Mississippi River basin using multiple remote sensing estimates of budget components but found significant errors that were larger than the observed streamflow, even after bias correction. Sahoo et al. (2011) found nonclosure errors of similar magnitude for multiple global river basins, on the order of 5%–25% of mean annual precipitation. If mass balance is not preserved, estimates for different variables may present conflicting information (Sheffield et al. 2009). How do we ensure such physical consistency in our estimates of individual variables when multiple sources of data are merged? With these questions and challenges in mind, this study is motivated to develop a strategy to integrate multiple sources of water cycle estimates into one consistent set of water cycle data records using data assimilation techniques.

Our overall goal is to develop a method for merging multiple sources of water budget variables in a consistent manner and develop a long-term global data record for the terrestrial water cycle. By utilizing information from multiple sources, it is expected that the data record will provide a best possible estimate that can be used as a baseline for various hydrological and climate studies. Nevertheless, there are two major challenges that will have to be addressed to meet this goal:

  1. how to determine the quality or error level of each individual estimate and optimally merge them;
  2. how to ensure that the estimates are physically consistent, such that the mass balance of water is conserved among all flux and storage terms.
To meet the first challenge, we will use both theoretical and empirical knowledge to assess the confidence of different estimation systems based on characteristics such as sensor operation mechanisms, sensor network density, model physics, parameterization schemes, past validation studies, and so on, as well as statistical measures to quantify errors. For the second challenge, we use the constrained data assimilation techniques (Pan and Wood 2006), which have been successfully implemented in a number of previous studies (Pan et al. 2008; Sahoo et al. 2011).
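
The mass balance requirement in the second challenge can be written compactly. The following is a sketch using the symbols p, e, q, and s for precipitation, evapotranspiration, runoff, and total terrestrial water storage; the residual symbol r is introduced here for illustration only and is not part of the original notation.

```latex
\frac{ds}{dt} = p - e - q
\qquad\Longrightarrow\qquad
r = p - e - q - \Delta s
```

Here Δs is the storage change over one time step (a month in this study). A physically consistent set of estimates has r = 0, whereas independently derived estimates generally give r ≠ 0, which is the nonclosure error that the constraining step is meant to remove.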

We apply the merging scheme for the 23-yr period 1984–2006, which provides the largest overlap of long-term, large-scale estimates from in situ, remote sensing, and model datasets. The merging can be applied over longer time periods, but this would rely more heavily on modeled datasets before the satellite period (and be subject to the associated shifts due to changing observing systems) and raises the issue of how to merge datasets that change in number and quality. This will be addressed in future work. The analysis is performed over 32 major river basins in the world (Fig. 1) at monthly time scale since these scales are commensurate with the availability of measured streamflow data for the study period, which forms a robust constraint on the water budget. Section 2 gives a discussion of how each terrestrial water cycle variable is estimated and the source data used. Section 3 details the methodology for data merging and balance constraining. Results are presented and discussed in section 4, followed by conclusions in section 5.

Fig. 1.

Thirty-two global basins selected for the water cycle estimation.

Citation: Journal of Climate 25, 9; 10.1175/JCLI-D-11-00300.1

2. The terrestrial water cycle and sources of data

The terrestrial water cycle consists of fluxes into and from the land surface and the water storage on and below the land surface, and is part of the larger circulation of water between the atmosphere, land, and ocean. This terrestrial water cycle is the closest to human habitats and, thus, has the most direct impacts on human lives, for example, through flooding, drought, agricultural productivity, water resources, and ecosystem health. The terrestrial water budget consists of four main terms: precipitation p, evapotranspiration e, runoff q, and total terrestrial water storage s. Note that the precipitation includes both liquid and solid forms, and the storage term includes all possible water stores (soil moisture/ice, snowpack, vegetation canopy moisture/snow, groundwater, wetlands, lakes, reservoirs, and streams). These water cycle variables can be observed in different ways.

  1. For precipitation, long-term records come primarily from in situ (rain gauge) or ground radar measurements. The density of gauge or radar networks differs across the world, with developed countries generally having dense networks and developing regions having sparse measurements. In the past decade, there has been a dramatic growth in the estimation of precipitation from satellite remote sensing (Pan et al. 2010) because of its superior spatial coverage, especially after the launch of the Tropical Rainfall Measuring Mission (TRMM) satellite in late 1997. In this study, four in-situ-based products are used: the Climate Prediction Center (CPC) product (Chen et al. 2002), the Climatic Research Unit (CRU) product (Mitchell and Jones 2005), the Willmott–Matsuura (WM) product (Matsuura and Willmott 2010), and the Global Precipitation Climatology Centre (GPCC) product (Schneider et al. 2008). No satellite data are used because most satellite records are too short (beginning after 1998), and some are rescaled against gauge data and so are only marginally different from in situ products at the basin and monthly scale. Figure 2 shows the seasonal cycle of precipitation calculated from the four products over six representative basins. In general, there is good agreement among the products because they have many rain gauges in common. The spread is slightly higher in densely gauged basins like the Danube, possibly due to differences in gridding and undercatch correction procedures. The spread is extremely low in sparsely monitored regions such as the Amazon and Niger basins, suggesting not only a heavy overlap of the gauges used among products, but also a lack of data for procedures like undercatch correction.
  2. For evapotranspiration, in-situ-based estimates rely on networks of flux towers. Although these networks are very sparse globally, progress has been made in upscaling flux tower estimates to global coverage (Jung et al. 2009). Large-scale estimates can also be derived from remote sensing using satellite-retrieved radiation fluxes and surface meteorological conditions. The retrieval is usually performed using an empirically based, process-based, or energy balance model of boundary layer fluxes, such as the Penman–Monteith (PM), Priestley–Taylor (PT), or the Surface Energy Balance System (SEBS) models (Su 2002). Two evapotranspiration datasets are used in this study: the upscaled flux tower-based dataset from the Max Planck Institute (MPI) (Jung et al. 2010) and the SEBS-derived estimates (Vinukollu et al. 2011) using radiation fluxes from the International Satellite Cloud Climatology Project (ISCCP; Rossow and Schiffer 1999).
  3. Streamflow is well recorded through river gauges, and long-term datasets are available for most major rivers. We use monthly streamflow data compiled by the Global Runoff Data Centre (GRDC). There are gaps in the GRDC records. Most basins are missing no more than a few years of data, but seven (the Aral, Dnieper, Don, Limpopo, Nile, Ural, and Volga) are missing more than 40% of the months in the study period. To fill the gaps, a linear regression is performed between streamflow simulated by a land surface model (LSM; details of the LSM are provided later) and all available GRDC records (back to 1950), and the missing values are filled based on the LSM simulated values. This procedure is very similar to that used in Dai et al. (2009) except that the regression is always performed without considering the significance of the linear dependency.
  4. The terrestrial water storage term includes several variables such as soil moisture, snow, and lake storage. Some of these variables can be observed separately, both from in situ and satellite remote sensing, but others cannot, such as groundwater; some variables suffer from sampling issues, such as low temporal sampling and low resolution for altimeter data and shallow sampling depth for microwave-based soil moisture retrievals. The estimates derived from the Gravity Recovery and Climate Experiment (GRACE) sensor overcome these issues and provide information on total terrestrial storage anomalies. The GRACE storage anomalies (Swenson and Wahr 2002), available from 2002 onward, are used, and estimates from LSM are used for years prior to 2002.
  5. In addition to observational approaches, dynamic model-based or hybrid approaches are also a good source of information. Climate/weather model reanalysis is one of the best ways to reconstruct the fluxes and states of the atmosphere and land because it provides consistent and continuous fields of all of the water cycle variables, albeit with errors due to the shortcomings of the modeling and assimilation process. We use the European Centre for Medium-Range Weather Forecasts (ECMWF) Interim Re-Analysis (ERA-Interim; Simmons et al. 2006) as an alternative source for evapotranspiration (note that ERA-Interim data start from 1989). To minimize the artifacts caused by the assimilation that generally impacts the land water balance, the predicted evapotranspiration from ERA-Interim is replaced by the values that are “inferred” from its atmospheric water budget, that is, the moisture divergence minus the precipitation and the change in the atmospheric column total moisture (Troy et al. 2010). This is referred to as “ERA-Interim Inferred” hereafter.
  6. Land surface models (LSMs) offer a very sophisticated parameterization for the land surface, and offline LSM simulations provide reasonable estimates of the land surface states and fluxes when they are forced with high-quality surface meteorological data. Unlike in a climate or weather model, precipitation is not a prognostic variable in an LSM and has to be supplied as an input. Here we use an offline LSM simulation with the Variable Infiltration Capacity (VIC) model (Liang et al. 1994, 1996). This VIC simulation is forced with surface meteorological data from the Princeton Global Forcing (PGF) (Sheffield et al. 2006) and is an updated version of the simulation of Sheffield and Wood (2007). VIC is calibrated against streamflow data for 25 large river basins globally and thus gives a reasonable depiction of the terrestrial water cycle, at least at the monthly time scale. We use the VIC estimated evapotranspiration within the data merging, the VIC runoff to infill missing data in GRDC, and the VIC storage change for years prior to 2002, before the GRACE satellite was launched.
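
The streamflow gap-filling step described in item 3 can be sketched as follows. This is a minimal illustration, assuming an ordinary least-squares line fitted between LSM-simulated and observed monthly flows wherever both exist; the function name `fill_gaps_by_regression` and the sample numbers are hypothetical and are not taken from the study.

```python
import numpy as np


def fill_gaps_by_regression(observed, simulated):
    """Fill NaN gaps in `observed` using a line fitted against `simulated`.

    The line is fitted on all months where both series are available and is
    applied regardless of how significant the linear relationship is.
    """
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)

    # Months usable for fitting: both the gauge record and the simulation exist.
    both = ~np.isnan(observed) & ~np.isnan(simulated)
    slope, intercept = np.polyfit(simulated[both], observed[both], deg=1)

    # Months to fill: gauge record missing, simulation available.
    gaps = np.isnan(observed) & ~np.isnan(simulated)
    filled = observed.copy()
    filled[gaps] = slope * simulated[gaps] + intercept
    return filled, slope, intercept


if __name__ == "__main__":
    # Hypothetical monthly flows (arbitrary units); NaN marks a missing month.
    gauge = [12.0, np.nan, 32.0, np.nan, 52.0]
    model = [10.0, 20.0, 30.0, 40.0, 50.0]
    filled, slope, intercept = fill_gaps_by_regression(gauge, model)
    print(filled, slope, intercept)
```

Observed months are left untouched; only the missing months take regression-predicted values, so the filled record inherits any bias in the simulated streamflow that the linear fit does not remove.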
Fig. 2.

Seasonal cycle of precipitation from different products over six representative basins for 1984–2006.

Citation: Journal of Climate 25, 9; 10.1175/JCLI-D-11-00300.1

Table 1 summarizes the data sources used in this study. Figure 3 shows the seasonal cycle of evapotranspiration calculated from the four products (VIC, ERA-Interim Inferred, SEBS (ISCCP), and MPI) over the six basins. Unlike for precipitation, the agreement among products is very poor; for example, some products are consistently lower or higher than others. Over the Amazon, there is not even consensus on the shape of the seasonal cycle. This significant divergence among the products and the uncertainty in the true seasonal cycle exemplify the need to merge information from different sources. The method for merging the data products will be discussed in the next section.

Table 1.

Observational and modeling data used in the water cycle study.

Fig. 3.

As in Fig. 2 but of evapotranspiration. The data for ERA-Interim Inferred are for 1989–2006.

3. Water cycle assessment procedure

To produce a single consistent set of estimates for the water cycle variables from multiple sources of data, the assessment procedure includes three basic steps, as shown in the three boxes in Fig. 4. The first step performs conventional error and bias analysis on the various observational and model-derived datasets. This step determines the error variances (or equivalently rms errors or standard errors) of all data products, and corrects any known biases in them. Based on this error information, the second step merges, variable by variable, the various estimates of the same variable into one. The third step resolves the water balance errors and ensures a perfectly closed water budget.

Fig. 4.

Flowchart of the water cycle assessment procedure.

a. Conventional error and bias analysis

In this first step, various aspects of the products are examined based on information like the sensor characteristics, production methods, source data, model parameterization, calibration procedure, error assessment/validation, and so on. When possible, we identify those “best” products that are believed to be relatively unbiased and have the most reasonable temporal dynamics, and perform bias correction with respect to the best ones. Otherwise no bias correction will be performed. Then, we try to quantify the error levels in different data products from available information. The process differs by variable.

For precipitation, all products are gauge based and the relative biases among the products are small (Fig. 2). This suggests that for densely gauged regions the biases may be very small, and for sparsely gauged regions biases may exist but are very hard to assess and correct owing to the lack of better reference data. Consequently, no bias correction will be carried out. For errors, we follow past studies (Dai et al. 1997; Yang et al. 2005) and focus on two major types of errors: sampling errors due to sparse gauge distributions and processing errors related to the production methods, for example, undercatch correction, interpolation, and orographic correction. The first type (sampling) of errors can be related to rain gauge density, and the theoretical relationship between gauge density and sampling errors has been well studied (Rodríguez-Iturbe and Mejía 1974; Bras and Rodríguez-Iturbe 1976). More recent studies like Oki et al. (1999) and Milly and Dunne (2002) tried to assess the sampling errors at scales (large basins, monthly) comparable to our study, and their results provide good guidance for the analysis carried out here. As these recent studies do not offer a functional form for the density–error relationship, we use the following formula to approximately reproduce the mean density–error results presented in Fig. 13 of Oki et al. (1999):
e1
The errors are relative to precipitation magnitude, and d is the gauge density expressed as the number of gauges per 10⁶ km². Note that Oki et al. (1999) calculated the errors in monthly runoff only, and their runoff error versus gauge density curve starts around 80% at the lowest gauge density and then decreases to a lower limit of about 40% with increasing gauge density. In other words, about 40% relative error persists despite increasing gauge density, and we assume that this 40% relative error comes from LSM modeling and routing and that the remainder comes from inadequate gauge sampling. Equation (1) is obtained by subtracting 40% from the curve in Fig. 13 of Oki et al. (1999) and then fitting an exponential function to the remainder. According to Eq. (1), the sampling error peaks at 40% as gauge density goes to zero per 10⁶ km², reduces to 10% at 25 per 10⁶ km², to about 5% at 50 per 10⁶ km², and then to about 1.6% at 100 per 10⁶ km². Among the products used, CPC, GPCC, and CRU provide the exact number of gauges used for each grid cell and month. The Willmott–Matsuura (WM) procedure does not provide this information, but globally the maximum number of gauges in WM is about 22 000, about one-half of the maximum of 46 000 used in GPCC. Thus, we assume WM has a gauge density about one-half that of GPCC everywhere. The basin-averaged gauge density for all basins during 1984–2006 and the density–error curve are shown in Fig. 5. The Amazon, African, and Arctic basins have the lowest gauge densities and the highest sampling errors (up to 32% for CPC in the Congo), whereas basins in developed areas, for example, the Danube, Murray–Darling, and Mississippi, are so densely gauged that the sampling errors are essentially zero. GPCC has significantly more gauges than all other products. The gauge density stays relatively stable during the study period, except for a sudden and dramatic drop in the CPC gauge density from 1997 onward.
Fig. 5.

Basin-mean precipitation gauge density (gauges per 10⁶ km²) during the study period. Density data for WM are not available and are assumed to be one-half of GPCC. The density–error curve is plotted on the left so that readers can look up the error level for a given gauge density.

The second type of error is caused by uncertainties in the production processes and, thus, can be approximated by the deviations of each product from the mean of all products. The two types of errors are assumed independent, so their variances are summed. The calculations are performed for each of the 12 months in a year because the error levels may vary seasonally: for each product, 12 error variances are calculated, one for each month.
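As a rough illustration of the calculation described above, the following Python sketch combines a sampling error and a processing error into one error variance per product and calendar month. The function name, the input layout, and the choice to scale the relative sampling error by each product's own monthly value are assumptions made for illustration only; the relative sampling error is passed in as a number because the density–error relationship of Eq. (1) is not reproduced here.

```python
def monthly_error_variances(series, sampling_rel_err):
    """Sketch: one error variance per product and calendar month.

    series: {product: [monthly precipitation, first value is a January]}
    sampling_rel_err: {product: relative sampling error as a fraction}
    (both layouts are hypothetical, chosen for this example)
    """
    products = list(series)
    n = len(series[products[0]])
    # All-product mean at every time step.
    mean = [sum(series[p][t] for p in products) / len(products)
            for t in range(n)]
    result = {}
    for p in products:
        variances = []
        for month in range(12):
            idx = range(month, n, 12)  # all Januaries, all Februaries, ...
            count = len(idx)
            # Processing error: mean squared deviation from the product mean.
            processing = sum((series[p][t] - mean[t]) ** 2 for t in idx) / count
            # Sampling error: relative error scaled by the monthly value.
            sampling = sum((sampling_rel_err[p] * series[p][t]) ** 2
                           for t in idx) / count
            # Independent errors: the variances add.
            variances.append(sampling + processing)
        result[p] = variances
    return result
```

The returned dictionary holds 12 variances per product, which is the form needed by a month-by-month merging step.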

For evapotranspiration, the situation is quite different. First of all, there is a lack of consensus on the magnitude and shape of the seasonal cycle for places like the Amazon (Fig. 3), and a bias correction can only be based on limited validation studies conducted on individual products over limited regions (Luo et al. 2003; Vinukollu et al. 2011). The Amazon is generally an energy-limited regime for evapotranspiration (Vinukollu et al. 2011) and remote-sensing-based methods tend to work less well because of the large uncertainties in surface radiation and surface meteorology. Also, ground observations are scarce, and the models tend to offer a more reliable seasonal cycle. Therefore, we bias correct the observational products, SEBS (ISCCP) and MPI, for each of the 12 months in a year, to match the seasonal cycle of the average of VIC and ERA-Interim Inferred. For other energy-limited basins, the same bias correction is performed. For water-limited basins, including the Niger, Nile, Murray–Darling, Limpopo, and Yellow, confidence in the model-produced seasonal cycle is not significantly stronger than for the observations and remote-sensing-based estimates. However, the annual total evapotranspiration from VIC is reasonable because VIC is forced with observed rainfall and calibrated against observed runoff, both of which are better observed than evapotranspiration, and VIC forces water balance by model construct. For these basins, we bias correct the SEBS (ISCCP), MPI, and ERA-Interim Inferred to match the annual totals of VIC. After the bias correction, the error variances are calculated around the mean of all products owing to the lack of an alternative method to quantify errors from production information.

For runoff, the standard errors in the gauge-based measurements are usually assumed to be 5%–10% of the runoff values (Sheffield et al. 2009). This error varies according to factors like river flow volume, gauging methods, climate regimes, channel morphology, etc. Because larger rivers tend to be easier to gauge, we set the GRDC runoff errors inversely proportional to the mean annual flow volume with the largest river, the Amazon (A_Amazon = 4.62 × 10⁶ km²), having the lowest error of 5%, and the smallest river, the Ural (A_Ural = 0.19 × 10⁶ km²), having the highest error of 10%. The formula to calculate the runoff error for a basin with drainage area A is simply
e2
The runoff observations are assumed to be unbiased. In months where GRDC records are missing and a regression has to be performed to fill the data gap based on VIC simulations, the regression errors are added to the river gauging errors, thus assuming that the two error sources are independent. Errors in the VIC-estimated storage change are simply assumed to be 5% of the change values, and 10% is assumed for GRACE. Note that GRACE storage anomalies include “all” of the storage elements, whereas VIC simulates only storage changes related to soil moisture, canopy water, and snow/ice; changes related to deep groundwater, lakes, and wetlands are not accounted for. So, a level of inconsistency exists between GRACE and the LSM. However, recent studies over continental U.S. regions, including Gao et al. (2010) and Tang et al. (2010), have shown that there is reasonable agreement between GRACE and the LSM at basin scales.

b. Data assimilation to combine estimates

With error analysis and bias corrections achieved in the first step, the goal of the second step is to combine multiple estimates of the same variable into one single “best” estimate that has the minimum rms error. We can assume that all estimates are unbiased now and combine the estimates linearly. Mathematically, the problem can be described as follows. Say we have N estimates for the same variable υ, $\hat{\upsilon}_1, \hat{\upsilon}_2, \ldots, \hat{\upsilon}_N$, and we also know the error variances for all of the estimates, $\sigma_1^2, \sigma_2^2, \ldots, \sigma_N^2$, where the error variance is defined as
$$\sigma_i^2 = \overline{(\hat{\upsilon}_i - \upsilon)^2}. \tag{3}$$
The symbol υ without a caret refers to its true value. We assume that the errors are zero-mean Gaussian (unbiased) and independent among products. The goal is to find a linear combination of candidate estimates,
$$\hat{\upsilon} = \sum_{i=1}^{N} w_i \hat{\upsilon}_i, \tag{4}$$
that has the (expected) minimum error variance; $w_i$ are weights associated with each estimate and sum to unity, that is, $\sum_{i=1}^{N} w_i = 1$, due to the unbiased condition. The solution to the optimal weights is (Luo and Wood 2008):
$$w_i = \frac{1/\sigma_i^2}{\sum_{j=1}^{N} 1/\sigma_j^2}. \tag{5}$$
In other words, the weight for one estimate is proportional to the inverse of its error variance. The error variance of the merged estimate is calculated as
$$\sigma^2 = \frac{1}{\sum_{i=1}^{N} 1/\sigma_i^2}. \tag{6}$$
Note that the independent errors assumption may not hold in cases like gauge-based precipitation products where different products share many common gauges. The dependence among products will lead to an underestimation of the error variance in the combined estimate.
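The merging rule described in this subsection, with each weight proportional to the inverse of the error variance, can be sketched in a few lines of Python. The function name and argument layout are hypothetical, and the sketch assumes unbiased estimates with independent errors, as stated in the text.

```python
def merge_estimates(values, variances):
    """Inverse-variance weighted merge of N estimates of one variable.

    values: the N estimates; variances: their error variances (all > 0).
    Returns (merged value, merged error variance, weights).
    """
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)
    # Weights are proportional to 1/variance and sum to one.
    weights = [inv / total for inv in inverse]
    merged = sum(w * x for w, x in zip(weights, values))
    # Under independent errors the merged variance is 1 / sum(1/variance).
    merged_variance = 1.0 / total
    return merged, merged_variance, weights
```

For two estimates of 10 and 20 with error variances 1 and 4, the weights are 0.8 and 0.2, the merged value is 12, and the merged error variance is 0.8, which is smaller than either input variance. If the product errors are in fact correlated, this merged variance is too optimistic.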

c. Closing the water balance and attribution of imbalance errors

Combined estimates of individual water budget variables from the second step do not necessarily close the water balance. The water balance errors are defined and treated using the constrained Kalman filter (CKF) as follows.

1) Water balance errors

Suppose for time k that we have the estimates of all terrestrial water cycle variables: precipitation $p_k$, evapotranspiration $e_k$, runoff $q_k$, and total terrestrial water storage $s_k$. The mass balance of water requires that
$$s_k = s_{k-1} + p_k - e_k - q_k. \tag{7}$$
Defining the storage change term as $\Delta s_k = s_k - s_{k-1}$, and omitting the time index k, the above equation becomes
$$\Delta s = p - e - q. \tag{8}$$
Thus, the residual (defined below) of the water balance should be zero:
$$r = p - e - q - \Delta s = 0. \tag{9}$$
Unfortunately, individual estimates of p, e, q, and s from independent sources usually do not close the mass balance and the residual term r ≠ 0. However, the mass balance of water is a physical relationship that should not be violated. Therefore, a joint estimation of the entire water budget must enforce the balance constraint, and this adds new information, that is, our physical knowledge about the dynamic system, to the unconstrained estimates of individual variables. Proper removal of mass balance errors provides added value to the original estimates.
Note that the mass balance is a linear constraint. If we write the complete set of water budget variables as a column vector,
$$\mathbf{x} = [p, e, q, \Delta s]^{\mathrm T}, \tag{10}$$
then the balance constraint can be expressed as a linear function of the vector,
$$r = \mathbf{G}\mathbf{x} = 0, \tag{11}$$
in which
$$\mathbf{G} = [1, -1, -1, -1]. \tag{12}$$
Note also that we assume that the storage term at the previous time is known as a constant; therefore, Δs and s are two equivalent terms and knowledge of one uniquely determines the other.

2) Constrained Kalman filter

The water budget imbalance errors are analyzed with a constrained Kalman filter (CKF), which is a simpler (nonensemble) form of the constrained ensemble Kalman filter (CEnKF) proposed in Pan and Wood (2006).

The problem of enforcing the water balance constraint can be defined mathematically as follows: given an unclosed set of water budget estimates $\hat{\mathbf{x}}$ and their error estimates, where $\hat{r} = \mathbf{G}\hat{\mathbf{x}} \neq 0$, we look for a new set of estimates $\hat{\mathbf{x}}'$ such that $\mathbf{G}\hat{\mathbf{x}}' = 0$ and the expected errors in the new estimate are minimized in an rms sense. Specifically, all of the errors are zero mean and Gaussian and can be quantified by “covariances,” that is, the error covariances of the state estimate $\hat{\mathbf{x}}$, which is referred to as $\varepsilon_{xx}$ and defined as
$$\varepsilon_{xx} = \overline{(\hat{\mathbf{x}} - \mathbf{x})(\hat{\mathbf{x}} - \mathbf{x})^{\mathrm T}}. \tag{13}$$
The symbol without a caret, for example, $\mathbf{x}$, refers to the “true value” of the vector, and the bar over an expression means “expectation” or “mean.” Then $\varepsilon_{xx}$ has dimension 4 × 4 and it breaks down to error covariances between each individual water budget term:
$$\varepsilon_{xx} = \begin{bmatrix}
\varepsilon_{pp} & \varepsilon_{pe} & \varepsilon_{pq} & \varepsilon_{p\Delta s} \\
\varepsilon_{ep} & \varepsilon_{ee} & \varepsilon_{eq} & \varepsilon_{e\Delta s} \\
\varepsilon_{qp} & \varepsilon_{qe} & \varepsilon_{qq} & \varepsilon_{q\Delta s} \\
\varepsilon_{\Delta s p} & \varepsilon_{\Delta s e} & \varepsilon_{\Delta s q} & \varepsilon_{\Delta s \Delta s}
\end{bmatrix}. \tag{14}$$
The covariance matrix is symmetric and only the lower (or upper) triangle part needs to be calculated.

There have been several approaches to this well-posed problem, for example, formulating it as an error minimization problem or as an orthogonal projection onto the constraint surface (Simon and Chia 2002), but they all arrive at the same solution. The solutions have been discussed thoroughly in Simon and Chia (2002) in the context of a filtering or data assimilation problem (McLaughlin 2002) with a state constraint. For simplicity and convenient implementation with a regular Kalman filter (KF) (Kalman 1960) or an ensemble Kalman filter (EnKF) (Evensen 1994), the constraining procedure adopted in Pan and Wood (2006) is done as a “postprocessing” procedure, which can be independently performed after a regular filter update. Because this postprocessing approach is a standalone procedure, it is ideal for water balance estimation applications without filtering or data assimilation.

The postprocessing enforces the balance constraint through an additional Kalman filter in which the observation is a zero residual term and this observation is “perfect”; that is, the observation error covariance is zero. The method is derived in ensemble form in Pan and Wood (2006) and named the CEnKF, and its nonensemble form can be referred to as the CKF. The CKF results are
$$\hat{\mathbf{x}}' = \hat{\mathbf{x}} - \mathbf{K}\mathbf{G}\hat{\mathbf{x}}, \qquad \mathbf{K} = \varepsilon_{xx}\mathbf{G}^{\mathrm T}\left(\mathbf{G}\varepsilon_{xx}\mathbf{G}^{\mathrm T}\right)^{-1}, \tag{15}$$
where $\mathbf{K}$ is called the Kalman gain. Given $\hat{r} = \mathbf{G}\hat{\mathbf{x}}$, we have
$$\hat{\mathbf{x}}' = \hat{\mathbf{x}} - \mathbf{K}\hat{r}. \tag{16}$$
The above equation states that the balance-constrained estimate is the original estimate minus an adjustment term, which is the residual term multiplied by the Kalman gain matrix $\mathbf{K}$. Effectively, the residual term is removed by distributing it back into the different water budget variables. Since $\mathbf{K} \propto \varepsilon_{xx}\mathbf{G}^{\mathrm T}$ (the factor $\mathbf{G}\varepsilon_{xx}\mathbf{G}^{\mathrm T}$ is a scalar), the distribution of the balance residual is determined by both the magnitude of errors in the various water budget variables and their correlations. This is an error attribution process, and a larger part of the balance error will be attributed to a specific variable if it has larger errors relative to the other variables. Additionally, the adjustment term provides a breakdown of errors, and this breakdown offers valuable information on the attribution of errors and biases (in a relative sense) to each of precipitation, evapotranspiration, runoff, and land moisture storage.

The state error covariance $\varepsilon_{xx}$ is calculated entry by entry according to Eq. (14). All diagonal entries ($\varepsilon_{pp}$, $\varepsilon_{ee}$, $\varepsilon_{qq}$, and $\varepsilon_{\Delta s \Delta s}$) have already been calculated from the merging step and they vary from month to month. To calculate the cross-covariance $\varepsilon_{pe}$, we break it down into the product of the two standard deviations and the correlation, $\varepsilon_{pe} = \rho_{pe}\sqrt{\varepsilon_{pp}\,\varepsilon_{ee}}$, and assume that the error correlation $\rho_{pe}$ remains unchanged in time (stationary). As a result, $\rho_{pe}$ can be computed using the deviations from all product means during the merging step. Errors in q and Δs are assumed uncorrelated with errors in p or e and mutually uncorrelated as well, owing to the lack of data for their estimation. Thus, $\varepsilon_{qp} = \varepsilon_{qe} = \varepsilon_{\Delta s p} = \varepsilon_{\Delta s e} = \varepsilon_{q \Delta s} = 0$.
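The constraining step can be illustrated with a short Python sketch. It assumes the state ordering [p, e, q, Δs], the residual convention r = p − e − q − Δs (so the constraint row vector is [1, −1, −1, −1]), and the standard Kalman update with a perfect observation of zero residual; the function name and the list-of-lists covariance layout are hypothetical. Because the constraint is a single scalar equation, no matrix inversion is needed.

```python
def constrain_water_budget(x_hat, cov):
    """Sketch of a balance-constraining update for one time step.

    x_hat: [p, e, q, ds] unconstrained estimates (assumed ordering)
    cov:   4x4 error covariance of x_hat as a list of lists
    Returns (constrained estimates, adjustment removed from each term).
    """
    G = [1.0, -1.0, -1.0, -1.0]  # assumed convention: r = p - e - q - ds
    residual = sum(g * x for g, x in zip(G, x_hat))
    # cov * G^T: covariance of each variable's error with the residual error.
    cov_xr = [sum(cov[i][j] * G[j] for j in range(4)) for i in range(4)]
    # G * cov * G^T: scalar error variance of the residual.
    var_r = sum(G[i] * cov_xr[i] for i in range(4))
    gain = [c / var_r for c in cov_xr]  # Kalman gain, one entry per term
    adjustment = [k * residual for k in gain]
    x_new = [x - a for x, a in zip(x_hat, adjustment)]
    return x_new, adjustment
```

For example, with estimates [100, 60, 30, 0] and uncorrelated error variances [4, 4, 1, 1], the residual of 10 is distributed as adjustments [4, −4, −1, −1], giving constrained estimates [96, 64, 31, 1] that close the balance. The terms with larger error variances absorb larger shares of the imbalance, which is the attribution behavior described above.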

4. Results and discussion

Not all of the 32 basins need to be discussed in detail, and the Amazon will serve as the example for the entire water cycle assessment process since it is perhaps the most difficult but important study basin because of its extremely large size and flux volume. Difficulties in determining the budget arise from the extremely sparse observation networks relative to the magnitude of its budget terms, inadequate modeling because of the uncertainty in the inputs, and limited validation data. The merging process is shown for precipitation products over the Amazon in Fig. 6. Even though the four products do not differ significantly (top panel), their weights in the merging are quite different: GPCC takes the majority share of 74%, and WM, CPC, and CRU take 15%, 9%, and 2%, respectively. Such a weighting reflects the strong influence of the gauge densities of the four products over the Amazon (44, 22, 14, and 4 per 10⁶ km² for GPCC, WM, CPC, and CRU, respectively), because these low density numbers lead to large differences in sampling errors (see the curve in Fig. 5). The influence of gauge density becomes insignificant in high-density areas such as the Danube or Mississippi, where cross-product differences basically determine the uncertainties. Note that the merging weights differ seasonally, and the mean weights are shown in the bottom panel of Fig. 6 as a pie chart.

Fig. 6.

Example of merging of precipitation products for the Amazon. Monthly time series of (top) original products and (bottom) the merged estimate. The pie chart represents the mean of the merging weights applied to each product to form the merged estimate, and the gray area marks the range (between maximum and minimum) of inputs.

Figure 7 shows the bias correction (middle panel) and merging process (bottom panel) for evapotranspiration. Over the Amazon, the bias correction adjusts all products to match the seasonal cycle of the average of VIC and ERA-Interim. As seen in Fig. 3, VIC and ERA-Interim have weak but conflicting seasonal cycles, and their average has very low seasonal variations, which is similar to MPI. The high-biased SEBS is adjusted downward significantly. In terms of merging weights, MPI and VIC tend to have very strong weights (44% and 36%), and ERA-Interim Inferred and SEBS (ISCCP) are less weighted (8% and 12%). Both ERA-Interim Inferred and SEBS (ISCCP) have relatively large spikes, which place them farther away from the all-product means and result in lower weights.

Fig. 7.

As in Fig. 6 but of evapotranspiration including (middle panel) bias-corrected versions of the products.

The same merging procedure is implemented over all 32 basins, and Figs. 8 and 9 give the merging weights of all basins for precipitation and evapotranspiration, respectively. For precipitation (Fig. 8), the GPCC product has larger weights than others over sparsely gauged areas like the Arctic region, South America, and Africa. The dominance of GPCC diminishes over Europe, North America, and Australia. For evapotranspiration (Fig. 9), VIC and MPI hold significant and similar weights in most basins, and the two products contribute more than two-thirds of the total weight. ERA-Interim Inferred takes much less weight, and SEBS (ISCCP) contributes the least (slightly higher in tropical basins than in high-latitude basins). This may suggest that the remote sensing estimate is often an outlier relative to the modeling or tower-based estimates, even though we lack evidence and data to show which is more reliable.

Fig. 8.

Mean merging weights for the precipitation products for the 32 selected basins.

Fig. 9.

As in Fig. 8 but for the evapotranspiration.

No merging is applied to the runoff or the storage change because only one product is used at a time. The merged water cycle estimates do not close the water budget, and this is shown in the left column in Fig. 10. The top panel plots all flux terms: evapotranspiration and runoff are plotted as blue and green stacked bars and the precipitation is shown by a purple line. If the water balance is satisfied, then the gap between the purple line and the top of the stacked bars should be exactly balanced by the storage change (cyan bars in the middle panel); otherwise the imbalance values are plotted as red bars in the bottom panel. The imbalances over the Amazon have a mean close to zero, that is, no long-term accumulated imbalance, and the values vary seasonally. The right column in Fig. 10 repeats the left column except that all estimates are balance constrained using the CKF. The CKF removes the imbalances by distributing them to all water cycle components according to their error covariances with the imbalances, which reflect both the error variances and the error correlations. The CKF attributes the imbalances month by month, and the mean attribution is shown as a pie chart in the bottom right panel of Fig. 10. For the Amazon, 38% of the errors are attributed to precipitation owing to the extremely sparse gauges. A very significant part is also attributed to evapotranspiration (25%) and runoff (25%). This is because, on the one hand, the cross-product discrepancy is large for evapotranspiration in the Amazon and, on the other hand, the runoff is a very large term in the Amazon, so the 5% error becomes significant.

Fig. 10.

(left) Unconstrained and (right) constrained water budgets over the Amazon. The water budget imbalance is shown in the bottom panels. By construction, the constrained imbalance is zero. Also shown is the attribution of the imbalance to each variable averaged over the time period.

Figures 11 and 12 summarize the final results for the merged and balance-constrained water budget estimates. Figure 11 gives the annual time series of all water cycle components for 1984–2006 over 12 selected basins. Linear trends are calculated from the time series and the slopes are annotated in Fig. 11. The slope is shown only if the trend is statistically significant at the 95% level. The largest trends are positive, for example, the rising fluxes over the Niger and Mekong. A notable negative trend occurs for the runoff in the Murray–Darling basin in Australia. Interestingly, the storage change over Murray–Darling has a net negative accumulation in the final years of the period. This net loss of terrestrial water storage relates to the long drought over the region (Murphy and Timbal 2008) and possibly the use of groundwater for irrigation that would be identified by the GRACE data (Leblanc et al. 2009). Figure 12 gives the seasonal cycles of all water cycle components, which are as expected, and confirms the importance of snow storage and melt over northern basins like the Lena, Mackenzie, and Volga.

Fig. 11.

Annual time series of all water budget components after balance constraining over 12 selected basins. The slope of the linear trend (mm yr−2) is annotated to the corresponding time series with the significance level in the brackets (annotated only if the significance level is greater than 95%). Trend and significance level are annotated in different fonts for different variables: bold font for p, normal font for e, and italic font for Δs. All time series values are in millimeters per year.

Fig. 12.

Seasonal cycles of all water budget components after balance constraining for 12 selected basins.

Figure 13 presents an overview of the final results in terms of a map of the mean annual fluxes and storage releases. The blue and green stacked bars (to scale) represent the long-term mean annual evapotranspiration and runoff (mm month−1). The sum of the two fluxes should be very close to the precipitation because the long-term storage change should equal zero. The cyan bars represent the mean annual maximum storage release (mm month−1). This map shows the relative magnitude of all terrestrial water cycle components and the relative importance of the snowmelt. Unsurprisingly, basins like the Amazon, Congo, and Mekong are among the wettest, and snow plays a large role in the water budgets for Arctic river basins such as the Northern Dvina and Pechora.

Fig. 13.

Mean annual water budgets.

Finally, Fig. 14 shows a map of the mean error attribution from the water balance constraining. The evapotranspiration term receives the most error attribution in a significant portion of the 32 basins, especially those in Africa, South Asia, and Australia. Precipitation receives a significant amount of error attribution in South America where the gauges are scarce and the rainfall is heavy. Attributions to the storage term are generally small except for those northern basins where the snow storage is large.

Fig. 14.

Mean error attribution among water budget terms.

5. Conclusions

A systematic method is proposed to optimally combine estimates of the terrestrial water cycle from different data sources and to enforce the water balance constraint using data assimilation techniques. The method has been applied to create global long-term records of the terrestrial water budget by merging a number of global datasets including in situ observations, remote sensing retrievals, LSM simulation output, and global reanalyses. The data merging process utilizes existing knowledge of the land surface dynamical system and the characteristics of various estimation methods, such that biases and errors from different data sources can be compensated to the extent possible and the merged estimates carry the best possible confidence. The global scale of the study and the number of different variables and products involved pose a major challenge to this study; as a result, some of the assumptions made during the bias and error analysis, especially those for evapotranspiration products, are not supported by adequate in situ data (and subsequent quantitative analysis). However, we believe that the resulting global water budget estimates over 32 major basins for 1984–2006 can be used as a baseline dataset for large-scale diagnostic studies, for example, integrated assessment of basin water resources, trend analysis and attribution, and climate change studies. As 23 years is still relatively short for trend analysis of variables with strong interannual variability, the next step is to improve the quality, coverage, resolution, and potential usage of such a baseline data record by including more sources of long-term data where available.

The temporal aggregation to monthly values limits the applicability of the merged dataset to study issues such as the diurnal cycle and daily variations, but such issues have a small impact when basin-average water budgets or long-term time scale analyses are carried out. There is a potential to extend the merged estimates to finer time and space scales, but a major challenge is to spatially disaggregate river runoff to discrete grids, which is an ongoing activity by the authors.

Acknowledgments

This research is supported by National Aeronautics and Space Administration (NASA) Grant NNX08AN40A (“Developing consistent Earth system data records for the global terrestrial water cycle”).

REFERENCES

  • Bras, R., , and I. Rodríguez-Iturbe, 1976: Evaluation of mean square error involved in approximating the areal average of a rainfall event by a discrete summation. Water Resour. Res., 12, 181184.

    • Search Google Scholar
    • Export Citation
  • Chen, M., , P. Xie, , J. E. Janowiak, , and P. A. Arkin, 2002: Global land precipitation: A 50-yr monthly analysis based on gauge observations. J. Hydrometeor., 3, 249266.

    • Search Google Scholar
    • Export Citation
  • Dai, A., , I. Y. Fung, , and A. D. Del Genio, 1997: Surface observed global land precipitation variations during 1900–88. J. Climate, 10, 29432962.

    • Search Google Scholar
    • Export Citation
  • Dai, A., , T. Qian, , K. E. Trenberth, , and J. D. Milliman, 2009: Changes in continental freshwater discharge from 1948 to 2004. J. Climate, 22, 27732791.

    • Search Google Scholar
    • Export Citation
  • Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99 (C5), 10 14310 162.

    • Search Google Scholar
    • Export Citation
  • Gao, H., , Q. Tang, , C. R. Ferguson, , E. F. Wood, , and D. P. Lettenmaier, 2010: Estimating the water budget of major U.S. river basins via remote sensing. Int. J. Remote Sens., 31, 39553978, doi:10.1080/01431161.2010.483488.

    • Search Google Scholar
    • Export Citation
  • Jung, M., , M. Reichstein, , and A. Bondeau, 2009: Towards global empirical upscaling of FLUXNET eddy covariance observations: Validation of a model tree ensemble approach using a biosphere model. Biogeosci. Discuss., 6, 52715304.

    • Search Google Scholar
    • Export Citation
  • Jung, M., and Coauthors, 2010: Recent decline in the global land evapotranspiration trend due to limited moisture supply. Nature, 467, 951954.

    • Search Google Scholar
    • Export Citation
  • Kalman, R. E., 1960: A new approach to linear filtering and prediction problems. J. Basic Eng., 82D, 3545.

  • Leblanc, M. J., , P. Tregoning, , G. Ramillien, , S. O. Tweed, , and A. Fakes, 2009: Basin-scale, integrated observations of the early 21st century multiyear drought in southeast Australia. Water Resour. Res., 45, W04408, doi:10.1029/2008WR007333.

    • Search Google Scholar
    • Export Citation
  • Liang, X., , D. P. Lettenmaier, , E. F. Wood, , and S. J. Burges, 1994: A simple hydrologically based model of land surface water and energy fluxes for GSMs. J. Geophys. Res., 99 (D7), 14 41514 428.

    • Search Google Scholar
    • Export Citation
  • Liang, X., , E. F. Wood, , and D. P. Lettenmaier, 1996: Surface soil moisture parameterization of the VIC-2L model: Evaluation and modifications. Global Planet. Change, 13, 195206.

    • Search Google Scholar
    • Export Citation
  • Luo, L., , and E. F. Wood, 2008: Use of Bayesian merging techniques in a multimodel seasonal hydrologic ensemble prediction system for the eastern United States. J. Hydrometeor., 9, 866884.

    • Search Google Scholar
    • Export Citation
  • Luo, L., and Coauthors, 2003: Validation of the North American Land Data Assimilation System (NLDAS) retrospective forcing over the Southern Great Plains. J. Geophys. Res., 108, 8843, doi:10.1029/2002JD003246.

  • Matsuura, K., and C. J. Willmott, 2010: Terrestrial air temperature and precipitation: 1900–2008 gridded monthly and annual time series. Version 2.01, Center for Climatic Research, University of Delaware. [Available online at http://climate.geog.udel.edu/~climate/html_pages/Global2_Ts_2009/README.global_t_ts_2009.html.]

  • McCabe, M. F., E. F. Wood, R. Wojcik, M. Pan, J. Sheffield, H. Gao, and H. Su, 2008: Hydrological consistency using multi-sensor remote sensing data for water and energy cycle studies. Remote Sens. Environ., 112, 430–444.

  • McLaughlin, D. B., 2002: An integrated approach to hydrologic data assimilation: Interpolation, smoothing, and filtering. Adv. Water Resour., 25, 1275–1286.

  • Milly, P. C. D., and K. A. Dunne, 2002: Macroscale water fluxes 1. Quantifying errors in the estimation of basin mean precipitation. Water Resour. Res., 38, 1205, doi:10.1029/2001WR000759.

  • Mitchell, T. D., and P. D. Jones, 2005: An improved method of constructing a database of monthly climate observations and associated high-resolution grids. Int. J. Climatol., 25, 693–712, doi:10.1002/joc.1181.

  • Morel, P., 2001: Why GEWEX? The agenda for a global energy and water cycle program. GEWEX News, Vol. 11, No. 1, International GEWEX Project Office, Silver Spring, MD, 1–11.

  • Murphy, B., and B. Timbal, 2008: A review of recent climate variability and climate change in southeastern Australia. Int. J. Climatol., 28, 859–879, doi:10.1002/joc.1627.

  • NASA, 2003: Earth Science Enterprise Strategy. National Aeronautics and Space Administration, Washington, DC, 94 pp. [Available online at http://science.nasa.gov/media/medialibrary/2010/03/31/ESE_Strategy2003.pdf.]

  • NASA NEWS Science Integration Team, 2007: Predicting energy and water cycle consequences of earth system variability and change. 89 pp. [Available at http://news.cisc.gmu.edu/doc/NEWS_implementation.pdf.]

  • Oki, T., T. Nishimura, and P. Dirmeyer, 1999: Assessment of annual runoff from land surface models using Total Runoff Integrating Pathways (TRIP). J. Meteor. Soc. Japan, 77, 235–255.

  • Pan, M., and E. F. Wood, 2006: Data assimilation for estimating terrestrial water budget using a constrained ensemble Kalman filter. J. Hydrometeor., 7, 534–547.

  • Pan, M., E. F. Wood, R. Wojcik, and M. McCabe, 2008: Estimation of regional terrestrial water cycle using multi-sensor remote sensing observations and data assimilation. Remote Sens. Environ., 112, 1282–1294.

  • Pan, M., H. Li, and E. F. Wood, 2010: Assessing the skill of satellite-based precipitation estimates in hydrologic applications. Water Resour. Res., 46, W09535, doi:10.1029/2009WR008290.

  • Rawlins, M. A., and Coauthors, 2010: Analysis of the Arctic system for freshwater cycle intensification: Observations and expectations. J. Climate, 23, 5715–5737.

  • Rodríguez-Iturbe, I., and J. M. Mejía, 1974: The design of rainfall networks in time and space. Water Resour. Res., 10, 713–728, doi:10.1029/WR010i004p00713.

  • Rossow, W. B., and R. A. Schiffer, 1999: Advances in understanding clouds from ISCCP. Bull. Amer. Meteor. Soc., 80, 2261–2287.

  • Sahoo, A. K., M. Pan, T. J. Troy, R. Vinukollu, J. Sheffield, and E. F. Wood, 2011: Reconciling the global terrestrial water budget using satellite remote sensing. Remote Sens. Environ., 115, 1850–1865, doi:10.1016/j.rse.2011.03.009.

  • Schneider, U., T. Fuchs, A. Meyer-Christoffer, and B. Rudolf, 2008: Global precipitation analysis products of the GPCC. GPCC Publication, 12 pp. [Available online at ftp://ftp.dwd.de/pub/data/gpcc/PDF/GPCC_intro_products_2008.pdf.]

  • Sheffield, J., and E. F. Wood, 2007: Characteristics of global and regional drought, 1950–2000: Analysis of soil moisture data from off-line simulation of the terrestrial hydrologic cycle. J. Geophys. Res., 112, D17115, doi:10.1029/2006JD008288.

  • Sheffield, J., G. Goteti, and E. F. Wood, 2006: Development of a 50-yr high-resolution global dataset of meteorological forcings for land surface modeling. J. Climate, 19, 3088–3111.

  • Sheffield, J., C. R. Ferguson, T. J. Troy, E. F. Wood, and M. F. McCabe, 2009: Closing the terrestrial water budget from satellite remote sensing. Geophys. Res. Lett., 36, L07403, doi:10.1029/2009GL037338.

  • Simmons, A. J., S. Uppala, D. Dee, and S. Kobayashi, 2006: ERA-Interim: New ECMWF reanalysis products from 1989 onwards. ECMWF Newsletter, No. 110, ECMWF, Reading, United Kingdom, 25–35. [Available online at http://www.ecmwf.int/publications/newsletters/pdf/110_rev.pdf.]

  • Simon, D., and T. L. Chia, 2002: Kalman filtering with state equality constraints. IEEE Trans. Aerosp. Electron. Syst., 38, 128–136.

  • Su, Z., 2002: The Surface Energy Balance System (SEBS) for estimation of turbulent heat flux. Hydrol. Earth Syst. Sci., 6, 85–99.

  • Swenson, S., and J. Wahr, 2002: Methods for inferring regional surface-mass anomalies from Gravity Recovery and Climate Experiment (GRACE) measurements of time-variable gravity. J. Geophys. Res., 107, 2193, doi:10.1029/2001JB000576.

  • Tang, Q., H. Gao, P. Yeh, T. Oki, F. Su, and D. P. Lettenmaier, 2010: Dynamics of terrestrial water storage change from satellite and surface observations and modeling. J. Hydrometeor., 11, 156–170.

  • Tian, Y., and C. D. Peters-Lidard, 2010: A global map of uncertainties in satellite-based precipitation measurements. Geophys. Res. Lett., 37, L24407, doi:10.1029/2010GL046008.

  • Troy, T. J., J. Sheffield, and E. F. Wood, 2010: Estimation of the terrestrial water budget over northern Eurasia through the use of multiple data sources. J. Climate, 24, 3272–3293.

  • Vinukollu, R. K., E. F. Wood, C. R. Ferguson, and J. B. Fisher, 2011: Global estimates of evapotranspiration for climate studies using multi-sensor remote sensing data: Evaluation of three process-based approaches. Remote Sens. Environ., 115, 801–823, doi:10.1016/j.rse.2010.11.006.

  • Yang, D., D. Kane, Z. Zhang, D. Legates, and B. Goodison, 2005: Bias corrections of long-term (1973–2004) daily precipitation data over the northern regions. Geophys. Res. Lett., 32, L19501, doi:10.1029/2005GL024057.
