The terrestrial water cycle in the Australian Community Atmosphere Biosphere Land Exchange (CABLE) model has been evaluated across a range of temporal and spatial domains. A series of offline experiments was conducted using the forcing data from the second Global Soil Wetness Project (GSWP-2) for the period 1986–95, but with CABLE's default parameter settings. Results were compared against GSWP-2 multimodel ensembles and a range of observationally driven datasets. CABLE-simulated global mean evapotranspiration (ET) and runoff agreed well with the GSWP-2 multimodel climatology and observations, and the spatial variations of ET and runoff across 150 large catchments were well captured. Nevertheless, at regional scales the model underestimated ET in the tropics and had some significant runoff errors. The model sensitivity to a number of selected parameters was further examined. Results showed significant model uncertainty caused by its sensitivity to soil wilting point as well as to the root water uptake efficiency and canopy water storage parameters. The sensitivity was large in tropical rain forest and midlatitude forest regions, where the uncertainty caused by the model parameters was comparable to a large part of its difference against the GSWP-2 multimodel mean. Furthermore, the discrepancy among the CABLE perturbation experiments caused by its sensitivity to model parameters was equivalent to about 20%–40% of the intermodel difference among the GSWP-2 models, which was primarily caused by different model structure/processes. Although such results are model dependent, they suggest that soil/vegetation parameters could be another source of uncertainty in estimating global surface energy and water budgets.
Model evaluation has been an integral part of developing global land surface models (LSMs) for weather and climate studies. Since the early 1990s, the Project for Intercomparison of Land Surface Parameterization Schemes (PILPS; Henderson-Sellers et al. 1996) has evaluated the parameterization of surface energy and water fluxes against field observations at a number of selected sites. It found that the simulated annual mean latent heat flux varied from 30 to 56 W m−2, as compared with the observed value of 32 W m−2 for the year 1987 at Cabauw, the Netherlands (Chen et al. 1997). The more recent second Global Soil Wetness Project (GSWP-2; Dirmeyer et al. 2006; Schlosser and Gao 2010) has compared the global water budget simulated by 15 models and identified a wide range of intermodel variations. In addition to the use of field observations at selected sites or regions, the evaluations of LSMs have started to use derived land surface data products at regional or global scales that only became available over the last 10 years (e.g., Jimenez et al. 2011; Jung et al. 2009). These include the upscaled global latent heat flux (Jung et al. 2009, 2010), which integrated point-wise measurements at the FLUXNET observing sites with geospatial information from satellite remote sensing and surface meteorological data in a machine-learning algorithm, as well as the ones from land surface data assimilation products over the globe (Rodell et al. 2004) and over particular continents (e.g., Xia et al. 2012a,b). These datasets are very useful for assessing the responses of the terrestrial system to climate variations and changes over the last several decades (e.g., Jung et al. 2010) and identifying key components of LSMs for further improvement at a range of time and spatial scales (e.g., Bonan et al. 2011).
Over the last several decades, global LSMs have evolved from relatively simple bucket-type biogeophysical schemes (e.g., Manabe 1969) to the current ones with detailed representations of biogeophysical, biogeochemical, and hydrological interactions (e.g., Oleson et al. 2010). As a result, the functionality of such models and their applications have significantly widened in recent years. In addition to their traditional applications in calculating surface energy and water fluxes for weather and climate modeling, they have now been used for monitoring global surface energy and water budgets (e.g., Trenberth et al. 2007; Jimenez et al. 2011) and studying the consequences of climate changes on terrestrial water budget (e.g., Betts et al. 2007; Jung et al. 2010) and ecosystem productivity (Weber et al. 2009). As a result, a single measure of model performance becomes inadequate, and evaluating the performance of these models is required at a range of temporal and spatial scales.
Another recent development in model evaluations is the inclusion of the analysis of uncertainties in both observational data and model simulations in assessing model performance. In general, four sources of errors lead to the mismatch between model simulations and observations: input errors, model structure/parameterization errors, model parameter errors, and errors in the observations. For instance, there are large uncertainties in observed global precipitation data that are used for estimating global land water budgets. Biemans et al. (2009) compared seven such global precipitation data products and found the mean annual precipitation over land varied between 743 and 926 mm yr−1. As pointed out by a number of studies (Gottschalk et al. 2005; Guo et al. 2006), such uncertainty can significantly affect the estimated surface fluxes, as seen from a number of projects such as GSWP-2 (Dirmeyer et al. 2006), the Global Land Data Assimilation System (GLDAS; Rodell et al. 2004), the North American Land Data Assimilation System (NLDAS; Xia et al. 2012a,b), and the Water Model Intercomparison Project (WaterMIP; Haddeland et al. 2011). The multimodel mean of annual global evapotranspiration (ET) is 488 mm yr−1 in GSWP-2; 560 mm yr−1 in GLDAS, version 1 (GLDAS-1; Rodell et al. 2004); and 499 mm yr−1 in WaterMIP. The causes of such discrepancies include different forcing data, different models, and different soil and vegetation parameter datasets used in these experiments. Furthermore, even with the same forcing data, there are still large uncertainties that are caused by errors in model parameters and different model structures. For instance, significant intermodel differences have been reported in GSWP-2 (with global ET varying between 388 and 600 mm yr−1; Dirmeyer et al. 2006), WaterMIP (with global ET varying between 415 and 586 mm yr−1; Haddeland et al. 2011), and GLDAS-1 (Rodell et al. 2004).
Nevertheless, the causes of these uncertainties are still not well understood. It remains unclear to what extent the uncertainty is caused by errors in forcing data or errors in the model structures and model parameters. For instance, Decharme and Douville (2006) questioned the quality of GSWP-2 forcing data and showed that precipitation was overestimated in middle and high latitudes, and they concluded that errors in the forcing data were the dominant cause of the errors in river discharge simulations. Zhou et al. (2012) also found that errors in GSWP-2 precipitation data were likely to be the dominant cause of the errors in the simulated river discharge over 150 large catchments globally. In contrast, Schlosser and Gao (2010) argued that the range of global evapotranspiration estimates among the GSWP-2 models was larger than the errors in the GSWP-2 atmospheric forcing, including precipitation, and that errors in the model parameterization were the most important cause of the errors in the modeled land surface fluxes. Jimenez et al. (2011) compared 12 global land heat flux products and suggested that both the choice of model parameterizations and the forcing data significantly contributed to the uncertainty of the estimated surface evaporation and heat fluxes. Haddeland et al. (2011) assessed the performances of 15 LSMs and global hydrological models in the WaterMIP project, and they attributed a large part of such modeling disagreement to different modeling structure and parameterizations. Not many studies have assessed the extent of such uncertainty caused by prescribed soil/vegetation parameters in land surface models. Prompted by these unsettled arguments, exploring the modeling uncertainty becomes an important part of the analysis in this study.
In this paper, we evaluate the Community Atmosphere Biosphere Land Exchange (CABLE) model (Kowalczyk et al. 2007; Wang et al. 2011) as used in the Australian Community Climate and Earth-System Simulator (ACCESS). Previous studies such as Wang and McGregor (2003) and Wang et al. (2011) evaluated the model against in situ observations. Zhang et al. (2009) and Zhang et al. (2011) further explored the skill of an early model version in capturing the observed features of surface energy, water, and carbon fluxes using 50-yr offline simulations. In this study, our objectives are (i) to use the observationally based (e.g., Jung et al. 2009) and model-based (e.g., Dirmeyer et al. 2006; Rodell et al. 2004) global data products to explore the skill of the CABLE model at different time and spatial scales and (ii) to explore the sensitivities of the modeled surface ET and runoff to a number of key parameters. This will help us assess the relative contribution of the errors in precipitation inputs and in the model parameter values to the errors in the modeled surface ET or runoff. Most of the previous studies on analyzing the uncertainties in our current estimates of global water budget (e.g., Jimenez et al. 2011; Haddeland et al. 2011) were focused on the contributions from errors in forcing data and errors in model structure/processes. The extent of CABLE sensitivity to its parameter values can help us assess if the prescribed soil/vegetation parameters in land surface models could also contribute to such uncertainties.
Accordingly, the manuscript is organized as follows. In section 2, we briefly introduce the version of the model and the offline experiments we have conducted. Section 3 is used to document the model results from a series of GSWP-2 experiments. We also analyze uncertainties from the model perturbation experiments to help us diagnose the model errors. The main conclusion and discussions are presented in section 4.
2. Model description and experiments
Detailed CABLE descriptions can be found in a number of publications. Kowalczyk et al. (2007) provided the history of its development, major features, structure and numerical schemes, and comparisons with field observations at a number of sites. Wang et al. (2011) described the model performance at a number of FLUXNET sites (http://fluxnet.ornl.gov). CABLE is composed of three subcomponents for modeling canopy microclimate; soil and snow; and dynamics of plant, litter, and soil carbon pools. The soil is divided into six layers with thicknesses of 0.022, 0.058, 0.154, 0.409, 1.085, and 2.872 m from the top to the bottom. The model includes a three-layer snow submodel that computes snow temperature, density, thickness, and albedo. A two-big-leaf canopy model is used for calculating photosynthesis, stomatal conductance, and leaf temperatures separately for sunlit and shaded leaves (Wang and Leuning 1998), and the canopy turbulence model of Raupach (1989) is used for calculating within-canopy air temperature and humidity. Recently, the Carnegie–Ames–Stanford Approach for carbon, nitrogen, and phosphorus (CASA-CNP) biogeochemical module was implemented into CABLE, but it is not activated in the experiments used in this study. The main differences between the current version and the one used in Zhang et al. (2009) and Zhang et al. (2011) include bug fixes and the adoption of Lai and Katul (2000) for estimating the root water uptake rate as a function of potential transpiration rate, root uptake efficiency, and root density distributions (Li et al. 2012).
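As a quick check on the soil discretization quoted above, the following minimal sketch accumulates the six layer thicknesses into layer-bottom depths. Only the thickness values come from the text; the function and variable names are illustrative and are not taken from the CABLE source code.

```python
# Layer thicknesses (m), top to bottom, as quoted in the text.
SOIL_LAYER_THICKNESS_M = [0.022, 0.058, 0.154, 0.409, 1.085, 2.872]


def layer_bottom_depths(thicknesses):
    """Return the depth (m) of the bottom of each soil layer."""
    depths = []
    total = 0.0
    for dz in thicknesses:
        total += dz
        # Round to millimetres to hide floating-point noise.
        depths.append(round(total, 3))
    return depths


bottoms = layer_bottom_depths(SOIL_LAYER_THICKNESS_M)
print(bottoms)  # the last entry is the total soil column depth
```

With these thicknesses the column extends to 4.6 m, and each layer is roughly 2.65 times thicker than the one above it.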
In this study, we use the 1° × 1° GSWP-2 global meteorological forcing data for all CABLE simulations. GSWP is an international project designed for producing global soil wetness and other hydrological variables using multiple LSMs with the same meteorological forcing. In its second phase (GSWP-2; Dirmeyer et al. 2006), detailed outputs from 15 land surface models are produced for the period 1986–95 at a resolution of 1° × 1°. They include each component of the surface energy and water budgets. The multimodel ensemble monthly averages and the intermodel variations are freely available (http://www.iges.org/gwsp2/). Although the GSWP-2 outputs are still model-based products, studies (e.g., Gao and Dirmeyer 2006) showed that the ensemble averages of multimodel outputs are reasonably accurate in capturing the global variation of the water cycle and its components. Comparing CABLE results against other models in detailed process-based analyses can help us to evaluate and improve the model. In addition, we also compare the CABLE performance against four LSM outputs from GLDAS-1 (Rodell et al. 2004) for the same period to consider uncertainty in our current estimates of global water budget.
In the default CABLE GSWP-2 experiment, rather than using the GSWP-2 soil/vegetation parameters provided by the project, we decided to use the same soil and vegetation parameters used by the model when it was coupled to the host global climate model (ACCESS) for weather and climate simulations. This was done for two reasons. First, such a setup allows us to have a “clean” intercomparison between results from CABLE offline and coupled experiments to assess whether its offline modeling errors contribute to the coupled ACCESS modeling errors of surface climate. Second, we plan to use the outputs from the CABLE GSWP-2 offline experiments to initialize a series of coupled ACCESS experiments to assess if land surface initialization/conditions affect its climate variability and predictability. Therefore, it is desirable for us to use exactly the same model parameters for CABLE offline and coupled experiments. Nevertheless, in addition to the default CABLE GSWP-2 experiments, we also conduct a number of parameter perturbation experiments to explore CABLE's sensitivity to key soil/vegetation parameters. This allows us to assess if some of its modeling error against the GSWP-2 multimodel ensembles could be potentially caused by different parameter settings used in its GSWP-2 experiment. Furthermore, as all the GSWP-2 models (Dirmeyer et al. 2006) used the same forcing data and the same soil/vegetation parameters, the intermodel variations reported in GSWP-2 largely reflected current modeling uncertainty due to different model structures/processes. Here we can compare the extent of uncertainty within CABLE sensitivity experiments against GSWP-2 intermodel variations to assess whether soil/vegetation parameters could be another source of uncertainty in current global energy and water budget estimates.
We acknowledge that the model uncertainties associated with its soil/vegetation parameters are also related to the model structure and processes and that the parameters that are important for CABLE may be unimportant for a different model (and vice versa). Nevertheless, results from these CABLE experiments are valuable for helping us to understand uncertainties in current global energy/water budget estimates.
Table 1 lists the experiments we conducted, including perturbing the maximum carboxylation rate (VCmax; influencing the canopy stomatal conductance), wilting point (influencing the amount of water available for plant uptake), leaf area index (LAI; influencing the rate of water flux per ground area), canopy water storage capacity (influencing canopy interception of precipitation), root water uptake efficiency parameter (affecting the rate of plant transpiration), and the soil hydraulic conductivity at saturation (affecting soil hydrology).
It needs to be pointed out that the selection of the six key parameters for estimating the model sensitivity to parameter settings is largely based on the results from some previous CABLE experiments, such as Wang et al. (2001, 2006), as well as the results from other modeling experiments in the past (e.g., Shao and Henderson-Sellers 1996). It is possible that there are other parameters that are influential but not included. For instance, Dharssi et al. (2009) showed the impacts of a range of soil properties on land surface modeling. We have arbitrarily chosen ±20%, ±30%, and ±50% perturbations from the default parameter values, which do not necessarily represent the likely uncertainty in the current estimates of these parameters. A comprehensive examination of the model sensitivity to a wide range of parameters is feasible through some advanced data-fusion approaches (Wang et al. 2001, 2009; Keenan et al. 2011), but that is beyond the scope of this study.
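The one-at-a-time perturbation design described above can be sketched as follows. This is only an illustration of the experiment matrix, not the authors' run scripts. The short parameter names (CAN, LAI, VC, WP, RT, HYD) are those used in section 3a, and the labeling rule (name followed by ten times the scaling factor) is an assumption inferred from the experiment names WP13 and WP15 that appear there.

```python
# Short names for the six perturbed parameters (as used in section 3a).
PARAMETERS = ["CAN", "LAI", "VC", "WP", "RT", "HYD"]
# Fractional perturbations applied in both directions.
FRACTIONS = [0.2, 0.3, 0.5]


def build_experiments(parameters=PARAMETERS, fractions=FRACTIONS):
    """Map an experiment label to the single scaling factor it applies.

    Each experiment changes exactly one parameter; all others keep
    their default values (scaling factor 1.0).
    """
    experiments = {}
    for name in parameters:
        for frac in fractions:
            for sign in (-1, 1):
                factor = round(1.0 + sign * frac, 2)
                # Assumed label convention: WP15 means WP scaled by 1.5.
                label = f"{name}{round(factor * 10):02d}"
                experiments[label] = {name: factor}
    return experiments


experiments = build_experiments()
print(len(experiments))     # 6 parameters x 3 magnitudes x 2 signs
print(experiments["WP15"])  # {'WP': 1.5}
```

Because only one parameter changes per run, this design gives 36 perturbed simulations in addition to the default run, and it cannot reveal interactions between parameters.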
3. Water budgets at different spatial and temporal scales
Figure 1 gives an overview of the range of analyses conducted in this CABLE evaluation study. For the spatial domain, we start by comparing globally averaged mean climatology in CABLE with GSWP-2 models and other datasets and then assess CABLE's skill in capturing detailed features at continental and catchment scales. For the time scale, we assess the models' skill in simulating multiyear annual mean climatology, as well as seasonal cycles over the selected regions. Using sensitivity analysis (see Table 1), we explore the likely causes of the model errors in terms of errors in the forcing data, model parameters, and model structures.
a. Global mean climatology
Figure 2 compares the CABLE 10-yr annual mean of global water budget with the GSWP-2 multimodel ensemble means and intermodel ranges as reported by Oki et al. (2005) and Dirmeyer et al. (2006). Of the total 836 mm of annual rainfall in the GSWP-2 forcing data, the simulated ET rate by GSWP-2 models varied from 388 to 600 mm yr−1, with a multimodel mean climatology (hereafter named GSWP-2_mmc) of 488 mm yr−1 (Dirmeyer et al. 2006). The simulated ET by CABLE (495 mm yr−1) is very close to the GSWP-2_mmc and the value of 492 mm yr−1 from Jung et al. (2011), who scaled up site-level measurements at the FLUXNET stations using a number of methods with geospatial information from satellite remote sensing and surface meteorological data (hereafter referred to simply as the MPI data).
Figure 2 also compares the partitioning of total ET into dry canopy transpiration, wet canopy interception, and bare soil evaporation. The mean transpiration rate of 242 mm yr−1 estimated by CABLE is close to the GSWP-2_mmc (233 mm yr−1). Such a difference is substantially smaller than the divergence within the GSWP-2 models (varying from 114 to 344 mm yr−1). The mean wet canopy evaporation rate estimated by CABLE (67 mm yr−1) is lower than GSWP-2_mmc (81 mm yr−1), and the simulated bare soil evaporation (185 mm yr−1) is slightly higher than that of GSWP-2_mmc (175 mm yr−1). Again, the differences between CABLE and GSWP-2_mmc are significantly less than the intermodel range in GSWP-2 models (31–145 mm yr−1 for interception and 27–396 mm yr−1 for bare soil evaporation). Previous studies (e.g., Gao and Dirmeyer 2006) have shown that the multimodel ensemble tends to give better agreement than individual models. Together with the fact that CABLE agrees well with the upscaled observational data of Jung et al. (2011), this gives us confidence in the model's skill in simulating the partitioning of surface water fluxes at the global scale.
The CABLE-simulated total runoff of 339 mm yr−1 is quite close to the 347 mm yr−1 in GSWP-2_mmc. Both are higher than the estimated runoff of 299 mm yr−1 derived from the global composite runoff data (Fekete et al. 2000) from the Global Runoff Data Center (GRDC; http://grdc.bafg.de). However, it should be noted that the GRDC data climatology covered a different period than that used in GSWP-2 experiments and that GRDC estimates were also derived by combining observed river discharge information with a climate-driven water balance model. In addition, despite the good agreement in total runoff between CABLE and GSWP-2_mmc, there are notable differences in their partitions of total runoff into surface and subsurface components. As also shown in Zhou et al. (2012), CABLE substantially overestimates surface runoff (284 mm yr−1 in CABLE against 230 mm yr−1 in GSWP-2_mmc) and underestimates subsurface runoff/drainage (55 mm yr−1 in CABLE against 118 mm yr−1 in GSWP-2_mmc). This underlines the importance of further improving its hydrological component as the correct partition of total runoff has significant implications for its applications in hydrological forecasting such as flash flood warnings and river flow predictions.
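The global numbers quoted in this subsection can be checked for internal consistency with a few lines of arithmetic. The sketch below uses only values stated in the text (all in mm yr−1); the dictionary layout and function name are illustrative. Small nonzero residuals are expected because the published values are rounded and because soil water storage can change over the 10-yr period.

```python
# Global annual means (mm/yr) quoted in the text.
PRECIP = 836  # GSWP-2 forcing

BUDGETS = {
    "CABLE": {
        "et": 495, "transpiration": 242, "interception": 67, "soil_evap": 185,
        "runoff": 339, "surface_runoff": 284, "subsurface_runoff": 55,
    },
    "GSWP-2_mmc": {
        "et": 488, "transpiration": 233, "interception": 81, "soil_evap": 175,
        "runoff": 347, "surface_runoff": 230, "subsurface_runoff": 118,
    },
}


def closure_report(budget, precip=PRECIP):
    """Compare quoted totals with the sums of their quoted components."""
    et_parts = (budget["transpiration"] + budget["interception"]
                + budget["soil_evap"])
    runoff_parts = budget["surface_runoff"] + budget["subsurface_runoff"]
    return {
        "et_minus_components": budget["et"] - et_parts,
        "runoff_minus_components": budget["runoff"] - runoff_parts,
        "precip_minus_et_minus_runoff": precip - budget["et"] - budget["runoff"],
        "surface_runoff_share": round(budget["surface_runoff"] / runoff_parts, 2),
    }


reports = {name: closure_report(b) for name, b in BUDGETS.items()}
for name, report in reports.items():
    print(name, report)
```

Both budgets close to within 2 mm yr−1. The surface share of total runoff is about 0.84 in CABLE against about 0.66 in GSWP-2_mmc, which quantifies the partitioning difference noted above.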
We have further analyzed the model sensitivity experiments listed in Table 1 to explore the range of uncertainty in the CABLE simulations associated with the model parameters. First, from these sensitivity experiments we try to identify key parameters (and associated processes) that have significant influences on the model performance. Second, such results can potentially help us understand some of the differences among a number of global land surface projects such as GSWP-2 (Dirmeyer et al. 2010), GLDAS-1 (Rodell et al. 2004), and WaterMIP (Haddeland et al. 2011). As reviewed in section 1, it is still unclear to what extent the differences among these projects are caused by different forcing data or errors in model structure and parameters. Because the CABLE GSWP-2 experiments share the same meteorological forcing and the same model structure, comparing the range of uncertainty within them to the range of uncertainties between these projects helps us assess if model parameter setup can cause uncertainties in our current estimates of global water and energy budgets reported by these different global land projects.
Figure 3 displays global mean total evapotranspiration and runoff from the six sets of CABLE GSWP-2 sensitivity experiments (see Table 1). In these experiments, the default values for canopy water storage capacity (named CAN), LAI, VCmax (named VC), wilting point (named WP), root water uptake efficiency (named RT), and hydraulic conductivity at saturation (named HYD) are perturbed by ±20%, ±30%, and ±50%. As mentioned previously, the selection of the six key parameters is largely based on the results from some previous CABLE experiments such as Wang et al. (2001, 2006) and from other modeling experiments in the past (e.g., Shao and Henderson-Sellers 1996). Note that in the WP15 experiment, when the wilting point is increased by 50%, soil water content at wilting point at a number of grid points in the tropics becomes the same as soil water content at field capacity. This leads to erroneous simulations over these locations (as soil cannot hold any water to meet plant demand), and results from these simulations were not used. Table 2 highlights the range of variations from these sensitivity experiments by reducing or increasing the default values by 50% (except that WP13 is used for WP experiments); it also shows the multimodel ensemble averages from GLDAS-1 and GSWP-2, the upscaled data product (Jung et al. 2011), and the GRDC data (Fekete et al. 2000). The GLDAS-1 results are averaged over the same period as in GSWP-2, but the GRDC climatology covers a different period. Figure 3a shows that the mean annual ET as simulated by CABLE varies from 470 to 520 mm yr−1 among the CABLE sensitivity experiments. This is equivalent to about 24% of the intermodel range in the 13 GSWP-2 models (388–600 mm yr−1) as reported by Oki et al. (2005). Note that this result does not consider the combined effect of these parameters because of the limitation of the experiments conducted here.
Figure 3a shows that global ET simulated by CABLE is sensitive to the change in wilting point, canopy storage capacity, and root water uptake efficiency parameter (RT). Among the six parameters, the simulated global ET is least sensitive to VCmax and the hydraulic conductivity parameter. The insensitivity to VCmax is due to the fact that LAI is prescribed in these offline experiments, which could have limited the extent of leaf area feedback on maximal canopy conductance and, therefore, latent heat fluxes. When the LAI becomes a prognostic variable, with the carbon cycle being fully turned on in the model, the model sensitivity to VCmax becomes significant, according to some recent modeling experiments. The insensitivity of the model to the hydraulic conductivity parameter is in contrast to the significant influence when the plant root hydraulic redistribution process is introduced to the model (Li et al. 2012). This suggests that model structures and processes are themselves significant sources of land surface modeling uncertainty.
Furthermore, the difference of the globally averaged ET between GLDAS-1 model ensemble (561 mm yr−1 averaged over the same period as in GSWP-2) and GSWP-2 model ensemble (489 mm yr−1) is 72 mm yr−1 (see Table 2). The CABLE modeled ET sensitivity to its parameter perturbations is about 50 mm yr−1. This is equivalent to 70% of the difference between GSWP-2 and GLDAS-1. Although the model sensitivity to its parameters here can be model dependent, this result does suggest that uncertainty in prescribing soil/vegetation parameters could be another source of the discrepancy in our current estimate of global ET among different projects, as discussed in Jimenez et al. (2011).
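The two percentages quoted for the global ET sensitivity follow directly from the ranges given in the text. The sketch below reproduces that arithmetic; the function and variable names are illustrative, and all input values (mm yr−1) are taken from the preceding paragraphs.

```python
def spread_ratio(spread, reference):
    """Express one spread as a fraction of a reference spread."""
    return spread / reference


# Range of global ET across the CABLE perturbation experiments.
cable_et_spread = 520 - 470
# Intermodel range of global ET among the GSWP-2 models.
gswp2_intermodel_range = 600 - 388
# Difference between the GLDAS-1 and GSWP-2 ensemble means.
gldas_minus_gswp2 = 561 - 489

vs_intermodel = spread_ratio(cable_et_spread, gswp2_intermodel_range)
vs_projects = spread_ratio(cable_et_spread, gldas_minus_gswp2)

print(f"{100 * vs_intermodel:.0f}% of the GSWP-2 intermodel range")    # 24%
print(f"{100 * vs_projects:.0f}% of the GLDAS-1 minus GSWP-2 difference")  # 69%
```

The second ratio comes out at about 69%, consistent with the roughly 70% stated above.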
As discussed in Dirmeyer et al. (2006) and Xia et al. (2012b), runoff modeling remains a large source of uncertainty among land surface models. The uncertainty in CABLE runoff simulations is shown in Fig. 3b and Table 2. With the same precipitation forcing, the modeled runoff sensitivity to surface parameters is roughly inverse to the ET results in Fig. 3a, because precipitation is approximately equal to the sum of ET and runoff over multiple years. CABLE-simulated runoff is sensitive to the canopy water storage capacity parameter, which determines the maximal amount of intercepted precipitation and, therefore, the wet canopy evaporation rate. Changes in root water uptake efficiency and WP directly affect the simulated runoff in CABLE, but the influence from soil hydraulic conductivity is weak in this particular model. Overall, there is about 33 mm yr−1 runoff difference among the CABLE perturbation runs, suggesting limited contribution from model parameter errors to the significant uncertainty in our current estimates of global runoff (e.g., the averaged GLDAS-1 model total runoff of 213 mm yr−1 against the GSWP-2_mmc runoff of 348 mm yr−1; see Table 2). Among the different models used in GSWP-2, the simulated mean annual runoff varies from 117 to 633 mm yr−1, highlighting the challenge in simulating global water budget.
To further explore if such runoff differences are caused by the different precipitation forcing data used, we calculate the evaporation fraction, defined as the ratio of total ET to precipitation, for CABLE, GSWP-2_mmc, and GLDAS-1. Note that the GLDAS-1 total precipitation is approximated as the sum of ET and runoff, as we do not have its forcing data. The evaporation fraction is quite different between GSWP-2_mmc and the GLDAS-1 ensembles, with GSWP-2 models showing ~59% of the total precipitation (833 mm yr−1) being evaporated back to the atmosphere. This is much smaller than the fraction (~72%) in GLDAS-1 models with total precipitation of 772 mm yr−1. The GLDAS-1 models would therefore simulate higher ET and lower total runoff than the GSWP-2 models for a given amount of precipitation globally. Consequently, the lower total runoff in GLDAS-1 than in GSWP-2 is not entirely caused by higher precipitation in GSWP-2. Thus, although errors in meteorological forcing can result in significant discrepancies among the outputs from a number of international projects, different model structures and parameters can make significant contributions to such results (Schlosser and Gao 2010).
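The evaporation fraction comparison can be reproduced from the numbers quoted in the text (all in mm yr−1). In this sketch the GLDAS-1 precipitation is approximated as ET plus runoff, as described above; using the rounded ensemble means of 561 and 213 gives 774, slightly different from the 772 quoted, presumably because of rounding. The function name is illustrative.

```python
def evaporation_fraction(et, precip):
    """Ratio of total ET to precipitation."""
    if precip <= 0:
        raise ValueError("precipitation must be positive")
    return et / precip


# GSWP-2 multimodel mean ET and the total precipitation quoted in the text.
gswp2_fraction = evaporation_fraction(et=488, precip=833)

# GLDAS-1: precipitation approximated as ET + runoff (no forcing data).
gldas_et, gldas_runoff = 561, 213
gldas_precip = gldas_et + gldas_runoff
gldas_fraction = evaporation_fraction(gldas_et, gldas_precip)

print(f"GSWP-2_mmc: {100 * gswp2_fraction:.0f}%")  # 59%
print(f"GLDAS-1:    {100 * gldas_fraction:.0f}%")  # 72%
```

A difference of more than 10 percentage points in this ratio cannot come from the precipitation totals alone, which is the basis for attributing part of the runoff discrepancy to the models themselves.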
b. Catchment-scale evaluation
Following the assessment of the model global averages, here we analyze the model results at the catchment scale as Xia et al. (2012b) did for NLDAS. Zhou et al. (2012) assessed the performance of GSWP-2 models in simulating the observed mean annual runoff from 150 basins globally. They used the monthly runoff data from 150 large basins (area > 10 000 km2) collected at the farthest downstream stations (Dai et al. 2009). They concluded that errors in the precipitation forcing in GSWP-2 were the dominant cause of the model errors in the simulated mean annual runoff. Here we evaluate the performance of CABLE at catchment scale using the observed runoff data from Dai et al. (2009) and the upscaled ET data from Jung et al. (2011) and explore whether different ET calculations have also contributed to the errors in modeled runoff. It must be pointed out that both the runoff data from Dai et al. (2009) and the upscaled ET data from Jung et al. (2011) aggregated over these catchments are only approximations to the “truth” as these datasets themselves are subjected to errors associated with data quality and analytical methods. Here our intention is to validate the CABLE model at the catchment scale against these two independent and observationally constrained datasets.
Figure 4a compares the CABLE-simulated mean annual runoff with the observations. The spatial variations across these catchments are well simulated, with the linear correlation coefficient exceeding 0.8 and about 66% of the variance (R2 ≈ 0.66) being captured by the model. However, biases are still significant, as the slope of the linear regression is significantly less than one and the intercept is 165 mm yr−1. Figure 4b further examines the CABLE-simulated ET against the data from Jung et al. (2011) over these catchments. Similar to Fig. 4a, the spatial variations across the basins are well reproduced by the model, with the correlation coefficient nearly 0.9 and more than 80% of the spatial variations captured by the model (R2 = 0.805). Nevertheless, linear regression of the modeled and upscaled ET data also suggests that overall ET is overestimated in CABLE, with the regression intercept being 135 mm yr−1.
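The statistics reported for Fig. 4 (slope, intercept, correlation coefficient, and R2) are those of an ordinary least squares fit across catchments. A minimal sketch of that calculation is given below. It assumes the modeled values are regressed on the observed ones, which the text does not state explicitly, and the four data points are invented purely to exercise the function; they are not the catchment data.

```python
import math


def linear_fit(x, y):
    """Ordinary least squares fit y = slope * x + intercept.

    Returns (slope, intercept, r, r_squared), where r is the Pearson
    correlation coefficient. For a one-predictor fit, R2 equals r**2.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r, r * r


# Invented values (mm/yr): a model that is too wet in dry catchments
# and too dry in wet ones gives a slope below one and a positive intercept.
observed = [100.0, 200.0, 400.0, 800.0]
modeled = [200.0, 250.0, 350.0, 550.0]
slope, intercept, r, r2 = linear_fit(observed, modeled)
print(slope, intercept, r2)
```

Because R2 is the square of r in this setting, a correlation just above 0.8 corresponds to R2 ≈ 0.66 and a correlation near 0.9 to R2 ≈ 0.8, which matches the pairs of values quoted for Figs. 4a and 4b.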
In Fig. 5 we explore the effects of the errors in the precipitation forcing data on the errors of the modeled runoff and ET. Figure 5a shows the runoff and ET errors across the 150 catchments as a function of the annual precipitation received by these catchments. We use the observational Global Precipitation Climatology Center (GPCC) precipitation (Rudolf and Schneider 2004) for classifying these basins and assessing the errors in GSWP-2 precipitation forcing. As shown in Fig. 4a, CABLE has a tendency to overestimate runoff in a large proportion of the 150 catchments (Fig. 5a). For the catchments with less than 500 mm yr−1 rainfall, the model runoff errors are roughly within 150 mm yr−1. Mapping the ET errors in the same diagram reveals some notable relationships between the two. First of all, for catchments receiving less than 1000 mm yr−1 precipitation, the model errors in ET have a similar pattern to runoff: both ET and runoff are overestimated. Such overestimations are more likely to be caused by the higher precipitation in the GSWP-2 forcing data than in GPCC (Fig. 5b). As pointed out in Zhou et al. (2012), these catchments are largely located in the middle-to-high latitudes of the Northern Hemisphere where the precipitation forcing is overestimated. However, for catchments receiving more than 1500 mm of annual precipitation, the ET errors have the opposite sign from the errors in runoff. For these catchments, even though the annual precipitation in GSWP-2 is lower than that in GPCC data (Fig. 5b), CABLE overestimates mean annual runoff for most of them. These overestimations can only be explained by the underestimation of ET in these catchments, as clearly shown in this same figure (Fig. 5b). Indeed, as shown in Fig. 5c, precipitation error can only explain part of the errors in the simulated runoff and ET. Linear regression results in Fig. 5c suggest that precipitation error alone accounts for about 32% of the error in ET (with R2 = 0.3195).
However, only 10% of the errors in runoff can be explained by precipitation (R2 = 0.1) because of a large number of outliers, with negative runoff errors corresponding to positive precipitation biases in the forcing data and vice versa.
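The argument that runoff overestimation in the wettest catchments must come from ET underestimation follows from the long-term catchment water balance. As a sketch, assuming that the change in water storage is negligible over the 10-yr mean (an assumption made here for illustration; the symbols P, ET, R, and the error notation Δ are introduced only for this note):

```latex
\begin{align}
  R &\approx P - ET, \\
  \Delta R &\approx \Delta P - \Delta ET ,
\end{align}
```

where Δ denotes the model (or forcing) value minus the observed value. If the precipitation forcing is too low (ΔP < 0) while runoff is still too high (ΔR > 0), the balance can close only if ΔET < ΔP < 0, that is, if ET is underestimated by more than the precipitation deficit.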
c. Seasonal cycle under different climatic conditions
A number of recent studies underlined the importance of realistically representing the seasonality of surface fluxes. For instance, in studying the climate changes in southeast Australia, several studies (CSIRO 2010) found that a disproportionate rainfall decline in autumn, which resulted in dry soil conditions at the start of the rainy winter season, caused a significant decline of river flow in the region. Following the evaluation of annual mean climatologies in sections 3a and 3b, here we focus on evaluating the modeled seasonal cycle over three selected regions of the Australian continent and over the Amazon basin (Fig. 6). The three Australian regions are northern Australia, where the dominant vegetation is tropical rain forest or savannah and high rainfall is received in its austral summer monsoon wet season (November–March); southwest Western Australia (SWWA), where extensive research has been conducted to investigate the cause of the rapid river flow decline and where Zhang et al. (2011) reported that a significant nonlinear rainfall–runoff relationship was captured in an early version of CABLE; and the southeast part of the continent known as the Murray–Darling basin (MDB), which is the Australian food bowl and has experienced significant drought, severe water shortage, and decreased river flow over recent decades. In addition, we have examined the model results in the Amazon basin, which has long been an area of intensive land surface observational and modeling studies (e.g., Davidson et al. 2012). Note that while we have focused the evaluation on the Australian continent (as well as the Amazon basin), where the model is extensively used for weather and climate research, the diversity of the climate and vegetation conditions over the four regions in Fig. 6 makes the results here representative of other regions as well.
Although both SWWA (Fig. 6a) and MDB (Fig. 6b) are dominated by a semiarid climate with a similar amount of annual precipitation, the seasonal variations of their surface water partitions are remarkably different. The SWWA rainfall has a much stronger seasonality than in MDB, associated with strong seasonal variations of regional circulation, as documented in Feng et al. (2010). All the data from MPI, GSWP-2_mmc, and CABLE show a delay in the response of surface ET to rainfall seasonal changes. While its rainfall peaks around June–July, the maximum surface ET occurs around August–September in CABLE and GSWP-2_mmc. Overall, the CABLE monthly ET agrees well with GSWP-2_mmc across the year, except that it tends to reach its peak earlier than GSWP-2_mmc. Compared with the data of Jung et al. (2011), both GSWP-2_mmc and CABLE have much higher ET following the start of the high winter rainfall season. In addition, Fig. 6a shows the deficiency of CABLE in capturing runoff seasonality in this semiarid climate. Although the magnitude of total runoff is similar between CABLE and GSWP-2_mmc, the runoff generation in CABLE appears to follow the rainfall pattern much more closely than GSWP-2_mmc and GRDC data.
Different from SWWA, MDB has much weaker seasonal variation of precipitation (Fig. 6b). The GSWP-2_mmc, CABLE, and MPI data all suggest almost opposing seasonal cycles of monthly surface ET and rainfall, with surface ET being lowest during the wet season (autumn to winter). This high-rainfall–low-evaporation combination represents an important surface water “recharge” phase. During this period, precipitation exceeds surface ET loss, dry soil column is progressively refilled, and the response of runoff lags rainfall increase. In contrast, ET in the late spring to summer exceeds rainfall received, representing the “discharge” phase of surface water storage when soil water is depleted by ET because of high atmospheric demand. This recharge–discharge feature plays a critical role in modulating surface ET and soil moisture variations as well as runoff generation in this basin. Overall, the seasonal cycles of CABLE-simulated ET and runoff are quite similar to those from GSWP-2_mmc, with a stronger runoff seasonal variation in CABLE than in GSWP-2_mmc.
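The recharge and discharge phases described above can be identified month by month from the sign of precipitation minus ET. The following is a minimal sketch with hypothetical monthly values (not data from Fig. 6b); the function name `classify_phases` is introduced here for illustration.

```python
def classify_phases(precip, et):
    """Label each month 'recharge' when precipitation exceeds ET, else 'discharge'."""
    if len(precip) != len(et):
        raise ValueError("precip and et must have the same length")
    return ["recharge" if p > e else "discharge" for p, e in zip(precip, et)]

# Hypothetical monthly values (mm/month), January to December, for a
# Southern Hemisphere basin with weak rainfall seasonality and low ET
# in autumn to winter.
precip = [35, 35, 35, 35, 40, 40, 40, 40, 40, 40, 35, 35]
et = [60, 55, 45, 30, 20, 15, 15, 20, 35, 45, 55, 60]
phases = classify_phases(precip, et)
```

In this made-up series the recharge phase spans April to September, and the remaining months fall in the discharge phase, when ET exceeds the rainfall received and must draw on stored soil water.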
Dominated by the Australian monsoon, the northern Australian climate (Fig. 6c) exhibits distinct wet seasons (November–April) and dry seasons (with less than 30 mm month−1 of precipitation received within a year). As a result, surface fluxes exhibit quite strong seasonal variations. Different from the semiarid climates of SWWA and MDB, where surface ET is much greater than runoff, runoff in northern Australia has a magnitude similar to that of ET (note that in Figs. 6c and 6d, the left-hand-side y axis is ET, and precipitation and runoff are shown on the right-hand-side y axis with different scales). The monthly ET and runoff simulated by CABLE agree quite well with the estimates from GSWP-2_mmc. Both simulations show that the monthly ET rate varies with monthly rainfall but with a smaller seasonal amplitude. Surface ET declines progressively, despite the rapid fall of monthly rainfall amount at the end of the wet season. The water stored in the soil at the end of the wet season can sustain surface evaporation for a few months into the dry season. Compared with results from Jung et al. (2011), both CABLE and GSWP-2_mmc overestimate the wet season ET, although the seasonal patterns are similar among the three estimates.
Although precipitation in the Amazon basin (Fig. 6d) displays a notable seasonal cycle, with relatively low rainfall (<100 mm month−1) received during its dry season of June–September, there is always enough surface water storage for soil and canopy evapotranspiration. Therefore, its total ET has a much weaker seasonal cycle than that seen for the Australian regions, with its magnitude varying between 75 and 105 mm month−1 in CABLE and between 80 and 100 mm month−1 in GSWP-2_mmc. Furthermore, there are a number of notable features in Fig. 6d. First of all, both models underestimate the evaporation in the Amazon basin during the wet season, by about 25 mm month−1 for CABLE and 15 mm month−1 for GSWP-2_mmc. Although a number of studies, such as Decharme and Douville (2006) and Zhou et al. (2012), pointed out that precipitation is lower in the GSWP-2 forcing data than in other observational datasets, it is unlikely that precipitation forcing is the primary cause of such results. This is because surface ET is not water limited during the wet season, even with more than 120 mm month−1 precipitation in the GSWP-2 forcing data. Rather, such underestimation is related to the model parameterizations. Figure 6d also shows that the ET seasonal variations in CABLE differ from GSWP-2_mmc and MPI data, with high ET occurring during the brief dry season and peaking around August. This is in line with the peak of downward shortwave radiation during this period, when more solar radiation reaches the surface because of less cloudiness. Thus, in this region, surface ET in CABLE is energy limited rather than water limited. In addition, the error bars in the GSWP-2_mmc are significantly bigger during the Amazon dry season, suggesting significant modeling uncertainty.
The CABLE-simulated seasonal cycle of monthly runoff in the Amazon basin is similar to GSWP-2_mmc and the observational data from GRDC. Its seasonal variations lag behind rainfall. At the beginning of the wet season (August–October), monthly rainfall increases with time, while the monthly runoff remains quite steady. This lagged response is a result of increasing surface ET and recharging soil water in the early part of the wet season.
d. Global distribution
Figure 7 compares the geographic distributions of annual ET climatology between upscaled FLUXNET data of Jung et al. (2011), GSWP-2_mmc, and CABLE for the same period of 1986–95. Clearly, CABLE reproduces many of the geographic variations of surface ET well. The high ET dominating the tropical rain forest regions, the Asian monsoon region, the eastern part of North America, the north and eastern coast of Australia, and over the Northern Hemisphere (NH) middle and high latitudes are all well represented in the CABLE simulations. In the zonally averaged results (Fig. 7d), one can clearly see the progressive improvement of the model skill from an early version 1.4b, to an interim version, and then to the current version. The problem of severe underestimation of evaporation over rain forest region, a common feature seen in many LSMs (Bonan et al. 2011), has been significantly improved, although it is still underestimated, as discussed in the catchment evaluations in section 3b. Note that the large biases by CABLE in the latitude band between 15° and 35°N are a result of missing values over the Sahara region in the dataset of Jung et al. (2011), rather than deficiencies in CABLE (see Fig. 7d). Evaporation in the Sahara desert region is very low, so zonal averages of ET excluding these locations result in artificially higher zonal averages using the data of Jung et al. (2011). Another feature is that both CABLE and GSWP-2_mmc tend to simulate high evaporation in the middle latitudes caused by higher precipitation in GSWP-2, as pointed out by Decharme and Douville (2006) and discussed previously.
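The effect of missing desert values on zonal averages can be illustrated with a small numerical sketch. The grid, the numbers, and the function name `zonal_mean` below are hypothetical; the point is only that a like-for-like comparison requires masking the model at the same cells that are missing in the reference dataset.

```python
import numpy as np

def zonal_mean(field, mask=None):
    """Average a (lat, lon) field along longitude, ignoring NaN cells.

    If mask is given (True where the reference data are missing), those
    cells are also dropped from field so both datasets sample the same cells.
    """
    field = np.array(field, dtype=float)  # copy, so the input is not modified
    if mask is not None:
        field[np.asarray(mask, dtype=bool)] = np.nan
    return np.nanmean(field, axis=1)

# Hypothetical annual ET (mm/yr) on a 2 x 4 grid; row 0 has two desert cells.
model = np.array([[20.0, 30.0, 400.0, 500.0],
                  [600.0, 700.0, 800.0, 900.0]])
reference = model.copy()
reference[0, :2] = np.nan  # desert cells missing in the reference data

ref_zonal = zonal_mean(reference)
model_all = zonal_mean(model)
model_masked = zonal_mean(model, mask=np.isnan(reference))
```

Here the model and reference agree in every cell where both have data, yet the unmasked model zonal mean in the first latitude row is far lower than the reference value simply because the low-ET desert cells are included in one average and excluded from the other.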
Figure 8 compares the total runoff between GRDC observational data (Fig. 8a), GSWP-2_mmc (Fig. 8b), and CABLE simulation (Fig. 8c). By and large, the global runoff patterns are well captured by CABLE. High runoff in the tropics, the Asian monsoon region, the eastern part of the North American continent, Europe, and western Asia, and along the western coasts of the North American continent and the north and east coasts of Australia are well simulated by the CABLE model. Notable differences between CABLE simulations and observational data include overestimation of runoff generation in the NH high latitudes and the tropical rain forest region, with the former likely caused by high precipitation in the GSWP-2 forcing data and the latter caused by both forcing data and underestimation of surface ET.
When designing the analytical structure for this study (Fig. 1), we emphasized the importance of exploring the causes of model uncertainties in terms of errors in forcing data, model structure, and model parameters. In section 3a, we assessed the sensitivity of the model-simulated, globally averaged ET and runoff to soil and vegetation parameters. Here we explore to what extent the CABLE modeling errors presented in Fig. 8 can be attributed to likely errors in the model parameters used in its GSWP-2 experiment. As shown in Eq. (1) and Figs. 9 and 10 for December–February (DJF) and June–August (JJA), we first calculate the seasonal mean differences between the perturbation runs generated by increasing each default parameter value by 50%, 30%, and 20% and the runs generated by decreasing its default parameter value by 50%, 30%, and 20% [the numerator of Eq. (1)]. This difference measures CABLE sensitivity to the parameter values used in its GSWP-2 experiments. Then, because we want to measure how much of the CABLE-simulated error against the GSWP-2 multimodel ensemble mean [the denominator of Eq. (1)] can be accounted for by such sensitivity, we calculate the ratio (Sα in percentage) between the two as
where α0 is the default value for parameter α, M is the model 10-yr seasonal mean for JJA and DJF, and Sα represents the percentage of the model error against the GSWP-2_mmc that can be explained by perturbing the model parameter. In Figs. 9g and 10g, the overall model sensitivity to its parameters is measured by averaging all of the absolute differences from the six groups of perturbation runs in Table 1 and then dividing by the modeled seasonal mean values. In addition, we calculate the standard deviation of all 36 CABLE GSWP-2 runs listed in Table 1 and calculate the proportion of the standard deviations from these CABLE perturbation runs against the standard deviations calculated from the 15 GSWP-2 models in Dirmeyer et al. (2006) (Figs. 9h, 10h). This allows us to assess to what extent the uncertainty seen in the GSWP-2 models can be simulated by parameter perturbations within this single model. Although the CABLE sensitivity results may be model dependent, this can at least give us some indication of how much uncertainty can be caused by model parameters relative to the uncertainty caused by model structure/processes.
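Equation (1) is described here only in words. One form consistent with that description is sketched below; it is an assumption reconstructed from the surrounding text, not a verbatim copy of the original equation. The fractional perturbation δ (0.2, 0.3, or 0.5) and the symbol M_mmc for the GSWP-2 multimodel ensemble seasonal mean are introduced here for illustration, and the text does not state whether the three perturbation sizes are averaged or treated separately.

```latex
\begin{equation}
  S_{\alpha} =
  \frac{M\left[(1+\delta)\,\alpha_{0}\right] - M\left[(1-\delta)\,\alpha_{0}\right]}
       {M(\alpha_{0}) - M_{\mathrm{mmc}}} \times 100\%
\end{equation}
```

Under this reading, a value of Sα near 100% would mean that the spread produced by perturbing α is as large as the CABLE difference from the multimodel mean, whereas values near zero would mean that the parameter cannot account for that difference.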
Figure 9 shows the results in DJF. Clearly, the influence of the model soil/vegetation parameters varies with climate. Perturbing CAN leads to a large model sensitivity in Amazonia, Southeast Asia, and the tropical African rain forest region when large rainfall is received in the austral summer season (Fig. 9a). Such a sensitivity (as large as 150–200 mm yr−1) can explain over 80% of the model difference compared with GSWP-2_mmc, as seen in Fig. 7b. Nevertheless, the model sensitivity is largely limited to the tropics. The impacts of LAI have similar features in the tropics, but also show notable influence in the middle and high latitudes of the NH where the dominant vegetation is broadleaf forest and 20%–30% of the model difference between CABLE and GSWP-2 models can be reproduced by perturbing LAI. Significant influence of RT is observed in the Southern Hemisphere (SH) during its summer season, as well as in the low latitudes of the NH tropics. In this case, more than 50% of the model difference against the GSWP-2_mmc can be reproduced by perturbing RT. A similar influence is seen for perturbing soil wilting point, with the model showing notable sensitivity across the whole SH in summer, especially outside the tropical rain forest regions. Such significant influence of wilting point was also noted in earlier land surface modeling studies such as Shao and Henderson-Sellers (1996), who found that a majority of LSMs were very sensitive to the uncertainty in defining soil wilting point. Nevertheless, the model shows very weak sensitivity to the parameters of soil hydraulic conductivity used in the experiment. Note that Li et al. (2012) demonstrated significant influence of adding a “hydraulic lifting” process in CABLE modeling of soil water processes. Thus, model structures/processes are important sources of uncertainty in land surface modeling.
Figure 9g shows the overall CABLE sensitivity to its parameters derived from the six groups of perturbation runs. Over a large part of the globe, the model ET sensitivity to its parameter values is no more than 15% of its seasonal mean in this particular model. Nevertheless, there are some areas in South Africa, north and west Australia, and a large part of the middle latitudes in the Eurasian continent and North America where the overall sensitivity can be over 20% of its seasonal mean. Note that the extremely high values over the Sahara desert region are due to the extremely low ET, which leads to high percentage values. We further calculate the standard deviations of the 33 CABLE sensitivity runs used in Figs. 9a–e and compare them with the intermodel standard deviations among the GSWP-2 models as reported in Dirmeyer et al. (2006) and in Fig. 9h. In a large part of the Southern Hemisphere, about 20% to 40% of the intermodel difference between 15 GSWP-2 models can be reproduced by model parameter perturbations within a single model structure with the same forcing data. This implies that besides the uncertainty caused by different model structure and parameterization of surface biogeophysical processes, uncertainty in defining key soil/vegetation parameters can also contribute to the uncertainty in our current estimates of global surface energy and water budgets (e.g., Jimenez et al. 2011; Haddeland et al. 2011).
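The comparison between the spread of the perturbation runs and the intermodel spread can be sketched as a ratio of standard deviations computed cell by cell. The code below is a minimal illustration under stated assumptions: the function name `spread_ratio`, the use of the population standard deviation, and the sample values are all hypothetical and are not taken from the CABLE or GSWP-2 outputs.

```python
import numpy as np

def spread_ratio(perturbation_runs, multimodel_runs):
    """Percent ratio of the spread among perturbation runs to the intermodel spread.

    Both inputs have shape (n_runs, n_cells), with runs along axis 0.
    Cells where the multimodel spread is zero are returned as NaN.
    """
    pert_std = np.std(np.asarray(perturbation_runs, dtype=float), axis=0)
    model_std = np.std(np.asarray(multimodel_runs, dtype=float), axis=0)
    ratio = np.full(pert_std.shape, np.nan)
    # Divide only where the denominator is positive to avoid division by zero.
    np.divide(pert_std, model_std, out=ratio, where=model_std > 0)
    return 100.0 * ratio

# Hypothetical seasonal ET (mm) for three runs in two grid cells.
perturbed = [[90.0, 100.0], [100.0, 100.0], [110.0, 100.0]]
multimodel = [[70.0, 100.0], [100.0, 100.0], [130.0, 100.0]]
ratio_percent = spread_ratio(perturbed, multimodel)
```

In the first cell of this made-up example the parameter-induced spread is about one-third of the intermodel spread, which is the kind of proportion mapped in Fig. 9h.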
The model sensitivity to key parameters becomes more significant in the Northern Hemisphere summer (JJA; Fig. 10), reaching about 50–100 mm yr−1 for the modeled ET. A large part of the model sensitivity is located in tropical rain forest and midlatitude forest regions. Among the six parameters, perturbing WP and RT produces the most significant model responses (Figs. 10c,e), and the model sensitivity is seen over most of the global domain. As seen in DJF, in this offline version of the CABLE experiment the LAI is prescribed rather than interactive, and this has limited the model sensitivity to the VCmax parameter. In the northern summer, the model sensitivity to parameter perturbation is only equivalent to about 20%–30% of the intermodel difference in GSWP-2. This is primarily because the GSWP-2 models differ significantly in the Northern Hemisphere summer. In addition, we have also calculated the model sensitivity in other seasons. Overall, they give very similar features and are therefore not included here.
4. Conclusions and discussion
In this study, we have evaluated the Australian land surface model CABLE with a series of experiments using GSWP-2 forcing data (Dirmeyer et al. 2006) against a range of observational datasets. As summarized in Fig. 1, we have focused on assessing the model skill in simulating the surface water cycle across different temporal and spatial scales: from global and continental averages to catchment-scale validations and from annual mean climatology to seasonal variations. Another feature of this analysis is that we have estimated the uncertainty in the model simulations with a series of perturbation runs and used the results to understand some of the uncertainties reported in a number of international projects in which LSMs were used to estimate global water and energy budgets (e.g., Rodell et al. 2004; Dirmeyer et al. 2006; Haddeland et al. 2011; Jimenez et al. 2011).
First of all, we have compared its global water budget against multimodel climatology derived from 15 GSWP-2 models (Oki et al. 2005; Dirmeyer et al. 2006). Results have shown that CABLE is competitive for its skill in capturing the fundamental features of global water budgets, with its global mean ET and runoff agreeing well with the GSWP-2 multimodel climatology (GSWP-2_mmc). Its global mean ET climatology is similar to the one derived from the upscaled FLUXNET ET data by Jung et al. (2011), but the global mean total runoff is slightly higher than that of the GRDC (Fekete et al. 2000). The higher runoff has been partially attributed to high precipitation in the GSWP-2 forcing data in NH high latitudes. Furthermore, the decomposition of total ET into canopy interception, transpiration, and bare soil evaporation in CABLE shares similar features to the GSWP-2_mmc. Nevertheless, the model has some deficiency in correctly partitioning total runoff into surface and subsurface components. By comparing the CABLE results with another set of model-produced data from the GLDAS-1 project (Rodell et al. 2004), we found that the precipitation forcing difference between GSWP-2 and GLDAS-1 is not the only factor in accounting for the differences between GSWP-2 model-simulated ET and runoff and those from GLDAS-1 models. In fact, we found that the evaporation fraction is much higher in the GLDAS-1 models than in GSWP-2; thus, the same precipitation forcing would also lead to the overestimation of runoff in the GSWP-2 models due to lower surface evapotranspiration.
At regional scale, we have validated the model water balance across 150 catchments as used in a recent study of Zhou et al. (2012). The overall ET and runoff variations across these catchments are well captured, but the model ET underestimation has contributed to its runoff overestimation over catchments where annual rainfall exceeds 1500 mm yr−1. Another model deficiency is that although the soil water recharge–discharge feature is prominent in the model, as reflected in observational data across a number of semiarid regions (southwest and southeast Australia), results tend to suggest that runoff in CABLE is too closely tied to the rainfall seasonal variations. It still underestimated evaporation in the tropical rain forest region, although the model has made significant progress compared with its previous simulations.
Comparing the results from a series of CABLE parameter perturbation experiments, we further found that the model was sensitive to several key parameters, including the root water uptake efficiency parameter when the scheme of Lai and Katul (2000) is introduced in the model, as well as the soil wilting point and canopy interception capacity. Nevertheless, the model showed weak sensitivity to soil hydraulic conductivity. The model is sensitive to LAI given its significant influence on canopy biogeophysical and biogeochemical processes, but because LAI was prescribed in these experiments, the model did not show significant sensitivity to VCmax. Within the same modeling framework, the model sensitivity to soil/vegetation parameters is equivalent to 20%–30% of the uncertainty seen from the GSWP-2 intermodel difference. Previous studies identified different forcing data and different model structures as the primary causes of the uncertainty in estimating global water and energy budgets (e.g., Rodell et al. 2004; Dirmeyer et al. 2006; Haddeland et al. 2011; Jimenez et al. 2011). Here results from CABLE experiments, although they could be model dependent, suggest that uncertainty in model soil/vegetation parameters can be another important cause.
In this analysis, we have focused only on evaluating the partition of precipitation into surface ET and runoff at different scales. We have not evaluated detailed processes such as soil moisture variations and snow processes, which have been shown to be important in modeling climate variability and predictability (e.g., Zhang and Frederiksen 2003; Zhang 2004). Future work will focus on assessing to what extent the sensitivities seen in this study translate into influences on weather and climate simulations in coupled climate models.
The authors acknowledge the ACCESS land modeling team for the CABLE development and improvement. L. Zhang's contribution to this study is undertaken as part of the bilateral collaboration between CAWCR and CMA. Comments from Dr. I. Dharssi and L. Rikus during the internal review process are appreciated. We also thank L. Hanson and L. Rikus for their English editing. We sincerely thank all three anonymous reviewers for their very thoughtful and constructive comments and suggestions.