Central Chile is facing dramatic projections of climate change, with a consensus for declining precipitation, negatively affecting hydropower generation and irrigated agriculture. Because the terrain rises from sea level to 6000 m within a distance of 200 km and long-term observations are lacking, especially at higher elevations, precipitation characterization is difficult. For understanding current mean and extreme conditions and recent hydroclimatological change, as well as to provide a baseline for downscaling climate model projections, a temporally and spatially complete dataset of daily meteorology is essential. The authors use a gridded global daily meteorological dataset at 0.25° resolution for the period 1948–2008, adjusted by monthly precipitation observations interpolated to the same grid using a cokriging method with elevation as a covariate. For validation, daily statistics of the adjusted gridded precipitation are compared to station observations. For further validation, a hydrology model is driven with the gridded 0.25° meteorology and streamflow statistics are compared with observed flow. The high elevation precipitation is validated by comparing the simulated snow extent to Moderate Resolution Imaging Spectroradiometer (MODIS) images. Results show that the daily meteorology with the adjusted precipitation can accurately capture the statistical properties of extreme events as well as the sequence of wet and dry events, with hydrological model results displaying reasonable agreement with observed streamflow and snow extent. This demonstrates the successful use of a global gridded data product in a relatively data-sparse region to capture hydroclimatological characteristics and extremes.
1. Introduction
Whether exploring teleconnections for enhancing flood and drought predictability or assessing the potential impacts of climate change on water resources, understanding the response of the land surface hydrology to perturbations in climate is essential. This has inspired the development and assessment of many large-scale hydrologic models for simulating land–atmosphere interactions over regional and global scales (e.g., Lawford et al. 2004; Milly and Shmakin 2002; Nijssen et al. 2001b; Sheffield and Wood 2007).
A prerequisite to regional hydroclimatological analyses is a comprehensive, multidecadal, and spatially and temporally complete dataset of observed meteorology, whether for historic simulations or as a baseline for downscaling future climate projections. In response to this need, datasets of daily gridded meteorological observations have been generated, both over continental regions (e.g., Cosgrove et al. 2003; Maurer et al. 2002) and globally (Adam and Lettenmaier 2003; Sheffield et al. 2006). These have benefited from work at coarser time scales (Chen et al. 2002; Daly et al. 1994; Mitchell and Jones 2005; New et al. 2000; Willmott and Matsuura 2001) with many products combining multiple sources, such as station observations, remotely sensed images, and model reanalyses.
While these large-scale gridded products provide opportunities for hydrological simulations for land areas around the globe, they are inevitably limited in their accuracy where the underlying density of available station observations is low, the station locations are inadequate to represent complex topography, or where the gridded spatial resolution is too large for the region being studied. Central Chile is an especially challenging environment for characterizing climate and hydrology since the terrain exhibits dramatic elevation changes over short distances, and the orographic effects produce high spatial heterogeneity in precipitation. In general, the observation station density in South America is inadequate for long-term hydroclimate characterization (de Goncalves et al. 2006). While some of South America is relatively well represented by global observational datasets (Silva et al. 2007), regions west of the Andes are much less so (Liebmann and Allured 2005).
In this study, we utilize a new high-resolution global daily gridded dataset of temperature and precipitation, adjust it with available local climatological information, and assess its utility for representing river basin hydrology. Recognizing the value in simulating realistic extreme events, we assess the new data product for its ability to produce reasonable daily streamflow statistics. We evaluate whether the adjusted product can represent climate and hydrology plausibly enough that historical statistics are reproduced.
The principal aim of this study is to produce a gridded representation of the climate and hydrology of central Chile and demonstrate a methodology for generating a reasonable set of data products that can be used for future studies of regional hydrology or climate. Given these regional results, we assess the potential to export the method to other relatively data-sparse regions where representative climatological average information is available but long-term daily data are inadequate. The paper is organized as follows. Section 2 describes the study area. In section 3 we describe the data, the hydrological model, and the methodological approach. Results of the adjusted dataset validation and model simulations are discussed in section 4. Finally, the main conclusions of the study are presented in section 5.
2. Study area
The focus area of this study is the region in central Chile encompassing the four major river basins (from north to south: the Rapel, Mataquito, Maule, and Itata Rivers) between latitudes 35.25° and 37.5°S (Fig. 1). The climate is mediterranean with 80% of the precipitation falling in the rainy season from May to August (Falvey and Garreaud 2007), peaking during June. The terrain is dramatic, rising approximately 6000 m within a horizontal distance of approximately 200 km, producing sharp gradients in climate (Falvey and Garreaud 2009). Mean precipitation is approximately 500 mm yr−1 at the north end of the study domain, and as much as 3000 mm yr−1 in the high elevations at the southern end of the domain. Although climate in the valley and mountain foothills is well represented by meteorological stations, it is evident from Fig. 1 that the high elevation areas are underrepresented by the observation network.
Our study region in central Chile is especially important from a hydroclimatological standpoint, as it contains more than 75% of the country’s total irrigated agriculture (www.censoagropecuario.cl/index2.html) and the majority of the reservoir storage in the country, and provides water supply for some of Chile’s largest cities. A changing climate is evident in recent hydroclimate records (Rubio-Álvarez and McPhee 2010), and future climate projections for the region indicate the potential for very large impacts (Bradley et al. 2006). Vicuña et al. (2011) show that the vulnerability of central Chile to projected climate change is high, with robust drying trends in general circulation model (GCM) projections and a high sensitivity to changing snowmelt patterns, and they also discuss the challenges in characterizing climate in a Chilean catchment with few precipitation observations and none at high elevations.
3. Methods and data
a. Gridded dataset development
We begin with a gridded global (land surface) forcing dataset of daily precipitation and minimum and maximum temperatures at 0.25° spatial resolution (approximately 25 km), prepared following Sheffield et al. (2006). To summarize, the forcing dataset is based on the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis (Kalnay et al. 1996) for 1948–2008 from which daily maximum and minimum temperature and daily precipitation are obtained at approximately 2° spatial resolution. Reanalysis temperatures are based on upper-air observations, though precipitation is a model output and thus exhibits significant biases.
The reanalysis temperatures are interpolated to a 0.25° spatial resolution using a temperature lapse rate of −6.5°C km−1 based on the elevation difference between the large reanalysis spatial scale and the elevation in each 0.25° grid cell. Precipitation is interpolated to 0.25° using a product of the Tropical Rainfall Measuring Mission (TRMM 3B42RT) (Huffman et al. 2007) following the methods outlined by Sheffield et al. (2006). The daily statistics of precipitation (number of rain days and wet/dry day transition probabilities) are corrected by resampling to match the observation-based monthly 0.5° rain days data from the Climatic Research Unit (CRU) (Mitchell and Jones 2005) and the daily statistics of the Global Precipitation Climatology Project (Huffman et al. 2001) and the global forcing dataset of Nijssen et al. (2001b). To ensure large-scale correspondence between this dataset and the CRU monthly dataset, precipitation is scaled so that the monthly totals match the CRU monthly values at the CRU spatial scale. Maximum and minimum temperatures are also scaled to match the CRU time series using CRU monthly mean temperature and diurnal temperature range.
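As a minimal sketch of the elevation correction described above, the following applies the stated −6.5°C km−1 lapse rate to move a coarse reanalysis temperature to the elevation of a 0.25° grid cell. The function name and the example elevations are hypothetical, and the horizontal interpolation step used by Sheffield et al. (2006) is not reproduced here.

```python
# Lapse rate stated in the text (degrees C per km of elevation gain).
LAPSE_RATE_C_PER_KM = -6.5


def lapse_adjust(temp_coarse_c, elev_coarse_m, elev_fine_m,
                 lapse=LAPSE_RATE_C_PER_KM):
    """Shift a coarse-cell temperature to the elevation of a fine grid cell.

    temp_coarse_c : temperature of the coarse reanalysis cell (deg C)
    elev_coarse_m : mean elevation of the coarse cell (m)
    elev_fine_m   : elevation of the 0.25-degree cell (m)
    """
    dz_km = (elev_fine_m - elev_coarse_m) / 1000.0
    return temp_coarse_c + lapse * dz_km


# Hypothetical example: a reanalysis cell at 1500 m and 10 deg C,
# adjusted to a 0.25-degree cell at 3500 m.
t_fine = lapse_adjust(10.0, 1500.0, 3500.0)
```

A cell 2 km above the coarse-cell elevation is therefore 13°C colder than the reanalysis value under this assumption.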
While the incorporation of multiple sources of extensively reviewed data provides an invaluable data product for global and continental-scale analyses, as discussed by Mitchell and Jones (2005) ultimately much of the local characterization is traceable to a common network of land surface observations (Peterson et al. 1998), which is highly variable in station density for different regions. For example, for the region of study shown in Fig. 1 an average of three to four observation stations are included in the CRU precipitation data product and none are in high-elevation areas. This results in a few low elevation meteorological stations in Chile on the western side of the Andes, and the next observation station to the east is in a more arid area in Argentina. Thus, the resulting precipitation fields in the gridded product for this region show a spatial gradient opposite to that published by the Dirección General de Aguas (DGA 1987). Figure 2a shows the spatial distribution of gridded global total annual precipitation that displays a notable decrease of rainfall with elevation. Conversely, the DGA precipitation map is able to capture the climatological orographic enhancement of precipitation by the Andes (Fig. 2b). The precipitation lapse rates for the latitudinal bands at 35.125° and 36.125°S show a negative gradient of precipitation with elevation in the global gridded dataset whereas the DGA precipitation shows a positive gradient for the period 1951–89 (Figs. 2c and 2d, respectively).
When the gridded data product is used to drive a land surface hydrology model (described in more detail below), the shortcomings of the precipitation characterization are evident in Fig. 3. There is a very poor characterization of the seasonal low flow, and the later season snowmelt pulse is absent. Calibration is not able to recover the observed streamflow patterns, indicating that the erroneous spatial distribution of precipitation in the raw gridded dataset is not appropriate for hydroclimatological studies in this snow-dominated region.
Local data from the DGA of Chile, some monthly and some daily, were obtained to characterize the local climatology better. While still biased toward low elevation areas, the stations (Fig. 1) do cover a wider range and include altitudes up to 2400 m. These stations were filtered to include those that had at least 90% complete monthly records for at least a 20-yr period of record in the 1983–2007 window (the period containing the most complete data coverage), selected as our climatological period. From the pool of 70 available stations, 40 stations met the two criteria. Except for the Itata River basin, which had two stations located at 1200 and 2400 m above sea level, most of the selected stations were located in the central part of the region at elevations below 500 m. Mean precipitation was computed for each month and for each selected station, resulting in 12 mean values for the climatological period. Since the climate of central Chile is also modulated by interannual variability linked to El Niño and the Pacific decadal oscillation (Garreaud et al. 2009), a multidecadal period was chosen to limit the influence of a single predominant phase of low frequency oscillations during the climatological period. The monthly average precipitation for the selected DGA stations was interpolated onto the same 0.25° grid using cokriging, with elevation being the covariate. This method of cokriging has been shown to improve kriging interpolation to include orographic effects induced by complex terrain (Diodato and Ceccarelli 2005; Hevesi et al. 1992).
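The station screening and climatology step can be sketched as follows. This is an illustration only: the data layout (a dictionary keyed by year and month) and the function names are hypothetical, and the completeness rule is read here as "some run of 20 consecutive years inside 1983–2007 has at least 90% of its months present", which is one plausible interpretation of the criteria above. The cokriging step itself is not shown.

```python
WINDOW = (1983, 2007)  # climatological period stated in the text


def meets_criteria(records, window=WINDOW, span_years=20, min_complete=0.90):
    """Return True if a station passes the completeness screen.

    records maps (year, month) -> monthly precipitation in mm;
    missing months are absent or None (assumed layout).
    """
    first, last = window
    for start in range(first, last - span_years + 2):
        months = [(y, m) for y in range(start, start + span_years)
                  for m in range(1, 13)]
        present = sum(1 for key in months if records.get(key) is not None)
        if present / len(months) >= min_complete:
            return True
    return False


def monthly_climatology(records, window=WINDOW):
    """Return 12 mean monthly totals (January first) over the window."""
    first, last = window
    clim = []
    for m in range(1, 13):
        vals = [records[(y, m)] for y in range(first, last + 1)
                if records.get((y, m)) is not None]
        clim.append(sum(vals) / len(vals) if vals else None)
    return clim
```

Applied to every station in the pool, `meets_criteria` selects the subset to interpolate and `monthly_climatology` gives the 12 values per station that feed the interpolation.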
This process produced 12 monthly mean precipitation maps for the region. The same 1983–2007 period was extracted from the daily gridded dataset, and monthly average values were calculated for each grid cell. Twelve ratios (one for each month) of observed climatology divided by the gridded dataset average were then calculated for each grid cell. Daily values in the gridded dataset were adjusted to create a new set of daily precipitation data, Padj, whose monthly climatology matches the interpolated observations produced with cokriging, using a simple ratio:
$$P_{\mathrm{adj}}(i,j,t) = P_{\mathrm{grid}}(i,j,t)\,\frac{\overline{P}_{\mathrm{obs}}(i,j,\mathrm{mon})}{\overline{P}_{\mathrm{grid}}(i,j,\mathrm{mon})}, \qquad (1)$$
where Pgrid is the original daily gridded 0.25° data at location (i, j), Pobs is the interpolated observed climatology, overbars indicate the 1983–2007 mean, and the subscript “mon” indicates the month from the climatology in which day t falls.
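A minimal sketch of the ratio adjustment in Eq. (1) for a single grid cell is given below. The function names and data layout are hypothetical, and the handling of months with a zero gridded mean (left unscaled) is an assumption that the text does not address.

```python
def monthly_ratios(obs_clim, grid_clim):
    """Return 12 ratios of observed to gridded monthly climatology.

    obs_clim, grid_clim : 12 long-term monthly means (January first)
    for one grid cell, both over the same climatological period.
    """
    ratios = []
    for obs, grid in zip(obs_clim, grid_clim):
        # Assumption: where the gridded mean is zero, leave values unscaled.
        ratios.append(obs / grid if grid > 0 else 1.0)
    return ratios


def adjust_daily(daily, ratios):
    """Scale each daily value by the ratio for the month it falls in.

    daily : list of (month, precip_mm) pairs, month in 1..12
    """
    return [(month, p * ratios[month - 1]) for month, p in daily]


# Hypothetical cell where observed climatology is twice the gridded mean.
ratios = monthly_ratios([100.0] * 12, [50.0] * 12)
adjusted = adjust_daily([(6, 10.0), (1, 0.0)], ratios)
```

Because each day is multiplied by a constant for its month, dry days stay dry and the day-to-day sequence of the gridded product is preserved; only magnitudes change.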
This same method was applied to a global dataset of daily meteorology in a data-sparse region in Central America, resulting in improved characterization of precipitation and land surface hydrology (Maurer et al. 2009). In addition, this new adjusted dataset includes the full 1948–2008 period, even though local observations are very sparse before 1980.
To validate the adjusted precipitation dataset, we computed a set of statistical parameters widely used to describe climate extremes (dos Santos et al. 2011; Zhang and Yang 2004). Additionally, to evaluate the temporal characteristics of rainfall events, we computed the probability of occurrence of wet and dry days and the transition probabilities between wet and dry states (Wilks and Wilby 1999). Table 1 shows a description of the statistics used.
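The wet/dry occurrence and transition probabilities can be estimated from a daily series as sketched below. The 1-mm wet-day threshold follows the definition mentioned later in the text; the use of "greater than or equal to" and the function name are assumptions.

```python
def wet_dry_stats(precip, wet_threshold=1.0):
    """Estimate wet-day probability and first-order transition probabilities.

    precip : daily precipitation in mm, in chronological order.
    Returns (p_wet, p_wet_given_wet, p_wet_given_dry).
    """
    wet = [p >= wet_threshold for p in precip]
    p_wet = sum(wet) / len(wet)

    ww = wd = dw = dd = 0
    for today, tomorrow in zip(wet, wet[1:]):
        if today and tomorrow:
            ww += 1          # wet followed by wet
        elif today and not tomorrow:
            wd += 1          # wet followed by dry
        elif not today and tomorrow:
            dw += 1          # dry followed by wet
        else:
            dd += 1          # dry followed by dry

    p_ww = ww / (ww + wd) if (ww + wd) else float("nan")
    p_dw = dw / (dw + dd) if (dw + dd) else float("nan")
    return p_wet, p_ww, p_dw
```

The dry-day probability and the remaining two transition probabilities follow as complements of the returned values.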
To evaluate if the adjusted precipitation dataset captures the orographic gradient of precipitation, we compared model-simulated snow water equivalent (SWE) to the MODIS/Terra Snow Cover dataset, which is available at 0.05° resolution for 8-day periods starting from the year 2000. MODIS snow cover data are based on a snow mapping algorithm that employs a normalized difference snow index (Hall et al. 2006). To estimate snow cover from the meteorological data, the Variable Infiltration Capacity (VIC) model was employed (see model description in section 3b). The accuracy of simulated snow cover relative to that of random chance was measured with the Heidke skill score (HSS) (Wilks 2006). A score equal to one would indicate perfect agreement between VIC simulated snow cover and observations; a value greater than zero indicates some predictive skill. Thus, the closer the HSS is to one, the less likely it is that the agreement between observations and simulations has been obtained by chance. The score is computed as
$$\mathrm{HSS} = \frac{2(ad - bc)}{(a + c)(c + d) + (a + b)(b + d)}, \qquad (2)$$
where a and d are the numbers of hits (i.e., pairs of successful MODIS–VIC no snow and snow estimates, respectively), b is the number of cases when VIC simulates snow but MODIS does not measure it, and c is the number of events when snow is observed by MODIS but not simulated by VIC.
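As an illustration, the Heidke skill score can be computed from paired snow/no-snow flags using the standard 2 × 2 contingency form (Wilks 2006), with a, b, c, and d as defined above. The function names are hypothetical, and the conversion of VIC SWE to a snow/no-snow flag is assumed to have been done beforehand because the text does not state a threshold.

```python
def contingency(vic_snow, modis_snow):
    """Count the four outcomes for paired boolean snow flags.

    a: both report no snow      b: VIC snow, MODIS no snow
    c: MODIS snow, VIC no snow  d: both report snow
    """
    a = b = c = d = 0
    for sim, obs in zip(vic_snow, modis_snow):
        if not sim and not obs:
            a += 1
        elif sim and not obs:
            b += 1
        elif not sim and obs:
            c += 1
        else:
            d += 1
    return a, b, c, d


def heidke_skill_score(a, b, c, d):
    """Standard 2x2 Heidke skill score; 1 is perfect, 0 is no skill."""
    num = 2.0 * (a * d - b * c)
    den = (a + c) * (c + d) + (a + b) * (b + d)
    return num / den if den else float("nan")
```

The expression is symmetric in a and d, so it does not matter which of the two correct outcomes is labelled the "hit".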
b. Hydrologic model simulations
To assess the ability of the daily gridded meteorology developed in this study to capture daily climate features across the watersheds, we simulate the hydrology of river basins in the region to obtain streamflow and snow cover estimates. The hydrologic model used is the Variable Infiltration Capacity model (Cherkauer et al. 2003; Liang et al. 1994). The VIC model is a distributed, physically based hydrologic model that balances both surface energy and water budgets over a grid mesh. The VIC model uses a “mosaic” scheme that allows a statistical representation of the subgrid spatial variability in topography, infiltration, and vegetation/land cover—an important attribute when simulating hydrology in heterogeneous terrain. The resulting runoff at each grid cell is routed through a defined river system using the algorithm developed by Lohmann et al. (1996). The VIC model has been successfully applied in many settings, from global to river basin scale (e.g., Maurer et al. 2002; Nijssen et al. 2001a; Sheffield and Wood 2007).
For this study, the model was run at a daily time step at a 0.25° resolution (approximately 630 km2 per grid cell for the study region). Elevation data for the basin routing were based on the 15-arc-second Hydrosheds dataset (Lehner et al. 2006), derived from the Shuttle Radar Topography Mission (SRTM) at 3 arc-second resolution. Land cover and soil hydraulic properties were based on values from Sheffield and Wood (2007), though specified soil depths and VIC soil parameters were modified during calibration. The river systems contributing to selected points were defined at a 0.25° resolution, following the technique outlined by O’Donnell et al. (1999).
4. Results and discussion
The adjusted dataset was validated in several ways. First, adjusted daily precipitation fields were cross-validated in four locations. Second, the final dataset was prepared and daily statistics were compared between the global daily dataset and local observations. Third, hydrologic simulation outputs were compared to observations to investigate the plausibility of using the new dataset as an observational baseline for studying climate impacts on hydrology.
a. Gridded precipitation data cross-validation
Prior to using the adjusted precipitation fields for hydrologic simulations, we performed a cross-validation of the gridded precipitation for the months of May–August at four locations across the basins. Figure 4 shows the geographic location of the validation sites and Table 2 shows the geographic coordinates of the 0.25° grid cells used in the comparison. For each grid point (0.25° gridcell center), the three nearest precipitation observation stations, located in an approximate 50-km diameter circle surrounding the grid cell center, were selected for validation. Selected stations were located, when possible, not more than 50% higher or lower (maximum elevation difference was 150 m except at Loc3 where it was 500 m) than that of the 0.25° grid cell. For each month and for each of the four locations (i.e., Loc1, Loc2, Loc3, and Loc4) the three precipitation gauge stations surrounding the grid cell were excluded from the cokriging interpolation process, which produced four sets (one for each site) of four maps (one for each month from May to August) of climatological precipitation at 0.25° spatial resolution. Daily gridded adjusted precipitation values at each location were then obtained by applying Eq. (1).
A scatterplot between observed and interpolated average (1983–2007) monthly precipitation for May–August (the rainy season) at each location is shown in Fig. 5. Note that the three rain gauges surrounding each location were excluded from the interpolation process producing the daily gridded data at that location. Interpolated monthly totals underestimate observations by 17.5%, mostly as a result of strong underestimation of rainfall totals at Loc3. The relative rms error is 0.87%, indicating good agreement between observed and simulated fields. When values for Loc3 are not included in the computation, the bias improves to 3.3% and the RMSE is equal to 0.55%. It is worth noting that Loc3 is situated in a region with extremely complex precipitation gradients due to orographic enhancement at the foothills of the Andes Mountains, and, by excluding the three gauges nearest to the grid cell center, the closest remaining precipitation observations are at a distance of hundreds of kilometers, leaving that area essentially unrepresented by observations. Thus, it is not surprising that with no observational underpinning, capturing complex features in spatial precipitation patterns is difficult.
Given the rising interest in characterizing extreme events in the context of a changing climate (Allen et al. 2011), the ability of the adjusted daily gridded dataset to characterize extreme statistics is important. The skill of the adjusted daily datasets for the cross-validation exercise at capturing rainfall extremes was assessed for the same four locations as above. We computed a set of statistical variables frequently used to describe climate extremes, using the RClimDex software (Zhang and Yang 2004; Zhang et al. 2005). In this case, the statistics were computed at the daily level at each grid cell for the original gridded dataset (UndAdj), the adjusted dataset (Adj), and the observations (Obs). Figure 6 indicates that the monthly rescaling improves the representation of intense rainfall events (R95p), the annual total precipitation (PRCPTOT), and the precipitation intensity (SDII) in three of the four locations. The number of days with precipitation over 20 mm (R20mm) also shows an improvement due to the adjustment in three locations. The statistical parameter linked to the length of wet spells (CWD) is not greatly affected by the adjustment, which is expected since rescaling is performed at the monthly level and daily sequencing is not affected (though minor changes in classification of wet days can occur because of the definition of wet days including a 1-mm threshold). The maximum 1-day precipitation value (RX1day) is slightly improved in Loc3 and Loc4 as a result of scaling.
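For readers without RClimDex, several of the indices named above can be approximated from one year of daily data as sketched below. The definitions used here (wet day ≥ 1 mm, PRCPTOT as the wet-day total, SDII as PRCPTOT divided by the number of wet days) are assumed to follow the usual conventions for these indices rather than being taken from Table 1, and the function name is hypothetical. R95p and R99p are omitted because they require a base-period percentile.

```python
def extreme_indices(daily_mm, wet_threshold=1.0):
    """Compute a few precipitation indices for one year of daily totals (mm)."""
    wet = [p for p in daily_mm if p >= wet_threshold]
    prcptot = sum(wet)                               # total on wet days
    sdii = prcptot / len(wet) if wet else 0.0        # mean wet-day intensity
    r20mm = sum(1 for p in daily_mm if p >= 20.0)    # days with >= 20 mm
    rx1day = max(daily_mm) if daily_mm else 0.0      # largest 1-day total

    # Longest run of consecutive wet days.
    cwd = run = 0
    for p in daily_mm:
        run = run + 1 if p >= wet_threshold else 0
        cwd = max(cwd, run)

    return {"PRCPTOT": prcptot, "SDII": sdii, "R20mm": r20mm,
            "RX1day": rx1day, "CWD": cwd}
```

Because a monthly scaling factor can push a day across the 1-mm threshold, CWD computed this way can change slightly after adjustment even though the daily sequence is otherwise untouched.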
The statistical significance of the parameters in Fig. 6 was assessed with the correlation coefficient and a two-sample unpaired Student’s t test to determine if the means of the statistics were statistically different (Table 3). Correlation is high between observations and adjusted precipitation for the statistics that measure the precipitation maximum at annual and daily levels (R99p and RX1day), and the differences in their means are not significantly different from zero (at α = 0.05). Total annual precipitation is highly correlated in both datasets; however, the means are statistically different in Loc2 and Loc3. The precipitation intensity (SDII) and the number of days with precipitation over the 20 mm threshold (R20mm) are strongly correlated with observations and have means that are statistically indistinguishable in all locations except for Loc3. The length of adjusted wet spells (CWD) was statistically different from observations.
Overall, based on the cross-validation results in Figs. 5 and 6 and Table 3, it can be concluded that, while not all deficiencies in the representation of daily precipitation in the gridded product can be recovered by our simple monthly climatological scaling, the resemblance to observations improves for many important statistics, especially those related to total precipitation, precipitation intensity, and extreme events.
b. Gridded meteorological data development
Following the cross-validation, the same methodology was applied to all 12 months using all of the 40 rain gauges selected using the criteria outlined in section 3a. In other words, for the final gridded precipitation product, no precipitation stations were excluded. Figure 7 shows the adjusted gridded annual precipitation fields and their departure from the observed dataset for the period 1950–2006. In Fig. 7, for each 0.25° grid cell, adjusted annual precipitation was subtracted from observed precipitation to obtain the differences. It is evident that in the more humid southern mountainous portion of the study area there has been a marked increase in precipitation with the adjustment, incorporating the more detailed information embedded in the rain gauge observations. Negative differences between original and adjusted gridded precipitation indicate the existence of a band along the Andes where annual precipitation is greater in the adjusted precipitation dataset compared to the unadjusted gridded data (Fig. 7b).
To compare how the adjusted daily precipitation relates to observations, we compared daily precipitation statistics at the same four locations used in the cross-validation step, using the same three surrounding observation stations as above. Table 4 summarizes the basic statistics, bias, RMSE, and correlation coefficient for daily observed (OBS) and daily adjusted gridded precipitation (ADJ) for austral summer [December–February (DJF)] and winter [June–August (JJA)] for the period 1983–2007. The bias is defined as the sum of the differences between ADJ and OBS, and the RMSE is equal to the rms error between daily ADJ and OBS precipitation values. Since no observation stations were excluded in developing this final gridded dataset, and the adjustment process scales daily data to match monthly climatological means, long-term mean values in the gridded dataset are expected to be close to observations. There is, however, no daily information from the observations included in the gridded dataset. Thus, as expected, in Table 4 mean daily values are very close for the observed and adjusted datasets for both seasons. The variability of daily precipitation within each season, represented by the standard deviation, also compares relatively well, though the adjusted gridded data show greater variability than the observations during the rainy winter season. A high RMSE and low correlation values indicate that temporal sequencing differs between the two datasets. This is not unexpected, since the daily precipitation in the original 0.25° gridded data was derived from reanalysis, and as such it is a model output (Kalnay et al. 1996), adjusted as discussed in section 3a above, which does not incorporate station observations from the surrounding stations used in the comparisons in this study.
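The three comparison measures reported in Table 4 can be sketched as follows, using the definitions given above (bias as the sum of ADJ minus OBS differences, RMSE as the root-mean-square difference). The function names are hypothetical, and the series are assumed to be paired day by day with no missing values.

```python
import math


def bias(adj, obs):
    """Sum of daily differences ADJ - OBS, as defined in the text."""
    return sum(a - o for a, o in zip(adj, obs))


def rmse(adj, obs):
    """Root-mean-square error between paired daily values."""
    return math.sqrt(sum((a - o) ** 2 for a, o in zip(adj, obs)) / len(obs))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

A small bias combined with a large RMSE and low correlation is the signature described in the text: the long-term totals agree while the timing of individual events does not.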
While important characteristics of daily precipitation variability are represented in the 0.25° gridded data and monthly totals should bear resemblance to observations (at least as represented by the underlying monthly data such as CRU), correspondence with observed daily precipitation events is not anticipated.
As in the cross-validation step, we use RClimDex to compute a set of extreme statistics, described in Table 1. Additionally, we compute the unconditional probability of occurrence of wet and dry days and the corresponding transition probabilities. Table 5 summarizes results for comparisons of statistical parameters listed in Table 1 for the four locations, similar to Table 3.
The agreement between adjusted total annual precipitation (PRCPTOT) and observations is good with an average bias of −9% from the observed station mean (not shown), although this is constrained by design since for long-term average quantities the datasets are not independent as noted above. Extreme precipitation events [R99p and R95p and maximum 1-day precipitation (RX1day)] have statistically indistinguishable means for all four locations. Precipitation intensity (SDII) also shows good agreement at the four locations with statistically equal means for observations and adjusted gridded data for three of the four locations. Conversely, the parameters, R5mm and R20mm, albeit strongly correlated, were found to have statistically different mean values. This phenomenon of a gridded precipitation dataset having lower extreme precipitation values than station observations was also noted in the South American study of Silva et al. (2007) and is consistent with the effect of spatial averaging—that is, comparing the average of a 630 km2 0.25° grid cell to the smaller, more discrete area represented by the three averaged stations (Yevjevich 1972). The statistics related to duration of wet and dry spells showed statistically different population means at all four locations. The maximum consecutive number of dry days (CDD) and wet days (CWD) in a year is lower for the adjusted gridded precipitation compared to observations, indicating that the rescaling at the monthly level cannot modify the durations of wet and dry events in the adjusted gridded dataset, except, as discussed above, by changing classifications of small numbers of days as wet according to the definition in Table 1.
The adjusted precipitation data shows an average transition probability of a wet day followed by a wet day of 0.21 compared to 0.50 obtained for the observations, indicating that the duration of storm events is shorter in the gridded dataset. This could partially explain the underestimation of maximum consecutive wet days (CWD) in the gridded precipitation compared to observations (Fig. 6e) as well.
Finally, the results in Table 5 (for the final gridded product) can be seen to be nearly identical to those in Table 3 (for the cross-validation). This demonstrates that the daily statistics of the gridded product are not deriving their values from the stations surrounding individual grid cells since the statistics at each location are not particularly sensitive to the exclusion of nearby stations. Rather, the improvements in extreme statistics are realized owing to improved large-scale precipitation characterization incorporating elevation data to correct biases in spatial distribution of precipitation fields.
c. Hydrologic model validation of adjusted meteorology
To assess the representation in the new meteorological dataset of basinwide and high elevation areas, the adjusted gridded data developed and assessed in the previous sections were then used to drive the VIC hydrologic model. Since the precipitation was shown to be comparable to observations (where available) in many important respects, another validation of the driving meteorology would be the successful simulation of observed streamflow and snow cover. Records of observed streamflow in the region tend to be incomplete or for short periods, and, since most of the rivers are affected by reservoirs and diversions, the flows often do not reflect natural streamflow as simulated by the VIC model. For this project, we focused on three sites, shown in Fig. 1, that have more complete records and were judged to be relatively free of anthropogenic influences.
For the site on the Mataquito River, the VIC model was calibrated to monthly stream flows for the period 1990–99 using the Multiobjective Complex Evolution of the University of Arizona (MOCOM-UA) algorithm (Yapo et al. 1998). The three optimization criteria used in this study were the Nash–Sutcliffe model efficiency (NSE) (Nash and Sutcliffe 1970), using both flow (NSE) and the logarithm of flow (NSElog), and the bias, expressed as a percent of observed mean flow. This provides a balance between criteria that penalize errors at high flows and others that are less sensitive to a small number of large errors at high flows (Lettenmaier and Wood 1993). The MOCOM-UA method does not require an a priori subjective weighting to the multiple optimization criteria, but evolves toward a set of “nondominated” Pareto solutions of all objectives. By definition, for a set of objectives, two solutions will not dominate one another if they have the property that moving from one solution to another results in the improvement of one objective while causing deterioration in one or more others, using, in the case of MOCOM-UA, rank-based assessments of objectives (Gupta et al. 1999; Vrugt et al. 2003).
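The three calibration criteria can be written compactly as below. This sketch only evaluates the objectives for a given pair of simulated and observed monthly flow series; it does not implement MOCOM-UA, and the function names are hypothetical. The log-transformed score assumes strictly positive flows.

```python
import math


def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def nse_log(sim, obs):
    """NSE of log-transformed flows; less dominated by high-flow errors."""
    return nse([math.log(s) for s in sim], [math.log(o) for o in obs])


def percent_bias(sim, obs):
    """Bias as a percent of the observed mean flow."""
    return 100.0 * (sum(sim) - sum(obs)) / sum(obs)
```

Using NSE and NSElog together gives one criterion dominated by peak-flow errors and one that weights low-flow errors more evenly, which is the balance the calibration seeks.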
Figure 8 shows the VIC simulation results for the calibration period and for the validation period of 2000–07. The flows for both periods generally meet the criteria for “satisfactory” calibration based on the criteria of Moriasi et al. (2007), with NSE > 0.50 and absolute bias <25%. While during the validation period several of the maximum annual flow peaks are overestimated, resulting in a lower NSE score compared to the calibration period, the reasonable peaks, low flows, and satisfactory calibration and validation do serve to provide further validation of the driving meteorology as plausible. Comparing the simulation in Fig. 8 (with the adjusted precipitation) to that in Fig. 3 (with the unadjusted precipitation) shows that the precipitation adjustment helps to capture important hydrologic characteristics, including seasonal low flows and snowmelt pulses. The simulated peak flows exceeding observation could reflect a disproportionate increase in precipitation with the rescaling process in some parts of the basin, though this is not evident at any of the low to moderate elevation locations included in the validation (Fig. 6).
Despite the highly variable precipitation across the study region, we applied the same VIC-calibrated parameters from the Mataquito basin to the entire domain and used the VIC model to generate streamflow at the other two gauge sites. This avoids the possibility of allowing extensive calibration to hide meteorological data deficiencies. The simulated flows for the period 2000–07 at each site, and the associated statistics, are in Figs. 9 and 10. The simulated flows on average show little bias in both locations. The Claro River NSElog value is low, reflecting the underestimation of low flows and overestimation of peak flows during the simulation period, though the higher NSE value suggests that the errors at the high flows are not as systematic. The Loncomilla River displays a general overestimation by VIC of low flows, though both NSE and NSElog are above the satisfactory threshold. While these are not demonstrations of the best hydrologic model that could be developed for each basin or the best that the VIC model could produce (since no calibration was performed for two of the three basins), they do provide some further validation that the driving meteorology appears plausible and does not appear to show any systematic biases.
Additionally, a comparison of four streamflow properties is shown in Fig. 11 for the three simulated basins. We calculate the center timing (CT), defined as the day when half the annual (water year) flow volume has passed a given point (Stewart et al. 2005), where the water year runs from 1 April through 31 March. Simulated CT values lie within −11 to +17 days of the observed values, indicating that the snowmelt season is reasonably captured by the model (Fig. 11a). The unpaired Student’s t test indicates the distributions have equal means at a 5% significance level. The water-year volume and the 3-day peak flow are systematically overestimated by VIC simulations; however, their means are found to be statistically equal, with the exception of the Rio Claro 3-day peak flow. Low flows are both over- and underestimated by VIC simulations, but only the Loncomilla River has means that are statistically different (Fig. 11d).
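The CT definition used here (the day on which half the water-year volume has passed) can be sketched in a few lines of Python. The function names and the synthetic flow series are hypothetical, and the sketch assumes a complete daily series that starts on 1 April.

```python
from datetime import date, timedelta


def center_timing(daily_flow):
    """Return the 1-based day of the water year on which cumulative flow
    first reaches half of the annual total."""
    total = sum(daily_flow)
    if total <= 0:
        raise ValueError("annual flow volume must be positive")
    half = total / 2.0
    cumulative = 0.0
    for day, flow in enumerate(daily_flow, start=1):
        cumulative += flow
        if cumulative >= half:
            return day
    raise ValueError("cumulative flow never reached half the total")


def ct_date(start_year, day_of_water_year):
    """Convert a day of the water year to a calendar date, assuming the
    water year starts on 1 April of start_year."""
    return date(start_year, 4, 1) + timedelta(days=day_of_water_year - 1)


# Synthetic example: low baseflow with a 60-day snowmelt pulse starting on day 200.
flows = [5.0] * 365
for i in range(199, 259):
    flows[i] = 50.0

ct = center_timing(flows)
print(ct, ct_date(2000, ct))
```

Applying the same function to the observed and simulated series for each water year gives the paired CT values whose differences are summarized above.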
Recognizing the high dependence of this region on snowmelt and thus the importance of this process being well represented, we validate the high elevation meteorology of the new dataset by comparing VIC-simulated SWE to MODIS 8-day snow coverage for six events (see Table 6) between 2002 and 2007 (one image per year). The satellite images were selected to capture the snow cover in mid to late August in each year, approximating the maximum snow accumulation in the region. Following Maurer et al. (2003), a snow depth of 25.4 mm (1 inch) was used as the threshold to indicate the presence of snow on the ground. MODIS snow coverage was interpolated to a 0.25° grid using triangle-based cubic interpolation. VIC-simulated SWE was averaged to match the MODIS 8-day period. Strong similarities in the spatial extent are found between MODIS and VIC-simulated snow coverage for the period 21–28 August 2002 (Fig. 12). The average area covered by snow for the six years is 172 320 and 167 050 km² in VIC simulations and MODIS, respectively. This represents only a 3% error in the snow-covered area simulated by the VIC model.
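As a rough illustration of the snow/no snow classification and of the reported area error, the following Python sketch averages daily SWE over an 8-day window, applies the 25.4-mm threshold, and computes the percent difference between the two reported areas. The function names and the sample SWE values are hypothetical, and treating a value equal to the threshold as snow is an assumption.

```python
SNOW_THRESHOLD_MM = 25.4  # 1 inch, following Maurer et al. (2003)


def mean_swe(daily_swe_mm):
    """Average daily SWE (mm) over the compositing window."""
    return sum(daily_swe_mm) / len(daily_swe_mm)


def has_snow(daily_swe_mm, threshold_mm=SNOW_THRESHOLD_MM):
    """Classify a grid cell as snow covered when its window-mean SWE
    reaches the threshold (assumed inclusive)."""
    return mean_swe(daily_swe_mm) >= threshold_mm


def percent_difference(simulated, reference):
    """Difference of simulated from reference, as a percent of reference."""
    return 100.0 * (simulated - reference) / reference


# Hypothetical 8-day SWE series (mm) for two grid cells.
high_cell = [310.0, 305.0, 300.0, 298.0, 295.0, 290.0, 288.0, 285.0]
low_cell = [30.0, 22.0, 15.0, 10.0, 6.0, 3.0, 1.0, 0.0]
print(has_snow(high_cell), has_snow(low_cell))

# Snow-covered areas reported in the text (km2): VIC versus MODIS.
print(round(percent_difference(172320.0, 167050.0), 1))
```

With the areas quoted above, the difference is a little over 3% of the MODIS area, consistent with the rounded figure in the text.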
Table 7 is a contingency table of relative frequencies of snow/no snow in MODIS and VIC-simulated SWE. We include all of the pixels for the six selected periods (1170 in total). The number of pixels classified as snow or no snow is similar in VIC and MODIS, with frequencies of 0.58 and 0.29 for the no snow and snow classifications, respectively. The occurrence of misclassified snow/no snow events is quite low, on the order of 6%, indicating excellent agreement between the two data sources. The Heidke skill score for these data is 0.72, showing that the agreement between observed and VIC-simulated snow cover is unlikely to be due to chance.
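The Heidke skill score can be computed directly from a 2 × 2 contingency table. The sketch below uses the agreement frequencies quoted above (0.29 for snow, 0.58 for no snow) and, as an assumption, splits the remaining 0.13 evenly between the two misclassification cells, since the text gives them only as "on the order of 6%". The function and variable names are hypothetical.

```python
def heidke_skill_score(hits, false_alarms, misses, correct_negatives):
    """Heidke skill score for a 2x2 contingency table.

    Works with counts or relative frequencies, because the numerator and
    denominator scale identically with the table total.
    """
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    numerator = 2.0 * (a * d - b * c)
    denominator = (a + c) * (c + d) + (a + b) * (b + d)
    return numerator / denominator


def contingency_from_masks(simulated, observed):
    """Count (hits, false alarms, misses, correct negatives) from two
    equal-length sequences of booleans (True = snow)."""
    hits = sum(1 for s, o in zip(simulated, observed) if s and o)
    false_alarms = sum(1 for s, o in zip(simulated, observed) if s and not o)
    misses = sum(1 for s, o in zip(simulated, observed) if not s and o)
    correct_negatives = sum(1 for s, o in zip(simulated, observed) if not s and not o)
    return hits, false_alarms, misses, correct_negatives


# Agreement frequencies from the text; the 0.065/0.065 split is an assumption.
hss = heidke_skill_score(hits=0.29, false_alarms=0.065, misses=0.065,
                         correct_negatives=0.58)
print(round(hss, 2))
```

Under this assumed split the score comes out at about 0.72, in line with the value reported for Table 7. A score of 1 indicates perfect agreement and 0 indicates no skill beyond chance.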
Finally, the successful validation of the streamflow and simulated snow cover with observations also implicitly supports the gridded temperatures in the dataset. Reasonable end-of-season snow extent and well-simulated timing of flows in snow-dominated streams indicate that the temperatures are not likely to be greatly in error.
In this study an adjusted gridded daily precipitation dataset is developed for central Chile for the period 1948–2008. Precipitation gauge data are used to correct the inaccuracies in the representation of orographic distribution of precipitation existent in the available global gridded dataset. Adjusted gridded data are validated using station observations and hydrological model simulations.
In data-sparse regions, a simple cokriging method that incorporates topographic elevation as a covariate can be successfully used to improve the spatial representation of gridded precipitation in areas with complex terrain. A month-to-month adjustment can effectively remove biases in precipitation values arising from sparse rain gauge observations. The improvements in extreme daily precipitation statistics are derived from the improved large-scale characterization of precipitation and its elevation dependence; excluding individual observation stations had minor effects on the extreme precipitation statistics at nearby grid cells.
The adjusted gridded precipitation is able to capture precipitation enhancement due to orography in the region with a good representation of annual totals and precipitation intensity. However, the duration of storm events is slightly shorter than observed, perhaps as a result of comparing a 630 km² grid cell to the smaller, more discrete, areal precipitation represented by three averaged rain gauges. The statistics of extreme precipitation events are well captured by the adjusted gridded dataset, which encourages its use for climate change applications.
Streamflow simulations in three basins realistically capture high and low flow statistical properties, indicating that the driving meteorology in the adjusted gridded dataset is well represented. Simulated SWE closely resembles satellite observations, a resemblance that can be linked to a good depiction of winter precipitation at higher elevations, despite the driving meteorological dataset not including high-elevation station observations. While not explicitly tested here, successful simulation of snow cover and flow in snow-dominated streams indicates that temperatures in the gridded dataset are also reasonable and do not require adjustment.
Based on our results, the adjusted daily gridded precipitation dataset can be successfully used for hydrologic simulations of climate variability and change in central Chile. The methodology presented in this paper can be implemented in numerous data-sparse basins located in mountainous regions around the globe, with one caveat. The sensitivity of the results to the number of rain gauges used to obtain plausible adjusted values was not determined; therefore, the quality of the adjusted dataset will be constrained by the density of the local observation network.
This study was funded by CORFO-INNOVA Grant 2009-5704 to the Centro Interdisciplinario de Cambio Global at the Pontificia Universidad Católica de Chile. A Fulbright Visiting Scholars Grant also provided partial support to the second author. The authors are grateful to Paul Nienaber and Markus Schnorbus of the Pacific Climate Impacts Consortium, University of Victoria, BC, Canada, and Katrina Bennett at the University of Alaska, Fairbanks, for providing updated and improved code for the MOCOM/VIC application. We are grateful to two anonymous reviewers whose careful review and helpful comments led to substantial improvements to this work.