Abstract

Warm season river flows in central Asia, which play an important role in local water resources and agriculture, are shown to be closely related to the regional-scale climate variability of the preceding cold season. The peak river flows occur in the warm season (April–August) and are highly correlated with the regional patterns of precipitation, moisture transport, and jet-level winds of the preceding cold season (November–March), demonstrating the importance of regional-scale variability in determining the snowpack that eventually drives the rivers. This regional variability is, in turn, strongly linked to large-scale climate variability and tropical sea surface temperatures (SSTs), with the circulation anomalies influencing precipitation through changes in moisture transport. The leading pattern of regional climate variability, as resolved in the operationally updated NCEP–NCAR reanalysis, can be used to make a skillful seasonal forecast for individual river flow stations. This ability to make predictions based on regional-scale climate data is of particular use in this data-sparse area of the world.

The river flow is considered in terms of 24 stations in Uzbekistan and Tajikistan available for 1950–85, with two additional stations available for 1958–2003. These stations encompass the headwaters of the Amu Darya and Syr Darya, two of the main rivers of central Asia and the primary feeders of the catastrophically shrinking Aral Sea. Canonical correlation analysis (CCA) is used to forecast April–August flows based on the period 1950–85; cross-validated correlations exceed 0.5 for 10 of the stations, with a maximum of 0.71. Skill remains high even after 1985 for two stations withheld from the CCA: the correlation for 1986–2002 for the Syr Darya at Chinaz is 0.71, and the correlation for the Amu Darya at Kerki is 0.77. The forecast is also correlated to the normalized difference vegetation index (NDVI); maximum values exceed 0.8 at 8-km resolution, confirming the strong connection between hydrology and growing season vegetation in the region and further validating the forecast methodology.

1. Introduction

The rivers of semiarid central Asia have an important role in local water resources, providing drinking water, hydropower, and irrigation for both subsistence and large-scale agriculture. The region is water stressed (e.g., Oki and Kanae 2006), and drought can have severe societal influences (e.g., Agrawala et al. 2001; Barlow et al. 2006). The two major rivers of the region, the Amu Darya and the Syr Darya, are primary feeders of the Aral Sea. The rivers have their headwaters in the high mountains of the region and cross all the countries of central Asia before feeding into the Aral Sea. Massive irrigation efforts begun during the Soviet era divert most of the downstream water of the Amu Darya and Syr Darya Rivers, resulting in a catastrophic shrinking of the Aral Sea (e.g., Micklin 1988, 2007; Glantz 1999). Advance warning of wet or dry periods provided by seasonal forecasts may aid the water resource decision-making process and agricultural planning in the region and may provide some ability to reduce societal and ecological influences (Schär et al. 2004).

The most important driver of local river flows in central Asia is the runoff from melting snow. Most of the regional precipitation occurs during the cold season, which will be discussed in section 3, and falls primarily as snow in the high mountains. The delay between cold season precipitation and warm season river flows suggests that forecasts of warm season river flows could be made in early spring based on the snow water equivalent present in the snowpack. Unfortunately, observations of snow are scarce in this region. The data problem is further compounded by the extreme terrain of the region, which necessitates a high spatial resolution of measurement to describe the snowpack accurately. Because winter precipitation in the mountains falls primarily as snow, it seems plausible to estimate the snow water equivalent there from precipitation data alone. However, precipitation observations, although more numerous than snow observations, are still quite sparse in the region. The available precipitation estimates are compared for the region in Schiemann et al. (2007, manuscript submitted to Int. J. Climatol.); there is general qualitative agreement, but there are also considerable quantitative differences. The limitations of the in situ observations lead us to consider model-based estimates of precipitation such as reanalysis products. For rivers with large drainage basins, Schär et al. (2004) have shown that the European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis precipitation integrated over the drainage basin can be used to predict river flow accurately at downstream stations. They have shown a correlation between 15-yr ECMWF Re-Analysis (ERA-15) December–April catchment-averaged precipitation and May–September river flow of 0.91 for the Syr Darya at Chinaz, 0.51 for the Amu Darya at Kerki, and 0.59 for the Zeravshan at Dupuli.
The fidelity of the ECMWF reanalysis precipitation is perhaps somewhat surprising because there are few observations in the region to constrain the reanalysis. However, as Schär et al. note, the precipitation of the region is associated with synoptic storms that are well observed over Europe and the Middle East just before entering the region. Also, the distribution of precipitation in the region is largely determined by orographic uplift associated with the high mountains of the region, a process that ERA-15 resolves, at least at the catchment scale.

Here we focus on the information present in regional-scale patterns of climate variability rather than basin averages, both as an avenue for investigating the regional dynamics and as a basis for forecasting. The current analysis addresses two main questions: 1) what are the dynamics of regional-scale influences on winter snowpack and subsequent river flows, and 2) what is the potential for using these regional-scale patterns for river forecasting? Previous research has shown that cold season precipitation in the region is strongly influenced by regional- and large-scale climate variability (Barlow et al. 2002, 2005a, 2007; Tippett et al. 2003, 2005). These scales of variability are reasonably well captured by the available data and, given their demonstrated importance, may provide considerable information about the local variability. Both a simple index, based on the normalized average of all river stations, and canonical correlation analysis (CCA) are used to extract the primary mode of covariability between the regional-scale cold season climate and local warm-season river flows. The climate is represented using data from the National Centers for Environmental Prediction (NCEP) Climate Data Assimilation System (CDAS), which provides data for 1950 to the present and is updated monthly. River flows are represented primarily by data from 24 river flow stations in eastern Uzbekistan and Tajikistan, with very good reporting during the period 1950–85. Additionally, data for two other stations are available for 1958 to 2003. These two stations are not used in the CCA and provide independent verification data as well as information about more recent flow variability. River flows peak during the warm growing season when precipitation has become scarce; an association may also be expected between groundwater and vegetation. 
Vegetation, which is estimated from satellite data for 1982 to the present, may therefore provide further data for evaluating the links to climate variability and is analyzed using the Tucker et al. (2005) normalized difference vegetation index (NDVI).

Box A in the overview map in Fig. 1a denotes the central Asia region where river flows are considered; the individual stations are shown in Fig. 1b. Snowmelt is a primary driver of warm season hydrology in this mountainous region, which encompasses most of the headwaters of two of the main rivers of central Asia, the Amu Darya and Syr Darya. The larger box B in Fig. 1a shows the region in which patterns of associated precipitation variability are analyzed and box C shows the region in which patterns of wind variability are analyzed.

Fig. 1.

The study regions including (a) the analyzed area of river flow variability (inner rectangle), the associated precipitation analyzed over the large box (dotted–dashed), and the upper-level winds over the largest box (dashed); and (b) the individual river stations (see Table 1 for station details).

The structure of this paper is as follows. The data is described in section 2 and a brief review of the climatology of the region is given in section 3. The links between the river flows and the regional- and large-scale climate are examined in section 4 using a normalized average of station streamflows. In section 5, the links are examined using CCA for the period 1950–85, and a forecast scheme based on a CCA regression is evaluated. In section 6, the results are validated for the period 1986–2004, using NDVI and two additional river flow stations withheld from the CCA. Finally, a summary and discussion are given in section 7.

2. Data

a. River discharge data

Monthly river flow data was obtained from the National Center for Atmospheric Research (NCAR) data archive and from a hydrometeorological survey of Uzbekistan. Data from 24 stations were extracted from NCAR dataset ds553.2 for 1950–85, representing the period with best coverage; less than 7% of the monthly data is missing for any of the stations (11 of the stations have no missing data). Station and river names, basin areas, and elevations are given in Table 1; locations are shown in Fig. 1b. Based on the available documentation, these flow data are not corrected for human influence. Data for two additional stations were obtained from the Uzbekistan Hydrometeorological Survey for 1958–2003. These two stations, Kerki on the Amu Darya (missing 1994 data) and Chinaz on the Syr Darya (no missing data), have been corrected for human influence (Schär et al. 2004). The Kerki station (No. 26 in Fig. 1b) has a drainage area of 320 520 km², and the Chinaz station (No. 25 in Fig. 1b) has a drainage area of 166 400 km².

Table 1.

River flow station information.

b. Precipitation data

Monthly gridded observational precipitation at 0.5° × 0.5° resolution is obtained from the New et al. (2000) dataset for the period 1950–2001. The underlying station data is sparsely distributed and irregularly reported, with best coverage in the central Asia region before the breakup of the Soviet Union. Precipitation is also obtained from the NCEP–NCAR reanalysis at 2.5° × 2.5° resolution (Kalnay et al. 1996). The reanalysis precipitation is a product of the reanalysis model and is not constrained by precipitation observations. However, it has been shown to be able to capture regional-scale patterns of variability in the region (Tippett et al. 2003) and is further validated against observations in section 5.

c. Wind and moisture flux data

Monthly 200-hPa wind data is obtained from the NCEP–NCAR reanalysis (Kalnay et al. 1996). Because the spatial scales of wind variability at upper levels are fairly large, this field can be considered to have good reliability (see Kalnay et al. 1996) and has a close association with surface variability in the region (Barlow et al. 2002; Tippett et al. 2003; Barlow et al. 2005a, 2007). Vertically integrated moisture flux is calculated from the daily NCEP–NCAR reanalysis. Although the primary period of analysis is 1950–2003, we note that the reanalysis is operationally updated in real time through the NCEP–NCAR CDAS; therefore, any forecast scheme based on the reanalysis data can be readily and consistently implemented in real time.

d. Vegetation data

The Tucker et al. (2005) NDVI data is used as an estimate of vegetative vigor. The estimate is available at 0.073° × 0.073° (approximately 8 km × 8 km) resolution for 1982 to the present; for convenience in plotting, it is also regridded to 1° × 1° resolution.

3. Climatology

Precipitation in central Asia occurs mainly during winter and early spring (Fig. 2a), with little occurring in the warm months (Fig. 2b). The primary precipitation mechanism in the region is the occurrence of synoptic storms moving in from the west (Martyn 1992), although the northwesternmost extent of the monsoon and warm season processes do make a modest contribution (cf. Figs. 2a and 2b). A comparison of the cold season precipitation (Fig. 2a) with the topography (Fig. 2c) highlights the dominant role of the local mountains in determining the distribution of precipitation. Although the sparseness of station reports limits the accuracy of precipitation estimates in this region, there is good qualitative agreement among different datasets at the level of detail discussed here (Schiemann et al. 2008). In the high mountains, much of the precipitation falls as snow and a significant snowpack accumulates throughout the cold season. The melting of this snowpack is an important driver of regional hydrology, particularly given the relative lack of precipitation during the warm season.

Fig. 2.

(a) Climatological observed November–April precipitation totals over land (contour interval = 15 cm), (b) climatological observed May–October precipitation totals over land (contour interval = 15 cm), (c) elevation (contour interval = 1 km), and (d) April–August vegetation, as estimated by NDVI.

Growing season (April–August) vegetation, as estimated by NDVI, is shown in Fig. 2d. Snowmelt can also be an important factor in vegetative vigor for the many areas that receive little warm season precipitation. [The large values of NDVI along the southern shore of the Caspian Sea have no apparent precipitation source in any season; this is a common failing of gridded precipitation datasets in the region, which often fail to resolve a narrow swath of heavy precipitation along the coast (Domroes et al. 1998).]

The river flow stations (Fig. 1b) are under the influence of the local cold-season precipitation maximum in the Pamir and Tien Shan Mountains and may be expected to be strongly affected by snowpack dynamics. These stations encompass much of the headwaters of the Amu Darya and Syr Darya, which are two of the largest rivers in central Asia and the primary feeders of the Aral Sea. The two rivers are shown in Fig. 3, along with the drainage basins for the Kerki and Chinaz stations, which capture the snowmelt-driven part of the flow. As noted in the introduction, the Aral Sea has been undergoing a dramatic reduction in extent, primarily due to diversions of upstream river water for agriculture. The outline of the Aral Sea in the map background, which is the default outline in many maps, is accurate for the early 1960s. The boundaries as observed in the mid-1990s are drawn as a heavy black line; note that by then the sea had already separated into two bodies, the small Aral Sea to the north and the large Aral Sea to the south. The current shorelines are even more contracted, although the northern water body (the “small Aral Sea”) appears to have stabilized as a result of the construction of a dike (e.g., Micklin 2007).

Fig. 3.

Amu Darya, Syr Darya, and Aral Sea. The Aral Sea boundary in the map background corresponds to ca. 1960; the overlaid dark outline is ca. 1994; the current extent is even smaller (e.g., Micklin 2007).

The average seasonal cycle for the 24 river stations is computed by normalizing the annual cycle of each station by its total annual flow and then averaging over all stations. The average seasonal cycle of river flow, shown in Fig. 4a, has its peak in June. Individual annual cycles of river flow are shown in Fig. 4b. Although there is considerable variability in the month of peak flow, all but one of the stations have their maximum flows during the period April–August, with two-thirds of the annual flow, on average, occurring during these five months. The seasonal cycle of precipitation in Fig. 4a is the average of the observed precipitation over the approximate total drainage area of the river flow stations (box A in Fig. 1a). Most of the precipitation occurs in late winter and early spring, with a maximum in March. The distinct lag of about three months between maximum precipitation and subsequent maximum river flows is clearly visible, reflecting the importance of snowmelt in the annual cycle of the river flows. It is reasonable to expect a similar relation between year-to-year variations of precipitation and river flow. The seasonal cycle of the NDVI, a measure of vegetation health, is also shown in Fig. 4a to further highlight the importance of snowmelt in the region. The seasonality of the NDVI closely follows that of the river flow, with both peaking in June and at their minimum in January. This correspondence, given the lack of precipitation during much of the growing season, strongly suggests a key role for groundwater in the local vegetation (both natural and, through irrigation, agricultural). As with river flows, the seasonal lag between cold season precipitation and growing season vegetation suggests potential predictability.
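
The normalization described above can be illustrated with a minimal Python sketch. The array layout, the function name `mean_seasonal_cycle`, and the two synthetic stations are assumptions made for illustration only; they are not the ds553.2 file format or the actual station data.

```python
import numpy as np

def mean_seasonal_cycle(monthly_flow):
    """Average normalized seasonal cycle over stations.

    monthly_flow: array of shape (n_stations, n_years, 12); NaN marks
    missing months (a hypothetical layout chosen for this sketch).
    """
    # Climatological mean flow for each station and calendar month.
    clim = np.nanmean(monthly_flow, axis=1)            # (n_stations, 12)
    # Divide by each station's annual total so every cycle sums to 1.
    frac = clim / clim.sum(axis=1, keepdims=True)
    # Average the normalized cycles over all stations.
    return frac.mean(axis=0)

# Synthetic example: two made-up stations whose volumes differ by 100x.
shape_a = np.array([1, 1, 2, 4, 8, 10, 8, 5, 3, 2, 1, 1], dtype=float)
shape_b = np.array([2, 2, 2, 3, 6, 9, 8, 6, 3, 2, 2, 2], dtype=float)
flow = np.stack([np.tile(10 * shape_a, (3, 1)),
                 np.tile(1000 * shape_b, (3, 1))])      # (2, 3, 12)

cycle = mean_seasonal_cycle(flow)
peak_month = int(cycle.argmax()) + 1                   # 1 = January
apr_aug_fraction = cycle[3:8].sum()                    # April through August
print(peak_month, round(float(apr_aug_fraction), 3))
```

Normalizing before averaging keeps the largest rivers from dominating the mean cycle, which matters here because the station drainage areas span several orders of magnitude.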

Fig. 4.

The seasonal cycle (for the October–September water year) of (a) observed precipitation (dotted), river flows (crosses), and vegetation (dashed). The averaging area is shown in Fig. 1a and each monthly value is plotted as the percentage of the yearly total for each variable. Vegetation is represented by the NDVI. (b) The seasonal cycle of the individual river flow stations. Also shown are a normalized average of all stations (bold line) and the three easternmost stations (dotted lines).

River flow station information is given in Table 1, including the fraction of total flow that occurs from April to August. Most of the stations experience their seasonal peak during that period and accumulate between 65% and 85% of their total flow. Although the discharges at the different stations are associated with a wide range of elevations (from 0 to 4 km) and drainage areas (152 to 142 000 km²) and comprise multiple river systems, there is a great deal of coherence in the variability. In Table 1, the correlation of each station to the normalized all-station average is also shown for the April–August period. All but three of the stations (3, 4, and 7) are highly correlated with the all-station average, indicating considerable common variability across the stations. The three uncorrelated stations are the easternmost stations and have the latest seasonal maxima, in August or September (dotted lines in Fig. 4b); in general, the stations with a seasonal peak before July share the greatest regional coherence. Given its high correlation with 21 of the 24 rivers, the all-station average provides a simple proxy for regionally coherent river flow variability.
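
A normalized all-station average and its correlation with each station might be computed along the following lines. This is a sketch under stated assumptions: the text does not spell out the normalization, so each station is standardized here to zero mean and unit variance before averaging, and the flows are synthetic (five coherent stations and one unrelated station), not the Table 1 data.

```python
import numpy as np

def all_station_index(seasonal_flow):
    """seasonal_flow: (n_years, n_stations) April-August mean flows."""
    # Assumed normalization: standardize each station so that large
    # rivers do not dominate the average.
    z = (seasonal_flow - seasonal_flow.mean(axis=0)) / seasonal_flow.std(axis=0, ddof=1)
    return z.mean(axis=1)

def correlation_with_index(seasonal_flow):
    """Correlation of each station's series with the all-station index."""
    index = all_station_index(seasonal_flow)
    return np.array([np.corrcoef(index, col)[0, 1] for col in seasonal_flow.T])

# Synthetic flows: five stations share a common signal, one does not.
rng = np.random.default_rng(0)
n_years = 36
common = rng.standard_normal(n_years)
scales = np.array([5.0, 50.0, 500.0, 20.0, 200.0])
coherent = scales * (common[:, None] + 0.3 * rng.standard_normal((n_years, 5))) + 10 * scales
outlier = 100.0 + 10.0 * rng.standard_normal((n_years, 1))
flows = np.hstack([coherent, outlier])

index = all_station_index(flows)
r = correlation_with_index(flows)
print(np.round(r, 2))
```

Because each station contributes to the index it is correlated against, the correlations are slightly inflated; with 24 stations the effect is small, but it is worth keeping in mind when reading a table of such values.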

4. Analysis using the all-station average

As discussed in the previous section, the normalized average of the river flows is a good index of the coherent variability of the stations. Here we use it to examine the links between year-to-year river flow variability and the regional- and large-scale climate during the preceding cold season. The correlation between the all-station average and observed precipitation during the preceding November–March, when the snowpack is generated, is shown in Fig. 5a for the period 1950–85. The New et al. (2000) 0.5° × 0.5° gridded dataset is used as an estimate of observed precipitation. High correlations (maximum of 0.84) occur throughout the region, which includes the river flow stations but also, notably, extends over a much larger domain. Given the sparse distribution of observed precipitation in this region, the high correlations are encouraging and indicate that the available precipitation data during this period are adequate for capturing the climate signals driving river flow variability. The large area of high correlations shows that river flows are being driven by regional-scale climate signals.
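
A lagged correlation map of this kind can be sketched in two steps: form a November–March mean that straddles the calendar year, then correlate the index with every grid point. The function names, array layouts, and random fields below are hypothetical; months are weighted equally, which is an assumption rather than something stated in the text.

```python
import numpy as np

def ndjfm_mean(monthly):
    """November-March mean from calendar-year monthly data.

    monthly: (n_years, 12, ...). Output year k pairs Nov-Dec of year k
    with Jan-Mar of year k + 1, so the result has n_years - 1 entries.
    Months are weighted equally (an assumption of this sketch).
    """
    nov_dec = monthly[:-1, 10:12]
    jan_mar = monthly[1:, 0:3]
    return np.concatenate([nov_dec, jan_mar], axis=1).mean(axis=1)

def correlation_map(index, field):
    """Pearson correlation of index (n_years,) with field (n_years, n_lat, n_lon)."""
    ia = index - index.mean()
    fa = field - field.mean(axis=0)
    cov = np.tensordot(ia, fa, axes=(0, 0))
    return cov / np.sqrt((ia ** 2).sum() * (fa ** 2).sum(axis=0))

# Check the seasonal mean on a simple ramp of monthly values.
monthly = np.arange(3 * 12, dtype=float).reshape(3, 12)
cold = ndjfm_mean(monthly)

# Synthetic field with two grid points tied exactly to the index.
rng = np.random.default_rng(1)
flow_index = rng.standard_normal(36)
precip = rng.standard_normal((36, 4, 5))
precip[:, 0, 0] = 2.0 * flow_index + 5.0
precip[:, 0, 1] = -flow_index
r_map = correlation_map(flow_index, precip)
print(r_map.shape, cold)
```

In practice the cold season series would be aligned so that each November–March mean is paired with the April–August flow that follows it.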

Fig. 5.

Correlations between the April–August average river flow and precipitation during the preceding November–March season are shown for (a) observed precipitation (maximum correlation = 0.84) and (b) reanalysis model precipitation (maximum correlation = 0.73). Contour interval is 0.1, with dark shading for values ≥0.3 and light shading for values ≤−0.3.

The correlation between the all-station average and the model-based precipitation estimate from the NCEP–NCAR reanalysis is shown in Fig. 5b. Although the reanalysis precipitation product is not based on observed precipitation, it is produced consistently throughout the entire reanalysis period with fixed model dynamics and is operationally updated. In contrast, available observed estimates have dramatically varying coverage throughout the historical record. The few observational estimates that are operationally updated almost completely lack station data in the region during the recent and ongoing period and rely primarily on satellite estimates. The correlations between the all-station average and the model precipitation are still relatively high (maximum of 0.73) and show overall good correspondence with the observed precipitation correlations. The model precipitation, thus, captures much of the information present in the observed precipitation that is relevant for river flow. Schär et al. (2004) suggested that the perhaps surprising accuracy of the reanalysis precipitation estimate may be because the storms that generate the precipitation are well observed in Europe and the Mediterranean area before they enter the region, combined with the strong influence of the high mountains of the region in determining the location of precipitation. The fidelity of the model precipitation means that a consistent estimate of precipitation is available operationally, even though the observational network is currently very sparse in the region.

Previous research has shown that tropical Pacific variability can influence the precipitation of the region (Barlow et al. 2002; Tippett et al. 2003). To investigate the large-scale linkages in the present case, the correlation of sea surface temperatures (SSTs) with the all-station average is shown in Fig. 6a. As in the aforementioned studies, the correlation with SST has a pattern that is similar to that of the El Niño–Southern Oscillation (ENSO) but with a relatively stronger signal in the central rather than eastern Pacific. The correlation pattern is also similar to that of the correlation between SST and the Niño-4 index (not shown). Niño-4, the average of SST anomalies in the region (5°S–5°N, 160°E–150°W), is an ENSO index that focuses on the central Pacific. In the current case, high correlations are also observed in the eastern Indian Ocean. The correlation between upper-level winds and the all-station average is shown in Fig. 6b, in terms of 200-hPa streamfunction (the rotational component of the winds). An ENSO-like pattern is also apparent in the winds, with a pair of anticyclonic circulations straddling the equator in the eastern Pacific. At this upper level, the streamfunction correlation has maxima over the tropical Pacific and Atlantic Oceans (and even the Southern Ocean) that are as large as the maximum just upstream of central Asia, emphasizing the influence of global climate variability on the central Asia river flows. This upper-level wind pattern emphasizes the global response to the shifted ENSO SST pattern, consistent with the modeling results of Hoerling and Kumar (2003). Although previous research has emphasized the direct influence of the western Pacific and eastern Indian Oceans on winds over the region (Barlow et al. 2002, 2005a, 2007), here the swath of highest streamfunction correlations is upstream (to the west), connecting to the Pacific by way of the Middle East and Atlantic rather than directly from the western Pacific.
This suggests the possibility that, in addition to direct influence from the western Pacific, the region may also be influenced by the tropical Pacific via extratropical Rossby wave propagation over North America, the Atlantic, and along the North African jet, consistent with the work of Shaman and Tziperman (2005). Although the precipitation and large-scale SST patterns are broadly similar to the analysis of Barlow et al. (2002, 2005a), the equivalent barotropic nature of the wind anomalies in the present case is suggestive of remote forcing, contrasting with the baroclinic wind anomalies found in the previous analyses. The precipitation anomalies analyzed here are also somewhat north of the anomalies in Barlow et al. (2002, 2005a, 2007), the SST signal in the western Pacific is not as pronounced, and the connection to the eastern Indian Ocean is stronger. The dynamical significance of these differences remains to be fully explored.
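
For reference, the Niño-4 index mentioned above is an area average of SST anomalies over 5°S–5°N, 160°E–150°W. A minimal sketch follows, assuming a regular latitude–longitude grid with longitudes in degrees east (0–360), anomalies already computed against a climatology of the reader's choice, and cosine-latitude area weights; the helper names and the synthetic field are hypothetical.

```python
import numpy as np

def nino4_masks(lat, lon):
    """Boolean masks for 5S-5N and 160E-150W (150W = 210E)."""
    return (lat >= -5) & (lat <= 5), (lon >= 160) & (lon <= 210)

def nino4_index(sst_anom, lat, lon):
    """Area-weighted mean SST anomaly in the Nino-4 box.

    sst_anom: (n_time, n_lat, n_lon); lat in degrees north,
    lon in degrees east on a 0-360 convention.
    """
    in_lat, in_lon = nino4_masks(lat, lon)
    box = sst_anom[:, in_lat][:, :, in_lon]
    # Weight by cos(latitude) to account for converging meridians.
    w = np.cos(np.deg2rad(lat[in_lat]))[:, None] * np.ones(in_lon.sum())
    return (box * w).sum(axis=(1, 2)) / w.sum()

# Synthetic anomalies: uniform inside the box, a different value outside.
lat = np.arange(-88.0, 90.0, 2.0)
lon = np.arange(0.0, 360.0, 2.0)
sst = np.full((2, lat.size, lon.size), 9.0)
in_lat, in_lon = nino4_masks(lat, lon)
sst[0][np.ix_(in_lat, in_lon)] = 1.5
sst[1][np.ix_(in_lat, in_lon)] = -0.5

nino4 = nino4_index(sst, lat, lon)
print(nino4)
```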

Fig. 6.

Correlations between the April–August average river flows and preceding November–March (a) SSTs and (b) 200-hPa streamfunction (rotational component of the wind). Contour interval is 0.1, with dark shading for values ≥0.3 and light shading for values ≤−0.3.

To further investigate the regional link between the precipitation and the jet-level winds, we have examined the vertically integrated moisture transport, which can be an important link between local precipitation in the region and larger-scale variability (Mariotti 2007). The jet-level wind anomalies seen in Fig. 6 extend in a quasi-barotropic manner to the surface, consistent with remote forcing via extratropical Rossby wave activity; that is, the tropospheric winds at all levels have anomalies similar in direction to those aloft, and the jet-level winds, therefore, are in this case an indicator of changes in moisture flux. The climatological vertically integrated moisture flux is shown in Fig. 7a. The climatological moisture flux is from the west, and several water bodies supply moisture to central Asia, including the Mediterranean Sea, the Persian Gulf, and the Caspian Sea. The correlation of cold season vertically integrated moisture flux to the subsequent warm season average river flow is shown in Fig. 7b (maximum value is 0.79). The variability in moisture transport is closely aligned with the climatological transport, either directly enhancing or opposing it. Given the importance of moisture availability for precipitation in this semiarid region, these changes to moisture transport provide a straightforward link between the jet-level wind anomalies and the precipitation anomalies, with a key aspect being the quasi-barotropic nature of the wind anomalies.
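
The text does not give the formula or pressure levels used for the vertical integration. A common definition is the pressure integral of specific humidity times wind divided by gravity, Q = (1/g) ∫ q v dp. The sketch below applies that definition with a trapezoidal rule; the levels, the function name, and the constant test profile are assumptions for illustration.

```python
import numpy as np

G = 9.81  # gravitational acceleration, m s^-2

def vertically_integrated_flux(q, u, v, p):
    """Vertically integrated moisture flux, (1/g) * integral of q*wind dp.

    q: specific humidity (kg/kg); u, v: winds (m/s); all shaped (n_lev, ...).
    p: pressure levels in Pa, shape (n_lev,), in any order.
    Returns (Qu, Qv) in kg m^-1 s^-1.
    """
    order = np.argsort(p)                        # integrate from top down
    p, q, u, v = p[order], q[order], u[order], v[order]
    # Layer thicknesses, broadcast over any trailing spatial dimensions.
    dp = np.diff(p).reshape((-1,) + (1,) * (q.ndim - 1))

    def integrate(f):
        # Trapezoidal rule over pressure.
        return (0.5 * (f[1:] + f[:-1]) * dp).sum(axis=0) / G

    return integrate(q * u), integrate(q * v)

# Constant profile on assumed levels (1000 to 300 hPa) as a sanity check.
p_levels = np.array([1000, 925, 850, 700, 600, 500, 400, 300]) * 100.0
q = np.full((p_levels.size, 2, 3), 0.005)
u = np.full_like(q, 10.0)
v = np.zeros_like(q)

qu, qv = vertically_integrated_flux(q, u, v, p_levels)
expected = 0.005 * 10.0 * (100000.0 - 30000.0) / G
print(qu[0, 0], expected)
```

Applying the integral to daily fields and then averaging, as the use of daily reanalysis in section 2c suggests, retains the contribution of transient storms that a product of monthly means would miss.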

Fig. 7.

(a) November–March climatological vertically integrated moisture flux and (b) correlation of April–August river flows to the preceding November–March vertically integrated moisture flux. The correlations are shown both as vectors of the zonal and meridional correlations and as contours of the magnitude of the vector correlation. Contours are shown at 0.3 and 0.6 (maximum correlation 0.79), with vectors only shown for correlations with a magnitude greater than 0.3.

Thus, analysis based on an average river flow index shows that the variability of the river flow stations is not only locally coherent among the various stations but is also linked to regional-scale precipitation, wind, and moisture transport variability as well as to global-scale teleconnections centered on the tropical Pacific Ocean.

5. Canonical correlation analysis of river flow linkages to regional-scale climate

Examination of the all-station average shows a strong link between river flows and regional- and large-scale climate variability. To examine this link more closely, CCA is used to extract patterns of covariability between cold season (November–March) climate variables and subsequent warm season (April–August) river flows. CCA is a technique for identifying patterns of maximum correlation between two datasets (e.g., Wilks 2006). Here, CCA is used to find the linear combinations of winter climate variables (reanalysis precipitation and upper-level zonal winds) and of streamflow station data whose time series are maximally correlated. Upper-level zonal winds are included because the analysis of the previous section, as well as the analyses of Barlow et al. (2002, 2005a, 2007) and Tippett et al. (2003), all show strong connections between the upper-level wind field and precipitation in the region. Additionally, inclusion of the upper-level wind provides a variable that is relatively well captured, even by the sparse local observations, because of the larger spatial scales of upper-level wind variability and the good upstream sampling. The domains for precipitation and 200-hPa zonal wind are shown in boxes B and C, respectively, of Fig. 1a and have been shown to effectively capture the regional cold season climate variability (Tippett et al. 2003). Correlation empirical orthogonal function (EOF) prefiltering is used to reduce the number of degrees of freedom in the analysis and avoid overfitting. Cross validation (e.g., Michaelsen 1987) shows that there is a single robust CCA mode based on 19 correlation EOFs of the climate data and 4 correlation EOFs of the streamflow data; the relatively large number of climate data EOFs reflects the flatness of the spectrum of the multivariate (precipitation and zonal wind) correlation EOFs. The CCA patterns are shown in Fig. 8.
As in the correlations with the all-station average, the CCA mode reflects a regional-scale pattern of increased winter precipitation (Fig. 8b) and a northward shift in the maximum of the jet-level winds (Fig. 8a). The associated April–August streamflow CCA mode (Fig. 8c) explains 61% of the variance and shows increased flows at all the river flow stations, with the weakest relationship at stations 3, 4, and 7, as before. The correlation between the time series (Fig. 8d) of the April–August streamflow CCA mode and the all-station average is 0.94, showing that the CCA is capturing the dominant mode of coherent variability among the stations. The patterns of covariability can occur with either sign, so that both river flow surpluses and deficits are associated with this mode of variability. Correlations to global SSTs and 200-hPA streamfunction are quite similar to Fig. 6 and, therefore, are not shown. The similarity of the CCA results to the simple analysis of the previous section gives confidence that the CCA has identified a physically meaningful mode of variability.
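
For readers who wish to experiment with the method, the EOF-prefiltered CCA described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions and is not the code used in the study: the function names (`standardize`, `leading_pcs`, `cca_leading_mode`), the synthetic data, and the truncation of 5 climate EOFs are all hypothetical, whereas the truncations of 19 and 4 reported here were selected by cross validation on the actual data.

```python
import numpy as np

def standardize(a):
    """Remove the time mean and scale each column to unit variance."""
    a = a - a.mean(axis=0)
    return a / a.std(axis=0, ddof=1)

def leading_pcs(a, n_modes):
    """Correlation-EOF prefilter: unit-variance PCs of the standardized data."""
    u, _, _ = np.linalg.svd(standardize(a), full_matrices=False)
    return u[:, :n_modes] * np.sqrt(a.shape[0] - 1)

def cca_leading_mode(x, y, nx, ny):
    """Leading canonical correlation and canonical variates after EOF truncation."""
    px, py = leading_pcs(x, nx), leading_pcs(y, ny)
    n = x.shape[0]
    # The PCs are whitened, so CCA reduces to an SVD of their cross-covariance.
    c = px.T @ py / (n - 1)
    a, rho, bt = np.linalg.svd(c, full_matrices=False)
    return rho[0], px @ a[:, 0], py @ bt[0]

# Synthetic stand-ins (assumptions): 36 years, 60 climate grid points, 24 stations.
rng = np.random.default_rng(0)
n_years = 36
signal = rng.standard_normal(n_years)
climate = np.outer(signal, rng.standard_normal(60)) + rng.standard_normal((n_years, 60))
flows = np.outer(signal, rng.standard_normal(24)) + rng.standard_normal((n_years, 24))

rho, u, v = cca_leading_mode(climate, flows, nx=5, ny=4)

# Station pattern: correlation of each station with the climate canonical variate.
loadings = np.array([np.corrcoef(u, flows[:, j])[0, 1] for j in range(flows.shape[1])])
```

Because the retained principal components have unit variance and are mutually uncorrelated, the singular values of their cross-covariance matrix are the canonical correlations, which is why no explicit matrix inversion appears in the sketch.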

Fig. 8.

Correlation patterns and time series from the leading CCA between cold season climate variability and subsequent warm season river flows with (a) November–March 200-hPa zonal wind, (b) November–March reanalysis model precipitation, (c) April–August river flows, and (d) time series. In (a) and (b), the contour interval is 0.1, with dark shading for values ≥0.3 and light shading for values ≤−0.3; in (c) the correlation magnitude is indicated by the diameter of the circles (maximum correlation = 0.71), with correlations <0.3 indicated by a plus sign.

The values of the correlations between the time series of the leading November–March CCA climate mode and the individual April–August river flows are given in the leftmost column of Table 2 (these correlations comprise the river flow pattern of the CCA). The average correlation is 0.67, and all the correlations are statistically significant except for those of stations 4 and 7. These correlations are computed with the same data used to calculate the CCA. As in regression, we expect the in-sample correlations to be higher than ones computed with independent data. We use cross validation to assess the level of correlation on independent data, as follows: one year is withheld from all the CCA calculations, including climatologies and correlation EOFs, then the CCA relationship is used to predict the river flow of the withheld year from the climate of the previous winter. This process is repeated, withholding each year of the data in turn, producing forecasts that are applied to independent data (the serial correlation is modest in the present case). The cross-validated correlation is then computed from these cross-validated predictions. This procedure mimics a forecast procedure and provides an estimate of the operational level of skill in predicting April–August river flows from November–March climate. The correlations of the cross-validated time series with the individual stations are given in the rightmost column of Table 2 and remain relatively high, with 10 of the stations correlated at 0.5 or higher and a maximum correlation of 0.71 (50% of the variance).
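
The leave-one-year-out procedure can be illustrated with the following sketch. It is an assumption-laden simplification rather than the study's implementation: to keep the example short, a least-squares regression on the leading correlation-EOF principal components stands in for the CCA step, the data are synthetic, and the names `loo_forecasts` and `columnwise_corr` are hypothetical. The essential feature it does reproduce is that the climatology, the scaling, and the EOFs are all recomputed without the withheld year.

```python
import numpy as np

def loo_forecasts(x, y, n_modes):
    """Leave-one-year-out forecasts of y (years x stations) from x (years x grid)."""
    n = x.shape[0]
    pred = np.empty_like(y, dtype=float)
    for k in range(n):
        train = np.arange(n) != k
        # Climatology and scaling use the training years only.
        xm, xs = x[train].mean(axis=0), x[train].std(axis=0, ddof=1)
        ym = y[train].mean(axis=0)
        z = (x[train] - xm) / xs
        # Correlation EOFs are also recomputed without the withheld year.
        _, _, vt = np.linalg.svd(z, full_matrices=False)
        eofs = vt[:n_modes].T
        pcs = z @ eofs
        # Stand-in for the CCA step (assumption): regression on the leading PCs.
        coef, *_ = np.linalg.lstsq(pcs, y[train] - ym, rcond=None)
        # Project the withheld year onto the training EOFs and forecast it.
        pred[k] = ((x[k] - xm) / xs) @ eofs @ coef + ym
    return pred

def columnwise_corr(a, b):
    """Correlation between matching columns of a and b."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    return (a * b).sum(axis=0) / np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))

# Synthetic stand-ins (assumptions): 36 years, 60 grid points, 24 stations.
rng = np.random.default_rng(0)
n_years, n_grid, n_stations = 36, 60, 24
signal = rng.standard_normal(n_years)
climate = np.outer(signal, rng.standard_normal(n_grid)) \
    + rng.standard_normal((n_years, n_grid))
flows = np.outer(signal, 1.0 + 0.2 * rng.standard_normal(n_stations)) \
    + 0.5 * rng.standard_normal((n_years, n_stations))

cv_pred = loo_forecasts(climate, flows, n_modes=3)
cv_skill = columnwise_corr(cv_pred, flows)  # one cross-validated correlation per station
```

Recomputing every statistic inside the loop is what keeps information from the withheld year out of its own forecast; computing the climatology or EOFs once from all years would inflate the apparent skill.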

Table 2.

CCA correlations and estimates of skill and bias. Asterisks indicate significance at the 99% level, and bold indicates a correlation of 0.5 or more.

Although forecast skill is most clearly assessed in the context of a specific application, we also provide in Table 2 several additional measures of skill and bias calculated from the cross-validated forecasts for a more general assessment: mean absolute percentage error (MAPE), root-mean-square error (RMSE), and two tests of normality (Lilliefors 1967; Jarque and Bera 1980). Additionally, scatterplots of the relationship between the observed values and the cross-validated forecasts for each of the 24 stations are shown in Fig. 9, ordered from top left to bottom right on the basis of the strength of the correlation. The x axis represents observed values, whereas the y axis represents forecast values. Both axes have the same range, so that a perfect forecast will lie along the diagonal. Note that in a regression-based forecast, the standard deviation of the forecast is proportional to the correlation between forecast and observed values. Therefore, the forecasts will underpredict large observed values and overpredict small observed values, with the bias increasing as skill decreases. As the skill approaches zero, as in the last three stations in the figure, the forecasts become approximately constant at the mean value, resulting in values along a horizontal line (always predicting the mean). Forecasts for the station with the largest drainage basin, Bekabad (station 22), show considerable overprediction for very low flows (note that the axis scale goes to 0.0). Poor prediction of low flows at this station is likely due to the confounding effects of large water withdrawals for agriculture.
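
The two error measures, and the damping of forecast amplitude noted above, can be illustrated with a short sketch. The flow values and the predictor are synthetic and hypothetical, MAPE is expressed as a fraction of the observed value, and the normality tests are omitted for brevity. For an in-sample least-squares regression, the ratio of the forecast standard deviation to the observed standard deviation equals the forecast–observation correlation, which is the property responsible for the underprediction of large values and overprediction of small values.

```python
import numpy as np

def mape(obs, fcst):
    """Mean absolute percentage error, as a fraction of the observed value."""
    return np.mean(np.abs((fcst - obs) / obs))

def rmse(obs, fcst):
    """Root-mean-square error in the units of the data."""
    return np.sqrt(np.mean((fcst - obs) ** 2))

# Hypothetical predictor and flows for 36 years (assumptions, not station data).
rng = np.random.default_rng(1)
predictor = rng.standard_normal(36)
obs = 100.0 + 20.0 * (0.7 * predictor + 0.7 * rng.standard_normal(36))

# Least-squares regression of the observed flow on the predictor.
slope, intercept = np.polyfit(predictor, obs, 1)
fcst = slope * predictor + intercept

r = np.corrcoef(fcst, obs)[0, 1]
ratio = fcst.std(ddof=1) / obs.std(ddof=1)  # equals r for in-sample least squares

print(f"r = {r:.2f}, std ratio = {ratio:.2f}, "
      f"MAPE = {mape(obs, fcst):.2f}, RMSE = {rmse(obs, fcst):.1f}")
```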

Fig. 9.

Scatterplots of the cross-validated forecasts for the 24 CCA stations are shown ordered by correlation value, with observed values along the x axis and forecast values along the y axis. Both axes have the same scale, so that perfect forecasts would fall along the diagonal as indicated by the black line.

Given the direct physical link between winter precipitation and snowmelt-driven river flow, relatively high correlations between the two are expected. However, it is striking that even in the absence of accurate measurements of snow water equivalent, these high correlations can be obtained with relatively low-resolution (2.5° × 2.5°) model-based precipitation estimates. These high correlations demonstrate both the important role that regional-scale variability plays in the region and the practical utility of considering these regional-scale patterns when dealing with scarce observational data. It is also remarkable that information from regional-scale patterns is sufficient for capturing a great deal of the variability at individual stations. This may be partly because the river drainage basins act as a natural spatial integrator of regional climate. However, several of the highly correlated stations are associated with small drainage basins, suggesting that spatial integration is not the only factor; note that the highest cross-validated correlation is associated with the third-smallest drainage basin.

CCA demonstrates, therefore, that regional-scale cold season climate variability, as represented in the operationally available NCEP–NCAR reanalysis, is a good basis for skillful forecasts of river flows during the subsequent warm season at individual stations representing a wide range of flow rates, drainage areas, and elevations.

6. Verification with independent river flows and NDVI in the post-1985 period

The CCA described in the previous section is limited to the period 1950–85 as a result of data availability for the region in the NCAR river flow dataset. However, the NCEP–NCAR atmospheric data is operationally updated and can be used to produce river flow forecasts for 1950 to the present.

The two additional stations, withheld from the CCA and available for 1958–2003, provide an excellent opportunity to further evaluate the forecast methodology. These stations, Kerki and Chinaz, represent the two major river systems of the region, the Amu Darya and Syr Darya, and have been corrected for human influence. The stations were not included in the CCA (although the Bekabad station on the Syr Darya was included) and are available for 18 yr after the original analysis period. They, therefore, provide independent data with which to test the robustness of the CCA approach, the stability of the relationships in the more recent period, and the relevance to stations not in the original analysis. Forecasts for the two stations were made for 1986–2003, with the regression parameters for the forecast calculated using only data before 1986, to provide the most stringent test of the method; that is, these forecasts are exactly what would have been made in real time, without any explicit or implicit inclusion of future information. For the period 1986–2003, after the CCA period, the forecast correlation is 0.77 for the Amu Darya and 0.71 for the Syr Darya. Figure 10 shows the forecast and observed anomalies of the Amu Darya and Syr Darya, with the associated percentage error (including the forecast of the mean). MAPE is 0.17 (17%) for both stations.
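
The structure of this independent test can be sketched as follows. All values are synthetic and hypothetical: `climate_index` stands in for the time series of the CCA climate pattern (which in the study is itself defined from pre-1986 data only), and `flow` stands in for a station record. The point of the sketch is only that the regression slope and the climatological mean are estimated from the years before 1986 and then applied unchanged to 1986–2003.

```python
import numpy as np

# Hypothetical series for 1958-2003 (assumptions, not station data).
rng = np.random.default_rng(2)
years = np.arange(1958, 2004)
climate_index = rng.standard_normal(years.size)
flow = 1500.0 + 300.0 * climate_index + 150.0 * rng.standard_normal(years.size)

train = years < 1986   # parameters are set here
test = ~train          # forecasts are verified here

# Regression slope and climatological means from the training years only.
x_mean = climate_index[train].mean()
y_mean = flow[train].mean()
xa = climate_index[train] - x_mean
slope = (xa @ (flow[train] - y_mean)) / (xa @ xa)

# Forecast anomalies, and total flow including the forecast of the mean.
fcst_anom = slope * (climate_index[test] - x_mean)
obs_anom = flow[test] - y_mean
fcst_total = y_mean + fcst_anom

skill = np.corrcoef(fcst_anom, obs_anom)[0, 1]
pct_error = 100.0 * (fcst_total - flow[test]) / flow[test]
mape = np.mean(np.abs(pct_error)) / 100.0
```

Because the training-period mean is used for the forecast of total flow, any change in the mean between the two periods appears directly as a bias in the percentage error.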

Fig. 10.

The observed and forecast anomalies are shown for the (a) Amu Darya at Kerki and (c) Syr Darya at Chinaz for the independent 1986–2003 period, and (b), (d) the associated percent error (for total flow).

Figure 11 shows two additional perspectives on the forecasts. Scatterplots are used to compare forecasts to observations for both stations in Figs. 11a and 11b. Forecasts for both stations show a modest overall negative bias as a result of small increases in the mean at both stations between the period 1958–1985 when the forecast parameters are set and the period 1986–2003 when the forecasts are made. In general, both sets of forecasts show good skill, even though only data prior to 1986 was used to identify the climate patterns and river regression parameters. Figures 11c and 11d show the observed annual cycle (October–September) for the three highest and three lowest forecasts, respectively, for each station. In both cases, good separation is seen during the peak flow period between the observed values during forecast high years and the observed values during forecast low years.

Fig. 11.

The 1986–2003 forecasts for the (a) Amu Darya at Kerki and (b) Syr Darya at Chinaz are shown as scatterplots. The x axis is observed values, the y axis is forecast values, and the diagonal line represents a perfect forecast.

The good skill for the post-CCA period 1986–2003, for two stations not used in the CCA, demonstrates that both the regional-scale patterns of variability and the CCA forecasting approach are robust and stable, even through the recent period.

A comparison with the results of Schär et al. (2004), who calculated correlations with these same two stations based on ERA-15 reanalysis precipitation accumulated within the two river basins, shows that the two approaches yield complementary information: Schär et al.'s basin-accumulation method gives higher correlations for the Syr Darya basin, whereas the regional pattern–based approach used here gives higher correlations for the Amu Darya basin. The current approach also yields high correlations for several other rivers in the region, including some with small drainage basins.

Although a full examination of the role of snowmelt in the growth of warm season vegetation is beyond the scope of this analysis, the annual cycles of precipitation, river flows, and vegetation in Fig. 4a also suggest the possibility of predicting vegetation. Figure 12 shows the correlation of growing season (April–August) NDVI to the hydrologic prediction time series that was previously discussed for the period 1982–2003. In Fig. 12a, the correlation to NDVI is shown at 1° × 1° resolution for clarity. The correlation is also shown at maximum resolution (approximately 8 km × 8 km) in Fig. 12b, shaded at correlation levels 0.3, 0.5, and 0.7. Although the prediction series is trained on the patterns of importance to the river flows, correlations to vegetation exceed 0.7 at 1° resolution and exceed 0.8 at 8-km resolution. Preliminary results suggest that a forecast directly tailored to vegetation would likely result in considerably larger areas of high skill. Water for irrigation is closely limited by the amount of natural flow in many parts of the region, so the correlation of the snowmelt-based forecasts with vegetation may be of relevance to agricultural management.
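
A correlation map of this kind, relating a single forecast time series to a gridded field, can be computed in vectorized form as sketched below. The function name `correlation_map`, the grid size, and the synthetic NDVI anomalies are hypothetical and serve only to illustrate the calculation; no NDVI data are reproduced here.

```python
import numpy as np

def correlation_map(series, field):
    """Correlate a time series (nt,) with a gridded field (nt, ny, nx)."""
    s = series - series.mean()
    f = field - field.mean(axis=0)
    cov = np.tensordot(s, f, axes=(0, 0))                   # sum over time
    denom = np.sqrt((s ** 2).sum() * (f ** 2).sum(axis=0))
    return cov / denom

# Synthetic stand-ins (assumptions): 22 growing seasons on a 10 x 12 grid.
rng = np.random.default_rng(3)
n_years = 22
forecast = rng.standard_normal(n_years)           # hypothetical forecast series
ndvi = rng.standard_normal((n_years, 10, 12))     # hypothetical NDVI anomalies
ndvi[:, 2:5, 3:8] += 2.0 * forecast[:, None, None]  # embed a forecast-related signal

rmap = correlation_map(forecast, ndvi)
```

The same function applies unchanged at either resolution discussed above; only the dimensions of the input field differ.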

Fig. 12.

Correlation of the CCA forecast to April–August vegetation for 1982–2003 using NDVI as a measure of vegetative vigor. (a) The correlation is shown at 1° × 1° resolution for clarity with a contour interval of 0.1, dark shading for values ≥0.3, and light shading for values ≤−0.3. The maximum correlation is 0.73. (b) The correlation is shown at 0.073° × 0.073° (approximately 8 km) resolution and shaded at values 0.3, 0.5, and 0.7; the maximum correlation is 0.84.

7. Summary and discussion

Despite large differences in drainage area, elevation, and average discharge, river flows in the region show a great deal of coherent variability during the warm season (April–August) when peak flows occur. The warm season flows are linked to climate variability in the preceding cold season by the importance of snowmelt, and both a simple all-station average and a more complex canonical correlation analysis (CCA) approach show that regional and global-scale teleconnections play an important role in the cold season variability. The cold season precipitation pattern consists of a regional swath of same-signed anomalies, considerably larger in extent than the cumulative drainage basins of the river flows. Accompanying this pattern are regional changes in upper-level winds, enhancing the jet stream over the region of precipitation and strengthening the flow into the high mountains of the region. An analysis of the associated changes in moisture flux shows that the changes in winds strongly influence the transport of moisture into the region. The regional-scale variability is also linked to the tropical Pacific SSTs, with an El Niño–like pattern similar to those identified in previous studies of the region (Barlow et al. 2002; Tippett et al. 2003).

Although the importance of snowmelt suggests considerable predictability for the warm season river flows based on the state of the snowpack just prior to melt, sufficiently accurate observations of the snowpack are not available. However, the importance of regional scales of climate variability, which are adequately resolved in the operationally available NCEP–NCAR reanalysis data, in determining the snowpack allows for an alternate approach to forecasting. Using the patterns of cold season precipitation and upper-level wind to predict subsequent warm season river flow provides skillful forecasts (cross-validated correlations exceed 0.5 for 10 of the 24 stations, with a maximum of 0.71). The region has very complex terrain and steep mountains, and the success of this approach shows the primary importance of regional-scale variability even to individual stations and suggests that this approach may prove useful in similar regions where the available real-time data is sparse. The forecast was further validated by considering both a period and target stations that were not included in the original CCA. Using the patterns extracted from the period 1950–1985, a forecast was also made for the period 1986–2003 and applied to two additional river flow stations that were withheld from the original analysis and have data for the recent period. The two stations, the Amu Darya at Kerki and the Syr Darya at Chinaz, have forecast correlations of 0.77 and 0.71, respectively. This validation demonstrates the stability of the patterns even in the recent period and relevance even for stations not included in the analysis. Because the climate data that serves as the basis for the forecast comes from the NCEP–NCAR reanalysis, which is available in real time, these forecasts can be produced operationally with no changes.

Note that the CCA approach has a wide range of applicability to river flows in the region: it yields high correlations not only for downstream stations with large drainage basins but also for stations with small drainage basins and modest flows (of the 24 CCA stations, the highest cross-validated correlation is obtained for the third-smallest basin) and could easily be applied to additional stations as more data becomes available. Several refinements, such as excluding the easternmost stations, considering a subset of the stations based on geographic or seasonality considerations, or reestablishing parts of the observational network so that a consistent set of real-time precipitation measurements could replace the reanalysis precipitation product, would likely further improve the forecast skill.

The relevance of the hydrologic forecast to vegetation, as estimated by NDVI, was also examined, both to further validate the stability of the forecast in the recent period and to investigate the links between hydrology and vegetation in the region. The importance of snowmelt and the lack of warm season precipitation suggest that regional vegetation should be linked with the hydrologic variability, and this is indeed the case. Maximum correlations of the hydrologic forecast to NDVI exceed 0.8 at 8-km resolution.

The demonstrated link to the tropical Pacific has been used to make seasonal forecasts of the cold season precipitation in the region (Tippett et al. 2003, 2005), which suggests that the warm season river flows considered here also likely have some predictability at longer lead times of 3–6 months. The potential predictability of vegetation in the region also merits further scrutiny.

This region has not been well studied in the international literature; a number of questions remain (see also Barlow et al. 2005b), which include the factors determining the late winter/early spring maximum in precipitation, the links between synoptic activity and regional-scale flow, the details of the interaction of synoptic activity with the extreme terrain of the region, the relative importance of the local water bodies in providing moisture and the implications for synoptic tracks and strength, and the underlying mechanisms of the links to tropical climate variability. Of particular interest are the effects of regional warming trends on the timing of the spring melt and on glacier mass balance, which may have profound effects on water availability and river flows in the warm season. Data availability also remains a key issue in this region. Although the approach here, which takes advantage of the importance of regional scales to alleviate data scarcity issues, is successful, historical data recovery efforts are still a primary concern. Although the standard global datasets have sparse coverage of the region, there is considerably more information available from individual countries in the region (Barlow et al. 2005b). A concerted effort to retrieve, process, and collate this information would have large benefits for all concerned, particularly as a result of the strong regional connections in the variability. Finally, we note that this is a water-stressed region with high societal vulnerability, and operational predictability could mitigate the effects of climate variability on agriculture and water resources and, perhaps, contribute to improved management of Aral Sea problems.

Acknowledgments

MAB was supported by National Science Foundation (NSF) Grant ATM 0603555, and MKT was supported by Cooperative Agreement NA05OAR4311004 from the National Oceanic and Atmospheric Administration (NOAA). The views expressed herein are those of the authors and do not necessarily reflect the views of NOAA or any of its subagencies. We thank Christoph Schär, Reinhard Schiemann, Mariya Glazirina, the Uzbekistan hydrometeorological survey, Ranga Myneni, Bruce Anderson, Ping Zhang, Mike Bell, Benno Blumenthal, John Del Corral, Alexey Kaplan, Heidi Cullen, Annarita Mariotti, and three anonymous reviewers for their useful comments, help in accessing data, and help in data processing. Some of the river flow data were provided by the Data Support section of the Scientific Computing Division at the National Center for Atmospheric Research (NCAR). NCAR is supported by grants from the NSF. The IRI/LDEO Climate Data Library and the GrADS program were also instrumental in this analysis.

REFERENCES

Agrawala, S., M. Barlow, H. Cullen, and B. Lyon, 2001: The drought and humanitarian crisis in Central and Southwest Asia: A climate perspective. IRI Special Report 01-11, 24 pp.
Barlow, M., H. Cullen, and B. Lyon, 2002: Drought in central and southwest Asia: La Niña, the warm pool, and Indian Ocean precipitation. J. Climate, 15, 697–700.
Barlow, M., M. Wheeler, B. Lyon, and H. Cullen, 2005a: Modulation of daily precipitation over southwest Asia by the Madden–Julian oscillation. Mon. Wea. Rev., 133, 3579–3594.
Barlow, M., D. Salstein, and H. Cullen, 2005b: Hydrologic extremes in central-southwest Asia. Eos, Trans. Amer. Geophys. Union, 86, 23, doi:10.1029/2005EO230003.
Barlow, M., H. Cullen, B. Lyon, and O. Wilhelmi, 2006: Drought disaster in Asia. Natural Disaster Hotspots Case Studies, M. Arnold et al., Eds., Disaster Risk Management Series, Vol. 6, World Bank, 1–20.
Barlow, M., A. Hoell, and F. Colby, 2007: Examining the wintertime response to tropical convection over the Indian Ocean by modifying convective heating in a full atmospheric model. Geophys. Res. Lett., 34, L19702, doi:10.1029/2007GL030043.
Domroes, T., M. S. Kaviani, and D. Schaefer, 1998: An analysis of regional and intra-annual precipitation variability over Iran using multivariate statistical methods. Theor. Appl. Climatol., 61, 151–159.
Glantz, M., 1999: Creeping Environmental Problems and Sustainable Development in the Aral Sea Basin. Cambridge University Press, 291 pp.
Hoerling, M., and A. Kumar, 2003: The perfect ocean for drought. Science, 299, 691–694.
Jarque, C. M., and A. K. Bera, 1980: Efficient tests for normality, homoscedasticity and serial independence of regression residuals. Econ. Lett., 6, 255–259.
Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. Bull. Amer. Meteor. Soc., 77, 437–471.
Lilliefors, H., 1967: On the Kolmogorov-Smirnov test for normality with mean and variance unknown. J. Amer. Stat. Assoc., 62, 399–402.
Mariotti, A., 2007: How ENSO impacts precipitation in southwest central Asia. Geophys. Res. Lett., 34, L16706, doi:10.1029/2007GL030078.
Martyn, D., 1992: Climates of the World. Elsevier, 436 pp.
Michaelsen, J., 1987: Cross-validation in statistical climate forecast models. J. Climate Appl. Meteor., 26, 1589–1600.
Micklin, P. P., 1988: Desiccation of the Aral Sea: A water management disaster in the Soviet Union. Science, 241, 1170–1176.
Micklin, P. P., 2007: The Aral Sea disaster. Annu. Rev. Earth Planet. Sci., 35, 47–72.
New, M. G., M. Hulme, and P. D. Jones, 2000: Representing twentieth-century space–time climate variability. Part II: Development of 1901–1996 monthly grids of terrestrial surface climate. J. Climate, 13, 2217–2238.
Oki, T., and S. Kanae, 2006: Global hydrological cycles and world water resources. Science, 313, 1068–1072.
Schär, C., L. Vasilina, F. Pertziger, and S. Dirren, 2004: Seasonal runoff forecasting using precipitation from meteorological data assimilation systems. J. Hydrometeor., 5, 959–973.
Schiemann, R., D. Lüthi, P. Vidale, and C. Schär, 2008: The precipitation climate of Central Asia—Intercomparison of observational and numerical data sources in a remote semiarid region. Int. J. Climatol., 28, 295–314.
Shaman, J., and E. Tziperman, 2005: The effect of ENSO on Tibetan Plateau snow depth: A stationary wave teleconnection mechanism and implications for the south Asian monsoons. J. Climate, 18, 2067–2079.
Tippett, M. K., M. Barlow, and B. Lyon, 2003: Statistical correction of central Southwest Asia winter precipitation simulations. Int. J. Climatol., 23, 1421–1433.
Tippett, M. K., L. Goddard, and A. G. Barnston, 2005: Statistical–dynamical seasonal forecasts of central-southwest Asian winter precipitation. J. Climate, 18, 1831–1843.
Tucker, C. J., J. Pinzon, M. Brown, D. Slayback, E. Pak, R. Mahoney, E. Vermote, and N. El Saleous, 2005: An extended AVHRR 8-km NDVI dataset compatible with MODIS and SPOT vegetation NDVI data. Int. J. Remote Sens., 26, 4485–4498.
Wilks, D., 2006: Statistical Methods in the Atmospheric Sciences. 2nd ed. Academic Press, 627 pp.

Footnotes

Corresponding author address: Mathew A. Barlow, EEAS Department, UMass Lowell, 1 University Ave., Lowell, MA 01854. Email: mathew_barlow@uml.edu