1. Introduction
The upper Indus River basin is a key water storage zone on which the populations of Pakistan and northwest India have relied throughout the Holocene for water supply and agriculture, in an otherwise semiarid or arid basin (Yu et al. 2013; Petrie et al. 2017; Baudouin et al. 2020). The Indus watershed is dominated by a large flat plain and by major mountain ranges to its north, including the Hindu Kush, the Karakoram, and the Himalayas. The upper Indus River basin (UIB) is mostly characterized by this mountainous terrain.
Precipitation in the UIB mainly falls during two wet seasons that have different synoptic characteristics. In winter, from December to March, extratropical western disturbances bring rain, and snow at altitude, over most of the mountains and nearby plains (Dimri et al. 2015). In summer, mostly in July and August, the South Asian monsoon extends to the Indus River basin and produces large quantities of precipitation near the foothills of the Himalayas during intense episodes of deep convection (Barros et al. 2004; Houze et al. 2007).
Previous studies of precipitation in the Indus or upper Indus River basin have noted the importance of topography in triggering or enhancing precipitation during both wet seasons and have explained the differences between the mountainous areas and the drier lowland (e.g., Ménégoz et al. 2013; Palazzi et al. 2013; Immerzeel et al. 2015; Dahri et al. 2018; Iqbal and Athar 2018; Baudouin et al. 2020). These studies have stressed the uncertainties in precipitation amount and finer-scale patterns in the mountainous areas due to the scarcity and heterogeneity of, and biases in, precipitation observations. Numerical experiments have also been performed for case studies to better understand the processes involved in the generation of precipitation (Barros et al. 2004; Dimri 2004; Medina et al. 2010; Dimri and Chevuturi 2014; Muhammad Tahir et al. 2015, among others). Unsurprisingly, these authors have found that the topography is responsible for the increase of precipitation along the foothills. They especially show the importance of wind and upslope moisture convergence, associated with convective instability, in the production of precipitation during both wet seasons. However, a systematic analysis of the synoptic-scale processes causing the dependency between precipitation and topography at the basin scale has not been performed so far.
Houze (2012) has reviewed the processes related to precipitation generation in mountainous areas. Most processes involve cross-barrier flow, when a mountain barrier dynamically forces the air to rise up to saturation if enough moisture is present (e.g., moisture convergence). Depending on the stability of the flow, the effect of mountains differs. In cases of high static stability or weak motion, the cross-barrier flow can be blocked and deflected by the mountain barrier, limiting convergence on the foothills. In contrast, if the flow is stronger, or the atmosphere has lower static stability, upslope convergence is maximized. Convective instability can also provide the energy that sustains vertical velocities. This energy is released either by differential diurnal heating or by a cross-barrier flow. In the latter case, kinetic energy from the cross-barrier flow and convective potential energy combine to strengthen the vertical velocities. These strong interactions between instability and cross-barrier flow produce complex patterns in precipitation variability.
In this study, we investigate the processes that control precipitation and their relationship to the topography. We only consider the precipitation average over the UIB to avoid dealing with finer-scale heterogeneity and associated uncertainty. Specifically, we quantify the importance of the cross-barrier wind effect in the production of precipitation. The method is based on common statistical tools that are not often applied in atmospheric science. Subsequently, the analysis focuses on the similarities and differences between seasons, and the importance of the direction and the altitude of the cross-barrier flow. Last, we quantify the ability of climate models to represent these processes.
2. Data
a. Study area
The upper Indus River basin (UIB) is defined as in Baudouin et al. (2020) as the part of the Indus River basin north of a line between 68.75°E–33.5°N and 77.5°E–30°N, and is delimited by the black contour line in Fig. 1. This study area includes the most mountainous part of the watershed: part of the Hindu Kush and the Sulaiman ranges to the west, the Karakoram to the north, and the western part of the Himalayas to the northeast. It also extends to the plains to the south, in order to incorporate most of the precipitation maximum that falls along the foothills (Fig. 1, see also Baudouin et al. 2020). Figure 1 also shows the topography as represented in the ERA5 reanalysis dataset. The topography rises steeply north of the plain to reach a maximum of between 3000 and 8000 m above sea level. The orientation of the mountain ranges changes from west to east, forming a notch where the Hindu Kush and the Sulaiman range connect to the Karakoram (see Fig. 1).
b. Datasets
This study requires estimates of both precipitation and moisture transport, defined as the product of wind and specific humidity. Gridded estimates of atmospheric variables, such as wind and moisture, are only provided by reanalyses. These atmospheric variables are then compared with precipitation data from the same reanalysis. Doing so ensures consistency between the variables and lowers the statistical threshold for significant results.
Reanalysis data from ERA5 (Hersbach et al. 2018) are used for this study. ERA5 produces the most reliable precipitation estimates for the study area, both in terms of total amount and variability (Baudouin et al. 2020). The data were selected over the maximum period 1979–2018 (40 years) at a 3-hourly time step. Precipitation corresponds to the accumulation over this 3-hourly period, averaged over the study area. Instantaneous values of wind and specific humidity are considered at three different standard pressure levels: 850, 700, and 500 hPa, and at each grid point (at 0.5° horizontal resolution) within the area between 28° and 40°N, and between 65° and 85°E. ERA5 data provide extrapolated values on pressure levels below the model surface. To ensure that for a specific grid point we use values mostly above the model surface (see contour lines in Fig. 1), we deselect values for grid points where the model surface is above 1500 m for 850 hPa, 3000 m for 700 hPa, and 5000 m for 500 hPa. Note that the wind, and therefore the moisture transport, were separated into northward, westward, southward, and eastward directions. Higher temporal and spatial resolutions are available for ERA5 but have not been used due to computing limitations and for better consistency with climate model output.
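The preprocessing described above can be sketched as follows. This is a minimal illustration, not the code used for the study: the function name, the array layout, and the choice to define the southerly component as the positive part of the meridional wind are all assumptions. Only the surface-height thresholds come from the text.

```python
import numpy as np

# Surface-height thresholds (m) above which a pressure level is discarded,
# as stated in the text: 1500 m for 850 hPa, 3000 m for 700 hPa, 5000 m for 500 hPa.
MAX_SURFACE_HEIGHT = {850: 1500.0, 700: 3000.0, 500: 5000.0}


def southerly_moisture_transport(v, q, surface_height, level):
    """Southerly moisture transport with below-surface grid points set to NaN.

    v: meridional wind (m/s), positive northward, shape (time, lat, lon)
    q: specific humidity (kg/kg), same shape as v
    surface_height: model surface height (m), shape (lat, lon)
    level: pressure level in hPa (850, 700, or 500)
    """
    # Assumption: the southerly component is the positive part of v.
    southerly_wind = np.clip(v, 0.0, None)
    # Moisture transport is the product of wind and specific humidity.
    transport = southerly_wind * q
    # Discard grid points where the model surface lies above the threshold.
    too_high = surface_height > MAX_SURFACE_HEIGHT[level]
    return np.where(too_high[np.newaxis, :, :], np.nan, transport)
```

The other three directions would follow the same pattern with the sign or wind component changed; the masked points can then be dropped before building the predictor matrix.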
Last, we use the output from climate models: the historical simulations produced for the Coupled Model Intercomparison Project phase 6 (CMIP6; Eyring et al. 2016). We use the data from the period 1980–2010 of this experiment. Note that in CMIP6, the instantaneous values of atmospheric variables are at a subdaily resolution (6-hourly). As of August 2019, data for humidity and wind on pressure levels, and precipitation are available for the historical experiment from three climate models: GISS-E2-1-G, IPSL-CM6A-LR, and MRI-ESM2-0 (Table 1).
Table 1. Available CMIP6 climate models. Only the historical experiment is considered.
3. Methods
a. Multilinear regression
The large-scale atmospheric circulation associated with precipitation events is often studied using a composite analysis, that is, by averaging the circulation over a large number of events (e.g., Midhuna et al. 2020; Hunt et al. 2018a; Vellore et al. 2016). However, this technique cannot determine which components of the circulation (location, altitude, and direction) are the most important in the generation of precipitation, nor can it quantify the strength of this link. Statistical methods help answer these questions. Here, we aim to predict precipitation from moisture transport. We use a general linear model to fit precipitation (the predictand) with the time series of moisture transport at each grid point (the predictors) with the ordinary least squares method. In mathematical terms, the model is defined as follows:
In Eq. (1), the predictand Y is the time series of average precipitation over the UIB. With all 40 years of ERA5, the predictand includes n = 116 877 observations, without distinguishing the seasons. The predictors Xi are the time series of moisture transport at 700 hPa, defined as the product of wind and specific humidity. The time series are defined at each grid point within the box defined in section 2b (see the extent of Fig. 2a; note that points outside of the UIB are also considered). There is a total of p = 645 predictors or grid points. The coefficient
The quality of the regression is defined using the coefficient of determination:
where SS is the sum of squares function (i.e., the variance multiplied by the degrees of freedom). R2 also equals the square of Pearson’s correlation coefficient between the predictand and the prediction. R2 is validated on a sample of observations distinct from the one used to train the model: nine tenths of the observations are randomly selected to generate the training sample, while the remaining one tenth forms the validation sample. This selection is repeated 10 times to reduce the sampling uncertainty. The final R2 equals the mean of R2 over all instances of sampling.
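The validation procedure described above can be sketched as follows, assuming an ordinary least squares fit with an intercept (an assumption) and taking R2 as the squared Pearson correlation between the predictand and the prediction on the validation sample. The function name, the argument names, and the random seed are illustrative only.

```python
import numpy as np


def validated_r2(X, y, n_repeats=10, validation_fraction=0.1, seed=0):
    """Mean validation R2 over repeated random 90/10 splits.

    X: predictors, shape (n_observations, n_predictors)
    y: predictand, shape (n_observations,)
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    n_val = int(round(n * validation_fraction))
    scores = []
    for _ in range(n_repeats):
        # Random split: one tenth for validation, nine tenths for training.
        order = rng.permutation(n)
        val, train = order[:n_val], order[n_val:]
        # Ordinary least squares fit (intercept column is an assumption).
        A_train = np.column_stack([np.ones(train.size), X[train]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        # Predict on the held-out sample.
        A_val = np.column_stack([np.ones(val.size), X[val]])
        prediction = A_val @ coef
        # R2 as the squared Pearson correlation between prediction and predictand.
        r = np.corrcoef(prediction, y[val])[0, 1]
        scores.append(r ** 2)
    return float(np.mean(scores))
```

Each repetition draws a fresh random split, so the ten validation samples may overlap; this differs from a strict 10-fold cross validation, in which the folds are disjoint.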
The uncertainty of the estimate of the coefficients
where
b. Dealing with multicollinearity
Figure 2a presents the result of the regression between precipitation and southerly moisture transport at 700 hPa. With R2 = 0.81, it has a high predictive skill. However, there is only a relatively limited number of significant coefficients despite the high number of observations, and the figure lacks a homogeneous pattern, which limits the interpretation. The cause of the noisy pattern is the spatial autocorrelation of the 2D field, also referred to as multicollinearity of the predictors in the context of a multilinear regression (see e.g., Saporta 2006). The effect of multicollinearity, or the fact that a predictor can be predicted by the other predictors, is apparent in Eq. (3). There, the coefficient
A common way to reduce the impact of multicollinearity is principal component (PC) regression. It acts on the predictors themselves by reducing their number in the regression [in Eq. (1)]. The PC regression method consists first of a decomposition of the predictors into principal components (principal component analysis). During this process, the predictors are standardized. Then, a number of principal components, which needs to be determined, is selected. The time series of each selected principal component are then used as new predictors in the regression, as shown in the following equation:
where PCj is the time series of the jth principal component, and is a linear combination of the time series of moisture transport at each location (Xi). The coefficients αij form the rotation matrix from the PC analysis. The prediction
It is clear from Eq. (5) that the prediction is still a function of the same time series Xi as in Eq. (1), despite having fewer coefficients
In our example, we apply the regression to the first 10, 50, and 100 components, respectively. Figures 2b–d show the resulting pattern by plotting each coefficient associated with Xi as shown in Eq. (5). The selection of the first 50 components (Fig. 2c) produces a good result in terms of interpretability, with a pattern that overlaps with most of the study area.
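A PC regression of this kind can be sketched as below. This is a minimal illustration under stated assumptions: the principal components are obtained from a singular value decomposition of the standardized predictors, the regression includes an intercept, and the function and variable names are hypothetical. The last step maps the coefficients of the selected components back onto the original (standardized) predictors through the rotation matrix, which is what allows a map such as Figs. 2b–d to be drawn.

```python
import numpy as np


def pc_regression(X, y, n_components):
    """Principal component regression.

    Returns (coef, intercept), where coef holds one coefficient per
    standardized predictor, so that prediction = intercept + Z @ coef.
    """
    # Standardize the predictors, as done during the PC decomposition.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # PCA via SVD: the rows of Vt are the component loadings.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    rotation = Vt[:n_components].T          # shape (p, n_components)
    pcs = Z @ rotation                      # PC time series, shape (n, n_components)
    # Ordinary least squares on the selected PC time series.
    A = np.column_stack([np.ones(len(y)), pcs])
    fitted, *_ = np.linalg.lstsq(A, y, rcond=None)
    intercept, pc_coef = fitted[0], fitted[1:]
    # Map the PC coefficients back onto the original predictors.
    coef = rotation @ pc_coef
    return coef, intercept
```

With all components retained, the result coincides with an ordinary regression on the standardized predictors; truncating the set of components is what smooths the coefficient pattern.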
c. Using several 2D fields
In the two previous sections, we used the southerly moisture transport at 700 hPa to predict precipitation. Now, we look at other predictors, especially different altitudes or directions of moisture transport, to ascertain whether they can further improve the prediction. The additional predictors, all 2D fields, are included in the regression along with the southerly moisture transport at 700 hPa. A decomposition in principal components is still needed to address the multicollinearity between the predictors. However, we do not perform the decomposition of all 2D fields simultaneously since doing so would couple different fields and hinder the selection of fields that significantly contribute to the precipitation variability. In addition, we are interested in patterns of similar extent between each 2D field in order to compare them. A simultaneous decomposition would mask the distinct spatial details explained by individual fields. So, the PC decomposition is performed for each 2D field independently. The number of principal components for each field is defined so that each field is represented by a pattern of similar extent. A measure for the extent of a pattern is needed in order to define a consistent threshold across the different fields.
We consider a gridded map (or matrix) of n points filled with values αi. When plotted, the values αi form a pattern for which we want to measure the extent. The Extent of Pattern (EP) can be described by the root-mean-square of the differences of value αi between all pairs of neighboring grid points (i, j) of this map. We only consider neighboring points of the same latitude or longitude (m pairs). This value is then normalized by dividing by the root-mean-square of the values αi at all grid points. Hence, this normalized value represents the mean change between two neighboring points in units of deviation from zero. The inverse of this value, multiplied by the mean distance (d) between neighboring points, represents the mean distance needed to observe a change of one deviation from zero, and is used to define EP for a map of values αi:

$$\mathrm{EP} = d \, \frac{\sqrt{\dfrac{1}{n}\sum_{i=1}^{n} \alpha_i^{2}}}{\sqrt{\dfrac{1}{m}\sum_{(i,j)} \left(\alpha_i - \alpha_j\right)^{2}}}$$
For example, for Fig. 2c, there are m = 1167 pairs of neighboring points with a distance of 0.5°, the resulting Extent of Pattern is EP = 1.13°. In Fig. 2a, it reaches only EP = 0.32°, consistent with the much noisier pattern that is visible in this figure.
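The definition of EP given in words above can be sketched in a few lines. The function name and the NaN convention for grid points excluded from the map are assumptions, and the spacing between neighboring points is taken as uniform (0.5° by default, matching the grid used here).

```python
import numpy as np


def extent_of_pattern(alpha, spacing=0.5):
    """Extent of Pattern of a gridded map.

    alpha: 2D array (lat, lon) of values; NaN marks excluded grid points
    spacing: distance d between neighboring grid points (degrees)
    """
    alpha = np.asarray(alpha, dtype=float)
    # Differences between neighbors sharing the same latitude or longitude.
    diff_same_lat = np.diff(alpha, axis=1).ravel()
    diff_same_lon = np.diff(alpha, axis=0).ravel()
    diffs = np.concatenate([diff_same_lat, diff_same_lon])
    # Keep only pairs for which both grid points are defined (the m pairs).
    diffs = diffs[~np.isnan(diffs)]
    values = alpha[~np.isnan(alpha)]
    rms_difference = np.sqrt(np.mean(diffs ** 2))
    rms_value = np.sqrt(np.mean(values ** 2))
    # Inverse of the normalized mean change, scaled by the grid spacing.
    return spacing * rms_value / rms_difference
```

A smooth map yields small differences between neighbors and therefore a large EP, whereas a map of spatially uncorrelated noise yields an EP below the grid spacing, which is in line with the low value reported for Fig. 2a.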
The effect of reducing the number of components selected in a PC regression can be reinterpreted with EP. In Fig. 3, EP is computed for each principal component for the southerly moisture transport at 700 hPa. The figure shows that the Extent of Pattern generally decreases for principal components that explain less variance (i.e., with a higher principal component number). Hence, limiting the regression to the principal components with a higher EP limits the noise in the regression pattern, as seen in Figs. 2b–d. We determined in section 3b that a selection of the first 50 principal components gives satisfactory results (Fig. 2c). The Extent of Pattern of the 50th principal component is 0.73°, after smoothing. This value is used as a threshold to determine the number of components used for the other 2D fields. The third and fourth rows of Table 2 show the number of components selected for each field, as well as the cumulative variance explained for this selection. Differences in the cumulative variance explained can be observed between the different altitudes of moisture transport, which highlights the importance of not using explained variance as the selection criterion if the goal is to have patterns of similar extent.
Table 2. Selection of the most important predictors among the different directions of moisture transport at three altitudes: 850, 700, and 500 hPa. The number of components for the southerly moisture transport at 700 hPa is fixed at 50; for the other fields, that number is determined so that the Extent of Pattern of the last component is greater than that of the 50th component of the southerly transport at 700 hPa. Cumulative variance explained by the selection is also given. Pratt’s index is computed on the full set of observations and given as a fraction of R2 of the regression that considers all fields, which reaches 0.904 after validation.
Equation (1) can be rewritten when considering several 2D fields, as follows:
where
The next step, after performing the principal component analysis and selecting the number of components for each 2D field, is to select the most important fields for the regression of precipitation. This selection is performed by using Pratt’s index (Pratt 1987), which quantifies the relative importance of each predictor. For the general case of Eq. (1), Pratt’s index is given for each predictor Xi, by the product of the coefficients
The sum of that index over all predictors equals the R2 of the regression with all the predictors: the Pratt’s index of a predictor can be expressed as a percentage of R2. For a 2D field, we simply take the sum of Pratt’s index for each predictor composing that field, which following the notation of Eq. (7), can be simplified to
The result for each predictor is given in Table 2 (last row): a higher value characterizes a more important predictor.
Grömping (2006) and Nathans et al. (2012) have discussed advantages and limitations of Pratt’s index and other measures of relative importance. The main issue related to Pratt’s index is that negative values are possible, as can be seen for some predictors in Table 2; a negative value does not mean that including the predictor reduces the R2 of the regression. However, in our context, these values remain very close to zero, which suggests a low relative importance, and Pratt’s index offers a clear differentiation of the relative importance of each predictor, compared to other measures.
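As an illustration, Pratt's index can be sketched using its usual definition, namely the product of each predictor's standardized regression coefficient and its correlation with the predictand. The function name is hypothetical, and the sketch applies to the general case of Eq. (1) rather than to the PC-based formulation used for the 2D fields.

```python
import numpy as np


def pratt_index(X, y):
    """Pratt's index of each predictor in an ordinary least squares regression.

    X: predictors, shape (n_observations, n_predictors)
    y: predictand, shape (n_observations,)
    """
    # Standardize predictors and predictand (zero mean, unit variance).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    t = (y - y.mean()) / y.std()
    # Standardized regression coefficients.
    beta, *_ = np.linalg.lstsq(Z, t, rcond=None)
    # Pearson correlation between each predictor and the predictand.
    correlation = Z.T @ t / len(t)
    return beta * correlation
```

The indices sum to the R2 of the regression computed on the same observations, so the importance of a group of predictors (for example, all grid points of one 2D field) can be expressed as the sum of their indices divided by the total. A predictor whose coefficient and correlation have opposite signs receives a negative index, which is the limitation noted above.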
d. Time lag and causality analysis
We have so far predicted precipitation only by considering the state of the moisture transport at the start of the 3-h precipitation accumulation period. It is possible that precipitation is also explained by an earlier state of the moisture transport due to a potential lag phenomenon. The result of the regression can also include a response of the moisture transport field to precipitation. It is therefore important to disentangle the causes and effects of precipitation. We use a distributed lag model (Saporta 2006), where the regression is applied on the predictors with different time lags. Due to the high correlation between the different time lags of the same predictor, multicollinearity issues arise, and the number of lags tested in the same regression is kept small. Specifically, for moisture transport, we select the start and the end of the 3-h precipitation accumulation period, as well as 3 h before the start, and 3 h after the end of this period.
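The construction of the lagged predictors can be sketched as follows. The indexing convention is an assumption: time step t is taken as the start of the 3-h accumulation period of the predictand, so that lags of -1, 0, 1, and 2 time steps correspond to 3 h before the start, the start, the end, and 3 h after the end of that period. The function name is hypothetical.

```python
import numpy as np


def lagged_design_matrix(X, lags):
    """Stack copies of the predictors shifted by each lag.

    X: predictors, shape (time, n_predictors)
    lags: lags in time steps; a negative lag takes the predictor value
          from an earlier time step than the predictand
    Returns (design, rows): the stacked predictors and the time indices
    of the predictand that remain usable once the edges are trimmed.
    """
    lags = list(lags)
    n = X.shape[0]
    start = max(0, -min(lags))       # time steps lost at the beginning
    stop = n - max(0, max(lags))     # time steps lost at the end
    rows = np.arange(start, stop)
    # One block of columns per lag, aligned with the predictand at `rows`.
    design = np.hstack([X[rows + lag] for lag in lags])
    return design, rows
```

For example, `design, rows = lagged_design_matrix(X, (-1, 0, 1, 2))` followed by a regression of `y[rows]` on `design` gives one set of coefficients per time lag, whose relative importance can then be compared.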
It is important to note that a negative lag (the predictor occurs before the predictand) does not imply a direct causality link. For example, the predictor may well be a faster response to the real direct cause of the predictand, or an indirect cause through another variable. Similarly, a positive lag (the predictor occurs after the predictand) does not imply that the predictor is caused by the predictand (cf. Granger causality principles developed in Granger 1969). Consequently, we only interpret the predictors as a potential cause (consequence) when the lag is negative (positive), as the results may differ depending on the predictors selected for the regression. A causal link can only be established after identifying and testing an underlying physical mechanism.
This causality issue can also be seen in our study. Cross-barrier flow is not a direct cause of precipitation as it relates to it through moisture convergence and condensation in the updraft. We have also mentioned in the introduction that the updraft (and therefore precipitation) can be caused by convective instability. Convection results in lower troposphere convergence and thus moisture transport. If convection occurs over a large enough scale, its impact on the southerly moisture transport can be detected by our method. In our study, however, we do not distinguish whether the cross-barrier flow is a cause or a consequence of the updraft, as the two are intricately linked. Therefore, we do not investigate the specific role of instability in the generation of precipitation. Table 3 presents the relative importance according to Pratt’s index for the regression performed with the southerly moisture transport at 850 and 700 hPa and the four time lags selected. Pratt’s index clearly shows that the two positive time lags are much less important than the two negative, suggesting that moisture transport is predominantly a potential cause of precipitation.
Table 3. Selection of the most important time lags. The regression predicts the precipitation using the southerly moisture transport at 700 and 850 hPa and four time lags: the start of the 3-h accumulation period of the precipitation, 3 h earlier, the end of the accumulation period, and 3 h later. The validated R2 of the regression considering all the time lags reaches 0.860. The relative importance is given in the table using Pratt’s index, as a fraction of R2.
4. Results
a. General
Table 2 shows that the southerly moisture transport at 700 hPa (Pratt’s index of 51%) and 850 hPa (36%) are by far the most important altitudes and directions for the prediction of the 3-hourly precipitation. In terms of time lag, the most important point in time is the start of the accumulation period (total Pratt’s index of 55%, Table 3), followed by 3 h earlier (31%). For simplification, only the start of the accumulation period is considered in the rest of the study. Figure 4a presents the coefficients
b. Seasonality
Precipitation in the UIB is characterized by a bimodal seasonality, which is well represented by ERA5 (Baudouin et al. 2020, as shown by the black line in Fig. 5a). One wet season occurs in winter–early spring and is driven by extratropical disturbances; the other, narrower but more intense, occurs in July–August, in relation to the Indian summer monsoon. Due to the difference in larger-scale drivers, it is possible that the cross-barrier moisture transport is differently related to precipitation depending on the season. We reproduce the same regression as in Fig. 4a, with the same temporal resolution, but for winter (Fig. 4b) and summer (Fig. 4c). The summer season is defined by the two wettest summer months, July and August. For winter, we consider the two wettest months February and March, but also December and January, which are always considered in the literature on western disturbances (Dimri et al. 2015; Hunt et al. 2018a). A different definition of the seasons does not affect the main results.
In both seasons, the patterns are remarkably similar. The main changes are a rebalancing of the relative importance of the two fields, and a better coefficient of determination in winter (R2 = 0.922) than in summer (R2 = 0.740). We checked whether a change in the relative importance of some of the other directions (westerly, easterly, northerly) and altitudes (500 hPa) in summer could explain the difference in R2; however, there was no significant difference compared to the results presented in Table 2. Since no major differences are observed between the two seasonal regressions, the regression for the whole year is suited to studying the seasonality.
Figure 5 presents the seasonality of different variables: the prediction (Fig. 5a), the coefficient of determination (Fig. 5b), and the different contributions to precipitation (Figs. 5c–e). The prediction is able to reproduce almost perfectly the seasonality of precipitation (Fig. 5a). This quality contrasts with the seasonality of the coefficient of determination (R2, Fig. 5b): it reaches a minimum in June and remains lower than it is in winter for the rest of summer. This minimum is possibly the result of increased small-scale diurnal convection during summer that produces precipitation without the need for horizontal moisture transport (e.g., influence of diurnal heating only, Houze 2012). Yet, this process does not impact seasonal biases and the minimum value of R2 remains high (>0.6).
Despite the similarity of the season-limited regressions, the contributions of the moisture transport at 850 and 700 hPa to the predicted precipitation behave differently across the seasons, as shown in Fig. 5c. Both contributions exhibit two peaks that match the precipitation peaks, but their magnitudes differ. Winter precipitation is dominated by an increase in moisture transport at 700 hPa, while during summer both altitudes contribute equally to precipitation.
We take the analysis further by investigating whether the seasonality in precipitation is explained by a distinct change in seasonality of the southerly wind or specific humidity. We multiply the regression coefficients (as in Fig. 4a) with the time series of southerly wind on the one hand and specific humidity on the other, and sum the results for each time step. The two time series obtained are representative of the distinct influence of wind and moisture, respectively, on precipitation. In Figs. 5d and 5e, we represent the seasonal cycle of those two time series on a log scale. Their product (or sum on the log scale) nearly equals the moisture transport seasonality. The small residual is due to high wind and specific humidity occurring at the same locations or time steps.
This synchronous occurrence, at the subdaily scale, can be quite important in winter, when it explains up to 25% of the precipitation increase (i.e., the southerly wind at 700 hPa brings higher specific humidity). Yet, Figs. 5d and 5e suggest that winter precipitation is mostly driven by an increase in southerly wind at 700 hPa, while moisture only plays a role in explaining the lag between the wind peak in February and the precipitation peak in March. In contrast, the summer peak is mostly explained by an increase in moisture at both altitudes, although the wind at 850 hPa also peaks in this season, similar to winter. Both dry seasons are related to a weak occurrence of southerly winds.
c. Cross-barrier wind direction
The fact that only the southerly flow relates to precipitation is quite surprising as the Himalayan range is oriented northwest to southeast. We would have expected the westerly flow to play an important role, but it is a poor predictor (total Pratt’s index of 12%, cf. Table 2). This result also seemingly contradicts numerous studies supporting a western or southwestern origin of moisture, especially in the context of the western disturbances (Dimri et al. 2015), an origin that is particularly evident in tracking analyses (Hussain et al. 2015; Jeelani et al. 2018; Hunt et al. 2018b; Boschi and Lucarini 2019). However, our study does not investigate the origin of moisture, and the PC regression result is not indicative of the mean flow in the UIB. The mean flow is given by the composite analysis we present next.
Figure 6 shows the composite maps of the mean wind and moisture fields when significant precipitation occurs in the following 3 h (>2 mm) in comparison with the absence of precipitation (<0.1 mm). It shows a general southerly orientation of the wind within the study area when precipitation falls (Figs. 6a,c,e,g). As evident in winter at 850 hPa (Fig. 6c) and in summer at both altitudes (Figs. 6e,g), a small easterly component is also present along the foothills of the Himalayas in the study area when precipitation occurs. By contrast, a westerly component dominates in the absence of precipitation (Figs. 6d,f,h). Several reasons explain the absence of precipitation in relation to a westerly wind and, conversely, the presence of an easterly component when precipitation occurs.
The principal effect of topography on a flow is the deflection of that flow so that it can get around the obstacle. A horizontal deflection is particularly obvious in winter (Figs. 6a,b). A strong southwesterly wind occurs over Rajasthan, perpendicular to the Himalayan range. Arriving close to the mountain it splits in half with a tipping point just southeast of the study area: a southerly to southeasterly branch reaches the study area while the other turns east toward the Ganges plain. A similar but weaker southwesterly wind occurs over Rajasthan in summer at 850 hPa (Figs. 6g,h). During this season, the southerly branch heading toward the upper Indus only takes place when precipitation occurs, while the westerly branch dominates in the other case. The deflection caused by the topography should be further enhanced by the static stability of the air mass as the flow would be blocked in and below the stable layer instead of rising above the mountain range. This process, however, is not investigated further here as the static stability is not among the available variables for ERA5.
The shape of the topography also impacts the flow. There is a rapid change in the orientation of the slope of the mountain ranges in the western part of the study area, where the Hindu Kush and the Sulaiman ranges connect with the Karakoram and the Himalayas (see Fig. 1). It forms a notch, known to be related to terrain-locked disturbances in winter (Lang and Barros 2004; Dimri et al. 2015). A southeasterly flow parallel to the Himalayan foothills at 700 hPa or lower is eventually trapped in that notch (Figs. 6c,g), and forced to rise. On the other hand, an opposite northwesterly flow follows the Himalayas and leaves the UIB without a forced uplift (Figs. 6b,d,f,h).
In addition, the synoptic dynamics drive the wind flow through pressure gradients and enhance the effect of the topography. Southerly winds at 850 hPa are often triggered by a low located just to the west or southwest of the study area, as suggested by the cyclonic circulation shown in Figs. 6c and 6g. The friction, which is stronger toward the foothills, diverts the wind toward the lower pressure by geostrophic adjustment, enhancing the orographic deflection and the convergence in the notch. Divergence is also often found at the tropopause in relation to the low at 850 hPa, particularly in winter (Hunt et al. 2018a), which sustains vertical velocities. By contrast, the topography to the north (Pamir range and Tibetan Plateau) prevents the formation of a low at 850 hPa there that could enhance a westerly flow. Rather, this flow is related to the intensity of the subtropical ridge, while the topography forces an anticyclonic deflection (Figs. 6b,d,f,h). Overall, this synoptic context lacks the dynamics that would force the uplift of the westerly flow.
Last, the moisture content of the westerly and southerly flows is quite different. The southerly flow brings warm and humid tropical air. In contrast, the westerly to northwesterly flow descends after passing over the Sulaiman and Hindu Kush ranges leading to the intrusion of higher altitude continental dry air, as seen in Figs. 6b, 6d, and 6f (Foehn effect).
In summary, our study suggests that only the southerly component of the cross-barrier flow is important for triggering precipitation, while the mean flow varies from southwesterly to southeasterly depending on the location and season. These results corroborate the southern origin of the moisture suggested by Hunt et al. (2018b).
d. Moisture transport altitude
According to Table 2, the moisture transport at 700 hPa is the most important predictor for explaining the precipitation variability (total Pratt’s index of 56%), followed by the one at 850 hPa (33%), while the transport at 500 hPa is much less significant (12%). We further investigate the dependency between relative importance and altitude by performing the PC regression of precipitation using a vertical layer of moisture transport along 30°N (see Fig. 1). We select as predictors all grid points above the ground between 70° and 80°E (every 0.5°) and between 950 and 500 hPa (every 25 hPa). This selection ensures that we capture most of the moisture eventually reaching the UIB. The regression is performed for the whole year using the first 20 principal components. R2 reaches 0.62, which is lower than when using horizontal layers, because the predictors considered here are farther away from the foothills and the main moisture convergence area. However, R2 remains high enough to investigate the relative importance of the different altitudes of moisture transport to produce precipitation. Figure 7 (black line in either panel a or b) shows the relative importance of each altitude using Pratt’s index. As expected, the moisture transport at 700 hPa is more important than at 850 hPa, with a peak at 725 hPa. The relative importance quickly decreases above 700 hPa and below 850 hPa, with Pratt’s indices reaching values close to or below 0 at 500 and 950 hPa. This behavior justifies the selection of moisture transport at 850 and 700 hPa to predict precipitation.
The relative importance of the altitude is compared to the mean meridional moisture transport across the same cross-section, which is computed for different amounts of precipitation. This composite analysis is split between the two wet seasons (Figs. 7a,b). The vertical structure of the mean moisture transport clearly differs from its relative importance. In winter, the moisture transport is equally strong at 700 hPa and at 850 hPa when intense precipitation occurs (>2 mm per 3 h over the UIB). Other altitudes above and below also see significant southerly moisture transport. During summer, the strongest moisture transport occurs at 925 hPa and quickly decreases above that altitude.
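The composite analysis described above can be sketched as follows, under stated assumptions: the function name `composite_by_precip`, the toy arrays, and all class thresholds other than the 2 mm per 3 h value quoted in the text are hypothetical. The sketch averages the vertical profile of meridional moisture transport over all time steps falling in each precipitation class.

```python
import numpy as np

def composite_by_precip(transport, precip, bin_edges):
    """Mean vertical profile of moisture transport per precipitation class.

    transport: array (n_times, n_levels), meridional moisture transport
    precip:    array (n_times,), area-mean precipitation
    bin_edges: increasing thresholds separating the precipitation classes
    Returns an array (len(bin_edges) + 1, n_levels); empty classes are NaN.
    """
    # Class index of each time step: 0 below the first edge, and so on.
    classes = np.digitize(precip, bin_edges)
    n_classes = len(bin_edges) + 1
    composite = np.full((n_classes, transport.shape[1]), np.nan)
    for c in range(n_classes):
        selected = classes == c
        if selected.any():
            composite[c] = transport[selected].mean(axis=0)
    return composite

# Toy example with two levels and five time steps (values are illustrative).
precip = np.array([0.0, 0.5, 1.5, 2.5, 3.0])          # mm per 3 h
transport = np.array([[-1.0, 0.0],
                      [1.0, 1.0],
                      [2.0, 3.0],
                      [4.0, 6.0],
                      [6.0, 8.0]])
bin_edges = [0.1, 1.0, 2.0]                           # last class: >= 2 mm per 3 h
composite = composite_by_precip(transport, precip, bin_edges)
print(composite)
```

Unlike the regression, each composite profile describes only the mean state within a precipitation class; it carries no information on how much of the precipitation variability each altitude explains.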
This difference in vertical structure suggests that the relative importance is not merely proportional to the mean moisture transport. The link between moisture transport and precipitation is indeed indirect. This link is first explained by moisture flux convergence along the foothills. However, not all of the moisture transported by the cross-barrier flow eventually converges. At high altitude (mostly above 600 hPa), the cross-barrier flow is able to pass over the mountains with little disturbance, minimizing chances of convergence. Below, the mountain ranges effectively disturb the flow, which can converge, but can also be deflected horizontally. This deflection depends on the characteristics of the flow (static stability and wind speed) and the size of the mountain. At 850 hPa, the flow is generally weaker, and the distance to climb over the mountains larger, compared to 700 hPa. Therefore, the deflection at 850 hPa is in general more important, as can be seen in Figs. 6a,c and 6e,g. Consequently, at 850 hPa, a smaller fraction of the moisture transport eventually converges and is converted into precipitation, which explains the smaller relative importance despite stronger moisture transport in Fig. 7. Second, the link between moisture transport and precipitation can also be modulated by the presence of moisture divergence at higher altitudes. Figure 4b shows that, when no precipitation occurs, a mean southerly moisture transport is present below 850 hPa; above it, however, there is a mean northerly moisture transport: moisture escapes the UIB. In this case, higher altitudes are important not only because they provide more moisture but also because they prevent moisture from leaving the domain.
In conclusion, this analysis of the altitude’s relative importance shows the advantages of the regression over the composite analysis for identifying the key components of the atmospheric circulation that explain 3-hourly precipitation variability.
e. Representation in climate models
Last, the method is applied to climate simulations to check the ability of climate models to represent the effect of cross-barrier wind in the UIB. We specifically use the output of three climate models from CMIP6: MRI, IPSL, and GISS, as outlined in section 2b. These datasets are only available on a 6-hourly time step, but the use of this resolution had little impact on the result discussed for ERA5. The Extent of Pattern used as a threshold to select the principal components for the reanalysis is too small for the resolution of the climate models. Using the mean grid resolution of the datasets offers reasonable results regarding the multicollinearity issues (e.g., 1.125° for MRI, 1.875° for IPSL, and 2° for GISS).
The regression has been performed for each simulation and the coefficients are presented in Fig. 8. The link between moisture transport and precipitation is reproduced in each simulation, as indicated by the coefficient of determination (R2) of the regressions, which is around 0.8 for the three simulations. Despite their different spatial resolutions, they all place the highest coefficients along the Himalayas. Some discrepancies with ERA5 are notable in the relative importance of each predictor. Moisture transport at 700 hPa is more important for IPSL and MRI, while in GISS, precipitation is more related to moisture transport at 850 hPa. This behavior may be related to the representation of the relief and the model’s latitudinal resolution (1.125° for MRI, 1.25° for IPSL, and 2° for GISS).
The seasonality of precipitation and its different contributions is represented in Fig. 9, while the seasonality of R2 is shown in Fig. 10. The seasonal cycle of precipitation is similar in the three models, but is very different from reality, as was noted for previous versions of the models used for CMIP5 (Palazzi et al. 2015). The wet season in summer is absent; instead, the period from June to September is the driest. Winter precipitation, on the other hand, is more intense and lasts longer, from October or November to April or May depending on the model, with a peak in April instead of March. The overall annual precipitation is 5% (for IPSL) to 15% (for MRI) higher than in ERA5.
The regression of precipitation helps explain some of the biases in modeled precipitation. First, the underestimation of summer precipitation coincides with a drop in predictive skill: the R2 of the regression is around 0.3 or below in July (Fig. 10). This drop suggests that large-scale cross-barrier wind is no longer the main trigger of summer precipitation in the simulations. The drop may also be related to the coarser resolution, which increases the importance of subgrid precipitation. Second, the moisture content related to precipitation is underestimated for all seasons and all models compared to ERA5 (Figs. 9g–l). In winter, this underestimation is more than compensated for by stronger southerly winds that peak at both altitudes over an extended winter season. In summer, by contrast, wind does not peak at 850 hPa in the simulations as it does in ERA5 and reaches a minimum instead.
In summary, climate models produce precipitation in the UIB for the right reason: the forced uplift of a cross-barrier flow and the condensation of the moisture within. Yet, the seasonality of this moisture transport is largely incorrect, leading to the precipitation biases. We show that the synoptic-scale circulation is responsible for these biases, rather than the moisture processes.
5. Conclusions
A method based on statistical regressions and principal component analysis (PC regression) is used to investigate the link between a given variable (predictand) and 2D fields that characterize the atmospheric state (predictors). More specifically, the method evaluates the strength of this link and exposes the patterns of the 2D fields that explain the predictand. Those patterns are useful for interpreting the link in terms of physical processes, while lag regressions can be used to disentangle cause and effect of the predictand’s variability. This study also stresses the advantages of the method over a composite analysis, which is often used in atmospheric science. Overall, the PC regression allows for a comprehensive analysis of the causality links and can be applied to contexts other than the one investigated here.
This study focuses on the causes of precipitation variability in the upper Indus River basin. Only the mean precipitation over that area is considered, as no dataset reliably represents the finescale patterns of precipitation (Baudouin et al. 2020). The main result is that horizontal moisture transport successfully predicts over 80% of the precipitation variability, of which southerly moisture transport at low levels (between 700 and 850 hPa) along the Himalayas is the dominant contributor. This result demonstrates that precipitation in the upper Indus basin is mainly caused by a forced uplift of a cross-barrier flow. Comparison with mean moisture transport through a composite analysis complements the PC regression by connecting its result to the larger-scale circulation and the synoptic drivers of precipitation. However, important discrepancies arise in the vertical structure between the mean and the relative importance of moisture transport, which suggest a complex, altitude-dependent interaction with the relief. In particular, the mountain ranges and their specific shape play an important role in the channelling, trapping, and eventual uplifting of moisture coming from the south of the study area. A finer-scale analysis would be useful to further investigate this relationship. We also suggest that a complex interaction between cross-barrier flow and the stability of the air mass provides the energy that sustains vertical velocities and moisture convergence. However, the method and data used here are not suited for such an analysis.
The links between moisture transport and precipitation in winter and in summer are strikingly similar despite differences in the synoptic drivers of precipitation, demonstrating the importance of cross-barrier transport of moisture in both seasons. Nonetheless, in summer, moisture transport explains a lower fraction of the precipitation variability, which suggests that additional factors influence precipitation, such as small-scale convection triggered by diurnal differential heating.
The prediction of precipitation is decomposed into the contributions of the different altitudes of moisture transport and, from there, into the contributions of wind and moisture. These contributions show further differences between seasons. Winter precipitation variability is solely driven by moisture transport at 700 hPa, while in summer, moisture transport at 850 and 700 hPa have similar contributions. In addition, the winter peak is driven by an increase of mean southerly wind while the mean moisture is at its minimum. The increase of moisture starting from January explains the delay between the peak in wind in February and the peak in precipitation in March. By contrast, the summer wet season is mostly explained by the increase in moisture content, while the mean southerly wind also exhibits a secondary peak. Both dry seasons are explained by the absence of occurrences of southerly wind. The decomposition into altitude and wind/moisture contributions offers further opportunities to investigate the variability of precipitation at longer time scales through the variability of these contributions.
Last, the method provides insight into the reasons why climate models misrepresent precipitation seasonality. The CMIP6 climate models selected for this study represent similarly overly wet winters and dry summers. Despite precipitation biases and relatively coarse spatial resolution, the climate models are able to reproduce a relationship between moisture transport and precipitation similar to the reanalysis. The decomposition of precipitation into the various contributions suggests that both winter and summer biases are related to an anomalous seasonality of the mean southerly wind, while moisture is relatively well simulated. These anomalies suggest that the synoptic-scale circulation, rather than local moisture processes, is the main factor explaining precipitation biases in these climate simulations. Further analysis of the wind and moisture variability is needed to narrow down and correct climate model biases and build confidence in simulations for past and future climates.
Acknowledgments
We acknowledge the teams involved in producing the ERA5 and MERRA-2 reanalyses, and the teams from IPSL, MRI, and NASA GISS responsible for the climate simulations. We gratefully thank them for making these datasets freely available online. We also thank our editor and the two anonymous reviewers for comments that greatly helped clarify the paper. This research was carried out as part of the TwoRains project, which is supported by funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant 648609).
Data availability statement: The code used to produce the figures and some of the associated data are available at http://doi.org/10.5281/zenodo.3765211 and from the authors.
REFERENCES
Barros, A., G. Kim, E. Williams, and S. Nesbitt, 2004: Probing orographic controls in the Himalayas during the monsoon using satellite imagery. Nat. Hazards Earth Syst. Sci., 4, 29–51, https://doi.org/10.5194/nhess-4-29-2004.
Baudouin, J.-P., M. Herzog, and C. A. Petrie, 2020: Cross-validating precipitation datasets in the Indus River basin. Hydrol. Earth Syst. Sci., 24, 427–450, https://doi.org/10.5194/HESS-24-427-2020.
Boschi, R., and V. Lucarini, 2019: Water pathways for the Hindu-Kush-Himalaya and an analysis of three flood events. Atmosphere, 10, 489, https://doi.org/10.3390/atmos10090489.
Boucher, O., S. Denvil, A. Caubel, and M. A. Foujols, 2018: IPSL IPSLCM6A-LR model output prepared for CMIP6 CMIP, version 20180803. Earth System Grid Federation, accessed 6 August 2019, https://doi.org/10.22033/ESGF/CMIP6.1534.
Dahri, Z. H., E. Moors, F. Ludwig, S. Ahmad, A. Khan, I. Ali, and P. Kabat, 2018: Adjustment of measurement errors to reconcile precipitation distribution in the high-altitude Indus basin. Int. J. Climatol., 38, 3842–3860, https://doi.org/10.1002/joc.5539.
Dimri, A. P., 2004: Models to improve winter minimum surface temperature forecasts, Delhi, India. Meteor. Appl., 11, 129–139, https://doi.org/10.1017/S1350482704001215.
Dimri, A. P., and A. Chevuturi, 2014: Model sensitivity analysis study for western disturbances over the Himalayas. Meteor. Atmos. Phys., 123, 155–180, https://doi.org/10.1007/s00703-013-0302-4.
Dimri, A. P., D. Niyogi, A. Barros, J. Ridley, U. Mohanty, T. Yasunari, and D. Sikka, 2015: Western disturbances: A review. Rev. Geophys., 53, 225–246, https://doi.org/10.1002/2014RG000460.
Eyring, V., S. Bony, G. A. Meehl, C. A. Senior, B. Stevens, R. J. Stouffer, and K. E. Taylor, 2016: Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization. Geosci. Model Dev., 9, 1937–1958, https://doi.org/10.5194/gmd-9-1937-2016.
Gelaro, R., and Coauthors, 2017: The Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2). J. Climate, 30, 5419–5454, https://doi.org/10.1175/JCLI-D-16-0758.1.
Granger, C. W. J., 1969: Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37, 424–438, https://doi.org/10.2307/1912791.
Grömping, U., 2006: Relative importance for linear regression in R: The package relaimpo. J. Stat. Software, 17, 1–27, https://doi.org/10.18637/jss.v017.i01.
Hersbach, H., and Coauthors, 2018: Operational global reanalysis: Progress, future directions and synergies with NWP. ERA Rep. Series 27, 65 pp., http://doi.org/10.21957/tkic6g3wm.
Houze, R. A., Jr., 2012: Orographic effects on precipitating clouds. Rev. Geophys., 50, RG1001, https://doi.org/10.1029/2011RG000365.
Houze, R. A., Jr., D. C. Wilton, and B. F. Smull, 2007: Monsoon convection in the Himalayan region as seen by the TRMM precipitation radar. Quart. J. Roy. Meteor. Soc., 133, 1389–1411, https://doi.org/10.1002/QJ.106.
Hunt, K. M., A. G. Turner, and L. C. Shaffrey, 2018a: The evolution, seasonality and impacts of western disturbances. Quart. J. Roy. Meteor. Soc., 144, 278–290, https://doi.org/10.1002/qj.3200.
Hunt, K. M., A. G. Turner, and L. C. Shaffrey, 2018b: Extreme daily rainfall in Pakistan and North India: Scale interactions, mechanisms, and precursors. Mon. Wea. Rev., 146, 1005–1022, https://doi.org/10.1175/MWR-D-17-0258.1.
Hussain, S., S. Xianfang, I. Hussain, L. Jianrong, H. Dong Mei, Y. Li Hu, and W. Huang, 2015: Controlling factors of the stable isotope composition in the precipitation of Islamabad, Pakistan. Adv. Meteor., 2015, 817513, https://doi.org/10.1155/2015/817513.
Immerzeel, W., N. Wanders, A. Lutz, J. Shea, and M. Bierkens, 2015: Reconciling high-altitude precipitation in the upper Indus basin with glacier mass balances and runoff. Hydrol. Earth Syst. Sci., 19, 4673–4687, https://doi.org/10.5194/hess-19-4673-2015.
Iqbal, M. F., and H. Athar, 2018: Validation of satellite based precipitation over diverse topography of Pakistan. Atmos. Res., 201, 247–260, https://doi.org/10.1016/j.atmosres.2017.10.026.
Jeelani, G., R. D. Deshpande, M. Galkowski, and K. Rozanski, 2018: Isotopic composition of daily precipitation along the southern foothills of the Himalayas: Impact of marine and continental sources of atmospheric moisture. Atmos. Chem. Phys., 18, 8789–8805, https://doi.org/10.5194/acp-18-8789-2018.
Lang, T. J., and A. P. Barros, 2004: Winter storms in the central Himalayas. J. Meteor. Soc. Japan, 82, 829–844, https://doi.org/10.2151/JMSJ.2004.829.
Medina, S., R. A. Houze Jr., A. Kumar, and D. Niyogi, 2010: Summer monsoon convection in the Himalayan region: Terrain and land cover effects. Quart. J. Roy. Meteor. Soc., 136, 593–616, https://doi.org/10.1002/qj.601.
Ménégoz, M., H. Gallée, and H. Jacobi, 2013: Precipitation and snow cover in the Himalaya: From reanalysis to regional climate simulations. Hydrol. Earth Syst. Sci., 17, 3921–3936, https://doi.org/10.5194/hess-17-3921-2013.
Midhuna, T., P. Kumar, and A. Dimri, 2020: A new western disturbance index for the Indian winter monsoon. J. Earth Syst. Sci., 129, 59, https://doi.org/10.1007/s12040-019-1324-1.
Muhammad Tahir, K., Y. Yin, Y. Wang, Z. A. Babar, and D. Yan, 2015: Impact assessment of orography on the extreme precipitation event of July 2010 over Pakistan: A numerical study. Adv. Meteor., 2015, 510417, https://doi.org/10.1155/2015/510417.
NASA/GISS, 2018: NASA-GISS GISS-E2.1G model output prepared for CMIP6 CMIP historical. Earth System Grid Federation, accessed 30 September 2020, https://doi.org/10.22033/ESGF/CMIP6.7127.
Nathans, L. L., F. L. Oswald, and K. Nimon, 2012: Interpreting multiple linear regression: A guidebook of variable importance. Pract. Assess. Res. Eval., 17, 21–62.
Palazzi, E., J. Von Hardenberg, and A. Provenzale, 2013: Precipitation in the Hindu-Kush Karakoram Himalaya: Observations and future scenarios. J. Geophys. Res. Atmos., 118, 85–100, https://doi.org/10.1029/2012JD018697.
Palazzi, E., J. von Hardenberg, S. Terzago, and A. Provenzale, 2015: Precipitation in the Karakoram-Himalaya: A CMIP5 view. Climate Dyn., 45, 21–45, https://doi.org/10.1007/s00382-014-2341-z.
Petrie, C. A., and Coauthors, 2017: Adaptation to variable environments, resilience to climate change: Investigating land, water and settlement in Indus northwest India. Curr. Anthropol., 58, 1–30, https://doi.org/10.1086/690112.
Pratt, J. W., 1987: Dividing the indivisible: Using simple symmetry to partition variance explained. Proc. Second Int. Tampere Conf. in Statistics, Tampere, Finland, University of Tampere, 245–260.
Saporta, G., 2006: Probabilités, Analyse des Données et Statistique. Editions Technip, 656 pp.
Vellore, R. K., and Coauthors, 2016: Monsoon-extratropical circulation interactions in Himalayan extreme rainfall. Climate Dyn., 46, 3517–3546, https://doi.org/10.1007/s00382-015-2784-x.
Yu, W., Y.-C. Yang, A. Savitsky, D. Alford, C. Brown, J. Wescoat, D. Debowicz, and S. Robinson, 2013: The Indus basin of Pakistan: The impacts of climate risks on water and agriculture. The World Bank, accessed 5 June 2020, https://openknowledge.worldbank.org/handle/10986/13834.
Yukimoto, S., and Coauthors, 2019: MRI MRI-ESM2.0 model output prepared for CMIP6 CMIP. Earth System Grid Federation, accessed 5 August 2019, https://doi.org/10.22033/ESGF/CMIP6.621.