## Abstract

A robust response of South Asian summer monsoon precipitation to increasing greenhouse gas concentration during the twenty-first century is identified in 23 models from phase 5 of the Coupled Model Intercomparison Project. The pattern of this response is dominated by two dipole structures, one oriented east–west across the Maritime Continent and another oriented north–south across the equatorial Indian Ocean, and is characterized by enhanced rainfall in South Asia and diminished rainfall over the Maritime Continent. The response is robust in the sense that the same pattern has a trend that is within one standard deviation of the trend of other models. Another robust feature is that the variability of precipitation about this trend decreases in all models, and hence becomes more detectable with time. The response is negligible compared to internal variability during the twentieth century but emerges clearly by the middle of the twenty-first century. Although the pattern as a whole is robust, the response over small land areas such as India has more uncertainty, with some models disagreeing even with the sign of the response.

## 1. Introduction

The South Asian summer monsoon has tremendous socioeconomic impact on the lives of billions of people. The possibility that increases in greenhouse gas concentration may lead to changes in the South Asian summer monsoon is of obvious concern. While the response of temperature due to increasing greenhouse gas concentration has a well-defined physical mechanism, and the associated response of water vapor concentration is strongly constrained by the Clausius–Clapeyron equation (Boer 1993; Held and Soden 2006), the response of precipitation is much more uncertain. State-of-the-art climate models consistently predict that global mean precipitation will increase in response to increasing greenhouse gas concentration (Meehl et al. 2007; Cherchi et al. 2011; Lee and Wang 2014; Hsu et al. 2013), but the change over South Asia is relatively uncertain (Turner and Annamalai 2012), and recently developed climate models show no reduction in uncertainty (Knutti and Sedláček 2013). Observations indicate that South Asian monsoon rainfall has decreased in the second half of the twentieth century (Ramanathan et al. 2005), but the mechanism for this decrease is unclear: some researchers attribute this weakening to weakening of tropical meridional overturning circulation induced by anthropogenic aerosols (Bollasina et al. 2011), while others attribute this weakening to circulation changes caused by anthropogenic warming over the Pacific warm pool (Annamalai et al. 2013).

The purpose of the paper is to clarify the forced response of South Asian summer monsoon precipitation to changes in greenhouse gas concentrations. In particular, we will apply a statistical optimization method to identify the most detectable climate change signal in South Asian summer monsoon rainfall, and identify those aspects that are robust across the models and hence insensitive to model details. In this paper, robustness means more than mere agreement of the sign; rather, it means the response has the same spatial structure and trend within a prescribed significance level. Whether such a robust response exists is by no means obvious: most state-of-the art GCMs have large biases in simulating the mean and interannual variability of the South Asian summer rainfall, and projections of future precipitation change in the South Asian region have low confidence because of large intermodel disparities in projected precipitation responses. Earlier studies (e.g., Lee and Wang 2014) used a small number of models to project future global precipitation patterns and found that the uncertainty increases when multimodel mean based upon all models from phase 5 of the Climate Model Intercomparison Project (CMIP5) is used to project future change in precipitation. In contrast, we have retained all 23 models in our study and applied a special technique to optimize the detection pattern in the South Asian region. Finally, our method does not make any assumptions regarding the time evolution of the forced response, in contrast to previous studies that use trend analysis (e.g., Menon et al. 2013). Instead, our method identifies a forced response essentially by maximizing the ratio of variance between the control and forced runs, and hence identifies variability that can be attributed directly to climate forcing.

We acknowledge that finding a detailed mechanism responsible for the trends is beyond the scope of this paper. Also, we do not assess the accuracy of the model simulations relative to observations. Instead, our objective is to determine if there exists a robust response pattern common to all models. The existence of a common response pattern would imply that the particular response is insensitive to differences in model physics and hence would provide a compelling candidate for mechanistic study and for detection and attribution analysis of observations.

## 2. Data

To avoid confusion, we first discuss the data, so that the methodology can be explained later in the context of the data to which it is applied. The goal of our technique is to determine the climate change signal that maximizes detectability over the pool of models. The models are chosen from CMIP5. In particular, we choose models for which three types of simulations are available: “preindustrial control runs,” also called piControl runs, which simulate climate variability in the absence of interannual changes in anthropogenic and natural forcings; “twentieth-century runs,” also called historical runs, which include anthropogenic and natural forcings of the twentieth century; and “twenty-first-century runs” with representative concentration pathways characterized by radiative forcing equivalent to 8.5 W m^{−2} by 2100, also called RCP8.5. The RCP8.5 is a scenario that does not include any specific mitigation target (Riahi et al. 2011). There are 23 models that satisfy this condition. These models are listed in Table 1. Table 2 provides information about the number of ensemble members and the length of integrations in the preindustrial control, historical, and RCP8.5 runs for each model. A full description of the experiments in CMIP5 can be found in Taylor et al. (2012). We have taken the last 240 yr in the piControl and the first 94 yr (2006–99) in the RCP8.5 runs and interpolated the monthly rainfall data from each model onto a common 2.5 × 2.5 grid and calculated the June–September (JJAS) mean rainfall anomaly over the South Asian region (30°S–40°N, 40°E–180°) by subtracting JJAS mean climatology from JJAS mean rainfall in the piControl and RCP8.5, respectively. Then we pooled the piControl and RCP8.5 JJAS mean rainfall anomaly data from 23 models together. This method differs slightly from that of Jia and DelSole (2012) in which the piControl climatology is subtracted from both the piControl and twentieth-century runs. By removing the climatology from each run separately, we are effectively ignoring the difference in means *when computing the EOFs*. We found that subtracting just the piControl climatology from both runs produced EOFs that did not project onto a robust pattern. Perhaps the mean climate change pattern differs widely between models and therefore contaminates the EOFs with patterns that are not robust. In any case, removing the climatology from each run separately is not unreasonable when computing EOFs. Moreover, it will become clear that the climate change signal derived from our analysis has a clear mean climate change pattern, so removing the difference in means in the EOF calculation does not imply that our analysis cannot detect a difference in mean.

## 3. Optimization procedure

We apply a form of discriminant analysis (DA) to identify the most detectable climate change signal (Jia and DelSole 2012). This technique does not prescribe a specific time variation of the climate change signal, in contrast to trend analysis, and has the virtue that no other pattern can be more detectable in the models from which it was derived (at least when the signal is expressed with a particular combination of variables and optimized in a particular subspace).

### Discriminant analysis

Discriminant analysis finds the linear combination of variables (values at grid points) that maximizes the ratio of variances between two different datasets. A critical assumption in this technique is that forced variability behaves as an independent and additive perturbation to unforced variability, in which case total variability can be modeled as

where *T*, *F*, and *U* denote the total, forced, and unforced variability, respectively. To the extent that (1) holds, the total variance in the RCP8.5 run equals the sum of the variances due to forced and unforced components, which may be expressed as

where **Σ**_{RCP8.5} is the covariance matrix of RCP 8.5 runs and **Σ**_{F} and **Σ**_{U} are the (unknown) covariance matrices of forced and unforced components, respectively. In piControl, there is no external forcing, so

Let the *centered* piControl and RCP8.5 data be denoted as

respectively, where *N*_{X} and *N*_{Y} are the sample sizes of the piControl and RCP8.5 datasets, respectively; and *S* is the dimension of the state space. Since there are more variables than samples in each dataset, the respective covariance matrices will be singular. Therefore, the datasets are projected onto lower dimensional space using principal component analysis. If *T* denotes the number of basis vectors, then the filtered datasets can be approximated by the leading principal components as

where the columns of and give the time series and spatial structure of the principal components, often called PCs and EOFs, respectively; and the dots above the quantities indicate filtered data. The datasets are said to be filtered because, by selecting first *T* basis vectors, some variability has been filtered out. Discriminant analysis is then applied to the PCs _{X} and _{Y}. Let denote the weighting coefficients for the PCs, defined such that

where the tilde denotes a quantity in basis vector space. Therefore, the ratio of sample variances is

where

where *M*_{X} and *M*_{Y} are the number of means taken out from piControl and RCP8.5 data, respectively. Maximizing the variance ratio *λ* given by (8) leads to the generalized eigenvalue problem

The solution of (10) gives *T* eigenvalues that can be organized in decreasing order of magnitude. The corresponding eigenvectors are the coefficients of the linear combination of *T* basis vectors that give time series, such that the first maximizes the variance ratio, the second maximizes the variance ratio subject to being uncorrelated to the first one, and so on.

The significance of the eigenvalues is tested using a Monte Carlo method. In this method, independent random variables are drawn from a Gaussian distribution with a fixed but arbitrary covariance matrix. Then we solve the generalized eigenvalue problem (10), save the resulting eigenvalues, and repeat the procedure 1000 times, and finally select the 95th percentile of the sample. The original eigenvalues are said to be significant at the 5% level if they are greater than the 95th percentile of the empirical eigenvalue distribution generated from Monte Carlo experiments.

The above-mentioned procedure is applied to all truncations *T* between 1 and 20 inclusive. Since there is little objective basis for preferring one truncation relative to another, we have explored all truncations with the aim of ensuring that the results are not sensitive to truncation.

After a truncation *T* is selected (for our study we have selected truncation 20), the generalized eigenvalue problem (10) is solved to diagnose the components responsible for the difference in variances. The solution to (10) gives the eigenvectors . The weights can be combined into matrix form as

The time series (also called variates) corresponding to each eigenvector are the column vectors of

Since **Σ**_{X} and **Σ**_{Y} are symmetric, the time series are orthogonal. It is convenient to scale the eigenvectors so that the *X* time series have unit variance. Under this normalization, it can be shown that

where diag(*λ*_{1 }*λ*_{2} … *λ*_{T}) denotes a diagonal matrix with diagonal elements *λ*_{1 }*λ*_{2} … *λ*_{T}. The loading vector for each component is obtained as the regression map between the data and the corresponding variate.

## 4. Results

Figure 1 shows the maximized variance ratio [leading eigenvalue from (10)] for each principal component truncation from 1 to 20. The red shading bounds the 5th and 95th percentiles of the maximized variance ratio from Monte Carlo estimates. It is apparent that the maximized variance ratios are significant at the 5% level for all the combinations of PCs from 1 to 20. Based on this result, we reject the hypothesis that the covariance matrices are equal in the piControl and RCP8.5 runs. In other words, anthropogenic forcings produce a significant forced response in South Asian JJAS rainfall in RCP8.5.

To examine the sensitivity of the forced response to each model and truncation, we selected a region (6.25°–1.25°S, 91.25°–106.25°E) in the equatorial Indian Ocean (which corresponds to a region of maximum loading) and then computed the trend in the leading discriminant averaged over this region for each truncation and model. The result is shown in Fig. 2. The figure shows that the majority of the models produce a trend of the same sign for all truncations. However even for the extreme cases of MIROC-ESM and MIROC-ESM-CHEM, the trends have same sign for *T* > 15. Thus, the forced response in any of the models does not depend upon the selection of truncation for *T* > 15. For this reason we choose *T* = 20. We also checked the sensitivity of our results to the selection of the spatial domain and analyzed the results over other domains [India (7°–35°N, 65°–95°E) and the intermediate South Asian region (30°S–35°N, 45°–100°E)], and found that the spatial pattern of the change in precipitation and the robustness in the forced response is insensitive to the changes (increase or decrease) in the area of the spatial domain. However, the signals are little weaker over these smaller domains (figures not shown here).

The most detectable climate change pattern based on JJAS mean precipitation derived from RCP8.5 in the CMIP5 models is shown in Fig. 3. The pattern is dominated by two dipole structures, with the first dipole oriented zonally across the Maritime Continent having its poles in the west Pacific Ocean and in the eastern south equatorial Indian Ocean, and the other dipole oriented meridionally across the equator with poles lying in the north equatorial Indian Ocean and the eastern south equatorial Indian Ocean. As a result of the dipole structure, a spatial average of precipitation over the Indian Ocean region would involve substantial cancellation and hence little to no signal, indicating the value of our methodology compared to spatial averaging. Note that the amplitudes are much stronger over ocean than over land. This pattern is similar to the response of precipitation to doubling carbon dioxide concentrations in CMIP3 models (Turner and Annamalai 2012), suggesting that the cause of this pattern is due to changing greenhouse gas concentrations and that the structure of the pattern is robust across different generations of models.

The time evolution of the most detectable climate change pattern in the models is illustrated in Fig. 4. The time series between 1850 and 2005 and the time series beyond 2005 are derived by multiplying the weights derived from (11) by the JJAS anomaly data in the respective time periods. Here, the time series (or variates) have been constructed after subtracting the JJAS mean climatology of the piControl runs from the JJAS mean rainfall in piControl runs, historical runs, and RCP8.5 runs. Further, the time series for a given model has been standardized with respect to its preindustrial control. The thick solid curves show the multimodel mean amplitude across 23 models, while the shading shows ±1 standard deviation across models (only one ensemble member per model is included). Importantly, there is no significant trend of the time series during the twentieth century, implying that this climate change signal would be undetectable during the twentieth century. Recall that the signal obtained here is optimal, so the fact that this pattern is not detectable during the twentieth century implies that *no* pattern based on JJAS mean precipitation in the South Asia region is detectable (this does not preclude detectability of signals expressed in different ways, such as with longer-term averages or with simultaneous space–time information). In contrast, a strong trend emerges in the twenty-first century, with the multimodel mean exceeding 1 standard deviation of internal variability around 2030. The space–time behavior of this climate change signal can be inferred by multiplying the time series in Fig. 4 by the pattern in Fig. 3 (note, however, that the time series in Fig. 4 have been normalized by their control runs, so the resulting product for each model is unique up to a positive multiplicative factor). The resulting climate change signal is characterized by enhanced rainfall north of the equator, including parts of South Asia, and drying south of the equator including the Maritime Continent. The trend over South Asia in the twenty-first-century runs inferred here from climate models is opposite to that observed in the later half of the twentieth century (Ramanathan et al. 2005).

The model dependence of the climate change signal during the twenty-first century (RCP8.5) is indicated in Fig. 5. In this case, the amplitudes have not been standardized by their control simulations, so the trend values can be multiplied directly by the spatial pattern shown in Fig. 3 to obtain the trend contributed by this signal at each point (in units of mm day^{−1} decade^{−1}). The trend is seen to be of the same sign in all models. Also, the magnitudes of the trend are “close” to each other, in the sense that several models have trends within a standard deviation of the trend of other models. These two points imply that the climate change signal derived here is “robust” across the models. Since the model trends are not significantly different from each other, differences in trends can be explained largely by sampling error. Another robust aspect of the climate change signal is its variability about the long-term trend. Figure 6 shows the log ratio of variance between the twenty-first century (RCP8.5) and preindustrial control runs for the residuals about the long-term trend. All models show a decrease in variability about the long-term trend, with most models indicating more than a 50% decrease in variability, and with some models indicating more than a sevenfold decrease in variability. This decrease in variability of precipitation presumably reflects the weakening of the atmospheric circulation that is expected to occur as a consequence of the increase in lower-tropospheric water vapor (Held and Soden 2006). This result also implies that the growth in spread during the twenty-first century is due to differences in trends between models, rather than growth of variability within a model. The fact that the response grows nearly linearly in time suggests that the response is a linear function of forcing. To verify this further, we projected the response pattern computed from RCP8.5 onto the RCP4.5 simulations and found that the resulting time series also grows linearly in time but at nearly half the rate as in RCP8.5 (Fig. 7). Therefore, other scenarios with even less forcing, such as RCP2.6, or more uniform growth, such as a 1% CO_{2} increase, are expected to contain similar response patterns but different time series corresponding to the different scenarios.

The precipitation response derived here is most likely a response to increasing greenhouse gas concentrations: for example, the time series of the pattern increases almost linearly during the twenty-first century and the pattern, especially the north–south dipole in the Indian Ocean, is similar to the predicted change in precipitation due to doubling of carbon dioxide concentrations (Turner and Annamalai 2012). Bony et al. (2013) show that the north–south dipole in the Indian Ocean arises in the early stages of an abrupt increase in carbon dioxide concentration, including in atmosphere-only models in which the sea surface temperature is held fixed while carbon dioxide concentration is quadrupled. This result suggests that the dipole response over the Indian Ocean is driven primarily by relatively rapid changes in the radiative balance of the atmosphere. Specifically, Bony et al. (2013) argue that increasing greenhouse gas concentrations weaken the net radiative cooling of the atmosphere, slightly stabilizing the atmosphere and hence slightly suppressing vertical motions everywhere except over land. Land areas tend to warm more than oceans, owing to the difference in heat capacities, and hence they experience an enhanced ascent. In contrast, Bony et al. (2013) attribute the enhanced precipitation in the Pacific as a manifestation of the wet-get-wetter, dry-get-drier regional pattern that is expected to occur due to global warming (Held and Soden 2006, see also their Fig. 7). Unfortunately, the magnitude of the latter depends on climate sensitivity and the spatial structure depends on the climatology of precipitation, which has significant biases in models. Thus, the relative roles of the two mechanisms are expected to be sensitive to model. The above-mentioned mechanism is by no means exhaustive. Sobel and Camargo (2011) obtain a similar pattern of change in summer precipitation, which they attribute to sea surface temperature (SST) changes induced by changes in surface wind speed. Sulfate aerosols, black carbon, and land use changes over the South Asian region also may play a role in human-caused precipitation changes (Bollasina et al. 2011; Chung and Ramanathan 2007; Niyogi et al. 2010). Separating these different mechanisms and attributing our response pattern to a particular subset of these mechanisms requires specialized climate simulations that lie outside the scope of the present paper.

We emphasize that while the large-scale structure and trend of the above-mentioned climate change signal is robust across models, the response on shorter space or time scales is subject to considerable uncertainty. To illustrate this sensitivity, we show in Fig. 8 the average trend in precipitation in the twenty-first century over land points of central India (15°–25°N, 70°–87°E). Despite the fact that the large-scale climate change signal shown in Fig. 3 has a positive trend of precipitation over India in all the models, some models predict a negative trend of spatially averaged South Asian monsoon precipitation over India itself, and the magnitude of the trend is more variable when only the mean precipitation over parts of India are considered. This sensitivity is analogous to that seen in global warming: although the global average temperature has a robust multidecadal response in all climate models, the temperature change over shorter space and time scales is much more uncertain, and can even differ in sign in some locations (Deser et al. 2012).

## 5. Conclusions

We applied discriminant analysis to identify the spatiotemporal pattern of South Asian summer monsoon rainfall in response to increasing greenhouse gas concentrations. The spatial pattern of this response is dominated by two dipole structures, one oriented east–west across the Maritime Continent and the other oriented north–south across the equatorial Indian Ocean. This pattern identifies enhanced rainfall north of the equator, including land points of South Asia and Southeast Asia, and suppressed rainfall south of the equator. The pattern is robust in the sense that all 23 CMIP5 models predict trends of the same sign in South Asian summer monsoon rainfall and the trends shown by the models are within one standard deviation of the trend of other models. Another robust aspect of our result is that the variability about the long-term trend is predicted to decrease. This result implies that the growth in spread among models during the twenty-first century is due to differences in trends among models, rather than growth of variability within a model. However, it is important to note that while the large-scale structure and trend of the climate change signal in South Asian precipitation is robust, there has been considerable uncertainty on shorter spatial and time scales.

## Acknowledgments

This research was supported by the Department of Energy (Grants ER64633, ER65095), the National Science Foundation (Grant ATM0830068), the National Aeronautics and Space Administration (Grant NNX09AN50G), and the National Oceanic and Atmospheric Administration (Grant NA09OAR4310058). The views expressed herein are those of the authors and do not necessarily reflect the views of these agencies.

## REFERENCES

*J. Climate,*

**26,**2701–2718, doi:.

*Science,*

**334,**502–505, doi:.

_{2}levels on monsoons

*J. Climate,*

**19,**5686–5699, doi:.

**118,**1247–1260, doi:.

**3,**369–373, doi:.

**42,**101–119, doi:.

*Climate Change 2007: The Physical Science Basis,*S. Solomon et al. Eds., Cambridge University Press, 747–846.

*Bull. Amer. Meteor. Soc.,*

**93,**485–498, doi:.

**2,**587–595, doi:.