Predicting Monsoon Intraseasonal Precipitation using a Low-Order Nonlinear Stochastic Model

Nan Chen Department of Mathematics and Center for Atmosphere Ocean Science, Courant Institute of Mathematical Sciences, New York University, New York, New York

,
Andrew J. Majda Department of Mathematics and Center for Atmosphere Ocean Science, Courant Institute of Mathematical Sciences, New York University, New York, New York, and Center for Prototype Climate Modeling, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates

,
C. T. Sabeerali Center for Prototype Climate Modeling, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates

, and
R. S. Ajayamohan Center for Prototype Climate Modeling, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates

Open access

Abstract

The authors assess the predictability of large-scale monsoon intraseasonal oscillations (MISOs) as measured by precipitation. An advanced nonlinear data analysis technique, nonlinear Laplacian spectral analysis (NLSA), is applied to the daily precipitation data, resulting in two spatial modes associated with the MISO. The large-scale MISO patterns are predicted in two steps. First, a physics-constrained low-order nonlinear stochastic model is developed to predict the highly intermittent time series of these two MISO modes. The model involves two observed MISO variables and two hidden variables that characterize the strong intermittency and random oscillations in the MISO time series. It is shown that the precipitation MISO indices can be skillfully predicted from 20 to 50 days in advance. Second, an effective and practical spatiotemporal reconstruction algorithm is designed, which overcomes the fundamental difficulty in most data decomposition techniques with lagged embedding that requires extra information in the future beyond the predicted range of the time series. The predicted spatiotemporal patterns often have comparable skill to the MISO indices. One of the main advantages of the proposed model is that a short (3 year) training period is sufficient to describe the essential characteristics of the MISO and retain skillful predictions. In addition, both model statistics and prediction skill indicate that outgoing longwave radiation is an accurate proxy for precipitation in describing the MISO. Notably, the length of the lagged embedding window used in NLSA is crucial in capturing the main features and assessing the predictability of MISOs.

Denotes content that is immediately available upon publication as open access.

© 2018 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Nan Chen, chennan@cims.nyu.edu

1. Introduction

The monsoon intraseasonal oscillation (MISO) (Sikka and Gadgil 1980; Webster et al. 1998; Goswami and Ajayamohan 2001; Lau and Waliser 2011; Kikuchi et al. 2012; Lee et al. 2013) is one of the prominent modes of tropical intraseasonal variability. As a slow-moving planetary-scale envelope of convection propagating northeastward, it strongly interacts with the boreal summer monsoon rainfall over South Asia. Because of the interaction with the mean monsoon circulation and other modes of tropical variability, the propagation of the MISO is more complex compared with the eastward-propagating Madden–Julian oscillation (MJO) (Zhang 2005). The MISO plays a crucial role in determining the onset and demise of the Indian summer monsoon as well as affecting the seasonal amount of rainfall over the Indian subcontinent (Murakami et al. 1986; Goswami and Ajayamohan 2001; Goswami et al. 2003; Gadgil 2003). Therefore, both the real-time monitoring and accurate extended-range forecast of MISO phases are important, and they have large socioeconomic impacts over the Indian subcontinent (Sahai et al. 2013; Abhilash et al. 2014a).

Both dynamical models (Wang et al. 2005; Pattanaik and Kumar 2010; Acharya et al. 2011; Nair et al. 2014) and low-order statistical models (Rajeevan et al. 2007; DelSole and Shukla 2002) are widely utilized for predicting the MISO. While the forecast through operational models captures more refined structures, prediction with low-order models aims at the large-scale features and is thus computationally efficient. The prediction through low-order models relies on developing effective MISO indices, which are typically given by a few principal components (PCs) that explain the intraseasonal variabilities of the high-dimensional raw data. Once the indices are predicted, spatiotemporal reconstruction by making use of the spatial bases associated with these PCs results in the forecast of the large-scale spatial patterns.

Several indices have been proposed for the real-time monitoring and extended-range forecast of the MISO. The Indian Institute of Tropical Meteorology (IITM) relies on an index based on extended empirical orthogonal function (EEOF) analysis, which is applied to longitudinally averaged daily rainfall anomalies for the extended-range prediction of the MISO (Suhas et al. 2013; Sahai et al. 2013; Abhilash et al. 2014a). Another well-known MISO index (Lee et al. 2013) mimics the real-time multivariate MJO (RMM) index (Wheeler and Hendon 2004) and is based on the multivariate EOF analysis of daily anomalies of the zonal wind at 850 hPa and outgoing longwave radiation (OLR). Other MISO indices (Kikuchi et al. 2012; Goswami et al. 1999) are based on similar EOF and EEOF techniques or their analog, multichannel singular spectrum analysis (MSSA; Krishnamurthy and Shukla 2007). These covariance-based approaches in general capture the spatiotemporal MISO patterns reasonably well and isolate the northeastward-propagating intraseasonal periodicity band from the high-frequency band (Suhas et al. 2013; Abhilash et al. 2014a,b). Yet, the seasonal extraction and longitudinal averaging in computing these indices are sometimes ad hoc and can potentially lead to the loss of predictive information or mixing with other modes. In addition, these covariance-based techniques may be inadequate for capturing the rare/extreme events in complex nonlinear dynamics (Crommelin and Majda 2004) that have significant societal and economic impacts.

Recently Sabeerali et al. (2017) developed a new MISO index based on the nonlinear Laplacian spectral analysis (NLSA; Giannakis and Majda 2012b,a) technique. NLSA is a nonlinear data analysis technique that combines ideas from lagged embedding (Packard et al. 1980; Sauer et al. 1991), machine learning (Coifman and Lafon 2006; Belkin and Niyogi 2003), adaptive weights, and spectral entropy criteria to extract spatiotemporal modes of variability from high-dimensional time series. These modes are computed utilizing the eigenfunctions of a discrete analog of the Laplace–Beltrami operator, which can be thought of as a local analog of the temporal covariance matrix employed in EOF and EEOF techniques, but adapted to the nonlinear geometry of data generated by complex dynamical systems. A key advantage of NLSA over classical covariance-based techniques is that NLSA by design requires no ad hoc preprocessing of data such as detrending or spatiotemporal filtering of the full dataset, and it captures both intermittency and low-frequency variability (Giannakis and Majda 2012a,b, 2013; Giannakis et al. 2012b). Therefore, the NLSA-based MISO index provides an objective identification of MISO patterns from noisy precipitation data. In addition, as reported in Sabeerali et al. (2017) the NLSA MISO modes have higher memory and predictability, stronger amplitude, and higher fractional explained variance over the western Pacific, Western Ghats, and adjoining Arabian Sea regions and a more realistic representation of the regional heat sources over the Indian and Pacific Oceans compared with those extracted via EEOF analysis. Other applications of NLSA beyond the capability of EOF and EEOF in capturing both the intermittent and low-frequency modes in climate, atmosphere, and ocean can be found in previous studies (Székely et al. 2016a,b; Slawinska and Giannakis 2017; Giannakis and Majda 2012a, 2011; Brenowitz et al. 2016).

In this article, we develop a prediction framework for the large-scale MISO precipitation. This is achieved in two steps: 1) predicting the NLSA-based MISO indices and 2) reconstructing the predicted large-scale spatiotemporal patterns of the MISO precipitation. In the first step, a physics-constrained low-order stochastic model (Majda and Harlim 2013; Harlim et al. 2014) is developed to describe and predict the NLSA MISO indices. This physics-constrained low-order stochastic model contains two MISO variables and two hidden variables. They couple with each other through energy-conserving nonlinear interactions and involve both correlated multiplicative noise and additive stochastic noise. The special structure of this nonlinear stochastic model allows an effective data assimilation algorithm for determining the initial ensemble of the hidden variables that facilitates the ensemble prediction scheme. Note that this nonlinear low-order stochastic modeling framework has been shown to have significant skill for determining the predictability limits of the large-scale cloud patterns of both the boreal winter MJO and boreal summer intraseasonal oscillations (Chen et al. 2014; Chen and Majda 2015a) as well as improving the prediction skill of the RMM indices (Chen and Majda 2015b). In the second step, an effective and practical spatiotemporal reconstruction algorithm is designed for obtaining the predicted large-scale spatiotemporal patterns of the MISO precipitation. By incorporating a “predicted spatial basis” determined completely from the training data, this spatiotemporal reconstruction method overcomes the fundamental difficulty in most data decomposition techniques with lagged embedding that requires extra information in the future beyond the predicted range of the time series.

Several related issues are also addressed in this article. First, because of the lack of sufficient observational data in many real-world situations, a short training phase is usually preferred from a practical point of view. To this end, we compare both the statistical features and the prediction skill using a 3-yr short training period with those using the 10-yr period as in the default setup. It is shown that the model is able to capture and predict the main characteristics of the MISO indices even with such a short training period. Second, since most tropical rainfall is convective, OLR is a potential candidate for assessing the precipitation in the tropics. To see whether OLR is a good proxy for describing the MISO precipitation, the parameter values calibrated from the OLR monsoon modes (Chen and Majda 2015a) are adopted in the low-order model to study the skill of predicting the MISO precipitation indices. Furthermore, an intraseasonal time length is shown to be crucial for the lagged embedding window size in the NLSA in order to capture the main MISO characteristics as well as determining the predictability of the MISO precipitation.

The remainder of this article is organized as follows. Section 2 describes the precipitation dataset and the MISO indices obtained from the NLSA technique. Section 3 presents the physics-constrained low-order nonlinear stochastic model as well as the calibration and the effective prediction algorithm. The results of predicting the MISO indices are reported in section 4, and the prediction of the spatiotemporal reconstructed patterns is shown in section 5. Section 6 discusses the possibility of shortening the training period to only 3 years, the strong connection between OLR and precipitation in describing and predicting the MISO, and the significance of adopting the lagged embedding window with intraseasonal time length. Summary and conclusions are included in section 7.

2. The precipitation MISO indices from NLSA

In this study, the MISO indices for the period 1998–2013 are extracted from the daily Global Precipitation Climatology Project (GPCP) rainfall data (Huffman et al. 2001) over the Asian summer monsoon region (20°S–30°N, 30°–140°E), using the NLSA algorithm. The spatial resolution of the GPCP dataset is 1° × 1°, amounting to grid points for the Asian summer monsoon region.

NLSA is applied to the daily GPCP dataset with a lagged embedding window of size days, a natural choice for the intraseasonal time scale (Sabeerali et al. 2017). Therefore, the total length of the data in time is (days) with (days), and the length of data from 1998 to 2007 (the training phase as will be defined in section 3) is (days) with (days). A variety of extended spatial precipitation patterns emerge from the analysis but the focus here is on the two spatial patterns associated with MISO, where the associated time series are depicted in Fig. 1a. These two MISO time series are the NLSA-based analogs of the MSSA (EEOFs) PCs, but they provide more objective identification of MISO patterns from noisy precipitation data and have higher memory and predictability (Sabeerali et al. 2017). The details of applying NLSA to daily GPCP dataset have already been described in Sabeerali et al. (2017), and hence they are omitted here. It is evident from Fig. 1a that these patterns are active in boreal summer and quiescent in boreal winter. As shown in Fig. 1b, such temporal intermittency results in highly non-Gaussian probability distribution functions (PDFs) with fat tails of both the time series. In addition, although the spectrum of both MISO time series peaks at 45 days, significant powers lie within the band from 30 to 60 days (from ½ to 1 month−1; Fig. 1c), which indicates the irregularity of the MISO frequency.

Fig. 1.

MISO indices and statistical features. (a) MISO indices from NLSA ranging from 1 Jan 1998 to 31 Dec 2013. The period up to 31 Dec 2007 is utilized as the training period in section 3b for model calibration, and the prediction skill is tested for years 2008–2013 in section 4. (b),(c) The highly non-Gaussian PDFs with fat tails and the power spectra of the MISO indices in the training period. The power spectra here and in other figures are computed via the Thomson multitaper spectrum approach, which provides a smooth estimate (Thomson 1982). (d) Correlation functions of the MISO indices (blue) and the signal from the low-order model [(1) and (2)] with the optimal parameters in Table 1 (red), where the autocorrelation of MISO 1 up to 2 years is shown at the top, and the autocorrelations of MISO 1 and MISO 2 and the cross-correlation between them up to 3 months are shown at the bottom. (e) Comparison of the PDFs between the MISO indices (blue) and model signal (red), where the inset shows the PDF on a logarithmic scale. (f) The power spectra of the model signals.

Citation: Journal of Climate 31, 11; 10.1175/JCLI-D-17-0411.1

It was shown in Sabeerali et al. (2017) that the NLSA MISO modes display the key characteristics of MISO such as northeastward propagating anomalies associated with the MISO. A case study there also revealed three consecutive MISO events in the NLSA MISO modes in the boreal summer of 2004, the onset and demise phases of which are highly consistent with observations. These facts indicate that the time series depicted in Fig. 1a give a reasonable representation of the full life cycle of the northward propagating boreal summer convection band and can be utilized to determine the phase and amplitude of the poleward-propagating rainfall anomalies associated with the MISO. Below, we utilize the terminology “MISO indices” for the two time series in Fig. 1.

3. The low-order nonlinear stochastic model and the prediction algorithm

a. The model

To describe the temporal intermittency and the randomness in the oscillation frequency of the MISO indices, the following family of low-order stochastic models is proposed. Here the two components, MISO 1 and MISO 2, are denoted by $u_1$ and $u_2$, respectively:
$$
\begin{aligned}
du_1 &= \big[-d_u u_1 + \gamma\,(v + v_f(t))\,u_1 - (a + \omega_u)\,u_2\big]\,dt + \sigma_u\,dW_{u_1}, && \text{(1a)}\\
du_2 &= \big[-d_u u_2 + \gamma\,(v + v_f(t))\,u_2 + (a + \omega_u)\,u_1\big]\,dt + \sigma_u\,dW_{u_2}, && \text{(1b)}\\
dv &= \big[-d_v v - \gamma\,(u_1^2 + u_2^2)\big]\,dt + \sigma_v\,dW_v, && \text{(1c)}\\
d\omega_u &= \big[-d_\omega \omega_u\big]\,dt + \sigma_\omega\,dW_\omega, && \text{(1d)}
\end{aligned}
$$
where
$$
v_f(t) = f_0 + f_t \sin(\omega_f t + \phi). \qquad \text{(2)}
$$
In addition to the two observed MISO variables $u_1$ and $u_2$, the other two variables $v$ and $\omega_u$ are unobserved; they represent the stochastic damping and stochastic phase, respectively. In (1), $a$ is the mean oscillation frequency, $\gamma$ sets the strength of the nonlinear coupling, $d_u$, $d_v$, and $d_\omega$ are damping coefficients, and $\sigma_u$, $\sigma_v$, and $\sigma_\omega$ are diffusion coefficients that measure the noise strength. The parameters $f_0$ and $f_t$ are large-scale and time-periodic damping coefficients, and $W_{u_1}$, $W_{u_2}$, $W_v$, and $W_\omega$ are independent white noise processes. The time-periodic damping in (1a) and (1b) is utilized to crudely model the active summer season and the quiescent winter season in the seasonal cycle. The hidden variables and the observed MISO variables are coupled with each other through energy-conserving nonlinear interactions following the systematic physics-constrained nonlinear regression strategies for time series (Majda and Harlim 2013; Harlim et al. 2014). The energy-conserving nonlinearity is easily seen by multiplying (1a)–(1d) by $u_1$, $u_2$, $v$, and $\omega_u$, respectively, and then summing the resulting equations. The energy changes in the quadratic nonlinear terms cancel, and thus the energy due to the nonlinear interaction is conserved. The energy-conserving nonlinear interactions guarantee the robustness of the low-order model [(1)] with respect to parameter variation. They also prevent the model from finite-time blowup of statistical solutions and pathological behavior of the related invariant measure (Majda and Yuan 2012; Majda and Harlim 2013). The hidden variables and their dynamics can be regarded as phenomenological surrogates for the energy-conserving interactions in the model involving the synoptic-scale activity and the equatorial convective dynamic equations for temperature, velocity, and moisture that affect the precipitation. The low-order stochastic nonlinear models in (1) are fundamentally different from those utilized earlier (Kondrashov et al. 2013; Kravtsov et al. 2005), which allow for nonlinear interactions only between the observed variables and only special linear interactions with layers of hidden variables.
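As a minimal numerical sketch of how a model of this type can be integrated, the block below applies the Euler–Maruyama scheme to a four-variable system with two observed variables, a stochastic damping, and a stochastic phase. It assumes the drift written in the code comments (energy-conserving quadratic coupling with strength `gamma` and a sinusoidal large-scale damping). The parameter values in `PARAMS` and the function names are hypothetical placeholders chosen only for illustration; they are not the calibrated values of Table 1.

```python
import numpy as np

# Hypothetical placeholder parameters (time unit: month); NOT the values in Table 1.
PARAMS = dict(d_u=0.8, d_v=0.6, d_w=0.5, gamma=0.4, a=4.0,
              sigma_u=0.3, sigma_v=0.5, sigma_w=1.0,
              f0=-1.0, ft=1.5, omega_f=2 * np.pi / 12, phi=0.0)


def nonlinear_energy_tendency(u1, u2, v, w, gamma):
    """Each variable times the quadratic nonlinear part of its own equation, summed."""
    n1 = gamma * v * u1 - w * u2           # nonlinear part assumed for u1
    n2 = gamma * v * u2 + w * u1           # nonlinear part assumed for u2
    n3 = -gamma * (u1 ** 2 + u2 ** 2)      # nonlinear part assumed for v
    n4 = 0.0                               # assumed: no nonlinear term for the phase w
    return u1 * n1 + u2 * n2 + v * n3 + w * n4


def simulate(p, t_end=24.0, dt=1e-3, seed=0):
    """Euler-Maruyama integration; returns an array with columns (u1, u2, v, w)."""
    rng = np.random.default_rng(seed)
    n = int(round(t_end / dt))
    out = np.zeros((n + 1, 4))
    out[0] = [0.1, 0.0, 0.0, 0.0]
    noise = rng.standard_normal((n, 4)) * np.sqrt(dt)   # Brownian increments
    sig = np.array([p["sigma_u"], p["sigma_u"], p["sigma_v"], p["sigma_w"]])
    for k in range(n):
        u1, u2, v, w = out[k]
        # assumed time-periodic large-scale damping: f0 + ft * sin(omega_f * t + phi)
        vf = p["f0"] + p["ft"] * np.sin(p["omega_f"] * k * dt + p["phi"])
        growth = -p["d_u"] + p["gamma"] * (v + vf)       # effective damping of u1, u2
        freq = p["a"] + w                                # effective rotation frequency
        drift = np.array([growth * u1 - freq * u2,
                          growth * u2 + freq * u1,
                          -p["d_v"] * v - p["gamma"] * (u1 ** 2 + u2 ** 2),
                          -p["d_w"] * w])
        out[k + 1] = out[k] + drift * dt + sig * noise[k]
    return out
```

Evaluating `nonlinear_energy_tendency` at any state returns zero up to rounding, which is the cancellation of the quadratic terms described above. A smaller `dt` or an implicit scheme may be preferable if the chosen parameters make the system stiff.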

The physics-constrained nonlinear low-order stochastic model [(1) and (2)] has been shown to have significant skill for determining the predictability limits of the large-scale cloud patterns of both the boreal winter MJO and boreal summer intraseasonal oscillations (Chen et al. 2014; Chen and Majda 2015a) as well as for improving the prediction skill of the RMM indices (Wheeler and Hendon 2004) by incorporating a new information-theoretic strategy in the training phase (Chen and Majda 2015b).

b. Calibration of the nonlinear stochastic model

The parameters of the stochastic model in (1) and (2) are calibrated by systematically minimizing the information distance (see appendix A) of the highly non-Gaussian equilibrium PDF of the stochastic model compared with that of the actual data (Majda and Gershgorin 2010, 2011) and minimizing the root-mean-squared (RMS) error in the autocorrelations of the two MISO variables in the training period from 1998 to 2007. With the minimized information distance, the model is able to capture the non-Gaussian statistics with fat tails. On the other hand, calibrating the autocorrelations ensures that the model has the same dynamical behavior as nature. Table 1 records the optimal parameter values. Note that a coarse estimation of these parameters is sufficient since the model statistics are robust with respect to the parameter variations around these optimal values (see appendix A).
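The two calibration criteria can be illustrated with a short sketch. It assumes that the information distance is the relative entropy between the empirical PDFs (the precise definition is given in appendix A, which is not reproduced here) and estimates it from histograms; the function names, the number of bins, and the regularization `eps` are illustrative choices rather than the settings used in this study.

```python
import numpy as np


def relative_entropy(samples_truth, samples_model, bins=50, eps=1e-12):
    """Histogram estimate of the integral of p * log(p / q), with p from the truth samples."""
    lo = min(np.min(samples_truth), np.min(samples_model))
    hi = max(np.max(samples_truth), np.max(samples_model))
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(samples_truth, bins=edges, density=True)
    q, _ = np.histogram(samples_model, bins=edges, density=True)
    width = edges[1] - edges[0]
    mask = p > 0                       # bins with p = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / (q[mask] + eps))) * width)


def autocorrelation(x, max_lag):
    """Sample autocorrelation of a scalar series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / var for k in range(max_lag + 1)])


def acf_rms_error(x_truth, x_model, max_lag):
    """RMS difference between the autocorrelation functions of two series."""
    diff = autocorrelation(x_truth, max_lag) - autocorrelation(x_model, max_lag)
    return float(np.sqrt(np.mean(diff ** 2)))
```

In a calibration loop, one would simulate the model for each candidate parameter set and keep the set that makes both `relative_entropy` and `acf_rms_error` small; how the two criteria are weighted against each other is a design choice not specified here.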

Table 1.

Optimal parameters for the nonlinear low-order stochastic model [(1)] are shown in the first row (the parameters , and have units of month−1; , and have units of month−1/2; and ϕ is dimensionless). The second row shows the parameters from Chen and Majda (2015b), predicting the two BSISO modes obtained by applying NLSA to the OLR dataset (CLAUS, version 4.7), which are used in section 6c.


Figures 1d–f show the statistics of the nonlinear stochastic model [(1) and (2)] with these parameters and compare them with those of the two MISO indices. Figure 1d shows that the stochastic model captures the correlation functions almost perfectly for a 3-month duration as well as the timing of the wiggles that appear with lags around one year. Figure 1e shows that the nonlinear stochastic model reproduces the fat tails of the highly non-Gaussian PDF of the two MISO indices, which cannot be captured by linear models (Chen and Majda 2015a). Figure 1f reveals that the power spectra of the stochastic model match those of the MISO indices (Fig. 1c) very well within the intraseasonal band from 30 to 60 days, which contains the most power. Note that in the absence of the stochastic damping and stochastic phase, the model fails to capture the highly non-Gaussian PDFs, the autocorrelation functions, and the power spectra simultaneously even with the large-scale damping (Chen and Majda 2015a). This indicates the nonlinear nature of these MISO indices and suggests that these hidden variables are not redundant.

c. Prediction algorithm and data assimilation of the hidden variables

An ensemble prediction algorithm, which involves running the forecast model [(1)] forward in time given the initial values, is adopted for the MISO indices. The initial data of the two MISO variables are obtained directly from observations. On the other hand, the two hidden variables have no direct observational surrogate. Nevertheless, the special structure of the model allows an effective data assimilation algorithm to determine the initial values of the two hidden variables, which facilitates the ensemble forecasting scheme.

In fact, the equations in (1) are a conditional Gaussian system with respect to the two observed MISO variables, meaning that once the observed variables are given, the time evolution of the distribution of the two hidden variables is Gaussian. This special feature of (1) allows closed analytic equations for the conditional Gaussian distributions of the hidden variables obtained from the posterior estimates in a Bayesian framework (Liptser and Shiryaev 2001). Appendix B contains the details and explicit equations. It is also shown in appendix B (Fig. B1) that the posterior mean estimates of the two hidden variables (the stochastic damping and the stochastic phase) have large variations, with overall amplitudes comparable to those of the deterministic damping and the mean frequency a. In addition, the posterior variance that reflects the uncertainty changes significantly in time. Therefore, the two hidden variables have important contributions to the coupled model, and an accurate online estimation of their initial values is crucial for skillful predictions. In the prediction below, 50 ensemble members are utilized.

4. Results of predicting the MISO indices

With the optimal parameters from Table 1 and the ensemble initialization scheme described in sections 3a and 3c, the prediction of the MISO indices from years 2008 to 2013 using the stochastic model in (1) is presented here. The prediction skill is assessed by the RMS error and pattern correlation (Corr). For the two-dimensional multivariate time series, these skill scores are defined as
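$$
\text{RMS error} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left|\mathbf{u}^{M}_i - \mathbf{u}^{\mathrm{ref}}_i\right|^2}, \qquad
\text{Corr} = \frac{\sum_{i=1}^{n}\mathbf{u}^{M}_i\cdot\mathbf{u}^{\mathrm{ref}}_i}{\sqrt{\sum_{i=1}^{n}\left|\mathbf{u}^{M}_i\right|^2}\,\sqrt{\sum_{i=1}^{n}\left|\mathbf{u}^{\mathrm{ref}}_i\right|^2}},
$$
where n is the number of points in the time series, and $\mathbf{u}^{\mathrm{ref}}_i$ and $\mathbf{u}^{M}_i$ are the truth and predicted two-dimensional time series, respectively. Useful predictions are typically defined as follows: 1) the RMS error in the prediction is less than the standard deviation of the truth at equilibrium, and 2) the pattern correlation between the predicted signal and the truth is above 0.5. Typically, the ensemble mean is used for $\mathbf{u}^{M}_i$ in computing these metrics.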
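The two skill scores and the usefulness criteria can be sketched as follows. The block assumes the common bivariate (uncentered) form of the RMS error and pattern correlation for a two-component index; the function names and the argument `equilibrium_std` are illustrative and not taken from this study.

```python
import numpy as np


def rms_error(truth, pred):
    """RMS error for series of shape (n, 2): sqrt of the mean squared vector distance."""
    truth = np.asarray(truth, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean(np.sum((pred - truth) ** 2, axis=1))))


def pattern_correlation(truth, pred):
    """Bivariate pattern correlation (assumed uncentered form)."""
    truth = np.asarray(truth, dtype=float)
    pred = np.asarray(pred, dtype=float)
    num = np.sum(truth * pred)
    den = np.sqrt(np.sum(truth ** 2) * np.sum(pred ** 2))
    return float(num / den)


def is_useful(truth, pred, equilibrium_std, corr_threshold=0.5):
    """Apply both criteria: RMS error below the equilibrium std and Corr above 0.5."""
    return (rms_error(truth, pred) < equilibrium_std
            and pattern_correlation(truth, pred) > corr_threshold)
```

For a given lead time, `pred` would hold the ensemble-mean forecasts issued that many days earlier, aligned with the verifying `truth` values, so that each lead time yields one pair of scores.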

The skill scores of the ensemble mean prediction as a function of lead time (days) in different years from 2008 to 2013 are shown in Fig. 2. Among these years, 2010 has useful predictions for about 20 days, while 2011 and 2013 have skillful predictions out to around 25–30 days. In some years, such as 2008, 2009, and 2012, the prediction skill extends to about 50 days. In general, prediction using this nonlinear stochastic model shows much higher skill than that based on the conventional EEOF-based indices (Suhas et al. 2013).

Fig. 2.

Skill scores with (top) RMS error and (bottom) pattern correlation for the prediction utilizing the ensemble mean in different years. The dashed line in the top panel shows the standard deviation of the equilibrium distribution associated with the MISO indices, and that in the bottom panel shows the threshold of Corr = 0.5. Useful predictions are defined by 1) the RMS error in the prediction is less than the standard deviation of the truth at the equilibrium, and 2) the pattern correlation between the predicted signal and the truth is above 0.5.


The 15- and 25-day lead prediction of years 2008, 2009, and 2010 are shown in the top panels of Fig. 3. Both the phase and amplitude of MISO activity play important roles in determining the prediction skill in different years. For example, year 2008 has an overall strong and regular MISO activity during the whole monsoon season that results in a long predictability, whereas the signal-to-noise ratio in year 2010 is smaller than other years, and thus the predictability is greatly reduced. Note that although year 2009 is a drought year with weak MISO activity during the late monsoon season (September), the MISO activity in other months of the 2009 boreal summer remains strong, and the overall prediction skill is high. From the limited sample size (12 years) of our analysis, it is hard to derive a relationship between the predictability of MISO and the interannual variability of the monsoon. However, it appears that the drought years do not necessarily have low predictability.

Fig. 3.

(top) The 15- and 25-day lead predictions based on the ensemble mean, where each point in the predicted curve is a prediction starting from 15 or 25 days ago. (bottom) Long-range forecasting starting from 1 Apr (the transition between the quiescent phase and the active phase), 1 Jun (onset phase), and 1 Oct (demise phase).


In addition to the ensemble mean prediction, the ensemble spread that represents the predictive uncertainty is another important indicator of the prediction skill. The bottom panels in Fig. 3 show the ensemble predictions including the ensemble spread for years 2008, 2009, and 2010, beginning at three different dates that correspond to a transition between the quiescent phase and the active phase (1 April), a starting date in the active mature phase (1 June), and a starting date in the decaying phase of MISO activity (1 October), respectively. Although the ensemble mean predictions for the 1 April starting date do not have any long-range skill, the envelope of the ensemble predictions contains the true signal and forecasts both the summer active and winter quiescent phases. The forecasts from 1 June clearly have skill in both the ensemble mean and the ensemble spread for moderate to long lead times. The forecasts starting from 1 October have both an accurate mean and a small ensemble spread for very long lead times.

It is easy to perform twin prediction experiments with the perfect nonlinear stochastic model in (1) and (2) where 10-yr training segments of the data generated from the model are utilized to make 6-yr forecasts. It is significant that this internal prediction skill of the stochastic model is comparable to its skill in predicting the MISO indices from observations (not shown here). This lends support to the fact that the nonlinear stochastic model in (1) and (2) can accurately determine the predictability limits of the two MISO indices.

5. The spatiotemporal reconstruction

With the predicted MISO indices in hand, the next step is to recover the predicted large-scale MISO patterns in physical space. This requires the spatiotemporal reconstruction that combines the predicted MISO indices and the associated spatial bases.

a. Method

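Let $x_i$ be the d-dimensional vector of gridded precipitation values over the South Asian monsoon region at time i as discussed in section 2. Here, i is an integer ranging from 1 to n, representing the period of the training phase. The first step in NLSA is to construct a higher-dimensional, time-lagged embedding dataset utilizing Takens’ method of delays (Takens et al. 1981). Denote q as the lagged embedding window size. Then the lagged embedding matrix can be written as
$$
\mathbf{X} = \begin{pmatrix}
x_1 & x_2 & \cdots & x_{q-1} & \underline{x_q} & \cdots & x_N \\
x_2 & x_3 & \cdots & \underline{x_q} & x_{q+1} & \cdots & x_{N+1} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
\underline{x_q} & x_{q+1} & \cdots & x_{2q-2} & x_{2q-1} & \cdots & x_n
\end{pmatrix}, \qquad \text{(3)}
$$
where $\mathbf{X}$ is a $(qd) \times N$ matrix since each column concatenates q vectors that are each d-dimensional. Here $N = n - q + 1$.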
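The construction of the lagged embedding matrix can be sketched in a few lines. The block assumes that each column stacks q consecutive snapshots forward in time, so that a given snapshot reappears along an antidiagonal of the block matrix; the function name `lagged_embedding` and the row-per-time input layout are illustrative choices.

```python
import numpy as np


def lagged_embedding(x, q):
    """Build the (q*d, N) lagged embedding matrix from data x of shape (n, d).

    Column i (0-based) stacks x[i], x[i+1], ..., x[i+q-1]; N = n - q + 1.
    """
    x = np.asarray(x, dtype=float)
    n, d = x.shape
    N = n - q + 1
    if N < 1:
        raise ValueError("the embedding window q must not exceed the series length n")
    # x[i:i+q] has shape (q, d); flattening it row by row concatenates the q snapshots
    cols = [x[i:i + q].reshape(-1) for i in range(N)]
    return np.stack(cols, axis=1)
```

With daily gridded data and an intraseasonal window, q*d is large, so in practice pairwise distances between columns are usually accumulated without ever storing the full matrix; the explicit version here is only meant to make the layout concrete.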
The next step in NLSA is to compute the kernel matrix $\mathbf{K}$ with entries given by
$$
K_{ij} = \exp\left(-\frac{\|\mathbf{X}_i - \mathbf{X}_j\|^2}{\varepsilon\,\|\xi_i\|\,\|\xi_j\|}\right),
$$
where ε is a positive kernel bandwidth parameter, and the quantities $\xi_i$ are “phase space velocities” measuring the local time tendency of the data through $\xi_i = \mathbf{X}_i - \mathbf{X}_{i-1}$. Here $\mathbf{X}_i = (x_i^{T}, x_{i+1}^{T}, \ldots, x_{i+q-1}^{T})^{T}$ is the ith column of $\mathbf{X}$, where T stands for the transpose. For readers who are not familiar with NLSA, the kernel matrix can be thought of as a nonlinear analog of the temporal covariance matrix in singular spectrum analysis (SSA) (Ghil et al. 2002), and this nonlinearity is crucial in capturing both intermittency and low-frequency variability. The details of the NLSA decomposition used in this work are recorded in Sabeerali et al. (2017).
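A minimal sketch of such a kernel is given below. It assumes the Gaussian form with the squared distance between lagged-embedded states scaled by the bandwidth and by the norms of the two phase space velocities, which is the form commonly used in NLSA; the normalization and eigendecomposition steps of the full algorithm are omitted, and the name `nlsa_kernel` is illustrative.

```python
import numpy as np


def nlsa_kernel(X, eps):
    """Kernel matrix from a lagged embedding matrix X of shape (q*d, N).

    The first column has no backward difference, so the kernel is built on the
    remaining N-1 columns. Assumes consecutive columns differ (nonzero velocity).
    """
    X = np.asarray(X, dtype=float)
    xi = np.linalg.norm(X[:, 1:] - X[:, :-1], axis=0)   # phase space velocity norms
    Y = X[:, 1:]
    sq = np.sum(Y ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Y.T @ Y)    # pairwise squared distances
    d2 = np.maximum(d2, 0.0)                            # guard against negative rounding
    return np.exp(-d2 / (eps * np.outer(xi, xi)))
```

Scaling by the local velocities shrinks the effective bandwidth where the data change slowly and widens it during rapid transitions, which is one way to see why such a kernel responds to intermittent events differently from a plain covariance.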
Applying NLSA with the kernel matrix $\mathbf{K}$, the raw lagged embedding matrix $\mathbf{X}$ can be decomposed into the following:
$$
\mathbf{X} = \sum_k \mathbf{X}^{(k)}, \tag{4}
$$
where these $\mathbf{X}^{(k)}$ represent different modes, such as the annual mode, semiannual mode, biannual mode, and so on. As discussed in section 2, there is a pair of modes that combine to form the MISO. Denote $\mathbf{X}^{(1)}$ and $\mathbf{X}^{(2)}$ as the two MISO modes. Since they can be treated independently, we focus on $\mathbf{X}^{(1)}$ below. The same procedure will be applied to $\mathbf{X}^{(2)}$ in order to form complete MISO spatiotemporal structures.
Similar to other data decomposition methods, the $\mathbf{X}^{(1)}$ given by NLSA is of rank 1 and has a singular value decomposition
$$
\mathbf{X}^{(1)} = \mathbf{u}\,\mathbf{v}^T, \tag{5}
$$
where $\mathbf{v}$ is the time series (i.e., MISO indices, analogous to PCs), and $\mathbf{u}$ is the spatial basis (analogous to EEOF) that is computed by projecting $\mathbf{X}$ to $\mathbf{v}$ using the orthogonality of different $\mathbf{v}$ in NLSA. Note that $\mathbf{u}$ is of size $(qd) \times 1$ and $\mathbf{v}$ is of size $N \times 1$. With $\mathbf{u}$ being determined in the training phase, it is clear that once the MISO indices $\mathbf{v}$ are predicted by the low-order model, $\mathbf{X}^{(1)}$ in the prediction phase is obtained using (5). Then averaging over the q elements that appear in the antidiagonal entries, as indicated by the underlined elements in (3), finalizes the spatiotemporal reconstruction of the predicted MISO pattern.
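
The antidiagonal averaging that closes the reconstruction can be sketched as follows. This is a minimal example under the assumption that block row a and column j of the lagged matrix both refer to time a + j (0-based), as in a Hankel-type embedding; the function name `antidiagonal_average` and the toy basis `u` and series `v` are hypothetical.

```python
import numpy as np

def antidiagonal_average(Xk, d, q):
    """Average the entries of a (q*d, N) lagged matrix that share a time index.

    Block row a and column j (0-based) both refer to time a + j, so each time
    step is the mean over its antidiagonal. Returns shape (d, N + q - 1).
    """
    N = Xk.shape[1]
    n = N + q - 1
    total = np.zeros((d, n))
    count = np.zeros(n)
    for a in range(q):
        total[:, a:a + N] += Xk[a * d:(a + 1) * d, :]
        count[a:a + N] += 1
    return total / count

# Toy check: a rank-1 mode built from a spatial basis u and a time series v.
d, q, N = 2, 3, 5
u = np.arange(1.0, q * d + 1.0)      # hypothetical spatial basis, length q*d
v = np.cos(0.3 * np.arange(N))       # hypothetical index time series
field = antidiagonal_average(np.outer(u, v), d, q)   # shape (d, N + q - 1)
```

Applied to an exact lagged embedding of a data set, the same function returns the original snapshots, which is a convenient sanity check of the indexing.
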
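Denote $\hat{\mathbf{v}}$ as the predicted MISO indices. The prediction of the MISO mode $\mathbf{X}^{(1)}$ is given by the following:
$$
\hat{\mathbf{X}}^{(1)} = \mathbf{u} \cdot \hat{\mathbf{v}} =
\begin{pmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \\ \vdots \\ \mathbf{u}_q \end{pmatrix}
\cdot
\begin{pmatrix} \hat{v}_1 & \hat{v}_2 & \cdots & \hat{v}_q & \cdots \end{pmatrix}
=
\begin{pmatrix}
\mathbf{u}_1 \hat{v}_1 & \cdots & \mathbf{u}_1 \hat{v}_{q-1} & \underline{\mathbf{u}_1 \hat{v}_q} & \cdots \\
\mathbf{u}_2 \hat{v}_1 & \cdots & \underline{\mathbf{u}_2 \hat{v}_{q-1}} & \mathbf{u}_2 \hat{v}_q & \cdots \\
\vdots & & \vdots & \vdots & \\
\underline{\mathbf{u}_q \hat{v}_1} & \cdots & \mathbf{u}_q \hat{v}_{q-1} & \mathbf{u}_q \hat{v}_q & \cdots
\end{pmatrix},
\tag{6}
$$
where the dot represents multiplication, and $\mathbf{u}$ is the column concatenation of $\mathbf{u}_1, \ldots,$ and $\mathbf{u}_q$. Each $\mathbf{u}_i$ is a d-dimensional vector. We have also ignored the transpose in $\mathbf{u}$ and $\hat{\mathbf{v}}$ in (6) and (7) below for notational simplicity. It is straightforward to check that averaging over the q underlined elements in (6) provides the predicted spatiotemporal reconstruction of the MISO patterns at one lead day. However, this one-lead-day spatiotemporal reconstruction requires the predicted MISO indices up to q days, namely $\hat{v}_1, \ldots, \hat{v}_q$. Therefore, the algorithm in (6) is not practical for real-time predictions of the spatiotemporal patterns. It is worth remarking that the spatiotemporal reconstruction of the MISO by design requires all the q underlined elements (Golyandina et al. 2001; Giannakis and Majda 2011). The reconstructed information is far from complete if the reconstruction is based only on a single element in (6).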
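One natural remedy to overcome this difficulty is to transfer the extra future information, which would otherwise be required of the time series beyond the predicted range, to the spatial basis by calculating a “predicted spatial basis” in the training phase. Recall in (5) that, by multiplying both sides by $\mathbf{v}$, $\mathbf{u}$ can be explicitly written as $\mathbf{u} = \mathbf{X}\mathbf{v}/\|\mathbf{v}\|^2$ with $\|\mathbf{v}\|^2 = \mathbf{v}^T\mathbf{v}$, where $\|\cdot\|$ is the norm. Using a similar argument, we define $\tilde{\mathbf{u}} = \tilde{\mathbf{X}}\tilde{\mathbf{v}}/\|\tilde{\mathbf{v}}\|^2$, where $\tilde{\mathbf{X}}$ is a $(qd) \times \tilde{N}$ matrix with $\tilde{N} = N - q$, and the ith column of $\tilde{\mathbf{X}}$ is the $(i+q)$th column of $\mathbf{X}$. In other words, ignoring the last q columns of $\mathbf{X}$, the matrix $\tilde{\mathbf{X}}$ is simply q units shifted forward in time with respect to $\mathbf{X}$. On the other hand, the time series $\tilde{\mathbf{v}}$ is given by the first $\tilde{N}$ entries of $\mathbf{v}$. Comparing $\tilde{\mathbf{u}}$ with $\mathbf{u}$, it is clear that $\tilde{\mathbf{u}}$ is the “predicted” spatial basis since $\tilde{\mathbf{X}}$ can be regarded as the predicted $\mathbf{X}$ with q units in time. As a remark, $\tilde{\mathbf{u}}$ is the projection of the shifted observational map onto the eigenfunctions. By definition, the shifted observational map is equal to the action of the Koopman operator on the observations. Multiplying $\tilde{\mathbf{u}}$ by the MISO indices results in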
e7
Again, the average value of the underlined elements provides the spatiotemporal reconstruction of the MISO patterns at one lead day, which now requires the prediction of the MISO indices for only one lead day as well. This is the fundamental difference between (6) and (7). Although the simple strategy adopted in “predicting” the spatial basis introduces some errors, the predicted spatiotemporal MISO patterns are overall quite skillful as will be shown in section 5b. It is also natural that an s-day prediction of the indices is able to provide an s-day forecast of the spatiotemporal patterns, which facilitates a fair comparison of different methods in predicting the indices. Finally, since $\tilde{\mathbf{u}}$ is completely determined in the training period using straightforward calculations, the formula in (7) is an effective and practical method for predicting the spatiotemporal patterns.
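
The two projections discussed in this subsection can be sketched as below. This is only an illustration under stated assumptions: the spatial basis is taken as the projection of the lagged matrix onto the index time series, and the “predicted” basis as the same projection with the lagged matrix shifted forward by q columns and the time series truncated to match. The function name `spatial_bases`, the exact column offsets, and the synthetic rank-1 input are choices made here and may differ from the article's indexing.

```python
import numpy as np

def spatial_bases(X, v, q):
    """Return (u, u_shift): the projected basis and its q-step-shifted analog.

    X : lagged embedding matrix, shape (q*d, N).
    v : index time series of length N.
    """
    N = X.shape[1]
    if len(v) != N or q >= N:
        raise ValueError("need len(v) == N and q < N")
    u = X @ v / (v @ v)                  # projection of X onto v
    v_head = v[:N - q]                   # first N - q entries of v
    X_shift = X[:, q:]                   # column i is column i + q of X
    u_shift = X_shift @ v_head / (v_head @ v_head)
    return u, u_shift

# Synthetic check (hypothetical data): a rank-1 matrix whose time series has
# period q, so shifting by q columns leaves the projection unchanged.
q, N = 4, 20
u0 = np.arange(1.0, 9.0)
v0 = np.cos(2.0 * np.pi * np.arange(N) / q)
u_est, u_shift_est = spatial_bases(np.outer(u0, v0), v0, q)
```

For real data the shifted basis generally differs from the unshifted one, and that difference is the approximation error the text refers to.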

b. Prediction of the spatially and temporally reconstructed precipitation fields

Figure 4a shows three phase-space diagrams of predicting the MISO indices, starting from 1 July 2009, 1 June 2008, and 1 June 2013, and all lasting for 30 days of lead time. Among the three cases, a significantly skillful prediction is found for July 2009, whereas the prediction skill for June 2008 is moderate. The true signal of June 2013 has a weak amplitude, and the corresponding prediction is far from the truth.

Fig. 4.

(a) The predicted time series (ensemble mean in red) compared with the truth (blue) in three different cases (July 2009, June 2008, and June 2013). The initial day is the first day of each month, and the prediction lasts for 30 days with dots indicating every five days. (b) Skill score with (left) RMS error and (right) pattern correlation for predicting the reconstructed spatiotemporal patterns as a function of lead time (days) for July 2009 (blue), June 2008 (red), and June 2013 (black).

Citation: Journal of Climate 31, 11; 10.1175/JCLI-D-17-0411.1

We demonstrate the prediction of spatiotemporal patterns based on the improved method [(7)], where the ensemble mean of the prediction is utilized for the spatiotemporal reconstruction. The skill scores of the predicted spatiotemporal patterns for each of the three periods are shown in Fig. 4b. Consistent with the MISO indices, July 2009 has the highest prediction skill, and the useful prediction lasts for 40 days. On the other hand, a higher pattern correlation is found in predicting the spatiotemporal patterns of June 2008 compared with that of June 2013, where the useful prediction of June 2008 is up to around 22 days. These results indicate that the spatiotemporal patterns at different time instants largely depend on the corresponding MISO indices. An accurate prediction of the stochastic phase and amplitude usually results in a good reconstruction of the spatiotemporal patterns. Note that, unlike the skill in predicting the MISO indices, the skill scores of predicting the spatiotemporal pattern do not decrease monotonically as a function of lead time, and the error at very short lead times does not approach zero either. These facts are due to the approximation of the spatial basis in the prediction period. In addition, this spatial basis, which results from averaging over the entire training phase, may also lead to amplitude underestimation in the predicted spatiotemporal reconstruction. A direct remedy is to compute the ratio of $\|\mathbf{u}\|$ and $\|\tilde{\mathbf{u}}\|$ in the training period and multiply the prediction by this constant ratio. In fact, the value of $\|\tilde{\mathbf{u}}\|$ decreases with the increase in q value. This is because the correlation between the spatial basis and the time series with a phase lag becomes weaker when the lag increases.
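
For concreteness, the two skill scores can be sketched as follows. The definitions used here (RMS error averaged over all entries, and an uncentered pattern correlation) are one common convention and are an assumption of this sketch; the function names and toy arrays are hypothetical.

```python
import numpy as np

def rms_error(pred, truth):
    """Root-mean-square difference over all entries."""
    pred, truth = np.asarray(pred, float), np.asarray(truth, float)
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

def pattern_correlation(pred, truth):
    """Uncentered correlation between two fields (flattened)."""
    pred = np.asarray(pred, float).ravel()
    truth = np.asarray(truth, float).ravel()
    return float(pred @ truth / np.sqrt((pred @ pred) * (truth @ truth)))

# Toy fields (hypothetical): the prediction has the right shape but half the amplitude.
truth_toy = np.array([[1.0, -2.0], [3.0, 0.5]])
pred_toy = 0.5 * truth_toy
corr_toy = pattern_correlation(pred_toy, truth_toy)
rmse_toy = rms_error(pred_toy, truth_toy)
```

In this toy case the pattern correlation is 1 although the RMS error is nonzero, which illustrates why an amplitude underestimation of the kind described above is visible in the RMS error but not in the pattern correlation.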

Figures 5 and 6 compare the truth and the predicted spatiotemporal patterns of July 2009 and June 2008, respectively. The predicted patterns for all of July 2009 are highly consistent with the truth, especially in the regions of the Indian subcontinent and Bay of Bengal. On the other hand, although the overall skillful prediction is up to 20 days lead time in June 2008, significant errors in the spatiotemporal patterns appear for longer time predictions due to the failure in predicting the precipitation in regions such as the Indian Ocean.

Fig. 5.

Reconstruction and prediction of the spatiotemporal patterns of July 2009 starting from 1 Jul 2009. The predictions at days 1, 4, 8, 12, 16, 20, 24, and 30 are shown.

Fig. 6.

As in Fig. 5, but for June 2008.

c. Comparison of different prediction algorithms for the spatiotemporal fields

To further understand the skill of predicting the spatiotemporal MISO patterns, the skill scores of three different prediction algorithms are compared in Fig. 7. In addition to the improved algorithm [(7)], the other two methods applied here are the prediction using the direct algorithm [(6)] and the persistence prediction. The persistence method assumes that the conditions at the time of the forecast will not change, and it is typically used as a baseline for prediction. On the other hand, the direct algorithm [(6)] requires the predicted time series up to $s + q - 1$ days for the predicted spatiotemporal pattern at an s-day lead time. Then the average value of the q underlined elements indicated in (6) is taken as the predicted spatiotemporal pattern. Note that the spatiotemporal reconstruction of the MISO by design requires all the q underlined elements (Golyandina et al. 2001; Giannakis and Majda 2011). Since the reconstructed information is far from complete if the reconstruction is based only on a single element in (6) for an s-day lead prediction, such a prediction is not shown here.
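
The persistence baseline is simple enough to sketch directly. The example below freezes a pair of indices at the initial time and measures the uncentered correlation with the true pair at each lead; the idealized 40-day oscillation, the function name `persistence_skill`, and the bivariate correlation used here are illustrative assumptions rather than the article's data or exact skill definition.

```python
import math

def persistence_skill(index_pairs, t0, max_lead):
    """Correlation between the state frozen at t0 and the true state at each lead."""
    a1, a2 = index_pairs[t0]
    skill = []
    for s in range(1, max_lead + 1):
        b1, b2 = index_pairs[t0 + s]
        num = a1 * b1 + a2 * b2
        den = math.hypot(a1, a2) * math.hypot(b1, b2)
        skill.append(num / den)
    return skill

# Idealized pair of indices in quadrature with a 40-day period (hypothetical).
period = 40
toy = [(math.cos(2 * math.pi * t / period), math.sin(2 * math.pi * t / period))
       for t in range(60)]
skill = persistence_skill(toy, t0=0, max_lead=45)   # skill[s - 1] is lead s
```

For this idealized oscillation the correlation stays above 0.5 only through a lead of 6 days, turns negative after a quarter period, and returns to 1 at a lead of one full period, which mirrors the short-lived skill and later reemergence typical of persistence forecasts of an oscillatory signal.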

Fig. 7.

Comparison of the three prediction methods for the spatiotemporal fields, namely the direct algorithm in (6) (dashed line), the improved algorithm in (7) (solid line), and the persistence prediction (dotted line). The three columns show the predictions of July 2009, June 2008, and June 2013 for the (a)–(c) pattern correlation and (d)–(f) RMS error.

First, because of the oscillatory nature of the MISO, the persistence prediction is skillful only for 5–6 days in July 2009 and June 2008, when the signal is strong and moderate, respectively. Then the skill of the persistence prediction deteriorates rapidly and becomes much worse than that of the other two prediction methods, despite the fact that a reemergence of skill of the persistence prediction is found after around 40 days when another MISO event appears. On the other hand, although the short-term skill of the persistence prediction is higher than that of the other two methods in June 2013, the true MISO signal in this month is very weak, which implies much less significance of the MISO prediction. Next, we compare the direct approach [(6)] and the improved method [(7)]. In the strong MISO month of July 2009, the direct method is slightly better (pattern correlation ~ 0.9) than the improved method (pattern correlation ~ 0.8), and both methods are skillful up to 40 days. In this month, the prediction of the time series is quite accurate, and therefore the approximation error in $\tilde{\mathbf{u}}$ becomes dominant. Nevertheless, the prediction with a pattern correlation of ~0.8 is already quite accurate, and therefore the improved method is skillful. In addition, at a lead time longer than 40 days, the improved method is more skillful than the direct one. On the other hand, in the moderate MISO month of June 2008, the improved method has an overall higher skill than the direct method, especially after a lead time of 25 days. In fact, the predicted time series becomes biased after 25 days, and therefore the direct method suffers from a large error in predicting a much longer time series, which dominates the approximation error in $\tilde{\mathbf{u}}$.
These facts indicate that the strong requirement of the direct method [(6)], namely a skillful prediction of the time series up to $s + q - 1$ days for an s-day prediction of the spatiotemporal pattern, is hard to satisfy even with a suitable low-order model as in (1), while the improved algorithm [(7)], which requires the predicted time series only up to s days, is more practical.

6. Model robustness and sensitivity to key parameters

a. Prediction with a 3-yr short training period

A typical situation in climate science is that only a short period of observational data is available. This leads to one of the fundamental difficulties in prediction with most nonparametric methods, which require a huge amount of data for training. Suitable models that are able to describe the essential characteristics of the data are usually preferred since they allow for a much shorter training period. Recall that in the previous sections, 10 years of observations (1998–2007) were adopted for model calibration, and the prediction skill was assessed for the remaining 6 years (2008–13). Although this 10-yr training window is already much shorter than that required by most nonparametric methods, it is important to understand whether an even shorter training period is sufficient here for the nonlinear model to capture the information in nature.

To this end, a very short training period involving only the first three years of the time series (1998–2000) is adopted here for model calibration. Since the MISO occurs only in boreal summer and the average duration of one event is roughly 40 days, this 3-yr training period contains only about 10 events, which is a small sample size. Figure 8 compares the statistics of the MISO time series obtained with different lengths of training periods, including this short 3-yr training period (1998–2000), the 10-yr training period adopted in previous sections (1998–2007), and the full analysis period (1998–2013). The fact that the statistics of the 10-yr training period and the full analysis period almost perfectly match each other indicates that the 10-yr training period is sufficient for obtaining unbiased information. On the other hand, the 3-yr training period, including one weak year (1998), one moderate year (1999), and one strong year (2000) of MISO activity, also has statistics that are highly consistent with those of the MISO time series obtained with the full analysis period, including the non-Gaussian fat-tailed PDFs, the power spectrums, and the autocorrelations up to 1.5 months. Therefore, the key features of the full MISO indices are well reflected in this short 3-yr training period. Because of the robustness of the model parameters (appendix A), the calibrated parameters based on this 3-yr short training period are nearly the same as the optimal parameters shown in Table 1. Importantly, this short training period allows for the study of prediction skill over a long period going back to 2001, and the results are briefly reported here.
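
Two of the statistics compared across training periods, the autocorrelation function and the departure of the PDF from Gaussianity, can be estimated as in the sketch below. The estimators (a biased sample autocorrelation and the excess kurtosis as a simple fat-tail indicator), the function names, and the toy series are assumptions of this illustration, not the article's diagnostics.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Biased sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(x, float) - np.mean(x)
    var = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / var
                     for k in range(max_lag + 1)])

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (zero for a Gaussian sample)."""
    x = np.asarray(x, float) - np.mean(x)
    m2 = np.mean(x ** 2)
    return float(np.mean(x ** 4) / m2 ** 2 - 3.0)

# Toy series (hypothetical): a 40-day oscillation and an intermittent signal
# that is quiet most of the time with two large excursions.
acf = autocorrelation(np.cos(2 * np.pi * np.arange(400) / 40), max_lag=40)
kurt = excess_kurtosis([0.0] * 8 + [3.0, -3.0])
```

A positive excess kurtosis, as obtained for the intermittent toy signal, is one simple indicator of the fat tails discussed in the text.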

Fig. 8.

Comparison of the statistics of MISO indices based on the full analysis period (1998–2013; black), 10-yr training period utilized in section 3b (1998–2007; red), and the 3-yr short training period as in section 6a (1998–2000; green).

Figure 9 shows the skill scores and the predicted signals based on the ensemble mean prediction from year 2001 to 2007, analogous to those in Figs. 2 and 3 from years 2008 to 2013. The useful prediction of these 7 years all exceeds 25 days, where in particular the skillful predictions in years 2001, 2003, and 2007 are more than 40 days. Among these 7 years, years 2002 and 2004 are recorded as drought years. A significant error is found in predicting the subdued MISO activity during August and September of year 2002, which explains its lower overall prediction skill than most of the other years. On the other hand, the major error in predicting the MISO indices of year 2004 is in fact due to the model’s failure in capturing the extremely slow oscillation frequency during August and September.

Fig. 9.

(left) Skill score with RMS error and pattern correlation for predicting the MISO indices in different years from 2001 to 2007. As in Fig. 2, the two dashed lines indicate the standard deviation of the MISO indices at climatology and the value with Corr = 0.5, which serve as the threshold for the useful prediction. (right) The 25-day lead predictions of MISO 1 for four different years. The model parameters are listed in the top row of Table 1.

We have also checked the model statistics and prediction skill by utilizing any three consecutive years between 1998 and 2013 as the training phase. Despite the discrepancy in the signal variance due to the strength of the MISO activity in different years, the fat tails in the non-Gaussian PDFs, the peak of the power spectrums, and the autocorrelations up to 1.5 months all resemble those of the full MISO time series. Notably, the ensemble prediction skill does not have significant deterioration based on different training periods.

b. MISO indices based on different lagged embedding window sizes and the corresponding prediction skill

Recall that the two MISO indices shown in Fig. 1a and studied throughout this article were obtained by applying NLSA to the precipitation data with a lagged embedding window of length q = 64 days. Adopting q = 64 is natural since it is an ideal choice for representing the intraseasonal time scale, and such a lagged embedding window size was utilized for defining the large-scale cloud patterns of the MJO and monsoon in previous works (Chen et al. 2014; Chen and Majda 2015a,b; Tung et al. 2014). On the other hand, EEOF was also widely utilized in defining the MISO indices in the literature (Suhas et al. 2013; Kikuchi et al. 2012), which involves removing the climatological mean, computing the first few harmonics of the seasonal cycle, and then applying a much shorter embedding window with 15–20 days. Nevertheless, in studying the intraseasonal variability of Indian summer monsoon rainfall, Krishnamurthy and Shukla (2007) used a large embedding window size (61 days) in the MSSA technique, which is comparable to the 64 days of the embedding window utilized in our study. Therefore, it is important to study the difference in the MISO indices by applying NLSA with different lagged embedding window sizes.

Figure 10 shows the resulting MISO indices by applying NLSA with q = 64, 48, and 34 as well as the corresponding statistics. Different from q = 64, the MISO indices with q = 48 and 34 have active phases in both boreal summer and winter, implying that the obtained MISO indices contain the components of the boreal winter MJO and the associated PDFs are nearly Gaussian. In addition to being polluted by the boreal winter signal, these time series, especially with q = 34, also contain biannual, annual, and semiannual cycles, as indicated by the large bursts in the low-frequency band of the power spectrum. Another significant discrepancy with different q values lies in the causality between the two components of the MISO indices. With q = 64, the cross-correlation functions have significant peaks at lags around 12 days, which is nearly ¼ of the averaged oscillation period and indicates the quadrature structure of MISO 1 and MISO 2. On the other hand, the cross-correlations with q = 48 and q = 34 remain close to zero, and the maximum value of the lagged correlation between the MISO 1 and MISO 2 indices is less than 0.3 (not shown here). These facts imply that MISO 1 and MISO 2 are nearly uncorrelated, and therefore model errors appear in fitting the cross-correlations utilizing the nonlinear low-order model [(1)]. Finally, the fast decay of the autocorrelations with q = 48 and 34 implies deterioration in the predictability of the MISO indices.
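
The quadrature diagnostic can be illustrated with a lagged cross-correlation. In the sketch below, two idealized indices with a 48-day period are a quarter cycle apart, so the cross-correlation peaks at a lag of 12 days; the period, the function name `lagged_correlation`, and the sign convention of the lag are assumptions made for this toy example.

```python
import numpy as np

def lagged_correlation(x, y, lag):
    """Correlation between x(t) and y(t + lag) for a non-negative integer lag."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    a = x[:len(x) - lag]
    b = y[lag:]
    return float(np.corrcoef(a, b)[0, 1])

# Idealized indices in quadrature (hypothetical): period 48 days, 10 cycles.
t = np.arange(480)
m1 = np.cos(2 * np.pi * t / 48)
m2 = np.sin(2 * np.pi * t / 48)
xc = [lagged_correlation(m1, m2, k) for k in range(25)]
best_lag = int(np.argmax(xc))
```

A cross-correlation that stays near zero at all lags, as reported for the shorter windows, would instead indicate that the two components no longer form a propagating pair.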

Fig. 10.

Comparison of the MISO indices, obtained by applying NLSA with different lagged embedding window sizes (q = 64, 48, and 34 days) and the associated statistics.

Also shown in appendix C (Figs. C1–C4) are the spatiotemporal patterns of the year 2004 boreal summer monsoon season with different values of q. As discussed in Sabeerali et al. (2017), the patterns with q = 64 are consistent with observations: three wet phases of the MISO over the western/central tropical Indian Ocean, initiating in the third week of June 2004, the last week of July, and the first week of September, respectively, are observed in the spatiotemporal patterns, and they propagate toward the northeast. The MISO patterns with q = 48 in boreal summer are overall similar to those with q = 64, but the onset of every wet phase occurs about 10 days earlier. In contrast, the boreal summer MISO patterns with q = 34 are noisier, and the propagation seems to be more northward rather than northeastward. On the other hand (as shown in Fig. C3), the MISO patterns in the 2004 boreal winter are very weak with q = 64, whereas those with q = 48 and 34 are much stronger and noisier due to the interference of the MJO signals.

Figure 11 shows the prediction skill with different values of q. Here useful prediction is defined in the same way as that in section 4: first, the RMS error in the prediction is less than the standard deviation of the truth at the equilibrium and, second, the pattern correlation between the predicted signal and the truth is above 0.5. In addition to illustrating the prediction skill for the whole year, the prediction skill conditioned on the boreal summertime (June–September) is also emphasized. As expected, with the decrease in q, the overall prediction skill deteriorates. Nevertheless, conditioned on the boreal summertime, the prediction with q = 48 remains quite skillful, and in particular, the 15-day lead time prediction is highly consistent with the truth. This is, however, not true for the prediction with q = 34, where the useful prediction lasts for only 10–12 days, both for the whole year and when conditioned on the boreal summertime.
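
The two criteria for a useful prediction translate into a short routine. The sketch below returns the last lead time before either criterion first fails; treating the first failure as the limit is an assumption of this illustration, as are the function name `useful_lead_time` and the sample skill curves.

```python
def useful_lead_time(rmse, corr, clim_std, corr_min=0.5):
    """Largest lead (in steps) before either criterion first fails.

    rmse, corr : skill scores at leads 1, 2, 3, ...
    clim_std   : standard deviation of the truth at equilibrium.
    """
    useful = 0
    for lead, (err, c) in enumerate(zip(rmse, corr), start=1):
        if err >= clim_std or c <= corr_min:
            break
        useful = lead
    return useful

# Sample skill curves (hypothetical numbers, one value per lead step).
rmse_curve = [0.2, 0.4, 0.6, 0.9, 1.1]
corr_curve = [0.95, 0.9, 0.7, 0.55, 0.4]
limit = useful_lead_time(rmse_curve, corr_curve, clim_std=1.0)
```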

Fig. 11.

Comparison of the prediction skill of the MISO indices obtained by applying NLSA with different lagged embedding window sizes: (top) q = 64, (middle) q = 48, and (bottom) q = 34. (left) Useful prediction for the full year and for only the boreal summertime (from June to September). (right) The 15-day lead predictions for the MISO indices obtained with different lagged embedding window sizes.

c. Significant prediction skill of the precipitation MISO indices with parameters calibrated from the OLR dataset

Most tropical rainfall is convective, which implies that OLR, a proxy for the convection, is a potential candidate to describe the precipitation in the tropics. Positive (negative) OLR anomalies are associated with reduced (increased) cloudiness, and hence suppressed (enhanced) deep convection. Because of the strong relationship between the OLR and tropical precipitation anomalies in describing the MISO, it is important to compare the MISO modes based on the OLR and precipitation as well as to understand the skill of the low-order nonlinear stochastic model [(1) and (2)] in predicting the MISO indices with parameters calibrated from the OLR dataset.

In Chen and Majda (2015b), the low-order nonlinear stochastic model [(1) and (2)] was adopted to predict the two boreal summer intraseasonal oscillation (BSISO) modes obtained by applying NLSA to the brightness temperature, a highly correlated variable with OLR, within the equatorial tropical belt from 15°S to 30°N. The dataset utilized there was the Cloud Archive User Service (CLAUS) version 4.7. As shown in Székely et al. (2016a), the spatial patterns of the BSISO modes initiate in the Indian Ocean and propagate northeastward, essentially the same as the MISO precipitation modes used here (Sabeerali et al. 2017). Figure 12 compares the time series and the associated statistics of OLR BSISO and those of the MISO precipitation. In addition to the intermittent time series, the power spectrums, autocorrelation functions, and cross-correlation functions of the two kinds of indices are all quite similar to each other. Both the OLR BSISO and MISO precipitation indices have non-Gaussian fat-tailed PDFs, although the variance of the OLR indices is relatively smaller.

Fig. 12.

Comparison of the BSISO indices and their statistical features based on OLR from Chen and Majda (2015a) and those of the MISO precipitation indices developed here. (a) Time series of BSISO 1 and BSISO 2, where a lag (around 12 days) is added in the BSISO such that the two components nearly overlap each other. (b) Power spectrum of BSISO 1. (c) Highly non-Gaussian PDF and Gaussian fit, where (f) shows these PDFs on a logarithmic scale. (d) The autocorrelation function of BSISO 1 and (e) cross-correlation function between BSISO 1 and 2 up to 2 years. (g)–(l) As in (a)–(f), but for the MISO precipitation indices developed here.

Next, instead of comparing the predictability limit of the OLR and precipitation indices with the corresponding optimal parameters, a cross-validation type of experiment is adopted here to assess the prediction skill. Namely, the parameters associated with the OLR BSISO (Chen and Majda 2015b) are used in the low-order model [(1) and (2)] to predict the precipitation MISO indices. For simplicity, these parameters are referred to as the OLR-based parameters. The OLR-based parameters are listed in the second row of Table 1 with two minor modifications. First, since the time series in Chen and Majda (2015b) started from September instead of January, the phase parameter ϕ in Chen and Majda (2015b) is modified accordingly. Second, because of the general negative correlation between OLR and precipitation, the sign of the oscillation frequency a in Chen and Majda (2015b) is flipped. In fact, as shown in Table 1, the OLR-based parameters are quite similar to the optimal parameters utilized in the previous sections.

Figure 13 compares the 25-day lead prediction of the low-order model with the optimal parameters calibrated in section 3b and the parameters taken from OLR data in Chen and Majda (2015b). The RMS error and pattern correlation of the prediction using the OLR-based parameters remain nearly the same as those using the optimal parameters. This, together with the comparable statistics shown in Fig. 12, confirms a strong (negative) correlation between OLR and precipitation anomalies in describing and predicting the MISO (Sabeerali et al. 2017). Based on these findings, it would also be interesting and valuable to develop a low-order model using combined OLR and precipitation data, which remains future work.

Fig. 13.

The 25-day lead prediction of the MISO precipitation indices with the optimal parameters calibrated in section 3b and the parameters taken from the OLR data in Chen and Majda (2015b). Both parameters are recorded in Table 1. Shown are comparisons of the (a) RMS error and (b) pattern correlation, (c) the information model error as measured by the relative entropy between the predicted PDF and the truth [(A1)], and (d),(e) the ensemble mean predictions with these two groups of parameters.

It is worthwhile pointing out that since the variance in the OLR BSISO is smaller than that of the MISO precipitation indices, the prediction with OLR-based parameters tends to underestimate the amplitude of the MISO variability. For example, Figs. 13d and 13e show that the peaks in July 2008, June 2009, and August 2013 with the OLR-based parameters are slightly weaker than those with the optimal parameters. Since these extreme events are usually associated with the nonlinear and non-Gaussian features of the underlying system, the error in the extreme events may not be accurately captured by linear measures such as the RMS error and pattern correlations. On the other hand, the underestimation of the amplitude is clearly indicated by the information measure [see (A1)] (Chen and Majda 2015b; Majda and Gershgorin 2010, 2011) that assesses the lack of information in the PDFs, as shown in Fig. 13c.

7. Conclusions

A recently developed nonlinear data analysis technique NLSA (Giannakis and Majda 2012a,b, 2013) has been applied to the raw daily GPCP rainfall dataset without detrending or spatiotemporal filtering (Sabeerali et al. 2017). The resulting MISO precipitation mode contains two time series that have non-Gaussian fat-tailed PDFs as a consequence of intermittency. We predict the large-scale MISO precipitation in two steps.

In the first step, a physics-constrained nonlinear stochastic model (Majda and Harlim 2013; Harlim et al. 2014) is developed to calibrate and predict the MISO indices. This physics-constrained low-order stochastic model contains two MISO variables and two hidden variables that couple with each other through energy-conserving nonlinear interactions, and the model involves both correlated multiplicative noise and additive stochastic noise. The model succeeds in capturing the observed non-Gaussian PDFs, power spectrums, and autocorrelations in the MISO indices. An effective data assimilation algorithm that determines the initial ensemble of the hidden variables facilitates the ensemble prediction scheme. It is shown in section 4 that the low-order nonlinear stochastic model is skillful in predicting the MISO indices ranging from 20 to 50 days of lead time in different years.

In the second step, an effective and practical spatiotemporal reconstruction algorithm is developed (section 5), which overcomes the fundamental difficulty in most data decomposition techniques with lagged embedding that requires extra information in the future beyond the predicted range of the time series. The prediction skill of the reconstructed spatiotemporal patterns is consistent with that of the MISO indices.

A few issues are addressed in section 6. First, the model calibration and prediction with a 3-yr short training period are studied. The resulting statistics and prediction skill do not deteriorate significantly compared with those based on a 10-yr training period. This suggests the advantage of utilizing the low-order nonlinear model [(1)] over most nonparametric methods in predicting the MISO indices from a practical point of view (Alexander et al. 2017). Second, the NLSA MISO indices obtained by using different lagged embedding window sizes are compared. The resulting MISO indices with shorter lagged embedding window sizes (q = 48 days and q = 34 days) are polluted by other variabilities, and the corresponding predictability is greatly reduced. Nevertheless, with a q = 48 days lag, the prediction conditioned on the boreal summertime remains skillful. Third, the low-order nonlinear stochastic model with OLR-based parameters remains skillful in both fitting the non-Gaussian statistics and predicting the precipitation MISO indices, implying a significant correlation between the tropical precipitation and OLR.

The simple spatiotemporal reconstruction strategy proposed in this article does not include the discrepancy of the patterns conditioned on different MISO phases. Therefore, applying a phase decomposition method to the NLSA spatial modes is a potential way to improve the spatiotemporal reconstruction in the prediction stage. Yet, the phase decomposition method has an obvious drawback in that the predicted spatiotemporal patterns are discontinuous in time when the corresponding spatial basis transits from one phase to another. One remedy to the discontinuity issue is to introduce a smooth transition between different phases such as adopting a convolution with a Gaussian kernel. On the other hand, the clustering method (Giannakis et al. 2012a) is also a promising technique for recovering more detailed features of the spatial basis conditioned on different phases. In addition, exploring the causality between MISO and other modes is another potential way of improving the MISO predictions. The study of these strategies remains as a future work.

Acknowledgments

The research of A.J.M. is partially supported by the Office of Naval Research Grant ONR MURI N00014-16-1-2161 and the New York University Abu Dhabi Research Institute. N.C. is supported as a postdoctoral fellow following A.J.M.’s ONR MURI Grant. C.T.S., R.S.A., and A.J.M. also acknowledge the support from the Monsoon Mission of the Ministry of Earth Sciences (MoES), Government of India (Grant MM/SERP/NYU/2014/SSC-01/002). The research of C.T.S. and R.S.A. is also supported by the New York University Abu Dhabi Research Institute. The authors thank Dimitrios Giannakis for useful discussions.

APPENDIX A

Calibration of the Nonlinear Stochastic Model with Information Theory

The optimal parameters in the nonlinear low-order stochastic model [(1)] are calibrated by systematically minimizing two quantities: the information distance (sometimes called the model error) between the PDF of the model, $\pi^{M}$, and that of the MISO index, $\pi$, and the RMS error in the autocorrelation functions, up to a lag of three months, between the model signal and the MISO index. Here, the information distance is measured by the relative entropy (Majda and Gershgorin 2010, 2011; Kleeman 2002; Majda and Branicki 2012; Branicki et al. 2013):
$$\mathcal{P}(\pi, \pi^{M}) = \int \pi \ln \frac{\pi}{\pi^{M}}. \tag{A1}$$
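As an illustration of how such an information distance might be evaluated in practice, the sketch below estimates the relative entropy between two sample sets from histograms, writing π for the PDF of the MISO index and π^M for that of the model (notation assumed here). The function name, the number of bins, and the small floor used to avoid taking the logarithm of zero are choices made for this example, not details taken from the article.

```python
import numpy as np


def relative_entropy(truth_samples, model_samples, n_bins=50):
    """Histogram estimate of P(pi, pi_M) = int pi * ln(pi / pi_M).

    truth_samples : samples of the observed index (PDF pi)
    model_samples : samples from the model (PDF pi_M)
    n_bins        : number of histogram bins (an arbitrary choice here)
    """
    truth = np.asarray(truth_samples, dtype=float)
    model = np.asarray(model_samples, dtype=float)
    lo = min(truth.min(), model.min())
    hi = max(truth.max(), model.max())
    edges = np.linspace(lo, hi, n_bins + 1)

    # Bin probabilities; a tiny floor avoids log(0) in empty model bins.
    p, _ = np.histogram(truth, bins=edges)
    q, _ = np.histogram(model, bins=edges)
    p = p / p.sum()
    q = np.maximum(q / q.sum(), 1e-12)

    mask = p > 0                       # bins with p = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))
```

The estimate is zero when the two sample sets coincide and grows as the model PDF departs from the observed one, for instance when the model underestimates the variance or misses the fat tails associated with intermittency.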
The model error dependence on the variation of different parameters is shown in Fig. A1. The flatness of all the curves around the optimal values (black dots) indicates that the nonlinear low-order stochastic model [(1)] is robust with respect to the parameter variations. The huge model error with the underestimation of , and γ as well as the overestimation of is due to the failure of capturing the intermittency, where the variance in the MISO indices is underestimated by the model. Note that the information model error has only a weak dependence on the background phase a. However, the parameter a is crucial in describing the frequency of the intraseasonal oscillation, and its role is reflected in the autocorrelation functions. The other parameters , and in the hidden processes affect both the model error and the autocorrelations. In particular, the nonlinear interactions between the hidden variables and the observed variables induce multiplicative noise. Therefore, the noise amplitude coefficients and influence the correlation functions of and that have a direct contribution to the system memory. Significant errors appear in the statistics if these parameters are outside the optimal range. Note that despite the moderate effect of and on the model error, the physics-constrained nonlinear regression modeling strategies require nonzero noise as the model ansatz (Majda and Harlim 2013; Harlim et al. 2014). The parameter is not an independent parameter given and γ, and therefore we fix its value. The frequency is prescribed such that one time unit of the model is equal to one month in reality. The role of the phase ϕ is to trigger strong intermittency in the boreal summer. Note that neither of the hidden processes is redundant in the nonlinear stochastic model [(1)]. Dropping these stochastic factors results in a large model error. 
In fact, without the hidden variables υ and , even if the time-periodic damping is able to crudely describe the active phase of BSISO in the reduced linear model, a distinct disparity is observed in the model statistics compared with the truth (Chen and Majda 2015a), indicating an intrinsic barrier (Majda and Gershgorin 2011; Majda and Branicki 2012).
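The autocorrelation part of the calibration criterion described above can be sketched in the same spirit. The function names are hypothetical, and the example assumes daily sampling, so that a three-month window corresponds to roughly 90 lags.

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Sample autocorrelation of a 1-D series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x)
    return np.array(
        [np.dot(x[: x.size - k], x[k:]) / var for k in range(max_lag + 1)]
    )


def acf_rms_error(truth, model, max_lag):
    """RMS difference between two autocorrelation functions up to max_lag."""
    diff = autocorrelation(truth, max_lag) - autocorrelation(model, max_lag)
    return float(np.sqrt(np.mean(diff ** 2)))
```

A model whose oscillation frequency is off produces an autocorrelation function that drifts out of phase with the observed one, which this measure penalizes even when the PDFs agree closely.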
Fig. A1.

Sensitivity test. (left) Model error (via information distance) as a function of different parameters. (right) The RMS error in the autocorrelation functions as a function of the parameters related to the phase. The black dots indicate the optimal parameters as listed in Table 1. The two dotted lines in each panel indicate the range of randomly picked suboptimal parameters in the test of prediction.

Citation: Journal of Climate 31, 11; 10.1175/JCLI-D-17-0411.1

The sensitivity of the prediction is studied by randomly drawing suboptimal parameters from the interval bounded by the two dotted lines in each panel of Fig. A1. The prediction skill obtained with these random suboptimal parameters is comparable to that obtained with the optimal parameters.

APPENDIX B

Mathematical Details of Effective Data Assimilation and Prediction Algorithm

Recall the nonlinear low-order stochastic model [(1)]. Denote and . The abstract form of the low-order stochastic model [(1)] is given as follows:
eb1a
eb1b
where
eq3
The model [(B1)] is a conditional Gaussian system conditioned on the observations , meaning that once the observations are given, the dynamics of in (B1) becomes a Gaussian system (Chen and Majda 2016). The special structure of the system [(B1)] allows for the closed analytic formulas for the evolution of the conditional Gaussian distributions of the hidden parameters υ and (Liptser and Shiryaev 2001) obtained in the Bayesian framework:
eb2
where and are the posterior mean and posterior covariance of the conditional distributions, respectively. The asterisk represents the complex conjugate.
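As a rough illustration of how closed-form filtering works for a conditional Gaussian system, the following sketch applies the standard continuous-time filter equations (in the form given by Liptser and Shiryaev 2001) to a toy real-valued system with one observed variable `u` and one hidden stochastic damping `v`. The toy system, the constant forcing `f`, all parameter values, and the Euler-Maruyama discretization are assumptions made here for illustration; this is not the model in (1).

```python
import numpy as np


def simulate_and_filter(n_steps=20000, dt=1e-3, seed=0):
    """Toy conditional Gaussian system and its closed-form filter.

        du = (-v * u + f) dt + sigma_u dW_u           (observed)
        dv = -d_v * (v - v_bar) dt + sigma_v dW_v     (hidden)

    Given the path of u, v is Gaussian with mean mu and variance R:
        dmu = -d_v (mu - v_bar) dt
              + (R * A1 / sigma_u^2) * (du - (f + A1 * mu) dt)
        dR  = (-2 d_v R + sigma_v^2 - (R * A1)^2 / sigma_u^2) dt
    with A1 = -u. All parameter values are arbitrary illustrative choices.
    """
    rng = np.random.default_rng(seed)
    d_v, v_bar, sigma_v = 1.0, 1.0, 0.5
    sigma_u, f = 0.5, 1.0
    prior_var = sigma_v**2 / (2 * d_v)

    u = 1.0
    v = v_bar + np.sqrt(prior_var) * rng.normal()
    mu, R = v_bar, prior_var                  # start from the prior statistics
    v_true = np.empty(n_steps)
    mu_post = np.empty(n_steps)
    R_post = np.empty(n_steps)

    for n in range(n_steps):
        # Euler-Maruyama step of the true system
        dW_u, dW_v = rng.normal(0.0, np.sqrt(dt), size=2)
        du = (-v * u + f) * dt + sigma_u * dW_u
        dv = -d_v * (v - v_bar) * dt + sigma_v * dW_v

        # Filter update driven only by the observed increment du
        A1 = -u
        gain = R * A1 / sigma_u**2
        dmu = -d_v * (mu - v_bar) * dt + gain * (du - (f + A1 * mu) * dt)
        dR = (-2 * d_v * R + sigma_v**2 - (R * A1) ** 2 / sigma_u**2) * dt

        u, v = u + du, v + dv
        mu, R = mu + dmu, R + dR
        v_true[n], mu_post[n], R_post[n] = v, mu, R

    return v_true, mu_post, R_post
```

The posterior variance falls below the prior variance whenever the observed variable is informative about the hidden one (here, whenever `u` is away from zero), which is the mechanism that makes the online estimate of the hidden variables useful as an initial condition for prediction.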

Figure B1 shows the posterior mean and posterior variance of stochastic damping υ and stochastic phase in (1). The posterior mean estimates of the hidden variables υ and have large variations, where the overall amplitudes are comparable to those associated with and a. In addition, the posterior variance that reflects the uncertainty changes significantly in time. Thus, the two hidden variables have important contributions to the coupled model, and an accurate online estimation of their initial values is crucial for skillful predictions.

Fig. B1.

Recovery of the posterior mean and variance of stochastic damping υ and stochastic phase in (1). The cross-correlation is negligible and is thus omitted here.


APPENDIX C

Spatiotemporal Patterns with Different Lagged Embedding Window Sizes

Figures C1–C3 show the spatiotemporal patterns of boreal summer (June–September) in year 2004 obtained by NLSA with lagged embedding window sizes , and 34, respectively, and Fig. C4 compares those of boreal winter (only January is shown) in year 2004.
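For readers unfamiliar with the role of the window size, the following sketch shows a generic time-lagged (delay-coordinate) embedding of a space-time data matrix. It is a minimal illustration of the embedding step only, not of NLSA itself; the function name and the ordering of the stacked snapshots are choices made for this example.

```python
import numpy as np


def lagged_embedding(data, q):
    """Stack q consecutive snapshots into one embedded vector per time.

    data : array (n_times, n_grid), one spatial snapshot per row
    q    : lagged embedding window size, in samples
    Returns an array (n_times - q + 1, q * n_grid); row i holds
    data[i], data[i + 1], ..., data[i + q - 1] concatenated.
    """
    data = np.asarray(data)
    n_times, n_grid = data.shape
    if not 1 <= q <= n_times:
        raise ValueError("q must satisfy 1 <= q <= n_times")
    windows = np.stack([data[i : i + q] for i in range(n_times - q + 1)])
    return windows.reshape(n_times - q + 1, q * n_grid)
```

A longer window makes each embedded vector span more of the record, so each vector carries more temporal context, at the cost of losing `q - 1` samples from the record and of requiring a full window of data around each time of interest.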

Fig. C1.

Spatiotemporal patterns of boreal summer (June–September) in year 2004 obtained by applying NLSA with a lagged embedding window size .


Fig. C2.

As in Fig. C1, but with .


Fig. C3.

As in Fig. C1, but with .


Fig. C4.

Spatiotemporal patterns of boreal winter (only January is shown here) in year 2004 obtained by NLSA with different lagged embedding window sizes: (top) , (middle) , and (bottom) .


REFERENCES

  • Abhilash, S., A. K. Sahai, S. Pattnaik, B. N. Goswami, and A. Kumar, 2014a: Extended range prediction of active-break spells of Indian summer monsoon rainfall using an ensemble prediction system in NCEP Climate Forecast System. Int. J. Climatol., 34, 98–113, https://doi.org/10.1002/joc.3668.
  • Abhilash, S., and Coauthors, 2014b: Prediction and monitoring of monsoon intraseasonal oscillations over Indian monsoon region in an ensemble prediction system using CFSv2. Climate Dyn., 42, 2801–2815, https://doi.org/10.1007/s00382-013-2045-9.
  • Acharya, N., S. C. Kar, U. Mohanty, M. A. Kulkarni, and S. K. Dash, 2011: Performance of GCMs for seasonal prediction over India—A case study for 2009 monsoon. Theor. Appl. Climatol., 105, 505–520, https://doi.org/10.1007/s00704-010-0396-2.
  • Alexander, R., Z. Zhao, E. Székely, and D. Giannakis, 2017: Kernel analog forecasting of tropical intraseasonal oscillations. J. Atmos. Sci., 74, 1321–1342, https://doi.org/10.1175/JAS-D-16-0147.1.
  • Belkin, M., and P. Niyogi, 2003: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput., 15, 1373–1396, https://doi.org/10.1162/089976603321780317.
  • Branicki, M., N. Chen, and A. J. Majda, 2013: Non-Gaussian test models for prediction and state estimation with model errors. Chin. Ann. Math., 34B, 29–64, https://doi.org/10.1007/s11401-012-0759-3.
  • Brenowitz, N. D., D. Giannakis, and A. J. Majda, 2016: Nonlinear Laplacian spectral analysis of Rayleigh–Bénard convection. J. Comput. Phys., 315, 536–553, https://doi.org/10.1016/j.jcp.2016.03.051.
  • Chen, N., and A. J. Majda, 2015a: Predicting the cloud patterns for the boreal summer intraseasonal oscillation through a low-order stochastic model. Math. Climate Wea. Forecasting, 1, 1–20, https://doi.org/10.1515/mcwf-2015-0001.
  • Chen, N., and A. J. Majda, 2015b: Predicting the real-time multivariate Madden–Julian oscillation index through a low-order nonlinear stochastic model. Mon. Wea. Rev., 143, 2148–2169, https://doi.org/10.1175/MWR-D-14-00378.1.
  • Chen, N., and A. J. Majda, 2016: Filtering nonlinear turbulent dynamical systems through conditional Gaussian statistics. Mon. Wea. Rev., 144, 4885–4917, https://doi.org/10.1175/MWR-D-15-0437.1.
  • Chen, N., A. J. Majda, and D. Giannakis, 2014: Predicting the cloud patterns of the Madden–Julian oscillation through a low-order nonlinear stochastic model. Geophys. Res. Lett., 41, 5612–5619, https://doi.org/10.1002/2014GL060876.
  • Coifman, R. R., and S. Lafon, 2006: Diffusion maps. Appl. Comput. Harmon. Anal., 21, 5–30, https://doi.org/10.1016/j.acha.2006.04.006.

  • Crommelin, D. T., and A. J. Majda, 2004: Strategies for model reduction: Comparing different optimal bases. J. Atmos. Sci., 61, 2206–2217, https://doi.org/10.1175/1520-0469(2004)061<2206:SFMRCD>2.0.CO;2.
  • DelSole, T., and J. Shukla, 2002: Linear prediction of Indian monsoon rainfall. J. Climate, 15, 3645–3658, https://doi.org/10.1175/1520-0442(2002)015<3645:LPOIMR>2.0.CO;2.
  • Gadgil, S., 2003: The Indian monsoon and its variability. Annu. Rev. Earth Planet. Sci., 31, 429–467, https://doi.org/10.1146/annurev.earth.31.100901.141251.
  • Ghil, M., and Coauthors, 2002: Advanced spectral methods for climatic time series. Rev. Geophys., 40, 1003, https://doi.org/10.1029/2000RG000092.

  • Giannakis, D., and A. J. Majda, 2011: Time series reconstruction via machine learning: Revealing decadal variability and intermittency in the North Pacific sector of a coupled climate model. 2011 Conf. on Intelligent Data Understanding (CIDU), Mountain View, CA, NASA, 107–117.

  • Giannakis, D., and A. J. Majda, 2012a: Comparing low-frequency and intermittent variability in comprehensive climate models through nonlinear Laplacian spectral analysis. Geophys. Res. Lett., 39, L10710, https://doi.org/10.1029/2012GL051575.

  • Giannakis, D., and A. J. Majda, 2012b: Nonlinear Laplacian spectral analysis for time series with intermittency and low-frequency variability. Proc. Natl. Acad. Sci. USA, 109, 2222–2227, https://doi.org/10.1073/pnas.1118984109.
  • Giannakis, D., and A. J. Majda, 2013: Nonlinear Laplacian spectral analysis: Capturing intermittent and low-frequency spatiotemporal patterns in high-dimensional data. Stat. Anal. Data Min., 6, 180–194, https://doi.org/10.1002/sam.11171.
  • Giannakis, D., A. J. Majda, and I. Horenko, 2012a: Information theory, model error, and predictive skill of stochastic models for complex nonlinear systems. Physica D, 241, 1735–1752, https://doi.org/10.1016/j.physd.2012.07.005.
  • Giannakis, D., W.-W. Tung, and A. J. Majda, 2012b: Hierarchical structure of the Madden–Julian oscillation in infrared brightness temperature revealed through nonlinear Laplacian spectral analysis. 2012 Conf. on Intelligent Data Understanding, Boulder, CO, NASA, 55–62, https://doi.org/10.1109/CIDU.2012.6382201.

  • Golyandina, N., V. Nekrutkin, and A. A. Zhigljavsky, 2001: Analysis of Time Series Structure: SSA and Related Techniques. CRC Press, 320 pp.

  • Goswami, B. N., and R. S. Ajayamohan, 2001: Intraseasonal oscillations and interannual variability of the Indian summer monsoon. J. Climate, 14, 1180–1198, https://doi.org/10.1175/1520-0442(2001)014<1180:IOAIVO>2.0.CO;2.
  • Goswami, B. N., V. Krishnamurthy, and H. Annamalai, 1999: A broad-scale circulation index for the interannual variability of the Indian summer monsoon. Quart. J. Roy. Meteor. Soc., 125, 611–633, https://doi.org/10.1002/qj.49712555412.
  • Goswami, B. N., R. S. Ajayamohan, P. K. Xavier, and D. Sengupta, 2003: Clustering of synoptic activity by Indian summer monsoon intraseasonal oscillations. Geophys. Res. Lett., 30, 1431, https://doi.org/10.1029/2002GL016734.

  • Harlim, J., A. Mahdi, and A. J. Majda, 2014: An ensemble Kalman filter for statistical estimation of physics constrained nonlinear regression models. J. Comput. Phys., 257, 782–812, https://doi.org/10.1016/j.jcp.2013.10.025.
  • Huffman, G. J., R. F. Adler, M. M. Morrissey, D. T. Bolvin, S. Curtis, R. Joyce, B. McGavock, and J. Susskind, 2001: Global precipitation at one-degree daily resolution from multisatellite observations. J. Hydrometeor., 2, 36–50, https://doi.org/10.1175/1525-7541(2001)002<0036:GPAODD>2.0.CO;2.
  • Kikuchi, K., B. Wang, and Y. Kajikawa, 2012: Bimodal representation of the tropical intraseasonal oscillation. Climate Dyn., 38, 1989–2000, https://doi.org/10.1007/s00382-011-1159-1.
  • Kleeman, R., 2002: Measuring dynamical prediction utility using relative entropy. J. Atmos. Sci., 59, 2057–2072, https://doi.org/10.1175/1520-0469(2002)059<2057:MDPUUR>2.0.CO;2.
  • Kondrashov, D., M. D. Chekroun, A. W. Robertson, and M. Ghil, 2013: Low-order stochastic model and “past-noise forecasting” of the Madden–Julian oscillation. Geophys. Res. Lett., 40, 5305–5310, https://doi.org/10.1002/grl.50991.
  • Kravtsov, S., D. Kondrashov, and M. Ghil, 2005: Multilevel regression modeling of nonlinear processes: Derivation and applications to climatic variability. J. Climate, 18, 4404–4424, https://doi.org/10.1175/JCLI3544.1.
  • Krishnamurthy, V., and J. Shukla, 2007: Intraseasonal and seasonally persisting patterns of Indian monsoon rainfall. J. Climate, 20, 3–20, https://doi.org/10.1175/JCLI3981.1.
  • Lau, W. K.-M., and D. E. Waliser, 2011: Intraseasonal Variability in the Atmosphere–Ocean Climate System. 2nd ed. Springer, 614 pp.

  • Lee, J.-Y., B. Wang, M. C. Wheeler, X. Fu, D. E. Waliser, and I.-S. Kang, 2013: Real-time multivariate indices for the boreal summer intraseasonal oscillation over the Asian summer monsoon region. Climate Dyn., 40, 493–509, https://doi.org/10.1007/s00382-012-1544-4.
  • Liptser, R. S., and A. N. Shiryaev, 2001: II. Applications. Vol. 2, Statistics of Random Processes, Springer, 402 pp.

  • Majda, A. J., and B. Gershgorin, 2010: Quantifying uncertainty in climate change science through empirical information theory. Proc. Natl. Acad. Sci. USA, 107, 14 958–14 963, https://doi.org/10.1073/pnas.1007009107.
  • Majda, A. J., and B. Gershgorin, 2011: Improving model fidelity and sensitivity for complex systems through empirical information theory. Proc. Natl. Acad. Sci. USA, 108, 10 044–10 049, https://doi.org/10.1073/pnas.1105174108.
  • Majda, A. J., and M. Branicki, 2012: Lessons in uncertainty quantification for turbulent dynamical systems. Discrete Contin. Dyn. Syst., 32, 3133–3221, https://doi.org/10.3934/dcds.2012.32.3133.
  • Majda, A. J., and Y. Yuan, 2012: Fundamental limitations of ad hoc linear and quadratic multi-level regression models for physical systems. Discrete Contin. Dyn. Syst., 17B, 1333–1363, https://doi.org/10.3934/dcdsb.2012.17.1333.
  • Majda, A. J., and J. Harlim, 2013: Physics constrained nonlinear regression models for time series. Nonlinearity, 26, 201–217, https://doi.org/10.1088/0951-7715/26/1/201.
  • Murakami, T., L.-X. Chen, and A. Xie, 1986: Relationship among seasonal cycles, low-frequency oscillations, and transient disturbances as revealed from outgoing longwave radiation data. Mon. Wea. Rev., 114, 1456–1465, https://doi.org/10.1175/1520-0493(1986)114<1456:RASCLF>2.0.CO;2.
  • Nair, A., U. Mohanty, A. W. Robertson, T. Panda, J.-J. Luo, and T. Yamagata, 2014: An analytical study of hindcasts from general circulation models for Indian summer monsoon rainfall. Meteor. Appl., 21, 695–707, https://doi.org/10.1002/met.1395.
  • Packard, N. H., J. P. Crutchfield, J. D. Farmer, and R. S. Shaw, 1980: Geometry from a time series. Phys. Rev. Lett., 45, 712, https://doi.org/10.1103/PhysRevLett.45.712.

  • Pattanaik, D. R., and A. Kumar, 2010: Prediction of summer monsoon rainfall over India using the NCEP Climate Forecast System. Climate Dyn., 34, 557–572, https://doi.org/10.1007/s00382-009-0648-y.
  • Rajeevan, M., D. S. Pai, R. A. Kumar, and B. Lal, 2007: New statistical models for long-range forecasting of southwest monsoon rainfall over India. Climate Dyn., 28, 813–828, https://doi.org/10.1007/s00382-006-0197-6.
  • Sabeerali, C. T., R. S. Ajayamohan, D. Giannakis, and A. J. Majda, 2017: Extraction and prediction of indices for monsoon intraseasonal oscillations: An approach based on nonlinear Laplacian spectral analysis. Climate Dyn., 49, 3031–3050, https://doi.org/10.1007/s00382-016-3491-y.
  • Sahai, A. K., and Coauthors, 2013: Simulation and extended range prediction of monsoon intraseasonal oscillations in NCEP CFS/GFS version 2 framework. Curr. Sci., 104, 1394–1408.
  • Sauer, T., J. A. Yorke, and M. Casdagli, 1991: Embedology. J. Stat. Phys., 65, 579–616, https://doi.org/10.1007/BF01053745.

  • Sikka, D. R., and S. Gadgil, 1980: On the maximum cloud zone and the ITCZ over Indian longitudes during the southwest monsoon. Mon. Wea. Rev., 108, 1840–1853, https://doi.org/10.1175/1520-0493(1980)108<1840:OTMCZA>2.0.CO;2.
  • Slawinska, J., and D. Giannakis, 2017: Indo-Pacific variability on seasonal to multidecadal time scales. Part I: Intrinsic SST modes in models and observations. J. Climate, 30, 5265–5294, https://doi.org/10.1175/JCLI-D-16-0176.1.
  • Suhas, E., J. M. Neena, and B. N. Goswami, 2013: An Indian monsoon intraseasonal oscillations (MISO) index for real time monitoring and forecast verification. Climate Dyn., 40, 2605–2616, https://doi.org/10.1007/s00382-012-1462-5.
  • Székely, E., D. Giannakis, and A. J. Majda, 2016a: Extraction and predictability of coherent intraseasonal signals in infrared brightness temperature data. Climate Dyn., 46, 1473–1502, https://doi.org/10.1007/s00382-015-2658-2.
  • Székely, E., D. Giannakis, and A. J. Majda, 2016b: Initiation and termination of intraseasonal oscillations in nonlinear Laplacian spectral analysis-based indices. Math. Climate Wea. Forecasting, 2, 1–25, https://doi.org/10.1515/mcwf-2016-0001.
  • Takens, F., and Coauthors, 1981: Detecting strange attractors in turbulence. Lect. Notes Math., 898, 366–381, https://doi.org/10.1007/BFb0091924.

  • Thomson, D. J., 1982: Spectrum estimation and harmonic analysis. Proc. IEEE, 70, 1055–1096, https://doi.org/10.1109/PROC.1982.12433.

  • Tung, W.-W., D. Giannakis, and A. J. Majda, 2014: Symmetric and antisymmetric convection signals in the Madden–Julian oscillation. Part I: Basic modes in infrared brightness temperature. J. Atmos. Sci., 71, 3302–3326, https://doi.org/10.1175/JAS-D-13-0122.1.
  • Wang, B., Q. Ding, X. Fu, I.-S. Kang, K. Jin, J. Shukla, and F. Doblas-Reyes, 2005: Fundamental challenge in simulation and prediction of summer monsoon rainfall. Geophys. Res. Lett., 32, L15711, https://doi.org/10.1029/2005GL022734.

  • Webster, P. J., V. O. Magaña, T. N. Palmer, J. Shukla, R. A. Tomas, M. Yanai, and T. Yasunari, 1998: Monsoons: Processes, predictability, and the prospects for prediction. J. Geophys. Res., 103, 14 451–14 510, https://doi.org/10.1029/97JC02719.
  • Wheeler, M. C., and H. H. Hendon, 2004: An all-season real-time multivariate MJO index: Development of an index for monitoring and prediction. Mon. Wea. Rev., 132, 1917–1932, https://doi.org/10.1175/1520-0493(2004)132<1917:AARMMI>2.0.CO;2.
  • Zhang, C., 2005: Madden–Julian Oscillation. Rev. Geophys., 43, RG2003, https://doi.org/10.1029/2004RG000158.

Save
  • Abhilash, S., A. K. Sahai, S. Pattnaik, B. N. Goswami, and A. Kumar, 2014a: Extended range prediction of active-break spells of Indian summer monsoon rainfall using an ensemble prediction system in NCEP Climate Forecast System. Int. J. Climatol., 34, 98113, https://doi.org/10.1002/joc.3668.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Abhilash, S., and Coauthors, 2014b: Prediction and monitoring of monsoon intraseasonal oscillations over Indian monsoon region in an ensemble prediction system using CFSv2. Climate Dyn., 42, 28012815, https://doi.org/10.1007/s00382-013-2045-9.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Acharya, N., S. C. Kar, U. Mohanty, M. A. Kulkarni, and S. K. Dash, 2011: Performance of GCMs for seasonal prediction over India—A case study for 2009 monsoon. Theor. Appl. Climatol., 105, 505520, https://doi.org/10.1007/s00704-010-0396-2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Alexander, R., Z. Zhao, E. Székely, and D. Giannakis, 2017: Kernel analog forecasting of tropical intraseasonal oscillations. J. Atmos. Sci., 74, 13211342, https://doi.org/10.1175/JAS-D-16-0147.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Belkin, M., and P. Niyogi, 2003: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput., 15, 13731396, https://doi.org/10.1162/089976603321780317.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Branicki, M., N. Chen, and A. J. Majda, 2013: Non-Gaussian test models for prediction and state estimation with model errors. Chin. Ann. Math., 34B, 2964, https://doi.org/10.1007/s11401-012-0759-3.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Brenowitz, N. D., D. Giannakis, and A. J. Majda, 2016: Nonlinear Laplacian spectral analysis of Rayleigh–Bénard convection. J. Comput. Phys., 315, 536553, https://doi.org/10.1016/j.jcp.2016.03.051.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Chen, N., and A. J. Majda, 2015a: Predicting the cloud patterns for the boreal summer intraseasonal oscillation through a low-order stochastic model. Math. Climate Wea. Forecasting, 1, 120, https://doi.org/10.1515/mcwf-2015-0001.

    • Search Google Scholar
    • Export Citation
  • Chen, N., and A. J. Majda, 2015b: Predicting the real-time multivariate Madden–Julian oscillation index through a low-order nonlinear stochastic model. Mon. Wea. Rev., 143, 21482169, https://doi.org/10.1175/MWR-D-14-00378.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Chen, N., and A. J. Majda, 2016: Filtering nonlinear turbulent dynamical systems through conditional Gaussian statistics. Mon. Wea. Rev., 144, 48854917,