Hybrid Model for Multistep-Ahead Rainfall Forecast in Northeast India: A Comparative Study

Priya Ashok Shejule, Department of Civil Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India

https://orcid.org/0000-0001-5895-2002
and
Sreeja Pekkat, Department of Civil Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India

Abstract

Among all hydrometeorological parameters, rainfall strongly correlates with hydrometeorological disasters. The rainfall forecast process remains challenging due to the nonlinear, nonstationary nature and multiscale variability of rainfall. Moreover, the unique microclimate in different regions further complicates the forecasting process. This study proposes a hybrid model employing multivariate singular spectrum analysis (MSSA) and long short-term memory (LSTM) for multistep-ahead hourly rainfall forecasting in urban areas of northeast India. The model is trained and evaluated using high-resolution (12 km) hourly meteorological data from the Indian Monsoon Data Assimilation and Analysis (IMDAA) dataset for Guwahati (plain) and Aizawl (hilly) regions from 2015 to 2019. The hybrid model outperforms the single LSTM model in both plain and hilly regions, with an average percentage gain of 47.99% and 43.88% for symmetric mean absolute percentage error (SMAPE) and root-mean-square error (RMSE) in the case of the Guwahati dataset and 84.59% and 82.27% in the case of the Aizawl dataset, respectively. The performance of the LSTM model significantly improves as the zero values in the observed data are eliminated after reconstruction by MSSA. This enables the model to discern essential patterns and relationships in the data, which leads to more accurate forecasts. However, the hybrid model underestimates the rainfall, which can be tackled by hypertuning the parameters. The study highlights the importance of considering the interplay between rainfall and meteorological parameters for accurate rainfall forecasting in urban areas. The proposed MSSA–LSTM model can be used as a decision support tool for urban planning and disaster management.

Significance Statement

This study addresses a critical need in multistep-ahead hourly rainfall forecasting in urban areas, with a focus on the unique conditions of northeast India. Our research has identified the presence of red noise in the rainfall data, shedding light on the complexities of the underlying rainfall patterns. Furthermore, we delve into the intricate interplay between rainfall and meteorological parameters, providing valuable insights into the factors influencing rainfall dynamics. Notably, our study underscores the region-specific challenges in rainfall forecasting. While the hybrid model demonstrates reasonable accuracy in the plains, its performance in the hilly region falls short of expectations. This highlights the nuanced nature of rainfall prediction in areas with varying topography and emphasizes the need for tailored forecasting approaches.

© 2024 American Meteorological Society. This published article is licensed under the terms of the default AMS reuse license. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Priya Ashok Shejule, spriya@iitg.ac.in

1. Introduction

Sufficient knowledge and monitoring of hydrometeorological conditions are extremely important for weather and climate research communities all over the globe. Global climate change is affecting the rainfall intensity and frequency, which can result in extreme events such as floods and droughts (Chang et al. 2022). Among the many meteorological parameters, rainfall shows the highest correlation with hydrometeorological disasters affecting the water resource-based sectors, ecosystems, agricultural services, flight operations, and overall economic planning of the nation (Panda and Sahu 2019). The disastrous impact of these events can be reduced with the help of reliable rainfall forecast. Over the last few decades, the growing understanding of physical processes and technological advances have encouraged accurate rainfall forecast (Benjamin et al. 2019). However, the forecast process remains challenging due to the nonlinear, nonstationary nature and multiscale variability of rainfall.

Machine learning (ML) models have received considerable attention from researchers in the field of hydrology. Artificial neural networks (ANNs) have proven able to forecast rainfall on short-term as well as long-term scales (Abbot and Marohasy 2012; Khastagir et al. 2022). The adaptive neuro-fuzzy inference system (ANFIS), support vector machines (SVMs), random forest (RF), decision tree (DT), and genetic algorithms (GAs) (Hung et al. 2009; Faulina et al. 2012; Yu et al. 2017; Das et al. 2017; Kumar and Ramesh 2018; Ratnam et al. 2019) are some of the popular ML models found to be very effective in handling complex nonlinear systems. Among these, ANN is a popular tool that offers pattern identification, noise tolerance, and low computational cost and adapts to the complexity of the data (Dawson and Wilby 2001; Litta et al. 2013). Despite this flexibility, previous studies do not recommend the application of ANN in real-time rainfall forecast (Hung et al. 2009). To address this limitation, other variants of ANN such as the multilayer perceptron, feed-forward neural networks (FFNNs), backpropagation neural networks (BPNNs), neuro-fuzzy neural networks (NFNNs), and recurrent neural networks (RNNs) have been developed, each with a unique forecast mechanism. Hydrological variables such as rainfall and runoff are time dependent, i.e., current and future events are affected by previous hydrological events. This makes forecasting even more challenging. To handle the sequential dependence in data series, the RNN model is recommended (Bergen et al. 2019). However, the architectural design of RNN prevents it from effectively addressing the vanishing gradient problem, thereby limiting its capacity to learn and retain long-term information (Hochreiter and Schmidhuber 1997). To overcome the vanishing gradient problem, long short-term memory (LSTM) networks, a special kind of RNN, were designed by introducing the concept of a memory cell. The LSTM network retains sequential information over long periods and can effectively tackle the vanishing gradient problem (Wang et al. 2022). LSTM has gained wide attention from researchers in predicting weather parameters due to its ability to handle complex long-term dependencies in weather patterns (Salman et al. 2018; Hewage et al. 2019; Fan et al. 2022; Nitesh et al. 2023), making it a popular tool in weather forecasting.

Recent works on the application of LSTM networks in the field of hydrology are listed in Table S1 in the online supplemental material. Salman et al. (2018) proposed single and multilayer LSTM models to forecast weather parameters for an Indonesian airport using a time series of 40 025 records. The authors suggest that incorporating an intermediate variable within the LSTM block can enhance the understanding of input patterns and consequently improve forecast accuracy. Many researchers (Yunpeng et al. 2017; Kratzert et al. 2018) have demonstrated the superiority of the LSTM model in multistep-ahead time series prediction. However, a single LSTM model will not always give a better rainfall forecast (Wu et al. 2021). Chen et al. (2022) introduced a novel hybrid framework for short-term wind forecast employing LSTM and BPNN techniques. The authors applied singular spectrum analysis (SSA) and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) for denoising and relevant information extraction. In addition, an improved sparrow search algorithm was implemented to optimize the BPNN. Another study by Le et al. (2019) revealed that LSTM can be successfully applied for 1-, 2-, and 3-day-ahead flood forecasting with performance accuracy above 86%. Wu et al. (2021), for the first time, utilized a hybrid wavelet–autoregressive integrated moving average (ARIMA)–LSTM (W-AL) model for monthly precipitation forecast in different climatic regions of China. They concluded that the model performs well in humid and semihumid regions, while the results are worse in arid and semiarid climatic regions. The aforementioned studies have demonstrated the superior performance of hybrid LSTM models in weather forecast on a daily scale.

Rainfall phenomenon is chaotic and can be considered a quasi-periodic signal interfering with the noise/irregular component (Wu et al. 2009; Umut 2012). Eliminating the irregular component from the highly nonstationary rainfall series can help in improving the forecast performance (Wu et al. 2010; Wang et al. 2015). Khorram and Jehbez (2023) applied the convolutional neural network (CNN)–LSTM model for inflow forecast and recommended the application of the decomposition technique to handle the nonlinearity in the river flow data. The SSA, wavelet analysis (WA), and ensemble empirical mode decomposition (EEMD) are some of the popular data decomposition techniques in the time series analysis (Vautard and Ghil 1989; Partal and Kişi 2007; Wu and Huang 2009). SSA is a widely known nonparametric approach that addresses complex nonlinear interactions (Vautard et al. 1992; Elsner and Tsonis 1996). Unlike WA and EEMD, SSA excels at revealing internal patterns within the time series, regardless of their length, and thus is suitable for rainfall forecasting. Rainfall is a highly erratic phenomenon affected by a multitude of meteorological parameters. To achieve reliable forecasts, it is crucial to consider the impact of several meteorological parameters, such as temperature, relative humidity, surface pressure, mean sea level pressure, and wind speed, on rainfall. To the best of our knowledge, a hybrid model considering all the above factors has not been explored fully.

Predictions of rainfall in northeast India mostly rely on physical and statistical models (Durai and Bhardwaj 2014; Joseph et al. 2015; Murthy et al. 2018). A thorough literature review revealed that spectral approaches and machine learning models can be combined to enhance the nowcast and short-range rainfall forecast for the research area (which receives heavy rainfall). In the present study, we considered two highest rainfall receiving areas in northeast India with different topographic conditions. An urban catchment of Guwahati, located in northeast India, is exposed to flood disasters frequently due to its intricate topography. Situated by the Brahmaputra River, its danger level is 49.68 m above mean sea level. The flood risk in the area is influenced by local meteorology, atmospheric circulation, and uncertain factors impacting daily rainfall, posing challenges for accurate forecasting (Khan and Maity 2020). The performance accuracy depends on many factors, such as input data, pattern identification, regional rainfall characteristics, design, and the forecast approach of the model (Hutter et al. 2019). In this context, to check the applicability of the model, a hilly region, Aizawl, in northeast India, with varying rainfall patterns, is also considered. High model performance requires a thoughtful selection of input features, along with appropriate preprocessing techniques and a suitable ML model. In the present study, the gamma test is performed to select the relevant inputs having a maximum influence on the target variable, i.e., rainfall.

In this study, we attempt to answer the following questions: 1) Is multivariate singular spectrum analysis (MSSA), combined with the gamma test for feature selection, an effective noise removal technique that improves the performance of an ML model by identifying the patterns in meteorological series? 2) What is the nature of the noise present in the data under consideration? 3) Does the hybrid forecast approach integrating multivariate SSA and LSTM, i.e., MSSA–LSTM, improve hourly rainfall forecast when the interplay between rainfall and various meteorological factors is considered? 4) What is the influence of regional rainfall characteristics on the forecast accuracy? In addition, the performance of the single model and the hybrid model in multistep hourly forecast is evaluated using statistical measures such as symmetric mean absolute percentage error (SMAPE), root-mean-square error (RMSE), mean negative error (MNE), and Nash–Sutcliffe efficiency (NSE), to suggest the best model for hourly rainfall forecast in a given area. The future agenda is to consider the applicability of nowcasting for managing urban flash floods.

2. Study area and data preprocessing

a. Study area

This study considers meteorological parameters such as rainfall (mm), temperature (°C), relative humidity (%), wind speed (horizontal) (m s−1), mean sea level pressure (Pa), and surface pressure (Pa) at hourly intervals. Data are collected from the Indian Monsoon Data Assimilation and Analysis (IMDAA) for two cities, namely, Guwahati and Aizawl, located in northeast India, during 2015–19. Figure 1 shows the location map of the study areas. The high-frequency IMDAA datasets are available freely at https://rds.ncmrwf.gov.in, providing high-resolution (12 km) reanalysis data.

Fig. 1. Location map of the study areas.

Citation: Journal of Hydrometeorology 25, 8; 10.1175/JHM-D-23-0173.1

Guwahati city is located in the state of Assam, northeast India, between 26°4′45″ and 26°13′25″N latitude and 91°34′25″ and 91°52′00″E longitude, 55 m above mean sea level (MSL). It has a humid subtropical climate and is prone to flooding. This flood-prone area is situated on the banks of the Brahmaputra River and is interspersed with many hills and wetlands (Suresh and Pekkat 2023). Aizawl, on the other hand, lies 1132 m above MSL between 23°42′26″ and 23°45′59″N latitude and 92°40′00″ and 92°50′00″E longitude; with its high elevation, small rivers, and rugged terrain, it has fewer incidences of flooding. Table S2 displays statistical attributes (mean, standard deviation, coefficient of variation, coefficient of skewness, maximum and minimum value, etc.) of the considered meteorological parameters for both cities.

b. Data preprocessing

This study focuses on meteorological parameters (e.g., rainfall, temperature, surface pressure, relative humidity, mean sea level pressure, and wind speed) with varied measurement scales. To handle the issue of dependency on units, data preprocessing is required. In this study, standardization is employed, ensuring a uniform scale while preserving original values. Here, data are made consistent with the mean being equal to zero and the standard deviation being equal to one. The formula for data standardization is as follows:
$$m_{\mathrm{std}} = \frac{m_i - \mu}{\sigma}.$$
Here, $m_i$ is the original observed data, $m_{\mathrm{std}}$ is the standardized value, $\mu$ is the mean, and $\sigma$ is the standard deviation of the meteorological variables. Standardization enhances optimization algorithm efficacy in ML models (Sun et al. 2022). Postforecast data can be destandardized to restore the original value.
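
The standardization and its inverse can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the sample values, and the use of the population standard deviation are assumptions.

```python
import numpy as np

def standardize(m):
    """Return (m_std, mu, sigma) with m_std = (m - mu) / sigma per column."""
    m = np.asarray(m, dtype=float)
    mu = m.mean(axis=0)
    sigma = m.std(axis=0)  # population standard deviation (an assumption)
    return (m - mu) / sigma, mu, sigma

def destandardize(m_std, mu, sigma):
    """Invert the standardization to recover values in the original units."""
    return np.asarray(m_std, dtype=float) * sigma + mu

# Hypothetical hourly values: columns are rainfall (mm) and temperature (deg C).
data = np.array([[0.0, 24.1], [1.2, 23.8], [3.4, 23.0], [0.2, 25.2]])
m_std, mu, sigma = standardize(data)
restored = destandardize(m_std, mu, sigma)
```

Keeping `mu` and `sigma` from the training data is what allows a forecast made on the standardized scale to be converted back to millimetres of rainfall.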

3. Model development

This section explains in detail the methods adopted in the study.

a. MSSA

MSSA is a natural extension of the SSA technique developed for handling multiple time series (Broomhead and King 1986). It considers the correlation among the multiple time series to understand the dependencies and patterns in them. Its many applications include seismic analysis, denoising and prediction of nonlinear time series, reconstruction of missing data, and medical studies, among others (Ansari et al. 2022). The MSSA algorithm is similar to that of SSA, with a decomposition phase and a reconstruction phase. The steps involved are presented below (Rodrigues and Mahmoudvand 2018; Ohmichi et al. 2022).

  • Step i: Embedding

    Let $Z = \{Z_N^{p} = (z_1^{p}, z_2^{p}, \ldots, z_N^{p}) : p = 1, 2, \ldots, P\}$ be a multivariate time series of length $N$ with $P$ series. The first step is the embedding step, which converts the time series to the trajectory matrix $Y_P$, represented as $Y_P = [Y_1^{1}, Y_2^{2}, \ldots, Y_W^{P}]$ with vectors $Y_i^{p} = (z_i^{p}, \ldots, z_{i+W-1}^{p})^{T}$, $i = 1, 2, \ldots, K$. The trajectory matrix is given as follows:
    $$Y_P = \begin{bmatrix}
    z_1^{1} & \cdots & z_{N-W+1}^{1} & \cdots & z_1^{P} & \cdots & z_{N-W+1}^{P} \\
    z_2^{1} & \cdots & z_{N-W+2}^{1} & \cdots & z_2^{P} & \cdots & z_{N-W+2}^{P} \\
    \vdots & & \vdots & & \vdots & & \vdots \\
    z_{W-1}^{1} & \cdots & z_{N-1}^{1} & \cdots & z_{W-1}^{P} & \cdots & z_{N-1}^{P} \\
    z_W^{1} & \cdots & z_N^{1} & \cdots & z_W^{P} & \cdots & z_N^{P}
    \end{bmatrix},$$
    where $W$ is the window length, $P$ is the number of series under consideration, $N$ is the length of the time series, and the parameter $K = N - W + 1$. Thus, the size of the obtained matrix is $W \times PK$. There is no strict criterion for window length selection. Still, the basic idea is the same even for MSSA: the selected window length must give approximate separability between the time series components. Golyandina (2010) gave recommendations on the choice of window length for one-dimensional SSA. For complex trends in series, a smaller window length can be chosen to ensure better separability (Golyandina and Zhigljavsky 2013).
  • Step ii: Singular value decomposition (SVD)

    This step reduces the trajectory matrix $Y_P$ to a lower dimension by singular value decomposition of $Y_P Y_P^{T}$. Let $Q_P^{i} = [q_1^{i}, q_2^{i}, \ldots, q_W^{i}]$ and $\chi = \mathrm{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_W\}$ denote the eigenvectors and eigenvalues of the matrix $Y_P Y_P^{T}$. Then, we finally have
    $$\hat{Y}_M = \sum_{i=1}^{r} Q_P^{i}\, \chi\, (Q_P^{i})^{T},$$
    where $r$ denotes the rank of the matrix.
  • Step iii: Grouping

    There is no hard and fast rule for the grouping step. It is based on the identification of SVD components corresponding to the signal and noise part of the related time series. Generally, larger eigenvalues indicate dominant patterns in the time series contributing significantly to the overall structure and are associated with the signal component. Conversely, the insignificant eigencomponents, corresponding to low eigenvalues, represent the noise part. Mainly, important SVD components are chosen which provide the best representation of the related time series based on the variance contribution to the time series. A detailed mathematical background can be found in the literature by Golyandina et al. (2001).

  • Step iv: Diagonal averaging

    This is the last step where the matrix is converted back to the original time series by antidiagonal averaging also known as the Hankelization process (Hassani et al. 2009):
    $$\tilde{Y}_M = H \hat{Y}_M,$$
    where $H$ stands for the Hankel operator and $\tilde{Y}_M$ represents the denoised time series obtained after performing Hankelization on the matrix $\hat{Y}_M$.
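
The four steps can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `mssa_denoise` and the synthetic series are hypothetical, grouping is reduced to keeping the `r` leading eigentriples, and the SVD is applied to the trajectory matrix directly, which is equivalent to the eigendecomposition of $Y_P Y_P^{T}$ because the eigenvalues are the squared singular values.

```python
import numpy as np

def mssa_denoise(Z, W, r):
    """Z: (N, P) array of P series; W: window length; r: components kept."""
    Z = np.asarray(Z, dtype=float)
    N, P = Z.shape
    K = N - W + 1
    # Step i: embedding. Each series gives a W x K Hankel block; the blocks
    # are placed side by side, so the trajectory matrix is W x (P*K).
    blocks = [np.column_stack([Z[i:i + W, p] for i in range(K)]) for p in range(P)]
    Y = np.hstack(blocks)
    # Step ii: singular value decomposition of the trajectory matrix.
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    # Step iii: grouping. Treat the r leading eigentriples as signal.
    Y_hat = (U[:, :r] * s[:r]) @ Vt[:r, :]
    # Step iv: diagonal averaging (Hankelization) of each W x K block.
    out = np.empty((N, P))
    for p in range(P):
        flipped = Y_hat[:, p * K:(p + 1) * K][::-1, :]
        for n in range(N):
            # Antidiagonal i + j = n of the block is a diagonal of the flipped block.
            out[n, p] = flipped.diagonal(n - (W - 1)).mean()
    variance_ratio = s**2 / np.sum(s**2)  # lambda_i / sum(lambda_i)
    return out, variance_ratio

# Hypothetical example: two noisy series sharing a 24-h cycle.
rng = np.random.default_rng(0)
t = np.arange(240)
signal = np.column_stack([np.sin(2 * np.pi * t / 24), np.cos(2 * np.pi * t / 24)])
noisy = signal + 0.3 * rng.standard_normal(signal.shape)
denoised, ratio = mssa_denoise(noisy, W=24, r=2)
```

The choice of `r` plays the role of the grouping step; in practice it would be guided by the eigenvalue spectrum and the W-correlation matrix rather than fixed in advance.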

b. LSTM

LSTMs have chain-like repeating modules, each of which has a memory unit called a cell. The cell can store or forget information over long periods of time (Sun et al. 2022). The architecture of the LSTM memory cell is depicted in Fig. 2. The LSTM module consists of the cell state ($C_t$) and the gates, namely, the forget gate ($f_t$), the input gate ($i_t$), and the output gate ($o_t$). The cell state stores the information, while the three gates control information propagation through the block.

Fig. 2. The architecture of LSTM memory cell.

The equations used in LSTM with initial states are given as follows:
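$$f_t = \sigma(W_f \times [h_{t-1}, X_t] + b_f),$$
$$\tilde{C}_t = \tanh(W_c \times [h_{t-1}, X_t] + b_c),$$
$$i_t = \sigma(W_i \times [h_{t-1}, X_t] + b_i),$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t.$$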
As LSTM is a standard model, a detailed description can be found in the literature (Kratzert et al. 2018; Zuo et al. 2020) and is not repeated here.
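
As an illustration of the cell-state update only, the four equations can be written as a single NumPy function. This is a hedged sketch: the function name, the parameter dictionary, and the dimensions (three hidden units, six input variables) are hypothetical, and the output gate and hidden-state update are omitted because their equations are not listed above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell_state(h_prev, C_prev, X_t, params):
    """One update of the cell state C_t from h_{t-1}, C_{t-1}, and X_t."""
    z = np.concatenate([h_prev, X_t])                     # [h_{t-1}, X_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])      # forget gate
    C_tilde = np.tanh(params["W_c"] @ z + params["b_c"])  # candidate state
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])      # input gate
    C_t = f_t * C_prev + i_t * C_tilde                    # element-wise update
    return C_t, f_t, i_t, C_tilde

# Hypothetical sizes: 3 hidden units, 6 meteorological inputs per hour.
H, F = 3, 6
rng = np.random.default_rng(1)
params = {}
for gate in ("f", "c", "i"):
    params["W_" + gate] = 0.1 * rng.standard_normal((H, H + F))
    params["b_" + gate] = np.zeros(H)
C_t, f_t, i_t, C_tilde = lstm_cell_state(np.zeros(H), np.zeros(H), rng.standard_normal(F), params)
```

A forget gate near 1 carries the previous cell state forward almost unchanged, which is how the cell retains information over many time steps.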

Multivariate time series analysis is a highly intricate task due to its nonlinear, nonstationary nature and the presence of noise, which further adds complexity to the analysis process. Therefore, model development is performed in two stages. In the first stage, MSSA has been applied to multivariate series for simultaneous decomposition and reconstruction. Before the MSSA application, data standardization was performed to transfer the series on the same scale. Subsequently, the gamma test is applied to select the optimum input combination and its relevant contribution to forecasting rainfall. In the next stage, after the decomposition, the selected input variables are fed to the LSTM model. The detailed flowchart for the hybrid model is illustrated in Fig. 3.

Fig. 3. Flowchart for hybrid model architecture.

c. Spearman rank correlation test

The Spearman rank correlation test (Zar 1972) is used to understand the significant correlations among the meteorological parameters.

d. Ljung–Box Q test

The test is performed to check whether any autocorrelation exists in the extracted noise component. The Ljung–Box Q statistic is given by the following equation (Ljung and Box 1978):
$$Q_{\mathrm{stat}} = t(t+2)\sum_{k=1}^{m}\frac{\rho_k^{2}}{t-k},$$
where $t$ denotes the sample size, $m$ is the number of lags under consideration, and $\rho_k$ is the autocorrelation at lag $k$.
Let $\{N_t\}$ be the series under consideration. The hypotheses for the given series are as follows:
$$\begin{cases} H_0: \{N_t\}\ \text{The data are independently distributed} \\ H_1: \{N_t\}\ \text{The data are not independently distributed, i.e., exhibit serial correlation} \end{cases}$$
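
A direct NumPy transcription of the Q statistic is sketched below. The function name and the two synthetic series are hypothetical. Under the null hypothesis, Q is compared against a chi-square distribution with m degrees of freedom; for m = 10, the critical values are 18.307 at the 5% level and 23.209 at the 1% level.

```python
import numpy as np

def ljung_box_q(x, m):
    """Return (Q, rho) for lags 1..m, where rho[k-1] is the lag-k autocorrelation."""
    x = np.asarray(x, dtype=float)
    t = x.size
    d = x - x.mean()
    denom = np.sum(d**2)
    lags = np.arange(1, m + 1)
    rho = np.array([np.sum(d[k:] * d[:-k]) / denom for k in lags])
    q = t * (t + 2) * np.sum(rho**2 / (t - lags))
    return q, rho

# Hypothetical noise series: uncorrelated (white) and AR(1)-correlated (red).
rng = np.random.default_rng(7)
white = rng.standard_normal(2000)
red = np.empty(2000)
red[0] = white[0]
for k in range(1, 2000):
    red[k] = 0.8 * red[k - 1] + white[k]

q_white, _ = ljung_box_q(white, m=10)
q_red, _ = ljung_box_q(red, m=10)
```

A value of `q_red` far above the critical value leads to rejection of the null hypothesis of independence, which is the outcome expected for temporally correlated noise.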

e. Gamma test

The gamma test, a nonlinear analysis tool, is used by researchers to identify the optimal input combination in hydrological processes. This helps streamline the often complex and time-consuming task of selecting the best inputs, reducing the need for tedious trial-and-error procedures (Han et al. 2009; Niknia et al. 2014; Bajirao et al. 2022). Selecting the optimum input parameters is a crucial step in the case of linear as well as nonlinear model developments to arrive at better forecast results (Güldal and Tongal 2010). The gamma test quantifies the extent to which input/output data can be described as a smooth relationship. It also determines the minimum achievable mean-square-error (MSE) for modeling unseen data using any continuous nonlinear model. The set of D input/output observations considered in the gamma test are presented as follows:
$$\{(x_i, y_i),\ 1 \le i \le D\}, \quad \text{where } x \in \mathbb{R} \text{ (vector input) and } y \in \mathbb{R} \text{ (scalar output)}.$$
The relationship between input and output is given as
$$y = f(x) + n,$$
where $f$ is a smooth unknown function and $n$ is a stochastic part representing noise with zero mean and variance $\mathrm{Var}(n)$. The gamma test statistic determines how much of the variance in $y$ is attributed to the stochastic part rather than to the smooth function $f$. It is an estimate of the variance of model output. If two points are nearby in the input space, their respective outputs should also be close in the output space. Any remaining distance between the outputs is attributed to noise. If $P[i, j]$ is the $j$th nearest neighbor to $x_i$, then the delta function is given by the following equation:
$$\delta_M(j) = \frac{1}{D}\sum_{i=1}^{D}\left|x_i - x_{P[i,j]}\right|^{2},$$
where $\delta_M(j)$ represents the mean-square distance to the $j$th nearest neighbor, $j$ is the number of nearest neighbors, generally $j = 10$, and $|\ldots|$ represents the Euclidean distance.
The corresponding gamma function is given by
$$\gamma_M(j) = \frac{1}{2D}\sum_{i=1}^{D}\left|y_i - y_{P[i,j]}\right|^{2},$$
where $y_{P[i,j]}$ is the output corresponding to $x_{P[i,j]}$. The gamma test statistic is estimated by fitting a linear regression line to the points $(\delta_M, \gamma_M)$:
$$\gamma = M\delta + \Gamma.$$
In the regression line, $\gamma$ denotes the output vector, whereas $M$ and $\Gamma$ are the gradient and intercept of the regression line, respectively. The intercept on the vertical axis ($\delta = 0$) gives the $\Gamma$ value. For a small value of $\Gamma$, the function $f$ exists and the output is mainly determined by the input variables. However, for large $\Gamma$ values, the output is mainly due to random variation, suggesting that the inputs are irrelevant to the output $y$. The results are standardized by using another term, $V_{\mathrm{ratio}}$, that ranges from 0 to 1, and it is represented as
$$V_{\mathrm{ratio}} = \frac{\Gamma}{\sigma^{2}(y)},$$
where $\sigma^{2}(y)$ represents the variance of the output $y$, and a value of $V_{\mathrm{ratio}}$ closer to 0 denotes a higher degree of predictability of the output. The best combination is selected based on the minimum value of gamma ($\Gamma$) and $V_{\mathrm{ratio}}$.
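
The gamma test procedure can be sketched with a brute-force nearest-neighbor search. This is a minimal illustration under stated assumptions: the function name and the synthetic data are hypothetical, and the all-pairs distance matrix is only practical for a few thousand points, so a tree-based neighbor search would be needed for series as long as those used in this study.

```python
import numpy as np

def gamma_test(X, y, n_neighbors=10):
    """Return (Gamma, M, V_ratio) from the regression gamma = M * delta + Gamma."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    D = X.shape[0]
    # Squared Euclidean distance between every pair of input vectors.
    sq = np.sum((X[:, None, :] - X[None, :, :])**2, axis=2)
    np.fill_diagonal(sq, np.inf)                       # a point is not its own neighbor
    order = np.argsort(sq, axis=1)[:, :n_neighbors]    # order[i, j-1] = P[i, j]
    rows = np.arange(D)[:, None]
    delta = sq[rows, order].mean(axis=0)                      # delta_M(j)
    gamma = 0.5 * ((y[order] - y[:, None])**2).mean(axis=0)   # gamma_M(j)
    slope, intercept = np.polyfit(delta, gamma, 1)
    return intercept, slope, intercept / np.var(y)

# Hypothetical data: one output driven by the inputs, one that is pure noise.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(500, 2))
y_smooth = np.sin(3 * X[:, 0]) + X[:, 1]**2 + rng.normal(0.0, 0.1, 500)
y_noise = rng.normal(0.0, 1.0, 500)
G_s, M_s, V_s = gamma_test(X, y_smooth)
G_n, M_n, V_n = gamma_test(X, y_noise)
```

Running the function once per candidate input combination and keeping the combination with the smallest `Gamma` and `V_ratio` mirrors the selection rule described above.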

f. Noise separation

Noise detection is one of the crucial tasks in time series analysis, specifically for hydrological series. As explained earlier, MSSA is applied to identify and segregate the noise present in the rainfall series. To assess the separability between signal and noise components, the weighted (W)-correlation matrix is plotted. Golyandina et al. (2001) have given a detailed mathematical explanation of the degree of separability between signal and noise components. The authors suggested that if the W-correlation value is small, the two series are orthogonal to each other.

At this point, visual inspection alone makes it difficult to determine the nature of the extracted noise. Therefore, it is vital to identify the type of noise present in the rainfall series with the help of well-developed statistical methods. In hydrologic phenomena, mostly white noise and red noise are observed. White noise is uncorrelated in time and exhibits a power spectrum evenly spread across all allowed frequencies. In contrast, red noise is temporally correlated, displaying a power spectrum skewed toward lower frequencies.
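
The spectral difference between the two noise types can be illustrated with synthetic data. In this sketch, red noise is modeled as a first-order autoregressive process, which is a common convention and an assumption here; the function names, the coefficient 0.8, and the series length are hypothetical.

```python
import numpy as np

def ar1_series(n, phi, rng):
    """Simulate x_t = phi * x_{t-1} + e_t with standard normal e_t."""
    e = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = e[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + e[t]
    return x

def low_frequency_power_fraction(x):
    """Share of periodogram power in the lower half of the frequency range."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean()))**2
    power = power[1:]                 # drop the zero-frequency term
    half = power.size // 2
    return power[:half].sum() / power.sum()

rng = np.random.default_rng(42)
white = ar1_series(4096, 0.0, rng)    # phi = 0 gives white noise
red = ar1_series(4096, 0.8, rng)      # phi > 0 gives red noise
frac_white = low_frequency_power_fraction(white)
frac_red = low_frequency_power_fraction(red)
```

For white noise the fraction stays near one half, whereas for red noise most of the power falls in the lower half of the frequency range.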

g. Approach of forecast

A one-step-ahead rainfall forecast is already a critical task, and an n-step-ahead forecast is more challenging because of greater uncertainty and error accumulation. There are two main approaches for n-step-ahead forecast: the recursive approach and the direct approach (Taieb and Hyndman 2012). The direct approach is model based: it fits a parametric function to the data, and the model is developed to forecast the parameters. This approach is also known as the independent strategy (Hamzaçebi et al. 2009). The recursive approach handles the forecast iteratively. It is a step-by-step model in which the forecasted value at the current time step is used to calculate the value at the next time step:
$$\begin{aligned}
\hat{m}_{t+1} &= f(m_t, m_{t-1}, \ldots, m_{t-p+1}, \theta) + \varepsilon_t \\
\hat{m}_{t+2} &= f(\hat{m}_{t+1}, m_t, m_{t-1}, \ldots, m_{t-p+2}, \theta) + \varepsilon_t \\
&\ \,\vdots \\
\hat{m}_{t+n} &= f(\hat{m}_{t+n-1}, \hat{m}_{t+n-2}, \ldots, \hat{m}_{t-p+n}, \theta) + \varepsilon_t,
\end{aligned}$$
where $m_t, m_{t-1}, \ldots, m_{t-p+1}$ are the previous $p$ observations, $n$ is the forecasting horizon, $\theta$ indicates the parameter vector, $\varepsilon_t$ is the residual term, and $\hat{m}_{t+1}$ is the prediction for $m_{t+1}$.

Choosing between the above two strategies is based on ease of implementation, error accumulation, and accuracy of forecast (Marcellino et al. 2006). In the present study, we have implemented the recursive approach.
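
The recursive strategy amounts to a loop that feeds each forecast back into the input window. The sketch below is illustrative only: `recursive_forecast` and `one_step_model` are hypothetical names, and the window-mean model stands in for a trained one-step-ahead forecaster such as the LSTM.

```python
from typing import Callable, List, Sequence

def recursive_forecast(
    history: Sequence[float],
    one_step_model: Callable[[List[float]], float],
    p: int,
    n: int,
) -> List[float]:
    """Forecast n steps ahead from the last p values, reusing each forecast as input."""
    window = list(history[-p:])
    forecasts = []
    for _ in range(n):
        m_hat = one_step_model(window)    # prediction for the next time step
        forecasts.append(m_hat)
        window = window[1:] + [m_hat]     # drop the oldest value, append the forecast
    return forecasts

# Placeholder one-step model: the mean of the window (not a trained model).
def window_mean(window: List[float]) -> float:
    return sum(window) / len(window)

forecasts = recursive_forecast([1.0, 2.0, 3.0], window_mean, p=3, n=2)
```

Because every step after the first consumes earlier forecasts, any error made at one step propagates to the following ones, which is the error accumulation noted above.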

h. Error indices

In this study, we employed error indices such as RMSE, SMAPE, and MNE and accuracy measures such as the coefficient of determination R2 and NSE, as given in Table S3. The description of these measures can be found in the literature (Nash and Sutcliffe 1970; Hyndman and Koehler 2006; Unnikrishnan and Jothiprakash 2018) and is not repeated here.
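
For orientation, three of these measures are sketched below using their commonly cited definitions. These are assumptions and may differ in detail from Table S3, in particular the SMAPE variant and the choice to skip pairs where both values are zero. MNE is omitted because its definition is not given in the text.

```python
import numpy as np

def rmse(obs, sim):
    """Root-mean-square error."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((sim - obs)**2)))

def smape(obs, sim):
    """Symmetric mean absolute percentage error (%), skipping 0/0 pairs."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    denom = (np.abs(obs) + np.abs(sim)) / 2.0
    valid = denom > 0                  # both-zero pairs are undefined; skip them
    if not valid.any():
        return 0.0
    return float(100.0 * np.mean(np.abs(sim - obs)[valid] / denom[valid]))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(1.0 - np.sum((obs - sim)**2) / np.sum((obs - obs.mean())**2))

# Hypothetical observed and forecast hourly rainfall (mm).
observed = np.array([0.0, 1.0, 2.0, 4.0])
forecast = np.array([0.0, 0.8, 2.5, 3.0])
scores = (rmse(observed, forecast), smape(observed, forecast), nse(observed, forecast))
```

The handling of zero-rainfall hours matters for SMAPE in particular, since hourly rainfall series contain many zeros.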

4. Results and discussion

To identify the meteorological parameters that significantly affect the rainfall time series, we examined the correlations between rainfall and temperature, surface pressure, relative humidity, mean sea level pressure, and wind speed using the Spearman rank correlation test. The results, listed in Table 1, show that the meteorological parameters are significantly correlated with rainfall at the 1% significance level. Rainfall, temperature, wind speed, and relative humidity exhibited positive correlations. Notably, the strongest correlation was observed between rainfall and temperature (correlation coefficient = 0.44) in Guwahati, while the highest correlation in Aizawl was between rainfall and relative humidity (correlation coefficient = 0.13). The possible reason lies in the thermodynamics. Based on the Clausius–Clapeyron (CC) equation, the amount of moisture that air can hold increases by about 7% for every 1°C rise in temperature. This suggests that with a rise in temperature, there is more potential for rainfall to occur. Higher humidity levels indicate increased moisture content in the air, increasing the chance of cloud formation and subsequent rainfall. Conversely, surface pressure and mean sea level pressure exhibit a negative correlation with rainfall. Low pressure areas attract wind, causing air to ascend. As the air rises in the atmosphere, the water vapor in it condenses, resulting in precipitation. Therefore, low pressure is generally linked to the occurrence of rainfall events.

Table 1. Correlation among rainfall and meteorological parameters.

a. Data preprocessing using MSSA

MSSA has been applied to multivariate series for simultaneous decomposition and reconstruction. Before the MSSA application, data standardization was performed to transfer the series on the same scale. The choice W = N/2 is recommended in general, but as we have hourly meteorological data with N = 37 915 (for Guwahati) and N = 38 032 (for Aizawl), it will make the analysis difficult with a large number of eigenvalues. In the present study, based on trial and error, we have set the embedding dimension, i.e., window length, to 24 for the formation of the trajectory matrix and SVD. The plot of eigenvalues and their variance contribution obtained after the SVD is shown in Fig. 4 for both the study areas.

Fig. 4.

Eigenvalue plot and variance ratio for hourly multivariate series for (a) Guwahati city and (b) Aizawl city for the period 2015–19.

Citation: Journal of Hydrometeorology 25, 8; 10.1175/JHM-D-23-0173.1

Generally, the first few components form the major part of the time series. Figure 4 shows that the first eigenvalue is the largest and differs markedly from the others. The first eigencomponent mainly represents the trend, which describes the overall shape of the time series (Alexandrov 2009).

The variance contribution of the ith eigencomponent is given by $\lambda_i / \sum_i \lambda_i$ and is displayed in Table 2. The first 20 eigencomponents account for 93.46% of the time series variation in Guwahati and 94.90% in Aizawl. This observation agrees with Fig. 4, where the eigenvalues remain relatively constant after the first 20 eigentriples. This implies that subsequent components do not contribute significantly to the time series. The noise component makes only a small contribution to the time series and can thus be identified from the plot.
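The variance contribution and the number of leading components needed to reach a given share can be computed as in the short sketch below. The function names and the four toy eigenvalues are assumptions for illustration; the study's eigenvalues are those reported in Table 2.

```python
import numpy as np

def variance_contribution(eigenvalues):
    """Share of total variance carried by each eigenvalue: lambda_i / sum(lambda)."""
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    return eigenvalues / eigenvalues.sum()

def components_for_share(eigenvalues, share):
    """Smallest number of leading components whose cumulative share reaches `share`."""
    cumulative = np.cumsum(variance_contribution(eigenvalues))
    count = int(np.searchsorted(cumulative, share)) + 1
    return min(count, cumulative.size)   # guard against rounding at share = 1.0

toy_eigenvalues = [8.0, 4.0, 2.0, 2.0]   # hypothetical, sorted in descending order
print(variance_contribution(toy_eigenvalues))
print(components_for_share(toy_eigenvalues, 0.8))
```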

Table 2.

Variance contribution by leading 20 eigencomponents obtained after SVD of given meteorological series.


Figures 5a and 5b show the W-correlation matrix plot for the 50 reconstructed components (RCs) of the meteorological series for Guwahati and Aizawl, respectively. Visual inspection of the W-correlation plot shows that the first 20 eigentriples are largely uncorrelated with the other components. Beyond these, the W-correlation among neighboring eigentriples increases. The increased spread denotes random correlation among the eigentriples. In addition to the correlogram and boxplot, we confirmed the presence of red noise using the power spectrum plot and the Ljung–Box test. The correlogram (autocorrelation vs lag) of red noise declines slowly as the lag increases, while that of white noise diminishes rapidly (Elsner and Tsonis 1996). For both time series, the correlogram decreases gradually, confirming the presence of red noise in the data. The boxplot of the noise shows zero mean, which is an essential characteristic of noise. The power spectrum plot (Figs. 6a and 6b) shows that the power of the signal gradually decreases at higher frequencies, which indicates that the signal is skewed toward lower frequencies. This is a characteristic of red noise, which is further confirmed by the Ljung–Box test. The results are displayed in Table 3. The null hypothesis of no serial correlation is rejected at both the 5% and 1% significance levels, suggesting the presence of correlation, i.e., red noise.
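For reference, the W-correlation between two reconstructed components can be sketched as a weighted correlation in which each time index is weighted by the number of times it appears in the trajectory matrix. The sketch below treats a single variable's reconstructed series; how the study aggregates across the six variables is not stated here, so that part is left out. Function names are illustrative.

```python
import numpy as np

def w_weights(n_obs, window):
    """Number of times each time index appears in the W x K trajectory matrix."""
    l_star = min(window, n_obs - window + 1)
    idx = np.arange(n_obs)
    return np.minimum(np.minimum(idx + 1, l_star), n_obs - idx)

def w_correlation(a, b, window):
    """Weighted correlation between two reconstructed series of equal length."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    w = w_weights(a.size, window)
    inner = np.sum(w * a * b)
    return float(inner / np.sqrt(np.sum(w * a * a) * np.sum(w * b * b)))

t = np.arange(240)
trend_like = np.sin(2 * np.pi * t / 24)        # toy periodic component
other = np.sin(2 * np.pi * t / 12)             # toy component at another frequency
print(round(w_correlation(trend_like, other, window=24), 3))
```

Values near zero indicate well-separated components, while values near one suggest that the two components belong to the same group.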

Fig. 5.

The W-correlation for the first 50 RCs for (a) Guwahati city and (b) Aizawl city for the period 2015–19.


Fig. 6.

Correlogram plot, boxplot, and power spectrum of a noise component for (a) Guwahati city and (b) Aizawl city for the period 2015–19.


Table 3.

Ljung–Box Q-statistic test output for serial correlation.
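The Ljung–Box statistic itself is simple to compute, as the sketch below shows for a synthetic red-noise series. The AR(1) coefficient of 0.9, the series length, and the choice of 10 lags are assumptions for illustration and are not taken from Table 3. Under the null hypothesis of no serial correlation, Q follows a chi-square distribution with as many degrees of freedom as lags, so Q is compared with the corresponding critical value.

```python
import numpy as np

def ljung_box_q(series, max_lag):
    """Ljung-Box Q = n(n+2) * sum_k rho_k^2 / (n - k) for k = 1..max_lag."""
    x = np.asarray(series, dtype=float)
    n = x.size
    centered = x - x.mean()
    denom = np.sum(centered ** 2)
    lags = np.arange(1, max_lag + 1)
    rho = np.array([np.sum(centered[k:] * centered[:-k]) / denom for k in lags])
    return float(n * (n + 2) * np.sum(rho ** 2 / (n - lags)))

# Synthetic white noise and AR(1) red noise built from the same shocks.
rng = np.random.default_rng(1)
white = rng.standard_normal(2000)
red = np.zeros(2000)
for t in range(1, 2000):
    red[t] = 0.9 * red[t - 1] + white[t]

CHI2_CRIT_1PCT_10DF = 23.21   # chi-square critical value, 1% level, 10 degrees of freedom
print(ljung_box_q(white, 10), ljung_box_q(red, 10), CHI2_CRIT_1PCT_10DF)
```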


b. Selection of best input variables

For a better forecast, an optimum set of input variables must be selected. This selection is carried out using the gamma test, a popular nonlinear modeling and analysis tool. This study examined various combinations of meteorological variables to assess their impact on next-hour rainfall forecasts. The results, presented in Table 4, reveal that the minimum values of Γ and $V_{\text{ratio}}$ are obtained when all meteorological parameters are chosen as input, making this the best combination for forecasting next-hour rainfall. This finding underscores the significance of incorporating the relevant meteorological parameters in rainfall forecasting and their influence on the urban microclimate.
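A minimal sketch of the gamma test in its commonly described form is given below: for each of the first p nearest neighbors in input space, the mean squared input distance δ and half the mean squared output difference γ are computed, and the intercept of the regression of γ on δ gives Γ. The function name, the choice of 10 neighbors, and the toy data are assumptions, and the brute-force distance matrix is only practical for small samples; a tree-based neighbor search would be needed for the full hourly series.

```python
import numpy as np

def gamma_test(inputs, output, n_neighbors=10):
    """Return (Gamma, slope, V_ratio) from a brute-force gamma test."""
    x = np.asarray(inputs, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    y = np.asarray(output, dtype=float)
    sq_dist = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq_dist, np.inf)                        # a point is not its own neighbor
    order = np.argsort(sq_dist, axis=1)[:, :n_neighbors]     # indices of the p nearest neighbors
    delta = np.take_along_axis(sq_dist, order, axis=1).mean(axis=0)
    gamma = 0.5 * ((y[order] - y[:, None]) ** 2).mean(axis=0)
    slope, intercept = np.polyfit(delta, gamma, 1)           # gamma = slope * delta + Gamma
    return float(intercept), float(slope), float(intercept / y.var())

# Noise-free toy relation: Gamma should be close to zero.
rng = np.random.default_rng(2)
x_toy = rng.uniform(size=200)
y_toy = 2.0 * x_toy + 1.0
print(gamma_test(x_toy, y_toy))
```

A candidate input combination with smaller Γ and V_ratio leaves less unexplained output variance, which is the criterion applied in Table 4.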

Table 4.

Gamma test output for selection of best input combination.


The steps performed in the multistep-ahead forecast are as follows:

  • Hydrologic series are highly chaotic, which makes the forecast process difficult. Therefore, it is necessary to understand their internal structure. In the present study, MSSA was applied to decompose the meteorological series into their characteristic components. The coefficient of determination R² between the observed components and the RCs, together with the MSSA parameters used, is given in Table 5.

  • The signal and noise components are separated based on the eigenvalue plot, the total contribution of eigenvalues, and the W-correlation matrix.

  • The reconstructed rainfall, temperature, surface pressure, mean sea level pressure, relative humidity, and wind speed are fed to the LSTM model for training. Rainfall at the next hour is the target. The LSTM model carries out a multistep-ahead hourly rainfall forecast. In the present study, the LSTM model uses one hidden layer to limit model complexity and processing time. Learning rates of 0.001, 0.002, 0.003, and 0.005 were tried, and the results were compared. The number of epochs was set to 500 and the batch size to 512. Hidden layers with 10, 20, 30, and 50 neurons were tested. The other default parameters used by TensorFlow are given in Table 6.

    Table 5.

    MSSA parameters and coefficient of determination between observed components and RCs. Here, the abbreviations RC1, RC2, RC3, RC4, RC5, and RC6 represent RCs in case of rainfall, temperature, surface pressure, mean sea level pressure, relative humidity, and wind speed.

    Table 6.

    The parameters used by the LSTM model for next-hour rainfall forecast.
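The data preparation implied by the steps above can be sketched as follows: the six reconstructed series are cut into sliding windows, each paired with the rainfall value of the following hour, and the tested hyperparameter combinations are enumerated. The learning rates, neuron counts, epochs, and batch size are those stated above; the lookback of 24 h, the function names, and the random stand-in data are assumptions.

```python
import itertools
import numpy as np

LEARNING_RATES = (0.001, 0.002, 0.003, 0.005)
HIDDEN_UNITS = (10, 20, 30, 50)
EPOCHS, BATCH_SIZE = 500, 512

def make_windows(features, target, lookback):
    """Build (samples, lookback, variables) inputs and next-hour targets."""
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    n_samples = len(target) - lookback
    x = np.stack([features[i:i + lookback] for i in range(n_samples)])
    y = target[lookback:]          # value one hour after each window ends
    return x, y

def hyperparameter_grid():
    """All learning-rate / hidden-unit combinations described in the text."""
    return [
        {"learning_rate": lr, "units": units, "epochs": EPOCHS, "batch_size": BATCH_SIZE}
        for lr, units in itertools.product(LEARNING_RATES, HIDDEN_UNITS)
    ]

# Random stand-in for six reconstructed series; column 0 plays the role of rainfall.
rng = np.random.default_rng(3)
reconstructed = rng.standard_normal((200, 6))
x_train, y_train = make_windows(reconstructed, reconstructed[:, 0], lookback=24)
print(x_train.shape, y_train.shape, len(hyperparameter_grid()))
```

The resulting three-dimensional array has the (samples, time steps, features) layout that recurrent layers in TensorFlow expect.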


c. Error plots

The performance of the LSTM model before and after preprocessing, in terms of error indices and NSE, is compared in Fig. 7. With preprocessing, the optimal performance for the Guwahati dataset is achieved with a learning rate of 0.002 and 30 neurons, and that for the Aizawl dataset with a learning rate of 0.003 and 20 neurons. Without preprocessing, the best performance is observed at a learning rate of 0.003 with 50 neurons for Guwahati and at 0.003 with 10 neurons for Aizawl. Notably, the NSE plot reveals that the MSSA–LSTM performs better than the single LSTM model. Additionally, the error indices improve substantially after preprocessing across the learning rates tested. The negative NSE value in the case of the Aizawl dataset indicates poor performance by the LSTM model, i.e., a forecast less skillful than the observed mean. For the Guwahati dataset, however, NSE values improved significantly after preprocessing. The inferior performance of the single LSTM model can be attributed to the presence of noise in the observed rainfall data. Preprocessing with MSSA enhances forecast skill by eliminating noise and enabling the model to discern essential patterns and relationships in the data. Another possible reason is that, because the LSTM model is sensitive to its input, removing noise and irrelevant data helps the model generalize better to unseen data, leading to improved performance.
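Two of the indices discussed above, RMSE and NSE, follow standard definitions and can be sketched as below; the toy values are hypothetical. The example also shows how NSE becomes negative when a forecast is worse than simply using the observed mean.

```python
import numpy as np

def rmse(observed, forecast):
    """Root-mean-square error between observed and forecast values."""
    o = np.asarray(observed, dtype=float)
    f = np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((o - f) ** 2)))

def nse(observed, forecast):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    o = np.asarray(observed, dtype=float)
    f = np.asarray(forecast, dtype=float)
    return float(1.0 - np.sum((o - f) ** 2) / np.sum((o - o.mean()) ** 2))

observed = [0.0, 1.0, 2.0, 3.0]          # hypothetical hourly rainfall (mm)
print(nse(observed, observed))            # perfect forecast
print(nse(observed, [1.5] * 4))           # forecast equal to the observed mean
print(nse(observed, [3.0, 2.0, 1.0, 0.0]))  # forecast worse than the mean
```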

Fig. 7.

Performance comparison by using error indices RMSE, NSE, MNE, and SMAPE between observed and forecasted rainfall (a) before preprocessing and (b) after preprocessing by MSSA at a different learning rate for Guwahati city and (c) before preprocessing and (d) after preprocessing by MSSA at a different learning rate for Aizawl city.


The observed and forecasted rainfall for the training and testing period is shown in Figs. 8a and 8b for Guwahati and Aizawl, respectively. During this period, the observed and forecasted values show a strong correlation, as displayed in Fig. 8. Seven-hour-ahead forecasting for the period 1700:00–2300:00 Indian standard time (IST) 29 April 2019 (for Guwahati) and 1400:00–2000:00 IST 4 May 2019 (for Aizawl) is performed by the single as well as the hybrid model. The figures clearly show that, for datasets without preprocessing, the forecasted rainfall values consistently decrease as the lag time increases. This underscores the importance of preprocessing techniques for the multistep-ahead forecast.
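One common way to obtain a multistep-ahead forecast from a next-hour model is the recursive strategy, in which each forecast is appended to the input window and used to predict the following hour. The passage above does not state which strategy was used, so the sketch below is an assumption, and the univariate `toy_model` is a hypothetical stand-in for a trained LSTM.

```python
from collections import deque

def recursive_forecast(predict_next, history, lookback, horizon):
    """Roll a one-step model forward `horizon` steps, feeding forecasts back in."""
    window = deque(history[-lookback:], maxlen=lookback)
    forecasts = []
    for _ in range(horizon):
        value = predict_next(list(window))
        forecasts.append(value)
        window.append(value)       # the oldest value drops out automatically
    return forecasts

def toy_model(window):
    """Hypothetical one-step predictor: half of the most recent value."""
    return 0.5 * window[-1]

seven_hour = recursive_forecast(toy_model, [2.0, 8.0], lookback=2, horizon=7)
print(seven_hour)
```

Because each step consumes earlier forecasts, errors in the one-step model can accumulate with lead time under this strategy.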

Fig. 8.

Hourly observed and forecasted rainfall by hybrid MSSA–LSTM model for the period (a) 3 Jan 2015–29 Apr 2019 (for Guwahati) and (b) 3 Jan 2015–4 May 2019 (for Aizawl); scatterplot of observed and model forecast during training and testing period (c) for Guwahati city and (d) for Aizawl city; observed and model forecast for next 1 h for the period (e) 1700:00–2300:00 LT 29 Apr 2019 (for Guwahati) and (f) 1400:00–2000:00 LT 4 May 2019 (for Aizawl); radar chart showing the performance skill of the LSTM model in hourly rainfall forecast with and without preprocessing by MSSA for (g) Guwahati city and (h) Aizawl city.


The radar charts in Figs. 8g and 8h show the performance skill of the LSTM model for the 7-h-ahead forecast with and without preprocessing, computed using SMAPE, RMSE, MNE, and NSE. The SMAPE value is significantly reduced in both cases, and the NSE value improves after preprocessing. The model performs better in the plain region of Guwahati than in the hilly region of Aizawl. This difference can be attributed to the complex terrain, vegetation cover, and meteorological conditions in the hilly area, which result in varying rainfall patterns affecting the microclimate of the region. Urban areas possess a unique microclimate influenced by dynamic factors such as topography, wind shear, and elevation differences. To enhance model performance, it is essential to understand these dynamic factors and incorporate them into the model. In summary, the hybrid MSSA–LSTM model outperforms the single LSTM model when predicting hourly rainfall for multiple time steps ahead. It achieves reasonable accuracy in the plain area, whereas its performance is less satisfactory in the hilly region.

5. Conclusions

The main objective of the study is to improve the hourly rainfall forecast in the study area through the application of a preprocessing technique. The rainfall series can be viewed as a combination of signal (including trends, periodic patterns, and cyclical variations) and irregular noise data. In preprocessing, the aim is to achieve consistent and noise-free data. The main findings from the study are summarized as follows:

  • As the rainfall is affected by a multitude of parameters, forecasting it on an hourly scale was a difficult task. Here, MSSA is applied to decompose and reconstruct the hourly meteorological series.

  • The performance of the LSTM model significantly improves as the zero values in the observed data are eliminated after reconstruction by MSSA. The results show that, in general, the developed hybrid systems outperformed the single model.

  • Compared with the single LSTM model, the hybrid MSSA–LSTM model achieved average percentage gains of 47.99% and 43.88% in SMAPE and RMSE, respectively, for the Guwahati dataset and 84.59% and 82.27% for the Aizawl dataset for a 7-h-ahead forecast. The MNE values suggest underprediction by the model in both cases.

  • Hyperparameters significantly affect the accuracy of the LSTM model, and tuning them is a difficult task. It is therefore recommended to apply techniques such as a random search algorithm or standard Gaussian Bayes to optimize the hyperparameters. Another suggestion is to consider the dynamical factors discussed earlier to strengthen these results, as model performance may vary with location owing to the unique microclimate.
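The random search recommended in the last point can be sketched as follows. The search ranges, the number of trials, and the `toy_objective` function, which stands in for the validation error of a trained LSTM, are all assumptions made for illustration.

```python
import random

def random_search(objective, n_trials=20, seed=0):
    """Sample hyperparameters at random and keep the lowest-scoring set."""
    rng = random.Random(seed)
    history = []
    for _ in range(n_trials):
        params = {
            "learning_rate": 10 ** rng.uniform(-3.0, -2.0),  # log-uniform in [0.001, 0.01]
            "units": rng.randint(10, 50),                    # hidden-layer neurons
        }
        history.append((objective(params), params))
    best_score, best_params = min(history, key=lambda item: item[0])
    return best_params, best_score, history

def toy_objective(params):
    """Hypothetical error surface; a real run would train and validate the model."""
    return (params["learning_rate"] - 0.002) ** 2 + ((params["units"] - 30) / 100) ** 2

best_params, best_score, history = random_search(toy_objective)
print(best_params, best_score)
```

Sampling the learning rate on a logarithmic scale spreads the trials evenly across orders of magnitude, which is a common choice for this parameter.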

As for future work, we believe that the obtained hourly rainfall forecast can serve as input to hydrologic models for flood forecasting. When integrated with a suitable decision support system, this forms the basis for evolving an effective urban flood management strategy. This study highlights the need to understand the regional characteristics of rainfall and their effect on model accuracy. While this study considers the impact of meteorological parameters on rainfall, the complex atmospheric processes make real-time forecasting challenging. In light of the present results, efforts are required to fine-tune the existing model to handle the randomness and nonlinearity in rainfall, along with extreme events, and thereby reduce the gap between observed and forecasted rainfall. A recent study by Liu et al. (2022) introduced a robust approach combining a spatiotemporal LSTM with a self-attention mechanism, which outperformed traditional convolutional LSTM (ConvLSTM) models. This is a powerful approach, and its potential to include meteorological data along with rainfall should be investigated in detail.

Acknowledgments.

This research was financially supported by the Prime Minister’s Research Fellow (PMRF) Research Grant through the Indian Institute of Technology, Guwahati, India.

Data availability statement.

The high-frequency IMDAA datasets are available freely at https://rds.ncmrwf.gov.in. The code used in the analysis can be made available from the authors upon request.

REFERENCES

  • Abbot, J., and J. Marohasy, 2012: Application of artificial neural networks to rainfall forecasting in Queensland, Australia. Adv. Atmos. Sci., 29, 717–730, https://doi.org/10.1007/s00376-012-1259-9.

  • Alexandrov, T., 2009: A method of trend extraction using singular spectrum analysis. arXiv, 0804.3367v3, https://doi.org/10.48550/arXiv.0804.3367.

  • Ansari, K., T.-S. Bae, K. D. Singh, and J. Aryal, 2022: Multivariate singular spectrum analysis of seismicity in the space–time-depth-magnitude domain: Insight from eastern Nepal and the southern Tibetan Himalaya. J. Seismol., 26, 147–166, https://doi.org/10.1007/s10950-021-10057-6.

  • Bajirao, T. S., A. Elbeltagi, M. Kumar, and Q. B. Pham, 2022: Applicability of machine learning techniques for multi-time step ahead runoff forecasting. Acta Geophys., 70, 757–776, https://doi.org/10.1007/s11600-022-00749-z.

  • Benjamin, S. G., J. M. Brown, G. Brunet, P. Lynch, K. Saito, and T. W. Schlatter, 2019: 100 years of progress in forecasting and NWP applications. A Century of Progress in Atmospheric and Related Sciences: Celebrating the American Meteorological Society Centennial, Meteor. Monogr., No. 59, Amer. Meteor. Soc., https://doi.org/10.1175/AMSMONOGRAPHS-D-18-0020.1.

  • Bergen, K. J., P. A. Johnson, M. V. De Hoop, and G. C. Beroza, 2019: Machine learning for data-driven discovery in solid Earth geoscience. Science, 363, eaau0323, https://doi.org/10.1126/science.aau0323.

  • Broomhead, D. S., and G. P. King, 1986: On the qualitative analysis of experimental dynamical systems. Non-linear Phenomena and Chaos, S. Sarkar, Ed., Adam Hilger, 113–144.

  • Chang, M., B. Liu, B. Wang, C. Martinez-Villalobos, G. Ren, and T. Zhou, 2022: Understanding future increases in precipitation extremes in global land monsoon regions. J. Climate, 35, 1839–1851, https://doi.org/10.1175/JCLI-D-21-0409.1.

  • Chen, G., B. Tang, X. Zeng, P. Zhou, P. Kang, and H. Long, 2022: Short-term wind speed forecasting based on long short-term memory and improved BP neural network. Int. J. Electr. Power Energy Syst., 134, 107365, https://doi.org/10.1016/j.ijepes.2021.107365.

  • Das, S., R. Chakraborty, and A. Maitra, 2017: A random forest algorithm for nowcasting of intense precipitation events. Adv. Space Res., 60, 1271–1282, https://doi.org/10.1016/j.asr.2017.03.026.

  • Dawson, C. W., and R. L. Wilby, 2001: Hydrological modelling using artificial neural networks. Prog. Phys. Geogr., 25, 80–108, https://doi.org/10.1177/030913330102500104.

  • Durai, V. R., and R. Bhardwaj, 2014: Forecasting quantitative rainfall over India using multi-model ensemble technique. Meteor. Atmos. Phys., 126, 31–48, https://doi.org/10.1007/s00703-014-0334-4.

  • Elsner, J. B., and A. A. Tsonis, 1996: Singular Spectrum Analysis: A New Tool in Time Series Analysis. Springer Science and Business Media, 164 pp.

  • Fan, M., O. Imran, A. Singh, and S. A. Ajila, 2022: Using CNN-LSTM model for weather forecasting. 2022 IEEE Int. Conf. on Big Data, Osaka, Japan, Institute of Electrical and Electronics Engineers, 4120–4125, https://doi.org/10.1109/BigData55660.2022.10020940.

  • Faulina, R., D. A. Lusia, B. W. Otok, Sutikno, and H. Kuswanto, 2012: Ensemble method based on ANFIS-ARIMA for rainfall prediction. 2012 Int. Conf. on Statistics in Science, Business and Engineering (ICSSBE), Langkawi, Malaysia, Institute of Electrical and Electronics Engineers, 1–4, https://doi.org/10.1109/ICSSBE.2012.6396564.

  • Golyandina, N., 2010: On the choice of parameters in singular spectrum analysis and related subspace-based methods. Stat. Interface, 3, 259–279, https://doi.org/10.4310/SII.2010.v3.n3.a2.

  • Golyandina, N., and A. Zhigljavsky, 2013: Singular Spectrum Analysis for Time Series. Springer, 120 pp.

  • Golyandina, N., V. Nekrutkin, and A. A. Zhigljavsky, 2001: Analysis of Time Series Structure: SSA and Related Techniques. CRC Press, 320 pp.

  • Güldal, V., and H. Tongal, 2010: Comparison of recurrent neural network, adaptive neuro-fuzzy inference system and stochastic models in Eğirdir Lake level forecasting. Water Resour. Manage., 24, 105–128, https://doi.org/10.1007/s11269-009-9439-9.

  • Hamzaçebi, C., D. Akay, and F. Kutay, 2009: Comparison of direct and iterative artificial neural network forecast approaches in multi-periodic time series forecasting. Expert Syst. Appl., 36, 3839–3844, https://doi.org/10.1016/j.eswa.2008.02.042.

  • Han, D., W. Yan, and A. Moghaddamnia, 2009: Model input data selection by the gamma test. Geophysical Research Abstracts, Vol. 11, Abstract 9711, https://meetingorganizer.copernicus.org/EGU2009/EGU2009-9711-2.pdf.

  • Hassani, H., S. Heravi, and A. Zhigljavsky, 2009: Forecasting European industrial production with multivariate singular spectrum analysis. Int. J. Forecasting, 25, 103–118, https://doi.org/10.1016/j.ijforecast.2008.09.007.

  • Hewage, P., A. Behera, M. Trovati, and E. Pereira, 2019: Long-short term memory for an effective short-term weather forecasting model using surface weather data. 15th IFIP Int. Conf. on Artificial Intelligence Applications and Innovations, Hersonissos, Greece, Springer International Publishing, 382–390, https://inria.hal.science/hal-02331313.

  • Hochreiter, S., and J. Schmidhuber, 1997: Long short-term memory. Neural Comput., 9, 1735–1780, https://doi.org/10.1162/neco.1997.9.8.1735.

  • Hung, N. Q., M. S. Babel, S. Weesakul, and N. K. Tripathi, 2009: An artificial neural network model for rainfall forecasting in Bangkok, Thailand. Hydrol. Earth Syst. Sci., 13, 1413–1425, https://doi.org/10.5194/hess-13-1413-2009.

  • Hutter, F., L. Kotthoff, and J. Vanschoren, 2019: Automated Machine Learning: Methods, Systems, Challenges. Springer, 219 pp.

  • Hyndman, R. J., and A. B. Koehler, 2006: Another look at measures of forecast accuracy. Int. J. Forecasting, 22, 679–688, https://doi.org/10.1016/j.ijforecast.2006.03.001.

  • Joseph, S., and Coauthors, 2015: North Indian heavy rainfall event during June 2013: Diagnostics and extended range prediction. Climate Dyn., 44, 2049–2065, https://doi.org/10.1007/s00382-014-2291-5.

  • Khan, M. I., and R. Maity, 2020: Hybrid deep learning approach for multi-step-ahead daily rainfall prediction using GCM simulations. IEEE Access, 8, 52 774–52 784, https://doi.org/10.1109/ACCESS.2020.2980977.

  • Khastagir, A., I. Hossain, and A. H. M. Faisal Anwar, 2022: Efficacy of linear multiple regression and artificial neural network for long-term rainfall forecasting in western Australia. Meteor. Atmos. Phys., 134, 69, https://doi.org/10.1007/s00703-022-00907-4.

  • Khorram, S., and N. Jehbez, 2023: A hybrid CNN-LSTM approach for monthly reservoir inflow forecasting. Water Resour. Manage., 37, 4097–4121, https://doi.org/10.1007/s11269-023-03541-w.

  • Kratzert, F., D. Klotz, C. Brenner, K. Schulz, and M. Herrnegger, 2018: Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks. Hydrol. Earth Syst. Sci., 22, 6005–6022, https://doi.org/10.5194/hess-22-6005-2018.

  • Kumar, R. S., and C. Ramesh, 2018: Decision tree based rainfall prediction model with data driven model using multiple linear regression. Adv. Nat. Appl. Sci., 12, 12–19, https://doi.org/10.22587/anas.2018.12.6.3.

  • Le, X.-H., H. V. Ho, G. Lee, and S. Jung, 2019: Application of Long Short-Term Memory (LSTM) neural network for flood forecasting. Water, 11, 1387, https://doi.org/10.3390/w11071387.

  • Litta, A. J., S. M. Idicula, and U. C. Mohanty, 2013: Artificial neural network model in prediction of meteorological parameters during premonsoon thunderstorms. Int. J. Atmos. Sci., 2013, 525383, https://doi.org/10.1155/2013/525383.

  • Liu, J., L. Xu, and N. Chen, 2022: A spatiotemporal deep learning model ST-LSTM-SA for hourly rainfall forecasting using radar echo images. J. Hydrol., 609, 127748, https://doi.org/10.1016/j.jhydrol.2022.127748.

  • Ljung, G. M., and G. E. P. Box, 1978: On a measure of lack of fit in time series models. Biometrika, 65, 297–303, https://doi.org/10.1093/biomet/65.2.297.

  • Marcellino, M., J. H. Stock, and M. W. Watson, 2006: A comparison of direct and iterated multistep AR methods for forecasting macroeconomic time series. J. Econometrics, 135, 499–526, https://doi.org/10.1016/j.jeconom.2005.07.020.

  • Murthy, K. V. N., R. Saravana, and K. V. Kumar, 2018: Modeling and forecasting rainfall patterns of southwest monsoons in North–East India as a SARIMA process. Meteor. Atmos. Phys., 130, 99–106, https://doi.org/10.1007/s00703-017-0504-2.

  • Nash, J. E., and J. V. Sutcliffe, 1970: River flow forecasting through conceptual models Part I—A discussion of principles. J. Hydrol., 10, 282–290, https://doi.org/10.1016/0022-1694(70)90255-6.

  • Niknia, N., H. K. Moghaddam, S. M. Banaei, H. T. Podeh, F. Omidinasab, and A. A. Yazdi, 2014: Application of gamma test and neuro-fuzzy models in uncertainty analysis for prediction of pipeline scouring depth. J. Water Resour. Prot., 6, 514–525, https://doi.org/10.4236/jwarp.2014.65050.

  • Nitesh, K., Y. Abhiram, R. K. Teja, and S. Kavitha, 2023: Weather prediction using Long Short Term Memory (LSTM). 2023 Fifth Int. Conf. on Smart Systems and Inventive Technology (ICSSIT), Tirunelveli, TN, Institute of Electrical and Electronics Engineers, 1–6, https://doi.org/10.1109/ICSSIT55814.2023.10061039.

  • Ohmichi, Y., K. Takahashi, and K. Nakakita, 2022: Time-series image de-noising of pressure-sensitive paint data by projected multivariate singular spectrum analysis. arXiv, 2203.07574v4, https://doi.org/10.48550/arXiv.2203.07574.

  • Panda, A., and N. Sahu, 2019: Trend analysis of seasonal rainfall and temperature pattern in Kalahandi, Bolangir and Koraput districts of Odisha, India. Atmos. Sci. Lett., 20, e932, https://doi.org/10.1002/asl.932.

  • Partal, T., and Ö. Kişi, 2007: Wavelet and neuro-fuzzy conjunction model for precipitation forecasting. J. Hydrol., 342, 199–212, https://doi.org/10.1016/j.jhydrol.2007.05.026.

  • Ratnam, J. V., H. A. Dijkstra, T. Doi, Y. Morioka, M. Nonaka, and S. K. Behera, 2019: Improving seasonal forecasts of air temperature using a genetic algorithm. Sci. Rep., 9, 12781, https://doi.org/10.1038/s41598-019-49281-z.

  • Rodrigues, P. C., and R. Mahmoudvand, 2018: The benefits of multivariate singular spectrum analysis over the univariate version. J. Franklin Inst., 355, 544–564, https://doi.org/10.1016/j.jfranklin.2017.09.008.

  • Salman, A. G., Y. Heryadi, E. Abdurahman, and W. Suparta, 2018: Single layer & multi-layer Long Short-Term Memory (LSTM) model with intermediate variables for weather forecasting. Procedia Comput. Sci., 135, 89–98, https://doi.org/10.1016/j.procs.2018.08.153.

  • Sun, X., H. Zhang, J. Wang, C. Shi, D. Hua, and J. Li, 2022: Ensemble streamflow forecasting based on variational mode decomposition and long short term memory. Sci. Rep., 12, 518, https://doi.org/10.1038/s41598-021-03725-7.

  • Suresh, A., and S. Pekkat, 2023: Importance of copula-based bivariate rainfall intensity-duration-frequency curves for an urbanized catchment incorporating climate change. J. Hydrol. Eng., 28, 05023012, https://doi.org/10.1061/JHYEFF.HEENG-5577.

  • Taieb, S. B., and R. J. Hyndman, 2012: Recursive and direct multistep forecasting: The best of both worlds. Department of Econometrics and Business Statistics Working Paper 19/12, 36 pp., https://www.monash.edu/business/ebs/research/publications/ebs/wp19-12.pdf.

  • Umut, O., 2012: Using wavelet transform to improve generalization capability of feed forward neural networks in monthly runoff prediction. Sci. Res. Essays, 7, 1690–1703, https://doi.org/10.5897/SRE12.110.

  • Unnikrishnan, P., and V. Jothiprakash, 2018: Data-driven multi-time-step ahead daily rainfall forecasting using singular spectrum analysis-based data pre-processing. J. Hydroinf., 20, 645–667, https://doi.org/10.2166/hydro.2017.029.

  • Vautard, R., and M. Ghil, 1989: Singular spectrum analysis in non-linear dynamics, with applications to paleoclimatic time series. Physica D, 35, 395–424, https://doi.org/10.1016/0167-2789(89)90077-8.

  • Vautard, R., P. Yiou, and M. Ghil, 1992: Singular-spectrum analysis: A toolkit for short, noisy chaotic signals. Physica D, 58, 95–126, https://doi.org/10.1016/0167-2789(92)90103-T.

  • Wang, N., J. Nie, J. Li, K. Wang, and S. Ling, 2022: A compression strategy to accelerate LSTM meta-learning on FPGA. ICT Express, 8, 322–327, https://doi.org/10.1016/j.icte.2022.03.014.

  • Wang, W.-C., K.-W. Chau, D.-M. Xu, and X.-Y. Chen, 2015: Improving forecasting accuracy of annual runoff time series using ARIMA based on EEMD decomposition. Water Resour. Manage., 29, 2655–2675, https://doi.org/10.1007/s11269-015-0962-6.

  • Wu, C. L., K. W. Chau, and Y. S. Li, 2009: Methods to improve neural network performance in daily flows prediction. J. Hydrol., 372, 80–93, https://doi.org/10.1016/j.jhydrol.2009.03.038.

  • Wu, C. L., K. W. Chau, and C. Fan, 2010: Prediction of rainfall time series using modular artificial neural networks coupled with data-preprocessing techniques. J. Hydrol., 389, 146–167, https://doi.org/10.1016/j.jhydrol.2010.05.040.

  • Wu, X., and Coauthors, 2021: The development of a hybrid wavelet-ARIMA-LSTM model for precipitation amounts and drought analysis. Atmosphere, 12, 74, https://doi.org/10.3390/atmos12010074.

  • Wu, Z., and N. E. Huang, 2009: Ensemble empirical mode decomposition: A noise-assisted data analysis method. Adv. Adapt. Data Anal., 1 (1), 1–41, https://doi.org/10.1142/S1793536909000047.

  • Yu, P.-S., T.-C. Yang, S.-Y. Chen, C.-M. Kuo, and H.-W. Tseng, 2017: Comparison of random forests and support vector machine for real-time radar-derived rainfall forecasting. J. Hydrol., 552, 92–104, https://doi.org/10.1016/j.jhydrol.2017.06.020.

  • Yunpeng, L., H. Di, B. Junpeng, and Q. Yong, 2017: Multistep ahead time series forecasting for different data patterns based on LSTM recurrent neural network. 2017 14th Web Information Systems and Applications Conf. (WISA), Liuzhou, China, Institute of Electrical and Electronics Engineers, 305–310, https://doi.org/10.1109/WISA.2017.25.

  • Zar, J. H., 1972: Significance testing of the Spearman rank correlation coefficient. J. Amer. Stat. Assoc., 67, 578–580, https://doi.org/10.1080/01621459.1972.10481251.

  • Zuo, G., J. Luo, N. Wang, Y. Lian, and X. He, 2020: Decomposition ensemble model based on variational mode decomposition and long short-term memory for streamflow forecasting. J. Hydrol., 585, 124776, https://doi.org/10.1016/j.jhydrol.2020.124776.

    • Search Google Scholar
    • Export Citation

Supplementary Materials

Save
  • Abbot, J., and J. Marohasy, 2012: Application of artificial neural networks to rainfall forecasting in Queensland, Australia. Adv. Atmos. Sci., 29, 717730, https://doi.org/10.1007/s00376-012-1259-9.

    • Search Google Scholar
    • Export Citation
  • Alexandrov, T., 2009: A method of trend extraction using singular spectrum analysis. arXiv, 0804.3367v3, https://doi.org/10.48550/arXiv.0804.3367.

  • Ansari, K., T.-S. Bae, K. D. Singh, and J. Aryal, 2022: Multivariate singular spectrum analysis of seismicity in the space–time-depth-magnitude domain: Insight from eastern Nepal and the southern Tibetan Himalaya. J. Seismol., 26, 147166, https://doi.org/10.1007/s10950-021-10057-6.

    • Search Google Scholar
    • Export Citation
  • Bajirao, T. S., A. Elbeltagi, M. Kumar, and Q. B. Pham, 2022: Applicability of machine learning techniques for multi-time step ahead runoff forecasting. Acta Geophys., 70, 757776, https://doi.org/10.1007/s11600-022-00749-z.

    • Search Google Scholar
    • Export Citation
  • Benjamin, S. G., J. M. Brown, G. Brunet, P. Lynch, K. Saito, and T. W. Schlatter, 2019: 100 years of progress in forecasting and NWP applications. A Century of Progress in Atmospheric and Related Sciences: Celebrating the American Meteorological Society Centennial, Meteor. Monogr., No. 59, Amer. Meteor. Soc., https://doi.org/10.1175/AMSMONOGRAPHS-D-18-0020.1.

  • Bergen, K. J., P. A. Johnson, M. V. De Hoop, and G. C. Beroza, 2019: Machine learning for data-driven discovery in solid Earth geoscience. Science, 363, eaau0323, https://doi.org/10.1126/science.aau0323.

    • Search Google Scholar
    • Export Citation
  • Broomhead, D. S., and G. P. King, 1986: On the qualitative analysis of experimental dynamical systems. Non-linear Phenomena and Chaos, S. Sarkar, Ed., Adam Hilger, 113–144.

  • Chang, M., B. Liu, B. Wang, C. Martinez-Villalobos, G. Ren, and T. Zhou, 2022: Understanding future increases in precipitation extremes in global land monsoon regions. J. Climate, 35, 18391851, https://doi.org/10.1175/JCLI-D-21-0409.1.

    • Search Google Scholar
    • Export Citation
  • Chen, G., B. Tang, X. Zeng, P. Zhou, P. Kang, and H. Long, 2022: Short-term wind speed forecasting based on long short-term memory and improved BP neural network. Int. J. Electr. Power Energy Syst., 134, 107365, https://doi.org/10.1016/j.ijepes.2021.107365.

    • Search Google Scholar
    • Export Citation
  • Das, S., R. Chakraborty, and A. Maitra, 2017: A random forest algorithm for nowcasting of intense precipitation events. Adv. Space Res., 60, 12711282, https://doi.org/10.1016/j.asr.2017.03.026.

    • Search Google Scholar
    • Export Citation
  • Dawson, C. W., and R. L. Wilby, 2001: Hydrological modelling using artificial neural networks. Prog. Phys. Geogr., 25, 80108, https://doi.org/10.1177/030913330102500104.

    • Search Google Scholar
    • Export Citation
  • Durai, V. R., and R. Bhardwaj, 2014: Forecasting quantitative rainfall over India using multi-model ensemble technique. Meteor. Atmos. Phys., 126, 3148, https://doi.org/10.1007/s00703-014-0334-4.

    • Search Google Scholar
    • Export Citation
  • Elsner, J. B., and A. A. Tsonis, 1996: Singular Spectrum Analysis: A New Tool in Time Series Analysis. Springer Science and Business Media, 164 pp.

  • Fan, M., O. Imran, A. Singh, and S. A. Ajila, 2022: Using CNN-LSTM model for weather forecasting. 2022 IEEE Int. Conf. on Big Data, Osaka, Japan, Institute of Electrical and Electronics Engineers, 4120–4125, https://doi.org/10.1109/BigData55660.2022.10020940.

  • Faulina, R., D. A. Lusia, B. W. Otok, Sutikno, and H. Kuswanto, 2012: Ensemble method based on ANFIS-ARIMA for rainfall prediction. 2012 Int. Conf. on Statistics in Science, Business and Engineering (ICSSBE), Langkawi, Malaysia, Institute of Electrical and Electronics Engineers, 1–4, https://doi.org/10.1109/ICSSBE.2012.6396564.

  • Golyandina, N., 2010: On the choice of parameters in singular spectrum analysis and related subspace-based methods. Stat. Interface, 3, 259279, https://doi.org/10.4310/SII.2010.v3.n3.a2.

    • Search Google Scholar
    • Export Citation
  • Golyandina, N., and A. Zhigljavsky, 2013: Singular Spectrum Analysis for Time Series. Springer, 120 pp.

  • Golyandina, N., V. Nekrutkin, and A. A. Zhigljavsky, 2001: Analysis of Time Series Structure: SSA and Related Techniques. CRC Press, 320 pp.

  • Güldal, V., and H. Tongal, 2010: Comparison of recurrent neural network, adaptive neuro-fuzzy inference system and stochastic models in Eğirdir Lake level forecasting. Water Resour. Manage., 24, 105128, https://doi.org/10.1007/s11269-009-9439-9.

    • Search Google Scholar
    • Export Citation
  • Hamzaçebi, C., D. Akay, and F. Kutay, 2009: Comparison of direct and iterative artificial neural network forecast approaches in multi-periodic time series forecasting. Expert Syst. Appl., 36, 38393844, https://doi.org/10.1016/j.eswa.2008.02.042.

    • Search Google Scholar
    • Export Citation
  • Han, D., W. Yan, and A. Moghaddamnia, 2009: Model input data selection by the gamma test. Geophysical Research Abstracts, Vol. 11, Abstract 9711, https://meetingorganizer.copernicus.org/EGU2009/EGU2009-9711-2.pdf.

  • Hassani, H., S. Heravi, and A. Zhigljavsky, 2009: Forecasting European industrial production with multivariate singular spectrum analysis. Int. J. Forecasting, 25, 103118, https://doi.org/10.1016/j.ijforecast.2008.09.007.

    • Search Google Scholar
    • Export Citation
  • Hewage, P., A. Behera, M. Trovati, and E. Pereira, 2019: Long-short term memory for an effective short-term weather forecasting model using surface weather data. 15th IFIP Int. Conf. on Artificial Intelligence Applications and Innovations, Hersonissos, Greece, Springer International Publishing, 382–390, https://inria.hal.science/hal-02331313.

  • Hochreiter, S., and J. Schmidhuber, 1997: Long short-term memory. Neural Comput., 9, 17351780, https://doi.org/10.1162/neco.1997.9.8.1735.

    • Search Google Scholar
    • Export Citation
  • Hung, N. Q., M. S. Babel, S. Weesakul, and N. K. Tripathi, 2009: An artificial neural network model for rainfall forecasting in Bangkok, Thailand. Hydrol. Earth Syst. Sci., 13, 14131425, https://doi.org/10.5194/hess-13-1413-2009.

    • Search Google Scholar
    • Export Citation
  • Hutter, F., L. Kotthoff, and J. Vanschoren, 2019: Automated Machine Learning: Methods, Systems, Challenges. Springer, 219 pp.

  • Hyndman, R. J., and A. B. Koehler, 2006: Another look at measures of forecast accuracy. Int. J. Forecasting, 22, 679688, https://doi.org/10.1016/j.ijforecast.2006.03.001.

    • Search Google Scholar
    • Export Citation
  • Joseph, S., and Coauthors, 2015: North Indian heavy rainfall event during June 2013: Diagnostics and extended range prediction. Climate Dyn., 44, 20492065, https://doi.org/10.1007/s00382-014-2291-5.

    • Search Google Scholar
    • Export Citation
  • Khan, M. I., and R. Maity, 2020: Hybrid deep learning approach for multi-step-ahead daily rainfall prediction using GCM simulations. IEEE Access, 8, 52 77452 784, https://doi.org/10.1109/ACCESS.2020.2980977.

    • Search Google Scholar
    • Export Citation
  • Khastagir, A., I. Hossain, and A. H. M. Faisal Anwar, 2022: Efficacy of linear multiple regression and artificial neural network for long-term rainfall forecasting in western Australia. Meteor. Atmos. Phys., 134, 69, https://doi.org/10.1007/s00703-022-00907-4.

    • Search Google Scholar
    • Export Citation
  • Khorram, S., and N. Jehbez, 2023: A hybrid CNN-LSTM approach for monthly reservoir inflow forecasting. Water Resour. Manage., 37, 40974121, https://doi.org/10.1007/s11269-023-03541-w.

    • Search Google Scholar
    • Export Citation
  • Kratzert, F., D. Klotz, C. Brenner, K. Schulz, and M. Herrnegger, 2018: Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks. Hydrol. Earth Syst. Sci., 22, 60056022, https://doi.org/10.5194/hess-22-6005-2018.

    • Search Google Scholar
    • Export Citation
  • Kumar, R. S., and C. Ramesh, 2018: Decision tree based rainfall prediction model with data driven model using multiple linear regression. Adv. Nat. Appl. Sci., 12, 1219, https://doi.org/10.22587/anas.2018.12.6.3.

    • Search Google Scholar
    • Export Citation
  • Le, X.-H., H. V. Ho, G. Lee, and S. Jung, 2019: Application of Long Short-Term Memory (LSTM) neural network for flood forecasting. Water, 11, 1387, https://doi.org/10.3390/w11071387.

    • Search Google Scholar
    • Export Citation
  • Litta, A. J., S. M. Idicula, and U. C. Mohanty, 2013: Artificial neural network model in prediction of meteorological parameters during premonsoon thunderstorms. Int. J. Atmos. Sci., 2013, 525383, https://doi.org/10.1155/2013/525383.

    • Search Google Scholar
    • Export Citation
  • Liu, J., L. Xu, and N. Chen, 2022: A spatiotemporal deep learning model ST-LSTM-SA for hourly rainfall forecasting using radar echo images. J. Hydrol., 609, 127748, https://doi.org/10.1016/j.jhydrol.2022.127748.

    • Search Google Scholar
    • Export Citation
  • Ljung, G. M., and G. E. P. Box, 1978: On a measure of lack of fit in time series models. Biometrika, 65, 297303, https://doi.org/10.1093/biomet/65.2.297.

    • Search Google Scholar
    • Export Citation
  • Marcellino, M., J. H. Stock, and M. W. Watson, 2006: A comparison of direct and iterated multistep AR methods for forecasting macroeconomic time series. J. Econometrics, 135, 499526, https://doi.org/10.1016/j.jeconom.2005.07.020.

    • Search Google Scholar
    • Export Citation
  • Murthy, K. V. N., R. Saravana, and K. V. Kumar, 2018: Modeling and forecasting rainfall patterns of southwest monsoons in North–East India as a SARIMA process. Meteor. Atmos. Phys., 130, 99106, https://doi.org/10.1007/s00703-017-0504-2.

    • Search Google Scholar
    • Export Citation
  • Nash, J. E., and J. V. Sutcliffe, 1970: River flow forecasting through conceptual models Part I—A discussion of principles. J. Hydrol., 10, 282290, https://doi.org/10.1016/0022-1694(70)90255-6.

    • Search Google Scholar
    • Export Citation
  • Niknia, N., H. K. Moghaddam, S. M. Banaei, H. T. Podeh, F. Omidinasab, and A. A. Yazdi, 2014: Application of gamma test and neuro-fuzzy models in uncertainty analysis for prediction of pipeline scouring depth. J. Water Resour. Prot., 6, 514525, https://doi.org/10.4236/jwarp.2014.65050.

    • Search Google Scholar
    • Export Citation
  • Nitesh, K., Y. Abhiram, R. K. Teja, and S. Kavitha, 2023: Weather prediction using Long Short Term Memory (LSTM). 2023 Fifth Int. Conf. on Smart Systems and Inventive Technology (ICSSIT), Tirunelveli, TN, Institute of Electrical and Electronics Engineers, 1–6, https://doi.org/10.1109/ICSSIT55814.2023.10061039.

  • Ohmichi, Y., K. Takahashi, and K. Nakakita, 2022: Time-series image de-noising of pressure-sensitive paint data by projected multivariate singular spectrum analysis. arXiv, 2203.07574v4, https://doi.org/10.48550/arXiv.2203.07574.

  • Panda, A., and N. Sahu, 2019: Trend analysis of seasonal rainfall and temperature pattern in Kalahandi, Bolangir and Koraput districts of Odisha, India. Atmos. Sci. Lett., 20, e932, https://doi.org/10.1002/asl.932.

    • Search Google Scholar
    • Export Citation
  • Partal, T., and Ö. Kişi, 2007: Wavelet and neuro-fuzzy conjunction model for precipitation forecasting. J. Hydrol., 342, 199212, https://doi.org/10.1016/j.jhydrol.2007.05.026.

    • Search Google Scholar
    • Export Citation
  • Ratnam, J. V., H. A. Dijkstra, T. Doi, Y. Morioka, M. Nonaka, and S. K. Behera, 2019: Improving seasonal forecasts of air temperature using a genetic algorithm. Sci. Rep., 9, 12781, https://doi.org/10.1038/s41598-019-49281-z.

    • Search Google Scholar
    • Export Citation
  • Rodrigues, P. C., and R. Mahmoudvand, 2018: The benefits of multivariate singular spectrum analysis over the univariate version. J. Franklin Inst., 355, 544564, https://doi.org/10.1016/j.jfranklin.2017.09.008.

    • Search Google Scholar
    • Export Citation
  • Salman, A. G., Y. Heryadi, E. Abdurahman, and W. Suparta, 2018: Single layer & multi-layer Long Short-Term Memory (LSTM) model with intermediate variables for weather forecasting. Procedia Comput. Sci., 135, 8998, https://doi.org/10.1016/j.procs.2018.08.153.

    • Search Google Scholar
    • Export Citation
  • Sun, X., H. Zhang, J. Wang, C. Shi, D. Hua, and J. Li, 2022: Ensemble streamflow forecasting based on variational mode decomposition and long short term memory. Sci. Rep., 12, 518, https://doi.org/10.1038/s41598-021-03725-7.

    • Search Google Scholar
    • Export Citation
  • Suresh, A., and S. Pekkat, 2023: Importance of copula-based bivariate rainfall intensity-duration-frequency curves for an urbanized catchment incorporating climate change. J. Hydrol. Eng., 28, 05023012, https://doi.org/10.1061/JHYEFF.HEENG-5577.

    • Search Google Scholar
    • Export Citation
  • Taieb, S. B., and R. J. Hyndman, 2012: Recursive and direct multistep forecasting: The best of both worlds. Department of Econometrics and Business Statistics Working Paper 19/12, 36 pp., https://www.monash.edu/business/ebs/research/publications/ebs/wp19-12.pdf.

  • Umut, O., 2012: Using wavelet transform to improve generalization capability of feed forward neural networks in monthly runoff prediction. Sci. Res. Essays, 7, 16901703, https://doi.org/10.5897/SRE12.110.

    • Search Google Scholar
    • Export Citation
  • Unnikrishnan, P., and V. Jothiprakash, 2018: Data-driven multi-time-step ahead daily rainfall forecasting using singular spectrum analysis-based data pre-processing. J. Hydroinf., 20, 645667, https://doi.org/10.2166/hydro.2017.029.

    • Search Google Scholar
    • Export Citation
  • Vautard, R., and M. Ghil, 1989: Singular spectrum analysis in non-linear dynamics, with applications to paleoclimatic time series. Physica D, 35, 395424, https://doi.org/10.1016/0167-2789(89)90077-8.

    • Search Google Scholar
    • Export Citation
  • Vautard, R., P. Yiou, and M. Ghil, 1992: Singular-spectrum analysis: A toolkit for short, noisy chaotic signals. Physica D, 58, 95126, https://doi.org/10.1016/0167-2789(92)90103-T.

    • Search Google Scholar
    • Export Citation
  • Wang, N., J. Nie, J. Li, K. Wang, and S. Ling, 2022: A compression strategy to accelerate LSTM meta-learning on FPGA. ICT Express, 8, 322327, https://doi.org/10.1016/j.icte.2022.03.014.

    • Search Google Scholar
    • Export Citation
  • Wang, W.-C., K.-W. Chau, D.-M. Xu, and X.-Y. Chen, 2015: Improving forecasting accuracy of annual runoff time series using ARIMA based on EEMD decomposition. Water Resour. Manage., 29, 26552675, https://doi.org/10.1007/s11269-015-0962-6.

    • Search Google Scholar
    • Export Citation
  • Wu, C. L., K. W. Chau, and Y. S. Li, 2009: Methods to improve neural network performance in daily flows prediction. J. Hydrol., 372, 8093, https://doi.org/10.1016/j.jhydrol.2009.03.038.

    • Search Google Scholar
    • Export Citation
  • Wu, C. L., K. W. Chau, and C. Fan, 2010: Prediction of rainfall time series using modular artificial neural networks coupled with data-preprocessing techniques. J. Hydrol., 389, 146167, https://doi.org/10.1016/j.jhydrol.2010.05.040.

    • Search Google Scholar
    • Export Citation
  • Wu, X., and Coauthors, 2021: The development of a hybrid wavelet-ARIMA-LSTM model for precipitation amounts and drought analysis. Atmosphere, 12, 74, https://doi.org/10.3390/atmos12010074.

    • Search Google Scholar
    • Export Citation
  • Wu, Z., and N. E. Huang, 2009: Ensemble empirical mode decomposition: A noise-assisted data analysis method. Adv. Adapt. Data Anal., 1 (1), 141, https://doi.org/10.1142/S1793536909000047.

    • Search Google Scholar
    • Export Citation
  • Yu, P.-S., T.-C. Yang, S.-Y. Chen, C.-M. Kuo, and H.-W. Tseng, 2017: Comparison of random forests and support vector machine for real-time radar-derived rainfall forecasting. J. Hydrol., 552, 92104, https://doi.org/10.1016/j.jhydrol.2017.06.020.

    • Search Google Scholar
    • Export Citation
  • Yunpeng, L., H. Di, B. Junpeng, and Q. Yong, 2017: Multistep ahead time series forecasting for different data patterns based on LSTM recurrent neural network. 2017 14th Web Information Systems and Applications Conf. (WISA), Liuzhou, China, Institute of Electrical and Electronics Engineers, 305–310, https://doi.org/10.1109/WISA.2017.25.

  • Zar, J. H., 1972: Significance testing of the Spearman rank correlation coefficient. J. Amer. Stat. Assoc., 67, 578580, https://doi.org/10.1080/01621459.1972.10481251.

    • Search Google Scholar
    • Export Citation
  • Zuo, G., J. Luo, N. Wang, Y. Lian, and X. He, 2020: Decomposition ensemble model based on variational mode decomposition and long short-term memory for streamflow forecasting. J. Hydrol., 585, 124776, https://doi.org/10.1016/j.jhydrol.2020.124776.

    • Search Google Scholar
    • Export Citation
  • Fig. 1.

    Location map of the study areas.

  • Fig. 2.

    The architecture of LSTM memory cell.

  • Fig. 3.

    Flowchart for hybrid model architecture.

  • Fig. 4.

    Eigenvalue plot and variance ratio for hourly multivariate series for (a) Guwahati city and (b) Aizawl city for the period 2015–19.

  • Fig. 5.

    The W-correlation for the first 50 RCs for (a) Guwahati city and (b) Aizawl city for the period 2015–19.

  • Fig. 6.

    Correlogram plot, boxplot, and power spectrum of a noise component for (a) Guwahati city and (b) Aizawl city for the period 2015–19.

  • Fig. 7.

    Performance comparison using the error indices RMSE, NSE, MNE, and SMAPE between observed and forecasted rainfall (a) before preprocessing and (b) after preprocessing by MSSA at different learning rates for Guwahati city and (c) before preprocessing and (d) after preprocessing by MSSA at different learning rates for Aizawl city.

  • Fig. 8.

    Hourly observed and forecasted rainfall by hybrid MSSA–LSTM model for the period (a) 3 Jan 2015–29 Apr 2019 (for Guwahati) and (b) 3 Jan 2015–4 May 2019 (for Aizawl); scatterplot of observed and model forecast during training and testing period (c) for Guwahati city and (d) for Aizawl city; observed and model forecast for next 1 h for the period (e) 1700:00–2300:00 LT 29 Apr 2019 (for Guwahati) and (f) 1400:00–2000:00 LT 4 May 2019 (for Aizawl); radar chart showing the performance skill of the LSTM model in hourly rainfall forecast with and without preprocessing by MSSA for (g) Guwahati city and (h) Aizawl city.
