RBF Neural Networks Combined with Principal Component Analysis Applied to Quantitative Precipitation Forecast for a Reservoir Watershed during Typhoon Periods

Chih-Chiang Wei Department of Information Management, Toko University, Pu-Tzu City, Taiwan

Abstract

The forecast of precipitation during typhoons has received much attention in recent years and is important in meteorology and the atmospheric sciences. Precipitation nowcasting during typhoons is also of great significance to operators of a reservoir system. This study developed an improved neural network that combines the principal component analysis (PCA) technique and the radial basis function (RBF) network. The developed methodology was employed to establish the quantitative precipitation forecast model for the watershed of the Shihmen Reservoir in northern Taiwan. The results obtained from RBF, multiple linear regression (MLR), PCA–RBF, and PCA–MLR models included the forecasts of L-h-ahead (L = 1, 3, 6) hourly accumulated precipitation. The resulting predictions were compared in terms of four measures [mean absolute error (MAE), RMSE, coefficient of correlation (CC), and coefficient of efficiency (CE)] and four skill scores [percentage error (PE), area-weighted error score (AWES), bias score (BIAS), and equitable threat score (ETS)]. The results showed that predictions obtained using RBF and PCA–RBF were better than those produced by MLR and PCA–MLR. Although both RBF and PCA–RBF can provide good results on average, the network architecture and the learning speed of the PCA–RBF network are superior to those of the simple RBF network. This is because the PCA technique can greatly reduce the number of input parameters and at the same time simplify the network structure. Consequently, the PCA–RBF neural network can be regarded as a reliable model for predicting precipitation during typhoons.

Corresponding author address: Chih-Chiang Wei, Department of Information Management, Toko University, No. 51, Sec. 2, University Rd., Pu-Tzu City, Chia-Yi County 61363, Taiwan. E-mail: d89521007@ntu.edu.tw

1. Introduction

Taiwan is located on the main path of western North Pacific tropical cyclones and is affected by at least one typhoon each year, according to the official records of the Central Weather Bureau (CWB), Taiwan (Lee et al. 2008). As soon as a typhoon strikes, the upstream watershed receives voluminous rainfall within a short time, which quickly converges downstream. A heavy downpour can easily cause floodwater to rise above the downstream embankments, resulting in considerable economic losses and casualties (Hsu and Wei 2007). Therefore, a useful scheme for quantitative precipitation forecast (QPF) during typhoon periods is highly desired (Chang et al. 1993; Lee et al. 2006; Wei and Hsu 2008a).

In Taiwan, Wang et al. (1986) first developed a technique using the climatology average method (a simple statistical approach developed from the spatial distribution of typhoon center) to forecast typhoon rainfalls over land in Taiwan. This method was adopted to be one of the operational typhoon rainfall forecast aids by CWB. Many researchers followed this approach; for example, Yeh (2002) applied empirical orthogonal function analysis, using the climatology average method, deviation persistence method, and regression equation to forecast the 6-h accumulated typhoon rainfall. Lee et al. (2006) employed a climatology model for forecasting typhoon rainfalls to estimate reasonable cumulative rainfall for each river basin. Hsu and Wei (2007) employed the climatology average model to predict the real-time hourly rainfalls for a reservoir watershed.

In recent years, artificial neural networks (ANNs) have become rather widely used tools. ANNs were created to simulate the nervous system and brain activity (Zhao and Huang 2007; Wei and Hsu 2008a). French et al. (1992) developed a neural network to forecast rainfall intensity fields in space and time; it is a three-layer learning network with input, hidden, and output layers. Training is conducted using back propagation. Fox and Wikle (2005) presented a forecast methodology developed from a Bayesian hierarchical model that produces a QPF product for a 1-h period along with an associated estimated forecast error field. Lin and Lee (2007) proposed a gray forecasting model integrated within Fourier series and a Markov chain to enhance the tendency catching ability of typhoon rainfall events. Fan and Lee (2007) developed a Bayesian mixture regression model and applied it to rainfall nowcast during typhoons. Sheng et al. (2008) used tropical cyclone data (position, pressure, and wind) to establish the distribution functions of rainfall and ran the support vector machine (SVM) regression models for rainfall forecast. Lin and Wu (2009) combined the self-organizing map (SOM) and the multilayer perceptron network (MLP) to forecast the typhoon rainfall.

As one of the most popular neural networks, the radial basis function (RBF) neural network models an unknown system with observable inputs and outputs, which resembles synthesizing an approximation of a set of multidimensional functions (Bors and Gabbouj 1994). RBF networks have been used extensively in many engineering and scientific applications (Park and Sandberg 1991; Bors and Gabbouj 1994). For example, Hwang and Bang (1997) designed an RBF classifier for handwritten numeral recognition that showed very competitive performance. Moradkhani et al. (2004) used a self-organizing RBF to improve one-step-ahead forecasts of daily streamflow. Moreover, RBF-related networks continue to attract attention regarding the improvement of their approximation ability and the construction of their architecture (Shi et al. 2005). Some researchers, such as Dong and MacAvoy (1996), Monahan (2000), Hsieh (2001), Lu et al. (2004), and Ture et al. (2007), have successfully combined principal component analysis (PCA) with neural networks. They concluded that PCA can be employed to find a set of orthogonal components that minimize the error in the reconstructed data.

This study developed an improved neural network that combines the PCA technique and RBF networks. Four QPF models (RBF, classical multiple linear regression (MLR), PCA–MLR, and the improved PCA–RBF) were formulated to predict hourly rainfall during typhoons. The developed methodology was then utilized to construct the QPF models for the watershed of the Shihmen Reservoir in northern Taiwan. The PCA–RBF model was compared with the other three models in the analysis of historical typhoon events.

2. Methodology

This section presents a methodology for developing a usable scheme for forecasting amounts of rainfall during typhoon periods. The main processing stages are plotted in Fig. 1. In the figure, the four prediction algorithms mentioned above—RBF, MLR, PCA–RBF, and PCA–MLR—are involved. The model inputs include typhoon characteristics (pressure of typhoon center, direction angle of typhoon relative to watershed, distance of typhoon from watershed, radius of typhoon, speed of typhoon, and maximum wind speed of typhoon center) and rainfall data (precipitation in the watershed), while the outputs comprise L-h-ahead (L = 1, 3, 6) accumulated precipitations.

Fig. 1. Flowchart of the developed models.

Citation: Journal of Hydrometeorology 13, 2; 10.1175/JHM-D-11-03.1

In the stage of preliminary analysis (see Fig. 1), three scenarios are designed for selecting the best attribute (input) combinations. Then, according to the best scenario evaluated by performance measures, the best cases of these attribute lag-time D varying from 1 to 6 h are identified for each target (L-ahead accumulated precipitation) by running RBF and MLR. In the stage of PCA-related models (right side of the figure), PCA is analyzed to derive new variables from original attributes and their lag times. In a similar way, the best attribute lag-time cases for each target (L) run by PCA–RBF and PCA–MLR are identified. Finally, the best results of the above six cases are regarded as the model predictions for RBF, MLR, PCA–RBF, and PCA–MLR.

3. Theory of algorithm

a. RBF networks

The architecture of RBF networks is a three-layer feed-forward network that consists of one input layer, one hidden layer (also called receptor layer), and one output layer. Each input neuron corresponds to a component of an input vector x (Zhao and Huang 2007). The hidden layer consists of n neurons that represent clusters of input patterns, similar to the clusters in a k-means model (Hwang and Bang 1997). Each input neuron is connected to the hidden-layer neurons. The weight vector between the input layer and the ith hidden-layer neuron (RBF center) corresponds to the center ci in Gaussian function
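$$\phi_i(\mathbf{x}) = \exp\left(-\frac{\lVert \mathbf{x} - \mathbf{c}_i \rVert^2}{2\sigma_i^2}\right), \tag{1}$$
where ci and σi represent the center and the width of the neuron i, respectively, and ‖x − ci‖ denotes the Euclidean distance between x and ci.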
The output layer consists of m neurons that correspond to the possible classes of the problem. Each output-layer neuron is fully connected to the hidden layer and computes a linear weighted sum of the outputs of the hidden neurons—that is,
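$$y_j(\mathbf{x}) = \sum_{i=1}^{n} w_{ij}\,\phi_i(\mathbf{x}), \qquad j = 1, \ldots, m, \tag{2}$$
where φi(x) is the output of the ith hidden-layer neuron and wij is the weight between the ith hidden-layer neuron and the jth output-layer neuron.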

The connections between the input neurons and the receptor neurons (receptor weights) are trained in essentially the same manner as a k-means model. The receptor weights are trained with only the input fields, with the output fields ignored in the first phase of training. Only after the receptor weights are optimized to find clusters in the input data are the connections between the receptors and the output neurons trained to generate predictions (Bishop 1991; Park and Sandberg 1993; SPSS Inc. 2002).
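The two-phase training described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the implementation used in the study: the function names are hypothetical, all neurons share one width, the Gaussian carries a factor of 2 in the denominator, and the output weights are solved by ordinary least squares rather than by iterative training with a learning rate.

```python
import numpy as np

def kmeans(X, n_centers, n_iter=50, seed=0):
    """Phase 1: place the RBF centers by plain k-means on the inputs only."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=n_centers, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distances between every sample and every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for k in range(n_centers):
            members = X[labels == k]
            if len(members):  # an empty cluster keeps its previous center
                centers[k] = members.mean(axis=0)
    return centers

def hidden_outputs(X, centers, widths):
    """Gaussian activations exp(-||x - c_i||^2 / (2 sigma_i^2))."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * widths ** 2))

def fit_rbf(X, y, n_centers, width=0.3):
    """Fit centers first, then the linear output layer."""
    centers = kmeans(X, n_centers)
    widths = np.full(n_centers, width)  # assumption: one shared width
    H = hidden_outputs(X, centers, widths)
    # Phase 2 (assumption): least squares replaces iterative weight training.
    weights, *_ = np.linalg.lstsq(H, y, rcond=None)
    return centers, widths, weights

def predict_rbf(X, centers, widths, weights):
    """Linear weighted sum of the hidden-neuron outputs."""
    return hidden_outputs(X, centers, widths) @ weights
```

Because the centers are fixed before the output layer is fitted, the second phase is a linear problem, which is why a direct solve is possible in this sketch.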

b. RBF networks combined with PCA

The RBF network, in theory, provides an effective method for the learning of a network. RBF networks show good performance in generalization. However, the RBF network is at a disadvantage when dealing with high-dimensional input spaces, especially when the original dataset contains some invalid variables (Lu et al. 2004). That is to say, RBFs are more sensitive to the curse of dimensionality and have greater difficulties if the number of input units is large (Ripley 1996). This is because each additional input unit in a network adds another dimension to the space in which the data cases reside; the networks are attempting to fit a response surface to this data. Thought of in this way, there must be enough data points to populate an M-dimensional space densely enough to reveal the structure. The number of points needed to do this properly grows very rapidly with the dimensionality (roughly, in proportion to $2^M$) (Fausett 1994; Bishop 1995).

Principal component analysis can gather highly correlated independent variables into a principal component, and all principal components are independent of each other, so that all the analysis does is to transform a set of correlated variables into a set of uncorrelated principal components (Liu et al. 2003). A small set of uncorrelated variables is much easier to understand and use in further analyses than a larger set of correlated variables. Hence, the PCA is required and used first in the proposed PCA–RBF method to simplify and orthogonalize the original dataset in order to make the RBF networks more effective (Lu et al. 2004; Ture et al. 2007).

PCA is employed to describe the variance in datasets of observations on p variables (Jolliffe 1986; Manly and Bryan 1986). The first principal component (Y1) can be defined as a linear combination of the elements of the data matrix—that is,
$$Y_1 = a_{11}X_1 + a_{12}X_2 + \cdots + a_{1p}X_p, \tag{3}$$
where X1, …, Xp are the p variables and the coefficients a11, …, a1p, chosen to maximize the variance represented by the first principal component, are the elements of the leading eigenvector of the symmetric covariance matrix. The eigenvalues λ1, …, λp of the covariance matrix represent the variation of each principal component, where λ1 ≥ λ2 ≥ … ≥ λp ≥ 0. Ideally, a PCA will yield several components that describe the majority of the total variation of the dataset (Ture et al. 2007).
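As a minimal sketch of this eigen-decomposition, the snippet below computes the principal components of a data matrix with NumPy. The function name `pca` and its return layout are illustrative choices, not part of the original study.

```python
import numpy as np

def pca(X):
    """Return eigenvalues (descending), eigenvectors (as columns), and scores."""
    Xc = X - X.mean(axis=0)                 # center each variable
    cov = np.cov(Xc, rowvar=False)          # symmetric covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                   # column k holds component Y_(k+1)
    return eigvals, eigvecs, scores
```

The columns of `scores` are mutually uncorrelated, and the sample variance of each column equals the corresponding eigenvalue, which is the property the PCA–RBF approach relies on when it replaces correlated inputs with a few components.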

4. Application

a. Study area and data

Shihmen Reservoir, completed in 1964 and located on the upstream reaches of the Tahan River (see Fig. 2), was the study area. The reservoir is one of the largest water reservoirs in Taiwan. Shihmen Reservoir is a multipurpose reservoir for irrigation, hydroelectric energy generation, public water supply, flood control, and tourism (Cheng et al. 2008; Wei and Hsu 2008b). The watershed covers an area of 763.4 km². It is currently managed by the Water Resources Agency (WRA). Figure 2 also shows the 13 rain gauges in the watershed (Wei and Hsu 2009).

Fig. 2. Map of Tahan River basin and rain gauges.

This study collected a total of 157 typhoon events affecting the Shihmen Reservoir watershed over the past 45 yr (1964–2008), as can be seen in Fig. 3. Complete data of hourly typhoon characteristics can be obtained from CWB while historical hydrological data of the reservoir watershed are available from the WRA.

Fig. 3. Number of historical typhoon events affecting the Shihmen Reservoir watershed in each year.

b. Data preprocessing

1) Attribute definitions

Figure 4 demonstrates the location of the study site and the historical typhoon tracks [an example of Typhoon Sinlaku (2008)]. When a typhoon approaches the study area, many typhoon characteristics (attributes) can be measured. For the complicated typhoon system, the selected attributes can be divided into two categories. One is typhoon characteristics and the other comprises rainfall data in the watershed (as classified in Table 1).

Fig. 4. Location of study site and historical track of Typhoon Sinlaku (2008).

Table 1. Maximum and minimum values of attributes.

First, the attributes of typhoon characteristics include pressure of typhoon center at time t, denoted A1(t) (10² Pa); direction angle of typhoon relative to watershed at time t, A2(t) (°); distance of typhoon from watershed at time t, A3(t) (km); radius of typhoon at time t, A4(t) (km); speed of typhoon at time t, A5(t) (km h−1); and maximum wind speed at typhoon center at time t, A6(t) (km h−1). Second, the hydrological attribute of the reservoir watershed is the hourly average precipitation in the watershed at time t, A7(t) (mm h−1). The average rainfall over the watershed is assumed to represent the precipitation characteristics because the area of the reservoir basin is relatively small in comparison with the region affected by typhoons. That is, A7(t) is the mean of the 13 rain gauges, taken as an estimate of the basin-average precipitation at time t.

In the above attributes, A2(t) is calculated by
e4
where and are the longitude (°E) and latitude (°N) of typhoon center at time t, respectively, and and are the longitude (°E) and latitude (°N) of the reservoir dam, respectively.
Moreover, A3(t) is computed by
e5

The target is the hourly accumulative precipitation of the watershed at lead time L, , where L = 1, 3, and 6 h herein.

2) Scaling of range fields

The range fields of attributes in this study are rescaled to have values between 0 and 1. The linear normalized transformation used is
$$x_j' = \frac{x_j - x_{\min}}{x_{\max} - x_{\min}}, \tag{6}$$
where x′j and xj are the normalized and original data, respectively, and xmax and xmin are the maximum and minimum values of the original data, respectively. Table 1 shows the maximum and minimum values of the attributes selected. For all the patterns of the reservoir watershed, 3739 hourly records are available.
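The linear rescaling to the range 0 to 1, together with the inverse transform needed to convert normalized predictions back to physical units, might look as follows. The function names and the pressure bounds in the usage comment are hypothetical; the actual bounds are those listed in Table 1.

```python
import numpy as np

def minmax_scale(x, x_min, x_max):
    """Map original values linearly onto [0, 1]."""
    return (np.asarray(x, dtype=float) - x_min) / (x_max - x_min)

def minmax_unscale(x_norm, x_min, x_max):
    """Inverse transform: map normalized values back to original units."""
    return np.asarray(x_norm, dtype=float) * (x_max - x_min) + x_min

# Hypothetical bounds, for illustration only:
# minmax_scale(950.0, 900.0, 1000.0) gives 0.5
```

Using the same stored minimum and maximum for training and testing keeps both sets on an identical scale.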

c. Scenarios designed and performance criteria

In this study, the QPF models for rainfall prediction are a function of these attributes. The following three scenarios of attribute combinations are designed:

  • Scenario A: Climatologic characteristics of the typhoon only. The 1-h-ahead rainfall precipitation can be expressed as a function of climatologic attributes—that is,
    e7
  • Scenario B: Average precipitations of the watershed only—that is,
    e8
  • Scenario C: Both climatologic characteristics of the typhoon and average precipitations of the watershed involved—that is,
    e9

All the above attributes are in normalized values. For comparison purposes, the criteria of mean absolute error (MAE), root mean squared error (RMSE), coefficient of correlation (CC), and coefficient of efficiency (CE) are employed to assess the forecasted results, shown as
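$$\mathrm{MAE} = \frac{1}{N}\sum_{j=1}^{N}\left|P_j^{\mathrm{pre}} - P_j^{\mathrm{obs}}\right|, \tag{10}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{j=1}^{N}\left(P_j^{\mathrm{pre}} - P_j^{\mathrm{obs}}\right)^2}, \tag{11}$$
$$\mathrm{CC} = \frac{\sum_{j=1}^{N}\left(P_j^{\mathrm{obs}} - \bar{P}^{\mathrm{obs}}\right)\left(P_j^{\mathrm{pre}} - \bar{P}^{\mathrm{pre}}\right)}{\sqrt{\sum_{j=1}^{N}\left(P_j^{\mathrm{obs}} - \bar{P}^{\mathrm{obs}}\right)^2 \sum_{j=1}^{N}\left(P_j^{\mathrm{pre}} - \bar{P}^{\mathrm{pre}}\right)^2}}, \tag{12}$$
$$\mathrm{CE} = 1 - \frac{\sum_{j=1}^{N}\left(P_j^{\mathrm{obs}} - P_j^{\mathrm{pre}}\right)^2}{\sum_{j=1}^{N}\left(P_j^{\mathrm{obs}} - \bar{P}^{\mathrm{obs}}\right)^2}, \tag{13}$$
where $P_j^{\mathrm{pre}}$ is the predicted precipitation at record j, $P_j^{\mathrm{obs}}$ is the observed precipitation at record j, $\bar{P}^{\mathrm{obs}}$ is the average of observed precipitation, $\bar{P}^{\mathrm{pre}}$ is the average of predicted precipitation, and N is the number of hourly records. Generally, smaller MAE and RMSE values mean better performances, while larger CC and CE values indicate good predictions.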
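The four measures can be computed together from paired observed and predicted series. The sketch below uses the standard definitions of MAE, RMSE, the Pearson correlation coefficient, and the Nash–Sutcliffe form of the coefficient of efficiency; the function name `evaluate` and the dictionary layout are illustrative.

```python
import numpy as np

def evaluate(obs, pred):
    """Return MAE, RMSE, CC, and CE for paired observed/predicted values."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = pred - obs
    d_obs = obs - obs.mean()
    d_pred = pred - pred.mean()
    return {
        "MAE": np.mean(np.abs(err)),
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "CC": np.sum(d_obs * d_pred) / np.sqrt(np.sum(d_obs ** 2) * np.sum(d_pred ** 2)),
        "CE": 1.0 - np.sum(err ** 2) / np.sum(d_obs ** 2),
    }
```

A forecast with a constant offset illustrates why both CC and CE are reported: CC stays at 1 because the series remain perfectly correlated, whereas CE drops because it penalizes the systematic error.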

d. Model construction

The 153 typhoon events during 1964–2007 were used for training and validating. One of the typhoons that occurred in 2008 was used for testing. The cross-validation subsampling approach was employed to evaluate the model performances. The entire dataset was randomly partitioned into 10 equal-sized subsets. During each run, one of the partitions was chosen for testing, while the rest of them were used for training.
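A random 10-fold partition of this kind can be sketched as a small generator. The name `kfold_indices` and the fixed seed are assumptions for illustration; when the number of records is not divisible by 10, the folds differ in size by at most one record.

```python
import numpy as np

def kfold_indices(n_records, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs for a random k-fold partition."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_records)
    folds = np.array_split(perm, n_folds)  # near-equal subset sizes
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        yield train_idx, test_idx
```

Each record is held out exactly once, so averaging a measure over the ten runs uses every record for evaluation.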

As mentioned in section 3, the architecture of classical RBF networks is a three-layer network. The parameter of learning rate (α) is 0.9. The parameter of neuron width (σi) is calculated according to the distances between the two closest clusters; that is, , where d1 is the distance from the center of one cluster to that of another cluster closest to it, and d2 is the distance to the center of the next closest cluster. Thus, clusters that are close to other clusters will have a smaller receptive field, while those that are far from other clusters will have a larger receptive field (SPSS Inc. 2002). The parameter of number of center neurons, however, varies from case to case (Bors and Gabbouj 1994; Lu et al. 2004). Here, the stopping condition is set at reaching the maximal training cycle (set to 1000) or the minimal error rate (set to 10⁻⁴) for determining the number of center neurons of RBF in order to find the optimal hidden neurons.

e. Model analysis

In this section, three scenarios of different attribute combinations and six cases of lag times are analyzed.

1) Three scenarios of attribute combinations

The three scenarios were run using the cross-validation approach. The normalized precipitation predictions were transformed back to their original units. Figure 5 shows the 10-fold results obtained by these models using the validating dataset after the RBF models are trained. As shown in Fig. 5a, MAE ranges from 2.577 to 2.807 in scenario A, 2.131 to 2.609 in scenario B, and 1.368 to 1.725 in scenario C. Meanwhile, in Fig. 5b, RMSE varies between 4.440 and 5.259 in scenario A, with ranges (3.523–4.423) and (2.535–4.056) in scenarios B and C, respectively. The small variation in MAE and RMSE across the folds suggests that the dataset is of good quality. Moreover, in Fig. 5a, the MAE averages in scenarios A, B, and C are 2.695, 2.402, and 1.564, respectively; meanwhile, in Fig. 5b, the RMSE averages in scenarios A, B, and C are 4.748, 4.073, and 3.232, respectively. These results illustrate that scenario C possesses better prediction ability than scenarios A and B.

Fig. 5. Results of (a) MAE and (b) RMSE in three scenarios obtained using 10-fold cross-validation training by RBF networks.

2) Six cases of lag times by RBF and MLR

According to the above analysis, the climatologic typhoon characteristics and average precipitations of watershed were selected to identify the best lag time for predicting precipitations. Here, L-h-ahead accumulated precipitations (L = 1, 3, 6) were analyzed. RBF and MLR models were used in the stage of analysis. The RBF models for lag-time D (ranges from 1 to 6) were expressed as
e14
The MLR regression models were
e15
where ait is the model parameter to be estimated, and e(t) is the model error at time t. The model and its parameters can be obtained by using Matlab.

For the specific length of lead time (i.e., L = 1, 3, 6), the number of input nodes (variables) for lag-time D from 1 to 6 h are 7, 14, 21, 28, 35, and 42, respectively. In the RBF model runs, the hidden layer takes more nodes (from about 200 to 650) with increasing lag times. Moreover, the calculation also consumes more time (from 250 to 3000 s) with increasing lag times when training the RBF networks.
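The input counts quoted above (7 variables per lagged hour) follow directly from how the lagged input matrix is assembled. The sketch below builds such a matrix and fits an MLR model by least squares. Several details are assumptions made for illustration: lag-time D is taken to mean the values at hours t, t−1, …, t−D+1; the target is taken to be the precipitation summed over hours t+1 to t+L; and the function names are hypothetical.

```python
import numpy as np

def lagged_design(A, rain, D, L):
    """Build inputs from D lagged hours of all attributes and L-h-ahead targets.

    A    : (T, 7) array of normalized attributes A1..A7
    rain : (T,) array of hourly basin-average precipitation
    """
    A = np.asarray(A, dtype=float)
    rain = np.asarray(rain, dtype=float)
    T = len(A)
    rows, targets = [], []
    for t in range(D - 1, T - L):
        # Row layout: A(t), A(t-1), ..., A(t-D+1), giving 7 * D inputs.
        rows.append(A[t - D + 1 : t + 1][::-1].ravel())
        # Assumed target: accumulation over the next L hours.
        targets.append(rain[t + 1 : t + L + 1].sum())
    return np.array(rows), np.array(targets)

def fit_mlr(X, y):
    """Least-squares MLR coefficients; the first entry is the intercept."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef
```

With this layout the number of columns grows as 7, 14, …, 42 for D = 1 to 6, which is the growth in input dimension that motivates the PCA step.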

Figure 6 shows the prediction performance of L = 1, 3, 6 forecast precipitations by RBF networks and MLR regressions. For L = 1 of RBF, the optimal situation occurs at D = 2 h with (MAE, RMSE) values of (1.434, 2.980). For L = 3 and L = 6, the best situation occurs at D = 3 and 4 h, respectively. In addition, for MLR, the best situation occurs at D = 2, 2, and 2 h for L = 1, 3, 6, respectively.

Fig. 6. Comparisons of MAE and RMSE in various lag-time situations for (a) 1-, (b) 3-, and (c) 6-h-ahead accumulated forecasts.

3) PCA analysis and PCA-related model results

Before conducting the PCA–RBF and PCA–MLR models, PCA analysis was first applied to the original data to explore possibilities for data reduction in further predictions. Here PCA analysis was executed according to the various situations of the lag-time D. For example, the total number of inputs for L = 1 at D from 1 to 6 are 7, 14, 21, 28, 35, and 42, respectively.

Figure 7 depicts the eigenvalue of components and total variance explained for L = 1. This study selected eigenvalues greater than 1.0 as principal components. Out of the seven components in D = 1 (see Fig. 7a), the first two eigenvalues (i.e., λ1 and λ2) are greater than 1.0. The corresponding total variance percentage of these two components is 56.48%. Meanwhile, as seen in Figs. 7b–f, the first few eigenvalues reach the critical values and the corresponding total variance percentage are 83.69%, 84.24%, 80.69%, 80.61%, and 78.24%, respectively.
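The rule of retaining components whose eigenvalue exceeds 1.0 can be sketched as follows. The snippet assumes that the eigenvalues are those of the correlation matrix, for which they sum to the number of inputs and an eigenvalue of 1.0 corresponds to the variance of one standardized input; the text does not state this explicitly, and the function name is hypothetical.

```python
import numpy as np

def select_components(X, threshold=1.0):
    """Keep components whose correlation-matrix eigenvalue exceeds threshold."""
    corr = np.corrcoef(X, rowvar=False)      # assumption: correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)  # ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > threshold
    explained = eigvals[keep].sum() / eigvals.sum()  # fraction of total variance
    return eigvals, eigvecs[:, keep], explained
```

The returned fraction plays the role of the total variance percentages quoted above, and the retained eigenvector columns define the new inputs passed to the PCA-related models.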

Fig. 7. Eigenvalue and total variance explained for various lag times of 1-h-ahead prediction.

According to the PCA analysis, the new PCA-derived variables are then defined instead of the original ones. The relationships between new variables and original inputs can be expressed in a matrix form. For the 1-h-ahead prediction of D = 2, the equation is
e16
where are the normalized inputs, and yj (j = 1, …, 5) are the new PCA-derived inputs.

Through the above data reduction procedure, new inputs can then be imported into the PCA-related models to analyze rainfall predictions. In the model training of PCA–RBF, the number of hidden nodes for L = 1, 3, 6 was around 50, 60, and 60, respectively. In addition, the average computing times taken were about 30 s. Figure 6 also shows the prediction performance of L = 1, 3, 6 by PCA–RBF and PCA–MLR. According to the results, for PCA–RBF, the best situation occurs at D = 2, 3, and 4 h for L = 1, 3, 6, respectively. In addition, for PCA–MLR, the best situations are all at D = 2 h for L = 1, 3, 6.

5. Discussion and simulation

According to the above analysis, for comparison purposes, this study selects the best cases of RBF-, MLR-, and PCA-related models for L = 1, 3, 6 in this section. Furthermore, Typhoon Sinlaku (2008) was selected for testing comparisons.

a. Definitions of skill scores

Besides the four performances—MAE, RMSE, CC, and CE defined in the previous section—four skill score measures of forecast rainfall including the percentage error (PE), area-weighted error score (AWES), bias score (BIAS), and equitable threat score (ETS) were used.

PE is a simple proportion of misclassifications for rain–no-rain as a proportion of the total number of observations, and AWES is an index designed to ensure that high classification accuracies are not obtained simply by classifying all observations as “no rain.” The indices of classification accuracy used are as follows (Todd et al. 1995):
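$$\mathrm{PE} = \frac{R_n + N_r}{R_r + R_n + N_r + N_n}, \tag{17}$$
$$\mathrm{AWES} = \frac{R_n}{R_n + N_n} + \frac{N_r}{N_r + R_r}, \tag{18}$$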
where Rr is gauge rain hours classified as rain hours by forecast models (rain hits), Rn is gauge no-rain hours classified as rain hours by models (false alarms), Nr is gauge rain hours classified as no-rain hours by models (misses), and Nn is gauge no-rain hours classified as no rain by models (no-rain hits). Here, PE equal to 1.0 means complete error in rain–no-rain predicted frequency. For the AWES score, false alarms are weighted by the total number of no-rain hours and the misses by the total number of rain hours, thereby giving equal weight to each. A score of 0 is a perfect classification, and a score of 2 is a perfect misclassification.
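Both scores follow from the four contingency counts. The sketch below reads PE as misclassified hours over all hours, and AWES as the false-alarm fraction of no-rain hours plus the miss fraction of rain hours, as described above. It assumes that an observed hour counts as rain when the gauge value is positive and that a predicted hour counts as rain when the forecast reaches a small threshold; the function name and the default threshold are illustrative.

```python
import numpy as np

def pe_awes(obs, pred, pred_threshold=0.2):
    """Percentage error and area-weighted error score for rain/no-rain hours."""
    obs_rain = np.asarray(obs, dtype=float) > 0.0           # assumption
    pred_rain = np.asarray(pred, dtype=float) >= pred_threshold
    Rr = np.sum(obs_rain & pred_rain)     # rain hits
    Rn = np.sum(~obs_rain & pred_rain)    # false alarms
    Nr = np.sum(obs_rain & ~pred_rain)    # misses
    Nn = np.sum(~obs_rain & ~pred_rain)   # no-rain hits
    pe = (Rn + Nr) / (Rr + Rn + Nr + Nn)
    # Undefined if the record has no rain hours or no no-rain hours.
    awes = Rn / (Rn + Nn) + Nr / (Nr + Rr)
    return float(pe), float(awes)
```

A model that always predicts no rain would obtain a low PE on a mostly dry record but an AWES of 1, which is the behavior AWES is designed to expose.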
BIAS measures the ratio of the predicted rain frequency to the observed frequency, regardless of forecast accuracy (Ebert et al. 2003). BIAS is defined as (Tuleya et al. 2007)
$$\mathrm{BIAS} = \frac{F}{O}, \tag{19}$$
where F and O are the numbers of forecasts and observations, respectively, satisfying the assigned threshold of rainfall. BIAS can be employed to assess the tendency of the model to under- or overpredict rain occurrence. If BIAS equals 1.0, then the predicted rainfall frequency is the same as was observed (Ebert et al. 2003).
ETS has been used for several decades to measure the correspondence between the forecast and observed rain occurrences (Ebert et al. 2003). ETS was defined as
$$\mathrm{ETS} = \frac{H - H_{\mathrm{random}}}{F + O - H - H_{\mathrm{random}}}, \tag{20}$$
where H is the number of cases when both the observed and estimated rainfalls exceeded the assigned threshold. ETS takes into account rain occurrences that were not predicted. “Equitable” measure accounts for the random chance that both forecasted and observed rain occurrences meet the criteria Hrandom = FO/(F + O). ETS ranges from −⅓ to 1, with a value of 1 indicating perfect correspondence between predicted and observed rain occurrence.
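For a given rainfall threshold, BIAS and ETS can be computed from the counts F, O, and H. In the sketch below the chance term H_random is passed in as an argument, so the caller decides how it is computed; the usage comment applies the expression quoted in the text. The function names are hypothetical, and ETS is written in its usual form, hits above chance divided by forecasts plus observations minus hits and chance hits.

```python
import numpy as np

def exceedance_counts(obs, pred, threshold):
    """Count forecasts (F), observations (O), and hits (H) above a threshold."""
    obs_yes = np.asarray(obs, dtype=float) > threshold
    pred_yes = np.asarray(pred, dtype=float) > threshold
    F = int(pred_yes.sum())
    O = int(obs_yes.sum())
    H = int((obs_yes & pred_yes).sum())
    return F, O, H

def bias_score(F, O):
    """Ratio of forecast frequency to observed frequency."""
    return F / O

def ets_score(F, O, H, H_random):
    """Equitable threat score for a supplied chance-hit term."""
    return (H - H_random) / (F + O - H - H_random)

# Usage, with H_random taken from the expression given in the text:
# F, O, H = exceedance_counts(obs, pred, threshold=0.5)
# ets = ets_score(F, O, H, H_random=F * O / (F + O))
```

Averaging the two scores over a range of thresholds, as done for Table 3, amounts to calling these functions once per threshold and taking the mean.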

b. Comparisons

Table 2 lists the prediction results produced by RBF-, MLR-, and PCA-related models in terms of MAE, RMSE, CC, and CE. As can be seen, the MAE and RMSE of predictions obtained using RBF and PCA–RBF models were better than those produced by MLR and PCA–MLR models, and both RBF and PCA–RBF produced similar results. For CC and CE, comparing RBF with PCA–RBF shows that the forecasts of L = 1 and L = 6 by RBF are better than those by PCA–RBF, while L = 3 yields the opposite result.

Table 2. Measures of MAE, RMSE, CC, and CE for four models.

Table 3 lists the skill scores—including PE, AWES, BIAS, and ETS—of rainfall forecasts from the predictions produced by the four models. Because these models cannot just match the zero values in predicting the no-rain situation, this study assumes that rainfall predictions <0.2 mm were regarded as no rain when evaluating PE and AWES scores. According to the results of Table 3, the RBF model is the best for L = 1, with (PE, AWES) = (0.219, 0.397). Moreover, PCA–RBF and RBF are the best for L = 3 and L = 6, respectively, indicating that RBF and PCA–RBF have higher accuracy for predicting rain or no-rain situations.

Table 3. Scores of PE, AWES, BIAS, and ETS for four models.

To evaluate the average BIAS and ETS in possible rainfalls, the thresholds were assigned from 0 to 50 mm h−1. As seen in Table 3, all models have BIAS scores <1.0 for L = 1 in the order BIAS(PCA–RBF) < BIAS(PCA–MLR) < BIAS(RBF) < BIAS(MLR), indicating that MLR gives the best estimate (i.e., closest to 1.0). For L = 3, the BIAS scores of PCA–RBF and RBF are smaller than 1.0 (i.e., underprediction), while those of PCA–MLR and MLR are larger than 1.0 (i.e., overprediction). For L = 6, the BIAS scores of all four models exceed 1.0.

As defined above, ETS measures the number of forecast fields that match the observed threshold. As illustrated in Table 3, the average ETS values of RBF and PCA–RBF for all L predictions are better than those of the MLR-related models. To further analyze different rainfall strengths, according to the definitions of rainfall levels by CWB of Taiwan, the rain strengths can be graded into four major levels, which are “light rain,” “heavy rain,” “torrential rain,” and “pouring rain.” Their rainfall ranges can be seen in Table 4. Table 5 lists the ETS scores for the four rainfall levels. As can be seen, among the four models, the PCA–RBF model has the highest ETS value of 0.576 for light rain. Meanwhile, the RBF, PCA–RBF, and PCA–RBF models obtain better ETS scores for heavy rain, torrential rain, and pouring rain, respectively.

Table 4. Definition of rain strengths.

Table 5. ETS score for 1-h-ahead forecast by different rain levels.

Although both RBF and PCA–RBF models can provide good results on average, the network architecture and the learning speed of the PCA–RBF network are superior to those of the simple RBF network. Comparing the two networks, this study found that RBF needs 2.80, 3.50, and 4.67 times as many input nodes as PCA–RBF for L = 1, 3, 6, respectively, and 6.79, 6.77, and 8.21 times as many hidden nodes. Furthermore, the average computing time required by RBF is 22.72, 31.69, and 48.98 times that required by PCA–RBF for L = 1, 3, 6, respectively. In conclusion, although some information might be lost when using PCA to reduce the dimensions of the input space, satisfactory predictions can still be achieved by PCA–RBF, which has a simpler network structure and a faster training speed than RBF (Lu et al. 2004).

c. Simulation of a typhoon

Typhoon Sinlaku (2008) reached super typhoon intensity before landfall on Taiwan in September 2008 (the typhoon path can be seen in Fig. 4). It was named a tropical storm at 1800 UTC 8 September over the Philippines and moved gradually with increasing intensity north–northwestward along the edge of the subtropical high. It made landfall about 0150 UTC 14 September in northeastern Taiwan and then weakened while approaching and passing over northern Taiwan (Hsiao et al. 2010).

Table 6 lists the eight performance measures for comparison. RBF and PCA–RBF achieve better performance than MLR and PCA–MLR. Moreover, Fig. 8 depicts the observations and simulations of the testing typhoon for the 1-h-ahead rainfall. The predictions given by RBF and PCA–RBF (see Fig. 8a) are satisfactory and closely consistent with the observed data.
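The four error measures among those eight performances can be computed as in the following sketch. It assumes the usual textbook forms: CC as the Pearson correlation coefficient and CE as one minus the ratio of the squared-error sum to the variance sum of the observations. The function name and the sample rainfall values are hypothetical and are not taken from Table 6.

```python
import numpy as np

def forecast_measures(observed, predicted):
    """Return MAE, RMSE, CC, and CE for paired observations and predictions."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    error = p - o
    return {
        "MAE": float(np.mean(np.abs(error))),
        "RMSE": float(np.sqrt(np.mean(error**2))),
        # Pearson correlation between observations and predictions.
        "CC": float(np.corrcoef(o, p)[0, 1]),
        # Coefficient of efficiency (assumed form): 1 means a perfect fit,
        # 0 means no better than predicting the observed mean.
        "CE": float(1.0 - np.sum(error**2) / np.sum((o - o.mean()) ** 2)),
    }

# Hypothetical hourly rainfall (mm): every prediction is 1 mm too high.
measures = forecast_measures([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

The example shows why CC alone can be misleading: a constant bias leaves CC at 1 while MAE, RMSE, and CE all register the error, which is one reason several measures are reported together.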

Table 6. Performances of 1-h model predictions for Typhoon Sinlaku.

Fig. 8. Comparisons between records and 1-h-ahead predictions for Typhoon Sinlaku (2008) by (a) RBF and PCA–RBF and by (b) MLR and PCA–MLR.

Citation: Journal of Hydrometeorology 13, 2; 10.1175/JHM-D-11-03.1

6. Conclusions

The forecast of precipitation during typhoons has received much attention in recent years. It is important in meteorology and atmospheric sciences, and it is of great significance to operators of a reservoir system. In this paper, QPF models (RBF, MLR, PCA–RBF, and PCA–MLR) have been developed to predict hourly rainfall during typhoons. The proposed methodology can be employed to characterize precipitation in a reservoir watershed and to develop a usable scheme for forecasting the amount of rainfall during typhoons. This paper evaluated the best attribute combinations and identified the best lag time for L-ahead (L = 1, 3, 6) hourly accumulated precipitation. The resulting predictions were compared in terms of four measures (MAE, RMSE, CC, and CE) and four skill scores (PE, AWES, BIAS, and ETS).

The developed methodology was employed to establish the QPF models for the Shihmen Reservoir watershed. This study collected data from 157 typhoons that affected the watershed over the past 45 yr. The results obtained using the four models included the forecasts for L = 1, 3, 6. In terms of MAE and RMSE, predictions obtained by RBF and PCA–RBF are better than those produced by MLR and PCA–MLR. For CC and CE, predictions for L = 1, 6 by RBF are better than those by PCA–RBF, while the opposite is true for L = 3. Meanwhile, RBF is the best model for L = 1, 6 with (PE, AWES) = (0.219, 0.397) and (0.318, 0.770), respectively, and PCA–RBF has the best scores for L = 3 with (PE, AWES) = (0.286, 0.585). In other words, both RBF and PCA–RBF have higher accuracy for predicting rain or no-rain situations. This study also evaluated different rainfall strengths, denoted as “light rain,” “heavy rain,” “torrential rain,” and “pouring rain.” It was found that PCA–RBF has the highest ETS value of 0.576 in light rain, while RBF, PCA–RBF, and PCA–RBF provided better ETS scores in heavy rain, torrential rain, and pouring rain, respectively.

Although both RBF and PCA–RBF models can provide good results on average, the network architecture and the learning speed of the PCA–RBF network are superior to those of the simple RBF network. This study found that using the PCA technique can greatly simplify the network structure and reduce computing time. This is because PCA–RBF orthogonalizes the input variables, which generally makes the network training easier. In view of the results obtained, the PCA–RBF neural network can be regarded as a reliable model for predicting precipitation during typhoons.

Acknowledgments

The support under Grant NSC100-2111-M-464-001 by the National Science Council, Taiwan, is greatly appreciated. The writer is also grateful for the constructive comments of the referees.

REFERENCES

  • Bishop, C. M., 1991: Improving the generalization properties of radial basis function neural networks. Neural Comput., 3, 579–581.

  • Bishop, C. M., 1995: Neural Networks for Pattern Recognition. Oxford University Press, 504 pp.

  • Bors, A. G., and M. Gabbouj, 1994: Minimal topology for a radial basis functions neural network for pattern classification. Digital Signal Process., 4, 173–188.

  • Chang, C.-P., T.-C. Yeh, and J.-M. Chen, 1993: Effects of terrain on the surface structure of typhoons over Taiwan. Mon. Wea. Rev., 121, 734–752.

  • Cheng, C.-C., N.-S. Hsu, and C.-C. Wei, 2008: Decision-tree analysis on optimal release of reservoir storage under typhoon warnings. Nat. Hazards, 44, 65–84.

  • Dong, D., and T. J. MacAvoy, 1996: Batch tracking via nonlinear principal component analysis. AIChE J., 42, 2199–2208.

  • Ebert, E. E., U. Damrath, W. Wergen, and M. E. Baldwin, 2003: The WGNE assessment of short-term quantitative precipitation forecasts (QPFs) from operational numerical weather prediction models. Bull. Amer. Meteor. Soc., 84, 481–492.

  • Fan, T.-H., and Y.-H. Lee, 2007: A Bayesian mixture model with application to typhoon rainfall predictions in Taipei. Int. J. Contemp. Math. Sci., 2, 639–648.

  • Fausett, L., 1994: Fundamentals of Neural Networks. Prentice Hall, 461 pp.

  • Fox, N. I., and C. K. Wikle, 2005: A Bayesian quantitative precipitation nowcast scheme. Wea. Forecasting, 20, 264–275.

  • French, M. N., W. F. Krajewski, and R. R. Cuykendall, 1992: Rainfall forecasting in space and time using a neural network. J. Hydrol., 137 (1–4), 1–31.

  • Hsiao, L.-F., C.-S. Liou, T.-C. Yeh, Y.-R. Guo, D.-S. Chen, K.-N. Huang, C.-T. Terng, and J.-H. Chen, 2010: A vortex relocation scheme for tropical cyclone initialization in advanced research WRF. Mon. Wea. Rev., 138, 3298–3315.

  • Hsieh, W. W., 2001: Nonlinear principal component analysis by neural networks. Tellus, 53A, 599–615.

  • Hsu, N.-S., and C.-C. Wei, 2007: A multipurpose reservoir real-time operation model for flood control during typhoon invasion. J. Hydrol., 336, 282–293.

  • Hwang, Y.-S., and S.-Y. Bang, 1997: Recognition of unconstrained handwritten numerals by a radial basis function neural network classifier. Pattern Recognit. Lett., 18, 657–664.

  • Jolliffe, I. T., 1986: Principal Component Analysis. Springer-Verlag, 487 pp.

  • Lee, C.-S., L.-R. Huang, H.-S. Shen, and S.-T. Wang, 2006: A climatology model for forecasting typhoon rainfall in Taiwan. Nat. Hazards, 37, 87–105.

  • Lee, C.-S., Y.-C. Liu, and F.-C. Chien, 2008: The secondary low and heavy rainfall associated with typhoon Mindulle (2004). Mon. Wea. Rev., 136, 1260–1283.

  • Lin, G.-F., and M.-C. Wu, 2009: A hybrid neural network model for typhoon-rainfall forecasting. J. Hydrol., 375, 450–458.

  • Lin, Y.-H., and P.-C. Lee, 2007: Novel high-precision grey forecasting model. Autom. Constr., 16, 771–777.

  • Liu, R.-X., J. Kuang, Q. Gong, and X. L. Hou, 2003: Principal component regression analysis with SPSS. Comput. Methods Programs Biomed., 71, 141–147.

  • Lu, W.-Z., W.-J. Wang, X.-K. Wang, S.-H. Yan, and J.-C. Lam, 2004: Potential assessment of a neural network model with PCA/RBF approach for forecasting pollutant trends in Mong Kok urban air, Hong Kong. Environ. Res., 96, 79–87.

  • Manly, B. F., and F. J. Bryan, 1986: Multivariate Statistical Methods: A Primer. Chapman and Hall, 159 pp.

  • Monahan, A. H., 2000: Nonlinear principal component analysis by neural networks: Theory and applications to the Lorenz system. J. Climate, 13, 821–835.

  • Moradkhani, H., K. L. Hsu, H. V. Gupta, and S. Sorooshian, 2004: Improved streamflow forecasting using self-organizing radial basis function artificial neural networks. J. Hydrol., 295, 246–262.

  • Park, J., and I. W. Sandberg, 1991: Universal approximation using radial basis functions network. Neural Comput., 3, 246–257.

  • Park, J., and I. W. Sandberg, 1993: Approximation and radial basis function networks. Neural Comput., 5, 305–316.

  • Ripley, B. D., 1996: Pattern Recognition and Neural Networks. Cambridge University Press, 415 pp.

  • Sheng, W.-B., R.-P. Zhang, and C.-L. Lu, 2008: Application of SVM method in tropical cyclone rainfall forecast. Guangdong Meteor., 3, 10–12.

  • Shi, D., D. S. Yeung, and J. Gao, 2005: Sensitivity analysis applied to the construction of radial basis function networks. Neural Networks, 18, 951–957.

  • SPSS, Inc., 2002: Clementine 7.0 User’s Guide. SPSS Press, 741 pp.

  • Todd, M. C., E. C. Barrett, M. J. Beaumont, and J. L. Green, 1995: Satellite identification of rain days over the upper Nile River basin using an optimum infrared rain/no-rain threshold temperature model. J. Appl. Meteor., 34, 2600–2611.

  • Tuleya, R. E., M. DeMaria, and R. J. Kuligowski, 2007: Evaluation of GFDL and simple statistical model rainfall forecasts for U.S. landfalling tropical storms. Wea. Forecasting, 22, 56–70.

  • Ture, M., I. Kurt, and Z. Akturk, 2007: Comparison of dimension reduction methods using patient satisfaction data. Expert Syst. Appl., 32, 422–426.

  • Wang, S.-T., C.-L. Yen, G.-T. Chen, and S.-L. Shieh, 1986: The characteristics of typhoon precipitation and the prediction methods in Taiwan area (III) (in Chinese). Hazards Mitigation Program Tech. Rep. 74-51, National Science Council, Taiwan, 40 pp.

  • Wei, C.-C., and N.-S. Hsu, 2008a: Multireservoir real-time operations for flood control using balanced water level index method. J. Environ. Manage., 88, 1624–1639.

  • Wei, C.-C., and N.-S. Hsu, 2008b: Derived operating rules for a reservoir operation system: Comparison of decision trees, neural decision trees and fuzzy decision trees. Water Resour. Res., 44, W02428, doi:10.1029/2006WR005792.

  • Wei, C.-C., and N.-S. Hsu, 2009: Optimal tree-based release rules for real-time flood control operations on a multipurpose multireservoir system. J. Hydrol., 365, 213–224.

  • Yeh, T.-C., 2002: Typhoon rainfall over Taiwan area: The empirical orthogonal function modes and their applications on the rainfall forecasting. TAO, 13, 449–468.

  • Zhao, Z. Q., and D. S. Huang, 2007: A mended hybrid learning algorithm for radial basis function neural networks to improve generalization capability. Appl. Math. Modell., 31, 1271–1281.
Save
  • Bishop, C. M., 1991: Improving the generalization properties of radial basis function neural networks. Neural Comput., 3, 579581.

  • Bishop, C. M., 1995: Neural Networks for Pattern Recognition. Oxford University Press, 504 pp.

  • Bors, A. G., and Gabbouj M. , 1994: Minimal topology for a radial basis functions neural network for pattern classification. Digital Signal Process., 4, 173188.

    • Search Google Scholar
    • Export Citation
  • Chang, C.-P., Yeh T.-C. , and Chen J.-M. , 1993: Effects of terrain on the surface structure of typhoons over Taiwan. Mon. Wea. Rev., 121, 734752.

    • Search Google Scholar
    • Export Citation
  • Cheng, C.-C., Hsu N.-S. , and Wei C.-C. , 2008: Decision-tree analysis on optimal release of reservoir storage under typhoon warnings. Nat. Hazards, 44, 6584.

    • Search Google Scholar
    • Export Citation
  • Dong, D., and MacAvoy T. J. , 1996: Batch tracking via nonlinear principal component analysis. AIChE J., 42, 21992208.

  • Ebert, E. E., Damrath U. , Wergen W. , and Baldwin M. E. , 2003: The WGNE assessment of short-term quantitative precipitation forecasts (QPFs) from operational numerical weather prediction models. Bull. Amer. Meteor. Soc., 84, 481492.

    • Search Google Scholar
    • Export Citation
  • Fan, T.-H., and Lee Y.-H. , 2007: A Bayesian mixture model with application to typhoon rainfall predictions in Taipei. Int. J. Contemp. Math. Sci., 2, 639648.

    • Search Google Scholar
    • Export Citation
  • Fausett, L., 1994: Fundamentals of Neural Networks. Prentice Hall, 461 pp.

  • Fox, N. I., and Wikle C. K. , 2005: A Bayesian quantitative precipitation nowcast scheme. Wea. Forecasting, 20, 264275.

  • French, M. N., Krajewski W. F. , and Cuykendall R. R. , 1992: Rainfall forecasting in space and time using a neural network. J. Hydrol., 137 (1–4), 131.

    • Search Google Scholar
    • Export Citation
  • Hsiao, L.-F., Liou C.-S. , Yeh T.-C. , Guo Y.-R. , Chen D.-S. , Huang K.-N. , Terng C.-T. , and Chen J.-H. , 2010: A vortex relocation scheme for tropical cyclone initialization in advanced research WRF. Mon. Wea. Rev., 138, 32983315.

    • Search Google Scholar
    • Export Citation
  • Hsieh, W. W., 2001: Nonlinear principal component analysis by neural networks. Tellus, 53A, 599615.

  • Hsu, N.-S., and Wei C.-C. , 2007: A multipurpose reservoir real-time operation model for flood control during typhoon invasion. J. Hydrol., 336, 282293.

    • Search Google Scholar
    • Export Citation
  • Hwang, Y.-S., and Bang S.-Y. , 1997: Recognition of unconstrained handwritten numerals by a radial basis function neural network classifier. Pattern Recognit. Lett., 18, 657664.

    • Search Google Scholar
    • Export Citation
  • Jolliffe, I. T., 1986: Principal Component Analysis. Springer-Verlag, 487 pp.

  • Lee, C.-S., Huang L.-R. , Shen H.-S. , and Wang S.-T. , 2006: A climatology model for forecasting typhoon rainfall in Taiwan. Nat. Hazards, 37, 87105.

    • Search Google Scholar
    • Export Citation
  • Lee, C.-S., Liu Y.-C. , and Chien F.-C. , 2008: The secondary low and heavy rainfall associated with typhoon Mindulle (2004). Mon. Wea. Rev., 136, 12601283.

    • Search Google Scholar
    • Export Citation
  • Lin, G.-F., and Wu M.-C. , 2009: A hybrid neural network model for typhoon-rainfall forecasting. J. Hydrol., 375, 450458.

  • Lin, Y.-H., and Lee P.-C. , 2007: Novel high-precision grey forecasting model. Autom. Constr., 16, 771777.

  • Liu, R.-X., Kuang J. , Gong Q. , and Hou X. L. , 2003: Principal component regression analysis with SPSS. Comput. Methods Programs Biomed., 71, 141147.

    • Search Google Scholar
    • Export Citation
  • Lu, W.-Z., Wang W.-J. , Wang X.-K. , Yan S.-H. , and Lam J.-C. , 2004: Potential assessment of a neural network model with PCA/RBF approach for forecasting pollutant trends in Mong Kok urban air, Hong Kong. Environ. Res., 96, 7987.

    • Search Google Scholar
    • Export Citation
  • Manly, B. F., and Bryan F. J. , 1986: Multivariate Statistical Methods: A Primer. Chapman and Hall, 159 pp.

  • Monahan, A. H., 2000: Nonlinear principal component analysis by neural networks: Theory and applications to the Lorenz system. J. Climate, 13, 821835.

    • Search Google Scholar
    • Export Citation
  • Moradkhani, H., Hsu K. L. , Gupta H. V. , and Sorooshian S. , 2004: Improved streamflow forecasting using self-organizing radial basis function artificial neural networks. J. Hydrol., 295, 246262.

    • Search Google Scholar
    • Export Citation
  • Park, J., and Sandberg I. W. , 1991: Universal approximation using radial basis functions network. Neural Comput., 3, 246257.

  • Park, J., and Sandberg I. W. , 1993: Approximation and radial basis function networks. Neural Comput., 5, 305316.

  • Ripley, B. D., 1996: Pattern Recognition and Neural Networks. Cambridge University Press, 415 pp.

  • Sheng, W.-B., Zhang R.-P. , and Lu C.-L. , 2008: Application of SVM method in tropical cyclone rainfall forecast. Guangdong Meteor., 3, 1012.

    • Search Google Scholar
    • Export Citation
  • Shi, D., Yeung D. S. , and Gao J. , 2005: Sensitivity analysis applied to the construction of radial basis function networks. Neural Networks, 18, 951957.

    • Search Google Scholar
    • Export Citation
  • SPSS, Inc., 2002: Clementine 7.0 User’s Guide. SPSS Press, 741 pp.

  • Todd, M. C., Barrett E. C. , Beaumont M. J. , and Green J. L. , 1995: Satellite identification of rain days over the upper Nile River basin using an optimum infrared rain/no-rain threshold temperature model. J. Appl. Meteor., 34, 26002611.

    • Search Google Scholar
    • Export Citation
  • Tuleya, R. E., DeMaria M. , and Kuligowski R. J. , 2007: Evaluation of GFDL and simple statistical model rainfall forecasts for U.S. landfalling tropical storms. Wea. Forecasting, 22, 5670.

    • Search Google Scholar
    • Export Citation
  • Ture, M., Kurt I. , and Akturk Z. , 2007: Comparison of dimension reduction methods using patient satisfaction data. Expert Syst. Appl., 32, 422426.

    • Search Google Scholar
    • Export Citation
  • Wang, S.-T., Yen C.-L. , Chen G.-T. , and Shieh S.-L. , 1986: The characteristics of typhoon precipitation and the prediction methods in Taiwan area (III) (in Chinese). Hazards Mitigation Program Tech. Rep. 74-51, National Science Council, Taiwan, 40 pp.

  • Wei, C.-C., and Hsu N.-S. , 2008a: Multireservoir real-time operations for flood control using balanced water level index method. J. Environ. Manage., 88, 16241639.

    • Search Google Scholar
    • Export Citation
  • Wei, C.-C., and Hsu N.-S. , 2008b: Derived operating rules for a reservoir operation system: Comparison of decision trees, neural decision trees and fuzzy decision trees. Water Resour. Res., 44, W02428, doi:10.1029/2006WR005792.

    • Search Google Scholar
    • Export Citation
  • Wei, C.-C., and Hsu N.-S. , 2009: Optimal tree-based release rules for real-time flood control operations on a multipurpose multireservoir system. J. Hydrol., 365, 213224.

    • Search Google Scholar
    • Export Citation
  • Yeh, T.-C., 2002: Typhoon rainfall over Taiwan area: The empirical orthogonal function modes and their applications on the rainfall forecasting. TAO, 13, 449468.

    • Search Google Scholar
    • Export Citation
  • Zhao, Z. Q., and Huang D. S. , 2007: A mended hybrid learning algorithm for radial basis function neural networks to improve generalization capability. Appl. Math. Modell., 31, 12711281.

    • Search Google Scholar
    • Export Citation