## 1. Introduction

The island of Taiwan is situated in one of the main paths of western North Pacific typhoons and is affected by an average of four typhoons each year (Chang et al. 1993). When a typhoon is predicted to affect Taiwan or the storm force wind would affect the area as far as 100 km from the coast in the next 24 h, the Central Weather Bureau (CWB) will issue a sea warning (Lee et al. 2006). As soon as the typhoon lands, the upstream watershed receives torrential rain within a short time, which quickly converges downstream. The subsequent flash floods have the potential to cause considerable economic losses and casualties. As a result, prediction of flash floods in an accurate and timely fashion is one of the most important challenges in weather prediction (Wei and Hsu 2009).

In recent years, climatologic data, such as pressure at the typhoon's center and the radius of the typhoon, have been widely used to predict rainfall during typhoon hits. For example, Hsu and Wei (2007), Mackey and Krishnamurti (2001), and Yeh (2002) developed multiple linear regressions using climatologic data to generate ensemble forecasts of typhoon track, intensity, and precipitation. Lonfat et al. (2007) and Tuleya et al. (2007) developed a parametric hurricane rainfall prediction scheme using rainfall climatology and a persistence model. Wei (2012a,b) has recently developed artificial neural network (ANN) techniques using climatologic data for forecasting hourly precipitation during tropical cyclones.

Remotely sensed measurements on board meteorological satellite instruments play an extremely important role in studying the earth climate (Arkin and Ardanuy 1989; Ferraro et al. 1996). Data from passive microwave satellite instruments are now widely employed to derive climatologic values of various atmospheric water cycle components (Klepp and Bakan 2000). Many approaches to retrieving rainfall rate from passive microwave measurements have been proposed (Dietrich et al. 2000). For example, the National Oceanic and Atmospheric Administration (NOAA) Satellite Research Laboratory used an empirically derived function of the lower-frequency channels of the Special Sensor Microwave Imager (SSM/I) to predict vertically polarized 85.5-GHz brightness temperatures expected in the absence of precipitation (Ferraro et al. 1994; Grody 1991; Petty and Krajewski 1996). Spencer et al. (1989) identified the precipitation in warm and cold land and ocean environments from SSM/I. Chiu et al. (1990) derived a nonlinear relation between rain rate and microwave temperature. Moreover, researchers, such as Atlas et al. (2005), Biancamaria et al. (2008), Dietrich et al. (2000), Gan et al. (2009), Liu and Curry (1997), Mishra et al. (2009), Nativi et al. (1997), and Nesbitt et al. (2006), have demonstrated the usability of microwave sensor data.

Wei and Roan (2012) addressed the rainfall retrieval problem for quantitative precipitation forecasting over land during tropical cyclones. They developed support vector machines for regression (SVR), a scattering index over land approach (SIL), and a hybrid SIL–SVR model for rainfall-rate retrievals. The brightness temperature data from SSM/I on board the satellites were employed to retrieve quantitative precipitation. The feasibility of SVR and SIL–SVR was examined through comparison with traditional regression and SIL approaches. The above-mentioned studies, such as Wei and Roan (2012), have examined the usability of microwave sensor data in the retrieval of rainfall rates in typhoons. However, the combined use of both microwave sensor data and climatologic characteristics has not been studied in the past.

Forecasting the behavior of complex systems has been a broad application domain for machine learning. In the fields of hydrology and water resource engineering, machine learning has been successfully applied to the prediction of rainfall, rainfall runoff, and river stage (Chau et al. 2005; Cheng et al. 2008; Lin et al. 2006; Muttil and Chau 2006; Tsai et al. 2012; Wang et al. 2012; Wei 2012a,b; Wu et al. 2009). As is known, Bayesian networks (BN) serve as a graphical knowledge representation tool (Pearl 1988), and might be one of the most prominent approaches when considering the ease of knowledge interpretation. In recent years, BN has drawn considerable attention owing to its high predictive ability for a wide range of applications, as demonstrated by Ahn and Ezawa (1997), Balov (2011), Friedman et al. (1997), Guo et al. (2007), Uusitalo (2007), Verron et al. (2010), Wong et al. (2004), and Zhu (2003).

This paper focuses on addressing the rainfall prediction problem for quantitative precipitation forecasts over land during tropical cyclones. The study area is the watershed of the Tanshui River in Taiwan. To improve the typhoon precipitation forecast efficiency, this study develops BN and logistic regression (LR) models using three types of datasets comprising typhoon climatologic data, hydrological rainfall rates, and the SSM/I microwave data; their feasibility under different rain intensities is also examined.

## 2. Methodology

This section presents a procedure for conceptualizing the precipitation forecast processes. As shown in Fig. 1, the proposed procedure involves six steps. These steps are described as follows.

- receive a tropical storm warning over ocean and land issued by the CWB, and assess whether the typhoon is approaching the study area or not (the assessment is made according to the predicted typhoon path issued by the CWB);
- collect typhoon climatologic data {*A*_{1}, *A*_{2}, … , *A*_{6}} (denoted as dataset *A*; see section 4 for more details) issued by the CWB, obtain hourly surface rainfalls {*P*_{*t*−*L*}, … , *P*_{*t*−1}, *P*_{*t*}} (denoted as *P*, where *P* is hourly precipitation, *t* is time index, and *L* is lag time) from automatic rainfall gauges managed by the Water Resources Agency (WRA), and gather the SSM/I microwave data from NOAA;
- check if there are scanned microwave data at the moment *t*; if “yes,” then obtain the vertically polarized brightness temperatures {*B*_{1}, *B*_{2}, *B*_{3}, *B*_{4}} = {Tb_{19V}, Tb_{22V}, Tb_{37V}, Tb_{85V}} at 19.35, 22.23, 37.0, and 85.5 GHz (denoted as *B*); otherwise, generate the brightness temperatures in the last two time periods during typhoon hits using the extrapolation method;
- run the rainfall forecast model using BN and LR with inputs from the three previous datasets *A*, *P*, and *B*;
- derive the forecasted rainfall at time *t* + 1; and
- record the forecasted value and update the time period *t* = *t* + 1.
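The fallback branch of the third step can be illustrated with a minimal sketch. It assumes a linear two-point extrapolation from the last two time periods; the text only names "the extrapolation method," so this choice, the function name `extrapolate_tb`, and the sample values are hypothetical.

```python
import numpy as np

def extrapolate_tb(tb_prev2, tb_prev1):
    """Extrapolate brightness temperatures one time step ahead.

    tb_prev2, tb_prev1: channel vectors (Tb19V, Tb22V, Tb37V, Tb85V) in K
    for the two most recent time periods. Linear extrapolation is an
    assumption; the source does not specify the scheme.
    """
    tb_prev2 = np.asarray(tb_prev2, dtype=float)
    tb_prev1 = np.asarray(tb_prev1, dtype=float)
    # Continue the most recent trend by one step.
    return tb_prev1 + (tb_prev1 - tb_prev2)

# Made-up values for illustration only.
tb_t = extrapolate_tb([270.0, 268.0, 262.0, 250.0],
                      [268.0, 267.0, 260.0, 245.0])
```

In this toy case every channel simply continues its last hourly change; a real implementation would also need to decide how long such extrapolated values remain usable.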

## 3. Model development

This section presents the model development with steps for rainfall forecasting during typhoons:

- step 1—select the specific study area,
- step 2—define attributes affecting the behavior of rainfall rate; construct datasets of *A*, *P*, and *B*,
- step 3—classify patterns into training and validation datasets,
- step 4—design cases with different combinations of the three datasets,
- step 5—construct various case models and process them in training and validation stages,
- step 6—define the performances of skill scores to assess the suitable cases,
- step 7—compare the results derived from case models.

The model theories and constructions are described in the following.

### a. Theory of Bayesian networks

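BN (Langley et al. 1992) is a classification method developed from Bayes's theorem. A BN is a graph $G$ that models probabilistic relationships among a set of random variables $U = \{X_1, \ldots, X_n, \Omega\}$, where $n$ denotes the number of attributes, $X_1, \ldots, X_n$ are the attribute terms, and $\Omega$ is the class variable. Each variable in $U$ has specific states or values denoted by lowercase letters: $x_1, \ldots, x_n, \omega$.

In the naïve Bayes form, $X_i$ and $X_j$ ($X_i \neq X_j$) are conditionally independent given the class label of node $\Omega$. Hence, $x_i$ is conditionally independent of $x_j$ given class $\omega$, whenever

$$P(x_i \mid x_j, \omega) = P(x_i \mid \omega).$$

By Bayes's theorem, the probability that an instance $\mathbf{x} = (x_1, \ldots, x_n)$ belongs to the class $\omega_k$ is

$$P(\omega_k \mid \mathbf{x}) = \frac{P(\omega_k) \prod_{i=1}^{n} P(x_i \mid \omega_k)}{\sum_{l=1}^{t} P(\omega_l) \prod_{i=1}^{n} P(x_i \mid \omega_l)}, \qquad k = 1, \ldots, t,$$

where $t$ corresponds to the number of class labels. The conditional probability density functions $P(x_i \mid \omega_k)$ are estimated from the training data, and the classification of $\mathbf{x}$ is performed by assigning the class label with the largest posterior probability to $\mathbf{x}$.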
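As an illustration of the classification rule under the conditional independence assumption, the following minimal sketch computes class posteriors for discrete attributes and picks the most probable class. The function name, the table layout, and all probabilities are made up for the example and are not taken from this study.

```python
import numpy as np

def naive_bayes_posterior(priors, cond_tables, x):
    """Return P(omega_k | x) for every class k under naive Bayes.

    priors:      length-t sequence of class priors P(omega_k)
    cond_tables: list of n arrays; cond_tables[i][k][v] = P(x_i = v | omega_k)
    x:           length-n sequence of discrete attribute values
    """
    joint = np.array(priors, dtype=float)
    for table, value in zip(cond_tables, x):
        # Multiply in P(x_i | omega_k) for the observed value of attribute i.
        joint = joint * np.asarray(table, dtype=float)[:, value]
    return joint / joint.sum()  # normalize over the t classes

# Toy problem: t = 2 classes, n = 2 binary attributes (hypothetical numbers).
priors = [0.6, 0.4]
cond_tables = [
    [[0.8, 0.2], [0.3, 0.7]],  # P(x_1 | omega)
    [[0.5, 0.5], [0.1, 0.9]],  # P(x_2 | omega)
]
posterior = naive_bayes_posterior(priors, cond_tables, x=[1, 1])
predicted_class = int(np.argmax(posterior))
```

The normalization step only rescales the scores, so the predicted class could equally be read from the unnormalized products.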

### b. BN model construction

The dual nature of a BN makes its construction a two-stage process. Stage 1 involves determining a network structure and stage 2 involves establishing the probability tables.

#### 1) Determining BN structure

To construct a BN model, the network structure that best matches a given training set needs to be found. Naïve Bayes is the simplest form of BN classifiers (Duda and Hart 1973). It is obvious that the conditional independence assumption in naïve Bayes is rarely true in reality, which would harm its performance in applications with complex attribute dependencies (Jiang et al. 2012). Figure 2a shows graphically an example of naïve Bayes. In naïve Bayes, each attribute node has the class node as its parent, but does not have any parent from other attribute nodes.

Numerous algorithms have been proposed to improve naïve Bayes by weakening its conditional attribute independence assumption (Pernkopf 2005). In recent years, the tree augmented naïve Bayes (TAN) approach has demonstrated remarkable classification performance and is competitive with the general BN classifiers in terms of classification accuracy or error rate, while maintaining efficiency and simplicity (Madden 2009). Figure 2b shows an example of TAN. TAN is a specific case of the general BN classifier. In the general classifier, the class node also points directly to all attribute nodes, but there is no limitation on the arcs among attribute nodes; in TAN, each attribute node has at most one other attribute node as a parent. In practice, TAN is a good trade-off between model complexity and learnability (Xiao et al. 2009).

The quality of a network structure $B_S$ for a database $D$ is measured by the Bayesian metric, in which $P(B_S)$ is the prior on the network structure, $n$ is the number of variables, $\Gamma$ is the gamma function, $r_i$ ($1 \le i \le n$) is the cardinality of $x_i$, and $q_i$ denotes the cardinality of the parent set of $x_i$ in $B_S$, that is, the number of different values to which the parents of $x_i$ can be instantiated. Therefore, $q_i$ can be calculated as the product of cardinalities of nodes in $\mathrm{pa}(x_i)$, which is the set of parents of $x_i$ in $B_S$; that is,

$$q_i = \prod_{x_j \in \mathrm{pa}(x_i)} r_j.$$

Here $N_{ij}$ ($1 \le i \le n$, $1 \le j \le q_i$) denotes the number of records in $D$ for which $\mathrm{pa}(x_i)$ takes its $j$th value, and $N_{ijk}$ ($1 \le i \le n$, $1 \le j \le q_i$, $1 \le k \le r_i$) is the number of records in $D$ for which $\mathrm{pa}(x_i)$ takes its $j$th value and for which $x_i$ takes its $k$th value; therefore,

$$N_{ij} = \sum_{k=1}^{r_i} N_{ijk}.$$
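The counting quantities used by the metric can be made concrete with a short sketch. It tabulates the parent-set cardinality and the record counts for one node from an integer-coded dataset; the function name, the zero-based indexing, and the toy data are assumptions for illustration.

```python
import numpy as np
from itertools import product

def count_tables(data, child, parents, card):
    """Return (q_i, N_ij, N_ijk) for one node x_i.

    data:    integer array (records x variables), values in 0..card[v]-1
    child:   column index of x_i
    parents: list of column indices forming pa(x_i)
    card:    list with the cardinality r_v of every variable
    """
    r_i = card[child]
    # q_i is the product of the parents' cardinalities (1 if no parents).
    q_i = int(np.prod([card[p] for p in parents]))
    # Fix an ordering of parent configurations: j = 0 .. q_i - 1.
    configs = {cfg: j for j, cfg in
               enumerate(product(*[range(card[p]) for p in parents]))}
    n_ijk = np.zeros((q_i, r_i), dtype=int)
    for row in data:
        j = configs[tuple(int(row[p]) for p in parents)]
        n_ijk[j, int(row[child])] += 1
    n_ij = n_ijk.sum(axis=1)  # N_ij is the sum of N_ijk over k
    return q_i, n_ij, n_ijk

# Toy data: three binary variables; column 2 has columns 0 and 1 as parents.
data = np.array([[0, 0, 1], [1, 0, 0], [1, 1, 1], [0, 1, 1], [1, 1, 0]])
q_i, n_ij, n_ijk = count_tables(data, child=2, parents=[0, 1], card=[2, 2, 2])
```

A structure-scoring routine would evaluate such tables for every node and every candidate parent set.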

#### 2) Estimating BN model parameters

With the estimator, the conditional probability tables (CPTs) can be established by estimating the CPT of a node *x*_{*i*}, given its parents pa(*x*_{*i*}), as a weighted average of all CPTs of *x*_{*i*} given subsets of pa(*x*_{*i*}).

The alpha parameter is tuned iteratively until either the maximum number of training cycles *T* or the minimal error rate *ɛ*_{min} is reached. The error rate *ɛ* is defined in terms of RMSE(alpha), the root-mean-square error at a specific alpha value. The steps are described as follows:

- step 1—begin at the initial alpha value = 0.0 and training cycle *τ* = 1;
- step 2—check if *τ* = *T* (set to 10), then go to step 5; otherwise, continue with step 3;
- step 3—run the model and calculate the error rate *ɛ*;
- step 4—check if *ɛ* < *ɛ*_{min} (set to 10^{−5}), then continue with step 5; otherwise, let *τ* = *τ* + 1 and alpha = alpha + 0.1 and go back to step 2;
- step 5—record the alpha value and stop the training model.
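The five steps can be sketched as a simple loop. The callable `error_rate` is a hypothetical stand-in for training the model at a given alpha and returning *ɛ*; its definition, the function name, and the toy error curve are assumptions made for this example.

```python
def search_alpha(error_rate, max_cycles=10, eps_min=1e-5, step=0.1):
    """Raise alpha in fixed steps until tau == T or the error drops below eps_min.

    error_rate: hypothetical callable mapping alpha to the error rate.
    Returns the recorded alpha and the training cycle at which the loop stopped.
    """
    tau = 1
    alpha = 0.0                          # step 1
    while tau != max_cycles:             # step 2
        eps = error_rate(alpha)          # step 3
        if eps < eps_min:                # step 4
            break
        tau += 1
        # Rounding avoids floating-point drift in the repeated 0.1 increments.
        alpha = round(alpha + step, 10)
    return alpha, tau                    # step 5

# Made-up error curve whose error vanishes at alpha = 0.3.
alpha, tau = search_alpha(lambda a: abs(a - 0.3))
```

With an error function that never falls below the tolerance, the loop stops when the cycle counter reaches *T* = 10, at which point alpha has been raised nine times.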

BN has a strong capacity for processing discrete variables, but is weak at handling continuous variables. Therefore, the continuous variables for a target need to be converted into discrete values (Liang et al. 2012). For data processing, the class of target is discretized by 1-mm rainfall. The results of the training process are presented in section 4.

### c. Construction of logistic regression

Regression is one of the most important statistical methods applied to science, engineering, and management. LR, also called logistic discrimination, is a regression method for predicting a dichotomous dependent variable. LR has long been a tool for the statistics community. In this study, the target in LR is the same as that in BN. The following summarizes the basic LR theories.

The logistic model for $p$ independent variables can be written as

$$P(Y = 1) = \frac{1}{1 + \exp[-(\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p)]},$$

where $P(Y = 1)$ is the probability of presence and $\beta_0, \beta_1, \ldots, \beta_p$ are regression coefficients. There is a linear model hidden within the logistic regression model. The natural logarithm of the ratio of $P(Y = 1)$ to $1 - P(Y = 1)$ gives a linear model; that is,

$$g(x) = \ln\left[\frac{P(Y = 1)}{1 - P(Y = 1)}\right] = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p,$$

where $g(x)$ has many of the desirable properties of a linear regression model. The independent variables can be a combination of continuous and categorical variables. More information can be found in Hosmer and Lemeshow (2000) and Kurt et al. (2008).
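The link between the logistic model and its hidden linear model can be checked numerically. The sketch below evaluates the standard logistic function for made-up coefficients and recovers the linear predictor through the log-odds; the coefficient values and function names are illustrative only.

```python
import numpy as np

def logistic(x, beta0, beta):
    """P(Y = 1) for predictors x, intercept beta0, and coefficients beta."""
    g = beta0 + np.dot(beta, x)          # linear predictor g(x)
    return 1.0 / (1.0 + np.exp(-g))

def logit(p):
    """Natural logarithm of the odds p / (1 - p)."""
    return np.log(p / (1.0 - p))

# Hypothetical coefficients and inputs: g(x) = -1 + 0.5*2 + 0.25*4 = 1.
beta0, beta = -1.0, np.array([0.5, 0.25])
x = np.array([2.0, 4.0])
p = logistic(x, beta0, beta)
g_recovered = logit(p)
```

Applying the log-odds transform to the predicted probability returns the linear predictor, which is why the coefficients can be interpreted on the log-odds scale.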

## 4. Application

The proposed methodology is applied to the watershed of the Tanshui River located in northern Taiwan (see Fig. 4). The watershed covers an area of 2726 km^{2}. The spatial range of the watershed spans 24.43°–25.19°N and 121.20°–121.86°E. Metropolitan Taipei is located in the region and has a population of about six million. In summer and fall, tropical cyclones with torrential rain occur frequently because of the attributes of the subtropical climate.

### a. Sources and data refined

This study collects a total of 70 typhoon events affecting the watershed over the years 1997–2008 (listed in Table 1). Figure 5 indicates historical tracks and landfalls of typhoons on Taiwan during these years. Three types of data are collected and their statistics can be seen in Table 2. First, the typhoon dataset *A* comprises the attributes of typhoon climatologic characteristics, including the pressure at typhoon center *A*_{1}, the latitude of the typhoon center *A*_{2}, the longitude of the typhoon center *A*_{3}, the radius of the typhoon *A*_{4}, the speed of the typhoon *A*_{5}, and the maximum wind speed of the typhoon center *A*_{6}.

Table 1. Typhoon events studied.

Table 2. Dataset attributes.

The second dataset *P* is obtained from automatic rainfall gauges (see Fig. 4). The 16 gauges are located in Shihmen, Kaoyi, Kalahe, Hsiayung, Shichiusan, Shanshia, Tabau, Fushan, Tatungshang, Chungcheng, Shiding, Pinglin, Bihu, Wudu, Ruifang, and Chuchihu. The mean and standard deviation of the annual rainfalls for these gauges are 2866 and 811 mm, respectively. Moreover, the mean elevation and standard deviation calculated in this study are 497 and 530 m, respectively. Both topographical and climatic conditions exhibit high variability within this basin.

The most frequently adopted algorithms for estimating average rainfall within a certain watershed are the arithmetical averaging method, Thiessen polygons method, height balance polygons method (HBPM), and isohyetal method. To calculate the average hourly precipitation of all gauges representing the rainfall amount over the watershed, the HBPM approach was employed in this study because of the topographical properties. The main difference between the HBPM and other approaches is that weighted functions are calculated according to the elevations of rain gauges; that is, topographic factors are involved in the estimation. The main steps of HBPM are as follows: 1) digitize rain stations and watershed polygon information, 2) triangulate irregular networks, 3) estimate central points of elevation between two rain stations, 4) connect all elevations of central points of triangulated irregular networks, 5) create height balance polygons, 6) analyze intersection polygons between height balance polygons and watershed polygons, and 7) compute elevation weights of HBPM. More details of the HBPM method can be found in Chow et al. (1988).
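Once the elevation weights are available, the final averaging step can be sketched as a weighted mean of the gauge readings. This is a simplification that assumes the HBPM weights enter as a normalized linear combination, as in other polygon methods; the weights and rainfall values below are made up, and the polygon construction itself is not reproduced.

```python
import numpy as np

def areal_rainfall(gauge_rain, weights):
    """Weighted average of gauge rainfall (mm) for one hour.

    weights: one nonnegative weight per gauge (hypothetical here); they are
    renormalized so that they sum to 1.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return float(np.dot(w, np.asarray(gauge_rain, dtype=float)))

# Three hypothetical gauges for illustration (the study uses 16).
areal = areal_rainfall([10.0, 20.0, 30.0], [0.5, 0.3, 0.2])
```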

The third dataset *B*, containing brightness temperature records, is obtained using the SSM/I satellite instrument. The records for the spatial range of the watershed during typhoon periods are searched. SSM/I is part of the instrument suite flown on board the Defense Meteorological Satellite Program (DMSP) series of satellites (Raytheon Systems Company 2000). As mentioned earlier, the measurements from the SSM/I instrument comprise the terrestrial brightness temperatures at the 19.35, 22.235, 37.0, and 85.5 GHz frequency channels. The spatial sampling interval is 25 km for the four frequencies. Because the basin area of 2726 km^{2} is greater than the 25 km × 25 km sampling footprint, the values are averaged over the multiple grid cells of the remote sensing data that cover the watershed.
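A minimal sketch of the grid averaging is given below. It selects grid cells by the latitude and longitude range of the watershed stated earlier in this section, which is an approximation of the actual watershed polygon; the function name and the sample cells are hypothetical.

```python
import numpy as np

# Spatial range of the watershed as stated in the text.
LAT_MIN, LAT_MAX = 24.43, 25.19
LON_MIN, LON_MAX = 121.20, 121.86

def basin_mean_tb(lat, lon, tb):
    """Average one channel's brightness temperature over cells in the range.

    lat, lon, tb: 1D sequences with one entry per grid cell.
    Returns None when no scanned cell falls inside the range.
    """
    lat, lon, tb = (np.asarray(v, dtype=float) for v in (lat, lon, tb))
    inside = ((lat >= LAT_MIN) & (lat <= LAT_MAX) &
              (lon >= LON_MIN) & (lon <= LON_MAX))
    if not inside.any():
        return None
    return float(tb[inside].mean())

# Made-up cells: the third one lies north of the watershed and is ignored.
mean_tb = basin_mean_tb([24.5, 25.0, 26.0], [121.5, 121.6, 121.5],
                        [250.0, 260.0, 270.0])
```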

According to the above data analysis, the attributes of the typhoon climatologic data *A*, the average hourly precipitation *P*, and the brightness temperatures *B* are refined. Table 2 lists these attributes, symbols, mean values, and domain ranges. In total, there are 2704 hourly records affecting the watershed available for the studied typhoons.

### b. Cases

This study compares the rainfall forecasts obtained by the following five model cases.

- case 1—prediction using only typhoon climatologic data (i.e., *A*),
- case 2—prediction using only hydrological information (i.e., *P*),
- case 3—prediction using only brightness temperatures (i.e., *B*),
- case 4—prediction using both datasets *A* and *P*,
- case 5—prediction using all datasets *A*, *P*, and *B*.

Note that because the travel time of flow from upstream to downstream in the basin is often less than 6 h owing to the small watershed area, *L* is set to 5 h (Wei and Hsu 2008). As mentioned earlier, the target should be discretized for both BN and LR model training. Accordingly, the hourly rainfalls are converted into categorical data with class *k*, where *k* ∈ {0, 1, …, 51}. Here, *k* = 51 corresponds to the maximum rainfall.
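The lagged inputs and the discretized target can be sketched as follows. The sketch assumes that rainfall is rounded down to 1-mm classes and capped at class 51, and that one uninterrupted hourly series is processed; neither detail is stated explicitly in the text, and the rainfall values are made up.

```python
import numpy as np

LAG = 5       # lag time L in hours, as set in the text
K_MAX = 51    # highest rainfall class

def discretize(rain_mm):
    """Map hourly rainfall (mm) to classes 0..51 with 1-mm bins.

    Rounding down is an assumption; values above the top bin share class 51.
    """
    return np.clip(np.floor(rain_mm), 0, K_MAX).astype(int)

def lagged_inputs(p, lag=LAG):
    """Pair each row [P_{t-L}, ..., P_{t-1}, P_t] with the class of P_{t+1}."""
    p = np.asarray(p, dtype=float)
    rows = [p[t - lag:t + 1] for t in range(lag, len(p) - 1)]
    targets = discretize(p[lag + 1:])
    return np.array(rows), targets

# Hypothetical hourly rainfall series (mm).
rain = [0.0, 0.4, 1.2, 3.7, 8.9, 15.3, 60.2, 2.1]
X, y = lagged_inputs(rain)
```

In practice the rows would be built separately for each typhoon event so that no input window spans two events.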

### c. Model training

The cross-validation subsampling approach was employed. The entire dataset was randomly partitioned into 10 equal-sized subsets. During each run, one of the partitions was chosen for testing, while the rest of them were used for training. For training the BN model, because the alpha parameter varied from case to case, the sensitivity analysis was employed to choose the suitable alpha values.
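The partitioning can be sketched with a random permutation of record indices. The record count of 2704 comes from the text; the seed and function name are arbitrary, and because 2704 is not divisible by 10 the subsets differ in size by at most one record.

```python
import numpy as np

def kfold_indices(n_records, k=10, seed=0):
    """Randomly partition record indices into k nearly equal-sized subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_records)
    return np.array_split(order, k)

folds = kfold_indices(2704)
splits = []
for i, test_idx in enumerate(folds):
    # All other subsets form the training set of run i.
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    splits.append((train_idx, test_idx))
```

Each record is used for testing exactly once across the 10 runs.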

Figure 6 plots the results of RMSE(alpha) for cases 1–5 with alpha parameters ranging from 0.1 to 2.0. As seen in Fig. 6, the RMSE values were within the ranges of 4.565–4.936, 3.630–3.950, 4.287–4.625, 3.213–3.466, and 3.192–3.436 for cases 1–5, respectively. The outcomes show that RMSE varies noticeably with the alpha value: as alpha increases, the curve drops over the range of 0.1–0.3 and rises over the range of 0.3–2.0. The appropriate alpha values could range from 0.1 to 0.5 because of their lower RMSE measures. This study selected an alpha value of 0.3 for each model analysis.

## 5. Model analysis and evaluation

### a. Comparisons of different data sources

#### 1) Definitions of MAE and RMSE

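The mean absolute error (MAE) and the root-mean-square error (RMSE) are defined as

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| P_i^{\mathrm{pre}} - P_i^{\mathrm{obs}} \right|$$

and

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( P_i^{\mathrm{pre}} - P_i^{\mathrm{obs}} \right)^2},$$

where $P_i^{\mathrm{pre}}$ is the predicted precipitation at record $i$, $P_i^{\mathrm{obs}}$ is the observed precipitation at record $i$, and $N$ is the number of hourly records. Note that the observed precipitation refers to the average hourly precipitation of all gauges estimated using the HBPM method. Generally, lower values of the two criteria indicate better performance.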
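Both criteria are straightforward to compute from paired hourly records. The sketch below uses the standard definitions of MAE and RMSE; the function names and the three sample records are made up for illustration.

```python
import numpy as np

def mae(obs, pred):
    """Mean absolute error between observed and predicted precipitation."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return float(np.mean(np.abs(pred - obs)))

def rmse(obs, pred):
    """Root-mean-square error between observed and predicted precipitation."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

# Hypothetical records (mm per hour): errors are 0, 1, and 2.
obs, pred = [0.0, 1.0, 2.0], [0.0, 2.0, 4.0]
mae_value = mae(obs, pred)
rmse_value = rmse(obs, pred)
```

Because RMSE squares the errors before averaging, it penalizes the single 2-mm miss more heavily than MAE does.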

#### 2) Analysis

Figure 7 shows the results for cases 1–5 tested using the BN and LR models. As can be seen, case 5 has the best MAE results, with MAEs of 1.236 and 1.275 for BN and LR, respectively. Moreover, case 5 also has the best RMSE results, with RMSEs of 3.192 and 3.071 for BN and LR, respectively. These results reveal that the combination of using all datasets *A*, *P*, and *B* can give better predictions. In contrast, case 1, using only dataset *A*, has the worst prediction performance of all five cases for both models.

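To quantify the gain obtained from each data combination, the improvement rate of MAE for case $i$ is defined relative to case 1 of the BN model as

$$\mathrm{IR}_{\mathrm{MAE}} = \frac{\mathrm{MAE}_{\mathrm{Case1}} - \mathrm{MAE}_i}{\mathrm{MAE}_{\mathrm{Case1}}} \times 100\%,$$

where $\mathrm{MAE}_{\mathrm{Case1}}$ is the MAE value of case 1 obtained by the BN model and $\mathrm{MAE}_i$ is the MAE value of case $i$. Likewise, the improvement rate of RMSE can be defined as in Eq. (14), with RMSE in place of MAE:

$$\mathrm{IR}_{\mathrm{RMSE}} = \frac{\mathrm{RMSE}_{\mathrm{Case1}} - \mathrm{RMSE}_i}{\mathrm{RMSE}_{\mathrm{Case1}}} \times 100\%. \tag{14}$$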
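A small helper makes the calculation explicit. It assumes that the improvement rate is the relative reduction of an error measure with respect to case 1 of the BN model, expressed in percent; the function name and the numbers are placeholders, not results from this study.

```python
def improvement_rate(baseline, value):
    """Relative reduction of an error measure with respect to a baseline, in %.

    baseline: error of the reference case (assumed to be case 1 of BN)
    value:    error of the case being compared
    """
    return (baseline - value) / baseline * 100.0

# Placeholder numbers for illustration only.
ir = improvement_rate(baseline=2.0, value=1.5)
```

A positive rate indicates a smaller error than the reference case, and a negative rate indicates a larger one.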

Table 3 lists the improvement rate of the model cases. For BN-based cases, the MAE and RMSE of case 5 are both better than those of cases 1–4. The reasons might be that coupling microwave sensor data *B* with climatologic characteristics *A* and precipitation *P* can improve the typhoon rainfall forecast accuracy.

Table 3. Improvement rate for BN and LR model cases.

### b. Comparisons between BN and LR models

The scatterplots of the observed versus predicted precipitation are shown in Fig. 8 with the outcomes of cases 2, 4, and 5 derived using the BN and LR models. In addition, the linear regression equation and squared correlation *R*^{2} are presented in Figs. 8a–f. As can be seen, cases 2, 4, and 5 have greater slopes in LR than in BN, indicating that LR gives better estimations that are closer to the observed results. However, the *R*^{2} values of BN in these three cases are better than those of LR. The reasons might be, as proposed by Mitchell (2005, chapter 1), that BN is a learning algorithm with greater bias, but lower variance, than LR.

To assess the ability of such forecasts, it is necessary to have accurate ways of quantifying “forecast skill” (Murphy 1993). Forecast evaluation and verification is confounded by the many possible skill measures that can be employed to summarize the complex behavior (Stephenson 2000). A categorical dichotomous statement is simply a yes–no statement; in this study, it is whether the forecast or observed precipitation is below or above a defined rainfall threshold. The combination of different possibilities between observed and forecast values defines a contingency table (Accadia et al. 2003). In Table 4, the columns display the predicted variables, while the rows show the observed variables. For each precipitation threshold, four categories of hits, false alarms, misses, and correct no-rain forecasts (denoted as *a*, *b*, *c*, and *d*) are defined. There are three categorical scores used in the following analysis.

Table 4. Contingency table of possible events for a selected rainfall threshold.

#### 1) Definition of skill scores

The bias score (BIA) is the ratio of the number of forecast events to the number of observed events above the threshold:

$$\mathrm{BIA} = \frac{a + b}{a + c}.$$

A BIA equal to 1.0 means that the forecast frequency is the same as the observed frequency, and the model is then referred to as unbiased. A BIA greater than 1.0 is indicative of a model that forecasts more events above the threshold than are observed (i.e., overforecasting), while a BIA less than 1.0 shows that the model forecasts fewer events than are observed (i.e., underforecasting).

The equitable threat score (ETS) is defined as

$$\mathrm{ETS} = \frac{a - a_r}{a + b + c - a_r},$$

where $a_r$ is the number of model hits expected from a random forecast; that is,

$$a_r = \frac{(a + b)(a + c)}{a + b + c + d}.$$

ETS ranges from −⅓ to 1, with a value of 1 indicating perfect correspondence between predicted and observed rain occurrences. ETS is now used for rainfall verification at most operational centers (Ebert et al. 2003).

For ideal perfect predictions, the PRE value equals 1.
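The categorical scores can be computed from the four contingency counts. The sketch below uses the standard formulas BIA = (a + b)/(a + c) and ETS = (a − a_r)/(a + b + c − a_r) with a_r = (a + b)(a + c)/(a + b + c + d); PRE is left out, and treating values at or above the threshold as "yes" events is an assumption. The sample data are made up.

```python
import numpy as np

def contingency(obs, fcst, threshold):
    """Count hits (a), false alarms (b), misses (c), and correct no-rain (d)."""
    obs_yes = np.asarray(obs, dtype=float) >= threshold
    fc_yes = np.asarray(fcst, dtype=float) >= threshold
    a = int(np.sum(fc_yes & obs_yes))
    b = int(np.sum(fc_yes & ~obs_yes))
    c = int(np.sum(~fc_yes & obs_yes))
    d = int(np.sum(~fc_yes & ~obs_yes))
    return a, b, c, d

def bia(a, b, c, d):
    """Bias score; undefined when no event is observed (a + c = 0)."""
    return (a + b) / (a + c)

def ets(a, b, c, d):
    """Equitable threat score with the random-hit correction a_r."""
    a_r = (a + b) * (a + c) / (a + b + c + d)
    return (a - a_r) / (a + b + c - a_r)

# Hypothetical hourly rainfall (mm per hour) and a 10 mm per hour threshold.
obs = [0, 5, 12, 20, 1, 8]
fcst = [1, 6, 9, 18, 0, 11]
counts = contingency(obs, fcst, threshold=10)
bia_value, ets_value = bia(*counts), ets(*counts)
```

Repeating the calculation over a range of thresholds gives score curves of the kind discussed in the analysis that follows.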

#### 2) Analysis

Figures 9–11 plot the three measures of skill scores. Figure 9 shows the BIA score as a function of the threshold value, which ranges from 0 to 40 mm h^{−1}, chosen as rainfall thresholds to separate the yes–no events. The top panel in Fig. 9 is for the BN model, while the bottom panel is for the LR model. In Fig. 9a, cases 2, 4, and 5 have roughly a horizontal line of BIA = 1.0 for thresholds between 0 and 7 mm h^{−1}; the BIA scores then decrease smoothly until a threshold of 20 mm h^{−1}, and at higher thresholds (i.e., >20 mm h^{−1}), the BIA score is approximately zero. This trend implies that these cases might correctly predict light rain but underestimate heavy rain. Moreover, Fig. 9b reveals that cases 2 and 4 underestimate the rain over the whole range of 0–40 mm h^{−1}, with BIA values of 0.5–1.0. Additionally, case 5 underestimates the rain for thresholds from 0 to 22 mm h^{−1}, with BIA fluctuating around 1.0 for thresholds >22 mm h^{−1}. The above results reveal that the combined use of all datasets *A*, *P*, and *B* can give better predictions than cases 2 and 4.

Figures 10 and 11 illustrate the ETS and PRE scores for various thresholds, respectively. In Figs. 10a and 11a, the ETS and PRE scores of all model cases decrease at thresholds of 0–12 mm h^{−1}; while in Figs. 10b and 11b, skills are a strong function of thresholds, with both ETS and PRE decreasing from no rain to 40 mm h^{−1}, except for cases 1 and 3. Overall, comparing BN and LR in Figs. 9–11 reveals that BN predicts well for light rain, while LR has good scores for heavier rain.

The BN classifier shows underprediction (in terms of BIA, ETS, and PRE scores) for high rainfall rates, even for case 5. One possible explanation is that the BIA falls rapidly during the transition from stratiform to convective rainfall rates. Grecu et al. (2000) suggested that the transition from stratiform to convective rainfall rates occurs around 10 mm h^{−1}. According to the BN learning structure in Fig. 12, the causality relationship for case 5 was fine-tuned using domain knowledge because, as mentioned earlier, BN is a graphical representation of probability distributions.

The variables included in Fig. 12 are involved in datasets *A*, *P*, and *B*. The structure-learning algorithm can be employed to obtain the key relationships between the indicators. That is to say, there is causality between brightness temperatures at time *t* and the rainfall prediction at *t* + 1, which is, of course, reasonable. However, during the more intense convection associated with a tropical cyclone, the brightness temperature will probably change significantly between *t* and *t* + 1. Thus, the brightness temperature at *t* may not be representative of the brightness temperature at *t* + 1. In other words, when the rainfall rate is high (during intense convection associated with a spiral band) at *t* + 1, the brightness temperature is probably much cooler (corresponding to a higher rainfall rate) than the brightness temperature at *t*. Yet, the rainfall rate is predicted according to brightness temperature at *t* (corresponding to a lower rainfall rate).

### c. Comparisons of different rain intensities

To evaluate the model capability in estimating light, moderate, and heavy rain during the typhoon period, the average MAE and RMSE measures were computed for the observed rainfall within each rain interval. As illustrated in Fig. 7 and Table 3, case 5 gives better estimation than do cases 1–4. Therefore, case 5 was selected to further analyze different rainfall intensities. According to the definitions of rainfall levels by the CWB of Taiwan, the rain intensities can be graded into four major levels: light rain, heavy rain, torrential rain, and pouring rain (Wei 2012b). Their rainfall ranges can be seen in Table 5. Table 6 lists the evaluation of MAE and RMSE measures for these four rainfall intensities. As can be seen, the BN model has lower MAE and RMSE values for light rain, heavy rain, and torrential rain, while the LR model has lower MAE and RMSE values for pouring rain.

Table 5. Definition of rain intensities.

Table 6. Performance of case 5 for different rain intensities.

### d. Merits and limitations

An advantage of LR is its capability of generalization (e.g., Fig. 8); that is, it can correctly process information that only broadly resembles the original training data. LR is also fault tolerant in that it can properly handle noisy or incomplete data (Niculescu 2003). However, multivariate regression describes associations, not causes; therefore, its limitation is that it does not explain the decision.

One of the major advantages of BN is that its graphical representation allows us to interpret causality easily by following the arc directions. Probabilistic inference follows the network structure and is used for classification. Since each covariate in a BN is independent of its nondescendants (given its parents), inference is computationally fast (Stajduhar et al. 2009). In essence, the relations between the variables of the domain can be visualized graphically, in addition to providing an inference mechanism that allows quantifying, in probabilistic terms, the effect of these relations (de Santana et al. 2007). For example, the network in Fig. 12 displays complicated causality relationships between these attributes. However, BN is weak at handling continuous variables; that is, the continuous variables for the target need to be converted into discrete values.

## 6. Summary and conclusions

Prediction of flash floods in an accurate and timely fashion is one of the most important challenges in weather prediction. This paper focuses on addressing the rainfall prediction problem for quantitative precipitation forecasts over land during tropical cyclones. To improve the typhoon precipitation forecast efficiency, this study develops BN and LR models using three different datasets and examines their feasibility under different rain intensities.

The developed models were applied to rainfall predictions for the Tanshui River basin. The microwave sensor data from the SSM/I instrument, the rainfall measurements from WRA of Taiwan, and the climatologic characteristics of typhoons as determined by CWB were collected to estimate quantitative precipitation. This study collected datasets from 70 typhoons affecting the studied watershed over the years 1997–2008. The measurements from SSM/I included the terrestrial brightness temperatures at 19.35, 22.23, 37.0, and 85.5 GHz frequency channels. In this study, MAE, RMSE, BIA, ETS, and PRE were reviewed to assess forecast performance. This study obtained the following results.

### a. Comparisons of different data sources

The different data sources were compared in terms of MAE and RMSE. Five cases of data combinations were tested; that is, cases 1–3 were performed using typhoon climatologic data *A*, hydrological information *P*, and brightness temperatures *B*, respectively, with cases 4 and 5 run using data from *A* and *P*, and *A*, *P*, and *B*, respectively. Results show that the optimal data combination occurred with case 5, where MAE is 1.236 and RMSE is 3.192 for BN, and MAE is 1.275 and RMSE is 3.071 for LR. In other words, the case using all three datasets, *A*, *P*, and *B*, together is better than the other four cases, regardless of the model used.

### b. Comparisons between BN and LR models

Using the three skill scores BIA, ETS, and PRE, BN has BIA scores decreasing smoothly from 1.0 to 0 until a threshold of 20 mm h^{−1}, but a value of approximately zero for a threshold >20 mm h^{−1}, indicating that BN provides a better estimation for light rain than for flash rain. Moreover, LR yields better BIA, ETS, and PRE scores than BN at a threshold >20 mm h^{−1}, implying that LR achieves better predictions for flash rainfall.

### c. Comparisons of different rain intensities

Further, this study evaluated different rainfall intensities, denoted as light rain, heavy rain, torrential rain, and pouring rain. For light, heavy, and torrential rain, BN has smaller MAE and RMSE and provides better rainfall estimation than LR, meaning that BN achieves better predictions for rainfall <14.6 mm h^{−1}. Nevertheless, for pouring-rain situations, LR yields better MAE and RMSE than BN, implying that the LR model achieves better predictions for rainfall >14.6 mm h^{−1}.

The results show that the case involving the use of all three input datasets is better than the other four cases. Moreover, LR can provide better predictions than BN, especially in flash rainfall situations. However, BN might be taken as one of the most prominent approaches when considering the ease of knowledge interpretation. Finally, LR describes associations, not causes, and does not explain the decision.

The support of Grant NSC101-2119-M-464-001 by the National Science Council, Taiwan, is greatly appreciated. The authors would like to acknowledge data provided by the Central Weather Bureau (CWB), the Water Resources Agency (WRA), and the National Oceanic and Atmospheric Administration (NOAA). The writer is also grateful for the constructive comments of the referees.

## REFERENCES

Accadia, C., S. Mariani, M. Casaioli, A. Lavagnini, and A. Speranza, 2003: Sensitivity of precipitation forecast skill scores to bilinear interpolation and a simple nearest-neighbor average method on high-resolution verification grids. *Wea. Forecasting*, **18**, 918–932.

Ahn, J. H., and K. J. Ezawa, 1997: Decision support for real-time telemarketing operations through Bayesian network learning. *Decis. Support Syst.*, **21**, 17–27.

Arkin, P. A., and P. E. Ardanuy, 1989: Estimating climatic-scale precipitation from space: A review. *J. Climate*, **2**, 1229–1238.

Atlas, R., A. Y. Hou, and O. Reale, 2005: Application of SeaWinds scatterometer and TMI-SSM/I rain rates to hurricane analysis and forecasting. *J. Photogramm. Remote Sens.*, **59**, 233–243.

Balov, N., 2011: A Gaussian mixed model for learning discrete Bayesian networks. *Stat. Probab. Lett.*, **81**, 220–230.

Biancamaria, S., N. M. Mognard, A. Boone, M. Grippa, and E. G. Josberger, 2008: A satellite snow depth multi-year average derived from SSM/I for the high latitude regions. *Remote Sens. Environ.*, **112**, 2557–2568.

Bouckaert, R. R., E. Frank, M. Hall, R. Kirkby, P. Reutemann, A. Seewald, and D. Scuse, 2010: *WEKA Manual.* University of Waikato Press, 325 pp.

Chang, C. P., T. C. Yeh, and J. M. Chen, 1993: Effects of terrain on the surface structure of typhoons over Taiwan. *Mon. Wea. Rev.*, **121**, 734–752.

Chau, K. W., C. L. Wu, and Y. S. Li, 2005: Comparison of several flood forecasting models in Yangtze River. *J. Hydrol. Eng.*, **10**, 485–491.

Cheng, C. T., W. C. Wang, D. M. Xu, and K. W. Chau, 2008: Optimizing hydropower reservoir operation using hybrid genetic algorithm and chaos. *Water Resour. Manage.*, **22**, 895–909.

Chiu, L. S., G. R. North, D. A. Short, and A. McConnell, 1990: Rain estimation from satellites: Effect of finite field of view. *J. Geophys. Res.*, **95** (D3), 2177–2185.

Chow, V. T., D. R. Maidment, and L. W. Mays, 1988: *Applied Hydrology.* McGraw-Hill, 572 pp.

Cooper, G., and E. Herskovits, 1992: A Bayesian method for the induction of probabilistic networks from data. *Mach. Learn.*, **9**, 309–347.

de Santana, A. L., C. R. Frances, C. A. Rocha, S. V. Carvalho, N. L. Vijaykumar, L. P. Rego, and J. C. Costa, 2007: Strategies for improving the modeling and interpretability of Bayesian networks. *Data Knowl. Eng.*, **63**, 91–107.

Dietrich, S., R. Bechini, C. Adamo, A. Mugnai, and F. Prodi, 2000: Radar calibration of physical profile-based precipitation retrieval from passive microwave sensors. *Phys. Chem. Earth*, **25B**, 877–882.

Duda, R., and P. Hart, 1973: *Pattern Classification and Scene Analysis.* John Wiley and Sons, 482 pp.

Ebert, E. E., U. Damrath, W. Wergen, and M. E. Baldwin, 2003: The WGNE assessment of short-term quantitative precipitation forecasts (QPFs) from operational numerical weather prediction models. *Bull. Amer. Meteor. Soc.*, **84**, 481–492.

Ferraro, R. R., N. C. Grody, and G. F. Marks, 1994: Effects of surface conditions on rain identification using the SSM/I. *Remote Sens. Rev.*, **11**, 195–209.

Ferraro, R. R., F. Weng, N. C. Grody, and A. Basist, 1996: An eight-year (1987–1994) time series of rainfall, clouds, water vapor, snow cover, and sea ice derived from SSM/I measurements. *Bull. Amer. Meteor. Soc.*, **77**, 891–905.

Friedman, N., D. Geiger, and M. Goldszmidt, 1997: Bayesian network classifiers. *Mach. Learn.*, **29**, 131–163.

Gan, T. Y., O. Kalinga, and P. Singh, 2009: Comparison of snow water equivalent retrieved from SSM/I passive microwave data using artificial neural network, projection pursuit and nonlinear regressions. *Remote Sens. Environ.*, **113**, 919–927.

Grecu, M., E. N. Anagnostou, and R. F. Adler, 2000: Assessment of the use of lightning information in satellite infrared rainfall estimation. *J. Hydrometeor.*, **1**, 211–221.

Grody, N. C., 1991: Classification of snow cover and precipitation using the Special Sensor Microwave Imager. *J. Geophys. Res.*, **96** (D4), 7423–7435.

Guo, S., G. Xu, H. Zhang, and C. Li, 2007: A real-time flood updating model based on the Bayesian method. *Methodol. Hydrol.*, **311**, 210–215.

Hamill, T. M., 1999: Hypothesis tests for evaluating numerical precipitation forecasts. *Wea. Forecasting*, **14**, 155–167.

Hosmer, D. W., and S. Lemeshow, 2000: *Applied Logistic Regression.* John Wiley and Sons, 373 pp.

Hsu, N. S., and C.-C. Wei, 2007: A multipurpose reservoir real-time operation model for flood control during typhoon invasion. *J. Hydrol.*, **336**, 282–293.

Jiang, L., Z. Cai, D. Wang, and H. Zhang, 2012: Improving tree augmented naïve Bayes for class probability estimation. *Knowl.-Based Syst.*, **26**, 239–245.

Klepp, C., and S. Bakan, 2000: Satellite derived energy and water cycle components in North Atlantic cyclones. *Phys. Chem. Earth*, **25**, 65–68.

Kurt, I., M. Ture, and A. T. Kurum, 2008: Comparing performances of logistic regression, classification and regression tree, and neural networks for predicting coronary artery disease. *Expert Syst. Appl.*, **34**, 366–374.

Langley, P., W. Iba, and K. Thompson, 1992: An analysis of Bayesian classifiers. *Proc. 10th National Conf. on Artificial Intelligence,* San Jose, CA, Association for the Advancement of Artificial Intelligence, 223–228.

Lee, C. S., L. R. Huang, H. S. Shen, and S. T. Wang, 2006: A climatology model for forecasting typhoon rainfall in Taiwan. *Nat. Hazards*, **37**, 87–105.

Liang, W., D. Zhuang, D. Jiang, J. Pan, and H. Ren, 2012: Assessment of debris flow hazards using a Bayesian network. *Geomorphology*, **171–172**, 94–100.

Lin, J. Y., C. T. Cheng, and K. W. Chau, 2006: Using support vector machines for long-term discharge prediction. *Hydrol. Sci. J.*, **51**, 599–612.

Liu, G., and J. A. Curry, 1997: Precipitation characteristics in Greenland–Iceland–Norwegian Seas determined by using satellite microwave data. *J. Geophys. Res.*, **102** (D12), 13 987–13 997.

Lonfat, M., R. Rogers, T. Marchok, and F. D. Marks Jr., 2007: A parametric model for predicting hurricane rainfall. *Mon. Wea. Rev.*, **135**, 3086–3097.

Mackey, B. P., and T. N. Krishnamurti, 2001: Ensemble forecast of a typhoon flood event. *Wea. Forecasting*, **16**, 399–415.

Madden, M. G., 2009: On the classification performance of TAN and general Bayesian networks. *Knowl.-Based Syst.*, **22**, 489–495.

Mishra, A., R. M. Gairola, A. K. Varma, A. Sarkar, and V. K. Agarwal, 2009: Rainfall retrieval over Indian land and oceanic regions from SSM/I microwave data. *Adv. Space Res.*, **44**, 815–823.

Mitchell, T. M., 2005: *Machine Learning.* McGraw-Hill, 414 pp.

Murphy, A. H., 1993: What is a good forecast? An essay on the nature of goodness in weather forecasting. *Wea. Forecasting*, **8**, 281–293.

Muttil, N., and K. W. Chau, 2006: Neural network and genetic programming for modelling coastal algal blooms. *Int. J. Environ. Pollut.*, **28**, 223–238.

Nativi, S., E. C. Barrett, and M. J. Beaumont, 1997: Monitoring of rainfall integrating active and passive microwave sensors: Possibilities and problems. *Phys. Chem. Earth*, **22**, 229–233.

Nesbitt, S. W., R. Cifelli, and S. A. Rutledge, 2006: Storm morphology and rainfall characteristics of TRMM precipitation features. *Mon. Wea. Rev.*, **134**, 2702–2721.

Niculescu, S. P., 2003: Artificial neural networks and genetic algorithms in QSAR. *J. Mol. Struct. THEOCHEM*, **622**, 71–83.

Pearl, J., 1988: *Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference.* Morgan Kaufmann Publishers, 552 pp.

Pernkopf, F., 2005: Bayesian network classifiers versus selective *k*-NN classifier. *Pattern Recognit.*, **38**, 1–10.

Pernkopf, F., and P. O'Leary, 2003: Floating search algorithm for structure learning of Bayesian network classifiers. *Pattern Recognit. Lett.*, **24**, 2839–2848.

Petty, G. W., and W. F. Krajewski, 1996: Satellite estimation of precipitation over land. *Hydrol. Sci. J.*, **41**, 433–451.

Raytheon Systems Company, 2000: Special Sensor Microwave/Imager (SSM/I) user's interpretation guide (UIG). NOAA Grant UG32268-900, 96 pp.

Spencer, R. W., H. M. Goodman, and R. E. Hood, 1989: Precipitation retrieval over land and ocean with the SSM/I: Identification and characteristics of the scattering signal. *J. Atmos. Oceanic Technol.*, **6**, 254–273.

Stajduhar, I., B. Dalbelo-Basic, and N. Bogunovic, 2009: Impact of censoring on learning Bayesian networks in survival modelling. *Artif. Intell. Med.*, **47**, 199–217.

Stephenson, D. B., 2000: Use of the “odds ratio” for diagnosing forecast skill. *Wea. Forecasting*, **15**, 221–232.

Tsai, C. C., M. C. Lu, and C. C. Wei, 2012: Decision tree-based classifier combined with neural-based predictor for water-stage forecasts in a river basin during typhoons: A case study in Taiwan. *Environ. Eng. Sci.*, **29**, 108–116.

Tuleya, R. E., M. DeMaria, and R. J. Kuligowski, 2007: Evaluation of GFDL and simple statistical model rainfall forecasts for U.S. landfalling tropical storms. *Wea. Forecasting*, **22**, 56–70.

Uusitalo, L., 2007: Advantages and challenges of Bayesian networks in environmental modeling. *Ecol. Modell.*, **203**, 312–318.

Verron, S., J. Li, and T. Tiplica, 2010: Fault detection and isolation of faults in a multivariate process with Bayesian network. *J. Process Control*, **20**, 902–911.

Wang, W. C., C. T. Cheng, K. W. Chau, and D. M. Xu, 2012: Calibration of Xinanjiang model parameters using hybrid genetic algorithm based fuzzy optimal model. *J. Hydroinf.*, **14**, 784–799.

Wei, C.-C., 2012a: RBF neural networks combined with principal component analysis applied to quantitative precipitation forecast for a reservoir watershed during typhoon periods. *J. Hydrometeor.*, **13**, 722–734.

Wei, C.-C., 2012b: Wavelet support vector machines for forecasting precipitation in tropical cyclones: Comparisons with GSVM, regressions, and numerical MM5 model. *Wea. Forecasting*, **27**, 438–450.

Wei, C.-C., and N. S. Hsu, 2008: Derived operating rules for a reservoir operation system: Comparison of decision trees, neural decision trees and fuzzy decision trees. *Water Resour. Res.*, **44**, W02428, doi:10.1029/2006WR005792.

Wei, C.-C., and N. S. Hsu, 2009: Optimal tree-based release rules for real-time flood control operations on a multipurpose multireservoir system. *J. Hydrol.*, **365**, 213–224.

Wei, C.-C., and J. Roan, 2012: Retrievals for the rainfall rate over land using Special Sensor Microwave/Imager data during tropical cyclones: Comparisons of scattering index, regression, and support vector regression. *J. Hydrometeor.*, **13**, 1567–1578.

Wong, M. L., S. Y. Lee, and K. S. Leung, 2004: Data mining of Bayesian networks using cooperative coevolution. *Decis. Support Syst.*, **38**, 451–472.

Wu, C. L., K. W. Chau, and Y. S. Li, 2009: Predicting monthly streamflow using data-driven models coupled with data-preprocessing techniques. *Water Resour. Res.*, **45**, W08432, doi:10.1029/2007WR006737.

Xiao, J., C. He, and X. Jiang, 2009: Structure identification of Bayesian classifiers based on GMDH. *Knowl.-Based Syst.*, **22**, 461–470.

Yeh, T. C., 2002: Typhoon rainfall over Taiwan area: The empirical orthogonal function modes and their applications on the rainfall forecasting. *Terr. Atmos. Oceanic Sci.*, **13**, 449–468.

Zhu, W., 2003: Using Bayesian network on network tomography. *Comput. Commun.*, **26**, 155–163.