Abstract

How to extract the causal relations in climate–cyclone interactions is an important problem in atmospheric science. Traditionally, the most commonly used research methodology in this field is time-delayed correlation analysis. This may not be appropriate, since correlation does not imply causality: it lacks the needed asymmetry or directedness between dynamical events. This study introduces a recently developed, concise, and rigorous formula for information flow (IF) to serve this purpose. A new way to normalize the IF is proposed, and the normalized IF (NIF) is then used to detect the causal relation between tropical cyclone (TC) genesis over the western North Pacific (WNP) and a variety of climate modes. It is shown that El Niño–Southern Oscillation and the Pacific decadal oscillation are the dominant factors that modulate WNP TC genesis. The western Pacific subtropical high and the monsoon trough also play important roles in affecting the TCs in the western and eastern regions of the WNP, respectively. With these selected climate indices as predictors, a fuzzy graph method evolved from a nonparametric Bayesian process (BNP-FG), which is capable of handling situations with insufficient samples, is employed to perform a seasonal TC forecast. A forecast with the classic Poisson regression is also conducted for comparison. The BNP-FG model, combined with the causality analysis, is found to provide a satisfactory estimate of the number of TC geneses observed in recent years. Given its generality, the approach is expected to be applicable to other climate-related predictions.

1. Introduction

In predicting the interannual variability of tropical cyclone (TC; see appendix A for a glossary of the acronyms) geneses over the western North Pacific (WNP), there are two outstanding problems that have attracted wide attention. The first problem is unraveling the causal relation between various climate factors and the WNP TC genesis, and the other is how to forecast the TC genesis. Regarding the first problem, there have been many studies on cyclone–climate interactions over the WNP. For example, Wang and Chan (2002) and Zhan et al. (2011) found that El Niño–Southern Oscillation (ENSO) contributes to both the east–west shift of the WNP TC birth places and the TC intensity, because ENSO plays a vital role in the interannual variability of the barotropic energy conversion in the region, which leads to changes in the meridional shear of the large-scale zonal wind, and hence induces more intense TCs in El Niño years. Motivated by Gray (1998), Chia and Ropelewski (2002) observed that the interannual modulation of the WNP TC genesis is related to the west Pacific sea surface temperature (SST), the zonal vertical wind shear (ZVWS), the western Pacific subtropical high (WPSH), and the monsoon trough; a similar conclusion was reached by Chen et al. (2006) and Huangfu et al. (2017). Zhang et al. (2017) reproduced the relationship between WNP TC activity and anomalous ZVWS caused by Atlantic meridional mode–induced changes to the Walker circulation. It is found that all four of the abovementioned climate modes (ENSO, ZVWS, WPSH, and the monsoon trough) may be related to the interannual variation of the central equatorial Pacific heating and the subsequent Rossby wave response, and hence may cause the low-level anomalous development of cyclones and/or anticyclones (Chia and Ropelewski 2002).

Recently, Zhan et al. (2011) and Ha et al. (2015) linked the interannual variability of WNP TC occurrence to the SST in the east Indian Ocean (EIO SST). Two major mechanisms have been proposed to interpret how EIO SST affects the WNP TC genesis. The first is that EIO SST leads to a land–sea thermal contrast, inducing an abnormal East Asian and WNP summer monsoon associated with the monsoon trough, and thus affecting the TC genesis in the region. The second is that the EIO SST anomaly can excite equatorial Kelvin waves to the east, influencing the surface pressure over the equatorial region and contributing to anomalous anticyclonic (cyclonic) vorticity and divergence (convergence) in the region of concern. Both of these mechanisms result in an anomalous ascending (descending) motion and a wet (dry) midtroposphere, and hence enhance (suppress) the TC genesis in the region (Ha et al. 2015).

The WNP TC activity has also been connected to the slowly varying Pacific decadal oscillation (PDO; Liu and Chan 2008) and to the quasi-biennial oscillation (QBO; Chan 1995a). PDO may contribute to the westward extension and strength of the subtropical high and the midlevel steering flow affecting the TC occurrence pattern (Liu and Chan 2008). Camargo and Sobel (2010) revisited the issue and found no clear link between the QBO and TC activity.

Although a variety of climate factors that modulate the WNP TC activity have been identified, as mentioned above, much is yet to be explored in tracing the causal origin of the WNP TC variability. In climate science, time-delayed correlation analysis is still the primary tool for causal identification. This is unfortunate, as there have been strong arguments in philosophy against using correlation analysis for this purpose because, for example, correlation lacks the needed asymmetry or directedness between dynamical events (Liang 2014). Causality analysis in the modern sense begins with Granger (1969), who formulated the problem as a statistical hypothesis test; this approach has been known as the Granger causality test. On the other hand, a quantity called transfer entropy (TE; Schreiber 2000) was empirically proposed, and it has since been of tremendous interest in various disciplines (Chatzisavvas et al. 2005; Liu et al. 2010; Zhang et al. 2006). It has evolved into alternative forms, such as direct causality entropy (Duan et al. 2013), transfer zero entropy (Duan et al. 2014), and causation entropy (Sun and Bollt 2014). Recently, it has been established that Granger causality and transfer entropy 1) are actually equivalent up to a factor of 2 and 2) will give spurious causality inference in several situations; see Liang (2016) for a brief historical review.

During the past years, Liang (2008, 2014, 2015, 2016) realized that causality is actually a real physical notion and can be put on a rigorous footing. In his formalism, causality is measured by information flow (IF). Rigorous formulas have been derived in closed form, and for linear systems the maximum likelihood estimator of the IF from one series, say, $X_2$, to another series, $X_1$, turns out to be very simple. IF in this framework is not only easy to evaluate and efficient for detecting causality in linear systems but has also proved remarkably successful with highly nonlinear time series for which TE and the Granger causality test fail (Stips et al. 2016). Liang (2015) then normalized the obtained IF (NIF) in order to assess the relative importance of an identified causality. Unfortunately, that normalization may lead to relative information flows that are too small in many situations (a numerical experiment is supplied in appendix B). An objective of this study is, therefore, to propose a new formula to normalize the IF developed in Liang (2014) so as to suit our studies of climate–cyclone interactions.

Another objective regards the prediction of TC activities. The models for this kind of prediction can be broadly divided into two groups: physical models and regression-based methods. For the first group (Camp et al. 2015; Caron et al. 2011; Chen and Lin 2013; Hsiao et al. 2015; Reale et al. 2014; Strachan et al. 2013; Wang 2012; Zhang et al. 2007; Zhao et al. 2010), equations of mathematical physics and their algorithms are developed to describe and solve the dynamical systems. However, the coarse resolution of the dynamical models often leads to an overestimation of the size of the tropical storm vortices, and the computational cost always poses a challenge for higher-resolution climate models in operational seasonal prediction (Camp et al. 2015). The second group, which includes linear regressions or Poisson regressions, has been widely used for forecasting TC numbers and variability (Caron et al. 2015; Chan 1995b; Goh and Chan 2012). However, this method assumes that the observations obey a certain distribution, say, a normal distribution. It is very difficult to obtain a reasonable result for a small sample (Shenton and Bowman 1977) without any clue about the population shape, which is usually the case in climate science.

The nonparametric Bayesian process (BNP) approach may serve as an acceptable tool for modeling unknown densities. With a BNP model, densities can be estimated without restriction to any specific parameterized form. The Gaussian process prior or the Dirichlet process prior has been commonly used as a BNP prior (Gasparini 1996; Riihimäki and Vehtari 2014). The challenge with the BNP model is its analytical intractability in constructing the prior distribution over scant data (Bai et al. 2017). In this study, we introduce a fuzzy graph approach (Bai et al. 2014) evolved from a new adaptive BNP (BNP-FG) (Bai et al. 2017) to improve the performance of annual WNP TC forecasts with small samples. BNP-FG is mainly based on the traditional Bayesian scheme and the optimal information diffusion model (Wang and You 2002), which is not only an effective method for dealing with the small-sample problem but can capture complex nonlinear relationships without detailed knowledge of the physical processes (Bai et al. 2014, 2015).

The main purpose of this paper is to compare the importance of different climate factors in influencing the WNP TC genesis and to determine the dominant ones using the new normalized IF. The main climate factors are then selected as the input of BNP-FG for a seasonal prediction of WNP TC geneses with rare observations.

The remainder of this paper is organized as follows. Section 2 describes the details of the data used in this paper. To the best of our knowledge, no study reported in the climate literature has used the IF and BNP-FG. Therefore, a section (section 3) is devoted to the introduction of their basics, plus our development in the IF normalization. The results of causality analysis on cyclone–climate interactions and the annual WNP TC genesis forecast are presented in section 4 and section 5, respectively. This study is concluded in section 6.

2. Data

a. Typhoon data

The quality of the data prior to 1970 has been considered to be poor because of the lack of satellite coverage. We hence select the best-track data of the Joint Typhoon Warning Center (JTWC; https://metoc.ndbc.noaa.gov/web/guest/jtwc/best_tracks/western-pacific) for the period 1970–2016 during June–October over the western North Pacific (0°–45°N, 100°–180°E). The data from the Shanghai Typhoon Institute of the China Meteorological Administration (CMA; www.typhoon.org.cn) and the Regional Specialized Meteorological Center of the Japan Meteorological Agency (JMA; https://www.jma.go.jp/jma/jma-eng/jma-center/rsmc-hp-pub-eg/trackarchives.html) are also used for validation purposes. Only the TCs that have at least tropical storm (TS) intensity (in terms of maximum sustained wind speed) and a lifetime of 48 h or more are considered, in order to minimize the uncertainty in identifying the tropical depression (Liu and Chan 2008) and to address the artificial trend in short-duration storms (Landsea et al. 2010). Moreover, we follow the proposal of Zhan et al. (2011) and divide the WNP into two subregions, a western region west of 145°E and an eastern region east of 145°E, since TC activity displays different trends east and west of 145°E. The positions of TC genesis during typhoon season in the western and eastern clusters are shown in Fig. 1.

Fig. 1.

Positions of TC genesis during the typhoon season for the period 1970–2016 during June–October over the western North Pacific.

b. Climate data

The climate indices in this study are selected based on the studies mentioned in the introduction. ENSO is characterized by the Niño-3.4 (5°S–5°N, 170°–120°W) index obtained from the NCEP Climate Prediction Center (CPC). Data for PDO are obtained directly from the National Oceanic and Atmospheric Administration (NOAA) Earth System Research Laboratory. In addition, the WNP (0°–45°N, 100°–180°E) and EIO (10°S–22.5°N, 75°–100°E) SST indices are extracted as the average of the Extended Reconstructed SST analyses from NOAA (Smith and Reynolds 2003) over the associated region. The WPSH index, as suggested by Hong et al. (2015), is defined as the average over the grid points with a 500-hPa geopotential height > 588 dagpm in the range (10°–90°N, 110°–180°E). ZVWS is defined in this paper as the difference in zonal winds between 200 and 850 hPa (Chia and Ropelewski 2002) over the WNP TC genesis region, and the 850-hPa wind composites are used to characterize the monsoon trough over the WNP (Chia and Ropelewski 2002). The associated wind data are acquired from the monthly National Centers for Environmental Prediction–National Center for Atmospheric Research reanalysis. Data for QBO are obtained from the University of Berlin by combining observations of the zonal winds at 30 hPa at three radiosonde stations: Canton Island, Gan/Maldive Islands, and Singapore (Naujokat 1986). For each year during 1970–2016, the value of each climate factor is computed as its average over June–October (see Fig. 2). In particular, the data for 2007–16 are used to validate our seasonal forecast.

Fig. 2.

Normalized time series of various climate indices used in this study: (a) WNP SST, (b) EIO SST, (c) ENSO, (d) PDO, (e) QBO, (f) ZVWS, (g) monsoon, and (h) WPSH.

3. Methodologies

a. Information flow normalization

The climate influence study is mainly based on IF, a physical notion for causality analysis that has recently been rigorously formulated. For two time series, $X_1$ and $X_2$, Liang (2014) established that the maximum likelihood estimator of the rate of the IF from $X_2$ to $X_1$ is

$$T_{2\to1} = \frac{C_{11}C_{12}C_{2,d1} - C_{12}^2 C_{1,d1}}{C_{11}^2 C_{22} - C_{11}C_{12}^2}, \tag{1}$$

where $C_{ij}$ denotes the sample covariance between $X_i$ and $X_j$, and $C_{i,dj}$ is determined as follows. Let $\dot{X}_j$ be the finite-difference approximation of $dX_j/dt$ using the Euler forward scheme,

$$\dot{X}_{j,n} = \frac{X_{j,n+k} - X_{j,n}}{k\,\Delta t},$$

with $k = 1$ or $k = 2$ [see Liang (2014) for details on how to determine $k$], and $\Delta t$ is the time step. Term $C_{i,dj}$ in Eq. (1) is the covariance between $X_i$ and $\dot{X}_j$. Ideally, if $T_{2\to1} = 0$, then $X_2$ does not cause $X_1$; otherwise, it is causal. In practice, a significance test needs to be done.
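As an illustration of how the estimator in Eq. (1) can be evaluated, the following Python sketch computes the information flow from one series to another, assuming the Liang (2014) form of the estimator, $T_{2\to1} = (C_{11}C_{12}C_{2,d1} - C_{12}^2C_{1,d1})/(C_{11}^2C_{22} - C_{11}C_{12}^2)$, with an Euler forward difference. The function names and the toy one-way-coupled system are hypothetical and serve only to show the asymmetry of the measure; no significance test is included.

```python
import random


def covariance(a, b):
    """Sample covariance of two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


def information_flow(x1, x2, dt=1.0, k=1):
    """Estimate T_{2->1}, the rate of information flow from x2 to x1."""
    n = len(x1) - k
    # Euler forward difference of x1 with lag k
    dx1 = [(x1[i + k] - x1[i]) / (k * dt) for i in range(n)]
    s1, s2 = x1[:n], x2[:n]
    c11 = covariance(s1, s1)
    c22 = covariance(s2, s2)
    c12 = covariance(s1, s2)
    c1d1 = covariance(s1, dx1)
    c2d1 = covariance(s2, dx1)
    numerator = c11 * c12 * c2d1 - c12 ** 2 * c1d1
    denominator = c11 ** 2 * c22 - c11 * c12 ** 2
    return numerator / denominator


def simulate_one_way(n, seed=0):
    """Toy system (an assumption for illustration): x2 drives x1, not vice versa."""
    rng = random.Random(seed)
    x1, x2 = [0.0], [0.0]
    for _ in range(n - 1):
        x1.append(0.5 * x1[-1] + 0.5 * x2[-1] + rng.gauss(0.0, 1.0))
        x2.append(0.6 * x2[-1] + rng.gauss(0.0, 1.0))
    return x1, x2


x1, x2 = simulate_one_way(5000)
print("T(2->1) =", information_flow(x1, x2))  # source is the second argument
print("T(1->2) =", information_flow(x2, x1))
```

In this toy case the flow from the driving series to the driven one should be clearly nonzero, whereas the reverse flow should be close to zero, which is the directedness that a correlation coefficient cannot provide.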

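An objective here is to find a practical way to normalize the abovementioned IF. As Liang (2015) stated, this may not be as simple as it seems. We first need to return to its original derivation. Given a two-dimensional dynamical system $d\mathbf{X} = \mathbf{F}(\mathbf{X}, t)\,dt + \mathbf{B}(\mathbf{X}, t)\,d\mathbf{W}$, where $\mathbf{W}$ is a 2D vector of white noise, and $\mathbf{F} = (F_1, F_2)^T$ and $\mathbf{B} = (b_{ij})$ can be any nonlinear functions of $(\mathbf{X}, t)$, Liang (2008) has proved that the rate of change of the marginal entropy of $X_1$ is

$$\frac{dH_1}{dt} = -E\left(F_1 \frac{\partial \log\rho_1}{\partial x_1}\right) - \frac{1}{2} E\left(g_{11} \frac{\partial^2 \log\rho_1}{\partial x_1^2}\right), \tag{2}$$

where $E$ represents mathematical expectation, $\rho_1$ is the marginal density of $X_1$, and $g_{11} = \sum_k b_{1k}^2$. The first term on the right-hand side of Eq. (2) yields

$$-E\left(F_1 \frac{\partial \log\rho_1}{\partial x_1}\right) = E\left(\frac{\partial F_1}{\partial x_1}\right) - E\left[\frac{1}{\rho_1}\frac{\partial (F_1\rho_1)}{\partial x_1}\right].$$

Substitution of

$$T_{2\to1} = -E\left[\frac{1}{\rho_1}\frac{\partial (F_1\rho_1)}{\partial x_1}\right] + \frac{1}{2} E\left[\frac{1}{\rho_1}\frac{\partial^2 (g_{11}\rho_1)}{\partial x_1^2}\right]$$

(adapted from Liang 2008) into Eq. (2) yields

$$\frac{dH_1}{dt} = E\left(\frac{\partial F_1}{\partial x_1}\right) + T_{2\to1} - \frac{1}{2} E\left(g_{11} \frac{\partial^2 \log\rho_1}{\partial x_1^2}\right) - \frac{1}{2} E\left[\frac{1}{\rho_1}\frac{\partial^2 (g_{11}\rho_1)}{\partial x_1^2}\right].$$

The right-hand side has three terms: the change of $H_1$ due to $X_1$ itself, $dH_1^*/dt$; the rate of information flow from $X_2$ to $X_1$, $T_{2\to1}$; and the stochastic effects, $dH_1^{\text{noise}}/dt$, where

$$\frac{dH_1^*}{dt} = E\left(\frac{\partial F_1}{\partial x_1}\right), \qquad \frac{dH_1^{\text{noise}}}{dt} = -\frac{1}{2} E\left(g_{11} \frac{\partial^2 \log\rho_1}{\partial x_1^2}\right) - \frac{1}{2} E\left[\frac{1}{\rho_1}\frac{\partial^2 (g_{11}\rho_1)}{\partial x_1^2}\right].$$

In the case where $\mathbf{F} = \mathbf{f} + \mathbf{A}\mathbf{X}$ with $\mathbf{f}$, $\mathbf{A} = (a_{ij})$, and $\mathbf{B}$ being constant vectors/matrices, the distribution of the state variables remains Gaussian, provided that they are originally Gaussian (see Liang 2014). So, we may let

$$\rho_1(x_1) = \frac{1}{\sqrt{2\pi\sigma_1^2}} \exp\left[-\frac{(x_1 - \mu_1)^2}{2\sigma_1^2}\right].$$

It is easy to obtain (Liang 2015)

$$\frac{dH_1^*}{dt} = a_{11}, \qquad \frac{dH_1^{\text{noise}}}{dt} = \frac{g_{11}}{2\sigma_1^2}. \tag{3}$$

We define $Z = |T_{2\to1}| + |dH_1^{\text{noise}}/dt|$ as the normalizer, which differs from that in Liang (2015) in that the term $|dH_1^*/dt|$ is taken out. This makes sense, since that term measures the contribution from $X_1$ itself, and that could be the reason why the resulting relative causality in Liang (2015) is too small. With the modified normalizer, $T_{2\to1}$ can be normalized as follows:

$$\tau_{2\to1} = \frac{T_{2\to1}}{Z}. \tag{4}$$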

Clearly, the NIF $\tau_{2\to1}$ measures the importance of the information flow from $X_2$ to $X_1$ in comparison to other stochastic processes. For convenience, the step-by-step computation of $\tau_{2\to1}$ is given in algorithm 1 [a MATLAB implementation of algorithm 1 is available online; https://cn.mathworks.com/matlabcentral/fileexchange/62471-the-normalized-information-flow]. Moreover, to demonstrate the effectiveness of $\tau_{2\to1}$, several simulation experiments have been performed in appendix B. Liang’s (2015) original IF normalization and the normalized transfer entropy (NTE) by Duan et al. (2013) are also included for comparison.

1) Algorithm 1

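Input: Two time series $X_1$ and $X_2$.

Step 1: Calculate the rate of information flow from $X_2$ to $X_1$, $T_{2\to1}$, using Eq. (1).

Step 2: Estimate the parameters of the two-dimensional dynamical system $d\mathbf{X} = (\mathbf{f} + \mathbf{A}\mathbf{X})\,dt + \mathbf{B}\,d\mathbf{W}$ (with the covariances $C_{ij}$ and $C_{i,dj}$ defined as in Eq. (1), and overbars denoting sample means):

$\hat{a}_{11} = \dfrac{C_{22}C_{1,d1} - C_{12}C_{2,d1}}{C_{11}C_{22} - C_{12}^2}$, $\hat{a}_{12} = \dfrac{C_{11}C_{2,d1} - C_{12}C_{1,d1}}{C_{11}C_{22} - C_{12}^2}$, and

$\hat{f}_1 = \overline{\dot{X}_1} - \hat{a}_{11}\overline{X_1} - \hat{a}_{12}\overline{X_2}$.

Step 3: Compute the maximum likelihood estimator of $b_1$:

$\hat{b}_1 = \sqrt{Q_{N,1}\,\Delta t / N}$, where $Q_{N,1} = \sum_{n=1}^{N} \left[\dot{X}_{1,n} - (\hat{f}_1 + \hat{a}_{11}X_{1,n} + \hat{a}_{12}X_{2,n})\right]^2$.

Step 4: Substitute $\hat{b}_1^2$ for $g_{11}$.

Step 5: Substitution of $g_{11}$ and $C_{11}$ (as the estimate of $\sigma_1^2$) into Eq. (3) yields the rate of the stochastic effect $dH_1^{\text{noise}}/dt$.

Step 6: Calculate the normalization of $T_{2\to1}$ based on Eq. (4).

Output: The NIF $\tau_{2\to1}$.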
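The steps of algorithm 1 can be sketched in Python as follows. This is a minimal sketch, not the authors' MATLAB implementation: it assumes the linear-system estimators of Liang (2014, 2015), namely the drift coefficients obtained from the sample covariances, the noise amplitude obtained from the residual of the fitted drift, the stochastic term taken as $g_{11}/(2C_{11})$, and a normalizer that omits the self-contribution term, as described in the text. The function name and return convention are hypothetical.

```python
def normalized_information_flow(x1, x2, dt=1.0, k=1):
    """Return (tau, T): the normalized and raw information flow from x2 to x1."""

    def mean(a):
        return sum(a) / len(a)

    def cov(a, b):
        ma, mb = mean(a), mean(b)
        return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)

    n = len(x1) - k
    # Euler forward difference of x1
    dx1 = [(x1[i + k] - x1[i]) / (k * dt) for i in range(n)]
    s1, s2 = x1[:n], x2[:n]
    c11, c22, c12 = cov(s1, s1), cov(s2, s2), cov(s1, s2)
    c1d1, c2d1 = cov(s1, dx1), cov(s2, dx1)
    det = c11 * c22 - c12 ** 2

    # Step 2: drift parameters of the assumed linear model for x1
    a11 = (c22 * c1d1 - c12 * c2d1) / det
    a12 = (c11 * c2d1 - c12 * c1d1) / det
    f1 = mean(dx1) - a11 * mean(s1) - a12 * mean(s2)

    # Step 1: information flow; a12 * C12 / C11 is algebraically equal to Eq. (1)
    t21 = a12 * c12 / c11

    # Steps 3 and 4: residual sum of squares gives b1^2, used in place of g11
    q = sum((d - (f1 + a11 * u + a12 * v)) ** 2 for d, u, v in zip(dx1, s1, s2))
    g11 = q * dt / n

    # Step 5: stochastic (noise) contribution for a Gaussian marginal density
    noise = g11 / (2.0 * c11)

    # Step 6: normalization without the self-contribution term
    # (undefined if both t21 and noise vanish, e.g. for a constant series)
    tau = t21 / (abs(t21) + abs(noise))
    return tau, t21
```

By construction the returned `tau` lies between -1 and 1, so values from different predictor pairs can be compared on a common scale; a call such as `normalized_information_flow(tc_counts, index_series)` (both names hypothetical) would give the flow from the index to the TC counts.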

b. Inference using fuzzy graph

After the main climate factors causing TCs are determined with algorithm 1, a seasonal forecast of TC genesis can be made using BNP-FG (Bai et al. 2017), which is introduced below.

Let and denote the random variables of interest, where is the input (ENSO and PDO indices, etc.) and is the output (e.g., seasonal TC number). Let be a set of observations on , where and . Suppose that and are two fuzzy sets of and , respectively; that is to say,

 
formula

where and are, respectively, their membership functions in the form of conditional probabilities, with as an illustrating point. More details are discussed in algorithm 5 of Bai et al. (2017). We can then construct the information gain of as

 
formula

Let the sum of the information gain be

 
formula

which consists of the information matrix (Huang 2001)

 
formula

According to the theory of factor space (Wang 1990), which discusses how to normalize the information matrix appropriately, we use

 
formula

to produce a normalized information matrix, that is, the fuzzy relation matrix ,

 
formula

To calculate the output fuzzy set , we first use to denote the input fuzzy set

 
formula

Based on the fuzzy inference formula,

 
formula

Here the operator signifies the maximum–minimum fuzzy composition rule,

 
formula

where . Thus, we can obtain

 
formula

Finally, the gravity center of the fuzzy set is generated as the output,

 
formula

In general, we use the given sample and illustrating points to construct a relationship between the input and the output in the following form:

 
formula

where represents the values of selected climate indices in a certain year when we want to know its TC genesis number.
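The last two steps of the inference, the maximum–minimum fuzzy composition and the gravity center of the output fuzzy set, are standard operations and can be sketched in Python as below. The sketch assumes that the input fuzzy set is given as a list of membership values over the input illustrating points, that the fuzzy relation matrix is a list of rows with one column per output illustrating point, and that the output illustrating points are numeric. The function names and the numbers in the example are hypothetical and are not taken from the paper; the construction of the membership functions and of the relation matrix (Bai et al. 2017) is not reproduced here.

```python
def max_min_composition(a, r):
    """Output memberships b_k = max_j min(a_j, r[j][k])."""
    n_out = len(r[0])
    return [max(min(a_j, row[k]) for a_j, row in zip(a, r)) for k in range(n_out)]


def gravity_center(b, v):
    """Membership-weighted mean of the output illustrating points v."""
    total = sum(b)
    if total == 0:
        raise ValueError("output fuzzy set is empty")
    return sum(b_k * v_k for b_k, v_k in zip(b, v)) / total


# Toy example: three input points, two output points (illustrative values only)
a = [0.2, 1.0, 0.4]          # input fuzzy set
r = [[1.0, 0.3],
     [0.5, 0.9],
     [0.2, 1.0]]             # fuzzy relation matrix
v = [10.0, 20.0]             # output illustrating points

b = max_min_composition(a, r)
print(b)                     # [0.5, 0.9]
print(gravity_center(b, v))  # 23 / 1.4, about 16.43
```

The gravity center turns the output fuzzy set into a single number, which is what allows the fuzzy inference to return a TC count rather than a set of memberships.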

4. Climate influence on western North Pacific tropical cyclone genesis

We first compute the causal relations between the TC numbers and a variety of climate factors. The whole procedure follows algorithm 1, together with a test of the statistical significance of the resulting NIF at the 5% level (Liang 2015). The results are listed in Table 1 and summarized in this section. Note that the direction of the causality or information flow in the table is from the column index to the row index.

Table 1.

NIF between TC genesis number and predictors (%). Boldface font indicates statistically significant values at the 5% level.

As is clearly seen, most of the NIFs are significant (as highlighted), except that from QBO, which generally agrees with previous studies mentioned in the literature. For the western cluster (WC) time series (the first row in Table 1), the NIF values vary from 0.3605% to 26.1865%. The maximum is the NIF from ENSO, in agreement with Chia and Ropelewski (2002), who claimed that ENSO is a major factor in determining the seasonal TC mean genesis positions. Second to it is the NIF from PDO. This is consistent with Liu and Chan (2008), who found that PDO displays a dipole-like structure on the 500-hPa geopotential height anomaly map. Strong easterly anomalies tend to steer TCs toward the west.

From the second row of the table, the variability in the eastern cluster (EC) seems to be largely tied to, aside from ENSO and PDO, WPSH. WPSH generally moves northward in June, reaching its northernmost position near 40°N in August and September, and withdraws in October. When it retreats from the South China Sea (SCS), the monsoon westerly winds penetrate from the Indian Ocean to the SCS, the Philippine Sea, and the western Pacific, which may be more favorable for TC genesis over the WNP (Frank 1987). The most interesting information is found in the last two columns of Table 1. These values validate the work of Chia and Ropelewski (2002), who showed that the strengthened (weakened) WPSH and the enhanced (reduced) monsoon trough in the Philippine Sea lead to an eastward (westward) displacement of the major TC genesis pattern; this is also consistent with the ENSO composites of Wang (1995).

Because correlation analysis remains a primary approach to causality detection in climate science, we also compute the correlation coefficients between the pairs for comparison purposes. The results are shown in Table 2. Most of the results are not significant, except those with ENSO and PDO. This suggests that correlation analysis should not be the first choice for predictor selection in climate science. In other words, although correlation analysis may help select the climate factors with the largest correlation coefficients, it cannot reveal the inner links between certain climate indices and the WNP TCs. For example, the WNP SST should be an important climate index for influencing TCs over the WNP; however, correlation analysis cannot always correctly validate it (see Table 2). In contrast, at the same statistical significance level, we can obtain the cause–effect relation between the events of interest using the new NIF without computational complexity.

Table 2.

Correlation between TC genesis number and predictors. Boldface font indicates statistically significant values at the 5% level.

5. Prediction of annual TC activity over the WNP

To achieve acceptable prediction performance, we first obtain two ranking lists of climate indices ordered by the values of NIF and correlation coefficients based on Tables 1 and 2, respectively. Then, as recommended by Song et al. (2013), we select the Top-k factors from the two ranking lists, where k is determined by m, the number of candidate indices. In this study the number of climate factors is eight, so k = 5. The common indices in the two Top-k ranking lists are then chosen as the final predictors. Following this procedure, ENSO, PDO, and WPSH are the three most important predictors causing TCs in the western cluster; ENSO, PDO, and monsoons are the major ones for the eastern cluster. Therefore, in this section, we analyze the prediction capability of BNP-FG in coordination with these selected climate influences and compare it with the Poisson regression, which is a common approach in climatology.
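The selection of the common indices from the two Top-k lists can be sketched as follows. This is an illustration only: it assumes that both rankings are ordered by absolute value, that k is supplied by the user (k = 5 in this study), and it omits the significance screening. The function names, the index labels A to H, and the scores are hypothetical and are not the values of Tables 1 and 2.

```python
def top_k(scores, k):
    """Names of the k entries with the largest absolute score."""
    ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
    return ranked[:k]


def common_predictors(nif, corr, k):
    """Indices present in both Top-k lists, kept in NIF ranking order."""
    top_corr = set(top_k(corr, k))
    return [name for name in top_k(nif, k) if name in top_corr]


# Made-up scores for eight candidate indices (not from the paper)
nif = {"A": 0.26, "B": 0.20, "C": 0.15, "D": 0.10,
       "E": 0.05, "F": 0.04, "G": 0.03, "H": 0.01}
corr = {"A": 0.60, "B": -0.50, "C": 0.10, "D": 0.05,
        "E": 0.30, "F": 0.40, "G": 0.35, "H": 0.02}

print(common_predictors(nif, corr, k=5))  # ['A', 'B', 'E']
```

Taking the intersection keeps only the indices supported by both measures, which is why the final predictor set can be smaller than k.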

a. Experiment 1: Prediction of the annual TC genesis in the eastern region of the WNP

Following a general rule for forecasting exercises, about 80% of the sample is used for training and the rest is used for validation. Therefore, we train BNP-FG with data from 1970 to 2006 and make predictions for 2007 to 2016 (note that the sample size is small). A new framework is presented here to illustrate the step-by-step implementation of BNP-FG.

  • Step 1: Let the indices of ENSO, PDO, and WPSH, and the recorded TC genesis numbers measured from 1970 to 2006 be 
    formula
  • Step 2: According to the maximum and minimum values of and in Fig. 2, let the illustrating points be 
    formula
  • Step 3: Based on algorithm 5 of Bai et al. (2017), the membership functions and can be computed.

  • Step 4: Then, the fuzzy relationship matrix can be computed using Eqs. (5)–(9).

  • Step 5: Recalling Eqs. (10)(14), the TC genesis number in response to every possible can be inferred. We take the prediction for year 2016 as an example. We know that in 2015 from Fig. 2. We use Eq. (12) to obtain 
    formula

Next, based on the corresponding row of the fuzzy relation matrix and Eq. (13), we compute

 
formula

Finally, we calculate the gravity center of the output fuzzy set as the predicted value using Eq. (14),

 
formula

which is nearly the same as the true number of TC genesis in 2016 (see Fig. 3a).

Fig. 3.

Seasonal forecasts for (a) eastern and (b) western clusters during the 2007–16 period. Shown are PR (blue) and BNP-FG (red).

Following the steps given above, the results of BNP-FG for TC genesis forecasting over the WNP can be obtained (see Fig. 3a). Figure 3a shows that BNP-FG can capture the decreasing trend in 2007–08, the increase in 2008–09, the drop in 2010, the increasing pattern in 2011–15, and the drop in 2016. The mean absolute percentage error (MAPE; Caron et al. 2015) and the root-mean-square error (RMSE; Caron et al. 2015) are also employed as skill scores to evaluate the new model (see Table 3). Note that in this section we discuss the results based only on the JTWC best-track data. BNP-FG performs rather satisfactorily, with an MAPE value of 19.05% and an RMSE value of 1.5233. One of the most common models for estimating the relation between quantities of interest in climate science is Poisson regression (PR). We hence also perform PR for comparison, using the “glmfit” and “glmval” functions of MATLAB to obtain the prediction of TC genesis numbers in 2007–16. The results are displayed in Fig. 3a. The figure shows that BNP-FG performs better than PR: BNP-FG outperforms PR by about 21.35% and 9.32% in terms of MAPE and RMSE reductions, respectively. This may be attributed to the samples containing insufficient information for PR to model the interannual variability of TC geneses.
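For reference, the two skill scores and the relative reduction of one model's error with respect to another can be computed as in the following sketch. It uses the standard definitions of MAPE and RMSE; the function names and the toy numbers are hypothetical and are not the observed or predicted TC counts of this study. Note that MAPE is undefined for a year with zero observed geneses.

```python
import math


def mape(observed, predicted):
    """Mean absolute percentage error (%); observed values must be nonzero."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


def rmse(observed, predicted):
    """Root-mean-square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def relative_reduction(reference_error, model_error):
    """Percentage by which model_error is smaller than reference_error."""
    return 100.0 * (reference_error - model_error) / reference_error


# Toy example (illustrative values only)
obs = [10, 20, 5, 8]
pred = [12, 18, 5, 10]
print(mape(obs, pred))               # about 13.75
print(rmse(obs, pred))               # sqrt(3), about 1.732
print(relative_reduction(2.0, 1.5))  # 25.0
```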

Table 3.

Skill scores (in terms of MAPE and RMSE) of different models for the ECs and WCs of the WNP (boldface font indicates the best performance).

b. Experiment 2: Prediction of the interannual TC genesis in the western region of the WNP

As in experiment 1, in step 1 the ENSO, PDO, and monsoon indices and the TC numbers in the eastern cluster are selected as the input. Figure 3b shows the observed and reconstructed TC number using different models. Clearly BNP-FG still provides better predictions than PR, with the best MAPE and RMSE reductions by 14.53% and 1.5224, respectively. From this we see a strong prospect for the prediction of the interannual variability of the WNP TC numbers from limited information using BNP-FG.

c. Experiment 3: Prediction of the interannual TC genesis in the whole WNP using data from CMA and JMA

In this experiment we repeat the abovementioned procedures using the data from CMA and JMA. The results are presented in the supplementary material (Tables S1–S4). We take the prediction of TC genesis frequency for the WC based on the JMA datasets as an example. First, we obtain two ranking lists of climate indices ordered by the values of NIF and correlation coefficients based on Tables S1 and S2, respectively. Then, we select the Top-k factors from the two ranking lists, where k = 5. The Top-k influences derived from Table S1 are ENSO, PDO, WNP SST, WPSH, and EIO SST. The Top-k list from Table S2 consists only of ENSO, PDO, and WPSH, since the correlations of the other factors with TC genesis for the WC are insignificant at the 5% level. Thus, the common indices in the two Top-k ranking lists, ENSO, PDO, and WPSH, are the final predictors. The performances of BNP-FG versus PR over the years 2007–16 for the WC using the selected predictors are shown in Fig. S1. Following this example, we can obtain the results of predicting TC genesis frequency for different regions over the WNP based on the JMA and CMA datasets (see Figs. S2–S4). Details about the robustness analyses are provided in Table 3. Although there are some discrepancies between the selected predictors derived from different datasets (which may be attributed to the different physical parameterization schemes and data assimilation techniques utilized by the operational centers in TC activity forecasting; Peng et al. 2017), the same conclusion can be drawn from Table 3; that is to say, the proposed algorithm is a competitive tool for forecasting the number of WNP TCs when measurements are insufficient.

6. Conclusions

In this study we introduced a recently formulated rigorous causality inference method—that is, the information flow analysis method by Liang (2014)—into the field of climate–cyclone interaction analysis and TC forecasting, and developed for it a new normalization scheme. We used the NIF to identify the cause–effect relation between the western North Pacific tropical cyclone genesis and a variety of climate indices. Key factors are then selected for seasonal prediction. The resulting causalities generally agree with previous studies on the variability of the WNP TC genesis, but they show a difference from those obtained through correlation analysis, a technique most commonly used in climate science. Although there are no significant correlations between certain climate factors and TCs over the WNP, with the new method the links between them can be accurately revealed. In particular, the principal influences of ENSO and PDO on the WNP TC variability have been reconfirmed through the causality analysis; the secondary influences of WPSH and monsoon trough have been faithfully detected as well, consistent with the observation that they significantly affect the atmospheric circulation and result in atmospheric anomalies over the WNP (Chia and Ropelewski 2002; Xie et al. 2009; Zhan et al. 2011).

The second part of this study is the prediction of the interannual variability of the TC numbers. Based on a fuzzy graph that evolved from a new nonparametric Bayesian process (BNP-FG), a robust model was proposed for predicting the TC numbers in the western and eastern regions over the WNP when the observations are insufficient. The causal climate factors selected through the aforementioned causality analysis are taken as input for the prediction. It has been shown that the prediction with the ENSO, PDO, and WPSH indices can achieve acceptable performance for the western cluster; for the eastern cluster, the prediction with ENSO, PDO, and monsoon taken into account is satisfactory. A classic Poisson regression (PR) is also employed for comparison. It is observed that the new method significantly outperforms PR. Although much is yet to be improved, this newly proposed method—that is, the BNP-FG model combined with the recently developed causality analysis—provides a competitive tool for more reliable TC genesis forecasts. We look forward to seeing more applications in the future over the different basins.

Acknowledgments

The authors are very grateful to the anonymous editor and reviewers for their valuable comments and constructive suggestions, which helped us to significantly improve the quality of the paper. XSL acknowledges support from the 2015 Jiangsu Program for Innovation Research and Entrepreneurship Groups, and the National Program on Global Change and Air-Sea Interaction (GASI-IPOVAI-06). This research was supported by the National Natural Science Foundation of China (51609254) and the Specific Fund for the Industrial Site in the city of Tangshan (CQZ-2014001).

APPENDIX A

Glossary of the Acronyms

     
  • BNP: Nonparametric Bayesian process
  • CMA: China Meteorological Administration
  • CPC: Climate Prediction Center
  • EC: Eastern cluster
  • EIO: East Indian Ocean
  • ENSO: El Niño–Southern Oscillation
  • FG: Fuzzy graph
  • IF: Information flow
  • JMA: Japan Meteorological Agency
  • MAPE: Mean absolute percentage error
  • MATLAB: Matrix laboratory
  • NIF: Normalized information flow
  • NOAA: National Oceanic and Atmospheric Administration
  • NTE: Normalized transfer entropy
  • PDO: Pacific decadal oscillation
  • PR: Poisson regression
  • QBO: Quasi-biennial oscillation
  • RMSE: Root-mean-square error
  • SCS: South China Sea
  • SST: Sea surface temperature
  • TC: Tropical cyclone
  • TE: Transfer entropy
  • TS: Tropical storm
  • WC: Western cluster
  • WNP: Western North Pacific
  • WPSH: Western Pacific subtropical high
  • ZVWS: Zonal vertical wind shear

APPENDIX B

Several Numerical Simulations

In this section two numerical examples (adapted from Duan et al. 2013) are adopted to validate the proposed NIF. A comparison with the NTE (Duan et al. 2013) and the normalized IF of Liang (2015) is also presented.

Experiment 1: Linear equations

 
formula

where , , and . We have run 6000 steps, but the initial 3000 are discarded to obtain two stationary series. The NIF and NTE between each pair of , , and are then computed and listed in Table B1. Based on the NIF values, it can be seen that causes and causes because and are much larger than zero. This conclusion is obviously consistent with Eq. (B1) from which it can be seen that there is information delivery from both to and to . Although can correctly unravel the causal relation between and , it is hard to detect the information flow pathway from to from .

Table B1.

, , and for Eq. (B1).


In addition, based on Eq. (19) of Duan et al. (2013), it is noticed that the NTE achieves nearly the same performance as the NIF (except for , because NIF represents a direct cause–effect relation, while NTE cannot distinguish whether the causality is direct or indirect). But as Duan et al. (2013) stated, the computational complexity for NTE is , where represents the sample size, and and are the embedding dimensions of each pair of objects. In contrast, the computational complexity for NIF is only , far lower than that for NTE.
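As an illustration of the kind of experiment described above, the following is a minimal sketch rather than the authors' code. It estimates the unnormalized information flow with the maximum likelihood formula of Liang (2014), not the NIF proposed in this paper, whose normalization is not reproduced here. The two-variable linear system, its coefficients (0.5 and 0.6), and the function names `information_flow` and `simulate_linear_pair` are assumptions made for illustration and are not Eq. (B1). As in experiment 1, 6000 steps are run and the first 3000 are discarded.

```python
import numpy as np


def information_flow(source, target, dt=1.0):
    """Estimate the information flow from `source` to `target`.

    Uses the bivariate maximum likelihood estimator of Liang (2014):
    T = (C11*C12*C2d1 - C12**2*C1d1) / (C11**2*C22 - C11*C12**2),
    where index 1 is the target, index 2 is the source, and d1 is the
    Euler forward difference of the target series.
    """
    x1 = np.asarray(target, dtype=float)
    x2 = np.asarray(source, dtype=float)
    dx1 = (x1[1:] - x1[:-1]) / dt          # forward-difference derivative of target
    cov = np.cov(np.vstack([x1[:-1], x2[:-1], dx1]))
    c11, c12, c22 = cov[0, 0], cov[0, 1], cov[1, 1]
    c1d1, c2d1 = cov[0, 2], cov[1, 2]
    return (c11 * c12 * c2d1 - c12**2 * c1d1) / (c11**2 * c22 - c11 * c12**2)


def simulate_linear_pair(n_steps=6000, n_discard=3000, seed=0):
    """Hypothetical linear system in which x drives y but y does not drive x."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_steps)
    y = np.zeros(n_steps)
    for t in range(n_steps - 1):
        x[t + 1] = 0.5 * x[t] + rng.normal()
        y[t + 1] = 0.5 * y[t] + 0.6 * x[t] + rng.normal()
    # Discard the spin-up so that both series are (approximately) stationary.
    return x[n_discard:], y[n_discard:]


if __name__ == "__main__":
    x, y = simulate_linear_pair()
    print("T(x -> y):", information_flow(x, y))
    print("T(y -> x):", information_flow(y, x))
```

In this sketch the flow from x to y should come out clearly nonzero, while the flow from y to x should stay close to zero, mirroring the asymmetry that a correlation coefficient cannot express. The estimator needs only a handful of sample covariances, so its cost grows linearly with the length of the series.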

Experiment 2: Nonlinear equations

 
formula

where , , and . Following Eq. (B1), we compute all the values of NTE and NIF, and list them in Table B2. Clearly, our NIF here yields good results, considering the preset information flow pathways in Eq. (B2). Its performance is obviously better than that of Liang (2015) and NTE (Duan et al. 2013). Our normalization scheme for Liang’s (2014) IF is hence successful.

Table B2.

, , and for Eq. (B2).


REFERENCES

Bai, C., M. Hong, D. Wang, R. Zhang, and L. Qian, 2014: Evolving an information diffusion model using a genetic algorithm for monthly river discharge time series interpolation and forecasting. J. Hydrometeor., 15, 2236–2249, https://doi.org/10.1175/JHM-D-13-0184.1.
Bai, C., R. Zhang, M. Hong, L. Qian, and Z. Wang, 2015: A new information diffusion modelling technique based on vibrating string equation and its application in natural disaster risk assessment. Int. J. Gen. Syst., 44, 601–614, https://doi.org/10.1080/03081079.2014.980242.
Bai, C., R. Zhang, L. Qian, and Y. Wu, 2017: A fuzzy graph evolved by a new adaptive Bayesian framework and its applications in natural hazards. Nat. Hazards, 87, 899–918, https://doi.org/10.1007/s11069-017-2801-y.
Camargo, S. J., and A. H. Sobel, 2010: Revisiting the influence of the quasi-biennial oscillation on tropical cyclone activity. J. Climate, 23, 5810–5825, https://doi.org/10.1175/2010JCLI3575.1.
Camp, J., M. Roberts, C. MacLachlan, E. Wallace, L. Hermanson, A. Brookshaw, A. Arribas, and A. A. Scaife, 2015: Seasonal forecasting of tropical storms using the Met Office GloSea5 seasonal forecast system. Quart. J. Roy. Meteor. Soc., 141, 2206–2219, https://doi.org/10.1002/qj.2516.
Caron, L. P., C. G. Jones, and K. Winger, 2011: Impact of resolution and downscaling technique in simulating recent Atlantic tropical cyclone activity. Climate Dyn., 37, 869–892, https://doi.org/10.1007/s00382-010-0846-7.
Caron, L. P., M. Boudreault, and S. J. Camargo, 2015: On the variability and predictability of eastern Pacific tropical cyclone activity. J. Climate, 28, 9678–9696, https://doi.org/10.1175/JCLI-D-15-0377.1.
Chan, J. C. L., 1995a: Tropical cyclone activity in the western North Pacific in relation to the stratospheric quasi-biennial oscillation. Mon. Wea. Rev., 123, 2567–2571, https://doi.org/10.1175/1520-0493(1995)123<2567:TCAITW>2.0.CO;2.
Chan, J. C. L., 1995b: Prediction of annual tropical cyclone activity over the western North Pacific and the South China Sea. Int. J. Climatol., 15, 1011–1019, https://doi.org/10.1002/joc.3370150907.
Chatzisavvas, K. Ch., Ch. C. Moustakidis, and C. P. Panos, 2005: Information entropy, information distances, and complexity in atoms. J. Chem. Phys., 123, 174111–174121, https://doi.org/10.1063/1.2121610.
Chen, J. H., and S. J. Lin, 2013: Seasonal predictions of tropical cyclones using a 25-km-resolution general circulation model. J. Climate, 26, 380–398, https://doi.org/10.1175/JCLI-D-12-00061.1.
Chen, T. C., S. Y. Wang, and M. C. Yen, 2006: Interannual variation of the tropical cyclone activity over the western North Pacific. J. Climate, 19, 5709, https://doi.org/10.1175/JCLI3934.1.
Chia, H. H., and C. F. Ropelewski, 2002: The interannual variability in the genesis location of tropical cyclones in the northwest Pacific. J. Climate, 15, 2934–2944, https://doi.org/10.1175/1520-0442(2002)015<2934:TIVITG>2.0.CO;2.
Duan, P., F. Yang, T. Chen, and S. L. Shah, 2013: Direct causality detection via the transfer entropy approach. IEEE Trans. Control Syst. Technol., 21, 2052–2066, https://doi.org/10.1109/TCST.2012.2233476.
Duan, P., F. Yang, S. L. Shah, and T. Chen, 2014: Transfer zero-entropy and its application for capturing cause and effect relationship between variables. IEEE Trans. Control Syst. Technol., 23, 855–867, https://doi.org/10.1109/tcst.2014.2345095.
Frank, W. M., 1987: Tropical cyclone formation. A Global View of Tropical Cyclone, R. L. Elsberry et al., Eds., Naval Postgraduate School, 53–90.
Gasparini, M., 1996: Bayesian density estimation via Dirichlet density processes. J. Nonparametric Stat., 6, 355–366, https://doi.org/10.1080/10485259608832681.
Goh, Z. C., and J. C. L. Chan, 2012: Variations and prediction of the annual number of tropical cyclones affecting Korea and Japan. Int. J. Climatol., 32, 178–189, https://doi.org/10.1002/joc.2258.
Granger, C. W. J., 1969: Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37, 424–438, https://doi.org/10.2307/1912791.
Gray, W. M., 1998: The formation of tropical cyclones. Meteor. Atmos. Phys., 67, 37–69, https://doi.org/10.1007/BF01277501.
Ha, Y., Z. Zhong, X. Yang, and Y. Sun, 2015: Contribution of East Indian Ocean SSTA to Western North Pacific tropical cyclone activity under El Niño/La Niña conditions. Int. J. Climatol., 35, 506–519, https://doi.org/10.1002/joc.3997.
Hong, M., R. Zhang, C. Z. Bai, X. Chen, D. Wang, and J. Ge, 2015: Reconstruction of statistical–dynamical model of the Western Pacific subtropical high and East Asian summer monsoon factors and its forecast experiments. Nat. Hazards, 75, 2863–2883, https://doi.org/10.1007/s11069-014-1467-y.
Hsiao, L. F., and Coauthors, 2015: Blending of global and regional analyses with a spatial filter: Application to typhoon prediction over the western North Pacific Ocean. Wea. Forecasting, 30, 754–770, https://doi.org/10.1175/WAF-D-14-00047.1.
Huang, C., 2001: Information matrix and application. Int. J. Gen. Syst., 30, 603–622, https://doi.org/10.1080/03081070108960737.
Huangfu, J., R. Huang, W. Chen, T. Feng, and L. Wu, 2017: Interdecadal variation of tropical cyclone genesis and its relationship to the monsoon trough over the western North Pacific. Int. J. Climatol., 37, 3587–3596.
Landsea, C. W., G. A. Vecchi, L. Bengtsson, and T. R. Knutson, 2010: Impact of duration thresholds on Atlantic tropical cyclone counts. J. Climate, 23, 2508–2519, https://doi.org/10.1175/2009JCLI3034.1.
Liang, X. S., 2008: Information flow within stochastic dynamical systems. Phys. Rev., 78E, 031113, https://doi.org/10.1103/PhysRevE.78.031113.
Liang, X. S., 2014: Unraveling the cause-effect relation between time series. Phys. Rev., 90E, 052150, https://doi.org/10.1103/PhysRevE.90.052150.
Liang, X. S., 2015: Normalizing the causality between time series. Phys. Rev., 92E, 022126, https://doi.org/10.1103/PhysRevE.92.022126.
Liang, X. S., 2016: Information flow and causality as rigorous notions ab initio. Phys. Rev., 94E, 052201, https://doi.org/10.1103/PhysRevE.94.052201.
Liu, K. S., and J. C. L. Chan, 2008: Interdecadal variability of western North Pacific tropical cyclone tracks. J. Climate, 21, 4464–4476, https://doi.org/10.1175/2008JCLI2207.1.
Liu, L., J. Zhou, X. An, Y. Zhang, and L. Yang, 2010: Using fuzzy theory and information entropy for water quality assessment in Three Gorges region, China. Expert Syst. Appl., 37, 2517–2521, https://doi.org/10.1016/j.eswa.2009.08.004.
Naujokat, B., 1986: An update of the observed quasi-biennial oscillation of the stratospheric winds over the tropics. J. Atmos. Sci., 43, 1873–1880, https://doi.org/10.1175/1520-0469(1986)043<1873:AUOTOQ>2.0.CO;2.
Peng, X., J. Fei, X. Huang, and X. Cheng, 2017: Evaluation and error analysis of official forecasts of tropical cyclones during 2005–14 over the western North Pacific. Part I: Storm tracks. Wea. Forecasting, 32, 689–712, https://doi.org/10.1175/WAF-D-16-0043.1.
Reale, O., K. M. Lau, A. Silva, and T. Matsui, 2014: Impact of assimilated and interactive aerosol on tropical cyclogenesis. Geophys. Res. Lett., 41, 3282–3288, https://doi.org/10.1002/2014GL059918.
Riihimäki, J., and A. Vehtari, 2014: Laplace approximation for logistic Gaussian process density estimation and regression. Bayesian Anal., 9, 425–448, https://doi.org/10.1214/14-BA872.
Schreiber, T., 2000: Measuring information transfer. Phys. Rev. Lett., 85, 461–464, https://doi.org/10.1103/PhysRevLett.85.461.
Shenton, L. R., and K. O. Bowman, 1977: Maximum likelihood estimation in small samples. Tien Tzu Hsueh Pao, 32, 2020–2023.
Smith, T. M., and R. W. Reynolds, 2003: Extended reconstruction of global sea surface temperatures based on COADS data (1854–1997). J. Climate, 16, 1495–1510, https://doi.org/10.1175/1520-0442-16.10.1495.
Song, Q., J. Ni, and G. Wang, 2013: A fast clustering-based feature subset selection algorithm for high-dimensional data. IEEE Trans. Knowl. Data Eng., 25, 1–14, https://doi.org/10.1109/TKDE.2011.181.
Stips, A., D. Macias, C. Coughlan, E. Garciagorriz, and X. S. Liang, 2016: On the causal structure between CO2 and global temperature. Sci. Rep., 6, 21 691, https://doi.org/10.1038/srep21691.
Strachan, J., P. L. Vidale, K. Hodges, M. Roberts, and M. E. Demory, 2013: Investigating global tropical cyclone activity with a hierarchy of AGCMs: The role of model resolution. J. Climate, 26, 133–152, https://doi.org/10.1175/JCLI-D-12-00012.1.
Sun, J., and E. M. Bollt, 2014: Causation entropy identifies indirect influences, dominance of neighbors and anticipatory couplings. Physica D, 267, 49–57, https://doi.org/10.1016/j.physd.2013.07.001.
Wang, B., 1995: Interdecadal changes in El Niño onset in the last four decades. J. Climate, 8, 267–285, https://doi.org/10.1175/1520-0442(1995)008<0267:ICIENO>2.0.CO;2.
Wang, B., and J. C. L. Chan, 2002: How strong ENSO events affect tropical storm activity over the western North Pacific. J. Climate, 15, 1643–1658, https://doi.org/10.1175/1520-0442(2002)015<1643:HSEEAT>2.0.CO;2.
Wang, P.-Z., 1990: A factor spaces approach to knowledge representation. Fuzzy Sets Syst., 36, 113–124, https://doi.org/10.1016/0165-0114(90)90085-K.
Wang, X., and Y. You, 2002: The theory of optimal information diffusion estimation and its application. Computational Intelligent Systems for Applied Research: Proceedings of the Fifth International FLINS Conference, D. Ruan, P. D’Hondt, and E. E. Kerre, Eds., 198–207, https://doi.org/10.1142/9789812777102_0025.
Wang, Y., 2012: Recent research progress on tropical cyclone structure and intensity. Trop. Cyclone Res. Rev., 1, 254–275, https://doi.org/10.6057/2012TCRR02.05.
Xie, S.-P., K. M. Hu, J. Hafner, H. Tokinaga, Y. Du, G. Huang, and T. Sampe, 2009: Indian Ocean capacitor effect on Indo–western Pacific climate during the summer following El Niño. J. Climate, 22, 730–747, https://doi.org/10.1175/2008JCLI2544.1.
Zhan, R., Y. Wang, and X. Lei, 2011: Contributions of ENSO and east Indian Ocean SSTA to the interannual variability of northwest Pacific tropical cyclone frequency. J. Climate, 24, 509–521, https://doi.org/10.1175/2010JCLI3808.1.
Zhang, W., G. A. Vecchi, G. Villarini, H. Murakami, A. Rosati, X. Yang, L. Jia, and F. Zeng, 2017: Modulation of western North Pacific tropical cyclone activity by the Atlantic Meridional Mode. Climate Dyn., 48, 631–647, https://doi.org/10.1007/s00382-016-3099-2.
Zhang, X., Q. Xiao, and P. J. Fitzpatrick, 2007: The impact of multisatellite data on the initialization and simulation of Hurricane Lili’s (2002) rapid weakening phase. Mon. Wea. Rev., 135, 526–548, https://doi.org/10.1175/MWR3287.1.
Zhang, Y., Z. Yang, and W. Li, 2006: Analyses of urban ecosystem based on information entropy. Ecol. Modell., 197, 1–12, https://doi.org/10.1016/j.ecolmodel.2006.02.032.
Zhao, M., I. M. Held, and G. A. Vecchi, 2010: Retrospective forecasts of the hurricane season using a global atmospheric model assuming persistence of SST anomalies. Mon. Wea. Rev., 138, 3858–3868, https://doi.org/10.1175/2010MWR3366.1.

Footnotes

Supplemental information related to this paper is available at the Journals Online website: https://doi.org/10.1175/JTECH-D-17-0109.s1.

© 2018 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

1 If a TC forms on one side (eastern or western), drops below the TC threshold while propagating toward the other side, and reintensifies there, we count it only once and assign it to the cluster in which it formed.
