1. Introduction
Analyzing temporal changes in a climate time series is becoming increasingly important because we often need to know when a major shift in a climate system occurs. This information, if assessed appropriately, would aid researchers and planners in devising more comprehensive analyses of complex climate systems and in making sound decisions. One example is the well-known phase change of the Pacific decadal oscillation (PDO) in the late 1970s. Studies have shown that the negative phase of the PDO favors enhanced wintertime precipitation in the Pacific Northwest, while the positive phase does just the opposite (Mantua et al. 1997). Therefore, knowing when a major climate system changes phase would benefit many sectors, such as agriculture and hydropower operations.
Bayesian analysis provides a coherent and rational framework for distilling uncertainties by incorporating diverse information sources such as subjective beliefs, historical observations, model simulations, and new information. A comprehensive textbook introducing the Bayesian paradigm and its applications to atmospheric data is Epstein (1985). Solow (1988) applied a Bayesian method for inferences about climate change based on the two-phase regression model. Elsner and Bossak (2001) explicitly demonstrated the application of Bayesian analysis to U.S. hurricanes by combining the less reliable historical accounts of hurricanes in the nineteenth century with the more reliable records from the twentieth century to yield a best estimate of the annual rates. Besides being used for making inferences, Bayes' theorem can be applied in a predictive mode to obtain the probability of future U.S. landfalling hurricanes (e.g., Epstein 1985; Elsner and Bossak 2001). This feature is useful for disaster mitigation planning and for the insurance and reinsurance industries because landfalling hurricanes cause enormous property damage and their future occurrences are unknown in relation to climate variability.
Using a step function as an independent variable and taking a logarithmic transformation of the annual major hurricane rates over the North Atlantic as a dependent variable, Elsner et al. (2000) developed a model for detecting change points in the Atlantic hurricane time series. Chu (2002) used a similar log-linear regression method to model the shifts in annual tropical cyclone (TC) frequency over the central North Pacific (CNP) for the period 1966–2000. Two change points were found to be significant at the α = 0.05 level. A change-point time may be defined as the last year of an old epoch or the first year of a new epoch. Here the latter definition was adopted. The first change point occurred in 1982 with a t ratio of 2.45, and the second shift occurred in 1995 with a marginally significant t ratio of 2.04. As a result, the entire 35-yr record is partitioned into three epochs: 1966–81, 1982–94, and 1995–2000.
Though the approach taken by Chu (2002) provided a simple and straightforward analysis as to when decadal variations in TC frequency occurred, it does not take into account the fact that the seasonal TC occurrence may be better described by a discrete Poisson process. Moreover, the log-linear model does not contain information related to the posterior probability for the change point, which is important for the prediction of future outcomes of the process. Elsner and Schmertmann (1993) and Wilson (1999) noted that the very low number of intense hurricanes over the North Atlantic and the lack of serial dependence in intense hurricanes from year to year justify the Poisson assumption. This may also be the case for the CNP because the annual TC events are rather few. For example, during the last 37 yr (1966–2002), 5 yr have no TC activity, 6 yr have only one TC occurrence in each year, and 8 yr have two TC occurrences in each year.
In view of the shortcomings of representing TC variations by a linear model, this study models the temporal changes of TC activity as a Poisson process whose rate parameter is treated as a gamma-distributed random variable. The resulting Bayesian analysis is then used to forecast future TC activity over the CNP. The essential point of this study is that, rather than assuming that the statistical distribution of the TC rate is time invariant throughout the observation period, we introduce a “single change” hypothesis under which there is a major shift in the TC activity rate. A few remarks justify the application of Bayesian methods in this study. As will be seen in section 4, the Bayesian approach is able to include other sources of less reliable data as prior information in the analysis, an advantage over the non-Bayesian method, which often depends on more reliable but shorter portions of the historical records. Moreover, inferences about temporal shifts are couched in terms of probabilities, another desirable feature of the Bayesian paradigm. In contrast, the classical non-Bayesian method provides a deterministic estimate of the change-point location but no probability information about the uncertainty of the change points.
The structure of this paper is as follows. Section 2 describes the data source. The basic mathematical model of TC activity is reviewed in section 3. Section 4 introduces the three-level Bayesian analysis framework pertinent to our specific problem. Main results are described in section 5. A summary and discussion are found in section 6.
2. Data
The tropical cyclone records over the CNP at 6-h intervals come from the National Hurricane Center's (NHC) best tracks dataset, as described in Clark and Chu (2002). Here, tropical cyclones refer to tropical storms and hurricanes. In this study, tropical storms are defined as systems with maximum sustained surface wind speeds between 17.5 and 33 m s−1, and hurricanes as systems with wind speeds of at least 33 m s−1. Mayfield and Rappaport (1992) suggest that reliable TC statistics in the CNP began in 1966, when satellite reconnaissance was initiated in the region. A second dataset used is the recent TC records compiled annually by the Central Pacific Hurricane Center, an entity of the National Weather Service Forecast Office in Honolulu, Hawaii. By combining these two datasets, reliable TC records extend from 1966 to 2002, which constitutes the main dataset for this study.
For Bayesian analysis, prior information of TCs before 1966 is needed. Data prior to 1966 can be found in Shaw (1981), who did the laudable task of compiling historical TC records for the CNP from various sources, including Mariners Weather Log, the Joint Typhoon Warning Center's annual typhoon reports, real-time cyclone tracks and advisories issued by the Central Pacific Hurricane Center, published and unpublished papers by emeritus Prof. James Sadler of the University of Hawaii, and others. Although historical TC accounts are thought to be less reliable, we are only concerned with the annual counts of TC in this study, not with the attributes of TC at 6-h intervals as detailed in the NHC's best track dataset.
3. Mathematical model of TC activity



Given h TCs occurring in T yr, if the prior density for λ is gamma distributed with parameters h′ and T′, the posterior density for λ will also be gamma distributed with parameters h + h′ and T + T′. That is, the gamma density is the conjugate prior for λ. Referring to (2), the conditional expectation with respect to λ is E[λ|h′, T′] = h′/T′. In the later part of this paper, we will discuss how to find the prior information h′ and T′.
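The conjugate update described above can be sketched in a few lines. This is a minimal illustration, assuming the usual parameterization in which the gamma prior has shape h′ and rate T′ (so that its mean is h′/T′, as stated in the text); the function names and the numbers in the usage lines are hypothetical and are not taken from this study.

```python
def gamma_poisson_update(h_prior, t_prior, h_obs, t_obs):
    """Posterior gamma parameters after observing h_obs TCs in t_obs years.

    The prior is gamma with shape h_prior and rate t_prior; by conjugacy the
    posterior is gamma with shape h_prior + h_obs and rate t_prior + t_obs.
    """
    return h_prior + h_obs, t_prior + t_obs


def gamma_mean(h, t):
    """Expected rate E[lambda] = h / t for a gamma(h, t) density."""
    return h / t


# Hypothetical numbers, for illustration only (not values from the paper).
h_post, t_post = gamma_poisson_update(h_prior=4.0, t_prior=2.0, h_obs=30, t_obs=16)
prior_rate = gamma_mean(4.0, 2.0)        # prior expectation of the annual rate
post_rate = gamma_mean(h_post, t_post)   # posterior expectation of the annual rate
```

In this reading, h′ and T′ act like a pseudo-count of storms and a pseudo-length of record, which is why less reliable historical records can enter the analysis as prior information.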

4. Bayesian approach for detection of shift in the TC series
a. Hypothesis model
In this study, we focus mainly on the case in which the probability of more than one change point within the desired period is negligible. This reduces the analysis to a choice between two scenarios: a “no change-point” hypothesis and a “single change-point” hypothesis. The following derivations are based on the mathematical model described in section 3. The annual tropical cyclone data, h1, h2, … , hn, are assumed to be a series of independent random variables. Mathematically, the two hypothesis models are postulated as follows.
- Hypothesis H0: “no change point of the rate” of the TC series:
  hi ∼ Poisson (hi|λ, T), i = 1, 2, … , n, where T is the unit observation time;
  λ ∼ gamma (h′, T′),
  where the prior knowledge of the parameters h′ and T′ is given.
- Hypothesis H1: “a single change point of the rate” of the TC series:
  hi ∼ Poisson (hi|λ1, T), when i = 1, 2, … , τ − 1;
  hi ∼ Poisson (hi|λ2, T), when i = τ, … , n;
  τ = 2, 3, … , n, T is as defined in the hypothesis H0, and
  λ1 ∼ gamma (h′1, T′1),
  λ2 ∼ gamma (h′2, T′2),
  where the prior knowledge of the parameters h′1, T′1, h′2, and T′2 is given.
Note that there are two epochs in this model and τ is known as the first year of the new epoch, or the change point.
b. Hypothesis analysis
With the formula of Bayesian inference under hypothesis H1 as described in the appendix, we derive the Bayesian method for analyzing the posterior probabilities of the hypothesis models H0 and H1, based on the given observations and the statistical assumptions described previously. First, we need to specify the prior distribution over the hypothesis models H0 and H1, which can be any discrete probability distribution. A proper noninformative choice is the uniform distribution, that is, P(H0) = P(H1) = 1/2, since there is no prior information regarding which of the hypotheses is preferable.
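The comparison of H0 and H1 can be sketched numerically. The code below is a minimal illustration, not a reproduction of the equations of this section: it assumes a unit observation time per year, gamma priors with shape h′ and rate T′, and a uniform prior over the admissible change points τ = 2, … , n (that uniform prior is an assumption of the sketch). The function names, the synthetic counts, and the prior parameters are hypothetical.

```python
import math


def log_marginal(counts, h_prior, t_prior):
    """Log marginal likelihood of annual Poisson counts after integrating
    the rate over a gamma(shape=h_prior, rate=t_prior) prior."""
    s, n = sum(counts), len(counts)
    return (h_prior * math.log(t_prior) - math.lgamma(h_prior)
            + math.lgamma(h_prior + s) - (h_prior + s) * math.log(t_prior + n)
            - sum(math.lgamma(h + 1) for h in counts))


def log_sum_exp(values):
    """Numerically stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def change_point_posterior(counts, prior_h0, prior_before, prior_after, p_h1=0.5):
    """Return 2 ln(B), P(H1 | h), and the posterior of tau under H1.

    tau is 1-based and marks the first year of the new epoch (tau = 2..n).
    """
    n = len(counts)
    log_m0 = log_marginal(counts, *prior_h0)
    taus = list(range(2, n + 1))
    log_terms = [log_marginal(counts[:tau - 1], *prior_before)
                 + log_marginal(counts[tau - 1:], *prior_after) for tau in taus]
    log_norm = log_sum_exp(log_terms)
    log_m1 = log_norm - math.log(len(taus))   # uniform prior on tau (assumption)
    tau_post = {tau: math.exp(t - log_norm) for tau, t in zip(taus, log_terms)}
    two_ln_b = 2.0 * (log_m1 - log_m0)
    odds = math.exp(log_m1 - log_m0) * p_h1 / (1.0 - p_h1)
    return two_ln_b, odds / (1.0 + odds), tau_post


# Synthetic counts with an obvious shift at year 9 (not the CNP record).
counts = [1, 2, 1, 0, 2, 1, 1, 2, 5, 6, 4, 7, 5, 6, 5, 4]
two_ln_b, p_h1_post, tau_post = change_point_posterior(
    counts, prior_h0=(3.5, 1.0), prior_before=(2.0, 1.0), prior_after=(5.0, 1.0))
best_tau = max(tau_post, key=tau_post.get)
```

Working in log space with `math.lgamma` avoids overflow in the gamma functions; the factorial terms cancel in the Bayes factor but are kept so that `log_marginal` is a complete marginal likelihood on its own.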




c. Predictive distributions





The simplified equation (11) is biased relative to (10). However, if the posterior probability of the H1 hypothesis is much larger than that of the H0 hypothesis, and the posterior probability at the maximum a posteriori (MAP) estimate of the change point is much higher than that of any other year, this bias should be within tolerance. As will be shown later, this simplification has little impact on the final prediction.
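For a single gamma posterior on the rate, the predictive distribution of a future count has a closed negative binomial form. The sketch below illustrates that building block only, under the assumption of a gamma posterior with shape `h_post` and rate `t_post`; it does not reproduce equations (10) or (11), and the posterior parameters used in the example are hypothetical. A prediction that conditions on one hypothesis and one fixed change point would use a single term of this kind, whereas a full mixture would weight such terms by the posterior probabilities of the hypotheses and change points.

```python
import math


def predictive_pmf(h_new, t_new, h_post, t_post):
    """P(h_new TCs in the next t_new years) when the rate is gamma(h_post, t_post).

    Integrating the Poisson likelihood over the gamma density gives a
    negative binomial distribution.
    """
    log_p = (math.lgamma(h_new + h_post) - math.lgamma(h_post)
             - math.lgamma(h_new + 1)
             + h_post * math.log(t_post / (t_post + t_new))
             + h_new * math.log(t_new / (t_post + t_new)))
    return math.exp(log_p)


def predictive_cdf(h_max, t_new, h_post, t_post):
    """P(no more than h_max TCs in the next t_new years)."""
    return sum(predictive_pmf(h, t_new, h_post, t_post) for h in range(h_max + 1))


# Hypothetical posterior parameters, for illustration only.
H_POST, T_POST, DECADE = 60.0, 18.0, 10.0
total = sum(predictive_pmf(h, DECADE, H_POST, T_POST) for h in range(401))
mean = sum(h * predictive_pmf(h, DECADE, H_POST, T_POST) for h in range(401))
p_at_most_40 = predictive_cdf(40, DECADE, H_POST, T_POST)
```

The predictive mean equals `h_post * t_new / t_post`, but the spread is wider than that of a Poisson distribution with the same mean because uncertainty in the rate is carried into the forecast.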
d. Calculation of the prior parameters
So far, we have constructed the theoretical framework for a three-level hierarchical Bayesian analysis, but we have not mentioned how to obtain the prior knowledge of the distribution, that is, how to estimate the prior parameters h′ and T′ for the H0 hypothesis and the parameters h′1, T′1, h′2, and T′2 for the H1 hypothesis.
It is also necessary to have prior parameters for the post change point (e.g., Tapsoba et al. 2004). In Chu (2002), a significant shift in 1982 and a marginally significant shift in 1995 were suggested. To make inferences about the shift around 1982, it is reasonable to choose the data before 1995 as the target period for the “after change” period under the H1 hypothesis. To maximize the available sample size for the modeling period, we thus select the annual TC observations from 1990–94 as the prior. As for the H0 hypothesis, we just combine this set of data with the short, pre-1966 dataset to form the prior knowledge.

In this study, under H1, for the “before change-point” period, sample mean and sample variance are equal to 2.22 and 5.44, respectively. From (12), this leads to
5. Results
a. Results of shift in intensity of TCs
Figure 1 shows the time series of annual TC counts over the CNP since 1966. The average rate prior to 1982 is about 1.9 TCs yr−1 but it increases to almost 3.6 TCs yr−1 thereafter. The result of the Bayesian analysis on the shift year of the annual TC counts in CNP is listed in Table 2. From this table, we can see that the measure of Bayes factor [2 ln(B)] for the annual TC counts during the 1966–89 period is 2.22, which favors H1 over the H0 hypothesis with a uniform prior for the hypothesis layer. The posterior probability that a change has occurred is rather high, reaching 0.75. Figure 2 shows the posterior probability of the change point of TC activity, plotted as a function of time. Larger probabilities on year i imply a more likely change occurring with i being the first year of a new epoch. The maximum probability of 0.31 occurs in 1982. This suggests that the most likely year of the new epoch is 1982 although other change-point years such as 1981 and 1980 are plausible candidates.
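The link between the reported Bayes factor measure and the posterior probability of H1 can be checked directly. The sketch below assumes that B is the ratio of the marginal likelihood under H1 to that under H0, which is consistent with a positive 2 ln(B) favoring H1, and uses the uniform prior P(H0) = P(H1) = 1/2; the function name is hypothetical.

```python
import math


def posterior_prob_h1(two_ln_b, prior_h1=0.5):
    """Convert 2 ln(B) into P(H1 | h), with B = P(h | H1) / P(h | H0)."""
    bayes_factor = math.exp(two_ln_b / 2.0)
    prior_odds = prior_h1 / (1.0 - prior_h1)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


p_h1 = posterior_prob_h1(2.22)   # value of 2 ln(B) quoted in the text
print(round(p_h1, 2))            # prints 0.75
```

With equal prior probabilities the posterior odds equal the Bayes factor itself, so 2 ln(B) = 2.22 corresponds to odds of about 3 to 1 in favor of a change point.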
The posterior PDFs of TC intensity before and after the change point, λ1 and λ2, are plotted in Fig. 3a. The posterior distribution represents a combination of the prior distribution and the likelihood function. In this plot, the change-point year is fixed at 1982. Recall from Table 2 that the sample rate is 1.88 before 1982 and 3.57 afterward. Figure 3a shows very little overlap in the tail areas of these two posterior distributions, implying a higher rate for the “after the shift” distribution beginning in 1982. Figure 3b displays the posterior density of (λ2 − λ1). The p value of this difference [P(λ2 − λ1 < 0|H1, h)] is very small (<0.01), strongly supporting the contention of a shift toward a higher annual TC intensity since 1982.
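A tail probability of the form P(λ2 − λ1 < 0) can be approximated by Monte Carlo sampling from two independent gamma posteriors. The sketch below is only an illustration of that idea: the shape and rate values are hypothetical stand-ins (they are not the posterior parameters of this study), and the function name is an assumption.

```python
import random


def prob_rate_decrease(shape1, rate1, shape2, rate2, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(lambda2 - lambda1 < 0) for independent
    gamma posteriors given as (shape, rate) pairs."""
    rng = random.Random(seed)
    below = 0
    for _ in range(n_draws):
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        lam1 = rng.gammavariate(shape1, 1.0 / rate1)
        lam2 = rng.gammavariate(shape2, 1.0 / rate2)
        if lam2 - lam1 < 0:
            below += 1
    return below / n_draws


# Hypothetical posterior parameters chosen only to mimic a rate increase.
p_value = prob_rate_decrease(shape1=30.0, rate1=16.0, shape2=29.0, rate2=8.0)
```

A small estimate indicates that nearly all of the posterior mass of (λ2 − λ1) lies above zero, which is the sense in which a posterior density of the difference supports a shift toward a higher rate.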
From a log-linear regression analysis, Chu (2002) noted that after a major shift in 1982, a second shift, albeit weak, appears to occur in 1995. To test whether the latter shift can also be identified within the Bayesian framework, a similar analysis is performed for the period 1985–2002. For the H1 hypothesis, two rather short prior periods, 1984–88 and 1998–2002, are chosen. For the H0 hypothesis, these two periods are combined as the prior. Results indicate that 2 ln(B) = −0.38, which means that the odds are in favor of the H0 hypothesis (i.e., no change point) over the H1 hypothesis for the post-1982 period.
b. Decadal tropical cyclone prediction
After having identified a change-point year in the TC series, our next goal is to predict TC activity over the CNP on a climate time scale. One way to calculate this predictive distribution is to use formula (10); however, this form may be computationally complicated. Based on the Bayesian analysis results presented in Table 2 and Fig. 2, it is reasonable to choose the H1 hypothesis. Thus, we opt to use the simplified formula (11) with the fixed change point at 1982.
The final decadal predictive PDF and cumulative distribution function (CDF) of TC counts calculated from both (10) and (11) over the CNP are plotted in Fig. 4a and Fig. 4b, respectively. As a comparison, we also plot the predictive PDF and CDF that do not involve the hypothesis layer. In other words, only the traditional approach involving the H0 hypothesis is assumed and (8) is applied. In Fig. 4a, the PDF is narrower when only H0 is assumed and becomes broader after considering H1 hypothesis. Thus, one may expect larger variability in TC rates for the next decade when the H1 is assumed. Moreover, there is an overlap between the two predictive PDFs under H0 and H1, but a significant shift toward the right is clearly seen when H1 is considered. Figure 4b displays the CDFs of predicting no more than a particular number of cyclones over the next decade. For example, the probability of predicting no more than 40 TCs in the next 10 yr when we only consider the H1 hypothesis is 0.74 while predicting the same TC numbers under the H0 hypothesis is 0.98, an almost guaranteed probability of occurrence. Moreover, the difference between the predictive distribution calculated from (10) and the simplified form from (11) is negligible, implying the simplified formula (11) works well for our problem.
6. Summary and discussion
In this study, a hierarchical Bayesian change-point analysis of tropical cyclone counts is developed. Specifically, the annual tropical cyclone counts over the central North Pacific are described by a Poisson process whose rate is conditional on gamma distributions. The method focuses on the scenario in which the probability of more than one shift is negligible. Considering two equiprobable hypotheses, H0 and H1, we perform a hierarchical Bayesian analysis for making inferences about shifts in the tropical cyclone series. Inferences are based on the posterior probabilities of the possible shifts. Results suggest that there is a high probability of a change point in TC intensity in 1982 over the CNP, which is consistent with our earlier analysis based on a simple log-linear regression method (Chu 2002). Bayesian analysis is also used for predicting decadal tropical cyclone variations, and a higher TC frequency is predicted for the next decade when the change point is taken into account. The predicted TC frequency may serve as a benchmark to gauge the future observed TC activity over the central North Pacific.
In the fundamental Bayesian framework, only two layers—a data layer and a parameter layer—are considered for deriving the posterior distribution P(θ|h) and obtaining the optimum predictive distribution P(ĥ|h) = ∫ P(ĥ|θ)P(θ|h) dθ. As illustrated in Fig. 5, the data layer is embodied by a likelihood distribution P(h|θ) and a parameter layer is embodied by prior information P(θ). In this framework, no change points are assumed. Expanding from this two-layer thinking, we introduce a new layer, called the hypothesis layer, which is embodied by prior information P(H), where H represents hypothesis. In this three-layer paradigm (Fig. 5), both the data layer and parameter layer are conditional on hypothesis selection so they are described by P(h|θ, H) and P(θ|H), respectively.
Following the same Bayes' rule, we obtain the posterior distribution for both hypotheses and parameters, P(H|h) and P(θ|H, h). The predictive distribution is thus P(ĥ|h) = ∫∫ P(ĥ|θ, H)P(θ|H, h)P(H|h) dθ dH. For the sake of computational simplicity, we also used the simplified formula P(ĥ|
Recently, Elsner et al. (2004) applied a Markov chain Monte Carlo (MCMC) approach based on the Gibbs sampling algorithm to detect change points in the Atlantic hurricane series. Gibbs sampling assumes that a value for one element of a multidimensional parameter can be generated when values for all other elements of this parameter are given. With some initial values of the distribution parameters prescribed, Gibbs sampling produces sequences of the parameters, such as the hurricane rates before and after a change point. This approach provides an alternative to the classical Bayesian change-point analysis involving the prior, the likelihood function, and the posterior distribution as presented in this study. While our study and Elsner et al. (2004) focus on a single change-point scenario, more elaborate multiple-hypothesis choices such as the “double change points” hypothesis have been proposed by Lavielle and Labarbier (2001). It is yet to be demonstrated how such complicated modeling processes can be applied to detecting more than one change point in the hurricane time series.
We thank two anonymous reviewers and Francis Zwiers for suggestions that led to improvements in the manuscript. Partial support for this study has been provided by NOAA Grant NA17RJ1230.
REFERENCES
Carlin, B. P., and T. A. Louis, 2000: Bayes and Empirical Bayes Methods for Data Analysis. Chapman & Hall/CRC, 419 pp.
Chu, P-S., 2002: Large-scale circulation features associated with decadal variations of tropical cyclone activity over the central North Pacific. J. Climate, 15, 2678–2689.
Chu, P-S., and J. Wang, 1998: Modeling return periods of tropical cyclone intensities in the vicinity of Hawaii. J. Appl. Meteor., 37, 951–960.
Clark, J. D., and P-S. Chu, 2002: Interannual variation of tropical cyclone activity over the central North Pacific. J. Meteor. Soc. Japan, 80, 403–418.
Elsner, J. B., and C. P. Schmertmann, 1993: Improving extended-range seasonal predictions of intense Atlantic hurricane activity. Wea. Forecasting, 8, 345–351.
Elsner, J. B., and B. H. Bossak, 2001: Bayesian analysis of U.S. hurricane climate. J. Climate, 14, 4341–4350.
Elsner, J. B., T. Jagger, and X-F. Niu, 2000: Changes in the rates of North Atlantic major hurricane activity during the 20th century. Geophys. Res. Lett., 27, 1743–1746.
Elsner, J. B., X. Niu, and T. H. Jagger, 2004: Detecting shifts in hurricane rates using a Markov chain Monte Carlo approach. J. Climate, 17, 2652–2666.
Epstein, E. S., 1985: Statistical Inference and Prediction in Climatology: A Bayesian Approach. Meteor. Monogr., No. 42, Amer. Meteor. Soc., 199 pp.
Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin, 2004: Bayesian Data Analysis. 2d ed. Chapman & Hall/CRC, 668 pp.
Keim, B. D., and J. F. Cruise, 1998: A technique to measure trends in the frequency of discrete random events. J. Climate, 11, 848–855.
Lavielle, M., and M. Labarbier, 2001: An application of MCMC methods for the multiple change-points problem. Signal Process., 81, 39–53.
Mantua, N. J., S. R. Hare, Y. Zhang, J. M. Wallace, and R. C. Francis, 1997: A Pacific interdecadal oscillation with impacts on salmon production. Bull. Amer. Meteor. Soc., 78, 1069–1079.
Mayfield, M., and E. N. Rappaport, 1992: Eastern North Pacific hurricane season of 1991. Mon. Wea. Rev., 120, 2697–2708.
Raftery, A. E., 1996: Approximate Bayes factors and accounting for model uncertainty in generalized linear models. Biometrika, 83, 251–266.
Shaw, S. L., 1981: A history of tropical cyclones in the central North Pacific and the Hawaiian Islands: 1832–1979. Central Pacific Hurricane Center Rep., NOAA/National Weather Service Forecast Office, Honolulu, HI, 137 pp.
Solow, A. R., 1988: A Bayesian approach to statistical inference about climate change. J. Climate, 1, 512–521.
Tapsoba, D., M. Haché, L. Perreault, and B. Bobée, 2004: Bayesian rainfall variability analysis in West Africa along cross sections in space–time grid boxes. J. Climate, 17, 1069–1082.
Wilson, R. M., 1999: Statistical aspects of major (intense) hurricanes in the Atlantic basin during the past 49 hurricane seasons (1950–1998): Implications for the current season. Geophys. Res. Lett., 26, 2957–2960.
APPENDIX
Bayesian Inference for the Hypothesis H1








Fig. 1. Time series of annual tropical cyclone counts over the central North Pacific from 1966 to 2002. Broken lines denote the means for the periods 1966–81 and 1982–2002, respectively.
Citation: Journal of Climate 17, 24; 10.1175/JCLI-3248.1

Fig. 2. Posterior probability distribution of the change point, P(τ|h, H1), of the TC series over the CNP.

Fig. 3. (a) Posterior density function of annual TC intensity before the shift, P(λ1|h, H1), and after the shift, P(λ2|h, H1), with the change-point year being set in 1982. (b) Posterior density of (λ2 − λ1).

Fig. 4. (a) Predictive PDF and (b) predictive CDF of decadal TC counts over the CNP, where the circle refers to the two-layer Bayesian analysis, the triangle refers to the complete three-layer Bayesian analysis, and the asterisk refers to the simplified three-layer Bayesian analysis by using Eq. (11).

Fig. 5. Hierarchical structure of the Bayesian analysis methodology.
Table 1. Raftery's scale for interpreting the Bayes factors.

Table 2. Results of the Bayesian analysis on the change point of annual TC counts over the CNP. Here, τ stands for the change-point year, B is the Bayes factor, λ1 and λ2 represent the TC rate before and after the change point under the H1 hypothesis, respectively, and P(H1|h) is the posterior probability of hypothesis H1. The analysis period is 1966–89.

School of Ocean and Earth Science and Technology Contribution Number 6490.