1. Introduction
Long instrumental time series of climate variables are often affected by abrupt changes caused by climatic and/or nonclimatic factors. In a time series context, the identification of abrupt changes is equivalent to the subdivision of a series into segments characterized by homogeneous statistical features (e.g., mean and higher-order moments). Time series segmentation or changepoint detection is a broad topic involving several application fields, such as information systems (Tartakovsky et al. 2006), neurology (Robinson et al. 2010), genetics (Tai et al. 2010), and economics (Koop and Potter 2009). Changepoint problems have also received increasing attention in geophysical research (e.g., Reeves et al. 2007). Many studies have recognized that abrupt changes affect climate records, such as the North Pacific Ocean mean sea level pressure (Trenberth 1990), stratospheric temperature (Pawson et al. 1998), and surface air temperature (e.g., Miranda and Tomé 2009; Toreti and Desiato 2008). Besides these climate-related changes, nonclimatic factors (e.g., relocation of the weather station, changes of instrumentation) usually cause sudden changes (Kuglitsch et al. 2009, and references therein). Therefore, the identification and attribution of abrupt changes in time series are essential tasks for an accurate analysis of climate and climate change (e.g., Randall et al. 2007; Jansen et al. 2007).
Various methods, characterized by different features, are generally applicable—for example, Bayesian methods (Perreault et al. 2000; Ray and Tsay 2002), likelihood ratio tests (Kim and Siegmund 1989), dynamic algorithms (Bai and Perron 1998), optimal least squares segmentation, and hidden Markov models (Hubert 1997; Kehagias 2004; Kehagias et al. 2005; Kehagias and Fortin 2006). Some methods have also been developed for specific tasks such as the homogenization of climate data (e.g., Alexandersson and Moberg 1997; Easterling and Peterson 1995; Lund et al. 2007; Wang et al. 2007; Lu et al. 2010).
In focusing on changepoints attributable to nonclimatic factors (i.e., inhomogeneities or break points), identification and correction (whenever possible) are ideally performed after standard quality control procedures (e.g., Moberg et al. 2006). For instance, trustworthy trend assessments and analyses of extreme events rely on high-quality data, not influenced by nonclimatic factors (Toreti et al. 2010a). The aim of homogenization techniques (i.e., procedures combining detection and correction) is the removal (or at least the reduction) of the nonclimatic signal affecting time series under investigation [for an overview, the reader is referred to Aguilar et al. (2003)]. The detection step aims to identify times at which the series suddenly changes behavior because of local factors not ascribable to the climate system. During the analyses of sets of annual and seasonal mean temperature and total precipitation data (Kuglitsch et al. 2009; Toreti et al. 2009, 2010b), several methods were applied for the detection of break points: sequential standard normal homogeneity test (SNHT; Alexandersson and Moberg 1997), RHtest (Wang et al. 2007; Wang 2008a,b), and the method of Caussinus and Mestre (2004; CauMe). All these methods have limitations and drawbacks in regard to underestimation or overestimation of multiple changepoints and/or incorrect break point location times. Trying to overcome these shortcomings, we present a more general time series segmentation method based on hidden Markov models (HMM) and a genetic algorithm (GA). The method, called genetic algorithm hidden Markov models for detection of inhomogeneities (GAHMDI), is explained and implemented with some restrictive hypotheses (e.g., the process cannot come back to a previous condition), usually satisfied in the homogenization context, that can be relaxed for other applications.
Section 2 focuses on the proposed methodology. Section 3 compares the performance of GAHMDI with the CauMe, SNHT, and RHtest segmenters using simulated series and a case study on winter precipitation recorded in northern Italy from 1950 to 2006. This comparative analysis is not exhaustive: other recent homogenization procedures (e.g., Lu et al. 2010) have not been included, because simulations are computationally expensive and a complete comparison of GAHMDI with all available methods was not feasible within the frame of this research. Section 4 provides conclusions and an outlook. Appendixes A and B give further details on the method and the simulation of the series.
2. Method
Let {Xt}t=1,...,N be a discrete time process, for instance annual mean temperature from a weather station, affected by K − 1 changepoints (K is not known) located at unknown times {τ1, … , τK−1} (e.g., Lu et al. 2010). Thus, the process is characterized by K homogeneous segments given by τj−1 < t ≤ τj for j = 1, … , K with τ0 = 0 and τK = N. The aim of detection procedures is the identification of K and {τ1, … , τK−1} and the estimation of the statistical features describing the homogeneous segments, for example the mean and variance. GAHMDI puts this problem in a hidden Markov models framework (section 2a) and provides an estimated segmentation when K is fixed a priori. To avoid convergence to a local maximum, GAHMDI performs an initial estimation by using a GA (section 2b). Finally, GAHMDI is applied with K ∈ {1, … , Kmax} for some preset Kmax and the optimal number of segments (changepoints) is chosen by minimizing a penalized likelihood objective function of the form −log(likelihood) + penalty (section 2c).
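The segment definition above can be illustrated with a minimal Python sketch. It only shows how a known set of changepoint times τ1, … , τK−1 splits a series into K segments and how the mean and variance of each segment are obtained; it is not the GAHMDI estimation procedure. The function name segment_stats and the use of the population variance are assumptions made for illustration.

```python
from statistics import mean, pvariance


def segment_stats(x, taus):
    """Summarise the homogeneous segments of x defined by changepoint times taus.

    x[0] holds X_1. Segment j covers tau_{j-1} < t <= tau_j with tau_0 = 0 and
    tau_K = N, which is the slice x[tau_{j-1}:tau_j] in 0-based indexing.
    Assumes 0 < tau < N for every changepoint, so no segment is empty.
    """
    bounds = [0] + sorted(taus) + [len(x)]
    stats = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        seg = x[lo:hi]
        stats.append({
            "start": lo + 1,          # first time index of the segment (1-based)
            "end": hi,                # last time index of the segment (1-based)
            "mean": mean(seg),
            "var": pvariance(seg),    # population variance (illustrative choice)
        })
    return stats


# Toy series with one changepoint after the 4th value (K = 2 segments).
series = [0.1, -0.2, 0.0, 0.1, 1.1, 0.9, 1.0, 1.2]
result = segment_stats(series, taus=[4])
```

Here the first segment spans t = 1, … , 4 and the second t = 5, … , 8, matching the convention τ0 = 0 and τK = N.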
a. Hidden Markov models
HMMs are broadly applied, especially in speech recognition; however, few authors have applied them to segmentation problems (e.g., Kehagias 2004). In this frame, {Xt}t=1,2,...,N depends on an unobservable process {St}t=1,2,...,N (the state process) that takes values in {1, 2, … , K}. In the homogenization context the state process could be considered as the set of nonclimatic factors influencing the measured variables (e.g., the position of the weather station, the observing practices). The state process is a Markov chain, characterized by a transition matrix









b. Initial estimation
Suppose the number of states K is fixed. The Baum–Welch algorithm depends on an initial segmentation estimate that can be obtained by partitioning the observations, that is, giving a first guess of the state sequence. This state sequence is usually random (Kehagias 2004) or provided by partitioning around medoids (Fridlyand et al. 2004; Swami and Jain 2006). However, such initializations can lead to convergence to local maxima. To reach the global maximum, GAHMDI uses a GA to estimate the initial state sequence. GAs can be defined either as a family of computational models inspired by evolution (Whitley 1994) or as robust techniques for optimization based on the laws of natural selection and genetics. They were introduced by Holland (1992), and several authors have applied GAs to the estimation of HMM parameters, usually hybridizing them (e.g., Kwong et al. 2001; Won et al. 2004). Jann (2006) and Li and Lund (2012) directly use GAs to solve climate homogenization problems.
The main elements of a GA are a set of possible solutions, called the population, and an evaluation function. Through genetic processes, these algorithms reach an optimal solution in terms of the evaluation function. The first step of a GA is the creation of the initial population (randomly generated). Each individual of the population is encoded as a chromosome and represents a possible solution to the optimization problem. A fitness value is assigned to each chromosome by using the evaluation function. Before undertaking the genetic operations, an intermediate population must be created by selecting individuals from the initial population. For this task, methods like stochastic universal sampling (Baker 1987) or tournament selection (e.g., Blickle and Thiele 1995) have been developed. The latter is widespread and well known for its efficiency. It organizes tournaments between two randomly chosen individuals, and the winners (in terms of the fitness values) become members of the intermediate population. The next step involves the genetic processes, that is, crossover and mutation. The aim is the evolution of the population toward a new generation of individuals. Single-point crossover involves two chromosomes that are split at the same point and recombined to produce offspring. For instance, parents a = (a1, … , an) and b = (b1, … , bn), crossed over at point h, give offspring c = (a1, … , ah, bh+1, … , bn) and d = (b1, … , bh, ah+1, … , an). The crossover point is randomly chosen and the crossover operation is performed with a fixed probability pc. The other genetic operation is called mutation; it involves only one chromosome and produces small changes in its structure. Let a = (1, 1, 1, 1, 2, 2, 2) be a chromosome. A mutation of a, for instance, is the chromosome am = (1, 1, 1, 2, 2, 2, 2). Mutation is also applied with a fixed probability pm. Finally, the fitness value of each individual of the new generation is calculated.
To guarantee the survival of the best solutions belonging to the previous population, a process called elitism is performed. Elitism replaces the worst solutions of the new generation with the best solutions of the previous one (only if the fitness values of the former are lower). The procedure is iterated until convergence.
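The genetic operators described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GAHMDI implementation: chromosomes are tuples of state labels, the fitness function is maximized, the mutation shifts one state boundary by one position (one way to reproduce the example a → am), and all function names are hypothetical.

```python
import random


def tournament(population, fitness, rng):
    """Binary tournament: draw two individuals at random, keep the fitter one."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def crossover(a, b, h):
    """Single-point crossover at point h (1 <= h < n): swap the tails."""
    return a[:h] + b[h:], b[:h] + a[h:]


def mutate(chrom, rng):
    """Shift one randomly chosen state boundary by one position.

    Illustrative assumption; no guard against a one-element segment vanishing.
    """
    boundaries = [i for i in range(1, len(chrom)) if chrom[i] != chrom[i - 1]]
    if not boundaries:
        return chrom
    i = rng.choice(boundaries)
    out = list(chrom)
    if rng.random() < 0.5:
        out[i - 1] = chrom[i]      # boundary moves one step to the left
    else:
        out[i] = chrom[i - 1]      # boundary moves one step to the right
    return tuple(out)


def elitism(previous, new, fitness, n_elite=1):
    """Replace the worst members of `new` by the best of `previous` if those are fitter."""
    best_prev = sorted(previous, key=fitness, reverse=True)[:n_elite]
    out = sorted(new, key=fitness)             # worst individuals first
    for k, elite in enumerate(best_prev):
        if fitness(out[k]) < fitness(elite):
            out[k] = elite
    return out


rng = random.Random(0)
child_c, child_d = crossover((1, 1, 1, 1), (2, 2, 2, 2), h=2)
mutant = mutate((1, 1, 1, 1, 2, 2, 2), rng)
```

In a full GA loop, crossover and mutation would each be applied with the fixed probabilities pc and pm, and the selection, recombination, and elitism steps would be repeated until convergence.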


c. How many states?



d. Additional remarks







3. Results: Detection of inhomogeneities
The homogenization of climate time series (the candidate) relies on the detection of an unknown number of changepoints. This is usually done by comparing the candidate series to a set of well-correlated neighboring series belonging to the same climatic area (e.g., Peterson and Easterling 1994; Aguilar et al. 2003; Caussinus and Mestre 2004; Menne and Williams 2005; Kuglitsch et al. 2009). To remove the climate signal and to detect only artificial break points, changepoint methods are usually applied to the standardized difference series (candidate minus reference) of annual/seasonal values (in case of temperature) or to the log series of ratios (in case of precipitation; e.g., Toreti et al. 2009). The detection process is run r times, where r is the cardinality of the reference set. Kuglitsch et al. (2009) suggest retaining break points confirmed by three or more reference series within two consecutive years. In section 3a GAHMDI is tested on a simulated sample and its performance is compared to CauMe, SNHT, and RHtest segmenters. In section 3b the application of the method to a total winter precipitation series (from a weather station located in northern Italy) is shown.
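The preprocessing and confirmation steps described above can be sketched as follows. The sketch assumes that "standardized" means subtracting the mean and dividing by the (population) standard deviation of the difference series, and it reads the confirmation rule as "at least three reference series report a break inside the same window of two consecutive years"; both readings, the function names, and the example years are illustrative assumptions.

```python
import math
from statistics import mean, pstdev


def standardized_difference(candidate, reference):
    """Candidate minus reference, centred and scaled (temperature-like variables)."""
    diff = [c - r for c, r in zip(candidate, reference)]
    m, s = mean(diff), pstdev(diff)
    return [(d - m) / s for d in diff]


def log_ratio(candidate, reference):
    """Log of candidate/reference ratios (precipitation-like, strictly positive values)."""
    return [math.log(c / r) for c, r in zip(candidate, reference)]


def confirmed_breaks(detections, min_refs=3, window=2):
    """Keep break years supported by at least `min_refs` reference series.

    `detections` holds one list of detected break years per reference series.
    A year y is kept when enough series report a break in [y, y + window - 1].
    """
    years = sorted({y for per_ref in detections for y in per_ref})
    kept = []
    for y in years:
        if kept and y < kept[-1] + window:
            continue                      # already covered by a retained window
        support = sum(
            any(y <= d < y + window for d in per_ref) for per_ref in detections
        )
        if support >= min_refs:
            kept.append(y)
    return kept


# Hypothetical detections from r = 4 reference series.
breaks = confirmed_breaks([[1975], [1975, 1988], [1974], [1960]])
z = standardized_difference([1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0])
```

With these made-up detections, only the window starting in 1974 is supported by three reference series, so a single break is retained.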
a. Simulated series
Two well-correlated series (a candidate and a reference) of 100 independent values are generated 1000 times following the multivariate method of Wilks (1999, 2005) and using a Gaussian distribution for both series (see appendix B for details). To get additional information on GAHMDI’s behavior, artificial signals are added to the simulated candidate series (not affected by changepoints). Since it is not possible to cover all combinations of changepoints in terms of number, magnitude, and location, five specific cases are investigated (see Table 1). They are common in the homogenization of a real dataset and give the opportunity to test our procedure in a simple and effective way, looking at the number of detections and the contemporaneous identification of more than one changepoint. This last feature is important, because long time series are usually affected by more than one inhomogeneity, and unidentified break points could induce an erroneous correction of the candidate series. Following classical notation, σ denotes the standard deviation of the candidate. In the first three cases, the mean of the candidate series is changed after the 25th (50th, 75th) value, adding a constant signal [i.e., μa(t) = μa] of variable magnitude (from 0.1σ to 1.5σ, with steps of 0.1σ). In the fourth case, a random signal N(μa, 0.2σ) is added after the 70th value, with μa in (0.1σ, 0.2σ, … , 1.5σ). In the last case, three changepoints (associated with three changes of the mean) are added to the candidate series at the 20th, 50th, and 85th value; the constant signals are of magnitude equal to 0.8σ, −0.5σ, and 0.7σ, respectively. Since SNHT is designed to identify only one inhomogeneity, it has to be applied in a sequential way (e.g., Alexandersson and Moberg 1997); that is, after each detected changepoint, the series is split into subperiods and the test is reapplied to each of them, until no further inhomogeneities are detected or the subperiods become too short.
Finally, in this subsection a detected changepoint is considered correct if it lies within ±1 time step of the true location.
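The way artificial signals are added to a simulated candidate can be sketched as below. The sketch does not reproduce the multivariate generation of Wilks (1999, 2005) described in appendix B; it only shows the addition of constant and random signals to an existing series. It assumes that each magnitude is the signal level holding from one changepoint to the next (rather than an increment), and the function names are hypothetical.

```python
import random


def add_step_signal(series, changepoints, levels):
    """Add levels[j] to all values after changepoints[j], up to the next changepoint.

    "After the 20th value" means values 21, 22, ..., i.e. 0-based indices 20, 21, ...
    """
    out = list(series)
    bounds = list(changepoints) + [len(series)]
    for j, level in enumerate(levels):
        for t in range(bounds[j], bounds[j + 1]):
            out[t] += level
    return out


def add_random_signal(series, changepoint, mu_a, sd_a, rng):
    """Add independent N(mu_a, sd_a) draws to all values after `changepoint`."""
    out = list(series)
    for t in range(changepoint, len(series)):
        out[t] += rng.gauss(mu_a, sd_a)
    return out


sigma = 1.0
base = [0.0] * 100   # placeholder for a simulated candidate series of 100 values

# Fifth case: changepoints at the 20th, 50th, and 85th value.
case5 = add_step_signal(base, [20, 50, 85], [0.8 * sigma, -0.5 * sigma, 0.7 * sigma])

# Fourth case: random signal N(mu_a, 0.2 sigma) after the 70th value.
case4 = add_random_signal(base, 70, mu_a=0.5 * sigma, sd_a=0.2 * sigma,
                          rng=random.Random(42))
```

Using a flat placeholder series makes the added signal directly visible; in practice `base` would be one of the 1000 simulated candidates.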
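The scoring rule can be written as a short Python sketch. It assumes that the contemporaneous identification of several changepoints requires every true changepoint to be matched by at least one detection within the leeway, and that additional detections are not penalized; this reading and the function names are assumptions.

```python
def is_correct(detected, true_cp, leeway=1):
    """A detection counts as correct if it lies within +/- leeway of the true location."""
    return abs(detected - true_cp) <= leeway


def all_identified(detections, true_cps, leeway=1):
    """True when every true changepoint is matched by at least one detection."""
    return all(
        any(is_correct(d, cp, leeway) for d in detections) for cp in true_cps
    )


hit = all_identified([21, 50, 84, 60], [20, 50, 85])
miss = all_identified([22, 50, 85], [20, 50, 85])
```

In the first call the extra detection at 60 does not prevent a match; in the second, 22 is two steps away from 20, so the vector (20, 50, 85) is not fully identified.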
Table 1. Simulated case studies; cp and as denote changepoint and artificial signal, respectively.
Figure 1 shows GAHMDI’s performance in the first four cases. The behavior of our method is approximately identical to the behavior of CauMe and SNHT, whereas RHtest seems to underdetect single break points. The results are similar for break points located at the beginning (25th value), the middle (50th value), and the end (75th value) of the series. Moreover, the addition of an artificial random signal does not affect the detections of the four methods. When a stepwise function with three changepoints is added (Fig. 2), GAHMDI performs better than CauMe and RHtest in the contemporaneous identification of the changepoint vector (20, 50, 85): 229 times against 57 (CauMe) and 148 (RHtest). SNHT has the highest number of correct identifications (408 times), but it often detects more than three break points. In summary, the proposed method shows a behavior similar to widespread tests (CauMe and SNHT) for single changepoints, but it is better in the detection of multiple changepoints (i.e., more than two segments). Furthermore, GAHMDI is able to identify break points due to changes in variance, whereas methods designed to identify only changes in the mean (e.g., CauMe) show a drastic performance reduction. Indeed, when the variance of the candidate series is changed by 0.5 after the 50th value, GAHMDI detects this break point 77 times against 12 for CauMe, 10 for RHtest, and 4 for SNHT (not shown in the figures). Note that this evaluation of the method (performed on a simulated dataset) does not cover all possible cases (e.g., different sizes, locations, and numbers of inhomogeneities). However, the five simulated situations are rather common (e.g., Kuglitsch et al. 2009), so the results give a representative picture of the method’s behavior in practice.
Fig. 1. Number of correct detections of a single break point as a function of the shift (expressed in terms of σ) for GAHMDI (black line), CauMe (gray line), RHtest (dashed gray line), and SNHT (dashed black line). A constant inhomogeneity is added after the (a) 25th, (b) 50th, and (c) 75th value. (d) A random signal N(μa, σa) is added after the 70th value.
Citation: Journal of Applied Meteorology and Climatology 51, 2; 10.1175/JAMC-D-10-05033.1
Fig. 2. Identification of three break points (located at the 20th, 50th, and 85th values) added to the simulated series. The number of correct detections of each break point is plotted at 20, 50, and 85 (x axis). The label 3 on the x axis represents the contemporaneous identification of the three break points. Black circles are associated with GAHMDI, gray circles with CauMe, gray stars with RHtest, and black stars with SNHT.
b. Case study
Besides results from simulated series, a detection of inhomogeneities was performed on a winter (December to February) precipitation series derived from daily observations (over the period 1950–2006) at the weather station of Milan (northern Italy; see Fig. 3). To make the comparison of GAHMDI with the other three methods straightforward, only one reference series is used. As shown in Fig. 3, GAHMDI flags one break point in 1991. CauMe also detects a single inhomogeneity, although it is located in 1989, while SNHT and RHtest each detect two inhomogeneities, namely (1991, 1996) and (1980, 1990), respectively. Therefore, only the break point around 1991 (located between 1989 and 1991, depending on the method) is identified by all four methods. These results, although not confirmed by metadata, point out the differences between the applied methods and show the need for multiple detection procedures in the homogenization of real series.
Fig. 3. Winter precipitation sums (black line) from the weather station of Milan, Italy (red dot in the small panel). Vertical dashed lines identify the break points detected by GAHMDI (blue line), CauMe (red line), SNHT (green lines), and RHtest (gray lines).
4. Conclusions
Time series segmentation is a complex task with potential applications in many research fields (e.g., climate change, economics, finance, biology, music, and informatics). Abrupt changes (characterizing transitions from one state to another) affect several physical systems. A statistical description, through changepoints, of abrupt responses (e.g., in the climate system) to external forcings helps to improve the understanding of the mechanisms behind those phenomena. In this context, an approach based on a genetic algorithm and hidden Markov models was proposed. GAHMDI has been developed to be immediately applicable in the homogenization field. However, its flexibility allows an easy adaptation to different fields and initial assumptions. GAHMDI guarantees the reliability of the solution by avoiding convergence to local optima. Furthermore, application of the MDL principle permits us to choose the number of states in an objective way, although, as pointed out by several authors, the order selection of an HMM is very difficult and still an active field of research. The method was theoretically explained and its applicability in climate homogenization was demonstrated. GAHMDI’s behavior was investigated by using a simulated dataset and compared with the method developed by Caussinus and Mestre (2004), the standard normal homogeneity test (Alexandersson and Moberg 1997), and the RHtest (Wang et al. 2007; Wang 2008a,b). GAHMDI performs better than the other three methods in the contemporaneous detection of multiple changepoints. In addition, it takes into account changes in variance. An application of GAHMDI to an observed series of winter precipitation recorded at Milan (northern Italy) demonstrates the practical utility of the method. The described evaluation is not exhaustive.
Extensive tests based on simulated series, covering a complete set of cases and involving additional methods, are computationally very expensive and could be performed within a dedicated project.
In future research, GAHMDI will be extended to handle autocorrelated data and to detect changepoints in daily climate time series.
Acknowledgments
We are grateful to Dr. D. Harte (SRA) for the R-package HiddenMarkov and helpful discussions. We thank Dr. P. D. Grünwald (CWI and Leiden University) and Dr. T. Roos (University of Helsinki) for useful and interesting discussions and W. Perconti (ISPRA) for his support during the simulation process. The comments and suggestions of two anonymous referees and Dr. R. Lund (Clemson University) improved the quality and the readability of the manuscript. This research was funded by the EU/FP6 integrated project CIRCE (Climate Change and Impact Research: the Mediterranean Environment; http://www.circeproject.eu/; Contract 036961) and the EU/FP7 project ACQWA (Assessing Climate Impacts on the Quantity and Quality of Water; http://www.acqwa.ch/; Grant 212250). The method is fully developed in R.
APPENDIX A
Likelihood L1
APPENDIX B
Simulated Series










REFERENCES
Aguilar, E., I. Auer, M. Brunet, T. C. Peterson, and J. Wieringa, 2003: Guidance on metadata and homogenization. WMO TD 1186, 53 pp.
Akaike, H., 1973: Information theory and an extension of the maximum likelihood principle. Proceedings of the Second International Symposium on Information Theory, B. N. Petrov and F. Csáki, Eds., Akadémiai Kiadó, 267–281.
Alexandersson, H., and A. Moberg, 1997: Homogenization of Swedish temperature data. Part I: Homogeneity test for linear trends. Int. J. Climatol., 17, 25–34.
Bai, J., and P. Perron, 1998: Estimating and testing linear models with multiple structural changes. Econometrica, 66, 47–78.
Baker, J., 1987: Reducing bias and inefficiency in the selection algorithm. Proc. Second Int. Conf. on Genetic Algorithms and Their Application, Cambridge, MA, Massachusetts Institute of Technology, 14–21.
Blickle, T., and L. Thiele, 1995: A mathematical analysis of tournament selection. Proc. Sixth Int. Conf. on Genetic Algorithms, Pittsburgh, PA, University of Pittsburgh, 9–16.
Cappé, O., E. Moulines, and T. Rydén, 2005: Inference in Hidden Markov Models. Springer, 672 pp.
Caussinus, H., and O. Mestre, 2004: Detection and correction of artificial shifts in climate series. Appl. Stat., 53, 405–425.
Celeux, G., and J. B. Durand, 2008: Selecting hidden Markov model state number with cross-validated likelihood. Comput. Stat., 23, 541–564.
Chambaz, A., A. Garivier, and E. Gassiat, 2009: A minimum description length approach to hidden Markov models with Poisson and Gaussian emissions. Application to order identification. J. Stat. Plann. Inference, 139, 962–977.
Davis, R. A., T. C. M. Lee, and G. A. Rodriguez-Yam, 2006: Structural break estimation for nonstationary time series models. J. Amer. Stat. Assoc., 101, 223–239.
DeGaetano, A. T., 2006: Attributes of several methods for detecting discontinuities in mean temperature series. J. Climate, 19, 838–853.
Easterling, D. R., and T. C. Peterson, 1995: A new method for detecting undocumented discontinuities in climatological time series. Int. J. Climatol., 15, 369–377.
Ephraim, Y., and W. J. J. Roberts, 2005: Revisiting autoregressive hidden Markov modeling of speech signals. IEEE Signal Process. Lett., 12, 166–169.
Forney, G. D., 1973: The Viterbi algorithm. Proc. IEEE, 61, 268–278.
Fridlyand, J., A. M. Snijders, D. Pinkel, D. G. Albertson, and A. N. Jain, 2004: Hidden Markov models approach to the analysis of array CGH data. J. Multivariate Anal., 90, 132–153.
Frühwirth-Schnatter, S., 2006: Finite Mixture and Markov Switching Models. Springer, 492 pp.
Gassiat, E., 2002: Likelihood ratio inequalities with applications to various mixtures. Ann. Inst. Henri Poincaré, 38, 897–906.
Grünwald, P. D., 2007: The Minimum Description Length Principle. MIT Press, 703 pp.
Holland, J. H., 1992: Adaptation in Natural and Artificial Systems. MIT Press, 211 pp.
Hubert, P., 1997: Change points in meteorological time analysis. Applications of Time Series Analysis in Astronomy and Meteorology, T. Subba Rao, M. B. Priestley, and O. Lessi, Eds., Chapman and Hall, 399–412.
Jann, A., 2006: Genetic algorithms: Towards their use in the homogenization of climatological records. Croatian Meteor. J., 41, 3–19.
Jansen, E., and Coauthors, 2007: Paleoclimate. Climate Change 2007: The Physical Science Basis, S. Solomon et al., Eds., Cambridge University Press, 433–497.
Juang, B. H., and L. R. Rabiner, 1991: Hidden Markov models for speech recognition. Technometrics, 33, 251–272.
Kehagias, A., 2004: A hidden Markov model segmentation procedure for hydrological and environmental time series. Stochastic Environ. Res. Risk Assess., 18, 117–130.
Kehagias, A., and V. Fortin, 2006: Time series segmentation with shifting means hidden Markov models. Nonlinear Processes Geophys., 13, 1–14.
Kehagias, A., E. Nidelkou, and V. Petridis, 2005: A dynamic programming segmentation procedure for hydrological and environmental time series. Stochastic Environ. Res. Risk Assess., 20, 77–94.
Kim, H. J., and D. Siegmund, 1989: The likelihood ratio test for a change point in simple linear regression. Biometrika, 76, 409–423.
Koop, G., and S. M. Potter, 2009: Prior elicitation in multiple change-point models. Int. Econ. Rev., 50, 751–772.
Kuglitsch, F. G., A. Toreti, E. Xoplaki, P. M. Della-Marta, J. Luterbacher, and H. Wanner, 2009: Homogenization of daily maximum temperature series in the Mediterranean. J. Geophys. Res., 114, D15108, doi:10.1029/2008JD011606.
Kwong, S., C. W. Chau, K. F. Man, and K. S. Tang, 2001: Optimisation of HMM topology and its model parameters by genetic algorithms. Pattern Recognit., 34, 509–522.
Lee, T. C. M., 2001: An introduction to coding theory and the two-part minimum description length principle. Int. Stat. Rev., 69, 169–183.
Li, S., and R. Lund, 2012: Multiple changepoint detection via genetic algorithms. J. Climate, 25, 674–686.
Lu, Q., R. Lund, and T. C. M. Lee, 2010: An MDL approach to the climate segmentation problem. Ann. Appl. Stat., 4, 299–319.
Lund, R., X. L. Wang, Q. Lu, J. Reeves, C. Gallagher, and Y. Feng, 2007: Changepoint detection in periodic and autocorrelated time series. J. Climate, 20, 5178–5190.
MacDonald, I. L., and W. Zucchini, 1997: Hidden Markov and Other Models for Discrete-Valued Time Series. Chapman and Hall, 256 pp.
MacKay, R. J., 2002: Estimating the order of a hidden Markov model. Can. J. Stat., 30, 573–589.
Menne, M. J., and C. N. Williams, 2005: Detection of undocumented changepoints using multiple test statistics and composite reference series. J. Climate, 18, 4271–4286.
Mielke, P. W., K. J. Berry, and G. W. Brier, 1981: Application of multi-response permutation procedures for examining seasonal changes in monthly mean sea level pressure patterns. Mon. Wea. Rev., 109, 120–126.
Miranda, P. M. A., and A. R. Tomé, 2009: Spatial structure of the evolution of surface temperature (1951–2004). Climatic Change, 93, 269–284.
Moberg, A., and Coauthors, 2006: Indices for daily temperature and precipitation extremes in Europe analyzed for the period 1901–2000. J. Geophys. Res., 111, D22106, doi:10.1029/2006JD007103.
Pawson, S., K. Labitzke, and S. Leder, 1998: Stepwise changes in stratospheric temperature. Geophys. Res. Lett., 25, 2157–2160.
Perreault, L., J. Bernier, B. Bobee, and E. Parent, 2000: Bayesian change-point analysis in hydrometeorological time series. Part 1. The normal model revisited. J. Hydrol., 235, 221–241.
Peterson, T. C., and D. R. Easterling, 1994: Creation of homogeneous composite climatological reference series. Int. J. Climatol., 14, 671–679.
Rabiner, L. R., 1989: A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE, 77, 257–286.
Randall, D. A., and Coauthors, 2007: Climate models and their evaluation. Climate Change 2007: The Physical Science Basis, S. Solomon et al., Eds., Cambridge University Press, 589–662.
Ray, B. K., and R. S. Tsay, 2002: Bayesian method for change-point detection in long-range dependent processes. J. Time Ser. Anal., 23, 687–705.
Reeves, J., J. Chen, X. L. Wang, R. Lund, and Q. Lu, 2007: A review and comparison of changepoint detection techniques for climate data. J. Appl. Meteor. Climatol., 46, 900–915.
Robinson, L. F., T. D. Wager, and M. A. Lindquist, 2010: Change point estimation in multi-subject fMRI studies. NeuroImage, 49, 1581–1592.
Schwarz, G., 1978: Estimating the dimension of a model. Ann. Stat., 6, 461–464.
Swami, D. K., and R. C. Jain, 2006: PAMC: Partitioning around medoids for classification. Inf. Technol. J., 5, 1102–1105.
Tai, Y. C., M. N. Kvale, and J. S. Witte, 2010: Segmentation and estimation for SNP microarrays: A Bayesian multiple change-point approach. Biometrics, 66, 675–683.
Tartakovsky, A. G., B. L. Rozovskii, R. Blažek, and H. Kim, 2006: Detection of intrusions in information systems by sequential change-point methods. Stat. Methodol., 3, 252–293.
Toreti, A., and F. Desiato, 2008: Temperature trend over Italy from 1961 to 2004. Theor. Appl. Climatol., 91, 51–58.
Toreti, A., G. Fioravanti, W. Perconti, and F. Desiato, 2009: Annual and seasonal precipitation over Italy from 1961 to 2006. Int. J. Climatol., 29, 1976–1987.
Toreti, A., F. G. Kuglitsch, E. Xoplaki, J. Luterbacher, and H. Wanner, 2010a: A novel method for the homogenization of daily temperature series and its relevance for climate change analysis. J. Climate, 23, 5325–5331.
Toreti, A., E. Xoplaki, D. Maraun, F. G. Kuglitsch, H. Wanner, and J. Luterbacher, 2010b: Characterisation of extreme winter precipitation in Mediterranean coastal sites and associated anomalous atmospheric circulation patterns. Nat. Hazards Earth Syst. Sci., 10, 1037–1050.
Trenberth, K. E., 1990: Recent observed interdecadal climate changes in the Northern Hemisphere. Bull. Amer. Meteor. Soc., 71, 988–993.
Viterbi, A. J., 2006: A personal history of the Viterbi algorithm. IEEE Signal Process. Mag., 120, 120–122.
Wang, X. L., 2008a: Accounting for autocorrelation in detecting mean shifts in climate data series using the penalized maximal t or F test. J. Appl. Meteor. Climatol., 47, 2423–2444.
Wang, X. L., 2008b: Penalized maximal F test for detecting undocumented mean shift without trend change. J. Atmos. Oceanic Technol., 25, 368–384.
Wang, X. L., Q. H. Wen, and Y. Wu, 2007: Penalized maximal t test for detecting undocumented mean change in climate data series. J. Appl. Meteor. Climatol., 46, 916–931.
Welch, L. R., 2003: Hidden Markov models and the Baum-Welch algorithm. IEEE Inf. Theory Soc. Newsl., 53, 10–13.
Whitley, D., 1994: A genetic algorithm tutorial. Stat. Comput., 4, 65–85.
Wilks, D. S., 1999: Simultaneous stochastic simulation of daily precipitation, temperature and solar radiation at multiple sites in complex terrain. Agric. For. Meteor., 96, 85–101.
Wilks, D. S., 2005: Statistical Methods in the Atmospheric Sciences. Academic Press, 648 pp.
Won, K. J., A. Prügel-Bennett, and A. Krogh, 2004: Training HMM structure with genetic algorithm for biological sequence analysis. Bioinformatics, 20, 3613–3619.
Xie, Y., J. Yu, and B. Ranneby, 2008: A general autoregressive model with Markov switching: Estimation and consistency. Math. Methods Stat., 17, 228–240.