1. Introduction
Suppose that each station within a geographical domain has a data record (a time series). Our goal is to group (cluster) these series into disjoint regions (climatic zones) from these data. Two series from the same region should, in general, be more similar than two series from different regions. Essentially, the problem is one of clustering time series.
Clustering is the standard method used to define climate zones. Stooksbury and Michaels (1990), Fovell and Fovell (1993), Gong and Richman (1995), DeGaetano (1996, 2001), Fovell (1997), Bunkers et al. (1996), Yao (1997), Gerstengarbe et al. (1999), Böhm et al. (2001), Steinbach et al. (2003), Unal et al. (2003), and Vrac and Naveau (2007) use clustering methods to construct climate zones. Clustering methods are also useful for classifying weather conditions into different synoptic regimes (Kalkstein et al. 1987; Davis and Walker 1992; Michelangeli et al. 1995; Kidson 2000; Stephenson et al. 2004; Straus et al. 2007). Complementing this climate work, the problem of clustering autocorrelated zero mean series has been recently considered in the statistics literature. In particular, Coates and Diggle (1986), Kakizawa et al. (1998), and Lund et al. (2009) show how to test whether two zero mean time series have the same autocovariance structure. Of these three references, only Kakizawa et al. (1998) consider more than two series. Zero mean series with differing autocorrelations often arise in financial and speech settings. Indeed, speech series can effectively be discriminated through their autocovariance structures. Unfortunately, speech classification methods are not directly applicable to our climate clustering problem, where the first moment (or mean) is the most prominent series feature. On the other hand, the second moment (autocovariances) also provides information: two stations that are truly similar should have similar means and autocovariances. As we show later, ignoring autocorrelations results in a suboptimal method.
Another important practical issue lies with seasonality. Most climate series possess seasonal structures. Seasonality has been handled in various ways by previous authors, but our treatment is general in that our model allows for seasonal means, variances, and autocorrelations. Differences in the means of two series, even if only seasonal, are a useful quantity to discriminate from. Because of this, we do not advocate clustering anomalies where the mean (seasonal or nonseasonal) has been removed. This issue will be revisited in section 5 below.
In this paper, we develop a clustering method that examines the means and autocovariances of the series in tandem. Such a method is based on the one-step-ahead prediction errors for general time series models. In fact, since the scaled one-step-ahead prediction errors of Gaussian time series are independent and identically distributed (IID), the proposed method is essentially an optimal clustering method for IID data. The methods can handle covariate and/or trend components if desired.
The rest of this paper proceeds as follows: Section 2 proposes a distance that measures how far apart two time series are. This distance accounts for the mean and autocovariances (in tandem) in a natural way. As a by-product of the analysis, a statistical test for whether two series should serve as reference stations for one another is obtained. Section 3 discusses how to cluster a collection of series into a prespecified number of regions from the section 2 distances. Section 4 provides a short simulation study of the reference station test aspects of this study. Section 5 illustrates the techniques by clustering 292 temperature records from the state of Colorado. Section 6 closes with conclusions and comments.
2. A new distance
Clustering techniques require a distance measuring the similarity between any two time series. In this section, our goal is to develop such a distance that accounts for means and autocovariances simultaneously. Our attention here is on two series X = {Xt} and Y = {Yt} representing two distinct stations in the domain of study. The distance between X and Y, denoted by d(X, Y), should be small when the two series are similar and large when they are dissimilar. Any distance used should obey the general properties of distances. From Mardia et al. (1979), a function d is a distance if it is
symmetric, where d(X, Y) = d(Y, X);
nonnegative, where d(X, Y) ≥ 0; and
identification marking, where d(X, X) = 0.
For many distances, the following properties also hold:
definiteness, where d(X, Y) = 0 if and only if X = Y; and
the triangle inequality, where d(X, Y) ≤ d(X, Z) + d(Z, Y).
Specifically, given N observations X1, … , XN and Y1, … , YN from the two stations, our distance is

d(X, Y) = (1/N) Σt=1N [(Xt − X̂t)/[υt(X)]1/2 − (Yt − Ŷt)/[υt(Y)]1/2]²,     (2.1)

where the superscripts on υt identify the series; the quantities Ŷt and υt(Y) for {Yt} are defined analogously to those for {Xt} described next.
For the other quantities in (2.1), X̂t = P(Xt|1, X1, … , Xt − 1) is the best linear prediction of Xt from a constant (this allows mean effects into the prediction) and the “observed past” X1, … , Xt − 1 (this allows for second-order, or autocovariance, effects) and υt = E[(Xt − X̂t)2] is its squared prediction error. The divisor υt1/2 is needed to account for the possibility of seasonal variances; for example, one-step-ahead temperature predictions have larger errors during winter months than in summer. Scaling by υt1/2 ensures that each observation contributes equally to the distance. An example is given below that shows how to compute X̂t and υt for monthly temperature series.
The key fact driving the above comes from time series: regardless of the mean and correlation structures of {Xt}, the scaled error sequence {St} defined for each fixed t via St = (Xt − X̂t)/υt1/2 is zero mean white noise with a unit variance. This means that St and Ss are uncorrelated when t ≠ s, E[St] ≡ 0, and var (St) ≡ 1. Brockwell and Davis (1991, chapter 5) is a good reference for time series prediction theory. By examining the one-step-ahead prediction residuals, we are essentially transforming a clustering problem with a time-varying mean and autocorrelated errors into one with zero mean and uncorrelated errors (and the latter is a well-studied problem of statistics; see Hartigan 1975).
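This whitening property is easy to verify numerically. The sketch below assumes a simple constant-mean AR(1) model (not the periodic models used later in the paper), for which the one-step-ahead predictions and their error variances have closed forms, and checks that the scaled errors behave like unit-variance white noise:

```python
import numpy as np

# Sketch (assumed model, not the paper's code): a constant-mean AR(1) series
# X_t - mu = phi*(X_{t-1} - mu) + Z_t with var(Z_t) = sigma2.
mu, phi, sigma2, n = 60.0, 0.5, 4.0, 5000
rng = np.random.default_rng(0)
x = np.empty(n)
x[0] = mu + rng.normal(scale=np.sqrt(sigma2 / (1.0 - phi**2)))  # stationary start
for t in range(1, n):
    x[t] = mu + phi * (x[t - 1] - mu) + rng.normal(scale=np.sqrt(sigma2))

xhat = np.empty(n)          # one-step-ahead predictions
v = np.empty(n)             # their error variances
xhat[0] = mu                # best predictor of X_1 from a constant alone
v[0] = sigma2 / (1.0 - phi**2)
xhat[1:] = mu + phi * (x[:-1] - mu)
v[1:] = sigma2

# Scaled one-step-ahead prediction errors: approximately zero mean, unit
# variance, and uncorrelated, despite the autocorrelation in x itself.
s = (x - xhat) / np.sqrt(v)
lag1 = np.corrcoef(s[:-1], s[1:])[0, 1]
print(round(s.mean(), 3), round(s.var(), 3), round(lag1, 3))
```

The same computation with the periodic models of the appendix differs only in how xhat and v are built.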
The distance in (2.1) obeys all five properties of distances listed above. This is not the only distance metric that can be used; in fact, Moeckel and Murray (1997), Maharaj (2000), Boets et al. (2005), and Bengtsson and Cavanaugh (2008) provide alternative choices. One can link the distance in (2.1) to likelihood ratio tests for Gaussian series, but we will not pursue such aspects here. A very convenient facet of the distance in (2.1) is that it scales into a chi-squared test to assess the equality of two time series (whether or not they can serve as references for one another). We now elaborate on this test.
When the two series are independent Gaussian series sharing the same mean and autocovariance structure, each scaled difference in (2.1) has a normal distribution with mean zero and variance 2, and distinct differences are independent. It follows that

N d(X, Y)/2 ∼ χN²,     (2.2)

a chi-squared distribution with N degrees of freedom. Large values of d(X, Y), equivalently small p values computed from (2.2), are evidence against the hypothesis that the two stations can serve as references for one another.
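To make the test concrete, here is a self-contained sketch. It assumes the test statistic is N d(X, Y)/2 referred to a chi-squared distribution with N degrees of freedom (consistent with the scaled residual differences being approximately IID normal with variance 2 under the null); the chi-squared tail probability is computed with the Wilson–Hilferty normal approximation rather than a statistics library:

```python
import math
from statistics import NormalDist

def chi2_sf(x, k):
    """Survival function of a chi-squared(k) variable via the
    Wilson-Hilferty normal approximation (accurate for large k)."""
    z = ((x / k) ** (1.0 / 3.0) - (1.0 - 2.0 / (9.0 * k))) / math.sqrt(2.0 / (9.0 * k))
    return 1.0 - NormalDist().cdf(z)

def reference_pvalue(d, n):
    """p value of the reference station test from the distance d computed
    over n observations: compare n*d/2 to a chi-squared(n) distribution."""
    return chi2_sf(n * d / 2.0, n)

# Athens/Atlanta numbers from the text: d = 0.075 with N = 648 observations.
# A p value near 1 indicates the stations are good references for each other.
print(round(reference_pvalue(0.075, 648), 3))
```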
Our next task is to fit good time series models to our series of interest and show how to compute d(X, Y) from these time series fits. This is tedious but essential. Periodicities and covariate effects will be considered. Let T denote the period of the series (e.g., T = 12 for monthly data and T = 365 for daily data). Our seasonal notation takes XnT+ν as the observation during the νth season of cycle n in X. Here, ν is a seasonality suffix satisfying 1 ≤ ν ≤ T. For example, with monthly data, ν = 1 refers to January and ν = 10 to October.
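As a small illustration of this bookkeeping (assuming, as in the notation above, that the record begins in season ν = 1):

```python
def cycle_and_season(t, T=12):
    """Decompose a 1-based time index t = n*T + nu (1 <= nu <= T) into
    the cycle number n and the seasonal index nu."""
    n, nu = divmod(t - 1, T)
    return n, nu + 1

# For a monthly record starting in January, observation 22 is October
# (nu = 10) of cycle n = 1.
print(cycle_and_season(22))  # → (1, 10)
```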
Once the regression model form is chosen, one must estimate all regression and time series parameters in the model for both stations. From these estimates, the distance between the two stations can be computed. Computation of the quantities needed to evaluate the distance is discussed in the appendix.
As a simple example of the above methods, we consider 54 yr of monthly temperatures from Athens and Atlanta, Georgia. Both stations lie in the Piedmont region of north Georgia and are approximately 75 miles apart. There are N = 648 observations from both stations.
Empirical computations do not reveal a trend in these series. As we do not consider covariates, the trend function fitted is simply f(t) ≡ 0. The time series model fitted to the errors is a first-order periodic autoregression with mean μν, autoregressive parameter ϕν, and white noise variance σν2 during season ν for 1 ≤ ν ≤ T. This model is explained further in the appendix and in Lund et al. (1995).
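A minimal sketch of such a fit, assuming a gap-free monthly series and using simple moment/conditional least squares estimators for each season (the estimators used in the appendix may differ in detail):

```python
import numpy as np

def fit_par1(x, period=12):
    """Fit a first-order periodic autoregression: for season nu,
    X_t = mu_nu + eps_t,  eps_t = phi_nu * eps_{t-1} + Z_t,  var(Z_t) = sigma2_nu.
    Returns seasonal means, AR parameters, and white noise variances."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mu = np.array([x[nu::period].mean() for nu in range(period)])
    eps = x - mu[np.arange(n) % period]            # seasonally demeaned series
    phi = np.empty(period)
    sigma2 = np.empty(period)
    for nu in range(period):
        # times in season nu that have an observed predecessor
        t = np.arange(nu if nu > 0 else period, n, period)
        phi[nu] = np.sum(eps[t] * eps[t - 1]) / np.sum(eps[t - 1] ** 2)
        sigma2[nu] = np.mean((eps[t] - phi[nu] * eps[t - 1]) ** 2)
    return mu, phi, sigma2

# Simulate 500 cycles of synthetic monthly data from a known PAR(1) and refit.
rng = np.random.default_rng(2)
T, nyears = 12, 500
mu_true = 60.0 + 30.0 * np.cos(2.0 * np.pi * np.arange(T) / T)
phi_true = np.full(T, 0.3)
sig_true = np.full(T, 2.0)
eps_t, xs = 0.0, []
for t in range(T * nyears):
    nu = t % T
    eps_t = phi_true[nu] * eps_t + rng.normal(scale=sig_true[nu])
    xs.append(mu_true[nu] + eps_t)
mu_hat, phi_hat, sig2_hat = fit_par1(np.array(xs), period=T)
print(np.round(mu_hat[:3], 1), np.round(phi_hat[:3], 2))
```

With a long record, the fitted seasonal means, autoregressive parameters, and variances recover the true values closely; Fig. 1 plots the analogous estimates for Athens and Atlanta.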
The distance computed between Athens and Atlanta is d = 0.075. Using the chi-squared distribution in (2.2), we obtain a p value of approximately 1.000, strongly suggesting that Athens and Atlanta are very good reference stations for one another. Figure 1 plots monthly estimates of μν, ϕν, and σν2 for each station. As these parameters are virtually identical, we conclude that Athens and Atlanta indeed have similar means and autocovariances. One could use this information to calibrate a new gauge or fill in missing data.
This completes our description of the distance function behind the clustering. Our next section shows how to cluster the series into zones given all their pairwise distances.
3. Clustering the stations
This section discusses how to take the section 2 distances and define a fixed number of zones (say K) out of M total stations. Here, the zones will be referred to as clusters and are simply subsets of stations. While many different clustering methods exist, we will focus on hierarchical agglomerative methods for simplicity as these perform reasonably well on a variety of climate data structures. Other clustering techniques, many of which are prevalent in synoptic classification, include Ward’s minimum variance, centroid, and K means (see Gong and Richman 1995 and Wilks 2006 for overviews).
Henceforth, we consider three widely used agglomerative clustering methods: nearest-neighbor single linkage, furthest-neighbor complete linkage, and average linkage. These methods all start with each station serving as its own cluster. Next, the two stations having the smallest distance are identified and combined into a cluster of two stations. At this point, there are M − 1 clusters. At each successive step in the algorithm, the two clusters having the smallest distance between them are merged into a single cluster (this between-cluster distance is clarified below). The step at which there are K clusters is the grouping that we seek.
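The merging scheme just described can be sketched directly. The toy implementation below takes a precomputed matrix of pairwise station distances (such as those from section 2) and is not tied to any particular library:

```python
import numpy as np

def agglomerate(dist, k, linkage="average"):
    """Agglomerative clustering from a precomputed distance matrix.
    Every station starts as its own cluster; the two clusters at the
    smallest between-cluster distance are merged until k clusters remain.
    linkage: 'single' (min), 'complete' (max), or 'average' (mean) of the
    pairwise station distances between the two clusters."""
    agg = {"single": min, "complete": max,
           "average": lambda v: sum(v) / len(v)}[linkage]
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = agg([dist[i][j] for i in clusters[a] for j in clusters[b]])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]   # merge the closest pair
        del clusters[b]
    return clusters

# Toy example: five stations forming two well-separated groups.
d = np.array([[0, 1, 1, 9, 9],
              [1, 0, 1, 9, 9],
              [1, 1, 0, 9, 9],
              [9, 9, 9, 0, 1],
              [9, 9, 9, 1, 0]], dtype=float)
print(sorted(sorted(c) for c in agglomerate(d, 2)))  # → [[0, 1, 2], [3, 4]]
```

The three linkage choices differ only in the `agg` function applied to the between-cluster distances.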
4. A simulation study
This short section presents a simulation study of the reference station test proposed in section 2. Here, we will generate synthetic series with known statistical properties. Because the series are simulated, changepoint effects are precluded from influencing the results. To elaborate, station relocation times or instrumentation changes can induce level shifts into climatic time series, and tests applied to the unadjusted (raw) series can produce spurious results (see Lund and Reeves 2002 for examples).
Here, case I entails a situation where the average annual temperature is 60°, average temperatures fluctuate between 30° and 90°, and there is moderate positive correlation in the error process {ϵt}. Case II is similar to case I except the seasonal mean cycle is less pronounced. Case III has the same dynamics as case I except that the station is located in a 1° colder setting (in each and every month). Case IV uses the same seasonal means as case I but takes heavier (stronger) autocorrelations in the errors. Case V is like case IV except that there is no seasonal cycle in the autocorrelations. In truth, each of the five cases is different; hence, an ideal test would reject the notion that any station could serve as a reference for any other station. Figure 2 plots a synthetic draw from each of the five cases. Visually, it is difficult to distinguish some of these series from one another.
Table 1 below summarizes results from 10 000 simulated series from each station. On each of the 10 000 replications, series from the five stations are first generated, one each from cases I–V, respectively. We then compare the distances between each pair of series (there are 10 comparisons to make) with α = 0.05. Table 1 shows the sample proportion of times (out of 10 000) that the test in (2.2) rejects that the stations can be used as references for one another.
The sample proportions show that cases I, II, and III are easily differentiated. The test has worked well (perfectly in fact) when one station’s seasonal mean differs. Differences in autocovariances are harder to detect. This said, case I and case IV, where the true differences lay only with the autocovariances, were correctly differentiated 78% of the time. Case I and case V are very close in statistical structure, with the only difference being the slight oscillation in ϕν in case I. The test correctly discriminates between these two cases only 5.3% of the time.
We also studied the so-called type I error (also called a false alarm rate) of our reference station test. Here, the 10 000 pairs of series from case I were generated and were compared with a 95% test. The proportion of times the test rejected that the station pairs could serve as references for one another was 0.0317, slightly (but not too far) below the 0.05 level it should be. The reason that the proportion is slightly lower than 5% lies with time series parameter estimation. As the sample size increases, these proportions will move closer to 5%. For cases II–V, the type I errors are 0.0304, 0.0299, 0.0219, and 0.0292, respectively, which are diagonal elements of Table 1.
With the reference station test performing well, we now move to the task of clustering a geographic region with our new distance. Before this, we make a comment. It is possible for two stations to be in the same cluster and not be good references for one another. Indeed, clustering simply minimizes the within-cluster heterogeneity; in extreme cases, one could reject that every pair of stations within a cluster are adequate references for one another.
5. Colorado climate zones
This section examines how clustering techniques based on (2.1) work in a setting with several distinct climatic zones: the state of Colorado. Colorado is composed of high plains in its eastern regions, mountains in its center and western parts, and canyon lands, mesas, and steppes on its western flanks. As noted in Bengtsson and Cavanaugh (2008), establishing climate zones in Colorado is difficult because of its widely varying topography and elevations. An accurate clustering of Colorado would help delineate agricultural and forestation zones—plants and animals that thrive at one station should thrive at other stations within the same zone. Thornthwaite (1931) lists many other reasons to pursue a rigorous definition of climate zones.
Monthly maximum and minimum temperatures for 292 cooperative Colorado weather stations were downloaded from the Web site http://www.image.ucar.edu/Data/US.monthly.met/FullData.shtml#precip. At each station, the monthly average temperature is the simple average of the monthly maximum and minimum temperatures. The data span 1895–1997. This dataset contains 270 more stations than the 22 analyzed in Bengtsson and Cavanaugh (2008).
Some of the 292 stations appear to have trends. For simplicity, no covariate effects are considered. Hence, the trend model is f (t) = αt at each station. First-order periodic autoregressive error models, as discussed in the appendix, are fitted as the error component at each station.
In many applications, the number of zones (clusters) is a priori determined or is estimated via an ad hoc “stopping rule” such as the pseudo-F criterion (Calinski and Harabasz 1974). As our advance lies with improving the distance between different time series, we set the number of clusters as six, keep this parameter fixed, and focus on the differences between our method and squared Euclidean distance methods. The differences between the methods for other cluster numbers are similar.
Figures 3–5 graphically portray the results by plotting the cluster number over each station’s location. In each figure, the top panel uses the distance in (2.1) and the bottom panel uses the squared Euclidean distance. The three figures report clusterings for the single linkage, complete linkage, and average linkage methods, respectively.
The single linkage plots in Fig. 3 are problematic because one cluster contains all but five of the stations. This is attributed to the fact that single linkage methods use the minimum of the distances between two stations in the respective clusters as the between-cluster distance: when a cluster contains many stations, it is likely that some station in the cluster will match the “station in question” well. Several of the stations are so different from all others that the clustering algorithm isolates them as “singleton” clusters. This drawback applies to both the new and squared Euclidean distances. These five different stations were omitted and the clustering was redone, but the single linkage clustering simply found five new singleton clusters.
The complete and average linkage clusterings in Figs. 4 and 5 are more appealing in that they are less degenerate. The overall patterns in these two figures are very similar, but there are also some differences. The patterns in Figs. 4 and 5 clearly identify the eastern plains; the San Juan, Sangre de Cristo, Gore, and Front Ranges in the state’s center and southwestern locales; and the scattered canyonlands in the west. The grasslands in the southeastern corner of the state (Baca County in particular) are also apparent. There are subtle differences between the results from the two distances. For example, the two average linkage plots in Fig. 5 for Larimer County in the north-central part of the state differ. This county has five different zones under our distance, but only four zones under a squared Euclidean distance. The station responsible for this discrepancy is the Willow Park station in Rocky Mountain National Park (labeled WP in the graphic), which resides at the extreme elevation of 10 702 feet. Hence, there is reason to believe that the additional zone is warranted [i.e., the distance in (2.1) is preferable]. In Fig. 4, Mineral County in the southwest part of the state sees disagreement. Here, squared Euclidean distances imply that the six stations in the county all belong to the same zone while the distance in (2.1) prefers two zones. Given that the stations in this county include the Wolf Creek Ski Area (Colorado’s snowiest ski resort, labeled WC in the graphic) and Wagon Wheel Gap (a town at a much lower elevation, labeled WW in the graphic), two zones for this county seem more realistic. Whether to use the complete or average linkage clusterings is always debatable. Unless there is a good reason, we suggest the average linkage clustering simply because averages are more stable than maxima.
To explore the differences noted above in Larimer County further, we will examine the Longs Peak (LP in Fig. 5) and Willow Park stations. The Longs Peak station resides at 9005 feet, some 1697 feet below the Willow Park station. For average linkage methods, the squared Euclidean distance puts these stations in the same zone while the new distance in (2.1) puts them in different zones (see Fig. 5). The reference station comparison test in section 2 for these two stations has d = 3.214, with a p value of approximately zero. Hence, statistical evidence suggests that different zones are warranted.
It is worth comparing the results to conventional methods. Figure 6 shows the resulting clustering if a sample mean is subtracted from the series before analysis (this is one common definition of an anomaly). Specifically, Fig. 6 shows an average linkage clustering with a Euclidean distance applied to the “grand mean adjusted series” {Xt − X̄}, where X̄ is the sample mean of the station’s entire record. Figure 7 shows the analogous clustering for monthly adjusted anomalies, where the sample mean of each calendar month is subtracted.
The results have data collection implications. For instance, Fig. 8 depicts relationships between nine stations in Larimer County. Specifically, a line is drawn to connect any pair of stations that pass as being equivalent at the 95% level with the previously discussed chi-squared test. The graphic shows that station 8, which is the previously encountered Willow Park station, differs from all other stations in the county. At the other extreme, station 2 tests as equivalent to stations 1, 3, 4, 5, and 9. Because station 2 is connected to the most other stations, it seems to record the most redundant data and would be the best candidate to omit if one were forced to discontinue data collection at some station within the county.
6. Conclusions and comments
The methods here enable one to more accurately define climate zones. By considering the seasonal mean cycle and autocovariances of the data from stations being clustered, improvements were made to techniques that employ squared Euclidean distances. As a by-product of the methods, a statistical test for whether two stations can serve as references for one another is obtained.
We close with a comment on extensions. The methods here are univariate in that only temperatures were considered. Multivariate extensions of the methods are currently being studied and could improve the resulting clusterings further. The necessary arguments for such an endeavor are reasonably straightforward for Gaussian data. However, as precipitations are decisively non-Gaussian, several nuances may need to be addressed to obtain a method that performs optimally. Even in multivariate Gaussian settings, we do not recommend standardizing each component series by subtracting a mean and dividing by a standard deviation. Indeed, one of the components may be a better discriminator than the others; mean and covariance information can be destroyed in such a transformation. Rather, multivariate extensions should simply use multivariate versions of the periodic time series models. This topic is an active area of current statistical research.
Acknowledgments
The authors thank Caspar Ammann, Philippe Naveau, and Julien Emile-Geay for useful conversations on the subject matter. The problems considered here were posed at the Junior Faculty Forum at the National Center for Atmospheric Research in July 2007. The comments made by two referees substantially improved this manuscript.
REFERENCES
Bengtsson, T., and J. E. Cavanaugh, 2008: State-space discrimination and clustering of atmospheric time series data based on Kullback information measures. Environmetrics, 19 , 103–121.
Boets, J., K. De Cock, M. Espinoza, and B. De Moor, 2005: Clustering time series, subspace identification, and cepstral distances. Commun. Inf. Syst., 5 , 69–96.
Böhm, R., I. Auer, M. Brunetti, M. Maugeri, T. Nanni, and W. Schöner, 2001: Regional temperature variability in the European Alps: 1760–1988 from homogenized instrumental records. Int. J. Climatol., 21 , 1779–1801.
Brockwell, P. J., and R. A. Davis, 1991: Time Series: Theory and Methods. 2nd ed. Springer-Verlag, 577 pp.
Bunkers, M. J., J. R. Miller Jr., and A. T. DeGaetano, 1996: Definition of climate regions in the northern plains using an objective cluster modification technique. J. Climate, 9 , 130–146.
Calinski, R. B., and J. Harabasz, 1974: A dendrite method for cluster analysis. Commun. Stat., 3 , 1–27.
Coates, D. S., and P. J. Diggle, 1986: Tests for comparing two estimated spectral densities. J. Time Ser. Anal., 7 , 7–20.
Cochrane, D., and G. H. Orcutt, 1949: Application of least squares regression to relationships containing auto-correlated error terms. J. Amer. Stat. Assoc., 44 , 32–61.
Davis, R. E., and D. R. Walker, 1992: An upper-air synoptic climatology of the western United States. J. Climate, 5 , 1449–1467.
DeGaetano, A. T., 1996: Delineation of mesoscale climate zones in the northeastern United States using a novel approach to cluster analysis. J. Climate, 9 , 1765–1782.
DeGaetano, A. T., 2001: Spatial grouping of United States climate stations using a hybrid clustering approach. Int. J. Climatol., 21 , 791–807.
Devore, J. L., and K. N. Berk, 2007: Modern Mathematical Statistics with Applications. Thomson Higher Education, 838 pp.
Fovell, R. G., 1997: Consensus clustering of U.S. temperature and precipitation data. J. Climate, 10 , 1405–1427.
Fovell, R. G., and M. C. Fovell, 1993: Climate zones of the conterminous United States defined using cluster analysis. J. Climate, 6 , 2103–2135.
Gerstengarbe, F-W., P. C. Werner, and K. Fraedrich, 1999: Applying non-hierarchical cluster analysis algorithms to climate classification: Some problems and their solution. Theor. Appl. Climatol., 64 , 143–150.
Gong, X., and M. B. Richman, 1995: On the application of cluster analysis to growing season precipitation data in North America east of the Rockies. J. Climate, 8 , 897–931.
Hartigan, J. A., 1975: Clustering Algorithms. John Wiley & Sons, 351 pp.
Kakizawa, Y., R. H. Shumway, and M. Taniguchi, 1998: Discrimination and clustering for multivariate time series. J. Amer. Stat. Assoc., 93 , 328–340.
Kalkstein, L., G. Tan, and J. A. Skindlov, 1987: An evaluation of three clustering procedures for use in synoptic climatological classification. J. Climate Appl. Meteor., 26 , 717–730.
Kidson, J. W., 2000: An analysis of New Zealand synoptic types and their use in defining weather regimes. Int. J. Climatol., 20 , 299–316.
Lund, R. B., and I. V. Basawa, 2000: Recursive prediction and likelihood evaluation for periodic ARMA models. J. Time Ser. Anal., 21 , 75–93.
Lund, R. B., and J. Reeves, 2002: Detection of undocumented changepoints: A revision of the two-phase regression model. J. Climate, 15 , 2547–2554.
Lund, R. B., H. Hurd, P. Bloomfield, and R. L. Smith, 1995: Climatological time series with periodic correlation. J. Climate, 8 , 2787–2809.
Lund, R. B., H. Bassily, and B. Vidakovic, 2009: Testing equality of autocovariance functions. J. Time Ser. Anal., in press.
Maharaj, E. A., 2000: Clusters of time series. J. Classif., 17 , 297–314.
Mardia, K. V., J. T. Kent, and J. M. Bibby, 1979: Multivariate Analysis. Academic Press, 521 pp.
Michelangeli, P-A., R. Vautard, and B. Legras, 1995: Weather regimes: Recurrence and quasi stationarity. J. Atmos. Sci., 52 , 1237–1256.
Moeckel, R., and B. Murray, 1997: Measuring the distance between time series. Physica D, 102 , 187–194.
Steinbach, M., P-N. Tan, V. Kumar, S. Klooster, and C. Potter, 2003: Discovery of climate indices using clustering. Proc. Ninth ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, New York, NY, ACM, 446–455.
Stephenson, D. B., A. Hannachi, and A. O’Neill, 2004: On the existence of multiple climate regimes. Quart. J. Roy. Meteor. Soc., 130 , 583–605.
Stooksbury, D. E., and P. J. Michaels, 1990: Cluster analysis of southeastern U.S. climate stations. Theor. Appl. Climatol., 44 , 143–150.
Straus, D. M., S. Corti, and F. Molteni, 2007: Circulation regimes: Chaotic variability versus SST-forced predictability. J. Climate, 20 , 2251–2272.
Thornthwaite, C. W., 1931: The climates of North America, according to a new classification. Geogr. Rev., 38 , 55–94.
Unal, Y., T. Kindap, and M. Karaca, 2003: Redefining the climate zones of Turkey using cluster analysis. Int. J. Climatol., 23 , 1045–1055.
Vrac, M., and P. Naveau, 2007: Stochastic downscaling of precipitation: From dry events to heavy rainfalls. Water Resour. Res., 43 , W07402. doi:10.1029/2006WR005308.
Wilks, D., 2006: Statistical Methods in the Atmospheric Sciences. 2nd ed. Academic Press, 648 pp.
Yao, C. S., 1997: A new method of cluster analysis for numerical classification of climate. Theor. Appl. Climatol., 57 , 111–118.
Appendix
Distance Computation Details
This appendix shows how to compute d(X, Y) from specification of a trend function f and a time series model for the errors {ϵt}. The procedure is best illustrated by example. Henceforth, we will fit the regression in (2.4) [f(t) = αt + κNAOt] with first-order periodic autoregressive {ϵt}; the discourse for other models is similar.
The PAR(1) model for the errors takes

ϵnT+ν = ϕν ϵnT+ν−1 + ZnT+ν,     (A.1)

where {Zt} is zero mean white noise with the seasonal variances var(ZnT+ν) = σν². Once the mean structure has been estimated and residuals {ϵ̂t} formed, moment (conditional least squares) estimators of the seasonal parameters are

ϕ̂ν = (Σn ϵ̂nT+ν ϵ̂nT+ν−1)/(Σn ϵ̂nT+ν−1²),     (A.2)

σ̂ν² = Nν−1 Σn (ϵ̂nT+ν − ϕ̂ν ϵ̂nT+ν−1)²,     (A.3)

where Nν is the number of cycles of data observed during season ν and the sums extend over all cycles n for which the indicated quantities exist.
If a PAR(1) model is judged inadequate, higher-order periodic autoregressive moving-average models may merit attention. These are discussed in Lund and Basawa (2000). We proceed with a PAR(1) model below as it is simple and reasonably flexible.
One can perform several iterations of a Cochrane–Orcutt scheme (Cochrane and Orcutt 1949) to tune the first-moment and PAR(1) time series parameter estimates jointly. For example, with the time series estimates in (A.2) and (A.3), one can reestimate the mean parameters in β with weighted least squares methods. Such a procedure will require the PAR(1) covariance matrix, which is quantified in Eqs. (5.4)–(5.7) in Lund and Basawa (2000).
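A sketch of one such iteration, simplified here to a nonperiodic AR(1) error model with the classical quasi-differencing form of Cochrane–Orcutt (the PAR(1) version would use seasonal ϕν and the weighted least squares step described above):

```python
import numpy as np

def cochrane_orcutt(y, X, iters=10):
    """Cochrane-Orcutt iteration for a regression with AR(1) errors:
    alternate estimating the AR coefficient from the current residuals
    with OLS on the quasi-differenced data."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # initial OLS fit
    phi = 0.0
    for _ in range(iters):
        resid = y - X @ beta
        phi = np.sum(resid[1:] * resid[:-1]) / np.sum(resid[:-1] ** 2)
        ystar = y[1:] - phi * y[:-1]              # quasi-difference the data
        Xstar = X[1:] - phi * X[:-1]
        beta = np.linalg.lstsq(Xstar, ystar, rcond=None)[0]
    return beta, phi

# Linear trend plus AR(1) errors with known parameters.
rng = np.random.default_rng(3)
n = 2000
t = np.arange(n, dtype=float)
eps = np.zeros(n)
for i in range(1, n):
    eps[i] = 0.6 * eps[i - 1] + rng.normal()
y = 1.0 + 0.005 * t + eps
X = np.column_stack([np.ones(n), t])
beta, phi = cochrane_orcutt(y, X)
print(np.round(beta, 3), round(phi, 2))
```

The iteration recovers both the regression coefficients and the error autocorrelation jointly, which is the point of tuning the first-moment and time series estimates in tandem.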
Fig. 1. Estimated (a) μν, (b) ϕν, and (c) σν for Athens and Atlanta.
Citation: Journal of Climate 22, 7; 10.1175/2008JCLI2455.1
Fig. 2. Examples of simulated time series for cases I–V.
Fig. 3. Station assignments to six clusters using (top) the distance in (2.1) and (bottom) the squared Euclidean distance for single linkage clustering.
Fig. 4. As in Fig. 3, but for complete linkage clustering.
Fig. 5. As in Fig. 3, but for average linkage clustering.
Fig. 6. Station assignments of grand mean adjusted anomalies to six zones using a Euclidean distance and average linkage clustering.
Fig. 7. As in Fig. 6, but for monthly adjusted anomalies.
Fig. 8. Connectivity of Larimer County stations.
Table 1. Empirical probabilities based on the new distance.
Table 2. Empirical probabilities based on the squared Euclidean distance.