Extratropical cyclones can vary widely in their configuration during cyclogenesis, development mechanisms, spatial and temporal characteristics, and impacts. An automated method to classify extratropical cyclones identified in ERA-Interim data from 1979 to 2010 in the Australia and New Zealand region has been developed. The technique uses K-means clustering on two upper-tropospheric flow fields at the time of cyclogenesis and identifies four distinct clusters. Composites of these clusters are investigated, along with their life cycles and their spatial and temporal variability. The four clusters are similar to a previous manual classification. Cluster 1 develops in the equatorward entrance region of the subtropical jet, clusters 2 and 4 develop in the poleward exit region of the subtropical jet but with different relative positions of the upper-level trough and jet streak, and cluster 3 resembles secondary cyclogenesis on a preexisting front far poleward of the subtropical jet. The clusters have different impacts in terms of their precipitation (cluster 1 has the highest average precipitation), different seasonal cycles, and different preferred genesis locations. Features of the composite cyclones resemble extratropical cyclones from other regions, indicating the utility of the method over larger regions. The method has been developed to be easily applied to climate model output in order to evaluate the ability of models to represent the full range of observed extratropical cyclones.
Not all extratropical cyclones are created equal. While they are a ubiquitous feature of the midlatitudes, there are many different configurations of the atmospheric circulation conducive to their cyclogenesis. They can also develop and intensify in different ways, giving rise to differing impacts. It is these impacts—the precipitation and strong winds that they bring—that are of socioeconomic importance and that are one of the reasons for the great interest in how extratropical cyclones will change in the future (e.g., Kirtman et al. 2013).
There have been many techniques and methods used to distinguish and classify different flavors of extratropical cyclone genesis and development [see the review by Catto (2016)]. Some of these methods have developed from a weather forecasting point of view (e.g., Young 1993) or focused on small regions (e.g., Evans et al. 1994). Past studies have considered the cloud features using satellite imagery (e.g., Zillman and Price 1972; Evans et al. 1994), with certain cloud features visible in satellite images giving information about how the surface cyclone below is developing. Others have considered atmospheric precursors (e.g., Dacre and Gray 2013; Graf et al. 2017) or considerations of upper- versus lower-level forcing (e.g., Petterssen and Smebye 1971; Deveson et al. 2002; Graf et al. 2017) to better understand the variability of cyclone development. Sinclair and Revell (2000, hereinafter SR00) performed a manual classification on 40 cyclones (from 1990 to 1994) identified in the Australia and New Zealand region. Their classification was based on upper-tropospheric flow features (300-hPa wind speed and geopotential height) and yielded four distinct classes. Despite the use of only two variables, these classes exhibited differences in frontal development and cyclone structure through the life cycle.
In this paper, a relatively simple automated method based on clustering has been developed to classify extratropical cyclones in the Australia and New Zealand region during the period 1979–2010, using an objective feature tracking method applied to the European Centre for Medium-Range Weather Forecasts (ECMWF) interim reanalysis (ERA-Interim) product (Dee et al. 2011). The results of the automated classification used here will be compared using composites against the manual classification of SR00 to determine the wider utility of such a method to realistically depict the variation of cyclone types. Composites have been used in a number of previous studies to investigate the most important features of extratropical cyclones, while removing some of the noise associated with individual systems (e.g., SR00; Field and Wood 2007; Catto et al. 2010; Hawcroft et al. 2016). As well as comparing the features studied in SR00, additional fields are investigated here to further understand the different impacts of the cyclones. The average frequency of cold and warm fronts and their associated precipitation are determined for the different cyclone classes.
The spatial and temporal variability of the cyclone clusters is investigated using the long period of reanalysis data. The Australian and New Zealand region is strongly influenced by interannual variability related to El Niño–Southern Oscillation (ENSO) and by the southern annular mode (SAM), so a question arises of how these may affect the types of cyclones that occur in this region.
A further goal of this work is to develop a method that could be easily applied to data from climate models such as phase 5 of the Coupled Model Intercomparison Project (CMIP5; Taylor et al. 2012). This methodology will provide an objective means to evaluate the realism of climate models against observational data and to intercompare different climate models. To understand how extratropical cyclones and their associated impacts may change in the future, we need to first investigate whether climate models—our primary tool in providing projections of future climate—are able to represent the full spectrum of extratropical cyclone behavior.
Many studies have evaluated the mean extratropical storm tracks in climate models (e.g., Ulbrich et al. 2008; Catto et al. 2011; Colle et al. 2013; Zappa et al. 2013), and overall these seem to be improving over time (Flato et al. 2013). The dynamical structure of the most intense extratropical cyclones in the High-Resolution Global Environment Model (HiGEM; Shaffrey et al. 2009) in the Northern Hemisphere was found to be well represented compared to ERA-40 (Catto et al. 2010); however, many studies show an underestimate of average cyclone intensity (Zappa et al. 2013) and a negative bias in the frequency of the most intense extratropical cyclones (e.g., Seiler and Zwiers 2016).
Studies have shown that models can have an incorrect distribution of clouds within extratropical cyclones (Field et al. 2008; Naud et al. 2010; Booth et al. 2013; Govekar et al. 2014; Hawcroft et al. 2017). The relationship between the dynamical and cloud features in models may also be poorly represented (Govekar et al. 2014). In order for models to represent the cyclone intensification associated with diabatic processes, high resolution is required (Willison et al. 2013).
Following these results, it seems possible that extratropical cyclones that are more dependent on latent heating for their development will be less well represented in climate models. This sort of detail would be masked by the consideration of all types of cyclones together. Separating out different classes may lead to added insight into the representation of these features in models.
The objective classification method described here offers a way of grouping cyclones using atmospheric variables that are widely available from climate model simulations. Manual techniques are impractical for long periods and for multiple models, and they are also not a repeatable methodology—a different synoptic practitioner may place the cyclones into different classes. To test the method, the study region has been chosen to match with SR00. However, if it is seen to produce similar groups of cyclones to the manual classification, it gives motivation to further develop the technique to be applied on larger regions (potentially global) in order to be able to compare against climate model output.
This paper seeks to address the following questions:
Can a simple clustering method identify previously defined classes of extratropical cyclones?
What are the distinct characteristics of cyclones in the various clusters?
What is the spatial and temporal variability of the cyclones in the different clusters?
In section 2, a brief description is given of the data used and the cyclone identification method. A longer explanation of the selection of tracks and the clustering method is also given. Section 3 gives a description of the clusters found using the described method and how they compare with SR00. Section 4 explores the spatial and temporal variability of the clusters, and section 5 gives a summary and discussion.
2. Data and methods
Data from ERA-Interim, extracted at 1.5° resolution, have been used for the period 1979–2010. Most of the fields have been obtained directly from the reanalysis, except for the 850-hPa relative vorticity, which is required for the objective feature tracking and is calculated from the zonal and meridional wind, and the frontogenesis function (described below). Precipitation is also taken from ERA-Interim forecast fields and represents 6-hourly accumulations (in the 6 h following the analysis time) from the 0–6- and 6–12-h forecasts from 0000 and 1200 UTC (as used in Catto and Pfahl 2013). Hawcroft et al. (2016) showed that there is a low bias in the precipitation from ERA-Interim in the first 12 h of the forecast due to model spinup. In their analysis, Hawcroft et al. (2016) opted to use the 12–24-h forecast lead time precipitation to overcome this bias. The sensitivity of the results to the use of the data from Hawcroft et al. (2016) has been investigated (see the appendix) and found not to alter the conclusions of the study.
The front identification of Berry et al. (2011) based on the work of Hewson (1998) has been used. This algorithm has been described previously in Berry et al. (2011), Hope et al. (2014), Catto and Pfahl (2013), and Catto et al. (2015), among others, so only a very brief description of the method is given here. Frontal points are identified where the gradient of a thermal front parameter (TFP) is zero, where $\mathrm{TFP} = -\nabla|\nabla\theta_w| \cdot (\nabla\theta_w/|\nabla\theta_w|)$ and $\theta_w$ is the wet bulb potential temperature on 850 hPa, after the TFP field has been masked out above a threshold value. The fronts are separated into cold, warm, and quasi-stationary, and only the warm and cold fronts are used in this study. The fronts are identified using the ERA-Interim data degraded to a resolution of 2.5° as in Catto et al. (2012).
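As an illustration of the thermal front parameter only (not the full Berry et al. front-identification algorithm), the following minimal sketch evaluates the commonly used form TFP = −∇|∇θw| · (∇θw/|∇θw|) with finite differences. It assumes a uniform Cartesian grid rather than the latitude–longitude grid of the reanalysis, and the function name `thermal_front_parameter` and the demonstration field are hypothetical.

```python
import numpy as np

def thermal_front_parameter(theta_w, dx, dy):
    """TFP = -grad|grad(theta_w)| . grad(theta_w)/|grad(theta_w)|.

    theta_w is a 2D array indexed [y, x]; dx and dy are uniform grid
    spacings (a Cartesian simplification, assumed for illustration).
    """
    dth_dy, dth_dx = np.gradient(theta_w, dy, dx)
    grad_mag = np.hypot(dth_dx, dth_dy)
    dmag_dy, dmag_dx = np.gradient(grad_mag, dy, dx)
    # Return NaN where the thermal gradient vanishes (unit vector undefined).
    safe_mag = np.where(grad_mag > 0.0, grad_mag, np.nan)
    return -(dmag_dx * dth_dx + dmag_dy * dth_dy) / safe_mag

# Demonstration on an analytic field theta_w = x**2 (x > 0), for which
# grad(theta_w) = (2x, 0), |grad(theta_w)| = 2x, and so TFP = -2.
x = np.linspace(1.0, 5.0, 21)
y = np.linspace(0.0, 2.0, 11)
X, Y = np.meshgrid(x, y)
tfp_demo = thermal_front_parameter(X**2, x[1] - x[0], y[1] - y[0])
```

Away from the grid edges, where one-sided differences are less accurate, the demonstration field returns the analytic value of −2.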
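Frontogenesis has been calculated on the 850-hPa level using the temperature and wind fields from ERA-Interim and with the Petterssen (1936) definition:

$$F = \frac{1}{2}\,|\nabla\theta|\,(E\cos 2\beta - D),$$

where $D$ is the divergence, $D = \partial u/\partial x + \partial v/\partial y$; $E$ is the total deformation magnitude, $E = (E_{st}^{2} + E_{sh}^{2})^{1/2}$, where $E_{st} = \partial u/\partial x - \partial v/\partial y$ (the stretching deformation) and $E_{sh} = \partial v/\partial x + \partial u/\partial y$ (the shearing deformation); and $\beta$ is the angle between the isentropes and the axis of dilatation.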
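The Petterssen frontogenesis function is commonly written as F = ½|∇θ|(E cos 2β − D), with θ the potential temperature. A minimal sketch of how it might be evaluated on gridded data follows. It assumes a uniform Cartesian grid, and it obtains β from the orientation of the isentropes (perpendicular to ∇θ) and of the dilatation axis (half the angle whose tangent is Esh/Est). The function name and the test flow are hypothetical and are not taken from the study.

```python
import numpy as np

def petterssen_frontogenesis(theta, u, v, dx, dy):
    """F = 0.5 * |grad(theta)| * (E * cos(2*beta) - D) on a Cartesian grid.

    All inputs are 2D arrays indexed [y, x]; dx, dy are uniform spacings.
    """
    dth_dy, dth_dx = np.gradient(theta, dy, dx)
    du_dy, du_dx = np.gradient(u, dy, dx)
    dv_dy, dv_dx = np.gradient(v, dy, dx)

    divergence = du_dx + dv_dy          # D
    e_stretch = du_dx - dv_dy           # E_st
    e_shear = dv_dx + du_dy             # E_sh
    e_total = np.hypot(e_stretch, e_shear)
    grad_mag = np.hypot(dth_dx, dth_dy)

    # Orientation of the axis of dilatation, measured from the x axis.
    dilatation_angle = 0.5 * np.arctan2(e_shear, e_stretch)
    # Isentropes lie perpendicular to the potential temperature gradient.
    isentrope_angle = np.arctan2(dth_dy, dth_dx) + 0.5 * np.pi
    beta = isentrope_angle - dilatation_angle

    return 0.5 * grad_mag * (e_total * np.cos(2.0 * beta) - divergence)

# Pure stretching deformation (u = x, v = -y): dilatation axis along x.
x = np.linspace(-2.0, 2.0, 9)
y = np.linspace(-2.0, 2.0, 9)
X, Y = np.meshgrid(x, y)
h = x[1] - x[0]
# Isentropes parallel to the dilatation axis: frontogenetic.
f_parallel = petterssen_frontogenesis(-Y, X, -Y, h, h)
# Isentropes perpendicular to the dilatation axis: frontolytic.
f_perpendicular = petterssen_frontogenesis(X, X, -Y, h, h)
```

In this idealized flow the gradient of a field with isentropes along the dilatation axis strengthens at unit rate, and the perpendicular case weakens at the same rate, which gives a quick sanity check on the sign conventions.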
b. Cyclone identification and tracking
Extratropical cyclones are identified using the objective feature identification and tracking methodology of Hodges (1994, 1995, 1999), as demonstrated in Hoskins and Hodges (2002, 2005). This method identifies cyclones in the SH as minima (maxima in the NH) of the 850-hPa relative vorticity, which is first truncated to spectral T42 resolution (approximately 300-km grid spacing) in order to maintain only the synoptic scales. Cyclone centers are grouped into tracks by first applying a nearest neighbor search, and then by minimizing a cost function to determine the smoothest tracks. To keep only mobile cyclones in the dataset, two further criteria are applied: the cyclones must live for at least 2 days (8 time steps) and travel at least 1000 km in order to eliminate quasi-stationary features such as heat lows. Cyclogenesis is defined as the first point that is identified for each track.
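The two mobility criteria can be sketched as a simple filter on a track's 6-hourly positions. This is an illustration only, not the tracking code used in the study. It assumes that "8 time steps" means at least 8 track points and that the distance travelled is the great-circle separation of the first and last points; the source does not specify either detail. The function names are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def is_mobile(lats, lons, min_steps=8, min_distance_km=1000.0):
    """Keep tracks with enough points and enough start-to-end displacement.

    Both interpretations (point count, start-to-end distance) are
    assumptions made for this sketch.
    """
    if len(lats) < min_steps:
        return False
    travelled = great_circle_km(lats[0], lons[0], lats[-1], lons[-1])
    return bool(travelled >= min_distance_km)
```

A cumulative along-track distance would be an equally plausible reading of the distance criterion and would only require summing `great_circle_km` over consecutive points.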
For much of the analysis that follows, the spatial fields surrounding the cyclone center are considered. These fields are extracted from the ERA-Interim data by using a radial coordinate system (e.g., Bengtsson et al. 2006; Catto et al. 2010). A spherical cap is centered on the pole and rotated to the cyclone center. The latitude–longitude gridded field is then regridded on the 20° radial cap and saved for each cyclone and track point.
c. Cyclone track selection
To be able to sensibly compare the results of the cyclone classification with SR00, a number of steps were performed in the selection of the tracks to analyze.
Selection of tracks within a certain region: The same region of study as used in SR00 was chosen here (25°–50°S, 150°E–150°W; shown by the black boxes in Fig. 1). SR00 selected only cyclones with their entire lifetime within the box. Because of the typical length of the cyclone tracks identified here [mean of 16.9 points (just over 4 days) for all cyclones in the SH], a less restrictive criterion was used, so that cyclones with their genesis (first identified point) and at least 50% of their track points within the box are chosen.
Selection of developing cyclones: Here the evolution of the T42 resolution central vorticity at 850 hPa is used to identify cyclones that develop after their identified genesis time. The same criteria as SR00 are used: the central cyclone intensity (the T42 vorticity at the track point) must be less than 4 cyclonic vorticity units (CVU; 1 CVU = $-1 \times 10^{-5}\,\mathrm{s^{-1}}$ in the SH) at the time of genesis, but increase to greater than 6 CVU at some point later in its life cycle.
Selection of cyclones of sufficient strength (circulation): SR00 also used the criterion of the cyclones reaching at least 6 circulation units (CU; 1 CU = $1 \times 10^{7}\,\mathrm{m^{2}\,s^{-1}}$) using the methods described in Sinclair (1997), and we have applied the same method to our data as follows. First, the region where cyclonic vorticity is decreasing with distance from the cyclone center is defined on the cyclone-centered radial grid. Next, the circulation $C$ of the cyclones is calculated using the T42 vorticity within this region as $C = \sum_{n=1}^{P} \xi_{T42,n} A_n$, where $P$ is the number of grid boxes within the specified area, $\xi_{T42,n}$ is the T42 resolution vorticity for each grid box, and $A_n$ is the area of each grid box.
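The three selection steps above can be sketched as small predicate functions. This is a hedged illustration: the function names are hypothetical, the vorticity passed to `is_developing` is assumed to be already expressed in CVU (positive for cyclonic), and the sign flip in `circulation_cu` (so that cyclonic circulation in the SH is positive) is an assumption. The mask of the region where cyclonic vorticity decreases outward is taken as given.

```python
import numpy as np

LAT_SOUTH, LAT_NORTH = -50.0, -25.0   # 25°–50°S
LON_WEST, LON_EAST = 150.0, 210.0     # 150°E–150°W on a 0–360° axis

def in_box(lat, lon):
    """True if a point lies in the study box (handles the date line)."""
    lon360 = float(np.mod(lon, 360.0))
    return (LAT_SOUTH <= lat <= LAT_NORTH) and (LON_WEST <= lon360 <= LON_EAST)

def passes_region(lats, lons, min_fraction=0.5):
    """Genesis point inside the box and at least 50% of points inside."""
    inside = [in_box(la, lo) for la, lo in zip(lats, lons)]
    return inside[0] and (sum(inside) / len(inside) >= min_fraction)

def is_developing(vort_cvu, genesis_max=4.0, later_min=6.0):
    """Below 4 CVU at genesis, above 6 CVU at some later track point."""
    later_peak = max(vort_cvu[1:], default=float("-inf"))
    return vort_cvu[0] < genesis_max and later_peak > later_min

def circulation_cu(vort_t42, cell_area_m2, region_mask):
    """C = sum_n xi_n * A_n over the masked region, returned in CU.

    vort_t42 is in s^-1, cell_area_m2 in m^2. The leading minus sign makes
    SH cyclonic circulation positive (an assumption of this sketch).
    """
    total = np.sum(vort_t42[region_mask] * cell_area_m2[region_mask])
    return -total / 1.0e7
```

For example, four grid boxes of area 10^11 m^2, each with a vorticity of −5 × 10^−5 s^−1, give a circulation of 2 CU, below the 6-CU threshold.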
The number of cyclones in the dataset for each season at each of these steps is given in Table 1. Considering all the cyclone tracks identified in the SH, the summer season [December–February (DJF)] has the fewest, consistent with previous studies (Simmonds and Keay 2000a,b). Spring [September–November (SON)] and autumn [March–May (MAM)] feature the largest number of cyclone tracks, likely associated with the maxima in midtropospheric temperature gradients seen at high latitudes in these seasons (van Loon 1967). However, many of these occur around the edge of Antarctica (not shown) and are not relevant for this study. Once the region selection has taken place, the highest frequency of cyclones occurs during winter [June–August (JJA)] when the subtropical jet is at its strongest, with a similar number in SON, and the lowest during DJF. This is consistent with other climatologies of cyclones in the Australian region (e.g., Speer et al. 2009) and the latitudinal variations shown by Simmonds and Keay (2000a). Once the filtering for strong, developing cyclones has taken place, JJA has by far the largest number of cyclones, indicating the occurrence of the strongest cyclones in winter. The final number of cyclones included in the clustering is 483. These tracks have a mean track length of 24.6 points (just over 6 days), with one track of 78 points (19.5 days).
The genesis density and track density (both with units of cyclones per month per 5° radius circle) are shown in Fig. 1. Note that although no cyclones have their genesis outside of the region of interest, because of the smoothing that occurs with the counting, the genesis density outside the box is nonzero in some places. DJF (Fig. 1a) shows the minimum genesis density in the region, while MAM (Fig. 1b) and JJA (Fig. 1c) show the maximum. In all seasons there is a local maximum of cyclogenesis close to the east coast of Australia, which is most pronounced during winter. This is consistent with the higher frequency of east coast lows (ECLs) identified in these seasons (e.g., Dowdy et al. 2013). During DJF, the highest cyclogenesis occurs to the west of New Zealand. Most of the cyclogenesis in all seasons occurs in the western half of the study region, associated with the criterion of 50% of track points lying within the box and the typical eastward propagation of the systems. The track density statistics (Figs. 1e–h) further highlight this feature, with large track density values across much of the region. During winter, the maximum track density lies at a lower latitude than the other seasons, but there are more cyclones that track farther poleward (indicated by the higher values of track density to the south of the region in the box). This is potentially an artifact of the region selection, with the winter cyclones remaining in the box long enough to fulfill the selection criterion of 50% of points within the box more often than the summer cyclones.
To automatically group the cyclones according to their similarities, a simple clustering method is employed. Graf et al. (2017) showed in their recent classification that cyclone genesis features exist as a continuum. However, the wide variability allows classes with distinct features to be identified. Clustering has been used in the fields of meteorology and climate since the late 1960s, and there are a number of different available methods (see review by Gong and Richman 1995). Ayrault et al. (1995) used clustering on 850-hPa vorticity to separate different subsynoptic cyclone types over a small region of the United Kingdom. Here K-means clustering has been used. K-means clustering specifically has recently been used in a number of ways related to synoptic-scale meteorology: to identify distinct synoptic-scale patterns associated with different vertical profiles from radiosonde data in northern Australia (Pope et al. 2009b), to identify wind regimes in the Antarctic region (Coggins et al. 2014), on spatial patterns of precipitation to again identify important synoptic-scale features (Raut et al. 2014), and to objectively identify extratropical transition of tropical cyclones (Studholme et al. 2015).
1) Fields used
Fields of upper-level winds (at 250 hPa; U250) and anomalies from the zonal mean of potential temperature on the tropopause [where potential vorticity is −2 PVU (1 PVU = $10^{-6}\,\mathrm{K\,kg^{-1}\,m^{2}\,s^{-1}}$); θPV2] have been calculated from the ERA-Interim dataset and used in the clustering. The cyclone-centered spatial fields are used as the basis for the clustering. The 20° radial regions of these fields around the cyclone center are extracted at the time of genesis for each cyclone. The full spatial patterns are used in the clustering algorithm, rather than any single point or value. The 20° radial fields for each cyclone are normalized using the mean and the standard deviation of the spatial pattern. This normalization is necessary when using two fields (winds and potential temperature) of different magnitude and variability.
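One reading of the normalization described above is that each cyclone-centered field is standardized by its own spatial mean and standard deviation, after which the two fields are joined into a single feature vector per cyclone. The sketch below follows that reading. The function names, the array shapes, and the synthetic input values are hypothetical.

```python
import numpy as np

def normalize_pattern(field):
    """Standardize one cyclone-centered field by its own spatial statistics."""
    field = np.asarray(field, dtype=float)
    return (field - field.mean()) / field.std()

def build_feature_vector(u250, theta_pv2):
    """Concatenate the two normalized patterns into one vector per cyclone."""
    return np.concatenate([normalize_pattern(u250).ravel(),
                           normalize_pattern(theta_pv2).ravel()])

# Synthetic stand-ins for one cyclone's radial-grid fields (shape assumed).
rng = np.random.default_rng(0)
demo_u250 = rng.normal(40.0, 10.0, size=(21, 36))      # m s^-1
demo_theta = rng.normal(0.0, 5.0, size=(21, 36))       # K, zonal-mean anomaly
demo_vector = build_feature_vector(demo_u250, demo_theta)
```

After this step both halves of the vector have zero mean and unit variance, so neither field dominates the distance calculation purely because of its units.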
2) Clustering method
K-means clustering is an iterative technique that objectively groups objects (in this case cyclones) that are most similar to each other. With this type of clustering method, the number of clusters (often referred to as K) needs to be chosen a priori. To compare the results with those of SR00, four clusters were chosen. The clustering algorithm was also applied using differing values of K in order to investigate the variability of cluster centroids that would result. With only three clusters, one of the classes of SR00 was clearly missing. For four clusters and above, the main classes are usually visible in the cluster centroids. (The centroids of the clusters for K = 3, 5, 6, and 7 are shown in Figs. S1–S4 of the supplemental material.)
Methods to objectively select the most appropriate number of clusters have been suggested (Gong and Richman 1995; Rossow et al. 2005; Pope et al. 2009a). Many of these are suited to datasets where the clustering is performed on a single variable, thereby allowing the changes in cluster means to be analyzed statistically as the number of clusters is increased. Although these methods have not been used here to choose the number of clusters, some analysis has been done to check that the choice of four clusters is statistically sensible. An elbow analysis (investigating the proportion of explained variance as the number of clusters increases), applied to the spatial fields of both wind speed and potential temperature on the tropopause, indicates that the increase in the explained variance slows after four clusters (see Fig. S5 in the supplemental material) and again after seven clusters. On inspection of the centroids produced for seven clusters (see Fig. S4), it could be seen that some centroids were very similar to each other, and so seven clusters was deemed to be too many. This analysis gives some confidence in the choice of the number of clusters and the choice of four classes by SR00. When considering larger regions without a manual classification against which to compare, the use of such objective methods will be more important.
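The quantity plotted in an elbow analysis can be sketched as follows. The source does not give its exact definition of explained variance, so the sketch assumes the common choice of one minus the ratio of within-cluster to total sum of squares. The function name is hypothetical.

```python
import numpy as np

def explained_variance(features, labels):
    """Fraction of total variance explained by a given cluster allocation.

    features has shape (n_cyclones, n_features); labels has shape
    (n_cyclones,). Defined here as 1 - within-cluster SS / total SS,
    which is an assumption about the metric used.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    total_ss = ((features - features.mean(axis=0)) ** 2).sum()
    within_ss = 0.0
    for k in np.unique(labels):
        members = features[labels == k]
        within_ss += ((members - members.mean(axis=0)) ** 2).sum()
    return 1.0 - within_ss / total_ss

# Tiny worked example: two tight pairs of points on a line.
demo_points = np.array([[0.0], [2.0], [10.0], [12.0]])
ev_two_clusters = explained_variance(demo_points, [0, 0, 1, 1])
ev_one_cluster = explained_variance(demo_points, [0, 0, 0, 0])
```

In the worked example the total sum of squares is 104 and the within-cluster sum is 4, so two clusters explain 1 − 4/104 of the variance. Evaluating this for increasing K and looking for where the gain flattens gives the elbow.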
The steps in the clustering are as follows:
The two-dimensional spatial patterns of U250 and θPV2 are read in and normalized.
A random cluster number is generated from 1 to K for each cyclone, and the initial cluster centroids for U250 and θPV2 are calculated by averaging the spatial patterns of these initial clusters.
The cyclones are allocated to the cluster to whose centroid they are closest and the new centroids are recalculated.
Since there is some sensitivity to the initial random cluster allocation, for 50 iterations, 30 cyclones are randomly changed to different clusters before the new centroids are calculated. After these 50 iterations, the clusters are allowed to converge. Once there are no more shifts between clusters (convergence has been achieved), the algorithm terminates.
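The steps listed above can be sketched in NumPy as follows. This is an illustrative reading rather than the code used in the study. It assumes Euclidean distance to the centroids, reseeds an empty cluster from a randomly chosen cyclone, and caps the final convergence loop at `max_iters`; none of these details are stated in the text. All function and parameter names are hypothetical.

```python
import numpy as np

def centroids_from_labels(features, labels, k, rng):
    """Average the feature vectors of each cluster's current members."""
    centroids = np.empty((k, features.shape[1]))
    for j in range(k):
        members = features[labels == j]
        if len(members):
            centroids[j] = members.mean(axis=0)
        else:
            # Empty cluster: reseed from a random cyclone (assumption).
            centroids[j] = features[rng.integers(len(features))]
    return centroids

def assign_to_nearest(features, centroids):
    """Allocate each cyclone to the centroid at least squared distance."""
    dist2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dist2.argmin(axis=1)

def kmeans_with_shuffling(features, k=4, n_shuffle_iters=50, n_shuffled=30,
                          max_iters=500, seed=0):
    rng = np.random.default_rng(seed)
    n = len(features)
    # Step 2: random initial cluster number for each cyclone.
    labels = rng.integers(0, k, size=n)
    # Steps 3-4: reallocate, then move some cyclones to a different cluster
    # before the centroids are recalculated on the next pass.
    for _ in range(n_shuffle_iters):
        centroids = centroids_from_labels(features, labels, k, rng)
        labels = assign_to_nearest(features, centroids)
        idx = rng.choice(n, size=min(n_shuffled, n), replace=False)
        labels[idx] = (labels[idx] + rng.integers(1, k, size=idx.size)) % k
    # Let the clusters converge: stop when no cyclone changes cluster.
    for _ in range(max_iters):
        centroids = centroids_from_labels(features, labels, k, rng)
        new_labels = assign_to_nearest(features, centroids)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, centroids_from_labels(features, labels, k, rng)

# Demonstration on two well-separated synthetic groups (not cyclone data).
demo_rng = np.random.default_rng(1)
demo_features = np.vstack([demo_rng.normal(0.0, 0.1, size=(40, 5)),
                           demo_rng.normal(10.0, 0.1, size=(40, 5))])
demo_labels, demo_centroids = kmeans_with_shuffling(demo_features, k=2)
```

The random reallocation acts as a perturbation that reduces the dependence on the initial allocation. The cluster numbers returned are arbitrary labels, so comparisons between runs should allow for permutations.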
e. Climate indices
Extratropical cyclones in the Australian and New Zealand region are influenced by large-scale climate drivers (e.g., Rudeva and Simmonds 2015) that impact the subtropical jet and hence may determine the variability of the occurrence of the different cyclone clusters. The ENSO index used is based on the Niño-3.4 region (5°N–5°S, 120°–170°W) sea surface temperature (SST) anomalies [calculated from the Extended Reconstructed Sea Surface Temperature (ERSST), version 4 (Huang et al. 2015), and obtained online from http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ensoyears.shtml]. Seasons where the index is less than −0.5°C are classified as La Niña, and seasons where the index is greater than 0.5°C are classified as El Niño.
The SAM index is based on pressure differences between 40° and 65°S (Marshall 2003, obtained online from https://legacy.bas.ac.uk/met/gjma/sam.html) and describes the meridional movement of the strongest westerlies. SAM negative seasons are those with SAM index less than −0.5 and SAM positive seasons are those with SAM index greater than 0.5. While an index of greater than 1.0 or less than −1.0 may often be considered, this was found to create sampling issues and was not used here.
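The season classification by climate index reduces to a threshold test, sketched below with the ±0.5 thresholds given above. The function names and the "neutral" label for seasons between the thresholds are hypothetical, since the text does not name the remaining seasons.

```python
def classify_enso(nino34_anomaly_c, threshold=0.5):
    """Classify a season from its Niño-3.4 SST anomaly (°C)."""
    if nino34_anomaly_c > threshold:
        return "El Niño"
    if nino34_anomaly_c < -threshold:
        return "La Niña"
    return "neutral"  # label assumed for illustration

def classify_sam(sam_index, threshold=0.5):
    """Classify a season from its SAM index."""
    if sam_index > threshold:
        return "positive"
    if sam_index < -threshold:
        return "negative"
    return "neutral"  # label assumed for illustration
```

Raising `threshold` to 1.0 reproduces the stricter definition mentioned above, at the cost of fewer seasons in each category.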
3. The clusters and how they compare with manual classification
The composites of cyclone-centered fields at the time of genesis are given for the four clusters in Figs. 2–5. The number of the cluster is an arbitrary label, but will be used consistently in the remainder of the paper. The numbers of cyclones in clusters 1–4 are 139, 123, 104, and 117, respectively.
Cluster 1 (Fig. 2) shows cyclogenesis on the equatorward side of the main jet streak, downstream of an upper-level trough (Fig. 2a), and resembles the equatorward entrance class (E) of SR00 (see their Fig. 8). This cluster is one of the clusters with the highest equivalent potential temperature at 850 hPa θe,850 at the time of cyclogenesis with values up to 325 K in the northern region of the composite, and a strong temperature gradient in the cold frontal region to the west of the cyclone center (Fig. 2b). At the time of genesis in the north of the composite, the surface winds are easterly and the feature resembles an easterly dip (e.g., Holland et al. 1987). This is somewhat different to the surface pattern seen for class E in SR00. At the time of genesis there are already large values of frontogenesis in both the warm frontal and cold frontal regions (Fig. 2b), which occurs in a classic deformation flow field due to the low pressure systems to the north and south and high pressure to the west and east. As the cyclone develops the frontogenesis in the warm frontal region increases markedly, while that in the cold frontal region and the cyclone center decreases (Figs. 2e,h), consistent with the analysis of the equatorward entrance class of SR00. Surface winds are confluent in the region of the strongest temperature gradient, which contributes to the strong frontogenesis that occurs between genesis and maximum growth. Highest values of composite precipitation occur where there is maximum frontogenesis (Fig. 2c), with the heaviest precipitation occurring at the time of maximum growth (Fig. 2f). The central pressure decreases from about 1002 hPa at the second stage to 990 hPa at maximum intensity.
Cluster 2 closely resembles the class D composite from SR00 (see their Fig. 9), with the upper-level jet streak downstream of the upper-level trough and the cyclone genesis occurring in the poleward jet exit region (Fig. 3a). There is a fairly deep upper-level trough with its axis to the west of the cyclone center. As the composite cyclone develops, the jet streak appears weaker, and the upper-level trough starts to tilt to the east (Fig. 3d). By the time of maximum intensity, there is a distinct cutoff almost directly above the surface low (Fig. 3g). At the time of cyclogenesis, there is frontogenesis greater than 4 K $(10^{3}\,\mathrm{km})^{-1}\,\mathrm{day}^{-1}$ in both the cold front and warm front regions (Fig. 3b). Only cold fronts are visible along the baroclinic zone from northwest to southeast (Fig. 3c). As the cyclone undergoes its maximum growth, there is an increase in frontogenesis in the warm front region (Figs. 3e,f), accompanied by an increase in precipitation in this region (Fig. 3f), and the identification of a high frequency of warm fronts.
Cluster 3 is most similar to class T (trough) of SR00, with the cyclone forming underneath a deep upper-level trough, far from the main jet streak (Fig. 4a). This upper-level trough appears much like a PV streamer (e.g., Martius et al. 2008), eventually becoming a cutoff feature lying almost directly above the surface cyclone center by the time of maximum intensity (Figs. 4g,h). In the SR00 trough class, there was a single jet streak visible to the north of the cyclone center, whereas in the composite shown for cluster 3 here, there is also a jet streak to the west of the upper-level trough (Fig. 4a). At the time of genesis, there is no closed pressure contour in the composite (Fig. 4b), rather a strong indication of genesis occurring on a preexisting cold front associated with a low pressure system at higher latitudes—so-called “secondary cyclogenesis” (e.g., Parker 1998). At the time of genesis, frontogenesis can be seen mainly in the cold frontal region (Figs. 4b,c), associated with the cold advection from the southwest. As the surface cyclone develops a closed contour of mean sea level pressure (MSLP), frontogenesis can be seen in the warm frontal region, and by the final stage, warm fronts can be identified, collocated with the main precipitation region (Figs. 4f,i). This cluster has the weakest warm front region due to relatively weaker warm advection. The maximum in cold front frequency seen at maximum intensity to the west of the cyclone center (Fig. 4i) is more likely a bent-back warm front (which the front identification would pick up as a cold front).
Cluster 4 resembles the class U (upstream exit) composite from SR00 (see their Fig. 7). The cyclone develops in the poleward jet exit region, as with cluster 2, but here the upper-level jet streak is upstream of the upper-level trough (Fig. 5a). As with the other clusters, there is a deep upper-level trough which undergoes cyclonic rotation as the life cycle progresses (Figs. 5a,d,g). There is no cutoff feature at upper levels by the time of maximum intensity, as there is in cluster 2. At the time of genesis, there is a rather zonally oriented baroclinic zone with θe,850 up to about 320 K in the north of the composite (Fig. 5b). There are quite weak surface winds, resulting in weak cold advection, slightly weaker frontogenesis, and fewer cold fronts identified, compared to the other clusters (Fig. 5c). There are already quite large values of precipitation in the warm front region at the time of genesis (Fig. 5c). By the time of maximum growth, the frontogenesis has increased markedly (Fig. 5e), especially in the warm front region, and the precipitation has increased (Fig. 5f). The strong development of the warm fronts can be related to the fairly strong warm advection seen on the east of the cyclone.
a. Cluster life cycles
The clusters clearly have a number of synoptic orientations. An interesting question is whether these characteristics impact the development and the life cycles of features of the composite cyclones. Figure 6 shows the life cycles of the cluster composites centered on the time of maximum vorticity (at T42 resolution). The life cycles of vorticity (Fig. 6a; this is the full-resolution vorticity rather than the T42 resolution; note that the vorticity has been multiplied by −1 as cyclonic vorticity is negative in the SH) show that the average composite maximum vorticity 120 h before the time of maximum intensity is about 10 CVU. About 60 h before maximum intensity the vorticity begins to increase in all clusters, although cluster 3 seems to develop slightly later than the others and has a quicker increase in vorticity (consistent with the rapid development seen in secondary cyclogenesis; Parker 1998). Cluster 1 vorticity begins to increase from 72 h prior to maximum intensity, likely associated with the latent heating from the increase in precipitation seen around the same time (Fig. 6c), and reaches the highest peak vorticity of all the clusters. The maximum vorticity is reached 6–12 h after the maximum precipitation for all clusters.
The MSLP life cycles look remarkably similar between the clusters in the period before the maximum intensity (which coincides with the minimum MSLP). All clusters show a slow decrease in pressure until about 60 h before maximum intensity, then an increased rate of deepening until the minimum pressure. After the time of maximum intensity, the MSLP life cycle curves diverge, likely related to the path the cyclones take. For example, cluster 2 maintains quite a low central MSLP as it moves poleward into a region of climatologically low pressure (see Fig. S8 in the supplemental material). Wind speeds in the cyclones also do not vary much between the clusters. Maximum wind speeds are seen at the time of maximum vorticity, and cluster 1 has slightly higher maximum values than the other clusters.
There is a large amount of variability within the clusters, indicated by the dashed lines at one standard deviation. Despite the different synoptic orientations, the life cycles of MSLP are similar. There is a larger difference between the clusters in the life cycles of impactful variables (precipitation and winds). To further determine the significance of the differences in precipitation between the clusters, normal distributions with cluster mean and standard deviation of the maximum precipitation along the tracks in each cluster (averaged within 5° radius of the cyclone centers) are shown in Fig. 7 along with the mean values and the statistical significance. The maximum precipitation values in clusters 1 and 4 are not statistically significantly different from each other, which can be seen by how close the density functions are for these clusters. However, the differences between all the other clusters are statistically significant.
b. Summary of clusters
Overall, the clusters identified using this automated method closely resemble the manually identified classes from SR00. The upper-level features, surface and low-level temperatures, and the frontogenesis seen in these clusters are mostly consistent with those seen in the classes of SR00. Cluster 3, which looks like class T from SR00, here looks very much like secondary frontal cyclogenesis, and is associated with highly amplified upper-level flow in the form of a PV streamer and cutoff PV feature (e.g., Wernli and Sprenger 2007). This type of upper-tropospheric feature is also seen in the Bmoist class of Graf et al. (2017), but this occurs predominantly in the lower midlatitudes in contrast to cluster 3. The other clusters are more associated with cyclonic wave breaking at upper levels, which has previously been shown to be associated with a large number of intense extratropical cyclones in the North Atlantic region (Gómara et al. 2014).
The inclusion of the precipitation and objectively identified fronts gives additional information about the potential impacts of the different classes. Cluster 1, similar to the equatorward entrance class of SR00, produces the highest volume of precipitation. This is associated with the advection of warm moist air from the northeast on the equatorward side of the jet. Cluster 1 also reaches the highest central vorticity and 925-hPa wind speeds (Fig. 6), which is consistent with a number of other studies showing the strong relationship between cyclone intensity and precipitation (Chang and Song 2006; Rudeva and Gulev 2011; Pfahl and Sprenger 2016).
While cluster 1 develops in confluent upper-level flow on the warm equatorward side of the jet entrance, clusters 2 and 4 develop in diffluent upper-level flow on the cold poleward side of the upper-level jet. Cluster 2 has a larger number of identified cold fronts throughout the lifetime of the composite cyclone, with high frequency of cold fronts at genesis, while cluster 4 has higher frequency of warm fronts and associated higher precipitation.
4. Spatial and temporal variability of the clusters
a. Genesis locations and tracks
The second goal of the study is to investigate if the identified clusters have distinct characteristics in their preferred locations and their temporal variability. There is some differentiation in the spatial distribution of the cyclogenesis in the identified clusters. Figure 8 shows the genesis locations of each of the cyclones within the clusters. Cluster 1, whose composite shows cyclogenesis on the equatorward side of the jet, has a high density of cyclogenesis in the northern part of the region, with 45% of the cyclogenesis of this cluster occurring at or northward of 30°S. Cluster 4 also has many cyclogenesis events in the north of the region, explaining the warm temperatures in the northern part of the composite cyclone (Fig. 2). Cluster-2 cyclogenesis occurs mostly at higher latitudes than clusters 1 and 4. Cluster 3 has the highest-latitude cyclogenesis on average and shows a high density of cyclogenesis around New Zealand, similar to the climatology of cutoff lows found by Fuenzalida et al. (2005). This is different to the class T cyclones of SR00 (structurally similar to cluster 3), which predominantly develop off the coast of Australia. In contrast, the cyclogenesis regions of the other clusters are bunched around the east coast of Australia—a region of climatologically high cyclogenesis (Hoskins and Hodges 2005). The seasonal changes in the genesis locations (Figs. 8b–e) mostly reflect differences in the seasonal cyclone numbers (discussed in the next section).
To more clearly show the differences in genesis latitude between the clusters, normal distributions with the mean and standard deviation of the genesis latitudes of each cluster were plotted (Fig. 8a, right). Clusters 1 and 4 have similar genesis latitudes (mean of 32.5° and 33.0°S, respectively), but develop on different sides of the jet. The mean genesis latitude of cluster 3, the most poleward cluster, is 41.2°S. Clusters 2 and 4, while showing some similarities in their composite structure and development, have different latitudinal distributions. The mean latitude of cluster-2 genesis is 36.5°S and of cluster 4 is 33.0°S. These are statistically significantly different at the 95% level (as determined by the Student’s t test).
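As an illustration of the significance test mentioned above, the following sketch computes a two-sample Student's t statistic with pooled variance. It is a minimal example under stated assumptions, not the procedure used to produce Fig. 8: the function name `pooled_t_test` is hypothetical, the pooled (equal-variance) form is assumed, and the p value uses a normal approximation to the t distribution that is only reasonable for large samples.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def pooled_t_test(a, b):
    """Two-sample Student's t test with pooled variance.

    Returns (t, degrees_of_freedom, approx_two_sided_p). The p value is a
    large-sample normal approximation (an assumption of this sketch); use a
    proper t-distribution CDF when the samples are small.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled estimate of the common variance (sample variances, n - 1).
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p

# Invented genesis latitudes (degrees, negative = south) for two groups.
lat_a = [-36.0, -38.5, -35.0, -37.5, -36.5, -34.5, -39.0, -35.5]
lat_b = [-33.0, -31.5, -34.0, -33.5, -32.0, -34.5, -31.0, -33.5]
t, df, p = pooled_t_test(lat_a, lat_b)
```

A negative `t` here means the first group lies farther south on average. The difference would be called significant at the 95% level when `p` is below 0.05.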
The genesis density and track density of the cluster cyclones in each season are shown in Figs. S6–S9 of the supplemental material, and Fig. 9 shows the annual average track density for each cluster. The tracks of the cluster cyclones are quite closely associated with the orientation of the jet (and the steering flow) of the composite cyclones. Cluster 1 shows tracks with a strong southeastward movement in all seasons. Cluster-3 cyclones track quite zonally and the track density highlights the genesis around New Zealand. Clusters 2 and 4 show some variability in their track directions, but the track density also highlights the longitudinal variations between these two clusters. These figures demonstrate that, although the clustering method uses only a single time in the cyclone life cycle (genesis), the tracks of the groups do show differences in their propagation direction.
b. Temporal variability of the clusters
As well as having distinct spatial characteristics and genesis locations, the clusters show differences in their temporal variability. Figure 10a shows the number of cyclones identified in each cluster per season as well as the total number per season. The total number of cyclones varies strongly by season, with an average of 1.9, 4, 5.9, and 3.3 cyclones per season for DJF, MAM, JJA, and SON, respectively. Each cluster displays differences in its seasonal cycle, but all clusters have a minimum during DJF.
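The seasonal averages quoted above are counts per season divided by the number of years. A minimal sketch of that bookkeeping follows. The event layout (cluster number and genesis month) and the function name are hypothetical, and for simplicity the sketch ignores the fact that DJF straddles the calendar-year boundary.

```python
from collections import Counter

SEASON_OF_MONTH = {
    12: "DJF", 1: "DJF", 2: "DJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}

def mean_cyclones_per_season(genesis_events, n_years):
    """Mean number of cyclones per season for each cluster.

    genesis_events: iterable of (cluster_id, genesis_month) pairs
                    (hypothetical layout).
    Returns {(cluster_id, season): mean count per year}.
    """
    counts = Counter(
        (cluster, SEASON_OF_MONTH[month]) for cluster, month in genesis_events
    )
    return {key: n / n_years for key, n in counts.items()}

# Invented events spanning two years.
events = [(1, 1), (1, 7), (2, 7), (2, 6), (2, 12)]
per_season = mean_cyclones_per_season(events, n_years=2)
```

Summing the values over clusters for a given season gives the total per season, the quantity shown alongside the per-cluster counts in Fig. 10a.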
Clusters 2 and 4 show the strongest seasonal variability, with the smallest number of cyclones in the summer season (DJF) and the largest in the winter season (JJA) (Fig. 10a). Cluster 2 has a larger number during MAM than cluster 4. Clusters 2 and 4 both have their cyclogenesis on the poleward side of the jet streak, but fairly close to it, so their variability will be similarly linked to the seasonal shifting of the subtropical jet. The subtropical jet is at its strongest and most zonal during JJA (see, e.g., Risbey et al. 2009).
Cluster 1, the equatorward entrance cluster, has a weak seasonal cycle, with 1.3, 1.2, and 1.2 cyclones per season occurring during MAM, JJA, and SON, respectively. During the winter, the cyclogenesis region of this cluster is quite well constrained to the north of 35°S, but during the other seasons, the genesis locations are quite spread throughout the region. This may be related to the weaker northern component of the split jet system during the other seasons.
Cluster 3, which forms in a deep trough, also has a fairly weak seasonal cycle, with the highest number per season (just over 1) occurring during MAM, and the lowest (0.6) occurring during DJF. For this cluster, the strength and orientation of the subtropical jet is not as important as the distance from the main jet streak. This cluster has its cyclogenesis in highly amplified upper-level flow and along a front associated with a more poleward extratropical cyclone. The seasonal cycle of this cluster is likely associated with the seasonal changes in the jet (which is weakest in DJF), as well as the seasonal cycle of the frequency of cold fronts seen in this region (which is higher in MAM and SON than JJA; Catto et al. 2012).
With the long record from ERA-Interim of 32 years (compared to the 5 years used in SR00), it is possible to investigate the impact of large-scale climate drivers (ENSO and SAM, which influence the subtropical jet) on the occurrence of the different identified clusters. Figure 10b shows the number of cyclones per season identified during La Niña and El Niño seasons. Cluster 1 shows some differences during MAM, JJA, and SON, with more cyclones in this cluster identified during La Niña events than El Niño events. During DJF and MAM, cluster-3 cyclones occur more frequently during El Niño conditions than La Niña. Clusters 2 and 4 show opposing signals in most seasons, with the largest differences occurring during JJA. There are more cluster-2 cyclones during La Niña than El Niño, and more cluster-4 cyclones during El Niño than La Niña. None of the differences are statistically significant at the 95% level, despite some of the differences being quite large, because of the small sample size.
A similar comparison between negative and positive SAM seasons is shown in Fig. 10. Overall there are not many differences, and considering all seasons together, the total numbers of cyclones identified are very similar, with an average of about 3.5 cyclones per season during negative SAM and 3.8 cyclones per season for positive SAM. During DJF there are slightly more cluster-1 cyclones during negative SAM (the only result that is statistically significant). During JJA and SON, cluster 3 shows fewer cyclones in negative SAM and more in positive SAM. This may be related to the poleward contraction of the storm tracks and more pronounced split jet that can be seen during positive SAM. There is more blocking over New Zealand and associated higher-amplitude upper-level wave activity, which would influence cluster 3. Risbey et al. (2009) showed an increase in cutoff lows in the Australian region during positive SAM and overall wetter Australia during the wet season, but Rudeva and Simmonds (2015) showed very little difference in cyclone numbers in this region between positive and negative SAM during JJA.
As well as large seasonal variability, there is also large interannual variability in the total number of cyclones identified and the number in each cluster. Figure 11 shows the total numbers of cyclones identified each year, as well as the numbers identified in each cluster. The maximum total of 23 cyclones occurs in 1986, and the minimum of 7 in 1995. There are no apparent trends over this period. The selection of only the strong developing cyclones means that the sample size each year is fairly small. These results may be different for a larger region, or when considering all cyclones, although Pepler et al. (2017) found no significant trends in Australian east coast lows since 1911 using the Twentieth Century Reanalysis product.
5. Summary and discussion
Here an automated method using K-means clustering to classify extratropical cyclones from 1979 to 2010 in the Australia and New Zealand region has been presented. Fields of upper-level (250 hPa) wind speed and zonal anomalies of θPV2 from a radial region of 20° were used as the fields to cluster on. The results of this classification have been compared against the manual classification of 5 years of extratropical cyclones by SR00. Some of the features of the four clusters are as follows.
Cluster 1 has its genesis in the equatorward entrance region of the upper-level jet streak in confluent flow. It is the cluster with the highest θe,850 and the highest average rainfall, associated with advection of warm moist air from the tropics, and a high frequency of warm fronts.
Cluster 2 occurs in the poleward exit region of the upper-level jet streak associated with a jet streak that is downstream of the upper-level trough. It has quite low rainfall totals and more cold fronts than warm fronts. This cluster has a large seasonal cycle with a maximum in winter (JJA) and minimum in summer (DJF).
Cluster 3 occurs beneath a deep upper-level trough, far poleward of the main jet streak, and clearly resembles secondary frontal cyclogenesis. This cluster has strong cold advection at the time of genesis and the highest number of identified cold fronts. It exhibits rapid development, but has the lowest rainfall totals.
Cluster 4 has its genesis in the poleward exit region of the upper-level jet streak, similar to cluster 2, but with the jet streak upstream of the upper-level trough. This cluster occurs farther north than cluster 2 but has very similar seasonal variability. It has the second-highest rainfall totals and shows higher frequency of warm fronts.
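The clustering step summarized above can be sketched in a few lines. This is an illustrative implementation of plain K-means (Lloyd's algorithm), not the code used in the study. The helper `cyclone_features`, the simple concatenation of the two fields into one feature vector, and the absence of any field scaling are assumptions. The toy example uses two clusters for brevity, whereas the study uses four.

```python
import random

def cyclone_features(wind_speed, theta_anom):
    """Concatenate two flattened cyclone-centered fields into one vector.

    Assumption of this sketch: both fields share a grid and need no scaling.
    """
    return list(wind_speed) + list(theta_anom)

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm on equal-length feature vectors."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:  # assignments unchanged: converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Invented two-point "fields" for four cyclones at genesis.
features = [
    cyclone_features([30.0, 32.0], [-4.0, -5.0]),
    cyclone_features([31.0, 33.0], [-4.5, -5.5]),
    cyclone_features([60.0, 62.0], [6.0, 7.0]),
    cyclone_features([61.0, 63.0], [6.5, 7.5]),
]
labels, centroids = kmeans(features, k=2)
```

The centroids play the role of the cluster-mean upper-level fields, and the labels assign each cyclone to a cluster. K-means can converge to different solutions from different starting centroids, so in practice several initializations are usually compared.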
The clusters identified here closely resemble the classes from SR00 in terms of their structure at the time of genesis, and also how they develop through their life cycles. The upshot of this result is twofold. First, we see that the clustering method is clearly good enough to identify the types of cyclones that an experienced synoptician would identify. Second, it provides further evidence of the success of the manual classification technique of SR00. As well as showing that a simple clustering can reproduce classes of cyclones similar to a manual classification, the results presented here add extra credence to such manual classifications.
A next step, having established the utility of the method, will be to expand the region of study to the hemispheric or global scale. A brief investigation into the robustness of the clusters found in this study when considering a larger region has been performed (using a region expanded by 10° in latitude and longitude to the south and east). The cluster centroids were similar, but not exactly the same (not shown). When considering larger regions, it may be that more clusters are required [e.g., Graf et al. (2017) found five distinct classes over the Northern Hemisphere]. Since there are no global manual classifications against which to compare, a more objective method of selecting the number of clusters would likely need to be employed. It has been suggested that cyclones with different characteristics occur preferentially in different regions (e.g., Dacre and Gray 2013), and such classification may help to further investigate this suggestion. The resemblance of cluster 3 to secondary cyclogenesis could be used to more systematically analyze the occurrence of this type of cyclogenesis globally. The relative importance of upper-level versus lower-level forcing, as well as diabatic processes at upper and lower levels, could be investigated in order to link these clusters to the threefold classification scheme of Deveson et al. (2002) and Plant et al. (2003), the objective cyclogenesis classification of Graf et al. (2017), and the recent work of Binder et al. (2016) on the role of warm conveyor belts in the intensification of cyclones. While these avenues of research are beyond the scope of the present study, they would add greatly to knowledge on the processes within extratropical cyclones and how they vary spatially and temporally over the globe.
There are clearly some differences in the precipitation associated with the different classes. The composites and the life cycles indicate that cluster 1 has the highest rainfall and cluster 3 has the lowest. A statistical analysis of the maximum precipitation along the cyclone tracks in each cluster reveals that there are statistically significant differences between most of the clusters (only clusters 1 and 4 are indistinguishable in this measure). These results suggest that the latitude of the cyclones plays an important role in determining the amount of precipitation produced since cluster 1 occurs at the lowest latitudes. Figure S10 in the supplemental material shows the relationship between the maximum precipitation and latitude. The latitude at which the maximum precipitation occurs is a stronger influence than the genesis latitude of the cyclone, but the correlations are still only between 0.44 and 0.53. This indicates that there are other factors responsible for determining the precipitation of the cyclones, namely the dynamics of the features themselves. Recently, Pfahl and Sprenger (2016) showed that stronger cyclones are associated with higher rainfall, with the strongest relationship found for precipitation occurring before the time of maximum cyclone intensity. The cyclones with higher rainfall may, therefore, be more dependent on latent heating for their intensification. Such latent heating and other diabatic processes would need to be diagnosed using multiple datasets (such as remotely sensed cloud data) since these processes in ERA-Interim show large biases (Hawcroft et al. 2017). The clustering developed here may help to investigate these aspects in future research. We will use cyclone classification to better understand both the impact of latent heating on the development of different types of extratropical cyclones and also the impacts they have in terms of the precipitation they produce.
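To make the correlation argument above concrete, the sketch below computes a Pearson correlation coefficient and the corresponding fraction of variance explained by a linear fit (the square of the coefficient). The function name `pearson_r` is hypothetical. The last two lines only square the correlation range quoted in the text, on the assumption that those values are Pearson coefficients.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Fraction of variance explained for the correlation range quoted in the text.
explained_low = 0.44 ** 2
explained_high = 0.53 ** 2
```

Squaring 0.44 and 0.53 gives roughly 0.19 and 0.28, so under a linear relationship latitude would account for only about a fifth to a quarter of the variance in maximum precipitation. This is consistent with the statement that other factors are also important.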
A particular motivation for developing a simple automated method of extratropical cyclone classification is the need to evaluate state-of-the-art climate models for their ability to represent the full spectrum of extratropical cyclone characteristics (Catto 2016). Climate models need to be able to replicate the clusters observed in nature. Since cyclones involve complex nonlinear interactions on a variety of space and time scales, their climatologies and classification provide an exacting means of assessing the realism of climate model results. These systems are responsible for bringing a large proportion of the total and extreme rainfall to many regions of the midlatitudes (Pfahl and Wernli 2012; Catto et al. 2012; Catto and Pfahl 2013; Dowdy and Catto 2017), and so we need to have confidence in projections of their future changes. The method that has been presented here is simple enough to apply to climate model output and will help to understand model shortcomings and how extratropical cyclones and their associated impacts may change in the future.
This work was funded by the Australian Research Council (ARC) through a Discovery Early Career Research Award (DE140101305) and supported by the Centre of Excellence for Climate Systems Science (CE110001028). Thanks to Julian Quinting and Duncan Ackerley for comments on an earlier version of the manuscript. Thanks also to Matt Hawcroft (University of Exeter) for use of his precipitation data. ERA-Interim data are available online (http://apps.ecmwf.int/datasets/). The author acknowledges with thanks the valuable comments and suggestions from a number of anonymous reviewers.
Sensitivity to Precipitation Data
To investigate the impact of using the 12–24-h forecast lead time precipitation (as per Hawcroft et al. 2016, 2017), the composites (Fig. A1) and life cycles (Fig. A2) of precipitation from the clusters have been produced. The precipitation from Hawcroft et al. (2017) represents the 6-hourly accumulation before the analysis time, whereas from Catto and Pfahl (2013) it represents the 6-hourly accumulation following the analysis time. The effect of this offset can be seen when comparing the composites of precipitation, where the original precipitation is slightly further ahead of the cyclone, and the Hawcroft et al. precipitation is closer to the cyclone center.
At the times of genesis and maximum growth, the composites of precipitation show lower precipitation volumes around the cyclones with more spread out features, whereas at the time of maximum intensity, the precipitation volumes are higher. Early in the life cycles of the cyclones, the forecasts may show more uncertainty in the exact location of the cyclone and associated fronts. This would produce a smearing effect on the precipitation composites. At the time of maximum intensity, the forecast errors at the longer lead times would be smaller and so the features would likely line up more coherently, producing higher-volume composites consistent with Hawcroft et al. (2016). The life cycles show higher precipitation in the 5° radius region surrounding the composite cyclones, consistent with the more centrally located precipitation when using the Hawcroft et al. precipitation data.
Supplemental information related to this paper is available at the Journals Online website: https://doi.org/10.1175/JCLI-D-17-0746.s1.