Reliable projections of future changes in tropical cyclone (TC) characteristics are highly dependent on the ability of global climate models (GCMs) to simulate the observed characteristics of TCs (i.e., their frequency, genesis locations, movement, and intensity). Here, we investigate the performance of a suite of GCMs from the U.S. CLIVAR Working Group on Hurricanes in simulating observed climatological features of TCs in the Southern Hemisphere. A subset of these GCMs is also explored under three idealized warming scenarios. Two types of simulated TC tracks are evaluated on the basis of a commonly applied cluster analysis: 1) explicitly simulated tracks, and 2) downscaled tracks, derived from a statistical–dynamical technique that depends on the models’ large-scale environmental fields. Climatological TC properties such as genesis locations, annual frequency, lifetime maximum intensity (LMI), and seasonality are evaluated for both track types. Future changes to annual frequency, LMI, and the latitude of LMI are evaluated using the downscaled tracks where large sample sizes allow for statistically robust results. An ensemble approach is used to assess future changes of explicit tracks owing to their small number of realizations. We show that the downscaled tracks generally outperform the explicit tracks in relation to many of the climatological features of Southern Hemisphere TCs, despite a few notable biases. Future changes to the frequency and intensity of TCs in the downscaled simulations are found to be highly dependent on the warming scenario and model, with the most robust result being an increase in the LMI under a uniform 2°C surface warming.
There has been a concerted effort over the last decade or so to understand the response of tropical cyclones (TCs) to a warming climate (e.g., Knutson et al. 2010; Walsh et al. 2016). TCs are a particularly hazardous natural phenomenon because of their extreme winds, flooding from torrential rainfall, and coastal inundation from storm surges. The Southern Hemisphere is home to about 30% of the roughly 80 TCs that form around the globe each year, with genesis locations and tracks extending over a semi-continuous zone from the east coast of Africa (~30°E) to the central South Pacific (~120°W). They typically form from November to April (94%), with a climatological peak from January to March (64%) (Gray 1968; Schreck et al. 2014; Ramsay 2017). The islands of the South Pacific are especially vulnerable to the impacts of TCs; a notable example is severe Tropical Cyclone Pam (March 2015), which struck the islands of Vanuatu causing catastrophic damage, the displacement of hundreds of thousands of people, around 15 fatalities, and a recovery budget of 268.4 million (U.S. dollars; UNESCO 2015).
The effects of climate change on TCs in the Southern Hemisphere have received less attention in the scientific literature than those in some other regions of the world (most notably, the North Atlantic), despite the vulnerability of the Southern Hemisphere. The current consensus from global climate model (GCM) projections is that a warming climate will likely result in a small decrease in overall TC frequency but a small increase in lifetime maximum intensity (LMI) and precipitation rate of TCs (e.g., Walsh et al. 2016), although some studies indicate that global TC frequency may actually increase with warming (e.g., Emanuel 2013). Detecting an anthropogenic footprint on such metrics using historical data, however, has been challenging (e.g., Sobel et al. 2016) because of several factors, including large internal variability, the relatively short record of reliable TC intensity data (e.g., Kossin et al. 2013), and questions about the quality of the historical best track datasets.
Meaningful and robust future projections are highly dependent on the ability of GCMs to accurately simulate the observed characteristics of TC tracks (i.e., their frequency, genesis locations, movement, and intensity). Previous studies have explored observed TC track types using cluster analysis in different geographical regions, including the western North Pacific (Camargo et al. 2007b), eastern North Pacific (Camargo et al. 2008), North Atlantic (Kossin et al. 2010), South Pacific (Chand and Walsh 2009), and Southern Hemisphere (Ramsay et al. 2012). We apply the same clustering technique as used in these previous studies to explore differences between observed and model-simulated TC tracks in the Southern Hemisphere (refer to section 2e for details).
The focus of the current study is to assess the ability of climate models to reproduce observed TC characteristics in the Southern Hemisphere, as well as to explore projections of TC frequency and intensity in a subset of these models under specific idealized warming scenarios (section 2a) for which a large population of synthetic TC tracks is available. The approach here is similar to recent studies of TC track characteristics of the same group of climate model simulations for the North Atlantic (Daloz et al. 2015) and western North Pacific (Nakamura et al. 2017) regions. The remainder of this study is structured as follows. Section 2 provides details of the model experiments, the observed and modeled TC tracks, and the cluster technique. Results for the explicit and downscaled TC tracks in the context of the control climate scenario are presented in sections 3 and 4, respectively. Section 5 examines future changes in TC frequency and intensity under specific idealized warming scenarios, and the main results are summarized in section 6.
2. Data and methodology
a. Hurricane working group models
We investigate climate model simulations from a suite of idealized experiments, which were performed as part of the U.S. CLIVAR Working Group on Hurricanes [hereafter, Hurricane Working Group (HWG); Walsh et al. 2015]. The HWG consists of 13 modeling centers and 16 different models [see Table 1 in Walsh et al. (2015) for the full list], but here we examine only a subset of these models (see Table 1), based partly on data availability but also to allow for model-based comparison with previous similar studies (e.g., Daloz et al. 2015; Nakamura et al. 2017). Two types of TC tracks are explored: explicit tracks (denoted by “-E” after the model name) and downscaled tracks (denoted by “-D” after the model name). The explicit tracks are obtained by tracking TC-like storms in the models’ output, while the downscaled tracks are obtained with Emanuel’s downscaling technique (Emanuel 2006; Emanuel et al. 2006, 2008). For the explicit track simulations, six models are evaluated: HiRAM (hereafter HIRAM), CMCC-ECHAM5 (hereafter CMCC), GISS, CAM5.1 (hereafter CAM5), FSU COAPS (hereafter FSU), and GFS (Table 1), with the first four of these models also used for evaluating the downscaled tracks. Following the approach of Held and Zhao (2011), these simulations comprise four climate scenarios: a control twentieth-century climate and three idealized warming climate scenarios. The control climate (20C) was forced with climatological monthly-average sea surface temperature (SST) for periods ranging between 5 and 20 years, depending on model (Table 1). The three highly idealized warming climate scenarios correspond to a doubling of carbon dioxide concentration in the atmosphere (2CO2), a uniform 2-K increase of SST (p2K), and the combination of these two scenarios (p2K2CO2).
b. Explicit tropical cyclone tracks
For the purpose of this study, TCs explicitly simulated by the GCMs are evaluated mostly for the twentieth-century control climate simulation to assess the general characteristics of the modeled tracks and how these characteristics compare with the observed climatology. An ensemble-mean approach is used to assess changes to TC frequency, as the model-generated explicit tracks for the Southern Hemisphere were too few in some HWG experiments to deduce meaningful and robust conclusions on future projections. The explicit tracks are generated using different tracking schemes, depending on model, as listed in Table 1. These tracking schemes share several common criteria for TC detection (see Daloz et al. 2015, appendix A therein), but previous work has shown that explicit TC statistics are still sensitive to the differences among them, especially for the weaker storms and low-resolution models (Horn et al. 2014). Daloz et al. (2015) found that although the annual number of TCs is sensitive to the tracking routine, the results of the cluster analysis were not sensitive to the tracking routine used.
c. Downscaled tropical cyclone tracks
The downscaled TC tracks have essentially the same key statistical properties as explicit tracks but with a much larger number of realizations. Therefore, they are explored for both the current climate and future climate projections. The downscaled tracks were generated using a statistical–dynamical technique developed by K. Emanuel (Emanuel 2006; Emanuel et al. 2006), which has subsequently been used to explore TC behavior in GCMs (e.g., Emanuel et al. 2008; Emanuel 2013; Walsh et al. 2015) as well as regional TC activity from the HWG experiments, including the North Atlantic basin (Daloz et al. 2015) and the western North Pacific basin (Nakamura et al. 2017). A brief explanation of the technique is given below [the reader is referred to Emanuel et al. (2006) and Emanuel et al. (2008) for further details on the technique].
The starting points, or “seeds,” for each synthetic track are weak warm-cored vortices with peak winds of 12 m s−1, randomly distributed everywhere poleward of 2° latitude at all times regardless of season, SST, and other climatological factors (Emanuel et al. 2008). Once generated, the synthetic storms are steered by a weighted average of the flow at the 850- and 200-hPa levels, in addition to a beta drift term (Holland 1983). The resultant storm motion is similar to the “deep” version of the beta-advection model described in Marks (1992). The survival rate of the initial seeds is determined by the storm intensity, which is computed at each track point by the Coupled Hurricane Intensity Prediction System (CHIPS; Emanuel et al. 2004), a relatively simple axisymmetric tropical cyclone model coupled to a slab ocean model. By design, the movement and intensity of each synthetic storm are determined entirely by the large-scale environment of the host model. The initial seeding rate is independent of the ambient wind field; however, any storm embedded in strong vertical wind shear will quickly dissipate. The synthetic track technique has been shown to produce a reasonable climatology of Southern Hemisphere TCs when forced with historical reanalysis data (Emanuel et al. 2008), despite there being too many genesis events in the far western Indian Ocean, north of Madagascar.
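The steering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of Emanuel et al. (2006): the function name `steer`, the weight `w850 = 0.8`, and the beta-drift vector are hypothetical values chosen only to show how a weighted-average flow plus a drift term moves a storm.

```python
import math

EARTH_RADIUS_M = 6.371e6  # mean Earth radius in meters


def steer(lat, lon, u850, v850, u200, v200, dt_s,
          w850=0.8, beta_drift=(0.0, -2.5)):
    """Advance a storm position by one time step (beta-advection sketch).

    Winds are in m/s (u eastward, v northward); lat/lon are in degrees.
    w850 and beta_drift are illustrative assumptions. The negative
    meridional drift points poleward in the Southern Hemisphere.
    """
    # Weighted average of the lower- and upper-level flow, plus beta drift.
    u = w850 * u850 + (1.0 - w850) * u200 + beta_drift[0]
    v = w850 * v850 + (1.0 - w850) * v200 + beta_drift[1]
    # Convert the displacement in meters to degrees of latitude/longitude.
    dlat = math.degrees(v * dt_s / EARTH_RADIUS_M)
    dlon = math.degrees(
        u * dt_s / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
    return lat + dlat, lon + dlon
```

Calling `steer(-15.0, 160.0, -5.0, 0.0, 5.0, 0.0, 6 * 3600)` moves the storm westward and poleward, because the low-level easterlies dominate the weighted flow and the assumed drift is directed poleward.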
d. Observed tropical cyclone tracks
The observed tropical cyclone tracks are the same as those analyzed in Ramsay et al. (2012). This dataset comprises 1329 systems from July 1969 to June 2008 using IBTrACS-v02r01. The tracks and genesis locations of the seven clusters identified in Ramsay et al. (2012) are used here as a baseline to evaluate model behavior, and are illustrated in Fig. 1. For a more accurate quantitative comparison between observed and modeled tracks, a shorter data period is used starting from the 1984/85 season, owing to incomplete wind data in the Southern Hemisphere best tracks prior to the mid-1980s (Schreck et al. 2014; Ramsay 2017). This subset is used to derive the observed statistics (OBS) shown in Tables 2–4.
e. Tropical cyclone track clustering method
The cluster technique applied here is a mixture of polynomial regression models developed by Gaffney (2004). The technique is described in detail, as applied to extratropical cyclones, in Gaffney et al. (2007). The cluster algorithm is used to fit the geographical shape of trajectories (tropical cyclone tracks here) by finding the parameters that maximize the likelihood of the data. The only input of the cluster algorithm is the position of the individual trajectories, defined here as the latitude and longitude at each snapshot of the tropical cyclone track. This procedure is repeated 100 times, randomizing the order in which the tracks are included in the analysis. The final cluster assignment is chosen as the best fit among them. This cluster analysis is able to handle trajectories of different lengths, which is an important feature for tropical cyclone tracks.
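To make the idea of likelihood-based assignment concrete, the sketch below scores a track against polynomial regression curves and assigns it to the best-scoring cluster. It is a simplified illustration, not Gaffney's algorithm: the mixing weights and the EM estimation of the parameters are omitted, the noise is assumed isotropic Gaussian, and the names `poly`, `track_loglik`, and `assign`, as well as the cluster dictionary layout, are hypothetical.

```python
import math


def poly(coeffs, t):
    """Evaluate a polynomial with coefficients [c0, c1, c2, ...] at t."""
    return sum(c * t ** k for k, c in enumerate(coeffs))


def track_loglik(track, cluster):
    """Gaussian log-likelihood of one track under one cluster's curves.

    track:   list of (lat, lon) positions, one per snapshot.
    cluster: dict with polynomial coefficients 'lat' and 'lon' (functions
             of the snapshot index) and an isotropic noise scale 'sigma'
             in degrees. All of these are illustrative assumptions.
    """
    s2 = cluster["sigma"] ** 2
    ll = 0.0
    for t, (lat, lon) in enumerate(track):
        r2 = ((lat - poly(cluster["lat"], t)) ** 2
              + (lon - poly(cluster["lon"], t)) ** 2)
        # Log density of a 2D isotropic Gaussian at this snapshot.
        ll += -math.log(2.0 * math.pi * s2) - r2 / (2.0 * s2)
    return ll


def assign(track, clusters):
    """Index of the cluster with the highest log-likelihood for the track."""
    return max(range(len(clusters)),
               key=lambda k: track_loglik(track, clusters[k]))
```

Because the likelihood is a sum over the snapshots of a single track, tracks of different lengths can be scored against the same set of cluster curves without resampling them to a common length.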
This cluster technique has been applied extensively to observed tropical cyclones in various basins, including the North Atlantic (Kossin et al. 2010; Kozar et al. 2012; Boudreault et al. 2017), the eastern (Camargo et al. 2008; Caron et al. 2015) and western North Pacific (Camargo et al. 2007a,b), the Fiji region (Chand and Walsh 2009), the north Indian Ocean (Paliwal and Patwardhan 2013), and the Southern Hemisphere (Ramsay et al. 2012), as well as to typhoon post-landfall tracks over China (Zhang et al. 2013). Other recent applications of this cluster analysis include a comparison of tracks of TCs in a reanalysis dataset to observations (Bell et al. 2018), analysis of the skill of tropical cyclone forecasts (Don et al. 2016; Kowaleski and Evans 2016), and the development of statistical–dynamical seasonal forecasts (Zhang et al. 2016). The application we focus on here is to examine the ability of climate models to simulate observed track types and whether and how these change under anthropogenic climate change. This approach was used in various multimodel studies in the North Atlantic (Camargo 2013; Daloz et al. 2015) and the eastern (Camargo 2013) and western North Pacific (Nakamura et al. 2017). Here we examine the characteristics of Southern Hemisphere tropical cyclone model tracks in the HWG models in present and future climates.
In Ramsay et al. (2012) we applied the cluster analysis to Southern Hemisphere TC tracks for the period July 1969–June 2008, obtaining seven clusters that are separated primarily by longitude. These seven clusters are located in the south Indian Ocean and South Pacific (Fig. 1). To compare the models with these observed tracks, we chose eight clusters for the models: seven in the south Indian Ocean and South Pacific, which are compared with observations and used for projections, and an additional cluster with tracks in the South Atlantic. The ordering of clusters in the models was based primarily on the longitude of tracks, as they tend to be well separated by longitude, as in observations (Ramsay et al. 2012). For the two westernmost clusters in the downscaled models, which are not well separated by longitude, the seasonality was also used to help match with observations.
After the occurrence of Hurricanes Catarina in March 2004 (McTaggart-Cowan et al. 2006; Pezza and Simmonds 2005; Pezza et al. 2009; Veiga et al. 2008; Vianna et al. 2010; Pereira Filho et al. 2010) and Anita in March 2010 (Dias Pinto et al. 2013; Dutra et al. 2017), studies of South Atlantic subtropical systems that undergo tropical transition received more attention (Evans and Braun 2012; Gozzo et al. 2014, 2017). These studies demonstrate that even though South Atlantic TCs are rare (and typically subtropical), they do exist; however, they were not included in Ramsay et al. (2012).
f. Statistical significance
Statistical significance of future changes of TC activity for the downscaled simulations is assessed by bootstrapping the difference in the mean of a particular variable (e.g., the mean LMI in the p2K scenario minus the mean LMI from the control run), using 1000 replications. Bootstrap confidence intervals are calculated using the bias-corrected and accelerated (BCa) technique (Efron and Tibshirani 1993), with the 95% confidence interval (CI) set as the bar for statistical significance (i.e., the 95% CI of the bootstrapped differences must not include zero, assuming a null hypothesis of no change in the mean of a particular variable between current and future climate).
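The test described above can be sketched with the standard library alone. This is an illustrative sketch, not the code used in the study: the function name `bca_diff_means` is hypothetical, each sample is assumed to be resampled independently, and the acceleration is assumed to be estimated from a jackknife that leaves out one storm at a time from either sample.

```python
import random
from statistics import NormalDist, mean


def bca_diff_means(future, control, n_boot=1000, alpha=0.05, seed=0):
    """BCa bootstrap CI for mean(future) - mean(control).

    Returns (difference, lower, upper). The change is deemed significant
    at the (1 - alpha) level when the interval excludes zero.
    Both samples need at least two values.
    """
    rng = random.Random(seed)
    nd = NormalDist()
    theta_hat = mean(future) - mean(control)

    # Bootstrap replicates: resample each sample with replacement.
    reps = sorted(
        mean(rng.choices(future, k=len(future)))
        - mean(rng.choices(control, k=len(control)))
        for _ in range(n_boot)
    )

    # Bias correction z0 from the share of replicates below the estimate
    # (clamped so that the inverse normal CDF stays finite).
    frac = sum(r < theta_hat for r in reps) / n_boot
    frac = min(max(frac, 1.0 / (n_boot + 1)), n_boot / (n_boot + 1.0))
    z0 = nd.inv_cdf(frac)

    # Acceleration from leave-one-out (jackknife) estimates.
    sf, sc, nf, nc = sum(future), sum(control), len(future), len(control)
    jack = [(sf - x) / (nf - 1) - sc / nc for x in future]
    jack += [sf / nf - (sc - y) / (nc - 1) for y in control]
    jbar = mean(jack)
    den = 6.0 * sum((jbar - j) ** 2 for j in jack) ** 1.5
    a = sum((jbar - j) ** 3 for j in jack) / den if den > 0 else 0.0

    def adjusted_index(q):
        # Map a nominal quantile to its BCa-adjusted position in reps.
        z = nd.inv_cdf(q)
        p = nd.cdf(z0 + (z0 + z) / (1.0 - a * (z0 + z)))
        return min(n_boot - 1, max(0, int(p * n_boot)))

    lower = reps[adjusted_index(alpha / 2.0)]
    upper = reps[adjusted_index(1.0 - alpha / 2.0)]
    return theta_hat, lower, upper
```

With `future` holding the LMI values of one warming scenario and `control` those of the control run, a returned interval that lies entirely above (or below) zero corresponds to a change that passes the 95% bar.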
3. Cluster characteristics from explicit simulations
a. Genesis locations and tracks
We start by evaluating the ability of the HWG models to explicitly simulate general characteristics of observed TC activity in the Southern Hemisphere basins (Figs. 1 and 2). Here, general characteristics are evaluated, including genesis locations and annual frequency, as well as the mean TC duration and lifetime maximum intensity, for each of the seven clusters (Table 2). As discussed above, the eighth cluster is included to objectively identify and isolate model-generated TCs that do not correspond to the observations, such as those in the far southeastern Pacific and poleward of around 30°S, as well as the rare South Atlantic TCs, which the models generate much more frequently than in observations. Therefore, TCs in this eighth cluster are usually considered to be a result of model biases and deficiencies (e.g., Tory et al. 2013; Chand et al. 2017).
Comparison of the cluster analysis applied to the Southern Hemisphere tropical cyclones in observations with those explicitly simulated by the six HWG models (HIRAM-E, CMCC-E, GISS-E, CAM5-E, FSU-E, and GFS-E) under the control climate shows varying results. Overall, these climate models are able to replicate the spatial distribution of genesis locations reasonably well over most of the Southern Hemisphere (Fig. 2). However, some notable exceptions are in the southeastern Pacific and South Atlantic basins, and to a lesser extent in the region poleward of about 30°S, where models generate TCs that otherwise are rare or not present in the observations (as indicated by the brown clusters in Figs. 2b–f). These subtropical/extratropical model TCs could potentially be attributed to biases of the tracking schemes, though, as discussed in Horn et al. (2014).
There is also a certain degree of disagreement in terms of cluster separation for the genesis locations and resulting tracks between models [see also Figs. 3 and 4 for explicit tracks in a “good” (HIRAM-E) and a “bad” (GFS-E) performing model]. For example, the fifth cluster (orange cluster in Figs. 2 and 4) shows large variations between models, particularly for HIRAM-E, GISS-E, and CAM5-E, where this cluster is reproduced farther eastward in the southwestern Pacific basin instead of over the west Australian region as per the observations. Unfortunately, the GFS-E model presents substantial disagreement when compared to the observations both in the genesis locations (Fig. 2g) and the resulting tracks (Fig. 4).
b. Annual frequency
The annual average number of tropical cyclones forming in the Southern Hemisphere is 26.9 (Table 2; Schreck et al. 2014; Ramsay 2017), with between-cluster variations ranging from 2.7 for cluster 5 to 4.8 for cluster 7 (Table 2). While some models reproduce this annual climatology reasonably well (e.g., CAM5-E, CMCC-E, and HIRAM-E), others show substantial underestimation across all clusters in the Southern Hemisphere, with GFS-E having the lowest number of Southern Hemisphere TCs (~5.2 cyclones per year).
Given that each model suffers from its own biases and deficiencies, large discrepancies in annual tropical cyclone frequency between models are not unexpected (see, e.g., Camargo 2013). To assess the relative performance of models irrespective of model biases and deficiencies, we normalize each cluster member with the total global number of TCs for that model. We also compute the six-member multimodel mean to provide a comparative measure of between-cluster differences for observations and for all models combined. Overall, we see some encouraging results in the ability of the individual models to replicate the between-cluster differences across the Southern Hemisphere. For example, the observed and multimodel mean proportions of annual TCs in cluster 1 are 15% and 14.1%, respectively. Comparatively similar results are present for other clusters as well (Table 2).
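The normalization and multimodel averaging described above amount to the following arithmetic. This is a minimal sketch with hypothetical function names; the default of normalizing by the sum of the supplied cluster counts is an assumption, and a different model total can be passed explicitly through `total`.

```python
def cluster_proportions(counts, total=None):
    """Percentage of a model's TCs that falls in each cluster.

    counts: annual TC counts per cluster for one model.
    total:  normalizing total for that model. Defaults to the sum of
            the supplied counts (an assumption made for this sketch).
    """
    if total is None:
        total = sum(counts)
    return [100.0 * c / total for c in counts]


def multimodel_mean(per_model_proportions):
    """Average the per-cluster percentages across models."""
    n_models = len(per_model_proportions)
    return [sum(column) / n_models
            for column in zip(*per_model_proportions)]
```

Working with percentages rather than raw counts lets a model with too few (or too many) storms overall still be compared with observations in terms of how its storms are distributed among the clusters.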
Although the focus here is mostly on climatological aspects of the explicit tracks, we take advantage of the multimodel mean approach to assess future changes of TC frequency under the three idealized climate change scenarios. The projections show that a 2-K surface warming with CO2 fixed (p2K) results in about the same storm frequency reduction (~−16%) as when the SST is held fixed but the CO2 doubled (2CO2) (Table 3). This roughly equal percent reduction differs somewhat from Held and Zhao (2011), who found that storm frequency changes in the Southern Hemisphere as simulated by HIRAM were less sensitive to 2CO2 than p2K (this is also evident in Table 3). The combination of p2K and 2CO2 (i.e., p2K2CO2) results in a slightly larger percent reduction of −19.2%.
c. Mean TC duration and LMI
Another measure of TC activity is the mean duration of TCs in each cluster. The observed duration of TCs varies from 4.5 days for cluster 4 to 7.0 days for cluster 3. There are large variations in the mean duration of TCs between models for each cluster. For example, the mean duration of TCs in CAM5-E for cluster 1 is 8.7 days compared to 3.5 days in GFS-E (the corresponding observed value is 5.3 days). Similarly, the mean durations of TCs for these two models in cluster 3 are 8.5 days and 3.6 days, respectively. Overall, the multimodel mean duration of TCs compares well with the respective observed values for all seven clusters in the Southern Hemisphere. For completeness, we also examined the mean LMI of TCs in each cluster for different models. Unsurprisingly, the mean LMI is substantially underestimated in most models compared to the observations, because coarse model resolutions are unable to adequately capture TC intensity in explicit simulations. However, the relatively high-resolution models of HIRAM (50 km) and CAM5 (25 km; Wehner et al. 2015) are able to generate TCs with correspondingly high LMIs (43.2 and 37.8 m s−1, respectively), exceeding even the observed mean LMI for the Southern Hemisphere (33.8 m s−1) (Table 2).
Finally, we examine the extent to which the HWG models’ explicitly simulated tracks can replicate seasonality of TCs in various clusters. Overall, all models have substantial skill in reproducing the seasonal cycle of TCs across most clusters in the Southern Hemisphere (Fig. 5). For example, in the South Pacific basins (clusters 6 and 7), all models do very well in simulating not only the magnitude of proportions but also the seasonal variability of TCs. However, a notable exception is in cluster 2 where most models substantially overestimate the number of TCs, particularly for the early seasons.
All in all, the HWG models perform reasonably well in explicitly simulating certain characteristics of TC activity, such as seasonality; however, much work is needed to improve the model performance in the Southern Hemisphere in terms of reproducing the actual number of TCs and TC intensity. Some models, such as GFS, substantially underestimate TC frequency for the entire Southern Hemisphere, while others show widely varying statistics of both the total hemispheric annual TC frequency and the relative proportion of TCs in each cluster (Table 2). Therefore, alternative techniques such as statistical–dynamical downscaling (see the next section) must additionally be used in order to apply these models more confidently in climate projection studies.
4. Cluster characteristics from downscaled simulations: Control climate
a. Formation rate, genesis, and tracks
The four downscaled models investigated, HIRAM-D, CMCC-D, GISS-D and CAM5-D, differ quite considerably with respect to the annual formation rate of tropical cyclones (Table 4), ranging from 21.2 TCs per year (i.e., ~80% of the annual observed rate of ~27 TCs) in CMCC-D to 36.1 TCs per year in CAM5-D. Note that a 17 m s−1 wind speed threshold is applied here for the sake of counting TCs. In an analysis of these same models for the North Atlantic region (Daloz et al. 2015), HIRAM-D produced the most TCs (65.9 TCs per year), more than twice that of the least active model of CAM5-D (28.7 TCs per year), whereas for the Southern Hemisphere HIRAM-D is somewhere in between the most active and least active and closer to the observed climatology. As mentioned in section 2c, the starting points of the synthetic tracks are random in space and time, regardless of the host model, so any model differences in TC formation rates must be due to differences in the large-scale environment (e.g., deep-layer vertical wind shear, potential intensity, etc.). Hence the CAM5-D model appears to have the most favorable environmental conditions for the initial intensification of TCs (as governed by the CHIPS model) of the four models examined here. In particular, Emanuel et al. (2008) point out that the initial survival rate of the seeds is strongly linked to the ambient environmental wind shear. However, an analysis of the climatological deep-layer vertical shear over the peak SH season [January–March (JFM)] did not reveal any obvious difference between models with disparate annual storm frequencies (e.g., CAM5 compared to CMCC; see Fig. 10 later). This may be partly due to the deterministic nature of the intensity model (which is also coupled to the ocean and nonlinear), such that there is likely not a simple attribution to time-averaged environmental fields as used in empirical approaches to genesis such as genesis potential indices.
The ranking of the annual frequencies by model in the explicit simulations (excluding FSU-E and GFS-E, which have no downscaled counterparts) is not the same as in the downscaled simulations, with HIRAM-E containing the most storms (33.4 TCs per year) and GISS-E containing the fewest storms (18.8 TCs per year). This result points to the sensitivity of model TC climatology to model characteristics, in particular convection parameterizations, as models with very similar large-scale characteristics can have very different TC climatologies (Kim et al. 2012, 2018). Another plausible reason for these discrepancies is that the annual TC frequencies in the explicit simulations may be sensitive to the model’s ability to simulate precursor synoptic disturbances (the seeds for subsequent TC development), but the random seeding used in the downscaled tracks essentially assumes that such precursors are distributed evenly in time and space.
The observed annual frequency of TCs per cluster over the period of most complete data (i.e., from 1984/85 onward) ranges from four to five (16%–18%) over the open ocean to about three near the Australian coast (10%–12%) (Table 4). With the exception of cluster 7 (C7) in CAM5-D, the proportion of total storms in each cluster generally falls within the spread of the observations, ranging from 11% to 18% (i.e., there are no clusters that particularly dominate in terms of TC frequency). That there are so few storms in CAM5-D C7 is most likely because of the far eastward extent of its genesis region (μ = 210.7°E). The genesis potential becomes increasingly less favorable moving from west to east in the South Pacific (e.g., Camargo 2013), and the historical best track data indicate that less than 2% of TCs originate eastward of C7’s genesis centroid in CAM5-D.
A promising result from the downscaled models is that the spatial characteristics of the observed track clusters are generally well represented (Figs. 6 and 7; note that kernel density estimates (KDEs) are shown here rather than the actual genesis points and tracks due to the sheer number of simulated TCs in the downscaled models). For instance, there are four clusters in the south Indian Ocean (with the exception of CAM5-D, which contains an additional cluster), and three clusters in the South Pacific. Clusters tend to be well separated by longitude, as in observations, with the exception of the far western Indian Ocean. Nevertheless, there are some notable differences and biases. First, both HIRAM-D and CMCC-D produce a cluster in the South Atlantic basin with a high number of TCs (see insets in Figs. 6 and 7), where there have been rare cases of TCs, as discussed above (e.g., Pezza and Simmonds 2005). South Atlantic TCs occur also in CAM5-D and GISS-D, but are much less prevalent [note that cluster 8 in the GISS-D model, which contains tracks split over the South Atlantic and South Pacific basins, is not analyzed here because of the very small number of tracks (~13% of other clusters) and spatial incoherence]. Second, GISS-D and CAM5-D fail to produce the correct climatological track direction in the South Pacific, maintaining a southwesterly track direction instead of a southeasterly direction as seen in observations (Figs. 1 and 7). This discrepancy is presumably due to a misrepresentation of the large-scale flow in GISS and CAM5 given that the downscaled storm motions are determined in large part by the horizontal winds at 850 and 200 hPa (Emanuel et al. 2008).
Another very notable bias in the downscaled models is the tendency for TCs to form and track closer to the equator than in observations (Table 4). This bias occurs on average across the Southern Hemisphere, with downscaled storms forming ~3° closer to the equator, but is particularly evident in the south Indian Ocean, with the median genesis (track) locations of C2 and C4 occurring 4.3° and 4.8° (5.3° and 4.2°) closer to the equator than the observed clusters, respectively. The equatorward bias is likely due to the random seeding technique, which allows storms to form anywhere poleward of 2°S regardless of season, with dynamical differences in models playing a secondary role. This same bias also occurs when historical reanalysis data are used to determine the synthetic tracks (Emanuel et al. 2008).
b. Seasonality
The typical seasonality for the Southern Hemisphere is reproduced well by the downscaled tracks (Fig. 8) with a peak of activity in the months JFM, flanked by lower TC formation rates in the late spring to early summer [October–December (OND)] and autumn [April–June (AMJ)]. About 62% of all storms occur during JFM in observations, whereas the models are less peaked in that period, ranging from 43% (CAM5-D) to 56% (CMCC-D). Almost all models produce too many storms in the austral autumn (AMJ), with a +12% bias compared to observations (particularly apparent in the CAM5-D model, which has comparatively high midlevel relative humidity over much of the tropical Indo-Pacific in AMJ, especially compared to HIRAM-D and CMCC-D; not shown). However, the proportion of synthetic storms forming in the early part of the season (OND) differs by only a few percent from observations. Another promising aspect of the downscaled simulations is their ability to capture the relatively flat seasonality of C2 and C3 in the Indian Ocean. That these clusters are less peaked in JFM than other regions of the Southern Hemisphere has been noted in previous work (e.g., Ramsay et al. 2012), with the most likely explanation being the mean position and width of the ITCZ outside of the core TC season (e.g., Berry and Reeder 2014), including the double ITCZ over the Indian Ocean during austral winter and spring (e.g., Waliser and Gautier 1993; Philander et al. 1996).
c. Duration and intensity
The mean observed lifetime of TCs during the period of reliable data (i.e., commencing in 1984/85 with a wind speed threshold of at least 17 m s−1 used to define genesis) is about 5.6 days (Table 4), with longer-lived storms, not surprisingly, occurring in clusters well removed from major landmasses (i.e., C2, C3, and C6). The downscaled TCs have relatively longer lifetimes, ranging from 6.2 days (HIRAM-D) to 7.4 days (GISS-D), although some of this difference may be due to the averaging period used to define wind speed in the best tracks compared to the downscaling technique, as discussed below.
Focusing now on simulated intensity, we find that 25%–30% of Southern Hemisphere storms exceed major hurricane intensity based on the Saffir–Simpson hurricane wind scale (i.e., maximum sustained 1-min surface winds ≥ 96 kt), depending on model. Although the Saffir–Simpson scale is nonstandard for Southern Hemisphere meteorological agencies, we use it here for a more straightforward comparison with related studies (e.g., Daloz et al. 2015). If a crude conversion factor of 0.88 is applied to the observed wind speed data to get an equivalent 1-min average [V1-min = V10-min/0.88; see Harper et al. (2010) for a detailed discussion on converting between different wind-averaging periods], then a similar percentage (~24%) of TCs in the Southern Hemisphere become major hurricanes. However, applying this simple but by no means optimal conversion factor to the best track data only introduces additional ambiguity to an already nonhomogeneous dataset, so we opt to leave the wind speed data unchanged (i.e., typically a 10-min average in the Southern Hemisphere). With this caveat in mind, only 4.3 TCs per year (16%) reach major hurricane intensity in the observations.
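The effect of the wind-averaging period on the major-hurricane fraction can be sketched as follows. The snippet applies the crude 0.88 factor and the 96-kt threshold quoted above to a list of lifetime maximum intensities. The function names and the toy LMI values are hypothetical and serve only to show the arithmetic.

```python
MAJOR_HURRICANE_KT = 96.0  # Saffir-Simpson major-hurricane threshold (1-min winds)
TEN_TO_ONE_MIN = 0.88      # crude factor from the text: V1-min = V10-min / 0.88


def to_one_minute(v10_kt, factor=TEN_TO_ONE_MIN):
    """Convert a 10-min mean wind (kt) to an approximate 1-min equivalent."""
    return v10_kt / factor


def major_fraction(lmi_kt, convert=False):
    """Fraction of storms whose LMI meets the major-hurricane threshold.

    If convert is True, the LMIs are treated as 10-min winds and are
    converted to 1-min equivalents before the threshold is applied.
    """
    if not lmi_kt:
        raise ValueError("need at least one storm")
    winds = [to_one_minute(v) if convert else v for v in lmi_kt]
    return sum(1 for v in winds if v >= MAJOR_HURRICANE_KT) / len(winds)


# Hypothetical 10-min LMIs in knots (illustrative only).
toy_lmi_10min = [45.0, 60.0, 85.0, 90.0, 100.0]
frac_unchanged = major_fraction(toy_lmi_10min)                 # winds left as is
frac_converted = major_fraction(toy_lmi_10min, convert=True)   # 1-min equivalents
```

In this toy sample, storms with 10-min LMIs of 85 and 90 kt cross the threshold only after conversion, which is the same mechanism that raises the observed fraction from 16% to roughly 24% in the text.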
Other intensity metrics, such as the mean LMI and power dissipation index (PDI), are also lower in the observed tracks compared to the downscaled tracks (Table 4). The positive LMI bias in the models is particularly evident in the South Pacific clusters (C5–C7), where LMIs are, on average, 9 m s−1 greater than observed LMIs (even after adjusting for wind-averaging periods, there is still a +4 m s−1 bias). The average PDI per storm is about 2–3 times higher in the downscaled tracks compared to observations. These biases are reduced somewhat if the downscaled winds are converted to 10-min averages (i.e., multiplying the PDIs in the models by 0.88³ ≈ 0.68), but still persist. For instance, in HIRAM-D, PDIs based on 10-min equivalent winds are still 20%–50% higher than in observations. The positive intensity biases in the downscaled tracks are consistent with several other biases noted above, namely equatorward shifts in genesis locations, tracks, and mean latitude of LMI relative to observations (Table 4), as well as storm duration. Both the potential intensity (PI) and deep-layer vertical wind shear become increasingly favorable for intensification at these equatorward latitudes (Figs. 9 and 10), giving physical support for these biases. Previous work has also shown that storm latitude and storm duration are both strong determinants of statistical TC intensity (e.g., Kossin et al. 2007).
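The ~0.68 scaling of PDI can be checked with a short calculation. The sketch below assumes the standard definition of PDI as the time integral of the cube of the maximum sustained wind over a storm's lifetime, approximated here by a simple sum. The 6-hourly sampling interval and the toy wind values are assumptions for illustration and may differ from the sampling used in the study.

```python
SIX_HOURS_S = 6 * 3600  # assumed sampling interval of the track data, in seconds


def pdi(winds_ms, dt_seconds=SIX_HOURS_S):
    """Approximate PDI (m^3 s^-2) as the sum of v^3 * dt along one track."""
    return sum(v ** 3 for v in winds_ms) * dt_seconds


# Hypothetical 1-min winds (m/s) along a single track (illustrative only).
toy_track_1min = [18.0, 25.0, 35.0, 45.0, 40.0, 30.0, 20.0]

pdi_1min = pdi(toy_track_1min)
pdi_10min = pdi([0.88 * v for v in toy_track_1min])

# Because every wind is scaled by 0.88, the ratio is 0.88**3 for any track.
ratio = pdi_10min / pdi_1min
```

Since the factor is independent of the track, a PDI that is 2–3 times the observed value remains well above observations after the conversion, in line with the residual bias reported for HIRAM-D.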
To investigate these model biases further, an analysis was performed in which the evolutions of the median and 90th percentile wind speeds in the observations and HIRAM-D were compared side by side (Fig. 11). This revealed that, although the median storm durations are not too dissimilar, the downscaled TCs have higher wind speeds at both the median and 90th percentile, particularly after a few days from genesis. The LMIs occur at much longer times from genesis in HIRAM-D compared to in observations, and this is a major contributing factor to the positive PDI bias. A similar breakdown of the evolution of the median and 90th percentile winds was performed for each of the downscaled models (Fig. 12) and revealed similar overall behavior to HIRAM-D, with any notable differences in intensity being due to differences in the geographic location and shape of clusters (e.g., CAM5-D’s unique location of C2 and C3 in the Indian Ocean and differences in the shape of C6 in the Indian Ocean between HIRAM-D/CMCC-D and GISS-D/CAM5-D).
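A composite of this kind can be sketched as follows: at each time step after genesis, the median and 90th percentile wind speeds are computed across all storms that are still active. The helper names, the linear-interpolation percentile, the `min_storms` cutoff, and the toy tracks are all assumptions for illustration; the study's exact procedure is not specified here.

```python
import math


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1.0 - frac) + ordered[hi] * frac


def wind_evolution(tracks, min_storms=1):
    """Median and 90th percentile wind at each time step since genesis.

    tracks: list of wind-speed lists, each sampled at the same fixed
    interval starting at genesis. Steps with fewer than min_storms
    active storms end the composite.
    """
    longest = max(len(t) for t in tracks)
    result = []
    for step in range(longest):
        active = [t[step] for t in tracks if len(t) > step]
        if len(active) < min_storms:
            break
        result.append((step, percentile(active, 50), percentile(active, 90)))
    return result


# Hypothetical wind histories in m/s (illustrative only).
toy_tracks = [
    [17.0, 20.0, 25.0],
    [18.0, 30.0, 40.0, 35.0],
    [17.0, 22.0],
    [19.0, 28.0, 33.0, 30.0, 20.0],
]
evolution = wind_evolution(toy_tracks, min_storms=2)
```

Running the same function on the observed and downscaled tracks and plotting the two sets of curves side by side would reproduce the layout of Figs. 11 and 12.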
5. Cluster characteristics from downscaled simulations: Future climate
Finally, we explore projected changes in three characteristics of the downscaled storms: 1) mean annual frequency, 2) mean LMI, and 3) the mean latitude of LMI in each of the three warming scenarios (p2K, 2CO2, and p2K2CO2). Corresponding changes to seasonally averaged environmental fields, such as PI, midlevel saturation deficit and relative humidity, and deep-layer vertical wind shear (i.e., the absolute value of the vector wind difference between 850 and 200 hPa) were also analyzed to provide physical interpretation of the statistical changes. Broadly speaking, all models show increases in PI over the tropics for the two SST warming scenarios (p2K and p2K2CO2) and typically a slight decrease in PI when the SST is held fixed and the carbon dioxide doubled (2CO2), in agreement with previous studies (e.g., Held and Zhao 2011; Wehner et al. 2015). An example of these projected PI changes for the HIRAM model is shown in Fig. 13. In terms of midlevel saturation deficit, all models depict broad and robust increases over the tropics for the p2K and p2K2CO2 scenarios, but much smaller changes for the 2CO2 scenario, with both increases and decreases evident (see the HIRAM example in Fig. 14). From theoretical arguments (e.g., Emanuel 2013), one might expect the increasing saturation deficit in the SST warming simulations to result in a decrease in overall storm frequency, but the results here indicate the opposite relationship (consistent with previous results based on this technique; e.g., Emanuel 2013; Walsh et al. 2015). Projected changes of vertical wind shear are mixed, with both increases and decreases evident over the regions where the model TCs form and track (see the HIRAM example in Fig. 15).
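The deep-layer vertical wind shear defined above (the magnitude of the vector wind difference between 850 and 200 hPa) can be written compactly. The function name and the sample wind components below are hypothetical and are included only to make the definition concrete.

```python
import math


def deep_layer_shear(u850, v850, u200, v200):
    """Magnitude (m/s) of the 200 hPa minus 850 hPa vector wind difference."""
    return math.hypot(u200 - u850, v200 - v850)


# Hypothetical wind components in m/s at a single grid point.
shear = deep_layer_shear(u850=-5.0, v850=1.0, u200=7.0, v200=-4.0)

# The same definition applied point by point to a list of grid points,
# each given as (u850, v850, u200, v200).
toy_points = [(-5.0, 1.0, 7.0, -4.0), (2.0, 0.0, 5.0, 4.0)]
shear_values = [deep_layer_shear(*p) for p in toy_points]
```

Note that the shear is computed from the difference of the wind vectors, not from the difference of the wind speeds, so opposing winds at the two levels give large shear even when their speeds are equal.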
In the p2K scenario, two of the models, CMCC-D and CAM5-D, show a robust and statistically significant increase in the annual frequency of TCs in the Southern Hemisphere (Fig. 16a), whereas HIRAM-D and GISS-D display only relatively small (statistically insignificant) frequency changes of +4% and −3%, respectively. CMCC-D indicates statistically significant increases for all clusters, amounting to a +60% increase in overall TC frequency. A similarly large percentage change (+45%) was found for the North Atlantic region by Daloz et al. (2015) for the same model. As in Daloz et al. (2015), the CMCC-D model is the only downscaled model to show statistically significant increases for all clusters for all future scenarios (p2K, 2CO2, p2K2CO2; Fig. 16), but it is not obvious why this model should have a different trend from the others in the 2CO2 scenario, at least from inspection of changes to seasonally averaged environmental fields (e.g., PI, midlevel saturation deficit, and vertical wind shear; not shown). CAM5-D has a similar response to CMCC-D under surface warming (i.e., increased frequency), albeit less drastic, but it shows decreasing TC frequency under doubled CO2. In fact, the same scenario-dependent sign changes were found by Wehner et al. (2015) when analyzing explicitly simulated TCs in the same low-resolution version of CAM5. Like CAM5-D, HIRAM-D also reveals a decrease in total storm frequency when forced with doubled CO2 (Fig. 16b). Note that this decrease differs from the results of Held and Zhao (2011), who investigated changes to explicitly simulated TCs under the same forcing scenarios and found that the roughly 20% reduction in global TC frequency was due in equal parts to p2K and 2CO2. The results here are broadly consistent with the (increased) global frequency changes illustrated in Walsh et al. (2015) (see their Fig. 4) in an assessment of the same four downscaled models and warming scenarios, although they remain an anomaly compared to the explicitly simulated frequency changes (Table 3) as well as to previous studies based on explicitly simulated TC tracks, which have typically indicated a reduction of TC frequency in the Southern Hemisphere with warming (e.g., Walsh et al. 2016).
In terms of TC intensity changes for the Southern Hemisphere as a whole, three of the models (CMCC-D, GISS-D, and CAM5-D) reveal statistically significant increases in the mean LMI under p2K forcing (Fig. 17a), with the most consistent increases occurring in the south Indian Ocean. Daloz et al. (2015) also found that the CMCC-D and CAM5-D models had the largest increases in mean LMI for the North Atlantic region. HIRAM-D is an outlier in both surface warming scenarios (p2K and p2K2CO2), with overall decreases in LMI when averaged over the seven clusters. Again, it is not clear why this should be the case. For instance, PI increases by roughly the same amount with surface warming in both HIRAM (Figs. 13a,c) and CMCC (not shown), yet they display opposite LMI changes. In both the 2CO2 and p2K2CO2 scenarios, CMCC-D is the only model to show a robust increase in LMI (Figs. 17b,c). The negative LMI trend in HIRAM-D is generally consistent across all scenarios, and is particularly pronounced in p2K2CO2.
One of the more robust observational results of the impact of climate change on global TC activity during recent years has been the poleward migration of the latitude where TCs reach their LMI (Kossin et al. 2014), and this trend is expected to continue (at least for the western North Pacific) as the climate warms further (Kossin et al. 2016). Figure 18 shows the change in the mean latitude of LMI for each of the downscaled models under the three idealized warming scenarios. The models are consistent in indicating little or no change of this metric, with some models exhibiting an equatorward shift (depending on cluster) and others displaying a poleward shift. The differences in mean latitude of LMI are, for the most part, statistically insignificant, and there is no one scenario that stands out from the other two. We are careful to point out, however, that despite these statistics, three of the four CLIVAR simulations examined here do signify tropical expansion with warming. Based on the latitudinal maxima of zonally averaged sea level pressure (e.g., Hu et al. 2011; Staten et al. 2018) during peak TC season (JFM), the p2K and p2K2CO2 scenarios reveal equal magnitudes of tropical expansion, which ranges between 1° and 3° depending on model (i.e., 3° in HIRAM, 1° in CMCC, and 1° in GISS). The mean latitude of LMI in these warming scenarios therefore does not appear to be responding to the circulation changes in any consistent manner. The physical reasoning for this is not immediately apparent. The discrepancies between our results and previous studies (e.g., Kossin et al. 2016) may be partly due to the initial equatorward biases in the genesis locations, tracks, and latitude of LMI in the historical (20C) runs. The 2CO2 scenario does not indicate any tropical expansion. 
It is interesting to note that the clusters in which most models project a poleward shift in scenarios 2CO2 and p2K2CO2 are C5 and C6, in the South Pacific, where Daloz and Camargo (2018) found the most significant trend of poleward genesis position in observations, associated with LMI poleward shifts.
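The tropical-expansion metric referred to above, the latitude of the maximum in zonally averaged sea level pressure, can be sketched as follows. The search band of 60°S to 10°S, the function names, and the toy pressure profiles are assumptions for illustration. The sketch returns the nearest grid latitude only, whereas shifts of 1° to 3° would in practice require a finer grid or interpolation around the maximum.

```python
def ridge_latitude(lats, zonal_mean_slp, south=-60.0, north=-10.0):
    """Latitude (deg) of the maximum zonal-mean SLP within a search band.

    The default band is an assumed Southern Hemisphere window and is
    not taken from the study.
    """
    candidates = [
        (p, lat) for lat, p in zip(lats, zonal_mean_slp) if south <= lat <= north
    ]
    if not candidates:
        raise ValueError("no grid latitudes inside the search band")
    return max(candidates)[1]


def poleward_shift(lats, slp_control, slp_warm):
    """Shift of the SLP maximum in degrees; positive means poleward (southward)."""
    return ridge_latitude(lats, slp_control) - ridge_latitude(lats, slp_warm)


# Hypothetical JFM zonal-mean SLP profiles in hPa (illustrative only).
toy_lats = [-50.0, -45.0, -40.0, -35.0, -30.0, -25.0, -20.0]
toy_control = [1008.0, 1012.0, 1015.0, 1018.0, 1017.0, 1014.0, 1011.0]
toy_warm = [1009.0, 1013.0, 1018.5, 1018.0, 1016.0, 1013.0, 1010.0]

shift = poleward_shift(toy_lats, toy_control, toy_warm)
```

Comparing such a circulation-based shift with the change in the mean latitude of LMI for each model and scenario is one way to examine whether the two respond consistently.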
6. Discussion and conclusions
The ability of a select group of climate models from the U.S. CLIVAR Working Group on Hurricanes to simulate observed characteristics of TCs in the Southern Hemisphere has been investigated, including genesis locations and tracks, lifetime maximum intensity, annual frequency, and seasonality. Two types of simulated TCs were explored: 1) explicit TCs, generated using several TC detection and tracking schemes (Table 1), and 2) downscaled (“synthetic”) TCs, generated using the technique described in a series of papers (Emanuel 2006; Emanuel et al. 2006, 2008). Both types of tracks were further partitioned into regional clusters, following the method developed by Gaffney (2004). Although the number of clusters was ultimately a subjective choice, we found that the choice of seven clusters allowed for optimal comparison with TC track clusters from observations (Ramsay et al. 2012).
The six climate models evaluated with respect to the explicitly simulated TC tracks for the control climate simulation, namely HIRAM, CMCC, GISS, CAM5, FSU, and GFS, revealed fairly disparate results depending on the TC characteristic explored. The HIRAM-E model, for instance, was able to reproduce the observed track characteristics reasonably well but overestimated the average LMI by about 30% (Table 2). The annual hemispheric frequency of storms in the explicit tracks ranged from as low as 5.2 TCs per year (GFS-E), or ~20% of climatology (~27 TCs per year), to as high as 49.5 TCs per year (FSU-E). Encouragingly, the annual frequencies calculated from the multimodel mean were much closer to observed annual frequencies than any of the individual models, both for specific clusters and for the Southern Hemisphere as a whole. An analysis of projected TC frequency changes using the multimodel mean approach revealed roughly equal reductions in the Southern Hemisphere (~−16%) for the p2K and 2CO2 scenarios, and a slightly larger reduction (~−19%) for the p2K2CO2 scenario.
Clusters generated from the downscaled tracks generally resulted in more accurate representations of observed track behavior compared to explicit tracks generated by the same models (Figs. 6 and 7). Although some models had explicit tracks that compared well with their downscaled counterparts, as well as with observations (e.g., HIRAM-E and HIRAM-D, Fig. 3), other models, including CMCC-E, produced a relatively poor spatial depiction of the historical best tracks and were clearly inferior to the synthetic tracks. Despite this generally improved representation of climatological tracks by the downscaled models, there were some notable deficiencies. First, the mean track directions in the South Pacific region in GISS-D and CAM5-D were toward the southwest, counter to the typical southeastward direction in the best track data. Second, the median genesis and track positions exhibited substantial equatorward biases in the downscaled simulations, between +2° and +4° on average depending on model, which likely resulted in positive biases of other metrics including TC duration (+1.2 days across all models), mean LMI (from +5 to +8 m s−1 depending on model), and mean PDI. Therefore, despite the appeal of the synthetic track technique for statistical analysis (especially for quantifying the likelihood of extreme events; e.g., Emanuel 2017), the results from the present study indicate that the technique should not be taken at face value. We encourage further work aimed at resolving the track and intensity biases evident here.
Finally, the downscaled models were evaluated in relation to projected future changes of three TC metrics: 1) annual frequency, 2) mean LMI, and 3) mean latitude of LMI, when forced with three highly idealized warming scenarios (p2K, 2CO2, and p2K2CO2). In terms of changes to annual frequency, the CMCC-D model revealed robust and statistically significant increases under all three scenarios, with +60% and +36% increases in the p2K and 2CO2 scenarios, respectively. The other three downscaled models (HIRAM-D, GISS-D, and CAM5-D) showed both increases and decreases in annual frequency, depending on scenario. For instance, TC frequency in HIRAM-D and CAM5-D increased under p2K forcing but decreased with 2CO2. In terms of TC intensity changes, the most consistent increases occurred in the p2K scenario, with three of the four models showing statistically significant increases in LMI (the outlier being HIRAM-D, which showed a decrease in LMI in all three warming scenarios). Changes to the mean latitude of LMI were more ambiguous despite the fact that surface warming resulted in tropical expansion of between 1° and 3° in three of the four downscaled models examined here. Based on statistical changes of the mean latitude of LMI alone, there is some evidence to suggest that poleward migration is more likely in the South Pacific under a doubling of CO2 (compared to a uniform 2-K warming), as all four downscaled models in that scenario display a poleward shift (at least on average across the Southern Hemisphere), with both HIRAM-D and CAM5-D indicating statistically significant poleward shifts.
The authors thank all the members of the U.S. CLIVAR Hurricane Working Group (HWG) for their contribution to this significant effort, in particular those who produced the model simulations used in this study: Kerry Emanuel, Maxwell Kelley, Arun Kumar, Jeffrey Jonas, Timothy LaRow, Enrico Scoccimarro, Hui Wang, Michael Wehner, and Ming Zhao. We would also like to thank Naomi Henderson for managing the HWG dataset. HAR acknowledges support from the ARC Centre of Excellence for Climate System Science, as well as the Australian National Environmental Science Program (NESP). SSC also acknowledges support from NESP. SJC acknowledges the support of NSF Grant AGS 1143959 and NASA Grants NNX13AM18G and 80NSSC17K0196. The model data used here may be made available upon individual request.
This article is included in the US CLIVAR Hurricanes and Climate special collection.