The U.K. on Partnership for Advanced Computing in Europe (PRACE) Weather-Resolving Simulations of Climate for Global Environmental Risk (UPSCALE) project, using PRACE resources, constructed and ran an ensemble of atmosphere-only global climate model simulations, using the Met Office Unified Model Global Atmosphere 3 (GA3) configuration. Each simulation is 27 years in length for both the present climate and an end-of-century future climate, at resolutions of N96 (130 km), N216 (60 km), and N512 (25 km), in order to study the impact of model resolution on high-impact climate features such as tropical cyclones. Increased model resolution is found to improve the simulated frequency of explicitly tracked tropical cyclones, and correlations of interannual variability in the North Atlantic and northwestern Pacific lie between 0.6 and 0.75. Improvements in the deficit of genesis in the eastern North Atlantic as resolution increases appear to be related to the representation of African easterly waves and the African easterly jet. However, the intensity of the modeled tropical cyclones as measured by 10-m wind speed remains weak, and there is no indication of convergence over this range of resolutions. In the future climate ensemble, there is a reduction of 50% in the frequency of Southern Hemisphere tropical cyclones, whereas in the Northern Hemisphere there is a reduction in the North Atlantic and a shift in the Pacific with peak intensities becoming more common in the central Pacific. There is also a change in tropical cyclone intensities, with the future climate having fewer weak storms and proportionally more strong storms.
There is an increasing need for skillful climate information at regional and local scales, particularly for considering variability and extremes, in order to inform future planning and impact risk assessments, both to prepare for climate change and to investigate opportunities for renewable energy and for the insurance and reinsurance industries to understand risk. Current phase 5 of the Coupled Model Intercomparison Project (CMIP5)-class models (Taylor et al. 2012) generally fall short of being able to provide information on these small space and time scales (Christensen et al. 2014), though they provide a wealth of information on climate uncertainty and large-scale patterns of variability (Stocker et al. 2014).
Tropical cyclone simulation presents a particular challenge to the CMIP5 class of model, since these features are relatively small in size and have complex circulations driven by convective processes. These aspects are particularly weak elements of coupled general circulation models (CGCMs) implemented at horizontal resolutions that allow multicentennial integrations under a variety of forcing scenarios, often with full Earth system biogeochemistry components.
To address such issues, a long-standing collaboration exists between the Met Office and the University of Reading to develop “weather resolving” climate models, which are able to capture typical weather features such as fronts and atmospheric rivers (as found in a weather forecast) while also being integrated over multidecadal time scales (Shaffrey et al. 2009; Strachan et al. 2013; Demory et al. 2014; Mizielinski et al. 2014). Many other groups are also progressing quickly in this direction, often using higher-resolution components of existing weather/seasonal forecasting or climate models (e.g., Zhao et al. 2009; Murakami and Sugi 2010; Wehner et al. 2010; Manganello et al. 2012; Rathmann et al. 2013; Bacmeister et al. 2013), as significant advances in model scalability, supercomputing, data storage, and processing capacity become available. Such developments enable investigation into the impact that enhanced resolution has on aspects of climate, particularly on how processes are better represented. For tropical cyclone (TC) simulation, there has been much research on the impact of complex changes to forcings in future projections, such as in CMIP3 and CMIP5 experiments (Zhao et al. 2009; Murakami et al. 2012b; Tory et al. 2013b; Camargo 2013). Alternatively, idealized forcings have been used, such as a uniform +2-K sea surface temperature (SST) increase or a 2 × CO2 increase (Yoshimura and Sugi 2005; Held and Zhao 2011; Daloz et al. 2015). The latter method attempts to understand how the differing forcing factors influence the TC climatology, while the former may give some guidance on future projections, which are also summarized in Knutson et al. (2010).
The TC activity in models can be determined in a variety of ways, such as using explicit feature-tracking algorithms (Bengtsson et al. 2007a; Zhao et al. 2009). Alternatively, measures based on the larger-scale climatology of factors known to influence TC formation (e.g., wind shear, thermodynamic instability, and humidity) can be computed, such as the genesis potential index (GPI; Emanuel 1988; Camargo et al. 2007; Emanuel 2010), or measures combining aspects of both, such as that of Tory et al. (2013a). Walsh et al. (2013) made a comparison between GPI-based and explicit tracking of storms in CMIP3 models, showing that lower-resolution models simulate the large-scale GPI better than they simulate the TCs explicitly. However, TCs also have a feedback on the large-scale climate, as noted in observations (D’Asaro et al. 2013) and in coupled models (Hu and Meehl 2009), and are therefore important to represent explicitly.
As detailed in Mizielinski et al. (2014), the U.K. on Partnership for Advanced Computing in Europe (PRACE) Weather-Resolving Simulations of Climate for Global Environmental Risk (UPSCALE) project ran the Met Office Unified Model (MetUM), using a forced atmosphere–land configuration named Global Atmosphere 3.0 (GA3.0; Walters et al. 2011), on the Cray XE6 supercomputer Hermit at the High Performance Computing Center Stuttgart (HLRS) in Stuttgart, Germany. Using a hierarchy of models with midlatitude resolutions of N96 (130 km), N216 (60 km), and N512 (25 km) with consistent physics and dynamics settings, our goal was to investigate the extent to which such models can reproduce aspects of observed climate variability and extremes and to help put results from Intergovernmental Panel on Climate Change (IPCC)-class models into context. Building on the work of Strachan et al. (2013), this study investigates higher resolutions (both horizontally and vertically with a top at 85 km) with more ensemble members (enabling more stringent statistical tests) extending to near present day and with a more recent model configuration that makes it possible to have even fewer configuration changes with resolution. In addition, an ensemble of integrations was performed with an idealized future climate scenario [representative concentration pathway 8.5 (RCP8.5) at 2100] with strong forcing to see how this affects projections of climate change, including tropical cyclone characteristics.
The ultimate goal of our research is to use coupled models at similarly high resolutions in order to generate an internally consistent future climate (i.e., in a system where the atmosphere–land–ocean–ice system can reach its own equilibrium and be in surface energy balance). This would require many more ensemble members because of the increased internal variability of such a system (Deser et al. 2012), which increases the difficulty in separating signal from noise. Such experiments would likely be at least an order of magnitude more computationally expensive and would depend strongly on biases in all model components being small enough for the mean climatology and variability to be realistic. As shown in Murakami et al. (2014), model biases can significantly alter the future projections of TCs. Research that provides quantification of inherent model error (e.g., due to coarse resolution) is a necessary preparatory step in that direction.
The manuscript continues as follows: the models, methods, and data used in this work are summarized in section 2 and the characteristics of the tropical cyclone simulation in terms of climatology, variability, seasonal cycle, and structure are described in section 3, for both the present climate and future climate simulations. There then follows a summary and discussion in sections 4 and 5.
2. Models, data, and methods
The climate models used in this study are based on the GA3.0 and Global Land 3 (GL3) configurations of the MetUM and the Joint U.K. Land Environment Simulator (JULES), respectively, as documented in Walters et al. (2011) and Mizielinski et al. (2014). These references describe the many developments compared to the Hadley Centre Global Environment Model, version 2–Atmosphere and Ocean (HadGEM2-AO) from CMIP5 (Martin et al. 2011), including the standard use of 85 levels up to 85 km for improved stratospheric representation, a prognostic cloud fraction and cloud condensate scheme (Wilson et al. 2008), and modifications to microphysics to reduce the spurious occurrence of drizzle (Abel and Shipway 2007). The hierarchy of horizontal resolutions has been developed such that the physics and dynamics of the models are kept the same as far as practically possible, so that any differences in simulation can be attributed to resolution alone (see Mizielinski et al. 2014). The horizontal resolutions of the respective models are shown in Table 1 and span the range between CMIP5-type resolutions and global weather forecasting resolution. The model integrations span February 1985–December 2011 in order to use all the data then available from the Operational Sea Surface Temperature and Sea Ice Analysis (OSTIA; Donlon et al. 2012) daily SST and sea ice dataset, which has a native resolution of 1/20° and is a synthesis of satellite and in situ observations covering from 1985 to the present day, where the period 1985–2008 is a reanalysis (Roberts-Jones et al. 2012). The present climate (PC) simulations use this surface forcing, together with CMIP5 Atmospheric Model Intercomparison Project phase 2 (AMIP-II) standard forcings for aerosols and greenhouse gases [described in Taylor et al. (2012) as “historical” using realistic radiative forcings until 2005 and then using RCP4.5 forcings to 2011].
There are five ensemble members at N96 and N512 resolution and three members at N216, with each member differing only in the initial conditions (see Mizielinski et al. 2014). The number of ensemble members was based purely on available computing resources.
The future climate (FC) “time slice” integrations (Bengtsson et al. 1996; Zhao et al. 2009; Murakami et al. 2012b) are an idealized attempt to simulate a future climate state using only a forced atmosphere–land surface model. As described in Mizielinski et al. (2014), the change in SST between 2100 and the present day from the HadGEM2–Earth System (ES) from the IPCC Fifth Assessment Report (AR5; Collins et al. 2011) under the strong emissions pathway RCP8.5 scenario (van Vuuren et al. 2011) is added to the present-day daily OSTIA SSTs as a repeating annual cycle (with spatial global pattern), while the sea ice concentrations are taken directly from the HadGEM2-ES simulation. Greenhouse gas concentrations are made consistent with those in the HadGEM2-ES simulation at year 2100. While it is true that the thermodynamic state is not a unique function of SST (Emanuel and Sobel 2013) and hence with no interactive ocean coupling this type of experiment is not the ideal method in which to examine how TCs might change in future, it has the benefit of being a global model (and hence able to respond to remote forcings) and is among the simplest methods to use and interpret. There are three ensemble members for each model resolution for the FC integrations.
b. Reanalyses and observations
The reanalysis datasets used in this work have been obtained from the respective centers, and the horizontal resolutions of these data are shown in Table 1. Data from the Interim European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-Interim) for the period 1985–2011 (Dee et al. 2011) were used on the 1° × 1° grid (not the native resolution) for both the 6-hourly data used in TC tracking and the monthly data used for the climatological calculations. The National Aeronautics and Space Administration (NASA) Modern Era Retrospective-Analysis for Research and Applications (MERRA) reanalysis data (Rienecker et al. 2011) and the Japan Meteorological Agency (JMA) Japanese 25-year Reanalysis Project (JRA-25) reanalysis data (Onogi et al. 2007) were similarly used.
Observational data for the North Atlantic (NA) and eastern Pacific (EP) basins were obtained from the updated National Hurricane Center Hurricane Database (HURDAT2; Landsea and Franklin 2013) and data for the remaining basins are from the Joint Typhoon Warning Center best-track files (Chu et al. 2002). Here these datasets will be referred to jointly as observations. These datasets are used in preference to the International Best Track Archive for Climate Stewardship (IBTrACS; Knapp et al. 2010) because they have unique, 1-min sustained wind speeds reported for each basin. The basins used throughout this work are defined in Fig. 1 (the separation between EP and NA lies along Central America), and a tropical cyclone is assigned to the basin where the maximum wind speed is attained while the storm has a warm core unless otherwise stated.
In this work, comparison will only be made for the period in which the reanalysis datasets overlap with the model results, typically 1985–2011. Particular seasons are used—May–November for the Northern Hemisphere (NH) and October–May for the Southern Hemisphere (SH)—since these periods capture the vast majority of observed and modeled storms (see Emanuel 2003) and make the processing of these large datasets more manageable (Strachan et al. 2013).
An objective, resolution-independent feature-tracking methodology is used to identify and track TC-like features in the GCM output and reanalyses. This method is described fully in Hodges (1995, 1996, 1999) and Bengtsson et al. (2007a). Maxima in low-level 850-hPa relative vorticity are used for identification and tracking of all tropical vorticity centers using vorticity spectrally filtered to T42 resolution (in order to perform the feature tracking on a common grid that has removed grid-scale anomalies) for both models and reanalyses as described in Strachan et al. (2013), Manganello et al. (2012), and Bell et al. (2013). TC identification is performed with a warm-core check on a common T63 grid using four vertical pressure levels (850, 500, 300, and 200 hPa) [compared to the three levels used in Strachan et al. (2013) and Bell et al. (2013)]. The limited number of parameters used in the method (primarily vorticity and warm core) has been derived from tracking of reanalysis datasets and is not tuned for the different models.
All model storms are weak compared to observations (see Strachan et al. 2013), particularly with regard to 10-m wind speed, as shown in Fig. 2, but also applying to wind at other levels. This has been a characteristic of the MetUM for some years over many configurations, and the cause continues to be investigated. Intensity for the models and reanalyses will be measured by minimum mean sea level pressure (MSLP) during TC lifetime, with the understanding that this is not an official categorization but based on the original Saffir–Simpson hurricane scale (Simpson 1974). The categories are defined by the following: >994 hPa, 980–994 hPa, 965–979 hPa, 945–964 hPa, 920–944 hPa, and <920 hPa for tropical storms (TSP) and category (CatP) 1–5 hurricanes, respectively (subscript P to distinguish from the standard 10-m wind speed definition).
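The pressure-based categories listed above can be written as a small lookup. The following is a minimal sketch, not the authors' code: the function name `mslp_category` and the label strings are hypothetical, and because the text quotes whole-hPa ranges, the sketch assumes that a fractional value falling between two quoted ranges (e.g., 979.5 hPa) is assigned to the stronger category.

```python
def mslp_category(min_mslp_hpa: float) -> str:
    """Map lifetime-minimum MSLP (hPa) to 'TS_P' or 'Cat_P 1'..'Cat_P 5'.

    Thresholds follow the ranges quoted in the text. Assumption: values
    between two quoted whole-hPa ranges go to the stronger category.
    """
    if min_mslp_hpa > 994.0:
        return "TS_P"
    # Lower bounds (hPa) of Cat_P 1 to Cat_P 4, from weakest to strongest.
    lower_bounds = [
        (980.0, "Cat_P 1"),
        (965.0, "Cat_P 2"),
        (945.0, "Cat_P 3"),
        (920.0, "Cat_P 4"),
    ]
    for bound, label in lower_bounds:
        if min_mslp_hpa >= bound:
            return label
    return "Cat_P 5"  # below 920 hPa
```

For example, `mslp_category(950.0)` returns `"Cat_P 3"` under these assumptions.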
Although the model TCs are weak in intensity, they do satisfy the detection algorithm for structure and warm core. In fact, when tracking TCs in reanalyses, for which storm-by-storm matching is possible, we have shown that our detection algorithm has higher skill for the higher TC categories, for both mean frequency and variability (A. Cobb 2014, personal communication). It is expected then that the weakest observed storms (tropical depressions and storms) in our simulations would not be robustly detected by our tracking algorithm. Moreover, when making use of observations for model assessment, weaker storms may depend more on monitoring practices of individual operational centers. Finally, weaker storms may be less sensitive to large-scale drivers (which is the main focus here). As a consequence, in the following the main comparisons with models and reanalyses will be with observed storms of hurricane strength category 1 (standard 10-m wind speed definition) and above (see also Zhao et al. 2009).
The composite structure of the tropical cyclone shown later is derived from the strongest 10 TCs for each resolution over the ensemble, calculated during the warm-core phase at peak 850-hPa vorticity, and aligned with the direction of storm travel. It is calculated over a 10° radial cap, with resolution 0.25° × 1°, following the methodology of Bengtsson et al. (2007b) and Catto et al. (2010).
The PC integrations will be discussed before the FC results.
a. Large-scale global tropical cyclone characteristics
The average number of TCs detected in each basin in the model ensembles, reanalyses, and observations (tropical storm strength and above, and hurricane and above) over the period 1986–2010 is shown in Fig. 3. The overall frequency shown in Fig. 3a is subdivided into maximum intensities during TC lifetime so that the bottom part of the bar denotes storms that are less than category 1 (CatP 1 for the models/reanalyses), with the top part showing categories 1–5. The models and reanalyses have total TC frequencies generally within the range of the observations (as defined by the range between TS and hurricanes) but with the majority of storms being weak sub–CatP 1 strength. As model resolution increases, so does the number of detected storms (as also found in Strachan et al. 2013), with the main difference between N216 and N512 being an increase in more intense storms, with CatP 3 storms beginning to appear at N216 and N512 (the strongest model storms have maximum 10-m wind speeds of around 35 m s−1 with minimum MSLP of 940 hPa; Fig. 2). Models and reanalyses both have lower TC frequency in the NH compared to observations and too many TCs in the SH.
Figures 3b,c break these totals down into individual basins; the numbers have been normalized by their respective hemispheric totals so that it is easier to examine the distributions between basins. A TC is categorized in a particular basin if it reaches peak intensity there.
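The normalization by hemispheric total can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function name `normalise_by_hemisphere`, the basin-to-hemisphere mapping passed in by the caller, and any counts used with it are hypothetical and are not taken from Fig. 3.

```python
def normalise_by_hemisphere(counts, hemisphere_of):
    """Return each basin's share of its own hemisphere's TC total.

    counts:        dict of basin name -> mean TC count (illustrative values)
    hemisphere_of: dict of basin name -> 'NH' or 'SH' (caller-supplied mapping)
    """
    # Accumulate the total count for each hemisphere.
    totals = {}
    for basin, n in counts.items():
        hemi = hemisphere_of[basin]
        totals[hemi] = totals.get(hemi, 0.0) + n

    # Divide each basin by its hemispheric total (0.0 if that total is zero).
    shares = {}
    for basin, n in counts.items():
        total = totals[hemisphere_of[basin]]
        shares[basin] = n / total if total else 0.0
    return shares
```

Normalizing in this way removes the overall NH deficit and SH excess of storms, so that only the relative distribution between basins is compared.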
The models produce typically too few TCs in the NA with an improvement at N512 and too many in the EP, with N512 also having relatively fewer in the western Pacific (WP) and more in the central Pacific (CP), perhaps because of their too-zonal track (see below for details). In the SH, where there is generally an excess of storms, the positive bias is larger in the southwestern Indian Ocean (SWI) basin, while the Australasian (AU) region has fewer storms. The models also generate storms in the South Atlantic (SA), where hurricanes are observed to be rare (McTaggart-Cowan et al. 2006), though subtropical cyclones are quite common (Evans and Braun 2012).
The reason for the increase in TC frequency with model resolution is not immediately obvious. The tracking algorithm attempts to be as resolution independent as possible, with its main threshold of vorticity on the common T42 grid to which all fields are filtered. The increased resolution allows more vorticity structures to be generated at the native resolution (e.g., via African easterly waves in the NA) and together with more realistic structures at higher resolution, which are more likely to be identified by the algorithm, can generate more storm detections. There are some indications that, at least with a more recent configuration of the MetUM model, the TC total frequency seems to saturate between 25- and 12-km resolutions (using a comparable model setup), but with a decrease in the proportion of weaker storms at the higher resolution (this will be described in future work).
The TC frequencies in the reanalyses do not follow model resolution but are affected by different data assimilation schemes, as discussed by Bengtsson et al. (2007a). For example, the JRA-25 dataset has the lowest nominal resolution, but its assimilation scheme enhances near-surface wind speeds around tropical cyclone centers once the TC has formed (Onogi et al. 2007) and hence may make them stronger and/or have a structure more likely to fulfil the identification criteria. MERRA has the highest resolution of data used here but the fewest TCs detected, with its basin-by-basin distribution closely matching the observations.
The track density, defined as transits per month within a 4° radius of each point, is shown in Fig. 4, using the warm-core-only portion of the track to allow better comparison with observations. The increase in model resolution improves the NA basin, with more tracks being located in the eastern Atlantic, while at low resolution almost all of the tracks are contained within the Gulf of Mexico [as found in Strachan et al. (2013) and Bell et al. (2013)]. Increased model resolution enhances an error in the CP, where the density becomes too high compared to observations, with tracks that are too zonal, yielding the basin distribution shown in Fig. 3. Part of this error may be attributed to differences between how TCs are detected in the model and in observations. Observed TCs will tend to be attributed to a particular basin, depending on the observing center, and will be detected at a later stage of development than the vorticity in the model, even if the latter only uses the warm-core part of the track. Study of model vorticity also shows that there are several sources of vorticity in the CP: waves associated with the intertropical convergence zone, waves from the eastern Pacific (possibly remnants of African easterly waves), and waves from convergence zones of flow from north and south of the equator, some of which lead to anomalous genesis. However, overall the TCs simulated by the models travel too zonally, with fewer storms being generated nearer to the equator in the western Pacific than seen in the observations. Such biases are sensitive to changes in model configuration and parameters, such as convective entrainment rates (ongoing work).
An additional model bias lies in the Gulf of Mexico/EP region, where the track density is again too high. In the model the coast of Venezuela generates a continual stream of vorticity (perhaps from easterly waves generated over the Caribbean and Gulf of Mexico; Serra et al. 2010). This feeds vorticity structures in the Gulf of Mexico and, together with interactions with mountains in Central America, is a large source for the EP, which is why many storms found in the EP seem to originate in the NA (Serra et al. 2010).
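The track density definition used in the preceding paragraph (transits per month within a 4° radius of each point) can be sketched as a direct count. This is a simplified illustration and is not claimed to be the estimator used by the tracking software: it assumes each track is a list of (latitude, longitude) points in degrees, counts a track at most once per grid point, and uses hypothetical function names.

```python
import math


def angular_distance_deg(lat1, lon1, lat2, lon2):
    """Great-circle angular separation in degrees (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def track_density(tracks, grid_lats, grid_lons, n_months, radius_deg=4.0):
    """Transits per month within radius_deg of each grid point.

    tracks:   list of tracks, each a list of (lat, lon) points in degrees
    n_months: number of months covered by the track sample
    Returns a nested list indexed as density[i_lat][j_lon].
    """
    density = [[0.0 for _ in grid_lons] for _ in grid_lats]
    for i, glat in enumerate(grid_lats):
        for j, glon in enumerate(grid_lons):
            # A track counts once if any of its points lies within the radius.
            transits = sum(
                1 for track in tracks
                if any(angular_distance_deg(glat, glon, lat, lon) <= radius_deg
                       for lat, lon in track)
            )
            density[i][j] = transits / n_months
    return density
```

Restricting each track to its warm-core portion before calling `track_density` would mirror the comparison with observations described in the text.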
In the Southern Hemisphere, the main error in distribution is found in the SWI basin, where the track density is strongly enhanced to the west near Madagascar. In observations and reanalyses, it is more evenly distributed.
The reanalysis track densities (again using only the warm-core portion of the track) are shown in Fig. 4 and, as expected, are much more comparable to the observations. The JRA-25 dataset generally has higher track densities in most basins compared to the ERA-Interim and MERRA, and is in best agreement with the observations. The data assimilation around TCs is again likely to be aiding this. The reanalysis datasets are also better than the higher-resolution models in the CP, where they have a more realistic gap between EP and WP activity, suggesting that it is not the detection of early formation vorticity structure resulting in the model bias above.
b. Seasonal cycle
The seasonal cycle of tropical cyclones in each basin for models, averaged over each resolution ensemble, and observed hurricanes for 1986–2010 is shown in Fig. 5. Bearing in mind that the detection of storms is constrained to using May–November and October–May periods for the NH and SH, respectively, the shape of the seasonal cycle is well represented, particularly in the WP. In the NA, the season starts too early and does not increase strongly through July–September, as seen in the observations and reanalyses. The early part of this bias may be due to storms generated in the NA traveling into the EP in July and being classified there. The latter part is almost entirely due to a lack of genesis of eastern Atlantic TCs at this time of year, as shown in Kossin et al. (2010) in their “cluster 3” of storms originating in the eastern Atlantic and Daloz et al. (2015), showing how this cluster has peak frequency in August–September. The low frequency, particularly at low resolution, seems to be related to a reduced number of African easterly waves (AEWs), whose seasonal climatology at 15°W is shown in Fig. 6 (top), as measured by the method of Bain et al. (2013), where an analysis of Hovmöller diagrams of curvature vorticity at 700 hPa is used to identify propagating waves (once wind fields have been interpolated to a common N96 grid) between 5° and 15°N. The model AEW seasonal cycle peaks earlier than in the reanalyses, and the AEW numbers gradually increase with model resolution. However, they remain much weaker in terms of peak 700-hPa vorticity than the reanalyses and completely miss the seasonal cycle in vorticity (Fig. 6, bottom). This is consistent with the changed structure of the African easterly jet (AEJ), in which the southern flank of the AEJ has a much weaker meridional wind gradient in the models than in the reanalyses (Fig. 7) and the AEJ core is located farther south.
This would in turn restrict the formation of AEWs to the south of the jet, and they have been shown by Chen et al. (2008) to be particularly associated with TC formation in the eastern Atlantic. As an aside, seasonal hindcasts from the Met Office coupled Global Seasonal Forecast System, version 5 (GloSea5; MacLachlan et al. 2015), using the same atmospheric model at N216 resolution coupled to a 1/4° ocean model, have stronger AEW vorticity and an AEJ located in a similar position to the N512 model (Fig. 7) with a stronger meridional wind gradient and an improved eastern Atlantic TC formation rate (Camp et al. 2014, manuscript submitted to Quart. J. Roy. Meteor. Soc.). This improvement may be attributable to the seasonal initialization of, for example, land surface properties such as soil moisture and the ocean, as well as full ocean coupling.
The EP has too many storms in the higher-resolution models, with the strong peak in July apparently due to storms with vorticity genesis in the Gulf of Mexico (Bengtsson et al. 2006, 2007a) perhaps because of too many easterly waves propagating into this region (Serra et al. 2010) and with a dip in September that is not seen in observations. The northern Indian Ocean (NI) region has somewhat different characteristics, with fewer TCs at higher resolution and a poor seasonal cycle. Part of this may be explained by the poor simulation of the Indian monsoon in the model (Walters et al. 2011): the onset is late, which may affect the early season, and the excessive activity in September–October may be the result of both the early retreat of the monsoon and some monsoon depressions being classified as TCs. The N512 model does have somewhat reduced monsoon circulation biases, and at higher resolution it is somewhat easier for the tracking code to distinguish between depressions and TCs; for these reasons, it has fewer TCs.
In general, the Southern Hemisphere has too many storms, a typical model error also seen in Strachan et al. (2013), with the southern Pacific (SP) having a more exaggerated peak in February–March than observed, whereas the other basins have a reasonably timed, if exaggerated, cycle. This is also true for reanalyses and may suggest that in these regions with significantly fewer in situ observations the large-scale environment may not be well represented; however, cyclogenesis indices based on reanalyses (Menkes et al. 2012; Strachan et al. 2013) suggest that the cyclogenesis environment is not unreasonable. Apart from differences in absolute numbers, there is little sensitivity of the seasonal cycle to model resolution.
c. Interannual variability
The interannual variability in TC frequency (model and reanalyses) and hurricanes (observations) for the NA is shown in Fig. 8. As seen previously, the higher-resolution models produce more storms on average, and N512 in particular reproduces all but the 2005 peak well, with its ensemble range almost always containing the observations. The reanalysis datasets represent the interannual variability realistically, with the number of diagnosed storms generally at or above observed hurricane numbers, with a good representation of 2005 in particular. Note that reanalyses have a reasonable chance of containing some of the weaker storms given the observational constraints used in their production.
Correlations of interannual variability for the Northern Hemisphere basins are shown in Table 2 for model ensemble means and individual members and for reanalyses. In the NA, the correlation for all model ensemble means is significant (at the 95% level using a two-tailed t test) and increases with model resolution. In the WP, all models have significant ensemble correlations of above 0.6, though in this case the N512 model has the lowest correlation. The EP has somewhat lower correlations, with only N216 being significant, and other basins are poorly correlated. For the reanalysis datasets, the correlations are similarly high in the NA and WP, higher than the models in the NI, and similar to the models in the EP. In general JRA-25 has the best correlations, perhaps again because of the assimilation of winds around hurricanes. The fact that the reanalyses are not perfectly correlated may be attributed to limitations in both the datasets themselves and in identification criteria used to select a subset of all vorticity tracks, as it is likely that most of the observed hurricanes are represented in some way in the reanalyses [see Strachan et al. (2013) for ERA-Interim]. How the structure of the storm (both horizontally and vertically) is represented in the datasets will influence whether they are detected by the identification criteria.
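The significance test mentioned above can be sketched as follows. This is a minimal illustration, assuming a Pearson correlation between two annual count series and assuming that successive years are independent; the function name `pearson_with_t` is hypothetical, and the text does not state exactly how the test was implemented.

```python
import math


def pearson_with_t(x, y):
    """Return (r, t, dof) for two equal-length annual series.

    r   : Pearson correlation coefficient
    t   : t statistic, t = r * sqrt(dof / (1 - r^2))
    dof : degrees of freedom, n - 2 (assumes independent years)
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two series of equal length with at least 3 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    dof = n - 2
    if abs(r) >= 1.0:
        t = math.copysign(math.inf, r)  # perfectly (anti)correlated series
    else:
        t = r * math.sqrt(dof / (1.0 - r * r))
    return r, t, dof
```

The correlation is significant at the 95% level (two-tailed) when |t| exceeds the tabulated critical value for the given degrees of freedom, which is about 2.07 for a 25-yr series (23 degrees of freedom).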
As discussed in Chen and Lin (2013), the correlation between each ensemble member and the ensemble mean of the other members gives some measure of the internal variability (“weather noise”). Stronger intermember correlation would suggest that the forcing plays a stronger role in interannual variability. These correlations are shown in Table 3: in common with Chen and Lin (2013), the EP has the strongest intermember correlation, especially at N512, but here the EP has the weakest ensemble average correlation with observed storms (Table 2). This suggests that the model is biased in a way that strongly influences all members, or the model is not responding to the forcing in the correct way. It is possible that the results differ from Chen and Lin (2013) because of the configurations used, with their integrations being seasonal in length with persisted SST anomalies that encourage larger internal variability. Their NA correlation is also larger than seen here, so model initialization may be important.
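The intermember measure described above, in which each member is correlated with the mean of the remaining members, can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, each member is assumed to be a list of annual TC counts of equal length, and a plain Pearson correlation is assumed.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def leave_one_out_correlations(members):
    """Correlate each ensemble member with the mean of all other members.

    members: list of annual count series, one per ensemble member.
    Returns one correlation per member, in the same order.
    """
    if len(members) < 2:
        raise ValueError("need at least two ensemble members")
    n_years = len(members[0])
    correlations = []
    for k, member in enumerate(members):
        others = [m for i, m in enumerate(members) if i != k]
        # Year-by-year mean of the members other than member k.
        others_mean = [sum(m[t] for m in others) / len(others)
                       for t in range(n_years)]
        correlations.append(pearson(member, others_mean))
    return correlations
```

High values for every member indicate that the imposed forcing dominates the interannual variability, whereas low values indicate a larger contribution from internal variability.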
In addition to explicitly tracking TCs, the large-scale climatology can act as a strong constraint on TC formation even if precursors such as vorticity structure favor formation, particularly in the Atlantic. The GPI is a standard measure of the amenability of the large-scale mean environment to tropical cyclone formation (Emanuel 1988; Camargo et al. 2007; Emanuel 2010). Figure 9 shows the GPI for the Atlantic for the models and reanalyses (following Emanuel 2010) calculated as the July–October monthly mean taken over the main development region (MDR; defined by 7.5°–20°N, 15°–85°W). The ensemble correlation of the model GPI with the model storm counts is very high [as shown in Table 5, where both older (GPI; Camargo et al. 2007) and updated (GPI2010; Emanuel 2010, 2013) correlations are shown], while the correlation of the GPI with the observed hurricane counts is slightly lower. However, the fact that the GPI remains almost the same for the different resolution models, while as seen above the explicitly tracked TC frequency increases with model resolution, suggests that the GPI in the Atlantic is a good guide for relative (normalized) TC interannual variability but cannot distinguish the absolute TC frequency differences between the model resolutions. It also suggests that the large-scale mean state represented by the different resolution models is quite similar, which may be expected given the same imposed SST forcing though the strength of remote teleconnections may differ. In addition, as noted by Menkes et al. (2012), the standard deviation of the GPI is also much less than the observed TC interannual variability (the GPI standard deviation shown here is around 0.4, while the TC frequency standard deviation is around 2–3 for models and even higher for reanalyses).
Although TC frequency is reasonably well represented by the models at higher resolutions, accumulated cyclone energy (ACE) is a more robust and representative measure (Bell et al. 2000), accumulating as it does tropical cyclone frequency over a season and the intensity and lifetime of each storm through integrating the squared wind speed along each track every 6 h. The correlations in ACE variability for the NA are 0.59, 0.71, and 0.77 for the N96, N216, and N512 ensembles, respectively, and 0.91 for the reanalyses ensemble in the NA: hence, very similar to the TC frequency correlation shown in Table 2. However, this hides the fact that the mean ACE in the models and reanalyses is much smaller than observed, typically by 3–10 times. This is due to a combination of shorter track length when using the warm-core definition (though in the NA the main cause is the lack of TC genesis in the eastern Atlantic) and very weak 10-m wind speeds, with the latter being the more significant. If the model 10-m wind speeds are scaled up using the observed wind speed–MSLP relationship (Fig. 2), then the model average ACE is comparable to that observed (not shown). This suggests that the MSLP associated with TCs in the model is reasonable and would lend support to the hypothesis that the modeled storms tend to be too large in size with a consequent weakening of the wind speeds: that is, a weaker pressure gradient.
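As a rough sketch of the ACE calculation described above, the following sums the squared 6-hourly wind speed along each track and then over all tracks in a season. The conversion to knots and the 10^-4 scaling follow the conventional definition of ACE and are assumptions about how the values here were computed; the optional wind threshold is also hypothetical, since no threshold is stated in the text.

```python
MS_TO_KT = 3600.0 / 1852.0  # 1 kt = 1852 m per 3600 s

def track_ace(wind_ms, min_wind_kt=0.0):
    """ACE contribution of one track.

    wind_ms: sequence of 6-hourly maximum 10-m wind speeds (m s^-1).
    min_wind_kt: optional threshold (kt); points below it are skipped.
    Returns ACE in the conventional units of 10^4 kt^2.
    """
    total = 0.0
    for w in wind_ms:
        kt = w * MS_TO_KT
        if kt >= min_wind_kt:
            total += kt * kt
    return 1.0e-4 * total

def seasonal_ace(tracks, min_wind_kt=0.0):
    """Sum the ACE of every track in a season (tracks: list of sequences)."""
    return sum(track_ace(t, min_wind_kt) for t in tracks)
```

Because ACE is quadratic in wind speed, scaling every wind value by a factor k changes ACE by k squared, so modest wind speed biases translate into large ACE deficits.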
In addition to the TC frequency, it is also important to reproduce the spatial distribution, since this can have important consequences for the TC track and the likelihood of landfall (Kossin et al. 2010; Daloz et al. 2015). If the NA is split into western and eastern regions divided at 60°W and TCs are counted by their genesis point, then the model ensembles have significant correlations of around 0.6 (see Table 4) in the western Atlantic but smaller (and for N96 not significant) correlations in the eastern Atlantic, though there are fewer storms here even when using the full length track rather than just the warm-core part. The reanalyses generally have a smaller difference between the correlation in the eastern and western Atlantic. The model correlations remain high as they are weighted by the greater numbers in the Gulf of Mexico, but this distribution significantly reduces the frequency of landfall on the eastern coast of the United States (not shown) and may change the likely model response in a future climate (see section 3e).
To further understand the links between tropical cyclone interannual variability and aspects of climate variability in the NA, Table 5 shows correlations between NA TC frequency and other modes of observed climate variability: the Atlantic meridional mode [AMM; a measure of meridional tropical SST gradient calculated using the SST method of Chiang and Vimont (2004) and Vimont and Kossin (2007, hereafter VM07) averaged over August–October], the Niño-3.4 index averaged over August–October, the Atlantic multidecadal oscillation (AMO), and AEW count as diagnosed previously. As described in VM07, the AMM acts as a strong constraint on TC frequency and hence is highly correlated with interannual storm counts in both models and reanalyses. The AMO, which is also thought to interact with the AMM (see VM07), is also significantly correlated with storm counts and is likely to have even higher correlations over longer time periods (Smith et al. 2010; VM07). El Niño influences the NA primarily through an increase in wind shear (Bell et al. 2014, and references therein). The correlation with Niño-3.4 variability is somewhat stronger in the N512 ensemble than found in either reanalyses or observations, while correlation with AEW frequency is marginally significant at the higher resolutions and in several reanalyses.
To gain insight into some aspects of the complex relationship between TC frequency and individual modes of climate variability, Fig. 10 shows relationships between TC frequency and these various modes individually, as well as between the AMM and the AEW number, to illustrate the correlations above: each model symbol denotes an ensemble mean for a given year, while each reanalysis dataset is shown individually. Although the interannual correlation of TC frequency and AEW number is relatively weak, as discussed previously there seems to be a relationship between mean AEW numbers and TC frequency, indicating that AEW activity is an important precursor (Chen et al. 2008; Serra et al. 2010). Overall the increased variance in TC frequency at higher resolution allows the N512 model to have a wider spread of points, which seems to bring the fitted lines closer to those from the reanalyses.
To further investigate this relationship from a multivariate perspective, multiple linear regression (Wilks 2011) has been attempted to determine which of these covariates play the most robust roles in the interannual variability and whether there is any systematic difference with resolution. Table 6 shows the coefficients fit using an ordinary least squares model between the TC interannual frequency (model or reanalyses) and the above indices of climate variability, together with the coefficient of multiple determination (R²). Perhaps the most interesting resolution-dependent aspect is the respective roles of the AMM and AMO: at N96, the exclusion of the AMM as a predictor makes no difference to R² and the AMM coefficient is not significantly different from zero. In contrast, for N216, N512, and reanalyses, the AMO coefficient is not bounded from zero and R² is unchanged when the AMO predictor is removed. This is undoubtedly due in part to the covarying of the AMO and AMM, which are themselves highly correlated at 0.8, but hints at more response to the AMM at higher resolutions. The N512 has the most similar covariability of the AMM and Niño-3.4 coefficients compared to the reanalyses with similar changes in R² when each predictor is removed in turn. The AEW variability seems to play a weak role in the TC interannual variability, particularly at low resolution: removing the AEW as a predictor makes little difference to R² for N96 and the reanalyses and only seems to explain an additional 4%–6% of variance at N216 and N512 (e.g., R² reducing from 0.5 to 0.46).
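The predictor-removal test described above can be illustrated with a short ordinary least squares sketch. The function names and the dictionary-of-arrays input are assumptions made for this example; it is not the code used to produce Table 6, and it reports no significance intervals.

```python
import numpy as np

def ols_r2(y, X):
    """Fit y = b0 + X @ b by ordinary least squares.

    Returns (coefficients including the intercept, coefficient of
    determination).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return coef, 1.0 - ss_res / ss_tot

def r2_drop_one(y, predictors):
    """Refit with each predictor removed in turn.

    predictors: dict mapping a name (e.g., "AMM") to a 1D array; at least
    two predictors are required.
    Returns (full-model R^2, {name: loss in R^2 when that predictor is
    removed}).
    """
    names = list(predictors)
    if len(names) < 2:
        raise ValueError("need at least two predictors")
    full = np.column_stack([predictors[n] for n in names])
    _, r2_full = ols_r2(y, full)
    loss = {}
    for n in names:
        kept = np.column_stack([predictors[m] for m in names if m != n])
        _, r2_reduced = ols_r2(y, kept)
        loss[n] = r2_full - r2_reduced
    return r2_full, loss
```

When two predictors are strongly correlated with each other, removing either one alone tends to change the fit very little, because the remaining predictor carries most of the shared signal; this is why the AMM and AMO results need to be read together.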
An attempt to visualize the combined four-dimensional relationship between NA TC frequency, the AMM index, the AEW count, and the Niño-3.4 index in August–October is shown in a supplementary material figure (available at the Journals Online website: http://dx.doi.org/10.1175/JCLI-D-14-00131.s1); the first three as axes and the last as a color shading, with dark red indicating strong El Niño conditions and dark blue indicating strong La Niña conditions, with the size of the circles scaled by the AMM index magnitude. This shows that in both models and reanalyses the AMM index is a strong constraint on TC activity, with the smallest and largest storm counts coinciding with the AMM index extremes (also Fig. 10c). Niño-3.4 conditions are also important, with the largest TC counts coinciding with strong La Niña conditions, though the strongest El Niño events do not necessarily produce the fewest storms (also seen in Bell et al. 2014); indeed, if coincident with high AMM index, they can have a relatively high TC activity. Similarly, in years where Niño-3.4 is weak (gray color) it is only at the high resolutions and the reanalyses that such years can attain stronger TC activity (also Fig. 10b), which points to a local forcing factor.
d. TC structure
The composite structure of the strongest 10 tropical cyclones from each model resolution ensemble is derived at peak intensity during the warm-core phase and aligned such that their direction of travel is northward, on a common 10° cylindrical grid. The cross section of wind speed along the northern axis/direction of travel from each of these composites is shown in Figs. 11a–c. As model resolution is increased, the winds get stronger as expected, with a tighter core (the radius of peak wind shrinks from about 3° at N96 to about 2° at N512, shown by the contour) and a more upright structure; however, Manganello et al. (2012) show that, with a resolution comparable to N512, their radius of maximum wind is closer to 0.75° with correspondingly stronger wind speeds, though this is still considerably larger than some observational studies such as Stern and Nolan (2009) of 0.5° or less. The relatively large size of the model storms is also consistent with the 10-m wind–MSLP relationship shown in Fig. 2; although the MSLP minima can be relatively deep, the associated wind speeds are much weaker than observations.
Associated with this wind field is a temperature anomaly (relative to the mean temperature over a 10° radius), and this is shown in Figs. 11d–f. The peak temperature anomaly occurs at between 200 and 300 hPa in all model resolutions, though the maximum increases from 4 to 7.5 K between N96 and N512, while the surface to peak anomaly is about 3 K at N96 and 5 K at N512. This is significantly smaller than, for example, that shown in models by Hill and Lackmann (2011) and Manganello et al. (2012) and in observations by Frank (1977), even when the anomaly is taken over the same radius (not shown), and may indicate why the tropical cyclones in the model are weak at all resolutions: one possibility is that the convective parameterization does not have its maximum heating at the correct height, which may limit the intensification of the storm. Further work is ongoing to understand the reasons for this.
e. Future climate results
The FC forcing is a strong perturbation to the PC using an RCP8.5 scenario at year 2100 from one IPCC AR5 model (HadGEM2-ES). It is also somewhat of a hybrid given that some aspects of the forcing, including aerosols and the baseline SST interannual cycle, remain the same as in the PC ensemble, while the greenhouse gases, ΔSST, and sea ice have been changed. Recent work in particular suggests that aerosol forcing plays an important role in TC climatology and variability (Booth et al. 2012; Dunstone et al. 2013), both directly via radiation and indirectly via their forcing of the SSTs. Regional TC changes are also sensitive to the pattern of SST change (Sugi et al. 2009), and this pattern is shown for June–August in Mizielinski et al. (2014).
The seasonal cycle for the FC integrations (Fig. 5) generally has a much lower amplitude in the NA and the whole SH, with a slightly enhanced amplitude in the NI. The NA and WP seem to have a slightly delayed peak in the cycle, the former consistent with the shift in the AEW seasonal cycle (Fig. 6) and with projected changes in SST as in Dwyer et al. (2012).
The change in track density distribution between FC and PC is shown in Fig. 12 for each model resolution. The main changes are broadly consistent with those found by Murakami et al. (2012a), with reductions in TC tracks in the western NA, WP, and throughout the SH, and a large increase in the CP. The most noticeable difference between the resolutions is the reduction in the eastern NA, particularly at N512; the lower resolutions have so few TCs here in the PC that little or no reduction is possible in the FC.
Some of these changes in track density are consistent with the change in GPI (Emanuel 2010) between FC and PC, as shown in Fig. 13, particularly in the CP and western NA/Gulf of Mexico. Examination of the relative change in the individual terms of the GPI (not shown) suggests that changes to wind shear play a dominant role here, with the terms involving relative vorticity and potential intensity also contributing (the latter particularly in the EP/CP likely because of SST increases), while the moist static energy ratio term generally declines. The NA TC change does not seem to be due to basic AEW properties, as these are similar or slightly more intense in the FC, as shown in Fig. 6, and have a slight shift to later in the season. The interannual variability of GPI in the NA is somewhat reduced, mainly in the later years of the simulation (Fig. 9). Since a disproportionate amount of TC genesis in the models is concentrated in the western NA region in the PC ensemble, the reduction in the FC is likely stronger than would be the case in a model with a better genesis distribution.
There are also regions where the GPI changes and track density differences do not agree. The NA has a strong GPI increase farther north, in a similar region to that shown in Emanuel (2013), suggesting a more amenable TC genesis environment. However, there is no evidence of an increase in explicit tracks (indeed a decrease at N512), suggesting that either there are no precursor vorticity structures to help genesis to occur or that the model resolutions are not sufficient to adequately represent TCs in this region. The far west of the WP also indicates opposite trends from GPI and tracked storms at all resolutions, with the GPI increase primarily due to the potential intensity term, with a smaller contribution from the wind shear. Broadly, the SH has similar trends in GPI and track density, with weaker SST warming than in the NH and generally small or negative changes to GPI components.
Figure 14 (top) shows the percentage change in average TC frequency between the FC and PC integrations, where the changes in the CP are 200%–250% but based on low frequency in the present climate (see Fig. 3). The total decrease, of around 20% at all resolutions, is dominated by a strong 50% decrease in the SH (as seen in Fig. 14) and a much smaller (to insignificant at higher resolution) decrease in the NH (also seen in Gleixner et al. 2014). There is a strong decrease in the NA, while the Pacific shows a TC track shift with slightly fewer at the edges of the basin and a big increase in the CP, as also found in Li et al. (2010) and Murakami et al. (2012b). The NI and SA are the only places where the different model resolutions disagree on the sign of frequency change, but these are relatively small in magnitude (and note caution about the NI simulation discussed earlier). Otherwise, there is no strong resolution sensitivity, with particularly the N216 and N512 results agreeing closely, which may be partly due to all models using the same SST forcing and hence being strongly constrained.
Although the TCs simulated by the model are rather weak in terms of 10-m wind speed, the change in intensity in the FC compared to PC, shown in Fig. 14 (bottom), shows the shift to higher intensities also seen in many previous studies (Zhao et al. 2009; Murakami et al. 2012b; Rathmann et al. 2013; Bell et al. 2013). This shift is more pronounced at N216 and N512 resolutions, with between 5% and 10% decrease in the weakest storms and small increases at higher intensities. However, these results should be treated with caution, given the prescribed, patterned future SST used here and the lack of coupling to the ocean, which can have important effects (Bell et al. 2013).
Since the FC SST forcing consists of the addition of a repeating annual cycle, one might expect that the NA interannual variability of TCs in the FC would be similar to the present climate but with the frequency modified by the change in mean state. This is found to be true when considering the correlations of the FC GPI with the observed TC frequency (0.71, 0.72, and 0.78 for N96, N216, and N512, respectively), which is just slightly lower than the PC GPI correlations with observed TCs (0.77, 0.77, and 0.8 from Table 5). This is obvious given the similarity in GPI for PC and FC (Fig. 9), which correlate at about 0.9 between themselves. As found previously, this confirms that the GPI is a good measure of relative normalized TC frequency in PC and indeed at the lower resolutions may be a better measure than the tracked TCs (Walsh et al. 2013). However, the correlation between explicitly tracked TCs in the FC and PC integrations is much less than 0.9, being 0.43, 0.51, and 0.66 for N96, N216, and N512, respectively. This suggests that there may be other factors that change the TC frequency in the FC that are not accounted for by the GPI. Examining the correlation between the GPI and explicit TC frequency in the FC (0.7, 0.8, and 0.82) and PC (0.88, 0.86, and 0.82) reinforces this result that the lower-resolution models have reduced correspondence between GPI and explicitly tracked TCs in the future climate.
This work has described the results from an ensemble of forced atmospheric model integrations at resolutions of 130, 60, and 25 km, using both present climate and an end-of-century future climate forcing. It has shown that, as model resolution is increased in a controlled and systematic way, the representation of tropical cyclones improves in terms of their climatology, frequency, seasonal cycle, and structure so that at 25-km resolution the model replicates many aspects of the observed climatology. The interannual variability in the North Atlantic and northwestern Pacific is in good agreement with the observed hurricane frequency with correlations of between 0.6 and 0.75, with realistic teleconnections with modes of climate variability such as the El Niño–Southern Oscillation.
The main biases in the models involve genesis regions and storm intensity. There is a deficit in tropical cyclone genesis in the eastern Atlantic, which is reduced at the highest model resolution, and this seems to be associated with the properties of the African easterly waves and African easterly jet. Conversely, there is an excess in tropical cyclones in the eastern and central Pacific, a bias also seen in other models (Bengtsson et al. 2007a), which may be due to a combination of factors including an increase in vorticity sources at higher resolution. Such biases can have important consequences for whether the storms make landfall, which is of great importance for future impacts studies and risk assessment. The intensity of the modeled tropical cyclones is weak as measured by 10-m wind speed, and even at 25-km resolution it only achieves category 1 status, whereas the minimum mean sea level pressure can reach 940 hPa: this is likely due to the storms remaining relatively large in size. The cause of this bias continues to be investigated: recent dynamical core improvements to the MetUM (Wood et al. 2014) have allowed further deepening of the TC MSLP minima but have only slightly increased the wind speeds.
The weakness of simulated TC intensity, even though the frequency is well represented in the present climate, is a particular problem when investigating projections of future changes in TC climatology, given the general consensus of increases in the intensity of the strongest storms (Zhao et al. 2009; Murakami et al. 2012b; Rathmann et al. 2013; Bell et al. 2013). An idealized strong climate change forcing using RCP8.5 shows a general reduction in TC frequency (particularly in the Southern Hemisphere and North Atlantic), with a shift in tracks in the Pacific from the basin edges to the central Pacific, such that many more occur near Hawaii (as seen in Murakami et al. 2012b). The reduction in the Atlantic seems to be due to aspects of the mean state such as increased wind shear, particularly in the Gulf of Mexico, since other precursor factors such as AEWs are mostly unchanged, though the seasonal cycle does peak later in the year. The relative frequency of weaker storms decreases while the strongest storms increase.
Further study of different tracking methods and identification criteria is required and is under way as part of the U.S. Climate Variability and Predictability Research Program (CLIVAR) Hurricane Working Group (HWG; http://www.usclivar.org/working-groups/hurricane; Walsh et al. 2013; Horn et al. 2015) since there continue to be uncertainties related to the details of these methods. There are a variety of tracking methods for explicitly simulated storms in climate models, although the typical model resolution used, for example, in the long CMIP5 integrations is too low (on average about 1.5°; IPCC AR5) to reliably contain the whole spectrum of TC intensities. Walsh et al. (2013) showed that lower-resolution models represent large-scale indices such as GPI better than storm structures explicitly. There are downscaling and seeding methods (Knutson et al. 2013; Emanuel et al. 2010) that allow explicit simulation of the storm structures, with much higher intensities being captured, but there are issues with the large-scale forcing of such models to make it consistent with the global model, as well as seeding rates and lack of feedback of the idealized TC on the environment. There are methods for assessing the TC climatology from the large-scale environment (Camargo 2013; Tory et al. 2013b), but as shown here this does not produce relative changes between model resolutions and often gives an opposite signed response under climate change forcing than does explicitly tracking the storms (Emanuel 2013), as seen in this study, particularly in the NA. The gradual increase in resolution possible for global, long-term climate integrations may eventually help to unpick this issue, but together with the need for ensembles of integrations this is expensive (Manganello et al. 2012; Mizielinski et al. 2014), both computationally and analytically.
An understanding of the observed TC genesis regions and the precursors for formation also continues to be important to enable closer comparison with models. The tracking algorithms can trace the source vorticity structure back a long way; in particular for storms that strengthen in the EP, the source vorticity in models is often in the Atlantic (Serra et al. 2010). Using satellite imagery to discover whether this also happens in the real world, as done in case studies by Serra et al. (2010) and Rappaport and Mayfield (1992), could then help improve understanding of these genesis precursors. Further development of high-resolution reanalysis datasets, perhaps with data assimilation methods such as that used in JRA-25 to enhance the winds around TCs (only for those of TS strength and above), would also help both to test tracking algorithms and to give further insight into the genesis, evolution, and decay of tropical cyclones.
From this work it is difficult to say very much about TC intensity changes in the future since the model storms are relatively weak, only up to category 3 and even then only based on mean sea level pressure criteria. Small improvements have been achieved with some enhancement to model dynamics, but the surface winds are significantly weaker than those found in other models of similar resolution. However, it is unclear whether models at this resolution, using parameterizations of convection, can properly represent the processes found in tropical cyclones and hence represent the intensity for the right reasons.
More work is needed to understand model biases in the eastern Atlantic, where TC genesis is low, both because the genesis region has an important influence on TC tracks and potential landfall and because projections in TC changes in a FC state will have the wrong sensitivity. Ongoing comparison with the coupled seasonal forecast model GloSea5 (Camp et al. 2014, manuscript submitted to Quart. J. Roy. Meteor. Soc.), which shares the same atmospheric component but has different TC biases, may be valuable here, as might further investigation of differences between models at high resolution such as in Shaevitz et al. (2015). There are strong indications that African easterly waves play an important role and that their representation is sensitive to both model resolution and potentially to coupling or initialization. It may be that regional models, in which the AEW properties could be controlled/modified as part of the lateral boundary condition, are particularly useful to understand further the influence on tropical cyclone formation.
Although there is no evidence of convergence of TC frequency with resolution in this study, particularly in the North Atlantic, more recent work using MetUM global models with resolutions up to 12 km suggests that there is little further increase in global TC frequencies using the same tracking algorithm but a redistribution in intensities with fewer of the weakest storms being retained. If this could be confirmed and also found in other models, it would be a strong place from which to investigate and understand why the observed global TC frequency stays within a relatively narrow range.
The authors thank the members of the U.S. CLIVAR Hurricane Working Group for many useful discussions on aspects of this analysis and the three anonymous reviewers for their helpful comments for improving this manuscript. We also thank the large team of model developers, infrastructure experts, and all the other essential components required to conduct such a large-scale simulation campaign, in particular the PRACE infrastructure and the Stuttgart HLRS supercomputing center, as well as the STFC CEDA service for data storage and analysis using the JASMIN platform. This work was supported by the Joint U.K. DECC/Defra Met Office Hadley Centre Climate Programme (GA01101). M.-E. Demory and P. L. Vidale acknowledge the National Centre for Atmospheric Science Climate directorate (NCAS-Climate) (Contract R8/H12/83/001) for the High Resolution Climate Modelling (HRCM) program, and R. Schiemann acknowledges Natural Environment Research Council (NERC)–Met Office Joint Weather and Climate Research Programme (JWCRP) HRCM funding. Prof. P. L. Vidale acknowledges the support provided to the Willis Chair in Climate System Science and Climate Hazards. We acknowledge use of the MONSooN system, a collaborative facility supplied under the JWCRP.
Supplemental information related to this paper is available at the Journals Online website: http://dx.doi.org/10.1175/JCLI-D-14-00131.s1
Current affiliation: Met Office Hadley Centre, Exeter, United Kingdom.
This article is included in the US CLIVAR Hurricanes and Climate special collection.