A simple index of extreme surface (2 m) monthly temperature was analyzed over the conterminous United States for 13 models from the Coupled Model Intercomparison Project phase 5 (CMIP5) hindcast (1981–2010) and prediction (2006–35) datasets as well as the U.S. climate division dataset, version 2 (nClimDiv), as observations for 1981–2010. Results are analyzed for regions defined in the recent Third U.S. National Climate Assessment. There is good agreement between models and observations for all regions for the annual warm and cold indices except for the warm index in the Northwest. For seasonal values of the temperature index, model simulations generally agree with the sign of the observed seasonal trends in all regions except for the Northwest and a few seasons in the “warming hole” areas of the central and southeastern United States. Most individual ensemble member simulations agree with the sign of the observed trend. However, in all regions and seasons, some simulations, in the range of 10%–40% of all ensemble members, show opposite signs, indicating that even overall skillful projections can have substantial uncertainty. These results indicate that there is potential skill in use of GCMs to provide projections of hot and cold extremes on the 30-yr time scale. However, it is important to note that natural variability is comparable to the forced signal on this time scale and thus introduces uncertainty. Analysis of the future simulations (2006–35) indicates that warm extremes increase rapidly while cold extremes become substantially more rare.
Numerous studies have examined U.S. mean and extreme temperature trends. An overall decrease in the number of cold days and nights and an increase in the number of warm days and nights have been observed (IPCC 2012). In a study using the U.S. Historical Climatology Network (USHCN), Lee et al. (2014) concluded that extreme temperatures (both daily maximum and minimum) are increasing and specifically that high minimum temperatures are increasing more than maximum temperatures. Karl et al. (2012) and Peterson et al. (2008) found similar results over the United States and North America. DeGaetano and Allen (2002) and Meehl et al. (2009) found that not only are the extreme high minimum and maximum temperature trends increasing but also that the number of cold extremes is decreasing. Peterson et al. (2013) also concluded that over the past several decades, heat waves have been increasing and cold waves have been decreasing.
Extreme temperature events can have a multitude of adverse effects, both direct and indirect. One obvious direct effect is on heat-related deaths and potential increases in these types of deaths (Maloney and Forbes 2011). One indirect effect is on drought. Severe droughts can have devastating consequences on agriculture, including crops and livestock as well as water storage and supplies. Increasing trends in extreme temperatures can exacerbate drought conditions by increasing evaporation rates and the rate of depletion of soil moisture and thereby decreasing the supply of atmospheric water vapor, especially in the plains states and Texas (Hoerling et al. 2013; Namias 1983). Increased rates of soil moisture loss associated with extreme temperatures can also increase risks for wildfire (Park Williams et al. 2013, 2014).
Extreme temperatures can also significantly alter forest ecosystem carbon dioxide exchange, that is, transforming a forest from a net carbon sink into a net carbon source (IPCC 2012). Extreme warm years can lead to sustained decrease in carbon dioxide uptake by ecosystems (Arnone et al. 2008). Changes in extreme temperatures can also have an impact on surface ozone levels (Sanderson et al. 2003).
As global mean temperatures rise, temperatures during extreme events are certain to be higher in the mid to late twenty-first century (Meehl and Tebaldi 2004; Kunkel et al. 2010). On the shorter 10–30-yr time frame, climate model simulations indicate that uncertainties in projections of the future arise principally from internal climate variability and model-to-model differences (Hawkins and Sutton 2009). Temperatures in the continental United States increased by about 0.7°C over the 30-yr period of 1981–2010 (Walsh et al. 2014), and the Coupled Model Intercomparison Project phase 5 (CMIP5) generation of climate models simulates this increase well (Sun et al. 2015) with a considerable model spread. This suggests that there could be sufficient skill in projecting extreme temperature conditions to provide actionable information to decision-makers.
The U.S. National Weather Service’s seasonal climate outlooks identify areas where there is believed to be some skill in anticipating temperature and precipitation out to a year in advance. The likelihood of abnormal conditions is typically indicated as 5%–20% above nominal climatological values. Although this level of skill is not high in comparison with the skill achieved in forecasting weather out to a few days in the future, these outlooks have been used for planning in many sectors (Rajagopalan et al. 2002). Our goal is to identify aspects of near-term (10 to 30 yr) climate where similar skill levels can be achieved. There is relatively high confidence in continued warming, and this provides a robust foundation for skillful outlooks.
This study analyzed the CMIP5 hindcast and prediction ensembles (Taylor et al. 2012). Of specific interest were the 30-yr hindcast simulations, initialized with conditions beginning in 1980, leading to a simulation period of 1981–2010. The initial climate state may exert some influence in the early part of the period, but anthropogenic influence will surely be more important in the latter part and provide the basis for skillful forecasts. The prediction simulation of interest is the 2006–35 time period. The 1981–2010 hindcast can be compared with observations.
Meehl et al. (2014) provided a thorough review of progress and potential of decadal-scale predictions based on findings from analyses of the CMIP5 hindcast simulations. They indicate spatially variable skill of surface air temperature in the 6–9-yr time frame with high skill over the North Atlantic and western Pacific and moderate skill over North America. For North Atlantic upper-ocean temperatures, the skill derived from initial conditions is dominant before year 8, after which long-term anthropogenic forcing becomes the primary source.
The study here will present analysis of a simple extreme temperature index for 1981–2035, which will include observations, CMIP5 hindcast data, and CMIP5 prediction data. This paper is organized with a description of the datasets used in this study followed by a description of the methodology used to calculate the extreme temperature index. Then results are presented followed with a discussion and conclusions.
2. Methods and datasets
The CMIP5 includes a number of different experiments, including ones to better understand feedbacks associated with the carbon cycle and clouds, to explore climate predictability on decadal time scales, and to determine reasons why similarly forced models produce a wide range of responses. For our study, we use the decadal hindcasts and prediction simulations, which include model integrations for 10–30-yr intervals. There are three 30-yr simulations: 1960–90, 1980–2010, and 2005–35. Of most relevance to our study are the 1980–2010 hindcast and 2005–35 prediction simulations. The 1980–2010 period is characterized by rapidly rising greenhouse gas concentrations and global average temperatures. The hindcast–prediction experiment utilizes atmosphere–ocean global climate models (AOGCMs) that are initialized by observed conditions in 1980/2005 and include observed and projected time-varying concentrations of various atmospheric constituents, including greenhouse gases and volcanic eruptions (Taylor et al. 2012). These are fully coupled simulations. As such, they will not in general capture the exact timing of historical occurrences, such as specific El Niño–Southern Oscillation (ENSO) events. The model simulation data used here are monthly surface temperature from 1981 to 2010 and 2006 to 2035. Table 1 is a list of the 13 models selected for this study. The number of ensemble members for each selected model is also included in Table 1. For the 1981–2010 hindcast, all 13 models are used; however, for the 2006–35 prediction, only 10 of the 13 are available for use. The excluded models are the CCSM4, FGOALS-g2, and EC-EARTH models. The 13 (10) CMIP5 models selected for this study are based on those models that output the hindcast (prediction) data.
The U.S. climate division dataset, version 2 (nClimDiv; http://www.ncdc.noaa.gov/monitoring-references/maps/us-climate-divisions.php) is used as the observational surface temperature data. This dataset covers the conterminous United States for the period 1895–present at a monthly resolution. The underlying data used to compute climate division values are a gridded product on a nominal 5-km latitude–longitude grid. The gridded data are derived from a network of 10 325 stations. This dataset incorporates a number of improvements from version 1: (i) inclusion of additional station networks with improved quality assurance and temperature bias adjustments; (ii) addition of maximum and minimum temperatures to the climatic elements in the data; and (iii) the application of a new computational approach, which employs climatologically aided interpolation to address topography and network variability (Vose et al. 2014). The aggregation of the observations from the 5-km grid to the much coarser GCM grid is not expected to affect the results because there is considerable spatial coherence in the recent observed temperature trends, as shown by Walsh et al. (2014).
To consistently and directly compare extreme temperature metrics, the CMIP5 model data and the observational data are interpolated onto a common 1.5° × 1.5° grid. An extreme temperature index is defined based on standardized temperature anomalies (STAs). At each grid point, anomalies are calculated by subtracting the 1981–2010 long-term gridpoint monthly mean and then standardized by dividing by the gridpoint monthly standard deviation (σ). The mean and standard deviation values for 1981–2010 are applied to the 2006–35 simulation to calculate STA values. Extreme positive anomalies are defined as STA > +1.5 and extreme negative anomalies are defined as STA < −1.5. National values of the positive and negative extreme temperature indices are derived by counting the number of grid points over the United States for each month of each year with extreme values of the STA. The monthly values were then summed to seasonal and annual values. A regional analysis was also performed, and the regional values of the indices consisted of counts of extreme values of the indices for each of six climate regions (Fig. 1). The definitions of regions are the same as those used in the recent Third National Climate Assessment (Melillo et al. 2014). The choice of ±1.5 as the STA threshold definition of extremes is somewhat arbitrary and represents a compromise between sufficient events for robust statistical analysis and events that are impacts relevant. Using a ±2.0 STA yielded too few results, and ±1.0 STA was determined to not be extreme enough for this study.
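The index definition above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the processing code used in the study: the array layout (year, month, lat, lon), the synthetic temperatures, the sample standard deviation (`ddof=1`), and the function names are all hypothetical. The one feature taken directly from the text is that the 1981–2010 mean and standard deviation are reused to standardize the 2006–35 simulation.

```python
import numpy as np

def standardized_anomalies(temps, clim_mean, clim_std):
    """STA = (T - monthly gridpoint mean) / monthly gridpoint std."""
    return (temps - clim_mean) / clim_std

def extreme_counts(sta, threshold=1.5):
    """Count grid points per (year, month) with STA beyond +/- threshold."""
    warm = (sta > threshold).sum(axis=(2, 3))
    cold = (sta < -threshold).sum(axis=(2, 3))
    return warm, cold

# Synthetic stand-ins (assumption): 30 years x 12 months on a 4 x 5 grid.
rng = np.random.default_rng(0)
hindcast = rng.normal(10.0, 2.0, size=(30, 12, 4, 5))    # "1981-2010"
prediction = rng.normal(11.0, 2.0, size=(30, 12, 4, 5))  # "2006-35", 1 degree warmer

# Climatology from the hindcast period only, per calendar month and grid point.
clim_mean = hindcast.mean(axis=0)
clim_std = hindcast.std(axis=0, ddof=1)  # ddof=1 is an assumption

sta_hind = standardized_anomalies(hindcast, clim_mean, clim_std)
sta_pred = standardized_anomalies(prediction, clim_mean, clim_std)

warm_hind, cold_hind = extreme_counts(sta_hind)
warm_pred, cold_pred = extreme_counts(sta_pred)

# Monthly counts summed to annual values, as described in the text.
annual_warm_hind = warm_hind.sum(axis=1)
annual_warm_pred = warm_pred.sum(axis=1)
```

Because the later period is standardized against the earlier climatology, a uniform warming shifts the whole STA distribution upward, which raises the warm counts and lowers the cold counts even when variability is unchanged.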
3. Results
a. Annual extremes index
Figure 2 shows time series of annual counts (expressed as a percentage of the total number of grids) of the number of positive (warm) extreme temperature events for the 1981–2035 period for each region. The three curves are the hindcast (1981–2010, blue), prediction (2006–35, red), and observed (1981–2012, black). The dashed black line is the observed linear trend for 1981–2010 for direct comparison with the hindcast (1981–2010) trends discussed later in the paper. Each of the regions shows increasing trends for both the observations and the model mean (both the 1981–2010 and the 2006–35) with the exception of the Northwest (NW) region. For the NW, the observed trend is downward, while the model mean trend (1981–2010 and 2006–35) is upward. However, the 2006–35 model simulations indicate a rather sharp increasing trend in warm extremes. The Northeast (NE) region shows almost identical trends for the observations and the model mean. For the Southeast (SE), Midwest (MW), Great Plains (GP), and Southwest (SW), the trend magnitudes are somewhat different between models and observations. The 1981–2012 observed trend (not shown) and the 1981–2010 model mean show better agreement than the 1981–2010 observed trends in the MW and GP because of the extreme temperatures in 2012. The differences in the observed trends of 1981–2010 and 1981–2012 (especially in the GP and MW) show a sensitivity of the linear trend to interannual variability. To illustrate this sensitivity for the NW, the years of 1986–88 and 1992 are set to missing since these are the largest values for the time period except 2003. The resulting linear trends without these years yield a slight increase in the observed trends. The magnitude is still smaller than the 1981–2010 model mean; however, the model mean and the observed trends now have the same sign, that is, increasing warm extremes.
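The sensitivity test described for the NW (setting a few high-value years to missing and refitting) can be illustrated with a short sketch. The series below is synthetic and chosen only so that early spikes reverse the sign of a weak upward trend; the helper name `trend_per_decade` and the use of an ordinary least squares fit are assumptions, since the fitting method is not stated in the text.

```python
import numpy as np

def trend_per_decade(years, values, exclude=()):
    """Least squares linear trend (units per decade), skipping excluded years."""
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    excluded = np.asarray(list(exclude), dtype=float)
    keep = ~np.isin(years, excluded)
    slope, _intercept = np.polyfit(years[keep], values[keep], 1)
    return 10.0 * slope

years = np.arange(1981, 2011)
# Synthetic index (assumption): weak upward background of 0.5 per decade ...
index = 5.0 + 0.05 * (years - 1981)
# ... plus large values in the years that the text sets to missing.
spike_years = [1986, 1987, 1988, 1992]
index[np.isin(years, spike_years)] += 15.0

full_trend = trend_per_decade(years, index)
trimmed_trend = trend_per_decade(years, index, exclude=spike_years)
```

In this constructed case `full_trend` is negative while `trimmed_trend` recovers the positive background slope, which mirrors the qualitative behavior reported for the NW without reproducing its actual values.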
The sensitivity of the trends to ENSO was investigated by alternately removing events occurring during the warm and cold phases and then recalculating the regional trends (noted as T-EN and T-LN for trends with El Niño and La Niña events removed, respectively). Warm phase events were defined as those occurring during months when the 3-month running average (centered on the month) of the oceanic Niño index (ONI) met the criterion ONI > +1.0 (http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ensoyears.shtml). Cold phase events were defined as those occurring in months meeting the criterion ONI < −1.0. The number of warm phase months is 23, 18, and 5 in the decades of 1981–90, 1991–2000, and 2001–10, respectively. The number of cold phase months is more uniformly distributed at 11, 19, and 11 in 1981–90, 1991–2000, and 2001–10, respectively. The removal of La Niña events results in an increase (more positive) in the upward trend in warm event extremes and a decrease (less negative) in the downward trend in cold extremes. There is no systematic change in trends for the removal of El Niño events. The magnitude of the change in trends is small for both the T-EN and T-LN sets of trends, generally about 0.1% of the total number of grids per decade or smaller, relative to the base trends, which are 0.3% per decade or higher in most cases. There are no cases where the sign of the trend is changed.
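A minimal sketch of the T-EN and T-LN calculation follows, assuming a monthly series that stands in for the index underlying the ONI and a monthly extremes series on the same time axis. Both series are synthetic, the function names are hypothetical, and the built-in suppression of warm extremes during cold phase months is an illustrative assumption rather than a result from the paper.

```python
import numpy as np

def centered_3month_mean(monthly):
    """3-month running mean centered on each month (NaN at both ends)."""
    monthly = np.asarray(monthly, dtype=float)
    out = np.full(monthly.shape, np.nan)
    out[1:-1] = (monthly[:-2] + monthly[1:-1] + monthly[2:]) / 3.0
    return out

def enso_masks(oni, threshold=1.0):
    """Boolean masks for warm (ONI > +1.0) and cold (ONI < -1.0) months."""
    oni = np.asarray(oni, dtype=float)
    with np.errstate(invalid="ignore"):  # NaN end points compare as False
        warm = oni > threshold
        cold = oni < -threshold
    return warm, cold

def trend_per_decade(time_years, values, drop=None):
    """Linear trend per decade, optionally dropping masked months."""
    keep = np.ones(values.shape, dtype=bool) if drop is None else ~drop
    slope, _intercept = np.polyfit(time_years[keep], values[keep], 1)
    return 10.0 * slope

months = np.arange(360)                      # 30 years of months
time_years = 1981.0 + months / 12.0
sst_anom = 1.6 * np.sin(2.0 * np.pi * months / 45.0)  # synthetic ENSO-like cycle
oni = centered_3month_mean(sst_anom)
warm, cold = enso_masks(oni)

# Synthetic warm-extremes index: linear rise, lowered in cold phase months.
index = 2.0 + 0.03 * (time_years - 1981.0) - 1.0 * cold

base_trend = trend_per_decade(time_years, index)
t_en = trend_per_decade(time_years, index, drop=warm)  # El Nino months removed
t_ln = trend_per_decade(time_years, index, drop=cold)  # La Nina months removed
```

Comparing `t_en` and `t_ln` with `base_trend` gives the kind of difference the text reports as generally about 0.1% of grids per decade or smaller.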
As expected, the observations show much more interannual variability than the model mean (Fig. 2) because of the averaging of the 77 ensemble members for 1981–2010 (64 ensemble members for 2006–35) in calculating the model mean, which dampens the variability since there is no reason that occurrences of extremes in the models will be coincident in time among models. It is very evident that the two hindcast/prediction datasets for the overlapping years 2006–10 yield different results. For all six regions, the 2006–35 warm extreme temperature index shows higher fractions of each region than the 1981–2010 warm index. Finally, in all regions, the modeled trend for 2006–35 is substantially larger than for 1981–2010, that is, the models simulate an accelerating upward trend in this extreme metric.
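The damping of interannual variability by ensemble averaging can be demonstrated with synthetic data. The sketch below assumes that each member is a common forced trend plus independent noise, which is an idealization: real ensemble members from the same model are not fully independent, and the numbers used here are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
n_members, n_years = 77, 30

# Assumption: a shared forced signal plus independent member noise.
forced = np.linspace(4.0, 8.0, n_years)
members = forced + rng.normal(0.0, 2.0, size=(n_members, n_years))
ensemble_mean = members.mean(axis=0)

def detrended_std(series):
    """Standard deviation of residuals about a least squares linear fit."""
    t = np.arange(series.size)
    fit = np.polyval(np.polyfit(t, series, 1), t)
    return (series - fit).std(ddof=1)

single_member_spread = detrended_std(members[0])
ensemble_mean_spread = detrended_std(ensemble_mean)
```

With independent noise, the spread of the mean shrinks roughly as one over the square root of the number of members, so a 77-member mean is expected to look far smoother than any single realization, including the single realization that the observations represent.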
Time series of the number of negative (cold) extreme temperature events (Fig. 3) show good agreement between model means from 1981 to 2010 and observed trends for all the regions. The 1981–2010 model means exhibit peaks in the early 1990s, presumably reflecting the effects of the Mount Pinatubo eruption, which is incorporated into the forcing data that drive the climate models (Halpert et al. 1994). The observations show smaller peaks, except for the MW and the GP, where the observations and the 1981–2010 model mean are in close agreement. This difference between the model mean and observations could be the result of several factors, including oversensitivity of some of the models to aerosol forcing, uncertainties in the forcing data, or natural variability muting the observed response to the volcanic eruption. Given the large number of models and associated multiple ensemble members, further analysis is beyond the scope of this paper. The 2006–35 data show very low values, indicating that the overall warming has decreased the instances of STA values less than −1.5 to near zero.
In summary, the model simulations are in agreement with the general observed increase in the number of warm events and decrease in the number of cold events for 1981–2010. The observed results are consistent with prior studies (e.g., DeGaetano and Allen 2002) and consistent with the overall global warming due to increasing concentrations of greenhouse gases. For 2006–35, model projections indicate an acceleration of the upward trend in warm extremes and the virtual disappearance by 2035 of cold extremes. Seasonal behavior is explored in the following section.
b. Seasonal extremes index
Figure 4 presents the trends of the warm extreme temperature index for the four seasons (DJF, winter; MAM, spring; JJA, summer; and SON, fall) as a scatterplot of observed trend and model ensemble mean trend (both trends are in percent per decade) for the six regions. The model mean trends are positive (upward) for all six regions and for all four seasons. The observed regional trends are mostly positive, the exceptions being the GP in MAM, the MW in JJA, the SE in DJF, and the NW for all four seasons. Although most of the regions have the same sign between observed and model trends, most of the regional model trends (except the NW) are larger than those of the observations.
Figure 5 is similar to Fig. 4 but for the cold extreme temperature index. The model mean trends are downward for all six regions and the four seasons. The observed trends are also all downward except for the SE in winter and the NW in spring. The magnitudes of the observed trends tend to be smaller than those of the modeled trends.
Figures 4 and 5 show model ensemble mean trends. Further analysis examines the trends of all the ensemble members individually and compares the distribution of those trends with the observations. Since the model mean trend in some of the regions, especially the NW, does not agree well with the observed trends, it is useful to explore if any of the ensemble members agree well with the observed trends when the overall model mean does not.
Good agreement between the model mean trend and the observed trend was found for the NE (Figs. 2 and 3). Figure 6 shows the cumulative distribution functions (CDFs) of the trends for the 77 ensemble member set for the NE region for each season for both warm and cold temperature extremes. In each season, the distribution is quite large and spans zero, that is, the ensemble set includes trends of both signs. This shows that natural internal variability is sufficiently large to potentially account for differences in trend sign between models and observations. When considering just the sign of the trend, for winter, ~65% of the ensemble members agree for the warm index and ~82% for the cold index with observations. Since the CanCM4 alone has 18 members of the total 77 and thus heavily weights the results, we investigated whether the results are different for the rest of the models. The red dashed lines in Fig. 6 are the CDFs without the CanCM4 included. It is clear that the resulting CDFs are not much different from the CDFs with all the members included. Results for spring show ~75% for the warm index and ~87% for the cold index in agreement on the trend sign. Summer results show ~80% for the warm index and ~96% for the cold index in agreement with the observed trend. For fall, the warm index has ~71% and the cold index has ~94% agreement with the observed trend. In all seasons, large percentages of the simulations agree in sign with the observed trends.
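The sign-agreement percentages and the check on a single heavily represented model can be sketched as follows. The member trends are random numbers, the observed value is arbitrary, and the function names are hypothetical; only the ensemble size of 77 and the 18-member subset come from the text.

```python
import numpy as np

def sign_agreement(member_trends, observed_trend):
    """Percent of ensemble members whose trend sign matches the observed sign."""
    member_trends = np.asarray(member_trends, dtype=float)
    return 100.0 * np.mean(np.sign(member_trends) == np.sign(observed_trend))

def empirical_cdf(values):
    """Sorted values and cumulative probabilities for plotting a CDF."""
    x = np.sort(np.asarray(values, dtype=float))
    p = np.arange(1, x.size + 1) / x.size
    return x, p

rng = np.random.default_rng(1)
trends = rng.normal(0.4, 0.6, size=77)        # synthetic member trends
from_large_model = np.zeros(77, dtype=bool)
from_large_model[:18] = True                  # stand-in for the 18-member model
observed = 0.3                                # synthetic observed trend

agree_all = sign_agreement(trends, observed)
agree_without = sign_agreement(trends[~from_large_model], observed)
x_all, p_all = empirical_cdf(trends)
x_sub, p_sub = empirical_cdf(trends[~from_large_model])
```

Plotting `(x_all, p_all)` against `(x_sub, p_sub)` gives the same kind of comparison as the solid and red dashed curves described for Fig. 6.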
The comparison between the observed trend and the ensemble CDF was quantified as the ensemble CDF percentile value of the observed trend. For example, in Fig. 6, the observed trend for the winter warm index intersects the CDF at approximately the 45th percentile; it is this value that is plotted in Figs. 7 and 8. Good agreement is characterized as percentile values in the middle of the distribution, for example, in the range of 25–75 (interquartile range). Poor agreement is characterized as trend values outside of the range of the ensemble set or on the extreme tails of the distribution (<5th or >95th percentile).
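The percentile metric can be written compactly. In this sketch the percentile is taken as the fraction of member trends at or below the observed trend, which is one common empirical CDF convention and an assumption here; the label "intermediate" for values between the two stated ranges is also a hypothetical addition.

```python
import numpy as np

def percentile_of_observed(member_trends, observed_trend):
    """Ensemble CDF percentile (0-100) at which the observed trend falls."""
    member_trends = np.asarray(member_trends, dtype=float)
    return 100.0 * np.mean(member_trends <= observed_trend)

def classify_agreement(percentile):
    """Good: within 25-75. Poor: below 5 or above 95. Otherwise intermediate."""
    if percentile < 5.0 or percentile > 95.0:
        return "poor"
    if 25.0 <= percentile <= 75.0:
        return "good"
    return "intermediate"

# Synthetic ensemble: 100 evenly spaced trends, so percentiles are easy to read.
members = np.arange(1, 101, dtype=float)
pct = percentile_of_observed(members, 45.0)
label = classify_agreement(pct)
```

An observed trend below every member returns 0 and one above every member returns 100, so trends outside the ensemble range fall into the "poor" class automatically.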
The results for the warm index (all seasons and regions) are shown in Fig. 7. All observed values are within the 5th–95th percentile range, indicating that the ensemble set includes the actual observed outcome in all seasons and regions. However, there is variability in the closeness of agreement, as several observed values are outside of the interquartile range. The winter season shows the best agreement between the observed trend and the CMIP5 ensemble member CDFs as all regions except for the SE are within the 40th to 60th percentile range. The SE percentile value is around 20. For spring, trends in three regions are outside of the interquartile range: NE (86th percentile), NW (12th percentile), and GP (16th percentile). For summer, two regions are outside the interquartile range: NW (13th percentile) and MW (9th percentile). For fall, the spread in percentile values is the greatest among the regions, and three regions are outside the interquartile range: NW (22nd percentile), NE (80th percentile), and MW (87th percentile).
The cold index results (Fig. 8) show poor agreement for two regions in the winter. The percentile values of the observed trend for the NW and the GP are both below the fifth percentile, indicating that the observed trends, which are downward, are larger in magnitude than almost all of the model trends. The MW region winter value (23rd percentile) is also outside of the interquartile range. For spring and fall, observed trends for four of the regions are within the interquartile range. In both seasons, the NW and GP observed trends are in the 75th–95th percentile range. For summer, observed trends for four of the regions are in the 75th–95th percentile range. Only the NW and SW are within the interquartile range.
Overall, the results of Figs. 7 and 8 suggest that the observed trends are generally consistent with the trends in the model ensemble set. For the warm (cold) index, 15 (12) of the seasons and regions (out of a total of 24 for each index) have observed trends within the interquartile range. Only 2 of the 48 (for both extremes) are outside of the 5th–95th percentile range. These values are fully consistent with a random sampling of the ensemble set.
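The random-sampling statement can be made concrete with a simple binomial calculation. If the observed trend behaved like one more draw from the ensemble, its percentile would be uniformly distributed, so about half of the cases should fall in the interquartile range and about 10% outside the 5th–95th percentile range. The sketch below treats the region and season cases as independent, which is an assumption made only for illustration; neighboring regions and seasons are likely correlated.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)

# Expected counts under a uniform percentile distribution.
expected_in_iqr = 24 * 0.5            # per index: 24 season-region cases
expected_outside_5_95 = 48 * 0.10     # both indices combined

# How unusual are the reported counts under this idealized null model?
p_at_least_15_in_iqr = sum(binom_pmf(k, 24, 0.5) for k in range(15, 25))
p_at_most_2_outside = sum(binom_pmf(k, 48, 0.10) for k in range(0, 3))
```

Under these assumptions the reported counts of 15 and 12 inside the interquartile range sit close to the expected value of 12, and observing only 2 cases outside the 5th–95th percentile range is not improbable, so neither result points to a systematic mismatch between the ensemble and the observations.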
There is variation among the models in the level of agreement with observations. An individual model assessment of agreement with the signs of the observed trends is summarized in Fig. 9 (warm index) and Fig. 10 (cold index), which display the percentage of ensemble members in agreement with the sign of the observed trend. For the NE region, most models have a majority of their members in sign agreement with observations. The major exception is CNRM-CM5 for warm extremes, for which most ensemble members are in sign disagreement in most seasons. For the other regions, the individual ensemble members are mostly in agreement with the sign of the observed trend for most models. Exceptions are those regions and seasons for which the sign of the observed trend is opposite the expectation for a warming world. These include downward trends in warm extremes for winter in the SE, spring in the GP, summer in the MW, and spring, summer, and fall in the NW, and upward trends in cold extremes for spring in the NW. In these cases, the sign of the simulated trend for most ensemble members is opposite the observed, although in all cases there are individual ensemble members that simulate both the sign and magnitude of the observed trend.
The CNRM-CM5 shows somewhat greater disagreement with observed trends than the other five models. The disagreement with the sign of the trends occurs primarily for the NE and SE regions. An examination of the trend of each ensemble member indicates that the model range includes the observed trend in all cases except JJA. In JJA, the model-simulated trends are in all cases less than the observed trend. An examination of mean summer temperature indicates no trend for the CNRM-CM5 multiensemble mean while observed temperatures are upward for 1980–2010 (Kunkel et al. 2013a,b). This model behavior is likely due to natural variability since the external forcing in all of the CMIP5 historical model simulations results in increases of the radiative forcing during this period (Myhre et al. 2013).
A metric of future changes in the extremes index was quantified as the block average difference in the index between the later and earlier periods, specifically the 2006–35 average minus the 1981–2010 average. The variations among ensemble members in future changes in extremes are displayed in Fig. 11 (warm index) and Fig. 12 (cold index). For the warm temperature extreme index (Fig. 11), the great majority of ensemble members show increased occurrences in all regions. The multimodel mean difference indicates an approximate doubling of occurrences in all regions. However, there are a few members that indicate slight decreases, illustrating that natural variability can cause future changes in the opposite direction of overall global warming. For the cold temperature extremes index (Fig. 12), all ensemble members indicate decreases in occurrences in the future period. The multimodel mean differences indicate decreases of about 60% or more in all regions.
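The block average difference is straightforward to compute per ensemble member. The sketch below assumes that each member has a value in both periods and that members can be paired across the two experiments, which may not hold for the actual CMIP5 archive (77 hindcast members versus 64 prediction members); the index values are synthetic and the function name is hypothetical.

```python
import numpy as np

def block_average_change(early_index, late_index):
    """Per-member change: later-period mean minus earlier-period mean.

    Both inputs have shape (members, years).
    """
    early = np.asarray(early_index, dtype=float).mean(axis=1)
    late = np.asarray(late_index, dtype=float).mean(axis=1)
    return late - early, early, late

rng = np.random.default_rng(7)
hind = rng.normal(6.0, 1.5, size=(10, 30))    # synthetic 1981-2010 warm index (%)
pred = rng.normal(12.0, 1.5, size=(10, 30))   # synthetic 2006-35 warm index (%)

change, early, late = block_average_change(hind, pred)
ratio_of_means = late.mean() / early.mean()   # near 2 corresponds to a doubling
fraction_increasing = np.mean(change > 0)     # share of members with increases
```

The spread of `change` across members corresponds to the member-to-member variation displayed in Figs. 11 and 12, and `ratio_of_means` is one way to express a statement such as "approximate doubling."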
4. Discussion and conclusions
Mean annual temperature has warmed since 1980 in all regions of the United States (Walsh et al. 2014). In general, monthly temperature extremes have changed in the same way, with hot extremes increasing and cold extremes decreasing, with some key exceptions discussed later. Greenhouse gas (GHG) forcing has increased continuously since 1980, and global temperatures have risen substantially. An expectation is that global climate models forced by this increase in GHGs will simulate increases in warm extremes and decreases in cold extremes. This is generally the case, as the models mostly simulate the correct sign of the observed trend.
Most models have an ensemble of simulations. These provide some quantification of the effects of natural variability. The majority of simulations agree in sign with observed trends, indicating that there is skill in use of such simulations as projections of the future. However, in all seasons and regions, there are some members, in the range of 10%–40% of all members, with opposite signs. For use as 30-yr projections, this indicates that even under strongly increasing forcing, any projections, though skillful, will be associated with substantial uncertainty.
The primary source of skill is presumably the long-term trend in external forcing from greenhouse gas concentration increases. Meehl et al. (2014) indicate that the contribution to skill from initialization is relatively small in years 2–9 over the United States, and the contribution would be even smaller after year 9.
The well-studied “warming hole” (Pan et al. 2004; Rogers 2013) in the central and southeastern United States is also accompanied by a lack of upward trends in warm extremes in winter in the Southeast, spring in the Great Plains, and summer in the Midwest. In general, climate models fail to simulate the lack of mean warming in these regions (e.g., Kunkel et al. 2006), and in this study, the trends in most ensemble members are of the opposite sign to the observed trends. However, there are some ensemble members that simulate a lack of upward trend in warm extremes in these seasons and regions. This is similar to Meehl et al. (2012), who found ensemble members from the CCSM that simulated the annual mean temperature “warming hole.” In fact, all of the observed trends are within the ranges of the ensemble set of trends.
We conclude that there is potential skill in use of GCMs to provide projections of hot and cold extremes on the 30-yr time scale. External forcing appears to be the dominant source of skill on this time scale. Analysis of the future simulations (2006–35) indicates that warm extremes increase rapidly, while cold extremes become substantially rarer. Given the difficulty of shifting to an energy economy not dominated by fossil fuel consumption, the likelihood that external forcing will continue to increase appears high, thus providing a rather strong basis for GCMs’ use as skillful projections. However, it is important to note that natural variability is comparable to the forced signal on this time scale and thus introduces uncertainty.
We acknowledge the World Climate Research Programme’s Working Group on Coupled Modelling, which is responsible for CMIP, and we thank the climate modeling groups (listed in Table 1 of this paper) for producing and making available their model output. For CMIP, the U.S. Department of Energy’s Program for Climate Model Diagnosis and Intercomparison provides coordinating support and led development of software infrastructure in partnership with the Global Organization for Earth System Science Portals. This work was partially supported by the National Science Foundation under Award CBET-1204368 and by NOAA through the Cooperative Institute for Climate and Satellites–North Carolina under Cooperative Agreement NA14NES432003.