Multiproxy warm season (September–February) temperature reconstructions are presented for the combined land–ocean region of Australasia (0°–50°S, 110°E–180°) covering 1000–2001. Using between 2 (R2) and 28 (R28) paleoclimate records, four 1000-member ensemble reconstructions of regional temperature are developed using four statistical methods: principal component regression (PCR), composite plus scale (CPS), Bayesian hierarchical models (LNA), and pairwise comparison (PaiCo). The reconstructions are then compared with a three-member ensemble of GISS-E2-R climate model simulations and independent paleoclimate records. Decadal fluctuations in Australasian temperatures are remarkably similar between the four reconstruction methods. There are, however, differences in the amplitude of temperature variations between the different statistical methods and proxy networks. When the R28 network is used, the warmest 30-yr periods occur after 1950 in 77% of ensemble members over all methods. However, reconstructions based on only the longest records (R2 and R3 networks) indicate that single 30- and 10-yr periods of similar or slightly higher temperatures than in the late twentieth century may have occurred during the first half of the millennium. Regardless, the most recent instrumental temperatures (1985–2014) are above the 90th percentile of all 12 reconstruction ensembles (four reconstruction methods based on three proxy networks—R28, R3, and R2). The reconstructed twentieth-century warming cannot be explained by natural variability alone using GISS-E2-R. In this climate model, anthropogenic forcing is required to produce the rate and magnitude of post-1950 warming observed in the Australasian region. These paleoclimate results are consistent with other studies that attribute the post-1950 warming in Australian temperature records to increases in atmospheric greenhouse gas concentrations.
Paleoclimate records are fundamental for evaluating the long-term context of recent regional and global climate variability. Extending the baseline of preindustrial climate variations from paleoclimate proxies allows natural forcing and internal variations to be isolated from anthropogenically forced changes using detection and attribution studies (Hegerl et al. 2011; PAGES2k–PMIP3 Group 2015). Uncertainties in future climate change projections depend not only on future emissions of greenhouse gases, but also on the ability of climate models to skillfully reproduce past climate variability. Reconstructions of regional-scale temperature provide an extended basis for evaluating the accuracy of climate models in simulating past regional climate variability and an opportunity to reduce uncertainties associated with future climate variability and change (Hegerl et al. 2011; PAGES2k–PMIP3 Group 2015).
In this study we consider the land and ocean region of Australasia. This is defined as an area comprising Australia, New Zealand, and neighboring islands in the Indian, Southern, and Pacific Oceans bounded by 0°–50°S, 110°E–180°. Sustained warming has been recorded across much of Australasia from the beginning of the twentieth century. Since 1910 (the period of extensive, high-quality observational records), Australia, the largest continental mass in Australasia, has experienced an annual mean land surface temperature increase of 0.9°C with approximately 0.7°C of the warming observed since 1960 (Della-Marta et al. 2004; Keenan and Cleugh 2011). The period 2001–10 was the warmest decade recorded in both Australian land and sea surface temperature (SST) observations (Keenan and Cleugh 2011). Increases in mean minimum and maximum temperatures have also been observed from stations on the North and South Islands of New Zealand over the 1961–2005 period (Chambers and Griffiths 2008). New Zealand’s seven-station temperature record shows a warming of 0.96°C over the 1910–2010 period (http://www.niwa.co.nz/climate/nz-temperature-record). Recent research has shown that the late twentieth-century and early twenty-first-century (1980–2009) temperatures of Australian waters were 0.57°C higher than the early twentieth century SSTs (1910–39), with greatest increases reported off the southeastern and southwestern Australian coasts (Lough and Hobday 2011).
Given the large warming trend observed in Australasian temperature records since the late twentieth century, it is important to quantify how climate in the region varied during preindustrial times, centuries before meteorological observations are available. Paleoclimate records can be used as extended estimates of the climate under natural boundary forcing to evaluate climate model simulations for the region. The latest climate model projections suggest that Australian land surface temperatures may rise between 0.6° and 1.3°C above 1986–2005 levels by 2030, with a best estimate of 0.9°–1.0°C (Whetton et al. 2015). By 2090, increases of 0.6°–5.1°C are projected over Australia depending on global greenhouse gas mitigation policies, with a best estimate of 1.9°–4.1°C (Whetton et al. 2015). Robust, well-verified paleoclimate reconstructions can help evaluate the performance of global climate models in the Australasian region by providing extended estimates of annual–centennial-scale climate variations.
Reconstructions of past climate variability from Australasia are of broader importance, as the region contains the core dynamical regions of several major atmospheric and oceanic circulation features with a hemispheric or near-global influence—for example, El Niño–Southern Oscillation (ENSO), the interdecadal Pacific oscillation (IPO), the southern annular mode (SAM), the Australian monsoon, the Indian Ocean dipole (IOD), and the midlatitude westerlies. Consequently, reconstructing past climate variations in the Australasian region may provide better estimates of the local variability associated with these major climate modes with both natural and anthropogenic forcing. Ultimately this may contribute to the understanding of the evolution and stability of these circulation features and their future regional climatic impacts.
Northern Hemisphere multiproxy temperature reconstructions show that the 1983–2012 period is likely to be the warmest in at least the last 1400 years (Jansen et al. 2007; Mann et al. 2008; Frank et al. 2010; Masson-Delmotte et al. 2013; PAGES2k Consortium 2013). While the multiproxy temperature reconstructions in the Southern Hemisphere (Jones et al. 1998; Huang et al. 2000; Mann and Jones 2003; Mann et al. 2008; Neukom et al. 2014a) are more uncertain, recent work suggests that the post-1974 warming is the only period of the past millennium where both hemispheres are likely to have experienced simultaneous warm extremes (Neukom et al. 2014a). The centennially resolved borehole estimates in Huang et al. (2000) from Australia, South America, and Africa indicate that the magnitude of land surface warming over the past 500 years is estimated to be lower in the Southern Hemisphere (0.8°C) than the Northern Hemisphere (1.1°C).
Despite advances in estimating hemispheric and global mean temperature trends over the last 2000 years (Wahl et al. 2010), there are still uncertainties in understanding regional responses to large-scale temperature changes from global radiative forcing (D’Arrigo et al. 2009; Mann et al. 2009). Little is known about the magnitude and timing of temperature fluctuations in the Southern Hemisphere during the so-called Medieval Climate Anomaly (MCA) warm (900–1250) or Little Ice Age (LIA) cool (1400–1700) intervals described from Northern Hemisphere climate reconstructions (Hughes and Diaz 1994; D’Arrigo et al. 2006a; Mann et al. 2009; Diaz et al. 2011; Graham et al. 2011).
Previous research for Australasia has focused on two annually resolved tree-ring-based land temperature reconstructions from Australia and New Zealand and a composite of 57 centennially resolved borehole sites throughout Australia (Cook et al. 2000, 2002a; Pollack et al. 2006). Silver Pine tree-ring widths from New Zealand suggest that twentieth-century warm season temperatures have been unusual, but not unprecedented, in the context of the past millennium in this subregion of Australasia (D’Arrigo et al. 1998; Cook et al. 2002a,b, 2006).
The unusual nature of recent warmth is suggested by a composite borehole temperature reconstruction for Australia, which shows a temperature increase of approximately 0.5°C over the past 500 years, with 80% of the warming occurring during the nineteenth and twentieth centuries (Pollack et al. 2006). The borehole record indicates that the seventeenth century was the coolest interval of the five-century reconstruction. As most of the Australian boreholes were logged prior to 1976, the observed subsurface temperatures do not include the pronounced warming recorded over the last two decades of the twentieth century but currently provide the only baseline of preindustrial temperature conditions experienced over the large-scale continental region of Australia (Pollack et al. 2006; Jansen et al. 2007).
In recent years, attention has focused on quantifying regional temperature variations in paleoclimate reconstructions in response to the radiative forcing associated with natural solar and volcanic variations and increases in anthropogenic greenhouse gas concentrations (Mann et al. 2005; Hegerl et al. 2007a; Schurer et al. 2014; PAGES2k–PMIP3 Group 2015). In particular, there has been a focus on improving climate reconstructions of the last 2000 years as it is a period that contains marked temperature variations in many parts of the globe like the MCA, LIA, and late twentieth-century warming (Jones and Mann 2004; Jones et al. 2009; Newman et al. 2009) and is the period when the majority of Earth’s precisely dated, monthly to annually resolved paleoclimate records are available for direct calibration with instrumental records over the twentieth century.
In response to the lack of continental-scale climate reconstructions in IPCC AR4, in 2009 the International Geosphere–Biosphere Programme’s (IGBP) Past Global Changes (PAGES) initiative developed the Regional 2k Network for research on the past two millennia, a set of working groups to collect and process the best available paleoclimate data to develop climate reconstructions in eight regions of the world (http://www.pages-igbp.org/workinggroups/2k-network; Newman et al. 2009; PAGES2k Consortium 2013). The Australasia (Aus2k) working group examines the Indo-Pacific region consisting of the landmasses of Australia, New Zealand, the Indonesian archipelago, and the neighboring islands of the Pacific Ocean.
This paper evaluates the Aus2k working group’s regional consolidation of Australasian temperature proxies by developing an ensemble of reconstructions that uses a range of published statistical methods. Multiproxy combined land and ocean mean temperature reconstructions are developed and compared for the austral spring–summer [September–February (SONDJF)] warm season using four previously published reconstruction techniques. An assessment of possible contributions of natural and anthropogenic forcing to temperature variations in the region over the past 1000 years are also provided through a comparison with 1000-yr forced and unforced GISS-E2-R climate model simulations. The model comparison provides a preliminary investigation of the role of natural forcing, anthropogenic forcing, and internal climate variability for Australasian temperature fluctuations over the past millennium and demonstrates the value of such reconstructions for detection and attribution studies in the region.
2. Data and methods
To develop our reconstruction of warm season SONDJF Australasian temperatures, we use three sources of data: instrumental temperatures, paleoclimate proxy data, and climate model simulations. Throughout this study, Australasia is defined as the land and ocean areas of the Indo-Pacific and Southern Oceans, bounded by 0°–50°S, 110°E–180°.
a. Instrumental data
Hadley Centre/Climatic Research Unit, version 3, variance adjusted (HadCRUT3v), 5° × 5° monthly combined land and ocean temperature data were used to represent instrumental temperatures in Australasia (Brohan et al. 2006; Rayner et al. 2006). Mean SONDJF data for each year from 1900 to 2014 were area averaged over the study domain. Note that the 1850–99 section of HadCRUT3v was excluded from our analysis because of large data gaps in the network prior to 1900 (Jones et al. 1999; Brohan et al. 2006). For calibration of the reconstruction statistical models, we use subsets of the instrumental data from 1931 onward because prior to this time data were generally only present over southeastern Australia. The data from 1900 to 1930 are used for a separate, independent verification of the temperature reconstructions.
The utility of a spatial mean temperature series is directly related to the presence of large-scale coherence of land and ocean temperatures on decadal and longer time scales, which is the time scale of interest in the current analysis. Strong spatial coherence is meaningful as it indicates that common dynamical mechanisms are regulating temperature over most of the region. Figure S1.1 in the supplemental material demonstrates this coherence, illustrating the strength of the relationships between the temperatures at the HadCRUT3v grid cells and area-averaged temperatures.
b. Paleoclimate proxy data
Our temperature proxy network was drawn from a broader Australasian domain (10°N–80°S, 90°E–140°W) than previously defined for the reconstruction target [see appendix A, section S1 in the supplemental material, and details provided in Neukom and Gergis (2012)]. Sourcing proxies from a slightly wider domain allows more potential predictors associated with Southern Hemisphere circulation features associated with ENSO, IOD, and SAM variability to be considered in the reconstruction. Given the demonstrated influence of large-scale circulation on regional climate variability in the Australian region (e.g., Allan 1988; Nicholls 2010; Gergis et al. 2012) and our use of ensemble reconstruction methods to robustly quantify a range of uncertainty parameters outlined in section 2c and appendix B, our approach is justified given the lack of temperature proxies available from mainland Australia.
The proxy network showed sensitivity to Australasian temperatures from September to February—the period that contains the tree-ring growing season of the austral spring–summer months. All tree-ring chronologies were developed from raw measurements using the signal-free detrending method, which improves the resolution of medium-frequency variance (Melvin et al. 2007; Melvin and Briffa 2008). All years when fewer than five tree-ring series were available or the expressed population signal (EPS; Briffa and Jones 1990) values were below 0.85, indicating reduced correlation between constituent series, were excluded from the analysis.
The only exceptions to this signal-free tree-ring detrending method were the New Zealand silver pine tree-ring composite (Oroko Swamp and Ahaura), which contains logging disturbance after 1957 (D’Arrigo et al. 1998; Cook et al. 2002a, 2006), and the Mount Read Huon pine chronology from Tasmania, which is a complex assemblage of material derived from living trees and subfossil material. For consistency with published results, we use the final temperature reconstructions provided by the original authors that include disturbance-corrected data for the silver pine record and regional curve standardization for the complex age structure of the wood used to develop the Mount Read temperature reconstruction (Cook et al. 2006).
Note that the instrumental data used to replace the disturbance-affected period from 1957 in the silver pine tree-ring record may have influenced proxy screening and calibration procedures for this record. However, given that our reconstructions show skill in the early verification interval, which is outside the disturbed period, and our uncertainty estimates include proxy resampling (detailed below), we argue that this irregularity in the silver pine record does not bias our conclusions. It is also worth noting that there are another 27 records that replicate the post-1957 period in the predictor network, highlighting the value of using multiproxy reconstructions.
Although the Mount Read record from Tasmania extends as long as 3602 years, in this study we only examine data spanning the last 1000 years. The last millennium contains the better-replicated sections of the silver pine chronology from New Zealand (Cook et al. 2002b, 2006) and is the key period for which model simulations have been run for comparison with paleoclimate reconstructions (e.g., Schmidt et al. 2012; Fernández-Donado et al. 2013).
All coral records with monthly, bimonthly, or seasonal resolution were averaged over the SONDJF period to align with the warm season reconstruction window. For predictor selection, both proxy climate and instrumental data were linearly detrended over 1931–90. As detailed in appendix A, only records that were significantly (p < 0.05) correlated with temperature variations in at least one grid cell within 500 km of the proxy’s location over the 1931–90 period were selected for further analysis. This process identified 28 temperature predictors for the SONDJF warm season (Fig. 1 and Table 1), henceforth referred to as the R28 network. As three records end before 1990 and others have single missing values within the 1900–90 calibration/verification period, missing values in the predictor matrix during this period (1.12%) were infilled using principal component regression analysis (PCA; Scherrer and Appenzeller 2006; Neukom et al. 2011). The influence of proxy selection on the subsequent reconstructions is extensively examined in section S1 of the supplemental material.
To verify the sensitivity of our results to a loss of records back in time, we perform additional reconstructions using the four reconstruction methods (see appendix B and section 2c below) based on the R3 network that comprises only the three longest proxy records available in the pre-1430 period and the R2 network using only the two longest (midlatitude tree ring) records that extend continuously back to 1000. A comparison of the R3 and R2 networks is provided in appendix C to assess how ocean and land proxies influence the reconstruction results during the early part of the millennium when the reconstructions are only based on few records. We caution that because the results are based on a small number of records, they might not be representative of the complete Australasian region because either they are highly regional (only the midlatitudes are represented in the R2 network) or they rely on remote teleconnections (Palmyra coral, included in the R3 network) to infer Australasian temperatures (Gallant et al. 2013). However, as we are interested in evaluating combined land and ocean temperatures using HadCRUT3v, we recommend evaluating temperature anomalies based on the full R28 network against the R3 network that incorporates both terrestrial and ocean proxies from mid- and low latitudes to match our target predictand.
c. Reconstruction methods
We use four published climate reconstruction techniques to develop reconstructions of Australasian temperature variations over the past 1000 years ( appendix B). Following multimethod approaches considered by PAGES2k Consortium (2013) and Hanhijärvi et al. (2013), we use a PCR ensemble (Luterbacher et al. 2002; Neukom et al. 2014a,b), composite plus scale (CPS; Mann et al. 2008; Neukom et al. 2011), pairwise comparison (PaiCo; Hanhijärvi et al. 2013), and a Bayesian hierarchical model (referred to as LNA; Li et al. 2010) to develop four separate temperature reconstruction ensembles. Note that here we use the abbreviation LNA (for the authors Li, Nycha, and Amman; Li et al. 2010) to use terminology that is consistent with PAGES 2k Consortium (2013) study, and distinguish the method from other studies that use Bayesian hierarchical models.
A 1000-member ensemble reconstruction was developed for each statistical method for each of the R28, R3, and R2 networks by perturbing reconstruction parameters—for example, by removing individual proxies from the predictor matrix or sampling a subset of years for calibration. For the LNA and PaiCo methods, the same parameter settings were applied as in the PAGES2k Consortium (2013) analysis. The PCR and CPS methods were applied similarly to Neukom et al. (2014a). However, for a more compatible comparison across all four methods, an additional uncertainty estimate was included for those ensembles generated using the PCR and CPS methods; namely, each member of the PCR and CPS ensembles had an additional error term introduced, which represents the residual error between that member and the observations from which the transfer function was generated (i.e., the regression residual). These residual errors were modeled using Gaussian noise with a first-order autoregressive [AR(1)] component, scaled to have the same standard deviation as the regression residuals from the PCR and CPS methods and added to the reconstructed temperatures, similar to the approach of Wahl and Smerdon (2012). Further details on the implementation of the statistical methods used in this study are provided in appendix B, Hanhijärvi et al. (2013), and Li et al. (2010).
These methods follow the ensemble approach of Frank et al. (2010) and ensure that the major sources of uncertainty in the reconstructions are included. From these sources of uncertainty, a 90% confidence interval is produced for each reconstruction method. This confidence interval is defined as being between the 5th and 95th percentile of the errors computed for each reconstruction method, described above, that are associated with methodological parameters, resampling of the predictor network, and the errors between the instrumental data and the individual reconstructions from the ensemble.
The reconstructions were calibrated with the instrumental data over the 1931–90 period, and the 1900–30 period was used for independent early verification. Instrumental verification analyses were undertaken to assess whether we could reconstruct mean temperature from the entire Australasian field using instrumental data only from grid cells representing the R28 proxy network. This was tested by applying the PCR reconstruction method to instrumental data taken from the HadCRUT3v grid at locations closest to the 28 proxy locations over the 1921–2000 period. Because of large amounts of missing data in some parts of the HadCRUT3v grid in the early twentieth century (see Fig. S1.2), only grid cells with less than 33.3% of data missing were used.
For further validation, the same analysis was also run using instrumental temperatures from the closest Global Historical Climatology Network (GHCN) stations (Peterson and Vose 1997) for land temperature proxies and the HadISST data (Rayner et al. 2003) for ocean temperature proxies. Note that considerable amounts of missing data at a number of stations in our domain restricted the GHCN analysis to the 1953–92 period. As a final verification exercise, 10 different variants of the HadCRUT3v grid points were degraded by including white noise in the temperature signal so that the relationship (as measured by the Pearson correlation coefficient r) between the noise-degraded grid cell and the original grid cell was the same as that between the original grid cell and the proxy record. Since each proxy displays a different correlation coefficient with its corresponding observation, the amount of white noise added was correspondingly different at each location. The results of these additional instrumental pseudoproxy verification analyses are presented in section S2.1 in the supplemental material.
d. Climate model simulations
To assess the role of climate forcings on the warm season Australasian temperature reconstructions over the last millennium, we compared the temperature reconstruction results with three simulations of the GISS-E2-R climate system model, a fully coupled global atmosphere–ocean general circulation model (Schmidt et al. 2006). For the fully forced simulation of the past millennium, we used an ensemble of three simulations with the external forcing according to PMIP3–CMIP5 conventions (see Schmidt et al. 2011). Ensemble member 121 uses solar forcing from Steinhilber et al. (2009), volcanic forcing from Crowley et al. (2008), and land-use and land-cover changes from Pongratz et al. (2008). Member 124 uses solar forcing from Vieira et al. (2011), volcanic forcing from Crowley et al. (2008), and land-use and land-cover changes from Pongratz et al. (2008). Member 127 uses solar forcing from Vieira et al. (2011), volcanic forcing from Crowley et al. (2008), and land-use and land-cover changes from Kaplan et al. (2011). All simulations use the greenhouse gas forcing of MacFarling-Meure et al. (2006).
We also used a 1200-yr preindustrial control simulation with stable forcing at 800 conditions (solar, land-use/land-cover, greenhouse gas, and orbital forcing; volcanic forcing at average conditions over 850–1999). Additionally we use a historical simulation covering 1850–2005 with natural forcing only. GISS-E2-R was the only model in the PMIP3–CMIP5 suite (Masson-Delmotte et al. 2013) that had an ensemble of multiple fully forced runs of the past millennium that were publicly available at the time of analysis. Accordingly, we confine our analysis and interpretation to this climate model.
3. Results and discussion
a. Reconstruction calibration, verification, and quality assessment
The variance explained by the median of the temperature reconstruction ensembles during the 1900–30 early verification period ranges from 15% (r = 0.39) for LNA to 30% (r = 0.55) for CPS (Fig. 2). Further metrics of reconstruction skill were used—namely, the reduction of error (RE) and coefficient of efficiency (CE) (Fritts 1976; Cook et al. 1994). These metrics compare the errors between reconstructed (i.e., predicted) and observed temperatures to the errors computed when the mean of the calibration (for RE) and verification (for CE) periods are used as a predictor. Values above zero indicate the reconstruction’s improved ability to track instrumental variance when compared to the simple model of the instrumental mean (Fritts 1976; Cook et al. 1994).
While RE measures for the R28 network are well above zero for the ensemble medians (0.66–0.76; Table 2) and mostly positive for the individual ensemble members, CE values are below zero for the LNA and PaiCo ensemble means and a large fraction of ensemble members. While the 90% confidence intervals of the skill metrics are similar across the methods, the skill of the ensemble median of PCR and CPS methods performs slightly better than for LNA and PaiCo. Table 2 shows that the ensemble median is more skillful than the majority of, and in some cases all of, the individual ensemble members (see Wahl and Smerdon 2012; Neukom et al. 2014a). This is because when the median of the ensemble is calculated, the noise inherent in the different ensemble members is reduced, which improves the skill metrics compared to the 90% confidence interval.
According to the CE and RE metrics, reconstruction skill is lowest during the pre-1430 period when only two or three temperature proxies are available. While the ensemble RE values for R3 remain at relatively high levels (median REs >0.56) over the entire millennium (Table 2), CE values are negative before 1458, indicating reduced confidence in the reconstruction according to the CE metric. We also compared the RE and CE values to those of reconstructions based on AR(1) noise proxies (Figs. S2.6–S2.9 in the supplemental material; Wahl and Smerdon 2012; Neukom et al. 2014a). These noise experiments highlight two findings. The first is that the real proxies outperform the noise proxies, suggesting that a real temperature signal is present in the R2 and R3 networks. The second is that the very negative CE values (i.e., reduced skill) in R2 are only present when the Oroko Swamp record is used as the only proxy, highlighting that Oroko Swamp is producing a subregional signal rather than one consistent with all of Australasia (also evident from Table S1.3 in the supplemental material). CE values are higher (close to zero but still mostly negative) when the other proxy within R2 (Mount Read) is included in the reconstruction.
Appendix C shows that the 90% confidence intervals of the R3 and R28 reconstructions (using 3 and 28 proxies, respectively) overlap for a large fraction of the 1000-yr record. The exception is for the reconstructions developed using the LNA method, where the R28 reconstruction has a considerably larger low-frequency variation compared to other reconstructions, which is discussed in more detail below. The R3 and R2 networks yield similar reconstruction skill; however, the CE values are lower for R2, again highlighting lower skill for the reconstructions produced using less-replicated proxy networks.
In section S1 in the supplemental material we present some sensitivity analyses for the PCR method that indicate that the results and conclusions of our analysis are not sensitive to the proxy selection method. For example, Fig. S1.5 in the supplemental material shows that even in the absence of screening, the predictor network yields very similar reconstructed temperature variations (including the degree of late twentieth-century warmth) to the refined predictor network that was designed to extract only predictors with a demonstrated relationship to Australasian temperature variations. Our results also show that the differences between using detrended and raw correlations to screen the predictor networks, as well as between using field mean and local correlations, are minor (Figs. S1.3 and S1.4 in the supplemental material). Given this insensitivity, local detrended correlations were used to select our final set of temperature predictors (see theoretical discussion in section S1 in the supplemental material).
The instrumental verification analyses confirmed that a subset of data from the HadCRUT3v temperature grids, at the closest locations to the 28 paleoclimate records listed in Table 1, can skillfully reconstruct the SONDJF spatial mean of the HadCRUT3v Australasian combined land and ocean temperatures (section S2.1 in the supplemental material). The correlation of the annual SONDJF temperature reconstruction based on these 28 HadCRUT3v grid cells and the full HadCRUT3v predictand was highly significant (r = 0.86; p ≪ 0.01) from 1900 to 1990 (Fig. S2.1 in the supplemental material) and remained strong even after linear detrending (r = 0.69; p ≪ 0.01). A median verification RE of 0.72 was obtained over the 1900–90 period. Given the data availability issues noted in section 2a, it is unsurprising that the reconstruction results are somewhat weaker using the 28 nearest GHCN stations (r = 0.74; p ≪ 0.01) over the 1953–92 period (r = 0.68 detrended; p ≪ 0.01; recall that GHCN data were not available from 1900). Once again, a positive median verification RE of 0.28 was found over the 1953–92 interval (with a positive bias observed in the full histogram of REs provided in Figure S2.2 in the supplemental material), suggesting that a skillful reconstruction of the HadCRUT3v Australasian SONDJF spatial mean is indeed possible using the R28 network.
A final test of the ability of the reconstruction method to extract a real temperature signal from noisy proxy data was performed using 10 white-noise-degraded HadCRUT3v instrumental datasets (described previously in section 2c). An ensemble of PCR reconstructions was generated from each set of instrumental proxies, and the resulting mean reconstruction (Fig. S2.3 in the supplemental material) indicates that skillful reconstructions are possible using these noise-degraded datasets. The correlations between the mean reconstructions from the 10 sets of pseudo-instrumental proxies and the instrumental predictand were all statistically significant at the 0.05% level or smaller. The correlations computed using raw data ranged between 0.52 and 0.69, and between 0.37 and 0.51 for detrended data, which is similar to the range of 15%–30% of variance explained by the real proxy network. The median degraded instrumental verification RE values ranged between −0.002 and 0.30 across all pseudo-proxy ensembles, and 9 of the 10 ensembles had positive median RE values (Fig. S2.3 in the supplemental material). The results provide evidence that our method can successfully extract an underlying common temperature signal even when it is compounded by extraneous noise.
b. Australasian SONDJF temperature variations 1000–2001
Figure 3 shows the 30-yr loess-filtered Australasian temperature reconstructions of the four different reconstruction methods. The reconstructions developed using each statistical method show considerable variations in the amplitude between present-day and preindustrial Australasian temperatures (Figs. 3a–d). While the amplitudes of reconstructed temperature variations are comparable using PCR, CPS, and PaiCo methods, the LNA method produces systematically cooler preindustrial temperatures.
While the potential loss of low-frequency variance is widely discussed in the literature (Jones et al. 2009; Smerdon 2012; Wahl and Frank 2012; Hanhijärvi et al. 2013), a complete assessment of which of the methods most adequately represents very low-frequency temperature amplitudes in Australasia is beyond the scope of this study. However, a brief comparison of the modern and preindustrial temperature variation (defined as the 1961–90 mean minus the 1500–1900 mean) shows that the results of PCR (median of 0.32°C), CPS (0.35°C), and PaiCo (0.36°C) methods are much closer to the GISS-E2-R simulations (0.38°C) than the LNA method (0.63°C). The LNA reconstruction amplitude is even larger than in Northern Hemisphere reconstructions [ensemble median of 0.43°C in Frank et al. (2010)] and Northern Hemisphere simulations of GISS-E2-R (0.51°C), suggesting an overestimation of the likely amplitude of temperature variations reconstructed by the LNA method.
Despite these differences in very low-frequency variation, Fig. 3e shows that the decadal-scale variations reconstructed by the different methods are remarkably similar. The four methods also show general coherence in the timing of large temperature fluctuations over the last millennium (Fig. 4).
Next we examine 30-yr and decadal temperature extremes over the past millennium based on our 1000-member reconstruction ensembles. Note that while this discussion focuses on the full R28 network, results for the pre-1430 R3 and R2 networks are also presented in Table 2 and appendix C for comparison. Following a relatively cool eleventh century, reconstructed temperatures relative to the 1000–2000 reference period (Fig. 3e) show that the period between 1100 and 1500 is characterized by a relative warm period. The so-called Little Ice Age described from the Northern Hemisphere is thought to extend from approximately 1400 to 1700, but possibly ending as late as 1850 (Mann et al. 2009; Graham et al. 2011). From the reconstructions presented here, a similar cool interval is observed in Australasian temperatures from ~1500 to 1900. The median (5th, 95th percentile) of average temperature anomalies during this cold period were −0.32°C (−0.41°C, −0.21°C) for PCR, −0.35°C (−0.43°C, −0.25°C) for CPS, −0.63°C (−0.66°C, −0.57°C) for LNA, and −0.36°C (−0.46°C, −0.27°C) for PaiCo below 1961–90 levels.
Given that the early part of the reconstruction is based on fewer proxies, the reconstructions are less skillful and the results consequently more uncertain as illustrated by the larger ensemble spreads for all methods during the first half of the millennium (Fig. 3f). Comparison between the R28, R3, and R2 proxy networks show that decadal-scale temperature variations between the reconstructions methods are similar (see Figs. C3–C6). For the LNA method, the R3 and R2 reconstructions have similar low-frequency variations as the R2, R3, and R28 reconstructions developed using the other three methods.
Figure 4 shows the timing of the warmest and coldest 30- and 10-yr periods for all methods and ensemble members. Calculated across the combined 4000-member ensemble of all methods, 77.3% of the ensemble members contain the warmest 30-yr interval in the post-1950 period in the R28 network. In terms of individual methods, the PCR (CPS, LNA, PaiCo) results indicate that 72.4% (78.4%, 99.9%, 58.5%) of the ensemble members contain the warmest 30-yr period after 1950. This fraction decreases to 50.6% (57.4%, 30.8%, 63.1%) for R3 and to 26.2% (24.1%, 27.2%, 25.5%) for the R2 network.
The warmest preindustrial 30-yr interval (defined as the interval with the largest fraction of ensemble members identifying maximum 30-yr averages) occurred in 1151–80 for PCR, CPS, and PaiCo and in 1417–46 for LNA. Ensemble median (5th, 95th percentile) temperatures during these respective periods were between −0.45°C (−0.55°C, −0.37°C; LNA) and −0.06°C (−0.39°C, 0.13°C; PCR) below the 1961–90 base period. While the results are similar for R3, the R2 ensembles for all methods indicate that the 1151–80 period may have been warmer than the warmest 30-yr period in the twentieth century. The R2 ensemble median (5th, 95th percentile) temperature anomalies during 1151–80 are between +0.03°C (−0.08°C, 0.15°C; PCR) and +0.04°C (−0.06°C, 0.15°C; LNA). However, the twelfth-century temperatures reconstructed from the R2 network remain substantially below recent instrumental temperature observations during 1985–2014 (+0.30°C). In fact, this recent 30-yr temperature anomaly of +0.30°C is above the 90th percentile for all 30-yr periods over the entire millennium for all 12 reconstruction experiments (four methods, three proxy networks).
The coldest 30-yr interval in the reconstruction period (defined as the interval with the largest fraction of ensemble members identifying minimum 30-yr averages) occurred in 1055–84 for PCR and PaiCo, 1831–60 for CPS, and 1645–74 for LNA. Ensemble median temperatures during these periods were between −0.74° and −0.35°C below the 1961–90 average. A cold period of similar magnitude occurred between 1055 and 1084, falling within the Medieval Climate Anomaly described from the Northern Hemisphere. The median (5th, 95th percentile) of average temperature anomalies during this early cold period in Australasia were −0.33°C (−0.45°C, −0.20°C) for PCR, −0.33°C (−0.45°C, −0.21°C) for CPS, −0.66°C (−0.75°C, −0.56°C) for LNA, and −0.70°C (−0.90°C, −0.58°C) for PaiCo relative to the 1961–90 average. In the R3 (R2) network, the coldest 30-yr interval is 1898–1927 (1902–31) for PCR and 1055–84 (1055–84) for the other reconstruction methods.
In terms of the warmest decades of the last millennium, the PCR (CPS, LNA, PaiCo) reconstructions for the R28 network indicate that 46.3% (47.3%, 100%, 30.4%) of the ensemble members contain the warmest decade in the post-1950 period. It is worth noting that the most recent decade, 2005–14 (+0.43°C), which is not covered by the paleoclimate reconstruction, is the warmest decade recorded in the observational temperature data and 0.29°C (0.31°, 0.23°, 0.29°C) warmer than 1986–95 (1986–95, 1988–1995, 1986–95), the warmest decade of the ensemble median PCR (CPS, LNA, PaiCo) R28 reconstruction. In addition, the most recent instrumental temperatures (2005–14) are above the 90th percentile of all 10-yr averages in the R28 reconstruction ensembles except for PaiCo. In this method, the ensemble uncertainty range encompasses the instrumental temperatures for 2005–14 during 14 decades between 1150 and 1400.
The R3 and R2 networks comprise only the longest proxy records available. However, as described in section 2b, this is sometimes at the expense of a skillful reconstruction. In the R3 network, there is no decade in which a majority of ensemble members provide consistent timing for the warmest decade. However, the decade in each reconstruction ensemble that shows the highest proportion of ensemble members with the warmest temperatures is 1986 for PCR (11.4% of ensemble members), 1986 for CPS (14.7% of ensemble members), 1158 for LNA (13.0% of ensemble members), and 1304 for PaiCo (25.4% of ensemble members). In the R2 network, this decade starts at 1157 or 1158 for all methods.
The above results highlight that twentieth-century warming cannot be distinguished from warm periods in the twelfth century in the R2 and R3 networks because of the large uncertainty in the reconstructions. However, again we note that the 2005–14 decade (+0.43°C) is above the 90% confidence intervals in the R3 and R2 networks for all methods. This suggests that present-day warmth is at the upper limit, or indeed exceeding, that experienced in the last millennium.
As seen in appendix C and Fig. 4, the main difference between the R3 and R2 network is that R2 reconstructs warmer temperatures in in the first two periods when the floating Palmyra coral record contains data (1150–1220 and 1318–1464). The difference is largest between 1150 and 1180, where the R2 reconstruction shows comparable or even slightly warmer temperatures than during the late twentieth century (e.g., Fig. C3). Consequently, the highest 10-yr mean temperatures are reconstructed in the twelfth century for many ensemble members in R2 (Fig. 4).
The discrepancy between the R3 and R2 networks in the first half of the millennium could indicate that the midlatitude tree rings are not representing an Australasian-wide temperature signal. Alternatively, the differences between the two reconstructions may be explained by uncertainties within the proxy data, as the land (tree ring) and ocean (coral) archives in the R2 and R3 proxy networks may have different low-frequency behavior. Regardless of the reason for the discrepancy between R2 and R3, Table 2 shows that the R2 and R3 networks are inferior to the R28 in terms of all skill metrics. Therefore, we recommend evaluating temperature anomalies based on the R28 network. However, we further note that prior to 1430 the R28 is ultimately based on only those records in the R2 and R3 networks, which again highlights that all results pre-1430 are more uncertain and less skillful than the post-1430 period.
c. Climate model comparison
Using a three-member ensemble of GISS-E2-R climate model simulations allows us to compare decadal variability associated with internal noise from the responses to external forcings. The regional temperature signature was calculated over the same domain as the instrumental target. Figure 5 shows reconstructed Australasian SONDJF temperatures and the ensemble mean of three transient GISS-E2-R model simulations relative to the 1961–90 reference period. Figure 5 shows that the reconstructions and model simulations align well on centennial time scales, including the amplitude between preindustrial and modern temperature levels (except for the LNA method). This agreement suggests that the sensitivity of the model simulations to external forcing is consistent with instrumental temperature observations.
Given that the amplitude and timing of specific unforced variations cannot be reproduced in model simulations because of their stochastic nature, the agreement between the reconstructed interannual to multidecadal variations and the model simulations in the preindustrial period is fairly good. The correlation between the 30-yr-filtered model ensemble mean and the PCR reconstruction median over the full 1000–2000 period is r = 0.36 (ensemble correlations for the 5th and 95th percentiles are 0.13 and 0.40, respectively). The correlations are r = 0.41 (0.21, 0.44) for CPS, r = 0.52 (0.40, 0.52) for LNA, and r = 0.21 (0.02, 0.30) for PaiCo reconstruction methods.
Figure 5 shows that, while some of the temperature declines in the reconstruction coincide with major volcanic events over the past millennium (particularly in 1452 and 1835), they are smaller in magnitude than the temperature declines associated with volcanic forcing in the model simulations. The reasons for this may be due to the volcanic forcing dataset exaggerating the magnitude of these eruptions (Robock 2000), the loss of variance associated with paleoclimate reconstructions (Smerdon 2012), or because of the possibility that total ring width tree-ring chronologies may not be as responsive to volcanic events as maximum density data (e.g., Jones et al. 1995), which are largely still unavailable from Australasia (Heinrich and Allen 2013).
There are 169 (159, 238) years in the preinstrumental period when the 30-yr-filtered model ensemble mean lies outside the PCR (CPS, PaiCo) reconstruction’s 5th–95th percentile ensemble range, mostly coinciding with cold peaks in the model data in the thirteenth, mid-fifteenth, seventeenth, and early nineteenth centuries. These are likely to be a direct result of volcanic eruptions occurring during these periods. Despite widespread evidence of a major 1815 volcanic eruption (Robock 2000), the spatial extent and magnitude of this event are still unclear (Brohan et al. 2012; Neukom et al. 2014a). A comparison between the four reconstructions and the three GISS-E2-R simulations show differences, which suggests that the 1815 Tambora event might not have had a significant influence on Australasian temperatures (Fig. 5). Because of the significantly colder preindustrial temperatures in the LNA reconstruction compared to the other methods, 30-yr-filtered model data are warmer than the LNA reconstruction’s 95th percentile for 780 years between 1000 and 1900.
The relative roles of forced and unforced climate variability were also examined using the climate model simulations (section S4 in the supplemental material). To assess the probability of the late twentieth-century warming occurring by chance because of unforced natural climate variability, we examined the evolution of the Australasian mean SONDJF temperature over the last millennium, simulated by three forced GISS-E2-R members, an unforced preindustrial control run, and a historical simulation (1850–2005) with natural forcing only (Fig. S4.2 in the supplemental material). On decadal time scales, differences between the ensemble members reveal stochastic variability arising from internal dynamics of the coupled atmosphere–ocean system. However, a common signal across the model ensemble also shows the forced response to the largest volcanic eruptions of the last millennium (three unknown eruptions in the thirteenth century, Kuwae, and Tambora). On multidecadal time scales, forced changes dominate over unforced internal variability in GISS-E2-R.
In recent decades, anthropogenic forcing has a clear signal in the model data and is consistent with reconstructed and observed Australasian temperatures on decadal time scales (Fig. S4.2 in the supplemental material). Computing the differences in temperature between consecutive multidecadal-scale periods highlights the historical rates of change of temperatures. In the preindustrial control simulation, the difference in mean temperature between consecutive 50-yr periods never exceeds 0.11°C in magnitude (Fig. S4.3 in the supplemental material). This contrasts with the observed (0.32°C) and reconstructed (between 0.30° and 0.33°C) mean temperature change between 1901–50 and 1951–2000 (Fig. 6). In the GISS-E2-R ensemble considered here, anthropogenic forcing is required to produce the rate and magnitude of post-1950 warming observed in the reconstruction. This suggests that even when longer estimates of natural variability are included through the use of paleoclimate reconstructions, the post-1950 warming does not arise solely as a result of unforced natural variability within the coupled atmosphere–ocean system as simulated by this model (Fig. 6).
These results are consistent with detection and attribution studies that clearly attribute the post-1950 temperature increase noted in instrumental global and Australian temperature records to increases in atmospheric greenhouse gas concentrations (Karoly and Braganza 2005; Hegerl et al. 2007b; Bindoff et al. 2013; Lewis and Karoly 2013). The results presented here demonstrate that in GISS-E2-R, anthropogenic factors are needed to explain the anomalous warming observed in the Australasian region over the past century.
4. Comparisons with independent paleoclimate records
a. Temperature fluctuations over the last millennium
A peak in preindustrial warmth in Australasian temperature is observed in all methods except LNA from around 1150 to 1350 (Fig. 3), somewhat later than warming described from Northern Hemisphere regions. The 30-yr temperature anomalies described in section 3b are comparable with Northern Hemisphere results reported in IPCC AR4 that suggest that maximum preindustrial temperatures were probably between 0.1° and 0.2°C below the 1961–90 mean and significantly below warm anomalies observed in instrumental records after 1980 (Jansen et al. 2007). A study based on an ensemble of Northern Hemisphere reconstructions has shown that late twentieth-century temperature anomalies over the Northern Hemisphere may have been up to 0.31°C warmer than the earlier peak preindustrial maximum from 1071 to 1100 (Frank et al. 2010). Reconstructed SSTs from a sedimentary record from the Makassar Strait (3°S, 119°E) also provide independent support for large positive anomalies similar to, though not significantly warmer than, modern values between ~1000 and 1400 (Newton et al. 2006; Oppo et al. 2009).
A shift from high preindustrial temperatures into a pronounced cooling is supported by paleoclimate evidence and archaeological interpretations that indicate significant societal impacts across the Pacific basin ~1300–1400 (Nunn 2000, 2007). The high-resolution temperature reconstructions presented in Fig. 3 suggest that a transition to cooler conditions in the Australasian region is likely to have occurred after ~1350. This timing agrees with a shift in low-frequency (centennial) circulation features in a reconstruction of mean synoptic flow patterns for New Zealand that suggests enhanced westerly flow between ~1250 and 1360. There is evidence that a more zonal regime is associated with a shift from warm to cool climate conditions, with cooler conditions associated with intensified atmospheric blocking in the southwest Pacific during this period (Lorrey et al. 2008, 2011; Goodwin et al. 2014).
The results presented in section 3 indicate that from the mid-1300s onward, there is a gradual cooling into a period that coincides with the timing of the LIA interval described from the Northern Hemisphere (Mann et al. 2009), or more generally from 1500 to as recently as the beginning of the industrial era around 1850 (Mann et al. 2009; Graham et al. 2011). Independent evidence for a coherent Southern Hemisphere cool period from the early 1300s is also seen from low-resolution tropical Indonesian marine sediments (Oppo et al. 2009).
Using a network of cave records and other hydroclimatic proxies, Lorrey et al. (2008) suggest the general dominance of circulation patterns in the New Zealand sector that are associated with cooler temperatures for the latter half of the last millennium until the late nineteenth century. An independent coral composite record from the Great Barrier Reef, Australia, indicates that from 1565 to 1700 SSTs off northeastern Australia were 0.2°–0.3°C cooler and more saline than 1860–1985 averages (Hendy et al. 2002). This cooling is in general agreement with a high-resolution sedimentary record from Indonesia that suggests between 1550 and 1850, SSTs were 0.5°–1°C colder than modern values (Oppo et al. 2009).
The 1700–1850 period is recognized from Antarctica as being one of the most abrupt climate shifts of the last 1000 years (Goodwin et al. 2004; Mayewski et al. 2004, 2009). During this time, ice cores indicate an increase in sea ice extent and an intensification of the westerly winds in the mid- to high latitudes of the Southern Hemisphere (Goodwin et al. 2004; Mayewski et al. 2004), characteristic of a positive SAM phase (Abram et al. 2014). Comparable conditions to this early nineteenth-century event are thought to have occurred during the 1886–1903 and 1920–29 periods (Goodwin et al. 2004) and are also associated with cooling in our reconstruction.
Finally, the notion of Australasia-wide cooling from the middle of the last millennium to the nineteenth century is further supported by evidence of glacier fluctuations from New Zealand’s Southern Alps (~43°S, 170°E; Schaefer et al. 2009; Lorrey et al. 2013). The timing of major ice advances centered on 1605 ± 70, 1735 ± 50, 1785 ± 10, and 1845 ± 40 (Schaefer et al. 2009) suggests that pronounced cooling also influenced the Southern Hemisphere region of Australasia, particularly from the mid-seventeenth to mid-nineteenth century.
b. Comparison with Australian borehole temperature reconstruction
A comparison with the only continental-scale Australian borehole temperature reconstruction available (Pollack et al. 2006) indicates that the (low frequency) borehole estimates fall within the cooler section of our ensemble range until around 1800 (Fig. S2.4 in the supplemental material). This confirms the expected result that the rise in surface temperatures over the Australian landmass has been greater than within a broader regional domain that combines land and ocean temperatures.
Since most of the boreholes were logged prior to 1976, the observed subsurface temperatures do not capture the strong warming experienced by Australia in the last two decades of the twentieth century (Pollack et al. 2006), but this is captured in the temperature reconstruction ensembles presented here. In terms of cold periods, the borehole record suggests that the seventeenth century was the coolest interval, consistent with our reconstructions (Fig. 3). However, the borehole record fails to identify cold temperatures seen in the reconstructions around 1840. This highlights the inability of boreholes (Pollack et al. 2006; Jansen et al. 2007) to adequately capture the decadal-scale temperature variations seen in Fig. 3 and the role of high-resolution paleoclimatology in improving estimates of regional decadal climate variations.
Overall, the PCR (CPS, LNA, PaiCo) ensemble median of Australasian land and ocean temperature reconstruction presented here suggests that the second half of the twentieth century (1951–2000) was 0.31°C (0.35°C, 0.68°C, 0.36°C) warmer than average preindustrial conditions (1651–1700, the cold phase before the borehole temperatures starts to increase). This corresponds well with the Australian (land only) borehole estimate (Fig. S2.4 in the supplemental material).
c. Ocean–atmosphere interactions
While low-frequency variations of internal ocean–atmosphere interactions like ENSO are known to have played an important role in influencing regional temperature variations over the past millennium (Mann et al. 2005; Hegerl et al. 2007b; Mann et al. 2009; Li et al. 2011), the nature and stability of regional climate variations are still unclear (Lough 2011; Fowler et al. 2012; Gergis et al. 2012). To assess the relationship between reconstructed Australasian warm season temperatures and ENSO, we compared our R28 reconstructions with the Unified ENSO Proxy (UEP) developed by McGregor et al. (2010). The UEP represents the first uncalibrated EOF of 10 published ENSO reconstructions back to 1650, which minimizes spatial bias. Since a number of the paleoclimate records used in the current study have also been used in our previous ENSO reconstruction work (Braganza et al. 2009), the UEP was recalculated removing the Braganza et al. (2009) data [proxies 3 and 9 in McGregor et al. (2010)] to provide independent comparison with our Australasian temperature reconstructions presented here.
The relationship between interannual and interdecadal ENSO variability and Australasian temperature is known to fluctuate over the twentieth century (Power et al. 1999; Jones and Trewin 2000; Gallant et al. 2013). Figure 7 shows the 30-yr running correlation between our interannual Australasian SONDJF temperature reconstruction and the UEP in the post-1649 interval of overlap. The results display a significant negative relationship over the full period with r = −0.35 for the PCR ensemble median, with 85.7% of ensemble members significantly correlated, at p < 0.05 (r = −0.39 and 98.1% for CPS; r = −0.29 and 100% for LNA; and r = −0.36 and 99.8% for PaiCo). Figure 7 shows that this relationship exhibits considerable variability over past centuries. This confirms fluctuations in the influence of Pacific Ocean–driven climate variability and temperatures in the Australasian region during the instrumental period (e.g., around 1960) and reduced correlations in the mid-seventeen century. Large variations in the relationship between Australasian temperature and ENSO are consistent with instrumental data and climate model simulations (Gallant et al. 2013; Ashcroft et al. 2016).
Graham et al. (2011) present results from a coupled GCM showing that a slight warming of the tropical Indian and western Pacific Oceans relative to the other tropical ocean basins may have induced a broad range of the circulation and climate changes indicated by proxy data in the medieval period, including many of those not explained by a cooler eastern tropical Pacific alone. They suggest that tropical SSTs were the principal driver of large-scale climate variations during the MCA, which was characterized by an enhanced zonal Indo-Pacific SST gradient. However, if the Indo-Pacific warm pool was indeed the origin of the relative warmth associated with the MCA, then the temperature signal would be expected to be stronger in the Australasian region than in hemispheric means. The lack of any strong “MCA signal” in the reconstruction presented here therefore appears to be inconsistent with the Graham et al. (2011) hypothesis or may reflect inadequacies in availability of records from tropical regions of Australasia during this period.
Shifts in ENSO variability in the core dynamical region of the Indo-Pacific region may correspond to a notable period of warmth reported in the high-latitude region of the Southern Ocean. Goosse et al. (2004) have proposed a delayed response to natural forcing due to the storage and transport of heat anomalies by the deep ocean to explain the warm Southern Ocean around the 1300s to 1400s as inferred from three Southern Hemisphere climate proxies used by Mann and Jones (2003) and additional Antarctic ice cores.
The delay in the Southern Hemisphere temperature response to external climate forcing may have implications for the evolution of future climate change in the region. Model studies suggest that the present-day Southern Ocean temperatures lag the increases in greenhouse gas concentrations observed during the recent decades (Goosse et al. 2004). This implies that it is possible that large warming of the Southern Ocean will occur when the warm deep water formed during the twentieth century reaches the surface in coming decades (Goosse et al. 2004).
This study presents an austral warm season (September–February) temperature reconstruction for the combined land and ocean region of Australasia over the last millennium. We examined four 1000-member ensembles of reconstructions using independent, published statistical methods (PCR, CPS, PaiCo, and LNA) for a comparison of the methods and to identify any biases associated with each approach.
There is agreement between reconstructed and GISS-E2-R-simulated temperatures during the preindustrial era on multidecadal and longer time scales. GISS-E2-R shows that anthropogenic forcing factors are needed to explain the warm Australasian temperature anomalies during the late twentieth century, which are warmer than any other 30-yr period for the past 1000 years 77.3% of ensemble members calculated over all methods in our most skillful R28 proxy network. Reduced proxy networks (R3 and R2) comprising only the longest records cannot clearly distinguish whether present-day warmth is similar to or exceeded by twelfth-century warmth as the uncertainty in the reconstructions is large. Regardless, the most recent instrumental temperatures (1985–2014) are above the 90th percentile for all 30-yr periods in all 12 reconstruction ensembles (four reconstruction methods based on three proxy networks, R28, R3, R2).
Our results are consistent with detection and attribution studies that use instrumental data and clearly attribute the post-1950 temperature increase noted in instrumental global and Australian temperature records to increases in atmospheric greenhouse gas concentrations (Karoly and Braganza 2005; Hegerl et al. 2007b). The strong warming leading to present-day conditions started around 1905, representing a lagged response to the increase in greenhouse gas concentrations, which started in the early nineteenth century (Fig. S4.2 in the supplemental material). This delayed warming may be explained by the thermal inertia of the vast oceans surrounding the Australasian region (Goosse et al. 2004).
Our reconstructions suggest that peak preindustrial warmth occurred in Australasia around 1150–1350, somewhat later than described from Northern Hemisphere regions. It is worth noting that this medieval warming occurred in the absence of significant anthropogenic greenhouse gas emissions and thus is not analogous to post-1950 observed warming, which is predominantly anthropogenically forced (Karoly and Braganza 2005; Hegerl et al. 2007b; Lewis and Karoly 2013). This implies that the full range of natural climate variability may not have been observed in Australasia since the beginning of instrumental records. Thus, it is possible that future climate change resulting from anthropogenic influence may be compounded by large, natural climate variations.
By around 1350, a cooling trend that lasts several hundred years began. This cooling eventuated in minimum temperature anomalies in the sixteenth and early nineteenth centuries mostly coinciding with the peak of the Northern Hemisphere’s Little Ice Age. Our results support the notion that a pronounced cool period consistent with the timing of the LIA extended well outside of the Northern Hemisphere high latitudes and into the tropical and subtropical regions of the Southern Hemisphere (Newton et al. 2006; Wilson et al. 2006).
Given that instrumental observations in Australasia generally extend back to the early twentieth century, the paleoclimate temperature estimates and the associated analyses presented here provide an extended basis for evaluating the accuracy of climate models in simulating past regional climate variability and an opportunity to reduce uncertainties associated with future climate variability and change (Hegerl et al. 2011). This study provides preindustrial estimates of decadal temperature variations as far back as 1000, which may help to quantify the role of natural and anthropogenic forcing on regional climate variations as demonstrated in other regions of the world (Hegerl et al. 2011; PAGES2k–PMIP3 Group 2015).
Our work provides a significant improvement on the quantification of uncertainties previously reported (Jansen et al. 2007; PAGES2k Consortium 2013) by computing and comparing multiple reconstruction methods using ensemble techniques. It also provides valuable analysis of past temperature estimates for a region in the Southern Hemisphere, given that many studies have been Northern Hemisphere focused (Lamb 1965; Grove 1988; Hughes and Diaz 1994; Crowley and Lowery 2000; Bradley et al. 2003; Mann et al. 2009; Frank et al. 2010; Graham et al. 2011). Future research will focus on consolidating Australasian paleoclimate data with other Southern Hemisphere regions to advance our understanding of global change over the past millennium.
This work is a product of the Past Global Changes (PAGES) Aus2k working group. We appreciate the efforts of all working group members for data contributions and helpful discussions that improved this study. Kathryn Allen, Patrick Baker, Gretel Boswijk, Brendan Buckley, Matthew Brookhouse, Edward Cook, Louise Cullen, Mark Curran, Rosanne D’Arrigo, Pavla Fenwick, Anthony M. Fowler, Ian Goodwin, Pauline Grierson, Erica Hendy, Braddock Linsley, Janice Lough, Andrew Lorrey, Helen McGregor, Andrew Moy, Jonathan Palmer, Chris Plummer, Chris Turney, Tessa Vance, Tas van Ommen, and Limin Xiong are acknowledged for developing data considered in this study and/or feedback on earlier versions of the manuscript. The authors thank Edward Cook for providing access to the signal-free tree-ring standardization program and Shayne McGregor for the modified version of the unified ENSO proxy used in this analysis. All paleoclimate records associated with this study are available from https://www.ncdc.noaa.gov/paleo/study/12915. We thank Bruce Bauer from NOAA’s World Data Center for Paleoclimatology for assistance with the data archiving process.
We acknowledge funding support from the Australian Climate Change Science Program, Australian Research Council Projects LP0990151, FF0668679, and DP1092945, Swiss National Science Foundation Grant PZ00P2_154802, and Past Global Changes. The authors thank John Chiang, David Frank, and numerous anonymous reviewers for their time and comments on this study. J.G. is very grateful for the support provided by M.B. during the course of this project.
Proxy Screening Method
Here we describe the proxy screening approach that was used to develop the reconstructions. In section S1.1.1 (S1.1.2) in the supplemental material, we discuss the theoretical implications and strengths and weaknesses of using raw versus detrended data (field mean temperatures vs local temperature variations) for proxy screening. The sensitivity of proxy selection to these choices, and the subsequent effect on the results, are then quantitatively assessed in section S1.2 in the supplemental material.
The paleoclimate proxies for this study were selected from a network that lies within the domain 10°N–80°S, 90°E–140°W presented in Neukom and Gergis (2012). Given the complexity of the extensive multivariate analysis presented here for four different statistical methods and various nests, the dataset was frozen as of July 2011 and so does not include any incremental changes to existing records or new datasets that may have become available after this date. We a priori exclude unpublished records as well as δ13C and Ba/Ca proxies from corals (Lough 2004; Grottoli and Eakin 2007).
Of the resulting 62 records we also exclude records that were still in development at the time of the analysis (Law Dome chemical species; Australian Antarctic Data Center, http://data.aad.gov.au/aadc/portal; Tonga TH1 and TNI2 δ18O; Linsley et al. 2006, 2008) and records with an issue identified in the literature or through personal communication (Kavieng and Nauru corals; Lough 2004; Alibert and Kinsley 2008). The remaining 51 records are presented in Table A1. Note that the three coral records of Lough (2011) from the Great Barrier Reef [Havannah, Great Barrier Reef rainfall reconstruction (rec4) and Great Barrier Reef rainfall reconstruction (rec17)] are not independent but are differently replicated versions of a composite reconstruction (see Lough 2011). In the final reconstruction, we only use the most replicated available version of the composite for each year of the reconstruction period.
The proxy data were assessed for their suitability as Australasian temperature predictors. Each proxy was compared to local HadCRUT3v grid cells in the study domain (10°N–80°S, 90°E–140°W), defined as grid cells within 500 km of the proxy’s location—similar to an approach taken by Cook et al. (2013). This comparison was only performed for cells containing at least 50 years of data between 1921 and 1990. To account for proxies with seasonal definitions other than the target SONDJF season (e.g., calendar year averages), the comparisons were performed using lags of −1, 0, and +1 years for each proxy.
Once the grid cells were selected using the above definitions (and the proxy data lagged where appropriate), all data were detrended using linear regression. Given that a proxy is representing local climatic variations, the correlations between each local grid cell and the proxy were calculated. The significance of these correlations was determined using a Student’s t distribution, with the degrees of freedom adjusted for autocorrelation at lag 1 using the method of Bretherton et al. (1999). The proxy records that had a statistically significant (p < 0.05; two sided) correlation with at least one nearby grid cell and lag were selected for further analysis. If significant correlations were identified for more than one lag, the lag with the highest absolute correlation was used for the reconstruction (with the exceptions of the proxies Mount Read and Oroko Swamp, which are already calibrated temperature reconstructions; hence only lag 0 was selected).
The final step in the screening process involved the omission of coral records with multiple proxies (Sr/Ca and δ18O) where both records had statistically significant correlations. In these cases, the proxy record with the higher absolute correlation was selected. The resulting 28 predictors are presented in Table 1 of the main text, and correlation results are shown in Table S1.3 in the supplemental material. Of the paleoclimate records in Table A1, 55% (28 out of 51) were defined as representing a significant proportion of local temperature variations. This proportion of selected proxies was compared to a selection from autoregressive-noise proxies, where only 24% (12.1 ± 2.8 records) passed screening (see Neukom et al. 2014a). This shows that the fraction of proxies selected is significantly larger than noise, confirming that they are representing a climatic signal rather than stochastic variations. All paleoclimate records associated with this study are available from https://www.ncdc.noaa.gov/paleo/study/12915.
Temperature Reconstruction Methods
As stated in section 2c, we use four reconstruction methods: a principal component regression (PCR) ensemble (Luterbacher et al. 2002; Neukom et al. 2014a,b), composite plus scale (CPS; Mann et al. 2008; Neukom et al. 2011), pairwise comparison (PaiCo; Hanhijärvi et al. 2013), and a Bayesian hierarchical model (LNA; Li et al. 2010). Note that in this study, we implement PaiCo and LNA using the reconstruction scripts provided in the supplemental material of Hanhijärvi et al. (2013).
The PCR method is based on an ensemble of ordinary least squares regression principal component reconstructions. A more complete description of the underlying PCR method is provided by Luterbacher et al. (2002) and Neukom et al. (2014a). We use the same approach used by Neukom et al. (2014a) to reconstruct Southern Hemispheric temperature means but with slightly different ensemble parameters to account for the smaller number of proxies available for the Australasian region. To assess reconstruction uncertainty associated with proxy selection and calibration, a 1000-member ensemble of reconstructions was calculated. The randomized permutations, designed to include three sources of error, were successively performed 1000 times for each proxy nest. These permutations incorporated differences in the model coefficients that were effectively associated with the sampling errors described below and in sections a(1)–(3).
1) Calibration interval
The choice of calibration interval has been shown to affect model skill in some cases (Gallant and Gergis 2011; Gergis et al. 2012). So the correlation matrix from which the principal components were computed was made using a randomized calibration period between 35 and 40 years long from 1931 to 1990. Each randomized calibration period was generated using multiple 10-yr blocks, plus additional successive years if the period was a nonmultiple of 10 years, to retain the autocorrelation structure. Each resample of the calibration period was a jackknife procedure (Wilks 2011) applied to blocks of data rather than a single time step, as the blocks were sampled without replacement.
2) Proxy selection
A second permutation randomly resampled the proxies that were selected as temperature predictors (Table 1). Either the maximum number of proxies in each nest was used or a jackknife procedure was applied that randomly removed one proxy.
3) PC truncation
The principal components (PCs) were then calculated for those selected proxies and calibration interval, which were randomized as described previously. As higher-order PCs contribute little more than noise, they are usually truncated. There are a variety of possible methods to truncate PCs and no perfect method has ever been established. Thus, we based truncation on the North et al. (1982) criterion for distinguishing unique eigenvalues. However, because no method is perfect we also included a random permutation by truncating at either , , or , where is the truncation point from the North et al. (1982) criteria.
Permutations (1)–(3) were successively applied to each proxy nest (Fishman 1995). The matrix of the loadings for the PCs was generated using only data from the calibration period, as in Luterbacher et al. (2004) and Küttel et al. (2010). The matrix of these weights was then assumed to be representative of the full period spanned by the proxies. To avoid variance biases due to the decreasing number of predictors back in time, the reconstructions of each model were scaled to the variance of the instrumental target over the 1931–90 period, as is commonly done in paleoclimate reconstructions [see Jones and Mann (2004) for a review].
Aside from the ensemble perturbations, we also incorporate the additional uncertainties that exist between each of the PCR transfer functions and the instrumental temperature time series over the calibration interval (i.e., the regression residuals). As such, each reconstruction ensemble member is perturbed with additional noise computed using a red-noise series, which is a model of the aforementioned regression residuals. This approach is similar to Wahl and Smerdon (2012); that is, the regression residuals were computed and AR(1) noise time series were computed with the same AR(1) coefficient and standard deviation as the regression residuals. One realization of these time series of random-draw residuals was then added to the reconstructed temperatures of each proxy nest. This procedure was repeated for each of the 1000 ensemble members. The spread of the resulting 1000-yr, 1000-member ensemble then represents sources of uncertainty associated with both subjective selection procedures (e.g., proxies, calibration interval) and residual errors.
CPS is a simple method whereby temperature-sensitive proxy records are normalized and composited (i.e., averaged; Mann et al. 2008; Neukom et al. 2011, 2014a). The unitless average is then scaled against the available temporally overlapping instrumental target predictand. Here, we use the same CPS approach as Neukom et al. (2014a) that uses a nested CPS reconstruction calculated each time a record drops out at each year back in time. A CPS reconstruction was computed for each nest by standardizing (normalizing and centering) the available predictor series over the calibration interval and subsequently calculating a weighted composite in which the relative weight of each proxy was determined by the strength of the correlation with the Australasian temperature predictand. Each composite was centered and scaled to have the same variance as the instrumental target index over the calibration interval. Finally, to create a 1000-member ensemble of the CPS reconstructions, the calibration period and proxy selection perturbations [listed above as (1) and (2) for the PCR method] were similarly randomly applied to each realization of the reconstruction. Total uncertainties including the calibration error were calculated as described for PCR by adding modeled AR(1) residual errors to each reconstructed ensemble member.
The PaiCo method performs pairwise comparisons of proxy records for successive time steps to generate a monotonically increasing, nonlinear function that is used to reconstruct the target variable (e.g., temperature; Hanhijärvi et al. 2013). The method compares the order of successive measurements in a proxy time series and then determines relative agreement of this ordering for all proxies. The strength of this agreement is then assumed to be indicative of the relative temporal differences in the target variable. Like CPS, the reconstructed time series is unitless and so must be subsequently scaled to match the units of the climate predictand over some calibration period in which the target and reconstruction time series overlap. The uncertainties are estimated using a bootstrapping technique to resample all data inputs used to generate the nonlinear function and is extensively discussed in Hanhijärvi et al. (2013). In this study, we apply their method using the code provided from the Hanhijärvi et al. (2013) paper in their supplemental material.
The Bayesian hierarchical model of Li et al. (2010) has been referred to as the LNA method (Li et al. 2010; PAGES2k Consortium 2013). It is a linear Bayesian method that produces an ensemble of equally likely temperature time series that best correlates with each proxy record. Here, the LNA method is employed using the same reconstruction parameters as described in the PAGES2k Consortium (2013) analysis. Each record was individually standardized to zero mean and unit variance over the length of the record. This was done only to improve convergence since the model for LNA contains shift, scale, and noise variance parameters for each record and, therefore, standardization is not required (Li et al. 2010; PAGES2k Consortium 2013).
In the LNA method developed by Li et al. (2010), noise is assumed to follow an AR(2) model and the parameters of the autoregression process are estimated through Bayesian inference (Hanhijärvi et al. 2013). However, Hanhijärvi et al. (2013) set the autocorrelation coefficient to zero to simplify the model and to stabilize the calculations. Note that the Bayesian technique provides a distribution of equally likely temperature reconstructions conditional on a fixed data collection, whereas the ensembles used in PCR, CPS, and PaiCo used in this study are based on variants of bootstrapping the predictors and other reconstruction parameters (Hanhijärvi et al. 2013).
Reconstructions Based on Reduced Proxy Networks
Here we provide a comparison of the full R28 reconstruction with the less-replicated R3 and R2 proxy networks available in the pre-1430 period. The R3 network includes the midlatitude tree-ring chronologies from Mount Read and Oroko Swamp and the tropical Palmyra coral record, while the R2 network is made up only of the two tree-ring records from Tasmania and New Zealand. Figure C1 shows that the PCR and CPS methods agree very well, with the biggest differences noted between the R28 and R3 network observed in the LNA and PaiCo results. The amplitude between modern and preindustrial temperatures in the R3 reconstructions is similar to the R28 amplitudes of the PCR and CPS methods (see Fig. 3 in the main text). The results are very similar for the R2 network that excludes Palmyra (Fig. C2).
Figures C3–C6 compare the R28, R3, and R2 reconstructions using the four reconstruction methods. The main difference between the R3 and R2 network in the PCR reconstruction is seen in the twelfth century where the R2 network indicates warmer temperatures than the R3 reconstruction. This suggests that warmer temperatures may have characterized the southern terrestrial parts of the Australian domain represented by Tasmanian and New Zealand tree rings during this period unlike the cooler conditions indicated by the inclusion of Palmyra coral data from the tropical Pacific. However, as we are interested in evaluating combined land and ocean temperatures using HadCRUT3v, we recommend evaluating temperature anomalies discussed in the main text based on the full R28 network against the R3 network that incorporates both terrestrial and ocean proxies to match our target predictand.
All temperature reconstructions for the R2, R3 and R28 networks are available from https://www.ncdc.noaa.gov/paleo/study/12915.
Supplemental information related to this paper is available at the Journals Online website: http://dx.doi.org/10.1175/JCLI-D-13-00781.s1.
This article is included in the Australasian climate over the last 2,000 years: The PAGES AUS2K Synthesis special collection.