Knowledge of the range of precipitation variability and extremes is restricted in regions such as Australia, where instrumental records are short and paleoclimatic records are limited in spatial and temporal extent and resolution. In such comparatively data-poor regions, there is limited context for understanding the statistical unusualness of recently observed extreme events, such as heavy rain and drought, and the influence of stochastic and anthropogenic forcings on their magnitude. This study attempts to further understanding of the range of forced and unforced variability using CMIP5 climate models. Focusing on extremes in the magnitude of monthly, seasonal, and annual precipitation, the distribution of instrumental-period observed precipitation in various Australian regions is compared to simulated precipitation in historical experiments as well as in various long experiments (preindustrial control and Last Millennium) and anthropogenically forced simulations of the twenty-first century (RCP2.6 and RCP8.5). There is no systematic increase in the magnitude of simulated extremes corresponding to the length of model simulations, although many realizations reveal higher magnitude extremes compared to those observed, suggesting that the duration of the instrumental record may not capture the potential severity of stochastically driven extremes. A coherent increase in both wet and dry extremes is simulated throughout Australian regions in high greenhouse gas emissions scenarios, demonstrating a forced hydrological response.
In recent decades, Australia has experienced significant hydrological extremes associated with high costs to socioeconomic and natural systems. Hydrological extremes include the prolonged drought in the Murray–Darling Basin in the early 2000s (van Dijk et al. 2013) and a series of record flood events, such as the January 2011 floods in Queensland, during 2010–12 (King et al. 2013b). Such events have been characterized respectively as the most extreme dry or wet spells on record (see Fig. 1); however, key challenges remain in understanding Australian precipitation extremes. Foremost, the climatological significance of such events is difficult to evaluate quantitatively given the high degree of natural variability in Australian hydroclimates (Nicholls et al. 1997) and the limited length of Australian observations (Trewin 2012). This study aims to understand recent wet and dry extremes in Australia in the context of long-term climatic variability.
Previous studies have used climatic time series derived from proxy records to infer variability on time scales longer than the instrumental record (e.g., Griffin 2015). Such global studies provide useful insights into climatic processes and variability. However, these approaches are limited in some regions by the difficulties in obtaining extended, high-resolution records with sufficient chronological control. This is particularly problematic in Australia, where few terrestrial paleoclimatic records exist (Mills et al. 2013). Hence, it is additionally useful to employ other approaches to understanding the characteristics of precipitation on time scales beyond the short instrumental record, such as by comparing proxy data with results from coupled climate models (Klein et al. 2016). Further studies have focused entirely on model results as insights into long-term natural variability in precipitation and the relative roles of external and stochastic forcings on extremes. For example, Hunt (2006) used millennial-length simulations from a coupled global climate model to investigate the occurrence of annual mean rainfall extremes under unforced conditions.
While model-based studies provide a means of understanding the range of hydroclimatic variability and separating the role of various forcings, they also present their own set of potential limitations. This is demonstrated by Ljungqvist et al.’s (2016) investigation of long-term hydroclimatic variability in the Northern Hemisphere. Comparison of reconstructed hydroclimatic anomalies with model output shows divergence in the simulated intensification of hydroclimatic variability in the twentieth century. In particular, models may generally underestimate hydroclimatic variability or poorly represent subgrid parameterizations. As such, Ljungqvist et al. (2016) recommend that paleoclimate records provide a valuable means to place recent hydrological extremes and trends in a context.
However, Australian-based terrestrial proxy records are typically sparse and of low resolution, and they present chronological challenges (e.g., Neukom and Gergis 2011). Hence, hydrologically sensitive proxy reconstructions alone are of limited capacity for quantitatively understanding short-term extreme precipitation events, such as those examples observed in Australia in recent years (see section 2). Instead, an instrumental observation and model-based approach is adopted to investigate the extremity of recent Australian hydrological events. First, phase 5 of the Coupled Model Intercomparison Project (CMIP5; Taylor et al. 2012) historical simulations are evaluated against instrumental-period observational data for extreme high and low Australian precipitation amounts at monthly, seasonal, and annual time scales. Next, the stationarity of characteristics of extreme precipitation is explored using CMIP5 experiments of the Last Millennium, preindustrial control, and twenty-first century. The present study addresses the question of whether the period encompassed by the instrumental record (1910–present) represents the simulated range of forced and unforced hydroclimatic variability in CMIP5 models.
2. Overview of existing hydrological records
Prior to 1900, instrumental records of Australian precipitation are sparse. Jones et al. (2009) report that prior to 1900, few data existed in central, southern, and western Australia. Documentary evidence and early instrumental records from southeastern Australia extend rainfall histories to 1788 (Fenby and Gergis 2013; Gergis and Ashcroft 2013). Gergis and Ashcroft (2013) combined documentary and historical and instrumental records to create an eastern New South Wales (NSW) drought and wet year index covering 1788–2008. This temporally extended Fenby and Gergis’s (2013) documentary-based chronology of drought and wet years in the region from 1788 to 1860, when meteorological observations became widespread. Nonetheless, these extended sources are geographically and temporally limited and are noted within these studies to contain substantial biases and gaps.
Prior to meteorological observations, Australian paleoclimate records of annual resolution are limited (Neukom and Gergis 2011). Neukom and Gergis (2011) reviewed 174 monthly to annually resolved proxy records from across the Southern Hemisphere and identified key spatiotemporal gaps in data coverage. Next, Dixon et al. (2017) reviewed nonannually resolved paleoclimate records of the Australasian region over the last 2000 years, assessing their length, temporal resolution, dating method, and climatic signals recorded. These lower-resolution records (nonannually resolved) are complementary to high-resolution archives. However, in total, only 22 of 661 potential records are designated as high-quality using the Past Global Changes (PAGES) selection criteria. This study showed that Australian sites have the potential to provide well-dated, high-resolution records, but such records are currently of very low number. Focusing specifically on the Murray–Darling Basin region, Mills et al. (2013) explored the potential of paleoclimate records to elucidate the full range of hydrological variability. Again, the limited number of records from the region hampered a specific reconstruction.
Further studies have assessed the applicability of remotely located records for understanding Australian hydroclimatic variability. A case study for the Murray–Darling Basin identified proxy sites outside the basin that could potentially inform water resource management (Ho et al. 2014). Remote proxies have also been used to “bridge” data gaps for understanding drought variability over several centuries (Palmer et al. 2015) and for linking annually resolved Antarctic ice core records to eastern and southeastern Australian rainfall (Tozer et al. 2016; Vance et al. 2015, 2013). These approaches have significantly advanced the spatial and temporal extent of knowledge of Australian hydroclimatic variability, including that of preinstrumental period droughts (Kiem et al. 2016) and floods (Johnson et al. 2016).
However, remote proxies must be used cautiously when reconstructing local or regional climates for understanding changes in precipitation extremity. In many cases, remote records are investigated in terms of large-scale ocean and atmospheric circulation indices and hence may not reflect the full range of hydrological variability that is not driven by large-scale oceanic conditions (Taschetto et al. 2016). In addition, teleconnections between remote and local climates may not be stationary on the time scales investigated (Gallant et al. 2013; Lewis and LeGrande 2015). Finally, the interpretation of the proxies themselves is inherently complex, and reconstructed estimates of hydroclimatic variability or extremes may be underestimated because of systematic proxy biases (Cook et al. 2016).
In recent years, well-resolved hydrological records have been published from particular Australian areas of interest. For example, a tree-ring record of the last two centuries from northwest Australia demonstrates interannual- to multidecadal-scale variability in seasonal rainfall (O’Donnell et al. 2015). In Tasmania, a tree-ring reconstruction of dam inflow provided a cool season signal, although the authors note that a robust regional reconstruction requires multiple records (Allen et al. 2017). Collectively, various studies hint at a range of Australian hydroclimatic variability exceeding that demonstrated in the instrumental record, including an extreme drought from 1837 to 1841 (Fenby and Gergis 2013). Conversely, a recent multicentury, multiproxy reconstruction reveals that the spatial extent and duration of the Millennium Drought is likely unprecedented in southern Australia over the last 400 years (Freund et al. 2017).
In summary, these studies demonstrate the outstanding difficulties for assessing the statistical unusualness of recently observed extremes in a long-term context using only proxy reconstructions. As such, this study presents model-based insights into forced and unforced hydroclimatic variability.
3. Data and methods
a. Observations and model description
Observed rainfall anomalies are calculated for various regions from the Australia Water Availability Project (AWAP) dataset using a 0.5° × 0.5° horizontal grid (Jones et al. 2009). AWAP rainfall is interpolated from the Australian Bureau of Meteorology’s gauged network into a gridded dataset. The gridded product is suitable for examining trends and extremes in observed Australian rainfall (King et al. 2013a).
A suite of CMIP5 model experiments is used (Table 1), including the historical (1850–2005) and the Last Millennium (past1000, 850–1849 CE) experiments. The historical simulation includes changing anthropogenic (well-mixed greenhouse gases, aerosols, and ozone) and natural (volcanic and solar) forcings, which are imposed to reproduce climate evolution over the twentieth century as accurately as possible. The Last Millennium simulations have reconstructed time-evolving exogenous forcings imposed, including changes in orbital parameters, solar changes, and volcanic aerosols, as well as land use and well-mixed greenhouse gases. Precipitation is also explored in the preindustrial (piControl) simulations. These long control simulations are freely evolving experiments with greenhouse gas concentrations set at levels appropriate for circa 1850 and are described as unforced (Taylor et al. 2012). These experiments allow a large number of unforced model years to be examined.
Finally, precipitation is investigated in two representative concentration pathway (RCP) experiments of the twenty-first century. The first is RCP8.5, a high-emission scenario that is most representative of greenhouse gas emissions from 2005 to present (Peters et al. 2012) and that provides a substantially forced climate in which to examine precipitation responses. In comparison, the RCP2.6 scenario is an aggressive mitigation trajectory, where greenhouse gas emissions peak and decline by the end of the twenty-first century.
Models were used where data were available for each experiment, which yields a significantly reduced set of models (CCSM4, FGOALS-g1.0, FGOALS-s2, GISS-E2-R, IPSL-CM5A-LR, MIROC-ESM, MPI-ESM-P, MRI-CGCM3, BCC_CSM1.1) compared to all CMIP5 models available, as few groups contributed past1000 simulations. Simulated and observed data were processed in a similar manner. Anomalies were calculated relative to the long-term mean (all available years) for monthly, seasonal [December–February (DJF) and June–August (JJA)], and annual precipitation averages in each region.
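The anomaly calculation described above can be sketched as follows. This is a minimal illustration and not the processing code used in the study: the function names, the dictionary layout keyed by (year, month), and the convention of assigning each December to the following year's DJF season are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import fmean


def anomalies(values):
    """Subtract the long-term mean (all available years) from each value."""
    mean = fmean(values)
    return [v - mean for v in values]


def annual_means(monthly):
    """monthly: dict {(year, month): precip}. Returns {year: mean} for complete years."""
    by_year = defaultdict(list)
    for (year, _month), value in monthly.items():
        by_year[year].append(value)
    return {y: fmean(v) for y, v in sorted(by_year.items()) if len(v) == 12}


def seasonal_means(monthly, months):
    """Mean over a season given as a tuple of months, e.g. (12, 1, 2) or (6, 7, 8).

    Assumption: for a season that wraps the year boundary, December is
    grouped with the January and February that follow it.
    """
    by_season = defaultdict(list)
    for (year, month), value in monthly.items():
        if month in months:
            wraps = 12 in months and 1 in months
            season_year = year + 1 if (wraps and month == 12) else year
            by_season[season_year].append(value)
    # Keep only seasons for which every month is present.
    return {y: fmean(v) for y, v in sorted(by_season.items()) if len(v) == len(months)}


# Hypothetical two-year series in which each month's value equals its month number.
example = {(y, m): float(m) for y in (2000, 2001) for m in range(1, 13)}
djf = seasonal_means(example, (12, 1, 2))
jja = seasonal_means(example, (6, 7, 8))
jja_anomalies = anomalies(list(jja.values()))
```

Incomplete seasons at the start and end of a record are dropped here, which is one reasonable choice among several.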
Area-mean precipitation rate anomalies were calculated over land surface for Australia (AUS), southeast (SEA), northeast (NEA), and east (EA) regions (Fig. 1). These regions were chosen for several reasons: they are large areas that are suitable for general circulation model (GCM)-based analysis, they encompass regions that have experienced recent record hydrological events, and they align closely with Australian Bureau of Meteorology–defined climatologically distinct regions. It is noted that this study’s results focus on large spatial scales and may not be applicable necessarily at the catchment scale, nor sensitive to critical topographical differences, such as relating to the Great Dividing Range or along the coast. In addition, although noteworthy hydrological extremes have occurred in the observational record in western Australia (WA), particularly southwestern WA, the current study focuses on large-scale area mean values in EA where observational products are considered most reliable (King et al. 2013a) (see section 5 for further details).
b. Definition of precipitation characteristics
This study does not focus on investigating the incidence of specific observed extreme events (such as the Millennium Drought) in the suite of model experiments. This event-based approach would limit the scope of the study to events already observed. In addition, extremes with very long return times are affected by the sample sizes of such rare events and cannot be robustly investigated in model simulations of varying sizes. Instead, extremes are investigated using quantiles (Ferro et al. 2005; Lee et al. 2013), which offer a more robust approach for investigating rare events without making assumptions about the shape of the distribution or sample size—such as required, for example, for generalized extreme value fitting. Furthermore, quantiles allow differences in portions of the distribution to be identified specifically.
Before assessments of the stationarity of extremes are made for the various experiments, the distributions of monthly and annual precipitation amounts in the historical multimodel ensemble are compared to observed (summarized conceptually in Fig. 2) using several approaches:
A two-sided Kolmogorov–Smirnov (KS) test (at the 5% significance level), which makes few assumptions about the distribution of data and is nonparametric. The KS test reveals major departures of two distributions, testing whether they are drawn from the same population (see Fig. 2a).
An Anderson–Darling (AD) test, which conversely does make assumptions about the specifics of the underlying distribution. However, the AD test is additionally useful, as it is particularly sensitive to departures in the tails of distributions (demonstrated in Fig. 2b), which is the primary focus of this study.
A Perkins skill score (Perkins et al. 2007) measures the cumulative minimum value of the historical and observed distributions, thereby measuring the common area between the two distributions. This score provides a simple measure of the relative similarity of the model and observed distributions, with a score of 0 indicating no overlap and a score of 1 indicating distributions are identical. A Perkins score was calculated for model realizations that were statistically indistinguishable from observations for all regions using both KS and AD tests (Table S1). The score was calculated for area-mean time series using 100 bins.
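As an illustration of two of the measures listed above, the sketch below computes a two-sample KS statistic and a Perkins skill score with 100 bins. It is written under stated assumptions rather than taken from the study: the function names are hypothetical, the 5% KS decision uses the standard large-sample critical value, the histogram bins span the combined range of both samples, and the AD test is not reproduced.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def ks_rejects(a, b, alpha=0.05):
    """Large-sample approximation: reject if D > c(alpha) * sqrt((n + m) / (n * m))."""
    n, m = len(a), len(b)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))  # about 1.358 for alpha = 0.05
    return ks_statistic(a, b) > c_alpha * math.sqrt((n + m) / (n * m))


def perkins_skill_score(model, obs, n_bins=100):
    """Sum over bins of the smaller of the two relative frequencies (0 to 1)."""
    lo = min(min(model), min(obs))
    hi = max(max(model), max(obs))
    width = (hi - lo) / n_bins or 1.0  # guard against identical constant samples

    def relative_frequencies(sample):
        counts = [0] * n_bins
        for value in sample:
            index = min(int((value - lo) / width), n_bins - 1)
            counts[index] += 1
        return [c / len(sample) for c in counts]

    pairs = zip(relative_frequencies(model), relative_frequencies(obs))
    return sum(min(m, o) for m, o in pairs)


# Hypothetical samples: identical distributions overlap fully, disjoint ones not at all.
same = [float(i) for i in range(50)]
far = [float(i) for i in range(100, 110)]
near = [float(i) for i in range(10)]
```

A score near 1 says only that the two histograms overlap almost completely; it does not say which part of the distribution accounts for any shortfall.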
Next, the various experiments (piControl, past1000, RCP8.5, and RCP2.6) are compared to the historical experiment using the above tests.
While these tests have the potential to indicate differences in distributions, they are only broadly insightful. For example, a lower Perkins skill score may be calculated when comparing observations and historical simulations than when comparing observations and past1000 simulations. In this case, the specific part (low, high, or both tails) of the distribution that is not represented in the historical simulation is not clear. That is, the Perkins score is purely a measure of the degree of overlap, and precisely where possible deviations may exist is not elucidated by this value alone. As such, quantiles are used to examine the specific location of any identified difference in distributions. The p quantile (100p percentile) of a distribution is the value below which a proportion (p) of the probability falls. Quantile values for 1, 5, 10, 25, 50, 75, 90, 95, and 99 are calculated for monthly, seasonal, and annual values for each region in each experiment. As few modeling groups provided past1000 simulations, all available realizations are used for comparing quantile values across the experiments, regardless of AD and KS test results.
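The quantile calculation can be sketched as below. The interpolation rule (linear interpolation between order statistics) is an assumption, since the text does not state which estimator was used, and the names `quantile` and `quantile_table` are introduced here for illustration only.

```python
PERCENTILES = (1, 5, 10, 25, 50, 75, 90, 95, 99)


def quantile(values, p):
    """p quantile (0 <= p <= 1) by linear interpolation between order statistics."""
    xs = sorted(values)
    position = p * (len(xs) - 1)
    lower = int(position)
    upper = min(lower + 1, len(xs) - 1)
    return xs[lower] + (position - lower) * (xs[upper] - xs[lower])


def quantile_table(values):
    """Quantile values at the percentiles examined in the text."""
    return {pct: quantile(values, pct / 100) for pct in PERCENTILES}


# Hypothetical anomaly series: the integers 0..100, so each percentile equals its label.
table = quantile_table([float(i) for i in range(101)])
```

Comparing such tables between experiments shows whether a difference sits in the lower tail, the upper tail, or the centre of the distribution.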
Changes in quantiles in the various experiments are used to evaluate potential nonstationarities in Australian precipitation that may warrant further statistical investigation (Marani and Zanetti 2015; Serinaldi 2010). While a formal definition of stationarity in a time series requires simply a constant mean, finite variance, and an autocorrelation dependent only on relative position in a time series (von Storch and Zwiers 2007), such formal definitions are limited in providing insights into the extremes of precipitation.
4. Results
a. Historical precipitation distributions
Historical realizations were robustly comparable to observed using a KS test (Table S1), although models vary in skill relative to observed according to the extremes-sensitive AD test. In this case, 11 model realizations were different from observed in a statistically significant sense and were excluded from further analysis; 43 historical realizations were included. Next, the ensemble was compared to observations with a Perkins score (Fig. 3). The simulated distributions of monthly precipitation are largely more skillful than those for seasonal averages. Median skill scores from the 43-member ensemble range from 0.83 to 0.86 across the four analyzed regions. The lowest model skill occurs in the austral winter (JJA), where skill values of 0.5 to 0.55 are calculated.
Although the calculated model skill varies with time averages considered, a useful comparison can be made between the various CMIP5 experiments (Fig. 3). For monthly precipitation, the median skill score for Australian average values in the past1000 realizations compared to observed is lower than the equivalent historical comparison with observed, though it lies within the 5th–95th percentile range. For annual average precipitation, the past1000 overlap with observed distribution is outside the historical range for Australia and northeastern regions. For austral summer precipitation, the piControl simulations exhibit a similar degree of difference; skill scores relative to observed are lower than the skill score between historical simulations and observations.
The difference in skill scores calculated for the various experiments does not, alone, provide clear insight as to whether differences in distributions are related to the greater length of precipitation simulated in the piControl and past1000 simulations, or due to the forcings imposed in each experiment. In addition, skill score comparisons alone do not provide insight into the precise nature of departures in the simulated distributions from observed. A difference in skill score may result from a change in mean, variance, or higher-order moments of the distribution. These possibilities are next explored using a quantile analysis.
b. Stationarity of Australian extremes in long simulations
The stationarity of extremes is assessed in several steps. First, lower- (Figs. 4 and 6) and upper-quantile values (Figs. 5 and 7) are compared to observations in each region for monthly (Figs. 4 and 5) and seasonal (Figs. 6 and 7) precipitation anomalies for various experiments. In these cases, the same number of observational and modeled samples is compared. Because the past1000 and piControl simulations are substantially longer than the historical simulations or observations, they are resampled into equivalent-sized blocks using a bootstrap technique: 10 000 time series of 100-yr length were synthesized from the past1000 and piControl ensemble data, and quantiles were calculated for each synthesized series. A median value of the bootstrapped data was calculated for each quantile and compared to the ensemble median historical and observed values.
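A minimal sketch of this resampling step is given below. The text does not specify how the synthetic series were drawn, so drawing individual values with replacement is an assumption, as are the function names, the fixed random seed, and the linear-interpolation quantile estimator.

```python
import random
from statistics import median


def quantile(values, p):
    """p quantile (0 <= p <= 1) by linear interpolation between order statistics."""
    xs = sorted(values)
    position = p * (len(xs) - 1)
    lower = int(position)
    upper = min(lower + 1, len(xs) - 1)
    return xs[lower] + (position - lower) * (xs[upper] - xs[lower])


def bootstrap_median_quantile(series, p, length=100, n_boot=10_000, seed=0):
    """Median, over n_boot synthetic series, of the p quantile of each series.

    Assumption: each synthetic series holds `length` values drawn with
    replacement from the long simulation (e.g. 100 annual values).
    """
    rng = random.Random(seed)
    estimates = [
        quantile(rng.choices(series, k=length), p) for _ in range(n_boot)
    ]
    return median(estimates)


# Hypothetical long control run: 1000 annual anomalies spread evenly over 0..999.
long_run = [float(i) for i in range(1000)]
median_q50 = bootstrap_median_quantile(long_run, 0.5, length=100, n_boot=200)
```

Drawing values independently discards any year-to-year persistence; a block bootstrap would be a natural alternative if that persistence mattered for the quantiles of interest.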
There is no apparent systematic difference between modeled and observed extremes in precipitation. For monthly mean precipitation, 99th-percentile values of observed precipitation are higher than those simulated in piControl and past1000 simulations, and 1st-percentile values are largely similar in observations and these experiments. For annual average precipitation, more extreme 99th- and 1st-percentile values are simulated for Australian-area average values. However, this effect is not robust across regions.
Next, the statistics calculated for the full length of past1000 and piControl realizations in each model are compared to observations (Fig. 8). Here, quantile values are calculated in each realization and directly compared to observed for monthly precipitation in each region. For each quantile, a spread of values is simulated in both long experiments, with some realizations exhibiting less-extreme quantile values and some more extreme than observed. This is consistent across each region, indicating that there is no systematic increase in the magnitude of extreme monthly or annual precipitation events (defined as 1st- and 99th-percentile values) due to an increase in the length of climate being simulated. In addition, there is no significant difference in spread or magnitude of extremes simulated between the preindustrial and Last Millennium simulations. Overall, there is no clear evidence for nonstationarity of precipitation extremes in Australian regions over long simulations, although a greater range of extremes may be produced by increasing the length of period simulated (Benestad 2003). Hence, a greater range of extremes may occur with a greater length of observations, as may be suggested by some preinstrumental records and proxy reconstructions (Fenby and Gergis 2013; Cook et al. 2016).
c. Future extremes
The statistics of observed precipitation are compared with those simulated under the stronger external forcings imposed in representative concentration pathway experiments. Quantile values for RCP2.6 and RCP8.5 are shown in Figs. 4–7 as red plot markers. In these simulations, where a substantially higher greenhouse gas forcing is imposed compared to the historical simulation (which terminates in 2005), wetter months than observed occur in all regions (Fig. 5). On a seasonal basis, results are variable regionally. For AUS and NEA, more extreme low winter precipitation is simulated in RCP experiments (Fig. S1), as well as more extreme heavy winter precipitation in all analyzed regions except SEA (Fig. S2). In summer, higher 99th-percentile values are simulated for SEA in future scenarios compared to observed (Fig. S4). These results are regionally and seasonally dependent and depend on the quantile examined.
Indices defined by the Expert Team on Climate Change Detection and Indices (ETCCDI) (Zhang et al. 2011) provide another means to evaluate changes in climatic extremes. In particular, dry and wet extremes can be investigated using the consecutive dry days (CDD) and maximum 5-day precipitation (Rx5day) indices. The CDD index is the length of the longest run of consecutive dry days in a year, where dry days are defined as days on which precipitation is less than 1 mm. The Rx5day index is the maximum precipitation accumulation over any 5 consecutive days within a month (or year). ETCCDI indices are calculated for various CMIP5 simulations and available through the Canadian Centre for Climate Modelling and Analysis (Sillmann et al. 2013).
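The two index definitions can be illustrated with a short sketch that operates on a single period of daily precipitation totals. The function names and the example data are hypothetical, and the sketch does not handle dry spells that continue across the boundary between periods, which operational ETCCDI implementations may treat differently.

```python
def cdd(daily_precip_mm, dry_threshold=1.0):
    """Length of the longest run of days with precipitation below the threshold."""
    longest = current = 0
    for amount in daily_precip_mm:
        current = current + 1 if amount < dry_threshold else 0
        longest = max(longest, current)
    return longest


def rx5day(daily_precip_mm, window=5):
    """Largest precipitation total over any `window` consecutive days."""
    if len(daily_precip_mm) < window:
        raise ValueError("need at least `window` days of data")
    running = sum(daily_precip_mm[:window])
    best = running
    for i in range(window, len(daily_precip_mm)):
        # Slide the window forward by one day.
        running += daily_precip_mm[i] - daily_precip_mm[i - window]
        best = max(best, running)
    return best


# Hypothetical 12-day record (mm per day).
daily = [0, 0.5, 2, 0, 0, 0, 10, 20, 5, 0, 0, 1]
longest_dry_spell = cdd(daily)   # the three zero days after the 2 mm day
wettest_five_days = rx5day(daily)  # any 5-day window containing 10 + 20 + 5
```

Note that a day with exactly 1 mm counts as wet under the "less than 1 mm" definition, so it ends a dry spell.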
The CDD and Rx5day values are compared for the last 30 years of the historical (1976–2005) and the RCP2.6 and RCP8.5 (2071–2100) simulations. Again, a quantile approach is applied to the simulated distributions of CDD (Fig. 9) and Rx5day (Fig. 10) indices. In all regions, the median and 75th-percentile values of CDD are higher in the RCP8.5 experiments than in the historical. Further comparison between historical and RCP8.5 upper-quantile (90, 95, and 99) values demonstrates a significant simulated increase in dry spells under the enhanced greenhouse gas warming of RCP8.5 for all regions (not shown). For Rx5day values, the increase in RCP8.5 values relative to historical is even more pronounced (Fig. 10). This indicates that both wet and dry precipitation extremes in Australian regions are enhanced by strong greenhouse gas forcings. Sillmann et al. (2013) determined that multimodel median CDD and Rx5day increased across Australia in the CMIP5 RCP8.5 scenario. In contrast, the range of simulated wet and dry extremes at the end of the twenty-first century in the RCP2.6 peak and decline trajectory is significantly lower than in the RCP8.5 scenario and more similar to that of historical years 1976–2005 (Figs. 9 and 10).
5. Discussion and conclusions
This study has explored the characteristics of precipitation extremes in Australia and drivers of change in CMIP5 experiments using several approaches. Observations and model simulations of the historical experiment over the same period were compared using a set of statistics, including KS and AD tests and a Perkins skill score. When past1000 and piControl simulations were compared to observations, lower Perkins skill scores were obtained, demonstrating that these simulations are less similar to observed. However, the nature of these differences between simulations is difficult to characterize through exploring upper and lower extremes.
While a wider range of stochastic precipitation variability than observed may be suggested by the comparatively lower skill scores calculated for past1000 and piControl than historical experiments, this is not clearly demonstrated in the quantile values examined here. The 1, 5, 10, 90, 95, and 99 quantile values calculated for monthly, seasonal, and annual precipitation in past1000 and piControl simulations do not indicate a systematic difference from observed values or nonstationarity in extremes with increasing length of time series. That is, some model realizations simulate statistically significant (at the 95% confidence level), higher-magnitude extremity in lower- and upper-precipitation extremes, compared to observed, while other realizations demonstrate lower magnitude extremes than observed (Fig. 8).
A previous study by Hunt (2006) explored the characteristics of rainfall in various regions using a 10 000-yr simulation of present-day climate. This study determined that the frequency of occurrence of various rainfall anomalies in the long simulations could not be obtained within the limited duration of the observed record. Furthermore, a larger range of natural fluctuations (defined as a particular number of standard deviations above normal) was displayed in millennial-length simulations than observed. This apparent difference in results from the present study may result from several possibilities. First, the current study focuses on the magnitude of extreme monthly, seasonal, and annual precipitation events rather than on frequency. In addition, a multimodel suite of shorter forced and unforced simulations is used, rather than a single multimillennial realization. These results may not be contradictory; an increase in frequency of heavy or dry precipitation events may occur over an extended duration of observations or simulation that does not correspond to a systematic increase in the magnitude of highly anomalous extremes.
It is also worth noting that the current study utilizes only model results and does not explicitly integrate these simulated precipitation data with proxy reconstructions. Prior studies that explore model–data comparisons over various periods find overall agreement between CMIP5 models and reconstructed rainfall fields during the preindustrial period (Ljungqvist et al. 2016). Ljungqvist et al. (2016) found that proxy records did not reveal the intensification of twentieth-century mean hydroclimatic anomalies associated with increases in wet and dry extreme conditions that was demonstrated in CMIP5. This difference between model and proxy results is attributed to some combination of biases in subgrid model parameterizations, model temperature biases, or a bias in overall internal variability.
The current study of Australian precipitation evaluated models against the AWAP dataset through several metrics that examined aspects of precipitation distributions, but it did not explicitly account for model or observed biases. Although AWAP rainfall is a high-quality, observationally based data product, it has notable limitations. Jones et al. (2009) suggest that AWAP accuracy is lower in areas of sparse network coverage, particularly for daily values. Further biases may be introduced from the interpolation of station data onto a horizontal grid (Tozer et al. 2012). King et al. (2013a) specifically examined the suitability of AWAP for understanding extreme values and concluded that while the dataset is suitable for extreme analysis, masking of data-poor regions was recommended. Chubb et al. (2016) also note the limitation of AWAP in local areas of complex topography and heavy snowfall. These data-poor regions, namely, central and western Australia, were excluded from analysis here, and large spatial areas are considered in order to minimize observational uncertainties. Furthermore, this study focused in the first instance on large spatial areas. Additional insight into precipitation extremes in critical areas may be provided using regional modeling or downscaling, or by focusing on Natural Resource Management (NRM) regions.
Further uncertainties in model–observation comparisons may occur from models, including from systematic biases such as excessive low precipitation in models (Stephens et al. 2010). Previous studies of Australian rainfall have excluded models from analysis based on their comparatively poor representations of observed Australian rainfall variability, large-scale atmospheric processes, or teleconnected relationships (Lewis and Karoly 2014). Further studies focused on future projections have identified CMIP5-participating models that perform best and worst for a suite of climate metrics (GISS-E2-R, IPSL-CM5A-LR, and MIROC-ESM; Moise et al. 2015). Although none of these identified worst-performing models for precipitation simulation are used here, in other instances BCC_CSM1.1 and GISS-E2-R have been excluded from analysis because of their El Niño–Southern Oscillation representation (Brown et al. 2015; Lewis and LeGrande 2015). Additionally, MIROC-ESM has been identified as containing possible drifts in surface temperatures in long simulations (Lewis and LeGrande 2015; Gupta et al. 2013), which were not removed from distributions in the present study.
This study included all models that contributed past1000 simulations to CMIP5 and did not exclude models based on their skill in capturing observed precipitation variability. This approach, which did however exclude particular ensemble members, provided the requisite large sample size for examining precipitation, given the small number of past1000 simulations contributed to CMIP5. In addition, an identical set of models was examined for each experiment, and hence possible biases in model representations of precipitation compared to observations are not considered inherently prohibitive to investigating changes in the stability of precipitation. Finally, the similarity of past1000 CMIP5 simulations of precipitation with reconstructed Northern Hemisphere hydroclimates, as determined by Ljungqvist et al. (2016), suggests that despite possible biases, models provide useful insights into the characteristics of preinstrumental hydroclimates.
In general, models tend to show a small degree of variability in precipitation in response to natural external and internal forcings on regional and global scales. For example, an investigation of interannual climatic variability in Last Millennium simulations showed that compared to temperatures, precipitation was largely insensitive to external natural forcings, such as volcanic eruptions (Kai-Qing and Da-Bang 2015). The time-evolving volcanic forcing is arguably the primary difference between CMIP5 piControl and past1000 experiments, which were not found to be notably distinct here in terms of Australian precipitation characteristics.
A further study using CCSM4 Last Millennium simulations demonstrates that, unlike surface temperature responses, hydroclimatic differences between periods of the last 1000 years are not statistically significant, including when comparing the Medieval Climate Anomaly and Little Ice Age (Landrum et al. 2013). In contrast to hydroclimatic responses to natural forcings, a clear response of Australian precipitation to future anthropogenic forcings is simulated in CMIP5 RCP scenarios, including an increase in both wet and dry extremes. Such an intensification of the hydrological cycle has been widely reported and is expected from thermodynamic arguments (Donat et al. 2016).
The current study has attempted to determine whether the period encompassed by the instrumental records fully captures the range of hydroclimatic variability that can be expected under forced and unforced conditions. A greater range of upper- and lower-tail precipitation extremes is unambiguously simulated in anthropogenically forced simulations, while potential changes in the magnitude of precipitation extremes in unforced simulations are less clear. Although there is no systematic increase in wet or dry monthly or annual extreme precipitation anomalies in long simulations (piControl and past1000), many contributing model realizations demonstrate more extreme dry and wet anomalies than those observed for the various time averages and regions examined. This may indicate that our understanding of natural climate variability is limited by the short duration of observations.
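The comparison summarized above can be sketched as a simple count of how many model realizations contain a more extreme wet or dry anomaly than the observed record. This is an illustrative sketch only, not the analysis code of this study: the function name and inputs are hypothetical, and anomalies are assumed to be expressed relative to a common baseline so that observed and simulated values are directly comparable.

```python
import numpy as np

def fraction_exceeding_observed(observed, realizations):
    """Fraction of realizations with more extreme anomalies than observed.

    observed     : 1-D sequence of observed precipitation anomalies
    realizations : iterable of 1-D sequences, one per model realization
    Returns (fraction with a wetter maximum, fraction with a drier minimum).
    """
    obs = np.asarray(observed, dtype=float)
    obs_wet, obs_dry = np.nanmax(obs), np.nanmin(obs)
    # One boolean per realization for each tail of the distribution
    wetter = [np.nanmax(np.asarray(r, dtype=float)) > obs_wet for r in realizations]
    drier = [np.nanmin(np.asarray(r, dtype=float)) < obs_dry for r in realizations]
    return float(np.mean(wetter)), float(np.mean(drier))
```

Because a longer series is more likely to sample the tails of the same underlying distribution, fractions well above zero are consistent with, but do not by themselves demonstrate, an observational record too short to capture the full range of stochastic variability.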
This result is derived from the CMIP5 model-based analysis undertaken here. Further work could integrate model results with recently available high-resolution reconstructions for specific regions and event types (e.g., extended drought in southern Australia; Freund et al. 2017) as the next step toward understanding the potential magnitude of hydroclimatic extremes in Australia, as well as other characteristics of extremes such as event frequency, duration, or spatial extent.
This work was supported by ARC DECRA 160100092 and the NCI National Facility. I thank the Australian Bureau of Meteorology, Bureau of Rural Sciences, and CSIRO for providing AWAP data, and acknowledge the WCRP’s Working Group on Coupled Modelling, which is responsible for CMIP. The U.S. Department of Energy’s PCMDI provides CMIP5 coordinating support.
Supplemental information related to this paper is available at the Journals Online website: https://doi.org/10.1175/JCLI-D-17-0393.s1.
Criteria: (i) The proxy must be related to one or more climate variables, as stated in a peer-reviewed publication. (ii) The record must extend continuously for at least 500 years out of the last 2000 years. (iii) The record must have an age model based on at least two to three chronological anchors. (iv) The record must have an average sample resolution between 2 and 50 years per sample or analyses. (v) The collection location must fall within the region that has been identified by PAGES Aus2k to influence Australasian climate (10°N–80°S, 90°E–140°W). The Australasian region includes tropical Southeast Asia because of the dynamical influences of the Indo–Pacific region on the Australasian monsoon.
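The five criteria above can be expressed as a simple screening function. The sketch below is illustrative only and rests on several assumptions that are not part of the stated criteria: the record fields are hypothetical, "the last 2000 years" is taken as 1 to 2000 CE, a record is assumed to be continuous between its start and end years, and the "two to three chronological anchors" requirement is implemented as a minimum of two.

```python
from dataclasses import dataclass

@dataclass
class ProxyRecord:
    # All field names are hypothetical, chosen for illustration
    name: str
    peer_reviewed_climate_link: bool  # criterion (i)
    start_year: float                 # CE
    end_year: float                   # CE
    n_age_anchors: int                # chronological anchors in the age model
    mean_resolution_years: float      # average years per sample
    lat: float                        # degrees north (negative = south)
    lon: float                        # degrees east, either -180..180 or 0..360

def in_aus2k_domain(lat, lon):
    """Criterion (v): 10N-80S, 90E-140W (the box crosses the date line)."""
    lon360 = lon % 360.0  # 140W becomes 220E
    return -80.0 <= lat <= 10.0 and 90.0 <= lon360 <= 220.0

def passes_screening(rec, window_start=1.0, window_end=2000.0, min_anchors=2):
    """Apply criteria (i)-(v); window and anchor minimum are assumptions."""
    # Years of the record falling inside the assumed 2000-year window
    overlap = min(rec.end_year, window_end) - max(rec.start_year, window_start)
    return (
        rec.peer_reviewed_climate_link                 # (i)
        and overlap >= 500.0                           # (ii)
        and rec.n_age_anchors >= min_anchors           # (iii)
        and 2.0 <= rec.mean_resolution_years <= 50.0   # (iv)
        and in_aus2k_domain(rec.lat, rec.lon)          # (v)
    )
```

Converting longitudes to the 0 to 360 range keeps the domain test a single interval even though the region spans the date line.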