## Abstract

A hierarchical spatial model for daily rainfall extremes that characterizes their temporal variation due to interannual climatic forcing as well as their spatial pattern is proposed. The model treats the parameters of at-site probability distributions for rainfall extremes as “data” that are likely to be spatially correlated and driven by atmospheric forcing. The method is applied to daily rainfall extremes for summer and winter half years over the Swan–Avon River basin in Western Australia. Two techniques for the characterization of at-site extremes—peaks-over-threshold (POT) analysis and the generalized extreme value (GEV) distribution—and three climatic drivers—the El Niño–Southern Oscillation as measured by the Southern Oscillation index (SOI), the Southern Hemisphere annular mode as measured by an Antarctic Oscillation index (AOI), and solar irradiance (SI)—were considered. The POT analysis of at-site extremes revealed that at-site thresholds lacked spatial coherence, making it difficult to determine a smooth spatial surface for the threshold parameter. In contrast, the GEV-based analysis indicated smooth spatial patterns in daily rainfall extremes that are consistent with the predominant orientation of storm tracks over the study area and the presence of a coastal escarpment near the western edge of the basin. It also indicated a linkage between temporal trends in daily rainfall extremes and those of the SOI and AOI. By applying the spatial models to winter and summer extreme rainfalls separately, an apparent increasing trend in return levels of summer rainfall to the northwest and decreasing trends in return levels of winter rainfall to the southwest of the region are found.

## 1. Introduction

Weather-related natural disasters, mostly flooding, have caused death and destruction in many parts of the world. In the United States, the 1993 Midwest flood was one of the most damaging natural disasters on record, with damages totaling $15 billion and the loss of 50 human lives (Larson 1997). In Europe, there are many examples from the past two decades of widespread river flooding caused by extreme rainfall events, including the August 2002 flood in the Elbe River basin and parts of the Danube basin. Flood damage of this magnitude had never occurred in Europe before (Becker and Grünewald 2003). Given a trend of significant increases in extreme precipitation in most parts of the world, including much of Australia (e.g., Karl et al. 1995; Suppiah and Hennessy 1998), there is a need to understand and characterize the spatial patterns of these extreme events and to identify possible climatic predictors.

The severity and the frequency of extreme rainfall events have increased or are expected to increase because of enhanced global warming (Bernstein 2007; Pittock 2003). The Fourth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC) suggests that increasing atmospheric concentrations of greenhouse gases will result in changes in the frequency, intensity, and duration of extreme events. Many of these projected changes may lead to increased risks of floods and droughts in many regions, resulting in adverse effects on ecological systems (Bernstein 2007).

Changes in the frequency and magnitude of extreme rainfall events are not well understood, and there are significant gaps in the scientific literature (e.g., Angel and Huff 1997). Existing methods provide many approaches to characterizing at-site extremes, but published research on the spatial modeling of extremes in terms of their climatic forcing is not very common (see, e.g., Zhang et al. 2001; Kahana et al. 2004). This is an important step because it offers potential predictability, both on short and long time scales.

There is also very little published work on modeling extremes in spatial networks such as rainfall stations (Coles and Tawn 1996). Because neighboring rainfall stations can be expected to experience similar weather sequences, it would be unrealistic to model rainfall stations as independent of one another (see, e.g., Zhang et al. 2001). Some earlier studies have been carried out on “homogeneous” regions, where all the annual maximum series have similar properties, implying that these rain events were produced by similar storm systems (e.g., Nguyen et al. 2002). However, Schaefer (1990), Alila (1999), and others have found that the coefficient of variation and skewness can vary systematically with mean annual precipitation across a specified homogeneous region, and they have called for the development of a continuous spatial model.

The aim of this paper is to develop a hierarchical spatial model for rainfall extremes in a network of rainfall stations that is linked to climatic predictors. This implicitly assumes a nonstationary framework, and concepts such as return period become problematical. We shall take a level corresponding to a return period of *N* years to mean that it is the level exceeded in any one year with probability 1/*N*; this allows for the return level to vary annually but with a fixed probability. As a case study, we apply our model to the Swan–Avon River basin in Western Australia (WA). The study is designed to (i) determine whether there have been temporal changes in extreme rainfall over the basin, (ii) characterize the nature of any changes, and (iii) determine whether there are any candidate drivers for these temporal trends. We analyzed rainfall extremes for each station, selecting events using theoretically determined statistical properties of extremes and developing spatial models for the summer half-year (November–April) and winter half-year (May–October) rainfall.

## 2. Methodology

### a. A spatial modeling approach

Suppose we have a collection of daily rainfall series taken from *k* different stations within a spatial domain. The rainfall series at location *s* ∈ *S* = {1, …, *k*} has *n*_{st} observations {*X*(*s*, *t*, *d*) : *d* = 1, …, *n*_{st}} for day *d* within year *t* ∈ *T* = (1, …, *n*_{s}). Let *n*_{ω}(*s*, *t*) denote the number of extreme values of *X* in season *ω* (such as the winter or summer half year) at station *s* and year *t*, with corresponding observations *Y*_{ω}(*s*, *t*) = {*Y*_{ω}^{i}(*s*, *t*) : *i* = 1, …, *n*_{ω}(*s*, *t*)}. For convenience we write *Y*_{ω} = {*Y*_{ω}(*s*, *t*) : *s* ∈ *S*, *t* ∈ *T*}. In a peaks-over-threshold (POT) approach, we would select the members of *Y*_{ω} as the excesses of *X* over some threshold at each station and year, whereas in a block maxima approach, we would retain only the maximum values of *X*.
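The two selection rules can be illustrated with a minimal Python sketch. The synthetic daily values, the 20-mm threshold, and the function names are assumptions made purely for illustration; none of them is taken from the study.

```python
import random

def block_maxima(x_by_year):
    """Block maxima: keep only the largest daily value in each year."""
    return {year: max(days) for year, days in x_by_year.items()}

def threshold_excesses(x_by_year, u):
    """POT: keep the excess X - u for every day on which X exceeds u."""
    return {year: [x - u for x in days if x > u]
            for year, days in x_by_year.items()}

# Synthetic daily rainfall (mm) for three years; illustrative values only.
rng = random.Random(0)
x_by_year = {year: [rng.expovariate(1 / 5.0) for _ in range(365)]
             for year in (2001, 2002, 2003)}

maxima = block_maxima(x_by_year)                  # one value per year
excesses = threshold_excesses(x_by_year, u=20.0)  # zero or more values per year
n_extremes = {year: len(v) for year, v in excesses.items()}  # plays the role of n(s, t)
```

The block maxima rule always yields one value per year, whereas the number of threshold excesses varies from year to year and depends on the chosen threshold.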

We assume an at-site cumulative distribution function *F* such that Pr[*Y*(*s*, *t*) ≤ *y*] = *F*[*y*|Θ(*s*, *t*)], for parameters Θ = {Θ(*s*, *t*) : *s* ∈ *S*, *t* ∈ *T*} that are allowed to vary in space and time. We assume that Θ(*s*, *t*) is *p*-dimensional, and we denote the *i*th element Θ_{i}(*s*, *t*). Conditional on these parameters, we assume that the observations are independent, so that the likelihood for the observations becomes
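Written out under the conditional independence assumption, and taking *f* to denote the density corresponding to *F* (a symbol introduced here for illustration, with the season subscript suppressed as in the text), the likelihood is a product over stations, years, and events:

$$
L(\Theta \mid Y) \;=\; \prod_{s \in S} \prod_{t \in T} \prod_{i=1}^{n(s,t)} f\!\left[\, Y^{i}(s,t) \mid \Theta(s,t) \,\right].
$$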

A spatial model is able to capture functions of extremes in the landscape, such as the spatial return level surface for a given period, and interpolation to ungauged sites within the domain of the data. These advantages make a spatial model superior to the standard site-by-site approach (see, e.g., Schaefer 1990). We, therefore, seek to incorporate spatial correlation among the parameters and also forcing in time due to climate variables driving extreme rainfall. We decompose the at-site parameters as follows:
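One general form consistent with the definitions that follow is an additive decomposition into a climate-driven mean and a spatially correlated error. The argument list of *f*_{i}, and the use of 𝗔_{t} for the row of 𝗔 belonging to year *t*, are assumptions made for this sketch:

$$
\Theta_{i}(s,t) \;=\; f_{i}\!\left[\mathsf{A}_{t},\, \beta_{i}(s)\right] + \varepsilon_{i}(s,t), \qquad i = 1, \ldots, p,
$$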

where the function *f*_{i} links a matrix of climate predictors 𝗔 (variables are columns and rows are observations in time) to the at-site parameters *β*. For example, a multiple linear regression model incorporating a spatially dependent linear trend in time might take the form
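As a sketch only, with *q* climate predictors and *A*_{tj} denoting the value of predictor *j* in year *t* (the coefficient labels are assumptions for illustration, not necessarily the authors' notation), such a model could be written as

$$
\Theta_{i}(s,t) \;=\; \beta_{i0}(s) + \beta_{i1}(s)\, t + \sum_{j=1}^{q} \beta_{i,j+1}(s)\, A_{tj} + \varepsilon_{i}(s,t),
$$

in which *β*_{i1}(*s*) is the station-specific slope of the time trend.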

The errors {ɛ_{i}(*s*, *t*)} are assumed to be zero-mean Gaussian random variables correlated in space but not in time. We also allow cross correlation between the at-site parameters. The Gaussian assumption could cause problems for parameters that have a restricted range, which could be dealt with via transformation. We examine this question in the case study.

### b. Choice of at-site distribution

The spatial model described by Eqs. (1) and (2) uses an at-site extremes distribution function *F*, to be chosen in any application. In the literature the two main distributions used are (i) the generalized Pareto distribution (GPD), which is a model for the excess over a chosen threshold and is commonly referred to as the POT model; and (ii) the generalized extreme value (GEV) distribution used for modeling block maxima (e.g., seasonal maxima).

The POT model assumes that successive rainfall peaks have independent magnitudes and that the number of peaks per year *N*_{p} exceeding a sufficiently high threshold *u*_{ω} during a period of *n*_{s} years has a Poisson distribution with mean *λ*(*u*_{ω}). The limiting distribution for the exceedances *Y* = *X* − *u*_{ω} is the GPD (Pickands 1975), having cumulative distribution function

$$
F(y \mid \mu, \kappa) \;=\;
\begin{cases}
1 - \left(1 - \kappa\, y / \mu \right)^{1/\kappa}, & \kappa \neq 0, \\
1 - \exp\left(-y/\mu\right), & \kappa = 0,
\end{cases}
$$

where *μ* and *κ* are the scale and shape parameters, respectively, and 0 ≤ *y* < ∞ if *κ* ≤ 0, otherwise 0 ≤ *y* ≤ *μ*/*κ* (Hosking and Wallis 1987; Choulakian and Stephens 2001).
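The following Python sketch evaluates the GPD in the scale and shape parameterization whose support is stated above, and converts a Poisson rate and GPD parameters into a return level. It assumes that the annual maximum above the threshold has distribution function exp{−*λ*[1 − *F*(*y*)]}, which follows from the Poisson and independence assumptions. The function names and numerical values are hypothetical.

```python
import math

def gpd_cdf(y, mu, kappa):
    """GPD cdf with scale mu and shape kappa (upper end point mu/kappa if kappa > 0)."""
    if y <= 0:
        return 0.0
    if kappa == 0:
        return 1.0 - math.exp(-y / mu)
    base = 1.0 - kappa * y / mu
    if base <= 0:  # y lies beyond the upper end point (only possible for kappa > 0)
        return 1.0
    return 1.0 - base ** (1.0 / kappa)

def pot_return_level(n_years, u, lam, mu, kappa):
    """Level exceeded in any one year with probability 1/n_years.

    Assumes Poisson(lam) exceedances of u per year with GPD excesses.
    Valid when the level lies above the threshold, i.e. m < lam below.
    """
    m = -math.log(1.0 - 1.0 / n_years)  # yearly exceedance rate required at the level
    if kappa == 0:
        return u + mu * math.log(lam / m)
    return u + (mu / kappa) * (1.0 - (m / lam) ** kappa)

# Hypothetical parameter values, for illustration only.
level = pot_return_level(30, u=15.0, lam=4.0, mu=8.0, kappa=-0.1)
```

Using −log(1 − 1/*N*) rather than 1/*N* for the required rate keeps the sketch consistent with a return level defined as the value exceeded in any one year with probability 1/*N*.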

The generalized extreme value model for the seasonal maxima has parameters Θ = (*μ*, *σ*, *γ*)′, representing location, scale, and shape, respectively. The cumulative distribution function is given by
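Assuming the parameterization and sign convention of Coles (2001), which is cited elsewhere in the text but is not confirmed here for the shape parameter, the GEV distribution function can be written as

$$
F(y \mid \mu, \sigma, \gamma) \;=\; \exp\!\left\{ -\left[ 1 + \gamma \left( \frac{y - \mu}{\sigma} \right) \right]^{-1/\gamma} \right\},
\qquad 1 + \gamma\,\frac{y - \mu}{\sigma} > 0, \quad \sigma > 0,
$$

with the Gumbel form exp{−exp[−(*y* − *μ*)/*σ*]} obtained in the limit *γ* → 0.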

Nguyen et al. (2002) point out the GEV approach is more common than the GPD approach because of its simpler structure, although the GPD approach is commonly used for flood frequency analysis; it is also known as the partial duration series. POT has a potential advantage because it uses more information than the single value per block in the GEV approach. The POT approach is ideal for identifying extreme events, but it is far more time-consuming than the GEV approach as a result of the additional effort required to choose thresholds for modeling extreme events at multiple stations over the study area. Because a universal method for choosing the threshold is unavailable (Tanaka and Takara 2002) and because annual maximum flood modeling is easy to use, Lang et al. (1999) recommend that analysis be carried out using both approaches so that a comparison can be made. In our case study, we will explore and implement the most suitable of these choices for the at-site extremes distribution.

## 3. Case study

### a. Study area

The Swan–Avon River basin (118 000 km^{2}) is located in southwest Western Australia (Fig. 1). Most of southwest Western Australia experiences a Mediterranean climate of hot summers with infrequent rainfall and cool-to-mild winters with frequent moderate rainfall. Winter rainfall events are associated with incursions of cold-air outbreaks from higher latitudes associated with depressions originating in the Indian or Southern Oceans (Sturman and Tapper 1996).

The most extreme rainfalls in the region are produced by the extratropical cyclones during mid-to-late summer (Wright 1997). The average annual rainfall of this region ranges from 700 to 300 mm from west to east. The higher rainfall regions are near an escarpment east of Perth near the western boundary of the basin. The scarp runs in a north–south direction and has an average height of 300 m above the coastal plain. Annual potential evaporation far exceeds the annual rainfall throughout the basin. Since the mid-1970s, there has been a marked reduction in the total volume of rainfall in the region (Indian Ocean Climate Initiative 2002) as a result of a step change in rainfall pattern.

Extreme rainfall events can significantly affect the region. An example is the rainfall event of 22–23 January 2000, which was brought about by the interaction of a midlevel trough with a surface trough. A tropical low off the Pilbara coast in the north of the state provided additional moisture for the development of midlevel clouds and precipitation (Muirden 2000). Much of the region experienced greater than 100 mm of rain, having already received between 30 and 100 mm of rain in the period from late December to 17 January. Other factors contributing to the influence of such events are the extensive clearing of forest cover for agriculture since European settlement and the elevated water table in the region, both of which lead to greater runoff (Muirden 2000). The downstream consequence was a major toxic algal bloom in the Swan estuary.

There is little published work on extremes of relevance to the basin. Suppiah and Hennessy (1998) examined trends in heavy rainfall events in the period 1910–1989 across Australia using the Bureau of Meteorology’s (BOM) high-quality rainfall dataset, concluding that there has been a decline in heavy rainfall in southwest Australia. Similar findings are also reported by Hennessy et al. (1999) and Li et al. (2005).

### b. Rainfall data and quality control

Because we focus on a small number of events in any rainfall record, data quality is especially important. In particular, we require long records whose quality is consistent in time. Because we are assessing spatial patterns, we require this consistency to extend from site to site. Therefore, a rigorous examination of the quality of the rainfall data for the Swan–Avon River basin was carried out.

Data from the Bureau of Meteorology’s rainfall network were used in our research. The observations are made at 0900 each day, with the 24-h rainfall total being recorded against the day of observation. These gauges are mostly operated by volunteers, often at workplaces such as post offices, local government offices, and farms. As a consequence, the quality of the data is variable, even for different periods at a single station. Some of the data quality issues have been discussed by Lavery et al. (1992) and Viney and Bates (2004). These include inconsistencies in observer practices and exposure changes.

Missing observations arise when the observer fails to record rainfall for a period of one or more days, especially during weekends. The resulting multiday rainfall accumulations are recorded against the date of observation and are termed “tagged accumulations” by Viney and Bates (2004). These tagged accumulations are present to some degree in almost all station records. Statistical analysis of rainfall on different days of the week shows that in many cases, observers had failed to note that their Monday rainfall observation was a multiday accumulation, a phenomenon termed by Viney and Bates (2004) as “untagged accumulations.” Untagged accumulations, in which multiday rainfall is presented as a single-day event, pose significant problems in the analysis of extreme events. Although the presence of untagged accumulations in a rainfall record can readily be assessed using the diagnostic developed by Viney and Bates (2004), it is not possible to definitively identify the specific days involved. In this study we eliminated gauges showing evidence of possible untagged accumulations.

### c. Data screening and gap filling

The geographical domain chosen for the rainfall dataset encompasses the Swan–Avon River basin and adjacent areas within about 50 km of the basin boundary (Fig. 1). In total about 300 rainfall stations are located within this region. However, many of these stations have short or discontinuous records, whereas others have been decommissioned altogether. We decided to use only those gauges that operated continuously (with allowances for short discontinuities) between 1950 and 2003. A preliminary screening of the 300 candidate gauges revealed that 69 have fewer than 60 missing observations in any year since 1950 and fewer than 200 missing observations altogether between 1950 and 2003. All gauges with evidence of untagged accumulations in any year since 1950 were eliminated, leaving a final set of 58 stations. This final set of stations provides uniform spatial coverage, except for the uncleared and largely unpopulated far east and far northeast portions of the basin (Fig. 1).
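The preliminary completeness screen can be expressed as a simple rule. The sketch below is a minimal illustration in Python; the gauge names and missing-day counts are hypothetical, and only the two limits (fewer than 60 missing observations in any year, fewer than 200 overall) come from the text.

```python
def passes_screening(missing_by_year, max_per_year=60, max_total=200):
    """True if a gauge has fewer than max_per_year missing days in every year
    and fewer than max_total missing days over the whole period."""
    counts = list(missing_by_year.values())
    return all(c < max_per_year for c in counts) and sum(counts) < max_total

# Hypothetical gauges: missing-day counts per year for 1950-2003 (54 years).
years = range(1950, 2004)
gauges = {
    "A": {y: 2 for y in years},                         # 108 missing in total
    "B": {y: (75 if y == 1963 else 0) for y in years},  # one year with 75 missing days
    "C": {y: 4 for y in years},                         # 216 missing in total
}
kept = [name for name, record in gauges.items() if passes_screening(record)]
```

Here only gauge A is retained: gauge B fails the per-year limit and gauge C fails the overall limit. The separate check for untagged accumulations is not represented in this sketch.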

The 58 stations selected contain no more than 165 missing observations (0.8% of all days) in the 54-yr period between 1950 and 2003. To reconstruct daily rainfall totals for the missing days, we used the interpolation scheme of Jeffrey et al. (2001). This scheme uses a trivariate thin-plate smoothing spline with latitude, longitude, and elevation as the independent variables to create an interpolated surface for each day based on observed records from up to 4600 stations across Australia. Estimates of missing data values were derived from the nearest grid cell on the surface. Stations with missing data or tagged or untagged accumulations were not used to generate the interpolated surface.

## 4. Analysis and results

### a. Seasonality and at-site analysis

The Swan–Avon River basin has a winter-dominated rainfall regime that is normally associated with the passage of cold fronts and associated systems. Therefore, for the purpose of this analysis, the rainfall data have been split into summer (November–April) and winter (May–October) seasons. Figure 2a shows the seasonality of the average number of monthly exceedances (*N*_{p}) per year above a given threshold rainfall (see the next subsection) for the six stations labeled in Fig. 1. Values of *N*_{p} are larger during winter months; values of *N*_{p} for a station also depend on its threshold value such that, in general, stations with larger threshold values have smaller *N*_{p}.

The amounts and frequencies of exceedance for station 10534 are shown in Fig. 2b. Although the majority of exceedances occur during winter, the highest exceedances generally occur in January and February. The latter feature highlights the influence of extratropical cyclonic activities in the Swan–Avon River basin in mid-to-late summer.

At each site *s* (*s* = 1, …, 58) and for each season *ω*, an estimate of a reasonable lower bound for *u*_{ω}(*s*) was determined using (i) a mean residual life plot, in which the mean observed excess over a candidate threshold is plotted against *u*_{ω}(*s*); and (ii) plots of the estimates of the modified scale parameter *μ*(*s*) and the shape parameter *κ*(*s*) against *u*_{ω}(*s*) (Coles 2001, p. 83). We refined the initial estimate of *u*_{ω}(*s*) by raising the threshold from one ranked exceedance to the next until the *p* values of the goodness of fit statistics (Spinelli and Stephens 1997; Choulakian and Stephens 2001) for the Poisson distribution and the GPD exceeded 0.10. The goodness of fit was also examined visually through the use of probability, quantile–quantile, and return level plots. We adopted a minimum gap of three days between events to deal with the clustering of high threshold events. This ensured that successive exceedances were approximately uncorrelated.
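Two ingredients of this procedure, the minimum gap between events and the mean excess used in a mean residual life plot, can be sketched in Python as follows. The declustering rule shown (exceedances fewer than three days after the retained peak are merged into the same event, keeping the larger value) is one simple reading of the three-day gap and is an assumption of this sketch, as are the data.

```python
def decluster(days, values, u, min_gap=3):
    """Keep one peak per event: an exceedance of u that falls fewer than
    min_gap days after the currently retained peak is merged into that event."""
    peaks = []  # list of (day, value)
    for day, x in zip(days, values):
        if x <= u:
            continue
        if peaks and day - peaks[-1][0] < min_gap:
            if x > peaks[-1][1]:
                peaks[-1] = (day, x)  # same event: keep its largest value
        else:
            peaks.append((day, x))
    return peaks

def mean_excess(values, u):
    """Mean observed excess over u: the ordinate of a mean residual life plot."""
    exc = [x - u for x in values if x > u]
    return sum(exc) / len(exc) if exc else float("nan")

# Illustrative day indices and daily rainfall totals (mm).
days = [1, 2, 3, 10, 11, 20]
rain = [12.0, 30.0, 5.0, 18.0, 16.0, 25.0]
peaks = decluster(days, rain, u=10.0)
mrl_point = mean_excess([v for _, v in peaks], 10.0)
```

Repeating the `mean_excess` calculation over a range of candidate thresholds gives the points of a mean residual life plot; approximate linearity above some threshold suggests that the GPD is a reasonable model there (Coles 2001).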

Because the region is dominated by winter rainfall, threshold values for winter are mostly higher than those for summer (Figs. 3a and 3b). There are three evident spatial patterns for winter (Fig. 3a): a high threshold of more than 15 mm (red and white) along the western edge of the domain, a medium threshold of 10–15 mm (yellow) in the central part of the domain, and a low threshold of less than 10 mm (green and blue) farther inland. Summer rainfall threshold patterns over the basin are not as well defined (Fig. 3b). They show a moderate threshold (about 3–5 mm, green) in the western and northern regions of the Swan–Avon River basin and a low threshold (less than 2 mm, blue) over the northeast edge. However, the higher summer rainfall thresholds (more than 15 mm, red) located at the center of the basin show spatial incoherence. We investigated the data and physical settings for the relevant stations to check for the presence of potential outliers and their possible sources, and we found the incoherence to be robust.

In the POT approach, daily rainfall extremes are often defined as exceedances over particular percentiles such as the 90th, 95th, and 99th percentiles (Plummer et al. 1999). We explored these percentiles of the wet-day rainfall distributions at each site and found that the spatial patterns for the 90th percentile bear only a marginal resemblance to the threshold surfaces for winter (Figs. 3a and 3c) and none for summer (Figs. 3b and 3d). It appears, therefore, that no particular quantile can be used as a guide for choosing an appropriate threshold. The interpolated at-site 30-yr return levels are shown in Figs. 3e and 3f. The spatial patterns of the 30-yr return levels and the 90th percentiles are reasonably consistent in winter (Figs. 3c and 3e), indicating that the POT approach is suitable for winter rainfall extremes, but they are not consistent in summer (Figs. 3d and 3f). In particular, Fig. 3f suggests that the 30-yr return levels of summer daily rainfall in the southern Swan–Avon River basin are much larger than those in the northwest. This appears to be an artifact of the POT method because extratropical cyclonic activity in summer brings heavy rainfall to parts of the southern inland of the state, including the northern Swan–Avon River basin (Wright 1997; Braganza 2008). A more formal synoptic study may be able to explain the spatial variation of summer rainfall extremes, but this will be the subject of further research and is beyond the scope of this paper. We therefore used the GEV approach instead.

### b. Spatial modeling using the GEV distribution

For summer and winter seasons, the maximum daily rainfall for each station was selected for each year. Although it would be possible to leave the data in one complete set and introduce seasonally dependent parameters, it is more straightforward at this stage to interpret results obtained for summer and winter separately. It is envisioned that climate change will shift and rescale the GEV distribution, producing a change in the location and scale parameters that can be modeled by introducing covariates (Katz and Brown 1992).

A natural approach to implementing the spatial model is within a Bayesian framework via simulation techniques, such as Markov chain Monte Carlo (Smith and Roberts 1993). Such an approach would be computationally expensive, so we investigated instead an approximate approach motivated by Smith (1999). The steps in the process are as follows.

1. Estimate the coefficients in the GEV for each of the stations separately, using maximum likelihood. This stage would involve parameterizations that depend on covariates, such as time or physical measurements (e.g., the Southern Oscillation index).
2. Treat these estimated coefficients as “data” that characterize the individual stations but are expected to be spatially correlated, and apply geostatistical techniques (in particular, kriging) to predict the coefficients at locations other than the initial stations.
3. Estimate the covariance functions of the estimated coefficients, which these geostatistical techniques require, using the flexible approach described by Ver Hoef et al. (2004).
4. Predict “surfaces” corresponding to each of the coefficients over regions, such as a river basin. These surfaces can then be used to predict other surfaces (e.g., the return level surface corresponding to a 30-yr period).
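The last step, turning predicted parameter surfaces into a return level surface, can be sketched in Python. The sketch assumes the GEV sign convention of Coles (2001) for the shape parameter; the grid cells and parameter values are hypothetical, and the kriging step that would produce them is not shown.

```python
import math

def gev_return_level(n_years, mu, sigma, gamma):
    """Level exceeded in any one year with probability 1/n_years."""
    y = -math.log(1.0 - 1.0 / n_years)
    if gamma == 0:
        return mu - sigma * math.log(y)  # Gumbel limit
    return mu + (sigma / gamma) * (y ** (-gamma) - 1.0)

def gev_cdf(z, mu, sigma, gamma):
    """GEV distribution function, used here only to check the return levels."""
    if gamma == 0:
        return math.exp(-math.exp(-(z - mu) / sigma))
    t = 1.0 + gamma * (z - mu) / sigma
    if t <= 0:  # outside the support
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))

# Hypothetical predicted (location, scale, shape) values at three grid cells.
surface = {
    "cell_1": (30.0, 9.0, 0.10),
    "cell_2": (24.0, 7.5, 0.05),
    "cell_3": (20.0, 6.0, 0.0),
}
z30 = {cell: gev_return_level(30, *theta) for cell, theta in surface.items()}
```

When the location or scale parameter depends on a covariate such as time, the same function can be evaluated year by year, which gives a return level that varies annually at a fixed exceedance probability.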

We fitted the GEV distribution to the maximum rainfall values for each station and season combination using the above method derived from Coles (2001). Maximum likelihood was used because of the need to include covariate modeling of the parameters. This cannot be readily handled by other methods, such as probability weighted moments (Katz et al. 2002).

Graphical examination based on quantile–quantile plots (Katz et al. 2002) indicates that the model fits so obtained are generally satisfactory (not shown). The greatest correlation between the estimated winter and summer parameters is less than 0.2 in absolute value, which is not significant in a sample of size 58 and is consistent with the known differences in summer and winter rainfall patterns.
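The statement about significance can be checked with the usual *t* statistic for a correlation coefficient. This Python sketch assumes the standard test of zero correlation, which treats the 58 stations as independent draws; that is a simplification given the spatial correlation discussed in this paper.

```python
import math

def t_statistic(r, n):
    """t statistic for testing zero correlation from a sample of size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

t = t_statistic(0.2, 58)
# The two-sided 5% critical value of Student's t with 56 degrees of
# freedom is approximately 2.00.
significant = abs(t) > 2.0
```

With *r* = 0.2 and *n* = 58 the statistic is about 1.53, which falls below the critical value.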

### c. Temporal trends

The procedure used to determine the presence, nature, and candidate drivers of temporal changes in extreme rainfall over the basin was to fit covariates representing the effects of interest. Temporal changes in the distribution of the extreme values were modeled by including covariates that represent linear and quadratic temporal trends in the GEV parameters (Coles 2001; Katz et al. 2002). The GEV was fitted, and changes in the log-likelihood as a result of the inclusion of covariates were examined for statistical significance.

A further stage in the analysis was to seek potential predictors. We did this by examining changes in the log-likelihood resulting from the introduction of potential predictors to explain the variations in the observed data. As a first step, we explored the potential for linear trends as well as curvature by using a quadratic term.

The inclusion of a linear time trend for all three parameters is statistically significant (*p* < 0.004, summer; *p* < 0.001, winter) when considering the changes in likelihood from combining the analyses over all the stations. The results of fitting a linear time trend to the location parameter alone show that for summer, much of the temporal trend is accounted for by the location parameter (*p* < 0.001). The effect of including a linear term for the scale parameter, although not statistically significant for summer, is significant for winter (*p* < 0.001). The implication of this finding is that there may be different physical processes at work for the summer and winter seasons.
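Likelihood comparisons of this kind can be sketched with a likelihood ratio (deviance) calculation in Python. The log-likelihood values below are hypothetical, and pooling by summing station deviances assumes that the station fits are independent; both are assumptions of this sketch rather than a description of the exact procedure used. The chi-square tail probability relies on closed forms that hold only for 1 or an even number of degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate, for df = 1 or even df."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 != 0:
        raise ValueError("closed form implemented only for df = 1 or even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j  # (x/2)^j / j!
        total += term
    return math.exp(-half) * total

def deviance(loglik_null, loglik_alt):
    """Twice the gain in maximized log-likelihood from the extra covariate(s)."""
    return 2.0 * (loglik_alt - loglik_null)

# Hypothetical maximized log-likelihoods at four stations:
# (stationary GEV, GEV with a linear trend in the location parameter).
fits = [(-212.4, -209.1), (-198.7, -197.9), (-205.3, -201.0), (-220.0, -218.5)]
per_station = [chi2_sf(deviance(l0, l1), 1) for l0, l1 in fits]
pooled_p = chi2_sf(sum(deviance(l0, l1) for l0, l1 in fits), len(fits))
```

A station whose individual test is unconvincing can still contribute to a small pooled *p* value, which is one reason for combining the analyses over all stations.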

A second-order (quadratic) term was added to the model to examine whether a nonlinear trend might be present. The quadratic term was not statistically significant for the summer model but was significant for winter (*p* < 0.1). One possible driver for these temporal trends is the step change in rainfall in southwest Australia around 1976 (section 3a). We, therefore, examined a “broken stick” model that includes a linear time trend but incorporates a change of slope in 1976. Compared with a straight linear model, the broken stick model conferred no statistically significant improvements in fit for either the winter or summer data, despite the importance of the quadratic term for the winter model.
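A broken stick trend can be coded with one extra covariate that is zero up to the change point and increases linearly afterward. The Python sketch below is illustrative; the time origin of 1950 and the function name are assumptions.

```python
def broken_stick_covariates(year, knot=1976, origin=1950):
    """Return (t, hinge): a linear time term and a hinge term that is zero up
    to the knot year and grows by one per year afterward."""
    t = year - origin
    hinge = max(0, year - knot)
    return t, hinge

rows = [broken_stick_covariates(y) for y in (1950, 1976, 1977, 2003)]
# A parameter modeled as b0 + b1 * t + b2 * hinge has slope b1 before the
# knot and slope b1 + b2 after it.
```

Because setting the hinge coefficient to zero recovers the straight linear model, the two models are nested and can be compared through the change in log-likelihood.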

Examination of the individual estimated parameters revealed a consistent pattern for the summer data, with nearly all the linear terms positive. The location parameter is negatively correlated with northing (*r* = −0.45), indicating a north–south trend, whereas the linear trend is positively correlated with easting (*r* = 0.55). An interpretation of these parameters is that during the 50 yr considered, there is an average increase of 5 mm in the location parameters and that the greatest increase happens in the areas of the lowest location parameter values.

For winter the situation is more complex, with a quadratic (in time) model for both the location and scale parameters being most appropriate, indicating significant departure from linearity. Furthermore, there are significant correlations between the parameters and geographic location. Eastings were correlated significantly with the intercept (*r* = −0.68) and quadratic terms (*r* = 0.17) for the location parameter, and with the intercept (*r* = −0.33), linear (*r* = 0.29), and quadratic (*r* = −0.24) terms for the scale parameter. Northings were correlated significantly with the linear (*r* = 0.48) and quadratic (*r* = −0.57) terms for the location parameter, and with the intercept (*r* = 0.23), linear (*r* = 0.63), and quadratic (*r* = −0.22) terms for the scale parameter.

### d. Candidate trend explanations

The results presented above indicate that temporal trends are present in the GEV parameters and that, for winter, these trends are nonlinear. Because the negative trend in the GEV location parameter is associated with a decreasing trend in maximum rainfall (e.g., winter rainfall), we explored potential predictors that might drive these trends. Examples of the use of atmospheric predictors to predict major flood events can be found in Kahana et al. (2004).

We considered three predictors: the Southern Oscillation index (SOI), the Antarctic Oscillation index (AOI), and solar irradiance (SI) as potential physical drivers. SOI is the standardized pressure difference between Darwin, Australia, and Tahiti and was chosen because it represents the El Niño–Southern Oscillation (ENSO) cycle. Generally, in a La Niña year (or positive SOI phase), there is a good chance of receiving above-average rainfall in the Swan–Avon region. Conversely, in an El Niño year (or negative SOI phase), there is less chance of exceeding average rainfall. The AOI is defined as the difference of zonal mean sea level pressure between 40° and 65°S and has the potential for clarifying climate regimes in the Southern Hemisphere (Gong and Wang 1999). AOI is used to describe the dominant mode, the Antarctic Atmospheric Oscillation (AAO), of the Southern Hemisphere. The AAO is also referred to as the Southern Hemisphere annular mode (SAM) and is defined as the leading principal component (PC) of 850-hPa geopotential height anomalies south of 20°S (Thompson and Wallace 2000).

We calculated the AOI using the definition in Gong and Wang (1999) based on the National Centers for Environmental Prediction’s mean sea level pressure data. The signature of ENSO projects onto the AAO (L’Heureux and Thompson 2006), and it influences winter rainfall over the southwest of Western Australia (SWWA). With increasing mean sea level pressure in the midlatitudes, winter rainfall over SWWA tends to decrease as a result of decreased cloudiness and decreased westerly winds, a relationship that exists in both observations and the CSIRO coupled general circulation model (Cai and Watterson 2002; Cai et al. 2005). Li et al. (2005) show that the decline of SWWA winter rainfall is manifested as a reduction in high-intensity rainfall events and is accompanied by an upward trend in the AOI. Moreover, multidecadal AOI is also associated with SWWA rainfall (Cai et al. 2005). The third potential predictor, solar irradiance, was chosen because the increasing trends in solar radiation (or decreasing trends in cloudiness) over SWWA (Nunez and Li 2007) might have also contributed to the decreasing rainfall trend. Although it has been shown that Indian Ocean dipole events can be a predictor for anomalous rainfall patterns in SWWA (Saji et al. 1999; England et al. 2006), issues related to the Indian Ocean dipole and the extent of its correlation with ENSO are still being debated (e.g., Shinoda et al. 2004).

Monthly series for the SOI, AOI, and SI were used as covariates, with their values determined by the months in which rainfall extremes occurred. SI data are only available on an annual basis until 1995, and interpolated values were used to provide monthly values. Solar irradiance is available as either background or total irradiance. Initial use of the predictor series in the models, in which the temporal trends in the covariates were replaced by these physical variables (in situations where the temporal trend was significant), was not successful. However, the predictor series are very noisy (Fig. 4). Running a smoother through the data revealed simpler underlying patterns; for example, the smoothed AOI is essentially linear over the study period, whereas the smoothed SOI exhibits some nonlinear behavior. The smoothed SI, however, exhibits a more complex pattern.

We, therefore, replaced the monthly observations with their smoothed values. Two different degrees of smoothed SOI were used (Friedman 1984; our Fig. 4c, dashed and solid lines). It is clear that the dashed line follows the trends in SOI relatively closely, and it also made the greatest improvement in goodness of fit, as judged by changes in the likelihood. Use of solar irradiance (either background or total) did not lead to a significant change in model fit.
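The effect of different degrees of smoothing can be illustrated with a centered running mean in Python. This is a deliberately simple stand-in chosen for illustration and is not the smoother of Friedman (1984); the index values are made up.

```python
def running_mean(series, half_width):
    """Centered moving average; the window is truncated at both ends."""
    out = []
    for i in range(len(series)):
        lo = max(0, i - half_width)
        hi = min(len(series), i + half_width + 1)
        window = series[lo:hi]
        out.append(sum(window) / len(window))
    return out

monthly_index = [3.0, -5.0, 8.0, -2.0, 6.0, -7.0, 4.0, 1.0]  # illustrative values only
light = running_mean(monthly_index, 1)  # follows the series more closely
heavy = running_mean(monthly_index, 3)  # retains only the slower variation
```

A lighter smooth tracks the original series more closely, while a heavier one suppresses more of the month-to-month noise; which degree is preferable for a covariate is judged here by the change in likelihood.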

Examination of Table 1 shows that, for summer, SOI and AOI could effectively replace the linear time trend for the location parameter, in the sense that the likelihood is similar. It is recognized that this is not a formal test because the models are not nested. Rather, it indicates that we have a comparable fitting model that only includes physical drivers. With the exception of the shape parameter, all of the parameters were significantly positively correlated (*r* > 0.36) with easting. Both the location and scale parameters were negatively correlated with northing (*r* < −0.44).

Analysis of the winter data indicates a more complex situation, with both AOI and SOI making a significant contribution. It is notable that for winter, the model that included both SOI and AOI as covariates had a much greater improvement in goodness of fit than that including a quadratic time trend for the location and scale parameters. There is also evidence of correlation between these parameters and station position: both the location and scale intercept parameters are negatively correlated with easting (*r* = −0.63 and *r* = −0.40, respectively), the SOI trend for location is correlated with northing (*r* = 0.47), and the AOI trends for location and scale are correlated with northing (*r* = 0.72 and *r* = 0.53, respectively).
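For orientation, a covariate-driven GEV model of the kind compared above can be written schematically as follows. This is a sketch under stated assumptions rather than the fitted model: the identity link for the scale parameter, the constant shape parameter, and the coefficient symbols are illustrative choices not given in the text, and the tilde denotes a smoothed index.

$$
\begin{aligned}
\mu(t) &= \mu_0 + \mu_1\,\widetilde{\mathrm{SOI}}(t) + \mu_2\,\widetilde{\mathrm{AOI}}(t),\\
\sigma(t) &= \sigma_0 + \sigma_1\,\widetilde{\mathrm{SOI}}(t) + \sigma_2\,\widetilde{\mathrm{AOI}}(t),\\
\xi(t) &= \xi_0 .
\end{aligned}
$$

In this notation, the intercepts correspond to $\mu_0$ and $\sigma_0$, and the SOI and AOI trends to the remaining coefficients, each estimated station by station and then examined against easting and northing.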

### e. Spatial surface prediction

A geostatistical approach was used to estimate spatial surfaces corresponding to the estimated parameters and to functions of these parameters. This approach requires the estimation of spatial covariance or variogram functions. Initial examinations indicated the limitations of naïve assumptions. These included (i) assuming isotropy in the spatial covariances with distance, which was clearly unwise; and (ii) ignoring spatial cross correlations between the GEV parameters, which were clearly correlated, as shown by examination of the parameters fitted to the 58 individual stations, although this is more apparent in summer than in winter.
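One simple way to check the isotropy assumption is to compare empirical semivariograms computed along different directions. The sketch below uses the classical binned estimator, not the nonparametric method adopted in this study; the function name, its arguments, and the synthetic grid of "stations" are hypothetical.

```python
import math


def directional_semivariogram(coords, values, bin_width, n_bins, azimuth_deg, tol_deg):
    """Empirical semivariance per distance bin, using only station pairs whose
    separation direction lies within tol_deg of azimuth_deg (clockwise from north)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            dx = coords[j][0] - coords[i][0]  # easting difference
            dy = coords[j][1] - coords[i][1]  # northing difference
            dist = math.hypot(dx, dy)
            if dist == 0.0:
                continue
            # Directions are axial: a pair pointing NE is equivalent to one pointing SW.
            angle = math.degrees(math.atan2(dx, dy)) % 180.0
            diff = abs(angle - azimuth_deg % 180.0)
            diff = min(diff, 180.0 - diff)
            if diff > tol_deg:
                continue
            k = int(dist // bin_width)
            if k < n_bins:
                sums[k] += 0.5 * (values[i] - values[j]) ** 2
                counts[k] += 1
    # Bins with no pairs are reported as None rather than zero.
    return [s / c if c else None for s, c in zip(sums, counts)]


# Synthetic 4 x 4 grid whose "parameter" varies only with easting.
coords = [(float(x), float(y)) for x in range(4) for y in range(4)]
values = [x for x, _ in coords]

ew = directional_semivariogram(coords, values, 1.0, 4, azimuth_deg=90.0, tol_deg=10.0)
ns = directional_semivariogram(coords, values, 1.0, 4, azimuth_deg=0.0, tol_deg=10.0)
```

For this synthetic field, the east-west semivariogram grows with distance while the north-south one stays flat, which is the signature of anisotropy that an isotropic covariance model would miss.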

A flexible nonparametric approach (Ver Hoef et al. 2004) that can accommodate these issues was implemented to estimate spatial auto- and cross-correlation functions, which were then used to “krige” the GEV parameters across a spatial surface. The temporal sequences in these surfaces as they change through time, reflecting the changes in the covariates (AOI, among others), can then be generated. These surfaces can also be used to generate plots for a range of return levels for summer and winter (Figs. 5 and 6). Figure 5 shows that the rainfall values for all return levels for summer increase toward the east of the basin for years 1950, 1975, and 2000. This is consistent with the rainfall trend map published by BOM for summer rainfalls (available online at http://www.bom.gov.au/cgi-bin/silo/reg/cli_chg/trendmaps.cgi). Figure 6 shows similar changes in the northern portion of the basin for winter. It should be noted that these surfaces have been estimated from all the data, not just the data prior to and including the year shown.
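Once GEV parameters are available at a grid point, a return level follows from inverting the GEV distribution function. The sketch below uses the standard parameterization, in which a positive shape parameter gives a heavy upper tail; the function name and the parameter values are hypothetical and are not estimates from this study.

```python
import math


def gev_return_level(mu, sigma, xi, return_period):
    """Level exceeded on average once every `return_period` blocks (e.g., seasons)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if return_period <= 1:
        raise ValueError("return_period must exceed 1")
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(xi) < 1e-9:
        # Gumbel limit as the shape parameter tends to zero
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


# Hypothetical parameters (mm) at one grid point for one season and year.
levels = {T: gev_return_level(30.0, 8.0, 0.1, T) for T in (10, 50, 100)}
```

Applying such a function to kriged location, scale, and shape surfaces for a given year would yield a return level map of the kind shown in the figures.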

## 5. Discussion

The estimation of extreme rainfall and associated return levels for ungauged locations is a recurring problem for hydrologists. The return level surfaces determined for the whole catchment using geostatistical methods that incorporate all rainfall stations can potentially provide a much more accurate estimate of extreme rainfall intensities at an ungauged location for regions that exhibit spatial trends. Although regionalization methods can also provide estimates for ungauged locations, the identification of homogeneous regions and their associated probability distributions is a problem in these methods. Most approaches to identifying homogeneous regions classify watersheds by their hydrologic responses to rainfall, using purely statistical methods that group watersheds according to their landscape and/or streamflow attributes (e.g., Burn 1990; Huang and Ferng 1990; Nathan and McMahon 1990; Lu and Stedinger 1992; Bates et al. 1998). Because the regionalization method ignores the spatial correlation of rainfall and the effects of atmospheric forcing, a comparison of the values given by the geostatistical method used in this paper with those given by the regional analysis (e.g., Schaefer 1990) is a topic for further research.

## 6. Summary and conclusions

We study rainfall extremes to examine temporal changes and to investigate candidate drivers. Spatial models analyzing rainfall extremes are developed for a network of rainfall stations over a domain using a peaks-over-threshold (POT) approach and a hierarchical geostatistical method based on the GEV block maxima approach. Time slices through the extreme spatial surfaces are examined, with a view to identifying any time trends that may be present.

Rigorous data analysis is conducted to develop a high-quality dataset for the Swan–Avon River basin in Western Australia and to identify data quality problems due to untagged accumulations. This is potentially very serious in any analysis of extremes because multiday accumulations may inadvertently be confused with genuine extreme daily events.

Despite using more data to estimate spatial patterns than the GEV method, the POT method did not produce a coherent spatial pattern in the basin. This prompted the development of a working prototype spatial model based on the GEV distribution, in parallel with the station-by-station POT analysis, though some information on near-maximum rainfall events may have been lost. Unlike the POT method, the GEV method produced a spatially coherent extreme value pattern. Results show that spatial patterns in extreme rainfall are present and are probably driven by the predominant storm tracks in the region and the coastal escarpment to the west. This suggests that a site-by-site analysis is suboptimal and that more powerful inferences will result from a spatial model.

By applying the spatial models to winter and summer extreme rainfalls separately, an apparent increasing trend in return levels of summer rainfall to the northwest and decreasing trends in return levels of winter rainfall to the southwest of the region are found. This implies that extreme summer rainfall over the northwest of the basin is increasing, possibly linked to northwest cloud band activity (Tapp and Barrell 1984; Nicholls 1989; Indian Ocean Climate Initiative 2002), while extreme winter rainfall over the southwest of the basin is decreasing. Preliminary analysis suggests such changes may be linked to the El Niño–Southern Oscillation and the SAM. The assessment of the effects of the northwest cloud bands on both winter and summer extreme rainfall events and a formal synoptic study to identify the mechanisms that cause spatial variations of extreme rainfall over the Swan–Avon basin will be the subject of further research.

Finally, our study has the following caveat: data traits such as random sampling variability, outliers, and cross correlation may introduce spatial discontinuity that may or may not mask environmentally significant spatial trends of the mapped statistical attributes. All or some of these traits are inherent in all data, and the artifacts (if any) they create are unavoidable.

## Acknowledgments

Rainfall data were obtained from the Patched Point database developed by the Queensland Department of Natural Resources and the Australian Bureau of Meteorology. Partial funding from CSIRO’s Water for a Healthy Country Flagship Program (Project C2.P2.1) is also acknowledged.

## Footnotes

*Corresponding author address:* Dr. Santosh K. Aryal, CSIRO Land and Water, Private Bag No. 5, Wembley WA 6913, Australia. Email: santosh.aryal@csiro.au