Both natural climate change and anthropogenic impacts may cause nonstationarities in hydrological extremes. In this study, long-term annual maximum streamflow (AMS) records from 145 stations over Canada were used to investigate the nonstationary characteristics of AMS, which include abrupt changes and monotonic temporal trends. The nonparameteric Pettitt test was applied to detect abrupt changes, while temporal monotonic trend analysis in AMS series was conducted using the nonparameteric Mann–Kendall and Spearman tests, as well as a parametric Pearson test. Nonstationary frequency analysis of the AMS series was done using a group of nonstationary probability distributions. The nonstationary characteristics of Canadian AMS were further investigated in terms of the Hurst exponent (H), which represents the long-term persistence (LTP) of streamflow data. The results presented here indicate that for Canadian AMS data, abrupt changes are detected more frequently than monotonic trends, partly because many rivers began to be regulated in the twentieth century. Drainage basins that have experienced significant land-use changes are more likely to show temporal trends in AMS, compared to pristine basins with stable land-use conditions. The nonstationary characteristics of AMS were accounted for by fitting the data with probability distributions with time-varying parameters. Large H found in almost ⅔ of the Canadian AMS dataset indicates strong LTP, which may partly represent the presence of long-term memories in many Canadian river basins. Furthermore, H values of AMS data are positively correlated with the basin area of Canadian rivers. It seems that nonstationary frequency analysis, instead of the traditional stationary hydrologic frequency analysis, should be employed in the future.
In recent years, climate variability and climate change have impacted various components of the hydrological cycle in many regions across the world. In particular, changes to the climate system can potentially affect the magnitude and frequency of extreme hydrological events, thus altering the risk to critical infrastructure (Burn et al. 2010; Ishak et al. 2013). Therefore, considering the possible impacts of climate change, the assumption of stationarity, the basis for conventional hydrologic frequency analysis, will no longer hold. Design storms or flood estimation for certain return periods derived from conventional frequency analysis for municipal infrastructure developments could be underestimated, and public safety standards could be compromised (Khaliq et al. 2006). Several recent major floods that occurred around the world highlight the necessity to address nonstationarity and possible changes to future occurrences of hydrological extremes that would affect water security, resources management, and the operation of large dams (Jakob 2013; Milly et al. 2008).
Several past studies have examined temporal trends in Canadian streamflow records. Even though Zhang et al. (2001) reported that the annual- and monthly-mean streamflow in Canada had generally decreased from 1947 to 1996, a significant increase in the monthly-mean streamflow was observed in March and April probably because of the earlier onset of spring snowmelt, and major regional differences and variability of streamflow trends are observed across Canada (Burn and Elnur 2002; Whitfield and Cannon 2000). For example, statistically significant increasing trends in the streamflow were detected along the Winnipeg River (St. George 2007) and in the northern part of British Columbia (Whitfield 2001), even though nationally the daily streamflow has generally evidenced a broad-scale decreasing pattern (Zhang et al. 2001). On the other hand, Whitfield et al. (2003) found an increase in frequency of flooding in the Georgia Basin of British Columbia. Although in Canada the magnitude of low flows could be more significantly affected by the impact of climate change than high flows (Burn et al. 2010; Cunderlik and Ouarda 2009; Déry et al. 2009; Khaliq et al. 2008; Zhang et al. 2001), possible changes to future annual maximum floods warrant more attention than they have in the past. For example, based on simulations of some general circulation models in relation to climate change impact, it seems more extreme events may occur across Canada as a result of an enhanced hydrologic cycle (e.g., Mailhot et al. 2012; Whitfield et al. 2003).
The cause of temporal trends in streamflow data can be ambiguous because floods could be caused by many complicating factors, such as natural and anthropogenic changes in atmospheric forcings and catchment characteristics, rising concentrations of greenhouse gases, and land-use changes. Regulation of reservoirs further complicates the flow regimes of river basins of Canada. The downstream hydrological impact of dams (Assani et al. 2006) is an obvious cause leading to changes in the annual maximum streamflow (AMS) probability distributions (PDs) of many gauging stations globally. Similarly, the twentieth century had been a time of profound land-use changes resulting from activities such as agricultural practices, urbanization, and forest management. Studies show that land-use changes have significant impacts on the river basins of Canada (Buttle 2011; Kerkhoven and Gan 2013) and the United States (Villarini et al. 2009a,b).
In addition to nonstationarity and temporal changes, limitations in the amount, frequency, and accuracy of observed hydrologic data and analysis procedures could affect PDs of AMS derived from streamflow databases of Environment Canada (EC) and provincial agencies. The majority of the past studies analyzing streamflow trends over Canada are based on data provided by the Reference Hydrometric Basin Network (RHBN) of EC. By 2010, the average record length of streamflow stations of RHBN was about 50 yr. In view of the possible impact of climate change, climate variability, and land-use change on the streamflow of Canada, the key objective of this study is to examine nonstationary changes to Canadian AMS based on some selected, long-term streamflow records of HYDAT, which includes data from RHBN. HYDAT is a database compiled by Water Survey of Canada (WSC) that contains streamflow data mainly computed from station rating curves and water levels, as well as sediment data for rivers. Annual maximum daily streamflow data from 145 WSC stations, each with a record of at least 50 yr, were selected for this study. Drainage areas for all the selected stations range from 87.6 km2 (Pennask Creek near Quilchena, British Columbia) to 347 000 km2 (Saskatchewan River at The Pas, Manitoba).
Several major analyses conducted in this study include changepoint detection, possible effects of streamflow regulation, temporal trend analysis, nonstationary extreme flood PDs, nonstationary properties of the floods, and long-term persistence. According to Koutsoyiannis (2006), who examined stationarity and nonstationarity in hydrology, a hydrologic time series is usually regarded as stationary if the time series does not have trends, and without shifts in its mean and variance. Conventionally, the stationary assumption is considered valid if there is no slowly varying trend or detected changepoint (abrupt changes in the mean and/or the variance of the PD of the variable of interest) in the time series.
Trend analysis is applied to find historical changes of a time series, while changepoint analysis distinguishes a shift from one regime to another where the status is likely to remain the same until a new regime shift occurs (Khaliq et al. 2009). Streamflow trends in many Canadian watersheds have been studied (e.g., Burn et al. 2010; Yue et al. 2003), but we have yet to come across the application of changepoint analysis that characterizes the nonstationarity of streamflow records in Canada. Therefore, we performed both trend and changepoint analyses in this study, in which the former is done after the latter, but not vice versa. If a changepoint is detected, the AMS time series is first divided into two subseries (before and after the changepoint) and trend analysis is performed on both subseries separately.
In addition to finding trends and detecting changepoints, the AMS data were also fitted to PDs and evaluated for nonstationary characteristics. Without considering spatial correlations between station streamflow data, we performed a flood frequency analysis of observed AMS of stations individually by first estimating the location, scale, and shape parameters of some selected PDs using the generalized additive model [Generalized Additive Models for Location, Scale, and Shape (GAMLSS)] developed by Rigby and Stasinopoulos (2005). GAMLSS was chosen for this study because it allows the choice of a wide range of PDs for best fitting a dataset of interest, and it can also estimate model parameters that are stationary, or parameters with trends that vary slowly over time, or parameters with abrupt changes.
Koutsoyiannis (2006) suggested that properties such as nonstationarity, persistence, and scaling should be analyzed jointly, regardless of whether a time series possesses long-term persistence (LTP) or not. On the other hand, a time series with LTP may be falsely diagnosed as a time series with a statistically significant trend, even though no trend is present (Cohn and Lins 2005; Koutsoyiannis 2006; Koutsoyiannis and Montanari 2007). LTP, also referred to as the Hurst phenomenon, has been detected in many hydrologic time series, particularly in river flows (e.g., Klemeš 1974; Potter 1976; Szolgayova et al. 2014). Although certain patterns observed in some hydrologic series could be better explained by LTP (Hurst 1951; Koutsoyiannis 2006), this phenomenon is often overlooked in the analysis of streamflow records. In this study, we investigated possible long-term persistent behavior of AMS in Canada.
This paper is organized as follows: section 2 describes the AMS data for Canada; section 3 explains basic aspects of the GAMLSS, estimation of LTP in a time series, changepoint detection, and trend analysis; section 4 presents a discussion of the results; and section 5 presents our summary and conclusions.
A long-term dataset subjected to a sound quality control process is essential for achieving reliable changepoint and nonstationary analyses. The AMS data used in this study were taken from the HYDAT database of WSC until 2013. The RHBN data included in HYDAT have been extensively used for climate change studies, since RHBN data are characterized by relatively pristine and stable land-use conditions (<5% of the surface modified) and have a minimum record of 20 yr (Burn et al. 2010; Coulibaly and Burn 2004).
Out of over 200 active gauging stations in the RHBN network, 62 stations were selected in this study, each with a minimum record length of 50 yr to ensure statistical validity of the study results. In addition, 83 non-RHBN stations from HYDAT (with a minimum record length of 70 yr and a maximum total missing record of 4 yr) are selected to provide data collected from a larger coverage area of watersheds and a greater variety of geographic areas in Canada (Fig. 1). River basins of non-RHBN stations had experienced anthropogenic influences such as land-use changes, river regulations, agricultural production, and other human activities. Missing AMS values at a station were gap filled with the mean AMS of that station. Average record lengths of RHBN and non-RHBN stations are 72 and 85 yr, with maximum record lengths of 102 and 118 yr, and average drainage areas of 9070 km2 and 31 403 km2, respectively. All station data ended either in 2010 or 2011. There is a lack of stations in Quebec and all the stations of Quebec only have data available until 2001 or earlier. As shown in Fig. 1, selected stations are mainly located in southern Canada and the Rocky Mountains area in British Columbia. By analyzing AMS records from 145 stations selected across river basins of Canada, we should obtain a representative history of the occurrences of statistically significant changepoints and trends in Canadian AMS records over the twentieth and early twenty-first centuries.
3. Research methodology
To fully evaluate the stationarity and nonstationarity of the observed AMS in Canada, several popular tests and estimators were employed to analyze the changepoint, temporal trend, and Hurst exponents for each selected station, respectively. Parametric PDs of GAMLSS were also fitted to the AMS data.
a. Changepoint analysis
From comparing results of six tests applied for changepoint detections, we found that the Pettitt (1979) test (PETT), a nonparameteric rank-based test for detecting the changepoint position (time), is the best option for reasons given below. Among the six tests were three nonparameteric tests, PETT (Pettitt 1979), cumulative sums test (Csörgö and Horváth 1997), and Wilcoxon rank sum test (Gibbons and Chakraborti 2011), and three parametric tests, structural change test in linear regressions (Zeileis et al. 2003), Bayesian changepoint test (Sarr et al. 2013), and spectral test using a wavelet approach (Islam 2008). These tests have been developed primarily to detect changes in the mean, but they may not be able to effectively detect changes in the variance. Given that changes in the variance of AMS data is possible, we did some limited tests on changes to AMS variance using the cumulative sums of squares test proposed by Inclan and Tiao (1994).
Our focus is not on detecting multiple changepoints but in identifying a single, “major” changepoint in the AMS data of this study. We found that PETT consistently detected the changepoint that corresponds to the changepoint with the highest probability of occurrence identified by the Bayesian changepoint and the spectral tests. Additionally, we analyzed AMS data from the Columbia River, where the year at which an abrupt change in the discharge resulted from starting the dam regulation (Environment Canada 1992; Lehner et al. 2011) was accepted as the changepoint, and PETT identified the changepoint that best agrees with the year the dam operation began. Villarini et al. (2009a) also found PETT to more accurately identify changepoints of annual flood peaks in the United States. As a nonparametric, rank-based test, PETT is capable of detecting a change in the time series without having to assume its PD. It calculates p values of the Mann–Whitney test statistic using a limiting distribution of the Kolmogorov–Smirnov goodness-of-fit statistic for continuous variables to determine if two samples (Y1, …, Ym and Ym+1, …, Yn) come from the same population (Pettitt 1979).
b. Temporal trend analysis
To identify if a time series has a statistically significant trend, the nonparametric, Mann–Kendall test (Kendall 1975) and Spearman rank correlation (hereafter referred to Spearman) test, and the parametric, least squares linear regression (hereafter referred to Pearson) test are commonly used (Khaliq et al. 2009). After a changepoint is detected, these tests were applied to analyze trends of two subseries: one prior to and one after the detected changepoint in the original record. If no changepoint is detected, these tests will be applied to the whole record. The Mann–Kendall test determines whether a trend is statistically significant without specifying whether the trend is linear or nonlinear. It is the most widely adopted statistical test for detecting temporal trends in discharge data (Khaliq et al. 2009). It is based on the test statistic S, defined as the proportion of concordant pairs minus the proportion of discordant pairs in the samples:
where N is the total number of data points and Yi and Yj are the sequential data. A positive (negative) value of S indicates an upward (downward) trend. For N ≥ 8, the statistic S is approximately normally distributed with the mean of 0 and a variance related to the sample size. Where there is either serial or cross correlation, the Mann–Kendall test will be affected and such correlations should be examined before conducting the test (Yue et al. 2003). We computed the lag-1 correlation of all selected AMS series and found all lag-1 correlations to be not significant statistically. Next, the Spearman rank correlation test is based on the Spearman rho (Helsel and Hirsch 2002, chapter A3):
where d = RTi − RYi is the difference between two rankings, such that RTi is the rank of the variable Ti, Yi the observation series, and RYi its rank presented in the chronological order i. If there are ties, RT becomes the average rank. The null hypothesis is based on the test statistic:
where T has a Student’s t distribution with υ = N − 2 degrees of freedom. At a two-sided test based on the α significance level, the time series has no trend if tυ,α/2 < T < tυ,1−α/2.
The Pearson’s r, a linear correlation between Y and T, is the ratio of the covariance Cov(Y, T) over the standard deviations of Y and T. Since testing the significance of Pearson’s r is based on the Gaussian distribution, we need to transform skewed data into Gaussian-distributed deviations, which was done using the normal quantile transformation (Villarini et al. 2009a). However, this transformation may distort trends in the original streamflow data.
c. Nonstationary extreme distribution analysis
In fitting a suitable PD to a stationary AMS series Y, the parameters are usually assumed to remain constant with minimal change over the years. For nonstationary AMS series, parameters are expected to change with time and they can be expressed as a function of some explanatory variables (covariate). The simplest covariate is time, such as years or seasons. Although other covariates [e.g., climate indices (Kwon et al. 2008) and socioeconomic indices (Villarini et al. 2009b)] may be more interpretational for describing the nonstationarity of floods, we only chose time as the covariate to examine whether there is nonstationarity in selected AMS time series.
The models of GAMLSS provide a very flexible framework for estimating parameters of a wide range of PDs applicable to both stationary and nonstationary AMS. Compared with classical generalized additive models, GAMLSS is flexible and capable of using nonexponential distributions, such as highly skewed and/or kurtotic, continuous, and discrete distributions with heavy tails. It can model the location, scale, and shape parameters of a PD as linear and/or nonlinear, parametric, and/or additive nonparametric functions of covariates, and/or random effects.
A more comprehensive description of GAMLSS can be found in Rybski et al. (2006) and Stasinopoulos and Rigby (2007), and at the GAMLSS website (http://www.gamlss.org/). Villarini et al. (2009a,b) applied this model when conducting their frequency analysis of nonstationary annual flood peaks. For GAMLSS, it is assumed that independent observations yi, for i = 1, …, n, have a cumulative PD function FY(yi; θi) with , a vector of p parameters accounting for the position, scale, and shape of the PD. Usually, p ≤ 4, since one- to four-parameter families provide enough flexibility for most applications to highly skewed and/or kurtotic continuous and discrete distributions. Given an n vector of the variable yT = (y1, …, yn), let gk(⋅) for k = 1, …, p, be known monotonic link functions relating the distribution parameters to explanatory variables and random effects through an additive model given by
Where θk and ηk are vectors of length n, for example, , is a parameter vector of length Jk, k is a known design matrix of order n × Jk, jk is a fixed known n × qjk design matrix, and γjk is a qjk random variable. Additive terms in Eq. (5) represent smoothing terms that allow for flexibility in modeling the dependence of the parameters of the PD on the covariates. If we choose jk = n, where n is an n × n identity matrix, and γjk = hjk = hjk(xjk) for all combinations of j and k, we obtain a semiparametric additive equation of GAMLSS:
Where hjk is an unknown function of the covariate xjk and hjk = hjk(xjk) is the vector that represents the function hjk at xjk.
In this study we considered four, widely used, two-parameter PDs (Gumbel, gamma, lognormal, and Weibull; Table 1) to perform the flood frequency analysis of selected AMS data (GREHYS 1996; Ouarda and El-Adlouni 2011; Ouarda et al. 2000). Given that some runoff data could be nonstationary, only two-parameter PDs were chosen partly to avoid potential numerical round-off problems when estimating the distribution parameters. For a two-parameter PD, the parameters, θ1 (θ2) related to the mean (variance), are based on a generalized additive model with time t as the covariate to account for nonstationary. Therefore, Eq. (6) can be simplified to
A cubic spline smoothing technique was used to derive h(⋅) for this additive model. Note that if we only consider the linear models in GAMLSS, then there is no additive term hk(ti) shown in Eq. (7). Both the temporal trend and changepoint could be taken into account in this GAMLSS model to investigate the nonstationarity assumption of the flood frequency analysis.
Selecting a particular PD is based on visual assessment of diagnostic plots and maximum likelihood values L. Diagnostic plots of the residuals (van Buuren and Fredriks 2001) are visually inspected. The goodness of fit of the PDs selected were assessed using the Akaike information criterion [AIC = 2k − 2 ln(L)], where L is penalized by the number of parameters k. The PD with the minimum AIC value, and which best fit the Canadian AMS data, are selected.
d. Long-term persistence analysis
The LTP characteristics of Canadian runoff data are estimated from the Hurst exponent H (Hurst 1951). An H of 0.5 indicates a lack of LTP, but H larger than 0.5 indicates the presence of LTP, which increases with the H value. Several conceptual algorithms have been developed to detect LTP (Montanari et al. 1999). Among these the most popular are the rescaled range (R/S) statistic algorithm developed by Hurst (1951), the aggregated variance approach (Taqqu et al. 1995), and the detrended fluctuation analysis of Peng et al. (1994).
The aggregated variance approach is presented below. For a stationary process yi for a discrete time step i (years in this case) with a standard deviation σ,
denotes the aggregated process at time scale k, with a standard deviation σ(k) (the notation implies that ). For a sufficiently large k, represents the aggregated hydroclimatic process; k = 30 is often used to estimate H in terms of an elementary scaling property:
This aggregated variance approach is vulnerable to trends, abrupt changes, periodicities, and other sources of nonstationarities in the data (Rust et al. 2008). Five methods, namely, aggregated variance, aggregated variance (Teverovsky and Taqqu 1997), aggregated absolute value (Montanari et al. 1999), R/S method (Hurst 1951), and Peng’s method (Peng et al. 1994) were used to derive H, since any single estimator for H is prone to draw false conclusions, as is shown later (see Table 4, which shows that different estimators of H resulted in different numbers of stations showing LTP).
In this section, the nonstationary characteristics of Canadian AMS data will be illustrated based on results obtained from changepoint and temporal trend analyses, GAMLSS modeling, and the estimation of LTP.
a. Changepoint analysis
The presence of abrupt changes in the mean of annual maximum runoff records of Canada are examined using the PETT at a 5% significance level. The results show that 19 out of 62 RHBN stations, and 40 out of 83 non-RHBN stations (Fig. 1), exhibited a significant abrupt change in the mean. The occurrences of abrupt change in the mean clustered around the 1940s and 1970s. However, there was no changepoint detected in the variance of all the stations examined in this study. Spatially, stations that show a significant abrupt change in the mean are spread across Canada, with the first two clusters centered across southwestern and central Canada, where the time of changepoint occurred around the 1970s, and the third cluster spread across southeastern Canada, where the time of changepoint ranges from the 1940s to the 1990s. From applying wavelet analysis on 79 RHBN stations of runoff data, Coulibaly and Burn (2004) concluded that changepoints occurred around 1950 and 1970 in western and eastern Canada, which mainly agrees with the time of changepoint (1970s) detected for the first two clusters of this study.
Given RHBN stations are stations selected from pristine catchments with stable land-use conditions, the abrupt changes detected in these stations should be primarily attributed to past changes in their regional climate and variability. In contrast, the timing of abrupt change detected in non-RHBN stations could be related to anthropogenic changes, such as dam regulation and land-use shifts, or past changes in their regional climate, or both. Among stations with an abrupt change, we found that two stations at the lower Columbia River selected in this study (EC WSC station numbers 08NE049, Birchbank, and 08NE058, International Boundary), the detected changepoint (1973) happened during the year when several dams began to regulate flow for power generation and irrigation (Naik and Jay 2011). For the station located at the upper Columbia River (08NA002, Nicholson), where flow has not been regulated, no significant changepoint was detected. However, another unregulated station (08NB005, Donald) located at the upper Columbia River also showed a significant changepoint. It seems that even for a regional river basin such as the Columbia River, the reason behind the detected abrupt changes of AMS at different stations can be caused by changes in the climate or by streamflow regulations, or both. Naik and Jay (2011) also found that both flow regulation and climate change had affected the peak flow change of the Columbia River basin.
Among the 62 RHBN stations, only 1 out of 5 (18 out of 57) stations where streamflow have (have not) been regulated has detected changepoints in its AMS time series. Apparently, the abrupt change of RHBN AMS was mostly linked to climatic rather than human factors. For non-RHBN stations, 5 out of 21 nonregulated stations have detected changepoints, and 35 out of 62 regulated stations have detected changepoints. By comparing the detected changepoints of 22 non-RHBN stations with the respective year that dam operation began at each station (Environment Canada 1992; Lehner et al. 2011), we found that for 16 out of 22 stations the year a changepoint was detected coincides with the year that streamflow regulation began. Therefore, river regulation plays a more significant role than do climatic factors with regard to the occurrence of abrupt changes in AMS data of non-RHBN stations. However, for regulated rivers, the possibility of detecting changepoints in the AMS series should also depend on the operating policy of the dam sluice dates (e.g., the degree of streamflow regulation).
b. Temporal trend analysis
In this section, long-term temporal trends, which can be another cause of nonstationarity in the AMS data, are investigated below. As explained in section 3b, if a changepoint is detected, several tests (Mann–Kendall, Spearman, and Pearson two sided) will be applied to perform a trend analysis on two subseries, one prior to and the other after the detected changepoint in the original record. Next, whether a changepoint is detected or not, these tests will be repeated for the whole record. Furthermore, since none of the AMS record shows any significant lag-1 autocorrelation, the possible effects of serial correlation were not considered in the trend analysis of these AMS data.
The results obtained differ marginally between the Mann–Kendall and Spearman tests, but not results from the Pearson test (Table 2), partly because the transformation of AMS data to normal quantiles required by the Pearson test might have distorted the trend. Without considering the presence of a changepoint, about 50 out of 145 stations are detected with a significant trend at a 10% significance level. Again, more non-RHBN stations (37 out of 83) show a significant trend than do RHBN stations (12 out of 62), which means that drainage basins that have experienced major land-use changes are more likely to show temporal trends in their AMS data than pristine basins with stable land-use conditions. However, only 12 out of 59 stations with detected changepoints show significant temporal trends before and/or after the detected changepoint. Compared to subseries before changepoints, more subseries after changepoints showed increasing trends (Table 2; Fig. 3).
It is noted that 26 out of 59 stations detected with changepoints show significant trends if changepoints were not considered, even though no significant trend may be detected in their corresponding subseries before or after changepoints. On the other hand, some subseries showed significant trends both before and after changepoints, yet the entire series did not show a significant trend. The above results obtained for the same stations, with and without considering changepoints, indicate the importance of identifying the changepoint (if a changepoint exists) for estimating the monotonic trends of streamflow or other hydroclimatic data (Villarini et al. 2009a). However, in many past studies that estimated temporal trends of Canadian streamflow data, the detection of abrupt changes was mostly ignored (e.g., Cunderlik and Ouarda 2009; Khaliq et al. 2009; St. George 2007; Yue et al. 2003).
Figure 2 shows stations with significant trends obtained from the Mann–Kendall test applied to the entire series, while Fig. 3 shows significant trends for subseries before and after the detected changepoint. The spatial distributions of trend magnitudes for the entire series are also plotted in Fig. 2. Most significant trends are negative, which generally agrees with other past streamflow trend studies conducted in Canada (e.g., Burn et al. 2010; Cunderlik and Ouarda 2009; Gan 1998; Zhang et al. 2001).
In general, a significant decline in the AMS is predominantly found in the Canadian prairies (CPs), which include the three prairie provinces of Alberta, Saskatchewan, and Manitoba (central white area in Fig. 2), and which generally agree with trend analysis results of annual mean flows of Gan (1998). Given Gan (1998) also showed that about 30% of monthly precipitation data for 37 stations of the CPs experienced a significant decrease during 1949–89, which with a possible increase in evaporation because of climatic warming, we would expect the AMS in the CPs to generally decrease. Moreover, Ripley (1986) also found declining trends in the summer precipitation in the CPs, which likely implies that the CPs have been slowly becoming drier in recent decades. Burn et al. (2008) attributed the decrease of streamflow to a combination of reductions in snowfall and increases in temperatures during the winter months for snowmelt runoff generation. In contrast, the Winnipeg River basin had experienced increasing trends in its AMS (central black area in Fig. 2) likely because its summer and autumn precipitation have increased (St. George 2007).
Agricultural practices can also affect regional hydrological regimes significantly. Compared to conventional tillage, soil infiltration and water seepage through soil layers of agriculture land subjected to conservation tillage will increase, and therefore, for the same amount of precipitation, surface runoff and peak streamflow will decrease (Shelton et al. 2000). Thus, the increasing use of conservation tillage in agricultural land throughout the CPs could be partly responsible for the decrease in the AMS over the region (Shelton et al. 2000).
The diversion of water from rivers for irrigation and other consumptive usage will also decrease the streamflow. The amount of agricultural land in Canada has increased in recent decades, which generally has resulted in a significant decrease in the AMS, especially in areas cultivated for intensive agricultural production (Environment Canada 2004). The agricultural lands of the CPs account for about 82% of the total land used for agriculture in Canada. Other human-related land-use practices such as deforestation (Lin and Wei 2008), afforestation (Buttle 2011), mining, and petroleum production (Environment Canada 2004) may also play a major role in changes to flood regimes across Canada.
Past studies of Canadian streamflow trends (e.g., Whitfield and Cannon 2000; Zhang et al. 2001; Burn and Elnur 2002; Whitfield et al. 2003; St. George 2007; Burn et al. 2008, 2010; Cunderlik and Ouarda 2009; Déry et al. 2009) usually assumed that streamflow was not affected by human activities but by climatic factors alone. St. Jacques et al. (2010), who analyzed unregulated, regulated, and naturalized streamflow records of southern Alberta, found that the declining streamflows are due to hydroclimatic changes (probably from global warming) and severe human impacts, which could be of the same order of magnitude, if not greater. Unfortunately, it is very difficult, if not impossible, to separate the impact of climatic and human factors on the streamflow of a river basin because of the complex interactions between climate, humans, and hydrologic processes.
c. GAMLSS modeling of extreme distribution
Since both long-term monotonic trends and abrupt changes have been detected in some AMS series, we further evaluate the nonstationarities of these streamflow data by fitting these data with several PDs designed to characterize data nonstationarities.
If no statistically significant changepoint was detected in an AMS series, that dataset was fitted with four different two-parameter PDs (gamma, Weibull, Gumbel and lognormal) of GAMLSS. PDs for streamflow series with no significant changepoints include 1) a PD with stationary θ1 and θ2; 2) a PD with one time-varying parameter θ1, or both time-varying parameters, θ1 and θ2 modeled as a link function, g(θ), dependent on certain linear functions of time; and 3) a PD with one (θ1) or two time-varying parameters (θ1 and θ2) modeled as g(θ), dependent on certain additive functions of time such as the cubic spline function adopted in this study. If a significant changepoint is detected in the mean of AMS series, more complicated PDs of GAMLSS will be chosen to model both abrupt changes and a time-varying θ1 based on certain piecewise linear functions of time that are discontinuous at the time the changepoint is detected. Because of numerical convergence problems encountered while applying piecewise linear functions of GAMLSS, only θ1 may be modeled as a time-varying parameter. The goodness of fit of all PDs, ranging from a PD with two stationary parameters to a PD with two nonstationary parameters was assessed in terms of bias and AIC scores.
Table 3 shows that for 39 out of 85 stations without a detected changepoint, the AMS series were fitted with several stationary PDs and the PDs with the best goodness-of-fit statistics were selected. For the remaining 46 stations without detected changepoints, it turns out that the data were better fitted using nonstationary PDs with time-varying parameters based on either linear or nonlinear functions of time. Out of 60 stations with detected changepoints, nonstationary PDs with the time-varying location parameter θ1 based on certain piecewise functions of time evidently gave the best fit to 36 stations, nonstationary PDs with certain nonlinear descriptions of time-varying parameters gave the best fit to another 21 stations, and stationary PDs gave the best fit to the remaining 3 stations of the AMS. On the whole, the AMS of 103 out of 145 stations was better fitted with nonstationary PDs, which indicates that over 70% of the AMS records over Canada are nonstationary, and most of the AMS records are best fitted by either lognormal and gamma distributions.
The effectiveness of five nonstationary PDs and one stationary PD applied to model the AMS series of six stations (see Fig. 1) is further examined. These stations with their best-fit PDs are, respectively, 1) 05CC002, Red Deer River at Red Deer, Alberta, a lognormal distribution with one location parameter based on a piecewise linear function of time; 2) 05KJ001, Saskatchewan River at The Pas, Manitoba, a gamma distribution with one location parameter based on a piecewise linear function of time; 3) 08NE049, Columbia River at Birchbank in British Columbia, a lognormal distribution with one location parameter based on a nonlinear function of time; 4) 08JB002, Stellako River at Glenannan, British Columbia, a lognormal distribution with both location and shape parameters based on a nonlinear function of time; 5) 08MH016, Chilliwack River at the outlet of Chilliwack Lake in British Columbia, a lognormal distribution with both location and shape parameters based on a linear function of time; and 6) 01AK001, Shogomoc Stream in New Brunswick, near the Trans-Canada Highway, a stationary lognormal distribution.
The first four stations had abrupt changes but not the latter two stations. The first three are non-RHBN stations while the latter three are RHBN stations. Worm plots of residuals (Fig. 4) for these six stations generally demonstrate good agreements between the fitted PDs with their observed AMS series, with the data points primarily aligned on the fitted solid gray curves that generally fluctuate closely along the zero deviation line, and within the 95% confidence intervals (gray dashed eliptical curves). Figure 5 shows that for station 08MH016, with no significant abrupt change or trend detected, the percentiles of AMS with low probability of occurrences (5% and 25%) increased more significantly compared to other quantiles. Also, extreme annual maximum flood events (5th percentile) for stations 05CC002, 05KJ001, and 08JB002 increased significantly after the 1980s, and at faster rates than in other quantiles, which demonstrates a generally larger increase in extreme events that occurred after the 1980s, or that the chances of the occurrence of extreme events seem to be increasing, which likely implies that achieving reliable flood forecasts in Canada will become more challenging in the future. Different temporal variations of quantiles in the AMS series of these six stations probably confirm predominantly nonstationarities in the AMS series of Canada, which include observed changes in the mean, increased streamflow variability, and the severity of extreme events.
d. Long-term persistence
For natural processes such as streamflow, the range of cumulative departure from the mean normalized by the standard deviation generally follows a power law behavior (Hurst 1951; Klemeš 1974; Potter 1976). If the exponent of this power law phenomenon is larger than 0.5, it is known as the Hurst phenomenon (Hurst 1951), which tends to strongly increase the large time-scale variability or statistical uncertainty. Results of Hurst exponents H obtained in this study are used to suggest some plausible physical explanations for the properties of the Canadian AMS data. Table 4 presents a number of AMS series showing LTP (i.e., H larger than 0.5).
Given that H values calculated for the same station by different methods can differ widely, it seems that detecting the presence of LTP for AMS involves large uncertainties. For example, the number of stations with H > 0.5 as determined by the differenced aggregated variance method is more than double that estimated using the aggregated variance method. By comparing many H estimators, Teverovsky and Taqqu (1997) concluded that the aggregated variance method tends to produce large errors when trends or abrupt changes are present. Therefore, results obtained by the aggregated variance method for stations with detected trends or abrupt changes are henceforth discarded. Out of 145 stations, both the R/S method and Peng’s method estimated slightly more than 100 stations with H larger than 0.5, while the aggregated absolute value method estimated slightly fewer than 100 stations showing LTP.
The inconsistent results obtained from different methods may be partly caused by limited sample sizes. The reliability of the results can also be affected by the nonstationarities of the time series. To estimate the uncertainty of H estimated with sample size, a bootstrapping resampling procedure (Davison and Hinkley 1997) was used to derive the PDs and confidence intervals of H using the R/S and Peng methods. The time series of each station was resampled with replacement for 10 000 times, and the H for each sample was estimated. Spatial distributions of the mean and the 95% confidence intervals of H values estimated by this bootstrapping approach are shown in Fig. 6. The average H derived from the R/S method and Peng’s method are 0.67 and 0.58, with corresponding 95% confidence intervals of 0.46–0.89 and 0.35–0.74, respectively. Apparently, Peng’s method estimated lower H values than did the R/S method. At the lower confidence limit (2.5%), the R/S method estimated 46 stations and Peng’s method estimated 15 stations, out of 145 stations, with H higher than 0.5, which suggests the strong (statistically significant) presence of LTP in the Canadian AMS series. The spatial patterns of H derived from the R/S and Peng methods are similar to each other (Fig. 6). The above results show that stations located in southern Canada and inland generally exhibit higher H than in other parts of Canada; for example, clusters of high H values are located in southeastern British Columbia, southwestern Manitoba, and southeastern Ontario.
Bras and Rodriguez-Iturbe (1985) offered three possible physical explanations underlying the Hurst phenomenon. First, the Hurst phenomenon is transitory because our data are not long enough to test the steady-state behavior of R, the range of cumulative departures from the mean (Cohn and Lins 2005; Koutsoyiannis 2006; Franzke 2012). Second, the phenomenon is due to nonstationarities in the mean of the process (Klemeš 1974; Potter 1976). And third, it is due to stationary processes with infinite memories (Mandelbrot and Wallis 1969). By showing strong positive correlation between H and the mean discharge, air temperature, and basin area of European rivers, Szolgayova et al. (2014) claimed to have found evidence supporting the third explanation.
The scatterplots of the mean H value derived from bootstrap resampling with basin drainage area and the length of time series shown in Fig. 7 reveal a modest positive correlation between H estimated by the R/S and Peng methods and the basin drainage area with statistically significant Mann–Kendall S values, 0.14 and 0.12, respectively, and Spearman rho values, 0.21 and 0.18, respectively. It seems that the AMS for larger watersheds are more likely to show stronger LTP, possibly because of their larger capacity to store inundated flood water. However, Fig. 7 also shows that H values do not depend on the length of the time series but the result is inconclusive because in this study the AMS series are only limited to just over 100 yr.
Trends in environmental data can be attributed to stochastic or deterministic processes, or both. Stochastic trends can exist in very simple stationary stochastic processes over long periods of time, as shown by Cohn and Lins (2005), Koutsoyiannis (2006), and Franzke (2012), who linked LTP to multidecadal oscillations in climate regimes or other internal climate variability. Deterministic trends driven by external forcings such as greenhouse gases and land-use changes can also lead to high values of H (Potter 1976; Rust et al. 2008). Among the 46 (15) stations showing statistically significantly LTP in AMS data detected by the R/S (Peng’s) method, only 16 (2) are RHBN stations while 30 (13) are non-RHBN stations.
On the basis of these results, it seems that having human-induced nonstationarities in a time series can enhance the LTP represented by the H exponent. Furthermore, similar effects of nonstationarities on H have also been found by Klemeš (1974), Potter (1976), and Rust et al. (2008) in their simulation studies on hydroclimatic time series. However, our results are not sufficient to conclude whether LTP detected in Canadian AMS series are the results of nonstationarities caused by stochastic and/or deterministic processes or both. Moreover, a lack of additional information, limited sample sizes, the appropriateness of the estimation methods used in this study, and some unknown external forcings that can possibly contribute to deterministic trends are unresolved issues that are beyond the scope of this study.
5. Summary and conclusions
Based on 145 RHBN and non-RHBN stations across Canada with long-term AMS series and hydrological observations, we investigated the nonstationary characteristics of AMS throughout Canada. The research results can be summarized as follow.
Results from PETT testing show that almost half of the stations, including non-RHBN stations, experienced an abrupt shift in the mean of the AMS. The timing of abrupt changepoints has been shown to be closely related to the years when the regulation of streamflow began. Given that only about ⅙ of the RHBN stations with minimal streamflow regulation and stable land-use conditions experienced abrupt changes, it seems that human interference with nature has been instrumental in the abrupt shift in the mean of AMS across Canada. However, in this study, no changepoint was detected in the variance of AMS at any station.
Trend analysis of complete time series of AMS shows that about 50 out of 145 stations exhibit monotonic temporal (more negative than positive) trends; for example, abrupt changepoints were not considered in the analysis. However, further trend analyses show that only about 12 out of 59 stations detected with abrupt changepoints showed significant monotonic trends in the time series before and/or after changepoints were detected. Thus, abrupt changes are more likely the cause of nonstationarities to AMS series over Canada than are monotonic trends. Similar to abrupt changes attributed to the regulation of streamflow, drainage basins subjected to significant land-use changes are more likely to show temporal trends in the AMS compared to pristine basins with stable land-use conditions. Climate change impacts resulting from human activities could also contribute to monotonic trends and, possibly, abrupt changes in AMS series across Canada.
More than ⅔ of the AMS series could be accurately fitted with lognormal and gamma PDs. The nonstationarities of the AMS series, which include monotonic trends, nonlinear fluctuations, and abrupt changes, were modeled using PDs with time-varying parameters. Other factors such as climate anomalies and land-use change descriptors will be necessary to more fully investigate physical explanations behind various types of nonstationarities found in the streamflow series.
The LTP analysis using the R/S and Peng methods reveals that almost ⅔ of the Canadian AMS series show significant LTP (high H). It seems that widespread LTP detected in the AMS of Canada should be primarily related to the predominant nonstationarities characteristics of AMS data attributed mainly to climatic changes, climate variability, and land-use impacts of human activities. Furthermore, the degree of LTP (H) is also found to be positively correlated with the areas of Canadian river basins.
A possible future extension to this study is to explore changes in climate variables and evidence of human activity that contributes to abrupt changes, trends, and nonstationarity in the streamflow across Canada presented above.
The first author was partly funded by the Chinese Scholarship Council (CSC) and the University of Alberta. The streamflow data were taken from the HYDAT Database of Environment Canada (http://ec.gc.ca/rhc-wsc/default.asp?lang=En&n=9018B5EC-1). All analysis was performed using the R software and the gamlss package, version 4.2-6, 2013, (cran.r-project.org/web/packages/gamlss/gamlss.pdf). The authors thank two anonymous reviewers for their thoughtful comments and recommendations.