Regional Frequency Analysis at Ungauged Sites with the Generalized Additive Model

F. Chebana Eau Terre Environnement, Institut National de la Recherche Scientifique, Université du Québec, Québec, Québec, Canada

C. Charron Institute Center for Water and Environment (iWater), Masdar Institute of Science and Technology, Abu Dhabi, United Arab Emirates

T. B. M. J. Ouarda Institute Center for Water and Environment (iWater), Masdar Institute of Science and Technology, Abu Dhabi, United Arab Emirates, and Eau Terre Environnement, Institut National de la Recherche Scientifique, Université du Québec, Québec, Québec, Canada

B. Martel Eau Terre Environnement, Institut National de la Recherche Scientifique, Université du Québec, Québec, Québec, Canada


Abstract

The log-linear regression model is one of the most commonly used models to estimate flood quantiles at ungauged sites within the regional frequency analysis (RFA) framework. However, hydrological processes are naturally complex in several aspects, including nonlinearity. The aim of the present paper is to take this nonlinearity into account by introducing the generalized additive model (GAM) in the estimation step of RFA. A neighborhood approach using canonical correlation analysis (CCA) is used to delineate homogeneous regions. GAMs possess a number of advantages, such as flexibility in the shapes of the relationships as well as in the distribution of the output variable. The regional model is applied to a dataset of 151 hydrometric stations located in the province of Québec, Canada. A stepwise procedure is employed to select the appropriate physiometeorological variables. A comparison is performed based on different elements (regional model, variable selection, and delineation). Results indicate that models using GAM outperform models using the log-linear regression as well as other methods applied to this dataset. In addition, GAM is flexible and allows for the inclusion and presentation of nonlinear effects of explanatory variables, in particular, the basin area effect (scale). Another finding is the reduced effect of CCA delineation when combined with GAM.

Corresponding author address: Fateh Chebana, INRS-ETE, Université du Québec, 490 rue de la Couronne, Québec, QC G1K 9A9, Canada. E-mail: fateh.chebana@ete.inrs.ca

1. Introduction

Knowledge of flood characteristics is very important for resource management and the design of hydraulic structures. Estimation of design flows is often needed at locations where little or no information is available. In this case, regional frequency analysis (RFA) is often used for the estimation of flow characteristics. Ouarda et al. (2008) presented a detailed review of the various available RFA methods (see also Blöschl et al. 2013). Generally, RFA is composed of two main steps: the identification of groups of hydrologically homogeneous basins and the application of a regional estimation method within each delineated region (GREHYS 1996a; Ouarda 2013). Since flow characteristics are highly dependent on physiographical and meteorological basin characteristics, these can be used to estimate flood quantiles at ungauged sites. The hydrological literature abounds with studies dealing with the development and evaluation of methods for the delineation of hydrological regions and for the study of their homogeneity. However, much less attention has been dedicated to the development of new regional estimation methods.

In the present study, canonical correlation analysis (CCA) is used to delineate homogeneous regions. GREHYS (1996b) showed that this method performed best in comparison with other delineation methods. Among RFA estimation methods, regression models and index-flood models are commonly used. GREHYS (1996b) showed that their performances are equivalent and are superior to those of other models. Generally, regression models such as linear regression models (LRMs) or log-linear regression models (LLRMs) are preferred for their simplicity and speed, as well as their performance. LLRM has been used in conjunction with CCA in many studies (Chokmani and Ouarda 2004; Ouarda et al. 2001). Linear models imply that the relations between the dependent variable (hydrologic) and the predictors (physiometeorological) are linear. This is generally not realistic and can be problematic in some situations, such as the effect of the basin size on flood quantiles, where it is documented that small basins behave differently from large ones. The basin hydrologic response is also not linearly related to the slope of the basin, as larger basin slopes (which are often associated with smaller basins) lead to much more intense flood responses and very extreme specific peak values.

Generalized additive models (GAMs; Hastie and Tibshirani 1986) make it possible to account for nonlinearities that cannot be captured by linear models or by simple variable transformations such as log, power, or square root. The use of a nonlinear model is justified by the fact that hydrological processes are naturally nonlinear (Kundzewicz and Napiórkowski 1986; Wittenberg 1999). Pandey and Nguyen (1999) compared a number of regional flood quantile estimation methods for the power regression model (equivalently log-linear) and found that nonlinear estimation methods (within the same power model) outperformed the log-linear one. Shu and Ouarda (2007) used an artificial neural network approach, which represents a nonlinear model, and obtained better results than with linear regression methods.

GAMs are an extension of generalized linear models (GLMs; Nelder and Wedderburn 1972). The latter brought flexibility to regression methods by allowing nonnormal residuals as well as a general link between predictors and the response variable. In addition, GAMs use nonparametric smooth functions to link the dependent variable to the predictors. Therefore, they are more flexible and can more realistically capture the relation between variables. GAMs have been attracting a lot of attention in statistical developments as well as in practical applications (Hastie and Tibshirani 1986; Kauermann and Opsomer 2003; Marx and Eilers 1998; Morlini 2006; Schindeler et al. 2009; Wood 2003). Recently, additional methodological developments and the availability of implemented computer programs made GAMs increasingly popular in practical research, mainly in the public health and epidemiology fields (Bayentin et al. 2010; Cans and Lavergne 1995; Leitte et al. 2009; Rocklöv and Forsberg 2008; Vieira et al. 2009) and in environmental studies (Borchers et al. 1997; Wen et al. 2011; Wood and Augustin 2002). In the field of meteorology, GAMs were used to model the effect of traffic and meteorology on air quality (Bertaccini et al. 2012), to predict air temperature from satellite surface temperatures (Kloog et al. 2012), and to model mean temperature in mountainous regions (Guan et al. 2009). In hydrological modeling, very few studies employed GAMs. For instance, Tisseuil et al. (2010) used GLM and GAM for the statistical downscaling of general circulation model outputs to local-scale river flows. GAMs were used to estimate nonlinear trends in water quality by Morton and Henderson (2008) and in hydrological extreme series modeling by Ramesh and Davison (2002). Recently, Asquith et al. (2013) employed GAMs to develop readily implemented procedures for the estimation of discharge and velocity from selected predictors at ungauged stream locations. 
However, to the best of the authors' knowledge, GAMs have never been used in the context of RFA of hydrological variables.

The objective of the present study is to introduce GAMs in a complete regional model to estimate flood quantiles. A set of 151 basins in the province of Québec, Canada, is considered as a case study. The GAM is used in combination with the neighborhood approach using CCA. Cross validation is used to evaluate performances. In previous studies dealing with the estimation of flood quantiles with the same dataset (Chokmani and Ouarda 2004; Nezhad et al. 2010; Shu and Ouarda 2007), explanatory variables have been selected based on correlation with specific quantiles. In the present study, an attempt is made to select optimal variables with a stepwise method. The regional model adopting GAM is compared with a model using LLRM, which is commonly used in RFA. Comparisons are also carried out for models with and without the delineation of homogeneous regions with CCA, and also with and without the use of the stepwise method for the selection of variables. The latter comparison is important to separate the impact of using the GAM from that of the stepwise variable selection procedure.

This paper is organized as follows. Section 2 presents the theoretical background on linear regression models, GAMs, and the CCA approach for the delineation of neighborhoods in RFA. The considered dataset as well as the study design are presented in section 3. Section 4 includes the obtained results, while the last section contains the conclusions of the study.

2. Theoretical background

In this section, the required statistical tools are briefly presented and their use in RFA is discussed.

a. Linear regression models

Regression analysis is used to find a relationship between a random variable Y, called the response variable or dependent variable, and one or several random variables X, called the explanatory or predictor variables (or independent variables). Let us define $\mathbf{X}$ as a matrix whose columns are X1, X2, …, Xm, a set of m explanatory variables. The linear regression model is defined by
$$Y = \beta_0 + \mathbf{X}\boldsymbol{\beta} + \varepsilon, \tag{1}$$
where $\beta_0$ and $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_m)^{\top}$ are unknown parameters and $\varepsilon$ is the error term that is assumed to be normally distributed, $\varepsilon \sim N(0, \sigma^2)$. The model parameters are often estimated by the least squares estimator $\hat{\boldsymbol{\beta}} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}Y$, where a column of ones is appended to $\mathbf{X}$ to carry the intercept.
A power product model is generally used to express the relationship between flood quantiles and explanatory variables (Ouarda et al. 2008; Pandey and Nguyen 1999). A log transformation allows for the expression of this model as follows (log-linear model):
$$\log Y = \beta_0 + \sum_{j=1}^{m} \beta_j \log X_j + \varepsilon. \tag{2}$$
Note that the log transformation introduces a bias in the prediction since the aim is the estimation of the variable expectation rather than its logarithm (Girard et al. 2004).
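As a minimal sketch of the log-linear model, the following fits a single-predictor power law by least squares on the log-transformed values and then back-transforms the prediction. The function names, the use of a single predictor, and the synthetic noise-free data are assumptions made for illustration; they are not taken from the study.

```python
import math

def fit_log_linear(x, y):
    """Least-squares fit of log(y) = b0 + b1 * log(x) for one predictor."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

def predict_power(b0, b1, x_new):
    """Back-transform to the original scale: y = exp(b0) * x_new ** b1."""
    return math.exp(b0) * x_new ** b1

# Synthetic, noise-free illustration (made-up values, not from the study).
areas = [10.0, 100.0, 1000.0, 10000.0]
flows = [2.0 * a ** 0.7 for a in areas]
b0, b1 = fit_log_linear(areas, flows)
```

With noisy data, the back-transformed value estimates exp(E[log Y]) rather than E[Y], which is the source of the bias mentioned above.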

b. Generalized additive models

GLMs are a generalization of the well-known ordinary linear model presented previously. They allow for a response distribution other than normal and for a degree of nonlinearity in the model structure (Wood 2006). The GLM can be expressed as follows:
$$g\left(E[Y]\right) = \beta_0 + \mathbf{X}\boldsymbol{\beta}, \tag{3}$$
where g is a monotonic link function and Y could have any distribution from the exponential family, which includes, for instance, Poisson, binomial, and normal distributions.
For more flexibility, GLMs are themselves extended to GAMs by allowing nonparametric fits of the Xj where the linear forms are replaced by smooth functions fj (Hastie and Tibshirani 1986; Wood 2006):
$$g\left(E[Y]\right) = \beta_0 + \sum_{j=1}^{m} f_j(X_j). \tag{4}$$
GAM has several advantages over linear models. It is more flexible because of the smooth functions fj, where there is no need for a transformation to achieve linearity. Hence, it is possible to identify more realistically the effect of each explanatory variable Xj on Y.

To estimate the smooth function fj, a spline is used. A spline is a curve composed of piecewise polynomial functions, joined together at points called knots. A number of spline types have been proposed in the literature, such as cubic-, P-, and B-splines. The thin plate regression splines have some advantages, such as fast computation, lack of requirement for a choice of knot locations, and optimality in approximation of the smoothing [for more details, see Wood (2003, 2006)]. In the present study, the latter splines are considered.

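In general, a smooth function fj can be defined by a set of q spline basis functions $b_{ji}$ such that
$$f_j(x) = \sum_{i=1}^{q} \theta_{ji}\, b_{ji}(x), \tag{5}$$
where $\theta_{ji}$ represent the smoothing coefficients related to the jth function, collected in the vector $\boldsymbol{\theta}_j = (\theta_{j1}, \ldots, \theta_{jq})^{\top}$. To avoid overfitting, the estimator of $\boldsymbol{\theta} = (\boldsymbol{\theta}_1^{\top}, \ldots, \boldsymbol{\theta}_m^{\top})^{\top}$ is obtained by maximizing the penalized log-likelihood:
$$\ell_p(\boldsymbol{\theta}) = \ell(\boldsymbol{\theta}) - \frac{1}{2}\sum_{j=1}^{m} \lambda_j\, \boldsymbol{\theta}_j^{\top}\mathbf{S}_j\,\boldsymbol{\theta}_j, \tag{6}$$
where $\ell(\boldsymbol{\theta})$ is the log-likelihood function, $\lambda_j$ is the smoothing parameter of the jth smooth function fj, and $\mathbf{S}_j$ is a matrix with known coefficients (Wood 2008). The parameter $\lambda_j$ controls the degree of smoothness of the curve fj. Its value is nonnegative, with 0 corresponding to the unpenalized case and increasingly large values to the completely smoothed curve. The optimum value of $\lambda_j$ strikes a balance between goodness of fit and smoothness. The function $\ell_p$ is maximized by the penalized iteratively reweighted least squares (P-IRLS; Wood 2004). The smoothing parameter can be selected according to a criterion such as the generalized cross validation (GCV; Wahba 1985), unbiased risk estimator (UBRE; Craven and Wahba 1978), or maximum likelihood (ML).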
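For a Gaussian response with the identity link, the penalized fit has a closed form, which may help make the role of the smoothing parameters concrete. The expression below is a sketch under that assumption only. The notation is introduced here for illustration: $\mathbf{B}$ is the matrix of basis functions evaluated at the data, $\boldsymbol{\theta}$ is the stacked coefficient vector, each penalty matrix $\mathbf{S}_j$ is padded with zeros to the full dimension, the error variance is absorbed into the $\lambda_j$, and $\mathbf{B}^{\top}\mathbf{B} + \mathbf{S}_{\lambda}$ is assumed invertible.

```latex
\hat{\boldsymbol{\theta}}
  = \arg\min_{\boldsymbol{\theta}}
    \left\{ \lVert \mathbf{y} - \mathbf{B}\boldsymbol{\theta} \rVert^{2}
          + \boldsymbol{\theta}^{\top}\mathbf{S}_{\lambda}\boldsymbol{\theta} \right\}
  = \left( \mathbf{B}^{\top}\mathbf{B} + \mathbf{S}_{\lambda} \right)^{-1}
    \mathbf{B}^{\top}\mathbf{y},
\qquad
\mathbf{S}_{\lambda} = \sum_{j} \lambda_j \mathbf{S}_j ,
\qquad
\mathrm{edf} = \operatorname{tr}\!\left[ \mathbf{B}
    \left( \mathbf{B}^{\top}\mathbf{B} + \mathbf{S}_{\lambda} \right)^{-1}
    \mathbf{B}^{\top} \right].
```

Setting every $\lambda_j$ to zero recovers ordinary least squares on the spline basis, while larger $\lambda_j$ shrink the corresponding smooth toward the functions left unpenalized by $\mathbf{S}_j$ and lower the effective degrees of freedom (edf).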

c. CCA approach in RFA

This section briefly presents the CCA approach and its connection to the delineation step of RFA. This method is explained in more detail in Ouarda et al. (2001) in the RFA context. Let us define two sets of random variables (indicated by curly brackets) $X = \{X_1, \ldots, X_m\}$ and $Y = \{Y_1, \ldots, Y_r\}$. In the present study, the set X contains basin physiographical and meteorological variables, for example, drainage area and mean annual precipitation, and Y contains basin hydrological variables such as flood quantiles. In general, all variables should be standardized and transformed for normality. Mainly, CCA aims to identify the dominant linear modes of covariability between the vectors X and Y and then make an inference about Y given the vector X.

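Consider the linear combinations V and W of the variables of X and Y:
$$V = \mathbf{a}^{\top}X, \qquad W = \mathbf{b}^{\top}Y. \tag{7}$$
CCA allows for the identification of vectors $\mathbf{a}_i$ and $\mathbf{b}_i$ for which the canonical correlations $\rho_i = \mathrm{corr}(V_i, W_i)$, $i = 1, \ldots, p$, are maximized, with $V_i$ and $W_i$ having unit variance.
For each basin $k$, within a given set of basins B, the corresponding values for the canonical variables $V = (V_1, \ldots, V_p)^{\top}$ and $W = (W_1, \ldots, W_p)^{\top}$ are denoted as $v_k$ and $w_k$. Let $v_0$ denote the physiometeorological canonical score for a target site, associated with the obtained canonical variables. The vector $v_0$ is known, but the interest is the estimation of the unknown hydrological canonical score $w_0$. The approximation can be obtained through $\hat{w}_0 = \Lambda v_0$, with $\Lambda = \mathrm{diag}(\rho_1, \ldots, \rho_p)$, such that $E[W \mid V = v_0] = \Lambda v_0$. This leads to the definition of the 100(1 − α)% confidence level neighborhood for $w_0$, containing sites with realizations w of W such that
$$(w - \Lambda v_0)^{\top}\left(\mathbf{I}_p - \Lambda^{2}\right)^{-1}(w - \Lambda v_0) \le \chi^{2}_{\alpha,p}, \tag{8}$$
where $\mathbf{I}_p$ is the $p \times p$ identity matrix and $\chi^{2}_{\alpha,p}$ is such that $P(\chi^{2}_{p} \le \chi^{2}_{\alpha,p}) = 1 - \alpha$. All the aspects related to the CCA in the RFA context are developed in Ouarda et al. (2001).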
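The neighborhood rule can be sketched in a few lines, assuming the standard CCA result that the matrix of canonical correlations is diagonal, so that the quadratic form reduces to a weighted sum of squares. The function names, the canonical scores, and the correlations below are hypothetical, and the chi-square quantile uses a closed form that holds only for two canonical variables (two degrees of freedom).

```python
import math

def mahalanobis_cca(w, v0, rho):
    """Squared distance (w - L v0)' (I - L^2)^-1 (w - L v0) with L = diag(rho)."""
    return sum((wi - r * vi) ** 2 / (1.0 - r ** 2) for wi, vi, r in zip(w, v0, rho))

def chi2_quantile_2df(alpha):
    """Value c with P(chi2 <= c) = 1 - alpha; valid for 2 degrees of freedom only."""
    return -2.0 * math.log(alpha)

def neighborhood(sites_w, v0, rho, alpha):
    """Indices of sites whose hydrological canonical scores lie inside the ellipse."""
    limit = chi2_quantile_2df(alpha)
    return [k for k, w in enumerate(sites_w) if mahalanobis_cca(w, v0, rho) <= limit]

# Made-up canonical scores and correlations, for illustration only.
rho = (0.9, 0.5)
v0 = (1.0, 0.0)
sites_w = [(0.9, 0.0), (3.0, 2.0), (0.5, -0.5)]
selected = neighborhood(sites_w, v0, rho, alpha=0.2)
```

In this sketch, smaller values of α enlarge the ellipse, so the neighborhood grows toward the full set of stations as α approaches zero.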

3. Dataset and study design

The considered dataset has already been studied in the context of RFA in a number of previous studies (Chebana and Ouarda 2008; Chokmani and Ouarda 2004; Nezhad et al. 2010; Shu and Ouarda 2007), which provides an opportunity for comparative evaluation of the results. The dataset consists of 151 hydrometric stations located in the southern half of the province of Québec (between 45° and 55°N), Canada. The hydrological variables are represented by specific flood quantiles (quantiles divided by the basin area), denoted by QS10, QS50, and QS100. The physiographical and meteorological variables, available for each basin, are summarized in Table 1. To avoid redundancy with the previously mentioned studies, details concerning the dataset are not reported here. The reader is referred to the references listed above for information concerning the geographic location of the stations and the scatterplots of the basins in the canonical spaces.

Table 1. Descriptive statistics of hydrological variables and physiometeorological variables.

The CCA in conjunction with LLRM has been proven to perform well (GREHYS 1996b). However, it is suspected that the more general approach (GAM) can improve the estimations. In this study, LLRM and GAM are compared as regional estimation models. GAMs are fitted with the R package mgcv (Wood 2004). The smoothing parameters in Eq. (6) are estimated with the P-IRLS procedure, with the ML score employed as the criterion.

Homogeneous regions are delineated with the CCA method on the basis of the variables BV, PMBV, PLAC, PTMA, and DJBZ. These variables are selected on the basis of maximizing correlations with the hydrological variables. Since CCA requires normality, these variables are transformed for the regional analysis as in the previous studies for this region, that is, a logarithmic transformation for the hydrological variables, PMBV, PTMA, DJBZ, and a square root transformation for PLAC. Figure 3 (not presented here to avoid repetition) in Shu and Ouarda (2007) shows clear nonlinearities at different levels for some variables. This represents a motivation for the use of the GAM with the present dataset.

The design of the present study aims to check the performance of three elements: (i) adoption of the CCA delineation step or considering all stations, (ii) consideration of the nonlinearity in the regression model through either LLRM or GAM during the regional estimation step, and (iii) the variable selection method (stepwise or correlation). This leads to eight combinations denoted as follows:

  • LLRM|ALL|CORR: LLRM with all (ALL) stations (no delineation) and with the five selected variables [from correlation (CORR)];

  • LLRM|ALL|STPW: LLRM with all stations (no delineation) and variables selected using the stepwise method (STPW);

  • LLRM|CCA|CORR: LLRM with homogeneous regions defined by CCA and with the five selected variables (from correlation);

  • LLRM|CCA|STPW: LLRM with homogeneous regions defined by CCA and variables selected using the stepwise method;

  • GAM|ALL|CORR: GAM with all stations (no delineation) and with the five selected variables (from correlation);

  • GAM|ALL|STPW: GAM with all stations (no delineation) and variables selected using the stepwise method;

  • GAM|CCA|CORR: GAM with homogeneous regions defined by CCA and with the five selected variables (from correlation); and

  • GAM|CCA|STPW: GAM with homogeneous regions defined by CCA and variables selected using the stepwise method.
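The eight labels above are the Cartesian product of the three binary design choices. A short sketch of generating them, with variable names chosen here for illustration:

```python
from itertools import product

models = ("LLRM", "GAM")
delineations = ("ALL", "CCA")
selections = ("CORR", "STPW")

# One label per (model, delineation, variable selection) combination.
labels = ["|".join(combo) for combo in product(models, delineations, selections)]
```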

The selection method used in this study is the backward stepwise selection method. It starts with an initial model including all available variables. The regression method is then applied with the current model and the variable with the highest p value is excluded, where the p value corresponds to the null hypothesis that $f_j = 0$ in Eq. (5) for the jth variable. At each step, one variable is excluded. The procedure ends when the p values of all the remaining variables are below a given threshold (5%).
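The backward elimination loop can be sketched as follows. The model fit is abstracted behind `pvalues_for`, a hypothetical caller-supplied function that refits the regression on the current variables and returns their p values; the stub used here returns fixed, made-up p values, whereas in a real fit they would change after each refit.

```python
def backward_stepwise(variables, pvalues_for, threshold=0.05):
    """Drop the least significant variable until all p values are below threshold.

    pvalues_for is a caller-supplied (hypothetical) function that refits the
    regression on the given variables and returns {variable: p value}.
    """
    current = list(variables)
    while current:
        pvals = pvalues_for(current)
        worst = max(current, key=lambda name: pvals[name])
        if pvals[worst] < threshold:
            break  # every remaining variable is significant
        current.remove(worst)
    return current

# Stub standing in for a real model fit: fixed, made-up p values.
def fake_pvalues(names):
    table = {"BV": 0.001, "PMBV": 0.30, "PLAC": 0.02, "LAT": 0.60}
    return {name: table[name] for name in names}

kept = backward_stepwise(["BV", "PMBV", "PLAC", "LAT"], fake_pvalues)
```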

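Once a model is established, its performance can be evaluated. A jackknife procedure is applied to assess the performance of the models. In this procedure, gauged sites are in turn considered ungauged in order to carry out regional estimation. This procedure allows for the assessment of the following performance criteria: the coefficient of determination
$$R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(q_i - \hat{q}_i\right)^{2}}{\sum_{i=1}^{n}\left(q_i - \bar{q}\right)^{2}}, \tag{9}$$
the root-mean-square error
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(q_i - \hat{q}_i\right)^{2}}, \tag{10}$$
the relative root-mean-square error
$$\mathrm{rRMSE} = 100\,\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{q_i - \hat{q}_i}{q_i}\right)^{2}}, \tag{11}$$
the mean bias
$$\mathrm{BIAS} = \frac{1}{n}\sum_{i=1}^{n}\left(q_i - \hat{q}_i\right), \tag{12}$$
and the relative mean bias
$$\mathrm{rBIAS} = \frac{100}{n}\sum_{i=1}^{n}\frac{q_i - \hat{q}_i}{q_i}, \tag{13}$$
where $q_i$ and $\hat{q}_i$ are the local (at site) and regional quantile estimates at station i, respectively; $\bar{q}$ is the local mean of the hydrological variable; and n is the number of stations.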
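The jackknife evaluation and the five criteria can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: errors are taken as local minus regional estimates (the sign convention is assumed), relative criteria are expressed in percent, `fit` and `predict` are hypothetical callables standing in for any regional model, and the toy data are made up.

```python
import math

def performance(q_local, q_regional):
    """Return R2, RMSE, rRMSE (%), BIAS and rBIAS (%) for paired estimates."""
    n = len(q_local)
    mean_q = sum(q_local) / n
    err = [q - qh for q, qh in zip(q_local, q_regional)]
    rel = [e / q for e, q in zip(err, q_local)]
    sse = sum(e ** 2 for e in err)
    sst = sum((q - mean_q) ** 2 for q in q_local)
    return {
        "R2": 1.0 - sse / sst,
        "RMSE": math.sqrt(sse / n),
        "rRMSE": 100.0 * math.sqrt(sum(r ** 2 for r in rel) / n),
        "BIAS": sum(err) / n,
        "rBIAS": 100.0 * sum(rel) / n,
    }

def jackknife(sites, fit, predict):
    """Leave each site out in turn, refit, and predict the held-out site."""
    estimates = []
    for i, held_out in enumerate(sites):
        training = sites[:i] + sites[i + 1:]
        model = fit(training)
        estimates.append(predict(model, held_out))
    return estimates

# Toy check with made-up numbers; the "model" is the mean of the training sites.
sites = [10.0, 20.0, 30.0]
regional = jackknife(sites, fit=lambda s: sum(s) / len(s), predict=lambda m, site: m)
scores = performance(sites, regional)
```

A negative R2, as this toy model produces, simply indicates that the regional estimates are worse than using the overall mean of the local estimates.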

4. Results and discussion

The CCA is applied to the dataset with the normalized variables BV, PMBV, PLAC, PTMA, and DJBZ. An optimal value of α is obtained with the optimization procedure of Ouarda et al. (2001). This optimal value is used to delineate the neighborhood at each station. Each regional model, when considering CCA delineation, uses the same neighborhood for a given station. When CCA is applied to the whole dataset, the two physiometeorological canonical variables are defined as
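[Eq. (14): expression of V1; coefficients not recoverable]
and
[Eq. (15): expression of V2; coefficients not recoverable]
and the two hydrological canonical variables are defined as
[Eq. (16): expression of W1; coefficients not recoverable]
and
[Eq. (17): expression of W2; coefficients not recoverable]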
The nonnegligible values of the BV coefficient in V1 and V2 confirm the need to include BV in the CCA despite the fact that specific hydrological quantiles are used.

The stepwise selection of variables is applied for each specific quantile separately and for each regression model LLRM and GAM. Table 2 indicates that the selected variables are the same for a given model and a given selection method, independently of whether CCA is used for homogeneous region delineation. Therefore, the delineation step seems not to have an effect on the selected variables.

Table 2. Variables selected for each regional model.

The results of the application of the jackknife procedure for the performance evaluation of each regional model are presented in Table 3. The best overall performances are obtained with GAM|ALL|STPW and GAM|CCA|STPW, with CCA leading to slightly better performances. More precisely, and in particular based on the rRMSE, GAM always performs better than LLRM for combinations using the same variable selection approach and the same delineation approach (CCA or ALL).

Table 3. Performances obtained with the eight combinations (model, delineation, and variable selection). Best performances are in boldface for each criterion and quantile.

The use of CCA to delineate hydrologically homogeneous regions generally leads to improvements in regional estimation in comparison to the ALL approach for the same selection of variables and the same regression model (GAM or LLRM). However, when GAM is used, the difference between CCA and ALL is not significant, especially when using the stepwise procedure for the selection of variables. These results show that the use of GAM makes the procedure more robust and compensates for the advantages of using CCA. This is not the case for LLRM, where the use of CCA was shown to lead to significant improvements (see, e.g., Chokmani and Ouarda 2004). In other words, this indicates that the use of GAM reduces the importance of delineating the appropriate hydrological neighborhood. A possible interpretation for this result is that the consideration of nonlinear formulations in the relation between the explanatory physiometeorological variables on one side and the hydrological variables on the other side leads to a reduction of the weight of basins that are not hydrologically similar to the target site.

The stepwise method for variable selection improves quantile estimations in comparison to those obtained with the fixed five variables. This can be explained by the fact that the correlation-based selection of physiometeorological variables to be used in the model is mainly based on a linear relationship between variables. It must also be noted that the variables are originally selected for CCA purposes (delineation) rather than for regression modeling (estimation).

Figures 1 and 2 present the smooth functions fj of the response variable log(QS100) with the explanatory variables of the fitted models GAM|ALL|CORR and GAM|ALL|STPW, respectively. It can be seen that the variables BV, PLAC, LAT, and DJBZ show nonlinear relations. Furthermore, the nonlinear relation is more complex for some variables. For instance, the relationship between log(QS100) and DJBZ decreases for small values of DJBZ, increases for midrange values, and decreases again for high values of DJBZ. This result reflects the seasonality effect of temperature, through DJBZ, on the flood regime. Another example of particular interest concerns the BV variable. Indeed, it can be seen that small basins have a different effect from moderate basins. This result is important since nonlinearity allows for the appropriate inclusion of the variable BV in the model, which eliminates the need to develop specific models for small, moderate, or large basins. Variables PMBV, LON, PLMA, and PTMA have approximately linear relations.

Fig. 1. Smooth functions of QS100 for the explanatory variables included in the regional model GAM|ALL|CORR. The dotted lines represent the 95% confidence intervals. The y axes are named s(var, edf), where var is the name of the explanatory variable and edf is the estimated degrees of freedom of the smooth.

Citation: Journal of Hydrometeorology 15, 6; 10.1175/JHM-D-14-0060.1

Fig. 2. As in Fig. 1, but for GAM|ALL|STPW.

In the present study, the proposed approach based on GAM is mainly compared with the basic formulation of one of the most popular RFA approaches, which is the log-linear estimation model combined with the CCA delineation approach. The comparison can be extended to other regional flood frequency models, such as the ensemble artificial neural networks–CCA (EANN-CCA) approach (Shu and Ouarda 2007, 2008), the kriging–CCA approach (Chokmani and Ouarda 2004), and the depth-based approach (Chebana and Ouarda 2008; Wazneh et al. 2013a,b). To widen the comparison, results corresponding to the above approaches are considered since they are already available for the dataset considered in the present study. Table 4 summarizes the obtained results for all these methods. The results indicate that the GAM-based approach significantly outperforms all the above-listed approaches in terms of rRMSE. In terms of rBIAS, the optimal depth-based approach seems to lead to slightly better results, although the difference is not significant.

Table 4. Results of several RFA approaches applied to the same dataset considered in this study. Best results are in boldface.

5. Conclusions

GAM is commonly used in health, epidemiological, and environmental studies. However, it remains rarely used in the field of hydrology and, to the authors' knowledge, has not previously been applied in RFA. The log-linear multiple regression model is the most widely employed estimation model in RFA, mainly because of its simplicity. However, it assumes a log-linear relationship between the response variable and the explanatory variables. This assumption is not always true and does not reflect the complexity of the hydrological processes involved. The purpose of the present study is first to introduce GAM in RFA and then to compare its results with those obtained by LLRM. GAM is a flexible model that relaxes the assumptions of the LLRM (normality and linearity).

Results of this study indicate that significantly better estimations are obtained from regional models with GAM. For some explanatory variables, the relationship between the logarithm of the response variable and the explanatory variable is not linear. Smooth curves allow for a more realistic understanding of the true relationship between response and explanatory variables. The performance gain from using CCA is not significant with GAM, in contrast to LLRM. This indicates that GAM is robust and efficient in RFA even without the use of a neighborhood approach. Further efforts are required to generalize this conclusion and to test the benefits of GAM in other hydrological applications.

In summary, the use of GAM in RFA is valuable not only in terms of performance but also in terms of other practical aspects (e.g., explicit formulation of the smooth functions, flexibility, reduced number of assumptions, and less subjective choices).

Acknowledgments

Financial support for this study was graciously provided by the Natural Sciences and Engineering Research Council (NSERC) of Canada, under funding to the Canada Research Chair on the Estimation of Hydro-Meteorological variables. The authors thank the Ministry of the Environment of Quebec (MENVIQ) services for the employed datasets. For confidentiality reasons, the data cannot be released. The authors are grateful to the Editor and the anonymous reviewers for their valuable comments and suggestions.

REFERENCES

  • Asquith, W. H., Herrmann G. R. , and Cleveland T. G. , 2013: Generalized additive regression models of discharge and mean velocity associated with direct-runoff conditions in Texas: Utility of the U.S. Geological Survey discharge measurement database. J. Hydrol. Eng., 18, 13311348, doi:10.1061/(ASCE)HE.1943-5584.0000635.

    • Search Google Scholar
    • Export Citation
  • Bayentin, L., El Adlouni S. , Ouarda T. B. M. J. , Gosselin P. , Doyon B. , and Chebana F. , 2010: Spatial variability of climate effects on ischemic heart disease hospitalization rates for the period 1989–2006 in Quebec, Canada. Int. J. Health Geogr., 9, doi:10.1186/1476-072X-9-5.

    • Search Google Scholar
    • Export Citation
  • Bertaccini, P., Dukic V. , and Ignaccolo R. , 2012: Modeling the short-term effect of traffic and meteorology on air pollution in Turin with generalized additive models. Adv. Meteor., 2012, 609328, doi:10.1155/2012/609328.

    • Search Google Scholar
    • Export Citation
  • Blöschl, G., Sivapalan M. , Wagener T. , Viglione A. , and Savenije H. , Eds., 2013: Runoff Prediction in Ungauged Basins: Synthesis across Processes, Places and Scales. Cambridge University Press, 490 pp.

    • Search Google Scholar
    • Export Citation
  • Borchers, D. L., Buckland S. T. , Priede I. G. , and Ahmadi S. , 1997: Improving the precision of the daily egg production method using generalized additive models. Can. J. Fish. Aquat. Sci., 54, 27272742, doi:10.1139/f97-134.

    • Search Google Scholar
    • Export Citation
  • Cans, C., and Lavergne C. , 1995: De la régression logistique vers un modèle additif généralisé: Un exemple d’application. Rev. Stat. Appl., 43, 7790.

    • Search Google Scholar
    • Export Citation
  • Chebana, F., and Ouarda T. B. M. J. , 2008: Depth and homogeneity in regional flood frequency analysis. Water Resour. Res., 44, W11422, doi:10.1029/2007WR006771.

    • Search Google Scholar
    • Export Citation
  • Chokmani, K., and Ouarda T. B. M. J. , 2004: Physiographical space-based kriging for regional flood frequency estimation at ungauged sites. Water Resour. Res., 40, W12514, doi:10.1029/2003WR002983.

    • Search Google Scholar
    • Export Citation
  • Craven, P., and Wahba G. , 1978: Smoothing noisy data with spline functions. Numer. Math., 31, 377403, doi:10.1007/BF01404567.

  • Girard, C., Ouarda T. B. M. J. , and Bobée B. , 2004: Study of the bias in the log-linear regional estimation model. Can. J. Civ. Eng., 31, 361368, doi:10.1139/l03-099.

    • Search Google Scholar
    • Export Citation
  • GREHYS, 1996a: Presentation and review of some methods for regional flood frequency analysis. J. Hydrol., 186, 6384, doi:10.1016/S0022-1694(96)03042-9.

  • GREHYS, 1996b: Inter-comparison of regional flood frequency procedures for Canadian rivers. J. Hydrol., 186, 85–103, doi:10.1016/S0022-1694(96)03043-0.

  • Guan, B. T., Hsu H. W., Wey T. H., and Tsao L. S., 2009: Modeling monthly mean temperatures for the mountain regions of Taiwan by generalized additive models. Agric. For. Meteor., 149, 281–290, doi:10.1016/j.agrformet.2008.08.010.

  • Hastie, T., and Tibshirani R., 1986: Generalized additive models. Stat. Sci., 1, 297–310, doi:10.1214/ss/1177013604.

  • Kauermann, G., and Opsomer J. D., 2003: Local likelihood estimation in generalized additive models. Scand. J. Stat., 30, 317–337. [Available online at www.jstor.org/stable/4616766.]

  • Kloog, I., Chudnovsky A., Koutrakis P., and Schwartz J., 2012: Temporal and spatial assessments of minimum air temperature using satellite surface temperature measurements in Massachusetts, USA. Sci. Total Environ., 432, 85–92, doi:10.1016/j.scitotenv.2012.05.095.

  • Kundzewicz, Z. W., and Napiórkowski J. J., 1986: Nonlinear models of dynamic hydrology. Hydrol. Sci. J., 31, 163–185, doi:10.1080/02626668609491038.

  • Leitte, A. M., and Coauthors, 2009: Respiratory health, effects of ambient air pollution and its modification by air humidity in Drobeta-Turnu Severin, Romania. Sci. Total Environ., 407, 4004–4011, doi:10.1016/j.scitotenv.2009.02.042.

  • Marx, B. D., and Eilers P. H. C., 1998: Direct generalized additive modeling with penalized likelihood. Comput. Stat. Data Anal., 28, 193–209, doi:10.1016/S0167-9473(98)00033-4.

  • Morlini, I., 2006: On multicollinearity and concurvity in some nonlinear multivariate models. Stat. Methods Appl., 15, 3–26, doi:10.1007/s10260-006-0005-9.

  • Morton, R., and Henderson B. L., 2008: Estimation of nonlinear trends in water quality: An improved approach using generalized additive models. Water Resour. Res., 44, W07420, doi:10.1029/2007WR006191.

  • Nelder, J. A., and Wedderburn R. W. M., 1972: Generalized linear models. J. Roy. Stat. Soc., 135A, 370–384, doi:10.2307/2344614.

  • Nezhad, M. K., Chokmani K., Ouarda T., Barbet M., and Bruneau P., 2010: Regional flood frequency analysis using residual kriging in physiographical space. Hydrol. Processes, 24, 2045–2055, doi:10.1002/hyp.7631.

  • Ouarda, T. B. M. J., 2013: Regional hydrological frequency analysis. Encyclopedia of Environmetrics, Vol. 3, 2nd ed. A. H. El-Shaarawi and W. W. Piegorsch, Eds., John Wiley & Sons, doi:10.1002/9780470057339.vnn043.

  • Ouarda, T. B. M. J., Girard C., Cavadias G. S., and Bobee B., 2001: Regional flood frequency estimation with canonical correlation analysis. J. Hydrol., 254, 157–173, doi:10.1016/S0022-1694(01)00488-7.

  • Ouarda, T. B. M. J., St-Hilaire A., and Bobée B., 2008: Synthèse des développements récents en analyse régionale des extrêmes hydrologiques. Rev. Sci. Eau, 21, 219–232.

  • Pandey, G. R., and Nguyen V. T. V., 1999: A comparative study of regression based methods in regional flood frequency analysis. J. Hydrol., 225, 92–101, doi:10.1016/S0022-1694(99)00135-3.

  • Ramesh, N. I., and Davison A. C., 2002: Local models for exploratory analysis of hydrological extremes. J. Hydrol., 256, 106–119, doi:10.1016/S0022-1694(01)00522-4.

  • Rocklöv, J., and Forsberg B., 2008: The effect of temperature on mortality in Stockholm 1998–2003: A study of lag structures and heatwave effects. Scand. J. Public Health, 36, 516–523, doi:10.1177/1403494807088458.

  • Schindeler, S., Muscatello D., Ferson M., Rogers K., Grant P., and Churches T., 2009: Evaluation of alternative respiratory syndromes for specific syndromic surveillance of influenza and respiratory syncytial virus: A time series analysis. BMC Infect. Dis., 9, 190, doi:10.1186/1471-2334-9-190.

  • Shu, C., and Ouarda T. B. M. J., 2007: Flood frequency analysis at ungauged sites using artificial neural networks in canonical correlation analysis physiographic space. Water Resour. Res., 43, W07438, doi:10.1029/2006WR005142.

  • Shu, C., and Ouarda T. B. M. J., 2008: Regional flood frequency analysis at ungauged sites using the adaptive neuro-fuzzy inference system. J. Hydrol., 349, 31–43, doi:10.1016/j.jhydrol.2007.10.050.

  • Tisseuil, C., Vrac M., Lek S., and Wade A. J., 2010: Statistical downscaling of river flows. J. Hydrol., 385, 279–291, doi:10.1016/j.jhydrol.2010.02.030.

  • Vieira, V., Webster T., Weinberg J., and Aschengrau A., 2009: Spatial analysis of bladder, kidney, and pancreatic cancer on upper Cape Cod: An application of generalized additive models to case-control data. Environ. Health, 8, 3, doi:10.1186/1476-069X-8-3.

  • Wahba, G., 1985: A comparison of GCV and GML for choosing the smoothing parameter in the generalized spline smoothing problem. Ann. Stat., 13, 1378–1402, doi:10.1214/aos/1176349743.

  • Wazneh, H., Chebana F., and Ouarda T. B. M. J., 2013a: Optimal depth-based regional frequency analysis. Hydrol. Earth Syst. Sci., 17, 2281–2296, doi:10.5194/hess-17-2281-2013.

  • Wazneh, H., Chebana F., and Ouarda T. B. M. J., 2013b: Depth-based regional index-flood model. Water Resour. Res., 49, 7957–7972, doi:10.1002/2013WR013523.

  • Wen, L., Rogers K., Saintilan N., and Ling J., 2011: The influences of climate and hydrology on population dynamics of waterbirds in the lower Murrumbidgee River floodplains in Southeast Australia: Implications for environmental water management. Ecol. Modell., 222, 154–163, doi:10.1016/j.ecolmodel.2010.09.016.

  • Wittenberg, H., 1999: Baseflow recession and recharge as nonlinear storage processes. Hydrol. Processes, 13, 715–726, doi:10.1002/(SICI)1099-1085(19990415)13:5<715::AID-HYP775>3.0.CO;2-N.

  • Wood, S. N., 2003: Thin plate regression splines. J. Roy. Stat. Soc., 65B, 95–114, doi:10.1111/1467-9868.00374.

  • Wood, S. N., 2004: Stable and efficient multiple smoothing parameter estimation for generalized additive models. J. Amer. Stat. Assoc., 99, 673–686, doi:10.1198/016214504000000980.

  • Wood, S. N., 2006: Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC Press, 392 pp.

  • Wood, S. N., 2008: Fast stable direct fitting and smoothness selection for generalized additive models. J. Roy. Stat. Soc., 70B, 495–518, doi:10.1111/j.1467-9868.2007.00646.x.

  • Wood, S. N., and Augustin N. H., 2002: GAMs with integrated model selection using penalized regression splines and applications to environmental modelling. Ecol. Modell., 157, 157–177, doi:10.1016/S0304-3800(02)00193-X.

    • Search Google Scholar
    • Export Citation
Save
  • Asquith, W. H., Herrmann G. R. , and Cleveland T. G. , 2013: Generalized additive regression models of discharge and mean velocity associated with direct-runoff conditions in Texas: Utility of the U.S. Geological Survey discharge measurement database. J. Hydrol. Eng., 18, 13311348, doi:10.1061/(ASCE)HE.1943-5584.0000635.

    • Search Google Scholar
    • Export Citation
  • Bayentin, L., El Adlouni S. , Ouarda T. B. M. J. , Gosselin P. , Doyon B. , and Chebana F. , 2010: Spatial variability of climate effects on ischemic heart disease hospitalization rates for the period 1989–2006 in Quebec, Canada. Int. J. Health Geogr., 9, doi:10.1186/1476-072X-9-5.

    • Search Google Scholar
    • Export Citation
  • Bertaccini, P., Dukic V. , and Ignaccolo R. , 2012: Modeling the short-term effect of traffic and meteorology on air pollution in Turin with generalized additive models. Adv. Meteor., 2012, 609328, doi:10.1155/2012/609328.

    • Search Google Scholar
    • Export Citation
  • Blöschl, G., Sivapalan M. , Wagener T. , Viglione A. , and Savenije H. , Eds., 2013: Runoff Prediction in Ungauged Basins: Synthesis across Processes, Places and Scales. Cambridge University Press, 490 pp.

    • Search Google Scholar
    • Export Citation
  • Borchers, D. L., Buckland S. T. , Priede I. G. , and Ahmadi S. , 1997: Improving the precision of the daily egg production method using generalized additive models. Can. J. Fish. Aquat. Sci., 54, 27272742, doi:10.1139/f97-134.

    • Search Google Scholar
    • Export Citation
  • Cans, C., and Lavergne C. , 1995: De la régression logistique vers un modèle additif généralisé: Un exemple d’application. Rev. Stat. Appl., 43, 7790.

    • Search Google Scholar
    • Export Citation
  • Chebana, F., and Ouarda T. B. M. J. , 2008: Depth and homogeneity in regional flood frequency analysis. Water Resour. Res., 44, W11422, doi:10.1029/2007WR006771.

    • Search Google Scholar
    • Export Citation
  • Chokmani, K., and Ouarda T. B. M. J. , 2004: Physiographical space-based kriging for regional flood frequency estimation at ungauged sites. Water Resour. Res., 40, W12514, doi:10.1029/2003WR002983.

    • Search Google Scholar
    • Export Citation
  • Craven, P., and Wahba G. , 1978: Smoothing noisy data with spline functions. Numer. Math., 31, 377403, doi:10.1007/BF01404567.

  • Girard, C., Ouarda T. B. M. J. , and Bobée B. , 2004: Study of the bias in the log-linear regional estimation model. Can. J. Civ. Eng., 31, 361368, doi:10.1139/l03-099.

    • Search Google Scholar
    • Export Citation
  • GREHYS, 1996a: Presentation and review of some methods for regional flood frequency analysis. J. Hydrol., 186, 6384, doi:10.1016/S0022-1694(96)03042-9.

    • Search Google Scholar
    • Export Citation
  • GREHYS, 1996b: Inter-comparison of regional flood frequency procedures for Canadian rivers. J. Hydrol., 186, 85103, doi:10.1016/S0022-1694(96)03043-0.

    • Search Google Scholar
    • Export Citation
  • Guan, B. T., Hsu H. W. , Wey T. H. , and Tsao L. S. , 2009: Modeling monthly mean temperatures for the mountain regions of Taiwan by generalized additive models. Agric. For. Meteor., 149, 281290, doi:10.1016/j.agrformet.2008.08.010.

    • Search Google Scholar
    • Export Citation
  • Hastie, T., and Tibshirani R. , 1986: Generalized additive models. Stat. Sci., 1, 297310, doi:10.1214/ss/1177013604.

  • Kauermann, G., and Opsomer J. D. , 2003: Local likelihood estimation in generalized additive models. Scand. J. Stat.,30, 317337. [Available online at www.jstor.org/stable/4616766.]

  • Kloog, I., Chudnovsky A. , Koutrakis P. , and Schwartz J. , 2012: Temporal and spatial assessments of minimum air temperature using satellite surface temperature measurements in Massachusetts, USA. Sci. Total Environ., 432, 8592, doi:10.1016/j.scitotenv.2012.05.095.

    • Search Google Scholar
    • Export Citation
  • Kundzewicz, Z. W., and Napiórkowski J. J. , 1986: Nonlinear models of dynamic hydrology. Hydrol. Sci. J., 31, 163185, doi:10.1080/02626668609491038.

    • Search Google Scholar
    • Export Citation
  • Leitte, A. M., and Coauthors, 2009: Respiratory health, effects of ambient air pollution and its modification by air humidity in Drobeta-Turnu Severin, Romania. Sci. Total Environ., 407, 40044011, doi:10.1016/j.scitotenv.2009.02.042.

    • Search Google Scholar
    • Export Citation
  • Marx, B. D., and Eilers P. H. C. , 1998: Direct generalized additive modeling with penalized likelihood. Comput. Stat. Data Anal., 28, 193209, doi:10.1016/S0167-9473(98)00033-4.

    • Search Google Scholar
    • Export Citation
  • Morlini, I., 2006: On multicollinearity and concurvity in some nonlinear multivariate models. Stat. Methods Appl.,15, 326, doi:10.1007/s10260-006-0005-9.

    • Search Google Scholar
    • Export Citation
  • Morton, R., and Henderson B. L. , 2008: Estimation of nonlinear trends in water quality: An improved approach using generalized additive models. Water Resour. Res., 44, W07420, doi:10.1029/2007WR006191.

    • Search Google Scholar
    • Export Citation
  • Nelder, J. A., and Wedderburn R. W. M. , 1972: Generalized linear models. J. Roy. Stat. Soc., 135A, 370384, doi:10.2307/2344614.

  • Nezhad, M. K., Chokmani K. , Ouarda T. , Barbet M. , and Bruneau P. , 2010: Regional flood frequency analysis using residual kriging in physiographical space. Hydrol. Processes, 24, 20452055, doi:10.1002/hyp.7631.

    • Search Google Scholar
    • Export Citation
  • Ouarda, T. B. M. J., 2013: Regional hydrological frequency analysis. Encyclopedia of Environmetrics, Vol. 3, 2nd ed. A. H. El-Shaarawi and W. W. Piegorsch, Eds., John Wiley & Sons, doi:10.1002/9780470057339.vnn043.

  • Ouarda, T. B. M. J., Girard C. , Cavadias G. S. , and Bobee B. , 2001: Regional flood frequency estimation with canonical correlation analysis. J. Hydrol., 254, 157173, doi:10.1016/S0022-1694(01)00488-7.

    • Search Google Scholar
    • Export Citation
  • Ouarda, T. B. M. J., St-Hilaire A. , and Bobée B. , 2008: Synthèse des développements récents en analyse régionale des extrêmes hydrologiques. Rev. Sci. Eau,21, 219–232.

  • Pandey, G. R., and Nguyen V. T. V. , 1999: A comparative study of regression based methods in regional flood frequency analysis. J. Hydrol., 225, 92101, doi:10.1016/S0022-1694(99)00135-3.

    • Search Google Scholar
    • Export Citation
  • Ramesh, N. I., and Davison A. C. , 2002: Local models for exploratory analysis of hydrological extremes. J. Hydrol., 256, 106119, doi:10.1016/S0022-1694(01)00522-4.

    • Search Google Scholar
    • Export Citation
  • Rocklöv, J., and Forsberg B. , 2008: The effect of temperature on mortality in Stockholm 1998–2003: A study of lag structures and heatwave effects. Scand. J. Public Health, 36, 516523, doi:10.1177/1403494807088458.

    • Search Google Scholar
    • Export Citation
  • Schindeler, S., Muscatello D. , Ferson M. , Rogers K. , Grant P. , and Churches T. , 2009: Evaluation of alternative respiratory syndromes for specific syndromic surveillance of influenza and respiratory syncytial virus: A time series analysis. BMC Infect. Dis., 9, 190, doi:10.1186/1471-2334-9-190.

    • Search Google Scholar
    • Export Citation
  • Shu, C., and Ouarda T. B. M. J. , 2007: Flood frequency analysis at ungauged sites using artificial neural networks in canonical correlation analysis physiographic space. Water Resour. Res., 43, W07438, doi:10.1029/2006WR005142.

    • Search Google Scholar
    • Export Citation
  • Shu, C., and Ouarda T. B. M. J. , 2008: Regional flood frequency analysis at ungauged sites using the adaptive neuro-fuzzy inference system. J. Hydrol., 349, 3143, doi:10.1016/j.jhydrol.2007.10.050.

    • Search Google Scholar
    • Export Citation
  • Tisseuil, C., Vrac M. , Lek S. , and Wade A. J. , 2010: Statistical downscaling of river flows. J. Hydrol., 385, 279291, doi:10.1016/j.jhydrol.2010.02.030.

    • Search Google Scholar
    • Export Citation
  • Vieira, V., Webster T. , Weinberg J. , and Aschengrau A. , 2009: Spatial analysis of bladder, kidney, and pancreatic cancer on upper Cape Cod: An application of generalized additive models to case-control data. Environ. Health, 8, 3, doi:10.1186/1476-069X-8-3.

    • Search Google Scholar
    • Export Citation
  • Wahba, G., 1985: A comparison of GCV and GML for choosing the smoothing parameter in the generalized spline smoothing problem. Ann. Stat., 13, 13781402, doi:10.1214/aos/1176349743.

    • Search Google Scholar
    • Export Citation
  • Wazneh, H., Chebana F. , and Ouarda T. B. M. J. , 2013a: Optimal depth-based regional frequency analysis. Hydrol. Earth Syst. Sci., 17, 22812296, doi:10.5194/hess-17-2281-2013.

    • Search Google Scholar
    • Export Citation
  • Wazneh, H., Chebana F. , and Ouarda T. B. M. J. , 2013b: Depth-based regional index-flood model. Water Resour. Res.,49, 7957–7972, doi:10.1002/2013WR013523.

  • Wen, L., Rogers K. , Saintilan N. , and Ling J. , 2011: The influences of climate and hydrology on population dynamics of waterbirds in the lower Murrumbidgee River floodplains in Southeast Australia: Implications for environmental water management. Ecol. Modell., 222, 154163, doi:10.1016/j.ecolmodel.2010.09.016.

    • Search Google Scholar
    • Export Citation
  • Wittenberg, H., 1999: Baseflow recession and recharge as nonlinear storage processes. Hydrol. Processes, 13, 715726, doi:10.1002/(SICI)1099-1085(19990415)13:5<715::AID-HYP775>3.0.CO;2-N.

    • Search Google Scholar
    • Export Citation
  • Wood, S. N., 2003: Thin plate regression splines. J. Roy. Stat. Soc., 65B, 95114, doi:10.1111/1467-9868.00374.

  • Wood, S. N., 2004: Stable and efficient multiple smoothing parameter estimation for generalized additive models. J. Amer. Stat. Assoc., 99, 673686, doi:10.1198/016214504000000980.

    • Search Google Scholar
    • Export Citation
  • Wood, S. N., 2006: Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC Press, 392 pp.

  • Wood, S. N., 2008: Fast stable direct fitting and smoothness selection for generalized additive models. J. Roy. Stat. Soc., 70B, 495518, doi:10.1111/j.1467-9868.2007.00646.x.

    • Search Google Scholar
    • Export Citation
  • Wood, S. N., and Augustin N. H. , 2002: GAMs with integrated model selection using penalized regression splines and applications to environmental modelling. Ecol. Modell., 157, 157177, doi:10.1016/S0304-3800(02)00193-X.

    • Search Google Scholar
    • Export Citation
  • Fig. 1.

    Smooth functions of QS100 for the explanatory variables included in the regional model GAM|ALL|CORR. The dotted lines represent the 95% confidence intervals. The y axes are labeled s(var, edf), where var is the name of the explanatory variable and edf is the estimated degrees of freedom of the smooth.

  • Fig. 2.

    As in Fig. 1, but for GAM|ALL|STPW.
