1. Introduction
a. Weather extremes in south China
In China, extreme weather events regularly damage ecosystems and affect the socioeconomic sphere (e.g., agricultural production). The population living in areas vulnerable to weather extremes such as floods, rain storms, and droughts is increasing. Weather extremes are common in China because of the monsoon systems over East Asia and China’s various climate regimes (Zhai et al. 2005). The potential for economic damage from extreme climate events and other natural disasters has increased in recent decades because of the fast-growing population and industrialization in major river basins in China (Feng et al. 2007). The intensity and frequency of these extremes have been analyzed in several studies on climate change in China and at the global scale (Ding et al. 2007; Trenberth et al. 2007; Klein Tank et al. 2009). For the climate and weather extremes in south China, the East Asian summer monsoon (EASM) and the western Pacific subtropical high (WP) have been identified as strong influencing factors (Ding et al. 2007; Fischer et al. 2011, 2012; Gemmer et al. 2011; Yin et al. 2009).
b. Impacts of precipitation extremes and weather insurance concept
Heavy rainfall has led to catastrophic flooding in southern China and has caused decreases in annual grain production and direct economic losses, which are observable in economic data from the China Meteorological Administration (CMA), the Pearl River Water Resources Commission (PRWRC), and the Economic Research Service (ERS) of the United States Department of Agriculture. These flood and heavy rainfall events thus have adverse effects on farmers’ livelihoods. According to the World Food Programme (WFP), weather-indexed crop insurance programs can help farmers to cope with disaster losses and help governments predict the onset of natural hazards to take appropriate measures to cushion their impacts and to reduce vulnerability (Hazell et al. 2010; Parry et al. 2009). Crop insurance has been one of the most successful risk management and longest-running stabilization programs for farmers in many parts of the world (Boyd et al. 2011). Recently, the China Insurance Regulatory Commission (CIRC) has been encouraging insurers to apply new methods to extend agricultural insurance cover in order to improve food security and decrease the vulnerability to weather-related losses in the country (www.circ.gov.cn). Based on several case studies, Skees (2007) concludes that properly designed and targeted index-based weather insurance products can facilitate the development of robust rural financial markets. In weather-indexed crop insurance, a contract is written against a prespecified index that establishes a relationship between weather phenomena (e.g., heavy rainfall) and crop failure (i.e., losses in agricultural production). According to Turvey and Kong (2010), farmers in China would have an interest in purchasing weather insurance, with a strong interest in precipitation insurance.
Statistically, weather index insurance covers the extreme tail of the probability distribution of weather events for a specified region (Belete et al. 2007). The determination of the index depends on the probabilities associated with the given risk. This typically requires long datasets of acceptable quality, which enable the estimation of the likelihood of an extreme event, the level of vulnerability and exposure, and the economic losses incurred (Jiang et al. 2010). Commonly, the probability distribution of the indexed parameter is calculated from observed data. The probability distribution of the extreme tails is often expressed in return levels of reconstructed and observed climate variables (i.e., precipitation extremes) at specific return periods. An accurate estimation of return levels at given return periods is relevant for the determination of indices for weather index–based crop insurance and other adaptation measures (Adger et al. 2007; Belete et al. 2007; Klein Tank et al. 2009; Semmler and Jacob 2004; Lehner et al. 2006).
These statistical distributions are often used to model time series and to define the frequency of annual extremes. The r-yr return level is the quantile that has a probability 1/r of being exceeded in a particular year. In general, the distributions of hydrological time series are assessed to determine the magnitude of an r-yr return level, for instance that of a 100-yr flood event (Petrow and Merz 2009). Less often, studies focus on the distribution of meteorological time series of maximum precipitation and rain storm events, although these extreme precipitation indicators are the main climate elements where flood risks are concerned (e.g., Feng et al. 2007; Groisman et al. 1999; Klein Tank et al. 2009; Nadarajah and Choi 2007; Su et al. 2009; Vovoras and Tsokos 2009; Yang et al. 2010; Zhai et al. 1999).
Insurance providers require preliminary research on the accurate estimation of potential weather indexes for crop insurance products in south China. In addition, it is important to identify which distribution function fits best for the determination of the extreme tails of these precipitation extremes.
c. Distribution functions and state of research in south China
In China, the most commonly applied distribution function for precipitation extremes in the last decades was the three-parameter gamma distribution (GA3; similar to the Pearson type-3 distribution) for comparing extreme floods (Groisman et al. 1999; Wang et al. 2008). For example, Wang et al. (2008) applied the gamma distribution and the Kolmogorov–Smirnov (KS) test to detect changes in extreme precipitation and extreme streamflow in southern China. The gamma distribution is also used as the standard distribution in the calculation of the standardized precipitation index (SPI; Zhai et al. 2010; Fischer et al. 2011) and for precipitation intensity estimation in the Intergovernmental Panel on Climate Change (IPCC) Fourth Assessment Report (AR4; Solomon et al. 2007).
Recently, the World Meteorological Organization (WMO) recommended the use of the generalized extreme value (GEV) distribution for block maxima and the generalized Pareto (GPA) distribution for peak-over-threshold events (Klein Tank et al. 2009). Several studies applied these distributions with diverse results (e.g., Solomon et al. 2007). For 651 stations in China, time series of annual maximum precipitation (1, 2, 5, and 10 day) from 1951 to 2000 have been analyzed by Feng et al. (2007). Applying the GEV distribution, they concluded that events that were 50-yr events in these regions in the 1950s became more frequent and had become 25-yr events by the 1990s. They detected negative trends in extreme events in north China but significant positive trends in the Yangtze River basin and northwestern China. For eastern China, Jiang et al. (2009) analyzed extreme precipitation from daily records and concluded that the GPA distribution is superior to other extreme value distributions such as GEV. For the Zhujiang River basin (ZRB) in south China, Yang et al. (2010) analyzed annual consecutive 1-, 3-, 5-, and 7-day precipitation for 42 stations from 1960 to 2005 and determined the best fitting of six different distributions for six predefined regions. In that study, GEV is among the best-fitting distributions for the estimation of return periods of 1, 10, 50, and 100 years. They also showed the spatial distribution of the different calculated return periods.
Relatively recent studies investigated and applied the five-parameter Wakeby (WAK) distribution for the estimation of return periods of extreme precipitation indicators (Park et al. 2001; Öztekin 2007; Su et al. 2009). To obtain reliable quantile estimates with the WAK distribution, Park et al. (2001) investigated the use of L moments for the estimation of the five parameters. Similar research was undertaken by Öztekin (2007), who concluded from the results of the Anderson–Darling (AD) test that the WAK distribution fits best for station data in the eastern United States. Su et al. (2009), who calculated the maximum precipitation and the Munger index to estimate observed and projected extreme precipitation events in the Yangtze River basin, selected the best-fitting distribution for flood/drought frequencies by comparing the GEV, general logistic (GLO), GPA, and WAK distributions. Here, the WAK distribution proved to be the best fit based on the Kolmogorov–Smirnov test. The 50-yr return periods were estimated for simulated (1951–2000) and projected (2001–50) time series.
Most of the aforementioned studies analyze the spatial pattern of extreme events at certain return periods, while some also investigate the best-fitting distributions. The results indicate the probability of extreme events and can be used in the planning of adaptation measures.
d. Objectives
As preliminary research toward the theoretical development of a weather index–based insurance program in south China, reported annual economic losses caused by flood events are associated with annual extreme precipitation indicators and annual grain production. An accurate estimation of return levels at given return periods is required to determine the theoretical thresholds. An investigation of four commonly used distribution functions for precipitation extremes will be carried out under the hypothesis that GEV is the overall best-fitting distribution for 192 stations in the Zhujiang River basin. Different goodness-of-fit tests will be applied to obtain insights into the reliability and robustness of each distribution function, as this has not been done yet. Based on the results, a spatial analysis of the frequency of annual precipitation extremes in the ZRB is presented and set into context with the theoretical development of a weather index–based crop insurance program.
2. Data and methodology
a. Regional setting
Located in south China, the Zhujiang River basin (also known as the Pearl River basin) stretches almost entirely over the administrative areas of Guangdong Province and Guangxi Autonomous Region. With an area of approximately 579 000 km², it is ranked as the third largest river basin within China. The East Asian summer and winter monsoons have strong influences on the seasonal climate regimes, which are categorized as tropical to subtropical climates. The western part of the basin is mountainous, while the central and southeastern parts are mainly hilly lowlands. The Zhujiang River is formed by three main rivers (i.e., the Beijiang River, Dongjiang River, and Xijiang River). The streamflow has a southeast- to southwestward direction because of the basin’s topography. All three main rivers, including tributaries, merge into a large network delta (i.e., Zhujiang River Delta) at the southeastern coast. A map with the location of the 192 weather stations used in this investigation and the main river system is provided in Fig. 1.
Fig. 1. Overview map of the Zhujiang River basin in south China, indicating location of meteorological stations (black dots) and the river system (gray curved line).
Citation: Journal of Hydrometeorology 13, 3; 10.1175/JHM-D-11-041.1
b. Data
Daily precipitation records of 192 meteorological stations (Fig. 1) in the ZRB for the period 1961–2007 are used. Earlier instrumental records are unavailable. The dataset was provided by the National Meteorological Information Center (NMIC) of the CMA. The data were quality controlled by the NMIC (Qian and Lin 2005). The NMIC checked the data for homogeneity using the departure accumulating method (Buishand 1982). The precipitation data records remain unadjusted, and data gaps make up less than 0.01% of the daily precipitation records. The same dataset was used and investigated by Gemmer et al. (2011) and Fischer et al. (2011, 2012). Multiple findings (including figures and tables) on precipitation extremes, wetness and dryness patterns, and change points in the ZRB can be obtained from these articles.
Historical and recent data on major flood events in the ZRB (i.e., economic losses and affected area and population) are taken from the website of the PRWRC (www.pearlwater.gov.cn), while the annual overall economic losses from all natural hazards of Guangxi Autonomous Region are taken from the Provincial Disaster Statistics for Guangxi Autonomous Region 1950–2000 as provided by the CMA. The estimated annual grain production for 1963–2007 for the provinces of Guangdong and Guangxi is extracted from the ERS website (www.ers.usda.gov).
c. Methodology
1) Indicators
Based on observed daily precipitation data (1961–2007), two indicators are created in order to analyze and describe annual 1- and 5-day maximum precipitation extremes, which represent flood indexes for weather index–based insurance. The precipitation indicators consist of the annual maximum 1-day precipitation (RX1) and annual maximum consecutive 5-day precipitation (RX5). The indicators were defined on fixed terms predetermined by CMA and international standards (Li et al. 2010; Qian and Lin 2005; Su et al. 2009; Klein Tank et al. 2009).
For reliable results in estimated return levels, and hence in the application of all common distribution functions, a main assumption is that the time series are stationary. Significant trends, cycles, and autocorrelation (heteroscedasticity) indicate nonstationarity. Hence, linear regression, the Mann–Kendall test, and Engle’s test (for autocorrelation) are applied to the RX1 and RX5 of each station to identify stations exhibiting significant trends and autoregressive conditional heteroscedasticity (ARCH) effects in residuals at the 0.05 significance level (Gemmer et al. 2004; Gao et al. 2010; Duchesne 2006). Based on the two trend tests on the extreme indicators RX1 (RX5), significant trends are found at 8 (9) and 4 (3) out of 192 stations. Autocorrelation at lag 1 was detected at 4 and 7 stations. At these stations, the residual indicator time series are not independently distributed. The stations with significant trend and/or autocorrelation show no obvious spatial patterns in their location. Hence, the influence of large-scale covariates is rather unlikely. In conclusion, 181 time series are assumed to be stationary, while 11 time series show significant trends or autocorrelation and are therefore excluded from the distribution analysis.
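The trend screening described above can be sketched as follows. This is a minimal illustration and not the procedure used in the study: it assumes SciPy is available, expresses the Mann–Kendall test as Kendall's tau of the series against time, and leaves out Engle's ARCH test. The function name `screen_for_trend` and the dictionary keys are hypothetical.

```python
import numpy as np
from scipy.stats import kendalltau, linregress


def screen_for_trend(annual_values, alpha=0.05):
    """Flag a significant monotonic or linear trend in one annual series."""
    y = np.asarray(annual_values, dtype=float)
    t = np.arange(y.size)  # time index in years since the first record

    # Mann-Kendall trend test, written as Kendall's tau between time and value
    tau, p_mk = kendalltau(t, y)

    # Ordinary least-squares slope and its two-sided p-value
    reg = linregress(t, y)

    return {
        "mk_tau": tau,
        "mk_significant": p_mk < alpha,
        "slope": reg.slope,
        "lr_significant": reg.pvalue < alpha,
    }
```

A station would be excluded from the distribution analysis if either flag is set for RX1 or RX5; the ARCH check would need a separate test.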
2) Distribution functions
To identify the best-fitting distribution function for extreme precipitation at 181 stations in the ZRB, a comparative frequency analysis is applied to the precipitation indicators (RX1 and RX5). Following the examples by Feng et al. (2007), Su et al. (2009), and Yang et al. (2010), only the four most commonly used three- and five-parameter distributions are considered: the three-parameter GA3, GEV, and GPA distributions and the five-parameter WAK distribution. In reference to the GEV and GPA distributions, a good overview of the advantages and disadvantages of the peak-over-threshold and block maxima approaches is given by Palutikof et al. (1999). The indicator values of the total available time period from 1961 to 2007 (47 years) are used for estimating the distributions, because return levels for longer return periods (e.g., 25 years) are estimated more accurately from the full record than from a shorter time period. The definitions of the cumulative distribution functions of GA3, GEV, and GPA (Klein Tank et al. 2009; Hamed and Rao 1999; Su et al. 2009) are presented in Table 1. Here, x is an individual raw score, γ is a location parameter similar to the time series’ mean, β and k are continuous shape parameters, and z is the time series’ mean (μ) subtracted from an individual raw score (X) and divided by the time series’ standard deviation (σ). For the probability distribution function of WAK (Table 1), β, γ, and σ are shape parameters, and ξ and α are location parameters (Hamed and Rao 1999; Su et al. 2009).
Table 1. Respective cumulative and probability distribution functions of GA3, GEV, GPA, and WAK.
Several methods can be used for the estimation of distribution function parameters. The most common are the maximum likelihood method, the method of moments, the probability weighted moments method, and the L-moments method. According to Hamed and Rao (1999), the maximum likelihood method is the most efficient since it provides the smallest sampling variance of the estimated quantiles. The L-moments method has advantages when dealing with small and moderate samples (Hosking and Wallis 1997). The use of different parameter estimation methods may result in different quantile estimates as each parameter estimation method has its own strengths and limitations. In this paper, the maximum likelihood method (with one thousand iterations) and the L-moments approach are applied.
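As an illustration of the maximum likelihood route, the following sketch fits a GEV distribution to one series of annual maxima and derives a return level from it. It assumes SciPy's `genextreme`, whose shape parameter `c` has the opposite sign to the shape parameter used in much of the hydrological literature; the helper names `fit_gev_mle` and `return_level` are hypothetical, and the optimizer settings are SciPy defaults rather than those of the study.

```python
import numpy as np
from scipy.stats import genextreme


def fit_gev_mle(annual_maxima):
    """Maximum likelihood GEV fit; returns (c, loc, scale) in SciPy's convention."""
    x = np.asarray(annual_maxima, dtype=float)
    c, loc, scale = genextreme.fit(x)
    return c, loc, scale


def return_level(params, return_period):
    """Quantile exceeded with probability 1/return_period in any given year."""
    c, loc, scale = params
    non_exceedance = 1.0 - 1.0 / return_period  # e.g. 0.96 for 25 yr
    return genextreme.ppf(non_exceedance, c, loc=loc, scale=scale)
```

With 47 annual values per station, the fitted shape parameter can be quite variable, which is one reason to compare the result with an L-moments estimate.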
3) Uncertainty quantification
We apply the bootstrapping method to more reliably assess the associated variability of the parameter estimation. For each station-based indicator, 1000 bootstrap members are generated by sampling with replacement (Davison and Hinkley 1997; Kharin et al. 2007). The 10% and 90% confidence bounds for each distribution function are generated for the return levels of the 25- and 50-yr return periods (Kao and Ganguly 2011). To quantify the uncertainty in parameter estimation of each distribution function due to sample errors in annual precipitation extremes, we compare the percentage differences of the confidence bounds to the estimated return level of the original sample. A higher percentage indicates a higher variability (i.e., greater uncertainty) of the return level estimators of the candidate distribution used. Further procedures, equations, and estimations of parameter uncertainty can be found in Hamed and Rao (1999), Su et al. (2009), Yang et al. (2010), and Hosking and Wallis (1997).
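The bootstrap procedure can be sketched as below for the GEV case. Several choices here are assumptions made for illustration: the parameters are estimated with the closed-form L-moments approximation of Hosking and Wallis (1997) in their sign convention, the "averaged percentage difference" is read as the mean of the two relative distances between the bounds and the original estimate, and all function names are hypothetical.

```python
import math
import numpy as np


def sample_lmoments(x):
    """First two sample L-moments and L-skewness from unbiased PWMs."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    i = np.arange(1, n + 1)
    b0 = x.mean()
    b1 = np.sum((i - 1) / (n - 1) * x) / n
    b2 = np.sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
    l1, l2, l3 = b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0
    return l1, l2, l3 / l2


def gev_from_lmoments(x):
    """GEV location xi, scale alpha, shape k (Hosking's approximation)."""
    l1, l2, t3 = sample_lmoments(x)
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c ** 2
    if abs(k) < 1e-6:  # Gumbel limit, avoids division by zero
        alpha = l2 / math.log(2.0)
        return l1 - 0.5772156649 * alpha, alpha, 0.0
    g = math.gamma(1.0 + k)
    alpha = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    xi = l1 - alpha * (1.0 - g) / k
    return xi, alpha, k


def gev_quantile(p, xi, alpha, k):
    """GEV quantile for non-exceedance probability p."""
    y = -math.log(p)
    if k == 0.0:
        return xi - alpha * math.log(y)
    return xi + alpha * (1.0 - y ** k) / k


def bootstrap_return_level(x, return_period, n_boot=1000, seed=0):
    """Return level with 10% and 90% bootstrap bounds and averaged % difference."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    p = 1.0 - 1.0 / return_period
    estimate = gev_quantile(p, *gev_from_lmoments(x))
    levels = np.empty(n_boot)
    for b in range(n_boot):
        resample = rng.choice(x, size=x.size, replace=True)  # with replacement
        levels[b] = gev_quantile(p, *gev_from_lmoments(resample))
    lower, upper = np.percentile(levels, [10, 90])
    # Mean of (estimate - lower)/estimate and (upper - estimate)/estimate, in percent
    pct = 100.0 * 0.5 * (upper - lower) / estimate
    return estimate, lower, upper, pct
```

The same resampling loop applies to the other candidate distributions once their fitting and quantile functions are substituted.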
4) Goodness of fit
As a second step, the adequacy for each probability distribution (i.e., goodness of fit) is assessed. Three goodness-of-fit tests (KS, AD, and χ2 tests) are applied in order to determine the distribution with the best fit—that is, the most adequate probability distribution (Corder and Foreman 2009; Su et al. 2009).
(i) Kolmogorov–Smirnov test
The KS test derives the distance between the empirical cumulative distribution function (ECDF) of the observed time series and the cumulative distribution function (CDF) of the candidate distribution (Corder and Foreman 2009; Su et al. 2009; Schönwiese 2006). The KS test statistic (Dn) for a given candidate cumulative distribution function [F(x)] is the largest vertical difference between F(x) and the ECDF [Fn(x)]. The equations for the KS test statistic (Dn) and the ECDF are defined in Table 2. Here, sup_x is the least upper bound of the set of distances and I(Xi ≤ x) is an indicator function, which is 1 if Xi ≤ x and 0 otherwise. If Dn is greater than the critical value (here 0.198) at the 0.05 significance level, the hypothesis on the distributional form is rejected.
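A short sketch of this check is given here, assuming SciPy's `kstest` and the large-sample approximation 1.358/√n for the 0.05 critical value, which reproduces the value 0.198 for n = 47. Whether the study used this approximation is an assumption; the function name `ks_check` is hypothetical.

```python
import numpy as np
from scipy.stats import kstest


def ks_check(sample, cdf, coefficient=1.358):
    """KS statistic, approximate 0.05 critical value, and rejection flag."""
    x = np.asarray(sample, dtype=float)
    d_n = kstest(x, cdf).statistic           # sup_x |Fn(x) - F(x)|
    critical = coefficient / np.sqrt(x.size)  # asymptotic value at alpha = 0.05
    return d_n, critical, d_n > critical
```

Here `cdf` is any callable returning the candidate CDF, for example a fitted distribution's `cdf` method with its parameters bound.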
Table 2. Functions of goodness-of-fit tests (KS, AD, and χ²).
(ii) Anderson–Darling test
Like the KS test, the AD test is used to compare an ECDF to the CDF of a candidate distribution. The AD statistic (A²) measures how well the data follow a particular distribution. For a given dataset and distribution, the better the distribution fits the data, the smaller this statistic will be. Compared with the KS test, the calculation gives more weight to the tails of the distribution (Corder and Foreman 2009; D’Agostino and Stephens 1986). For the equation described in Table 2, if A² is greater than the critical value (here 2.502) at the 0.05 significance level, the hypothesis on the distributional form is rejected. The critical value is approximated depending on the sample size only and not on the distribution.
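The AD statistic can be computed directly from the sorted sample. The sketch below uses the standard textbook form A² = −n − (1/n) Σ (2i − 1)[ln F(x_i) + ln(1 − F(x_{n+1−i}))], which is assumed to match the equation in Table 2; the clipping constant and the function name `anderson_darling` are illustrative choices.

```python
import numpy as np


def anderson_darling(sample, cdf):
    """Anderson-Darling A^2 of a sample against a fully specified CDF."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    # Clip to keep the logarithms finite when the CDF returns exactly 0 or 1
    f = np.clip(cdf(x), 1e-12, 1.0 - 1e-12)
    i = np.arange(1, n + 1)
    # f[::-1] pairs F(x_i) with F(x_{n+1-i})
    return -n - np.sum((2 * i - 1) * (np.log(f) + np.log(1.0 - f[::-1]))) / n
```

The resulting value would then be compared with the critical value quoted in the text (2.502 at the 0.05 level).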
(iii) χ² test
The χ² test (normally used for independence determination) is used here to determine if the empirical data can be fitted well to the candidate distribution. In general, this test is applied to binned data with a certain number of degrees of freedom (Corder and Foreman 2009; Schönwiese 2006). The calculation of the χ² statistic is shown in Table 2, where Oi is the observed frequency for bin i, and Ei is the expected frequency for bin i with F as the CDF of the candidate distribution, while x1 and x2 are the limits for bin i. Here, the test statistics have 3, 4, or 5 degrees of freedom. Hence, the critical values at the 0.05 significance level are 7.81, 9.49, or 11.07, respectively.
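The binned comparison can be sketched as follows, assuming SciPy's `chi2` for the critical value and taking the expected count of a bin as n[F(x2) − F(x1)]. How the bins and degrees of freedom were chosen in the study is not stated here, so both are left as inputs; the function name `chi_square_gof` is hypothetical.

```python
import numpy as np
from scipy.stats import chi2


def chi_square_gof(sample, cdf, bin_edges, dof, alpha=0.05):
    """Chi-square statistic for binned data, critical value, rejection flag."""
    x = np.asarray(sample, dtype=float)
    edges = np.asarray(bin_edges, dtype=float)  # assumed to span the sample
    observed, _ = np.histogram(x, bins=edges)   # O_i
    expected = x.size * np.diff(cdf(edges))     # E_i = n * (F(x2) - F(x1))
    statistic = np.sum((observed - expected) ** 2 / expected)
    critical = chi2.ppf(1.0 - alpha, dof)
    return statistic, critical, statistic > critical
```

For 3, 4, and 5 degrees of freedom, `chi2.ppf(0.95, dof)` gives the critical values 7.81, 9.49, and 11.07 quoted in the text.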
5) Return levels
As stated by Klein Tank et al. (2009), information on multidecadal time scales is particularly relevant for adaptation planning and typical thresholds of weather indexes for insurance products (Adger et al. 2007; Belete et al. 2007), because nearly all infrastructure design relies on assessment of probabilities of extremes with return periods of 20 years or more. To further allow comparisons with relevant findings in recent literature (Feng et al. 2007; Su et al. 2009; Yang et al. 2010), the return levels in this study are calculated for the 25- and 50-yr return periods (quantiles 0.96 and 0.98, respectively) for each indicator and meteorological station. Based on the test results, the best-fitting distribution function for each station is determined according to the highest scores and robustness of the three goodness-of-fit tests. Additionally, the averaged percentage differences of the 10% and 90% confidence bounds to the estimated 25- and 50-yr return levels are calculated for each distribution function and presented as a basin average.
The return levels of the 25- and 50-yr return period of both indicators are spatially interpolated using the inverse distance weighting (IDW) method. This method is chosen because of its common use in previous articles on the ZRB (Gemmer et al. 2011; Fischer et al. 2011, 2012) and other relevant literature (e.g., Su et al. 2009). Although IDW usually performs with less accuracy than the kriging method (Shi et al. 2007; Chen et al. 2011), it does not assign values outside the range of the given points as is done within kriging, and hence a potential secondary overestimation of the return levels can be avoided, while local differences are more accentuated (Anderson 2011). With IDW, the weighted averages of 12 neighboring stations are used to calculate a raster image that presents the spatial distribution of the estimated indicators. The weighting is based on the local influence of distant points (stations), which decreases with distance (Gemmer et al. 2004).
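A minimal sketch of IDW interpolation with 12 neighboring stations is given below. The power of 2, the use of planar Euclidean distances, and the function name `idw_interpolate` are assumptions; the text does not state the exponent or the coordinate system used.

```python
import numpy as np


def idw_interpolate(station_xy, station_values, query_xy, k=12, power=2.0):
    """Inverse distance weighted estimate from the k nearest stations."""
    station_xy = np.asarray(station_xy, dtype=float)
    station_values = np.asarray(station_values, dtype=float)
    query_xy = np.atleast_2d(np.asarray(query_xy, dtype=float))
    k = min(k, station_values.size)

    estimates = np.empty(len(query_xy))
    for j, q in enumerate(query_xy):
        distance = np.hypot(*(station_xy - q).T)
        nearest = np.argsort(distance)[:k]
        d = distance[nearest]
        if d[0] == 0.0:
            # Query point coincides with a station: return its value directly
            estimates[j] = station_values[nearest[0]]
            continue
        weights = 1.0 / d ** power  # influence decreases with distance
        estimates[j] = np.sum(weights * station_values[nearest]) / np.sum(weights)
    return estimates
```

Because each estimate is a weighted average with positive weights, it cannot fall outside the range of the contributing station values, which is the property contrasted with kriging in the text.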
3. Results
a. Indicators
The spatial distribution of RX1 and RX5 arithmetically averaged for the period 1961–2007 is displayed in Fig. 2. A very similar and obvious west-to-southeast disparity in the amounts of RX1 and RX5 can be found. The lowest average amounts of extreme precipitation (RX1 < 80 mm; RX5 < 120 mm) fall in the western parts, while the highest values (RX1 > 160 mm; RX5 > 240 mm) are recorded along the southeastern coast of the basin.
Fig. 2. Average annual (a) 1- and (b) 5-day-maximum precipitation (RX1 and RX5) in the ZRB, 1961–2007.
b. Association of indicators with flood events and economic data
The two basin-averaged annual indicators, the annual economic losses due to floods for the ZRB, and the overall disaster losses of Guangxi are presented in Fig. 3. Only four major flood events with reported economic losses are documented for the ZRB from 1978 to 2000, with the highest losses (above ¥7 billion, where ¥ is the symbol for Chinese yuan) in 1994. The highest values of the heavy rainfall indicators in 1994 coincide with higher than usual overall economic losses in Guangxi Province. Against the relatively steady trend in grain production of Guangdong and Guangxi (Fig. 4), pronounced dips in 1988, 1994, and 2002 can be distinguished, with decreases of more than 0.5 ton ha⁻¹. The loss event in 1994 can be found in RX1 and RX5 (Fig. 5), while the loss event in 1988 is not well represented in RX1 and RX5. In Fig. 5, the RX5 values of 1994 are visualized for all stations. The affected regions can be clearly distinguished in the northern and southern parts of the basin. The highest rainfall amounts in 1994 were detected in these areas, where similarly high losses might be assumed.
Fig. 3. Extreme precipitation [RX1 (green line) and RX5 (orange line)], economic flood losses in the ZRB (red column), and overall disaster losses in Guangxi Province (blue columns).
Fig. 4. Agricultural grain production (t/ha) of Guangdong and Guangxi Provinces of China, 1963–2007.
Fig. 5. Maximum consecutive 5-day precipitation (RX5) in 1994 in the ZRB.
c. Distribution functions
As a first step of the comparative frequency analysis, the parameters for the candidate distributions (GA3, GEV, GPA, and WAK) are estimated for each time series of the two indicators at 181 stations. For each station and indicator, the four distribution functions are calculated and integrated into a database. Based on this, the probability density curves, the cumulative probability, and the probability difference (P–P plot) can be statistically interpreted and visualized for every single station and indicator. The averaged percentage differences of the 10% and 90% confidence bounds to the 25- and 50-yr return levels for each candidate distribution are presented as basin averages in Table 3. GEV and GPA show smaller differences than GA3 and WAK. This indicates that the parameter estimation with GEV and GPA is slightly less affected by sample errors in annual precipitation extremes.
Table 3. Averaged percentage differences of the 10% and 90% confidence bounds to the 25- and 50-yr return levels for each of the four candidate distribution functions (GA3, GEV, GPA, and WAK) for two extreme precipitation indicators (RX1 and RX5) at 181 stations in the ZRB, 1961–2007.
As an example, the estimated parameters of the four candidate distributions at Xuwen station are shown in Table 4. The distribution functions are drawn according to their parameters (Fig. 6). In Fig. 6a the empirical probability density is shown in comparison to all four candidate distributions. The probability density of RX5 is shown as columns with a left-sided distribution. The highest probability density is above 0.10 (at around 190 and 250 mm), while at the right tail end the two most extreme events in the 47 years of record can be distinguished. The estimated probability curves of GA3 and GEV (red and blue curves) show a right-sided bell curve with relatively similar shape, scale, and location. Compared to GA3 and GEV, the curve of WAK (purple curve) shows a narrower shape, a higher scale, and a more right-shifted location, while the GPA (green curve) probability curve simply slopes downward from the left (P > 0.09 at 120 mm) to the right.
Table 4. Estimated parameters of four candidate distribution functions (GA3, GEV, GPA, and WAK) for annual maximum consecutive 5-day precipitation (RX5) at Xuwen station, 1961–2007.
Fig. 6. (a) Probability density, (b) cumulative probability, and (c) P–P plot of annual 5-day-maximum precipitation (RX5) and four candidate distributions (GA3, GEV, GPA, and WAK) at Xuwen station, 1961–2007. In (a) and (b) the x axis shows the precipitation (mm) and the y axis the probability [f(x)]; in (c) the x axis represents the empirical probability [f(RX5)] and the y axis the candidate probabilities [f(x)].
In Fig. 6b, the S-shaped GA3, GEV, and WAK curves are relatively close to the (stepwise) empirical cumulative probability curve. The GPA curve is simpler in its shape but also follows the empirical curve quite closely. To determine the actual difference of the candidate probability distribution to the empirical probability distribution, the P–P plot (Fig. 6c) allows an easy interpretation. The closer the candidate’s probability is to the diagonal line from (0, 0) to (1, 1), the better it fits the empirical probability. In Fig. 6c, the GPA line shows the greatest distance from the diagonal line, while the other three show a very similar behavior and are closer to the empirical probability.
d. Goodness of fit
In the present study, goodness-of-fit tests (KS, AD, and χ²) are applied to all four candidate distributions of the two indicators at 181 stations. The statistics are calculated, tested for significance at the 0.05 significance level, and ranked based on a direct comparison of all four candidate distributions (the lowest value ranks first). As an example, the test statistics for the four candidate distributions and for each goodness-of-fit test, as well as the corresponding return levels and the averaged percentage differences to the 10% and 90% confidence bounds, are presented in Table 5 for RX5 at Xuwen station.
Table 5. Goodness-of-fit test results (KS, AD, and χ²) for four candidate distribution functions (GA3, GEV, GPA, and WAK), and estimated return levels for 25 and 50 years [F(0.96) and F(0.98)], including the averaged difference to the 10% and 90% confidence bounds (percentage difference in brackets) of RX5 at Xuwen station, 1961–2007.
In Table 6, the number of stations for which the calculation of the test statistics was successful (i.e., availability) is summarized. Some statistics are not available as some parameters of the candidate distribution could not be estimated or the hypotheses of the tests were rejected. The number of significant stations (α = 0.05) is also presented in Table 6. Table 7 includes the number of stations where the candidate distribution is ranked first in a direct comparison of all four candidate distributions. To validate the reliability and robustness of the candidate distributions from a different perspective, the numbers of stations at which the statistics are below a certain threshold (D < 0.09, A² < 0.50, and χ² < 4.00) are also shown in Table 7.
Table 6. Availability and significance of stations according to the goodness-of-fit test results (KS, AD, and χ²) for four candidate distribution functions (GA3, GEV, GPA, and WAK) of two extreme precipitation indicators (RX1 and RX5) at 181 stations in the ZRB, 1961–2007.
Table 7. First rank and number of stations under a certain threshold according to the goodness-of-fit test results (KS, AD, and χ²) for four candidate distribution functions (GA3, GEV, GPA, and WAK) of two extreme precipitation indicators (RX1 and RX5) at 181 stations in the ZRB, 1961–2007.
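The tallies of first ranks and of statistics below a threshold can be sketched as follows. The array layout (one row per station, one column per candidate distribution, NaN where a statistic is unavailable or not significant) and the function name `tally_ranks` are assumptions made for illustration, not the structure of the database used in the study.

```python
import numpy as np


def tally_ranks(stats, names, threshold):
    """Count first ranks (lowest statistic) and values below a threshold."""
    stats = np.asarray(stats, dtype=float)  # shape: (n_stations, n_distributions)
    first = {name: 0 for name in names}
    for row in stats:
        if np.all(np.isnan(row)):
            continue  # no candidate available at this station
        # Lowest statistic ranks first; ties go to the earlier column
        first[names[int(np.nanargmin(row))]] += 1
    below = {
        name: int(np.sum(stats[:, j] < threshold))  # NaN never counts as below
        for j, name in enumerate(names)
    }
    return first, below
```

Calling this once per test and indicator, with the thresholds D < 0.09, A² < 0.50, and χ² < 4.00, would yield counts of the kind summarized in Table 7.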
1) Kolmogorov–Smirnov test results
As can be seen in Table 6, the results of the KS test of the observed time series show that the calculation of the distributions’ test statistics (D) was successful for most of the stations and all available statistics are significant at the 0.05 significance level (critical value here 0.198 with n = 47). For RX1 (RX5), the WAK statistics are unavailable at only 5 (4) stations.
In Table 7, the KS test statistics for each candidate CDF and for each station are ranked according to their largest pointwise distance to the empirical CDF. The candidate distribution with the shortest distance is ranked first (rank 1). WAK was ranked first at 118 (126) of 181 rankings, followed by GEV with 43 (31) of 181. The number of stations [RX1 (RX5)] below the threshold of D < 0.09 (taken from Su et al. 2009) is highest for WAK [174 (168)], followed by GEV [168 (153)] and GA3 [127 (128)].
2) Anderson–Darling test results
The calculation of the AD test statistics (A2) for RX1 (RX5) was successful at most of the stations, but not all statistics are significant at the 0.05 confidence level (here 2.502 with n = 47). At 5 (4) stations the WAK statistics are not available. All statistics are significant for GEV. No significance was found at 7 (4) stations for GA3, at 49 (51) stations for WAK, and at 175 (177) stations for GPA (Table 6). The significant candidate distributions are ranked according to their lowest distance to the weighted squared empirical distribution statistic (Table 7). WAK was ranked first at 92 (89) of 181 rankings, followed by GEV with 61 (68) of 181 stations. The number of stations below the chosen threshold of A2 < 0.50 are highest for GEV [171 (167)], followed by GA3 [141 (148)] and WAK [124 (114)], and very low for GPA [6 (4)].
3) χ2 test results
The availability of χ2 test statistics differs more strongly between the candidate distributions than that of the KS and AD test statistics (cf. Table 6). For RX1 (RX5), statistics are available for GEV at all stations (181), for GA3 at 174 (177), for WAK at 135 (123), and for GPA at 6 (4) stations. Not all available statistics are significant at the 0.05 significance level (critical values here 7.81, 9.49, or 11.07, depending on the degrees of freedom). In Table 7, the significant candidate distributions are ranked based on their distance to the binned empirical probability distribution. For RX1, GEV and WAK ranked first for a similar number of stations (>70), while for RX5 a relatively even distribution of first ranks between GA3, GEV, and WAK can be found. GPA ranks first only two (one) times. For RX1 (RX5), the number of stations below the chosen threshold of χ2 < 4.00 is highest for GEV [135 (132)], followed by GA3 [114 (125)] and WAK [102 (99)], and very low for GPA [5 (3)].
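A minimal sketch of a binned χ2 test follows. It assumes equiprobable bins under the candidate distribution and degrees of freedom equal to the number of bins minus one minus the number of estimated parameters; the binning actually used in the study is not stated here, so both choices are assumptions. The critical values 7.81, 9.49, and 11.07 are the standard upper 5% points for 3, 4, and 5 degrees of freedom.

```python
# Upper 5% points of the chi-square distribution (standard table values)
CHI2_CRITICAL_05 = {3: 7.81, 4: 9.49, 5: 11.07}

def chi_square_statistic(sample, cdf, n_bins):
    """Chi-square statistic with n_bins equiprobable bins under the candidate CDF."""
    n = len(sample)
    counts = [0] * n_bins
    for x in sample:
        u = cdf(x)
        # map the CDF value to its bin; u == 1.0 goes into the last bin
        counts[min(int(u * n_bins), n_bins - 1)] += 1
    expected = n / n_bins
    return sum((c - expected) ** 2 / expected for c in counts)

def passes_test(stat, n_bins, n_params):
    """True if the fit is not rejected at the 0.05 level (assumed dof rule)."""
    dof = n_bins - 1 - n_params
    return stat < CHI2_CRITICAL_05[dof]
```

With 47 annual values, seven bins and a three-parameter distribution such as GEV would give 3 degrees of freedom and thus the critical value 7.81.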
4) Combined test results
Following the goodness-of-fit tests’ results, the adequacy of each candidate probability distribution can be determined. The overall outcome shows similar results for RX1 and RX5 (Table 6 and Table 7). Comparing the three tests, only GEV achieves successful calculation of test statistics (availability) for all stations. Availability of GA3 statistics is relatively high, while for WAK and GPA fewer results are found for χ2. The significance results follow a similar pattern. For RX1 (RX5), all test statistics for GEV are significant, except for 3 (1) times when χ2 was applied. GA3 also shows a high number of significant results. In the AD testing, WAK is nonsignificant for roughly one-third of the available statistics, while GPA is nonsignificant for almost all statistics (Table 6).
Among the four candidate distributions, the five-parameter WAK ranks first most often for both indicators across the 181 stations. Comparing the three-parameter distribution functions, GEV ranks first more often than GA3 and GPA. Analyzing the test statistics below a certain threshold (D < 0.09, A2 < 0.50, and χ2 < 4.00), the results reveal that the statistics of WAK and GEV most often show relatively low values, followed by GA3, while the statistics of GPA generally show the highest values (Table 7). Low values of available statistics indicate a good reliability and robustness of the candidate distribution.
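The bookkeeping behind such a summary (first ranks and counts below a threshold) can be sketched as follows. The station names and statistic values in the usage example are invented purely for illustration and are not data from the study; `None` marks an unavailable statistic.

```python
def summarize(stats_by_station, threshold):
    """Count first ranks and below-threshold statistics per distribution.

    stats_by_station maps a station name to {distribution: statistic or None}.
    """
    first_ranks = {}
    below = {}
    for stats in stats_by_station.values():
        available = {d: v for d, v in stats.items() if v is not None}
        if not available:
            continue
        # the smallest statistic (shortest distance) is ranked first
        best = min(available, key=available.get)
        first_ranks[best] = first_ranks.get(best, 0) + 1
        for dist, value in available.items():
            if value < threshold:
                below[dist] = below.get(dist, 0) + 1
    return first_ranks, below

# Hypothetical KS statistics for three made-up stations
toy = {
    "station_A": {"GA3": 0.10, "GEV": 0.07, "GPA": 0.15, "WAK": 0.06},
    "station_B": {"GA3": 0.08, "GEV": 0.05, "GPA": 0.12, "WAK": None},
    "station_C": {"GA3": 0.11, "GEV": 0.08, "GPA": 0.20, "WAK": 0.07},
}
first_ranks, below = summarize(toy, threshold=0.09)
```

In this toy input, the distribution with an unavailable statistic at a station simply drops out of that station's ranking, which mirrors how availability and ranking are reported separately in Tables 6 and 7.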
5) Spatial distribution of test results
To spatially analyze the test results, the first-ranking candidate distributions were visualized for each goodness-of-fit test on the station area level for RX1 and RX5 (Fig. 7). No spatial pattern or relationship of the first-ranking candidate distributions can be recognized for RX1 or RX5. The results illustrated in Fig. 7 indicate that the geographic location of stations does not play an important role in the estimation and calculation of reliable distribution functions.
Spatial distribution of (a),(c),(e) first-ranking candidate distributions (GA3, GEV, GPA, and WAK) of RX1 and (b),(d),(f) RX5 for (a),(b) the KS, (c),(d) AD, and (e),(f) χ2 tests (1961–2007).
Citation: Journal of Hydrometeorology 13, 3; 10.1175/JHM-D-11-041.1
e. Return levels
1) Calculated results
For both indicators, all four candidate distribution functions are used to estimate return levels for the 25- and 50-yr return periods in the ZRB (1961–2007). Taking all results of both indicators at 181 stations into account, the estimated return levels increase from GPA (lowest return levels) through GA3 and GEV to WAK (highest return levels).
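For a fitted distribution, the T-yr return level is the quantile with nonexceedance probability 1 − 1/T, that is, 0.96 for the 25-yr and 0.98 for the 50-yr return period. A minimal sketch for the GEV case is given below; the function name is hypothetical, the parameter values in the usage lines are placeholders rather than fitted values from the study, and the shape parameter ξ follows the convention in which positive values mean a heavy upper tail.

```python
import math

def gev_return_level(T, loc, scale, shape):
    """Return level for return period T (years) from GEV parameters."""
    p = 1.0 - 1.0 / T          # nonexceedance probability, e.g. 0.96 for T = 25
    y = -math.log(p)
    if abs(shape) < 1e-12:     # Gumbel limit
        return loc - scale * math.log(y)
    return loc + scale / shape * (y ** (-shape) - 1.0)

# Placeholder parameters (mm), for illustration only
x25 = gev_return_level(25, loc=200.0, scale=60.0, shape=0.1)
x50 = gev_return_level(50, loc=200.0, scale=60.0, shape=0.1)
```

The other candidate distributions would be handled in the same way by inverting their respective CDFs at 0.96 and 0.98.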
Figure 8 shows the right (upper) tail section of the aforementioned curves/lines for RX5 at Xuwen station. Similar curves can be detected for GA3, GEV, and WAK, and a different path for GPA. As can be seen in Table 4, the test statistics of GA3, GEV, and WAK have similarly low values (except GA3 with χ2), while values of GPA are higher (KS), not significant (AD), or even not available (χ2). The return levels are estimated higher for WAK and GEV. Hence, considering the ranking results, the return levels of GEV and WAK should be assumed as fitting best for RX5 at Xuwen station. The return levels of GA3 and especially GPA underestimate the 25- and 50-yr return levels.
As in Fig. 6, but for the right (upper) tail section with precipitation above 380 mm or from f(x) > 0.90. The (a) blue columns and (b),(c) the light-gray lines signify the empirical distribution. In (b) and (c) the dashed dark-gray lines indicate the 25- and 50-yr return periods [f(x) = 0.96/f(x) = 0.98].
A station-based calculation and visualization of the temporal distribution of extreme precipitation amounts, including the return level thresholds, can be used to easily identify the years when, for example, 25- and 50-yr events occurred. This is shown in Fig. 9 for RX5 at Xuwen station, where the 50-yr return level was reached once (in 1990) and the 25-yr return level was crossed twice (in 1990 and 2007). These one- and two-time occurrences are consistent with the sample size of 47 years (1961–2007), for which on average 47/50 ≈ 0.9 and 47/25 ≈ 1.9 exceedances are expected. Based on such temporal findings, further analyses (e.g., on local extreme events) can be realized.
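The plausibility of one 50-yr and two 25-yr events in a 47-yr record can be checked with a short calculation. The sketch assumes independent years and a stationary climate, which is a simplification; the function names are illustrative.

```python
def expected_exceedances(n_years, T):
    """Expected number of years exceeding the T-yr return level."""
    return n_years / T

def prob_at_least_one(n_years, T):
    """Probability of at least one exceedance of the T-yr level in n_years."""
    return 1.0 - (1.0 - 1.0 / T) ** n_years

n = 47  # length of the record 1961-2007
e50, e25 = expected_exceedances(n, 50), expected_exceedances(n, 25)
p50, p25 = prob_at_least_one(n, 50), prob_at_least_one(n, 25)
```

Under these assumptions, about 0.94 exceedances of the 50-yr level and 1.88 of the 25-yr level are expected, and the chance of observing at least one such event is roughly 0.61 and 0.85, respectively.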
RX5 (columns) at Xuwen station, 1961–2007. The dotted (dashed) line indicates the 25 (50)-yr return level at 412 mm (455 mm) as arithmetic averages of all four candidate distributions (GA3, GEV, GPA, and WAK), while the full gray line signifies the arithmetic mean 5-day-maximum precipitation.
2) Spatial distribution of return levels
Mapping the spatial distribution of return levels provides a simple (supra)regional overview for decision makers concerned with weather extremes and risks. The return levels in this paper were calculated with each candidate distribution function and interpolated and mapped for both extreme indicators in the ZRB (not shown). Comparing the maps, no distinct spatial anomalies are apparent because of the large range in return levels for the entire basin and relatively small differences between the results of each candidate distribution. Hence, we only visualize the return levels of RX1 and RX5 for the 25- and 50-yr return periods based on GEV (Fig. 10), as the GEV distribution function can be considered to best fit the time series of both indicators because of the good availability and high robustness of its test statistics.
Return levels of the (a),(b) 25- and (c),(d) 50-yr return periods for (a),(c) RX1 and (b),(d) RX5 in the ZRB, 1961–2007. The dashed lines indicate the 25-/50-mm interval (RX1/RX5).
For both indicators, the spatial distributions of the return levels for the 25- and 50-yr return periods show similar characteristics (Fig. 10) and also closely resemble those of the averaged annual amounts (cf. Fig. 2). In general, the 25- and 50-yr return levels for RX1 and RX5 increase from west to southeast. For RX5, the 25 (50)-yr return levels range from less than 200 mm to more than 500 mm (Figs. 10b,d). The highest 25 (50)-yr return levels, with a maximum of 796 mm (942 mm), are reached along the southeast coast and in a small area in the central north. The lowest return levels, with a minimum of 137 mm (143 mm), are found in the western mountainous region. The basin-averaged 25 (50)-yr return levels are 200 mm (228 mm) for RX1 and 329 mm (379 mm) for RX5.
4. Discussion and conclusions
a. Distribution functions
The WMO recommends the use of GEV in the analysis of climate extremes (Klein Tank et al. 2009). In this study, four commonly used distribution functions were applied in order to investigate whether the WMO’s recommendation can be confirmed for a humid area in south China. Different goodness-of-fit tests were used for analyzing the reliability and robustness of each distribution function. GEV was successfully used (successful parameter estimation or proven test assumption) with all three tests for all stations but one. GA3 was available with all three tests for most of the time series, while GPA and WAK showed especially low availability with χ2. Most test statistics were significant for GEV and GA3, while WAK and GPA showed fewer significant test statistics with AD. In direct comparison, WAK ranked first with each test, followed by GEV and GA3.
It is noted that the test results (in significance and ranking) of the three goodness-of-fit tests often differ for the same time series. As each test applies a different criterion, the tests should be weighted according to the focus of the subsequent analysis. In the case of climate extremes, the tails of the distributions are of main concern (for extreme return periods); hence, in this study, the AD test is weighted higher than KS and χ2, as it puts more emphasis on the tails of a distribution. Based on the results of the KS, AD, and χ2 tests, we determine that the GEV distribution fits best for both extreme precipitation indicators at 181 meteorological stations in the ZRB.
In the case of single-station analysis, WAK often fits better but is less often available than GEV and shows a higher uncertainty in parameter estimation. The results of this investigation support the hypothesis that GEV is the overall best-fitting distribution for climate extremes, as recommended by the WMO. GEV shows a high availability and high significance for most test statistics regionwide and a relatively low uncertainty in parameter estimation due to sampling errors in annual precipitation extremes. It should be noted that the applied statistical distributions can reasonably fit the observed time series of extreme precipitation data, but differences in the estimated quantiles may be apparent, since estimation of higher extreme quantiles is based on the upper tail of the probability distribution. For the development of a theoretical weather index–based insurance, the recommended distribution depends on the spatial scale of the area to be covered. If the insurance covers only a small area with few stations, the best-fitting distribution at those stations should be considered. For larger areas, the use of GEV should be considered.
Our findings are in line with those of Yang et al. (2010), who concluded that the generalized logistic (GLO), GEV, generalized normal (GNO), and Pearson type III (PE3) distributions fit best to estimate return periods for indicated precipitation events at 42 stations in the ZRB based on the Z-distance goodness-of-fit test. The determination of the best-fitting distribution was not the focus of their paper; hence, a more detailed approach including three tests and with much higher resolution (192 stations instead of 42 stations) is presented in our study. Su et al. (2009) concluded that the Wakeby distribution is the best-fitting distribution for precipitation maxima in the Yangtze River basin (north of the study region), which is partially in line with the current study. These results would have similarly favored WAK as the best-fitting distribution in our study. As our study focused more on the identification of the best-fitting distribution, the differences might be the result of the application of different goodness-of-fit tests, station densities, and even different time series data sources. Thus, applying a variety of goodness-of-fit tests, as done in the current study, allows a more precise analysis with distinct findings.
b. Frequencies
Based on the results, a spatial analysis of the frequency of annual precipitation extremes in the ZRB is presented. The GEV was used to identify return levels of both precipitation extremes for different return periods. Data on different return periods are displayed separately. This paper delivers practical maps of the return levels for given return periods. Distinct patterns in the spatial distribution of return levels for the 25- and 50-yr return periods of each indicator were detected. A west to southeast disparity (low to high levels) for both 1- and 5-day-maximum precipitation is apparent. The highest return levels for both indicators can be found in the southeast of the basin (e.g., delta region) and in the central northern part. Low return levels for 1- and 5-day-maximum precipitation are estimated for the western and southwestern area.
The spatial distribution of the 50-yr return level of 1- and 5-day-maximum precipitation (RX1 and RX5) shows a disparity similar to that of the 100-yr return levels of annual maximum 1-day and 5-day rainfall (AM1R and AM5R) by Yang et al. (2010), with the highest levels located at the southeast coast and in the central–northern region. The spatial distribution of RX1 in the northwestern part is also relatively similar to the findings on summer rainfall in Guizhou Province by Yin et al. (2009) and on the precipitation maxima of the Yangtze River basin by Su et al. (2009).
c. Final conclusions
In conclusion, the identified best-fitting distribution function (i.e., GEV) and the estimated return levels for the Zhujiang River basin can be adopted in the planning of weather index–based crop insurance or rainstorm control measures and for the production of practical maps that correspond to typical return periods in other sectoral planning (e.g., the 50-yr return period in flood management). Research on return periods of extreme precipitation may assist the potential development of a weather index–based crop insurance. In future studies, projected changes of return levels should also be addressed to give estimates of future frequencies of precipitation extremes and flood/drought events.
Acknowledgments
This study was supported by the National Basic Research Program of China (973 Program, 2010CB428401), the Special Fund of Climate Change of the China Meteorological Administration (Fund CCSF-09-16), and the National Natural Science Foundation of China (Fund 40910177). Many thanks go to Dr. Marco Gemmer, who continuously provided comments and suggestions to this study, and to the anonymous reviewers. The positions of Thomas Fischer and Marco Gemmer at the National Climate Center are supported by the German Development Cooperation through the Center for International Migration and Development (www.cimonline.de).
REFERENCES
Adger, W. N., and Coauthors, 2007: Assessment of adaptation practices, options, constraints and capacity. Climate Change 2007: Impacts, Adaptation and Vulnerability, M. L. Parry et al., Eds., Cambridge University Press, 717–743.
Anderson, S., cited 2011: An evaluation of spatial interpolation methods on air temperature in Phoenix, AZ. [Available online at http://www.cobblestoneconcepts.com/ucgis2summer/anderson/anderson.htm.]
Belete, N., and Coauthors, 2007: China: Innovations in agricultural insurance—Promoting access to agricultural insurance for small farmers. Sustainable Development, East Asia & Pacific Region, Finance and Private Sector Development, The World Bank Rep., 108 pp.
Boyd, M., Pai J., Qiao Z., and Ke W., 2011: Crop insurance principles and risk implications for China. Hum. Ecol. Risk Assess., 17, 554–565.
Buishand, T. A., 1982: Some methods for testing the homogeneity of rainfall records. J. Hydrol., 58, 11–27.
Chen, G. X., Wu G. P., Chen L., He L. Y., and Jiang C., 2011: Surface modelling of annual precipitation in the DongJiang River basin, China. Proc. 19th Int. Conf. on Geoinformatics, Shanghai, China, IEEE, 412–415.
Corder, G. W., and Foreman D. I., 2009: Nonparametric Statistics for Non-Statisticians: A Step-by-Step Approach. John Wiley and Sons, 264 pp.
D’Agostino, R. B., and Stephens M. A., 1986: Goodness-of-Fit Techniques. Statistics: A Series of Textbooks and Monographs, Vol. 68, Marcel Dekker, 576 pp.
Davison, A. C., and Hinkley D. V., 1997: Bootstrap Methods and Their Application. Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, 592 pp.
Ding, Y., and Coauthors, 2007: China’s National Assessment Report on Climate Change (I): Climate change in China and the future trend. Adv. Climate Change Res., 3 (Suppl.), 1–5.
Duchesne, P., 2006: Testing for multivariate autoregressive conditional heteroskedasticity using wavelets. J. Comput. Stat. Data Anal., 51, 2142–2163.
Feng, S., Nadarajah S., and Hu Q., 2007: Modeling annual extreme precipitation in China using the generalized extreme value distribution. J. Meteor. Soc. Japan, 85, 599–613.
Fischer, T., Gemmer M., Liu L., and Su B., 2011: Temperature and precipitation trends and dryness/wetness pattern in the Zhujiang River Basin, South China, 1961–2007. Quat. Int., 244, 138–148.
Fischer, T., Gemmer M., Liu L., and Jiang T., 2012: Change-points in climate extremes in the Zhujiang River Basin, South China, 1961–2007. Climatic Change, 110, 783–799.
Gao, C., Gemmer M., Zeng X., Bo L., Su B., and Wen Y., 2010: Projected streamflow in the Huaihe River Basin (2010–2100) using artificial neural network. Stochastic Environ. Res. Risk Assess., 24, 685–697.
Gemmer, M., Becker S., and Jiang T., 2004: Observed monthly precipitation trends in China 1951–2002. Theor. Appl. Climatol., 77, 39–45.
Gemmer, M., Fischer T., Jiang T., Su B., and Liu L., 2011: Trends of precipitation extremes in the Zhujiang River basin, South China. J. Climate, 24, 750–761.
Groisman, P.Ya., and Coauthors, 1999: Changes in the probability of heavy precipitation: Important indicators of climatic change. Climatic Change, 42, 243–283.
Hamed, K., and Rao A. R., 1999: Flood Frequency Analysis. New Directions in Civil Engineering, CRC Press, 376 pp.
Hazell, P., Anderson J., Balzer N., Hastrup Clemmensen A., Hess U., and Rispoli F., 2010: The potential for scale and sustainability in weather index insurance for agriculture and rural livelihoods. International Fund for Agricultural Development (IFAD) and World Food Programme (WFP) Publ., 153 pp.
Hosking, J. R. M., and Wallis J. R., 1997: Regional Frequency Analysis: An Approach Based on L-Moments. Cambridge University Press, 242 pp.
Jiang, T., Vaucel L., Gemmer M., Fischer T., Su B., Cao L., and Li X., 2010: Weather index-based insurance in China: The challenges of dealing with data. Extended Abstracts, Sixth Int. Microinsurance Conf., Manila, Philippines, Microinsurance Network and Munich Re Foundation, 8 pp.
Jiang, Z., Ding Y., Zhu L. F., Zhang L., and Zhu L. H., 2009: Extreme precipitation experimentation over eastern China based on Generalized Pareto Distribution (in Chinese). Plateau Meteor., 28, 573–580.
Kao, S.-C., and Ganguly A. R., 2011: Intensity, duration, and frequency of precipitation extremes under 21st-century warming scenarios. J. Geophys. Res., 116, D16119, doi:10.1029/2010JD015529.
Kharin, V. V., Zwiers F. W., Zhang X., and Hegerl G. C., 2007: Temperature and precipitation extremes in the IPCC ensemble of global coupled model simulations. J. Climate, 20, 1419–1444.
Klein Tank, A. M. G., Zwiers F., and Zhang X., 2009: Guidelines on analysis of extremes in a changing climate in support of informed decisions for adaptation. World Meteorological Organization Tech. Doc. WMO-TD 1500, WCDMP-No. 72, 52 pp.
Lehner, B., Döll P., Alcamo J., Henrichs T., and Kaspar F., 2006: Estimating the impact of global change on flood and drought risks in Europe: A continental, integrated analysis. Climatic Change, 75, 273–299.
Li, Z., Zheng F., Liu W., and Flanagan D., 2010: Spatial distribution and temporal trends of extreme temperature and precipitation events on the Loess Plateau of China during 1961–2007. Quat. Int., 226, 92–100.
Nadarajah, S., and Choi D., 2007: Maximum daily rainfall in South Korea. J. Earth Syst. Sci., 116, 311–320.
Öztekin, T., 2007: Wakeby distribution for representing annual extreme and partial duration rainfall series. Meteor. Appl., 14, 381–387.
Palutikof, J. P., Brabson B. B., Lister D. H., and Adcock S. T., 1999: A review of methods to calculate extreme wind speeds. Meteor. Appl., 6, 119–132.
Park, J., Jung H., Kim R., and Oh J., 2001: Modelling summer extreme rainfall over the Korean Peninsula using wakeby distribution. Int. J. Climatol., 21, 1371–1384.
Parry, M., Evans A., Rosegrant M. W., and Wheeler T., 2009: Climate change and hunger: Responding to the challenge. World Food Programme Publ., 104 pp.
Petrow, T., and Merz B., 2009: Trends in flood magnitude, frequency and seasonality in Germany in the period 1951–2002. J. Hydrol., 371, 129–141.
Qian, W., and Lin X., 2005: Regional trends in recent temperature indices in China. Meteor. Atmos. Phys., 90, 193–207.
Schönwiese, C.-D., 2006: Praktische Statistik für Meteorologen und Geowissenschaftler. 4th ed. Bornträger, 302 pp.
Semmler, T., and Jacob D. T., 2004: Modeling extreme precipitation events—A climate change simulation for Europe. Global Planet. Change, 44, 119–127.
Shi, Y. F., Li L., and Zhang L. L., 2007: Application and comparing of IDW and Kriging interpolation in spatial rainfall information. Geoinformatics 2007: Geospatial Information Science, J. Chen and Y. Pu, Eds., International Society for Optical Engineering (SPIE Proceedings, Vol. 6753), 67531I, doi:10.1117/12.761859.
Skees, J. R., 2007: Challenges for use of index-based weather insurance in lower income countries. University of Kentucky Agricultural Experiment Station Number 07-04-091, 24 pp.
Solomon, S., Qin D., Manning M., Marquis M., Averyt K., Tignor M. M. B., Miller H. L. Jr., and Chen Z., Eds., 2007: Climate Change 2007: The Physical Science Basis. Cambridge University Press, 996 pp.
Su, B., Kundzewicz Z., and Jiang T., 2009: Simulation of extreme precipitation over the Yangtze River Basin using Wakeby distribution. Theor. Appl. Climatol., 96, 209–219.
Trenberth, K. E., and Coauthors, 2007: Observations: Surface and atmospheric climate change. Climate Change 2007: The Physical Science Basis, S. Solomon et al., Eds., Cambridge University Press, 235–336.
Turvey, C. G., and Kong R., 2010: Weather risk and the viability of weather insurance in China’s Gansu, Shaanxi, and Henan provinces. China Agric. Econ. Rev., 2, 5–24.
Vovoras, D., and Tsokos C. P., 2009: Statistical analysis and modeling of precipitation data. Nonlinear Anal., 71, 1169–1177.
Wang, W., Chen X., Shi P., and van Gelder P., 2008: Detecting changes in extreme precipitation and extreme streamflow in the Dongjiang River Basin in southern China. Hydrol. Earth Syst. Sci., 12, 207–221.
Yang, T., Shao Q., Hao Z.-C., Chen X., Zhang Z., Xu C.-Y., and Sun L., 2010: Regional frequency analysis and spatio-temporal pattern characterization of rainfall extremes in the Pearl River Basin, China. J. Hydrol., 380 (3–4), 386–405.
Yin, Z., Cai Y., Zhao X., and Chen X., 2009: An analysis of the spatial pattern of summer persistent moderate-to-heavy rainfall regime in Guizhou Province of Southwest China and the control factors. Theor. Appl. Climatol., 97, 205–218.
Zhai, J., Su B., Krysanova V., Vetter T., Gao C., and Jiang T., 2010: Spatial variation and trends in PDSI and SPI indices and their relation to streamflow in 10 large regions of China. J. Climate, 23, 649–663.
Zhai, P., Sun A., Ren F., Liu X., Gao B., and Zhang Q., 1999: Changes of climate extremes in China. Climatic Change, 42, 203–218.
Zhai, P., Zhang X., Wan H., and Pan X., 2005: Trends in total precipitation and frequency of daily precipitation extremes over China. J. Climate, 18, 1096–1108.