Pluviographic measurement results from the Institute of Meteorology and Water Management (IMGW) Wrocław–Strachowice meteorological station from the years 1960–2009 constitute the basis for this paper. In the statistical analysis of precipitation occurrence frequency, a criterion of interval precipitation amounts was adopted to isolate intensive rainfalls from the pluviograms, which made it possible to select a number of the most intensive rainfalls in each year. A total of 514 synthetic rainfall instances were isolated and then arranged according to non-increasing amounts in 16 duration intervals. On this basis, a unified methodology is proposed for developing probabilistic models of maximum precipitation amounts that are reliable for the design and verification of drain flow capacity, especially for low probabilities of precipitation occurrence. Maximum precipitation models were developed for Wrocław (average annual precipitation H = 590 mm).
1. Introduction
Extreme natural phenomena that have intensified over recent decades, such as sudden or long-lasting rainfalls, often accompanied by floods or outflows from sewerage systems, cause significant economic losses. This should compel continuous improvement of the rules of sewerage dimensioning, based on continuous precipitation measurements, in order to identify possible patterns of climate change. Modern investigation methods used in hydrology (including precipitation monitoring), combined with knowledge of statistics, probability calculus, and mathematical modeling, are becoming necessary tools in engineering practice.
The design of storm water or combined sewerage systems with facilities such as storm overflows, separators, storage reservoirs, or wastewater treatment plants encounters a primary difficulty in Poland: there is no reliable, authoritative method for determining rainfall intensity for dimensioning or verifying sewerage flow capacity. Namely, the precipitation model by Błaszczyk (1954), recommended for dimensioning of drains and system facilities in Poland, yields calculated rain streams that significantly underestimate measurement results, which was shown in a number of comparative analyses (Kotowski 2009; Kotowski and Kaźmierczak 2009; Kotowski et al. 2010). The model developed by Błaszczyk is based on a statistical analysis of only 79 intense rainfalls (average height h in millimeters for the duration t in minutes: h > t^0.5) registered in Warsaw in the years 1837–91 and 1914–25. The application of a model developed on the basis of precipitation measured about 100 yr ago, especially in the context of recently observed climate anomalies (De Toffol et al. 2009; Larsen et al. 2009; Leonard et al. 2008; Olsson et al. 2009; Schaarup-Jensen et al. 2009; Willems et al. 2012), adversely affects the dimensioning of drainage areas in Poland according to the recommendations of the EU standard PN-EN 752 (EU 2008), directly leading to a higher frequency of combined sewerage discharges when rainwater cannot be collected. The standard restricts the occurrence frequency of such unfavorable phenomena to a rare “socially acceptable” repeatability: once per 10 years for rural areas and once per 20 to 50 years for urban areas, depending on the type of spatial development.
Because of the uncertainty of current projections of future rainfall, it is proposed to check the capacity of urban sewerage systems against extreme rainfall with a frequency of once per 100 years (BLFU 2009; Siekmann and Pinnekamp 2011; Staufer et al. 2010; Willems 2011). This approach poses a new challenge for sewerage system designers, who must satisfy these recommendations. Therefore, systematic research into precipitation patterns and statistical determination of the frequency of occurrence of maximum precipitation amounts are becoming increasingly important, especially for such rare rainfall repeatability, so that the rigorous requirements of the above-mentioned standard can also be met in the future.
2. Pluviographic material and research methods
Archival pluviograms from the Institute of Meteorology and Water Management (IMGW) Wrocław–Strachowice station from the years 1960–2009 constituted the research material. Precipitation was recorded by means of a float pluviograph until 2006, whereas the automatic rain gauge RG-50 SEBA Hydrometrie GmbH with electronic recording was used from 2007. For the needs of the paper, precipitation amounts were determined for 5-min intervals. Such resolution is currently required to develop model precipitation (of Euler type), series of torrential rainfalls (Bröker 2006; Schmitt 2000), or randomly generated synthetic rainfalls (Licznar et al. 2011a,b; Mehrotra and Sharma 2007a,b; Rupp et al. 2012), which are essential for hydrodynamic modeling of area drainage systems. Precipitation amounts were determined for the following 16 duration intervals: 5, 10, 15, 30, 45, 60, 90, and 120 min and 3, 6, 12, 18, 24, 36, 48, and 72 h, totaled by means of the moving sum method. To isolate intensive rainfalls for statistical analyses, the precipitation amount criterion h ≥ 0.75t^0.5 was assumed (h in millimeters, t in minutes). The assumed criterion allowed for isolation of a number of the most intensive rainfalls in each year. A total of 514 synthetic rainfall instances were selected for detailed statistical analysis from the 50-yr period of observation.
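The moving-sum totaling and the selection criterion described above can be sketched as follows. This is a minimal illustration, not the procedure used by the authors: the function names `max_moving_sum` and `is_intense` and the sample depths are hypothetical, and the series is assumed to be a gap-free list of 5-min depths in millimeters.

```python
import math

def max_moving_sum(depths_mm, window_steps):
    """Largest sum over any `window_steps` consecutive 5-min depths (mm)."""
    if window_steps <= 0 or window_steps > len(depths_mm):
        raise ValueError("window must fit inside the series")
    current = sum(depths_mm[:window_steps])
    best = current
    # Slide the window one step at a time: add the new value, drop the oldest.
    for i in range(window_steps, len(depths_mm)):
        current += depths_mm[i] - depths_mm[i - window_steps]
        best = max(best, current)
    return best

def is_intense(h_mm, t_min):
    """Criterion quoted in the text: h >= 0.75 * t**0.5 (h in mm, t in min)."""
    return h_mm >= 0.75 * math.sqrt(t_min)

# Hypothetical 5-min depths (mm) for one event; not measured data.
event = [0.2, 1.5, 3.0, 2.5, 0.8, 0.1]
STEP_MIN = 5  # temporal resolution of the series, as in the text

for t_min in (5, 10, 15, 30):
    h = max_moving_sum(event, t_min // STEP_MIN)
    print(t_min, "min:", round(h, 2), "mm, intense:", is_intense(h, t_min))
```

For the longer durations in the list (up to 72 h), the same window logic applies with `window_steps = t_min // 5`, provided the series covers the whole interval.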
The assignment of precipitation occurrence probability p of a given intensity (i.e., the amount in time) is performed by arranging precipitation amounts in the assumed duration intervals. To determine the probability distribution of a random variable X, precipitation random sample should be first arranged in a non-increasing order,
where x1 ≥ x2 ≥ … ≥ xN; then, the empirical exceedance probability should be assigned to particular sample elements. The empirical probability distribution is constructed on the basis of the observed precipitation amount values in time. The concept of the empirical distribution results directly from a partial probability interpretation,
where m is the rank (position) of an element in the ordered sequence, m = 1, 2, 3, …, 50, and N is the size of the observation sequence.
First, interval precipitation amounts were arranged in non-increasing order (for durations ranging from 5 min to 72 h) with N = 50 yr of observation. The points (xm, p(m, N)) plotted in the coordinate system (hmax, p) allow conclusions to be drawn about the form of the probability distribution function of the random variable X. Empirical cumulative distribution functions of the highest precipitation amounts from the 50-yr measurement period are presented in Fig. 1.
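The ranking step can be illustrated with a short sketch. It assumes the plotting position p(m, N) = m/(N + 1), which is not stated explicitly in this text but agrees with the value p(50, 50) = 0.98 quoted later; the function name and the sample values are hypothetical. With the sample sorted in non-increasing order, p under this assumption is the empirical probability that x_m is reached or exceeded.

```python
def empirical_probabilities(sample):
    """Return (x_m, p) pairs for a sample sorted in non-increasing order.

    Assumption: p(m, N) = m / (N + 1), so the largest value (m = 1)
    receives the smallest probability.
    """
    ordered = sorted(sample, reverse=True)
    n = len(ordered)
    return [(x, m / (n + 1)) for m, x in enumerate(ordered, start=1)]

# Hypothetical annual maximum depths (mm); not data from the paper.
annual_max = [12.4, 30.1, 18.7, 22.0, 15.3]
for x, p in empirical_probabilities(annual_max):
    print(f"{x:5.1f} mm  p = {p:.3f}")

# For N = 50, the last-ranked value gets p(50, 50) = 50/51, about 0.98.
print(round(50 / 51, 2))
```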
Determining the theoretical probability distribution function that best fits the phenomenon described in the paper is not a simple process. In the majority of cases, particularly for continuous variables, there are no theoretical premises that allow an unambiguous determination of the distribution type appropriate for the variable describing the phenomenon in question. According to the literature, the following distributions are used to describe precipitation phenomena (Alila 1999; Di Baldassarre et al. 2006; Brath et al. 2003; Ben-Zvi 2009; Bogdanowicz and Stachý 1998; Kottegoda et al. 2000; Overeem et al. 2008; Schaefer 1990): Fisher–Tippett type Imax; Fisher–Tippett type IIImin; lognormal; and Pearson type III.
The probability distribution is determined by means of the density function for variables of continuous type,
where gi represents distribution parameters. To estimate numerical values of parameters gi by means of statistical data, the density function type must be assumed in advance. All estimation methods for an unknown parameter gi consist in finding such a function of random sample elements,
that can be assumed as an approximated parameter value. The value is called the random sample estimator. The method of maximum likelihood allows for determination of the most efficient estimators. The essence of the method consists in finding such parameter values for which the likelihood function L,
or its logarithm reach a maximum. The condition leads to a system of equations in the form of
from which the searched estimator values are obtained. In case of maximum precipitation descriptions, one of the estimated parameters is the lower limit ɛ of the Fisher–Tippett type IIImin and Pearson type III distributions. Thus, it is recommended to use mixed estimation methods, which means estimation of the lower limit should be performed by means of another method. The Fisher–Tippett type IIImin and Pearson type III distributions will be discussed later at length.
3. The characteristics of selected probability distributions
The density function of Fisher–Tippett type IIImin distribution occurs in the following form:
hence, the likelihood function logarithm is
The values of distribution lower limit were estimated as ɛi = hmax i − 0.1 mm for p(50, 50) = 0.98 (Table 1), taking into consideration the accuracy rating of the rain gauges. Applying the method of maximum likelihood, the parameters α and β were determined on the basis of equations
Parameter calculation results of the Fisher–Tippett type IIImin distribution for the highest precipitation amounts in Wrocław in the period of 1960–2009 and durations t ∈ [5; 4320] min are shown in Table 2.
Quantiles of a random variable for the Fisher–Tippett type IIImin distribution were calculated from the formula
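As a minimal sketch of such a quantile calculation, the code below assumes the common three-parameter Weibull form with exceedance probability P(X > x) = exp(−((x − ε)/α)^β) for x ≥ ε, which gives x_p = ε + α(−ln p)^(1/β). The parametrization used for Table 2 may differ (for example, α may enter as a rate rather than a scale), and the parameter values below are hypothetical.

```python
import math

def weibull3_quantile(p_exceed, alpha, beta, eps):
    """Depth exceeded with probability p under the assumed Weibull form.

    Assumed: P(X > x) = exp(-((x - eps) / alpha) ** beta) for x >= eps.
    """
    if not 0.0 < p_exceed <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return eps + alpha * math.log(1.0 / p_exceed) ** (1.0 / beta)

def weibull3_exceedance(x, alpha, beta, eps):
    """Inverse relation: exceedance probability of depth x."""
    if x <= eps:
        return 1.0
    return math.exp(-(((x - eps) / alpha) ** beta))

# Hypothetical parameters (scale in mm, shape, lower limit in mm).
alpha, beta, eps = 8.0, 1.2, 5.0
for c_years in (1, 2, 10, 100):
    p = 1.0 / c_years  # occurrence frequency C corresponds to p = 1/C
    print(c_years, "yr:", round(weibull3_quantile(p, alpha, beta, eps), 2), "mm")
```

At p = 1 the quantile collapses to the lower limit ε, which is why the estimate of ε matters most at the frequent end of the range.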
The density function of the Pearson type III distribution occurs in the form of
Hence, the likelihood function logarithm is
Using the method of maximum likelihood, the following relations for the distribution parameters were obtained:
where Γ(λ) is Euler’s gamma function. Calculation results of the parameters of Pearson type III distribution for maximum precipitation in Wrocław are presented in Table 2.
Quantiles of a random variable for the Pearson type III distribution were calculated from the formula
where tp is the value of quantiles of a standard gamma distribution.
4. Selection criteria for probabilistic precipitation models
The λ Kolmogorov test (Kotowski et al. 2010) was carried out in order to verify the consistency of the theoretical distributions, applied to describe precipitation, with the empirical distribution for the data from Wrocław. The test makes it possible to accept or to reject the null hypothesis H0 of the consistency of the assumed distributions. The statistic of the λ Kolmogorov test is determined from the formula
where is the maximum discrepancy between distributions.
At the assigned significance level α, the critical value λkr of Kolmogorov’s statistic is taken from statistical tables. For example, for 1 − α = 0.95 the quantile is λkr = 1.36. The H0 hypothesis should be rejected when λ ≥ λkr. Otherwise, the analyzed sample does not contradict the hypothesis at the assumed significance level α.
At the assumed significance level of α = 0.05, for the lognormal distribution the calculated value of the λ Kolmogorov test statistic was higher than λkr = 1.36 (for t = 2880 min). For the remaining three distributions, the calculated statistic values were significantly lower than λkr = 1.36 (for t from 5 to 4320 min). This means that the lognormal distribution is not applicable for the description of the analyzed maximum precipitation in Wrocław at the assumed significance level. It was therefore assumed that the maximum precipitation can be described by means of Fisher–Tippett type Imax, Fisher–Tippett type IIImin, and Pearson type III distributions at the significance level of α = 0.05.
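The test procedure can be sketched in a few lines. The sketch assumes the standard form of the statistic, λ = √N · D, where D is the largest absolute difference between the empirical and theoretical cumulative distribution functions; the symbol D, the function names, the exponential test distribution, and the sample are illustrative assumptions, not taken from the paper.

```python
import math

LAMBDA_CRIT_05 = 1.36  # critical value for alpha = 0.05, as quoted in the text

def kolmogorov_lambda(sample, cdf):
    """Kolmogorov statistic lambda = sqrt(N) * D for a fully specified cdf."""
    xs = sorted(sample)
    n = len(xs)
    d_max = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d_max = max(d_max, abs(i / n - f), abs(f - (i - 1) / n))
    return math.sqrt(n) * d_max

def exp_cdf(x, scale=10.0):
    """Hypothetical theoretical distribution used only for the demonstration."""
    return 1.0 - math.exp(-x / scale) if x > 0 else 0.0

sample = [2.1, 4.8, 7.5, 11.0, 16.3, 25.9]  # hypothetical depths (mm)
lam = kolmogorov_lambda(sample, exp_cdf)
verdict = "reject H0" if lam >= LAMBDA_CRIT_05 else "no grounds to reject H0"
print(round(lam, 3), verdict)
```

The tabulated value 1.36 is asymptotic, so for a sample as small as this demonstration it is only approximate; the same holds, more weakly, when distribution parameters are estimated from the tested sample.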
For models estimated using the method of maximum likelihood, the R^2 statistic cannot be defined. There are information criteria that allow the quality of fit to be assessed while simultaneously taking into account the number of degrees of freedom lost in the estimation process. The Bayesian information criterion (BIC) (Konishi and Kitagawa 2008; Sakamoto et al. 1986) is applied here. The model for which the value of BIC calculated by means of Eq. (19) is the lowest is recognized as the best,
where L is the likelihood function of the analyzed random variable sample, k is the number of the estimated parameters, and N is the number of observations.
The BIC criterion consists of two parts: the first measures the fit of the model, whereas the second rewards its simplicity. In general, information criteria allow for the choice of models that are well fitted and as simple as possible and, at the same time, not overfitted. To compare the analyzed models, the values of the BIC criterion were calculated and shown in Table 3. The two lowest values for each analyzed time are given in bold for clarity. The lognormal distribution was not taken into account, since the λ Kolmogorov statistic criterion was not fulfilled.
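A short sketch shows how the two parts of the criterion trade off. It assumes the standard form BIC = −2 ln L + k ln N, with L, k, and N as defined above; the model labels and log-likelihood values are hypothetical and serve only to demonstrate the comparison.

```python
import math

def bic(log_likelihood, k, n):
    """BIC = -2 ln L + k ln N (standard form, assumed here)."""
    return -2.0 * log_likelihood + k * math.log(n)

# Hypothetical maximized log-likelihoods and parameter counts.
candidates = {
    "two-parameter model": (-152.3, 2),
    "three-parameter model": (-150.9, 3),
}
n_obs = 50  # e.g., one maximum per year over 50 yr

scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
for name, value in scores.items():
    print(f"{name}: BIC = {value:.2f}")
print("lowest BIC:", best)
```

In this made-up case the third parameter improves the likelihood slightly, but not enough to offset the penalty ln N per extra parameter, so the simpler model has the lower BIC.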
The BIC criterion does not unambiguously indicate the best model (the differences between the values of BIC of the analyzed models are small); however, it shows clearly that the Fisher–Tippett type Imax distribution diverges qualitatively from the two remaining ones. Thus, only two distributions (models), Fisher–Tippett type IIImin and Pearson type III, were further analyzed as better in terms of their quality.
5. The precipitation model based on the Fisher–Tippett type IIImin distribution
Calculation results for parameters of the Fisher–Tippett type IIImin distribution were given in Table 1. On this basis, the equations describing the investigated dependencies were determined. Since there is no pattern in the dependency of β on t, the mean value of β was assumed for calculations (values of β range from 1.030 to 1.636). The dependency of the coefficient α on the precipitation duration t was described (at R = 0.993) with the function
while the dependency of the coefficient ɛ on t (at R = 0.996) was described with the formula
Finally, the quantile xp according to (12)—at the same time being the starting form of the probabilistic model I for maximum precipitation amounts in Wrocław, based on the Fisher–Tippett type IIImin distribution—will take the form (hmax in millimeters)
The graphical interpretation of maximum precipitation amounts hmax in Wrocław for the period of 1960–2009, calculated from the probabilistic model in the form (22), is shown in Fig. 2. This is the family of depth–duration–frequency (DDF) type curves: repeatable precipitation amounts with the occurrence probability p ∈ [1; 0.01] (with the occurrence frequency of C ∈ [1; 100] yr) and the duration of t ∈ [5; 4320] min.
6. The precipitation model based on the Pearson type III distribution
Calculation results for parameters of the Pearson type III distribution were shown in Table 2. On this basis, the equations describing the investigated dependency were determined [the dependency of ɛ on t has already been described with the function (21)]. The mean value of λ = 1.11 was assumed for calculation because λ shows no dependency on t (Table 2). The dependency of the coefficient α on the time t was described (at R = 0.982) with the function
The quantile xp of the Pearson type III distribution, according to (17), depends on the value tp. The function tp for the coefficient λ = 1.11 is described (at R = 0.999) with the formula
Finally, the probabilistic model II for the maximum precipitation amounts (hmax in millimeters) in Wrocław, based on the Pearson type III distribution, will take the form of the quantile hmax = xp,
7. The quantitative evaluation of the probabilistic precipitation models
The accuracy of the developed models of maximum precipitation was analyzed for their quantitative evaluation. The relative residual mean square error (rRMSE) was applied in order to compare the results of the calculation and measurements of hmax,
where hc is the calculated precipitation amount (in millimeters) and hm is the measured precipitation amount (millimeters).
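The error measure can be sketched as follows, assuming the form rRMSE = 100% · sqrt((1/n) Σ ((hc − hm)/hm)^2). The exact definition used in the paper is not reproduced in this text, so this form, the function name, and the sample values are assumptions for illustration only.

```python
import math

def rrmse_percent(calculated, measured):
    """Relative RMSE in percent between calculated and measured depths (mm).

    Assumed form: 100 * sqrt(mean(((hc - hm) / hm) ** 2)).
    """
    if len(calculated) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(((hc - hm) / hm) ** 2 for hc, hm in zip(calculated, measured))
    return 100.0 * math.sqrt(total / len(measured))

# Hypothetical depths (mm): model output versus measurement.
h_calc = [11.0, 18.0]
h_meas = [10.0, 20.0]
print(round(rrmse_percent(h_calc, h_meas), 2), "%")
```

Because each residual is divided by the measured depth, short-duration (small) and long-duration (large) amounts contribute on a comparable scale.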
In the case of the maximum precipitation model (22), based on the Fisher–Tippett type IIImin distribution, the value of rRMSE = 7.10%. Figure 3 shows the graphical interpretation of the partial residuals of the analyzed model.
In the case of the maximum precipitation model (25), based on the Pearson type III distribution, the value of rRMSE = 7.99%. Figure 4 shows a graphical interpretation of the partial residuals of this model.
On account of its lower value of rRMSE, model (22), based on the Fisher–Tippett type IIImin distribution, was recognized as more precise, especially in the range practical for sewerage design (C ≤ 10 yr and t ≤ 3 h), and it was recommended for designing sewerage systems in Wrocław (according to EU 2008). However, model (25) is recommended for the verification of the excessive accumulation frequency and combined sewerage discharges (for the range C > 10 yr and t > 3 h).
8. Final remarks
The conducted investigation and research allow us to reach the following conclusions:
To make precipitation models created for different meteorological stations comparable, measurement results of rainfall amounts in time should be described and generalized by means of a single methodology, such as the one proposed in this paper.
For designing and especially for verifying the occurrence frequency of sewerage outflows (for p ∈ [1; 0.01]) by hydrodynamic modeling, it is recommended to use reliable local precipitation models [as in the case of Wrocław: the probabilistic models (22) or (25), respectively].
The pluviographic material from every meteorological station should be continuously updated; consequently, the mathematical form of the developed models should be periodically verified in order to increase their accuracy, especially for low occurrence probability values p < 0.1 (i.e., for C > 10 yr), and in order to consider nonstationarity of the precipitation in time.