Earth-observing satellites provide a method to measure precipitation from space with good spatial and temporal coverage, but these estimates have a high degree of uncertainty associated with them. Understanding and quantifying the uncertainty of the satellite estimates can be very beneficial when using these precipitation products in hydrological applications. In this study, the generalized normal distribution (GND) model is used to model the uncertainty of the Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks (PERSIANN) precipitation product. The stage IV Multisensor Precipitation Estimator (radar-based product) is used as the reference measurement. The distribution parameters of the GND model are further extended across various rainfall rates and spatial and temporal resolutions. The GND model is calibrated for an area of 5° × 5° over the southeastern United States for both summer and winter seasons from 2004 to 2009. The GND model is used to represent the joint probability distribution of satellite (PERSIANN) and radar (stage IV) rainfall. The method is further investigated for the period of 2006–08 over the Illinois River watershed south of Siloam Springs, Arkansas. Results show that, using the proposed method, the estimation of the precipitation is improved in terms of percent bias and root-mean-square error.
Precipitation is one of the most important components of water-budget analyses and plays a key role in connecting water and energy cycles. Too much or too little precipitation can lead to potential disasters, such as floods and droughts. Therefore, providing reliable measurements of precipitation is a crucial task for a safer environment. This work focuses specifically on the liquid form of precipitation (rainfall).
For precipitation measurement, rain gauges are the most accurate instrument at point scale, but a lack of a dense network of gauges, especially in remote areas, prevents obtaining the spatial heterogeneity of precipitation patterns necessary for most applications. The use of satellite instruments becomes feasible for precipitation estimation at fine spatial and temporal scales. Unlike gauges and radars, satellite measurements can overcome limitations from ground sensors in terms of coverage and operation, but the uncertainty of estimates can be high and should be evaluated. Even satellites do not provide full continuous images at all times; therefore, the average of limited image samples in time also contributes to the error of the final precipitation product. In addition, the uncertainty of data is dependent on the spatial scale and time accumulation of the estimate. In general, the products at finer spatial and temporal scales are associated with higher uncertainty than products at coarser spatial and temporal scales (Steiner 1996).
Further understanding of the uncertainty of satellite estimates is important to hydrologists and operational meteorologists who use satellite precipitation products for their water resources management and applications, such as rainfall–runoff modeling for their river flow forecasting.
A number of high-resolution, satellite-based precipitation estimates (HRSPEs) are available in near–real time (e.g., Hsu et al. 1997; Huffman et al. 2001; Sorooshian et al. 2000; Xie et al. 2003; Joyce et al. 2004). For better use of those HRSPEs, the uncertainty associated with these products at different scales should be quantified. A number of studies related to precipitation error analysis have been conducted (e.g., Krajewski et al. 1991, 2000; Steiner 1996; Steiner et al. 1999; Li et al. 1998; Huffman 1997; Anagnostou et al. 1999; Hossain and Anagnostou 2004; Villarini and Krajewski 2007, 2008, 2009).
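Assuming that the true measurement ($R_{T,A}$, rainfall for temporal accumulation of $T$ and spatial resolution of $A$) is not available, reference data ($R_{\mathrm{ref}}$) are frequently used for the evaluation of model- or satellite-based estimates. A reference error can be assigned as

$$R_{\mathrm{sat}} - R_{\mathrm{ref}} = (R_{T,A} + \varepsilon) - (R_{T,A} + \varepsilon_{\mathrm{ref}}) = \varepsilon - \varepsilon_{\mathrm{ref}}, \tag{1}$$

in which $R_{\mathrm{ref}}$ is the available reference, such as radar data, and $R_{\mathrm{sat}}$ is the satellite estimate. The quantity $\varepsilon_{\mathrm{ref}}$ is the error of the reference, and $\varepsilon$ is the error of the product, which is not known because the true value is never available. The variance of the estimate error (relative to the reference data) can be presented as

$$\mathrm{Var}(R_{\mathrm{sat}} - R_{\mathrm{ref}}) = \sigma_{\varepsilon}^{2} + \sigma_{\varepsilon_{\mathrm{ref}}}^{2} - 2\,\mathrm{Cov}(\varepsilon, \varepsilon_{\mathrm{ref}}). \tag{2}$$

If the errors of the two sources (reference data and satellite estimates) are uncorrelated, their covariance is zero, and Eq. (2) can be presented as

$$\mathrm{Var}(R_{\mathrm{sat}} - R_{\mathrm{ref}}) = \sigma_{\varepsilon}^{2} + \sigma_{\varepsilon_{\mathrm{ref}}}^{2}. \tag{3}$$

If the error variance of the reference source, $\sigma_{\varepsilon_{\mathrm{ref}}}^{2}$, is provided, $\sigma_{\varepsilon}^{2}$ can be calculated based on the reference data and satellite estimate.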
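As a minimal sketch of the variance relation described above, the snippet below subtracts an assumed reference error variance from the variance of the satellite-minus-reference differences. The function name, the sample rain rates, and the reference error variance of 1.0 are hypothetical and serve only as an illustration; the result is meaningful only if the two error sources are uncorrelated.

```python
from statistics import pvariance


def satellite_error_variance(sat, ref, ref_error_var):
    """Estimate Var(eps) as Var(sat - ref) - Var(eps_ref).

    Valid only under the assumption that the satellite and reference
    errors are uncorrelated (zero covariance).
    """
    differences = [s - r for s, r in zip(sat, ref)]
    return pvariance(differences) - ref_error_var


# Hypothetical paired rain rates (mm/day) and reference error variance.
sat = [4.0, 10.0, 0.0, 7.0, 15.0]
ref = [3.0, 12.0, 1.0, 5.0, 11.0]
var_eps = satellite_error_variance(sat, ref, ref_error_var=1.0)
print(round(var_eps, 2))
```

In practice the reference error variance would have to come from an independent evaluation of the radar product; here it is simply assumed.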
Ciach and Krajewski (1999) and Anagnostou et al. (1999) attempted to separate the error into algorithm error and sampling error. Sampling error is the error associated with infrequent passing of satellites over a region, which causes the product to be a temporal average that is slightly different from its true value. Laughlin (1981) showed that the sampling error is a mean-square error of the mean value of the precipitation, expressed in terms of the variance of the area-averaged rain rate.
He studied the Global Atmospheric Research Program (GARP) Atlantic Tropical Experiment (GATE) dataset over the Atlantic Ocean for the summer of 1974 and estimated the sampling errors over the ocean, where in situ observations that could serve as a reference for satellite estimates are scarcest (no gauge or radar data are available). Similar case studies have also been done by other researchers (Seed and Austin 1990; Soman et al. 1995, 1996; Oki and Sumi 1994; Weng et al. 1994).
Bell and Kundu (2000) showed that the sampling error is a function of mean monthly precipitation and also depends on the sampling space and number of satellite visits over a month. They proposed an equation that computes the sampling error from the following quantities: R, the mean rainfall rate over area A; r, the mean rainfall rate during an event over area a; S, the number of satellite visits; τ, the correlation time of the rainfall events; and T, the length of the sample time.
Additionally, Kunsch (1989) used moving-block bootstrapping, which is a nonparametric method based on the sampling experiment. Steiner et al. (2003) arrived at the same relationship as in Bell and Kundu (2000) for data over the Rocky Mountains by using the Laughlin (1981) formula and the resampling method. Steiner et al. concluded that this uncertainty is a statistical variable and should be defined in probabilistic terms.
Gebremichael and Krajewski (2004) compared parametric and nonparametric error estimation. They used the Laughlin (1981) formula for the parametric approach and moving-block bootstrapping for the nonparametric approach. Gebremichael and Krajewski (2005) further described the satellite precipitation sampling error with asymmetric distributions, such as a shifted gamma and a shifted Weibull. They found that, for large sampling intervals such as 12 or 24 h, the conditional distribution of the error given the rainfall rate is shifted Weibull; for smaller sampling intervals, such as 3 or 6 h, the logistic distribution works better.
Steiner et al. (2003) proposed a relationship between radar rainfall estimates and several other factors; a similar relationship was used by Hong et al. (2006) to quantify the variance of the measurement error as a function of area coverage, time integration, sampling frequency, and space–time-average rainfall rate. Several studies on the impact of precipitation uncertainty on flood prediction have been performed (Hossain and Anagnostou 2004). In those publications, the effect of passive microwave and infrared-based satellite product error on flood prediction using a probabilistic error model was demonstrated. In AghaKouchak et al. (2012), the satellite precipitation error is divided into systematic and random error, and their correlation to space and time accumulation is presented.
There is significant interest in the evaluation of the available satellite precipitation products. In a study by Maggioni et al. (2014), the joint model of satellite versus reference precipitation is divided into four regions of hit (where both reference and satellite show precipitation), miss (where reference shows precipitation but satellite shows zero), false alarm (where satellite shows rainfall but reference shows zero), and correct no precipitation (where both show zero). They modeled the hit and missed precipitation using a gamma function and used a constant probability for correct no precipitation and false alarms.
Generally speaking, in most of the previous studies, the errors associated with satellite estimates are assumed to be Gaussian, and only the error variance is estimated (e.g., Ciach and Krajewski 1999; Anagnostou et al. 1999). More recent studies (e.g., Gebremichael and Krajewski 2005) demonstrated that the error distribution differs significantly from a Gaussian distribution. Those studies show that the error distribution depends on the spatial and temporal resolution of the estimates and that the error of the estimate is fitted by shifted gamma; shifted Weibull; and shifted lognormal, logistic, and normal distributions at various spatial and temporal scales.
In this study, a generalized distribution function is introduced to fit the joint probability distribution of the satellite-based estimate and the reference data. The proposed function provides nonsymmetric probability functions that account for the bias and variance, as well as higher-order moments of the uncertainty, rather than using only the first (mean) and second (variance) moments (Ciach and Krajewski 1999; Anagnostou et al. 1999). The parameters of this generalized statistical distribution are functions of spatial and temporal resolution and rainfall rate, which enables the user to include higher-order moments in the uncertainty model (to model the asymmetrical uncertainty data) without having to choose from different distribution forms (Gebremichael and Krajewski 2005). It is assumed that the distribution parameters of the precipitation product uncertainty vary with rainfall rate and with product resolution in space and time; these variations are fitted by simple mathematical functions.
The manuscript is organized as follows. A generalized probability model to describe the relationship between the satellite measurements and reference data is presented, and its properties are studied, in section 2. Section 3 presents the data, the calibration results, and the parameter space, and section 4 contains the evaluation of the model. In section 5, the GND model is used to quantify the precipitation uncertainty in a case study over the Illinois River basin, and the method is evaluated using statistical analysis.
In this study, a generalized probability distribution is proposed to estimate the uncertainty of satellite precipitation estimates, which requires the estimation of the distribution parameters.
After fitting the distribution on sample datasets at particular temporal and spatial scales, the extension of the distribution parameters with respect to various spatiotemporal scale and rainfall rates is further investigated. This is achieved by estimating the distribution parameters by aggregating the product at various spatial and temporal scales. For instance, the chosen data product [e.g., Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks (PERSIANN)] is processed from 0.25° × 0.25° and hourly resolution to coarser spatial resolutions, such as 0.5°, 0.75°, and 1.0°, and 3-, 6-, 12-, and 24-h temporal resolutions.
a. Distribution selection
A number of different probability distributions can be used as uncertainty models for a given variable. The most commonly used one is the Gaussian distribution, which is a symmetric function with two parameters (mean and standard deviation); its skewness and excess kurtosis are zero. However, precipitation data are not shaped symmetrically; they are skewed with a larger occurrence of smaller rainfall rates. There are different distributions that are constructed by adding skewness to the normal distribution, for example, generalized normal distribution (GND), lognormal distribution, skewed normal distribution, or inverse normal distribution. Certain characteristics make the generalized normal distribution preferable to the others. The flexibility of the GND model makes it possible to model the sharper peaks over the smaller values of rainfall rates when the data are highly skewed (Fig. 1). It can also model data with smaller peaks at larger temporal and spatial accumulations. Furthermore, the distribution is bounded from the left when the shape parameter is negative, which is the case for precipitation data because their mass is concentrated toward the left (small rainfall rates), with a long tail toward larger values:

$$f(x) = \frac{\phi(y)}{\alpha - \kappa\,(x - \zeta)}, \tag{6}$$

$$y = \begin{cases} -\dfrac{1}{\kappa}\ln\!\left[1 - \dfrac{\kappa\,(x - \zeta)}{\alpha}\right], & \kappa \neq 0, \\[2ex] \dfrac{x - \zeta}{\alpha}, & \kappa = 0, \end{cases} \tag{7}$$

with α (positive and real) as the scale, κ (real) as the shape, and ζ (real) as the location parameters of the GND model and φ as the standard normal probability density function.
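The moments of the distribution (mean, variance, and skewness) can be defined in terms of its parameters, as follows (for κ ≠ 0):

$$E[X] = \zeta - \frac{\alpha}{\kappa}\left(e^{\kappa^{2}/2} - 1\right), \tag{8}$$

$$\mathrm{Var}[X] = \frac{\alpha^{2}}{\kappa^{2}}\, e^{\kappa^{2}}\left(e^{\kappa^{2}} - 1\right), \tag{9}$$

$$\mathrm{Skew}[X] = \frac{3e^{\kappa^{2}} - e^{3\kappa^{2}} - 2}{\left(e^{\kappa^{2}} - 1\right)^{3/2}}\,\mathrm{sign}(\kappa). \tag{10}$$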
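The distribution described in this subsection can be sketched in a few lines of Python. The snippet assumes the standard three-parameter form of the generalized normal distribution, in which a standard normal variable z maps to x = ζ + (α/κ)(1 − e^(−κz)); the function names and the parameter values in the example are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def _y(x, zeta, alpha, kappa):
    """Transformed variable y; returns None when x lies outside the support."""
    if kappa == 0.0:
        return (x - zeta) / alpha
    arg = 1.0 - kappa * (x - zeta) / alpha
    return -math.log(arg) / kappa if arg > 0.0 else None


def gnd_pdf(x, zeta, alpha, kappa):
    """Probability density of the GND at x."""
    y = _y(x, zeta, alpha, kappa)
    if y is None:
        return 0.0
    return STD_NORMAL.pdf(y) / (alpha - kappa * (x - zeta))


def gnd_cdf(x, zeta, alpha, kappa):
    """Cumulative distribution function of the GND at x."""
    y = _y(x, zeta, alpha, kappa)
    if y is None:
        # Below the lower bound (kappa < 0) or above the upper bound (kappa > 0).
        return 0.0 if kappa < 0.0 else 1.0
    return STD_NORMAL.cdf(y)


def gnd_quantile(p, zeta, alpha, kappa):
    """Inverse CDF: the value x such that gnd_cdf(x) = p, for 0 < p < 1."""
    z = STD_NORMAL.inv_cdf(p)
    if kappa == 0.0:
        return zeta + alpha * z
    return zeta + alpha / kappa * (1.0 - math.exp(-kappa * z))


def gnd_mean(zeta, alpha, kappa):
    """Mean of the GND in terms of its three parameters."""
    if kappa == 0.0:
        return zeta
    return zeta - alpha / kappa * (math.exp(kappa ** 2 / 2.0) - 1.0)


# Hypothetical parameters with a negative shape (bounded from the left).
zeta, alpha, kappa = 2.0, 3.0, -0.9
lower_bound = zeta + alpha / kappa          # left edge of the support
median = gnd_quantile(0.5, zeta, alpha, kappa)
mean_value = gnd_mean(zeta, alpha, kappa)
```

With a negative shape parameter the median equals the location parameter and the mean lies above it, which reflects the long tail toward large rainfall rates.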
b. Parameter estimation
After choosing a general distribution model to fit to the data, the maximum likelihood method is often the logical choice for the estimation of the distribution parameters. However, this may not be the case for distributions with a threshold that is a function of the parameters, where the likelihood function may have multiple modes or reach an infinite value when the estimated parameter values are no longer suitable for data. In this work, we are fitting the distribution to the data using least squares estimates of the cumulative distribution functions (CDFs). The idea is that the scatterplot of the empirical CDF of the data and the CDF of the fitted distribution fall along the 1:1 line from zero to one. To obtain this fitted distribution, we need to minimize the objective function of the sum of the squared differences between those two CDFs. To compensate for the variance of the fitted functions, higher weights are given to the tails and lower weights in the center.
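The objective function described above can be sketched as follows. This is only an illustration under stated assumptions: the empirical CDF uses the plotting positions (i − 0.5)/n, and the weights 1/[p(1 − p)] are one common way to give the tails more weight than the center; neither choice is taken from this study. The synthetic sample, the parameter values, and all function names are hypothetical, and any general-purpose optimizer could be used to minimize the objective.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def gnd_cdf(x, zeta, alpha, kappa):
    """CDF of the generalized normal distribution (kappa != 0)."""
    arg = 1.0 - kappa * (x - zeta) / alpha
    if arg <= 0.0:
        # Outside the support: below the lower bound or above the upper bound.
        return 0.0 if kappa < 0.0 else 1.0
    return STD_NORMAL.cdf(-math.log(arg) / kappa)


def weighted_cdf_sse(params, data):
    """Weighted sum of squared differences between empirical and fitted CDFs."""
    zeta, alpha, kappa = params
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        p_emp = (i - 0.5) / n                    # empirical CDF (assumed plotting position)
        weight = 1.0 / (p_emp * (1.0 - p_emp))   # assumed weights: larger in the tails
        total += weight * (gnd_cdf(x, zeta, alpha, kappa) - p_emp) ** 2
    return total


# Synthetic sample drawn from a GND with known (hypothetical) parameters.
rng = random.Random(0)
true_params = (2.0, 5.0, -0.8)
zeta, alpha, kappa = true_params
sample = [
    zeta + alpha / kappa * (1.0 - math.exp(-kappa * rng.gauss(0.0, 1.0)))
    for _ in range(500)
]

sse_true = weighted_cdf_sse(true_params, sample)
sse_shifted = weighted_cdf_sse((4.0, 5.0, -0.8), sample)  # wrong location parameter
```

The objective is much smaller at the parameters that generated the sample than at the shifted ones, which is the property a minimizer exploits.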
To find the parameters for each specific pair of spatial and temporal resolutions, the data are divided into different bins (with the same bin size) with respect to their corresponding satellite estimated rainfall rate, and a distribution is fitted to each bin of data. For example, if the resolution of interest is 1° × 1° and 3-h accumulated rainfall and assuming the range of satellite estimates from 0 to 50 mm day−1, data can be divided into 10 groups with an interval of 5 mm day−1. The model is then fitted to the reference data in each bin, and its parameters are estimated (Fig. 2). By repeating the same process for different spatial and temporal resolutions, the distribution of parameters can be found as a function of rainfall rate and spatial and temporal resolutions.
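The binning step can be sketched as follows, using the 5 mm day−1 interval and the 0 to 50 mm day−1 range from the example above. The function name and the sample pairs are hypothetical; each bin collects the reference values whose paired satellite estimate falls in that bin, and a distribution would then be fitted to each list.

```python
from collections import defaultdict


def bin_by_satellite_rate(sat, ref, bin_width=5.0, max_rate=50.0):
    """Group reference values by the bin of their paired satellite estimate.

    Bin k covers satellite rates in [k * bin_width, (k + 1) * bin_width).
    Pairs with satellite rates outside [0, max_rate) are skipped.
    """
    bins = defaultdict(list)
    for s, r in zip(sat, ref):
        if 0.0 <= s < max_rate:
            bins[int(s // bin_width)].append(r)
    return dict(bins)


# Hypothetical satellite/reference pairs (mm/day).
sat = [2.0, 4.9, 5.0, 12.5, 49.9, 60.0]
ref = [1.5, 6.0, 3.0, 9.0, 40.0, 55.0]
bins = bin_by_satellite_rate(sat, ref)
```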
c. Distribution parameters at various spatiotemporal scales
The three estimated parameters from each of the bins are modeled as a function of the rainfall rate (mean bin value) using simple functions. For the shift and shape parameters, a linear function is used; for the scale parameter, a power function is used [Eqs. (11)–(13)].
In these functions, the independent variable (mm day−1) is the mean value of the satellite precipitation in each bin for the specific time accumulation (e.g., 3 h) and spatial resolution (e.g., 0.25° × 0.25° latitude–longitude scales). Using these three functions, at each rainfall rate, the parameters of the uncertainty distribution are calculated and the distribution can be formed.
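A minimal sketch of these three functions is given below. It assumes a slope-and-intercept form for the linear functions and a coefficient-and-exponent form for the power function; the coefficient names and values are hypothetical and are not the calibrated values of this study.

```python
def gnd_parameters(rain_rate, coef):
    """Return (location, scale, shape) of the uncertainty distribution.

    rain_rate is the satellite rainfall rate (mm/day). The functional forms
    below are assumptions of this sketch: linear for location and shape,
    power law for scale.
    """
    location = coef["loc_slope"] * rain_rate + coef["loc_intercept"]
    shape = coef["shape_slope"] * rain_rate + coef["shape_intercept"]
    scale = coef["scale_coef"] * rain_rate ** coef["scale_exp"]
    return location, scale, shape


# Hypothetical coefficients, chosen only to illustrate the call.
coef = {
    "loc_slope": 0.2,
    "loc_intercept": 1.0,
    "shape_slope": 0.003,
    "shape_intercept": -0.95,
    "scale_coef": 3.0,
    "scale_exp": 0.5,
}
location, scale, shape = gnd_parameters(10.0, coef)
```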
To extend this method to different spatial and temporal resolutions, the data must be aggregated to those resolutions. The satellite precipitation product and its corresponding radar data are accumulated to coarser temporal resolutions of 3, 6, 12, and 24 h by summing the data over each interval. For spatial resolutions, the average of the rainfall rates is calculated at 0.25° × 0.25°, 0.5° × 0.5°, 0.75° × 0.75°, and 1° × 1°. For each of the 16 different pairs of spatial and temporal resolutions, the distribution is fitted to the bins of the precipitation, and the distribution parameters are modeled as functions of rainfall rate using Eqs. (11)–(13). The six coefficients of Eqs. (11)–(13) construct a parameter space over the different spatial and temporal resolutions by fitting a first-degree polynomial (a plane) to each of them. Having these planes available, for each specific spatial and temporal resolution, we have the equations for the three distribution parameters, and we can use them to construct the uncertainty distribution at each rainfall rate.
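The aggregation described above (sums in time, averages in space) can be sketched with plain Python lists. The function names and the small example arrays are hypothetical; `factor` is the integer ratio between the coarse and the fine resolution (e.g., 2 for 3 h to 6 h, or for 0.25° to 0.5°).

```python
def accumulate_in_time(series, factor):
    """Sum consecutive groups of `factor` time steps (incomplete groups are dropped)."""
    n_groups = len(series) // factor
    return [sum(series[i * factor:(i + 1) * factor]) for i in range(n_groups)]


def average_in_space(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2D grid (list of rows)."""
    n_rows = len(grid) // factor
    n_cols = len(grid[0]) // factor
    coarse = []
    for bi in range(n_rows):
        row = []
        for bj in range(n_cols):
            block = [
                grid[bi * factor + di][bj * factor + dj]
                for di in range(factor)
                for dj in range(factor)
            ]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


# Hypothetical 3-h accumulations at one pixel, aggregated to 6 h.
six_hourly = accumulate_in_time([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], factor=2)

# Hypothetical 4 x 4 field at 0.25 degrees, averaged to 0.5 degrees.
fine = [[float(4 * i + j) for j in range(4)] for i in range(4)]
coarse = average_in_space(fine, factor=2)
```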
3. Data selection, parameter space, and model generation
a. Precipitation data and study domain
In this study, the uncertainty model is generated for PERSIANN based on stage IV data. After developing the method and evaluating its performance, the same process can be used for any other satellite product to quantify the joint probability distribution of satellite and radar estimates.
PERSIANN (Sorooshian et al. 2000) uses a statistical relationship between the IR-based estimates of cloud-top brightness temperature and rain rate. These IR images are from global geosynchronous satellites provided by the Climate Prediction Center (CPC). Further, using a neural network, the rainfall estimates are adjusted based on microwave data from low-orbital satellites [e.g., Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E) on the Aqua spacecraft, the Advanced Microwave Sounding Unit-B (AMSU-B) on board the National Oceanic and Atmospheric Administration (NOAA) satellite series, and the Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI)].
For the reference precipitation, the Next Generation Weather Radar (NEXRAD) stage IV Multisensor Precipitation Estimator (MPE; Lin and Mitchell 2005) is used. NEXRAD is a network of 178 Doppler weather radars (WSR-88D) that is operated by the National Weather Service (NWS) across the United States and measures precipitation at a high spatial resolution. To measure the precipitation, a relationship between reflectivity and rain rate (Z–R relationship) is used that is calibrated differently for different types of precipitation (Rinehart 2004). To create the NEXRAD stage IV data, gauge observations are also added to the data using the MPE where the values of the rain gauges are taken from the weather service stations in the NWS Hydrometeorological Automated Data System network.
It should be mentioned that using stage IV data does not imply their perfection. As has been pointed out in the literature, a radar precipitation dataset has its own uncertainties, which should be evaluated separately (Smalley et al. 2014).
A 5° × 5° region over the southeastern United States bounded between 30° and 35°N and 85° and 90°W is selected (Fig. 3). Data include the months of June, July, August, December, January, and February for the 5-yr period of 2005–09. Both satellite and radar data are aggregated into 3-, 6-, 12-, and 24-h resolutions in time and 0.25° × 0.25°, 0.5° × 0.5°, 0.75° × 0.75°, and 1° × 1° spatially. Satellite and radar rainfall values of less than 1 mm day−1 are set to zero. Pairs in which at least one of the two values is zero belong to three categories: 1) false alarms, which occur when the satellite product shows precipitation and the radar measurement shows zero precipitation; 2) missed rain, when the reference shows precipitation but the satellite shows none; and 3) correct detection of no rain, when both satellite and reference show zero precipitation.
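The thresholding and the resulting categories can be sketched as follows. The 1 mm day−1 threshold comes from the text; the function name, the label for pairs in which both sources detect rain ("hit"), and the sample pairs are hypothetical.

```python
from collections import Counter


def classify_pair(sat, ref, threshold=1.0):
    """Classify one satellite/reference pair after thresholding (mm/day)."""
    sat_rain = sat >= threshold   # values below the threshold count as zero rainfall
    ref_rain = ref >= threshold
    if sat_rain and ref_rain:
        return "hit"
    if sat_rain:
        return "false alarm"
    if ref_rain:
        return "miss"
    return "correct no rain"


# Hypothetical (satellite, radar) pairs in mm/day.
pairs = [(5.0, 3.0), (2.0, 0.4), (0.0, 6.0), (0.3, 0.0), (12.0, 15.0)]
counts = Counter(classify_pair(s, r) for s, r in pairs)
```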
b. Seasonal difference
In general, the study region has a humid subtropical climate. Precipitation patterns vary greatly between summer and winter. During summer, precipitation is almost entirely convective, caused by mesoscale complexes and thunderstorms. Wintertime precipitation is mostly stratiform and tied to synoptic-scale systems. Snowy days are very rare in the region but the warm moist air coming from the Gulf of Mexico during winter could cause frontal freezing rain resulting in ice cover that usually lasts for several days. These different precipitation patterns result in differences in the satellite precipitation data for the two seasons.
Investigating the precipitation products shows that these data are statistically different in winter versus summer. During the summer, the data are more scattered, and the satellite images show a wider range of values, whereas in winter, the satellite product underestimates rainfall and shows a narrower range of rainfall values, as seen in Fig. 4.
The seasonal differences are also apparent in the first three moments of the data (expected value, standard deviation, and skewness).
From the graphs of the mean of the data (Fig. 5, left), we can see that, for winter, the mean is closer to a 1:1 linear function and, for summer, it looks like a power function, and the satellite data are overestimating the amount of precipitation. This shows that treating summer and winter data separately would result in a more accurate estimation of uncertainty.
c. Parameter space and model generation
In section 2c, we stated that Eqs. (11)–(13) represent the parameters of the selected generalized probability distribution as a function of rainfall rate and that each equation contains two distinct parameters (a total of six), which need to be identified. Figures 6 and 7 represent the three-dimensional parameter spaces for each of the six parameters (a, b, c, d, e, and f) as a function of spatial and temporal resolutions. For better visualization, each of the six panels is plotted in such a way that the front center shows the lowest point in the space and gradually increases toward the far back center. For the sake of clarity, if, for example, one chooses a specific spatial and temporal resolution (e.g., 3 h and 0.25°), the corresponding six parameters are obtained from each of these planes for summer and winter separately. By inserting those parameters in Eqs. (11)–(13), we have the three parameters of the product’s uncertainty distribution as a function of rainfall rate, which now allows us to construct the uncertainty model. For any other desired spatial and temporal resolution between the offered ranges, these parameters can be interpolated linearly in the three-dimensional planes. We caution that any extrapolation of the parameters beyond the spatial and temporal resolutions used in this study should be further evaluated. For the case study selected, the specific values of the six parameters used will be introduced in section 5.
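The plane fitting and the lookup of a coefficient at an intermediate resolution can be sketched as follows. The least-squares plane is fitted here with the normal equations and a small Gaussian elimination; the function names and the coefficient values on the 16 calibration resolutions are hypothetical (generated from a known plane so that the result can be checked).

```python
def solve_linear(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_plane(points):
    """Least-squares fit of value = c0 + c1 * ds + c2 * dt to (ds, dt, value) points."""
    rows = [(1.0, ds, dt) for ds, dt, _ in points]
    values = [v for _, _, v in points]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atv = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(3)]
    return solve_linear(ata, atv)


# Hypothetical values of one coefficient at the 16 calibration resolutions.
points = [
    (ds, dt, 2.0 + 3.0 * ds - 0.1 * dt)
    for ds in (0.25, 0.5, 0.75, 1.0)   # spatial resolution (degrees)
    for dt in (3.0, 6.0, 12.0, 24.0)   # temporal resolution (hours)
]
c0, c1, c2 = fit_plane(points)
value_at_case = c0 + c1 * 0.35 + c2 * 6.0   # e.g., 0.35 degrees and 6 h
```

Evaluating the plane at a resolution inside the calibrated range is a linear interpolation; points outside that range would be an extrapolation and should be treated with the caution noted above.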
Figures 8 and 9 are the joint probability of the PERSIANN product and the reference data in 16 different pairs of spatial and temporal resolutions for summer and winter, respectively. These plots are generated using the uncertainty model presented in this work.
In both plots, the top-left corner displays the highest spatial and temporal resolution, and the bottom-right corner represents the coarsest resolution. The joint probability shows that products with the highest resolution have the highest amount of randomness because the distributions are scattered in the plane with low probability over almost the entire space. Moving from right to left in the lower panels of Figs. 8 and 9, we can observe more concentration on a line in the plane, which indicates a higher peak, a more pronounced mode of the distribution, and less randomness. Additionally, as reported previously in the literature (Hong et al. 2006), the spread of the uncertainty increases in each plot with an increase in the rainfall rate, which shows the standard deviation to be an increasing function of rainfall rate.
In the distributions for summer, there is an obvious bias where the peak of the distribution is tilted toward the lower radar value, which shows that the satellite overestimates during the summer. The distributions are less biased during the winter, where the darker points with higher probabilities are closer to a 1:1 line.
4. Model evaluation
To evaluate the model, Fig. 10 compares the RMSE of fitting the uncertainty of PERSIANN relative to the corresponding stage IV data for each pair of spatial and temporal resolutions using several types of distributions, including the GND, normal, gamma, lognormal, and Weibull distributions. The results show that the GND model fits those data better than the other distributions do. The GND is also able to model the skewness of the joint distribution associated with the satellite and stage IV precipitation products, which cannot be represented by a symmetrical distribution such as the Gaussian.
The proposed approach models the non-Gaussian joint distribution of satellite and radar estimates, which enables us to estimate the interval bounds of satellite estimates at any range (e.g., upper and lower bounds of the 90% interval). To further evaluate the uncertainty range obtained from the model, the 80% and 90% uncertainty intervals (10%–90% and 5%–95% uncertainty ranges) are calculated for 2007 and 2010 over a domain from 30° to 40°N and 85° to 95°W. This domain also contains our calibration domain. The year 2007 is selected from our calibration period, and 2010 is chosen for evaluation. The data are at 0.25° and 3-h resolution, so a whole year comprises 2920 images (365 days × 8 images per day) of 40 × 40 pixels. For each pixel, using Δt = 3 h and Δs = 0.25° and the rainfall rate of that pixel, the parameters of the GND model are estimated, and the uncertainty model is generated. Using the inverse CDF of the distribution, the 10% and 90% rainfall rates and the 5% and 95% rainfall rates are estimated. For each pixel in both years, the percentage of images in which the radar value falls within the uncertainty range is calculated. Figures 11 and 12 show this percentage of detection for the years 2007 and 2010, respectively. These figures show that the uncertainty range of the model captures the radar rainfall rate more than 75% of the time for the 90% uncertainty range and more than 65% of the time for the 80% uncertainty range.
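The coverage calculation can be sketched as follows. The snippet assumes the standard three-parameter GND quantile function (with a nonzero shape parameter) and checks itself on synthetic reference values drawn from the same distribution, for which the 5%–95% interval should cover about 90% of the values. The parameter values, the sample size, and the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def gnd_quantile(p, zeta, alpha, kappa):
    """Inverse CDF of the generalized normal distribution (kappa != 0)."""
    z = STD_NORMAL.inv_cdf(p)
    return zeta + alpha / kappa * (1.0 - math.exp(-kappa * z))


def coverage(reference, params, lower_p=0.05, upper_p=0.95):
    """Fraction of reference values inside the [lower_p, upper_p] range of their GND."""
    inside = 0
    for ref, (zeta, alpha, kappa) in zip(reference, params):
        low = gnd_quantile(lower_p, zeta, alpha, kappa)
        high = gnd_quantile(upper_p, zeta, alpha, kappa)
        inside += low <= ref <= high
    return inside / len(reference)


# Image count and grid size for one year at 3-h, 0.25-degree resolution over 10 x 10 degrees.
images_per_year = 365 * 24 // 3
pixels_per_side = int(10 / 0.25)

# Synthetic check with hypothetical parameters: reference drawn from the model itself.
rng = random.Random(1)
zeta, alpha, kappa = 3.0, 9.0, -0.9
reference = [
    zeta + alpha / kappa * (1.0 - math.exp(-kappa * rng.gauss(0.0, 1.0)))
    for _ in range(5000)
]
params = [(zeta, alpha, kappa)] * len(reference)
cov90 = coverage(reference, params, 0.05, 0.95)
cov80 = coverage(reference, params, 0.10, 0.90)
```

With real data the parameters would differ from pixel to pixel and image to image, because they depend on the satellite rainfall rate; the coverage then measures how well the calibrated model brackets the radar values.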
5. Uncertainty analysis of satellite precipitation: Case study over the Illinois River basin south of Siloam Springs
The GND model is further evaluated for a case study using 3 yr of data from 2006 to 2008 over the Illinois River basin located upstream of a USGS gauging station (07195430) south of Siloam Springs, Arkansas (Fig. 13). The watershed has been used as a test basin for the Distributed Model Intercomparison Project (DMIP). The size of the Siloam watershed is 1489 km². The elevation ranges from 285 m at the outlet to 590 m at the highest point, and the basin’s land cover can be described as uniform, with approximately 90% of the basin area being covered by deciduous broadleaf forest, with the remainder being mostly woody. The dominant soil types in the basin are silty clay (SIC), silty clay loam (SICL), and silty loam (SIL). The average annual rainfall and runoff of the basin are about 1200 and 300 mm yr−1, respectively (Smith et al. 2004). The Illinois River basin is free of major complications, such as orographic influences, significant snow accumulation, and stream regulations (Smith et al. 2004).
For this experiment, the PERSIANN data over the domain are used as the precipitation data, and stage IV radar data are used as the reference. The data are averaged over the domain to determine the mean areal precipitation and are aggregated from 3 to 6 h. The area is 1489 km², which corresponds to approximately 0.35° × 0.35° (each degree is considered to be about 111 km). For each point of the PERSIANN time series, considering its temporal resolution (6 h), spatial resolution (0.35°), and rainfall rate, the uncertainty distribution parameters are calculated using the parameter-space equations introduced in this study. The season of the precipitation should also be considered here. We considered events from May to October as summer and from November to April as winter precipitation. From each distribution, 1000 samples are randomly drawn to estimate the uncertainty range of each satellite precipitation point using a Monte Carlo approach.
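The Monte Carlo step can be sketched as follows: 1000 values are drawn from a GND by transforming standard normal draws, and the 5th and 95th percentiles of the sample give the uncertainty range. The transformation assumes the standard three-parameter GND with a nonzero shape parameter; the parameter values, the seed, and the function name are hypothetical.

```python
import math
import random
from statistics import mean, quantiles


def sample_gnd(rng, zeta, alpha, kappa, size):
    """Draw `size` values from a GND by transforming standard normal draws."""
    return [
        zeta + alpha / kappa * (1.0 - math.exp(-kappa * rng.gauss(0.0, 1.0)))
        for _ in range(size)
    ]


rng = random.Random(42)
zeta, alpha, kappa = 3.0, 9.0, -0.9       # hypothetical parameters at one time step
draws = sample_gnd(rng, zeta, alpha, kappa, size=1000)

cuts = quantiles(draws, n=20)             # 19 cut points: 5%, 10%, ..., 95%
lower, upper = cuts[0], cuts[-1]          # bounds of the 90% uncertainty range
sample_mean = mean(draws)
```

Because the inverse CDF of the GND is available in closed form, the same bounds could also be computed analytically; sampling is shown here only because it mirrors the procedure described in the text.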
For the period of November–April, the parameters are linearly interpolated from Fig. 7 (winter) and, for the remainder of the year, the parameters are interpolated from Fig. 6 (summer). For winter, the parameters are a = −0.7011, b = 1.7992, c = 0.0085, d = −0.9032, e = 7.7719, and f = 0.3979; for summer, the parameters are a = 0.1867, b = 0.912, c = 0.0031, d = −0.9904, e = 3.1023, and f = 0.5446. The parameters are then incorporated into Eqs. (11)–(13) to estimate the three parameters of the uncertainty distribution (shape, scale, and shift parameters). For each time step of the PERSIANN time series over the region, the three parameters of the uncertainty distribution are calculated based on the rainfall rate at the given time. This yields the final form of the uncertainty distribution.
To evaluate the uncertainty model, the 90% uncertainty range is calculated using the CDF of the distribution. The rainfall rates corresponding to the 5th and 95th percentiles of the range are estimated and used as lower and upper bounds of the uncertainty range, respectively. The mean of the satellite precipitation uncertainty distribution is calculated using Eq. (8). The 5th and 95th percentiles of the uncertainty range are plotted for winter and summer in Fig. 14, along with the scatterplot of the PERSIANN and stage IV radar rainfall over the watershed. Furthermore, the mean of the uncertainty model versus the satellite rainfall rate is plotted in black. The calculated results show that 68% of the summer reference precipitation data and 70% of the winter reference precipitation data (blue scattered dots) fall into the range, which indicates that the calibrations for both seasons capture most of the variability in the data. In both seasons, the spread of the distributions increases with an increase in the rainfall rate. The summer season shows a more skewed distribution because the mean of the distribution is close to the lower bound; for winter, the mean is almost midway between the lower and upper bounds.
In Fig. 15, the 6-h time series of the stage IV rainfall (Fig. 15a), PERSIANN (Fig. 15b), and mean of the uncertainty model (Fig. 15c) for the year 2006 are presented. The uncertainty model performs very well in reducing the overestimation from the PERSIANN product while keeping its pattern. In this work, the focus is to introduce a method to estimate the uncertainty range of the satellite precipitation product. The proposed model not only provides a range for the uncertainty but also serves as a bias-correction method. In Table 1, the mean of the distribution is compared to the stage IV radar 6-h time series in terms of root-mean-square error (RMSE), percent bias, and correlation coefficient. The same statistics are calculated for the PERSIANN satellite estimates and radar data. A comparison of the statistics for the summer and winter periods, as well as for the entire 3-yr period, shows that the mean of the uncertainty distribution improves the satellite precipitation estimates: in all three cases, the RMSE and percent bias are improved. The improvement in percent bias is more distinct when the two seasons are analyzed separately. Generally speaking, satellite precipitation products have a larger bias in the summer (AghaKouchak et al. 2012), and the proposed method decreased this bias by 47%. For winter, the bias is decreased by 23%. The correlation coefficient stayed the same in most of the cases because the mean is a function of satellite precipitation, and the transformation would not significantly change the correlation coefficient. The RMSE of the satellite versus radar rainfall improved by 23% for the summer data, 6% for the winter data, and 18% for the entire 3-yr period.
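The three statistics can be sketched as follows. Percent bias is defined here as 100 × Σ(estimate − reference) / Σ(reference), which is a common convention and an assumption of this sketch. The sample values are hypothetical, and the simple rescaling used as the "corrected" series is only a stand-in for the mean of the uncertainty distribution; it illustrates why a transformation of the satellite value can reduce RMSE and bias while leaving the correlation coefficient essentially unchanged.

```python
import math


def rmse(est, ref):
    """Root-mean-square error of the estimates relative to the reference."""
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(est, ref)) / len(ref))


def percent_bias(est, ref):
    """Percent bias, assumed here as 100 * sum(est - ref) / sum(ref)."""
    return 100.0 * sum(e - r for e, r in zip(est, ref)) / sum(ref)


def correlation(est, ref):
    """Pearson correlation coefficient."""
    mean_e = sum(est) / len(est)
    mean_r = sum(ref) / len(ref)
    cov = sum((e - mean_e) * (r - mean_r) for e, r in zip(est, ref))
    var_e = sum((e - mean_e) ** 2 for e in est)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    return cov / math.sqrt(var_e * var_r)


# Hypothetical 6-h rainfall values (mm).
ref = [2.0, 4.0, 6.0, 8.0]
sat = [3.0, 5.0, 9.0, 11.0]
corrected = [0.8 * s for s in sat]   # stand-in for the mean of the uncertainty model

stats_sat = (rmse(sat, ref), percent_bias(sat, ref), correlation(sat, ref))
stats_corrected = (
    rmse(corrected, ref),
    percent_bias(corrected, ref),
    correlation(corrected, ref),
)
```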
6. Summary and conclusions
With the increased global use of satellite precipitation estimates for various applications, it is of paramount importance that the uncertainties associated with such products be evaluated and quantified carefully. Previous studies addressed this need by modeling the bias and the variance of the uncertainty at a specific resolution of the product. In this study, a method is introduced to quantify the uncertainty associated with the PERSIANN product. The model is based on a generalized normal distribution of the joint probability of the satellite precipitation product and the stage IV radar rainfall (used as the reference) for various spatial and temporal resolutions and rainfall rate, and at each specific satellite measurement it will generate the conditional probability distribution of the reference data.
The proposed model is calibrated with data over a 5° × 5° area in the southeastern United States for 2005–09 and for different spatial and temporal resolutions. The model is calibrated differently for summer and winter because the characteristics of the error are different for the two seasons (Tian et al. 2009; AghaKouchak et al. 2012).
The main conclusions drawn from the testing and comparison studies reported above are as follows:
GND is a skewed version of normal distribution, which is able to better model the precipitation uncertainty. The model is evaluated in terms of the goodness-of-fit compared to the normal, gamma, lognormal, and Weibull distributions. It is shown that, over different resolutions, this model always produces a better fit (except for a very small number of exceptions). The results from the evaluation over a 10° × 10° region in the southeastern United States for 2007 and 2010 show that, in both years, the uncertainty range of 90% covers more than 75% of the pixels, and the uncertainty range of 80% covers more than 65% of the pixels. This result supports the suggestion that the model can simulate the range of uncertainty to a good degree.
With the flexibility it offers, the proposed model removes the often-confusing dilemma faced in choosing an appropriate probability distribution for capturing the uncertainty of satellite precipitation estimates. It is noted that the proposed method, whose parameters are a function of rainfall rate and spatial and temporal resolution, provides the adaptability to apply the model, not only for lumped but also for distributed modeling purposes with any desired spatial resolution and temporal accumulation.
The model is evaluated at basin scale over the Illinois River basin south of Siloam Springs, Arkansas, for the period of 2006–08. The rainfall time series and their associated uncertainties estimated using the proposed approach were compared to radar rainfall estimates. The results show improvement in quantifying the uncertainty associated with the satellite precipitation estimates. Percent bias and RMSE both decreased for the summer and winter periods for the selected study region.
Users of the proposed model should be aware that this uncertainty model is defined for cases where both satellites and radar detected precipitation with a value larger than the threshold (1 mm day−1). The next step is to extend the model to cover false alarm and missed precipitation. The general framework of the proposed uncertainty model allows for its application to other satellite precipitation products with the proper calibration of the model parameters, as was done for the PERSIANN data.
Financial support for this study is made available from the NASA Precipitation Measurement Mission (Grant NNX10AK07G and NNX13AN60G), the U.S. Army Research Office (Grant W911NF-11-1-0422), and the NOAA NCDC (Prime Award NA09NES4400006, NCSU CICS Sub-Award 2009-1380-01).