The Moderate Resolution Imaging Spectroradiometer (MODIS) instrument aboard the NASA Earth Observing System (EOS) Aqua and Terra platforms, with 36 spectral bands, provides valuable information about cloud microphysical characteristics and is therefore useful for precipitation retrievals. Additionally, CloudSat, selected as a NASA Earth Sciences Systems Pathfinder satellite mission, is equipped with a 94-GHz radar that can detect the occurrence of surface rainfall. The CloudSat radar flies in formation with Aqua with an average delay of only 60 s. The availability of surface rain presence based on CloudSat, together with the multispectral capabilities of MODIS, makes it possible to create a training dataset to distinguish false rain areas based on their radiances in satellite precipitation products [e.g., Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks (PERSIANN)]. The brightness temperatures of six MODIS water vapor and infrared channels are used in this study along with surface rain information from CloudSat to train an artificial neural network model for no-rain recognition. The results suggest a significant improvement in detecting nonprecipitating regions and reducing false identification of precipitation. Also, the results of the case studies of precipitation events during the summer and winter of 2007 over the United States show no-rain identification accuracies of 77% and 93%, respectively.
Reliable estimation of precipitation is important to predict and manage water resources, hazard preparedness, and climate studies (Ajami et al. 2008; AghaKouchak and Nakhjiri 2012; Anderson et al. 2008). However, spatial and temporal variability of precipitation makes it difficult to rely on sparse gauge point measurements, especially for remote regions. Higher spatial and temporal resolutions as well as global coverage of satellite observations are the main advantages of remotely sensed precipitation estimates over in situ measurements. Since they are an indirect method to estimate precipitation, they are also associated with additional uncertainties.
One way to estimate precipitation is through using visible (VIS) and infrared (IR) wavelengths. VIS and IR data are available from geostationary (GEO) satellites and have high spatial and temporal resolutions. However, VIS and IR channels do not measure precipitation directly. Instead, they measure cloud albedo and cloud top temperature, which can be associated with precipitation rate using an indirect relationship. One limitation of these algorithms is that nonprecipitating cold clouds at high altitudes are often falsely identified as precipitating clouds, resulting in false precipitation estimates. Intense precipitation is correlated with cold clouds. However, the converse relationship may not be true. In addition, orographically induced precipitation and precipitation from warm clouds (e.g., stratiform) are not easily identified with current algorithms (Joyce et al. 2004). The misclassification of rain/no-rain (R/NR) clouds is one of the major issues facing IR-based algorithms (Arkin and Xie 1994). In addition to IR and water vapor (WV) channels, low-Earth-orbiting (LEO) satellites are equipped with passive microwave (PMW) sensors that measure the thermal emission and scattering of raindrops. PMW remote sensing of precipitation is recognized as a more reliable source of precipitation estimation from space (Adler et al. 2001; Ebert et al. 1996). However, each LEO satellite has a low temporal resolution of only one or two overpasses a day for a specific location on Earth (Marzano et al. 2004). Because many LEO satellites are orbiting Earth, PMW data from them are operationally available every few hours. To date, PMW sensors are not carried on GEO satellites because of technical challenges (Joyce et al. 2004).
Many satellite-derived precipitation products take advantage of multiple remote-sensing devices. Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA) products are combined precipitation products that use GEO's IR information to fill the gaps between PMW estimates (Huffman et al. 2007). For example, to overcome the temporal limitations of PMW estimates, National Oceanic and Atmospheric Administration (NOAA) Climate Prediction Center (CPC) morphing technique (CMORPH) uses atmospheric motion vectors derived from GEO's IR data to propagate high-quality PMW precipitation estimates when updated PMW data are unavailable (Joyce et al. 2004). Other precipitation products use PMW-adjusted IR data, such as the PMW-calibrated IR algorithm (PMIR; Kidd et al. 2003), the Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks (PERSIANN) algorithm (Hsu et al. 1997; Sorooshian et al. 2000), and the Self-Calibrating Multivariate Precipitation Retrieval algorithm (SCaMPR; Kuligowski 2002). In addition, the Naval Research Laboratory (NRL) blended-satellite precipitation technique uses a combination of Moderate Resolution Imaging Spectroradiometer (MODIS)/Advanced Microwave Scanning Radiometer for Earth Observing System (EOS; AMSR-E) sensors to detect cirrus clouds and reduce false rain estimations in the algorithm (Turk and Miller 2005). More recently, Rain Estimation Using Forward-Adjusted Advection of Microwave Estimates (REFAME) algorithm uses IR images to advect microwave-derived rain rates along the cloud motion tracks. This algorithm takes advantage of a local cloud classification method to adjust the rain rates (Behrangi et al. 2010b). More sophisticated approaches, such as the Lagrangian Model (LMODEL) algorithm, combine information from microwave-calibrated data and morphing techniques using a conceptual modeling framework (Bellerby et al. 2009; Hsu et al. 2009).
Several studies emphasize that more advanced methods are needed to improve the quality of satellite precipitation products, including reducing their false alarm ratio (FAR; Sorooshian et al. 2011). The utility of multispectral satellite data in capturing microphysical properties of clouds and improving precipitation estimation has been the subject of many investigations in recent years. For instance, Li et al. (2007) showed the effectiveness of MODIS channel 31 (11.03 μm) in identifying high clouds with very cold brightness temperatures. Strabala et al. (1994) show that for high ice clouds, a difference between 8.5 and 11 μm brightness temperatures [BTD(8.5 − 11)] is greater than BTD(11 − 12). Furthermore, Wang et al. (2009) used the near-infrared (NIR) 2.19-μm band to retrieve cloud particle size and used the water vapor absorption channel 1.38-μm band to screen out upper-level ice clouds. Turk and Miller (2005) show that significantly positive BTD(3.7 − 11) provides information for identifying cirrus clouds at night.
BTD(11 − 12) is also useful in identifying ice clouds. Inoue (1987) showed that optically thin (τ in the range of 0.1–4) cirrus clouds have BTD(11 − 12) values greater than 2.5 K. Furthermore, BTD(11 − 12) values less than or equal to 0 K correspond to deep convective clouds with a heavy precipitation (Kurino 1997). More recently, Setvak et al. (2003) showed that convective storms exhibit a significant increase in 3.7-μm cloud top reflectivity.
BTD(8.5 − 11) also has been shown to be effective in identifying high ice clouds. Since ice particles absorb much less radiation at 8.5 than at 11 μm, high cirrus clouds are expected to have a BTD(8.5 − 11) greater than one (Roskovensky and Liou 2003). Thies et al. (2008) considered BTD(8.7 − 10.8) and BTD(10.8 − 12.1) to identify cloud phase.
Using multispectral data for R/NR detection was also a focus of many studies. A combination of VIS and IR channels was initially used by Lovejoy and Mandelbrot (1985) and Austin (1987) to identify R/NR occurrences. Also, Capacci and Conway (2005), Behrangi et al. (2010a), and others have found remarkable improvements in detecting rainy areas when using multispectral data. Lensky and Rosenfeld (2003) implemented the difference between a thermal IR channel and a mid-IR channel, BTD(3.7 − 11), into a night-rain delineation algorithm.
In this paper, the application of multispectral data and statistical classification techniques in improving IR-only precipitation algorithms is explored. Multispectral data available from MODIS images and CloudSat Level 2-C Precipitation Column product (Haynes 2011) are two sources of information that are used to improve the quality of rain estimations and reduce the false rain detection. CloudSat data are used to train a neural network model using MODIS data as input to identify false rain locations. This paper is organized into five sections: section 2 explains false alarm in satellite precipitation data followed by the methodology, satellite data, and the training model in section 3. Section 4 presents the results and discussions, and the conclusions are summarized in section 5.
2. False alarm in satellite precipitation data
Evaluation of satellite precipitation algorithms is essential for future algorithm development. This is why many previous studies are devoted to the validation of satellite-based observations (e.g., Tian et al. 2009; Amitai et al. 2009; AghaKouchak et al. 2010b; Zhou 2008; Gochis et al. 2009; Yilmaz et al. 2005; Shen et al. 2010; Dinku et al. 2008; Liu et al. 2009; Sapiano and Arkin 2009; AghaKouchak et al. 2012). For instance, Tian et al. (2009) analyzed the error of six high-resolution satellite products versus a gauge-based estimate and reported regional and seasonal variations of error patterns in the contiguous United States. They concluded that satellite products tend to overestimate rainfall in the summer and underestimate it in the winter. Sapiano and Arkin (2009) also confirmed that satellites overestimate warm season precipitation over the United States. Using volumetric FAR, AghaKouchak et al. (2011) showed that several satellite products exhibit high false alarm rate for rainfall, especially at high quantiles of observations.
To investigate false alarms in the satellite-based precipitation products, we conducted a validation study to compare PERSIANN precipitation data with ground-based measurements. FAR and probability of detection (POD) are calculated for the time period between 2005 and 2008 over the United States. The FAR is the ratio of falsely identified rainy pixels to the total number of rainy pixels in satellite data, whereas the POD measures the fraction of observed precipitation that was correctly forecasted [the ratio of the total number of times that rainfall was correctly forecasted to the total number of observed rainy pixels (Wilks 2006)]. In the current study, the Stage IV radar-based multisensor precipitation estimates (MPEs), available from the National Centers for Environmental Prediction (NCEP), are used as the reference data. The Stage IV precipitation data are adjusted for various biases using rain gauge measurements (Lin and Mitchell 2005) and are considered the “best” area approximation among the currently available area-averaged rainfall datasets (AghaKouchak et al. 2010a). Stage IV data are aggregated into 0.25° spatial and 3-hourly temporal resolutions, which are the same as the PERSIANN precipitation data. Figure 1 shows the FAR and POD for the entire 4-yr period (Fig. 1a) and the summer and winter seasons for the PERSIANN precipitation product (precipitation threshold is considered as 0.05 mm h−1; Figs. 1b,c). Figure 1 reveals very high FARs over the central and western United States and lower FARs over the eastern United States on average. Higher FAR is associated with the presence of high cirrus clouds, especially in the winter season. Tian et al. (2009) also showed higher FAR over the western United States in winter season by PERSIANN data. The average POD is about 60% over the central United States and low over the southwestern region. Low POD on the eastern and western sides of the continent is associated with missed precipitation over these regions. 
Missed precipitation may be due to missed warm rain or snow cover on the ground (Tian et al. 2009).
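The FAR and POD definitions above can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the processing used in the study: the function name `far_pod` and the sample values are hypothetical, and a pixel is assumed to be rainy when its rate is greater than or equal to the 0.05 mm h−1 threshold (the source does not say whether the comparison is strict).

```python
import numpy as np

def far_pod(satellite, reference, threshold=0.05):
    """Return (FAR, POD) for two same-shaped rain-rate arrays in mm/h.

    Assumption: a pixel counts as rainy when rate >= threshold.
    """
    sat_rain = np.asarray(satellite) >= threshold
    ref_rain = np.asarray(reference) >= threshold

    hits = np.sum(sat_rain & ref_rain)            # rain in both datasets
    false_alarms = np.sum(sat_rain & ~ref_rain)   # rain only in satellite data
    misses = np.sum(~sat_rain & ref_rain)         # rain only in reference data

    # FAR: falsely identified rainy pixels / all rainy pixels in satellite data
    far = false_alarms / (hits + false_alarms) if (hits + false_alarms) else np.nan
    # POD: correctly detected rainy pixels / all observed rainy pixels
    pod = hits / (hits + misses) if (hits + misses) else np.nan
    return float(far), float(pod)

# Hypothetical rain rates (mm/h) for six collocated pixels
sat = np.array([0.0, 1.2, 0.3, 0.0, 2.0, 0.8])
ref = np.array([0.0, 0.9, 0.0, 0.4, 1.5, 0.0])
far, pod = far_pod(sat, ref)
print(f"FAR = {far:.2f}, POD = {pod:.2f}")
```

In this toy example, two of the four satellite rainy pixels are false alarms and two of the three observed rainy pixels are detected.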
Generally, satellite precipitation estimates seem to be better during the summer seasons, perhaps because of a dominance of convective storms. On the other hand, the FAR is very high during the wintertime because of the presence of nonprecipitating high cold clouds. Additionally, the presence of snow and ice on the ground and the inability of PMW sensors to measure snowfall over snow- or ice-covered surfaces increase the error in satellite precipitation estimations and result in higher FARs during the wintertime. Finally, it is worth mentioning that radar coverage is limited over the western region of the United States (with the very high false alarm shown in Fig. 1) because of beam blockage in mountainous terrain. The Stage IV data have a large number of missing data over the Pacific Northwest region; therefore, the precipitation data for this region are not included in the analysis.
This study develops a no-rain detection algorithm that takes advantage of CloudSat and MODIS observations to detect no-rain areas. Figure 2 shows an example of how the different datasets are used in this study. Figure 2a demonstrates the CloudSat overpass through a precipitation event (Stage IV data) over South Carolina and neighboring states on 13 August 2008 (0545 UTC). The black line in Fig. 2a represents the track of the CloudSat radar, while Fig. 2b shows the vertical profile of the clouds with different cloud types obtained from the 2B-CLDCLASS product. The CloudSat cloud type classification product is able to identify clear sky, as well as seven different classes of clouds: cumulus (Cu), stratocumulus (Sc), altocumulus (Ac), altostratus (As), nimbostratus (Ns), high cloud (cirrus or cirrostratus), and deep convective cloud. Furthermore, the figure displays PERSIANN precipitation estimates (Fig. 2c) and radar observations (Fig. 2d) corresponding to the CloudSat track. One can see that the maximum amount of precipitation estimated by PERSIANN coincides with the high cirrus anvil, which has the lowest brightness temperature. However, ground-based data indicate that the peak of the storm is in the center of the deep convective tower (about 15 mm h−1), which makes more physical sense. Figures 2e and 2f display cloud brightness temperature, measured by MODIS, which is informative for different cloud types. Figure 2e shows that the lowest value of brightness temperature at 11 μm appears at the location of high clouds and coincides with high precipitation estimations from the PERSIANN product. As discussed earlier, the brightness temperature difference between channels 29 and 31 of MODIS [BTD(8.5 − 11)] is a strong positive value (greater than 2 K) for high ice clouds. Figure 2g presents the radar reflectivity observations by CloudSat showing the vertical structure of the convective zone.
MODIS BTD(8.5 − 11) is almost zero in the presence of deep convective cloud, as shown in Figs. 2f and 2g. The distinction between optically thin clouds (e.g., cirrus) and optically thick clouds (e.g., convective clouds) from multispectral channels helps to improve the IR-only algorithms. Underestimation by the PERSIANN algorithm in the presence of deep convective clouds is one of the limitations of IR-based algorithms.
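The BTD(8.5 − 11) behavior described above can be sketched as a simple screening step. This is an illustrative sketch only: the 2 K value comes from the text, whereas the 0.5 K tolerance used for "almost zero", the function name `btd_flags`, and the sample brightness temperatures are all assumptions made for this example. The study itself feeds brightness temperatures to a neural network rather than applying fixed thresholds.

```python
import numpy as np

def btd_flags(bt_8p5, bt_11, cirrus_min=2.0, convective_tol=0.5):
    """Compute BTD(8.5 - 11) in K and two illustrative cloud flags.

    cirrus_min: BTD above this value suggests high ice cloud (2 K in the text).
    convective_tol: hypothetical tolerance for "almost zero" BTD.
    """
    btd = np.asarray(bt_8p5, dtype=float) - np.asarray(bt_11, dtype=float)
    likely_high_ice = btd > cirrus_min
    likely_deep_convective = np.abs(btd) <= convective_tol
    return btd, likely_high_ice, likely_deep_convective

# Hypothetical brightness temperatures (K) for three pixels
bt_8p5 = np.array([215.0, 210.2, 280.0])
bt_11 = np.array([212.0, 210.0, 278.9])
btd, high_ice, deep_conv = btd_flags(bt_8p5, bt_11)
print(btd, high_ice, deep_conv)
```

Pixels that fall between the two conditions are left unflagged here, which reflects why a trained classifier using several channels is preferable to a single threshold.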
Multispectral image classification is an important technique in the application of remote sensing and geosciences. Statistical classification is a multivariate analysis that takes advantage of simultaneous observations coming from images on different spectral bands. Analyzing a set of input variables for a set of known classes (i.e., labels), a statistical connection will be created between the input features and the target response (i.e., training dataset). Among different classification techniques, artificial neural networks (ANNs) have been shown to be an effective tool in classifying complicated systems (e.g., Hsu et al. 1997; Capacci and Conway 2005; Behrangi et al. 2009; Hong et al. 2004; Bellerby et al. 2000; Tapiador et al. 2004).
ANNs are pattern recognition tools usually used to model complex relationships between a set of inputs and corresponding outputs (Bishop 1996). These models are composed of interconnecting artificial neurons and are employed to find statistical correlations between multispectral information on cloud tops and the presence of precipitation (see Fig. 3 for the ANN model structure). In this study, a feed-forward, back-propagation model with a single hidden layer and a sigmoidal activation function was created. The ANN model calculates the errors between the calculated output and given output data, and by adjusting the weights, minimizes the error. The general equation for ANNs is a linear combination of M fixed nonlinear basis functions φj(x) with the weights ωj and takes the form
$$y(\mathbf{x}, \boldsymbol{\omega}) = f\left( \sum_{j=1}^{M} \omega_j \, \phi_j(\mathbf{x}) \right).$$
Each basis function φj (x) itself is a nonlinear function of a linear combination of the inputs (i.e., MODIS data), where the coefficients in the linear combination are parameters to be adjusted during model training.
In the general ANN equation, f is the activation function. In this study, a sigmoidal activation function was associated with all the neurons in the model and is in the form of
$$f(a) = \frac{1}{1 + e^{-a}}.$$
The target values in the ANN model are a binary vector of no-rain (1) or possible rain (0). The ANN computes the value of the output based on the series of inputs entered into the model. If the output is greater than or equal to 0.5, the pixel is assumed to be a no-rain pixel; values less than 0.5 indicate possible rain pixels.
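The forward pass of a single-hidden-layer network and the 0.5 decision rule described above can be sketched as follows. This is a minimal illustration, not the trained model from the study: the logistic form of the sigmoid, the bias terms, the hidden-layer size of 10, and the random weights and inputs are all assumptions made for the example, and training by back-propagation is not shown.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid (assumed form of the sigmoidal activation)."""
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a feed-forward network with one hidden layer."""
    hidden = sigmoid(x @ w_hidden + b_hidden)   # basis functions phi_j(x)
    return sigmoid(hidden @ w_out + b_out)      # output in the range (0, 1)

def classify_no_rain(output, cutoff=0.5):
    """True where the output is >= cutoff, i.e., a no-rain pixel."""
    return output >= cutoff

rng = np.random.default_rng(0)
n_inputs, n_hidden = 6, 10                      # six MODIS channels; hidden size is hypothetical
w_hidden = rng.normal(size=(n_inputs, n_hidden))
b_hidden = np.zeros(n_hidden)
w_out = rng.normal(size=n_hidden)
b_out = 0.0

x = rng.normal(size=(4, n_inputs))              # four pixels of standardized inputs (made up)
y = forward(x, w_hidden, b_hidden, w_out, b_out)
no_rain = classify_no_rain(y)
print(y, no_rain)
```

Because the output is bounded between 0 and 1, the 0.5 cutoff splits it evenly between the two target values (1 for no-rain, 0 for possible rain).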
The presence of precipitation was assigned to textural and spectral features of clouds observed by MODIS whenever a CloudSat retrieval was available. In this study, the training dataset was created from CloudSat and MODIS data over the contiguous United States. Along each CloudSat track, the multispectral information from the MODIS spectral bands was used as input to the model, and the near-simultaneous CloudSat observation (target value) defined whether each pixel was a possible rain or no-rain pixel. The trained model was then used as a reference to determine whether any pixel in the MODIS image is falsely identified as a rainy pixel at times when CloudSat data are not available.
b. Satellite observations
The MODIS instrument onboard the National Aeronautics and Space Administration's (NASA) EOS Aqua and Terra platforms with 36 spectral bands provides valuable information about cloud microphysical characteristics. The spatial resolution of the MODIS data is 250 m for visible channels (channels 1 and 2, 0.6–0.9 μm), 500 m for channels 3–7 (0.4–2.1 μm), and 1000 m for channels 8–36 (0.4–14.4 μm). For this study, the MODIS level 1B calibrated radiance data were used.
A set of six WV and IR channels of MODIS (6.75, 7.325, 8.55, 9.7, 11.03, and 12.02 μm) were selected as input to the ANN model. The availability of these channels during the day and night makes it possible to have a consistent R/NR detection algorithm for day and night retrieval.
In addition to MODIS, CloudSat (a NASA Earth Sciences Systems Pathfinder mission) is designed to measure the vertical structure of clouds from space and provides the first direct observation of cloud vertical structure (Weisz et al. 2007). CloudSat is incorporated into the EOS satellites, which fly in a sun-synchronous orbit at a 705-km altitude. The CloudSat satellite consists of a 94-GHz Cloud Profiling Radar (CPR) and provides a rich source of information about cloud properties. MODIS and CloudSat are both part of the afternoon constellation of satellites, called the A-Train (Stephens et al. 2002). The CloudSat radar flies in formation with Aqua, with an average of 60 s delay between them, providing almost simultaneous observations.
The CloudSat Level 2-C Precipitation Column algorithm (Haynes 2011) provides information about the presence of surface precipitation. The determination of surface precipitation occurrence is based on the radar reflectivity data near the surface and the surface reflection characteristics (i.e., the Precip_flag variable in the Precipitation Column dataset). The Precip_flag is available over land, ocean, and sea ice and categorizes precipitation into nine different groups: no precipitation, uncertain, rain possible, rain probable, rain certain, snow possible, snow certain, surface mixed precipitation, mixed precipitation possible, and mixed precipitation certain. In this study, only instances of certain no-precipitation were considered as NR pixels.
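The labeling rule described above (only certain no-precipitation counts as NR) can be sketched as a mapping from flag values to binary targets. The integer codes below are hypothetical placeholders chosen for the example; the actual Precip_flag encoding should be taken from the product documentation.

```python
import numpy as np

# Hypothetical code for the "no precipitation" category; verify against
# the Precipitation Column product documentation before use.
NO_PRECIP_CODE = 0

def no_rain_targets(precip_flag, no_precip_code=NO_PRECIP_CODE):
    """Return 1 for certain no-precipitation pixels and 0 for everything else.

    Every other category (uncertain, possible, probable, or certain
    precipitation of any phase) is treated as "possible rain".
    """
    return (np.asarray(precip_flag) == no_precip_code).astype(int)

flags = np.array([0, 3, 0, 1, 9, 5])   # made-up flag values along a track
targets = no_rain_targets(flags)
print(targets)
```

Treating uncertain and possible categories as possible rain keeps the NR class conservative, so that the classifier is trained only on pixels that are confidently dry.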
c. Training dataset
To better estimate the performance of the proposed technique, the analysis was done for summer and winter precipitation events. Separate training datasets for summer and winter were used to account for different climate conditions in different seasons and to improve the accuracy of the model. As explained earlier, the spectral information from MODIS onboard Aqua and the corresponding CloudSat estimation of R/NR were considered in the training datasets. Data were randomly divided into two groups: training and testing. The summer training data included about 118 000 pixels observed in the summer of 2008, with 16 000 rainy pixels (dry to wet ratio of 7.3:1). Similarly, the winter training dataset, with a dry to wet ratio of 1:2.8, comprised around 130 000 pixels in total from the winter of 2010.
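A random division into training and testing groups, together with a dry-to-wet ratio check, can be sketched as below. The test fraction of 0.3, the function names, and the toy data are assumptions for illustration; the source does not state the split proportion.

```python
import numpy as np

def split_train_test(features, targets, test_fraction=0.3, seed=0):
    """Randomly split samples into training and testing groups.

    test_fraction is a hypothetical choice, not a value from the study.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(targets))
    n_test = int(round(test_fraction * len(targets)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return (features[train_idx], targets[train_idx],
            features[test_idx], targets[test_idx])

def dry_to_wet_ratio(targets):
    """Ratio of no-rain (1) to possible-rain (0) samples."""
    targets = np.asarray(targets)
    return float(np.sum(targets == 1) / np.sum(targets == 0))

# Toy dataset: 20 pixels, six channels each, 15 dry and 5 wet
features = np.arange(20 * 6, dtype=float).reshape(20, 6)
targets = np.array([1] * 15 + [0] * 5)

x_train, y_train, x_test, y_test = split_train_test(features, targets)
ratio = dry_to_wet_ratio(targets)
print(len(y_train), len(y_test), ratio)
```

Checking the class ratio per season is useful because a strongly unbalanced training set can bias a classifier toward the majority class.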
d. Application of the model on precipitation events
After training the algorithm using collocated MODIS and CloudSat pixels, the ANN model was used on MODIS multispectral images to identify the NR regions. At each MODIS pixel, the ANN model estimated if that pixel is a NR pixel, and the results were compared with CloudSat detections. The model performance was investigated over the continental United States for the summer and winter of 2007.
4. Results and discussions
After training the model using the summer of 2008 and the winter of 2010 datasets, the model validation was performed on 2007 data. CloudSat radar data are considered as the “truth” to validate the R/NR classification model presented in this study. The 2007 summer results were evaluated over 70 000 CloudSat pixels and showed 78% accuracy in detection of NR pixels. The 2007 winter data validation on 50 000 pixels showed a very high accuracy of 93%. Figures 4 and 5 display the distribution of different cloud types for correct NR pixel classification as well as the misclassified pixels for summer and winter seasons, respectively.
Figure 4 shows that the NR detection algorithm has the poorest performance in the case of middle-level clouds such as altostratus and altocumulus as well as precipitating clouds (see Table 1). The misclassification rate of the pixels associated with altostratus clouds in NR detection was 39%, and the misclassification rate was around 34% in the case of altocumulus clouds (Table 1). The model's low performance in the case of middle-level clouds confirms the limitation of IR-based algorithms in detecting warm rain clouds.
The distribution of different cloud classes in the winter validation dataset is shown in Fig. 5. The first panel in the figure shows that most NR pixels are associated with high clouds and stratocumulus. The misclassification rate is 14% in the case of altostratus clouds, and the error is less than 6% for the remaining cloud types.
The NR model's poor performance in the presence of deep convective clouds, with 43% detection error in winter season, is in agreement with summertime results. In general, deep convective and nimbostratus clouds are mostly associated with rain (Aumann et al. 2011). Most NR pixels associated with deep convective cloud are misclassified as rain in both summer and winter seasons. Hence, the NR detection algorithm does not perform well in cases of precipitating clouds showing very low accuracy in the presence of these cloud types.
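The per-cloud-type error rates discussed above can be computed with a short tally. This sketch assumes the misclassification rate is the fraction of CloudSat NR pixels of a given cloud type that the model labels as possible rain; the function name and the sample labels are hypothetical.

```python
from collections import defaultdict

def misclassification_by_cloud_type(cloud_types, truth_no_rain, predicted_no_rain):
    """Fraction of true NR pixels misclassified as rain, per cloud type.

    Assumption: only pixels that CloudSat marks as NR are evaluated.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for ctype, truth, pred in zip(cloud_types, truth_no_rain, predicted_no_rain):
        if not truth:
            continue                  # skip pixels that are not NR in CloudSat
        totals[ctype] += 1
        if not pred:
            errors[ctype] += 1        # NR pixel labeled as possible rain
    return {ctype: errors[ctype] / totals[ctype] for ctype in totals}

# Made-up validation pixels
types = ["As", "As", "Ac", "high", "high", "high", "As"]
truth = [1, 1, 1, 1, 1, 1, 0]
pred = [1, 0, 1, 1, 1, 0, 0]
rates = misclassification_by_cloud_type(types, truth, pred)
print(rates)
```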
As discussed in section 1, the PERSIANN dataset shows higher false alarms in the winter season. Accordingly, applying the current algorithm yields a greater improvement in precipitation estimation in the winter season.
Two case studies on summer and winter precipitation events are presented here to show the application of this technique to improve the quality of near-real-time PERSIANN precipitation products. The MODIS level 1B dataset has a spatial resolution of 1 km in contrast to the 0.25° (~25 km) PERSIANN precipitation product. Therefore, the MODIS images were regridded to the 0.25° PERSIANN grids and then used as input to the ANN model.
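The regridding step can be sketched as averaging all fine-resolution pixels that fall inside each 0.25° cell. The source does not specify the aggregation method, so simple averaging is an assumption here, as are the function name, the grid origin, and the sample values.

```python
import numpy as np

def regrid_mean(lat, lon, values, lat0, lon0, n_lat, n_lon, cell=0.25):
    """Average scattered pixel values onto a regular lat/lon grid.

    lat0, lon0: southwest corner of the grid (degrees); cell: grid spacing.
    Cells that receive no pixels are returned as NaN.
    """
    lat = np.asarray(lat, dtype=float).ravel()
    lon = np.asarray(lon, dtype=float).ravel()
    values = np.asarray(values, dtype=float).ravel()

    row = np.floor((lat - lat0) / cell).astype(int)
    col = np.floor((lon - lon0) / cell).astype(int)
    inside = ((row >= 0) & (row < n_lat) & (col >= 0) & (col < n_lon)
              & np.isfinite(values))

    flat = row[inside] * n_lon + col[inside]
    sums = np.bincount(flat, weights=values[inside], minlength=n_lat * n_lon)
    counts = np.bincount(flat, minlength=n_lat * n_lon)

    with np.errstate(invalid="ignore", divide="ignore"):
        mean = sums / counts
    mean[counts == 0] = np.nan
    return mean.reshape(n_lat, n_lon)

# Four made-up pixels (brightness temperature in K) on a 2 x 2 grid
lat = [30.05, 30.10, 30.30, 30.40]
lon = [-90.20, -90.10, -90.20, -89.90]
bt = [220.0, 230.0, 250.0, 270.0]
grid = regrid_mean(lat, lon, bt, lat0=30.0, lon0=-90.25, n_lat=2, n_lon=2)
print(grid)
```

Averaging roughly 25 × 25 one-kilometer pixels into each cell smooths small-scale cloud structure, which is one reason coarse-grid results can differ from the pixel-level validation along the CloudSat track.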
The temporal resolutions of the datasets are also different. PERSIANN data are aggregated from 30-min rain estimations into hourly accumulated precipitations. In contrast, MODIS provides instantaneous observations twice a day. In this study, MODIS images within 20 min of PERSIANN estimations are mosaicked together into one raster image and then compared with corresponding PERSIANN data. Corresponding Stage IV data are presented for comparison of model performance.
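Selecting the MODIS images that fall within 20 min of a PERSIANN time step can be sketched with standard datetime arithmetic. The function name and the sample granule times are hypothetical, and the window is assumed to be inclusive at exactly 20 min.

```python
from datetime import datetime, timedelta

def granules_within(granule_times, target_time, window_minutes=20):
    """Return granule times no more than window_minutes from target_time."""
    window = timedelta(minutes=window_minutes)
    return [t for t in granule_times if abs(t - target_time) <= window]

target = datetime(2007, 8, 5, 5, 0)            # PERSIANN time step (UTC)
granules = [
    datetime(2007, 8, 5, 4, 40),
    datetime(2007, 8, 5, 4, 45),
    datetime(2007, 8, 5, 6, 20),               # too late, should be excluded
]
selected = granules_within(granules, target)
print(selected)
```

The selected granules would then be mosaicked into a single raster before comparison with the PERSIANN field.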
Figure 6a shows the Stage IV precipitation data (mm h−1) on 5 August 2007 (0500 UTC). Figure 6b represents the corresponding PERSIANN data for the same time step (mm h−1). By applying the ANN model to the corresponding MODIS images [two images for 5 August 2007 (0440 and 0445 UTC)], the false alarms were identified. A false rain pixel is defined as a pixel that is NR in the ground-based observation data (Stage IV data) but contains precipitation in the satellite estimations. Figure 6c demonstrates the current algorithm's results in identifying false alarms in PERSIANN-derived precipitation. Gray pixels on the image show the location of correct rain detection from the satellite, and red and blue pixels are false rainy pixels from PERSIANN estimations. Blue pixels are false rain pixels that the model correctly identified as NR, whereas red pixels are false rain pixels that the model could not detect (here, to define a false rain, the Stage IV data are considered the reference). Table 2 presents the number of rainy pixels in each dataset as well as the number of false rain pixels detected. The algorithm was able to identify 155 false rain pixels (i.e., 62% reduction in FAR in this event). Note that the region between the solid blue lines shows the MODIS coverage.
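The pixel categories in Fig. 6c and the resulting reduction in false rain can be sketched as follows. This is an illustration under assumptions: the 0.05 mm h−1 rain threshold is borrowed from the FAR analysis in section 2 (the case-study threshold is not stated), the function name and sample values are hypothetical, and the reduction is taken as the fraction of false rain pixels that the model flags as NR.

```python
import numpy as np

def false_alarm_summary(persiann, stage4, model_no_rain, threshold=0.05):
    """Categorize pixels and compute the fraction of false rain removed."""
    sat_rain = np.asarray(persiann) >= threshold
    ref_rain = np.asarray(stage4) >= threshold
    model_no_rain = np.asarray(model_no_rain, dtype=bool)

    correct_rain = sat_rain & ref_rain            # gray pixels
    false_rain = sat_rain & ~ref_rain             # blue and red pixels
    detected = false_rain & model_no_rain         # blue: model flags NR
    undetected = false_rain & ~model_no_rain      # red: model misses it

    n_false = int(false_rain.sum())
    reduction = detected.sum() / n_false if n_false else np.nan
    return {
        "correct_rain": int(correct_rain.sum()),
        "false_rain": n_false,
        "detected": int(detected.sum()),
        "undetected": int(undetected.sum()),
        "reduction": float(reduction),
    }

# Made-up rain rates (mm/h) and model output for six grid cells
persiann = np.array([1.0, 0.5, 0.0, 2.0, 0.3, 0.7])
stage4 = np.array([0.8, 0.0, 0.0, 0.0, 0.0, 0.0])
model_nr = np.array([False, True, True, True, False, True])
summary = false_alarm_summary(persiann, stage4, model_nr)
print(summary)
```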
Figure 7 is another example of false rain detection for 6 November 2007 (0300 UTC). Figure 7c shows that 61% of false rain pixels are identified in this event. PERSIANN estimation shows a large area of false rain on the southeast side of the event, and the majority of false rain pixels (300 pixels) could be removed using the current R/NR algorithm (see Table 2). We also acknowledge that the temporal differences between different datasets [i.e., MODIS and Geostationary Operational Environmental Satellite (GOES) observations] could affect the results.
Previous studies have highlighted the need to improve the quality of satellite precipitation data. A high false alarm ratio is one of the problems that current satellite products are facing, especially during cold seasons. In this study, the ability of an NR classification model that uses CloudSat data together with corresponding multispectral data from MODIS was investigated.
An artificial neural network model was developed to take advantage of accurate surface rain detections from the CloudSat satellite. Model training was performed on CloudSat and MODIS data from the summer of 2008 and the winter of 2010. The summer and winter 2007 datasets were selected to assess the performance of the model. Model validation showed an accuracy of 93% and 77% in identifying false rain pixels for the winter and summer season events, respectively. Because different cloud classes are available from the CloudSat CLDCLASS product, the model performance was also evaluated by cloud class. The model performance was lowest in cases of deep convective and middle-level (e.g., altostratus and altocumulus) cloud types.
By reducing false rain, the quality of satellite precipitation products for practical applications (e.g., flood forecasting) will significantly improve. In the future, there is a possibility to include multispectral data from the Advanced Baseline Imager (ABI) sensor aboard the future GOES–R Series (GOES-R) satellite to overcome the limited retrievals of MODIS.
The proposed technique has the potential to be integrated into near-real-time satellite precipitation products to reduce false alarms from the algorithms. Two case studies presented in the summer and winter of 2007, using hourly PERSIANN data, showed reduction of false rain in comparison with Stage IV radar data.
The financial support for this study is made available from NOAA/NESDIS/NCDC (Prime Award NA09NES4400006, NCSU CICS Sub-Award 2009-1380-01), the Army Research Office (Award W911NF-11-1-0422), the NASA Decision Making Project (Award NNX09AO67G), and the NASA Earth and Space Science Fellowship (NNX11AL33H).