1. Introduction
In many parts of Africa, rainfall is the single most important meteorological parameter. Too little rainfall can mean crop failure and famine, while too much can lead to devastating floods, as in Mozambique in 2000. Real-time monitoring of rainfall is vital to allow timely responses to potential disasters. For crop monitoring and famine early warning, real-time monitoring at 10-day intervals (10 days = one dekad) has in the past been regarded as an appropriate temporal resolution for rainfall amounts. However, for hydrological forecasting, the time step needed even for large (>10 000 km2) catchments is of the order of 1 day and for smaller catchments is correspondingly shorter. Daily estimates are also useful for some agricultural applications such as monitoring gaps in the growing season.
Although a conventional rain gauge network gives rainfall observations at a daily time step, throughout much of the African continent the network is inadequate both in terms of spatial coverage and timeliness of data collection, while radar is generally not a feasible proposition on cost and infrastructure grounds. Monitoring rainfall from satellite imagery is an attractive alternative as it has the potential for good spatial coverage, is available in real time and is relatively inexpensive to access.
Many algorithms for satellite-based rainfall monitoring exist but most are not appropriate to the specific requirements of real-time, daily, operational rainfall monitoring in Africa. Of those that have been designed for the purpose, most rely on simple empirical algorithms making use of Meteosat thermal infrared data, sometimes in combination with passive microwave imagery or rain gauge data available via the Global Telecommunications System (GTS). In this paper we describe a new approach based on the application of an artificial neural network (ANN) to a combination of satellite imagery and data from numerical weather prediction (NWP) model analyses. The method described could in principle be used for time periods shorter than 1 day but because the most readily available data for calibration and validation are daily rain gauge observations, we have focused on a daily timescale. As a case study we have applied the approach to 4 yr of data from Zambia in central Africa.
Section 2 reviews current rainfall estimation techniques. The case study area and data used are described in section 3. Section 4 gives a brief introduction to neural networks and describes the rainfall estimation algorithms while section 5 outlines the validation methodology. Results for the Zambian case study are presented in section 6.
2. Satellite-based rainfall estimation
a. Conventional approaches
Satellite imagery has been used for rainfall monitoring for more than 30 yr (Barrett 1970). Much of the interest in that time has focused on generating global datasets for climatological purposes (Kidd 2001). Initial methodologies used data from the thermal infrared (TIR) and visible sensors on geostationary satellites to identify convective cumulonimbus clouds (Barrett and Martin 1981). Geostationary satellites have the advantage of good spatial (∼5 km for TIR imagery in the Tropics) and temporal (30 min) resolution, but these methods tend to perform poorly at high latitudes because of the preponderance of nonconvective rain.
The most widely used of these algorithms is the Geostationary Operational Environmental Satellite (GOES) Precipitation Index (GPI; Arkin 1979), in which a rainfall amount of 3 mm is associated with each hour of cold cloud duration (CCD), that is, each hour for which the cloud-top temperature at a pixel is below a threshold temperature Tt [a0 = 0 mm, a1 = 3 mm h−1 in Eq. (1)]. For the GPI, the temperature threshold is normally taken as −38°C. Although the GPI gives good results over the tropical oceans, it is known to overestimate rainfall amounts over land (Arkin et al. 1994).
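The CCD relation can be illustrated with a short sketch. It assumes that Eq. (1) has the linear form rainfall = a0 + a1 × CCD, where CCD (cold cloud duration) is the number of hours for which the TIR brightness temperature of a pixel is below the threshold. The function names, the array layout, and the half-hourly image spacing are illustrative assumptions and are not taken from any operational code.

```python
import numpy as np

def cold_cloud_duration(tir_celsius, threshold_c, hours_per_image=0.5):
    """Hours for which each pixel is colder than the threshold.

    tir_celsius: array (n_images, ny, nx) of brightness temperatures in deg C.
    hours_per_image: assumed time represented by each image (30 min here).
    """
    cold = np.asarray(tir_celsius) < threshold_c
    return cold.sum(axis=0) * hours_per_image

def ccd_rainfall(ccd_hours, a0=0.0, a1=3.0):
    """Assumed linear form of Eq. (1): rainfall (mm) = a0 + a1 * CCD.

    The defaults are the GPI values quoted in the text (a0 = 0 mm, a1 = 3 mm/h).
    """
    return a0 + a1 * np.asarray(ccd_hours, dtype=float)

# Toy example: 48 half-hourly images of a 1 x 2 pixel scene.
tir = np.full((48, 1, 2), -20.0)
tir[:8, 0, 0] = -45.0          # pixel (0, 0) is cold for 8 images = 4 h
ccd = cold_cloud_duration(tir, threshold_c=-38.0)
rain = ccd_rainfall(ccd)
```

With the GPI parameters, the pixel that stays below −38°C for 4 h is assigned 12 mm, and the pixel with no cold cloud is assigned zero.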
Since the late 1970s, methodologies based on passive microwave data (between 10 and 100 GHz) from instruments such as the Special Sensor Microwave Imager (SSM/I) on polar-orbiting satellites have been used (Kidd 2001). Over the oceans at frequencies less than 40 GHz, microwave emission from raindrops gives a more reliable indication of rainfall than inference from TIR images. However, over land surfaces at these frequencies, the information is degraded by the high and variable emissivity, which depends on vegetation and soil moisture (Morland et al. 2001). At higher frequencies (>60 GHz) the signal is dominated by scattering from ice particles within clouds and is therefore less sensitive to surface characteristics, but is conversely less directly related to raindrop density. An alternative approach is to make use of differences in the signal at different polarizations to correct for variations in surface emissivity (Kidd 1998).
An additional problem with sensors on polar-orbiting satellites is the poor temporal resolution (one or two overpasses per day), which is inadequate for monitoring rainfall on a daily timescale. Some workers have attempted to combine the advantages of geostationary TIR and microwave data, for example, by using the microwave data to recalibrate the TIR algorithm whenever it is available (Todd et al. 2001).
An important technological advance was made in 1997 with the launch of the Tropical Rainfall Measuring Mission (TRMM) satellite. This is the first satellite to have an onboard precipitation radar. Rainfall estimates using the precipitation radar are promising and this may be the method of choice in the long term but the current TRMM satellite does not provide data at appropriate temporal resolutions to be usable for the operational purposes of interest here.
b. Methods based on artificial neural networks
An artificial neural network (ANN) provides a computationally efficient way of determining an empirical, possibly nonlinear relationship between a number of “inputs” and one or more “outputs.” In addition, the ANN has been shown to be effective in extracting significant features from noisy data (Davolo and Naim 1991) and for this reason the most common applications have been in the field of pattern recognition. A more detailed description of neural networks is given in section 4b; here, we briefly indicate studies relevant to rainfall estimation.
Many studies have been performed using an ANN approach in atmospheric science (Hsieh and Tang 1998). In the field of remote sensing, an ANN approach has also been used by Aires et al. (2001) for retrieval of surface temperature and atmospheric water vapor from satellite data. Recently ANN algorithms for rainfall monitoring have been successfully applied by Hsu et al. (1997), Tsintikidis et al. (1997), and Bellerby et al. (2000).
In the case of the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) system described by Hsu et al. (1997), the inputs are satellite TIR temperatures and their spatial derivatives plus a parameter that classifies the underlying surface as land, sea, or coast. The neural network is used to discriminate between rain rates of different cloud patterns via a “self-organizing feature map.” A marked improvement was obtained when the network was continually updated by calibration against available real-time data. A further paper (Sorooshian et al. 2000) reported good results from real-time updating with TRMM precipitation radar data.
Tsintikidis et al. (1997) compared an ANN approach with linear regression for rainfall estimation over the ocean from SSM/I passive microwave data and found that the ANN performed better than the regression for the same input.
In the method described by Bellerby et al. (2000), the input parameters are brightness temperatures and their spatial derivatives for three IR and one visible sensor on the GOES geostationary satellite. The output is the instantaneous rain rate. Training of the network was carried out using TRMM precipitation radar data. The method is shown to perform consistently better than a locally calibrated GPI technique.
c. Operational rainfall estimation for Africa
For operational, real-time rainfall monitoring in Africa, appropriate methodologies must not only be sufficiently accurate but must fulfil practical criteria with regard to availability of data, low cost, and low technological specifications for hardware and software.
There are several methods that are currently, or have been recently, in operational use in Africa. Most make use of a 10-day (dekadal) time step and are intended primarily for monitoring crop development on a regional or national scale during the growing season. Examples are listed next.
In West Africa a method devised by the Institut Français pour le Developpement en Coopération (ORSTOM) group [now Institut de Recherches pour le Développement (IRD)] at Lannion in France has been used over a number of years (Carn et al. 1989). As well as Meteosat TIR data, this approach uses surface temperature estimates to identify conditions conducive to strong convection.
The B4 method (Todd et al. 1995) is one of a number of methods devised by the Remote Sensing Unit at Bristol University and has been used operationally in West Africa and in the Nile basin. In this case the rainfall is estimated as a function of CCD and mean rainfall amount per rain day. Both the threshold temperature and the CCD calibration against rainfall amount are determined by comparison with local rain gauge data available in the previous 10-day period.
The Climate Prediction Center (CPC) at the United States National Oceanic and Atmospheric Administration (NOAA) has developed several algorithms, the results of which have been used by the U.S. Agency for International Development (USAID) in Africa for famine early warning. Until December 2000 an algorithm was employed (Herman et al. 1997) using the GPI as a baseline estimate, which was then recalibrated in the light of available real-time rain gauge data from the GTS. An additional feature was the incorporation of numerical weather analysis data to represent orographic rainfall. Since January 2001, this has been replaced by an algorithm that incorporates the GPI, microwave imagery from SSM/I and the Advanced Microwave Sounding Unit (AMSU), and GTS rain gauge data, combined according to a methodology described by Xie and Arkin (1996). The algorithm is calibrated to give daily rainfall amounts.
The Tropical Applications of Meteorology using Satellite Data (TAMSAT) group at Reading University uses an approach based on Eq. (1) but, as with the B4 method, Tt, a0, and a1 are determined by calibration against local gauges (Milford and Dugdale 1990). A difference between this method and the operational techniques described so far is that calibration is carried out using historic rain gauge data. The rationale is that, while contemporaneous, high-quality gauge data should give the best calibration, in practice sufficient data are rarely available in near real time with adequate quality control. Using data from previous years for the same calendar month means that the calibration can be performed with many more observations and greater opportunity for filtering out suspect values. The disadvantage is that the accuracy of the estimates is degraded by the interannual variability in the calibration parameters (Laurent et al. 1998). The TAMSAT algorithm is used as the control for comparison with the ANN approach in this paper and is therefore described in more detail in section 4.
More recently Grimes et al. (1999) have attempted to make best use of both historic and contemporaneous gauge data by basing an initial calibration on historic data and then using a kriging technique to merge this rainfall estimate with any good quality rain gauge data available in real time.
Comparisons between methods used operationally in Africa have been inconclusive. Snijders (1991) compared the TAMSAT approach with the University of Bristol Polar-Orbiter Effective Rainfall Monitoring Integrative Technique (PERMIT) method (similar to B4 described earlier) and several other techniques in the West African Sahel and found that there was little to choose between the methods. Laurent et al. (1998) performed a more detailed comparison of the Lannion and TAMSAT methods at a variety of space and time scales for West Africa and found that the real-time Lannion calibration gave significantly better results provided sufficient gauge data were available. In the absence of real-time calibration, both methods gave similar results. Thorne et al. (2001) compared the TAMSAT method and the earlier (prior to 2001) CPC method in southern Africa and found that the CPC algorithm was superior where there was a high density of GTS gauges, but the TAMSAT approach worked better in other areas.
d. Rainfall monitoring for hydrological purposes in Africa
The methods described in section 2c, with the exception of the current CPC algorithm, have all been designed to produce dekadal (10 day) rainfall estimates. There has been little investigation of the accuracy and reliability of operational daily estimates in Africa, particularly within a hydrological context. From the ergodic principle, one would expect that daily estimates would be useful provided the reduced averaging in the time domain was compensated for by increased averaging in space. Grimes and Diop (2003) have shown that daily CCD-based estimates (with historic calibration) give useful results as the input to a river flow model for a catchment with an area of ∼80 000 km2. In this case, the results are at least as good as those from a gauge network with one gauge per 15 000 km2, which is relatively dense for this part of Africa. They also show that the river flow model output can be improved if numerical weather prediction model analysis data are used to modulate the rainfall estimation algorithm.
From these considerations and following the discussion in the preceding sections, we have developed an artificial neural network approach to real-time rainfall estimation at a daily time step based on Meteosat TIR data and NWP model analysis information. Both sources of data are readily and reliably available in Africa in real time. Data from passive microwave sensors have not been included in the interests of algorithm simplicity (taking into account problems mentioned earlier) and also because microwave data reception systems are not widely available within Africa. The work focuses on seasonally arid Africa but it is expected that the methodology is applicable to many regions within the Tropics.
3. Data and data preparation
a. Data for case study area
Zambia is located in central Africa between 8° and 18°S and 22° and 34°E. It covers an area of 750 000 km2. It was chosen for this study mainly because the rainfall climate is uncomplicated and representative of much of Africa. Additionally we were grateful for the excellent cooperation of the Zambian Meteorology Service in providing rain gauge and other data. Zambia has one rainy season running from October to April corresponding to the annual passage of the intertropical convergence zone (ITCZ) with little influence from either orography or coastline. Daily records from 77 gauges were available covering the period October 1995–April 1999. Rigorous quality control was applied. Stations were rejected for gaps in time series and other data irregularities as well as local knowledge of past reliability. This left a total of 35 gauges, the locations of which are shown in Fig. 1.
NWP analysis fields for this area and time period were available from the European Centre for Medium-Range Weather Forecasts (ECMWF) for each day at 0000, 0600, 1200, and 1800 UTC at 0.5° grid resolution. The fields selected as being likely to provide information on rainfall rate were relative humidity and horizontal wind velocity at the surface and vertical velocity at 400 and 700 mb. In order to reduce data requirements, only the 1800 UTC data were used and the data were extracted for a window 8°–21°S and 18°–34°E. Meteosat TIR images corresponding to the ECMWF analysis time and daily CCD images were extracted from the TAMSAT archive. The pixel size of the Meteosat images for Zambia is roughly 7 km × 5 km.
b. Data preparation
1) Gauge data
A problem with comparing satellite-based rainfall estimates against rain gauge data is that they are concerned with different spatial scales. The gauge value represents rainfall at a point, whereas a satellite pixel based on Meteosat TIR imagery for Zambia is an average over an area of about 35 km2. Flitcroft et al. (1989) analyzed data from a dense rain gauge network in West Africa and showed that the standard deviation of individual point values used to represent a given pixel average rainfall was ∼10 mm and this figure was almost independent of rainfall quantity. They also found a systematic bias in that gauge measurements of high rainfall amounts were likely to overestimate the pixel average rain; whereas gauge measurements of low rainfall amounts tended to underestimate. It is reasonable to suppose that convective rainfall associated with the ITCZ elsewhere in Africa would display similar variability.
To address this problem, we have converted the raw gauge data to areal averages for each pixel by applying the technique of block kriging (Journel and Huijbregts 1978). Using this approach, the best estimate for a pixel is a weighted mean of nearby gauges. The weighting is determined by the distance of the gauges from the target pixel, taking into account the spatial correlation of the rainfall event as represented by a variogram function computed from the gauge data. Given the small number of gauges in this study, a climatological variogram was computed from all 4 yr of data for each calendar month. With this approach the rainfall field generated from the gauge data is directly comparable to the satellite-derived rainfall image. The block kriging technique is in principle superior to other interpolation procedures: first, because the final pixel estimates are informed by the spatial structure of the rainfall and, second, because it allows confidence limits to be calculated for each pixel value. As expected, the error is lowest for pixels containing gauges and highest for pixels far away from any gauge.
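The weighted-mean character of block kriging can be illustrated with a minimal ordinary block kriging sketch. The exponential variogram and its parameters, the gauge positions, and the 4 × 4 discretization of the pixel are all illustrative assumptions; the variograms actually fitted in this study are not reproduced here, and the kriging variance used for confidence limits is not computed.

```python
import numpy as np

def exp_variogram(h, sill=1.0, corr_range=50.0, nugget=0.0):
    """Exponential variogram; parameter values are illustrative only."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, nugget + sill * (1.0 - np.exp(-h / corr_range)), 0.0)

def block_kriging_weights(gauge_xy, block_xy, variogram=exp_variogram):
    """Ordinary block kriging weights for one pixel.

    gauge_xy: (n, 2) gauge coordinates in km.
    block_xy: (m, 2) points discretizing the target pixel.
    Returns the n weights, which sum to 1.
    """
    gauge_xy = np.asarray(gauge_xy, dtype=float)
    block_xy = np.asarray(block_xy, dtype=float)
    n = len(gauge_xy)
    # Gauge-to-gauge variogram matrix, bordered by the unbiasedness constraint.
    d_gg = np.linalg.norm(gauge_xy[:, None, :] - gauge_xy[None, :, :], axis=2)
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = variogram(d_gg)
    lhs[n, n] = 0.0
    # Gauge-to-block variogram, averaged over the discretization points.
    d_gb = np.linalg.norm(gauge_xy[:, None, :] - block_xy[None, :, :], axis=2)
    rhs = np.ones(n + 1)
    rhs[:n] = variogram(d_gb).mean(axis=1)
    solution = np.linalg.solve(lhs, rhs)
    return solution[:n]   # the last element is the Lagrange multiplier

# Toy example: three gauges and a 7 km x 5 km pixel centred on the origin.
gauges = np.array([[0.0, 0.0], [30.0, 0.0], [0.0, 60.0]])
gx, gy = np.meshgrid(np.linspace(-3.5, 3.5, 4), np.linspace(-2.5, 2.5, 4))
pixel = np.column_stack([gx.ravel(), gy.ravel()])
weights = block_kriging_weights(gauges, pixel)
rain_mm = np.array([12.0, 4.0, 0.0])
pixel_estimate = float(weights @ rain_mm)
```

Because the weights sum to one, a uniform rainfall field is reproduced exactly, while an isolated high gauge value is pulled towards its neighbours.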
For the calibration and validation part of this work, only those pixels containing a rain gauge were used. From now on these will be referred to as gauge pixels. In the interests of simplicity, the variogram was computed using only nonzero gauge values and kriged pixel rainfall estimates were set to zero if the gauge measurement within the pixel was zero.
Figure 2 compares the kriged pixel estimates with the raw gauge observations for the gauge pixels. As might be expected, the kriging process tends to move rainfall amounts towards the mean rain amount per rain day. Thus pixel estimates corresponding to low gauge values are raised while those corresponding to high gauge values are lowered. Note this does not imply that for a high rainfall measurement, the true mean rainfall in the surrounding pixel is necessarily lower; rather it implies that in the absence of other information the best (i.e., most likely) estimate of the pixel rainfall will be lower. The relationship between kriged pixel rainfall and gauge observations is in broad agreement with Flitcroft et al. (1989).
2) NWP model analysis fields


In this way, we can represent the whole field within the windowed area at time t by the p amplitudes aj. The process is efficient if most of the variance in the dataset can be represented by a small number of principal components; in other words, p can be set to a small value.
In our case, most of the variance for relative humidity and horizontal velocity is explained by the first five principal components (see Table 1). Although the proportion explained for the vertical wind components is relatively small, p was set to 5 for all variables for the sake of consistency. Thus the 1800 UTC field pattern for each day is described by just five numbers (the amplitudes aj, j = 1–5) for each parameter. As an example, the first principal component of the humidity field is shown in Fig. 3. The dominant feature is a strong maximum of humidity over southern Zambia roughly reflecting the position of the ITCZ at the height of the rainy season. The other components are not shown as a direct physical interpretation is less obvious.
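A principal component compression of this kind can be sketched with a singular value decomposition. The function and variable names, the synthetic data, the 27 × 33 grid, and the choice to remove the time mean before the decomposition are assumptions made for illustration; the text does not specify these details.

```python
import numpy as np

def pca_amplitudes(fields, p=5):
    """Compress daily fields to p principal component amplitudes.

    fields: array (n_days, n_gridpoints), one flattened analysis field per day.
    Returns (amplitudes, components, explained_fraction, mean_field).
    """
    fields = np.asarray(fields, dtype=float)
    mean_field = fields.mean(axis=0)
    anomalies = fields - mean_field          # assumption: remove the time mean
    # Rows of vt are the spatial patterns (principal components).
    _, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    components = vt[:p]
    amplitudes = anomalies @ components.T    # a_j(t), shape (n_days, p)
    explained_fraction = (s[:p] ** 2).sum() / (s ** 2).sum()
    return amplitudes, components, explained_fraction, mean_field

# Synthetic example: 200 days on a 27 x 33 grid built from 3 patterns plus noise.
rng = np.random.default_rng(0)
patterns = rng.normal(size=(3, 27 * 33))
daily = rng.normal(size=(200, 3)) @ patterns + 0.1 * rng.normal(size=(200, 27 * 33))
amps, comps, frac, mean_field = pca_amplitudes(daily, p=5)
reconstruction = mean_field + amps @ comps   # approximate field from 5 numbers per day
```

Each day's field is then carried forward as the five values in one row of `amps`, and `frac` plays the role of the explained-variance figures quoted in Table 1.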
4. Rainfall estimation algorithms
a. The TAMSAT algorithm (TAMCCD)
The TAMSAT rainfall estimation technique was used in this study as an example of a simple TIR-only technique against which to compare the ANN approach; it is therefore worth describing in some detail.
For daily rainfall estimates, a separate calibration was carried out for each month for the whole of Zambia. Detailed examination of the gauge data did not indicate that subdivision into smaller calibration zones was necessary. The calibration is a two-stage process. In the first stage rain gauge observations are compared with CCD values for the gauge pixels at a number of different temperature thresholds in order to determine the value of Tt, which best discriminates between rain and no rain. This is done using a contingency table as shown in Table 2.
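The first calibration stage can be sketched as follows. The layout of Table 2 and the exact selection criterion are not reproduced in this text, so the sketch assumes a standard 2 × 2 rain/no-rain table and picks the threshold with the highest fraction of correctly classified gauge pixel-days. Both that criterion and all names and data values are assumptions for illustration.

```python
import numpy as np

def contingency_table(gauge_rain_mm, ccd_hours):
    """2 x 2 counts of gauge rain/no rain against CCD > 0 / CCD = 0."""
    rain = np.asarray(gauge_rain_mm) > 0
    cold = np.asarray(ccd_hours) > 0
    return {
        "rain_and_ccd": int(np.sum(rain & cold)),
        "rain_no_ccd": int(np.sum(rain & ~cold)),
        "dry_and_ccd": int(np.sum(~rain & cold)),
        "dry_no_ccd": int(np.sum(~rain & ~cold)),
    }

def best_threshold(gauge_rain_mm, ccd_by_threshold):
    """Pick the threshold whose CCD best separates rain from no rain.

    ccd_by_threshold: dict mapping threshold (deg C) to CCD values for the
    same gauge pixel-days as gauge_rain_mm.
    Assumed criterion: maximize the fraction of correct classifications.
    """
    scores = {}
    for threshold, ccd in ccd_by_threshold.items():
        t = contingency_table(gauge_rain_mm, ccd)
        scores[threshold] = (t["rain_and_ccd"] + t["dry_no_ccd"]) / sum(t.values())
    return max(scores, key=scores.get), scores

# Toy data: six gauge pixel-days and CCD at three candidate thresholds.
gauge = [0.0, 5.2, 0.0, 12.1, 0.0, 3.3]
ccd = {
    -30: [1.0, 4.0, 2.0, 6.0, 0.0, 2.0],   # too warm: false alarms
    -40: [0.0, 3.0, 0.0, 5.0, 0.0, 1.0],   # separates rain from no rain
    -50: [0.0, 0.0, 0.0, 3.0, 0.0, 0.0],   # too cold: misses rain days
}
threshold, scores = best_threshold(gauge, ccd)
```

In this toy case the −40°C threshold classifies every pixel-day correctly, whereas the warmer threshold produces false alarms and the colder one misses rain days.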


b. Artificial neural network algorithms
1) General remarks on neural networks
In the terminology usually used, a neural network consists of a number of layers of nodes or neurons. The first layer is the input layer, the final layer is the output layer and layers between the input and output are referred to as hidden layers. Connections exist between the nodes that allow a set of values presented at the input layer to be mapped to the output layer. The exact form of the connections is specified by the network “architecture.” Any given node (apart from those in the input layer) will receive inputs from a subset of the other nodes. The total input to any node is the weighted sum of the outputs from all nodes with an input connection to that node. The weights involved are a property of the individual connections. The output from any node is a function (usually nonlinear) of the input. An expected advantage of the ANN in this kind of study is that its distributed nature makes it more robust to errors or missing values in any individual input parameter.
A network is “trained” to a specific task by presenting it with many examples of inputs and the corresponding desired outputs. After each example, the internodal weights are adjusted so as to improve the match between desired and actual output. Training continues until a stable solution is reached or until a desired degree of accuracy is achieved as judged by the mean-square difference between desired and actual output. A full description of the neural network approach can be found in a number of texts, for example Hecht-Nielsen (1990) or Davolo and Naim (1991).
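The weighted-sum structure and the example-by-example training described above can be sketched in a few lines. This is a generic single-hidden-layer network trained by gradient descent on a toy problem, not the network used in this study; the tanh node function, the layer size, the learning rate, and the synthetic data are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def init_network(n_in, n_hidden):
    """Small random weights for one hidden layer and a single linear output."""
    return {
        "w1": rng.normal(scale=0.5, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "w2": rng.normal(scale=0.5, size=n_hidden),
        "b2": 0.0,
    }

def forward(net, x):
    """Each node outputs a nonlinear function of the weighted sum of its inputs."""
    hidden = np.tanh(x @ net["w1"] + net["b1"])
    return hidden @ net["w2"] + net["b2"], hidden

def train_step(net, x, target, rate=0.05):
    """Adjust the weights after one example to reduce the squared error."""
    output, hidden = forward(net, x)
    err = output - target
    # Error signal at the hidden nodes, computed before w2 is changed.
    grad_hidden = err * net["w2"] * (1.0 - hidden ** 2)
    net["w2"] -= rate * err * hidden
    net["b2"] -= rate * err
    net["w1"] -= rate * np.outer(x, grad_hidden)
    net["b1"] -= rate * grad_hidden

def mean_square_error(net, inputs, targets):
    outputs = np.array([forward(net, x)[0] for x in inputs])
    return float(np.mean((outputs - targets) ** 2))

# Toy problem: learn a nonlinear function of three inputs.
inputs = rng.uniform(-1.0, 1.0, size=(100, 3))
targets = np.sin(inputs.sum(axis=1))
net = init_network(n_in=3, n_hidden=6)
mse_before = mean_square_error(net, inputs, targets)
for _ in range(50):
    for x, t in zip(inputs, targets):
        train_step(net, x, t)
mse_after = mean_square_error(net, inputs, targets)
```

Training would normally stop when the mean-square difference stabilizes or reaches a target value; the fixed 50 passes here simply keep the sketch short.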
2) Initial ANN algorithm (TAMANN1)






3) Final ANN algorithm (TAMANN2)


After numerous trials, it was found that the raw TIR data and its spatial variance could be eliminated provided CCD at several different temperature thresholds were included. Similarly, station location as represented by latitude and longitude was found to be unimportant. These results were confirmed by improvements in validation statistics as described in the following section. Inclusion of an additional hidden layer produced further improvement in validation statistics, in line with the results of Bellerby et al. (2000). The final configuration, which gave the best results, is that shown in Fig. 4b with 30 input variables (listed in Table 3) and two hidden layers.
Experiments were also carried out to examine the effect of training the network with different numbers of stations. As might be expected, optimum results were achieved from training with the full set of 25 calibration gauge pixels, although useful results could still be obtained from as few as 2.
A slight problem with this configuration of the ANN was that it was difficult to generate zero rainfall. This resulted in estimates of very low rainfall (<1 mm) over large areas. While this is not significant at a daily timescale, it is obviously nonphysical and may give rise to problems if daily values are integrated over longer periods. A pragmatic solution was to set all pixel estimates to zero if CCD = 0 for Tt = −30°C.
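The zero-rainfall rule amounts to a simple mask on the network output. A minimal sketch, with hypothetical function and array names:

```python
import numpy as np

def apply_zero_rain_mask(ann_rain_mm, ccd_at_minus30):
    """Set estimates to zero wherever there was no cloud colder than -30 deg C."""
    ann_rain_mm = np.asarray(ann_rain_mm, dtype=float)
    return np.where(np.asarray(ccd_at_minus30) == 0, 0.0, ann_rain_mm)

estimates = np.array([[0.4, 6.1], [0.2, 0.9]])     # raw network output, mm
ccd_30 = np.array([[0.0, 5.0], [0.0, 1.5]])        # CCD in hours at Tt = -30 deg C
masked = apply_zero_rain_mask(estimates, ccd_30)
```

Only the pixels with no cold cloud are reset; low estimates under cold cloud (0.9 mm in the example) are left unchanged.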
This final algorithm as described in the preceding paragraphs will be referred to as TAMANN2 in the rest of this paper.
5. Validation methodology
Estimates of rainfall from the TAMCCD and TAMANN algorithms described in section 4 were compared against the block-kriged gauge pixel values described in section 3b. Both algorithms were calibrated using the same 25 gauge pixels shown in Fig. 1. The remaining 10 gauge pixels from the full set of 35 (also shown in Fig. 1) were used as an independent validation set. In order to make the best possible use of the 4 yr of available data, a cross-validation approach was adopted in which each combination of 3 yr from 4 was used for calibration and the remaining year was used for validation. In this way we have four separate years for which the validation data are independent of the calibration both in space (different gauges) and in time (different year). To simplify computation, the principal components for the NWP analysis and the variograms for the kriging were calculated only once using all 4 yr of data.
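The leave-one-year-out scheme can be written down explicitly. The season labels below are inferred from the October 1995 to April 1999 data period, and the function name is hypothetical.

```python
from itertools import combinations

seasons = ["1995/96", "1996/97", "1997/98", "1998/99"]

def leave_one_year_out(all_seasons):
    """Yield (calibration seasons, validation season) for each 3-from-4 split."""
    for calibration in combinations(all_seasons, len(all_seasons) - 1):
        validation = [s for s in all_seasons if s not in calibration]
        yield list(calibration), validation[0]

folds = list(leave_one_year_out(seasons))
for calibration, validation in folds:
    print("calibrate on", calibration, "-> validate on", validation)
```

Each of the four folds would additionally be restricted to the 25 calibration gauge pixels for training and the 10 validation gauge pixels for testing.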
It has been suggested that the kriged pixel values for the validation gauge locations are not truly independent as they contain information from nearby calibration gauges. However, this merely reflects the physical reality of the spatial coherence of the rainfall field. The important point is that the calibration gauge pixels are not included and the validation year is outside the calibration period.
6. Results
a. Calibration of TAMCCD
The CCD–rainfall calibration parameters for all months and all validation years are shown in Table 4. It can be seen that there is significant interannual and intraseasonal variability. Within a season, the optimum threshold varies between −30° (October, March) and −50°C (April).
The interannual variation seems strongest at the beginning and end of the season. October shows the highest percentage change in the rainfall rate a1 for a given value of Tt, while April is the only month for which Tt varies by as much as 20°C.
b. Calibration of ANN algorithms
The ANN algorithms were trained using the same 3-yr calibration periods as for TAMCCD. As an example, the mean input weights for TAMANN2 as represented by
Interpretation of Fig. 5 must be made with caution. The complex and distributed nature of the network connections means that too much significance should not be attached to the relative sizes of the mean weights. Nevertheless, it can be seen that there is a consistency from year to year with the highest mean weights corresponding to CCD values at −30° and −60°C and the first principal components of relative humidity and vertical velocity. The correspondence with CCD is to be expected. The effect of the first principal component of the humidity and vertical velocity appears to be to allow the network to take account of the gross seasonal pattern. The relationship between the amplitude of the first principal component of relative humidity and rainfall is demonstrated for one season in Fig. 6. The rainfall has been scaled to better show the comparison. The principal component amplitude is positive during the rainy season, negative outside it, and reaches a maximum in midseason. Within the season there is also evidence of a correspondence between individual high-rainfall events and high amplitudes.
c. Comparison of results
Results from all the algorithms were compared with the kriged gauge pixel data on three different space-time scales. These were
daily rainfall quantities for individual pixels (scale = 1 pixel-day),
daily rainfall average over all 10 validation pixels (scale = 10 pixel-days), and
average rainfall over 10-day period and all 10 validation pixels (scale = 100 pixel-days).
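The three space-time scales correspond to successive averaging of a days-by-pixels array. A minimal sketch with synthetic rainfall follows; the function name, the gamma-distributed test data, and the handling of an incomplete final 10-day period (it is dropped) are assumptions.

```python
import numpy as np

def aggregate_scales(rain, days_per_period=10):
    """Return rainfall at the three comparison scales.

    rain: array (n_days, n_pixels) of daily rainfall for the validation pixels.
    """
    rain = np.asarray(rain, dtype=float)
    pixel_day = rain.ravel()                       # 1 pixel-day
    daily_mean = rain.mean(axis=1)                 # average over all pixels
    n_periods = rain.shape[0] // days_per_period   # drop an incomplete last period
    trimmed = rain[: n_periods * days_per_period]
    period_mean = trimmed.reshape(n_periods, -1).mean(axis=1)
    return pixel_day, daily_mean, period_mean

rng = np.random.default_rng(2)
rain = rng.gamma(shape=0.5, scale=8.0, size=(35, 10))   # synthetic: 35 days, 10 pixels
pixel_day, daily_mean, period_mean = aggregate_scales(rain)
```

The same aggregation would be applied to the kriged gauge values and to each set of estimates before the statistics are computed.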


The bias is an indicator of the quality of the mean estimated rainfall and the rmsd shows the likely error in an individual event estimate. The nrmsd is the rmsd normalized with respect to the error on the kriged gauge estimate for the same pixel. Thus an nrmsd < 1 indicates that the estimates lie, on average, within the uncertainty limits of the validation data. The value R2 shows the fraction of the variance of the rainfall explained by the estimation algorithm. The skill score shows the success of the algorithm relative to using the mean observed rainfall as the estimate. Here, skill = 1 is a perfect match to the observations; skill = 0 means that the estimates are equivalent to using the mean observed value; skill < 0 implies the estimates are worse than using the mean value.
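These statistics can be computed as in the sketch below. The formulas are common definitions chosen to be consistent with the descriptions above and are assumptions rather than reproductions of the paper's equations; in particular, the bias is taken as estimate minus observation, the skill as one minus the ratio of the squared error to the variance of the observations, and the nrmsd as the root-mean-square of the differences scaled by the kriging standard error.

```python
import numpy as np

def validation_stats(estimate, observed, kriging_error=None):
    """Bias, rmsd, R2 and skill for paired estimates and kriged gauge values."""
    est = np.asarray(estimate, dtype=float)
    obs = np.asarray(observed, dtype=float)
    diff = est - obs
    if est.std() > 0 and obs.std() > 0:
        r2 = float(np.corrcoef(est, obs)[0, 1] ** 2)
    else:
        r2 = float("nan")   # correlation is undefined for a constant series
    stats = {
        "bias": float(diff.mean()),
        "rmsd": float(np.sqrt(np.mean(diff ** 2))),
        "r2": r2,
        "skill": float(1.0 - np.sum(diff ** 2) / np.sum((obs - obs.mean()) ** 2)),
    }
    if kriging_error is not None:
        # Assumed form: differences scaled by the kriging standard error.
        scaled = diff / np.asarray(kriging_error, dtype=float)
        stats["nrmsd"] = float(np.sqrt(np.mean(scaled ** 2)))
    return stats

observed = np.array([0.0, 2.0, 10.0, 4.0, 20.0])
perfect = validation_stats(observed, observed)
climatology = validation_stats(np.full(5, observed.mean()), observed)
```

The two toy cases reproduce the limiting values described in the text: a perfect match gives skill = 1, and using the mean observed rainfall as the estimate gives skill = 0.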
The statistics for the TAMCCD, TAMANN1, and TAMANN2 estimation methods for each individual year and for all years combined are summarized in Table 5. Figures in bold indicate the best result for each time period. It can be seen that at all space and time scales and for all statistical parameters TAMANN1 performs least well and TAMANN2 is slightly but consistently better than TAMCCD, with the exception of the bias in 1998/99. The small differences in number of events for TAMANN1 are due to days missing from the TIR record. The improvement of TAMANN2 over TAMANN1 is ascribable to the changes in network architecture, inclusion of additional CCD thresholds, and pruning as described in section 4b(3). The remainder of this discussion will concentrate on comparison of TAMANN2 and TAMCCD.
Inspection of scatterplots (Fig. 7) indicates more clearly the nature of the differences. Figure 7a shows the scatterplot for individual pixel-days. A large spread is to be expected here because neither the CCD nor the model data can resolve rainfall quantities accurately at this resolution, while error on the kriged pixel estimates and satellite collocation error will also contribute to the scatter. Nevertheless, differences in the quality of the performance of the two methods are still apparent. Both methods tend to underestimate rainfall above 10 mm but TAMANN2 does somewhat better and there is also a slightly tighter spread around the one-to-one line for lower rainfall.
For the daily pixel average (Fig. 7b), the spread of data points is reduced because of the spatial averaging and the improvement of TAMANN2 for rainfalls above 10 mm is more clearly discernible. Although there is a tendency for both methods to underestimate, it is less pronounced for TAMANN2. Unfortunately, the wide geographical distribution of the validation pixels means that average rainfall amounts do not exceed 25 mm, so it is not possible to see whether TAMANN2 does much better in estimating extreme rainfall over an extended area.
The better estimation of high rainfall amounts can also be seen in a typical time series of daily average data for 1997/98 shown in Fig. 8. The high peaks (>10 mm) in the kriged rainfall are all more closely approached by the TAMANN2 estimate with the exception of day 96 (overestimated by TAMANN2). This is a useful result as better estimates of high daily rainfalls are of obvious importance for river flow forecasting. It is also worth commenting that the pixels averaged here are noncontiguous and that results may well be better when averaged over a similar number of contiguous pixels (such as a river catchment) as errors associated with collocation will become less important.
For the 10-day pixel average in Fig. 7c both methods do extremely well with R2 ∼ 0.9 (Table 5c). These high correlations are attributable to three factors. First, TIR-based methods work well in convective rainfall regimes as in Zambia. Second, the use of noncontiguous validation pixels reduces the daily mean rainfall, which again favors TIR-based methods. More contiguous data are needed to see how well higher rainfall amounts with the same degree of averaging are represented. Third, the use of kriged gauge data provides calibration and validation at an appropriate spatial scale and reduces the scatter in the validation plots. Of the two methods TAMANN2 is again slightly better with a more even distribution of points about the one-to-one line as evidenced by the negligible bias in Table 5c.
While the statistics and graphs described so far focus on the 10 validation pixels shown in Fig. 1, it is also informative to look at the area rainfall maps produced by the two methodologies. An example is shown in Fig. 9 for 30 March 1997. It is apparent that the overall rainfall pattern is similar for both methods, emphasizing the importance of the CCD in the TAMANN2 algorithm. However, TAMANN2 produces higher rainfall values overall, and the areas of highest rainfall (shown in red in Fig. 9) correspond to regions of very cold cloud. Comparison with CCD images (also shown in Fig. 9) indicates that the shape of the high-rainfall area is largely determined by the −60°C CCD while the overall light-rainfall pattern is closer to the −30°C picture.
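To make the role of the different thresholds concrete, the following minimal sketch computes the CCD (cold cloud duration) of a single pixel at the four threshold temperatures used as network inputs. It is an illustration only: the function names, the hourly time step, and the sample temperatures are assumptions rather than details of the operational processing.

```python
# Threshold temperatures (deg C) used for the CCD inputs
THRESHOLDS_C = (-30.0, -40.0, -50.0, -60.0)

def cold_cloud_duration(temps_c, threshold_c, step_hours=1.0):
    """Hours during which the TIR brightness temperature is below the threshold.

    temps_c holds one temperature per image slot; step_hours is the assumed
    time between slots.
    """
    return step_hours * sum(1 for t in temps_c if t < threshold_c)

def ccd_inputs(temps_c, thresholds=THRESHOLDS_C, step_hours=1.0):
    """CCD at each threshold for one pixel, keyed by threshold temperature."""
    return {th: cold_cloud_duration(temps_c, th, step_hours) for th in thresholds}

# Hypothetical brightness temperatures (deg C) for six image slots
sample = [-20.0, -35.0, -45.0, -65.0, -62.0, -10.0]
ccd = ccd_inputs(sample)
```

For the sample above, the −30°C CCD counts four slots while the −60°C CCD counts only the two coldest, which is why the colder thresholds pick out the cores of the storm systems.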
A question arises as to whether the better results of the network are due to the additional input data or the greater complexity of the network structure. We have found that a network trained on the CCD data alone does not do as well as the standard regression algorithm. On the other hand, preliminary results of ongoing work indicate that a multiple regression using the same inputs as TAMANN2 performs less well than the neural network. The implication is that both the additional NWP information and the nonlinear network characteristics are important. These results will be reported in full elsewhere.
Although in general the improvements afforded by the TAMANN2 algorithm are small, the reduction in bias for high-rainfall amounts is significant for hydrology. It is also worth pointing out that the neural network training process is a one-step calibration process, which does not require separate monthly calibrations or separate specification of a temperature threshold. Furthermore, the neural network framework allows easy incorporation and testing of other data streams, which are likely candidates for improving the rainfall estimates.
7. Conclusions
A methodology for the operational estimation of daily rainfall using an artificial neural network has been devised making use of Meteosat TIR imagery and NWP model analysis fields from the ECMWF. Zambia in central Africa has been used as a case study. A novel feature of this methodology is the use of principal component analysis for the efficient inclusion of NWP model data.
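As an illustration of how principal component analysis compresses a sequence of gridded NWP fields into a few amplitudes, the sketch below extracts only the leading component by power iteration. It is a simplified stand-in, not the procedure used in this study: the function name, the use of unscaled anomalies (a covariance-based analysis), and the starting vector are all assumptions, and the study itself retained the first five components.

```python
import math

def leading_pc(fields, iterations=200):
    """Leading principal component of fields[t][g] (time t, grid point g).

    Returns (pattern, amplitudes, explained): a unit-length spatial pattern,
    the projection of each anomaly field onto it, and the fraction of total
    variance it explains.
    """
    n_t, n_g = len(fields), len(fields[0])
    # Remove the time mean at each grid point
    means = [sum(f[g] for f in fields) / n_t for g in range(n_g)]
    anom = [[f[g] - means[g] for g in range(n_g)] for f in fields]
    # Non-uniform start vector; assumed not orthogonal to the leading pattern
    norm0 = math.sqrt(sum((g + 1) ** 2 for g in range(n_g)))
    v = [(g + 1) / norm0 for g in range(n_g)]
    for _ in range(iterations):
        # Apply the covariance matrix implicitly: w = A^T (A v)
        proj = [sum(a[g] * v[g] for g in range(n_g)) for a in anom]
        w = [sum(anom[t][g] * proj[t] for t in range(n_t)) for g in range(n_g)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    amplitudes = [sum(a[g] * v[g] for g in range(n_g)) for a in anom]
    total = sum(x * x for a in anom for x in a)
    explained = sum(x * x for x in amplitudes) / total if total > 0 else 0.0
    return v, amplitudes, explained

# Hypothetical three-time, three-grid-point example
pattern, amps, frac = leading_pc([[0.0, 0.0, 1.0], [2.0, 4.0, 1.0], [4.0, 8.0, 1.0]])
```

In this toy example all of the variance lies along a single spatial pattern, so one amplitude per time step carries the full information in the three grid-point values. The same idea allows a small number of amplitudes to represent each NWP field in the network input.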
Following the use of a pruning technique to remove redundant input parameters, the optimum network configuration was a four-layer perceptron for which the input consists of CCD at four threshold temperatures together with the first five principal components of surface horizontal wind velocities, surface relative humidity, and higher-level vertical wind velocities. The output is an estimate of the mean pixel rainfall. The network has been trained and validated using 4 yr of Zambian rain gauge data interpolated to pixel average values using kriging. A comparison has been carried out with a standard CCD algorithm at three scales: pixel-day, 10 pixel-day, and 10-pixel × 10-day.
Results show that the ANN method is slightly but consistently better than the standard approach at all space and time scales. There is a high degree of scatter for the individual pixel-day estimates because the resolution is too high for either method and additional uncertainty is introduced by the collocation errors of the satellite. The daily average (10 pixel-day) comparison shows that the neural network gives more accurate estimates for higher rainfall amounts (in excess of 10 mm) although both methods underestimate in this range. This is important for hydrological applications, particularly flood forecasting. At the 10-pixel × 10-day scale, both methods perform very well with the ANN having a lower bias. The nature of the averaging means that the maximum average rainfall is less than 10 mm, so the performance of the methods with high rainfall amounts averaged over 100 pixel-days has not been tested.
An additional advantage of the TAMANN2 approach is that the calibration is carried out in a single-step training procedure with no necessity to select different threshold temperatures or vary calibration parameters for different stages of the season.
Development of the TAMANN2 methodology as an operational tool is still in its early stages. An important next step will be to test the algorithm in areas such as East and northeast Africa where there is a lower correlation between CCD and rainfall amount. A useful feature of the ANN framework is that the inclusion of additional or different data streams is relatively straightforward. The pruning technique means that the importance of additional data can be easily assessed. This could be of great benefit in tailoring the approach to different climatic conditions.
Acknowledgments
Thanks are due to the Zambian Meteorological Service for providing the rain gauge data for this study. Collaboration between L'Aquila and Reading Universities was facilitated by funding under the British–Italian Joint Research Initiative financed by the British Council and CRUI/MIUR. The contributions of Rogerio Bonifacio and Iain Russell in assistance with the data analysis are also gratefully acknowledged.
REFERENCES
Aires, F., Prigent C. , Rossow W. B. , and Rothstein M. , 2001: A new neural network approach including first guess for retrieval of atmospheric water vapour, cloud liquid water path, surface temperatures and emissivities over land from satellite microwave observations. J. Geophys. Res., 106 , (D14),. 14887–14907.
Arkin, P. A., 1979: The relationship between the fractional coverage of high cloud and rainfall accumulations during GATE over the B-scale array. Mon. Wea. Rev., 107 , 1382–1387.
Arkin, P. A., Joyce R. , and Janowiak J. E. , 1994: The estimation of global monthly mean rainfall using infrared satellite data: The GOES Precipitation Index. Remote Sens. Rev., 11 , 107–124.
Barrett, E. C., 1970: The estimation of monthly rainfall from satellite data. Mon. Wea. Rev., 98 , 322–327.
Barrett, E. C., and Martin D. W. , 1981: The Use of Satellite Data in Rainfall Monitoring. Academic Press, 340 pp.
Bellerby, T., Todd M. , Kniveton D. , and Kidd C. , 2000: Rainfall estimation from a combination of TRMM precipitation radar and GOES multispectral satellite imagery through the use of an artificial neural network. J. Appl. Meteor., 39 , 2115–2128.
Carn, M., Lahuec J. P. , Dagorne D. , and Guillot B. , 1989: Rainfall estimation using TIR Meteosat imagery over the Western Sahel. Preprints, Fourth Conf. on Satellite Meteorology and Oceanography, San Diego, CA, Amer. Meteor. Soc., 126–129.
Davolo, E., and Naim P. , 1991: Neural Networks. McMillan Education, 145 pp.
Flitcroft, I. D., Milford J. R. , and Dugdale G. , 1989: Relating point to area average rainfall in semi-arid West Africa and the implications for rainfall estimates derived from satellite data. J. Appl. Meteor., 28 , 252–266.
Grimes, D. I. F., and Diop M. , 2003: Satellite-based rainfall estimation for river flow forecasting in Africa. Part I. Rainfall estimates and hydrological forecasts. Hydrol. Sci. J., 48 , 567–584.
Grimes, D. I. F., Pardo E. , and Bonifacio R. , 1999: Optimal areal rainfall estimation using raingauges and satellite date. J. Hydrol., 222 , 93–108.
Hecht-Nielsen, R., 1990: Neurocomputing. Addison-Wesley, 433 pp.
Herman, A., Kumar V. B. , Arkin P. A. , and Kousky J. V. , 1997: Objectively determined 10-day African rainfall estimates created for famine early warning. Int. J. Remote Sens., 18 , 2147–2160.
Hsieh, W. W., and Tang B. , 1998: Applying neural network models to prediction and data analysis in meteorology and oceanography. Bull. Amer. Meteor. Soc., 79 , 1855–1870.
Hsu, K., Gao X. , Sorooshian S. , and Gupta H. V. , 1997: Precipitation estimation from remotely sensed information using artificial neural networks. J. Appl. Meteor., 36 , 1176–1190.
Journel, A. G., and Huijbregts C. J. , 1978: Mining Geostatistics. Academic Press, 600 pp.
Kidd, C., 1998: On rainfall retrieval using polarisation corrected temperatures. Int. J. Remote Sens., 19 , 981–996.
Kidd, C., 2001: Satellite rainfall climatology: A review. Int. J. Climatol., 21 , 1041–1066.
Laurent, H., Jobard I. , and Toma A. , 1998: Validation of satellite data and ground-based estimates of precipitation over the Sahel. Atmos. Res., 47 , –48. 651–670.
Lonbladd, L., Peterson C. , and Rognvaldsson T. , 1991: Pattern recognition in high energy physics with artificial neural networks. Comput. Phys. Comm., 70 , 167.
Milford, J. R., and Dugdale G. , 1990: Estimation of rainfall using geostationary satellite data. Applications of Remote Sensing in Agriculture, Proceedings of 48th Easter School in Agricultural Science, Butterworth.
Morland, J. C., Grimes D. I. F. , and Hewison T. J. , 2001: Satellite observation of the microwave emissivity of a semi-arid land surface. Remote Sens. Environ., 77 , /2. 149–164.
Snijders, F. L., 1991: Rainfall monitoring based on Meteosat data—A comparison of techniques applied to the Western Sahel. Int. J. Remote Sens., 12 , 1331–1347.
Sorooshian, S., Hsu K-L. , Gao X. , Gupta H. V. , Imam B. , and Braithwaite D. , 2000: Evaluation of PERSIANN system satellite-based estimates of tropical rain. Bull. Amer. Meteor. Soc., 81 , 2035–2046.
Thorne, V., Coakley P. , Grimes D. , and Dugdale G. , 2001: Comparison of TAMSAT and CPC rainfall estimates with rainfall, for southern Africa. Int. J. Remote Sens., 22 , 1951–1974.
Todd, M. C., Barrett E. C. , and Beaumont M. J. , 1995: Satellite identification of raindays over the upper Nile River basin using an optimum infrared rain/no-rain threshold temperature model. J. Appl. Meteor., 34 , 2600–2611.
Todd, M. C., Kidd C. , Kniveton D. , and Bellerby T. , 2001: A combined satellite infrared and passive microwave technique for estimation of small-scale rainfall. J. Atmos. Oceanic Technol., 18 , 742–755.
Tsintikidis, D., Haferman J. L. , Anagnostou N. , Krajewski W. F. , and Smith T. F. , 1997: A neural network approach to estimating rainfall from spaceborne microwave data. IEEE Trans. Geosci. Remote Sens., 35 , 1079–1092.
Weigend, A., Huberman B. , and Rumelhart D. , 1991: Predicting sunspots and exchange rates with connectionist networks. Non-linear Modelling and Forecasting, S. Eubank and M. Casdagli, Eds., Addison-Wesley, 1–36.
Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences. Academic Press, 467 pp.
Xie, P., and Arkin P. A. , 1996: Analyses of global monthly precipitation using gauge observations, satellite estimates, and numerical model predictions. J. Climate, 9 , 840–858.

Fig. 1. Locations of gauges used for calibration and validation during this study. The inset shows the location of Zambia in Africa.
Citation: Journal of Hydrometeorology 4, 6; 10.1175/1525-7541(2003)004<1119:ANNATR>2.0.CO;2

Fig. 2. Comparison of gauge and kriged pixel rainfall amounts. The solid line indicates one-to-one correspondence.

Fig. 3. First principal component of surface relative humidity over the study period.

Fig. 4. Schematic diagram of the artificial neural network architecture. (a) Initial network. (b) Final network.

Fig. 5. Mean input weights for optimized TAMANN algorithm. Weights are shown for five principal components of relative humidity (RH), horizontal velocity (U, V) and vertical velocity (W400, W700); altitude (h) and CCD at temperature thresholds of −30°, −40°, −50°, −60°C. Note that vertical lines are plotted between points to aid identification of the various years; they have no physical significance. The year indicated in each case is the validation year for which the weights were applied.

Fig. 6. Time series of observed rainfall and amplitude of the first EOF of relative humidity averaged over validation pixels for one season.

Fig. 7. Scatterplots of rainfall estimation against kriged pixel data for all time scales: (a) pixel-day, (b) daily pixel mean, (c) 10-day pixel mean; (left) TAMCCD, (right) TAMANN2. The solid lines indicate a one-to-one relationship.

Fig. 8. Time series of daily average rainfall for 1997/98.

Fig. 9. (top-left) TAMANN2 rainfall estimate (mm) 30 Mar 1997; (top-right) TAMCCD rainfall estimate (mm) 30 Mar 1997; (bottom-left) CCD image (Tt = −30°C); (bottom-right) CCD image (Tt = −60°C). For the rainfall estimates, the range is from gray = 2 mm to red = 30 mm.

Table 1. Total explained variance for the first five principal components of the NWP model parameters used in this study.

Table 2. Contingency table to determine optimum rain–no-rain threshold temperature Tt.

Table 3. Input parameters for the initial and final versions of the ANN.

Table 4. Calibration parameters for TAMSAT rainfall estimation.

Table 5. Summary statistics for estimation algorithms at various space and time scales: (a) pixel-day, (b) areal mean pixel-day, (c) areal mean pixel average over 10 days. For all years, the mean pixel rain per rain-day is between 8 and 9 mm.

