For the summer monsoon seasons of 1995, 1996, and 1997 the day-1 to day-4 forecasts of precipitation from both the National Centre for Medium Range Weather Forecasting (NCMRWF) and the European Centre for Medium-Range Weather Forecasts (ECMWF) models reproduce the main features of the observed precipitation pattern when averaged over the whole season. On average, fewer than 30% of all rain gauge stations in India report rain on a given day during the monsoon season. The frequency of observed rainy days increases to 41% after spatial averaging over ECMWF model grid boxes and to 50% after spatial averaging over NCMRWF model grid boxes. The NCMRWF model forecasts have 10%–15% more rainy days, mostly in the light or moderate precipitation categories, when compared with the spatial average of observed values. Seasonal accumulated values of the all-India average precipitation show a slight increase with the forecast lead time for the NCMRWF model and a small decrease for the ECMWF model. The weekly accumulated values of forecast precipitation from both models, averaged over the whole of India, are in good phase relationship (∼0.9 in most cases) with the observed value for forecasts with a lead time up to day 4. Values of statistical parameters, based on the frequency of occurrence in various classes, indicate that the NCMRWF model has some skill in predicting precipitation over India during the summer monsoon. The NCMRWF model forecasts have higher trend correlation with the observed precipitation over India than do the ECMWF model forecasts. The mean error in precipitation is, however, much less in the ECMWF model forecasts, and the spatial distribution of seasonal average medium-range forecasts of ECMWF is closer to that observed along the west coast mountain ridgeline.
The summer monsoon over India is a component of the Asian monsoon system and has its own characteristics—onset at the southern tip of India during late May to early June and subsequent northward progress in association with a low-level trough, an established phase covering July and August, and a withdrawal phase from the middle of September onward (Rao 1976; Webster et al. 1998; Annamalai et al. 1999). For the purpose of rainfall statistics, the summer monsoon season over India is assumed to be a fixed period of 122 days from 1 June to 30 September. The low-level trough, in the wind field, locally called the “monsoon trough,” is a part of the intertropical convergence zone (ITCZ) with westerly wind to the south of it and easterlies to the north. Because of land–sea asymmetry, the monsoon trough over India is located far to the north of the global average of the ITCZ during the same period. The summer monsoon season is meteorologically most important for India because more than 80% of the land area gets about 90% of its annual precipitation during this period. It is, therefore, the most suitable period for verification of the model-produced quantitative precipitation forecasts.
The National Centre for Medium Range Weather Forecasting (NCMRWF) located at New Delhi, India, is a meteorological data assimilation/forecast center that produces 5-day forecasts of global weather on a daily basis. The forecast model is a global spectral model with 80 waves in triangular truncation (T80), with parameterizations of all known physical processes. This model originated from the National Centers for Environmental Prediction (NCEP)–National Center for Atmospheric Research (NCAR) model (Kalnay et al. 1988), and its main features during the period of this study are listed in Table 1. In the present work, precipitation forecasts from this model are compared with the observed precipitation over India. In addition, precipitation forecasts have been obtained from the European Centre for Medium-Range Weather Forecasts (ECMWF) for comparison with the precipitation forecasts produced at NCMRWF.
During the summer monsoon in India, air close to the ground is warm (30°–35°C) and moist, with relative humidity often exceeding 90%. The atmosphere is conditionally unstable with a lapse rate close to 6.5°C km⁻¹ and has large easterly vertical wind shear of a magnitude of 3–5 m s⁻¹ km⁻¹ (Rao 1976). These conditions lead to the frequent formation of nimbostratus and cumulonimbus of shorter heights, with tops reaching 6–8 km above ground. In this season large-scale precipitation occurs only in association with some synoptic-scale system, like a low pressure area or the monsoon trough. Persistent precipitation occurs on the windward sides of topographic obstacles that provide upward motion by forcing air to rise and pass over them. During the summer monsoon there are two regions of persistent upward motion due to topographic forcing: one located along the whole of the west coast and another smaller one located in the northeast of India to the north of Bangladesh (Rao 1976). Both of these regions have accumulated seasonal precipitation in excess of 160 cm, while the all-India average is about 91 cm (Basu 2001). Another region of large precipitation (IMD 1981) is present in the eastern part of the country to the south of the seasonal average location of the eastern part of the monsoon trough. In contrast to the other two regions, the large seasonal total precipitation over the eastern region is not due to topographical forcing, but is due to dynamical forcing produced by the generation of cyclonic circulations near the eastern end of the monsoon trough dipping into the Bay of Bengal (Rao 1976). This dynamical forcing and the accompanying region of large precipitation are absent in the climate simulations of most atmospheric models (Gadgil et al. 1998). Other important features of the seasonal accumulated precipitation are the rain shadow on the leeward side of the west coast topography and the region of scanty precipitation over the desert to the west of the country.
Rainfall decreases drastically to the east of the west coast ridge, and the seasonal total precipitation amounts are less than 50 cm. For a numerical model of the atmosphere to be successful in predicting summer monsoon precipitation over India, the first step is to reproduce these characteristic patterns in the seasonal accumulated values.
During the summer monsoon, the precipitation regime over India can be divided into two classes—one due to westerly winds from the Arabian Sea and the other due to easterly winds from the Bay of Bengal (Fig. 10 in Annamalai et al. 1999). Though there is no strict demarcation between regions getting rain from the westerly or easterly winds, usually rain over the southwest and west parts of the country is due to winds from the Arabian Sea, while rain over the east, northeast, and central part of the country is due to winds from the Bay of Bengal. Often, one precipitation regime performs better than the other in a particular year, and this variation contributes to the interannual variability of the Indian summer monsoon.
In global atmospheric prediction models only the vapor phase of water is carried as a prognostic variable, and precipitation is computed as a by-product of the principle of conservation of water vapor. Precipitation is produced by saturation due to convergence of moisture and resultant upward motion. Parameterization schemes for large-scale condensation in layer clouds, and for convection of both shallow and deep types, are used to take care of the saturation of water vapor. In all of the parameterization processes a fraction of the water vapor condensing in the cloud is allowed to fall out of the cloud base; part of this is allowed to evaporate as it falls through unsaturated layers below the cloud base, and the rest reaches the ground as precipitation. The parameterization packages for the computation of precipitation and some other characteristics of both the ECMWF and NCMRWF models are listed in Table 1.
It is difficult to verify precipitation forecasts by the standard methods prescribed by the World Meteorological Organization (WMO) (WMO 1992), because precipitation is not an analysis variable and observations are available only at irregular points. These difficulties are further compounded by the large temporal and spatial variability of precipitation. Because of this, no standard method has been developed for the verification of model forecasts of precipitation, and most of the operational centers either validate their medium-range forecasts against the short-range forecast (guess) field, or compute statistical parameters (Murphy and Winkler 1987) that are related to the skill of the forecast based on the model predictions at regular grid points, and gauge and radar measurements at irregular observation points.
Data and methodology
In this work we have collected the observed precipitation amounts for the monsoon seasons of three consecutive years (1995–97) from the archives of the India Meteorological Department (IMD), and studied their temporal and spatial distribution. Also, area averages have been computed to verify the model forecasts of precipitation over the Indian land area. The data that are used are the daily rainfall amounts that have accumulated over the past 24 h. Some of the precipitation observations are reported by about 500 surface observatories maintained by IMD, and the rest are from part-time observatories that are maintained by IMD, or rain/snow gauges that are maintained by various states and hydrological authorities. The National Data Center of IMD collects data from all of these stations and archives them after quality-control checks. Because this process takes considerable time, data are usually less numerous for the later years. The numbers of stations for which quality-controlled precipitation data are available are 2229, 1716, and 1333 for the years of 1995, 1996, and 1997, respectively. The spatial distribution of data is highly inhomogeneous with larger density over the southern part of the country and sparse distribution over the extreme western part covering the Thar Desert (24.5°–30.0°N, 69.5°–76.0°E). The precipitation is collected by the standard, manually operated rain gauges and is measured up to the first decimal place in millimeters.
For the comparison of the irregularly spaced observation of precipitation with the regularly spaced gridpoint predictions from a numerical model, the station data are averaged over an area around each grid point. It is assumed that in a model the grid point closest to a small area represents the condition over it. Hence, the gridpoint value is representative of the average condition over the whole area that is associated with that grid point. For the comparison of observed and model-predicted values of precipitation, we compute the average of the observed precipitation over the area associated with each grid point. In this work, the area associated with each grid point (henceforth called grid box) is obtained by drawing the bisectors of lines joining the nearest grid points of the model. The 847 grid boxes of the ECMWF model that approximately cover the land area of India are shown in Fig. 1. The NCMRWF model has 135 grid boxes (Basu 2003) covering almost the same area.
Because the observing stations are irregularly spaced, the grid-box average of observed precipitation is a weighted mean of the individual observations contributing to the box. The weights are computed by the Thiessen polygon method (WMO 1994) in which the area associated with each grid box is divided into a large number of smaller areas, and each one of these smaller areas is assigned to the observing station nearest to it. The total area assigned to a station, normalized by the grid-box area, is the weight of that station in the computation of the average over the grid box. The spatial distribution of rain gauges contributing to each grid box is shown in Fig. 2. In the Thiessen method of averaging, rain gauges outside of a grid box may also contribute to the average, as is illustrated in Fig. 3. The spacing of the ECMWF forecast model’s transform grid, on which the precipitation is computed, is about 60 km near the equator, and, hence, each grid box covers an area of approximately 3600 km². Assuming that each rain gauge represents an area of 1962.5 km² (a circle with a radius of 25 km; Upadhyay et al. 1990), two or more observations are required for the computation of a reliable grid-box average. It is seen from Fig. 2 that most of the grid boxes over India have between two and five rain gauges contributing to them, while some areas, mostly confined to the desert areas in the west of the country, have contributions from fewer than two rain gauges. A small area to the extreme south has contributions from more than 15 rain gauges per grid box.
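The Thiessen weighting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the box is treated as a rectangle in planar kilometre coordinates, and the function names, gauge positions, and rainfall values are hypothetical.

```python
def thiessen_weights(box, gauges, n=50):
    """Approximate Thiessen weights of gauges for one rectangular grid box.

    box    : (x_min, x_max, y_min, y_max) in km (assumed planar coordinates)
    gauges : list of (x, y) gauge positions in km; gauges may lie outside the box
    n      : number of elementary cells along each side of the box
    """
    x_min, x_max, y_min, y_max = box
    dx = (x_max - x_min) / n
    dy = (y_max - y_min) / n
    counts = [0] * len(gauges)
    for i in range(n):
        for j in range(n):
            # Centre of the elementary cell (i, j)
            cx = x_min + (i + 0.5) * dx
            cy = y_min + (j + 0.5) * dy
            # Assign the cell to the nearest gauge (squared distance suffices)
            nearest = min(
                range(len(gauges)),
                key=lambda k: (gauges[k][0] - cx) ** 2 + (gauges[k][1] - cy) ** 2,
            )
            counts[nearest] += 1
    total = n * n
    # Area assigned to each gauge, normalized by the box area
    return [c / total for c in counts]


def box_average(weights, rain):
    """Weighted mean of gauge observations for the box."""
    return sum(w * r for w, r in zip(weights, rain))


# Hypothetical 60 km x 60 km box with two gauges inside and one just outside
box = (0.0, 60.0, 0.0, 60.0)
gauges = [(15.0, 30.0), (45.0, 30.0), (70.0, 30.0)]
rain_mm = [10.0, 20.0, 40.0]

w = thiessen_weights(box, gauges)
avg = box_average(w, rain_mm)
```

In this made-up layout the third gauge lies 10 km outside the box yet still receives a small weight, which is the behaviour the text attributes to gauges close to a box boundary. Because the weights depend only on the station geometry, they could be computed once and reused for every day on which the reporting network is unchanged.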
The observed precipitation is the accumulated value for a 24-h period, ending at 0300 UTC of the day of reporting. The NCMRWF model forecasts are initiated from the 0000 UTC analysis of the starting date, while ECMWF forecasts are initiated from 1200 UTC of the day. For the purpose of comparison, the NCMRWF model forecasts are accumulated between 3 and 27, 27 and 51, 51 and 75, and 75 and 99 h. These periods are henceforth referred to as day-1, day-2, day-3, and day-4 forecasts, respectively. For the ECMWF model, the accumulated precipitation between successive 0000 UTC forecasts is used (ECMWF 1997).
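As a small illustration of the NCMRWF accumulation windows, the following Python sketch generates the lead-time intervals quoted above and differences a cumulative forecast total across one of them. The function names and the cumulative values are hypothetical; the 3-h offset is taken from the text (forecasts start at 0000 UTC, observations end at 0300 UTC).

```python
def ncmrwf_windows(n_days=4, offset_h=3, length_h=24):
    """Return {forecast day: (start hour, end hour)} of lead time.

    offset_h aligns a forecast started at 0000 UTC with the
    24-h observation period that ends at 0300 UTC.
    """
    return {
        d: (offset_h + (d - 1) * length_h, offset_h + d * length_h)
        for d in range(1, n_days + 1)
    }


def accumulate(cumulative_by_hour, start_h, end_h):
    """Precipitation falling between two lead times, from cumulative totals."""
    return cumulative_by_hour[end_h] - cumulative_by_hour[start_h]


windows = ncmrwf_windows()

# Hypothetical cumulative precipitation (mm) at selected forecast hours
cumulative = {3: 1.0, 27: 6.5, 51: 9.0, 75: 15.0, 99: 15.5}
day1_mm = accumulate(cumulative, *windows[1])
```

Here `windows` holds the pairs (3, 27), (27, 51), (51, 75), and (75, 99) for days 1 to 4.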
There are other methods of computing the spatial average of precipitation over an area from the discrete values of rain gauge observations. The Thiessen method is simple because the weight assigned to a station depends on the geographical distribution of stations and is fixed as long as the observing network remains the same. The computer algorithm is also simple because it divides the area of the grid box into a large number of elementary areas and assigns each one of them to the nearest rain gauge. In this method, the rain gauges outside of, but close to, a grid box can contribute to the area average of precipitation for that box.
Spatial and temporal distribution of observed precipitation
The summer monsoon seasons over India were normal during 1995–97 (IMD 1995, 1996, 1997), and the seasonal totals averaged over the whole of India were 97%, 102%, and 102% of the long period (30 yr) mean, respectively. The seasonal total averaged over the whole of the country is normal if it is within 10% (larger or smaller) of the long period mean. For averaged precipitation over smaller areas, the threshold is 19% of the long period mean.
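The normality criterion can be written as a one-line check. The sketch below is illustrative only; the function name is hypothetical, and treating the threshold as inclusive is an assumption not stated in the text.

```python
def is_normal(percent_of_mean, threshold=10.0):
    """True if a seasonal total, in percent of the long period mean,
    lies within +/- threshold percent of that mean (inclusive bound assumed)."""
    return abs(percent_of_mean - 100.0) <= threshold


# All-India seasonal totals quoted in the text, in percent of the long period mean
seasons = {1995: 97.0, 1996: 102.0, 1997: 102.0}
all_normal = all(is_normal(p) for p in seasons.values())

# For smaller areas the text quotes a wider threshold of 19%
small_area_ok = is_normal(85.0, threshold=19.0)
```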
For a more detailed study of its distribution, the precipitation amounts are divided into six main categories, as given in Table 2. The limits of these categories are similar to those defined by IMD (IMD 2002) and are used for all subsequent work with both observed precipitation and grid-box averages.
Averaging precipitation over the grid box reduces the magnitude of the heaviest precipitation because other observations contributing to the averaging process have smaller magnitudes. The heaviest NCMRWF grid-box-averaged precipitations are 34.8 cm on 12 July 1995, 29.1 cm on 26 June 1996, and 41.1 cm on 29 July 1997. The location of the heaviest precipitation may also shift in space as well as time, favoring events with widespread heavy to very heavy precipitation over events with isolated cloudbursts. The number of occurrences in the trace/no-rain category also decreases, because some stations contributing to the average may record comparatively larger amounts of precipitation. The frequency of occurrence in the intermediate, that is, the light, moderate, and rather heavy, categories increases due to averaging over the box area. For the 3-yr period under study, the NCMRWF grid-box-averaged precipitation has 50.0% frequency in the trace category and 0.8% frequency in the heavy and very heavy categories together (Fig. 4).
For the same period, averaging over the ECMWF model grid boxes (almost squares, with sides of ∼60 km) increases the number of rainy days to about 41%, again by increasing the number of occurrences in the light and moderate categories. The frequency of occurrence of heavy and very heavy precipitation together is 1.3%, which is more than that in the NCMRWF box average, but is less than the 1.7% reported in the observation.
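The effect of averaging on the category frequencies can be illustrated with a short Python sketch. The category limits below are placeholders chosen for illustration, because Table 2 is not reproduced in this text; the function names and station values are likewise hypothetical.

```python
from bisect import bisect_right

# Placeholder lower limits (mm per day) of the five rainy categories;
# these are assumptions, not the limits of the paper's Table 2.
LIMITS = [2.5, 7.5, 35.5, 64.5, 124.5]
NAMES = ["trace/no rain", "light", "moderate",
         "rather heavy", "heavy", "very heavy"]


def category(amount_mm):
    """Name of the category into which a daily amount falls."""
    return NAMES[bisect_right(LIMITS, amount_mm)]


def frequencies(amounts):
    """Percentage frequency of occurrence in each category."""
    counts = {name: 0 for name in NAMES}
    for a in amounts:
        counts[category(a)] += 1
    n = len(amounts)
    return {name: 100.0 * c / n for name, c in counts.items()}


# Three hypothetical stations in one box on one day: an isolated downpour
station_mm = [0.0, 0.0, 150.0]
station_freq = frequencies(station_mm)

# Simple (unweighted) box mean of the same three values
box_mean = sum(station_mm) / len(station_mm)
box_category = category(box_mean)
```

With these placeholder limits, two of the three stations fall in the trace category and one in the very heavy category, whereas the box mean of 50 mm falls in an intermediate category. This mirrors the shift toward the light, moderate, and rather heavy classes that the text describes for grid-box averages.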
The spatial distribution of the seasonal accumulated precipitation in the T80 grid boxes over India, averaged for the summer monsoons of 1995, 1996, and 1997, is shown in Fig. 5. The two regions of heavy precipitation on the windward sides of hills and the rain shadow on the leeward side of the west coast hills are well represented in the 3-yr average of the seasonal accumulated values. The precipitation maximum in the eastern region is also present in Fig. 5, except that its shape is slightly different from that in the long-term average. The axis of the minimum precipitation is neither present in the average of these 3 yr, nor in the individual years (not shown). Another departure from the long-term (1951–80)-averaged seasonal rainfall distribution (IMD 1981) is the appearance of three centers of maximum in the precipitation field instead of isohyets parallel to the west coast hills. This indicates north–south inhomogeneities in the spatial distribution of precipitation along the coastal ridgeline.
The spatial distribution of rainy days over India during the summer monsoon is characterized by two regions of minimum: one at the extreme south and to the lee of the west coast ridgeline, and the other to the extreme northwest, covering the Thar Desert. The latter is due to the presence of an east–west circulation in which the air ascending over the northeast parts of the country during the summer monsoon moves westward and descends over the northwest of the country (Das 1962). The 3-yr average of the number of rainy days in the T80 grid boxes (not shown) has a spatial distribution similar to that of the rainfall, with the highest value exceeding 100 days per season over the northeast and the lowest value of less than 10 days over the desert in the northwest. The number of rainy days along the west coast is about 80 per season, which is about 20% less than the number over the northeast. The occurrence of heavy and very heavy precipitation is more numerous over the northeast than along the west coast, but the occurrence of moderately heavy precipitation is much greater in the grid boxes along the west coast. The seasonal totals are about the same for grid boxes over the northeast and along the west coast. On average, a T80 grid box receives rain only 50% of the time, that is, about 60 days in a season.
During the 3-yr period under study, the performance of the summer monsoon was normal over India in terms of the all-India-averaged seasonal accumulated precipitation. However, there is both spatial and year-to-year variability (Singh et al. 1992), even when the space–time averages are close to each other. This variability has important consequences for agriculture, water resource management, irrigation, and the generation of hydroelectricity, and is an important feature that should be reproduced for skillful prediction by an atmospheric model (Challinor et al. 2003; Parthasarathy et al. 1992; Hansen 2002; Gadgil et al. 2002). In 1995 (Fig. 6) the areas in the extreme north and the northeast recorded a large, positive departure in rainfall (from the 3-yr average), even though the country as a whole received less than the long-term mean (IMD 1995). The former is due to the incursion of active westerly wind (western disturbances) into the north of the country. The excess precipitation over the northeast is associated with that mode of the Bay of Bengal regime of the Indian summer monsoon precipitation in which the monsoon current does not turn westward at the head of the bay, but continues its journey northeastward into the south of China (Krishnamurthy and Shukla 2000). This mode also results in deficient precipitation over the central plains of India, as seen in 1995.
The intraseasonal variability (Webster et al. 1998; Sperber et al. 2000; Krishnamurthy and Shukla 2000; Goswami and Ajayamohan 2001) is another important feature of the Indian summer monsoon that needs to be reproduced for an atmospheric model prediction to be successful. In general, there is an increase in weekly rainfall at the beginning of the season that is coincident with the onset of the summer monsoon over the southern tip of India, and a gradual increase up to July when the whole of the country is under the influence of the summer monsoon circulation. The weekly accumulated rainfall values decrease after August because the monsoon circulation starts withdrawing southward from the west and central parts of the country. The withdrawal process is complete over most of the country, except for the east coast of the extreme south, by the end of September. This general trend is, however, modulated by events of enhanced precipitation activity that are usually associated with one or more synoptic-scale weather systems, like low pressure areas of various intensity or extratropical influence due to incursions of westerly troughs to the north of the country. The latter situation led to the highest weekly accumulated all-India precipitation in 1995 during the 14th week of the monsoon season, between 30 August and 5 September. An active spell of the Indian summer monsoon lasts for 2–7 days and is usually followed by a period of lesser activity before another period of enhanced activity starts. Thus, the temporal distribution of weekly precipitation (Fig. 7) often shows two or more crests with low values in between.
During June an active monsoon epoch is usually associated with enhanced activity in the westerly wind regime of the monsoon, while in September such activity is more often due to interaction with troughs in the extratropical westerlies, moving across the north of the country, or enhanced activity of the easterly winds from the Bay of Bengal in association with a low pressure system moving westward along the seasonal monsoon trough. During July and August, the peak phase of the summer monsoon, the periods of enhanced all-India precipitation usually coincide with greater activity of both regimes of the monsoon. It has been shown by Mukherjee and Shyamala (1986) that a strong vortex in the central part of the country induces stronger-than-normal westerly flow over the Arabian Sea. Because the monsoon air picks up moisture mostly from the east Arabian Sea (Ghosh et al. 1978), increased wind speed over this area results in more evaporation and an increased moisture content; the moisture converges at the west coast, because of a change in surface friction, and leads to increased rainfall. In all of the years considered here, a maximum in the weekly precipitation (Fig. 11) is present between 19 July and 2 August (weeks 8 and 9) when both the westerly and easterly wind regimes were active with enhanced precipitation over the west coast and the eastern and central parts of India. Out of the seasonal total of about 94.0 cm (all-India average for 3 yr), more than 9.0 cm is recorded in 1 week of enhanced precipitation.
Characteristics of model-predicted precipitation
We now proceed to examine the NCMRWF and ECMWF model-produced day-1 to day-4 forecasts of precipitation to determine whether the main patterns of seasonal accumulated values, the periods of enhanced precipitation, and the year-to-year and the intraseasonal variabilities, as observed, are present in these short- to medium-range series of forecasts for the monsoon seasons of 1995, 1996, and 1997. In section 5 we compute the values of the verification parameters for a quantitative estimation of the usefulness of the model forecasts. For the purpose of comparison with model output, the observed precipitation is spatially averaged over each model’s grid box. The results of such averaging over the NCMRWF model grid box (almost a square, with sides of ∼150 km) are shown in Figs. 5 and 6 above. The grid box of the ECMWF model (T213) is approximately 3600 km² in area. Averaging over such a small area brings out much more detail and also highlights the areas of excess precipitation, especially in the regions of inhomogeneity. The width of the area along the west coast, receiving a large amount of precipitation, shrinks considerably on averaging over the smaller grid box (not reproduced), and more centers of local precipitation maximum appear. The largest seasonal totals are now of magnitudes of 606.7, 528.4, and 559.6 cm for 1995, 1996, and 1997, respectively, instead of 466.3, 322.4, and 308.3 cm for averaging over the NCMRWF grid boxes. From these results and the 1° box averaging, reported earlier (Basu 2003), it is obvious that spatial averaging over a smaller area makes the resultant precipitation field more characteristic of the rain gauge observation. The magnitude of the all-India precipitation, averaged over all of the grid boxes, is, however, less sensitive to the size of the area of averaging: the values obtained with the different box sizes lie within 2% of each other.
The frequency distribution of the NCMRWF precipitation forecasts (day 1 to day 4) in different categories is shown in Fig. 4. It is seen that the number of rainy days is 10%–15% higher in the NCMRWF model forecasts than that observed, and this number decreases with increasing forecast lead time. This is mainly due to the higher-than-observed frequencies in the light and moderate categories, which decrease with increasing forecast length and more than compensate for the reverse trend in the rather heavy category. In the ECMWF model forecasts the number of rainy days is higher by about 6%, again due to higher frequencies in the light and moderate categories. In this case, the frequency of occurrence does not change much with the forecast lead time, except in the heavy category for which the frequency decreases with the forecast lead time. In forecasts of both models the trend in the seasonal accumulated all-India precipitation (Fig. 10) is similar to the trend of the rather heavy category (Fig. 4) of precipitation.
The 3-yr (1995–97) mean precipitation from the NCMRWF and the ECMWF models for day-3 forecasts accumulated over the monsoon season is reproduced in Fig. 8. In both forecasts the characteristic spatial distribution of precipitation, with regions of heavy precipitation windward of hills and in the eastern region and a rain shadow to the east of the ridgeline along the west coast, is reproduced. The area of minimum precipitation over the Thar Desert, with increasing trend both toward the east and south, is also predicted in forecasts of both of the models. These features are also reproduced in the seasonal total of medium-range forecasts from the Met Office and Japan Meteorological Agency (JMA) models (Basu et al. 1999). The effect of a better definition of topography, due to higher resolution in the ECMWF model, is reflected in the concentration of the heavy rainfall in a narrower strip along the west coast and in smaller areas in the foothills of the central and eastern Himalayas. The rain shadow area to the east of the ridgeline along the west coast of India is also better forecast in the higher-resolution model. However, the region of large precipitation in the east of the country, primarily due to the activity of the monsoon trough, is much smaller in area and is shifted toward the east in the ECMWF model forecasts, compared to that in the NCMRWF model forecasts.
One of the important features of the summer monsoon season over India is the enhanced precipitation epochs when the rainfall amounts over some regions of the country are much higher than the long-term average. One such event occurred in 1996 when a cyclonic storm crossed the east coast of India on 16 June. The next day some rain gauge stations under its influence reported all-time-record values for 24-h-accumulated precipitation, measuring up to 670 mm. A region of heavy precipitation was also observed (Fig. 9) on the same date along the west coast. Both models predicted these two regions of large precipitation, but spatially spread the precipitation over a larger area and predicted lesser magnitudes for the heaviest precipitation compared to the observed grid-box average. The low-resolution model of NCMRWF predicted 50 mm as the heaviest amount, compared to the observed value of 130 mm, while the ECMWF model predicted 130 mm, compared to the grid-box-averaged observed value of 200 mm.
The forecast models do not reproduce very well the spatial distribution of year-to-year variations in the day-3 forecasts of precipitation accumulated over a season, because they do not exhibit the same signal of variability, as that observed, during the 3-yr period under study. In 1995, the ECMWF model predicted less precipitation (relative to the 3-yr mean) over most of the country, while the NCMRWF model predicted more. In reality (Fig. 6), central parts of the country received less precipitation than the 3-yr average, but the remaining parts had excess.
The all-India-averaged precipitation, as forecast by the global models, is plotted in Fig. 10 for various forecast lead times. It appears that the NCMRWF model predicted much more precipitation in 1995, and the average precipitation over India is about 1.5 times that forecast for 1996 or 1997. Because the NCMRWF forecast–assimilation system was new in 1995, there may have been an overestimation of the moisture field over India. It is evident from Fig. 10 that the day-1 forecasts of all-India-averaged precipitation, accumulated over the monsoon season, from the two models are close to each other in both 1996 and 1997, but, subsequently, the precipitation amount is greater in the NCMRWF model forecasts, while it is less in the ECMWF model forecasts. It is seen from Fig. 4 that for the NCMRWF model, the increase in the frequency of the rather heavy category with the forecast lead time more than compensates for the decrease in the number of rainy days and frequency of the light and moderate categories of precipitation. For the ECMWF model forecasts, the number of rainy days and frequency of the light and moderate categories remain almost the same with the forecast lead time, but the number of occurrences in the rather heavy category decreases, leading to a decrease in the seasonal accumulated all-India precipitation.
The weekly accumulated precipitation during the monsoons of 1995–97, as forecast by the ECMWF (solid lines) and the NCMRWF (dashed lines) models, is shown in Fig. 11, along with the observed intraseasonal variation. It is seen that forecasts for different lengths of lead time are in phase with each other, indicating a consistency in the model forecast. The correlation coefficient (CC) between weekly accumulated precipitation forecasts from the two models, for the same forecast lead time, has values higher than 0.900, except for day-3 and day-4 forecasts in 1996 when the CC has values of 0.843 and 0.764, respectively. Forecasts from the two models are also similar in magnitude, except in 1995 when the NCMRWF model forecasts of precipitation have much larger values compared to the ECMWF model forecasts. The phase of all of the model forecasts is in general agreement with the observed phase of the intraseasonal variations. Thus, for the monsoon seasons of 1995, 1996, and 1997, both models forecast the intraseasonal phase variations consistently and correctly up to a forecast lead time of 4 days. The magnitudes of weekly accumulated all-India precipitation are also close to the observed, except for the NCMRWF model forecasts for 1995. Annamalai et al. (1999) found a single, robust mode, describing the latitudinal displacement of the tropical convergence zone from its oceanic to continental regime, and capturing the low-frequency active/break cycles, to be dominant in explaining the intraseasonal variability.
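The weekly accumulation and the correlation coefficient between two weekly series can be sketched as follows. This is a minimal illustration, assuming that weeks are consecutive 7-day blocks counted from the first day of the series and that a trailing partial week is dropped; the paper does not state its exact week boundaries, and the daily series used here are synthetic, not data from the study.

```python
import math


def weekly_totals(daily, days_per_week=7):
    """Sum a daily series into consecutive 7-day totals.

    Assumption: a trailing partial week is discarded.
    """
    n_weeks = len(daily) // days_per_week
    return [
        sum(daily[w * days_per_week:(w + 1) * days_per_week])
        for w in range(n_weeks)
    ]


def correlation(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic 122-day series (1 June to 30 September); values are made up
model_a = [float(d % 10) for d in range(122)]
model_b = [2.0 * v + 1.0 for v in model_a]  # same phase, larger magnitude

weeks_a = weekly_totals(model_a)
weeks_b = weekly_totals(model_b)
cc = correlation(weeks_a, weeks_b)
```

The two synthetic series differ in magnitude but vary together, so their CC is essentially 1. This shows why a high CC indicates agreement in phase only, and why magnitudes are compared separately in the text.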
The results presented above show that both models predict more than the observed occurrences of precipitation, mainly in the low-magnitude categories. The high-resolution (T213) ECMWF model has less difference (∼6%) from the observed frequency of rain than does the low-resolution NCMRWF model (10%–15%). The seasonal accumulated values of all-India rainfall are close to those actually observed, and the weekly accumulated values are in good agreement with the observed intraseasonal variations.
Verification of forecast precipitation
The model forecasts have two types of error—the systematic one that is due to a lack of data, inadequate prescription of physical processes, inaccurate numerical methods, or a deficiency in model formulation, and the random one that is due to internal variability. The average of the forecast errors over a long period of time is a measure of the systematic part of the forecast error, while root-mean-square error (rmse) or mean absolute error are measures of the random component of the forecast error. The correlation coefficient between trends in the forecast and observation is a measure of the phase relationship between them.
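The error measures named above can be sketched for a single grid box as follows. The function name and the sample values are hypothetical. The sketch also computes an rmse with the mean error removed, which is not a quantity used in this paper; it is included because the ordinary rmse contains the mean error, through the identity rmse² = (mean error)² + (bias-removed rmse)².

```python
from math import sqrt

def error_stats(forecast, observed):
    """Mean error, mean absolute error, rmse, and bias-removed rmse."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("series must be non-empty and of equal length")
    errors = [f - o for f, o in zip(forecast, observed)]
    n = len(errors)
    mean_error = sum(errors) / n                  # systematic part
    mae = sum(abs(e) for e in errors) / n
    rmse = sqrt(sum(e * e for e in errors) / n)
    # Spread of the errors about their own mean (random part only).
    random_rmse = sqrt(sum((e - mean_error) ** 2 for e in errors) / n)
    return {"mean_error": mean_error, "mae": mae,
            "rmse": rmse, "random_rmse": random_rmse}

# Hypothetical daily precipitation (mm) at one grid box.
forecast = [12.0, 0.0, 7.0, 20.0, 3.0]
observed = [10.0, 1.0, 4.0, 25.0, 0.0]
stats = error_stats(forecast, observed)
```

When the mean error is small, as for the ECMWF forecasts discussed below, the two rmse values nearly coincide and the error is almost entirely random.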
The three-season-averaged forecast errors in precipitation for day-4 forecasts from both the NCMRWF and ECMWF models are presented in Fig. 12. The NCMRWF model forecasts exceed the observed values over the eastern region, along the west coast, and over the western Himalayas, while these forecasts fall short of the observed values in small patches over the west and north of the country. Over the rest of the country forecast values are within 2 mm of the observed average daily precipitation. The bias toward higher precipitation is due to the following two reasons: (a) precipitation amounts are very large in 1995 in comparison with the observed values, and (b) a slight increase in precipitation amounts with the forecast lead time is present over the Indian area. For day-1 forecasts in 1997 (Basu 2003) the areas of positive (excess) and negative (deficient) errors are almost equal, and the magnitude of error is limited to 6 mm over most of the country. The day-4 ECMWF forecast errors reflect the decrease in model precipitation amounts with the forecast lead time, and the small areas of positive errors over central India, present in the day-1 forecasts (not shown), are absent in the day-4 forecasts. When compared with the NCMRWF model, the magnitude of the forecast error in precipitation is much smaller for the ECMWF model, and the dipole in the error field, produced by the unrealistic spread of west coast topography, is absent in the ECMWF model forecasts. The mean forecast error of the ECMWF model is slightly more than that of the NCMRWF model over the northeast of India along the foothills of the eastern Himalayas. Annamalai et al. (1999) found the precipitation field in the ECMWF model reanalyses to be better climatologically than that in the NCEP–NCAR reanalyses.
The rmse of day-4 forecasts of both models has a magnitude between 10 and 25 mm, except over the northeast of the country where the magnitude of rmse exceeds 35 mm. Because the ECMWF model forecasts have smaller mean error, compared to the NCMRWF model forecasts, these errors are more random in nature, while the NCMRWF precipitation forecast errors are of a more systematic type. A large part of this systematic error is seen as the dipole across the west coast ridgeline.
The CC between the observed and the NCMRWF model forecast precipitation at day 1 is shown in Fig. 13 (left panel). Over most of the country, the magnitude of CC lies between 0.2 and 0.4, while over large parts of central and northern India, the magnitude of CC exceeds 0.4. Within this area there are regions where the magnitude of CC exceeds 0.6. This indicates that the trend in precipitation in the day-1 forecasts of the NCMRWF model is in good phase relationship with the observed trend over a large part of the country. The magnitude of CC decreases with the forecast lead time, and by day 4 CC values over most of India are between 0.2 and 0.4, except in pockets near the west coast where the CC values still exceed 0.4. For the ECMWF model the correlation between the day-1 forecast and the observed trends (Fig. 13, right panel) is less, and the magnitude of CC is less than 0.4 over most of the country. For a sample size of 366 (the number of days in three monsoon seasons), the correlation coefficient is significant at the 0.01 level for values exceeding 0.135.
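The significance threshold quoted above can be checked approximately with the standard relation between a critical test statistic and the critical correlation, r = t / sqrt(n − 2 + t²). The sketch below is a large-sample approximation under stated assumptions: it substitutes the normal quantile for the Student's t quantile, assumes a two-sided test, and treats the 366 daily values as independent. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def critical_correlation(n, alpha=0.01):
    """Approximate two-sided critical |r| for a sample of size n.

    Assumption: the normal quantile replaces the Student's t quantile
    with n - 2 degrees of freedom, which is reasonable for large n.
    """
    if n <= 2:
        raise ValueError("sample size must exceed 2")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z / sqrt(n - 2 + z * z)

r_crit = critical_correlation(366)
```

The approximation gives a value close to 0.134. The exact t quantile is slightly larger than the normal one, so the exact threshold is slightly higher, in line with the 0.135 quoted. Daily precipitation is usually serially correlated, so the effective sample size may be smaller than 366 and the true threshold correspondingly larger.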
The standard WMO method of the verification of outputs (WMO 1992) is not adequate for precipitation because of its great temporal and spatial variability. The statistical parameters based on the frequency of occurrences in various classes are more suitable for determining the skill of a model in predicting precipitation. In Fig. 14, the bias, probability of detection (POD), false-alarm rate (FAR), and threat score for classes with class marks of 0.256, 2.56, 6.4, 12.8, 19.2, 25.6, 38.4, 51.2, and 76.8 mm, corresponding to 0.01, 0.10, 0.25, 0.50, 0.75, 1.0, 1.5, 2.0, and 3.0 in., respectively, are presented. For the same class marks, the precipitation forecasts from the global spectral model at NCEP have similar values for the bias and equitable threat scores over the United States during June–September of 1995, 1996, and 1997 (available at the NCEP Web site), even though the strong signal of the monsoon is not present.
The bias of a model forecast is the ratio of the predicted number of occurrences of an event to the number of occurrences of the same event actually realized in nature. The NCMRWF model overpredicts (bias > 1) rainfall up to 12.8 mm (0.5 in.) in the 24-h forecast, while it underpredicts (bias < 1) events of a higher magnitude. The NCEP model also overpredicts events of a lower magnitude, but the crossover to underprediction occurs at a higher value close to 38.4 mm (1.5 in.). Also, for the NCEP model, the value of bias increases initially to exceed 1.4 for the class mark of 12.8 mm (0.5 in.) while the bias never exceeds 1.3 for the NCMRWF model. With increasing forecast lead time, the value of the class mark at the crossover point (from overprediction to underprediction) also increases. For day-3 and day-4 predictions, this value is close to 32.0 mm (1.25 in.). The ECMWF model forecasts also overpredict precipitation up to 12.8 mm (0.5 in.) and underpredict thereafter. The amount of overprediction is smaller (bias less than 1.2) when compared with the NCMRWF model predictions, and the shape of the curve indicates saturation in the bias near 0.4 for precipitation amounts above 51.2 mm (2 in.).
The threat score is the ratio of the number of correct model predictions of an event to the number of all such events in both observed and predicted data. A higher value of the threat score indicates better prediction, with a theoretical limit of 1.0 for a perfect model. The average threat score of the NCMRWF model forecasts for 1995, 1996, and 1997 over India starts close to 0.7 and then decreases to 0.3 near the 6.4-mm (0.25 in.) mark. This is better than the prediction of the NCEP model over the United States, because the latter has values less than 0.3 for the threat score for the same period of time. The ECMWF model forecasts have smaller values for the threat score than do the NCMRWF model forecasts for all classes of precipitation during the summer monsoon over India.
The FAR is the fraction of wrong predictions out of the total number of nonoccurrences of the event. For perfect prediction, the value of this parameter should be 0.0. In the present case, FAR is large for classes with a small class mark, but decreases markedly with an increase in class mark, and is practically zero for class marks above 25.6 mm (1.0 in.). The high values for both the threat score and FAR for lower class marks indicate that NCMRWF model-predicted occurrences of precipitation in these classes far exceed the observed frequencies. For ECMWF model forecasts, the values of FAR are very close to, but are less than, those of the NCMRWF model forecasts.
The POD is the fraction of observed occurrences of an event that are correctly predicted, and a higher value for the quantity indicates the ability of the prediction model to capture the occurrence of desired events. In the present case, the probability of detection is more than 50% for class marks below 12.8 mm (0.5 in.). The bias, threat score, FAR, and POD together provide an estimate of the quality of the model prediction. A comparison of the bias and threat score of the NCEP model predictions over the United States, with the values obtained for the present model, suggests that the quality of precipitation forecasts over India during the monsoons of 1995, 1996, and 1997 is similar to that of the NCEP model over the United States during the same period. The ECMWF model forecasts have a smaller POD value for all of the classes of precipitation.
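The four categorical measures can be sketched from a 2 × 2 contingency table as follows. Several points are assumptions made for the example rather than details taken from this study: an event is defined as precipitation at or above the class mark, POD is taken in its conventional form (hits divided by the number of observed events), FAR follows the definition given above (false alarms divided by the number of nonoccurrences), and the function names and sample values are hypothetical.

```python
def contingency_counts(forecast, observed, threshold):
    """Count hits, misses, false alarms, and correct negatives.

    Assumption: an event is a value at or above the threshold (mm).
    """
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast, observed):
        f_event, o_event = f >= threshold, o >= threshold
        if f_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif f_event:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives

def ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")

def categorical_scores(hits, misses, false_alarms, correct_negatives):
    return {
        "bias": ratio(hits + false_alarms, hits + misses),
        "threat": ratio(hits, hits + misses + false_alarms),
        "far": ratio(false_alarms, false_alarms + correct_negatives),
        "pod": ratio(hits, hits + misses),
    }

# Hypothetical 24-h precipitation (mm) at one grid box on eight days.
forecast = [0.0, 3.0, 8.0, 15.0, 1.0, 30.0, 0.5, 7.0]
observed = [0.0, 0.5, 10.0, 5.0, 0.0, 40.0, 7.0, 0.0]
counts = contingency_counts(forecast, observed, threshold=6.4)
scores = categorical_scores(*counts)
```

The FAR computed here, with nonoccurrences in the denominator, differs from the false-alarm ratio used in some verification texts, which divides false alarms by the number of predicted events; values from the two conventions should not be compared directly.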
The statistical parameter that directly provides an estimate of the potential usefulness of the model prediction is the skill score, which is the ratio of the number of correct predictions by the model above those obtained by chance to the number of total predictions above the number of correct predictions obtained by chance. The magnitude of the skill score lies between 0 (no skill) and 1 (perfect skill), and a large, positive value implies that the model predictions are correct not by chance alone. The skill scores of both models for the period spanning the three summer monsoon seasons of India are shown in Fig. 15. For the NCMRWF model, both day-1 and day-2 forecasts of precipitation have significant skill for precipitation with class mark up to 25.6 mm (1.0 in.). The skill falls off rapidly for larger precipitation amounts and also for longer forecast lead times. For the ECMWF model the skill in the forecast of precipitation over India is less than that in the forecasts of the NCMRWF model.
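The skill score as worded above can be written as (C − E) / (N − E), where C is the number of correct predictions, E the number expected to be correct by chance, and N the total number of predictions. The sketch below assumes that E is estimated from the marginal totals of the 2 × 2 contingency table, which gives the form usually called the Heidke skill score; that identification, the function name, and the counts are assumptions for illustration.

```python
def skill_score(hits, misses, false_alarms, correct_negatives):
    """Skill relative to chance, (C - E) / (N - E).

    Assumption: the chance-expected number of correct predictions E
    is estimated from the marginal totals of the contingency table.
    """
    n = hits + misses + false_alarms + correct_negatives
    if n == 0:
        return float("nan")
    correct = hits + correct_negatives
    expected = ((hits + misses) * (hits + false_alarms)
                + (correct_negatives + misses)
                * (correct_negatives + false_alarms)) / n
    if n == expected:
        # Only one category ever occurs, so skill is undefined.
        return float("nan")
    return (correct - expected) / (n - expected)

# Hypothetical counts: 2 hits, 1 miss, 2 false alarms, 3 correct negatives.
ss = skill_score(2, 1, 2, 3)
perfect = skill_score(3, 0, 0, 5)
```

With these counts, 5 of 8 predictions are correct while 4 would be expected by chance, giving a score of 0.25. In this form the score can also take small negative values when a forecast does worse than chance.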
The forecast errors of an atmospheric model are due to various reasons, such as a lack of predictability, error in the prescribed initial atmospheric conditions, deficiencies in the dynamics and physics of the model, and an inaccurate initial description of the boundary fields and their subsequent evolution during the forecast period. Part of this model error is random in nature while, frequently, a large part is systematic and is reflected in the monthly or seasonal mean. Assuming that the forecast–assimilation system has not undergone significant changes from previous years (as in this work), it is possible to correct some of the systematic error using the error statistics of earlier forecasts from the same system. Because the average error over the whole of the summer monsoon season is large for the NCMRWF model forecasts, especially over areas close to topography, it is worthwhile to attempt a statistical correction of the systematic error.
Because the observed precipitation values are available only for 3 yr, we choose the forecasts of each year, in turn, to be the control dataset from which the correction coefficients are computed, and then apply them to the forecasts for the other 2 yr. This way a set of six corrected forecasts is available to verify the suitability of the estimated corrections. The correction coefficients are obtained by fitting a polynomial between the observed and the forecast precipitation values. The coefficients of the polynomial are determined for each grid box and forecast hour by the least squares fit method. A linear relation is chosen to ensure that only the magnitudes are modified, without affecting the phase relations, thus leaving the trend correlations intact. The procedure that is adopted appears to remove a large part of the systematic errors (Basu 2003) close to the topography along the west coast and the northeast of India. The seasonal average of the statistically corrected NCMRWF model day-4 forecast differences from the observed precipitation for individual seasons, as well as the three-season mean, are shown in Fig. 16. The rmse of corrected day-4 forecasts, averaged over the six sets of forecasts for the years 1995, 1996, and 1997, is shown in Fig. 17. Shaded areas have values higher than 1 cm day−1, and there are only small patches of areas with rmses exceeding 2 cm day−1. It should be mentioned here that the above exercise is only a preliminary one, because statistical corrections should be derived from a long period of past data, covering 10 seasons or more. Better values of the coefficients computed above will be worked out when observed precipitation for later years (after 1997) is available.
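The rotation of control years described above can be sketched for a single grid box and forecast hour as follows. This is an illustration under stated assumptions, not the code used in this study: the function names and the five-value samples per year are hypothetical, and the corrected values are not clipped at zero.

```python
def fit_linear(forecast, observed):
    """Least squares coefficients a, b for observed ~ a * forecast + b."""
    n = len(forecast)
    mf, mo = sum(forecast) / n, sum(observed) / n
    sff = sum((f - mf) ** 2 for f in forecast)
    sfo = sum((f - mf) * (o - mo) for f, o in zip(forecast, observed))
    if sff == 0:
        raise ValueError("forecast series has no variance")
    a = sfo / sff
    return a, mo - a * mf

def apply_correction(forecast, a, b):
    """Apply the linear correction to a forecast series."""
    return [a * f + b for f in forecast]

# Hypothetical (forecast, observed) daily values in mm for one grid box.
data = {
    1995: ([2.0, 8.0, 15.0, 0.0, 6.0], [1.0, 5.0, 11.0, 0.0, 4.0]),
    1996: ([1.0, 4.0, 9.0, 12.0, 3.0], [2.0, 5.0, 8.0, 13.0, 2.0]),
    1997: ([0.0, 5.0, 7.0, 10.0, 2.0], [0.0, 6.0, 9.0, 9.0, 3.0]),
}

coeffs = {}     # control year -> (a, b)
corrected = {}  # (control year, target year) -> corrected forecast
for control, (fc, ob) in data.items():
    coeffs[control] = fit_linear(fc, ob)
    for target, (target_fc, _) in data.items():
        if target != control:
            corrected[(control, target)] = apply_correction(
                target_fc, *coeffs[control])
```

Three control years applied to two target years each give the six corrected sets. Because the correction is linear, the correlation between the corrected forecast and the observations equals that of the uncorrected forecast whenever the slope a is positive, which is why the trend correlations are left intact.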
The linear correction does not significantly improve the frequency distribution of precipitation in different classes, because the coefficients used to make the corrections are independent of the classes. Because most of the grid points underpredict seasonal total precipitation, the corrections magnify the forecast value by a small amount, and then add a fixed amount to remove the systematic error or bias in the model forecast. This procedure improves the peaks in the forecast precipitation without changing the frequencies in the medium, rather heavy, and heavy categories by much, but shifts many events from the trace (<2.5 mm) to the light (>2.5 mm but <1 cm) category. Thus, the skill of the daily forecast is compromised by the linear correction, while the forecast of the seasonal total improves.
Discussion and conclusions
The Indian summer monsoon is the most suitable period for the verification of model-predicted rainfall over India because the frequency of occurrences of precipitation over rain gauge stations exceeds 30% of the 122 days in the season. This frequency increases to over 41% and 50% upon averaging over the ECMWF model grid boxes (∼60 km) and NCMRWF model grid boxes (∼150 km), respectively. The verification procedure that is employed consists of computing the quantitative parameters prescribed by the WMO, the statistical parameters related to skill, based on the realization or nonrealization of an event, the interannual and intraseasonal variability of all-India rainfall, and the comparison of the spatial distribution of daily precipitation during periods of enhanced precipitation activity. From the results presented above, the following is concluded:
Precipitation forecasts of both models, when accumulated over the whole season, reproduce the observed pattern with three large areas with a total precipitation in excess of 100 cm, and two large areas with a total precipitation of less than 50 cm. A case of enhanced precipitation over the east coast in 1996 is also well reproduced in the forecasts of both models.
Over India, the NCMRWF model forecasts of day-4 precipitation have mean error ranging between −14.0 and +14.0 mm day−1, while the ECMWF model forecasts have smaller mean error ranging between −10.0 and +4.0 mm day−1. The dipole of error across the west coast ridgeline in the forecasts of the NCMRWF model is much less in the forecasts of the ECMWF model. The NCMRWF model forecasts are better over the east of the country where the large amount of precipitation due to dynamical reasons (ascending limb of reverse Hadley cell) is better reproduced. Both models have similar magnitudes (10–25 mm) of rmse in precipitation over most of the country.
The number of rainy days and the all-India-averaged precipitation are well predicted by both models. The magnitudes of the day-1 all-India precipitation forecast by both models are close to each other in both 1996 and 1997. The NCMRWF model forecasts for the same quantity are higher in magnitude for later periods (day 2, day 3, and day 4), while the ECMWF model predictions are smaller in magnitude.
The NCMRWF forecasts have higher magnitudes of the coefficient of trend correlation and skill over India than the ECMWF forecasts. It has been observed (Drosdowsky and Zhang 2003) that both the correlation coefficient and skill give poor scores when rainfall forecasts are correct in intensity and areal extent, but are displaced either in space or time. Differences in location are much smaller when averaged over a large area like the NCMRWF grid box.
The observed variability of all-India precipitation, accumulated over 1 week, is reproduced remarkably well in the series of day-1 to day-4 forecasts of both models. This implies that though the short- to medium-range forecasts have errors in the spatial distribution (Fig. 12), the spatial average of weekly precipitation over the whole of India is in reasonable agreement with that observed.
Both models have a tendency to overpredict the frequency of occurrence of precipitation events in the light and moderate categories and to underpredict events in higher categories. The magnitude of bias is greater for the NCMRWF than for the ECMWF model forecasts. For both models, the bias changes from overestimation to underestimation at the threshold of 12.8 mm (0.5 in.).
During the summer monsoon season over India, the NCMRWF model forecasts of precipitation have better threat scores compared to the ECMWF model forecasts for all threshold values, which is possibly due to the averaging over a larger area (Drosdowsky and Zhang 2003). They are also better than the threat scores of the NCEP model for summer precipitation over the United States. The ECMWF model forecasts have smaller values for both the false-alarm rate and the probability of detection when compared with the NCMRWF model forecasts of precipitation.
Most of the error in the NCMRWF model forecasts is systematic in nature as is reflected in the large magnitude of the mean error (Fig. 12, upper-left panel). A part of this component of the forecast error can be removed by a simple statistical correction leaving the random part of the error unchanged. The ECMWF model forecasts have very little systematic error over India during the summer monsoon (Fig. 12, upper-right panel) and, hence, much less need for improvement by statistical correction.
The Director General of Meteorology (IMD) has provided the observed precipitation values used in this work. The Director of ECMWF has provided the forecasts of precipitation from their model that was operational during 1995–97. The model used at NCMRWF for medium-range forecasting is a modification of an older version of the forecast model at NCEP.
The author is thankful to Dr. A. Hollingsworth for his help in getting the ECMWF forecasts, and to Dr. U. S. De, the Additional Director General of Meteorology (Research), for his help and advice regarding the use of the observed precipitation. Thanks are given to Keith Fielding and Matthew Manoussakis for their help with the ECMWF model forecasts.
Corresponding author address: B. K. Basu, NCMRWF, A50, Institutional Area, Sector-62, Noida, UP. 201 301, India.