This paper utilizes forecasts from a multianalysis system to construct a superensemble of precipitation forecasts. This method partitions the computations into two time lines. The first of those is a control (or a training) period and the second is a forecast period. The multianalysis is derived from a physical initialization–based data assimilation of “observed rainfall rates.” The different members of the reanalysis are produced by using different rain-rate algorithms for physical initialization. The basic rain-rate datasets are derived from satellites’ microwave radiometers, including those from the Tropical Rainfall Measuring Mission (TRMM) satellites and the Special Sensor Microwave Imager (SSM/I) data from three current U.S. Air Force Defense Meteorological Satellite Program (DMSP) satellites. During the training period, 155 experiments were conducted to find the relationship between forecasts from the multianalysis dataset and the best “observed” estimates of daily rainfall totals. This relationship is based on multiple regression and defined by statistical weights (which vary in space). The forecast phase utilizes the multianalysis forecasts and the statistics from the training period to produce superensemble forecasts of daily rainfall totals. The results for day 1, day 2, and day 3 forecasts are compared to various conventional forecasts with a global model. The superensemble day 3 forecasts of precipitation clearly have the highest skill in such comparisons.
Most numerical models carry with them an initial rainfall (at time zero or during the first six hours, provided by their data assimilation system) that differs considerably from the “observed” estimates (derived from surface- and space-based platforms). Correlations of initial rain rates between the model and “observations” generally lie around 0.3 over the global Tropics (30°S–30°N) [Krishnamurti et al. (1994) and Treadon (1996)]. Physical initialization is an inverse problem that enables the assimilation of the observed rain rates. Krishnamurti et al. (1991, 1994) have introduced several components of physical initialization in recent years. Kasahara et al. (1994), Puri and Miller (1990), Donner (1988), and Treadon (1996) have made important contributions in this field. The nowcasting skill, a correlation between the observed and predicted rainfall from this procedure, is generally of the order of 0.9 [Krishnamurti et al. (1994) and Treadon (1996)]. That skill shows a rapid drop to 0.6 by day 1 and to around 0.4 by day 3. Several attempts have been made to improve the tropical rainfall forecast skill beyond those levels from the physical initialization (e.g., Shin and Krishnamurti 1999, and others). However, it has not been possible to make any significant improvements. Shin and Krishnamurti did make several minor improvements to the global model, in addition to those within the physical initialization. They also implemented parameter estimations that yielded the best precipitation forecasts. Yet it appears that an entirely different approach is warranted for improving tropical three-day precipitation forecasts. Based on two recent studies (Krishnamurti et al. 1999, hereafter KKL; Krishnamurti et al. 2000, hereafter KKB), we propose here a multianalysis superensemble approach, which appears to improve the current skill for precipitation forecasts.
In those two recent papers, we showed that superensemble forecasts are invariably superior in skill to the individual multimodels. Multiseasonal climate forecasts with atmospheric general circulation models (AGCMs) have shown great promise in recent studies. These studies (KKL, KKB) utilize a superensemble for these multiseasonal forecasts. The strategy for the multimodel superensemble partitions the forecast time line into two components. The first of these, called the training period, utilizes the multimodel multiseasonal forecasts and the observed (i.e., analysis) fields to derive statistics. The second, called the “forecast phase,” utilizes the multimodel forecasts and the aforementioned statistics to obtain superensemble forecasts into the future. This methodology can be used for the basic variables of multimodels such as winds, temperature, pressure, precipitation, and humidity. A more practical and successful approach for the climate problem is to address the issue of “seasonal climate change.” Hence, we examined the superensemble for the seasonal and multiseasonal climate changes by deploying the same strategy. This has been applied to a well-known 10-yr-long dataset called AMIP. Here, 8 yr of these multimodel forecast datasets constituted the training period whereas the last 2 yr were used for the forecast phase. The results of this study show that the superensemble reduces the rms error of the seasonal climate change by more than 300% when compared with the performance of the best models. AGCMs include the specifications of sea surface temperatures and sea ice. These are, however, not true forecasts. The coupled models are the natural extension of this problem for addressing the seasonal to multiseasonal true forecast issues.
This same notion is being used here for demonstrating the large impact of TRMM datasets (a list of acronyms is provided in the appendix) on tropical prediction of rainfall. Here, we first carry out forecasts from a multianalysis of rainfall. The multianalysis comes from the use of several different rain-rate algorithms via the physical initialization of rain rates (Krishnamurti et al. 1991). This procedure ends up perturbing the divergence field, the moisture profile, the heating field, and the surface pressure tendencies. This is followed by forecasts from the multianalysis data during the training and forecast periods that ultimately provide the superensemble forecasts.
2. Data processing for the physical initialization
Rainfall data over a 24-h period preceding a forecast start time (1200 UTC) were first collected for physical initialization. These include the TMI data from the TRMM satellite and the SSM/I datasets from the DMSP satellites F-11, F-13, and F-14. Data are collected within 6-hourly assimilation windows. All of the data within a radius of 2 grid lengths (≈1.8° latitude) were averaged using distance-dependent weights with respect to a transform grid point. These weights, at a radius r from the grid point, are given by $W(r) = (R^2 - r^2)/(R^2 + r^2)$ for $r \le R$ (the scan radius), and $W(r) = 0$ for $r > R$. The SSM/I raw data (level-1B data) from the DMSP satellites (F-11, F-13, and F-14) are continuously received from AFGWC, Omaha, Nebraska. Backup SSM/I data were available from NRL Monterey, California, and were retrieved on a daily basis. The data were processed in 6-h segments to generate rain rates (units of mm day$^{-1}$), using several algorithms that determine rain rates from the brightness temperatures of the seven DMSP microwave channels.
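The distance-weighted averaging described above can be illustrated with a minimal sketch. The function names, the flat latitude/longitude distance measure, and the default scan radius of 1.8 (in degrees) are assumptions made for illustration only; they are not taken from the operational processing code.

```python
import math

def scan_weight(r, R):
    """Weight W(r) = (R^2 - r^2) / (R^2 + r^2) for r <= R, and 0 beyond the scan radius R."""
    if r > R:
        return 0.0
    return (R * R - r * r) / (R * R + r * r)

def grid_point_rain(observations, grid_lat, grid_lon, R=1.8):
    """Weighted mean rain rate at one transform grid point.

    observations: iterable of (lat, lon, rain_rate) tuples.
    Distances are measured in degrees on a flat lat/lon plane, which is a
    simplification assumed for this sketch only.
    Returns None when no observation carries a nonzero weight.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for lat, lon, rain in observations:
        r = math.hypot(lat - grid_lat, lon - grid_lon)
        w = scan_weight(r, R)
        weighted_sum += w * rain
        weight_total += w
    if weight_total == 0.0:
        return None
    return weighted_sum / weight_total
```

An observation located exactly on the grid point receives a weight of 1, and the weight falls smoothly to 0 at the scan radius, so distant pixels contribute little to the grid-point value.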
The raw sensor data were first separated into the various microwave channels and processed onto the Gaussian grid of the global model using the specific rain-rate algorithms that differentiate between land and ocean locations. An accumulated rain rate was made using all available data for the 6-h period, with time centered on the data assimilation window. From the processed data, a final rain rate, global rain coverage, and coverage diagrams were prepared. We noted that given the TRMM dataset, it is possible to cover almost 90% of the transform grid points with rainfall estimates between 40°S and 40°N during a 24-h period. The NASA TRMM 2A12 algorithm determines the rainfall from the TMI instrument datasets, whereas the GPROF algorithm was used for the SSM/I datasets. The SSM/I datasets were only used as a filler to cover those grid points where the TMI data were absent. Thus, it is simply a matter of assigning a zero and a one for the weights of SSM/I and TMI data, respectively, if the latter are available. If TMI data are absent, the opposite weights are used; that is, 0 for TMI and 1 for SSM/I. The data coverage areas for the TMI and SSM/I during a 24-h period are shown in Figs. 1a,b. Here, the regions where data are absent are darkened. Overall it can be seen that the TRMM and DMSP satellites (F-11, F-13, F-14) jointly appear to provide nearly complete coverage during the 24-h period.
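The zero/one weighting of the TMI and SSM/I estimates amounts to a simple gap-filling rule, sketched below. The representation of a gridded field as a list of rows, the use of None for grid points without data, and the function names are assumptions for illustration.

```python
def merge_rain(tmi, ssmi):
    """Merge two gridded rain-rate fields of identical shape.

    Each field is a list of rows; None marks a grid point with no data.
    Where TMI is present it is used alone (weights: TMI = 1, SSM/I = 0);
    otherwise the SSM/I value fills the gap (weights: TMI = 0, SSM/I = 1).
    A point stays None only if both sources are missing.
    """
    merged = []
    for tmi_row, ssmi_row in zip(tmi, ssmi):
        row = []
        for t, s in zip(tmi_row, ssmi_row):
            row.append(t if t is not None else s)
        merged.append(row)
    return merged

def coverage_fraction(field):
    """Fraction of grid points that carry a rainfall estimate."""
    values = [v for row in field for v in row]
    return sum(v is not None for v in values) / len(values)
```

Applying coverage_fraction to the merged field gives the kind of coverage figure quoted in the text, here as a fraction between 0 and 1.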
3. Computational procedure
The application of the regression analysis is quite straightforward. In Figs. 2a,b we illustrate the computations for the training period and the superensemble forecast period. The forecasts (for days 1, 2, and 3) from the five-member multianalysis (defined in section 4) are regressed with respect to the observed measures to obtain the respective weights: a1, a2, a3, a4, a5, and the intercept d. These are based on the datasets generated from an initial set of 155 experiments. Separate regression coefficients were obtained for each day of the forecasts. For the applications, this information (i.e., the weights) is simply carried over to the new forecasts from the multianalysis (Fig. 2b). Here Fi denotes the forecasts from the multianalysis, while C represents the control experiment where no rain-rate (physical) initialization was used.
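The training and forecast steps can be sketched as an ordinary least-squares fit at a single grid point for a single forecast day, followed by the application of the fitted weights to new member forecasts. The sketch assumes the superensemble forecast takes the linear form d + a1 F1 + ... + a5 F5 implied by the weights and intercept above; the function names and the plain Gaussian-elimination solver are illustrative choices, not the authors' code.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: member forecasts are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def train_weights(member_forecasts, observed):
    """Fit weights and intercept by least squares (normal equations).

    member_forecasts: one entry per training case, each a sequence holding
    the member forecasts F1..Fk at the grid point.
    observed: the matching observed rainfall values.
    Returns (weights, intercept).
    """
    rows = [list(f) + [1.0] for f in member_forecasts]  # last column fits the intercept
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * o for r, o in zip(rows, observed)) for i in range(k)]
    coeffs = solve_linear_system(ata, atb)
    return coeffs[:-1], coeffs[-1]

def superensemble(weights, intercept, member_forecasts):
    """Apply the trained weights to a new set of member forecasts."""
    return intercept + sum(a * f for a, f in zip(weights, member_forecasts))
```

Because the weights vary in space and separate coefficients are obtained for each forecast day, a fit of this kind would be repeated for every grid point and for each of days 1, 2, and 3.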
4. A list of proposed experiments
a. Control experiment that relies only on the model-based rainfall distribution
The Florida State University (FSU) global spectral model at a resolution T126 (triangular truncated, 126 waves) was used for these studies. The global model described in Krishnamurti et al. (1991) includes dynamical computations based upon the spectral transform method and a complete array of physical parameterizations. Also included within this is a rain-rate initialization scheme called physical initialization. The control experiment derives initial rain from a data assimilation that includes a nonlinear normal mode initialization with physics, following Kitade (1983).
b. GPROF algorithm (Kummerow et al. 1996)
The GPROF 4.0 SSM/I algorithm is called a physically based profile algorithm. It examines the vertical profiles of hydrometeors in liquid and frozen states. This algorithm utilizes the brightness temperatures at 19, 22, 37, and 85 GHz. The horizontally polarized radiance at 19 and 85 GHz and the vertically polarized radiation at 19, 22, 37, and 85 GHz are used in the construction of the rain-rate algorithm. In addition, this algorithm incorporates several empirically determined features based on the Goddard Cumulus Ensemble model, and it can distinguish among several landscape categories (land, permanent ice, water, and coast). Moreover, a sophisticated radiative transfer algorithm is used within GPROF to include the effects of absorptivity and emissivity. This utilizes a model developed by Wilheit (1979) that incorporates the surface winds to define the sea state. The absorption of microwave radiances depends upon the water vapor, oxygen, and cloud water and is incorporated following Liebe (1985). Upwelling radiances are calculated using a Monte Carlo approach that determines the absorption and emission of photons along the field of view. Then, this information is used for determining the brightness temperature in the balanced state, Kummerow et al. (1996) and Roberti et al. (1994). The final step in the retrieval method of the GPROF 4.0 algorithm utilizes statistical applications that compare calculated parameters to the set of sensor observations. The output generated from the Goddard Cumulus Ensemble model and radiative transfer equations within the retrieval method produces a large database of atmospheric profiles and associated brightness temperatures. These modeled brightness temperatures are compared to the set of observed SSM/I brightness temperatures, and a minimum variance solution is obtained.
c. Cal/Val algorithm (Olson et al. 1996)
The Cal/Val algorithm makes use of vertically polarized radiation and related brightness temperatures at 19, 22, and 37 GHz and horizontally polarized radiation at 37 and 85 GHz. The algorithm distinguishes between land and ocean and utilizes the 85-GHz channel, if available. In its absence, it utilizes the 19- and 37-GHz datasets. The current generation of the so-called navy’s Cal/Val SSM/I algorithm has been developed from the contributions of Olson et al. (1990), Smith et al. (1998), Berg et al. (1998), and Olson et al. (1996).
d. NOAA/NESDIS algorithm (Ferraro and Marks 1995)
The NOAA/NESDIS SSM/I algorithm also utilizes the radiation received at satellite altitude from 19, 22, 37, and 85 GHz. At 19 GHz, it looks at both the horizontally and vertically polarized radiation, while the vertically polarized radiation and implied brightness temperatures are used at 22, 37, and 85 GHz. Scattering and emission signals are incorporated in this algorithm. Scattering over land is used to identify snow, deserts, and arid soils. In the absence of such tags, the scattering algorithm defines rain over land areas. Over the ocean, in the presence of a scattering and emission signal, the algorithm first looks for sea ice, and in its absence, rain rates are provided by this algorithm.
e. TRMM 2A12 algorithm (Kummerow et al. 1996)
The TRMM 2A12 algorithm is a slightly modified GPROF algorithm. This includes the 10-GHz (horizontal and vertical polarizations) and the 21.3- and 85-GHz vertically polarized radiances in addition to those of GPROF. Thus, 2A12 makes use of nine different channels. This is one of the primary datasets from the TRMM microwave instruments.
f. Contribution from TRMM 2A12 plus the GPROF algorithm
We run 155 experiments with the FSU global spectral model at a resolution of T126 (roughly 80-km resolution) with these several options. Each experiment entails a physical initialization of the observed rain (as measured by the different algorithms) and is followed by a 3-day global forecast. After these experiments are completed, we prepare observed fields of rainfall estimates (i.e., the benchmark) that are regarded as the best measures. These fields are derived via the TRMM 2A12 algorithm as applied to the TMI datasets and the GPROF algorithm as applied to all available SSM/I datasets from the three DMSP satellites (F-11, F-13, and F-14). Together, the data generated from the multianalysis-based forecasts and the observed best estimates are regressed to obtain weights via multiple regression for each of these forecasts, as sketched in Figs. 2a,b.
The next step in this exercise calls for a set of 25 new forecasts for a new period. Here the previously generated statistics (i.e., weights) are used along with the new forecasts from the multianalysis data to design superensemble forecasts. These are 3-day forecasts. We show that it is possible to acquire a higher forecast skill for rainfall from this superensemble that outperforms any single NWP forecast initialized from a single rain-rate algorithm.
The improvement is measured against our past performance, Treadon (1996), where he used physical initialization and GPI-based rain rates. Figure 3 illustrates a past skill, that is, correlations of rainfall (observed versus modeled) plotted against the forecast days. Here he shows a very high nowcasting skill, that is, of the order of 0.9. This was a feature of physical initialization, also noted by Krishnamurti et al. (1994). The forecast skill degrades to 0.6 by day 1 of the forecast, and it degrades even more by days 2 and 3 to values such as 0.5 and 0.45, respectively.
Following our earlier studies (Krishnamurti et al. 1994), we use a straightforward correlation between the observed rainfall totals (over 24 h) and the forecast model–based estimates. Since the observed measure of rain is derived from a mix of TRMM- and SSM/I-based algorithms, some smoothing of the observed product was deemed to be necessary. A simple local 9-point smoother (deploying an inverse distance-dependent weight with an equivalent summed weight for the central point) was used. The final total weights were normalized.
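The smoothing and the skill measure can be sketched as follows. The exact smoother weights are not given in the text, so this sketch adopts one possible reading: each of the eight neighbors is weighted by the inverse of its grid distance, the central point receives a weight equal to the sum of the neighbor weights, and the total is normalized to 1. That reading, the handling of boundary points, and the function names are assumptions.

```python
import math

def smooth_9pt(field):
    """Local 9-point smoother on a 2D field given as a list of rows.

    Neighbors get weight 1/distance (1 for edge neighbors, 1/sqrt(2) for
    diagonal ones); the central point gets the summed neighbor weight;
    the weights are then normalized. Boundary points use only the
    neighbors that exist (an assumption of this sketch).
    """
    nrows, ncols = len(field), len(field[0])
    out = [[0.0] * ncols for _ in range(nrows)]
    for i in range(nrows):
        for j in range(ncols):
            neighbor_sum = 0.0
            neighbor_weight = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    if di == 0 and dj == 0:
                        continue
                    ii, jj = i + di, j + dj
                    if 0 <= ii < nrows and 0 <= jj < ncols:
                        w = 1.0 / math.hypot(di, dj)
                        neighbor_sum += w * field[ii][jj]
                        neighbor_weight += w
            if neighbor_weight == 0.0:
                out[i][j] = field[i][j]  # isolated point: nothing to smooth with
            else:
                total = 2.0 * neighbor_weight  # central weight equals the neighbor sum
                out[i][j] = (neighbor_weight * field[i][j] + neighbor_sum) / total
    return out

def correlation(a, b):
    """Pearson correlation between two fields of identical shape.

    Raises ZeroDivisionError if either field is constant.
    """
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)
```

Under this reading the skill for a given forecast day would be computed as correlation(smooth_9pt(observed), forecast) over the region of interest.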
5. Results of computations
Using the proposed superensemble approach, we are able to improve the forecast skill when the TRMM–SSMI-based rain rates are used as a benchmark for the definition of the superensemble statistics and the forecast verification. Figures 4, 5, and 6 illustrate the TRMM-based forecast skills over the global Tropics, tropical Africa, and the tropical Americas, respectively. Here, TRMM noticeably improves regional short-range forecasts of precipitation beyond our previous studies.
In Figs. 4, 5, and 6, the three lines show correlations of rainfall (observed versus modeled) as a function of forecast days 0, 1, 2, and 3. The top line in each of these illustrations shows the multianalysis superensemble forecast. The next line, adjacent to the above line, is the forecast from a single run of the global model. This global model utilizes physical initialization of rain using the TRMM- and SSM/I-based initial rain using the 2A12 and the GPROF algorithms, respectively. The physical initialization follows the procedure outlined in Krishnamurti et al. (1991). The last (bottom) line with the lowest skill details the results from the control experiments that do not make use of any rain-rate initialization. Overall, we are most interested in improving the 3-day forecast skills of rainfall, and it is clear from these three illustrations that the global, as well as the regional, skills from the multianalysis superensemble are higher. These forecast results are based on five experiments each during 1–5 August 1998. Figure 4 shows the global skill, where we see that at day 3 the correlation has a value of 0.5. Although that is better than the other results shown in that diagram, nevertheless, it is not as high as one would consider useful. It is the regional skill over selected regions where the method appears to be most effective. Figures 5 and 6, respectively, show the regional skills over tropical Africa and the tropical Americas.
Over these regions, we see the day-3 forecast skill reaching as high as 0.7, which can be considered a useful skill for rainfall forecasts. The higher skill over the tropical Americas and Africa may be attributed to a larger climatological rainfall component over these regions. Furthermore, the limited number of experiments (155) during the training period was adequate to acquire useful statistics there. Over other regions, such as the Asian monsoon region, traveling monsoon disturbances are frequently found. Hence, a larger sample of experiments may be needed to obtain adequate statistics for the training period.
The major question that arises is how does the superensemble push the skill to superior levels when the skill of the member models is no better than 0.5. That 40% improvement simply comes from the relationship of the forecasts from the multianalysis data and the observed fields during the training period. We have also noted that the superensemble performs better than the ensemble mean of the forecasts; see KKB. This procedure evidently removes the local rainfall bias of the forecasts arising from the individual rain-rate algorithms. The collective bias removal was shown to be superior to the removal of the bias of the individual runs (and averaging such results; KKB). Further work is clearly needed to explain the performance of the superensemble.
We shall next illustrate an example of the day-3 forecasts of the precipitation over the global tropical belt in Figs. 7a–c. Figure 7a shows the observed TMI- and SSM/I-based 24-h rainfall estimate (mm day$^{-1}$) between 1200 UTC 14 August and 1200 UTC 15 August 1999. Figure 7b illustrates the 3-day forecast from the multianalysis superensemble valid for the same period as above. Figure 7c shows the corresponding results from a single model, that is, that which provided the best results for this period. The global tropical correlation between the observed and the multianalysis superensemble is 0.55, whereas the correlation of the best model with respect to the observed estimate is 0.30. The lower skill of the best model arose from excessive rainfall totals over Africa, western Pacific, and the Asian monsoon regions. Figures 8a,b and 9a,b provide examples of day-3 forecasts of rainfall where the correlation of the multianalysis superensemble and the observed estimates were close to 0.7. These show the 24-h rainfall totals between 4 and 5 August 1998 (1200 UTC) over Africa and the tropical Americas, respectively. Although these 3-day forecasts are not perfect, it should be noted that these very high levels of skill have not been obtained in our previous studies. Overall these are some of the most impressive 3-day forecast skills, when we compare these with those illustrated in Fig. 3.
The above result is a major accomplishment from the use of TRMM satellites. The feasibility of real-time precipitation forecasts, using a multianalysis superensemble, appears worth investigating. With this in view, a joint NASA–FSU experimental forecast with the above framework has been started. NASA is currently providing the real-time TRMM rainfall products for the 2A12/GPROF algorithms. We have also received a similar product from the Naval Research Laboratory that incorporates background OLR-based rainfall estimates for global geostationary satellites. This is being considered as another possible rainfall algorithm for the real-time multianalysis data assimilation.
6. Concluding remarks and future outlook
The multianalysis superensemble appears to be a useful approach for improving the forecast skill of 1–3-day forecasts of rainfall. Having worked with physical initialization in global models, we had noted that, although it provided great skill for the nowcasting and 1-day forecasts of rain, the skill soon leveled off toward operational levels. Implementation of this procedure within variational data assimilation was attempted by Treadon (1997) within 3DVAR and Tsuyuki (1997) within 4DVAR. Although the spectral resolution used in these studies was low, the above problem of the loss of skill beyond day 1 was still clearly apparent. It was felt that the essential nature of precipitation over the Tropics was mesoconvective in character, and the loss of accuracy of the larger-scale models was sufficient to offset any positive impacts from simple changes in the initialization procedures. The multianalysis superensemble proposed here looks at past rainfall observations derived from the TRMM and the DMSP satellites’ microwave instruments. Inclusion of the detailed past rainfall datasets is a unique feature of this superensemble. The forecasts from the multianalysis data project the past relationship (between model forecasts and the observed rain) into the future. Thus, high rainfall forecast skill is realized both over the globe and over selected regions, including the tropical belts of the North and South Americas and Africa. The skill of rainfall predictions over the Asian monsoon belt is only slightly higher than the global skills presented here.
Future extensions of this work would require a still higher resolution for the multianalysis models and for the superensemble. The transform grid resolution of T126 is around 80 km. The TRMM and SSM/I datasets have a footprint that varies from 30 to 50 km. Ideally a resolution such as T255 would be needed to pass the details of TRMM and SSM/I rainfall to the model. A multianalysis superensemble at that resolution would require very heavy computer resources. For improving precipitation forecasts through day 3, however, this may be necessary. We have explored precipitation forecasts with very high resolution regional spectral models. A considerable amount of research is currently going on with a variety of mesoscale models such as the MM5 and others. Many of these are nonhydrostatic models that have explicit microphysics components. It remains to be seen if further progress in 3-day precipitation forecasts can come from those efforts. However, having the past rainfall at resolutions such as T255 (see Krishnamurti et al. 1998) would provide the opportunity for resolving the organization of mesoconvective precipitating elements in active tropical weather systems such as depressions, hurricanes, and monsoons, and it could provide a powerful multianalysis superensemble.
We have just started on a real-time experimentation with this procedure. We have already noted that as the number of cases during the training period increased, the correlation of rainfall on day 3 of the forecasts increased to 0.55 over the monsoon region. This appears promising in that as we do more and more days of forecasts we can pass the datasets of the forecast periods to the training period, thus increasing the length of the latter and increasing the skill somewhat.
The research reported here was supported by NASA Grants NAG8-1199 and NAG5-4729, NOAA Grant NA77WA0571, and NSF Grant ATM-9710336. We wish to convey our special thanks to C. Kummerow and R. Kakar of NASA for facilitating the availability of TRMM datasets and to the Naval Research Laboratory in Monterey for the DMSP satellite datasets. We are deeply indebted to ECMWF for providing us with their global datasets.
List of Acronyms
AFGWC Air Force Global Weather Central
AGCM Atmospheric general circulation model
AMIP Atmospheric Model Intercomparison Project
DMSP Defense Meteorological Satellite Program
FSU The Florida State University
GPI GOES Precipitation Index
GPROF Goddard Profiling (algorithm)
NASA National Aeronautics and Space Administration
NESDIS National Environmental Satellite, Data, and Information System
NOAA National Oceanic and Atmospheric Administration
NRL Naval Research Laboratory
OLR Outgoing longwave radiation
SSM/I Special Sensor Microwave Imager
TMI TRMM Microwave Imager
TRMM Tropical Rainfall Measuring Mission
Corresponding author address: Prof. T. N. Krishnamurti, Department of Meteorology, The Florida State University, Tallahassee, FL 32306.