Dynamical seasonal forecasts are afflicted with biases, including seasonal ensemble precipitation forecasts from the new ECMWF seasonal forecast system 5 (SEAS5). In this study, biases have been corrected using empirical quantile mapping (EQM) bias correction (BC). We bias correct SEAS5 24-h rainfall accumulations at seven monthly lead times over the period 1981–2010 in Java, Indonesia. For the observations, we have used a new high-resolution (0.25°) land-only gridded rainfall dataset [Southeast Asia observations (SA-OBS)]. A comparative verification of both raw and bias-corrected reforecasts is performed using several verification metrics. In this verification, the daily rainfall data were aggregated to monthly accumulated rainfall. We focus on July, August, and September because these are agriculturally important months; if the rainfall accumulation exceeds 100 mm, farmers may decide to grow a third rice crop. For these months, the first 2-month lead times show improved and mostly positive continuous ranked probability skill scores after BC. According to the Brier skill score (BSS), the BC reforecasts improve upon the raw reforecasts for the lower precipitation thresholds at the 1-month lead time. Reliability diagrams show that the BC reforecasts have good reliability for events exceeding the agriculturally relevant 100-mm threshold. A cost/loss analysis, comparing the potential economic value of the raw and BC reforecasts for this same threshold, shows that the value of the BC reforecasts is larger than that of the raw ones, and that the BC reforecasts have value for a wider range of users at 1- to 7-month lead times.
1. Introduction
Seasonal forecasts of precipitation are becoming an increasingly important element in decision-making systems in Indonesia, especially in the agriculture and hydrological sectors. In these two sectors, stakeholders and decision-makers need seasonal forecasts to assist them in their planning strategy. Predicting when above- or below-average rainfall might occur can be valuable for agriculture because such information can help farmers to decide the type of crop that they will plant during that season. For example, farmers may plant a crop that requires less water if they have been informed in advance that the forecast rainfall will be below average. The seasonal forecast is also important in an irrigated system regarding the availability of irrigation water as well as the timing of high/low river stream flows. Currently, demands on seasonal forecasting are getting higher: users require seasonal rainfall forecasts that are skillful, statistically reliable, and free of bias (e.g., Schepen et al. 2014).
Global and regional circulation models that can forecast seasonal atmospheric and oceanic conditions are afflicted with biases to a degree that prevents their direct use for hydrological purposes (e.g., Ehret et al. 2012). Ensemble forecasts of seasonal precipitation from the European Centre for Medium-Range Weather Forecasts (ECMWF) also exhibit these systematic biases (Johnson et al. 2019).
Uncorrected meteorological forecasts are not suitable as direct input for quantitative models, such as those used in agriculture and water management (Schepen et al. 2016). The bias should be corrected because it can lead to significant errors in impact assessments (Murphy 1999). The seasonal rainfall forecast is a critical input for predicting the planting date, and the type of the rice crop that will be planted depends on either the expected duration of the rainy season or the forecast amount of rainfall in the dry season. It is expected that more skillful seasonal forecast information will lead to more successful agricultural activities and more skillful warnings for possible floods and droughts.
Apart from biases, ensemble forecasts can also have either a too-narrow or too-wide spread. So, before ECMWF or other GCM forecasts can be practically applied, statistical postprocessing is needed to correct these biases and errors in dispersion (e.g., Wilks 2011).
There are several existing statistical postprocessing methods, ranging from basic bias correction (BC) to more advanced methods (e.g., Piani et al. 2010; Salvi et al. 2011; Iizumi et al. 2011; Lafon et al. 2013; Teutschbein and Seibert 2010). Recently, a number of studies have assessed the skill of ECMWF ensemble forecast system S4 and its applicability in different contexts (Weisheimer and Palmer 2014; Kim et al. 2012; Di Giuseppe et al. 2013; Trambauer et al. 2015; Wetterhall et al. 2015). Schepen et al. (2014, 2016) have successfully applied calibration methods to 1-month temperature and precipitation forecasts in Australia from another seasonal forecasting system, the Predictive Ocean Atmosphere Model for Australia (POAMA). Two popular BC methods in seasonal forecasting are distribution mapping (e.g., quantile mapping) and linear scaling (Crochemore et al. 2016).
Amengual et al. (2012) used the empirical quantile mapping (EQM) method as a quantile–quantile calibration method based on a nonparametric function that corrects biases in the cumulative distribution functions (CDFs) of climatic variables. Reichle and Koster (2004), Boé et al. (2007), Déqué (2007), and Amengual et al. (2012) highlighted that quantile mapping is a flexible method that is widely used for correcting biases in meteorological variables. The quantile mapping method can be used to bias correct forecasts that can then be used to feed crop and hydrological models. Another advantage of applying that method is that it can be applied to various climate variables and various orographies (Wilcke et al. 2013). Quantile mapping also performs particularly well over complex terrain, similar to that found in Indonesia. As an example, Wilcke et al. (2013) showed that it exhibits good performance in Austria where there are high mountains in the west and flat regions in the east.
In Indonesia, seasonal forecasts are one of the main products of the Indonesian Agency for Meteorology, Climatology and Geophysics (BMKG). They are released twice a year: for the wet and dry seasons (BMKG 2018). For farmers in Southeast Asia, including Indonesia, a common problem is a false start of the rainy season (Marjuki et al. 2014), which is an isolated rainfall event preceding the expected monsoon onset date but followed by a dry spell. This is one factor that makes more accurate forecasting a challenge. Currently, seasonal forecasts that are provided by BMKG are mainly based on the mean of the output from four statistical models using observational rainfall data from stations (Komalasari et al. 2016). It is expected that the new ECMWF seasonal prediction system 5 (SEAS5) will become the main seasonal forecast system in Indonesia, so probabilistic forecasts from a dynamical model can be implemented to provide seasonal forecasts. There is ample evidence that probabilistic forecasts are able to improve decisions, if decision-makers use the information correctly (Rayner et al. 2005; Joslyn and LeClerc 2012; Ramos et al. 2013). In this study EQM is applied to correct for biases and other systematic errors in the precipitation forecasts of SEAS5. Empirical quantile mapping has been widely applied elsewhere, but our research adds to the existing work in two ways. First, to our knowledge, the seasonal forecast in Indonesia has not yet been bias corrected. This is important because Indonesian users from many sectors use the seasonal forecast information to guide decisions about their future plans. Second, to our knowledge, empirical quantile mapping is mostly applied to monthly data in seasonal forecasting, but here we show that improved skill can also be gained by bias-correcting daily forecasts.
The bias correction of daily forecasts is necessary for the use of bias-corrected data in a modeling chain, for example, where bias-corrected meteorological outputs are used as inputs to a hydrological and/or crop model.
This study will focus on the driest months, that is, July, August, and September. This period is important because farmers have to decide whether it is possible to grow a third rice crop based on the seasonal rainfall forecasts for these months. About 70% of the lowland rice area in Indonesia produces two crops per year. A skillful forecast for July, August, and September precipitation could give farmers an indication of the total field area that can be planted with rice for the third crop. A third rice crop is possible in this period if the season has a number of consecutive months with precipitation above 100 mm month−1 (Oldeman 1980). However, late-season planting patterns could change if the precipitation during this period increases or decreases significantly (Naylor et al. 2007). A successful outcome of this study will be increased forecast skill of SEAS5 dry season precipitation forecasts, especially for the period of July, August, and September, which will promote the use of corrected ECMWF SEAS5 output for seasonal forecasting in Indonesia. Reliable, skillful, and, especially, valuable forecasts bring benefits to many sectors. Therefore, we will also compute the potential economic value (PEV; Richardson 2000; Wilks 2001) of both the raw and bias-corrected SEAS5 precipitation forecasts, along with traditional verification measures (e.g., Brier skill score and continuous ranked probability skill score). This study is motivated by the needs of BMKG, who have been collaborating with the Ministry of Agriculture to develop seasonal forecasts that can be used in a dynamic cropping calendar (using a crop model), with a focus on rice. Rice plants can grow well in a humid month that has rainfall between 100 and 200 mm month−1 (Ministry of Agriculture 2006). The motivation for this range of rainfall levels, identified by Oldeman (1980), is based on the amount of water demanded.
In section 2 of this article the EQM method is described. In section 3, the verification methods are presented. This is followed by a description of the data and study area in section 4. The results are presented in section 5 before the article concludes with a discussion of the findings in section 6.
2. Empirical quantile mapping
BC is one of the approaches that is commonly used to postprocess the direct output of a numerical model. EQM is a popular BC method and works with empirical probability density functions (PDFs) or CDFs for both the forecasts and the observations. This is a statistical approach based on the application of quantile–quantile adjustment: estimating quantiles for both the forecast and the observation dataset, then forming a transfer function by using corresponding quantile values. This aims to match the quantile of the forecast value to the observed value at the same quantile. To achieve this purpose, each predicted quantile is substituted by the corresponding observed quantile by means of their empirical cumulative distribution functions (ECDFs). Furthermore, this transfer function is then applied to the forecast data series:
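$$Y_{fc(bc)} = \mathrm{ECDF}_{obs}^{-1}\left[\mathrm{ECDF}_{fc}\left(Y_{fc}\right)\right], \tag{1}$$
where $Y_{fc}$ is the raw precipitation forecast from SEAS5, and $Y_{fc(bc)}$ is the bias-corrected precipitation reforecast. $\mathrm{ECDF}_{obs}^{-1}$ is the inverse ECDF of the observations, and $\mathrm{ECDF}_{fc}$ is the ECDF of the forecast values (for each ensemble member separately), both determined for the training period.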
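The quantile–quantile adjustment described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names, the default number of quantiles, the linear interpolation between quantiles, and the constant extrapolation outside the training range are all assumptions made for illustration.

```python
from bisect import bisect_right


def quantiles(sample, n_quantiles):
    """Empirical quantiles at n_quantiles evenly spaced probabilities in [0, 1]."""
    ordered = sorted(sample)
    result = []
    for i in range(n_quantiles):
        # Fractional position of the i-th probability within the sorted sample.
        pos = i * (len(ordered) - 1) / (n_quantiles - 1)
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        frac = pos - lo
        result.append(ordered[lo] * (1 - frac) + ordered[hi] * frac)
    return result


def eqm_correct(value, train_fc, train_obs, n_quantiles=11):
    """Map a forecast value onto the observed distribution (hypothetical helper).

    train_fc and train_obs are the forecast and observed samples of the
    training period; n_quantiles is the free parameter of the method.
    """
    q_fc = quantiles(train_fc, n_quantiles)
    q_obs = quantiles(train_obs, n_quantiles)
    # Index of the last forecast quantile that does not exceed the value.
    k = bisect_right(q_fc, value) - 1
    if k < 0:
        return q_obs[0]    # assumption: constant extrapolation below the range
    if k >= n_quantiles - 1:
        return q_obs[-1]   # assumption: constant extrapolation above the range
    frac = (value - q_fc[k]) / (q_fc[k + 1] - q_fc[k])
    return q_obs[k] + frac * (q_obs[k + 1] - q_obs[k])
```

In this sketch a forecast that is systematically twice as wet as the observations is halved by the transfer function, for example `eqm_correct(5.0, [0, 2, 4, 6, 8, 10], [0, 1, 2, 3, 4, 5], n_quantiles=6)` gives 2.5. Daily precipitation contains many tied zero values, and how such ties are mapped is a design choice that this sketch does not treat specially.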
The number of quantiles is a free parameter. This method differs from parametric quantile mapping, which uses the assumption that both the observed and the simulated precipitation distribution are well approximated by, for example, a gamma distribution (Piani et al. 2010). In our study, there is no fitting of a theoretical probability distribution to the data in creating the CDFs and no a priori assumptions are made (Gudmundsson et al. 2012). Reichle and Koster (2004) and Déqué (2007) already used this method. Reichle and Koster (2004) corrected surface soil moisture data from different sources such as satellite, ground measurements, and land modeling. Besides, Déqué (2007) corrected the extreme cold temperatures and summer heavy precipitation that are underestimated by the Météo-France atmospheric model ARPEGE. The major advantage of the EQM method is that it adjusts all moments, such that the entire distribution matches that of the observations for the training period, while maintaining the rank correlation between models and observations (Li et al. 2010). In this technique, the quantiles are determined by sorting model output and observations for the same historical base period, and constructing CDFs for each. Because daily input is needed to feed hydrological and crop models, for each month, lead month, and grid cell separately, we apply the BC at the daily level. We use a leave-three-years-out cross-validation technique to estimate the forecast error.
Measures of performance are evaluated using an independent sample that was not included in the training dataset. The total length of the record N is 30 years. In m-fold cross validation, m years are repeatedly removed, in this case m = 3. Subsequently, the EQM method is trained on each 27-yr period and applied to the three independent years. Finally, all independent years can be used in the verification procedure, so N = 30 years for the verification sample. To check robustness of the results to a different choice of m, we tested using a leave-one-year-out cross validation (m = 1), which showed almost identical results, probably because EQM is applied to daily accumulated precipitation. All results in section 5 are shown for m = 3.
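The leave-three-years-out procedure can be sketched as follows. The source does not state which years are grouped together, so the use of consecutive, non-overlapping blocks of three years is an assumption, as is the function name.

```python
def leave_m_years_out(years, m=3):
    """Yield (training_years, held_out_years) pairs.

    Assumption: held-out years form consecutive, non-overlapping blocks of
    length m, so every year is held out exactly once.
    """
    for start in range(0, len(years), m):
        held_out = years[start:start + m]
        training = years[:start] + years[start + m:]
        yield training, held_out


years = list(range(1981, 2011))            # the 30-yr reforecast period
folds = list(leave_m_years_out(years, m=3))

for training, held_out in folds:
    # Train the EQM transfer function on the 27 training years here and
    # apply it to the 3 held-out years; only the bookkeeping is shown.
    pass
```

With N = 30 and m = 3 this gives 10 folds, each trained on 27 years, and the held-out years together cover the full 30-yr verification sample.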
3. Verification methods
Forecast skill refers to the relative accuracy of a set of forecasts with respect to some standard reference forecast (Wilks 2011). This study aims to investigate whether the EQM method improves the skill of the raw ECMWF seasonal precipitation forecasts in Java compared to a climatological forecast.
The precipitation forecast takes values on a continuous scale. In addition, it comes in the form of an ensemble forecast. Therefore, an appropriate and proper verification metric is the continuous ranked probability skill score (CRPSS; e.g., Wilks 2011). The CRPSS compares the continuous ranked probability score (CRPS) of a (raw or BC) forecast to that of a reference forecast. Both forecasts and observation are expressed by CDFs.
The CRPS compares the distribution of the ensemble members with the observation. Briefly, this score is the integrated squared difference between the CDF of the forecast and the CDF of the observation. In the computation of the CRPSS, the observation climatology is used as a reference forecast. The CRPS is defined as follows (e.g., Wilks 2011):
$$\mathrm{CRPS} = \int_{-\infty}^{\infty} \left[F(y) - F_o(y)\right]^2 \, dy, \tag{2}$$
where $F(y)$ denotes the forecast distribution expressed as a CDF and $F_o(y)$ is a step-function CDF that jumps from 0 to 1 at the observed value. Equation (2) defines the CRPS for a single forecast, but CRPS values for multiple cases are often averaged (Hersbach 2000; Wilks 2011). A perfect CRPS is 0.
The associated skill score (CRPSS) is defined as follows (e.g., Wilks 2011):
$$\mathrm{CRPSS} = 1 - \frac{\mathrm{CRPS}_{fc}}{\mathrm{CRPS}_{ref}}, \tag{3}$$
where $\mathrm{CRPS}_{fc}$ is the CRPS of the (raw or BC) forecast and $\mathrm{CRPS}_{ref}$ stands for the CRPS of the reference, that is, climatology. The range of CRPSS is −∞ to 1. A positive skill score (CRPSS > 0) indicates that the forecast improves on the climatological forecast, while it is worse than climatology when CRPSS < 0. The CRPSS is used to verify both the raw and bias-corrected SEAS5 forecasts. In this verification, the daily rainfall data were aggregated to monthly accumulated rainfall. The CRPSS values in this study are averaged over all 30 years.
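For an ensemble forecast, the integral in the CRPS definition can be evaluated exactly when the forecast CDF is taken to be the empirical step function of the members. The sketch below uses the well-known equivalent form based on absolute differences; the function names are illustrative and this is not necessarily how the scores in this study were computed.

```python
def crps_ensemble(members, obs):
    """CRPS of an ensemble whose CDF is the empirical step function.

    Uses the identity CRPS = mean|x_i - y| - 0.5 * mean|x_i - x_j|,
    which equals the integrated squared difference between the
    ensemble CDF and the step-function CDF of the observation.
    """
    m = len(members)
    spread_to_obs = sum(abs(x - obs) for x in members) / m
    internal_spread = sum(abs(a - b) for a in members for b in members) / (2 * m * m)
    return spread_to_obs - internal_spread


def crpss(crps_forecast, crps_reference):
    """Skill score relative to a reference forecast such as climatology."""
    return 1.0 - crps_forecast / crps_reference
```

For a two-member ensemble of 1 and 3 mm with an observation of 2 mm, `crps_ensemble([1.0, 3.0], 2.0)` is 0.5, which matches the direct integration of the squared CDF difference. For a single member the score reduces to the absolute error.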
Another verification tool is the reliability (or attributes) diagram for assessing reliability and resolution. Reliability diagrams are common verification tools to illustrate the properties of probabilistic forecast systems. They are graphs of the observed frequency of an event plotted against the forecast probability of an event (e.g., Wilks 2011). From these diagrams, users can readily see how often an event actually occurred when it was forecast with a given probability. For example, if the forecast probability of an event equals 50% then, for perfect reliability, the event should occur on 50% of occasions on which the forecast is made. Reliability and resolution of probabilistic forecasts are separate terms in the decomposition of the Brier score. Resolution measures the ability of a forecast to distinguish situations with distinctly different frequencies of occurrence (Wilks 2011). The reliability diagrams in this study are shown for events that exceed the 100-mm threshold; observed frequencies are plotted as a function of forecast probabilities for all model grid cells over Java over the 30-yr verification period. If the reliability curves have positive slopes, the observed frequency of the event increases as a function of the forecast probability. In the case of perfect reliability, the reliability curve coincides with the diagonal line.
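The points of a reliability curve can be computed by binning the forecast probabilities and taking the observed event frequency in each bin. The following is a minimal sketch; the equal-width binning and the number of bins are assumptions, not settings taken from this study.

```python
def reliability_curve(probs, outcomes, n_bins=10):
    """Return (mean forecast probability, observed frequency, count) per bin.

    probs are forecast probabilities in [0, 1]; outcomes are 1 if the event
    occurred and 0 otherwise. Empty bins are omitted.
    """
    prob_sum = [0.0] * n_bins
    event_sum = [0] * n_bins
    count = [0] * n_bins
    for p, o in zip(probs, outcomes):
        b = min(int(p * n_bins), n_bins - 1)   # a probability of 1.0 goes in the last bin
        prob_sum[b] += p
        event_sum[b] += o
        count[b] += 1
    return [
        (prob_sum[b] / count[b], event_sum[b] / count[b], count[b])
        for b in range(n_bins)
        if count[b] > 0
    ]
```

Plotting the second element of each tuple against the first gives the reliability curve; points on the diagonal indicate perfect reliability, and the counts show how well each bin is populated.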
We also evaluate forecast skill using the Brier skill score (BSS), based on the Brier score (BS):
$$\mathrm{BS} = \frac{1}{N} \sum_{t=1}^{N} \left(f_t - o_t\right)^2, \tag{4}$$
where $N$ is the number of events, $f_t$ is the forecast probability, and $o_t$ is the binary observation (1 if the event occurred, and 0 if it did not). The BS is converted to a skill score, the BSS, using the BS of a reference forecast (i.e., climatology):
$$\mathrm{BSS} = 1 - \frac{\mathrm{BS}}{\mathrm{BS}_{ref}}. \tag{5}$$
BSS can range from −∞ to 1, where 1 is a perfect forecast and all negative values indicate a forecast with negative skill compared to climatology.
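As an illustration of the two scores, the sketch below derives a forecast probability from an ensemble as the fraction of members exceeding a threshold and then computes the BS and BSS. The function names are hypothetical, and the climatological reference is taken from the verification sample itself, which is an assumption made for brevity.

```python
def exceedance_probability(members, threshold=100.0):
    """Fraction of ensemble members above the threshold (mm per month)."""
    return sum(1 for x in members if x > threshold) / len(members)


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and outcome."""
    return sum((f - o) ** 2 for f, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes):
    """BSS with the sample climatological frequency as reference forecast.

    Undefined (division by zero) if the event always or never occurs.
    """
    clim = sum(outcomes) / len(outcomes)
    bs_ref = brier_score([clim] * len(outcomes), outcomes)
    return 1.0 - brier_score(probs, outcomes) / bs_ref
```

A forecast that always issues the climatological frequency scores BSS = 0 in this sketch, and a forecast that assigns probability 1 to every event that occurs and 0 otherwise scores BSS = 1.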
Further, investigation of the PEV of the SEAS5 precipitation forecasts is also conducted in this study, estimated using basic cost/loss ratio (C/L) situations (Richardson 2000; Wilks 2001). The PEV of a forecasting system can be interpreted as the economic gain (relative to climatology) obtained by performing an action or nonaction, depending on the forecast. Decision-makers have a number of courses of action to choose from, and the choice is to some extent influenced by the forecast. A cost is associated with a preventive action. If no action is taken, there might be a loss if the adverse event occurs (Richardson 2000). Decision-makers can alter their actions based on forecast information. In probability forecast cases, users have to decide on the probability threshold at which to take action. In the simplest cost/loss model, users should take action if the forecast probability exceeds C/L. The use of probabilistic forecasts in the cost/loss decision framework has the potential to improve farmers’ decision-making more than deterministic forecasts because probabilistic forecasts are able to provide users with more complete information about future scenarios. The cost/loss model gives a simple illustration of the way that probability forecasts can be used in such situations and the benefits that may be obtained. Table 1 shows a contingency table that indicates the costs and losses accrued by the use of forecasts, depending on forecast and observed events. PEV is defined as follows:
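$$\mathrm{PEV} = \frac{\min\left(\bar{o},\, C/L\right) - (H + FA)\,(C/L) - M}{\min\left(\bar{o},\, C/L\right) - \bar{o}\,(C/L)}, \tag{6}$$
where H = hits, FA = false alarms, M = misses (each expressed as a relative frequency of all forecast cases), and ō = climatological frequency.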
PEV ranges from −∞ to 1. A forecast that is better than the climatological forecast has positive PEV, and a perfect forecast has PEV = 1. Maximum PEV occurs when the cost/loss ratio is equal to the observed frequency of the event; then it equals the hit rate minus the false alarm rate, which is equal to the true skill statistic (TSS) and is defined as follows (e.g., Wilks 2011):
$$\mathrm{TSS} = \frac{H}{H + M} - \frac{FA}{FA + CR}, \tag{7}$$
where CR = correct rejections. The TSS measures the ability of the forecast to distinguish between occurrences and nonoccurrences of the event. The score can vary between −1 and 1, where −1 is the worst value.
As is common (e.g., Richardson 2000), we show the potential economic value as a function of the cost/loss ratio. This allows a specific user to see the potential economic value for their specific C/L ratio. Most users have low cost/loss ratios, as the costs of taking preventative action are generally relatively low, while the losses incurred can be relatively large.
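The cost/loss calculation can be sketched as below. The sketch assumes the standard cost/loss formulation of Richardson (2000) and Wilks (2001), in which expected expenses are normalized by the loss L and the contingency-table entries are relative frequencies; the function names are illustrative and the formula is an assumption insofar as it is not reproduced from this study's own code.

```python
def contingency(probs, outcomes, cost_loss):
    """Relative frequencies of hits, false alarms, misses, correct rejections.

    The user acts when the forecast probability exceeds the cost/loss ratio.
    """
    n = len(probs)
    h = fa = m = cr = 0
    for p, o in zip(probs, outcomes):
        act = p > cost_loss
        if act and o:
            h += 1
        elif act:
            fa += 1
        elif o:
            m += 1
        else:
            cr += 1
    return h / n, fa / n, m / n, cr / n


def potential_economic_value(h, fa, m, cost_loss):
    """PEV relative to climatology, with expenses in units of the loss L."""
    clim = h + m                               # climatological frequency
    e_clim = min(cost_loss, clim)              # always act, or never act
    e_forecast = (h + fa) * cost_loss + m      # pay C when acting, L when missing
    e_perfect = clim * cost_loss               # act only when the event occurs
    return (e_clim - e_forecast) / (e_clim - e_perfect)


def true_skill_statistic(h, fa, m, cr):
    """Hit rate minus false alarm rate."""
    return h / (h + m) - fa / (fa + cr)
```

Evaluating `potential_economic_value` over a range of cost/loss ratios between 0 and 1 gives a value curve for one lead time and month. When the cost/loss ratio equals the climatological frequency, the value returned by this sketch coincides with the TSS.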
4. Data and study area
a. Study region
The present study focuses on Java in Indonesia. Figure 1 shows the real and model topography. The latter is based on geopotential data from ECMWF with a resolution of 0.4° that are divided by the acceleration of gravity to get the approximate elevation. There is substantial mismatch between the topography of the model and that of the real world (as captured by a 1-km digital elevation model), which is a result of the much lower model resolution. Variation in regional rainfall in Java is partly a consequence of the presence of very high volcanoes and mountains, with the highest mountain being Mahameru, or Great Mountain, whose height is 3676 m above sea level. Rice production is heavily concentrated on this island where rainfall and/or irrigation is very important for the growth of the plant.
b. Forecast data
This study uses seasonal precipitation reforecasts from SEAS5 for the period 1981–2010. SEAS5 was introduced in the autumn of 2017 replacing S4, which was released in 2011. SEAS5 issues a 7-month forecast, on the fifth of each month at 1200 UTC. The initial date of the forecast is the first of each month. The reforecast data used in this study are total precipitation (TP) in meters of equivalent water in the last 24 h, with a 36-km grid resolution. SEAS5 also consists of a set of retrospective seasonal forecasts for past dates, which are called reforecasts (also sometimes known as hindcasts) that can be compared to the historical record. The SEAS5 reforecast ensemble contains 25 members, while the operational SEAS5 ensemble contains 51 members.
ECMWF’s Integrated Forecasting System (IFS) represents uncertainty both in the initial conditions and in the model physics, the latter by perturbing the physics tendencies. Stochastic kinetic energy backscatter (SKEB) and stochastically perturbed physical tendency (SPPT) schemes are applied to all ensemble members of SEAS5 (Johnson et al. 2019). More information about SEAS5 and its configuration can also be found in Johnson et al. (2019; see also https://www.ecmwf.int/forecasts/documentation-and-support/long-range).
To correct the bias, a suitable dataset of precipitation observations is required. This study uses daily precipitation amounts for the period 1981–2010 from a new high-resolution (0.25°) observational dataset, called the Southeast Asia observations (SA-OBS) dataset (Van den Besselaar et al. 2017). SA-OBS is a high-resolution land-only gridded dataset for the Southeast Asian region for daily precipitation amount along with daily minimum, mean, and maximum temperature (Van den Besselaar et al. 2017). The gridded dataset SA-OBS is part of the Southeast Asian Climate Assessment and Dataset (SACA&D), that contains daily station and gridded observations for Southeast Asia (Van den Besselaar et al. 2017). The gridding method used to create this dataset involves kriging using a geographically independent variogram. This method solves a set of linear equations to minimize the variance of the observations around the interpolating surface. Further information about the gridding methodology and a comparison of the SA-OBS dataset to other observational datasets (both from rain gauges and satellites) can be found in Van den Besselaar et al. (2017). [The SA-OBS dataset can be downloaded from http://sacad.database.bmkg.go.id/download/grid/download.php. It is introduced by Van den Besselaar et al. (2017).] Matching the regular latitude-longitude SA-OBS grid to the ECMWF grid is applied before further analysis. The weighted average of all SA-OBS 0.25° grid cells in each corresponding ECMWF 36-km grid cell is computed. There are 18 × 47 grid cells of SA-OBS and 14 × 40 grid cells of ECMWF. A weighting factor is applied where a weight is given to every single SA-OBS grid cell based on the fraction of it in the corresponding ECMWF grid cell.
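The grid matching can be illustrated with a minimal sketch in which each fine (SA-OBS) cell is weighted by the fraction of its area that falls inside the coarse (ECMWF) cell. The function names and the cell-bounds representation are hypothetical, and cells are treated as flat latitude-longitude rectangles, which ignores the convergence of meridians and is an assumption made for simplicity.

```python
def overlap(a_min, a_max, b_min, b_max):
    """Length of the overlap between two intervals (0 if they do not overlap)."""
    return max(0.0, min(a_max, b_max) - max(a_min, b_min))


def weighted_cell_mean(target, fine_cells):
    """Weighted mean of fine-grid values inside one coarse target cell.

    target is (lon_min, lon_max, lat_min, lat_max); fine_cells is a list of
    ((lon_min, lon_max, lat_min, lat_max), value) pairs. A value of None
    marks a cell without data (for example sea in a land-only dataset).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (lon0, lon1, lat0, lat1), value in fine_cells:
        if value is None:
            continue
        inter = (overlap(lon0, lon1, target[0], target[1])
                 * overlap(lat0, lat1, target[2], target[3]))
        # Weight: fraction of the fine cell that lies inside the target cell.
        weight = inter / ((lon1 - lon0) * (lat1 - lat0))
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        return None            # no fine-grid data inside this target cell
    return weighted_sum / weight_total
```

A fine cell that lies entirely inside the target cell receives weight 1, a cell that is half inside receives weight 0.5, and cells outside the target do not contribute.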
5. Results
a. Rainfall patterns and characteristics over Java
The following factors affect rainfall variability in Indonesia (Aldrian and Susanto 2003): topography, tropical cyclones, meridional (Hadley) circulation, zonal (Walker) circulation, El Niño–Southern Oscillation (ENSO; e.g., Hendon 2003), the Madden–Julian oscillation (MJO; Madden and Julian 1994), and the activity of the monsoon. The monsoon system is a very dominant factor that controls the climate of Java, as Aldrian and Susanto (2003) described. In fact, there are two monsoon systems that have significantly different precipitation characteristics influencing Java. From November to March (NDJFM) the climate in this area is influenced by the northwest monsoon, which leads to rainy conditions. The monthly area-averaged rainfall climatology during this season ranges from about 210 to 330 mm month−1 (Fig. 2a). Figure 2a shows the monthly area-averaged precipitation climatology of 1981–2010 based on the SA-OBS dataset and the SEAS5 reforecasts (1-month lead time) in Java. On the other hand, during the period from May to September (MJJAS), the climate of Java is influenced by the southeast monsoon, which leads to relatively dry conditions. The monthly area-averaged rainfall climatology in the dry period is only between 30 and 120 mm month−1 (Fig. 2a). The graph clearly shows the seasonal variations in rainfall over this area as a result of the monsoon, both in the observations and in the SEAS5 reforecasts, but the latter has a tendency to overestimate the precipitation. This is confirmed by the scatterplot of SA-OBS versus ensemble mean SEAS5 monthly area-averaged precipitation in Fig. 2b.
Figure 3 shows precipitation climatology maps in July, August, and September. The patterns show that the average monthly precipitation of these months is mostly below 125 mm month−1. The southwest part of Java is wetter than the northeast because of its mountainous geography (Fig. 1; Van der Eng 2010).
b. Verification of raw and bias-corrected seasonal precipitation reforecasts
As mentioned in section 2 the EQM method is applied to daily SEAS5 precipitation forecasts in each grid cell of the study area for each lead time, month, and ensemble member separately. Figure 4 shows quantile–quantile (Q–Q) plots of monthly accumulated precipitation of the raw and bias-corrected ensemble mean SEAS5 reforecasts against SA-OBS precipitation. The EQM method ensures that the distribution of SEAS5 matches that of the observation on the daily scale (the scale to which the correction is applied). As mentioned before, we chose to calibrate the daily values so that they can be used as inputs to a hydrological and/or crop model. However, it is clear from Fig. 4 that the correction at the daily scale does not ensure that the quantiles are equivalent on the monthly scale. For the 1-month lead time (Fig. 4a), the bias correction using EQM successfully corrects the raw SEAS5 forecasts for July and August, but for September there is a wet bias for higher values of the BC reforecasts. For the 7-month lead time (Fig. 4b), both the Q–Q plots of the raw and BC reforecasts show that they are generally systematically lower than the observations (dry bias), but the distribution of reforecasts is closer to that of the observations after bias correction for August and September.
Figure 5 presents the CRPSS values (over 95% of the grid cells) of the raw and BC reforecasts, for reforecasts valid in July, August, and September, for the seven lead times. The 95% range of CRPSS values (i.e., the 2.5th–97.5th percentile range of the distribution of CRPSS values) after BC (blue shaded area) for those three months is much smaller than for the raw reforecasts (pink shaded area). In July and August (September) the median CRPSS values of lead times ≤2 (3) months are positive after BC (shown by blue solid lines in Fig. 5), but they are still negative for longer lead times (lead times of 5–7 months for July, 3–7 months for August, and 4–7 months for September).
EQM generally improves CRPSS, with values up to about 0.5 for July at the 1-month lead time. More positive CRPSS values appear after BC for most grid cells, as can also be seen in Fig. 6, which displays the spatial pattern of the CRPSS for that lead time. In the central and eastern parts of Java, most CRPSS values have become positive after BC, but the CRPSS has decreased and become negative in a number of grid cells in the western part of Java. This might be a result of the fact that the dry season in the eastern part of Java is more consistently dry compared to the western part. The western part of Java has higher variability in rainfall during June, July, and August. Sometimes, during the dry season there is still rainfall over this region (Satyawardhana and Susandi 2015).
Figure 7 shows the reliability diagrams of raw and BC SEAS5 reforecasts for events that exceed the 100 mm month−1 threshold in July, August, and September. The diagonal line should be considered as a reference for perfect reliability. In this reliability diagram, the raw SEAS5 curves (red curves in Fig. 7) are located below the perfect reliability line, indicating that the raw SEAS5 overforecasts the event (i.e., forecasts the event with higher probability than the observed frequency). In July and August, the reliability curves are generally below the “no skill” line and exhibit little resolution. In September, the raw forecasts have better reliability.
For those three months the reliability curves of the BC reforecasts (blue curves in Fig. 7) show an improvement over the raw forecasts. The curves mostly lie within the skillful region and display more reliability than the raw forecasts, with very good reliability in July (Fig. 7a). Furthermore, the reliability curves of the BC reforecasts for April–June, and for October and November, also improve over the raw forecasts (not shown).
BSS values (over 95% of the grid cells) of the raw and BC reforecasts as a function of the threshold are shown in Fig. 8a for 1-month lead time in July, August, and September. The 95% range of BSS values after BC (blue shaded area in Fig. 8) is generally smaller than for the raw forecasts (pink shaded area). In July (August) the median BSS values of thresholds less than 200 (100) mm are positive after BC (shown by blue solid lines in Fig. 8), but they are still negative for higher thresholds, some of them even showing worse skill than the raw forecasts (shown by the red solid lines in Fig. 8). However, in September the median BSS values of the BC reforecasts are positive up to high thresholds.
In addition, Fig. 8b shows the BSS for the 100-mm threshold over lead time. In July and September, the median BSS values of the BC reforecasts are positive for the 1- to 4-month lead times, while in August they are positive only for the 1- and 2-month lead times. For the forecasts that are initialized in April (June), the skill of the corrected monthly forecasts for July (September) is positive for most of the grid cells.
c. Potential economic value
Figure 9 shows the PEV of the raw and bias-corrected SEAS5 forecasts for 1- and 7-month lead time for July, August, and September at the 100-mm threshold. Positive PEV indicates that the forecast has more potential economic value than climatology. It is clear from Fig. 9 that the PEV of SEAS5 forecasts varies considerably for users with different cost/loss ratios, with the largest PEV found for low C/L ratios. This is encouraging because the majority of users generally have low cost/loss ratios (i.e., the cost of taking preventative action is generally relatively small and the potential losses are relatively large). It is important to note that all users get more potential value from the bias-corrected SEAS5 forecasts compared to the raw SEAS5 forecasts. We can see that the bias-corrected SEAS5 forecasts have more value than the raw forecasts for a wider range of users at the 1-month lead time in all 3 months. For example, in the 1-month lead-time forecast for August, the raw forecast has more PEV than climatology (positive numbers) for users that have C/L ratios between 0.1 and 0.5, while the bias-corrected forecast provides positive PEV for users with a C/L ratio up to 0.8. This means that more users can make better decisions using the bias-corrected SEAS5 forecast compared to the raw forecast, using a simple cost/loss model.
Comparing the PEV for the 1-month lead time with that for the 7-month lead time, it is clear that the longer the lead time, the lower the potential economic value, as expected. However, it is promising to see that there is positive PEV even at the longest lead time (7 months), and that there is still an increase in PEV after BC for C/L ratios around the climatological frequency. At the longest lead time, the bias-corrected forecasts are still able to provide positive PEV for a wider range of users, although the associated range of C/L ratios is narrower than for the shortest lead time. In Fig. 9, the 7-month forecast has less value than the 1-month forecast up to relatively high C/L ratios, but its PEV becomes zero or only slightly negative for the highest ratios. This is in contrast to the 1-month forecast, which has negative PEV for higher C/L ratios. Because the 7-month forecast has no skill, its PEV is closer to that of climatology. The 1-month forecast, on the other hand, has skill because of reliable and sharper probability forecasts, but this apparently leads to negative PEV for higher C/L ratios.
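The dependence of PEV on the C/L ratio can be illustrated with a minimal Python sketch of the standard static cost/loss model. It assumes that the forecast has already been reduced to a yes/no decision with a known hit rate and false alarm rate, and that all expenses are expressed as a fraction of the loss L. The function name and arguments are hypothetical, and the sketch is not the code used to produce Fig. 9.

```python
def potential_economic_value(hit_rate, false_alarm_rate, base_rate, cost_loss_ratio):
    """Relative economic value of a yes/no forecast in the static cost/loss model.

    hit_rate         : fraction of observed events that were forecast
    false_alarm_rate : fraction of non-events for which the event was forecast
    base_rate        : climatological frequency of the event (0 < base_rate < 1)
    cost_loss_ratio  : cost of protective action divided by the loss (0 < C/L < 1)
    """
    if not (0.0 < base_rate < 1.0 and 0.0 < cost_loss_ratio < 1.0):
        raise ValueError("base_rate and cost_loss_ratio must lie strictly between 0 and 1")

    a, s = cost_loss_ratio, base_rate

    # Climatology: either always protect (expense a) or never protect (expense s).
    expense_climate = min(a, s)
    # Perfect forecast: protect only when the event occurs.
    expense_perfect = s * a
    # Actual forecast: pay the cost for false alarms and hits, the loss for misses.
    expense_forecast = (
        false_alarm_rate * (1.0 - s) * a
        + hit_rate * s * a
        + (1.0 - hit_rate) * s
    )

    # 1 for a perfect forecast, 0 for no gain over climatology, negative if worse.
    return (expense_climate - expense_forecast) / (expense_climate - expense_perfect)
```

In this model the value peaks where the C/L ratio equals the climatological frequency, at which point it reduces to the hit rate minus the false alarm rate. A forecast with a nonzero false alarm rate can therefore still yield negative value for users with high C/L ratios, for whom each unnecessary action is expensive relative to the loss avoided.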
6. Discussion and conclusions
The results in this study show the skill and potential economic value of SEAS5 raw and bias-corrected precipitation reforecasts for July–September in Java. For the dry season (May–November), the bias-corrected SEAS5 forecasts show positive CRPSS values (with climatology as a reference) for most grid cells at the 1-month lead time and, for July–October, also at the 2-month lead time. In July and September, many grid cells are skillful up to the 3-month lead time (Fig. 5). For the 1-month lead time, the CRPSS values of the bias-corrected forecasts are positive for most of the grid cells located in the central and eastern parts of Java (Fig. 6). The western part of Java shows less positive or even negative CRPSS values, probably because of its high rainfall variability. This rainfall variability is caused by several factors, including topography. Figure 1 shows that the topography of western Java varies, with lowland plains and beaches in the north and highland plains and a mountain range in the south. Note that neither the raw nor the bias-corrected SEAS5 forecasts are skillful compared to climatology for most of the wet season (December–April; not shown in this paper).
The BSS values of the bias-corrected SEAS5 forecasts are lower in August than in July and September, possibly because August is usually the peak of the dry season. In July and August at the threshold of 50 mm, the median BSS reaches its maximum value (nearly 0.5 in July) and the range of BSS values is almost entirely positive. The bias-corrected forecasts still show negative BSS for some grid cells, especially at the higher thresholds, but the BSS is generally higher than for the raw forecasts. In September the threshold with a range of almost entirely positive BSS values is close to 100 mm, where the median BSS is nearly 0.3.
In addition, BC by EQM is not only able to improve forecast skill (as defined by the BSS and CRPSS), but also enhances the potential economic value of the SEAS5 forecasts, especially in the dry season, which is an agriculturally important season. It is shown in Fig. 9 that the corrected SEAS5 forecasts have a higher PEV than the raw forecasts for all ranges of users; or, in other words, the bias-corrected SEAS5 forecasts are more valuable than the raw forecasts, even for the longest lead time of 7 months. During the peak of the wet season (December–February), however, the bias-corrected SEAS5 forecasts cannot improve upon the negative PEV of the raw forecasts (not shown).
BC using the EQM method has some limitations. For example, it cannot efficiently correct overconfidence in the raw ensemble forecasts (Zhao et al. 2017), and downscaling using quantile mapping can produce outputs that have larger variance than observations when the outputs are rescaled to the original resolution (Maraun 2013). In addition, forecasts with both positive and negative skill are generated because EQM does not consider the strength of the correlation between forecasts and observations (Zhao et al. 2017). Maraun (2013) also mentioned another limitation of quantile mapping: it does not introduce any small-scale variability. This means that the temporal structure is still that of the grid box and not of the local scale. Despite these limitations, a number of studies (e.g., Piani et al. 2010; Mehrotra and Sharma 2016) have shown that quantile mapping is effective for correcting the bias of climate projections. Crochemore et al. (2016) mention that quantile mapping can be useful if the raw forecasts are only biased, and neither under- nor overdispersed. A benefit of the EQM method is that it makes no prior assumption about the shape of the probability density function. As a result, the distribution of the corrected values is identical to that of the observed data for the training period.
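The mapping underlying EQM can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions and not the downscaleR implementation used in this study: a forecast value is assigned its rank within the training forecasts and replaced by the observed value at the same relative rank, values between two training forecasts take the rank of the lower one, and values outside the training range are mapped to the smallest or largest observation. The function name and arguments are hypothetical.

```python
from bisect import bisect_right


def eqm_correct(value, train_forecasts, train_observations):
    """Map one forecast value onto the observed distribution by matching ranks.

    value              : raw forecast value to correct (e.g., rainfall in mm)
    train_forecasts    : raw forecast values from the training period
    train_observations : observed values from the training period
    """
    fc_sorted = sorted(train_forecasts)
    obs_sorted = sorted(train_observations)
    n, m = len(fc_sorted), len(obs_sorted)
    if n < 2 or m < 1:
        raise ValueError("need at least two training forecasts and one observation")

    # Number of training forecasts that are less than or equal to the value.
    rank = bisect_right(fc_sorted, value)

    # Same relative position in the sorted observations, clamped to the
    # observed range (assumed constant extrapolation beyond the training data).
    position = (rank - 1) * (m - 1) / (n - 1)
    position = min(max(position, 0.0), float(m - 1))

    # Linear interpolation between the two neighboring observed values.
    lower = int(position)
    upper = min(lower + 1, m - 1)
    weight = position - lower
    return obs_sorted[lower] * (1.0 - weight) + obs_sorted[upper] * weight
```

When the training forecasts and observations have the same length and contain no ties, applying this function to the training forecasts themselves returns exactly the set of observed values, which illustrates why the corrected distribution matches the observed one over the training period. Because only ranks are matched, the sketch also shows why the method cannot take the forecast–observation correlation into account.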
Besides the limitations of the method, the reader should keep in mind that there is also uncertainty in the SA-OBS data (Van den Besselaar et al. 2017). Sources of the observational uncertainty stem from uncertainty in the data quality, variability in the number of stations, the interpolation technique, and uncertainty in the metadata, as it comes from different sources. Also, because of the variations in topography and the limited number of stations the large-scale patterns are probably well defined, but small-scale features will likely be more uncertain.
Finally, although every statistical postprocessing method is limited in its capacity to correct systematic errors in ensemble forecast data, statistical postprocessing is a computationally cheap and effective way to improve forecast skill. In a future study, more advanced statistical postprocessing methods will be applied to further improve seasonal forecast skill.
This study was part of the G4INDO project, which was mostly funded by the Netherlands Space Office (NSO). The authors are grateful to the Santander met group for developing and maintaining the downscaleR package (https://github.com/SantanderMetGroup/downscaleR), and to the three anonymous reviewers and Albert Klein Tank (WUR) for their comments on an earlier version of the manuscript, which have helped to improve it. We also thank Gerard van der Schrier (KNMI) and Robi Muharsyah (BMKG) for assistance with the observational data and Folmer Krikken (KNMI) for the geopotential ECMWF model data.