Hybrid Dynamical–Statistical Forecasts of the Risk of Rainfall in Southeast Asia Dependent on Equatorial Waves

Equatorial waves are a major driver of widespread convection in Southeast Asia and the tropics more widely, a region in which accurate heavy rainfall forecasts are still a challenge. Conditioning rainfall over land on local equatorial wave phases finds that heavy rainfall can be between 2 and 4 times more likely to occur in Indonesia, Malaysia, Vietnam, and the Philippines. Equatorial waves are identified in a global numerical weather prediction ensemble forecast [Met Office Global and Regional Ensemble Prediction System (MOGREPS-G)]. Skill in the ensemble forecast of wave activity is highly dependent on region and time of year, although generally forecasts of equatorial Rossby waves and westward-moving mixed Rossby-gravity waves are substantially more skillful than for the eastward-moving Kelvin wave. The observed statistical relationship between wave phases and rainfall is combined with ensemble forecasts of dynamical wave fields to construct hybrid dynamical-statistical forecasts of rainfall probability using a Bayesian approach. The Brier skill score is used to assess the skill of forecasts of rainfall probability. Skill in the hybrid forecasts can exceed that of probabilistic rainfall forecasts taken directly from MOGREPS-G and can be linked to both the skill in forecasts of wave activity and the relationship between equatorial waves and heavy rainfall in the relevant region. The results show that there is potential for improvements of forecasts of high-impact weather using this method as forecasts of large-scale waves improve.


Introduction
High-impact weather across Southeast Asia is primarily related to heavy precipitation, particularly in areas vulnerable to flooding or landslides, and where the exposure of the population is great. Precipitation in Southeast Asia is a challenging field to forecast. It exhibits high variability on small scales due to deep convection, and is linked to numerous dynamical phenomena on a variety of temporal and spatial scales, such as the Madden-Julian oscillation, equatorial waves, Borneo vortices, cold surges, and tropical cyclones among others (e.g., Chang et al. 2005; Juneng et al. 2007; Tangang et al. 2008; Xavier et al. 2014; van der Linden et al. 2016; Ferrett et al. 2020). Furthermore, forecasts of precipitation depend on convection parameterization in global numerical weather prediction (NWP) models or partially resolved dynamics in km-grid models. It is well known that NWP models have issues representing convection in the Maritime Continent (Love et al. 2011; Birch et al. 2016; Johnson et al. 2016) and the tropics more widely (Vogel et al. 2020). This means that forecasting high-impact weather in Southeast Asia is a significant challenge. However, larger-scale systems (exceeding 2000 km) in the tropics, such as equatorial waves, may be predicted at much longer lead times than smaller-scale systems or systems at higher latitudes (Ying and Zhang 2017; Judt 2020).
It is known that the risk of heavy precipitation across Southeast Asia is dependent on the passage of different equatorial wave modes over the region. In some regions of Malaysia and Indonesia extreme rainfall is up to 4 times more likely in a particular phase of equatorial waves, when they have high amplitude (Ferrett et al. 2020). Theoretical structures of equatorial waves can be identified in observations (e.g., Wheeler et al. 2000; Yang et al. 2003; Roundy and Frank 2004; Kiladis et al. 2009; Knippertz et al. 2022). Recently a methodology was developed to identify equatorial waves in real time in order to quantify wave activity in global NWP forecasts (Yang et al. 2021). This method does not require future data for frequency filtering as many previous methods do. The study found that, in the Met Office global deterministic operational NWP forecasts, westward-moving equatorial waves tend to be relatively well forecast in Southeast Asia in terms of amplitude and phase. Forecasts of Kelvin wave activity showed more error at longer lead times, often having amplitudes that decay with time into the forecast. Nonetheless, there appeared to be skill at earlier lead times (up to 4 days), suggesting there is potential for forecasts of precipitation based on the relationship between equatorial waves and convection, and not just on simulated rainfall alone, which is highly dependent on the model representation of convection. This idea is supported by Vogel et al. (2021), who construct a statistical forecast that outperforms global ensemble predictions in Africa as a result of westward-moving waves that are linked to increased rainfall in the region (Schlueter et al. 2019). Our study proposes a hybrid dynamical-statistical forecast method that incorporates the relationship between equatorial waves and rainfall and is tested against NWP ensemble predictions of Southeast Asia rainfall.
The method in this study is inspired by the Bayesian approach introduced in Cafaro et al. (2019), to incorporate the known relationship between equatorial wave activity and high-impact weather in Southeast Asia into forecasts of the probability of heavy rainfall. The hybrid dynamical-statistical forecasting technique is based on the premise that global NWP models should have substantial skill in forecasting large-scale equatorial wave propagation and growth. If dynamical forecasts of the waves are accurate enough, and their observed statistical relationship with heavy rainfall is robust enough, then there should theoretically be skill in probability forecasts of rainfall conditional on the forecast phase and amplitude of equatorial waves. The primary aim of this study is to derive the hybrid model and to establish whether the hybrid technique shows promise for prediction, including identifying the geographic locations and times of year at which the hybrid forecasts demonstrate most skill using a current operational forecast system. The hybrid forecast methodology is applied to the Met Office Global and Regional Ensemble Prediction System (MOGREPS-G) to demonstrate its use. Hybrid forecasts of rainfall risk are produced and the skill of the hybrid forecasts is compared to that of the rainfall probabilities taken directly from MOGREPS-G simulated rainfall fields.
The approach also illustrates the potential predictability of rainfall coupled to equatorial waves.
Section 2 describes the data used to build the hybrid forecast. The steps to construct the hybrid forecast are described in section 3. Results are described in section 4, including examination of the representation of equatorial wave activity in MOGREPS-G as well as comparison of hybrid forecasts of rainfall probability with NWP forecasts of rainfall probability from MOGREPS-G.

a. GPM-IMERG
Daily precipitation is taken from The Integrated Multi-satellitE Retrievals for GPM (GPM-IMERG) dataset (Huffman et al. 2019). The product used is Level 3 daily Final Run Precipitation. This combines precipitation estimates from GPM constellation satellites (see https://gpm.nasa.gov/missions/GPM/constellation) and Global Precipitation Climatology Centre (GPCC) precipitation rain gauges. Precipitation estimates from passive microwave radiometers are combined with infrared data from geostationary weather satellites. Monthly GPCC gauge accumulation analyses are used to reduce biases in the multisatellite monthly averages where available. Rainfall over the period 2001-18 is used. Two versions of spatial resolution of GPM-IMERG data are used. The first is on the native 0.1° × 0.1° horizontal grid. The second is interpolated onto a 1° × 1° grid prior to analysis. Both versions are used in order to compare how much information is preserved by using higher-resolution data in the construction of the hybrid forecast. Rainfall from GPM-IMERG is used to calculate the climatology forecast for comparison, which is based on the climatological probability of the defined heavy rainfall threshold (at least 15% of grid points in an area exceeding the local gridpoint 95th percentile). Heavy rainfall events are calculated over the boxed regions shown in Fig. 1. Studies have shown that rainfall in Southeast Asia, such as in the Philippines and Singapore, is relatively well represented by GPM-IMERG (Sunilkumar et al. 2019; Tan et al. 2019; Tan and Santo 2018), and is consistent with radar and rain gauge estimates of rainfall in Malaysia (up to the 95th percentile in hourly rainfall accumulation on the 0.1° IMERG grid, although estimates are not as good for higher extremes) (Da Silva et al. 2021).

b. MOGREPS-G
MOGREPS-G (Bowler et al. 2008) data used in this study come from The International Grand Global Ensemble (TIGGE; Bougeault et al. 2010; Swinbank et al. 2016). Output from MOGREPS-G is used in two separate ways. The first is using percentile thresholds calculated from the simulated surface precipitation (daily accumulations) in this data to identify "heavy precipitation events," in a similar fashion to the use of the GPM-IMERG observation data. This information is the basis of the simulated forecasts of rainfall probability. The second use is to project forecast winds onto the equatorial wave basis in order to find wave amplitude and phase as input to the hybrid dynamical-statistical forecast method.
Daily initializations (at 0000 UTC) are used for four years between 2015 and 2018. Earlier dates have 11 ensemble members, increasing to 17 ensemble members during June 2017. Forecasts have a native temporal resolution of 6 h up to lead times of T + 168 h with a horizontal grid spacing of 33 km and 70 vertical levels. It should be noted that native resolution varies with initialization time as MOGREPS-G is updated over the years, so earlier forecasts are lower resolution. To identify wave activity, separate wave horizontal wind and geopotential height fields are produced from MOGREPS-G horizontal wind fields and geopotential height at 850 hPa. The fields are space and time filtered and projected onto theoretical wave mode structures for Kelvin, $n = 1$ Rossby, and westward-moving mixed Rossby-gravity waves following the method described by Yang et al. (2003, 2021).
Note that due to the time filtering employed in the equatorial wave diagnosis, the days approaching the end of the dataset can become distorted in a period of up to half of the longest period of the filter band, because of a lack of future data, as described by Wheeler and Weickmann (2001). Yang et al. (2021) examine the effect of this distortion on the accuracy of the "real-time" wave analysis used in this study. This method uses forecast data at the end of the dataset and therefore has no buffer there that minimizes distortion. They find that the distortion only has a significant impact on wave amplitude and phase of the final day of 7-day-long forecasts, with larger error relative to a wave analysis that does have a buffer (see their Fig. 11). They therefore conclude that the previous 6 days of forecast are suitably accurate. Because of this, results here are only presented to T + 144 h. Data are interpolated onto a consistent 1° × 1° grid prior to wave identification and precipitation analysis.

c. ERA5 equatorial waves
For the observed relationship between equatorial waves and heavy rainfall, a dataset of equatorial waves identified from ECMWF Reanalysis 5 (ERA5; Hersbach et al. 2020) 850-hPa horizontal winds and geopotential height over 2001-14 is used. This can be viewed as an upgrade of the analysis of Ferrett et al. (2020) that used waves identified from the earlier ERA-Interim. The method for production of this dataset is described in Yang et al. (2003) as above.

a. Characterizing equatorial wave amplitude and phase
For each equatorial wave type, local phase-space coordinates are defined such that waves propagate through phase quadrants labeled in ascending order, first defined in Yang et al. (2021). For each wave type [Kelvin, $n = 1$ Rossby (R1), and westward-moving mixed Rossby-gravity (WMRG) waves], coordinates are defined from zonal wind $u$, meridional wind $v$, or zonal wind horizontal divergence $\partial u/\partial x$, all evaluated on the 850-hPa level by projecting the full horizontal winds and geopotential height onto the equatorial wave basis. For the definition of wave-phase variables, the projected wind components are averaged over a 5° longitude band centered on each region box in Fig. 1. The $u$, $v$, and $\partial u/\partial x$ obtained are used to define two phase-space variables, $X$ and $Y$, each normalized by its standard deviation. The variables selected to define the phase-space coordinates for each wave are defined in Table 1. Wave amplitude is then defined as $A = \sqrt{X^2 + Y^2}$. By analogy with the propagation of Madden-Julian oscillation (MJO) phase in the familiar Wheeler-Hendon MJO phase diagram (Wheeler and Hendon 2004), eastward-moving waves (Kelvin) are designed to propagate anticlockwise on the wave phase-space diagram. However, there is a fundamental difference. The MJO phases refer to the position of the convectively active sector of the MJO falling at different longitudes. In contrast, the phase in our diagrams refers to the passage of the wave with time at a fixed longitude. Phases are constructed such that westward-propagating waves progress clockwise around the phase-space diagram. The latitudes at which oscillations in different variables are used to define the phase of different wave modes are also varied due to the differing structures (Fig. 3).
The phase space for each wave is divided into 12 sectors. There are three amplitude bands defined by $A < 1$, $1 < A < 2$, and $A > 2$ corresponding to weak, stronger than average, and very strong waves. The wave phase $\alpha$ is the anticlockwise angle from the horizontal and is defined by $\tan(\alpha) = Y/X$ and then [...]

TABLE 1. Definitions of the local wave phase space coordinates and the latitudes used in their evaluation. Maximum of $Y$ is one-quarter of a wavelength to the west of maximum $X$ in each wave (see Fig. 3); $X$ and $Y$ are normalized by their standard deviation and are averaged over a 5° longitude window, centered on the relevant region.

Figure 2a shows an example of a Kelvin wave propagating in the wave phase space both for the MetUM analysis (equivalent to the "observed" wave; black line) and MOGREPS-G (gray lines). MOGREPS-G forecast initialization is shown by the red star. Points further from the center indicate a higher amplitude wave. As mentioned previously, the variables $X$ and $Y$ are defined such that the eastward-moving Kelvin wave propagates anticlockwise, as is also the convention with the familiar MJO Wheeler-Hendon diagram.
For the westward-propagating waves, the variables $X$ and $Y$ have been chosen so that the wave trajectory moves clockwise in the phase diagram. Therefore, the quadrant labels 2 and 4 are swapped so that waves are expected to progress through the quadrants in ascending order. Figures 2b and 2c show examples of R1 and WMRG waves propagating in the wave phase space.
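The mapping from phase-space coordinates to a discrete wave state can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `wave_state`, the choice that quadrant 1 spans angles from 0° up to 90°, and the treatment of amplitudes exactly equal to 1 or 2 are assumptions made here, since the text does not fix them.

```python
import math

def wave_state(x, y, westward=False):
    """Return (amplitude, band, quadrant) from normalized coordinates X and Y.

    Hypothetical helper: band is 1 (A < 1), 2 (1 <= A < 2) or 3 (A >= 2);
    the handling of exact ties at 1 and 2 is an assumption.
    """
    amplitude = math.hypot(x, y)                    # A = sqrt(X^2 + Y^2)
    band = 1 if amplitude < 1 else (2 if amplitude < 2 else 3)
    angle = math.atan2(y, x) % (2 * math.pi)        # anticlockwise angle from the +X axis
    # ASSUMPTION: quadrants are 90-degree sectors, with quadrant 1 starting at angle 0.
    quadrant = int(angle // (math.pi / 2)) % 4 + 1
    if westward and quadrant in (2, 4):
        quadrant = 6 - quadrant                     # swap labels 2 and 4 for clockwise waves
    return amplitude, band, quadrant
```

With the swap applied, a westward-moving wave that travels clockwise around the diagram visits the quadrant labels in ascending order, as an anticlockwise Kelvin wave does without the swap.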
Composites of 850-hPa wave wind during each phase of the waves are shown in Fig. 3. [...] 4c,d) and also northern Australia in DJF. Note that in JJA there is only anomalously heavy rainfall in the Northern Hemisphere, maximizing in the cyclonic vorticity phase of R1. Ferrett et al. (2020) showed that this is dominated by rainfall within tropical cyclones that are associated with a strong projection on R1 wave structures.
Very strong WMRG northerly winds ($Q = 3$; $A > 2$) and positive vorticity ($Q = 4$; $A > 2$) are associated with increased rainfall over regions of Malaysia and Indonesia in JJA, especially in the Southern Hemisphere associated with convergence in the wave (Fig. 4f). In the WMRG example for DJF (Fig. 4e) rainfall is strongest in the Northern Hemisphere where the winds are southeasterly and vorticity is positive (cyclonic).
These relationships between rainfall and equatorial waves can potentially be used to improve forecasts of rainfall in these regions, as is detailed in the next section.

b. Design of the hybrid dynamical-statistical forecast methodology
The hybrid forecast approach is based on forecasts of equatorial wave amplitude $A$ and phase quadrant $Q$. For shorthand, wave states are defined as

$$W = W(A, Q). \tag{1}$$

The wave state is identified in each ensemble forecast member, at different lead times, at different longitudes. Note that the waves are identified by projection onto an orthogonal basis set of meridional structure functions and therefore the wave amplitude and phase are a function of time, longitude, and pressure level, for each wave type. Attention is focused on winds at 850 hPa, due to the stronger connection with high-impact weather, although in principle the projection could be performed independently at any level.
Heavy precipitation events are defined in terms of precipitation rate above threshold over a sufficiently large fraction of the area of the forecast evaluation region. Following Ferrett et al. (2020), the critical threshold for heavy precipitation is above the 95th percentile of the daily precipitation climatology, locally for each grid box in the GPM-IMERG data. The forecast evaluation regions are defined in Fig. 1. Precipitation above threshold is considered for land points only in each region. An "event" occurs if the fraction of land points in a region with precipitation above threshold exceeds 15%. This occurs only 8%-18% of the time depending on the region and season as shown in Table 2 (4%-11% for native resolution GPM-IMERG; not shown).
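The event definition above can be illustrated with a short NumPy sketch. It is not the authors' implementation: the function name `heavy_rain_events` and its arguments are hypothetical, and it assumes that the 95th percentile is taken over all days in the array supplied for each grid box, whereas in practice the thresholds may be computed from a separate climatological or seasonal record.

```python
import numpy as np

def heavy_rain_events(precip, land_mask, pct=95.0, area_fraction=0.15):
    """Flag days with a heavy precipitation event in one evaluation region.

    precip:    array (time, lat, lon) of daily rainfall for the region box
    land_mask: boolean array (lat, lon), True over land
    Returns a boolean array (time,) that is True on event days.
    """
    land = precip[:, land_mask]                     # keep land points: (time, n_land)
    # ASSUMPTION: local thresholds come from the same record, one per grid box.
    thresholds = np.percentile(land, pct, axis=0)
    exceed = land > thresholds                      # grid boxes above their own threshold
    fraction = exceed.mean(axis=1)                  # fraction of land points exceeding
    return fraction > area_fraction                 # event if more than 15% of land points
```

Applying the same function to model precipitation, with thresholds taken from the model's own climatology, gives the corresponding event indicator for each ensemble member.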
The hybrid forecast method is now defined following the Bayesian technique of Cafaro et al. (2019). Event occurrence is defined using the indicator variable:

$$\theta_{n,i} = \begin{cases} 1 & \text{if a heavy precipitation event occurs,} \\ 0 & \text{otherwise,} \end{cases} \tag{2}$$

where the subscripts denote validation timepoint $n$ and evaluation region $i$. A composite climatology of precipitation events conditional on wave state is then obtained using ERA5 to calculate wave amplitude and phase, for each region, and GPM-IMERG to define event occurrence. The climatological statistics yield the counts of heavy precipitation events, from all days in the training record, falling into each segment of wave phase space:

$$f(W \mid \theta_i = 1), \tag{3}$$

and the complementary counts of days with "no event" in wave phase space, $f(W \mid \theta_i = 0)$. Normalized frequencies are obtained from the counts by dividing by the total number of days in the record so that the sum of $f$ over the quadrants and amplitude bins in phase space equals unity. Note that phase space $W(A, Q)$ is discretized such that $A = 1, 2, 3$ and $Q = 1, 2, 3, 4$. Bayes's rule is then used to derive a formula for the conditional probability of event occurrence given the wave state:

$$p(\theta_i = 1 \mid W) = \frac{f(W \mid \theta_i = 1)\,P}{f(W \mid \theta_i = 1)\,P + f(W \mid \theta_i = 0)\,(1 - P)}, \tag{4}$$

where the prior probability of event occurrence, $P = p(\theta_i = 1)$, is calculated simply from the fraction of days with an event defined for the region. The unconditioned probability of no event is $p(\theta_i = 0) = 1 - P$. Given forecasts of wave state $W$ from a dynamical model, the hybrid forecast of risk of heavy precipitation is obtained simply by reading the corresponding value of the conditional probability from the static distribution $p(\theta_i = 1 \mid W)$ that has been compiled from climatological statistics of observations in the data training period (2001-14).
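A minimal sketch of the Bayesian step described above is given here for one region. The function name and array layout are hypothetical. It assumes that the event and no-event frequency tables are each normalized to sum to one over the 12 phase-space sectors, in which case the result reduces to the fraction of training days in each sector on which an event occurred. Sectors never visited in the training record are returned as NaN, which is a choice made for this illustration.

```python
import numpy as np

def conditional_event_probability(bands, quadrants, events):
    """Posterior probability of an event for each wave state (3 bands x 4 quadrants).

    bands, quadrants: integer arrays over training days (1-3 and 1-4)
    events:           boolean array over training days (event indicator)
    Returns (posterior, prior), where posterior has shape (3, 4).
    """
    bands = np.asarray(bands)
    quadrants = np.asarray(quadrants)
    events = np.asarray(events, dtype=bool)
    n_event = np.zeros((3, 4))
    n_none = np.zeros((3, 4))
    # Count event and no-event days falling in each sector of phase space.
    np.add.at(n_event, (bands[events] - 1, quadrants[events] - 1), 1)
    np.add.at(n_none, (bands[~events] - 1, quadrants[~events] - 1), 1)
    prior = events.mean()                           # P, the climatological event probability
    with np.errstate(invalid="ignore", divide="ignore"):
        f_event = n_event / n_event.sum()           # ASSUMPTION: each table sums to one
        f_none = n_none / n_none.sum()
        numerator = f_event * prior
        evidence = numerator + f_none * (1.0 - prior)
        posterior = np.where(evidence > 0, numerator / evidence, np.nan)
    return posterior, prior
```

Dividing `posterior` by `prior` gives the amplification of risk relative to climatology for each wave state.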

c. Constructing precipitation probability forecasts with the hybrid method
MOGREPS-G forecasts of wave states (test period 2015-18) are used to construct the Bayesian forecast rainfall probability:

$$\mathrm{BAY}_{s,l,i} = \frac{1}{M} \sum_{m=1}^{M} p(\theta_i = 1 \mid W_{s,l,m,i}), \tag{5}$$

where $i$ labels the target region, $s$ indicates the forecast start date, $l$ indicates a forecast lead time, $m$ labels ensemble members, and $M$ is the number of ensemble members. $\mathrm{BAY}_{s,l,i}$ represents the ensemble average of conditional rainfall probabilities obtained from knowledge of wave state in each forecast member $W_{s,l,m,i}$.
The conditional rainfall probability forecasts can then be evaluated relative to the direct forecast of heavy precipitation events with the event indicator $F_{s,l,m,i}$ from the simulated MOGREPS-G precipitation fields:

$$\mathrm{ENS}_{s,l,i} = \frac{1}{M} \sum_{m=1}^{M} F_{s,l,m,i}, \tag{6}$$

where the sum is over the $M$ ensemble members, using the same heavy precipitation event definition, but based on percentiles of MOGREPS-G precipitation rates.
In words, the same methodology as used to define a heavy precipitation event using GPM-IMERG data is applied to forecast precipitation data. At any given lead time, each member will indicate "event" or "no event" (for region $i$), which translates into a one or zero for $F_{s,l,m,i}$. The ensemble mean of this binary variable gives the direct probability forecast for the event. Note that the precipitation threshold is based on percentiles of the model climatology, allowing for model intensity bias.
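The two ensemble-mean probabilities described above can be sketched as follows for a single start date, lead time, and region. The function names and the (3, 4) layout of the conditional probability table are assumptions for illustration.

```python
import numpy as np

def hybrid_probability(posterior, member_bands, member_quadrants):
    """Hybrid forecast: ensemble mean of the conditional probability
    looked up from each member's forecast wave state.

    posterior: array (3, 4) of event probability given amplitude band and quadrant
    member_bands, member_quadrants: integer arrays (n_members,), values 1-3 and 1-4
    """
    b = np.asarray(member_bands) - 1
    q = np.asarray(member_quadrants) - 1
    return float(np.mean(posterior[b, q]))          # one table lookup per member

def simulated_probability(member_events):
    """Simulated forecast: ensemble mean of the binary event indicator
    derived from each member's own precipitation field."""
    return float(np.mean(np.asarray(member_events, dtype=float)))
```

Both functions return a probability between 0 and 1, so they can be scored against the same observed events with the same metric.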
For the remaining text, $\mathrm{BAY}_{s,l,i}$ will be referred to as the "hybrid" forecast probability of rainfall events and $\mathrm{ENS}_{s,l,i}$ will be referred to as the "simulated" forecast probability of rainfall. These are compared using the Brier skill score (BSS; Brier 1950; Glahn and Jorgensen 1970), a metric that quantifies the skill of a forecast against a climatology forecast, in order to assess the skill of the hybrid forecasts and the simulated forecasts. To calculate the BSS the Brier score (BS) is first calculated as

$$\mathrm{BS} = \frac{1}{N} \sum_{t=1}^{N} (f_{t,l} - o_{t,l})^2, \tag{7}$$

where $t$ is the forecast initialization time, $l$ is the forecast lead time and $N$ is the total number of forecasts performed. The term $f$ is the forecast probability of the event and $o$ is the observed event. The BSS is defined as

$$\mathrm{BSS} = 1 - \frac{\mathrm{BS}}{\mathrm{BS}_{\mathrm{clim}}}, \tag{8}$$

where BS is the Brier score of the forecast to be tested and $\mathrm{BS}_{\mathrm{clim}}$ is the Brier score of the forecast to be tested against, in this case, a climatological probability of event occurrence. Two hybrid forecasts are constructed:
1) Using interpolated 1° × 1° GPM-IMERG data at matching resolution for direct comparison to the MOGREPS-G simulated precipitation data.
2) Using native 0.1° × 0.1° GPM-IMERG resolution data to assess if higher-resolution observations affect hybrid forecast skill.
The first is the focus of this study but the second is shown in the skill analyses for completeness. For the BSS calculation both sets of hybrid forecasts are compared to a climatological forecast obtained at the corresponding resolution from the GPM-IMERG data.
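The skill calculation can be illustrated with the short sketch below, evaluated at one lead time over a set of forecasts. The function names are hypothetical, and the reference forecast is taken to be a single constant climatological probability, which is an assumption about how the climatology forecast is applied.

```python
import numpy as np

def brier_score(forecast_prob, observed):
    """Mean squared difference between forecast probability and observed outcome (0 or 1)."""
    f = np.asarray(forecast_prob, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.mean((f - o) ** 2))

def brier_skill_score(forecast_prob, observed, clim_prob):
    """Skill relative to a constant climatological probability forecast.

    Positive values mean the forecast beats climatology; 1 is a perfect forecast.
    """
    o = np.asarray(observed, dtype=float)
    bs = brier_score(forecast_prob, o)
    bs_clim = brier_score(np.full(o.shape, clim_prob), o)
    return 1.0 - bs / bs_clim
```

The same function can be applied to the hybrid and the simulated probabilities in turn, so that both are measured against the same climatological reference.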
A number of sensitivity tests have been performed on the hybrid methodology and are not shown in figures. The first relates to the length of the data training period. Results presented here use the maximum available data (2001-14) but the method has also been tested using a 5-yr training period (2010-14) and a 10-yr training period. We find that varying the length of the training period does not have a large impact on hybrid forecast skill since the posterior probability used to construct the hybrid forecast is fairly robust, even for shorter time periods. Second, the number of ensemble members used is varied. As with the training period, we find little impact on hybrid forecast skill but more on simulated forecast skill. This is a result of there being more ensemble agreement on wave phase and amplitude than on rainfall amounts, meaning the hybrid forecast is less sensitive to a reduction in the number of ensemble members.

a. The representation of equatorial waves in MOGREPS-G
The hybrid method relies on the dynamical model representation of equatorial waves, as opposed to numerical simulation of rainfall. It is therefore important that MOGREPS-G captures wave behavior to enable skill in the hybrid forecast of rainfall. Table 3 shows the number of days lead time for which an aggregated measure of skill (BSS) results in the MOGREPS-G forecasts of wave state $W(A, Q)$ outperforming a forecast based on the climatological probability of wave occurrence for each region. Wave skill is assessed at the [...]

For the Kelvin wave, BSS does not indicate MOGREPS-G skill over a climatology forecast longer than 2 days in any region, and in many regions never shows skill. Previous work examining the Kelvin wave in the deterministic MetUM global forecast found that the Kelvin wave was not very well forecast in terms of amplitude, even at fairly short lead times (Yang et al. 2021), so this result is somewhat expected. However, in the case of the Kelvin wave it is the very strong convergence phase that is most strongly linked to enhanced rainfall in Southeast Asia (e.g., Ferrett et al. 2020) and so the skill of this phase only (when amplitude is very strong; $A > 2$) is also shown in Table 3 and suggests that the occurrence of this phase can be fairly well forecast in some regions. In DJF, there is limited skill for regions where rainfall is most strongly linked to the Kelvin wave, such as Indonesia and Malaysia, in keeping with the all-phase result. However, in JJA there is skill up to 6 days following forecast initialization in South Indonesia (containing Java), North Indonesia (containing Borneo), and East Indonesia (containing Sulawesi), suggesting that there may be potential for predictability of rainfall conditional on Kelvin wave convergence in JJA in these regions.
There is some skill in forecasts of R1 and WMRG waves exceeding 3 days in at least one season in almost all regions. The regions most affected by the westward-moving waves are those at higher latitudes, namely, the Philippines, Vietnam, Thailand, and Malaysia. DJF forecasts have particularly high skill, with maximum skillful lead times in DJF often exceeding those in JJA. This suggests that hybrid forecasts of rainfall associated with westward-moving waves in DJF for these regions are likely to show the most skill.

b. Hybrid forecasts for selected wave modes, regions, and seasons
Skill of rainfall probability forecasts has been calculated using BSS for all permutations of region, wave type and time of year (either DJF or JJA). However, here the focus is only on those cases in which wave forecasts are fairly skillful (see Table 3) and therefore have the potential to improve heavy rainfall predictions. Results will therefore focus mostly on the following cases:
• The role of the Kelvin wave in JJA for Indonesia rainfall
• The role of R1 and WMRG waves in DJF for Philippines rainfall
• The role of R1 and WMRG waves in DJF for Vietnam rainfall
• The role of R1 and WMRG waves in DJF for Thailand rainfall
Kelvin wave convergence around the Vietnam longitudes is relatively well represented in forecasts (Table 3). However, given Vietnam's location north of the equator there is not a robust link between Vietnam rainfall and Kelvin wave occurrence (not shown). Because of this the hybrid forecasts for those cases are not useful and are therefore not discussed in detail here.
Note again that hybrid forecast BSS results are shown for those constructed using 1° × 1° resolution GPM-IMERG and for those using native GPM-IMERG resolution to compare how much extra skill is obtained when using higher-resolution observation data to construct the statistical relationship between the probability of rainfall above threshold and wave phase. In general, results show that hybrid skill is not substantially improved using higher-resolution observation data, possibly a result of the heavy rainfall event definition that quantifies rainfall within a larger area and is less affected by increased fidelity of small-scale systems. In some cases the higher-resolution hybrid skill is lower than the coarse hybrid skill. This is possibly a result of the defined heavy rainfall events being rarer in higher-resolution data since less of the area tends to be covered when rainfall is associated with smaller-scale systems than in coarse-resolution data, and rarer events may be more challenging to predict.
Despite lower wave forecast skill, the other regions have been examined for skill in the hybrid rainfall forecast relative to simulated rainfall, but results showed that equatorial waves had limited usefulness for improving rainfall predictions for those regions (not shown). This is interesting given that rainfall in these regions has been shown to be linked to equatorial waves [e.g., Kelvin waves in DJF shown in Figs. 4a,b and other examples in Ferrett et al. (2020)], suggesting that the limited skill of the equatorial wave forecasts in those regions reduces the usefulness of the hybrid forecast.

1) INDONESIA
The change in the risk of heavy rainfall in the Indonesia regions as a function of wave phase space is demonstrated by the diagrams shown in Fig. 5. The risk of heavy precipitation in the Indonesia regions is dependent on Kelvin waves in JJA (Figs. 5a-d). The likelihood of a heavy rainfall event over land is increased by 2-4 times during the strongest Kelvin wave convergence phase in all four Indonesia regions, as indicated by the yellow, orange and red shading in Figs. 5a-d and the probabilities shown. Rainfall in the WI (Fig. 5a) and NI (Fig. 5c) regions, those covering Sumatra and Borneo, is most robustly linked to Kelvin wave activity, with the likelihood of heavy rainfall increasing by at least 4 times during very strong Kelvin wave convergence, suggesting greater potential for predictability using the hybrid forecast there. This relationship with Kelvin waves is also found in DJF, but is not shown here due to the low skill in forecast wave amplitude and phase in this season.
Figure 6 shows the skill of the two hybrid forecasts of rainfall probability constructed using the conditional probabilities in Fig. 5 relative to climatology, as well as the skill of the simulated rainfall probability for the same region. A positive value of BSS indicates that the forecast in question is more skillful than a forecast based on climatology. It is found that the hybrid forecast outperforms the simulated forecast of rainfall probability at all lead times in NI and at earlier lead times in WI (Figs. 6a,c), the two regions with the strongest relationship between wave activity and heavy rainfall (Figs. 5a,c). Simulated forecast skill is higher in SI (Fig. 6b) and the Kelvin wave-precipitation relationship is weaker (Fig. 5b) as a result of Java being located off the equator. In this case the skill of the simulated forecast exceeds that of the hybrid forecast. In the NI and EI regions there is no skill in the simulated forecast and the negative BSS values indicate model bias in precipitation which does not exist in the dynamical wave forecast.

2) PHILIPPINES
In the Philippines, the influence of R1 and WMRG waves on boreal winter heavy rainfall in the northern and southern Philippines regions (NP and SP, respectively) is examined. In JJA, heavy precipitation in NP, and to some extent SP, is completely dominated by the passage of tropical cyclones (Ferrett et al. 2020). The very strong positive vorticity associated with the cyclones projects onto the R1 structures of the equatorial wave basis (Ferrett et al. 2020). Because of this it is expected that there is a relationship between rainfall and the R1 wave in JJA. Perhaps surprisingly, the risk of heavy precipitation in the Philippines also shows a dependence on R1 and WMRG phase during DJF when TCs are rare in the Northern Hemisphere (Figs. 5e-h). The relationship is also not the same for NP and SP. For SP, the highest risk of heavy rainfall occurs between the cyclonic vorticity center (R1 and WMRG phase 4) and the region of equatorward flow on the west flank of this center. This is reflected in the composite of precipitation conditional on the cyclonic vorticity phase of R1 (Fig. 4). In NP the highest precipitation risk is in the poleward flow ($v > 0$; phase 1) for both R1 and WMRG waves, where there is associated Northern Hemisphere convergence.
Simulated forecasts are skillful for Philippines rainfall, particularly in NP, but their skill declines with lead time (Figs. 6e-h). The hybrid forecast shows skill over the climatology forecast (BSS > 0) in the case of the R1 and WMRG waves in SP (Figs. 6e,g) and the WMRG wave in NP (Fig. 6h). The skill exceeds the simulated skill in the R1 SP case but not the NP case.

3) VIETNAM
Here the role of R1 and WMRG waves in DJF Vietnam heavy rainfall is examined. For central Vietnam (CV) there are significant relationships between rainfall and westward-moving equatorial waves. Figures 7a and 7c show that the likelihood of heavy rainfall is increased by almost 3 times during the Northern Hemisphere convergence phase of the WMRG wave and the northward wind Northern Hemisphere phase of R1 (v > 0; R1 and WMRG phase 1). The risk of heavy rainfall is increased by around 2 times as a result of westward-moving waves in south Vietnam (SV; Figs. 7b,d) in DJF. The cyclonic phase (phase 4) of both R1 and WMRG waves is linked to increased rainfall in SV. Northward R1 and WMRG winds (v > 0; phase 1) also result in increased rainfall in SV, likely as a result of associated Northern Hemisphere convergence. Relationships between equatorial waves and NV DJF rainfall are weak, and so these are not included in discussions or figures.
FIG. 5. Conditional probabilities of heavy rainfall as a function of wave phase space [Eq. (4)] calculated using ERA5 wave data and 1° × 1° GPM-IMERG precipitation (2001-14) for various cases. Location (see Fig. 1), wave type, and season are detailed in the panel titles. The climatological probability of a rainfall event is given in the title with the conditional probability printed in text for each phase. Colored shading indicates the amplification of the conditional probability relative to the climatological probability (e.g., yellow shading shows that heavy rainfall is 1.5-2.5 times more likely than the climatological probability).
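Statements such as "increased by almost 3 times" refer to the ratio of the conditional probability of an event in a given wave phase to the climatological probability. The sketch below illustrates that ratio with a simple frequency estimate; the function name and the toy record are hypothetical, and the paper's Eq. (4) may include further conditions (for example on wave amplitude) that are not reproduced here.

```python
from collections import Counter


def conditional_event_probabilities(phases, events):
    """Return {phase: P(event | phase)} estimated from paired daily records."""
    days_in_phase = Counter(phases)
    events_in_phase = Counter(p for p, e in zip(phases, events) if e)
    return {p: events_in_phase[p] / days_in_phase[p] for p in days_in_phase}


# Hypothetical daily record: local wave phase label and event flag (1 = event).
phases = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
events = [1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]

climatology = sum(events) / len(events)  # climatological event probability
cond = conditional_event_probabilities(phases, events)

# Amplification of risk in each phase relative to climatology.
amplification = {p: cond[p] / climatology for p in cond}
for p in sorted(amplification):
    print(f"phase {p}: P(event|phase) = {cond[p]:.2f}, x{amplification[p]:.1f} climatology")
```

In this toy record, events are twice as likely as climatology in phase 1, equally likely in phase 2, and absent in phase 3.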
In CV, simulated precipitation forecast skill in DJF falls close to zero after day 2 of the forecast (Figs. 8a,c), and in SV it drops to zero at later lead times (Figs. 8b,d). Hybrid forecast skill shows less potential in Vietnam than in Indonesia and the Philippines. However, there is positive BSS for the lower-resolution hybrid forecasts in CV for R1 and WMRG waves in DJF, matching the simulated forecast skill at later lead times (Figs. 8a,c).
FIG. 6. Brier skill score as a function of lead time of the hybrid dynamical-statistical forecast of rainfall probability and the probabilistic forecast from simulated rainfall, relative to a climatology forecast, for the same cases as in Fig. 5. Location (see Fig. 1), wave type, and season are detailed in the panel titles. A positive value indicates skill relative to the climatological forecast; black is for the hybrid forecast of rainfall probability, $\mathrm{BAY}_{s,l,i}$, using 1° × 1° GPM-IMERG; red is for the hybrid forecast using 0.1° × 0.1° GPM-IMERG; and blue is for the simulated ensemble forecast of rainfall probability, $\mathrm{ENS}_{s,l,i}$. Simulated skill lines that are not visible are very negative (< -0.4) and, therefore, have low skill relative to the climatological forecast, indicating bias.
FIG. 7. As in Fig. 5, but for additional cases.

4) THAILAND
Figures 7e-h show rainfall-equatorial wave relationships for southern Thailand (ST) and northern Thailand (NT) in DJF for R1 and WMRG waves. These relationships are some of the weakest examined among the regions with more skillful wave forecasts. The likelihood of heavy rainfall in ST is increased by around 2 times in DJF as a result of northward flow (and therefore Northern Hemisphere convergence) associated with westward-moving waves (Figs. 7e,g). Relationships between NT rainfall and equatorial waves are even weaker (Figs. 7f,h).
Skill in simulated precipitation over ST and NT in DJF is high compared to other regions and exceeds 0.2 for all lead times in NT and 0.1 for all lead times in ST. Skill analysis of the hybrid forecasts for these four cases shows relatively little potential for improved predictability, a result of the weaker heavy rainfall-equatorial wave relationships. ST shows the most potential, with positive values of BSS associated with R1 and WMRG waves (Figs. 8g,i), consistent with the heavy rainfall-wave relationships shown in Figs. 7g,i. There is no skill shown by the hybrid forecasts in NT (Figs. 8f,h). None of the hybrid forecasts in these cases exceeds the simulated forecast skill because of the relatively high skill shown by the simulated forecasts.
It is worth noting here that, as with other cases not shown (including the Malaysia cases), the hybrid method is likely substantially limited by the wave forecast skill. For example, the R1 wave has relatively low forecast skill over the Thailand regions in both DJF and JJA (Table 3). This reinforces the finding that skill in the hybrid method requires both a significant link between rainfall and equatorial waves and skillful forecasts of wave activity.

Conclusions
A novel dynamical-statistical hybrid forecast approach has been created for predicting the risk of heavy precipitation in regions of Southeast Asia, given knowledge of the phase and amplitude of equatorial waves from global ensemble forecasts. Attention has been focused on extremes of daily precipitation accumulation, estimating the risk across areas of similar scale to the larger islands in Southeast Asia and positioned to distinguish forecasts for different islands. This scale is still relatively local compared to the meridional scale and zonal wavelength of the equatorial waves. The risk of heavy precipitation is greatly enhanced in some phases of equatorial waves, depending on region and season. In some cases the probability of heavy precipitation within the region (across a substantial area) is more than 4 times the climatological probability. The conditional probability diagrams for each wave type and region yield useful information for forecasters on the important wave types and phases associated with high-impact weather risk, even without the quantitative probability forecasts derived from MOGREPS-G ensemble forecasts. Preliminary assessment of skill of the hybrid forecasts using the Brier skill score indicates that there is indeed potential for hybrid forecasts built on these relationships to extend the lead time of useful prediction of high-impact weather in some regions.
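The combination of observed wave-rainfall statistics with ensemble wave forecasts can be illustrated with a minimal sketch. It assumes, as a simplification, that the hybrid probability is the observed conditional probability of an event in each wave phase weighted by the fraction of ensemble members forecasting that phase (the law of total probability). The function name, phase labels, and numbers are hypothetical, and the formulation used in the study may differ, for example in how wave amplitude is treated.

```python
from collections import Counter


def hybrid_event_probability(member_phases, cond_prob):
    """Weight P(event | phase) by the fraction of ensemble members in each phase."""
    counts = Counter(member_phases)
    n_members = len(member_phases)
    return sum(cond_prob[phase] * count / n_members for phase, count in counts.items())


# Hypothetical observed conditional probabilities P(event | wave phase).
cond_prob = {1: 0.20, 2: 0.10, 3: 0.02, 4: 0.08}

# Hypothetical wave phase forecast by each of 10 ensemble members at one lead time.
member_phases = [1, 1, 1, 1, 1, 1, 2, 2, 2, 4]

p_event = hybrid_event_probability(member_phases, cond_prob)
print(f"Hybrid probability of heavy rainfall: {p_event:.3f}")
```

Under this simplification the hybrid probability can never exceed the largest conditional probability, which is one way to see why a strong observed wave-rainfall relationship is needed in addition to skillful wave forecasts.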
It is found that there is skill in hybrid forecasts in some regions and seasons where there is no skill measured in the simulated precipitation forecasts, for example, in association with Kelvin waves crossing Indonesia in JJA and westward-moving waves crossing the Philippines in DJF. This is consistent with studies finding that statistical forecasts in other regions of the tropics, such as Africa, can outperform NWP predictions as a result of the link between equatorial waves and rainfall (Vogel et al. 2021). In contrast, in DJF over the Indonesian regions no skill is found in the hybrid forecasts, despite an enhanced probability of heavy precipitation conditional on the convergence phase of equatorial waves. This is surprising given that other authors (e.g., Judt 2020) have emphasized the longer-range predictability in the tropics associated with equatorial waves. This illustrates potential for greater predictive skill associated with equatorial waves that will be explored in future work, including different methods to relate high-impact weather to waves and also different ways to quantify predictive skill.
An important point to note is that previous work has shown a relationship between equatorial waves and increased heavy rainfall in regions of Malaysia (Ferrett et al. 2020). This was also found here: Kelvin wave convergence phases and WMRG wave cyclonic vorticity phases can be linked to increased probability of rainfall in the Malaysia regions and in New Guinea (not shown). Despite this, there was no significant skill in the hybrid dynamical-statistical forecast relative to climatology for any wave types in these regions, and so these results were not discussed in detail. This suggests that the processes pivotal to this relationship are perhaps still not sufficiently captured by the forecast of wave activity and the observed relationship to heavy rainfall. Possible reasons include poorly forecast waves, or that variability on other scales is key to rainfall predictability in these regions, in which case forecasts may be improved with further understanding of additional drivers.
This work demonstrates an initial formulation of a hybrid forecast methodology and as such should be viewed as a starting point for future improvements: variation of "event" classification, region boundaries, phase classification, and methods to take advantage of higher-resolution observation data are likely to benefit hybrid forecasts in some regions. This study also uses the BSS alone to quantify hybrid forecast skill. While the BSS is useful as a measure of overall skill, a more in-depth study of skill should be implemented to fully quantify the method's usefulness compared to other forecasts. Such a detailed analysis is outside the scope of this study and is therefore carried out in an additional study. Wolf et al. (2023) examine the sensitivity of the method described here to the parameters listed above and how these affect the hybrid forecast skill. They also compare hybrid forecasts to higher-resolution convection-permitting ensemble forecasts using additional skill metrics. The benefit of blending simulated and hybrid forecast information is also investigated in this additional study.
Previous work has suggested potential for statistical forecasts in Africa as a result of the link between equatorial waves and rainfall (Vogel et al. 2021). This study also finds that rainfall in Indonesia, the Philippines, and Vietnam is more accurately predicted when incorporating the known relationship between equatorial waves and Southeast Asia rainfall, as opposed to basing probabilistic forecasts on model-simulated rainfall alone, suggesting potential for improvements of probabilistic forecasts of high-impact weather in future.

FIG. 1. Boxed regions of interest for Malaysia (PM and EM), Indonesia (WI, NI, EI, and SI), the Philippines (NP and SP), Vietnam (NV, CV, and SV), Thailand (NT and ST), and Indonesian New Guinea and Papua New Guinea (ING and PNG). Color shading is the mean 2001-18 GPM-IMERG precipitation.
FIG. 2. Propagation of Kelvin, R1, and WMRG waves in local wave phase space diagrams. Lines connect consecutive lead times (steps from 1 day out to 7 days) in a forecast of wave activity to show the wave propagation through the local region; in this instance, 107.5°-112.5°E. Circles mark wave amplitude ($A = \sqrt{X^2 + Y^2}$) intervals of one and two separating weak, strong, and very strong wave activity. MetUM analysis values are shown by black lines, MOGREPS-G ensemble member forecasts by gray lines, and the ensemble mean forecast by the thicker gray line. Forecast initialization date is indicated by the red star.
FIG. 3. Composite of ERA-Interim 850-hPa wave wind fields (1997-2018) for each phase of the three equatorial waves in this study: Kelvin, R1, and WMRG. Composite phases are centered on 115°E.
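The amplitude circles described in the Fig. 2 caption can be expressed as a short sketch. The amplitude follows the caption's definition, A = sqrt(X^2 + Y^2), with circles at 1 and 2 separating weak, strong, and very strong activity. The function names are hypothetical, and the treatment of values lying exactly on a circle is an assumption.

```python
import math


def wave_amplitude(x, y):
    """Amplitude in local wave phase space, A = sqrt(X^2 + Y^2)."""
    return math.hypot(x, y)


def activity_class(amplitude):
    """Classify wave activity using the circles at A = 1 and A = 2.

    Assigning boundary values to the stronger class is an assumption.
    """
    if amplitude < 1.0:
        return "weak"
    if amplitude < 2.0:
        return "strong"
    return "very strong"


# Hypothetical phase-space coordinates (X, Y) for three forecast days.
for x, y in [(0.3, 0.4), (1.2, 0.9), (3.0, 4.0)]:
    a = wave_amplitude(x, y)
    print(f"X={x}, Y={y}: A={a:.2f} ({activity_class(a)})")
```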

TABLE 2. Percentage of days a precipitation "event" occurs in the defined region, where an "event" is defined as 1° × 1° gridpoint rainfall exceeding the 95th percentile over at least 15% of land points in the region.
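The event definition in the Table 2 caption can be sketched as a simple test on one day of gridded rainfall. The sketch assumes that each land grid point has its own precomputed 95th-percentile threshold, which the caption does not state explicitly; the function name and the numbers are hypothetical.

```python
def is_heavy_rainfall_event(rainfall, thresholds, min_fraction=0.15):
    """True if rainfall exceeds its threshold at >= min_fraction of land points.

    `rainfall` and `thresholds` are equal-length sequences over the land grid
    points of one region; per-point 95th-percentile thresholds are an assumption.
    """
    exceeding = sum(r > t for r, t in zip(rainfall, thresholds))
    return exceeding / len(rainfall) >= min_fraction


# Hypothetical region with 20 land points and a 30 mm/day threshold everywhere.
thresholds = [30.0] * 20
wet_day = [45.0, 50.0, 38.0, 33.0] + [5.0] * 16   # 4 of 20 points exceed (20%)
dry_day = [45.0, 50.0] + [5.0] * 18               # 2 of 20 points exceed (10%)

print(is_heavy_rainfall_event(wet_day, thresholds))
print(is_heavy_rainfall_event(dry_day, thresholds))
```

Requiring exceedance over a minimum fraction of land points, rather than at any single point, targets widespread heavy rainfall across the region rather than isolated extremes.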

TABLE 3. Number of days of lead time for which skill in the MOGREPS-G probabilistic forecast of wave state is positive relative to climatology, measured by the multicategory Brier skill score for all phases of Kelvin, R1, and WMRG waves (first six columns) and by the Brier skill score of the strongest convergence phase of the Kelvin wave (final two columns). Skill is assessed at longitudes that vary with the listed regions, namely the central longitudes of the 15 identified regions in Southeast Asia, in DJF and JJA. BSS is used to account for the skill of all 12 sectors of wave phase space.