In this study, the contribution of low-frequency (>100 days), Madden–Julian oscillation (MJO), and convectively coupled equatorial wave (CCEW) variability to the skill in predicting convection and winds in the tropics at weeks 1–3 is examined. We use subseasonal forecasts from the Navy Earth System Model (NESM); NCEP Climate Forecast System, version 2 (CFSv2); and ECMWF initialized in boreal summer 1999–2015. A technique for performing wavenumber–frequency filtering on subseasonal forecasts is introduced and applied to these datasets. This approach is better able to isolate regional variations in MJO forecast skill than traditional global MJO indices. Biases in the mean state and in the activity of the MJO and CCEWs are smallest in the ECMWF model. The NESM overestimates cloud cover as well as MJO, equatorial Rossby, and mixed Rossby–gravity/tropical depression activity over the west Pacific. The CFSv2 underestimates convectively coupled Kelvin wave activity. The predictive skill of the models at weeks 1–3 is examined by decomposing the forecasts into wavenumber–frequency signals to determine the modes of variability that contribute to forecast skill. All three models have a similar ability to simulate low-frequency variability but large differences in MJO skill are observed. The skill of the NESM and ECMWF model in simulating MJO-related OLR signals at week 2 is greatest over two regions of high MJO activity, the equatorial Indian Ocean and Maritime Continent, and the east Pacific. The MJO in the CFSv2 is too slow and too weak, which results in lower MJO skill in these regions.
Subseasonal-to-seasonal (S2S) prediction has the potential to bridge the gap between medium-range weather forecasting and seasonal climate outlooks (Vitart et al. 2012; Vitart 2014; Robertson et al. 2015; White et al. 2017). Coupled atmosphere–ocean models have shown skill in predicting precipitation in the tropics (Li and Robertson 2015; Wheeler et al. 2017) and tropical cyclones (Belanger et al. 2010; Vitart et al. 2010; Camp et al. 2018) at S2S time scales. Sources of predictability in the tropics on S2S time scales include the El Niño–Southern Oscillation (ENSO) (Ropelewski and Halpert 1987) and Madden–Julian oscillation (MJO) (Zhang 2005; Zhang et al. 2013) while convectively coupled equatorial waves (CCEWs) (Kiladis et al. 2009) mainly contribute to predictability on shorter time scales (Dias et al. 2018). However, Guo et al. (2015) observed that numerical models with more robust convectively coupled Kelvin waves also better simulate the MJO. Goswami et al. (2017a) also observed that improvements in both the MJO and CCEWs tend to coincide with overall improvements in tropical convection.
In this study, we examine boreal summer 1999–2015 forecasts from the Navy Earth System Model (NESM); the NCEP Climate Forecast System, version 2 (CFSv2) (Saha et al. 2014); and the ECMWF subseasonal prediction system (Vitart 2014). Wavenumber–frequency filtering is used to decompose OLR and winds from the model forecasts into low-frequency (>100 days), MJO, and CCEW components. This is then compared to wavenumber–frequency-filtered fields from gridded satellite observations and the ERA-Interim to determine the contribution of these modes of variability to the predictive skill of the models at weeks 1–3. In comparison to global MJO indices (Wheeler and Hendon 2004), this method allows us to better estimate the geographic variability of MJO skill.
The skill of coupled atmosphere–ocean models in forecasting precipitation at seasonal and S2S time is greatest in the tropics—in particular the central and eastern equatorial Pacific (Li and Robertson 2015; Wheeler et al. 2017). In this region, atmospheric variability is driven by sea surface temperature (SST) variability and precipitation and SST are strongly correlated (Kumar et al. 2013). In addition, coupled atmosphere–ocean models have much greater skill in the tropics at both seasonal and S2S time scales during La Niña and El Niño episodes than during neutral conditions (Li and Robertson 2015).
The MJO is the leading mode of intraseasonal variability in the tropical atmosphere and a major source of predictability on S2S time scales (e.g., Kim et al. 2014). In the tropics, deep convection is regulated by midtropospheric humidity (Holloway and Neelin 2009); this is a key aspect of moisture mode theories of the MJO (e.g., Sobel and Maloney 2013; Adames and Kim 2016). Air–sea fluxes are one process that can amplify the moisture anomalies associated with the MJO (DeMott et al. 2014, 2015). Radiative–convective feedbacks are also a source of instability for the MJO (Crueger and Stevens 2015) and may play a role in scale selection (Adames and Kim 2016). The eastward propagation of the MJO is primarily due to horizontal moisture advection (Maloney 2009) but vertical moisture advection also plays a role (Janiga and Zhang 2016). Improvements in numerical MJO prediction have come about through more realistic moisture–convection interactions (Hirons et al. 2013a,b) and the use of atmosphere–ocean coupling which extends useful skill by ~1 week (Green et al. 2017). In addition, Gonzalez and Jiang (2017) found that subseasonal prediction systems with reduced mean state humidity biases better represent horizontal moisture advection processes and have more realistic MJOs. The ability of models to represent Kelvin waves within the MJO envelope has also been linked to improved MJO simulation (Guo et al. 2015).
CCEWs (Kiladis et al. 2009) are equatorially trapped Rossby and inertio-gravity waves (Matsuno 1966) that couple to moist convection and explain a large portion of the variance in tropical convection (Wheeler and Kiladis 1999). Convectively coupled atmospheric Kelvin (Ventrice et al. 2012a,b; Reynolds et al. 2016; Schreck 2015, 2016), equatorial Rossby (ER) (Molinari et al. 2007), and mixed Rossby–gravity (MRG) waves (Dickinson and Molinari 2002) modulate tropical cyclone (TC) formation [see Frank and Roundy (2006) and Schreck et al. (2012) for an overview of the influence of tropical waves on TC genesis]. Tropical depressions (TDs) or easterly waves also serve as precursors for TCs (e.g., Wang et al. 2010a,b). However, accurately simulating the climatological activity, structure, and behavior of CCEWs in global models remains a challenge (Lin et al. 2006; Straub et al. 2010; Hung et al. 2013; Wang and Li 2017). The empirical method of Wheeler and Weickmann (2001) has ~10 days of predictive skill for ER waves and several days of predictive skill for MRG and Kelvin waves. Although individual MRG/TD and Kelvin waves are not a source of predictability at S2S time scales, accurately simulating their climatological variability and structure and their modulation by ENSO and the MJO could be important to extreme precipitation and tropical cyclone prediction.
One goal of this study is to determine the sources of the regional variability in skill in subseasonal forecasting systems found by Li and Robertson (2015) and Wheeler et al. (2017). Wheeler et al. (2017) found that daily precipitation skill is greatest in the extratropics for the first 2–3 days and originates from the cyclones and anticyclones found at these latitudes. However, by week 2 the precipitation skill was much higher in the tropics than in the extratropics. Li and Robertson (2015) found that the ECMWF coupled model had much greater skill in forecasting precipitation at week 2–3 over the Asian monsoon region than the CFSv2. This is corroborated by Jie et al. (2017) who examined the skill of the ECWMF coupled model and other subseasonal prediction systems in forecasting the boreal summer intraseasonal oscillation (BSISO) index of Lee et al. (2013).
Wavenumber–frequency filtering (Wheeler and Kiladis 1999) is commonly applied to OLR and dynamical fields from both observations and dynamical model output to isolate the signals associated with different CCEWs and the MJO. With some modifications, this filtering can be applied to subseasonal forecasts. We apply wavenumber–frequency filtering to deterministic subseasonal forecasts from three coupled atmosphere–ocean models initialized in June–August 1999–2015. We evaluate the skill of these models in predicting OLR and dynamical fields associated with different CCEWs, the MJO, and low-frequency (>100 days) modes of variability in both daily and weekly time-averaged fields. This is used to explain regional differences in the total skill in forecasting these fields at weeks 1–3.
The characteristics of the models and observational datasets used to evaluate them are discussed in section 2. The MJO and CCEWs are identified using wavenumber–frequency filtering; the modifications required to apply this filtering to subseasonal forecasts are discussed in section 2. Biases in the standard deviation of MJO- and CCEW-filtered OLR anomalies in each model are presented in section 3. In section 4 the fractional contribution of low-frequency (>100 days), MJO, and different CCEW signals to the total predictive skill in each model is discussed. A summary of the results and some conclusions are presented in section 5.
2. Data and methodology
We examine output from three global coupled atmosphere–ocean models: NESM, the ECMWF subseasonal prediction system (Vitart 2014), and the CFSv2 (Saha et al. 2014). We use the entire 45-day duration of the NESM and ECMWF control runs and the first 45 days of a subset of reforecast and operational CFSv2 runs initialized in JJA 1999–2015. The initialization frequency, resolution, and cumulus parameterizations used in each of the models is shown in Table 1.
NESM consists of the Navy Global Environmental Model (NAVGEM) (Hogan et al. 2014) coupled to the Global Ocean Forecast System (GOFS) (Metzger et al. 2014), which itself consists of the Hybrid Coordinate Ocean Model (HYCOM) (Bleck 2002) and Los Alamos Community Ice Code (CICE), version 4.1 (Hunke and Lipscomb 2010). NAVGEM radiation is based on the Rapid Radiative Transfer Model for general circulation models (RRTMG) scheme for longwave and shortwave (Iacono et al. 2008). Cloud fraction is based on Xu and Randall (1996) with a modified version of Slingo (1987) for convective clouds, including a relative humidity dependence and convective mass flux scaling. Convection is a bimodal (turbulence and dynamically forced) extension of Ridout et al. (2005), which is based on Kain–Fritsch. The boundary layer is based on Louis et al. (1982) and Sušelj et al. (2012) and surface fluxes are based on Coupled Ocean–Atmosphere Response Experiment (COARE), version 3.0 (Kara et al. 2005).
A series of 45-day integrations of NESM were initialized at 1200 UTC four times a week on Saturday, Sunday, Monday, and Tuesday during June–August 1999–2015 for the North American Multimodel Ensemble (NMME) Subseasonal Experiment (SubX) (http://cola.gmu.edu/kpegion/subx/) (Kirtman et al. 2017). Altogether this is 870 runs; 24 NESM runs failed and are not included in this total. NAVGEM is run at T359 (~55 km at the equator) with 50 vertical levels. HYCOM is run at 1/12° (~9 km at the equator)—which allows for the representation of ocean eddies—with CICE on the same grid. The ocean and sea ice are initialized from a GOFS, version 3.1, reanalysis for 1999–2015. The atmosphere was initialized with operational NOGAPS analyses for 1999–2008 and a reanalysis using NAVGEM and the hybrid version of the Naval Research Laboratory (NRL) Atmospheric Variational Data Assimilation System–Accelerated Representer (NAVDAS-AR; Kuhl et al. 2013) for 2009–15. The improved initial conditions for 2009–15 result in better short-term skill but do not have a significant effect on the skill for week 2 and beyond or the wavenumber–frequency spectra.
Reforecasts from the ECMWF subseasonal prediction system (Vitart 2014), version CY41R1, are taken from the S2S database (apps.ecmwf.int/datasets/data/s2s/) (Vitart et al. 2017). The atmosphere model is run at TL639 (~32 km) up to day 10 and TL319 (~64 km) after day 10; the ocean model is run at 1° (~111 km). The convective parameterization (Tiedtke 1989; Bechtold et al. 2004) is able to capture the sensitivity of convection to midtropospheric humidity (Hirons et al. 2013a,b), which is one of the reasons its skill in forecasting boreal intraseasonal variability is superior to many models (Jie et al. 2017). Control forecasts initialized at 0000 UTC twice a week during June–August 1999–2015 are used in this study.
The atmosphere model in the CFSv2 (Saha et al. 2014) is the Global Forecast System (GFS) and the ocean model is the Geophysical Fluid Dynamics Laboratory (GFDL) Modular Ocean Model, version 4 (MOM4). The atmospheric resolution is T126 (~147 km) with 64 vertical levels and the resolution of the ocean model is 0.5° poleward of 30° and 0.25° between 10°S and 10°N. The cumulus parameterization is the simplified Arakawa–Schubert (SAS) scheme (Han and Pan 2011). Although more forecasts are available, for computational expediency we use reforecasts and operational forecasts initialized every fifth day at 0000 UTC during June–August 1999–2015 (ncdc.noaa.gov/data-access/model-data/model-datasets/climate-forecast-system-version2-cfsv2). These reforecasts and operational forecasts are 9 months long but we only use the first 45 days for comparison with the NESM and ECMWF runs.
Model forecasts of daily averaged outgoing longwave radiation (OLR) are validated against observations from the Climate Data Record (CDR) of OLR (Lee et al. 2011). Dynamical fields from the model forecasts at 0000 UTC are validated against the ERA-Interim reanalysis (Dee et al. 2011). All comparisons between forecasts and observations are performed on a common 2.5° × 2.5° grid.
c. Wavenumber–frequency filtering
Figure 1 shows the symmetric wavenumber–frequency spectra (e.g., Wheeler and Kiladis 1999) of OLR from the 45-day integrations of the three models and OLR observations for the 45 days following each day of JJA 1999–2015. These spectra have been normalized by a red noise background. Both the NESM and ECMWF model have more eastward power than westward power at 20–45 days consistent with a robust MJO. In contrast, the CFSv2 has similar power in both directions. All three models produce spectral peaks consistent with ER waves and TDs. Kelvin waves in the NESM are slower than those in the observations while Kelvin wave activity in the CFSv2 is underestimated. In the asymmetric wavenumber–frequency spectra (not shown), an MRG wave peak is present in all three models but it is weaker than observed.
Figure 2 provides an overview of the method used to apply wavenumber–frequency filtering to the model forecasts and observations. To filter for time scales longer than 45 days, the model forecasts are padded with observations prior to the initialization time and zeroes afterward in the padded filtering method (FilPad). Two years of either OLR observations or ERA-Interim analyses in the case of zonal winds are used prior to the initialization time. The duration of zero padding is 2 yr − 45 days. FilPad is also applied to 45 days of observational anomalies coinciding with the model forecast period. The model forecasts and coincident observations both filtered using the padded method are compared to evaluate the model forecasts. We also compare observations filtered using FilPad with a reference filtering methodology (FilRef) in which the filtering is applied to one continuous time series. This is done to determine over which portion of the 45-day model forecast FilPad is able to closely approximate FilRef. Wheeler and Weickmann (2001) describe a procedure for producing forecasts of different wavenumber–frequency modes using only observations. In their method, the period after initialization in Fig. 2 is entirely zeroes and observed signals are extrapolated into the future.
In comparison to Wheeler and Kiladis (1999), the filter bands are relatively unrestrictive to allow for diversity in the propagation characteristics of the various disturbances and differences between simulated and observed spectral characteristics (Fig. 1). The wavenumber k and period p for each band are as follows: MJO (k = 0:9, p = 20:100 days), Kelvin (k = 1:14, p = 2.5:20 days), ER (k = −10:−1, p = 10:100 days), and mixed Rossby–gravity/tropical depression (MRG/TD) (k = −20:0, p = 2.5:10 days). We also apply a low-frequency/large-scale (LF) filter (k = −10:10, p = >100 days). LF is meant to capture variability in tropical convection associated with ENSO and other seasonal-to-interannual modes of variability.
The padded filtering method is applied one latitude at a time with no symmetry constraints. The procedure is as follows:
Remove the first four harmonics of the observed annual cycle 1981–2010 from both the model forecasts and observations.
Remove model mean biases so that the anomalies being filtering have a mean of zero. Biases are with respect to observations and calculated as a function of lead time for forecasts initialized in each month for each model.
Taper the first year of the two years of observations before initialization to zero with a split-cosine-bell taper to reduce spectral leakage.
Filter for LF signals.
To reduce the ringing that arises from the transition from zero to nonzero values, apply a 100-day high-pass filter to the nonzero portion of the data prior to filtering for MJO and ER signals. This is done instead of subtracting the LF signals (step 4), which exclude higher wavenumbers.
Filter for MJO and ER signals.
Prior to filtering for Kelvin and MRG/TD signals, apply a 20-day high-pass filter to the nonzero portion of the data.
Filter for Kelvin and MRG/TD signals.
For the reference filtering method we simply taper the first and last years to zero prior to filtering for the LF, MJO, ER, Kelvin, and MRG/TD signals.
Figure 3 shows the ability of the padded filtering method to approximate the reference filtering method for OLR observations. The “initialization” times are each day of all months of 1999–2015 (6205 instances). The filtered anomalies from different wavenumber–frequency filters are compared to filtered anomalies at matching times from the reference filtering method. Figure 3a shows the standard deviation ratio between FilPad and FilRef over 30°S–30°N. Values less than one indicate an underestimation of the amplitude of the filtered anomalies. Beyond week 3 the FilPad method increasingly underestimates the amplitude of the filtered signals. This underestimation is larger for the low-frequency filters than the high-frequency filters. Some ringing effects are also apparent near the end of the forecast associated with the transition to the zero padding for the Kelvin and MRG/TD filters. In the LF filter, ringing effects lead to a slight overestimation of the amplitude of the anomalies during week 1. The pattern correlation or anomaly correlation coefficient (ACC) between FilPad and FilRef is also calculated over 30°S–30°N. Here, , where and are the anomalies from FilPad and FilRef weighted by the cosine of the latitude. For the first three weeks of a 45-day forecast, the wavenumber–frequency-filtered anomalies generated using FilPad closely approximate those from FilRef (Fig. 3b).
Figures 4a and 4b illustrate OLR anomalies filtered for LF, MJO, ER, and Kelvin wave signals using FilPad for OLR observations and an NESM forecast initialized 1 June 2015. This MJO event was associated with a period of elevated tropical cyclone activity and a westerly wind burst that contributed to the development of El Niño conditions. The MJO in the NESM is too intense and propagates too fast. In addition, while the NESM is able to capture the increased Kelvin wave activity within the MJO convective envelope, the distinction between the MJO and Kelvin waves is not as clear as in observations. Time–longitude diagrams produced in real-time using CFSv2 forecasts and a similar methodology are available at ncics.org/mjo.
Following Janiga and Thorncroft (2016), we also calculate the phase and wave packet amplitude for each wavenumber–frequency filter using the following procedure:
Calculate the time derivative of the wavenumber–frequency-filtered anomalies from FilPad for both observations and model forecasts.
Normalize wavenumber–frequency-filtered anomalies and their time derivative from both model forecasts and observations by the standard deviation found in observations.
Amplitude is , where A is the standardized filtered anomaly and B is the standardized time derivative of the filtered anomaly. The phases calculated using A and B have values from 0° to 360° such as decaying (0°), suppressed (90°), developing (180°), and enhanced (270°).
The phase of the MJO signals and the wave packet amplitude in observations and the NESM forecast are shown in Figs. 4c and 4d. Periods of high wave packet amplitude can occur in both suppressed and enhanced MJO phases. There appears to be some propagation of wave packet amplitude from the eastern Pacific westward, which would be consistent with a westward MJO group velocity (Adames and Kim 2016).
The most common method of evaluating the MJO in forecasts is the Real-time Multivariate MJO (RMM) index (Wheeler and Hendon 2004). While variations in RMM skill as a function of phase can give some information on regional differences in MJO skill, the approach described above can be used to calculate the skill of the model in representing MJO-related variations in wind and convection at individual grid points. It should be noted that the LF, MJO, ER, Kelvin, and MRG/TD filtered anomalies can be linearly combined to reproduce the unfiltered anomalies with relatively little loss since they account for most of the variance in the tropics.
d. Model forecast evaluation
In section 3 we analyze model biases in the activity of the different tropical modes of variability calculated from the standard deviation of filtered OLR anomalies from model forecasts and observations. We focus on biases during weeks 2 and 3 since the model biases grow fairly quickly during week 1. Mean model biases in unfiltered fields are shown to provide context for these analyses.
In section 4 we examine the predictive skill of the models by comparing both filtered and unfiltered model forecasts and observational analyses. We calculate the Pearson correlation at each grid point and the pattern correlation or ACC over different regional domains. Here, , where and are the anomalies from the model forecasts and observational analyses weighted by the cosine of the latitude. We also calculate the mean absolute error difference (ΔMAE) relative to a forecast of climatology, which in this case is zero since we are comparing anomalies, over different regional domains. Here, . Negative values of ΔMAE indicate that the model forecast beats climatology. To explain how ACC and ΔMAE relate to the behavior of different modes of variability we compute mean errors and systematic biases in the phase and wave packet amplitude. The mean phase error is and the mean amplitude error is , where subscripts P and A denote the phase and amplitude of model forecasts and observational analyses. Similarly, the phase bias is and the amplitude bias is .
e. Significance testing
Because the forecasts from the three models have a different initialization frequency and number of runs, it is important to determine that differences in biases and skill between the models are statistically significant. For the maps of activity biases and Pearson correlation, statistical significance was determined by resampling with replacement forecasts initialized within 10-day nonoverlapping windows using a moving-block bootstrap. In each bootstrap 1000 simulations were performed. The moving block bootstrap is used to preserve the temporal autocorrelation between adjacent forecasts. Henderson et al. (2016) used a window of 4–6 days depending on MJO phase to represent the time the MJO spends in each phase. Experimentation with a range of window lengths indicates that the main conclusions of this study are not especially sensitive to window length so a single window length was used for simplicity. When plotting maps of statistical significance it is also necessary to account for the fact that false rejections of the null hypothesis can occur based on the large number of tests (Wilks 2016). For simplicity, local statistical significance is determined by p values of 0.005 (99.5% rejection of the null hypothesis at that grid point by the bootstrap distribution). This does not indicate that all points are significant at this level but ensures that a relatively small fraction of these points would be false rejections of the null hypothesis at the 95% level. The error bars of quantities integrated over regional domains (ACC, ΔMAE, phase bias, and amplitude bias) are based on the 95% range of the bootstrap distribution from the same moving-block bootstrap.
3. Systematic biases
Figure 5 shows mean biases in OLR and 850- and 200-hPa winds averaged over the first 45 days of the forecasts for each of the models. Although it is not shown in this figure, the OLR biases stabilize by ~3–5 days and the wind biases stabilize by ~10 days. The NESM is characterized by excess convection over tropical North Africa, Southeast Asia, and the western Pacific and too little convection over South America and the western Atlantic (Fig. 5a). A double ITCZ bias is also apparent over the western Pacific in the NESM. The OLR biases over South America and North Africa are consistent with a nearly 10 m s−1 easterly bias extending from Africa into South America at 200 hPa and a westerly bias over the tropical Atlantic at 850 hPa. Both OLR and wind biases are much smaller in the ECMWF model (Fig. 5b). In the CFSv2 there is a lack of convection over Southeast Asia and tropical South America (Fig. 5c).
Figures 6–9 show the observed standard deviation of OLR anomalies for different filters and the standard deviation bias of the models averaged over weeks 2–3. An advantage of the filtering procedure introduced in section 2c is that biases in wave activity can be diagnosed from subseasonal forecasts. While previous studies (e.g., Jiang et al. 2015) have examined biases in wave activity within multiyear integrations, models may drift into a different climatological state over such a long period of time.
Observed MJO activity in the boreal summer extends from the Indian Ocean across the Maritime Continent and into the western Pacific with a secondary peak in activity over the eastern Pacific (Fig. 6a). MJO activity is too high in the NESM over the western and eastern Pacific (Fig. 6b). The ECMWF model is characterized by a slight positive activity bias over the Maritime Continent (Fig. 6c) while the CFSv2 is characterized by a lack of MJO activity in this region (Fig. 6d). Convective activity over the Maritime Continent is characterized by complex interactions between the MJO, topography, and the diurnal cycle (Peatman et al. 2014), which tend to be poorly represented in models (Peatman et al. 2015). This may be why MJO activity biases are relatively large in this region.
ER wave activity during the boreal summer is strongest along a belt that extends from the western Pacific westward into Southeast Asia (Fig. 7a). This peak is partly related to tropical cyclone activity in the region that projects onto the ER filter (Aiyyer et al. 2012). NESM ER activity biases are similar to those for the MJO with excess activity over the Bay of Bengal and China as well as the eastern Pacific (Fig. 7b). ECMWF model biases are smaller but are also elevated in these two areas of high observed ER activity (Fig. 7c). In the CFSv2 there is excess ER activity in a belt between 0° and 15°N that stretches from the Indian Ocean into the Atlantic and a lack of activity within the belt of high observed activity indicating a southward shift in ER activity (Fig. 7d).
Convectively coupled Kelvin waves occur at similar wavenumbers and periods as midlatitude cyclones, which allows these them to couple and interact (Straub and Kiladis 2003). As a result, Fig. 8a shows convectively coupled Kelvin wave activity near the equator as well as cyclone activity in the midlatitudes. In the CFSv2, Kelvin wave activity is underestimated across the equatorial regions, but especially near the date line where activity in the model is ~50% of observed (Fig. 8d). The absence of Kelvin wave activity in the CFSv2 was also noted by Goswami et al. (2017a) and was improved by introducing a stochastic multicloud cumulus parameterization. Goswami et al. (2017b) note that the stochastic multicloud cumulus parameterization better simulates the spectrum of rain rates in the tropics. Kelvin wave activity biases are smaller in the NESM and ECMWF model (Figs. 8b and 8c).
Regional biases in MRG/TD activity (Fig. 9) are closely related to the mean state biases in OLR (Fig. 5). Models with negative OLR biases in certain regions tend to have overactive MRG/TD activity there as well. Tropical cyclones can project onto the MRG/TD band, too (Aiyyer et al. 2012), which may partly account for the biases shown in Fig. 9. All three models overestimate MRG/TD activity over tropical West Africa. In this region, the MRG/TD band would be dominated by African easterly waves (Kiladis et al. 2006).
4. Predictive skill
In this section we examine the skill of the model forecasts of OLR and zonal winds both spatially and as a function of forecast lead time. We begin by examining the spatial variability of the skill in weekly averaged unfiltered fields. The skill of the models in predicting different modes of variability is then examined by comparing filtered forecasts and analyses. Last, model forecasts of OLR anomalies filtered with different wavenumber–frequency bands are combined and compared to unfiltered OLR observations to determine the skill contribution from individual filter bands as a function of forecast lead time.
Figure 10 shows the Pearson correlation between model forecasts and observations of OLR. Both the model forecasts and observations are unfiltered but are averaged over weeks 1 and 2 prior to computing the Pearson correlation. Areas of correlation >0.5, which are locally significant at the 99.5% confidence level are stippled. This higher confidence level is used to indicate a relatively small fraction of false null hypothesis rejections at the 95% level. The threshold of 0.5 is commonly used to indicate useful skill for the MJO (e.g., Kim et al. 2014). In addition, as will be shown in subsequent figures, correlation >0.5 roughly coincides with a negative ΔMAE indicating that the forecast beats climatology. In all three models, the correlation is highest in the midlatitudes at week 1 and associated with midlatitude cyclone activity. However, the correlation in the midlatitudes decreases more quickly than in the tropics so that the highest correlation is concentrated in the tropics and subtropics at week 2. Wheeler et al. (2017) examined precipitation skill in the ECMWF and Predictive Ocean Atmosphere Model for Australia (POAMA-1) subseasonal prediction systems and found similar results. Li and Robertson (2015) found that at monthly time scales the correlation between forecasted and observed convection in coupled global models becomes increasingly concentrated in the equatorial Pacific and is associated with ENSO. The ECMWF model has the highest skill at both weeks 1 and 2 while the CFSv2 has the lowest skill. Compared to the NESM and ECMWF, the week 2 skill in the CFSv2 is relatively low over Southeast Asia and the Maritime Continent where MJO activity is high.
Figure 11 shows the Pearson correlation between unfiltered model forecasts and ERA-Interim analyses of 850- and 200-hPa zonal winds at week 2. The skill of the models in forecasting 850-hPa zonal wind (Figs. 11a,c,e) is much higher than for OLR. In the NESM and ECMWF model, swaths of correlation >0.5 are collocated or downstream of the regions of high MJO activity. One area of skill extends from the Indian Ocean into the central Pacific while a smaller swath of high skill is found in the eastern Pacific. In the CFSv2, the highest correlation is also found straddling the Maritime Continent and in the eastern Pacific but is much lower.
The skill of the models in forecasting 200-hPa zonal wind is greatest south of the equator in the eastern Pacific and Atlantic and much higher in the ECMWF model than the NESM and CFSv2 (Figs. 11b,d,f). During the boreal summer, the subtropical jet in the Southern Hemisphere is stronger than the one in the Northern Hemisphere. The area of highest skill is located slightly equatorward of the Southern Hemisphere subtropical jet at roughly the same latitude where boreal summer MJO zonal wind variability is greatest (Adames et al. 2016). The skill difference between the ECMWF and NESM models in both OLR (Fig. 10) and zonal winds (Fig. 11) may be related to the comparatively large mean state biases in the NESM (Fig. 5).
Figure 12a shows the daily pattern correlation or ACC between filtered OLR anomalies from the NESM and filtered OLR observations over 20°S–20°N. In Fig. 12b the ACC between filtered OLR anomalies from the NESM model forecasts and unfiltered OLR observations is shown. The correlation between unfiltered anomalies from the model and observations (UF) is shown for reference in both Figs. 12a and 12b. In Fig. 12a the correlation drops off most slowly for the LF-filtered OLR and fastest for the MRG/TD-filtered OLR. Figure 12b illustrates the importance of different modes of variability to the total skill of the model forecast at different lead times. ER variability appears to be an important source of skill for the first few days. However, the skill contribution of anomalies filtered for ER waves as well as for Kelvin and MRG/TD waves drops off fairly quickly. Into weeks 2 and 3, LF and MJO variability appear to be the most important sources of skill for OLR. It is worth noting that the LF-filtered OLR from the NESM forecasts is better correlated with the unfiltered OLR observations than the unfiltered OLR from the NESM forecasts at the end of week 3 (Fig. 12b). This suggests that removing the less predictable high-frequency modes at this time scale could lead to a better forecast. Although not shown, qualitatively similar results are seen in the ECMWF and CFSv2 model forecasts.
To help explain the spatial variability in unfiltered OLR skill among the different models (Figs. 10 and 11), we now examine maps of the Pearson correlation between filtered model forecasts and filtered observations. Figure 13 shows the correlation between model forecasts and observations of OLR, which have both been filtered for LF (Figs. 13a,c,e) and MJO (Figs. 13b,d,f) signals at week 2. Consistent with Fig. 12a, the skill of the models in predicting LF OLR variability is much greater than the skill for the MJO. This is simply due to the higher persistence of LF OLR. In all three models, LF OLR skill is greatest along the equator and in the eastern Pacific (Figs. 13a,c,e), suggesting a connection to ENSO. These areas of high LF OLR skill also correspond to the areas of high unfiltered OLR skill at week 2 (Figs. 10b,d,f). The skill of the models in predicting MJO-filtered OLR is much lower than for LF-filtered OLR. The NESM and ECMWF model have areas of correlation >0.5 over the Maritime Continent and eastern Pacific in the MJO-filtered OLR (Figs. 13b and 13d), closely corresponding with the areas of high MJO activity (Fig. 6a). However, the skill is not proportional to MJO amplitude. The skill is higher in the eastern Pacific than the MJO amplitude would suggest, which may indicate that the skill is greatest in regions downstream of MJO initiation. In the CFSv2, the MJO-filtered OLR correlation over the Maritime Continent is comparatively low (Fig. 13f). Areas of elevated skill for MJO-filtered OLR are also observed outside of the deep tropics, such as over Australia, arising from teleconnection patterns associated with the MJO.
Figure 14 shows the Pearson correlation between MJO-filtered model forecasts and MJO-filtered ERA-Interim analyses of 850- and 200-hPa zonal winds at week 2. Differences in the MJO-filtered 850-hPa zonal wind correlation between the models are especially striking (Figs. 14a,c,e). In the NESM and ECMWF model, correlation >0.5 stretches unbroken from the Arabian Peninsula to Central America. The standard deviation of observed MJO-filtered 850-hPa zonal wind is greatest in a swath stretching from the Bay of Bengal into the western Pacific at 10°N (Adames et al. 2016). The swath of skill observed in the NESM and ECMWF model stretches much farther east than the swath of MJO variability. Again the increased skill downstream could be associated with established MJO events being more predictable than those that are developing. This would be consistent with studies that have examined MJO skill using the RMM index (Kim et al. 2014).
The skill in MJO-filtered 200-hPa zonal wind (Figs. 14b,d,f) is less concentrated at the equator than the skill in MJO-filtered 850-hPa zonal wind (Figs. 14a,c,e). Both the NESM and ECMWF model have relatively high skill over the Maritime Continent for MJO-filtered 200-hPa zonal winds compared to the CFSv2. The relatively high skill of the NESM and ECMWF model in predicting MJO-filtered OLR and zonal winds over the Maritime Continent suggests that the MJO is not inherently less predictable in this region.
Figure 15 presents a summary of MJO statistics for the three models over the Indian Ocean and Maritime Continent (IOMC; 7.5°S–15°N, 60°–155°E). Figure 15a shows the ACC between MJO-filtered OLR from model forecasts and observations. MJO skill is greatest in the ECMWF model and lowest in the CFSv2, consistent with Figs. 13b, 13d, and 13f. The ΔMAE is a measure of the difference in mean absolute error between model forecasts and observations relative to a forecast of climatology (section 2d). The ECMWF model beats a forecast of climatology by the largest margin for MJO-filtered OLR. Interestingly, although the NESM has a significantly better ACC than the CFSv2, its ΔMAE is only slightly better.
To explain the discrepancy between the ACC and ΔMAE of model forecasts of MJO-filtered OLR, we now examine errors and biases in both the phase and amplitude of MJO-filtered OLR. The NESM and ECMWF model have comparable phase errors (Fig. 15c) but the NESM has much larger amplitude errors than the other models at weeks 2 and 3 (Fig. 15d). Figure 15e shows the phase biases for MJO-filtered OLR; positive and negative biases indicate fast and slow biases, respectively. The phase bias in the CFSv2 is larger than in the other two models and indicates a slow bias. The ECMWF model also has a slow bias but the magnitude of this bias is smaller than what is seen in the CFSv2. In contrast, the MJO in the NESM is too fast; this is also apparent in Fig. 4. The comparatively poor performance of the NESM in ΔMAE appears to be linked to its large-amplitude errors (Fig. 15d). Over the IOMC, the NESM has a positive MJO amplitude bias that grows from weeks 1 to 3 (Fig. 15f). The ECMWF model and CFSv2 have positive and negative MJO amplitude biases, respectively, which are smaller in magnitude than the NESM biases. This is consistent with the standard deviation biases in MJO-filtered OLR shown in Figs. 6b–d.
Figure 16 shows the pattern correlation or ACC at weeks 1–3 between model forecasts of OLR that have been filtered for different modes (including combinations of modes) and unfiltered OLR observations. A comparison between unfiltered model forecasts and unfiltered observations (UF) is shown for reference. These figures indicate whether adding a certain mode of variability results in a more skillful forecast in different regions at different lead times. Two regions are considered: the tropics (20°S–20°N) and the IOMC. In both regions, adding MJO-filtered OLR to the LF-filtered OLR has a positive or neutral impact on the ACC at week 1. Additional improvements to the ACC are gained by adding ER-filtered OLR on top of this at week 1. Further addition of the Kelvin- and MRG/TD-filtered OLR (All) does not make significant improvements to the ACC for week 1 forecasts, but may add value for the first few days or add value in certain regions. When the entire tropics are considered, the contribution of MJO-filtered OLR to the ACC at week 2 is small in the NESM and ECMWF model; in the CFSv2 there is no significant contribution. This indicates that LF signals are the dominant contributor to model skill over the entire tropics. Over the IOMC region, the addition of MJO-filtered OLR makes larger improvements to the ACC at week 2 in the NESM and ECMWF model while still adding no skill in the CFSv2 (Figs. 16b,d,f).
Figure 17 illustrates the geographic distribution of contributions from the ER and MJO modes of variability to the predictive skill shown in Fig. 16. To calculate the MJO contribution we first calculate the Pearson correlation between model forecasts of LF + MJO signals and unfiltered observations. Then we subtract the Pearson correlation between model forecasts of LF signals and unfiltered observations. Plotting this difference shows where LF + MJO exceeds LF in Fig. 16. Similarly, we can compute the correlation of model forecasts of LF + MJO + ER signals and unfiltered observations and subtract the correlation of LF + MJO signals to yield the ER contribution. At week 1, the ER contribution is greatest over 15°–30°N (Figs. 17a,c,e). One area where the contribution is especially large is the northwest Pacific near the maximum in observed ER activity (Fig. 7a). However, there are differences between the distribution of ER skill contribution and observed ER activity. It could be that in the deep tropics the skill contributions from the LF and MJO signals are so large that it is difficult for the ER signals to improve upon them. The MJO skill contribution in the NESM and ECMWF models at week 2 are remarkably similar (Figs. 17b and 17d). The skill contribution is centered over the eastern Indian Ocean and Maritime Continent where MJO activity is greatest (Fig. 6a). The MJO signals in the CFSv2 do not contribute significantly to the skill at week 2 over the Indian Ocean and Maritime Continent (Fig. 17f), consistent with Fig. 16f.
For the ACC, the penalty for adding high-frequency modes with little predictive skill to the low-frequency modes is small (Fig. 16). This is not the case for ΔMAE (Fig. 18). Similar to what was seen in the ACC, adding MJO-filtered OLR to the LF-filtered OLR either improves or does not significantly reduce the ΔMAE in each region at week 1. However, only in the ECMWF over the IOMC does adding MJO-filtered OLR improve the ΔMAE at week 2, and this is not by a statistically significant amount (Fig. 18d). At weeks 2 and 3, adding Kelvin, ER, or MRG/TD signals to the LF and MJO signals increases the ΔMAE (Fig. 18) and does not improve the ACC (Fig. 16). At week 3, using only the LF-filtered signals from the model forecasts results in the lowest ΔMAE for all models and domains. These results suggest that the wavenumber–frequency-filtering method introduced in this study could be useful in postprocessing S2S forecasts to improve skill. However, the ideal combination of modes at different forecast lead times is likely sensitive to the variable being examined.
5. Summary and conclusions
In this study we examined the ability of three global coupled atmosphere–ocean models to predict tropical OLR and zonal wind variability at lead times of 1–3 weeks during the boreal summer. A new method of filtering limited-duration forecasts for different wavenumber–frequency bands was introduced and applied to the forecasts from these models. The models were evaluated by examining their ability to reproduce the observed standard deviation of MJO, Kelvin, ER, and MRG/TD activity. The predictive skill of the models at weeks 1–3 was explained by decomposing the anomalies into distinct wavenumber–frequency signals to determine the modes of variability that contribute to forecast skill in different regions and at different lead times. Compared to global MJO indices such as the RMM (Wheeler and Hendon 2004), this approach is better able to isolate regional variations in the skill of the models at forecasting the MJO.
Mean biases in OLR and winds as well as the biases in the standard deviation of OLR filtered for the MJO, Kelvin, ER, and MRG/TD bands were smallest in the ECMWF model. The NESM produced excess convection in the west Pacific as well as excess MJO, ER, and MRG/TD activity in this region. In contrast, the CFSv2 had a lack of convection over the western Pacific and MJO and MRG/TD activity, which was too weak in this region. The CFSv2 also had a lack of Kelvin wave activity across the tropics consistent with Goswami et al. (2017a).
At week 1 the areas of highest skill in forecasting unfiltered OLR were the midlatitudes. At week 2 the area of highest skill was located in the tropics and especially concentrated in the equatorial Pacific. Similar results have been found by earlier studies examining the anomaly correlation of precipitation in coupled global models (Li and Robertson 2015; Wheeler et al. 2017). Overall, the skill was highest in the ECMWF model and lowest in the CFSv2. Decomposing the OLR anomalies into low-frequency (LF) (>100 days) and MJO signals revealed similar skill in the low-frequency band between the three models but large differences in MJO skill. At week 2 the correlation between MJO-filtered OLR from observations and NESM and ECMWF forecasts exceeded 0.5 over two areas of high MJO activity: the Maritime Continent and eastern Pacific. In sharp contrast, the skill of the CFSv2 in forecasting MJO-filtered OLR was worse over the Maritime Continent than in the NESM and ECMWF model. The NESM performed worse than the ECMWF in mean absolute error (MAE) scores over the Indian Ocean and Maritime Continent because of the large MJO amplitude biases in the NESM. Comparisons of the ECMWF model to other operational coupled models typically show that the ECMWF model far exceeds the skill of other models in subseasonal tropical prediction (e.g., Jie et al. 2017).
The correlation between MJO-filtered 850-hPa zonal winds from NESM and ECMWF forecasts and ERA-Interim analyses exceed 0.5 over a large swath extending from the Arabian Peninsula to Central America. Similar to what was seen in the OLR, the highest skill was observed over and downstream of areas of high MJO activity. This suggests that forecasts of MJO events that are already established are more skillful, consistent with previous results (Kim et al. 2014). In general, the skill of the models in forecasting low-frequency and MJO time-scale 200-hPa zonal wind signals was greatest equatorward of the subtropical jets. More detailed analyses of the structure and behavior of composite MJO events in each of these models will help to better explain these regional variations in skill.
To evaluate the contribution of different modes of variability to the OLR forecast skill we examined the ACC and MAE between forecasts of individual filtered modes (as well as sums of filtered modes) and unfiltered OLR observations. MJO-filtered forecasts of OLR added to LF-filtered forecasts of OLR were better able to reproduce the unfiltered OLR observations than low-frequency OLR forecasts alone at week 1. Adding ER-filtered OLR from the forecasts was beneficial at week 1. Over the equatorial Indian Ocean and Maritime Continent—where intraseasonal variability during the boreal summer is strongest—the MJO made meaningful contributions to OLR ACC scores at weeks 1–3 in the NESM and ECMWF. In the CFSv2, MJO signals only made a positive contribution to the ACC in this region at week 1. Skill at weeks 2–3 in the CFSv2 over the Asian monsoon region came entirely from the LF signals. While adding high-frequency signals with little skill had little effect on the ACC scores, it seriously degraded the MAE scores. In general, MAE scores for LF OLR were better than or as good as climatology out to week 3. In contrast, the unfiltered OLR forecasts were significantly worse than climatology at week 3 in each of the models according to the MAE. These results suggest that the best OLR forecast at week 1 would be a combination of the LF, MJO, and ER-filtered signals. At week 2 only the LF and MJO signals are required and at week 3 only LF signals are required. It is important to note that zonal winds—which are the main contributors to most real-time MJO indices (Straub 2013)—are more predictable than OLR so the combination of filter modes yielding the best ACC and MAE at weeks 1–3 would be different than that for OLR.
The skill of the NESM in predicting winds as well as OLR at S2S time scales suggests the potential for extended-range tropical cyclone prediction (e.g., Belanger et al. 2010; Vitart et al. 2010). The genesis of tropical cyclones is modulated by the intraseasonal variability of shear, low-level vorticity, and midlevel humidity (Camargo et al. 2009). S2S predictions of tropical cyclone formation require an understanding of the regional variability of the importance of these different parameters as well as their predictability. The modulation of synoptic tropical variability such as Kelvin and MRG/TD waves by the MJO and interannual modes could also play an important role in tropical cyclone prediction.
Model performance in process-based diagnostics such as the moisture tendency as a function of precipitation rate in the tropics has been shown to be strongly correlated with MJO skill (Klingaman et al. 2015). The ability of the NESM to simulate the vertical structure of convection and its relationship with moisture has been a major focus during the development of the model. In Klingaman et al. (2015) an uncoupled version of NAVGEM with simplified Arakawa–Schubert (SAS) cumulus parameterization, which is also used in the CFSv2, had much poorer MJO skill than the coupled version of NAVGEM with a modified version of the Kain–Fritch parameterization shown in this study. That earlier version of NAVGEM also did a poor job of simulating the vertical structure of moisture tendencies as a function of precipitation rate. A closer examination of the moisture budget of the MJO and process-based diagnostics in the NESM and how they compare to other S2S models could guide further improvements to the model physics and MJO predictive skill.
This research is supported by the Office of Naval Research through the Navy Earth System Prediction Capability effort, the Departmental Research Initiative “Predictability of Seasonal and Intraseasonal Oscillations,” and the Chief of Naval Research through the NRL Base Program (PE 0601153N). Simulations were performed as part of the NMME-SubX experiment funded by the NOAA-Modeling, Analysis, Predictions, and Projections (MAPP) Program (NOAA-OAR-CPO-2016-2004413). Computational resources were supported in part by a grant of time from the High Performance Computing Modernization Program and the NESM forecasts were integrated at the Navy Department of Defense Supercomputing Resource Center, Stennis Space Center, Mississippi.