1. Introduction
Model validation and evaluation are an indispensable part of model improvement efforts. Standard performance-oriented metrics, such as anomaly correlation coefficient (ACC) and root-mean-square error (RMSE), are routinely used in operational centers for forecast verification. Although such metrics provide quantitative measures of how well a model performs, process-oriented metrics help to reveal model deficiencies and identify pathways to model improvement (e.g., Li et al. 2014, 2016; Maloney et al. 2019; Miller and Wang 2019a; Hagos et al. 2021). The objective of this study is to assess the performance of the latest generation of NOAA’s ensemble forecast system, the Global Ensemble Forecast System version 12 (GEFSv12), using a suite of process-oriented model diagnostics.
The GEFSv12 was implemented for operation on 23 September 2020. It has various new features, including the Finite-Volume Cubed-Sphere Dynamical Core (FV3; Harris and Lin 2013; Lin 2004), a more advanced six-category Geophysical Fluid Dynamics Laboratory (GFDL) microphysics scheme (Chen and Lin 2013; Zhou et al. 2019), improved model perturbation techniques [the stochastic physics perturbed tendencies (SPPT; Buizza et al. 1999; Palmer 1997, 2001) scheme and the stochastic kinetic energy backscatter (SKEB; Berner et al. 2009; Shutts 2005) scheme], and bias-corrected SSTs from a coupled model (Melhauser et al. 2017; Zhu et al. 2017; Saha et al. 2010) (for more information, see https://www.emc.ncep.noaa.gov/emc/pages/numerical_forecast_systems/gefs.php). These features are expected to reduce the model biases and improve the model prediction skill. We evaluate the GEFSv12 using a reforecast dataset, and our evaluation is focused on three levels of diagnostics: 1) evaluation of systematic model errors in representing moist convection and cloud processes; 2) evaluation of the sources of predictability relevant to extended-range forecasting [i.e., the Madden–Julian oscillation (MJO), weather regimes]; 3) evaluation of high-impact weather systems (i.e., tropical cyclones, atmospheric blocking). The diagnostics are made available to the community via the enhanced Model Evaluation Tools (METplus; Brown et al. 2021) and the Model Diagnostics Task Force (MDTF) diagnostic package (https://www.gfdl.noaa.gov/mdtf-diagnostics).
Model systematic errors, defined as the deviation between forecasts and the true state from a long-term-mean perspective (Simmons and Hollingsworth 2002; Jung 2005), may arise from errors in initial conditions, boundary forcings, and unresolved small-scale processes (such as atmospheric boundary layer, convection, cloud microphysics, and radiative processes). Different types of systematic errors dominate different spatiotemporal scales. For synoptic to subseasonal predictions, the parameterizations of diabatic processes are generally regarded as one of the major culprits for model systematic errors (Jung 2005). Thus, a detailed evaluation of moisture and precipitation is carried out to explore the potential deficiencies of the model.
Furthermore, the model’s representation of predictability sources affects the prediction skill of the model. On the extended-range time scale, the MJO (e.g., Madden and Julian 1971, 1972) is a primary source of predictability. The MJO is the dominant mode of intraseasonal variability in the tropics. In addition to the modulation of tropical precipitation, the MJO also influences the Asian monsoon, tropical cyclones, and the ENSO (e.g., Pai et al. 2011; Lim et al. 2017; Sobel and Maloney 2000; Hendon et al. 2007) and affects midlatitude weather and climate conditions via teleconnection (e.g., Henderson et al. 2017; Miller et al. 2022). Weather regimes are an important predictability source for extended-range forecasts in midlatitudes. Weather regimes are recurrent large-scale circulation patterns, persisting for several days to two weeks (e.g., Reinhold and Pierrehumbert 1982; Robertson and Ghil 1999). Weather regimes can be regarded as statistical–dynamical equilibria of a climate system (e.g., Cassou 2010; Eckmann and Ruelle 1985; Kimoto and Ghil 1993). Since they last longer than synoptic-scale weather systems, weather regimes can serve as a useful predictor for extended-range prediction. For example, Cassou (2010) showed that the Euro-Atlantic weather regimes are linked to temperature and rainfall anomalies in Europe, and Miller et al. (2020) skillfully predicted weekly tornado activity out to week 3 using weather regime frequency from dynamical model prediction. Other important sources of predictability on the extended-range time scale include land conditions and stratospheric processes, which are not examined in this study to avoid a lengthy article.
The level-3 diagnostics are focused on high-impact weather systems, including tropical cyclones and midlatitude blockings. Tropical cyclones are one of the most destructive storm systems, and blocking highs are large-scale, quasi-stationary high pressure systems that disrupt the eastward progression of synoptic-scale systems and may lead to weather extremes (e.g., Nakamura et al. 1997; Kennedy et al. 2016; Miller and Wang 2019b, 2022). Li et al. (2014) and Hamill and Kiladis (2014) examined the prediction skills of tropical cyclones (TCs) and blocking highs in the GEFSv10, respectively, which will be used as benchmarks to assess the improvements of the GEFSv12.
This paper is organized as follows. The GEFSv12 reforecasts, verification dataset, and methodology are briefly described in section 2. Section 3 contains the analysis of model systematic errors (level 1). Some sources of predictability for extended-range forecasts (i.e., MJO and weather regimes) are evaluated in section 4 (level 2). Model performance of high-impact weather systems (i.e., TCs and blockings) is examined in section 5 (level 3), followed by a discussion and summary in section 6.
2. Data and methodology
a. GEFSv12 reforecasts
The GEFSv12 retrospective forecasts span from 1 January 2000 to 31 December 2019. They are initialized once per day at 0000 UTC and run out to 16 days (Zhou et al. 2022; Hamill et al. 2022; Guan et al. 2022). Once weekly, an ensemble reforecast of 11 members was generated with an extended lead time up to 35 days. The initial conditions of the ensemble members are perturbed by 6-h EnKF ensemble forecasts (Zhou et al. 2022). The horizontal resolution of the archived data is 0.25° × 0.25° for the first 10 days and is degraded to 0.5° × 0.5° afterward (for more information, see https://noaa-gefs-retrospective.s3.amazonaws.com/Description_of_reforecast_data.pdf). Due to the large data volume, all reforecast output variables are regridded to 1° × 1° before evaluation, and only the control hindcasts with forecast lead times of 1–16 days are examined in this study. Although a larger ensemble size will likely increase the prediction skill, the diagnoses of model biases are expected to be insensitive to the number of ensemble members analyzed. To facilitate the comparison between the observation and reforecasts, the forecast lead times of 24–138 h (day 2–day 6), 144–258 h (day 7–day 11), and 264–378 h (day 12–day 16) are defined as pentad 1, pentad 2, and pentad 3, respectively.
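The pentad grouping can be expressed as a simple mapping from forecast lead hour to forecast day and then to pentad. The following is a minimal sketch rather than the code used in this study; the function names are hypothetical, and the convention that hours 0–23 belong to day 1 is an assumption chosen to match the day ranges quoted above.

```python
from typing import Optional

# Day ranges (inclusive) of the three pentads, as defined in the text.
PENTAD_DAYS = {1: (2, 6), 2: (7, 11), 3: (12, 16)}


def lead_hour_to_day(lead_hour: int) -> int:
    """Forecast day for a lead hour (assumed convention: hours 0-23 are day 1)."""
    return lead_hour // 24 + 1


def lead_hour_to_pentad(lead_hour: int) -> Optional[int]:
    """Pentad index (1, 2, or 3), or None if the lead hour is outside all pentads."""
    day = lead_hour_to_day(lead_hour)
    for pentad, (first_day, last_day) in PENTAD_DAYS.items():
        if first_day <= day <= last_day:
            return pentad
    return None
```

For example, `lead_hour_to_pentad(138)` falls in pentad 1 and `lead_hour_to_pentad(144)` in pentad 2, while day-1 lead hours return `None`.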
b. Verification dataset
c. MJO indices
To evaluate the MJO in the GEFS, a standard method (Gottschalck et al. 2010) is employed to compute the Real-time Multivariate MJO Indices (RMM1 and RMM2; Wheeler and Hendon 2004). RMM1 and RMM2 are the first two principal components of the combined fields of latitudinally averaged (15°S–15°N) TOA OLR, 200- and 850-hPa zonal wind from 1 January 2000 to 31 December 2019. The RMM indices from the GEFS are evaluated against those derived from the ERA5 dataset. In addition to all-season evaluations, the extended winter [December–March (DJFM)] and summer [July–October (JASO)] seasons are examined separately.
d. Weather regime identification and relative entropy
Weather regimes are examined during the winter season (DJFM) over North America (10°–70°N, 150°–40°W). A commonly used statistical method to identify weather regimes is the K-means clustering algorithm (e.g., Michelangeli et al. 1995; Vigaud et al. 2018; Robertson et al. 2020a). Similar to Vigaud et al. (2018), weather regimes for ERA5 are obtained as follows: 1) daily 500-hPa geopotential height (Z500) anomalies are calculated by removing the climatological seasonal cycle and linear trend at each grid point on a daily basis from December 2000 to March 2019 in winter seasons (19 winters in total); 2) high-frequency, small-scale variability is removed by reconstructing the Z500 field using the four leading empirical orthogonal functions (EOFs), which account for 53.8% of the total variance; 3) the K-means clustering method is then applied to the reconstructed Z500 field to identify four weather regimes using the Euclidean distances. The procedure for obtaining the four forecast weather regimes is similar, except that 1) the four leading ERA5 EOFs are used to reconstruct the GEFS Z500 field; 2) the regime type is assigned based on the distances between the reconstructed GEFS Z500 and the ERA5-derived regimes. As tested in our calculations (not shown) and consistent with previous studies (e.g., Straus et al. 2007; Dawson et al. 2012; Fabiano et al. 2020), the weather regime patterns remain qualitatively the same when the number of EOFs varies from 4 to 14. In addition, the EOFs derived from the GEFS closely resemble those derived from the ERA5, and the regime identification is not sensitive to small changes of the analysis domain.
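The assignment of a forecast field to a regime can be sketched as a projection onto the reference EOFs followed by a nearest-centroid search. The code below is an illustration only: the function names, the flattened-list data layout, and the toy numbers are hypothetical. It assumes orthonormal EOFs, in which case Euclidean distance between principal-component vectors equals the distance between the EOF-reconstructed fields. Area weighting and the K-means fit itself are omitted.

```python
import math
from typing import List, Sequence


def project_onto_eofs(anomaly: Sequence[float],
                      eofs: Sequence[Sequence[float]]) -> List[float]:
    """Principal components: dot product of a flattened anomaly field with each EOF."""
    return [sum(a * e for a, e in zip(anomaly, eof)) for eof in eofs]


def assign_regime(pcs: Sequence[float],
                  centroids: Sequence[Sequence[float]]) -> int:
    """Index of the regime centroid closest to `pcs` in Euclidean distance."""
    distances = [math.dist(pcs, centroid) for centroid in centroids]
    return distances.index(min(distances))


# Toy example: a 4-point "grid", 2 EOFs, and 3 regime centroids in PC space.
reference_eofs = [[1, 0, 0, 0], [0, 1, 0, 0]]
regime_centroids = [[2, -1], [-2, 1], [0, 0]]
forecast_anomaly = [2.0, -1.0, 0.5, 0.3]

forecast_pcs = project_onto_eofs(forecast_anomaly, reference_eofs)
forecast_regime = assign_regime(forecast_pcs, regime_centroids)
```

Projecting the forecast onto the reference EOFs, rather than onto EOFs computed from the forecast itself, keeps the forecast and reference regimes in the same coordinate system, so the distances are directly comparable.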
To see how the relative entropy can be used to assess model performance, consider a model with a sufficiently long reforecast record and some initial condition errors. The value of R should be nonzero at day 0, where it purely represents initialization errors. If the model is perfect, the relative entropy should decrease to zero after the initial adjustment completes. For an imperfect model, a statistical equilibrium is achieved when R no longer changes systematically with time, and the value of R at this stage represents the errors due to model biases. Evaluating the temporal evolution of R with forecast lead time can thus help us compare initial condition errors with model biases.
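As an illustration of the quantity discussed above, the sketch below assumes that R takes the standard Kullback–Leibler form, R = Σ p_i ln(p_i / q_i), where p is a forecast probability distribution and q is the corresponding reference distribution over the same discrete bins. The binning, the choice of natural logarithm, and the function name are assumptions and may differ from the formulation used in this study.

```python
import math
from typing import Sequence


def relative_entropy(p: Sequence[float], q: Sequence[float]) -> float:
    """Kullback-Leibler divergence of p from q (natural log), for discrete bins.

    Assumes p and q each sum to 1. Bins with p_i == 0 contribute nothing;
    a bin with p_i > 0 and q_i == 0 makes the divergence infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue
        if q_i == 0.0:
            return math.inf
        total += p_i * math.log(p_i / q_i)
    return total
```

With this form, R is zero only when the two distributions are identical and is positive otherwise, which matches the interpretation of R as a measure of the departure of the forecast statistics from the reference statistics.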
e. Tropical cyclone detection
We adopted the GFDL vortex tracker 3.9a (Marchok 2021) for TC detection and tracking. Key input variables include mean sea level pressure, geopotential height (700, 850 hPa), horizontal wind fields (700, 850 hPa, 10 m), relative vorticity (700, 850 hPa, 10 m), and the geopotential thickness (500–850 and 200–850 hPa). To exclude extratropical cyclones and weak, transient cyclones, several additional criteria are applied to define TCs in the GEFS reforecasts: 1) the accumulated duration of the 300–500-hPa warm core should be at least 48 h; 2) based on the data resolution, the maximum 10-m wind speed of the TC should be greater than 16.5 m s−1 during its lifespan (Walsh et al. 2007); 3) the genesis location (i.e., the first record of a TC track) should be located between 40°S and 40°N. Since it is difficult to differentiate newly generated TCs from preexisting TCs around the initialization time, the TCs identified on day 1 (first 24-h forecast) will not be considered. Additionally, due to the restriction of warm-core duration, tropical disturbances that form after day 11 (hour 264) are excluded as well. Similar adjustments are made to the calculation of TC accumulated cyclone energy (ACE; Bell et al. 1999).
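The additional screening criteria can be summarized as a filter applied to each candidate track. The sketch below is not part of the GFDL vortex tracker; the `Track` record and its field names are hypothetical. It also assumes that the latitude bounds are inclusive and that a genesis time exactly at hour 24 or hour 264 is retained.

```python
from dataclasses import dataclass


@dataclass
class Track:
    warm_core_hours: float   # accumulated duration of the 300-500-hPa warm core (h)
    max_wind_10m: float      # lifetime maximum 10-m wind speed (m/s)
    genesis_lat: float       # latitude of the first track record (deg, north positive)
    genesis_lead_hour: int   # forecast lead hour of the first track record


def is_tropical_cyclone(track: Track) -> bool:
    """Apply the warm-core, intensity, latitude, and lead-time criteria."""
    return (
        track.warm_core_hours >= 48.0
        and track.max_wind_10m > 16.5
        and abs(track.genesis_lat) <= 40.0
        # Exclude day-1 detections and disturbances forming after hour 264.
        and 24 <= track.genesis_lead_hour <= 264
    )
```

For instance, `is_tropical_cyclone(Track(60.0, 20.0, 15.0, 72))` passes all four tests, whereas the same track with a 36-h warm core would be rejected.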
We tracked TCs using 6-hourly GEFS reforecasts, and TC statistics are then derived for each pentad and evaluated against the International Best Track Archive for Climate Stewardship (IBTrACS) dataset version 4 (Knapp et al. 2018). To be consistent with the GFDL tracker, we excluded tropical depressions and tropical cyclones with a duration of less than 48 h.
f. Blocking identification
3. Model systematic errors
a. Precipitation and moisture
Precipitation and humidity biases in the summer season are examined first in this section. The biases in winter are broadly similar to those in summer and thus are not examined here for brevity. As shown in Fig. 1, precipitable water (PW) and total precipitation rate (TP) from the GEFS resemble those from the ERA5. Negative biases of PW and positive biases of precipitation prevail in the deep tropics during pentad 1 (Figs. 1c,d). The precipitation and PW biases increase from pentad 1 to pentad 2 over the Indian Ocean–western Pacific warm pool region, although they decrease in many other regions (Figs. 1d,h).
To understand the TP and PW biases, Fig. 2 shows the histogram of PW and TP as a function of PW over the tropical oceans. Daily mean data are used to construct these plots, and the western Pacific (20°S–20°N, 120°E–180°) and eastern Pacific (20°S–20°N, 180°–90°W) are examined separately because different convective modes are dominant in the two basins (e.g., Back et al. 2017).
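A diagnostic of this kind can be built by binning daily mean samples by PW and recording, for each bin, the relative frequency of occurrence and the mean precipitation rate. The following is a minimal sketch; the function name, the 2-mm bin width, and the sample values are hypothetical and are not taken from this study.

```python
import math
from collections import defaultdict
from typing import Dict, Sequence, Tuple


def bin_precip_by_pw(pw: Sequence[float], tp: Sequence[float],
                     bin_width: float = 2.0) -> Dict[float, Tuple[float, float]]:
    """Map each PW bin (left edge, mm) to (relative frequency, mean TP in the bin)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pw_value, tp_value in zip(pw, tp):
        left_edge = math.floor(pw_value / bin_width) * bin_width
        sums[left_edge] += tp_value
        counts[left_edge] += 1
    n_samples = len(pw)
    return {edge: (counts[edge] / n_samples, sums[edge] / counts[edge])
            for edge in sorted(counts)}


# Illustrative values only: PW in mm, TP in mm/day.
sample_pw = [50.5, 51.0, 56.2, 57.9]
sample_tp = [2.0, 4.0, 10.0, 14.0]
binned = bin_precip_by_pw(sample_pw, sample_tp)
```

The first element of each tuple gives the PW histogram and the second gives the TP–PW curve, so both quantities come from the same set of samples.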
The histogram of the ERA5-obtained PW over the western Pacific is characterized by a single prominent peak around 56 mm, corresponding to deep convection and a high precipitation rate (Figs. 2a,c). In contrast, the histogram of the observed PW over the eastern Pacific is characterized by a bimodal distribution (Brown and Zhang 1997; Zhang et al. 2003), with a primary peak around 28 mm and a secondary peak around 54 mm. The primary peak can be linked to trade wind cumuli or congestus convection, and the secondary peak is associated with deep convective precipitation and stratiform precipitation in organized mesoscale convective systems (Ahmed and Schumacher 2015). The GEFS shows biases in both basins. More specifically, the GEFS underestimates the frequency of high PW values over the western Pacific. With increasing forecast lead time, the peak frequency increases and shifts from 53 mm toward 50 mm, which is consistent with the increasing dry biases over the western Pacific warm pool (Figs. 1c,d). Over the eastern Pacific, the GEFS shows a bimodal distribution similar to the ERA5 but with a less prominent secondary peak. The primary peak frequency decreases with increasing forecast lead time from day 1 to day 5 and does not change appreciably afterward. The decreasing primary peak frequency is accompanied by an increase in the secondary peak frequency. This implies an underestimated occurrence of shallow convection and an overestimated occurrence of deep convection, which makes the bimodal distribution of PW in the GEFS less pronounced than in the ERA5. Mapes et al. (2018) suggested that the bimodal distribution of tropical humidity is attributed to a distinct moist atmospheric regime with sharp dry margins. The coexistence of the two modes may only occur when the drying time scale is smaller than the mixing time scale (Zhang et al. 2003).
The poorly defined bimodal distribution of PW in the GEFS suggests that the drying time scale may be overestimated, or the mixing time scale may be underestimated, leaving a more homogeneous moist field.
The nonlinear relationship between TP and PW is presented in Figs. 2c and 2d. Compared to the observation, the precipitation rate for a given value of PW is overestimated in the GEFS, and the TP–PW curve shows a steeper slope than the observation, especially over the eastern Pacific. Overall, Fig. 2 suggests that convection onset occurs too early in terms of column water vapor accumulation. Since precipitation removes water vapor from a column, the early convection onset prevents further accumulation of column water vapor and contributes to the negative biases of PW in many tropical regions (Figs. 1c,d). The early onset of convection is a commonly reported issue for various global models (Li et al. 2014, 2016; Lim et al. 2018; Kim et al. 2019). In addition, the GEFSv12 underestimates the frequency of precipitation less than 2 mm day−1 but overestimates the frequency of drizzle (2–12 mm day−1) and heavy precipitation (more than 30 mm day−1) (Fig. S1 in the online supplemental material).
Tropospheric humidity is closely coupled to convection. On the one hand, convective clouds can moisten the ambient environment via upward transport of moisture from the boundary layer and lateral detrainment; on the other hand, ambient humidity or dryness affects the intensity and depth of convective updrafts (e.g., Brown and Zhang 1997; Wang 2014). Further analysis reveals cold biases in tropospheric temperature (not shown). However, dry biases exist in the histograms of column relative humidity, and discrepancies are evident between the GEFS and observations in the precipitation rate–column relative humidity relationship (Fig. S2). The biases in precipitation and tropospheric humidity thus cannot be completely attributed to temperature biases. To shed light on the model performance in representing convective processes, vertical profiles of relative humidity (RH) are stratified based on 500-hPa omega under rainy conditions (precipitation rate > 0.1 mm day−1) over the tropical ocean (20°S–20°N; Fig. 3). Since most model mean state biases develop during pentad 1, the subsequent analysis will mainly focus on this time period.
A clear feature of Fig. 3a is the moist free troposphere in ascending regions (omega < 0 Pa s−1) and the dry free troposphere in descending regions. A trimodal vertical distribution (Fig. 3a) is discernible for omega < −0.15 Pa s−1, which roughly corresponds to shallow trade cumulus (∼900–750 hPa), congestus convection (∼500 hPa), and deep stratiform process (∼250 hPa). The GEFS shows a pattern that is broadly consistent with the ERA5 (Figs. 3a,b). Figure 3c shows that the GEFS overestimates the free tropospheric RH and underestimates the boundary layer RH in ascending regions, and a nearly opposite pattern is found in descending regions. This, again, implies that the detrainment may be overestimated for deep convection, which acts to moisten the mid–upper troposphere (300–600 hPa). As noted by J. Y. Han et al. (2016) and Han et al. (2017), the rain conversion rate in the GEFS-employed simplified Arakawa–Schubert (SAS) scheme is set to decrease exponentially with height above the freezing level in order to increase high cloud fraction, which eventually results in abundant moisture detraining into the gridscale air column and the subsequent moistening of the mid–upper troposphere. The recent update in the GEFS SAS scheme also includes a higher entrainment rate to reduce convective activity (Han et al. 2017). These two modifications likely reduce the buoyancy of moist air parcels and thus hinder convective penetration into high altitudes, leaving a negative bias of RH above 200 hPa, along with an underestimated tropopause height. The negative biases of lower-level RH and specific humidity (not shown, but with a similar bias pattern) in the ascending regions suggest overtransport of lower-level moisture into the free troposphere. In addition, biases in other processes, such as SST or horizontal moisture advection in response to convective heating, may also contribute to the humidity biases in the boundary layer.
b. Clouds
Investigation of vertical cloud structure is desirable in order to assess the microphysics parameterization and the implication for radiative forcing. However, 3D cloud data are not available in the GEFS reforecasts. Our evaluation below will focus on cloud fraction and TOA OLR. The TOA OLR is obtained from CERES Synoptic (SYN1deg) product (Doelling et al. 2013, 2016) and the total cloud fraction is from the GCM-Oriented CALIPSO Cloud Project (Chepfer et al. 2010). Regarding cloud fraction, it would be ideal to derive synthetic cloud fraction using a satellite simulator to facilitate a direct comparison between model output and satellite data. Unfortunately, this approach is not feasible due to the lack of variables in the GEFS reforecasts required by a satellite simulator. The comparison here thus needs to be taken with caution.
Globally, the spatial distributions of cloud fraction and OLR in the GEFS closely resemble those observed, as shown in Fig. 4. Large cloud fraction is found in monsoon regions, along the intertropical convergence zone (ITCZ), and in the subtropical eastern oceans where stratus clouds prevail. Meanwhile, low OLR values are present in regions of heavy precipitation (i.e., monsoon regions and the ITCZ; see Fig. 1). Figure 4c shows that negative cloud fraction biases prevail over the tropical and subtropical Pacific and Atlantic, except for positive biases over the subtropical eastern oceans. Such positive biases are associated with negative biases closer to the coast, corresponding to an off-coast displacement of stratus clouds in the GEFS. Comparison between Figs. 4c and 4f reveals that the negative (positive) cloud fraction biases over the subtropical eastern Pacific are often accompanied by positive (negative) OLR biases, implying that the OLR biases in these regions can be partly attributed to the cloud fraction biases. Further analysis shows that the underestimated cloud fraction over the eastern Pacific is collocated with negative biases of boundary layer RH in this region (20°–35°N, 120°–140°W; Fig. S3). Such negative biases also appeared in an earlier GFS version (Xiao et al. 2014), and the recent updates in the planetary boundary layer scheme failed to sufficiently reduce such biases (J. Han et al. 2016). It is also interesting to note that OLR in the warm pool region is largely overestimated (Fig. 4f) while there is little bias in cloud cover (Fig. 4c). This indicates that the cloud-top height might be underestimated or that high clouds might be optically too thin, which is consistent with the negative RH biases above 200 hPa shown in Fig. 3.
c. Convective structure analysis
As shown in Figs. 5a and 5b, the Indian Ocean–western Pacific warm pool is characterized by positive top-heaviness ratio, while negative top-heaviness ratio is found in the SPCZ and ITCZ region. Positive values are often a combination of deep convection and the subsequent deep stratiform process, while negative values are associated with shallow convection, which is common over the eastern Pacific and Atlantic. The differences of top-heaviness ratio between pentad 1 and ERA5 (Fig. 5c) illustrate that the GEFS has negative biases in the western Pacific and positive biases in the tropical central and eastern Pacific. This is associated with underestimated deep convection fraction over the western Pacific and overestimated deep convection fraction in the eastern Pacific, which is consistent with the precipitation biases discussed earlier (Fig. 1g; negative biases in the western Pacific and positive biases in the eastern Pacific).
4. Sources of predictability for extended-range forecasts
Decision making on agricultural planning and water management often falls on the extended-range time scale. This time scale resides between weather forecasting and climate prediction and is often referred to as the “predictability desert” (e.g., Vitart et al. 2012; Robertson et al. 2020b). Past studies (e.g., Lorenz 1963, 1969; Boer 2003; Privé and Errico 2015; Toth and Buizza 2019) suggest that the atmospheric predictability limit depends on the spatiotemporal scales: although the noise associated with synoptic and smaller scales in numerical weather models increases quickly, some large-scale predictable components can still be extracted on the longer time scales. Such predictable components can be linked to the slowly varying components of the climate system (i.e., SST, sea ice, and land) and low-frequency variability of the climate system, such as the weather regimes, MJO, ENSO, and stratospheric processes (e.g., Robertson et al. 2020a,b; Toth and Buizza 2019; Hochman et al. 2019). In this section, we will evaluate the MJO and North American weather regimes in the GEFS.
a. The MJO
The GEFS prediction skill of the MJO is assessed using the bivariate correlation (Fig. 6a). The bivariate correlation remains above 0.6 up to 16 days in the GEFSv12 control hindcast from January 2000 to December 2019, in contrast to ∼0.4 based on the GEFSv10 ensemble mean during December–February 1985–2012 (Hamill and Kiladis 2014). The skill, however, appears lower than what is reported for some SubX and S2S models (Lim et al. 2018; Kim et al. 2019). In particular, the GEFSv11 ensemble forecasts can predict the MJO up to ∼21 days ahead (Kim et al. 2019). The lower skill found here is mainly because only the control run is evaluated; increasing the ensemble size is expected to improve the model prediction skill. Figure 6a also shows that the skill is higher for strong MJO days (RMM magnitude greater than 1.0) but decreases quickly and drops below 0.6 on day 9 for weak MJO days. Compared with GEFSv10, the latest GEFS version 12 includes the two-tiered SST scheme to ensure that coupled systems like the MJO can benefit from a bias-corrected SST from a coupled model (i.e., CFS v2) (Zhu et al. 2018; Li et al. 2019).
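For reference, the bivariate correlation at a given lead time can be computed from paired reference and forecast RMM indices. The sketch below assumes the definition commonly used for RMM verification, in which the sum of the products of the two index pairs is normalized by the root-sum-square amplitudes of each series. The function name and the list-of-pairs layout are hypothetical.

```python
import math
from typing import Sequence, Tuple

RmmPair = Tuple[float, float]  # (RMM1, RMM2) on one verification date


def bivariate_correlation(obs: Sequence[RmmPair], fcst: Sequence[RmmPair]) -> float:
    """Bivariate correlation between reference and forecast RMM indices.

    `obs` and `fcst` must hold values for the same verification dates,
    with all forecasts taken at a single lead time.
    """
    numerator = sum(a1 * b1 + a2 * b2 for (a1, a2), (b1, b2) in zip(obs, fcst))
    obs_norm = math.sqrt(sum(a1 * a1 + a2 * a2 for a1, a2 in obs))
    fcst_norm = math.sqrt(sum(b1 * b1 + b2 * b2 for b1, b2 in fcst))
    return numerator / (obs_norm * fcst_norm)
```

Under this definition, a forecast that reproduces the reference RMM phase exactly scores 1 regardless of a uniform amplitude error, and a forecast displaced by a quarter cycle in phase space scores 0, so the metric mainly measures phase agreement.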
Pearson correlation of the RMM magnitude between the GEFS and ERA5 is shown as a function of the MJO phase and forecast lead time in Figs. 6b and 6c, which helps to offer an overview of the model’s ability to reproduce the observed MJO magnitude. The correlation decreases more rapidly with increasing forecast lead time in summer, especially for the MJO phases 3 and 4, when the center of active convection is located over the eastern Indian Ocean and the Maritime Continent. This may be related to the barrier effect of the Indo-Pacific Maritime Continent on the MJO (Zhang and Ling 2017). The correlation skill in winter is consistently higher than that in summer except for phase 1 (i.e., the MJO initiation phase) where the skill drops below 0.6 on day 16.
To illustrate the MJO propagation, Hovmöller diagrams of 20–100-day bandpass filtered GPCP precipitation anomalies averaged over 10°S–10°N are shown in Fig. 7. The reference domain for the regression is set to the eastern Indian Ocean (5°S–5°N, 75°–85°E). The observed precipitation anomalies (Fig. 7a) exhibit a clear eastward propagation from the western Indian Ocean. The eastward propagation is weakened by the Maritime Continent but resumes over the western Pacific. The MJO signals in the GEFS day-2 reforecasts closely resemble those from the GPCP. The eastward propagation of enhanced precipitation from the western Indian Ocean (50°E) to the western Pacific Ocean (150°E) remains robust even in the day-16 reforecasts, although the signals weaken with increasing forecast lead time. This represents a substantial improvement of the MJO representation in the model, as many models have difficulty reproducing the eastward propagation of the MJO signals across the Maritime Continent. On the other hand, the weakened MJO signals at long forecast lead times may be tied to the mean state biases. More specifically, Jiang (2017) and Kim (2017) suggested that the correct representation of mean horizontal moist static energy and moisture is critical for the eastward propagation of the MJO. The weakening MJO signals may be partly attributed to the PW biases over the Indian Ocean–western Pacific (Figs. 1c,d).
To investigate the model deficiencies relevant to the MJO physics, the longitude–pressure cross section of RH (averaged over 10°S–10°N) is regressed on the normalized 20–100-day bandpass filtered precipitation anomalies over the same reference region (i.e., 5°S–5°N, 75°–85°E) (Fig. 8). In the ERA5, positive RH anomalies are found throughout the troposphere over the eastern Indian Ocean with a single peak around 200–450 hPa, which can be linked to the enhanced deep convection. East of the Indian Ocean are the upper-level negative RH anomalies associated with the suppressed MJO phase and the lower-level positive RH anomalies relating to premoistening ahead of the MJO active convection. The structure corresponds to the transition from shallow to deep convection in the MJO (e.g., Hung et al. 2020; Li et al. 2020; Wei et al. 2020). The GEFS captures the enhanced RH throughout the troposphere over the Indian Ocean at all the forecast lead times, but the RH profile is less top-heavy. Additionally, the premoistening to the east occurs only in a very shallow layer below 800 hPa, which results in an abrupt shallow-to-deep convection transition. These biases are consistent with the biases of deep convection and congestus convection (i.e., midlevel convection) discussed in the level-1 diagnostics.
Regarding the MJO propagation (Figs. 9a,b), the ERA5 shows a flat peak between 2° and 12° day−1 in winter (Fig. 9a) and a well-defined peak around 3° day−1 in summer (Fig. 9b). The mean propagation speed (vertical lines) is higher in the winter (7.8° day−1) than in the summer (4.8° day−1). The GEFS shows similar distributions for day 1 to day 4 but the probability distributions in both seasons quickly widen with increasing forecast lead time, and the mean propagation speed (colored vertical lines) slows down progressively with increasing forecast lead time. Similar biases are documented in the previous GEFSv10 (Hamill and Kiladis 2014) and other S2S models (e.g., Vitart 2017). The probability distribution of the MJO magnitude changes in the ERA5 (Figs. 9c,d) shows a sharp peak around zero in both the summer and winter. In the GEFS, the probability distribution also peaks close to zero, but it broadens with the forecast lead time after day 4, suggesting that the model has difficulty maintaining the steady state of the MJO with increasing forecast lead time. Such biases in the propagation speed and magnitude have the potential to further influence the midlatitude extended-range prediction, which will be examined in a later subsection.
b. Weather regime
Weather regimes are an important source of predictability on the extended-range time scale in the extratropics (e.g., Reinhold and Pierrehumbert 1982; Cassou 2010). Previous studies (e.g., Vigaud et al. 2018; Lee et al. 2019; Robertson et al. 2020a) have shown that there are typically four types of weather regimes over North America (10°–70°N, 150°–40°W): Arctic high (Fig. 11a; AH/−NAO), Pacific ridge (Fig. 11b; PR), Pacific trough (Fig. 11c; PT), and West Coast ridge (Fig. 11d; WCR) (see also Fig. S5). The Arctic high regime is characterized by a meridional dipole pattern, resembling the negative North Atlantic Oscillation (NAO) phase, while the other three regimes have a Rossby wave train pattern.
Figure 10a shows the ACC of 7-day moving average of regime frequency of occurrence between the ERA5 and GEFS reforecasts as a function of the forecast lead time. The ACC remains above 0.6 up to 12 days, which is comparable to the performance of the ECMWF S2S reforecasts (Vigaud et al. 2018). The Arctic high regime has a higher ACC than the other regimes at all forecast lead times, and the ACC remains above 0.6 up to day 13. The Pacific ridge and Pacific trough regimes have similar ACC values, which are close to the average ACC over the four regimes, while the West Coast ridge regime has a slightly lower ACC.
The persistence probability of weather regimes is examined for pentad 1, pentad 2, and pentad 3 with respect to ERA5. The persistence probability is defined as the probability for a regime to persist on the following day. All regimes have a persistence probability above ∼0.7 in the ERA5, with the Arctic high regime having the highest value around 0.83. The persistence probability is well reproduced for all regimes and for all pentads in the GEFS, with relatively small fluctuations (<0.1).
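Given a daily sequence of regime labels, the persistence probability defined above is the fraction of days in a regime that are followed by a day in the same regime. The following is a minimal sketch with a hypothetical function name and toy labels. In practice the sequence should not run across the gap between two winters, which is not handled here.

```python
from typing import Hashable, Sequence


def persistence_probability(labels: Sequence[Hashable], regime: Hashable) -> float:
    """Probability that a day in `regime` is followed by a day in the same regime."""
    days_in_regime = 0
    days_persisting = 0
    # Pair each day with the following day; the last day has no successor.
    for today, tomorrow in zip(labels, labels[1:]):
        if today == regime:
            days_in_regime += 1
            if tomorrow == regime:
                days_persisting += 1
    if days_in_regime == 0:
        return float("nan")
    return days_persisting / days_in_regime


# Toy sequence of daily regime labels (illustrative only).
daily_labels = ["AH", "AH", "AH", "PT", "PT", "AH"]
ah_persistence = persistence_probability(daily_labels, "AH")
```

In this toy sequence, two of the three Arctic high days that have a successor are followed by another Arctic high day.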
The relative entropy is shown in Fig. 10c. To increase the sample size, two subsequent forecast lead times are bundled together to calculate the probability. On days 1–2, the mean relative entropy is around 0.15 (black solid line), which mainly represents the initial condition errors (see discussion in section 2d). The relative entropy then increases with the forecast lead time, indicating the contribution by model biases. The increasing trend of relative entropy ceases after day 7 for all weather regimes (the fluctuation around day 10 is due to some missing data in winter months in the reforecast data), suggesting that the simulated atmosphere has reached a statistical equilibrium state. The relative entropy value at this stage (∼0.25) is almost twice as large as that on days 1–2, indicating the relative contributions of initial condition errors and model physics deficiencies. Also shown in Fig. 10c is the ACC of daily 500-hPa geopotential height field between the ERA5 and reforecasts over the same reference region. The ACC shows that the model has high ACC (>0.9) before day 6, but the ACC drops sharply afterward. An interesting finding is that the Arctic high regime has the largest relative entropy value and the highest ACC. This suggests that the biases in the weather regime spatial pattern (represented by ACC and relative entropy) may not strongly affect the prediction skill of its frequency.
c. Impacts of the MJO on North American weather regimes
Although weather regimes can be regarded as a source of predictability for the midlatitude atmosphere, they are modulated by other processes, such as the MJO (e.g., Cassou 2008; Vigaud et al. 2018). A better representation of such links can help improve the prediction skill of weather regimes. Figure 11 illustrates the anomalous percentage of weather regime days at different lags following each of the eight MJO phases. Anomalies are defined with respect to the observed mean weather regime frequency (ERA5). Their significance is tested with Monte Carlo experiments of 1000 members, each generated as a first-order Markov series based on the regime transition probability derived from ERA5. The observed lag relationship (Figs. 11a–d) is consistent with previous studies by Cassou (2008) and Riddle et al. (2013) to some extent and indicates that all four regimes are significantly modulated by the MJO. For example, the frequency of the Arctic high is reduced by ∼40% at 12–15 days following MJO phase 3 (Fig. 11a), suggesting the potential to predict the midlatitude weather patterns two weeks in advance if a model can capture the remote response of a weather regime to the MJO. As the MJO shifts from phase 3 to phase 6, the frequency reduction of the Arctic high regime occurs with a decreasing lag time, until they occur simultaneously at around phases 6–7. Persistent positive frequency anomalies (∼40%–60%) are found from day 0 to day 15 during phase 1. The Pacific trough regime has a positive (negative) anomaly response in phases 2–4 (phases 5–1), and the frequency increase (reduction) also tends to occur with a shortening lag time. The suppressed anomaly signals within the first five lag days suggest that a relatively long adjustment time is needed for the formation of such a circulation.
Compared with the aforementioned two regimes, the robust responses of the West Coast ridge and Pacific ridge regimes to the MJO are confined to a limited number of phases. The West Coast ridge regime shows a reduced (increased) occurrence frequency in phase 2 (phases 7–8) within the first 10 lag days, while the Pacific ridge regime shows a reduced (increased) frequency only in phase 8 (phase 5) within the first 5 lag days.
As shown in Figs. 11e–h, the GEFS broadly captures the response of the weather regime occurrence frequency to the MJO. However, quantitative differences exist in the magnitude and lag time. For the Arctic high, the GEFS underestimates the significant anomalies at long lags and erroneously produces significant positive anomalies at lags of 13–15 days in phase 8. This suggests that the simulated midlatitude atmosphere, in the case of this regime, does not possess a medium-range memory of the MJO. Regarding the Pacific trough regime, the GEFS captures the broad pattern of enhanced regime frequency in phases 2–5 and reduced regime frequency in the other MJO phases, but quantitative differences exist. The anomalous signals of the Pacific ridge and West Coast ridge generally resemble the ERA5, with relatively weak significant anomalies. A common feature of the four regimes is the less coherent signal in the GEFS at long lags, which is possibly due to the less coherent MJO pattern with increasing forecast lead time (Fig. 9).
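The Monte Carlo significance test used for the lagged regime frequency anomalies can be sketched as follows. This is a minimal illustration under stated assumptions rather than the exact procedure used here: the function names, the random initial state, the two-sided 5% level, and the way sample days are selected (`day_indices`, standing for the days at a given lag after a given MJO phase) are all hypothetical choices.

```python
import random


def transition_matrix(labels, states):
    """Estimate first-order transition probabilities from daily regime labels."""
    counts = {a: {b: 0 for b in states} for a in states}
    for today, tomorrow in zip(labels[:-1], labels[1:]):
        counts[today][tomorrow] += 1
    probs = {}
    for a in states:
        row_total = sum(counts[a].values())
        if row_total == 0:
            # State never observed as "today": fall back to a uniform row
            probs[a] = {b: 1.0 / len(states) for b in states}
        else:
            probs[a] = {b: counts[a][b] / row_total for b in states}
    return probs


def simulate_markov(probs, states, n_days, rng):
    """Generate one synthetic regime series as a first-order Markov chain."""
    state = rng.choice(states)  # assumption: random initial state
    series = [state]
    for _ in range(n_days - 1):
        weights = [probs[state][b] for b in states]
        state = rng.choices(states, weights=weights)[0]
        series.append(state)
    return series


def null_frequency_bounds(labels, states, regime, day_indices,
                          n_members=1000, alpha=0.05, seed=0):
    """Two-sided bounds of the regime frequency under the Markov null model.

    day_indices: positions of the sampled days within the series.
    """
    rng = random.Random(seed)
    probs = transition_matrix(labels, states)
    freqs = []
    for _ in range(n_members):
        synthetic = simulate_markov(probs, states, len(labels), rng)
        hits = sum(1 for i in day_indices if synthetic[i] == regime)
        freqs.append(hits / len(day_indices))
    freqs.sort()
    k = int(alpha / 2 * n_members)
    return freqs[k], freqs[n_members - 1 - k]
```

An observed regime frequency for the same sample days would be regarded as significant in this sketch if it falls outside the returned bounds.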
5. High-impact weather systems
a. Tropical cyclones
TCs are among the most destructive weather systems. Figure 12 shows the spatial pattern of the TC genesis density function (GDF) for the GEFS and IBTrACS during the summer season (JASO). The GDF is defined as the number of TC genesis events within a 10° × 10° box per summer season. Since JASO does not fully capture the TC season over the Southern Hemisphere and the northern Indian Ocean, our analysis is focused on the North Pacific and Atlantic.
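The GDF definition can be illustrated with a short sketch that counts genesis points in 10° × 10° boxes and divides by the number of seasons. The use of non-overlapping boxes keyed by their southwest corner, the function name, and the sample genesis locations are assumptions made for illustration; the actual calculation may use a different box layout (e.g., boxes centered on each grid point).

```python
import math
from collections import Counter


def genesis_density(genesis_points, n_seasons, box_deg=10.0):
    """Number of TC genesis events per box per season.

    genesis_points: iterable of (lat, lon) pairs in degrees.
    Returns a dict keyed by the (lat, lon) of each box's southwest corner,
    with longitudes expressed in the range [0, 360).
    """
    counts = Counter()
    for lat, lon in genesis_points:
        lon = lon % 360.0  # map negative (western) longitudes to 0-360
        lat0 = math.floor(lat / box_deg) * box_deg
        lon0 = math.floor(lon / box_deg) * box_deg
        counts[(lat0, lon0)] += 1
    return {box: n / n_seasons for box, n in counts.items()}


# Hypothetical genesis locations from two seasons (illustration only)
points = [(12.5, 135.0), (18.0, 131.0), (15.0, -98.0)]
gdf = genesis_density(points, n_seasons=2)
print(gdf)
```

Applying the same function to the reforecast-detected and IBTrACS genesis points and differencing the two results box by box would give a bias map in the spirit of Fig. 12.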
The long-term-mean GDF distribution in the GEFS (Fig. 12b) resembles the observed pattern (Fig. 12a), with a slightly eastward-displaced GDF center over both the western and eastern Pacific. During pentad 1, the negative TC genesis biases over the western Pacific are substantially reduced compared to the GEFSv10, in which Li et al. (2016) found underestimated GDF. Since the majority of TCs form within the monsoon trough over the western Pacific (Ritchie and Holland 1999), Li et al. (2016) suggested that the underestimated monsoon trough may contribute to the underestimated GDF. The reduced GDF biases in pentad 1 are consistent with the reduced relative vorticity biases at 850 hPa (Fig. S6).
Over the eastern Pacific, a northwest–southeast dipole pattern of biases exists in both pentad 1 and pentad 2, with strong positive biases in the southeast. This bias pattern also appears in several previous GEFS versions (e.g., Li et al. 2016; Lee et al. 2018). Past studies suggest that TC genesis over the eastern Pacific is strongly associated with the ITCZ (Wang and Magnusdottir 2005; Zhang and Wang 2015). In particular, ITCZ breakdown is an effective mechanism for inducing TC formation (e.g., Guinn and Schubert 1993; Wang and Magnusdottir 2006). The GDF biases can be at least partly attributed to the strong and eastward-extended ITCZ in the GEFS (Fig. 1).
The eastern North Atlantic has positive genesis biases in pentad 1, but the biases are largely gone in pentad 2. The majority of TCs over the North Atlantic originate from tropical easterly waves (TEWs; e.g., Thorncroft and Hodges 2001; Dunkerton et al. 2009). Further analysis (Fig. S5d) shows positive biases in TEW activity over the east Atlantic in pentad 1 and weak negative biases in pentad 2, consistent with the GDF bias evolution (Fig. S6). The biases in vertical wind shear and relative humidity were also examined. The negative biases of vertical wind shear are consistent with the positive biases of GDF over the east Atlantic, but the relative humidity biases do not help to explain the GDF biases (not shown). It is interesting to note the negative GDF biases over the Gulf of Mexico and the subtropical western North Atlantic. Similar biases exist in an earlier version of the GEFS (Li et al. 2016) and are also common among CMIP models (Camargo et al. 2020). They may be partly due to the positive GDF biases upstream, and it is also possible that global models have difficulty representing tropical cyclone formation under extratropical influence (Wang et al. 2018).
Figure 13 illustrates the interannual variability of TC genesis number and ACE (Bell et al. 2000) during the summer season in different ocean basins. The GEFS skillfully reproduces the year-to-year variation, with pentad 1 performing better than pentad 2. The Atlantic and eastern Pacific basins have the highest ACC (both around 0.75) of TC genesis number during pentad 1, in contrast to a much lower ACC (0.49) for the western Pacific. For pentad 2, the correlation skill drops in all ocean basins, with the highest ACC over the eastern Pacific (0.66) and the lowest over the western Pacific (0.41). Despite the relatively low ACC values over the western Pacific, the GEFSv12 still shows an improvement compared to the GEFSv10 (Li et al. 2016). As for the ACE, the ACC is up to ∼0.68 for pentad 1 in all ocean basins. In pentad 2, however, the western Pacific not only fails to maintain an ACC as high as in the other two basins but also shows a decreased interannual standard deviation. This may be related to the model biases in monsoons as discussed earlier.
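As an illustration of the two quantities discussed above, the sketch below computes ACE from 6-hourly maximum sustained winds and the correlation between two interannual series. It assumes the common ACE convention (sum of squared 6-hourly maximum winds in knots for records at or above tropical storm strength, scaled by 10⁻⁴); the 34-kt threshold, the function names, and the sample numbers are assumptions for illustration and are not values from Fig. 13.

```python
import math


def ace(vmax_knots, threshold=34.0):
    """Accumulated cyclone energy in units of 10^4 kt^2.

    vmax_knots: 6-hourly maximum sustained winds (kt) for a storm or season.
    Only records at or above `threshold` contribute.
    """
    return 1e-4 * sum(v * v for v in vmax_knots if v >= threshold)


def pearson(x, y):
    """Pearson correlation between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical 6-hourly winds for one storm (illustration only)
print(ace([30, 35, 50, 65]))

# Hypothetical seasonal TC counts: observed versus forecast
observed = [12, 15, 9, 20, 14]
forecast = [10, 16, 8, 18, 15]
print(pearson(observed, forecast))
```

Because the correlation is computed on departures from each series' own mean, it measures the agreement in year-to-year variation and is insensitive to a constant bias in the forecast counts.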
b. Midlatitude blocking
Midlatitude blocking can be regarded as a weather regime in some regions. Since persistent blocking events can induce extreme weather, blocking is examined in this section from a global perspective. To facilitate comparison with previous work, we first adopted a one-dimensional blocking identification algorithm (Tibaldi and Molteni 1990), which was previously applied to the GEFSv10 (Hamill and Kiladis 2014). Blocking frequency is defined as the percentage of blocked days in the winter season. As shown in Fig. 14a, the GEFS closely reproduces the observed long-term-mean longitudinal distribution of blocking frequency from day 1 to day 15. This is a substantial improvement over the GEFSv10, which underestimates the blocking frequency in the Euro-Atlantic sector by up to 33% with respect to the observation.
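The one-dimensional index of Tibaldi and Molteni (1990) can be sketched as follows for a single longitude. The latitudes (80°, 60°, and 40°N), the offsets (Δ = −4°, 0°, 4°), and the −10 m per degree latitude threshold follow the commonly cited form of that index. The exact grid, offsets, and any additional persistence criteria used here are not restated in this section, so those details, the function names, and the sample height profile should be read as assumptions.

```python
def is_blocked(z500_by_lat, deltas=(-4.0, 0.0, 4.0)):
    """Instantaneous blocking test at one longitude and one time.

    z500_by_lat: dict mapping latitude (deg N) to 500-hPa geopotential
    height (m); it must contain 80, 60, and 40 plus each offset in `deltas`.
    """
    for d in deltas:
        phi_n, phi_0, phi_s = 80.0 + d, 60.0 + d, 40.0 + d
        # Southern and northern geopotential height gradients (m per deg lat)
        ghgs = (z500_by_lat[phi_0] - z500_by_lat[phi_s]) / (phi_0 - phi_s)
        ghgn = (z500_by_lat[phi_n] - z500_by_lat[phi_0]) / (phi_n - phi_0)
        if ghgs > 0.0 and ghgn < -10.0:
            return True
    return False


def blocking_frequency(daily_profiles):
    """Percentage of days flagged as blocked at one longitude."""
    if not daily_profiles:
        raise ValueError("at least one daily profile is required")
    blocked = sum(1 for profile in daily_profiles if is_blocked(profile))
    return 100.0 * blocked / len(daily_profiles)


# Hypothetical height profiles (m) at one longitude (illustration only)
zonal = {36.0: 5700, 40.0: 5680, 44.0: 5650, 56.0: 5500, 60.0: 5450,
         64.0: 5400, 76.0: 5250, 80.0: 5200, 84.0: 5150}
blocked = {36.0: 5500, 40.0: 5480, 44.0: 5470, 56.0: 5560, 60.0: 5600,
           64.0: 5580, 76.0: 5300, 80.0: 5250, 84.0: 5200}
print(blocking_frequency([zonal, blocked, zonal, zonal]))
```

Repeating the calculation at every longitude yields the longitudinal distribution of blocking frequency of the kind compared in Fig. 14a.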
To better assess the spatial distribution of blocking occurrence, a two-dimensional blocking frequency map (Figs. 14b–e) is constructed. The GEFS shows good agreement with the observation during pentad 1, but biases are much larger in pentad 2 and pentad 3. In particular, positive blocking frequency biases prevail over the subpolar North Atlantic, and negative biases prevail downstream of the Ural Mountains. The average absolute bias for those two regions is around 2%, which is substantial given that the reference climatological value is only about 10% (relative difference ≅ 20%) and that the bias magnitude may continue to grow before the equilibrium state is reached. Similar to the ECMWF seasonal forecast models (SEAS3–5; Davini et al. 2021) and CMIP families (CMIP3–6; Davini and D’Andrea 2020), the latest GEFS also suffers from negative biases over Europe (Fig. 14e). Davini and D’Andrea (2020) pointed out that the negative biases over Europe can be attributed to the overextension of the Atlantic jet, which is consistent with what is found in the reforecast 200-hPa zonal wind field (not shown).
Given the socioeconomic impacts of blocking, it is important to know the spatiotemporal variations of its synoptic-scale predictability. We examined the pointwise Pearson correlation coefficient of blocking occurrence between the ERA5 and GEFS (Fig. S7). High correlation skill (≥0.6) can only persist for 3–4 days in most regions. Multiple factors may account for the blocking biases, such as convection scheme, turbulent orographic form drag parameterization, stochastic physics, and ocean–atmosphere coupling (e.g., Jung et al. 2010; Palmer 2012; Davini et al. 2021). A more thorough analysis based on model processes and ensemble members may be needed in the future.
6. Summary and discussion
Three levels of process-oriented diagnostics are applied to evaluate NOAA’s ensemble forecast system, the GEFSv12: 1) systematic model errors associated with moist convection and cloud processes; 2) predictability sources on the extended-range time scale, including the Madden–Julian oscillation (MJO) in the tropics and North American weather regimes; 3) high-impact weather systems, such as tropical cyclones (TCs) and blocking highs. On level 1, positive biases of precipitation prevail over the global tropical and subtropical ocean. Precipitable water (PW), on the other hand, is underestimated in the deep tropics. Our analysis suggests that these biases are associated with the early onset of precipitation, which prevents the further accumulation of column water vapor. Additionally, the positive RH biases in the mid–upper troposphere may indicate an overestimated detrainment rate. The lower-level negative RH biases are possibly associated with overestimated transport of moisture from the boundary layer to the free troposphere, while the upper-level (<200 hPa) negative biases are suspected to be linked to a decreased convective buoyancy resulting from multiple potential mechanisms (e.g., water loading, entrainment rate, cloudiness).
The biases in the cloud representation are investigated using total cloud fraction and outgoing longwave radiation (OLR). Negative cloud fraction biases prevail over the tropical and subtropical Pacific and Atlantic, except for positive biases over the subtropical eastern oceans. Over the eastern Pacific, the OLR biases can be partly attributed to the cloud fraction biases. A top-heaviness metric is used to diagnose the different cloud and precipitation processes. Our findings show that negative (positive) biases in the top-heaviness ratio in the western (eastern) Pacific indicate underestimated (overestimated) deep convection, which is consistent with the negative precipitation biases over the western Pacific and positive biases over the eastern Pacific.
On level 2, the diagnostics of the MJO suggest that the GEFS can skillfully predict the MJO up to 16 days ahead (bivariate correlation ≥ 0.6). The GEFSv12 can reproduce the eastward propagation signals across the Maritime Continent even for the forecast day 16, although the MJO amplitude and propagation become less coherent with increasing forecast lead time. North American weather regimes can be skillfully predicted up to 12 days in advance (weekly regime frequency ACC ≥ 0.6) by the GEFS. The GEFS also reproduces the persistence probability of each weather regime at all forecast lead times, and the modulation of weather regimes by the MJO is also largely captured by the GEFS. The regime “cloud” analysis demonstrates that the initial condition error is larger than the errors contributed by model biases and the statistical equilibrium over North America is reached by day 7.
On level 3, TCs and blocking highs are investigated. The biases of TC genesis frequency over the western Pacific are reduced compared to the GEFSv10. The overestimation of TC genesis number over the eastern Pacific is possibly due to the strong and eastward-extended ITCZ in the GEFS. For the North Atlantic, TC genesis number is also overestimated during pentad 1, but the biases are substantially reduced during pentad 2, which is likely associated with the biases in tropical easterly wave activity. The GEFSv12 exhibits a higher interannual correlation skill of TC genesis number and accumulated cyclone energy for each ocean basin during pentad 1 than the GEFSv10.
The climatological mean wintertime blocking is found to be in good agreement with the observation, an improvement over the GEFSv10. Some inconsistencies arise in pentad 2 and pentad 3, with extensive positive biases over the subpolar North Atlantic and negative biases downstream of the Ural Mountains. Additionally, the well-known negative blocking frequency biases over Europe also appear in the GEFS, which are likely associated with the overextension of the Atlantic jet.
Our analysis reveals substantial improvements of the GEFSv12 compared to an earlier model version. The improvements may be ascribed to various upgrades, including a new dynamical core, improved model physics, and higher model resolution. It is also worth pointing out that the prediction skill of the MJO, TC activity, and weather regimes is assessed based only on the GEFS control run. Higher prediction skill is expected if more ensemble members are included. The evaluation here, however, is limited to forecast lead times up to 16 days. As the GEFS moves toward extended-range forecasts, key physical processes on longer time scales, such as ENSO, troposphere–stratosphere interaction, and the interaction of the atmosphere with the lower boundaries (i.e., land, ocean, sea ice), need detailed diagnostic evaluation. In addition, the level-2 diagnostics in this study cover only some predictability sources. Other processes, such as sudden stratospheric warming, land–atmosphere interaction, and teleconnection patterns, also play an important role on the S2S time scale and merit in-depth evaluation.
Acknowledgments.
This work is supported by the National Oceanic and Atmospheric Administration (NOAA) Grants NA18OAR4310271 and NA18NWS4680065 under the NOAA Round-3 R2O Initiative. We acknowledge the NCAR Computational and Information Systems Laboratory (CISL) for providing the computing resources and data access.
Data availability statement.
The GEFSv12 reforecast dataset is available online and can be accessed from https://noaa-gefs-retrospective.s3.amazonaws.com/index.html. The ERA5 and GPCP version 1.3 precipitation data are available through the NCAR Research Data Archive (RDA) (https://rda.ucar.edu/datasets/ds630.0 and https://rda.ucar.edu/datasets/ds728.5/). Tropical cyclone best track data are available from the NOAA National Centers for Environmental Information (https://www.ncdc.noaa.gov/ibtracs/index.php?name=ib-v4-access). Cloud fraction observations and the SYN1deg radiation product were downloaded from the CFMIP-OBS website (https://climserv.ipsl.polytechnique.fr/cfmip-obs/Calipso_goccp.html) and the NASA Langley Research Center (https://ceres.larc.nasa.gov/data/), respectively. The standard MJO RMM indices are provided by the Australian Bureau of Meteorology (http://www.bom.gov.au/climate/mjo/).
REFERENCES
Adler, R., and Coauthors, 2017: Global Precipitation Climatology Project (GPCP) Climate Data Record (CDR), version 1.3 (updated daily). NOAA/National Centers for Environmental Information, accessed 8 October 2020, https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=gov.noaa.ncdc:C00999.
Ahmed, F., and C. Schumacher, 2015: Convective and stratiform components of the precipitation-moisture relationship. Geophys. Res. Lett., 42, 10 453–10 462, https://doi.org/10.1002/2015GL066957.
Back, L. E., and C. S. Bretherton, 2009: On the relationship between SST gradients, boundary layer winds, and convergence over the tropical oceans. J. Climate, 22, 4182–4196, https://doi.org/10.1175/2009JCLI2392.1.
Back, L. E., Z. Hansen, and Z. Handlos, 2017: Estimating vertical motion profile top-heaviness: Reanalysis compared to satellite-based observations and stratiform rain fraction. J. Atmos. Sci., 74, 855–864, https://doi.org/10.1175/JAS-D-16-0062.1.
Bell, G. D., M. S. Halpert, C. F. Ropelewski, V. E. Kousky, A. V. Douglas, R. C. Schnell, and M. E. Gelman, 1999: Climate Assessment for 1998. Bull. Amer. Meteor. Soc., 80 (Suppl.), S1–S48, https://doi.org/10.1175/1520-0477-80.5s.S1.
Bell, G. D., and Coauthors, 2000: The 1999 North Atlantic and eastern North Pacific hurricane seasons [in “Climate Assessment for 1999”]. Bull. Amer. Meteor. Soc., 81 (Suppl.), S19–S22, https://doi.org/10.1175/1520-0477(2000)81[s1:CAF]2.0.CO;2.
Berner, J., G. J. Shutts, M. Leutbecher, and T. N. Palmer, 2009: A spectral stochastic kinetic energy backscatter scheme and its impact on flow-dependent predictability in the ECMWF Ensemble Prediction System. J. Atmos. Sci., 66, 603–626, https://doi.org/10.1175/2008JAS2677.1.
Boer, G. J., 2003: Predictability as a function of scale. Atmos.–Ocean, 41, 203–215, https://doi.org/10.3137/ao.410302.
Brown, B., and Coauthors, 2021: The Model Evaluation Tools (MET): More than a decade of community-supported forecast verification. Bull. Amer. Meteor. Soc., 102, E782–E807, https://doi.org/10.1175/BAMS-D-19-0093.1.
Brown, R. G., and C. Zhang, 1997: Variability of midtropospheric moisture and its effect on cloud-top height distribution during TOGA COARE. J. Atmos. Sci., 54, 2760–2774, https://doi.org/10.1175/1520-0469(1997)054<2760:VOMMAI>2.0.CO;2.
Buizza, R., M. Miller, and T. N. Palmer, 1999: Stochastic representation of model uncertainties in the ECMWF Ensemble Prediction System. Quart. J. Roy. Meteor. Soc., 125, 2887–2908, https://doi.org/10.1002/qj.49712556006.
Camargo, S. J., and Coauthors, 2020: Characteristics of model tropical cyclone climatology and the large-scale environment. J. Climate, 33, 4463–4487, https://doi.org/10.1175/JCLI-D-19-0500.1.
Cassou, C., 2008: Intraseasonal interaction between the Madden–Julian oscillation and the North Atlantic Oscillation. Nature, 455, 523–527, https://doi.org/10.1038/nature07286.
Cassou, C., 2010: Euro-Atlantic regimes and their teleconnections. Proc. Seminar on Predictability in the European and Atlantic Regions, Reading, United Kingdom, ECMWF, https://www.ecmwf.int/sites/default/files/elibrary/2012/8609-euro-atlantic-regimes-and-their-teleconnections.pdf.
Chen, J.-H., and S.-J. Lin, 2013: Seasonal predictions of tropical cyclones using a 25-km-resolution general circulation model. J. Climate, 26, 380–398, https://doi.org/10.1175/JCLI-D-12-00061.1.
Chepfer, H., S. Bony, D. Winker, G. Cesana, J. L. Dufresne, P. Minnis, C. J. Stubenrauch, and S. Zeng, 2010: The GCM-oriented CALIPSO cloud product (CALIPSO-GOCCP). J. Geophys. Res., 115, D00H16, https://doi.org/10.1029/2009JD012251.
Davini, P., and F. D’Andrea, 2020: From CMIP3 to CMIP6: Northern Hemisphere atmospheric blocking simulation in present and future climate. J. Climate, 33, 10 021–10 038, https://doi.org/10.1175/JCLI-D-19-0862.1.
Davini, P., A. Weisheimer, M. Balmaseda, S. J. Johnson, F. Molteni, C. D. Roberts, R. Senan, and T. N. Stockdale, 2021: The representation of winter Northern Hemisphere atmospheric blocking in ECMWF seasonal prediction systems. Quart. J. Roy. Meteor. Soc., 147, 1344–1363, https://doi.org/10.1002/qj.3974.
Dawson, A., T. N. Palmer, and S. Corti, 2012: Simulating regime structures in weather and climate prediction models. Geophys. Res. Lett., 39, L21805, https://doi.org/10.1029/2012GL053284.
Doelling, D. R., and Coauthors, 2013: Geostationary enhanced temporal interpolation for CERES flux products. J. Atmos. Oceanic Technol., 30, 1072–1090, https://doi.org/10.1175/JTECH-D-12-00136.1.
Doelling, D. R., M. Sun, L. T. Nguyen, M. L. Nordeen, C. O. Haney, D. F. Keyes, and P. E. Mlynczak, 2016: Advances in geostationary-derived longwave fluxes for the CERES synoptic (SYN1deg) product. J. Atmos. Oceanic Technol., 33, 503–521, https://doi.org/10.1175/JTECH-D-15-0147.1.
Dunkerton, T. J., M. T. Montgomery, and Z. Wang, 2009: Tropical cyclogenesis in a tropical wave critical layer: Easterly waves. Atmos. Chem. Phys., 9, 5587–5646, https://doi.org/10.5194/acp-9-5587-2009.
Eckmann, J.-P., and D. Ruelle, 1985: Ergodic theory of chaos and strange attractors. The Theory of Chaotic Attractors, B. R. Hunt et al., Eds., Springer, 273–312.
Fabiano, F., H. M. Christensen, K. Strommen, P. Athanasiadis, A. Baker, R. Schiemann, and S. Corti, 2020: Euro-Atlantic weather regimes in the PRIMAVERA coupled climate simulations: Impact of resolution and mean state biases on model performance. Climate Dyn., 54, 5031–5048, https://doi.org/10.1007/s00382-020-05271-w.
Gottschalck, J., and Coauthors, 2010: A framework for assessing operational Madden–Julian oscillation forecasts: A CLIVAR MJO Working Group project. Bull. Amer. Meteor. Soc., 91, 1247–1258, https://doi.org/10.1175/2010BAMS2816.1.
Guan, H., and Coauthors, 2022: GEFSv12 reforecast dataset for supporting subseasonal and hydrometeorological applications. Mon. Wea. Rev., 150, 647–665, https://doi.org/10.1175/MWR-D-21-0245.1.
Guinn, T. A., and W. H. Schubert, 1993: Hurricane spiral bands. J. Atmos. Sci., 50, 3380–3403, https://doi.org/10.1175/1520-0469(1993)050<3380:HSB>2.0.CO;2.
Hagos, S. M., L. R. Leung, O. A. Garuba, C. Demott, B. Harrop, J. Lu, and M.-S. Ahn, 2021: The relationship between precipitation and precipitable water in CMIP6 simulations and implications for tropical climatology and change. J. Climate, 34, 1587–1600, https://doi.org/10.1175/JCLI-D-20-0211.1.
Hamill, T. M., and G. N. Kiladis, 2014: Skill of the MJO and Northern Hemisphere blocking in GEFS medium-range reforecasts. Mon. Wea. Rev., 142, 868–885, https://doi.org/10.1175/MWR-D-13-00199.1.
Hamill, T. M., and Coauthors, 2022: The reanalysis for the Global Ensemble Forecast System, version 12. Mon. Wea. Rev., 150, 59–79, https://doi.org/10.1175/MWR-D-21-0023.1.
Han, J., M. L. Witek, J. Teixeira, R. Sun, H.-L. Pan, J. K. Fletcher, and C. S. Bretherton, 2016: Implementation in the NCEP GFS of a hybrid eddy-diffusivity mass-flux (EDMF) boundary layer parameterization with dissipative heating and modified stable boundary layer mixing. Wea. Forecasting, 31, 341–352, https://doi.org/10.1175/WAF-D-15-0053.1.
Han, J., W. Wang, Y. C. Kwon, S.-Y. Hong, V. Tallapragada, and F. Yang, 2017: Updates in the NCEP GFS cumulus convection schemes with scale and aerosol awareness. Wea. Forecasting, 32, 2005–2017, https://doi.org/10.1175/WAF-D-17-0046.1.
Han, J.-Y., S.-Y. Hong, K.-S. S. Lim, and J. Han, 2016: Sensitivity of a cumulus parameterization scheme to precipitation production representation and its impact on a heavy rain event over Korea. Mon. Wea. Rev., 144, 2125–2135, https://doi.org/10.1175/MWR-D-15-0255.1.
Harris, L. M., and S.-J. Lin, 2013: A two-way nested global-regional dynamical core on the cubed-sphere grid. Mon. Wea. Rev., 141, 283–306, https://doi.org/10.1175/MWR-D-11-00201.1.
Henderson, S. A., E. D. Maloney, and S.-W. Son, 2017: Madden–Julian oscillation Pacific teleconnections: The impact of the basic state and MJO representation in general circulation models. J. Climate, 30, 4567–4587, https://doi.org/10.1175/JCLI-D-16-0789.1.
Hendon, H. H., M. C. Wheeler, and C. Zhang, 2007: Seasonal dependence of the MJO–ENSO relationship. J. Climate, 20, 531–543, https://doi.org/10.1175/JCLI4003.1.
Hersbach, H., and Coauthors, 2020: The ERA5 global reanalysis. Quart. J. Roy. Meteor. Soc., 146, 1999–2049, https://doi.org/10.1002/qj.3803.
Hochman, A., P. Alpert, T. Harpaz, H. Saaroni, and G. Messori, 2019: A new dynamical systems perspective on atmospheric predictability: Eastern Mediterranean weather regimes as a case study. Sci. Adv., 5, eaau0936, https://doi.org/10.1126/sciadv.aau0936.
Houze, R. A., Jr., 2004: Mesoscale convective systems. Rev. Geophys., 42, RG4003, https://doi.org/10.1029/2004RG000150.
Huang, J., 2018: A simple accurate formula for calculating saturation vapor pressure of water and ice. J. Appl. Meteor. Climatol., 57, 1265–1272, https://doi.org/10.1175/JAMC-D-17-0334.1.
Huffman, G. J., R. F. Adler, M. M. Morrissey, D. T. Bolvin, S. Curtis, R. Joyce, B. McGavock, and J. Susskind, 2001: Global precipitation at one-degree daily resolution from multi-satellite observations. J. Hydrometeor., 2, 36–50, https://doi.org/10.1175/1525-7541(2001)002<0036:GPAODD>2.0.CO;2.
Hung, M.-P., W.-T. Chen, C.-M. Wu, P.-J. Chen, and P.-N. Feng, 2020: Intraseasonal vertical cloud regimes based on CloudSat observations over the tropics. Remote Sens., 12, 2273, https://doi.org/10.3390/rs12142273.
Jiang, X., 2017: Key processes for the eastward propagation of the Madden-Julian oscillation based on multimodel simulations. J. Geophys. Res. Atmos., 122, 755–770, https://doi.org/10.1002/2016JD025955.
Jung, T., 2005: Systematic errors of the atmospheric circulation in the ECMWF forecasting system. Quart. J. Roy. Meteor. Soc., 131, 1045–1073, https://doi.org/10.1256/qj.04.93.
Jung, T., and Coauthors, 2010: The ECMWF model climate: Recent progress through improved physical parametrizations. Quart. J. Roy. Meteor. Soc., 136, 1145–1160, https://doi.org/10.1002/qj.634.
Kennedy, D., T. Parker, T. Woollings, B. Harvey, and L. Shaffrey, 2016: The response of high-impact blocking weather systems to climate change. Geophys. Res. Lett., 43, 7250–7258, https://doi.org/10.1002/2016GL069725.
Kim, H.-M., 2017: The impact of the mean moisture bias on the key physics of MJO propagation in the ECMWF reforecast. J. Geophys. Res. Atmos., 122, 7772–7784, https://doi.org/10.1002/2017JD027005.
Kim, H.-M., M. A. Janiga, and K. Pegion, 2019: MJO propagation processes and mean biases in the SubX and S2S reforecasts. J. Geophys. Res. Atmos., 124, 9314–9331, https://doi.org/10.1029/2019JD031139.
Kimoto, M., and M. Ghil, 1993: Multiple flow regimes in the Northern Hemisphere winter. Part I: Methodology and hemispheric regimes. J. Atmos. Sci., 50, 2625–2644, https://doi.org/10.1175/1520-0469(1993)050<2625:MFRITN>2.0.CO;2.
Knapp, K. R., H. J. Diamond, J. P. Kossin, M. C. Kruk, and C. J. Schreck III, 2018: International Best Track Archive for Climate Stewardship (IBTrACS) project, version 4. NOAA/National Centers for Environmental Information, accessed 1 October 2020, https://doi.org/10.25921/82ty-9e16.
Kullback, S., and R. A. Leibler, 1951: On information and sufficiency. Ann. Math. Stat., 22, 79–86, https://doi.org/10.1214/aoms/1177729694.
Lee, C.-Y., S. J. Camargo, F. Vitart, A. H. Sobel, and M. K. Tippett, 2018: Subseasonal tropical cyclone genesis prediction and MJO in the S2S dataset. Wea. Forecasting, 33, 967–988, https://doi.org/10.1175/WAF-D-17-0165.1.
Lee, S. H., J. C. Furtado, and A. J. Charlton-Perez, 2019: Wintertime North American weather regimes and the Arctic stratospheric polar vortex. Geophys. Res. Lett., 46, 14 892–14 900, https://doi.org/10.1029/2019GL085592.
Li, J., Y. Yang, and Z. Zhu, 2020: Application of MJO dynamics-oriented diagnostics to CMIP5 models. Theor. Appl. Climatol., 141, 673–684, https://doi.org/10.1007/s00704-020-03185-5.
Li, W., Z. Wang, M. S. Peng, and J. A. Ridout, 2014: Evaluation of tropical intraseasonal variability and moist processes in the NOGAPS analysis and short-term forecasts. Wea. Forecasting, 29, 975–995, https://doi.org/10.1175/WAF-D-14-00010.1.
Li, W., Z. Wang, and M. S. Peng, 2016: Evaluating tropical cyclone forecasts from the NCEP Global Ensemble Forecasting System (GEFS) reforecast version 2. Wea. Forecasting, 31, 895–916, https://doi.org/10.1175/WAF-D-15-0176.1.
Li, W., and Coauthors, 2019: Evaluating the MJO prediction skill from different configurations of NCEP GEFS extended forecast. Climate Dyn., 52, 4923–4936, https://doi.org/10.1007/s00382-018-4423-9.
Lim, S. Y., C. Marzin, P. Xavier, C.-P. Chang, and B. Timbal, 2017: Impacts of boreal winter monsoon cold surges and the interaction with MJO on Southeast Asia rainfall. J. Climate, 30, 4267–4281, https://doi.org/10.1175/JCLI-D-16-0546.1.
Lim, Y., S.-W. Son, and D. Kim, 2018: MJO prediction skill of the subseasonal-to-seasonal prediction models. J. Climate, 31, 4075–4094, https://doi.org/10.1175/JCLI-D-17-0545.1.
Lin, S.-J., 2004: A “vertically Lagrangian” finite-volume dynamical core for global models. Mon. Wea. Rev., 132, 2293–2307, https://doi.org/10.1175/1520-0493(2004)132<2293:AVLFDC>2.0.CO;2.
Lorenz, E. N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20, 130–141, https://doi.org/10.1175/1520-0469(1963)020<0130:DNF>2.0.CO;2.
Lorenz, E. N., 1969: The predictability of a flow which possesses many scales of motion. Tellus, 21A, 289–307, https://doi.org/10.3402/tellusa.v21i3.10086.