Whether the state-of-the-art CMIP5 models have different El Niño types and how the degree of modeled El Niño diversity would be impacted by the future global warming are still heavily debated. In this study, cluster analysis is used to investigate El Niño diversity in 30 CMIP5 models. As the method does not rely on any prior knowledge of the patterns of El Niño seen in observations, it provides a practical way to identify the degree of El Niño diversity in models. Under the historical scenario, most models show a poor degree of El Niño diversity in their own model world, primarily due to the lopsided numbers of events belonging to the two modeled El Niño types and the weak compactness of events in each cluster. Four models are found showing significant El Niño diversity, yet none of them captures the longitudinal distributions of the warming centers of the two El Niño types seen in the observations. Heat budget analysis of the sea surface temperature (SST) anomaly suggests that the degree of modeled El Niño diversity is highly related to the climatological zonal SST gradient over the western-central equatorial Pacific in models. As the gradient is weakened in most models under the future high-emission scenario, the degree of modeled El Niño diversity is further reduced in the future. The results indicate that a better simulation of the SST gradient over the western-central equatorial Pacific might allow a more reliable simulation/projection of El Niño diversity in most CMIP5 models.
Every El Niño is different (Wyrtki et al. 1976; Capotondi et al. 2015), but it is often useful to classify different events into a few distinctive types according to the distribution of sea surface temperature (SST) anomalies in the equatorial Pacific. El Niño diversity along with its underlying dynamics has been a highly debated topic in the climate community over the past decades (e.g., Vecchi and Wittenberg 2010; Collins et al. 2010; Capotondi et al. 2015; Timmermann et al. 2018). Although the limited number of El Niño events and their complex underlying dynamics make different degrees of El Niño classification possible (Takahashi et al. 2011; Johnson 2013; Chen et al. 2015), it is now widely accepted that El Niño can be grouped into two types with respect to the centers of strongest warming: the eastern Pacific (EP) El Niño, which is manifested by strong maximum SST anomalies in the eastern equatorial Pacific, and the central Pacific (CP) El Niño, for which the warming center is located near the equatorial date line (Ashok et al. 2007; Kao and Yu 2009; Kug et al. 2009; Ren and Jin 2011).
Given the distinct impacts of EP and CP El Niño on global climate (Larkin and Harrison 2005; Taschetto et al. 2016; Xu et al. 2019), many works have discussed the ability of state-of-the-art climate models to simulate the patterns of the two types of El Niño (Yu and Kim 2010; Kim and Yu 2012), yet a consensus is still lacking. For example, Kug et al. (2012) showed that only 5 of 21 models from phase 5 of the Coupled Model Intercomparison Project (CMIP5) produce two flavors of El Niño that are less dependent on each other. Karamperidou et al. (2017) argued that only a quarter of 36 CMIP5 models can be regarded as having El Niño diversity as strong as that in observations. On the other hand, Taschetto et al. (2014) reported that most CMIP5 models simulate El Niño events that can be classified as either EP or CP type, although the warming centers have large westward biases in both types. Chen et al. (2017) showed that the El Niño diversity simulated by more than half of the CMIP5 models is in agreement with that in observations.
The impact of global warming on El Niño diversity has also been widely discussed (e.g., Kao and Yu 2009; Yu and Giese 2013; Ham et al. 2015; Lemmon and Karnauskas 2019). In observations, more CP El Niño events were found in the later part of the twentieth century (Ashok et al. 2007; Lee and McPhaden 2010; Yu and Kim 2013) and were likely caused by the strengthening of the SST gradient across the equatorial Pacific over time (Li et al. 2015, 2017). Using model simulations with future high-emission scenarios, some studies have argued that CP El Niño events will be more frequent in a warming climate with a flattening thermocline in the equatorial Pacific (Yeh et al. 2009), whereas such a trend was not found in other model studies (Taschetto et al. 2014; Ham et al. 2015; Chen et al. 2017; Lemmon and Karnauskas 2019).
One possible reason for the disagreement on the degree of El Niño diversity simulated in models and its possible change under the future global warming might be caused by the fact that techniques designed to classify El Niño in observations cannot be directly applied to model simulations. There are mainly two ways to assess model performance in capturing El Niño diversity at present. The first one relies on prior knowledge of the patterns of EP and CP El Niño in observations, for which the warming centers are located in the Niño-3 (150°–90°W, 5°S–5°N) and Niño-4 (150°E–160°W, 5°S–5°N) regions, respectively (Kug et al. 2009). Kug et al. (2010) noted that the warming center of El Niño simulated in a single climate model is biased westward by about 20°. To distinguish EP and CP El Niño in this particular model, Kug et al. (2010) compared the SST anomaly in regions 20° west of the Niño-3 and Niño-4 regions. Ham and Kug (2012) further applied this technique to multimodel results to identify modeled EP and CP El Niño. However, the westward displacement varies from model to model (Bellenger et al. 2014; Taschetto et al. 2014), making it difficult to implement this technique in multimodel simulations. Ashok et al. (2007) pointed out that the EP and CP El Niño can be characterized by the principal components (PCs) of the leading empirical orthogonal function (EOF) modes. This method was also widely used to diagnose El Niño diversity in models (Yu and Kim 2010; Kim and Yu 2012; Xu et al. 2017; Wang et al. 2019), but the orthogonal restriction imposed during the EOF analysis tends to produce unphysical features in the second mode, and thus reduce the fidelity of the pattern of CP El Niño (Lian and Chen 2012).
In the second way, whether there is El Niño diversity is scaled by metrics that are solely based on El Niño patterns from a dataset (observations and model simulations) and without any presumption of the relationships between the types, and thus overcomes the aforementioned limits. For example, Karamperidou et al. (2017) fit the scatterplots in the phase space of the first two PCs with a quadratic curve of the form PC2 = αPC12 + bPC1 + c. They argued that a larger α coefficient indicates greater El Niño diversity. This method is further employed to investigate possible changes in the variation of EP El Niño in the future (Cai et al. 2018). Lemmon and Karnauskas (2019) defined the El Niño pattern uniqueness (EPU) of a given El Niño event using the average spatial correlation in the tropical Pacific between the given El Niño event and all other El Niño events within the same dataset. The average EPU over the period of analysis, termed El Niño pattern diversity (EPD), thus scales El Niño diversity in the dataset. By comparing the values of these metrics (α coefficient and EPD), one can determine whether El Niño events from one dataset are more diverse than those from another dataset, but it is impossible to know whether they are truly diverse from the statistical point of view. The values from observations have to be used as the benchmark of degree of El Niño diversity in these studies.
In the current paper, the cluster technique is used to assess the El Niño diversity simulated in the CMIP5 models. As the technique does not rely on the patterns of El Niño flavors in observations, it can be applied to both observations and model simulations. More importantly, the significant confidence level of the degree of diversity in data can be estimated, thus providing an objective way to assess the degree of El Niño diversity in models. The remainder of this paper is arranged as follows. Section 2 introduces the data and method. Section 3 compares the classification of observed El Niño events estimated by the cluster analysis and two traditional classification methods, and shows the advantage of the cluster analysis by analyzing the results from a single model. El Niño diversity in CMIP5 models under model historical and future scenarios is investigated in sections 4 and 5, respectively. A note on the changes in El Niño diversity simulated in models under the future scenario is placed in section 6, followed by a discussion and conclusions in section 7.
2. Data and method
The model simulations used in this work are those of the first ensemble member (r1i1p1) of 30 CMIP5 models forced by historical and RCP8.5 greenhouse gas concentrations (Taylor et al. 2012) for the periods 1861–2005 and 2006–2100, respectively. Variables used include ocean temperature, zonal and meridional ocean velocity, and vertical mass transport in the upper 50 m. Here, the vertical velocity is obtained from the ocean vertical mass transport by dividing by the density and resolution. Observations include SST from the Hadley Centre Sea Ice and Sea Surface Temperature dataset version 1 (HadISST1; Rayner et al. 2003) for the period 1870–2017, and ocean temperature and 3D current velocity from the NCEP Global Ocean Data Assimilation System (NCEP-GODAS; Saha et al. 2006) for the period 1980–2016. For each variable, the linear trend and the climatological seasonal cycle in the interested period are subtracted to obtain the anomaly field. Lemmon and Karnauskas (2019) suggested removing variations at decadal-to-multidecadal time scales to get the SST anomaly in the RCP8.5 scenario, given the fact that the background state of SST may more heavily increase in the second part of twenty-first century than in the first part in such a high-emission case. However, our conclusions are insensitive to the definition of SST anomaly (not shown).
The National Oceanic and Atmospheric Administration (NOAA) of the United States of America defines an El Niño as an event with a 3-month running mean of the SST anomaly in Niño-3.4 region (170°–120°W, 5°S–5°N) of ≥0.5°C for at least 5 consecutive months (http://www.cpc.ncep.noaa.gov/products/precip/CWlink/MJO/enso.shtml). The threshold of 0.5°C corresponds to one standard deviation of the SST anomaly in the Niño-3.4 region. To identify El Niño in observations and model simulations in a consistently universal way, an El Niño event is defined as follows: in a time–longitude Hovmöller diagram of equatorial (5°S–5°N) SST anomalies along the Pacific (140°E–80°W), El Niño is defined as an area in the time–longitude diagram that persists for at least 5 consecutive months where SST anomalies are ≥1 standard deviation of SST at each longitude for at least 50° of longitude. The 50° span in longitude corresponds to the width of the Niño-3.4 region. A La Niña event is defined in a similar way but using the opposite sign.
For each identified El Niño in the time–longitude diagram, the longitude with the maximum SST anomaly is used as its center. The agglomerative hierarchical clustering analysis is then used to classify these centers into different clusters (Desbois et al. 1982; Gadgil and Joshi 1983). The clustering method starts with n clusters, where n is the number of the El Niño events. Then, the Euclidean distances between the centers of each pair of clusters are calculated, and two clusters with the smallest distance are merged to form a larger cluster. The number of clusters reduces to n − 1, and the center of the newly formed cluster is the average of the centers belonging to this newly formed cluster. The above step is repeated until all centers are in a single cluster.
To address whether the clustering with r clusters is statistically significant, a statistic termed the C value is used:
where r and x represent the cluster number and center of event, respectively; ni is the number of events belonging to the ith cluster, and xi,j is the jth event belonging to the ith cluster. Overbars denote averages, such that and represent the center of the ith cluster and the global center (i.e., the average of all centers), respectively; n is the total number of centers. The numerator denotes the distance from cluster centers to the global center, thus manifesting the degree of separation among different clusters. The denominator denotes the distance between each pair of centers belonging to the same cluster, therefore scaling the degree of compactness of events in each cluster. A small C value means that the clustering results are not optimal or the centers cannot be divided into r types, whereas a large C value means the grouping is meaningful. The C value follows the F distribution with a degree of freedom of r − 1. A C value that is greater than F(r − 1, n − r) at the 95% confidence level is regarded as significant classification in this study.
Figure 1 presents C values as a function of number of clusters for El Niño (Fig. 1a) and La Niña (Fig. 1b) in observations. There are at most three different El Niño clusters at the 95% confidence level (Fig. 1a). If the confidence level is increased to 97%, only two El Niño clusters are significant. The La Niña events cannot be grouped into different clusters that are significantly different from each other (Fig. 1b). It should be noted that only the longitude of the maximum SST anomaly during El Niño/La Niña is used here for grouping. When more information such as the magnitude and spatial patterns of SST anomaly is used, it is possible to get more flavors of El Niño/La Niña (e.g., Johnson 2013; Chen et al. 2015), leading one to believe that there is a continuum of ENSO events throughout history (Giese and Ray 2011). However, when focusing on the location of the warming/cooling center, the continuum of ENSO is found to be weak, and grouping historical El Niño into two clusters (CP and EP types) seems like the best choice. Given these findings, the cluster number is set as 2 in the following sections [r = 2 in Eq. (1)].
As the clustering technique does not rely on information from observed CP and EP events, it can be directly applied to model simulations. A larger C value means that the model has its own El Niño diversity, but it does not indicate the modeled El Niño flavors are realistic. The Student’s t test is thus used to clarify whether the longitudinal distributions of CP/EP events simulated by the models are significantly different from those of observations (at the 95% confidence level). However, as most of CMIP5 models have a strong cold tongue bias (Bellenger et al. 2014), the locations of modeled EP events are generally westward biased. As a result, the longitudinal distributions of the simulated CP and EP events from only a few models can pass the Student’s t test (will be shown in section 4).
As will be shown in sections 3 and 4, observed El Niño events have large C value, whereas most models show small and insignificant C values. From Eq. (1), the C value depends on the degree of separation among different clusters, the number of events in each cluster, and the degree of compactness of events in each cluster. Therefore, bias in any of these three factors could lead to small C values in models. To diagnose the primary reason for the small C values in models, three other C indexes are further calculated.
First, in estimating the C value of models using Eq. (1), the distance between the center of each cluster and the global center () is replaced by the value from observations. Thus, the degree of separation of the modeled CP and EP clusters is replaced by the degree of separation of the observed CP and EP clusters. This newly derived C value is referred to as the separation-unbiased C value. Second, for each cluster in the model, the distances between each pair of centers in a given cluster () are replaced by the values from observations. By doing so, the bias in the degree of compactness of each cluster is corrected. This index is referred to as the compactness-unbiased C value. Finally, the number of modeled events belonging to each cluster (ni) is replaced by the number of observed CP and EP events, thus correcting the bias in the number of the modeled CP and EP events. This index is referred to as the number-unbiased C value.
As the numbers of CP and EP events simulated by models are not necessarily the same as those from observations, the compactness-unbiased C value and the number-unbiased C value are estimated statistically. We denote the number of CP events from models and observations as nCP_model and nCP_obs, respectively. The distances between each pair of centers in a given cluster from models and observations are denoted by and , respectively. Similarly, we use nEP_model and nEP_obs to denote the number of EP events from models and observations, respectively, and and as the distances between pairs of centers in a given cluster from the models and observations, respectively. For the compactness-unbiased C value, we randomly select values from DCP_obs by times and from DEP_obs by times to obtain a compactness-unbiased C value. For the number-unbiased C value, we randomly select values from DCP_model by times and from DEP_model by times to obtain a number-unbiased C value. For each model, these procedures are repeated 1000 times.
By comparing these three newly defined C value indexes with the original C value index, it is easy to identify why some models perform poorly in terms of El Niño diversity in their own model world. For example, if the C value of a given model is small but the number-unbiased C value exceeds the 95% confidence level, it can be concluded that the model captures the longitudinal distribution of the CP and EP events but severely biases their occurrence.
To show the advantage of the cluster analysis in distinguishing different El Niño types in models, two widely used El Niño classification methods are employed here. The first is the method proposed by Kug et al. (2009), which compares SST anomalies in the Niño-3 and Niño-4 regions. We refer to this method as the Niño method. An EP (CP) El Niño is characterized by a larger warming anomaly in the normalized Niño-3 (Niño-4) region. The second is the El Niño Modoki index (EMI) proposed by Ashok et al. (2007), which measures the CP El Niño using the second EOF mode of the tropical Pacific detrended SST anomaly. Specifically, EMI is calculated as the SST anomalies averaged over the central tropical Pacific (165°E–140°W, 10°S–10°N) minus half the sum of the SST anomalies averaged over the eastern (110°–70°W, 15°S–5°N) and western (125°–145°E, 10°S–20°N) tropical Pacific, and an El Niño event is identified as a CP type if EMI is higher than 0.7 STD (Ashok et al. 2007; Yu and Kim 2013), where STD is the standard deviation of the EMI in DJF.
The heat budget of upper ocean temperature anomaly is usually used to diagnose El Niño dynamics in the form of
where T, u, and w denote the average ocean temperature, zonal velocity, and vertical velocity in the mixed layer, respectively, and Res is the residual term. The primes and overbars denote the anomaly and the mean, respectively. Given the magnitude of CP/EP El Niño varies from model to model, we divide each term in Eq. (2) by the total tendency ∂T/∂t and get the following form of heat budget:
The three terms in parentheses on the right-hand side of Eq. (2) represent the changes in SST anomaly caused by zonal, meridional, and vertical advection, and RES is the residual term. We label the three terms in the first parentheses as Z1, Z2, and Z3, those in the second parentheses as M1, M2, and M3, and those in the last parentheses as V1, V2, and V3. The mixed layer is defined as the upper 50 m, but the results are insensitive to the definition of the mixed layer depth (not shown). We will use Eq. (3) to diagnose El Niño dynamics in the following sections.
The strong climatological zonal SST gradient over the western-central Pacific (Niño-4 region) provides a favorable background for the genesis of CP El Niño in observations (e.g., Kug et al. 2009), and we therefore analyze the role of this gradient in modeled El Niño diversity. However, as climatological SST across the equatorial Pacific is biased to some extent in models (Bellenger et al. 2014; Taschetto et al. 2014), we will use the normalized climatological gradient in the following sections unless explicitly stated otherwise. Note that for a given region, the gradient is defined as the difference between the west side and the east side (west minus east).
3. Cluster analysis
We first validate cluster analysis in distinguishing different El Niño types in observations. As shown in previous section, the C value of the two types of El Niño event in observations is 6.67, which exceeds the 95% confidence level (Fig. 1a). The first two columns in Table 1 list El Niño years and their types as identified by cluster analysis. All El Niño events since 1870 (Yu and Kim 2013; Pascolini–Campbell et al. 2015), except the weak 1953/54 event, are successfully identified by the cluster method. The last two columns in Table 1 list El Niño types using the Niño and EMI methods. In general, El Niño types identified by the cluster analysis agree with those by the Niño and EMI methods. It is noted that when Niño and EMI disagree, the cluster analysis tends to agree more often with the grouping by the Niño (6 events) method than the EMI (3 events) method, probably caused by the unphysical constraint in the EMI method (Lian and Chen 2012). The composite EP and CP El Niño events, as defined by the three methods, are shown in Fig. 2. The patterns of the EP and CP El Niño events identified by the three methods are very similar.
We also apply the cluster analysis to La Niña events, yielding 8 EP and 22 CP La Niña events. However, the two identified types of La Niña events have very similar patterns (not shown). The C value of the classification is 3.08, which is lower than the 95% confidence level (Fig. 1b), indicating that the diversity of La Niña events is insignificant (Kug and Ham 2011; Chen et al. 2015; and other studies on ENSO skewness). The results are confirmed using the EMI and Niño methods (not shown).
To show the advantage of the cluster technique in distinguishing different El Niño types in models, we analyze the results from a single CMIP5 model. Figure 3 shows the time–longitude evolution of equatorial SST anomalies from the HadGEM2-ES historical simulation. This model is arbitrarily chosen to show the advantage of cluster analysis, and we consider the results from all models in section 4. Although the warming centers of El Niño events are confined to a limited area east of 140°W, 22 of them can be classified as EP type and 8 as CP type using the Niño method (red and orange bars in Fig. 3). However, the patterns of the resulting CP and EP events are in fact not very different. For example, the evolution of the model year 1953/54 CP El Niño is very similar to that of the model year 1945/46 EP El Niño. Figures 4c and 4d further show the composites of EP and CP El Niño events in this model as identified by the Niño method. The centers of the composite EP and CP types are very close to each other, indicating that the identified EP and CP types belong to the same group. We also apply the EMI method to this model and get similar results (Figs. 4a,b). On the other hand, the cluster method results in 29 El Niño events being classified as EP type and only 1 event (model year 1988/89) as CP type in this model. Although the centers of the composite EP and CP types are clearly separated (Figs. 4e,f), the C value of the classification is much smaller than the 95% confidence level (0.42 vs 4.21), indicating that this particular model does not show two different El Niño types. This example clearly suggests that the cluster analysis is more suitable than the traditional methods for classifying El Niño events in models.
4. El Niño diversity under the historical scenario
Figure 5a gives the C values and 95% confidence level calculated from the 30 CMIP5 model simulations. Nearly half of the models show a C value that is close to zero, suggesting a very weak degree of El Niño diversity in them. For La Niña events, both observation and model simulations show a very small C value (Fig. 5b). The results confirm previous findings that La Niña is less diverse than El Niño in model simulations (Kug and Ham 2011).
Four models (i.e., CMCC-CMS, GFDL-ESM2G, GFDL-ESM2M, and IPSL-CM5A-MR) show large C values exceeding the 95% confidence level. The composites patterns of CP and EP events simulated by these four models and from the observation are shown in Fig. 6. It can be seen that the warming centers of the composite EP and CP events simulated in these models are well separated in longitude. However, the patterns of the simulated CP and EP types are quite different from those in the observations. To further diagnose the realism of modeled El Niño diversity, Fig. 7 presents the longitudinal distributions of EP and CP events for individual models and observation. Models for which the longitudinal distribution of CP (EP) events is insignificantly different from that in observation (at the 95% confidence level using the Student’s t test) are shown in red at the left (right) side of the figure. For the CP El Niño events, half of the models show significant bias in longitude as compared with the observation, whereas for the EP El Niño events only six models (GFDL CM3, GISS-E2-R, HadGEM2-AO, HadGEM2-ES, IPSL-CM5A-MR, and IPSL-CM5A-LR) reproduce the longitudinal distribution of the observed EP events, presumably due to the large westward extension of the cold tongue in most models (Bellenger et al. 2014; Flato et al. 2013; Luo et al. 2005). Of the four models with significant C values (shown in bold in Fig. 7), none of them correctly reproduces the longitudinal distribution of the observed CP and EP events.
Some models such as CCSM4, GFDL CM3, GISS-E2-R, and MIROC-ESM show large degree of separation between the modeled CP and EP events. However, the C values of these models are ranked low in C value. From Eq. (1), it can be seen that the number of events in each cluster and the degree of compactness of events in each cluster also contribute to the C value. Table 2 lists the frequency of CP and EP events in models and observation. Indeed, the numbers of CP and EP events in these models are severely lopsided.
To further explore the primary reason causing the low C value in most of the models, Table 3 lists the C value, separation-unbiased C value, compactness-unbiased C value, and number-unbiased C value indexes of each model (definitions of the indexes are given in section 2). Fifteen models including GFDL CM3 and GISS-E2-R show significant El Niño diversity when the numbers of CP and EP events are corrected, whereas only four models show significant C values when the degree of cluster separation is corrected. When the degree of compactness is corrected, 10 models including CCSM4 and MIROC-ESM show significant El Niño diversity. These results clearly indicate that the number of CP/EP event and the degree of compactness of the CP/EP clusters contribute more to the C value index as compared with that from the degree of separation between the CP and EP clusters. This conclusion is consistent with the analytical solutions under some simple assumptions (see the appendix).
Why do some models perform better than others in simulating different El Niño types? To this end, we first analyze the heat budget during the EP and CP El Niño growth in observations, as shown in Fig. 8. The zonal advection feedback (ZAF; Z1 in Fig. 8), Ekman feedback (EKF; V1 in Fig. 8), and thermocline feedback (THF; V2 in Fig. 8) are the three main contributors to EP El Niño growth in the Niño-3 region (Jin et al. 2006; Ren and Jin 2013), whereas the ZAF dominates the development of CP El Niño events in the Niño-4 region (Kug et al. 2009; Kim and Jin 2011). Although anomalous temperature advection by the mean meridional current also contributes to EP El Niño growth (M2 in Fig. 8), this term works only after the occurrence of warm SST anomaly at the equator (Kang et al. 2001). In other words, the meridional advection is the response, rather than a trigger of El Niño growth, and therefore will not be discussed further in the following.
Figure 9 presents the heat budgets over the Niño-3 region during EP El Niño in models. Note that some models (i.e., GFDL-ESM2G, GISS-E2-H, HadGEM2-AO, MIROC-ESM, MIROC-ESM-CHEM, MPI-ESM-MR, MRI-CGCM3, and NorESM1-M) do not provide the vertical mass transport in the archive gateway, so the analysis is only conducted on 22 models. In general, most models capture the key dynamics of EP El Niño. That is, the THF, ZAF, and EKF terms, or at least one of them, dominate EP El Niño growth. However, EP El Niño events simulated by the other seven models (i.e., ACCESS1.3, FGOALS-s2, GFDL CM3, INM-CM4, MIROC5, MPI-ESM-LR, and MRI-ESM1, shown in the last three rows in Fig. 9) are dominated by the meridional advection, indicating that the basic mechanism of EP El Niño growth is balanced differently compared to the observation. Given this, it is meaningless and unnecessary to discuss why these seven models do not simulate the CP El Niño type. We then focus on the 15 models that do capture the key processes, or at least one of them, in generating the EP El Niño type. From Fig. 7, it can be seen that the centers of CP El Niño simulated from most of models are confined to the Niño-4 region. Figure 10 presents the heat budgets over the Niño-4 region during CP El Niño in these 15 models. Nearly all the models accurately reproduce the dominant contribution of ZAF to warming growth in the Niño-4 region during the modeled CP El Niño events.
Figure 11 compares the composite heat budget over the Niño-3 region during EP and over the Niño-4 region during CP El Niño events for the two groups of models, that is, models with C values greater and less than the 95% confidence level. As can be seen, there was a smaller difference in the heat budgets during EP events simulated between the two groups of models (Figs. 11a,b). However, the ZAF term from models with significant diversity (Fig. 11c) is much larger than that from models without significant diversity (Fig. 11d). In addition, the magnitude of ZAF from models with significant diversity is comparable to that of observations (Fig. 8b), indicating that models with significant C value have more realistic El Niño–related feedbacks. From Fig. 11, it is clear that the degree of modeled El Niño diversity depends mainly on the simulation of CP El Niño events. Since the ZAF  is the dominant contributor to CP El Niño growth in observation (Fig. 8b) and models (Figs. 11c,d), models with stronger climatological SST gradient over Niño-4 region may be more likely to show the CP El Niño events.
To this end, Fig. 12 shows the scatterplots of C value against the normalized climatological SST gradient over the Niño-4 region in the 15 models. There is indeed a tendency for models with stronger SST gradient to have a larger C value. The regression coefficient between C value and SST gradient is 0.70, which is significant at the 95% confidence level. In fact, the strong association between the C value and the gradient can be found even using all historical simulations (Fig. 13), reconfirming that the degree of El Niño diversity in model is mainly determined by the climatological SST gradient over the Niño-4 region. It is worth noting that most models underestimate the normalized climatological SST gradient over the Niño-4 region, and the C values in these models are all smaller than that in the observations.
There are three models standing out from the 95% confidence range of the linear regression (Fig. 12). In particular, the CanESM2 and IPSL-CM5A-LR are characterized by small C value but large SST gradient. To explore why these two models do not exhibit two significant types of El Niño, we look back to Fig. 9 and find that the ZAF term in these two models is larger than the TCF and EKF terms during EP El Niño growth. The large ZAF value indicates that the warming center of EP El Niño tends to move westward during EP El Niño growth (Fedorov and Philander 2000, 2001). As a result, the cluster centers of CP and EP El Niño events simulated by these two models are too close to be distinguished as two significantly different types (Fig. 7, Table 3).
5. El Niño diversity under the RCP8.5 scenario
Figure 14a displays the C values of modeled El Niño events under the RCP8.5 scenario. Only two models (i.e., CMCC-CMS and GFDL-ESM2M) exhibit significant El Niño diversity with future global warming, as manifested by the large C values that exceed the 95% confidence level. La Niña still shows very weak diversity under the future scenario (Fig. 14b). As seen in Fig. 14c, the C value in most of the models (21 of 30) decreases in the future RCP8.5 scenario, suggesting a weakening of El Niño diversity in the future.
As seen in Eq. (1), the C value depends on the number/center of both the CP and EP El Niño events. Figure 15 presents the longitudinal distributions of the CP (Fig. 15a) and EP (Fig. 15b) El Niño events under the historical and RCP8.5 scenarios. The longitudinal distributions of the events simulated by some models change significantly under the RCP8.5 scenario. For the CP El Niño, five models (i.e., CMCC-CESM, GISS-E2-R, HadGEM2-ES, MPI-ESM-MR, and MRI-ESM1) show a significant eastward replacement under the RCP8.5 scenario as compared with the historical scenario, whereas two models (INM-CM4 and IPSL-CM5A-MR) show a significant westward replacement. For the EP El Niño, two models (ACCESS1.0 and MRI-ESM1) show a significant eastward replacement, while four models (IPSL-CM5A-MR, MIROC5, MPI-ESM-LR, and MPI-ESM-MR) show a significant westward replacement. The frequency of CP or/and EP El Niño events under the RCP8.5 scenario also changes vastly from some models (Table 2). For example, the EP El Niño simulated by CMCC-CESM, MIROC5, and MRI-ESM1 increases by more than 3 times compared to under the historical scenario, whereas the frequency of CP El Niño from GFDL-ESM2G and MIROC-ESM is severely reduced in the future.
As mentioned above, the C value depends on the number/center of both the CP and EP El Niño events [Eq. (1)]. Therefore, any change in C value in the future might be caused by 1) a change in the number/center of CP El Niño events in the future, 2) a change in the number/center of EP El Niño events in the future, or 3) a change in the number/center of both CP and EP El Niño events in the future. To address the primary cause of the change in C value in the future, we again focus on the 15 models that capture the key processes in generating the EP and CP El Niño types in history (cf. Fig. 10). Figure 16 compares the difference between C values under the RCP8.5 and historical scenarios from these 15 models. Here, we define three kinds of C value under the RCP8.5 scenario. The first is estimated using CP and EP events in the RCP8.5 scenario (C value; blue bars in Fig. 16), the second uses CP events in the RCP8.5 scenario and EP events in the historical scenario (C_EP_hist; yellow bars in Fig. 16), and the third uses EP events in the RCP8.5 scenario and CP events in the historical scenario (C_CP_hist; green bars in Fig. 16).
It is evident from Fig. 16 that for the majority of models (9 of 15; models shown in yellow text), the signs of the changes in C value are the same as the signs of the changes in the C_EP_hist value, indicating that the change in C value in these models is caused primarily by changes in CP El Niño in the future. The signs of the changes in C value in four models (i.e., ACCESS1.0, CanESM2, HadGEM2-ES, and IPSL-CM5A-MR; shown in green text) are the same as the signs of the changes in the C_CP_hist value, suggesting that the change in C value in these four models can be attributed to the change in EP El Niño in the future. Although both the CP and EP El Niño events contribute to the change in C value in the future for the remaining two models (i.e., CMCC-CM and GFDL-ESM2M; shown in magenta text), a greater contribution is found from the change in CP events in the future.
Under the historical scenario in which the degree of El Niño diversity is mainly determined by the performance of CP El Niño, we showed that there is a robust relationship between C value and the normalized climatological SST gradient over the Niño-4 region (Figs. 12 and 13). As the change in C value in the future is due primarily to the change in CP El Niño events in the future, it is highly feasible that change in climatological SST gradient in the future also determines the degree of El Niño diversity in the future. To this end, Fig. 17 presents the scatterplots of the change in C value against the change in the climatological SST gradient over the Niño-4 region in the 15 individual models. It should be highlighted here that as we are discussing the change in C value for each individual model; the SST gradient is not normalized here.
As can be seen in Fig. 17, 10 of the 15 models are found in the first and third quadrants, meaning that the increase/decrease in the gradient coincides with the increase/decrease in C value in most models. In particular, all four models in which the change in C value is caused mainly by changes in EP events in the future are confined to the second and fourth quadrants (green dots), whereas most models in which the change in C value is caused mainly by changes in CP events in the future fall in the first and third quadrants (yellow dots). The only model that does not follow this pattern is CCSM4. These findings clearly confirm the strong relationship between the C value and the climatological SST gradient over the Niño-4 region. In fact, as shown in Fig. 13, there is a significant (at 95% confidence level) tendency of models with stronger SST gradient to have larger C values under the RCP8.5 scenario even when all the 30 models are used.
6. A note on the change in the degree of El Niño diversity simulated under the future scenario
Our analysis reveals a reduced degree of El Niño diversity in the future global warming that coincides with a weakening trend in the Niño-4 region. In fact, the gradient over the Niño-4 region is highly correlated to the gradient across the equatorial Pacific. Figure 18 shows the scatterplots of normalized climatological SST gradient over Niño-4 region against the cold tongue mode (CTM) index from model simulations. The CTM is the second EOF mode in tropical Pacific using undetrended SST data, and is characterized by a cooling strip in the equatorial Pacific cold tongue (Zhang et al. 2010). This mode reflects the long-term trend in SST in eastern equatorial Pacific (L’Heureux et al. 2012; Lian and Chen 2012). As can be seen, the correlation coefficients between the normalized climatological SST temperature gradient over the Niño-4 region and the CTM index for simulations from the historical and RCP8.5 scenarios are 0.60 and 0.71, respectively, significant at the 95% confidence level. Since most CMIP5 models predict a weakened CTM in the future global warming scenario, known as the “El Niño–like” pattern (Meehl and Washington 1996; Luo et al. 2017), the significant positive correlation coefficients partly explain why the climatological SST gradient over Niño-4 region decreases in the future.
In fact, as found in some recent works, the present CMIP5 models tend to produce the El Niño–like mean-state change as a forced response to global warming under both historical and future scenarios (Coats and Karnauskas 2017; Lian et al. 2018a). On the contrary, a series of recent studies found that SST in eastern equatorial Pacific has decreased in the past century, plausibly caused by anthropogenic climate change (Sandeep et al. 2014; Coats and Karnauskas 2017; Li et al. 2017; Lian et al. 2018a; Zhang et al. 2019). Given this discrepancy between observations and CMIP5 models on the trend of the SST gradient across the equatorial Pacific under historical global warming, any statement of change in El Niño diversity (i.e., frequency, pattern, variation, teleconnection) solely based on CMIP5 model simulations should be taken with great caution (Cai et al. 2018; Lemmon and Karnauskas 2019).
7. Conclusions and discussion
In this study, the cluster analysis is applied to the location of the maximum SST anomaly during El Niño events in observations and CMIP5 models to evaluate the ability of state-of-the-art CMIP5 models in showing different El Niño types. As the method does not rely on any prior knowledge of the patterns of El Niño seen in observations, it can be applied directly to model simulations. More importantly, the significant confidence level of the degree of diversity in data can be estimated, thus provides a practical way to assess the degree of El Niño diversity in data. In contrast to the arguments made in some studies, CMIP5 models were found to rarely show two El Niño types, as stated by Karamperidou et al. (2017) and Kug et al. (2012). The poor degree of El Niño diversity in models is mainly caused by the lopsided numbers of the CP and EP events, and the weak compactness of the CP/EP events in models. The degree of the separation between the CP and EP clusters, which was regarded as the primarily feature to scale the degree of diversity in data in many studies, was found to play just a minor role.
There are four models showing significant El Niño diversity. However, the longitudinal distributions of the warming centers of the two modeled El Niño types are unrealistic as compared with the observations, implying that CMIP5 models are incapable of capturing the warming centers of CP and EP events seen in observations. Under the RCP8.5 future global warming scenario, the degree of El Niño diversity is further reduced. Given these findings, arguments regarding various teleconnections, global impacts, and low-frequency variations of the two El Niño types based on multimodel simulations and projections should be interpreted with caution (Yeh et al. 2009; Ham et al. 2015). Less diversity is found in La Niña than El Niño in observations and both the historical and future scenarios from CMIP5 models.
By diagnosing the heat budget over the Niño-3 and Niño-4 regions during modeled CP and EP events, it is found that the majority of models can capture the key processes, or at least one of them, in generating the EP El Niño type in the Niño-3 region. The degree of simulated El Niño diversity thus depends primarily on model capability in simulating the CP El Niño in the Niño-4 region. Further analysis reveals that there is a significant relationship between the degree of El Niño diversity and normalized climatological SST gradient over the western-central equatorial Pacific (Niño-4 region), simply because that models with stronger gradient are more likely to have stronger zonal advective feedback (ZAF) in the Niño-4 region, and further the more frequently CP El Niño events. As this gradient is weakened in most of the models under the future high-emission scenario, the degree of modeled El Niño diversity is further reduced in the future.
It is noted that the center of an El Niño event is characterized as the longitude with the maximum SST anomaly in this study. While this kind of definition is simple and straightforward, it may not account for multiyear, homogenous El Niño events with an arbitrary or moving center of warming, or for hybrid (CP- and EP-like) events with two centers of warming. Some studies preferred to use the SST-weighted longitude as the center of an El Niño event (i.e., Giese and Ray 2011). When using this SST-weighted center, we find that El Niño events cannot be grouped into two types even in observations (not shown). Nevertheless, C values of all models except the GFDL-ESM2M are still much smaller than those from the observations in this case (not shown). Thus our main conclusion about the underestimation of El Niño diversity in most CMIP5 models is not sensitive to the definition of the center of an ENSO event.
The results shown here imply that there are two necessary conditions for generating two types of El Niño in models that are significant and realistic. First, the recharge–discharge process should be simulated well in the models, as manifest by the stronger contributions of the thermocline feedback (TCF) and Ekman feedback (EKF) during SST warming in the eastern equatorial Pacific than of the other terms. This condition guarantees that once there is an EP El Niño, its warming center is located in the far eastern equatorial Pacific. Second, a better simulation of the climatological SST gradient over Niño-4 region is the primary source to improve El Niño diversity simulated in models, as a stronger SST gradient over this region favors more CP El Niño events by westerly anomalies near the equatorial date line via the ZAF. However, the weak ability to have different El Niño types in most of CMIP5 models suggests that those two conditions are not met suitably at present. The transition between the EP El Niño and La Niña is controlled by the interannual variation of upper-ocean heat content in the equatorial Pacific, which follows a recharge–discharge oscillator scheme (Chen et al. 2015; Timmermann et al. 2018). A recent study from Vijayeta and Dommenget (2018) pointed out that air–sea interactions within the recharge–discharge framework are biased in many CMIP5 models, probably caused by biases in simulating the intensity and pattern of the Pacific cold tongue (Bellenger et al. 2014), the thermocline depth and the slope along the equatorial Pacific (Brown and Fedorov 2008), and the mean zonal wind stress over the equatorial Pacific (Guilyardi 2006). The EP El Niño thus might be biased. The underestimated climatological SST gradient over the western-central equatorial Pacific, on the other hand, might be caused by the too westward extension of cold tongue in models (e.g., Flato et al. 2013; Luo et al. 2005). In addition, the excessive precipitation over the western tropical Pacific associated with the double ITCZ (intertropical convergence zone) bias may also play a role in weakening SST gradient near the date line (Lin 2007).
Additionally, remote forcings outside the eastern Pacific such as the Indian Ocean dipole mode (Saji et al. 1999; Webster et al. 1999) and Madden–Julian oscillation (Madden and Julian 1971, 1972), and intrinsic atmospheric forcing such as the westerly wind bursts (Harrison and Vecchi 1997) also have a strong influence on El Niño evolution (Zhang et al. 1998, 2008; Wheeler and Hendon 2004; Gebbie et al. 2007; Luo et al. 2010; Zhang 2013; Lian et al. 2014, 2017; Fedorov et al. 2014; Chen et al. 2015), but they are also biased to some extent at present (Seiki et al. 2011; Lian et al. 2018b; Feng and Lian 2018). These results, together with the findings of the current study, suggest that model simulations of El Niño events have much room for improvement.
The HadISST1 dataset is from http://www.metoffice.gov.uk/hadobs/hadisst/ and the NCEP-GODAS dataset is from https://www.esrl.noaa.gov/psd/data/gridded/. Model simulations were obtained from https://esgf-node.llnl.gov/search/cmip5. This work was supported by grants from the National Program on Global Change and Air–Sea Interaction (GASI-IPOVAI-04), the National Natural Science Foundation of China (41690121, 41690120, and 41621064), and the Zhejiang Provincial Natural Science Foundation of China (LR19D060001).
Analytical Solutions of the C Value Index
In this appendix, we show that the ratio of the number between CP and EP events, as well as the degree of compactness of the CP and EP clusters, contributes more to the C value index than does the degree of separation among different clusters. In the case when there are two clusters, Eq. (1) can be written as
where the subscripts 1 and 2 denote the EP and CP clusters, respectively. Consider a unique but simple form of the C value index shown in Eq. (A1). For the ith cluster, assume that half of the events in a given cluster are located at the longitude , and the other half are located at the longitude . In this case, the C value equation can be simplified to
where R is the ratio of the number of CP and EP events. Note that one gets the same expression as Eq. (A2) if R is defined as the ratio of the number of EP and CP events. Given this, R is set to the ratio of CP to EP (EP to CP) events when CP events occur more (less) than EP events. Thus, R is always greater than 1.0. If n is much greater than 2, Eq. (A2) can be further expressed as
This newly derived C value is referred to as the C-simple index. Although this index is derived for a specific case (the distance between each pair of centers in a given cluster being the same), it is remarkably well correlated with the original C value index (Fig. A1). We therefore use this C-simple index for further analysis. The total differential of the C-simple index is
In general, for the CP and EP events, , A, and R are O(1)–O(2), O(0)–O(1), and O(0)–O(1), respectively. Thus, the contribution of the last two terms on the right-hand side (rhs) of Eq. (A4) relative to the first term on the rhs of Eq. (A4) is O(1). Therefore, the ratio of the number of CP and EP events and the degree of compactness of the CP and EP clusters contribute more to the C value index than does the degree of separation among different clusters.