The distinctive monsoon climate over East Asia, which is affected by the vast Eurasian continent and Pacific Ocean basin and the high-altitude Tibetan Plateau, provides arguably the best testbed for evaluating the competence of Earth system climate models. Here, a set of diagnostic metrics, consisting of 14 items and 7 variables, is specifically developed. This physically intuitive set of metrics focuses on the essential features of the East Asian summer monsoon (EASM) and East Asian winter monsoon (EAWM), and includes fields that depict the climatology, the major modes of variability, and unique characteristics of the EASM. The metrics are applied to multimodel historical simulations derived from 20 models that participated in phases 3 and 5 of the Coupled Model Intercomparison Project (CMIP3 and CMIP5, respectively), along with the newly developed Nanjing University of Information Science and Technology Earth System Model, version 3. The CMIP5 models show significant improvements over the CMIP3 models in terms of the simulated East Asian monsoon circulation systems on a regional scale, major modes of EAWM variability, the monsoon domain and precipitation intensity, and teleconnection associated with the heat source over the Philippine Sea. Clear deficiencies persist from CMIP3 to CMIP5 with respect to capturing the major modes of EASM variability, as well as the relationship between the EASM and ENSO during El Niño developing and decay phases. The possible origins that affect models’ performance are also discussed. The metrics provide a tool for evaluating the performance of Earth system climate models, and facilitating the assessment of past and projected future changes of the East Asian monsoon.
East Asia, located between the vast Eurasian continent and Pacific Ocean and affected by the world’s highest plateau, the Tibetan Plateau, is a unique monsoon region that extends from the deep tropics to the subpolar region. The East Asian monsoon features a distinct seasonal reversal of monsoonal flow, which affects approximately a quarter of the global population. Understanding the changes of the East Asian monsoon, in the past and future, has important implications for water management, disaster mitigation, infrastructure planning, and sustainable economic development.
Driven by a seasonal reversal of large-scale atmospheric heating and circulation (Webster et al. 1998; Chang 2004; Wang 2006), the East Asian monsoon is traditionally divided into a warm and wet summer monsoon and a cold and dry winter monsoon. The East Asian summer monsoon (EASM) has complex spatial and temporal structures, which cover a broad meridional extent ranging from the South China Sea to southern Siberia (Chang 2004; Wang et al. 2008). The subtropical rain belt stretches for thousands of kilometers, affecting China, Japan, the Korean peninsula, and surrounding areas. The southerly flow of the EASM brings abundant rainfall for agriculture, but its fluctuations also cause floods and droughts (Huang et al. 2007; Li et al. 2011). The East Asian winter monsoon (EAWM) encompasses an extremely large meridional domain from the polar region to the equator. It is the most energetic planetary-scale circulation system of the global atmosphere. Precipitation associated with the EAWM is concentrated along the lead baroclinic boundary between cold dry air to the north and warm humid air to the south. A strong EAWM is favorable for cold air outbreaks, inducing cold surges, snowstorms, and freezing rain (Chang and Lau 1982; Zhang et al. 1997; L. Wang et al. 2009; W. Zhou et al. 2009).
Numerical climate models are essential tools to understand, simulate, and predict monsoon variations. Assessing the capacity of climate models to simulate monsoons is imperative not only for understanding monsoon dynamics, but also for model improvement. Many studies have been carried out to evaluate the capacity of models to represent the EASM (Kang et al. 2002; Zhou and Li 2002; Wang et al. 2004a; Man et al. 2012; Sperber et al. 2013). Although models are continuously improved, current models still possess systematic biases in their simulation of the EASM, such as deficient rainfall over the East Asian subtropical front (i.e., the mei-yu/baiu rain belt) and an early onset of the EASM. Overall, the EASM shows lower reproducibility in models compared with other subsystems of the Asian–Pacific monsoon (T. Zhou et al. 2009a). In contrast to the comprehensiveness of climate model assessment with respect to the EASM, studies evaluating the simulation of the EAWM have been limited. Some common model biases have been revealed, such as a cold bias in surface air temperature, excessive winter precipitation over the East Asian region, biases in the East Asian major trough, and zonal sea level pressure (SLP) gradients (Gong et al. 2014; Jiang and Tian 2013; Wei et al. 2014).
Most previous evaluations with respect to the EASM have dealt only with the June–August (JJA) mean state, but the early summer [May–June (MJ)] and late summer [July–August (JA)] periods also exhibit remarkable differences in their mean state and year-to-year variability over East Asia (B. Wang et al. 2009; Yim et al. 2016; Xing et al. 2017). Thus, it is necessary to assess the performance of models with respect to the EASM in MJ and JA separately, in order to improve the subseasonal prediction of dynamical models. The EAWM features surface air temperature variability that is dominated by a northern mode and a southern mode, which have distinct circulation structures (Wang et al. 2010). These unique features of the EAWM have barely been evaluated in climate models. Meanwhile, a set of systematic metrics that capture the essential features of the EASM and EAWM is required for a wide range of applications.
Therefore, the aim of this paper is to explore and establish a set of dynamics-oriented diagnostic metrics for objectively assessing the fidelity and performance of coupled general circulation models (CGCMs) in simulating the EASM and EAWM. The evaluation is concerned with two types of process-oriented diagnostics. One is the forced response of the climate system to external forcing, such as solar radiation, which is primarily reflected by the annual cycle and diurnal cycle (Wang et al. 2011). The other involves internal feedback processes within the coupled climate system, such as year-to-year variability and its relationship with El Niño–Southern Oscillation (ENSO) and the intraseasonal variability associated with the Madden–Julian oscillation (Madden and Julian 1972) and boreal summer intraseasonal oscillation (Wang and Xie 1997; Waliser 2006), as well as other modes of climate variability. For brevity, this work focuses on evaluating the annual cycle and modes of year-to-year variability, leaving the evaluation of the diurnal cycle and intraseasonal oscillation for future work. Meanwhile, we gather and refine some existing metrics, such as the seasonal migration of rainfall, monsoon domain, precipitation intensity, and the annual cycle of the EAWM, according to the features of the East Asian monsoon. Different from previous studies, the climatological states of the EASM during early summer (MJ) and late summer (JA) are a particular focus for evaluation; the relationship between ENSO and East Asian rainfall, the major modes of variability of the western Pacific subtropical high (WPSH) and EASM teleconnection, and the northern and southern modes of EAWM variability are also assessed.
Such a systematic diagnostic package can facilitate the quantification of the scientific quality and uncertainties of models, comparing their differences and revealing their shortcomings (Wang et al. 2011). By intercomparison of model performance, one can obtain a general idea about how good current CGCMs are at simulating the EASM and EAWM, as well as their common shortcomings. Those models that stand out based on such an intercomparison may be relatively more reliable for future projections. Thus, the diagnostic metrics developed in the present study are applied to 20 CGCMs that participated in phases 3 and 5 of the Coupled Model Intercomparison Project (CMIP3 and CMIP5, respectively).
Following this introduction, we begin in section 2 by describing the observational data, models, and objective measures used. Sections 3 and 4 set out the diagnostic metrics used for evaluating the models with respect to their simulation of the EASM and EAWM, respectively. In these two sections, a detailed assessment of one of the best-performing models and a comparison with the 20 CGCMs that participated in CMIP5 (Taylor et al. 2012) and CMIP3 (Meehl et al. 2007) are presented. A summary and discussion are provided in section 5.
2. Data, models, and objective measures
a. Observational data
The monthly precipitation data used in this study are from the arithmetic mean of two datasets: the Global Precipitation Climatology Project, v2.3 (Adler et al. 2003), and the Climate Prediction Center Merged Analysis of Precipitation (CMAP; Xie and Arkin 1997). The CMAP pentad precipitation data (Xie and Arkin 1997) are applied to analyze the seasonal march of EASM rainfall. National Centers for Environmental Prediction (NCEP)–Department of Energy (DOE) Reanalysis II data (Kanamitsu et al. 2002) are used for the monthly 2-m temperature, SLP, wind, and geopotential height. The daily data of zonal wind at 850 hPa obtained from NCEP–DOE Reanalysis II are employed to validate monsoon onset. For the monthly mean sea surface temperature (SST), we use the arithmetic mean of two datasets: the Hadley Centre Sea Ice and Sea Surface Temperature (Rayner et al. 2003) and the National Oceanic and Atmospheric Administration Extended Reconstructed SST, version 4 (Huang et al. 2016). All datasets cover the period 1979–2005.
b. Models
The newly developed Nanjing University of Information Science and Technology (NUIST) Earth System Model, version 3 (NESMv3) (Cao et al. 2018; Yang et al. 2018; Yang and Wang 2019) is evaluated. The atmospheric component of this coupled model is the European Centre Hamburg Model, v6.3 (Stevens et al. 2017), with a horizontal resolution of approximately 1.875° × 1.875° in longitude and latitude, and 47 levels in the vertical direction. The Nucleus for European Modeling of the Ocean, v3.4 (Madec 2008), with a 1° × 1° horizontal resolution and 46 vertical levels, is employed as the oceanic component model. The sea ice component model is version 4.1 of the Los Alamos Sea Ice Model (Hunke and Lipscomb 2010). The coupler is OASIS3-MCT_3.0 (Craig et al. 2017). As in the experimental design of the CMIP5 historical run (Taylor et al. 2012), NESMv3 is forced with changing conditions consistent with observations from 1850 to 2005, which include atmospheric composition (including CO2) due to both anthropogenic and volcanic influences, solar forcing, emissions or concentrations of short-lived species, natural and anthropogenic aerosols or their precursors, and land use.
To facilitate multimodel intercomparison, the historical simulations of 20 CMIP5 CGCMs are used for evaluation. Table 1 lists the basic information of the 20 CMIP5 models used in this study. Although the CMIP5 simulations run from 1850 to 2005, only the period 1979–2005 is analyzed here, because observational data are of better quality during this period (i.e., the satellite era). In addition, the twentieth-century climate simulations of 20 CMIP3 CGCMs for the available period of 1979–99 are also assessed, for comparison with the CMIP5 simulations. Detailed information on the CMIP3 models and experiments can be found at https://pcmdi.llnl.gov/mips/cmip3/. The multimodel ensemble (MME) mean of the 20 CMIP5/CMIP3 CGCMs is constructed with equal weights. For fair comparison, all data are interpolated to a common grid of 2.5° × 2.5°.
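The regridding and equal-weight averaging described above can be illustrated with a minimal sketch. The interpolation scheme is not specified in the text, so separable linear (bilinear) interpolation on regular latitude–longitude grids is assumed here; the function names `regrid_bilinear` and `mme_mean` and the synthetic fields are hypothetical and serve only as an illustration.

```python
import numpy as np

def regrid_bilinear(field, lat_src, lon_src, lat_dst, lon_dst):
    """Separable linear interpolation of a (lat, lon) field to a new regular grid.

    Assumes increasing, regular source coordinates. Target points outside the
    source range are clamped to the edge values by np.interp, and longitude
    periodicity is not handled.
    """
    # Step 1: interpolate along longitude for every source latitude row.
    tmp = np.array([np.interp(lon_dst, lon_src, row) for row in field])
    # Step 2: interpolate along latitude for every target longitude column.
    return np.array([np.interp(lat_dst, lat_src, col) for col in tmp.T]).T

def mme_mean(fields):
    """Equal-weight multimodel ensemble mean of fields on a common grid."""
    return np.mean(np.stack(fields), axis=0)

# Synthetic example: two "models" on a 1.875-degree grid, target grid 2.5 degrees.
lat_src = np.arange(0.0, 52.0, 1.875)
lon_src = np.arange(98.0, 144.0, 1.875)
lat_dst = np.arange(0.0, 50.1, 2.5)
lon_dst = np.arange(100.0, 140.1, 2.5)

model_a = lat_src[:, None] + 2.0 * lon_src[None, :]   # linear test field
model_b = model_a + 2.0                               # same field with an offset

regridded = [regrid_bilinear(m, lat_src, lon_src, lat_dst, lon_dst)
             for m in (model_a, model_b)]
mme = mme_mean(regridded)
```

Because every model receives the same weight, a single outlier model shifts the MME by only 1/N of its bias; conservative remapping may be preferable to linear interpolation for precipitation, which is a choice the text leaves open.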
c. Objective measures
To quantify the performance of models, three objective measures are employed as metric fields. The first is the pattern correlation coefficient (PCC), which is used to gauge the degree of similarity between observed and simulated fields. The second is the domain-averaged normalized root-mean-square error (NRMSE) (Lee and Wang 2014), which is used to measure the magnitude of the simulation error. The NRMSE is the root-mean-square error normalized by the observed standard deviation that is calculated with reference to the whole domain. Note that one of the best-performing models is selected according to the NRMSE skill (or the averaged NRMSE skill if the metric contains more than one variable). The third measure is the threat score (TS), which is defined as the number of “hit” grids divided by the sum of “hit,” “miss,” and “false-alarm” grids. It is used to appraise the performance with respect to the monsoon domain (Wang et al. 2011).
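The three measures can be written compactly as follows. This is a minimal sketch rather than the exact implementation used in the study: the PCC is assumed to be the centered, unweighted correlation over the domain (area weighting by the cosine of latitude is a common alternative), and the function names are hypothetical.

```python
import numpy as np

def pcc(obs, sim):
    """Centered pattern correlation between two fields on the same grid."""
    o = obs - obs.mean()
    s = sim - sim.mean()
    return float((o * s).sum() / np.sqrt((o ** 2).sum() * (s ** 2).sum()))

def nrmse(obs, sim):
    """RMSE normalized by the observed spatial standard deviation over the domain."""
    rmse = np.sqrt(np.mean((sim - obs) ** 2))
    return float(rmse / obs.std())

def threat_score(obs_mask, sim_mask):
    """TS = hits / (hits + misses + false alarms) for two boolean domain masks."""
    hits = np.sum(obs_mask & sim_mask)
    misses = np.sum(obs_mask & ~sim_mask)
    false_alarms = np.sum(~obs_mask & sim_mask)
    return float(hits / (hits + misses + false_alarms))

# Tiny worked example on a 2 x 2 grid.
obs = np.array([[1.0, 2.0], [3.0, 4.0]])
sim = obs + 1.0                                  # perfect pattern, uniform bias of 1
obs_mask = np.array([[True, True], [False, False]])
sim_mask = np.array([[True, False], [True, False]])

print(pcc(obs, sim), nrmse(obs, sim), threat_score(obs_mask, sim_mask))
```

The example shows why the PCC and NRMSE are complementary: a uniform bias leaves the PCC at 1 but yields a nonzero NRMSE. The TS is undefined when neither mask contains any grid point, a case this sketch does not guard against.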
3. Evaluation of the EASM
Large-scale climatological circulations in JJA provide an important background for EASM climate and variability. The primary circulation systems that influence the EASM include the following: 1) the tropical monsoon troughs over the Bay of Bengal and South China Sea, which are closely related to EASM onset; 2) the WPSH, whose strength, shape, and position influence the precipitation of the EASM system (Lu and Dong 2001; T. Zhou et al. 2009b; Wang et al. 2013); 3) the subtropical mei-yu/baiu/changma front, which represents a unique rainy episode during the EASM’s seasonal march (Ding and Chan 2005; Wang et al. 2008); and 4) the westerly jet in the upper troposphere, which is intimately linked to the outbreak of the EASM and movement of the rain belt (Dao and Chen 1957; Lau and Li 1984; Zhou and Yu 2005). In addition, the mean state is an essential condition for models to reproduce the ENSO–monsoon teleconnection pattern (Turner et al. 2005). Given the above reasons, these large-scale circulation systems related to the EASM are considered to be basic diagnostics for the EASM.
Observationally, at low level (850 hPa), tropical southwest monsoonal flow blows from the Arabian Sea to the Philippine Sea (Fig. 1a, left-hand panel), which originates from the cross-equatorial flow associated with the Mascarene and Australian anticyclones. Meanwhile, southwesterlies from the western flank of the WPSH form the subtropical southwest monsoon across eastern China to Japan (Fig. 1a, left-hand panel). Monsoon troughs over the Bay of Bengal and South China Sea correspond to the rain belt stretched from the eastern Arabian Sea to the western Pacific (Fig. 1a, left-hand panel). A prominent feature of EASM rainfall is the mei-yu/baiu rain belt across China, Japan, the Korean peninsula, and surrounding seas (Fig. 1a, left-hand panel), which is the product of the quasi-stationary East Asian subtropical front (Wang et al. 2008). At middle level (500 hPa), the WPSH is the dominant system (Fig. 1a, right-hand panel), which links the tropical and subtropical circulations. At upper level (200 hPa), the most evident features are the westerly jet stream centered along 40°N and the tropical easterly jet to the south of 25°N (Fig. 1a, right-hand panel).
But how well do the models perform in simulating these large-scale circulation systems? We analyze in detail one of the models with the best performance (i.e., NESMv3 here) in simulating these circulation systems. In comparison with observations, the simulated WPSH is shifted northeastward in the lower troposphere, and its intensity is weaker, as indicated by the eastward retreat of the 5870-m isoheight at 500 hPa (Figs. 1a,b), which could account for the missing rainfall over the mei-yu/baiu rain belt (Fig. 1b, left-hand panel). The changes in the position and strength of the WPSH are due to the atmosphere’s response to the observed Indian Ocean–western Pacific SST anomalies (Wang et al. 2003; T. Zhou et al. 2009b). In other words, whether a model can capture the Indian Ocean–western Pacific SST anomalies and their interaction with the atmosphere is crucial for simulating the WPSH and mei-yu/baiu rain belt (Song and Zhou 2014a). Meanwhile, NESMv3 can realistically reproduce the monsoon troughs over the Bay of Bengal and South China Sea, as well as the corresponding tropical rain belt, but it overestimates the intensity of tropical rainfall (Fig. 1b, left-hand panel). The excessive precipitation over the tropics could be attributed to SST bias (Dai 2006; Adam et al. 2018). In addition, the model successfully captures the position and intensity of the subtropical westerly jet (Fig. 1b, right-hand panel). One of the factors contributing to the position of the upper-level jet stream over East Asia is the temperature anomalies in the upper troposphere and lower stratosphere (Yu and Zhou 2007). Another possible reason for model biases in the climatology of the EASM is deficient convective parameterization. In fact, many parameterizations have been modified in NESMv3 to improve the simulation of the EASM climatology (Yang et al. 2018). The results show that the mean precipitation and associated circulation in both the lower and upper troposphere, as well as the location of the WPSH, are sensitive to each modified parameterization. Although each modification has its own improvements and limitations, the implementation of all modifications together can improve most aspects of the climatology over East Asia (Yang et al. 2018). The improved model parameterization may be the reason for NESMv3’s superior performance in simulating the large-scale EASM climatology.
The performances of the 20 CMIP5 and 20 CMIP3 models are also evaluated. Since the models’ PCCs with respect to the large-scale climatology are close to each other, we only compare the NRMSE here. The individual CMIP5 models have an NRMSE ranging from 0.4 to 1.0 for the precipitation climatology, 0.2 to 0.8 for the 850-hPa geopotential height climatology, 0.1 to 0.7 for the 200-hPa zonal wind climatology, and 0.1 to 0.6 for the 500-hPa geopotential height climatology (Fig. 1c). There are some improvements from CMIP3 to CMIP5 in skill. The CMIP5 models’ skill has smaller spread and lower averaged NRMSE than that of the CMIP3 models. Also, compared to CMIP3, the CMIP5 MME generally shows slightly higher skill (smaller NRMSE) in simulating the climatological large-scale circulation.
Considering the regional conditions, we next focus on the time-mean patterns of monsoon rainfall and circulation over East Asia. The low-level southwesterlies from the tropics and the WPSH transport moisture to East Asia, forming three tropical rainfall centers (over the Indochina Peninsula, Philippines, and western Pacific) and one subtropical mei-yu rainfall belt in JJA (Fig. 2a, left-hand panel). In addition, the EASM possesses salient differences between early summer (MJ) and late summer (JA), since the rainfall distribution over East Asia changes abruptly from June to July, but changes are relatively gradual from May to June and from July to August (B. Wang et al. 2009). As shown in Fig. 2a (middle and right-hand panels), the observed major rainy regions extend from southern China to Japan in MJ, whereas they move to the Korean Peninsula and northern China in JA. These precipitation distributions in MJ and JA are associated with a northeastward shift of the WPSH ridge line from around 20°N in MJ to around 28°N in JA.
As one of the best-performing models here, MPI-ESM-P simulates a realistic spatial pattern of rainfall over East Asia in JJA, MJ, and JA, with PCCs larger than 0.77. However, it generally overestimates the rainfall intensity over the Indochina Peninsula, South China Sea, and tropical western Pacific, and underestimates the rainfall intensity over the Yellow Sea (Fig. 2b). Of note is that the model reproduces the circulation pattern well in MJ, but is less able to capture the exact location of the WPSH in JA and JJA. The local convection–wind–evaporation–SST feedback is key to the position and intensity of the WPSH, while the enhanced mean precipitation associated with the strong western North Pacific (WNP) monsoon trough in late summer makes the atmospheric response much more sensitive to local SST forcing than in early summer (Xiang et al. 2013). Thus, in addition to the local convection–wind–evaporation–SST feedback, the model’s capability in simulating the mean rainfall over the WNP monsoon trough area is important for reproducing the WPSH in late summer.
Figure 2c compares various models’ performances against observation with respect to the JJA, MJ, and JA mean precipitation and 850-hPa geopotential height over East Asia in terms of their PCCs and NRMSE. The PCCs of individual CMIP5 models for summer mean precipitation range from 0.4 to 0.9, and the NRMSE from 0.4 to 1.1. Most of the CMIP5 models successfully capture the early summer 850-hPa geopotential height, with PCCs larger than 0.9 and NRMSE smaller than 1.3. However, larger spread is found across the CMIP5 models’ performances with respect to the 850-hPa geopotential height in late summer and JJA, as compared to that in early summer. The improvements from CMIP3 to CMIP5 can be seen in the spread of skill, the averaged skill, and the MME’s skill. Generally, the MME is better than individual models in simulating the summer precipitation and 850-hPa geopotential height climatology.
b. Annual cycle
The annual cycle involves a large number of radiative, dynamical, and thermal processes; hence, the realism with which the basic annual characteristics can be replicated in CGCMs provides a critical indicator for assessing the skill of models. With regard to the EASM, it is important for CGCMs to accurately simulate the transition from the dry season to the rainy season associated with the northward movement of the tropical and subtropical rain belt. Meanwhile, the onset of the South China Sea summer monsoon (SCSSM) has been considered as the commencement of the EASM and is a key indicator of an abrupt transition from the dry to wet season over the South China Sea (Wang et al. 2004b; Zhu and Li 2017). Thus, the seasonal migration of rainfall over East Asia and the SCSSM onset should be key diagnostic targets.
A latitude–time plot of pentad rainfall, averaged between 110° and 130°E, is constructed to show the seasonal migration of rainfall over East Asia (Zhu et al. 2012). From March to mid-May (around pentad 28), two major rainfall belts can be found (Fig. 3a). The tropical rainbelt is located in the Maritime Continent between 10°S and 5°N, and the subtropical rainbelt is located in subtropical East Asia (23°–30°N). Around mid-May, these two rainfall belts suddenly merge together over 0°–25°N. In addition, a poleward branch of the southern China rainfall belt moves northward around mid-June (around pentad 32) and finally penetrates northern China in July, causing the start of a dry spell in southern China and a wet spell in northern China. NESMv3, as one of the best-performing models here, successfully reproduces the transition of the rainfall pattern, especially the tropical rainbelt, with a PCC of 0.85 (Fig. 3b). The superior simulation of this transition in NESMv3 may be attributed to the convective trigger based on boundary layer depth through convection–SST feedback (Yang et al. 2018). However, the model fails to capture the northward propagation of rainfall from southern China to northern China in July. This failure is associated with the poor skill in reproducing the northward jump of the WPSH in July (figure not shown), which may relate to the bias in the response of the WPSH to the remote SST forcing from the tropical Indian Ocean (Wu and Zhou 2016).
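The latitude–time section underlying such a plot amounts to a zonal average over the 110°–130°E band. A minimal sketch is given below, assuming the pentad climatology is stored as an array with dimensions (pentad, latitude, longitude); the function name and the synthetic data are hypothetical.

```python
import numpy as np

def lat_time_section(precip, lon, lon_west=110.0, lon_east=130.0):
    """Average pentad rainfall over a longitude band.

    precip : array of shape (n_pentad, n_lat, n_lon)
    lon    : 1-D array of longitudes in degrees east
    Returns an array of shape (n_pentad, n_lat) for a latitude-time plot.
    """
    in_band = (lon >= lon_west) & (lon <= lon_east)
    return precip[:, :, in_band].mean(axis=2)

# Synthetic climatology: 73 pentads on a 2.5-degree grid, 10S-50N, 100-140E.
lat = np.arange(-10.0, 50.1, 2.5)
lon = np.arange(100.0, 140.1, 2.5)
precip = np.broadcast_to(lon, (73, lat.size, lon.size)).copy()  # value = longitude

section = lat_time_section(precip, lon)
```

The resulting observed and simulated sections can then be compared with the PCC and NRMSE in the same way as any other two-dimensional field.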
Results presented in Fig. 3c show that most CMIP5 models can capture the transitional structure of rainfall over East Asia, with PCCs larger than 0.6 and NRMSE ranging from 0.6 to 1.0. The skill of CMIP3 models is larger in spread than that of the CMIP5 models. It is also apparent that the MME of the CMIP5 models is more skillful than that of the CMIP3 models in simulating the seasonal migration of precipitation over East Asia.
Using the SCSSM index, which is defined by the 850-hPa zonal winds averaged over the central South China Sea (5°–15°N, 110°–120°E; Wang et al. 2004b), we calculate the climatological temporal evolution of the SCSSM onset index to examine the EASM onset date. The reversal of this index around mid-May indicates the onset of the broadscale EASM, characterized by a sudden establishment of westerlies over the entire South China Sea (Fig. 4a).
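The index and the onset date derived from it can be sketched as follows. The box limits come from the definition above; the persistence requirement (`persist` consecutive westerly steps after the reversal) is an assumption added here to avoid picking up a brief wind reversal, and is not taken from the text. The function names are hypothetical.

```python
import numpy as np

def scssm_index(u850, lat, lon):
    """Area-mean 850-hPa zonal wind over 5-15N, 110-120E.

    u850 : array of shape (n_time, n_lat, n_lon); returns shape (n_time,).
    An unweighted mean is used; cosine-latitude weights matter little this
    close to the equator.
    """
    in_lat = (lat >= 5.0) & (lat <= 15.0)
    in_lon = (lon >= 110.0) & (lon <= 120.0)
    return u850[:, in_lat][:, :, in_lon].mean(axis=(1, 2))

def onset_step(index, persist=3):
    """First time step at which the index turns from easterly (<= 0) to westerly
    (> 0) and stays westerly for `persist` consecutive steps (assumed criterion).
    Returns None if no such reversal is found."""
    westerly = index > 0
    for t in range(1, len(index) - persist + 1):
        if not westerly[t - 1] and westerly[t:t + persist].all():
            return t
    return None

# Synthetic index with a short false start before the sustained reversal.
index = np.array([-3.0, -2.0, 1.0, -1.0, -0.5, 2.0, 3.0, 4.0, 5.0])
onset = onset_step(index, persist=3)
```

In this example the isolated westerly value at step 2 is skipped and the onset is assigned to step 5, where the westerlies become established.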
MRI-CGCM3, as one of the best-performing models here, successfully captures the onset date with a 2-day bias (Fig. 4a). However, it underestimates the magnitude of the easterly winds before the onset and overestimates the magnitude of the westerly winds after the onset, resulting in a relatively large NRMSE of 0.89. The SCSSM onset is mainly driven by ENSO (Wang et al. 2000; He et al. 2017), and thus whether ENSO is well captured by a model has a great impact on its skill in simulating the SCSSM onset (Martin et al. 2019).
Figure 4b shows that the CMIP5 models’ MME reproduces the SCSSM onset with high fidelity (3-day bias for onset date; NRMSE = 0.46). The individual CMIP5 models can simulate the observed onset realistically, with the onset date ranging from 8 May to 9 June and the NRMSE from 0.46 to 2.0. The skill measures of individual CMIP5 models are smaller in spread than their CMIP3 counterparts. However, the CMIP3 models’ MME reproduces the onset date better than the CMIP5 models’ MME, with an onset that is only one day late.
c. Monsoon domain and precipitation intensity
The monsoon domain and precipitation intensity together provide integrated information on the annual mean rainfall, the amplitude of the annual range, and the local seasonal distribution of the rainfall (Wang 1994; Wang and Ding 2008; Wang et al. 2011). Therefore, the monsoon domain and precipitation intensity over East Asia are considered to be good indicators for EASM simulation. These metrics have been proven to be a very useful tool for gauging the performance of current models in simulating global monsoon variations (Lee et al. 2010; Wang et al. 2011).
The monsoon precipitation intensity is defined by the ratio of local summer-minus-winter precipitation to the annual total, where summer means May–September and winter means November–March for the Northern Hemisphere (Wang and Ding 2008). The monsoon domain is defined by the regions where the summer-minus-winter precipitation exceeds 2.0 mm day−1 and the local summer precipitation exceeds 55% of the annual total (Wang et al. 2012).
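These two definitions can be sketched for a Northern Hemisphere monthly climatology as follows. Several simplifications are assumed and are not taken from the text: the input is a 12-month climatology of precipitation rate in mm day−1, all months are treated as equally long when forming the summer fraction, and the intensity is expressed relative to the annual-mean rate so that it is dimensionless. The function name is hypothetical.

```python
import numpy as np

SUMMER = [4, 5, 6, 7, 8]      # May-September (0-based month indices)
WINTER = [10, 11, 0, 1, 2]    # November-March

def monsoon_metrics(pr_monthly):
    """Monsoon precipitation intensity and monsoon-domain mask.

    pr_monthly : array of shape (12, n_lat, n_lon), rates in mm/day.
    Returns (intensity, domain), each of shape (n_lat, n_lon).
    """
    summer_rate = pr_monthly[SUMMER].mean(axis=0)
    winter_rate = pr_monthly[WINTER].mean(axis=0)
    annual_rate = pr_monthly.mean(axis=0)
    annual_range = summer_rate - winter_rate          # mm/day

    with np.errstate(divide="ignore", invalid="ignore"):
        # Intensity: annual range relative to the annual-mean rate (assumption).
        intensity = np.where(annual_rate > 0, annual_range / annual_rate, np.nan)
        # Share of the annual total that falls in May-September.
        summer_fraction = np.where(
            annual_rate > 0,
            pr_monthly[SUMMER].sum(axis=0) / pr_monthly.sum(axis=0),
            np.nan,
        )

    domain = (annual_range > 2.0) & (summer_fraction > 0.55)
    return intensity, domain

# Two grid points: one monsoonal, one with uniform rainfall all year.
pr = np.full((12, 1, 2), 3.0)
pr[SUMMER, 0, 0] = 8.0
pr[WINTER, 0, 0] = 1.0
intensity, domain = monsoon_metrics(pr)
```

The boolean `domain` masks from observations and a model are the inputs to the threat score, while the `intensity` fields are compared with the NRMSE.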
ACCESS1.0 is one of the models with the best performance in simulating the monsoon domain and intensity over East Asia. However, ACCESS1.0 underestimates the monsoon precipitation intensity along the Yangtze River valley to Japan (i.e., the subtropical East Asian frontal area) and part of the South China Sea and Philippine Sea (Figs. 5a,b). These biases may relate to the fact that the simulated WPSH is weaker and shifted northward (figure not shown). For the same reason, ACCESS1.0 misses the monsoon domain over the subtropical East Asian frontal region, and part of the South China Sea and Philippine Sea. The deficiency in representing the monsoon domain indicates that the model has difficulty in replicating the seasonal march of the WPSH, which is largely driven by land–ocean thermal contrast (Wang et al. 2011).
The individual CMIP5 models possess NRMSE ranging from 0.55 to 0.94 for the monsoon precipitation intensity, and a TS from 0.37 to 0.62 for the monsoon domain (Fig. 5c). The CMIP3 models display larger spread and bias in averaged skill than the CMIP5 models, for both the monsoon domain and the monsoon precipitation intensity (Fig. 5c), indicating improvements from CMIP3 to CMIP5.
d. Major modes of EASM variability
The EASM has unique features in both rainfall pattern and associated circulation systems, which are closely coupled. Thus, multivariate empirical orthogonal function (EOF) analysis (MV-EOF) (Wang 1992; Zhu et al. 2014) on a set of meteorological fields (precipitation and circulation fields) in JJA over East Asia (0°–50°N, 100°–140°E) is often used to investigate the variability of the EASM. It has been found that the leading MV-EOF mode reflects various aspects of EASM rainfall and circulation (Wang et al. 2008), including the north–south thermal contrast, shear vorticity of zonal winds, southwesterly monsoon, and SCSSM. The second MV-EOF mode features rising pressure over land and falling pressure over the WNP, hence representing the east–west thermal contrast (Wang et al. 2008). These two MV-EOF modes also reflect the major modes of variability of the WPSH and EASM teleconnection. Thus, given these observed features and insight gained from previous work, the first two MV-EOF modes are considered as important diagnostic targets for the EASM.
We apply MV-EOF analysis to the precipitation and geopotential height at 850 hPa in JJA. Observationally, the rainfall pattern associated with the first MV-EOF mode in JJA displays dry anomalies over the South China Sea and Philippine Sea and wet anomalies along the mei-yu frontal area (Fig. 6a, left-hand panel). Meanwhile, a large-scale 850-hPa anticyclonic anomaly extends from the northern South China Sea to the Philippine Sea, which can transport moisture to the mei-yu frontal area on its northwest flank and suppress rainfall over the South China Sea and Philippine Sea. The second MV-EOF mode shows enhanced precipitation over southern China and suppressed rainfall over northern China (Fig. 6a, right-hand panel). At the same time, an anticyclonic anomaly is situated over northern China and a cyclonic anomaly over the WNP, indicating a weakening WPSH and EASM.
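A minimal sketch of such a multivariate EOF computation is given below. The normalization is an assumption: each field is converted to anomalies and divided by its temporal standard deviation at every grid point before the fields are concatenated, so that variables with different units contribute comparably. Area weighting is omitted, and the function name and synthetic data are hypothetical.

```python
import numpy as np

def mv_eof(fields, n_modes=2):
    """Multivariate EOF of several (n_year, n_lat, n_lon) fields via SVD.

    Returns (pcs, patterns, var_frac):
      pcs      : (n_year, n_modes) principal components
      patterns : list with one (n_modes, n_lat, n_lon) array per input field
      var_frac : (n_modes,) fraction of total variance explained
    """
    blocks, shapes = [], []
    for f in fields:
        x = f.reshape(f.shape[0], -1)
        x = x - x.mean(axis=0)             # anomalies about the time mean
        std = x.std(axis=0)
        std[std == 0] = 1.0                # leave constant points at zero
        blocks.append(x / std)             # standardize (assumed normalization)
        shapes.append(f.shape[1:])

    combined = np.concatenate(blocks, axis=1)       # (n_year, total grid points)
    u, s, vt = np.linalg.svd(combined, full_matrices=False)
    var_frac = s ** 2 / np.sum(s ** 2)
    pcs = u[:, :n_modes] * s[:n_modes]

    patterns, start = [], 0
    for shp in shapes:                              # split EOFs back per field
        size = int(np.prod(shp))
        patterns.append(vt[:n_modes, start:start + size].reshape((n_modes,) + shp))
        start += size
    return pcs, patterns, var_frac[:n_modes]

# Synthetic test: one shared time series drives both fields with opposite sign.
rng = np.random.default_rng(0)
t = rng.standard_normal(27)                         # 27 summers, as in 1979-2005
precip = t[:, None, None] * np.ones((3, 4)) + 0.1 * rng.standard_normal((27, 3, 4))
hgt = -2.0 * t[:, None, None] * np.ones((3, 4)) + 0.1 * rng.standard_normal((27, 3, 4))

pcs, patterns, var_frac = mv_eof([precip, hgt], n_modes=2)
```

Because the sign of an EOF is arbitrary, simulated patterns should be aligned in sign with the observed ones before a PCC or NRMSE is computed.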
Associated with the first MV-EOF mode in JJA, wet anomalies along the mei-yu frontal area are well reproduced in GFDL-ESM2M (one of the models with the best performance here), but the dry anomalies over the South China Sea fail to be captured, which may relate to the eastward-shifted and weakened WPSH produced by the model (Fig. 6b, left-hand panel). The first MV-EOF mode occurs in the decaying phase of El Niño (Wang et al. 2008). The main mechanism responsible for this prolonged ENSO effect is the interaction between the monsoon and the warm ocean, that is, the positive thermodynamic feedback between the WPSH and the cooling (warming) SSTA to its east (west), which maintains the anomalous WPSH into the El Niño decaying summer (Wang et al. 2000; Wang et al. 2003; Wang et al. 2008). Thus, the model may fail to capture the first MV-EOF mode if this local atmosphere–ocean interaction cannot be well simulated. For the second MV-EOF mode, GFDL-ESM2M successfully captures the anticyclonic anomaly over northern China. However, it has difficulty in simulating the cyclonic anomaly over the WNP and fails to reproduce the corresponding rainfall pattern, except for the enhanced rainfall over southeastern China and suppressed rainfall over northeastern China (Fig. 6b, right-hand panel). The second MV-EOF mode coincides with the El Niño development phase (Wang et al. 2008). During the El Niño developing summer, the impact of El Niño on the rainfall over central North China takes place through its impact on the Indian monsoon (Wang et al. 2017; Wu 2017). The reduced rainfall heating over India produces an anomalous low over central Asia in the upper troposphere through the Rossby wave response, which can further excite a barotropic Rossby wave train guided by the westerly jet stream. One of the anomalous lows in this wave train weakens the northern part of the WPSH, reducing the moisture transport to northern China (Enomoto et al. 2003; Ding and Wang 2005; Ding et al. 2011). Thus, the models’ capability in simulating the Indian monsoon response to El Niño and the wave train along the westerly jet stream are two important factors that affect the models’ performance in simulating MV-EOF2.
IPSL-CM5B-LR (CNRM-CM5) shows the best performance in simulating the lead–lag relationship between the first (second) principal component and the Niño-3.4 SSTA (figure not shown). Note that IPSL-CM5B-LR (CNRM-CM5) also has relatively high skill in reproducing the pattern of the first (second) MV-EOF mode (Fig. 6c). However, in order to skillfully simulate rainfall over continental East Asia, dynamical models must be able to accurately simulate not only the strength, location, and evolution of El Niño, but also the subseasonal migration of the subtropical East Asian monsoon rainbands (Wang et al. 2017).
Figure 6c shows that most CMIP5 models can broadly capture the rainfall and 850-hPa geopotential height associated with the first MV-EOF mode. However, they have difficulty reproducing the precipitation linked with the second MV-EOF mode, even when the corresponding 850-hPa geopotential height is broadly reproduced. In general, both the skill scores of the individual CMIP3 and CMIP5 models and the averaged skill scores of the two model groups are comparable with regard to simulating the first two MV-EOF modes of the EASM.
e. Relationship between East Asian rainfall and ENSO
East Asian rainfall and the associated circulation distributions show pronounced differences when comparing ENSO developing years to the following decaying years (Wu et al. 2003; Wang et al. 2017). The El Niño–induced wet anomalies migrate from southern China in fall during the El Niño developing phase, move eastward during the El Niño mature phase, and shift northeastward to eastern central China and southern Japan in the El Niño decay phase (Wang et al. 2003; Wu et al. 2003). The evolution of the wet anomalies is controlled by the evolution of a low-level anomalous anticyclone over the WNP, which is determined by El Niño–related equatorial heating and local air–sea interaction (Wang et al. 2000; Wang et al. 2003). Another robust seasonal signal is the dry anomalies over central northern China during El Niño developing summers (Wu et al. 2003; Wang et al. 2017). The dry anomalies are influenced by an anomalous barotropic cyclone over East Asia, which is connected to ENSO through anomalous heating over northern India (Wu et al. 2003; Wang et al. 2017; Wu 2017). In addition, the seasonal rainfall variance over the wet and dry centers explained by ENSO is about 15%–30% during different ENSO phases (Wu et al. 2003). Whether CGCMs have the capability to simulate the relationship between ENSO and East Asian rainfall/circulation is crucial for seasonal forecasts of East Asian rainfall. Hence, we design metrics for assessing the relationship between East Asian rainfall/circulation and ENSO during the El Niño developing summer, mature phase, and decaying summer. They are illustrated by correlation maps between East Asian rainfall/850-hPa geopotential height anomalies in June–September (0) [JJAS(0)], November–April (0/1) [NDJFMA(0/1)], and May–August (1) [MJJA(1)], and Niño-3.4 index in December–February (0/1) [DJF(0/1)], respectively. Here, “0” denotes the El Niño developing year and “1” the decaying year.
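The correlation maps described above can be illustrated with a minimal sketch, assuming yearly seasonal-mean arrays have already been prepared. The function names and the indexing convention (element k of the Niño-3.4 array is the DJF mean spanning years k and k+1, and element k of each rainfall array is the seasonal mean of year k) are hypothetical choices for illustration and are not taken from the study.

```python
import numpy as np

def correlation_map(index, field):
    """Pearson correlation between a 1-D index (n_years,) and a
    gridded field (n_years, n_lat, n_lon), computed at each grid point."""
    index = np.asarray(index, dtype=float)
    field = np.asarray(field, dtype=float)
    idx_anom = index - index.mean()
    fld_anom = field - field.mean(axis=0)
    # Covariance and standard deviations both use the 1/N convention.
    cov = np.tensordot(idx_anom, fld_anom, axes=(0, 0)) / index.size
    denom = idx_anom.std() * fld_anom.std(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        # Grid points with zero variance are returned as NaN.
        return np.where(denom > 0, cov / denom, np.nan)

def enso_phase_correlations(nino_djf, rain_jjas, rain_mjja):
    """Assumed convention: nino_djf[k] is DJF(k/k+1); rain_jjas[k] and
    rain_mjja[k] are the JJAS and MJJA means of year k."""
    # Developing summer: JJAS(0) rainfall versus the following DJF(0/1).
    developing = correlation_map(nino_djf, rain_jjas)
    # Decaying summer: MJJA(1) rainfall versus the preceding DJF(0/1).
    decaying = correlation_map(nino_djf[:-1], rain_mjja[1:])
    return developing, decaying
```

The same `correlation_map` helper would apply to 850-hPa geopotential height anomalies and to the NDJFMA(0/1) season, provided the seasonal means are aligned with the DJF index in the same way.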
Observationally, during the El Niño developing summer [JJAS(0)], rainfall tends to decrease over central and northern China, the Korean peninsula, and Japan, and increase over the southern coast of China and northern Indochina Peninsula, corresponding to an anticyclonic anomaly centered over central China and cyclonic anomalies over the WNP (Fig. 7a, left-hand panel). During the El Niño mature phase [NDJFMA(0/1)], the correlation map shows a dipole rainfall pattern, with enhanced rainfall north of 20°N and suppressed rainfall south of 20°N, which is consistent with the intensified WPSH south of 20°N (Fig. 7a, middle panel). During the El Niño decaying summer [MJJA(1)], increased convection appears north of 30°N and decreased rainfall south of 30°N, which are favored by an anticyclonic anomaly centered over the South China Sea and a cyclonic anomaly located over northeastern China (Fig. 7a, right-hand panel). Note that the correlation maps during the El Niño developing phase (decay phase) are similar to the MV-EOF2 (MV-EOF1) pattern (Fig. 6a), since MV-EOF2 (MV-EOF1) occurs during El Niño developing (decaying) phase.
As one of the best-performing models here, IPSL-CM5B-LR captures the circulation and rainfall anomalies during the El Niño mature phase well, but its rainfall anomalies during the El Niño developing and decay phases are disordered (Fig. 7b). Similarly, both the CMIP3 and CMIP5 models reproduce the circulation and rainfall anomalies during the El Niño mature phase but have difficulty simulating the rainfall anomalies during the El Niño developing and decaying summers (Fig. 7c). This suggests that the models capture the direct response over East Asia to El Niño forcing better, but still have difficulty reproducing the indirect pathways by which El Niño affects East Asia (i.e., the local air–sea interaction and the remote forcing from northern India).
f. Teleconnection associated with the major heat source of the EASM
Convection over the Philippines is considered the major heat source of the EASM (Wang and Fan 1999) because it can change the local Hadley circulation through the emanation of Rossby waves, which in turn affects the WPSH and East Asian subtropical monsoon through poleward wave trains (Nitta 1987; Wang and Fan 1999). Therefore, the teleconnection associated with the convection over the Philippines is an important metric for EASM simulation.
To represent the Philippines convection, a zonal wind shear index (WFI) is constructed based on the low-level Rossby wave response to the heat source in the vicinity of the Philippines (Wang and Fan 1999). The WFI reflects the variations in the WNP monsoon trough and subtropical high, as well as the leading mode of EASM variations (Wang et al. 2008). The reversed index is defined as follows:
The regressions of JJA rainfall and circulation fields with respect to the reversed WFI are investigated. Observationally, the regression map is characterized by suppressed rainfall over the WNP and enhanced rainfall along the East Asian subtropical front, in accordance with the southwestward displacement of the WPSH (Fig. 8a). Because the WFI is nearly identical to the leading principal component of the MV-EOF (Wang et al. 2008), the regressed maps in Fig. 8 are similar to the pattern of MV-EOF1 (Fig. 6a).
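As an illustration of this diagnostic, the sketch below builds a reversed zonal wind shear index from JJA-mean 850-hPa zonal wind and regresses a field onto it. The two averaging boxes are an assumption: they are the bounds commonly attributed to Wang et al. (2008) and are not stated in this text, so they should be checked against that paper. The function names and the cosine-latitude weighting are likewise hypothetical choices.

```python
import numpy as np

# ASSUMED box bounds (lat_min, lat_max, lon_min, lon_max) in degrees;
# not given in this text, verify against Wang et al. (2008).
NORTH_BOX = (22.5, 32.5, 110.0, 140.0)
SOUTH_BOX = (5.0, 15.0, 90.0, 130.0)

def box_mean(field, lats, lons, box):
    """Cosine-latitude-weighted mean of field (n_time, n_lat, n_lon) in a box."""
    lat_min, lat_max, lon_min, lon_max = box
    lat_sel = (lats >= lat_min) & (lats <= lat_max)
    lon_sel = (lons >= lon_min) & (lons <= lon_max)
    sub = field[:, lat_sel][:, :, lon_sel]
    w = np.cos(np.deg2rad(lats[lat_sel]))[None, :, None] * np.ones((1, 1, lon_sel.sum()))
    return (sub * w).sum(axis=(1, 2)) / w.sum()

def reversed_wfi(u850, lats, lons):
    """Northern-box minus southern-box mean zonal wind (reversed shear)."""
    return box_mean(u850, lats, lons, NORTH_BOX) - box_mean(u850, lats, lons, SOUTH_BOX)

def regression_map(index, field):
    """Least-squares slope of field on index at every grid point."""
    idx_anom = index - index.mean()
    fld_anom = field - field.mean(axis=0)
    return np.tensordot(idx_anom, fld_anom, axes=(0, 0)) / (idx_anom ** 2).sum()
```

With these helpers, `regression_map(reversed_wfi(u850, lats, lons), rainfall)` would give rainfall change per unit index; whether the index should be standardized first is left open here.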
NESMv3 successfully simulates the deficient rainfall over the WNP and enhanced rainfall extending from central China to southwest Japan, but the rainfall anomalies are underestimated (Fig. 8b), suggesting that the simulated response to the Philippines heat source is weaker than observed. Furthermore, the relationship between the enhanced rainfall and the WPSH is well represented (Fig. 8b). To skillfully simulate this rainfall and circulation pattern, it is important to replicate the Matsuno–Gill pattern related to the Indian Ocean warming that is induced by ENSO in the preceding winter (Song and Zhou 2014b).
For both the CMIP5 and CMIP3 models, the 850-hPa geopotential height anomalies are better simulated than the rainfall anomalies (Fig. 8c). The CMIP5 models have higher PCCs, lower NRMSE, and a smaller spread of skill scores than the CMIP3 models in representing the rainfall anomalies, indicating improvement from CMIP3 to CMIP5.
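The two skill scores used throughout can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes that the PCC is a centered pattern correlation, that the NRMSE is the RMSE divided by the observed spatial standard deviation, and that both are weighted by the cosine of latitude. The function names are hypothetical.

```python
import numpy as np

def _area_weights(lats, shape):
    """Normalized cos(latitude) weights for a (n_lat, n_lon) grid."""
    w = np.cos(np.deg2rad(lats))[:, None] * np.ones(shape)
    return w / w.sum()

def pcc(model, obs, lats):
    """Centered, area-weighted pattern correlation coefficient."""
    w = _area_weights(lats, obs.shape)
    m_anom = model - (w * model).sum()
    o_anom = obs - (w * obs).sum()
    cov = (w * m_anom * o_anom).sum()
    return cov / np.sqrt((w * m_anom ** 2).sum() * (w * o_anom ** 2).sum())

def nrmse(model, obs, lats):
    """Area-weighted RMSE normalized by the observed spatial std (assumed definition)."""
    w = _area_weights(lats, obs.shape)
    rmse = np.sqrt((w * (model - obs) ** 2).sum())
    obs_std = np.sqrt((w * (obs - (w * obs).sum()) ** 2).sum())
    return rmse / obs_std
```

Under these definitions a uniform offset between model and observation leaves the PCC at 1 but raises the NRMSE, which is one reason the two scores are reported together.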
4. Evaluation of the EAWM
The climatological mean circulation of the EAWM is characterized by a cold Siberian high and warm Aleutian low at the surface, pronounced northeasterlies over East Asia in the lower troposphere, a strong East Asian trough in the midtroposphere, and a strong East Asian westerly jet stream in the upper troposphere (Jhun and Lee 2004; Wang et al. 2010; Gong et al. 2014). These circulation systems and the EAWM are inherently related to each other. As the northwesterly winter monsoon flow originating from the Siberian high and Aleutian low becomes strong, it brings more cold air and produces a stronger meridional temperature gradient. The intensity of the 500-hPa trough over coastal East Asia is quasigeostrophically linked to the surface Siberian high. Meanwhile, the stronger monsoon flow leads to a stronger polar jet stream over the East Asian region through the thermal wind relationship. Thus, assessing the capability of models in representing these systems is necessary.
Observationally (Fig. 9a), the central area of the Siberian high is located over 40°–60°N, 80°–120°E, and the high pressure ridge extends to the northern South China Sea. The northeasterlies associated with the Siberian high and Aleutian low bring cold air from the polar region to East Asia. The meridional surface temperature gradient is closely linked to the position of the westerly jet stream over the south of Japan, affecting the East Asian trough downstream of the jet.
As one of the best models here, NorESM1-M successfully captures the spatial pattern of the climatological mean circulation of the EAWM (Fig. 9b). Nevertheless, some biases can be found, with a slightly stronger Siberian high, a cold bias over East Asia, and stronger westerlies at 200 hPa east of Japan (figure not shown). The intensity of the Siberian high is modulated by strong radiative cooling and cold advection throughout the troposphere (Ding and Krishnamurti 1987). Also, the 200-hPa East Asian westerly jet stream is associated with intense baroclinicity, large vertical wind shear, and strong cold advection (Zhang et al. 1997). Thus, improving models’ capability in simulating radiative cooling and cold advection is important for reducing bias in the climatological mean circulation of the EAWM.
Individually, the CMIP5 models are more skillful than the CMIP3 models in simulating the climatological mean circulation of the EAWM (Fig. 9c). Additionally, the CMIP5 models’ MME is better at simulating the climatological mean 2-m temperature and 500-hPa geopotential height than that of the CMIP3 models in terms of NRMSE.
b. Annual cycle
The onset, advance, and withdrawal are important aspects of the EAWM, and can be indicated by the evolution of surface temperature (averaged from 110° to 130°E) over East Asia. Thus, the annual cycle of surface temperature over East Asia is considered as a target for diagnosing the onset and withdrawal of the EAWM. Observationally, the EAWM is established in early winter when temperatures decline, matures during midwinter when temperatures are at their lowest, and retreats during late winter when temperatures rise (Fig. 10a). MPI-ESM-P successfully captures the seasonal evolution of 2-m temperature, with an NRMSE of 0.09 (Fig. 10b). Consistent with observations, at 40°N, the 2-m temperature simulated by MPI-ESM-P drops below 0°C in late October, decreases to a minimum in January, and then rises above 0°C in early March. Figure 10c shows that the individual CMIP5 (CMIP3) models can reproduce the observed evolution of 2-m temperature realistically, with NRMSEs ranging from 0.09 to 0.25 (0.10 to 0.37). The CMIP5 models are more skillful than the CMIP3 models in terms of the skill spread and MME. The CMIP5 model improvements in simulating the surface air temperature are most likely related to the local radiation budget change (Wei et al. 2014).
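A minimal sketch of this diagnostic is given below, assuming a climatological 2-m temperature array in degrees Celsius with dimensions (time, lat, lon). The function names are hypothetical, and the cold-season helper assumes the series starts in summer so that the sub-zero period is contiguous.

```python
import numpy as np

def zonal_band_mean(t2m, lons, lon_min=110.0, lon_max=130.0):
    """Average a (n_time, n_lat, n_lon) field over a longitude band,
    returning a (n_time, n_lat) time-latitude section."""
    sel = (lons >= lon_min) & (lons <= lon_max)
    return t2m[:, :, sel].mean(axis=2)

def cold_season_span(temps_c):
    """Return (start, end): the first time index with temperature below
    0 degC and the first later index at which it is back at or above
    0 degC. Returns None if the series never drops below 0 degC; end is
    None if the series does not recover within the record."""
    below = np.asarray(temps_c) < 0.0
    if not below.any():
        return None
    start = int(np.argmax(below))
    after = np.nonzero(~below[start:])[0]
    end = start + int(after[0]) if after.size else None
    return start, end
```

Applying `cold_season_span` to the 40°N row of the section from `zonal_band_mean` gives the time steps that bracket the sub-zero period, which can then be compared between model and observation.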
c. Northern mode and southern mode
Surface air temperature variability is an important indicator of EAWM variability. The variation in surface air temperature over East Asia is dominated by two distinct EOF modes: the northern and southern modes, which reflect notably distinct cold-air paths invading East Asia from due north and northwest, respectively (Wang et al. 2010). As indicated by Wang et al. (2010), the northern mode, characterized by a westward shift of the East Asian major trough and enhanced surface pressure over central Siberia, represents a cold winter in northern East Asia resulting from cold-air intrusion from central Siberia. The southern mode, on the other hand, features a deepening East Asian trough and increased surface pressure over Mongolia, representing a cold winter south of 40°N resulting from cold-air intrusion from western Mongolia. Furthermore, the two dominant modes can explain more than 70% of the temperature variability over East Asia. Thus, the performance of CGCMs in simulating these two modes and associated circulation is a key diagnostic target.
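The EOF decomposition underlying these two modes can be sketched with a singular value decomposition. This is an illustrative implementation under stated assumptions (winter-mean anomalies on a regular grid, square-root cosine-latitude weighting), not the procedure of Wang et al. (2010); the function name is hypothetical.

```python
import numpy as np

def eof_analysis(field, lats, n_modes=2):
    """EOF analysis of a (n_time, n_lat, n_lon) field via SVD.

    Returns the leading spatial patterns (in latitude-weighted space),
    the principal components, and the fraction of variance explained.
    """
    n_time, n_lat, n_lon = field.shape
    anom = field - field.mean(axis=0)
    # sqrt(cos(lat)) weighting so that variance is area weighted.
    w = np.sqrt(np.cos(np.deg2rad(lats)))[None, :, None]
    x = (anom * w).reshape(n_time, n_lat * n_lon)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    var_frac = s ** 2 / (s ** 2).sum()
    pcs = u[:, :n_modes] * s[:n_modes]
    eofs = vt[:n_modes].reshape(n_modes, n_lat, n_lon)
    return eofs, pcs, var_frac[:n_modes]
```

The sum of the returned variance fractions for `n_modes=2` corresponds to the combined variance explained by the first two modes. The sign of each EOF and its principal component is arbitrary and would need to be fixed by convention before comparing models with observations.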
The spatial pattern of the northern mode (EOF1) is characterized by maximum cooling around 60°N, and the amplitude gradually decreases southward (Fig. 11a, left-hand panel). The spatial pattern of the southern mode (EOF2) shows maximum cooling around 40°–45°N, and the cooling extends to the Indochina Peninsula and the South China Sea (Fig. 11a, right-hand panel). MRI-CGCM3 successfully captures the spatial patterns of the northern and southern modes, with PCCs of 0.94 and 0.92, respectively (Fig. 11b). However, the magnitude of the cooling centers for these two modes is underestimated.
Figure 11c summarizes the performance of the individual CMIP3/CMIP5 models in simulating the spatial patterns of the northern and southern modes in comparison to NESMv3. Quantitatively, the CMIP5 models are more skillful at reproducing the observed northern mode than the observed southern mode. It is also apparent that there are some improvements from CMIP3 to CMIP5 in terms of model skill.
The northern and southern modes display different circulation structures. Figures 12a and 13a show the anomalous circulation regressed with reference to the first and second principal components in observations, respectively. For the northern mode, the anomalous cold air occupies the whole of northern Eurasia and is centered over western and central Siberia (Fig. 12a, left-hand panel). A positive SLP anomaly center can be found over northern Europe, with a major ridge along the Ural Mountains and a minor ridge extending southeastward to northeastern China (Fig. 12a, left-hand panel). The location of the positive SLP anomaly denotes a northwestward shift of the Siberian high. NESMv3 successfully captures the northern mode–related 2-m temperature and SLP anomalies, with PCCs of 0.85 and 0.84, respectively. However, the simulated 2-m temperature anomalies are weaker over Siberia and negative SLP anomalies are stronger over the North Pacific (Fig. 12b, left-hand panel). In the midtroposphere (500 hPa), observationally, a negative geopotential height anomaly center is located over Lake Baikal, implying a westward shift of the East Asian trough (Fig. 12a, right-hand panel). In the upper troposphere, the observed 200-hPa zonal wind shows positive anomalies over East Asia around 20°–50°N (Fig. 12a, right-hand panel), indicating an intensification and northward shift of the subtropical westerly jet. NESMv3 reproduces these features realistically, although the 500-hPa geopotential height and 200-hPa zonal wind anomalies are relatively stronger than observed over the North Pacific (Fig. 12b, right-hand panel). Previous studies have demonstrated that the extensive snow cover over southern Siberia can enhance the Siberian high, Aleutian low, and the East Asian jet, thus favoring cold air accumulation in the lower troposphere (Jhun and Lee 2004; Wang et al. 2010). Thus, one key to reproducing the northern mode is simulating the snow cover forcing that can reduce the solar radiation flux over southern Siberia.
Figure 12c indicates that most of the CMIP5 models can capture the regressed SLP, 2-m temperature, and 500-hPa geopotential height reasonably well with reference to the northern mode, with PCCs larger than 0.6. However, the CMIP5 models perform relatively poorly in simulating the regressed 200-hPa zonal wind. In general, the CMIP5 models are more skillful than the CMIP3 models in representing the northern mode–related circulation structures, indicating considerable improvements from CMIP3 to CMIP5 (Fig. 12c).
For the anomalous surface air temperature associated with the southern mode, observationally, a dipole pattern is apparent, with anomalous warm air over northern Eurasia (north of 50°N) and anomalous cold air centered over Mongolia (Fig. 13a, left-hand panel). Meanwhile, an SLP ridge extends from Mongolia along the eastern flank of the Tibetan Plateau to southeastern China (Fig. 13a, left-hand panel), bringing cold air southward via a “northwest pathway.” The southern mode–related 2-m temperature and SLP anomalies are reproduced realistically by IPSL-CM5B-LR (one of the best-performing models here), with PCCs of 0.78 and 0.87, respectively, although the magnitude of the temperature anomalies over northern Eurasia is underestimated (Fig. 13b, left-hand panel). In the midtroposphere, the observed East Asian trough is deepened, with negative 500-hPa geopotential height anomalies around Japan (Fig. 13a, right-hand panel). In the upper troposphere, the observed subtropical westerly jet is intensified, with positive 200-hPa zonal wind anomalies across Eurasia to the North Pacific along 30°–40°N (Fig. 13a, right-hand panel). IPSL-CM5B-LR can capture the corresponding 500-hPa geopotential height anomalies realistically, with a PCC of 0.82, as well as the 200-hPa zonal wind anomalies, with a PCC of 0.71. However, the magnitudes of both the 500-hPa geopotential height and the 200-hPa zonal wind anomalies are overestimated over central Asia (Fig. 13b, right-hand panel). Two origins have been found for the variability of the southern mode. One is the reduced snow cover over northeastern Siberia, which is expected to result in local warm anomalies, positive pressure anomalies at 500 hPa, and a strong monsoon over southern East Asia (Wang et al. 2010). In addition, the Philippine Sea anticyclone anomaly affected by the ENSO remote forcing, the tropical–extratropical interaction, and the local air–sea interaction also play critical roles (Wang and Zhang 2002). Thus, how well models characterize these related physical processes is important in simulating the southern mode.
Figure 13c compares the performances of the individual CMIP3/CMIP5 models in terms of their simulation of the southern mode–related circulation structure. The skill measures vary tremendously from model to model; however, overall, certain improvements can be seen from CMIP3 to CMIP5.
5. Summary and discussion
In this study, we develop a set of systematic diagnostic metrics for evaluating the performance of CGCMs in simulating the EASM and EAWM. The metrics focus on the observed fundamental dynamical processes of the EASM and EAWM, and are physically intuitive and easy to compute.
For the EASM, the diagnostics include six components: 1) the climatological circulation systems on large and regional scales, comprising the tropical monsoon troughs over the Bay of Bengal and South China Sea, the WPSH, the subtropical mei-yu/baiu/changma front, and the westerly jet in the upper troposphere; 2) the seasonal migration of rainfall and the onset of the SCSSM, which are key indicators of the seasonal transition from the dry season to the rainy season; 3) the monsoon domain and precipitation intensity, which together provide integrated information on the annual mean rainfall, amplitude of the annual range, and local seasonal distribution of the rainfall; 4) the first two MV-EOF modes of EASM variability, which reflect the major modes of variability of the WPSH and EASM teleconnection; 5) the relationship between East Asian rainfall and ENSO during different El Niño phases, which reflects the ENSO-related mechanism that modulates the East Asian rainfall; and 6) the regressions of JJA rainfall and circulation fields with respect to a reversed WFI, which reflects the teleconnection associated with the major heat source of the EASM.
The diagnostics for the EAWM include three parts. The first one is the climatological mean circulation of the EAWM, which is characterized by a cold Siberian high and warm Aleutian low at the surface, pronounced northeasterlies over East Asia in the lower troposphere, a strong East Asian trough in the midtroposphere, and a strong East Asian westerly jet stream in the upper troposphere. The second one is the annual cycle of surface temperature over East Asia, which reflects the onset, advance, and withdrawal of the EAWM. The third is the northern and southern modes and associated circulation fields, which reflect notably distinct cold air paths invading East Asia from due north and northwest, respectively.
A total of 20 CMIP5, 20 CMIP3 CGCM, and NESMv3 simulations of the late twentieth century are evaluated for multimodel intercomparison purposes. In general, NESMv3’s performance ranks among the top or above average compared with the CMIP3 and CMIP5 models. Overall, improvements are apparent from CMIP3 to CMIP5 in terms of the skill spread and the MMEs of individual models for simulating the EASM and EAWM. Table 2 summarizes how much progress has been made from CMIP3 to CMIP5 in this regard. The significant progress from CMIP3 to CMIP5 is concentrated in the following areas: 1) the climatological precipitation and circulation systems on the regional scale, 2) the East Asian monsoon domain and precipitation intensity, 3) the teleconnection associated with the major heat source of the EASM, and 4) the two modes of the EAWM variability and associated circulation fields. Nevertheless, no significant progress has been made with respect to simulating the seasonal evolution of East Asian rainfall, SCSSM onset, the large-scale EASM-related and EAWM-related climatological circulation system, and the annual cycle of the EAWM. Note that no single model outperforms the others in every metric, indicating the high complexity of the East Asian monsoon and the difficulty models have in simulating it.
There are long-standing weaknesses that persist from the CMIP3 to the CMIP5 models. Both sets of models fail to capture the major MV-EOF modes of EASM variability, and the relationship between East Asian rainfall and ENSO during El Niño developing and decay phases. The first (second) MV-EOF mode of the EASM occurs in the decay (developing) phase of El Niño. Therefore, both long-standing deficiencies may suggest that the model physics cannot replicate the anomalous barotropic cyclone (anticyclone) over East Asia (WNP) during El Niño developing (decay) phases. That might be associated with the models’ ability to simulate the anomalous heating over India and the local air–sea interaction (Wang et al. 2000; Wu et al. 2003; Wu 2017). Moreover, it has been found that a realistic simulation of the location, timing, and intensity of ENSO-related SST and diabatic heating anomalies along the equatorial Pacific during El Niño events is a primary element for capturing the ENSO–monsoon relationship (Annamalai et al. 2007; Liu et al. 2018; Wang et al. 2017). In addition, model physics schemes (e.g., the convection scheme) are important in the simulation of the monsoon system (Chen et al. 2010; Yang et al. 2018).
Through such systematic diagnoses, we have a better chance of understanding which model processes require improvement. It may also be possible to gain confidence that subsets of models are more reliable for investigating the EASM and EAWM, and are therefore a better choice for seasonal forecasting and future projection. In short, the metrics presented here provide a tool for evaluating the performance of CGCMs and facilitating assessment of past and projected future changes of the East Asian monsoon. However, the current metrics do not include the interaction of the monsoon with the underlying ocean or with the extratropics, along with several other factors. More detailed diagnoses of specific dynamic processes and interactions associated with monsoon variability deserve further development. Since the time series of the WFI or principal components in a historical run cannot be directly compared with those in observations, the metrics in this study do not assess the interannual variation of these time series themselves. The possible causes affecting models’ performance are discussed, which may offer some guidance for reducing model biases, but sensitivity experiments are needed in the future to test these possible causes.
This study was supported by National Key Research and Development Program of China (2018YFC1505804), the Natural Science Foundation of China (Grant 41805048, Grant 41420104002, Grant 41605035), the National Key Research and Development Program of China (Grant 2016YFA0600401), and the Startup Foundation for Introducing Talent of NUIST (2018r025). This is the NUIST–Earth System Modeling Center (ESMC) publication number 293, the School of Ocean and Earth Science and Technology (SOEST) publication number 10866, and the International Pacific Research Center (IPRC) publication number 1417. The authors declare that they have no conflict of interest.