Haze days induced by aerosol pollution in North and East China have posed a persistent and growing problem over the past few decades. These events are particularly threatening to densely populated cities such as Beijing. While the sources of this pollution are predominantly anthropogenic, natural climate variations may also play a role in allowing for atmospheric conditions conducive to formation of severe haze episodes over populated areas. Here, an investigation is conducted into the effects of changes in global dynamics and emissions on air quality in China’s polluted regions using 35 simulations developed from the Community Earth System Model Large Ensemble (CESM LENS) run over the period 1920–2100. It is shown that internal variability significantly modulates aerosol optical depth (AOD) over China; it takes roughly a decade for the forced response to balance the effects from internal variability even in China’s most polluted regions. Random forest regressions are used to accurately model (R2 > 0.9) wintertime AOD using just climate oscillations, the month of the year, and emissions. These regressions are then used to project how different phases of each oscillation affect aerosol loading. AOD responses are identified for each oscillation, with particularly strong responses from El Niño–Southern Oscillation (ENSO) and the Pacific decadal oscillation (PDO). As ENSO can be projected a few months in advance and improvements in linear inverse modeling (LIM) may yield a similar predictability for the PDO, results of this study offer opportunities to improve the predictability of China’s severe wintertime haze events and to inform policy options that could mitigate subsequent health impacts.
Growth in China’s population over the past several decades has coincided with a period of intense industrialization. This combination has precipitated a jumpstart in China’s economy—which skyrocketed from the 10th largest economy in the world in 1980 to the second largest in 2018. Unfortunately, this has given rise also to serious public health issues induced by increased air pollution. According to the World Health Organization, 91% of the world’s population lives in areas where air quality exceeds guideline limits (WHO 2016). Notably, air quality standards were met in only 84 of China’s 338 prefecture-level cities in 2016 (China News Network 2017). The problem is especially pervasive in winter, when stable synoptic meteorological conditions can contribute to particularly strong haze events (Zheng et al. 2015). The Chinese government attempted to mitigate this issue by temporarily shutting down roughly 40% of its factories at the end of 2017 [more measures in Zheng et al. (2018)]. The rationale behind the decision is that coal consumption is one of the dominant sources of China’s air pollution (Guan et al. 2016). While the sources over this region are primarily anthropogenic, the meteorological conditions that are conducive to stagnant weather are driven largely by natural causes. An important question is, what is the role of natural climate variability and how will it affect China’s air quality in the future?
Teleconnection patterns refer to connections in climate anomalies over large spatial scales. They project as emergent patterns, and typically exhibit oscillatory behavior, with positive and negative phases corresponding to different atmospheric and oceanic phenomena. Teleconnections are caused by couplings of the atmospheric, oceanic, cryospheric, and land-based processes, in what we refer to here interchangeably as natural variability and internal variability. One such example is the Arctic Oscillation (AO), where the positive phase reflects low surface pressure anomalies in the north polar region and the negative shows the opposite (Higgins et al. 2000). The AO may be connected to the high pressure region of Mongolia, the Siberian high, a wintertime anticyclonic pressure system over northeast Eurasia (Wu and Wang 2002; Gong et al. 2001). The strength and position of the Siberian high are key modulators of the East Asian winter monsoon (Ding et al. 2014; Jia et al. 2015), which transports cool, clean air into China’s heavily polluted regions, clearing out aerosols that have accumulated over roughly a week (Yang et al. 2016). Since this teleconnection may influence the Siberian high, changes in the AO could have important implications for China’s air quality.
Several studies have investigated potential connections between other oscillations and climate in China (Mantua et al. 1997; Dima and Lohmann 2007; Zhang et al. 2007). Si and Ding (2016) found that the Pacific decadal oscillation (PDO) and Atlantic multidecadal oscillation (AMO) are important drivers of East Asian summer rainfall, and proposed a potential interaction between the two oscillations through formation of a global stationary baroclinic wave train. Chen et al. (2013) determined that the relationship between the winter and summer monsoons is associated predominantly with El Niño–Southern Oscillation (ENSO) SST anomalies, with an anomalous anticyclone over the western Pacific that persists from winter to spring during El Niño events. They note a similar anticyclonic pattern during the positive phase of the PDO. As many have suggested that these oscillations, particularly ENSO and the PDO, affect the monsoonal circulation over East Asia, there is reason to suspect that they could also have significant bearing on China’s air quality. Finding strong connections between the oscillations and air pollution would facilitate better understanding of the underlying atmospheric mechanisms and could allow for improved prediction of haze conditions. From a more general perspective, Kushnir et al. (2019) note the importance of climate models for predicting extreme events given the sparsity of observational data. Numerous studies have investigated the reverse scenario, namely the effect of aerosols on teleconnections (Shindell et al. 2015; Shindell and Faluvegi 2009; Booth et al. 2012). The fact that the aerosol–teleconnection interaction could be bidirectional is an important consideration for prediction of both air quality changes and internal variability in the future.
Here, we analyze 35 simulations of wintertime (December–February) monthly 550-nm aerosol optical depth (AOD) derived from the Community Earth System Model Large Ensemble (CESM LENS; Kay et al. 2015) over the period 1920–2100 to assess how internal variability has influenced China’s wintertime haze (typically nitrate, sulfate, ammonium, organic aerosol, and black carbon) and how it could influence pollution in the future. We use AOD because it represents spatiotemporal changes in anthropogenic emissions well, can be validated with MODIS observations (Fig. S1 in the online supplemental material), and has been shown to correlate well with PM2.5 in China (Xin et al. 2014). We then train random forest regressors to predict haze spatially over eastern China. We exploit this model to project how changes in the oscillation indices could be reflected in China’s AOD.
a. Data overview—CESM LENS
CESM LENS is an ensemble of 40 fully coupled CESM1 simulations covering the period 1920–2100. The model begins with a multicentury simulation of a year, 1850, selected as representative of preindustrial conditions. The model, with constant forcing, reaches a quasi equilibrium after a few centuries. The date 1 January, year 402, of this simulation is taken to provide the initial conditions for the first member of the ensemble, which is run from 1850 to 1920 (Deser et al. 2012). This simulation is then integrated to the year 2100 and is considered the first ensemble member. The remaining 39 simulations are initialized for the same year 1920, except for perturbations in initial air temperatures (on the order of 10⁻¹⁴ K). These slight differences between simulations impact the global climate system by altering internal variability. The consequences of this are reflected in alterations of natural climate oscillation indices. Since we are investigating the impacts of specific climate oscillation indices (some of which are calculated using SST data), we exclude 5 of the 40 simulations as they are run by a different climate modeling group, one at the University of Toronto, and show systematic temperature biases relative to the rest of the ensemble. While internal variability differs from simulation to simulation, external forcing is consistent for each member of the ensemble: phase 5 of the Coupled Model Intercomparison Project (CMIP5) forcings from 1920 to 2005 and the representative concentration pathway with 8.5 W m⁻² of additional radiative forcing (RCP8.5) from 2006 to 2100. This fact can be exploited to assess the strength of external forcing changes relative to uncertainty associated with internal variability within CESM LENS. Prior studies have used this property to separate the relative importance of internal variability and the forced response (Deser et al. 2012; Vega-Westhoff and Sriver 2017).
Further elaboration of the ensemble is discussed in Kay et al. (2015).
The Community Atmosphere Model, version 5 (CAM5; Neale et al. 2012), is the atmospheric component of CESM LENS. Since we are investigating changes in AOD within CESM LENS, we will narrow our discussion of CAM5 to its aerosol component. The CAM5 model used in the CESM LENS simulations contains 30 vertical layers extending up to 3 hPa (roughly 30 km). It utilizes the simplified 3-mode modal aerosol scheme (MAM-3; Liu et al. 2012), which includes only Aitken, accumulation, and coarse aerosol modes. A number of assumptions are made in this simplified scheme. For the purposes of this research the most important assumption is that NH3 is not simulated and ammonium is prescribed (Neale et al. 2012). The interactive aerosols within this CAM5 scheme are black carbon, primary organic matter, sulfate, dust, sea salt, and secondary organic aerosol, which are included in calculating total AOD over the visible spectrum, the response variable that we investigate in this study. CESM LENS implements anthropogenic emissions from the Lamarque et al. (2010) IPCC AR5 emissions dataset for 1920–2005 and emissions corresponding to RCP8.5 from 2006 to 2100.
b. Comparing internal variability with the forced response
To compare the forced response with the internal variability inherent in the model, we compute a time scale for each. The forced response is the mean AOD of the 35 simulations at each time step, and its time scale is characterized by its temporal derivative. The response due to internal variability is the standard deviation across the 35 simulations at each time step. The mean ratio of these two time series shows how long it takes for the forced response to balance internal variability. If the internal response is greater than the forced response, we would expect natural variability to play an important role in modulating China’s wintertime AOD.
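The comparison described above can be sketched for a single grid cell as follows. This is only one possible reading of the method, not the code used in the study: the function name `balance_time_scale`, the use of `np.gradient` for the temporal derivative, and the choice to average the ratio over time steps with a nonzero trend are all assumptions made for illustration.

```python
import numpy as np


def balance_time_scale(aod, dt_years=1.0):
    """Estimate how long the forced trend needs to match the ensemble spread.

    aod: array of shape (n_members, n_times) holding AOD for one grid cell.
    dt_years: spacing between time steps, in years (assumed uniform).
    Returns a time scale in years.
    """
    aod = np.asarray(aod, dtype=float)
    forced = aod.mean(axis=0)              # forced response: ensemble mean
    internal = aod.std(axis=0, ddof=1)     # internal variability: ensemble spread
    trend = np.gradient(forced, dt_years)  # temporal derivative of the forced response

    # Skip time steps with a zero trend, where the ratio is undefined.
    valid = np.abs(trend) > 0
    ratio = internal[valid] / np.abs(trend[valid])
    return ratio.mean()
```

Applied to an array of shape (35, n_times), the function returns a value in years; near-zero trends inflate the mean ratio, so a median or a ratio of time means may be a more robust variant.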
c. Regression modeling and prediction
Random forest regression is an ensemble machine learning technique that works by aggregating decision tree models defined on subsamples of the feature space. Random forest regression improves upon the decision tree method (a technique that is often used in weather prediction) by aggregating over many decision trees to limit the risk of overfitting. In our analysis we use random forest models composed of 100 decision trees with each tree calculating a regression based on mean-squared loss in a randomly selected subset of the feature space. Our overall model is defined on a feature space that includes ENSO, AMO, AO, and PDO indices, months of the year and local SOx (SO2 + H2SO4) emissions as the input features and local AOD as the response. The spatially averaged winter (DJF) SOx emissions over all of China that are used in CESM LENS are shown in Fig. S2 in the supplemental material.
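A minimal sketch of one such regression, using scikit-learn, is given here for orientation. The library choice, the function name `fit_cell_model`, the synthetic data, and the column order are assumptions for illustration and are not taken from the study; the 70–30 split mirrors the one described for the per-grid-cell models.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def fit_cell_model(features, aod, seed=0):
    """Fit a 100-tree random forest for one grid cell and score it on held-out data."""
    x_train, x_test, y_train, y_test = train_test_split(
        features, aod, test_size=0.3, random_state=seed
    )
    # The default split criterion is squared error. Each tree is fit on a
    # bootstrap sample; max_features could be set to subsample features too.
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(x_train, y_train)
    return model, r2_score(y_test, model.predict(x_test))


# Synthetic stand-in data (hypothetical): four oscillation indices, month, SOx.
rng = np.random.default_rng(0)
n = 600
enso, amo, ao, pdo = rng.standard_normal((4, n))
month = rng.choice([12, 1, 2], size=n)
sox = rng.uniform(0.0, 1.0, size=n)
features = np.column_stack([enso, amo, ao, pdo, month, sox])
aod = 0.3 + 0.5 * sox + 0.05 * enso + 0.01 * rng.standard_normal(n)

model, r2 = fit_cell_model(features, aod)
print(f"test-set R2 = {r2:.2f}")
```

With real data, `features` would hold the oscillation indices, month, and local SOx emissions for one grid cell, and `aod` the corresponding CESM LENS AOD.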
The oscillation indices are defined for each ensemble member through NCAR’s Climate Variability Diagnostics Package (Phillips et al. 2014). The AO index is defined by the first empirical orthogonal function (EOF) of sea level pressure. The AMO index is defined by the detrended time series (i.e., with the effects of anthropogenic climate change removed) of North Atlantic SSTs (Trenberth and Shea 2006). The ENSO index used here is Niño-3.4, defined by area-weighted sea surface temperature anomalies over a specific region of the tropical Pacific (Wolter 1987). The PDO index is defined by the first EOF of the mean SST anomaly from November to March over the Pacific Ocean north of 20°N (Zhang et al. 1997). As the AMO, ENSO and PDO indices are calculated based on regional SSTs, we discuss briefly here the ocean component of CESM LENS, Parallel Ocean Program, version 2 (POP2; Smith et al. 2010). POP2 is a three-dimensional ocean general circulation model, which, in this case, uses 60 vertical layers. Given the lack of global oceanic observations in 1850, POP2 was initialized from a state of rest based on modern observations, taking advantage of the fact that the upper ocean equilibrates with the atmosphere on short time scales and that the long time scales associated with the deep ocean mean that modern observations at depth are reflective also of preindustrial conditions. Once the climate system reached a quasi equilibrium in the multicentury 1850 run, the CESM LENS simulation was run.
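As an illustration of an area-weighted SST index such as Niño-3.4, the sketch below averages SST anomalies over the conventional Niño-3.4 box (5°S–5°N, 170°–120°W), weighting by the cosine of latitude. The box bounds, the function name `nino34_index`, and the array layout are assumptions for illustration; this is not the Climate Variability Diagnostics Package implementation, which may apply further processing.

```python
import numpy as np


def nino34_index(sst_anom, lat, lon):
    """Area-weighted mean SST anomaly over the Nino-3.4 box.

    sst_anom: array (n_times, n_lat, n_lon) of SST anomalies.
    lat: latitudes in degrees north; lon: longitudes in degrees east (0 to 360).
    """
    sst_anom = np.asarray(sst_anom, dtype=float)
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)

    lat_mask = (lat >= -5.0) & (lat <= 5.0)
    lon_mask = (lon >= 190.0) & (lon <= 240.0)  # 170W to 120W in degrees east
    box = sst_anom[:, lat_mask][:, :, lon_mask]

    # On a regular lat-lon grid, cell area is proportional to cos(latitude).
    weights = np.cos(np.deg2rad(lat[lat_mask]))
    zonal_mean = box.mean(axis=2)               # shape (n_times, n_lat_in_box)
    return (zonal_mean * weights).sum(axis=1) / weights.sum()
```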
The spatial patterns of the oscillations are well represented by CESM LENS (Fig. S3). We divide China into grid cells, and our overall model is composed of individual random forest regression models for each grid cell. Each random forest model is trained on the same 70–30 train-test split of the data. While independent regressions are run for each individual grid cell, the spatial correlation structure is still accounted for because the oscillation data used as model inputs affect the meteorology, which connects pollution in nearby locations. To investigate potential delayed AOD responses to the natural oscillations, we added as inputs oscillation indices at 0 to 11 month lags. To minimize computation time, we removed lags that did not contribute importantly to the regression. Interactions between the different oscillations were accounted for in the feature model by creating separate indices for the product of each oscillation pair. After training the model at a given location, we predict AOD with the test set of inputs and compare to the actual CESM LENS value at the corresponding location, using R2 as an assessment of the model’s predictive power.
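The lagged and interaction inputs described above could be assembled along the following lines. The function name `build_features`, the column naming, and the choice to form pairwise products at zero lag only are assumptions for illustration; the study does not specify these details.

```python
from itertools import combinations

import numpy as np


def build_features(indices, max_lag=11):
    """Build lagged and pairwise-product features from monthly index series.

    indices: dict mapping an oscillation name to a 1-D monthly series
             (all series must have the same length n).
    Returns (names, matrix); rows correspond to months max_lag .. n-1.
    """
    series = {k: np.asarray(v, dtype=float) for k, v in indices.items()}
    n = len(next(iter(series.values())))
    names, cols = [], []

    for name, values in series.items():
        for lag in range(max_lag + 1):
            # Value of the index `lag` months before each target month.
            cols.append(values[max_lag - lag : n - lag])
            names.append(f"{name}_lag{lag}")

    for a, b in combinations(series, 2):
        # Interaction term: product of each oscillation pair (zero lag here).
        cols.append((series[a] * series[b])[max_lag:])
        names.append(f"{a}x{b}")

    return names, np.column_stack(cols)
```

Lags that contribute little could then be dropped, for example by inspecting the feature importances of a fitted random forest and refitting on the retained columns.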
Once trained and tested, the random forest models can be used to predict the effects of individual oscillations. To do this, we stochastically select percentile values for the oscillation indices for the negative and positive phases of the individual oscillations. Given the structure of the oscillation data, the negative phase of the oscillation is defined at approximately the 20th percentile and the positive phase is defined at approximately the 80th percentile. Predicting AOD with these modified oscillations in our overall model will provide AOD under negative and positive oscillation conditions at each spatial coordinate. This method is performed for each oscillation and the AOD differences between the positive and negative oscillations are used to assess how AOD changes when going from a positive-phased oscillation to the negative complement—or vice versa.
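A simplified version of this phase experiment is sketched below. It fixes one oscillation column at its 20th and 80th percentile values rather than sampling stochastically, leaves all other inputs unchanged, and expresses the difference relative to the negative phase; these choices, and the function name `phase_percent_difference`, are assumptions for illustration. Any fitted model with a `predict` method can be passed in.

```python
import numpy as np


def phase_percent_difference(model, features, column, low_pct=20, high_pct=80):
    """Percent change in predicted AOD from the negative to the positive phase.

    model: fitted regressor exposing predict(X).
    features: array (n_samples, n_features) of model inputs.
    column: index of the oscillation feature to perturb.
    """
    features = np.asarray(features, dtype=float)
    low_value, high_value = np.percentile(features[:, column], [low_pct, high_pct])

    negative = features.copy()
    negative[:, column] = low_value    # negative phase: about the 20th percentile
    positive = features.copy()
    positive[:, column] = high_value   # positive phase: about the 80th percentile

    aod_negative = model.predict(negative).mean()
    aod_positive = model.predict(positive).mean()
    return 100.0 * (aod_positive - aod_negative) / aod_negative
```

If lagged or interaction features derived from the same oscillation are present, they would need to be updated consistently before predicting.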
a. Comparing internal variability with the forced response
The time scales for the forced AOD response to balance internal AOD variability in China are shown in Fig. 1. Internal variability plays a significant role over the whole region and it takes at least roughly a decade for it to be balanced by the forced response throughout China. From the spatial distribution, it can be seen that the forced response is greatest in eastern China, which makes sense because of the region’s high levels of anthropogenic emissions. Despite this, it takes roughly a decade for the effects of the forced response to balance internal variability even over eastern China. We may conclude from Fig. 1 that natural climate variability is important for China’s haze and it is thus crucial to assess the impacts that each oscillation might have on AOD.
b. Regression modeling fit and predictions
These effects are investigated in the random forest regressor. The R2 fit is shown in Fig. 2. Strong fits (R2 > 0.9) are seen over eastern China, where pollution levels are highest and have the most important implications for human health. Notably, the fit is poorer in western China, where temporally averaged AOD is much smaller and exhibits a noisier response.
Percent AOD differences between the positive (80th percentile) and negative (20th) phases of the major climate oscillations investigated are shown in Fig. 3. The AOD distributions reveal that the impacts due to changing ENSO and PDO phases are the most substantial, with AOD differences of almost 10% as the climate system shifts from positive to negative phases of ENSO and PDO. With ENSO, the largest AOD response takes place over South China, where AOD increases during the positive phase (i.e., El Niño). Zhao et al. (2018) note a similar spatial pattern, which they associate with an anomalous anticyclone that forms over the Philippine Sea during El Niño responsible for transport of water vapor and aerosols to the region. With the AMO, there is a moderate increasing AOD trend over the studied region during the positive AMO phase relative to the negative. Wang et al. (2009) propose that a weakened winter land–sea temperature gradient over Eurasia during AMO+ may induce weakened westerlies/anomalous easterlies that moderate the transport mechanism removing pollution from the region. Therefore, this result also agrees with previous research. With the AO, there is a slight anomalously negative AOD during its positive phase over most of the region; however, the connection detected here, like the one for AMO, is weak. With the PDO, a dipole response can be seen, with lower AOD in the north and higher AOD in the south, corresponding likely to anomalous northerly winds during the positive phase transporting pollution southward. This could be connected also to the similar ENSO pattern seen here as proposed by Zhao et al. (2018).
c. Physical explanations
Now that potential connections between ENSO, AMO, AO, and PDO indices and China’s wintertime AOD have been established from our regression models, we will discuss these effects in terms of potential physical explanations. Figure 4 shows the composite difference in sea level pressure (SLP) and 850-hPa winds between positive (80th percentile) and negative (20th) oscillation phases. During El Niño, there is anomalously high pressure over the Philippine Sea, consistent with the proposed anomalously high pressure system responsible for the transport of water vapor and aerosol into South China. Southwesterly winds are observed in this region during El Niño, which is also consistent with this result. The ENSO SLP response is noted also by Li et al. (2017). There is not a notable SLP response to the different AMO phases, but the winds tend to confine local pollution over the region more during AMO+, which explains why more pollution is predicted by the regression. The difference between AO phases shows a slight strengthening and eastward shift of the Siberian high and a weakening of the Aleutian low, which explains the highly uncertain response pattern displayed in Fig. 3. This result agrees with previous literature, such as Wu and Wang (2002), who note that the AO and Siberian high are parts of a coupled system where the AO can affect the Siberian high, but the relationship is not consistent over time. It can also be seen in Fig. 4 that the positive phase of the PDO has both a stronger Siberian high and a deeper Aleutian low relative to the negative phase, which increases the pressure gradient contributing to a more powerful East Asian winter monsoon. This is reflected in the enhanced westerly winds in North China during PDO+ that transport much of the heavily polluted air offshore. There is concurrently an anomalous anticyclone that forms over the Philippine Sea, much like the El Niño response, explaining the north–south dipole response seen in AOD.
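A composite difference of the kind shown in Fig. 4 can be sketched as below. Here the positive composite is taken over months with the index at or above its 80th percentile and the negative composite over months at or below its 20th percentile; this thresholding, and the function name `composite_difference`, are assumptions for illustration, since the exact selection used for the figure is not specified.

```python
import numpy as np


def composite_difference(field, index, low_pct=20, high_pct=80):
    """Mean field in the positive phase minus mean field in the negative phase.

    field: array (n_times, n_lat, n_lon), e.g. sea level pressure.
    index: array (n_times,) holding the oscillation index.
    """
    field = np.asarray(field, dtype=float)
    index = np.asarray(index, dtype=float)
    low, high = np.percentile(index, [low_pct, high_pct])

    positive = field[index >= high].mean(axis=0)  # positive-phase composite
    negative = field[index <= low].mean(axis=0)   # negative-phase composite
    return positive - negative
```

The same function can be applied separately to each wind component or to the 500-hPa temperature field.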
Temperature differences at 500 hPa for the four oscillations are displayed in Fig. 5. South China cools while the adjacent waters warm during El Niño, amplifying the winter land–sea temperature gradient and encouraging stronger westerly winds that could enhance the anomalous anticyclone formed over the region. There is a general warming over East Asia compared to the western Pacific during AMO+, which weakens the land–sea temperature gradient and could explain why winds are more westward in China during the positive phase. This agrees with Wang et al. (2009), who note that land warming over Eurasia during AMO+ takes place in the mid- and upper troposphere. The AO shows a temperature pattern similar to that of the AMO, albeit with more exaggerated trends, which is realized in the presence of anomalous easterlies during the positive phase. Similarly, the PDO composites match the general ENSO pattern, with a cooler South China and warmer Pacific encouraging the anomalous anticyclone.
4. Discussion and conclusions
The main purpose of this study was to evaluate the relative importance of emissions and natural climate variability on China’s wintertime haze using the CESM LENS. This could allow for refined prediction of severe haze events as the model allows a level of predictive granularity of up to a few months in advance for the climate oscillations investigated here. We found that random forest regressions form the basis of a framework that could be used to predict AOD with high fidelity (R2 > 0.9) over China’s eastern half using as input parameters only climate oscillations, the month of the year and emissions. The framework provides a means with which to assess the countrywide contributions attributable to the oscillations and emissions. The results derived from the regression are notably consistent with past findings and with meteorological changes observed under changing oscillation states.
A large difference is seen in South China’s pollution between the different ENSO phases, a result that agrees with Zhao et al. (2018), who proposed a physical explanation involving formation of an anomalous anticyclone over the Philippine Sea during El Niño resulting in transport of additional water vapor and aerosols to the region, a suggestion supported by the present analysis. We find also an increase in AOD during AMO+ relative to AMO−, which agrees with the AMO theory proposed by Wang et al. (2009), in which higher Eurasian winter temperatures during AMO+ are responsible for weakening the land–sea temperature gradient, providing impetus for anomalous westward winds relative to AMO−. The effects from the AO are subject to large variance, which limits our understanding of its effects (if any) on China’s AOD. The PDO exhibits a dipole response in AOD: a decline in North China during the positive phase, connected potentially to a stronger Siberian high–Aleutian low pressure gradient, and an increase in South China, connected to an anomalous anticyclone like the one that forms during El Niño. The largest overall effects are attributable to changing ENSO and PDO phases, with changes as large as 21.9% in AOD for ENSO and 9.8% for PDO.
There are a number of limitations that should be noted for this study. First, we investigate here internal variability and how it manifests itself in terms of a specific CESM ensemble. While we have found physical explanations for the CESM LENS AOD variability that align with previous studies, a next step in this research will be to employ a number of different general circulation models imposing conditions that mirror positive and negative phases of the studied oscillations to further validate the present results. We could also implement visibility data from meteorological stations, where the longest datasets extend back to the 1950s. While this comparison could be done, weak spatial coverage along with missing data render these observations unhelpful for a data problem that is already very noisy. Evaluation of observed AOD trends is difficult apart from the most recent period (i.e., 2000–present), which is too limited a historical scope to investigate connections with oscillations whose phases can last for many decades. We have compared the magnitude of CESM LENS AOD with AOD measurements from the MODIS instruments (Fig. S1). While CESM LENS underestimates the magnitude of AOD in China, it does capture the general spatial structure as observed from MODIS: low AOD in western and central China and high AOD in southern and eastern China. The observational data are indicative of the reliability of our results since the spatial structure compares favorably with that from CESM LENS. Another limitation is the CESM LENS aerosol–chemistry scheme, which is simplified in both its mode scheme and its treatment of NH3 and ammonium. The scheme also limits our ability to accurately represent anthropogenic emissions in our regression models, where we use only SOx emissions as input and are unable to include NOx emissions.
Given that SOx emissions also correlate significantly with NOx, particulate organic matter and secondary organic aerosol emissions (not shown), there is good reason to believe that SOx emissions should provide a good surrogate for anthropogenic emissions, and this is reflected in the strong R2 over East China. It is important to note that China’s haze formation involves complicated processes with many questions remaining to be answered (e.g., Song et al. 2018). However, since this research is investigating dynamical effects related to natural climate variability, issues associated with CAM5 chemistry should not have a significant bearing on our results.
We thank H. Seo for insightful feedback on an earlier draft of the paper. We also thank J.F. Lamarque for advice early in the genesis of the project. This study was supported by the Harvard Global Institute. ATA thanks NERC through NCAS for funding for the ACSIS project.
Supplemental information related to this paper is available at the Journals Online website: https://doi.org/10.1175/JAMC-D-19-0035.s1.