1. Introduction
Drought is a cumulative phenomenon caused by an extended period of deficient precipitation, often combined with local factors that exacerbate conditions in the land surface (e.g., accelerated drying due to exceptional heat; Miralles et al. 2019) and in the atmosphere (e.g., enhanced stability due to reduced humidity or large-scale subsidence; Wang et al. 2015). Anomalous circulation features (e.g., persistent ridges) are a prime source of drought and can often be associated with anomalies in sea surface temperatures (SST) hundreds to thousands of kilometers away linked by Rossby wave trains in the atmosphere (e.g., Schubert et al. 2008, 2009; S.-Y. Wang et al. 2014).
Many previous studies have shown that the SST anomalies in the Indo-Pacific and Atlantic Oceans remotely generate drought-inducing atmospheric circulation anomalies over the United States (e.g., Hoerling and Kumar 2003; Schubert et al. 2004; Seager et al. 2005; Seager and Hoerling 2014). Particularly, observational studies and AMIP simulations have established that La Niña events tend to cause major droughts in the Southwest United States, northern Mexico, and the southern Great Plains (e.g., Mason and Goddard 2001; Herweijer et al. 2006; Feng et al. 2008; Seager et al. 2014; Schubert et al. 2016; Mo and Lettenmaier 2018). Also, Schubert et al. (2004) linked the multiyear droughts in the Great Plains with cold SST anomalies in the central-eastern Pacific associated with the interdecadal Pacific oscillation (IPO) tropical branch. Hoerling and Kumar (2003) found that in addition to cold SST anomalies in the eastern tropical Pacific, remnants of warm SST anomalies from the strong 1997/98 El Niño event in the western Pacific and the Indian Oceans induced a multiyear U.S. drought from 1998 to 2002. Seager et al. (2015) showed that the 2012/13 and 2013/14 winter droughts in California were related to a warm west/cool east SST dipole in the tropical Pacific, whereas Hartmann (2015) and Seager and Henderson (2016) found that the warm SST anomalies in the northwestern tropical Pacific played a stronger role in forcing the ridge in the North American west coast during the 2013/14 winter. In addition, the Atlantic multidecadal variability (AMV) can influence the U.S. summer drought (e.g., Enfield et al. 2001; Schubert et al. 2009; Findell and Delworth 2010; Kushnir et al. 2010; Nigam et al. 2011).
On the other hand, the teleconnections induced by the respective SST patterns in the Pacific and Atlantic Oceans sometimes enhance each other over the United States. As a result, the interference of the teleconnections may be as important as the individual ones in influencing U.S. drought (e.g., H. Wang et al. 2014). Hu and Huang (2009) found that a La Niña coinciding with a negative phase of the Pacific decadal oscillation (PDO) is more likely to cause a strong drought in the Great Plains. Mo et al. (2009) noted that positive AMV enhances U.S. drought more efficiently when La Niña conditions prevail. Although the direct influence of the AMV on U.S. drought may be relatively small, it can modulate the La Niña effect. Therefore, a cold tropical Pacific combined with a warm tropical North Atlantic forms the strongest oceanic forcing of U.S. drought (e.g., Schubert et al. 2008). More recently, Huang et al. (2019) showed that since the 2000s, forcing from SST anomalies in the tropical North Atlantic with opposite sign to those in the tropical Pacific has become a significant factor for U.S. summer precipitation prediction, possibly because the AMV index experienced a phase transition from the late 1990s to the early 2000s.
Meanwhile, it has long been known that there is a positive correlation between antecedent soil moisture conditions and subsequent precipitation over the central and western United States (Namias 1960). The theory of land–atmosphere feedbacks states that positive feedback ensues whereby dry soil conditions alter the partitioning of surface net radiation away from evaporation toward sensible heat flux (Koster and Suarez 1996; Betts 2004; Tawfik and Dirmeyer 2014; Dirmeyer et al. 2021), amplifying and prolonging droughts by suppressing moist convection (Santanello et al. 2018). However, there may be dry soil advantage situations where increased sensible heat flux can favor local convection (Ford et al. 2015; Roundy and Wood 2015). It is much more common for local land–atmosphere feedback to lead to drought enhancement (Findell and Eltahir 2003a,b). There may also be preconditioning of land and atmosphere that makes drought more likely or its onset more rapid (Ford and Labosier 2017; Otkin et al. 2018).
Over much of the United States, land–atmosphere feedback becomes active in spring, when regions transition from an energy-limited state to a moisture-limited one in terms of the drivers of surface evapotranspiration (Guo et al. 2012). When this transition occurs, soil moisture anomalies may begin to assert control over surface fluxes, affecting near-surface meteorology and boundary layer growth (Santanello et al. 2009, 2011, 2013). Land surface impacts on regional boundary layer properties, exerted through surface fluxes that affect cloud formation and precipitation, can be linked to anomalies in soil moisture and vegetation (Findell et al. 2017). There can even be a delayed hydrologic effect in snowy areas as winter snowpack anomalies manifest as spring and summer soil moisture anomalies (Xu and Dirmeyer 2011, 2013; Shin et al. 2020a).
The effects are not always local. Modeling studies have shown that soil moisture anomalies in various parts of the United States can also have significant regional and remote effects on precipitation (Koster et al. 2016). Nonlocal impacts can operate both by directly affecting moisture supply to the atmosphere downstream (Herrera-Estrada et al. 2017) and by altering the circulation as a secondary result of land-triggered heating anomalies in the lower troposphere (Xue et al. 2018; Miralles et al. 2019; Schumacher et al. 2019). Land–atmosphere coupling is also scale dependent (Holgate et al. 2019). Once established, droughts can be amplified or extended by local-to-regional feedbacks involving the land surface (Fernando et al. 2016). Thus, time scales (i.e., lags), spatial reach (from local to remote; cf. Koster et al. 2016), and migration (Herrera-Estrada et al. 2017) often combine into a complex interplay of factors.
Land surface feedbacks may interact with the ocean drivers. Using “identical twin” experiments with only different land initial conditions (NASA GLDAS2 vs NCEP CFSR) in reforecasts with CFSv2, Shin et al. (2020a) demonstrated that soil moisture anomalies at forecast initialization can persist for several weeks as they closely interact with near-surface temperature anomalies. Shin et al. (2020b) confirmed that in addition to remote SST forcing, realistic representation of land forcing (i.e., soil moisture) over the United States is critical for prediction of U.S. severe drought events approximately one season in advance.
To understand links between land–ocean surface states and drought, many previous studies have applied linear statistics, such as correlations and RMSE, under the assumption that critical anomalies are normally distributed. In particular, Pearson’s correlation coefficient is one of the best-known and most widely used methods to identify and quantify relationships between two time series in the Earth climate system. Although it is sensitive only to linear relationships, it has been widely used because its computation and interpretation are straightforward, it can distinguish positive relationships from negative ones, and its statistical significance is easily determined. However, nonlinear aspects that correlations cannot detect are potentially inherent in the land–atmosphere and ocean–atmosphere interactions described above. For instance, precipitation is bounded by zero on the low end, so it is not normally distributed except on very long time scales; soil moisture is bounded on both ends, often leading to skewed distributions; SST and land surface anomalies of equal magnitude but opposite sign rarely result in equal-but-opposite atmospheric responses; and many atmospheric processes, especially moist processes, often exhibit threshold behavior at a critical value of some environmental variable.
Metrics from information theory, on the other hand, provide novel extensions to conventional linear/Gaussian-based statistics, which can discern nonlinearity and potentially causal relationships in the complex Earth system (e.g., DelSole and Tippett 2007; Ruddell and Kumar 2009a,b; Ruddell et al. 2016; Papagiannopoulou et al. 2017; Runge et al. 2019; Goodwell 2020). For instance, mutual information (MI) is a nonparametric measure of the dependence between two variables, similar to a correlation but without the assumption of linearity in the relationship. As an alternative to Pearson’s correlation coefficient, MI has been introduced for the analysis of statistical dependence in the highly nonlinear climate system (e.g., Hlinka et al. 2014; Ruddell et al. 2016; Goodwell and Kumar 2017; Kumar et al. 2018; Goodwell and Kumar 2019; Hsu and Dirmeyer 2021). In particular, MI and its related metrics have recently been used for drought-related research (e.g., Vaheddoost and Safari 2021; Li et al. 2022; Wu et al. 2022). However, MI also has some shortcomings, including more complicated implementation and interpretation, no positive or negative polarity, and typically higher computational cost.
Since correlation and MI each have their own advantages and disadvantages, we propose a joint approach that combines them to investigate linearity-dominant and nonlinearity-dominant relationships together. In this study, we address possible issues that arise when MI is applied to two related variables that fluctuate in space and/or time. We then demonstrate how well this joint approach identifies dependences between time-lagged land–ocean surface states and summer drought in the southern Great Plains (SGP) for 1982–2018 as an example. The main objective of this study is to describe this new methodology.
Section 2 describes and contrasts correlation and normalized MI (NMI), their relationship and performance in some idealized cases, as well as the datasets used and their preprocessing. Main results based on a joint approach using correlation and NMI together are presented in section 3. Summary and conclusions are given in section 4.
2. Data and methodology
a. Correlation
b. Mutual information
1) Shannon entropy
For a discrete random variable X with a probability distribution p(x) for which
2) Normalized MI
The first step in computing MI is to determine a pair of optimal bin numbers. For simplicity, let us call X and Y the target and source variables, respectively. In this study, our target variable is a single time series of a drought index related to anomalies of summer precipitation averaged over SGP, while the source variables are time series of SST or soil moisture anomalies that vary in space as well as in lag time. Following Knuth (2019), who proposed a Bayesian approach that describes probability distributions with a piecewise constant model whose optimal bin width is estimated from the data, we found that the pair of optimal bin numbers for computing MI between our target and source variables is quite inhomogeneous in space over the entire domain (i.e., CONUS or global ocean), and that even at a given location it varies with lag time (not shown). This makes it difficult to compare values of MI across locations and times and, therefore, to interpret the results of the MI analysis.
c. Correlation versus NMI: Four extreme cases
The following four idealized extreme cases provide insight into the characteristics of NMI versus correlation. The variable X in Fig. 1 is a time series with a sample size of 1000 varying between −5 and 5, which was constructed by randomly selecting 100 numbers between −5 and −4 [−5 < X(t) < −4; 0 < t ≤ 100], another 100 random numbers between −4 and −3, and so on. Figure 1 shows four scatter diagrams of X and another variable Y constructed using a similar process. Here, we use 10 equal-width bins for each variable to compute NMI, so the yellow grid shows a total of 100 bins. The first case is Y(t) = X(t), for which only the 10 bins along the diagonal from the bottom-left to the top-right corner are occupied, with about 100 samples per bin (Fig. 1a). A gray dot marks each pair [X(t), Y(t)], but the dots are hidden beneath the linear regression line, which indicates a perfect linear relation between X and Y with correlation = 1. NMI also equals 1 as formulated in section 2b(2) and Eq. (2), denoting that Y shares 100% of its information with X; in other words, no uncertainty of X remains when Y is given.
The second case is the same as the first one except that some samples lie off the linear regression line (Fig. 1b). For the second and third (eighth and ninth) bins from the bottom-left corner, about 50 time steps per bin are relocated to their respective opposite side of the vertical axis with |Y| = 4.5. The correlation drops below 0.4, but the NMI is still quite high, around 0.85. This suggests that outliers amounting to only about 20% of the samples can severely degrade the correlation, whereas they have less effect on the NMI. This is further confirmed by the third extreme case (Fig. 1c). If the sign of half of Y(t) from each bin is reversed, the correlation becomes zero; that is, it detects no linear relationship at all. However, NMI is 0.67, meaning that X and Y share 67% of their information, so 67% of the uncertainty of X could be removed if Y is known. These examples demonstrate that NMI can elucidate relationships between variables that correlation cannot detect. Nevertheless, one should recall that, for the linear aspect of the relationship, only correlation tells us whether the two variables are positively or negatively related; NMI does not. For instance, even if X and Y were identical but of opposite sign (i.e., correlation is −1) in Fig. 1a, NMI would still be 1.
The last case is when each of the 100 bins contains the same number of samples, i.e., 10 gray dots per bin (Fig. 1d). In this extreme case, there is no shared information and therefore no relationship between the two variables, as both correlation and NMI are zero. This is a characteristic of MI and NMI: the minimum value is not attained when data are scattered arbitrarily (e.g., random noise) but rather when they are distributed uniformly among all chosen bins. As a corollary, NMI is large when the points cluster in one or more regions of the phase space, leaving other areas with a low density of points.
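The idealized cases can be reproduced approximately with a short script. The following is a minimal sketch rather than the code used in this study: it assumes that NMI is the histogram-based MI divided by the Shannon entropy of X (the exact normalization of Eq. (2) is not reproduced here), and the helper name `nmi`, the random seed, and the way samples are drawn are illustrative choices. Because of these assumptions, the NMI values it yields need not match those quoted above.

```python
import numpy as np

def nmi(x, y, bins=10, lim=(-5.0, 5.0)):
    """MI from an equal-width 2D histogram, divided by H(X) (assumed normalization)."""
    counts, _, _ = np.histogram2d(x, y, bins=bins, range=[lim, lim])
    pxy = counts / counts.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    indep = np.outer(px, py)          # joint distribution if X and Y were independent
    nz = pxy > 0                      # empty cells contribute nothing to MI
    mi = np.sum(pxy[nz] * np.log(pxy[nz] / indep[nz]))
    hx = -np.sum(px[px > 0] * np.log(px[px > 0]))
    return mi / hx

rng = np.random.default_rng(0)
lows = np.arange(-5, 5)               # lower edges of the 10 unit-width bins

# X: 100 random values in each unit interval from -5 to 5 (1000 samples).
x = np.concatenate([rng.uniform(lo, lo + 1, 100) for lo in lows])

# Case (a): Y identical to X.
y_a = x.copy()

# Case (c): reverse the sign of half of the Y values taken from each bin.
y_c = x.copy()
for k in range(10):
    y_c[100 * k: 100 * k + 50] *= -1

# Case (d): 10 samples in each of the 100 cells of the 10 x 10 grid.
cells = [(i, j) for i in lows for j in lows]
x_d = np.concatenate([rng.uniform(i, i + 1, 10) for i, _ in cells])
y_d = np.concatenate([rng.uniform(j, j + 1, 10) for _, j in cells])

results = {
    "a": (np.corrcoef(x, y_a)[0, 1], nmi(x, y_a)),
    "c": (np.corrcoef(x, y_c)[0, 1], nmi(x, y_c)),
    "d": (np.corrcoef(x_d, y_d)[0, 1], nmi(x_d, y_d)),
}
for case, (r, m) in results.items():
    print(f"case {case}: correlation = {r:+.3f}, NMI = {m:.3f}")
```

Under these assumptions, case (a) gives correlation and NMI of 1, case (c) gives a near-zero correlation with a large NMI, and case (d) gives an NMI of zero, which is the qualitative behavior described for Figs. 1a, 1c, and 1d.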
d. Observational data
Daily rainfall data over CONUS are from the Climate Prediction Center (CPC) unified gauge-based analysis of precipitation at 0.25° latitude × 0.25° longitude resolution (Chen et al. 2008). The NOAA High-Resolution (0.25° latitude × 0.25° longitude) Optimal Interpolation (OI) SST, version 2 (OISST), data (Reynolds et al. 2007; Huang et al. 2021) have been used, which are available from the NOAA Physical Sciences Laboratory. The Soil MERGE (SMERGE) root zone (0–40 cm) volumetric soil moisture data (Crow and Tobin 2018; Tobin et al. 2017, 2019) at 0.125° latitude × 0.125° longitude resolution have been used. All data are freely available online, with links provided in the data availability statement.
e. Data preprocessing
All daily datasets are averaged to pentads (5-day means) to remove weather “noise” but retain submonthly variability over the period of 1982–2018. To characterize meteorological drought, the standardized precipitation index (SPI; McKee et al. 1993) has been commonly used, typically estimated at monthly intervals (e.g., Mo and Lyon 2015; Shin et al. 2020b). Following Wu and Dirmeyer (2020), who discussed the evolution of drought including its onset, persistence, and demise, we have constructed a 6-pentad SPI on pentad intervals, which is referred to as SPI-6P. There are 73 pentads in each year, and SPI-6P at any given pentad t is computed from precipitation over pentads t − 5 to t. Figure 2 shows SPI-6P based on pentad precipitation averaged over SGP (105°–95°W, 26°–38°N; U.S. land grid cells within the green box in Fig. 3a). Positive values indicate anomalously wet conditions, and negative values indicate anomalously dry conditions. According to the drought classification of the U.S. Drought Monitor by the National Drought Mitigation Center at the University of Nebraska–Lincoln and NOAA, severe, extreme, and exceptional droughts correspond to SPI ranges from −1.2 to −1.6, from −1.6 to −2.0, and beyond −2.0, respectively. For each drought event over SGP, the life cycle from onset to demise is easily recognized in Fig. 2; examples are the 1998 Oklahoma–Texas drought lasting from spring to late summer and the persistent 2011 drought, which started in fall 2010 and peaked in the spring and summer of 2011. Figure 2 is also in good qualitative agreement with the 9-month SPI over Texas (Fig. S1 in the online supplemental material), while revealing shorter-term variations than can be seen using monthly or longer SPI.
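The pentad averaging and the 6-pentad window described above can be sketched as follows. This is a minimal illustration, not the preprocessing code of this study: the function names `daily_to_pentad` and `trailing_sum` are hypothetical, leap days are assumed to have been removed beforehand so that each year has 365 days, and the final SPI step (fitting a distribution to the accumulated precipitation and mapping it to a standard normal variate) is omitted.

```python
import numpy as np

def daily_to_pentad(daily):
    """Average a (n_years, 365) array of daily values into (n_years, 73) pentad means."""
    n_years, n_days = daily.shape
    if n_days != 365:
        raise ValueError("expected 365 days per year (leap days removed beforehand)")
    return daily.reshape(n_years, 73, 5).mean(axis=2)

def trailing_sum(series, window=6):
    """Accumulate a 1D pentad series over pentads t-5..t (for window=6).

    The first window-1 entries are NaN because the window is incomplete there.
    """
    series = np.asarray(series, dtype=float)
    out = np.full(series.shape, np.nan)
    csum = np.concatenate(([0.0], np.cumsum(series)))
    out[window - 1:] = csum[window:] - csum[:-window]
    return out

# Synthetic daily precipitation for 37 years (illustrative values only).
rng = np.random.default_rng(0)
daily = rng.gamma(shape=0.5, scale=4.0, size=(37, 365))

pentad = daily_to_pentad(daily)            # shape (37, 73)
accum = trailing_sum(pentad.ravel())       # continuous series across years
print(pentad.shape, accum[:7])
```

Flattening the pentad array before accumulating lets the window at the start of a year reach back into the last pentads of the previous year.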
3. Results
Figure 3 displays spatial distributions of correlation and NMI between SPI-6P over SGP in summer and anomalies of soil moisture and SST for the same time windows (i.e., with no time lag). Soil moisture anomalies over SGP show strong positive correlation with precipitation anomalies there, consistent with previous studies (e.g., Koster et al. 2004). This local coupling simply indicates that anomalously dry (wet) soils are associated with below-normal (above-normal) precipitation at the same time. In the northwestern United States and California, in contrast, soil moisture anomalies are negatively correlated with precipitation anomalies over SGP (Fig. 3a). This dipole-like spatial pattern is also seen in the map of their NMI, with the strongest dependence centered in Texas and Oklahoma, although the NMI is necessarily all positive (Fig. 3c). A negative correlation is also found in the upper Midwest. If we focus on values of NMI > 0.03 (i.e., green colors onward in Fig. 3c; a choice that is justified later in this paper), the similarity between the spatial distributions of correlation and NMI is generally apparent.
For SST anomalies, relatively large correlation coefficients seem to be mainly confined to the Pacific Ocean: positive correlation is found in the subtropical central-eastern Pacific and northeastern Pacific of the Northern Hemisphere (NH), along the intertropical convergence zone (ITCZ), and near the South Pacific convergence zone (SPCZ), while regions showing negative correlation appear in the extratropical Pacific in each hemisphere (Fig. 3b). It is noteworthy that the correlation pattern in the North Pacific is reminiscent of the PDO. In the South Pacific, it may reflect the change in the position and intensity of the SPCZ (e.g., Kidwell et al. 2016) and resemble the South Pacific dipole pattern of SST (e.g., Saurral et al. 2018). Over these regions, NMI > 0.03 predominates, but areas with the blue and purple colors (NMI < 0.03) largely correspond to the blank areas in Fig. 3b where the correlation is insignificant (Fig. 3d). This may give us a hint about a more practical significance test of NMI, as will be shortly discussed. It is noteworthy that the trend removal does not change patterns but slightly enhances the magnitudes of both correlation and NMI (not shown).
a. Significance tests
Significance testing of NMI is not as simple as that of correlation. Generally, it can be established only by Monte Carlo methods, such as shuffling (shifting) source and target variables randomly in time, bootstrapping (sampling with replacement), or repeating with synthetic noise variables. We have tested a Monte Carlo method in this study by shuffling the source and target variables randomly among the 37 years (without rearranging the 25 pentads within each year, to maintain realistic subseasonal consistency) and repeating the process 1000, 500, and 200 times to examine statistical stability. The value of NMI is then considered significant only if it lies beyond a selected level (e.g., 95% or 90%) of the random sample distribution. However, two crucial issues have been identified in this approach. First, since the resultant significance threshold is determined locally, it can differ considerably with location as well as with lag time. Second, a huge amount of computational time is required for this kind of significance test even with only 200 shuffles, which is about as small a set as one should use to determine 95% confidence levels.
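The year-wise shuffling test at a single location can be sketched as below. This is an illustrative sketch under stated assumptions, not the implementation used in this study: the function names `nmi` and `shuffle_threshold` are hypothetical, NMI is taken to be histogram-based MI divided by the entropy of the first argument (the normalization of Eq. (2) is not reproduced here), 10 equal-width bins are assumed, and the data are synthetic.

```python
import numpy as np

def nmi(x, y, bins=10):
    """Histogram MI divided by H(X); the normalization is an assumption."""
    counts, _, _ = np.histogram2d(np.ravel(x), np.ravel(y), bins=bins)
    pxy = counts / counts.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    mi = np.sum(pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz]))
    hx = -np.sum(px[px > 0] * np.log(px[px > 0]))
    return mi / hx

def shuffle_threshold(source, target, n_shuffle=200, level=0.95, seed=0):
    """Quantile of the null NMI distribution obtained by permuting whole years.

    source, target: arrays of shape (n_years, n_pentads). Years are permuted as
    blocks, so the order of pentads within each year is left untouched.
    """
    rng = np.random.default_rng(seed)
    n_years = source.shape[0]
    null = np.empty(n_shuffle)
    for i in range(n_shuffle):
        s = source[rng.permutation(n_years)]
        t = target[rng.permutation(n_years)]
        null[i] = nmi(s, t)
    return np.quantile(null, level)

# Synthetic example: 37 years x 25 summer pentads, with built-in dependence.
rng = np.random.default_rng(1)
source = rng.normal(size=(37, 25))
target = source + 0.3 * rng.normal(size=(37, 25))

observed = nmi(source, target)
threshold = shuffle_threshold(source, target)
print(f"observed NMI = {observed:.3f}, 95% shuffle threshold = {threshold:.3f}")
```

Repeating this at every grid point and lag is what makes the approach expensive, and each location obtains its own threshold, which are the two issues noted above.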
Instead, we propose a more practical compromise for the significance test of NMI, based on a single threshold that applies everywhere and at every lag. Because we attempt to identify regions in which greater shared information exists by using correlation and NMI together, we are not interested in regions having small values of NMI, i.e., the purple and blue shaded areas in Figs. 3c and 3d. Thus, we compute the cumulative distribution function (CDF) of NMI over CONUS for soil moisture anomalies and over global oceans between 50°S and 60°N for SST anomalies as source variables, and find the value of NMI at which the CDF equals 0.8 at each lag time (solid red and blue curves in Fig. 4a). The median of those values over lags of 0 to 24 pentads is 0.0331 for soil moisture as the source variable and 0.0299 for SST. We define these as the single-threshold significance levels of NMI in this study (dashed red and blue lines in Fig. 4).
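The single-threshold procedure reduces to a quantile per lag followed by a median across lags, as in the sketch below. The function name `single_threshold`, the array layout (lag, latitude, longitude), the use of NaN to mask points outside the domain, and the synthetic input are all assumptions made for illustration; no area weighting of grid cells is applied here, and whether any was used in this study is not stated.

```python
import numpy as np

def single_threshold(nmi_maps, cdf_level=0.8):
    """Return the per-lag NMI value at the given CDF level and its median over lags.

    nmi_maps: array of shape (n_lags, n_lat, n_lon); NaN marks points outside
    the analysis domain (e.g., ocean points for a soil moisture source).
    """
    n_lags = nmi_maps.shape[0]
    flat = nmi_maps.reshape(n_lags, -1)
    per_lag = np.nanquantile(flat, cdf_level, axis=1)
    return per_lag, np.median(per_lag)

# Synthetic maps for 25 lags (0 to 24 pentads); values are illustrative only.
scale = np.linspace(0.02, 0.06, 25)
field = np.linspace(0.0, 1.0, 101).reshape(1, 101, 1)
nmi_maps = scale[:, None, None] * field

per_lag, threshold = single_threshold(nmi_maps)
print(f"single threshold = {threshold:.4f}")
```

Using the median across lags keeps the threshold from being dominated by the few short lags at which strong local coupling inflates the upper tail of the distribution.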
The green curve in Fig. 4b indicates NMI between the target variable (SPI-6P over SGP in summer) and time-lagged target variable itself. As this is similar to autocorrelation, let us refer to it as auto-NMI: it is 1 with no time lag as the maximum of NMI as X and Y are identical [Eq. (2)]. The auto-NMI quickly drops to about 0.4 at 1-pentad lag, 0.25 at 2-pentad lag, and 0.1 at 4-pentad lag, respectively, corresponding to 40%, 25%, and 10% reduction of the uncertainty in the target variable when its respective past is given. It further declines below 0.05 after 6-pentad lag. In other words, the past state of the target variable beyond 6 pentads (i.e., 30 days) can explain less than 5% of the information in the target variable. The significance level of NMI with soil moisture anomalies over CONUS corresponds to the auto-NMI of the target variable with about 1.5 months lag (i.e., 9 pentads; the lag time when the green curve first crosses the dashed red line in the right panel of Fig. 4b). For the NMI with global SST, the significance level corresponds to the auto-NMI at 3-month lag (18 pentads). One may now have a sense of the NMI significance over SGP shown in Fig. 3c, representing strong land–atmosphere coupling. This strong local coupling is partially responsible for the relatively large values of the red curve at small lags in Fig. 4a, with a gradual decrease from 0- to 5-pentad lag.
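The idea of auto-NMI can be illustrated with a synthetic series. The sketch below is not the calculation behind Fig. 4b: it assumes a continuous first-order autoregressive series rather than summer SPI-6P, 10 equal-width bins, and NMI defined as histogram-based MI divided by the entropy of the first argument; the names `nmi` and `auto_nmi` are hypothetical.

```python
import numpy as np

def nmi(x, y, bins=10):
    """Histogram MI divided by H(X); the normalization is an assumption."""
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = counts / counts.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    mi = np.sum(pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz]))
    hx = -np.sum(px[px > 0] * np.log(px[px > 0]))
    return mi / hx

def auto_nmi(series, lag, bins=10):
    """NMI between a series and the same series `lag` steps earlier."""
    if lag == 0:
        return nmi(series, series, bins)
    return nmi(series[lag:], series[:-lag], bins)

# Synthetic AR(1) series standing in for the target variable.
rng = np.random.default_rng(0)
n = 2000
noise = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + noise[t]

curve = [auto_nmi(x, lag) for lag in range(13)]
for lag in (0, 1, 2, 4, 6, 12):
    print(f"lag {lag:2d}: auto-NMI = {curve[lag]:.3f}")
```

Reading off the lag at which such a curve falls below a chosen NMI threshold gives the kind of memory-based interpretation of the significance level discussed above.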
b. Combining correlation and NMI
Using these criteria of significance for NMI, scatter diagrams of correlation and NMI can be divided into four different regimes (Fig. 5). NMI is significant if it is greater than or equal to its single-threshold significance level, which is 0.0331 for anomalies of soil moisture as a source variable and 0.0299 for SST. Meanwhile, correlation is significant if its magnitude is greater than 0.1, for either positive or negative correlations, based on sample size. Blue dots in Fig. 5 are referred to as Regime-C+MI, where both correlation and NMI are significant; red dots are referred to as Regime-MI, where only NMI is significant. Since NMI, unlike correlation, reveals the total dependence with no assumption of linearity, we define Regime-C+MI in this study as a linearity-dominant regime, while Regime-MI is nonlinearity dominant, typically indicating a notable degree of clustering within the distribution. It is evident that NMI tends to increase as the absolute value of correlation increases in Regime-C+MI, most clearly in Fig. 5a. On the other hand, there are two other regimes in which NMI is statistically insignificant (Fig. 5): green dots are referred to as Regime-C, in which only correlation is significant, whereas the final regime (gray dots) is the case in which neither correlation nor NMI is significant.
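The four-regime partition follows directly from the two significance criteria and can be written compactly. In this sketch the function name `classify_regime` and the label "none" for the insignificant regime are illustrative; the thresholds are those given in the text (|correlation| > 0.1 and NMI at or above the single threshold). The first test pair uses the values quoted in this paper for location A, while the remaining pairs are made-up examples.

```python
import numpy as np

def classify_regime(r, nmi, nmi_threshold, r_threshold=0.1):
    """Label each (correlation, NMI) pair as 'C+MI', 'MI', 'C', or 'none'."""
    r = np.atleast_1d(np.asarray(r, dtype=float))
    nmi = np.atleast_1d(np.asarray(nmi, dtype=float))
    sig_r = np.abs(r) > r_threshold        # correlation significant (either sign)
    sig_mi = nmi >= nmi_threshold          # NMI significant
    regime = np.full(r.shape, "none", dtype="<U4")
    regime[sig_r & sig_mi] = "C+MI"        # linearity dominant
    regime[~sig_r & sig_mi] = "MI"         # nonlinearity dominant
    regime[sig_r & ~sig_mi] = "C"          # only correlation significant
    return regime

# Soil moisture source: NMI threshold of 0.0331.
r_values = [0.711, 0.02, 0.15, 0.03]
nmi_values = [0.202, 0.05, 0.01, 0.01]
regimes = classify_regime(r_values, nmi_values, nmi_threshold=0.0331)
print(regimes.tolist())
```

Applied to gridded fields of correlation and NMI, the same function yields a single categorical map per source variable and lag.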
Compared to previous work (e.g., Smith 2015), this method of decomposing the total information into linear and nonlinear information is much simpler and more intuitive as it is based on the distribution of correlation versus NMI across the entire domain. This partitioning is also applicable with the source variables at longer lag (Figs. 5c,d).
The natural next step is to examine the spatial distribution of each regime. The two separate maps of correlation and NMI for each source variable in Fig. 3 can be merged into one map through the procedures explained above. In Fig. 6, solid contours mark statistically significant positive correlations, and dashed contours mark negative correlations. Blue and purple colors indicate Regime-C+MI, orange and red indicate Regime-MI, and green colors indicate Regime-C. The value of NMI in Regime-C+MI tends to increase as the absolute value of the correlation increases, representing a strong linear relationship between the source and target variables. For instance, SPI-6P over SGP and anomalies of soil moisture at location A in Fig. 6a clearly show a strong, positive linear relationship corresponding to Regime-C+MI (R = 0.711 and NMI = 0.202), i.e., drier soil at A is associated with drought conditions over SGP and vice versa (Fig. 7a). Likewise, a moderate negative linear relationship is identified for SPI-6P over SGP and SST anomalies at location B of Fig. 6b, where warmer-than-normal SST tends to be accompanied by drought over SGP (Fig. 7b). In the following, linear and nonlinear relationships are taken to describe, respectively, a direct effect between source and target and multiple feedbacks in the system. The former is mathematically determined by a strong linear fit in the joint probability distribution of source and target (i.e., correlation), while the latter is identifiable especially when clusters and blank areas (high density and low density) are apparent in their joint probability distribution (i.e., NMI).
The regimes having disagreements in the significance of correlation versus NMI are also well represented, especially in Fig. 6b. For example, NMI is statistically insignificant (Regime-C) in the tropical Pacific and near the Southern Hemisphere (SH) subtropical Pacific, especially over the SPCZ where the correlation is greater than 0.2. Meanwhile, a cluster of combined Regime-MI and Regime-C+MI (the values of NMI > 0.045) seems to be shifted to the southeast from the SPCZ with a center around 15°S and 130°W. There is also a blend of Regime-C+MI and Regime-C in the Atlantic Ocean where correlation is statistically significant.
How can we interpret these disagreements in the significance of correlation versus NMI? NMI represents total dependence, including linear and nonlinear information, and can highlight clusters in distributions where there are peaks in probability density that are not detectable by correlations. However, the disagreements also imply that the nonlinear dependencies may be quite independent of the linear ones, which is an important concept for better understanding Regime-C. Indeed, the relationship between anomalies of soil moisture at location C within Regime-C (see Fig. 6a) and SPI-6P over SGP shows little sign of nonlinear structure, as the density distribution is smooth with a very weak positive slope (Fig. 7c). Thus, the significance of correlation is marginal, but NMI is well below the significance threshold.
Blank areas in Fig. 6 correspond to the situation in which neither correlation nor NMI is statistically significant. For location D (see Fig. 6b), the scatter diagram shows that the majority of points are confined to relatively small magnitudes of SST anomalies and that the distribution is elongated horizontally (Fig. 7d). Not only is the distribution devoid of clusters or voids, so that NMI is insignificant, but the correlation is also close to zero. For this location, consequently, knowing the SST anomalies there yields little reduction of uncertainty in SPI-6P over SGP, signifying little dependence between them.
We extend the analysis to time-lagged source variables. Although one may argue that neither correlation nor MI can explicitly prove causality between two random variables, it has been a widespread practice to relate time-lagged source variables to a target variable to help identify possible causal relationships. Figure 8 exhibits time-evolving spatial patterns of the combined correlation and NMI between time-lagged anomalies of soil moisture over CONUS and SPI-6P over SGP in summer. Their contemporaneous relationships in Fig. 6a are largely maintained at 3-pentad lag, i.e., the linearity-dominant relationship with positive correlation over SGP and with negative correlation over the western United States. The anomalous atmospheric circulation associated with the North American monsoon (NAM) may account for the dominance of Regime-C+MI there at 0- and 3-pentad lags (Figs. 6a, 8a). The anticorrelation over the western United States gradually weakens as lag time increases, but a relatively strong Regime-C+MI relationship is still found in California at 6-pentad lag (i.e., 30 days earlier) and persists over the Great Basin at 18-pentad lag (i.e., 90 days earlier). This may suggest that anomalously wet (dry) soil conditions in California from late spring to early summer and in Nevada from late winter to spring are associated with below-normal (above-normal) precipitation over SGP in summer. Over SGP, on the other hand, the magnitudes of correlation and NMI decrease rapidly with lag time after 6 pentads, as Regime-C+MI becomes mainly confined to western and southern Texas (especially in the lower Rio Grande valley up to 12 pentads). This implies that the land state (i.e., soil moisture) there has a long memory from midwinter to spring (Lyon and Dole 1995; Hong and Kalnay 2002), which may have major implications for drought predictability over these areas. Also, the distinct signal between southern Arizona and southern New Mexico in spring and early summer (Figs. 8c,d) disappears in summer (Fig. 6a), probably as a result of the development of the NAM system there from early summer. It is noteworthy that negative correlations are significant, with a peak at 6 pentads, in the eastern United States, but NMI is not significant there. Interestingly, the significance in Regime-C is enhanced from 18- to 24-pentad lags over East Texas as the positive correlation slightly increases. This may indicate that dry soil over East Texas in the previous winter has some influence on below-normal precipitation over SGP, and vice versa, but the underlying mechanisms of this connection need to be explored further.
In contrast to Regime-C+MI, whose intensity and spatial extent diminish with lag time, the spatial coverage of Regime-MI gradually increases as lag increases, and its magnitude also intensifies. In particular, over Montana from 0- to at least 9-pentad lag (Figs. 6a, 8), there is an intensification of NMI, followed by a spreading of significant NMI across the northern Rockies and northern Great Plains out to 24-pentad lag. For a selected grid point in Montana (i.e., grid point E in Fig. 8c), we plot the scatter diagram of 9-pentad (45 day) lagged anomalies of soil moisture and SPI-6P over SGP in summer (Fig. 10a). It confirms that their relationship is quite nonlinear: anomalously wet soils are linked with both drought and flood conditions over SGP, and drought tends to correspond with large soil moisture anomalies of both signs, with negative anomalies associated more with moderate drought and positive anomalies with more severe SGP drought. This is like the idealized case shown in Fig. 1c, where there is a tendency toward joint extremes of either sign between source and target variables rather than a simple bivariate normal distribution. As a result, their relationship is not well explained by a straight regression line, with its correlation close to zero but its NMI greater than 0.05.
The evolution of spatial patterns of the combined correlation and NMI between time-lagged anomalies of SST and SPI-6P over SGP in summer is much more gradual (Fig. 9), compared to their contemporaneous relationship (Fig. 6b). Regime-C+MI shows positive correlation in the NH extratropical northeastern Pacific that is maintained up to 24-pentad lag (about 4 months), while the area with negative correlation in the extratropical central Pacific gradually weakens. Another region of Regime-C+MI with positive correlation persists in the NH subtropical eastern Pacific with a center located around 15°N, 120°W. This may imply that SST anomalies in the Pacific Ocean near CONUS from midwinter to summer play a role in persistent anomalies in the large-scale circulation over SGP in summer, which may contribute to drought there. Also, the linear relationship becomes stronger in the tropical Pacific and the SPCZ at longer lags up to 24 pentads, suggesting that SST there from midwinter to spring may play a role in modulating precipitation anomaly over SGP in summer. The overall spatial characteristics of combined Regime-C+MI and Regime-MI in the Pacific resemble the PDO-like SST pattern shown in Barlow et al. (2001), which is linked to U.S. summer drought including SGP. They also resemble the warm Pacific SST forcing (i.e., ENSO+PDO) in the model-based study of Schubert et al. (2009), but with a weaker signal in the equatorial eastern Pacific; that forcing plays a role in generating drought and pluvial conditions over the United States, especially SGP. In addition, the spatial distributions of the significant Regime-MI/Regime-C+MI areas in the South Pacific show some resemblance to the South Pacific dipole pattern of SST (e.g., Saurral et al. 2018), and the dipole pattern seems more apparent in austral winter than summer. This may partially explain why Regime-MI tends to appear more in the winter hemisphere (Figs. 6b, 9).
Also, a tripolar pattern in the NH Atlantic is detected only by correlation, especially at 9 and 12 pentads, but not NMI (Figs. 9c,d).
Interestingly, the combination of Regime-C+MI and Regime-MI in the SH subtropical Pacific (10°–20°S; Fig. 6b) evolves with lag time. Regime-MI becomes dominant at 9 and 12 pentads, suggesting more nonlinear connections. The scatter diagram of 9-pentad lagged anomalies of SST in location F of Fig. 9c and SPI-6P over SGP in summer confirms their nonlinear relationship, and the correlation coefficient is close to zero, but the NMI is 0.045 (Fig. 10b). Severe drought in SGP tends to be associated with cold SST anomalies, moderate drought to moderately wet conditions with warm SST anomalies, and very wet conditions again with colder than average SST. At longer lags, however, Regime-C+MI emerges, surrounded by Regime-MI (Figs. 9e,f). There appears to be a link between SST anomalies over this region from midwinter to summer and variation in summer precipitation over SGP. Moreover, this special connection cannot be fully detected by either MI or correlation alone because the linear and nonlinear information are mixed and rather convoluted with lag time in this case. Therefore, this may be a good example to show how this new joint approach can reveal previously undetected relationships.
We pointed out that the nonlinearity (Regime-MI) is enhanced in the northern United States at longer lag times (e.g., 24 pentads; see Fig. 8). Since the significance of NMI was determined solely by the practical method described earlier, we examine whether a significance test by more traditional means, e.g., the Monte Carlo method, will reveal additional information. As we explained before, we can randomly shuffle the source and target variables many times at each grid cell (1000 in this trial), and the value of NMI is finally considered statistically significant if it is beyond the 90th percentile of the random sample distribution. To reduce the computational cost for this test, we have applied this additional test only for Regime-C+MI and Regime-MI where the NMI has already been deemed significant based on the single thresholds described in section 3a. This corresponds to only about 20% of the total number of grid points over CONUS or global oceans. Thus, this amounts to a two-step significance test. It is noticeable that the spatial coverage of Regime-MI becomes much reduced, especially at longer lags and particularly for anomalies of soil moisture over CONUS (Figs. S2, S3). There is no substantial change for Regime-C+MI.
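The shuffle-based test described above can be sketched in a few lines for a single grid cell. This is a minimal illustration under stated assumptions, not the code used in this study: the helper names `nmi` and `shuffle_threshold`, the equal-width 8-bin histogram estimator, and the synthetic quadratic data are all introduced here for illustration, and only the target series is shuffled, which is enough to break the pairing between source and target.

```python
import math
import random
from collections import Counter


def nmi(x, y, bins=8):
    """Histogram-based MI divided by sqrt(H(x) * H(y)) (illustrative estimator)."""
    def digitize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against a constant series
        return [min(int((v - lo) / width), bins - 1) for v in values]

    def entropy(counts, n):
        return -sum(c / n * math.log(c / n) for c in counts.values())

    bx, by = digitize(x), digitize(y)
    n = len(x)
    hx = entropy(Counter(bx), n)
    hy = entropy(Counter(by), n)
    hxy = entropy(Counter(zip(bx, by)), n)
    mi = hx + hy - hxy  # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return mi / math.sqrt(hx * hy) if hx > 0 and hy > 0 else 0.0


def shuffle_threshold(x, y, n_shuffles=1000, percentile=90, seed=0):
    """Nearest-rank percentile of NMI over randomly re-paired samples."""
    rng = random.Random(seed)
    shuffled = list(y)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # destroys any dependence on x
        null.append(nmi(x, shuffled))
    null.sort()
    rank = max(1, math.ceil(percentile * n_shuffles / 100))
    return null[rank - 1]


# Synthetic example: a quadratic link that a linear measure would largely miss.
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(300)]
y = [a * a + rng.gauss(0.0, 0.1) for a in x]
observed = nmi(x, y)
threshold = shuffle_threshold(x, y)
print("significant:", observed > threshold)
```

In a two-step setting, such a function would be called only at grid cells that already pass the single-threshold screening, which is what keeps the cost manageable.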
It is found that after the two-step significance test, the nonlinear dependence is still quite apparent for some lag periods, especially in Montana (black box in Fig. S2c) for soil moisture and in the SH subtropical Pacific (black box in Fig. S3c) for SST. Figure 11 shows the time series of NMI and correlation between the time-lagged source variables, averaged over the outlined regions, and SPI-6P over SGP in summer. The significance levels using the Monte Carlo method (long dashed red curves in Fig. 11) in this instance are higher than those from our new method in this study (short dashed red curves in Fig. 11), especially for the case of SST (e.g., its original threshold was 0.0299). Nonetheless, those source variables in the earlier period are in fact linked with variation in summer precipitation over SGP (solid black curves in Fig. 11). For instance, the anomalous soil moisture in eastern Montana has values of NMI significant from 3- to 17-pentad lag (drawn from mid-February to mid-August) although the correlation is insignificant for the entire period, confirming they are nonlinearly connected (Fig. 11a). SST anomalies over the SH subtropics have two peaks of NMI: the first peak is between lags of 5 and 12 pentads (drawn from March to early August) when the nonlinearity prevails and the correlation is insignificant, and the second is from lags of 16 to 24 pentads (drawn from January to mid-June) when the correlation becomes significant (Fig. 11b). Perhaps, the two-step significance test may be useful to identify regions showing strong linearity and nonlinearity, but it is also possible to disregard regions having moderate values of NMI because its significance level becomes higher as shown in Fig. 11 (i.e., short dashed red lines vs long dashed red curves).
4. Summary and conclusions
MI is a nonparametric measure of the dependence between two variables without the assumption of linearity in the relationship. Hence it has been introduced to analyze statistical dependence in the highly nonlinear climate system as an alternative to correlation. However, MI alone does not provide the relative contributions of the linear and nonlinear relationships between the two variables, and furthermore in the case that a linear relation is dominant, MI does not distinguish whether the two variables are positively or negatively related with each other unlike correlation. This study describes a newly proposed joint approach combining correlation and normalized MI (NMI) to examine potential land and ocean forcing of U.S. drought. We have tackled a few shortcomings and complications of MI mainly associated with its implementation and interpretation in this study and attempted to provide more practical solutions, such as imposing a fixed number of bins to compute the probability distribution functions, normalizing MI by the product of the square roots of the entropies of two variables (NMI) and estimating a single threshold significance level for NMI.
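For reference, the normalization summarized above can be written compactly. The notation is an assumption made for this sketch: $X$ and $Y$ denote the binned source and target variables, and $p(\cdot)$ the probabilities estimated from histograms with a fixed number of bins.

```latex
\begin{align}
H(X) &= -\sum_{i} p(x_i)\,\log p(x_i), \\
I(X;Y) &= \sum_{i}\sum_{j} p(x_i, y_j)\,\log\frac{p(x_i, y_j)}{p(x_i)\,p(y_j)}, \\
\mathrm{NMI}(X;Y) &= \frac{I(X;Y)}{\sqrt{H(X)}\,\sqrt{H(Y)}}.
\end{align}
```

Because $I(X;Y) \le \min[H(X), H(Y)]$, the normalized value lies between 0 and 1.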
This study has used as the example target variable summer drought in SGP based on SPI calculated from 6-pentad precipitation data for 1982–2018 to substantiate how well this joint approach works on identifying dependencies between time-lagged land soil conditions over CONUS or SST over global oceans (source variables) and the target one. Based on the distribution of correlation versus NMI between the source and target variables across large domains, the selected single threshold significance levels of NMI in addition to the dual positive/negative significance levels of correlation enable us to discern four different statistical regimes in a more intuitive way (Fig. 5): a linearity dominant regime where both correlation and NMI are significant (Regime-C+MI); a nonlinearity dominant regime where only NMI is significant, typically indicating a notable degree of clustering within the distribution (Regime-MI); a weakly linear regime where only correlation is significant, suggesting that the nonlinear dependencies may be different from the linear ones (Regime-C); and a final regime where neither correlation nor NMI is significant.
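The four regimes can be expressed as a simple decision rule. The sketch below is illustrative only: the function name `classify_regime` and the label for the fourth regime are hypothetical, the ±0.1 correlation threshold is the one adopted in this study, and the NMI threshold must be supplied by the user (0.0299 is the value quoted earlier for the SST case).

```python
def classify_regime(corr, nmi, nmi_threshold, corr_threshold=0.1):
    """Assign one of the four statistical regimes to a (correlation, NMI) pair."""
    corr_significant = abs(corr) > corr_threshold
    nmi_significant = nmi > nmi_threshold
    if corr_significant and nmi_significant:
        return "Regime-C+MI"      # linearity dominant
    if nmi_significant:
        return "Regime-MI"        # nonlinearity dominant
    if corr_significant:
        return "Regime-C"         # weakly linear
    return "not significant"      # neither measure is significant


# Near-zero correlation with NMI = 0.045, as reported for location F.
print(classify_regime(corr=0.0, nmi=0.045, nmi_threshold=0.0299))
```

The sign of the correlation is not returned here; for the two regimes with significant correlation it can be read directly from `corr`.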
The contemporaneous relationship between anomalies of soil moisture and summer precipitation over SGP is linearity dominant with positive correlation over SGP and with negative correlation over the western United States, mainly associated with atmospheric circulation of the NAM. This linearity dominant regime over SGP becomes confined to western and southern Texas at longer lags, suggesting that the land state (i.e., soil moisture) there has a long memory from midwinter to spring, which may be useful to predict anomalous precipitation over SGP in summer (Lyon and Dole 1995; Hong and Kalnay 2002). It is also found that the land soil moisture conditions in eastern Montana from mid-February to mid-August connect nonlinearly to variation in summer precipitation over SGP. On the other hand, the time-evolving spatial patterns of combined linearity and nonlinearity dominant regimes in the Pacific resemble the PDO mode of Barlow et al. (2001), the South Pacific dipole pattern (Saurral et al. 2018), and the warm Pacific SST forcing of Schubert et al. (2009) but with weaker signal in the equatorial eastern Pacific, which play a role in generating drought and pluvial conditions over the United States, especially the Great Plains. It is interesting to note that the linear and nonlinear information is mixed in the SH subtropical Pacific and is rather convoluted with lag times (Fig. 9 and Fig. S3). As a result, this special connection cannot be fully detected by either MI or correlation alone, demonstrating how this new joint approach can reveal previously undetectable relationships. Further investigation is required to study the underlying physical mechanisms of the nonlinear relationships we have identified, especially in eastern Montana (soil moisture) and in the SH subtropical Pacific (SST).
Our results suggest that NMI can pick up on strong linear relationships as correlations do, but it is not exclusively tuned to linear relationships like correlations and principal component analysis (PCA) are. It can further identify nonlinear relationships, particularly when there are clusters and blank areas (high density and low density) in scatter diagrams between the source and target variables (Figs. 1, 7, 10). The clusters that NMI can detect are like resonances or preferred bivariate (target and source) states in the coupled climate system that are not necessarily the byproduct of linear covariations. From a predictability standpoint, source and target variables correspond to predictor and predictand. Therefore, NMI is a potentially powerful tool for attribution and prediction and for revealing heretofore undetected relationships, but caution may be needed when it comes to estimating significance. Significance calculations for NMI can be costly. Here, we propose a more practical compromise method that is computationally much less expensive, although this procedure might overestimate NMI significance at longer lags (e.g., see Fig. 8 vs Fig. S2 and Fig. 9 vs Fig. S3). Last, we speculate that NMI may pick up on some of the same time-evolving features disclosed by PCA but in a nonparametric framework (i.e., no assumptions about linearity and the shape of the bivariate data distributions). It may thus provide a different and perhaps deeper way to see what PCA and correlations show because NMI is not “tuned” to only linear relationships. Future work is required to substantiate this point, and we will also extend this analysis to other target regions beyond SGP.
While base e gives natural units nats and base 10 gives units of dits, bans, or hartleys, base 2 gives the unit of bits or shannons (Shannon 1948). The value of entropy is scalable from one base to another by multiplying by an appropriate factor.
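As a small worked example of this rescaling, the entropy of a fair coin is 1 bit, or ln 2 ≈ 0.693 nats. The function names below are hypothetical and chosen only for this illustration.

```python
import math


def entropy(probabilities, base=2.0):
    """Shannon entropy of a discrete distribution in the given logarithm base."""
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)


def rescale_entropy(value, from_base, to_base):
    """Convert an entropy value between bases: H_to = H_from * log_to(from_base)."""
    return value * math.log(from_base, to_base)


fair_coin = [0.5, 0.5]
bits = entropy(fair_coin, base=2)
nats = rescale_entropy(bits, from_base=2, to_base=math.e)
print(bits, nats)
```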
The SPI calculation is based only on representing the historical precipitation data with a gamma distribution, and there are no water removal mechanisms.
Note that this value is chosen for illustrative purposes and can be adjusted depending on the application. In this study, the 0.8 value means that we consider the 20% highest values of NMI at each lag time as significant. If one wants to define only the top 5% or 10% values of NMI as significant, the value of CDF becomes 0.95 or 0.9, respectively.
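One possible reading of this choice is sketched below: the threshold is the NMI value at which the empirical cumulative distribution reaches the chosen level, so that values above it form the top 20% (or 10%, or 5%). The function name `cdf_threshold` and the toy NMI values are assumptions for illustration, not data from this study.

```python
import math


def cdf_threshold(nmi_values, cdf_level=0.8):
    """Smallest NMI value whose empirical CDF reaches cdf_level."""
    ordered = sorted(nmi_values)
    # round() guards against floating-point noise in cdf_level * n
    k = math.ceil(round(cdf_level * len(ordered), 9)) - 1
    return ordered[max(k, 0)]


# Toy values 0.01, 0.02, ..., 0.10 standing in for NMI at one lag time.
nmi_values = [i / 100 for i in range(1, 11)]
threshold = cdf_threshold(nmi_values, cdf_level=0.8)
significant = [v for v in nmi_values if v > threshold]
print(threshold, significant)
```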
Strictly speaking, the statistical significance at the 95% confidence level using a Student’s t test is about ±0.064, given the sample size. In this study, however, we define the correlation significant if its magnitude is beyond ±0.1 (corresponding to a 99.97% confidence level) for illustrative purposes.
Acknowledgments.
This work is supported by the NOAA MAPP drought project (NA20OAR4310422). We thank Abedeh Abdolghafoorian for kindly providing her Python code for the calculation of mutual information and for helpful discussions on its application.
Data availability statement.
The CPC daily unified gauge-based analysis of precipitation was retrieved from https://climatedataguide.ucar.edu/climate-data/cpc-unified-gauge-based-analysis-global-daily-precipitation. The NOAA high-resolution blended analysis of daily SST (OISST version 2) was retrieved from https://psl.noaa.gov/data/gridded/data.noaa.oisst.v2.highres.html. The SMERGE 0–40-cm root zone soil moisture data (doi:10.5067/PAVQY1KHTMUT) were retrieved from the Earthdata Search Client (EDSC), https://search.earthdata.nasa.gov/search?q=SMERGE_RZSM0_40CM.
APPENDIX
Procedure of Bin Selection for MI
To compute MI between the source variable and target (6-pentad SPI over the southern Great Plains), we begin by finding a pair of optimal bin numbers at each grid cell over CONUS for anomalies of soil moisture with no lag as the source variable. We follow the process of Knuth (2019), who derived a posterior probability for a number of bins with a uniform bin width histogram. After repeating the same procedure at two additional lag times, namely, 6- and 12-pentad lags, it is found that the domain-averaged values of the resultant optimal bin numbers at three different lag times are around 8 for the source variables while they range from 7 to 8 for the target variable. It is interesting to note that nearly the same results are obtained for SST anomalies over global oceans between 50°S and 60°N as the source variable. This result motivated us to try to simplify the process by determining a way to select a single binning arrangement that could be applied to every location of the source variables and every lead time.
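To make the bin-number posterior concrete, the sketch below evaluates the one-dimensional form of Knuth's relative log posterior for a uniform-width histogram and returns the bin count that maximizes it. It is an illustration under stated assumptions: the function names are hypothetical, the data are synthetic, and the pairwise (source and target) selection used in this study is not reproduced here.

```python
import math


def knuth_log_posterior(data, m):
    """Relative log posterior of m equal-width bins (1D form, up to a constant)."""
    n = len(data)
    lo, hi = min(data), max(data)
    width = (hi - lo) / m or 1.0  # guard against a constant series
    counts = [0] * m
    for v in data:
        counts[min(int((v - lo) / width), m - 1)] += 1
    return (n * math.log(m)
            + math.lgamma(m / 2)
            - m * math.lgamma(0.5)
            - math.lgamma(n + m / 2)
            + sum(math.lgamma(c + 0.5) for c in counts))


def optimal_bins(data, max_bins=20):
    """Bin count in 1..max_bins with the highest log posterior."""
    return max(range(1, max_bins + 1),
               key=lambda m: knuth_log_posterior(data, m))


# Synthetic series with two well-separated clusters.
data = [i * 0.01 for i in range(50)] + [10 + i * 0.01 for i in range(50)]
print(optimal_bins(data))
```

For clustered data such as this, a single bin is clearly disfavored, because the empty interval between the clusters can only be represented with three or more bins.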
We construct 12 combinations of two numbers that can be drawn from sets of distinct numbers of bins: [8, 9, 10] for the source variable and [7, 8, 9, 10] for the target variable, as possible candidates for a pair of fixed bin numbers to apply for all grid cells in the source variable fields. Finally, we calculate the spatial correlation between patterns of MI at the various lags for each of the sets of selected fixed bin numbers for the entire domain and those of MI with the optimal bin numbers determined at each grid cell. For both anomalies of soil moisture and SST, the set of [8, 8] almost always presents the highest correlation coefficient. Therefore, it has been chosen to compute MI in this study rather than seeking the optimal number of bins at every single location and lag time, greatly speeding calculation of NMI significance.
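The selection step can be sketched as follows. The 12 candidate pairs come from the text above; everything else is assumed for illustration: the function names are hypothetical, the pattern correlation is unweighted (no area weighting), and the MI patterns are short synthetic lists rather than fields from this study.

```python
import math
from itertools import product

SOURCE_BINS = [8, 9, 10]
TARGET_BINS = [7, 8, 9, 10]
CANDIDATES = list(product(SOURCE_BINS, TARGET_BINS))  # 12 (source, target) pairs


def pattern_correlation(a, b):
    """Unweighted Pearson correlation between two flattened spatial patterns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def best_fixed_pair(mi_fixed, mi_optimal):
    """Pair whose fixed-bin MI pattern best matches the optimal-bin MI pattern."""
    return max(mi_fixed,
               key=lambda pair: pattern_correlation(mi_fixed[pair], mi_optimal))


# Synthetic MI patterns at five grid cells for two of the candidate pairs.
mi_optimal = [0.02, 0.05, 0.03, 0.08, 0.01]
mi_fixed = {
    (8, 8): [0.021, 0.049, 0.031, 0.079, 0.012],
    (10, 7): [0.05, 0.02, 0.06, 0.03, 0.04],
}
print(len(CANDIDATES), best_fixed_pair(mi_fixed, mi_optimal))
```

In practice the dictionary would hold one pattern per candidate pair and per lag, and the comparison would be repeated across lags before settling on a single pair.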
REFERENCES
Barlow, M., S. Nigam, and E. H. Berbery, 2001: ENSO, Pacific decadal variability, and U.S. summertime precipitation, drought, and stream flow. J. Climate, 14, 2105–2128, https://doi.org/10.1175/1520-0442(2001)014<2105:EPDVAU>2.0.CO;2.
Betts, A. K., 2004: Understanding hydrometeorology using global models. Bull. Amer. Meteor. Soc., 85, 1673–1688, https://doi.org/10.1175/BAMS-85-11-1673.
Chen, M., W. Shi, P. Xie, V. B. S. Silva, V. E. Kousky, R. W. Higgins, and J. E. Janowiak, 2008: Assessing objective techniques for gauge-based analyses of global daily precipitation. J. Geophys. Res., 113, D04110, https://doi.org/10.1029/2007JD009132.
Cover, T. M., and J. A. Thomas, 2006: Elements of Information Theory. 2nd ed. Wiley-Interscience, 542 pp.
Crow, W., and K. Tobin, 2018: Smerge-Noah-CCI root zone soil moisture 0-40 cm L4 daily 0.125 × 0.125 degree V2.0. Goddard Earth Sciences Data and Information Services Center (GESDISC), accessed 30 June 2020, https://doi.org/10.5067/NRJWAMBMN6JD.
DelSole, T., and M. K. Tippett, 2007: Predictability: Recent insights from information theory. Rev. Geophys., 45, RG4002, https://doi.org/10.1029/2006RG000202.
Dirmeyer, P. A., G. Balsamo, E. M. Blyth, R. Morrison, and H. M. Cooper, 2021: Land-atmosphere interactions exacerbated the drought and heatwave over Northern Europe during summer 2018. AGU Adv., 2, e2020AV000283, https://doi.org/10.1029/2020AV000283.
Enfield, D. B., A. M. Mestas-Nuñez, and P. J. Trimble, 2001: The Atlantic multidecadal oscillation and its relation to rainfall and river flows in the continental U.S. Geophys. Res. Lett., 28, 2077–2080, https://doi.org/10.1029/2000GL012745.
Feng, S., R. J. Oglesby, C. M. Rowe, D. B. Loope, and Q. Hu, 2008: Atlantic and Pacific SST influences on medieval drought in North America simulated by the community atmospheric model. J. Geophys. Res., 113, D11101, https://doi.org/10.1029/2007JD009347.
Fernando, D. N., and Coauthors, 2016: What caused the spring intensification and winter demise of the 2011 drought over Texas? Climate Dyn., 47, 3077–3090, https://doi.org/10.1007/s00382-016-3014-x.
Findell, K. L., and E. A. B. Eltahir, 2003a: Atmospheric controls on soil moisture–boundary layer interactions. Part I: Framework development. J. Hydrometeor., 4, 552–569, https://doi.org/10.1175/1525-7541(2003)004<0552:ACOSML>2.0.CO;2.
Findell, K. L., and E. A. B. Eltahir, 2003b: Atmospheric controls on soil moisture–boundary layer interactions. Part II: Feedbacks within the continental United States. J. Hydrometeor., 4, 570–583, https://doi.org/10.1175/1525-7541(2003)004<0570:ACOSML>2.0.CO;2.
Findell, K. L., and T. L. Delworth, 2010: Impact of common sea surface temperature anomalies on global drought and pluvial frequency. J. Climate, 23, 485–503, https://doi.org/10.1175/2009JCLI3153.1.
Findell, K. L., A. Berg, P. Gentine, J. P. Krasting, B. R. Lintner, S. Malyshev, J. A. Santanello, and E. Shevliakova, 2017: The impact of anthropogenic land use and land cover change on regional climate extremes. Nat. Commun., 8, 989, https://doi.org/10.1038/s41467-017-01038-w.
Ford, T. W., and C. F. Labosier, 2017: Meteorological conditions associated with the onset of flash drought in the eastern United States. Agric. For. Meteor., 247, 414–423, https://doi.org/10.1016/j.agrformet.2017.08.031.
Ford, T. W., A. D. Rapp, and S. M. Quiring, 2015: Does afternoon precipitation occur preferentially over dry or wet soils in Oklahoma? J. Hydrometeor., 16, 874–888, https://doi.org/10.1175/JHM-D-14-0005.1.
Goodwell, A. E., 2020: “It’s raining bits”: Patterns in directional precipitation persistence across the United States. J. Hydrometeor., 21, 2907–2921, https://doi.org/10.1175/JHM-D-20-0134.1.
Goodwell, A. E., and P. Kumar, 2017: Temporal information partitioning networks (TIPNets): A process network approach to infer ecohydrologic shifts. Water Resour. Res., 53, 5899–5919, https://doi.org/10.1002/2016WR020218.
Goodwell, A. E., and P. Kumar, 2019: A changing climatology of precipitation persistence across the United States using information-based measures. J. Hydrometeor., 20, 1649–1666, https://doi.org/10.1175/JHM-D-19-0013.1.
Guo, Z., P. A. Dirmeyer, T. DelSole, and R. D. Koster, 2012: Rebound in atmospheric predictability and the role of the land surface. J. Climate, 25, 4744–4749, https://doi.org/10.1175/JCLI-D-11-00651.1.
Hartmann, D. L., 2015: Pacific sea surface temperature and the winter of 2014. Geophys. Res. Lett., 42, 1894–1902, https://doi.org/10.1002/2015GL063083.
Herrera-Estrada, J. E., Y. Satoh, and J. Sheffield, 2017: Spatiotemporal dynamics of global drought. Geophys. Res. Lett., 44, 2254–2263, https://doi.org/10.1002/2016GL071768.
Herweijer, C., R. Seager, and E. R. Cook, 2006: North American droughts of the mid to late nineteenth century: History, simulation and implications for mediaeval drought. Holocene, 16, 159–171, https://doi.org/10.1191/0959683606hl917rp.
Hlinka, J., D. Hartman, M. Vejmelka, D. Novotná, and M. Paluš, 2014: Non-linear dependence and teleconnections in climate data: Sources, relevance, nonstationarity. Climate Dyn., 42, 1873–1886, https://doi.org/10.1007/s00382-013-1780-2.
Hoerling, M. P., and A. Kumar, 2003: The perfect ocean for drought. Science, 299, 691–694, https://doi.org/10.1126/science.1079053.
Holgate, C. M., A. I. J. M. Van Dijk, J. P. Evans, and A. J. Pitman, 2019: The importance of the one-dimensional assumption in soil moisture—Rainfall depth correlation at varying spatial scales. J. Geophys. Res. Atmos., 124, 2964–2975, https://doi.org/10.1029/2018JD029762.
Hong, S.-Y., and E. Kalnay, 2002: The 1998 Oklahoma–Texas drought: Mechanistic experiments with NCEP global and regional models. J. Climate, 15, 945–963, https://doi.org/10.1175/1520-0442(2002)015<0945:TOTDME>2.0.CO;2.
Hsu, H., and P. A. Dirmeyer, 2021: Nonlinearity and multivariate dependencies in land-atmosphere coupling. Water Resour. Res., 57, e2020WR028179, https://doi.org/10.1029/2020WR028179.
Hu, Z.-Z., and B. Huang, 2009: Interferential impact of ENSO and PDO on dry and wet conditions in the U.S. Great Plains. J. Climate, 22, 6047–6065, https://doi.org/10.1175/2009JCLI2798.1.
Huang, B., C.-S. Shin, and A. Kumar, 2019: Predictive skill and predictable patterns of the U.S. seasonal precipitation in CFSv2 reforecasts of sixty years (1958–2017). J. Climate, 32, 8603–8637, https://doi.org/10.1175/JCLI-D-19-0230.1.
Huang, B., C. Liu, V. Banzon, E. Freeman, G. Graham, B. Hankins, T. Smith, and H.-M. Zhang, 2021: Improvements of the daily optimum interpolation sea surface temperature (DOISST) version 2.1. J. Climate, 34, 2923–2939, https://doi.org/10.1175/JCLI-D-20-0166.1.
Kidwell, A., T. Lee, Y.-H. Jo, and X.-H. Yan, 2016: Characterization of the variability of the South Pacific convergence zone using satellite and reanalysis wind products. J. Climate, 29, 1717–1732, https://doi.org/10.1175/JCLI-D-15-0536.1.
Knuth, K. H., 2019: Optimal data-based binning for histograms and histogram-based probability density models. Digital Signal Process., 95, 102581, https://doi.org/10.1016/j.dsp.2019.102581.
Koster, R. D., and M. J. Suarez, 1996: The influence of land surface moisture retention on precipitation statistics. J. Climate, 9, 2551–2567, https://doi.org/10.1175/1520-0442(1996)009<2551:TIOLSM>2.0.CO;2.
Koster, R. D., and Coauthors, 2004: Regions of strong coupling between soil moisture and precipitation. Science, 305, 1138–1140, https://doi.org/10.1126/science.1100217.
Koster, R. D., Y. Chang, H. Wang, and S. D. Schubert, 2016: Impacts of local soil moisture anomalies on the atmospheric circulation and on remote surface meteorological fields during boreal summer: A comprehensive analysis over North America. J. Climate, 29, 7345–7364, https://doi.org/10.1175/JCLI-D-16-0192.1.
Kumar, S. V., P. A. Dirmeyer, C. D. Peters-Lidard, R. Bindlish, and J. Bolten, 2018: Information theoretic evaluation of satellite soil moisture retrievals. Remote Sens. Environ., 204, 392–400, https://doi.org/10.1016/j.rse.2017.10.016.
Kushnir, Y., R. Seager, M. Ting, N. Naik, and J. Nakamura, 2010: Mechanisms of tropical Atlantic SST influence on North American hydroclimate variability. J. Climate, 23, 5610–5628, https://doi.org/10.1175/2010JCLI3172.1.
Li, Q., X. Han, Z. Liu, P. He, P. Shi, Q. Chen, and F. Du, 2022: A novel information changing rate and conditional mutual information-based input feature selection method for artificial intelligence drought prediction models. Climate Dyn., 58, 3405–3425, https://doi.org/10.1007/s00382-021-06104-0.
Lyon, B., and R. M. Dole, 1995: A diagnostic comparison of the 1980 and 1988 U.S. summer heat wave-droughts. J. Climate, 8, 1658–1675, https://doi.org/10.1175/1520-0442(1995)008<1658:ADCOTA>2.0.CO;2.
Mason, S. J., and L. Goddard, 2001: Probabilistic precipitation anomalies associated with ENSO. Bull. Amer. Meteor. Soc., 82, 619–638, https://doi.org/10.1175/1520-0477(2001)082<0619:PPAAWE>2.3.CO;2.
McKee, T. B., N. J. Doesken, and J. Kleist, 1993: The relationship of drought frequency and duration to time scales. Proc. Eighth Conf. on Applied Climatology, Anaheim, CA, Amer. Meteor. Soc., 179–184.
Miralles, D. G., P. Gentine, S. I. Seneviratne, and A. J. Teuling, 2019: Land–atmospheric feedbacks during droughts and heatwaves: State of the science and current challenges. Ann. N. Y. Acad. Sci., 1436, 19–35, https://doi.org/10.1111/nyas.13912.
Mo, K. C., and B. Lyon, 2015: Global meteorological drought prediction using the North American Multi-Model Ensemble. J. Hydrometeor., 16, 1409–1424, https://doi.org/10.1175/JHM-D-14-0192.1.
Mo, K. C., and D. P. Lettenmaier, 2018: Drought variability and trends over the central United States in the instrumental record. J. Hydrometeor., 19, 1149–1166, https://doi.org/10.1175/JHM-D-17-0225.1.
Mo, K. C., J. E. Schemm, and S.-H. Yoo, 2009: Influence of ENSO and the Atlantic multidecadal oscillation on drought over the United States. J. Climate, 22, 5962–5982, https://doi.org/10.1175/2009JCLI2966.1.
Namias, J., 1960: Factors in the initiation, perpetuation and termination of drought. International Association of Hydrological Sciences Commission of Surface Waters Publication 51, 81–94. [Available from IAHS Press, Institute of Hydrology, Wallingford, Oxfordshire, OX10 8BB, United Kingdom.]
Nigam, S., B. Guan, and A. Ruiz-Barradas, 2011: Key role of the Atlantic multidecadal oscillation in 20th century drought and wet periods over the Great Plains. Geophys. Res. Lett., 38, L16713, https://doi.org/10.1029/2011GL048650.
Otkin, J. A., M. Svoboda, E. D. Hunt, T. W. Ford, M. C. Anderson, C. Hain, and J. B. Basara, 2018: Flash droughts: A review and assessment of the challenges imposed by rapid-onset droughts in the United States. Bull. Amer. Meteor. Soc., 99, 911–919, https://doi.org/10.1175/BAMS-D-17-0149.1.
Papagiannopoulou, C., D. G. Miralles, S. Decubber, M. Demuzere, N. E. C. Verhoest, W. A. Dorigo, and W. Waegeman, 2017: A non-linear Granger-causality framework to investigate climate–vegetation dynamics. Geosci. Model Dev., 10, 1945–1960, https://doi.org/10.5194/gmd-10-1945-2017.
Reynolds, R. W., T. M. Smith, C. Liu, D. B. Chelton, K. S. Casey, and M. G. Schlax, 2007: Daily high-resolution-blended analyses for sea surface temperature. J. Climate, 20, 5473–5496, https://doi.org/10.1175/2007JCLI1824.1.
Roundy, J. K., and E. F. Wood, 2015: The attribution of land–atmosphere interactions on the seasonal predictability of drought. J. Hydrometeor., 16, 793–810, https://doi.org/10.1175/JHM-D-14-0121.1.
Ruddell, B. L., and P. Kumar, 2009a: Ecohydrologic process networks: 1. Identification. Water Resour. Res., 45, W03419, https://doi.org/10.1029/2008WR007279.
Ruddell, B. L., and P. Kumar, 2009b: Ecohydrologic process networks: 2. Analysis and characterization. Water Resour. Res., 45, W03420, https://doi.org/10.1029/2008WR007280.
Ruddell, B. L., R. Yu, M. Kang, and D. L. Childers, 2016: Seasonally varied controls of climate and phenophase on terrestrial carbon dynamics: Modeling eco-climate system state using dynamical process networks. Landscape Ecol., 31, 165–180, https://doi.org/10.1007/s10980-015-0253-x.
Runge, J., and Coauthors, 2019: Inferring causation from time series in Earth system sciences. Nat. Commun., 10, 2553, https://doi.org/10.1038/s41467-019-10105-3.
Santanello, J. A., Jr., C. D. Peters-Lidard, S. V. Kumar, C. Alonge, and W.-K. Tao, 2009: A modeling and observational framework for diagnosing local land–atmosphere coupling on diurnal time scales. J. Hydrometeor., 10, 577–599, https://doi.org/10.1175/2009JHM1066.1.
Santanello, J. A., Jr., C. D. Peters-Lidard, and S. V. Kumar, 2011: Diagnosing the sensitivity of local land–atmosphere coupling via the soil moisture–boundary layer interaction. J. Hydrometeor., 12, 766–786, https://doi.org/10.1175/JHM-D-10-05014.1.
Santanello, J. A., Jr., S. V. Kumar, C. D. Peters-Lidard, K. Harrison, and S. Zhou, 2013: Impact of land model calibration on coupled land–atmosphere prediction. J. Hydrometeor., 14, 1373–1400, https://doi.org/10.1175/JHM-D-12-0127.1.
Santanello, J. A., Jr., and Coauthors, 2018: Land-atmosphere interactions: The LoCo perspective. Bull. Amer. Meteor. Soc., 99, 1253–1272, https://doi.org/10.1175/BAMS-D-17-0001.1.
Saurral, R. I., F. J. Doblas-Reyes, and J. García-Serrano, 2018: Observed modes of sea surface temperature variability in the South Pacific region. Climate Dyn., 50, 1129–1143, https://doi.org/10.1007/s00382-017-3666-1.
Schubert, S. D., M. J. Suarez, P. J. Pegion, R. D. Koster, and J. T. Bacmeister, 2004: On the cause of the 1930s Dust Bowl. Science, 303, 1855–1859, https://doi.org/10.1126/science.1095048.
Schubert, S. D., M. J. Suarez, P. J. Pegion, R. D. Koster, and J. T. Bacmeister, 2008: Potential predictability of long-term drought and pluvial conditions in the U.S. Great Plains. J. Climate, 21, 802–816, https://doi.org/10.1175/2007JCLI1741.1.
Schubert, S. D., and Coauthors, 2009: A U.S. CLIVAR project to assess and compare the responses of global climate models to drought-related SST forcing patterns: Overview and results. J. Climate, 22, 5251–5272, https://doi.org/10.1175/2009JCLI3060.1.
Schubert, S. D., and Coauthors, 2016: Global meteorological drought: A synthesis of current understanding with a focus on SST drivers of precipitation deficits. J. Climate, 29, 3989–4019, https://doi.org/10.1175/JCLI-D-15-0452.1.
Schumacher, D. L., J. Keune, C. C. van Heerwaarden, J. V.-G. de Arellano, A. J. Teuling, and D. G. Miralles, 2019: Amplification of mega-heatwaves through heat torrents fuelled by upwind drought. Nat. Geosci., 12, 712–717, https://doi.org/10.1038/s41561-019-0431-6.
Seager, R., and M. Hoerling, 2014: Atmosphere and ocean origins of North American drought. J. Climate, 27, 4581–4606, https://doi.org/10.1175/JCLI-D-13-00329.1.
Seager, R., and N. Henderson, 2016: On the role of tropical ocean forcing of the persistent North American west coast ridge of winter 2013/14. J. Climate, 29, 8027–8049, https://doi.org/10.1175/JCLI-D-16-0145.1.
Seager, R., Y. Kushnir, C. Herweijer, N. Naik, and J. Velez, 2005: Modeling of tropical forcing of persistent droughts and pluvials over western North America: 1856–2000. J. Climate, 18, 4065–4088, https://doi.org/10.1175/JCLI3522.1.
Seager, R., L. Goddard, J. Nakamura, N. Henderson, and D. E. Lee, 2014: Dynamical causes of the 2010/11 Texas–northern Mexico drought. J. Hydrometeor., 15, 39–68, https://doi.org/10.1175/JHM-D-13-024.1.
Seager, R., M. Hoerling, S. Schubert, H. Wang, B. Lyon, A. Kumar, J. Nakamura, and N. Henderson, 2015: Causes of the 2011–14 California drought. J. Climate, 28, 6997–7024, https://doi.org/10.1175/JCLI-D-14-00860.1.
Shannon, C. E., 1948: A mathematical theory of communication. Bell Syst. Tech. J., 27, 379–423, https://doi.org/10.1002/j.1538-7305.1948.tb01338.x.
Shin, C.-S., P. A. Dirmeyer, B. Huang, S. Halder, and A. Kumar, 2020a: Impact of land initial states uncertainty on subseasonal surface air temperature prediction in CFSv2 reforecasts. J. Hydrometeor., 21, 2101–2121, https://doi.org/10.1175/JHM-D-20-0024.1.
Shin, C.-S., B. Huang, P. A. Dirmeyer, S. Halder, and A. Kumar, 2020b: Sensitivity of U.S. drought prediction skill to land initial states. J. Hydrometeor., 21, 2793–2811, https://doi.org/10.1175/JHM-D-20-0025.1.
Smith, R., 2015: A mutual information approach to calculating nonlinearity. Stat, 4, 291–303, https://doi.org/10.1002/sta4.96.
Strehl, A., and J. Ghosh, 2002: Cluster ensembles—A knowledge reuse framework for combining multiple partitions. J. Mach. Learn. Res., 3, 583–617.
Tawfik, A. B., and P. A. Dirmeyer, 2014: A process-based framework for quantifying the atmospheric preconditioning of surface-triggered convection. Geophys. Res. Lett., 41, 173–178, https://doi.org/10.1002/2013GL057984.
Tobin, K. J., R. Torres, W. T. Crow, and M. E. Bennett, 2017: Multi-decadal analysis of root-zone soil moisture applying the exponential filter across CONUS. Hydrol. Earth Syst. Sci., 21, 4403–4417, https://doi.org/10.5194/hess-21-4403-2017.
Tobin, K. J., W. T. Crow, J. Dong, and M. E. Bennett, 2019: Validation of a new root-zone soil moisture product: Soil MERGE. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 12, 3351–3365, https://doi.org/10.1109/JSTARS.2019.2930946.
Vaheddoost, B., and M. J. S. Safari, 2021: Application of signal processing in tracking meteorological drought in a mountainous region. Pure Appl. Geophys., 178, 1943–1957, https://doi.org/10.1007/s00024-021-02737-8.
Wang, H., S. Schubert, R. Koster, Y.-G. Ham, and M. Suarez, 2014: On the role of SST forcing in the 2011 and 2012 extreme U.S. heat and drought: A study in contrasts. J. Hydrometeor., 15, 1255–1273, https://doi.org/10.1175/JHM-D-13-069.1.
Wang, S.-Y., L. Hipps, R. R. Gillies, and J.-H. Yoon, 2014: Probable causes of the abnormal ridge accompanying the 2013/14 California drought: ENSO precursor and anthropogenic warming footprint. Geophys. Res. Lett., 41, 3220–3226, https://doi.org/10.1002/2014GL059748.
Wang, S.-Y., and Coauthors, 2015: An intensified seasonal transition in the central U.S. that enhances summer drought. J. Geophys. Res. Atmos., 120, 8804–8816, https://doi.org/10.1002/2014JD023013.
Wu, J., and P. A. Dirmeyer, 2020: Drought demise attribution over CONUS. J. Geophys. Res. Atmos., 125, e2019JD031255, https://doi.org/10.1029/2019JD031255.
Wu, Z., J. Qiu, W. T. Crow, D. Wang, Z. Wang, and X. Zhang, 2022: Investigating the efficacy of the SMAP downscaled soil moisture product for drought monitoring based on information theory. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 15, 1604–1616, https://doi.org/10.1109/JSTARS.2021.3136565.
Xu, L., and P. Dirmeyer, 2011: Snow–atmosphere coupling strength in a global atmospheric model. Geophys. Res. Lett., 38, L13401, https://doi.org/10.1029/2011GL048049.
Xu, L., and P. Dirmeyer, 2013: Snow–atmosphere coupling strength. Part II: Albedo effect versus hydrological effect. J. Hydrometeor., 14, 404–418, https://doi.org/10.1175/JHM-D-11-0103.1.
Xue, Y., and Coauthors, 2018: Spring land surface and subsurface temperature anomalies and subsequent downstream late spring-summer droughts/floods in North America and East Asia. J. Geophys. Res. Atmos., 123, 5001–5019, https://doi.org/10.1029/2017JD028246.