Quantifying causal pathways of teleconnections

Teleconnections are sources of predictability for regional weather and climate, but the relative contributions of different teleconnections to regional anomalies are usually not understood. While physical knowledge about the involved mechanisms is often available, how to quantify a particular causal pathway from data is usually unclear. Here, we argue for adopting a causal inference-based framework in the statistical analysis of teleconnections to overcome this challenge. A causal approach requires explicitly including expert knowledge in the statistical analysis, which allows one to draw quantitative conclusions. We illustrate some of the key concepts of this theory with concrete examples of well-known atmospheric teleconnections. We further discuss the particular challenges and advantages these imply for climate science and argue that a systematic causal approach to statistical inference should become standard practice in the study of teleconnections.


Introduction
The term 'teleconnection' is used to refer to a recurrent climatic effect resulting from a spatially distant forcing (Wallace and Gutzler 1981). For instance, the phase of the El Niño Southern Oscillation (ENSO) in the tropical Pacific impacts precipitation in California (Swain 2015; Chang et al. 2015), parts of Australia, South Africa and South America (Iizumi et al. 2014; Dai and Wigley 2000). Several other climate modes such as the Madden Julian Oscillation (MJO), the North Atlantic Oscillation (NAO), the Quasi-Biennial Oscillation (QBO), the Indian Ocean Dipole (IOD), and the Pacific Decadal Oscillation (PDO) have been described, and their interconnections as well as their remote impacts have been extensively studied using observations, climate models and physical theory (Trenberth et al. 1998; Hoskins and Karoly 1981; Wang et al. 2017; Bjerknes 1969; Walker 1925).
Due to their relevance for regional weather and climate, teleconnections remain an extremely active area of research. One key task is to quantify teleconnection strength in both models and observations. For example, understanding potential biases in the strength of teleconnection signals is important to improve their representation in numerical models (Vitart 2017), which is key to improving forecasts on time-scales ranging from sub-seasonal to multi-decadal (Mariotti et al. 2020; Lang et al. 2020; López-Parages and Rodríguez-Fonseca 2012). Moreover, given that much of the uncertainty in regional climate projections under global warming is associated with teleconnections (Shepherd 2014), attributing ensemble spreads to changes in large-scale drivers can help to understand and constrain the projected changes (Zappa and Shepherd 2017; Kretschmer et al. 2020; Maraun et al. 2017; Mindlin et al. 2020).
However, robustly estimating the effects of a teleconnection from data remains a challenging task due to the often simultaneous influences of multiple climate modes. For instance, quantifying the causal influence of the stratospheric polar vortex (SPV) on the NAO is difficult, as both the SPV and the NAO are known to be influenced by the MJO, and this influence is likely modulated by the phases of ENSO and the QBO (Barnes et al. 2019; Cassou 2008; Lee et al. 2019). Analysing a teleconnection pathway in isolation, for example using pairwise correlation, [...] the causal mechanisms at play is often available, how to isolate and quantify a particular effect from data is usually unclear. This disconnect between physics and statistics is exemplified by the American Meteorological Society's definition of a teleconnection as a "correlation in the fluctuations of a field at widely separated points", further stating that "such correlations suggest that information is propagating between the distant points through the atmosphere". Climate scientists are well aware that correlation does not necessarily imply causation, but how to overcome this mantra in statistical practice and connect the two perspectives in a quantitative manner is unclear. The difference between correlation and causation becomes crucial when one considers out-of-sample use of the statistical relationships, such as understanding the influences of model biases (Bracegirdle and Stephenson 2012; Kretschmer et al. 2020), storylines of regional climate change (Shepherd 2019), or unprecedented events (Diffenbaugh et al. 2017). Moreover, it is important just for understanding the relative role of different causal factors, which is a typical goal in teleconnection studies (Junge and Stephenson 2003; Jiménez-Esteve and Domeisen 2018; Barnes et al. 2019). In short, a process-based framework to quantify the causal teleconnection pathways apparent in correlation is sorely needed.
Here we advocate for a formal causal framework in the statistical analysis of teleconnections, which can be obtained by grounding it in causal inference theory (see info box). Statistical analysis in weather and climate science is usually done in the context of physical, hence causal, reasoning, but this reasoning is often only informal. The reasoning can be formalized by expressing expert knowledge about physical mechanisms in the form of a causal network. This has several advantages. First, it forces the researcher to be explicit about their assumptions, which makes it easier for others to follow their argument. Second, it allows one to understand how causal information propagates and where non-causal correlation is expected. Finally, how to remove the influence of common drivers to extract a particular causal effect from data follows directly from the network structure. These advantages can be gained by moving from reasoning informally with causal narratives to reasoning formally with causal networks. While this only involves small changes in statistical practice, it can lead to significant differences in the framing of the problem and in the interpretation and usability of the results. As Harold Jeffreys noted in his seminal work on the theory of probability: "It is sometimes considered a paradox that the answer depends not only on the observations but on the question; it should be a platitude" (Jeffreys 1961).

The purpose of this paper is not to detect causal relationships (which is a different issue), but to show how existing knowledge about physical mechanisms can be used to quantify teleconnection pathways. We illustrate this with a number of well-known teleconnection examples. Mostly we do this using Multiple Linear Regression (MLR), but all the concepts extend naturally into the non-linear context, as we illustrate with our final example. Finally, we discuss particular opportunities and some practical challenges for the use of a causal framework in climate science.

Time-series are constructed by area-averaging over different regions, variables and time-bins. All time-series are standardized by removing the multi-year seasonal mean and dividing by the multi-year standard deviation, and detrended by removing the multi-year linear fit. We recognize that NCEP reanalysis data has its limitations, but it serves our purpose of illustrating some of the key principles and methods of causal inference theory with well-known examples of teleconnections. For the same reason, we do not clutter the text with confidence intervals and p-values, and rely on previous literature for establishing the physical relevance of our examples.
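The preprocessing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `detrend_and_standardize`, the synthetic 40-year index and the order of operations (detrend first, then scale) are not taken from the paper.

```python
import numpy as np

def detrend_and_standardize(series):
    """Remove a linear trend from a 1-D series, then scale it to unit variance."""
    x = np.asarray(series, dtype=float)
    t = np.arange(x.size)
    slope, intercept = np.polyfit(t, x, deg=1)   # least-squares linear fit
    residual = x - (slope * t + intercept)       # detrended anomalies (zero mean)
    return residual / residual.std()

# Hypothetical seasonal-mean index: 40 years with a weak trend plus noise.
rng = np.random.default_rng(0)
raw_index = 0.05 * np.arange(40) + rng.standard_normal(40)
index = detrend_and_standardize(raw_index)
print(index.mean(), index.std())
```

With standardized inputs, regression coefficients are dimensionless, so they can be compared directly across links.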
Causal reasoning in the statistical analysis of teleconnections
Teleconnections are commonly analysed using MLR, a simple and powerful tool to quantify linear dependencies. However, regression coefficients can easily be misinterpreted, or not fully exploited, if the underlying data-generating mechanisms are not taken into account. In contrast, when MLR is combined with physical reasoning, causal conclusions are possible, as we discuss below in more detail.

In this context, causal networks (see footnote 4) provide a simple graphical tool to facilitate the process-based analysis of teleconnections (Pearl 2013). Causal networks consist of nodes, representing the physical variables involved in teleconnections (e.g. ENSO, the NAO), and links, indicating the presence and direction of the assumed causal relationship between these variables. As a starting point, causal networks can simply be thought of as a very intuitive way to qualitatively outline a set of physical hypotheses, something that is already widely used in climate science in an informal manner. Schematic overviews summarizing the key findings of a paper are, for instance, often presented in this way (Jiménez-Esteve and Domeisen 2018; Lee et al. 2019).

Footnote 4: There are different ways to refer to causal networks in the literature, but the most common is directed acyclic graph (DAG). 'Graph' is the mathematical term for network, 'directed' means the links between nodes have a direction and 'acyclic' means that no causal loops are permitted (see section on practical challenges). Here we stick to the more physical term of a causal network.

Note that the product of the causal links (-0.58 * 0.42 = -0.25) approximately coincides with the correlation of -0.24 between DK and MED. This is a practical property of linear models (with standardized variables) called the path-tracing rule (Pearl 2013). This rule follows from simple algebra and expresses how statistical associations (correlations) reflect the underlying causal effects.
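As a rough numerical illustration of the path-tracing rule, the sketch below simulates a common-driver structure with synthetic data. The two link strengths are the rounded values quoted above; the variable names `z`, `x`, `y`, the sample size and the Gaussian noise are assumptions made for the illustration and are not the reanalysis data used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
a, b = -0.58, 0.42   # rounded link strengths quoted in the text, reused on synthetic data

# Common driver z with two standardized effects x and y (structure x <- z -> y).
z = rng.standard_normal(n)
x = a * z + np.sqrt(1 - a**2) * rng.standard_normal(n)
y = b * z + np.sqrt(1 - b**2) * rng.standard_normal(n)

a_hat = np.polyfit(z, x, 1)[0]        # estimated link z -> x
b_hat = np.polyfit(z, y, 1)[0]        # estimated link z -> y
corr_xy = np.corrcoef(x, y)[0, 1]     # non-causal correlation between x and y

print(a_hat * b_hat, corr_xy)
```

For large samples the product of the two estimated links and the correlation between the two effects agree closely, which is what the rule states for standardized linear models.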
While the above example may seem oversimplistic, it shows how scientific knowledge guides the data analysis. A causal interpretation of regression and correlation coefficients is only justified if one has hypotheses of the underlying causal mechanisms. These physical hypotheses can be tested explicitly (as we did above using partial correlation), where possible, and should be updated in case they are not supported by the data.

This simple example illustrates that conditioning on a 'mediator', here Jet, controls away the effect that one might actually aim to measure. It is often the practice in climate science, especially in the context of statistical predictions, to include different climate indices in a regression model to predict some regional target variable. While this may be unproblematic for purely predictive, within-sample purposes, it can lead to spurious interpretations regarding the individual contributions of different drivers, as illustrated here.
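The mediator effect can be reproduced with a minimal synthetic chain X → Z → Y. The coefficients 0.5 and 0.6 and all variable names below are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
a, b = 0.5, 0.6   # hypothetical link strengths x -> z and z -> y

x = rng.standard_normal(n)
z = a * x + np.sqrt(1 - a**2) * rng.standard_normal(n)   # mediator
y = b * z + np.sqrt(1 - b**2) * rng.standard_normal(n)   # target

# Regressing y on x alone recovers the total causal effect (about a * b).
total_effect = np.linalg.lstsq(x[:, None], y, rcond=None)[0][0]

# Adding the mediator z to the regression removes the effect of x.
coef_x_given_z, coef_z = np.linalg.lstsq(np.column_stack([x, z]), y, rcond=None)[0]

print(total_effect, coef_x_given_z, coef_z)
```

In this sketch the coefficient of `x` drops to roughly zero once `z` is included, although `x` is, by construction, a genuine cause of `y`.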
It is worth noting that from a statistical perspective, Examples 1 and 2 are indistinguishable. They both involve a target variable Y (DK and CA, respectively), and two potential explanatory variables X (MED and ENSO) and Z (NAO and Jet), each of which is correlated with Y. Regressing Y on both X and Z indicates a conditional independence between Y and X, showing that the information pathway between X and Y (reflected in their correlation) is indirect, passing through Z. However, the physical interpretation of the pathway depends entirely on the assumed direction of the causal relation between X and Z, which is opposite in the two Examples (X ← Z → Y in Example 1, and X → Z → Y in Example 2). Yet this crucial feature of the [...]

The following example is a combination of the previously discussed mediator and common driver effects. We again consider ENSO which is known to influence the SH jet ("Jet") in early [...] ENSO influences Jet directly, via a tropospheric pathway, and indirectly via SPV, also called the stratospheric pathway (see Figure 3).

As there are no assumed common drivers of ENSO and Jet, the total effect of ENSO [...]

However, to quantify the contribution from SPV to Jet, one has to control for the common driver ENSO, thus include ENSO in the regression; this yields a causal effect of SPV on Jet of 0.39, as already found above.
Put together, it follows that the strength of the indirect, stratospheric pathway is -0.10 ( = -0.26 * 0.39) while that of the direct, tropospheric pathway is only -0.04, i.e. the effect of ENSO on the Jet via the stratosphere is (for the data and time-period analysed here) twice as large as the tropospheric link, which is consistent with recent findings (Byrne et al. 2017, 2019). We again note that the sum of the tropospheric and stratospheric pathways (-0.10 - 0.04 = -0.14) is, as expected from the path-tracing rule, approximately equal to the total effect of ENSO on Jet as calculated above.
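The decomposition into a direct and an indirect pathway can be checked on synthetic data. The sketch below plugs the three rounded coefficients quoted above into a linear toy model; the Gaussian noise, the sample size and the helper `ols` are assumptions for illustration and do not reproduce the reanalysis-based estimates.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000
a, c, d = -0.26, 0.39, -0.04   # ENSO->SPV, SPV->Jet, direct ENSO->Jet (values from the text)

enso = rng.standard_normal(n)
spv = a * enso + np.sqrt(1 - a**2) * rng.standard_normal(n)
noise_sd = np.sqrt(1 - (d**2 + c**2 + 2 * a * c * d))   # keeps Jet at unit variance
jet = d * enso + c * spv + noise_sd * rng.standard_normal(n)

def ols(predictors, target):
    """Least-squares coefficients without intercept (variables have zero mean by construction)."""
    return np.linalg.lstsq(np.column_stack(predictors), target, rcond=None)[0]

total = ols([enso], jet)[0]              # total effect: no common drivers to control for
a_hat = ols([enso], spv)[0]              # ENSO -> SPV
direct, c_hat = ols([enso, spv], jet)    # direct ENSO -> Jet, and SPV -> Jet given ENSO
indirect = a_hat * c_hat                 # stratospheric pathway

print(total, direct, indirect)
```

For ordinary least squares without intercept, the total effect equals the sum of the direct and indirect estimates exactly in-sample, which mirrors the path-tracing rule used in the text.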
While quantifying direct and indirect effects between correlated processes through the path- [...] physical basis for the correlation resolves this statistical indeterminism. As with Examples 1 and 2, here the physical interpretation of the correlation between ENSO and Jet depends on the assumed direction of causal influence between ENSO and SPV. Thus, instead of having to add a caveat that "correlation does not imply causation", which makes the result ambiguous, we can embed the statistical analysis within an expert-based causal framing and thereby make the numbers usable. For example, Saggioro and Shepherd (2019) showed that the observed delay in SPV in the last decades of the 20th century, which is attributed to the development of the ozone hole, well predicts the observed poleward shift in Jet. This makes sense as a nonstationary forcing of SPV, which is unrelated to ENSO, can be expected to induce a nonstationarity in Jet through the same causal effect of SPV on Jet found in the stationary regression model.

Example 4: Blocking the correct paths in the network
The advantages of explicitly defining a causal network become most apparent when there are more than just a few variables and processes involved, such that it quickly becomes confusing and difficult to understand how statistical association is inherited from the causal relations. This is illustrated in the next example.
In recent years, sea ice loss in the Barents and Kara region in autumn (BK) has been suggested to cause a weakening of the wintertime Northern Hemisphere stratospheric polar vortex (SPV) (Kim et al. 2014). This remains a controversial hypothesis, partly due to inconsistent model results (Screen et al. 2018; Cohen et al. 2020; Kretschmer et al. 2020). Quantifying the causal effect from BK on SPV from observed data is, however, challenging, as several potential common drivers have to be taken into account (see Fig. 4 [...]

URAL is more difficult, as it is assumed to be both a common driver (i.e., we need to block its influence), and a mediator (i.e., blocking its path would regress out the effect we aim to measure). Here we make use of the causal assumption that the effect comes after the cause.
To then block the confounding role of URAL without blocking its role as mediator, we condition on URAL during the same autumn months as BK, assuming that its mediating role involves some longer time-lag. In summary, and assuming linear dependencies, our regression model to quantify the causal effect of autumn BK on winter SPV is: [...]

We stress once more that the correctness of the estimate of the causal effect of BK on SPV is (apart from sampling uncertainty) conditional on the causal network, our assumption of linear [...]

[...] Example 3). In particular, ENSO influences the Indian Ocean Dipole ("IOD"), another important driver of AU (Black et al. 2003). Figure 5 summarizes these causal assumptions.
The influence of ENSO on the IOD, and thereby on AU, has been suggested to exhibit [...]

From the marginals in Table 1(a) we can see an association of both ENSO and IOD with AU, with the negative phases of both indices increasing the probability of above-average rainfall, and the positive phases decreasing it. Since we consider ENSO as a common driver of IOD and AU, we need to control for it in order to isolate the causal IOD-AU relationship. We do this by conditioning the IOD-AU association on the phase of ENSO, which is represented by the columns in Table 1(a). The added information provided by IOD, given ENSO, is represented by the Bayes factor P(AU | IOD, ENSO) / P(AU | ENSO), which for AU+ is the ratio of the conditional probability to the marginal probability in the bottom entry of the same column. We can see from this that the phase of IOD has barely any effect on AU for either La Niña or El Niño. For example, the Bayes factor for IOD+ during El Niño is 0.24/0.22 = 1.09, meaning that the probability of AU+ during El Niño phases is increased by only 9% if the IOD is positive. When ENSO is neutral there is a suggestion that IOD+ decreases the probability of AU+, whilst IOD- has a weaker positive effect.
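The Bayes factor calculation can be illustrated with synthetic categorical data. In the sketch below, all probabilities are hypothetical and do not correspond to Table 1; the toy model is constructed so that IOD has no direct effect on AU, which means any IOD-AU association comes from the common driver ENSO.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300_000

# Phase indices 0, 1, 2 stand for the negative, neutral and positive phases.
enso = rng.integers(0, 3, size=n)

# Hypothetical P(IOD phase | ENSO phase); rows: ENSO phase, columns: IOD phase.
p_iod_given_enso = np.array([[0.60, 0.30, 0.10],
                             [0.25, 0.50, 0.25],
                             [0.10, 0.30, 0.60]])
# Hypothetical P(AU+ | ENSO phase); by construction IOD has no direct effect on AU.
p_au_given_enso = np.array([0.55, 0.33, 0.15])

# Draw the IOD phase from the table row selected by each sample's ENSO phase.
cumulative = np.cumsum(p_iod_given_enso, axis=1)[enso, :2]
iod = (rng.random(n)[:, None] > cumulative).sum(axis=1)
au_plus = rng.random(n) < p_au_given_enso[enso]

def bayes_factor(iod_phase, enso_phase):
    """P(AU+ | IOD, ENSO) / P(AU+ | ENSO), estimated from sample frequencies."""
    in_enso = enso == enso_phase
    return au_plus[in_enso & (iod == iod_phase)].mean() / au_plus[in_enso].mean()

marginal = [au_plus[iod == k].mean() for k in range(3)]   # P(AU+ | IOD phase)
factors = np.array([[bayes_factor(i, e) for e in range(3)] for i in range(3)])

print(np.round(marginal, 2))
print(np.round(factors, 2))
```

In this toy setting the marginal probabilities of AU+ differ clearly between IOD phases, while the Bayes factors conditional on ENSO stay close to one, as expected when the association is inherited entirely from ENSO.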
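The dependencies in the network further allow us to interpret and decompose the causal effects. The causal effect of ENSO on AU, for instance, can be found by marginalizing over IOD:

$$P(\mathrm{AU} \mid \mathrm{ENSO}) = \sum_{i=1}^{3} P(\mathrm{AU} \mid \mathrm{IOD}_i, \mathrm{ENSO}) \, P(\mathrm{IOD}_i \mid \mathrm{ENSO}).$$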
Here i=1,2,3 represent the three phases of IOD, and the equation applies for any combination of AU and ENSO phase. In the linear case the calculation reduces to the path-tracing rule, with regression coefficients replacing Bayes factors. The first factor in the product can be read off [...] (Chernozhukov et al. 2018; Blakely et al. 2021). However, non-linear methods generally require large data sets, often justifying a linear approach, especially when using the short observational record. The relative benefit of using a non-linear over a linear approach can be quantified using various metrics such as the Bayesian Information Criterion.

Opportunities for climate science
Once a climate scientist has developed a theory of the causal relationships of the processes they are considering, by drawing the network explicitly they can use the rules of causal inference to determine which covariates to include and which to exclude for their specific analysis. There are several fundamental aspects where such formal causal reasoning could help to make progress.
First, using causal networks and following the rules of causal inference provides an easy and transparent way to quantify teleconnection pathways, and different hypotheses could be tested in this way. Such a diagnostic approach has immediate benefits for analysing teleconnections in the observational record, and is also particularly suitable for evaluating their representation in climate models. For example, huge ensembles can be needed to detect (often only small) teleconnection signals (Smith et al. 2020). Comparing the causal effects is more efficient (Kretschmer et al. 2020); it can moreover shed light on the dynamical sources of model differences, and should help in understanding potential signal-to-noise issues (Scaife and Smith 2018).

Quantifying teleconnections in climate model ensembles can also help to reduce uncertainties of regional weather and climate predictions. On both subseasonal to seasonal (S2S) and multidecadal timescales, for instance, it can enable process-informed bias-adjustments (Specq and Batté 2020; Mariotti et al. 2020). In the context of climate projections, causal estimates are, for example, expected to provide more robust emergent constraints on regional weather (Maraun et al. 2017). The causal flow in a network further provides a built-in narrative to communicate the sources of uncertainty. More precisely, uncertainties in the response of a regional climate hazard to anthropogenic global warming can be decomposed into different physically self-consistent probabilistic storylines (Shepherd 2019), providing important information for decision makers. In this context, networks could also easily be expanded by including impact variables, such as wildfires or crop yield in a specific region (Lloyd and Shepherd 2020; Lehmann et al. 2020; Guimarães Nobre et al. 2017). In this way one can quantify the contribution of climatic effects to these impacts in combination with non-climatic drivers such as land use, which could also be included in the network.
A causal network approach can also be useful to guide model interventions to test the influence [...] interpretability of findings from deep learning (e.g. novel extracted climate features), causality-based approaches could, for example, be used to evaluate them against expert knowledge. In turn, deep learning methods could also be used to quantify known causal pathways, potentially providing a more powerful way to estimate non-linear dependencies from large spatio-temporal climate data (Luo et al. 2020; Ham et al. 2019). Overall, causal reasoning can help build trust in purely data-based findings, and is key to physics-guided machine learning (Reichstein et al. 2019; Knüsel et al. 2019).

Particular challenges
The various temporal scales of dependencies in the climate system can be difficult to address. Note that loops and cycles are generally not permitted in a causal network. This might seem contradictory at first given a fully coupled climate system including strong auto-dependencies. However, depending on context, time-lags and different time-scales of expected cause-effect relationships can resolve this issue to a reasonable degree (Kretschmer et al. 2016, 2020). In [...]

Another challenge is that relevant processes should be included in the network, meaning that there exist no confounders. This can never be fulfilled for an object as complex as the climate system. (Of course, the same criticism applies to any statistical analysis.) However, processes not explicitly included can be represented as noise. Consequently, one self-consistency test is to see whether the residuals really look like noise, or instead appear to contain some kind of structure which might suggest the need for another explanatory variable. Also note that it is possible to include unknown drivers in the network and to treat this type of uncertainty explicitly within causal inference theory (Pearl 2009 [...]

Causal inference requires that hypotheses on the physical mechanisms that generated the data are made first, before any conclusions are drawn. Based on the assumptions encoded in the form of a network, the causal effects can then be estimated from the observations by following relatively simple rules. Here we discussed some examples from climate science and quantified the relevant teleconnection pathways in reanalysis data. The provided cases show how causal reasoning should guide the data analysis to obtain more reliable estimates of causal effects. In our view, seeking 'objectivity' in data-driven approaches is not necessarily worthwhile, as it requires ignoring physical knowledge, which is usually crucial to achieve meaningful results.
While practical challenges of data analysis remain, such as choices of the optimal climate indices, time-scales and data products, this applies to any statistical analysis. We argue that the transparent and deductive nature of causal network analysis can help in overcoming many of the limitations faced in current studies, and in reconciling differences between the conclusions of different studies.
Importantly, a causal approach is not meant to compete with traditional climate model experiments or physical theory. Instead it serves as a scaffold to build scientific intuition into the statistical analysis of the data. We argue that both basic physics as well as data science are needed to make progress in climate science, and that causal theory is a framework for better reconciling the two.

Info Box: Causal inference theory in a nutshell
Despite a clear physical conception of causality, a mathematical formalization has long been missing, and only emerged in the last few decades (Pearl and Mackenzie 2018). These major methodological advances are already successfully applied in many research disciplines such as epidemiology (Greenland et al. 1999), psychology (Rohrer 2018), and medical research (Richens et al. 2020), but there are only a few examples from climate science in the context of weather attribution (Hannart et al. 2016) and model assessment (Hirt et al. 2020).
In causal statistics, a causal influence from a process, represented by the random variable X, to another process, represented by the random variable Y, means that intervening in X, while keeping everything else fixed, changes the probability distribution of Y (Pearl 2000; Pearl et al. 2016). Mathematically, such (usually only hypothetical) interventions are described with the so-called 'do-operator'. The interventional conditional probability, denoted by P(Y | do(X)), generally does not coincide with the observational conditional probability of Y given X, denoted by P(Y | X). For example, the pressure measured by a barometer (X) and the actual pressure (Y) have a strong statistical association, and observed values of X will also be good predictors of Y. However, as X does not cause Y, intervening in X, e.g. by moving the needle of the barometer by hand to X = x, will not change the surrounding pressure, and thus P(Y | do(X = x)) ≠ P(Y | X = x). In contrast, interventions in the pressure will lead to a change in the barometer needle.
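The barometer example can be mimicked with a small simulation. The pressure distribution (mean 1013 hPa, standard deviation 10 hPa), the instrument noise and the function `simulate` are hypothetical choices made only for this sketch.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

def simulate(do_reading=None):
    """Pressure (Y) drives the barometer reading (X); do_reading overrides that mechanism."""
    pressure = rng.normal(1013.0, 10.0, size=n)           # actual pressure Y in hPa
    reading = pressure + rng.normal(0.0, 1.0, size=n)     # barometer reading X
    if do_reading is not None:
        reading = np.full(n, float(do_reading))           # needle moved by hand
    return reading, pressure

# Observational: select cases in which the needle happens to show about 1030 hPa.
reading, pressure = simulate()
near_x = np.abs(reading - 1030.0) < 1.0
mean_observed = pressure[near_x].mean()       # estimate of E[Y | X close to 1030]

# Interventional: set the needle to 1030 hPa in every case.
_, pressure_do = simulate(do_reading=1030.0)
mean_intervened = pressure_do.mean()          # estimate of E[Y | do(X = 1030)]

print(mean_observed, mean_intervened)
```

Observing a high reading is informative about the pressure, whereas forcing the needle to the same value leaves the simulated pressure at its climatological mean.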
Causal inference theory shows that it is sometimes possible to quantify causal effects, and thus to predict the effects of interventions, purely from observed data and without performing any actual experiments or interventions. In other words, it can be possible to extract the desired interventional probability from the observed probabilities. The underlying idea is that past (naturally occurring) interventions in X that led to changes in Y are present in the data but are biased by other processes that affect both X and Y. To isolate the causal effect from X to Y, one thus has to account for the influence of such confounders.
A necessary requirement for causal inference is to first define a plausible causal model of the hypothesised data-generating mechanisms, usually expressed graphically in the form of a causal network. Note that it is not necessary (nor would it be possible) to represent the full climate system in such a network. Instead, the network represents a reduced model of the truth, tailored to the purpose at hand. If one is for instance interested in the causal effect of X on Y, only those processes that could confound the analysis, i.e. common drivers of X and Y, [...] into mathematical objects to which the established rules of probability theory apply. This makes it easy to understand how causal information flows along the links in the network. In particular, identifying the confounding factors that one needs to control for to extract a particular causal effect from data follows directly from the network structure.
For some graphical intuition, one can think of the links in a network as pipes which allow the flow of information between the nodes. Each causal network consists of combinations of so-called 'chains' (X → Z → Y), 'forks' (Y ← Z → X) and 'collider' structures (X → Z ← Y). While the information flows along the links of chains and forks, which is to say that statistical association (i.e. correlation) of X and Y is present, it is 'blocked' by the common effect Z in a collider structure, implying statistical independence of X and Y (i.e. no correlation). Once one controls for the variable Z in the first two cases (i.e. the mediator in a chain or the common driver in a fork), which is the same as blocking the information flow, X and Y become independent conditional on Z. In contrast, controlling for the common effect Z in a collider structure 'opens' the otherwise blocked path from X to Y and introduces a statistical association between X and Y conditional on Z.
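The collider case is the least intuitive of the three structures, so a short synthetic illustration may help. The linear model, the noise level and the variable names below are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200_000

x = rng.standard_normal(n)
y = rng.standard_normal(n)                   # generated independently of x
z = x + y + 0.5 * rng.standard_normal(n)     # common effect of x and y (collider)

corr_xy = np.corrcoef(x, y)[0, 1]            # path blocked by the collider: about zero

# Regress y on x alone, then on x together with the collider z.
coef_x_alone = np.linalg.lstsq(x[:, None], y, rcond=None)[0][0]
coef_x_given_z, coef_z = np.linalg.lstsq(np.column_stack([x, z]), y, rcond=None)[0]

print(corr_xy, coef_x_alone, coef_x_given_z)
```

Although `x` and `y` are independent by construction, including the common effect `z` as a regressor produces a clearly non-zero coefficient for `x`, which is the spurious association opened by conditioning on a collider.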

Thus, to quantify a particular causal pathway in the network one has to control for the correct processes. While it is necessary to block the effect of a common driver, it can lead to a bias if done for a common effect or an indirect pathway. In many cases one can identify the correct adjustment set at a glance, or by following relatively simple rules (see e.g. Cinelli et al. (2020) for a summary overview of good and bad adjustment sets in networks). For more complex set-ups one can draw on a comprehensive mathematical theory, providing rules of when and how it is possible to extract a causal effect from data (Pearl 2000; Pearl et al. 2016 [...]

[...] that helped to improve the paper. Moreover, the authors thank the editor and three anonymous reviewers for constructive and helpful feedback on the manuscript.