Probabilistic causal network modelling of Southern Hemisphere jet sub-seasonal to seasonal predictability

: Skillful prediction of the Southern Hemisphere (SH) eddy-driven jet is crucial for representation of mid-to-high-latitude SH climate variability. In the austral spring-to-summer months, the jet and the stratospheric polar vortex vari-abilities are strongly coupled. Since the vortex is more predictable and in ﬂ uenced by long-lead drivers 1 month or more ahead, the stratosphere is considered a promising pathway for improving forecasts in the region on subseasonal to seasonal (S2S) time scales. However, a quanti ﬁ cation of this predictability has been lacking, as most modeling studies address only one of the several interacting drivers at a time, while statistical analyses quantify association but not skill. This methodological gap is addressed through a knowledge-driven probabilistic causal network approach, quanti ﬁ ed with seasonal ensemble hindcast data. The approach enables to quantify the jet ’ s long-range predictability arising from known late-winter drivers, namely, El Niño – Southern Oscillation (ENSO), Indian Ocean dipole (IOD), upward wave activity ﬂ ux, and polar night jet oscillation, mediated by the vortex variability in spring. Network-based predictions con ﬁ rm the vortex as determinant for skillful jet predictions, both for the jet ’ s poleward shift in late spring and its equatorward shift in early summer. ENSO, IOD, late-winter wave activity ﬂ ux, and polar night jet oscillation only provide moderate prediction skill to the vortex. This points to early spring submonthly variability as important for determining the vortex state leading up to its breakdown, creating a predictability bottleneck for the jet. The method developed here offers a new avenue to quantify the predictability provided by multiple, interacting drivers on S2S time scales. SIGNIFICANCE STATEMENT: Predictions of the Southern Hemisphere midlatitude jet stream are crucial for skill-ful forecasts of the austral mid-to-high latitudes. Several oceanic and atmospheric phenomena could, if better represented in models, improve spring-to-summer jet predictions on subseasonal to seasonal time scales. However, the combined potential skill arising from the inclusion of such phenomena has not been quanti ﬁ ed. This study does so by using a probabilistic causal network model, representing the connections between those drivers and the jet with conditional probabilities, trained on large sets of model data. The stratospheric polar vortex is con ﬁ rmed as crucial predictor of jet variability but is itself hard to predict a month in advance due to submonthly variability, creating a predictability bottleneck for the jet.

SFig. 2. ND mean composites based on EDJ-2 categories, in ERA5 (left) and hindcast (right).Geopotential height anomalies at 500 hPa (top), 2mT (middle) and MTPR (bottom).The dots show areas where anomaly values are significantly different from 0 according to a one-sample two-sided t-test at 0.1 significance level.SFig. 3. Daily hindcast and ERA5 vortex trend between 1981-2001(left panel) and 1995-2016 (right panel).Trend is computed as the multi-year linear trend centred around each calendar day, as described in more detail in the SI text.The ensemble mean trend (dark blue line) and the ensemble standard deviation (area filled in shaded blue) are shown.ERA5 trend (red line) and its standard error estimated from with a bootstrap procedure (area filled in shaded red) are shown.The ERA5 trend is calculated after a 30-day running average to remove noise.The peak of the hindcast ensemble mean trend in Epoch 1 is about 2 m/s/decade (standard deviation from −2 to 6 m/s/decade), while the ERA5 mean is about 5 m/s/decade.

S2. Stratospheric ozone bias
In System 4, ozone is activated as a prognostic variable and is radiatively interactive, thus the ozone field is free to evolve during the forecast.However, the simplified ozone chemistry in System 4 does not have a realistic representation of the polar stratospheric cloud chemistry required to produce an ozone hole, regardless of the issue with initial conditions.Indeed, Monge-Sanz and Coauthors (2022) recently showed with their new stratospheric ozone model (implemented in their study in ECMWF SEAS5) that a realistic interactive prognostic ozone field is needed to reproduce the SH polar vortex behaviour (including VB delay due to lower ozone concentration), while an ozone climatology does not provide enough information for the model to reproduce all necessary feedbacks with dynamics.
An analysis of expected ozone-induced trend for Epoch 1 (1981-2001, ozone decline) and no trend in Epoch 2 (1995-2016, recovery) is performed the November to January [] 50 hPa 60ºS.
Following Saggioro and Shepherd (2019), the reanalysis trends are computed at each calendar day (between 1 August and 2 March) as the increase per decade of the 30-day mean value of the time series around that calendar day across the years of interest.An ensemble of 100 time series of 30 time steps is generated for each vortex trend value (100 is to estimate the uncertainty and 30 is to be close to the sample used to fit the epoch's trend, 21).The hindcast trend is estimated in the same waiy for each ensemble member, from which the mean and standard deviation are derived.
SFigure 3 shows some sign of trend in the vortex in Epoch 1 (left panel), hence the CFC historical trends are able to imprint a strengthening trend in the hidcast vortex (dark blue line).However the modelled trend is overall weaker than in observations (red line).Further analysis would be needed to confirm the hypothesis of an issue with the representation of ozone chemistry, which is out of the scope of the present work.Although within sampling uncertainty, the weaker vortex compared to reanalysis led to the exclusion of an ozone variable in the network.Epoch 2 shows no trend as expected (right panel).SFig.7. EDJ-1 (top) and EDJ-2 (bottom) yearly time series, grouped by categories (eq, mid and pole).The ensemble mean, here computed assigning values 0, 1, 2 to pole, mid and eq respectively, is shown in purple.The numbers on top of the dots show the number of members assigned to the category.ERA5 categories are shown in red.

S6. SPV-low CPT
SFig. 9. Conditional probability ratio CPT/P for SPV-low.The values written on each table's entry show the mean and, in parenthesis, the 5-95 percentile ranges of a 1,000 sample bootstrap estimate of the CPT.Each bootstrap is obtained by re-sampling the ensemble members/years.The color blue (pink) of each entry indicates if the mean ratio is higher (lower) than 1, that is if the category of X is more (less) likely than climatology given the conditions of its parents.Entries are masked with grey color if the mean values of the ratio is between 0.9-1.
SFig. 4. Bias correction of EDJ-1 (left) and EDJ-2 (right) latitude time series.ERA5 climatology (red), hindcast ensemble mean (blue) and bias corrected hindcast (ensemble mean in dark purple line, standard deviation range in light purple, min-max range in very light purple).S4.Network variablesSFig.5 shows VB hindcast distribution (right) and the yearly time series (left) divided into stripplots for each of the three categories (a) and the EDJ-1 (b) and EDJ-2 (c) categories.SFig.6 shows the distribution and timeseries for the other variables and SFig 7 shows the timeseries for EDJ-1 and EDJ-2 categories.SFig. 5. (a) VB distribution for the bias corrected hindcast (right) and the yearly time series (left) divided into strip-plots for each of the three categories (early, middle and late are in orange, grey and light blue respectively; ensemble mean in purple).ERA5 values are shown in red.(b) EDJ-1 and (c) EDJ-2 categories for the bias corrected hindcast, labelled as equatorward (orange), mid (grey) and poleward (blue).The percentage values in parentheses show the fraction of all members and years assigned to each category (sum is 100% by construction).The thick coloured lines represent the hindcast cluster mean (solid) and plus or minus one standard deviation (dashed).The thick black lines represent the ensemble mean climatology (solid) and plus or minus one standard deviation (dashed).The ERA5 yearly time-series assigned to each hindcast category, as the closest based on averaged Euclidean distance, are shown in red.SFig.6. QBO, ENSO, IOD, vT-flux, PJO-1, PJO-2 and SPV-low (from top to bottom) distribution for the bias corrected hindcast (right) and the yearly time series (left) divided into colored strip-plots for each of the categories (ensemble mean in purple).ERA5 values are shown in red.
Hindcast composites of [u]  anomalies in August, September, October and November based on E or W QBO. Units in /.The extratropical stratospheric signal present around 100-50 hPa in August quickly weakens in September and is not visible any more in October and November, when it should affect SPV-low and VB.
SFig. 10.Network structure using vT-flux in August and September as separate variables, instead of their average (top).The cross-validated ROC AUC for the prediction of the variables given information about their parents (bottom), which shows no significant improvement to SPV-low and VB compared to Figure4ain the manuscript. .Network structure using vT-flux in October, in addition to vT-flux averaged in August and September (top).The cross-validated ROC AUC (mid) and PR AUC % (bottom) for the prediction of the variables given information about their parents, which shows significant improvement to SPV-low but not to VB, compared to Figure4a,b in the manuscript.

S8.
Anomaly Correlation change for refined ensemble using EDJ-1 SFig.12. Z500 (a), MTPR (b) and 2mT (c) anomaly correlation (AC) maps in September-October (SO) based on the EDJ-2 filter.For each row, the columns show AC between hindcast ensemble mean and ERA5 (left); change in AC for the perfect PCN refined ensemble compared to the ensemble mean (middle), and change in AC for the PCN refined ensemble compared to the ensemble mean (right).The perfect PCN selection takes only those ensemble members whose EDJ-1 categories match ERA5, hence showing the best possible improvement based on the PCN approach.Black dots show AC significantly different from 0 according to a one-sample two-sided t-test at 0.1 significance level.A stereographic projection is used, and latitudes between 20 • and 90 • S are shown at intervals of 20 • .