Tackling the Accuracy-Interpretability Trade-off in a Hierarchy of Machine Learning Models for the Prediction of Extreme Heatwaves

Alessandro Lovo,a Amaury Lancelin,b,c Corentin Herbert,a and Freddy Bouchetb

a ENS de Lyon, CNRS, Laboratoire de Physique, F-69342 Lyon, France
b LMD/IPSL, CNRS, ENS, Université PSL, École Polytechnique, Institut Polytechnique de Paris, Sorbonne Université, Paris, France
c RTE France, Paris La Défense, France

Abstract

When making predictions with Machine Learning (ML), we are mainly interested in two properties: performance and interpretability. This creates a natural trade-off, in which more complex models generally achieve higher skill but are harder to explain, and therefore to trust. Interpretability is particularly important in the climate community, especially when dealing with extreme weather events, where gaining a physical understanding of the underlying phenomena is crucial to limiting their impacts.

In this paper, we perform probabilistic forecasts of extreme heatwaves over France using a hierarchy of increasingly complex ML models, which allows us to find the best compromise between accuracy and interpretability. The models range from a global Gaussian Approximation (GA) to deep Convolutional Neural Networks (CNNs), with, as intermediate steps, a simple Intrinsically Interpretable Neural Network (IINN) and a model based on the Scattering Transform (ScatNet). Our findings reveal that CNNs provide higher accuracy, but their black-box nature severely limits interpretability, even when using state-of-the-art Explainable Artificial Intelligence (XAI) tools. In contrast, ScatNet achieves performance similar to that of CNNs while providing greater transparency. Our interpretable models highlight known drivers of extreme European heatwaves, such as persistent anticyclonic anomalies and dry soil, as well as new ones, in the form of sub-synoptic oscillations of the geopotential height field.

This study underscores the potential of interpretable ML models in climate science, demonstrating that simpler models can rival the performance of their more complex counterparts while being much easier to understand. This gain in interpretability is crucial for building trust in model predictions and for uncovering new scientific insights.

© 2025 American Meteorological Society. This is an Author Accepted Manuscript distributed under the terms of the default AMS reuse license. For information regarding reuse and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Authors contributed equally

Corresponding author: Freddy Bouchet, freddy.bouchet@cnrs.fr
