Causality is a long-standing philosophical endeavor that formalizes knowledge about the world and its transformations. It has produced a refined mathematical framework, called Structural Causal Models (SCMs), that has been instrumental in many scientific fields.
The field of causal machine learning has expanded in recent years, as questions of robustness and interpretability of algorithms have become increasingly tied to causality, which places the focus on the data-generating mechanisms and their possible changes.
Notably, it is now well accepted that powerful deep learning algorithms for image classification may base their predictions on “spurious features” that are not causally related to the relevant information, and may therefore underperform when deployed in different environments.
More recently, impressive results have been obtained by generative AI, which can produce highly realistic novel texts and images. However, these algorithms are again not guaranteed to behave according to basic causality principles (or even common sense) and may produce surprising mistakes.
Causality provides principled ways to study and improve AI algorithms. In particular, it can endow generative AI with the ability to emulate meaningful changes to the data-generating mechanisms, called interventions, and to produce “what-if” scenarios called counterfactuals. Leveraging causal generative AI thus carries the potential to explore the space of possible transformations of a system in order to anticipate failures and inform decision makers.
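To make these notions concrete, the following minimal sketch shows how an SCM supports observation, intervention and counterfactual reasoning; the two-variable model and all numerical values are hypothetical illustrations, not taken from any of the works cited here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy SCM with structural assignments  X := U_X  and  Y := 2*X + U_Y,
# where U_X and U_Y are independent exogenous noises.
def sample_observational(n):
    u_x, u_y = rng.normal(size=n), rng.normal(size=n)
    x = u_x
    y = 2 * x + u_y
    return x, y

# Intervention do(X = x0): replace the mechanism of X, keep the mechanism of Y.
def sample_interventional(n, x0=1.0):
    u_y = rng.normal(size=n)
    x = np.full(n, x0)
    y = 2 * x + u_y
    return x, y

# Counterfactual ("what if X had been x_cf?") for an observed pair (x_obs, y_obs):
# abduction (infer the noise), action (set X), prediction (replay the mechanism).
def counterfactual_y(x_obs, y_obs, x_cf):
    u_y = y_obs - 2 * x_obs
    return 2 * x_cf + u_y
```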
Data alone is often not enough to unambiguously infer the causal model that generated it. Previous theoretical contributions to the field of causal machine learning have therefore revolved around the question of causal model identifiability: establishing principles and assumptions under which information about the ground-truth mechanisms can be recovered from data.
Answering this question is crucial for applications: only if the mechanisms are correctly recovered can we expect interventions on the inferred causal model to be predictive of real-world changes. We have investigated the principle of Independence of Causal Mechanisms (ICM), according to which the different mechanisms forming a causal model are assumed not to inform each other. We have shown that this principle can be mathematically formulated and exploited in various ways to expand the capabilities of causal inference to new settings [Besserve et al., AISTATS 2018].
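Schematically, and in our own notation rather than as a verbatim statement from the cited paper, an SCM factorizes the joint distribution into causal mechanisms, and ICM postulates that these factors share no information:

```latex
% Causal factorization of an SCM over variables X_1, ..., X_n with parent sets pa_i
p(x_1, \dots, x_n) \;=\; \prod_{i=1}^{n} p\!\left(x_i \mid \mathrm{pa}_i\right)
% ICM (informal): the mechanisms p(x_i | pa_i) neither inform nor influence one another,
% so learning or changing one of them tells us nothing about the others.
```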
In particular, this led to new causal model identification approaches in contexts ranging from robust inference of the direction of causation in multivariate time series [Shajarisales et al., ICML 2015; Besserve et al., CLeaR 2022] to analyzing the internal causal structure of generative AI trained on complex image datasets [Besserve et al., AAAI 2021] and generating counterfactual images for assessing the robustness of object classification algorithms [Besserve et al., ICLR 2020].
Recently, we made a major breakthrough on the problem of identifiability of nonlinear generative models: ICM could be formulated as a restriction of the function class of such models, providing guarantees that the ground-truth function can be identified from unlabeled observational data [Gresele et al., NeurIPS 2021; Buchholz et al., NeurIPS 2022]. Moreover, we could demonstrate that such an inductive bias is implicitly enforced by Variational Autoencoders [Reizinger et al., 2022], providing an explanation for their empirical success at disentangling the factors of variation of image datasets.
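As a schematic of the kind of function-class restriction involved (the orthogonality condition below reflects our illustrative reading of this line of work, not a verbatim statement of the cited results):

```latex
% Nonlinear generative model: observations x mix statistically independent sources s
x = f(s), \qquad p(s) = \prod_i p(s_i)
% ICM-inspired restriction on f (schematically): the influences of the individual
% sources, i.e. the columns of the Jacobian J_f, are orthogonal, so that the
% volume change of f decomposes across sources
\log \bigl|\det J_f(s)\bigr| \;=\; \sum_i \log \Bigl\| \tfrac{\partial f}{\partial s_i}(s) \Bigr\|
% Under restrictions of this type, f can be recovered from observations of x alone,
% up to reordering and reparametrization of the sources.
```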
On the applications side, we have pursued the goal of understanding brain function through the lens of causality, data science and biophysical modeling. We have used causal inference and machine learning to uncover information processing pathways of visual and memory systems, leading to publications in major multidisciplinary and biology journals [Besserve et al., PLOS Biology 2015; Ramirez-Villegas et al., PNAS 2015; Ramirez-Villegas et al., Nature 2020]. We have also developed biologically realistic computational models of cortical networks to study the precise mechanisms underlying dynamical brain phenomena during information processing, notably leading to state-of-the-art realistic simulations of the high-frequency oscillations occurring in memory systems as we remember events [Ramirez-Villegas et al., Neuron 2018].
More recently, we broadened our investigation of complex systems by studying multi-agent [Geiger et al., UAI 2019] and socio-economic systems [Besserve & Schölkopf, UAI 2022], which, like brain networks, are characterized by high dimensionality and recurrent interactions.
Motivated by the goal of fostering the transition to sustainable economies, we developed theoretical foundations and algorithms to design optimal interventions in such systems [Besserve & Schölkopf, UAI 2022]. This approach merges causality, machine learning and scientific modeling by relying on a differentiable simulator of economic equilibrium.
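The following toy sketch conveys the idea of designing an intervention by differentiating through an equilibrium simulator; the linear market model, parameter names and policy target are hypothetical illustrations, not the model used in the cited paper.

```python
# Hypothetical linear market: demand q = a - b*p, supply q = c + d*(p - t),
# where the intervention is a tax t that shifts the supply curve.
def equilibrium_price(t, a=10.0, b=1.0, c=0.0, d=1.0):
    # Closed-form equilibrium price, a differentiable function of the intervention t.
    return (a - c + d * t) / (b + d)

def d_price_d_tax(t, a=10.0, b=1.0, c=0.0, d=1.0):
    # Analytic derivative of the simulator output with respect to the intervention.
    return d / (b + d)

def design_tax(target_price, steps=200, lr=0.1):
    # Gradient descent on a squared loss between the simulated equilibrium price
    # and a policy target, yielding an optimal intervention for this toy model.
    t = 0.0
    for _ in range(steps):
        grad = 2.0 * (equilibrium_price(t) - target_price) * d_price_d_tax(t)
        t -= lr * grad
    return t
```

In realistic settings the equilibrium map has no closed form, and gradients would instead be obtained by differentiating through the simulator itself.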
Our current research aims to develop a Causal Computational Model (CCM) framework: learning digital representations of real-world systems that integrate data, domain knowledge and an interpretable causal structure. The goal is to improve the robustness and interpretability of a broad range of highly detailed models that can be linked to empirical data: simulators of scientific models based on (partial) differential equations (e.g. general circulation models of the Earth's atmosphere), digital twins of industrial systems (e.g. factories, bridges), computable general equilibrium models of the economy, and multi-agent system simulators (e.g. traffic models).
Due to their complexity, the causal structure of such models is not directly apparent to decision makers, which hinders the ability of modelers to communicate constructively the limitations and impacts of their work. CCMs add a layer of causal abstraction: a simplified causal representation that aggregates variables of the low-level model into a high-level model with fewer variables. This allows users to switch between two levels of description: the low level describing fine-grained phenomena, and the high level describing the global behavior of the system.
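Schematically, and in our own notation rather than that of the cited paper, such an abstraction can be viewed as an aggregation map required to be (approximately) consistent with interventions at both levels:

```latex
% Low-level model M_L over X = (X_1, ..., X_m); high-level model M_H over Y = (Y_1, ..., Y_k), with k << m.
% An aggregation map tau sends low-level states to high-level states, e.g. Y_j = \tau_j(X).
% Consistency (informal): intervening in the detailed model and then aggregating should
% match aggregating first and performing the corresponding high-level intervention,
\tau_{\#}\, P_{M_L}^{\,\mathrm{do}(i)} \;\approx\; P_{M_H}^{\,\mathrm{do}(\omega(i))}
% where tau_# denotes the pushforward of a distribution through tau, and omega maps
% low-level interventions i to their high-level counterparts.
```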
We have started to address how to build such abstractions from simulations in recent work [Kekić et al., UAI 2024]. This framework will lead to causal AIs that can address the complexity of real-world systems while producing interpretable outcomes for decision makers.