LRE - Laboratoire de Recherche de l'ÉPITA

In pursuit of the hidden features of GNN’s internal representations

Luca Veyrin-Forrer · Ataollah Kamal · Stefan Duffner · Marc Plantevit · Céline Robardet

Graph Neural Networks

Explainable artificial intelligence

Monte Carlo Tree Search

We consider the problem of explaining Graph Neural Networks (GNNs). While most attempts aim at explaining the final decision of the model, we focus on the hidden layers to examine what the GNN actually captures and shed light on the hidden features built by the GNN. To that end, we first extract activation rules that identify sets of exceptionally co-activated neurons when classifying graphs in the same category. These rules define internal representations having a strong impact in the classification process. Then - this is the goal of the current paper - we interpret these rules by identifying a graph that is fully embedded in the related subspace identified by the rule. The graph search is based on a Monte Carlo Tree Search directed by a proximity measure between the graph embedding and the internal representation of the rule, as well as a realism factor that constrains the distribution of the labels of the graph to be similar to that observed on the dataset. Experiments including 6 real-world datasets and 3 baselines demonstrate that our method DISCERN generates realistic graphs of high quality which allows providing new insights into the respective GNN models.

Gradient vector fields of discrete morse functions and watershed-cuts

Nicolas Boutry · Gilles Bertrand · Laurent Najman

Transformations d’$\omega$-automates pour la synthèse de contrôleurs réactifs

Florian Renkin

Le travail de cette thèèse s’inscrit dans le cadre de la crééation de manièère automatique de systèèmes corrects àà partir de spéécifications, ce que l’on appelle “synthèse”synthèèse. Ce besoin de crééation automatique vient d’une part de la complexitéé de plus en plus importante des systèèmes que l’on créée mais aussi de la difficultéé de véérifier si un systèème est correct. Pour que la synthèèse soit utilisable en pratique, y compris dans l’industrie, il faut êêtre capable de produire des solutions pour des problèèmes plus ou moins complexes en un temps raisonnable. De plus, on peut chercher àà optimiser les systèèmes produits afin qu’ils soient les plus simples possibles. Pour déécrire les contraintes que le systèème doit respecter, nous utiliserons des formules de logique linééaire temporelle (LTL) qui ajoutent aux opéérateurs Boolééens traditionnels une notion de temps discret afin d’exprimer des contraintes telles que “il existera un instant où la variable sera vraie”il existera un instant oùù la variable sera vraie. Dans notre cas, il s’agira de produire un contrôôleur rééactif, c’est-àà-dire associant àà une suite d’assignations de variables Boolééennes d’entréées une suite d’assignations de variables Boolééennes de sorties.L’approche de la synthèèse LTL que nous allons déécrire consiste àà :- Traduire la spéécification LTL en un jeu de paritéé oùù un joueur contrôôle l’environnement alors que le second repréésente les actions que peut faire le contrôôleur.- Rechercher dans ce jeu s’il existe une stratéégie gagnante pour le second joueur.- Cette stratéégie indique les actions que doit faire le contrôôleur pour respecter les spéécifications et il reste alors àà l’encoder sous la forme voulue (circuit, programme...).Une partie de la premièère éétape est une procéédure dite de paritisation consistant àà obtenir àà partir d’un automate quelconque un automate de paritéé. Une contribution majeure de cette thèèse consiste en l’améélioration de cette procéédure. Dans cette optique, nous proposons et comparons divers algorithmes de paritisation. La premièère mééthode est une combinaison d’algorithmes existants auxquels ont éétéé associéées des heuristiques mais aussi de nouveaux algorithmes. La seconde est l’adaptation d’une mééthode introduite en 2021 par Casares et al. assurant une forme d’optimalitéé sur la taille de l’automate de paritéé obtenu. Dans les deux cas, ces algorithmes ont àà la fois pour objectif de rééduire le temps néécessaire pour une telle transformation mais aussi de limiter la taille de l’automate créééé.Une autre contribution consiste àà proposer des techniques de simplification du contrôôleur. En particulier, nous tirerons parti des libertéés offertes par la spéécification. Par exemple, si l’on souhaite un systèème allumant une ampoule lorsqu’une préésence est déétectéée, alors ce qu’il faut faire lorsque personne n’est déétectéé n’a pas d’importance. Pour obtenir un systèème simple, on peut déécider de toujours allumer l’ampoule et le systèème n’a alors plus besoin d’un capteur. Deux types de simplifications seront déécrites. La premièère est inspiréée d’un outil existant (MeMin) et utilise un SAT-solver pour obtenir une solution minimale. La complexitéé de la recherche d’optimalitéé (NP-complet) nous incite éégalement àà nous tourner vers une seconde mééthode baséée sur les BDD visant àà fournir un systèème rééduit plus rapidement mais sans garantie d’optimalitéé.Ces deux contributions majeures ont éétéé intéégréées àà l’outil ltlsynt distribuéé avec la bibliothèèque Spot et ont éétéé accompagnéées de plusieurs amééliorations que nous éévaluons : une déécomposition du problèème permettant de crééer des contrôôleurs pour des sous-parties de la spéécification mais aussi une mééthode permettant de s’affranchir de la construction d’un jeu pour une certaine classe de formules. Ces travaux ont fait l’objet de publications dans les conféérences ATVA’20 (premièère mééthode de paritisation), TACAS’22 (seconde mééthode de paritisation), FORTE’22 (simplification de contrôôleur), CAV’22 (préésentation des éévolutions de Spot) ainsi que d’une préésentation de ltlsynt lors de la conféérence SYNT’21.L’outil ltlsynt a par ailleurs participéé aux ééditions 2020, 2021 et 2022 de la SYNTCOMP.

Diversifying a parallel SAT solver with bayesian moment matching

V. Vallade · S. Nejati · J. Sopena · V. Ganesh · Souheib Baarir

In this paper, we present a Bayesian Moment Matching (BMM) in-processing technique for Conflict-Driven Clause-Learning (CDCL) SAT solvers. BMM is a probabilistic algorithm which takes as input a Boolean formula in conjunctive normal form and a prior on a possible satisfying assignment, and outputs a posterior for a new assignment most likely to maximize the number of satisfied clauses. We invoke this BMM method, as an in-processing technique, with the goal of updating the polarity and branching activity scores. The key insight underpinning our method is that Bayesian reasoning is a powerful way to guide the CDCL search procedure away from fruitless parts of the search space of a satisfiable Boolean formula, and towards those regions that are likely to contain satisfying assignments.

On GNN explainability with activation rules

Luca Veyrin-Forrer · Ataollah Kamal · Stefan Duffner · Marc Plantevit · Céline Robardet

GNNs are powerful models based on node representation learning that perform particularly well in many machine learning problems related to graphs. The major obstacle to the deployment of GNNs is mostly a problem of societal acceptability and trustworthiness, properties which require making explicit the internal functioning of such models. Here, we propose to mine activation rules in the hidden layers to understand how the GNNs perceive the world. The problem is not to discover activation rules that are individually highly discriminating for an output of the model. Instead, the challenge is to provide a small set of rules that cover all input graphs. To this end, we introduce the subjective activation pattern domain. We define an effective and principled algorithm to enumerate activations rules in each hidden layer. The proposed approach for quantifying the interest of these rules is rooted in information theory and is able to account for background knowledge on the input graph data. The activation rules can then be redescribed thanks to pattern languages involving interpretable features. We show that the activation rules provide insights on the characteristics used by the GNN to classify the graphs. Especially, this allows to identify the hidden features built by the GNN through its different layers. Also, these rules can subsequently be used for explaining GNN decisions. Experiments on both synthetic and real-life datasets show highly competitive performance, with up to 200% improvement in fidelity on explaining graph classification over the SOTA methods.

Multi-purpose tactile perception based on deep learning in a new tendon-driven optical tactile sensor

Zhou Zhao · Zhenyu Lu

In this paper, we create a new tendon-connected multi-functional optical tactile sensor, MechTac, for object perception in field of view (TacTip) and location of touching points in the blind area of vision (TacSide). In a multi-point touch task, the information of the TacSide and the TacTip are overlapped to commonly affect the distribution of papillae pins on the TacTip. Since the effects of TacSide are much less obvious to those affected on the TacTip, a perceiving out-of-view neural network (O$^2$VNet) is created to separate the mixed information with unequal affection. To reduce the dependence of the O$^2$VNet on the grayscale information of the image, we create one new binarized convolutional (BConv) layer in front of the backbone of the O$^2$VNet. The O$^2$VNet can not only achieve real-time temporal sequence prediction (34 ms per image), but also attain the average classification accuracy of 99.06%. The experimental results show that the O$^2$VNet can hold high classification accuracy even facing the image contrast changes.

An efficient cascade of U-Net-like convolutional neural networks devoted to brain tumor segmentation

Philippe Bouchet · Jean-Baptiste Deloges · Hugo Canton-Bacara · Gaätan Pusel · Lucas Pinot · Othman Elbaz · Nicolas Boutry

Some equivalence relation between persistent homology and morphological dynamics

Nicolas Boutry · Laurent Najman · Thierry Géraud

In Mathematical Morphology (MM), connected filters based on dynamics are used to filter the extrema of an image. Similarly, persistence is a concept coming from Persistent Homology (PH) and Morse Theory (MT) that represents the stability of the extrema of a Morse function. Since these two concepts seem to be closely related, in this paper we examine their relationship, and we prove that they are equal on $n$-D Morse functions, $n\geq 1$. More exactly, pairing a minimum with a $1$-saddle by dynamics or pairing the same $1$-saddle with a minimum by persistence leads exactly to the same pairing, assuming that the critical values of the studied Morse function are unique. This result is a step further to show how much topological data analysis and mathematical morphology are related, paving the way for a more in-depth study of the relations between these two research fields.

Residual 3D U-net with localization for brain tumor segmentation

Marc Demoustier · Ines Khemir · Quoc Duon Nguyen · Lucien Martin-Gaffé · Nicolas Boutry

Discovering and visualizing tactics in table tennis games based on subgroup discovery

Pierre Duluard · Xinqing Li · Marc Plantevit · Céline Robardet · Romain Vuillemot

We report on preliminary results to automatically identify efficient tactics of elite players in table tennis games. We define such tactics as subgroups of winning strokes which table tennis experts sought to obtain to train players and adapt their strategy during games. We first report on the creation of such subgroups and their ranking by weighted relative accuracy measure (WRAcc). We then report on representation of the subgroups using visualizations that enabled our expert to provide rapid feedback and hence provided us with guidance towards further improvements of our discoveries