LRE - Laboratoire de Recherche de l'ÉPITA

Gradients intégrés renforcés (Reinforced Integrated Gradients)

Caroline Mazini-Rodrigues · Nicolas Boutry · Laurent Najman

The visualizations produced by Explainable Artificial Intelligence (xAI) techniques to explain convolutional neural networks (CNNs) are sometimes difficult to interpret. The richness of the patterns in the input features (the pixels of an image) leads to complex correlations between classes. Gradient-based techniques, such as integrated gradients, highlight the importance of these features. However, when their output is visualized as an image, it can contain excessive noise, which makes the explanations difficult to interpret. We propose a method called Reinforced Integrated Gradients (RIG), a variation of integrated gradients that aims to highlight the image regions that influence the network's decision. The method aims to reduce the area of the zones to analyze when visualizing the results, thereby producing less apparent noise. Occlusion-based experiments show that the regions selected by our method do play an important role in classification.
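As an illustration of the standard integrated gradients technique that this abstract builds on (not the reinforced variant, whose details are not given here), the sketch below approximates the path integral of the gradient with a midpoint Riemann sum. The toy `model` and its analytic gradient `model_grad` are hypothetical stand-ins for a CNN; a real network would supply gradients through automatic differentiation.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def integrated_gradients(
    grad_fn: Callable[[Sequence[float]], Vector],
    x: Sequence[float],
    baseline: Sequence[float],
    steps: int = 50,
) -> Vector:
    """Approximate integrated gradients with a midpoint Riemann sum."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        # Point on the straight line from the baseline to the input.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = grad_fn(point)
        total = [t + g for t, g in zip(total, grad)]
    # Average gradient along the path, scaled by the displacement from the baseline.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


# Hypothetical toy "model": f(x) = x0^2 + 3 * x0 * x1, with its analytic gradient.
def model(x: Sequence[float]) -> float:
    return x[0] ** 2 + 3.0 * x[0] * x[1]


def model_grad(x: Sequence[float]) -> Vector:
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]


x = [1.0, 2.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(model_grad, x, baseline)
```

A useful sanity check is the completeness property: the attributions should sum to `model(x) - model(baseline)`, up to the error of the numerical integration.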

Accurate and interpretable representations of environments with anticipatory learning classifier systems

Romain Orhand · Anne Jeannin-Girardon · Pierre Parrend · Pierre Collet

Keywords: Anticipatory Learning Classifier System · Machine Learning · Explainability · Non-determinism · Building Knowledge

Anticipatory Learning Classifier Systems (ALCS) are rule-based machine learning algorithms that can simultaneously develop a complete representation of their environment and a decision policy based on this representation to solve their learning tasks. This paper introduces BEACS (Behavioral Enhanced Anticipatory Classifier System) in order to handle non-deterministic partially observable environments and to allow users to better understand the environmental representations issued by the system. BEACS is an ALCS that enhances and merges the Probability-Enhanced Predictions and Behavioral Sequences approaches used in ALCS to handle such environments. Probability-Enhanced Predictions enable the anticipation of several states, while Behavioral Sequences permit the construction of sequences of actions. The capabilities of BEACS have been studied on a thorough benchmark of 23 mazes, and the results show that BEACS can handle different kinds of non-determinism in partially observable environments while describing such environments completely and more accurately. BEACS thus provides explanatory insights about the decision policies and environmental representations it creates.

Découverte de sous-groupes de prédictions interprétables pour le triage d’incidents (Discovering interpretable prediction subgroups for incident triage)

Youcef Remil · Anes Bendimerad · Marc Plantevit · Céline Robardet · Mehdi Kaytoue

The need for predictive maintenance comes with an increasing number of incidents, where it is imperative to quickly decide which service to contact for corrective actions. Several predictive models have been designed to automate this process, but the efficient models are opaque (i.e., black boxes). Many approaches have been proposed to locally explain each prediction of such models. However, providing an explanation for every result is not feasible when a large number of daily predictions must be analyzed. In this article, we propose a method based on Subgroup Discovery in order to (1) group together objects that share similar explanations and (2) provide a description that characterises each subgroup.

Qu’est-ce que mon GNN capture vraiment ? Exploration des représentations internes d’un GNN (What does my GNN really capture? Exploring the internal representations of a GNN)

Luca Veyrin-Forrer · Ataollah Kamal · Stefan Duffner · Marc Plantevit · Céline Robardet

While existing GNN explanation methods explain the decision by studying the output layer, we propose a method that analyzes the hidden layers to identify the neurons that are co-activated for a class. We then associate a graph with these neurons.

AGAT: Building and evaluating binary partition trees for image segmentation

Jimmy Francky Randrianasoa · Camille Kurtz · Éric Desjardin · Nicolas Passat

AGAT is a Java library dedicated to the construction, handling and evaluation of binary partition trees, a hierarchical data structure providing multiscale partitioning of images and, more generally, of valued graphs. On the one hand, this library offers functionalities to build binary partition trees in the usual way, but also to define multifeature trees, a novel and richer paradigm of binary partition trees built from multiple images and/or several criteria. On the other hand, it also allows one to manipulate the binary partition trees, for instance by defining/computing tree cuts that can be interpreted in particular as segmentations when dealing with image modeling. In addition, some evaluation tools are also provided, which allow one to evaluate the quality of different binary partition trees for such segmentation tasks. AGAT can be easily handled by various kinds of users (students, researchers, practitioners). It can be used as is for experimental purposes, but can also form a basis for the development of new methods and paradigms for construction, use and intensive evaluation of binary partition trees. Beyond the usual imaging applications, its underlying structure also allows for more general developments in graph-based analysis, leading to a wide range of potential applications in computer vision, image/data analysis and machine learning.
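To make the data structure concrete, here is a minimal Python sketch of a binary partition tree built over a 1-D signal, together with a tree cut read as a segmentation. It does not use or reproduce the AGAT API: the function names `build_bpt` and `cut`, the 1-D adjacency, and the merging criterion (closest mean values) are assumptions chosen for illustration only.

```python
from typing import Dict, List, Sequence, Tuple

Merge = Tuple[int, int, int]  # (left child, right child, new parent node)


def build_bpt(values: Sequence[float]) -> List[Merge]:
    """Build a binary partition tree by repeatedly merging adjacent regions."""
    n = len(values)
    # Each active region is (node id, sum of values, number of samples).
    regions = [(i, float(v), 1) for i, v in enumerate(values)]
    merges: List[Merge] = []
    next_id = n
    while len(regions) > 1:
        # Pick the adjacent pair whose mean values are closest (assumed criterion).
        best = min(
            range(len(regions) - 1),
            key=lambda i: abs(
                regions[i][1] / regions[i][2]
                - regions[i + 1][1] / regions[i + 1][2]
            ),
        )
        (a, sum_a, cnt_a), (b, sum_b, cnt_b) = regions[best], regions[best + 1]
        merges.append((a, b, next_id))
        regions[best:best + 2] = [(next_id, sum_a + sum_b, cnt_a + cnt_b)]
        next_id += 1
    return merges


def cut(n: int, merges: Sequence[Merge], k: int) -> List[List[int]]:
    """Return the k-region partition obtained by applying only the first n - k merges."""
    members: Dict[int, List[int]] = {i: [i] for i in range(n)}
    for a, b, parent in merges[: n - k]:
        members[parent] = members.pop(a) + members.pop(b)
    return sorted(members.values())


signal = [1, 1, 2, 9, 10, 10]
merges = build_bpt(signal)
two_regions = cut(len(signal), merges, 2)
```

On this signal, the cut into two regions separates the low-valued samples from the high-valued ones; the tree itself stores every intermediate partition between the individual samples and the root.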

Strong Euler well-composedness

Nicolas Boutry · Rocio Gonzalez-Diaz · Maria-Jose Jimenez · Eduardo Paluzo-Hidalgo

In this paper, we define a new flavour of well-composedness, called strong Euler well-composedness. In the general setting of regular cell complexes, a regular cell complex of dimension $n$ is strongly Euler well-composed if the Euler characteristic of the link of each boundary cell is $1$, which is the Euler characteristic of an $(n-1)$-dimensional ball. Working in the particular setting of cubical complexes canonically associated with $n$-D pictures, we formally prove in this paper that strong Euler well-composedness implies digital well-composedness in any dimension $n\geq 2$ and that the converse is not true when $n\geq 4$.
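For readers unfamiliar with the quantity involved, the Euler characteristic of a cell complex is the alternating sum $\chi = \sum_{k} (-1)^k c_k$, where $c_k$ is the number of $k$-dimensional cells. The sketch below computes it for the cubical complex associated with a set of pixels or voxels. It assumes a Khalimsky-style encoding (an illustrative choice, not taken from the paper) in which a cell's dimension is the number of its odd coordinates, and it does not implement the link computation used in the definition above.

```python
from itertools import product
from typing import Iterable, Set, Tuple

Cell = Tuple[int, ...]


def cubical_complex(pixels: Iterable[Tuple[int, ...]]) -> Set[Cell]:
    """Return all cells (the cubes and all their faces) of a set of n-D pixels."""
    cells: Set[Cell] = set()
    for p in pixels:
        # The n-cube of pixel p is centred at 2p + 1; its faces are reached
        # by moving -1, 0 or +1 along each axis.
        for offset in product((-1, 0, 1), repeat=len(p)):
            cells.add(tuple(2 * c + 1 + o for c, o in zip(p, offset)))
    return cells


def euler_characteristic(cells: Iterable[Cell]) -> int:
    """Alternating sum of cell counts; dimension = number of odd coordinates."""
    return sum((-1) ** sum(c % 2 for c in cell) for cell in cells)


single_pixel = cubical_complex([(0, 0)])
ring = cubical_complex(
    [(i, j) for i in range(3) for j in range(3) if (i, j) != (1, 1)]
)
single_voxel = cubical_complex([(0, 0, 0)])
```

A single pixel or voxel is a ball and has Euler characteristic 1, while the ring of eight pixels around a missing centre has Euler characteristic 0 because of its hole.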

Automation of binary analysis: From open source collection to threat intelligence

Frederic Grelot · Sébastien Larinier · Marie Salmon

Many open sources of binaries, including malware, have emerged in the landscape in recent years. Their quality compares very favourably with commercial sources, as emphasised by Thibaud Binetruy (Twitter influencer under a pseudonym, Société Générale CERT, 2020): "Integrating operational threat intel in your defense mechanisms doesn’t mean buying Threat Intel. You can start by using the [mass] of open source indicators available for free." Some are provided by official sources (Abuse.ch, with data supplied by the Swiss national CERT, among others), while others are made available in more obscure ways, sometimes anonymously (VirusShare, VX-Underground, etc.). Our examination of these sources underlines the wide disparity in quality and quantity between them. We have had to take this diversity into account in our research, designing a dedicated platform that enables us to supply information to our binary analysis products and to conduct daily analyses of correlations between and within malware families on a large scale. This work can then be applied to concrete cases such as Babuk, Ryuk and Conti. We have been able to highlight links for these families by immediately identifying correlations, with additional manual analysis then confirming the genealogy of the samples precisely.

Introducing the boundary-aware loss for deep image segmentation

Minh Ôn Vũ Ngọc · Yizi Chen · Nicolas Boutry · Joseph Chazalon · Edwin Carlinet · Jonathan Fabrizio · Clément Mallet · Thierry Géraud

Most contemporary supervised image segmentation methods do not preserve the initial topology of the given input (such as the closedness of the contours). When the binary prediction is compared with the ground truth, one generally observes that edge points have been inserted or removed. This can be critical when accurate localization of multiple interconnected objects is required. In this paper, we present a new loss function, called Boundary-Aware loss (BALoss), based on the Minimum Barrier Distance (MBD) cut algorithm. It is able to locate what we call the leakage pixels and to encode the boundary information coming from the given ground truth. Thanks to this adapted loss, we are able to significantly refine the quality of the predicted boundaries during the learning procedure. Furthermore, our loss function is differentiable and can be applied to any kind of neural network used in image processing. We apply this loss function to the standard U-Net and DC U-Net on Electron Microscopy datasets, which are well known to be challenging due to their high noise level and to the close or even connected objects covering the image space. Our segmentation performance, in terms of Variation of Information (VOI) and Adapted Rand Index (ARI), is very promising and leads to $\approx 15\%$ better VOI scores and $\approx 5\%$ better ARI scores than the state of the art. The code of the Boundary-Aware loss is freely available at https://github.com/onvungocminh/MBD_BAL
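As a reminder of what the first evaluation metric measures, the following sketch computes the Variation of Information between two labelings in its usual form, $VI(A, B) = 2H(A, B) - H(A) - H(B)$. It assumes that both segmentations are given as flat sequences of per-pixel labels and uses natural logarithms; the paper's exact evaluation protocol (log base, any split/merge decomposition) is not stated here, so this is only an illustration.

```python
from collections import Counter
from math import log
from typing import Hashable, Iterable, Sequence


def entropy(counts: Iterable[int], n: int) -> float:
    """Shannon entropy (in nats) of a distribution given by raw counts."""
    return -sum((c / n) * log(c / n) for c in counts)


def variation_of_information(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """VI(A, B) = 2 H(A, B) - H(A) - H(B); it is 0 for identical partitions."""
    if len(a) != len(b):
        raise ValueError("both labelings must cover the same pixels")
    n = len(a)
    h_a = entropy(Counter(a).values(), n)
    h_b = entropy(Counter(b).values(), n)
    h_ab = entropy(Counter(zip(a, b)).values(), n)  # joint entropy
    return 2.0 * h_ab - h_a - h_b


# Same partition with different label names: distance 0.
vi_same = variation_of_information([0, 0, 1, 1], [5, 5, 7, 7])
# Two independent two-way splits of four pixels.
vi_independent = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Lower values are better: the metric is zero when the two segmentations define the same regions, regardless of how the regions are named.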

Evaluation of anomaly detection for cybersecurity using inductive node embedding with convolutional graph neural networks

Amina Abou Rida · Pierre Parrend · Rabih Amhaz

In the face of continuous cyberattacks, many scientists have proposed machine learning-based network anomaly detection methods. While deep learning effectively captures unseen patterns in Euclidean data, there is a huge number of applications where data are described in the form of graphs. Graph analysis has improved anomaly detection in non-Euclidean domains, but it suffers from a high computational cost. Graph embeddings have solved this problem by converting each node in the network into a low-dimensional representation, but they lack the ability to generalize to unseen nodes. Graph convolutional neural network methods solve this problem through inductive node embedding (inductive GNN). Inductive GNN shows better performance in detecting anomalies, with less complexity, than graph analysis and graph embedding methods.
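The idea of inductive node embedding can be illustrated with a single mean-aggregation layer in the style of GraphSAGE. This is an assumption made for illustration: the abstract does not specify the model used. The weight matrices `w_self` and `w_neigh`, the toy graph and the node features are all hypothetical. The point is that the embedding depends only on shared weights and local features, so a node added after training can be embedded without retraining.

```python
from typing import Dict, List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, x: Sequence[float]) -> Vector:
    """Multiply matrix w by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def embed_node(
    node: str,
    features: Dict[str, Vector],
    neighbors: Dict[str, List[str]],
    w_self: Matrix,
    w_neigh: Matrix,
) -> Vector:
    """One mean-aggregation layer: ReLU(w_self * x_v + w_neigh * mean of neighbor features)."""
    x = features[node]
    nbrs = neighbors.get(node, [])
    if nbrs:
        mean = [sum(features[u][d] for u in nbrs) / len(nbrs) for d in range(len(x))]
    else:
        mean = [0.0] * len(x)  # isolated node: no neighborhood signal
    z = [s + m for s, m in zip(matvec(w_self, x), matvec(w_neigh, mean))]
    return [max(0.0, v) for v in z]


# Hypothetical "trained" weights and a small graph.
w_self = [[1.0, 0.0], [0.0, 1.0]]
w_neigh = [[0.5, 0.0], [0.0, 0.5]]
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
emb_a = embed_node("a", features, neighbors, w_self, w_neigh)

# A node unseen during training is embedded with the same weights.
features["d"] = [2.0, 2.0]
neighbors["d"] = ["a"]
emb_d = embed_node("d", features, neighbors, w_self, w_neigh)
```

By contrast, a transductive embedding stores one learned vector per known node and has nothing to look up for `"d"`.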

A secure blockchain-based architecture for the COVID-19 data network

Darine Al-Mohtar · Amani Ramzi Daou · Nour El Madhoun · Rachad Maallawi

The COVID-19 pandemic has impacted the world economy, particularly all activities where social distancing cannot be respected. In order to control this pandemic, screening tests such as PCR have become essential. For example, before a trip, a traveler must take a PCR test within 72 hours before departure and, if the result is negative, may travel by presenting the negative result sheet to the agent during check-in and boarding. The agent then verifies the presented sheet by trusting (a) the medical biology laboratory and (b) the traveler not to have changed the PCR result from "positive" to "negative". This trust and this verification therefore do not rely on any security or integrity mechanism, despite the great importance of PCR test results in controlling the COVID-19 pandemic. Consequently, we propose in this paper a blockchain-based decentralized trust architecture that aims to guarantee the integrity, immutability and traceability of COVID-19 test results. Our proposal also aims to ensure the interconnection between several organizations (airports, medical laboratories, cinemas, etc.) so that they can access COVID-19 test results in a secure and decentralized manner.
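The integrity property targeted by such an architecture can be illustrated with a minimal hash chain, in which each block commits to the previous block's hash so that altering a stored result is detectable. This sketch is not the architecture proposed in the paper: it has no consensus, signatures or network layer, and the block layout, the field names and the example payloads are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List

Block = Dict[str, Any]
GENESIS_HASH = "0" * 64  # placeholder predecessor of the first block


def block_hash(index: int, prev_hash: str, payload: Dict[str, Any]) -> str:
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain: List[Block], payload: Dict[str, Any]) -> Block:
    """Append a block that commits to the hash of the previous block."""
    index = len(chain)
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    }
    chain.append(block)
    return block


def is_valid(chain: List[Block]) -> bool:
    """Recompute every hash and check that each block links to its predecessor."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


chain: List[Block] = []
append_block(chain, {"test_id": "T-001", "lab": "LAB-A", "result": "positive"})
append_block(chain, {"test_id": "T-002", "lab": "LAB-B", "result": "negative"})
valid_before = is_valid(chain)

# Simulate a traveler changing a stored result after the fact.
chain[0]["payload"]["result"] = "negative"
valid_after = is_valid(chain)
```

After the tampering step, validation fails because the recomputed hash of the first block no longer matches the stored one, which is the behaviour a verifier at check-in would rely on.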