LRE - Laboratoire de Recherche de l'ÉPITA

Adaptive test recommendation for mastery learning

Nassim Bouarour · Idir Benouaret · Cédric D’Ham · Sihem Amer-Yahia

We tackle the problem of recommending tests to learners to help them upskill. Our work is grounded in two learning theories: mastery learning, an instructional strategy that guides learners by giving them tests of increasing difficulty, reviewing their test results, and iterating until they reach a level of mastery; and Flow Theory, which identifies four test zones (frustration, learnable, flow, and boredom), to determine the best k tests to recommend to a learner. We formalize the AdUp Problem and develop a multi-objective optimization solution that adapts the difficulty of recommended tests to the learner’s predicted performance, aptitude, and skill gap. We leverage existing models to simulate learner behavior and run experiments demonstrating that our formalization is the most effective at attaining skill mastery. We discuss open research directions, including the applicability of reinforcement learning and the recommendation of peers in collaborative projects.
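As a rough illustration of how several objectives can be folded into a top-k recommendation, the sketch below ranks candidate tests by a weighted sum of three scores. This is a generic scalarization technique and not the AdUp formulation itself: the class `CandidateTest`, the function `recommend_top_k`, the three score fields, and the weights are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateTest:
    """A candidate test with three hypothetical objective scores in [0, 1]."""
    name: str
    performance_fit: float  # assumed: match with the learner's predicted performance
    aptitude_fit: float     # assumed: match with the learner's aptitude
    gap_coverage: float     # assumed: how much of the skill gap the test addresses


def recommend_top_k(tests, k, weights=(0.5, 0.3, 0.2)):
    """Rank tests by a weighted sum of their objective scores and keep the best k."""
    w_perf, w_apt, w_gap = weights

    def score(t):
        return (w_perf * t.performance_fit
                + w_apt * t.aptitude_fit
                + w_gap * t.gap_coverage)

    # Highest combined score first; slicing keeps at most k tests.
    return sorted(tests, key=score, reverse=True)[:k]


# Toy candidates with made-up scores, for illustration only.
candidates = [
    CandidateTest("t1", 0.9, 0.2, 0.1),
    CandidateTest("t2", 0.5, 0.8, 0.6),
    CandidateTest("t3", 0.1, 0.1, 0.9),
]
print([t.name for t in recommend_top_k(candidates, k=2)])
```

A weighted sum is only one way to combine objectives; the weights encode a trade-off that a real system would have to justify or learn.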

Les jeux d’argent dans la population carcérale : Pratiques du jeu, trajectoires de joueurs, problématiques d’addiction (Gambling in the prison population: gambling practices, gamblers' trajectories, addiction issues)

Aymeric Brody

Why is the winner the best?

Matthias Eisenmann · Annika Reinke · Vivienn Weru · Minu D. Tizabi · Fabian Isensee · Tim J. Adler · Sharib Ali · Vincent Andrearczyk · Marc Aubreville · Ujjwal Baid · Spyridon Bakas · Niranjan Balu · Sophia Bano · Jorge Bernal · Sebastian Bodenstedt · Alessandro Casella · Veronika Cheplygina · Marie Daum · Marleen Bruijne · Adrien Depeursinge · Reuben Dorent · Jan Egger · David G. Ellis · Sandy Engelhardt · Melanie Ganz · Noha Ghatwary · Gabriel Girard · Patrick Godau · Anubha Gupta · Lasse Hansen · Kanako Harada · Mattias P. Heinrich · Nicholas Heller · Alessa Hering · Arnaud Huaulmé · Pierre Jannin · Ali Emre Kavur · Oldřich Kodym · Michal Kozubek · Jianning Li · Hongwei Li · Jun Ma · Carlos Martín-Isla · Bjoern Menze · Alison Noble · Valentin Oreiller · Nicolas Padoy · Sarthak Pati · Kelly Payette · Tim Rädsch · Jonathan Rafael-Patiño · Vivek Singh Bawa · Stefanie Speidel · Carole H. Sudre · Kimberlin Wijnen · Martin Wagner · Donglai Wei · Amine Yamlahi · Moi Hoon Yap · Chun Yuan · Maximilian Zenk · Aneeq Zia · David Zimmerer · Dogu Baran Aydogan · Binod Bhattarai · Louise Bloch · Raphael Brüngel · Jihoon Cho · Chanyeol Choi · Qi Dou · Ivan Ezhov · Christoph M. Friedrich · Clifton D. Fuller · Rebati Raman Gaire · Adrian Galdran · Álvaro García Faura · Maria Grammatikopoulou · SeulGi Hong · Mostafa Jahanifar · Ikbeom Jang · Abdolrahim Kadkhodamohammadi · Inha Kang · Florian Kofler · Satoshi Kondo · Hugo Kuijf · Mingxing Li · Minh Luu · Tomaž Martinčič · Pedro Morais · Mohamed A. Naser · Bruno Oliveira · David Owen · Subeen Pang · Jinah Park · Sung-Hong Park · Szymon Plotka · Élodie Puybareau · Nasir Rajpoot · Kanghyun Ryu · Numan Saeed · Adam Shephard · Pengcheng Shi · Dejan Štepec · Ronast Subedi · Guillaume Tochon · Helena R. Torres · Helene Urien · João L. Vilaça · Kareem A. Wahid · Haojie Wang · Jiacheng Wang · Liansheng Wang · Xiyue Wang · Benedikt Wiestler · Marek Wodzinski · Fangfang Xia · Juanying Xie · Zhiwei Xiong · Sen Yang · Yanwu Yang · Zixuan Zhao · Klaus Maier-Hein · Paul F. Jäger · Annette Kopp-Schneider · Lena Maier-Hein

International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multi-center study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses performed based on comprehensive descriptions of the submitted algorithms linked to their rank as well as the underlying participation strategies revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and postprocessing (66%). The "typical" lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly-ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.

L’identification des projets de logiciel libre accessibles aux nouveaux contributeurs (Identifying free software projects that are accessible to new contributors)

Paul Hervot · Benoît Crespin

Collection processing and analysis of learning traces

Usage and practices analysis

FOSS

Mining Software Repositories

Open source barriers to entry

FOSS makes up a growing share of the public and industrial software landscape, notably because of its transparency and democratic governance. However, simply publishing the source code of a piece of software does not automatically make it accessible, and many barriers impede new contributors approaching these projects. Through large-scale mining of the Software Heritage archive, we test the relevance of three signals for identifying FOSS projects that are accessible to new contributors. Our results show a positive correlation between the number of new contributors who successfully bring a contribution to completion and the presence of contributing guidelines, as well as between that same number and the number of recent unique contributors to the project. Such signals could be used in the teaching of FOSS practices, helping teachers select accessible projects for their students.

Clustering en chémoinformatique pour le raffinement de l’activité des molécules (Clustering in chemoinformatics for refining molecular activity)

Maroua Lejmi · Ilef Ben Slima · Bertrand Cuissart · Nidà Meddouri · Ronan Bureau · Alban Lepailleur · Jean-Luc Lamotte · Amel Borgi

In drug design, chemoinformatics uses computational and mathematical methods to analyze chemical and biological data and to identify promising molecules very early in the process. In our setting, we transform molecules so as to retain only their pharmacophoric features (the active part of the molecule). The aim of this work is to refine the activity of the molecules used in the drug design process into activity classes. This will give chemists and pharmacists a better visualization and understanding of molecular activity, and will provide finer-grained data for the later development of a model for predicting molecules of therapeutic interest.

Metrics for community dynamics applied to unsupervised attacks detection

Julien Michel · Pierre Parrend

Features Engineering

Graph community metrics

Scalability

Graph representation

Unsupervised detection approach

Dynamic graphs

Attacks detection

Attack detection in large networks has become a necessity. Yet, with the ever-changing threat landscape and the massive amounts of data to handle, network intrusion detection systems (NIDS) end up becoming obsolete. Different machine-learning-based solutions have been developed to address the detection problem for data with evolving statistical distributions. However, no approach has proved to be both scalable and robust to the passage of time. In this paper, we propose a scalable and unsupervised approach to detect behavioral patterns without prior knowledge of the nature of attacks. For this purpose, we define novel metrics for graph community dynamics and use them as features with an unsupervised detection algorithm on the UGR’16 dataset. The proposed approach improves existing detection algorithms by 285.56% in precision and 222.82% in recall when compared to usual feature extraction (FE) using isolation forest.

CRACS: Compaction of rules in anticipatory classifier systems

Romain Orhand · Pierre Collet · Pierre Parrend · Anne Jeannin-Girardon

artificial evolution

learning classifier systems

rules

Rule Compaction of populations of Learning Classifier Systems (LCS) has always been a topic of interest to get more insights into the discovered underlying patterns from the data or to remove useless classifiers from the populations. However, these techniques have neither been used nor adapted to Anticipatory Learning Classifier Systems (ALCS). ALCS differ from other LCS in that they build models of their environments from which decision policies to solve their learning tasks are learned. We thus propose CRACS (Compaction of Rules in Anticipatory Classifier Systems), a compaction algorithm for ALCS that aims to reduce the size of their environmental models without impairing these models or the ability of these systems to solve their tasks. CRACS relies on filters applied to classifiers and subsumption principles. The capabilities of our compaction algorithm have been studied with three different ALCS on a thorough benchmark of 23 mazes of various levels of environmental uncertainty. The results show that CRACS reduces the size of populations of classifiers while the learned models of environments and the ability of ALCS to solve their tasks are preserved.
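To make the subsumption idea concrete, here is a minimal sketch of subsumption-based compaction on a toy population. It is not the CRACS algorithm: it assumes classifiers with a condition string over `0`, `1` and the wildcard `#`, an action, and an anticipated effect, and it deliberately omits the filters and the reliability and experience checks that a real ALCS would apply before letting one classifier subsume another. The names `Classifier`, `subsumes`, and `compact` are hypothetical.

```python
from dataclasses import dataclass

WILDCARD = "#"  # assumed "don't care" symbol in conditions


@dataclass(frozen=True)
class Classifier:
    """Toy classifier: condition, action, anticipated effect (hypothetical layout)."""
    condition: str
    action: int
    effect: str


def is_more_general(general: str, specific: str) -> bool:
    """True if `general` matches every situation that `specific` matches."""
    if len(general) != len(specific):
        return False
    return all(g == WILDCARD or g == s for g, s in zip(general, specific))


def subsumes(a: Classifier, b: Classifier) -> bool:
    """`a` subsumes `b` if both agree on action and effect and `a` is at least as general."""
    return (a.action == b.action
            and a.effect == b.effect
            and is_more_general(a.condition, b.condition))


def compact(population):
    """Drop every classifier that is subsumed by another one in the population."""
    kept = []
    for cl in population:
        if any(subsumes(k, cl) for k in kept):
            continue  # already covered by a kept classifier
        # The newcomer may itself cover classifiers kept earlier.
        kept = [k for k in kept if not subsumes(cl, k)]
        kept.append(cl)
    return kept


population = [
    Classifier("1#0", 0, "##1"),
    Classifier("110", 0, "##1"),  # covered by "1#0" (same action and effect)
    Classifier("110", 1, "##1"),  # different action, so it is kept
    Classifier("###", 0, "##1"),  # most general, covers "1#0"
]
print([(c.condition, c.action) for c in compact(population)])
```

In this toy version, generality alone decides which classifier survives; checking that the compacted population still models the environment correctly, as the abstract requires, is outside the scope of the sketch.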

Explorer les débats parlementaires français de la troisième république par leurs sujets (Exploring the French parliamentary debates of the Third Republic by topic)

Marie Puren · Aurélien Pellet

This article compares three methods for exploring large corpora of historical documents by topic. We work here on the French parliamentary debates of the Third Republic, which lend themselves particularly well to this kind of analysis. After presenting the context of the study, we report the results obtained with three natural language processing methods applied to texts published between 1876 and 1914: latent Dirichlet allocation, word embeddings, and transfer learning.

Informatique mathématique. Une photographie en 2023 (Mathematical computer science: a snapshot in 2023)

Élodie Puybareau · Samy Blusseau

A systemic mapping of methods and tools for performance analysis of data streaming with containerized microservices architecture

S. Ris · Jean Araujo · David Beserra

With the growth of the Internet of Things (IoT) and of customer expectations, the importance of data streaming and stream processing has increased. Data streaming refers to processing and transmitting data continuously and in real time, without necessarily storing it in a physical location. Personal health monitors and home security systems are examples of data streaming sources. This paper presents a systematic mapping study of the performance analysis of data streaming systems in the context of containerization and microservices. The research aimed to identify the main methods, tools, and techniques used in the last five years to carry out this type of study. The results show that there are still few performance evaluation studies for this niche of systems, and that gaps remain to be filled, such as the lack of analytical modeling and the disregard for the influence of communication protocols.