LRE - Laboratoire de Recherche de l'ÉPITA

How to compute the convex hull of a binary shape? A real-time algorithm to compute the convex hull of a binary shape

In this article, we present an algorithm to compute the convex hull of a binary shape. Efficient algorithms to compute the convex hull of a set of points had been proposed long time ago. For a binary shape, the common practice is to rely on one of them: to compute the convex hull of binary shape, all pixels of the shape are first listed, and then the convex hull is computed on this list of points. The computed convex hull is finally rasterized to provide the final result as, for example, in the famous scikit-image library. To compute the convex hull of an arbitrary set of points, the points of the list that lie on the outline of the convex hull must be selected (to simplify, we call these points "extrema"). To find them, for an arbitrary set of points, it is necessary to browse all the points but not in the particular case of a binary shape. In this specific situation, the extrema necessarily belong to the inner boundary of the shape. It is a waste of time to browse all the pixels as it is possible to discard most of them when we search for these extrema. Based on this analysis, we propose a new method to compute the convex hull dedicated to binary shapes. This method browses as few pixels as possible to select a small subset of boundary pixels. Then it deduces the convex hull only from this subset. As the size of the subset is very small, the convex hull is computed in real time. We compare it with the commonly used methods and common functions from libraries to prove that our approach is faster. This comparison shows that, for a very small shape, the difference is acceptable, but when the area of the shape grows, this difference becomes significant. This leads us to conclude that substituting current functions to compute convex hull of binary shapes with our algorithm in frequently used libraries would lead to a great improvement.

Learning sentinel-2 reflectance dynamics for data-driven assimilation and forecasting

Anthony Frion · Lucas Drumetz · Guillaume Tochon · Mauro Dalla Mura · Abdeldjalil Aïssa El Bey

Over the last few years, massive amounts of satellite multispectral and hyperspectral images covering the Earth’s surface have been made publicly available for scientific purpose, for example through the European Copernicus project. Simultaneously, the development of self-supervised learning (SSL) methods has sparked great interest in the remote sensing community, enabling to learn latent representations from unlabeled data to help treating downstream tasks for which there is few annotated examples, such as interpolation, forecasting or unmixing. Following this line, we train a deep learning model inspired from the Koopman operator theory to model long-term reflectance dynamics in an unsupervised way. We show that this trained model, being differentiable, can be used as a prior for data assimilation in a straightforward way. Our datasets, which are composed of Sentinel-2 multispectral image time series, are publicly released with several levels of treatment.

Open Access to Data about Silk Heritage: A Case Study in Digital Information Sustainability

Jorge Sebastián Lozano · Ester Alba Pagán · Eliseo Martínez Roig · Mar Gaitán Salvatella · Arabella León Muñoz · Javier Sevilla Peris · Pierre Vernus · Marie Puren · Luis Rei · Dunja Mladenič

This article builds on work conducted and lessons learned within SILKNOW, a research project that aimed at enhancing the preservation and digital dissemination of silk heritage. Taking the project and this heritage typology as a case study in the digital transformation of cultural heritage institutions, it illustrates specific challenges that these institutions must face and demonstrates a few innovative answers to meet those challenges. The methodology combines approaches typical of the humanities and others usual in ICT, being inductive regarding materials and methods (consisting of a detailed review of existing online repositories and research projects devoted to textile heritage) and descriptive for the results and discussion (which explain at length the development of some tools and resources that responded to the needs detected in the previous analysis). The article reports on the state of the art and recent developments in the field of textile heritage, the tools implemented to allow the semantic access and text analysis of descriptive records associated with silk fabrics, and the spatiotemporal visualization of that information. Finally, it argues that institutional policies, namely the creation and free dissemination of open data related to cultural heritage are just as important as technical developments, showing why any future effort in these areas should take data sustainability, both in its technical and in institutional aspects, into account, since it is the most responsible and reasonable approach in terms of efficient resource allocation.

Interactive and real-time typesetting for demonstration and experimentation: <span style="font-variant:small-caps;">ETAP</span>

Didier Verna

In general, typesetting experimentation is not a very practical thing to do. WYSIWYG typesetting systems are very reactive but do not offer highly configurable algorithms, and TeX, with its separate development / compilation / visualization phases, is not as interactive as its WYSIWYG competitors. Being able to experiment with typesetting algorithms interactively and in real-time is nevertheless desirable, for instance for demonstration purposes, or for rapid prototyping and debugging of new ideas. We present ETAP (Experimental Typesetting Algorithms Platform), a tool written to ease typesetting experimentation and demonstration. ETAP currently provides several paragraph justification algorithms, all with many configuration options such as kerning, ligatures, flexible spaces, sloppiness, hyphenation, etc. The resulting paragraph is displayed with many visual hints as well, such as paragraph, character, and line boxes, baselines, over/underfullness hints, hyphenation clues, etc. All these parameters, along with the desired paragraph width, are adjustable interactively through a GUI, and the resulting paragraph is displayed and updated in real-time. But ETAP can also be used without, or in conjunction with the GUI, as a scriptable application. In particular, it is able to generate all sorts of statistical reports or charts on the behavior of the various algorithms, for instance, the number of over/underfull boxes per paragraph width, the average compression or stretch ratio per line, whatever else you want. This allows you to quickly demonstrate or evaluate the comparative behavior or merits of the provided algorithms, or whichever you may want to add to the pool.

Linear object detection in document images using multiple object tracking

Philippe Bernet · Joseph Chazalon · Edwin Carlinet · Alexandre Bourquelot · Puybareau

Linear objects convey substantial information about document structure, but are challenging to detect accurately because of degradation (curved, erased) or decoration (doubled, dashed). Many approaches can recover some vector representation, but only one closed-source technique introduced in 1994, based on Kalman filters (a particular case of Multiple Object Tracking algorithm), can perform a pixel-accurate instance segmentation of linear objects and enable to selectively remove them from the original image. We aim at re-popularizing this approach and propose: 1. a framework for accurate instance segmentation of linear objects in document images using Multiple Object Tracking (MOT); 2. document image datasets and metrics which enable both vector- and pixel-based evaluation of linear object detection; 3. performance measures of MOT approaches against modern segment detectors; 4. performance measures of various tracking strategies, exhibiting alternatives to the original Kalman filters approach; and 5. an open-source implementation of a detector which can discriminate instances of curved, erased, dashed, intersecting and/or overlapping linear objects.

Discrete Morse functions and watersheds

Gilles Bertrand · Nicolas Boutry · Laurent Najman

Any watershed, when defined on a stack on a normal pseudomanifold of dimension $d$, is a pure $(d-1)$-subcomplex that satisfies a drop-of-water principle. In this paper, we introduce Morse stacks, a class of functions that are equivalent to discrete Morse functions. We show that the watershed of a Morse stack on a normal pseudomanifold is uniquely defined, and can be obtained with a linear-time algorithm relying on a sequence of collapses. Last, we prove that such a watershed is the cut of the unique minimum spanning forest, rooted in the minima of the Morse stack, of the facet graph of the pseudomanifold.

Introducing PC $n$-manifolds and $P$-well-composedness in partially ordered sets

Nicolas Boutry

In discrete topology, discrete surfaces are well-known for their strong topological and regularity properties. Their definition is recursive, and checking if a poset is a discrete surface is tractable. Their applications are numerous: when domain unicoherence is ensured, they lead access to the tree of shapes, and then to filtering in the shape space (shapings); they also lead to Laplacian zero-crossing extraction, to brain tumor segmentation, and many other applications related to mathematical morphology. They have many advantages in digital geometry and digital topology since discrete surfaces do not have any pinches (and then the underlying polyhedron of their geometric realization can be parameterized). However, contrary to topological manifolds known in continuous topology, discrete surfaces do not have any boundary, which is not always realizable in practice (finite hyper-rectangles cannot be discrete surfaces due to their non-empty boundary). For this reason, we propose the three following contributions: (1) we introduce a new definition of boundary, called border, based on the definition of discrete surfaces, and which allows us to delimit any partially ordered set whenever it is not embedded in a greater ambient space, (2) we introduce $P$-well-composedness similar to well-composedness in the sense of Alexandrov but based on borders, (3) we propose new (possibly geometrical) structures called (smooth) $n$-PCM’s which represent almost the same regularity as discrete surfaces and that are tractable thanks to their recursive definition, and (4) we prove several fundamental theorems relative to PCM’s and their relations with discrete surfaces. We deeply believe that these new $n$-dimensional structures are promising for the discrete topology and digital geometry fields.

TTProfiler: Types and terms profile building for online cultural heritage knowledge graphs

Lamine Diop · Béatrice Markhoff · Arnaud Soulet

profile extraction

type

predicate

CIDOC CRM

concept

term

visualization

terminology

profile

knowledge graph

As more and more knowledge graphs (KG) are published on the Web, there is a need for tools that show their content. This implies showing the schema-level patterns instantiated in the graph, but also the terms used to qualify its entities. In this article, we present a new profiling tool that we call TTprofiler. It shows the predicates that relate types in the KG, and also the terms present in this KG, because of their paramount importance in most KGs, especially in the Cultural Heritage (CH) domain. We recall the role of terminologies and how they are implemented and used on the Web, we give the algorithm for building a TT profile from an online KG’s Endpoint, and we report on experiments performed over a set of Cultural Heritage Web KGs. A tool for visualizing TT profiles is also provided.

Analyse structurelle de l’influence du bruit sur l’arbre alpha

Baptiste Esteban · Guillaume Tochon · Edwin Carlinet · Didier Verna

L’arbre alpha est une représentation hiérarchique utilisée dans divers traitements d’une image tels que la segmentation ou la simplification. Ces traitements sont néanmoins sensibles au bruit, ce qui nécessite parfois de les adapter. Or, l’influence du bruit sur la structure de l’arbre alpha n’a été que peu étudiée dans la littérature. Ainsi, nous proposons une étude de l’impact du bruit en fonction de son niveau sur la structure de l’arbre. De plus, nous étendons cette étude à la persistance des nœuds de l’arbre en fonction d’une énergie donnée, et nous concluons que certaines fonctionnelles sont plus sensibles au bruit que d’autres.

Assimilation de données variationnelle de séries temporelles d’images sentinel-2 avec un modèle dynamique auto-supervisé

Anthony Frion · Lucas Drumetz · Mauro Dalla Mura · Guillaume Tochon · Abdeldjalil Aïssa-El-Bey

Au cours des dernières années, l’apprentissage profond a acquis une importance croissante dans de nombreux domaines scientifiques, notamment en ce qui concerne le traitement d’images, et en particulier pour le traitement des données issues de satellites. Le paradigme le plus courant en ce qui concerne l’apprentissage profond est l’apprentissage supervisé, qui requiert une grande quantité de données annotées représentant la vérité terrain pour la tâche d’intérêt. Or, obtenir des données correctement annotées pose souvent des difficultés financières ou techniques importantes. Pour cette raison, nous nous plaçons ici dans le cadre de l’apprentissage auto-supervisé. Nous proposons un modèle d’apprentissage profond inspiré de la théorie de l’opérateur de Koopman qui apprend, à partir de séries temporelles d’images multispectrales Sentinel-2, à modéliser les dynamiques de long terme de réflectance des pixels. Après son entraînement, notre modèle peut être utilisé dans divers problèmes inverses faisant intervenir la dynamique temporelle pour résoudre différentes tâches telles que l’interpolation ou le débruitage de données.