Publications

A first step toward a fair comparison of evaluation protocols for text detection algorithms

Aliona Dangla · Élodie Puybareau · Guillaume Tochon · Jonathan Fabrizio

Text detection is an important topic in pattern recognition, but evaluating the reliability of such detection algorithms is challenging. While many evaluation protocols have been developed for that purpose, they often show dissimilar behaviors when applied in the same context. As a consequence, their usage may lead to misinterpretations, potentially yielding erroneous comparisons between detection algorithms or incorrect parameter tuning. This paper is a first attempt to derive a methodology for comparing evaluation protocols. We then apply it to five state-of-the-art protocols and show that inconsistencies indeed exist among their evaluation criteria. Our aim here is not to rank the investigated evaluation protocols, but rather to raise awareness in the community that we should carefully reconsider them in order to converge toward their optimal usage.
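To illustrate the kind of discrepancy at stake, here is a minimal Python sketch (not one of the five protocols studied in the paper, and with made-up boxes) showing that two common matching criteria, IoU thresholding and area recall, can already disagree on the very same detection:

```python
# Toy example: the same detection judged by two different matching criteria.

def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def area_recall(gt, det):
    """Fraction of the ground-truth box covered by the detection."""
    ix1, iy1 = max(gt[0], det[0]), max(gt[1], det[1])
    ix2, iy2 = min(gt[2], det[2]), min(gt[3], det[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    return inter / ((gt[2] - gt[0]) * (gt[3] - gt[1]))

gt  = (0, 0, 100, 20)   # ground-truth text box
det = (0, 0, 100, 40)   # detection fully covering the text, but twice as tall

# A protocol matching on "IoU >= 0.5" barely accepts this detection,
# while a protocol matching on "80% of the ground truth recovered" clearly does.
print("IoU:        ", iou(gt, det))          # 0.5
print("area recall:", area_recall(gt, det))  # 1.0
```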

CDCLSym: Introducing effective symmetry breaking in SAT solving

Hakan Metin · Souheib Baarir · Maximilien Colange · Fabrice Kordon

SAT solvers are now widely used to solve a large variety of problems, including formal verification of systems. SAT problems derived from such applications often exhibit symmetry properties that could be exploited to speed up their solving. Static symmetry breaking is so far the most popular approach to take advantage of symmetries. It relies on a symmetry preprocessor which augments the initial problem with constraints that force the solver to consider only a few configurations among the many symmetric ones. This paper presents a new way to handle symmetries, which avoids the main problem of current static approaches: the prohibitive cost of the preprocessing phase. Extensive experiments on the benchmarks of the last six SAT competitions show that our approach is competitive with the best state-of-the-art static symmetry breaking solutions.
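As a rough illustration of what a static symmetry breaking predicate does, here is a minimal Python sketch (toy CNF, DIMACS-style integer literals, brute-force model enumeration; not the CDCLSym algorithm itself): adding a lex-leader clause for a variable swap removes one assignment of each symmetric pair from the search space.

```python
from itertools import product

def models(clauses, n_vars):
    """Enumerate all satisfying assignments of a CNF by brute force."""
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            yield bits

# (x1 or x2): symmetric under swapping x1 and x2.
cnf = [[1, 2]]

# Lex-leader style breaking predicate for that swap: x1 <= x2, i.e. (-x1 or x2).
# The solver then only explores one representative per symmetric pair.
broken = cnf + [[-1, 2]]

print(list(models(cnf, 2)))     # (F,T), (T,F), (T,T)
print(list(models(broken, 2)))  # (F,T), (T,T)  -- the symmetric twin is gone
```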

Strategies in typecase optimization

Jim Newton · Didier Verna

We contrast two approaches to optimizing the Common Lisp typecase macro expansion. The first approach is based on heuristics intended to estimate the run-time performance of certain type checks involving Common Lisp type specifiers. The technique may, depending on code size, exhaustively search the space of permutations of the type checks, intent on finding the optimal order. With the second technique, we represent a typecase form as a type specifier, encapsulating the side-effecting non-Boolean parts so as to appear compatible with the Common Lisp type algebra operators. The encapsulated expressions are specially handled so that the Common Lisp type algebra functions preserve them, and we can unwrap them after a process of Boolean reduction into efficient Common Lisp code, maintaining the appropriate side effects but eliminating unnecessary type checks. Both approaches allow us to identify unreachable code, test for exhaustiveness of the clauses, and eliminate type checks that are found to be redundant.
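The paper works on Common Lisp; the following Python sketch only mimics the spirit of the first approach, with invented costs and hit probabilities: given per-check estimates, it exhaustively searches clause orderings for the one with the smallest expected cost.

```python
from itertools import permutations

# Invented costs and branch probabilities, purely for illustration.
checks = {                # name -> (estimated check cost, estimated hit probability)
    "fixnum-like": (1.0, 0.70),
    "string":      (2.0, 0.20),
    "sequence":    (5.0, 0.10),
}

def expected_cost(order):
    """Expected total checking cost for one dispatched value, given an order."""
    cost, prob_reached = 0.0, 1.0
    for name in order:
        c, p = checks[name]
        cost += prob_reached * c   # this check runs whenever control reaches it
        prob_reached -= p          # later checks run only if this one missed
    return cost

best = min(permutations(checks), key=expected_cost)
print(best, round(expected_cost(best), 3))   # cheapest / most likely checks first
```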

Method combinators

Didier Verna

In traditional object-oriented languages, the dynamic dispatch algorithm is hardwired: for every polymorphic call, only the most specific method is used. CLOS, the Common Lisp Object System, goes beyond the traditional approach by providing an abstraction known as method combinations: when several methods are applicable, it is possible to select several of them, decide in which order they will be called, and how to combine their results, essentially making the dynamic dispatch algorithm user-programmable. Although a powerful abstraction, method combinations are under-specified in the Common Lisp standard, and the MOP, the Meta-Object Protocol underlying many implementations of CLOS, worsens the situation by either contradicting it or providing unclear protocols. As a consequence, too much freedom is granted to conforming implementations. The exact or intended behavior of method combinations is unclear and not necessarily coherent with the rest of CLOS. In this paper, we provide a detailed analysis of the problems posed by method combinations, the consequences of their lack of proper specification in one particular implementation, and a MOP-based extension called method combinators, aiming at correcting these problems and possibly offering new functionality.
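For readers unfamiliar with CLOS, the following Python sketch is only an analogy of what a method combination provides (it is neither CLOS nor the method combinators proposed in the paper): instead of invoking the single most specific method, all applicable methods along the class hierarchy are collected and their results are combined by a user-chosen operator.

```python
class GenericFunction:
    """Toy generic function with a user-programmable result combination."""

    def __init__(self, combine):
        self.combine = combine          # e.g. sum, list, max, ...
        self.methods = {}               # class -> method

    def register(self, cls):
        def deco(fn):
            self.methods[cls] = fn
            return fn
        return deco

    def __call__(self, obj, *args):
        # All applicable methods, most specific first (following the MRO).
        applicable = [self.methods[c] for c in type(obj).__mro__
                      if c in self.methods]
        return self.combine(m(obj, *args) for m in applicable)

price = GenericFunction(combine=sum)    # analogous to CLOS's '+' combination

class Item: pass
class Taxed(Item): pass

@price.register(Item)
def _(obj): return 100

@price.register(Taxed)
def _(obj): return 20

print(price(Taxed()))   # 120: both applicable methods contribute to the result
```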

Saliency-based detection of identity documents captured by smartphones

Minh Ôn Vũ Ngọc · Jonathan Fabrizio · Thierry Géraud

Smartphones have become an easy and convenient means to acquire documents. In this paper, we focus on the automatic segmentation of identity documents in smartphone photos or videos using visual saliency (VS). VS-based approaches, which pertain to computer vision, have not yet been considered for this particular task. Here we compare different VS methods, and we propose a new VS scheme based on a recent distance from the field of mathematical morphology. We show that the saliency maps we obtain are competitive with state-of-the-art visual saliency methods, and that such approaches are very promising for identity document detection and segmentation, even without taking any prior knowledge about document contents into account. In particular, they can run in real time on smartphones.
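As a point of reference, here is a minimal sketch of a generic saliency-based localisation baseline using OpenCV's spectral-residual saliency (it requires opencv-contrib-python and is not the morphological-distance scheme proposed in the paper; the file name is a placeholder):

```python
import cv2
import numpy as np

image = cv2.imread("photo.jpg")

# 1. Saliency map (spectral residual).
sal = cv2.saliency.StaticSaliencySpectralResidual_create()
ok, saliency_map = sal.computeSaliency(image)
saliency_map = (saliency_map * 255).astype(np.uint8)

# 2. Binarise the map and keep the largest salient region.
_, mask = cv2.threshold(saliency_map, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
doc = max(contours, key=cv2.contourArea)

# 3. Rough document location as a bounding box.
x, y, w, h = cv2.boundingRect(doc)
cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detected.jpg", image)
```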

A tutorial on well-composedness

Nicolas Boutry · Thierry Géraud · Laurent Najman

Due to digitization, usual discrete signals generally present topological paradoxes, such as the connectivity paradoxes of Rosenfeld. To get rid of those paradoxes, and to restore some topological properties to the objects contained in the image, like manifoldness, Latecki proposed a new class of images, called well-composed images, with no topological issues. Furthermore, well-composed images have some other interesting properties: for example, the Euler number is locally computable, boundaries of objects separate background from foreground, the tree of shapes is well-defined, and so on. Last but not least, some recent works in mathematical morphology have shown that very nice practical results can be obtained thanks to well-composed images. Believing in its prime importance in digital topology, we propose this state of the art of well-composedness, summarizing its different flavours, the existing methods to produce well-composed signals, and the various topics that are related to well-composedness.
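In 2D, well-composedness can be checked locally: a binary image is well-composed if and only if no 2×2 block contains one of the two diagonal critical configurations. A minimal NumPy sketch of that test:

```python
import numpy as np

def is_well_composed(img):
    """True iff the 2D binary image contains no 2x2 critical configuration."""
    img = np.asarray(img, dtype=bool)
    a = img[:-1, :-1]; b = img[:-1, 1:]
    c = img[1:,  :-1]; d = img[1:,  1:]
    crit1 = a & d & ~b & ~c        # pattern [[1, 0], [0, 1]]
    crit2 = b & c & ~a & ~d        # pattern [[0, 1], [1, 0]]
    return not (crit1 | crit2).any()

print(is_well_composed([[1, 0],
                        [0, 1]]))   # False: Rosenfeld's connectivity paradox
print(is_well_composed([[1, 1],
                        [0, 1]]))   # True
```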

Lisp, jazz, aikido

Didier Verna

The relation between Science (what we can explain) and Art (what we can’t) has long been acknowledged, and while every science contains an artistic part, every art form also needs a bit of science. Among all scientific disciplines, programming holds a special place for two reasons. First, the artistic part is not only undeniable but also essential. Second, and much like in a purely artistic discipline, the act of programming is driven partly by the notion of aesthetics: the pleasure we have in creating beautiful things. Even though the importance of aesthetics in the act of programming is now unquestioned, more could still be written on the subject. The field called “psychology of programming” focuses on the cognitive aspects of the activity, with the goal of improving the productivity of programmers. While many scientists have emphasized their concern for aesthetics and the impact it has on their activity, few computer scientists have actually written about their thought process while programming. What makes us like or dislike such and such language or paradigm? Why do we shape our programs the way we do? By answering these questions from the angle of aesthetics, we may be able to shed some new light on the art of programming. Starting from the assumption that aesthetics is an inherently transversal dimension, it should be possible for every programmer to find the same aesthetic driving force in every creative activity they undertake, not just programming, and in doing so, get deeper insight into why and how they do things the way they do. On the other hand, because our aesthetic sensitivities are so personal, all we can really do is relate our own experiences and share them with others, in the hope that it will inspire them to do the same. My personal life has been revolving around three major creative activities, of equal importance: programming in Lisp, playing Jazz music, and practicing Aikido. But why so many of them, why such different ones, and why these specifically? By introspecting my personal aesthetic sensitivities, I eventually realized that my tastes in the scientific, artistic, and physical domains are all motivated by the same driving forces, hence unifying Lisp, Jazz, and Aikido as three expressions of a single essence, not so different after all. Lisp, Jazz, and Aikido are governed by a limited set of rules which remain simple and unobtrusive. Conforming to them is a pleasure. Because Lisp, Jazz, and Aikido are inherently introspective disciplines, they also invite you to transgress the rules in order to find your own. Breaking the rules is fun. Finally, if Lisp, Jazz, and Aikido unify so many paradigms, styles, or techniques, it is not by mere accumulation but because they live at the meta-level and let you reinvent them. Working at the meta-level is an enlightening experience. Understand your aesthetic sensitivities and you may gain considerable insight into your own psychology of programming. Mine is perhaps common to most lispers. Perhaps also common to other programming communities, but that is for the reader to decide…

The MIT Lincoln Laboratory / JHU / EPITA-LSE LRE17 system

Fred Richardson · Pedro Torres-Carrasquillo · Jonas Borgstrom · Douglas Sturim · Youngjune Gwon · Jesus Villalba · Jan Trmal · Nanxin Chen · Réda Dehak · Najim Dehak

SmartDoc 2017 video capture: Mobile document acquisition in video mode

Joseph Chazalon · P. Gomez-Krämer · J.-C. Burie · M. Coustaty · S. Eskenazi · M. Luqman · N. Nayef · M. Rusiñol · N. Sidère · J. M. Ogier

As mobile document acquisition using smartphones is getting more and more common, along with the continuous improvement of mobile devices (both in terms of computing power and image quality), we can wonder to what extent mobile phones can replace desktop scanners. Modern applications can cope with perspective distortion and normalize the contrast of a document page captured with a smartphone, and in some cases, like bottle labels or posters, smartphones even have the advantage of allowing the acquisition of non-flat or large documents. However, several cases remain hard to handle, such as reflective documents (identity cards, badges, glossy magazine covers, etc.) or large documents for which some regions require an important amount of detail. This paper introduces the SmartDoc 2017 benchmark (named “SmartDoc Video Capture”), which aims at assessing whether capturing documents using the video mode of a smartphone could solve those issues. The task under evaluation is both a stitching and a reconstruction problem, as the user can move the device over different parts of the document to capture details or try to erase highlights. The material released consists of a dataset, an evaluation method and the associated tool, a sample method, and the tools required to extend the dataset. All the components are released publicly under very permissive licenses, and we particularly cared about maximizing the ease of understanding, use, and improvement.
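To give an idea of the stitching side of the task, here is a very naive baseline sketch using OpenCV's scan stitcher on subsampled video frames (this is not the sample method released with the benchmark; the file name and frame subsampling rate are arbitrary):

```python
import cv2

# Sample frames from the capture video, then stitch them in planar (scan) mode.
cap = cv2.VideoCapture("capture.mp4")
frames, i = [], 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if i % 15 == 0:              # keep roughly two frames per second
        frames.append(frame)
    i += 1
cap.release()

stitcher = cv2.Stitcher_create(cv2.Stitcher_SCANS)   # planar (document) mode
status, reconstruction = stitcher.stitch(frames)
if status == 0:                  # cv2.Stitcher_OK
    cv2.imwrite("reconstruction.jpg", reconstruction)
```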

Extraction of ancient map contents using trees of connected components

Jordan Drapeau · Thierry Géraud · Mickaël Coustaty · Joseph Chazalon · Jean-Christophe Burie · Véronique Eglin · Stéphane Bres

Ancient maps are a historical and cultural heritage widely recognized as a very important source of information, but exploiting such maps is complicated. In this project, we consider the Linguistic Atlas of France (ALF), built between 1902 and 1910. This cartographical heritage provides first-rate data for dialectological research. In this paper, we focus on separating the content into layers, in order to facilitate the extraction, analysis, visualization, and diffusion of the data contained in these ancient linguistic atlases.
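A crude sketch of the layer-separation idea, using a flat connected-component labelling instead of the trees of connected components actually used in the paper (the file name and the size threshold are placeholders):

```python
import numpy as np
from skimage import io, filters, measure, morphology

gray = io.imread("alf_sheet.png", as_gray=True)
ink = gray < filters.threshold_otsu(gray)                  # dark ink on light paper

ink = morphology.remove_small_objects(ink, min_size=20)    # drop specks / noise
labels = measure.label(ink, connectivity=2)

text_layer = np.zeros_like(ink)
line_layer = np.zeros_like(ink)
for region in measure.regionprops(labels):
    # Heuristic: small compact components are phonetic annotations (text),
    # large spanning ones are borders, rivers and department boundaries.
    target = text_layer if region.area < 500 else line_layer
    target[tuple(region.coords.T)] = True

io.imsave("text_layer.png", text_layer.astype(np.uint8) * 255)
io.imsave("line_layer.png", line_layer.astype(np.uint8) * 255)
```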