PhD defense: “Real-time LoG-based operator for scene text detection” (Nguyễn Đình Công)


Event Details


We are pleased to announce the thesis defense of Nguyễn Đình Công, entitled “Real-time LoG-based operator for scene text detection”. The defense will take place on the morning of Thursday 25 June 2020 at the LIFAT Laboratory / Polytech Tours, but it will not be open to the public. It will be recorded, and the link will be given here: …

The jury will be composed of:
+ Véronique Eglin: Full Professor, Université de Lyon, France
+ Basilis Gatos (Reviewer): Research Director, National Centre for Scientific Research Demokritos, Greece
+ Jean-Christophe Burie (Reviewer): Full Professor, Université de La Rochelle, France
+ Donatello Conte (Thesis director): Associate Professor (HDR), Université de Tours, France
+ Mathieu Delalandre: Associate Professor, Université de Tours, France
+ Pham The Anh: Associate Professor, Hong Duc University, Vietnam

Abstract: In this thesis, a novel real-time Laplacian of Gaussian (RT-LoG) operator is proposed for scene text detection. This operator applies a two-step process for box selection within the spatial and spatial/scale-space domains and kernel decomposition with the box filtering method. Two levels of optimization are provided. The first, within the spatial domain, is obtained by box mutualization. The second, within the spatial/scale-space domains, is performed using a mixed method for box selection. The proposed RT-LoG operator emerges as the top operator for scene text detection, with a balanced trade-off between accuracy and processing time. It is approximately three times faster than the brute-force operator while halving the latency at the same resolution level. We have embedded this operator into a new two-stage system for scene text detection. Within this system, a dedicated keypoint grouping method is proposed, using the spatial/scale-space representation of the RT-LoG operator. The grouping is optimized through a scale-space partitioning strategy. The proposed grouping method is nearly scale- and contrast-invariant and supports a normalization process. A CNN is used in the final stage for text verification. The overall system is competitive with the most accurate systems in the literature while requiring close to two orders of magnitude fewer processing resources.
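
To give a rough idea of why box filtering suits real-time blob detection, the sketch below computes a difference of two box means from an integral image, so each pixel costs a fixed number of lookups whatever the box size. This is only an illustrative stand-in for a LoG-like response, not the RT-LoG operator of the thesis: the function names (`integral_image`, `box_mean`, `box_blob_response`), the two-box kernel, and the box sizes are all assumptions made for this example, and the box selection and mutualization steps described in the abstract are not reproduced.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row and column prepended."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0, dtype=np.float64), axis=1)
    return ii

def box_mean(ii, half):
    """Mean over a (2*half+1) x (2*half+1) window, at every position
    where the window fits inside the image (four lookups per pixel)."""
    k = 2 * half + 1
    total = ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]
    return total / (k * k)

def box_blob_response(img, inner_half, outer_half):
    """Hypothetical two-box approximation of a LoG-like response:
    outer box mean minus inner box mean, positive on dark blobs."""
    if outer_half <= inner_half:
        raise ValueError("outer_half must be larger than inner_half")
    ii = integral_image(img)
    inner = box_mean(ii, inner_half)
    outer = box_mean(ii, outer_half)
    # Crop the inner map so both maps are centred on the same pixels.
    d = outer_half - inner_half
    inner = inner[d:inner.shape[0] - d, d:inner.shape[1] - d]
    return outer - inner

# Usage: a dark 3x3 square on a bright background.
img = np.ones((21, 21))
img[9:12, 9:12] = 0.0
response = box_blob_response(img, inner_half=1, outer_half=3)
peak = np.unravel_index(np.argmax(response), response.shape)
```

The response map is smaller than the input by `outer_half` pixels on each side, so `peak` is expressed in cropped coordinates; adding `outer_half` to each index recovers the position in the original image.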

Keywords: Text detection, Laplacian of Gaussian (LoG), blobs, keypoints, Gaussian filtering, scale-space, stroke model, RT-LoG operator, real-time, predictability.