Text Recognition in Videos using a Recurrent Connectionist Approach

Khaoula Elagouni; Christophe Garcia; Franck Mamalet; Pascale Sébillot

doi:10.1007/978-3-642-33266-1_22

Conference paper. Year: 2012

Khaoula Elagouni

Role: Author
PersonId: 914801

Orange Labs R&D [Rennes]

Christophe Garcia

Role: Author
PersonId: 3989
IdHAL: christophe-garcia
ORCID: 0000-0001-7997-9837
IdRef: 098256599

Extraction de Caractéristiques et Identification

Franck Mamalet

Role: Author
PersonId: 751026
IdHAL: franck-mamalet

Orange Labs R&D [Rennes]

Pascale Sébillot

Role: Author
PersonId: 21840
IdHAL: pascale-sebillot
ORCID: 0000-0002-5429-4302
IdRef: 075988453

Multimedia content-based indexing

Abstract

Most OCR (Optical Character Recognition) systems developed to recognize texts embedded in multimedia documents segment the text into characters before recognizing them. In this paper, we propose a novel approach that avoids any explicit character segmentation. Using a multi-scale scanning scheme, texts extracted from videos are first represented by sequences of learnt features. The resulting representations are then fed to a recurrent connectionist model specifically designed to take into account dependencies between successive learnt features and to recognize texts. The proposed video OCR, evaluated on a database of TV news videos, achieves very high recognition rates. Experiments also demonstrate that, for our recognition task, learnt feature representations perform better than hand-crafted features.
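
The abstract names two ideas that a small example may make more concrete: scanning a text image at several window scales, and reading out a label sequence without segmenting characters (the keywords mention CTC). The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function names, the scale values, the step size, the rule tying window width to image height, and the convention that class 0 is the CTC blank are all hypothetical choices made for illustration. The ConvNet feature extractor and the LSTM are not modelled here; the decoder simply takes per-step class scores as input.

```python
from typing import List, Sequence, Tuple

BLANK = 0  # assumption: class index 0 is the CTC "blank" label


def multiscale_windows(image_width: int, image_height: int,
                       scales: Sequence[float] = (0.5, 1.0, 1.5),
                       step: int = 4) -> List[List[Tuple[int, int]]]:
    """For each horizontal position, return one (left, right) window per scale.

    Hypothetical rule: window width = scale * image_height. Windows are
    centred on the position and clipped to the image borders.
    """
    positions = []
    for center in range(0, image_width, step):
        windows = []
        for scale in scales:
            half = max(1, round(scale * image_height / 2))
            left = max(0, center - half)
            right = min(image_width, center + half)
            windows.append((left, right))
        positions.append(windows)
    return positions


def ctc_best_path(scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding: argmax per step, merge repeats, drop blanks.

    scores[t][k] is the score of class k at step t; class k >= 1 maps to
    alphabet[k - 1].
    """
    decoded = []
    previous = BLANK
    for step_scores in scores:
        best = max(range(len(step_scores)), key=step_scores.__getitem__)
        # Emit a character only when the label changes and is not blank.
        if best != previous and best != BLANK:
            decoded.append(alphabet[best - 1])
        previous = best
    return "".join(decoded)
```

In this sketch, a blank between two identical labels is what allows a doubled letter to be emitted, which is why no explicit character boundaries are needed at decoding time.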

Keywords

Video text recognition; multi-scale image scanning; ConvNet; LSTM; CTC

Domains

Multimedia [cs.MM]; Document and Text Processing; Signal and Image Processing [eess.SP]

Main file

ICANN.pdf (217.55 KB)

Origin: Files produced by the author(s)

Contributor: Pascale Sébillot

https://hal.science/hal-00753906

Submitted on: Monday, 19 November 2012, 18:53:30

Last modified on: Wednesday, 5 July 2023, 15:28:04

Long-term archiving on: Thursday, 21 February 2013, 11:46:33

Dates and versions

hal-00753906, version 1 (19-11-2012)

Identifiers

HAL Id: hal-00753906, version 1
DOI: 10.1007/978-3-642-33266-1_22

Cite

Khaoula Elagouni, Christophe Garcia, Franck Mamalet, Pascale Sébillot. Text Recognition in Videos using a Recurrent Connectionist Approach. 22nd International Conference on Artificial Neural Networks, ICANN, Sep 2012, Lausanne, Switzerland. pp.172-179, ⟨10.1007/978-3-642-33266-1_22⟩. ⟨hal-00753906⟩

Collections

EC-PARIS UNIV-RENNES1 CNRS INRIA UNIV-LYON1 UNIV-LYON2 INSA-LYON INSA-RENNES EC-LYON IRISA LIRIS IRISA-INSA-R IRISA-D6 INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC LABEXIMU UNIV-RENNES INSA-GROUPE UDL UR1-MATH-NUM

883 views

695 downloads
