Modélisation et recherche de graphes visuels : une approche par modèles de langue pour la reconnaissance de scènes

Trong-Ton Pham

Thesis, Year: 2010

Visual Graph Modeling and Retrieval: A Language Model Approach for Scene Recognition

Author: Trong-Ton Pham

Affiliation: Modélisation et Recherche d’Information Multimédia [Grenoble]

Abstract

Content-based image indexing and retrieval (CBIR) systems need to consider several types of visual features, and the spatial relations among them (i.e., different points of view), to represent images well. This thesis presents a novel approach that extends the language modeling approach from information retrieval to the problem of graph-based image retrieval. Such a versatile graph model is needed to represent the multiple points of view of images. The graph-based framework is composed of three main stages. The image processing stage extracts image regions from the image and computes the numerical feature vectors associated with them. The graph modeling stage consists of two main steps. First, extracted image regions that are visually similar are grouped into clusters using an unsupervised learning algorithm, and each cluster is associated with a visual concept. Second, the spatial relations between the visual concepts are generated. Each image is then represented by a visual graph built from a set of visual concepts and a set of spatial relations among them. The graph retrieval stage retrieves images relevant to a new image query. Query graphs are generated following the graph modeling stage. Inspired by the language model for text retrieval, we extend this framework to match the query graph with the document graphs from the database. Images are then ranked based on the relevance values of the corresponding image graphs. Two instances of the visual graph model have been applied to the problems of scene recognition and robot localization. We performed experiments on two image collections: one containing 3,849 tourist images and another composed of 3,633 images captured by a mobile robot. The results show that the visual graph model outperforms the standard language model and the Support Vector Machine method by more than 10% in accuracy.
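The retrieval stage described in the abstract can be illustrated with a small sketch. It is not the thesis's implementation: the abstract does not give the scoring formula, so the code assumes a multinomial query-likelihood model with Jelinek-Mercer smoothing, applied separately to concept counts and to relation counts. The function names (`make_graph`, `smoothed_log_likelihood`, `rank`), the parameter `lam`, and the toy concepts and relations are all hypothetical.

```python
import math
from collections import Counter


def make_graph(concepts, relations):
    """A toy visual graph: counts of visual concepts and of labeled relations."""
    return {"concepts": Counter(concepts), "relations": Counter(relations)}


def smoothed_log_likelihood(query_counts, doc_counts, coll_counts, lam=0.5):
    """Log-likelihood of the query items under a document model.

    Assumption: Jelinek-Mercer smoothing, mixing the document estimate
    with a collection-wide estimate using weight `lam`.
    """
    doc_total = sum(doc_counts.values())
    coll_total = sum(coll_counts.values())
    score = 0.0
    for item, n in query_counts.items():
        p_doc = doc_counts.get(item, 0) / doc_total if doc_total else 0.0
        p_coll = coll_counts.get(item, 0) / coll_total if coll_total else 0.0
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # Item unseen in the whole collection: the query cannot be generated.
            return float("-inf")
        score += n * math.log(p)
    return score


def rank(query, docs, lam=0.5):
    """Rank document graphs by concept likelihood plus relation likelihood."""
    coll_concepts, coll_relations = Counter(), Counter()
    for graph in docs.values():
        coll_concepts.update(graph["concepts"])
        coll_relations.update(graph["relations"])
    scores = {
        name: smoothed_log_likelihood(
            query["concepts"], graph["concepts"], coll_concepts, lam
        )
        + smoothed_log_likelihood(
            query["relations"], graph["relations"], coll_relations, lam
        )
        for name, graph in docs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy collection: concept labels stand in for cluster identifiers.
docs = {
    "beach": make_graph(
        ["sky", "sand", "sea", "sky"],
        [("sky", "above", "sea"), ("sea", "above", "sand")],
    ),
    "street": make_graph(
        ["building", "road", "sky"],
        [("sky", "above", "building"), ("building", "above", "road")],
    ),
}
query = make_graph(["sky", "sea"], [("sky", "above", "sea")])

print(rank(query, docs))
```

In this toy collection the "beach" graph ranks first, since it shares both the query's concepts and its spatial relation. Scoring concepts and relations as two independent factors is a simplification chosen for the sketch; a model that conditions relations on their endpoint concepts would be a natural alternative.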

Keywords

Information retrieval; image indexing; image retrieval

Domains

Information Retrieval [cs.IR]

Main file

pham_phd_thesis.pdf (2.9 MB)

Contributor: Philippe Mulhem

https://theses.hal.science/tel-00996067

Submitted on: Monday, May 26, 2014, 11:21:46

Last modified on: Thursday, April 4, 2024, 21:18:40

Long-term archiving on: Tuesday, August 26, 2014, 10:47:03

Dates and versions

tel-00996067, version 1 (26-05-2014)

Identifiers

HAL Id: tel-00996067, version 1

Cite

Trong-Ton Pham. Modélisation et recherche de graphes visuels : une approche par modèles de langue pour la reconnaissance de scènes. Information Retrieval [cs.IR]. Université de Grenoble, 2010. English. ⟨NNT : ⟩. ⟨tel-00996067⟩

Collections

UGA CNRS INRIA LIG LIG_TDCGE LIG_TDCGE_MRIM LIG_SIDCH

350 views

207 downloads
