INRIA - Institut National de Recherche en Informatique et en Automatique
Thesis Year: 2010

Modélisation et recherche de graphes visuels : une approche par modèles de langue pour la reconnaissance de scènes

Visual Graph Modeling and Retrieval: A Language Model Approach for Scene Recognition

Abstract

A content-based image retrieval (CBIR) system needs to consider several types of visual features, together with the spatial information among them (i.e., different points of view), to represent images well. This thesis presents a novel approach that extends the language modeling approach from information retrieval to the problem of graph-based image retrieval. Such a versatile graph model is needed to represent the multiple points of view of an image. The graph-based framework is composed of three main stages. The image processing stage extracts regions from the image and computes the numerical feature vectors associated with those regions. The graph modeling stage consists of two main steps. First, extracted image regions that are visually similar are grouped into clusters using an unsupervised learning algorithm, and each cluster is associated with a visual concept. Second, the spatial relations between the visual concepts are generated. Each image is then represented by a visual graph built from a set of visual concepts and a set of spatial relations among them. The graph retrieval stage retrieves images relevant to a new image query. Query graphs are generated in the same way as in the graph modeling stage. Inspired by the language model for text retrieval, we extend this framework to match the query graph against the document graphs in the database. Images are then ranked by the relevance values of the corresponding image graphs. Two instances of the visual graph model have been applied to the problems of scene recognition and robot localization. We performed experiments on two image collections: one containing 3,849 tourist images and another composed of 3,633 images captured by a mobile robot. The results show that the visual graph model outperforms the standard language model and the Support Vector Machine method by more than 10% in accuracy.
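The retrieval stage described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes each visual graph is reduced to two bags of counts (visual concepts and labeled spatial relations), that each bag is scored with a multinomial language model, and that Jelinek-Mercer smoothing against the whole collection is used. The smoothing scheme, the weight `lam`, the function names, and the toy concept and relation labels are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def smoothed_log_likelihood(query_counts, doc_counts, collection_counts, lam=0.5):
    """Log-probability of the query items under a document model.

    The document model is a multinomial estimated from doc_counts and
    interpolated with the collection model (Jelinek-Mercer smoothing,
    an assumption of this sketch).
    """
    doc_total = sum(doc_counts.values())
    coll_total = sum(collection_counts.values())
    score = 0.0
    for item, n in query_counts.items():
        p_doc = doc_counts.get(item, 0) / doc_total if doc_total else 0.0
        p_coll = collection_counts.get(item, 0) / coll_total if coll_total else 0.0
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # The item never occurs in the collection: the query is impossible.
            return float("-inf")
        score += n * math.log(p)
    return score


def rank_graphs(query, documents, lam=0.5):
    """Rank document graphs by the log-likelihood of generating the query graph.

    Each graph is a dict with a 'concepts' Counter and a 'relations' Counter
    keyed by (concept, relation_label, concept) triples.
    """
    coll_concepts, coll_relations = Counter(), Counter()
    for graph in documents.values():
        coll_concepts.update(graph["concepts"])
        coll_relations.update(graph["relations"])

    scores = {}
    for name, graph in documents.items():
        scores[name] = (
            smoothed_log_likelihood(query["concepts"], graph["concepts"], coll_concepts, lam)
            + smoothed_log_likelihood(query["relations"], graph["relations"], coll_relations, lam)
        )
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: concept and relation labels are invented for this example.
documents = {
    "beach": {
        "concepts": Counter({"sky": 3, "sand": 2, "water": 2}),
        "relations": Counter({("sky", "top_of", "water"): 2, ("water", "top_of", "sand"): 1}),
    },
    "street": {
        "concepts": Counter({"sky": 1, "building": 4, "road": 2}),
        "relations": Counter({("sky", "top_of", "building"): 1, ("building", "top_of", "road"): 2}),
    },
}
query = {
    "concepts": Counter({"sky": 2, "water": 1}),
    "relations": Counter({("sky", "top_of", "water"): 1}),
}

ranking = rank_graphs(query, documents)
for name, score in ranking:
    print(name, round(score, 3))
```

In this toy setting the query shares both its concepts and its spatial relation with the "beach" graph, so that graph receives the higher score. Summing the two log-likelihoods treats concepts and relations as independent, which is a simplification of this sketch.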
Main file
pham_phd_thesis.pdf (2.9 MB)

Dates and versions

tel-00996067, version 1 (26-05-2014)

Identifiers

  • HAL Id: tel-00996067, version 1

Cite

Trong-Ton Pham. Modélisation et recherche de graphes visuels : une approche par modèles de langue pour la reconnaissance de scènes. Information Retrieval [cs.IR]. Université de Grenoble, 2010. English. ⟨NNT : ⟩. ⟨tel-00996067⟩
350 views
207 downloads
