Université Toulouse III - Paul Sabatier - Toulouse INP
Journal article. European Conference on Information Retrieval (ECIR). Year: 2011

Combining global and local semantic contexts for improving biomedical information retrieval

Duy Dinh
  • Role: Author
  • PersonId : 888518

Abstract

In the context of biomedical information retrieval (IR), this paper explores the relationship between the document's global context and the query's local context in an attempt to overcome the term mismatch problem between the user query and documents in the collection. Most solutions to this problem have focused on expanding the query by discovering its context, either global or local. In a global strategy, all documents in the collection are used to examine word occurrences and relationships in the corpus as a whole, and this information is used to expand the original query. In a local strategy, the top-ranked documents retrieved for a given query are examined to determine terms for query expansion. We propose to combine the document's global context and the query's local context to increase the term overlap between the user query and documents in the collection via document expansion (DE) and query expansion (QE). The DE technique is based on a statistical (IR-based) method to extract the most appropriate concepts (global context) from each document. The QE technique is based on a blind feedback approach using the top-ranked documents (local context) obtained in the first retrieval stage. A comparative experiment on the TREC 2004 Genomics collection demonstrates that the combination of the document's global context and the query's local context yields a significant improvement over the baseline: MAP rises from 0.4097 to 0.4532, an improvement of +10.62%. The IR performance of the combined method in terms of MAP is also superior to that of the official runs that participated in TREC 2004 Genomics and is comparable to the performance of the best run (0.4075).
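To make the two expansion steps concrete, here is a minimal Python sketch of how document expansion and blind-feedback query expansion might fit together. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the weighting model, the concept resource, or the term-selection formula, so the TF-IDF scoring, the toy `concepts` dictionary, and the function names `rank`, `expand_document`, and `expand_query` are all hypothetical stand-ins.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def rank(query_terms, docs):
    """Rank document indices by a simple TF-IDF sum.

    Assumption: this stands in for whatever retrieval model is actually used.
    """
    n = len(docs)
    tokenized = [tokenize(d) for d in docs]
    df = Counter(t for toks in tokenized for t in set(toks))
    scores = []
    for i, toks in enumerate(tokenized):
        tf = Counter(toks)
        score = sum(tf[t] * math.log(1 + n / df[t]) for t in query_terms if t in tf)
        scores.append((score, i))
    # Highest score first; ties broken by document index for determinism.
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]


def expand_document(doc, concepts, n_concepts=1):
    """Document expansion (global context): append the names of the concepts
    whose entry terms overlap most with the document's own terms."""
    toks = set(tokenize(doc))
    scored = sorted(
        ((len(toks & set(terms)), name) for name, terms in concepts.items()),
        key=lambda s: (-s[0], s[1]),
    )
    best = [name for overlap, name in scored[:n_concepts] if overlap > 0]
    return doc + " " + " ".join(best) if best else doc


def expand_query(query_terms, docs, top_k=2, n_terms=2):
    """Query expansion (local context): blind feedback that adds the most
    frequent new terms found in the top_k documents of a first retrieval."""
    top = rank(query_terms, docs)[:top_k]
    counts = Counter()
    for i in top:
        counts.update(t for t in tokenize(docs[i]) if t not in query_terms)
    # Most frequent first; ties broken alphabetically for determinism.
    ordered = sorted(counts.items(), key=lambda c: (-c[1], c[0]))
    return list(query_terms) + [t for t, _ in ordered[:n_terms]]


# Toy data, invented purely for illustration.
docs = [
    "brca1 mutation breast cancer risk",
    "breast carcinoma tumor suppressor brca1",
    "yeast cell cycle regulation",
]
concepts = {
    "neoplasms": ["cancer", "carcinoma", "tumor"],
    "saccharomyces": ["yeast"],
}

expanded_docs = [expand_document(d, concepts) for d in docs]
expanded_query = expand_query(["brca1", "cancer"], expanded_docs)
print(expanded_query)  # ['brca1', 'cancer', 'breast', 'neoplasms']
```

In this toy run, the concept name added to the documents during expansion is picked up by the feedback step and ends up in the expanded query, which is the kind of increased term overlap the abstract describes. A final retrieval pass would then call `rank(expanded_query, expanded_docs)`.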
Main file
ECIR-2011-dinh-tamine.pdf (225.49 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00588336, version 1 (22-04-2011)

Cite

Duy Dinh, Lynda Tamine. Combining global and local semantic contexts for improving biomedical information retrieval. European Conference on Information Retrieval (ECIR), 2011, Lecture Notes in Computer Science, 6611, pp.375-386. ⟨10.1007/978-3-642-20161-5_38⟩. ⟨hal-00588336⟩