Conference paper, Year: 2010

An algorithm for cross-lingual sense-clustering tested in a MT evaluation setting

Abstract

Unsupervised sense induction methods offer a solution to the problem of scarcity of semantic resources. These methods automatically extract semantic information from textual data and create resources adapted to specific applications and domains of interest. In this paper, we present a clustering algorithm for cross-lingual sense induction which generates bilingual semantic inventories from parallel corpora. We describe the clustering procedure and the obtained resources. We then proceed to a large-scale evaluation by integrating the resources into a Machine Translation (MT) metric (METEOR). We show that the use of the data-driven sense-cluster inventories leads to better correlation with human judgments of translation quality, compared to precision-based metrics, and to improvements similar to those obtained when a hand-crafted semantic resource is used.
Main file
Apidianaki_and_He10.pdf (251.44 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00544745, version 1 (08-12-2010)

Identifiers

  • HAL Id: hal-00544745, version 1

Cite

Marianna Apidianaki, Yifan He. An algorithm for cross-lingual sense-clustering tested in a MT evaluation setting. International Workshop on Spoken Language Translation (IWSLT-2010), Dec 2010, Paris, France. pp. 219-226. ⟨hal-00544745⟩