An algorithm for cross-lingual sense-clustering tested in a MT evaluation setting

Marianna Ma Apidianaki; Yifan He

Conference paper, Year: 2010

Marianna Ma Apidianaki

Role: Author
PersonId : 20607
IdHAL : marianna-apidianaki

Analyse Linguistique Profonde à Grande Echelle ; Large-scale deep linguistic processing

Yifan He

Role: Author

Centre for Next Generation Localisation

Abstract

Unsupervised sense induction methods offer a solution to the problem of scarcity of semantic resources. These methods automatically extract semantic information from textual data and create resources adapted to specific applications and domains of interest. In this paper, we present a clustering algorithm for cross-lingual sense induction which generates bilingual semantic inventories from parallel corpora. We describe the clustering procedure and the obtained resources. We then proceed to a large-scale evaluation by integrating the resources into a Machine Translation (MT) metric (METEOR). We show that the use of the data-driven sense-cluster inventories leads to better correlation with human judgments of translation quality, compared to precision-based metrics, and to improvements similar to those obtained when a hand-crafted semantic resource is used.

Keywords

sense clustering; word sense induction; Machine Translation evaluation

Domains

Computation and Language [cs.CL]

Main file

Apidianaki_and_He10.pdf (251.44 KB)

Origin: Files produced by the author(s)

Contributor: Marianna Apidianaki

https://hal.science/hal-00544745

Submitted on: Wednesday, December 8, 2010, 19:17:38

Last modified on: Wednesday, October 26, 2022, 17:23:53

Long-term archiving on: Thursday, March 10, 2011, 12:56:41

Dates and versions

hal-00544745 , version 1 (08-12-2010)

Identifiers

HAL Id : hal-00544745 , version 1

Cite

Marianna Ma Apidianaki, Yifan He. An algorithm for cross-lingual sense-clustering tested in a MT evaluation setting. International Workshop on Spoken Language Translation (IWSLT-2010), Dec 2010, Paris, France. pp. 219-226. ⟨hal-00544745⟩

Collections

UNIV-PARIS7 INRIA INRIA2 CAMPUS-AAR AAI

158 views

97 downloads
