RFreeStem: A multilanguage rule-free stemmer - Université Toulouse III - Paul Sabatier - Toulouse INP
Conference paper, Year: 2019

RFreeStem: A multilanguage rule-free stemmer

Abstract

With the rapid growth of available textual data, text mining has become of special interest. Because such data are unstructured, they require substantial preprocessing. Among these steps, stemming is a widely used method that extracts the root of words. However, the most popular stemming algorithms are based on the application of rules and are therefore highly language-dependent. We propose a new approach, RFreeStem, which is corpus-based rather than rule-based and can therefore be applied to many languages.
Main file
baril_26186.pdf (386.5 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02891675, version 1 (07-07-2020)

Identifiers

  • HAL Id: hal-02891675, version 1
  • OATAO: 26186

Cite

Xavier Baril, Oihana Coustié, Josiane Mothe, Olivier Teste. RFreeStem: A multilanguage rule-free stemmer. 37e Congres Informatique des Organisations et Systemes d'Information et de Decision (INFORSID 2019), Jun 2019, Paris, France. pp.12-29. ⟨hal-02891675⟩
55 views
310 downloads
