Conference paper, Year: 2015

Extracting Disease-Symptom Relationships by Learning Syntactic Patterns from Dependency Graphs

Abstract

Disease-symptom relationships are of primary importance for biomedical informatics, but the databases that catalog them are incomplete compared with the state of the art available in the scientific literature. In this paper we propose a novel method for automatically extracting disease-symptom relationships from text, called SPARE (Syntactic PAttern for Relationship Extraction). The method consists of three successive steps: first, we learn patterns from dependency graphs; second, we select the best patterns based on their quality and specificity (their ability to identify only disease-symptom relationships); finally, the selected patterns are applied to new texts to extract disease-symptom relationships. We evaluated SPARE on a corpus of 121,796 PubMed abstracts related to 457 rare diseases. The quality of the extraction was evaluated as a function of pattern quality and specificity. The best F-measure obtained is 55.65% (for specificity ≥ 0.5 and quality ≥ 0.5). To give insight into the novelty of the extracted disease-symptom relationships, we compare our results with the content of phenotype databases (OrphaData and OMIM). Our results show the feasibility of automatically extracting disease-symptom relationships, including true relationships that were not already referenced in phenotype databases and that may involve complex symptom descriptions.
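
As a rough illustration of the pattern-selection step and of the evaluation metric, the sketch below filters candidate patterns by quality and specificity thresholds and computes an F-measure from precision and recall. The `Pattern` class, its fields, the example dependency paths, and all scores are hypothetical. The abstract does not say how quality and specificity are computed, so they are treated here as precomputed scores in [0, 1], and the F-measure is assumed to be the balanced F1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    """Hypothetical container for a learned syntactic pattern."""
    path: str           # illustrative dependency path, not taken from the paper
    quality: float      # assumed precomputed score in [0, 1]
    specificity: float  # assumed precomputed score in [0, 1]


def select_patterns(patterns, min_quality=0.5, min_specificity=0.5):
    """Keep patterns whose quality and specificity both reach the thresholds."""
    return [
        p for p in patterns
        if p.quality >= min_quality and p.specificity >= min_specificity
    ]


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up candidates, for illustration only.
candidates = [
    Pattern("DISEASE <-nsubj- cause -dobj-> SYMPTOM", 0.8, 0.7),
    Pattern("DISEASE <-nmod- patient -nsubj-> SYMPTOM", 0.6, 0.3),
    Pattern("SYMPTOM <-nsubj- occur -nmod-> DISEASE", 0.4, 0.9),
]
selected = select_patterns(candidates)
```

With both thresholds at 0.5, the setting for which the abstract reports its best F-measure, only the first made-up candidate passes: the second fails on specificity and the third on quality.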
Main file
hassan_et_al_bionlp2015.pdf (293.64 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01184655, version 1 (17-08-2015)

Identifiers

  • HAL Id: hal-01184655, version 1

Cite

Mohsen Hassan, Olfa Makkaoui, Adrien Coulet, Yannick Toussaint. Extracting Disease-Symptom Relationships by Learning Syntactic Patterns from Dependency Graphs. BioNLP 15, Association for Computational Linguistics, Jul 2015, Beijing, China. pp.184. ⟨hal-01184655⟩
