Seq-to-NSeq model for multi-summary generation

Guillaume Le Berre; Christophe Cerisara

Conference paper, year: 2020


Guillaume Le Berre

Role: Author
PersonId : 747342
IdHAL : guillaume-le-berre

Natural Language Processing : representations, inference and semantics

Christophe Cerisara

Role: Author
PersonId : 2353
IdHAL : christophe-cerisara
IdRef : 102700168

Natural Language Processing : representations, inference and semantics

Abstract

Summaries of texts and documents written by people show high variability, depending on the information the writers want to focus on and on their writing style. Despite recent progress in generative models and controllable text generation, automatic summarization systems are still relatively limited in their capacity both to generate various types of summaries and to capture this variability from a corpus. We propose to address this challenge with a multi-decoder model for abstractive sentence summarization that generates several summaries from a single input text. This model is an extension of a sequence-to-sequence model in which multiple concurrent decoders with shared attention and embeddings are trained to generate different summaries that capture the variability of styles present in the corpus. The full model is trained jointly with an Expectation-Maximization algorithm. A first qualitative analysis of the resulting decoders reveals clusters that tend to be consistent with respect to a given style, e.g., passive vs. active voice. The code and experimental setup are released as open source.
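
To make the training idea in the abstract more concrete, here is a minimal sketch of Expectation-Maximization over several competing "decoders". It is not the authors' implementation: each decoder is replaced by a toy unigram distribution over summary tokens, with no encoder, shared attention, or embeddings, and every name in it (`train`, `e_step`, `m_step`, `smoothing`, the example data) is hypothetical. It only illustrates the alternation between assigning each reference summary to decoders (E-step) and re-estimating each decoder from its assigned summaries (M-step).

```python
import math
import random


def log_likelihood(summary, probs):
    """Log-probability of a token list under one toy decoder (a unigram model)."""
    return sum(math.log(probs[tok]) for tok in summary)


def e_step(summaries, decoders, priors):
    """E-step: posterior responsibility of each decoder for each summary."""
    responsibilities = []
    for summary in summaries:
        logs = [math.log(p) + log_likelihood(summary, d)
                for p, d in zip(priors, decoders)]
        top = max(logs)  # subtract the max for numerical stability
        weights = [math.exp(value - top) for value in logs]
        total = sum(weights)
        responsibilities.append([w / total for w in weights])
    return responsibilities


def m_step(summaries, responsibilities, vocab, n_decoders, smoothing=0.1):
    """M-step: re-estimate decoders and mixture priors from soft assignments."""
    decoders = []
    for k in range(n_decoders):
        # Additive smoothing (an assumption) keeps every probability non-zero.
        counts = {tok: smoothing for tok in vocab}
        for summary, resp in zip(summaries, responsibilities):
            for tok in summary:
                counts[tok] += resp[k]
        total = sum(counts.values())
        decoders.append({tok: c / total for tok, c in counts.items()})
    denom = len(summaries) + n_decoders * smoothing
    priors = [(sum(r[k] for r in responsibilities) + smoothing) / denom
              for k in range(n_decoders)]
    return decoders, priors


def train(summaries, n_decoders=2, n_iters=20, seed=0):
    """Alternate E- and M-steps from a random initialisation."""
    rng = random.Random(seed)
    vocab = sorted({tok for s in summaries for tok in s})
    decoders = []
    for _ in range(n_decoders):
        # Random start so the decoders do not stay identical.
        raw = {tok: rng.random() + 0.5 for tok in vocab}
        total = sum(raw.values())
        decoders.append({tok: v / total for tok, v in raw.items()})
    priors = [1.0 / n_decoders] * n_decoders
    for _ in range(n_iters):
        resp = e_step(summaries, decoders, priors)
        decoders, priors = m_step(summaries, resp, vocab, n_decoders)
    return decoders, priors, e_step(summaries, decoders, priors)


# Hypothetical reference summaries in two loosely different styles.
toy_summaries = [
    ["deal", "was", "signed", "by", "ministers"],
    ["plan", "was", "approved", "by", "council"],
    ["ministers", "sign", "deal"],
    ["council", "approves", "plan"],
]
decoders, priors, resp = train(toy_summaries, n_decoders=2)
for summary, r in zip(toy_summaries, resp):
    print(" ".join(summary), [round(x, 3) for x in r])
```

Each printed row shows how strongly each toy decoder claims a summary. Which summaries end up grouped together depends on the random initialisation, so this sketch makes no claim about recovering a particular style such as passive or active voice.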

Domains

Document and Text Processing

Main file

esann.pdf (220.98 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-02902734

Submitted on: Monday, July 20, 2020, 11:41:09

Last modified on: Monday, September 11, 2023, 17:41:18

Long-term archiving on: Tuesday, December 1, 2020, 01:23:45

Dates and versions

hal-02902734 , version 1 (20-07-2020)

Identifiers

HAL Id : hal-02902734 , version 1

Cite

Guillaume Le Berre, Christophe Cerisara. Seq-to-NSeq model for multi-summary generation. ESANN 2020, Oct 2020, Bruges, Belgium. ⟨hal-02902734⟩


Collections

CNRS INRIA UNIV-LORRAINE LORIA LORIA-NLPKD LUE-UL IMPACT-OLKI ANR

