Challenging the Semi-Supervised VAE Framework for Text Classification

Ghazi Felhi; Joseph Le Roux; Djamé Seddah

doi:10.18653/v1/2021.insights-1.19

Conference paper, Year: 2021

Ghazi Felhi

Role: Author
PersonId: 753891
IdHAL: ghazi-felhi
ORCID: 0000-0002-8657-4640

Laboratoire d'Informatique de Paris-Nord

Joseph Le Roux

Role: Author

Laboratoire d'Informatique de Paris-Nord

Djamé Seddah

Role: Author
PersonId: 11545
IdHAL: djameseddah
IdRef: 086185136

Automatic Language Modelling and ANAlysis & Computational Humanities

Abstract

Semi-Supervised Variational Autoencoders (SSVAEs) are widely used models for data-efficient learning. In this paper, we question the adequacy of the standard design of sequence SSVAEs for text classification: we exhibit two sources of overcomplexity and provide a simplification for each. The simplifications preserve the theoretical soundness of SSVAEs while providing a number of practical advantages in the semi-supervised setup, where the result of training is a text classifier. They are the removal of (i) the Kullback-Leibler divergence from the objective and (ii) the fully unobserved latent variable from the probabilistic model. These changes relieve users from choosing a prior for their latent variables, make the model smaller and faster, and allow for a better flow of information into the latent variables. We compare the simplified versions to standard SSVAEs on four text classification tasks. In addition to the advantages above, experiments show a 26% speed-up with equivalent classification scores. The code to reproduce our experiments is public.
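The abstract does not spell out the training objective, so the following is only a toy sketch of what dropping the Kullback-Leibler term can look like, not the authors' implementation. It assumes a model whose only latent variable is a discrete label y, a classifier output `q_y` standing for q(y|x), and per-class reconstruction scores `log_px_given_y` standing for log p(x|y). All function names and numbers are hypothetical and chosen for illustration.

```python
import math


def kl_categorical(q, p):
    """KL(q || p) between two categorical distributions given as probability lists."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0.0)


def expected_reconstruction(q_y, log_px_given_y):
    """E_{q(y|x)}[log p(x|y)], computed by summing exactly over the labels."""
    return sum(qi * lp for qi, lp in zip(q_y, log_px_given_y))


def unlabeled_objective(q_y, log_px_given_y, prior_y=None):
    """Objective to maximize for one unlabeled example (assumed toy setup).

    With prior_y: a standard evidence lower bound, reconstruction minus KL.
    Without prior_y: reconstruction only, so no prior has to be chosen.
    """
    reconstruction = expected_reconstruction(q_y, log_px_given_y)
    if prior_y is None:
        return reconstruction
    return reconstruction - kl_categorical(q_y, prior_y)


# Hypothetical values for a two-class problem.
q_y = [0.9, 0.1]                  # classifier output q(y|x)
log_px_given_y = [-10.0, -14.0]   # decoder scores log p(x|y) for each class
uniform_prior = [0.5, 0.5]        # prior needed only by the KL variant

with_kl = unlabeled_objective(q_y, log_px_given_y, uniform_prior)
without_kl = unlabeled_objective(q_y, log_px_given_y)
print(with_kl, without_kl)
```

In this sketch the KL-free variant takes no prior argument at all, which mirrors the abstract's point that users no longer have to choose one; the confident classifier output is also no longer pulled back toward the prior.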

Domains

Document and Text Processing

https://inria.hal.science/hal-03540081

Submitted on: Saturday, January 22, 2022, 23:26:41

Last modified on: Tuesday, October 3, 2023, 17:18:04

Dates and versions

hal-03540081 , version 1 (22-01-2022)

License

Attribution

Identifiers

HAL Id: hal-03540081, version 1
arXiv: 2109.12969
DOI: 10.18653/v1/2021.insights-1.19

Cite

Ghazi Felhi, Joseph Le Roux, Djamé Seddah. Challenging the Semi-Supervised VAE Framework for Text Classification. Second Workshop on Insights from Negative Results in NLP (colocated with EMNLP), Nov 2021, Punta Cana, Dominican Republic. ⟨10.18653/v1/2021.insights-1.19⟩. ⟨hal-03540081⟩

Collections

UNIV-PARIS13 CNRS INRIA LIPN INRIA2 SORBONNE-PARIS-NORD ANR
