SEARNN: Training RNNs with global-local losses

Rémi Leblond; Jean-Baptiste Alayrac; Anton Osokin; Simon Lacoste-Julien

Preprint, Working Paper. Year: 2017

Rémi Leblond

Role: Author
PersonId: 1025012

Statistical Machine Learning and Parsimony

Jean-Baptiste Alayrac

Role: Author
PersonId: 6558
IdHAL: jean-baptiste-alayrac
IdRef: 253131529

Université Paris Sciences et Lettres

Anton Osokin

Role: Author

Models of visual object recognition and scene understanding

Simon Lacoste-Julien

Role: Author
PersonId: 1938
IdHAL: simon-lacoste-julien
ORCID: 0000-0001-6485-6180
IdRef: 22557781X

Département d'Informatique et de Recherche Opérationnelle [Montreal]

Abstract

We propose SEARNN, a novel training algorithm for recurrent neural networks (RNNs) inspired by the "learning to search" (L2S) approach to structured prediction. RNNs have been widely successful in structured prediction applications such as machine translation or parsing, and are commonly trained using maximum likelihood estimation (MLE). Unfortunately, this training loss is not always an appropriate surrogate for the test error: by only maximizing the ground truth probability, it fails to exploit the wealth of information offered by structured losses. Further, it introduces discrepancies between training and predicting (such as exposure bias) that may hurt test performance. Instead, SEARNN leverages test-alike search space exploration to introduce global-local losses that are closer to the test error. We demonstrate improved performance over MLE on three different tasks: OCR, spelling correction and text chunking. Finally, we propose a subsampling strategy to enable SEARNN to scale to large vocabulary sizes.

Domains

Optimization and Control [math.OC]; Machine Learning [cs.LG]

https://inria.hal.science/hal-01665263

Submitted on: Friday, December 22, 2017, 13:39:55

Last modified on: Friday, April 19, 2024, 16:18:58

Dates and versions

hal-01665263, version 1 (22-12-2017)

Identifiers

HAL Id: hal-01665263, version 1
arXiv: 1706.04499

Cite

Rémi Leblond, Jean-Baptiste Alayrac, Anton Osokin, Simon Lacoste-Julien. SEARNN: Training RNNs with global-local losses. 2017. ⟨hal-01665263⟩

Collections

ENS-PARIS CNRS INRIA INRIA2 TDS-MACS PSL
