Differentiable Dynamic Programming for Structured Prediction and Attention - INRIA - Institut National de Recherche en Informatique et en Automatique Accéder directement au contenu
Communication Dans Un Congrès Année : 2018

Differentiable Dynamic Programming for Structured Prediction and Attention

Résumé

Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, many DP algorithms are non-differentiable, which hampers their use as a layer in neural networks trained by back-propagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows to relax both the optimal value and solution of the original combina-torial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically , we provide a new probabilistic perspective on backpropagating through these DP operators , and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We show-case these instantiations on structured prediction (audio-to-score alignment, NER) and on structured and sparse attention for translation.
Fichier principal
Vignette du fichier
decoding_embed.pdf (957.81 Ko) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)

Dates et versions

hal-01809550 , version 1 (06-06-2018)
hal-01809550 , version 2 (11-06-2018)

Identifiants

  • HAL Id : hal-01809550 , version 2

Citer

Arthur Mensch, Mathieu Blondel. Differentiable Dynamic Programming for Structured Prediction and Attention. 35th International Conference on Machine Learning, Jul 2018, Stockholm, Sweden. ⟨hal-01809550v2⟩
215 Consultations
185 Téléchargements

Partager

Gmail Facebook X LinkedIn More