A Recurrent Variational Autoencoder for Speech Enhancement - INRIA - Institut National de Recherche en Informatique et en Automatique
Conference paper, Year: 2020

A Recurrent Variational Autoencoder for Speech Enhancement

Abstract

This paper presents a generative approach to speech enhancement based on a recurrent variational autoencoder (RVAE). The deep generative speech model is trained on clean speech signals only and is combined with a nonnegative matrix factorization noise model for speech enhancement. We propose a variational expectation-maximization algorithm in which the encoder of the RVAE is fine-tuned at test time to approximate the distribution of the latent variables given the noisy speech observations. Compared with previous approaches based on feed-forward fully-connected architectures, the proposed recurrent deep generative speech model induces posterior temporal dynamics over the latent variables, which is shown to improve the speech enhancement results.
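The combination of a speech variance model with a nonnegative matrix factorization (NMF) noise model can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the noise variance at each time-frequency point is the product of two nonnegative matrices `W` and `H`, that some speech model (here a hypothetical stand-in for the RVAE decoder output, `speech_var`) supplies the speech variance, and that the two are combined through a Wiener-like gain, which the abstract does not state explicitly. All function and variable names are illustrative.

```python
def nmf_variance(W, H):
    """Noise variance V[f][n] = sum_k W[f][k] * H[k][n], with nonnegative W and H."""
    rank = len(H)
    n_frames = len(H[0])
    return [
        [sum(w_row[k] * H[k][n] for k in range(rank)) for n in range(n_frames)]
        for w_row in W
    ]


def wiener_gain(speech_var, noise_var, eps=1e-12):
    """Per-bin gain speech / (speech + noise); eps guards against division by zero."""
    return [
        [s / (s + b + eps) for s, b in zip(s_row, b_row)]
        for s_row, b_row in zip(speech_var, noise_var)
    ]


def enhance(noisy_stft, speech_var, noise_var):
    """Apply the gain to each complex STFT coefficient of the noisy signal."""
    gain = wiener_gain(speech_var, noise_var)
    return [
        [g * x for g, x in zip(g_row, x_row)]
        for g_row, x_row in zip(gain, noisy_stft)
    ]


if __name__ == "__main__":
    # Toy example: 2 frequency bins, 2 time frames, NMF rank 1.
    W = [[1.0], [2.0]]           # spectral template (assumed values)
    H = [[1.0, 3.0]]             # temporal activations (assumed values)
    noise_var = nmf_variance(W, H)

    # Hypothetical speech variances; in the paper's setting these would
    # come from the trained speech model rather than being fixed by hand.
    speech_var = [[1.0, 1.0], [2.0, 2.0]]

    noisy_stft = [[2 + 2j, 4 + 0j], [0j, 8j]]
    print(enhance(noisy_stft, speech_var, noise_var))
```

In this toy setup the gain approaches 1 where the assumed speech variance dominates and 0 where the NMF noise variance dominates, so bins with stronger modeled noise are attenuated more.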
Main file
LAGH_2020.pdf (391.79 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02329000, version 1 (23-10-2019)
hal-02329000, version 2 (07-02-2020)

Cite

Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud. A Recurrent Variational Autoencoder for Speech Enhancement. ICASSP 2020 - IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, May 2020, Barcelona (virtual), Spain. pp.371-375, ⟨10.1109/ICASSP40776.2020.9053164⟩. ⟨hal-02329000v2⟩
452 views
1238 downloads