MULTICHANNEL SPEECH ENHANCEMENT FOR SPEAKER VERIFICATION IN NOISY AND REVERBERANT ENVIRONMENTS - INRIA - Institut National de Recherche en Informatique et en Automatique
Preprint, Working Paper. Year: 2021

MULTICHANNEL SPEECH ENHANCEMENT FOR SPEAKER VERIFICATION IN NOISY AND REVERBERANT ENVIRONMENTS

Abstract

Speech signals can be corrupted by environmental noise and room reverberation, both of which severely degrade speaker verification performance. In this paper, we propose a multichannel pre-processing pipeline that combines a filter-and-sum network (FaSnet), a rank-1 multichannel Wiener filter, and weighted prediction error as a front-end to speaker verification. Experimental evaluation shows that the pre-processing improves speaker verification performance provided that the enrollment files are processed in the same way as the test data and that test and enrollment occur within similar SNR ranges. The proposed pipeline is trained on synthetic data but generalizes to unseen, real recorded clips from the VOiCES eval dataset, where it improves speaker verification performance in all noise conditions.
File not deposited

Dates and versions

hal-03487420, version 1 (17-12-2021)

Identifiers

  • HAL Id: hal-03487420, version 1

Cite

Sandipana Dowerah, Romain Serizel, Denis Jouvet, Mohammad Mohammadamini, Driss Matrouf. MULTICHANNEL SPEECH ENHANCEMENT FOR SPEAKER VERIFICATION IN NOISY AND REVERBERANT ENVIRONMENTS. 2021. ⟨hal-03487420⟩