Audio source separation with one sensor for robust speech recognition

Laurent Benaroya; Frédéric Bimbot; Guillaume Gravier; Rémi Gribonval

Conference paper, Year: 2003

Laurent Benaroya

Role: Author
PersonId : 7037
IdHAL : elie-laurent-benaroya
IdRef : 07600953X

Speech and sound data modeling and processing

Frédéric Bimbot

Role: Author
PersonId : 830967

Speech and sound data modeling and processing

Guillaume Gravier

Role: Author
PersonId : 1046
IdHAL : guig
ORCID : 0000-0002-2266-5682
IdRef : 110355415

Speech and sound data modeling and processing

Rémi Gribonval

Role: Author
PersonId : 1255
IdHAL : remi-gribonval
ORCID : 0000-0002-9450-8125
IdRef : 113181590

Speech and sound data modeling and processing

Abstract

In this paper, we address the problem of noise compensation in speech signals for robust speech recognition. Several classical denoising methods from speech and signal processing are compared on speech corrupted by music, which corresponds to a frequent situation in broadcast news transcription tasks. We also present two new source separation techniques, namely adaptive Wiener filtering and adaptive shrinkage. These techniques rely on a dictionary of spectral shapes to deal with the non-stationarity of the signals. The algorithms are first compared on the source separation task and assessed in terms of average distortion. Their effect on the entire transcription system is then compared in terms of word error rate. Results show that the proposed adaptive Wiener filter approach yields a significant improvement in transcription accuracy at signal-to-noise ratios greater than 15 dB.
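To make the idea of Wiener filtering driven by a dictionary of spectral shapes more concrete, here is a minimal sketch for a single time frame. It is an illustration under stated assumptions, not the authors' implementation: the function name `adaptive_wiener_gain`, the two pre-trained dictionaries, and the choice of a multiplicative non-negative least-squares update to fit the per-frame amplitudes are all hypothetical. The abstract does not specify how the amplitudes are estimated.

```python
def adaptive_wiener_gain(mix_power, speech_shapes, music_shapes,
                         n_iter=50, eps=1e-12):
    """Return a per-frequency Wiener gain for the speech source in one frame.

    mix_power     : list of F non-negative values, power spectrum of the
                    mixture frame.
    speech_shapes : list of spectral shapes (each a list of F non-negative
                    values), a hypothetical pre-trained speech dictionary.
    music_shapes  : same, for the music source.
    """
    shapes = speech_shapes + music_shapes
    n_speech = len(speech_shapes)
    n_bins = len(mix_power)
    amps = [1.0] * len(shapes)  # one non-negative amplitude per shape

    for _ in range(n_iter):
        # Current model of the mixture power: weighted sum of all shapes.
        model = [sum(a * s[f] for a, s in zip(amps, shapes)) + eps
                 for f in range(n_bins)]
        # Multiplicative update (Euclidean cost); amplitudes stay >= 0.
        amps = [
            a * sum(s[f] * mix_power[f] for f in range(n_bins))
            / (sum(s[f] * model[f] for f in range(n_bins)) + eps)
            for a, s in zip(amps, shapes)
        ]

    # Frame-dependent variance estimates for speech and for the whole mixture.
    speech_var = [sum(a * s[f] for a, s in zip(amps[:n_speech], speech_shapes))
                  for f in range(n_bins)]
    total_var = [sum(a * s[f] for a, s in zip(amps, shapes)) + eps
                 for f in range(n_bins)]
    # Wiener gain: estimated speech variance over estimated total variance.
    return [sv / tv for sv, tv in zip(speech_var, total_var)]
```

In such a sketch, the gain would be recomputed for every frame and multiplied with the corresponding mixture short-time spectrum before resynthesis. Re-estimating the amplitudes frame by frame is what lets the filter follow non-stationary sources.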

Domains

Signal and image processing [eess.SP]

Main file

benaroya-nolisp-03.pdf (368.69 KB)

Origin: Files produced by the author(s)

Contributor: Rémi Gribonval

https://inria.hal.science/inria-00576210

Submitted on: Sunday, March 13, 2011, 16:34:08

Last modified on: Friday, March 24, 2023, 14:52:54

Long-term archiving on: Tuesday, June 14, 2011, 02:31:21

Dates and versions

inria-00576210 , version 1 (13-03-2011)

Identifiers

HAL Id : inria-00576210 , version 1

Cite

Laurent Benaroya, Frédéric Bimbot, Guillaume Gravier, Rémi Gribonval. Audio source separation with one sensor for robust speech recognition. ISCA Tutorial and Research Workshop on Non-Linear Speech Processing (NOLISP), IRISA, May 2003, Le Croisic, France. ⟨inria-00576210⟩

Collections

EC-PARIS UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA IRISA-D5 INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES INSA-GROUPE UR1-MATH-NUM
