A Wavelet-Based Parameterization for Speech/Music Segmentation

Emmanuel Didiot; Dominique Fohr; Jean-Paul Haton; Irina Illina; Odile Mella

Conference paper, Year: 2006

Emmanuel Didiot

Role: Author
PersonId : 835404

Analysis, perception and recognition of speech

Dominique Fohr

Role: Author
PersonId : 15652
IdHAL : dominique-fohr
IdRef : 031092942

Analysis, perception and recognition of speech

Jean-Paul Haton

Role: Author
PersonId : 830987

Analysis, perception and recognition of speech

Irina Illina

Role: Author
PersonId : 15663
IdHAL : irina-illina
IdRef : 120731746

Analysis, perception and recognition of speech

Odile Mella

Role: Author
PersonId : 15902
IdHAL : odile-mella
IdRef : 12011903X

Analysis, perception and recognition of speech

Abstract

Speech/music discrimination is a challenging research problem that significantly affects Automatic Speech Recognition (ASR) performance. This paper proposes new features for the speech/music discrimination task. We use a wavelet-based decomposition of the audio signal, which is well suited to the analysis of non-stationary signals such as speech or music. We compute different types of energy in each frequency band obtained from the wavelet decomposition. Two class/non-class classifiers are used: one for speech/non-speech and one for music/non-music. On the different test corpora, the proposed wavelet approach gives better results than the MFCC approach. For instance, we obtain a significant relative improvement in the error rate of 71.6% on the "Scheirer" corpus for the speech/music discrimination task.
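As an illustration of the idea of computing an energy per wavelet band, here is a minimal sketch. It is not the authors' implementation: the abstract does not name the wavelet family, the decomposition depth, or the exact energy types, so the Haar wavelet, the three-level depth, and the log mean-energy measure used here are all assumptions, and the function names `haar_step`, `wavelet_bands`, and `band_log_energies` are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)  # a trailing odd sample is dropped
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def wavelet_bands(frame, levels):
    """Split a frame into detail bands plus the final approximation band.

    Bands are ordered from the highest-frequency detail band down to the
    lowest-frequency approximation band.
    """
    bands = []
    approx = list(frame)
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


def band_log_energies(frame, levels=3, floor=1e-12):
    """Log of the mean squared coefficient in each band (assumed energy type)."""
    return [
        math.log(sum(c * c for c in band) / len(band) + floor)
        for band in wavelet_bands(frame, levels)
    ]


# Usage: a rapidly alternating frame puts all its energy in the first detail band.
frame = [1.0, -1.0] * 4
print(band_log_energies(frame, levels=3))
```

In a full system, one such feature vector per frame would be passed to each of the two class/non-class classifiers; the classifier type is not given in the abstract and is left out of this sketch.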

Keywords

speech/music discrimination; wavelets; static and dynamic parameters

Domains

Computation and Language [cs.CL]

https://hal.science/hal-00103569

Submitted on: Wednesday, October 4, 2006, 16:50:43

Last modified on: Friday, March 24, 2023, 14:52:48

Dates and versions

hal-00103569 , version 1 (04-10-2006)

Identifiers

HAL Id : hal-00103569 , version 1

Cite

Emmanuel Didiot, Dominique Fohr, Jean-Paul Haton, Irina Illina, Odile Mella. A Wavelet-Based Parameterization for Speech/Music Segmentation. 2006, pp.653. ⟨hal-00103569⟩

Collections

CNRS INRIA UNIV-LORRAINE INRIA2 LORIA
