F0 modeling using DNN for Arabic parametric speech synthesis

Imene Zangar; Zied Mnasri; Vincent Colotte; Denis Jouvet

Conference paper. Year: 2019

Imene Zangar

Role: Author

Ecole Nationale d'Ingénieurs de Tunis

Zied Mnasri

Role: Author

Ecole Nationale d'Ingénieurs de Tunis

Vincent Colotte

Role: Author
PersonId: 16268
IdHAL: vincent-colotte
IdRef: 070401683

Speech Modeling for Facilitating Oral-Based Communication

Denis Jouvet

Role: Author
PersonId: 15904
IdHAL: denis-jouvet
IdRef: 029418666

Speech Modeling for Facilitating Oral-Based Communication

Abstract

Deep neural networks (DNNs) are attracting growing interest in speech processing applications, especially in text-to-speech synthesis. Indeed, state-of-the-art speech generation tools such as MERLIN and WAVENET are entirely DNN-based. However, each language has to be modeled separately. A key component of a speech synthesis system is the module that generates prosodic parameters from contextual input features, and in particular the fundamental frequency (F0) generation module. Because F0 is responsible for intonation, it must be modeled accurately to produce intelligible and natural speech. F0 modeling is also highly language-dependent, so language-specific characteristics have to be taken into account. In this paper, we aim to model F0 for Arabic speech synthesis with feedforward and recurrent DNNs, using features specific to Arabic, such as vowel quantity and gemination, in order to improve the quality of Arabic parametric speech synthesis.

Keywords

Fundamental frequency (F0); Deep neural networks; Recurrent neural networks; Arabic parametric speech synthesis

Domains

Signal and image processing [eess.SP]; Artificial intelligence [cs.AI]

Main file

conference_INNSBDDL2019.pdf (234.31 KB)

Origin: Files produced by the author(s)

Contributor: Vincent Colotte

https://inria.hal.science/hal-02177496

Submitted on: Tuesday, July 9, 2019, 09:41:52

Last modified on: Monday, September 11, 2023, 17:41:19

Dates and versions

hal-02177496, version 1 (09-07-2019)

Identifiers

HAL Id: hal-02177496, version 1

Cite

Imene Zangar, Zied Mnasri, Vincent Colotte, Denis Jouvet. F0 modeling using DNN for Arabic parametric speech synthesis. INNSBDDL 2019 - INNS Big Data and Deep Learning, Apr 2019, Sestri Levante, Italy. ⟨hal-02177496⟩

Collections

CNRS INRIA UNIV-LORRAINE INRIA2 LORIA LORIA-NLPKD
