Conference paper, Year: 2021

Cross-Corpora Language Recognition: A Preliminary Investigation with Indian Languages

Abstract

In this paper, we conduct one of the first studies of cross-corpora performance evaluation for spoken language identification (LID). Cross-corpora evaluation has not been explored much in LID research, especially for Indian languages. We select three Indian spoken language corpora: IIITH-ILSC, LDC South Asian, and IITKGP-MLILSC. For each corpus, an LID system is trained using a state-of-the-art time-delay neural network (TDNN) based architecture with MFCC features. We observe that LID performance degrades drastically in cross-corpora evaluation. For example, the system trained on the IIITH-ILSC corpus shows an average EER of 11.80% when evaluated on the same corpus and 43.34% when evaluated on the LDC South Asian corpus. Our preliminary analysis shows significant differences among these corpora in the long-term average spectrum (LTAS) and signal-to-noise ratio (SNR). Subsequently, we apply different feature-level compensation methods to reduce the cross-corpora acoustic mismatch. Our results indicate that these feature normalization schemes can help achieve promising LID performance in cross-corpora experiments.
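The abstract reports results as equal error rate (EER) and mentions feature-level compensation without naming the specific methods. The sketch below is illustrative only and is not the authors' implementation. It assumes per-utterance cepstral mean and variance normalization (CMVN) as one example of a feature normalization scheme (an assumption, since the abstract does not list the methods used) and shows a simple threshold-sweep estimate of EER. The function names `cmvn` and `compute_eer` are hypothetical.

```python
import numpy as np


def cmvn(features, eps=1e-8):
    """Per-utterance cepstral mean and variance normalization.

    `features` is a (num_frames, num_coeffs) array, e.g. MFCCs.
    ASSUMPTION: CMVN is used here only as an example of feature-level
    compensation; the abstract does not say which methods were applied.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    # eps guards against division by zero for constant coefficients.
    return (features - mean) / (std + eps)


def compute_eer(target_scores, nontarget_scores):
    """Estimate the equal error rate by sweeping over observed scores.

    A trial is accepted when its score is >= the threshold. The EER is
    taken at the threshold where miss rate and false-alarm rate are
    closest, and returned as the average of the two rates.
    """
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.unique(np.concatenate([target_scores, nontarget_scores]))
    miss = np.array([(target_scores < t).mean() for t in thresholds])
    false_alarm = np.array([(nontarget_scores >= t).mean() for t in thresholds])
    idx = np.argmin(np.abs(miss - false_alarm))
    return 0.5 * (miss[idx] + false_alarm[idx])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mfcc = rng.normal(loc=5.0, scale=2.0, size=(200, 20))  # synthetic features
    normalized = cmvn(mfcc)
    print("per-coefficient mean close to 0:", np.allclose(normalized.mean(axis=0), 0.0, atol=1e-6))
    print("EER:", compute_eer([0.9, 0.8, 0.3], [0.7, 0.2, 0.1]))
```

A constant offset added to every frame, which is how a fixed channel difference between corpora would appear in the cepstral domain, is removed by the mean subtraction, so `cmvn(x)` and `cmvn(x + c)` give the same output. This is one reason normalization of this kind is a common first step against recording-condition mismatch.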
Main file
Spandan_EUSIPCO.pdf (345.68 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03223314, version 1 (10-05-2021)

Cite

Spandan Dey, Goutam Saha, Md Sahidullah. Cross-Corpora Language Recognition: A Preliminary Investigation with Indian Languages. EUSIPCO 2021 - 29th European Signal Processing Conference, Aug 2021, Dublin / Virtual, Ireland. ⟨10.23919/EUSIPCO54536.2021.9616273⟩. ⟨hal-03223314⟩