Spoken Languages Identification for Indian Languages in Real World Condition

Sujeet Kumar, H. Muralikrishna, Veena Thenkanidiyoor, A. D. Dileep

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

This work uses deep learning and features from pre-trained speech models to identify Indian spoken languages. Feature representations of the audio were extracted with the pre-trained models wav2vec, data2vec, and ccc-wav2vec, and a separate spoken language identification model was trained on each representation. For each representation, an utterance-level embedding called the u-vector is trained with WSSL (within-sample similarity loss) together with a simple DNN (deep neural network) classifier. The paper considers 12 Indian spoken languages (including English) and trains on only 10 hours of speech data per language. The results show that, with these feature representations and the utterance-level embedding, a simple DNN can efficiently identify different Indian languages.
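
The pipeline described in the abstract (frame-level features, an utterance-level embedding, a simple DNN classifier, and a similarity term in the loss) can be illustrated with a minimal sketch. It is not the authors' implementation. The feature size, hidden width, loss weight, and random placeholder features are assumptions. The mean pooling and the cosine-based penalty are simplified stand-ins for the u-vector and WSSL, whose exact definitions the abstract does not give.

```python
import numpy as np

NUM_LANGUAGES = 12   # from the abstract: 12 Indian spoken languages, incl. English
FEATURE_DIM = 768    # assumed frame-feature size; depends on the pre-trained model
HIDDEN_DIM = 256     # assumed hidden-layer width

rng = np.random.default_rng(0)

def init_params():
    """Randomly initialised weights for a one-hidden-layer DNN classifier."""
    return {
        "w1": rng.normal(0.0, 0.02, (FEATURE_DIM, HIDDEN_DIM)),
        "b1": np.zeros(HIDDEN_DIM),
        "w2": rng.normal(0.0, 0.02, (HIDDEN_DIM, NUM_LANGUAGES)),
        "b2": np.zeros(NUM_LANGUAGES),
    }

def pool_two_views(frames):
    """Mean-pool even and odd frames into two embeddings of one utterance.

    Assumption: a simple stand-in for the u-vector embedding network.
    """
    return frames[0::2].mean(axis=0), frames[1::2].mean(axis=0)

def similarity_penalty(e1, e2):
    """1 - cosine similarity between two embeddings of the same utterance.

    Assumption: an illustrative within-sample term, not the paper's WSSL.
    """
    cos = e1 @ e2 / (np.linalg.norm(e1) * np.linalg.norm(e2) + 1e-8)
    return 1.0 - cos

def classify(params, embedding):
    """Forward pass: ReLU hidden layer followed by a softmax over languages."""
    hidden = np.maximum(0.0, embedding @ params["w1"] + params["b1"])
    logits = hidden @ params["w2"] + params["b2"]
    logits = logits - logits.max()  # shift for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum()

def utterance_loss(params, frames, label, weight=0.1):
    """Cross-entropy plus a weighted similarity penalty for one utterance."""
    e1, e2 = pool_two_views(frames)
    embedding = 0.5 * (e1 + e2)
    probs = classify(params, embedding)
    cross_entropy = -np.log(probs[label] + 1e-12)
    return cross_entropy + weight * similarity_penalty(e1, e2), probs

params = init_params()
# Placeholder for frame-level features from a pre-trained model (200 frames).
frames = rng.normal(size=(200, FEATURE_DIM))
loss, probs = utterance_loss(params, frames, label=3)
print(f"loss={loss:.3f}, predicted language index={int(probs.argmax())}")
```

In practice the placeholder `frames` array would be replaced by features extracted from one of the pre-trained models, and the parameters would be trained by gradient descent on this loss. The sketch shows only the forward computation.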

Original language: English
Title of host publication: 2024 IEEE Conference on Engineering Informatics, ICEI 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331505776
Publication status: Published - 2024
Event: 2024 IEEE Conference on Engineering Informatics, ICEI 2024 - Melbourne, Australia
Duration: 20-11-2024 to 28-11-2024

Publication series

Name: 2024 IEEE Conference on Engineering Informatics, ICEI 2024

Conference

Conference: 2024 IEEE Conference on Engineering Informatics, ICEI 2024
Country/Territory: Australia
City: Melbourne
Period: 20-11-24 to 28-11-24

All Science Journal Classification (ASJC) codes

  • Fluid Flow and Transfer Processes
  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Information Systems
  • Mechanical Engineering
  • Health Informatics
  • Media Technology
  • Control and Optimization
