Spoken Languages Identification for Indian Languages in Real World Condition

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    Abstract

    This work uses deep learning and features from pre-trained audio models to identify spoken Indian languages. We extracted feature representations of the audio using the pre-trained models wav2vec, data2vec, and ccc-wav2vec, and trained a separate spoken language identification model on each representation. Each model combines an utterance-level embedding, called the u-vector and trained with a within-sample similarity loss (WSSL), with a simple deep neural network (DNN) classifier. The paper considers 12 languages spoken in India (including English), and the models are trained on only 10 hours of speech data per language. The results show that, with these feature representations and the utterance-level embedding, a simple DNN can efficiently identify the different Indian languages.
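The data flow the abstract describes (frame-level features, then one utterance-level vector, then a simple DNN over 12 languages) can be sketched roughly as follows. This is only an illustration under stated assumptions: mean pooling stands in for the learned u-vector embedding, the WSSL training objective is not implemented, the weights are random and untrained, and all names and dimensions (`FEATURE_DIM`, `HIDDEN_DIM`, `classify`, and so on) are hypothetical rather than taken from the paper.

```python
import math
import random

NUM_LANGUAGES = 12   # from the abstract: 12 languages, including English
FEATURE_DIM = 8      # hypothetical; real pre-trained features are much wider
HIDDEN_DIM = 16      # hypothetical hidden-layer width


def mean_pool(frames):
    """Collapse frame-level features (T x D) into one utterance vector (D).

    Assumption: a plain average replaces the learned u-vector embedding.
    """
    num_frames = len(frames)
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / num_frames for d in range(dim)]


def dense(vector, weights, biases):
    """Affine layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def relu(vector):
    return [max(0.0, v) for v in vector]


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def classify(frames, params):
    """Return one probability per language for a single utterance."""
    embedding = mean_pool(frames)
    hidden = relu(dense(embedding, params["w1"], params["b1"]))
    return softmax(dense(hidden, params["w2"], params["b2"]))


rng = random.Random(0)
params = {
    "w1": random_matrix(rng, HIDDEN_DIM, FEATURE_DIM),
    "b1": [0.0] * HIDDEN_DIM,
    "w2": random_matrix(rng, NUM_LANGUAGES, HIDDEN_DIM),
    "b2": [0.0] * NUM_LANGUAGES,
}

# Synthetic frame-level features for a 50-frame utterance. The weights are
# untrained, so the probabilities only illustrate the shape of the output.
frames = random_matrix(rng, 50, FEATURE_DIM)
probs = classify(frames, params)
predicted_language_index = probs.index(max(probs))
```

In a real system the synthetic `frames` would be replaced by features extracted from one of the pre-trained models, and the pooling and classifier weights would be learned from the labelled speech data.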

    Original language: English
    Title of host publication: 2024 IEEE Conference on Engineering Informatics, ICEI 2024
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    ISBN (Electronic): 9798331505776
    Publication status: Published - 2024
    Event: 2024 IEEE Conference on Engineering Informatics, ICEI 2024 - Melbourne, Australia
    Duration: 20-11-2024 → 28-11-2024

    Publication series

    Name: 2024 IEEE Conference on Engineering Informatics, ICEI 2024

    Conference

    Conference: 2024 IEEE Conference on Engineering Informatics, ICEI 2024
    Country/Territory: Australia
    City: Melbourne
    Period: 20-11-24 → 28-11-24

    All Science Journal Classification (ASJC) codes

    • Fluid Flow and Transfer Processes
    • Artificial Intelligence
    • Computer Vision and Pattern Recognition
    • Information Systems
    • Mechanical Engineering
    • Health Informatics
    • Media Technology
    • Control and Optimization
