Vocal and Non-vocal Segmentation based on the Analysis of Formant Structure

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    3 Citations (Scopus)

    Abstract

    Classifying the vocal and non-vocal regions of an audio clip is the basis for many Music Information Retrieval (MIR) tasks. In this work, we compute novel features based on formant structure to segment the vocal and non-vocal regions of a given music clip. The proposed features, chosen after thorough analysis, include obtuse angles at formant peak, valley locations, convexity, and concavity. The obtuse angles are computed for the second, third, and fourth formants, because the first formant shows little discrimination. The formant-related features are added to the baseline Mel-frequency cepstral coefficients (MFCCs) to improve performance. The singer formant (F5) is also computed, giving a 19-dimensional feature vector. Artificial neural networks (ANNs) are used as the classifier because they are well suited to nonlinear data. An 11-point moving window is then applied to suppress intermittent misclassifications. The proposed approach achieves an accuracy of 88% with the 19-dimensional feature vector.
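
The abstract does not say how the 11-point moving window is applied. One common reading is a majority vote over frame-level decisions, sketched below under that assumption. The function name `smooth_labels`, the 0/1 label encoding (1 = vocal, 0 = non-vocal), and the edge handling (truncated windows, ties keep the original label) are illustrative choices, not details taken from the paper.

```python
def smooth_labels(labels, window=11):
    """Majority-vote smoothing of binary frame labels (1 = vocal, 0 = non-vocal).

    Assumption: the moving window acts as a centred majority filter.
    The window is truncated at the clip boundaries; on a tie the
    original label of the centre frame is kept.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(labels)
    smoothed = []
    for i in range(n):
        # Frames inside the (possibly truncated) window centred on frame i.
        segment = labels[max(0, i - half):min(n, i + half + 1)]
        vocal = sum(segment)
        non_vocal = len(segment) - vocal
        if vocal > non_vocal:
            smoothed.append(1)
        elif vocal < non_vocal:
            smoothed.append(0)
        else:
            smoothed.append(labels[i])
    return smoothed


# A single non-vocal frame inside a vocal run is relabelled as vocal.
frames = [1] * 10 + [0] + [1] * 10
print(smooth_labels(frames))
```

With this kind of filter, an isolated flip shorter than about half the window is absorbed into the surrounding segment, which is the sort of intermittent misclassification the abstract refers to.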

    Original language: English
    Title of host publication: 2017 9th International Conference on Advances in Pattern Recognition, ICAPR 2017
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 304-309
    Number of pages: 6
    ISBN (Electronic): 9781538622414
    Publication status: Published - 27-12-2018
    Event: 9th International Conference on Advances in Pattern Recognition, ICAPR 2017 - Bangalore, India
    Duration: 27-12-2017 to 30-12-2017

    Publication series

    Name: 2017 9th International Conference on Advances in Pattern Recognition, ICAPR 2017

    Conference

    Conference: 9th International Conference on Advances in Pattern Recognition, ICAPR 2017
    Country/Territory: India
    City: Bangalore
    Period: 27-12-17 to 30-12-17

    All Science Journal Classification (ASJC) codes

    • Computer Vision and Pattern Recognition
    • Signal Processing
