Automatic glottis localization and segmentation in stroboscopic videos using deep neural network

M. V. Achuth Rao, Rahul Krishnamurthy, Pebbili Gopikishore, Veeramani Priyadharshini, Prasanta Kumar Ghosh

Research output: Contribution to journal, Conference article, peer-review

6 Citations (Scopus)


Exact analysis of the glottal vibration pattern is vital for assessing voice pathologies. One of the primary steps in this analysis is automatic glottis segmentation, which, in turn, has two main parts, namely, glottis localization and glottis segmentation. In this paper, we propose a deep neural network (DNN) based automatic glottis localization and segmentation scheme. We pose the problem as a classification problem in which each pixel is classified, based on its color and the colors of its neighborhood, as belonging inside or outside the glottis region. We further process the classification result to get the biggest cluster, which is declared as the segmented glottis. The proposed algorithm is evaluated on a dataset comprising stroboscopic videos from 18 subjects in which the glottis region is marked by three Speech Language Pathologists (SLPs). On average, the proposed DNN based segmentation scheme achieves a localization performance of 65.33% and a segmentation DICE score of 0.74 (absolute), which is better than the baseline scheme by 22.66% and 0.09, respectively. We also find that the DICE score obtained by the DNN based segmentation scheme correlates well with the average DICE score computed between annotations provided by any two SLPs, suggesting the robustness of the proposed glottis segmentation scheme.
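
The post-processing and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the "biggest cluster" is the largest 4-connected group of pixels classified as glottis, and it uses the standard Dice definition 2|A∩B| / (|A| + |B|). The function names `largest_cluster` and `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are hypothetical choices made only for this illustration.

```python
from collections import deque


def largest_cluster(mask):
    """Return a copy of a binary mask keeping only its largest 4-connected cluster.

    Assumption: "biggest cluster" is read as the largest connected component
    of foreground (1) pixels; the paper may define clusters differently.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            component = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) > len(best):
                best = component
    out = [[0] * cols for _ in range(rows)]
    for y, x in best:
        out[y][x] = 1
    return out


def dice_score(pred, ref):
    """Dice overlap between two binary masks of identical shape."""
    intersection = 0
    total = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, q in zip(pred_row, ref_row):
            intersection += 1 if (p and q) else 0
            total += (1 if p else 0) + (1 if q else 0)
    # Hypothetical convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy per-pixel classifier output with one spurious pixel at the top-left.
raw_prediction = [
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0],
]
# Toy reference annotation, standing in for a mask marked by one SLP.
reference = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
]

segmented = largest_cluster(raw_prediction)
score = dice_score(segmented, reference)
```

On this toy input, keeping only the largest cluster removes the isolated pixel and raises the Dice score against the reference from 10/12 to 10/11. The same `dice_score` function could be applied to a pair of annotator masks to obtain an inter-annotator score of the kind the abstract compares against.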

Original language: English
Pages (from-to): 3007-3011
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 01-01-2018
Event: 19th Annual Conference of the International Speech Communication Association, INTERSPEECH 2018 - Hyderabad, India
Duration: 02-09-2018 to 06-09-2018

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

