TY - GEN
T1 - Characterization of aspirated and unaspirated sounds in speech
AU - Ramteke, Pravin Bhaskar
AU - Sadanand, Anmol
AU - Koolagudi, Shashidhar G.
AU - Pai, Vidya
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/12/19
Y1 - 2017/12/19
N2 - This work studies aspiration and unaspiration in consonants. The pronunciation of aspirated and unaspirated consonants is characterized by the 'puff of air', known as the burst, released at the place of constriction in the vocal tract. Here, the properties of the vowel immediately after the burst are studied to characterize the burst. The excitation source signal, estimated from the linear prediction residual of the speech, is used for the task. Signal characteristics such as the glottal pulse; the durations of the open, closed, and return phases; the slopes of the open and return phases; the duration of the burst; the ratio of the highest and lowest signal energies; and the voice onset time (VOT) are explored to characterize aspiration and unaspiration. The TIMIT English speech corpus is used to test the proposed approach. Random forest (RF) and support vector machine (SVM) classifiers are used to test the effectiveness of the features. They achieve accuracies of 99.93% and 94.03%, respectively. The results indicate that the proposed features are robust in classifying aspirated and unaspirated consonants.
UR - https://www.scopus.com/pages/publications/85044194881
UR - https://www.scopus.com/pages/publications/85044194881#tab=citedBy
U2 - 10.1109/TENCON.2017.8228345
DO - 10.1109/TENCON.2017.8228345
M3 - Conference contribution
AN - SCOPUS:85044194881
VL - 2017-December
T3 - IEEE Region 10 Annual International Conference, Proceedings/TENCON
SP - 2840
EP - 2845
BT - TENCON 2017 - 2017 IEEE Region 10 Conference
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 IEEE Region 10 Conference, TENCON 2017
Y2 - 5 November 2017 through 8 November 2017
ER -