TY - GEN
T1 - A comparison of waveform fractal dimension techniques for voice pathology classification
AU - Baljekar, Pallavi N.
AU - Patil, Hemant A.
PY - 2012/10/23
Y1 - 2012/10/23
N2 - In this paper, an attempt is made to compare and analyze the various waveform fractal dimension techniques for voice pathology classification. Three methods of estimating the fractal dimension directly from the time-domain waveform have been compared: the Katz algorithm, the Higuchi algorithm, and the Hurst exponent calculated using rescaled range (R/S) analysis. Furthermore, the effects of the window size, the base waveform used, and score-level fusion with Mel frequency cepstral coefficients (MFCC) have also been evaluated. The features have been extracted from two different base waveforms: the speech signal and the Teager energy operator (TEO) phase of the speech signal. Experiments have been carried out on a subset of the Massachusetts Eye and Ear Infirmary (MEEI) database, and the classifier used is a 2nd-order polynomial classifier. A classification accuracy of 97.54% was achieved on score-level fusion, an increase in performance of about 2% as compared to MFCC alone.
AB - In this paper, an attempt is made to compare and analyze the various waveform fractal dimension techniques for voice pathology classification. Three methods of estimating the fractal dimension directly from the time-domain waveform have been compared: the Katz algorithm, the Higuchi algorithm, and the Hurst exponent calculated using rescaled range (R/S) analysis. Furthermore, the effects of the window size, the base waveform used, and score-level fusion with Mel frequency cepstral coefficients (MFCC) have also been evaluated. The features have been extracted from two different base waveforms: the speech signal and the Teager energy operator (TEO) phase of the speech signal. Experiments have been carried out on a subset of the Massachusetts Eye and Ear Infirmary (MEEI) database, and the classifier used is a 2nd-order polynomial classifier. A classification accuracy of 97.54% was achieved on score-level fusion, an increase in performance of about 2% as compared to MFCC alone.
UR - https://www.scopus.com/pages/publications/84867598366
UR - https://www.scopus.com/inward/citedby.url?scp=84867598366&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2012.6288910
DO - 10.1109/ICASSP.2012.6288910
M3 - Conference contribution
AN - SCOPUS:84867598366
SN - 9781467300469
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 4461
EP - 4464
BT - 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings
T2 - 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012
Y2 - 25 March 2012 through 30 March 2012
ER -