Abstract
This paper presents a novel Bag-of-Visual-Words (BoVW) approach, to represent the grayscale spectrograms of acoustic events. Such, BoVW representations are referred as histograms of visual features, used for Acoustic Event Classification (AEC). Further, Chi-square distance between histograms of visual features evaluated, which generates kernel to Support Vector Machines (Chi-square SVM) classifier. Evaluation of the proposed histograms of visual features together with Chi-square SVM classifier is conducted on different categories of acoustic events from UPC-TALP corpora in clean and different noise conditions. Results show that proposed approach is more robust to noise and achieves improved recognition accuracy compared to other methods.
Original language | English |
---|---|
Pages (from-to) | 3319-3322 |
Number of pages | 4 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Volume | 2018-September |
DOIs | |
Publication status | Published - 2018 |
Event | 19th Annual Conference of the International Speech Communication, INTERSPEECH 2018 - Hyderabad, India Duration: 02-09-2018 → 06-09-2018 |
All Science Journal Classification (ASJC) codes
- Language and Linguistics
- Human-Computer Interaction
- Signal Processing
- Software
- Modelling and Simulation