TY - JOUR
T1 - Exploring the Effectiveness of Feature Reduction and Kernel-Based Matching for Query-by-Example Spoken Term Detection using CNN
AU - Gaonkar, Manisha Naik
AU - Thenkanidiyoor, Veena
AU - Dinesh, Dileep Aroor
AU - Muralikrishna, H.
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2024
Y1 - 2024
AB - Query-by-example spoken term detection (QbE-STD) refers to the search for an audio query in a repository of audio utterances. A common approach for QbE-STD involves computing a matching matrix between the feature representations of the query and the reference utterance and deciding the relevance of the reference utterance to the query based on the computed matching matrix. The time required to compute the matching matrix is crucial since a matching matrix must be computed between a query and every reference utterance. This time depends on the number of feature representations in the query and reference utterance. Feature reduction is a technique that reduces the number of feature representations to reduce the time required to compute a matching matrix. In this study, we propose to explore feature reduction in combination with kernel-based matching of reduced feature representation for query and reference utterances. We propose to decide the relevance of a reference utterance using a convolutional neural network (CNN) based classifier on the matching matrix. We demonstrate that the proposed approach not only results in a reduction in search time but also increases the accuracy of QbE-STD.
UR - https://www.scopus.com/pages/publications/85213007125
U2 - 10.1109/ACCESS.2024.3520605
DO - 10.1109/ACCESS.2024.3520605
M3 - Article
AN - SCOPUS:85213007125
SN - 2169-3536
VL - 12
SP - 194462
EP - 194474
JO - IEEE Access
JF - IEEE Access
ER -