Exploring the Effectiveness of Feature Reduction and Kernel-Based Matching for Query-by-Example Spoken Term Detection using CNN

  • Manisha Naik Gaonkar
  • Veena Thenkanidiyoor
  • Dileep Aroor Dinesh
  • H. Muralikrishna*
  • *Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

1 Citation (Scopus)

Abstract

Query-by-example spoken term detection (QbE-STD) refers to the search for an audio query in a repository of audio utterances. A common approach for QbE-STD involves computing a matching matrix between the feature representations of the query and the reference utterance and deciding the relevance of the reference utterance to the query based on the computed matching matrix. The time required to compute the matching matrix is crucial since a matching matrix must be computed between a query and every reference utterance. This time depends on the number of feature representations in the query and reference utterance. Feature reduction is a technique that reduces the number of feature representations to reduce the time required to compute a matching matrix. In this study, we propose to explore feature reduction in combination with kernel-based matching of reduced feature representation for query and reference utterances. We propose to decide the relevance of a reference utterance using a convolutional neural network (CNN) based classifier on the matching matrix. We demonstrate that the proposed approach not only results in a reduction in search time but also increases the accuracy of QbE-STD.
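
The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not say which reduction scheme or kernel is used, so the choices here are assumptions made only for illustration: feature reduction is taken to be averaging of consecutive frames, and the kernel is taken to be a Gaussian kernel. The function names `reduce_features` and `kernel_matching_matrix`, the parameters `group` and `gamma`, and the random stand-in features are all hypothetical. The CNN classifier that would consume the matching matrix is not shown.

```python
import numpy as np

def reduce_features(frames, group=2):
    """Average each run of `group` consecutive frames (assumed reduction scheme)."""
    n = (len(frames) // group) * group  # drop trailing frames that do not fill a group
    return frames[:n].reshape(-1, group, frames.shape[1]).mean(axis=1)

def kernel_matching_matrix(query, reference, gamma=0.5):
    """Gaussian-kernel similarity (assumed kernel) between all query/reference frame pairs."""
    # Squared Euclidean distance between every query frame and every reference frame.
    sq_dist = (
        (query ** 2).sum(axis=1)[:, None]
        + (reference ** 2).sum(axis=1)[None, :]
        - 2.0 * query @ reference.T
    )
    # Clamp small negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

# Random stand-ins for frame-level features: 40 query frames, 300 reference frames, 13 dims.
rng = np.random.default_rng(0)
query = rng.normal(size=(40, 13))
reference = rng.normal(size=(300, 13))

full = kernel_matching_matrix(query, reference)
reduced = kernel_matching_matrix(reduce_features(query), reduce_features(reference))
print(full.shape, reduced.shape)  # (40, 300) (20, 150)
```

Under these assumptions, halving the number of frames on both sides cuts the matching matrix from 12,000 entries to 3,000, which is the source of the search-time saving the abstract refers to.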

Original language: English
Pages (from-to): 194462-194474
Number of pages: 13
Journal: IEEE Access
Volume: 12
Publication status: Published - 2024

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • General Materials Science
  • General Engineering
