A Nonparametric Feature Separability Measure and an Algorithm for Simulating Synthetic Feature Vectors

  • Chowtapalle Anuraag Chetty
  • V. R. Simi*
  • Justin Joseph
  • Vipin Venugopal

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In binary classification, measures that quantify the separability between the feature sets of two classes are needed to identify the determinant features and to select the hyper-parameters of feature extraction algorithms. State-of-the-art separability measures test for equality of the distribution parameters of the feature sets and do not scale linearly with the level of overlap between them. Testing and comparing separability measures in turn requires reliable algorithms for generating synthetic feature sets with known levels of overlap. This paper proposes a measure of feature separability between two classes, termed the Thresholding-based Classification Error Estimate (TCEE), and an algorithm for generating synthetic feature vectors for testing feature separability measures. On synthetic feature sets of two distinct classes, the Pearson’s correlation coefficients (PCC) of the Bhattacharyya distance (BD), Relative Entropy (RE), the p-value of the rank-sum test, the Jeffries-Matusita (JM) distance, and TCEE with the percentage of overlap are −0.6429, −0.6428, 0.3780, −0.9881, and 1, respectively. The high correlation of TCEE with the percentage of overlap indicates that it can accurately measure the separability of the feature sets of two classes.
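The abstract does not give the definition of TCEE or of the synthetic-data algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a single scalar feature, takes the smallest misclassification rate achievable with one threshold as a stand-in separability score, and uses two shifted uniform distributions as a simple way to obtain a known overlap fraction. The names `threshold_error`, `simulate_overlap`, and `correlation_with_overlap` are hypothetical.

```python
import numpy as np

def threshold_error(a, b):
    """Smallest misclassification rate of any single threshold on a 1-D feature.

    Illustrative stand-in only; assumes all feature values are distinct.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    values = np.concatenate([a, b])
    labels = np.concatenate([np.zeros(a.size), np.ones(b.size)])
    labels = labels[np.argsort(values, kind="stable")]
    # A cut after sorted position k puts the first k samples on the "left".
    b_left = np.concatenate([[0], np.cumsum(labels)])
    a_left = np.arange(labels.size + 1) - b_left
    # Rule "left is class a": errors are b samples on the left plus a samples on the right.
    err_a_left = b_left + (a.size - a_left)
    # The opposite rule misclassifies exactly the complementary samples.
    err_b_left = labels.size - err_a_left
    return float(min(err_a_left.min(), err_b_left.min()) / labels.size)

def simulate_overlap(p, n, rng):
    """Assumed generator: U(0, 1) versus U(1 - p, 2 - p), so a fraction p of each range overlaps."""
    a = rng.uniform(0.0, 1.0, n)
    b = rng.uniform(1.0 - p, 2.0 - p, n)
    return a, b

def correlation_with_overlap(n=2000, seed=0):
    """Pearson correlation between the known overlap fraction and the threshold error."""
    rng = np.random.default_rng(seed)
    overlaps = np.linspace(0.0, 1.0, 11)
    errors = [threshold_error(*simulate_overlap(p, n, rng)) for p in overlaps]
    return float(np.corrcoef(overlaps, errors)[0, 1])
```

Under these assumptions the expected threshold error is about half the overlap fraction, so the score grows roughly linearly with overlap. That linear relationship is the property the abstract evaluates with Pearson's correlation coefficient.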

Original language: English
Title of host publication: Information Management - 10th International Conference, ICIM 2024, Revised Selected Papers
Editors: Shuliang Li
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 388-397
Number of pages: 10
ISBN (Print): 9783031643583
Publication status: Published - 2024
Event: 10th International Conference on Information Management, ICIM 2024 - Cambridge, United Kingdom
Duration: 08-03-2024 to 10-03-2024

Publication series

Name: Communications in Computer and Information Science
Volume: 2102 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 10th International Conference on Information Management, ICIM 2024
Country/Territory: United Kingdom
City: Cambridge
Period: 08-03-24 to 10-03-24

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • General Mathematics
