Optimizing Feature Selection in Big Data: A Hybrid Spark and Fuzzy Approach

  • Aman Singh Hada
  • Gyanaballav Samir Sahoo
  • Chinnapareddy Krishna Vamsi
  • Anusha Hegde
  • Biswajit Bhowmik

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    The exponential growth of big data presents both immense opportunities and significant challenges. While vast datasets hold the key to unlocking groundbreaking insights, efficiently extracting value requires sophisticated feature selection techniques. Traditional methods often struggle with the sheer volume and complexity of big data. This paper addresses this challenge by proposing a novel hybrid feature selection algorithm that leverages the distributed computing power of Apache Spark through PySpark. By combining a robust feature selection technique with a novel weighting scheme, the method outperforms existing hypercuboid and fuzzy rough set methods. The hybrid approach achieves a superior accuracy of 72.1% with a reduced feature set, demonstrating its effectiveness in identifying salient features for big data analysis.

    Original language: English
    Title of host publication: COSMIC 2024 - IEEE International Conference on Computing, Semiconductor, Mechatronics, Intelligent Systems and Communications, Proceedings
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 195-199
    Number of pages: 5
    ISBN (Electronic): 9798331517892
    Publication status: Published - 2024
    Event: 2024 IEEE International Conference on Computing, Semiconductor, Mechatronics, Intelligent Systems and Communications, COSMIC 2024 - Mangalore, India
    Duration: 22-11-2024 to 23-11-2024

    Publication series

    Name: COSMIC 2024 - IEEE International Conference on Computing, Semiconductor, Mechatronics, Intelligent Systems and Communications, Proceedings

    Conference

    Conference: 2024 IEEE International Conference on Computing, Semiconductor, Mechatronics, Intelligent Systems and Communications, COSMIC 2024
    Country/Territory: India
    City: Mangalore
    Period: 22-11-24 to 23-11-24

    All Science Journal Classification (ASJC) codes

    • Artificial Intelligence
    • Computer Networks and Communications
    • Computer Science Applications
    • Information Systems
    • Electrical and Electronic Engineering
    • Mechanical Engineering
    • Electronic, Optical and Magnetic Materials
    • Instrumentation
