Guarding Inboxes: An NLP-Based Approach for Email Spam Detection

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

Abstract

Spam, or junk email, is unsolicited and often misleading email sent in bulk to many recipients. The growing volume of such email in recent years poses challenges for electronic communication. This study therefore aims to build a reliable and accurate method for identifying and preventing spam email, in order to improve user experience and information security. The “Spam email classification” dataset from Kaggle is used to detect and categorize spam. The text of each email is analyzed using natural language processing, and machine learning techniques are applied to both the original unbalanced dataset and a resampled balanced dataset. The results indicate that the random forest model performs best, with an F1-score of 98% and an accuracy of 93%.
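
The abstract does not say which resampling technique or evaluation code was used. As an illustration only, the sketch below assumes random oversampling of the minority class and uses made-up toy emails. The helper names `random_oversample` and `accuracy_and_f1` are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes until every
    class has as many samples as the largest one.

    Assumption: the paper's resampling method is not stated in the
    abstract; random oversampling is used here only as an example.
    """
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


def accuracy_and_f1(y_true, y_pred, positive="spam"):
    """Return (accuracy, F1) with `positive` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy, made-up data: one spam message among four legitimate ("ham") ones.
texts = [
    "win a free prize now",
    "meeting at noon",
    "lunch tomorrow?",
    "project update attached",
    "see you at the gym",
]
labels = ["spam", "ham", "ham", "ham", "ham"]

bal_texts, bal_labels = random_oversample(texts, labels)
print(Counter(bal_labels))  # class counts after balancing

# A classifier that always predicts "ham" on an unbalanced test set.
acc, f1 = accuracy_and_f1(["spam", "ham", "ham", "ham"], ["ham"] * 4)
print(acc, f1)
```

The last two lines show why both metrics matter on unbalanced data: a model that never flags spam can still reach high accuracy while its F1-score for the spam class is zero.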

Original language: English
Title of host publication: ICT Systems and Sustainability - Proceedings of ICT4SD 2024
Editors: Milan Tuba, Shyam Akashe, Amit Joshi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 43-51
Number of pages: 9
ISBN (Print): 9789819785360
DOIs
Publication status: Published - 2024
Event: 8th International Conference on ICT for Sustainable Development, ICT4SD 2024 - Goa, India
Duration: 08-08-2024 to 09-08-2024

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 1163
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

Conference

Conference: 8th International Conference on ICT for Sustainable Development, ICT4SD 2024
Country/Territory: India
City: Goa
Period: 08-08-24 to 09-08-24

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Signal Processing
  • Computer Networks and Communications