TY - GEN
T1 - Spam mail detection through data mining techniques
AU - Shrivastava, Shubhi
AU - Anju, R.
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2018/3/23
Y1 - 2018/3/23
N2 - In today's electronic world, a large part of communication, both professional and private, takes place by electronic mail (email). However, because of advertising agencies and social networking websites, most of the emails in circulation contain unwanted information that is not relevant to the user. Spam is unsolicited email sent to a recipient. Spam emails cause inconvenience and financial loss to recipients, so there is a need to filter them and separate them from legitimate emails. Many algorithms and filters have been developed to detect spam, but spammers continuously evolve and refine their techniques, making existing filters less effective. The method proposed in this paper involves creating a spam filter using binary and continuous probability distributions. The algorithms implemented in building the classifier model are Naive Bayes and Decision Trees. The effect of overfitting on the performance and accuracy of decision trees is analyzed. Finally, the better classifier model is identified based on its accuracy in classifying spam and non-spam emails.
AB - In today's electronic world, a large part of communication, both professional and private, takes place by electronic mail (email). However, because of advertising agencies and social networking websites, most of the emails in circulation contain unwanted information that is not relevant to the user. Spam is unsolicited email sent to a recipient. Spam emails cause inconvenience and financial loss to recipients, so there is a need to filter them and separate them from legitimate emails. Many algorithms and filters have been developed to detect spam, but spammers continuously evolve and refine their techniques, making existing filters less effective. The method proposed in this paper involves creating a spam filter using binary and continuous probability distributions. The algorithms implemented in building the classifier model are Naive Bayes and Decision Trees. The effect of overfitting on the performance and accuracy of decision trees is analyzed. Finally, the better classifier model is identified based on its accuracy in classifying spam and non-spam emails.
UR - http://www.scopus.com/inward/record.url?scp=85048111602&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85048111602&partnerID=8YFLogxK
U2 - 10.1109/INTELCCT.2017.8324021
DO - 10.1109/INTELCCT.2017.8324021
M3 - Conference contribution
AN - SCOPUS:85048111602
VL - 2018-January
T3 - ICCT 2017 - International Conference on Intelligent Communication and Computational Techniques
SP - 61
EP - 64
BT - ICCT 2017 - International Conference on Intelligent Communication and Computational Techniques
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 International Conference on Intelligent Communication and Computational Techniques, ICCT 2017
Y2 - 22 December 2017 through 23 December 2017
ER -