TY - GEN
T1 - Data mining with machine learning applied for email deception
AU - More, Sujeet
AU - Kulkarni, S. A.
PY - 2013
Y1 - 2013
N2 - Spam, also known as junk mail or Unsolicited Commercial Email (UCE), has become a major problem for the sustainability of the internet and global commerce. Every day, millions of spam emails are sent over the internet to targeted populations to advertise services, products, dangerous software, etc. A number of content-based spam detection algorithms have been proposed to classify emails, but they have not achieved sufficient accuracy. Our proposed work mainly focuses on cognitive (spam) words for classification. This feature consists of sequential unique and closed patterns that are extracted from the message content. We show that this feature has a good impact on distinguishing spam from legitimate messages. Our method, which can be easily implemented, compares favorably with popular algorithms such as Logistic Regression, Neural Network, Naive Bayes and Random Forest using a polynomial kernel as filter. It achieves higher accuracy than related methods. In addition, our method is resilient against irrelevant and bothersome words.
AB - Spam, also known as junk mail or Unsolicited Commercial Email (UCE), has become a major problem for the sustainability of the internet and global commerce. Every day, millions of spam emails are sent over the internet to targeted populations to advertise services, products, dangerous software, etc. A number of content-based spam detection algorithms have been proposed to classify emails, but they have not achieved sufficient accuracy. Our proposed work mainly focuses on cognitive (spam) words for classification. This feature consists of sequential unique and closed patterns that are extracted from the message content. We show that this feature has a good impact on distinguishing spam from legitimate messages. Our method, which can be easily implemented, compares favorably with popular algorithms such as Logistic Regression, Neural Network, Naive Bayes and Random Forest using a polynomial kernel as filter. It achieves higher accuracy than related methods. In addition, our method is resilient against irrelevant and bothersome words.
UR - https://www.scopus.com/pages/publications/84893932945
UR - https://www.scopus.com/pages/publications/84893932945#tab=citedBy
U2 - 10.1109/ICOISS.2013.6678403
DO - 10.1109/ICOISS.2013.6678403
M3 - Conference contribution
AN - SCOPUS:84893932945
SN - 9781467361293
T3 - 2013 International Conference on Optical Imaging Sensor and Security, ICOSS 2013
BT - 2013 International Conference on Optical Imaging Sensor and Security, ICOSS 2013
T2 - 2013 International Conference on Optical Imaging Sensor and Security, ICOSS 2013
Y2 - 2 July 2013 through 3 July 2013
ER -