TY - GEN
T1 - Analysis of feature selection and extraction algorithm for loan data
T2 - 2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2017
AU - Manohara, Pai M.M.
AU - Attigeri, Girija
AU - Pai, Radhika M.
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/11/30
Y1 - 2017/11/30
N2 - Fraudulent activities in financial institutions can damage a country's economic system. These activities can be identified using clustering and classification algorithms, whose effectiveness depends on the quality of the input data. Financial data comes from various sources and in various forms, such as financial statements and stakeholder activities, and taken together it constitutes large, unstructured big data. Hence, parallel and distributed pre-processing is important for improving data quality. The objective of this work is dimensionality reduction using feature selection and extraction algorithms for large volumes of financial data. The paper attempts to understand the implications of feature extraction and transformation using Principal Feature Analysis on financial data. The effect of the reduced dimension is studied on various classification algorithms for financial loan data. A parallel and distributed implementation is carried out on the IBM Bluemix cloud platform with a Spark notebook. The results show that reducing the number of features significantly improves execution time without compromising accuracy.
UR - https://www.scopus.com/pages/publications/85042675377
UR - https://www.scopus.com/inward/citedby.url?scp=85042675377&partnerID=8YFLogxK
U2 - 10.1109/ICACCI.2017.8126163
DO - 10.1109/ICACCI.2017.8126163
M3 - Conference contribution
AN - SCOPUS:85042675377
VL - 2017-January
T3 - 2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2017
SP - 2147
EP - 2151
BT - 2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2017
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 13 September 2017 through 16 September 2017
ER -