Banking and financial institutions have, in the last few years, faced mounting pressure from increased defaults by individuals and firms and from the repercussions of fraudulent activities. This adversely affects not only banks but also the other financial sectors that depend on them, which makes it imperative to study ways of preventing defaults rather than remedying them after the fact. However, banks face two challenges in identifying Non-Performing Assets (NPAs) and wilful defaults. The first is the due diligence of firms and individuals before a loan is extended. The second is the need for automated safeguards to reduce fraud originating from human behavior. Wilful defaults are committed mainly in loan and credit services for personal benefit and are converted into bad loans. Bad loans are NPAs, and wilful defaults are a subset of these. Hence, it is very important to control NPAs. The objective of the paper is to design and evaluate supervised machine learning models for NPA detection. Designing such models requires considering the entire historical and current data, which in turn requires fast access to large volumes of heterogeneous data. Hence, the supervised models are implemented using big data techniques for fraud detection and analytics. Five supervised models, namely Logistic Regression, Support Vector Machine, Random Forest, Neural Network, and Naive Bayes, are designed for loan data and run using MapReduce on the Hadoop platform. These models are evaluated on several performance metrics. The empirical results show that the Neural Network model performs best for NPA prediction in terms of precision, recall, relative commission error, and the kappa statistic. The best-performing model can be integrated into an existing loan management system for the early identification of NPA cases.
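As an illustration of three of the metrics named in the abstract, the following is a minimal sketch of how precision, recall, and Cohen's kappa can be computed from the confusion counts of a binary NPA classifier. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper. Relative commission error is left out because the abstract does not define it.

```python
def binary_metrics(tp, fp, fn, tn):
    """Precision, recall and Cohen's kappa from binary confusion counts.

    tp: NPA loans correctly flagged, fp: performing loans wrongly flagged,
    fn: NPA loans missed, tn: performing loans correctly passed.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion counts must not all be zero")

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # Observed agreement between predictions and true labels.
    observed = (tp + tn) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
    kappa = (observed - expected) / (1 - expected) if expected != 1 else 0.0

    return {"precision": precision, "recall": recall, "kappa": kappa}


# Hypothetical counts, for illustration only (not results from the paper).
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
print(metrics)
```

With these illustrative counts, precision is 0.8, recall is about 0.667, and kappa is 0.4. Kappa falls below raw accuracy (0.7) because it discounts the agreement expected by chance.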
|Number of pages|14|
|Publication status|Published - 2021|