TY - GEN
T1 - Analysis of Principal Component Analysis Algorithm for Various Datasets
AU - Naveen, Soumyalatha
AU - Omkar, Av
AU - Goyal, Jhanvi
AU - Gaikwad, Ranveer
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Principal component analysis (PCA) has been used successfully as a multivariate statistical process control (MSPC) tool for detecting faults in processes with predominantly found variables. However, in machine learning, the quality of a prediction depends heavily on the dataset. Available datasets are generally bulky and contain redundant information. To predict with high accuracy, we need a mechanism to remove unwanted information from the dataset. Using the statistical method carries a high risk of substantial data loss during processing. Hence, in this paper, we use PCA to reduce a high-dimensional dataset to a smaller number of modes or structures. We implemented PCA for a simple 2 × 2 process using Python on the Breast Cancer, Wine, Digits, and Iris datasets to understand the impact of PCA on various application datasets. Our results show that the monitoring performance of PCA is vastly better, with 98.15% accuracy for the Wine dataset, 91.05% for Breast Cancer, 75.90% for Digits, and 96.90% for the Iris dataset.
AB - Principal component analysis (PCA) has been used successfully as a multivariate statistical process control (MSPC) tool for detecting faults in processes with predominantly found variables. However, in machine learning, the quality of a prediction depends heavily on the dataset. Available datasets are generally bulky and contain redundant information. To predict with high accuracy, we need a mechanism to remove unwanted information from the dataset. Using the statistical method carries a high risk of substantial data loss during processing. Hence, in this paper, we use PCA to reduce a high-dimensional dataset to a smaller number of modes or structures. We implemented PCA for a simple 2 × 2 process using Python on the Breast Cancer, Wine, Digits, and Iris datasets to understand the impact of PCA on various application datasets. Our results show that the monitoring performance of PCA is vastly better, with 98.15% accuracy for the Wine dataset, 91.05% for Breast Cancer, 75.90% for Digits, and 96.90% for the Iris dataset.
UR - https://www.scopus.com/pages/publications/85154018506
UR - https://www.scopus.com/pages/publications/85154018506#tab=citedBy
U2 - 10.1109/INCOFT55651.2022.10094448
DO - 10.1109/INCOFT55651.2022.10094448
M3 - Conference contribution
AN - SCOPUS:85154018506
T3 - 2022 International Conference on Futuristic Technologies, INCOFT 2022
BT - 2022 International Conference on Futuristic Technologies, INCOFT 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 1st International Conference on Futuristic Technologies, INCOFT 2022
Y2 - 25 November 2022 through 27 November 2022
ER -