TY - GEN
T1 - MediClass-RF
T2 - 2024 IEEE International Conference on Modeling, Simulation and Intelligent Computing, MoSICom 2024
AU - Sapna, R.
AU - Sheshappa, S. N.
AU - Raja, S. Pravinth
AU - Preethi
AU - Devadas, Raghavendra M.
AU - Hiremani, Vani
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Accurate categorization of medicinal and non-medicinal herbs is essential for downstream applications in healthcare and the herbal industry. In this work, we propose MediClass-RF, an optimized Random Forest-based framework for medicinal plant classification that provides feature importance analysis and an in-depth evaluation of performance metrics. MediClass-RF was trained on a dataset containing real-world instances of medicinal and non-medicinal plants. The plants' properties were fed into the model using the widely used Term Frequency-Inverse Document Frequency (TF-IDF) approach, and synthetic instances with labels of varying quality were added to control noise levels and thereby simulate practical usage more accurately. The Random Forest model was trained and evaluated with a standard 70:30 split between training and testing sets and achieved 89% overall accuracy. The model had a recall of 0.71, indicating that it captures a substantial portion of actual medicinal plants, and an F1-score of 0.83, indicating a balance between precision and recall. According to the feature importance analysis, the features contributing most to classification are terms associated with anti-inflammatory, antioxidant, and antibacterial activity. Furthermore, the Area Under the Curve (AUC) indicates that the model discriminates well between medicinal and non-medicinal plants. Five-fold cross-validation was performed to validate the model further, yielding a mean accuracy of 0.88 (fold scores: 0.875, 0.91666667, 0.875, 0.83333333, 0.91666667) and demonstrating the model's robustness across different subsets of the data. This work demonstrates the potential of machine learning, specifically Random Forest, for medicinal plant classification, making it a valuable tool for medicinal plant specialists and researchers.
UR - https://www.scopus.com/pages/publications/85219597024
UR - https://www.scopus.com/pages/publications/85219597024#tab=citedBy
U2 - 10.1109/MoSICom63082.2024.10881547
DO - 10.1109/MoSICom63082.2024.10881547
M3 - Conference contribution
AN - SCOPUS:85219597024
T3 - IEEE International Conference on Modeling, Simulation and Intelligent Computing, MoSICom 2024 - Proceedings
SP - 373
EP - 378
BT - IEEE International Conference on Modeling, Simulation and Intelligent Computing, MoSICom 2024 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 9 December 2024 through 11 December 2024
ER -