MediClass-RF: An Optimized Random Forest Framework for Medicinal Plant Classification with Feature Importance and Comprehensive Performance Evaluation

  • R. Sapna*
  • S. N. Sheshappa
  • S. Pravinth Raja
  • Preethi
  • Raghavendra M. Devadas
  • Vani Hiremani
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Accurate classification of medicinal and non-medicinal herbs is essential for their downstream use in healthcare and the herbal industry. In this work, we propose MediClass-RF, an optimized Random Forest-based framework for medicinal plant classification that provides feature importance analysis and an in-depth evaluation of performance metrics. MediClass-RF was trained on a dataset containing real-world instances of medicinal and non-medicinal plants; synthetic instances with labels of varying quality were added to control the noise level and thereby simulate practical usage more accurately. The plants' properties were fed into the model using the widely used Term Frequency-Inverse Document Frequency (TF-IDF) approach. The Random Forest model was trained, tested, and validated with a standard 70:30 split between training and testing sets, and it performed well, achieving 89% overall accuracy. In terms of performance, the model had a recall of 0.71, indicating its ability to capture a significant portion of actual medicinal plants, and an F1-score of 0.83, indicating a balance between precision and recall. According to the feature importance study, the features contributing most to classification are terms related to anti-inflammatory, antioxidant, and antibacterial activities. Furthermore, the Area Under the Curve (AUC) suggests that the model has strong power to distinguish medicinal from non-medicinal plants. Five-fold cross-validation was performed to validate the model further, yielding a mean accuracy of 0.88 (fold scores: 0.875, 0.91666667, 0.875, 0.83333333, 0.91666667), demonstrating the model's robustness across diverse subsets of the data. This work shows the potential of machine learning, specifically Random Forest, for medicinal plant classification, making it a valuable tool for medicinal plant specialists and researchers.
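
The reported mean accuracy can be checked directly from the five cross-validation scores listed in the abstract. The following is a minimal sketch using only the Python standard library; the variable names are illustrative, and the standard deviation is an extra summary computed here, not a figure reported by the authors.

```python
from statistics import mean, pstdev

# Cross-validation accuracies exactly as listed in the abstract.
fold_accuracies = [0.875, 0.91666667, 0.875, 0.83333333, 0.91666667]

# Arithmetic mean over the five scores; the abstract reports 0.88.
mean_accuracy = mean(fold_accuracies)

# Population standard deviation across the scores.
# Assumption: this spread measure is added for illustration only;
# the abstract does not report it.
spread = pstdev(fold_accuracies)

print(f"mean accuracy: {mean_accuracy:.2f}")
print(f"std. dev. across folds: {spread:.3f}")
```

A small spread relative to the mean is one simple way to read the robustness claim: no single subset of the data drives the overall score.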

Original language: English
Title of host publication: IEEE International Conference on Modeling, Simulation and Intelligent Computing, MoSICom 2024 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 373-378
Number of pages: 6
ISBN (Electronic): 9798331533311
Publication status: Published - 2024
Event: 2024 IEEE International Conference on Modeling, Simulation and Intelligent Computing, MoSICom 2024 - Dubai, United Arab Emirates
Duration: 09-12-2024 to 11-12-2024

Publication series

Name: IEEE International Conference on Modeling, Simulation and Intelligent Computing, MoSICom 2024 - Proceedings

Conference

Conference: 2024 IEEE International Conference on Modeling, Simulation and Intelligent Computing, MoSICom 2024
Country/Territory: United Arab Emirates
City: Dubai
Period: 09-12-24 to 11-12-24

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Science Applications
  • Energy Engineering and Power Technology
  • Electrical and Electronic Engineering
  • Modelling and Simulation
  • Instrumentation
