TY - JOUR
T1 - Machine learning facilitated structural activity relationship approach for the discovery of novel inhibitors targeting EGFR
AU - Choudhary, Rekha
AU - Walhekar, Vinayak
AU - Muthal, Amol
AU - Kumar, Dilip
AU - Bagul, Chandrakant
AU - Kulkarni, Ravindra
N1 - Publisher Copyright: © 2023 Informa UK Limited, trading as Taylor & Francis Group.
PY - 2023
Y1 - 2023
AB - This study aims to identify the most effective epidermal growth factor receptor (EGFR) inhibitors from millions of in-house compounds using machine learning (ML) techniques. ML-based structure-activity relationship (SAR) models were validated to predict the biological activity of untested novel molecules. Six ML algorithms, namely k-nearest neighbour (KNN), decision tree (DT), logistic regression, support vector machine (SVM), multilinear regression (MLR), and random forest (RF), were used to build models for activity prediction. Among these, the RF classifier (accuracy of 90% and 81% on the training and test sets, respectively) and the RF regressor (R² and MSE of 0.83 and 0.29 on the training set and 0.69 and 0.46 on the test set) showed good predictive performance. In addition, the six most important features that affect biological activity and contribute most to model development were selected by the variable importance technique. The RF regression model was used to predict the biological activity, expressed as pIC50, of nearly ten million molecules, while the RF classification model classified those molecules as active, moderately active, or least active according to their predicted pIC50. Based on the two models, a thousand molecules with higher predicted pIC50 values that were also classified as active were selected from the millions screened for molecular docking. Based on docking scores, predicted pIC50, and binding interactions with the MET769 residue, the compounds Zinc257233137, Zinc257232249, and Zinc101379788 were identified as potential EGFR inhibitors, with predicted pIC50 values of 7.72, 7.85, and 7.70, respectively. Dynamics studies were also performed on Zinc257233137 and indicated good binding free energy and stable hydrogen bonding interactions with EGFR. These molecules can be used for further research and may prove to be novel drugs targeting EGFR in cancer treatment. Communicated by Ramaswamy H. Sarma.
UR - https://www.scopus.com/pages/publications/85148236658
UR - https://www.scopus.com/pages/publications/85148236658#tab=citedBy
U2 - 10.1080/07391102.2023.2175263
DO - 10.1080/07391102.2023.2175263
M3 - Article
C2 - 36762704
AN - SCOPUS:85148236658
SN - 0739-1102
VL - 41
SP - 12445
EP - 12463
JO - Journal of Biomolecular Structure and Dynamics
JF - Journal of Biomolecular Structure and Dynamics
IS - 22
ER -