CathepsinDL: Deep Learning-Driven Model for Cathepsin Inhibitor Screening and Drug Target Identification

  • Mohammed Junaid Anwar Qader
  • Chandra Mohan Sah
  • Tapan Kumar Sahoo*
  • Santosh Kumar Majhi
  • Kaushik Mishra*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Cathepsins are lysosomal proteases that are crucial for protein breakdown, bone remodelling and antigen processing. Their dysregulation leads to diseases such as cancer, osteoporosis and neurodegenerative disorders, which makes them important drug targets. This study introduces a 1D Convolutional Neural Network-based classification model to enhance screening for potential cathepsin inhibitors, leading to more efficient selection of candidates for experimental validation. The dataset was gathered from BindingDB and ChEMBL for the targets Cathepsin B, S, D and K, together with the half-maximal inhibitory concentration (IC50) values of their inhibitors. The inhibitors were categorized into four classes (potent, active, intermediate, and inactive) based on ranges of IC50 values. The inhibitor ligands were collected in Simplified Molecular Input Line Entry System (SMILES) notation and converted to molecular descriptors using the RDKit library. Because of the large number of features, feature selection techniques such as Recursive Feature Elimination (RFE), variance thresholding, and correlation analysis were employed to refine and reduce the initial molecular descriptor set. Data augmentation techniques such as the Synthetic Minority Over-sampling Technique (SMOTE) were applied to address class imbalance. The proposed model achieved high classification accuracies: 97.67% ± 0.54% for Cathepsin B, 90.69% ± 0.57% for Cathepsin S, 97.27% ± 0.23% for Cathepsin D, and 92.03% ± 1.07% for Cathepsin K, highlighting the effectiveness of feature selection and deep learning in ligand classification. This approach enhances the identification of potential drug targets for in vitro and in vivo testing, making the process more cost-effective and time-efficient.
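As a rough illustration of two steps named in the abstract, the sketch below labels IC50 values with one of four activity classes and then filters a descriptor table by variance and pairwise correlation. It is not the authors' code. The IC50 cut-offs (100, 1,000 and 10,000 nM), the `min_var` and `max_corr` thresholds, and the function names are all assumptions made for this example, since the abstract does not state the actual ranges or settings. Descriptor calculation with RDKit, RFE, SMOTE and the 1D CNN itself are omitted.

```python
from statistics import pvariance

# Hypothetical IC50 cut-offs in nM; the abstract does not give the real ranges.
CUTOFFS_NM = [(100.0, "potent"), (1_000.0, "active"), (10_000.0, "intermediate")]


def label_ic50(ic50_nm):
    """Map an IC50 value (in nM) to one of the four activity classes."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    for upper, name in CUTOFFS_NM:
        if ic50_nm < upper:
            return name
    return "inactive"


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def filter_descriptors(columns, min_var=1e-3, max_corr=0.95):
    """Return descriptor names that survive variance and correlation filtering.

    `columns` maps a descriptor name to its list of values, one per ligand.
    Near-constant descriptors are dropped first; a descriptor is then dropped
    if it is highly correlated with one that has already been kept.
    """
    kept = []
    for name, values in columns.items():
        if pvariance(values) < min_var:
            continue  # carries almost no information
        if any(abs(pearson(values, columns[k])) > max_corr for k in kept):
            continue  # redundant with an already-kept descriptor
        kept.append(name)
    return kept
```

Because the correlation filter keeps whichever descriptor of a correlated pair comes first, the outcome depends on column order. A real pipeline would typically follow this with a model-based step such as RFE, as the abstract describes.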

Original language: English
Pages (from-to): 173695-173711
Number of pages: 17
Journal: IEEE Access
Volume: 13
Publication status: Published - 2025

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • General Materials Science
  • General Engineering
