Abstract
Cathepsins are lysosomal proteases that are crucial for protein breakdown, bone remodelling and antigen processing. Their dysregulation is associated with diseases such as cancer, osteoporosis and neurodegenerative disorders, which makes them important drug targets. This study introduces a classification model based on a 1D Convolutional Neural Network to improve screening for potential cathepsin inhibitors, so that candidates for experimental validation can be selected more efficiently. The dataset was gathered from BindingDB and ChEMBL for the targets Cathepsin B, S, D and K, together with the half-maximal inhibitory concentration (IC50) values of their inhibitors. The inhibitors were categorized into four classes (potent, active, intermediate, and inactive) based on ranges of IC50 values. The inhibitor ligands were collected in Simplified Molecular Input Line Entry System (SMILES) notation and converted to molecular descriptors using the RDKit library. Because of the large number of features, feature selection techniques such as Recursive Feature Elimination (RFE), variance thresholding, and correlation analysis were employed to refine and reduce the initial molecular descriptor set. Data augmentation techniques such as the Synthetic Minority Over-sampling Technique (SMOTE) were applied to address class imbalance. The proposed model achieved classification accuracies of 97.67% ± 0.54% for Cathepsin B, 90.69% ± 0.57% for Cathepsin S, 97.27% ± 0.23% for Cathepsin D, and 92.03% ± 1.07% for Cathepsin K, highlighting the effectiveness of feature selection and deep learning in ligand classification. This approach enhances the identification of potential drug targets for in vitro and in vivo testing, making the process more cost-effective and time-efficient.
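The labelling and descriptor-filtering steps named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. The abstract does not state the IC50 ranges, so the cut-offs in `ASSUMED_CUTOFFS_NM` are hypothetical, as are the function names, the variance and correlation thresholds, and the toy descriptor values. RFE, SMOTE and the 1D CNN itself are not covered here.

```python
from math import sqrt
from statistics import mean, pvariance

# Hypothetical IC50 cut-offs in nM (upper bound, class name).
# The abstract does not give the actual ranges used in the study.
ASSUMED_CUTOFFS_NM = ((100.0, "potent"), (1000.0, "active"), (10000.0, "intermediate"))

def label_ic50(ic50_nm, cutoffs=ASSUMED_CUTOFFS_NM):
    """Map an IC50 value in nM to one of four activity classes."""
    for upper, name in cutoffs:
        if ic50_nm < upper:
            return name
    return "inactive"

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def select_features(columns, min_variance=1e-3, max_abs_corr=0.95):
    """Filter descriptors given as a dict of name -> list of values.

    Step 1 drops near-constant descriptors (variance thresholding).
    Step 2 greedily drops any descriptor that is highly correlated
    with one already kept (correlation analysis).
    """
    candidates = [n for n, v in columns.items() if pvariance(v) > min_variance]
    kept = []
    for name in candidates:
        if all(abs(pearson(columns[name], columns[k])) <= max_abs_corr for k in kept):
            kept.append(name)
    return kept

# Toy descriptor table (made-up values, for illustration only).
descriptors = {
    "mol_wt": [310.2, 455.1, 289.9, 512.4],
    "mol_wt_copy": [620.4, 910.2, 579.8, 1024.8],  # perfectly correlated with mol_wt
    "constant": [1.0, 1.0, 1.0, 1.0],              # zero variance
    "logp": [2.1, 1.4, 3.8, 2.9],
}
selected = select_features(descriptors)
labels = [label_ic50(v) for v in (12.0, 450.0, 5000.0, 80000.0)]
print(selected)
print(labels)
```

On this toy table the constant column and the duplicated column are removed, leaving the two informative descriptors. In practice the same two filters would typically run before a wrapper method such as RFE, since they are cheap and shrink the search space.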
| Original language | English |
|---|---|
| Pages (from-to) | 173695-173711 |
| Number of pages | 17 |
| Journal | IEEE Access |
| Volume | 13 |
| DOIs | |
| Publication status | Published - 2025 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
All Science Journal Classification (ASJC) codes
- General Computer Science
- General Materials Science
- General Engineering
Title
CathepsinDL: Deep Learning-Driven Model for Cathepsin Inhibitor Screening and Drug Target Identification