TY - JOUR
T1 - DCBLSTM—Deep Convolutional Bidirectional Long Short-Term Memory neural network for Q8 secondary protein structure prediction
AU - Banthia, Suvidhi
AU - McKenna, Adam
AU - Tiwari, Shailendra Kumar
AU - Dubey, Sandhya P.N.
N1 - Publisher Copyright: © 2025 The Authors
PY - 2025/9
Y1 - 2025/9
N2 - Protein secondary structure prediction involves determining a protein's secondary structure from its primary amino acid sequence, serving as a critical step toward tertiary structure prediction. This, in turn, is essential for applications in drug design, protein engineering, and genetic research. Given the complexity of this task, advanced methods such as Gated Recurrent Units (GRUs) and Long Short-Term Memory (LSTM) networks are often employed, as they effectively capture long-range dependencies between amino acids, thereby improving prediction accuracy. In this study, we utilized the latter, specifically Bidirectional Long Short-Term Memory (BLSTM) networks, which process protein sequences in both forward and backward directions. This bidirectional processing has shown considerable promise in this domain. To further enhance local feature extraction, the network architecture incorporates a local feature encoding and extraction module consisting of three 1-dimensional convolutional layers, designed to capture dependencies between adjacent amino acids. Several optimization and regularization techniques were applied to refine the model, including batch normalization, kernel initialization, kernel regularization, dropout, and pooling layers. Optimal values for each parameter were identified through meticulous hyperparameter tuning. The final proposed model, termed Deep Convolutional BLSTM (DCBLSTM), was evaluated on three publicly available and widely recognized datasets: CB513, CASP10, and CASP11. For Q8-state classification, the model achieved accuracies of 88.9%, 83.9%, and 84.3%, respectively, on these datasets. These results demonstrate that the proposed model delivers state-of-the-art accuracy, outperforming several existing benchmark models. The consistently high accuracy highlights the effectiveness and robustness of the DCBLSTM model for protein secondary structure prediction.
UR - https://www.scopus.com/pages/publications/105008661484
U2 - 10.1016/j.compbiomed.2025.110457
DO - 10.1016/j.compbiomed.2025.110457
M3 - Article
AN - SCOPUS:105008661484
SN - 0010-4825
VL - 195
JO - Computers in Biology and Medicine
JF - Computers in Biology and Medicine
M1 - 110457
ER -