
Mining sequential patterns from protein sequences associated with aggregation diseases

  • B. Anup Bhat*
  • Tanish Sunilkumar
  • Ayush Prabhu

  *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Protein aggregation is a hallmark of several neurodegenerative diseases. Mining patterns of amino acids from protein sequences aids in identifying regions of interest that can be validated clinically. Only a few studies have demonstrated the feasibility of employing Frequent Itemset Mining (FIM) algorithms for this task. However, these algorithms are not only computationally expensive but also disregard the ordering of amino acids within a protein sequence. In addition, the number of patterns obtained is sensitive to the input minimum support threshold. To overcome these limitations, the current study focuses on mining sequential patterns using the Top-k Sequential pattern mining algorithm, which preserves the amino acid sequence and is independent of any user-input threshold. Across various protein sequences, the obtained sequential patterns were compared for similarity with frequent patterns, both with repeating amino acids retained and with them removed. On average, about 89.31% of non-repeating and 68.08% of repeating sequential patterns were similar to the frequent patterns. Furthermore, Jaccard Index values close to 0.58 and 0.48 indicate that the sequential patterns are close to the frequent ones despite the absence of user-defined thresholds.
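
The abstract reports similarity between the mined sequential patterns and the frequent patterns as a percentage and as a Jaccard Index. As a rough illustration only, the sketch below applies the standard Jaccard definition, |A ∩ B| / |A ∪ B|, to two sets of patterns written as amino acid strings. The function names, the `overlap_fraction` helper, and the toy patterns are hypothetical, and exact string matching is an assumption; the record does not describe how the study actually matched patterns.

```python
def jaccard_index(a, b):
    """Standard Jaccard index |A ∩ B| / |A ∪ B| for two collections of patterns."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # convention chosen here for two empty collections
    return len(set_a & set_b) / len(union)


def overlap_fraction(candidates, reference):
    """Share of distinct candidate patterns that also occur in the reference set."""
    cand, ref = set(candidates), set(reference)
    if not cand:
        return 0.0
    return len(cand & ref) / len(cand)


# Hypothetical toy patterns (one-letter amino acid codes), not data from the study.
sequential_patterns = ["AG", "GV", "AGV", "AAG"]
frequent_patterns = ["AG", "GV", "VL"]

print(jaccard_index(sequential_patterns, frequent_patterns))
print(overlap_fraction(sequential_patterns, frequent_patterns))
```

Here two patterns are shared out of five distinct patterns overall, and two of the four sequential patterns appear among the frequent ones. Treating patterns as set elements ignores how often each pattern occurs, which is a simplification of this sketch.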

Original language: English
Title of host publication: 2024 IEEE 3rd World Conference on Applied Intelligence and Computing, AIC 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 184-189
Number of pages: 6
ISBN (Electronic): 9798350384598
Publication status: Published - 2024
Event: 3rd IEEE World Conference on Applied Intelligence and Computing, AIC 2024 - Hybrid, Gwalior, India
Duration: 27-06-2024 to 28-06-2024

Publication series

Name: 2024 IEEE 3rd World Conference on Applied Intelligence and Computing, AIC 2024

Conference

Conference: 3rd IEEE World Conference on Applied Intelligence and Computing, AIC 2024
Country/Territory: India
City: Hybrid, Gwalior
Period: 27-06-24 to 28-06-24

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Science Applications
  • Signal Processing
  • Decision Sciences (miscellaneous)
  • Modelling and Simulation
  • Health Informatics
