Abstract
This work describes the synthesis of protein structures using clustering, an unsupervised learning method. Protein structure prediction was performed on several crab and egg datasets, with inputs collected from the Protein Data Bank (PDB IDs: 3LIG, 2W3Z, 3ZVQ, 2KLR and 2YIZ). The three-dimensional protein structures were merged using the built-in instance-filtering technique known as MergeSets. The proposed methodology, referred to as attribute-related cluster sequence analysis (ARCSA), aims to identify an effective algorithm for clustering protein structures by comparing four existing algorithms: k-means, expectation maximization, farthest first and COBWEB. Experiments were conducted with the BioWeka data mining tool, Modeler 9.15 and PyMOL, with scripts written in Python. The results show that expectation maximization performs best of the four for structured protein clustering, a result that also paves the way for identifying better algorithms for supervised learning methods.
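Of the four algorithms named in the abstract, farthest first is the simplest to state: choose a starting point as the first center, repeatedly add the point farthest from all centers chosen so far, then assign each point to its nearest center. The following is a minimal sketch in plain Python, not the BioWeka implementation used in the paper. It assumes Euclidean distance and a fixed first center for reproducibility; the function name `farthest_first` and the coordinates are hypothetical and are not taken from any PDB entry.

```python
import math


def farthest_first(points, k, first=0):
    """Farthest-first traversal. Returns (center_indices, labels)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    centers = [first]
    # nearest[i] holds the distance from point i to its closest chosen center.
    nearest = [math.dist(p, points[first]) for p in points]
    while len(centers) < k:
        # The next center is the point farthest from every current center.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        centers.append(nxt)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[nxt]))
    # Assign each point to the index of its nearest center.
    labels = [
        min(range(k), key=lambda c: math.dist(p, points[centers[c]]))
        for p in points
    ]
    return centers, labels


# Hypothetical 3D coordinates (assumption: illustrative values only).
coords = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
    (10.0, 10.0, 10.0), (11.0, 10.0, 10.0),
    (-10.0, 5.0, 0.0),
]
centers, labels = farthest_first(coords, k=3)
print(centers, labels)
```

Because the centers are always drawn from the input points, this approach needs only a distance function, whereas k-means and expectation maximization iteratively re-estimate cluster parameters.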
| Original language | English |
|---|---|
| Pages (from-to) | 4287-4301 |
| Number of pages | 15 |
| Journal | Journal of Supercomputing |
| Volume | 76 |
| Issue number | 6 |
| Publication status | Published - 01-06-2020 |
All Science Journal Classification (ASJC) codes
- Software
- Theoretical Computer Science
- Information Systems
- Hardware and Architecture
Fingerprint
Dive into the research topics of '3D visualization and cluster analysis of unstructured protein sequences using ARCSA with a file conversion approach'. Together they form a unique fingerprint.