Abstract
Bioinformatics is the application of computer technology to the management of biological information. Motif finding is one of its most widely studied problems: it is the process of locating meaningful patterns in sequences of deoxyribonucleic acid (DNA), ribonucleic acid (RNA), or proteins. Motifs vary in length, position, redundancy, orientation, and base composition. Finding these short sequences (motifs or signals) is a fundamental problem in molecular biology and computer science, with important applications such as knowledge-based drug design, forensic DNA analysis, and agricultural biotechnology. In this work, clustering is used to predict local protein sequence motifs. Because clustering algorithms provide an automatic, unsupervised discovery process for sequence motifs, the K-means clustering algorithm and the proposed Rough K-means algorithm are chosen as the motif discovery methods for this study, and their results are compared. The structural similarity of the clusters discovered by the proposed approach is studied to analyze how the recurring patterns correlate with their structure. Some biochemical references are also included in our evaluation.
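As a rough illustration of how clustering can surface recurring sequence patterns, the following minimal sketch runs plain K-means over fixed-width protein segments. It is not the method evaluated in this work: the one-hot encoding, the window width, the deterministic farthest-first initialization, and all function names (`windows`, `one_hot`, `farthest_first`, `kmeans`, `consensus`) are assumptions made for the example, and the Rough K-means variant is not shown.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def windows(seq, width):
    """Return every contiguous segment of the given width (sliding window)."""
    return [seq[i:i + width] for i in range(len(seq) - width + 1)]


def one_hot(segment):
    """Encode a segment as a flat 0/1 vector, 20 entries per position.

    Assumption: one-hot encoding is used here only for simplicity.
    """
    vec = []
    for residue in segment:
        vec.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vec


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic initialization (an assumption, chosen for repeatability):
    start from the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, max_iters=50):
    """Plain K-means: assign each point to its nearest centroid, recompute
    each centroid as the mean of its members, stop when labels are stable."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iters):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def consensus(segments):
    """Most frequent residue at each position: a crude motif summary."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*segments))


if __name__ == "__main__":
    # Toy segments (hypothetical data, not taken from the study).
    segments = ["ACDEF", "ACDEF", "ACDEY", "WYWYW", "WYWYW", "WYWYV"]
    labels, _ = kmeans([one_hot(s) for s in segments], k=2)
    for c in range(2):
        members = [s for s, lab in zip(segments, labels) if lab == c]
        print(c, members, consensus(members))
```

In practice the segments would come from `windows` applied to real protein sequences, and each cluster's consensus (or a full position frequency profile) would serve as a candidate motif. Real studies typically use richer encodings than one-hot vectors and a larger number of clusters.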