Abstract

One of the major research directions in bioinformatics is that of predicting the protein superfamily in large databases and classifying a given set of protein domains into superfamilies. The classification reflects the structural, evolutionary and functional relatedness. These relationships are embodied in hierarchical classification such as Structural Classification of Protein (SCOP), which is manually curated. Such classification is essential for the structural and functional analysis of proteins. Yet, a large number of proteins remain unclassified. We have proposed an unsupervised machine-learning FuzzyART neural network algorithm to classify a given set of proteins into SCOP superfamilies. The proposed method is fast learning and uses an atypical non-linear pattern recognition technique. In this approach, we have constructed a similarity matrix from p-values of BLAST all-against-all, trained the network with FuzzyART unsupervised learning algorithm using the similarity matrix as input vectors and finally the trained network offers SCOP superfamily level classification. In this experiment, we have evaluated the performance of our method with existing techniques on six different datasets. We have shown that the trained network is able to classify a given similarity matrix of a set of sequences into SCOP superfamilies at high classification accuracy.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call