Abstract

Because the location of a protein offers clues about its function, protein classification is regarded as a very important task in biological data mining, especially for proteins whose location is uncertain. The success of the Human Genome Project has led to an explosion of protein sequences, so there is a great need for computational methods that predict the locations of proteins quickly and reliably from their primary sequences. In this paper, we use a composite classifier system formed by a set of k-nearest neighbor (K-NN) classifiers, each of which is defined in a different pseudo amino acid composition vector space. In each such space, a protein is represented by its pseudo amino acid composition. The location of a query protein is determined by the outcome of a vote among these constituent individual classifiers. The results show that the composite classifier outperforms the single classifiers widely used in the biological literature. The composite classifier can therefore be employed as a robust method for predicting protein location in biological data mining.

Key words: Composite classifier system, biological data mining, atomic classifiers, pseudo amino acid composition.
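The voting scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the pseudo amino acid composition is simplified to a single property scale (Kyte-Doolittle hydropathy), the weight of 0.05 and the choice of correlation ranks `lambdas` are arbitrary, ties are broken by first occurrence, and the training sequences and location labels are invented toy data. The function names `pseudo_aac`, `knn_predict`, and `composite_predict` are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used here as a stand-in property scale
# (an assumption; the abstract does not say which properties are used).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pseudo_aac(seq, lam, weight=0.05):
    """Simplified (20 + lam)-dimensional pseudo amino acid composition.

    Requires len(seq) > lam and only the 20 standard residues.
    """
    n = len(seq)
    counts = Counter(seq)
    freqs = [counts[a] / n for a in AMINO_ACIDS]
    # Sequence-order correlation factors for residue gaps 1..lam.
    thetas = []
    for gap in range(1, lam + 1):
        sq = [(HYDROPATHY[seq[i]] - HYDROPATHY[seq[i + gap]]) ** 2
              for i in range(n - gap)]
        thetas.append(sum(sq) / len(sq))
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


def knn_predict(train, query, lam, k=1):
    """One constituent K-NN classifier, working in the space defined by lam."""
    q = pseudo_aac(query, lam)
    ranked = sorted((math.dist(pseudo_aac(s, lam), q), label)
                    for s, label in train)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def composite_predict(train, query, lambdas=(0, 1, 2, 3, 4), k=1):
    """Majority vote over K-NN classifiers, one per value of lam."""
    ballots = Counter(knn_predict(train, query, lam, k) for lam in lambdas)
    return ballots.most_common(1)[0][0]


# Toy data: invented sequences and labels, for illustration only.
train = [
    ("LLIVVAILLFVG", "membrane"),
    ("IVLLAFFVILGA", "membrane"),
    ("KDEEKRDSENKQ", "soluble"),
    ("DEKRSQNEDKKE", "soluble"),
]
query = "VLLIAFVGLLIA"
predicted = composite_predict(train, query)
print(predicted)
```

Each value of `lam` defines a different feature space, so the constituent classifiers can disagree on a query; the vote is what makes the combined prediction less sensitive to any single choice of `lam`.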
