Abstract

In this paper, we consider the problem of protein classification, which is a important and hot topic in bioinformatics. We propose a novel kernel based on the K- Spectrum Kernel by incorporating physico-chemical and biological properties of amino acids as well as the motif information for the captured protein classification problem. Similarity matrix is constructed based on an AAindex2 substitution matrix which measures the amino acid pair distance. Together with the motif content posing importance on the protein sequences, a new kernel is then constructed. We adopt the Eigen-matrix translation techniques for improving the classification accuracy. Experimental results indicate that the string-based kernel in conjunction with SVM classifier performs significantly better than the traditional spectrum kernel method. Furthermore, numerical examples also confirm the use of the Eigen- matrix translation techniques as general strategy.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.