Abstract
In this paper, we evaluate the performance of an evolutionary protein secondary structure (PSS) prediction model that uses amino acid sequence information extracted by a clustering technique. The dimension of the classifier's inputs is reduced by applying k-means clustering to sequence segments. The proposed PSS classifier is based on a Genetic Programming (GP) approach that uses IF rules to form a multi-target classifier. The GP classifier is evaluated using both protein sequences and the sequence information obtained from the k-means clustering. Its performance is compared with that of feed-forward artificial neural networks (ANNs) and support vector machines (SVMs). The prediction methods are examined on two protein datasets, RS126 and CB513. The performance of the three classification models is measured by Q3 and segment overlap (SOV) scores. The prediction models that use clustered data achieve, on average, 2% higher prediction accuracy than those using sequence data. In addition, the experimental results indicate that the GP model's prediction scores are, on average, 3% higher than those of the ANN and SVM models, whether amino acid sequences or clustered information are used.
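For readers unfamiliar with the Q3 score, the following is a minimal sketch of how it is commonly computed: the percentage of residues whose predicted state matches the observed state over three classes. The function name `q3_score`, the single-letter state labels H (helix), E (strand), and C (coil), and the example strings are assumptions made for illustration; they are not taken from the paper's implementation.

```python
# Assumed three-state alphabet: H = helix, E = strand, C = coil.
STATES = {"H", "E", "C"}


def q3_score(observed, predicted):
    """Return the percentage of residues whose predicted state equals the observed state."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    if not set(observed) <= STATES or not set(predicted) <= STATES:
        raise ValueError("only the states H, E and C are allowed")
    # Count positions where the two state strings agree.
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


# Hypothetical ten-residue example: two positions disagree.
observed = "HHHHCCEEEC"
predicted = "HHHCCCEEEE"
score = q3_score(observed, predicted)
print(f"Q3 = {score:.1f}%")
```

Q3 counts each residue independently, so it does not reflect whether whole helices or strands are recovered; the SOV score is typically reported alongside it for that reason.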