Abstract

Owing to the vast amount of DNA sequence data now available, predicting the complete structure of genes from genomic DNA sequence has become an important problem. In eukaryotes, and especially in the human genome, the identification of splice sites plays a crucial role in gene structure prediction. A hybrid feature extraction approach that combines the position weight matrix (PWM) with the increment of diversity (ID) is proposed. Based on the extracted features, a support vector machine (SVM) is applied to distinguish authentic splice sites from false ones. The new algorithm is shown to be both effective and simple: it correctly classified 92.98% of donor sites and 90.46% of acceptor sites. The method is therefore a promising approach for identifying splice sites in the human genome.
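
The hybrid feature idea can be illustrated with a minimal sketch. The abstract does not give the exact feature construction, so everything below is an assumption made for illustration: the function names, the use of dinucleotide counts as the ID source, the pseudocount, the uniform background, and the toy sequences are all hypothetical. The sketch builds a log-odds PWM from aligned true sites, computes the standard increment of diversity ID(X, Y) = D(X + Y) - D(X) - D(Y) with D(X) = N ln N - Σ n_i ln n_i, and joins the results into one feature vector.

```python
import math
from collections import Counter

BASES = "ACGT"

def build_pwm(seqs, pseudocount=1.0, background=0.25):
    """Log-odds PWM from equal-length aligned sequences (uniform background assumed)."""
    total = len(seqs) + pseudocount * len(BASES)
    pwm = []
    for pos in range(len(seqs[0])):
        counts = Counter(s[pos] for s in seqs)
        pwm.append({b: math.log2((counts[b] + pseudocount) / total / background)
                    for b in BASES})
    return pwm

def pwm_score(pwm, seq):
    """Sum of the per-position log-odds weights for a candidate site."""
    return sum(column[base] for column, base in zip(pwm, seq))

def kmer_counts(seq, k=2):
    """Overlapping k-mer counts; k=2 is an illustrative choice, not the paper's."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def diversity(counts):
    """D(X) = N ln N - sum_i n_i ln n_i over the non-zero counts."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return n * math.log(n) - sum(c * math.log(c) for c in counts.values() if c > 0)

def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); smaller values mean more similar sources."""
    return diversity(x + y) - diversity(x) - diversity(y)

def profile(seqs, k=2):
    """Pooled k-mer counts of a sequence set, used as a reference source for ID."""
    pooled = Counter()
    for s in seqs:
        pooled += kmer_counts(s, k)
    return pooled

def hybrid_features(seq, pwm, true_profile, false_profile, k=2):
    """Feature vector: PWM score plus ID against the true and false profiles."""
    counts = kmer_counts(seq, k)
    return [pwm_score(pwm, seq),
            increment_of_diversity(counts, true_profile),
            increment_of_diversity(counts, false_profile)]

# Toy sequences invented for this sketch; they are not data from the paper.
true_sites = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGT"]
false_sites = ["ACGTTCCA", "TTGTCCAC", "GAGTCCTA"]
pwm = build_pwm(true_sites)
vector = hybrid_features("AGGTAAGT", pwm, profile(true_sites), profile(false_sites))
```

Vectors of this kind, computed for labelled authentic and false sites, would then be passed to an SVM classifier (for example scikit-learn's `sklearn.svm.SVC`, which is again an assumption and not a tool named in the abstract).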

