Abstract

Machine learning methods played a major role in improving the accuracy of predictions and classification of DNA (Deoxyribonucleic Acid) and protein sequences. In eukaryotes, Splice-site identification and prediction is though not a straightforward job because of numerous false positives. To solve this problem, here, in this paper, we represent a bidirectional Long Short Term Memory (LSTM) Recurrent Neural Network (RNN) based deep learning model that has been developed to identify and predict the splice-sites for the prediction of exons from eukaryotic DNA sequences. During the splicing mechanism of the primary mRNA transcript, the introns, the non-coding region of the gene are spliced out and the exons, the coding region of the gene are joined. This bidirectional LSTM-RNN model uses the intron features that start with splice site donor (GT) and end with splice site acceptor (AG) in order of its length constraints. The model has been improved by increasing the number of epochs while training. This designed model achieved a maximum accuracy of 95.5%. This model is compatible with huge sequential data such as the complete genome.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.