Abstract
Since the fast development of genome sequencing has produced large scale data, the current work uses the bioinformatics methods to recognize different gene regions, such as exon, intron and promoter, which play an important role in gene regulations. In this paper, we introduce a new method based on the maximum entropy Markov model (MEMM) to recognize the promoter, which utilizes the biological features of the promoter for the condition. However, it leads to a high false positive rate (FPR). In order to reduce the FPR, we provide another new method based on the maximum entropy hidden Markov model (ME-HMM) without the independence assumption, which could also accommodate the biological features effectively. To demonstrate the precision, the new methods are implemented by R language and the hidden Markov model (HMM) is introduced for comparison. The experimental results show that the new methods may not only overcome the shortcomings of HMM, but also have their own advantages. The results indicate that, MEMM is excellent for identifying the conserved signals, and ME-HMM can demonstrably improve the true positive rate.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.