COMPUTATIONAL IDENTIFICATION OF PROMOTER REGIONS IN PROKARYOTES AND EUKARYOTES

Sudheer Menon, Gopal Agarwal, Shanmughavel Piramanayakam

doi:10.36713/epra7667

Abstract

Promoters are modular regulatory DNA regions that contain the complex regulatory elements required to initiate gene transcription. A promoter lies upstream of its gene, near the transcription start site (TSS). Identifying promoters with machine learning methods is therefore important for improving genome annotation and understanding transcriptional regulation. Many methods for predicting eukaryotic and prokaryotic promoters have been proposed in recent years, but their performance is still far from satisfactory. Accurate identification remains a challenge because promoters have variable structures composed of functional motifs that provide gene-specific transcription initiation. In this article, we develop a hybrid method, called IPMD, that combines a position correlation score function and the diversity increment with a modified Mahalanobis discriminant to predict eukaryotic and prokaryotic promoters. In the post-genomic era, the availability of data makes it possible to build computational models that detect promoters robustly, and such models are expected to be useful in academic research and drug discovery. Until recently, published models focused only on classifying sequences as promoters or non-promoters. Promoter predictors can be improved further by also distinguishing weak promoters from strong ones.

INDEX TERMS: deep learning, DNA sequence analysis, promoter prediction, promoters, promoter elements
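As an illustration of one ingredient named in the abstract, the following minimal sketch computes a diversity increment between the k-mer counts of a query sequence and those of a reference set. It assumes the commonly used definition D(X) = N log N − Σ nᵢ log nᵢ with ID(X, S) = D(X + S) − D(X) − D(S). The function names, the choice of k = 2, and the toy sequences are assumptions made for illustration and are not taken from the article. The position correlation score and the modified Mahalanobis discriminant are not shown.

```python
import math
from collections import Counter
from itertools import product


def kmer_counts(seqs, k=2):
    """Count overlapping k-mers over a list of DNA sequences.

    Returns counts in a fixed lexicographic k-mer order (AA, AC, ...).
    k-mers containing characters other than A, C, G, T are ignored.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return [counts[km] for km in kmers]


def diversity(counts):
    """Diversity measure D = N log N - sum(n_i log n_i) of a count vector."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return n * math.log(n) - sum(c * math.log(c) for c in counts if c > 0)


def diversity_increment(x, s):
    """ID(X, S) = D(X + S) - D(X) - D(S); smaller means more similar."""
    merged = [a + b for a, b in zip(x, s)]
    return diversity(merged) - diversity(x) - diversity(s)


# Toy data (hypothetical, for illustration only).
promoter_ref = kmer_counts(["TATAATGCTATAAT", "TTGACATATAAT"])
background_ref = kmer_counts(["GCGCGGCCGCGC", "CCGGCGCGGC"])
query = kmer_counts(["TTGACATATAATGC"])

id_promoter = diversity_increment(query, promoter_ref)
id_background = diversity_increment(query, background_ref)
label = "promoter" if id_promoter < id_background else "non-promoter"
```

In this sketch the query is assigned to the class whose reference counts give the smaller increment. In a hybrid method such as the one described, the increments would more plausibly serve as input features to a discriminant than as a classifier by themselves.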

Full Text