Abstract

The promoter region controls the transcription of a gene, so locating it is a crucial step in understanding gene regulation. Although many algorithms have been proposed, promoter recognition remains difficult because of the sheer volume and genetic diversity of the sequences, and performance is still limited by low sensitivity and a high false-positive rate. In this paper, we present a novel machine learning method for promoter prediction. First, the functional motifs in different regions of human promoter sequences are recognized using a Gaussian Mixture Model (GMM). The optimum number of GMM components is determined, without prior knowledge, by a fuzzy cluster recognition algorithm based on a fuzzy likelihood function. Then the promoter sequences are mapped into a high-dimensional Bayes space built from the positional densities of oligonucleotides. Finally, a Least Squares Support Vector Machine classifier is built with Kernel Locality Preserving Projection to predict promoter sequences, which simplifies the Least Squares Support Vector Machine to a least squares model. Simulation results show that performance improves over other promoter classifiers, that the proposed method can predict unknown promoters that have no known similar genes in the database, and that prediction is significantly faster.
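As a rough illustration of the feature-mapping step, the sketch below estimates a positional density for each oligonucleotide (k-mer) from a set of training sequences and then maps a sequence to one feature per k-mer. It is not the method of the paper: a simple position histogram stands in for the GMM, and the function names `positional_densities` and `featurize`, the parameters `k` and `n_bins`, and the choice of the mean density as the feature value are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import product


def positional_densities(seqs, k=2, n_bins=4):
    """Estimate, for each k-mer, a histogram density over relative position.

    Assumption: a histogram replaces the GMM described in the abstract.
    """
    counts = defaultdict(lambda: [0] * n_bins)
    for seq in seqs:
        n_pos = len(seq) - k + 1  # number of k-mer start positions
        for i in range(n_pos):
            b = min(i * n_bins // n_pos, n_bins - 1)  # relative-position bin
            counts[seq[i:i + k]][b] += 1
    # Normalize each k-mer's bin counts so they sum to 1.
    return {kmer: [x / sum(c) for x in c] for kmer, c in counts.items()}


def featurize(seq, densities, k=2, n_bins=4):
    """Map a sequence to one feature per k-mer (hypothetical feature choice):
    the mean learned density at the positions where that k-mer occurs."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    sums = dict.fromkeys(kmers, 0.0)
    hits = dict.fromkeys(kmers, 0)
    n_pos = len(seq) - k + 1
    for i in range(n_pos):
        kmer = seq[i:i + k]
        # Skip k-mers unseen in training or containing non-ACGT symbols.
        if kmer in densities and kmer in sums:
            b = min(i * n_bins // n_pos, n_bins - 1)
            sums[kmer] += densities[kmer][b]
            hits[kmer] += 1
    return [sums[m] / hits[m] if hits[m] else 0.0 for m in kmers]


# Toy data, invented for illustration only.
train = ["TATAAGCG", "TATACGCG"]
dens = positional_densities(train, k=2, n_bins=2)
vec = featurize("TATAAGCG", dens, k=2, n_bins=2)
```

The resulting vector has 4^k entries, so the dimension grows quickly with `k`. This is one reason a dimensionality reduction step such as the Kernel Locality Preserving Projection mentioned above would be useful before classification.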
