Abstract
Background: De novo prediction of Transcription Factor Binding Sites (TFBS) using computational methods is a difficult and important problem in Bioinformatics. The correct recognition of TFBS plays an important role in understanding the mechanisms of gene regulation and helps to develop new drugs.
Results: We present the Memetic Framework for Motif Discovery (MFMD), an algorithm that uses semi-greedy constructive heuristics as a local optimizer. In addition, we use a hybridization of the classic genetic algorithm as a global optimizer to refine the solutions initially found. MFMD can find and classify overrepresented patterns in DNA sequences and predict their respective initial positions. MFMD performance was assessed using ChIP-seq data retrieved from the JASPAR site, promoter sequences extracted from the ABS site, and artificially generated synthetic data. MFMD was evaluated and compared with two well-known approaches from the literature, MEME and Gibbs Motif Sampler, and achieved a higher f-score on most of the datasets used in this work.
Conclusions: We have developed an approach for detecting motifs in biopolymer sequences. MFMD is freely available software that may serve as a promising alternative in the development of new tools for de novo motif discovery. The open-source code can be downloaded at https://github.com/jadermcg/mfmd.
Highlights
De novo prediction of Transcription Factor Binding Sites (TFBS) using computational methods is a difficult task and it is an important problem in Bioinformatics
In previous work, we developed two approaches called Discovery Motifs by Evolutionary Computation (DMEC) [17] and Discovery Motifs by Memetic Algorithms (DMMA) [18]
In this work we propose a new algorithm for motif discovery in DNA sequences that uses local search and evolutionary algorithms as an optimization strategy
Summary
De novo prediction of Transcription Factor Binding Sites (TFBS) using computational methods is a difficult and important problem in Bioinformatics. The localization of the motifs must be learned without prior knowledge; for that reason, this problem is called de novo motif discovery [2]. Transcription factors are specific proteins that bind to distinct sites on the genome. This binding is an essential process in gene regulation and may lead to changes in transcriptional activity for a particular target gene [3]. These sites are short (< 30 bp) and have a typical nucleotide sequence, although variations normally occur due to mutations that arose from the selective pressure that the genome has undergone over time [4].
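To make the idea of a short, mostly conserved binding site concrete, the following Python sketch scores one candidate site per sequence by how conserved the aligned columns are. It is only an illustration under stated assumptions: the toy sequences, the start positions, the pseudocount, the uniform background, and the function names are all hypothetical, and the information-content score is a common textbook choice, not necessarily the objective that MFMD optimizes.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # DNA alphabet; a uniform background (0.25 each) is assumed


def count_matrix(sequences, starts, width):
    """Count nucleotides per motif column, taking one candidate site per sequence."""
    sites = [seq[s:s + width] for seq, s in zip(sequences, starts)]
    return [Counter(site[j] for site in sites) for j in range(width)]


def consensus(counts):
    """Most frequent base in each column (ties resolved in ALPHABET order)."""
    return "".join(max(ALPHABET, key=lambda b: col[b]) for col in counts)


def information_content(counts, pseudocount=0.5):
    """Sum over columns of the relative entropy (bits) against a uniform background."""
    total = 0.0
    for col in counts:
        n = sum(col.values()) + pseudocount * len(ALPHABET)
        for b in ALPHABET:
            p = (col[b] + pseudocount) / n
            total += p * math.log2(p / 0.25)
    return total


# Toy data (hypothetical): each sequence hides a variant of the same 8-bp site.
sequences = ["TTGACGTCAAT", "CCTGACGTCAG", "ATGACGTAAGG"]
aligned = count_matrix(sequences, starts=[1, 2, 1], width=8)
misaligned = count_matrix(sequences, starts=[0, 0, 0], width=8)

print(consensus(aligned))
print(information_content(aligned) > information_content(misaligned))
```

In this toy setting, a de novo method has to search over the start positions without being told where the sites are. The third sequence carries a single mismatch, which mirrors the mutational variation described above.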