Abstract

Genome sequence data consists of DNA sequences, also called input sequences. Each sequence is composed of nucleotides, represented by the characters 'A', 'C', 'G', and 'T', and contains groups of motif sequences called Transcription Factor Binding Sites (TFBSs), which are subsequences of DNA that lead to protein synthesis. The detection of TFBSs is an important problem in bioinformatics research. Because the motif sequences in TFBSs share similar patterns, computational algorithms for TFBS detection have been developed to reduce the resources used in laboratory settings. Metaheuristic algorithms are an important class of such methods and have been continually improved to detect TFBSs with greater precision and recall. This paper proposes PSO_HD, which applies Particle Swarm Optimization (PSO) as a pre-processing step and uses Hamming distance to improve the efficiency of detecting TFBSs with higher precision and recall. To measure its efficiency, the paper compares TFBS detection by the PSO_HD algorithm with that of related algorithms on eight datasets, using the F-score as the evaluation metric. The experimental results show that PSO_HD achieves the highest average F-score, which indicates that it can improve the efficiency of detecting TFBSs with higher precision and recall.
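
As an illustration of the two measures named in the abstract, the following minimal Python sketch computes the Hamming distance between equal-length nucleotide strings, finds the window of a sequence closest to a candidate motif, and computes an F-score from precision and recall. It is not the PSO_HD algorithm itself. The function names and the example sequence are hypothetical, and the F-score is assumed to be the balanced F1 measure (the harmonic mean of precision and recall), which the abstract does not state explicitly.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def best_window(sequence: str, motif: str) -> tuple[int, int]:
    """Return (start, distance) of the window of `sequence` closest to `motif`.

    Hypothetical helper for illustration: it slides a window of len(motif)
    over the sequence and keeps the first window with the smallest distance.
    """
    width = len(motif)
    if width == 0 or width > len(sequence):
        raise ValueError("motif must be non-empty and no longer than sequence")
    candidates = (
        (hamming_distance(sequence[i:i + width], motif), i)
        for i in range(len(sequence) - width + 1)
    )
    distance, start = min(candidates)
    return start, distance


def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (assumed F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(hamming_distance("ACGT", "ACGA"))
    print(best_window("ACGTTGCA", "TTGA"))
    print(f_score(0.5, 1.0))
```

A smaller Hamming distance means a window matches the candidate motif more closely, and a higher F-score means a better balance between precision and recall.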
