Abstract

This paper aims to improve the performance of the conventional Goertzel algorithm in identifying protein-coding regions in deoxyribonucleic acid (DNA) sequences. First, the symbolic DNA sequences are converted into numerical signals using the electron-ion interaction potential (EIIP) method. Then, by combining a modified anti-notch filter with a linear predictive coding model, we propose an efficient algorithm that improves the performance of the Goertzel algorithm in estimating genetic regions. Finally, a thresholding method is applied to precisely identify the exon and intron regions. The proposed algorithm is applied to several genes, including genes from the BG570 and HMR195 databases, and the results are compared with those of other methods using nucleotide-level evaluation criteria. The results demonstrate that the proposed method reduces the number of nucleotides incorrectly estimated to be in the noncoding region. In addition, the area under the receiver operating characteristic curve improves by factors of 1.35 and 1.12 on the HMR195 and BG570 datasets, respectively, compared with the conventional Goertzel algorithm.
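
As a rough illustration of the baseline this work builds on, the sketch below maps a DNA string to EIIP values and evaluates the period-3 spectral component with a conventional Goertzel recursion over a sliding window, followed by a simple threshold. It does not reproduce the modified anti-notch filter or the linear predictive coding stage. The function names, the window length of 351 and the threshold value are assumptions chosen for illustration; they are not taken from the paper.

```python
import math

# Commonly tabulated EIIP values for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_encode(seq):
    """Convert a symbolic DNA sequence into a numerical EIIP signal."""
    return [EIIP[base] for base in seq.upper()]


def goertzel_power(x, k):
    """Return |X[k]|^2 of the length-N DFT of x via the Goertzel recursion."""
    n = len(x)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for sample in x:
        s0 = sample + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def period3_spectrum(seq, window=351):
    """Slide a window over the sequence and measure power at bin N/3.

    The window length (hypothetical default 351) must be a multiple of 3
    so that the period-3 frequency falls exactly on a DFT bin.
    """
    if window % 3 != 0:
        raise ValueError("window must be a multiple of 3")
    x = eiip_encode(seq)
    k = window // 3
    return [
        goertzel_power(x[i:i + window], k)
        for i in range(len(x) - window + 1)
    ]


def threshold_regions(power, threshold):
    """Flag window positions whose period-3 power exceeds a chosen threshold."""
    return [p > threshold for p in power]
```

A call such as `threshold_regions(period3_spectrum(sequence), threshold)` then yields a per-position flag for candidate coding regions; in practice the threshold would need to be tuned on a labelled dataset.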
