Abstract
Protein coding regions play a significant role in gene applications of genomic signal processing. Unlike in prokaryotes, the coding regions in eukaryotes are arranged in a random manner. Because the coding regions have unequal lengths and occupy only a small fraction of the sequence, identifying them is cumbersome. In this work, a new numerical mapping method based on Walsh codes is proposed to detect the coding regions in eukaryotes. The Walsh code for each nucleotide is obtained using the statistical features of a DNA sequence. The proposed method uses a static mapping to convert a string of DNA nucleotides into a numerical sequence. The numerical sequence is given as input to a digital signal processing based spectrum identification tool that detects quasi-periodic components within the coding region. The advantage of our method is that it is simple to design and easy to represent. The performance of the proposed method has been tested on four benchmark databases and on a random set of sequences collected from the National Center for Biotechnology Information (NCBI) database. Furthermore, it has been compared with other state-of-the-art spectrum based numerical mapping methods using performance measures such as sensitivity, specificity and accuracy. The proposed method attains 94% accuracy, 85% sensitivity and 96% specificity when tested on the benchmark C. elegans gene sequence.
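The pipeline described above (map nucleotides to Walsh codes, then look for a periodic component in the spectrum) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation. The abstract gives neither the code assignment, which the paper derives from statistical features of the sequence, nor the spectrum tool. The `WALSH` table below is therefore a placeholder that assigns rows of a 4x4 Hadamard matrix to nucleotides arbitrarily. The sliding-window DFT evaluated at frequency 1/3 is likewise an assumed choice, based on the well-known period-3 property of coding regions. The names `walsh_map` and `period3_power` and the `window` and `step` parameters are hypothetical.

```python
import numpy as np

# Placeholder assignment (an assumption): rows of a 4x4 Hadamard matrix.
# The paper derives its assignment from sequence statistics instead.
WALSH = {
    "A": (1, 1, 1, 1),
    "C": (1, -1, 1, -1),
    "G": (1, 1, -1, -1),
    "T": (1, -1, -1, 1),
}


def walsh_map(seq):
    """Static mapping: return an (N, 4) array with one Walsh code per nucleotide."""
    return np.array([WALSH[base] for base in seq.upper()], dtype=float)


def period3_power(seq, window=351, step=3):
    """Spectral power at frequency 1/3 in each sliding window.

    Each of the four code positions is treated as a separate numerical
    channel; the per-channel powers are summed for every window.
    """
    x = walsh_map(seq)
    n = np.arange(window)
    kernel = np.exp(-2j * np.pi * n / 3)  # DFT basis vector at frequency 1/3
    starts = range(0, len(seq) - window + 1, step)
    power = [np.sum(np.abs(kernel @ x[s:s + window]) ** 2) for s in starts]
    return np.array(power)
```

Under these assumptions, windows with a strong period-3 component give large values of `period3_power`, while a constant sequence gives values near zero. Turning the power curve into predicted coding regions would still require a threshold, which the abstract does not describe.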