Gene prediction by the noise-assisted MEMD and wavelet transform for identifying the protein coding regions

Qian Zheng, Tao Chen, Wenxiang Zhou, Lei Xie, Hongye Su

doi:10.1016/j.bbe.2020.12.005

Abstract

The analysis of protein-coding regions of DNA sequences is one of the most fundamental applications in bioinformatics. A number of model-independent approaches have been developed for differentiating between the protein-coding and non-protein-coding regions of DNA. However, these methods are often based on univariate analysis algorithms, which leads to the loss of joint information among the four nucleotides of DNA. In this article, we introduce a method based on the noise-assisted multivariate empirical mode decomposition (NA-MEMD) and the modified Gabor-wavelet transform (MGWT). The NA-MEMD algorithm, as a multivariate analysis tool, is used to reconstruct the numerical sequence under analysis, since it enables a matched-scale decomposition across all variables and eliminates mode mixing. By virtue of NA-MEMD, the MGWT method achieves a stable improvement in general identification performance. We compare our method with other digital signal processing (DSP) methods on two representative DNA sequences and three benchmark datasets. The results reveal that our method can enhance the spectra of the analyzed sequences and improve the robustness of MGWT to different DNA sequences, thus obtaining higher identification accuracies of protein-coding regions than the other applied methods. In addition, another comparative experiment with the model-dependent method AUGUSTUS on the recently proposed benchmark dataset G3PO verifies the superiority of model-independent methods (especially NA-MEMD-MGWT) for identifying coding regions of poor-quality DNA sequences.
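To make the DSP setting concrete, the following is a minimal sketch of a classic baseline that is not the NA-MEMD-MGWT method itself: a sliding-window Fourier measure of period-3 power. It rests on assumptions the abstract does not state. It assumes the DNA string is mapped to four binary indicator sequences (one per nucleotide), and that coding regions show elevated spectral power at frequency 1/3. The function names, the window length of 351, and the step of 3 are illustrative choices, not values taken from the paper.

```python
import cmath
import math

NUCLEOTIDES = "ACGT"


def indicator_sequences(dna):
    """Map a DNA string to four binary indicator sequences, one per nucleotide.

    This four-channel numerical mapping is an assumption made for illustration.
    """
    dna = dna.upper()
    return {n: [1.0 if base == n else 0.0 for base in dna] for n in NUCLEOTIDES}


def period3_power(dna, window=351, step=3):
    """Sliding-window spectral power at frequency 1/3, summed over the channels.

    window and step are hypothetical defaults; window should be a multiple of 3
    so that the DFT bin falls exactly on the period-3 frequency.
    """
    channels = indicator_sequences(dna)
    # Complex exponentials for the single DFT component at frequency 1/3.
    twiddle = [cmath.exp(-2j * math.pi * k / 3) for k in range(window)]
    powers = []
    for start in range(0, len(dna) - window + 1, step):
        total = 0.0
        for seq in channels.values():
            coeff = sum(seq[start + k] * twiddle[k] for k in range(window))
            total += abs(coeff) ** 2
        powers.append(total)
    return powers


if __name__ == "__main__":
    periodic = "ATG" * 20   # strong period-3 structure
    uniform = "A" * 60      # no period-3 structure
    print(period3_power(periodic, window=30)[:3])
    print(period3_power(uniform, window=30)[:3])
```

In this sketch each channel is transformed separately and the powers are simply added, so any joint structure across the four nucleotide channels is ignored. That is the kind of univariate treatment the abstract identifies as a limitation and that a multivariate decomposition such as NA-MEMD is intended to address.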

Journal: Biocybernetics and Biomedical Engineering
Publication Date: Jan 1, 2021
Citations: 9

