Abstract

Accurate prediction of genes in genomes has long been a challenging task for bioinformaticians and computational biologists. The discovery of relations in coding and non-coding sequences has led to new perspectives on DNA sequences, which motivated us to seek new methods to distinguish coding from non-coding sequences. We first introduce a number sequence representation of DNA sequences. Multi-affinity analysis and local Hölder exponent analysis are then performed on the resulting number sequence. Three suitable exponents are selected to form a parameter space: the two exponents γ(−2) and γ(6) come from the multi-affinity analysis, and the exponent h comes from the local Hölder exponent analysis. Thus, each coding or non-coding sequence can be represented by a point in the three-dimensional parameter space. The points corresponding to coding and non-coding sequences in the complete genomes of many prokaryotes fall roughly into different regions. If the point (γ(−2), γ(6), h) for a DNA sequence lies in the region corresponding to coding sequences, the sequence is classified as coding; otherwise, it is classified as non-coding. These exponents can therefore be used to distinguish coding from non-coding sequences. Fisher's discriminant algorithm is used to obtain the discriminant accuracies. The average discriminant accuracies p_c, p_nc, q_c and q_nc over all 51 prokaryotes obtained by the present method reach 69.08%, 83.34%, 72.08% and 83.54%, respectively.
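The classification step described above, separating points (γ(−2), γ(6), h) into a coding and a non-coding region with Fisher's discriminant, can be illustrated with a minimal sketch. It is not the authors' implementation. The cluster centres, spread, sample sizes, midpoint threshold and all function names below are hypothetical choices made for illustration, and the points are synthetic rather than exponents computed from real genomes.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mean(points):
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(3)]

def scatter(points, m):
    # Within-class scatter: sum of outer products of deviations from the mean.
    s = [[0.0] * 3 for _ in range(3)]
    for p in points:
        d = [p[k] - m[k] for k in range(3)]
        for i in range(3):
            for j in range(3):
                s[i][j] += d[i] * d[j]
    return s

def solve3(a, b):
    # Gaussian elimination with partial pivoting for the 3x3 system a x = b.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0] * 3
    for i in reversed(range(3)):
        x[i] = (m[i][3] - sum(m[i][j] * x[j] for j in range(i + 1, 3))) / m[i][i]
    return x

def fit_fisher(coding, noncoding):
    # Fisher direction w solves S_w w = (m_coding - m_noncoding).
    mc, mn = mean(coding), mean(noncoding)
    sc, sn = scatter(coding, mc), scatter(noncoding, mn)
    sw = [[sc[i][j] + sn[i][j] for j in range(3)] for i in range(3)]
    w = solve3(sw, [mc[k] - mn[k] for k in range(3)])
    # Assumed threshold: midpoint of the two projected class means.
    t = 0.5 * (dot(w, mc) + dot(w, mn))
    return w, t

def is_coding(point, w, t):
    return dot(w, point) > t

# Synthetic stand-ins for (gamma(-2), gamma(6), h); the values are made up.
CODING_CENTER = (1.0, 0.5, 0.6)
NONCODING_CENTER = (0.2, 0.1, 0.3)
rng = random.Random(0)

def sample(center, n):
    return [[rng.gauss(c, 0.1) for c in center] for _ in range(n)]

coding = sample(CODING_CENTER, 200)
noncoding = sample(NONCODING_CENTER, 200)

w, t = fit_fisher(coding, noncoding)
acc_coding = sum(is_coding(p, w, t) for p in coding) / len(coding)
acc_noncoding = sum(not is_coding(p, w, t) for p in noncoding) / len(noncoding)
print("fraction of coding points classified as coding:", acc_coding)
print("fraction of non-coding points classified as non-coding:", acc_noncoding)
```

The sketch reports only the fraction of correctly classified points on the same synthetic data used for fitting. How the accuracies p_c, p_nc, q_c and q_nc are defined is not stated in this abstract, so the sketch does not attempt to reproduce them.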
