Abstract

This chapter reviews, for the scientific layman, the computational techniques for identifying genes in DNA sequences and describes the working principles, capabilities, and limitations of gene identification software. Some attention is also given to likely future developments. The emphasis is on eukaryotes, the domain in which the problem is both most interesting and most difficult. Two types of computational analysis are normally performed on essentially every newly determined DNA sequence. The first is a database search that compares the new sequence with existing collections (nucleotide sequences, amino acid sequences, or motifs). The second, the topic of this chapter, is a search for protein-coding regions, or genes. The chapter describes the three primary means of gathering clues about the existence, location, and function of genes: database similarity search, statistical regularities of coding regions, and pattern recognition of functional sites. The purpose of this review is to provide an overview of these techniques for the reader who would like to understand, at a high level, how computational gene identification is done.

