Abstract

Expressed sequence tags (ESTs), which have piled up considerably so far, provide a valuable resource for finding new genes, disease-relevant genes, and for recognizing alternative splicing variants, SNP sites, etc. The prerequisite for carrying out these researches is to correctly ascertain the gene-sequence-related ESTs. Based on analysis of the alignment results between some known gene sequences and ESTs in public database, several measures including Identity Check, Gap Check, Inclusion Check and Length Check have been introduced to judge whether an EST alignment is related to a gene sequence or not. A computational program EDSAcl.O has been developed to identify true EST alignments and exon regions of query gene sequences. When tested with human gene sequences in the standard dataset HMR195 and evaluated with the standard measures of gene prediction performance, EDSAcl.O can identify proteincoding regions with specificity of 0.997 and sensitivity of 0.88 at the nucleotide level, which outperform that of the counterpart TAP. A web server of EDSAcl.0 is available at http://infosci.hust.edu.cn.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call