Abstract

In the Full-length Human cDNA Sequencing Project, 30,160 cDNAs were sequenced. Of these, our group sequenced 3,588 cDNAs, mainly by the primer walking method. By sequencing both strands and requiring a Phrap score over 30, we obtained sequences with an average Phrap score of 76, which corresponds to an average expected sequence accuracy of 99.9999975%. Despite this extremely high sequence reliability, we had difficulty sequencing 52 cDNAs, which we term undecipherable cDNAs. cDNAs containing long repeats were considered a possible source of the sequencing difficulty; the longest repeat sequenced by primer walking alone, without using the random method, was 530 bp, and 81% of the long repeat sequences were located in ORFs. In single repeat regions, the insertion/deletion rates were much higher than in ordinary regions. The fraction of SINE/Alu repeats in the cDNAs was 5.4%, about half the fraction in the human genome. The fraction of SINE/Alu in undecipherable cDNAs was up to 10%, the same level as in the human genome.
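
The relation between the quoted Phrap scores and the accuracy figure can be checked with a short calculation. The abstract does not state the formula, so this is a minimal sketch assuming the standard Phred-scale convention, in which a quality score Q corresponds to an error probability of 10^(-Q/10). The function names are illustrative and do not come from the project.

```python
def error_probability(q: float) -> float:
    """Per-base error probability for a Phred-scaled quality score q.

    Assumes the standard convention Q = -10 * log10(P_error).
    """
    return 10.0 ** (-q / 10.0)


def expected_accuracy(q: float) -> float:
    """Expected per-base accuracy, 1 - P_error, as a fraction."""
    return 1.0 - error_probability(q)


if __name__ == "__main__":
    # 30 is the acceptance criterion and 76 the average score quoted above.
    for q in (30, 76):
        print(f"Q{q}: error {error_probability(q):.3g}, "
              f"accuracy {100 * expected_accuracy(q):.7f}%")
```

Under this assumption, a score of 30 corresponds to one expected error per 1,000 bases, and a score of 76 reproduces the 99.9999975% accuracy given in the abstract.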
