Abstract

Next-generation sequencing (NGS) and comparative genome analysis have recently attracted considerable research attention. Analysis of transposable element composition and abundance is an important part of genome studies. Such analyses have generally been based on completely assembled genomes; however, post-processing and assembly of the huge number of short sequences produced by the 454 sequencer often run into problems. Moreover, large numbers of repeat elements made up of transposable elements may be incorrectly assembled or lost, leading to uncertain results. This study aimed to construct a framework that automatically analyzes insertion sequence (IS) abundance and composition from a simulated Roche 454 deep-sequencing data set with 33-fold coverage of the Microcystis aeruginosa NIES 843 genome. The most reliable result was obtained by dividing the IS element candidates into three classes and applying separate transposase examination thresholds. Under this setting, the abundance of IS elements in the simulated data set was 10.38%, comprising 14 IS families and 66 IS subfamilies. These values did not differ significantly from two sets of previous analysis results based on the assembled M. aeruginosa NIES 843 genome, and a high percentage of the IS element sequences overlapped, indicating that the framework is reliable.
