Abstract

Genome and transcriptome assembly data often contain contaminating DNA and RNA from external organisms, introduced during nucleotide extraction or sequencing. In this study, contamination of seed plant (Spermatophyta) transcriptomes and genomes with RNA or DNA encoding the p25alpha domain was systematically investigated. This domain occurs only in organisms possessing a eukaryotic flagellum (cilium), which seed plants usually do not have. Spermatophyta nucleotide sequences available at the National Center for Biotechnology Information website, including transcriptome shotgun assemblies (TSAs), whole-genome shotgun contigs (WGSs), and expressed sequence tags (ESTs), were searched for sequences encoding a p25alpha domain. Although seed plants lack proteins containing the p25alpha domain, fragments or complete mRNAs encoding it were found in some EST and TSA databases. A phylogenetic analysis showed that these were contaminants whose possible sources were microorganisms (flagellated fungi, protists) and arthropods or worms; however, in some cases it could not be excluded that the sequences found were genuine hits rather than of external origin.

