Abstract

Ciliates are a large group of ubiquitous and highly diverse single-celled eukaryotes that play an essential role in the functioning of microbial food webs. However, their genomic diversity remains poorly understood because cultivation methods have yet to be developed for most species, so most research relies on wild organisms that almost invariably contain contaminants. Here we establish an integrated Genome Decontamination Pipeline (iGDP) that combines homology search, telomere read-assisted filtering and clustering approaches to decontaminate ciliate genome assemblies from wild specimens. We benchmarked iGDP using genomic data from a contaminated ciliate culture, and the results showed that iGDP recalled 91.9% of the target sequences with 96.9% precision. We also used a synthetic dataset to provide guidelines for applying iGDP to the removal of various groups of contaminants. iGDP performed better than several popular metagenome binning tools. To further validate the effectiveness of iGDP on real-world data, we applied it to decontaminate genome assemblies of three wild ciliate specimens and obtained high-quality genomes comparable to those of well-studied model ciliates. We anticipate that the newly generated genomes and the iGDP method will be valuable community resources for detailed studies of ciliate biodiversity, phylogeny, ecology and evolution. The pipeline (https://github.com/GWang2022/iGDP) can be run automatically to reduce manual filtering and classification, and may be further developed for application to other microeukaryotes.
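For readers unfamiliar with the benchmark metrics, the following is a minimal sketch of how recall and precision could be computed for a decontamination run, assuming the evaluation is done by counting contigs. The abstract does not say whether sequences are counted by number or by total length, and the function names and contig identifiers below are hypothetical illustrations, not part of iGDP.

```python
def precision_recall(kept, target):
    """Return (precision, recall) for a decontaminated contig set.

    kept   -- contig IDs retained by the filter (hypothetical input)
    target -- contig IDs known to belong to the ciliate (ground truth)
    """
    kept, target = set(kept), set(target)
    true_positives = len(kept & target)
    # Precision: fraction of retained contigs that are genuine target.
    precision = true_positives / len(kept) if kept else 0.0
    # Recall: fraction of genuine target contigs that were retained.
    recall = true_positives / len(target) if target else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (assumed for illustration only, not from the study).
target_contigs = {"c1", "c2", "c3", "c4"}
kept_contigs = {"c1", "c2", "c3", "bacterial_1"}

p, r = precision_recall(kept_contigs, target_contigs)
print(f"precision={p:.2f} recall={r:.2f} F1={f1_score(p, r):.2f}")
```

Under this definition, the reported 96.9% precision and 91.9% recall would mean that about 3% of the retained sequences are contaminants and about 8% of the true ciliate sequences are lost during filtering.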
