Abstract

High-throughput sequencing (HTS) surveys, among the most common approaches currently used in environmental microbiology, require reliable reference databases to be correctly interpreted. The EukRef Initiative (eukref.org) is a community effort to manually screen available small subunit (SSU) rRNA gene sequences and produce a public, high-quality and informative framework of phylogeny-based taxonomic annotations. In the context of EukRef, we present a database for the monophyletic phylum Ciliophora, one of the most complex, diverse and ubiquitous protist groups. We retrieved more than 11,500 ciliate sequences from GenBank (28% from identified isolates and 72% from environmental surveys). Our approach included the inference of phylogenetic trees for every ciliate lineage and produced the largest SSU rRNA tree of the phylum Ciliophora to date. We flagged approximately 750 chimeric or low-quality sequences, improved the classification of 70% of GenBank entries and enriched environmental and literature metadata by 30%. EukRef-Ciliophora outperforms the current SILVA database in classifying HTS reads from a global marine survey. Comprehensive outputs are publicly available to make the new tool a useful guide for non-specialists and a quick reference for experts.
