Abstract
Background: Marker gene studies often use short amplicons spanning one or more hypervariable regions from an rRNA gene to interrogate the community structure of uncultured environmental samples. Target regions are chosen for their discriminatory power, but the limited phylogenetic signal of short high-throughput sequencing reads precludes accurate phylogenetic analysis. This is particularly unfortunate in the study of microscopic eukaryotes where horizontal gene flow is limited and the rRNA gene is expected to accurately reflect the species phylogeny. A promising alternative to full phylogenetic analysis is phylogenetic placement, where a reference phylogeny is inferred using the complete marker gene and iteratively extended with the short sequences from a metagenetic sample under study.
Results: Based on the phylogenetic placement approach we built Séance, a community analysis pipeline focused on the analysis of 18S marker gene data. Séance combines the alignment extension and phylogenetic placement capabilities of the Pagan multiple sequence alignment program with a suite of tools to preprocess, cluster and visualise datasets composed of many samples. We showcase Séance by analysing 454 data from a longitudinal study of intestinal parasite communities in wild rufous mouse lemurs (Microcebus rufus) as well as in simulation. We demonstrate improved OTU picking at higher levels of sequence similarity for 454 data and show that the accuracy of phylogenetic placement is comparable to maximum likelihood methods for lower numbers of taxa.
Conclusions: Séance is an open source community analysis pipeline that provides reference-based phylogenetic analysis for rRNA marker gene studies. Whilst in this article we focus on studying nematodes using the 18S marker gene, the concepts are generic and reference data for alternative marker genes can be easily created. Séance can be downloaded from http://wasabiapp.org/software/seance/.
Electronic supplementary material: The online version of this article (doi:10.1186/s12862-014-0235-7) contains supplementary material, which is available to authorized users.
Highlights
Marker gene studies often use short amplicons spanning one or more hypervariable regions from a ribosomal RNA (rRNA) gene to interrogate the community structure of uncultured environmental samples
Ribosomal RNA marker gene studies remain central in the characterisation of the community structure of uncultured microbes and microscopic eukaryotes in environmental samples
Methodological advances have focused on improving our ability to accurately estimate the level of species diversity given the large number of potential confounders, for example, appropriate techniques for read filtering and quality trimming [5], the identification and removal of artefacts from amplification [6] and pyrosequencing [7,8] and improved methods for clustering reads into operational taxonomic units (OTUs) [9]
Summary
Marker gene studies often use short amplicons spanning one or more hypervariable regions from an rRNA gene to interrogate the community structure of uncultured environmental samples. Methodological advances have focused on improving our ability to accurately estimate the level of species diversity given the large number of potential confounders, for example, appropriate techniques for read filtering and quality trimming [5], the identification and removal of artefacts from amplification [6] and pyrosequencing [7,8] and improved methods for clustering reads into operational taxonomic units (OTUs) [9]. Many of these advances have been incorporated into integrated pipelines [10,11] and packaged along with traditional phylogenetic analysis tools. While phylogenetic analysis can be appropriate for data from older sequencing technologies, the limited information content of short amplicon sequences constrains our ability to make accurate inferences, leading to topological errors and inappropriate branch lengths.