Abstract

High-quality and high-throughput sequencing technologies are required for therapeutic and diagnostic analyses of human gut microbiota. Here, we evaluated the advantages and disadvantages of the various commercial sequencing platforms for studying human gut microbiota. We generated fecal bacterial sequences from 170 Korean subjects using the GS FLX+ (V1–4), Illumina MiSeq (V1–3, V3–4 and V4), and PacBio (V1–9) systems. Comparative analyses revealed that the PacBio data showed the weakest relationship with the reference whole-metagenome shotgun datasets. The PacBio system generated sequences with a significantly higher level of deletions than the other platforms, and an abnormally high proportion of its sequences was assigned to the phylum Proteobacteria. Low sequencing accuracy and low coverage of terminal regions in public 16S rRNA databases undermine the advantages of long read length, resulting in low taxonomic resolution in amplicon sequencing of human gut microbiota.

Highlights

  • Background & Summary: In microbial ecology, next-generation sequencing (NGS) followed by computational analysis has become routine practice for phylogenetic analysis of bacterial communities in various ecosystems

  • The results reveal that indel errors are the main driver of differences in the gut microbial community profiles obtained from the Pacific Biosciences (PacBio) dataset

  • Platform comparison for taxonomy profiling: To determine whether the difference in read lengths affects the ability to assign a sequence to the lower ranks of taxonomic lineages, we analyzed the proportion of assigned sequences from the GS FLX+ (n = 169), MiSeq V4 (n = 169), and PacBio (n = 29) datasets at the family, genus, and species levels


Summary

Introduction

Background & Summary: In microbial ecology, next-generation sequencing (NGS) followed by computational analysis has become routine practice for phylogenetic analysis of bacterial communities in various ecosystems.

Platform comparison for taxonomy profiling: To determine whether the difference in read lengths affects the ability to assign a sequence to the lower ranks of taxonomic lineages, we analyzed the proportion of assigned sequences from the GS FLX+ (n = 169), MiSeq V4 (n = 169), and PacBio (n = 29) datasets at the family, genus, and species levels.
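
The proportion of assigned sequences at each rank can be illustrated with a minimal Python sketch. This is not the study's pipeline: the function name `assigned_proportions`, the per-read dictionary layout, and the toy taxon labels are all assumptions made for illustration, and a read is counted as assigned at a rank whenever it carries a non-empty label there.

```python
from typing import Dict, Optional, Sequence

# Ranks compared in the text: family, genus, and species.
RANKS = ("family", "genus", "species")


def assigned_proportions(
    reads: Sequence[Dict[str, Optional[str]]],
    ranks: Sequence[str] = RANKS,
) -> Dict[str, float]:
    """Return, for each rank, the fraction of reads with a label at that rank.

    Hypothetical helper: each read is a dict mapping rank name to a taxon
    label, with None (or a missing key) meaning "unassigned at this rank".
    """
    if not reads:
        return {rank: 0.0 for rank in ranks}
    total = len(reads)
    return {
        rank: sum(1 for read in reads if read.get(rank)) / total
        for rank in ranks
    }


# Toy data (invented for illustration, not taken from the study).
toy_reads = [
    {"family": "Bacteroidaceae", "genus": "Bacteroides", "species": "Bacteroides fragilis"},
    {"family": "Bacteroidaceae", "genus": "Bacteroides", "species": None},
    {"family": "Lachnospiraceae", "genus": None, "species": None},
    {"family": None, "genus": None, "species": None},
]

proportions = assigned_proportions(toy_reads)
print(proportions)
```

Computing the same three fractions separately for each platform's reads would give values that can be compared across platforms at the family, genus, and species levels.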

