Abstract

We develop a metagenomic data analysis pipeline, MicroPro, that takes into account all reads from known and unknown microbial organisms and associates viruses with complex diseases. We use MicroPro to analyze four metagenomic datasets relating to colorectal cancer, type 2 diabetes, and liver cirrhosis, and show that including reads from unknown organisms significantly increases the prediction accuracy of disease status for three of the four datasets. We identify new microbial organisms associated with these diseases and show that viruses play important predictive roles in colorectal cancer and liver cirrhosis, but not in type 2 diabetes. MicroPro is freely available at https://github.com/zifanzhu/MicroPro.

Highlights

  • We developed a metagenomic predictive pipeline, MicroPro, which analyzes data in three main steps: (1) reference-based known microbial abundance characterization: perform taxonomic profiling based on sequence alignment against reference genomes; (2) assembly-binning-based unknown organism feature extraction: use cross-assembly to assemble the combined unmapped reads from all samples and treat each assembled contig as originating from an “unknown” organism, meaning an organism with no known reference available in the database

  • MicroPro: a metagenomic disease-related prediction analysis pipeline taking unmapped reads into consideration. We developed a new metagenomic analysis pipeline, MicroPro, to take into account both known and unknown microbial organisms for the prediction of disease status
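
The idea of combining known and unknown organism features can be illustrated with a minimal sketch. It is not MicroPro's actual code: the function names, the `known:`/`unknown:` prefixes, the sample counts, and the normalization by total read count are all assumptions made for illustration. It shows how per-sample read counts for reference-mapped species and for bins of assembled "unknown" contigs could be merged into one feature vector.

```python
from typing import Dict, List


def relative_abundance(counts: Dict[str, int], total_reads: int) -> Dict[str, float]:
    """Convert raw read counts to fractions of the sample's total reads.

    Normalizing by total reads is an illustrative assumption, not
    necessarily the normalization MicroPro uses.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {name: n / total_reads for name, n in counts.items()}


def combined_features(
    known_counts: Dict[str, int],
    unknown_counts: Dict[str, int],
    total_reads: int,
    feature_order: List[str],
) -> List[float]:
    """Build one feature vector covering known species and unknown bins.

    Features missing from a sample are filled with 0.0 so that every
    sample yields a vector with the same layout.
    """
    known = relative_abundance(known_counts, total_reads)
    unknown = relative_abundance(unknown_counts, total_reads)
    # Prefix the names so known and unknown features cannot collide.
    merged = {f"known:{name}": value for name, value in known.items()}
    merged.update({f"unknown:{name}": value for name, value in unknown.items()})
    return [merged.get(name, 0.0) for name in feature_order]


# Hypothetical sample: 1000 reads, 700 mapped to references, 200 to an unknown bin.
feature_order = ["known:species_A", "known:species_B", "unknown:bin_1", "unknown:bin_2"]
sample_vector = combined_features(
    known_counts={"species_A": 600, "species_B": 100},
    unknown_counts={"bin_1": 200},
    total_reads=1000,
    feature_order=feature_order,
)
print(sample_vector)
```

Vectors built this way for every sample share the same column layout, so they could be stacked into a matrix and passed to any downstream classifier of disease status.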

Introduction

Trillions of microbes populate various sites of the human body and form microbiome communities [1]. These microorganisms, and their interactions with each other and with the host, play an important role in many physiological processes, including metabolism, reproduction, and immune system activity [2, 3]. Over the past 20 years, thanks to the rapid development of sequencing technology, sequencing-based methods have gradually replaced cultivation-based techniques and become the most widely used tools for microbial analysis. 16S ribosomal RNA sequencing, together with the more recent shotgun whole-genome sequencing, has uncovered large numbers of non-cultivable microbes and fundamentally changed the way microbial analysis is performed [6, 7]. Researchers are finding more evidence correlating human microbiota with
