Abstract

The amount of metagenomic data is growing rapidly, while computational methods for metagenome analysis are still in their infancy. Novel statistical learning tools are needed to predict associations between bacterial communities and disease phenotypes and to detect differentially abundant features. In this study, we present a statistical learning method for simultaneous association prediction and feature selection, based on count data from metagenomic samples drawn from two or more treatment populations. We developed a linear programming based support vector machine with L1 and joint L1,∞ penalties for binary and multiclass classification of metagenomic count data (metalinprog). We evaluated the performance of our method on several real and simulated datasets. The proposed method can simultaneously identify features and predict classes from metagenomic count data.

Highlights

  • The majority of microbes in the human body reside in the gut, have a profound influence on human physiology and nutrition, and are crucial for human life

  • We propose a novel supervised learning method using a linear programming (LP) based support vector machine (SVM) with a joint L1,∞ penalty for simultaneous feature selection and binary/multiclass prediction

  • We evaluate the performance of our tool on simulated, publicly available, and our own metagenomic data sets
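
To make the idea of an LP-based SVM with a sparsity penalty concrete, the following is a minimal sketch of the binary case with an L1 penalty only. It is not the metalinprog implementation and does not cover the joint L1,∞ multiclass penalty. It assumes SciPy's general-purpose `linprog` solver, and the function name `l1_lp_svm`, the trade-off parameter `C`, and the toy count matrix are hypothetical choices made for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def l1_lp_svm(X, y, C=1.0):
    """Fit a linear SVM with an L1 penalty by solving a linear program.

    X is an (n, d) feature matrix and y holds labels in {-1, +1}.
    The LP minimizes sum|w| + C * sum(xi) subject to
    y_i * (x_i . w + b) >= 1 - xi_i and xi_i >= 0.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, d = X.shape

    # Nonnegative decision variables: [p (d), q (d), b_pos, b_neg, xi (n)],
    # with w = p - q and b = b_pos - b_neg, so |w| is bounded by p + q.
    cost = np.concatenate([np.ones(2 * d), [0.0, 0.0], C * np.ones(n)])

    # Margin constraints rewritten in the "A_ub @ z <= b_ub" form:
    # -y_i * (x_i . (p - q) + b_pos - b_neg) - xi_i <= -1
    yX = y[:, None] * X
    A_ub = np.hstack([-yX, yX, -y[:, None], y[:, None], -np.eye(n)])
    b_ub = -np.ones(n)

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)

    z = res.x
    w = z[:d] - z[d:2 * d]
    b = z[2 * d] - z[2 * d + 1]
    return w, b


# Toy count data: only the first feature differs between the two classes.
X_toy = np.array([[10, 1, 0], [12, 0, 1], [11, 1, 1],
                  [1, 1, 0], [0, 0, 1], [2, 1, 1]])
y_toy = np.array([1, 1, 1, -1, -1, -1])
w_toy, b_toy = l1_lp_svm(X_toy, y_toy, C=1.0)
pred_toy = np.sign(X_toy @ w_toy + b_toy)
```

Features whose weights in `w_toy` are driven to zero by the L1 penalty are effectively deselected, which is how a formulation of this kind can couple feature selection with classification.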

Summary

Introduction

The majority of microbes in the human body reside in the gut, have a profound influence on human physiology and nutrition, and are crucial for human life. Metagenomics, the culture-independent isolation and characterization of DNA from uncultured microorganisms, has facilitated the analysis of the functional biodiversity harbored in the large reservoir of uncultured bacteria and archaea. Recent advances in genome sequencing technologies have made complete metagenomic sequencing more tractable [1]. Having such a large number of microbial genomes on hand has changed the nature of microbiology and of microbial evolution studies. A main promise of metagenomics is that it will accelerate drug discovery and biotechnology by providing new genes with novel functions [2,4].
