Abstract

Advances in metagenomic studies provide the opportunity to identify microbial taxa that are associated with human diseases. Multiple methods exist for this association analysis, but their results can be inconsistent, which makes host-microbiome interactions hard to interpret. To address this issue, we develop Meta-Signer, a novel Metagenomic Signature Identifier tool based on rank aggregation of features identified from multiple machine learning models, including Random Forest, Support Vector Machines, Logistic Regression, and Multi-Layer Perceptron Neural Networks. Meta-Signer generates ranked taxa lists by training individual machine learning models over multiple training partitions, then aggregates the ranked lists into a single list through an optimization procedure, so that the final list represents the most informative and robust microbial features. Users receive a rapid assessment of the predictive performance of each machine learning model using different numbers of the ranked features, and can then determine the final models to be used for evaluation on external datasets. Meta-Signer is user-friendly and customizable, allowing users to explore their datasets quickly and efficiently.

Highlights

  • Recent metagenomic studies of the gut microbiome have linked dysbiosis to many human diseases[1,2,3]

  • We introduce a novel tool, Meta-Signer, a Metagenomic Signature Identifier based on rank aggregation of informative taxa learned from individual machine learning (ML) models

  • After the model predictions are evaluated and the features are ranked into a single list, Meta-Signer provides a summary of the results in a portable HTML file

Summary

Introduction

Recent metagenomic studies of the gut microbiome have linked dysbiosis to many human diseases[1,2,3]. Various metagenomic studies use parametric or non-parametric statistical tests to detect individual taxa that are differentially abundant between disease and control groups[5,6,7,8,9]. Meta-Signer uses Random Forest (RF), Support Vector Machine (SVM), penalized Logistic Regression, and multi-layer perceptron neural network (MLPNN) models to evaluate the importance of each microbial taxon, and generates a ranked list of features (i.e., taxa) per model. It aggregates all the ranked lists using the “RankAggreg” procedure, which is based on the cross-entropy method or the genetic algorithm[26]. Meta-Signer is distributed as a Python tool and is available at https://github.com/YDaiLab/Meta-Signer.
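
To make the rank aggregation step concrete, the following is a minimal sketch and not Meta-Signer's implementation. It assumes the consensus list is the one that minimizes the summed Spearman footrule distance to the per-model lists, and it replaces the cross-entropy or genetic-algorithm search with an exhaustive search, which is only feasible for a handful of features. The taxon names, the per-model rankings, and all function names are hypothetical.

```python
from itertools import permutations


def footrule_distance(list_a, list_b):
    """Spearman footrule: sum over items of the absolute difference in position."""
    position_in_b = {item: i for i, item in enumerate(list_b)}
    return sum(abs(i - position_in_b[item]) for i, item in enumerate(list_a))


def total_distance(candidate, ranked_lists):
    """Objective (assumed): summed footrule distance from candidate to every list."""
    return sum(footrule_distance(candidate, ranked) for ranked in ranked_lists)


def aggregate_exhaustive(ranked_lists):
    """Return the ordering with the smallest total distance.

    Brute force over all permutations; a stand-in for the stochastic
    search (cross-entropy or genetic algorithm) named in the text.
    """
    items = sorted(ranked_lists[0])
    best = min(permutations(items), key=lambda c: total_distance(c, ranked_lists))
    return list(best)


# Hypothetical ranked taxa lists, one per model (most important first).
model_rankings = {
    "RF": ["taxon_A", "taxon_B", "taxon_C", "taxon_D"],
    "SVM": ["taxon_B", "taxon_A", "taxon_C", "taxon_D"],
    "LogReg": ["taxon_A", "taxon_C", "taxon_B", "taxon_D"],
    "MLPNN": ["taxon_A", "taxon_B", "taxon_D", "taxon_C"],
}

ranked_lists = list(model_rankings.values())
consensus = aggregate_exhaustive(ranked_lists)
print(consensus)
print(total_distance(consensus, ranked_lists))
```

Exhaustive search grows factorially with the number of taxa, which is why a stochastic optimizer is the practical choice for real feature sets; the objective being minimized is the part this sketch is meant to illustrate.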
