Abstract

The gut microbiome is shaped and modified by the polymorphisms of microorganisms in the intestinal tract. Its composition is highly specific to each individual and may play a crucial role in human digestion and metabolism. Several factors can affect the composition of the gut microbiome, such as eating habits, living environment, and antibiotic usage. As a result, gut microbiome characteristics differ among races. In the present study, we examined the gut microbiomes of individuals of three races: Asian, European and American. The gut microbiome and the expression levels of gut microbiome genes were analyzed in these individuals. Advanced feature selection methods (minimum redundancy maximum relevance, mRMR, and incremental feature selection, IFS) and four machine-learning algorithms (random forest, nearest neighbor algorithm, sequential minimal optimization, and Dagging) were employed to capture key differentially expressed genes. Sequential minimal optimization yielded the best performance, using 454 genes that could effectively distinguish the gut microbiomes of different races. Our analyses of the extracted genes support the widely accepted hypotheses that eating habits, living environments and metabolic levels in different races can influence the characteristics of the gut microbiome.

Highlights

  • In the present study, we tried four machine-learning algorithms (random forest, nearest neighbor algorithm, SMO, and Dagging) and selected the optimal one

  • Microorganisms are often considered to be small, single-celled or multi-cellular life forms that can only be observed via microscopy[1]

  • Each sample was RNA-sequenced and represented based on the expression levels of the 9,879,896 gut microbial genes. The goal of this analysis was to identify the most discriminative gut microbiome gene set that was differentially expressed among individuals from different races and investigate the differences in the human gut microbiome caused by food, lifestyle, race and other factors


Summary

Materials and Methods

In the present study, we tried four machine-learning algorithms (random forest, nearest neighbor algorithm, sequential minimal optimization (SMO), and Dagging) and selected the optimal one. The four algorithms are implemented by classifiers termed RandomForest, IB1, SMO, and Dagging, respectively. Each classifier was adopted in turn as the basic machine-learning algorithm to extract important features and to build an optimal prediction model, and all were executed with their default parameters. The incremental feature selection (IFS) method uses the mRMR feature list and a basic machine-learning algorithm. The feature set that yields the best key measurement is considered the optimal combination of features for classification.
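The IFS step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the mRMR-ranked feature list is already available, substitutes a hand-written 1-nearest-neighbor classifier for the classifiers named in the text, and uses leave-one-out accuracy as a stand-in for the unspecified "key measurement". All function names and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def loo_accuracy_1nn(X: List[Sequence[float]], y: List[str],
                     features: List[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier
    restricted to the given feature indices (assumed key measurement)."""
    correct = 0
    for i, xi in enumerate(X):
        best_dist, best_label = float("inf"), None
        for j, xj in enumerate(X):
            if i == j:
                continue  # never compare a sample with itself
            dist = sum((xi[f] - xj[f]) ** 2 for f in features)
            if dist < best_dist:
                best_dist, best_label = dist, y[j]
        if best_label == y[i]:
            correct += 1
    return correct / len(X)


def incremental_feature_selection(X: List[Sequence[float]], y: List[str],
                                  ranked: List[int]) -> Tuple[List[int], float]:
    """Evaluate the top-1, top-2, ... features of a ranked list and
    return the smallest prefix that achieves the best score."""
    best_k, best_score = 0, float("-inf")
    for k in range(1, len(ranked) + 1):
        score = loo_accuracy_1nn(X, y, ranked[:k])
        if score > best_score:  # strict: ties keep the smaller feature set
            best_k, best_score = k, score
    return ranked[:best_k], best_score


# Toy data (hypothetical): six samples, three features, three classes.
X = [[0, 0, 0], [2, 1, 9], [3, 6, 1], [5, 7, 8], [9, 0, 2], [12, 1, 7]]
y = ["Asian", "Asian", "European", "European", "American", "American"]
ranked = [0, 1, 2]  # assumed output of an mRMR ranking, best feature first

selected, best = incremental_feature_selection(X, y, ranked)
print(selected, best)
```

On this toy data the first feature alone misclassifies two samples, while the first two features together separate all three classes, so the sketch selects the two-feature prefix. In the study the same loop would run over the ranked gene list with each of the four classifiers in turn.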

