Abstract

Technological advances in next-generation sequencing (NGS) have made it possible to uncover extensive and dynamic alterations in diverse molecular components and biological pathways across healthy and diseased conditions. The large volumes of multi-omics data produced by NGS experiments require feature engineering, a crucial step in predictive modeling. The relationship between multi-omics features and insulin resistance is not well understood. In this study, we used multi-omics data on type II diabetes from the Integrative Human Microbiome Project, comprising 10,783 features, to elucidate the relationship between insulin resistance and multi-omics features, including microbiome data. To better explain the impact of microbiome features on insulin resistance classification, we applied a deep neural network interpretation algorithm to estimate each microbiome feature’s contribution to the discriminative model output across samples.

Highlights

  • High-throughput DNA sequencing platforms have become essential to gene expression profiling, epigenomics, genomics, and transcriptomics over the past ten years [1,2,3]

  • Classifying insulin resistance (IR) and insulin sensitivity (IS) with a small number of biomarkers is very challenging; we aim to do so by identifying biomarkers that distinguish IR from IS. Considering these converging challenges within the biomedical field, especially with respect to clinical translation, we evaluated whether disease-specific multi-omic variables are present in patients with IR, identified microbiome-based diagnostic signatures in a classifier setting, and interpreted how the selected features contributed to the model output

  • In a manner similar to grid search, we considered all combinations of hyperparameters, with batch sizes from 15 to 25 in steps of 5 and a suite of small standard learning rates from 0.0005 to 0.01; the number of nodes was set by halving the number of features
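The hyperparameter search described in the last highlight can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the individual learning-rate values (only the range 0.0005 to 0.01 is stated), the number of hidden layers, and the helper names `halved_layer_sizes` and `hyperparameter_grid` are all assumptions made for the example.

```python
from itertools import product

# Batch sizes from 15 to 25 in steps of 5, as stated in the text.
BATCH_SIZES = list(range(15, 30, 5))  # [15, 20, 25]

# ASSUMPTION: the source gives only the range 0.0005 to 0.01;
# these intermediate values are illustrative.
LEARNING_RATES = [0.0005, 0.001, 0.005, 0.01]


def halved_layer_sizes(n_features, n_layers=3):
    """Return hidden-layer widths obtained by repeatedly halving n_features.

    ASSUMPTION: the number of layers (3) is hypothetical; the source only
    says the node count was set by halving the number of features.
    """
    sizes = []
    width = n_features
    for _ in range(n_layers):
        width = max(1, width // 2)  # never drop below one node
        sizes.append(width)
    return sizes


def hyperparameter_grid(n_features):
    """Enumerate every (batch size, learning rate) combination."""
    layers = halved_layer_sizes(n_features)
    return [
        {"batch_size": b, "learning_rate": lr, "hidden_layers": layers}
        for b, lr in product(BATCH_SIZES, LEARNING_RATES)
    ]


if __name__ == "__main__":
    for config in hyperparameter_grid(n_features=40):
        print(config)
```

Each configuration would then be used to train and score one model; the training loop itself is omitted because the source does not describe it.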

Summary

Introduction

High-throughput DNA sequencing platforms have become essential to gene expression profiling, epigenomics, genomics, and transcriptomics over the past ten years [1,2,3]. Classifying IR and IS with a small number of biomarkers is very challenging; we aim to do so by identifying biomarkers that distinguish IR from IS. Considering these converging challenges within the biomedical field, especially with respect to clinical translation, we evaluated whether disease-specific multi-omic variables are present in patients with IR, identified microbiome-based diagnostic signatures in a classifier setting, and interpreted how the selected features contributed to the model output. The iHMP type II diabetes (T2D) project established a cohort of approximately 60 individuals at risk of diabetes and performed longitudinal multi-omic analysis to capture global microbiome-host changes. All developed predictive models were compared on the basis of the area under the receiver operating characteristic curve (AUC).
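To make the AUC-based comparison concrete, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive sample is scored above a randomly chosen negative one, with ties counted as one half. The label encoding (1 for IR, 0 for IS), the function name `roc_auc`, and the example scores are assumptions for illustration only; they are not data or code from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 class labels (ASSUMPTION: 1 = IR, 0 = IS).
    scores: iterable of model scores, higher meaning "more likely IR".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up labels and scores for two hypothetical models.
    labels = [1, 1, 0, 0, 1, 0]
    model_a = [0.9, 0.8, 0.3, 0.4, 0.7, 0.2]
    model_b = [0.6, 0.4, 0.5, 0.3, 0.7, 0.2]
    print("model A AUC:", roc_auc(labels, model_a))
    print("model B AUC:", roc_auc(labels, model_b))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a cohort of this size; a rank-based implementation would be preferable for larger datasets.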

Backward Elimination for Feature Selection
Predictive Models for IRIS with Selected Features
Predictive Models for IRIS with Microbiome Feature Substitution
Random Sample Permutation
Deep Neural Network Interpretation Algorithm
Statistical Analysis
Baseline Characteristics of the iHMP Dataset