Abstract

Effective biomarker identification and accurate sample label prediction remain challenging for complex diseases. Patient similarity network (PSN) analysis is a powerful tool for analyzing disease omics data: the topology of a PSN reflects the discriminative ability of the feature space on which the sample network is built. In this study, a novel omics data analysis method based on the sample reference network (DA-SRN) is proposed to identify potential biomarkers and predict sample categories. DA-SRN selects the informative features and defines the sample reference network by optimizing the network structure with a genetic algorithm. It then labels samples using a graph neural network together with the reference network and the selected informative features. To validate the method, DA-SRN was compared with nine efficient omics data analysis methods on genomics, metabolomics and transcriptomics datasets. In most cases it outperformed the other methods in area under the receiver operating characteristic curve (AUROC), sensitivity, specificity and area under the precision-recall curve (AUPRC). In addition, the important metabolites identified by DA-SRN in the type 2 diabetes (T2D) metabolomics data were examined further. Pathway analysis revealed close relationships between the identified metabolites and critical metabolic pathways related to the occurrence and development of T2D. The experimental results illustrate that DA-SRN can extract valuable information from complex omics data by analyzing sample relationships, and that it is promising for biomarker identification and sample discrimination in complex diseases.
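The idea that network topology reflects the discriminative ability of a feature space can be illustrated with a small sketch. The code below is not the DA-SRN implementation: it assumes a k-nearest-neighbour graph as the sample network and uses edge homophily (the fraction of edges joining same-label samples) as a stand-in topology score. The function names `knn_edges` and `edge_homophily`, the synthetic data and the choice of k are all hypothetical. A score of this kind could serve as the fitness that a genetic algorithm maximizes over feature subsets, but the abstract does not specify the actual objective.

```python
import numpy as np

def knn_edges(X, k=3):
    """Undirected edges of a k-nearest-neighbour graph (Euclidean distance)."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is never its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    edges = set()
    for i, row in enumerate(neighbours):
        for j in row:
            edges.add((min(i, int(j)), max(i, int(j))))
    return edges

def edge_homophily(X, y, k=3):
    """Fraction of network edges that connect samples with the same label."""
    edges = knn_edges(X, k)
    same = sum(1 for i, j in edges if y[i] == y[j])
    return same / len(edges)

# Synthetic data (assumption): two classes, two informative and two noise features.
rng = np.random.default_rng(0)
n = 40
y = np.array([0] * n + [1] * n)
informative = rng.normal(loc=y[:, None] * 3.0, scale=1.0, size=(2 * n, 2))
noise = rng.normal(size=(2 * n, 2))
X = np.hstack([informative, noise])

h_inf = edge_homophily(X[:, :2], y)    # network built on informative features
h_noise = edge_homophily(X[:, 2:], y)  # network built on noise features
print(f"homophily, informative features: {h_inf:.2f}")
print(f"homophily, noise features:       {h_noise:.2f}")
```

On data like this, the network built on the informative features should show clearly higher homophily than the one built on noise, which is the sense in which topology can rank candidate feature subsets.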

