Deep IDA: a deep learning approach for integrative discriminant analysis of multi-omics data with feature ranking—an application to COVID-19

Jiuzhou Wang,Sandra E Safo,Magnus Rattray

doi:10.1093/bioadv/vbae060

Abstract

Abstract Motivation Many diseases are complex heterogeneous conditions that affect multiple organs in the body and depend on the interplay between several factors that include molecular and environmental factors, requiring a holistic approach to better understand disease pathobiology. Most existing methods for integrating data from multiple sources and classifying individuals into one of multiple classes or disease groups have mainly focused on linear relationships despite the complexity of these relationships. On the other hand, methods for nonlinear association and classification studies are limited in their ability to identify variables to aid in our understanding of the complexity of the disease or can be applied to only two data types. Results We propose Deep Integrative Discriminant Analysis (IDA), a deep learning method to learn complex nonlinear transformations of two or more views such that resulting projections have maximum association and maximum separation. Further, we propose a feature ranking approach based on ensemble learning for interpretable results. We test Deep IDA on both simulated data and two large real-world datasets, including RNA sequencing, metabolomics, and proteomics data pertaining to COVID-19 severity. We identified signatures that better discriminated COVID-19 patient groups, and related to neurological conditions, cancer, and metabolic diseases, corroborating current research findings and heightening the need to study the post sequelae effects of COVID-19 to devise effective treatments and to improve patient care. Availability and implementation Our algorithms are implemented in PyTorch and available at: https://github.com/JiuzhouW/DeepIDA

Full Text