Graph-based multi-modality integration for prediction of cancer subtype and severity

Kristel Van Steen,Christian Wohlfart,Antoaneta Vladimirova,Michael King,Diane Duroux

doi:10.1038/s41598-023-46392-6

Kristel Van Steen, Christian Wohlfart + Show 3 more

Open Access

https://doi.org/10.1038/s41598-023-46392-6

Copy DOI

Abstract

Personalised cancer screening before therapy paves the way toward improving diagnostic accuracy and treatment outcomes. Most approaches are limited to a single data type and do not consider interactions between features, leaving aside the complementary insights that multimodality and systems biology can provide. In this project, we demonstrate the use of graph theory for data integration via individual networks where nodes and edges are individual-specific. We showcase the consequences of early, intermediate, and late graph-based fusion of RNA-Seq data and histopathology whole-slide images for predicting cancer subtypes and severity. The methodology developed is as follows: (1) we create individual networks; (2) we compute the similarity between individuals from these graphs; (3) we train our model on the similarity matrices; (4) we evaluate the performance using the macro F1 score. Pros and cons of elements of the pipeline are evaluated on publicly available real-life datasets. We find that graph-based methods can increase performance over methods that do not study interactions. Additionally, merging multiple data sources often improves classification compared to models based on single data, especially through intermediate fusion. The proposed workflow can easily be adapted to other disease contexts to accelerate and enhance personalized healthcare.

Full Text