Abstract
Contemporary deep learning approaches achieve state-of-the-art performance in many areas. In healthcare, however, their application remains limited because deep learning methods are often regarded as non-interpretable black-box models. The Machine Learning (ML) community has recently made efforts to develop methods of eXplainable Artificial Intelligence (XAI). Such explanation methods explain a single decision of an ML model, made when a single data point is fed into the model’s input. In a clinical setting, a data point can represent a single patient. Data point-specific explanations could therefore support personalized precision medicine decisions by explaining patient-specific predictions. Convolutional Neural Networks (CNNs) have already been applied to classify patients' gene expression profiles transformed into images. Gene expression data can be structured by a prior knowledge molecular network (encoded as a graph) that represents connections between genes. Each vertex of the molecular network is assigned a gene expression value as an attribute, and the set of these attributes forms a graph signal representing a patient. The emerging field of geometric deep learning deals with methods applicable to graph-structured data and extends CNNs to Graph Convolutional Neural Networks (GCNNs), which classify graph signals. Layer-wise Relevance Propagation (LRP) is a method for explaining decisions of CNNs that classify image data. I extended the LRP method to make it available for GCNNs. Graph Layer-wise Relevance Propagation (GLRP) is presented as a new method to explain single decisions made by a GCNN model. In this thesis, I present a novel methodology that generates patient-specific molecular subnetworks as explanations for classification decisions of an ML approach utilizing prior knowledge of molecular networks. A GCNN serves as the ML approach, and its decisions are explained by the developed GLRP.
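To make the idea of relevance propagation more concrete, the following is a minimal sketch of the generic LRP epsilon rule for a single bias-free linear layer. It is not the GLRP rule developed in the thesis. The function name `lrp_epsilon`, the toy activations, the weights, and the stabiliser value are all assumptions chosen for illustration. The sketch only shows how the relevance assigned to the outputs of a layer is redistributed to its inputs in proportion to their contributions.

```python
def lrp_epsilon(activations, weights, relevance_out, eps=1e-9):
    """Redistribute output relevance to the inputs of one bias-free linear layer.

    activations   -- input activations a_i
    weights       -- weights[i][j] connects input i to output j
    relevance_out -- relevance R_j assigned to each output
    """
    n_in, n_out = len(activations), len(relevance_out)
    # Pre-activations z_j = sum_i a_i * w_ij
    z = [sum(activations[i] * weights[i][j] for i in range(n_in))
         for j in range(n_out)]
    relevance_in = []
    for i in range(n_in):
        r_i = 0.0
        for j in range(n_out):
            # The stabiliser eps keeps the denominator away from zero
            denom = z[j] + (eps if z[j] >= 0 else -eps)
            r_i += activations[i] * weights[i][j] / denom * relevance_out[j]
        relevance_in.append(r_i)
    return relevance_in


# Hypothetical toy values: three input features (e.g. genes), two output neurons
a = [1.0, 2.0, 0.5]
W = [[0.5, -1.0],
     [1.0, 0.5],
     [-0.5, 2.0]]
R_out = [1.0, 0.0]  # all relevance placed on the first output

R_in = lrp_epsilon(a, W, R_out)
print(R_in)
```

With a small stabiliser and no bias term, the input relevances sum approximately to the output relevances, which is the conservation property that LRP relies on when relevance is propagated layer by layer back to the input.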
A sanity check of the developed GLRP method was performed on a hand-written digits dataset. The biological validation was performed by applying the developed methodology to gene expression data from Human Umbilical Vein Endothelial Cells (HUVEC) treated or not treated with tumor necrosis factor alpha. To show the utility of the introduced methodology for precision medicine, it was applied to a large breast cancer dataset. The generated patient-specific subnetworks largely agree with clinical knowledge and could assist precision medicine approaches by identifying common as well as novel, and potentially druggable, drivers of tumor progression. Apart from generating patient-specific subnetworks, the developed methodology can be used as a general feature selection approach. The outcome of a feature selection approach is a subset of genes that are important for classification, derived from the whole dataset. It is essential that the selected feature subsets remain stable across different datasets with the same clinical endpoint, since the selected genes are possible candidates for prognostic biomarkers. I analysed the stability of feature selection performed by GCNN+LRP. I implemented the graph convolutional layer of the GCNN as a Keras layer so that the SHapley Additive exPlanations (SHAP) method could also be applied to a Keras version of the GCNN model. The stability of feature selection performed by GCNN+LRP was compared to that of GCNN+SHAP and other ML-based feature selection approaches. The GCNN+LRP approach shows the highest stability. GCNN+LRP subnetworks were compared to GCNN+SHAP subnetworks in terms of connectivity and permutation feature importance.
While GCNN+SHAP subnetworks demonstrate higher permutation importance than GCNN+LRP subnetworks, a GCNN+LRP subnetwork of an individual patient is on average substantially more connected, and therefore more interpretable in the context of prior knowledge, than a GCNN+SHAP subnetwork, which consists mainly of single vertices.
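One possible way to quantify how connected a patient-specific subnetwork is can be sketched as follows. The abstract does not state which connectivity measure was used, so counting connected components and isolated vertices of the induced subgraph is an assumption. The helper names, the toy edge list, and the gene labels `g1` to `g7` are hypothetical.

```python
def connected_components(selected, edges):
    """Return the connected components of the subgraph induced by `selected`."""
    selected = set(selected)
    neighbours = {v: set() for v in selected}
    for u, v in edges:
        # Keep only edges whose two endpoints were both selected
        if u in selected and v in selected and u != v:
            neighbours[u].add(v)
            neighbours[v].add(u)
    seen, components = set(), []
    for start in sorted(selected):
        if start in seen:
            continue
        # Depth-first traversal collects one component
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components


def count_isolated(components):
    """Number of selected vertices with no selected neighbour."""
    return sum(1 for c in components if len(c) == 1)


# Hypothetical prior-knowledge network and two hypothetical gene selections
edges = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g5", "g6")]
connected_selection = ["g1", "g2", "g3", "g4"]
scattered_selection = ["g1", "g3", "g5", "g7"]

connected_parts = connected_components(connected_selection, edges)
scattered_parts = connected_components(scattered_selection, edges)
print(len(connected_parts), count_isolated(connected_parts))
print(len(scattered_parts), count_isolated(scattered_parts))
```

In this toy case the first selection forms a single component, while the second consists only of single vertices, which mirrors the qualitative difference described above between a well-connected subnetwork and one made up mainly of isolated genes.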