Abstract

Recognition of protein subcellular distribution patterns and identification of location biomarker proteins in cancer tissues are important for understanding protein functions and related diseases. Immunohistochemical (IHC) images enable visualizing the distribution of proteins at the tissue level, providing an important resource for the protein localization studies. In the past decades, several image-based protein subcellular location prediction methods have been developed, but the prediction accuracies still have much space to improve due to the complexity of protein patterns resulting from multi-label proteins and the variation of location patterns across cell types or states. Here, we propose a multi-label multi-instance model based on deep graph convolutional neural networks, GraphLoc, to recognize protein subcellular location patterns. GraphLoc builds a graph of multiple IHC images for one protein, learns protein-level representations by graph convolutions and predicts multi-label information by a dynamic threshold method. Our results show that GraphLoc is a promising model for image-based protein subcellular location prediction with model interpretability. Furthermore, we apply GraphLoc to the identification of candidate location biomarkers and potential members for protein networks. A large portion of the predicted results have supporting evidence from the existing literatures and the new candidates also provide guidance for further experimental screening. The dataset and code are available at: www.csbio.sjtu.edu.cn/bioinf/GraphLoc. Supplementary data are available at Bioinformatics online.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call