Abstract

The subcellular localization of a protein is important for its function and its interactions with other molecules, and mislocalization is linked to numerous diseases. Although atlas-scale efforts have profiled protein localization across various cell lines, existing datasets contain only a limited set of protein and cell line pairs and do not cover all human proteins. We present a method that uses both protein sequences and cellular landmark images to perform Predictions of Unseen Proteins' Subcellular localization (PUPS), and that generalizes to both proteins and cell lines not used for model training. PUPS combines a protein language model with an image inpainting model so that both the protein sequence and the cellular images inform the localization prediction. The protein sequence input enables generalization to unseen proteins, and the cellular image input enables cell-type-specific prediction that captures single-cell variability. This ability to generalize to unseen proteins and cell lines lets us assess the variability in protein localization across cell lines, as well as across single cells within a cell line, and identify the biological processes associated with proteins whose localization is variable. Experimental validation shows that PUPS can predict protein localization in newly performed experiments outside the Human Protein Atlas data used for training. Collectively, PUPS uses both protein sequences and cellular images to predict protein localization for unseen proteins and cell lines while capturing single-cell variability.
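
To make the two-input design concrete, the following toy sketch shows one way a sequence embedding and landmark images could be fused into a per-pixel localization map. It is not the PUPS implementation: the abstract does not specify the architecture, so the random per-residue table (a stand-in for a protein language model), the single linear layer (a stand-in for an image inpainting model), the three landmark channels, and the names `embed_sequence` and `predict_localization` are all assumptions made for illustration.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def embed_sequence(seq, dim=8, seed=0):
    """Hypothetical stand-in for a protein language model.

    Averages fixed random per-residue vectors into one embedding of size `dim`.
    """
    rng = np.random.default_rng(seed)
    table = rng.normal(size=(len(AMINO_ACIDS), dim))
    idx = [AMINO_ACIDS.index(residue) for residue in seq]
    return table[idx].mean(axis=0)


def predict_localization(seq_embedding, landmarks, weights):
    """Hypothetical stand-in for an image inpainting model.

    landmarks:     (C, H, W) landmark stain images of one cell
    seq_embedding: (D,) vector describing the protein
    weights:       (C + D,) linear weights applied at every pixel
    Returns an (H, W) map with values in (0, 1).
    """
    _, height, width = landmarks.shape
    # Broadcast the sequence embedding to every pixel so it can condition the image.
    tiled = np.broadcast_to(
        seq_embedding[:, None, None], (seq_embedding.size, height, width)
    )
    features = np.concatenate([landmarks, tiled], axis=0)  # (C + D, H, W)
    logits = np.tensordot(weights, features, axes=([0], [0]))  # (H, W)
    return 1.0 / (1.0 + np.exp(-logits))


# Toy inputs: three assumed landmark channels for a single 16 x 16 cell crop.
rng = np.random.default_rng(1)
landmarks = rng.random((3, 16, 16))
embedding = embed_sequence("MKTAYIAKQR")  # arbitrary example sequence
weights = rng.normal(size=3 + embedding.size)  # untrained, random weights
prediction = predict_localization(embedding, landmarks, weights)
```

In this sketch the output changes when either the sequence or the landmark images change, which mirrors the roles the abstract assigns to the two inputs: the sequence carries protein identity, and the images carry the cell-specific context.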
