Self-Supervised Pre-training for Protein Embeddings Using Tertiary Structures

Yuzhi Guo, Jiaxiang Wu, Hehuan Ma, Junzhou Huang

doi:10.1609/aaai.v36i6.20636

Abstract

A protein's tertiary structure largely determines its interactions with other molecules. Although structure is important in many structure-related tasks, fully supervised data are often time-consuming and costly to obtain. Existing pre-training models mostly focus on amino-acid sequences or multiple sequence alignments, leaving structural information unexploited. In this paper, we propose a self-supervised pre-training model for learning structure embeddings from protein tertiary structures. Native protein structures are perturbed with random noise, and the pre-training model aims to estimate gradients over the perturbed 3D structures. Specifically, we adopt SE(3)-invariant features as model inputs and reconstruct gradients over 3D coordinates with SE(3)-equivariance preserved. This paradigm avoids the use of sophisticated SE(3)-equivariant models and dramatically improves the computational efficiency of pre-training. We demonstrate the effectiveness of our pre-training model on two downstream tasks: protein structure quality assessment (QA) and protein-protein interaction (PPI) site prediction. Hierarchical structure embeddings are extracted to enhance the corresponding prediction models. Extensive experiments indicate that these structure embeddings consistently improve prediction accuracy on both downstream tasks.
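
The perturb-and-estimate-gradients idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not specify the noise model or the input features, so the sketch assumes isotropic Gaussian noise (for which the standard denoising score matching target is -(x_noisy - x) / sigma^2) and uses pairwise distances as one example of SE(3)-invariant features. All function names are hypothetical. The sketch only checks the two symmetry properties the abstract relies on: the features do not change under a rigid motion, while the target gradients rotate with the structure.

```python
import numpy as np

def perturb(coords, sigma, rng):
    """Add isotropic Gaussian noise to 3D coordinates (assumed noise model)."""
    return coords + rng.normal(0.0, sigma, size=coords.shape)

def target_gradient(coords, noisy, sigma):
    """Gradient of log N(noisy; coords, sigma^2 I) with respect to noisy."""
    return -(noisy - coords) / sigma**2

def pairwise_distances(coords):
    """Pairwise distance matrix: one simple SE(3)-invariant feature."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def random_rotation(rng):
    """Random proper rotation matrix obtained from a QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # make the factorization sign-consistent
    if np.linalg.det(q) < 0:         # force determinant +1 (no reflection)
        q[:, 0] = -q[:, 0]
    return q

rng = np.random.default_rng(0)
native = rng.normal(size=(8, 3)) * 5.0   # toy "structure": 8 points in 3D
sigma = 0.5
noisy = perturb(native, sigma, rng)
grad = target_gradient(native, noisy, sigma)

# Apply the same rigid motion (rotation R, translation t) to both structures.
R = random_rotation(rng)
t = rng.normal(size=3)
native_rt = native @ R.T + t
noisy_rt = noisy @ R.T + t
grad_rt = target_gradient(native_rt, noisy_rt, sigma)

# Invariant inputs: distances are unchanged by the rigid motion.
invariant_ok = np.allclose(pairwise_distances(noisy), pairwise_distances(noisy_rt))
# Equivariant targets: gradients rotate with the structure (translation cancels).
equivariant_ok = np.allclose(grad_rt, grad @ R.T)
print(invariant_ok, equivariant_ok)
```

A model trained this way would map the invariant features to per-atom gradients; how the paper recovers equivariant gradients from invariant inputs is not described in the abstract and is not reproduced here.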

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jun 28, 2022
Citations: 13
