Abstract

Protein structure prediction has seen a breakthrough in recent years: the DeepMind team's AlphaFold2 model has raised prediction accuracy to the atomic level. Current deep learning-based protein function prediction models usually extract features from protein sequences and combine them with protein-protein interaction networks, and they achieve good results. However, such models cannot make effective predictions for newly sequenced proteins that are absent from the protein-protein interaction network. To address this, this paper proposes Struct2GO, a model that combines protein structure and sequence data to enhance both the precision of protein function prediction and the generality of the model. We obtain amino acid residue embeddings from the protein structure through graph representation learning, apply a graph pooling algorithm based on a self-attention mechanism to obtain whole-graph structural features, and fuse these with sequence features obtained from a protein language model. The results demonstrate that Struct2GO outperforms traditional protein sequence-based function prediction models. The source code is available at https://github.com/lyjps/Struct2GO. Supplementary data are available at Bioinformatics online.
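
The pipeline summarized above (residue-level graph features, attention-based pooling to a whole-graph vector, fusion with a sequence embedding) can be illustrated with a minimal sketch. This is not the Struct2GO implementation. It assumes a dense residue contact map, a single hand-set scoring vector in place of learned parameters, a mean-plus-max readout, and fusion by simple concatenation. All function names, variable names, and numeric values are hypothetical.

```python
import math

def normalize_adjacency(adj):
    """Symmetrically normalize a dense adjacency matrix after adding self-loops."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always >= 1 because of the self-loop
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def attention_scores(adj, feats, weight):
    """One scalar score per residue: tanh(A_norm @ X @ w)."""
    a_norm = normalize_adjacency(adj)
    projected = [sum(x * w for x, w in zip(row, weight)) for row in feats]  # X @ w
    return [math.tanh(sum(a * p for a, p in zip(a_row, projected))) for a_row in a_norm]

def pool_graph(adj, feats, weight, ratio=0.5):
    """Keep the top-scoring residues, gate their features, and read out one vector."""
    scores = attention_scores(adj, feats, weight)
    k = max(1, math.ceil(ratio * len(feats)))
    keep = sorted(range(len(feats)), key=lambda i: scores[i], reverse=True)[:k]
    gated = [[scores[i] * x for x in feats[i]] for i in keep]
    dim = len(feats[0])
    mean_part = [sum(row[d] for row in gated) / k for d in range(dim)]
    max_part = [max(row[d] for row in gated) for d in range(dim)]
    return keep, scores, mean_part + max_part  # readout: mean concatenated with max

def fuse(graph_vec, seq_vec):
    """Assumed fusion step: concatenate structure and sequence features."""
    return graph_vec + seq_vec

# Toy protein with 4 residues in a chain (hypothetical values throughout).
contact_map = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
residue_feats = [[0.2, 0.1], [0.9, 0.3], [0.4, 0.8], [0.1, 0.5]]
score_weight = [1.0, -0.5]             # stands in for a learned parameter
sequence_embedding = [0.3, -0.1, 0.7]  # stands in for a language-model embedding

kept, scores, graph_vec = pool_graph(contact_map, residue_feats, score_weight)
fused = fuse(graph_vec, sequence_embedding)
print(kept, len(fused))
```

In a trained model the scoring vector would be learned jointly with the classifier, and the fused vector would feed a multi-label output layer over function terms. The sketch only shows how a variable-size residue graph can be reduced to a fixed-length vector before fusion.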
