Seq-InSite: sequence supersedes structure for protein interaction site prediction.

Seyedmohsen Hosseini, Lucian Ilie, G Brian Golding

doi:10.1093/bioinformatics/btad738

Journal: Bioinformatics
Publication Date: Jan 2, 2024
Citations: 3
License type: CC BY 4.0

Affiliation: Western University, McMaster University

Abstract

Proteins accomplish cellular functions by interacting with each other, which makes the prediction of interaction sites a fundamental problem. As experimental methods are expensive and time-consuming, computational prediction of the interaction sites has been studied extensively. Structure-based programs are the most accurate, while the sequence-based ones are much more widely applicable, as the sequences available outnumber the structures by two orders of magnitude. Ideally, we would like a tool that has the quality of the former and the applicability of the latter. We provide here the first solution that achieves these two goals. Our new sequence-based program, Seq-InSite, greatly surpasses the performance of sequence-based models, matching the quality of state-of-the-art structure-based predictors, thus effectively superseding the need for models requiring structure. The predictive power of Seq-InSite is illustrated using an analysis of evolutionary conservation for four protein sequences. Seq-InSite is freely available as a web server at http://seq-insite.csd.uwo.ca/ and as free source code, including trained models and all datasets used for training and testing, at https://github.com/lucian-ilie/Seq-InSite.

Full Text