Abstract

The key finding of the DNA double helix model is the specific pairing, or binding, between the nucleotides A-T and C-G; these pairing rules are the molecular basis of the genetic code. Unfortunately, no such rules have been discovered for proteins. Here we show that intrinsic sequence patterns exist between intra-protein binding peptide fragments, that they can be extracted using a deep learning algorithm, and that they bear an interesting resemblance to the DNA double helix model. The intra-protein binding peptide fragments have specific and intrinsic sequence patterns, distinct from those of non-binding peptide fragments, and multiple millions of binding and non-binding peptide fragments from currently available protein X-ray structures are classified with an accuracy of up to 93%. The specific binding between short peptide fragments may provide an important driving force for protein folding and protein-protein interaction, two open and fundamental problems in molecular biology, and it may have significant potential in the design, discovery, and development of peptide, protein, and antibody drugs.

Highlights

  • Protein folding and protein-protein interaction are two fundamental, long-standing problems in molecular biology, and their importance can hardly be overestimated

  • Available protein structure data in the early 1990s were not sufficient for further exploration. We examined this alternative hypothesis; our main reasoning is that, if the hypothesis is true, binding peptide fragments must have specific and intrinsic sequence patterns that are distinct from those of non-binding ones

  • The accuracy of up to 93% (Table 1) and the AUC-ROC of 0.979 (Fig. 4), obtained on multiple millions of peptide triad (PT) and PD samples, show that intra-protein binding peptide fragments do have specific and intrinsic sequence patterns that are distinct from those of non-binding ones
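
For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how accuracy and AUC-ROC can be computed for a binary binding/non-binding classifier. It is not the authors' pipeline: the function names `accuracy` and `auc_roc`, the 0.5 decision threshold, and the six toy labels and scores are all assumptions made for illustration only.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the true label.

    labels: 1 for a binding fragment pair, 0 for non-binding (toy convention).
    scores: classifier outputs in [0, 1]; threshold is an assumed cut-off.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def auc_roc(labels, scores):
    """AUC-ROC via its rank interpretation.

    Equals the probability that a randomly chosen positive sample receives
    a higher score than a randomly chosen negative one (ties count half).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, not taken from the paper.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

print(f"accuracy = {accuracy(labels, scores):.3f}")
print(f"AUC-ROC  = {auc_roc(labels, scores):.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC-ROC summarizes ranking quality across all thresholds, which is presumably why both figures are reported. The pairwise loop is quadratic in the sample count and is meant only to make the definition explicit; at the scale of millions of samples a sort-based computation would be needed.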

Introduction

Protein folding and protein-protein interaction (PPI) are two fundamental, long-standing problems in molecular biology, and their importance can hardly be overestimated. Despite huge advances in computing power, it remains a highly challenging task to predict protein structures and PPI de novo[2,3]. Electrostatic interactions, unlike van der Waals forces and hydrogen bonds, are long-range: they remain relevant beyond the closest neighbors. The limited applicability of computational approaches to the prediction of protein structure and PPI suggests a need for novel ideas, in particular for force fields. Computational approaches to the protein folding and PPI problems start from the assumption that a protein’s native conformation corresponds to its global free energy minimum[1] and that binding peptide fragments are brought together after 3D structures are formed. We proposed an alternative mechanism: binding peptide fragments are formed first and drive the formation of protein 3D structure and PPI. Available protein structure data in the early 1990s were not sufficient for further exploration.
