Abstract

The principle of three-dimensional protein structure formation is a long-standing conundrum in structural biology. A globular domain of a soluble protein is formed by a network of atomic contacts among amino acid residues, but regions without intramolecular non-local contacts are often observed in protein structures, especially in loops, linkers, and peripheral segments with secondary structures. Although these regions can play key roles in protein function as interfaces for intermolecular interactions, their nature remains unclear. Here, we term protein segments without non-local contacts floating segments and search for them in tens of thousands of entries in the Protein Data Bank. We found that 0.72% of residues are in floating segments. Among secondary structural elements, coil structures are enriched in floating segments, especially in long segments. Interactions with polypeptides and polynucleotides, but not with chemical compounds, are enriched in floating segments. The amino acid preferences of floating segments are similar to those of surface residues, with exceptions: the amino acids with small side chains, Gly and Ala, are preferred, and some amino acids with charged side chains, Arg and His, are disfavored in floating segments compared to surface residues. Our comprehensive characterization of floating segments may provide insights into protein sequence-structure-function relationships.
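As an illustration only, the idea of a segment without non-local contacts can be sketched in a few lines of Python. This is not the procedure used in this work: the function name `floating_segments`, the use of one representative coordinate per residue, the 8.0 Å distance cutoff, the minimum sequence separation of 4, and the per-residue grouping rule are all assumptions chosen for the toy example.

```python
from math import dist


def floating_segments(coords, cutoff=8.0, min_separation=4, min_length=1):
    """Return inclusive (start, end) index pairs of consecutive residues
    that have no non-local contact.

    Assumptions (hypothetical, not taken from the paper):
      - coords holds one representative (x, y, z) point per residue;
      - two residues are in non-local contact when they are at least
        min_separation apart in sequence and within cutoff angstroms.
    """
    n = len(coords)
    has_nonlocal = [False] * n
    for i in range(n):
        for j in range(i + min_separation, n):
            if dist(coords[i], coords[j]) <= cutoff:
                has_nonlocal[i] = True
                has_nonlocal[j] = True

    # Group consecutive contact-free residues into segments.
    segments = []
    start = None
    for i, flag in enumerate(has_nonlocal):
        if not flag and start is None:
            start = i
        elif flag and start is not None:
            if i - start >= min_length:
                segments.append((start, i - 1))
            start = None
    if start is not None and n - start >= min_length:
        segments.append((start, n - 1))
    return segments


# Toy chain: a hairpin (residues 0-7) followed by an extended tail (8-11).
toy = (
    [(3.8 * k, 0.0, 0.0) for k in range(4)]            # first strand
    + [(3.8 * k, 5.0, 0.0) for k in range(3, -1, -1)]  # return strand
    + [(-3.8 * k, 5.0, 0.0) for k in range(1, 5)]      # extended tail
)

all_runs = floating_segments(toy)
long_runs = floating_segments(toy, min_length=3)
```

In this toy chain the two residues at the hairpin turn and the last three residues of the tail have no partner within the cutoff, so `all_runs` contains both runs, while `min_length=3` keeps only the tail. A real analysis would start from all-atom coordinates and a carefully justified contact definition.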

Highlights

  • Elucidating the principles of the three-dimensional (3D) structure formation of proteins is a long-standing conundrum in the field of structural biology [1,2,3]

  • How a sequence of 20 types of amino acid residues encoded in the standard genetic code in a polypeptide determines its 3D structure remains largely unclear

  • We defined floating and supported segments involved in the 3D structure of proteins (Fig 1) and characterized them on the basis of statistical analyses of the Protein Data Bank (PDB)


Introduction

Elucidating the principles of the three-dimensional (3D) structure formation of proteins is a long-standing conundrum in the field of structural biology [1,2,3]. How a sequence of the 20 types of amino acid residues encoded in the standard genetic code (and two additional amino acid residues added via specific translation mechanisms [4]) in a polypeptide determines its 3D structure remains largely unclear. To illuminate this issue, extensive efforts in structural biology have accumulated a massive amount of structural data for proteins in the Protein Data Bank (PDB) [5]. Propensities of non-local contacts play pivotal roles in establishing folds of globular domains
