Binary classification of protein molecules into intrinsically disordered and ordered segments

Satoshi Fukuchi, Keiichi Homma, Ken Nishikawa, Takashi Gojobori, Kazuo Hosoda

doi:10.1186/1472-6807-11-29

Abstract

Background: Although structural domains in proteins (SDs) are important, half of the regions in the human proteome are currently left with no SD assignments. These unassigned regions consist not only of novel SDs, but also of intrinsically disordered (ID) regions, since proteins, especially those in eukaryotes, generally contain a significant fraction of ID regions. As ID regions can be inferred from amino acid sequences, a method that combines SD and ID region assignments can determine the fractions of SDs and ID regions in any proteome.

Results: In contrast to other available ID prediction programs that merely identify likely ID regions, the DICHOT system we previously developed classifies the entire protein sequence into SDs and ID regions. Application of DICHOT to the human proteome revealed that, on a residue-wise basis, ID regions constitute 35%, SDs with similarity to PDB structures comprise 52%, and SDs with no similarity to PDB structures account for the remaining 13%. The last group consists of novel structural domains, termed cryptic domains, which serve as good targets of structural genomics. The DICHOT method applied to the proteomes of other model organisms indicated that eukaryotes generally have high ID contents, while prokaryotes do not. In human proteins, ID contents differ among subcellular localizations: nuclear proteins had the highest residue-wise ID fraction (47%), while mitochondrial proteins exhibited the lowest (13%). Phosphorylation and O-linked glycosylation sites were found to be located preferentially in ID regions. As O-linked glycans are attached to residues in the extracellular regions of proteins, the modification is likely to protect the ID regions from proteolytic cleavage in the extracellular environment. Alternative splicing events tend to occur more frequently in ID regions. We interpret this as evidence that natural selection is operating at the protein level in alternative splicing.

Conclusions: We classified entire regions of proteins into two categories, SDs and ID regions, and thereby obtained various kinds of complete genome-wide statistics. The results of the present study are important basic information for understanding protein structural architectures and have been made publicly available at http://spock.genes.nig.ac.jp/~genome/DICHOT.
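The residue-wise percentages quoted in the abstract can be read as the number of residues assigned to each category divided by the total number of residues in the proteome. The sketch below illustrates that arithmetic only; it is not the DICHOT implementation. The function name, the label strings (`ID`, `SD_known`, `SD_cryptic`), the segment format, and the toy proteins are all hypothetical choices made for illustration.

```python
from collections import Counter


def residue_wise_fractions(proteins):
    """Return the fraction of residues carrying each label.

    `proteins` maps a protein name to a list of (start, end, label)
    segments. Coordinates are assumed to be 1-based and inclusive, and
    the segments of one protein are assumed not to overlap.
    """
    counts = Counter()
    for segments in proteins.values():
        for start, end, label in segments:
            # An inclusive segment from start to end holds end - start + 1 residues.
            counts[label] += end - start + 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Toy input (invented, not taken from the paper): two 100-residue proteins.
toy_proteome = {
    "protA": [(1, 60, "SD_known"), (61, 100, "ID")],
    "protB": [(1, 30, "ID"), (31, 80, "SD_cryptic"), (81, 100, "SD_known")],
}

fractions = residue_wise_fractions(toy_proteome)
for label, value in sorted(fractions.items()):
    print(f"{label}: {value:.2%}")
```

Because every residue is placed in exactly one category, the fractions sum to one, which is what allows a binary SD/ID classification to yield complete proteome-wide statistics.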

Highlights

Although structural domains in proteins (SDs) are important, half of the regions in the human proteome are currently left with no SD assignments
The DICHOT system classifies the entire region of an amino acid sequence into SDs and intrinsically disordered (ID) regions
The detailed results on individual human proteins can be accessed at http://spock.genes.nig.ac.jp/~genome/DICHOT

Summary

Introduction

Although structural domains in proteins (SDs) are important, half of the regions in the human proteome are currently left with no SD assignments. These unassigned regions consist of novel SDs and of intrinsically disordered (ID) regions, since proteins, especially those in eukaryotes, generally contain a significant fraction of ID regions. With protein structures accumulating and protein structure prediction improving, SDs can be assigned to amino acid sequences with increasing accuracy. The discovery of intrinsically disordered proteins (IDPs) has brought a paradigm change to structural biology [6,7,8]. It was found that phosphorylation sites preferentially reside in ID regions [21].

Journal: BMC Structural Biology	Publication Date: Jan 1, 2011
Citations: 124	License type: cc-by
