Expose flexible conformations for intrinsically disordered protein
- Research Article
- 10.1002/prca.70012
- Jun 7, 2025
- Proteomics. Clinical Applications
Introduction: The human retina relies on a complex network of proteins, many of which exhibit intrinsic disorder and liquid‐liquid phase separation (LLPS), enabling dynamic interactions for retinal function. Disruptions in these properties, along with missense mutations, have been linked to retinal diseases. This study aims to characterize and compare retinal proteins categorized by their expression specificity and tissue distribution using bioinformatics tools to explore relationships between intrinsic protein disorder, phase separation potential, and mutation pathogenicity.
Methods: We analyzed retinal proteins classified by the Human Protein Atlas (HPA) into two major groups based on gene expression specificity (degree of unique retinal expression) and gene expression distribution (extent of expression across tissues). We analyzed nine retinal proteomes categorized by gene expression specificity and distribution. Intrinsic protein disorder was assessed using per‐residue and global disorder predictors from the Rapid Intrinsic Disorder Analysis Online (RIDAO) platform, LLPS potential was evaluated with ParSe v2, and missense mutation pathogenicity was predicted using AlphaMissense.
Results: Significant differences in per‐residue intrinsic protein disorder were found within the specificity and distribution subgroups (p < 0.0001). In addition, global disorder predictions from the RIDAO platform demonstrated non‐random distributions of protein species across the proteomes analyzed in both subgroups (p < 0.0001). Furthermore, proteins specifically elevated in the retina exhibited higher intrinsic disorder and greater phase separation propensity (ParSe v2, AUC up to 0.650), compared to those more broadly expressed. Lastly, AlphaMissense analysis showed significant variations in the average pathogenicity scores of missense mutations within subgroups (p < 0.0001).
Conclusion: Our results show that intrinsic disorder, LLPS, and mutational tendencies are not evenly distributed among retinal proteomes. Our study demonstrates a link between intrinsic disorder, LLPS potential, and pathogenic vulnerability among retinal proteins, underscoring the unique structural and functional landscape of retinal proteomes. Proteins with higher specificity to the retina exhibit greater disorder and phase separation potential, highlighting their potential role in dynamic cellular processes that support retinal function. Conversely, proteins widely distributed across multiple tissues tend to be more ordered, suggesting a need for structural stability in their broader functional roles.
- Research Article
- 10.1038/s41598-024-84066-z
- Mar 12, 2025
- Scientific Reports
Acquiring conformational ensembles for a protein is a challenging task that bears on both the protein folding problem and the study of intrinsically disordered proteins. Although AlphaFold predicts structures with unprecedented accuracy, its result is limited to a single conformational state, and it cannot provide the multiple conformations needed to display intrinsic protein disorder. To overcome this barrier, the FiveFold approach was developed as a single-sequence method. It applies the protein folding shape code (PFSC) uniformly to expose the local folds of five amino acid residues, forms the protein folding variation matrix (PFVM) to reveal local folding variations along the sequence, obtains a massive number of folding conformations as PFSC strings, and then constructs an ensemble of multiple conformational protein structures. P53_HUMAN, a well-known protein, and LEF1_HUMAN and Q8GT36_SPIOL, typical disordered proteins, are taken as benchmarks to evaluate the predicted outcomes. The results demonstrate an effective algorithm and a biologically meaningful process for predicting multiple conformational structures of a protein.
- Research Article
58
- 10.1002/pro.3041
- Oct 25, 2016
- Protein Science
Intrinsic disorder (ID) in proteins has been extensively described for the last decade; a large-scale classification of ID in proteins is mostly missing. Here, we provide an extensive analysis of ID in the protein universe on the UniProt database derived from sequence-based predictions in MobiDB. Almost half the sequences contain an ID region of at least five residues. About 9% of proteins have a long ID region of over 20 residues which are more abundant in Eukaryotic organisms and most frequently cover less than 20% of the sequence. A small subset of about 67,000 (out of over 80 million) proteins is fully disordered and mostly found in Viruses. Most proteins have only one ID, with short ID evenly distributed along the sequence and long ID overrepresented in the center. The charged residue composition of Das and Pappu was used to classify ID proteins by structural propensities and corresponding functional enrichment. Swollen Coils seem to be used mainly as structural components and in biosynthesis in both Prokaryotes and Eukaryotes. In Bacteria, they are confined in the nucleoid and in Viruses provide DNA binding function. Coils & Hairpins seem to be specialized in ribosome binding and methylation activities. Globules & Tadpoles bind antigens in Eukaryotes but are involved in killing other organisms and cytolysis in Bacteria. The Undefined class is used by Bacteria to bind toxic substances and mediate transport and movement between and within organisms in Viruses. Fully disordered proteins behave similarly, but are enriched for glycine residues and extracellular structures.
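The length classes used in this abstract (an ID region of at least five residues, a long ID region of over 20 residues, and the fraction of the sequence covered) can be illustrated with a minimal Python sketch. It assumes hypothetical per-residue disorder scores in the range 0 to 1 and a cutoff of 0.5; the cutoff, the function names and the toy scores are illustrative assumptions and are not taken from MobiDB.

```python
from itertools import groupby


def disordered_regions(scores, threshold=0.5, min_len=5):
    """Return half-open (start, end) index pairs for runs of disordered residues.

    A residue counts as disordered when its score is >= threshold (assumed cutoff).
    Runs shorter than min_len residues are ignored.
    """
    regions = []
    pos = 0
    for is_disordered, run in groupby(scores, key=lambda s: s >= threshold):
        run_len = len(list(run))
        if is_disordered and run_len >= min_len:
            regions.append((pos, pos + run_len))
        pos += run_len
    return regions


def long_regions(regions, min_long=21):
    """Keep only regions of over 20 residues (that is, at least 21)."""
    return [(start, end) for start, end in regions if end - start >= min_long]


def disorder_fraction(regions, length):
    """Fraction of the sequence covered by the given regions."""
    return sum(end - start for start, end in regions) / length


# Toy scores for a 40-residue sequence (made-up values, for illustration only).
scores = [0.9] * 6 + [0.1] * 4 + [0.8] * 25 + [0.2] * 5
regions = disordered_regions(scores)
print(regions, long_regions(regions), disorder_fraction(regions, len(scores)))
```

Half-open index pairs are used so that a region's length is simply `end - start`.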
- Research Article
7
- 10.1002/prot.26381
- May 26, 2022
- Proteins: Structure, Function, and Bioinformatics
Revealing how proteins fold is a challenging subject in both discovery and description. Beyond acquiring an accurate 3D structure of a protein in its stable state, another major hurdle is discovering the structural flexibility that is innate to the protein. Even when a huge number of flexible conformations are known, the difficulty is how to represent them. A novel approach, the protein structure fingerprint, has been developed to expose the comprehensive local folding variations and then construct folding conformations for the entire protein. The backbone of five amino acid residues was identified as a universal folden, and a set of Protein Folding Shape Codes (PFSC) was derived to cover the folding space completely in an alphabetic description. Subsequently, a database was created to collect all possible folding shapes of local folding variations for all permutations of five amino acids. The Protein Folding Variation Matrix (PFVM) then assembles all possible local folding variations along the sequence of a protein, and it possesses several prominent features. First, it shows fluctuation with certain folding patterns along the sequence, revealing how protein folding is related to the order of amino acids in the sequence. Second, all folding variations for an entire protein can be apprehended simultaneously at a glance within the PFVM. Third, all conformations can be determined from the local folding variations in the PFVM, so the total number of conformations is no longer ambiguous for any protein. Finally, the most probable folding conformation and its 3D structure can be acquired from the PFVM for protein structure prediction. Therefore, the protein structure fingerprint approach provides a significant means for investigating the protein folding problem.
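The two ideas in this abstract, reading a sequence as overlapping windows of five residues and counting conformations from the local variations, can be sketched in Python. This is a simplified illustration and not the authors' implementation: the shape-code letters below are placeholders, the toy matrix is invented, and the count assumes that each window position picks one of its candidate codes independently of its neighbours.

```python
from math import prod


def five_residue_windows(sequence):
    """Return every overlapping window of five consecutive residues."""
    return [sequence[i:i + 5] for i in range(len(sequence) - 4)]


def count_code_strings(variation_matrix):
    """Count the code strings obtainable by choosing one code per column.

    variation_matrix is a list of columns, one per window position, each
    holding the candidate shape codes for that position. The product rule
    assumes columns are chosen independently (a simplifying assumption).
    """
    return prod(len(column) for column in variation_matrix)


windows = five_residue_windows("MEEPQSD")

# Hypothetical candidate codes for the three windows above (placeholder letters).
toy_matrix = [["A", "B"], ["A"], ["V", "J", "P"]]
print(windows, count_code_strings(toy_matrix))
```

A sequence shorter than five residues yields no windows, and an empty matrix yields a count of 1 (the empty product).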
- Research Article
4
- 10.1038/s41598-023-45969-5
- Nov 21, 2023
- Scientific Reports
The conformational flexibility of natural proteins makes the relationship between structure and function both complex and difficult to understand. Prediction of intrinsically disordered proteins focuses primarily on disclosing the regions whose structural flexibility is involved in relevant biological functions and various diseases. The order of amino acids in a protein sequence determines the possible conformations, folding flexibility and biological function. Although many methods provide information on intrinsically disordered proteins (IDPs), their results are mainly limited to locating the disordered regions, without knowledge of the possible folding conformations. Here, the protein folding fingerprint adopts the protein folding variation matrix (PFVM) to reveal all possible folding patterns for an intrinsically disordered protein along its sequence. The PFVM exhibits, in an integrated way, the disordered regions, the degree of disorder and the folding patterns of an intrinsically disordered protein. The PFVM not only provides rich information about IDPs but may also promote the study of the protein folding problem.
- Research Article
33
- 10.1093/nar/gkad430
- May 29, 2023
- Nucleic acids research
Intrinsic disorder (ID) in proteins is well-established in structural biology, with increasing evidence for its involvement in essential biological processes. As measuring dynamic ID behavior experimentally on a large scale remains difficult, scores of published ID predictors have tried to fill this gap. Unfortunately, their heterogeneity makes it difficult to compare performance, confounding biologists wanting to make an informed choice. To address this issue, the Critical Assessment of protein Intrinsic Disorder (CAID) benchmarks predictors for ID and binding regions as a community blind-test in a standardized computing environment. Here we present the CAID Prediction Portal, a web server executing all CAID methods on user-defined sequences. The server generates standardized output and facilitates comparison between methods, producing a consensus prediction highlighting high-confidence ID regions. The website contains extensive documentation explaining the meaning of different CAID statistics and providing a brief description of all methods. Predictor output is visualized in an interactive feature viewer and made available for download in a single table, with the option to recover previous sessions via a private dashboard. The CAID Prediction Portal is a valuable resource for researchers interested in studying ID in proteins. The server is available at the URL: https://caid.idpcentral.org.
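To make the idea of a consensus prediction concrete, here is a minimal Python sketch of a per-residue majority vote across several predictors. The abstract does not say how the CAID Prediction Portal computes its consensus, so the strict-majority rule, the function name and the toy predictions are all assumptions made for illustration.

```python
def majority_consensus(predictions):
    """Combine binary per-residue predictions by strict majority vote.

    predictions is a list with one entry per predictor; each entry is a list
    of 0/1 calls (1 = disordered) of the same length as the sequence.
    """
    if not predictions:
        raise ValueError("at least one predictor is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must cover the same sequence length")
    n_predictors = len(predictions)
    # A residue is called disordered when more than half of the predictors agree.
    return [sum(column) * 2 > n_predictors for column in zip(*predictions)]


# Three hypothetical predictors on a five-residue toy sequence.
toy_predictions = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
consensus = majority_consensus(toy_predictions)
print(consensus)
```

With an even number of predictors, a tie is not a majority under this rule and the residue is called ordered.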
- Research Article
79
- 10.1186/1472-6807-11-29
- Jan 1, 2011
- BMC Structural Biology
Background: Although structural domains in proteins (SDs) are important, half of the regions in the human proteome are currently left with no SD assignments. These unassigned regions consist not only of novel SDs, but also of intrinsically disordered (ID) regions since proteins, especially those in eukaryotes, generally contain a significant fraction of ID regions. As ID regions can be inferred from amino acid sequences, a method that combines SD and ID region assignments can determine the fractions of SDs and ID regions in any proteome.
Results: In contrast to other available ID prediction programs that merely identify likely ID regions, the DICHOT system we previously developed classifies the entire protein sequence into SDs and ID regions. Application of DICHOT to the human proteome revealed that residue-wise ID regions constitute 35%, SDs with similarity to PDB structures comprise 52%, while SDs with no similarity to PDB structures account for the remaining 13%. The last group consists of novel structural domains, termed cryptic domains, which serve as good targets of structural genomics. The DICHOT method applied to the proteomes of other model organisms indicated that eukaryotes generally have high ID contents, while prokaryotes do not. In human proteins, ID contents differ among subcellular localizations: nuclear proteins had the highest residue-wise ID fraction (47%), while mitochondrial proteins exhibited the lowest (13%). Phosphorylation and O-linked glycosylation sites were found to be located preferentially in ID regions. As O-linked glycans are attached to residues in the extracellular regions of proteins, the modification is likely to protect the ID regions from proteolytic cleavage in the extracellular environment. Alternative splicing events tend to occur more frequently in ID regions. We interpret this as evidence that natural selection is operating at the protein level in alternative splicing.
Conclusions: We classified entire regions of proteins into the two categories, SDs and ID regions and thereby obtained various kinds of complete genome-wide statistics. The results of the present study are important basic information for understanding protein structural architectures and have been made publicly available at http://spock.genes.nig.ac.jp/~genome/DICHOT.
- Research Article
3
- 10.3390/ijms19103101
- Oct 10, 2018
- International Journal of Molecular Sciences
Conformational protein properties are coupled to protein functionality and could provide a useful parameter for functional annotation of differentially expressed genes in transcriptome studies. The aim was to determine whether predicted intrinsic protein disorder was differentially associated with proteins encoded by genes that are differentially regulated in lymphoma cells upon interaction with stromal cells, an interaction that occurs in microenvironments, such as lymph nodes that are protective for lymphoma cells during chemotherapy. Intrinsic disorder protein properties were extracted from the Database of Disordered Protein Prediction (D2P2), which contains data from nine intrinsic disorder predictors. Proteins encoded by differentially regulated cell-adhesion regulated genes were enriched in intrinsically disordered regions (IDRs) compared to other genes both with regard to IDR number and length. The enrichment was further ascribed to down-regulated genes. Consistently, a higher proportion of proteins encoded by down-regulated genes contained at least one IDR or were completely disordered. We conclude that down-regulated genes in stromal cell-adherent lymphoma cells encode proteins that are characterized by elevated levels of intrinsically disordered conformation, indicating the importance of down-regulating functional mechanisms associated with intrinsically disordered proteins in these cells. Further, the approach provides a generally applicable and complementary alternative to classification of differentially regulated genes using gene ontology or pathway enrichment analysis.
- Research Article
2
- 10.6026/97320630018111
- Feb 28, 2022
- Bioinformation
Hepatitis E virus (HEV) is the causative agent of Hepatitis E infections across the world. Intrinsically disordered protein regions (IDPRs) or intrinsically disordered proteins (IDPs) are regions or proteins that are characterized by lack of definite structure. These IDPRs or IDPs play significant roles in a wide range of biological processes, such as cell cycle regulation, control of signaling pathways, etc. IDPR/IDP in proteins is associated with the virus's pathogenicity and infectivity. The prevalence of IDPR/IDP in rat HEV proteome remains undetermined. Hence, we examined the unstructured/disordered regions of the open reading frame (ORF) encoded proteins of rat HEV by analyzing the prevalence of intrinsic disorder. The intrinsic disorder propensity analysis showed that the different ORF proteins consisted of varying fraction of intrinsic disorder. The protein ORF3 was identified with maximum propensity for intrinsic disorder while the ORF6 protein had the least fraction of intrinsic disorder. The analysis revealed ORF6 as a structured protein (ORDP); ORF1 and ORF4 as moderately disordered proteins (IDPRs); and ORF3 and ORF5 as highly disordered proteins (IDPs). The protein ORF2 was found to be moderately as well as highly disordered using different predictors, thus, was categorized into both IDPR and IDP. Such disordered regions have important roles in pathogenesis and replication of viruses.
- Research Article
70
- 10.1074/jbc.m112.414292
- Oct 1, 2012
- Journal of Biological Chemistry
The longstanding structure-function paradigm, which states that a protein only serves a biological function in a structured state, had to be substantially revised with the description of intrinsic disorder in proteins. Intrinsically disordered regions that undergo a stimulus-dependent disorder-to-order transition are common to a large number of signaling proteins. However, little is known about the functionality of intrinsically disordered regions in plant proteins. Here we investigated intrinsic disorder in a plant-specific remorin protein that has been described as a signaling component in plant-microbe interactions. Using bioinformatic, biochemical, and biophysical approaches, we characterized the highly abundant remorin AtREM1.3, showing that its N-terminal region is intrinsically disordered. Although only the AtREM1.3 C-terminal domain is essential for stable homo-oligomerization, the N-terminal region facilitates this interaction. Furthermore, we confirmed the stable interaction between AtREM1.3 and four isoforms of the importin α protein family in a yeast two-hybrid system and by an in planta bimolecular fluorescent complementation assay. Phosphorylation of Ser-66 in the intrinsically disordered N-terminal region decreases the interaction strength with the importin α proteins. Hence, the N-terminal region may constitute a regulatory domain, stabilizing these interactions.
- Research Article
22
- 10.1074/mcp.m112.023416
- Sep 1, 2013
- Molecular & Cellular Proteomics
Damaged and misfolded proteins that are no longer functional in the cell need to be eliminated. Failure to do so might lead to their accumulation and aggregation, a hallmark of many neurodegenerative diseases. Protein quality control pathways play a major role in the degradation of these proteins, which is mediated mainly by the ubiquitin proteasome system. Despite significant focus on identifying ubiquitin ligases involved in these pathways, along with their substrates, a systems-level understanding of these pathways has been lacking. For instance, as misfolded proteins are rapidly ubiquitylated, unconjugated ubiquitin is rapidly depleted from the cell upon misfolding stress; yet it is unknown whether certain targets compete more efficiently to be ubiquitylated. Using a system-wide approach, we applied statistical and computational methods to identify characteristics enriched among proteins that are further ubiquitylated after heat shock. We discovered that distinct populations of structured and, surprisingly, intrinsically disordered proteins are prone to ubiquitylation. Proteomic analysis revealed that abundant and highly structured proteins constitute the bulk of proteins in the low-solubility fraction after heat shock, but only a portion is ubiquitylated. In contrast, ubiquitylated, intrinsically disordered proteins are enriched in the low-solubility fraction after heat shock. These proteins have a very low abundance in the cell, are rarely encoded by essential genes, and are enriched in binding motifs. In additional experiments, we confirmed that several of the identified intrinsically disordered proteins were ubiquitylated after heat shock and demonstrated for two of them that their disordered regions are important for ubiquitylation after heat shock. We propose that intrinsically disordered regions may be recognized by the protein quality control machinery and thereby facilitate the ubiquitylation of proteins after heat shock.
- Research Article
26
- 10.1016/j.jprot.2015.09.004
- Sep 12, 2015
- Journal of Proteomics
Proteomic and bioinformatic analysis of a nuclear intrinsically disordered proteome
- Research Article
196
- 10.1021/bi060981d
- Aug 15, 2006
- Biochemistry
Evidence that many protein regions and even entire proteins lacking stable tertiary and/or secondary structure in solution (i.e., intrinsically disordered proteins) might be involved in protein-protein interactions, regulation, recognition, and signal transduction is rapidly accumulating. These signaling proteins play a crucial role in the development of several pathological conditions, including cancer. To test the hypothesis that intrinsic disorder is also abundant in cardiovascular disease (CVD), a data set of 487 CVD-related proteins was extracted from SWISS-PROT. CVD-related proteins are depleted in major order-promoting residues (Trp, Phe, Tyr, Ile, and Val) and enriched in some disorder-promoting residues (Arg, Gln, Ser, Pro, and Glu). The application of a neural network predictor of natural disordered regions (PONDR VL-XT) together with cumulative distribution function (CDF) analysis, charge-hydropathy plot (CH plot) analysis, and alpha-helical molecular recognition feature (alpha-MoRF) indicator revealed that CVD-related proteins are enriched in intrinsic disorder. In fact, the percentage of proteins with 30 or more consecutive residues predicted by PONDR VL-XT to be disordered was 57 +/- 4% for CVD-associated proteins. This value is close to that described earlier for signaling proteins (66 +/- 6%) and is significantly larger than the content of intrinsic disorder in eukaryotic proteins from SWISS-PROT (47 +/- 4%) and in nonhomologous protein segments with a well-defined three-dimensional structure (13 +/- 4%). Furthermore, CDF and CH-plot analyses revealed that 120 and 36 CVD-related proteins, respectively, are wholly disordered. This high level of intrinsic disorder could be important for the function of CVD-related proteins and for the control and regulation of processes associated with cardiovascular disease. In agreement with this hypothesis, 198 alpha-MoRFs were predicted in 101 proteins from the CVD data set. 
A comparison of disorder predictions with the experimental structural and functional data for a subset of the CVD-associated proteins indicated good agreement between predictions and observations. Thus, our data suggest that intrinsically disordered proteins might play key roles in cardiovascular disease.
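The charge-hydropathy (CH) plot analysis mentioned in this abstract can be sketched in Python. The abstract gives no formula, so everything below is an assumption brought in for illustration: the Kyte-Doolittle hydropathy scale rescaled to 0 to 1, the mean absolute net charge counted from Lys/Arg and Asp/Glu, and the commonly cited linear boundary ⟨R⟩ = 2.785⟨H⟩ − 1.151, above which a whole protein is predicted to be disordered. The two test sequences are toy peptides, not proteins from the CVD data set.

```python
# Kyte-Doolittle hydropathy values (assumed scale, range -4.5 to 4.5).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Mean hydropathy with each residue value rescaled from [-4.5, 4.5] to [0, 1]."""
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in sequence) / len(sequence)


def mean_net_charge(sequence):
    """Absolute net charge per residue, counting K/R as +1 and D/E as -1."""
    positive = sequence.count("K") + sequence.count("R")
    negative = sequence.count("D") + sequence.count("E")
    return abs(positive - negative) / len(sequence)


def predicted_disordered(sequence):
    """True when the sequence lies above the assumed CH-plot boundary line."""
    boundary = 2.785 * mean_hydropathy(sequence) - 1.151
    return mean_net_charge(sequence) > boundary


# Toy peptides: one highly charged, one strongly hydrophobic.
for peptide in ("EEEEKKKKEE", "IIIIVVVVLL"):
    print(peptide, predicted_disordered(peptide))
```

The sketch only handles the 20 standard one-letter residue codes; any other character raises a `KeyError`.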
- Front Matter
6
- 10.3389/fmolb.2016.00031
- Jul 7, 2016
- Frontiers in Molecular Biosciences
Editorial: Function and Flexibility: Friend or Foe?
- Research Article
263
- 10.1021/cr400713r
- May 15, 2014
- Chemical reviews
Pathological unfoldomics of uncontrolled chaos: intrinsically disordered proteins and human diseases.