NAStructuralDB: structural database to facilitate computational studies of molecular modeling and recognition of proteins with special focus on antibody–antigen interactions
ABSTRACT Studying the interactions between antibodies and antigens is fundamental to the development of novel therapeutic biologics. Predictions of such interactions start with data collection. Though there exist reliable resources to identify antibody structures in the Protein Data Bank (PDB), such data still requires substantial processing to be usable in predictive tasks. Redundancy in sequences needs to be removed to avoid data leakage between train, test, and validation sets. Descriptors such as surface accessibility, secondary structure, and antibody region information need to be additionally annotated. Information on inter- and intra-molecular contacts, which is crucial to studying paratope/epitope information, needs to be collected. The specialized immunoglobulin format of Nanobodies® requires a separate dataset mirroring that of antibodies, given that their structure contains only a single VHH chain. Because antibody–antigen structures account for a small fraction of all protein–protein contacts, having a molecular contact reference from other proteins is also desirable. To address these issues, we introduce NAStructuralDB (https://naturalantibody.com/na-structural/), a dataset of processed structures of antibodies, Nanobodies®, proteins, and their complexes with molecular contact information and associated annotations. We use the opportunity of having collected the contact data to provide a reference of binding propensities of different residues across distinct contact types.
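To make the idea of inter-molecular contacts and residue binding propensities concrete, here is a minimal sketch. It is not the NAStructuralDB pipeline: the abstract does not state how contacts are defined, so the 5.0 Å any-atom distance cutoff, the toy residue tuples, and the propensity formula (a residue type's share among contacting residues divided by its share among all residues) are all illustrative assumptions.

```python
from collections import Counter
from math import dist

# Hypothetical toy input: (chain_id, residue_name, list of atom xyz coordinates).
residues = [
    ("H", "TYR", [(0.0, 0.0, 0.0)]),   # antibody heavy chain
    ("H", "SER", [(20.0, 0.0, 0.0)]),
    ("A", "ASP", [(3.0, 0.0, 0.0)]),   # antigen chain
    ("A", "SER", [(40.0, 0.0, 0.0)]),
]

def residues_in_contact(a, b, cutoff=5.0):
    """True if any atom pair across the two residues is within `cutoff` angstroms.
    The cutoff value is an assumption, not taken from the source."""
    return any(dist(p, q) <= cutoff for p in a[2] for q in b[2])

def interchain_contacts(residues, cutoff=5.0):
    """Index pairs (i, j), i < j, of contacting residues on different chains."""
    return [
        (i, j)
        for i in range(len(residues))
        for j in range(i + 1, len(residues))
        if residues[i][0] != residues[j][0]
        and residues_in_contact(residues[i], residues[j], cutoff)
    ]

def contact_propensity(residues, contacts):
    """Share of each residue type among contacting residues, divided by its
    share among all residues (one possible definition of propensity)."""
    in_contact = {i for pair in contacts for i in pair}
    if not in_contact:
        return {}
    overall = Counter(r[1] for r in residues)
    interface = Counter(residues[i][1] for i in in_contact)
    return {
        name: (interface[name] / len(in_contact)) / (overall[name] / len(residues))
        for name in overall
    }

contacts = interchain_contacts(residues)
propensity = contact_propensity(residues, contacts)
```

A propensity above 1 means the residue type is over-represented at the interface relative to its overall frequency in this toy set; real analyses would compute this per contact type over many structures.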
- Research Article
33
- 10.1007/978-1-4939-1115-8_8
- Jan 1, 2014
- Methods in molecular biology (Clifton, N.J.)
Antigen-Antibody Interaction Database (AgAbDb) is an immunoinformatics resource developed at the Bioinformatics Centre, University of Pune, and is available online at http://bioinfo.net.in/AgAbDb.htm. Antigen-antibody interactions are a special class of protein-protein interactions that are characterized by high affinity and strict specificity of antibodies towards their antigens. Several co-crystal structures of antigen-antibody complexes have been solved and are available in the Protein Data Bank (PDB). AgAbDb is a derived knowledgebase developed with an objective to compile, curate, and analyze determinants of interactions between the respective antigen-antibody molecules. AgAbDb lists not only the residues of binding sites of antigens and antibodies, but also interacting residue pairs. It also helps in the identification of interacting residues and buried residues that constitute antibody-binding sites of protein and peptide antigens. The Antigen-Antibody Interaction Finder (AAIF), a program developed in-house, is used to compile the molecular interactions, viz. van der Waals interactions, salt bridges, and hydrogen bonds. A module for curating water-mediated interactions has also been developed. In addition, various residue-level features, viz. accessible surface area, data on epitope segment, and secondary structural state of binding site residues, are also compiled. Apart from the PDB numbering, Wu-Kabat numbering and explicit definitions of complementarity-determining regions are provided for residues of antibodies. The molecular interactions can be visualized using the program Jmol. AgAbDb can be used as a benchmark dataset to validate algorithms for prediction of B-cell epitopes. It can as well be used to improve accuracy of existing algorithms and to design new algorithms. AgAbDb can also be used to design mimotopes representing antigens as well as aid in designing processes leading to humanization of antibodies.
- Research Article
4
- 10.3389/fimmu.2023.1269916
- Dec 4, 2023
- Frontiers in Immunology
Antigenic drift is the biggest challenge for mutagenic RNA virus vaccine development. The primary purpose is to determine the IEMM (immune escape mutation map) of 20 amino acids' replacement to reveal the rule of the viral immune escape. To determine the relationship between epitope mutation and immune escape, we use universal protein tags as a linear epitope model. To describe and draw amino acid linkage diagrams, mutations of protein tags are classified into four categories: IEM (immune escape mutation), ADERM (antibody-dependent enhancement risk mutation), EQM (equivalent mutation), and IVM (invalid mutation). To overcome the data limitation, a general antigen-antibody (Ag-Ab) interaction map was constructed by analyzing the published three-dimensional (3D) Ag-Ab interaction patterns. (i) One residue interacts with multiple amino acids in antigen-antibody interaction. (ii) Most amino acid replacements are IVM and EQM. (iii) Once aromatic amino acids replace non-aromatic amino acids, the mutation is often IEM. (iv) Substituting residues with the same physical and chemical properties easily leads to IVM. Therefore, this study has important theoretical significance for future research on antigenic drift, antibody rescue, and vaccine renewal design. The antigenic epitope mutations were typed into IEM, ADERM, EQM, and IVM types to describe and quantify the results of antigenic mutations. The antigen-antibody interaction rule was summarized as a one-to-many interaction rule. To sum up, the epitope mutation rules were defined as IVM and EQM predomination rules and the aryl mutation escape rule.
- Research Article
172
- 10.1038/emboj.2008.8
- Jan 31, 2008
- The EMBO Journal
Protein kinase autophosphorylation of activation segment residues is a common regulatory mechanism in phosphorylation-dependent signalling cascades. However, the molecular mechanisms that guarantee specific and efficient phosphorylation of these sites have not been elucidated. Here, we report on three novel and diverse protein kinase structures that reveal an exchanged activation segment conformation. This dimeric arrangement results in an active kinase conformation in trans, with activation segment phosphorylation sites in close proximity to the active site of the interacting protomer. Analytical ultracentrifugation and chemical cross-linking confirmed the presence of dimers in solution. Consensus substrate sequences for each kinase showed that the identified activation segment autophosphorylation sites are non-consensus substrate sites. Based on the presented structural and functional data, a model for specific activation segment phosphorylation at non-consensus substrate sites is proposed that is likely to be common to other kinases from diverse subfamilies.
- Research Article
53
- 10.1016/j.str.2007.10.019
- Jan 1, 2008
- Structure
Complementary Structural Mass Spectrometry Techniques Reveal Local Dynamics in Functionally Important Regions of a Metastable Serpin
- Research Article
18
- 10.1093/bib/bbx084
- Aug 2, 2017
- Briefings in Bioinformatics
Major scientific challenges that are beyond the capability of individuals need to be addressed by multi-disciplinary and multi-institutional consortia. Examples of these endeavours include the Human Genome Project, and more recently, the Structural Genomics (SG) initiative. The SG initiative pursues the expansion of structural coverage to include at least one structural representative for each protein family to derive the remaining structures using homology modelling. However, biological function is inherently connected with protein dynamics that can be studied by knowing different structures of the same protein. This ensemble of structures provides snapshots of protein conformational diversity under native conditions. Thus, sequence redundancy in the Protein Data Bank (PDB) (i.e. crystallization of the same protein under different conditions) is an essential input contributing to experimentally based studies of protein dynamics and providing insights into protein function. In this work, we show that sequence redundancy, a key concept for exploring protein dynamics, is highly biased and fundamentally incomplete in the PDB. Additionally, our results show that dynamical behaviour of proteins cannot be inferred using homologous proteins. Minor to moderate changes in sequence can produce great differences in dynamical behaviour. Nonetheless, the structural and dynamical incompleteness of the PDB are apparently unrelated concepts in SG. While the first could be reversed by promoting the extension of the structural coverage, we would like to emphasize that further focused efforts will be needed to amend the incompleteness of the PDB in terms of dynamical information content, essential to fully understand protein function.
- Research Article
1
- 10.4049/jimmunol.210.supp.75.21
- May 1, 2023
- The Journal of Immunology
COVID-19 and SARS-CoV-2 variants continue to threaten human health and life worldwide. Thousands of structures related to SARS-CoV-2 have been rapidly determined, either by X-ray crystallography or cryo-EM, and deposited in the Protein Data Bank (PDB) since 2020. Here, we systematically investigated the structures of 302 antibodies and 78 nanobodies in complex with the spike protein or the receptor binding domain (RBD) of SARS-CoV-2. We identified 23 common epitopic sites (ES) on the RBD surface and revealed the vital role of the complementarity-determining region (CDR) loops in recognizing epitopes. The 23 ES are characterized according to secondary structure features and accessible surface area. About 75% of the total RBD surface area could be accessed by the elicited antibodies, while the CDR3 loops occupied 50% of the contact surface. The analysis of paratope-epitope (antibody-antigen) interactions based on these epitope sites revealed the features associated with potent neutralization of the virus and the distinctive amino acid usage of the antibodies. The clustering analysis of the antibody biophysical properties and surface area availabilities on the RBD identified many binding motifs. Remarkably, most variants of concern (VOC) escape mutations, including those of Omicron, occur within these 23 ES. This analysis not only explains the differential ability of antibodies to recognize epitope sites on the RBD surface but also offers predictive guidelines for understanding the role that accumulated SARS mutations might play in escape from antibodies elicited by immunogens. This structural characterization of the epitope-paratope interactions provides insights for structure-based vaccine design and for therapeutic strategies and drugs against future viruses.
- Research Article
12
- 10.1002/prot.22719
- Mar 18, 2010
- Proteins: Structure, Function, and Bioinformatics
Solution structure of the C-terminal DUF1000 domain of the human thioredoxin-like 1 protein
- Research Article
4
- 10.1016/j.cpc.2012.04.019
- Apr 28, 2012
- Computer Physics Communications
ARVO-CL: The OpenCL version of the ARVO package — An efficient tool for computing the accessible surface area and the excluded volume of proteins via analytical equations
- Research Article
10
- 10.1111/j.1582-4934.2002.tb00192.x
- Apr 1, 2002
- Journal of cellular and molecular medicine
Because the HIV-1 protease (HIV-1 PR) presents a high mutation rate in vivo, we performed a comparative study of the energetic behavior of the wild-type HIV-1 PR and four types of mutants: Val82/Asn; Val82/Asp; Gln7/Lys, Leu33/Ile, Leu63/Ile; Ala71/Thr, Val82/Ala. We suggest that the energetic fluctuations (electrostatic, van der Waals and torsion energy) of the mutants and the solvent accessible surface (SAS) values can be useful to explain the viral resistance process developed by HIV-1 PR. The number and localization of enzyme mutations induce important modifications of the van der Waals and torsional energy, while the electrostatic energy shows an insignificant fluctuation. We showed that the viral resistance can be explored if the solvent accessible surfaces of the active site for the mutant structures are calculated. In this paper we have obtained the solvent accessible surface for a group of 15 mutants (11 mutants obtained from Protein Data Bank (PDB) files, 4 mutants modeled with the CHARMM software) and for the wild-type HIV-1 PR. Our study tries to show that the number and localization of the mutations are factors which induce the HIV-1 PR viral resistance. The largest solvent accessible surface was recorded for the point mutant Val82/Phe.
- Research Article
7
- 10.1002/prot.20924
- Feb 10, 2006
- Proteins: Structure, Function, and Bioinformatics
The New York Structural GenomiX Research Consortium (NYSGXRC) has selected the protein coded by yxaF gene from Bacillus subtilis as a target for structure determination. The yxaF protein has 191 residues with a molecular mass of 21 kDa and had no sequence homology to any structure in the Protein Data Bank (PDB) at the time of target selection. We aimed to elucidate the three-dimensional structure for the putative protein yxaF to better understand the relationship between protein sequence, structure, and function. This protein is annotated as a putative helix-turn-helix (HTH) type transcriptional regulator. Many transcriptional regulators like TetR and QacR use a structurally well-defined DNA-binding HTH motif to recognize the target DNA sequences. DNA-HTH motif interactions have been extensively studied. As the HTH motif is structurally conserved in many regulatory proteins, these DNA-protein complexes show some similarity in DNA recognition patterns. Many such regulatory proteins have a ligand-binding domain in addition to the DNA-binding domain. Structural studies on ligand-binding regulatory proteins provide a wealth of information on ligand-, and possibly drug-, binding mechanisms. Understanding the ligand-binding mechanism may help overcome problems with drug resistance, which represent increasing challenges in medicine. The protein encoded by yxaF, hereafter called T1414, shows fold similar to QacR repressor and TetR/CamR repressor and possesses putative DNA and ligand-binding domains. Here, we report the crystal structure of T1414 and compare it with structurally similar drug and DNA-binding proteins.
- Research Article
43
- 10.1023/a:1016320106741
- Feb 1, 2002
- Journal of Computer-Aided Molecular Design
Molecular modeling methodologies such as molecular docking, pharmacophore modeling, and 3D-QSAR rely on conformational searches of small molecules as a starting point. All of these methodologies seek conformations of the small molecules as they bind to target proteins, i.e., their active conformations. Thus the question as to whether active conformations can be separated from inactive conformations is extremely relevant. In this paper, 3D-descriptors that separate random conformations from active conformations of small molecules are sought. To select appropriate descriptors, 65 protein-ligand complexes were taken from the Protein Data Bank. For each ligand the active conformation was compared to randomly generated low energy conformations. Descriptors such as solvent accessible surface area, number of internal interactions and radius of gyration appear to be useful for separating the active conformations from the random conformations. The results with all these descriptors indicate that active conformations are less compact than random conformations, i.e., they have more solvent accessible surface area, fewer internal interactions and a larger radius of gyration than random conformations. Thus these descriptors could be useful as weights to bias conformational search procedures to conformations more likely to bind to proteins or as filters to eliminate conformations unlikely to bind to any protein.
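As an illustration of one of the compactness descriptors named above, the following sketch computes a radius of gyration from atomic coordinates. It assumes equal atomic masses (an unweighted, purely geometric variant) and uses invented toy coordinates; the abstract does not specify how the authors computed the descriptor.

```python
from math import sqrt

def radius_of_gyration(coords):
    """Root-mean-square distance of the atoms from their centroid.
    Assumes equal masses for all atoms (a simplification)."""
    n = len(coords)
    cx = sum(x for x, _, _ in coords) / n
    cy = sum(y for _, y, _ in coords) / n
    cz = sum(z for _, _, z in coords) / n
    mean_sq = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords
    ) / n
    return sqrt(mean_sq)

# Hypothetical toy conformations of the same three-atom fragment.
extended = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
compact = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

rg_extended = radius_of_gyration(extended)
rg_compact = radius_of_gyration(compact)
```

The extended toy conformation yields the larger value, which matches the direction of the trend the abstract reports for active versus random conformations.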
- Research Article
42
- 10.1074/mcp.m500095-mcp200
- Sep 16, 2005
- Molecular & Cellular Proteomics
The structural allostery and binding interface for the human serum transferrin (Tf)-transferrin receptor (TfR) complex were identified using radiolytic footprinting and mass spectrometry. We have determined previously that the transferrin C-lobe binds to the receptor helical domain. In this study we examined the binding interactions of full-length transferrin with receptor and compared these data with a model of the complex derived from cryoelectron microscopy (cryo-EM) reconstructions (Cheng, Y., Zak, O., Aisen, P., Harrison, S. C. & Walz, T. (2004) Structure of the human transferrin receptor-transferrin complex. Cell 116, 565-576). The footprinting results provide the following novel conclusions. First, we report characteristic oxidations of acidic residues in the C-lobe of native Tf and basic residues in the helical domain of TfR that were suppressed as a function of complex formation; this confirms ionic interactions between these protein segments as predicted by cryo-EM data and demonstrates a novel method for detecting ion pair interactions in the formation of macromolecular complexes. Second, the specific side-chain interactions between the C-lobe and N-lobe of transferrin and the corresponding interaction sites on the transferrin receptor predicted from cryo-EM were confirmed in solution. Last, the footprinting data revealed allosteric movements of the iron binding C- and N-lobes of Tf that sequester iron as a function of complex formation; these structural changes promote tighter binding of the metal ion and facilitate efficient ion transport during endocytosis.
- Research Article
1
- 10.1111/j.1751-2824.2008.00184.x
- May 9, 2008
- ISBT Science Series
Immunology
- Research Article
2
- 10.1371/journal.pone.0205052
- Dec 11, 2018
- PloS one
The principle of three-dimensional protein structure formation is a long-standing conundrum in structural biology. A globular domain of a soluble protein is formed by a network of atomic contacts among amino acid residues, but regions without intramolecular non-local contacts are often observed in the protein structure, especially in loop, linker, and peripheral segments with secondary structures. Although these regions can play key roles for protein function as interfaces for intermolecular interactions, their nature remains unclear. Here, we termed protein segments without non-local contacts as floating segments and sought them in tens of thousands of entries in the Protein Data Bank. As a result, we found that 0.72% of residues are in floating segments. Regarding secondary structural elements, coil structures are enriched in floating segments, especially for long segments. Interactions with polypeptides and polynucleotides, but not chemical compounds, are enriched in floating segments. The amino acid preferences of floating segments are similar to those of surface residues, with exceptions; the small side chain amino acids, Gly and Ala, are preferred, and some charged side chains, Arg and His, are disfavored for floating segments compared to surface residues. Our comprehensive characterization of floating segments may provide insights into understanding protein sequence-structure-function relationships.
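The notion of a segment without non-local contacts can be sketched in a few lines. The abstract does not give the thresholds the authors used, so the values below are assumptions chosen only for illustration: one representative atom per residue, a contact distance cutoff of 8.0 Å, and a minimum sequence separation of 5 residues for a contact to count as non-local. The hairpin coordinates are invented toy data.

```python
from math import dist

def has_nonlocal_contact(coords, i, cutoff=8.0, min_sep=5):
    """True if residue i is within `cutoff` of any residue at least `min_sep`
    positions away in sequence. Both thresholds are assumptions."""
    return any(
        abs(i - j) >= min_sep and dist(coords[i], coords[j]) <= cutoff
        for j in range(len(coords))
    )

def floating_segments(coords, cutoff=8.0, min_sep=5):
    """Maximal runs (start, end), inclusive, of residues lacking non-local contacts."""
    segments, start = [], None
    for i in range(len(coords)):
        if not has_nonlocal_contact(coords, i, cutoff, min_sep):
            if start is None:
                start = i          # open a new run
        elif start is not None:
            segments.append((start, i - 1))  # close the current run
            start = None
    if start is not None:
        segments.append((start, len(coords) - 1))
    return segments

# Hypothetical toy hairpin: two antiparallel strands 6 angstroms apart,
# with one representative atom per residue spaced 3.8 angstroms along each strand.
hairpin = [(3.8 * i, 0.0, 0.0) for i in range(5)] + [
    (3.8 * (9 - i), 6.0, 0.0) for i in range(5, 10)
]

turn = floating_segments(hairpin)
```

In this toy hairpin only the residues around the turn lack non-local contacts, because their nearest spatial neighbours are also close in sequence.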
- Research Article
14
- 10.1093/nar/gkz358
- May 9, 2019
- Nucleic Acids Research
The study of contact residues and interfacial waters of antibody–antigen (Ab-Ag) structures could help in understanding the principles of antibody–antigen interactions as well as provide guidance for designing antibodies with improved affinities. Given the rapid pace with which new antibody–antigen structures are deposited in the protein databank (PDB), it is crucial to have computational tools to analyze contact residues and interfacial waters, and investigate them at different levels. In this study, we have developed AppA, a web server that can be used to analyze and compare 3D structures of contact residues and interfacial waters of antibody–antigen complexes. To the best of our knowledge, this is the first web server for antibody–antigen structures equipped with the capability for dissecting the contributions of interfacial water molecules, hydrogen bonds, hydrophobic interactions, van der Waals interactions and ionic interactions at the antibody–antigen interface, and for comparing the structures and conformations of contact residues. Various examples showcase the utility of AppA for such analyses and comparisons that could help in the understanding of antibody–antigen interactions and suggest mutations of contact residues to improve affinities of antibodies. The AppA web server is freely accessible at http://mspc.bii.a-star.edu.sg/minhn/appa.html.