Towards computational prediction of the heparan sulfate interactome

Nehru Viji Sankaranarayanan, Umesh R Desai

doi:10.1096/fasebj.2019.33.1_supplement.800.5

Abstract

Glycosaminoglycans (GAGs) are linear polysaccharides with repeating disaccharide units. GAGs interact with numerous proteins to regulate various physiological and pathological processes such as hemostasis, cell adhesion, growth factor signaling, coagulation, viral invasion and protease regulation. Heparin (H) and heparan sulfate (HS), members of the GAG superfamily, are composed of alternating glucosamine (GlcNp) and uronic acid (UAp) residues (either glucuronic acid (GlcAp) or iduronic acid (IdoAp)) joined by 1→4 linkages and incompletely modified through sulfation, acetylation and epimerization reactions. These modifications can produce 48 distinct disaccharides, of which 23 have been found in nature to date. Further, the IdoAp residue can exist in multiple conformations, especially 1C4 and 2SO, which interconvert easily in solution and so expand the structural possibilities. Thus, combinatorial arrangements of the configurational and conformational variations possible at the monosaccharide level generate millions of distinct HS sequences. Yet only one sequence, the heparin pentasaccharide that binds to antithrombin with high affinity, has found therapeutic application. Surely, a number of other HS sequences bind to proteins and modulate their functions. To date, these interactions have been studied in silos, which has led to a rather poor understanding of the HS interactome.
Elucidating the HS interactome is challenging. An analysis of gene sequences has led to the prediction that 435 human proteins could interact with heparin. Yet whether these interactions are specific, and therefore physiologically relevant, remains to be understood. We have developed a computational strategy called Combinatorial Virtual Library Screening (CVLS) to identify sequences that display a high level of specificity in binding to protein targets.
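To make the combinatorial argument above concrete, the following is a minimal arithmetic sketch, not part of the CVLS tool. It assumes, purely for illustration, that any disaccharide can follow any other in a chain, and it ignores both biosynthetic constraints and IdoAp ring conformers. The function name `library_size` is hypothetical; the counts 48 and 23 are the ones quoted in the abstract.

```python
# Illustrative arithmetic only. Assumption: every disaccharide position is
# chosen independently, so biosynthetic constraints and IdoAp conformers
# (1C4 / 2SO) are ignored.

THEORETICAL_DISACCHARIDES = 48  # distinct disaccharides possible (from the abstract)
NATURAL_DISACCHARIDES = 23      # disaccharides found in nature to date (from the abstract)


def library_size(n_building_blocks: int, n_disaccharides: int) -> int:
    """Count linear sequences made of n_disaccharides independent positions."""
    if n_building_blocks < 1 or n_disaccharides < 1:
        raise ValueError("both arguments must be positive integers")
    return n_building_blocks ** n_disaccharides


if __name__ == "__main__":
    # Chain lengths from a disaccharide (1 unit) to an octasaccharide (4 units).
    for units in range(1, 5):
        theoretical = library_size(THEORETICAL_DISACCHARIDES, units)
        natural = library_size(NATURAL_DISACCHARIDES, units)
        print(f"{2 * units}-mer: {theoretical:>9,} theoretical, {natural:>7,} natural")
```

Under these simplifying assumptions, an octasaccharide built from the 48 theoretical disaccharides already yields 48^4 = 5,308,416 sequences, which is consistent with the "millions" of sequences mentioned above. Restricting to the 23 naturally observed disaccharides gives 23^4 = 279,841.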
The CVLS algorithm uses a two‐step approach in which in silico ‘affinity’ and in silico ‘specificity’ of interaction are the key drivers of analysis. The algorithm evaluates all possible GAG sequences binding to the predicted site of GAG binding on each target protein to identify the sequence with the highest specificity for that target. We have now validated the application of the CVLS algorithm for understanding GAG recognition of several proteins, including antithrombin (AT), heparin cofactor II (HCII), fibroblast growth factor‐2 and its receptor (FGF‐2/FGFR1), transforming growth factor β2 (TGFβ2), thrombin, histone acetyltransferase p300, chemokine CXCL13 and human neutrophil elastase. We expect that our CVLS tool, which can help experimental biologists identify the key GAG sequence(s) to study, will be especially useful in elucidating the HS interactome.

Support or Funding Information

NIH grant HL107152 to URD. Computational facility provided by National Center for Research Resources to Virginia Commonwealth University through grant S10 RR027411.

This abstract is from the Experimental Biology 2019 Meeting. There is no full text article associated with this abstract published in The FASEB Journal.
