Abstract

Advancing our understanding of the roles that glycosylation plays in development and disease is frequently hindered by the diversity of the data that must be integrated to gain insight into these complex phenomena. GlyGen is a maturing initiative, supported by the NIH Common Fund, with the goal of democratizing glycoscience research by developing and implementing a comprehensive data repository that integrates diverse types of data, including glycan structures, glycan biosynthesis enzymes, glycoproteins, and three‐dimensional glycoprotein structures, along with genomic and proteomic knowledge. To achieve the highest possible integration and impact, GlyGen has established international collaborations with database providers from different domains (including but not limited to EBI, NCBI, PDB, GlyTouCan and UniCarbKB) and with glycoscience researchers. Information from these resources and groups is standardized and cross‐linked to allow queries across multiple domains. To facilitate easy access to this information, an intuitive, web‐based interface (http://glygen.org/) has been developed to visually represent the data and the connections between datasets. In addition to the browser‐based interface, we are also developing RESTful webservice‐based APIs and SPARQL endpoints, allowing programmatic access to the integrated datasets. For each glycan and glycoprotein in the dataset, GlyGen provides a details page that integrates the available information. Individual details pages are interlinked, allowing easy data exploration across multiple domains. For example, users can browse from the webpage of a glycosylated protein to the glycan structures that have been reported to be attached to this protein and, from there, to other proteins that carry the same glycan. All information accessed through GlyGen is linked back to its original sources, allowing users to browse through information pages in other resources.
Our goal is to provide scientists with an easy way to access the complex information underlying state‐of‐the‐art knowledge that describes the biology of glycans and glycoproteins. To maximize the usefulness of the GlyGen resource to the broadest possible user population, we have implemented a query interface that suggests likely “use‐case” questions. For instance, a question such as “What are the enzymes involved in the biosynthesis of glycan X in humans?” can be posed simply by providing minimal information (e.g. glycan X in the example). Answers are then returned following the completion of a query across multiple datasets and domains; the complexity of the query remains invisible to the user. Importantly, the use‐case questions were developed as a result of outreach to potential users through community meetings and directed contacts. To schedule a demo of GlyGen or to add your data to GlyGen, contact Michael Tiemeyer (mtiemeyer@ccrc.uga.edu) or Raja Mazumder (mazumder@gwu.edu).

Support or Funding Information

Supported by NIH Common Fund U01 GM124267‐01
