VarMap: a web tool for mapping genomic coordinates to protein sequence and structure and retrieving protein structural annotations.

James D Stephenson, Matthew E Hurles, Andrew Nightingale, Janet M Thornton, Roman A Laskowski

doi:10.1093/bioinformatics/btz482

Open Access

Abstract

Motivation: Understanding the protein structural context and patterning on proteins of genomic variants can help to separate benign from pathogenic variants and reveal molecular consequences. However, mapping genomic coordinates to protein structures is non-trivial, complicated by alternative splicing and transcript evidence.

Results: Here we present VarMap, a web tool for mapping a list of chromosome coordinates to canonical UniProt sequences and associated protein 3D structures, including validation checks, and annotating them with structural information.

Availability and implementation: https://www.ebi.ac.uk/thornton-srv/databases/VarMap.

Supplementary information: Supplementary data are available at Bioinformatics online.

Highlights

The consequence of variants affecting protein sequence depends on the structural context and chemical environment
One of the transcripts is identified as the ‘RefSeq Select transcript’, chosen according to criteria described by NCBI
The information provided by VarMap could be obtained manually using the following existing tools and databases: Ensembl (Cunningham et al, 2019), VEP (McLaren et al, 2016), UniProt (UniProt, 2019), SWISS-PROT (Boutet et al, 2007), BioMart (Kinsella et al, 2011), HGNC (Braschi et al, 2019), CATH (Dawson et al, 2017), Pfam (El-Gebali et al, 2019), M-CSA (Ribeiro et al, 2018), FASTA (Pearson, 2014), PDBsum (Laskowski et al, 2018), ScoreCons (Valdar, 2002), gnomAD (Lek et al, 2016) and ClinVar (Landrum et al, 2018)

Summary

Introduction

The consequence of variants affecting protein sequence depends on the structural context and chemical environment. Understanding these elements can both uncover the biochemical consequences of the change and identify ‘hot spots’ where several variants from different individuals occur within close spatial proximity in the same protein. To benefit from the added information that 3D protein structures can provide, an accurate mapping between genomic coordinates and the corresponding protein sequence and structure is required. Alternative splicing makes this mapping non-trivial, because a single gene can give rise to several transcripts. One of the transcripts is identified as the ‘RefSeq Select transcript’, chosen according to criteria described by NCBI (O’Leary et al, 2016), and has a corresponding protein sequence. As the translated RefSeq Select and canonical UniProt sequences are independently derived, they often differ [in 18% of cases in the ClinVar database (Landrum et al, 2018) (Fig. 1C)], which results in different numbering of the residues.
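The numbering mismatch between two independently derived protein sequences can be illustrated with a minimal sketch. It is not VarMap's implementation: the two sequences are hypothetical toy strings, the function name `residue_number_map` is invented for illustration, and Python's `difflib.SequenceMatcher` stands in for a proper sequence aligner.

```python
from difflib import SequenceMatcher
from typing import Dict


def residue_number_map(seq_a: str, seq_b: str) -> Dict[int, int]:
    """Map 1-based residue numbers in seq_a to 1-based numbers in seq_b.

    Only residues inside identical matching blocks are mapped; residues in
    inserted, deleted, or substituted regions are left out of the result.
    """
    matcher = SequenceMatcher(None, seq_a, seq_b, autojunk=False)
    mapping: Dict[int, int] = {}
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            # block.a and block.b are 0-based start indices of the match
            mapping[block.a + offset + 1] = block.b + offset + 1
    return mapping


# Hypothetical toy sequences (not real proteins): the second one lacks the
# three-residue segment "GHI" that is present in the first.
transcript_protein = "MKTAYGHIAKQRQIS"
canonical_protein = "MKTAYAKQRQIS"

mapping = residue_number_map(transcript_protein, canonical_protein)

# Residues before the missing segment keep their numbers; residues after it
# are shifted by three; residues 6-8 have no counterpart.
for position in (5, 6, 9, 15):
    print(position, "->", mapping.get(position))
```

In this toy case a variant reported at residue 9 of the first sequence corresponds to residue 6 of the second, so any structural annotation has to be looked up under the translated number rather than the original one.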

Journal: Bioinformatics
Publication Date: Jun 13, 2019
Citations: 50
License type: CC BY 4.0
