Clinical Annotation Reference Templates: a resource for consistent variant annotation

Shawn Yost, Elise Ruark, Nazneen Rahman, Shazia Mahamdallie, Márton Münz, Anthony Renwick

doi:10.12688/wellcomeopenres.14924.1

Open Access

Abstract

Annotating the impact of a variant on a gene is a vital component of genetic medicine and genetic research. Different gene annotations for the same genomic variant are possible, because different structures and sequences for the same gene are available. The clinical community typically use RefSeq NMs to annotate gene variation, which do not always match the reference genome. The scientific community typically use Ensembl ENSTs to annotate gene variation. These match the reference genome, but often do not match the equivalent NM. Often the transcripts used to annotate gene variation are not provided, impeding interoperability and consistency. Here we introduce the concept of the Clinical Annotation Reference Template (CART). CARTs are analogous to the reference genome; they provide a universal standard template so that reference genomic coordinates are consistently annotated at the protein level. Naturally, there are many situations where annotations using a specific transcript, or multiple transcripts, are useful. The aim of the CARTs is not to impede this practice. Rather, the CART annotation serves as an anchor to ensure interoperability between different annotation systems and variant frequency accuracy. Annotations using other explicitly named transcripts should also be provided, wherever useful. We have integrated transcript data to generate CARTs for over 18,000 genes, for both GRCh37 and GRCh38, based on the associated NM and ENST identified through the CART selection process. Each CART has a unique ID and can be used individually or as part of a stable set of templates: CART37A for GRCh37 and CART38A for GRCh38. We have made the CARTs available on the UCSC browser and in different file formats on the Open Science Framework: https://osf.io/tcvbq/. We have also made the CARTtools software we used to generate the CARTs available on GitHub. We hope the CARTs will be useful in helping to drive transparent, stable, consistent, interoperable variant annotation.

Highlights

An integral component of next-generation sequencing (NGS) gene analysis methods is the annotation of variation using the human reference genome as a baseline
Historical gene analysis methods, such as Sanger sequencing, can choose which sequences to use for variant annotation
The same variant annotated on GRCh38 would have genomic coordinates of chr2:73490120C>T, and would be annotated as c.8164C>T; p.Arg2722Ter using NM_015120.4 but c.8161C>T; p.Arg2721Ter in resources using reference genome based transcripts for annotation
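
The arithmetic behind the last highlight can be checked with a short sketch. It is illustrative only: the helper names `parse_c_pos` and `codon_number` are hypothetical, not part of CARTtools, and the parser handles only simple coding substitutions of the form `c.<position><ref>><alt>`. It shows that the two coding positions quoted above differ by three bases, which is exactly one codon and so accounts for the Arg2722 versus Arg2721 numbering.

```python
import re


def parse_c_pos(hgvs_c: str) -> int:
    """Extract the 1-based coding position from a simple substitution, e.g. 'c.8164C>T'."""
    match = re.fullmatch(r"c\.(\d+)[ACGT]>[ACGT]", hgvs_c)
    if match is None:
        raise ValueError(f"not a simple coding substitution: {hgvs_c!r}")
    return int(match.group(1))


def codon_number(c_pos: int) -> int:
    """Return the 1-based codon that contains coding position c_pos."""
    if c_pos < 1:
        raise ValueError("coding positions are 1-based")
    # Positions 1-3 fall in codon 1, positions 4-6 in codon 2, and so on.
    return (c_pos + 2) // 3


# Annotations quoted in the highlight above.
nm_pos = parse_c_pos("c.8164C>T")   # using NM_015120.4
ref_pos = parse_c_pos("c.8161C>T")  # using a reference-genome-based transcript

print(codon_number(nm_pos))   # 2722, matching p.Arg2722Ter
print(codon_number(ref_pos))  # 2721, matching p.Arg2721Ter
print(nm_pos - ref_pos)       # 3 bases, i.e. one codon
```

The same genomic change therefore carries two different protein-level labels depending on the transcript, which is why an annotation is ambiguous unless the transcript is named.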

Summary

Introduction

An integral component of next-generation sequencing (NGS) gene analysis methods is the annotation of variation using the human reference genome as a baseline. The majority of the clinical community, and much of the clinical research community, use RefSeq NM transcripts as baseline sequences for variant annotation[1]. NGS-based gene analyses often use ENST transcripts as the baseline sequences for variant annotation. Given the intrinsic differences in the widely used variant annotation systems, it is essential that the transcripts used for variant calling are transparently provided and stably available. The CARTs aim to provide standard, interoperable, stable gene templates for variant annotation that are based on the reference genome sequence, include the required structural information, and can be used either individually or as a set. We hope the CARTs will be useful in helping to drive transparent, stable, consistent, interoperable variant annotations.

Methods

The reference genome sequence of the above

Findings

Journal: Wellcome Open Research
Publication Date: Nov 14, 2018
License type: CC BY 4.0
