Abstract

Aim: HLA typing via next-generation sequencing (NGS) identifies polymorphisms in introns and untranslated regions (UTRs), as well as many novel synonymous and non-coding HLA sequence variants. NGS therefore poses challenges for applying HLA nomenclature to sequence variants in non-exonic gene regions. To maintain the utility of the IMGT/HLA Database, full-gene sequences (all coding and non-coding regions) are required for the assignment of allele names to non-exon sequences, but not all NGS HLA typing methods generate full-gene sequences. Given this limitation, we have developed the “feature service”, a free, public web service that accepts the submission of pre-curated sequences for individual features of HLA and KIR genes.

Methods: The feature service is available at http://feature.nmdp-bioinformatics.org and allows full or partial HLA and KIR consensus sequences to be processed, accessioned and persisted, so that each unique sequence for a particular locus, term (exon, intron, UTR, etc.) and rank (1, 2, 3, etc.) is assigned a unique identifier for analysis. Rank distinguishes features identified with the same term. For example, HLA-A exon 1 is defined as locus = HLA-A, term = exon, rank = 1; HLA-A exon 2 uses the same locus and term identifiers, but is assigned rank = 2; HLA-A intron 2 uses the same locus and rank as exon 2, but is assigned term = intron, etc. The service uses JSON POST and GET operations to allow rapid, automated sequence data submission and retrieval, and any Gene Ontology term can be submitted as a feature service term.

Results: The service has been populated with 47,253 unique exon, intron and UTR feature sequences for the 14,473 alleles in IMGT/HLA Database version 3.24.0 and 8,192 unique feature sequences for the 753 alleles in IPD-KIR Database version 2.6.1. Features representing full-gene NGS HLA genotyping for 25,000 samples generated by two independent laboratories have been submitted as well.

Conclusions: By identifying the locus, term, rank and accession number for the gene feature sequences of each allele, the gene sequences of known and novel alleles can be accurately described in the absence of HLA nomenclature. Sharing HLA and KIR gene sequences in this way allows them to be applied for clinical and research purposes prior to curation by the IPD databases.
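The locus, term, rank and accession scheme described in the Methods can be illustrated with a minimal in-memory sketch. It is not the feature service's implementation or API: the names `FeatureKey`, `FeatureRegistry` and `to_payload`, the JSON field names, the sequential per-feature accession numbering, and the toy nucleotide strings are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureKey:
    """Hypothetical key for one gene feature: locus, term and rank."""
    locus: str  # e.g. "HLA-A"
    term: str   # e.g. "exon", "intron"
    rank: int   # distinguishes features sharing a term (exon 1 vs exon 2)


class FeatureRegistry:
    """Hypothetical stand-in for accessioning: each unique sequence
    for a given (locus, term, rank) receives its own identifier."""

    def __init__(self):
        self._accessions = {}  # (key, sequence) -> accession number
        self._next = {}        # key -> next unused accession number

    def accession(self, key, sequence):
        seq = sequence.upper()  # assumption: sequence case is not significant
        if (key, seq) not in self._accessions:
            number = self._next.get(key, 1)  # assumption: numbering starts at 1
            self._accessions[(key, seq)] = number
            self._next[key] = number + 1
        return self._accessions[(key, seq)]


def to_payload(key, sequence):
    """Build a JSON body for a submission; field names are illustrative only."""
    return json.dumps({"locus": key.locus, "term": key.term,
                       "rank": key.rank, "sequence": sequence})


registry = FeatureRegistry()
exon2 = FeatureKey("HLA-A", "exon", 2)
intron2 = FeatureKey("HLA-A", "intron", 2)  # same locus and rank, different term

# Toy sequences, not real HLA-A features.
first = registry.accession(exon2, "GCTCCCACTCCATGAGG")
repeat = registry.accession(exon2, "gctcccactccatgagg")  # same sequence again
variant = registry.accession(exon2, "GCTCTCACTCCATGAGG")  # one base differs
other = registry.accession(intron2, "GTGAGTGCGGGGTCGGG")

print(first, repeat, variant, other)
print(to_payload(exon2, "GCTCCCACTCCATGAGG"))
```

Under these assumptions, resubmitting a known sequence returns its existing identifier, while a variant of the same feature receives a new one. An allele can then be described as a list of (locus, term, rank, accession) tuples, one per feature, without reference to an allele name.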
