Abstract

Aim: Next-generation sequencing (NGS) of HLA genes has greatly reduced the number of ambiguous genotypes compared with older typing methods (Sanger, SSO, SSP). However, ambiguity remains with NGS if the gene is not phased or is not sequenced in its entirety. Because of systematic biases, NGS analysis programs do not always detect and report all alternative allele combinations. We developed a stand-alone tool to identify ambiguous HLA allele combinations.

Methods: The exons of HLA genes are typically shorter than 300 bases and are easy to phase with NGS. However, the spacing of polymorphisms within a pair of alleles is variable, and phasing may not be possible. Gene feature enumeration was used to identify similarities among the alleles of each HLA gene in the IMGT/HLA database: alleles are separated into distinct features (untranslated, exonic and intronic regions), and each unique sequence for a given feature in a gene is assigned a numeric identifier. The program then interrogates a user-selected allele combination and feature set to determine whether any other pair of alleles has the same enumerated features in an unphased context.

Results: E-BAG (Enumeration Based Ambiguous Genotypes), a web-based tool, was built for technicians to use during routine NGS analysis. The interface offers auto-completion of allele names (up to the 4th field), an option to show or hide intronic ambiguity, and automatic updates for each IMGT/HLA database release. We find the E-BAG tool most useful for genes that are not fully sequenced (DRB1 and DPB1), for genes where exon shuffling is common (DPB1), and when full phasing is not achieved (any gene). For example, the allele combination DPB1∗05:01:01 + DPB1∗13:01:01, if sequenced in exons 2–5 and unphased, is ambiguous with two other allele combinations as of IMGT/HLA v. 3.27: DPB1∗05:01:01 + DPB1∗107:01 (exon 1 not sequenced) and DPB1∗135:01 + DPB1∗519:01 (shuffling of exon 4). Both are detected and reported by the E-BAG tool.

Conclusions: The E-BAG tool for ambiguity detection has proven to be very useful in our laboratory and has reduced analysis time, particularly for DPB1. The approach is applicable to any gene in the IMGT/HLA database and to any set of features, and it protects against ambiguous allele pairs that NGS analysis programs fail to identify, especially given the rate at which new HLA alleles are published.
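
The enumeration and unphased comparison described in the Methods can be illustrated with a minimal Python sketch. This is not the E-BAG implementation: the allele names (`TOY*01` to `TOY*05`), the sequences, and the function names `enumerate_features`, `unphased_signature` and `ambiguous_pairs` are all hypothetical and invented for illustration, and nothing here is IMGT/HLA data. The sketch assumes that "unphased" means that, for each feature, only the unordered pair of feature identifiers is known, not which allele carries which.

```python
from itertools import combinations_with_replacement

# Hypothetical toy data (NOT IMGT/HLA data): allele -> feature -> sequence.
TOY_SEQUENCES = {
    "TOY*01": {"exon1": "ATG", "exon2": "AAA", "exon3": "CCC", "exon4": "GGG"},
    "TOY*02": {"exon1": "ATG", "exon2": "AAT", "exon3": "CCT", "exon4": "GGT"},
    # Differs from TOY*02 only in exon1.
    "TOY*03": {"exon1": "ATC", "exon2": "AAT", "exon3": "CCT", "exon4": "GGT"},
    # TOY*01 and TOY*02 with their exon4 sequences swapped ("shuffled").
    "TOY*04": {"exon1": "ATG", "exon2": "AAA", "exon3": "CCC", "exon4": "GGT"},
    "TOY*05": {"exon1": "ATG", "exon2": "AAT", "exon3": "CCT", "exon4": "GGG"},
}


def enumerate_features(sequences):
    """Assign a numeric ID to each unique sequence of each feature."""
    ids = {}    # feature -> {sequence: numeric ID}
    table = {}  # allele -> {feature: numeric ID}
    for allele in sorted(sequences):
        table[allele] = {}
        for feature, seq in sequences[allele].items():
            seen = ids.setdefault(feature, {})
            # Reuse the ID of a known sequence, otherwise take the next free one.
            table[allele][feature] = seen.setdefault(seq, len(seen) + 1)
    return table


def unphased_signature(table, pair, features):
    """For each feature, the sorted pair of IDs carried by the two alleles."""
    a, b = pair
    return tuple(tuple(sorted((table[a][f], table[b][f]))) for f in features)


def ambiguous_pairs(table, pair, features):
    """All other allele pairs with the same unphased signature as `pair`."""
    target = unphased_signature(table, pair, features)
    query = tuple(sorted(pair))
    return [
        cand
        for cand in combinations_with_replacement(sorted(table), 2)
        if cand != query and unphased_signature(table, cand, features) == target
    ]


table = enumerate_features(TOY_SEQUENCES)
partial = ambiguous_pairs(table, ("TOY*01", "TOY*02"), ["exon2", "exon3", "exon4"])
full = ambiguous_pairs(table, ("TOY*01", "TOY*02"), ["exon1", "exon2", "exon3", "exon4"])
print(partial)
print(full)
```

In this toy setting, leaving exon 1 out of the feature set makes `TOY*01 + TOY*03` indistinguishable from the query pair, while `TOY*04 + TOY*05` stays ambiguous even when every feature is included, because the exon 4 swap cannot be resolved without phasing. These are the same two mechanisms as in the DPB1 example above (an unsequenced exon and a shuffled exon).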
