Abstract

In the past, identification of HLA alleles was limited to sequencing the region of the gene that codes for the peptide-binding groove. The resulting lack of sequence information in the HLA database poses a challenge for HLA allele assignment software. We investigated full-length sequences of 19 HLA class I and 7 HLA class II alleles, and we extended another 47 HLA class I alleles with 5′ and 3′ UTR sequences, none of which were yet available in the IPD-IMGT/HLA database. We resolved 8638 unknown nucleotides in the coding sequence of HLA class I and 2139 in that of HLA class II. Furthermore, full-length sequencing of the 26 alleles added more than 90 kb of sequence information to the non-coding sequences, whereas extension of the 47 alleles added 5.5 kb of previously unknown nucleotides to the 5′ UTR and > 31.7 kb to the 3′ UTR. This information revealed some interesting features, such as possible recombination events and the evolutionary origins of lineages. The continuing increase in the availability of full-length sequences in the HLA database will enable the identification of evolutionary origins and will help the community improve the alignment and assignment accuracy of HLA alleles.
