Abstract

Aim: To use our region-specific extraction (RSE) targeted DNA capture methodology to generate large, contiguous DNA fragments (5–60 Kbp) from the MHC for sequencing on the PacBio RS II single-molecule real-time (SMRT) sequencing platform, producing long sequenced reads (10–15 Kbp) for de novo assembly and characterization of the MHC. This combination of technologies produces sequenced reads of up to 60 Kbp that may eventually enable the construction of large, phased haplotype blocks and haplotype-resolved de novo assembly of the MHC.

Methods: Genomic DNA from the homozygous cell line COX, which has a fully characterized MHC haplotype, was enriched for 4 Mbp of the MHC (chr6: 29618227–33618227) using the RSE DNA capture methodology [1]. DNA fragment lengths were measured using a BioAnalyzer prior to sequencing. SMRTbell DNA libraries were constructed according to the PacBio standard protocol “20 kb Template Preparation Using BluePippin Size-selection system”. Libraries were sequenced on the PacBio RS II instrument (P6-C4 chemistry). Computational analysis was carried out using the PacBio SMRT Portal HGAP 2 de novo assembly algorithm. Assembled contigs were evaluated using QUAST with the COX haplotype sequence as reference.

Results: Captured DNA fragments from the MHC were ∼12 Kbp on average (ranging from ∼5 to 60 Kbp). The read length distribution observed after PacBio RS II sequencing shows an average read length of ∼3.5 Kbp, with some reads as long as 60 Kbp. We were able to de novo assemble 91% of the targeted region with 99.99% accuracy. The N50 and NG50 for the assembly were 33,234 bp and 92,824 bp, respectively. The largest contig aligned to the COX reference was ∼200 Kbp.

Conclusions: Our targeted resequencing and de novo assembly approach represents a comprehensive method to characterize 4 Mbp of the human MHC. We demonstrate the ability to de novo assemble and characterize 91% of the targeted MHC for the homozygous cell line COX with 99.9% accuracy as compared to the annotated COX haplotype reference sequence.
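
For readers unfamiliar with the N50 and NG50 statistics reported above, the following is a minimal sketch of how they are conventionally defined, not the QUAST implementation used in the study. The function name `nx50` and the toy contig lengths are hypothetical; only the target coordinates come from the abstract. N50 is computed against the total assembled length, whereas NG50 is computed against the reference (target) length. The sketch also shows one way NG50 can exceed N50, namely when the assembled total is larger than the reference length; the abstract does not state whether that is what happened here.

```python
from typing import Iterable, Optional


def nx50(contig_lengths: Iterable[int], total_length: Optional[int] = None) -> int:
    """Return the length L such that contigs of length >= L cover at least
    half of `total_length`.

    With total_length=None the assembly's own total is used (N50).
    Passing the reference/target length instead gives NG50.
    Returns 0 if the contigs cover less than half of `total_length`.
    """
    lengths = sorted(contig_lengths, reverse=True)
    base = sum(lengths) if total_length is None else total_length
    half = base / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0


# Target region size from the coordinates given in the abstract.
target_start, target_end = 29_618_227, 33_618_227
target_size = target_end - target_start  # 4,000,000 bp

# Toy contig lengths (hypothetical, NOT data from the study).
toy_contigs = [80, 70, 50, 40, 30, 20]  # total = 290

n50 = nx50(toy_contigs)                             # half of 290 = 145
ng50_small_ref = nx50(toy_contigs, total_length=100)  # reference shorter than assembly
ng50_large_ref = nx50(toy_contigs, total_length=400)  # reference longer than assembly

print(target_size, n50, ng50_small_ref, ng50_large_ref)
```

In this toy case the N50 is 70, the NG50 against a shorter reference (100) is 80, and the NG50 against a longer reference (400) is 50, so the two statistics diverge whenever the assembled total differs from the reference length.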
