Introduction: Globin gene sequencing has long been impeded by repetitive regions, homology and structural variation. Long-read sequencing technology has so far been limited by high cost and a high error rate. These limitations can be mitigated, respectively, by multiplexing samples in targeted capture sequencing and by using consensus reads. This ongoing study analyzes the utility of long-range sequencing technology in identifying historically problematic beta globin cluster mutations.

Materials and methods: Long-range sequencing was performed on complex beta globin gene cluster cases using the Sequel platform (Pacific Biosciences, Menlo Park, California, United States). Sample input was 1 microgram of DNA extracted from whole blood. Circular Consensus Sequence (CCS) FASTQ files were mapped to the reference genome (GRCh37/hg19) using NGMLR, and variants were then annotated using a custom bioinformatics workflow. Results were compared with multiplex ligation-dependent probe amplification (MLPA), Sanger sequencing, array CGH, short-range NGS (300 bp), hemoglobin protein data, and clinical data.

Results: CCS read length varied from 1 to 10 kb, with an average of 3.5 kb. Analysis of the beta globin cluster region was superior to, and more comprehensive than, Sanger sequencing, MLPA, array CGH and short-range sequencing technology: 1) single nucleotide variants (SNVs) were identified with a sensitivity and specificity of 99.5%; 2) large structural variants (SVs) such as large deletions, duplications, insertions, crossovers and fusions spanning many kilobases were characterized in the heterozygous, homozygous and compound heterozygous states to precise genomic coordinates (Table 1); 3) phasing of SNVs identified Hb S haplotypes (Central African Republic (CAR), Benin, Senegal, Arab-Indian and Cameroon).

Conclusion: 1) Long-range sequencing holds significant promise in genotyping hemoglobin disorders. The method improves efficiency, with the potential for a single test offering increased resolution, particularly in large deletion analysis. Complex cases benefited greatly from the improved range and resolution. 2) The reduced blood volume required for extensive testing is of particular interest for the pediatric population. 3) Genotyping of Hb S patients detected by positive newborn screening will help predict the need for prophylaxis or therapy by assessing variant zygosity, phasing and haplotypes. In conclusion, we believe that long-range sequencing has immense potential to contribute to thalassemia and hemoglobinopathy diagnostics and is far superior to the currently available technologies for beta globin cluster analysis.

Disclosures: Jen: Celgene: Employment.
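The sensitivity and specificity reported for SNV detection are conventionally derived from a site-level comparison against orthogonal methods such as Sanger sequencing. The abstract does not describe how the comparison was scored, so the following is only a minimal Python sketch of the standard definitions. The function name and the counts are hypothetical and are not data from this study.

```python
def sensitivity_specificity(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: variant sites called by long-read sequencing and confirmed by the comparator
    fp: variant sites called by long-read sequencing but not confirmed
    tn: reference sites correctly left uncalled
    fn: confirmed variant sites missed by long-read sequencing
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one true-positive-class and one true-negative-class site")
    sensitivity = tp / (tp + fn)   # fraction of true variants that were detected
    specificity = tn / (tn + fp)   # fraction of non-variant sites correctly left uncalled
    return sensitivity, specificity


# Hypothetical counts for illustration only (assumption, not study data).
sens, spec = sensitivity_specificity(tp=90, fp=5, tn=95, fn=10)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

In practice the counts would come from comparing the annotated variant calls with the comparator results at each assessed site; how sites are defined (per base, per variant, per sample) changes the numbers and would need to be stated alongside them.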