Abstract
Background: Genotyping and massively-parallel sequencing projects result in a vast amount of diploid data that is only rarely resolved into its constituent haplotypes. It is nevertheless this phased information that is transmitted from one generation to the next and is most directly associated with biological function and the genetic causes of biological effects. Despite progress in genome-wide sequencing and in phasing algorithms and methods, problems remain in assembling, and in reconstructing linear haplotypes in, regions of repetitive DNA and structural variation. These dynamic and structurally complex regions are often poorly understood from a sequence point of view. Such regions, being highly similar in sequence, tend to be collapsed onto the genome assembly. This in turn means that downstream determination of the true sequence haplotype in these regions poses a particular challenge. For structurally complex regions, a more focussed approach to assembling haplotypes may be required.
Results: To investigate reconstruction of spatial information at structurally complex regions, we used an emulsion haplotype fusion PCR approach to reproducibly link sequences of up to 1 kb in length, allowing the phasing of multiple variants from neighbouring loci, with allele-specific PCR and sequencing used to detect the phase. Using emulsion systems that link flanking regions to amplicons within the CNV, we reconstructed a 59 kb haplotype across the DEFA1A3 CNV in HapMap individuals.
Conclusion: This study has demonstrated a novel use for emulsion haplotype fusion PCR in addressing the issue of reconstructing structural haplotypes at multiallelic copy-variable regions, using the DEFA1A3 locus as an example.
Highlights
Genotyping and massively-parallel sequencing projects result in a vast amount of diploid data that is only rarely resolved into its constituent haplotypes
Single-molecule sequencing and Long Fragment Read (LFR) technologies could offer a definitive haplotype on a genome-wide scale, in particular those methods that offer the possibility of long read lengths and the phasing of heterozygous SNPs into long haplotype contigs [25,26,27,28]
We have developed an emulsion haplotype fusion PCR approach for linking two PCR products, each of up to 1 kb in length, and examined its use to reconstruct structural haplotypes at the copy-variable locus DEFA1A3
Summary
Genotyping and massively-parallel sequencing projects result in a vast amount of diploid data that is only rarely resolved into its constituent haplotypes. Phase may be addressed by physically separating the two parental chromosomes prior to sequence (and statistical) analysis, through construction of somatic cell hybrids [17], microdissection [18], microfluidics [19] or chromosome sorting [20]. These methods provide haplotype information at the genome-wide or chromosome scale, yet they often rely on specialised instruments and expertise, or are time-consuming and expensive. Single-molecule sequencing and Long Fragment Read (LFR) technologies could offer a definitive haplotype on a genome-wide scale, in particular those methods that offer the possibility of long read lengths and the phasing of heterozygous SNPs into long haplotype contigs [25,26,27,28]. Whilst such methods are potentially successful in determining the phase of heterozygous SNPs for the majority of the genome, regions that are variable in structure cannot be reconstructed.
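To make the phasing problem concrete, the following is a minimal Python sketch, not the authors' analysis pipeline. It shows why two unphased heterozygous genotypes are compatible with two different haplotype pairs, and how observations that physically link the allele at one site to the allele at another (for example, alleles read from a single fused PCR product) could distinguish them. The function names, the input format and the `min_fraction` threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def possible_phasings(site1, site2):
    """Return the distinct haplotype pairs consistent with two unphased genotypes.

    site1 and site2 are 2-tuples of alleles, e.g. ("A", "G").
    Each haplotype is a (site1_allele, site2_allele) tuple.
    """
    a1, a2 = site1
    b1, b2 = site2
    cis = frozenset({(a1, b1), (a2, b2)})
    trans = frozenset({(a1, b2), (a2, b1)})
    # If either site is homozygous, both arrangements collapse to one.
    return {cis, trans}


def phase_from_linked_alleles(site1, site2, linked_alleles, min_fraction=0.8):
    """Choose the haplotype pair best supported by linked allele observations.

    linked_alleles is an iterable of (site1_allele, site2_allele) tuples,
    each assumed to come from one linked molecule. min_fraction is a
    hypothetical threshold: the call is withheld (None) unless at least
    this fraction of observations supports the winning arrangement.
    """
    counts = Counter(linked_alleles)
    total = sum(counts.values())
    if total == 0:
        return None

    best, best_support = None, 0
    for phasing in possible_phasings(site1, site2):
        support = sum(counts[hap] for hap in phasing)
        if support > best_support:
            best, best_support = phasing, support

    if best_support / total < min_fraction:
        return None
    return best


if __name__ == "__main__":
    snp1, snp2 = ("A", "G"), ("C", "T")
    observations = [("A", "C")] * 9 + [("G", "T")] * 8 + [("A", "T")]
    print(phase_from_linked_alleles(snp1, snp2, observations))
```

In this toy input, 17 of the 18 linked observations support the A-C / G-T arrangement, so that pair is returned; an evenly split set of observations would return `None` rather than force a call.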