Haplotype phasing is a pivotal procedure in genome analysis that identifies the specific combinations of genetic variants carried on each chromosome. Achieving chromosome-level genome phasing is a considerable challenge, particularly in organisms with large and complex genomes. To address this challenge, we have developed a robust, gamete cell-based phasing pipeline comprising wet-laboratory procedures for plant sperm cell isolation, short-read sequencing and a bioinformatics workflow that generates chromosome-level phasing. The bioinformatics workflow is applicable to sperm cells from plants and from other organisms, for example, mammals. Our pipeline ensures high-quality single-nucleotide polymorphism (SNP) calling for each sperm cell and the subsequent construction of a high-density genetic map. The genetic map facilitates accurate chromosome-level genome phasing, enables crossover event detection and could be used to correct potential assembly errors. Our bioinformatics pipeline runs on a Linux system, and most of its steps can be executed in parallel, which speeds up the analysis. The entire workflow can be performed over the course of 1 d. We present a practical example from our previous research using this protocol and provide the whole bioinformatics pipeline as a Docker image so that it can be easily adapted to other studies.