Abstract

Background: Candida albicans is a ubiquitous opportunistic fungal pathogen that afflicts immunocompromised human hosts. With rare and transient exceptions the yeast is diploid, yet despite its clinical relevance the respective sequences of its two homologous chromosomes have not been completely resolved.

Results: We construct a phased diploid genome assembly by deep sequencing a standard laboratory wild-type strain and a panel of strains homozygous for particular chromosomes. The assembly has 700-fold coverage on average, allowing extensive revision and expansion of the number of known SNPs and indels. This phased genome significantly enhances the sensitivity and specificity of allele-specific expression measurements by enabling pooling and cross-validation of signal across multiple polymorphic sites. Additionally, the diploid assembly reveals pervasive and unexpected patterns in allelic differences between homologous chromosomes. Firstly, we see striking clustering of indels, concentrated primarily in the repeat sequences in promoters. Secondly, both indels and their repeat-sequence substrate are enriched near replication origins. Finally, we reveal an intimate link between repeat sequences and indels, which argues that repeat length is under selective pressure for most eukaryotes. This connection is described by a concise one-parameter model that explains repeat-sequence abundance in C. albicans as a function of the indel rate, and provides a general framework to interpret repeat abundance in species ranging from bacteria to humans.

Conclusions: The phased genome assembly and insights into repeat plasticity will be valuable for better understanding allele-specific phenomena and genome evolution.
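The abstract's point about pooling and cross-validating allele-specific signal across polymorphic sites can be illustrated with a small sketch. This is not the authors' pipeline: the function names (`binom_two_sided_p`, `pooled_ase`), the input layout (one pair of read counts per phased SNP), the exact binomial test against a 50:50 null, and the sign-concordance check are all assumptions chosen for illustration.

```python
from math import exp, lgamma, log
from typing import List, Optional, Tuple


def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: total probability of all
    outcomes that are no more likely than the observed count k."""
    def log_pmf(i: int) -> float:
        # Log of the binomial probability mass, computed via lgamma
        # so that large read counts do not overflow.
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1 - p))

    probs = [exp(log_pmf(i)) for i in range(n + 1)]
    cutoff = probs[k] * (1 + 1e-7)  # small tolerance for float ties
    return min(1.0, sum(q for q in probs if q <= cutoff))


def pooled_ase(site_counts: List[Tuple[int, int]]) -> Optional[dict]:
    """Pool reads over the phased SNPs of one gene (hypothetical input).

    site_counts holds (reads_on_haplotype_A, reads_on_haplotype_B) for
    each SNP. Phasing is what makes the per-site counts addable: every
    first entry refers to the same homolog.
    """
    a = sum(x for x, _ in site_counts)
    b = sum(y for _, y in site_counts)
    n = a + b
    if n == 0:
        return None  # no informative reads for this gene

    # Cross-validation: do the individual sites point the same way
    # as the pooled signal?
    pooled_sign = (a > b) - (a < b)
    informative = [(x, y) for x, y in site_counts if x != y]
    agree = sum(1 for x, y in informative
                if ((x > y) - (x < y)) == pooled_sign)
    concordance = agree / len(informative) if informative else None

    return {
        "ratio_a": a / n,
        "p_value": binom_two_sided_p(a, n),
        "concordance": concordance,
    }


# Three SNPs in one gene, each mildly biased toward haplotype A.
result = pooled_ase([(30, 10), (25, 15), (28, 12)])
print(result)
```

Pooling increases the effective read depth behind each test, while the concordance value flags genes where a single aberrant site drives the pooled ratio. Both properties depend on the per-site counts being assigned to the correct homolog.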

Highlights

  • Candida albicans is a ubiquitous opportunistic fungal pathogen that afflicts immunocompromised human hosts

  • Single-nucleotide polymorphism (SNP) identification from deep sequencing of wild-type and homozygous strains: to resolve polymorphism phasing in C. albicans, we performed deep sequencing on genomic DNA prepared from a panel of nine strains, including wild-type SC5314 and eight related strains, each known to be homozygous for one of the eight C. albicans chromosomes (Figure 1B)

  • Our approach involved three steps: identification of polymorphisms in the strains that are heterozygous for a chromosome, resolution of one of the haplotypes via direct sequencing of the corresponding homozygous strain (Figure 1C, top), and inference of the sequence of the opposite haplotype (Figure 1C, bottom)
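
The three steps in the last highlight can be sketched for a single chromosome as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `phase_chromosome`, the dictionary-based inputs, and the example positions and alleles are hypothetical, and real data would also need read-depth and quality filtering.

```python
from typing import Dict, List, Tuple


def phase_chromosome(
    het_calls: Dict[int, Tuple[str, str]],
    hom_calls: Dict[int, str],
) -> Tuple[Dict[int, str], Dict[int, str], List[int]]:
    """Assign alleles at heterozygous sites to two haplotypes.

    het_calls: step 1, position -> the two alleles seen in a strain that
               carries both homologs of this chromosome.
    hom_calls: step 2, position -> the single allele seen in the strain
               that is homozygous for this chromosome.
    Returns (hap_a, hap_b, unresolved): hap_a is the haplotype retained
    by the homozygous strain, hap_b is the inferred opposite haplotype
    (step 3), and unresolved lists positions that could not be phased.
    """
    hap_a: Dict[int, str] = {}
    hap_b: Dict[int, str] = {}
    unresolved: List[int] = []

    for pos, (x, y) in sorted(het_calls.items()):
        known = hom_calls.get(pos)
        if known == x:
            hap_a[pos], hap_b[pos] = x, y  # the other allele goes to hap_b
        elif known == y:
            hap_a[pos], hap_b[pos] = y, x
        else:
            # Missing call, or an allele matching neither heterozygous
            # allele: leave the site unphased rather than guess.
            unresolved.append(pos)

    return hap_a, hap_b, unresolved


# Hypothetical calls for three sites on one chromosome.
het = {100: ("A", "G"), 250: ("C", "T"), 400: ("G", "A")}
hom = {100: "G", 250: "C"}  # no call at position 400
hap_a, hap_b, unresolved = phase_chromosome(het, hom)
print(hap_a, hap_b, unresolved)
```

Because each homozygous strain fixes one haplotype along an entire chromosome, a single pass over the sites is enough in this sketch and no read-backed linkage between neighboring sites is needed.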


Summary

Introduction

Candida albicans is a ubiquitous opportunistic fungal pathogen that afflicts immunocompromised human hosts. It is a model fungal pathogen that exists almost exclusively in a diploid state and does not generate genome diversity through a typical meiotic cycle with frequent recombination. Instead, it employs one of two strategies, both involving mating and whole-chromosome loss, that differ in the order of these two events. A recent report revealed that chromosome loss can occur first to generate a mating-competent haploid, which can subsequently mate to restore the diploid state [14]. Both mating options occur rarely in C. albicans, and both leave the homologs largely intact. Polymorphism phasing in C. albicans is therefore subject to fewer entropic, degenerating forces than in most other organisms, making the assembly of its phased genome desirable.

