Abstract

Saccharomyces cerevisiae CEN.PK113-7D is widely used for metabolic engineering and systems biology research in industry and academia. We sequenced, assembled, annotated and analyzed its genome. Single-nucleotide variations (SNVs), insertions/deletions (indels) and differences in genome organization compared to the reference strain S. cerevisiae S288C were analyzed. In addition to a few large deletions and duplications, nearly 3000 indels were identified in the CEN.PK113-7D genome relative to S288C. These differences were overrepresented in genes whose functions are related to transcriptional regulation and chromatin remodelling. Some of these variations were caused by unstable tandem repeats, suggesting an innate evolvability of the corresponding genes. Besides a previously characterized mutation in adenylate cyclase, the CEN.PK113-7D genome sequence revealed a significant enrichment of non-synonymous mutations in genes encoding components of the cAMP signalling pathway. Some phenotypic characteristics of the CEN.PK113-7D strain were explained by the presence of additional specific metabolic genes relative to S288C. In particular, the presence of the BIO1 and BIO6 genes correlated with the biotin prototrophy of CEN.PK113-7D. Furthermore, the copy number, chromosomal location and sequences of the MAL loci were resolved. The assembled sequence reveals that CEN.PK113-7D has a mosaic genome that combines characteristics of laboratory strains and wild-industrial strains.

Highlights

  • The 1000-dollar genome, an iconic goal in human genomics, is already a reality for the yeast Saccharomyces cerevisiae. A high quality reference genome of the laboratory strain S. cerevisiae S288C has been available [...] atypical ENA gene complement renders the laboratory strain CEN.PK113-7D more sensitive to lithium ions [4]

  • Genome assembly, scaffolding and annotation: the CEN.PK113-7D genome was assembled by combining Illumina (36 M reads, 51 bp, paired-end) and 454 (0.6 M reads, mean length 280 bp) sequencing datasets that together represented more than 150-fold coverage of the genome

  • A hybrid assembly strategy followed by scaffolding using paired-end read information resulted in 565 scaffolds with a total size of 11.6 Mbp (GenBank BioProject PRJNA52955; http://cenpk.bt.tudelft.nl) (Table 1), which were subsequently placed into chromosomal scaffolds based on homology to S288C
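The coverage figure quoted above can be sanity-checked with simple arithmetic. The sketch below is only a rough estimate, not the authors' calculation: it assumes that "36 M reads" counts individual 51 bp reads rather than read pairs, and it uses the 11.6 Mbp total scaffold size as a stand-in for the genome size. The function and variable names are illustrative.

```python
# Back-of-the-envelope check of the sequencing depth quoted in the highlights.
# Assumptions (not stated in the source): "36 M reads" counts individual reads,
# and the 11.6 Mbp assembly size is used as the genome size.

def fold_coverage(datasets, genome_size_bp):
    """Return total sequenced bases divided by the genome size."""
    total_bases = sum(n_reads * mean_len for n_reads, mean_len in datasets)
    return total_bases / genome_size_bp

illumina = (36_000_000, 51)    # (number of reads, read length in bp)
roche_454 = (600_000, 280)     # (number of reads, mean read length in bp)
assembly_size_bp = 11_600_000  # total scaffold size reported for the assembly

coverage = fold_coverage([illumina, roche_454], assembly_size_bp)
print(f"Estimated coverage: {coverage:.0f}-fold")  # Estimated coverage: 173-fold
```

Under these assumptions the two datasets give roughly 173-fold coverage, consistent with the "more than 150-fold" stated above. If the 36 M figure instead counted read pairs, the estimate would be correspondingly higher.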

Summary

Introduction

A high quality reference genome of the laboratory strain S. cerevisiae S288C has been available [...] atypical ENA gene complement renders the laboratory strain CEN.PK113-7D more sensitive to lithium ions [4]. In addition to the presence or absence of coding regions, differences can occur in non-coding regions, such as promoter regions. Knowledge of such differences is essential for the analysis and modelling of regulatory networks in systems biology [6]. Availability of a well-annotated, high-quality reference genome is essential to interpret the changes that occur during laboratory evolution.
