Background: Knowledge about the origin of SARS-CoV-2 is necessary for both a biological and an epidemiological understanding of the COVID-19 pandemic. Evidence suggests that a proximal evolutionary ancestor of SARS-CoV-2 belongs to the bat coronavirus family. However, as further evidence for a direct zoonosis remains limited, alternative modes of SARS-CoV-2 biogenesis should also be considered.
Results: Here we show that the genomes of SARS-CoV-2 and SARS-CoV-1 are differentially enriched with short chromosomal sequences from the yeast S. cerevisiae at focal positions that are known to be critical for virus replication, host cell invasion, and host immune response. Specifically, for SARS-CoV-2, we identify two sites: one at the start of the viral replicase domain, and the other at the end of the spike gene past its critical domain junction; for SARS-CoV-1, we identify one at the start of the RNA-dependent RNA polymerase gene, and the other at the start of the spike protein’s receptor binding domain. As yeast is not a natural host for this virus family, we propose a directed passage model for viral constructs, including virus replicase, in yeast cells based on co-transformation of virus DNA plasmids carrying yeast selectable genetic markers followed by intra-chromosomal homologous recombination through gene conversion. Highly differential sequence homology data across yeast chromosomes, congruent with chromosomes harboring specific auxotrophic markers, further support this passage model. Model and data together allow us to infer a hypothetical tripartite genome assembly scheme for the synthetic biogenesis of SARS-CoV-2 and SARS-CoV-1.
Conclusions: These results provide evidence that the genome sequences of SARS-CoV-1 and SARS-CoV-2, but not those of RaTG13, BANAL-20-52, and all other closest SARS coronavirus family members identified, carry distinct homology signals that might point to large-scale genomic editing during a passage of directed replication and chromosomal integration inside genetically modified yeast cells.