Abstract

The relationship between genotype and fitness is fundamental to evolution, but quantitatively mapping genotypes to fitness has remained challenging. We propose the Phenotypic-Embedding theorem (P-E theorem), which bridges genotype and phenotype through an encoder-decoder deep learning framework. Inspired by this theorem, we propose a more general first principle for relating genotype to phenotype; the P-E theorem provides a computable basis for applying that principle. As an application of the P-E theorem, we developed the Co-attention based Transformer model to bridge Genotype and Fitness, a Transformer-based pre-trained foundation model with downstream supervised fine-tuning that can accurately simulate the neutral evolution of viruses and predict immune escape mutations. Following the calculation path of the P-E theorem, we accurately obtained the basic reproduction number (${R}_0$) of SARS-CoV-2 from first principles, quantitatively linked immune escape to viral fitness, and plotted the genotype-fitness landscape. The theoretical system we established provides a general and interpretable method for constructing genotype-phenotype landscapes and offers a new paradigm for theoretical and computational biology.

