Abstract

In this paper, we describe a new digital genetics model based on the Aevol artificial life simulator. Aevol is a computational platform designed to study populations of digital organisms evolving under various conditions. It has been extended in two directions. First, we have extended the genomic code from a binary one to a 4-base one, which allows for more realistic genomic sequences and eases the use of Aevol as a benchmarking tool for comparative genomics. Second, we have replaced the Aevol continuous phenotype representation by a discrete one inspired by Fisher's Geometric Model. By doing so, we will be able to validate Aevol results against population genetics theory.

Why a new model?

There is a twofold motivation for extending the regular Aevol model: benchmarking phylogenetic algorithms and embedding Fisher's Geometric Model of evolution (FGM).

Benchmarking

Molecular evolutionary methods and tools are difficult to validate, as we have almost no direct access to ancient molecules. In Alife platforms such as Avida or Aevol, phylogenies are exactly recorded. The final population resulting from such in silico experiments can be analyzed by phylogenetic algorithms to recover the phylogenetic tree. This process makes it possible to compare the trees inferred by these algorithms to the actual tree that was recorded during artificial evolution. This approach has recently been applied to test various estimators of inversion distance (Biller et al., 2016b), revealing their limits and suggesting important directions for improvement (Biller et al., 2016a). However, Aevol uses a binary representation for the genomic sequence, which strongly limits its usability as a benchmarking tool. This limitation called for a new model based on 4-nucleotide sequences.

Fisher's Geometric Model

The other intent of this new model is to enable a direct comparison of Aevol results in terms of population genetics and, more precisely, in terms of FGM. Indeed, one of the drawbacks of digital genetics and artificial life models is that they are difficult to relate to other theoretical approaches in evolutionary biology. FGM is a simple mathematical model describing the qualitative behavior of evolution (Fisher, 1930; Tenaillon, 2014). Assessing compatibility between Aevol's model and FGM will make it possible to validate Aevol predictions in cases where FGM alone provides a clear notion of what is expected from evolution.

Aevol-ACGT model

In Aevol, a population of individuals evolves through a classical mutation-selection process. The specificity of Aevol lies in the genotype-to-phenotype mapping, which finely models what is observed in bacteria. A circular double-stranded DNA sequence is transcribed into a set of mRNAs. These mRNAs are then parsed in search of Coding DNA Sequences (the "genes") that are translated into proteins through an artificial genetic code. Finally, the proteins are combined to compute the individual's phenotype. We refer the reader to previously published work for a complete description of the binary model and the results obtained so far (Knibbe et al., 2007; Batut et al., 2013; Misevic et al., 2015).

As in the classical Aevol, in Aevol-ACGT each digital organism owns a genotype, a sequence of nucleotides, that encodes a mathematical phenotype. The phenotype of an organism is then compared with a predefined phenotypic target, and the distance between the encoded phenotype and the target is used to compute the fitness. However, in the new model the genotype is a sequence on a 4-character alphabet (equivalent to ACGT), while the phenotype is modeled by a set of continuous traits (as in FGM). The phenotypic target defines the optimal value for all the traits under selection. The fitness w is then computed from the distance between the phenotype and the phenotypic target through the classical Gaussian-based function of FGM, where z_i is the phenotype value of trait i and Z_i is its target value:

w = \exp\left( -\frac{1}{2} \sum_{i=1}^{n} (z_i - Z_i)^2 \right)    (1)
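As an illustration of equation 1, the following minimal Python sketch computes the Gaussian fitness from a list of trait values and a list of target values. The function name `fgm_fitness` and the list-based representation are assumptions made for this example; they are not taken from the Aevol code base.

```python
import math


def fgm_fitness(phenotype, target):
    """Gaussian fitness of equation 1: w = exp(-1/2 * sum_i (z_i - Z_i)^2).

    `phenotype` holds the trait values z_i and `target` the optimal
    values Z_i. Both names are hypothetical, chosen for this sketch.
    """
    if len(phenotype) != len(target):
        raise ValueError("phenotype and target must have the same number of traits")
    # Squared Euclidean distance between the phenotype and the target.
    squared_distance = sum((z - Z) ** 2 for z, Z in zip(phenotype, target))
    return math.exp(-0.5 * squared_distance)


# A phenotype sitting exactly on the target has the maximal fitness of 1.
on_target = fgm_fitness([0.2, 0.7, 0.4], [0.2, 0.7, 0.4])

# Fitness decreases as the phenotype moves away from the target.
off_target = fgm_fitness([0.2, 0.7, 0.4], [0.5, 0.1, 0.4])
```

Because the exponent is never positive, the fitness always lies in the interval (0, 1], and it depends only on the distance to the target, not on the direction of the deviation.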

Highlights

  • In this paper, we describe a new digital genetics model based on the Aevol artificial life simulator

  • There is a twofold motivation for extending the regular Aevol model: benchmarking phylogenetic algorithms and embedding Fisher’s Geometric Model of evolution (FGM)

  • The other intent of this new model is to enable a direct comparison of Aevol results in terms of population genetics and, more precisely, in terms of FGM


Summary

Why a new model?

There is a twofold motivation for extending the regular Aevol model: benchmarking phylogenetic algorithms and embedding Fisher’s Geometric Model of evolution (FGM). In Alife platforms such as Avida or Aevol, phylogenies are exactly recorded, so the final population resulting from such in silico experiments can be analyzed by phylogenetic algorithms to recover the phylogenetic tree. This process makes it possible to compare the trees inferred by these algorithms to the actual tree that was recorded during artificial evolution. However, Aevol uses a binary representation for the genomic sequence, which strongly limits its usability as a benchmarking tool. This limitation called for a new model based on 4-nucleotide sequences.

The combinatorics of the (arbitrary) association between amino acids (AA) and parameters could be problematic. To overcome this difficulty, we propose to encode the parameters of the kernel using non-binary codes: the 20 AA are grouped into classes, and all the AA of the same class are used to compute the same parameter.

The n traits under selection are n points, randomly or regularly scattered over the 2D phenotypic space, for which the target value Z_i is specified and can be compared to the phenotype value z_i at the same position (equation 1).
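The placement of the n traits can be sketched as follows. This Python example generates trait positions either on a regular grid or at random, then reads the target and phenotype values at those positions. It assumes, for illustration only, that the 2D phenotypic space is the unit square and that the phenotype and the target are available as functions of a position (x, y); the summary specifies neither, and all names below are hypothetical.

```python
import random


def regular_positions(n_side):
    """Return n_side * n_side points on a regular grid (cell centers)."""
    step = 1.0 / n_side
    return [((i + 0.5) * step, (j + 0.5) * step)
            for i in range(n_side)
            for j in range(n_side)]


def random_positions(n, seed=0):
    """Return n points drawn uniformly at random; seeded for reproducibility."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]


def sample_traits(surface, positions):
    """Evaluate a surface f(x, y) at each trait position."""
    return [surface(x, y) for x, y in positions]


# Hypothetical target and phenotype surfaces, used only for this example.
def target_surface(x, y):
    return x * y


def phenotype_surface(x, y):
    return 0.5 * x


positions = regular_positions(3)                      # n = 9 traits
Z = sample_traits(target_surface, positions)          # target values Z_i
z = sample_traits(phenotype_surface, positions)       # phenotype values z_i
```

The two lists `z` and `Z` are aligned by position, so they can be compared trait by trait as in equation 1.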
