A key trait of Eukarya is the independent evolution of complex multicellularity (CM) in animals, plants, fungi, brown algae and red algae. This phenotype is characterized by the initial exaptation of cell-cell adhesion genes followed by the emergence of mechanisms for cell-cell communication, together with the expansion of transcription factor gene families responsible for cell and tissue identity. The number of cell types (NCT) is commonly used as a quantitative proxy for biological complexity in comparative genomics studies. While expansions of individual gene families have been associated with NCT variation within individual CM lineages, the molecular and functional roles responsible for the independent evolution of CM across Eukarya remain poorly understood. We employed a phylogeny-aware strategy to conduct a genome-scale search for associations between NCT and the abundance of genomic components across a phylogenetically diverse set of 81 eukaryotic species, including species from all CM lineages. Our annotation schemas represent two complementary aspects of genomic information: homology, represented by conserved sequences, and function, represented by Gene Ontology (GO) terms. We found that many gene families sharing common biological themes that define CM are independently expanded in two or more CM lineages, such as components of the extracellular matrix, cell-cell communication mechanisms, and developmental pathways. Additionally, we describe many previously unknown associations between biological themes and biological complexity, such as mechanisms for wound response, immunity, cell migration, regulatory processes, and response to natural rhythms. Together, our findings unveil a set of functional and molecular convergences independently expanded in CM lineages, likely due to the common selective pressures in their lifestyles.