Phenotools: An R package for visualizing and analysing phenomic datasets

Chad M Eliason, Scott V Edwards, Julia A Clarke

doi:10.1111/2041-210x.13217

Abstract

Phenotypic data are crucial for understanding genotype–phenotype relationships, assessing the tree of life and revealing trends in trait diversity over time. Large‐scale description of whole organisms for quantitative analyses (phenomics) presents several challenges, and technological advances in the collection of genomic data outpace those for phenomic data. Reasons for this disparity include the time‐consuming and expensive nature of collecting discrete phenotypic data and mining previously published data on a given species (both often requiring anatomical expertise across taxa), and computational challenges involved with analysing high‐dimensional datasets. One approach to building approximations of organismal phenomes is to combine published datasets of discrete characters assembled for phylogenetic analyses into a phenomic dataset. Despite a wealth of legacy datasets in the literature for many groups, relatively few methods exist for automating the assembly, analysis, and visualization of phenomic datasets in phylogenetic contexts. Here, we introduce a new R package, phenotools, for integrating (fusing original or legacy datasets), curating (finding and removing duplicates) and visualizing phenomic datasets. We demonstrate the utility of the proposed toolkit with a morphological dataset for flightless birds and two morphological datasets for theropod dinosaurs and provide recommendations for character construction to maximize accessibility in future workflows. Visualization tools allow rapid identification of anatomical subregions with difficult or problematic histories of homology. We anticipate that these tools will aid automation of the assembly and visualization of phenomic datasets to inform evolutionary relationships and rates of phenotypic evolution.

Full Text