Abstract

Recent studies have demonstrated that conflict among gene trees is common in phylogenomic studies, and that less than one percent of genes may ultimately drive species tree inference in supermatrix analyses. Here, we examined two data sets in which supermatrix and coalescent-based species trees conflict. We identified two highly influential "outlier" genes in each data set. When these genes were removed, the inferred supermatrix trees matched the topologies obtained from coalescent analyses. We also demonstrate that, while the outlier genes in the vertebrate data set have been shown in a previous study to result from errors in orthology detection, the outlier genes in the plant data set did not exhibit any obvious systematic error and may therefore be the result of some biological process yet to be determined. While comparisons among a small set of alternative topologies can help in discovering outlier genes, they are limited in several ways, such as by assuming that all genes share the same topology. Coalescent species tree methods relax this assumption but do not explicitly facilitate the examination of specific edges, and they often assume that conflict is the result of incomplete lineage sorting. We therefore explored a framework that does not assume a single topology for all genes and allows alternative edges and their support to be examined quickly in large phylogenomic data sets. For both data sets, these analyses provided detailed results confirming the support for the coalescent-based topologies. This framework suggests that we can improve our understanding of the underlying signal in phylogenomic data sets by asking more targeted edge-based questions.
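
The idea of finding outlier genes by comparing a small set of alternative topologies can be illustrated with a minimal sketch. The abstract does not specify the procedure, so everything here is an assumption made for illustration: the per-gene log-likelihood values are invented, the function names `delta_lnl` and `flag_outliers` are hypothetical, and the cutoff of ten times the median absolute difference is an arbitrary choice, not a threshold taken from the study.

```python
from statistics import median

def delta_lnl(lnl_a, lnl_b):
    """Per-gene log-likelihood difference (topology A minus topology B)."""
    return {g: lnl_a[g] - lnl_b[g] for g in lnl_a if g in lnl_b}

def flag_outliers(deltas, k=10.0):
    """Flag genes whose |delta| exceeds k times the median |delta|.

    The cutoff k is an illustrative assumption, not a published threshold.
    """
    scale = median(abs(d) for d in deltas.values())
    if scale == 0:
        return []
    return sorted(g for g, d in deltas.items() if abs(d) > k * scale)

# Hypothetical per-gene log-likelihoods under two competing topologies.
lnl_supermatrix = {"gene1": -1001.0, "gene2": -2000.5, "gene3": -1500.0,
                   "gene4": -3000.0, "gene5": -1200.0}
lnl_coalescent = {"gene1": -1000.0, "gene2": -1999.0, "gene3": -1502.0,
                  "gene4": -3080.0, "gene5": -1199.5}

deltas = delta_lnl(lnl_supermatrix, lnl_coalescent)
outliers = flag_outliers(deltas)

# Summed support: positive favors the supermatrix topology,
# negative favors the coalescent topology.
total_all = sum(deltas.values())
total_trimmed = sum(d for g, d in deltas.items() if g not in outliers)

print("outlier genes:", outliers)
print("summed delta lnL, all genes:", total_all)
print("summed delta lnL, outliers removed:", total_trimmed)
```

In this toy input a single gene supplies nearly all of the summed support for one topology, and dropping it reverses the sign of the total. That mirrors the pattern the abstract describes, although a comparison of this kind still evaluates every gene on the same fixed topologies, which is the limitation the edge-based framework is meant to address.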
