Abstract

We propose a new method to estimate and correct for phylogenetic inertia in comparative data analysis. The method, called phylogenetic eigenvector regression (PVR), starts by performing a principal coordinate analysis on a pairwise phylogenetic distance matrix between species. Traits under analysis are regressed on the eigenvectors retained by a broken-stick model, so that estimated values express phylogenetic trends in the data and residuals express the independent evolution of each species. This partitioning is similar to that realized by the spatial autoregressive method, but the method proposed here overcomes the low statistical performance of the autoregressive method when phylogenetic correlation is low or when sample size is too small to detect it. Also, PVR is easier to perform with large samples because it is based on well-known techniques of multivariate and regression analysis. We evaluated the performance of PVR and compared it with the autoregressive method using real datasets and simulations. A detailed worked example using body size evolution of Carnivora mammals indicated that phylogenetic inertia in this trait is high and is estimated similarly by both methods. In this example, the Type I error of PVR at α = 0.05 was 0.048, but increasing the number of eigenvectors used in the regression increased the error. Also, the similarity between PVR and the autoregressive method, defined as the correlation between their residuals, decreased when the number of eigenvalues necessary to express the phylogenetic distance matrix was overestimated. To evaluate the influence of cladogram topology on the distribution of eigenvalues extracted from the double-centered phylogenetic distance matrix, we analyzed 100 randomly generated cladograms (up to 100 species). Multiple linear regression of log-transformed variables indicated that the number of eigenvalues extracted by the broken-stick model can be fully explained by cladogram topology. Therefore, the broken-stick model is an adequate criterion for determining the correct number of eigenvectors to be used by PVR. We also simulated distinct levels of phylogenetic inertia by producing a trend across 10, 25, and 50 species arranged in "comblike" cladograms and then adding random vectors with increasing residual variances around this trend. In doing so, we provide an evaluation of the performance of both methods with data generated under evolutionary models different from those tested previously. The results showed that both PVR and the autoregressive method are efficient in detecting inertia in data when sample size is relatively high (more than 25 species) and when phylogenetic inertia is high. However, PVR is more efficient at smaller sample sizes and when the level of phylogenetic inertia is low. These conclusions were also supported by the analysis of 10 real datasets regarding body size evolution in different animal clades. We conclude that PVR can be a useful alternative to the autoregressive method in comparative data analysis.
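
The procedure summarized in the abstract (principal coordinate analysis of the phylogenetic distance matrix, broken-stick selection of eigenvectors, then regression of the trait on the retained eigenvectors) can be illustrated with a minimal sketch in Python with NumPy. This is not the authors' implementation. The function names, the toy distance matrix, and the trait values are hypothetical, and two details that the abstract does not specify are assumed here: the standard Gower transformation (-0.5 times the squared distances) before double-centering, and a broken-stick rule that retains leading eigenvalues for as long as each exceeds its expected proportion.

```python
import numpy as np


def pcoa_eigen(dist):
    """Eigen-decompose the double-centered distance matrix (PCoA).

    Assumption: the Gower transformation -0.5 * d**2 is applied first.
    Returns eigenvalues in descending order with matching eigenvectors.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    a = -0.5 * d ** 2
    centering = np.eye(n) - np.ones((n, n)) / n
    g = centering @ a @ centering          # double-centered matrix
    vals, vecs = np.linalg.eigh(g)         # symmetric input, ascending order
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]


def broken_stick_count(vals, tol=1e-8):
    """Count leading eigenvalues whose share exceeds the broken-stick share.

    Assumption: only positive eigenvalues enter the comparison, and
    counting stops at the first eigenvalue that fails the test.
    """
    pos = vals[vals > tol]
    p = len(pos)
    prop = pos / pos.sum()
    # Expected share of the k-th longest piece of a stick broken into p parts.
    expected = np.array(
        [np.sum(1.0 / np.arange(k, p + 1)) / p for k in range(1, p + 1)]
    )
    k = 0
    while k < p and prop[k] > expected[k]:
        k += 1
    return k


def pvr(trait, dist):
    """Regress a trait on the retained phylogenetic eigenvectors.

    Fitted values stand for the phylogenetic component of the trait,
    residuals for the species-specific component.
    """
    y = np.asarray(trait, dtype=float)
    vals, vecs = pcoa_eigen(dist)
    k = broken_stick_count(vals)
    x = np.column_stack([np.ones(len(y)), vecs[:, :k]])  # intercept + eigenvectors
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)
    fitted = x @ coef
    resid = y - fitted
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return fitted, resid, r2, k


# Hypothetical data: six species in two clades of three (distances are made up).
dist = np.array([
    [0, 2, 4, 10, 10, 10],
    [2, 0, 4, 10, 10, 10],
    [4, 4, 0, 10, 10, 10],
    [10, 10, 10, 0, 2, 4],
    [10, 10, 10, 2, 0, 4],
    [10, 10, 10, 4, 4, 0],
], dtype=float)
trait = np.array([1.0, 1.2, 1.5, 3.0, 3.1, 3.6])  # e.g. log body size, made up

fitted, resid, r2, k = pvr(trait, dist)
print("eigenvectors retained:", k)
print("R^2 (share of trait variance attributed to phylogeny):", round(r2, 3))
```

In this sketch the coefficient of determination of the regression plays the role of the estimate of phylogenetic inertia, and `resid` holds the values that would be carried forward into further comparative analyses. Scaling the eigenvectors by the square roots of their eigenvalues, as is often done in principal coordinate analysis, would not change the fitted values or residuals.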
