Abstract

We investigated the implications of using a circular approximation of a split system when computing maximum-diversity subsets of a set of taxa in conservation biology, where diversity is measured by Split System Diversity (SSD). We compared the maximum SSD score and the maximum SSD set(s) of size k, computed efficiently from a circular approximation, with the true results obtained by brute-force search on the original data. Experiments on simulated datasets and on SNP data from 50 Atlantic salmon populations show that a circular approximation can yield incorrect max-SSD sets. We built a graph-based split system whose circular approximation produced a max-SSD set of size k=4 with an SSD score 17.6% lower than that of the true max-SSD set. This discrepancy rose to 25% for k=11 when we used a hypergraph-based split system. The same comparison on the Atlantic salmon dataset revealed only a 1% difference, yet the two sets differed notably in population composition. These findings underscore the importance of assessing whether a circular approximation is suitable for a given dataset. Caution is advised when relying solely on circular approximations to determine sets of maximum diversity; the characteristics of the data should be considered carefully to obtain accurate results in conservation biology applications.
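
To make the brute-force baseline concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard definition of SSD: the score of a subset is the sum of the weights of all splits that separate at least two of its taxa. The function names `ssd` and `max_ssd_sets`, the representation of a split as one side plus a weight, and the toy split system are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def ssd(subset, splits):
    """Sum the weights of the splits that separate at least two taxa of `subset`.

    `splits` is a list of (side_a, weight) pairs; the other side of each
    split is implicitly the complement of side_a (an assumed representation).
    """
    chosen = set(subset)
    total = 0.0
    for side_a, weight in splits:
        inside = len(chosen & side_a)
        # The split counts only if the subset has taxa on both of its sides.
        if 0 < inside < len(chosen):
            total += weight
    return total

def max_ssd_sets(taxa, splits, k):
    """Exhaustively score every k-subset; return the best score and all ties."""
    best_score, best_sets = -1.0, []
    for candidate in combinations(sorted(taxa), k):
        score = ssd(candidate, splits)
        if score > best_score + 1e-12:
            best_score, best_sets = score, [candidate]
        elif abs(score - best_score) <= 1e-12:
            best_sets.append(candidate)
    return best_score, best_sets

# Hypothetical toy split system on four taxa (not data from the study).
toy_taxa = {"a", "b", "c", "d"}
toy_splits = [
    (frozenset({"a"}), 1.0),
    (frozenset({"b"}), 2.0),
    (frozenset({"c"}), 1.0),
    (frozenset({"d"}), 3.0),
    (frozenset({"a", "b"}), 2.0),
]

best_score, best_sets = max_ssd_sets(toy_taxa, toy_splits, k=2)
print(best_score, best_sets)
```

The exhaustive search visits every one of the C(n, k) subsets, so it is practical only for small n and k. A relative gap such as the percentages reported above could then be computed as (true score - approximate score) / true score, assuming that is how the discrepancy is defined.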
