Abstract

Genomic sequences are widely used to infer the evolutionary history of a given group of individuals. Many methods have been developed for sequence clustering and tree building. In the early days of genome sequencing, these were often limited to hundreds of sequences, but with the surge of high-throughput sequencing it is now common to have millions of sampled sequences at hand. We introduce MNHN-Tree-Tools, a high-performance set of algorithms that builds multi-scale, nested clusters of sequences found in a FASTA file. MNHN-Tree-Tools does not rely on multiple sequence alignment and can thus be used on large datasets to infer a sequence tree. Herein, we outline two applications: a classification of human alpha-satellite repeats and a derivation of a tree of life from 16S/18S rDNA sequences. MNHN-Tree-Tools is open source under a Zlib License and is available via the Git protocol: https://gitlab.in2p3.fr/mnhn-tools/mnhn-tree-tools. A detailed user's guide and tutorial is available at https://gitlab.in2p3.fr/mnhn-tools/mnhn-tree-tools-manual/-/raw/master/manual.pdf. See also http://treetools.haschka.net. Supplementary data are available at Bioinformatics online.
