Abstract

The advancement of next-generation sequencing (NGS) technologies has been revolutionary for the field of evolutionary biology. These technologies have produced an abundance of genomes and transcriptomes that researchers can mine for the molecular markers vital to phylogenetic, evolutionary and ecological studies. Numerous tools have been developed to extract these markers from NGS data. However, because well-annotated reference genomes are lacking for many non-model organisms, it remains challenging to obtain the markers accurately and efficiently. Here, we present GeneMiner, an improved and expanded version of our previous tool, Easy353. GeneMiner combines reference-guided de Bruijn graph assembly with seed self-discovery and greedy extension. It also includes a verification step that uses a parameter-bootstrap method to reduce the pitfalls associated with using a relatively distant reference. Our results, based on both experimental and simulated data, show that GeneMiner can accurately recover phylogenetic molecular markers for plants from transcriptomic, genomic and other NGS data. GeneMiner is designed to be user-friendly, fast and memory-efficient, and it is compatible with Linux, Windows and macOS. All source code is publicly available on GitHub (https://github.com/sculab/GeneMiner) and Gitee (https://gitee.com/sculab/GeneMiner).

