PC-mer: An Ultra-fast memory-efficient tool for metagenomics profiling and classification.

Saeedeh Akbari Rokn Abadi, Amirhossein Mohammadi, Somayyeh Koohi

doi:10.1371/journal.pone.0307279

Abstract

Feature extraction methods, such as k-mer-based methods, have recently come to play a significant role in the classification and analysis of metagenomics data. However, they face several bottlenecks, including performance limitations, high memory consumption, and computational overhead. To address these challenges, we developed an innovative feature extraction and sequence profiling method for DNA/RNA sequences, called PC-mer, which takes advantage of the physicochemical properties of nucleotides. Compared with k-mer profiling methods, PC-mer reduces memory usage by a factor of 2k while improving metagenomics classification performance at various taxonomic levels, for both machine learning-based and computational-based methods, and it also achieves a speedup of more than 1000x in the training phase. Evaluating ML-based PC-mer on various datasets confirms that it can achieve 100% accuracy in classifying samples at the class, order, and family levels. Compared with k-mer-based classification methods, it also improves genus-level classification accuracy by more than 14% for the shotgun dataset (reaching an accuracy of 97.5%) and by more than 5% for the amplicon dataset (reaching an accuracy of 98.6%). Building on these improvements, we provide two PC-mer-based tools that can replace the popular k-mer-based tools: one for classifying and another for comparing metagenomics data.
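For context on the k-mer baseline that the abstract compares against, the following is a minimal sketch of conventional k-mer frequency profiling. It is not the PC-mer method, whose encoding is not described in the abstract. The function name `kmer_profile` and the choice to skip windows containing non-ACGT characters are assumptions made for illustration. The sketch shows why memory becomes a bottleneck: the profile has one entry per possible k-mer, so its length is 4^k.

```python
from collections import Counter
from itertools import product


def kmer_profile(sequence, k):
    """Return a normalized k-mer frequency vector for a DNA sequence.

    Hypothetical helper for illustration only. Windows containing
    characters other than A, C, G, T are ignored (an assumption).
    """
    alphabet = "ACGT"
    # One slot per possible k-mer: the vector length is 4**k.
    all_kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    seq = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only windows made entirely of A, C, G, T contribute to the total.
    valid_total = sum(counts[kmer] for kmer in all_kmers)
    if valid_total == 0:
        return [0.0] * len(all_kmers)
    return [counts[kmer] / valid_total for kmer in all_kmers]


if __name__ == "__main__":
    for k in (2, 4, 6):
        profile = kmer_profile("ACGTACGTTGCAACGT", k)
        print(k, len(profile))
```

Running the script prints the profile length for each k, which makes the exponential growth of the feature vector visible even for short sequences.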
