Enhancing metagenomic classification with compression-based features

Jorge Miguel Silva, João Rafael Almeida

doi:10.1016/j.artmed.2024.102948

Open Access

Journal: Artificial Intelligence In Medicine
Publication Date: Aug 14, 2024
Citations: 1
License type: cc-by

Abstract

Metagenomics is a rapidly expanding field that uses next-generation sequencing technology to analyze the genetic makeup of environmental samples. However, accurately identifying the organisms in a metagenomic sample can be complex, and traditional reference-based methods may not be effective enough in some cases. In this study, we present a novel approach to metagenomic identification that uses data compressors as features for taxonomic classification. By evaluating a comprehensive set of compressors, both general-purpose and genomic-specific, we demonstrate the effectiveness of this method in accurately identifying organisms in metagenomic samples. The results indicate that features from multiple compressors can help identify taxonomy. The method achieved an overall accuracy of 95% on an imbalanced dataset in which some classes have few samples. The study also showed that the correlation between compression and classification is insignificant, highlighting the need for a multi-faceted approach to metagenomic identification. This approach offers a significant advancement in the field of metagenomics, providing a reference-less method for taxonomic identification that is both effective and efficient while revealing insights into the statistical and algorithmic nature of genomic data. The code to validate this study is publicly available at https://github.com/ieeta-pt/xgTaxonomy.
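The idea of using compressors as features can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes three general-purpose compressors from the Python standard library (zlib, bz2, lzma) as stand-ins for the compressor set evaluated in the study, and it assumes bits per base as the feature. The function name `compression_features` and the two toy sequences are hypothetical.

```python
import bz2
import lzma
import random
import zlib
from typing import Dict

# Assumption: standard-library compressors stand in for the general-purpose
# and genomic-specific compressors evaluated in the study.
COMPRESSORS = {
    "zlib": lambda data: zlib.compress(data, 9),
    "bz2": lambda data: bz2.compress(data, 9),
    "lzma": lambda data: lzma.compress(data),
}


def compression_features(sequence: str) -> Dict[str, float]:
    """Return one feature per compressor: compressed size in bits per base."""
    data = sequence.upper().encode("ascii")
    if not data:
        raise ValueError("sequence must not be empty")
    return {
        name: 8 * len(compress(data)) / len(data)
        for name, compress in COMPRESSORS.items()
    }


# Two toy sequences (hypothetical data): one highly repetitive, one random.
rng = random.Random(0)
repetitive = "ACGT" * 2500
shuffled = "".join(rng.choice("ACGT") for _ in range(10000))

features_rep = compression_features(repetitive)
features_rand = compression_features(shuffled)

for label, feats in (("repetitive", features_rep), ("random", features_rand)):
    print(label, {name: round(value, 3) for name, value in feats.items()})
```

Each sequence becomes a small numeric vector with one value per compressor, and such vectors could then be passed, together with taxonomic labels, to any supervised classifier. The repetitive sequence compresses far better than the random one, which is the kind of statistical signal a classifier can use.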
