Strain level microbial detection and quantification with applications to single cell metagenomics

Kaiyuan Zhu, S Cenk Sahinalp, A Funda Ergun, Alejandro A Schäffer, Junyan Xu, Welles Robinson, Eytan Ruppin, Yuzhen Ye

doi:10.1038/s41467-022-33869-7

Abstract

Computational identification and quantification of distinct microbes from high-throughput sequencing data is crucial for our understanding of human health. Existing methods use either accurate but computationally expensive alignment-based approaches or fast but less accurate alignment-free approaches, which often fail to correctly assign reads to genomes. Here we introduce CAMMiQ, a combinatorial optimization framework to identify and quantify distinct genomes (specified by a database) in a metagenomic dataset. As a key methodological innovation, CAMMiQ uses substrings of variable length and substrings that appear in two genomes in the database, as opposed to the commonly used fixed-length, unique substrings. These substrings allow CAMMiQ to accurately decouple mixtures of highly similar genomes, resulting in higher accuracy than the leading alternatives without requiring additional computational resources, as demonstrated on commonly used benchmarking datasets. Importantly, we show that CAMMiQ can distinguish closely related bacterial strains in simulated metagenomic and real single-cell metatranscriptomic data.
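
To make the distinction between unique substrings and substrings shared by exactly two genomes concrete, here is a minimal Python sketch. It is not CAMMiQ's implementation: the function names and the toy genomes are hypothetical, it uses a fixed length k for simplicity (the abstract describes variable-length substrings), and it ignores reverse complements and the optimization step entirely.

```python
from collections import defaultdict


def index_substrings(genomes, k):
    """Map each length-k substring to the set of genome names containing it.

    `genomes` is a hypothetical toy database: {name: sequence}.
    """
    owners = defaultdict(set)
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)
    return owners


def split_by_ownership(owners):
    """Separate substrings found in exactly one genome from those in exactly two."""
    unique = {s: next(iter(g)) for s, g in owners.items() if len(g) == 1}
    doubly = {s: tuple(sorted(g)) for s, g in owners.items() if len(g) == 2}
    return unique, doubly


# Toy database (assumption, for illustration only).
genomes = {"A": "ACGTAC", "B": "ACGTTT", "C": "GGGTTT"}
unique, doubly = split_by_ownership(index_substrings(genomes, k=3))

print("unique:", unique)
print("shared by two:", doubly)
```

In this toy database, genome B has no unique 3-mer at all, yet every one of its 3-mers is shared with exactly one other genome. That is the kind of situation in which substrings shared by two genomes still carry information about which genomes are present.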

Journal: Nature Communications
Publication Date: Oct 28, 2022
Citations: 8
License type: open-access

Similar Papers

A computational learning paradigm to targeted discovery of biocatalysts from metagenomic data: A case study of lipase identification
Mehdi F Shahraki ... Mohammad R Ghaffari
Biotechnology and Bioengineering | VOL. 119
03 Feb 2022

Accurate Inference of Tumor Purity and Absolute Copy Numbers From High-Throughput Sequencing Data.
Xiguo Yuan ... Jun Bai
Frontiers in Genetics | VOL. 11
30 Apr 2020

SFQ: Constructing and Querying a Succinct Representation of FASTQ Files
Robert Bakarić ... Damir Korenčić
Electronics | VOL. 11
04 Jun 2022

Rhometa: Population recombination rate estimation from metagenomic read datasets.
Sidaswar Krishnan ... Martin Ostrowski
PLOS Genetics | VOL. 19
27 Mar 2023
