MetaLAFFA: a flexible, end-to-end, distributed computing-compatible metagenomic functional annotation pipeline

Alexander Eng, Elhanan Borenstein, Adrian J Verster

doi:10.1186/s12859-020-03815-9

Abstract

Background: Microbial communities have become an important subject of research across multiple disciplines in recent years. These communities are often examined via shotgun metagenomic sequencing, a technology which can offer unique insights into the genomic content of a microbial community. Functional annotation of shotgun metagenomic data has become an increasingly popular method for identifying the aggregate functional capacities encoded by the community’s constituent microbes. Currently available metagenomic functional annotation pipelines, however, suffer from several shortcomings, including limited pipeline customization options, lack of standard raw sequence data pre-processing, and insufficient capabilities for integration with distributed computing systems.

Results: Here we introduce MetaLAFFA, a functional annotation pipeline designed to take unfiltered shotgun metagenomic data as input and generate functional profiles. MetaLAFFA is implemented as a Snakemake pipeline, which enables convenient integration with distributed computing clusters, allowing users to take full advantage of available computing resources. Default pipeline settings allow new users to run MetaLAFFA according to common practices while a Python module-based configuration system provides advanced users with a flexible interface for pipeline customization. MetaLAFFA also generates summary statistics for each step in the pipeline so that users can better understand pre-processing and annotation quality.

Conclusions: MetaLAFFA is a new end-to-end metagenomic functional annotation pipeline with distributed computing compatibility and flexible customization options. MetaLAFFA source code is available at https://github.com/borenstein-lab/MetaLAFFA and can be installed via Conda as described in the accompanying documentation.

Highlights

Microbial communities have become an important subject of research across multiple disciplines in recent years
The analysis of the functional capacities of microbial communities has become an important component of microbiome-based studies, providing novel insights into associations between the gut microbiome and host conditions such as depression [22], autism [18], and type 2 diabetes [16]
The pipeline we present here utilizes this latter, read-based annotation approach

Summary

Results

For a practical example of MetaLAFFA operation, we ran MetaLAFFA in its default configuration to functionally annotate four stool samples (SRS011061, SRS011134, SRS011239, and SRS012273) from the HMP [20]. These samples ranged in size from 90 million to 130 million reads. Initial formatting of the input data and operation of MetaLAFFA to annotate these samples required very little effort. Because these data files had already passed HMP quality control, few reads were discarded from each sample during MetaLAFFA’s quality control phase. The resulting KO-, module-, and pathway-level profiles, as well as a full summary of operating statistics for this MetaLAFFA run, can be found in Additional file 1: Tables S1–S4.
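To illustrate how a KO-level profile relates to a module-level profile, the following is a minimal sketch, not MetaLAFFA's actual implementation. The function name, the placeholder identifiers (`KO_a`, `M_A`, and so on), the abundance values, and the even-split rule for KOs that belong to several modules are all assumptions made for illustration; a real run would use the pipeline's own output tables and ortholog-to-module mappings.

```python
from collections import defaultdict


def aggregate_ko_profile(ko_abundance, ko_to_modules):
    """Aggregate KO abundances into module abundances.

    Assumption for this sketch: a KO that belongs to several modules
    contributes an equal share of its abundance to each of them, and
    KOs with no module mapping are skipped.
    """
    module_abundance = defaultdict(float)
    for ko, abundance in ko_abundance.items():
        modules = ko_to_modules.get(ko, [])
        if not modules:
            continue  # unmapped KO: contributes to no module
        share = abundance / len(modules)
        for module in modules:
            module_abundance[module] += share
    return dict(module_abundance)


# Hypothetical KO-level profile for one sample (placeholder IDs and values).
ko_abundance = {"KO_a": 12.0, "KO_b": 8.0, "KO_c": 5.0}

# Hypothetical KO-to-module mapping; KO_c is deliberately left unmapped.
ko_to_modules = {"KO_a": ["M_A"], "KO_b": ["M_A", "M_B"]}

module_profile = aggregate_ko_profile(ko_abundance, ko_to_modules)
print(module_profile)
```

The even split is only one simple way to handle orthologs shared between modules; other aggregation rules would give different module-level values from the same KO profile.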

Journal: BMC Bioinformatics	Publication Date: Oct 21, 2020
Citations: 14	License type: open-access

Similar Papers

WHAM!: a web-based visualization suite for user-defined analysis of metagenomic shotgun sequencing data
Joseph C Devlin ... Martin J Blaser
BMC Genomics | VOL. 19
25 Jun 2018

MEDUSA: A Pipeline for Sensitive Taxonomic Classification and Flexible Functional Annotation of Metagenomic Shotgun Sequences.
Diego A. A. Morais ... Matheus A. B. Pasquali
Frontiers in genetics | VOL. 13
07 Mar 2022

Abstract A1-41: Automated pipeline for high confidence variant calling and functional annotation, for matched tumor/normal samples sequenced by next-generation sequencing (NGS)
Susan M Grimes ... Hojoon Lee
Cancer Research | VOL. 75
15 Nov 2015

Evaluation of a hybrid approach using UBLAST and BLASTX for metagenomic sequences annotation of specific functional genes.
Ying Yang ... Xiao-Tao Jiang
PLoS ONE | VOL. 9
27 Oct 2014
