Evaluation of a hybrid approach using UBLAST and BLASTX for metagenomic sequences annotation of specific functional genes.

Ying Yang, Tong Zhang, Xiao-Tao Jiang

doi:10.1371/journal.pone.0110947

Abstract

The rapid development of next-generation sequencing (NGS) has dramatically increased the application of metagenomics across many fields. Functional annotation is a major step in metagenomic studies, but fast annotation of functional genes has been a challenge because of the deluge of NGS data and the expansion of reference databases. In this study, a hybrid annotation pipeline previously proposed for taxonomic assignment was evaluated for annotating metagenomic sequences of specific functional genes, such as antibiotic resistance genes, arsenic resistance genes and key genes in nitrogen metabolism. When a small protein database of the specific functional genes is used, the hybrid approach using UBLAST and BLASTX is 44–177 times faster than direct BLASTX, at the cost of missing a small portion (<1.8%) of the target sequences hit by direct BLASTX. Unlike direct BLASTX, the time the hybrid pipeline needs to annotate specific functional genes depends on the abundance of the target genes. This hybrid annotation pipeline is therefore better suited to annotating specific functional genes than to comprehensive functional gene annotation.
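
The dependence of run time on target-gene abundance can be illustrated with a toy cost model. The sketch below assumes the hybrid approach is a two-stage search (a fast UBLAST prescreen over all reads, followed by BLASTX on only the reads that pass); the abstract does not spell out this mechanism, so it is an assumption here. The linear per-read costs, the function names and the example numbers are all hypothetical and are not taken from the paper.

```python
# Toy cost model for a two-stage (prescreen + confirm) search.
# Assumption: cost is linear in the number of reads searched; all
# names and numbers are illustrative, not values from the paper.


def direct_time(n_reads: int, t_blastx: float) -> float:
    """Time to run the slow search (e.g. BLASTX) on every read."""
    return n_reads * t_blastx


def hybrid_time(n_reads: int, t_prescreen: float, t_blastx: float,
                candidate_fraction: float) -> float:
    """Time to prescreen every read, then confirm only the candidates."""
    if not 0.0 <= candidate_fraction <= 1.0:
        raise ValueError("candidate_fraction must be between 0 and 1")
    prescreen = n_reads * t_prescreen
    confirm = n_reads * candidate_fraction * t_blastx
    return prescreen + confirm


def speedup(n_reads: int, t_prescreen: float, t_blastx: float,
            candidate_fraction: float) -> float:
    """Ratio of direct to hybrid time (requires n_reads > 0)."""
    return direct_time(n_reads, t_blastx) / hybrid_time(
        n_reads, t_prescreen, t_blastx, candidate_fraction)


def missed_fraction(direct_hits: set, hybrid_hits: set) -> float:
    """Share of direct-search hits that the hybrid search did not report."""
    if not direct_hits:
        return 0.0
    return len(direct_hits - hybrid_hits) / len(direct_hits)


if __name__ == "__main__":
    # Hypothetical per-read costs: the prescreen is 200x cheaper.
    for fraction in (0.0001, 0.001, 0.01):
        print(fraction, round(speedup(1_000_000, 1.0, 200.0, fraction), 1))
```

Under this model the speedup falls as the fraction of reads passing the prescreen grows, which is consistent with the abstract's observation that the hybrid pipeline's run time depends on the abundance of the target genes.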

Highlights

In recent years, the rapid development of next-generation sequencing (NGS) has broadened the application of metagenomics in various aspects of biological research [1]
RAPSearch2 was one of the ultra-fast database search tools and missed only a small portion of sequences compared with direct BLASTX [9,10]
We first compared the annotation results of RAPSearch2 and UBLAST to evaluate their speed and annotation accuracy

Summary

Introduction

The rapid development of next-generation sequencing (NGS) has broadened the application of metagenomics in various aspects of biological research [1]. The cost of DNA sequencing has fallen faster than the rate predicted by Moore’s law [2]. More NGS sequences were generated by the 1000 Genomes Project within its first 6 months than had accumulated in the NCBI GenBank database over two decades [3]. This deluge of NGS data places greater demands on computational resources for data analysis, which, rather than sequencing cost, have become the bottleneck for metagenomic analysis. It may take months to analyze these data and annotate the overall functions of the genes they contain. Besides the time required for metagenomic analysis, the cost of computational resources is rising to keep pace with the overwhelming increase in data generated, not to mention the hard-to-quantify human resources currently needed for metagenomic data analysis [2].

Journal: PLoS ONE
Publication Date: Oct 27, 2014
Citations: 57
License type: CC BY 4.0
