Improving the quality of genome, protein sequence, and taxonomy databases: a prerequisite for microbiome meta-omics 2.0.

Olivier Pible, Jean Armengaud

doi:10.1002/pmic.201500104

Abstract

High-throughput shotgun metaproteomic approaches applied to environmental or medical microbiomes are producing huge amounts of tandem mass spectrometry data. These data can be interpreted either with a general protein sequence database comprising tens of thousands of sequenced genomes or with a more customized database, such as one obtained by metagenome sequencing of DNA extracted from the same sample. However, not all entries in a nucleotide or protein sequence database are of equal quality, and this can critically impact metaproteomic data interpretation. In this viewpoint article, we exemplify several key issues. First, interpretation of genome or transcriptome data may be erroneous because of inaccurate contig assembly and gene prediction; metaproteogenomic strategies offer a promising way to mitigate this. Second, errors in sample handling and taxonomical characterization may also be problematic. Third, cross-contamination of genome sequences is frequent yet underestimated. As a consequence of these structural errors in protein sequences, and of additional problems arising from homology-based functional annotation of proteins, specific efforts are required to improve the interpretation of metaproteomic data. We propose the development of new bioinformatic pipelines devoted to the detection and correction of errors and contaminations, in order to improve the overall quality of sequence and taxonomy databases for metaproteomics.

Journal: PROTEOMICS
Publication Date: Sep 10, 2015
Citations: 37
