Abstract
Background
Widespread bioinformatic resource development generates a constantly evolving and abundant landscape of workflows and software. For analysis of the microbiome, workflows typically begin with taxonomic classification of the microorganisms that are present in a given environment. Additional investigation is then required to uncover the functionality of the microbial community, in order to characterize its currently or potentially active biological processes. Such functional analysis of metagenomic data can be computationally demanding for high-throughput sequencing experiments. Instead, we can directly compare sequencing reads to a functionally annotated database. However, since reads frequently match multiple sequences equally well, analyses benefit from a hierarchical annotation tree, e.g. for taxonomic classification where reads are assigned to the lowest taxonomic unit.
Results
To facilitate functional microbiome analysis, we re-purpose well-known taxonomic classification tools to allow us to perform direct functional sequencing read classification with the added benefit of a functional hierarchy. To enable this, we develop and present a tree-shaped functional hierarchy representing the molecular function subset of the Gene Ontology annotation structure. We use this functional hierarchy to replace the standard phylogenetic taxonomy used by the classification tools and assign query sequences accurately to the lowest possible molecular function in the tree. We demonstrate this with simulated and experimental datasets, where we reveal new biological insights.
Conclusions
We demonstrate that improved functional classification of metagenomic sequencing reads is possible by re-purposing a range of taxonomic classification tools that are already well-established, in conjunction with either protein or nucleotide reference databases. We leverage the advances in speed, accuracy and efficiency that have been made for taxonomic classification and translate these benefits for the rapid functional classification of microbiomes. While we focus on a specific set of commonly used methods, the functional annotation approach has broad applicability across other sequence classification tools. We hope that re-purposing becomes a routine consideration during bioinformatic resource development.
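The abstract notes that a read matching several reference sequences equally well benefits from a hierarchy. One common way classifiers resolve such ties is to assign the read to the lowest common ancestor (LCA) of all its hits. The sketch below illustrates that idea on a toy functional tree. It is an assumption for illustration, not the classification tools' actual code, and the node names and the `PARENT` mapping are hypothetical rather than real GO identifiers.

```python
# Toy functional hierarchy as child -> parent.
# Node names are illustrative placeholders, not real GO terms.
PARENT = {
    "binding": "molecular_function",
    "catalytic_activity": "molecular_function",
    "hydrolase_activity": "catalytic_activity",
    "peptidase_activity": "hydrolase_activity",
    "lipase_activity": "hydrolase_activity",
}


def path_to_root(node):
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def lowest_common_ancestor(nodes):
    """Return the deepest node that is an ancestor of (or equal to) every input node."""
    nodes = list(nodes)
    common = set(path_to_root(nodes[0]))
    for other in nodes[1:]:
        common &= set(path_to_root(other))
    # Walking up from the first node, the first shared node is the deepest one.
    for candidate in path_to_root(nodes[0]):
        if candidate in common:
            return candidate
    raise ValueError("nodes do not share a root")


# A read hitting two sibling functions is assigned to their shared parent.
print(lowest_common_ancestor(["peptidase_activity", "lipase_activity"]))
```

A read with a single unambiguous hit keeps that specific term, while a read whose hits span distant branches falls back to a more general node.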
Highlights
Widespread bioinformatic resource development generates a constantly evolving and abundant landscape of workflows and software
We chose the depth-first search (DFS) approach for the tree structure derivation as it maximized the average distance of nodes from the root compared to a breadth-first search (BFS) or a random approach (RND), enabling more specific annotation of sequences (Fig. 1f)
Although we develop a functional hierarchy and repurpose taxonomic sequence classification software for functional annotation, our Gene Ontology (GO) term reference database can still be used with other alignment software for functional annotation
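The DFS versus BFS comparison in the highlights can be illustrated on a toy graph. The sketch below assumes the tree is derived by keeping, for each node, the parent through which it is first visited. That rule, the `DAG` mapping and all function names are assumptions for illustration, not the authors' implementation.

```python
from collections import deque

# Toy DAG as parent -> children. Node "b" has two parents ("root" and "a"),
# mimicking terms with multiple parents in the GO graph.
DAG = {
    "root": ["a", "b"],
    "a": ["b"],
    "b": ["c"],
    "c": [],
}


def dfs_tree_depths(dag, root):
    """Depth of each node in the spanning tree formed by depth-first first visits."""
    depth = {root: 0}

    def visit(node):
        for child in dag[node]:
            if child not in depth:
                depth[child] = depth[node] + 1
                visit(child)

    visit(root)
    return depth


def bfs_tree_depths(dag, root):
    """Depth of each node in the spanning tree formed by breadth-first first visits."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in dag[node]:
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)
    return depth


def mean_depth(depth):
    """Average distance from the root over all non-root nodes."""
    values = [d for d in depth.values() if d > 0]
    return sum(values) / len(values)


print(mean_depth(dfs_tree_depths(DAG, "root")))  # 2.0
print(mean_depth(bfs_tree_depths(DAG, "root")))  # about 1.33
```

BFS always places a node at its shortest distance from the root, so a first-visit DFS tree is never shallower on average. How much deeper it is depends on the order in which children are listed.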
Summary
Widespread bioinformatic resource development generates a constantly evolving and abundant landscape of workflows and software. Such analyses range from microbiome analysis for fields such as animal health, agriculture and environmental studies [2, 3], to those focusing on human samples such as skin, saliva, stool or blood, since variation in the human microbiome has been linked to health conditions and diseases [4]. Both 16S rRNA gene and whole metagenome shotgun sequencing can be used to identify the microorganisms that are present in a sample, i.e. the community structure, while further investigation is required to derive the functional potential of the microbial community from the sequence data [5, 6]. This approach for functional analysis of the microbiome can be laborious and computationally demanding.