Abstract

The earth harbors trillions of bacterial species adapted to very diverse ecosystems thanks to the acquisition of specific metabolic functions. Most of the genes responsible for these functions belong to uncultured bacteria and are still to be discovered. Functional metagenomics based on activity screening is a classical way to retrieve these genes from microbiomes. This approach is based on the insertion of large metagenomic DNA fragments into a vector and the transformation of a host to express the heterologous genes. Metagenomic libraries are then screened for activities of interest, and the metagenomic DNA inserts of active clones are extracted, sequenced and analysed to identify the genes responsible for the detected activity. Hundreds of metagenomic sequences found using this strategy have already been published in public databases. Here, we present the MINTIA software package, which enables biologists to easily generate and analyse large sets of metagenomic sequences retrieved after activity-based screening. It filters reads, performs assembly, removes cloning vector sequences, annotates open reading frames and generates user-friendly reports as well as files ready for submission to international sequence repositories. The software package can be downloaded from https://github.com/Bios4Biol/MINTIA.
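To make the open reading frame annotation step more concrete, the following is a minimal sketch of a six-frame ORF scan over an assembled insert sequence. It is an illustration only and is not taken from MINTIA: the function names `reverse_complement` and `find_orfs`, the `min_len` parameter and the restriction to ATG start codons are all assumptions made for this example, and a real annotation tool would use a dedicated gene predictor.

```python
# Illustrative sketch only: NOT MINTIA's implementation.
# Assumptions: ATG is the only start codon, the standard stop codons apply,
# and the input contains only A, C, G and T.

COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_len=30):
    """Scan all six reading frames for ATG...stop ORFs.

    Returns a list of (strand, start, end, orf_sequence) tuples.
    Coordinates are 0-based, end-exclusive, and relative to the strand
    that was scanned (for "-" they refer to the reverse complement).
    """
    orfs = []
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start >= min_len:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None
    return orfs


if __name__ == "__main__":
    # Toy insert: one 36-nt ORF on the forward strand.
    toy_insert = "CC" + "ATG" + "AAA" * 10 + "TAA" + "GG"
    for strand, start, end, orf in find_orfs(toy_insert):
        print(strand, start, end, len(orf))
```

ORFs that run off the end of the sequence without a stop codon are ignored here, which is a simplification; partial genes at insert boundaries are common in cloned metagenomic fragments and would need separate handling.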

Highlights

  • Microbial ecosystems are unrivaled sources of new protein functions

  • Sequence databases are enriched with new metagenomic sequences revealing the tremendous number of putative functions that can be found in the nonculturable species of microbial ecosystems

  • We developed the MINTIA software package dedicated to automated analysis of sequences from cloned metagenomic DNA inserts

Introduction

Microbial ecosystems are unrivaled sources of new protein functions. Thanks to advances in sequencing technologies, microbial ecosystems have been extensively explored over the last decades. The activity-based metagenomics approach includes five steps: (i) inserting DNA fragments extracted from an environmental sample into an expression vector (cosmids, fosmids or bacterial artificial chromosomes), (ii) transforming a microbial expression host to create a metagenomic library, (iii) screening the clone phenotypes using selective media, chromogenic/fluorogenic substrates or reporter systems to isolate the hit clones producing the targeted activity, (iv) sequencing multiplexed metagenomic inserts of the hit clones using NGS technologies, either after an individual DNA barcoding step (Tasse et al., 2010) or directly, without marking them (Lam et al., 2014), and last, (v) analysing the resulting DNA sequences to identify the genes responsible for the screened activity (Healy et al., 1995). Using this approach, a protein function can be assessed without any prior information on its sequence. We present MINTIA, report results from the analysis of simulated and real metagenomic data, and compare them with those of three available open-source software packages: FabFos (https://github.com/hallamlab/FabFos), Shims (Bellott et al., 2018) and RAST (Aziz et al., 2008).
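For the barcoded variant of the sequencing step, reads from pooled hit clones have to be assigned back to their clone of origin before assembly. The sketch below shows the idea with exact prefix matching. It is a simplified illustration under stated assumptions, not the procedure of any of the cited works: the function `demultiplex`, the barcode sequences and the clone names are hypothetical, and real demultiplexers also handle sequencing errors, adapters and paired reads.

```python
# Illustrative sketch only: hypothetical barcodes and clone names.
# Assumption: each read starts with the exact barcode of its clone.
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Group reads by clone using exact barcode prefix matching.

    reads:    iterable of read sequences (str)
    barcodes: dict mapping barcode sequence -> clone name
    Returns a dict mapping clone name -> list of reads with the barcode
    removed. Reads matching no barcode are stored under "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        for barcode, clone in barcodes.items():
            if read.startswith(barcode):
                bins[clone].append(read[len(barcode):])  # strip the barcode
                break
        else:
            # No barcode matched this read.
            bins["unassigned"].append(read)
    return dict(bins)


if __name__ == "__main__":
    example_barcodes = {"ACGT": "clone_A", "TTGG": "clone_B"}
    example_reads = ["ACGTAAAC", "TTGGCCCC", "GGGGTTTT", "ACGTTTTT"]
    for clone, clone_reads in demultiplex(example_reads, example_barcodes).items():
        print(clone, clone_reads)
```

When inserts are pooled without barcodes, this step does not apply and the reads are separated only after assembly.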

