Abstract

Recent advances in understanding the ecology of marine systems have been greatly facilitated by the growing availability of metagenomic data, which provide information on the identity, diversity and functional potential of the microbial community in a particular place and time. Here we present a dataset comprising over 5 terabases of metagenomic data from 610 samples spanning diverse regions of the Atlantic and Pacific Oceans. One set of metagenomes, collected on GEOTRACES cruises, captures large geographic transects at multiple depths per station. The second set represents two years of time-series data, collected at roughly monthly intervals from 3 depths at two long-term ocean sampling sites, Station ALOHA and BATS. These metagenomes contain genomic information from a diverse range of bacteria, archaea, eukaryotes and viruses. The data’s utility is strengthened by the availability of extensive physical, chemical, and biological measurements associated with each sample. We expect that these metagenomes will facilitate a wide range of comparative studies that seek to illuminate new aspects of marine microbial ecosystems.

Highlights

  • Microbial communities are key drivers of marine biogeochemistry

  • Our understanding of the incredible complexity and diversity of natural microbial populations has been greatly enhanced by the advent of cultivation-independent techniques for sequencing DNA directly from an environmental sample

  • Despite progress in describing the complexity of these natural systems, many gaps remain in our understanding of the distribution of genes and organisms in the oceans as well as the selective forces that structure community composition and distribution across space and time

Background & Summary

Microbial communities are key drivers of marine biogeochemistry. Our understanding of the incredible complexity and diversity of natural microbial populations has been greatly enhanced by the advent of cultivation-independent techniques for sequencing DNA directly from an environmental sample. We present whole-community metagenomic data from 610 samples collected in the Atlantic and Pacific Oceans. These data represent snapshots of microbial communities sampled across space and time, and are associated with physical and chemical measurements that are of value in addressing integrative research questions. In addition to the paired-end reads, we include a set of assembled contigs from each metagenome library (Data Citation 3 for GEOTRACES and Data Citation 4 for HOT and BATS). As these metagenomes represent the microbial community in whole water samples, sequences from bacteria (39% of reads), archaea (4%), eukaryotes (1%) and viruses (2%) are present in roughly the same proportions observed in other marine datasets[10]. The physical, chemical, and biological measurements associated with these samples enable studies of the relationships between microbial community structure, functional potential, biogeochemical cycles, and specific environmental variables.
