Abstract

Metagenomic data have mainly been analyzed by determining the composition of organisms from a small, well-characterized part of the genome, such as ribosomal RNA genes or mitochondrial DNA. In contrast, whole metagenomic data obtained by shotgun sequencing have rarely been fully analyzed through homology searches, because the genomic data available in databases for living organisms on Earth are insufficient. To complement the results of homology-search-based methods on shotgun metagenome data, we focused on the composition of protein domains deduced from genome and metagenome sequences and used it to characterize genomes and metagenomes, respectively. First, we compared relationships based on similarity in protein domain composition with relationships based on sequence similarity. We searched for protein domains in 325 bacterial species using the Pfam database. Next, we calculated the correlation coefficient of protein domain composition between every pair of bacteria. We also calculated every pairwise genetic distance from 16S rRNA or DNA gyrase subunit B. Comparing the results of the two approaches, we found a moderate correlation between them. Essentially the same results were obtained when we used random 100 bp partial DNA sequences of the bacterial genomes, which simulated the raw sequence data obtained from short-read next-generation sequencers. We then applied the method to actual environmental data obtained by shotgun sequencing. The method showed that a transition of the microbial phase occurred because of the seasonal change in water temperature. These results demonstrate the usefulness of the method for characterizing metagenomic data based on protein domain composition.
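The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy domain counts, and the toy genetic distances are all hypothetical, and the use of the Pearson coefficient is an assumption, since the abstract does not say which correlation coefficient was used. The sketch turns per-genome domain counts into compositions, correlates every pair of genomes, and then checks how well the resulting domain-based distances agree with sequence-based distances.

```python
import numpy as np

def domain_composition(counts):
    """Convert raw domain counts (genomes x domains) to relative frequencies."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)

def pairwise_correlation(composition):
    """Pearson correlation between every pair of genomes (rows)."""
    return np.corrcoef(composition)

def upper_triangle(matrix):
    """Flatten the entries above the diagonal, one per genome pair."""
    i, j = np.triu_indices_from(matrix, k=1)
    return matrix[i, j]

# Hypothetical toy data: 3 genomes x 4 protein domains (e.g. Pfam hit counts).
counts = np.array([
    [10, 5, 0, 2],
    [9, 6, 1, 2],
    [1, 0, 12, 7],
])

# Hypothetical pairwise genetic distances (e.g. from 16S rRNA alignments).
genetic_distance = np.array([
    [0.00, 0.05, 0.40],
    [0.05, 0.00, 0.42],
    [0.40, 0.42, 0.00],
])

composition = domain_composition(counts)
corr = pairwise_correlation(composition)

# Assumption: 1 - r is used as a simple domain-based dissimilarity.
domain_distance = 1.0 - corr

# Agreement between the two ways of relating the genomes.
agreement = np.corrcoef(
    upper_triangle(domain_distance),
    upper_triangle(genetic_distance),
)[0, 1]
print(f"correlation between domain-based and genetic distances: {agreement:.3f}")
```

With real data, the count matrix would have one row per genome (or metagenome sample) and one column per Pfam domain, and the same domain-based distance matrix could feed a cluster dendrogram or a principal component analysis.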

Highlights

  • 71% of the Earth’s surface is covered by ocean, and 80% of life on this planet is believed to exist in this environment [1]

  • A total of 322 species were used for comparison

  • The correlation analysis revealed that the cluster dendrogram that was generated based on the

Summary

A Preliminary Metagenome Analysis Based on a Combination of Protein Domains

Yoji Igarashi 1,†, Daisuke Mori 1,†, Susumu Mitsuyama 1,*, Kazutoshi Yoshitake 1, Hiroaki Ono 2, Tsuyoshi Watanabe 3, Yukiko Taniuchi 3,4, Tomoko Sakami 3,5, Akira Kuwata 3, Takanori Kobayashi 6, Yoshizumi Ishino 7, Shugo Watabe 8, Takashi Gojobori 9 and Shuichi Asakawa 1,*. Hokkaido National Fisheries Research Institute, Japan Fisheries Research and Education Agency, Kushiro, Hokkaido 085-0802, Japan. Research Center for Aquaculture Systems, National Research Institute of Aquaculture, Japan Fisheries.

Introduction
Generating Phylogenetic Trees from 16S Ribosomal RNA
Comparing the Cluster Dendrograms and Phylogenetic Trees
Analysis Test on the Environmental Data
Results
Cluster Analysis and Principal Component Analysis of the Environmental Data
Cluster analysis on the protein domains using environmental metagenomic
Discussion
