Abstract

Background: To date, biologists have discovered a large amount of valuable information from assembled genomes, but the abundant microbial data hidden in the raw genomic sequence data of plants and animals is usually ignored. In this study, the richness and composition of the fungal community were determined in the raw genomic sequence data of Ceratosolen solmsi (RGSD-CS).

Results: To avoid interference from sequences of C. solmsi, the unmapped raw data (about 17.1%) was obtained by excluding the assembled genome of C. solmsi from RGSD-CS. Of the two fungal reference datasets compared, the internal transcribed spacer (ITS) and the large ribosomal subunit (LSU) of rRNA, the ITS dataset discovered a more diverse fungal community and was therefore selected as the reference dataset for evaluating the fungal community based on the unmapped raw data. A sequence identity threshold of 95% revealed many more matched fungal reads and higher fungal richness in the unmapped raw data than identity thresholds above 95%. Based on the 95% threshold, the fungal community of RGSD-CS was primarily composed of Saccharomycetes (88.4%) and two other classes (Agaricomycetes and Sordariomycetes, 8.3% in total). Compared with the fungal communities reported for other fig wasps, Agaricomycetes and Eurotiomycetes were found to be unique to C. solmsi. In addition, the ratio of total fungal reads to RGSD-CS was estimated to be at least 4.8 × 10⁻³, which indicated that a large amount of fungal data was contained in RGSD-CS. However, the rarefaction measure indicated that deeper sequencing coverage of RGSD-CS was required to discover the entire fungal community of C. solmsi.

Conclusion: This study investigated the richness and composition of the fungal community in RGSD-CS and provided new insights into the efficient study of microbial diversity using raw genomic sequence data.

Electronic supplementary material: The online version of this article (doi:10.1186/s12866-015-0370-3) contains supplementary material, which is available to authorized users.
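
The rarefaction measure mentioned in the Results can be illustrated with a minimal sketch. The abstract does not say how the authors computed it, so the code below uses the standard analytical (Hurlbert) expectation as an assumption. The function names `expected_richness` and `rarefaction_curve` and the read counts in `toy_counts` are hypothetical and are not data from the study.

```python
from math import comb


def expected_richness(counts, n):
    """Expected number of taxa seen in a random subsample of n reads.

    counts: reads per taxon (hypothetical input format).
    Uses the analytical rarefaction expectation:
    E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)).
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total read count")
    denom = comb(total, n)
    # comb(total - c, n) is 0 when fewer than n reads remain without taxon i
    return sum(1 - comb(total - c, n) / denom for c in counts if c > 0)


def rarefaction_curve(counts, step):
    """Return (depth, expected richness) pairs from 0 up to the full depth."""
    total = sum(counts)
    depths = list(range(0, total, step)) + [total]
    return [(n, expected_richness(counts, n)) for n in depths]


# Toy, made-up read counts per fungal taxon (NOT the study's data)
toy_counts = [600, 250, 100, 30, 15, 4, 1]
curve = rarefaction_curve(toy_counts, step=250)
for depth, richness in curve:
    print(f"{depth:5d} reads -> {richness:.2f} expected taxa")
```

A curve that is still rising at the full sequencing depth suggests that rare taxa remain unsampled, which is the kind of evidence the abstract cites for needing deeper coverage.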

Highlights

  • To date, biologists have discovered a large amount of valuable information from assembled genomes, but the abundant microbial data that is hidden in the raw genomic sequence data of plants and animals is usually ignored

  • Proper parameters for the screening of fungal sequences in raw genomic sequence data of Ceratosolen solmsi (RGSD-CS): prior to screening fungal reads in RGSD-CS, the unmapped raw data was obtained by excluding from RGSD-CS the reads that matched the assembled genome of C. solmsi with 100% similarity

  • The matched reads in the unmapped raw data, each of which hit just one fungal taxon based on the matched internal transcribed spacer (ITS) reference sequences, revealed fungal communities composed of 12 to 14 classes and five subphyla
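
The screening idea in the highlights above (keep hits at or above an identity threshold, keep only reads that hit just one fungal taxon, then summarize composition) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the `(read_id, taxon, percent_identity)` tuple format, the function name `taxon_composition`, and the example hits are all hypothetical.

```python
from collections import Counter, defaultdict


def taxon_composition(hits, min_identity=95.0):
    """Fraction of unambiguous reads assigned to each taxon.

    hits: iterable of (read_id, taxon, percent_identity) tuples,
    a hypothetical stand-in for parsed alignment results.
    """
    taxa_per_read = defaultdict(set)
    for read_id, taxon, identity in hits:
        if identity >= min_identity:  # apply the identity threshold
            taxa_per_read[read_id].add(taxon)

    # Keep only reads whose passing hits all point to a single taxon
    counts = Counter(
        next(iter(taxa)) for taxa in taxa_per_read.values() if len(taxa) == 1
    )
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()} if total else {}


# Made-up example hits (NOT data from the study)
example_hits = [
    ("r1", "Saccharomycetes", 99.0),
    ("r1", "Saccharomycetes", 96.0),
    ("r2", "Agaricomycetes", 97.5),
    ("r3", "Sordariomycetes", 94.0),  # below threshold, dropped
    ("r4", "Saccharomycetes", 98.0),
    ("r4", "Eurotiomycetes", 95.0),   # r4 hits two taxa, dropped
]
composition = taxon_composition(example_hits)
print(composition)
```

Raising `min_identity` above 95 in this sketch discards more hits, which mirrors the abstract's observation that stricter thresholds recover fewer fungal reads.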


Summary

Introduction

Biologists have discovered a large amount of valuable information from assembled genomes, but the abundant microbial data hidden in the raw genomic sequence data of plants and animals is usually ignored. Culture-independent methods have been commonly applied in more recent studies. These methods, which include denaturing gradient gel electrophoresis (DGGE), temperature gradient gel electrophoresis (TGGE), terminal restriction fragment length polymorphism (T-RFLP), and clone libraries, are based on a barcoding fragment of a conserved gene and can be used to quickly and cheaply determine the main components of […]. High-throughput sequencing with DNA metabarcoding has minimized these issues by providing a large amount of sequence data. Scientists have used this valuable method to make multiple important findings regarding the relationships between microbes and their hosts [12,13,14,15], but targeted sequencing of large amounts of fungal barcoding data is relatively expensive.
