Abstract

As the amount of genomic data and the number of bioinformatics resources grow each day, researchers are increasingly challenged with selecting the most appropriate approach to analyze their data. In addition, the opportunity to undertake comparative genomic analyses is growing rapidly. This is especially true for fungi due to their small genome sizes (i.e., mean 1C = 44.2 Mb). Given these opportunities and aiming to gain novel insights into the evolution of mutualisms, we focus on comparing the quality of whole-genome assemblies for the cultivars of fungus-growing ants (Hymenoptera: Formicidae: Attini) and a free-living relative. Our analyses reveal that currently available methodologies and pipelines for analyzing whole-genome sequence data need refining. By using different genome assemblers, we show that genome assembly size depends on the software used. This, in turn, impacts gene number predictions, with predicted gene numbers correlating positively with genome assembly size. Furthermore, the majority of fungal genome size data currently available are estimates derived from whole-genome assemblies generated from short-read data, rather than from the more accurate technique of flow cytometry. Here, we estimated the haploid genome sizes of three ant fungal symbionts by flow cytometry using the fungus Pleurotus ostreatus (Jacq.) P. Kumm. (1871) as a calibration standard. We found that published genome sizes based on genome assemblies are 2.5- to 3-fold larger than our estimates based on flow cytometry. We therefore recommend that flow cytometry be used to precalibrate genome assembly pipelines, to avoid incorrect estimates of genome size and to ensure robust assemblies.

Highlights

  • Genome sequencing and analysis are increasing daily as costs fall, but the wide range of available software can make the data difficult to analyze and may lead to erroneous genome assemblies

  • We show that different software can lead to different conclusions from the same genome data: the longer the genome assembly, the more genes are predicted from it

  • We show that accurately measuring genome size by flow cytometry provides a quality control for genome assemblies
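The quality-control idea in the last highlight can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: it assumes that fluorescence scales linearly with DNA content, so that a sample's genome size follows from the ratio of its fluorescence peak to that of a calibration standard of known size. The function names, the 20% tolerance, and every number in the usage example are hypothetical and are not taken from the study.

```python
def genome_size_from_flow_cytometry(sample_peak, standard_peak, standard_size_mb):
    """Estimate genome size (Mb) from the fluorescence peak of a sample
    relative to a calibration standard of known genome size.

    Assumes fluorescence is proportional to DNA content (an assumption
    of this sketch, not a formula stated in the source).
    """
    if sample_peak <= 0 or standard_peak <= 0 or standard_size_mb <= 0:
        raise ValueError("peaks and standard size must be positive")
    return standard_size_mb * sample_peak / standard_peak


def assembly_fold_difference(assembly_size_mb, measured_size_mb):
    """Return how many times larger the assembly is than the measured size."""
    if assembly_size_mb <= 0 or measured_size_mb <= 0:
        raise ValueError("sizes must be positive")
    return assembly_size_mb / measured_size_mb


def assembly_is_suspect(assembly_size_mb, measured_size_mb, tolerance=0.2):
    """Flag an assembly whose size deviates from the measured genome size
    by more than `tolerance` (a hypothetical threshold, here 20%)."""
    fold = assembly_fold_difference(assembly_size_mb, measured_size_mb)
    return abs(fold - 1.0) > tolerance


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    measured = genome_size_from_flow_cytometry(
        sample_peak=80.0, standard_peak=100.0, standard_size_mb=35.0
    )
    fold = assembly_fold_difference(assembly_size_mb=70.0, measured_size_mb=measured)
    print(f"measured size: {measured:.1f} Mb, assembly is {fold:.2f}x larger")
    print("suspect assembly:", assembly_is_suspect(70.0, measured))
```

A check of this kind could run after each assembler, so that an assembly several times larger than the measured genome size is flagged before gene prediction begins.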


