Abstract

In less than 25 y, the field of animal genome science has transformed from a discipline seeking its first glimpses into genome sequences across the Tree of Life to a global enterprise with ambitions to sequence genomes for all of Earth’s eukaryotic diversity [H. A. Lewin et al., Proc. Natl. Acad. Sci. U.S.A. 115, 4325–4333 (2018)]. As the field rapidly moves forward, it is important to take stock of the progress that has been made to best inform the discipline’s future. In this Perspective, we provide a contemporary, quantitative overview of animal genome sequencing. We identified the best available genome assemblies in GenBank, the world’s most extensive genetic database, for 3,278 unique animal species across 24 phyla. We assessed taxonomic representation, assembly quality, and annotation status for major clades. We show that while tremendous taxonomic progress has occurred, stark disparities in genomic representation exist, highlighted by a systemic overrepresentation of vertebrates and underrepresentation of arthropods. In terms of assembly quality, long-read sequencing has dramatically improved contiguity, whereas gene annotations are available for just 34.3% of taxa. Furthermore, we show that animal genome science has diversified in recent years with an ever-expanding pool of researchers participating. However, the field still appears to be dominated by institutions in the Global North, which have been listed as the submitting institution for 77% of all assemblies. We conclude by offering recommendations for improving genomic resource availability and research value while also broadening global representation.

Highlights

  • The first animal genome sequence was published 23 y ago (1)

  • We have entered an era of genomic natural history

  • We show that as of June 2021, 3,278 unique animals have had their nuclear genome sequenced and the assembly made publicly available in the National Center for Biotechnology Information (NCBI) GenBank database (10)

Introduction

The first animal genome sequence was published 23 y ago (1). The 97-million-base-pair (97 Mb) Caenorhabditis elegans genome assembly ushered in a new era of animal genome biology in which genetic patterns and processes could be investigated at genome scales. A baseline accounting of our progress toward a complete perspective of Earth’s genomic natural history, in which every species has a corresponding, reference-quality genome assembly available, has not been presented. This knowledge gap is important given the momentum toward sequencing all animal genomes, which is being driven by a host of sequencing consortia. We show that as of June 2021, 3,278 unique animal species have had their nuclear genome sequenced and the assembly made publicly available in the National Center for Biotechnology Information (NCBI) GenBank database (10). Thirty-two times more assemblies are available for chordates than for arthropods (Fig. 1).
