Abstract

The Indo-Pacific humpback dolphin (Sousa chinensis) is a threatened marine mammal listed in the First Order of the National Key Protected Wild Aquatic Animals List in China. However, limited genomic information is available for studies of its population genetics and biological conservation. Here, we assembled a genome sequence of this species using a whole genome shotgun (WGS) sequencing strategy after a pilot low-coverage genome survey. The total assembled genome size was 2.34 Gb, with a contig N50 of 67 kb and a scaffold N50 of 9 Mb (107.6-fold sequencing coverage). The S. chinensis genome contained 24,640 predicted protein-coding genes and approximately 37% repeated sequences. The completeness of the genome assembly was evaluated with benchmarking universal single-copy orthologs (BUSCOs): 94.3% of the 4,104 expected mammalian genes were identified as complete, and 2.3% as fragmented. This newly produced high-quality genome assembly and annotation will greatly promote future studies of the genetic diversity, conservation and evolution of the species.

Highlights

  • The Indo-Pacific humpback dolphin (Sousa chinensis) normally occurs in Southeast Asia, from at least the southeastern Bay of Bengal east to central China, and south to the Indo-Malay Archipelago[1]

  • Available evidence indicates that the genus Sousa comprises at least four species: the Atlantic humpback (Sousa teuszii), the Indian Ocean humpback (Sousa plumbea), the Australian humpback (Sousa sahulensis) and the Indo-Pacific humpback (S. chinensis) dolphins[7]

  • As the classification and population genetics of the genus Sousa have mainly been based on limited evidence from morphology, genetic markers and mitochondrial sequences[7,8,9], the newly produced genome of S. chinensis should greatly facilitate the classification and identification of Sousa genetic resources

Background & Summary

The Indo-Pacific humpback dolphin (Sousa chinensis) normally occurs in Southeast Asia (in both the Indian and Pacific oceans), from at least the southeastern Bay of Bengal east to central China, and south to the Indo-Malay Archipelago[1]. To obtain a high-quality genome sequence of S. chinensis, we first performed a pilot genome survey with low-depth sequencing (32.9X) (Table 1) on the Illumina HiSeq 4000 platform to estimate the genome size and heterozygosity of the species. The survey assembly was about 2.29 Gb[25] (contig N50 = 13 Kb and scaffold N50 = 163 Kb), and only about 76% of BUSCOs were identified as complete[26]. We then constructed four additional insert-size libraries (besides the 500 bp and 2 Kb libraries used in the genome survey) and generated a total of 290.5 Gb (107.6X) of clean data after filtering (Tables 1 and 2). The BUSCO pipeline identified 95% of the “total complete BUSCOs” in the annotation result (Table 8), which suggests a good-quality genome annotation.
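
The contig and scaffold N50 values and the fold-coverage figures quoted above follow standard definitions, which the short Python sketch below illustrates. It is not the pipeline used in this work: the function names `n50` and `implied_genome_size_gb` are hypothetical, and the toy contig lengths are made up. The only values taken from the text are 290.5 Gb of clean data and 107.6X depth. Dividing them gives a genome size of roughly 2.70 Gb, which is an inference from those two numbers and not a figure stated in this section.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one sequence length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-negative lengths")


def implied_genome_size_gb(clean_data_gb: float, depth: float) -> float:
    """Genome size implied by depth = total sequenced bases / genome size."""
    return clean_data_gb / depth


# Toy lengths for illustration only (not data from the assembly).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70: 80 + 70 = 150 reaches half of the 300 total

# Values quoted in the text: 290.5 Gb of clean data at 107.6X depth.
print(round(implied_genome_size_gb(290.5, 107.6), 2))  # 2.7
```

Real assemblies would normally be summarised with an established assembly-statistics tool; the sketch is only meant to make the arithmetic behind the reported numbers explicit.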
