Abstract

Tunicates are the sister group of vertebrates and thus occupy a key position for investigating vertebrate innovations as well as the consequences of the vertebrate-specific genome duplications. Nevertheless, tunicate genomes have not been studied extensively, and comparative studies of tunicate genomes remain scarce. The carpet sea squirt Didemnum vexillum, commonly known as “sea vomit”, is a colonial tunicate considered an invasive species with substantial ecological and economic risk. We report the assembly of the D. vexillum genome using a hybrid approach that combines 28.5 Gb of Illumina and 12.35 Gb of PacBio data. The new hybrid scaffolded assembly has a total size of 517.55 Mb and increases contig length about eightfold compared to the previous, Illumina-only assembly. As a consequence of the unusually high genetic diversity of the colonies and the moderate length of the PacBio reads, presumably caused by the unusually acidic milieu of the tunic, the assembly is highly fragmented (L50 = 25,284, N50 = 6539). It is sufficient, however, for comprehensive annotation of both protein-coding genes and non-coding RNAs. Despite its shortcomings, the draft assembly of the “sea vomit” genome provides a valuable resource for comparative tunicate genomics and for the study of the specific properties of colonial ascidians.
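The fragmentation statistics quoted in the abstract can be illustrated with a minimal sketch. It assumes the common definitions: N50 is the length of the sequence at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly size, and L50 is the number of sequences needed to get there. The function name `n50_l50` and the toy lengths are hypothetical and are not taken from the D. vexillum assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    # Walk through the sequences from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first sequence that pushes the running total to at least
        # half of the assembly size defines N50; its rank is L50.
        if running >= half_total:
            return length, count


# Toy example with made-up lengths (not data from the assembly).
toy_lengths = [100, 80, 60, 40, 20]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)  # 80 2
```

Under these definitions, a high L50 combined with a low N50, as reported above, means that many short sequences are needed to cover half of the assembly.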

Highlights

  • The carpet sea squirt Didemnum vexillum [1], commonly called “sea vomit”, “marine vomit”, “pancake batter tunicate”, or “carpet sea squirt”, is a colonial tunicate presumably native to Japan that has appeared as an invasive species in Europe, the Americas, and New Zealand [2]

  • In this study we expand the assembly and annotation of tunicate genomic resources and improve the current genome assembly of the colonial tunicate D. vexillum, producing a resource that contributes to unraveling the origins of chordates and improves our understanding of the genomic changes involved in the novel mechanisms of asexual reproduction of colonial animals

  • We retrieved the RUNX sequences reported in [78] for 16 available chordates from NCBI: AN08565.1, AAN08567.1, AAQ88389.1, AAS02047.1, AAS21356.1, BAA03485.1, BAF36001.1, BAF36011.1, EAX04278.1, EDL03777.1, EDL29993.1, ENSCINT00000004611.3, NP_001001890.1, NP_001092121.1, NP_004341.1 and NP_571678.1. These sequences were searched with blastp against the proteome of D. vexillum and the following 10 species: B. floridae, B. leachii, B. schlosseri, C. robusta, C. savignyi, M. oculata, M. occidentalis, O. dioica, P. marinus, and L. chalumnae
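A blastp search of this kind is usually followed by picking the strongest hit per query. The sketch below is only an illustration of that step, not the authors' pipeline: it assumes the hits were saved in the default BLAST+ tabular layout (`-outfmt 6`, twelve tab-separated columns), and the e-value cutoff, the function name `best_hits`, the subject identifiers, and the example rows are all hypothetical.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, max_evalue=1e-5):
    """Keep the highest-bitscore hit per query below an assumed e-value cutoff."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        evalue = float(hit["evalue"])
        bitscore = float(hit["bitscore"])
        if evalue > max_evalue:
            continue  # too weak under the assumed cutoff
        current = best.get(hit["qseqid"])
        if current is None or bitscore > current[2]:
            best[hit["qseqid"]] = (hit["sseqid"], evalue, bitscore)
    return best


# Made-up rows for illustration; subject IDs and scores are not real results.
example = io.StringIO(
    "NP_004341.1\tprotA\t62.5\t128\t48\t0\t50\t177\t40\t167\t1e-50\t170\n"
    "NP_004341.1\tprotB\t40.0\t100\t60\t0\t55\t154\t10\t109\t1e-10\t60.1\n"
    "NP_571678.1\tprotC\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25.0\n"
)
print(best_hits(example))
```

Running the same filter once per target proteome would give one candidate RUNX homolog per query and species, which could then be checked by hand.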


Summary

Introduction

The carpet sea squirt Didemnum vexillum [1], commonly called “sea vomit”, “marine vomit”, “pancake batter tunicate”, or “carpet sea squirt”, is a colonial tunicate presumably native to Japan that has appeared as an invasive species in Europe, the Americas, and New Zealand [2]. The genomes of four solitary tunicates have been assembled and annotated in substantial depth. Draft assemblies have recently become available for the pelagic colonial thaliacian Salpa thompsoni, which was used to analyze the high mutation rates in the genomes of tunicates [19]. A very fragmented assembly of the “sea vomit” D. vexillum was recently produced by our group to analyze non-coding RNAs (ncRNAs) [26]. In this study we expand the assembly and annotation of tunicate genomic resources and improve the current genome assembly of the colonial tunicate D. vexillum, producing a resource that contributes to unraveling the origins of chordates and improves our understanding of the genomic changes involved in the novel mechanisms of asexual reproduction of colonial animals.

DNA Extraction
Partial Degradation of Genomic DNA
RNA Extraction
Genome Sequencing
Data Preprocessing and Pre-Assembling
Contig-Level Assembly
Genome Scaffolding
Assembly Polishing
Transcriptome Data Assembly
Genome Annotation
Identification of Contamination
Annotation of Non-Coding RNAs
Computational Identification of miRNAs
Mitochondrial Genes
Genome Size and GC Content Estimation
2.10. Functional Annotation of Protein Coding Genes
2.10.1. Protein Enrichment Analysis
2.10.2. Interaction Analysis of Proteins
2.10.3. Annotation of Homeobox Proteins
2.10.4. Detection of Orthologous Proteins Involved in Skeletogenesis
2.10.5. Gene Phylogenies
Results
Transcriptome Sequencing and Assembly
Detection and Analysis of Repetitive Regions
Annotation of Protein-Coding Genes
BUSCO Assessment Results
Homeobox Transcription Factors
Mitochondrial Genome
Functional Annotation and Comparison of Proteins across the Tunicates
Genome Browser and Analysis of Genomic Coordinates
Discussion
