Abstract

Background: The Chinese giant salamander (CGS) is the largest extant amphibian species in the world. Owing to its evolutionary position and four peculiar life phenomena (longevity, starvation tolerance, regenerative ability, and hatching without sunshine), it is an invaluable model species for research. However, a lack of genomic resources has limited progress in these fields, because its huge genome of ∼50 GB is extremely difficult to assemble.

Results: We report the sequenced transcriptomes of more than 20 tissues from adult CGS, generated using Illumina HiSeq 2000 technology; a total of 93 366 non-redundant transcripts with a mean length of 1326 bp were obtained. We developed, for the first time, an efficient pipeline to construct a high-quality reference gene set for CGS and obtained 26 135 coding genes. BUSCO and homology assessment showed that our assembly captured 70.6% of vertebrate universal single-copy orthologs, and that this coding gene set had a higher proportion of complete CDS, with quality comparable to that of the protein set of the Tibetan frog.

Conclusions: These high-quality data provide a valuable reference gene set for subsequent research on CGS. In addition, our strategy of de novo transcriptome assembly and protein identification is applicable to similar studies.

Highlights

  • The Chinese giant salamander (CGS) is the largest extant amphibian species in the world

  • Benchmarking Universal Single-Copy Orthologs (BUSCO) and homology assessment showed that our assembly captured 70.6% of vertebrate universal single-copy orthologs, and that this coding gene set had a higher proportion of complete CDS, with quality comparable to that of the protein set of the Tibetan frog

  • To evaluate the completeness of this coding gene set, we ran Benchmarking Universal Single-Copy Orthologs (BUSCO; http://busco.ezlab.org/) on the CGS gene set using the vertebrata dataset [9] and compared the results with those of two frog species for which whole-genome data are available: Western clawed frog (Xenopus tropicalis; http://ftp.ensembl.org/pub/release-81/fasta/xenopus_tropicalis/) and Tibetan frog (Nanorana parkeri; BioProject accession: PRJNA243398)


Summary

Background

The Chinese giant salamander (CGS; Andrias davidianus), belonging to the order Caudata, family Cryptobranchidae, is the largest extant amphibian species in the world. To obtain an integrated transcript set, we first pooled all clean data and performed a combined assembly with the publicly available program Trinity (V2.0.6; http://trinityrnaseq.sourceforge.net/) using the following parameters: min_kmer_cov = 3, min_glue = 3, group_pairs_distance = 250, path_reinforcement_distance = 85 [8]. This yielded a huge number of transcripts, up to 425 357, including many assembly errors and background sequences. Among the total mapped reads, the unique mapped reads were >98% and the multiple mapped reads were
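The Trinity settings quoted above could be passed on the command line roughly as follows. This is a minimal sketch, not the authors' actual invocation: the four assembly parameters come from the text, while the input file names, output directory, `--seqType fq`, `--max_memory`, and `--CPU` values are hypothetical placeholders chosen for illustration.

```python
import shlex


def build_trinity_command(left, right, output_dir, max_memory="50G", cpu=8):
    """Return a Trinity argument list for a pooled paired-end assembly.

    Only the four assembly parameters below are taken from the text;
    every other value is an assumption for illustration.
    """
    return [
        "Trinity",
        "--seqType", "fq",              # assumed: FASTQ input
        "--left", left,                 # hypothetical pooled read 1 file
        "--right", right,               # hypothetical pooled read 2 file
        "--max_memory", max_memory,     # assumed resource limit
        "--CPU", str(cpu),              # assumed thread count
        "--min_kmer_cov", "3",          # from the text
        "--min_glue", "3",              # from the text
        "--group_pairs_distance", "250",        # from the text
        "--path_reinforcement_distance", "85",  # from the text
        "--output", output_dir,         # hypothetical output directory
    ]


cmd = build_trinity_command("cgs_all_R1.fq", "cgs_all_R2.fq", "trinity_cgs")
# Print a shell-safe command line; run it with subprocess.run(cmd, check=True)
# on a machine where Trinity is installed.
print(shlex.join(cmd))
```

Building the command as a list keeps each flag paired with its value and avoids shell-quoting problems if the command is later run through `subprocess`.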

Evaluation of coding gene set
Conclusions and future directions
Availability of supporting data