Abstract

High-quality, complete gene models are the basis of whole-genome analyses. The giant panda (Ailuropoda melanoleuca) genome was the first genome sequenced solely from short reads, but its annotation lacked the support of transcriptomic evidence. In this study, we applied RNA-seq to improve the completeness of the genome assembly and to detect novel expressed transcripts in 12 giant panda tissues, using a transcriptome reconstruction strategy that combined reference-based and de novo methods. The de novo assembled transcripts improved several aspects of assembly completeness in the transcribed regions, including genome scaffolding, detection of small assembly errors, extension of scaffold/contig boundaries, and gap closure. Through expression and homology validation, we detected three groups of novel full-length protein-coding genes. A total of 12.62% of the novel protein-coding genes were validated by proteomic data. GO annotation analysis showed that some of the novel protein-coding genes were involved in pigmentation, anatomical structure formation and reproduction, which might be related to the development and evolution of the black-and-white pelage, the pseudo-thumb and the delayed embryonic implantation of giant pandas. The updated genome annotation will support further giant panda studies from both structural and functional perspectives.

Highlights

  • The ability of their gut microbiome to degrade bamboo cellulose and hemi-cellulose[4,5]

  • High-quality reads were mapped to the giant panda draft genome

  • The giant panda genome is the first large mammalian genome assembled de novo from Illumina short reads alone[2]


Summary

Introduction

The ability of their gut microbiome to degrade bamboo cellulose and hemi-cellulose[4,5]. Transcriptome analyses have identified many novel transcripts even in model organisms with well-annotated genomes[6,7,8,9], which underscores the complexity of genome annotation. We reconstructed transcripts from RNA-seq data of 12 giant panda tissues to verify the predicted gene models, close gaps and extend scaffold/contig boundaries, identify novel protein-coding transcripts, and improve the annotation of the panda genome. These findings will facilitate new insights into the genetics and evolutionary biology of this high-profile species.

