Abstract

Gymnocypris namensis, the only commercial fish in Namtso Lake in Tibet, China, is rated as a near-threatened species in the Red List of China’s Vertebrates. As one of the highest-altitude schizothoracine fishes in China, G. namensis is strongly adapted to the harsh plateau environment. Although it is an indigenous economic fish of high research value, its biological characteristics, genetic diversity, and plateau adaptability remain unclear. Here, we used Pacific Biosciences single-molecule real-time (SMRT) long-read sequencing to generate full-length transcripts of G. namensis. Sequence clustering and error correction with Illumina short reads yielded 319,044 polished isoforms. After redundant sequences were removed, 125,396 non-redundant isoforms were obtained. Among all transcripts, 103,286 were annotated in public databases. Natural selection has acted on 42 genes in G. namensis, which are enriched in mismatch repair and glutathione metabolism. In total, 89,736 open reading frames, 95,947 microsatellites, and 21,360 long non-coding RNAs were identified across all transcripts. This is the first transcriptome study of G. namensis using PacBio Iso-Seq. The full-length transcript isoforms obtained here may accelerate transcriptome research on G. namensis and provide a basis for further studies.

Highlights

  • Gymnocypris namensis, the only commercial fish in Namtso Lake in Tibet, China, is rated as a near-threatened species in the Red List of China’s Vertebrates

  • Lakes on the Tibetan Plateau cover more than 50,900 km², and 1,091 of them are larger than 1.0 km²[2,3]

  • G. namensis, the only economic fish in Namtso Lake, is one of the highest-altitude schizothoracine fishes in China and is strongly adapted to the harsh plateau environment[6,8]


Summary

Introduction

Gymnocypris namensis, the only commercial fish in Namtso Lake in Tibet, China, is rated as a near-threatened species in the Red List of China’s Vertebrates. In total, 89,736 open reading frames, 95,947 microsatellites, and 21,360 long non-coding RNAs were identified across all transcripts. This is the first transcriptome study of G. namensis using PacBio Iso-Seq. Single-molecule real-time (SMRT) sequencing, developed by PacBio, is a long-read technology that overcomes many limitations of next-generation sequencing[15]. Its long reads can span different exon junctions and thus capture full-length transcripts[14,15]. Combining the two technologies can effectively offset their respective shortcomings and yield longer, more accurate transcript information for biological research[15,22,23]. This approach has been used in many animals and plants, such as G. selincuoensis[10], Jiejie wheat[22], corn[24], and American beaver[25], but studies in aquatic animals are still few. Based on the transcripts obtained, we performed transcript functional annotation, microsatellite analysis, coding sequence prediction, and lncRNA prediction, providing a valuable and comprehensive gene sequence resource to the research community for further studies of gene function and environmental adaptation.
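
To make the microsatellite analysis mentioned above more concrete, here is a minimal Python sketch of how perfect tandem repeats can be located in a transcript sequence. It is an illustration only: the function names (`find_ssrs`, `is_primitive`) and the minimum repeat thresholds in `MIN_REPEATS` are assumptions chosen for the example, not the tool or settings used in the study, which this summary does not specify.

```python
import re

# Minimum number of repeats required per motif length (1-6 bp).
# These thresholds are illustrative assumptions, not the study's settings.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA')."""
    return (motif + motif).find(motif, 1) == len(motif)


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect repeats."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # One k-mer followed by at least n-1 further copies of itself
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs such as 'AA' that a shorter motif already describes
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGT" + "AC" * 7 + "TTG" + "AGC" * 5 + "T"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

The sketch reports only perfect repeats and uses 0-based start positions; compound or interrupted microsatellites, which dedicated tools typically also report, are out of scope here.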
