Abstract

Cattle (Bos taurus) are among the most widely distributed livestock species in the world and provide high-quality milk and meat, which have a large impact on the quality of human life. Accurate and complete transcriptome and genome annotations are therefore of great value to cattle breeding research. In this study, we used error-corrected PacBio single-molecule real-time (SMRT) data to perform whole-transcriptome profiling in cattle. In total, 22.5 Gb of subreads were generated, yielding 381,423 circular consensus sequences (CCSs), among which 276,295 full-length non-chimeric (FLNC) sequences were identified. After correction with Illumina short reads, we obtained 22,353 error-corrected isoforms. Transcriptome structural analysis detected 305 alternative splicing (AS) events and 3,795 alternative polyadenylation (APA) sites. Furthermore, we identified 457 novel genes, 120 putative transcription factors (TFs), and 569 novel long non-coding RNAs (lncRNAs). Taken together, these results improve our understanding of, and provide new insights into, the complexity of full-length transcripts in cattle.

Highlights

  • Cattle (Bos taurus) are an agriculturally important species that provide human beings with large quantities of high-quality protein

  • To reveal the complexity of the cattle transcriptome, six tissues were collected and their pooled RNA was sequenced on the Pacific Biosciences Sequel platform to accurately capture full-length sequences and uncover full-length splice variants

  • 286,688 circular consensus sequences (CCSs) were identified as full-length reads, and 276,295 were identified as full-length non-chimeric (FLNC) reads with few artificial concatemers

Introduction

Cattle (Bos taurus) are an agriculturally important species that provide human beings with large quantities of high-quality protein. As typical ruminants, cattle continue to play an important role in sustainable agriculture because they can efficiently utilize pastures, silage, and high-fiber crop residues. Genomic information is important for accelerating molecular breeding in cattle, so an accurate and complete reference genome and annotation are essential for research on genetic mechanisms, quantitative trait locus (QTL) mapping, and genomic selection of important production traits. The latest reference genome assembly (ARS-UCD1.2) was first reported in 2018 and covers 2.7 Gb of the genome (Rosen et al., 2020). The annotation of the ARS-UCD1.2 assembly (NCBI release 106) comprises 21,039 protein-coding genes, 9,357 non-coding genes, and 4,569 pseudogenes (Rosen et al., 2020).