Abstract

BackgroundTranscriptome interpretation relies on a good-quality reference transcriptome for accurate quantification of gene expression as well as functional analysis of genetic variants. The current annotation of the horse genome lacks the specificity and sensitivity necessary to assess gene expression especially at the isoform level, and suffers from insufficient annotation of untranslated regions (UTR) usage. We built an annotation pipeline for horse and used it to integrate 1.9 billion reads from multiple RNA-seq data sets into a new refined transcriptome.ResultsThis equine transcriptome integrates eight different tissues from 59 individuals and improves gene structure and isoform resolution, while providing considerable tissue-specific information. We utilized four levels of transcript filtration in our pipeline, aimed at producing several transcriptome versions that are suitable for different downstream analyses. Our most refined transcriptome includes 36,876 genes and 76,125 isoforms, with 6474 candidate transcriptional loci novel to the equine transcriptome.ConclusionsWe have employed a variety of descriptive statistics and figures that demonstrate the quality and content of the transcriptome. The equine transcriptomes that are provided by this pipeline show the best tissue-specific resolution of any equine transcriptome to date and are flexible for several downstream analyses. We encourage the integration of further equine transcriptomes with our annotation pipeline to continue and improve the equine transcriptome.

Highlights

  • Transcriptome interpretation relies on a good-quality reference transcriptome for accurate quantification of gene expression as well as functional analysis of genetic variants

  • The transcriptional evidence provided by total RNA sequencing (RNA-seq) was the basis of our gene annotation

  • Overall mapping statistics and gene counts after filtration RNA-seq of 59 samples in 12 libraries from 8 different horse tissues provided 1917.7 million fragments and 364 Gb of sequence bases

Read more

Summary

Introduction

Transcriptome interpretation relies on a good-quality reference transcriptome for accurate quantification of gene expression as well as functional analysis of genetic variants. For less well characterized animals, such as the horse, there is often only annotation of a single variant of a gene and with insufficient annotation of multiple splice variants, UTR extensions and non-protein coding RNA This lack of information can challenge subsequent differential gene expression analyses and functional studies. ENSEMBL and NCBI provide publically available annotations for several vertebrate genomes including horse [8] Both underlying annotation pipelines integrate homology search and ab initio prediction; accurate UTR prediction and isoform recognition require speciesspecific transcriptional evidence [9, 10]. For this equine transcriptome, the transcriptional evidence provided by total RNA sequencing (RNA-seq) was the basis of our gene annotation. This approach permits more reliable discovery of novel genes and isoforms, extension of UTRs and the flexibility necessary to establish a balance between sensitivity and specificity of gene detection for downstream applications

Objectives
Methods
Results
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.