Abstract

Background: Zebrafish is a well-established model system for studying developmental processes and human disease. Recent deep-sequencing studies have discovered a large number of long non-coding RNAs (lncRNAs) in zebrafish, but only a few of them have been functionally characterized. How to exploit the mature zebrafish system to investigate lncRNA function and conservation in depth therefore remains an intriguing question.

Results: We systematically collected and analyzed a series of zebrafish RNA-seq datasets and combined them with resources from known databases and the literature. As a result, we obtained by far the most complete dataset of zebrafish lncRNAs to date, containing 13,604 lncRNA genes (21,128 transcripts) in total. Based on this dataset, we constructed and analyzed a co-expression network of zebrafish coding and lncRNA genes and used it to predict Gene Ontology (GO) and KEGG annotations for the lncRNAs. We also performed a conservation analysis of zebrafish lncRNAs, identifying 1828 conserved zebrafish lncRNA genes (1890 transcripts) that have putative mammalian orthologs. We further found that zebrafish lncRNAs play important roles in regulating the development and function of the nervous system, and that the conserved lncRNAs show significant sequence and functional conservation with their mammalian counterparts.

Conclusions: Through integrative data analysis and construction of a coding-lncRNA gene co-expression network, we obtained the most comprehensive dataset of zebrafish lncRNAs to date, together with systematic annotations and comprehensive analyses of their function and conservation. Our study provides a reliable zebrafish-based platform for exploring lncRNA function and mechanism in depth, as well as the lncRNA features shared between zebrafish and human.
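The co-expression-based annotation mentioned in the Results can be illustrated with a minimal guilt-by-association sketch. This is not the authors' pipeline: the correlation measure (Pearson), the threshold `r_min`, the use of positive correlation only, and all gene names and expression values below are assumptions made for illustration. The idea is simply that a lncRNA inherits candidate terms from the annotated coding genes whose expression profiles track its own.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sx * sy)


def predict_terms(lnc_expr, coding_expr, coding_terms, r_min=0.9):
    """Count annotation terms among coding genes co-expressed with a lncRNA.

    r_min is a hypothetical threshold, not a value taken from the paper.
    """
    votes = Counter()
    for gene, expr in coding_expr.items():
        if pearson(lnc_expr, expr) >= r_min:
            votes.update(coding_terms.get(gene, ()))
    return votes


# Toy expression profiles across four samples (illustrative values only).
coding_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [5.0, 1.0, 4.0, 2.0],
}
coding_terms = {
    "geneA": {"nervous system development"},
    "geneB": {"nervous system development", "axonogenesis"},
    "geneC": {"lipid metabolic process"},
}
lnc_expr = [0.5, 1.1, 1.4, 2.0]

votes = predict_terms(lnc_expr, coding_expr, coding_terms)
print(votes.most_common())
```

In practice the term counts would be followed by an enrichment test against the background of all annotated genes; that step is omitted here to keep the sketch short.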

Highlights

  • Zebrafish is a well-established model system for studying developmental processes and human disease

  • According to their genomic location, the zebrafish long non-coding RNA (lncRNA) genes are classified as intergenic, sense-overlapping, antisense and intronic, accounting for 43.9%, 27.0%, 26.8% and 2.3% of the total, respectively (Fig. 2a)

  • We evaluated the number of upstream transcription factor (TF) families shared by each conserved zebrafish lncRNA and its human counterpart (Fig. 5d)
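The genomic-location categories in the highlights can be illustrated with a small classifier. The exact rules used in the study are not given in this excerpt, so the definitions below are assumptions: a lncRNA is "sense overlapping" if it overlaps a coding exon on the same strand, "antisense" if it overlaps a coding gene on the opposite strand, "intronic" if it lies within a same-strand gene without touching its exons, and "intergenic" otherwise. The `Gene` type, the coordinates (1-based, inclusive) and the toy records are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int      # 1-based, inclusive
    end: int        # 1-based, inclusive
    strand: str     # "+" or "-"
    exons: tuple    # tuple of (start, end) pairs


def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals share at least one base."""
    return a_start <= b_end and b_start <= a_end


def classify(lnc, coding_genes):
    """Assign one location category to a lncRNA (assumed rules, see lead-in)."""
    hits = [g for g in coding_genes
            if g.chrom == lnc.chrom
            and overlaps(lnc.start, lnc.end, g.start, g.end)]
    if not hits:
        return "intergenic"
    for g in hits:
        if g.strand == lnc.strand and any(
            overlaps(ls, le, gs, ge)
            for ls, le in lnc.exons
            for gs, ge in g.exons
        ):
            return "sense overlapping"
    if any(g.strand != lnc.strand for g in hits):
        return "antisense"
    return "intronic"


# Toy annotation: one coding gene with two exons and one intron.
coding = [Gene("cg1", "chr1", 100, 500, "+", ((100, 200), (400, 500)))]
lncs = [
    Gene("lnc1", "chr1", 600, 700, "+", ((600, 700),)),
    Gene("lnc2", "chr1", 150, 250, "+", ((150, 250),)),
    Gene("lnc3", "chr1", 150, 250, "-", ((150, 250),)),
    Gene("lnc4", "chr1", 250, 350, "+", ((250, 350),)),
]

labels = {lnc.name: classify(lnc, coding) for lnc in lncs}
counts = Counter(labels.values())
shares = {cat: 100.0 * n / len(lncs) for cat, n in counts.items()}
print(labels)
print(shares)
```

The order in which the rules are checked decides how ambiguous cases are labeled (for example, a lncRNA inside an intron on the opposite strand is called antisense here), so percentages such as those in Fig. 2a depend on the convention chosen.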

Summary

Introduction

Zebrafish is a well-established model system for studying developmental processes and human disease. Recent deep-sequencing studies have discovered a large number of long non-coding RNAs (lncRNAs) in zebrafish. LncRNAs are defined as a heterogeneous class of transcripts longer than 200 bp that lack protein-coding potential [1,2,3]. Tens of thousands of lncRNAs have been discovered across human, mouse, nematode, zebrafish and other species. The best-characterized cases of using zebrafish as a model for studying protein roles in human disease come from hematopoietic diseases, such as ALAS2 in microcytic, hypochromic anemia and UROD in porphyria. Similar studies using zebrafish as a model to probe lncRNA function are still in their infancy, because of insufficient annotation and the lack of a systematic survey.

