Abstract

Background

Next-generation DNA sequencing technologies have made it possible to generate transcriptome data for novel organisms quickly and cheaply, to the extent that the effort required to annotate and publish a new transcriptome is greater than the effort required to sequence it. Often, following publication, details of the annotation effort are only available in summary form, hindering subsequent exploitation of the data. To promote best-practice in annotation and to ensure that data remain accessible, we have written afterParty, a web application that allows users to assemble, annotate and publish novel transcriptomes using only a web browser.

Results

afterParty is a robust web application that implements best-practice transcriptome assembly, annotation, browsing, searching, and visualization. Users can turn a collection of reads (from Roche 454 chemistry) or assembled contigs (from any sequencing chemistry, including Illumina Solexa RNA-Seq) into a searchable, browsable transcriptome resource and quickly make it publicly available. Contigs are functionally annotated based on similarity to known sequences and protein domains. Once assembled and annotated, transcriptomes derived from multiple species or libraries can be compared and searched. afterParty datasets can either be created using the existing afterParty server, or using local instances that can be built easily using a virtual machine. afterParty includes powerful visualization tools for transcriptome dataset exploration and uses a flexible annotation architecture which will allow additional types of annotation to be added in the future.

Conclusions

afterParty's main use case scenario is one in which a working biologist has generated a large volume of transcribed sequence data and wishes to turn it into a useful resource that has some durability. By reducing the effort, bioinformatics skills, and computational resources needed to annotate and publish a transcriptome, afterParty will facilitate the annotation and sharing of sequence data that would otherwise remain unavailable. A typical metazoan transcriptome containing several tens of thousands of contigs can be annotated in a few minutes of interactive time and a few days of computational time.

Highlights

  • Next-generation DNA sequencing technologies have made it possible to generate transcriptome data for novel organisms quickly and cheaply, to the extent that the effort required to annotate and publish a new transcriptome is greater than the effort required to sequence it

  • To address the need for an integrated, dependency-free, intuitive tool for transcriptome annotation and publication, we have developed afterParty, a web application that runs entirely within a browser and functions both as an annotation tool and as a transcriptome browsing and visualization tool. afterParty takes as its input either raw reads or assembled contigs, and uses existing best-practice tools and databases to annotate them, resulting in collections of annotated putative transcripts (“datasets”) along with metadata describing how the sequences were produced. afterParty acts as a web interface to datasets, allowing non-bioinformatician users to browse contigs, search annotation, and define and visualize sets of contigs

  • L. sigmodontis is the subject of an ongoing transcriptome project [15], and its transcriptome data are typical of the type for which we expect afterParty to be useful. 764,024 reads from five libraries were assembled and annotated using an installation of afterParty on an 8-core server


Summary

Introduction

Next-generation DNA sequencing technologies have made it possible to generate transcriptome data for novel organisms quickly and cheaply, to the extent that the effort required to annotate and publish a new transcriptome is greater than the effort required to sequence it. Publications and raw sequence data resulting from transcriptome sequencing projects are generally made available and archived, but intermediate, detailed annotations are typically not. afterParty takes as its input either raw reads or assembled contigs, and uses existing best-practice tools and databases to annotate them, resulting in collections of annotated putative transcripts (“datasets”) along with metadata describing how the sequences were produced.
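As a rough illustration of the "dataset" idea described above (annotated putative transcripts plus metadata, searchable by annotation), the following Python sketch models one possible in-memory representation. It is not afterParty's actual schema or API: the class names `Annotation`, `Contig` and `Dataset`, their fields, the `search` method, and the example sequences and descriptions are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Annotation:
    # Hypothetical annotation record; "kind" labels are illustrative only.
    kind: str          # e.g. "similarity" or "domain"
    description: str   # free-text description of the hit


@dataclass
class Contig:
    # A putative transcript with zero or more annotations attached.
    name: str
    sequence: str
    annotations: list[Annotation] = field(default_factory=list)


@dataclass
class Dataset:
    # A collection of annotated contigs plus metadata on how they were produced.
    name: str
    metadata: dict[str, str]
    contigs: list[Contig] = field(default_factory=list)

    def search(self, term: str) -> list[Contig]:
        """Return contigs with an annotation description containing `term`
        (case-insensitive substring match)."""
        needle = term.lower()
        return [
            c for c in self.contigs
            if any(needle in a.description.lower() for a in c.annotations)
        ]


# Made-up example data, not taken from any real transcriptome.
dataset = Dataset(
    name="example",
    metadata={"input": "assembled contigs"},
    contigs=[
        Contig("contig_1", "ATGGCC", [Annotation("similarity", "Heat shock protein 70")]),
        Contig("contig_2", "ATGTTT", [Annotation("domain", "Protein kinase domain")]),
        Contig("contig_3", "ATGAAA"),  # unannotated contig
    ],
)

hits = dataset.search("kinase")
print([c.name for c in hits])
```

Keeping annotations as a list of typed records, rather than fixed fields on the contig, is one simple way to leave room for additional annotation types later; whether afterParty does this internally is not stated here.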

