Abstract

Background: Alternative splicing (AS) of mRNA is a vital mechanism for enhancing genomic complexity in eukaryotes. Spliced isoforms of the same gene can have diverse molecular and biological functions and are often differentially expressed across various tissues, times, and conditions. Thus, AS has important implications in the study of parasitic nematodes with complex life cycles. Transcriptomic datasets are available from many species, but data must be revisited with splice-aware assembly protocols to facilitate the study of AS in helminthes.

Methods: We sequenced cDNA from the model worm Caenorhabditis elegans using 454/Roche technology for use as an experimental dataset. Reads were assembled with Newbler software, invoking the cDNA option. Several combinations of parameters were tested and assembled transcripts were verified by comparison with previously reported C. elegans genes and transcript isoforms and with Illumina RNAseq data.

Results: Thoughtful adjustment of program parameters increased the percentage of assembled transcripts that matched known C. elegans sequences, decreased mis-assembly rates (i.e., cis- and trans-chimeras), and improved the coverage of the geneset. The optimized protocol was used to update de novo transcriptome assemblies from nine parasitic nematode species, including important pathogens of humans and domestic animals. Our assemblies indicated AS rates in the range of 20-30%, typically with 2-3 transcripts per AS locus, depending on the species. Transcript isoforms from the nine species were translated and searched for similarity to known proteins and functional domains. Some 21 InterPro domains, including several involved in nucleotide and chromatin binding, were statistically correlated with AS genetic loci. In most cases, the Roche/454 data explored in this study are the only sequences available from the species in question; however, the recently published genome of the human hookworm Necator americanus provided an additional opportunity to validate our results.

Conclusions: Our optimized assembly parameters facilitated the first survey of AS among parasitic nematodes. The nine transcriptome assemblies, their protein translations, and basic annotations are available from Nematode.net as a resource for the research community. These should be useful for studies of specific genes and gene families of interest as well as for curating draft genome assemblies as they become available.

Highlights

  • Alternative splicing (AS) of mRNA is a vital mechanism for enhancing genomic complexity in eukaryotes

  • Optimization of assembly parameters: cDNA libraries were generated from mixed-stage C. elegans worms and sequenced using Roche/454 technology

  • De novo transcriptome assembly is a complicated procedure that is confounded by varied gene expression patterns, such as AS of mRNA

Summary

Methods

454/Roche library construction, sequencing and data cleaning

One splice-leader (SL1) and four oligo(dT) cDNA libraries were constructed from DNase-treated C. elegans (Bristol N2) RNA according to previously described methods [26]. Cleaned C. elegans Roche/454 reads were mapped to C. elegans coding sequences (WormBase [41] release WS236) with Bowtie (version 2.1.0, default parameters [39]) in order to assess the scope of the dataset prior to assembly.

Various combinations of assembly parameters were tested (see Table 1), and the isotigs from each assembly were compared to the C. elegans coding sequences (CDSs) and coding transcripts (CDS + UTRs) included in WormBase [41] release WS236 by BLAST+ (version 2.2.27), with a cutoff of ≥90% sequence identity over ≥75% of the isotig's length in a single high-scoring segment pair.

To further validate our isoform predictions, Illumina RNAseq libraries were generated from C. elegans RNA as previously described [42] (SRA numbers: SRR868958, SRR868932, SRR868957, SRR868939, SRR868942), and the resulting raw reads were mapped to assembled isotigs using Bowtie (version 2.1.0, default parameters [39]).

P values calculated for each domain were population corrected using False Discovery Rate (FDR) correction [47], and a significance threshold of 0.01 on the corrected P values was used to determine which InterPro domains were significantly more often associated with AS isogroups than non-AS isogroups.
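The two numeric criteria described above (the BLAST match cutoff and the FDR threshold) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The tab-separated column layout (qseqid, sseqid, pident, qstart, qend, qlen), the function names, and the use of the Benjamini-Hochberg procedure for the FDR step are all assumptions made for illustration; the text does not say which FDR method reference [47] describes or how the per-domain P values were computed.

```python
from typing import Iterable, List, Set


def hsp_passes(pident: float, qstart: int, qend: int, qlen: int,
               min_ident: float = 90.0, min_cov: float = 0.75) -> bool:
    """True if a single HSP has >= min_ident percent identity and spans
    >= min_cov of the query (isotig) length."""
    covered = abs(qend - qstart) + 1  # query bases spanned by this HSP
    return pident >= min_ident and covered / qlen >= min_cov


def matched_isotigs(rows: Iterable[str]) -> Set[str]:
    """Collect isotig IDs that have at least one qualifying HSP.

    Assumed (hypothetical) column layout, one HSP per tab-separated row:
    qseqid, sseqid, pident, qstart, qend, qlen
    """
    hits: Set[str] = set()
    for row in rows:
        if not row.strip() or row.startswith("#"):
            continue  # skip blank and comment lines
        qseqid, _sseqid, pident, qstart, qend, qlen = row.rstrip("\n").split("\t")[:6]
        if hsp_passes(float(pident), int(qstart), int(qend), int(qlen)):
            hits.add(qseqid)
    return hits


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in input order.
    (Assumed here as the FDR procedure; the source only says 'FDR'.)"""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def significant_domains(domains: List[str], pvalues: List[float],
                        alpha: float = 0.01) -> List[str]:
    """Domains whose corrected P value falls below the threshold."""
    corrected = benjamini_hochberg(pvalues)
    return [d for d, q in zip(domains, corrected) if q < alpha]
```

In use, `matched_isotigs` would be fed the lines of a tabular BLAST report, and `significant_domains` a list of InterPro identifiers with their uncorrected P values. Checking each row separately keeps the "single high-scoring segment pair" requirement explicit, since coverage is never summed across several HSPs of the same isotig.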

