Abstract

Advances in high-throughput sequencing have made it one of the most important techniques for obtaining new transcriptomes in non-model organisms. These studies often need to investigate the transcriptomes of two related organisms at the same time in order to find the similarities and differences between them. The traditional approach is to perform de novo transcriptome assembly for each organism independently to obtain predicted transcripts, and then to compare those transcripts with similarity comparison algorithms. Instead of deriving predicted transcripts for each organism separately from the intermediate de Bruijn structures employed by de novo transcriptome assembly algorithms, we develop an algorithm that compares paths in the two de Bruijn graphs directly: it first enumerates short paths in both graphs, then iteratively extends pairs of paths that have high similarity to obtain longer pairs of corresponding paths between the two graphs. These extended path pairs represent predicted transcripts that are present in both organisms. By applying our algorithm to simultaneously recover transcripts in mouse and rat from publicly available RNA-Seq libraries, we show that it recovers more shared transcripts than traditional approaches. Because our strategy uses sequence similarity information within the paths, which is often more reliable than coverage information, the recovered shared transcripts are also longer. This allows detailed investigation of the similarities and differences in alternative splicing between the two organisms at both the sequence and structure levels. Our approach generalizes the pairwise sequence alignment problem to non-linear input structures and provides a heuristic to reliably recover similar paths from the two structures.
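The enumerate-then-extend idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`build_de_bruijn`, `enumerate_paths`, `extend_pair`, `shared_paths`), the parameters `k`, `seed_edges`, and `min_identity`, the gap-free identity score used as a stand-in for a real alignment-based similarity, and the rightward-only greedy extension are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer node to the set of (k-1)-mers that follow it in a read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def enumerate_paths(graph, n_edges):
    """Return every node path in the graph that has exactly n_edges edges."""
    nodes = set(graph) | {s for succ in graph.values() for s in succ}
    paths = [(n,) for n in sorted(nodes)]
    for _ in range(n_edges):
        paths = [p + (s,) for p in paths for s in sorted(graph.get(p[-1], ()))]
    return paths


def spell(path):
    """Spell the sequence of a node path: first node plus last base of each later node."""
    return path[0] + "".join(node[-1] for node in path[1:])


def identity(a, b):
    """Fraction of matching positions between two equal-length strings.

    Assumption: a gap-free toy similarity, not a real alignment score.
    """
    return sum(x == y for x, y in zip(a, b)) / len(a)


def extend_pair(g1, g2, p1, p2, min_identity):
    """Greedily extend both paths by one edge at a time (to the right only)
    while the spelled sequences stay at or above min_identity."""
    while True:
        best = None
        for s1 in sorted(g1.get(p1[-1], ())):
            for s2 in sorted(g2.get(p2[-1], ())):
                if s1 in p1 or s2 in p2:
                    continue  # do not revisit a node, so extension terminates
                c1, c2 = p1 + (s1,), p2 + (s2,)
                score = identity(spell(c1), spell(c2))
                if score >= min_identity and (best is None or score > best[0]):
                    best = (score, c1, c2)
        if best is None:
            return p1, p2
        _, p1, p2 = best


def shared_paths(reads1, reads2, k=4, seed_edges=2, min_identity=0.75):
    """Seed with short similar path pairs, then extend each pair as far as possible."""
    g1, g2 = build_de_bruijn(reads1, k), build_de_bruijn(reads2, k)
    results = set()
    seeds1, seeds2 = enumerate_paths(g1, seed_edges), enumerate_paths(g2, seed_edges)
    for p1, p2 in product(seeds1, seeds2):
        if identity(spell(p1), spell(p2)) >= min_identity:
            e1, e2 = extend_pair(g1, g2, p1, p2, min_identity)
            results.add((spell(e1), spell(e2)))
    return results


if __name__ == "__main__":
    # Two toy "organisms" whose single reads differ by one base.
    for pair in sorted(shared_paths(["ACGTTGCA"], ["ACGTAGCA"])):
        print(pair)
```

Comparing every seed pair is quadratic in the number of short paths, which is acceptable for a toy example but would presumably need indexing or filtering at realistic transcriptome scale. A practical version would also extend in both directions and tolerate insertions and deletions.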
