Abstract
Non-coding RNAs (ncRNAs) with novel functions have recently been discovered, and it is now appreciated that transcription is pervasive. Accurate and efficient de novo computational detection of ncRNAs is therefore desirable. The purpose of this study is to develop an ncRNA detection method based on structural conservation. A new method called Multifind, based on Multilign (Xu & Mathews, 2011), was developed. It predicts structures common to multiple sequences and estimates the probability that the input sequences are ncRNAs using a classification support vector machine (SVM). Multilign uses Dynalign (Mathews & Turner, 2002), which folds and aligns two sequences simultaneously without requiring any sequence identity; its structure prediction quality is therefore not affected by the diversity of the input sequences. Benchmarks showed that Multifind performs better than RNAz on test sequences extracted from the Rfam database (Gardner et al., 2011), especially on more diverse sequences. For de novo ncRNA discovery in genomes, Multifind had an advantage in low-similarity regions of genome alignments. Multifind takes about 48 hours to scan the whole yeast genome alignment, compared with about 4 hours for RNAz; its computational requirements therefore do not present a barrier for most users.