Abstract
BackgroundRNA-Seq technology has been used widely in transcriptome study, and one of the most important applications is to estimate the expression level of genes and their alternative splicing isoforms. There have been several algorithms published to estimate the expression based on different models. Recently Wu et al. published a method that can accurately estimate isoform level expression by considering position-related sequencing biases using nonparametric models. The method has advantages in handling different read distributions, but there hasn’t been an efficient program to implement this algorithm.ResultsWe developed an efficient implementation of the algorithm in the program NURD. It uses a binary interval search algorithm. The program can correct both the global tendency of sequencing bias in the data and local sequencing bias specific to each gene. The correction makes the isoform expression estimation more reliable under various read distributions. And the implementation is computationally efficient in both the memory cost and running time and can be readily scaled up for huge datasets.ConclusionNURD is an efficient and reliable tool for estimating the isoform expression level. Given the reads mapping result and gene annotation file, NURD will output the expression estimation result. The package is freely available for academic use at http://bioinfo.au.tsinghua.edu.cn/software/NURD/.
Highlights
RNA-Seq technology has been used widely in transcriptome study, and one of the most important applications is to estimate the expression level of genes and their alternative splicing isoforms
As it is known that most human genes can have alternative splicing, estimating isoform expression is becoming a key question in RNA-Seq study
Based on the information we get in step 2 and step 3, NURD calculates the global bias curve (GBC) of the whole data, local bias curve (LBC) for each gene and the bias-corrected gene structure matrix, and get the corrected log-likelihood function
Summary
We developed an efficient implementation of the algorithm in the program NURD. It uses a binary interval search algorithm. The program can correct both the global tendency of sequencing bias in the data and local sequencing bias specific to each gene. The correction makes the isoform expression estimation more reliable under various read distributions. The implementation is computationally efficient in both the memory cost and running time and can be readily scaled up for huge datasets
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.