PseUI: Pseudouridine sites identification based on RNA sequence information

Jingjing He, Yi Xiong, Xiaolei Zhu, Bei Huang, Ting Fang, Zizheng Zhang

doi:10.1186/s12859-018-2321-0

Open Access

https://doi.org/10.1186/s12859-018-2321-0

Journal: BMC Bioinformatics
Publication Date: Aug 29, 2018
Citations: 97
License type: open-access

Affiliation: Anhui University, Shanghai Jiao Tong University

Abstract

Background: Pseudouridylation is the most prevalent type of posttranscriptional modification in various stable RNAs of all organisms, and it significantly affects many cellular processes that are regulated by RNA. Accurate identification of pseudouridine (Ψ) sites in RNA will therefore be of great benefit for understanding these cellular processes. Because currently available experimental methods are inefficient and costly, computational methods that detect Ψ sites in RNA sequences accurately and efficiently are highly desirable. However, the predictive accuracy of existing computational methods is not satisfactory and still needs improvement.

Results: In this study, we developed a new model, PseUI, for Ψ site identification in three species: H. sapiens, S. cerevisiae, and M. musculus. First, five kinds of features were generated from RNA segments: nucleotide composition (NC), dinucleotide composition (DC), pseudo dinucleotide composition (pseDNC), position-specific nucleotide propensity (PSNP), and position-specific dinucleotide propensity (PSDP). Then, a sequential forward feature selection strategy was used to obtain a feature subset that is compact yet discriminative. Based on the selected feature subsets, we built our model using a support vector machine (SVM). Finally, the generalization of our model was validated by both the jackknife test and independent validation tests on the benchmark datasets. The experimental results showed that our model is more accurate and stable than the previously published models. We have also provided a user-friendly web server for our model at http://zhulab.ahu.edu.cn/PseUI, and brief instructions for the web server are provided in this paper. Using these instructions, academic users can conveniently obtain their desired results without complicated calculations.

Conclusion: In this study, we proposed a new predictor, PseUI, to detect Ψ sites in RNA sequences. Our model outperformed the existing state-of-the-art models. It is expected that PseUI will become a useful tool for accurate identification of RNA Ψ sites.
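To make the feature types named in the abstract more concrete, here is a minimal Python sketch of NC, DC, and a PSNP-style encoding. It is illustrative only and is not the authors' implementation. The function names are hypothetical, and the PSNP formulation (per-position nucleotide frequency in the positive set minus that in the negative set) is one common definition assumed here, since the text above does not spell it out.

```python
from collections import Counter
from itertools import product

NUCLEOTIDES = "ACGU"  # assumed RNA alphabet


def nucleotide_composition(seq):
    """Frequencies of A, C, G, U in seq (4 values)."""
    counts = Counter(seq)
    return [counts[n] / len(seq) for n in NUCLEOTIDES]


def dinucleotide_composition(seq):
    """Frequencies of the 16 overlapping dinucleotides in seq."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(pairs)
    return [counts[a + b] / len(pairs)
            for a, b in product(NUCLEOTIDES, repeat=2)]


def position_frequencies(segments):
    """Per-position nucleotide frequencies for equal-length segments."""
    table = []
    for pos in range(len(segments[0])):
        counts = Counter(s[pos] for s in segments)
        table.append({n: counts[n] / len(segments) for n in NUCLEOTIDES})
    return table


def psnp_encode(seq, positives, negatives):
    """Assumed PSNP: for each position, the frequency of the observed
    nucleotide among positive segments minus that among negatives."""
    pos_table = position_frequencies(positives)
    neg_table = position_frequencies(negatives)
    return [pos_table[i][n] - neg_table[i][n] for i, n in enumerate(seq)]
```

Under this assumed definition, the frequency tables depend on the training labels, so in a jackknife test they would need to be recomputed without the held-out segment to avoid leaking its label into its own features.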

Highlights

Pseudouridylation is the most prevalent type of posttranscriptional modification in various stable RNAs of all organisms, which significantly affects many cellular processes that are regulated by RNA
We evaluated the performance of each single type of feature using a support vector machine (SVM) under the rigorous jackknife test, and position-specific nucleotide propensity (PSNP) was found to be excellent for identifying Ψ sites
If the receiver operating characteristic (ROC) curve of one model is completely enveloped by the curve of the other model, it can be asserted that the latter model is superior to the former in performance
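The ROC comparison described in the last highlight can be sketched in a few lines of Python. This is an illustrative sketch rather than code from the paper: the function names are hypothetical, labels are assumed to be 0/1 with both classes present, and higher scores are assumed to indicate the positive class. Both curves are treated as piecewise linear, so it is enough to compare their lower and upper TPR values at every FPR breakpoint of either curve.

```python
def roc_points(labels, scores):
    """ROC points (FPR, TPR) from sweeping a threshold over the scores."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Samples with tied scores cross the threshold together.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def tpr_bounds(points, fpr):
    """Lowest and highest TPR the curve takes at the given FPR."""
    values = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= fpr <= x1:
            if x1 == x0:  # vertical segment: both ends lie at this FPR
                values += [y0, y1]
            else:
                values.append(y0 + (y1 - y0) * (fpr - x0) / (x1 - x0))
    return min(values), max(values)


def envelops(outer, inner, tol=1e-12):
    """True if `outer` lies on or above `inner` at every breakpoint."""
    grid = sorted({x for x, _ in outer} | {x for x, _ in inner})
    for x in grid:
        out_lo, out_hi = tpr_bounds(outer, x)
        in_lo, in_hi = tpr_bounds(inner, x)
        if out_lo + tol < in_lo or out_hi + tol < in_hi:
            return False
    return True
```

When neither curve envelops the other, this check returns False in both directions, and a scalar summary such as the area under the curve is typically used instead.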

Summary

Introduction

Pseudouridylation is the most prevalent type of posttranscriptional modification in various stable RNAs of all organisms, and it significantly affects many cellular processes that are regulated by RNA. Accurate identification of pseudouridine (Ψ) sites in RNA will be of great benefit for understanding these cellular processes. Because currently available experimental methods are inefficient and costly, computational methods that detect Ψ sites in RNA sequences accurately and efficiently are highly desirable. Given the exponential growth in the number of RNA sequences in the post-genomic era, an accurate, efficient, and low-cost method for identifying Ψ sites on RNA segments is urgently needed. Previous studies suggest that computational or statistical learning methods are promising candidates because of their low cost and reasonable efficiency [14, 15].

Methods

Results

Conclusion

Similar Papers

TargetM6A: Identifying N6-Methyladenosine Sites From RNA Sequences via Position-Specific Nucleotide Propensities and a Support Vector Machine
Guang-Qing Li ... Hong-Bin Shen
IEEE Transactions on NanoBioscience | VOL. 15
10 Aug 2016

Predicting protein-binding regions in RNA using nucleotide profiles and compositions
Daesik Choi ... Wook Lee
BMC Systems Biology | VOL. 11
01 Mar 2017

PRNAm-PC: Predicting N6-methyladenosine sites in RNA sequences via physical–chemical properties
Zi Liu ... Kuo-Chen Chou
Analytical Biochemistry | VOL. 497
31 Dec 2015

iRNA-AI: identifying the adenosine to inosine editing sites in RNA sequences
Wei Chen ... Pengmian Feng
Oncotarget | VOL. 8
01 Dec 2016
