Abstract
Background: The knowledge of miRNAs regulating the expression of sets of mRNAs has led to novel insights into numerous and diverse cellular mechanisms. While a single miRNA may regulate many genes, one gene can be regulated by multiple miRNAs, presenting a complex relationship to model for accurate predictions.
Results: Here, we introduce miREM, a program that couples an expectation-maximization (EM) algorithm to the common approach of hypergeometric probability (HP), which improves the prediction and prioritization of miRNAs from gene-sets of interest. miREM has been made available through a web-server (https://bioinfo-csi.nus.edu.sg/mirem2/) that can be accessed through an intuitive graphical user interface. The program incorporates a large compendium of human/mouse miRNA-target prediction databases to enhance prediction. Users may upload their genes of interest in various formats as input and select, among other filtering options, whether to consider non-conserved miRNAs. Results are reported in a rich graphical interface that allows users to: (i) prioritize predicted miRNAs through a scatterplot of HP p-values and EM scores; (ii) visualize the predicted miRNAs and corresponding genes through a heatmap; and (iii) identify and filter homologous or duplicated predictions by clustering them according to their seed sequences.
Conclusion: We tested miREM using RNAseq datasets from two single “spiked” knock-in miRNA experiments and two double knock-out miRNA experiments. miREM predicted these manipulated miRNAs as having high EM scores from the gene set signatures (i.e. top predictions for single knock-in and double knock-out miRNA experiments). Finally, we have demonstrated that miREM predictions are similar to or better than results provided by existing programs.
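The hypergeometric probability (HP) step mentioned above can be illustrated with a small sketch. This is the generic over-representation test, not code from miREM: the function name, its parameters, and the toy numbers are assumptions made for illustration, and the abstract does not say which gene universe or multiple-testing correction miREM uses.

```python
from math import comb


def hypergeom_enrichment_pvalue(universe_size, target_count, gene_set_size, overlap):
    """Upper-tail hypergeometric p-value P(X >= overlap).

    universe_size : number of genes in the background (assumed input)
    target_count  : predicted targets of one miRNA within the background
    gene_set_size : size of the user's gene set
    overlap       : genes in the gene set that are targets of the miRNA
    """
    max_overlap = min(target_count, gene_set_size)
    total = comb(universe_size, gene_set_size)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(target_count, k) * comb(universe_size - target_count, gene_set_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 20,000 background genes, 300 targets of one miRNA,
# a gene set of 150 genes of which 12 are targets.
p_value = hypergeom_enrichment_pvalue(20_000, 300, 150, 12)
print(f"HP p-value: {p_value:.3e}")
```

A small p-value indicates that the miRNA's targets are over-represented in the gene set. On its own this test treats every miRNA independently, which is the limitation the EM step is meant to address.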
Highlights
The knowledge that microRNAs (miRNAs) regulate the expression of sets of messenger RNAs (mRNAs) has led to novel insights into numerous and diverse cellular mechanisms
In contrast to current methods based on hypergeometric probability (HP) only, we introduce a novel strategy that complements HP and (i) weighs down the contribution from overlapping target genes when calculating the significance of each miRNA signature, using an expectation-maximization (EM) algorithm, a general probabilistic framework that can be used for this purpose [12]; and (ii) clusters all predicted miRNAs according to their seed region sequences to identify “synonymous” predictions
We used a gene-set of repressed genes as input (Additional file 3: Table S2) and ran miREM, CORNA, GeneSet2MiRNA and ChemiRs (Table 2 and Additional file 4: Table S3; for Sylamer, the whole gene list ranked by fold change was used as input). miREM correctly predicted the miRNAs involved, with hsa-miR-155-5p and hsa-miR-1-3p ranked at the first and third positions respectively
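The idea of weighing down overlapping target genes with EM can be sketched as follows. This is a generic mixture-style EM and not miREM's published model: the function `em_mirna_weights`, the toy target sets, and the assumption that each gene in the set is attributed to exactly one of the miRNAs predicted to target it are all hypothetical choices made for illustration.

```python
def em_mirna_weights(targets, gene_set, n_iter=200, tol=1e-9):
    """Estimate per-miRNA weights by fractionally assigning shared genes.

    targets  : dict mapping miRNA name -> set of predicted target genes
    gene_set : iterable of genes of interest
    Returns a dict of weights that sum to 1 over miRNAs hitting the gene set.
    """
    # Keep only genes targeted by at least one miRNA, and miRNAs hitting such genes.
    genes = [g for g in gene_set if any(g in t for t in targets.values())]
    mirnas = sorted(m for m, t in targets.items() if any(g in t for g in genes))
    if not mirnas:
        return {}
    weights = {m: 1.0 / len(mirnas) for m in mirnas}
    for _ in range(n_iter):
        counts = dict.fromkeys(mirnas, 0.0)
        # E-step: split each gene among its candidate miRNAs in proportion
        # to the current weights, so shared genes count only fractionally.
        for g in genes:
            candidates = [m for m in mirnas if g in targets[m]]
            denom = sum(weights[m] for m in candidates)
            for m in candidates:
                counts[m] += weights[m] / denom
        # M-step: new weight is the share of genes attributed to each miRNA.
        new_weights = {m: counts[m] / len(genes) for m in mirnas}
        delta = max(abs(new_weights[m] - weights[m]) for m in mirnas)
        weights = new_weights
        if delta < tol:
            break
    return weights


# Toy example (hypothetical names): miR-B's only hit is shared with miR-A.
toy_targets = {"miR-A": {"g1", "g2", "g3"}, "miR-B": {"g3"}}
toy_weights = em_mirna_weights(toy_targets, {"g1", "g2", "g3"})
print(toy_weights)
```

In the toy example the shared gene g3 is progressively attributed to miR-A, which also explains g1 and g2, so miR-B's weight shrinks toward zero. A score of this kind could be read alongside the HP p-value, in the spirit of the HP versus EM scatterplot described in the abstract.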
Summary
We have developed miREM, an HP-EM-based program designed to predict miRNA activities from a gene list. miREM’s web server incorporates a large compendium of human/mouse miRNA-target prediction databases and provides rich output results facilitating prioritization and interpretation of predicted results. To test miREM performance, we benchmarked miREM predictions against CORNA [7], GeneSet2MiRNA [8], ChemiRs [9], and Sylamer [10] results using several datasets with known miRNA activities. These are detailed in three case studies as follows: Case study 1: knock-in miRNA experiments. We used two RNAseq expression datasets from miR-155 and miR-1 knock-in experiments in U2OS cells, respectively [25]. In these experiments, we used a gene-set of repressed genes as input (Additional file 3: Table S2) and ran miREM, CORNA, GeneSet2MiRNA and ChemiRs (Table 2 and Additional file 4: Table S3; for Sylamer, the whole gene list ranked by fold change was used as input). We tested miREM’s performance using different HP p-value thresholds and EM convergence parameters, given the downregulated gene list from the hsa-miR-155 knock-in experiment. hsa-miR-155-5p remained the first-ranked candidate in various prediction settings (Additional file 6: Table S5).