Stable solution to l2,1-based robust inductive matrix completion and its application in linking long noncoding RNAs to human diseases

Ashis Kumer Biswas, Chris Ding, Jean X Gao, Dongchul Kim, Mingon Kang

doi:10.1186/s12920-017-0310-1

Abstract

Background: A large number of long intergenic non-coding RNAs (lincRNAs) are linked to a broad spectrum of human diseases, but the disease associations of many other lincRNAs remain a puzzle. Validating such links through biological experiments is expensive. However, a plethora of lincRNA data is now available thanks to High Throughput Sequencing (HTS) platforms, Genome Wide Association Studies (GWAS), etc., which opens the opportunity for machine learning and data mining approaches to extract meaningful relationships between lincRNAs and diseases. Yet only a few in silico lincRNA-disease association inference tools are available to date, and none of them uses side information about both entities simultaneously in a single framework.

Methods: The recently developed Inductive Matrix Completion (IMC) technique provides a recommendation platform between two entities that takes into account side information about each of them. However, the IMC formulation cannot handle noise and outliers that may be present in the datasets, and data sparsity is a further issue for the standard IMC method. A robust version of IMC that addresses both issues is therefore needed. As a remedy, in this paper we propose Stable Robust Inductive Matrix Completion (SRIMC), which uses l2,1-norm-based regularization to optimize the objective function with a unique 2-step stable solution approach.

Results: We applied SRIMC to the available association data between human lincRNAs and OMIM disease phenotypes, together with a diverse set of side information about the lincRNAs and the diseases. The method performs better than the state-of-the-art methods in terms of precision@k and recall@k when prioritizing the top-k diseases for the subject lincRNAs. We also demonstrate that SRIMC is equally effective for queries about novel lincRNAs, as well as for predicting the rank of a newly known disease for a set of well-characterized lincRNAs.

Conclusions: With the experimental results and computational evaluation, we show that SRIMC is robust in handling datasets with noise and outliers, as well as in dealing with novel lincRNAs and disease phenotypes.
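The abstract relies on two quantities that are easy to make concrete: the l2,1 norm used for regularization and the precision@k / recall@k metrics used for evaluation. The following is a minimal sketch, not the authors' code. It assumes the common row-wise convention for the l2,1 norm (the sum of the Euclidean norms of the rows) and assumes precision@k divides by k; the function names and the toy disease identifiers are hypothetical.

```python
import math
from typing import List, Sequence, Set


def l21_norm(matrix: List[List[float]]) -> float:
    """l2,1 norm under the row-wise convention (an assumption here):
    the sum over rows of each row's Euclidean (l2) norm."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked items that are known associations."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all known associations that appear in the top k."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


# Hypothetical toy data: a predicted disease ranking for one lincRNA
# and the set of diseases it is actually known to be associated with.
ranked_diseases = ["D3", "D1", "D7", "D2", "D5"]
known_diseases = {"D1", "D2", "D9"}

print(precision_at_k(ranked_diseases, known_diseases, 3))
print(recall_at_k(ranked_diseases, known_diseases, 5))
print(l21_norm([[3.0, 4.0], [0.0, 0.0], [5.0, 12.0]]))
```

Because the row-wise l2,1 norm sums unsquared row norms, a single outlying row contributes linearly rather than quadratically, which is the usual intuition for why such a penalty is considered robust to outliers.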

Highlights

LincRNA-disease association inference problem: surprisingly, only 2% of the entire human genome codes for proteins [1]
We demonstrate that Stable Robust Inductive Matrix Completion (SRIMC) is effective for queries about novel lincRNAs, as well as for predicting the rank of a newly known disease for a set of well-characterized lincRNAs
With the experimental results and computational evaluation, we show that SRIMC is robust in handling datasets with noise and outliers, as well as in dealing with novel lincRNAs and disease phenotypes

Summary

Introduction

LincRNA-disease association inference problem: Surprisingly, only 2% of the entire human genome codes for proteins [1]. It has become evident that the non-protein-coding portion of the genome is of critical functional importance, especially the long intergenic non-coding RNAs (lincRNAs), each of which is more than 200 bases long and does not overlap any annotated protein-coding region. These lincRNAs exhibit diverse molecular mechanisms and are implicated in various human diseases [2]. Fully annotating the functions of lincRNAs and their involvement in human diseases remains a challenge for researchers. A machine learning algorithm that ranks candidate diseases for a given lincRNA based on prior knowledge would help the community tackle this challenge.
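To illustrate what ranking diseases for a given lincRNA from side information can look like, here is a minimal sketch of the standard inductive matrix completion scoring rule, score = x^T W H^T y, where x and y are feature vectors for a lincRNA and a disease. It is an illustration under stated assumptions, not the SRIMC method: the factor matrices `W` and `H` are taken as already learned, and all names, dimensions, and values are hypothetical.

```python
from typing import Dict, List

Matrix = List[List[float]]


def imc_score(x: List[float], W: Matrix, H: Matrix, y: List[float]) -> float:
    """Bilinear IMC score x^T W H^T y.

    W has shape (len(x), rank) and H has shape (len(y), rank).
    """
    rank = len(W[0])
    # Project each feature vector into the shared latent space.
    u = [sum(x[i] * W[i][c] for i in range(len(x))) for c in range(rank)]
    v = [sum(y[j] * H[j][c] for j in range(len(y))) for c in range(rank)]
    # The score is the inner product of the two latent vectors.
    return sum(a * b for a, b in zip(u, v))


def rank_diseases(
    x: List[float], W: Matrix, H: Matrix, disease_features: Dict[str, List[float]]
) -> List[str]:
    """Return disease identifiers sorted by descending IMC score."""
    scores = {name: imc_score(x, W, H, y) for name, y in disease_features.items()}
    return sorted(scores, key=lambda name: scores[name], reverse=True)


# Hypothetical toy factors: 2 lincRNA features, 3 disease features, rank 2.
W = [[1.0, 0.0], [0.0, 1.0]]
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x = [2.0, 1.0]  # side-information vector for one (possibly novel) lincRNA
disease_features = {
    "D1": [1.0, 0.0, 0.0],
    "D2": [0.0, 1.0, 0.0],
    "D3": [0.0, 0.0, 1.0],
}

print(rank_diseases(x, W, H, disease_features))
```

Since the score depends only on feature vectors, the same function can score a lincRNA or disease with no known associations, as long as its side information is available.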



Journal: BMC Medical Genomics	Publication Date: Dec 1, 2017
License type: open-access

Similar Papers

Robust Inductive Matrix Completion Strategy to Explore Associations Between LincRNAs and Human Disease Phenotypes.
Ashis Kumer Biswas ... Dong-Chul Kim
IEEE/ACM transactions on computational biology and bioinformatics | VOL. 16
07 Jun 2018

Robust Inductive Matrix Completion strategy to explore associations between lincRNAs and human disease phenotypes
Ashis Kumer Biswas ... Mingon Kang
01 Dec 2016

LiDiAimc: LincRNA-disease associations through inductive matrix completion
Ashis Biswas ... Jean Gao
01 Nov 2017

Drug repositioning based on the target microRNAs using bilateral-inductive matrix completion.
K. Deepthi ... A. S. Jereesh
Molecular genetics and genomics : MGG | VOL. 295
24 Jun 2020
