Performance evaluation of methods for integrative dimension reduction

Hadi Fanaee-T, Magne Thoresen

doi:10.1016/j.ins.2019.04.041

Abstract

Dimension reduction (DR) methods play an indispensable role in analyzing and visualizing high-dimensional multi-source data. In recent decades, many variants of these methods have been developed in various disciplines and domains. Because of the diversity and ever-increasing number of developed techniques, choosing the right method for a given problem is a difficult task. In this study we benchmark 87 methods for integrative dimension reduction of mRNA expression and DNA methylation data, which is a common problem in biology and medicine. Our ranking is based on four main factors: quality of dimension reduction (local, global, and local-global neighborhood preservation), clustering quality, speed, and sensitivity to input parameters, evaluated on multiple datasets generated by InterSIM (a semi-realistic multi-source data simulator in the genomics domain). The results are then validated on a real breast cancer dataset through visual evaluation metrics such as co-ranking matrices, inspection of true cancer sub-types in two-dimensional projections, and LCMC curves. We also demonstrate the relationship between the methods via network inference. The findings of this study can be useful in algorithm selection and in planning the experimental design of multi-source data analyses.

Highlights

Analysis of data from multiple sources is a rapidly emerging area with an ever-increasing role in biology and medicine, and data integration has become an important research area.
We compiled a list of methods that can be used for integrative data analysis, selected from different families: Dimension Reduction (DR) methods, Non-negative Matrix Factorization (NMF), Joint Matrix Factorization (JMF), Joint Non-negative Matrix Factorization (JNMF), Multi-Block data methods (MB), Bayesian Multi-Block models (BMB), and Joint/Separated Matrix Factorization (JSMF).
The Local Continuity Meta-Criterion (LCMC) [6] is a parameter-free and widely accepted quality measure for dimension reduction on single-view datasets. It can be defined as the average overlap between each point's k-nearest neighbors in the high-dimensional space and its k-nearest neighbors in the low-dimensional projection.
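
The neighborhood-overlap idea behind LCMC can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (knn_indices, mean_knn_overlap, lcmc), the use of Euclidean distance, and the brute-force neighbor search are all assumptions made for illustration. The lcmc function additionally applies the chance-adjusted normalization commonly given in the literature (mean overlap divided by k, minus k/(N-1)), which is not spelled out in the text above.

```python
import numpy as np


def knn_indices(X, k):
    """Return, for each row of X, the indices of its k nearest rows
    (Euclidean distance, the point itself excluded).
    Brute force: suitable only for small sample sizes."""
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)      # squared pairwise distances
    np.fill_diagonal(d2, np.inf)         # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


def mean_knn_overlap(X_high, X_low, k):
    """Average number of shared k-nearest neighbors between the
    high-dimensional data and its low-dimensional projection."""
    nn_high = knn_indices(X_high, k)
    nn_low = knn_indices(X_low, k)
    overlaps = [len(set(a) & set(b)) for a, b in zip(nn_high, nn_low)]
    return float(np.mean(overlaps))


def lcmc(X_high, X_low, k):
    """Assumed normalized form: overlap fraction minus the overlap
    expected from a random projection, k / (N - 1)."""
    n = X_high.shape[0]
    return mean_knn_overlap(X_high, X_low, k) / k - k / (n - 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 10))        # toy data: 50 samples, 10 features
    Y = X[:, :2]                         # crude "projection": first two columns
    print("mean overlap:", mean_knn_overlap(X, Y, k=5))
    print("LCMC:", lcmc(X, Y, k=5))
```

A projection that preserves every neighborhood gives a mean overlap of k, so higher values indicate better local preservation. Evaluating the measure over a range of k values yields a curve, with small k reflecting local structure and large k reflecting global structure.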

Summary

Introduction

Analysis of data from multiple sources is a rapidly emerging area with an ever-increasing role in biology and medicine, and data integration has become an important research area. An important problem in omics data analysis is the high dimensionality of the data. The main goal in DR is to minimize the distance between the points in a high-dimensional space and the points in a low-dimensional projection of the same data. A good DR method should produce a lower-dimensional projection that is faithful to the original high-dimensional space. This faithfulness is relative: depending on the goal of the analysis, one might concentrate on preserving either the local or the global neighborhoods. For data coming from multiple sources (or views) this becomes even more complicated, because



Journal: Information Sciences
Publication Date: Apr 22, 2019
Citations: 7
License type: cc-by

