Abstract

BackgroundMatrix factorization methods are linear models, with limited capability to model complex relations. In our work, we use tropical semiring to introduce non-linearity into matrix factorization models. We propose a method called Sparse Tropical Matrix Factorization (STMF) for the estimation of missing (unknown) values in sparse data.ResultsWe evaluate the efficiency of the STMF method on both synthetic data and biological data in the form of gene expression measurements downloaded from The Cancer Genome Atlas (TCGA) database. Tests on unique synthetic data showed that STMF approximation achieves a higher correlation than non-negative matrix factorization (NMF), which is unable to recover patterns effectively. On real data, STMF outperforms NMF on six out of nine gene expression datasets. While NMF assumes normal distribution and tends toward the mean value, STMF can better fit to extreme values and distributions.ConclusionSTMF is the first work that uses tropical semiring on sparse data. We show that in certain cases semirings are useful because they consider the structure, which is different and simpler to understand than it is with standard linear algebra.

Highlights

  • Matrix factorization methods are linear models, with limited capability to model complex relations

  • We show that in certain cases semirings are useful because they consider the structure, which is different and simpler to understand than it is with standard linear algebra

  • Even on a relatively small 200 × 100 matrix results show that negative matrix factorization (NMF) cannot successfully recover extreme values compared to Sparse Tropical Matrix Factorization (STMF), see Fig. 1

Read more

Summary

Introduction

Matrix factorization methods are linear models, with limited capability to model complex relations. We use tropical semiring to introduce nonlinearity into matrix factorization models. We propose a method called Sparse Tropical Matrix Factorization (STMF) for the estimation of missing (unknown) values in sparse data. Matrix factorization methods are getting increasingly popular in many research areas [1,2,3]. These methods generate linear models, which cannot model complex relationships. Our work focuses on incorporating non-linearity into matrix factorization models by using tropical semiring. The motivation for using tropical matrix factorization can be seen in the classic example of movie rating data, where a users-by-movies matrix contains the rating users assigned to movies.

Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.