Abstract
Non-negative matrix tri-factorization (NMTF) is a popular technique for learning low-dimensional feature representations of relational data. Currently, NMTF learns a representation of a dataset through an optimization procedure that typically uses multiplicative update rules. This procedure has had limited success, and its failure cases have not been well understood. Here, we perform an empirical study on six large datasets, comparing multiplicative update rules with three alternative optimization methods: alternating least squares, projected gradients, and coordinate descent. We find that methods based on projected gradients and coordinate descent converge up to twenty-four times faster than multiplicative update rules. Furthermore, the alternating least squares method can quickly train NMTF models on sparse datasets but often fails on dense datasets. Coordinate descent-based NMTF converges up to sixteen times faster than well-established methods.
Highlights
Extracting patterns from relational data is a key task in natural language processing [1], bioinformatics [2], and digital humanities [3]
We find that the traditional multiplicative update rules method has the worst performance
These results indicate that multiplicative update rules, which are the default non-negative matrix tri-factorization (NMTF) optimization method in many applications, perform substantially worse than the alternative optimization methods described in the present study
Summary
Extracting patterns from relational data is a key task in natural language processing [1], bioinformatics [2], and digital humanities [3]. We typically represent a relational dataset with a data matrix, encoding, for example, information on document-term frequencies, gene-disease associations, or user-item ratings. Non-negative matrix tri-factorization (NMTF) is a general technique that takes a data matrix and compresses, or embeds, the matrix into a compact latent space. The learned embedding space can be used to identify clusters [4, 5], reveal interesting patterns [6, 7], and generate feature representations for downstream analytics [8, 9]. NMTF has also been used to identify cancer driver genes from patient data [11] and to model topics in text data [12]. Despite numerous applications, training NMTF models on large datasets can be slow and has remained computationally challenging [13].
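To make the factorization concrete, the following is a minimal sketch of NMTF trained with multiplicative update rules, assuming the common unconstrained formulation that approximates a non-negative data matrix X by U S Vᵀ under the squared Frobenius loss. The function name `nmtf_multiplicative`, its parameters, and the random initialization are illustrative assumptions, not the implementation evaluated in the study, which may use different update rules (for example, ones with orthogonality constraints).

```python
import numpy as np


def nmtf_multiplicative(X, k1, k2, n_iter=200, eps=1e-9, seed=0):
    """Illustrative NMTF sketch: X (n x m) ~ U (n x k1) @ S (k1 x k2) @ V.T (k2 x m).

    Assumes the unconstrained squared-Frobenius objective; all names and
    defaults here are hypothetical choices made for this example.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive random initialization keeps every factor non-negative,
    # because multiplicative updates only rescale entries by non-negative ratios.
    U = rng.random((n, k1)) + eps
    S = rng.random((k1, k2)) + eps
    V = rng.random((m, k2)) + eps

    errors = []
    for _ in range(n_iter):
        # Each update multiplies a factor by (negative gradient part) /
        # (positive gradient part); eps guards against division by zero.
        U *= (X @ V @ S.T) / (U @ (S @ V.T @ V @ S.T) + eps)
        V *= (X.T @ U @ S) / (V @ (S.T @ U.T @ U @ S) + eps)
        S *= (U.T @ X @ V) / (U.T @ U @ S @ V.T @ V + eps)
        # Track the squared reconstruction error after each full sweep.
        errors.append(np.linalg.norm(X - U @ S @ V.T) ** 2)
    return U, S, V, errors


if __name__ == "__main__":
    # Small synthetic non-negative matrix with an exact tri-factor structure.
    rng = np.random.default_rng(1)
    X = rng.random((30, 4)) @ rng.random((4, 3)) @ rng.random((3, 20))
    U, S, V, errors = nmtf_multiplicative(X, k1=4, k2=3, n_iter=100)
    print("error after first sweep:", errors[0])
    print("error after last sweep:", errors[-1])
```

In this sketch the rows of U and V play the role of low-dimensional embeddings of the matrix rows and columns, while S links the two latent spaces. Each sweep costs several dense matrix products, which gives some intuition for why training on large datasets can be slow.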