Abstract
With the ever growing collection of large volumes of scientific data, development of interpretable machine learning tools to analyze such data is becoming more important. However, robust, interpretable machine learning tools are lacking, threatening extraction of scientific insight and discovery. Nonnegative Matrix Factorization (NMF) algorithms decompose an m × n nonnegative data matrix A into a k × n basis matrix H and an m × k weight matrix W, such that A ≈ WH, where k is the desired rank. In this paper, we present a novel two stage algorithm, UoI-NMF_cluster for NMF, which is based on three innovations: (i) completely separate bases learning from weight estimation, (ii) learn bases by clustering NMF results across bootstrap resamples of the data, and (iii) use the recently introduced Union of Intersections (UoI) framework to estimate ultra-sparse weights that maximize data reconstruction accuracy. We deploy our algorithm on various synthetic and scientific data to illustrate its performance, with a focus on neuroscience data. Compared to other NMF algorithms, UoI-NMF_cluster yields: a) more accurate parts-based decompositions of noisy data, b) a sparse and accurate weight matrix, and c) high accuracy reconstructions of the de-noised data. Together, these improvements enhance the performance and interpretability of NMF application to noisy data, and suggest similar approaches may benefit other matrix decomposition algorithms.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.