Abstract

Understanding the role of genetics in disease is a challenging problem with multiple applications in functional genomics and precision medicine. In this paper, we present a general clustering method to identify disease genes in a multi-view setting. First, by incorporating the graph Laplacian of spectral clustering (SC) into discriminative K-means, we obtain a single-view subspace representation that carries both the discriminative power and the geometric structure of that data layer. Then, by integrating these individual subspaces on the Grassmann manifold, we find a unified low-dimensional representation under the multi-view SC framework. The proposed two-stage method generalizes both single-view discriminative K-means and multi-view Grassmann clustering, and it can directly handle the case where both attribute-based data and interaction-based networks are available, which is extremely useful in biological research. As a case study in disease gene identification, we apply the method to a benchmark dataset that contains nine gene-by-term text profiles. Experimental results show that our method is competitive with state-of-the-art clustering methods, including a similar method that fuses multiple kernels and Laplacians.
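
The two-stage idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Stage 1 assumes that the Laplacian enters the discriminative K-means trace objective as a subtracted penalty weighted by a hypothetical parameter `beta`; the abstract does not state the exact form. Stage 2 uses the standard Grassmann merging step, which takes the bottom eigenvectors of the summed Laplacians minus `alpha` times the summed subspace projections. All function names, `lam`, `beta`, `alpha`, and the Gaussian affinity are illustrative assumptions.

```python
import numpy as np

def gaussian_affinity(X, sigma=1.0):
    """Dense Gaussian affinity between the rows of X (illustrative choice)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def normalized_laplacian(W):
    """Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    return np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

def single_view_subspace(X, W, k, lam=1.0, beta=1.0):
    """Stage 1 (assumed form): top-k eigenvectors of M - beta * L.

    M = I - (I + G / lam)^{-1} is the matrix in the discriminative K-means
    trace objective, with G the Gram matrix of the centered data.
    Subtracting beta * L is an assumption made for this sketch.
    """
    Xc = X - X.mean(axis=0)            # center the attributes
    G = Xc @ Xc.T                      # n x n Gram matrix
    n = G.shape[0]
    M = np.eye(n) - np.linalg.inv(np.eye(n) + G / lam)
    L = normalized_laplacian(W)
    A = M - beta * L
    A = (A + A.T) / 2.0                # enforce symmetry before eigh
    _, vecs = np.linalg.eigh(A)        # eigenvalues in ascending order
    return vecs[:, -k:], L             # top-k eigenvectors and the Laplacian

def grassmann_merge(laplacians, subspaces, k, alpha=0.5):
    """Stage 2: bottom-k eigenvectors of sum(L_i) - alpha * sum(U_i U_i^T)."""
    L_mod = sum(laplacians) - alpha * sum(U @ U.T for U in subspaces)
    L_mod = (L_mod + L_mod.T) / 2.0
    _, vecs = np.linalg.eigh(L_mod)
    return vecs[:, :k]                 # unified n x k representation
```

As in ordinary spectral clustering, the rows of the unified representation would then be grouped with K-means to obtain the final gene clusters.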

