Multiview clustering (MVC) has recently received great interest because it combines abundant and complementary information from multiple views to improve clustering performance, overcoming the limited-view drawback of standard single-view clustering. However, existing MVC methods are mostly designed for vectorial data from linear spaces and are thus not suitable for multidimensional data with intrinsic nonlinear manifold structures, e.g., videos or image sets. Some works have introduced manifold representations of data into MVC and obtained considerable improvements, but how to fuse multiple manifolds efficiently for clustering remains a challenging problem. For heterogeneous manifolds in particular, it is an entirely new problem. In this article, we propose to represent complicated multiview data as heterogeneous manifolds and present a framework that fuses these heterogeneous manifolds for clustering. In contrast to empirical weighting methods, we design an adaptive fusion strategy that weights the importance of different manifolds in a data-driven manner. In addition, low-rank representation is generalized to the fused heterogeneous manifolds to explore the low-dimensional subspace structures embedded in the data for clustering. We assess the proposed method on several public data sets, including human action videos, facial images, and traffic scenario videos. The experimental results show that our method clearly outperforms a number of state-of-the-art clustering methods.