Cross-modal hashing similarity retrieval serves a range of applications, including search engines and autopilot systems. More generally, these methods are also known to reduce computation and memory storage in a training scheme. The key limitations of current methods are that: (i) they relax the discrete constraints to solve the optimization problem, which may defeat the purpose of the model; (ii) projecting heterogeneous data into a common latent space may lose the diverse representations present in such data; and (iii) transforming real-valued data points into binary codes always results in a loss of information and produces a suboptimal continuous latent space. In this paper, we propose a novel framework, Cluster-wise Unsupervised Hashing (CUH), that projects the original data points from each modality into its own low-dimensional latent space and finds the cluster centroid points in that low-dimensional space. In particular, the proposed clustering scheme aims to jointly learn the compact hash codes and the corresponding linear hash functions. A discrete optimization framework is developed to learn the unified binary codes across modalities under the guidance of cluster-wise code-prototypes. Extensive experiments over multiple datasets demonstrate the effectiveness of our proposed model in comparison with the state of the art in unsupervised cross-modal hashing tasks.
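To make the role of the linear hash functions and the unified binary codes more concrete, the following is a minimal sketch and not the CUH optimization itself. It assumes that unified codes `B` with entries in {-1, +1} are already available (in CUH these would come from the discrete optimization), fits one ridge-regression hash function per modality, and ranks database items by Hamming distance. The names `fit_linear_hash`, `encode`, `hamming_rank`, and the regularization parameter `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_linear_hash(X, B, lam=1e-3):
    """Fit a linear hash function for one modality (illustrative assumption).

    X: (n, d) real-valued features; B: (n, r) target codes in {-1, +1}.
    Solves the ridge problem min_W ||Xc W - B||^2 + lam ||W||^2,
    where Xc is X with its column means removed.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    gram = Xc.T @ Xc + lam * np.eye(X.shape[1])
    W = np.linalg.solve(gram, Xc.T @ B)
    return mean, W


def encode(X, mean, W):
    """Map features to codes in {-1, +1}; zeros are sent to +1."""
    return np.where((X - mean) @ W >= 0, 1, -1)


def hamming_rank(query_code, db_codes):
    """Rank database codes by Hamming distance to a single query code.

    For codes in {-1, +1} of length r, the Hamming distance equals
    (r - <query, db>) / 2.
    """
    n_bits = db_codes.shape[1]
    dist = (n_bits - db_codes @ query_code) / 2
    order = np.argsort(dist, kind="stable")
    return order, dist
```

In a cross-modal setting, one would call `fit_linear_hash` once per modality against the same `B`, encode a query from one modality (for example, text), and pass its code to `hamming_rank` together with database codes produced from the other modality (for example, images).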