This letter presents a novel low-power digital CMOS architecture for speaker identification (SI) that combines $k$-means clustering with Gaussian mixture model (GMM) scoring. We show that $k$-means clustering at the front end reduces the dimensionality of the speech features, minimizing downstream processing without affecting SI accuracy. The implementation of the cluster generator is discussed, with novel distance-computing and online centroid-update datapaths that minimize the overhead of the clustering layer (CL). The integrated design achieves $6\times$ lower energy than the conventional design for SI among ten speakers.
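
The data flow described above can be illustrated in software with a minimal sketch. It is not the hardware datapath of this letter: it assumes that the clustering layer summarizes the frames of an utterance as $k$ centroids using a sequential running-mean update, that each speaker is modeled by a diagonal-covariance GMM, and that each centroid's log-likelihood is weighted by the number of frames it absorbed. The function names (`online_kmeans`, `gmm_log_likelihood`, `identify`) and the weighting rule are hypothetical choices made for illustration.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def online_kmeans(frames, k):
    """Sequential k-means (assumed update rule, not the letter's datapath).

    The first k frames seed the centroids; every later frame moves its
    nearest centroid toward itself by 1/count, so each centroid stays
    the running mean of the frames assigned to it.
    """
    centroids = [list(f) for f in frames[:k]]
    counts = [1] * k
    for f in frames[k:]:
        j = min(range(k), key=lambda i: sq_dist(f, centroids[i]))
        counts[j] += 1
        eta = 1.0 / counts[j]
        centroids[j] = [c + eta * (x - c) for c, x in zip(centroids[j], f)]
    return centroids, counts


def gmm_log_likelihood(x, gmm):
    """log p(x) under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) tuples, one per component.
    """
    terms = []
    for w, mu, var in gmm:
        lp = math.log(w)
        for xi, m, v in zip(x, mu, var):
            lp -= 0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        terms.append(lp)
    mx = max(terms)  # log-sum-exp for numerical stability
    return mx + math.log(sum(math.exp(t - mx) for t in terms))


def identify(frames, speaker_gmms, k):
    """Score only the k centroids against each speaker's GMM.

    Assumption: each centroid's log-likelihood is weighted by its frame
    count, so the GMM is evaluated k times instead of once per frame.
    """
    centroids, counts = online_kmeans(frames, k)
    scores = {
        name: sum(n * gmm_log_likelihood(c, gmm)
                  for c, n in zip(centroids, counts))
        for name, gmm in speaker_gmms.items()
    }
    return max(scores, key=scores.get), scores
```

Under these assumptions, `identify(frames, speaker_gmms, k)` evaluates each speaker model $k$ times per utterance rather than once per frame, which is the kind of downstream saving the clustering front end is meant to provide.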