Clustering is one of the most important unsupervised learning problems in machine learning. As one of the most widely used clustering algorithms, K-means has been studied extensively, and many more sophisticated clustering algorithms build on it. Moreover, K-means is often used as the final clustering step in methods such as subspace clustering and nonnegative matrix factorization. However, for high-dimensional data, these algorithms generally use all features of the data, which often degrades clustering performance because redundant and noisy features are included. Existing research has demonstrated the importance of learning patterns from meaningful features, which motivates us to discover useful features simultaneously within the K-means framework. Thus, in this article, we incorporate feature selection into the K-means framework. To further enhance clustering ability, we minimize the fitting residual with a sparse norm and exploit the representation of the data on a manifold, which improves robustness to outliers, missing values, and noise, and strengthens the ability to recover nonlinear structures in the data. We conducted extensive experiments on gene expression and face image data sets to verify the effectiveness of the proposed method. In particular, we compare its clustering performance with several state-of-the-art algorithms on both original and noisy data, and we analyze the convergence, parameter sensitivity, learned features, and computational time of the proposed method. Across these experiments, the proposed method consistently outperforms the baseline methods, confirming its effectiveness.
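As a point of reference, the following is a minimal sketch of standard (Lloyd's) K-means, the baseline framework the article extends. It is not the proposed method: the feature selection, sparse-norm residual, and manifold terms described above are not included, and all names and parameters here are illustrative.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain Lloyd's K-means: alternate nearest-centroid assignment
    and centroid recomputation until the centroids stop moving."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct random data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points;
        # keep the old centroid if a cluster becomes empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Toy example: two well-separated 2-D blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
labels, centroids = kmeans(X, k=2)
```

Because every feature enters the distance computation with equal weight, noisy or redundant dimensions can dominate the assignments in high-dimensional data, which is exactly the shortcoming the proposed method targets.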