In this article, we focus on high-resolution remote sensing scene classification, which aims to label a scene image according to its content. The task is challenging because scene images exhibit high between-class similarity and large within-class variety. To address this issue, we propose a unified two-stage classification framework named the variable-weighted multi-feature fusing (VWMF) method. In the first stage, we fuse multiple low-level features into a concatenated holistic histogram feature via pre-trained unsupervised models. Then, we use a kernel collaborative representation-based classification method to predict the two most probable categories. In the second stage, for these two candidate categories, we compute a variable-weighted feature using the pre-calculated set of feature variable weights for that pair of categories, and then feed the variable-weighted feature into the pre-trained SVM for one-versus-one classification. Comprehensive experiments on two publicly available data sets demonstrate the superiority of our VWMF method over state-of-the-art remote sensing scene classification methods.
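
For illustration only, the two-stage decision rule described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation evaluated in the experiments. It starts from already fused histogram features and uses a Gaussian kernel with hypothetical parameters `gamma` and `lam`. Each pre-trained pairwise SVM is represented by the coefficients and bias of a linear decision function. The function names (`rbf_kernel`, `kcrc_top2`, `stage2_predict`), the dictionaries `pair_weights` and `pair_svms`, and the toy data are all hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kcrc_top2(X_train, y_train, x, lam=1e-2, gamma=5.0):
    """Stage 1: return the two classes with the smallest kernel-space residuals."""
    K = rbf_kernel(X_train, X_train, gamma)
    k = rbf_kernel(X_train, x[None, :], gamma)[:, 0]
    # Regularized collaborative representation over all training samples.
    alpha = np.linalg.solve(K + lam * np.eye(len(K)), k)
    classes = np.unique(y_train)
    residuals = []
    for c in classes:
        m = y_train == c
        a_c = alpha[m]
        # ||phi(x) - Phi_c a_c||^2 via the kernel trick; k(x, x) = 1 for this kernel.
        residuals.append(1.0 - 2.0 * a_c @ k[m] + a_c @ K[np.ix_(m, m)] @ a_c)
    order = np.argsort(residuals)
    return tuple(sorted(classes[order[:2]].tolist()))

def stage2_predict(x, pair, pair_weights, pair_svms):
    """Stage 2: reweight the feature for this class pair, then apply its pairwise classifier."""
    a, b = pair
    coef, bias = pair_svms[pair]               # assumed linear decision function
    score = coef @ (pair_weights[pair] * x) + bias
    return a if score >= 0 else b

# Toy fused features (hypothetical): three classes, two training samples each.
X_train = np.array([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0],
                    [0.0, 1.0, 0.0], [0.1, 0.9, 0.0],
                    [0.0, 0.0, 1.0], [0.1, 0.0, 0.9]])
y_train = np.array([0, 0, 1, 1, 2, 2])
x = np.array([0.7, 0.3, 0.0])

# Hypothetical pre-calculated weights and pairwise classifier for the pair (0, 1).
pair_weights = {(0, 1): np.array([1.0, 1.0, 0.0])}
pair_svms = {(0, 1): (np.array([1.0, -1.0, 0.0]), 0.0)}

pair = kcrc_top2(X_train, y_train, x)
label = stage2_predict(x, pair, pair_weights, pair_svms)
print(pair, label)
```

In this toy setting the first stage narrows three classes to one candidate pair, so the second stage needs only the weight vector and classifier stored for that pair. With real data there would be one entry per pair of classes.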