Visual place recognition has gained popularity in recent years, yet it remains challenging because of varying viewpoints, illumination changes, and dynamic objects. Mainstream convolutional neural network-based methods formulate it as a ranking task and optimize it within the paradigm of deep metric learning. However, ranking-motivated losses consider only the ranking relationship for each query image and ignore the compactness of the intra-place feature distribution. In this paper, a novel multi-task learning framework is proposed that combines the existing triplet ranking task with our designed binary classification task to jointly optimize the network for better generalization capability. Specifically, the classification task uses a binary classification network trained with the corresponding binary cross-entropy loss. In this way, intra-place feature compactness and inter-place feature separability are reinforced. At test time, the classification network is discarded, so the computation cost does not increase. Furthermore, an attention module is presented that encourages the network to concentrate on salient regions by assigning a different importance to each spatial position. Our method achieves top-10 recalls of 97.27%, 94.6%, and 96.93% on the Pitts250k-test, Tokyo 24/7, and TokyoTM-val datasets, respectively. Extensive experiments show that the proposed network learns discriminative global features with better robustness to viewpoint and environmental variations.
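
The joint objective described above can be illustrated with a minimal sketch. The abstract does not specify the classifier architecture, the margin, the distance metric, or how the two losses are weighted, so everything below is an assumption made for illustration: `pair_probability` is a hypothetical stand-in for the binary classification network (a logistic function of the squared feature distance), and `margin`, `scale`, `bias`, and `weight` are hypothetical parameters, not values from the paper.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_ranking_loss(query, positive, negatives, margin=0.1):
    """Hinge-style triplet ranking loss summed over all negatives.

    Penalizes any negative that is not at least `margin` farther from
    the query than the positive is.
    """
    d_pos = sq_dist(query, positive)
    return sum(max(0.0, d_pos + margin - sq_dist(query, n)) for n in negatives)


def pair_probability(a, b, scale=1.0, bias=1.0):
    """Hypothetical stand-in for the binary classification network.

    Maps a feature pair to the probability that both images show the
    same place; closer features give a higher probability.
    """
    z = bias - scale * sq_dist(a, b)
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(prob_same, label):
    """Binary cross-entropy for one pair; label is 1 (same place) or 0."""
    eps = 1e-7  # clamp to avoid log(0)
    p = min(max(prob_same, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def joint_loss(query, positive, negatives, weight=1.0):
    """Triplet ranking loss plus weighted mean BCE over query pairs.

    The additive combination and `weight` are assumptions; the
    classification term is only needed during training.
    """
    rank = triplet_ranking_loss(query, positive, negatives)
    pairs = [(positive, 1)] + [(n, 0) for n in negatives]
    bce = sum(
        binary_cross_entropy(pair_probability(query, f), y) for f, y in pairs
    ) / len(pairs)
    return rank + weight * bce


if __name__ == "__main__":
    q = [0.0, 0.0]
    p = [0.0, 0.1]
    negs = [[1.0, 1.0], [2.0, 0.0]]
    print("triplet:", triplet_ranking_loss(q, p, negs))
    print("joint:  ", joint_loss(q, p, negs))
```

In this toy example the negatives already satisfy the margin, so the triplet term is zero, while the classification term still contributes a gradient signal that pulls same-place features together and pushes different-place features apart. At test time only the feature distances are needed for retrieval, so a classifier such as `pair_probability` would simply not be evaluated.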