Deep learning plays an important role in Wi-Fi localization systems because it can mine deep features from measurement data. Two main challenges remain: combating signal fluctuation, which reduces the discriminability of samples, and exploiting as much of the information in the sample measurements as possible during the training phase. Both directly affect localization accuracy and robustness. To address these issues, this paper proposes an indoor Wi-Fi localization scheme consisting of two main modules. First, an improved contrastive learning method is introduced to process the sample signal measurements and increase their discriminability. It approaches the problem from the perspective of learning and encoding, and avoids the drawbacks of traditional processing methods. Second, we build a parallel fusion network, named PaCNN-LSTM, based on a convolutional neural network (CNN) and a long short-term memory (LSTM) network. Unlike existing networks, PaCNN-LSTM connects the two neural networks in parallel rather than in series, which improves the generalization performance of the model when extracting the spatial and temporal features of signal measurements. It also takes into account the large amount of intermediate-layer information that is usually ignored: adding a Flatten layer after the pooling layer broadens the information available from the samples. Extensive experimental results show that the proposed scheme outperforms other schemes, improving localization accuracy by about 22%.
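To illustrate how a contrastive objective can increase sample discrimination, the following is a minimal sketch of the classical pairwise contrastive loss. It is not the improved variant proposed in this paper, whose details are not given in the abstract. The function names `euclidean` and `contrastive_loss`, and the parameters `same_location` and `margin`, are hypothetical and chosen only for illustration. The embeddings are assumed to be encoded signal measurements.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_location, margin=1.0):
    """Classical pairwise contrastive loss (illustrative assumption only).

    Pairs from the same location are pulled together (loss = d^2).
    Pairs from different locations are pushed apart until their
    distance exceeds `margin` (loss = max(0, margin - d)^2).
    """
    d = euclidean(emb_a, emb_b)
    if same_location:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical embeddings of two measurement samples.
close_pair = contrastive_loss([0.0], [0.5], same_location=False, margin=1.0)
far_pair = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_location=False, margin=1.0)
```

In this sketch, a pair from different locations whose embeddings are closer than the margin is penalized, while a pair already separated by more than the margin contributes no loss, so training under such an objective tends to spread samples from different locations apart in the embedding space.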