The gradual build-out of large-scale distributed camera networks and the rapid development of “Internet +” have made massive video surveillance systems commonplace. Because pedestrians are the key monitoring targets in these systems, many studies focus on cross-camera pedestrian re-identification algorithms. Current pedestrian re-identification models face two difficulties: the network is hard to train because the numbers of training samples of different types differ greatly, and large differences in visual appearance reduce identification accuracy. To address these difficulties, this paper proposes a deep learning model and designs a pedestrian re-identification system based on a deep convolutional neural network. In particular, we compute differences between neighborhoods of the two inputs to derive the local relationship between the two input images, which reduces the effects of illumination and viewpoint. Furthermore, we employ focal loss to address sample imbalance in pedestrian re-identification, which improves the model's potential for practical application. The proposed method is implemented in an end-to-end monitoring system that we developed for pedestrian re-identification. The hardware component of the system consists of a digital matrix, a streaming media storage server, and a network high-speed dome, and it can be extended to additional tasks in the future. Our approach reduces the effects of data imbalance and visual appearance differences, achieving 76.0% rank-1 and 99.5% rank-20 accuracy on the large CUHK03 data set, which is a significant improvement over the earlier IDLA model and is superior to other existing approaches.
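The abstract does not state how the focal loss is configured, so the following is only a minimal sketch of the general idea. It assumes the standard binary form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t) applied to a same-person / different-person pair classifier. The function names and the values gamma = 2.0 and alpha = 0.25 are illustrative assumptions, not settings reported by the authors.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for a single image pair.

    p     -- predicted probability that the pair shows the same person
    y     -- ground-truth label: 1 = same person, 0 = different people
    gamma -- focusing parameter (assumed value, not from the paper)
    alpha -- class weight for the positive class (assumed value)
    """
    # Probability assigned to the true class, and the matching class weight.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(p, y, eps=1e-12):
    """Plain binary cross-entropy, for comparison."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(min(max(p_t, eps), 1.0))


# An easy negative pair (confidently and correctly rejected) versus
# a hard positive pair (same person, but scored low by the model).
easy_negative = focal_loss(0.05, 0)
hard_positive = focal_loss(0.30, 1)

print("easy negative:", easy_negative)
print("hard positive:", hard_positive)
```

In this sketch the many easy negative pairs contribute very little to the total loss, so the comparatively rare matching pairs are not swamped during training. With gamma = 0 and alpha = 0.5 the expression reduces to half the ordinary cross-entropy.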