Abstract

Person re‐identification (Re‐ID) aims to retrieve a person of interest across multiple nonoverlapping cameras. In recent years, to bring person Re‐ID technology into real‐world applications, visual surveillance through unmanned aerial vehicle (UAV) platforms has received intense attention, and aerial person datasets have been constructed. However, pedestrian images captured by ground cameras differ greatly from those captured by UAVs, so Re‐ID methods developed on ground person images perform poorly on aerial person images. In this paper, we first use a meta‐transfer method to learn to generalize to the aerial person Re‐ID task. Specifically, combining the ideas of meta‐learning and transfer learning, a meta‐learning strategy is introduced to learn a feature extractor, and a transfer learning strategy is introduced to utilize and further improve the acquired meta‐knowledge. To prevent the catastrophic forgetting and overfitting problems caused by large‐scale model parameters, we freeze the lower‐layer neurons, which generalize well, and fine‐tune the higher‐layer neurons, which are strongly specialized, to transfer and represent the feature extractor. In addition, we observe during model training that difficult categories in the given dataset significantly affect the convergence speed and recognition accuracy of the meta‐learning method, and that a loss function based on the general Euclidean distance measure tends to steer the model toward a suboptimal solution. Therefore, to improve the model, we introduce a curriculum‐sampling‐based learning strategy that is harmonized with our meta‐transfer learning framework, together with a new metric formulation of sample similarity based on the Mahalanobis distance.
In experiments, our method achieves a Rank‐1 accuracy of 63.63% and a mean average precision (mAP) of 38.02% on an aerial Re‐ID dataset, demonstrating its potential for person Re‐ID with aerial images. The results obtained on two commonly used ground pedestrian datasets show the generalization ability of the proposed method. Ablation studies also validate that each component contributes to improving the performance of the model.
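
For readers unfamiliar with the distinction between the Euclidean and Mahalanobis distances mentioned above, the following is a minimal sketch of the textbook Mahalanobis distance, sqrt((x - y)^T M (x - y)). It is not the paper's metric formulation, which the abstract does not specify. The function name `mahalanobis` and the matrices `identity` and `stretched` are hypothetical, chosen only for illustration.

```python
import math


def mahalanobis(x, y, M):
    """Textbook Mahalanobis distance sqrt((x - y)^T M (x - y)).

    M is assumed to be a positive semidefinite matrix given as nested lists
    (for example, an inverse covariance matrix). This is an illustrative
    sketch, not the metric formulation proposed in the paper.
    """
    diff = [a - b for a, b in zip(x, y)]
    n = len(diff)
    # Quadratic form diff^T M diff, written out explicitly.
    quad = sum(diff[i] * M[i][j] * diff[j] for i in range(n) for j in range(n))
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(quad, 0.0))


# With the identity matrix the measure reduces to the Euclidean distance.
identity = [[1.0, 0.0], [0.0, 1.0]]
# A hypothetical matrix that weights the first feature dimension more heavily.
stretched = [[4.0, 0.0], [0.0, 1.0]]

euclidean_like = mahalanobis([3.0, 4.0], [0.0, 0.0], identity)
reweighted = mahalanobis([1.0, 0.0], [0.0, 0.0], stretched)
```

When `M` is the identity, all feature dimensions count equally, which is the Euclidean case. A non-identity `M` lets the distance emphasize or de-emphasize particular feature directions, which is the general motivation for Mahalanobis-style similarity measures.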
