Abstract

Unsupervised person re‐identification based on video sequences can be applied to surveillance systems and is attracting increasing attention. It aims to spot a specific person in other scenes captured by different cameras. This work explores a new strategy: learning to cluster unlabeled persons in videos through graph convolutional networks. We find that the likelihood of inter‐frame linkage can be inferred from context, and we therefore propose a pose‐guided topology linkage clustering framework. The framework consists of three modules: (i) a pose‐guided representation module; (ii) a pose‐guided embedding module; (iii) a link prediction module. First, representation encoding alone is performed at the level of relational inductive bias, embedding implicit pose structure information into the image features. Then, taking into account the topological relationships between adjacent frames and across frames, a graph convolutional network is introduced to infer the likelihood of linkage between frame nodes. Experiments show that the proposed method scales well, responds effectively to person clustering in the presence of changes, and does not need the number of clusters as a prior.
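
The last claim, that the number of clusters need not be given in advance, can be illustrated with a minimal sketch. It assumes that pairwise linkage likelihoods between frame nodes are already available (here as hand-written, hypothetical scores rather than the output of a graph convolutional network) and that nodes are merged with a simple threshold plus union-find. The function name `cluster_by_links`, the `threshold` value, and the merging rule are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List, Tuple


def cluster_by_links(
    num_nodes: int,
    link_probs: Dict[Tuple[int, int], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[int]:
    """Merge frame nodes whose predicted link likelihood meets the threshold.

    Returns one cluster label per node. The number of clusters is not an
    input; it emerges from which links are accepted.
    """
    parent = list(range(num_nodes))

    def find(x: int) -> int:
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), prob in link_probs.items():
        if prob >= threshold:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    # Relabel roots as consecutive cluster ids in order of first appearance.
    labels: List[int] = []
    root_to_label: Dict[int, int] = {}
    for node in range(num_nodes):
        root = find(node)
        if root not in root_to_label:
            root_to_label[root] = len(root_to_label)
        labels.append(root_to_label[root])
    return labels


# Hypothetical linkage scores for five frame nodes.
example_probs = {(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.1, (3, 4): 0.7}
example_labels = cluster_by_links(5, example_probs)
```

With these made-up scores, nodes 0 to 2 end up in one cluster and nodes 3 and 4 in another, without the cluster count ever being specified.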
