Abstract

Many venues, such as airports, railway stations, and shopping malls, use video surveillance systems for security and monitoring. However, searching for and retrieving people from attribute descriptions across a large volume of video is difficult, particularly under varying weather conditions and in crowded scenes. Most existing attribute-based person retrieval systems consist of two main modules: object detection and person attribute recognition. The object detection stage in these methods commonly suffers from false positives, missed detections, and multiple bounding boxes for the same object. The attribute recognition stage suffers from low accuracy when a single attribute classifier is used, and from error propagation when multi-attribute classifiers are cascaded. This paper addresses these issues by replacing object detection with the ByteTrack algorithm, which exploits each person's spatio-temporal information to build a tube that keeps all of that person's bounding boxes, associating both high-score and low-score boxes without increasing false positive detections. Linking each person's bounding boxes together also yields more accurate attribute recognition than classifying the attributes of each bounding box separately. In addition, the proposed algorithm merges selected predictions from two attribute recognition algorithms to improve recognition performance. An extensive empirical evaluation was carried out on the SoftBioSearch database. The results show that the proposed retrieval algorithm outperforms the best conventional method by 14%.
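
The two ideas above, recognizing attributes per track rather than per box and merging selected predictions from two recognizers, can be illustrated with a minimal sketch. The abstract does not specify the aggregation or selection rule, so the sketch assumes simple averaging over a track and a fixed per-attribute choice of model. The function names, attribute names, and probability values are all hypothetical.

```python
from statistics import mean


def aggregate_track(frame_probs):
    """Average per-frame attribute probabilities over one person's track.

    Assumption: plain averaging; the actual aggregation rule is not
    stated in the abstract.
    """
    attributes = frame_probs[0].keys()
    return {a: mean(f[a] for f in frame_probs) for a in attributes}


def merge_models(track_a, track_b, prefer_a):
    """Take each attribute from model A if it is in prefer_a, else from model B.

    Assumption: a fixed per-attribute selection; the actual selection
    rule is not stated in the abstract.
    """
    return {a: (track_a[a] if a in prefer_a else track_b[a]) for a in track_a}


# Hypothetical per-frame outputs of two attribute recognizers for one track.
model_a_frames = [
    {"red_shirt": 0.9, "backpack": 0.2},
    {"red_shirt": 0.7, "backpack": 0.6},
]
model_b_frames = [
    {"red_shirt": 0.4, "backpack": 0.8},
    {"red_shirt": 0.6, "backpack": 0.6},
]

merged = merge_models(
    aggregate_track(model_a_frames),
    aggregate_track(model_b_frames),
    prefer_a={"red_shirt"},
)
print(merged)
```

Averaging over the track smooths out a single poor frame, such as the low backpack score in the first frame of model A, which a per-box classifier would otherwise report as is.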
