Abstract

Drones, also known as unmanned aerial vehicles (UAVs), have been widely applied in geological hazard monitoring, smart agriculture, and urban planning over the past decade. In this work, we fuse multiple attributes into a noise-tolerant hashing framework that detects objects in drone pictures very quickly. Our method can intrinsically and flexibly encode the various topological structures of each target object, based on which multi-scale objects can be discovered in a view- and altitude-invariant way. Moreover, by leveraging the ℓF and ℓ1 norms collaboratively, the calculated hash codes are robust to low-quality drone pictures and noisy semantic labels. More specifically, for each drone-borne picture, we extract the visually/semantically salient object parts inside it. To characterize their topological structure, we construct a graphlet by linking spatially adjacent object patches into a small graph. Subsequently, a binary matrix factorization (MF) is designed to hierarchically exploit the semantics of these graphlets, wherein three attributes are seamlessly incorporated: i) deep binary hash code learning, ii) denoising of contaminated pictures/labels, and iii) adaptive data graph updating. This multi-attribute binary MF can be solved iteratively, after which each graphlet is transformed into binary hash codes. Finally, the hash codes corresponding to the graphlets within each drone photo are used for ranking-based object discovery. Comprehensive experiments on the DAC-SDC, MOHR, and our self-compiled data sets have demonstrated the competitive speed and accuracy of our method. As a byproduct, we employ an elaborately designed FPGA architecture to calculate our hash codes. On average, an object detection speed of 57 frames per second (fps) is achieved on 4K drone videos (without temporal modeling).
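The ranking-based discovery step can be pictured as a nearest-code lookup over binary hash codes. The sketch below is an illustration only, not the authors' implementation: the abstract does not specify the code length, the distance measure, or how graphlet codes are compared, so the 0/1 code layout, the use of Hamming distance, the toy data, and the helper name `rank_by_hamming` are all assumptions.

```python
import numpy as np


def rank_by_hamming(query_code, graphlet_codes):
    """Rank graphlet hash codes by Hamming distance to a query code.

    Assumption: codes are stored as 0/1 arrays, one row per graphlet.
    Returns the row indices in ascending distance order and the
    matching distances.
    """
    query_code = np.asarray(query_code, dtype=np.uint8)
    graphlet_codes = np.asarray(graphlet_codes, dtype=np.uint8)
    # Hamming distance = number of bit positions that differ.
    distances = np.count_nonzero(graphlet_codes != query_code, axis=1)
    # A stable sort keeps the original order among equally distant codes.
    order = np.argsort(distances, kind="stable")
    return order, distances[order]


# Toy data (hypothetical): four graphlets with 4-bit codes.
codes = np.array(
    [
        [1, 0, 1, 1],
        [0, 0, 1, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 0],
    ],
    dtype=np.uint8,
)
query = np.array([1, 0, 1, 0], dtype=np.uint8)

order, dists = rank_by_hamming(query, codes)
print(order.tolist(), dists.tolist())
```

Because comparing binary codes reduces to counting differing bits, this kind of lookup maps naturally onto bitwise hardware operations, which is consistent with the FPGA deployment mentioned in the abstract.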
