The advent of a panoply of resource limited devices opens up new challenges in the design of computer vision algorithms with a clear compromise between accuracy and computational requirements. In this letter we present new binary image descriptors that emerge from the application of triplet ranking loss, hard negative mining and anchor swapping to traditional features based on pixel differences and image gradients. These descriptors, BAD (Box Average Difference) and HashSIFT, establish new operating points in the state-of-the-art's accuracy vs. resources trade-off curve. In our experiments we evaluate the accuracy, execution time and energy consumption of the proposed descriptors. We show that BAD bears the fastest descriptor implementation in the literature while HashSIFT approaches in accuracy that of the top deep learning-based descriptors, being computationally more efficient. We have made the source code public <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</sup> .
Read full abstract