Abstract
The vision system of a mobile robot must interpret its environment in real time at low power. SIFT (Scale Invariant Feature Transform) is an effective algorithm for extracting information from images and is widely used in computer vision; however, its high computational complexity makes real-time performance difficult to achieve in software alone. This paper presents a machine vision system that implements the SIFT algorithm on an embedded image processing card, where real-time scene recognition is accomplished at low power consumption through the cooperation of an FPGA (Field Programmable Gate Array) and a DSP (Digital Signal Processor). The original SIFT keypoint detection algorithm is adapted for parallel computation and implemented as a hardware pipeline in the FPGA. Although our current system is designed for 360×288 video frames, the pipelined architecture can be applied to images of arbitrary resolution. Meanwhile, the original 128-dimensional SIFT descriptor is replaced by a new 18-dimensional descriptor that can be generated more efficiently and matched against an absolute distance threshold, with distance defined by the infinity norm. On this basis, a five-branch-tree data structure is designed for fast searching and matching of descriptors, and robust scene recognition is realized by combining keypoints. Since our new descriptor allows one keypoint to be matched to several keypoints, a property distinct from the original SIFT algorithm, our system can recognize multiple images with overlapping content simultaneously. In addition, compared with traditional approaches that require off-line training, our system performs fast on-line learning, a desirable property for mobile robots.
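As a rough illustration of the matching rule summarized above, the following C sketch (not taken from the paper; DESC_DIM, MATCH_THRESHOLD, and the function names are illustrative assumptions) compares two 18-dimensional descriptors by their infinity-norm distance against an absolute threshold.

    /* Minimal sketch of the matching rule described in the abstract:
     * two 18-dimensional descriptors are declared a match when their
     * infinity-norm (Chebyshev) distance falls below an absolute threshold.
     * DESC_DIM and MATCH_THRESHOLD are illustrative names and values,
     * not taken from the paper. */
    #include <math.h>
    #include <stdbool.h>

    #define DESC_DIM 18
    #define MATCH_THRESHOLD 0.2f   /* hypothetical threshold value */

    /* Infinity-norm distance: the largest absolute difference over all dimensions. */
    static float linf_distance(const float a[DESC_DIM], const float b[DESC_DIM])
    {
        float max_diff = 0.0f;
        for (int i = 0; i < DESC_DIM; ++i) {
            float d = fabsf(a[i] - b[i]);
            if (d > max_diff)
                max_diff = d;
        }
        return max_diff;
    }

    /* Absolute-threshold matching: unlike the ratio test of the original SIFT,
     * this rule allows one keypoint to match several stored keypoints. */
    static bool descriptors_match(const float a[DESC_DIM], const float b[DESC_DIM])
    {
        return linf_distance(a, b) < MATCH_THRESHOLD;
    }

Because each descriptor is accepted or rejected by this absolute test rather than by comparing its best and second-best matches, a single query keypoint may match keypoints from several stored images, which is what enables the simultaneous recognition of overlapping scenes mentioned above.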