Abstract
The SLAM algorithm allows a robot to map its environment while localizing itself within it. It is an efficient approach that is widely adopted in ongoing research on autonomous vehicle navigation and robotic applications. However, no complete end-to-end hardware implementation of it has been adopted yet. Our work targets the hardware/software optimization of a computationally expensive functional block of monocular ORB-SLAM2. We implement the proposed optimization on an FPGA-based heterogeneous embedded architecture, which shows attractive results. To evaluate it, we carry out a comparative study with other heterogeneous architectures, including a powerful embedded GPGPU (NVIDIA Tegra TX1) and a high-end GPU (NVIDIA GeForce 920MX). The implementation uses high-level synthesis-based OpenCL for the FPGA and CUDA for the NVIDIA target boards.
Highlights
In most cases, compute-intensive tasks are handled by the CPU; this may benefit power consumption, but execution time may be neglected.
High-Level Synthesis (HLS) produces a hardware description by converting a high-level language based on C/C++ into a hardware model for a Field Programmable Gate Array (FPGA), while keeping FPGA resource usage low.
The high-end NVIDIA GeForce 920MX GPU achieves a 2-times speedup over the Intel Core i5 CPU, whereas the TX1 GPU could not improve the computation time.
Summary
Compute-intensive tasks are usually handled by the CPU; this may benefit power consumption, but execution time may be neglected. An FPGA offers highly developed resources in the form of an array of programmable logic blocks: Digital Signal Processing (DSP) blocks, which can perform a multiply-accumulate (MAC) operation in a single instruction cycle, Look-Up Tables (LUTs), and embedded SRAM memory. These resources underpin modern embedded architectures and are often used to implement complex algorithms, which makes the FPGA a more attractive choice. For complex algorithms, VHDL and Verilog are often difficult and impractical for most software developers, which is why High-Level Synthesis (HLS) [2] is used to make this task easier. HLS produces a hardware description by converting a high-level language based on C/C++ into a hardware model for the FPGA, while keeping FPGA resource usage low.
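As an illustration of the kind of C/C++ source that an HLS flow takes as input, the sketch below writes a dot product as a chain of multiply-accumulate (MAC) steps. It is plain, portable C++ and not the kernel used in this work: the function name `mac_dot`, the 16-bit input width and the 32-bit accumulator are assumptions chosen for the example, and whether a given tool maps each step onto a DSP block depends on the tool and the target device.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical example kernel (not from the paper): a dot product expressed
// as repeated multiply-accumulate (MAC) steps, acc = acc + a[i] * b[i].
// Inputs are assumed to be 16-bit integers; the accumulator is 32-bit.
std::int32_t mac_dot(const std::int16_t* a, const std::int16_t* b, std::size_t n) {
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // One MAC step: widen both operands, multiply, then add to the accumulator.
        acc += static_cast<std::int32_t>(a[i]) * static_cast<std::int32_t>(b[i]);
    }
    return acc;
}
```

Calling `mac_dot` on the vectors {1, 2, 3, 4} and {5, 6, 7, 8} returns 70. The loop body is deliberately a single multiply followed by an add, since that is the operation a DSP block is described above as performing in one cycle.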