Abstract
Visual semantic segmentation, typified by semantic segmentation networks, is widely used in fields such as intelligent robots, security, and autonomous driving. However, these Convolutional Neural Network (CNN)-based networks place high demands on the computing resources and programmability of hardware platforms. For embedded platforms and terminal devices in particular, Graphics Processing Unit (GPU)-based computing platforms cannot meet these requirements in terms of size and power consumption. In contrast, a Field Programmable Gate Array (FPGA)-based hardware system not only offers flexible programmability and high embeddability, but can also meet lower power consumption requirements, which makes it an appropriate solution for semantic segmentation on terminal devices. In this paper, we present EDSSA, an Encoder-Decoder semantic segmentation network accelerator architecture that can be implemented with flexible parameter configurations and hardware resources on FPGA platforms that support Open Computing Language (OpenCL) development. Taking the Encoder-Decoder semantic segmentation network SegNet as an example, we introduce the related technologies, architecture design, algorithm optimization, and hardware implementation, and carry out a performance evaluation. Evaluated on an Intel Arria-10 GX1150 platform, our work achieves a throughput higher than 432.8 GOP/s with a power consumption of about 20 W, which is a 1.2× improvement in the energy-efficiency ratio compared to a high-performance GPU.
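The energy-efficiency figure quoted above can be checked with a short calculation. The following is a minimal sketch, assuming that the reported lower bound of 432.8 GOP/s and the approximate 20 W are taken as exact values; the function name and constants are illustrative and do not come from the paper. The GPU baseline printed here is only what the 1.2× ratio implies under those assumptions, not a measured value reported in the abstract.

```python
def energy_efficiency(throughput_gops: float, power_w: float) -> float:
    """Return the energy-efficiency ratio in GOP/s per watt."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return throughput_gops / power_w


# Values quoted in the abstract, treated as exact for illustration only.
FPGA_THROUGHPUT_GOPS = 432.8  # reported as "higher than 432.8 GOP/s"
FPGA_POWER_W = 20.0           # reported as "about 20 W"
REPORTED_RATIO = 1.2          # FPGA efficiency relative to the GPU

fpga_eff = energy_efficiency(FPGA_THROUGHPUT_GOPS, FPGA_POWER_W)

# Back-calculated from the ratio; an assumption, not a figure from the paper.
implied_gpu_eff = fpga_eff / REPORTED_RATIO

print(f"FPGA efficiency: {fpga_eff:.2f} GOP/s/W")
print(f"Implied GPU baseline: {implied_gpu_eff:.2f} GOP/s/W")
```

Under these assumptions the accelerator delivers roughly 21.6 GOP/s per watt, so the stated improvement is a claim about work done per unit of energy rather than about raw throughput.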
Highlights
Visual semantic segmentation is widely used in various applications, such as intelligent robot technology [1,2,3], autonomous driving [4,5], and pedestrian detection [6].
To solve these problems, we propose EDSSA, an Encoder-Decoder semantic segmentation network accelerator architecture for Open Computing Language (OpenCL)-based Field Programmable Gate Array (FPGA) heterogeneous platforms.
We developed EDSSA using the Intel FPGA SDK for OpenCL Pro 17.1.
Summary
Visual semantic segmentation is widely used in various applications, such as intelligent robot technology [1,2,3], autonomous driving [4,5], and pedestrian detection [6]. Classic visual simultaneous localization and mapping (VSLAM) is mostly based on low-level computer vision features (points, lines, etc.). Although such a description captures geometric spatial information well, it lacks a high-level semantic understanding of the environment. With the development of deep learning technologies, researchers have proposed various neural network algorithms that achieve high-level feature extraction from visual input, for tasks such as image classification [12,13,14] and semantic segmentation [15,16,17]. Combining classic VSLAM with a semantic segmentation network represents a new evolution of the traditional feature point extraction methods. Semantic VSLAM frameworks have been proposed to solve several problems with the classic VSLAM algorithms [18] and have shown good performance.