Abstract

Convolutional neural networks (CNNs) for visual semantic segmentation have attracted considerable attention recently because they support many important tasks, such as autonomous driving, semantic SLAM (simultaneous localization and mapping), and remote sensing surveying and mapping. These applications generally need to run on smart terminals, which requires a hardware platform with high energy efficiency and real-time performance. However, CNNs for semantic segmentation usually consist of symmetric encoders and decoders, corresponding to the down-sampling process (e.g., pooling, convolution) and the up-sampling process (e.g., unpooling, deconvolution). All of these processes are compute- and storage-intensive, which limits their applicability in resource-constrained embedded systems. In this paper, an FPGA-based accelerator programmed in OpenCL is proposed. We evaluate its performance on the CamVid dataset. With 8-bit quantization, the global accuracy drops by only 2.04%. Additionally, the system achieves 48.89 GOPS and 2.4x the real-time performance of a CPU when running on an Arria-10 GX1150 device.
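
To make the encoder/decoder symmetry concrete, the following is a minimal Python sketch of one down-sampling/up-sampling pair: 2x2 max pooling that records the position of each maximum, and the matching unpooling that writes each value back to its recorded position. This index-based form of unpooling is only one common variant; the abstract does not say which variant the accelerator implements, and the function names `max_pool_2x2` and `max_unpool_2x2` are hypothetical, chosen for illustration. The sketch is plain software and says nothing about how the operation is mapped to the FPGA.

```python
from typing import List, Tuple

Grid = List[List[float]]
Index = Tuple[int, int]


def max_pool_2x2(x: Grid) -> Tuple[Grid, List[List[Index]]]:
    """2x2 max pooling with stride 2 (hypothetical helper).

    Returns the pooled grid and, for each output cell, the (row, col)
    position in the input where the maximum was found.
    """
    h, w = len(x), len(x[0])
    assert h % 2 == 0 and w % 2 == 0, "sketch assumes even height and width"
    pooled: Grid = []
    argmax: List[List[Index]] = []
    for i in range(0, h, 2):
        pooled_row, index_row = [], []
        for j in range(0, w, 2):
            # Collect the four (value, position) pairs of the window.
            window = [(x[i + di][j + dj], (i + di, j + dj))
                      for di in (0, 1) for dj in (0, 1)]
            # On ties, max() keeps the first candidate in scan order.
            value, pos = max(window, key=lambda t: t[0])
            pooled_row.append(value)
            index_row.append(pos)
        pooled.append(pooled_row)
        argmax.append(index_row)
    return pooled, argmax


def max_unpool_2x2(pooled: Grid, argmax: List[List[Index]],
                   h: int, w: int) -> Grid:
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    out = [[0.0] * w for _ in range(h)]
    for pooled_row, index_row in zip(pooled, argmax):
        for value, (i, j) in zip(pooled_row, index_row):
            out[i][j] = value
    return out


feature_map = [
    [1.0, 2.0, 0.0, 1.0],
    [3.0, 4.0, 1.0, 0.0],
    [0.0, 0.0, 5.0, 6.0],
    [0.0, 1.0, 7.0, 8.0],
]
pooled, argmax = max_pool_2x2(feature_map)
restored = max_unpool_2x2(pooled, argmax, 4, 4)
```

The decoder output has the same spatial size as the encoder input but is sparse, which is why unpooling is typically followed by further convolution in such networks. Note also that the index map must be stored between the encoder and decoder stages, which adds to the storage demand the abstract mentions.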
