Abstract

Recently, convolutional neural network (CNN)-based methods have been used extensively in remote sensing applications, such as object detection and classification, and have achieved significant improvements in performance. Furthermore, there is strong demand for hardware implementations of real-time remote sensing processing. However, the computation and storage requirements of floating-point models hinder the deployment of networks on hardware platforms with limited resource and power budgets, such as field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs). To solve this problem, this paper focuses on optimizing the hardware design of CNNs using low bit-width integers obtained by quantization. First, a hybrid-type inference method based on a symmetric quantization scheme is proposed, which replaces floating-point arithmetic with low bit-width integer arithmetic. Then, a training approach for the quantized network is introduced to reduce accuracy degradation. Finally, a processing engine (PE) with a low bit-width is proposed to optimize the FPGA hardware design for remote sensing image classification. In addition, a fused-layer PE is presented for state-of-the-art CNNs equipped with Batch Normalization and LeakyReLU. Experiments performed on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset using a graphics processing unit (GPU) demonstrate that the accuracy of the 8-bit quantized model drops by about 1%, which is an acceptable loss. The accuracy measured on the FPGA is consistent with that of the GPU. As for FPGA resource consumption, the Look Up Table (LUT), Flip-flop (FF), Digital Signal Processor (DSP), and Block Random Access Memory (BRAM) usage are reduced by 46.21%, 43.84%, 45%, and 51%, respectively, compared with the floating-point implementation.

Highlights

  • Object detection and classification in remote sensing images are hot research topics in earth observation applications

  • Another part of the experiments implements the quantized network on field-programmable gate arrays (FPGAs) and compares it with the fundamental network described in Section 4 in terms of on-chip resource consumption

  • While the 8-bit quantized model has an accuracy degradation of 0.99% compared with the floating-point network, the hardware resource consumption of the Look Up Table (LUT), Flip-flop (FF), Digital Signal Processor (DSP), and Block Random Access Memory (BRAM) is reduced by 46.21%, 43.84%, 45.00%, and 51.00%, respectively

Summary

Introduction

Object detection and classification in remote sensing images are hot research topics in earth observation applications. One category of approaches focuses on designing compact network architectures to achieve high computational efficiency, which has yielded great improvements over several baseline architectures. Researchers are also exploring another way, which is to quantize the models, converting floating-point precision into a lower bit-width fixed-point numerical representation. This type of method aims at increasing computation efficiency via approximate multiplications and additions. Experiments showed that while such a network provided a 58× speed-up and enabled complex models to be deployed on a CPU in real time, it still suffered a significant degradation of classification accuracy on ImageNet. The results show that the proposed hardware implementation achieves significant improvements in memory and logic resource consumption.
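To make the quantization idea above concrete, the following is a minimal sketch of symmetric uniform quantization: a single scale maps the float range [-max|x|, +max|x|] onto signed low bit-width integers. This is only an illustration of the general scheme; the paper's exact method (per-layer vs. per-channel scales, rounding mode, hybrid-type handling of Batch Normalization and LeakyReLU) may differ.

```python
def symmetric_quantize(values, bits=8):
    """Quantize a list of floats to signed integers with one shared scale.

    Symmetric scheme: zero-point is fixed at 0, so integer arithmetic
    needs no offset corrections (illustrative sketch only).
    """
    qmax = 2 ** (bits - 1) - 1           # e.g. 127 for 8-bit
    max_abs = max(abs(v) for v in values) or 1.0  # guard against all-zero input
    scale = max_abs / qmax               # float step size per integer level
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale

def dequantize(quantized, scale):
    """Map integers back to approximate float values."""
    return [q * scale for q in quantized]
```

With 8 bits, the round-trip error of each value is bounded by one quantization step (the scale), which is why moderate bit-widths can keep accuracy loss near 1% while allowing integer-only multiply-accumulate units on the FPGA.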

Hybrid-Type Inference
Symmetric Quantization Scheme
Quantized Convolutional Layer and Fully Connected Layer
Training Approach for Quantized Layers
Backward-Propagation
Implementation Architecture in FPGA
Hybrid-Type Design
Experiments and Results
Dataset Description and Data Preprocessing
Quantized Training with Different Bit-width
Training
Evaluation
Conclusions
