Abstract

Robot vision is an essential research field that enables machines to perform various tasks by classifying, detecting, and segmenting objects as humans do. The classification accuracy of machine learning algorithms already exceeds that of a well-trained human, and benchmark results have largely saturated. Hence, in recent years, many studies have focused on reducing model size so that networks can be deployed on mobile devices. For this purpose, we propose a multipath lightweight deep network using randomly selected dilated convolutions. The proposed network consists of two sets of multipath networks (minimum 2, maximum 8 paths), where the output feature maps of one path are concatenated with the input feature maps of another path so that features are reusable and abundant. We also replace the standard convolution of each path with a randomly selected dilated convolution, which has the effect of increasing the receptive field. Compared with the state of the art, the proposed network lowers the number of floating point operations (FLOPs) and parameters by more than 50% and the classification error by 0.8%, demonstrating its efficiency.
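As a rough illustration of the idea described in the abstract, the PyTorch sketch below builds a hypothetical multipath block in which each path applies a 3x3 convolution with a randomly drawn dilation rate, and the path outputs are concatenated with the block input so features can be reused. The class and parameter names (MultipathBlock, path_channels, dilation_choices) are illustrative assumptions, not the authors' implementation.

```python
import random
import torch
import torch.nn as nn

class MultipathBlock(nn.Module):
    """Sketch (assumed, not the paper's code): each path uses a 3x3 convolution
    whose dilation rate is drawn at random at construction time, and the path
    outputs are concatenated with the block input for feature reuse."""

    def __init__(self, in_channels, path_channels, num_paths=4, dilation_choices=(1, 2, 3)):
        super().__init__()
        self.paths = nn.ModuleList()
        for _ in range(num_paths):
            d = random.choice(dilation_choices)  # randomly selected dilation rate
            self.paths.append(nn.Sequential(
                # padding=d keeps the spatial size unchanged for a 3x3 kernel
                nn.Conv2d(in_channels, path_channels, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(path_channels),
                nn.ReLU(inplace=True),
            ))

    def forward(self, x):
        # Concatenate the input with every path output along the channel axis,
        # keeping features "reusable and abundant" as described in the abstract.
        return torch.cat([x] + [path(x) for path in self.paths], dim=1)


# Usage: 4 paths, each producing 16 channels, on a 32-channel input
block = MultipathBlock(in_channels=32, path_channels=16, num_paths=4)
y = block(torch.randn(1, 32, 56, 56))
print(y.shape)  # torch.Size([1, 96, 56, 56]) -> 32 + 4 * 16 channels
```

Using larger dilation rates in some paths enlarges the receptive field without adding parameters, which is the effect the randomly selected dilated convolutions are intended to provide.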

Highlights

  • Object detection is one of the essential techniques that robots need to perform a variety of tasks

  • We reduce the number of floating point operations (FLOPs) and parameters by more than 50%

  • We analyze the effects of multiple paths and randomly selected dilated convolutions on lightweight deep networks


Summary

Introduction

Object detection is one of the essential techniques that robots need to perform a variety of tasks, and detecting objects quickly and accurately remains technically challenging in robot vision. Deep convolutional neural networks (DCNNs) have attracted extensive attention in various computer vision applications such as object detection [1,2,3,4,5,6,7], object classification [8,9,10,11,12,13,14,15], and image segmentation [16,17,18,19,20]. DCNNs are composed of a series of convolutional layers, resulting in abundant features, more parameters, and complicated structures; these properties lead to a significant improvement in performance. Some of the prominent research involving DCNNs includes: combining networks within networks (NIN [21]), reducing the number of parameters with a bottleneck layer (GoogLeNet [22]), stacking several simple networks (VGGNet [11]), connecting an extra path between different layers (ResNet [12]), concatenating feature maps from previous layers to subsequent layers (DenseNet [13]), and increasing the number of channels as the layers get deeper (PyramidNet [23]).
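As a generic illustration of two connection patterns mentioned above, the sketch below contrasts a ResNet-style extra path (element-wise addition) with a DenseNet-style concatenation of earlier feature maps. It is a simplified, assumed example for clarity, not the original architectures.

```python
import torch
import torch.nn as nn

def conv_bn_relu(in_ch, out_ch):
    """Generic 3x3 convolution followed by batch norm and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class ResidualStyleBlock(nn.Module):
    """ResNet-style: an extra path adds the input to the transformed output."""
    def __init__(self, channels):
        super().__init__()
        self.body = conv_bn_relu(channels, channels)

    def forward(self, x):
        return x + self.body(x)  # element-wise addition keeps channel count

class DenseStyleBlock(nn.Module):
    """DenseNet-style: the input is concatenated with the new feature maps."""
    def __init__(self, in_channels, growth):
        super().__init__()
        self.body = conv_bn_relu(in_channels, growth)

    def forward(self, x):
        return torch.cat([x, self.body(x)], dim=1)  # channel-wise concatenation

x = torch.randn(1, 32, 56, 56)
print(ResidualStyleBlock(32)(x).shape)           # torch.Size([1, 32, 56, 56])
print(DenseStyleBlock(32, growth=16)(x).shape)   # torch.Size([1, 48, 56, 56])
```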
