Abstract

This paper uses a deep learning model called Faster R-CNN to detect and track objects in images. Two backbone networks such as ResNet-101 and VGG-16 are tested on a self-created dataset and PASCAL VOC dataset. Intersection over union (IoU) technique is used for the purpose of object tracking. The impacts of batch size, number of iterations and learning rate are analysed. The paper finds that ResNet-101 outperforms VGG-16 significantly by 13% on test data. This finding reinforces that deeper network is better in feature extractions and generalizations. IoU is able to track multiple objects and can identify the loss of track. The processing of frames per second is found to be 5 fps. The study has implications for many computer vision applications. For example, the deep learning based object detection and tracking can either augment the capability of LiDARs and Sensors or become an alternative to them in self-driving vehicles.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call