Abstract

Abstract—Object detection is a computer vision technique that locates objects in images or videos by drawing bounding boxes around them. In this paper, we propose a system that combines deep-learning-based object detection with text-to-speech conversion. The system detects objects with a YOLO (You Only Look Once) deep learning model and uses text-to-speech (TTS) to synthesize a voice announcement for each detected object. It is built in Python with OpenCV, and Google Text-to-Speech (gTTS) converts the text into audio. We first compare variants of the YOLO algorithm and select the best one according to the results obtained by training on the COCO dataset. Once an object is detected, its name is displayed and the voice output is generated with the gTTS module. Our contribution is a visual substitution system that uses feature extraction and matching to recognize objects and provide voice feedback.

Index Terms—Object Detection, YOLO, OpenCV, Python, Google Text-to-Speech
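
The step from detected object names to a spoken announcement can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `(label, confidence)` input format, the helper names `build_announcement` and `speak`, the 0.5 confidence threshold, and the output file name are all assumptions made for illustration. The detector itself is left out; only the `gTTS(text=..., lang=...)` constructor and its `save` method come from the gTTS library.

```python
from typing import List, Tuple

# Hypothetical detection format: (class label, confidence score),
# as a YOLO post-processing step might produce it.
Detection = Tuple[str, float]


def build_announcement(detections: List[Detection], min_conf: float = 0.5) -> str:
    """Turn detections into one sentence suitable for speech synthesis."""
    # Keep confident detections only; dict.fromkeys removes duplicate
    # labels while preserving the order of first appearance.
    labels = list(dict.fromkeys(
        label for label, conf in detections if conf >= min_conf
    ))
    if not labels:
        return "No objects detected."
    return "Detected: " + ", ".join(labels) + "."


def speak(text: str, out_path: str = "announcement.mp3") -> str:
    """Synthesize text to an MP3 file with gTTS (requires network access)."""
    # Imported lazily so the text-building step runs without gTTS installed.
    from gtts import gTTS
    gTTS(text=text, lang="en").save(out_path)
    return out_path
```

In use, the label list produced by the detector for one frame would be passed to `build_announcement`, and the resulting sentence to `speak`; the saved audio file can then be played back by any audio player.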
