Abstract

This project develops an object detection system that uses audio input and output to engage users and provide detailed product information. The system applies deep learning to identify objects in real time and announces each object's name as audio output. To let users interact with it verbally, it also incorporates Natural Language Processing (NLP), which interprets and processes speech to identify the object. An online search module then provides descriptions of the object's attributes and potential applications. The system is tested and optimized for accuracy and responsiveness. It is intended to assist people with low vision or other disabilities, offering a practical, time-saving way to get instant information about objects. By making object recognition and information retrieval a smooth and inclusive experience for all users, this audio-enabled system is a first step towards bridging the gap between artificial intelligence and human-technology interaction.

Key Words: Object Detection, Natural Language Processing, YOLOv4, Google Gemini, MeaningCloud, Wikipedia library
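
The flow described in the abstract (detect an object, announce its name, interpret a spoken request, look up a description) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Detection`, `top_detection`, `match_spoken_object`, and `run_pipeline` are hypothetical, the 0.5 confidence threshold is arbitrary, and the detector, text-to-speech, speech-to-text, and search steps are passed in as plain callables standing in for components such as YOLOv4 or the Wikipedia library.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class Detection:
    """One detected object: a class label and the detector's confidence."""
    label: str
    confidence: float


def top_detection(detections: List[Detection],
                  min_confidence: float = 0.5) -> Optional[Detection]:
    """Return the highest-confidence detection at or above the threshold."""
    candidates = [d for d in detections if d.confidence >= min_confidence]
    return max(candidates, key=lambda d: d.confidence) if candidates else None


def match_spoken_object(transcript: str, labels: List[str]) -> Optional[str]:
    """Crude stand-in for the NLP step: return the first label whose words
    all occur in the transcript, or None if no label is mentioned."""
    words = set(re.findall(r"[a-z0-9]+", transcript.lower()))
    for label in labels:
        label_words = set(re.findall(r"[a-z0-9]+", label.lower()))
        if label_words and label_words <= words:
            return label
    return None


def run_pipeline(detect: Callable[[Any], List[Detection]],
                 speak: Callable[[str], None],
                 listen: Callable[[], str],
                 lookup: Callable[[str], str],
                 frame: Any) -> Optional[str]:
    """Detect, announce, listen for a spoken request, then describe."""
    detections = detect(frame)            # assumed: wraps the object detector
    best = top_detection(detections)
    if best is None:
        speak("No object detected.")
        return None
    speak(f"Detected {best.label}.")      # audio output of the object's name
    transcript = listen()                 # assumed: wraps speech-to-text
    labels = [d.label for d in detections]
    # Fall back to the top detection if the request names no known object.
    target = match_spoken_object(transcript, labels) or best.label
    description = lookup(target)          # assumed: wraps the online search
    speak(description)
    return description
```

Passing the four stages in as callables keeps the sketch independent of any particular detector, speech engine, or search backend, so each stage can be swapped or tested in isolation.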

