Abstract. Around 295 million individuals worldwide live with moderate to severe vision impairment, struggling with daily activities and depending heavily on others for assistance. Leveraging augmented reality (AR) and artificial intelligence (AI), we developed SOVAR, a mobile application that enables greater independence for visually impaired individuals in their daily lives. SOVAR comprises two modules: navigation and scene understanding. Navigation involves two phases: mapping and guidance. During mapping, SOVAR builds and optimizes maps with key locations labeled by users via voice input. During guidance, SOVAR plans a path and guides users to requested key locations, providing visual and audio assistance and real-time obstacle avoidance. The scene understanding module uses a Large Vision Language Model (LVLM) to assist users through image captioning and visual question answering. In our user study of navigation, participants successfully reached key locations from three separate starting points in 86.67% of trials without intervention, and the success rate improved as users became more familiar with the application. For scene understanding, participants assisted by the LVLM answered visual questions with 100% accuracy. These promising results demonstrate the effectiveness of AR and AI for visual assistance and indicate their potential impact on assistive technologies more broadly.