Abstract

There is increasing interest in using computer vision and machine learning to enhance human decision making with computer-mediated assistive vision systems. In particular, retinal implants are a rapidly advancing technology offering individuals suffering vision loss due to retinal dystrophies an opportunity to restore partial vision. However, the visual representations achievable with current and near-term implants are severely limited in resolution and contrast, placing high importance on the selection of visual features to convey via the implant. Vision processing algorithms applied to camera-captured input can enhance functional outcomes with such devices. To this end, we propose a novel end-to-end vision processing pipeline for prosthetic vision that learns task-salient visual filters offline in simulation via deep reinforcement learning (DRL). Once learnt, these filters are deployable on a prosthetic vision device to process camera-captured images and produce task-guiding scene representations in real time. We show how a set of learnt visual features enabling a virtual agent to optimally perform the task of navigation in a 3-D environment can be extracted and applied to enhance the same features in real-world images. We evaluate and validate our proposed approach quantitatively and qualitatively using simulated prosthetic vision. To our knowledge, this is the first application of DRL to the derivation of scene representations for human-centric computer-mediated displays such as prosthetic vision devices.
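To make the deployment stage concrete, the sketch below illustrates one way filters learnt offline (e.g. the first convolutional layer of a DRL navigation agent) might be applied to a camera frame and reduced to a low-resolution, quantised representation of the kind a retinal implant could display. This is a minimal illustration only; the filter shapes, the 32x32 electrode grid and the 8 intensity levels are assumed values for the example, not details taken from the paper.

```python
# Minimal sketch: apply learnt task-salient filters to a camera frame and
# reduce the result to a low-resolution, low-contrast scene representation.
# Grid size and intensity levels are illustrative assumptions.
import torch
import torch.nn.functional as F

def task_salient_representation(frame: torch.Tensor,
                                filters: torch.Tensor,
                                grid_size: int = 32,
                                levels: int = 8) -> torch.Tensor:
    """frame:   (1, 1, H, W) greyscale camera image in [0, 1]
       filters: (K, 1, h, w) convolution kernels learnt in simulation
       returns: (grid_size, grid_size) quantised intensity map."""
    # Convolve the frame with each learnt filter; keep positive responses.
    responses = F.relu(F.conv2d(frame, filters, padding="same"))
    # Aggregate per-filter responses into a single task-saliency map.
    saliency = responses.max(dim=1, keepdim=True).values
    # Downsample to the implant's (assumed) electrode grid resolution.
    low_res = F.adaptive_avg_pool2d(saliency, (grid_size, grid_size))
    # Normalise and quantise to a limited number of stimulation levels.
    low_res = (low_res - low_res.min()) / (low_res.max() - low_res.min() + 1e-8)
    return torch.round(low_res * (levels - 1)).squeeze()

# Usage example: 16 random 8x8 kernels stand in for filters learnt via DRL.
filters = torch.randn(16, 1, 8, 8)
frame = torch.rand(1, 1, 240, 320)
phosphene_map = task_salient_representation(frame, filters)
print(phosphene_map.shape)  # torch.Size([32, 32])
```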
