Abstract

According to the WHO, at least 2.2 billion people worldwide live with a vision impairment or blindness. Although society labels them as disabled, they have rarely benefited from the development of technology and the introduction of new products. This motivated my idea of constructing a medium that transfers visual information into audio form, giving this disadvantaged group access to a kind of “visual experience”. Achieving this goal requires solving an essential problem: how can one establish a technological formula that converts visual input into audio output? By extracting visual features from paintings, I build my own dataset around a set of core features that capture the mood of a painting. I then train a Convolutional Neural Network (CNN) to link the visual and audio information, introducing loss functions to minimize the error of the model. Once the mood of a picture is obtained, I use it to synthesize a definite sound in which that “mood” is preserved, whether joy, sorrow, or any artistic theme shared by both visual and audio experience. Toward the end of this paper, several experiments test the efficacy of the project. The first part addresses the functional efficacy of my AI system. The second tests my product hypothesis: I present two separate participants with one paired visual input and audio output, then check whether they recognize the same theme inherent in the corresponding media. Their successfully identifying the same “mood” would testify to the project’s goal of transferring visual experience into audio.
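To make the described pipeline concrete, below is a minimal sketch in Python/PyTorch of a mood-classification CNN trained with a cross-entropy loss, followed by a hand-chosen mapping from predicted mood to simple synthesis parameters. All names (MoodCNN, MOOD_TO_SOUND, mood_to_waveform), the mood labels, and the parameter values are hypothetical illustrations under these assumptions, not the implementation used in the paper.

```python
# Hypothetical sketch of the visual-to-audio pipeline: a small CNN predicts a
# painting's mood, and the mood is rendered as a simple tone.
import torch
import torch.nn as nn
import numpy as np

MOODS = ["joy", "sorrow", "calm", "tension"]  # assumed mood labels

class MoodCNN(nn.Module):
    """Small convolutional network mapping an RGB painting to a mood class."""
    def __init__(self, num_moods: int = len(MOODS)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # pool to a 32-dim feature vector
        )
        self.classifier = nn.Linear(32, num_moods)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

# Cross-entropy plays the role of the loss function minimizing the error
# between predicted and labelled moods.
model = MoodCNN()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(images: torch.Tensor, mood_labels: torch.Tensor) -> float:
    """One gradient step on a batch of (N, 3, H, W) images and (N,) labels."""
    optimizer.zero_grad()
    loss = criterion(model(images), mood_labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# Illustrative placeholder mapping: each mood picks a base frequency (Hz)
# and a tempo factor for the rendered sound.
MOOD_TO_SOUND = {
    "joy": (440.0, 1.5),
    "sorrow": (220.0, 0.6),
    "calm": (330.0, 0.8),
    "tension": (523.0, 1.2),
}

def mood_to_waveform(mood: str, seconds: float = 2.0, rate: int = 44100) -> np.ndarray:
    """Render the predicted mood as a sine tone so the result can be heard."""
    freq, tempo = MOOD_TO_SOUND[mood]
    t = np.linspace(0.0, seconds, int(seconds * rate), endpoint=False)
    return np.sin(2 * np.pi * freq * tempo * t).astype(np.float32)
```

In this sketch, the trained classifier supplies the mood label and the mapping table supplies the sound; a fuller system would replace the sine tone with richer synthesis while keeping the same two-stage structure.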
