Audio description (AD) serves as a vital means to make visual media accessible to non-sighted and visually impaired audiences. This study systematically investigates the impact of narrative specificity and voice quality on imageability and comprehension in both sighted and non-sighted populations. Twenty non-sighted participants, including congenitally blind individuals and those who lost their sight early in life, were compared with a group of 20 sighted participants, matched for verbal working memory capabilities. Participants listened to 50 short event descriptions, describing spatiotemporal relations with varying levels of narrative specificity, presented in both typical and dysphonic voices. After each event description, participants rated their ability to imagine the content, overall comprehension, listening effort, and listening enjoyment. Results indicate that high narrative specificity enhanced imageability in non-sighted individuals, especially for scenarios involving changes in motion, and, to some extent, for visuospatial relations, irrespective of sightedness. Additionally, dysphonic voices increased listening effort and reduced enjoyment for non-sighted participants only. These findings underscore the importance of considering voice quality and narrative specificity in AD for non-sighted users and have implications for both professional audio describers and the development of automated AD systems.

Lay summary

Imagine you're watching a movie but instead of seeing the scenes, you're listening to someone describe them to you. This is what audio description (AD) does - it makes movies, TV shows, and other visual media accessible for people who can't see or have trouble seeing. But not all descriptions are created equal. Think about the difference between someone telling you "a person walks into a room" versus "a tall, anxious man in a red shirt bursts into a sunlit, cluttered room, glancing over his shoulder."
The second description paints a much clearer picture in your mind, doesn't it? This study looked at how specific these descriptions are (like our detailed scene above) and the quality of the voice telling the story, to see how they affect people's ability to imagine and understand what's being described. We worked with 40 people - half of whom have never been able to see or lost their sight when they were very young, and the other half who can see. Everyone listened to 50 short stories about different events. These stories varied in how detailed they were and were told in either a clear voice or a voice that was hard to listen to (a hoarse voice). After hearing each story, participants judged how well they could picture what was described, how well they understood it, how hard they had to work to listen, and how much they enjoyed listening. The results were interesting. When the stories were really detailed, people who couldn't see were better able to "see" the action in their minds, especially if the story involved movement or where things were located. This was true for everyone, regardless of whether they could see or not. But when the voice telling the story was hard to listen to, it made it tougher for people who couldn't see to follow along and enjoy the story. What this tells us is that for audio descriptions to be really helpful and enjoyable for everyone, especially those who rely on them, it's important to choose not only the right words but also the right voice. This insight is valuable not just for people who create audio descriptions but also for developing technology that can automatically generate them. Making movies and TV shows more enjoyable and accessible for everyone is the goal, and we hope that this study helps get us there.

Declaration of use of AI: ChatGPT4 was used to construct this lay text based on the abstract from the original manuscript.