Computer Vision and Conflicting Values: Describing People with Automated Alt Text

TL;DR

This paper examines ethical issues in automated alt text generation, focusing on Facebook's policies regarding identity categories and contrasting them with museum practices. It highlights the normative dilemmas and concludes that avoiding these conflicts remains challenging.

Abstract

Scholars have recently drawn attention to a range of controversial issues posed by the use of computer vision for automatically generating descriptions of people in images. Despite these concerns, automated image description has become an important tool to ensure equitable access to information for blind and low vision people. In this paper, we investigate the ethical dilemmas faced by companies that have adopted the use of computer vision for producing alt text: textual descriptions of images for blind and low vision people. We use Facebook's automatic alt text tool as our primary case study. First, we analyze the policies that Facebook has adopted with respect to identity categories, such as race, gender, and age, and the company's decisions about whether to present these terms in alt text. We then describe an alternative, manual approach practiced in the museum community, focusing on how museums determine what to include in alt text descriptions of cultural artifacts. We compare these policies, using notable points of contrast to develop an analytic framework that characterizes the particular apprehensions behind these policy choices. We conclude by considering two strategies that seem to sidestep some of these concerns, finding that there are no easy ways to avoid the normative dilemmas posed by the use of computer vision to automate alt text.

Similar Papers
  • Research Article
  • Cited by 22
  • 10.1145/3555570
Understanding Emerging Obfuscation Technologies in Visual Description Services for Blind and Low Vision People
  • Nov 7, 2022
  • Proceedings of the ACM on Human-Computer Interaction
  • Rahaf Alharbi + 2 more

Blind and low vision people use visual description services (VDS) to gain visual interpretation and build access in a world that privileges sight. Despite their many benefits, VDS have many harmful privacy and security implications. As a result, researchers are suggesting, exploring, and building obfuscation systems that detect and obscure private or sensitive materials. However, as obfuscation depends largely on sight to interpret outcomes, it is unknown whether Blind and low vision people would find such approaches useful. Our work aims to center the perspectives and opinions of Blind and low vision people on the potential of obfuscation to address privacy concerns in VDS. By reporting on interviews with 20 Blind and low vision people who use VDS, our findings reveal that popular research trends in obfuscation fail to capture the needs of Blind and low vision people. While obfuscation might be helpful in gaining more control, tensions around obfuscation misrecognition and confirmation are prominent. We turn to the framework of interdependence to unpack and understand obfuscation in VDS, enabling us to complicate privacy concerns, uncover the labor of Blind and low vision people, and emphasize the importance of safeguards. We provide design directions to move the trajectory of obfuscation research forward.

  • Conference Article
  • Cited by 52
  • 10.1145/3441852.3471207
Designing Tools for High-Quality Alt Text Authoring
  • Oct 17, 2021
  • Kelly Mack + 3 more

Alternative (alt) text provides access to descriptions of digital images for people who use screen readers. While prior work studied screen reader users' (SRUs') preferences about alt text and automatic alt text (i.e., alt text generated by artificial intelligence), little work examined the alt text author's experience composing or editing these descriptions. We built two types of prototype interfaces for two tasks: authoring alt text and providing feedback on automatic alt text. Through combined interview-usability testing sessions with alt text authors and interviews with SRUs, we tested the effectiveness of our prototypes in the context of Microsoft PowerPoint. Our results suggest that authoring interfaces that support authors in choosing what to include in their descriptions result in higher quality alt text. The feedback interfaces highlighted considerable differences in the perceptions of authors and SRUs regarding "high-quality" alt text. Finally, authors crafted significantly lower quality alt text when starting from the automatic alt text compared to starting from a blank box. We discuss the implications of these results on applications that support alt text.

  • Conference Article
  • Cited by 23
  • 10.1145/3517428.3544796
A Dataset of Alt Texts from HCI Publications
  • Oct 22, 2022
  • Sanjana Shivani Chintalapati + 2 more

Figures in scientific publications contain important information and results, and alt text is needed for blind and low vision readers to engage with their content. We conduct a study to characterize the semantic content of alt text in HCI publications based on a framework introduced by Lundgard and Satyanarayan. Our study focuses on alt text for graphs, charts, and plots extracted from HCI and accessibility publications; we focus on these communities due to the lack of alt text in papers published outside of these disciplines. We find that the capacity of author-written alt text to fulfill blind and low vision user needs is mixed; for example, only 50% of alt texts in our sample contain information about extrema or outliers, and only 31% contain information about major trends or comparisons conveyed by the graph. We release our collected dataset of author-written alt text, and outline possible ways that it can be used to develop tools and models to assist future authors in writing better alt text. Based on our findings, we also discuss recommendations that can be acted upon by publishers and authors to encourage inclusion of more types of semantic content in alt text.

  • Conference Article
  • Cited by 8
  • 10.1145/3586182.3616646
A Multi-modal Toolkit to Support DIY Assistive Technology Creation for Blind and Low Vision People
  • Oct 29, 2023
  • Liwen He + 4 more

We design and build A11yBits, a tangible toolkit that empowers blind and low vision (BLV) people to easily create personalized do-it-yourself assistive technologies (DIY-ATs). A11yBits includes (1) a series of Sensing modules to detect both environmental information and user commands, (2) a set of Feedback modules to send multi-modal feedback, and (3) two Base modules (Sensing Base and Feedback Base) to power and connect the sensing and feedback modules. The toolkit enables accessible and easy assembly via a “plug-and-play” mechanism. BLV users can select and assemble their preferred modules to create personalized DIY-ATs.

  • Conference Article
  • Cited by 11
  • 10.1145/3597638.3614494
Practices and Barriers of Cooking Training for Blind and Low Vision People
  • Oct 22, 2023
  • Ru Wang + 5 more

Cooking is a vital yet challenging activity for blind and low vision (BLV) people, which involves many visual tasks that can be difficult and dangerous. BLV training services, such as vision rehabilitation, can effectively improve BLV people’s independence and quality of life in daily tasks, such as cooking. However, there is a lack of understanding on the practices employed by the training professionals and the barriers faced by BLV people in such training. To fill the gap, we interviewed six professionals to explore their training strategies and technology recommendations for BLV clients in cooking activities. Our findings revealed the fundamental principles, practices, and barriers in current BLV training services, identifying the gaps between training and reality.

  • Conference Article
  • Cited by 22
  • 10.1145/3613904.3642238
“It’s Kind of Context Dependent”: Understanding Blind and Low Vision People’s Video Accessibility Preferences Across Viewing Scenarios
  • May 11, 2024
  • Lucy Jiang + 4 more

While audio description (AD) is the standard approach for making videos accessible to blind and low vision (BLV) people, existing AD guidelines do not consider BLV users’ varied preferences across viewing scenarios. These scenarios range from how-to videos on YouTube, where users seek to learn new skills, to historical dramas on Netflix, where a user’s goal is entertainment. Additionally, the increase in video watching on mobile devices provides an opportunity to integrate nonverbal output modalities (e.g., audio cues, tactile elements, and visual enhancements). Through a formative survey and 15 semi-structured interviews, we identified BLV people’s video accessibility preferences across diverse scenarios. For example, participants valued action and equipment details for how-to videos, tactile graphics for learning scenarios, and 3D models for fantastical content. We define a six-dimensional video accessibility design space to guide future innovation and discuss how to move from “one-size-fits-all” paradigms to scenario-specific approaches.

  • Conference Article
  • Cited by 2
  • 10.1145/3544549.3585819
“Unfold and Go Touch”: A Portable Method for Making Existing Touchscreens Accessible to Blind and Low Vision People in Self-Service Terminals
  • Apr 19, 2023
  • Weiyue Lin + 3 more

Self-service terminals (SSTs) are almost everywhere in daily life and increasingly use capacitive and infrared touchscreens as their interface. Most current solutions that help blind and low vision (BLV) people access existing touchscreens are suitable only for capacitive touchscreens, not infrared ones. In this paper, we propose a voice-based interactive method that uses a conductive folding stand with the phone camera to allow BLV people to access both kinds of SST touchscreens. Voice feedback guides users to move the phone close to the button and touch it with the end of the unfolded stand. Using a portable accessory, this method directly guides users to touch the target and effectively avoids false triggering. A preliminary evaluation indicated that our approach enabled users to access the target buttons on the touchscreen with high accuracy and a short completion time.

  • Research Article
  • 10.1145/3770654
TapNav: Adaptive Spatiotactile Screen Readers for Tactually Guided Touchscreen Interactions for Blind and Low Vision People
  • Dec 2, 2025
  • Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
  • Ricardo E Gonzalez Penuela + 3 more

Screen readers are audio-based software that Blind and Low Vision (BLV) people use to interact with computing devices, such as tablets and smartphones. Although this technology has significantly improved the accessibility of touchscreen devices, the sequential nature of audio limits the bandwidth of information users can receive and process. We introduce TapNav, an adaptive spatiotactile screen reader prototype developed to interact with touchscreen interfaces spatially. TapNav's screen reader provides adaptive auditory feedback that, in combination with a tactile overlay, conveys spatial information and location of interface elements on-screen. We evaluated TapNav with 12 BLV users who interacted with TapNav to explore a data visualization and interact with a bank transactions application. Our qualitative findings show that touch points and spatially constrained navigation helped users anticipate outcomes for faster exploration, and offload cognitive load to touch. We provide design guidelines for creating tactile overlays for adaptive spatiotactile screen readers and discuss their generalizability beyond our exploratory data analysis and everyday application navigation scenarios.

  • Book Chapter
  • Cited by 1
  • 10.4324/9781003171935-28
Alt Text as Poetry Project
  • Jul 21, 2022
  • Bojana Coklyat + 1 more

Alt text is a system of text descriptions built into websites and other online platforms to make visual content accessible to blind people, those with low vision, and those living with certain cognitive disabilities. It is often overlooked altogether or understood through the lens of compliance, resulting in text that is written in a reluctant, perfunctory style. However, alt text is an essential part of web accessibility and has tremendous expressive potential; it can be written creatively and generously, centering disability culture. This chapter will discuss the artistic project Alt Text as Poetry by Bojana Coklyat and Shannon Finnegan, which centers around the question of how we can make spaces and experiences that disabled people not only can access but want to access. This chapter reframes alt text as a type of creative practice by elaborating on the methods of creative image description and providing examples of others who are a part of this ecosystem.

  • Research Article
  • Cited by 14
  • 10.1145/3167902.3167905
Designing smartglasses applications for people with low vision
  • Nov 27, 2017
  • ACM SIGACCESS Accessibility and Computing
  • Shiri Azenkot + 1 more

While our community has many active projects involving blind people, low vision is rarely addressed. People with low vision have functional vision, but their visual impairment adversely affects their daily life and it cannot be corrected with glasses or contact lenses. Over the last few years, we have been conducting research with this understudied demographic: understanding low vision people's needs and designing applications to address the challenges they face. In this article, we discuss our ongoing research in this area, focusing on designing augmented reality applications for low vision users. We begin this article by describing low vision and motivating our focus on augmented reality applications on smartglasses for low vision people. We then provide overviews of three research projects that exemplify our research agenda: a study where we observed low vision people conducting a navigation and shopping task, a study where we examined low vision people's perception of virtual text and shapes on smartglasses, and the design of a smartglasses application that facilitates a visual search task.

  • Conference Article
  • Cited by 89
  • 10.1145/3025453.3025949
Understanding Low Vision People's Visual Perception on Commercial Augmented Reality Glasses
  • May 2, 2017
  • Yuhang Zhao + 3 more

People with low vision have a visual impairment that affects their ability to perform daily activities. Unlike blind people, low vision people have functional vision and can potentially benefit from smart glasses that provide dynamic, always-available visual information. We sought to determine what low vision people could see on mainstream commercial augmented reality (AR) glasses, despite their visual limitations and the device's constraints. We conducted a study with 20 low vision participants and 18 sighted controls, asking them to identify virtual shapes and text in different sizes, colors, and thicknesses. We also evaluated their ability to see the virtual elements while walking. We found that low vision participants were able to identify basic shapes and read short phrases on the glasses while sitting and walking. Identifying virtual elements had a similar effect on low vision and sighted people's walking speed, slowing it down slightly. Our study yielded preliminary evidence that mainstream AR glasses can be powerful accessibility tools. We derive guidelines for presenting visual output for low vision people and discuss opportunities for accessibility applications on this platform.

  • Conference Article
  • Cited by 3
  • 10.17210/hcik.2016.01.198
A Study on Improving Assistive Devices for People with Low Vision: Focusing on the Use of Smart Assistive Devices and Applications
  • Jan 27, 2016
  • Ickpyo Oh + 5 more

People with low vision, who make up more than 88% of visually impaired people, want to use their residual vision and do not want to look disabled. However, many assistive devices for low vision are suited to indoor use, and because using an assistive device exposes a person's disability, people are reluctant to use them. Many low vision people therefore want to use a smartphone to solve these problems, but current smartphone functions are not sufficient. In this study, we propose smart assistive software and a device that let low vision people use their residual vision as much as possible without feeling self-conscious. To that end, we interviewed low vision experts and low vision people using qualitative research methods. Based on the results, we present a solution and propose EYESEE, an assistive device and application for low vision people.

  • Conference Article
  • Cited by 18
  • 10.1145/3544548.3581213
Understanding How Low Vision People Read Using Eye Tracking
  • Apr 19, 2023
  • Ru Wang + 4 more

While being able to read with screen magnifiers, low vision people have slow and unpleasant reading experiences. Eye tracking has the potential to improve their experience by recognizing fine-grained gaze behaviors and providing more targeted enhancements. To inspire gaze-based low vision technology, we investigate the suitable method to collect low vision users’ gaze data via commercial eye trackers and thoroughly explore their challenges in reading based on their gaze behaviors. With an improved calibration interface, we collected the gaze data of 20 low vision participants and 20 sighted controls who performed reading tasks on a computer screen; low vision participants were also asked to read with different screen magnifiers. We found that, with an accessible calibration interface and data collection method, commercial eye trackers can collect gaze data of comparable quality from low vision and sighted people. Our study identified low vision people’s unique gaze patterns during reading, building upon which, we propose design implications for gaze-based low vision technology.

  • Research Article
  • Cited by 6
  • 10.1145/3458055.3458061
Accessible interactive 3D models for blind and low-vision people
  • Jan 1, 2021
  • ACM SIGACCESS Accessibility and Computing
  • Samuel Reinders

Blind and low-vision (BLV) people experience difficulty accessing graphical information, particularly regarding travel and education. Tactile diagrams and 3D printed models can improve access to graphical information for BLV people; however, these formats only allow limited detailed and contextual information. Interactive 3D printed models (I3Ms) exist, but many rely on passive audio labels that don't particularly empower BLV people in independent knowledge building and interpretation. This project investigates the creation of I3Ms that offer more engaging experiences with a focus on facilitating independent exploration and knowledge discovery. Specifically, this project explores how BLV people want to interact with I3Ms, interactive functionalities and behaviours that I3Ms should support, such as conversational interfaces and model agency, and to understand the relationship between I3Ms and conventional accessible graphics.

  • Research Article
  • Cited by 2
  • 10.1145/3178412.3178421
Using direct visual augmentation to provide people with low vision equal access to information
  • Jan 9, 2018
  • ACM SIGACCESS Accessibility and Computing
  • Yuhang Zhao

Low vision is a visual impairment that cannot be corrected with eyeglasses or contact lenses. Low vision people have functional vision and prefer using that vision instead of relying on audition and touch. Existing approaches to low vision accessibility enhance people's vision using simple "signal-to-signal" techniques that do not take into account the user's context. There is thus a major gap between low vision people's needs and existing low vision technologies. My doctoral research aims to address this gap by augmenting low vision people's visual experience with direct and optimal visual feedback based on the user's context. I will design and study novel methods for visual augmentation, which involves visual feedback beyond simple enhancements. My research considers two dimensions: visual condition and task. By understanding the visual perception of people with different visual abilities and exploring their needs in different visual tasks, I will design applications with visual feedback that is optimal for the specific context to maximize people's access to information. My research will yield design insights and novel applications for people with all visual abilities.
