Abstract
Natural language descriptions of user interface (UI) elements such as alternative text are crucial for accessibility and for language-based interaction in general. Yet, these descriptions are frequently missing in mobile UIs. We propose widget captioning, a novel task for automatically generating language descriptions for UI elements from multimodal input, including both the image and the structural representations of user interfaces. We collected a large-scale dataset for widget captioning with crowdsourcing. Our dataset contains 162,859 language phrases created by human workers for annotating 61,285 UI elements across 21,750 unique UI screens. We thoroughly analyze the dataset, and train and evaluate a set of deep model configurations to investigate how each feature modality, as well as the choice of learning strategies, impacts the quality of predicted captions. The task formulation, the dataset, and our benchmark models together contribute a solid basis for this novel multimodal captioning task that connects language and user interfaces.
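To make the multimodal setup concrete, below is a minimal illustrative sketch, not the paper's actual architecture: it fuses a CNN encoding of the element's screenshot crop with an embedding of structural properties from the view hierarchy, then decodes a caption with an LSTM. All layer sizes, feature dimensions, and names (e.g. struct_feat_dim, WidgetCaptioner) are assumptions made for the example.

```python
# Hedged sketch of a multimodal widget-captioning model (illustrative only).
import torch
import torch.nn as nn


class WidgetCaptioner(nn.Module):
    def __init__(self, vocab_size, struct_feat_dim=16, hidden_dim=256):
        super().__init__()
        # Image branch: encode pixels of the UI element (e.g. a 64x64 crop).
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, hidden_dim),
        )
        # Structure branch: encode view-hierarchy properties (element type,
        # bounds, clickability, ...) flattened into a fixed-size vector.
        self.struct_encoder = nn.Sequential(
            nn.Linear(struct_feat_dim, hidden_dim), nn.ReLU(),
        )
        # Fuse both modalities into the decoder's initial hidden state.
        self.fuse = nn.Linear(2 * hidden_dim, hidden_dim)
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        self.decoder = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, image, struct_feats, caption_tokens):
        img = self.image_encoder(image)             # (B, H)
        st = self.struct_encoder(struct_feats)      # (B, H)
        h0 = torch.tanh(self.fuse(torch.cat([img, st], dim=-1))).unsqueeze(0)
        c0 = torch.zeros_like(h0)
        emb = self.embed(caption_tokens)            # (B, T, H)
        dec, _ = self.decoder(emb, (h0, c0))
        return self.out(dec)                        # (B, T, vocab)


# Smoke test with random tensors standing in for a batch of UI elements.
model = WidgetCaptioner(vocab_size=1000)
logits = model(torch.randn(2, 3, 64, 64), torch.randn(2, 16),
               torch.randint(0, 1000, (2, 8)))
print(logits.shape)  # torch.Size([2, 8, 1000])
```

The sketch only illustrates the general idea of combining image and structural modalities before decoding; the paper's benchmark models and feature extraction differ in their details.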
Highlights
Mobile apps come with a rich and diverse set of design styles that are often more graphical and unconventional than those of traditional desktop applications
A novel task to automatically generate captions for user interface (UI) elements based on their visual appearance, structural properties, and context
We propose widget captioning as a task for automatically generating language descriptions for UI elements in mobile user interfaces; the task raises unique challenges for modeling and extends the popular image captioning task to the user interface domain
Summary
Mobile apps come with a rich and diverse set of design styles that are often more graphical and unconventional than those of traditional desktop applications. Language descriptions of user interface (UI) elements, which we refer to as widget captions, are a precondition for many aspects of mobile UI usability and enable many language-based interaction capabilities on mobile UIs. A significant portion of mobile apps today lack widget captions in their user interfaces, a problem that has stood out as a primary issue for mobile accessibility (Ross et al., 2018, 2017). More than half of image-based elements have missing captions (Ross et al., 2018). Beyond image-based ones, our analysis of a UI corpus showed that a wide range of elements have missing captions. Existing tools for examining and fixing missing captions (AccessibilityScanner, 2019; AndroidLint, 2019; Zhang et al., 2018, 2017; Choo et al., 2019) require developers to manually compose a language description for each element, which imposes a substantial overhead on developers.