Understanding How Blind Users Handle Object Recognition Errors: Strategies and Challenges.

Abstract

Object recognition technologies hold the potential to support blind and low-vision people in navigating the world around them. However, the gap between benchmark performances and practical usability remains a significant challenge. This paper presents a study aimed at understanding blind users' interaction with object recognition systems for identifying and avoiding errors. Leveraging a pre-existing object recognition system, URCam, fine-tuned for our experiment, we conducted a user study involving 12 blind and low-vision participants. Through in-depth interviews and hands-on error identification tasks, we gained insights into users' experiences, challenges, and strategies for identifying errors in camera-based assistive technologies and object recognition systems. During interviews, many participants preferred independent error review, while expressing apprehension toward misrecognitions. In the error identification task, participants varied viewpoints, backgrounds, and object sizes in their images to avoid and overcome errors. Even after repeating the task, participants identified only half of the errors, and the proportion of errors identified did not significantly differ from their first attempts. Based on these insights, we offer implications for designing accessible interfaces tailored to the needs of blind and low-vision users in identifying object recognition errors.

Similar Papers
  • Research Article
  • 10.3389/conf.fninf.2013.09.00066
Logarithmic Hybrid Optical Neural Network-type systems: towards digital-optical cognitive models of robust object recognition
  • Jan 1, 2013
  • Frontiers in Neuroinformatics
  • Kypraios Ioannis

Ioannis Kypraios, Oxford University, Centre for Innovation & Enterprise, APEM Computing Lab, United Kingdom.

Abstract - By combining the complex logarithmic r-theta mapping of a space-variant imaging sensor [1, 2] with the hybrid digital-optical neural network filter and a window unit, multiple objects of the same class or of different classes can be robustly recognized. The resulting logarithmic r-theta mapping for the hybrid digital-optical neural network system, briefly referred to as the logarithmic hybrid optical neural network system, is shown to exhibit, in a single pass over the input data, simultaneous in-plane rotation, out-of-plane rotation, scale, log r-θ map translation and shift invariance, together with good clutter tolerance, by correctly recognizing the different objects within cluttered scenes. Here, we study the biologically-inspired knowledge learning and knowledge representation [3] achieved by logarithmic hybrid optical neural network-type object recognition systems (see Fig. 1 and Fig. 2). We investigate the effects that altering the knowledge representation can have on the problem's learned knowledge and on the problem-solving process overall. Further, we study the architectural designs of the logarithmic unconstrained-, logarithmic constrained-, and logarithmic modified-hybrid optical neural network systems [4, 5].
We show how the logarithmic unconstrained-hybrid optical neural network object recognition system applies an unconstrained representation of the problem's knowledge to maximize the search of solutions, how the logarithmic constrained-hybrid system uses a constrained representation of the problem's knowledge to guide the search towards certain solutions over others in the multidimensional search space, and how the logarithmic modified-hybrid system uses memory-like masks to recall certain solutions over others in the multidimensional search space.
Fig. 1. (a) Simplified human retina and visual cortex model, used for description purposes only; (b) a digital-optical computational model for the cognitive interaction between the retina and the human visual cortex with the general logarithmic hybrid optical neural network architecture. Fig. 2. Biologically-inspired knowledge learning and knowledge representation with the logarithmic hybrid optical neural network-type object recognition systems.
References: [1] C. F. R. Weiman, "Video compression via log-polar mapping", Real-Time Image Processing II, SPIE Symposium on OE/Aerospace Sensing, Vol. 1295, 266-277 (1990). [2] C. G. Ho et al., "Sensor geometry and sampling methods for space-variant image processing", Pattern Analysis and Applications, Vol. 5, 369-384 (2002). [3] I. Lee and B. Portier, "An empirical study of knowledge representation and learning within conceptual spaces for intelligent agents", 6th IEEE/ACIS International Conference on Computer and Information Science, 463-468, Melbourne, Australia (2007). [4] I. Kypraios et al., "Fully invariant complex logarithmic r-θ map for the hybrid optical neural network filter for object recognition within cluttered scenes", 50th Anniversary International Symposium ELMAR 2008, IEEE Region 8/IEEE Croatia Section/EURASIP, Zadar, Croatia, Vol. 1, 141-146 (2008). [5] I. Kypraios et al., "Logarithmic r-θ mapping for hybrid optical neural network filter for object recognition within cluttered scenes", Recent Advances in Multimedia Signal Processing and Communications, SCI 231, Springer-Verlag Berlin Heidelberg, 91-120 (2009).
Keywords: digital-optical, cognitive, object recognition, robust and adaptive systems, knowledge representation, learning and memory. Conference: Neuroinformatics 2013, Stockholm, Sweden, 27-29 Aug 2013. Presentation Type: Poster. Topic: Neuromorphic engineering. Citation: Kypraios I (2013). Logarithmic Hybrid Optical Neural Network-type systems: towards digital-optical cognitive models of robust object recognition. Front. Neuroinform. Conference Abstract: Neuroinformatics 2013. doi: 10.3389/conf.fninf.2013.09.00066. Received: 09 Apr 2013; Published Online: 11 Jul 2013. Correspondence: Dr. Ioannis Kypraios, Oxford University, Centre for Innovation & Enterprise, APEM Computing Lab, Begbroke, Oxfordshire, OX5 1PF, United Kingdom, ioanniskyp@yahoo.com.
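The complex-logarithmic r-θ mapping at the core of these systems can be sketched in a few lines. The following minimal NumPy illustration (function name and sampling parameters are our own, not from the paper) resamples an image onto a (log r, θ) grid, where scaling about the center becomes a shift along the log-r axis and in-plane rotation becomes a shift along the θ axis:

```python
import numpy as np

def log_polar_map(image, center=None, n_r=64, n_theta=64):
    """Resample a grayscale image onto a (log r, theta) grid via
    nearest-neighbour sampling."""
    h, w = image.shape
    cy, cx = center if center is not None else (h / 2.0, w / 2.0)
    # largest radius needed to cover the image from the chosen center
    r_max = np.hypot(max(cy, h - cy), max(cx, w - cx))
    out = np.zeros((n_r, n_theta), dtype=float)
    for i in range(n_r):
        r = np.exp(np.log(r_max) * (i + 1) / n_r)  # log-spaced radii
        for j in range(n_theta):
            t = 2.0 * np.pi * j / n_theta          # uniformly spaced angles
            y = int(round(cy + r * np.sin(t)))
            x = int(round(cx + r * np.cos(t)))
            if 0 <= y < h and 0 <= x < w:
                out[i, j] = image[y, x]
    return out
```

Feeding the resulting map to a shift-invariant filter is what yields the combined rotation and scale invariance the abstract describes.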

  • Research Article
  • Cited by 19
  • 10.1016/j.cogsys.2018.09.022
Novel object detection and recognition system based on points of interest selection and SVM classification
  • Oct 9, 2018
  • Cognitive Systems Research
  • R Bhuvaneswari + 1 more


  • Conference Article
  • Cited by 16
  • 10.1145/3544548.3581213
Understanding How Low Vision People Read Using Eye Tracking
  • Apr 19, 2023
  • Ru Wang + 4 more

Although low vision people can read with screen magnifiers, their reading experience is slow and unpleasant. Eye tracking has the potential to improve their experience by recognizing fine-grained gaze behaviors and providing more targeted enhancements. To inspire gaze-based low vision technology, we investigate suitable methods for collecting low vision users' gaze data via commercial eye trackers and thoroughly explore their challenges in reading based on their gaze behaviors. With an improved calibration interface, we collected the gaze data of 20 low vision participants and 20 sighted controls who performed reading tasks on a computer screen; low vision participants were also asked to read with different screen magnifiers. We found that, with an accessible calibration interface and data collection method, commercial eye trackers can collect gaze data of comparable quality from low vision and sighted people. Our study identified low vision people's unique gaze patterns during reading; building upon these, we propose design implications for gaze-based low vision technology.

  • Research Article
  • Cited by 3
  • 10.1080/10447318.2021.1952802
An Empirical Comparison between the Effects of Normal and Low Vision on Kinematics of a Mouse-Mediated Pointing Movement
  • Jul 31, 2021
  • International Journal of Human–Computer Interaction
  • Yuenkeen Cheong + 2 more

Vision problems affect many Americans today. While there are several pioneering studies that examine computer input tasks performed by people with low vision, most focus on aggregate measures of performance, such as total task time. To provide a more detailed analysis of low vision user performance, we captured the kinematics of pointing movements with the goal of determining the effect of low vision on the process of the movement. Ten participants were recruited to form a sighted and a low vision group. After controlling for the effects of age and psychomotor ability, differences in movement performance and kinematics between the two groups were compared. As expected, longer movement times were observed among low vision participants. When the movement was parsed into primary (i.e., initial phase) and secondary (i.e., homing phase) submovements, the kinematics of the primary submovement were similar for the two groups. However, low vision participants were found to spend more time in the secondary submovement. The effect of visual condition was amplified when a low vision participant had to move the cursor over longer distances. These findings suggest that for computing tasks requiring mouse-mediated pointing, task improvements focused on the secondary movement (i.e., homing phase) would benefit low vision users; improving performance during the homing phase could improve overall performance. These results could also guide the development of adaptive and individualized assistive technology that helps users acquire intended targets.

  • Conference Article
  • Cited by 86
  • 10.1145/3025453.3025949
Understanding Low Vision People's Visual Perception on Commercial Augmented Reality Glasses
  • May 2, 2017
  • Yuhang Zhao + 3 more

People with low vision have a visual impairment that affects their ability to perform daily activities. Unlike blind people, low vision people have functional vision and can potentially benefit from smart glasses that provide dynamic, always-available visual information. We sought to determine what low vision people could see on mainstream commercial augmented reality (AR) glasses, despite their visual limitations and the device's constraints. We conducted a study with 20 low vision participants and 18 sighted controls, asking them to identify virtual shapes and text in different sizes, colors, and thicknesses. We also evaluated their ability to see the virtual elements while walking. We found that low vision participants were able to identify basic shapes and read short phrases on the glasses while sitting and walking. Identifying virtual elements had a similar effect on low vision and sighted people's walking speed, slowing it down slightly. Our study yielded preliminary evidence that mainstream AR glasses can be powerful accessibility tools. We derive guidelines for presenting visual output for low vision people and discuss opportunities for accessibility applications on this platform.

  • Conference Article
  • Cited by 2
  • 10.1109/icodsa53588.2021.9617466
A Review on K-Nearest Neighbour Based Classification for Object Recognition
  • Oct 6, 2021
  • Maria Auleria + 2 more

Object recognition plays an important role in automation technology, and many methods have been proposed to perform it optimally. The KNN classifier is one of the popular classification techniques used in object recognition systems, and many studies have compared its performance against other classifiers. However, no work has reviewed the optimal implementation of the KNN classification method for achieving the best performance in object recognition systems. This paper reviews research on KNN-based classification for object recognition systems and classifies it by the type of image dataset used and the visual features extracted. A total of 25 papers are classified into 2 main categories: image datasets of objects with a cluttered background and image datasets of objects with a discarded background. The research is further classified into several subcategories: color features, shape features, texture features, edge features, corner features, and interest points. A systematic literature review is done to find the optimal implementation of the KNN classification method in object recognition systems. The results show the suitable type of object image dataset and feature extraction technique for KNN-based object recognition, and the performance of the KNN classifier in object recognition systems.
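The classifier the review centers on is simple enough to sketch directly. The following is a minimal k-nearest-neighbour prediction under Euclidean distance (the toy data and function name are illustrative, not taken from the survey):

```python
import numpy as np

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training samples under Euclidean distance."""
    dists = np.linalg.norm(train_X - query, axis=1)   # distance to each sample
    nearest = np.argsort(dists)[:k]                   # indices of the k closest
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]                  # majority label

# Toy feature vectors for two object classes
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.9, 1.0], [1.0, 0.8]])
y = np.array([0, 0, 1, 1])
```

In the systems the paper surveys, the rows of `train_X` would be the extracted visual features (color, shape, texture, etc.) rather than raw coordinates.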

  • Conference Article
  • 10.1117/12.177763
Object-oriented object recognition system
  • Jun 10, 1994
  • Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE
  • David J Garcia

Pattern recognition systems have been developed using a variety of technologies from many disciplines. With the development of new technologies in these disciplines comes the possibility of improving pre-existing recognition systems. Such is the case with applying object-oriented programming concepts from computer science to object recognition applications. Efficient object recognition imposes new requirements on the library (database) of objects: the library has to go beyond the role of a simple storage medium and provide efficient retrieval and management capabilities for the objects' information. An entity can be stored in an object structure along with its descriptive attributes or features. In identifying an unknown object, the object recognition system queries the database by passing messages that check the similarities of the unknown object to each of the objects in the database on a feature basis. The only interface the database shares with the object recognition system is this message passing, which allows flexibility in how the database processes the messages. This is only one of many advantages of using an object-oriented database in an object recognition system. An object recognition system utilizing object-oriented concepts is developed in detail.

  • Research Article
  • Cited by 3
  • 10.14714/cp101.1767
Multivalent Cartographic Accessibility: Tactile Maps for Collaborative Decision-Making
  • Feb 23, 2023
  • Cartographic Perspectives
  • Harrison Cole

Conventional visual maps present significant accessibility challenges for blind or low vision users, leaving them with few or no options for interpreting spatial data. This need not be the case: tactile maps, designed to be read through touch, have been published for more than a century. But they have most often been categorized as a navigation tool, or mere “tactile graphics” (i.e., not as expressly spatial documents). Tactile maps that allow their users to explore and synthesize thematic spatial data are rare, as are studies evaluating them. As our world continues to face existential threats that are spatial in nature—pandemics, supply chain disruptions, floods, etc.—maps will continue to provide critical information in ways that other media are unable to match. In the absence of accessible thematic maps, blind people will not only be left out of the loop, but their capacity for contributing valuable input will be severely diminished. In response, I describe here a study that evaluates the potential of thematic tactile maps for providing blind users an accessible means of analyzing spatial data when working in collaboration with sighted partners. Findings indicate that while the maps did not prove to be useful tools on their own, they did facilitate collaboration between blind or low vision participants and sighted participants. This suggests that, with some refinements, similar maps could be feasibly distributed as a means for people with visual disabilities to meaningfully participate in an otherwise inaccessible process that requires the synthesis of thematic spatial information.

  • Conference Article
  • Cited by 1
  • 10.1117/12.131736
Role of algebraic geometry in computer vision
  • Nov 1, 1992
  • Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE
  • Gabriel Taubin

In this paper we describe the geometric components of our model-based approach to 3D rigid object recognition and positioning from range data that have potential applications in Graphics, Geometric Modeling, and Computer Aided Geometric Design. As in many other object recognition systems, due to occlusion, objects are recognized and located by comparing and geometrically matching small regions of the data set with corresponding regions of known models stored in a database. In our case, a known object is represented in the database as a hierarchical collection of regions, each of them approximated by an algebraic surface. The preliminary recognition and matching is based on comparing euclidean invariant functions of the coefficients of the corresponding polynomials. The final recognition and matching is based on determining how well the data fits a stored model. Although we have not implemented a complete system yet, towards the implementation of an object recognition and position estimation system based on this structure, a number of computational problems associated with algebraic curves and surfaces have been analyzed and solved. These problems, described in this paper, are: (1) how to fit unconstrained algebraic curve and surfaces to data, (2) how to fit bounded algebraic curves and surfaces, (3) how to efficiently compute Euclidean invariants of polynomials, and (4) how to define an intrinsic coordinate system of a polynomial, generalizing the notion of center and principal axes of a nonsingular quadric curve or surface. The intrinsic coordinate system of a polynomial can be used to transform its coefficients to a normal form. The coefficients of a polynomial in normal form constitute a complete set of Euclidean invariants.
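The unconstrained fitting step (problem (1) above) is commonly posed as a linear least-squares problem. This sketch, our own illustration rather than the authors' implementation, fits an implicit quadric by taking the right singular vector of the monomial design matrix with the smallest singular value:

```python
import numpy as np

def fit_quadric(points):
    """Fit an implicit quadric  a x^2 + b y^2 + c z^2 + d xy + e yz + f xz
    + g x + h y + i z + j = 0  to an (N, 3) array of 3D points by
    minimizing the algebraic residual ||M c|| subject to ||c|| = 1."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    # one column per monomial of degree <= 2
    M = np.column_stack([x * x, y * y, z * z, x * y, y * z, x * z,
                         x, y, z, np.ones_like(x)])
    _, _, vt = np.linalg.svd(M)
    return vt[-1]  # coefficient vector of the best-fit quadric
```

For points sampled from the unit sphere, the recovered coefficients are proportional to those of x² + y² + z² - 1 = 0, which is the kind of normal-form comparison the paper's Euclidean invariants build on.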

  • Conference Article
  • Cited by 7
  • 10.1109/infop.2015.7489450
Object recognition with plain background by using ANN and SIFT based features
  • Dec 1, 2015
  • Sachin V Sinkar + 1 more

In this paper, we develop an algorithm that combines features from an object image to recognize it. Object recognition is an interesting and challenging area of research due to its importance to a wide range of applications, as objects differ in shape, size, and color. The descriptors used for feature extraction in existing object recognition methods are mostly intensity based. To recognize objects under varying levels of illumination, color descriptors are used in this paper. The color descriptors used in the object recognition system are based on the intensity levels of hue (H), saturation (S), and value (V), in terms of their histograms for the respective object image. The system works in single-object recognition mode with a plain background. The features used for recognition are mainly the size (along the boundaries) and the HSV intensity values of the object's image. The system consists of two key modules: feature extraction and object recognition. Object recognition is accomplished using two different methods, in which the extracted features of the object image are classified on the basis of artificial neural networks (ANN) and scale-invariant feature transform (SIFT) based features. We adopt the Euclidean distance metric for matching the extracted features of the object's image. Finally, results in terms of object detection and recognition rates are calculated on the basis of the false rejection ratio (FRR) and false acceptance ratio (FAR).
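The HSV-histogram descriptor and Euclidean-distance matching described above can be sketched as follows. This is our own simplification (the authors' exact bin counts and boundary-size features are not specified here): build normalized H, S, and V histograms, then match against a library by nearest Euclidean distance.

```python
import colorsys
import numpy as np

def hsv_histogram(rgb_image, bins=8):
    """Descriptor: concatenated, normalized H, S and V histograms of an
    (H, W, 3) uint8 RGB image."""
    pixels = rgb_image.reshape(-1, 3) / 255.0
    hsv = np.array([colorsys.rgb_to_hsv(r, g, b) for r, g, b in pixels])
    feats = []
    for channel in range(3):                      # H, S, V in turn
        hist, _ = np.histogram(hsv[:, channel], bins=bins, range=(0.0, 1.0))
        feats.append(hist / max(hist.sum(), 1))   # normalize to unit mass
    return np.concatenate(feats)

def match(descriptor, library):
    """Index of the library descriptor with the smallest Euclidean distance."""
    dists = [np.linalg.norm(descriptor - d) for d in library]
    return int(np.argmin(dists))
```

A plain background, as the system assumes, is what keeps a global color histogram like this discriminative.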

  • Research Article
  • 10.5565/rev/elcvia.714
Toward a perceptual object recognition system
  • Dec 21, 2015
  • ELCVIA Electronic Letters on Computer Vision and Image Analysis
  • Dounia Awad

[1] demonstrated that humans are easily able to recognize an object in less than 0.5 seconds. Unfortunately, object recognition remains one of the most challenging problems in computer vision. Many algorithms based on local approaches have been proposed in recent decades. Local approaches can be divided into 4 phases: region selection, region appearance description, image representation, and classification [2]. Although these systems have demonstrated excellent performance, some weaknesses remain. The first limitation is in the region selection phase. Many existing techniques extract a large number of points/regions of interest. For instance, dense grids contain tens of thousands of points per image, while interest point detectors often extract thousands of points. Furthermore, some studies have demonstrated that these techniques were not designed to detect the most pertinent regions for object recognition: there is only a weak correlation between the distribution of extracted points and eye fixations [3]. The second limitation mentioned in the literature concerns the region appearance description phase. The techniques used in this phase typically describe image regions using high-dimensional vectors [4]. For example, SIFT, the most popular descriptor for object recognition, produces a 128-dimensional vector per region [5]. The main objective of this thesis is to propose a pipeline for an object recognition algorithm based on human perception which addresses object recognition system complexity: query run time and memory allocation. In this context, we propose a filter based on a visual attention system [6] to address the problem of the large number of points of interest extracted by existing region selection techniques. We chose to use bottom-up visual attention systems that encode attentional fixations in a topographic map, known as a saliency map.
This map serves as the basis for generating a mask that selects salient points, according to human interest, from the points extracted by a region selection technique [7]. Furthermore, we addressed the problem of the high dimensionality of descriptors in the region appearance phase. We proposed a new hybrid descriptor representing the spatial frequency of perceptual features extracted by a visual attention system (color, texture, intensity) [8]. This descriptor consists of a concatenation of energy measures computed at the output of a filter bank [9], at each level of the multi-resolution pyramid of perceptual features, and has the advantage of being lower dimensional than traditional descriptors. Testing our filtering approach, using the Perreira da Silva system [10] as a filter on VOC2005, demonstrated that we can maintain approximately the same performance of an object recognition system while selecting only 40% of the extracted points (using Harris-Laplace [11] and Laplacian [12]), with an important reduction in complexity (40% reduction in query run time). Furthermore, evaluating our descriptor with an object recognition system using Harris-Laplace and Laplacian interest point detectors on the VOC2007 database showed a slight decrease in performance (5% reduction in average precision) compared to the original SIFT-based system, but with a 50% reduction in complexity. In addition, we evaluated our descriptor using a visual attention system as the region selection technique on VOC2005. The experiment showed a slight decrease in performance (3% reduction in precision) but a drastically reduced system complexity (5% reduction in query run time and 70% in complexity). In this thesis, we proposed two approaches to manage the problems of complexity in object recognition systems.
In future work, it would be interesting to address the last two phases of the pipeline, image representation and classification, by introducing perceptually plausible concepts such as deep learning techniques.
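The saliency-based filtering idea, keeping only the interest points that fall on the most salient locations, can be sketched in a few lines (hypothetical names and a toy map; the thesis uses the Perreira da Silva attention system to produce the saliency map, and keeps roughly 40% of the points):

```python
import numpy as np

def filter_by_saliency(points, saliency_map, keep_fraction=0.4):
    """Keep only the top `keep_fraction` of (row, col) interest points,
    ranked by the saliency value at each point's location."""
    scores = np.array([saliency_map[y, x] for (y, x) in points])
    n_keep = max(1, int(round(keep_fraction * len(points))))
    order = np.argsort(scores)[::-1][:n_keep]   # most salient first
    return [points[i] for i in order]
```

Downstream description and classification then run on this reduced point set, which is where the reported 40% query-run-time reduction comes from.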

  • Single Book
  • Cited by 10
  • 10.5772/2392
Advances in Object Recognition Systems
  • May 9, 2012
  • Ioannis Kypraios

An invariant object recognition system needs to be able to recognise an object under any usual a priori defined distortions such as translation, scaling, and in-plane and out-of-plane rotation. Ideally, the system should be able to recognise (detect and classify) any complex scene of objects even within background clutter noise. In this book, we present recent advances towards achieving fully-robust object recognition. The relation and importance of object recognition in the cognitive processes of humans and animals is described, as well as how human- and animal-like cognitive processes can be used in the design of biologically-inspired object recognition systems. Colour processing is discussed in the development of fully-robust object recognition systems. Examples of the two main categories of object recognition systems, optical correlators and pure artificial neural network architectures, are given. Finally, two example applications of object recognition are described in detail. With recent technological advancements, object recognition has become widely popular, with existing applications in medicine for the study of human learning and memory, space science and remote sensing for image analysis, mobile computing and augmented reality, the semiconductor industry, robotics and autonomous mobile navigation, public safety and urban management solutions, and many others. This book is a "must-read" for everyone with a core or wider interest in this "hot" area of cutting-edge research.

  • Research Article
  • Cited by 1
  • 10.1145/3750054
A Multi-Method Investigation of Guide Robot Characteristics for Blind and Low Vision Users
  • Jul 22, 2025
  • ACM Transactions on Human-Robot Interaction
  • Katherine Shih + 3 more

Guide robots have the potential to improve the experience of independent travel for people who are blind or have low vision. While many technical challenges for robot guides have been studied, open questions about robot behaviors and features remain. We conducted a two-phase user study with 16 blind and low-vision participants. First, we tested whether robot path planners that account for specific orientation cues affect users’ comfort and spatial awareness. Then, we elicited user preferences for guide robots through semi-structured interviews and scenario-based design sessions. We provide insights regarding desired robot behaviors and features that impact robot usability, human agency vs. autonomy, and perceived safety.

  • Research Article
  • Cited by 1
  • 10.1167/jov.23.15.18
Invited Session IV: Extended reality--applications in vision science and beyond: Augmented reality systems for people with low vision.
  • Dec 1, 2023
  • Journal of vision
  • Yuhang Zhao

Low vision is a visual impairment that falls short of blindness but cannot be corrected by eyeglasses or contact lenses. While current low vision aids (e.g., magnifier, CCTV) support basic vision enhancements, such as magnification and contrast enhancement, these enhancements often arbitrarily alter a user's full field of view without considering the user's context, such as their visual abilities, tasks, and environmental factors. As a result, these low vision aids are not sufficient or preferred by low vision users in many important tasks. Augmented reality (AR) technology presents a unique opportunity to enhance low vision people's visual experience by automatically recognizing the surrounding environment and presenting tailored visual augmentations. In this talk, I will talk about how we design and build intelligent AR systems to support low vision people in visual tasks, such as a head-mounted AR system that presents visual cues to orient users' attention in a visual search task, as well as a projection-based AR system that projects visual highlights on the stair edges to support safe stair navigation. I will conclude my talk by discussing our future research direction on AR for low vision accessibility.

  • Research Article
  • Cited by 5
  • 10.1145/2435227.2435229
A novel low-power embedded object recognition system working at multi-frames per second
  • Mar 1, 2013
  • ACM Transactions on Embedded Computing Systems
  • Antonis Nikitakis + 2 more

One very important challenge in the field of multimedia is the implementation of fast and detailed Object Detection and Recognition systems. In particular, in the current state-of-the-art mobile multimedia systems, it is highly desirable to detect and locate certain objects within a video frame in real time. Although a significant number of Object Detection and Recognition schemes have been developed and implemented, triggering very accurate results, the vast majority of them cannot be applied in state-of-the-art mobile multimedia devices; this is mainly due to the fact that they are highly complex schemes that require a significant amount of processing power, while they are also time consuming and very power hungry. In this article, we present a novel FPGA-based embedded implementation of a very efficient object recognition algorithm called Receptive Field Cooccurrence Histograms Algorithm (RFCH). Our main focus was to increase its performance so as to be able to handle the object recognition task of today's highly sophisticated embedded multimedia systems while keeping its energy consumption at very low levels. Our low-power embedded reconfigurable system is at least 15 times faster than the software implementation on a low-voltage high-end CPU, while consuming at least 60 times less energy. Our novel system is also 88 times more energy efficient than the recently introduced low-power multi-core Intel devices which are optimized for embedded systems. This is, to the best of our knowledge, the first system presented that can execute the complete complex object recognition task at a multi frame per second rate while consuming minimal amounts of energy, making it an ideal candidate for future embedded multimedia systems.
