Abstract
Recent technologies have achieved notable progress. Since the main goal of technology is to make daily life easier, we investigate the development of an intelligent system to assist impaired people in their navigation. For visually impaired people, navigation is a complex task that requires assistance. To reduce the complexity of this task, it is preferable to provide information that supports the understanding of surrounding spaces. In particular, recognizing indoor scenes such as a room, supermarket, or office gives the visually impaired person a valuable guide to understanding the surrounding environment. In this paper, we propose an indoor scene recognition system based on recent deep learning techniques. The Vision Transformer (ViT) is a recent deep learning architecture that has achieved high performance on image classification, so we deploy it for indoor scene recognition. To achieve better performance while reducing computational complexity, we propose a dual multiscale attention mechanism that collects features at different scales for a richer representation. The main idea is to process small patches and large patches in separate branches and to fuse their features with a proposed fusion technique. The proposed fusion requires linear memory and computational complexity, compared with the quadratic complexity of existing techniques. To demonstrate the efficiency of the proposed technique, extensive experiments were performed on the public MIT67 dataset. The achieved results demonstrate the superiority of the proposed technique over the state-of-the-art. Furthermore, with its reduced parameter count and FLOPs, the proposed indoor scene recognition system is suitable for implementation on mobile devices.
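The linear-complexity fusion described above can be illustrated with a minimal sketch: the class (CLS) token of the small-patch branch acts as the sole query and attends over the large-patch branch's tokens, so the attention map is 1 × n rather than n × n. This is a simplified, hypothetical NumPy illustration of the general idea, not the authors' actual implementation; the function name, random projection weights, and dimensions are assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(cls_small, tokens_large, d):
    """Fuse two branches: the small-patch branch's CLS token (query)
    attends over the large-patch branch's tokens (keys/values).

    Because there is only one query token, the attention map has shape
    (1, n): cost grows linearly with the number of tokens n, unlike
    full self-attention over the concatenated branches, which is O(n^2).
    Projection weights here are random placeholders for illustration.
    """
    rng = np.random.default_rng(0)
    Wq = rng.standard_normal((d, d)) / np.sqrt(d)
    Wk = rng.standard_normal((d, d)) / np.sqrt(d)
    Wv = rng.standard_normal((d, d)) / np.sqrt(d)

    q = cls_small @ Wq            # (1, d) single query token
    k = tokens_large @ Wk         # (n, d)
    v = tokens_large @ Wv         # (n, d)
    attn = softmax(q @ k.T / np.sqrt(d))  # (1, n) -- linear in n
    return attn @ v               # fused CLS token, shape (1, d)

# Example: 16-dim embeddings, large branch produces 10 patch tokens.
rng = np.random.default_rng(1)
fused = cross_attention_fuse(rng.standard_normal((1, 16)),
                             rng.standard_normal((10, 16)), 16)
```

The fused CLS token can then be returned to its own branch as a summary of the other branch's features, which is how such dual-branch designs typically exchange information at each fusion layer.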