Integrating synthetic datasets with CLIP semantic insights for single image localization advancements

Dansheng Yao,Mengqi Zhu,Hehua Zhu,Wuqiang Cai,Long Zhou

doi:10.1016/j.isprsjprs.2024.10.027

Abstract

Accurate localization of pedestrians and mobile robots is critical for navigation, emergency response, and autonomous driving. Traditional localization methods, such as those based on satellite signals, often prove ineffective in certain environments, and acquiring sufficient positional data can be challenging. Single image localization techniques have been developed to address these issues. However, current deep learning frameworks for single image localization that rely on domain adaptation fail to effectively utilize the semantically rich high-level features obtained from large-scale pretraining. This paper introduces a novel framework that leverages the Contrastive Language-Image Pre-training (CLIP) model and prompts to enhance feature extraction and domain adaptation through semantic information. The proposed framework generates an integrated score map from scene-specific prompts to guide feature extraction and employs adversarial components to facilitate domain adaptation. Furthermore, a reslink component is incorporated to mitigate the precision loss in high-level features compared to the original data. Experimental results demonstrate that the use of prompts reduces localization errors by 26.4% in indoor environments and 24.3% in outdoor settings. The model achieves localization errors as low as 0.75 m and 8.09 degrees indoors, and 4.56 m and 7.68 degrees outdoors. Analysis of prompts from labeled datasets confirms the model’s capability to effectively interpret scene information. The weights of the integrated score map make the contribution of each prompt visible, thereby improving the model’s interpretability. This study underscores the efficacy of integrating semantic information into image localization tasks.
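The abstract does not specify how the integrated score map is computed, so the following is only a minimal sketch of one plausible reading: each scene-specific prompt yields a cosine-similarity map against per-patch image features, and the maps are combined with softmax-normalized weights. The function name `integrated_score_map`, the cosine-similarity scoring, the softmax weighting, and the toy vectors standing in for CLIP embeddings are all assumptions made for illustration, not the authors' implementation.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def integrated_score_map(patch_feats, prompt_embeds, prompt_logits):
    """Hypothetical helper, not taken from the paper.

    patch_feats:   H x W grid of feature vectors (stand-ins for image features)
    prompt_embeds: list of P prompt vectors (stand-ins for text embeddings)
    prompt_logits: list of P unnormalized prompt weights (assumed learnable)
    Returns (per-prompt score maps, prompt weights, integrated H x W map).
    """
    weights = softmax(prompt_logits)
    prompts = [normalize(p) for p in prompt_embeds]
    feats = [[normalize(f) for f in row] for row in patch_feats]

    # One cosine-similarity map per prompt.
    score_maps = [
        [[sum(a * b for a, b in zip(f, p)) for f in row] for row in feats]
        for p in prompts
    ]

    # Weighted sum over prompts; the weights show how much each prompt contributes.
    h, w = len(feats), len(feats[0])
    integrated = [
        [sum(weights[k] * score_maps[k][i][j] for k in range(len(prompts)))
         for j in range(w)]
        for i in range(h)
    ]
    return score_maps, weights, integrated

# Toy usage: a 2 x 2 grid of 2-D features and two prompts with equal weight.
toy_feats = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [2.0, 0.0]]]
toy_prompts = [[1.0, 0.0], [0.0, 1.0]]
maps, prompt_weights, combined = integrated_score_map(toy_feats, toy_prompts, [0.0, 0.0])
```

In this reading, inspecting `prompt_weights` is what would give the kind of transparency the abstract attributes to the score-map weights; how the map then guides feature extraction is left out because the abstract gives no detail.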

Published in: ISPRS Journal of Photogrammetry and Remote Sensing
