ST-Sem: A Multimodal Method for Points-of-Interest Classification Using Street-Level Imagery

Shahin Sharifi Noorian, Achilleas Psyllidis, Alessandro Bozzon

doi:10.1007/978-3-030-19274-7_3

Abstract

Street-level imagery contains a variety of visual information about the facades of Points of Interest (POIs). In addition to general morphological features, signs on the facades of POIs, primarily business-related ones, can be a valuable source of information about the type and identity of a POI. Recent advances in computer vision make it possible to leverage visual information from street-level imagery for POI classification. However, the existing literature has not yet assessed the value of visual labels contained in street-level imagery as indicators of POI categories. This paper presents Scene-Text Semantics (ST-Sem), a novel method that leverages visual labels (e.g., texts, logos) from street-level imagery as complementary information for the categorization of business-related POIs. In contrast to existing methods that fuse visual and textual information at the feature level, we propose a late fusion approach that combines visual and textual cues after resolving issues of incorrect digitization and semantic ambiguity of the retrieved textual components. Experiments on two existing datasets and one newly created dataset show that ST-Sem can outperform visual-only approaches by 80% and related multimodal approaches by 4%.
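
To illustrate what late fusion means in general, the sketch below combines the per-class probabilities of two independent classifiers by weighted averaging. It is not the ST-Sem implementation: the abstract does not specify the fusion rule, so the weighted average, the function names `late_fusion` and `predict`, the `text_weight` parameter, and the example categories and scores are all assumptions made for illustration.

```python
from typing import Dict


def late_fusion(
    visual_probs: Dict[str, float],
    text_probs: Dict[str, float],
    text_weight: float = 0.5,
) -> Dict[str, float]:
    """Fuse two per-class probability distributions by weighted averaging.

    Assumption: each modality has already produced its own class scores,
    so fusion happens at the decision level, not the feature level.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")

    # A class missing from one modality is treated as probability 0 there.
    classes = set(visual_probs) | set(text_probs)
    fused = {
        c: (1.0 - text_weight) * visual_probs.get(c, 0.0)
        + text_weight * text_probs.get(c, 0.0)
        for c in classes
    }

    total = sum(fused.values())
    if total == 0.0:
        raise ValueError("both modalities assigned zero probability everywhere")

    # Renormalize so the fused scores sum to 1.
    return {c: score / total for c, score in fused.items()}


def predict(fused: Dict[str, float]) -> str:
    """Return the class with the highest fused score."""
    return max(fused, key=fused.get)


# Hypothetical scores: the facade alone is ambiguous, the sign text is not.
visual = {"bakery": 0.4, "pharmacy": 0.6}
text = {"bakery": 0.9, "pharmacy": 0.1}
fused = late_fusion(visual, text, text_weight=0.5)
label = predict(fused)
```

In this hypothetical case the visual classifier alone would pick "pharmacy", while the fused scores favor "bakery" because the textual cue is more confident.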
