Abstract

This paper presents a machine learning classifier for detecting points of interest from the combined use of images and text gathered from social networks. The model exploits the transfer-learning capabilities of the CLIP (Contrastive Language-Image Pre-Training) neural network architecture in multimodal (image and text) settings. Different methodologies based on multimodal information are explored for geolocating the detected places. To this end, pre-trained neural network models are used to classify images and their associated texts. The result is a system that creates new synergies between images and texts in order to detect and geolocate trending places that have not been previously tagged by any other means, providing potentially relevant information for tasks such as cataloging specific types of places in a city for the tourism industry. The experiments carried out reveal that, in general, textual information is more accurate and relevant than visual cues in this multimodal setting.
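To make the classification approach concrete, the following is a minimal sketch, not the authors' implementation, of zero-shot point-of-interest classification with a pre-trained CLIP model through the Hugging Face Transformers API. The checkpoint name, the candidate place categories, and the input image are illustrative assumptions.

```python
# A minimal sketch (not the paper's code): zero-shot point-of-interest
# classification with a pre-trained CLIP model via Hugging Face Transformers.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical inputs: a photo from a social-network post and a set of
# candidate POI categories expressed as natural-language prompts.
image = Image.open("post_photo.jpg")
labels = ["a photo of a restaurant", "a photo of a museum",
          "a photo of a park", "a photo of a beach"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds image-text similarity scores; a softmax over the
# candidate labels turns them into class probabilities.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(labels, probs[0].tolist())))
```

The same similarity machinery applies to a post's caption by comparing its text embedding against the label embeddings, which is the direction the abstract reports as the more informative modality.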
