Explaining digital humanities by aligning images and textual descriptions

Marcella Cornia, Matteo Stefanini, Lorenzo Baraldi, Massimiliano Corsini, Rita Cucchiara

doi:10.1016/j.patrec.2019.11.018

Open Access

https://doi.org/10.1016/j.patrec.2019.11.018

Abstract

Replicating the human ability to connect Vision and Language has recently gained considerable attention in the Computer Vision and Natural Language Processing communities. This research effort has resulted in algorithms that can retrieve images from textual descriptions and vice versa, provided that realistic images and sentences with simple semantics are employed and that paired training data is available. In this paper, we go beyond these limitations and tackle the design of visual-semantic algorithms in the domain of the Digital Humanities. This setting not only presents more complex visual and semantic structures but also suffers from a significant lack of training data, which makes fully-supervised approaches infeasible. To this end, we propose a joint visual-semantic embedding that can automatically align illustrations and textual elements without paired supervision. This is achieved by transferring the knowledge learned on ordinary visual-semantic datasets to the artistic domain. Experiments performed on two datasets specifically designed for this domain validate the proposed strategies and quantify the domain shift between natural images and artworks.
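
As a rough illustration of what retrieval in a joint visual-semantic embedding space involves, the sketch below ranks images against a text query by cosine similarity. It is a minimal example under stated assumptions, not the authors' method: the abstract does not specify the similarity measure, the encoders, or the embedding size, so the function names (`l2_normalize`, `cosine_similarity`, `rank_images`) and the toy two-dimensional vectors are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumed preprocessing, not from the paper)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    a_unit, b_unit = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_unit, b_unit))


def rank_images(text_embedding, image_embeddings):
    """Return image indices ordered from most to least similar to the text."""
    scores = [cosine_similarity(text_embedding, img) for img in image_embeddings]
    return sorted(range(len(scores)), key=lambda i: -scores[i])


# Toy embeddings standing in for encoder outputs (hypothetical values).
query = [1.0, 0.0]
images = [[0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
ranking = rank_images(query, images)
```

In a real system the vectors would come from trained image and text encoders; here they only show how a shared space turns cross-modal retrieval into a nearest-neighbour ranking.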

Journal: Pattern Recognition Letters
Publication Date: Nov 18, 2019
Citations: 24
License type: other-oa
