Affective Feedback Synthesis Towards Multimodal Text and Image Data

Puneet Kumar, Balasubramanian Raman, Omkar Ingle, Gaurav Bhatt, Daksh Goyal

doi:10.1145/3589186

Abstract

In this article, we define a novel task, affective feedback synthesis, which generates feedback for input text and corresponding images in a way similar to how humans respond to multimodal data. We propose a feedback synthesis system and train it using ground-truth human comments along with image–text input. We have also constructed a large-scale dataset consisting of images, text, Twitter user comments, and the number of likes for the comments by crawling news articles through Twitter feeds. The proposed system extracts textual features using a transformer-based textual encoder and visual features using a Faster region-based convolutional neural network (Faster R-CNN) model. The textual and visual features are concatenated to construct multimodal features that the decoder uses to synthesize the feedback. We compare the results of the proposed system with baseline models using quantitative and qualitative measures. The synthesized feedback has been analyzed using automatic and human evaluation and found to be semantically similar to the ground-truth comments and relevant to the given text–image input.
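The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the features are shaped or along which axis they are concatenated, so the sketch assumes each modality is first mean-pooled to a single vector and the two vectors are then joined end to end. The names `mean_pool` and `fuse_features`, and the toy feature values, are hypothetical stand-ins for real encoder outputs.

```python
from typing import List

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors into one vector."""
    if not vectors:
        raise ValueError("need at least one vector to pool")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same length")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def fuse_features(text_tokens: List[Vector], image_regions: List[Vector]) -> Vector:
    """Concatenate pooled textual and visual features (assumed fusion scheme)."""
    return mean_pool(text_tokens) + mean_pool(image_regions)


# Toy stand-ins for encoder outputs (hypothetical values and sizes).
text_tokens = [[1.0, 3.0], [3.0, 5.0]]               # 2 tokens, 2 dims each
image_regions = [[0.0, 2.0, 4.0], [2.0, 4.0, 6.0]]   # 2 regions, 3 dims each

fused = fuse_features(text_tokens, image_regions)
print(fused)  # [2.0, 4.0, 1.0, 3.0, 5.0]
```

In a real system the fused vector (or an unpooled sequence of token and region features) would be passed to the decoder; pooling is used here only to keep the example small.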

Journal: ACM Transactions on Multimedia Computing, Communications, and Applications
Publication Date: May 31, 2023
Citations: 1

Similar Papers

Mass Classification of Breast Cancer Using CNN and Faster R-CNN Model Comparison
Sunardi Sunardi ... Anton Yudhana
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control | VOL. -
30 Aug 2022

Identification of mild cognitive impairment using multimodal 3D imaging data and graph convolutional networks
Shengbin Liang ... Wencai Du
Physics in Medicine & Biology | VOL. -
29 Oct 2024

Diagnosis of primary clear cell carcinoma of the liver based on Faster region-based convolutional neural network.
Bin Liu ... Yanyan Zhang
Chinese Medical Journal | VOL. 136
25 Oct 2023

A deep-level region-based visual representation architecture for detecting strawberry flowers in an outdoor field
P Lin ... C Fraisse
Precision Agriculture | VOL. 21
07 Jun 2019
