Abstract

Visual paraphrase generation task aims to rewrite a given image-related original sentence into a new paraphrase, where the paraphrase needs to have the same expressed meaning as the original sentence but have a difference in expression form. Existing studies mainly extract two semantic vectors to represent the entire image and the entire original sentence, respectively, for paraphrase generation. However, these semantic vectors for an image or a sentence may lead to the model failing to focus on some key objects in the original sentence, which may generate semantically inconsistent sentences by changing key object information. In this article, we propose an object-level paraphrase generation model, which generates paraphrases by adjusting the permutation of key objects and modifying their associated descriptions. To adjust the permutation of key objects, an object-sorting module aims to obtain new object sequences based on the key object information and original sentences. Then, a sequence generation module sequentially generates paraphrases based on the permutation of the newly object sequences. Each generation step focuses on different image features associated with different key objects to generate descriptions with differences. Furthermore, we use a semantic discriminator module to promote the generated paraphrase to be semantically close to the original sentence. Specifically, the loss function of the discriminator penalizes the excessive distance between the paraphrase and the original sentence. Extensive experiments on the MS COCO dataset show that the proposed model outperforms the baselines.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.