A study of the evaluation metrics for generative images containing combinational creativity

Boheng Wang, Liuqing Chen, Yunhuai Zhu, Jingcheng Liu, Lingyun Sun, Peter Childs

doi:10.1017/s0890060423000069

Abstract

In the field of machine content generation, the state-of-the-art text-to-image model DALL⋅E has advanced and diverse capacities for generating combinational images from specific textual prompts. The images generated by DALL⋅E seem to exhibit an appreciable level of combinational creativity, close to that of humans, in terms of visualizing a combinational idea. Although several common metrics can be applied to assess the quality of images produced by generative models, such as IS, FID, GIQA, and CLIP, it is unclear whether these metrics are equally applicable to images containing combinational creativity. In this study, we collected images generated by a machine (DALL⋅E) and images created by human designers. The group rankings obtained from the Consensual Assessment Technique (CAT) and the Turing Test (TT) were used as benchmarks for assessing combinational creativity. Because the metrics differ in their mathematical principles and in their starting points for evaluating image quality, we introduced two measures, coincident rate (CR) and average rank variation (ARV), as comparable spaces. We then conducted an experiment that calculated how consistent each metric's group ranking was with the benchmarks. By comparing the CR and ARV consistency results, we summarized the applicability of the existing evaluation metrics to generative images containing combinational creativity. Among the four metrics, GIQA showed the closest consistency with the CAT and TT. It therefore shows potential as an automated assessment for images containing combinational creativity, and could be used to evaluate such images in relevant design and engineering tasks, such as conceptual sketches, digital design images, and prototyping images.
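The abstract names CR and ARV but does not define them, so the sketch below is only one plausible reading, not the authors' formulation. It assumes CR is the fraction of rank positions at which a metric's ranking and the benchmark ranking hold the same item, and ARV is the mean absolute difference between each item's position in the two rankings. The function names and the example group labels are hypothetical.

```python
from typing import Hashable, Sequence


def _check_same_items(benchmark: Sequence[Hashable], metric: Sequence[Hashable]) -> None:
    """Both rankings must be non-empty orderings of the same distinct items."""
    if not benchmark:
        raise ValueError("rankings must not be empty")
    if len(set(benchmark)) != len(benchmark) or len(set(metric)) != len(metric):
        raise ValueError("rankings must not contain duplicate items")
    if set(benchmark) != set(metric):
        raise ValueError("rankings must contain the same items")


def coincident_rate(benchmark: Sequence[Hashable], metric: Sequence[Hashable]) -> float:
    """ASSUMED definition: share of positions where both rankings agree."""
    _check_same_items(benchmark, metric)
    matches = sum(b == m for b, m in zip(benchmark, metric))
    return matches / len(benchmark)


def average_rank_variation(benchmark: Sequence[Hashable], metric: Sequence[Hashable]) -> float:
    """ASSUMED definition: mean absolute shift in position per item."""
    _check_same_items(benchmark, metric)
    metric_position = {item: i for i, item in enumerate(metric)}
    total_shift = sum(abs(i - metric_position[item]) for i, item in enumerate(benchmark))
    return total_shift / len(benchmark)


# Hypothetical group labels, ranked best to worst.
benchmark_ranking = ["A", "B", "C", "D"]   # e.g. a ranking from human assessment
metric_ranking = ["A", "C", "B", "D"]      # e.g. a ranking from an automated metric

cr = coincident_rate(benchmark_ranking, metric_ranking)
arv = average_rank_variation(benchmark_ranking, metric_ranking)
print(cr, arv)
```

Under these assumed definitions, a higher CR and a lower ARV both indicate that a metric orders the groups more like the benchmark does; identical rankings give CR = 1 and ARV = 0.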

Journal: Artificial Intelligence for Engineering Design, Analysis and Manufacturing
Publication Date: Jan 1, 2023
Citations: 4
