Abstract

Generative models, such as autoencoders, generative adversarial networks, and diffusion models, have become an integral part of innovation in many fields in recent years, including art, design, and medicine. Because they can create new data samples, they open broad opportunities for automation and process improvement. However, assessing the quality of generated data remains a challenging task, as traditional methods do not always adequately reflect the diversity and realism of the generated samples. This is particularly true for partial data generation, where changes are applied only to specific parts of an image, which significantly complicates quality assessment. This work examines various approaches to evaluating generative models, including automatic metrics such as Inception Score and Fréchet Inception Distance, precision, recall, density, and coverage, as well as human-in-the-loop methods such as HYPE. While these metrics have proven effective for evaluating the results of traditional generation, their limitations can make them unsuitable for partially generated data. To address this issue, the paper proposes a new human-in-the-loop method for evaluating partially generated data. Users inspect transformed images and mark the areas they believe have been altered; the quality of the generation is then quantified with precision, recall, and F1-score, computed by matching the actual altered regions against the user-selected regions via intersection over union (IoU). The proposed approach provides a more objective assessment of the realism and quality of generated image fragments produced by such transformations. A practical example of the developed method is presented on a dataset of panoramic dental images, where three models were evaluated: 1) a GAN based on a U-generator; 2) the same model with post-processing of the output image and segmentation mask; and 3) a self-validated GAN. The evaluation was performed by 30 participants. The average F1-scores for these models were 0.78, 0.27, and 0.20, respectively. Since lower F1-scores indicate better results in this setting (the more accurately users identified the transformations, the worse the model performed), the best model by this metric is the self-validated GAN, which is also supported by the subjective assessments reported in the authors' work.
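
As a rough illustration of the scoring scheme described above, the following Python sketch (not the authors' implementation) matches user-marked regions to ground-truth altered regions by IoU and derives precision, recall, and F1-score. The function names, the 0.5 IoU threshold, and the greedy matching strategy are assumptions made for illustration only.

```python
import numpy as np

def iou(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Intersection over union between two boolean region masks."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(intersection) / float(union) if union > 0 else 0.0

def evaluate_user_annotations(true_regions, user_regions, iou_threshold=0.5):
    """
    true_regions: list of boolean masks of actually altered areas.
    user_regions: list of boolean masks of areas marked by one user.
    A user-marked region counts as a true positive if its IoU with an
    unmatched ground-truth region meets the threshold (assumed 0.5 here).
    Returns (precision, recall, f1) for a single image and user.
    """
    matched_true = set()
    true_positives = 0
    for user_mask in user_regions:
        # Greedily match this user region to the best remaining true region.
        best_idx, best_iou = None, 0.0
        for i, true_mask in enumerate(true_regions):
            if i in matched_true:
                continue
            score = iou(user_mask, true_mask)
            if score > best_iou:
                best_idx, best_iou = i, score
        if best_idx is not None and best_iou >= iou_threshold:
            matched_true.add(best_idx)
            true_positives += 1

    precision = true_positives / len(user_regions) if user_regions else 0.0
    recall = true_positives / len(true_regions) if true_regions else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) > 0 else 0.0)
    return precision, recall, f1
```

Per-image, per-user scores of this kind would then be averaged over all users and images to obtain model-level F1-scores such as those reported above.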
