Aesthetic quality evaluation of packaging design with graph neural networks and composition features
In the field of visual communication design, the aesthetic quality assessment of packaging images faces significant challenges due to the complexity and subjectivity of their layout composition. To enhance the objectivity and intelligence of such evaluations, this study proposes a packaging design aesthetic quality assessment method combining image composition features and graph neural networks (CGA-GNN). The method extracts visual structural information from images based on graph construction rules (e.g., symmetry, proximity, rule of thirds) and integrates a graph attention mechanism to improve compositional awareness in node feature aggregation. Experiments were conducted on a purpose-built dataset of 1,200 annotated packaging images. The results demonstrate that CGA-GNN significantly outperforms existing baseline models in both prediction accuracy and consistency. Specifically, the model achieves a Weighted Root Mean Squared Error (WRMSE) of 0.378 ± 0.018, significantly lower than that of GraphSAGE-GAT (0.397 ± 0.021, p < 0.05), GAT (0.425 ± 0.022, p < 0.01), and CNN (0.446 ± 0.023, p < 0.001). For Spearman's rank correlation coefficient, CGA-GNN attains 0.714 ± 0.017, markedly higher than the other comparative models, with a maximum improvement of 0.073 (p < 0.001). Additionally, its Graph Structural Integrity Rate (GSIR) reaches 0.921 ± 0.016, an approximately 15% increase over CNN (0.802 ± 0.020). Ablation studies further reveal that the model achieves optimal performance when all three compositional rules are incorporated (WRMSE = 0.378, Spearman's ρ = 0.714, Kendall's W = 0.691), validating the complementary effect of multi-rule integration. The findings confirm the effectiveness of deeply integrating composition rules with graph neural networks for assessing the aesthetic quality of packaging images, providing technical support for standardized design evaluation, personalized recommendation, and creative assistance.
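To make the rule-based graph construction concrete, here is a minimal Python sketch of how edges might be derived from the three named composition rules, assuming design elements have already been detected as bounding boxes. The thresholds, the vertical-mirror symmetry test, and the thirds-line test are illustrative assumptions, not the authors' released implementation; a graph attention layer would then aggregate node features over this adjacency.

```python
# Hypothetical sketch of CGA-GNN-style graph construction; the element
# detector, thresholds, and feature extractor are assumptions.
import numpy as np

def build_composition_graph(boxes, img_w, img_h, prox_thresh=0.15, sym_tol=0.05):
    """Connect detected design elements by three composition rules.

    boxes: (N, 4) numpy array of [x, y, w, h] element boxes in pixels.
    Returns an (N, N) binary adjacency matrix.
    """
    n = len(boxes)
    centers = np.stack([boxes[:, 0] + boxes[:, 2] / 2,
                        boxes[:, 1] + boxes[:, 3] / 2], axis=1)
    norm = centers / np.array([img_w, img_h])          # normalize to [0, 1]
    adj = np.zeros((n, n), dtype=np.int8)

    thirds = np.array([1 / 3, 2 / 3])                  # rule-of-thirds lines
    for i in range(n):
        for j in range(i + 1, n):
            # Proximity: centers closer than a fraction of the unit square.
            if np.linalg.norm(norm[i] - norm[j]) < prox_thresh:
                adj[i, j] = adj[j, i] = 1
            # Symmetry: pair mirrored across the vertical midline.
            if abs((norm[i, 0] + norm[j, 0]) - 1.0) < sym_tol \
                    and abs(norm[i, 1] - norm[j, 1]) < sym_tol:
                adj[i, j] = adj[j, i] = 1
            # Rule of thirds: both elements sit on a vertical thirds line.
            on_i = np.min(np.abs(norm[i, 0] - thirds)) < sym_tol
            on_j = np.min(np.abs(norm[j, 0] - thirds)) < sym_tol
            if on_i and on_j:
                adj[i, j] = adj[j, i] = 1
    return adj
```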
- Conference Instance
- 10.1145/3423268
- Oct 12, 2020
It is our great pleasure to welcome you to the 2020 ACM Joint Workshop on Aesthetic and Technical Quality Assessment of Multimedia and Media Analytics for Societal Trends (ATQAM/MAST'20). This Joint Workshop is divided into two tracks: ATQAM and MAST. ATQAM track: Visual quality assessment techniques can be divided into image and video technical quality assessment (IQA and VQA, or broadly TQA) and aesthetics quality assessment (AQA). While TQA is a long-standing field, having its roots in media compression, AQA is relatively young. These two topics have mostly been studied separately, even though they deal with similar aspects of the underlying subjective experience of media. The mission of this track is to bring together individuals in the two fields of TQA and AQA for the sharing of ideas and discussions on current trends, developments, issues, and future directions. We hope that bridging TQA and AQA will result in a better understanding of quantitative measures of quality of experience in the broader context of multimedia applications. MAST track: Traditional multimedia content analytics and research usually deals with tasks such as indexing and summarization. In the context of understanding the impact of media on society and shaping our experiences, we believe in the need for a holistic approach to quantify how people, places, and topics are portrayed in media, complementing the traditional branches of media analytics research. To this end, the MAST workshop aims to close the loop with the audience's experience and analyze the impact of media in terms of constantly evolving societal patterns.
- Research Article
- 10.1016/j.knosys.2024.111749
- Apr 2, 2024
- Knowledge-Based Systems
Personalized Image Aesthetics Assessment based on Graph Neural Network and Collaborative Filtering
- Book Chapter
- 10.1007/978-981-13-3663-8_30
- Jan 1, 2019
Nowadays, the acquisition of digital photos is becoming easier and easier, and the demand for objective assessment of their aesthetic quality is growing; with the development of computer vision and pattern recognition technology, this demand is gradually being met. After studying the effect of gray-interval distribution on the aesthetic quality of photos, an objective assessment method is proposed that considers both global and local features. For the global features, gray-interval and line-angle features are extracted. Subject areas are extracted from the photos through a clarity-detection method; the clarity, lighting, and rule-of-thirds measures are then computed within the subject area to obtain the local features. A total of 18 global and local features are computed and trained with LIBSVM, and the method achieves good performance when tested on the CUHKPQ image dataset. It can be concluded that this combination of 18 features for objective assessment of photo aesthetic quality can meet the demands of practical application.
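As an illustration of the kind of handcrafted features described here, the sketch below recreates a gray-interval histogram and a rule-of-thirds distance from the abstract's description, with scikit-learn's SVR standing in for LIBSVM. It is not the chapter's 18-feature implementation.

```python
# Illustrative sketch only: two of the described features, recreated from
# the abstract; sklearn's SVR is a stand-in for LIBSVM.
import numpy as np
from sklearn.svm import SVR

def gray_interval_feature(gray, bins=8):
    """Fraction of pixels in each gray-level interval (gray: 2-D uint8)."""
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    return hist / gray.size

def thirds_feature(subject_cx, subject_cy, w, h):
    """Distance of the subject centroid to the nearest thirds intersection."""
    pts = [(w * a, h * b) for a in (1/3, 2/3) for b in (1/3, 2/3)]
    d = min(np.hypot(subject_cx - px, subject_cy - py) for px, py in pts)
    return d / np.hypot(w, h)                  # normalize by image diagonal

# Training on stacked feature vectors X (rows) and aesthetic scores y:
# model = SVR(kernel="rbf").fit(X_train, y_train)
# preds = model.predict(X_test)
```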
- Research Article
- 10.5120/ijca2015907614
- Dec 17, 2015
- International Journal of Computer Applications
Aesthetics is a branch of philosophy which deals with the study of emotions in relation to the sense of beauty. Nowadays, there is a tremendous increase in the use of digital images as a means for representing and communicating information. With the considerable increase of consumer photos, evaluating the quality of photos has become a difficult task. People are more interested in photos that are visually pleasing. The aesthetic beauty of a picture is judged using aesthetic quality factors such as prettiness, cuteness, neatness, cuddliness, and loveliness. Aesthetic quality assessment is a challenging task that requires an understanding of subjective notions. The aesthetic quality score of an image can be calculated using low-level features such as contrast, sharpness, and colorfulness. This paper provides a survey of aesthetic quality assessment of photographic images and a brief description of existing approaches.
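The low-level features the survey names (contrast, sharpness, colorfulness) are commonly computed as sketched below; the specific formulas (RMS contrast, gradient-based sharpness, the Hasler-Süsstrunk colorfulness metric) are conventional choices rather than ones prescribed by the paper.

```python
# Common low-level aesthetic features; the formulas are standard choices,
# not taken from any single surveyed paper.
import numpy as np

def contrast(gray):
    return gray.std() / 255.0                  # RMS contrast, normalized

def sharpness(gray):
    # Mean squared gradient magnitude: a simple sharpness proxy.
    gy, gx = np.gradient(gray.astype(float))
    return (gx ** 2 + gy ** 2).mean()

def colorfulness(rgb):
    # Hasler & Süsstrunk (2003) metric on an (H, W, 3) float RGB image.
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    rg, yb = r - g, 0.5 * (r + g) - b
    return np.hypot(rg.std(), yb.std()) + 0.3 * np.hypot(rg.mean(), yb.mean())
```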
- Conference Article
- 10.1109/iccad51958.2021.9643549
- Nov 1, 2021
Graph Neural Networks (GNNs) have emerged as the state-of-the-art (SOTA) method for graph-based learning tasks. However, it remains prohibitively challenging to run GNN inference over large graph datasets, limiting their application to large-scale real-world tasks. While end-to-end joint optimization of GNNs and their accelerators is promising for boosting GNNs' inference efficiency and expediting the design process, it is still underexplored due to the vast and distinct design spaces of GNNs and their accelerators. In this work, we propose G-CoS, a GNN and accelerator co-search framework that can automatically search for matched GNN structures and accelerators to maximize both task accuracy and acceleration efficiency. Specifically, G-CoS integrates two major enabling components: (1) a generic GNN accelerator search space which is applicable to various GNN structures and (2) a one-shot GNN and accelerator co-search algorithm that enables simultaneous and efficient search for optimal GNN structures and their matched accelerators. To the best of our knowledge, G-CoS is the first co-search framework for GNNs and their accelerators. Extensive experiments and ablation studies show that the GNNs and accelerators generated by G-CoS consistently outperform SOTA GNNs and GNN accelerators in terms of both task accuracy and hardware efficiency, while requiring only a few hours for the end-to-end generation of the best-matched GNNs and their accelerators.
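G-CoS itself uses a one-shot co-search algorithm; as a heavily reduced illustration of the joint-search idea only, the sketch below randomly samples paired GNN and accelerator configurations against a placeholder evaluator. The search spaces, the evaluator, and the scalarized objective are all assumptions.

```python
# Heavily simplified illustration of GNN/accelerator co-search (not G-CoS):
# sample a GNN config and an accelerator config jointly, keep the best pair
# under an accuracy/latency trade-off. evaluate() is a placeholder.
import random

GNN_SPACE = {"layers": [2, 3, 4], "hidden": [64, 128, 256],
             "aggr": ["mean", "max", "attn"]}
ACC_SPACE = {"pe_array": [8, 16, 32], "buffer_kb": [64, 256, 1024]}

def sample(space):
    return {k: random.choice(v) for k, v in space.items()}

def co_search(evaluate, trials=100, alpha=0.5):
    """evaluate(gnn_cfg, acc_cfg) -> (accuracy, latency); assumed given."""
    best, best_score = None, float("-inf")
    for _ in range(trials):
        gnn, acc = sample(GNN_SPACE), sample(ACC_SPACE)
        accuracy, latency = evaluate(gnn, acc)
        score = alpha * accuracy - (1 - alpha) * latency  # scalarized objective
        if score > best_score:
            best, best_score = (gnn, acc), score
    return best
```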
- Conference Article
- 10.1145/3394171.3421895
- Oct 12, 2020
The Joint Workshop on Aesthetic and Technical Quality Assessment of Multimedia and Media Analytics for Societal Trends (ATQAM/MAST) aims to bring together researchers and professionals working in fields ranging from computer vision, multimedia computing, multimodal signal processing to psychology and social sciences. It is divided into two tracks: ATQAM and MAST. ATQAM track: Visual quality assessment techniques can be divided into image and video technical quality assessment (IQA and VQA, or broadly TQA) and aesthetics quality assessment (AQA). While TQA is a long-standing field, having its roots in media compression, AQA is relatively young. Both have received increased attention with developments in deep learning. The topics have mostly been studied separately, even though they deal with similar aspects of the underlying subjective experience of media. The aim is to bring together individuals in the two fields of TQA and AQA for the sharing of ideas and discussions on current trends, developments, issues, and future directions. MAST track: The research area of media content analytics has been traditionally used to refer to applications involving inference of higher-level semantics from multimedia content. However, multimedia is typically created for human consumption, and we believe it is necessary to adopt a human-centered approach to this analysis, which would not only enable a better understanding of how viewers engage with content but also how they impact each other in the process.
- Research Article
- 10.1109/tnnls.2024.3497330
- Jan 1, 2024
- IEEE transactions on neural networks and learning systems
Graph neural network (GNN) ushered in a new era of machine learning with interconnected datasets. While traditional neural networks can only be trained on independent samples, GNN allows for the inclusion of intersample interactions in the training process. This gain, however, incurs additional memory cost, rendering most GNNs unscalable for real-world applications involving vast and complicated networks with tens of millions of nodes (e.g., social circles, web graphs, and brain graphs). This means that storing the graph in the main memory can be difficult, let alone training the GNN model with significantly less GPU memory. While much of the recent literature has focused on either mini-batching GNN methods or quantization, graph reduction methods remain largely scarce. Furthermore, present graph reduction approaches have several drawbacks. First, most graph reduction focuses only on the inference stage (e.g., condensation, pruning, and distillation) and requires full graph GNN training, which does not reduce training memory footprint. Second, many methods focus solely on the graph's structural aspect, ignoring the initial population feature-label distribution, resulting in a skewed postreduction label distribution. Here, we propose a feature-label constrained graph net collapse (FALCON) to address these limitations. Our three core contributions lie in: 1) designing FALCON, a topology-aware graph reduction technique that preserves feature-label distribution by introducing a K-means clustering with a novel dimension-normalized Euclidean distance; 2) implementation of FALCON with other state-of-the-art (SOTA) memory reduction methods (i.e., mini-batched GNN and quantization) for further memory reduction; and 3) extensive benchmarking and ablation studies against SOTA methods to evaluate FALCON memory reduction. Our comprehensive results show that FALCON can significantly collapse various public datasets (e.g., PPI and Flickr to as low as 34% of the total nodes) while keeping equal prediction quality across GNN models. Our FALCON code is available at https://github.com/basiralab/FALCON.
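As a rough reading of the collapse step described above (per-label K-means under a dimension-normalized distance, here interpreted as per-dimension standardization), a sketch might look like the following; the authors' actual implementation is at the linked repository.

```python
# Sketch of a label-aware node collapse in the spirit of the abstract;
# not the FALCON code (see https://github.com/basiralab/FALCON).
import numpy as np
from sklearn.cluster import KMeans

def collapse_nodes(X, y, keep_ratio=0.34, seed=0):
    """X: (N, D) node features, y: (N,) labels. Returns centroids + labels."""
    # Dimension-normalized distance read as Euclidean distance after
    # per-dimension standardization (an interpretation, not the paper's text).
    Xn = (X - X.mean(0)) / (X.std(0) + 1e-8)
    cx, cy = [], []
    for lbl in np.unique(y):                   # preserve label proportions
        idx = np.where(y == lbl)[0]
        k = max(1, int(keep_ratio * len(idx)))
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(Xn[idx])
        cx.append(km.cluster_centers_)
        cy.append(np.full(k, lbl))
    return np.vstack(cx), np.concatenate(cy)
```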
- Research Article
- 10.1088/2632-2153/addfaa
- Jun 10, 2025
- Machine Learning: Science and Technology
Prediction of chemical yields is crucial for exploring untapped chemical reactions and optimizing synthetic pathways for targeted compounds. Recently, graph neural networks have proven successful in achieving high predictive accuracy. However, they remain intrinsically black-box models, offering limited interpretability. Understanding how each reaction component contributes to the yield of a chemical reaction can help identify critical factors driving the success or failure of reactions, thereby potentially revealing opportunities for yield optimization. In this study, we present a novel method for interpretable chemical reaction yield prediction, which represents the yield of a chemical reaction as a simple summation of component-wise contributions from individual reaction components. To build an interpretable prediction model, we introduce a graph neural additive network architecture, wherein shared neural networks process individual reaction components in an input reaction while leveraging a reaction-level embedding to derive their respective contributions. The predicted yield is obtained by summing these component-wise contributions. The model is trained using a learning objective designed to effectively quantify the contributions of individual components by amplifying the influence of significant components and suppressing that of less influential components. The experimental results on benchmark datasets demonstrated that the proposed method achieved both high predictive accuracy and interpretability, making it suitable for practical use in synthetic pathway design for real-world applications.
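The additive decomposition described here (yield as a sum of per-component contributions from a shared network conditioned on a reaction-level embedding) can be sketched in a few lines of PyTorch; the component featurization, layer sizes, and the paper's contribution-shaping objective are omitted or assumed.

```python
# Minimal PyTorch sketch of a neural additive yield model: a shared network
# scores each reaction component; the yield is the sum of contributions.
import torch
import torch.nn as nn

class AdditiveYieldModel(nn.Module):
    def __init__(self, comp_dim, rxn_dim, hidden=64):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Linear(comp_dim + rxn_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))              # one contribution per component

    def forward(self, components, rxn_embed):
        # components: (B, C, comp_dim); rxn_embed: (B, rxn_dim).
        ctx = rxn_embed.unsqueeze(1).expand(-1, components.size(1), -1)
        contrib = self.shared(torch.cat([components, ctx], dim=-1)).squeeze(-1)
        # Predicted yield plus the per-component terms for interpretation.
        return contrib.sum(dim=1), contrib
```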
- Conference Article
- 10.1109/acait56212.2022.10137804
- Dec 9, 2022
Aesthetic quality evaluation of images plays an important role in the field of visual analysis, and the widespread use of high-quality image editing has gradually increased the importance of image aesthetic evaluation in automatic image-processing tasks. Previous researchers have mostly explored the mapping between images and labeled scores using convolutional neural networks, but the aesthetic features of different image regions have not been explored sufficiently. When an image is rich in background information and evaluating it requires correlating the aesthetic features of different regions, convolutional neural networks often cannot extract the aesthetic features adequately because they lack the advantage of global feature modeling. We introduce a novel Transformer architecture for image aesthetic quality assessment (IAFormer). IAFormer can model the global aesthetic features of an image, and it is a framework that unifies aesthetic quality assessment and aesthetic cropping: while the aesthetic quality of the image is evaluated, aesthetic weights for the different patches within the image can be computed, providing valid reference information for the cropping task.
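The unifying idea (one set of per-patch attention weights serving both scoring and cropping) might be sketched as follows; the patch embedding, model sizes, and pooling head are simplified guesses, not IAFormer's architecture.

```python
# Sketch of the shared idea only: a transformer over image patches whose
# attention-pooling weights serve both the score and per-patch crop cues.
import torch
import torch.nn as nn

class PatchAesthetics(nn.Module):
    def __init__(self, dim=128, heads=4, layers=2):
        super().__init__()
        enc = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                         batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, num_layers=layers)
        self.attn = nn.Linear(dim, 1)          # per-patch pooling weight
        self.score = nn.Linear(dim, 1)

    def forward(self, patches):                # patches: (B, N, dim)
        h = self.encoder(patches)
        w = torch.softmax(self.attn(h).squeeze(-1), dim=1)  # (B, N) weights
        pooled = (w.unsqueeze(-1) * h).sum(dim=1)
        return self.score(pooled).squeeze(-1), w            # score + crop cues
```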
- Research Article
- 10.1016/j.engappai.2024.109647
- Nov 29, 2024
- Engineering Applications of Artificial Intelligence
Semantic graph neural network with multi-measure learning for semi-supervised classification
- Research Article
- 10.1016/j.media.2022.102471
- Jul 1, 2022
- Medical Image Analysis
Resting-state functional magnetic resonance imaging (rs-fMRI) has been successfully employed to understand the organisation of the human brain. Typically, the brain is parcellated into regions of interest (ROIs) and modelled as a graph where each ROI represents a node and association measures between ROI-specific blood-oxygen-level-dependent (BOLD) time series are edges. Recently, graph neural networks (GNNs) have seen a surge in popularity due to their success in modelling unstructured relational data. The latest developments with GNNs, however, have not yet been fully exploited for the analysis of rs-fMRI data, particularly with regard to its spatio-temporal dynamics. In this paper, we present a novel deep neural network architecture which combines both GNNs and temporal convolutional networks (TCNs) in order to learn from both the spatial and temporal components of rs-fMRI data in an end-to-end fashion. In particular, this corresponds to intra-feature learning (i.e., learning temporal dynamics with TCNs) as well as inter-feature learning (i.e., leveraging interactions between ROI-wise dynamics with GNNs). We evaluate our model with an ablation study using 35,159 samples from the UK Biobank rs-fMRI database, as well as in the smaller Human Connectome Project (HCP) dataset, both in a unimodal and in a multimodal fashion. We also demonstrate that our architecture contains explainability-related features which easily map to realistic neurobiological insights. We suggest that this model could lay the groundwork for future deep learning architectures focused on leveraging the inherently and inextricably spatio-temporal nature of rs-fMRI data.
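A compressed sketch of the intra-/inter-feature split described above (temporal convolution per ROI, then graph aggregation across ROIs) is shown below; kernel sizes, pooling, and adjacency normalization are illustrative assumptions, not the paper's architecture.

```python
# Compressed sketch: one temporal-conv step per ROI time series followed by
# one graph-aggregation step over the ROI graph.
import torch
import torch.nn as nn

class TCNGNNBlock(nn.Module):
    def __init__(self, t_hidden=16, g_out=32):
        super().__init__()
        self.tcn = nn.Conv1d(1, t_hidden, kernel_size=7, padding=3)
        self.gnn = nn.Linear(t_hidden, g_out)

    def forward(self, x, adj):
        # x: (B, R, T) BOLD series for R ROIs; adj: (R, R) normalized graph.
        b, r, t = x.shape
        h = torch.relu(self.tcn(x.reshape(b * r, 1, t))).mean(-1)  # temporal
        h = h.reshape(b, r, -1)
        return torch.relu(self.gnn(adj @ h))                       # spatial
```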
- Research Article
- 10.1109/tnnls.2023.3268766
- Oct 1, 2024
- IEEE transactions on neural networks and learning systems
Graph neural networks (GNNs) have achieved great success in many fields due to their powerful capabilities of processing graph-structured data. However, most GNNs can only be applied to scenarios where graphs are known, but real-world data are often noisy or even do not have available graph structures. Recently, graph learning has attracted increasing attention in dealing with these problems. In this article, we develop a novel approach to improving the robustness of the GNNs, called composite GNN. Different from existing methods, our method uses composite graphs (C-graphs) to characterize both sample and feature relations. The C-graph is a unified graph that unifies these two kinds of relations, where edges between samples represent sample similarities, and each sample has a tree-based feature graph to model feature importance and combination preference. By jointly learning multiaspect C-graphs and neural network parameters, our method improves the performance of semisupervised node classification and ensures robustness. We conduct a series of experiments to evaluate the performance of our method and the variants of our method that only learn sample relations or feature relations. Extensive experimental results on nine benchmark datasets demonstrate that our proposed method achieves the best performance on almost all the datasets and is robust to feature noises.
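As a loose illustration of the C-graph's two ingredients, the sketch below builds a kNN graph for sample relations and uses a learnable per-feature importance vector as a stand-in for the tree-based feature graph; this is a simplification, not the paper's construction.

```python
# Loose sketch of the two relation types from the abstract: a kNN sample
# graph, plus learnable feature importance standing in for the feature graph.
import numpy as np
import torch
import torch.nn as nn

def knn_sample_graph(X, k=10):
    """X: (N, D) features -> (N, N) binary adjacency over samples."""
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    nbrs = np.argsort(d, axis=1)[:, 1:k + 1]   # skip self at column 0
    adj = np.zeros_like(d)
    rows = np.repeat(np.arange(len(X)), k)
    adj[rows, nbrs.ravel()] = 1
    return np.maximum(adj, adj.T)              # symmetrize

class FeatureWeightedGNNLayer(nn.Module):
    def __init__(self, dim_in, dim_out):
        super().__init__()
        self.feat_imp = nn.Parameter(torch.ones(dim_in))  # learned jointly
        self.lin = nn.Linear(dim_in, dim_out)

    def forward(self, x, adj):                 # x: (N, D); adj: (N, N) tensors
        return torch.relu(self.lin(adj @ (x * self.feat_imp)))
```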
- Conference Article
- 10.1145/3539597.3570480
- Feb 27, 2023
Graph Neural Networks (GNNs) can effectively capture both the topology and attribute information of a graph, and have been extensively studied in many domains. Recently, there is an emerging trend that equips GNNs with knowledge distillation for better efficiency or effectiveness. However, to the best of our knowledge, existing knowledge distillation methods applied on GNNs all employed predefined distillation processes, which are controlled by several hyper-parameters without any supervision from the performance of distilled models. Such isolation between distillation and evaluation would lead to suboptimal results. In this work, we aim to propose a general knowledge distillation framework that can be applied on any pretrained GNN models to further improve their performance. To address the isolation problem, we propose to parameterize and learn distillation processes suitable for distilling GNNs. Specifically, instead of introducing a unified temperature hyper-parameter as most previous work did, we will learn node-specific distillation temperatures towards better performance of distilled models. We first parameterize each node's temperature by a function of its neighborhood's encodings and predictions, and then design a novel iterative learning process for model distilling and temperature learning. We also introduce a scalable variant of our method to accelerate model training. Experimental results on five benchmark datasets show that our proposed framework can be applied on five popular GNN models and consistently improve their prediction accuracies with 3.12% relative enhancement on average. Besides, the scalable variant enables 8 times faster training speed at the cost of 1% prediction accuracy.
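The core mechanism (a learned, node-specific temperature entering a standard distillation loss) can be sketched as follows; predicting the temperature from each node's own encoding and prediction is a reduced stand-in for the paper's neighborhood-based parameterization, and the usual τ² scaling is omitted for brevity.

```python
# Sketch of node-specific distillation temperatures: each node gets its own
# tau, applied to both teacher and student distributions in a KD loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NodeTemperature(nn.Module):
    def __init__(self, enc_dim, n_classes):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(enc_dim + n_classes, 32),
                                 nn.ReLU(), nn.Linear(32, 1))

    def forward(self, enc, probs):
        # Softplus keeps each node's temperature positive; +1 avoids collapse.
        return F.softplus(self.mlp(torch.cat([enc, probs], dim=-1))) + 1.0

def kd_loss(student_logits, teacher_logits, tau):
    # tau: (N, 1) per-node temperatures broadcast over class logits.
    p_t = F.softmax(teacher_logits / tau, dim=-1)
    log_p_s = F.log_softmax(student_logits / tau, dim=-1)
    return F.kl_div(log_p_s, p_t, reduction="none").sum(-1).mean()
```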
- Conference Article
- 10.1109/iccst50977.2020.00017
- Oct 1, 2020
Image composition is a vital factor in image aesthetics. In this paper, based on the photographic composition of the image itself, we combine deep aesthetic features with composition features by utilizing multi-task learning. We summarize a series of calculation formulas for the most classic photographic composition rules, such as the rule of thirds, and compute composition features and scores for the images. In the multi-task learning module, we design double-column networks with static sharing structures; features from the different networks are fused by soft parameter sharing. The composition score and the original aesthetic score of the image are used to supervise the training of the networks. Experiments on the AVA-mini dataset show that multi-task learning makes better use of the composition information of the image, and our method outperforms prior approaches on the regression task of image aesthetic quality assessment.
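One plausible formalization of a thirds-rule score of the kind summarized here is given below; the paper's exact formulas are not reproduced, so the Gaussian falloff and its width are assumptions.

```python
# Illustrative rule-of-thirds score: the closer the salient subject sits to
# a thirds intersection, the higher the score. Falloff shape is assumed.
import numpy as np

def rule_of_thirds_score(cx, cy, w, h, sigma=0.17):
    """cx, cy: subject centroid; w, h: image size. Returns a score in (0, 1]."""
    pts = [(w * a, h * b) for a in (1/3, 2/3) for b in (1/3, 2/3)]
    d = min(np.hypot(cx - px, cy - py) for px, py in pts) / np.hypot(w, h)
    return float(np.exp(-(d ** 2) / (2 * sigma ** 2)))  # Gaussian falloff
```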
- Conference Article
- 10.1109/vcip53242.2021.9675430
- Dec 5, 2021
With the development of the game industry and the popularization of mobile devices, mobile games have played an important role in people's entertainment life. The aesthetic quality of mobile game images determines the users' Quality of Experience (QoE) to a certain extent. In this paper, we propose a multi-task deep learning based method to evaluate the aesthetic quality of mobile game images in multiple dimensions (i.e. the fineness, color harmony, colorfulness, and overall quality). Specifically, we first extract the quality-aware feature representation through integrating the features from all intermediate layers of the convolution neural network (CNN) and then map these quality-aware features into the quality score space in each dimension via the quality regressor module, which consists of three fully connected (FC) layers. The proposed model is trained through a multi-task learning manner, where the quality-aware features are shared by different quality dimension prediction tasks, and the multi-dimensional quality scores of each image are regressed by multiple quality regression modules respectively. We further introduce an uncertainty principle to balance the loss of each task in the training stage. The experimental results show that our proposed model achieves the best performance on the Multi-dimensional Aesthetic assessment for Mobile Game image database (MAMG) among state-of-the-art image quality assessment (IQA) algorithms and aesthetic quality assessment (AQA) algorithms.
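The uncertainty-based task balancing mentioned here is commonly implemented in the form of Kendall et al. (2018), sketched below with learned per-task log-variances; whether the paper uses exactly this variant is an assumption.

```python
# Uncertainty-weighted multi-task loss in the common Kendall et al. (2018)
# form: each task loss is scaled by a learned precision plus a log-variance
# regularizer. The four tasks mirror the paper's quality dimensions.
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    def __init__(self, n_tasks=4):  # fineness, color harmony, colorfulness, overall
        super().__init__()
        self.log_var = nn.Parameter(torch.zeros(n_tasks))

    def forward(self, task_losses):
        # task_losses: tensor of shape (n_tasks,), e.g. per-task MSEs,
        # assembled as torch.stack([l_fine, l_harmony, l_color, l_overall]).
        precision = torch.exp(-self.log_var)
        return (precision * task_losses + self.log_var).sum()
```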