Assessing the Robustness of Cluster Solutions in Emotionally-Annotated Pictures Using Monte-Carlo Simulation Stabilized K-Means Algorithm

Marko Horvat, Alan Jović, Kristijan Burnik

doi:10.3390/make3020022


Open Access

https://doi.org/10.3390/make3020022


Abstract

Clustering is a popular machine-learning technique that is often used for exploring data with continuous variables. Two problems are commonly encountered in clustering: (1) the selection of the optimal number of clusters, and (2) the undecidability of the affiliation of border data points to neighboring clusters. We address both problems and describe how to solve them in application to affective multimedia databases. In the experiment, we used the unsupervised learning algorithm k-means and the Nencki Affective Picture System (NAPS) dataset, which contains 1356 semantically and emotionally annotated pictures. The optimal number of centroids was estimated using the empirical elbow and silhouette rules and validated using the Monte-Carlo simulation approach. Clustering with k = 1–50 centroids is reported, along with dominant picture keywords and descriptive statistical parameters. Affective multimedia databases, such as the NAPS, have been specifically designed for emotion and attention experiments. By estimating the optimal cluster solutions, it was possible to gain deeper insight into the affective features of visual stimuli. Finally, a custom software application was developed for the study in the Python programming language. The tool uses the scikit-learn library for the implementation of machine-learning algorithms, data exploration and visualization. The tool is freely available for scientific and non-commercial purposes.
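The elbow and silhouette rules mentioned in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' tool: the data are synthetic two-dimensional points standing in for affective ratings, and the helper name `scan_k`, the range of k, and all parameter values are assumptions made for illustration only.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for two-dimensional affective ratings.
# This is NOT the NAPS dataset; four well-separated groups are assumed.
X, _ = make_blobs(
    n_samples=600,
    centers=[[-5, -5], [-5, 5], [5, -5], [5, 5]],
    cluster_std=0.6,
    random_state=0,
)


def scan_k(X, k_values, seed=0):
    """Fit k-means for each k and collect inertia and silhouette scores."""
    inertias, silhouettes = {}, {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        # Inertia (within-cluster sum of squares) is the quantity
        # inspected visually for the "elbow".
        inertias[k] = model.inertia_
        # The silhouette score is only defined for at least two clusters.
        if k >= 2:
            silhouettes[k] = silhouette_score(X, model.labels_)
    return inertias, silhouettes


inertias, silhouettes = scan_k(X, range(1, 11))

# Silhouette rule: pick the k with the highest mean silhouette score.
best_k_silhouette = max(silhouettes, key=silhouettes.get)

for k in sorted(inertias):
    print(k, round(inertias[k], 1), round(silhouettes.get(k, float("nan")), 3))
print("k chosen by the silhouette rule:", best_k_silhouette)
```

The elbow rule is read off the printed inertia column as the k after which further decreases become small; the silhouette rule gives a single number to maximize. On real data the two rules can disagree, which is one reason a further validation step is useful.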

Highlights

Clustering can be broadly described as the task of dividing the population, or data points, or observations, as they are called, into a number of groups such that data points in the same groups, given a chosen set of attributes and metrics to compare them, are more similar to other data points in the same group than to those in other groups or clusters [1].
Clustering is an unsupervised process, which means that we are given unlabeled data and we need to put similar samples in one group and dissimilar samples in another, different cluster.
When applying clustering in practice, one often encounters several problems: (1) the selection of the cluster similarity measure, (2) the selection of the optimal number of clusters, (3) the undecidability of the affiliation of border data points to neighboring clusters, and (4) the lack of correct group labels, which limits the applicability of the clustering model [3,4].
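The border-point problem listed in the highlights can be made concrete with a small resampling experiment. The sketch below is one possible Monte-Carlo style procedure and is not taken from the paper: it refits k-means on bootstrap resamples, aligns each run's cluster indexes to a reference run with the Hungarian algorithm, and counts how often each point lands in each cluster. The synthetic data, the number of runs, and the 0.9 stability threshold are all assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic, partly overlapping groups (NOT the NAPS dataset).
X, _ = make_blobs(
    n_samples=400,
    centers=[[0, 0], [3, 0], [0, 3]],
    cluster_std=1.0,
    random_state=1,
)
k, n_runs = 3, 50  # assumed values, chosen for illustration
rng = np.random.default_rng(42)

# Reference solution that fixes the meaning of cluster indexes 0..k-1.
reference = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)

votes = np.zeros((X.shape[0], k), dtype=int)
for run in range(n_runs):
    # Bootstrap resample of the observations (sampling with replacement).
    sample = rng.choice(X.shape[0], size=X.shape[0], replace=True)
    model = KMeans(n_clusters=k, n_init=10, random_state=run).fit(X[sample])

    # k-means numbers its clusters arbitrarily, so match each run's
    # centroids to the nearest reference centroids (one-to-one).
    cost = cdist(model.cluster_centers_, reference.cluster_centers_)
    run_idx, ref_idx = linear_sum_assignment(cost)
    mapping = np.empty(k, dtype=int)
    mapping[run_idx] = ref_idx

    # Assign ALL original points with this run's model, in reference indexes.
    labels = mapping[model.predict(X)]
    votes[np.arange(X.shape[0]), labels] += 1

# Fraction of runs agreeing with each point's most frequent cluster.
stability = votes.max(axis=1) / n_runs
stable_labels = votes.argmax(axis=1)
border_points = np.flatnonzero(stability < 0.9)

print("points flagged as border cases:", border_points.size)
```

Points with stability close to 1 keep the same affiliation across runs, while points with lower values sit near a boundary between neighboring clusters. The centroid matching step also addresses the arbitrary numbering of clusters between runs.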

Summary

Introduction

Clustering can be broadly described as the task of dividing the population, or data points, or observations, as they are called, into a number of groups such that data points in the same groups, given a chosen set of attributes and metrics to compare them, are more similar to other data points in the same group than to those in other groups or clusters [1]. The difference in the articulation of images by pixel-defined content and semantic content is referred to as the semantic gap [11]. This coupling of semantics and affect in emotionally annotated multimedia documents can be defined as a deterministic interaction between the semantics of a document and the effect that its semantics evoke. In this regard, the practical goal of the presented research is to develop an intelligent system that can infer the emotional content of a multimedia document from the evaluation of its semantics, and estimate the dominant semantics from the affective annotations when such information is available. The conclusion is presented in the final section of the paper.

Affective Multimedia Databases

Models of Affect in Affective Multimedia Databases

The NAPS Affective Picture Database

Related Work

Unsupervised Machine Learning Methods

Disadvantages of the k-Means Algorithm and the Solutions Used

Unstable Cluster Indexes

Statistical Distribution Undecidability

Experiment and Results

The Optimal Number of Clusters

D Dis quantitatively

Conclusions


Journal: Machine Learning and Knowledge Extraction
Publication Date: May 4, 2021
Citations: 8
License type: CC BY 4.0


