Image-Label Recovery on Fashion Data Using Image Similarity from Triple Siamese Network

Debapriya Banerjee, Won Hwa Kim, Maria Kyrarini

doi:10.3390/technologies9010010

Open Access

https://doi.org/10.3390/technologies9010010

Journal: Technologies	Publication Date: Jan 21, 2021
Citations: 2	License type: CC BY 4.0

Affiliation: The University of Texas at Arlington

Abstract

Weakly labeled data are unavoidable in many areas of artificial intelligence (AI) research, where one has only a modicum of knowledge about the complete dataset. One cause of weakly labeled data is a shortage of accurately labeled data; strict privacy control or accidental loss may also lead to missing data. However, supervised machine learning (ML) requires accurately labeled data to solve a problem successfully. Data labeling is difficult and time-consuming because it requires manual work and perfect results, and sometimes the involvement of human experts (e.g., for labeled medical data). In contrast, unlabeled data are inexpensive and easily available. Because labeled training data are scarce, researchers sometimes obtain only one or a few data points per category or label, and training a supervised ML model on such a small labeled set is challenging. The objective of this research is to recover missing labels in a dataset with a semisupervised approach built on state-of-the-art ML techniques. In this work, a novel convolutional neural network-based framework is trained with a few instances of each class to perform metric learning. The dataset is then converted into a graph signal, which is recovered using a recover algorithm (RA) in the graph Fourier transform domain. The proposed approach was evaluated on a Fashion dataset for accuracy and precision and performed significantly better than graph neural networks and other state-of-the-art methods.
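
The pipeline outlined in the abstract (learned embeddings, a similarity graph, recovery of a partially observed label signal in the graph Fourier domain) can be illustrated with a minimal sketch. This is not the paper's recover algorithm (RA): it assumes a Gaussian-kernel similarity graph and a bandlimited least-squares reconstruction as a stand-in. The `embeddings` array is a placeholder for the output of a trained metric-learning network, and the function name `recover_labels` and the parameters `bandwidth` and `sigma` are hypothetical.

```python
import numpy as np

def recover_labels(embeddings, known_idx, known_labels, n_classes,
                   bandwidth=None, sigma=1.0):
    """Illustrative bandlimited recovery of missing labels on a graph.

    Assumption: the label signal is smooth on the similarity graph, so it
    can be approximated with the lowest-frequency graph Fourier modes.
    """
    # Pairwise squared distances between embeddings -> Gaussian similarity
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops

    # Combinatorial graph Laplacian; its eigenvectors form the graph
    # Fourier basis, ordered from low to high frequency by eigh
    L = np.diag(W.sum(axis=1)) - W
    _, U = np.linalg.eigh(L)

    # Keep only the lowest-frequency modes (assumed bandwidth)
    k = bandwidth or n_classes
    Uk = U[:, :k]

    # One-hot label signal observed only at the labeled nodes
    Y = np.eye(n_classes)[known_labels]

    # Fit Fourier coefficients to the observed samples (least squares)
    coeffs, *_ = np.linalg.lstsq(Uk[known_idx], Y, rcond=None)

    # Reconstruct class scores on every node and pick the best class
    scores = Uk @ coeffs
    return scores.argmax(axis=1)
```

Calling `recover_labels(emb, [0, 5], [0, 1], n_classes=2)` on an array `emb` of two well-separated clusters, with one labeled point per cluster, would propagate each known label to the rest of its cluster. The dense pairwise-distance step is only practical for small datasets; a sparse k-nearest-neighbor graph would be the usual substitute at scale.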

Highlights

Supervised learning [1,2] is an approach in Machine Learning (ML) for classification [3] or regression tasks [4], where a set of labeled data is used to train a prediction model. In practice, obtaining sufficient labeled data for training a model can be difficult. A strict privacy-control policy may restrict one from obtaining labeled data, or human error may cause false or missing labels in the dataset.
The proposed approach was evaluated on a Fashion dataset for accuracy and precision and performed significantly better than graph neural networks and other state-of-the-art methods
We introduce a novel approach for SSL in the Fashion dataset, where we had a limited amount of labeled data to train our model

Summary

Introduction

Supervised learning [1,2] is an approach in Machine Learning (ML) for classification [3] or regression tasks [4], where a set of labeled data is used to train a prediction model. In practice, obtaining sufficient labeled data for training a model can be difficult. A strict privacy-control policy may restrict one from obtaining labeled data, or human error may cause false or missing labels in the dataset. There may also not be enough budget to have all information labeled by human annotators, especially when expert knowledge is needed for the annotations. Because finding the category of each product is time-consuming, it is important to develop a framework that can automatically categorize new data on the basis of a small amount of labeled data.

Objectives

Methods

Results

Conclusion