An Improved Method of Identifying Mislabeled Data and the Mislabeled Data in MNIST and CIFAR-10

Xinbin Zhang

doi:10.2139/ssrn.3080736

Journal: SSRN
Publication Date: Nov 30, 2017
Citations: 1

Affiliation: Beijing University of Posts and Telecommunications

#Mislabeled Data #Ground True Data

Abstract

Object classification is an important part of machine learning, and the quality of the training data plays an important role in it. Some datasets, such as MNIST and CIFAR-10, are regarded as ground-truth data, and accuracy on these two datasets is an important criterion for evaluating machine learning models and algorithms. A number of techniques for detecting mislabeled data have been proposed; however, none of them has been reproduced on MNIST and CIFAR-10. In this paper I use an improved method to identify mislabeled data in MNIST and CIFAR-10 and find 675 errors in MNIST and 118 errors in CIFAR-10. After the mislabeled instances are corrected, accuracy increases. Consequently, the current list of state-of-the-art results on various datasets needs to be reproduced with the corrected datasets.

Full Text