Abstract

The sixty thousand images in the training sample of the MNIST database of handwritten digits are clustered with a Kohonen neural network. For each digit, the optimal number of clusters (at most 50) is determined. The distance between objects (images of handwritten digits) is measured with the Euclidean norm. The correctness of the clusters is checked against the MNIST test sample, which contains ten thousand images. Images from the test sample are found to fall into a cluster of the "correct digit" with a probability of more than 90%. An F-measure is calculated for each digit to evaluate the clusters. The best F-measures are obtained for digits 0 and 1 (F-measure of 0.974), and the worst for digit 9 (F-measure of 0.903). The clusters are also analyzed to draw conclusions about possible recognition errors of the Kohonen neural network. Intersections of clusters for images of handwritten digits are constructed, and examples are given both of such intersections and of images that the network recognizes incorrectly.
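The pipeline summarized above (per-digit clustering, Euclidean nearest-cluster assignment, per-digit F-measure) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain winner-take-all Kohonen layer without a neighborhood function, uses small synthetic 4-dimensional blobs in place of MNIST images, and the names `train_kohonen`, `nearest_label`, and `f_measure`, as well as all hyperparameters, are hypothetical.

```python
import numpy as np

def train_kohonen(data, n_units, epochs=20, lr0=0.5, seed=0):
    """Winner-take-all Kohonen layer (assumed variant, no neighborhood)."""
    rng = np.random.default_rng(seed)
    # Initialise prototypes from randomly chosen training samples.
    picks = rng.choice(len(data), size=n_units, replace=False)
    weights = data[picks].astype(float)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for idx in rng.permutation(len(data)):
            x = data[idx]
            lr = lr0 * (1.0 - step / total_steps)  # linearly decaying rate
            # Winner = prototype closest to x in the Euclidean norm.
            winner = np.argmin(np.linalg.norm(weights - x, axis=1))
            weights[winner] += lr * (x - weights[winner])
            step += 1
    return weights

def nearest_label(x, prototypes, proto_labels):
    """Label of the cluster prototype nearest to x (Euclidean norm)."""
    dists = np.linalg.norm(prototypes - x, axis=1)
    return proto_labels[np.argmin(dists)]

def f_measure(y_true, y_pred, label):
    """Harmonic mean of precision and recall for one class."""
    tp = np.sum((y_pred == label) & (y_true == label))
    fp = np.sum((y_pred == label) & (y_true != label))
    fn = np.sum((y_pred != label) & (y_true == label))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Synthetic stand-in for two "digits": well-separated Gaussian blobs.
rng = np.random.default_rng(1)
centers = {0: np.zeros(4), 1: np.full(4, 10.0)}
train = {d: c + rng.normal(size=(100, 4)) for d, c in centers.items()}

# Cluster each class separately, then pool the labelled prototypes.
units_per_class = 3  # assumed; the paper selects up to 50 per digit
protos = np.vstack([train_kohonen(train[d], units_per_class) for d in centers])
proto_labels = np.repeat(list(centers), units_per_class)

test_x = np.vstack([c + rng.normal(size=(50, 4)) for c in centers.values()])
test_y = np.repeat(list(centers), 50)
pred = np.array([nearest_label(x, protos, proto_labels) for x in test_x])
scores = {d: f_measure(test_y, pred, d) for d in centers}
print(scores)
```

With real data, each row of `data` would be a flattened 28×28 MNIST image (784 values), and the number of units per digit would be chosen by whatever selection criterion the full text describes.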
