Abstract

In fields like security risk analysis, Fast Moving Consumer Goods, Internet of Things, or the banking sector, it is necessary to deal with large datasets containing a great list of variables. In these situations, the analysis becomes intricate and computationally expensive, so data reduction techniques play an important role. Prototype generation methods provide a reduced dataset with the same properties as the original. GSOMs (growing self-organizing maps) reduce the data size without the need for prefixing the number of neurons needed to represent the input space. To the best of the authors’ knowledge, this is the first time that the GSOM is applied for reduction and generation of prototypes, posing an advantage over their predecessors, the SOMs (self-organizing maps), which do not have the automatic growth feature. This work addresses the use of a GSOM to reduce the number of prototypes to use in a 1-NN (1 nearest neighbor) classifier. The proposed methodology is applied to an income dataset for testing and a large bank dataset that contain classifications into two different groups. The 1-NN classifier is used to obtain predictions using the nodes of the GSOM as prototypes. This article demonstrates that GSOMs save a significant amount of time in obtaining nearly the same validation results as SOMs by comparing the classifications obtained in the bank dataset. The results show data reductions of more than 99%, and accuracies greater than 80% for the income dataset and 74% for the bank dataset.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call