Quality assessment of data discrimination using self-organizing maps

Alexey Mekler, Dmitri Schwarz

doi:10.1016/j.jbi.2014.06.001

Open Access

Abstract

Motivation: An important aspect of the data classification problem is selecting the most appropriate features. The set of variables should be small and, at the same time, should provide reliable discrimination of the classes. A method for evaluating discriminating power that enables comparison between different sets of variables is therefore useful in the search for such a set.

Results: A new approach to feature selection is presented. Two methods for evaluating the discriminating power of a feature set are suggested. Both methods use self-organizing maps (SOMs) and newly introduced exponents of the degree of data clusterization on the SOM. The first method is based on comparing intraclass and interclass distances on the map. The second method evaluates the relative number of nearest neighbors of each best matching unit (BMU) that belong to the same class. Both methods make it possible to evaluate the discriminating power of a feature set even when that set provides nonlinear discrimination of the classes.

Availability: The program code for the current algorithms and the supporting data files can be downloaded for free at http://mekler.narod.ru/Science/Articles_support.html.
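The two evaluation ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the exact definitions of their exponents. The sketch assumes a SOM has already been trained and that each sample has been mapped to the grid coordinates of its BMU. The function names `intra_inter_ratio` and `same_class_neighbor_fraction`, the use of Euclidean distance on the grid, and the neighbor count `k` are all hypothetical choices made for illustration.

```python
import numpy as np


def _pairwise_grid_distances(bmu_xy):
    """Euclidean distances between all pairs of BMU grid coordinates."""
    xy = np.asarray(bmu_xy, dtype=float)
    diff = xy[:, None, :] - xy[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def intra_inter_ratio(bmu_xy, labels):
    """Mean same-class distance divided by mean different-class distance.

    Assumed illustrative measure: smaller values suggest that classes
    occupy more compact, better separated regions of the map.
    """
    labels = np.asarray(labels)
    dist = _pairwise_grid_distances(bmu_xy)
    same = labels[:, None] == labels[None, :]
    # Count each unordered pair once and skip self-pairs.
    upper = np.triu(np.ones_like(same, dtype=bool), k=1)
    intra = dist[same & upper].mean()
    inter = dist[~same & upper].mean()
    return intra / inter


def same_class_neighbor_fraction(bmu_xy, labels, k=3):
    """Fraction of each sample's k nearest map neighbors sharing its class.

    Assumed illustrative measure: values near 1 suggest well-clustered
    classes on the map.
    """
    labels = np.asarray(labels)
    dist = _pairwise_grid_distances(bmu_xy)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor
    nearest = np.argsort(dist, axis=1, kind="stable")[:, :k]
    return float((labels[nearest] == labels[:, None]).mean())


if __name__ == "__main__":
    # Hypothetical BMU positions for six samples from two classes.
    bmu_xy = [(0, 0), (0, 1), (1, 0), (8, 8), (8, 9), (9, 8)]
    labels = [0, 0, 0, 1, 1, 1]
    print(intra_inter_ratio(bmu_xy, labels))
    print(same_class_neighbor_fraction(bmu_xy, labels, k=2))
```

Under these assumptions, candidate feature sets could be compared by training one SOM per set and ranking the sets by either value. Both measures depend only on positions on the map, which is why such a comparison does not require the classes to be linearly separable in the original feature space.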

Full Text