Abstract

Metric algorithms, which base their decisions on the distance from one observation to another, have a number of advantages: they suit many types of problems and their results are easy to interpret. For this reason they are widely used in credit risk modeling, non-destructive quality control of products, medical diagnostics, geology, and many other practical areas. The metric algorithm most common in practice is the k-nearest neighbors method. At the same time, one of the key problems of metric algorithms is dimensionality, since each decision is made on the basis of all observations in the training sample. In addition, all variables usually carry the same weight when the distance is calculated, so the quality of the algorithm drops as the number of features grows. The article discusses a new machine learning method for classification problems: a metric classifier with selection of feature weights, which largely resolves both issues. Nine algorithms were used to optimize the objective function, and the classification quality obtained with each was checked on 3 problems from the UCI repository. Based on this comparison, the truncated Newton method was chosen to build the new metric classifier. The quality of the new classifier was then tested on 8 datasets from the same repository and compared with that of the classical nearest neighbor method. On problems with a large number of features, the new classifier achieves higher quality than the classical approach. Dataset characteristics and calculation results are presented in the corresponding tables.
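The idea of selecting feature weights for a metric classifier can be illustrated with a minimal Python sketch. The abstract does not specify the objective function, so the smooth leave-one-out loss below (similar in spirit to neighbourhood components analysis) is an assumption, as are the function names `weighted_sq_dists`, `soft_loo_loss`, `fit_feature_weights`, and `predict_1nn`. The weights are fitted with SciPy's `TNC` solver, which is one available truncated Newton implementation and not necessarily the one the authors used.

```python
import numpy as np
from scipy.optimize import minimize


def weighted_sq_dists(A, B, w):
    """Squared Euclidean distances between rows of A and B,
    with each feature scaled by a non-negative weight in w."""
    diff = A[:, None, :] - B[None, :, :]
    return (w * diff ** 2).sum(axis=2)


def soft_loo_loss(w, X, y):
    """Assumed objective (not taken from the article): mean negative log of
    the probability that a training point softly selects a same-class
    neighbour, with the point itself excluded (leave-one-out)."""
    d = weighted_sq_dists(X, X, w)
    np.fill_diagonal(d, np.inf)                   # a point cannot pick itself
    logits = -d
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    same_class = y[:, None] == y[None, :]
    p_correct = (p * same_class).sum(axis=1)
    return -np.log(p_correct + 1e-12).mean()


def fit_feature_weights(X, y):
    """Fit one non-negative weight per feature with a truncated Newton solver.
    The gradient is approximated by finite differences, which is only
    practical for small problems."""
    n_features = X.shape[1]
    result = minimize(
        soft_loo_loss,
        x0=np.ones(n_features),                   # start from equal weights
        args=(X, y),
        method="TNC",
        bounds=[(0.0, None)] * n_features,
    )
    return result.x


def predict_1nn(X_train, y_train, X_new, w):
    """Classify each new point by its nearest training point
    under the weighted distance."""
    d = weighted_sq_dists(X_new, X_train, w)
    return y_train[d.argmin(axis=1)]
```

With equal weights `predict_1nn` reduces to the classical nearest neighbor rule; a weight near zero effectively removes the corresponding feature from the distance, which is how weighting can counteract the loss of quality as uninformative features accumulate.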
