Abstract

Most machine learning approaches stem from applying the principle of mean squared distance minimization, which relies on computationally efficient quadratic optimization methods. However, when faced with high-dimensional and noisy data, quadratic error functionals demonstrate many weaknesses, including high sensitivity to contaminating factors and the curse of dimensionality. Therefore, many recent applications in machine learning have exploited the properties of non-quadratic error functionals based on the L1 norm, or even of sub-linear potentials corresponding to quasinorms Lp (0<p<1). The downside of these approaches is an increase in the computational cost of optimization. So far, no approach has been suggested for dealing with arbitrary error functionals in a flexible and computationally efficient framework. In this paper, we develop a theory and basic universal data approximation algorithms (k-means, principal components, principal manifolds and graphs, regularized and sparse regression) based on piece-wise quadratic error potentials of subquadratic growth (PQSQ potentials). We develop a new and universal framework for minimizing arbitrary sub-quadratic error potentials, using an algorithm with guaranteed fast convergence to a local or global error minimum. The theory of PQSQ potentials is based on the notion of the cone of minorant functions and represents a natural approximation formalism based on the application of min-plus algebra. The approach can be applied in most existing machine learning methods, including methods of data approximation and regularized and sparse regression, leading to an improvement in the computational cost/accuracy trade-off. We demonstrate on synthetic and real-life datasets that PQSQ-based machine learning methods achieve computational performance orders of magnitude faster than the corresponding state-of-the-art methods, with similar or better approximation accuracy.
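To make the construction concrete, here is a minimal sketch, not the authors' reference implementation, of building and evaluating such a potential for the L1 majorant u(x) = |x|; the function names and the threshold grid are our illustrative choices. Each quadratic piece b_k + a_k x^2 matches u at two consecutive thresholds, the last piece is a constant (which gives the subquadratic, trimmed growth), and the potential is evaluated as the pointwise minimum over this finite set of quadratic minorants, i.e. in the min-plus form mentioned above.

```python
import numpy as np

def pqsq_coefficients(u, thresholds):
    """Coefficients (a_k, b_k) of quadratic pieces b_k + a_k * x**2 that match
    the majorant u at consecutive thresholds; the last piece is constant."""
    r = np.asarray(thresholds, dtype=float)    # 0 = r_0 < r_1 < ... < r_p
    a, b = np.zeros(len(r)), np.zeros(len(r))
    for k in range(len(r) - 1):
        a[k] = (u(r[k + 1]) - u(r[k])) / (r[k + 1] ** 2 - r[k] ** 2)
        b[k] = u(r[k]) - a[k] * r[k] ** 2
    a[-1], b[-1] = 0.0, u(r[-1])               # flat tail beyond the last threshold
    return a, b

def pqsq(x, a, b):
    """Evaluate the potential as the pointwise minimum of its quadratic
    minorants (the min-plus representation)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return (b[:, None] + a[:, None] * x[None, :] ** 2).min(axis=0)

# Example: PQSQ approximation of the L1 potential u(x) = |x|
a, b = pqsq_coefficients(abs, [0.0, 1.0, 2.0, 3.0])
print(pqsq([-4.0, -1.0, 0.0, 0.5, 2.0, 4.0], a, b))
# ≈ [3, 1, 0, 0.25, 2, 3]: matches |x| at the thresholds, lies slightly
# below it in between, and saturates at u(r_p) = 3 beyond the last threshold
```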

Highlights

  • Modern machine learning and artificial intelligence methods are revolutionizing many fields of science today, such as medicine, biology, engineering, high-energy physics and sociology, where large amounts of data have been collected due to the emergence of new high-throughput computerized technologies

  • The use of quadratic potentials can be drastically compromised by such circumstances; as a result, many practical and theoretical efforts have been made to exploit the properties of non-quadratic error potentials, which can be more appropriate in certain contexts

  • We introduce a rich family of piecewise quadratic potentials of subquadratic growth (PQSQ potentials), suggest a general approach for their optimization, and prove the convergence of a simple iterative algorithm in the most general case (a minimal sketch of such an iteration is given below)

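To make the iterative scheme from the last highlight concrete, here is a hedged sketch of the split-and-minimize iteration for the simplest task, the PQSQ "mean value" of a one-dimensional sample. The function name, the median initialization and the threshold grid are our illustrative choices, not the authors' reference implementation. Each iteration fixes the quadratic piece active for every data point and then solves the resulting weighted quadratic problem in closed form (the offsets b_k do not affect the minimizer), so the error never increases and, with finitely many piece assignments, the process converges.

```python
import numpy as np

def pqsq_mean(x, a, thresholds, max_iter=100):
    """Split-and-minimize iteration for the PQSQ mean of a 1-D sample:
    (1) fix, for every point, the quadratic piece its deviation falls into;
    (2) minimize the resulting weighted quadratic functional in closed form."""
    x = np.asarray(x, dtype=float)
    r = np.asarray(thresholds, dtype=float)    # 0 = r_0 < r_1 < ... < r_p
    mu = np.median(x)                          # robust start (our choice)
    for _ in range(max_iter):
        k = np.searchsorted(r, np.abs(x - mu), side='right') - 1
        w = a[k]                               # curvature of each active piece
        if w.sum() == 0.0:                     # every point on the flat tail
            break
        mu_new = (w * x).sum() / w.sum()       # weighted-mean minimization step
        if np.isclose(mu_new, mu):
            break
        mu = mu_new
    return mu

# Quadratic-piece curvatures of the L1-like potential from the sketch above
a = np.array([1.0, 1/3, 0.2, 0.0])
print(pqsq_mean([1.0, 2.0, 3.0, 100.0], a, [0.0, 1.0, 2.0, 3.0]))
# ≈ 2.29: the outlier lands on the flat tail and gets zero weight
```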

Summary

Introduction

Modern machine learning and artificial intelligence methods are revolutionizing many fields of science today, such as medicine, biology, engineering, high-energy physics and sociology, where large amounts of data have been collected due to the emergence of new high-throughput computerized technologies. Properties of L1 metrics [1, 2] have found numerous applications in bioinformatics [3], and L1 norm-based methods of dimension reduction are of great use in automated image analysis [4]. Not surprisingly, these approaches come with drastically increased computational cost, for example when they rely on linear programming optimization techniques, which are substantially more expensive than mean squared error-based methods. If a given arbitrary potential (such as one based on the L1 norm or a fractional quasinorm) can be approximated by a piecewise quadratic function, this should lead to relatively efficient and simple optimization algorithms. It appears that only potentials of quadratic or subquadratic growth are possible in this approach: these are the most useful ones in data analysis. As another application of the PQSQ-based framework in machine learning, we develop PQSQ-based regularized and sparse regression (imitating the properties of lasso and elastic net).
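As a loose illustration of how this regularized and sparse regression could look, here is a hedged sketch under our own assumptions: the PQSQ penalty on each coefficient is locally quadratic, so every iteration reduces to a ridge-like problem with per-coefficient weights, and coefficients that fall within a small radius of zero are trapped there and kept exactly at zero, in the spirit of the 'black hole' trick named in the section list below. All names and the parameters lam and eps are illustrative, not the paper's notation.

```python
import numpy as np

def pqsq_sparse_regression(X, y, a, thresholds, lam=1.0, eps=5e-2, max_iter=50):
    """PQSQ-penalized least squares, sketched as iteratively reweighted ridge:
    the penalty piece active for each coefficient fixes its ridge weight, and
    coefficients inside the radius eps around zero are frozen at exactly zero."""
    r = np.asarray(thresholds, dtype=float)
    d = X.shape[1]
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # start from the OLS solution
    active = np.ones(d, dtype=bool)
    for _ in range(max_iter):
        k = np.searchsorted(r, np.abs(beta), side='right') - 1
        w = lam * a[k]                            # curvature of each active piece
        A = X[:, active]
        H = A.T @ A + np.diag(w[active])          # weighted ridge normal matrix
        beta_new = np.zeros(d)
        beta_new[active] = np.linalg.solve(H, A.T @ y)
        beta_new[np.abs(beta_new) < eps] = 0.0    # 'black hole': trap near zero
        active &= beta_new != 0.0
        if not active.any() or np.allclose(beta_new, beta):
            beta = beta_new
            break
        beta = beta_new
    return beta

# Toy usage: y depends only on the first two of five features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([2.0, -1.0, 0.0, 0.0, 0.0]) + 0.1 * rng.normal(size=200)
a = np.array([1.0, 1/3, 0.2, 0.0])                # L1-like PQSQ piece curvatures
print(pqsq_sparse_regression(X, y, a, [0.0, 1.0, 2.0, 3.0], lam=5.0))
# the three irrelevant coefficients are driven exactly to zero
```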

Definition of the PQSQ potential
Basic approach for optimization
Mean value and k-means clustering in PQSQ approximation measure
Nonlinear methods
Regularizing linear regression with PQSQ potential
Introducing sparsity by ‘black hole’ trick
Practical choices of parameters
Implementation
Conclusion
Appendix B. Number of non-zero coefficients