Critical properties of the SAT/UNSAT transitions in the classification problem of structured data

Mauro Pastore

doi:10.1088/1742-5468/ac312b

Abstract

The classification of structured data can be approached with two strategies: supervised learning, which starts from a labeled training set, and unsupervised learning, which uses only the structure of the patterns in the dataset to find a classification compatible with it. The two strategies can be interpreted as extreme cases of a semi-supervised approach to learning multi-view data, which is relevant for applications. In this paper I study, within replica theory, the critical properties of the two storage problems associated with these tasks, in the case of linear binary classification of doublets of points sharing the same label. While the first approach presents a SAT/UNSAT transition in a (marginally) stable replica-symmetric phase, in the second the satisfiability line lies in a phase with full replica-symmetry breaking. A similar behavior in the problem of learning with a margin is also pointed out.

Full Text