Abstract
Fundamental machine learning theory shows that different samples contribute unequally to both the learning and testing processes. Recent studies on deep neural networks (DNNs) suggest that such sample differences are rooted in the distribution of intrinsic pattern information, i.e., sample regularity. Motivated by recent discoveries in network memorization and generalization, we propose a pair of sample regularity measures with a formulation-consistent representation for both processes. Specifically, the cumulative binary training/generalizing loss (CBTL/CBGL), defined as the cumulative number of correct classifications of a training/test sample over the training phase, is proposed to quantify the stability of the memorization-generalization process, while forgetting/mal-generalizing events (ForEvents/MgEvents), i.e., misclassifications of previously learned or generalized samples, are used to represent the uncertainty of sample regularity with respect to the optimization dynamics. The effectiveness and robustness of the proposed measures under mini-batch stochastic gradient descent (SGD) optimization are validated through sample-wise analyses. Further applications to training/test sample selection show that the proposed measures, which share a unified computing procedure, can benefit both tasks.
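The sketch below is not the authors' code; it is a minimal illustration, under the assumption that per-epoch correctness of each sample is recorded, of how per-sample CBTL/CBGL and forgetting (or mal-generalizing) event counts could be tallied. The function and variable names are ours.

```python
# Hedged sketch (not the paper's implementation): per-sample CBTL/CBGL and
# ForEvents/MgEvents from a binary correctness record, where
# correct[t, i] == True means sample i was classified correctly after epoch t.
import numpy as np


def regularity_measures(correct: np.ndarray):
    """correct: (num_epochs, num_samples) boolean array of per-epoch correctness.

    Returns:
        cumulative: (num_samples,) cumulative number of correct classifications
            over the training phase (CBTL on training samples, CBGL on test samples).
        events: (num_samples,) number of epochs where a previously correct sample
            becomes misclassified (ForEvents on training samples, MgEvents on test samples).
    """
    correct = correct.astype(bool)
    cumulative = correct.sum(axis=0)
    # An event occurs when a sample is correct at epoch t-1 but incorrect at epoch t.
    events = (correct[:-1] & ~correct[1:]).sum(axis=0)
    return cumulative, events


if __name__ == "__main__":
    # Toy record: 5 epochs, 3 samples.
    record = np.array([
        [1, 0, 1],
        [1, 0, 0],  # sample 2 is forgotten here
        [1, 1, 1],
        [0, 1, 1],  # sample 0 is forgotten here
        [1, 1, 1],
    ], dtype=bool)
    print(regularity_measures(record))  # -> (array([4, 3, 4]), array([1, 0, 1]))
```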