Abstract

High-dimensional and sparse (HiDS) data with non-negativity constraints are common in industrial applications such as recommender systems. Such data can be modeled as an HiDS matrix, from which non-negative latent factor analysis (NLFA) is highly effective in extracting useful features. Performing NLFA on an HiDS matrix is ill-posed and therefore requires an effective regularization scheme to avoid overfitting. Current models mostly adopt a standard L2 scheme, which does not consider the imbalanced distribution of known data in an HiDS matrix. To address this, this paper proposes an instance-frequency-weighted regularization (IR) scheme for NLFA on HiDS data. It sets the regularization effect on each latent factor according to its relevant instance count, i.e., its instance-frequency, which clearly describes the known data distribution of an HiDS matrix. By doing so, it achieves fine-grained modeling of regularization effects. Experimental results on HiDS matrices from industrial applications demonstrate that, compared with an L2 scheme, the IR scheme enables the resultant model to achieve higher accuracy in estimating the missing data of an HiDS matrix.
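
To make the contrast between the two schemes concrete, the following is a minimal sketch, not the paper's exact formulation. It assumes that the instance-frequency of a row or column is simply the number of known entries it contains, and that the IR penalty scales each latent factor vector's squared norm by that count. The function names (`instance_frequencies`, `l2_penalty`, `ir_penalty`), the coefficient `lam`, and the toy data are all hypothetical.

```python
import numpy as np

def instance_frequencies(rows, cols, n_rows, n_cols):
    """Count the known entries attached to each row and each column."""
    row_freq = np.bincount(rows, minlength=n_rows)
    col_freq = np.bincount(cols, minlength=n_cols)
    return row_freq, col_freq

def l2_penalty(P, Q, lam):
    """Standard L2 penalty: every latent factor vector is weighted equally."""
    return lam * (np.sum(P ** 2) + np.sum(Q ** 2))

def ir_penalty(P, Q, lam, row_freq, col_freq):
    """Assumed IR-style penalty: each factor vector is weighted by its instance count."""
    row_term = np.sum(row_freq[:, None] * P ** 2)
    col_term = np.sum(col_freq[:, None] * Q ** 2)
    return lam * (row_term + col_term)

# Toy HiDS matrix: 3 rows, 3 columns, 5 known entries given as (row, col) indices.
rows = np.array([0, 0, 0, 1, 2])
cols = np.array([0, 1, 2, 0, 1])

# Non-negative latent factors with 2 dimensions (all ones for easy inspection).
P = np.ones((3, 2))
Q = np.ones((3, 2))

row_freq, col_freq = instance_frequencies(rows, cols, n_rows=3, n_cols=3)
uniform = l2_penalty(P, Q, lam=0.1)
weighted = ir_penalty(P, Q, lam=0.1, row_freq=row_freq, col_freq=col_freq)
```

In this sketch, row 0 holds three of the five known entries, so its factor vector contributes three times as much to the penalty as the vectors of rows 1 and 2, whereas the L2 penalty treats all three rows identically. Whether the actual scheme uses the raw count or some function of it is not stated in the abstract.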

