Abstract

We consider the problem of collaborative filtering from a channel coding perspective. We model the underlying rating matrix as a finite-alphabet matrix with block-constant structure. The observations are obtained from this underlying matrix through a discrete memoryless channel with a noisy part representing noisy user behavior and an erasure part representing missing data. Moreover, the clusters over which the underlying matrix is constant are {\it unknown}. We establish a sharp threshold result for this model: if the largest cluster size is smaller than $C_1 \log(mn)$ (where the rating matrix is of size $m \times n$), then the underlying matrix cannot be recovered by any estimator, but if the smallest cluster size is larger than $C_2 \log(mn)$, then we exhibit a polynomial-time estimator with diminishing probability of error. In the case of uniform cluster size, we identify not only the order of the threshold but also the constant.
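
To make the observation model concrete, the following is a minimal simulation sketch, not the paper's construction. It assumes a binary alphabet, a symmetric flip probability for the noisy part, independent erasures, and uniformly random cluster assignments; the abstract fixes none of these. The function name `simulate_observations` and all of its parameters are hypothetical, and erased entries are marked with `-1` purely as a convention of this sketch.

```python
import numpy as np


def simulate_observations(m, n, row_clusters, col_clusters,
                          flip_prob, erasure_prob, seed=0):
    """Draw a block-constant binary matrix and pass it through a
    noise-plus-erasure channel (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)

    # Cluster labels for rows and columns; the estimator would not see these.
    row_labels = rng.integers(0, row_clusters, size=m)
    col_labels = rng.integers(0, col_clusters, size=n)

    # One rating per (row cluster, column cluster) block.
    # Assumption: binary alphabet {0, 1}.
    block_values = rng.integers(0, 2, size=(row_clusters, col_clusters))
    X = block_values[row_labels[:, None], col_labels[None, :]]

    # Noisy part: each entry is flipped independently with flip_prob.
    flips = rng.random((m, n)) < flip_prob
    Y = np.where(flips, 1 - X, X)

    # Erasure part: each entry is independently replaced by -1 (missing).
    erased = rng.random((m, n)) < erasure_prob
    Y = np.where(erased, -1, Y)

    return X, Y, row_labels, col_labels


if __name__ == "__main__":
    X, Y, rows, cols = simulate_observations(
        m=8, n=10, row_clusters=2, col_clusters=3,
        flip_prob=0.1, erasure_prob=0.5,
    )
    print("fraction erased:", np.mean(Y == -1))
```

In this sketch, recovery would mean reconstructing `X` from `Y` alone, without access to `row_labels` or `col_labels`; the threshold result concerns how large the clusters must be relative to $\log(mn)$ for that to be possible.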

