Abstract

This work addresses the problem of “user disambiguation”—estimating the likelihood of each member of a small group using a shared account or device. The specific focus is on television set-top box (STB) viewership data in multiperson households, in which it is impossible to tell with certainty which household members watch what. We formulate user disambiguation as a predictive problem and develop a solution for estimating the likelihood that each individual in a multiperson household watches each TV segment. This method learns priors for viewership in single-person households and then adapts them to the specifics of each multiperson household’s viewership history. We formalize two ad hoc heuristics that are currently used in industry (and research) for estimating audience composition of STB data and conduct a comparative analysis using three data sources: simulated data, real large-scale viewership data, and fully labeled panel data. The results show that our method has superior performance. This approach has practical value for both advertisers and researchers who seek better understanding of TV viewership. It also has applications beyond TV advertising, such as detecting the sharing of streaming passwords among multiple households or any other situation in which multiple users share devices or accounts.
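
As a rough illustration of the predictive framing (not the authors' method), the sketch below takes a prior probability that each household member watches a given segment and conditions on the one thing the STB records: that the box was tuned in. The function name, the demographic labels, the prior values, and the assumption that members watch independently are all hypothetical choices made for this example. The sketch also omits the step in which priors are adapted to each household's viewership history.

```python
from math import prod


def viewer_likelihoods(priors):
    """Return P(member watched | box was tuned in) for each household member.

    `priors` maps a member label to that member's prior probability of
    watching the segment (hypothetically learned from single-person
    households). Assumes members watch independently and that the box is
    tuned in exactly when at least one member is watching.
    """
    for member, p in priors.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"prior for {member!r} must be in [0, 1]")

    # Probability that nobody in the household watches the segment.
    p_nobody = prod(1.0 - p for p in priors.values())
    p_tuned = 1.0 - p_nobody
    if p_tuned == 0.0:
        raise ValueError("all priors are zero, yet the box was tuned in")

    # Bayes' rule: watching implies the box is tuned in, so the joint
    # probability of "member watches and box is tuned in" is just p.
    return {member: p / p_tuned for member, p in priors.items()}


# Hypothetical two-person household with made-up priors for one segment.
household = {"adult_35_44": 0.30, "teen_13_17": 0.10}
posterior = viewer_likelihoods(household)
```

The likelihoods need not sum to one, because more than one member may be watching the same segment.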
