Abstract
The availability of large quantities of online data makes it possible to isolate key user segments, based on demographics and behaviors, for many online systems. However, it remains an open question how organizations can best leverage this user information in communication and decision-making. The automatic generation of personas to represent customer segments is an interactive design technique with considerable potential for product development, policy decisions, and content creation. A persona is an imaginary but characteristic person who represents a customer, audience, or user segment whose members share common behavioral attributes or demographics. A persona is generally developed as a detailed profile narrative, typically one or two pages long, about a representative but imaginary individual who embodies a collection of users with similar behaviors or demographics. To make the fictitious individual appear as a real person to system developers and other decision-makers, the persona profile usually comprises a variety of demographic and behavioral details, such as socioeconomic status, gender, hobbies, family members, friends, and possessions, among other data. Along with these details, persona profiles typically address the goals, needs, wants, frustrations, and other attitudinal aspects of the fictitious individual that are relevant to the product being designed and developed. Personas have typically been created using manual, qualitative methods and have remained fairly static once created. In this research, we demonstrate a data-driven approach for creating and validating personas in real time, based on automated analysis of actual user data.
Using a variety of data collection sites and research partners from various verticals (digital content, non-profits, retail, service, etc.), we are specifically interested in understanding the users of these organizations by identifying (1) whom the organizations are reaching (i.e., the user segments) and (2) what content is associated with each user segment. Focusing on one aspect of user behavior, we collect tens of millions of instances of user interaction with online content, specifically examining the topics of the content interacted with. We then decompose the interaction patterns, discover related impactful demographics, and add personal properties; this approach creates personas, based on these behavioral and demographic aspects, that represent the core user segments for each organization. We conduct analysis to remove outliers and use non-negative matrix factorization to identify first the meaningful behavioral patterns and then the impactful demographic groupings. We then demonstrate how these findings can be leveraged to generate real-time personas based on actual user data to facilitate organizational communication and decision-making. The results show that such personas can be developed in near real-time and provide insights into user segmentation, competitive marketing, topical interests, and preferred system features. Overall, the research implication is that personas representing the core user groups of online products can be generated in near real-time.
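As an illustration of the decomposition step, the following is a minimal sketch of non-negative matrix factorization applied to a matrix of interaction counts. The abstract does not specify the solver, the matrix layout, or any data values, so everything here is an assumption made for illustration: the multiplicative update rule (Lee and Seung), the choice of demographic groups as rows and content topics as columns, the hypothetical group and topic labels, and the toy counts.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(v, rank, iterations=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (n x m) into w (n x rank) and h (rank x m)
    with multiplicative updates that reduce the squared reconstruction error."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start, so every update keeps entries non-negative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def frobenius_norm(a):
    return sum(x * x for row in a for x in row) ** 0.5

def dominant_pattern(w_row, h):
    """Index of the pattern contributing most to one row's reconstruction.
    Weighting by the pattern's total mass makes the choice scale-invariant."""
    return max(range(len(h)), key=lambda k: w_row[k] * sum(h[k]))

# Hypothetical toy data: rows are demographic groups, columns are content
# topics, values are interaction counts. None of this comes from the study.
GROUPS = ["group A", "group B", "group C", "group D"]
TOPICS = ["news", "sports", "cooking"]
V = [
    [90, 5, 40],
    [80, 10, 45],
    [5, 95, 2],
    [10, 85, 4],
]

W, H = nmf(V, rank=2)

# Rows of H are the behavioral patterns (topic mixes); each group is assigned
# to the pattern that dominates its row, giving a candidate grouping.
segments = [dominant_pattern(W[i], H) for i in range(len(GROUPS))]

approx = matmul(W, H)
residual = [[V[i][j] - approx[i][j] for j in range(len(TOPICS))]
            for i in range(len(GROUPS))]
relative_error = frobenius_norm(residual) / frobenius_norm(V)

for name, seg in zip(GROUPS, segments):
    print(name, "-> pattern", seg)
```

In this sketch, groups that share a dominant pattern would be candidates for the same persona, with the pattern's topic weights describing that persona's content interests. A production pipeline would more likely rely on an established numerical library and would need the outlier removal described above before factorizing.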