Abstract
Evaluation metrics for search typically assume items are homogeneous. However, in the context of web search, this assumption does not hold. Modern search engine result pages (SERPs) are composed of a variety of item types (e.g., news, web, entity), and their influence on browsing behavior is largely unknown. In this paper, we perform a large-scale empirical analysis of popular web search queries and investigate how different item types influence how people interact with SERPs. We then infer a user browsing model from people's interactions with SERP items, creating a data-driven metric based on item type. We show that the proposed metric leads to more accurate estimates of: (1) total gain, (2) total time spent, and (3) stopping depth, without requiring extensive parameter tuning or a priori relevance information. These results suggest that item heterogeneity should be accounted for when developing metrics for SERPs. While many open questions remain concerning the applicability and generalizability of data-driven metrics, they do serve as a formal mechanism to link observed user behaviors directly to how performance is measured. From this approach, we can draw new insights regarding the relationship between behavior and performance, and design data-driven metrics based on real user behavior rather than relying on a hypothesized model of user browsing behavior.