Abstract

Interactive systems that respond to and learn from user behavior are ubiquitous today, and machine learning algorithms are core components of such systems. In this thesis, we study how logged user behavior data can be re-used to evaluate interactive systems and to train their machine-learned components in a principled way. The core message of the thesis is: using simple techniques from causal inference, we can improve popular machine learning algorithms so that they interact reliably. These improvements are effective and scalable, they complement current algorithmic and modeling advances in machine learning, and they open further avenues for research in Counterfactual Evaluation and Learning to ensure that machine-learned components interact reliably with users and with each other.

The thesis explores two fundamental tasks: the evaluation and the training of interactive systems. Solving either task using logged data is an exercise in counterfactual reasoning, so we first review concepts from causal inference relevant to counterfactual reasoning, assignment mechanisms, statistical estimation, and learning theory. The rest of the thesis consists of two parts. In the first part, we study scenarios where unknown assignment mechanisms underlie the logged data we collect. These scenarios often arise in learning-to-rank and learning-to-recommend applications. Viewing these applications through the lens of causal inference, we modularize the problem of building a good ranking engine or recommender system into two components: first, infer a plausible assignment mechanism; second, reliably learn to rank or recommend under the assumption that this mechanism was active when the data was collected.

The second part of the thesis focuses on scenarios where we collect logged data from past interventions. We formalize these scenarios as batch learning from logged contextual bandit feedback. We first develop better off-policy estimators for evaluating online, user-centric metrics in information retrieval applications. In subsequent chapters, we study the bias-variance trade-off that arises when learning from logged interventions. This study yields new learning principles, algorithms, and insights into the design of statistical estimators for counterfactual learning.
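To make the notion of off-policy (counterfactual) evaluation from logged bandit feedback concrete, the sketch below shows the standard inverse propensity scoring (IPS) estimator, which work in this area builds on. It is an illustrative example, not code from the thesis; the function and variable names are hypothetical.

    # Minimal sketch of inverse propensity scoring (IPS) for off-policy
    # evaluation from logged contextual bandit feedback. Illustrative only;
    # names and signatures are assumptions, not the thesis's implementation.
    import numpy as np

    def ips_estimate(contexts, logged_actions, rewards,
                     logging_propensities, target_policy):
        """Estimate the expected reward of `target_policy` from logs
        collected under a different (logging) policy.

        contexts:             logged contexts x_i
        logged_actions:       actions a_i chosen by the logging policy
        rewards:              observed rewards r_i for (x_i, a_i)
        logging_propensities: probabilities p_i = pi_0(a_i | x_i)
        target_policy:        function (context, action) -> pi(a | x)
        """
        # Re-weight each logged reward by how much more (or less) likely
        # the target policy is to take the logged action than the
        # logging policy was.
        weights = np.array([
            target_policy(x, a) / p
            for x, a, p in zip(contexts, logged_actions, logging_propensities)
        ])
        return float(np.mean(weights * np.asarray(rewards)))

Under standard assumptions (the logging propensities are known and nonzero wherever the target policy has support), this estimator is unbiased, but its variance can be large; managing that bias-variance trade-off is one motivation for the estimators and learning principles developed in the second part of the thesis.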
