Interactive query interfaces have become a popular tool for ad hoc data analysis and exploration. Compared with traditional systems that are optimized for throughput or batch performance, these systems focus more on user-centric interactivity. This focus poses a new class of performance challenges to the backend, which are further exacerbated by the advent of new interaction modes (e.g., touch, gesture) and query interface paradigms (e.g., sliders, maps). There is thus a need to clearly articulate the evaluation space for interactive systems. In this paper, we extensively survey the literature to guide the development and evaluation of interactive data systems. We highlight unique characteristics of interactive workloads, discuss confounding factors when conducting user studies, and catalog popular metrics for evaluation. We further delineate certain behaviors not captured by these metrics and propose complementary metrics to provide a more complete picture of interactivity. Through three case studies, we demonstrate how to analyze and employ user behavior for system enhancements. Our survey and case studies motivate the need for behavior-driven evaluation and optimization when building interactive interfaces.