Abstract
Using computational notebooks (e.g., Jupyter Notebook), data scientists rationalize their exploratory data analysis (EDA) based on their prior experience and external knowledge, such as online examples. For novices or data scientists who lack specific knowledge about the dataset or problem to investigate, effectively obtaining and understanding the external information is critical to carrying out EDA. This article presents EDAssistant, a JupyterLab extension that supports EDA with in situ search of example notebooks and recommendation of useful APIs, powered by novel interactive visualization of search results. The code search and recommendation are enabled by advanced machine learning models, trained on a large corpus of EDA notebooks collected online. A user study is conducted to investigate both EDAssistant and data scientists’ current practice (i.e., using external search engines). The results demonstrate the effectiveness and usefulness of EDAssistant, and participants appreciated its smooth and in-context support of EDA. We also report several design implications regarding code recommendation tools.
ACM Transactions on Interactive Intelligent Systems